ccmap module

ccmap.CCMAP.copy([fill])                    To create a new copy of CCMAP object
ccmap.CCMAP.get_ticks([binsize])            To get xticks and yticks for the matrix
ccmap.CCMAP.make_readable()                 Enable reading the numpy array binary file.
ccmap.CCMAP.make_unreadable()               Disable reading the numpy array binary file from local file system
ccmap.CCMAP.make_writable()                 Create new numpy array binary file on local file system and enable reading/writing to this file
ccmap.CCMAP.make_editable()                 Enable editing numpy array binary file
ccmap.resolutionToBinsize(resolution)       Return the bin size from the resolution unit
ccmap.binsizeToResolution(binsize)          Return the resolution unit from the bin size
ccmap.jsonify(ccMapObj)                     Changes data type of attributes in CCMAP object for json module.
ccmap.dejsonify(ccMapObj[, json_dict])      Change back the data type of attributes in CCMAP object.
ccmap.save_ccmap(ccMapObj, outfile[, ...])  Save CCMAP object on file
ccmap.load_ccmap(infile[, workDir])         Load CCMAP object from an input file
ccmap.export_cmap(cmap, outfile[, ...])     To export .ccmap as text file

ccmap.CCMAP class

class CCMAP(dtype='float32')

This class contains variables to store Hi-C Data.

The class can be instantiated in two ways:
>>> ccMapObj = gcMapExplorer.lib.ccmap.CCMAP()
>>> ccMapObj = gcMapExplorer.lib.ccmap.CCMAP(dtype='float32')
Parameters:dtype (str, Optional) – Data type for matrix. [Default='float32']
path2matrix

str – Path to numpy array binary file on local file system

yticks

list – Minimum and maximum locations along Y-axis. e.g. yticks=[0, 400000]

xticks

list – Minimum and maximum locations along X-axis. e.g. xticks=[0, 400000]

binsize

int – Resolution of data. In case of 10kb resolution, binsize is 10000.

title

str – Title of the data

xlabel

str – Title for X-axis

ylabel

str – Title for Y-axis

shape

tuple – Overall shape of matrix

minvalue

float – Minimum value in matrix

maxvalue

float – Maximum value in matrix

matrix

numpy.memmap – A memmap object pointing to matrix.

HiC map data is saved as a numpy array binary file on the local file system. This file can only be read after mapping it to a numpy memmap object. After mapping, matrix can be used as a numpy array. The file name is randomly generated as npBinary_XXXXXXXXXX.tmp, where X can be an alphanumeric character. Please see details in Numpy memmap.

When ccmap is saved, this file is renamed with ‘.npbin’ extension.

bNoData

numpy.ndarray – A boolean numpy array of matrix shape

bLog

bool – If values in matrix are in log

state

str – State of CCMAP object

This keyword stores the state of the object. The state determines when the numpy array binary file should be deleted from the local file system.

Three keywords are used:
  • temporary :

    When object is created, it is in temporary state. After executing the script, numpy array binary file is automatically deleted from the local file system.

  • saved:

    When a temporary or new object is saved, the numpy array binary file is copied to the destination directory and the state is changed to saved. However, after saving, the state becomes temporary again. This ensures that the saved copy is not deleted and only the temporary copy is deleted after executing the script.

    When an already saved object is loaded, it is in saved state. The numpy array binary file is read from the original location and remains saved at the original location after executing the script.

  • compressed:

    When an object is saved and the numpy array binary file is compressed at the same time, its state is compressed. When this object is loaded, the numpy array binary file is decompressed into the working directory. This decompressed file is automatically deleted after execution of the script, while the compressed file remains saved at the original location.

dtype

str – Data type of matrix

copy(fill=None)

To create a new copy of CCMAP object

This method can be used to create a new copy of gcMapExplorer.lib.ccmap.CCMAP. A new numpy array binary file will be created and all values from the old file will be copied.

Parameters:fill (float) – Fill the map with this value. If not given, map values will be copied.
get_ticks(binsize=None)

To get xticks and yticks for the matrix

Parameters:binsize (int) – Number of bases in each bin (pixel or box) of the contact map.
Returns:
  • xticks (numpy.array) – 1D array containing positions along X-axis
  • yticks (numpy.array) – 1D array containing positions along Y-axis
make_editable()

Enable editing numpy array binary file

make_readable()

Enable reading the numpy array binary file.

The matrix file is saved on the local file system and can only be read after it is mapped to a memmap object. This method maps the numpy memmap object to the self.matrix variable. After calling this method, gcMapExplorer.lib.ccmap.CCMAP.matrix can be used directly like a numpy array. Please see details in Numpy memmap.
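
The mapping described above can be illustrated with numpy alone. The following is a minimal sketch, not the implementation used by make_readable(): the file name, the 4 x 4 shape and the variable names are assumptions chosen for the example. It writes a float32 array to a binary file and then maps the same file read-only.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as work_dir:
    # Hypothetical file name, patterned on the naming scheme described above.
    path = os.path.join(work_dir, 'npBinary_example.tmp')
    shape = (4, 4)

    # Mode 'w+' creates the binary file and maps it for reading and writing.
    writer = np.memmap(path, dtype='float32', mode='w+', shape=shape)
    writer[:] = np.arange(16, dtype='float32').reshape(shape)
    writer.flush()  # push pending changes to the file on disk
    del writer      # drop the mapping

    # Mode 'r' maps the existing file read-only; dtype and shape must be
    # supplied again because the raw file does not store them.
    reader = np.memmap(path, dtype='float32', mode='r', shape=shape)
    total = float(reader.sum())    # behaves like an ordinary numpy array
    corner = float(reader[3, 3])
    del reader
```

Because the raw binary file carries no header, the data type and shape have to be kept elsewhere, which is consistent with CCMAP storing dtype and shape as attributes.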

make_unreadable()

Disable reading the numpy array binary file from local file system

make_writable()

Create new numpy array binary file on local file system and enable reading/writing to this file

Note

If a matrix file with a similar name is already present, the old file will be backed up.

ccmap module

resolutionToBinsize(resolution)

Return the bin size from the resolution unit

A convenience function to convert a resolution unit to a bin size. It supports base (b), kilobase (kb), megabase (mb) and gigabase (gb) units. It also converts decimal resolution units, as shown in the examples below.

Parameters:resolution (str) – resolution in b, kb, mb or gb.
Returns:binsize – bin size
Return type:int

Examples

>>> resolutionToBinsize('1b')
1
>>> resolutionToBinsize('10b')
10
>>> resolutionToBinsize('1kb')
1000
>>> resolutionToBinsize('16kb')
16000
>>> resolutionToBinsize('1.23kb')
1230
>>> resolutionToBinsize('1.6mb')
1600000
>>> resolutionToBinsize('1.457mb')
1457000
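
The conversion can be sketched in a few lines of plain Python. This is an illustrative re-implementation that reproduces the documented examples, not the library's own code; the name resolution_to_binsize_sketch and the use of Decimal are assumptions made for the example.

```python
from decimal import Decimal

# Multipliers for the units named in the documentation.
_UNIT_FACTORS = {'b': 1, 'kb': 1000, 'mb': 1000000, 'gb': 1000000000}


def resolution_to_binsize_sketch(resolution):
    """Convert a string such as '1.6mb' to a bin size in bases."""
    text = resolution.strip().lower()
    # Check the two-letter suffixes first so that 'kb' is not read as 'b'.
    for unit in ('kb', 'mb', 'gb', 'b'):
        if text.endswith(unit):
            number = text[:-len(unit)]
            # Decimal avoids float rounding surprises for values like '1.23'.
            return int(Decimal(number) * _UNIT_FACTORS[unit])
    raise ValueError('unrecognized resolution unit: %r' % resolution)
```

Calling resolution_to_binsize_sketch('1.457mb') gives the same value as the documented example for resolutionToBinsize('1.457mb').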
binsizeToResolution(binsize)

Return the resolution unit from the bin size

A convenience function to convert a bin size into a resolution unit. It supports base (b), kilobase (kb), megabase (mb) and gigabase (gb) units. It also converts a bin size to a decimal resolution unit, as shown in the examples below.

Parameters:binsize (int) – bin size
Returns:resolution – resolution unit
Return type:str

Examples

>>> binsizeToResolution(1)
'1b'
>>> binsizeToResolution(10)
'10b'
>>> binsizeToResolution(10000)
'10kb'
>>> binsizeToResolution(100000)
'100kb'
>>> binsizeToResolution(125500)
'125.5kb'
>>> binsizeToResolution(1000000)
'1mb'
>>> binsizeToResolution(1634300)
'1.6343mb'
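
The reverse conversion can be sketched the same way. This is an illustrative re-implementation consistent with the documented examples, not the library's own code; the name binsize_to_resolution_sketch and the rule of picking the largest unit that does not exceed the bin size are assumptions.

```python
from decimal import Decimal

# Units from largest to smallest, so the first match is the largest fit.
_UNITS = (('gb', 1000000000), ('mb', 1000000), ('kb', 1000), ('b', 1))


def binsize_to_resolution_sketch(binsize):
    """Convert a bin size in bases to a string such as '125.5kb'."""
    if binsize < 1:
        raise ValueError('binsize must be a positive integer')
    for unit, factor in _UNITS:
        if binsize >= factor:
            # Division by a power of ten is exact in Decimal, so
            # 125500 / 1000 gives 125.5 with no trailing zeros.
            value = Decimal(binsize) / Decimal(factor)
            return format(value, 'f') + unit
```

Calling binsize_to_resolution_sketch(1634300) gives the same string as the documented example for binsizeToResolution(1634300).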
jsonify(ccMapObj)

Changes data type of attributes in CCMAP object for json module.

Before the CCMAP object is saved, the data types of some of its attributes must be changed because a few data types are not supported by json.

These attributes are therefore converted into data types that json supports. The following attributes are changed:

Attribute  Original             Modified
bNoData    numpy boolean array  string of 0 and 1
xticks     list of integer      list of string
yticks     list of integer      list of string
minvalue   float                string
maxvalue   float                string
shape      tuple of integer     list of string

Warning

If an object is passed through this method, it must be passed through gcMapExplorer.lib.ccmap.dejsonify() again before any further use. Otherwise, the object cannot be used in any other method because the data types of its attributes have been modified.

Parameters:ccMapObj (gcMapExplorer.lib.ccmap.CCMAP) – A CCMAP object
Returns:
Return type:None
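
The conversions listed for jsonify() can be illustrated with a plain dictionary and the standard json module. This is a sketch under stated assumptions, not the library's code: the helper names to_json_friendly and from_json_friendly are hypothetical, bNoData is represented by a list of booleans instead of a numpy array, and the helpers return new dictionaries whereas jsonify() returns None.

```python
import json


def to_json_friendly(attrs):
    """Return a copy of attrs using the converted types from the table."""
    out = dict(attrs)
    out['bNoData'] = ''.join('1' if flag else '0' for flag in attrs['bNoData'])
    out['xticks'] = [str(v) for v in attrs['xticks']]
    out['yticks'] = [str(v) for v in attrs['yticks']]
    out['minvalue'] = str(attrs['minvalue'])
    out['maxvalue'] = str(attrs['maxvalue'])
    out['shape'] = [str(v) for v in attrs['shape']]
    return out


def from_json_friendly(data):
    """Reverse the conversion and restore the original types."""
    out = dict(data)
    out['bNoData'] = [ch == '1' for ch in data['bNoData']]
    out['xticks'] = [int(v) for v in data['xticks']]
    out['yticks'] = [int(v) for v in data['yticks']]
    out['minvalue'] = float(data['minvalue'])
    out['maxvalue'] = float(data['maxvalue'])
    out['shape'] = tuple(int(v) for v in data['shape'])
    return out


# Example values are made up for illustration.
attrs = {
    'bNoData': [True, False, False, True],
    'xticks': [0, 400000],
    'yticks': [0, 400000],
    'minvalue': 0.0,
    'maxvalue': 152.5,
    'shape': (40, 40),
}
encoded = to_json_friendly(attrs)
text = json.dumps(encoded)                       # safe to serialize now
restored = from_json_friendly(json.loads(text))  # original types are back
```

The round trip shows why the warning above matters: between the two conversions the attributes hold strings, so numeric code that expects integers, floats or a tuple would fail.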
dejsonify(ccMapObj, json_dict=None)

Change back the data type of attributes in CCMAP object.

Before the CCMAP object is loaded, the data types of its attributes must be changed back.

The attributes are converted back into their original data types, as shown in the table for gcMapExplorer.lib.ccmap.jsonify().

Parameters:
save_ccmap(ccMapObj, outfile, compress=False, logHandler=None)

Save CCMAP object on file

A CCMAP object can be saved to a file for later use. The json module is used to save the object. The binary numpy array file is copied to the destination directory. If compress=True, the array file is compressed in gzip format.

Note

  • Compression significantly reduces the array file size. However, loading is slower at start-up because the file must be decompressed first.
  • After loading, the decompressed binary numpy array file takes additional space on the local file system.
Parameters:
  • ccMapObj (gcMapExplorer.lib.ccmap.CCMAP) – A CCMAP object, which has to be saved
  • outfile (str) – Name of output file including path to the directory/folder where file should be saved.
  • compress (bool) – If True, numpy array file will be compressed.
Returns:

Return type:

None
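
The trade-off described in the note can be seen with the standard gzip module. This is a minimal sketch, not the code used by save_ccmap(): the helper names, the file names and the '.gz' suffix are assumptions, and a zero-filled byte string stands in for the matrix file.

```python
import gzip
import os
import shutil
import tempfile


def compress_file(src, dst):
    """Write a gzip-compressed copy of src to dst."""
    with open(src, 'rb') as fin, gzip.open(dst, 'wb') as fout:
        shutil.copyfileobj(fin, fout)


def decompress_file(src, dst):
    """Write a decompressed copy of the gzip file src to dst."""
    with gzip.open(src, 'rb') as fin, open(dst, 'wb') as fout:
        shutil.copyfileobj(fin, fout)


with tempfile.TemporaryDirectory() as work_dir:
    original = os.path.join(work_dir, 'matrix.npbin')
    compressed = original + '.gz'  # hypothetical name for the compressed copy
    restored = os.path.join(work_dir, 'restored.npbin')

    payload = bytes(4096)  # zero-filled stand-in for a sparse matrix file
    with open(original, 'wb') as handle:
        handle.write(payload)

    compress_file(original, compressed)
    decompress_file(compressed, restored)

    compressed_size = os.path.getsize(compressed)
    with open(restored, 'rb') as handle:
        round_trip_ok = handle.read() == payload
```

The decompressed copy is a second full-size file, which is why a compressed map needs extra space in the working directory while it is in use.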

load_ccmap(infile, workDir=None)

Load CCMAP object from an input file

A CCMAP object can be created from an input file that was earlier saved using gcMapExplorer.lib.ccmap.save_ccmap(). If the binary numpy array is compressed, it is automatically extracted into the working directory (see workDir). After execution completes, this decompressed file is automatically deleted. The compressed saved file remains unchanged.

Parameters:
  • infile (str) – Name of the input file, including the path to the directory/folder where the file is saved.
  • workDir (str) – Name of the working directory, where temporary files will be kept. If workDir = None, files will be generated in the OS-based temporary directory.
Returns:

ccMapObj – A CCMAP object

Return type:

gcMapExplorer.lib.ccmap.CCMAP

export_cmap(cmap, outfile, doNotWriteZeros=True)

To export .ccmap as text file

This function exports .ccmap as a sparse matrix file in coordinate list (COO) format. In COO format, (row, column, value) entries are written to the output file as three tab-separated columns.
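
The COO layout can be sketched with a small nested list. This is an illustration of the format only, not the implementation of export_cmap(): the function name write_coo is hypothetical, and rows and columns are written here as zero-based indices, which may differ from the coordinates the library writes.

```python
import io


def write_coo(matrix, handle, skip_zeros=True):
    """Write matrix entries as tab-separated (row, column, value) lines."""
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if skip_zeros and value == 0:
                continue  # leave out empty cells, as doNotWriteZeros suggests
            handle.write('%d\t%d\t%s\n' % (i, j, value))


# Made-up symmetric contact matrix for illustration.
matrix = [
    [0.0, 2.5, 0.0],
    [2.5, 0.0, 1.0],
    [0.0, 1.0, 4.0],
]
buffer = io.StringIO()
write_coo(matrix, buffer)
coo_lines = buffer.getvalue().splitlines()
```

Only the five non-zero cells are written, so the output grows with the number of contacts and not with the full matrix size.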

Parameters: