ccmap module

ccmap.CCMAP.copy([fill]) To create a new copy of CCMAP object
ccmap.CCMAP.get_ticks([binsize]) To get xticks and yticks for the matrix
ccmap.CCMAP.make_readable() Enable reading the numpy array binary file.
ccmap.CCMAP.make_unreadable() Disable reading the numpy array binary file from local file system
ccmap.CCMAP.make_writable() Create new numpy array binary file on local file system and enable reading/writing to this file
ccmap.CCMAP.make_editable() Enable editing numpy array binary file
ccmap.jsonify(ccMapObj) Changes data type of attributes in CCMAP object for json module.
ccmap.dejsonify(ccMapObj[, json_dict]) Change back the data type of attributes in CCMAP object.
ccmap.save_ccmap(ccMapObj, outfile[, …]) Save CCMAP object on file
ccmap.load_ccmap(infile[, workDir]) Load CCMAP object from an input file
ccmap.export_cmap(ccmap, outfile[, …]) To export .ccmap as text file
ccmap.checkCCMapObjectOrFile(ccMap[, workDir]) Check whether ccmap is an object or a file
ccmap.downSampleCCMap(cmap[, level, method, …]) Downsample or coarsen the contact map
ccmap.getOutputShapeFor2DMapDownsampling(…) Helper function to determine output shape of map for downsampling
ccmap.downSample2DMap(inMatrix[, outMatrix, …]) Downsample or coarsen the matrix

ccmap.CCMAP class

class CCMAP(dtype='float32')

This class contains variables to store Hi-C Data.

The class can be instantiated in two ways:
>>> ccMapObj = gcMapExplorer.lib.ccmap.CCMAP()
>>> ccMapObj = gcMapExplorer.lib.ccmap.CCMAP(dtype='float32')
Parameters:dtype (str, Optional) – Data type for matrix. [Default='float32']
path2matrix

str – Path to numpy array binary file on local file system

yticks

list – Minimum and maximum locations along Y-axis. e.g. yticks=[0, 400000]

xticks

list – Minimum and maximum locations along X-axis. e.g. xticks=[0, 400000]

binsize

int – Resolution of data. In case of 10kb resolution, binsize is 10000.

title

str – Title of the data

xlabel

str – Title for X-axis

ylabel

str – Title for Y-axis

shape

tuple – Overall shape of matrix

minvalue

float – Minimum value in matrix

maxvalue

float – Maximum value in matrix

matrix

numpy.memmap – A memmap object pointing to matrix.

Hi-C map data is saved as a numpy array binary file on the local file system. This file can only be read after it is mapped to a numpy memmap object. After mapping, matrix can be used like a numpy array. The file name is randomly generated as npBinary_XXXXXXXXXX.tmp, where X can be any alphanumeric character. Please see details in Numpy memmap.

When ccmap is saved, this file is renamed with ‘.npbin’ extension.
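
The mapping mechanism described above can be sketched with numpy.memmap directly. This is only an illustration: the file name, shape and values are assumptions made for the example, and CCMAP manages its own file through make_writable() and make_readable().

```python
import os
import tempfile

import numpy as np

# Hypothetical file name and shape, chosen only for this illustration.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "npBinary_example.tmp")
shape = (4, 4)

# Create the binary file on disk and write to it through a memmap.
writer = np.memmap(path, dtype="float32", mode="w+", shape=shape)
writer[1, 2] = 5.0
writer.flush()  # push the change to the file
del writer      # drop the writable mapping

# Map the same file again in read-only mode; it now behaves like an array.
matrix = np.memmap(path, dtype="float32", mode="r", shape=shape)
total = float(matrix.sum())
value = float(matrix[1, 2])
```

The dtype and shape must be supplied again when the file is re-mapped, because the raw binary file does not store them.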

bNoData

numpy.ndarray – A boolean numpy array of matrix shape

bLog

bool – If values in matrix are in log

state

str – State of CCMAP object

This keyword stores the state of the object. The state determines when the numpy array binary file should be deleted from the local file system.

Three keywords are used:
  • temporary :

    When object is created, it is in temporary state. After executing the script, numpy array binary file is automatically deleted from the local file system.

  • saved:

    When a temporary or new object is saved, the numpy array binary file is copied to the destination directory and the state is changed to saved. However, after saving, the state becomes temporary. This ensures that the saved copy is not deleted and only the temporary copy is deleted after executing the script.

    When an already saved object is loaded, it is in the saved state. The numpy array binary file is read from the original location and remains saved at the original location after executing the script.

  • compressed:

    When an object is saved and the numpy array binary file is compressed at the same time, its state is compressed. When this object is loaded, the numpy array binary file is decompressed into the working directory. This decompressed file is automatically deleted after execution of the script, while the compressed file remains saved at the original location.
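
The deletion rules for the three states can be summarized in a short sketch. The function and helper names below are hypothetical and are not part of gcMapExplorer; the page does not show how the library performs the cleanup, so this only mirrors the rules as described.

```python
import os
import tempfile


def cleanup_binary_file(path, state):
    """Remove the working binary file unless the object is in the 'saved' state.

    Returns True if the file was deleted.
    """
    if state == "saved":
        return False      # file stays at its original location
    if state in ("temporary", "compressed") and os.path.exists(path):
        os.remove(path)   # temporary file or decompressed working copy
        return True
    return False


def make_temp_file():
    """Create an empty stand-in for a numpy array binary file."""
    fd, path = tempfile.mkstemp(prefix="npBinary_", suffix=".tmp")
    os.close(fd)
    return path


results = {}
for state in ("temporary", "saved", "compressed"):
    path = make_temp_file()
    results[state] = cleanup_binary_file(path, state)
    if os.path.exists(path):  # tidy up the file kept for the 'saved' case
        os.remove(path)
```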

dtype

str – Data type of matrix

copy(fill=None)

To create a new copy of CCMAP object

This method can be used to create a new copy of gcMapExplorer.lib.ccmap.CCMAP. A new numpy array binary file will be created and all values from old file will be copied.

Parameters:fill (float) – Fill map with the value. If not given, map values will be copied.
get_ticks(binsize=None)

To get xticks and yticks for the matrix

Parameters:binsize (int) – Number of base in each bin or pixel or box of contact map.
Returns:
  • xticks (numpy.array) – 1D array containing positions along X-axis
  • yticks (numpy.array) – 1D array containing positions along Y-axis
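
As a rough sketch of what such tick positions could look like, the following assumes that ticks are bin start positions spaced binsize apart between the minimum and maximum stored in xticks or yticks. That spacing rule is an assumption, ticks_from_range is a hypothetical helper, and a plain list is used here instead of a numpy array.

```python
def ticks_from_range(limits, binsize):
    """Return bin start positions from limits[0] up to, but excluding, limits[1]."""
    start, stop = limits
    return list(range(start, stop, binsize))


# Example limits as in the attribute description, with a 100 kb bin size.
xticks = ticks_from_range([0, 400000], 100000)
```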
make_editable()

Enable editing numpy array binary file

make_readable()

Enable reading the numpy array binary file.

The matrix file is saved on the local file system. This file can only be read after it is mapped to a memmap object. This method maps the numpy memmap object to the self.matrix variable. After using this method, gcMapExplorer.lib.ccmap.CCMAP.matrix can be used directly like a numpy array. Please see details in Numpy memmap.

make_unreadable()

Disable reading the numpy array binary file from local file system

make_writable()

Create new numpy array binary file on local file system and enable reading/writing to this file

Note

If a matrix file with a similar name is already present, the old file will be backed up.

ccmap module

jsonify(ccMapObj)

Changes data type of attributes in CCMAP object for json module.

Before the CCMAP object is saved, the data types of some of its attributes must be changed because a few of them are not supported by json.

These attributes are therefore converted into data types that json supports. The following attributes are changed:

Attribute   Original              Modified
bNoData     numpy boolean array   string of 0 and 1
xticks      list of integer       list of string
yticks      list of integer       list of string
minvalue    float                 string
maxvalue    float                 string
binsize     integer               string
shape       tuple of integer      list of string
dtype       Numpy dtype           string

Warning

If an object is passed through this method, it must be passed through gcMapExplorer.lib.ccmap.dejsonify() again before any further use. Otherwise, the object cannot be used in any other method because the data types of its attributes have been modified.
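
The conversions listed in the table, and the reverse step, can be illustrated on a plain dictionary. This is a sketch under assumptions: to_json_safe and from_json_safe are hypothetical helpers, a list of booleans stands in for the numpy boolean array, and the real functions modify a CCMAP object in place rather than returning a dictionary.

```python
import json


def to_json_safe(attrs):
    """Convert attribute values to strings following the table above."""
    return {
        "bNoData": "".join("1" if flag else "0" for flag in attrs["bNoData"]),
        "xticks": [str(v) for v in attrs["xticks"]],
        "yticks": [str(v) for v in attrs["yticks"]],
        "minvalue": str(attrs["minvalue"]),
        "maxvalue": str(attrs["maxvalue"]),
        "binsize": str(attrs["binsize"]),
        "shape": [str(v) for v in attrs["shape"]],
        "dtype": str(attrs["dtype"]),
    }


def from_json_safe(data):
    """Restore the original data types from the JSON-safe form."""
    return {
        "bNoData": [char == "1" for char in data["bNoData"]],
        "xticks": [int(v) for v in data["xticks"]],
        "yticks": [int(v) for v in data["yticks"]],
        "minvalue": float(data["minvalue"]),
        "maxvalue": float(data["maxvalue"]),
        "binsize": int(data["binsize"]),
        "shape": tuple(int(v) for v in data["shape"]),
        "dtype": data["dtype"],
    }


# Illustrative attribute values, not taken from a real map.
attrs = {
    "bNoData": [False, True, True, False],
    "xticks": [0, 400000],
    "yticks": [0, 400000],
    "minvalue": 0.0,
    "maxvalue": 12.5,
    "binsize": 10000,
    "shape": (40, 40),
    "dtype": "float32",
}
safe = to_json_safe(attrs)
text = json.dumps(safe)                      # now serializable
restored = from_json_safe(json.loads(text))  # back to the original types
```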

Parameters:ccMapObj (gcMapExplorer.lib.ccmap.CCMAP) – A CCMAP object
Returns:
Return type:None
dejsonify(ccMapObj, json_dict=None)

Change back the data type of attributes in CCMAP object.

When a CCMAP object is loaded, the data types of its attributes must be changed back.

The attributes are converted into their original data types, as shown in the table for gcMapExplorer.lib.ccmap.jsonify().

Parameters:
save_ccmap(ccMapObj, outfile, compress=False, logHandler=None)

Save CCMAP object on file

A CCMAP object can be saved as a file for later use. The json module is used to save the object. The binary numpy array file is copied to the destination directory. If compress=True, the array file is compressed in gzip format.

Note

  • Compression significantly reduces the array file size. However, loading is slower at the start because the file has to be decompressed.
  • After loading, the decompressed binary numpy array file takes additional space on the local file system.
Parameters:
  • ccMapObj (gcMapExplorer.lib.ccmap.CCMAP) – A CCMAP object, which has to be saved
  • outfile (str) – Name of output file including path to the directory/folder where file should be saved.
  • compress (bool) – If True, numpy array file will be compressed.
Returns:

Return type:

None
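
The save procedure described for save_ccmap() (JSON metadata plus a copied or gzip-compressed array file) can be sketched with the standard library. The helper save_with_array, the '.npbin.gz' suffix for the compressed file and the metadata layout are assumptions made for this illustration, not the library's actual file format.

```python
import gzip
import json
import os
import shutil
import tempfile


def save_with_array(metadata, array_path, outfile, compress=False):
    """Write metadata as JSON and place the array file next to it."""
    array_out = os.path.splitext(outfile)[0] + ".npbin"
    if compress:
        array_out += ".gz"  # assumed suffix for the compressed array file
        with open(array_path, "rb") as src, gzip.open(array_out, "wb") as dst:
            shutil.copyfileobj(src, dst)
    else:
        shutil.copyfile(array_path, array_out)
    record = dict(metadata, path2matrix=os.path.basename(array_out))
    with open(outfile, "w") as handle:
        json.dump(record, handle)
    return array_out


# Stand-in for a temporary numpy array binary file: 4096 zero bytes.
work_dir = tempfile.mkdtemp()
array_path = os.path.join(work_dir, "npBinary_example.tmp")
payload = bytes(4096)
with open(array_path, "wb") as handle:
    handle.write(payload)

outfile = os.path.join(work_dir, "demo.ccmap")
packed = save_with_array({"binsize": "10000"}, array_path, outfile, compress=True)

with open(outfile) as handle:
    saved_record = json.load(handle)
with gzip.open(packed, "rb") as handle:
    unpacked = handle.read()
```

Hi-C matrices usually contain many zeros, which is why gzip tends to shrink the array file considerably; the all-zero payload here is an extreme case of that.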

load_ccmap(infile, workDir=None)

Load CCMAP object from an input file

A CCMAP object can be created from an input file that was earlier saved using gcMapExplorer.lib.ccmap.save_ccmap(). If the binary numpy array is compressed, this file is automatically extracted into the current working directory. After execution completes, this decompressed file is automatically deleted. The compressed saved file remains unchanged.

Parameters:
  • infile (str) – Name of the input file including path to the directory/folder where file is saved.
  • workDir (str) – Name of working directory, where temporary files will be kept. If workDir = None, the file will be generated in the OS-based temporary directory.
Returns:

ccMapObj – A CCMAP object

Return type:

gcMapExplorer.lib.ccmap.CCMAP

export_cmap(ccmap, outfile, doNotWriteZeros=True)

To export .ccmap as text file

This function exports .ccmap as a sparse matrix file in coordinate list (COO) format. In COO format, (row, column, value) entries are written to the output file as three tab-separated columns.
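
The COO layout can be illustrated with a small sketch. It assumes zero-based row and column indices and a nested list as input; write_coo is a hypothetical helper, and the values that export_cmap() writes in the first two columns may differ.

```python
import io


def write_coo(matrix, handle, do_not_write_zeros=True):
    """Write (row, column, value) triples as tab-separated lines."""
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if do_not_write_zeros and value == 0:
                continue  # skip empty cells to keep the file sparse
            handle.write(f"{i}\t{j}\t{value}\n")


buffer = io.StringIO()
write_coo([[0, 2.5], [1.0, 0]], buffer)
coo_text = buffer.getvalue()
```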

Parameters:
checkCCMapObjectOrFile(ccMap, workDir=None)

Check whether ccmap is an object or a file

It can be used to check whether input is a gcMapExplorer.lib.ccmap.CCMAP or a ccmap file.

It returns the gcMapExplorer.lib.ccmap.CCMAP object together with the input type name, either File or Object, as an identification keyword for the input.

If the ccMap argument is a filename, the file is opened as a gcMapExplorer.lib.ccmap.CCMAP object, which is returned with ccmapType as File.

If the ccMap argument is a gcMapExplorer.lib.ccmap.CCMAP object, the same object is returned with ccmapType as Object.

Parameters:
  • ccMap (gcMapExplorer.lib.ccmap.CCMAP or str) – CCMAP object or ccmap file.
  • workDir (str) – Path to the directory where temporary intermediate files are generated. If None, files are generated in the temporary directory according to the main configuration.
Returns:

downSampleCCMap(cmap, level=2, method='sum', workDir=None)

Downsample or coarsen the contact map

It can be used to downsample the contact map by any given factor.

Parameters:
  • level (int) – The factor by which map has to be downsampled. For example, if input map has resolution of 10 kb and level = 4, the output map will have 40 kb resolution.
  • method (str) – Method of downsampling. Three methods are accepted: sum (sum of all values), mean (average of all values) and max (maximum of all values).
Returns:

ccMapObj – CCMAP object

Return type:

gcMapExplorer.lib.ccmap.CCMAP

getOutputShapeFor2DMapDownsampling(inputShape, level=2)

Helper function to determine output shape of map for downsampling

Parameters:
  • level (int) – The factor by which map has to be downsampled. For example, if input map has resolution of 10 kb and level = 4, the output map will have 40 kb resolution.
  • inputShape ((int, int)) – Shape of input map as tuple. Preferably, CCMAP.shape.
Returns:

Output shape after downsampling the map

Return type:

outputShape (int, int)
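
A minimal sketch of the shape calculation follows. It assumes that a partial block at the edge of the map still produces one output bin, so each dimension is rounded up; this rounding rule and the helper name output_shape are assumptions, not confirmed behavior of the library function.

```python
import math


def output_shape(input_shape, level=2):
    """Shape after downsampling, counting a partial block at the edge as one bin."""
    return tuple(math.ceil(n / level) for n in input_shape)


shape_even = output_shape((40, 40), level=4)  # dimensions divisible by level
shape_odd = output_shape((41, 41), level=4)   # one extra row and column
```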

downSample2DMap(inMatrix, outMatrix=None, level=2, method='sum')

Downsample or coarsen the matrix

It can be used to downsample the matrix by any given factor. It is the core function used to downsample ccmap and gcmap.

Parameters:
  • inMatrix (numpy.ndarray) – Input 2D matrix of numpy array type.
  • outMatrix (numpy.ndarray) – Output 2D numpy array in which all output values will be stored. Its shape must match the shape of the downsampled array. To get this shape prior to downsampling, use gcMapExplorer.lib.ccmap.getOutputShapeFor2DMapDownsampling(). If it is None, a new output matrix is returned.
  • level (int) – The factor by which map has to be downsampled. For example, if input map has resolution of 10 kb and level = 4, the output map will have 40 kb resolution.
  • method (str) – Method of downsampling. Three methods are accepted: sum (sum of all values), mean (average of all values) and max (maximum of all values).
Returns:

outMatrix – If outMatrix=None was passed, the output matrix is returned.

Return type:

numpy.ndarray
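
Block-wise downsampling with the three methods can be sketched in numpy as follows. This is not the library's implementation: downsample_2d is a hypothetical function, and padding partial edge blocks with NaN so that missing cells are ignored is an assumption made for the example.

```python
import numpy as np


def downsample_2d(in_matrix, level=2, method="sum"):
    """Reduce each level x level block of a 2D array to one value."""
    reducers = {"sum": np.nansum, "mean": np.nanmean, "max": np.nanmax}
    if method not in reducers:
        raise ValueError("method must be 'sum', 'mean' or 'max'")
    data = np.asarray(in_matrix, dtype=float)
    pad_rows = -data.shape[0] % level
    pad_cols = -data.shape[1] % level
    # Pad the edges with NaN so that partial blocks ignore the missing cells.
    padded = np.pad(data, ((0, pad_rows), (0, pad_cols)),
                    mode="constant", constant_values=np.nan)
    # Split rows and columns into (block index, offset within block) axes.
    blocks = padded.reshape(padded.shape[0] // level, level,
                            padded.shape[1] // level, level)
    return reducers[method](blocks, axis=(1, 3))


# A 4 x 4 matrix with values 0..15 reduced to 2 x 2 by summing each block.
coarse = downsample_2d(np.arange(16).reshape(4, 4), level=2, method="sum")
```

For Hi-C contact counts, sum keeps the total number of contacts unchanged, while mean and max change the scale of the values.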