corrMatrix module

corrMatrix.calculateCorrMatrix(…[, maskvalue]) Calculate correlation matrix from a 2D numpy array.
corrMatrix.calculateCovMatrix(…[, maskvalue]) Calculate covariance matrix from a 2D numpy array.
corrMatrix.calculateCorrelation(ndarray x, …) Calculate correlation between two 1D numpy arrays.
corrMatrix.calculateCovariance(ndarray x, …) Calculate covariance between two 1D numpy arrays.
corrMatrix.calculateCorrMatrixForCCMap(…) Calculate correlation matrix of a contact map.
corrMatrix.calculateCorrMatrixForGCMaps(…) Calculate correlation matrices for all maps present in the input gcmap file. It calculates correlation between all rows and columns of each contact map.
calculateCorrMatrix(ndarray in_array, ndarray out=None, maskvalue=None)

Calculate correlation matrix from a 2D numpy array. During calculation, array values equal to maskvalue will not be considered.

Parameters:
  • in_array (numpy.ndarray) – Input numpy array
  • out (numpy.ndarray) – Optional output array. If it is None, the output array is returned.
  • maskvalue (float) – If this value is given, all elements of the input array equal to this value are masked during the calculation.
Returns:

out – If out=None, the output array is returned. Otherwise, None is returned.

Return type:

numpy.ndarray or None
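
The masking and the out convention described above can be sketched in plain NumPy. This is only an illustration, not the library's implementation: the helper name masked_corr_matrix is hypothetical, and it assumes that rows are the variables being correlated, that a position is dropped for a pair of rows when either row holds maskvalue there, and that undefined correlations are stored as 0.

```python
import numpy as np


def masked_corr_matrix(in_array, out=None, maskvalue=None):
    """Hypothetical stand-in for illustration; not the corrMatrix code."""
    data = np.asarray(in_array, dtype=float)
    n = data.shape[0]
    result = np.zeros((n, n), dtype=float) if out is None else out

    for i in range(n):
        for j in range(i, n):
            x, y = data[i], data[j]
            if maskvalue is not None:
                # Assumption: drop a position when either row is masked there.
                keep = (x != maskvalue) & (y != maskvalue)
                x, y = x[keep], y[keep]
            if x.size < 2 or x.std() == 0 or y.std() == 0:
                r = 0.0  # assumption: undefined correlation stored as 0
            else:
                r = np.corrcoef(x, y)[0, 1]
            result[i, j] = result[j, i] = r

    # Mirror the documented convention: return the array only when out is None.
    return result if out is None else None


example = np.array([[1, 2, 3, 0],
                    [2, 4, 6, 0],
                    [3, 2, 1, 5]])
corr = masked_corr_matrix(example, maskvalue=0)
```

Passing a preallocated array as out fills it in place, and the call then returns None, which matches the return behaviour documented for this function.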

calculateCovMatrix(ndarray in_array, ndarray out=None, maskvalue=None)

Calculate covariance matrix from a 2D numpy array. During calculation, array values equal to maskvalue will not be considered.

Parameters:
  • in_array (numpy.ndarray) – Input numpy array
  • out (numpy.ndarray) – Optional output array. If it is None, the output array is returned.
  • maskvalue (float) – If this value is given, all elements of the input array equal to this value are masked during the calculation.
Returns:

out – If out=None, the output array is returned. Otherwise, None is returned.

Return type:

numpy.ndarray or None
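
For orientation, the usual relation between a covariance matrix and a correlation matrix can be checked with NumPy alone. The sketch below ignores masking entirely and uses np.cov rather than calculateCovMatrix, so it shows the textbook relation, not this module's code path; the random input is made up for the example.

```python
import numpy as np

# Made-up data: 4 variables (rows) with 50 observations each.
rng = np.random.default_rng(0)
data = rng.random((4, 50))

# Covariance matrix between rows.
cov = np.cov(data)

# Dividing each entry by the product of the two standard deviations
# turns the covariance matrix into a correlation matrix.
std = np.sqrt(np.diag(cov))
corr = cov / np.outer(std, std)
```

When values are masked pair by pair, each pair of rows may use a different subset of positions, so this simple normalization is not guaranteed to reproduce a masked correlation matrix.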

calculateCorrelation(ndarray x, ndarray y, maskvalue=None)

Calculate correlation between two 1D numpy arrays. During calculation, array values equal to maskvalue will not be considered.

Parameters:
  • x (numpy.ndarray) – First input numpy array
  • y (numpy.ndarray) – Second input numpy array
  • maskvalue (float) – If this value is given, all elements of the input arrays equal to this value are masked during the calculation.
Returns:

result – Correlation coefficient

Return type:

float
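
A masked correlation between two 1D arrays can be sketched as follows. The helper masked_correlation is hypothetical and assumes a Pearson coefficient computed only over positions where neither array equals maskvalue; the module's exact handling of masked pairs may differ.

```python
import numpy as np


def masked_correlation(x, y, maskvalue=None):
    """Hypothetical illustration of a Pearson correlation with masking."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if maskvalue is not None:
        # Assumption: a pair is dropped if either element equals maskvalue.
        keep = (x != maskvalue) & (y != maskvalue)
        x, y = x[keep], y[keep]
    dx = x - x.mean()
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([9.0, 2.0, 4.0, 6.0, 0.0])

# Only the three middle pairs survive masking, and there y is exactly 2 * x.
r_masked = masked_correlation(x, y, maskvalue=0.0)
r_plain = masked_correlation(x, y)
```

In this made-up input the first and last pairs each contain a zero, so masking removes them and the remaining values are perfectly linearly related, while the unmasked coefficient is pulled down by the two outlying pairs.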

calculateCovariance(ndarray x, ndarray y, maskvalue=None)

Calculate covariance between two 1D numpy arrays. During calculation, array values equal to maskvalue will not be considered.

Parameters:
  • x (numpy.ndarray) – First input numpy array
  • y (numpy.ndarray) – Second input numpy array
  • maskvalue (float) – If this value is given, all elements of the input arrays equal to this value are masked during the calculation.
Returns:

result – Covariance value

Return type:

float
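
The same masking idea applies to covariance. In the sketch below, masked_covariance and its ddof argument are hypothetical: the documentation does not say whether the result is divided by N or by N - 1, so the divisor is exposed as a parameter instead of being guessed.

```python
import numpy as np


def masked_covariance(x, y, maskvalue=None, ddof=0):
    """Hypothetical illustration of a covariance with masking."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if maskvalue is not None:
        # Assumption: a pair is dropped if either element equals maskvalue.
        keep = (x != maskvalue) & (y != maskvalue)
        x, y = x[keep], y[keep]
    # ddof=0 divides by N, ddof=1 divides by N - 1 (assumed choices).
    return float(((x - x.mean()) * (y - y.mean())).sum() / (x.size - ddof))


x = np.array([1.0, 2.0, 3.0, 0.0])
y = np.array([2.0, 4.0, 6.0, 5.0])

cov_population = masked_covariance(x, y, maskvalue=0.0)       # divides by 3
cov_sample = masked_covariance(x, y, maskvalue=0.0, ddof=1)   # divides by 2
```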

calculateCorrMatrixForCCMap(inputCCMap, logspace=False, maskvalue=0.0, vmin=None, vmax=None, outFile=None, workDir=None)

Calculate correlation matrix of a contact map. It calculates correlation between all rows and columns of the contact map.

Parameters:
  • inputCCMap (gcMapExplorer.lib.ccmap.CCMAP or ccmap file) – A CCMAP object containing observed contact frequency or a ccmap file.
  • logspace (bool) – If True, the map is first converted to its logarithm, and the correlation is then calculated.
  • maskvalue (float) – Do not consider bins with this value during calculation. By default it is zero because bins with a zero value are considered to have missing data.
  • vmin (float) – Minimum threshold value for normalization. If contact frequency is less than or equal to this threshold value, this value is discarded during normalization.
  • vmax (float) – Maximum threshold value for normalization. If contact frequency is greater than or equal to this threshold value, this value is discarded during normalization.
  • outFile (str) – Name of the output ccmap file, used to save the correlation matrix directly as a ccmap file. When this option is given, None is returned.
  • workDir (str) – Path to the directory where temporary intermediate files are generated. If None, files are generated in the temporary directory according to the main configuration.
Returns:

ccMapObj – Correlation matrix as a ccmap object. When outFile is provided, None is returned. None is also returned in case of any error.

Return type:

gcMapExplorer.lib.ccmap.CCMAP or None
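
How logspace, maskvalue, vmin and vmax might interact before the correlation step can be sketched on a plain NumPy matrix. Everything here is an assumption made for illustration: prepare_map is a hypothetical helper, the thresholds are assumed to be applied before the logarithm, base 10 is assumed, and contact frequencies are assumed to be non-negative. The module itself operates on CCMAP objects and may order these steps differently.

```python
import numpy as np


def prepare_map(matrix, logspace=False, maskvalue=0.0, vmin=None, vmax=None):
    """Hypothetical preprocessing sketch; returns (data, mask)."""
    data = np.array(matrix, dtype=float)  # work on a copy

    # Bins equal to maskvalue are treated as missing data.
    masked = data == maskvalue
    # Assumption: bins at or beyond the thresholds are discarded as well.
    if vmin is not None:
        masked |= data <= vmin
    if vmax is not None:
        masked |= data >= vmax

    if logspace:
        # Assumption: base-10 logarithm, taken only on bins that are kept.
        data[~masked] = np.log10(data[~masked])

    data[masked] = maskvalue
    return data, masked


contacts = [[0.0, 1.0, 10.0],
            [1.0, 100.0, 1000.0],
            [10.0, 1000.0, 0.0]]
log_map, mask = prepare_map(contacts, logspace=True, vmax=1000.0)
```

The boolean mask is returned alongside the data because a kept bin with contact frequency 1 becomes 0 after the logarithm, which would otherwise be indistinguishable from a masked bin when maskvalue is 0.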

calculateCorrMatrixForGCMaps(gcMapInputFile, gcMapOutFile, logspace=False, maskvalue=0.0, vmin=None, vmax=None, replaceMatrix=False, compression='lzf', workDir=None)

Calculate correlation matrices for all maps present in the input gcmap file. It calculates correlation between all rows and columns of each contact map.

Parameters:
  • gcMapInputFile (str) – Input gcmap file.
  • gcMapOutFile (str) – Output gcmap file.
  • logspace (bool) – If True, the map is first converted to its logarithm, and the correlation is then calculated.
  • maskvalue (float) – Do not consider bins with this value during calculation. By default it is zero because bins with a zero value are considered to have missing data.
  • vmin (float) – Minimum threshold value for normalization. If contact frequency is less than or equal to this threshold value, this value is discarded during normalization.
  • vmax (float) – Maximum threshold value for normalization. If contact frequency is greater than or equal to this threshold value, this value is discarded during normalization.
  • replaceMatrix (bool) – If True, an existing map in the output file is replaced. Otherwise, if the map is already present, its calculation is skipped.
  • compression (str) – Compression method used in the output gcmap file. Currently allowed values: lzf for LZF compression and gzip for GZIP compression.
  • workDir (str) – Path to the directory where temporary intermediate files are generated. If None, files are generated in the temporary directory according to the main configuration.
Returns:

Return type:

None