How to normalize a Hi-C map?

To normalize a Hi-C map, several methods have been implemented.

Note

This tutorial needs the *.ccmap files generated in the previous tutorial.

See also

Module gcMapExplorer.lib.normalizer for all implemented normalization methods in detail.

Remove old files, if any are present, from the normalized directory

In [1]:
%%bash

for f in ./normalized/*; do
    if [ -e "$f" ]; then rm "$f"; fi
done
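
The same cleanup can also be done from Python, which avoids depending on a bash shell. The following is a minimal sketch using only the standard library; the helper name `clear_directory` is hypothetical and not part of gcMapExplorer.

```python
from pathlib import Path


def clear_directory(directory):
    """Delete regular files directly inside `directory`.

    Returns the number of files removed. A missing directory is
    treated as "nothing to clean", similar to the `-e` guard in
    the shell loop.
    """
    path = Path(directory)
    if not path.is_dir():
        return 0

    removed = 0
    for entry in path.iterdir():
        if entry.is_file():      # leave sub-directories untouched
            entry.unlink()
            removed += 1
    return removed
```

Calling `clear_directory('normalized')` before running the cells below would then play the role of the shell loop.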

Matrix balancing algorithm by Knight and Ruiz

Import the gcMapExplorer.lib module

In [2]:
import gcMapExplorer.lib as gmlib
import numpy as np
import os

Directly normalize and save a ccmap file

In [3]:
# Name of ccmap file with path
raw_ccmap_file = 'cmaps/CooMatrix/chr22_100kb_RawObserved.ccmap'

# Name of output file
outFile = 'normalized/chr22_100kb_normKR_direct.ccmap'

# Perform normalization and save to output file
gmlib.normalizer.normalizeCCMapByKR(raw_ccmap_file, outFile=outFile, memory='RAM')
INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr22 map...
INFO:normalizer:        ...Finished KR Normalization for chr22 map...
INFO:save_ccmap: Saving ccmap to file [normalized/chr22_100kb_normKR_direct.ccmap] and [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr22_100kb_normKR_direct.npbin] ...
INFO:save_ccmap: Compressing [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr22_100kb_normKR_direct.npbin] ...
INFO:save_ccmap:       Finished!!!

See also

Function gcMapExplorer.lib.normalizer.normalizeCCMapByKR() for more details.

Normalize and save a Hi-C map through a CCMAP object

  • Load the ccmap as CCMAP object
  • Normalize it
  • Save as ccmap file
In [4]:
# Load the ccmap file as CCMAP object
raw_ccmap = gmlib.ccmap.load_ccmap(raw_ccmap_file)

# Perform normalization on the loaded CCMAP object
norm_ccmap = gmlib.normalizer.normalizeCCMapByKR(raw_ccmap, memory='RAM')

# Save ccmap file
gmlib.ccmap.save_ccmap(norm_ccmap, 'normalized/chr22_100kb_normKR.ccmap', compress=True)

# Remove CCMAP object from memory and generated temporary files
del raw_ccmap
del norm_ccmap
INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr22 map...
INFO:normalizer:        ...Finished KR Normalization for chr22 map...
INFO:save_ccmap: Saving ccmap to file [normalized/chr22_100kb_normKR.ccmap] and [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr22_100kb_normKR.npbin] ...
INFO:save_ccmap: Compressing [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr22_100kb_normKR.npbin] ...
INFO:save_ccmap:       Finished!!!

See also

Function gcMapExplorer.lib.ccmap.load_ccmap() for more details.

Do RAM and HDD yield the same result?

Here, we test whether normalization through RAM and through HDD yields the same result. We also measure the time taken for normalization in each case.

In [5]:
raw_ccmap = gmlib.ccmap.load_ccmap('cmaps/CooMatrix/chr1_100kb_RawObserved.ccmap')

print('Time using RAM: ')
%timeit norm_ram = gmlib.normalizer.normalizeCCMapByKR(raw_ccmap, memory='RAM')

print('Time using HDD: ')
%timeit norm_hdd = gmlib.normalizer.normalizeCCMapByKR(raw_ccmap, memory='HDD')

# Normalize again, because variables assigned inside %timeit are not kept
norm_hdd = gmlib.normalizer.normalizeCCMapByKR(raw_ccmap, memory='HDD')
norm_ram = gmlib.normalizer.normalizeCCMapByKR(raw_ccmap, memory='RAM')
norm_ram.make_readable()
norm_hdd.make_readable()

print('If matrix from RAM and HDD are similar: ', np.allclose(norm_ram.matrix, norm_hdd.matrix) )
del raw_ccmap
del norm_ram
del norm_hdd
INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr1 map...
Time using RAM:
INFO:normalizer:        ...Finished KR Normalization for chr1 map...
INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr1 map...
INFO:normalizer:        ...Finished KR Normalization for chr1 map...
INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr1 map...
INFO:normalizer:        ...Finished KR Normalization for chr1 map...
INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr1 map...
INFO:normalizer:        ...Finished KR Normalization for chr1 map...
INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr1 map...
INFO:normalizer:        ...Finished KR Normalization for chr1 map...
INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr1 map...
INFO:normalizer:        ...Finished KR Normalization for chr1 map...
INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr1 map...
INFO:normalizer:        ...Finished KR Normalization for chr1 map...
INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr1 map...
INFO:normalizer:        ...Finished KR Normalization for chr1 map...
INFO:normalizer: KR Normalization will be done through HDD.
INFO:normalizer: KR Normalization is in process for chr1 map...
6.38 s ± 827 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Time using HDD:
INFO:normalizer:        ...Finished KR Normalization for chr1 map...
INFO:normalizer: KR Normalization will be done through HDD.
INFO:normalizer: KR Normalization is in process for chr1 map...
INFO:normalizer:        ...Finished KR Normalization for chr1 map...
INFO:normalizer: KR Normalization will be done through HDD.
INFO:normalizer: KR Normalization is in process for chr1 map...
INFO:normalizer:        ...Finished KR Normalization for chr1 map...
INFO:normalizer: KR Normalization will be done through HDD.
INFO:normalizer: KR Normalization is in process for chr1 map...
INFO:normalizer:        ...Finished KR Normalization for chr1 map...
INFO:normalizer: KR Normalization will be done through HDD.
INFO:normalizer: KR Normalization is in process for chr1 map...
INFO:normalizer:        ...Finished KR Normalization for chr1 map...
INFO:normalizer: KR Normalization will be done through HDD.
INFO:normalizer: KR Normalization is in process for chr1 map...
INFO:normalizer:        ...Finished KR Normalization for chr1 map...
INFO:normalizer: KR Normalization will be done through HDD.
INFO:normalizer: KR Normalization is in process for chr1 map...
INFO:normalizer:        ...Finished KR Normalization for chr1 map...
INFO:normalizer: KR Normalization will be done through HDD.
INFO:normalizer: KR Normalization is in process for chr1 map...
INFO:normalizer:        ...Finished KR Normalization for chr1 map...
INFO:normalizer: KR Normalization will be done through HDD.
INFO:normalizer: KR Normalization is in process for chr1 map...
20 s ± 988 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
INFO:normalizer:        ...Finished KR Normalization for chr1 map...
INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr1 map...
INFO:normalizer:        ...Finished KR Normalization for chr1 map...
If matrix from RAM and HDD are similar:  True
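
`%timeit` is an IPython magic and is not available in a plain Python script. A comparable measurement can be sketched with the standard `timeit` module. The helper `benchmark` and the function `placeholder_workload` are hypothetical stand-ins; in practice the workload would be a call such as the KR normalization shown above.

```python
import timeit


def benchmark(func, repeat=7, number=1):
    """Time `func` and return (mean, best) seconds per call.

    `repeat` measurements are taken, each running `func`
    `number` times, as %timeit reports "7 runs, 1 loop each".
    """
    totals = timeit.repeat(func, repeat=repeat, number=number)
    per_call = [total / number for total in totals]
    return sum(per_call) / len(per_call), min(per_call)


def placeholder_workload():
    # Stand-in for an expensive call such as a normalization
    return sum(i * i for i in range(10000))


mean_s, best_s = benchmark(placeholder_workload)
print('mean: {0:.6f} s, best: {1:.6f} s'.format(mean_s, best_s))
```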

Normalize and save all ccmaps

In [6]:
chroms = [1, 5, 15, 20, 21]      # List of chromosomes

# Loop for each chromosome
for chrom in chroms:
    input_file = 'cmaps/CooMatrix/chr{0}_100kb_RawObserved.ccmap' .format(chrom)
    output_file = 'normalized/chr{0}_100kb_normKR.ccmap' .format(chrom)

    raw_ccmap = gmlib.ccmap.load_ccmap(input_file)
    norm_ccmap = gmlib.normalizer.normalizeCCMapByKR(raw_ccmap, memory='RAM', workDir=os.getcwd())
    gmlib.ccmap.save_ccmap(norm_ccmap, output_file, compress=True)

    del raw_ccmap     # Remove CCMAP object from memory and any related temporary files
    del norm_ccmap    # Remove CCMAP object from memory and any related temporary files
INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr1 map...
INFO:normalizer:        ...Finished KR Normalization for chr1 map...
INFO:save_ccmap: Saving ccmap to file [normalized/chr1_100kb_normKR.ccmap] and [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr1_100kb_normKR.npbin] ...
INFO:save_ccmap: Compressing [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr1_100kb_normKR.npbin] ...
INFO:save_ccmap:       Finished!!!

INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr5 map...
INFO:normalizer:        ...Finished KR Normalization for chr5 map...
INFO:save_ccmap: Saving ccmap to file [normalized/chr5_100kb_normKR.ccmap] and [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr5_100kb_normKR.npbin] ...
INFO:save_ccmap: Compressing [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr5_100kb_normKR.npbin] ...
INFO:save_ccmap:       Finished!!!

INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr15 map...
INFO:normalizer:        ...Finished KR Normalization for chr15 map...
INFO:save_ccmap: Saving ccmap to file [normalized/chr15_100kb_normKR.ccmap] and [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr15_100kb_normKR.npbin] ...
INFO:save_ccmap: Compressing [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr15_100kb_normKR.npbin] ...
INFO:save_ccmap:       Finished!!!

INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr20 map...
INFO:normalizer:        ...Finished KR Normalization for chr20 map...
INFO:save_ccmap: Saving ccmap to file [normalized/chr20_100kb_normKR.ccmap] and [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr20_100kb_normKR.npbin] ...
INFO:save_ccmap: Compressing [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr20_100kb_normKR.npbin] ...
INFO:save_ccmap:       Finished!!!

INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr21 map...
INFO:normalizer:        ...Finished KR Normalization for chr21 map...
INFO:save_ccmap: Saving ccmap to file [normalized/chr21_100kb_normKR.ccmap] and [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr21_100kb_normKR.npbin] ...
INFO:save_ccmap: Compressing [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr21_100kb_normKR.npbin] ...
INFO:save_ccmap:       Finished!!!

All normalized ccmap files are saved in the output directory.


Normalize all maps from a gcmap file using KR method

Maps stored in a gcmap file can be normalized and stored in another gcmap file. Here, the tolerance is increased to 1e-4 so that the normalized values can later be compared with the original algorithm.

In [7]:
# Input raw gcmap file
raw_gcmap_file = 'cmaps/CooMatrix/rawObserved_100kb.gcmap'

# Name of output gcmap file
normKR_gcmap_file = 'normalized/normKR_100kb.gcmap'

# Perform normalization and save to output file
gmlib.normalizer.normalizeGCMapByKR(raw_gcmap_file, normKR_gcmap_file, tol=1e-4)
INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr21 map...
INFO:normalizer:        ...Finished KR Normalization for chr21 map...
INFO:addCCMap2GCMap: Opened file [normalized/normKR_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normKR_100kb.gcmap] for [chr21] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr21] ...
INFO:addCCMap2GCMap: Generating downsampled maps for [chr21] ...
INFO:addCCMap2GCMap:     ... Finished downsampling for [chr21] ...
INFO:addCCMap2GCMap: Closed file [normalized/normKR_100kb.gcmap]...
INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr22 map...
INFO:normalizer:        ...Finished KR Normalization for chr22 map...
INFO:addCCMap2GCMap: Opened file [normalized/normKR_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normKR_100kb.gcmap] for [chr22] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr22] ...
INFO:addCCMap2GCMap: Generating downsampled maps for [chr22] ...
INFO:addCCMap2GCMap:     ... Finished downsampling for [chr22] ...
INFO:addCCMap2GCMap: Closed file [normalized/normKR_100kb.gcmap]...
INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr20 map...
INFO:normalizer:        ...Finished KR Normalization for chr20 map...
INFO:addCCMap2GCMap: Opened file [normalized/normKR_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normKR_100kb.gcmap] for [chr20] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr20] ...
INFO:addCCMap2GCMap: Generating downsampled maps for [chr20] ...
INFO:addCCMap2GCMap:     ... Finished downsampling for [chr20] ...
INFO:addCCMap2GCMap: Closed file [normalized/normKR_100kb.gcmap]...
INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr15 map...
INFO:normalizer:        ...Finished KR Normalization for chr15 map...
INFO:addCCMap2GCMap: Opened file [normalized/normKR_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normKR_100kb.gcmap] for [chr15] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr15] ...
INFO:addCCMap2GCMap: Generating downsampled maps for [chr15] ...
INFO:addCCMap2GCMap:     ... Finished downsampling for [chr15] ...
INFO:addCCMap2GCMap: Closed file [normalized/normKR_100kb.gcmap]...
INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr5 map...
INFO:normalizer:        ...Finished KR Normalization for chr5 map...
INFO:addCCMap2GCMap: Opened file [normalized/normKR_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normKR_100kb.gcmap] for [chr5] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr5] ...
INFO:addCCMap2GCMap: Generating downsampled maps for [chr5] ...
INFO:addCCMap2GCMap:     ... Finished downsampling for [chr5] ...
INFO:addCCMap2GCMap: Closed file [normalized/normKR_100kb.gcmap]...
INFO:normalizer: KR Normalization will be done through RAM.
INFO:normalizer: KR Normalization is in process for chr1 map...
INFO:normalizer:        ...Finished KR Normalization for chr1 map...
INFO:addCCMap2GCMap: Opened file [normalized/normKR_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normKR_100kb.gcmap] for [chr1] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr1] ...
INFO:addCCMap2GCMap: Generating downsampled maps for [chr1] ...
INFO:addCCMap2GCMap:     ... Finished downsampling for [chr1] ...
INFO:addCCMap2GCMap: Closed file [normalized/normKR_100kb.gcmap]...

See also

Function gcMapExplorer.lib.normalizer.normalizeGCMapByKR() for more details.
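
To illustrate what a tolerance such as `tol` typically controls in matrix balancing, here is a small pure-Python sketch. It is not the Knight and Ruiz algorithm, which uses a Newton-type iteration; it is a simpler damped fixed-point scheme that reaches the same kind of result, a symmetric matrix whose rows each sum to one. The function name, its arguments, and the example matrix are assumptions made for illustration, and every row is assumed to have a positive sum.

```python
import math


def balance_symmetric(matrix, tol=1e-12, max_iter=1000):
    """Scale a symmetric positive matrix so that every row sums to 1.

    Finds a vector x such that x[i] * matrix[i][j] * x[j] has unit
    row sums. Returns (balanced_matrix, number_of_updates).
    """
    n = len(matrix)
    x = [1.0] * n
    updates = 0
    while True:
        ax = [sum(matrix[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Largest deviation of the current row sums from 1
        error = max(abs(x[i] * ax[i] - 1.0) for i in range(n))
        if error < tol or updates >= max_iter:
            break
        # Damped update: geometric mean of x[i] and 1 / ax[i]
        x = [math.sqrt(x[i] / ax[i]) for i in range(n)]
        updates += 1
    balanced = [[x[i] * matrix[i][j] * x[j] for j in range(n)]
                for i in range(n)]
    return balanced, updates


toy_map = [[10.0, 4.0, 1.0],
           [4.0, 8.0, 3.0],
           [1.0, 3.0, 6.0]]

loose, loose_updates = balance_symmetric(toy_map, tol=1e-4)
tight, tight_updates = balance_symmetric(toy_map, tol=1e-12)
print('updates with tol=1e-4 :', loose_updates)
print('updates with tol=1e-12:', tight_updates)
```

A larger tolerance stops the iteration earlier, at the cost of row sums that are only approximately equal.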


Normalize by Iterative correction method

This method normalizes the raw contact map by removing biases introduced by the experimental procedure. For more details, see this publication.

See also

Function gcMapExplorer.lib.normalizer.normalizeCCMapByIC() for more details.
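
The general idea of iterative correction can be sketched in a few lines of plain Python: row sums are repeatedly divided out until all rows have the same total, and the accumulated factors form the per-bin bias. This is a simplified illustration under the assumption that every row has a non-zero sum; it is not the gcMapExplorer implementation, and the names `iterative_correction`, `tol`, and `max_iter` are hypothetical.

```python
def iterative_correction(matrix, tol=1e-8, max_iter=500):
    """Equalize the row sums of a symmetric contact matrix.

    Returns (corrected_matrix, bias) such that
    corrected[i][j] * bias[i] * bias[j] recovers matrix[i][j].
    """
    n = len(matrix)
    corrected = [[float(value) for value in row] for row in matrix]
    bias = [1.0] * n
    for _ in range(max_iter):
        row_sums = [sum(row) for row in corrected]
        mean_sum = sum(row_sums) / n
        # Relative coverage of each bin in this round
        delta = [s / mean_sum for s in row_sums]
        corrected = [[corrected[i][j] / (delta[i] * delta[j])
                      for j in range(n)] for i in range(n)]
        bias = [bias[i] * delta[i] for i in range(n)]
        # Stop once no bin needs a noticeable correction any more
        if max(abs(d - 1.0) for d in delta) < tol:
            break
    return corrected, bias


raw_counts = [[20, 6, 2, 1],
              [6, 15, 5, 2],
              [2, 5, 12, 4],
              [1, 2, 4, 9]]

ic_map, ic_bias = iterative_correction(raw_counts)
print('row sums after correction:', [round(sum(row), 6) for row in ic_map])
```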

In [8]:
chroms = [1, 5, 15, 20, 21]      # List of chromosomes

# Loop for each chromosome
for chrom in chroms:
    input_file = 'cmaps/CooMatrix/chr{0}_100kb_RawObserved.ccmap' .format(chrom)
    output_file = 'normalized/chr{0}_100kb_IC.ccmap' .format(chrom)

    raw_ccmap = gmlib.ccmap.load_ccmap(input_file)
    norm_ccmap = gmlib.normalizer.normalizeCCMapByIC(raw_ccmap)
    gmlib.ccmap.save_ccmap(norm_ccmap, output_file, compress=True)

    del raw_ccmap     # Remove CCMAP object from memory and any related temporary files
    del norm_ccmap    # Remove CCMAP object from memory and any related temporary files
INFO:normalizer: Iterative Correction is in process for chr1 map...
INFO:normalizer:        ...Finished Iterative Correction for chr1 map...
INFO:save_ccmap: Saving ccmap to file [normalized/chr1_100kb_IC.ccmap] and [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr1_100kb_IC.npbin] ...
INFO:save_ccmap: Compressing [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr1_100kb_IC.npbin] ...
INFO:save_ccmap:       Finished!!!

INFO:normalizer: Iterative Correction is in process for chr5 map...
INFO:normalizer:        ...Finished Iterative Correction for chr5 map...
INFO:save_ccmap: Saving ccmap to file [normalized/chr5_100kb_IC.ccmap] and [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr5_100kb_IC.npbin] ...
INFO:save_ccmap: Compressing [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr5_100kb_IC.npbin] ...
INFO:save_ccmap:       Finished!!!

INFO:normalizer: Iterative Correction is in process for chr15 map...
INFO:normalizer:        ...Finished Iterative Correction for chr15 map...
INFO:save_ccmap: Saving ccmap to file [normalized/chr15_100kb_IC.ccmap] and [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr15_100kb_IC.npbin] ...
INFO:save_ccmap: Compressing [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr15_100kb_IC.npbin] ...
INFO:save_ccmap:       Finished!!!

INFO:normalizer: Iterative Correction is in process for chr20 map...
INFO:normalizer:        ...Finished Iterative Correction for chr20 map...
INFO:save_ccmap: Saving ccmap to file [normalized/chr20_100kb_IC.ccmap] and [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr20_100kb_IC.npbin] ...
INFO:save_ccmap: Compressing [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr20_100kb_IC.npbin] ...
INFO:save_ccmap:       Finished!!!

INFO:normalizer: Iterative Correction is in process for chr21 map...
INFO:normalizer:        ...Finished Iterative Correction for chr21 map...
INFO:save_ccmap: Saving ccmap to file [normalized/chr21_100kb_IC.ccmap] and [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr21_100kb_IC.npbin] ...
INFO:save_ccmap: Compressing [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr21_100kb_IC.npbin] ...
INFO:save_ccmap:       Finished!!!


Normalize all maps from a gcmap file using IC method

Maps stored in a gcmap file can be normalized and stored in another gcmap file.

Note: Here we used a high tolerance and a large number of iterations.

In [9]:
# Input raw gcmap file
raw_gcmap_file = 'cmaps/CooMatrix/rawObserved_100kb.gcmap'

# Name of output gcmap file
normIC_gcmap_file = 'normalized/normIC_100kb.gcmap'

# Perform normalization and save to output file
# Note that here we used a high tolerance and a large number of iterations
gmlib.normalizer.normalizeGCMapByIC(raw_gcmap_file, normIC_gcmap_file, tol=1e-4, iteration=3000)
INFO:normalizer: Iterative Correction is in process for chr21 map...
INFO:normalizer:        ...Finished Iterative Correction for chr21 map...
INFO:addCCMap2GCMap: Opened file [normalized/normIC_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normIC_100kb.gcmap] for [chr21] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr21] ...
INFO:addCCMap2GCMap: Generating downsampled maps for [chr21] ...
INFO:addCCMap2GCMap:     ... Finished downsampling for [chr21] ...
INFO:addCCMap2GCMap: Closed file [normalized/normIC_100kb.gcmap]...
INFO:normalizer: Iterative Correction is in process for chr22 map...
INFO:normalizer:        ...Finished Iterative Correction for chr22 map...
INFO:addCCMap2GCMap: Opened file [normalized/normIC_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normIC_100kb.gcmap] for [chr22] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr22] ...
INFO:addCCMap2GCMap: Generating downsampled maps for [chr22] ...
INFO:addCCMap2GCMap:     ... Finished downsampling for [chr22] ...
INFO:addCCMap2GCMap: Closed file [normalized/normIC_100kb.gcmap]...
INFO:normalizer: Iterative Correction is in process for chr20 map...
INFO:normalizer:        ...Finished Iterative Correction for chr20 map...
INFO:addCCMap2GCMap: Opened file [normalized/normIC_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normIC_100kb.gcmap] for [chr20] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr20] ...
INFO:addCCMap2GCMap: Generating downsampled maps for [chr20] ...
INFO:addCCMap2GCMap:     ... Finished downsampling for [chr20] ...
INFO:addCCMap2GCMap: Closed file [normalized/normIC_100kb.gcmap]...
INFO:normalizer: Iterative Correction is in process for chr15 map...
INFO:normalizer:        ...Finished Iterative Correction for chr15 map...
INFO:addCCMap2GCMap: Opened file [normalized/normIC_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normIC_100kb.gcmap] for [chr15] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr15] ...
INFO:addCCMap2GCMap: Generating downsampled maps for [chr15] ...
INFO:addCCMap2GCMap:     ... Finished downsampling for [chr15] ...
INFO:addCCMap2GCMap: Closed file [normalized/normIC_100kb.gcmap]...
INFO:normalizer: Iterative Correction is in process for chr5 map...
INFO:normalizer:        ...Finished Iterative Correction for chr5 map...
INFO:addCCMap2GCMap: Opened file [normalized/normIC_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normIC_100kb.gcmap] for [chr5] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr5] ...
INFO:addCCMap2GCMap: Generating downsampled maps for [chr5] ...
INFO:addCCMap2GCMap:     ... Finished downsampling for [chr5] ...
INFO:addCCMap2GCMap: Closed file [normalized/normIC_100kb.gcmap]...
INFO:normalizer: Iterative Correction is in process for chr1 map...
INFO:normalizer:        ...Finished Iterative Correction for chr1 map...
INFO:addCCMap2GCMap: Opened file [normalized/normIC_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normIC_100kb.gcmap] for [chr1] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr1] ...
INFO:addCCMap2GCMap: Generating downsampled maps for [chr1] ...
INFO:addCCMap2GCMap:     ... Finished downsampling for [chr1] ...
INFO:addCCMap2GCMap: Closed file [normalized/normIC_100kb.gcmap]...

See also

Function gcMapExplorer.lib.normalizer.normalizeGCMapByIC() for more details.


Normalize by Median Contact Frequency Scaling (MCFS)

This method scales a Hi-C map using the median contact value at each distance between two locations/coordinates. First, the median contact frequency is calculated for each distance. Then, each observed contact frequency is divided by the median contact frequency for its distance.

See also

Function gcMapExplorer.lib.normalizer.normalizeCCMapByMCFS() for more details.
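
The two steps described above can be sketched in plain Python. Bins at the same distance lie on the same diagonal of the matrix, so the median is taken along each diagonal. This is only an illustration for a small symmetric matrix; the function name is hypothetical, and the choice to take the median over all values of a diagonal (and to leave a diagonal at zero when its median is zero) is an assumption that may differ from the library.

```python
from statistics import median


def scale_by_distance_median(matrix):
    """Divide each contact by the median contact at the same distance."""
    n = len(matrix)
    scaled = [[0.0] * n for _ in range(n)]
    for distance in range(n):
        # All contacts between bins separated by `distance` bins
        diagonal = [matrix[i][i + distance] for i in range(n - distance)]
        med = median(diagonal)
        if med == 0:
            continue    # nothing to scale by; entries stay 0.0
        for i in range(n - distance):
            value = matrix[i][i + distance] / med
            scaled[i][i + distance] = value
            scaled[i + distance][i] = value    # keep the map symmetric
    return scaled


observed = [[8, 4, 2, 1],
            [4, 10, 6, 2],
            [2, 6, 12, 3],
            [1, 2, 3, 6]]

mcfs_map = scale_by_distance_median(observed)
print(mcfs_map[1][2])    # 6 divided by the median of [4, 6, 3], i.e. 1.5
```

After scaling, a value above 1 means the contact is more frequent than is typical for its distance, and a value below 1 means it is less frequent.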

In [10]:
chroms = [1, 5, 15, 20, 21]      # List of chromosomes

# Loop for each chromosome
for chrom in chroms:
    input_file = 'cmaps/CooMatrix/chr{0}_100kb_RawObserved.ccmap' .format(chrom)
    output_file = 'normalized/chr{0}_100kb_MCFS.ccmap' .format(chrom)

    raw_ccmap = gmlib.ccmap.load_ccmap(input_file)
    norm_ccmap = gmlib.normalizer.normalizeCCMapByMCFS(raw_ccmap)
    gmlib.ccmap.save_ccmap(norm_ccmap, output_file, compress=True)

    del raw_ccmap     # Remove CCMAP object from memory and any related temporary files
    del norm_ccmap    # Remove CCMAP object from memory and any related temporary files
INFO:normalizer: Median Contact Frequency Scaling is in process for chr1 map...
INFO:normalizer:        ...Finished Median Contact Frequency Scaling for chr1 map...
INFO:save_ccmap: Saving ccmap to file [normalized/chr1_100kb_MCFS.ccmap] and [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr1_100kb_MCFS.npbin] ...
INFO:save_ccmap: Compressing [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr1_100kb_MCFS.npbin] ...
INFO:save_ccmap:       Finished!!!

INFO:normalizer: Median Contact Frequency Scaling is in process for chr5 map...
INFO:normalizer:        ...Finished Median Contact Frequency Scaling for chr5 map...
INFO:save_ccmap: Saving ccmap to file [normalized/chr5_100kb_MCFS.ccmap] and [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr5_100kb_MCFS.npbin] ...
INFO:save_ccmap: Compressing [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr5_100kb_MCFS.npbin] ...
INFO:save_ccmap:       Finished!!!

INFO:normalizer: Median Contact Frequency Scaling is in process for chr15 map...
INFO:normalizer:        ...Finished Median Contact Frequency Scaling for chr15 map...
INFO:save_ccmap: Saving ccmap to file [normalized/chr15_100kb_MCFS.ccmap] and [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr15_100kb_MCFS.npbin] ...
INFO:save_ccmap: Compressing [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr15_100kb_MCFS.npbin] ...
INFO:save_ccmap:       Finished!!!

INFO:normalizer: Median Contact Frequency Scaling is in process for chr20 map...
INFO:normalizer:        ...Finished Median Contact Frequency Scaling for chr20 map...
INFO:save_ccmap: Saving ccmap to file [normalized/chr20_100kb_MCFS.ccmap] and [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr20_100kb_MCFS.npbin] ...
INFO:save_ccmap: Compressing [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr20_100kb_MCFS.npbin] ...
INFO:save_ccmap:       Finished!!!

INFO:normalizer: Median Contact Frequency Scaling is in process for chr21 map...
INFO:normalizer:        ...Finished Median Contact Frequency Scaling for chr21 map...
INFO:save_ccmap: Saving ccmap to file [normalized/chr21_100kb_MCFS.ccmap] and [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr21_100kb_MCFS.npbin] ...
INFO:save_ccmap: Compressing [/home/rajendra/workspace/genome_3d_organization/tutorials_modules/normalized/chr21_100kb_MCFS.npbin] ...
INFO:save_ccmap:       Finished!!!


Normalize all maps from a gcmap file using MCFS method

Maps stored in a gcmap file can be normalized and stored in another gcmap file.

In [11]:
# Name of output gcmap file
normMCFS_gcmap_file = 'normalized/normMCFS_100kb.gcmap'


# Perform scaling and save to output file
gmlib.normalizer.normalizeGCMapByMCFS(raw_gcmap_file, normMCFS_gcmap_file)
INFO:normalizer: Median Contact Frequency Scaling is in process for chr21 map...
INFO:normalizer:        ...Finished Median Contact Frequency Scaling for chr21 map...
INFO:addCCMap2GCMap: Opened file [normalized/normMCFS_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normMCFS_100kb.gcmap] for [chr21 - 100kb] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr21] ...
INFO:addCCMap2GCMap: Closed file [normalized/normMCFS_100kb.gcmap]...
INFO:normalizer: Median Contact Frequency Scaling is in process for chr22 map...
INFO:normalizer:        ...Finished Median Contact Frequency Scaling for chr22 map...
INFO:addCCMap2GCMap: Opened file [normalized/normMCFS_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normMCFS_100kb.gcmap] for [chr22 - 100kb] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr22] ...
INFO:addCCMap2GCMap: Closed file [normalized/normMCFS_100kb.gcmap]...
INFO:normalizer: Median Contact Frequency Scaling is in process for chr22 map...
INFO:normalizer:        ...Finished Median Contact Frequency Scaling for chr22 map...
INFO:addCCMap2GCMap: Opened file [normalized/normMCFS_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normMCFS_100kb.gcmap] for [chr22 - 200kb] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr22] ...
INFO:addCCMap2GCMap: Closed file [normalized/normMCFS_100kb.gcmap]...
INFO:normalizer: Median Contact Frequency Scaling is in process for chr20 map...
INFO:normalizer:        ...Finished Median Contact Frequency Scaling for chr20 map...
INFO:addCCMap2GCMap: Opened file [normalized/normMCFS_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normMCFS_100kb.gcmap] for [chr20 - 100kb] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr20] ...
INFO:addCCMap2GCMap: Closed file [normalized/normMCFS_100kb.gcmap]...
INFO:normalizer: Median Contact Frequency Scaling is in process for chr20 map...
INFO:normalizer:        ...Finished Median Contact Frequency Scaling for chr20 map...
INFO:addCCMap2GCMap: Opened file [normalized/normMCFS_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normMCFS_100kb.gcmap] for [chr20 - 200kb] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr20] ...
INFO:addCCMap2GCMap: Closed file [normalized/normMCFS_100kb.gcmap]...
INFO:normalizer: Median Contact Frequency Scaling is in process for chr15 map...
INFO:normalizer:        ...Finished Median Contact Frequency Scaling for chr15 map...
INFO:addCCMap2GCMap: Opened file [normalized/normMCFS_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normMCFS_100kb.gcmap] for [chr15 - 100kb] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr15] ...
INFO:addCCMap2GCMap: Closed file [normalized/normMCFS_100kb.gcmap]...
INFO:normalizer: Median Contact Frequency Scaling is in process for chr15 map...
INFO:normalizer:        ...Finished Median Contact Frequency Scaling for chr15 map...
INFO:addCCMap2GCMap: Opened file [normalized/normMCFS_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normMCFS_100kb.gcmap] for [chr15 - 200kb] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr15] ...
INFO:addCCMap2GCMap: Closed file [normalized/normMCFS_100kb.gcmap]...
INFO:normalizer: Median Contact Frequency Scaling is in process for chr15 map...
INFO:normalizer:        ...Finished Median Contact Frequency Scaling for chr15 map...
INFO:addCCMap2GCMap: Opened file [normalized/normMCFS_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normMCFS_100kb.gcmap] for [chr15 - 400kb] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr15] ...
INFO:addCCMap2GCMap: Closed file [normalized/normMCFS_100kb.gcmap]...
INFO:normalizer: Median Contact Frequency Scaling is in process for chr5 map...
INFO:normalizer:        ...Finished Median Contact Frequency Scaling for chr5 map...
INFO:addCCMap2GCMap: Opened file [normalized/normMCFS_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normMCFS_100kb.gcmap] for [chr5 - 100kb] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr5] ...
INFO:addCCMap2GCMap: Closed file [normalized/normMCFS_100kb.gcmap]...
INFO:normalizer: Median Contact Frequency Scaling is in process for chr5 map...
INFO:normalizer:        ...Finished Median Contact Frequency Scaling for chr5 map...
INFO:addCCMap2GCMap: Opened file [normalized/normMCFS_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normMCFS_100kb.gcmap] for [chr5 - 200kb] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr5] ...
INFO:addCCMap2GCMap: Closed file [normalized/normMCFS_100kb.gcmap]...
INFO:normalizer: Median Contact Frequency Scaling is in process for chr5 map...
INFO:normalizer:        ...Finished Median Contact Frequency Scaling for chr5 map...
INFO:addCCMap2GCMap: Opened file [normalized/normMCFS_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normMCFS_100kb.gcmap] for [chr5 - 400kb] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr5] ...
INFO:addCCMap2GCMap: Closed file [normalized/normMCFS_100kb.gcmap]...
INFO:normalizer: Median Contact Frequency Scaling is in process for chr1 map...
INFO:normalizer:        ...Finished Median Contact Frequency Scaling for chr1 map...
INFO:addCCMap2GCMap: Opened file [normalized/normMCFS_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normMCFS_100kb.gcmap] for [chr1 - 100kb] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr1] ...
INFO:addCCMap2GCMap: Closed file [normalized/normMCFS_100kb.gcmap]...
INFO:normalizer: Median Contact Frequency Scaling is in process for chr1 map...
INFO:normalizer:        ...Finished Median Contact Frequency Scaling for chr1 map...
INFO:addCCMap2GCMap: Opened file [normalized/normMCFS_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normMCFS_100kb.gcmap] for [chr1 - 200kb] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr1] ...
INFO:addCCMap2GCMap: Closed file [normalized/normMCFS_100kb.gcmap]...
INFO:normalizer: Median Contact Frequency Scaling is in process for chr1 map...
INFO:normalizer:        ...Finished Median Contact Frequency Scaling for chr1 map...
INFO:addCCMap2GCMap: Opened file [normalized/normMCFS_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normMCFS_100kb.gcmap] for [chr1 - 400kb] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr1] ...
INFO:addCCMap2GCMap: Closed file [normalized/normMCFS_100kb.gcmap]...
INFO:normalizer: Median Contact Frequency Scaling is in process for chr1 map...
INFO:normalizer:        ...Finished Median Contact Frequency Scaling for chr1 map...
INFO:addCCMap2GCMap: Opened file [normalized/normMCFS_100kb.gcmap] for reading writing..
INFO:addCCMap2GCMap: Adding data to [normalized/normMCFS_100kb.gcmap] for [chr1 - 800kb] ...
INFO:addCCMap2GCMap:     ...Finished adding data for [chr1] ...
INFO:addCCMap2GCMap: Closed file [normalized/normMCFS_100kb.gcmap]...

See also

Function gcMapExplorer.lib.normalizer.normalizeGCMapByMCFS() for more details.