class BigWigHandler
BigWigHandler (filenames[, …]) |
Handle bigWig files and convert them to an h5 file |
BigWigHandler.getBigWigInfo () |
Retrieve chromosome names and their sizes |
BigWigHandler.bigWigtoWig ([outfilenames]) |
Generate Wig files |
BigWigHandler.saveAsH5 (filename[, …]) |
Save data to h5 file. |
-
class BigWigHandler(filenames, pathTobigWigToWig=None, pathTobigWigInfo=None, chromName=None, methodToCombine='mean', workDir=None, maxEntryWrite=10000000)
Handle bigWig files and convert them to an h5 file.
This class can be used to convert a bigWig file to an h5 file. It can also be used to combine several bigWig files that originate from replicated experiments.
Warning
Presently, the bigWigToWig and bigWigInfo programs are not available for Windows OS. Therefore, this class will fail on this OS.
-
bigWigFileNames
str or list[str] – List of bigWig file names including path
-
pathTobigWigToWig
str – Path to the bigWigToWig program. It can be downloaded from http://hgdownload.cse.ucsc.edu/admin/exe/ for MacOSX and Linux. If the path to the program is already present in the configuration file, it is taken from there. If it is not present in the configuration file, the path must be provided as input; it is then stored in the configuration file for later use.
-
pathTobigWigInfo
str – Path to the bigWigInfo program. It can be downloaded from http://hgdownload.cse.ucsc.edu/admin/exe/ for MacOSX and Linux. If the path to the program is already present in the configuration file, it is taken from there. If it is not present in the configuration file, the path must be provided as input; it is then stored in the configuration file for later use.
-
WigFileNames
str – List of Wig file names, either automatically generated or given by the user
-
chromName
str – Name of the input target chromosome. If this is provided, only the data of this chromosome is extracted and stored in the h5 file.
-
wigHandle
WigHandler – WigHandler instance to parse the Wig file and save data as an hdf5 file
-
chromSizeInfo
dict – A dictionary containing chromosome size information
-
methodToCombine
str – Method to combine bigWig/Wig files. Presently, the accepted keywords are: mean, min and max
-
maxEntryWrite
int – Number of lines read from the Wig file at a time; after this, the data is dumped to a temporary numpy array file
Parameters:
- filenames (str or list[str]) – A bigWig file or a list of bigWig files including path
- pathTobigWigToWig (str) – Path to the bigWigToWig program. It can be downloaded from http://hgdownload.cse.ucsc.edu/admin/exe/ for MacOSX and Linux. If the path to the program is already present in the configuration file, it is taken from there. If it is not present in the configuration file, the path must be provided as input; it is then stored in the configuration file for later use.
- pathTobigWigInfo (str) – Path to the bigWigInfo program. It can be downloaded from http://hgdownload.cse.ucsc.edu/admin/exe/ for MacOSX and Linux. If the path to the program is already present in the configuration file, it is taken from there. If it is not present in the configuration file, the path must be provided as input; it is then stored in the configuration file for later use.
- chromName (str) – Name of the input target chromosome. If this is provided, only the data of this chromosome is extracted and stored in the h5 file.
- methodToCombine (str) – Method to combine bigWig/Wig files. Presently, the accepted keywords are: mean, min and max
- maxEntryWrite (int) – Number of lines read from the Wig file at a time; after this, the data is dumped to a temporary numpy array file. Larger values need more memory (RAM), so reduce this number to lower memory usage.
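To make the methodToCombine keywords concrete, the following is a minimal sketch of combining replicate signal values position by position. It is not the library's implementation: the helper name combine_replicates and the use of plain Python lists are assumptions made for illustration only.

```python
# Illustrative sketch only; combine_replicates is a hypothetical helper,
# not part of gcMapExplorer.
COMBINERS = {
    'mean': lambda values: sum(values) / len(values),
    'min': min,
    'max': max,
}


def combine_replicates(replicates, method='mean'):
    """Combine equal-length lists of signal values position by position."""
    if method not in COMBINERS:
        raise ValueError("method must be one of %s" % sorted(COMBINERS))
    combine = COMBINERS[method]
    # zip(*replicates) groups the values found at the same position
    return [combine(values) for values in zip(*replicates)]


first = [1.0, 4.0, 2.0]
second = [3.0, 2.0, 2.0]
print(combine_replicates([first, second], 'mean'))  # [2.0, 3.0, 2.0]
print(combine_replicates([first, second], 'max'))   # [3.0, 4.0, 2.0]
```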
-
_bigWigtoWig(bigWigFileName, outfilename)
Base method to generate a Wig file from a bigWig file.
Use BigWigHandler.bigWigtoWig() to automatically convert all bigWig files to Wig files.
Warning
Private method. Use it at your own risk. It is used internally in BigWigHandler.bigWigtoWig().
Parameters:
-
_checkBigWigInfoProgram(pathTobigWigInfo)
Check if the bigWigInfo program is available or accessible.
If the program is not available in the configuration file, the given path is stored in the file after checking its accessibility.
The path is stored in gcMapExplorer.lib.genomicsDataHandler.BigWigHandler.pathTobigWigInfo
Parameters: pathTobigWigInfo (str) – Path to bigWigInfo program
-
_checkBigWigToWigProgram(pathTobigWigToWig)
Check if the bigWigToWig program is available or accessible.
If the program is not available in the configuration file, the given path is stored in the file after checking its accessibility.
The path is stored in gcMapExplorer.lib.genomicsDataHandler.BigWigHandler.pathTobigWigToWig
Parameters: pathTobigWigToWig (str) – Path to bigWigToWig program
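As a rough illustration of what an accessibility check for an external program can look like, here is a small sketch using only the Python standard library. The function name is_program_accessible is hypothetical, and this is not how gcMapExplorer performs the check or handles its configuration file.

```python
import os
import shutil
import sys


def is_program_accessible(path):
    """Return True if path is an executable file or a command found on PATH.

    Hypothetical helper for illustration; not part of gcMapExplorer.
    """
    if not path:
        return False
    if os.path.isfile(path) and os.access(path, os.X_OK):
        return True
    # Fall back to a PATH lookup, e.g. for a bare name such as 'bigWigInfo'
    return shutil.which(path) is not None


# The running Python interpreter is used here only as a stand-in executable.
print(is_program_accessible(sys.executable))
print(is_program_accessible('/no/such/dir/bigWigInfo'))
```

Such a check could be run before constructing a BigWigHandler to report a missing bigWigToWig or bigWigInfo binary early.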
-
_getBigWigInfo(filename)
Base method to retrieve chromosome names and their sizes.
- Chromosome size information is stored for a given bigWig file. If the size of a chromosome is already present in the dictionary, the largest size is kept.
- Use BigWigHandler.getBigWigInfo() to automatically retrieve chromosome size information from all bigWig files.
Warning
Private method. Use it at your own risk. It is used internally in BigWigHandler.getBigWigInfo().
Parameters: filename (str) – Input bigWig file
-
bigWigtoWig(outfilenames=None)
Generate Wig files.
It uses the bigWigToWig program to convert bigWig files to Wig files. It uses BigWigHandler.chromSizeInfo to extract the data of the listed chromosomes.
If outfilenames is provided, Wig files are generated with these names. Otherwise, Wig file names are generated randomly and listed in BigWigHandler.WigFileNames. Files generated with random names are deleted after execution.
Parameters: outfilenames (str or list of str) – List of Wig file names. If None, names are generated automatically, the files are created temporarily, and all of them are deleted after execution.
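The conversion step can be pictured as assembling a bigWigToWig command line and, when no output name is given, creating a temporary Wig file name. The sketch below only builds the argument list and a temporary file; both helper names are hypothetical, and the exact command that gcMapExplorer runs internally is not documented here. The -chrom= option is the one listed in the UCSC bigWigToWig usage message.

```python
import os
import tempfile


def build_bigwig_to_wig_command(program, bigwig_file, wig_file, chrom=None):
    """Assemble an argument list for the UCSC bigWigToWig program.

    Hypothetical helper for illustration; not part of gcMapExplorer.
    """
    command = [program]
    if chrom is not None:
        # Restrict the output to a single chromosome
        command.append('-chrom=%s' % chrom)
    command.extend([bigwig_file, wig_file])
    return command


def temporary_wig_name(work_dir=None):
    """Create an empty temporary file with a random name ending in .wig."""
    fd, name = tempfile.mkstemp(suffix='.wig', dir=work_dir)
    os.close(fd)
    return name


wig_name = temporary_wig_name()
print(build_bigwig_to_wig_command('./bigWigToWig', 'first.bigWig', wig_name, chrom='chr1'))
os.remove(wig_name)  # temporary files are removed once they are no longer needed
```

The resulting list could be passed to subprocess.run(command, check=True) on a system where the bigWigToWig binary is installed.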
-
getBigWigInfo()
Retrieve chromosome names and their sizes.
The bigWigInfo program is executed on all listed bigWig files, and each chromosome name with its respective size is stored in the BigWigHandler.chromSizeInfo variable. When a chromosome appears in several of the listed bigWig files, only its largest size is kept.
If BigWigHandler.chromName is provided, only the target chromosome information is kept in the BigWigHandler.chromSizeInfo dictionary.
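The "largest size wins" rule and the optional chromosome filter can be sketched with plain dictionaries. The function merge_chrom_sizes and the sample sizes below are made up for illustration and are not taken from the library.

```python
def merge_chrom_sizes(size_dicts, chrom_name=None):
    """Merge per-file chromosome sizes, keeping the largest size per chromosome.

    Hypothetical helper for illustration; not part of gcMapExplorer.
    """
    merged = {}
    for sizes in size_dicts:
        for chrom, size in sizes.items():
            if chrom_name is not None and chrom != chrom_name:
                continue  # keep only the target chromosome
            merged[chrom] = max(size, merged.get(chrom, 0))
    return merged


# Made-up sizes standing in for the output of bigWigInfo on two files
file_one = {'chr1': 1000, 'chr2': 800}
file_two = {'chr1': 1200, 'chr2': 750, 'chr3': 500}

print(merge_chrom_sizes([file_one, file_two]))
# {'chr1': 1200, 'chr2': 800, 'chr3': 500}
print(merge_chrom_sizes([file_one, file_two], chrom_name='chr2'))
# {'chr2': 800}
```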
-
saveAsH5(filename, tmpNumpyArrayFiles=None, title=None, resolutions=None, coarsening_methods=None, compression='lzf', keep_original=False)
Save data to h5 file.
Parameters:
- filename (str) – Output hdf5 file name with h5 extension.
- tmpNumpyArrayFiles (TempNumpyArrayFiles (optional)) – Usually not required. This TempNumpyArrayFiles instance stores information about the temporary numpy array files. When converting a large number of bigWig files, using it increases the conversion speed significantly, because new temporary array files take time to generate and their frequent generation can be avoided.
- title (str (optional)) – Title of the data
- resolutions (list of str) – Additional input resolutions other than these default resolutions: ‘1kb’, ‘2kb’, ‘4kb’, ‘5kb’, ‘8kb’, ‘10kb’, ‘20kb’, ‘40kb’, ‘80kb’, ‘100kb’, ‘160kb’, ‘200kb’, ‘320kb’, ‘500kb’, ‘640kb’, and ‘1mb’.
For example, use resolutions=['25kb', '50kb', '75kb'] to add 25kb, 50kb and 75kb resolution data.
- coarsening_methods (list of str) – Methods to coarsen or downsample the data when converting from 1-base to coarser resolutions. Presently, six methods are implemented:
'min' -> Minimum value
'max' -> Maximum value
'amean' -> Arithmetic mean or average
'hmean' -> Harmonic mean
'gmean' -> Geometric mean
'median' -> Median
If None, all six methods are used. A subset of these methods may also be given. For example, coarsening_methods=['max', 'amean'] can be used for downsampling by only these two methods.
- compression (str) – Data compression method in the HDF5 file: lzf or gzip.
- keep_original (bool) – Whether the original data present in the bigWig file should be incorporated in the HDF5 file. This will significantly increase the size of the HDF5 file.
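To illustrate what the coarsening keywords mean, the sketch below downsamples a short list of per-base values into fixed-size bins using Python's statistics module (Python 3.8 or later for geometric_mean). The function coarsen, the bin layout, and the handling of a shorter final bin are assumptions made for illustration; they do not describe the library's actual downsampling code.

```python
import statistics

# Keyword-to-function mapping mirroring the documented method names;
# the mapping itself is an assumption made for this sketch.
COARSENERS = {
    'min': min,
    'max': max,
    'amean': statistics.mean,
    'hmean': statistics.harmonic_mean,
    'gmean': statistics.geometric_mean,  # needs strictly positive values
    'median': statistics.median,
}


def coarsen(values, bin_size, method='amean'):
    """Reduce consecutive bins of bin_size values to a single value each.

    Hypothetical helper for illustration; not part of gcMapExplorer.
    A final bin shorter than bin_size is reduced as it is.
    """
    reducer = COARSENERS[method]
    return [reducer(values[i:i + bin_size])
            for i in range(0, len(values), bin_size)]


signal = [1.0, 2.0, 4.0, 8.0, 3.0, 5.0]
print(coarsen(signal, 2, 'max'))    # [2.0, 8.0, 5.0]
print(coarsen(signal, 2, 'amean'))  # [1.5, 6.0, 4.0]
```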
Examples
from gcMapExplorer.lib import genomicsDataHandler as gdh

# start BigWigHandler to combine and convert two bigWig files
bigwig = gdh.BigWigHandler(['first.bigWig', 'second.bigWig'], './bigWigToWig', './bigWigInfo')

# Save hdf5 file with two additional resolutions
# and only two downsampling methods.
bigwig.saveAsH5('converted.h5', resolutions=['25kb', '50kb'], coarsening_methods=['max', 'amean'])
-