bed2h5ΒΆ
Description:
Import a bed file to HDF5 format h5 file
============================================
bed file can be converted into gcMapExplorer compatible HDF5 file using
this tool. This HDF5 file can be loaded into gcMapExplorer browser for
interactive visualization.
This tool does not require any external program.
Resolutions
===========
By default, original data are downsampled to following resolutions: '1kb',
'2kb', '4kb', '5kb', '8kb', '10kb', '20kb', '40kb', '80kb', '100kb', '160kb',
'200kb', '320kb', '500kb', '640kb', and '1mb'.
The data are downsampled at this stage only to speed up the visualization
process as downsampling might slow down the interactive visualization.
Downsampling/Coarsening method
==============================
Presently, six methods are implemented:
1) min -> Minimum value
2) max -> Maximum value
3) amean -> Arithmetic mean or average
4) hmean -> Harmonic mean
5) gmean -> Geometric mean
6) median -> Median
All these methods are used by default.
See below help for "-dm/--downsample-method" option to change the methods.
To keep original 1 base resolution data
=======================================
By default, the output h5 file does not contain original 1-base resolution
data to reduce the file size. To keep the original data in h5 file, used
-ko/--keep-original flag.
Usage:
usage: gcMapExplorer bed2h5 [-h] [-i input.bed] [-t "Genomic Dataset"]
[-dtc 7] [-r "List of Resolutions"]
[-dm "List of downsampling method"]
[-icn CHROMNAME] [-cmeth lzf] [-o out.h5] [-ow]
[-ko] [-idf index.json]
[-wd /home/rajendra/deskForWork/scratch]
Optional arguments:
-h, --help show this help message and exit
-i input.bed, --input input.bed
Input wig file.
-t "Genomic Dataset", --title "Genomic Dataset"
Title of the dataset.
-dtc 7, --data-column 7
The column number, which is considered as data column. Column number
could vary and depends on BED format. For example:
1) ENCODE broadPeak format (BED 6+3): 7th column
2) ENCODE gappedPeak format (BED 12+3): 13th column
3) ENCODE narrowPeak format (BED 6+4): 7th column
4) ENCODE RNA elements format (BED 6+3): 7th column
-r "List of Resolutions", --resolutions "List of Resolutions"
Additional input resolutions other than these resolutions: 1kb', '2kb',
'4kb', '5kb', '8kb', '10kb', '20kb', '40kb', '80kb', '100kb', '160kb','200kb',
'320kb', '500kb', '640kb', and '1mb'.
Resolutions should be provided in comma separated values. For Example:
-r "25kb, 50kb, 75kb"
-dm "List of downsampling method", --downsample-method "List of downsampling method"
Methods to coarse or downsample the data for converting from 1-base
to coarser resolutions. If this option is not provided, all six methods (see
above) will be considered. User may use only subset of these methods.
For example: -dm "max, amean" can be used for downsampling by only these
two methods.
-icn CHROMNAME, --input-chromosome CHROMNAME
Input Chromosome Name.
If this is provided, only this chromosome data is extracted and stored in h5
file.
-cmeth lzf, --compression-method lzf
Data compression method in h5 file.
-o out.h5, --out out.h5
Output h5 file.
If file is already present, it will replace the data. Therefore, be careful
if a file with same name is present.
-ow, --overwrite If a output file is already present, overwrite the datasets in the output
file.
-ko, --keep-original To copy original 1-base resolution data in h5 file. This will increase the
file size significantly.
-idf index.json, --index-file index.json
Index file in json format.
A file in json format containing indices (position in bed file) and sizes of
chromosomes. If this file is not present and given as input, a new file will be
generated. If this file is present, indices and sizes will be taken from this
file. If index and size of input chromosome is not present in json file, these
will be determined from bed file and stored in same json file. This file could
be very helpful in case when same bed file has to be read many times because
step to determine index and size of chromosome is skipped.
-wd /home/rajendra/deskForWork/scratch, --work-dir /home/rajendra/deskForWork/scratch
Directory where temporary files will be stored.