Divon Lan, Raymond Tobler, Yassine Souilmi, Bastien Llamas.
Abstract
MOTIVATION: genozip is a new lossless compression tool for Variant Call Format (VCF) files. By applying field-specific algorithms and fully utilizing the available computational hardware, genozip achieves the highest compression ratios amongst existing lossless compression tools known to the authors, at speeds comparable with the fastest multi-threaded compressors.
Year: 2020 PMID: 32407471 PMCID: PMC7332572 DOI: 10.1093/bioinformatics/btaa290
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Benchmarking genozip performance. (a) Compression ratios for genozip using three different options relative to five other commonly used compression tools (see labels) for two VCF files, one dominated by FORMAT-subfield-rich data (x-axis) and one dominated by genotype data (y-axis). (b) Compression (x-axis) and decompression (y-axis) rates for genozip and five other tools on the two VCF files (see inset key). (c) Scalability of genozip execution with the number of CPU cores used (see Supplementary Material).