An Evaluation Framework for Lossy Compression of Genome Sequencing Quality Values.

Claudio Alberti, Noah Daniels, Mikel Hernaez, Jan Voges, Rachel L Goldfeder, Ana A Hernandez-Lopez, Marco Mattavelli, Bonnie Berger.

Abstract

This paper specifies, and provides an initial validation of, an evaluation framework for comparing lossy compressors of genome sequencing quality values. The goal is to define the reference data, test sets, tools, and metrics to be used to evaluate the impact of lossy compression of quality values on human genome variant calling. The functionality of the framework is validated using two state-of-the-art genomic compressors. This work was spurred by current activity within the ISO/IEC SC29/WG11 technical committee (a.k.a. MPEG), which is investigating the possibility of starting a standardization activity for genomic information representation.
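
The abstract does not name the metrics themselves. As a rough illustration only, the sketch below assumes that the impact on variant calling is summarized by comparing the variants called from lossily compressed data against a trusted reference call set, using precision, recall, and their harmonic mean. The function name `variant_calling_metrics`, the `Variant` tuple layout, and the choice of metrics are assumptions made for this example, not details taken from the paper.

```python
from typing import Iterable, NamedTuple, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


class Metrics(NamedTuple):
    true_positives: int
    false_positives: int
    false_negatives: int
    precision: float
    recall: float
    f_score: float


def variant_calling_metrics(called: Iterable[Variant],
                            truth: Iterable[Variant]) -> Metrics:
    """Compare called variants against a reference ("truth") call set.

    Assumption: two variants match only if all four fields are identical.
    """
    called_set, truth_set = set(called), set(truth)
    tp = len(called_set & truth_set)   # called and present in the truth set
    fp = len(called_set - truth_set)   # called but absent from the truth set
    fn = len(truth_set - called_set)   # present in the truth set but not called
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return Metrics(tp, fp, fn, precision, recall, f_score)


if __name__ == "__main__":
    # Toy data, invented for the example.
    truth = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
             ("chr2", 40, "G", "A"), ("chr2", 900, "T", "C")]
    lossy_calls = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
                   ("chr2", 40, "G", "A"), ("chr3", 7, "A", "T")]
    print(variant_calling_metrics(lossy_calls, truth))
```

Running the same comparison on calls made from the uncompressed data and from each compressor's output would give one set of figures per compressor, which is one plausible way to read "comparison of lossy compressors" here.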

Year: 2016; PMID: 28845445; PMCID: PMC5568552; DOI: 10.1109/DCC.2016.39

Source DB: PubMed; Journal: Proc Data Compress Conf; ISSN: 2375-0383


