Advances in high throughput DNA sequence data compression.

Muhammad Sardaraz, Muhammad Tahir, Ataul Aziz Ikram.

Abstract

Advances in high throughput sequencing technologies and reduction in cost of sequencing have led to exponential growth in high throughput DNA sequence data. This growth has posed challenges such as storage, retrieval, and transmission of sequencing data. Data compression is used to cope with these challenges. Various methods have been developed to compress genomic and sequencing data. In this article, we present a comprehensive review of methods for genome and read compression. Algorithms are categorized as referential or reference-free. Experimental results and comparative analysis of various methods for data compression are presented. Finally, key challenges and research directions in DNA sequence data compression are highlighted.

Keywords:  Genome compression; read compression; reference free compression; referential compression

Year: 2015  PMID: 26846812  DOI: 10.1142/S0219720016300021

Source DB: PubMed  Journal: J Bioinform Comput Biol  ISSN: 0219-7200  Impact factor: 1.122


  2 in total

1.  MZPAQ: a FASTQ data compression tool.

Authors:  Achraf El Allali; Mariam Arshad
Journal:  Source Code Biol Med       Date:  2019-06-03

2.  Sequence Compression Benchmark (SCB) database-A comprehensive evaluation of reference-free compressors for FASTA-formatted sequences.

Authors:  Kirill Kryukov; Mahoko Takahashi Ueda; So Nakagawa; Tadashi Imanishi
Journal:  Gigascience       Date:  2020-07-01       Impact factor: 6.524
