DELIMINATE--a fast and efficient method for loss-less compression of genomic sequences: sequence analysis.

Monzoorul Haque Mohammed, Anirban Dutta, Tungadri Bose, Sudha Chadaram, Sharmila S Mande.

Abstract

SUMMARY: An unprecedented quantity of genome sequence data is currently being generated using next-generation sequencing platforms. This has necessitated the development of novel bioinformatics approaches and algorithms that not only facilitate a meaningful analysis of these data but also aid in efficient compression, storage, retrieval and transmission of huge volumes of the generated data. We present a novel compression algorithm (DELIMINATE) that can rapidly compress genomic sequence data in a loss-less fashion. Validation results indicate relatively higher compression efficiency of DELIMINATE when compared with popular general purpose compression algorithms, namely, gzip, bzip2 and lzma.
AVAILABILITY AND IMPLEMENTATION: Linux, Windows and Mac implementations (both 32 and 64-bit) of DELIMINATE are freely available for download at: http://metagenomics.atc.tcs.com/compression/DELIMINATE.
CONTACT: sharmila@atc.tcs.com
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Year:  2012        PMID: 22833526     DOI: 10.1093/bioinformatics/bts467

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  14 in total

1.  FASTR: A novel data format for concomitant representation of RNA sequence and secondary structure information.

Authors:  Tungadri Bose; Anirban Dutta; Mohammed Mh; Hemang Gandhi; Sharmila S Mande
Journal:  J Biosci       Date:  2015-09       Impact factor: 1.826

Review 2.  Alignment-free genetic sequence comparisons: a review of recent approaches by word analysis.

Authors:  Oliver Bonham-Carter; Joe Steele; Dhundy Bastola
Journal:  Brief Bioinform       Date:  2013-07-31       Impact factor: 11.622

3.  MZPAQ: a FASTQ data compression tool.

Authors:  Achraf El Allali; Mariam Arshad
Journal:  Source Code Biol Med       Date:  2019-06-03

4.  Efficient DNA sequence compression with neural networks.

Authors:  Milton Silva; Diogo Pratas; Armando J Pinho
Journal:  Gigascience       Date:  2020-11-11       Impact factor: 6.524

5.  Population-Sequencing as a Biomarker for Sample Characterization.

Authors:  John P Jakupciak
Journal:  J Biomark       Date:  2013-12-08

6.  MFCompress: a compression tool for FASTA and multi-FASTA data.

Authors:  Armando J Pinho; Diogo Pratas
Journal:  Bioinformatics       Date:  2013-10-16       Impact factor: 6.937

7.  Sequence Compression Benchmark (SCB) database-A comprehensive evaluation of reference-free compressors for FASTA-formatted sequences.

Authors:  Kirill Kryukov; Mahoko Takahashi Ueda; So Nakagawa; Tadashi Imanishi
Journal:  Gigascience       Date:  2020-07-01       Impact factor: 6.524

8.  Nucleotide Archival Format (NAF) enables efficient lossless reference-free compression of DNA sequences.

Authors:  Kirill Kryukov; Mahoko Takahashi Ueda; So Nakagawa; Tadashi Imanishi
Journal:  Bioinformatics       Date:  2019-10-01       Impact factor: 6.937

9.  HRCM: An Efficient Hybrid Referential Compression Method for Genomic Big Data.

Authors:  Haichang Yao; Yimu Ji; Kui Li; Shangdong Liu; Jing He; Ruchuan Wang
Journal:  Biomed Res Int       Date:  2019-11-16       Impact factor: 3.411

10.  Comparison of Compression-Based Measures with Application to the Evolution of Primate Genomes.

Authors:  Diogo Pratas; Raquel M Silva; Armando J Pinho
Journal:  Entropy (Basel)       Date:  2018-05-23       Impact factor: 2.524
