
DNA sequence compression using the Burrows-Wheeler transform.

Don Adjeroh, Yong Zhang, Amar Mukherjee, Matt Powell, Tim Bell.

Abstract

We investigate off-line, dictionary-oriented approaches to DNA sequence compression based on the Burrows-Wheeler Transform (BWT). The preponderance of short repeating patterns is an important phenomenon in biological sequences. Here, we propose off-line methods to compress DNA sequences that exploit the different repetition structures inherent in such sequences. Repetition analysis is performed based on the relationship between the BWT and important pattern-matching data structures, such as the suffix tree and suffix array. We discuss how the proposed approach can be incorporated into the BWT compression pipeline.
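As an illustration of the relationship between the BWT and the suffix array mentioned in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes a `$` sentinel that does not occur in the sequence and sorts smaller than every base, and it builds the suffix array by naive sorting, which is only suitable for short inputs. The function names (`suffix_array`, `bwt_from_suffix_array`, `inverse_bwt`, `longest_repeat`) are hypothetical and chosen for this example only.

```python
def suffix_array(text: str) -> list[int]:
    """Naive suffix array: start positions of suffixes in lexicographic order."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt_from_suffix_array(seq: str, sentinel: str = "$") -> str:
    """BWT[i] is the character that cyclically precedes the i-th smallest suffix."""
    text = seq + sentinel  # assumption: sentinel is absent from seq
    return "".join(text[i - 1] for i in suffix_array(text))


def inverse_bwt(bwt: str, sentinel: str = "$") -> str:
    """Recover the sequence by repeatedly following the first-to-last column mapping."""
    # A stable sort of positions by character links each first-column entry
    # to the matching occurrence in the last column (the BWT string).
    order = sorted(range(len(bwt)), key=lambda i: bwt[i])
    row = bwt.index(sentinel)  # the row holding the unrotated text ends in the sentinel
    out = []
    for _ in range(len(bwt) - 1):
        row = order[row]
        out.append(bwt[row])
    return "".join(out)


def longest_repeat(seq: str, sentinel: str = "$") -> str:
    """Longest substring occurring at least twice, found from adjacent suffixes."""
    text = seq + sentinel
    sa = suffix_array(text)
    best = ""
    for a, b in zip(sa, sa[1:]):
        # Length of the common prefix of two neighbouring suffixes.
        n = 0
        while text[a + n] == text[b + n]:  # the unique sentinel stops the scan
            n += 1
        if n > len(best):
            best = text[a:a + n]
    return best


if __name__ == "__main__":
    print(bwt_from_suffix_array("GATTACA"))   # ACTGA$TA
    print(inverse_bwt("ACTGA$TA"))            # GATTACA
    print(longest_repeat("ACGTACGA"))         # ACG
```

Repeated substrings end up as prefixes of neighbouring suffixes in the suffix array, which is why one sorted structure can serve both the transform and a simple repetition analysis in this sketch.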

Year:  2002        PMID: 15838146

Source DB:  PubMed          Journal:  Proc IEEE Comput Soc Bioinform Conf        ISSN: 1555-3930


  5 in total

1.  SOLiDzipper: A High Speed Encoding Method for the Next-Generation Sequencing Data.

Authors:  Young Jun Jeon; Sang Hyun Park; Sung Min Ahn; Hee Joung Hwang
Journal:  Evol Bioinform Online       Date:  2011-03-10       Impact factor: 1.625

2.  Reference-based compression of short-read sequences using path encoding.

Authors:  Carl Kingsford; Rob Patro
Journal:  Bioinformatics       Date:  2015-02-02       Impact factor: 6.937

3.  MetaCRAM: an integrated pipeline for metagenomic taxonomy identification and compression.

Authors:  Minji Kim; Xiejia Zhang; Jonathan G Ligo; Farzad Farnoud; Venugopal V Veeravalli; Olgica Milenkovic
Journal:  BMC Bioinformatics       Date:  2016-02-19       Impact factor: 3.169

4.  An Optimal Seed Based Compression Algorithm for DNA Sequences.

Authors:  Pamela Vinitha Eric; Gopakumar Gopalakrishnan; Muralikrishnan Karunakaran
Journal:  Adv Bioinformatics       Date:  2016-07-31

5.  Genome-wide search of nucleosome patterns using visual analytics.

Authors:  Rodrigo Santamaría; Roberto Therón; Laura Durán; Alicia García; Sara González; Mar Sánchez; Francisco Antequera
Journal:  Bioinformatics       Date:  2019-07-01       Impact factor: 6.937
