
A Compression Algorithm for DNA Sequences and Its Applications in Genome Comparison.


Abstract

We present a lossless compression algorithm, GenCompress, for genetic sequences, based on searching for approximate repeats. Our algorithm achieves the best compression ratios for benchmark DNA sequences. The significantly better compression results show that approximate repeats are one of the main hidden regularities in DNA sequences. We then describe a theory of measuring the relatedness between two DNA sequences. Using our algorithm, we present strong experimental support for this theory, and demonstrate its application in comparing genomes and constructing evolutionary trees.
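
The idea of measuring relatedness through compression can be illustrated with a minimal sketch. The abstract does not state the paper's distance formula, and GenCompress itself is not reproduced here, so this sketch makes two loud assumptions: it uses the normalized compression distance (NCD) as a stand-in measure, and Python's general-purpose `zlib` as a stand-in compressor. The helper names (`compressed_size`, `ncd`, `random_dna`, `mutate`) and the toy point-mutation model are hypothetical and chosen only for illustration.

```python
import random
import zlib


def compressed_size(seq: str) -> int:
    """Size in bytes after zlib compression (stand-in compressor, NOT GenCompress)."""
    return len(zlib.compress(seq.encode("ascii"), 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance (assumed measure, not the paper's own)."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def random_dna(n: int, rng: random.Random) -> str:
    """Generate n bases drawn uniformly from A, C, G, T."""
    return "".join(rng.choice("ACGT") for _ in range(n))


def mutate(seq: str, rate: float, rng: random.Random) -> str:
    """Redraw each base with probability `rate` (toy point-mutation model)."""
    return "".join(rng.choice("ACGT") if rng.random() < rate else b for b in seq)


rng = random.Random(0)
ancestor = random_dna(2000, rng)
relative = mutate(ancestor, 0.02, rng)   # shares long repeats with ancestor
unrelated = random_dna(2000, rng)        # shares no history with ancestor

d_close = ncd(ancestor, relative)
d_far = ncd(ancestor, unrelated)
print(f"NCD(ancestor, relative)  = {d_close:.3f}")
print(f"NCD(ancestor, unrelated) = {d_far:.3f}")
```

In this toy setup the mutated copy should score a smaller distance than the unrelated sequence, because the compressor can reuse repeats from the first sequence when encoding the second. A DNA-specific compressor that also exploits approximate repeats, as the abstract describes, would be expected to separate related and unrelated sequences more sharply than `zlib` does.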

Year:  1999        PMID: 11072342

Source DB:  PubMed          Journal:  Genome Inform Ser Workshop Genome Inform


  11 in total

1.  Complexity: an internet resource for analysis of DNA sequence complexity.

Authors:  Y L Orlov; V N Potapov
Journal:  Nucleic Acids Res       Date:  2004-07-01       Impact factor: 16.971

2.  Integrating human genome database into electronic health record with sequence alignment and compression mechanism.

Authors:  Wei-Hsin Chen; Yu-Wen Lu; Feipei Lai; Yin-Hsiu Chien; Wuh-Liang Hwu
Journal:  J Med Syst       Date:  2011-05-11       Impact factor: 4.460

3.  LifePrint: a novel k-tuple distance method for construction of phylogenetic trees.

Authors:  Fabián Reyes-Prieto; Adda J García-Chéquer; Hueman Jaimes-Díaz; Janet Casique-Almazán; Juana M Espinosa-Lara; Rosaura Palma-Orozco; Alfonso Méndez-Tenorio; Rogelio Maldonado-Rodríguez; Kenneth L Beattie
Journal:  Adv Appl Bioinform Chem       Date:  2011-01-20

4.  Fast-Find: a novel computational approach to analyzing combinatorial motifs.

Authors:  Micah Hamady; Erin Peden; Rob Knight; Ravinder Singh
Journal:  BMC Bioinformatics       Date:  2006-01-04       Impact factor: 3.169

5.  Integrating alignment-based and alignment-free sequence similarity measures for biological sequence classification.

Authors:  Ivan Borozan; Stuart Watt; Vincent Ferretti
Journal:  Bioinformatics       Date:  2015-01-07       Impact factor: 6.937

6.  LZ complexity distance of DNA sequences and its application in phylogenetic tree reconstruction.

Authors:  Bin Li; Yi Bing Li; Hong Bo He
Journal:  Genomics Proteomics Bioinformatics       Date:  2005-11       Impact factor: 7.691

7.  Human mitochondrial genome compression using machine learning techniques.

Authors:  Rongjie Wang; Tianyi Zang; Yadong Wang
Journal:  Hum Genomics       Date:  2019-10-22       Impact factor: 4.639

8.  Co-phylog: an assembly-free phylogenomic approach for closely related organisms.

Authors:  Huiguang Yi; Li Jin
Journal:  Nucleic Acids Res       Date:  2013-01-18       Impact factor: 16.971

9.  Cross chromosomal similarity for DNA sequence compression.

Authors:  Choi-Ping Paula Wu; Ngai-Fong Law; Wan-Chi Siu
Journal:  Bioinformation       Date:  2008-07-14

10.  Comparison of next-generation sequencing samples using compression-based distances and its application to phylogenetic reconstruction.

Authors:  Ngoc Hieu Tran; Xin Chen
Journal:  BMC Res Notes       Date:  2014-05-29
