
Data Compression Concepts and Algorithms and their Applications to Bioinformatics.

O U Nalbantoğlu, D J Russell, K Sayood.

Abstract

Data compression is, at its base, concerned with how information is organized in data. Understanding this organization can lead to efficient ways of representing the information, and hence to data compression. In this paper we review the ways in which ideas and approaches fundamental to the theory and practice of data compression have been used in bioinformatics. We look at how basic theoretical ideas from data compression, such as the notions of entropy, mutual information, and complexity, have been used to analyze biological sequences in order to discover hidden patterns, infer phylogenetic relationships between organisms, and study viral populations. Finally, we look at how grammars inferred for biological sequences have been used to uncover structure in those sequences.
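
As a rough illustration of two of the ideas named in the abstract, the sketch below computes the zeroth-order Shannon entropy of a nucleotide sequence and a simple compression-based distance between two sequences. It is not the authors' implementation: the function names (`shannon_entropy`, `compressed_size`, `ncd`) are hypothetical, and the use of `zlib` together with the normalized compression distance formula is an assumption chosen for brevity, standing in for the complexity measures the paper reviews.

```python
import math
import zlib
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """Zeroth-order Shannon entropy of a sequence, in bits per symbol."""
    counts = Counter(seq)
    total = len(seq)
    # Sum p * log2(1/p) over the observed symbols only.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed data (a crude complexity proxy)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two sequences.

    Values near 0 suggest the sequences share most of their structure;
    values near 1 suggest they share little. zlib is an illustrative
    stand-in for a dedicated sequence compressor.
    """
    cx = compressed_size(x.encode())
    cy = compressed_size(y.encode())
    cxy = compressed_size((x + y).encode())
    return (cxy - min(cx, cy)) / max(cx, cy)
```

A uniform four-letter sequence such as `"ACGT"` reaches the maximum of 2 bits per symbol, while a homopolymer such as `"AAAA"` has zero entropy. Pairwise `ncd` values computed over a set of sequences could, under these assumptions, feed a standard distance-based tree-building method.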


Year:  2010        PMID: 20157640      PMCID: PMC2821113          DOI: 10.3390/e12010034

Source DB:  PubMed          Journal:  Entropy (Basel)        ISSN: 1099-4300            Impact factor:   2.524


