Fast multiple alignment of ungapped DNA sequences using information theory and a relaxation method.

Thomas D Schneider, David N Mastronarde.

Abstract

An information theory based multiple alignment ("Malign") method was used to align the DNA binding sequences of the OxyR and Fis proteins, whose sequence conservation is so spread out that it is difficult to identify the sites. In the algorithm described here, the information content of the sequences is used as a unique global criterion for the quality of the alignment. The algorithm uses look-up tables to avoid recalculating computationally expensive functions such as the logarithm. Because there are no arbitrary constants and because the results are reported in absolute units (bits), the best alignment can be chosen without ambiguity. Starting from randomly selected alignments, a hill-climbing algorithm can track through the immense space of s^n combinations where s is the number of sequences and n is the number of positions possible for each sequence. Instead of producing a single alignment, the algorithm is fast enough that one can afford to use many start points and to classify the solutions. Good convergence is indicated by the presence of a single well-populated solution class having higher information content than other classes. The existence of several distinct classes for the Fis protein indicates that those binding sites have self-similar features.
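
The procedure outlined in the abstract can be illustrated with a minimal Python sketch. This is not the Malign program itself. It assumes a fixed window width, a maximum of 2 bits per DNA position, no small-sample correction, and a simple relaxation that shifts one sequence at a time. The names `make_plogp_table`, `information`, `hill_climb`, and `classify` are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def make_plogp_table(s):
    """Look-up table: table[c] = -(c/s) * log2(c/s) for counts c = 0..s."""
    return [0.0] + [-(c / s) * math.log2(c / s) for c in range(1, s + 1)]


def information(seqs, offsets, width, table):
    """Total information (bits) in the aligned window.

    Assumption: 2 bits maximum per DNA position, no small-sample correction.
    """
    total = 0.0
    for col in range(width):
        counts = Counter(seq[off + col] for seq, off in zip(seqs, offsets))
        total += 2.0 - sum(table[c] for c in counts.values())
    return total


def hill_climb(seqs, width, rng):
    """From one random start, shift single sequences while the bits increase."""
    table = make_plogp_table(len(seqs))
    offsets = [rng.randrange(len(seq) - width + 1) for seq in seqs]
    best = information(seqs, offsets, width, table)
    improved = True
    while improved:
        improved = False
        for i, seq in enumerate(seqs):
            for cand in range(len(seq) - width + 1):
                old = offsets[i]
                offsets[i] = cand
                score = information(seqs, offsets, width, table)
                if score > best + 1e-12:
                    best, improved = score, True  # keep the better shift
                else:
                    offsets[i] = old  # revert
    return tuple(offsets), best


def classify(seqs, width, starts=50, seed=0):
    """Run many random starts and group identical final alignments."""
    rng = random.Random(seed)
    classes, scores = Counter(), {}
    for _ in range(starts):
        offsets, bits = hill_climb(seqs, width, rng)
        classes[offsets] += 1
        scores[offsets] = bits
    return classes, scores


if __name__ == "__main__":
    # Toy data with the same 6-base word planted at different offsets.
    seqs = ["ACGTTGACA", "TTGACAGGC", "GGTTGACAT", "CTTGACACC"]
    classes, scores = classify(seqs, width=6)
    for offsets, count in classes.most_common():
        print(offsets, count, round(scores[offsets], 3))
```

In this sketch, a well-populated class with the highest information content would correspond to the convergence criterion described in the abstract, while several comparably scored classes would suggest alternative alignments of the same sites.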

Year:  1996        PMID: 19953199      PMCID: PMC2785095          DOI: 10.1016/S0166-218X(96)00068-6

Source DB:  PubMed          Journal:  Discrete Appl Math        ISSN: 0166-218X            Impact factor:   1.139


  17 in total

Review 1.  The Fis protein: it's not just for DNA inversion anymore.

Authors:  S E Finkel; R C Johnson
Journal:  Mol Microbiol       Date:  1992-11       Impact factor: 3.501

2.  A survey of multiple sequence comparison methods.

Authors:  S C Chan; A K Wong; D K Chiu
Journal:  Bull Math Biol       Date:  1992-07       Impact factor: 1.758

3.  Fast multiple alignment of ungapped DNA sequences using information theory and a relaxation method.

Authors:  Thomas D Schneider; David N Mastronarde
Journal:  Discrete Appl Math       Date:  1996-12-01       Impact factor: 1.139

4.  A multiple sequence comparison method.

Authors:  A K Wong; S C Chan; D K Chiu
Journal:  Bull Math Biol       Date:  1993-03       Impact factor: 1.758

5.  Sequence logos: a new way to display consensus sequences.

Authors:  T D Schneider; R M Stephens
Journal:  Nucleic Acids Res       Date:  1990-10-25       Impact factor: 16.971

6.  CAP binding sites reveal pyrimidine-purine pattern characteristic of DNA bending.

Authors:  A M Barber; V B Zhurkin
Journal:  J Biomol Struct Dyn       Date:  1990-10

7.  Information content of binding sites on nucleotide sequences.

Authors:  T D Schneider; G D Stormo; L Gold; A Ehrenfeucht
Journal:  J Mol Biol       Date:  1986-04-05       Impact factor: 5.469

Review 8.  Sequence alignment and penalty choice. Review of concepts, case studies and implications.

Authors:  M Vingron; M S Waterman
Journal:  J Mol Biol       Date:  1994-01-07       Impact factor: 5.469

Review 9.  Compilation and analysis of Escherichia coli promoter DNA sequences.

Authors:  D K Hawley; W R McClure
Journal:  Nucleic Acids Res       Date:  1983-04-25       Impact factor: 16.971

10.  Information analysis of sequences that bind the replication initiator RepA.

Authors:  P P Papp; D K Chattoraj; T D Schneider
Journal:  J Mol Biol       Date:  1993-09-20       Impact factor: 5.469

  16 in total

1.  Strong minor groove base conservation in sequence logos implies DNA distortion or base flipping during replication and transcription initiation.

Authors:  T D Schneider
Journal:  Nucleic Acids Res       Date:  2001-12-01       Impact factor: 16.971

2.  Molecular flip-flops formed by overlapping Fis sites.

Authors:  Paul N Hengen; Ilya G Lyakhov; Lisa E Stewart; Thomas D Schneider
Journal:  Nucleic Acids Res       Date:  2003-11-15       Impact factor: 16.971

Review 3.  Consensus sequence Zen.

Authors:  Thomas D Schneider
Journal:  Appl Bioinformatics       Date:  2002

4.  Bipartite pattern discovery by entropy minimization-based multiple local alignment.

Authors:  Chengpeng Bi; Peter K Rogan
Journal:  Nucleic Acids Res       Date:  2004-09-23       Impact factor: 16.971

5.  Twenty Years of Delila and Molecular Information Theory: The Altenberg-Austin Workshop in Theoretical Biology Biological Information, Beyond Metaphor: Causality, Explanation, and Unification Altenberg, Austria, 11-14 July 2002.

Authors:  Thomas D Schneider
Journal:  Biol Theory       Date:  2006

6.  Fast multiple alignment of ungapped DNA sequences using information theory and a relaxation method.

Authors:  Thomas D Schneider; David N Mastronarde
Journal:  Discrete Appl Math       Date:  1996-12-01       Impact factor: 1.139

Review 7.  Trends in information theory-based chemical structure codification.

Authors:  Stephen J Barigye; Yovani Marrero-Ponce; Facundo Pérez-Giménez; Danail Bonchev
Journal:  Mol Divers       Date:  2014-04-05       Impact factor: 2.943

8.  Data Compression Concepts and Algorithms and their Applications to Bioinformatics.

Authors:  O U Nalbantoğlu; D J Russell; K Sayood
Journal:  Entropy (Basel)       Date:  2010-01-01       Impact factor: 2.524

9.  Anatomy of Escherichia coli sigma70 promoters.

Authors:  Ryan K Shultzaberger; Zehua Chen; Karen A Lewis; Thomas D Schneider
Journal:  Nucleic Acids Res       Date:  2006-12-22       Impact factor: 16.971

10.  Discovery of Fur binding site clusters in Escherichia coli by information theory models.

Authors:  Zehua Chen; Karen A Lewis; Ryan K Shultzaberger; Ilya G Lyakhov; Ming Zheng; Bernard Doan; Gisela Storz; Thomas D Schneider
Journal:  Nucleic Acids Res       Date:  2007-10-05       Impact factor: 16.971
