
Information content of individual genetic sequences.

T D Schneider

Abstract

Related genetic sequences having a common function can be described by Shannon's information measure and depicted graphically by a sequence logo. Though useful for many purposes, sequence logos only show the average sequence conservation, and inferring the conservation for individual sequences is difficult. This limitation is overcome by the individual information (Ri) technique described here. The method begins by generating a weight matrix from the frequencies of each nucleotide or amino acid at each position of the aligned sequences. This matrix is then applied to the sequences themselves to determine the sequence conservation of each individual sequence. The matrix is unique because the average of these assignments is the total sequence conservation, and there is only one way to construct such a matrix. For binding sites on polynucleotides, the weight matrix has a natural cutoff that distinguishes functional sequences from other sequences. Ri values are on an absolute scale measured in bits of information, so the conservation of different biological functions can be compared with one another. The matrix can be used to rank-order the sequences, to search for new sequences, to compare sequences to other quantitative data such as binding energy or distance between binding sites, to distinguish mutations from polymorphisms, to design sequences of a given strength, and to detect errors in databases. The Ri method has been used to identify previously undescribed but experimentally verified DNA binding sites. The individual information distribution was determined for E. coli ribosome binding sites, bacterial Fis binding sites, and human donor and acceptor splice junctions, among others. The distributions demonstrate clearly that the consensus sequence is highly unusual, and hence is a poor method to describe naturally occurring binding sites. Copyright 1997 Academic Press Limited.
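
The abstract's central property, that the individual information values average to the total sequence conservation, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes DNA, weights of the form 2 + log2 f(b, l) where f(b, l) is the frequency of base b at position l, and no small-sample correction. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter

BASES = "ACGT"

def weight_matrix(seqs):
    """Per-position weights 2 + log2 f(b, l) from aligned DNA sequences.

    Assumption: no small-sample correction is applied. Bases never seen
    at a position get a weight of -inf.
    """
    n = len(seqs)
    matrix = []
    for l in range(len(seqs[0])):
        counts = Counter(s[l] for s in seqs)
        column = {}
        for b in BASES:
            f = counts[b] / n
            column[b] = 2.0 + math.log2(f) if f > 0 else float("-inf")
        matrix.append(column)
    return matrix

def individual_information(seq, matrix):
    """Sum the weights selected by each base of one sequence (bits)."""
    return sum(column[b] for b, column in zip(seq, matrix))

def total_conservation(seqs):
    """Sum over positions of 2 - H(l), with H the Shannon entropy in bits."""
    n = len(seqs)
    total = 0.0
    for l in range(len(seqs[0])):
        counts = Counter(s[l] for s in seqs)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        total += 2.0 - entropy
    return total

# Made-up toy alignment, for illustration only.
sites = ["TATAAT", "TATGAT", "TACAAT", "GATAAT"]
matrix = weight_matrix(sites)
scores = [individual_information(s, matrix) for s in sites]
mean_score = sum(scores) / len(scores)
```

In this toy case `mean_score` agrees with `total_conservation(sites)` up to floating-point error, and the sequence that matches the most frequent base at every position receives the highest score.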


Year:  1997        PMID: 9446751     DOI: 10.1006/jtbi.1997.0540

Source DB:  PubMed          Journal:  J Theor Biol        ISSN: 0022-5193            Impact factor:   2.691

