Linguistic measure of taxonomic and functional relatedness of nucleotide sequences.

S Pietrokovski, J Hirshon, E N Trifonov.

Abstract

The frequencies of "words", oligonucleotides within nucleotide sequences, reflect the genetic information contained in the sequence "texts". Nucleotide sequences are characteristically represented by their contrast word vocabularies. Comparison of the sequences by correlating their contrast vocabularies is shown to reflect well the relatedness (unrelatedness) between the sequences. A single value, the linguistic similarity between the sequences, is suggested as a measure of sequence relatedness. Sequences as short as 1000 bases can be characterized and quantitatively related to other sequences by this technique. The linguistic sequence similarity value is used for analysis of taxonomically and functionally diverse nucleotide sequences. The similarity value is shown to be very sensitive to the relatedness of the source species, thus providing a convenient tool for taxonomic classification of species by their sequence vocabularies. Functionally diverse sequences appear distinct by their linguistic similarity values. This can be a basis for a quick screening technique for functional characterization of the sequences and for mapping functionally distinct regions in long sequences.
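The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not define how a contrast vocabulary is computed, so everything below is an assumption made for illustration: contrast is taken as the observed word frequency minus the frequency expected from single-base composition, and the similarity is taken as the Pearson correlation of two contrast vectors. The function names (`word_counts`, `contrast_vocabulary`, `linguistic_similarity`) and the default word length `k=2` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product
from math import sqrt

ALPHABET = "ACGT"


def word_counts(seq, k):
    """Count overlapping words (k-mers) made only of A, C, G, T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if all(c in ALPHABET for c in word):
            counts[word] += 1
    return counts


def contrast_vocabulary(seq, k=2):
    """Map every possible k-mer to observed minus expected frequency.

    ASSUMPTION: 'expected' is the product of single-base frequencies.
    The paper's actual contrast definition may differ.
    """
    counts = word_counts(seq, k)
    total = sum(counts.values())
    bases = Counter(c for c in seq.upper() if c in ALPHABET)
    n_bases = sum(bases.values())
    vocab = {}
    for letters in product(ALPHABET, repeat=k):
        word = "".join(letters)
        observed = counts[word] / total if total else 0.0
        expected = 1.0
        for c in word:
            expected *= bases[c] / n_bases if n_bases else 0.0
        vocab[word] = observed - expected
    return vocab


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def linguistic_similarity(seq_a, seq_b, k=2):
    """ASSUMPTION: similarity = correlation of the two contrast vocabularies."""
    vocab_a = contrast_vocabulary(seq_a, k)
    vocab_b = contrast_vocabulary(seq_b, k)
    words = sorted(vocab_a)  # same word order for both vectors
    return pearson([vocab_a[w] for w in words], [vocab_b[w] for w in words])


if __name__ == "__main__":
    a = "ACGTTGCAAGGCTTAACCGGATATCGCG" * 5
    b = "TTTTAAAATATATATTTAAATTAATATA" * 5
    print(linguistic_similarity(a, b, k=2))
```

Under these assumptions, a value near 1 means the two sequences over- and under-use the same words, and a value near 0 or below means their vocabularies differ. Longer words give a richer vocabulary but need longer sequences to estimate reliably, which fits the abstract's remark that sequences of about 1000 bases are already usable.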

Entities: Chemical

Substances: RNA, Ribosomal

Year: 1990 PMID: 2363847 DOI: 10.1080/07391102.1990.10508563

Source DB: PubMed Journal: J Biomol Struct Dyn ISSN: 0739-1102

Cited

13 in total

1. A simple method for global sequence comparison.

Authors: E Pizzi; M Attimonelli; S Liuni; C Frontali; C Saccone
Journal: Nucleic Acids Res Date: 1992-01-11 Impact factor: 16.971

2. Viroids proper can be distinguished from hammerhead viroids and satellite RNAs through their dinucleotide composition.

Authors: F U Gast; R L Spieker
Journal: Arch Virol Date: 1996 Impact factor: 2.574

3. Protein sequence randomness and sequence/structure correlations.

Authors: R S Rahman; S Rackovsky
Journal: Biophys J Date: 1995-04 Impact factor: 4.033

4. A quality control algorithm for DNA sequencing projects.

Authors: O White; T Dunning; G Sutton; M Adams; J C Venter; C Fields
Journal: Nucleic Acids Res Date: 1993-08-11 Impact factor: 16.971

5. Compositional heterogeneity of the Escherichia coli genome: a role for VSP repair?

Authors: G Gutiérrez; J Casadesús; J L Oliver; A Marín
Journal: J Mol Evol Date: 1994-10 Impact factor: 2.395

6. Different clustering of genomes across life using the A-T-C-G and degenerate R-Y alphabets: early and late signaling on genome evolution?

Authors: V Kirzhner; A Paz; Z Volkovich; E Nevo; A Korol
Journal: J Mol Evol Date: 2007-03-19 Impact factor: 2.395

7. Distinguishing microbial genome fragments based on their composition: evolutionary and comparative genomic perspectives.

Authors: Scott C Perry; Robert G Beiko
Journal: Genome Biol Evol Date: 2010-01-25 Impact factor: 3.416