A study of the middle-scale nucleotide clustering in DNA sequences of various origin and functionality, by means of a method based on a modified standard deviation.

Christoforos Nikolaou, Yannis Almirantis

Abstract

The deviation from randomness in the distribution of nucleotides in genomic sequences is quantified and studied using a modified standard deviation (MSD). The method computes the standard deviation of the nucleotide frequencies of occurrence "per block", using local means (means taken over a neighborhood of each block). This quantity may serve as a scale-dependent measure of nucleotide clustering. In the present work, the meso-scale of tens of nucleotides is principally explored, by means of suitably adjusted filter parameters. This length scale is not directly affected by the grammar and syntax rules of protein coding, and it remains shorter than the scale at which large-scale characteristics of the genome appear. MSD has been found to distinguish systematically between sequences of different origin and functionality. Coding sequences of prokaryotes are found to be the closest to random, while intronic and intergenic regions of eukaryotic genomes show extended clustering of similar nucleotides. The distributions of MSD values for large collections of sequences are found, in most cases, to be characteristic of their biological role and origin. Protein-coding and non-coding, prokaryotic and eukaryotic DNA, as well as promoter, rRNA, viral and organelle sequences, have been examined. The presented results corroborate a recently proposed model for genome evolution. The method is also applied to assess the annotation of ORFs taken from the complete genome of Saccharomyces cerevisiae.
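The abstract describes the MSD only in words, so the following is a minimal sketch of one possible reading, not the authors' published formula. It assumes non-overlapping blocks, a local mean taken over a symmetric window of neighboring blocks (the block itself included), and a root-mean-square of the deviations. The names `block_frequencies`, `modified_std`, `block_size` and `half_window` are hypothetical and stand in for the "filter parameters" mentioned in the abstract.

```python
import math


def block_frequencies(seq, nucleotide, block_size):
    """Fraction of `nucleotide` in each consecutive, non-overlapping block."""
    n_blocks = len(seq) // block_size  # a trailing partial block is dropped
    return [
        seq[i * block_size:(i + 1) * block_size].count(nucleotide) / block_size
        for i in range(n_blocks)
    ]


def modified_std(freqs, half_window):
    """Root-mean-square deviation of each block frequency from its local mean.

    Assumption (not from the abstract): the local mean of block i averages
    blocks i - half_window .. i + half_window, clipped at the sequence ends.
    """
    n = len(freqs)
    if n == 0:
        raise ValueError("sequence is shorter than one block")
    total = 0.0
    for i, f in enumerate(freqs):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        local_mean = sum(freqs[lo:hi]) / (hi - lo)
        total += (f - local_mean) ** 2
    return math.sqrt(total / n)


if __name__ == "__main__":
    # A strictly periodic sequence has identical block frequencies, so its MSD is 0.
    periodic = "ACGT" * 100
    # Alternating runs of A and C mimic clustering at the scale of ten nucleotides.
    clustered = ("A" * 10 + "C" * 10) * 10
    print(modified_std(block_frequencies(periodic, "A", 4), 2))
    print(modified_std(block_frequencies(clustered, "A", 10), 1))
```

Under these assumptions, a sequence whose composition is the same in every block scores zero, while runs of similar nucleotides at the block scale raise the score. Changing `block_size` and `half_window` shifts the length scale being probed.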

Year: 2002 PMID: 12234754 DOI: 10.1006/jtbi.2002.3045

Source DB: PubMed Journal: J Theor Biol ISSN: 0022-5193 Impact factor: 2.691

Cited

5 in total

1. Measuring the coding potential of genomic sequences through a combination of triplet occurrence patterns and RNY preference.

Authors: Christoforos Nikolaou; Yannis Almirantis
Journal: J Mol Evol Date: 2004-09 Impact factor: 2.395

2. "Word" preference in the genomic text and genome evolution: different modes of n-tuplet usage in coding and noncoding sequences.

Authors: Christoforos Nikolaou; Yannis Almirantis
Journal: J Mol Evol Date: 2005-07-19 Impact factor: 2.395

3. Use of highly variable intergenic spacer sequences for multispacer typing of Rickettsia conorii strains.

Authors: Pierre-Edouard Fournier; Yong Zhu; Hiroyuki Ogata; Didier Raoult
Journal: J Clin Microbiol Date: 2004-12 Impact factor: 5.948

4. Distinguishing microbial genome fragments based on their composition: evolutionary and comparative genomic perspectives.

Authors: Scott C Perry; Robert G Beiko
Journal: Genome Biol Evol Date: 2010-01-25 Impact factor: 3.416

5. Evolution of genomic sequence inhomogeneity at mid-range scales.

Authors: Ashwin Prakash; Samuel S Shepard; Jie He; Benjamin Hart; Miao Chen; Surya P Amarachintha; Olga Mileyeva-Biebesheimer; Jason Bechtel; Alexei Fedorov
Journal: BMC Genomics Date: 2009-11-05 Impact factor: 3.969