A study of the middle-scale nucleotide clustering in DNA sequences of various origin and functionality, by means of a method based on a modified standard deviation.

Christoforos Nikolaou, Yannis Almirantis.

Abstract

The deviation from randomness in the distribution of nucleotides in genomic sequences is quantified and studied using a modified standard deviation (MSD). The method involves a "per block" computation of the standard deviation of the nucleotide frequencies of occurrence, using local means (means taken in a neighborhood of each block). This quantity may serve as a scale-dependent measure of nucleotide clustering. In the present work, the meso-scale of tens of nucleotides is principally explored by means of suitably adjusted filter parameters. This length scale is of an order of magnitude not directly affected by the grammar and syntax rules of protein coding, while remaining shorter than the scale at which large-scale characteristics of the genome appear. MSD is found to distinguish systematically between sequences of different origin and functionality. The coding sequences of prokaryotes are found to be the closest to random, while extended clustering of similar nucleotides is observed in intronic and intergenic regions of eukaryotic genomes. The distributions of MSD values for large collections of sequences are in most cases characteristic of their biological role and origin. Protein-coding and non-coding, prokaryotic and eukaryotic DNA, as well as promoter, rRNA, viral and organelle sequences, have been examined. The results corroborate a recently proposed model for genome evolution. The method is also applied to assess the annotation of ORFs taken from the complete genome of Saccharomyces cerevisiae.
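
The abstract gives only a verbal outline of the MSD, so the following Python sketch is an illustration of the general idea and not the authors' exact definition. It assumes non-overlapping blocks, a local mean taken over a symmetric window of neighboring blocks (truncated at the sequence ends and including the block itself), and a root-mean-square aggregation over blocks. The names `block_frequencies`, `modified_std`, `block_size` and `half_window` are hypothetical and do not come from the paper.

```python
import math


def block_frequencies(seq, nucleotide, block_size):
    """Fraction of `nucleotide` in each consecutive, non-overlapping block.

    Trailing bases that do not fill a whole block are ignored (assumption).
    """
    seq = seq.upper()
    n_blocks = len(seq) // block_size
    return [
        seq[i * block_size:(i + 1) * block_size].count(nucleotide) / block_size
        for i in range(n_blocks)
    ]


def modified_std(freqs, half_window):
    """Root-mean-square deviation of each block frequency from its local mean.

    The local mean of block i is taken over blocks
    i - half_window .. i + half_window, truncated at the sequence ends.
    This windowing scheme is an assumption, not taken from the paper.
    """
    if not freqs:
        raise ValueError("need at least one block")
    total = 0.0
    for i, f in enumerate(freqs):
        lo = max(0, i - half_window)
        hi = min(len(freqs), i + half_window + 1)
        local_mean = sum(freqs[lo:hi]) / (hi - lo)
        total += (f - local_mean) ** 2
    return math.sqrt(total / len(freqs))


# Toy sequences: one perfectly uniform at the block scale, one clustered.
uniform = "ACGT" * 50
clustered = "AAAAAAAACCCCCCCC" * 10

msd_uniform = modified_std(block_frequencies(uniform, "A", 4), half_window=1)
msd_clustered = modified_std(block_frequencies(clustered, "A", 4), half_window=1)
```

In this toy setting the uniform sequence yields a value of zero and the clustered one a strictly positive value, which matches the qualitative reading of MSD as a clustering measure. The block size and window width play the role of the "filter parameters" mentioned in the abstract only by analogy.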

Year:  2002        PMID: 12234754     DOI: 10.1006/jtbi.2002.3045

Source DB:  PubMed          Journal:  J Theor Biol        ISSN: 0022-5193            Impact factor:   2.691


  5 in total

1.  Measuring the coding potential of genomic sequences through a combination of triplet occurrence patterns and RNY preference.

Authors:  Christoforos Nikolaou; Yannis Almirantis
Journal:  J Mol Evol       Date:  2004-09       Impact factor: 2.395

2.  "Word" preference in the genomic text and genome evolution: different modes of n-tuplet usage in coding and noncoding sequences.

Authors:  Christoforos Nikolaou; Yannis Almirantis
Journal:  J Mol Evol       Date:  2005-07-19       Impact factor: 2.395

3.  Use of highly variable intergenic spacer sequences for multispacer typing of Rickettsia conorii strains.

Authors:  Pierre-Edouard Fournier; Yong Zhu; Hiroyuki Ogata; Didier Raoult
Journal:  J Clin Microbiol       Date:  2004-12       Impact factor: 5.948

4.  Distinguishing microbial genome fragments based on their composition: evolutionary and comparative genomic perspectives.

Authors:  Scott C Perry; Robert G Beiko
Journal:  Genome Biol Evol       Date:  2010-01-25       Impact factor: 3.416

5.  Evolution of genomic sequence inhomogeneity at mid-range scales.

Authors:  Ashwin Prakash; Samuel S Shepard; Jie He; Benjamin Hart; Miao Chen; Surya P Amarachintha; Olga Mileyeva-Biebesheimer; Jason Bechtel; Alexei Fedorov
Journal:  BMC Genomics       Date:  2009-11-05       Impact factor: 3.969
