
Middle-range clustering of nucleotides in genomes.

J Mrázek, J Kypr.

Abstract

We propose a novel, transparent and very simple algorithm for analyzing middle-range correlations in genomic nucleotide sequences. Applied to the EMBL Nucleotide Sequence Database, the algorithm shows that all four nucleotides cluster in eukaryotic genomic sequences on a scale of several hundred base pairs. In prokaryotes the clustering is weak but still evident. The three non-dominant bases are deficient in the clusters of each nucleotide: A is the most deficient nucleotide in clusters of C, and vice versa, and G is the most deficient nucleotide in clusters of T, and vice versa. The algorithm also detects CG islands, extending over 1 kb, in vertebrate sequences. In plants, CG islands are shown to be much smaller, if they exist at all. The TA doublet also shows a clustering tendency, whereas other doublets do not cluster. We observe no strong correlation between nucleotides separated in genomes by more than 1 kb.
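The abstract does not spell out the algorithm itself, so the following is only an illustrative sketch of one generic way to probe distance-dependent clustering, not the authors' method. It assumes a plain observed/expected ratio: for a reference base `x` and a target base `y`, it counts how often `y` occurs `d` positions downstream of `x` and divides by the count expected from the overall frequency of `y`. The function name `distance_enrichment` and its parameters are hypothetical.

```python
def distance_enrichment(seq, x, y, max_dist):
    """Return observed/expected ratios for finding base y at distances
    1..max_dist downstream of base x.

    Assumption: this is a generic stand-in statistic, not the algorithm
    from the paper. A ratio above 1 suggests y is enriched at that
    distance from x; a ratio below 1 suggests it is depleted.
    """
    seq = seq.upper()
    n = len(seq)
    freq_y = seq.count(y) / n if n else 0.0  # background frequency of y
    ratios = []
    for d in range(1, max_dist + 1):
        pairs = 0  # positions holding x that have a partner d bases away
        hits = 0   # of those, partners that are y
        for i in range(n - d):
            if seq[i] == x:
                pairs += 1
                if seq[i + d] == y:
                    hits += 1
        expected = pairs * freq_y
        ratios.append(hits / expected if expected else float("nan"))
    return ratios


# Toy sequence built from homogeneous 20-base blocks (artificial clustering).
toy = ("A" * 20 + "C" * 20 + "G" * 20 + "T" * 20) * 5
same_base = distance_enrichment(toy, "A", "A", 3)
cross_base = distance_enrichment(toy, "A", "C", 3)
```

On the toy sequence, `same_base` stays above 1 at short distances while `cross_base` stays below 1, which is the kind of signature the abstract describes as clustering of one base with the other bases deficient. Real genomic data would need longer distance ranges (hundreds of base pairs) and a faster, vectorized implementation.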

Year: 1995   PMID: 7620993   DOI: 10.1093/bioinformatics/11.2.195

Source DB: PubMed   Journal: Comput Appl Biosci   ISSN: 0266-7061


