Sequence analysis by additive scales: DNA structure for sequences and repeats of all lengths.

P Baldi, P F Baisnée.

Abstract

MOTIVATION: DNA structure plays an important role in a variety of biological processes. Different di- and tri-nucleotide scales have been proposed to capture various aspects of DNA structure including base stacking energy, propeller twist angle, protein deformability, bendability, and position preference. Yet, a general framework for the computational analysis and prediction of DNA structure is still lacking. Such a framework should in particular address the following issues: (1) construction of sequences with extremal properties; (2) quantitative evaluation of sequences with respect to a given genomic background; (3) automatic extraction of extremal sequences and profiles from genomic databases; (4) distribution and asymptotic behavior as the length N of the sequences increases; and (5) complete analysis of correlations between scales.
RESULTS: We develop a general framework for sequence analysis based on additive scales, structural or other, that addresses all these issues. We show how to construct extremal sequences and calibrate scores for automatic genomic and database extraction. We show that distributions rapidly converge to normality as N increases. Pairwise correlations between scales depend both on background distribution and sequence length and rapidly converge to an analytically predictable asymptotic value. For di- and tri-nucleotide scales, normal behavior and asymptotic correlation values are attained over a characteristic window length of about 10-15 bp. With a uniform background distribution, pairwise correlations between empirically-derived scales remain relatively small and roughly constant at all lengths, except for propeller twist and protein deformability which are positively correlated. There is a positive (resp. negative) correlation between dinucleotide base stacking (resp. propeller twist and protein deformability) and AT-content that increases in magnitude with length. The framework is applied to the analysis of various DNA tandem repeats. We derive exact expressions for counting the number of repeat unit classes at all lengths. Tandem repeats are likely to result from a variety of different mechanisms, a fraction of which is likely to depend on profiles characterized by extreme structural features.
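
As a rough illustration of what an additive dinucleotide scale means in practice, the sketch below scores a sequence by summing a per-dinucleotide value over all overlapping dinucleotides, averages that score over sliding windows, and searches for a sequence of given length with an extremal score. Everything named here is an assumption made for the example: TOY_SCALE holds arbitrary placeholder numbers (not any published structural scale), and the function names additive_score, window_profile and extremal_sequence are hypothetical. The extremal search is an ordinary dynamic program over the last base, which may differ from the construction used in the paper.

```python
from itertools import product

BASES = "ACGT"

# Hypothetical placeholder values, NOT a published scale: AA=0, AC=1, ..., TT=15.
TOY_SCALE = {a + b: float(i) for i, (a, b) in enumerate(product(BASES, repeat=2))}


def additive_score(seq, scale):
    """Sum the scale value of every overlapping dinucleotide in seq."""
    return sum(scale[seq[i:i + 2]] for i in range(len(seq) - 1))


def window_profile(seq, scale, window):
    """Mean additive score per dinucleotide in each sliding window."""
    n_dinucs = window - 1
    return [
        additive_score(seq[s:s + window], scale) / n_dinucs
        for s in range(len(seq) - window + 1)
    ]


def extremal_sequence(scale, length, maximize=True):
    """Return (score, sequence) with the largest or smallest additive score.

    Dynamic program: best[b] is the best (score, sequence) ending in base b.
    """
    pick = max if maximize else min
    best = {b: (0.0, b) for b in BASES}
    for _ in range(length - 1):
        best = {
            b: pick(
                ((best[a][0] + scale[a + b], best[a][1] + b) for a in BASES),
                key=lambda t: t[0],
            )
            for b in BASES
        }
    return pick(best.values(), key=lambda t: t[0])


if __name__ == "__main__":
    print(additive_score("ACGT", TOY_SCALE))
    print(window_profile("ACGT", TOY_SCALE, window=3))
    print(extremal_sequence(TOY_SCALE, 6, maximize=True))
```

With a real scale, the placeholder dictionary would be replaced by the 16 published dinucleotide values; a trinucleotide scale would need the dynamic program to track the last two bases instead of one.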

Year:  2000        PMID: 11120677     DOI: 10.1093/bioinformatics/16.10.865

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  17 in total

1.  The genome-wide determinants of human and chimpanzee microsatellite evolution.

Authors:  Yogeshwar D Kelkar; Svitlana Tyekucheva; Francesca Chiaromonte; Kateryna D Makova
Journal:  Genome Res       Date:  2007-11-21       Impact factor: 9.043

2.  Mutational dynamics of microsatellites. (Review)

Authors:  Atul Bhargava; F F Fuentes
Journal:  Mol Biotechnol       Date:  2010-03       Impact factor: 2.695

3.  Comparative genomics of green sulfur bacteria.

Authors:  Colin Davenport; David W Ussery; Burkhard Tümmler
Journal:  Photosynth Res       Date:  2010-01-23       Impact factor: 3.573

4.  Diversity of the abundant pKLC102/PAGI-2 family of genomic islands in Pseudomonas aeruginosa.

Authors:  Jens Klockgether; Dieco Würdemann; Oleg Reva; Lutz Wiehlmann; Burkhard Tümmler
Journal:  J Bacteriol       Date:  2006-12-28       Impact factor: 3.490

5.  Genome-wide analysis of transcription factor binding sites and their characteristic DNA structures.

Authors:  Zhiming Dai; Dongliang Guo; Xianhua Dai; Yuanyan Xiong
Journal:  BMC Genomics       Date:  2015-01-29       Impact factor: 3.969

6.  Different clustering of genomes across life using the A-T-C-G and degenerate R-Y alphabets: early and late signaling on genome evolution?

Authors:  V Kirzhner; A Paz; Z Volkovich; E Nevo; A Korol
Journal:  J Mol Evol       Date:  2007-03-19       Impact factor: 2.395

7.  Abundant oligonucleotides common to most bacteria.

Authors:  Colin F Davenport; Burkhard Tümmler
Journal:  PLoS One       Date:  2010-03-23       Impact factor: 3.240

8.  Local gene regulation details a recognition code within the LacI transcriptional factor family.

Authors:  Francisco M Camas; Eric J Alm; Juan F Poyatos
Journal:  PLoS Comput Biol       Date:  2010-11-11       Impact factor: 4.475

9.  What is a microsatellite: a computational and experimental definition based upon repeat mutational behavior at A/T and GT/AC repeats.

Authors:  Yogeshwar D Kelkar; Noelle Strubczewski; Suzanne E Hile; Francesca Chiaromonte; Kristin A Eckert; Kateryna D Makova
Journal:  Genome Biol Evol       Date:  2010-07-28       Impact factor: 3.416

10.  Novel microsatellite markers discovery in Patagonian toothfish (Dissostichus eleginoides) using high-throughput sequencing.

Authors:  Killen Ko Garcia; Jorge Touma; Scarleth Bravo; Francisco Leiva; Luis Vargas-Chacoff; Ariel Valenzuela; Patricio Datagnan; Rodolfo Amthauer; Alberto Reyes; Rodrigo Vidal
Journal:  Mol Biol Rep       Date:  2019-06-17       Impact factor: 2.316
