
Delineating relative homogeneous G+C domains in DNA sequences.

W Li.

Abstract

The concept of homogeneity of G+C content is always relative and subjective. This point is emphasized and quantified in this paper using a simple example of one sequence segmented into two subsequences. Whether the sequence is homogeneous can be decided by asking whether the two-subsequence model describes the DNA sequence better than the one-sequence model. There are at least three equivalent ways of looking at the 1-to-2 segmentation: the Jensen-Shannon divergence measure, the log-likelihood ratio test, and model selection using the Bayesian information criterion. Once a criterion is chosen, a DNA sequence can be recursively segmented into multiple domains. We use one subjective criterion, called segmentation strength, based on the Bayesian information criterion. Whether or not a sequence is homogeneous, and how many domains it has, depend on this criterion. We compare six different genome sequences (yeast S. cerevisiae chromosomes III and IV, bacterium M. pneumoniae, human major histocompatibility complex sequence, and the longest contigs in human chromosomes 21 and 22) by recursive segmentation at different strength criteria. Results by recursive segmentation confirm that yeast chromosome IV is more homogeneous than yeast chromosome III, human chromosome 21 is more homogeneous than human chromosome 22, and bacterial genomes may not be homogeneous due to short segments with distinct base compositions. The recursive segmentation also provides a quantitative criterion for identifying isochores in human sequences. Some features of our recursive segmentation, such as the possibility of delineating domain borders accurately, are superior to those of the moving-window approach commonly used in such analyses.
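
The 1-to-2 segmentation and its recursive use can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes a binary alphabet (1 = G or C, 0 = A or T), uses the standard identity that the log-likelihood ratio statistic equals 2N times the Jensen-Shannon divergence (in nats), and takes the segmentation strength to be the relative excess of that statistic over a BIC penalty of `extra_params * log(N)`. The function names, the threshold `s0`, and the default `extra_params=2` (one extra composition plus one breakpoint) are assumptions made for illustration; the abstract does not give the exact formula.

```python
import math


def entropy(p):
    """Binary Shannon entropy (in nats) of a G+C fraction p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def best_split(gc):
    """Find the cut that maximizes the log-likelihood ratio statistic.

    gc is a list of 0/1 flags (1 = G or C). For a cut at position i the
    statistic is 2 * N * D_JS, where D_JS is the Jensen-Shannon divergence
    between the compositions of gc[:i] and gc[i:].
    Returns (position, statistic), or (None, 0.0) if no cut helps.
    """
    n = len(gc)
    total = sum(gc)
    h_all = entropy(total / n)
    best_pos, best_llr = None, 0.0
    left = 0
    for i in range(1, n):
        left += gc[i - 1]
        right = total - left
        h_split = (i / n) * entropy(left / i) \
            + ((n - i) / n) * entropy(right / (n - i))
        llr = 2.0 * n * (h_all - h_split)
        if llr > best_llr:
            best_pos, best_llr = i, llr
    return best_pos, best_llr


def segment(gc, start=0, s0=0.0, extra_params=2):
    """Recursively segment gc; return sorted domain borders.

    A cut is accepted only if the assumed segmentation strength
    (llr - penalty) / penalty exceeds s0, where penalty is the BIC cost
    extra_params * log(N) of the two-subsequence model.
    """
    n = len(gc)
    if n < 2:
        return []
    pos, llr = best_split(gc)
    if pos is None:
        return []
    penalty = extra_params * math.log(n)
    strength = (llr - penalty) / penalty
    if strength <= s0:
        return []
    left = segment(gc[:pos], start, s0, extra_params)
    right = segment(gc[pos:], start + pos, s0, extra_params)
    return left + [start + pos] + right
```

For example, `segment([0] * 200 + [1] * 200)` reports a single border at position 200, while a strictly alternating sequence such as `[0, 1] * 200` yields no borders at `s0=0.0`. Raising `s0` makes the criterion stricter, so fewer domains are reported, which mirrors the abstract's point that homogeneity depends on the chosen criterion.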

Year:  2001        PMID: 11591472     DOI: 10.1016/s0378-1119(01)00672-2

Source DB:  PubMed          Journal:  Gene        ISSN: 0378-1119            Impact factor:   3.688


  14 in total

1.  IsoFinder: computational prediction of isochores in genome sequences.

Authors:  José L Oliver; Pedro Carpena; Michael Hackenberg; Pedro Bernaola-Galván
Journal:  Nucleic Acids Res       Date:  2004-07-01       Impact factor: 16.971

2.  The biased distribution of Alus in human isochores might be driven by recombination.

Authors:  Michael Hackenberg; Pedro Bernaola-Galván; Pedro Carpena; José L Oliver
Journal:  J Mol Evol       Date:  2005-03       Impact factor: 2.395

3.  Isochore structures in the genome of the plant Arabidopsis thaliana.

Authors:  Ren Zhang; Chun-Ting Zhang
Journal:  J Mol Evol       Date:  2004-08       Impact factor: 2.395

4.  Identifying compositionally homogeneous and nonhomogeneous domains within the human genome using a novel segmentation algorithm.

Authors:  Eran Elhaik; Dan Graur; Kresimir Josić; Giddy Landan
Journal:  Nucleic Acids Res       Date:  2010-06-22       Impact factor: 16.971

5.  Fine-structured multi-scaling long-range correlations in completely sequenced genomes--features, origin, and classification.

Authors:  Tobias A Knoch; Markus Göker; Rudolf Lohner; Anis Abuseiris; Frank G Grosveld
Journal:  Eur Biophys J       Date:  2009-06-17       Impact factor: 1.733

6.  Segmentation of time series with long-range fractal correlations.

Authors:  P Bernaola-Galván; J L Oliver; M Hackenberg; A V Coronado; P Ch Ivanov; P Carpena
Journal:  Eur Phys J B       Date:  2012-06-01       Impact factor: 1.500

7.  Organizational heterogeneity of vertebrate genomes.

Authors:  Svetlana Frenkel; Valery Kirzhner; Abraham Korol
Journal:  PLoS One       Date:  2012-02-27       Impact factor: 3.240

8.  CpGcluster: a distance-based algorithm for CpG-island detection.

Authors:  Michael Hackenberg; Christopher Previti; Pedro Luis Luque-Escamilla; Pedro Carpena; José Martínez-Aroza; José L Oliver
Journal:  BMC Bioinformatics       Date:  2006-10-12       Impact factor: 3.169

9.  Copy-number-variation and copy-number-alteration region detection by cumulative plots.

Authors:  Wentian Li; Annette Lee; Peter K Gregersen
Journal:  BMC Bioinformatics       Date:  2009-01-30       Impact factor: 3.169

10.  Partial correlation analysis indicates causal relationships between GC-content, exon density and recombination rate in the human genome.

Authors:  Jan Freudenberg; Mingyi Wang; Yaning Yang; Wentian Li
Journal:  BMC Bioinformatics       Date:  2009-01-30       Impact factor: 3.169
