Philipp W Messer, Peter F Arndt.
Abstract
CorGen is a web server that measures long-range correlations in the base composition of DNA and generates random sequences with the same correlation parameters. Long-range correlations are characterized by a power-law decay of the autocorrelation function of the GC-content. The widespread presence of such correlations in eukaryotic genomes calls for their incorporation into accurate null models of eukaryotic DNA in computational biology. For example, the score statistics of sequence alignment and the performance of motif-finding algorithms are significantly affected by the presence of genomic long-range correlations. We use expansion-randomization dynamics to efficiently generate the correlated random sequences. The server is available at http://corgen.molgen.mpg.de.
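To make the quantity in the abstract concrete, the following is a minimal sketch of how a GC-content autocorrelation function could be estimated from a sequence. It assumes the common definition C(r) = <g_i g_(i+r)> - <g>^2 with g_i = 1 for G or C and 0 otherwise; the record does not state the exact estimator CorGen uses, and the function names `gc_indicator` and `gc_autocorrelation` are hypothetical.

```python
def gc_indicator(seq):
    """Map a DNA string to 1 (G or C) and 0 (anything else).

    Assumption: ambiguous symbols such as N are counted as 0 so that
    distances along the sequence are preserved.
    """
    return [1 if base in "GC" else 0 for base in seq.upper()]


def gc_autocorrelation(seq, r):
    """Estimate C(r) = <g_i * g_{i+r}> - <g>^2 over all pairs at distance r."""
    g = gc_indicator(seq)
    n_pairs = len(g) - r
    if r < 0 or n_pairs <= 0:
        raise ValueError("r must satisfy 0 <= r < len(seq)")
    mean = sum(g) / len(g)
    pair_mean = sum(g[i] * g[i + r] for i in range(n_pairs)) / n_pairs
    return pair_mean - mean * mean


if __name__ == "__main__":
    # Toy sequence with strict period 2: anti-correlated at r = 1,
    # positively correlated at r = 2.
    toy = "GA" * 50
    print(gc_autocorrelation(toy, 1), gc_autocorrelation(toy, 2))
```

This direct sum costs O(N) per distance r, which is adequate for illustration; for megabase-scale sequences and many distances an FFT-based estimate would be a more practical choice.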
Year: 2006 PMID: 16845099 PMCID: PMC1538783 DOI: 10.1093/nar/gkl234
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. CorGen analysis of a 1 Mb region on human chromosome 22. The two plots in the top part show the measured GC-profile (left) and correlation function (right) of the chromosomal region. In the double-logarithmic correlation graph, power-law correlations C(r) ∝ r^(−α) show up as a straight line with slope −α. The fit was performed in the range 10 < r < 10 000, and the obtained parameters are α = 0.359 and C(10) = 0.0234 (green line). A corresponding random sequence of length 1 Mb with the measured long-range correlation parameters and the average GC-content of the query sequence has been generated and can be downloaded by the user. Its composition profile and correlation function are shown in the two plots at the bottom.
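The straight-line fit described in the caption can be illustrated with an ordinary least-squares regression on the log-log values. The sketch below is one plausible way to do this, not necessarily the procedure the server applies; `fit_power_law` is a hypothetical helper, and the data in the usage example are synthetic points generated from the caption's reported parameters, not measured values.

```python
import math


def fit_power_law(rs, cs):
    """Fit C(r) = A * r**(-alpha) by least squares on (log r, log C).

    Returns (alpha, A). Points with r <= 0 or C <= 0 are dropped because
    their logarithm is undefined.
    """
    pts = [(math.log(r), math.log(c)) for r, c in zip(rs, cs) if r > 0 and c > 0]
    if len(pts) < 2:
        raise ValueError("need at least two points with r > 0 and C > 0")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    if sxx == 0:
        raise ValueError("need at least two distinct distances r")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx                    # slope of the log-log line is -alpha
    intercept = mean_y - slope * mean_x  # log of the prefactor A
    return -slope, math.exp(intercept)


if __name__ == "__main__":
    # Synthetic, noise-free data built from alpha = 0.359 and C(10) = 0.0234.
    alpha_true, c10 = 0.359, 0.0234
    prefactor = c10 * 10 ** alpha_true
    rs = [20, 50, 100, 200, 500, 1000, 2000, 5000]  # inside 10 < r < 10 000
    cs = [prefactor * r ** (-alpha_true) for r in rs]
    alpha, a = fit_power_law(rs, cs)
    print(alpha, a * 10 ** (-alpha))  # recovered alpha and C(10)
```

Regressing on logarithms weights all decades of r equally, which matches how the line appears in a double-logarithmic plot; with noisy measured data, negative C(r) estimates would be discarded by the filter above and could bias the fit.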