Abstract
BACKGROUND: CG dinucleotides are known to be deficient in the human genome because of the high mutation rate from 5-methylated CG to TG and its complementary pair CA. Meanwhile, many cellular functions rely on these CG dinucleotides, such as gene expression controlled by cytosine methylation status. Thus, CG dinucleotides that provide essential functional substrates should be retained in genomes. How these two conflicting processes regarding the fate of CG dinucleotides (a high mutation rate destroying them versus functional processes that require their preservation) are reconciled remains an unsolved question.
Year: 2011 PMID: 21208429 PMCID: PMC3025853 DOI: 10.1186/1471-2148-11-3
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Number of SNPs and average derived-allele frequency (DAF) for different mutation types in different annotation categories
| Mutation type* | Non-CpG island, intergenic: SNPs | Non-CpG island, intergenic: DAF | Non-CpG island, genic: SNPs | Non-CpG island, genic: DAF | CpG island, intergenic: SNPs | CpG island, intergenic: DAF | CpG island, genic: SNPs | CpG island, genic: DAF |
|---|---|---|---|---|---|---|---|---|
| All | 88447 | 0.324 | 79241 | 0.317 | 1053 | 0.34 | 3120 | 0.316 |
| Tsd | 3752 | 0.314 | 4722 | 0.303 | 158 | 0.301 | 461 | 0.3 |
| C-Tsd | 22539 | 0.311 | 19525 | 0.303 | 286 | 0.324 | 861 | 0.309 |
| Tsg | 10235 | 0.353 | 9396 | 0.344 | 100 | 0.406 | 240 | 0.373 |
| C-Tsg | 21567 | 0.33 | 19866 | 0.326 | 138 | 0.401 | 403 | 0.339 |
| Tvd | 419 | 0.333 | 400 | 0.322 | 66 | 0.312 | 196 | 0.325 |
| C-Tvd | 14100 | 0.317 | 11667 | 0.309 | 140 | 0.337 | 461 | 0.3 |
| Tvg | 3415 | 0.306 | 3345 | 0.304 | 50 | 0.385 | 169 | 0.309 |
| C-Tvg | 12481 | 0.33 | 10935 | 0.319 | 108 | 0.346 | 354 | 0.302 |
* The symbols used here are identical to those in the text, with the prefix C- denoting the corresponding control dataset.
Figure 1. Derived-allele frequency for different mutation types in YRI. The left histogram is for non-CpG island regions; the right histogram is for CpG island regions. Abbreviations are the same as in the text, with the prefix C- denoting the corresponding control dataset.
Recent selection on the newly generated CG dinucleotides in different mutation categories
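| Population | Transition-generated CG: proportion indicating positive selection (1) | Transition-generated CG: DAF of the selected CG (2) | Transition-generated CG: general DAF (3) | Control for transition-generated CG: proportion (1) | Control for transition-generated CG: DAF of the selected CG (2) | Control for transition-generated CG: general DAF (3) | Transversion-generated CG: proportion indicating negative selection (1) | Transversion-generated CG: DAF of the selected CG (2) | Transversion-generated CG: general DAF (3) | Control for transversion-generated CG: proportion (1) | Control for transversion-generated CG: DAF of the selected CG (2) | Control for transversion-generated CG: general DAF (3) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|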
| YRI | 0.019 | 0.354 | 0.370 | 0.021 | 0.348 | 0.353 | 0.021 | 0.319 | 0.333 | 0.021 | 0.338 | 0.349 |
| CEU | 0.015 | 0.425 | 0.412 | 0.016 | 0.446 | 0.403 | 0.022 | 0.375 | 0.387 | 0.026 | 0.353 | 0.399 |
| ASN | 0.014 | 0.465 | 0.428 | 0.016 | 0.452 | 0.418 | 0.024 | 0.345 | 0.407 | 0.026 | 0.368 | 0.414 |
1. Proportion of the SNPs whose iHs < -2 (positive selection) or iHs > 2 (negative selection) among all SNPs;
2. DAF for SNPs with iHs < -2 (positive selection) or iHs > 2 (negative selection);
3. DAF for all SNPs.
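
The three quantities defined in the notes above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function name `summarize_ihs`, the `(iHs, DAF)` tuple layout, and the toy values are all assumptions made for the example; only the thresholds (iHs < -2 and iHs > 2) come from the notes.

```python
from statistics import mean


def summarize_ihs(snps, threshold=2.0, direction="positive"):
    """Summarize a set of SNPs given as (iHs, DAF) pairs (hypothetical layout).

    Returns a tuple of:
      1. the proportion of SNPs passing the iHs cutoff,
      2. the mean DAF of those SNPs (None if no SNP passes),
      3. the mean DAF of all SNPs.
    """
    snps = list(snps)
    if not snps:
        raise ValueError("no SNPs supplied")

    if direction == "positive":
        # Note 1: iHs < -2 is read as positive selection.
        selected = [daf for ihs, daf in snps if ihs < -threshold]
    elif direction == "negative":
        # Note 1: iHs > 2 is read as negative selection.
        selected = [daf for ihs, daf in snps if ihs > threshold]
    else:
        raise ValueError("direction must be 'positive' or 'negative'")

    proportion = len(selected) / len(snps)
    selected_daf = mean(selected) if selected else None
    general_daf = mean(daf for _, daf in snps)
    return proportion, selected_daf, general_daf


# Toy data for illustration only; these are not values from the study.
toy_snps = [(-2.5, 0.8), (0.3, 0.2), (2.4, 0.5), (-1.0, 0.1)]
print(summarize_ihs(toy_snps, direction="positive"))
print(summarize_ihs(toy_snps, direction="negative"))
```

Applying the function once per population and mutation category would produce one row group of the table, assuming per-SNP iHs and DAF values are available.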
Recent selection on the backmutated CG dinucleotides
| SNP ID | Mutation type | Position | YRI iHs | YRI DAF | CEU iHs | CEU DAF | ASN iHs | ASN DAF |
|---|---|---|---|---|---|---|---|---|
| rs9409314 | ATG->ACG | intergenic | 2.1 | 0.8 | 2.39 | 0.339 | 2.56 | 0.545 |
| rs7977620 | CAT->CGT | NAV3 upstream | 1.14 | 0.921 | 2.37 | 0.558 | 2.14 | 0.6 |
| rs4687991 | CAT->CGT | intergenic | -0.97 | 0.788 | 2.15 | 0.458 | 1.47 | 0.649 |
Figure 2. Derived-allele frequency for different mutation types in recombination hotspots and coldspots in YRI. Columns to the left of the solid line represent the pattern without the influence of the CG context (mutations not preceded by C or followed by G); columns to the right of the solid line represent the pattern with CG context.
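
The split described in the Figure 2 caption can be illustrated with a short Python sketch. It takes the caption literally: a site counts as being in CG context when the mutated base is preceded by C or followed by G. The function names, the `(previous base, next base, DAF)` tuple layout, and the toy values are assumptions for illustration; strand handling and the hotspot/coldspot partition are not modelled here.

```python
from statistics import mean


def has_cg_context(prev_base, next_base):
    """True if the mutated site is preceded by C or followed by G."""
    return prev_base.upper() == "C" or next_base.upper() == "G"


def mean_daf_by_context(sites):
    """Average DAF separately for sites with and without CG context.

    `sites` is an iterable of (prev_base, next_base, daf) tuples
    (a hypothetical layout chosen for this example).
    """
    groups = {"cg_context": [], "no_cg_context": []}
    for prev_base, next_base, daf in sites:
        key = "cg_context" if has_cg_context(prev_base, next_base) else "no_cg_context"
        groups[key].append(daf)
    # An empty group yields None rather than raising an error.
    return {key: (mean(values) if values else None) for key, values in groups.items()}


# Toy data for illustration only; these are not values from the study.
toy_sites = [("C", "A", 0.4), ("A", "G", 0.2), ("A", "T", 0.3)]
print(mean_daf_by_context(toy_sites))
```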