Carina F Mugal, Peter F Arndt, Lena Holm, Hans Ellegren.
Abstract
The genomes of many vertebrates show a characteristic variation in GC content. Three main mechanisms have been proposed to explain its origin and evolution: selection for GC content, mutation bias, and GC-biased gene conversion. At present, GC-biased gene conversion, i.e., short-scale, unidirectional exchanges between homologous chromosomes in the neighborhood of recombination-initiating double-strand breaks that favor GC nucleotides, is the most widely accepted hypothesis. Here we suggest that DNA methylation also plays an important role in the evolution of GC content in vertebrate genomes. To test this hypothesis, we investigated one mammalian (human) and one avian (chicken) genome. We used bisulfite sequencing to generate a whole-genome methylation map of chicken sperm and made use of a publicly available whole-genome methylation map of human sperm. Including these methylation maps in a model of GC content evolution provided significant support for an impact of DNA methylation on the local equilibrium GC content. Moreover, two estimates of equilibrium GC content, one that neglects and one that incorporates the impact of DNA methylation and the concomitant CpG hypermutability, differ by approximately 15% in both genomes, arguing for a strong impact of DNA methylation on the evolution of GC content. Thus, our results suggest that previous estimates of equilibrium GC content, which neglect the hypermutability of CpG dinucleotides, need to be reevaluated.
Keywords: CpG hypermutability; DNA methylation; GC content; GC isochores; GC-biased gene conversion
Year: 2015 PMID: 25591920 PMCID: PMC4349097 DOI: 10.1534/g3.114.015545
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Genome-wide averages (range) of CpG methylation level, GC content, CpG content and CpG[o/e]
| | Chicken | Human |
|---|---|---|
| CpG methylation level | 0.41 (0.18–0.53) | 0.70 (0.22–0.92) |
| GC content | 0.4034 (0.3224–0.5509) | 0.4090 (0.2939–0.6444) |
| CpG content | 0.0092 (0.0032–0.0316) | 0.0093 (0.0029–0.0445) |
| CpG[o/e] | 0.21 (0.12–0.42) | 0.20 (0.10–0.50) |
Genome-wide averages (range) were determined based on nontranscribed and nonrepetitive regions of the genome.
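The CpG[o/e] row relates the observed CpG frequency to the frequency expected from base composition alone. The sketch below uses one common definition (observed CpG frequency divided by the product of the C and G frequencies); whether this is the exact definition behind the table is an assumption, and the function name `cpg_obs_exp` is hypothetical.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio of a DNA sequence.

    Assumed definition (one common convention, not necessarily the
    one used for the table): f(CpG) / (f(C) * f(G)), with all
    frequencies taken relative to the sequence length.
    """
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    # "CG" cannot overlap with itself, so str.count gives the exact
    # number of CpG dinucleotides on this strand.
    cpg = s.count("CG")
    if c == 0 or g == 0:
        raise ValueError("sequence contains no C or no G")
    return (cpg / n) / ((c / n) * (g / n))


# Toy sequence for illustration only (not real genomic data).
ratio = cpg_obs_exp("ACGTCGGA")
print(f"CpG[o/e] = {ratio:.3f}")
```

Values well below 1, such as the genome-wide averages of about 0.2 in the table, indicate a depletion of CpG dinucleotides relative to what base composition alone would predict.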
MLR analysis of CpG → CpA/TpG substitution rate in relation to CpG methylation level and sex-averaged recombination rate
| | Chicken | | Human | |
|---|---|---|---|---|
| | Partial correlation | P-value | Partial correlation | P-value |
| CpG methylation level | | 5.09·10⁻¹⁶ | | < 2·10⁻¹⁶ |
| Recombination rate | | < 2·10⁻¹⁶ | −0.055 | 7.61·10⁻² |
| | R² = 0.36 | | R² = 0.14 | |
Partial correlations significant below a P-value threshold of 0.05 are in bold. MLR, multiple linear regression.
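A partial correlation measures the association between the response and one predictor after the other predictor has been accounted for. As a minimal sketch for the two-predictor case, it can be computed from the three pairwise Pearson correlations. The helper names and the toy numbers below are hypothetical and are not taken from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(y, x, z):
    """Correlation between y and x after controlling for z."""
    r_yx = pearson(y, x)
    r_yz = pearson(y, z)
    r_xz = pearson(x, z)
    return (r_yx - r_yz * r_xz) / math.sqrt((1 - r_yz**2) * (1 - r_xz**2))


# Made-up window-level values, for illustration only.
methylation = [0.20, 0.40, 0.50, 0.70, 0.90]
recombination = [1.0, 3.0, 2.0, 5.0, 4.0]
cpg_rate = [0.11, 0.17, 0.22, 0.26, 0.35]

print(partial_correlation(cpg_rate, methylation, recombination))
print(partial_correlation(cpg_rate, recombination, methylation))
```

With more than two predictors, the same quantity is usually obtained by correlating the residuals of two regressions on the remaining predictors.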
Genome-wide averages (range) of current GC content, GC*, and GC*CpG
| | Chicken | Human |
|---|---|---|
| GC content | 0.4034 (0.3224–0.5509) | 0.4090 (0.2939–0.6444) |
| GC* | 0.4752 (0.3277–0.7440) | 0.4027 (0.2056–0.7016) |
| GC*CpG | 0.3988 (0.2895–0.6056) | 0.3464 (0.1866–0.5725) |
Genome-wide averages (range) were determined based on nontranscribed and nonrepetitive regions of the genome.
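The abstract's statement that the two equilibrium estimates differ by approximately 15% can be checked against the genome-wide averages in this table. The sketch below takes the difference relative to GC*, which is an assumption about the denominator. It also includes the textbook two-state expression for equilibrium GC content, u / (u + v), as a reference point; that expression ignores neighbor effects such as CpG hypermutability and is not claimed to be the authors' estimator. All function names are hypothetical.

```python
def equilibrium_gc(at_to_gc_rate: float, gc_to_at_rate: float) -> float:
    """Equilibrium GC content of a simple two-state model, u / (u + v)."""
    return at_to_gc_rate / (at_to_gc_rate + gc_to_at_rate)


def relative_difference(gc_star: float, gc_star_cpg: float) -> float:
    """Difference between the two estimates, relative to GC* (assumed)."""
    return (gc_star - gc_star_cpg) / gc_star


# Genome-wide averages copied from the table above.
averages = {
    "chicken": {"gc_star": 0.4752, "gc_star_cpg": 0.3988},
    "human": {"gc_star": 0.4027, "gc_star_cpg": 0.3464},
}

for species, v in averages.items():
    diff = relative_difference(v["gc_star"], v["gc_star_cpg"])
    print(f"{species}: {diff:.1%}")
```

Under this assumption the averages give roughly 16% for chicken and 14% for human, in line with the figure quoted in the abstract. Note that a ratio of averages need not equal the average of per-window ratios.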
Figure 1. Pairwise relationships between GC* and current GC content as well as GC*CpG and current GC content (panels A and B for chicken and panels C and D for human). The black solid line represents the leading principal component fitted to the data. The intersection between the black solid and black dashed lines indicates the mean values of GC* and GC content, respectively. The red dashed line represents the bisecting line of the first quadrant (x = y).
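For two variables, the leading principal component is the line through the point of means along the direction of greatest variance. A minimal sketch of how such a line could be obtained from the 2 × 2 covariance matrix follows; the function name and the toy data are hypothetical, and this is not necessarily how the figure was produced.

```python
import math


def leading_pc_line(x, y):
    """Slope and intercept of the leading principal component in 2D.

    The line passes through the point of means. A perfectly vertical
    component would give an unbounded slope and is not handled here.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    # Orientation of the major axis of the covariance ellipse.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    slope = math.tan(theta)
    intercept = my - slope * mx
    return slope, intercept


# Toy data lying exactly on y = 2x + 1.
slope, intercept = leading_pc_line([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept)
```

Unlike ordinary least squares, this fit treats both axes symmetrically, which suits a comparison in which both variables are estimates. Comparing the fitted slope with the x = y line shows whether one quantity spans a wider range than the other.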
MLR analysis of GC*CpG and ΔGC in relation to methylation frequency, CpG[o/e], and recombination rate
| GC*CpG | ΔGC | |||||||
|---|---|---|---|---|---|---|---|---|
| Chicken | Human | Chicken | Human | |||||
| Partial Correlation | Partial Correlation | Partial Correlation | Partial Correlation | |||||
| Methylation frequency | 2.99·10−7 | 1.20·10−11 | < 2·10−16 | < 2·10−16 | ||||
| CpG[o/e] | < 2·10−16 | 0.020 | 5.11·10−1 | < 2·10−16 | 0.001 | 9.79·10−1 | ||
| Recombination rate | 4.59·10−14 | 1.20·10−6 | 3.06·10−5 | 8.97·10−16 | ||||
| R2 = 0.84 | R2 = 0.17 | R2 = 0.48 | R2 = 0.29 | |||||
Partial correlations significant below a P-value threshold of 0.05 are in bold. MLR, multiple linear regression.
Pearson correlation coefficients between methylation frequency, CpG[o/e], and recombination rate for chicken (lower left) and human (upper right)
| | Methylation frequency | CpG[o/e] | Recombination rate |
|---|---|---|---|
| Methylation frequency | − | 0.79 | 0.30 |
| CpG[o/e] | 0.91 | − | 0.29 |
| Recombination rate | 0.58 | 0.56 | − |
All P-values < 2e-16.
Figure 2. Amount of variation in GC*CpG as well as ΔGC explained by the different explanatory variables based on principal component regression analysis (panels A and B for chicken and panels C and D for human, respectively). The height of each bar represents how much of the variance in GC* or ΔGC, respectively, is explained by the corresponding principal component. The size of each colored area is proportional to the relative contribution of the respective genomic feature within each principal component.