Jonas Berglund, Javier Quilez, Peter F Arndt, Matthew T Webster.
Abstract
The positive-regulatory domain containing nine gene, PRDM9, which strongly associates with the location of recombination events in several vertebrates, is inferred to be inactive in the dog genome. Here, we address several questions regarding the control of recombination and its influence on genome evolution in dogs. First, we address whether the association between CpG islands (CGIs) and recombination hotspots is generated by lack of methylation, GC-biased gene conversion (gBGC), or both. Using a genome-wide dog single nucleotide polymorphism data set and comparisons of the dog genome with related species, we show that recombination-associated CGIs have low CpG mutation rates, and that CpG mutation rate is negatively correlated with recombination rate genome wide, indicating that nonmethylation attracts the recombination machinery. We next use a neighbor-dependent model of nucleotide substitution to disentangle the effects of CpG mutability and gBGC and analyze the effects that loss of PRDM9 has on these rates. We infer that methylation patterns have been stable during canid genome evolution, but that dog CGIs have experienced a drastic increase in substitution rate due to gBGC, consistent with increased levels of recombination in these regions. We also show that gBGC is likely to have generated many new CGIs in the dog genome, but these mostly occur away from genes, whereas the number of CGIs in gene promoter regions has not increased greatly in recent evolutionary history. Recombination has a major impact on the distribution of CGIs that are detected in the dog genome due to the interaction between methylation and gBGC. The results indicate that germline methylation patterns are the main determinant of recombination rates in the absence of PRDM9.
Keywords: CpG island; biased gene conversion; dog genome; methylation; recombination
Year: 2014 PMID: 25527838 PMCID: PMC4350167 DOI: 10.1093/gbe/evu282
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Overlap in the Dog Genome between GC-Peaks and CGIs, and Two Studies That Inferred the Location of Hotspots (Axelsson et al. 2012; Auton et al. 2013)
| | GC-Peaks | CGIs | Axelsson | Auton |
|---|---|---|---|---|
| Number | 28,947 | 110,104 | 4,019 | 7,104 |
| Mean size (bp) | 965 | 481 | 33,461 | 21,896 |
| Overlaps (GC-peaks with CGIs; Axelsson with Auton) | 22,851 | | 1,177 | |
| Mean overlap size (bp) | 588 | | 16,041 | |
Comparison of Number of CGIs in Different Genomic Regions in Dog
| | CGIs | Promoter CGIs | Genic CGIs | Intergenic CGIs |
|---|---|---|---|---|
| Genome | 110,104 | 860 | 2,606 | 107,863 |
| Hotspots | 17,740 (16%) | 286 (33%) | 538 (21%) | 17,321 (16%) |
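The percentages in the Hotspots row are the share of each CGI category that falls inside hotspots. As a minimal sketch, the Python below recomputes them from the counts in the table; the dictionary keys and the `hotspot_share` helper are illustrative names, not part of the original analysis.

```python
# Counts copied from the table: CGIs genome wide and CGIs inside hotspots.
genome = {"All": 110_104, "Promoter": 860, "Genic": 2_606, "Intergenic": 107_863}
hotspot = {"All": 17_740, "Promoter": 286, "Genic": 538, "Intergenic": 17_321}

def hotspot_share(category: str) -> float:
    """Fraction of CGIs in a category that lie inside recombination hotspots."""
    return hotspot[category] / genome[category]

baseline = hotspot_share("All")
for category in genome:
    share = hotspot_share(category)
    # Fold difference relative to the share for all CGIs.
    print(f"{category:10s} {share:6.1%}  ({share / baseline:.2f}x the all-CGI share)")
```

Run this way, promoter CGIs show about twice the hotspot share of CGIs overall, which matches the 33% versus 16% reported in the table.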
Average Recombination Rates (cM/Mb) Measured at CGIs in Different Genomic Locations. Columns marked "aligned" are defined on the aligned portion between dog–panda–cat.
| Rates | CGI Dog (genome wide) | CGI Dog Only (aligned) | CGI Panda Only (aligned) | CGI Shared (aligned) | Total (aligned) |
|---|---|---|---|---|---|
| Genome | 3.50 | 3.67 | 1.41 | 3.82 | 3.25 |
| Hotspot | 7.13 | 7.13 | 3.28 | 7.52 | 6.83 |
| Promoter | 4.87 | 4.78 | 2.15 | 4.95 | 4.79 |
| Genic | 3.32 | 3.32 | 1.34 | 3.60 | 2.91 |
| Intergenic | 3.51 | 3.68 | 1.41 | 3.83 | 3.26 |
Figure: Patterns of mutation inferred from SNPs observed in dog resequencing data in different genomic regions and their intersections.
Number of Polymorphic Sites from Resequencing Data (Axelsson et al. 2012) and Their Sequence Context in the Dog Genome and Recombination Hotspots
| Type | Mutations (genome wide) | Bases (genome wide) | Frequency (genome wide) | Mutations (in hotspots) | Bases (in hotspots) | Frequency (in hotspots) |
|---|---|---|---|---|---|---|
| WS | 1,415,440 | 1,207,172,903 | 0.0012 | 81,652 | 80,377,076 | 0.0010 |
| SW | 1,619,424 | 822,609,669 | 0.0020 | 91,810 | 60,164,049 | 0.0015 |
| SW-non-CpG | 1,197,963 | 784,911,639 | 0.0015 | 65,783 | 56,519,456 | 0.0012 |
| SW-CpG | 421,461 | 37,698,030 | 0.0112 | 26,027 | 3,644,593 | 0.0071 |
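Each Frequency value in this table is the number of polymorphic sites divided by the number of bases in that sequence context. The sketch below recomputes the frequencies from the counts and compares S→W frequencies at CpG and non-CpG sites. It assumes the usual convention that W means A/T and S means G/C, which the table does not spell out, and the function names are illustrative.

```python
# Counts copied from the table: (mutations, bases) per mutation type.
genome_wide = {
    "WS": (1_415_440, 1_207_172_903),
    "SW": (1_619_424, 822_609_669),
    "SW-non-CpG": (1_197_963, 784_911_639),
    "SW-CpG": (421_461, 37_698_030),
}
hotspots = {
    "WS": (81_652, 80_377_076),
    "SW": (91_810, 60_164_049),
    "SW-non-CpG": (65_783, 56_519_456),
    "SW-CpG": (26_027, 3_644_593),
}

def frequencies(counts):
    """Return polymorphic sites per base for each mutation type."""
    return {kind: muts / bases for kind, (muts, bases) in counts.items()}

def cpg_excess(freqs):
    """Ratio of the S->W frequency at CpG sites to that at non-CpG sites."""
    return freqs["SW-CpG"] / freqs["SW-non-CpG"]

gw = frequencies(genome_wide)
hs = frequencies(hotspots)

for kind in genome_wide:
    print(f"{kind:11s} genome wide {gw[kind]:.4f}   hotspots {hs[kind]:.4f}")
print(f"CpG excess: genome wide {cpg_excess(gw):.1f}x, hotspots {cpg_excess(hs):.1f}x")
```

The CpG excess comes out lower in hotspots than genome wide, in line with the abstract's statement that recombination-associated regions have low CpG mutation rates.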
Number of CGIs and Their Distribution in the Dog and Panda Genomes
| Numbers | Dog Total | Dog Shared | Dog Gain | Dog Loss | Panda Total | Panda Shared | Panda Gain | Panda Loss |
|---|---|---|---|---|---|---|---|---|
| Genome | 73,628 | 24,865 | 35,464 | 8,900 | 44,518 | 25,724 | 9,894 | 13,299 |
| Hotspot | 13,055 (18) | 5,961 (24) | 5,374 (15) | 759 (9) | 7,945 (18) | 6,216 (24) | 970 (10) | 1,720 (13) |
| Promoter | 585 (0.8) | 424 (2) | 107 (0.3) | 10 (0.1) | 439 (1) | 414 (2) | 15 (0.2) | 54 (0.4) |
| Genic | 1,958 (3) | 900 (4) | 736 (2) | 328 (4) | 1,569 (4) | 905 (4) | 336 (3) | 322 (2) |
| Intergenic | 71,789 (98) | 24,134 (97) | 34,683 (98) | 8,590 (97) | 43,065 (97) | 24,915 (97) | 9,560 (97) | 12,972 (98) |
Note.—Values in parentheses are percentages in genomic regions in relation to the genome-wide count for each category of CGI.
Figure: CpG mutability in regions of different recombination rate. Error bars represent 95% confidence intervals from bootstrapping.
Figure: Patterns of nucleotide substitution in dog (blue) and panda (red) lineages inferred from dog–panda–cat alignments. Error bars represent 95% confidence intervals from bootstrapping. (a) Substitution frequencies in (i) dog and (ii) panda. (b) Mean substitution frequencies in dog and panda. (c) Current and stationary GC-content (GC*) in (i) dog and (ii) panda.
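Panel (c) compares current GC-content with stationary GC-content (GC*). In the simplest two-state view, GC* is the equilibrium reached when W→S and S→W changes balance: GC* = u / (u + v), where u is the W→S rate per W base and v is the S→W rate per S base. The sketch below implements only that textbook formula, not the neighbor-dependent model the authors fit. The function name `gc_star` is hypothetical, and the inputs are the rounded WS and SW SNP frequencies from the table of polymorphic sites, used purely as concrete numbers and not as lineage substitution rates.

```python
def gc_star(ws_rate: float, sw_rate: float) -> float:
    """Equilibrium GC-content for a two-state W/S model: u / (u + v)."""
    total = ws_rate + sw_rate
    if total <= 0:
        raise ValueError("at least one rate must be positive")
    return ws_rate / total

# Illustrative inputs only: rounded WS and SW SNP frequencies from the table
# of polymorphic sites (genome wide and in hotspots), not fitted substitution rates.
genome_wide = gc_star(ws_rate=0.0012, sw_rate=0.0020)
in_hotspots = gc_star(ws_rate=0.0010, sw_rate=0.0015)

print(f"GC* genome wide: {genome_wide:.3f}")
print(f"GC* in hotspots: {in_hotspots:.3f}")
```

A larger W→S share relative to S→W pushes GC* upward, which is the direction expected where gBGC is stronger.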