Evan Koch, Rena M Schweizer, Teia M Schweizer, Daniel R Stahler, Douglas W Smith, Robert K Wayne, John Novembre.
Abstract
Knowledge of mutation rates is crucial for calibrating population genetics models of demographic history in units of years. However, mutation rates remain challenging to estimate because of the need to identify extremely rare events. We estimated the nuclear mutation rate in wolves by identifying de novo mutations in a pedigree of seven wolves. Putative de novo mutations were discovered by whole-genome sequencing and were verified by Sanger sequencing of parents and offspring. Using stringent filters and an estimate of the false negative rate in the remaining observable genome, we obtain an estimate of ∼4.5 × 10−9 per base pair per generation and provide conservative bounds of 2.6 × 10−9 and 7.1 × 10−9. Although our estimate is consistent with recent mutation rate estimates from ancient DNA (4.0 × 10−9 and 3.0–4.5 × 10−9), it implies a wider possible range. We also examined the consequences of our rate and the accompanying interval for dating several critical events in canid demographic history. For example, applying our full range of rates to coalescent models of dog and wolf demographic history implies a wide set of possible divergence times between the ancestral populations of dogs and extant Eurasian wolves (16,000–64,000 years ago), although our point estimate indicates a date between 25,000 and 33,000 years ago. Aside from one study in mice, ours provides the only direct mammalian mutation rate outside of primates, and is likely to be vital to future investigations of mutation rate evolution.
Year: 2019 PMID: 31297530 PMCID: PMC6805234 DOI: 10.1093/molbev/msz159
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Fig. 1. Whole-genome sequences from seven Yellowstone wolves of known pedigree were analyzed to detect DNMs. For each male (square) or female (circle) wolf, the top number indicates the Yellowstone National Park wolf ID and the bottom number provides the birth year. Below each individual is the average sequencing depth of coverage (see supplementary fig. S1, Supplementary Material online, for more detail on coverage per individual). The number of verified DNMs in each of the four offspring is provided at the top of the circle or square.
Fig. 2. Number of sites remaining after the sequential application of filters. Filters were applied independently in each trio to remove regions of the genome producing false positives. The final bars represent the number of sites ultimately examined for candidate DNMs. Filters were applied successively, potentially obscuring the effects that each might have if applied individually to the raw set of sites. A detailed description of all filters is given in the methods section of the main text.
Examination of Potential DNMs.
| | YNP 570M | YNP 629M | YNP 645F | YNP 694F |
|---|---|---|---|---|
| ≥1 alt. read | 2,676 | 3,529 | 3,935 | 3,225 |
| DNP > 0.3 | 112 | 109 | 106 | 108 |
| Sanger sequenced | 32 | 15 | 18 | 19 |
| Failed | 6 | 1 | 3 | 4 |
| Confirmed DNM | 12 | 5 | 7 | 3 |
Note.—The first three rows give the number of sites in each trio that passed all filtering steps and had at least one alternative read, the number of those with a DNp score >0.3, and the number chosen for Sanger sequencing. The final two rows give, out of the sites for which Sanger sequencing was attempted, the number that failed to sequence and the number of confirmed DNMs.
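The counts in the table can be tallied to see how many Sanger-sequenced candidates were neither confirmed nor lost to failed sequencing. The Python sketch below does that arithmetic. The dictionary layout and the helper name `rejected` are illustrative, and reading "sequenced minus failed minus confirmed" as candidates that sequenced successfully but were not de novo is an assumption about how the rows relate.

```python
# Counts copied from the table: (Sanger sequenced, failed, confirmed DNM).
trios = {
    "YNP 570M": (32, 6, 12),
    "YNP 629M": (15, 1, 5),
    "YNP 645F": (18, 3, 7),
    "YNP 694F": (19, 4, 3),
}


def rejected(sequenced, failed, confirmed):
    """Candidates that sequenced successfully but were not confirmed (assumed reading)."""
    return sequenced - failed - confirmed


rejected_by_trio = {name: rejected(*counts) for name, counts in trios.items()}
total_sequenced = sum(counts[0] for counts in trios.values())
total_failed = sum(counts[1] for counts in trios.values())
total_confirmed = sum(counts[2] for counts in trios.values())

for name, n in rejected_by_trio.items():
    print(f"{name}: {n} candidates sequenced but not confirmed")
print(f"Totals: {total_sequenced} sequenced, {total_failed} failed, "
      f"{total_confirmed} confirmed")
```

Under this reading, the failed sites are the ones whose status stays unknown, which is why they matter when bounding the mutation count.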
Fig. 3. Sanger sequencing of potential DNMs. (A) Two different quality scores at sites chosen for Sanger sequencing are plotted against the background distribution of scores from sites within the same trio that passed all filters and had at least one alternative read. MQRankSum is a measure of mapping quality, where negative values indicate that reads with alternative alleles mapped less well than reads with reference alleles, potentially reflecting mismapped reads. QD measures base quality score normalized by sequencing depth and is a metric for sequencing quality. (B) GC content in 100 bp surrounding sites chosen for Sanger sequencing is compared with the GC content in the rest of the genome that passed all filters. Sites where Sanger sequencing failed often fell in regions with high GC content, motivating additional filtering based on GC content.
Fig. 4. Trio false negative rates (FNRs) by sequencing depth in the offspring. FNRs were estimated for each possible sequencing depth in each offspring, then were multiplied by the fraction of sites in the offspring with that depth of coverage. This provides the contribution from each sequencing depth to the overall FNR at sites passing all filters. The overall FNRs are the sum of these points.
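The depth weighting described in this caption amounts to a weighted sum: each depth contributes its FNR times the fraction of sites at that depth, and the overall FNR is the sum of those contributions. A minimal Python sketch follows. The function name `overall_fnr` and every number in the example are hypothetical placeholders, not values from the study.

```python
def overall_fnr(fnr_by_depth, sites_by_depth):
    """Return per-depth contributions and their sum (the overall FNR).

    fnr_by_depth:   depth -> estimated false negative rate at that depth
    sites_by_depth: depth -> number of filtered sites with that depth
    """
    total_sites = sum(sites_by_depth.values())
    contributions = {
        depth: fnr_by_depth[depth] * n_sites / total_sites
        for depth, n_sites in sites_by_depth.items()
    }
    return contributions, sum(contributions.values())


# Hypothetical inputs, for illustration only.
example_fnr = {10: 0.40, 20: 0.10, 30: 0.05}
example_sites = {10: 1_000, 20: 6_000, 30: 3_000}

contributions, total = overall_fnr(example_fnr, example_sites)
print(contributions, round(total, 3))
```

In this toy example the low-depth sites have the highest FNR but contribute modestly because few sites fall at that depth, which is the effect the plotted points are meant to show.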
Fig. 5. Posterior distributions on the mutation rate. (A) Posterior distributions on the per-generation mutation rate without filtering based on GC content. The curve corresponding to the minimum mutation count is the posterior distribution we would calculate if all candidate sites that failed Sanger sequencing are false positives. The curve corresponding to the maximum mutation count is the posterior distribution we would calculate if all failed candidate sites are true DNMs. Dots beneath the curves show the point estimates of the mutation rates that would be obtained from the different trios. (B) Posterior distributions on the per-generation mutation rate after filtering based on GC content.
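To illustrate how minimum and maximum mutation counts translate into two posterior curves, the sketch below evaluates a posterior on a grid of candidate rates. It is not the authors' model: the Poisson likelihood, the flat prior, the function name `grid_posterior`, the number of callable sites, and the FNR are all assumptions chosen for illustration. Only the counts (27 confirmed, 14 failed) are summed from the Sanger table, and using them here as the minimum and maximum counts is itself an assumption.

```python
import math


def grid_posterior(n_mutations, callable_sites, fnr, rates):
    """Normalized posterior over candidate rates (flat prior, Poisson likelihood)."""
    # Each callable site is transmitted in two copies (one per parent), and only
    # a fraction (1 - fnr) of true mutations is expected to be detected.
    opportunities = 2 * callable_sites * (1 - fnr)
    log_like = [
        n_mutations * math.log(rate * opportunities) - rate * opportunities
        for rate in rates
    ]
    peak = max(log_like)  # subtract the peak before exponentiating, for stability
    weights = [math.exp(value - peak) for value in log_like]
    total = sum(weights)
    return [w / total for w in weights]


rates = [i * 1e-10 for i in range(1, 201)]  # 1e-10 to 2e-8 per bp per generation
CALLABLE_SITES = 3.0e9  # hypothetical total across trios
FNR = 0.10              # hypothetical overall false negative rate

confirmed, failed = 27, 14  # totals from the Sanger table
post_min = grid_posterior(confirmed, CALLABLE_SITES, FNR, rates)
post_max = grid_posterior(confirmed + failed, CALLABLE_SITES, FNR, rates)

mode_min = rates[post_min.index(max(post_min))]
mode_max = rates[post_max.index(max(post_max))]
print(f"mode with minimum count: {mode_min:.2e}")
print(f"mode with maximum count: {mode_max:.2e}")
```

With these placeholder inputs the two modes differ only because of how the failed sites are counted, which mirrors the contrast between the two curves in panel (A).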
Recalibration of Estimated Divergence Times in Canid History.
| Divergence Event | Published Dates (ka) | Our Recalibration (ka) |
|---|---|---|
| Western Eurasian dogs vs. East Asian dogs | LF: 6 (6–11) | 5 (4–17) |
| Mexican wolves vs. Yellowstone wolves | BvH: 14 (12–18) | 12 (8–28) |
| | ZF: 14 (10–17) | 12 (7–26) |
| Basenji vs. other dogs | AF: 32 (29–34) | 28 (19–52) |
| | ZF: 21 (19–23) | 19 (12–35) |
| European wolves vs. East Asian wolves | AF: 33 (29–38) | 29 (19–58) |
| | BvH: 27 (24–30) | 24 (16–46) |
| Dogs vs. wolves | AF: 37 (35–40) | 33 (23–62) |
| | BvH: 28 (24–30) | 25 (16–46) |
| | ZF: 29 (24–30) | 26 (16–46) |
| | LF: 34 (17–48) | 30 (11–74) |
| North American wolves vs. Eurasian wolves | BvH: 31 (28–32) | 28 (18–49) |
| | ZF: 31 (29–33) | 28 (19–51) |
| Coyotes vs. wolves | BvH: 165 (158–171) | 146 (102–264) |
| Golden Jackals vs. Coyote/Wolf ancestors | AF: 995 (797–1,038) | 884 (514–1,596) |
Note.—Estimated divergence times were taken from four studies that used coalescent models to reconstruct canid population history. Times from different papers have been denoted as follows: AF (Freedman et al. 2014), BvH (vonHoldt et al. 2016), ZF (Fan et al. 2016), and LF (Frantz et al. 2016). The AF, BvH, and ZF studies used the Generalized Phylogenetic Coalescent Sampler method (G-PhoCS) (Gronau et al. 2011). Point estimates represent posterior means and intervals are 95% credible intervals. The LF study used relative cross-coalescent rates calculated using MSMC (Schiffels and Durbin 2014) to estimate divergence times. Point estimates are the times when between-population coalescent rates exceeded 50% of the within-population coalescent rates, and intervals give the corresponding times for 25% and 50% of the within-population coalescent rate. Published dates from Freedman et al. (2014) were scaled to a mutation rate of 4.0 × 10−9 to be comparable with other studies. We recalibrated divergence times by rescaling point estimates using our estimated mutation rate of 4.5 × 10−9. Lower bounds on divergence times were obtained by rescaling the lower bound on estimated rates using our upper bound on the mutation rate, 6.2 × 10−9, and upper bounds on divergence times were obtained by rescaling the upper bound on estimated rates using our lower bound on the mutation rate, 2.8 × 10−9.
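The rescaling described in this note can be sketched in a few lines of Python. Because a coalescent divergence time in years is inversely proportional to the assumed mutation rate, a published time is multiplied by the ratio of the published rate to the new rate. The function name `recalibrate` and the constant names are illustrative. The example uses the AF dogs vs. wolves row, whose published dates the note says were scaled to 4.0 × 10−9.

```python
def recalibrate(time_ka, published_rate, new_rate):
    """Rescale a divergence time (ka) from one per-generation mutation rate to another."""
    # Time in years scales as 1 / rate, so a higher rate gives a younger date.
    return time_ka * published_rate / new_rate


PUBLISHED_RATE = 4.0e-9  # rate the AF dates were scaled to, per the note
POINT_RATE = 4.5e-9      # point estimate of the mutation rate, per the note
UPPER_RATE = 6.2e-9      # upper bound on the rate quoted in the note

# AF dogs vs. wolves: published 37 (35-40) ka.
point = recalibrate(37, PUBLISHED_RATE, POINT_RATE)
lower = recalibrate(35, PUBLISHED_RATE, UPPER_RATE)

print(f"point estimate: {point:.1f} ka")
print(f"lower bound:    {lower:.1f} ka")
```

The upper bound on a divergence time follows from the same call, passing the published upper bound together with the lower bound on the mutation rate. Results can differ slightly from the tabulated values depending on rounding and on which rate bounds are used.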