Søren Besenbacher, Patrick Sulem, Agnar Helgason, Hannes Helgason, Helgi Kristjansson, Aslaug Jonasdottir, Adalbjorg Jonasdottir, Olafur Th Magnusson, Unnur Thorsteinsdottir, Gisli Masson, Augustine Kong, Daniel F Gudbjartsson, Kari Stefansson.
Abstract
Mutation of the DNA molecule is one of the most fundamental processes in biology. In this study, we use 283 parent-offspring trios to estimate the rate of mutation for both single nucleotide variants (SNVs) and short length variants (indels) in humans and examine the mutation process. We found 17812 SNVs, corresponding to a mutation rate of 1.29 × 10⁻⁸ per position per generation (PPPG), and 1282 indels, corresponding to a rate of 9.29 × 10⁻¹⁰ PPPG. We estimate that around 3% of human de novo SNVs are part of a multi-nucleotide mutation (MNM), with 558 (3.1%) of mutations positioned less than 20 kb from another mutation in the same individual (median distance of 525 bp). The rate of de novo mutations is greater in late replicating regions (p = 8.29 × 10⁻¹⁹) and nearer recombination events (p = 0.0038) than elsewhere in the genome.
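As a rough consistency check on the figures in the abstract, a rate per position per generation can be read as the mutation count divided by the number of transmitted haploid genomes (two per trio) times the number of positions surveyed. The abstract does not state the number of positions, so the sketch below back-calculates it from the reported SNV rate. That implied figure, the simplified model, and the function names are assumptions for illustration, not the study's actual procedure.

```python
# Counts and the SNV rate are taken from the abstract.
N_TRIOS = 283
N_SNV = 17812
N_INDEL = 1282
SNV_RATE_PPPG = 1.29e-8


def implied_positions(n_mut, rate, n_trios):
    """Positions per haploid genome implied by a count and a rate.

    Assumes rate = n_mut / (2 * n_trios * positions); a simplification.
    """
    return n_mut / (rate * 2 * n_trios)


def rate_pppg(n_mut, positions, n_trios):
    """Rate per position per generation under the same simplified model."""
    return n_mut / (2 * n_trios * positions)


# Back-calculated denominator (an assumption, not a reported value).
positions = implied_positions(N_SNV, SNV_RATE_PPPG, N_TRIOS)

# Reuse that denominator for the indel count.
indel_rate = rate_pppg(N_INDEL, positions, N_TRIOS)

print(f"implied positions per haploid genome: {positions:.3e}")
print(f"indel rate under the same denominator: {indel_rate:.3e}")
```

Under this simplification the indel rate comes out within about 0.1% of the reported 9.29 × 10⁻¹⁰ PPPG, which suggests the two reported rates share essentially the same denominator.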
Year: 2016 PMID: 27846220 PMCID: PMC5147774 DOI: 10.1371/journal.pgen.1006315
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Fig 1. The correlation between the estimated mutation rate for each child and the age of the parents.
The rate of different types of mutations.
| Mutation type | Sequence context | Transition vs. transversion | Number of mutations | Mutation rate PPPY × 10¹⁰ | 95% c.i. |
|---|---|---|---|---|---|
| SNV | CpG | Transition | 2984 | 39.81 | (38.40–41.26) |
| SNV | CpG | Transversion | 281 | 3.75 | (3.34–4.21) |
| SNV | CpG | All | 3265 | 43.55 | (42.09–45.07) |
| SNV | nonCpG Strong (C or G) | Transition | 4264 | 2.64 | (2.56–2.72) |
| SNV | nonCpG Strong (C or G) | Transversion | 3019 | 1.87 | (1.80–1.93) |
| SNV | nonCpG Strong (C or G) | All | 7283 | 4.50 | (4.40–4.61) |
| SNV | nonCpG Weak (A or T) | Transition | 4758 | 1.90 | (1.85–1.96) |
| SNV | nonCpG Weak (A or T) | Transversion | 2506 | 1.00 | (0.96–1.04) |
| SNV | nonCpG Weak (A or T) | All | 7264 | 2.91 | (2.84–2.97) |
| Insertion | All | | 353 | 0.08 | (0.08–0.09) |
| Deletion | All | | 929 | 0.22 | (0.21–0.24) |
The rates per position per year (PPPY) for different types of mutations. G and C base pairs are referred to as strong because they are bound by three hydrogen bonds while weak (A and T) base pairs are bound by two hydrogen bonds.
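The counts in the table can be cross-checked against each other and against the totals in the abstract. The short sketch below sums the transition and transversion counts per sequence context and derives an overall transition/transversion ratio. The ratio is computed here for illustration and is not a figure reported in the table; the dictionary layout and variable names are assumptions.

```python
# SNV counts copied from the table, keyed by sequence context.
counts = {
    "CpG": {"transition": 2984, "transversion": 281},
    "nonCpG strong": {"transition": 4264, "transversion": 3019},
    "nonCpG weak": {"transition": 4758, "transversion": 2506},
}

# Per-context totals; these should match the "All" rows of the table.
totals = {ctx: sum(c.values()) for ctx, c in counts.items()}

# Grand total; this should match the SNV count given in the abstract.
n_snv = sum(totals.values())

# Overall transition/transversion ratio across all contexts.
n_ts = sum(c["transition"] for c in counts.values())
n_tv = sum(c["transversion"] for c in counts.values())
ti_tv = n_ts / n_tv

print(totals)
print(f"total SNVs: {n_snv}, Ti/Tv: {ti_tv:.2f}")
```

The per-context sums reproduce the "All" rows (3265, 7283, 7264) and add up to the 17812 SNVs stated in the abstract.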
Fig 2. Clustering mutations.
(a) The red line shows a QQ-plot of the observed distances between all pairs of mutations (both within and between individuals) compared to the expected distances assuming independence. The green line shows a QQ plot based only on distances between mutations that occurred in the same individual. The blue line shows a QQ plot based only on distances between mutations that occurred in different individuals. (b) A histogram of the number of mutations per cluster. (c) Histogram showing the distribution of distances to the nearest mutation in the same individual.
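The distance to the nearest mutation in the same individual, as plotted in panel (c), can be computed by sorting positions per chromosome and taking the smaller of the gaps to the two neighbours. The sketch below is a minimal illustration: the coordinates are made up, the function name is hypothetical, and only the 20 kb threshold comes from the abstract.

```python
from collections import defaultdict

# Threshold from the abstract: "less than 20kb from another mutation".
CLUSTER_MAX_BP = 20_000


def nearest_distances(mutations):
    """Map each (chrom, pos) to the distance to the nearest other mutation
    on the same chromosome, or None if it is the only one there.

    `mutations` holds the de novo calls of ONE individual; positions are
    assumed to be unique within a chromosome.
    """
    by_chrom = defaultdict(list)
    for chrom, pos in mutations:
        by_chrom[chrom].append(pos)

    out = {}
    for chrom, positions in by_chrom.items():
        positions.sort()
        for i, pos in enumerate(positions):
            gaps = []
            if i > 0:
                gaps.append(pos - positions[i - 1])
            if i < len(positions) - 1:
                gaps.append(positions[i + 1] - pos)
            out[(chrom, pos)] = min(gaps) if gaps else None
    return out


# Hypothetical coordinates for a single individual (not study data).
example = [("chr1", 1_000_000), ("chr1", 1_000_525),
           ("chr1", 5_000_000), ("chr2", 42)]
dist = nearest_distances(example)
clustered = [m for m, d in dist.items()
             if d is not None and d < CLUSTER_MAX_BP]
print(clustered)
```

Restricting the comparison to a single individual matters here, since the caption distinguishes distances within individuals from distances between individuals.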
Fig 3. Distribution of mutation types.
The relative distribution of different types of mutations stratified by the distance to nearest mutation in the same individual. The error bars are 95% confidence intervals.
Fig 4. Distribution of mutation types with sequence context.
The relative distribution of different types of mutation when the bases immediately 5’ and 3’ to the mutated base are included. The mutations are stratified by whether they clustered 10 bp to 20 kb from another mutation or were not part of a cluster. The error bars are 95% confidence intervals.
Fig 5. The effect of replication timing on the mutation rate.
(a) The effect of replication timing at CpG, non-CpG Strong (C or G) and non-CpG Weak (A or T) sites. The y-values are mutation rates per position per year (PPPY). The x-values are the wavelet-smoothed signal of replication timing calculated by the ENCODE project. Early replicating regions have high signal values and late replicating regions have low values. The bands around the points show the 95% confidence interval for each point. (b) The effect of replication timing on clustering and non-clustering mutations. The y-axis shows a combined odds ratio for CpG, non-CpG Strong and non-CpG Weak sites calculated using the Cochran-Mantel-Haenszel method. An OR over 1 indicates that we observe most mutations in the latest replicating half of the genome. The error bars are 95% confidence intervals.
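A combined odds ratio of the kind described for panel (b) pools one 2×2 table per stratum (here the strata would be the three site classes) into a single estimate. The sketch below implements the standard Mantel-Haenszel pooled odds ratio. The counts are invented for illustration, and the cell layout (mutated versus non-mutated sites in the late versus early replicating half) is an assumption about how such tables might be set up, not the study's actual data or code.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio over 2x2 tables.

    Each stratum is (a, b, c, d), assumed here to mean:
      a = mutated sites, late-replicating half
      b = non-mutated sites, late-replicating half
      c = mutated sites, early-replicating half
      d = non-mutated sites, early-replicating half
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den


# Hypothetical counts for two strata (not values from the study).
strata = [
    (10, 90, 5, 95),
    (20, 80, 10, 90),
]
pooled = mantel_haenszel_or(strata)
print(f"pooled OR: {pooled:.2f}")
```

With a single stratum the estimator reduces to the ordinary odds ratio a·d / (b·c). Pooling across strata keeps the very different baseline rates of the site classes from distorting the comparison.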