Sébastien Wielgoss, Jeffrey E Barrick, Olivier Tenaillon, Stéphane Cruveiller, Béatrice Chane-Woon-Ming, Claudine Médigue, Richard E Lenski, Dominique Schneider.
Abstract
The quantification of spontaneous mutation rates is crucial for a mechanistic understanding of the evolutionary process. In bacteria, traditional estimates using experimental or comparative genetic methods are prone to statistical uncertainty, and consequently estimates vary by over one order of magnitude. With the advent of next-generation sequencing, more accurate estimates are now possible. We sequenced 19 Escherichia coli genomes from a 40,000-generation evolution experiment and directly inferred the point-mutation rate based on the accumulation of synonymous substitutions. The resulting estimate was 8.9 × 10⁻¹¹ per base-pair per generation, and there was a significant bias toward increased AT-content. We also compared our results with published genome sequence datasets for other bacterial evolution experiments. Given the power of our approach, our estimate represents the most accurate measure of bacterial base-substitution rates available to date.
Year: 2011 PMID: 22207905 PMCID: PMC3246271 DOI: 10.1534/g3.111.000406
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Table 1 Description of 35 synonymous mutations observed in 19 genomes sampled from eight evolving populations
| Population | Genome Position | Gene | Base Change | Sequenced Clones |
|---|---|---|---|---|
| Ara–1 | – | – | – | 20K-A, 20K-B, 20K-C |
| Ara–3 | 756,799 | | C→T | 30K-B, 40K |
| Ara–3 | 2,613,609 | | G→A | 30K-B |
| Ara–3 | 2,642,843 | | G→T | 30K-B |
| Ara–3 | 2,983,794 | | C→T | 40K |
| Ara–3 | 3,141,566 | | C→T | 40K |
| Ara–3 | 3,407,922 | | C→A | 40K |
| Ara–3 | 4,111,342 | | C→T | 30K-A |
| Ara–3 | 4,177,963 | | T→G | 30K-A |
| Ara–3 | 4,107,018 | | T→A | 30K-B, 40K |
| Ara–3 | 4,313,510 | | C→T | 40K |
| Ara–5 | 157,626 | | A→T | 40K-B |
| Ara–5 | 307,594 | | C→T | 40K-A, 40K-B, 40K-C |
| Ara–5 | 3,107,610 | | T→A | 40K-A, 40K-B, 40K-C |
| Ara–6 | 857,058 | | C→T | 40K-B |
| Ara–6 | 1,352,030 | | G→T | 40K-B |
| Ara–6 | 2,087,738 | | C→A | 40K-A, 40K-B |
| Ara–6 | 2,095,621 | | G→A | 40K-A |
| Ara–6 | 3,482,212 | | G→A | 40K-B |
| Ara+1 | 132,062 | | C→T | 40K-A |
| Ara+1 | 239,002 | | A→C | 40K-B |
| Ara+1 | 3,124,208 | | G→A | 40K-A |
| Ara+1 | 3,308,106 | | G→A | 40K-A, 40K-B |
| Ara+1 | 3,409,316 | | T→G | 40K-A, 40K-B |
| Ara+1 | 3,527,027 | | C→A | 40K-B |
| Ara+1 | 3,910,606 | | T→G | 40K-B |
| Ara+1 | 4,133,104 | | G→A | 40K-A, 40K-B |
| Ara+2 | 1,083,668 | | C→T | 40K-A |
| Ara+2 | – | – | – | 40K-B |
| Ara+4 | 420,328 | | A→C | 40K-A, 40K-B |
| Ara+4 | 2,772,320 | | A→C | 40K-A, 40K-B |
| Ara+4 | 3,061,109 | | G→A | 40K-A |
| Ara+5 | 122,591 | | T→A | 40K-A, 40K-B |
| Ara+5 | 212,865 | | T→C | 40K-A, 40K-B |
| Ara+5 | 1,317,194 | | G→A | 40K-A, 40K-B |
| Ara+5 | 2,009,188 | | G→T | 40K-A, 40K-B |
| Ara+5 | 2,251,393 | | G→A | 40K-A, 40K-B |
Genome position in the ancestral reference strain REL606 [GenBank:NC_012967.1].
20K, 30K, and 40K indicate clones sampled after 20,000, 30,000, and 40,000 generations, respectively, and labels A, B and C indicate different clones from the same generation.
Table 2 Base-substitution rates estimated from evolution experiments with whole-genome data
| Study | Bacterial Strain | Clones | Cumulative Generations | Synonymous Sites (bp) | Synonymous Mutations | µ (× 10⁻¹¹ per bp per generation) [95% confidence limits] |
|---|---|---|---|---|---|---|
| This study | | 19 | 300,000 | 941,000 | 25 (52) | 8.9 [5.7–13] |
| | | 12 | 10,700 | 930,000 | 5 | 50 [16–120] |
| | | 4 | 13,850 | 945,000 | 2 | 15 [1.9–55] |
| | | 1 | 5,000 | 990,000 | 2 | 40 [4.9–150] |
| | | 1 | 1,000 | 2,140,000 | 1 | 47 [1.2–260] |
For these calculations, we used only independently evolved end-point clones, and we pooled data from replicate lineages started from the same ancestral strain.
The effective synonymous target size was calculated from the ancestral genome sequences (see Materials and Methods).
The mutation rate µ (per bp per generation) was calculated as the number of observed synonymous mutations divided by the product of the total number of generations and the effective number of synonymous target sites. Brackets indicate 95% confidence limits estimated from a binomial distribution. These estimates do not take into account base composition or changes in genome size.
For comparison with the other datasets, we used only the first clone sequenced at the latest nonmutator time point from each of the eight long-term populations: 20K-A for Ara–1, 40K for Ara–3, and 40K-A for the other six populations (Table 1). There were 25 synonymous mutations in these clones and 52 overall in the dataset. A more accurate estimate of µ and its uncertainty for the long-term lines takes into account the multiple clones sequenced from the same population, the pseudo-replication of clones from the same population, the base signatures of the mutations, and changes in genome size. That comprehensive analysis yields 8.9 [4.0–14] × 10⁻¹¹ per bp per generation (see text).
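The simple rate calculation described in these notes (observed synonymous mutations divided by the product of cumulative generations and synonymous target sites) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the source says only that the limits come from a binomial distribution, so the exact (Clopper-Pearson) interval used here is an assumption. Each bp-generation is treated as one binomial trial. The inputs are the numeric columns of the rate table above.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term in log space."""
    total = 0.0
    log_comb = 0.0  # log of C(n, 0)
    for j in range(k + 1):
        if j > 0:
            log_comb += math.log((n - j + 1) / j)  # C(n, j) from C(n, j - 1)
        total += math.exp(log_comb + j * math.log(p) + (n - j) * math.log1p(-p))
    return total


def solve_p(k, n, target, iterations=200):
    """Bisect for the p at which binom_cdf(k, n, p) equals target.

    The CDF decreases monotonically as p increases, so bisection is safe.
    """
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if binom_cdf(k, n, mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def synonymous_rate(mutations, generations, sites, alpha=0.05):
    """Return (rate, lower, upper) per bp per generation.

    Assumption: exact two-sided binomial limits with generations * sites trials.
    """
    trials = generations * sites
    rate = mutations / trials
    lower = 0.0 if mutations == 0 else solve_p(mutations - 1, trials, 1 - alpha / 2)
    upper = solve_p(mutations, trials, alpha / 2)
    return rate, lower, upper


# (synonymous mutations, cumulative generations, synonymous sites) per table row
TABLE_ROWS = [
    (25, 300_000, 941_000),
    (5, 10_700, 930_000),
    (2, 13_850, 945_000),
    (2, 5_000, 990_000),
    (1, 1_000, 2_140_000),
]

for m, g, s in TABLE_ROWS:
    rate, lower, upper = synonymous_rate(m, g, s)
    print(f"{rate * 1e11:.1f} [{lower * 1e11:.1f}, {upper * 1e11:.1f}] x 10^-11")
```

Under these assumptions the printed values should land close to the bracketed entries in the table. The sketch does not attempt the comprehensive analysis, which also accounts for multiple clones per population, base signatures, and genome-size changes.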
Figure 1 Expected and observed mutational spectra for synonymous point mutations. White and black bars show the expected and observed base-pair changes, respectively. The expected values reflect the actual base-pair frequencies in the genome and the probability that a particular base-pair mutation (e.g., from C:G to T:A) produces a synonymous change.
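The observed side of such a spectrum can be tallied directly from the base changes listed in Table 1. The sketch below is illustrative only: the helper names and the choice to report each change with the pyrimidine as the reference base are assumptions, and whether the figure counts exactly these 35 distinct mutations is not stated here. The expected values are not reproduced, since they require the ancestral genome sequence.

```python
from collections import Counter

# Base changes transcribed from Table 1, one entry per distinct mutation.
BASE_CHANGES = {
    "Ara-3": ["C>T", "G>A", "G>T", "C>T", "C>T", "C>A", "C>T", "T>G", "T>A", "C>T"],
    "Ara-5": ["A>T", "C>T", "T>A"],
    "Ara-6": ["C>T", "G>T", "C>A", "G>A", "G>A"],
    "Ara+1": ["C>T", "A>C", "G>A", "G>A", "T>G", "C>A", "T>G", "G>A"],
    "Ara+2": ["C>T"],
    "Ara+4": ["A>C", "A>C", "G>A"],
    "Ara+5": ["T>A", "T>C", "G>A", "G>T", "G>A"],
}

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def base_pair_class(change):
    """Map a single-strand change such as 'G>A' to a base-pair class such as 'C:G>T:A'."""
    ref, alt = change.split(">")
    if ref in "AG":
        # Purine reference: describe the same event on the complementary strand.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}:{COMPLEMENT[ref]}>{alt}:{COMPLEMENT[alt]}"


def tally_spectrum(changes_by_population):
    """Count mutations in each of the six base-pair substitution classes."""
    counts = Counter()
    for changes in changes_by_population.values():
        counts.update(base_pair_class(c) for c in changes)
    return counts


spectrum = tally_spectrum(BASE_CHANGES)
# Changes that replace a C:G pair with an A:T or T:A pair raise AT-content.
at_increasing = spectrum["C:G>T:A"] + spectrum["C:G>A:T"]
# Changes that replace a T:A pair with a C:G or G:C pair raise GC-content.
gc_increasing = spectrum["T:A>C:G"] + spectrum["T:A>G:C"]

for cls, n in sorted(spectrum.items()):
    print(cls, n)
print("AT-increasing:", at_increasing, "GC-increasing:", gc_increasing)
```

Comparing the AT-increasing and GC-increasing totals gives a rough view of the direction of the bias reported in the abstract; a proper test would weight each class by its expected frequency, as the figure does.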