Jiao Pan, Weiyi Li, Jiahao Ni, Kun Wu, Iain Konigsberg, Caitlyn E Rivera, Clayton Tincher, Colin Gregory, Xia Zhou, Thomas G Doak, Heewook Lee, Yan Wang, Xiang Gao, Michael Lynch, Hongan Long.
Abstract
Because errors at the DNA level power pathogen evolution, a systematic understanding of the rate and molecular spectrum of mutations could guide the avoidance and treatment of infectious diseases. We thus accumulated tens of thousands of spontaneous mutations in 768 repeatedly bottlenecked lineages of 18 strains from various geographical sites, temporal spread, and genetic backgrounds. Entailing ∼1.36 million generations, the resultant data yield an average mutation rate of ∼0.0005 per genome per generation, with significant within-species variation. This is one of the lowest bacterial mutation rates reported, giving direct support for high genome stability in this pathogen resulting from high DNA-mismatch-repair efficiency and replication-machinery fidelity. Pathogenicity genes do not exhibit an accelerated mutation rate, and thus elevated mutation rates may not be the major determinant for the diversification of toxin and secretion systems. Intriguingly, a low error rate at the transcript level is not observed, suggesting distinct fidelity of the replication and transcription machineries. This work urges more attention on the most basic evolutionary processes of even the best-known human pathogens, and deepens the understanding of their genome evolution.
Keywords: Salmonella; bacteria; genome evolution; spontaneous mutation; transcript error
Year: 2022 PMID: 35446958 PMCID: PMC9040049 DOI: 10.1093/molbev/msac081
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 8.800
Table 1. The Base-Pair Substitution (BPS) Mutation Rates of Bacteria Estimated with MA-WGS.
Table 2. General Information of S. enterica in This Study.
| Serovar | Strain | Catalog No. | n | cov | Sites | G | BPS | m | ts/tv | Indels | Errors | Ne |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Agona | BAA-707(T) | BAA-707 | 43 | 113 | 4.38 | 2034 | 54 | 1.66 | 1.57/8.46 | 9(7,2) | 700(3) | 14 |
| Bareilly | 9115(H) | ATCC9115 | 45 | 83 | 4.46 | 2037 | 50 | 2.90 | 0.79/5.74 | 11(8,3) | 1470(3) | 14 |
| Dublin | 2469(Q) | SGSC2469 | 42 | 88 | 4.45 | 2040 | 47 | 4.14 | 1.61/7.09 | 6(4,2) | 977(3) | 14 |
| Enteritidis | LJH608(I) | BAA-1045 | 43 | 76 | 4.40 | 1960 | 47 | 2.56 | 2.36/7.92 | 2(2,0) | 784(3) | 13.5 |
| Newport | C487-69(P) | ATCC27869 | 40 | 96 | 4.49 | 2118 | 50 | 4.20 | 1.17/6.49 | 4(3,1) | 2037(3) | 14.5 |
| Paratyphi A | 9150(J) | ATCC9150 | 43 | 106 | 4.22 | 2031 | 45 | 2.21 | 1.25/5.18 | 3(3,0) | 1402(3) | 13.5 |
| Typhi | ST1(D) | SGSC2728 | 44 | 125 | 4.24 | 1988 | 36 | 1.53 | 1.12/6.30 | 3(1,2) | 1686(3) | 13.5 |
| Typhi | CT18(E) | SGSC4072 | 41 | 98 | 4.24 | 1984 | 49 | 1.57 | 1.58/5.80 | 5(2,3) | 1924(4) | 13.5 |
| Typhi | Ty2(F) | SGSC2666 | 44 | 116 | 4.24 | 1946 | 35 | 3.10 | 1.06/6.39 | 6(2,4) | 1405(3) | 13 |
| Typhi | PNG31(G) | SGSC3194 | 44 | 121 | 3.52 | 1984 | 23 | 1.20 | 0.92/6.11 | 8(5,3) | 1543(3) | 13.5 |
| Typhi | CDC3137-73(L) | SGSC2660 | 44 | 123 | 4.25 | 1970 | 39 | 1.29 | 1.05/6.39 | 3(3,0) | 1802(3) | 13.5 |
| Typhi | CDC1707-81(M) | SGSC2661 | 45 | 116 | 4.25 | 1954 | 42 | 1.74 | 1.33/5.58 | 6(4,2) | 856(3) | 13.5 |
| Typhi | CDC382-82(N) | SGSC2664 | 42 | 131 | 4.20 | 1994 | 28 | 1.63 | 6.00/5.74 | 8(6,2) | 1422(3) | 14 |
| Typhi | CDC9228-77(R) | SGSC2657 | 42 | 127 | 4.13 | 1923 | 20 | 2.70 | 5.67/5.51 | 4(1,3) | 1420(4) | 13 |
| Typhimurium | LT2(K) | ATCC19585 | 47 | 139 | 4.81 | 2000 | 80 | 2.34 | 1.22/5.59 | 6(4,2) | 712(2) | 13.5 |
| Typhimurium | LT2 Δ | SGSC1350 | 12 | 99 | 4.81 | 2079 | 4287 | 0.56 | 70.45/– | 220(54,166) | – | 14 |
| Typhimurium | LT2 Δ | SGSC1348 | 10 | 94 | 4.80 | 2002 | 2817 | 0.81 | 60.24/– | 186(65,121) | – | 13.5 |
| Typhimurium | LT2 Δ | SGSC1349 | 9 | 254 | 4.81 | 2002 | 2349 | 0.67 | 64.25/6.65 | 183(59,124) | 130(2) | 13.5 |
Note: BPS, total number of base-substitution mutations in all MA lines; Catalog No., catalog numbers from the American Type Culture Collection (strain names starting with ATCC or BAA) or the Salmonella Genetic Stock Center (starting with SGSC); cov, mean depth of sequencing coverage; G, mean number of generations of MA lines; Errors, total number of transcript errors, with the number of replicates in parentheses; Indels, total number of indel mutations in all MA lines; m, mutation bias in the A/T direction, μ(G/C→A/T) / μ(A/T→G/C); n, number of MA lines; Sites, mean number of sites covered by reads, in Mbp; Strain, strain names with the MA group letter in parentheses; ts/tv, the ratios of transitions to transversions for mutations/transcript errors; Ne, the effective population size of the MA lines, estimated by the harmonic mean method.
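The columns n, Sites, G, and BPS are enough to recover an approximate per-site BPS mutation rate for each strain. The sketch below assumes the simple pooled estimator μ = BPS / (n × Sites × G), which uses the table's mean values; the study's own estimates may instead be averaged over individual MA lines, so treat the output as a rough check rather than the published figure. The function names are illustrative only.

```python
# Pooled point estimate of the BPS mutation rate from one table row.
# Assumption: mu = BPS / (n * Sites * G), using the mean Sites and mean G
# reported in the table rather than per-line values.

def bps_rate_per_site(bps, n_lines, sites_mbp, generations):
    """BPS mutations per covered site per generation."""
    sites = sites_mbp * 1e6  # the table reports Sites in Mbp
    return bps / (n_lines * sites * generations)

def bps_rate_per_genome(bps, n_lines, generations):
    """BPS mutations per (covered) genome per generation."""
    return bps / (n_lines * generations)

# Agona row: n = 43, Sites = 4.38 Mbp, G = 2034, BPS = 54
mu_site = bps_rate_per_site(54, 43, 4.38, 2034)
mu_genome = bps_rate_per_genome(54, 43, 2034)

print(f"{mu_site:.2e}")    # 1.41e-10
print(f"{mu_genome:.2e}")  # 6.17e-04
```

For the Agona row this gives a per-site rate on the order of 10^−10 and a per-genome rate close to the ∼0.0005 average quoted in the abstract.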
Fig. 1. Rates and distribution of mutations and transcript errors of the 15 MMR-functional Salmonella enterica strains. (Left) The phylogenetic relationship of the strains based on whole-genome SNPs. (Center) The genome-wide distribution of mutations for each strain (the upper row with sparse colored bars) and transcript errors (the lower row with dense colored bars). (Right) Heat map showing for each strain BPS mutation rates (× 10^−10 per site per generation); indel mutation rates (× 10^−10 per site per generation); and transcript-error rates (× 10^−6 per site per transcription).
Fig. 2. Genome-wide distribution of mutations. Circles from outside to inside (the different line and background colors are for contrast only): orange, genomic coordinates in Mbp; blue with red outlines, the number of mutations per 4.86-kbp bin (the whole genome contains 1,000 bins), with mutations pooled from MA lines of all three MMR-deficient S. Typhimurium LT2 strains (ΔmutS, ΔmutL, ΔmutH); green with yellow bars, the number of mutations per 4.86-kbp bin, with mutations pooled from MA lines of all 15 MMR-functional strains (table 2); the gray blocks in the green circle mark regions not covered by reads in most non-LT2 wild-type strains. The four groups of colored tiles show, from outside to inside, the genomic distribution of mutations in S. Typhimurium LT2 ΔmutL, ΔmutS, ΔmutH, and all the MMR-functional strains. Red, gray, blue, green, purple, and black tiles mark A:T → C:G, A:T → G:C, A:T → T:A, G:C → A:T, G:C → C:G, and G:C → T:A base-substitution mutations, respectively. Five single tiles were jittered away from the tiles below because they overlapped or lay too close to other mutations. ORI, origin of replication; TER, replication terminus.
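The per-bin counts described in this caption can be reproduced with simple integer binning. The sketch below is only an illustration: the 4,860-bp bin size and 1,000 bins come from the caption, while the 1-based coordinate convention, the clamping of trailing positions into the last bin, and the example positions are assumptions made here, not data from the study.

```python
BIN_SIZE = 4_860  # bp per bin, as stated in the caption
N_BINS = 1_000    # bins across the whole genome, as stated in the caption

def bin_counts(positions, bin_size=BIN_SIZE, n_bins=N_BINS):
    """Count mutations per fixed-width bin from 1-based genomic positions."""
    counts = [0] * n_bins
    for pos in positions:
        # Clamp so positions past the last full bin fall into the final bin.
        idx = min((pos - 1) // bin_size, n_bins - 1)
        counts[idx] += 1
    return counts

# Made-up positions for illustration only.
positions = [1, 4_860, 4_861, 2_430_000, 4_860_000]
counts = bin_counts(positions)

print(counts[0], counts[1], counts[499], counts[999])  # 2 1 1 1
```

Pooling mutations across MA lines, as the caption describes, would amount to concatenating the position lists of those lines before binning.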
Fig. 3. Mutation and transcript-error comparison between different genomic regions, strains, or species. (A) The genome size, base-substitution rate (per genome per generation), and effective population size (Ne) of bacteria studied with MA-WGS (Dettman et al. 2016; Lynch et al. 2016; Long, Sung, et al. 2018; Pan et al. 2021). 1, Agrobacterium tumefaciens C58; 2, Bacillus subtilis NCIB 3610; 3, Burkholderia cenocepacia HI2424; 4, Escherichia coli K-12 MG1655; 5, Photorhabdus luminescens ATCC29999; 6, Pseudomonas aeruginosa PA14; 7, Salmonella enterica (the dark symbol uses Ne from this study; 7′, the light symbol, uses Ne from Bobay and Ochman 2018); 8, Staphylococcus epidermidis ATCC 12228; 9, Vibrio cholerae 2740-80. (B and E) Mutation and transcript-error spectra of the MMR-functional strains; all, all the strains; Typhi, the eight strains of the Typhi serovar; others, the seven non-Typhi strains. (C) The comparison of BPS mutation rates, indel mutation rates, and transcript-error rates of all MMR-functional strains; CDS, coding sequences; ncRNA, noncoding RNA; Intergenic, genomic regions with coding sequences and noncoding RNA excluded. (D) The spectra of mutations and transcript errors in MMR-functional versus ΔmutS Typhimurium LT2. (F) The molecular spectra (per site per transcription) of transcript errors in the coding sequences of E. coli K-12 MG1655 versus S. enterica, the latter based on transcript errors pooled from the 15 MMR-functional strains. All error bars are standard errors of the mean.