Rohan Maddamsetti, Philip J Hatcher, Anna G Green, Barry L Williams, Debora S Marks, Richard E Lenski.
Abstract
Bacteria can evolve rapidly under positive selection owing to their vast numbers, allowing their genes to diversify by adapting to different environments. We asked whether the same genes that evolve rapidly in the long-term evolution experiment (LTEE) with Escherichia coli have also diversified extensively in nature. To make this comparison, we identified ∼2000 core genes shared among 60 E. coli strains. During the LTEE, core genes accumulated significantly more nonsynonymous mutations than flexible (i.e., noncore) genes. Furthermore, core genes under positive selection in the LTEE are more conserved in nature than the average core gene. In some cases, adaptive mutations appear to modify protein functions, rather than merely knocking them out. The LTEE conditions are novel for E. coli, at least in relation to its evolutionary history in nature. The constancy and simplicity of the environment likely favor the complete loss of some unused functions and the fine-tuning of others.
Keywords: core genome; experimental evolution; fine-tuning mutations; loss-of-function mutations; molecular evolution
Year: 2017; PMID: 28379360; PMCID: PMC5406848; DOI: 10.1093/gbe/evx064
Source DB: PubMed; Journal: Genome Biol Evol; ISSN: 1759-6653; Impact factor: 3.416
Nonsynonymous Mutations are Over-represented in the Core Genome of Nonmutator LTEE Populations
| Category and Population | Core | Flexible | Odds Ratio | Significance |
|---|---|---|---|---|
| Nonsynonymous mutations in nonmutator populations | 123 | 51 | 2.41 | |
| Synonymous mutations in nonmutator populations | 10 | 10 | 1.00 | |
| Nonsynonymous mutations in mutator populations | 2265 | 2510 | 0.90 | |
| Synonymous mutations in mutator populations | 838 | 860 | 0.97 | |
Note.—The lengths of the core and flexible (i.e., noncore) portions of the coding sequences in the genome of the LTEE ancestor (E. coli strain REL606) are 1,944,921 and 2,066,263 bp, respectively. Data show the numbers of mutations found in the core and flexible portions of genomes sampled and sequenced at 50,000 generations from six nonmutator populations that retained the ancestral point-mutation rate and six mutator populations that evolved hypermutability. The odds ratio expresses the extent to which the category of mutation is overrepresented (>1) or underrepresented (<1) in the core genome relative to the flexible genome in the indicated populations. The P-value is based on a two-tailed binomial test comparing the observed numbers of mutations to the expectations based on the relative lengths of the core and flexible genomes.
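The test described in the note can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, and the two-tailed P-value uses one common convention (summing the probabilities of all outcomes no more likely than the observed one), which is an assumption about how the test was carried out. The simple ratio of core to flexible counts is also reported, since it matches the tabulated odds ratios (for example, 123/51 ≈ 2.41).

```python
from math import lgamma, log, exp

# Lengths of the core and flexible coding sequences in REL606 (from the table note).
CORE_BP = 1_944_921
FLEX_BP = 2_066_263


def log_binom_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p))


def two_tailed_binomial_p(k, n, p):
    """Two-tailed exact binomial P-value (assumed convention: sum the
    probabilities of every outcome that is no more likely than k)."""
    log_obs = log_binom_pmf(k, n, p)
    tol = 1e-9  # small log-space tolerance so ties with the observed outcome count
    total = 0.0
    for i in range(n + 1):
        lp = log_binom_pmf(i, n, p)
        if lp <= log_obs + tol:
            total += exp(lp)
    return min(1.0, total)


def summarize(core, flexible):
    """Compare observed core/flexible counts with the length-based expectation."""
    n = core + flexible
    p_core = CORE_BP / (CORE_BP + FLEX_BP)  # expected fraction of mutations in core
    return {
        "count_ratio": core / flexible,
        "expected_core": n * p_core,
        "p_value": two_tailed_binomial_p(core, n, p_core),
    }


# Nonsynonymous mutations in nonmutator populations (first table row).
result = summarize(123, 51)
print(round(result["count_ratio"], 2), result["p_value"])
```

Working in log space with `lgamma` keeps the calculation stable for the mutator rows, where the total number of mutations runs into the thousands.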
Relationship between positive selection in the LTEE and nonsynonymous sequence diversity of core genes in the E. coli collection of 60 clinical, environmental, and laboratory strains. The G score provides a measure of positive selection based on the excess of nonsynonymous mutations in the LTEE lineages that retained the ancestral point-mutation rate. The log10 and square-root transformations of the G score and sequence diversity, respectively, improve visual dispersion of the data for individual genes, but they do not affect the nonparametric tests performed, which depend only on rank order. (A) G scores and sequence diversity are very weakly negatively correlated across all 1968 core genes (Spearman-rank correlation r = –0.0701, P = 0.0019). (B) The correlation is not significant using only the 163 genes with positive G scores (Spearman-rank correlation r = –0.0476, P = 0.5463). (C) The 163 core genes with positive G scores in the LTEE have significantly lower nonsynonymous sequence diversity in natural isolates than the 1805 genes with zero G scores (Mann–Whitney U = 125,660, P = 0.0020). Error bars show 95% confidence intervals around the median.
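The point that monotonic transformations leave rank-based statistics unchanged can be checked with a short sketch. The values below are hypothetical toy numbers, not data from the paper, and the helper names are illustrative. All toy G scores are positive so that log10 is defined; how zero G scores were handled for plotting is not stated here.

```python
from math import log10, sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values, NOT taken from the paper.
g_scores = [0.5, 2.0, 8.0, 1.5, 30.0, 4.0]
diversity = [0.09, 0.04, 0.01, 0.16, 0.0025, 0.04]

raw = spearman(g_scores, diversity)
transformed = spearman([log10(g) for g in g_scores],
                       [sqrt(d) for d in diversity])
print(raw, transformed)  # the two values agree because the ranks are identical
```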
Relationship between positive selection in the LTEE and nonsynonymous sequence divergence of panorthologs between E. coli (strain REL606) and S. enterica. REL606 is the common ancestor of the LTEE populations. See figure 1 for additional details. (A) G scores and divergence are negatively correlated across all 2853 panorthologs (Spearman-rank correlation r = –0.0911, P < 10^−5). (B) The correlation remains significant even when using only the 210 panorthologs with positive G scores (Spearman-rank correlation r = –0.2564, P = 0.0002). (C) The 210 panorthologs with positive G scores in the LTEE are significantly less diverged between E. coli and S. enterica in natural isolates than the 2643 panorthologs with zero G scores (Mann–Whitney U = 223,330, P < 10^−5). Error bars show 95% confidence intervals around the median.
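For readers unfamiliar with the Mann–Whitney U statistic quoted in panel (C), the following sketch computes it by direct pair counting on hypothetical toy values (not data from the paper). It returns the statistic for each group; the two always sum to the product of the group sizes, which for the 210 and 2643 panorthologs above is 555,030. No P-value is computed here, and which of the two U values a given report quotes is a matter of convention.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) by direct pair counting; tied pairs count one half.

    U_a counts the pairs in which the value from sample_a exceeds the value
    from sample_b, so U_a + U_b == len(sample_a) * len(sample_b).
    """
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical divergence values, NOT taken from the paper.
positive_g = [0.01, 0.02, 0.02, 0.05]       # genes with positive G scores
zero_g = [0.02, 0.04, 0.06, 0.08, 0.10]     # genes with zero G scores

u_pos, u_zero = mann_whitney_u(positive_g, zero_g)
print(u_pos, u_zero)
```

Pair counting is quadratic in the sample sizes, which is acceptable for a few thousand genes; a rank-sum formulation gives the same statistic more efficiently.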
KEIO essentiality score and G score for the 57 genes with two or more nonsynonymous changes in nonmutator LTEE genomes. The transformation of the G score improves visual dispersion of the data for clarity. Triangles are core genes (panorthologs) and circles are flexible (noncore) genes. Genes affected by at least one potential knockout mutation (small indel, IS-element insertion, or large deletion) are labeled in green, and genes without any of these potential knockout mutations are labeled in purple. The 10 genes that had parallel mutations at the amino-acid level are additionally indicated in bold.
Parallel amino-acid mutations in the LTEE occur at protein interfaces. For clarity, only relevant protein domains are shown. (A) The I129S mutation is on the dimerization interface of the response regulator AtoC, based on the Aquifex aeolicus structure 1NY5. (B) Q506L occurs on a multimerization interface of the metalloprotease FtsH, encoded by hflB, in the Thermotoga maritima structure 2CE7. (C) R132C in ribosomal initiation factor IF3 interacts with the anticodon of the fMet-tRNA in the Thermus thermophilus structure 5LMQ. (D) Mutations at residue 50 in 30S ribosomal protein S4 lie on the interface with protein S5 in the E. coli ribosome structure 3J9Y. (E) Mutations at residue 294 directly contact the coenzyme NAD in the Haemophilus influenzae NadR protein structure 1LW7, while mutations at residues 290 and 298 are adjacent to 294 on the same face of the alpha helix. (F) A301S occurs at the A/A’ multimerization interface of pyruvate kinase, encoded by pykF, in the E. coli structure 4YNG. (G) T30N occurs at the dimerization interface of the DNA-binding domain of the transcriptional repressor FabR in the Pseudomonas aeruginosa structure 3LSR. (H) N653H occurs at the dimerization interface between ACT4 amino-acid binding domains of the bifunctional (p)ppGpp synthase/hydrolase SpoT in the Chlorobium tepidum structure 3IBW.
Genes and Their Associated Phenotypes that Show Evidence of Positive Selection in the LTEE
| Process | Genes | Phenotype | References |
|---|---|---|---|
| Cell size and shape | | Larger size, elongated shape | |
| Glucose transport | | Increased uptake | |
| Maltose transport | | Loss | |
| Transcription | | Unknown | |
| Translation | | Translational speed and accuracy; possible compensation for cost of streptomycin resistance in ancestor | |
| Acetate metabolism and glyoxylate shunt | | Acetate assimilation | |
| DNA supercoiling | | Changes in global transcriptional regulation | |
| CRP regulon | | Regulation of catabolism | |
| ppGpp regulon | | Regulation of ribosome synthesis | |
| Osmolarity regulation | | Unknown | |
Note.—See also Tenaillon et al. (2016) for evidence of gene-level parallelism.