
Effects of Synonymous Mutations beyond Codon Bias: The Evidence for Adaptive Synonymous Substitutions from Microbial Evolution Experiments.

Susan F Bailey¹, Luz Angela Alonso Morales², Rees Kassen².

Abstract

Synonymous mutations are often assumed to be neutral with respect to fitness because they do not alter the encoded amino acid and so cannot be "seen" by natural selection. Yet a growing body of evidence suggests that synonymous mutations can have fitness effects that drive adaptive evolution through their impacts on gene expression and protein folding. Here, we review what microbial experiments have taught us about the contribution of synonymous mutations to adaptation. A survey of site-directed mutagenesis experiments reveals that the distributions of fitness effects for nonsynonymous and synonymous mutations are more similar, especially for beneficial mutations, than expected if all synonymous mutations were neutral, suggesting they should drive adaptive evolution more often than is typically observed. A review of experimental evolution studies where synonymous mutations have contributed to adaptation shows they can impact fitness through a range of mechanisms, including the creation of illicit RNA polymerase binding sites that impact transcription and changes to mRNA folding stability that modulate translation. We suggest that clonal interference in evolving microbial populations may be the reason synonymous mutations play a smaller role in adaptive evolution than expected based on their observed fitness effects. We finish by discussing the impacts of falsely assuming synonymous mutations are neutral and directions for future work exploring the role of synonymous mutations in adaptive evolution.


Keywords: distribution of fitness effects; experimental evolution; positive selection; synonymous mutations


Year: 2021 PMID: 34132772 PMCID: PMC8410137 DOI: 10.1093/gbe/evab141

Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416

Significance

Synonymous mutations are often assumed to be nearly neutral; however, growing evidence suggests that they can have large beneficial effects and contribute to adaptive evolution. We review the evidence from microbial experimental evolution studies exploring the role of synonymous mutations in adaptive evolution and the mechanisms underlying their fitness effects. Our review suggests that, while synonymous mutations can play a more important role in adaptation than previously thought, nonsynonymous mutations remain the most common route to adaptation in microbial populations, at least. Our results suggest avenues for future experimental evolution research in this field that better account for synonymous mutations in adaptive evolution.

Introduction

Synonymous mutations do not alter the encoded amino acids but can still impact fitness via their effects on gene expression and protein structure. Evidence for these fitness effects now comes from a range of observational, comparative genomics, and experimental studies. In some ways, the idea that synonymous mutations can impact fitness is not at all surprising: widespread variation in codon bias (differences in the frequency of synonymous codons) observed across different species, genes, and even gene regions has long suggested that this might be the case (Grantham et al. 1980; Post and Nomura 1980; Qin et al. 2004; Hershberg and Petrov 2008). Codon usage biases within a given genome often correlate with tRNA copy numbers (Ikemura 1981; Andersson and Kurland 1990), and highly expressed and essential genes tend to have a higher frequency of optimal codons (Gouy and Gautier 1982), supporting the idea that codon usage is indeed under selection rather than simply being the result of mutational biases. However, most comparative genomic observations, theoretical models, and early in vivo experimental tests have suggested that selection for codon usage is quite weak (e.g., s ≤ 10⁻⁵; Carlini and Stephan 2003) and so any adaptive evolution at synonymous sites must occur over very long timescales.

More recently, and perhaps more surprisingly, there is mounting evidence that some synonymous mutations can have large fitness effects and so have the potential to play a role in adaptive evolution over much shorter timescales. There are now numerous examples of large-effect synonymous mutations associated with human disease (Sauna and Kimchi-Sarfaty 2011; Hunt et al. 2014) and acting as drivers of cancer (Supek et al. 2014; Sharma et al. 2019). A number of comparative genomics studies have found evidence of weak purifying selection at synonymous sites (e.g., Zeng and Charlesworth 2009; Keightley and Halligan 2011; Lawrie et al. 2011), and a few have found evidence of synonymous mutations under strong purifying selection (Keightley and Halligan 2011; Lawrie et al. 2013). Although the evidence for strong positive selection in these comparative studies is lacking, the evidence for strong purifying selection suggests that under alternate environmental conditions or in a different genetic background, synonymous mutations could also have strong beneficial fitness effects.

A growing number of experimental studies now provide direct evidence that synonymous mutations can have strong beneficial fitness effects and drive adaptive evolution. In this review, we examine the evidence coming from two types of experimental studies. The first are those that directly quantify the fitness of a collection of single-step synonymous and nonsynonymous mutants, usually generated via site-directed mutagenesis. These studies characterize the distribution of fitness effects (DFE) of mutations that are likely to arise and then be subject to selection during adaptive evolution. We refer to these as “DFE studies.” The second are those that track the evolutionary dynamics of replicate populations cultured in controlled environments. These experimental evolution (EE) studies identify synonymous and nonsynonymous (and other) mutations that arise during evolution through whole genome sequencing and quantify the fitness effects of those mutations that drive adaptive evolution. We refer to these as “EE studies.” Both study types discussed here have been performed with populations of viruses, bacteria, or yeast and so, by necessity, our focus is on microbial evolution.

We begin this review with a comparison of the fitness effects of synonymous and nonsynonymous mutations identified in DFE studies.
We find that, despite conventional thinking that synonymous mutations are largely neutral, the range of beneficial fitness effects of synonymous and nonsynonymous mutations measured across these studies is often quite similar. We then summarize observations from EE studies where synonymous mutations have been shown to drive adaptive evolution. Although adaptive synonymous mutations have certainly been observed in EE studies, they appear at a lower frequency than we would expect based on the fitness effects reported in the DFE studies reviewed here. We propose several reasons for this mismatch and outline the evidence for specific molecular mechanisms driving the fitness effects of adaptive synonymous mutations. We finish by discussing some of the implications of incorrectly assuming synonymous mutations are neutral and offer suggestions for how experimental evolution studies can continue to contribute to our understanding of the role that synonymous mutations play in adaptive evolution.

Fitness Effects of Synonymous Mutations in DFE Studies

Theory suggests that when evolution is driven by rare beneficial mutations, a scenario termed the strong-selection weak-mutation regime, the identity of the next mutation to fix in an adapting population depends solely on the distribution of fitness effects of the arising mutations (Orr 2002; Joyce et al. 2008). If a population is large enough that mutations are no longer rare, the relative fitness effects of mutations are expected to be even more important in determining which mutation will be next to fix (Bailey et al. 2017). Thus, knowledge of the fitness effects of beneficial mutations available to selection, and how those fitness effects differ between synonymous and nonsynonymous mutations, allows us to estimate the expected contributions of synonymous mutations to adaptive evolution. It is important to note that although DFE studies measure the fitness effects of a collection of possible mutations, biases in mutation rates may significantly shift which of those possible mutations actually arise and are available for selection. Indeed, mutation bias has also been implicated as a potential driver of codon usage bias in a number of organisms (e.g., Eyre-Walker 1991). Thus, although not the focus of this review, we note that there is a potential for mutational biases to impact the contribution of synonymous mutations to adaptive evolution and those impacts are not quantified here. Mutagenesis approaches can be used to generate a collection of mutants and then direct experimental tests of growth rate and/or competitive fitness quantify the fitness effects of single-step mutants relative to a standard ancestor genotype. Many studies of this kind either focus on the effects of amino acid changes (i.e., nonsynonymous mutations; e.g., Bank et al. 2016) or do not confirm the molecular basis of the generated mutations (i.e., cannot distinguish synonymous and nonsynonymous mutations; e.g., Kassen and Bataillon 2006). 
A few, however, identify and report synonymous and nonsynonymous mutations separately, allowing for a comparison of their fitness effects. Table 1 summarizes those DFE studies that characterize the fitness effects of synonymous mutations and have made that fitness data publicly available for reanalysis.
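To make the link between a DFE and expected contributions to adaptation concrete, here is a minimal sketch under strong-selection weak-mutation assumptions. It assumes that every beneficial mutation arises at the same rate and fixes with probability proportional to its selection coefficient s, so the chance that a given mutation is the next to fix is its s divided by the sum of s over all beneficial mutations. The function names and the toy values are hypothetical and are not taken from any study in table 1.

```python
def next_fix_probabilities(selection_coefficients):
    """Probability that each mutation is the next to fix under SSWM.

    Assumes equal mutation rates and a fixation probability proportional
    to the selection coefficient s; deleterious mutations (s <= 0) never fix.
    """
    beneficial = [max(s, 0.0) for s in selection_coefficients]
    total = sum(beneficial)
    if total == 0:
        raise ValueError("no beneficial mutations available")
    return [s / total for s in beneficial]


def synonymous_share(mutations):
    """Expected fraction of next-fixation events that are synonymous.

    `mutations` is a list of (kind, s) pairs, where kind is "S" or "N".
    """
    probabilities = next_fix_probabilities([s for _, s in mutations])
    return sum(p for (kind, _), p in zip(mutations, probabilities) if kind == "S")


# Hypothetical toy values for illustration only.
toy_mutations = [("S", 0.02), ("S", 0.01), ("N", 0.05), ("N", 0.02), ("N", -0.10)]
print(round(synonymous_share(toy_mutations), 3))
```

As the text notes, mutation bias would change these weights; the sketch deliberately leaves it out.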

Table 1

Characteristics of DFEs from Studies with Published Fitness Data for Synonymous Mutations

Kingdom	Study	Label	Organism	Fitness Assay Environment	WG/Focal Gene	Mutation Type	No. of Mutations Tested (All)	No. of Mutations Tested (ω > 1)	Range in Relative Fitness (min–max)	P Value for K.S. Test of dist(S)≠dist(NS) (All)	P Value for K.S. Test of dist(S)≠dist(NS) (ω > 1)	GPD Domain for Ben. Mutations^a
Viruses	Carrasco et al. (2007)	A	Tobacco etch potyvirus (+ssRNA)	Nicotiana tabacum host	WG	S	11	3	0–3.133	0.0245	0.0163	Gumbel
	Carrasco et al. (2007)	A	Tobacco etch potyvirus (+ssRNA)	Nicotiana tabacum host	WG	N	55	5	0–2.393	0.0245	0.0163	Gumbel
	Domingo-Calap et al. (2009); Cuevas et al. (2012)	B	Qβ bacteriophage (+ssRNA)	Escherichia coli host	WG	S	36	12	0–1.03	0.0001	0.0001	Weibull
		B	Qβ bacteriophage (+ssRNA)	Escherichia coli host	WG	N	32	3	0–1.035	0.0001	0.0001	Weibull
		C	ΦX174 bacteriophage (ssDNA)	E. coli host	WG	S	38	21	0.809–1.068	0.0001	>0.05	Gumbel
		C	ΦX174 bacteriophage (ssDNA)	E. coli host	WG	N	32	7	0–1.061	0.0001	>0.05	Gumbel
	Peris et al. (2010)	D	f1 bacteriophage (ssDNA)	E. coli host	WG	S	30	15	0.886–1.055	0.0001	>0.05	Weibull
	Peris et al. (2010)	D	f1 bacteriophage (ssDNA)	E. coli host	WG	N	60	7	0–1.074	0.0001	>0.05	Weibull
	Sanjuán et al. (2004)	E	Vesicular stomatitis virus (−ssRNA)	Baby hamster kidney host cells (BHK21)	WG	S	9	3	0–1.055	0.0064	>0.05	Gumbel
	Sanjuán et al. (2004)	E	Vesicular stomatitis virus (−ssRNA)	Baby hamster kidney host cells (BHK21)	WG	N	80	13	0–1.16	0.0064	>0.05	Gumbel
	Wu et al. (2014)	F	influenza A H1N1 (−ssRNA)	Human lung carcinoma host cells (A549)	HA	S	532	158	0.011–11.249	0.0001	0.0235	Gumbel
	Wu et al. (2014)	F	influenza A H1N1 (−ssRNA)	Human lung carcinoma host cells (A549)	HA	N	2478	242	0–20.216	0.0001	0.0235	Gumbel
Bacteria	Firnberg et al. (2014)	G	E. coli	LB-agar+ampicillin	TEM-1 β-lactamase	S	611	368	0.003–1.640	0.0001	>0.05	Weibull
	Firnberg et al. (2014)	G	E. coli	LB-agar+ampicillin	TEM-1 β-lactamase	N	1925	434	0.001–1.670	0.0001	>0.05	Weibull
	Lebeuf-Taylor et al. (2019)	H	Pseudomonas fluorescens	M9 + glucose media	gtsB	S	39	27	0.925–1.080	0.0001	>0.05	Weibull
	Lebeuf-Taylor et al. (2019)	H	Pseudomonas fluorescens	M9 + glucose media	gtsB	N	71	29	0.622–1.0514	0.0001	>0.05	Weibull
	Lind et al. (2010)	I	Salmonella typhimurium	LB	rplA	S	17	2	0.738–1.005	>0.05	—^b	—^b
		I		LB	rplA	N	39	2	0.754–1.004	>0.05	—^b	—^b
		J		LB	rpsT	S	21	0	0.606–0.989	>0.05	—^b	—^b
		J		LB	rpsT	N	49	0	0.72–1	>0.05	—^b	—^b
	Schenk et al. (2012)	K	E. coli	LB agar+cefotaxime	TEM-1 β-lactamase	S	—	10	1.1–2.3^c	—	0.017	S: Weibull
	Schenk et al. (2012)	K	E. coli	LB agar+cefotaxime	TEM-1 β-lactamase	N	—	38	1.1–27^c	—	0.017	N: Fréchet
Fungi	Hietpas et al. (2011)	L	Saccharomyces cerevisiae	Dextrose media+G418 + ampicillin	9-AA region of Hsp90	S	15	14	0.937–1.014	0.0001	0.015	S: Weibull
	Hietpas et al. (2011)	L	Saccharomyces cerevisiae	Dextrose media+G418 + ampicillin	9-AA region of Hsp90	N	545	41	0–1.042	0.0001	0.015	N: Gumbel

Note.—Studies are grouped by kingdom and then ordered alphabetically by author. WG, whole genome; S, synonymous mutations; NS, nonsynonymous mutations; GPD, Generalized Pareto Distribution; Gumbel domain describes exponential distributions, Weibull domain describes right-truncated distributions, Fréchet domain describes heavy-tailed distributions. Data from Domingo-Calap et al. (2009) and Cuevas et al. (2012) are pooled together (as in Cuevas et al. [2012]), and Schenk et al. (2012) focused on only beneficial mutations. Fitness values reported are relative to an ancestor, thus neutral mutations have a fitness of 1. ω > 1 indicates mutations with a relative fitness greater than one.

^a Only reported separately for S and N mutations when they fit two different GPD domains.

^b Too few mutations (df≤2) to be confident in the model fit.

^c Not strictly a fitness measure, fold increase in minimum inhibitory concentration (MIC) relative to the ancestor.

Some caveats to interpreting this data are warranted. First, a range of approaches are used to quantify fitness so the relevant comparison to make is between synonymous and nonsynonymous mutations within a study. For example, some studies reported here make use of techniques for expressing mutations in a nonnative context, as when SNPs are carried on a plasmid (for example, Firnberg et al. 2014), which raises the possibility that these systems are under stronger selection for expression changes than might otherwise be the case. Caution must therefore be exercised in interpreting the magnitude of fitness estimates across studies. Second, measurement error can vary substantially from study to study, depending on the sensitivity of the particular experimental protocol used to quantify fitness and so is estimated and reported in a range of different ways (and sometimes not at all). 
Where possible, we show estimates of 95% confidence intervals based on reported experimental error estimates in the summary figures that follow. Third, the assignment of a nonneutral fitness effect to any mutation depends in part on population size, with mutations behaving as neutral when Ns < 1, where N is the effective population size and s is the selection coefficient or fitness effect of the new mutation. Since the experiments reported here vary in effective population size, typically from about 10⁴ to 10⁸ individuals, we can be confident that mutations detected here with fitness effects as low as 0.0001 could drive adaptation. Notably, this lower limit on fitness effects is much smaller than the fitness effects reported in the DFE studies summarized here, suggesting that effective population size is not limiting the adaptive impact of the bulk of nonneutral mutations arising in EE studies. Fourth, site-directed mutagenesis studies can sometimes lead to the inadvertent fixation of second-site mutations as part of the construction process. There is no a priori reason to suggest that such second-site mutations should bias our analysis, however, because, even if they do occur, their effects are likely to be random with respect to the comparison of synonymous and nonsynonymous mutations. Last, we have not included those DFE studies that have generated multiple simultaneous synonymous codon changes throughout a focal gene and then characterized the effects of these collective changes (Hense et al. 2010). Although these multisite mutation studies are valuable in confirming that synonymous mutations can collectively have strong positive and negative fitness effects, our focus is on studies examining single-nucleotide mutations as these are likely to be the mutations important for rapid adaptive evolution.
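The third caveat rests on simple arithmetic: a mutation behaves as neutral when the product of effective population size N and selection coefficient s is below 1, so the smallest effect visible to selection is roughly 1/N. The sketch below works through that threshold for the population sizes quoted in the text. The function names are hypothetical, and the use of the absolute value of s (so that the same rule covers deleterious mutations) is an assumption.

```python
def is_effectively_neutral(effective_population_size, selection_coefficient):
    """Return True when |N * s| < 1, i.e. drift dominates over selection."""
    return abs(effective_population_size * selection_coefficient) < 1


def smallest_selected_effect(effective_population_size):
    """Smallest |s| that selection can act on: the value where N * s = 1."""
    return 1.0 / effective_population_size


# Population sizes spanning the range quoted in the text.
for n in (10**4, 10**8):
    print(n, smallest_selected_effect(n))
```

For the smallest populations considered (N = 10⁴) the threshold is s = 0.0001, which matches the lower limit stated above; larger populations can select on even smaller effects.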

Comparing Synonymous and Nonsynonymous Fitness Effects

The types of organisms that have been used to characterize the DFE of synonymous mutations fall into three kingdoms: viruses, bacteria, and fungi (specifically, yeast). DFE studies with viruses tend to quantify the fitness effects of mutations across the whole genome (but see Wu et al. [2014] for an exception), whereas studies in bacteria and yeast focus on one or two specific genes (Lind et al. 2010) or even a small region within a gene (Hietpas et al. 2011). In half of the DFE studies shown in table 1, the lowest-fitness synonymous mutation tested has the same fitness as, or even lower fitness than, the lowest-fitness nonsynonymous mutation in the same study. In fact, in five out of 12 of the DFE studies, the most deleterious synonymous mutation tested had a fitness of 0 (Sanjuán et al. 2004; Carrasco et al. 2007; Cuevas et al. 2012) or very close to it (0.003, Firnberg et al. 2014; 0.011, Wu et al. 2014), meaning that synonymous mutations can certainly have lethal or nearly lethal effects. Lethal or nearly lethal fitness effects in synonymous mutations are most common in viruses (two thirds of the viral studies in table 1); however, Firnberg et al. (2014) also reported a nearly lethal synonymous mutation in Escherichia coli. Although the magnitude of deleterious fitness effects can be quite similar between synonymous and nonsynonymous mutations, the overall shapes of the DFEs for synonymous and nonsynonymous mutations vary substantially from one study to the next (see fig. 1). In many studies, the DFE for nonsynonymous mutations looks bimodal with one peak centered around a relative fitness of 1 (nearly neutral fitness effects) and the other peak closer to a relative fitness of 0 (nearly lethal fitness effects), whereas a bimodal pattern is not apparent in the DFEs for synonymous mutations.

Fig. 1.

Distribution of fitness effects of all mutations, with fitness relative to the ancestor (ω) along the x axis and counts of mutations along the y axis. Blue represents nonsynonymous mutations, red represents synonymous mutations. Bars indicate mutation count data. The vertical black solid lines at ω = 1 indicate the fitness of the ancestor and the dashed vertical lines on either side indicate an estimate of 95% CI around that estimate based on mean measurement error reported. Study K did not report measurement error for its fitness estimates and so no dashed line is plotted. Blue and red curves indicate smoothed density fits of the nonsynonymous and synonymous mutations, respectively, using the “density” function in R. Letter labels correspond to study letter labels in table 1. The shapes of the DFEs of synonymous and nonsynonymous mutations are significantly different in all panels except I and J (K–S test, P < 0.05). Note.—The x axis in panel K is not strictly a fitness measure, but instead fold-increase in minimum inhibitory concentration (MIC) relative to the ancestor.

When we focus on the DFEs for just those mutations with relative fitness greater than 1 (i.e., potentially beneficial, although not always significantly so), the shapes of the DFEs for synonymous and nonsynonymous mutations are more similar, both within and between studies (fig. 2). Kolmogorov–Smirnov tests reveal that half of the studies show no significant difference between the DFEs of synonymous and nonsynonymous mutations with putatively beneficial effects. We also use a DFE model fitting approach (Beisel et al. 2007; Rokyta et al. 2008) to fit each DFE of putatively beneficial mutations assuming a Generalized Pareto Distribution (GPD). Extreme value theory suggests that the DFE of beneficial mutations will be shaped like a GPD, regardless of the shape of the entire DFE, as long as the populations are well-adapted, that is, close to the fitness optimum (Beisel et al. 2007; Joyce et al. 2008). 
Depending on the value of the shape parameter, κ, the GPD ranges from a heavy-tailed distribution (κ > 0, Fréchet domain), to exponential (κ ≈ 0, Gumbel domain), to right-truncated (κ < 0, Weibull domain). The DFEs of putatively beneficial synonymous mutations tested here all fall into the Weibull or Gumbel domains (i.e., κ ≤ 0; see table 1); so, too, do the DFEs for nonsynonymous mutations in our analysis, with just one exception (Schenk et al. 2012). We suggest that it is reasonable to assume that populations in many microbial evolution experiments are indeed close to the fitness optimum. One piece of evidence supporting this is the very small proportion of beneficial mutations (of any type) detected in the populations analyzed. We acknowledge that populations in some of these studies may not be close to the fitness optimum, for example, in Schenk et al. (2012), where the presence of an antibiotic may knock populations quite far off their fitness optimum, and so the GPD may not always be appropriate from an evolutionary genetics perspective. However, since the GPD is quite general in shape, it is able to accommodate many different distribution shapes for comparison.
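The three GPD domains described above can be illustrated with a short sketch of the standard GPD density. The scale parameter `tau` is an assumption introduced here (the text only names the shape parameter κ), as is the convention that x is the fitness gain over the ancestor (ω − 1). The function names are hypothetical, and this is not the fitting procedure of Beisel et al. (2007), only the distribution it assumes.

```python
import math


def gpd_domain(kappa, tolerance=1e-9):
    """Name the extreme-value domain implied by the GPD shape parameter."""
    if kappa > tolerance:
        return "Fréchet"   # heavy-tailed
    if kappa < -tolerance:
        return "Weibull"   # right-truncated at x = -tau / kappa
    return "Gumbel"        # exponential


def gpd_pdf(x, kappa, tau=1.0):
    """Density of a GPD with shape kappa and scale tau, for x >= 0."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    if x < 0:
        return 0.0
    if abs(kappa) < 1e-12:
        # kappa = 0 reduces to the exponential distribution.
        return math.exp(-x / tau) / tau
    base = 1.0 + kappa * x / tau
    if base <= 0:
        # Only reachable when kappa < 0: x lies at or past the truncation point.
        return 0.0
    return base ** (-1.0 / kappa - 1.0) / tau


for kappa in (0.5, 0.0, -0.5):
    print(gpd_domain(kappa), gpd_pdf(1.0, kappa), gpd_pdf(3.0, kappa))
```

With κ = −0.5 and τ = 1 the density is zero beyond x = 2, which is what "right-truncated" means for the Weibull domain: there is a hard upper limit on the size of beneficial effects.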

Fig. 2.

Distribution of fitness effects of mutations with relative fitness greater than 1 (ω > 1). Relative fitness is shown along the x axis and counts of mutations along the y axis. Blue represents nonsynonymous mutations, red represents synonymous mutations. Bars indicate mutation count data. The vertical black solid lines at ω = 1 indicate the fitness of the ancestor and the dashed vertical lines indicate the estimated 95% CI based on mean measurement error reported. Study K did not report measurement error for its fitness estimates and so no dashed line is plotted. Blue and red curves indicate smoothed density fits of the nonsynonymous and synonymous mutations, respectively, using the “density” function in R. Letters correspond to studies summarized in table 1. Panel J is blank because this study did not observe beneficial mutations. The shapes of the DFEs of synonymous and nonsynonymous mutations are significantly different in panels A, B, F, K, and L (K–S test, P < 0.05). Note.—The x axis in panel K is not strictly a fitness measure, but instead fold-increase in minimum inhibitory concentration (MIC) relative to the ancestor.

Although the shapes of the DFEs among putatively beneficial synonymous and nonsynonymous mutations can be similar, there is certainly variation. Some studies show clear differences between the DFEs of these nominally beneficial synonymous and nonsynonymous mutations (e.g., Carrasco et al. 2007; Cuevas et al. 2012; panels A and B, respectively, in fig. 2), whereas other studies find these distributions are completely indistinguishable (e.g., Firnberg et al. 2014; Lebeuf-Taylor et al. 2019; panels G and H, respectively, in fig. 2). Studies that find no significantly beneficial synonymous mutations also find no significantly beneficial nonsynonymous mutations, suggesting again that synonymous and nonsynonymous mutations are not so different in their effects. The one exception is Peris et al. (2010), where two significantly beneficial nonsynonymous mutations are detected, but no significantly beneficial synonymous mutations.
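The comparisons in this section rely on two-sample Kolmogorov–Smirnov tests. As a minimal sketch of what such a test measures, the function below computes the K–S statistic D, the largest vertical gap between the empirical cumulative distributions of two samples. It does not compute a P value (in practice that would come from a statistics package), and the function name and toy values are hypothetical, not data from table 1.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: the maximum gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    i = j = 0
    largest_gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both empirical CDFs past every observation equal to x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        largest_gap = max(largest_gap, abs(i / len(a) - j / len(b)))
    return largest_gap


# Hypothetical relative fitness values for illustration only.
synonymous = [1.01, 1.02, 1.03, 1.05]
nonsynonymous = [1.01, 1.04, 1.06, 1.08]
print(ks_statistic(synonymous, nonsynonymous))
```

D is 0 when the two samples have identical empirical distributions and 1 when they do not overlap at all; the test asks whether the observed D is larger than expected by chance for the given sample sizes.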

Drivers of Similarities and Differences in DFEs of Synonymous and Nonsynonymous Mutations

What drives the similarities and differences in the DFEs between synonymous and nonsynonymous mutations from one study to the next? We see that, in general, synonymous mutations have larger deleterious effects on fitness in RNA viruses than in DNA viruses (as pointed out by Cuevas et al. 2012); however, on the beneficial side there is no clear difference. In fact, the DFE of beneficial mutations differs significantly between synonymous and nonsynonymous mutations in some RNA virus studies (e.g., Carrasco et al. 2007; Domingo-Calap et al. 2009; Cuevas et al. 2012; Wu et al. 2014) but not all (Sanjuán et al. 2004). In DNA virus studies, the synonymous and nonsynonymous DFEs of beneficial mutations did not differ in any of the DFE studies explored here (Domingo-Calap et al. 2009; Peris et al. 2010; Cuevas et al. 2012). In bacteria and yeast, on the other hand, there is a tendency for synonymous mutations to have smaller deleterious effects compared with those of nonsynonymous mutations, with at least one notable exception (see Firnberg et al. [2014]).

Perhaps it is differences in strength of selection and/or distance from the fitness optimum that drive variation in the DFEs of beneficial synonymous and nonsynonymous mutations. Quantifying selection and distance to a fitness optimum is difficult in practice; however, we can start by making some inferences about the strength of selection based on knowledge of the specifics of the fitness assay environments used in each study. We identify what we suggest are DFE studies where selection is likely to have been particularly strong: 1) Carrasco et al. (2007), where the natural plant host likely imposes strong selection on the Tobacco etch virus, 2) Wu et al. (2014), where the influenza A hemagglutinin gene is under strong selection in human host cells, 3) Schenk et al. (2012), where E. coli is under strong selection for resistance to the antibiotic cefotaxime, and 4) Hietpas et al. (2011), where the gene region being explored is under strong selection, as evidenced by strong sequence conservation in this region across eukaryotes. Interestingly, all four of these studies show significant differences between the DFEs of beneficial synonymous and nonsynonymous mutations. Of the other studies in table 1, however, all but one had DFEs of beneficial synonymous and nonsynonymous mutations that were statistically indistinguishable. This suggests that synonymous mutations may play a greater role in adaptive evolution when selection is weak. Thus, the general characteristics of these DFEs of synonymous and nonsynonymous mutations suggest that we might expect the dynamics of purifying selection to differ significantly between synonymous and nonsynonymous sites; in terms of positive selection, however, the potential for differences between synonymous and nonsynonymous substitution dynamics is less clear.

Fitness Effects across Gene Regions

To further explore potential differences in the fitness effects of synonymous and nonsynonymous mutations, we used publicly available fitness data from the same studies reported in table 1 to look for differences in the magnitude of fitness effects of mutations (the magnitude of the fitness data plotted in fig. 1) across different positions within a gene. Thus, collectively for all studies that reported mutation location, we used a general linear model to test for a significant effect of mutation position within the gene on the magnitude of fitness effects (glm; R Core Team 2020). We also included a main effect of mutation type (synonymous vs. nonsynonymous), interaction between mutation type and mutation position, and a main effect of study as a covariate (P < 0.0001; studies varied significantly in the mean fitness effects reported). We restricted this analysis to DFE studies that tested mutations located in the first 500 nucleotide positions (all studies summarized in table 1 except Hietpas et al. 2011) as previous work suggests that it is, at most, the first few hundred nucleotides of the gene where mechanisms specific to the start of the genes are important (Tuller et al. 2011). We find that mutation position has a significant negative effect on the magnitude of fitness effects of synonymous mutations (P = 0.0053; supplementary fig. S1, Supplementary Material online), whereas the fitness effects of nonsynonymous mutations do not vary significantly with mutation location (P = 0.9170; supplementary fig. S2, Supplementary Material online). This linear model also confirms that the mean fitness effect of synonymous mutations is significantly smaller than the mean fitness effect of nonsynonymous mutations in this data set (P < 0.0001). These results suggest that molecular mechanisms driving fitness effects that are specific to the start of the gene, such as transcription or translation initiation, seem to have an impact across organisms and genes. 
However, it is important to emphasize that while significant, this relationship is weak. Nucleotide position explains only 12.5% of the overall variation in fitness and is not always significant if we focus on a single study, indicating there must be other important mechanisms at play here that are not specific to the start region of a gene.
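The structure of the linear model described in this section (position, mutation type, their interaction, and study as a covariate) can be sketched as an ordinary least-squares fit. The authors fitted their model with glm in R; the pure-Python stand-in below is only an illustration. It assumes that "magnitude of fitness effect" means the absolute deviation from ancestor fitness, which the text does not spell out, and the function names, coefficients, and noise-free toy data are all hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_least_squares(design, response):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, response)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def design_row(position, is_synonymous, study, studies):
    """Intercept, position, type, position:type, then study dummy variables."""
    syn = 1.0 if is_synonymous else 0.0
    dummies = [1.0 if study == s else 0.0 for s in studies[1:]]  # first study is baseline
    return [1.0, float(position), syn, position * syn] + dummies


# Hypothetical coefficients used to generate noise-free toy data:
# intercept, position, synonymous, position:synonymous, study B.
true_coefficients = [0.30, 0.0, -0.10, -0.0002, 0.05]
studies = ["A", "B"]
design, response = [], []
for study in studies:
    for is_syn in (False, True):
        for position in (10, 100, 250, 400):
            row = design_row(position, is_syn, study, studies)
            design.append(row)
            response.append(sum(c * x for c, x in zip(true_coefficients, row)))

coefficients = fit_least_squares(design, response)
print([round(c, 4) for c in coefficients])
```

In this parameterization the position coefficient is the slope for nonsynonymous mutations, and the slope for synonymous mutations is that coefficient plus the interaction term, which is how a position effect can be present for one mutation type and absent for the other.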

Synonymous Mutations Driving Adaptive Evolution in EE Studies

Experimental evolution (EE) is an approach used to test factors driving adaptive evolution, including the role of synonymous mutations. By tracking adaptive evolution in replicate populations in controlled environments and sequencing the genomes of the evolved populations, one can identify specific mutations that arise driving adaptation, directly test fitness changes, and even potentially identify the specific molecular mechanisms and phenotypic changes responsible for adaptation. Microbial evolution experiments are often conducted using large populations where evolution is driven by strong selection, and tends to result in a handful of beneficial mutations arising and fixing over a few hundred generations (Bailey et al. 2015). Evolutionary dynamics in these types of experiments have often been assumed to fall into a strong-selection weak-mutation (SSWM) regime where the mutation supply rate is low enough that a single beneficial mutation arises and fixes before the next one comes along. However, more recent deep sequencing and repeated-time sequencing of populations in EE studies suggests that many populations in EE studies are actually in a “clonal interference” or even “multiple mutation” regime. In a clonal interference regime the mutation supply is large enough that independently arising beneficial mutations occur on different clones simultaneously and compete for fixation. With an even larger mutation supply rate, multiple mutations may arise on the same clone before it has swept to fixation. For the most part, EE studies report that the vast majority of mutations that fix are nonsynonymous. In cases where synonymous mutations are observed, they are often assumed to be neutral mutations either drifting in frequency or hitchhiking along with a beneficial nonsynonymous mutation (Lang et al. 2013). However, there are now a number of EE studies reporting synonymous mutations with clear beneficial fitness effects (Bailey et al. 2014; Kristofich et al. 2018).

Experimental Evolution Approaches with Different Starting Points

We group EE studies exploring the role of adaptive synonymous mutations into two general types, which we summarize in table 2. The first type comprises studies that simply tracked the evolution of a number of replicate populations and then happened to observe de novo synonymous mutations rising to appreciable frequency in one or more of those evolved populations. Then, by looking back at the mutation frequency dynamics or through direct tests of reconstructed mutants using growth or competitive fitness assays, the authors identified the evolved synonymous mutations as having significant beneficial fitness effects. In this type of study, the authors did not initially set out to explore adaptive synonymous mutations per se but, when adaptive synonymous mutations evolved in their populations, they chose to report on them.

Table 2

Experimental Evolution Studies that Report Clear Evidence for Beneficial Fitness Effects of Synonymous Mutations Arising De Novo

	Study	Organism	Experimental Conditions	Mutation Type	No. Mutations Observed	Range of Fitness Effects (min–max)	Evidence for Fitness Effects	Molecular Mechanisms Proposed/Tested
Viruses	Bull et al. (1997)	ΦX174 bacteriophage (ssDNA)	Escherichia coli C and S. typhimurium LT2 hosts	S	11	↑	Parallel evolution	No mechanisms proposed.
	Bull et al. (1997)	ΦX174 bacteriophage (ssDNA)	Escherichia coli C and S. typhimurium LT2 hosts	N	63	↑	Parallel evolution	No mechanisms proposed.
	Bull et al. (1998)	Bacteriophage SP (+ssRNA)	E. coli K-12 host with plasmid expressing directed antisense RNA	S	4	↑	Rapid adaptation; Parallel evolution	Proposed RNA genome secondary structure.
	Bull et al. (1998)	Bacteriophage SP (+ssRNA)		N	2	↑	Rapid adaptation; Parallel evolution	Proposed RNA genome secondary structure.
	Holder and Bull (2001)	Bacteriophage G4 (ssDNA)	E. coli C	S	2	↑	Rapid adaptation	No mechanisms proposed.
	Holder and Bull (2001)	Bacteriophage G4 (ssDNA)	E. coli C	N	14	↑	Rapid adaptation	No mechanisms proposed.
	Novella et al. (2004)	Vesicular stomatitis virus (−ssRNA)	Baby hamster kidney (BHK-21) and sand fly (LL-5) host cells	S	5	↑	Parallel evolution	Unknown. Tested change in codon usage and RNA structure/stability.
	Novella et al. (2004)	Vesicular stomatitis virus (−ssRNA)	Baby hamster kidney (BHK-21) and sand fly (LL-5) host cells	N	16	↑	Parallel evolution
	Bull et al. (2012)	Bacteriophage T7 (dsDNA)	Ancestral genotypes had suboptimal codons in 10A gene; E. coli K-12 host	S	5	↑	Rapid adaptation; Parallel evolution	Not mRNA structure. Possibly codon usage.
	Bull et al. (2012)	Bacteriophage T7 (dsDNA)		N	12	↑	Rapid adaptation; Parallel evolution	Not mRNA structure. Possibly codon usage.
	Foll et al. (2014)	Influenza A H1N1 (−ssRNA)	Madin–Darby canine kidney (MDCK) host cells	S	7	0.05–0.29	Allele frequency dynamics	No mechanisms proposed.
	Foll et al. (2014)	Influenza A H1N1 (−ssRNA)	Madin–Darby canine kidney (MDCK) host cells	N	10	0.06–0.22	Allele frequency dynamics	No mechanisms proposed.
	Kashiwagi et al. (2014)	Bacteriophage Qβ (+ssRNA)	E. coli 43BF′ host	S	14	↑	Rapid adaptation; Amplification rate of reconstructed mutants	Proposed RNA genome secondary structure.
	Kashiwagi et al. (2014)	Bacteriophage Qβ (+ssRNA)	E. coli 43BF′ host	N	17	↑		Proposed RNA genome secondary structure.
Bacteria	Bailey et al. (2014, 2015)	Pseudomonas fluorescens	M9 + either mannose, glucose, or xylose	S	2	0.07–0.09	Competitive fitness of reconstructed mutants	Increased expression. No consistent change in optimal codon usage or mRNA structure/stability.
	Bailey et al. (2014, 2015)	Pseudomonas fluorescens	M9 + either mannose, glucose, or xylose	N	50	↑^b	Competitive fitness of reconstructed mutants
	Agashe et al. (2016)	Methylobacterium extorquens AM1	Four ancestral genotypes, each with different codon usage in fae gene; methylamine-limited media	S	4	0.11–0.19	Parallel evolution; Growth rate of reconstructed mutants	Increased expression. Not change in codon usage or RNA structure/stability. Not anti-SD affinity. Proposed change in transcription binding site.
	Agashe et al. (2016)	Methylobacterium extorquens AM1		N	5	0.04–0.17	Parallel evolution; Growth rate of reconstructed mutants
	Kershner et al. (2016)	E. coli	Ancestral genotype has deletion (ΔargC) that severely limits growth and modified proA gene; M9 + glucose media	S	1	5.1 fold	Growth rate of reconstructed mutants	Strengthens an inefficient promoter for downstream gene, proB.
	Kershner et al. (2016)	E. coli		N	2	3.2–4.7 fold	Growth rate of reconstructed mutants
	Knöppel et al. (2016)	Salmonella enterica	Ancestral genotypes each had a single deleterious S mutation	S	18	0.08–0.73	Growth rate of reconstructed mutants	Changes in predicted mRNA structure/stability.
	Knöppel et al. (2016)	Salmonella enterica		N	31	0.05–0.72	Growth rate of reconstructed mutants	Changes in predicted mRNA structure/stability.
	Kristofich et al. (2018)	S. enterica	Ancestral genotype has deletion (ΔargC) that severely limits growth; M9 + glucose media	S	2	0.41–0.67	Growth rate of reconstructed mutants	Changes in predicted mRNA structure/stability and translation efficiency.
	Kristofich et al. (2018)	S. enterica		N	1	0.32	Growth rate of reconstructed mutants
Fungi	McDonald et al. (2016)	Saccharomyces cerevisiae	Fitness measures are from one asexual population and one sexual population; YPD media	S	2^a	−0.01 to 0.018	Allele frequency dynamics; Competitive fitness of reconstructed mutants	No mechanisms proposed.
	McDonald et al. (2016)	Saccharomyces cerevisiae		N	13^a	−0.01 to 0.08		No mechanisms proposed.

Note.—Studies are grouped by kingdom and then ordered by publication date. S, synonymous mutations; N, nonsynonymous mutations; ↑, indirect evidence of increased fitness driven in part by synonymous mutations.

^a We only report those mutations with fitness significantly different from the ancestor.

^b Nonsynonymous mutants were not reconstructed and tested.

The second type of EE studies we discuss here are ones designed with the explicit goal of trying to understand the role of synonymous mutations right from the beginning. In these studies, authors set up specific conditions that may be particularly favorable for driving adaptive synonymous substitutions by initiating replicate evolving populations with strains that have been genetically modified such that synonymous mutations are added to a targeted region of the genome, thus changing codon bias. These genetically modified starting strains have lower fitness values compared with the original ancestor and so as evolution proceeds, these fitness losses are compensated for. Sometimes the resulting adaptive evolution is driven by de novo synonymous mutations but nonsynonymous mutations, mutations in noncoding regions, and even gene duplications also frequently play a role. These EE studies are also included in table 2. We now outline key results from both of these types of EE studies exploring adaptive synonymous mutations.

Indirect Evidence of Adaptive Synonymous Mutations in EEs

The earliest evidence for synonymous mutations contributing to rapid adaptive evolution in EE studies came from work with viruses that showed repeated evolution at the same synonymous sites across replicates (Bull et al. 1997, 1998) and fixation of synonymous mutations in the absence of any nonsynonymous mutations that they might have hitchhiked with (Holder and Bull 2001). The observation of repeated, or parallel, evolution is often taken to be evidence of strong selection on these sites because it is unlikely to happen by chance alone. However, no direct tests of fitness were performed and, beyond noting the possibility that these synonymous mutations could be beneficial, the authors did not discuss or explore them any further. To the best of our knowledge, the first EE study to observe and then explicitly focus analysis and discussion on the contribution of synonymous mutations to rapid adaptive evolution is Novella et al. (2004). Populations of Vesicular Stomatitis Virus (VSV) cultured in three different selection environments for 80 passages increased in fitness between 1.5 and 150 times above that of the ancestral strain and accumulated between 2 and 21 mutations. Repeated evolution occurred frequently, at 21 of the 77 sites with mutations, and five of those repeated mutations were synonymous. To rule out the possibility of mutational hotspots driving the repeated evolution, the authors performed a mutation accumulation experiment, allowing mutations to accumulate under conditions where the strength of selection is greatly reduced. There was no overlap between the mutations arising in the mutation accumulation experiment and those arising in the evolution experiment, suggesting that the observed repeated synonymous evolution was indeed driven by selection. Kashiwagi et al. (2014) also found indirect evidence of synonymous mutations driving adaptive evolution, this time using Qβ bacteriophage selected at increasingly higher temperature in an E. coli host.
After 62 days of serial passages, fitness had increased in all replicate evolved populations and mutations were observed at 31 unique sites, with repeated evolution occurring at ten of those sites. A constructed genotype combining four of the evolved synonymous mutations and one evolved intergenic mutation was shown to have an amplification ratio (a proxy for fitness) an order of magnitude higher than the ancestor. This confirmed that some of the evolved “silent” mutations must have beneficial fitness effects, but the fitness effects of individual synonymous mutations were not directly tested. Foll et al. (2014) found evidence of beneficial synonymous mutations in an EE with influenza A (H1N1) evolved in the presence or absence of increasing concentrations of oseltamivir, an antiviral drug. The authors estimated the fitness of each newly arising mutation by applying an Approximate Bayesian Computation (ABC) approach to temporally repeated allele frequency data. These fitness estimates identified 17 beneficial mutations, seven of which are synonymous. Estimates from this study suggest that fitness effects of the evolved synonymous mutations ranged from 5% to 30%. Thus, although the fitness of individual synonymous mutations was not directly quantified in these EE studies, observations of repeated evolution and allele dynamics suggest that strong positive selection has driven adaptive evolution at synonymous sites. It is important to note that these early studies reporting synonymous mutations driving adaptive evolution were all conducted with viruses and so there was still no clear evidence to suggest adaptive synonymous mutations were anything more than a peculiarity of compact viral genomes, perhaps arising from overlapping reading frames where a mutation could be synonymous in one reading frame and nonsynonymous in another.
However, as sequencing prices decreased rapidly, and whole genome sequencing in EE studies with bacteria and yeast became more commonplace, evidence for adaptive synonymous mutations beyond viruses began to emerge.
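To make the idea of inferring fitness from allele frequency dynamics concrete, the following minimal sketch estimates a selection coefficient from a frequency time series. It is not the ABC method of Foll et al. (2014); it assumes a deterministic haploid sweep with no drift or sampling noise, under which the log-odds of the allele frequency increase linearly with time at rate s. The function names and the example trajectory are hypothetical.

```python
import math


def logit(p):
    """Log-odds of an allele frequency p (0 < p < 1)."""
    return math.log(p / (1.0 - p))


def estimate_selection_coefficient(generations, frequencies):
    """Least-squares slope of logit(frequency) against time.

    Assumption: deterministic haploid selection, so that
    logit(p_t) = logit(p_0) + s * t, with s per generation.
    """
    ys = [logit(p) for p in frequencies]
    n = len(generations)
    mean_t = sum(generations) / n
    mean_y = sum(ys) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in zip(generations, ys))
    var = sum((t - mean_t) ** 2 for t in generations)
    return cov / var


def simulate_trajectory(p0, s, generations):
    """Noise-free logistic trajectory used here only as toy input data."""
    l0 = logit(p0)
    return [1.0 / (1.0 + math.exp(-(l0 + s * t))) for t in generations]


# Hypothetical example: an allele starting at 1% with s = 0.1 per generation.
times = [0, 10, 20, 30, 40]
trajectory = simulate_trajectory(0.01, 0.1, times)
print(estimate_selection_coefficient(times, trajectory))
```

With real time-series data, drift and sequencing noise would need to be modeled explicitly, which is what likelihood-free approaches such as ABC are designed to handle.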

Direct Evidence of Adaptive Synonymous Mutations in EEs

The first EE study reporting direct evidence of beneficial synonymous mutations driving rapid adaptive evolution is Bailey et al. (2014). The authors followed a Pseudomonas fluorescens population adapting to glucose-limited media and observed two de novo synonymous mutations that arose independently in the same gene, gtsB (a subunit of a putative glucose transporter). Fitness of the evolved synonymous mutations was tested by directly competing constructed mutants against the ancestor and the synonymous mutants showed fitness advantages of 7% and 9%. Beneficial synonymous mutations were also clearly identified in a bacterial EE study using a strain of Salmonella enterica in which the gene for an essential enzyme, argC, was replaced by a gene for the promiscuous enzyme, proA (Kristofich et al. 2018). After 260 generations of evolution, two synonymous mutations arose in the proA gene and direct tests of those mutations showed an increased growth rate, equivalent to a 41% and 67% fitness advantage over the ancestor. Although we are not aware of any eukaryotic EE studies that explicitly discuss the effects of synonymous mutations on adaptive evolution, there certainly are EE studies where synonymous mutations have been observed. McDonald et al. (2016) identified a collection of both synonymous and nonsynonymous mutations evolved in replicate populations of Saccharomyces cerevisiae grown under either clonal or sexual culture regimes. The authors then tested the fitness of constructed genotypes for a subset of these mutations and identified a significant beneficial fitness effect in one synonymous mutation (2%). However, the fitness effects of synonymous mutations were not a focus of this study and so were not discussed further.
Thus, EE studies that observed adaptive synonymous mutations and then estimated their fitness effects have found that the adaptive synonymous mutations had fitness effects similar to those of adaptive nonsynonymous mutations observed in the same study. On the other hand, the fraction of observed adaptive mutations that are synonymous in these studies is still often quite low (e.g., 4% in Bailey et al. 2014) but variable: Kristofich et al. (2018) found that ⅓ of evolved adaptive mutations in coding regions were synonymous.
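As an illustration of how a competitive fitness advantage such as 7% can be computed, the sketch below uses the ratio of realized Malthusian parameters, a common estimator in microbial competition assays. Whether the studies above used exactly this estimator is an assumption, and the cell counts are hypothetical.

```python
import math


def relative_fitness(mutant_start, mutant_end, ancestor_start, ancestor_end):
    """Relative fitness as the ratio of realized Malthusian parameters.

    Each Malthusian parameter is ln(final count / initial count) for one
    competitor over the course of a head-to-head competition.
    """
    m_mutant = math.log(mutant_end / mutant_start)
    m_ancestor = math.log(ancestor_end / ancestor_start)
    return m_mutant / m_ancestor


# Hypothetical counts: the ancestor grows 100-fold during the assay, and the
# mutant grows 100**1.07-fold, i.e. 7% more doublings over the same period.
w = relative_fitness(1000, 1000 * 100 ** 1.07, 1000, 1000 * 100)
print(round(w, 3))
```

A value of w above 1 indicates a fitness advantage over the ancestor; w minus 1 corresponds to the percentage advantages quoted in the text.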

Synonymous Adaptations in Response to Suboptimal Codon Usage

EE studies have also been initiated using modified genotypes where codon usage has been experimentally manipulated within a target gene or genetic region. Here, the aim from the start is to explore the role of synonymous mutations in adaptive evolution. With codon bias shifted away from optimal in the starting genotypes, the replicate populations in these studies start with decreased fitness. As adaptive evolution proceeds, it will be driven by mutations that compensate for these codon bias-induced fitness costs. Some of these EE studies are framed within the context of understanding mechanisms driving the evolution of foreign genes that have been recently inserted into a genome. This gene insertion could happen through a natural process like horizontal gene transfer or via artificial insertion for the purpose of heterologous gene expression. Amorós-Moya et al. (2010) first reported an EE study using this approach to track adaptive evolution in replicate populations initiated with E. coli variants containing one of three versions of a chloramphenicol resistance gene (CAT) that varied in codon usage, and so in starting fitness. Adaptation compensated for the fitness cost of deviations from the native codon bias through substitutions in the promoter region of CAT; neither synonymous nor nonsynonymous mutations in coding regions were substituted. By contrast, Bull et al. (2012) tracked the adaptive evolution of replicate populations of T7 phage with a deoptimized version of the major capsid gene, 10A (only 10% preferred codons compared with 68% in the wildtype) and found both synonymous and nonsynonymous mutations contributed to fitness recovery.
Although the fitness of these synonymous mutations was not directly tested, two pieces of evidence suggest that at least some of them contributed to adaptive evolution: 1) three of the five evolved synonymous mutations were direct reversions back to the wildtype version, and 2) one synonymous mutation in particular evolved independently in all three populations. Agashe et al. (2016) initiated replicate populations with seven distinct constructed variants of Methylobacterium extorquens, each with a different suboptimal synonymous version of the fae gene (encoding the formaldehyde activating enzyme). The starting genotypes varied in number and position of rare codons, resulting in 46–150 synonymous mutations compared with the wildtype. Replicate populations were evolved in growth media containing methylamine, in which the fae enzyme is essential for growth. Initially, the starting variants all had reduced fitness relative to the wild type; however, after ∼60 to 250 generations, the populations had all evolved increased fitness. The evolved mutations were synonymous (4), nonsynonymous (5), and upstream of the coding region of the gene (7). The authors directly tested the effects of each evolved mutation on growth rate, gene expression, and enzyme activity, and found the effects of the synonymous and nonsynonymous mutations in this experiment did not differ significantly. Importantly, whereas induced codon bias generates the initial fitness decreases in the replicate populations of these EE studies, the adaptive response is the evolution of synonymous mutations with large fitness effects that are independent of their effect on codon bias, compensating for that initial fitness loss. An EE study by Knöppel et al.
(2016) following the evolution of replicate populations of S. enterica fits with this group of studies in that the evolving populations were initiated with one of four different generated genotypes created with an explicit aim to explore the role of synonymous mutations in adaptive evolution. However, instead of starting with generated variants containing many synonymous mutations, in this study each starting variant contained just a single synonymous mutation in the rpsT gene (encoding ribosomal protein S20). The synonymous mutations used were ones previously identified as having deleterious effects (Lind et al. 2010; Lind and Andersson 2013). When allowed to evolve, the replicate populations adapted through upregulation of the rpsT gene, driven by mutations in rpsT, namely gene duplications (n = 4), synonymous mutations (n = 18), and nonsynonymous mutations (n = 31), as well as a nonsynonymous mutation in a different gene (rpoD). Direct tests of the evolved synonymous mutations showed they increased fitness by 8% to 73%, and the evolved nonsynonymous mutations had an almost identical range of fitness effects: 5% to 72%.

Mechanisms Driving the Fitness Effects of Synonymous Mutations in EE Studies

Synonymous mutations do not change the amino acid sequence of a protein, and so their fitness effects must arise through changes to the processes of transcription or translation that govern gene expression. A range of mechanisms can impact either one or both of these processes, from the binding of RNA polymerases that initiate transcription through to changes in the rate and efficiency of translation, so it is perhaps not surprising that no single mechanism has emerged as the explanation for how synonymous mutations impact fitness. Here, we review what experimental evolution and site-directed mutagenesis studies have taught us about the underlying mechanisms governing variation in fitness among synonymous mutations.

Codon Bias

The most commonly cited explanation for fitness effects of synonymous mutations involves codon bias. Many, if not most, organisms preferentially use certain synonymous codons over others. Fitness could increase if synonymous mutations result in new codons that are better aligned with more highly expressed genes elsewhere in the genome, presumably because the pool of available tRNAs is larger and so protein production is less costly. However, there is little evidence to support this claim from experimental studies. In fact, Agashe et al. (2013) showed that introducing multiple codons that are more rare or more common than those used in highly expressed genes both tend to decrease fitness, suggesting that codon usage may be under stabilizing selection (Fuller et al. 2014). Moreover, studies examining changes in codon bias due to single synonymous mutations rarely detect an effect on fitness (Lind et al. 2010; Lebeuf-Taylor et al. 2019), consistent with the idea that selection on codon usage, when it occurs, is weak. The available evidence therefore suggests that codon bias is unlikely to be the driver of adaptive synonymous substitutions, at least in the short term.
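A simple way to quantify codon bias in a gene is the fraction of its codons that belong to a set of preferred codons, the kind of measure behind statements such as "10% preferred codons". The sketch below is illustrative only: the preferred set and both sequences are hypothetical, and published studies typically use richer indices built from genome-wide codon usage.

```python
def split_codons(cds):
    """Split an in-frame coding sequence into a list of codons."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def fraction_preferred(cds, preferred):
    """Fraction of codons in cds that are members of the preferred set."""
    codons = split_codons(cds.upper())
    return sum(codon in preferred for codon in codons) / len(codons)


# Hypothetical preferred codons for Leu, Ala, and Lys (an assumption, not
# taken from any of the studies discussed here).
PREFERRED = {"CTG", "GCG", "AAA"}

original = "CTGGCGAAACTG"  # Leu-Ala-Lys-Leu using preferred codons
recoded = "CTAGCAAAGCTG"   # same peptide, three synonymous changes

print(fraction_preferred(original, PREFERRED))
print(fraction_preferred(recoded, PREFERRED))
```

Both sequences encode the same peptide, so any difference in this score reflects synonymous changes alone.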

Transcription Initiation

Better evidence is available on mechanisms impacting steps of the transcription–translation process. There is direct experimental evidence from three studies in bacteria that synonymous mutations can create or strengthen promoter sites for RNA polymerase, increasing transcription of downstream genes and so producing fitness gains (Ando et al. 2014; Kershner et al. 2016; Lebeuf-Taylor et al. 2019). This mechanism is the result of the operon structure of bacterial genomes where genes involved in the same cellular process, like small molecule or nutrient transport, are clustered together in the same location. Although it is often assumed that a single promoter controls expression of all the genes in the operon, reality is more complex: sequences that resemble, to varying degrees, the canonical promoter sequence can be found anywhere in the gene. Any mutation that increases the affinity of such a sequence for RNA polymerase can turn it into a functional promoter, resulting in the creation of noncanonical or illicit promoter sequences. The result, for a bacterial operon, can be increased transcription of downstream sequence. In other words, the fitness effect of a synonymous mutation is not necessarily via impacts on the gene in which it occurs but, rather, the result of changes to gene expression of colocated genes impacted by transcription. In this sense it is the genetic architecture of gene expression that governs the fitness effects of synonymous mutations.
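The following toy example shows how a single synonymous change inside a coding sequence can bring a stretch of DNA closer to a promoter element. It scores each six-base window by the number of matches to TATAAT, the consensus −10 element of E. coli σ70 promoters. The sequences are hypothetical, and counting matches is only a crude proxy: real promoter strength also depends on the −35 element, spacer length, and sequence context, and this is not the method used in the cited studies.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

MINUS_10_CONSENSUS = "TATAAT"


def translate(cds):
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def best_match(seq, motif=MINUS_10_CONSENSUS):
    """Highest number of positions matching the motif over all windows."""
    best = 0
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        best = max(best, sum(a == b for a, b in zip(window, motif)))
    return best


# Hypothetical fragment: ATC -> ATA leaves isoleucine unchanged.
ancestor = "ACTATCATT"
mutant = "ACTATAATT"

print(translate(ancestor), translate(mutant))
print(best_match(ancestor), best_match(mutant))
```

The protein sequence is identical in both versions, yet the mutant contains a window that matches the −10 consensus more closely than any window in the ancestor.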

Translation Mediated by mRNA Structure and Stability

Synonymous mutations could impact the rate and fidelity of translation by changing the secondary structure of mRNA transcripts. Reduced thermodynamic stability of mRNA can make the transcript more accessible to the ribosome during translation, leading to faster translation and higher fitness (Kudla et al. 2009). Others have suggested that more tightly wound, and so more stable, mRNA leads to higher fitness because the transcript persists longer due to slower degradation rates (Deutscher 2006). The available evidence for these mechanisms is mixed. Synonymous mutations with fitness effects recovered in positive strand RNA viruses, where translation occurs directly from the genome, are often suggested to result from changes to genomic secondary structure (Carrasco et al. 2007; Domingo-Calap et al. 2009; Cuevas et al. 2012; Kashiwagi et al. 2014); however, direct tests are lacking. There is somewhat more compelling support for a connection between reduced stability and increased gene expression when a large number of synonymous mutations are changed simultaneously (Kudla et al. 2009; Goodman et al. 2013), although whether this arises from changes in mRNA stability directly or the epistatic effects of changing multiple sites at once is not clear. In cases examining single synonymous substitutions, where epistasis is absent, predicted mRNA stability (usually estimated using an online tool such as mfold; Zuker 2003) typically explains little, and occasionally none, of the variance in fitness (Lind et al. 2010; Firnberg et al. 2014; Agashe et al. 2016; Lebeuf-Taylor et al. 2019). It is not clear whether this result reflects a genuine absence of signal or limitations imposed by the use of computational tools to predict mRNA stability, rather than direct measures. There are cases where certain synonymous mutations can have large effects on mRNA stability and fitness.
Lind and Andersson (2013), for example, showed that synonymous mutations that disrupt base pairing and so decrease stability of the mRNA structure of rpsT (encoding ribosomal protein S20 in Salmonella typhimurium) had larger negative impacts on fitness compared with a random set of mutants, consistent with the hypothesis that more stable mRNA results in higher protein expression. The opposite effect, decreased mRNA stability linked to higher fitness, has also been observed for a handful of synonymous mutations. Kristofich et al. (2018) studied proA of S. enterica, which in their experiment can increase growth rate by synthesizing both proline (its native function) and arginine (a nonnative function necessary for growth on glucose). They suggest that synonymous mutations in proA make the Shine–Dalgarno sequence and start codon of this gene more accessible to ribosomes, thus increasing translation efficiency and perhaps protecting the mRNA from degradation. There is thus strong evidence that the fitness effects of at least some synonymous mutations result from impacts on translation mediated through changes in mRNA stability. Whether this is a general mechanism will require more direct measures connecting synonymous mutations to mRNA stability and translation rates.

Ribosomal Pausing

Ribosomes may pause or stall as they move along the mRNA transcript during translation when they encounter Shine–Dalgarno-like motifs (Li et al. 2012) or codons served by rare tRNAs (Mohammad et al. 2019). These pauses can slow translation, decreasing protein expression, and so decreasing fitness. It has also been suggested that such pauses may be beneficial when they help to mitigate crowding of ribosomes, particularly at the beginning of genes. Reducing the rate of translation at the start of a gene, the so-called translational ramp, has been suggested to explain why the first 50 codons in many genes are enriched for rare codons (Tuller et al. 2011). Ribosomal pausing may also have important impacts on protein structure when cotranslational protein folding occurs (Kimchi-Sarfaty et al. 2007). To date, there is little experimental evidence that ribosomal pausing plays an important role in mediating the fitness effects of synonymous mutations involved in adaptive evolution. The one exception is work by Agashe et al. (2013), who found that fitness of strains enriched for rare or common codons was negatively correlated with the number of internal SD-like motifs in the mRNA transcript; however, subsequent experimental adaptation of these strains was not mediated by changes to the computationally predicted strength of SD binding for these sites (Agashe et al. 2016).

Alternative Mechanisms

The mechanisms discussed above are those for which some experimental support has been provided from selection experiments or large-scale site-directed mutagenesis studies designed to investigate the DFE among mutations. It is not, however, a comprehensive list of all the ways in which synonymous mutations can impact fitness. Additional mechanisms include various forms of transcriptional or translational control on gene expression mediated through small regulatory RNA molecules (Gu et al. 2012) as well as other better known posttranscriptional regulatory mechanisms like catabolite repression (Görke and Stülke 2008; see Bailey et al. [2014] for a test of this mechanism). No doubt there could be other mechanisms yet to be discovered as well. Uncovering the manifold ways in which synonymous mutations can impact fitness represents an interesting and compelling avenue for future research.

Idiosyncrasies or an Important Part of Adaptive Evolution?

It is clear that synonymous mutations can drive adaptive evolution, but how important are they in practice? The standard view that most synonymous mutations are neutral has led to the suggestion that when strongly beneficial synonymous mutations are observed, they are an anomaly arising from idiosyncratic features of the gene or organism in which they occur. Our survey of DFE studies suggests otherwise: the fitness effects of single-step synonymous mutations are often as variable as those of nonsynonymous mutations and can include those that are strongly deleterious and others that are strongly beneficial. It seems that when you go looking for synonymous mutations with large fitness effects, you find them. Why, then, are they not more often cited as contributors to adaptive evolution? There are, from our perspective, at least three possible reasons. The first is that, the evidence we have assembled here notwithstanding, the DFE among synonymous mutations for most genes is effectively neutral. In other words, it is possible that the collection of genes and environments that we have used to assess the DFE among mutations is biased in some way. It is hard to think of an a priori reason why this might be the case, as there is no reason to suspect that, with the one exception of gtsB in P. fluorescens (Lebeuf-Taylor et al. 2019), these genes were chosen for study because they were known to harbor synonymous mutations with large fitness effects. Nevertheless, until we have more examples of DFEs from other genes at different locations in the same genome and having a wider range of functions, we cannot entirely dismiss this explanation. A second reason is that even if the fitness effects of synonymous mutations are not beneficial, they may still be important in driving purifying selection. Although difficult to detect, because one is looking for the absence of substitutions, there is some evidence suggesting this may be the case in Lenski’s long-term evolution experiment with E. coli populations.
Chursov et al. (2013), for example, showed that nearly twice as many mRNA structure altering mutations occurred in nonessential genes versus essential ones, which is consistent with stronger purifying selection on mRNA structure that could be driven, at least partially, by synonymous mutations. The third explanation rests on the dynamics of genetic variation due to mutation in evolving populations. When a mutation arises de novo in a population, more often than not it is quickly lost due to genetic drift. In a population growing under strong selection weak mutation (SSWM) conditions, the next mutation to fix during adaptive evolution is drawn at random from all possible beneficial mutations, where the probability of fixation for each mutation is weighted by its fitness advantage over the ancestral genotype (Patwa and Wahl 2008). Since nonsynonymous sites outnumber synonymous sites by approximately 2 to 1, and assuming a similar DFE for synonymous and nonsynonymous mutations, we expect synonymous mutations to make up about ⅓ of the mutations that fix. This ratio is not what is observed across EE studies: outside of the studies collected for this review, synonymous mutations are almost always recovered at a frequency of less than ⅓ in selection experiments (and this is sometimes used as additional evidence that the populations have undergone adaptive evolution). Although it is certainly possible that some of the “missing” adaptive synonymous mutations in EE studies simply have not been reported or discussed because they are assumed to be neutral without any further investigation, it is hard to reconcile this explanation with the fact that highly replicated experiments rarely recover high degrees of parallelism, a signal of strong selection, at synonymous sites. The paucity of synonymous mutations contributing to adaptation seems to be a real effect.
We propose that the absence of adaptive synonymous mutations in microbial evolution experiments stems from the fact that most populations being studied do not experience SSWM conditions. Rather, these experiments are done under such large population sizes (typically >10^6; Cvijović et al. 2018) that the mutation supply rate (the product of the mutation rate and population size) is so high that they evolve in a regime of strong clonal interference where multiple beneficial mutations compete for fixation at the same time (Gerrish and Lenski 1998). Under clonal interference conditions, we expect the fitness advantage of mutations to come into play at two stages. First, we assume that the probability of a mutation arising, escaping drift, and increasing to an appreciable frequency is proportional to its fitness advantage, as under SSWM conditions. Then, a clone’s fitness advantage comes into play a second time as it competes with the other new clones in the population. Since these competing clones have all escaped genetic drift they are at a high enough frequency in the population that we expect their subsequent dynamics to proceed in a deterministic way. Thus, the simple expectation is that the clone with the highest fitness advantage in a group of competing clones will fix and all the rest of the clones will be lost. Under clonal interference conditions, the likelihood that the next mutation to fix is either nonsynonymous or synonymous will depend on the range, and especially the maximum, of fitness values associated with each class. To evaluate this idea, we used a set of very simple simulations meant to represent a single step of adaptive evolution under clonal interference conditions.
We first simulate a pool of potential beneficial synonymous and nonsynonymous mutations, assigning fitness effects based on the distributions of real measured fitness effects from the studies summarized in table 1 and assuming nonsynonymous mutations are twice as likely as synonymous mutations (based on the number of nonsynonymous and synonymous sites in a typical genome). We then simulate a single step of adaptive evolution by first randomly drawing a group of competing mutations (the number of mutations in this group is a parameter we vary), with the probability of drawing a particular mutation weighted by its fitness effect. Next, we determine which mutation out of the randomly drawn group has the greatest fitness effect and this highest-fitness mutation is the one that outcompetes others and eventually fixes in the population. Finally, we record whether that fixed mutation is synonymous or nonsynonymous. We repeated this simple simulation 1,000 times, using the observed fitness effects from each DFE study in table 1 and varying the number of competing clones from 1 to 20. Figure 3 shows the results of those simulations. Under SSWM conditions (number of competing clones = 1), the simulations show little variation from one study to the next in the probability that the next fixed mutation is synonymous. However, as the number of clones competing for fixation increases, the probability that the next fixed mutation is synonymous diverges. For about half the DFE studies, increasing the number of competing clones (our proxy for degree of clonal interference) results in a reduced probability that the next fixed mutation is synonymous, and in a few studies this probability drops all the way to 0 quite quickly (see orange and purple). On the other hand, with a few DFEs we do not see much of an impact of increasing the number of competing clones, and with two DFEs we even see an increase in the probability that the next fixed mutation is synonymous. 
Thus, the shape of a particular organism’s DFE, in particular the fitness effects of the highly beneficial synonymous and nonsynonymous mutations, becomes increasingly important in driving the likelihood of synonymous versus nonsynonymous mutations fixing when a population is in a clonal interference regime.
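The single-step simulation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the choice to sample clones with replacement, the tie-breaking rule, and the fitness values in the demo are all assumptions made here. Fitness effects are assumed to be positive selection coefficients, and the 2:1 ratio of nonsynonymous to synonymous mutations is applied as a sampling weight.

```python
import random


def simulate_step(syn_effects, nonsyn_effects, n_clones, rng):
    """Return True if the clone that fixes in one adaptive step is synonymous."""
    # Pool of candidate beneficial mutations: (fitness effect, is_synonymous).
    pool = [(s, True) for s in syn_effects] + [(s, False) for s in nonsyn_effects]
    # Sampling weight = fitness effect, doubled for nonsynonymous mutations
    # to reflect the assumed 2:1 ratio of nonsynonymous to synonymous sites.
    weights = [s if is_syn else 2.0 * s for s, is_syn in pool]
    # Draw the competing clones (with replacement: an assumption of this sketch).
    clones = rng.choices(pool, weights=weights, k=n_clones)
    # The clone with the largest fitness effect wins; on an exact tie,
    # max() keeps whichever tied clone was drawn first.
    fittest = max(clones, key=lambda clone: clone[0])
    return fittest[1]


def prob_synonymous(syn_effects, nonsyn_effects, n_clones, reps=1000, seed=0):
    """Estimate the probability that the next fixed mutation is synonymous."""
    rng = random.Random(seed)
    wins = sum(
        simulate_step(syn_effects, nonsyn_effects, n_clones, rng)
        for _ in range(reps)
    )
    return wins / reps


if __name__ == "__main__":
    # Hypothetical fitness effects, NOT values from any study in table 1.
    syn = [0.01, 0.02, 0.05]
    nonsyn = [0.01, 0.03, 0.05, 0.10]
    for n in (1, 2, 5, 10, 20):  # n = 1 corresponds to SSWM conditions
        print(n, prob_synonymous(syn, nonsyn, n))
```

With a single competing clone the outcome depends only on the weighted draw, whereas with many clones it is dominated by whichever class holds the largest fitness effect, which is the dependence on the DFE maximum described in the text.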

Figure 3. Probability that the next mutation fixed during an adaptive step is synonymous, as a function of the number of unique clones competing for fixation. Points along each line represent the outcomes of random draws of a range of different numbers of beneficial clones, with fitness drawn from experimentally quantified distributions of fitness effects from the studies summarized in table 1 (legend letter labels correspond to letter labels in table 1).

The scarcity of synonymous mutations contributing to adaptation in EE studies, despite their apparent prevalence revealed through DFE studies, is thus likely due to two factors. The first is that the maximum fitness value of synonymous mutation DFEs rarely exceeds the maximum of nonsynonymous DFEs. The second is that the large population sizes of most EE studies mean that adaptation occurs in a clonal interference regime. The result is that high fitness nonsynonymous mutations are more likely to contribute to adaptation when mutation supply rates are high.

The Implications of Ignoring Selection at Synonymous Sites

Synonymous mutations appear to play a more significant and pervasive role in rapid adaptive evolution than previously thought. The potential for strong selection at synonymous sites, both positive and purifying, means that we need to reconsider comparative genomic approaches for detecting selection that use synonymous substitution rates as a proxy for the neutral substitution rate. When there is strong selection at synonymous sites, simply comparing the rate of nonsynonymous mutations per site (dN) to the rate of synonymous mutations per site (dS) (Kimura 1977) can result in extensive false positives and false negatives when identifying selection at either the codon or the gene level. We are certainly not the first to acknowledge this problem, in particular with respect to viral genome evolution. For example, Crandall et al. (1999) highlighted the importance of synonymous mutations with fitness effects in a study following the evolution of HIV-1 strains in eight patients, sampled at two different time points, pre- and postdrug therapy. In this study, multiple instances of parallel evolution at both synonymous and nonsynonymous sites across strains from different patients pointed to strong positive selection in the protease gene. However, the calculated dN/dS ratio for these data was less than one, suggesting pervasive purifying selection. The authors suggested that the mismatch occurred because both nonsynonymous and synonymous substitutions were under positive selection at these sites. An EE study discussed earlier in this review, by Novella et al. (2004), reported a similar mismatch, noting that values calculated for dN/dS in their populations are consistent with random drift despite independent evidence (namely, pervasive parallel evolution and a rapid increase in population fitness) suggesting that strong positive selection was, in fact, driving evolutionary dynamics. 
Missing true instances of selection at nonsynonymous sites is not the only potential problem—false positives are the other possibility. A significant deviation of dN/dS from 1 that is usually taken as evidence of selection at nonsynonymous sites could instead indicate that synonymous sites are under strong selection (either purifying or diversifying), whereas nonsynonymous sites are not. A recent study by Wisotsky et al. (2020) showed that over 50% of sites identified as being under significant positive selection were no longer significant when variation in synonymous substitution rate was explicitly modeled in an analysis of an empirical 13,000 gene alignment data set. Thus, the dN/dS approach to inferring selection must be used with caution.
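A toy calculation makes both failure modes concrete. The sketch below computes a naive dN/dS from raw substitution counts and site counts; all numbers are hypothetical and chosen only for illustration, and the function names are assumptions introduced here. Real estimators additionally correct for multiple substitutions at a site and are usually fitted under a codon model, so this is not a substitute for them.

```python
def per_site_rate(substitutions, sites):
    """Uncorrected substitutions per site."""
    if sites <= 0:
        raise ValueError("number of sites must be positive")
    return substitutions / sites


def dn_ds(nonsyn_subs, nonsyn_sites, syn_subs, syn_sites):
    """Naive dN/dS ratio from raw counts (no multiple-hit correction)."""
    dn = per_site_rate(nonsyn_subs, nonsyn_sites)
    ds = per_site_rate(syn_subs, syn_sites)
    if ds == 0:
        return float("inf")  # ratio is undefined without synonymous changes
    return dn / ds


# Hypothetical gene with 200 nonsynonymous and 100 synonymous sites.
N_SITES, S_SITES = 200, 100

# Both classes evolving at the same per-site rate: ratio near 1.
neutral = dn_ds(4, N_SITES, 2, S_SITES)

# Positive selection elevates BOTH classes: the ratio falls below 1 even
# though nonsynonymous substitutions tripled (a false negative).
both_selected = dn_ds(12, N_SITES, 8, S_SITES)

# Purifying selection on synonymous sites only: the ratio exceeds 1 with no
# change at nonsynonymous sites (a false positive).
syn_purified = dn_ds(4, N_SITES, 1, S_SITES)

print(neutral, both_selected, syn_purified)
```

The second case mirrors the kind of mismatch reported by Crandall et al. (1999), and the third mirrors the false positives that arise when synonymous rate variation is ignored.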

Accounting for Selection at Synonymous Sites

A few approaches have been proposed to deal with this problem of false positives and negatives driven by synonymous substitution rate variation. For example, some methods explicitly model nucleotide-level selection in combination with codon-level selection (Rubinstein et al. 2011) or categorize synonymous substitutions as conservative (e.g., switching between two different preferred codons) and nonconservative (e.g., switching from a preferred to unpreferred codon), using only the conservative synonymous substitutions as an estimate of neutral evolution (Zhou et al. 2010). These kinds of models can start to distinguish situations where dN/dS is greater than one because nonsynonymous sites are under positive selection versus synonymous sites being under purifying selection. Other models get away from the problem of synonymous site neutrality altogether by simply ignoring synonymous sites and using noncoding regions of the genome to estimate neutral rates of evolution instead. Of course, caution is needed when choosing noncoding regions to use, as they can also have important effects on gene expression and so fitness, as some of the work reviewed here has shown (Kershner et al. 2016). An approach currently under development by the authors of this review aims to detect selection at both synonymous and nonsynonymous sites by using patterns of parallel evolution across closely related strains and species of bacteria. This approach relies on the assumption that parallel evolution is often driven by positive selection and identifies mutations that have arisen repeatedly across independently evolving lineages more often than expected by chance. We have used this approach successfully to look for adaptations in the gtsB gene across Pseudomonas species and strains (see Bailey et al. 2014; Lebeuf-Taylor et al. 2019). 
Future work will focus on looking for evidence of synonymous mutations with positive fitness effects across different gene regions (e.g., beginning/middle/end, functional domains) and genes (e.g., essential vs. nonessential, location within an operon) to characterize general patterns in the fitness effects of synonymous mutations in populations outside of lab experiments. Results from this broad analysis across strains and species will add to our understanding of the general importance of synonymous mutations in adaptive evolution.
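The idea of flagging mutations that recur across independent lineages "more often than expected by chance" can be illustrated with a simple binomial null model. This is a hedged sketch and not the method under development by the authors: it assumes every lineage carries the same number of mutations, that mutations land uniformly at random across sites, and it ignores mutation-rate heterogeneity and multiple testing. The function and parameter names are hypothetical.

```python
from math import comb


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def parallel_hit_pvalue(hits, lineages, muts_per_lineage, n_sites):
    """Chance that a given site is mutated in at least `hits` of `lineages`
    independent lineages under a uniform-random null model."""
    # Approximate per-lineage probability that this particular site is hit.
    p_site = min(1.0, muts_per_lineage / n_sites)
    return binom_tail(hits, lineages, p_site)


if __name__ == "__main__":
    # Hypothetical example: 10 lineages, 5 mutations each, 1,000-site gene.
    for hits in (1, 2, 3):
        print(hits, parallel_hit_pvalue(hits, 10, 5, 1000))
```

A small tail probability for a synonymous site hit in several lineages would mark it as a candidate for positive selection, subject to the caveats above.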

Conclusions

Synonymous mutations can have large effects on fitness and make important contributions to adaptive evolution. Although these fitness effects may vary substantially across different types of genes and organisms, experiments using different species of viruses, bacteria, and yeast reported here confirm that synonymous mutations have widespread importance and cannot be ignored. Experimental studies comparing the distribution of fitness effects of mutations generated by site-directed mutagenesis suggest that in many cases, synonymous mutations should make up about one third of mutations contributing to adaptive evolution. Although a number of experimental studies tracking adaptive evolution of replicate populations do report synonymous mutations contributing to adaptive evolution, it is not usually at this high a frequency. We suggest clonal interference may drive some of this observed mismatch, but further investigation is warranted. Finally, the available experimental evidence points to the creation of noncanonical RNA polymerase binding sites impacting transcription (sometimes of downstream genes, in bacteria at least) and mRNA stability impacting translation as possible mechanisms underlying these fitness changes, although other mechanisms are possible and remain to be explored in more detail. Notably, there is little evidence that large beneficial fitness effects associated with synonymous mutations are attributable to changes to codon bias, arguably the most common explanation for adaptation via synonymous mutations. The reasons for the paucity of synonymous mutations contributing to adaptation in the evolve-and-resequence experiments we discuss here deserve further exploration. We have proposed clonal interference as one possibility but there could be others. 
The initial stages of adaptation in laboratory evolution experiments are often driven by genetic changes like loss-of-function mutations and gene amplifications that do not require synonymous mutations (Kassen 2014; Murray 2020). It therefore may not be too surprising that synonymous mutations are rarely recovered from short-term experiments involving novel or stressful environments like those studied in experimental evolution. Closer examination of long-term evolution experiments, such as the LTEE populations (Good et al. 2017), may show evidence of an increasing proportion of synonymous mutations contributing to adaptation over time. Looking ahead, we see great opportunity for using evolution experiments to explore what drives the relative contribution of synonymous mutations to adaptive evolution. We have shown, for example, that the shape of the DFE of synonymous and nonsynonymous mutations can be important for driving which types of mutations are most likely to contribute to adaptation. A direct demonstration of the importance of these distributions for governing the contribution of synonymous mutations to adaptation could be done by using environmental variation to modulate the DFEs. Synonymous mutations would be more likely to contribute to adaptation when their DFE among putatively beneficial mutations is more similar to that of nonsynonymous mutations. The potential impacts of clonal competition on the contribution of synonymous mutations could also be tested in a rather straightforward way by manipulating population size and/or structure, which would in turn impact the degree of clonal competition. The expectation is that larger and/or more structured populations will adapt using fewer adaptive synonymous mutations compared with smaller, less structured populations. 
We have focused in this review on the fitness effects of single-step synonymous mutations, but as is the case with nonsynonymous mutations, epistasis may be important, and may even be part of the reason behind difficulties in identifying clear fitness mechanisms. Indeed, some proposed mechanisms for fitness effects of synonymous mutations lend themselves easily to epistatic effects. For example, mRNA folding involves base pairing between nucleotides in different regions of an mRNA transcript. Thus, one could imagine that a synonymous mutation in one part of a gene could have important implications for the fitness effects of a subsequent synonymous mutation in another part of the same gene. Finally, our focus in understanding the fitness effects of synonymous mutations up to this point has often been confined to a single gene at a time. However, a few experimental studies discussed here observed that mutations in one gene impacted expression of other neighboring genes (Kristofich et al. 2018; Lebeuf-Taylor et al. 2019), suggesting that the single gene may be too narrow a focus. At least in bacterial genomes, where genes are often grouped together in operons and transcribed together, considering how a mutation impacts the whole transcriptional unit may help move us closer to identifying the mechanisms behind fitness effects of synonymous mutations driving adaptive evolution.

Supplementary Material

Supplementary data are available at Genome Biology and Evolution online.
