Fitness is strongly influenced by rare mutations of large effect in a microbial mutation accumulation experiment.

Karl Heilbron¹, Macarena Toll-Riera², Mila Kojadinovic², R Craig MacLean².

Abstract

Our understanding of the evolutionary consequences of mutation relies heavily on estimates of the rate and fitness effect of spontaneous mutations generated by mutation accumulation (MA) experiments. We performed a classic MA experiment in which frequent sampling of MA lines was combined with whole genome resequencing to develop a high-resolution picture of the effect of spontaneous mutations in a hypermutator (ΔmutS) strain of the bacterium Pseudomonas aeruginosa. After ∼644 generations of mutation accumulation, MA lines had accumulated an average of 118 mutations, and we found that average fitness across all lines decayed linearly over time. Detailed analyses of the dynamics of fitness change in individual lines revealed that a large fraction of the total decay in fitness (42.3%) was attributable to the fixation of rare, highly deleterious mutations (comprising only 0.5% of fixed mutations). Furthermore, we found that at least 0.64% of mutations were beneficial and probably fixed due to positive selection. The majority of mutations that fixed (82.4%) were base substitutions and we failed to find any signatures of selection on nonsynonymous or intergenic mutations. Short indels made up a much smaller fraction of the mutations that were fixed (17.4%), but we found evidence of strong selection against indels that caused frameshift mutations in coding regions. These results help to quantify the amount of natural selection present in microbial MA experiments and demonstrate that changes in fitness are strongly influenced by rare mutations of large effect.

Keywords: Pseudomonas aeruginosa; experimental evolution; hypermutator; spontaneous mutation; whole genome resequencing

Year: 2014 PMID: 24814466 PMCID: PMC4096375 DOI: 10.1534/genetics.114.163147

Source DB: PubMed Journal: Genetics ISSN: 0016-6731 Impact factor: 4.562

MUTATIONS are the ultimate source of genetic variation that natural selection acts upon. Understanding the rate at which mutations arise and the distribution of fitness effects of spontaneous mutations is therefore of central importance to the study of evolutionary biology (Haldane 1937; Kondrashov 1988; Partridge and Barton 1993; Charlesworth and Hughes 1996, 2000; Hughes 2010; Bank ). One of the most widely used methods for determining the rate and fitness effect of spontaneous mutations is the MA experiment. Following the pioneering work of Bateman (1959) and Mukai (1964), MA experiments involve propagating many replicate lines at very small effective population sizes so that the effect of natural selection is swamped out by that of genetic drift, allowing weakly selected mutations to accumulate randomly. The decline in mean fitness and increase in among-line variance in fitness are then used to indirectly infer mutation rate and effect estimates (Bateman 1959; Mukai 1964; Keightley 1994; García-Dorado 1997; Shaw ). Recently, whole genome resequencing of MA lines has been used to directly measure the mutation rate in microorganisms (Lynch ; Lee ; Ness ; Sung ,b; Long ). In line with classic mutation rate estimates from reporter gene assays, the emerging consensus is that the genomic mutation rate is remarkably constant across DNA-based microbes, ∼3 × 10⁻³ mutations/genome/generation (Drake 1991; Lynch 2010). Accurate estimates of the fitness effects of spontaneous mutation, however, have remained elusive (Eyre-Walker and Keightley 2007; Halligan and Keightley 2009). Because MA experiments rely on making comparisons among lines, they have traditionally focused on studying how fitness changes across as many lines as possible.
An alternative approach is to combine whole genome resequencing in a smaller number of MA lines of a hypermutator strain to allow a greater number of mutations to accumulate, thus increasing our ability to detect and quantify the amount of natural selection that occurs during microbial mutation accumulation experiments. Furthermore, whole genome resequencing directly determines the average number of mutations that accumulate between fitness measurements, allowing for improved estimates of the distribution of fitness effects of spontaneous mutations. Natural selection must occur to some extent during microbial mutation accumulation experiments because colonies must grow big enough to become visible, resulting in an effective population size (Ne) > 1. Beneficial and deleterious mutations should be subject to effective selection when Ne × s > 1, where s is the absolute value of the fitness effect of the mutation, and the fluctuating population size of microbial MA experiments may further increase the efficacy of selection (Otto and Whitlock 1997). This may explain why many microbial MA experiments have reported results that are consistent with the fixation of some beneficial mutations as a result of positive selection (Shaw ; Joseph and Hall 2004; Perfeito ; Dickinson 2008; Trindade ; Stevens and Sebert 2011). Studies have begun to combine both MA and whole genome resequencing in microorganisms (Lynch ; Lee ; Ness ; Sung ,b; Long ), but none have detected a genomic signature of natural selection. Using detailed fitness measurements and whole genome resequencing, we studied the evolutionary dynamics of eight replicate mutation accumulation lines of a hypermutator strain of the pathogenic bacterium Pseudomonas aeruginosa. MA lines were passaged through 28 single-cell bottlenecks followed by rapid population growth over a period of ∼644 generations.
Under this regime, we estimate that the effective population size of MA lines had a lower limit of ∼16, which should be sufficient to prevent natural selection on the vast majority of spontaneous mutations. We determined the evolutionary dynamics of our lines with a high degree of precision by (1) directly measuring competitive fitness instead of a component of fitness such as growth rate, and (2) measuring fitness at every second bottleneck to capture a small number of mutations between each time point. In line with recent work, we used deep whole genome sequencing to determine the genetic consequences of population bottlenecking, infer the molecular basis of altered fitness, and test for genomic signatures of natural selection during the MA procedure. Consistent with previous MA experiments, we found that mean fitness decayed linearly over time. Detailed trajectories of fitness in individual lines coupled to whole genome sequencing revealed that rare, strongly deleterious mutations account for nearly half of the total loss of fitness. Furthermore, we found that positive selection resulted in the fixation of beneficial mutations, and that purifying selection was able to remove the majority of frameshift mutations.

Materials and Methods

Strains

The eight replicate clones used in this study were founded from the P. aeruginosa hypermutator strain PAO1ΔmutS, which was created by replacing mutS (part of the methyl-directed mismatch repair pathway) with the antibiotic resistance marker aac1 using the Cre-lox system for gene deletion and antibiotic resistance marker recycling following the methods of Mandsberg . Deleting mutS increases the mutation rate by ∼70-fold in P. aeruginosa (Torres-Barcelo ), primarily by increasing the rate of transitions (Miller 1996). The reference strain used to assess competitive fitness was PAO1-GFP. This strain was generated by integrating a constitutively expressed GFP marker at the chromosomal Tn7 insertion site in P. aeruginosa PAO1 using the methods of Choi and Schweizer (2006).

Mutation accumulation

Eight replicate mutation accumulation lines were generated by streaking randomly selected colonies of PAO1ΔmutS onto individual M9KB agar plates (glycerol, 10 g/liter; peptone, 20 g/liter; M9 salts, 10.5 g/liter; agar, 12 g/liter; and MgSO4, 2 mL/liter). Plates were incubated at 37° for 18 hr before repeating the process of picking a random colony and streaking it on a fresh plate. This process was repeated daily for 30 days. Each day, colonies would form from a single cell, which had doubled ∼23 times, resulting in an Ne of ∼16. Every second day, a portion of the randomly selected colony was suspended in a 50% v/v solution of glycerol and frozen at −80° to be stored for competition assays. To ensure random selection of colonies, the last colony of the streak, which was not touching another colony, was selected. It is unlikely that random colony selection suffered a detection bias due to missing extremely small colonies; we sampled 14 regions between the visible colonies of our streaked plates and restreaked them, but did not detect a single instance of colony growth after 10 days.
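The effective population size quoted above can be approximated as the harmonic mean of population size over one growth cycle. The following is a minimal sketch, assuming smooth exponential growth from a single cell through ∼23 doublings; the function name and the continuous-time approximation are illustrative assumptions, since the text does not give the formula that was used.

```python
import math

def harmonic_mean_ne(doublings: float, n0: float = 1.0) -> float:
    """Harmonic mean of N(t) = n0 * 2**t over t in [0, doublings].

    Assumes smooth exponential growth between bottlenecks (an
    illustrative assumption; the source does not state the formula).
    """
    # Integral of 1/N(t) dt from 0 to T equals (1 - 2**-T) / (n0 * ln 2).
    integral = (1.0 - 2.0 ** (-doublings)) / (n0 * math.log(2))
    return doublings / integral

ne = harmonic_mean_ne(23)
print(round(ne, 1))
```

Under these assumptions the result is close to the Ne of ∼16 stated in the text.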

Competitive fitness assay

Fitness of each line at each time point was determined relative to the PAO1-GFP strain. Strains were precultured in M9KB medium from frozen 50% glycerol stocks. Overnight cultures of each strain were mixed in M9KB broth at a ratio of ∼80% mutant to 20% PAO1-GFP. The exact initial proportions were confirmed via flow cytometry. Mixtures were competed for 18 hr at 37°, with agitation at 200 rpm, and the final proportion was again measured by flow cytometry. We define the relative fitness of the mutant as the number of doublings that the mutant strain undergoes during the 18-hr competition divided by the number of doublings of the wild-type strain, given by the formula

$$w_{\text{mutant}} = \frac{\log_2\left(N_{\text{final, mutant}} / N_{\text{initial, mutant}}\right)}{\log_2\left(N_{\text{final, wild-type}} / N_{\text{initial, wild-type}}\right)}$$

where w_mutant is the fitness of the mutant relative to the wild-type and N is the number of either the mutant or the wild-type cells at either the beginning or the end of the competition. Each competition assay was performed in two experimental blocks with three replicate competitions per block. In some mutation accumulation lines, fitness became too low to accurately measure (final mutant proportion <10%) and thus these data have been excluded from all analyses except those pertaining to Figure 2 and the decay in average fitness over time. The inclusion of these inaccurate points does not change the statistical significance of any of the results presented.
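The definition of relative fitness above, a ratio of doublings, can be sketched as follows. The function name and the cell counts are hypothetical and serve only to illustrate the calculation.

```python
import math

def relative_fitness(mut_initial, mut_final, wt_initial, wt_final):
    """Ratio of mutant doublings to wild-type doublings."""
    mutant_doublings = math.log2(mut_final / mut_initial)
    wildtype_doublings = math.log2(wt_final / wt_initial)
    return mutant_doublings / wildtype_doublings

# Hypothetical cell counts, not data from the study: the mutant doubles
# 9 times while the wild type doubles 10 times.
w = relative_fitness(mut_initial=8e5, mut_final=8e5 * 2**9,
                     wt_initial=2e5, wt_final=2e5 * 2**10)
print(w)
```

A value below 1 means the mutant completed fewer doublings than the reference strain during the competition.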

Figure 2

Flow cytometry

Flow cytometry was used to determine the relative proportions of mutant and wild-type strains at the beginning and end of the competitive fitness assays. Bacterial cultures, diluted 200-fold in sterile filtered M9 salts, were prepared using deionized water to minimize background signal in the flow cytometer. Diluted mixtures were run on an Accuri C6 Flow Cytometer Instrument (BD Accuri, San Jose, CA) until 10,000 cells had been assayed. Events with a forward scatter value <10,000 or a side scatter value <8000 were excluded to prevent the false detection of small particles in the medium and electrical noise. To discriminate between GFP-tagged and untagged cells, cells were excited at a wavelength of 488 nm and fluorescence emissions between 518 and 548 nm were measured. There was a small overlap in the fluorescence profiles of tagged and untagged cells (i.e., the most fluorescent untagged cells were slightly more fluorescent than the least fluorescent GFP-tagged cells), so pure cultures of PAO1 and PAO1-GFP were used as controls to correct for such spillover.
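The scatter thresholds described above amount to a simple gate applied to each event before cells are classified by fluorescence. The sketch below assumes events are available as (forward scatter, side scatter, GFP) triples; the `Event` type, the `gfp_cutoff` parameter, and the example values are hypothetical, and the spillover correction from pure-culture controls is not modelled.

```python
from typing import NamedTuple

class Event(NamedTuple):
    fsc: float  # forward scatter
    ssc: float  # side scatter
    gfp: float  # fluorescence in the 518-548 nm window

FSC_MIN = 10_000  # events below this forward scatter are excluded
SSC_MIN = 8_000   # events below this side scatter are excluded

def gate(events):
    """Drop events that fall below either scatter threshold."""
    return [e for e in events if e.fsc >= FSC_MIN and e.ssc >= SSC_MIN]

def untagged_fraction(events, gfp_cutoff):
    """Fraction of gated events classed as untagged (non-GFP) cells."""
    kept = gate(events)
    untagged = sum(1 for e in kept if e.gfp < gfp_cutoff)
    return untagged / len(kept)

# Hypothetical events, not instrument output.
events = [
    Event(20_000, 9_000, 50),
    Event(5_000, 9_000, 40),       # rejected: low forward scatter
    Event(30_000, 12_000, 5_000),  # GFP-tagged cell
    Event(15_000, 7_000, 30),      # rejected: low side scatter
    Event(25_000, 10_000, 80),
]
print(untagged_fraction(events, gfp_cutoff=1_000))
```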

Whole genome sequencing

Illumina whole genome sequencing was performed on the first and last time point of each line, as well as on the five pairs of adjacent time points that showed the largest decrease in fitness. Raw sequencing data were analyzed using an in-house pipeline. Briefly, raw reads were filtered using the NGS QC Toolkit (Patel and Jain 2012) and aligned against the reference genome using BWA (Li and Durbin 2009). Two approaches were used to call variants, GATK’s Unified Genotyper (Depristo ) and SAMtools’s Mpileup (Li ). Identified variants were annotated with SnpEff (Cingolani ). To detect structural variants, we combined two algorithms, Breakdancer (Chen ) and Pindel (Ye ). Finally, copy number variants (CNVs) were detected using Control-FREEC (Boeva ). All differences between the P. aeruginosa PAO1 reference genome and the first time point of each bacterial line were excluded, leaving only mutations that accumulated throughout the experiment. Sequences from intermediate time points were treated as sequences from end points. All mutations found in intermediate time points were found at the end points except for one that fell in a mutation hotspot.

Testing for selection on base substitutions

To test for selection on base substitutions in protein coding genes, we estimated the expected number of protein altering mutations, under the assumption that synonymous mutations are effectively neutral. Specifically, since almost all base substitutions in our experiment were transitions (99.5%), we calculated the neutral mutation rate of each of the four bases to its partner (A→G, G→A, C→T, and T→C) using the observed synonymous mutations in our experiment. Given these mutation rates, we used the nucleotide composition and codon usage of P. aeruginosa proteins to estimate the rates of nonsynonymous and synonymous mutations (dN/dS ratio), as well as the rates of stop-gain, stop-loss, and intergenic mutations. To test for a deviation from the neutral expectation, we tested the null hypothesis that the proportion of mutations in a given class (nonsynonymous, truncation, or intergenic) relative to the number of observed synonymous mutations is equal to the predicted ratio calculated using the synonymous mutation rate. This hypothesis was tested using the normal approximation of the binomial distribution (Zar 2010).
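The test described above can be sketched as a normal approximation to a binomial test on the proportion of mutations in a class relative to synonymous mutations. This is an illustrative reading of the method, not the authors' code. The synonymous count of 202 is an assumption derived by subtraction from counts reported in the Results (697 coding substitutions minus 480 nonsynonymous, 14 stop-gain, and 1 stop-loss).

```python
import math

def binomial_z(observed_class, observed_syn, expected_class):
    """Normal approximation to a binomial test against the neutral ratio.

    expected_class is the neutral expectation for the class, given
    observed_syn synonymous mutations.
    """
    n = observed_class + observed_syn
    p0 = expected_class / (expected_class + observed_syn)
    z = (observed_class - n * p0) / math.sqrt(n * p0 * (1 - p0))
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_tailed

# observed_syn=202 is an assumption (see the note above); 480 and 444.38
# are the observed and expected nonsynonymous counts from Table 1.
z, p = binomial_z(observed_class=480, observed_syn=202, expected_class=444.38)
print(round(z, 2))
```

Under these assumptions the Z statistic comes out close to the value reported for nonsynonymous mutations in the Results.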

Repetitive regions

The RepeatMasker program (Smit ) was used to screen the PAO1 genome for simple repeats, interspersed repeats, and low-complexity DNA sequences. Homopolymeric tracts of single nucleotide repeats ranging from 4 to 20 bases were identified using the dreg program, implemented in the EMBOSS package (Rice ).

Magnitude of selection against indels in coding regions

The percentage of indels in repetitive coding regions removed by natural selection was calculated under the assumption that indels in the repetitive noncoding genome are neutral. The expected number of indels in repetitive coding regions before natural selection was calculated by dividing the observed number of “neutral” mutations in repetitive noncoding regions by the fraction of repetitive elements that are in noncoding regions (21.8%) and multiplying this value by the fraction of repetitive elements that are in coding regions (78.2%). The percentage of indels removed due to natural selection is then 1 − observed/expected. If mutations in noncoding repetitive regions are not neutral, then this method will generate a lower limit estimate.
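As a worked example of this calculation, the sketch below uses the counts reported in the Results: 103 of 160 repetitive-region indels fell in coding regions, which leaves 57 in repetitive noncoding regions. The function name is hypothetical, and the count of 57 is obtained here by subtraction rather than quoted from the text.

```python
def fraction_removed(obs_noncoding, obs_coding,
                     frac_noncoding=0.218, frac_coding=0.782):
    """Lower-bound fraction of coding-region indels removed by selection."""
    # Scale the "neutral" noncoding count up to the coding fraction.
    expected_coding = obs_noncoding / frac_noncoding * frac_coding
    return 1.0 - obs_coding / expected_coding

# 57 = 160 - 103 (derived by subtraction, see the note above).
removed = fraction_removed(obs_noncoding=57, obs_coding=103)
print(f"{removed:.1%}")
```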

Core genes

Precomputed pairwise reciprocal best BLAST hits for 36 Pseudomonas species were downloaded from the Pseudomonas Genome Database (Winsor ). The core genome for P. aeruginosa PAO1 was defined as the set of PAO1 genes that had pairwise reciprocal best BLAST hits in the 35 remaining Pseudomonas species. We found a total of 1435 core genes.
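The core genome definition above is a set intersection across species. A minimal sketch, assuming the reciprocal best hits have already been parsed into one set of PAO1 genes per comparison species; the data layout, function name, and gene and species labels are made up for illustration.

```python
def core_genes(reference_genes, best_hits_by_species):
    """Reference genes with a reciprocal best hit in every species.

    best_hits_by_species maps a species name to the set of reference
    genes that have a reciprocal best BLAST hit in that species.
    """
    core = set(reference_genes)
    for hits in best_hits_by_species.values():
        core &= hits
    return core

# Toy data with made-up gene and species labels.
pao1 = {"geneA", "geneB", "geneC", "geneD"}
hits = {
    "species_1": {"geneA", "geneB", "geneC"},
    "species_2": {"geneA", "geneC", "geneD"},
    "species_3": {"geneA", "geneB", "geneC", "geneD"},
}
print(sorted(core_genes(pao1, hits)))
```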

Clusters of Orthologous Groups analysis

A list of P. aeruginosa PAO1 genes with annotated Clusters of Orthologous Groups (COGs) categories (Tatusov ) was downloaded from the National Center for Biotechnology Information. This list was intersected with the list of genes that had experienced at least one mutation during our experiment. Genes with annotated mutations and COG categories were compared to the rest of the genes in the PAO1 genome that were unmutated, but had been assigned a COG category. P-values were computed using Fisher’s exact test and corrected for multiple testing using the false discovery rate method (Benjamini and Hochberg 1995).
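The false discovery rate correction cited above (Benjamini and Hochberg 1995) can be sketched in a few lines. This is a generic step-up implementation written for illustration, not the R routine used in the study, and the input p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-category p-values, not values from the study.
raw = [0.001, 0.04, 0.03, 0.5]
print(benjamini_hochberg(raw))
```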

Statistical analysis and simulations

All statistical analyses were conducted in R (version 2.15.0) (R Development Core Team 2012). All statistical tests are reported as a P-value and the value for the test statistic with a subscript indicating the degrees of freedom. All tests use α = 0.05 and, where applicable, are two tailed. Simulations were used to generate the expected distribution of the number of mutations per gene, given the substantial variation in gene length in the P. aeruginosa genome (mean: 830 bp, 95% confidence interval: 247–2786 bp). The lengths of all genes in the P. aeruginosa genome were obtained from the Pseudomonas Genome Database (Winsor ). In each simulation, mutations (either synonymous or nonsynonymous) were randomly distributed across a simulated genome, using the same number of mutations as was detected in our experiment. The number of mutations per gene was recorded and results were averaged across 100 simulations.
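The simulation described above can be sketched as follows, assuming that each mutation lands in a gene with probability proportional to gene length. The study was run in R with the real PAO1 gene lengths; the Python function, its parameters, and the gene lengths below are illustrative assumptions.

```python
import random
from collections import Counter

def simulate_hits_per_gene(gene_lengths, n_mutations, n_sims=100, seed=0):
    """Average number of genes hit 0, 1, 2, ... times per simulation.

    Each mutation lands in a gene with probability proportional to length.
    """
    rng = random.Random(seed)
    genes = range(len(gene_lengths))
    totals = Counter()
    for _ in range(n_sims):
        hits = Counter(rng.choices(genes, weights=gene_lengths, k=n_mutations))
        # Tally how many genes received each hit count, including zero.
        per_gene = Counter(hits.get(g, 0) for g in genes)
        totals.update(per_gene)
    return {k: v / n_sims for k, v in sorted(totals.items())}

# Made-up gene lengths (bp), for illustration only.
lengths = [300, 800, 800, 1200, 2500, 600, 900, 1500]
expected = simulate_hits_per_gene(lengths, n_mutations=5)
print(expected)
```

The returned dictionary is the expected distribution against which an observed count of genes with 0, 1, 2, or more mutations can be compared.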

Results

Here we present the results from a ∼644-generation-long mutation accumulation experiment in eight replicate MA lines. We measured the fitness of each MA line every 2 days, providing a high-resolution picture of the evolutionary dynamics of heavily bottlenecked bacterial populations. We performed whole genome resequencing on multiple time points of each line to determine the molecular nature of mutations fixed under conditions of relaxed natural selection. Whole genome resequencing identified 944 mutations in the eight mutation accumulation lines. Sanger sequencing of a random sample of these mutations confirmed 35/35 mutations (Supporting Information, Table S1), indicating a very low false positive rate. As expected, mutations were Poisson distributed across MA lines (one-sample Kolmogorov–Smirnov test: P = 0.521, D = 0.270) with an average of 118 mutations fixed per line and an average of 8.4 mutations fixed between each adjacent time point. This equates to a per base pair mutation rate of 2.95 (± 0.21 SE) × 10⁻⁸ mutations/site/generation and a genomic mutation rate of 0.18 (± 0.01 SE) mutations/genome/generation. Given that the hypermutator strain used in this study increases the mutation rate by ∼70-fold (Torres-Barcelo ), this estimated genomic mutation rate is in line with the consensus bacterial genomic mutation rate of ∼3 × 10⁻³ mutations/genome/generation (Drake 1991; Lynch 2010). Of the 944 mutations, 778 (82.4%) were base substitutions, 164 (17.4%) were short indels (<10 bp), and 2 (0.2%) were large structural variations, consisting of a partial gene duplication event (pvdD) and a 1880-bp intergenic deletion. Insertions were ∼2.5-fold more common than deletions (118 insertions vs. 46 deletions) (Figure 1). As is typical for a ΔmutS hypermutator strain, almost all base substitutions were transitions (774/778 = 99.5%), and G:C→A:T transitions (478) were ∼60% more common than A:T→G:C transitions (298).
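The rate estimates above follow from simple arithmetic on the average mutation count. The sketch below is a back-of-the-envelope check. The genome length of 6,264,404 bp for PAO1 is not stated in the text and is supplied here as an assumption, and the calculation uses the overall average rather than per-line estimates, so the per-site value comes out slightly different from the reported one.

```python
GENOME_SIZE_BP = 6_264_404  # PAO1 genome length; an assumption, not in the text
mutations_per_line = 118    # average mutations fixed per line
generations = 644           # approximate generations of accumulation
hypermutator_fold = 70      # approximate fold increase caused by deleting mutS

genomic_rate = mutations_per_line / generations
per_site_rate = genomic_rate / GENOME_SIZE_BP
wild_type_equivalent = genomic_rate / hypermutator_fold

print(f"{genomic_rate:.2f} mutations/genome/generation")
print(f"{per_site_rate:.2e} mutations/site/generation")
print(f"{wild_type_equivalent:.1e} mutations/genome/generation without the mutS deletion")
```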

Figure 1

Types of mutations accumulated. (A) The distribution of accumulated mutations according to type of mutation. Indels <10 base pairs long were considered to be “short.” (B) Further information on the effects of point mutations.

As expected, the average fitness of the hypermutator populations decreased significantly over time (Figure 2; ANOVA: P = 1.68 × 10⁻⁶, F₁,₁₃ = 67.409), indicating that the average effect of spontaneous mutations was deleterious and that recurrent population bottlenecks inhibited the action of natural selection (mean mutational fitness effect = −0.16%). In fact, in some lines, fitness became so low that it was no longer possible to reliably measure (Figure 3). These data are included in Figure 2 to prevent bias, but excluded from subsequent analyses. The average fitness of bottlenecked nonhypermutator control lines did not change significantly over the course of the experiment (ANOVA: P = 0.712, F₁,₁₁₈ = 0.137), indicating that the loss of fitness in hypermutator lines was due to mutation accumulation.

Figure 3

Fitness trajectories for individual mutation accumulation lines. The mean (± SE; n = 6) fitness of individual hypermutator lines through time. Red data points indicate that fitness is too low to measure accurately. The y-axis of each plot is scaled differently to maximize the resolution of evolutionary dynamics within a single line.

Figure 2

Average fitness decays in mutation accumulation lines. Plotted points show the mean fitness (± SE) of hypermutator lines (solid symbols, n = 8) and control lines (shaded symbols, n = 4) that were passaged through 28 daily bottlenecks, which correspond to ∼644 generations of mutation accumulation. The fitness of hypermutator lines rapidly declined, but the fitness of control lines did not change over the course of the experiment (ANOVA: F₁,₃ = 0.436, P = 0.556). Note that in some MA lines, fitness decayed to the point where it was not possible to measure fitness reliably, but these data are included to prevent bias.

Fitness data

Unlike the linear decrease observed for average fitness, the evolutionary trajectory of individual lines was much more complex (Figure 3). The net change in the fitness of MA lines ranged from −1 to −27% (mean: −13% ± 9 SD). A large portion of the net decrease in fitness of each line was due to a single drop between adjacent time points (we hereafter refer to a pair of adjacent time points as a “step”). Specifically, on average, 42.2% (± 12.9% SD) of the total decrease in fitness between the first and last time point in an individual MA line (excluding any beneficial steps) was due to the largest deleterious step in that line. Furthermore, the four most deleterious steps across all lines accounted for 42.3% of the total fitness decrease throughout the entire experiment. To determine whether these large drops in fitness were caused by (1) the accumulation of a greater number of mutations than other steps or (2) the accumulation of mutations of larger effect, we performed whole genome sequencing on the four largest deleterious steps across all MA lines, as well as on an exceptionally large deleterious step, which caused the fitness of its MA line to drop to an undetectable level. These steps did not contain a significantly greater number of mutations than the remaining steps (mean of five largest steps: 9.0 mutations, mean of remainder: 7.9 mutations, paired t-test: P = 0.285, t₄ = 1.235). However, these large deleterious steps showed a significantly higher frequency of mutations in highly conserved core genes than other steps (χ² goodness-of-fit test: P = 0.049, χ²₁ = 3.882; Table S2). Therefore, large drops in fitness are due to mutations in more important genes rather than due to a greater number of mutations. Although the average fitness effect of a step was deleterious, there were numerous steps in which fitness increased (Figure 4).
To confirm the presence of steps containing beneficial mutations, we repeated the competitive fitness assays for the 11 steps with the largest increases in fitness. Even after false discovery rate correction (Benjamini and Hochberg 1995), fitness increased significantly (P < 0.05) in 6/89 (6.7%) of the measurable steps. Because steps where fitness increased were rare, it is likely that each of these steps only contained a single beneficial mutation. This implies that at least six beneficial mutations were fixed during the mutation accumulation experiment, which corresponds to 0.64% of all mutations that were fixed during the experiment.

Figure 4

Changes in fitness for individual “steps.” The distribution of fitness changes for each step in the mutation accumulation experiment across all eight hypermutator lines. Each step represents the difference in fitness between successive assays for an MA line (∼8.4 mutations accumulated/step). The solid line depicts no change in fitness and the area between the dashed shaded lines is the area in which Ne × s < 1, where Ne is the harmonic mean of population size over time (although this may be an underestimate) (Otto and Whitlock 1997).

Signatures of natural selection

Selection on base substitutions in protein coding genes:

The vast majority of protein-altering base substitutions were nonsynonymous mutations, but the ratio of the rate of nonsynonymous mutations to silent mutations (dN/dS = 1.08) did not differ significantly from the neutral expectation of 1 (Table 1; Z-test: Z = 0.92, P = 0.26). We observed only a single loss-of-stop mutation, but this was similar to our predicted number of 1.4. Truncation mutations that introduce a premature stop codon were much more frequent (n = 14), but this was not significantly different from the neutral expectation of nine truncation mutations (Z-test: Z = 1.63, P = 0.10).

Table 1

Testing for selection on single base pair substitutions

Protein effect	Observed	Expected
Nonsynonymous	480	444.38
Intergenic	80	84.33
Stop-gain	14	8.94
Stop-loss	1	1.41

The number of observed single base pair substitutions relative to the neutral expectation, as determined from the synonymous mutation rate and genome composition of P. aeruginosa. The observed number of mutations does not differ from the neutral expectation for any functional category of mutation.

Selection on coding and noncoding regions:

Protein coding sequence accounts for 89.4% of the P. aeruginosa genome and so we expected that if no natural selection has occurred during the MA experiment then ∼89.4% of mutations will have occurred in protein coding sequences. We found that the percentage of mutations (short indels and base substitutions) that occurred in coding regions was 85.4% (804/942), which was significantly different from the neutral expectation of 89.4% (χ² goodness-of-fit test: P < 0.001, χ²₁ = 15.888). Interesting patterns arose when we analyzed base substitutions and short indels separately. We found that the percentage of base substitutions in coding regions (89.6%, 697/778) and intergenic regions (10.41%, 81/778) was not significantly different from the neutral expectation (χ² goodness-of-fit test: P = 0.833, χ²₁ = 0.045). This result may be confounded because intergenic regions contain a larger proportion of repetitive DNA than coding regions (intergenic: 7.2%, coding: 3.1%), but when we restricted our analysis to repetitive regions we still observed that the percentage of base substitutions that fell in coding (4.3%) and intergenic (16.5%) repetitive regions did not differ from the neutral expectation (χ² goodness-of-fit test: P = 0.181, χ²₁ = 1.791).
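The goodness-of-fit comparison for coding versus noncoding mutations can be sketched as a two-class χ² test with one degree of freedom. The function below is an illustrative implementation, not the authors' R code. Because it uses the rounded coding fraction of 89.4%, the statistic it returns is close to, but not identical to, the reported value.

```python
import math

def chi2_gof_two_classes(observed_a, observed_b, expected_frac_a):
    """Chi-square goodness-of-fit statistic and P-value for two classes (1 df)."""
    n = observed_a + observed_b
    exp_a = n * expected_frac_a
    exp_b = n - exp_a
    chi2 = (observed_a - exp_a) ** 2 / exp_a + (observed_b - exp_b) ** 2 / exp_b
    # Survival function of a chi-square variable with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# 804 of 942 short indels and base substitutions fell in coding regions.
chi2, p = chi2_gof_two_classes(observed_a=804, observed_b=942 - 804,
                               expected_frac_a=0.894)
print(round(chi2, 2), p < 0.001)
```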

Selection on indels:

In contrast to base substitutions, we found significantly fewer indels in coding regions than expected (observed: 107/164 = 65.2%; expected: 89.4%; χ² goodness-of-fit test: P < 0.0001, χ²₁ = 100.236). Again, this difference could be confounded because intergenic regions contain a larger proportion of indel-prone repetitive DNA, but we also found significantly fewer indels in repetitive coding regions (observed: 103/160 = 64.4%; expected: 78.2%; χ² goodness-of-fit test: P < 0.0001, χ²₁ = 17.920) than expected in the absence of selection. This indicates strong purifying selection against frameshift mutations. In fact, these data suggest that at least 49.6% of frameshift mutations are sufficiently deleterious to be removed by natural selection, even under a regime of intense bottlenecking. Despite selection against frameshift mutations, we still found 106 frameshifts in our experiment. Almost all of them (101/106 = 95.3%) overlapped with homopolymeric tracts of C (ranging from 4C to 8C) or G (ranging from 5G to 8G). There were significantly more frameshifts located near the N terminus of the protein than expected, given the distribution of homopolymeric tracts in the P. aeruginosa genes (Figure 5; one-sided exact binomial test: P = 0.037). We found no significant difference for frameshifts near the middle (one-sided exact binomial test: P = 0.453) or near the C terminus of the protein (one-sided exact binomial test: P = 0.063).

Figure 5

The distribution of indel mutations in proteins. Comparison between the observed and expected position of frameshifts in coding regions. Proteins were divided into three equal pieces and we counted the number of frameshifts (overlapping with homopolymeric tracts) that fell in each section. Expected frequencies were computed by counting the number of homopolymeric tracts in the P. aeruginosa PAO1 proteome that fall in each section. The differences between observed and expected values were statistically significant for the N-terminal third of proteins (one-sided exact binomial test: P = 0.037).

Tests for parallel evolution:

Previous work has shown that exposing replicate microbial populations to a similar selective pressure results in parallel adaptation at a molecular level in both lab experiments (Wichman ; Segrè ; Barrick ) and clinical populations (Huse ; Lieberman ). To test for parallel evolution at the level of individual genes, we compared the distribution of the number of mutations fixed per gene in the eight MA lines, with the distribution expected based on the lengths of the genes in the P. aeruginosa genome (Figure S1; see Materials and Methods for details on calculating the expected distribution). We found no deviation from the expected distribution for synonymous mutations (χ² goodness-of-fit test: P = 0.643, χ²₂ = 0.883). On the other hand, we found significantly fewer parallel nonsynonymous mutations than expected (χ² goodness-of-fit test: P < 0.0001, χ²₂ = 19.302), which does not support the hypothesis that natural selection was capable of causing parallel evolution on the genomic scale in these MA lines. Rather, longer genes simply had more mutations than smaller genes (Figure S2): genes with one or more mutations were significantly longer than genes without mutations (Kolmogorov–Smirnov test, P < 0.001). It is also possible that parallel evolution could act on levels higher than the gene. We analyzed our mutation data for evidence of over- or underenrichment of mutations in COGs (genes that share a common function). After false discovery rate correction (Benjamini and Hochberg 1995), we found a significant underrepresentation of mutated genes involved in transcription (Table S3; Fisher’s exact test: P = 0.023, Fisher’s odds ratio₁ = 0.530), suggesting that mutations in these genes tend to have highly deleterious effects.

Core genes:

We observed that large drops in fitness during the MA experiment were associated with the accumulation of mutations in core genes (Figure 2), and so we sought to determine whether natural selection was effective against mutations in these genes. Surprisingly, there was no significant underrepresentation of mutations in core genes (Fisher’s exact test: P = 0.611, Fisher’s odds ratio₁ = 1.051) despite their potentially large deleterious effects on fitness.

Discussion

Mutations are rare events that often lead to small changes in fitness, and these properties of mutations make it intrinsically difficult to directly study the evolutionary consequences of mutation. Our experiment, which combined a classic mutation accumulation experiment with powerful whole genome resequencing technology, found that 42.3% of the decrease in fitness in our lines was driven by the 4.5% of steps with highly deleterious effects on fitness. Given the rarity of large drops in fitness, the most parsimonious explanation is that each one of these drops was driven by a single highly deleterious mutation. Under this assumption, 42.3% of the decrease in fitness in our experiment was driven by 0.5% of the mutations fixed, which is consistent with previous work in Caenorhabditis elegans (Davies ). The mean mutational effect, s = −1.6 × 10⁻³, is similar to previous work in Saccharomyces cerevisiae (s = −6 × 10⁻³), in which whole genome resequencing and MA were combined (Lynch ) and, as expected, is approximately one to two orders of magnitude smaller than previous microbial MA studies that did not use whole genome resequencing and were therefore unable to detect neutral mutations (Halligan and Keightley 2009; Trindade ). We also found evidence of both positive and negative selection in our MA experiment, demonstrating that the results of our experiment cannot be interpreted as a proxy for the effects of spontaneous mutation alone.

Beneficial mutations

Previous studies in Arabidopsis thaliana (Shaw ), Escherichia coli (Perfeito ; Trindade ), Streptococcus pneumoniae (Stevens and Sebert 2011), and S. cerevisiae (Joseph and Hall 2004; Dickinson 2008) have also found evidence that beneficial mutations are fixed during mutation accumulation experiments. Our experimental approach allowed us to demonstrate that it is highly likely that at least 0.64% of the mutations that fixed during our MA experiment were beneficial. For these mutations to have been fixed by drift, the beneficial mutation rate in a nonhypermutator population with a genomic mutation rate of 3 × 10⁻³ mutations/genome/generation would have to have been ∼5 × 10⁻⁶ mutations/genome/generation, which is two to three orders of magnitude higher than existing estimates (Gerrish and Lenski 1998; Miralles ; Imhof and Schlotterer 2001; Rozen ; Barrett ; but for exceptions, see Perfeito ). Instead, we argue that positive selection was able to drive the fixation of beneficial mutations in our experiment. Consistent with this idea, five of the six significantly beneficial mutations that fixed were sufficiently beneficial that Nₑs was >1.
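The Nₑs > 1 criterion can be illustrated with a short sketch based on Kimura's diffusion approximation for the fixation probability of a new mutation in a haploid population. This is a textbook approximation swapped in for illustration, not a calculation taken from the paper, and the effective population size and selection coefficients in the usage example are hypothetical.

```python
import math

def fixation_probability(s, n_e):
    """Approximate fixation probability of a single new mutation.

    Assumption: haploid population of constant effective size n_e, using
    Kimura's diffusion result (1 - exp(-2s)) / (1 - exp(-2 * n_e * s)).
    For s = 0 this reduces to the neutral expectation 1 / n_e.
    """
    if s == 0:
        return 1.0 / n_e
    # expm1 keeps precision when s is very small.
    return math.expm1(-2.0 * s) / math.expm1(-2.0 * n_e * s)

# Hypothetical values: n_e = 20, so s = 0.1 gives n_e * s = 2 (> 1).
n_e = 20
neutral = fixation_probability(0.0, n_e)
beneficial = fixation_probability(0.1, n_e)
deleterious = fixation_probability(-0.1, n_e)
```

Under these assumptions, a beneficial mutation with Nₑs > 1 fixes several times more often than a neutral one, and an equally strong deleterious mutation fixes far less often, which is the sense in which selection remains effective despite bottlenecking.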

Tests for selection at a molecular level

In agreement with recent microbial mutation accumulation experiments that have used whole genome resequencing, we found no evidence of selection on base substitutions, including nonsynonymous mutations (Lynch ; Lee ; Ness ; Sung ,b; Long ). Additionally, we found no evidence of positive selection on the same genes in different MA lines. Surprisingly, we found that nonsynonymous mutations in highly conserved core genes can have strong deleterious effects on fitness (Figure 2), and yet we found no evidence that these mutations were removed by natural selection. The most striking evidence of selection at a genetic level comes from the lack of short indel mutations in coding regions. We estimate that negative selection prevented the fixation of at least 50% of indels in coding regions. In contrast, we did not find any evidence of an underrepresentation of base substitutions that generated a premature stop codon, implying that the absence of indels in coding regions is due to selection against frameshifts, and not selection against gene loss. Despite strong selection, we still found that frameshifts comprise 13.2% of all mutations in coding regions. This high incidence of frameshifting could be because 95.3% of frameshifts overlapped with homopolymeric tracts. Homopolymeric tracts are hypermutable: they are highly prone to gaining or losing repeats through slippage, thereby producing indels. Consistent with recent work (Orsi ; Lin and Kussell 2012), we found a significant overrepresentation of frameshifts at the 5′ end of genes and underrepresentation at the 3′ end (given the distribution of homopolymeric tracts in the PAO1 genome). Although the reason for the enrichment of 5′ frameshifts is unclear, possible explanations include: (1) 5′ frameshifts tend to create shorter proteins and thus may be less prone to forming toxic aggregations; (2) intergenic regions in P. aeruginosa are very short and 3′ indels may knock out downstream genes; and/or (3) 5′ indels are more likely to destroy gene function, which may be beneficial in some circumstances. For example, Moxon have proposed that simple sequence repeats (such as homopolymeric tracts) are localized hypermutation targets and a mechanism for adaptation. Moreover, standing genetic variation in homopolymeric tracts has been shown to drive the adaptation of Campylobacter jejuni to a novel host (Jerome ).
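The overlap between frameshifts and homopolymeric tracts discussed above can be sketched as follows. The minimum tract length, the coordinate convention, and the example sequence are assumptions chosen for illustration; the thresholds actually used in the study are not stated in this section.

```python
import re

def homopolymer_tracts(seq, min_len=4):
    """Return (start, end) spans (0-based, end-exclusive) of single-base runs.

    min_len is a hypothetical threshold, not a value taken from the paper.
    """
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [m.span() for m in pattern.finditer(seq.upper())]

def is_frameshift(indel_length):
    """An indel in a coding region shifts the frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0

def overlaps_tract(position, tracts):
    """True if a 0-based indel position falls inside any tract."""
    return any(start <= position < end for start, end in tracts)

# Hypothetical example: a run of five G's starting at index 2.
sequence = "ACGGGGGTAC"
tracts = homopolymer_tracts(sequence)
hit = overlaps_tract(4, tracts) and is_frameshift(1)
```

Applying such a check to every coding-region indel would give the fraction of frameshifts that coincide with homopolymeric tracts, the quantity reported as 95.3% in the text.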

Implications for mutation accumulation experiments

It is important to emphasize that our experiment differed from most previous MA experiments because we used a hypermutator strain. To what extent is this likely to have biased our results? Hypermutators produce an altered spectrum of spontaneous mutations (e.g., bias toward transitions), which can have important evolutionary implications when strong selection acts on a small number of sites in the genome (Couce ) (e.g., some cases of high-level antibiotic resistance). In our system, frameshifts experienced much stronger selection than any other class of mutation, and it is possible that using a ΔmutS hypermutator altered the rate of appearance of indel mutations relative to base substitutions (Marvig ). However, by using a hypermutator we were able to detect a sufficiently large number of mutations to analyze the effects of relatively rare types of mutation, such as indels, which have traditionally been overlooked in MA studies.

Conclusion

In conclusion, we find that fitness decays in recurrently bottlenecked populations of hypermutator P. aeruginosa because of the fixation of many weakly deleterious mutations and a few highly deleterious mutations. We argue that this pattern of punctuated decay of fitness arises for two reasons. First, most mutations carry little, if any, fitness cost in a laboratory environment, but a small fraction of mutations are highly deleterious. Our results suggest that weakly deleterious mutations tend to be intergenic and nonsynonymous mutations, while highly deleterious mutations tend to be indels and mutations in core genes. Second, we find that recurrent bottlenecking does not completely compromise the efficacy of natural selection in microbial mutation accumulation experiments, although large deleterious mutations are unlikely to play a substantial role in the evolution of natural populations. We hope that this study will pave the way for future work aimed at understanding: (1) why frameshift mutations are subject to such strong selection, (2) how bacteria adapt to the deleterious effects of spontaneous mutations, and (3) how the molecular basis of spontaneous mutation is linked to the fitness effects of mutations in natural populations.

56 in total

Review 1. Mutation and the evolution of ageing: from biometrics to system genetics.

Authors: Kimberly A Hughes
Journal: Philos Trans R Soc Lond B Biol Sci Date: 2010-04-27 Impact factor: 6.237

2. Evolution of the mutation rate.

Authors: Michael Lynch
Journal: Trends Genet Date: 2010-06-30 Impact factor: 11.639

3. Rate and effects of spontaneous mutations that affect fitness in mutator Escherichia coli.

Authors: Sandra Trindade; Lilia Perfeito; Isabel Gordo
Journal: Philos Trans R Soc Lond B Biol Sci Date: 2010-04-27 Impact factor: 6.237

4. Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads.

Authors: Kai Ye; Marcel H Schulz; Quan Long; Rolf Apweiler; Zemin Ning
Journal: Bioinformatics Date: 2009-06-26 Impact factor: 6.937

5. Homopolymeric tracts represent a general regulatory mechanism in prokaryotes.

Authors: Renato H Orsi; Barbara M Bowen; Martin Wiedmann
Journal: BMC Genomics Date: 2010-02-09 Impact factor: 3.969

6. Parallel evolution in Pseudomonas aeruginosa over 39,000 generations in vivo.

Authors: Holly K Huse; Taejoon Kwon; James E A Zlosnik; David P Speert; Edward M Marcotte; Marvin Whiteley
Journal: MBio Date: 2010-09-21 Impact factor: 7.867

7. BreakDancer: an algorithm for high-resolution mapping of genomic structural variation.

Authors: Ken Chen; John W Wallis; Michael D McLellan; David E Larson; Joelle M Kalicki; Craig S Pohl; Sean D McGrath; Michael C Wendl; Qunyuan Zhang; Devin P Locke; Xiaoqi Shi; Robert S Fulton; Timothy J Ley; Richard K Wilson; Li Ding; Elaine R Mardis
Journal: Nat Methods Date: 2009-08-09 Impact factor: 28.547

8. A framework for variation discovery and genotyping using next-generation DNA sequencing data.

Authors: Mark A DePristo; Eric Banks; Ryan Poplin; Kiran V Garimella; Jared R Maguire; Christopher Hartl; Anthony A Philippakis; Guillermo del Angel; Manuel A Rivas; Matt Hanna; Aaron McKenna; Tim J Fennell; Andrew M Kernytsky; Andrey Y Sivachenko; Kristian Cibulskis; Stacey B Gabriel; David Altshuler; Mark J Daly
Journal: Nat Genet Date: 2011-04-10 Impact factor: 38.330

9. Standing genetic variation in contingency loci drives the rapid adaptation of Campylobacter jejuni to a novel host.

Authors: John P Jerome; Julia A Bell; Anne E Plovanich-Jones; Jeffrey E Barrick; C Titus Brown; Linda S Mansfield
Journal: PLoS One Date: 2011-01-24 Impact factor: 3.240

10. Pseudomonas Genome Database: improved comparative analysis and population genomics capability for Pseudomonas genomes.

Authors: Geoffrey L Winsor; David K W Lam; Leanne Fleming; Raymond Lo; Matthew D Whiteside; Nancy Y Yu; Robert E W Hancock; Fiona S L Brinkman
Journal: Nucleic Acids Res Date: 2010-10-06 Impact factor: 16.971

32 in total

1. The Rate and Molecular Spectrum of Spontaneous Mutations in the GC-Rich Multichromosome Genome of Burkholderia cenocepacia.

Authors: Marcus M Dillon; Way Sung; Michael Lynch; Vaughn S Cooper
Journal: Genetics Date: 2015-05-12 Impact factor: 4.562

2. Mutational Landscape of Spontaneous Base Substitutions and Small Indels in Experimental Caenorhabditis elegans Populations of Differing Size.

Authors: Anke Konrad; Meghan J Brady; Ulfar Bergthorsson; Vaishali Katju
Journal: Genetics Date: 2019-05-20 Impact factor: 4.562

Review 3. Experimental Design, Population Dynamics, and Diversity in Microbial Experimental Evolution.

Authors: Bram Van den Bergh; Toon Swings; Maarten Fauvart; Jan Michiels
Journal: Microbiol Mol Biol Rev Date: 2018-07-25 Impact factor: 11.056

4. The Fitness Effects of Spontaneous Mutations Nearly Unseen by Selection in a Bacterium with Multiple Chromosomes.

Authors: Marcus M Dillon; Vaughn S Cooper
Journal: Genetics Date: 2016-09-26 Impact factor: 4.562

5. Life cycles, fitness decoupling and the evolution of multicellularity.

Authors: Katrin Hammerschmidt; Caroline J Rose; Benjamin Kerr; Paul B Rainey
Journal: Nature Date: 2014-11-06 Impact factor: 49.962

6. Are mutations usually deleterious? A perspective on the fitness effects of mutation accumulation.

Authors: Kevin Bao; Robert H Melde; Nathaniel P Sharp
Journal: Evol Ecol Date: 2022-06-21 Impact factor: 2.074

7. Reconstructed ancestral enzymes reveal that negative selection drove the evolution of substrate specificity in ADP-dependent kinases.

Authors: Víctor Castro-Fernandez; Alejandra Herrera-Morande; Ricardo Zamora; Felipe Merino; Felipe Gonzalez-Ordenes; Felipe Padilla-Salinas; Humberto M Pereira; Jose Brandão-Neto; Richard C Garratt; Victoria Guixe
Journal: J Biol Chem Date: 2017-07-18 Impact factor: 5.157

8. Accelerating Mutational Load Is Not Due to Synergistic Epistasis or Mutator Alleles in Mutation Accumulation Lines of Yeast.

Authors: Jean-Nicolas Jasmin; Thomas Lenormand
Journal: Genetics Date: 2015-11-23 Impact factor: 4.562

9. Evolution of Pseudomonas aeruginosa toward higher fitness under standard laboratory conditions.

Authors: Igor Grekov; Janne Gesine Thöming; Adrian Kordes; Susanne Häussler
Journal: ISME J Date: 2020-12-03 Impact factor: 10.302

10. Self-selection of evolutionary strategies: adaptive versus non-adaptive forces.

Authors: Matthew Putnins; Ioannis P Androulakis
Journal: Heliyon Date: 2021-05-15