Spontaneous Mutation Rate in the Smallest Photosynthetic Eukaryotes.

Marc Krasovec1, Adam Eyre-Walker2, Sophie Sanchez-Ferandin1, Gwenael Piganeau1.   

Abstract

Mutation is the ultimate source of genetic variation, and knowledge of mutation rates is fundamental for our understanding of all evolutionary processes. High throughput sequencing of mutation accumulation lines has provided genome wide spontaneous mutation rates in a dozen model species, but estimates from nonmodel organisms from much of the diversity of life are very limited. Here, we report mutation rates in four haploid marine bacterial-sized photosynthetic eukaryotic algae: Bathycoccus prasinos, Ostreococcus tauri, Ostreococcus mediterraneus, and Micromonas pusilla. The spontaneous mutation rate between species varies from μ = 4.4 × 10−10 to 9.8 × 10−10 mutations per nucleotide per generation. Within genomes, there is a two-fold increase of the mutation rate in intergenic regions, consistent with an optimization of mismatch and transcription-coupled DNA repair in coding sequences. Additionally, we show that deviation from the equilibrium GC content increases the mutation rate by ∼2% to ∼12% because of a GC bias in coding sequences. More generally, the difference between the observed and equilibrium GC content of genomes explains some of the inter-specific variation in mutation rates.
© The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

Keywords:  GC bias; GC content; Mamiellophyceae; deletion bias; mutation accumulation; phytoplankton; spontaneous mutation rate; transcription-coupled DNA repair

Year:  2017        PMID: 28379581      PMCID: PMC5455958          DOI: 10.1093/molbev/msx119

Source DB:  PubMed          Journal:  Mol Biol Evol        ISSN: 0737-4038            Impact factor:   16.240


Introduction

Mutations are responsible for genetic variation between organisms, which permits adaptation by natural selection. Thus, estimation of mutation rates (μ) is paramount for a better understanding of all evolutionary change. Because mutations are rare events, estimating their rate was difficult until recently. However, new high throughput sequencing technologies have allowed the investigation of mutations from either offspring-parent trios, in humans (Abecasis et al. 2010; Conrad et al. 2011) and mice (Adewoye et al. 2015; Uchimura et al. 2015), or mutation accumulation (MA) experiments (Halligan and Keightley 2009; Lynch et al. 2016) in organisms such as Drosophila melanogaster (Haag-Liautard et al. 2007; Keightley et al. 2009,2014a), Arabidopsis thaliana (Ossowski et al. 2010), Caenorhabditis elegans (Denver et al. 2004, 2009, 2012), unicellular eukaryotes such as Saccharomyces cerevisiae (Wloch et al. 2001; Lang and Murray 2008; Lynch et al. 2008; Zhu et al. 2014) and bacteria such as Escherichia coli (Lee et al. 2012) and Salmonella typhimurium (Lind and Andersson 2008). These studies have revealed a large variation of spontaneous mutation rates across the tree of life from 7.6 × 10−12 in Tetrahymena thermophila (Long et al. 2016) to 1.1 × 10−8 in humans (Abecasis et al. 2010; Conrad et al. 2011). The variation of mutation rate between species appears to be correlated to two factors—genome size (Drake 1991; Drake et al. 1998), and in particular the size of the protein coding component of the genome (Lynch 2010; Lynch et al. 2016), and effective population size (Lynch 2010; Sung et al. 2012a). Both of these correlations may arise because of the limitations that genetic drift imposes on selection to minimize the mutation rate (Lynch 2010; Sung et al. 2012a). 
In sexual species, selection always acts to minimize the mutation rate because a modifier of the mutation rate only stays linked with the mutations it causes for a short period of time and deleterious mutations are more prevalent than advantageous mutations, increasing the genetic load (Leigh 1973). However, genetic drift ultimately limits the degree to which the mutation rate can be reduced (Lynch 2010), because the strength of selection acting on a modifier is equal to γ*U*s, where γ is the proportional decrease in the mutation rate, U is the genomic rate of mutation and s is the average strength of selection against deleterious mutations. If γ*U*s < 1/Ne then selection will be ineffective against the modifier and the mutation rate cannot be reduced further. Hence we expect the per site rate of mutation to depend upon the effective population size (Lynch 2010), since species with larger Ne should have lower mutation rates, and upon genome size, since the more selected sites there are, the lower the mutation rate should be. These predictions appear to be largely upheld (Lynch et al. 2016). In asexual species, selection will favor an intermediate mutation rate, which generates sufficient advantageous mutations while not generating too many deleterious mutations (Paland and Lynch 2006; Henry et al. 2012). However, relatively few species appear to be truly asexual.
It has also been observed that there is variation in the mutation rate within a genome at a number of different scales, from differences between chromosomes, to variation between regions on a chromosome and variation between adjacent sites (Hodgkinson and Eyre-Walker 2011; Schrider et al. 2011; Ségurel et al. 2014). As an example, the Y-chromosome in humans and chimps mutates faster than the other chromosomes (Ebersberger et al. 2002). It is also known that the mitochondrial genome has a higher mutation rate than the nuclear genome in Caenorhabditis elegans (Denver et al. 2000, 2009), Homo sapiens (Rebolledo-Jaramillo et al. 2014) and Drosophila melanogaster (Haag-Liautard et al. 2008; Keightley et al. 2009). Within chromosomes, it has been shown that nucleotide context affects the mutability of a site in Chlamydomonas reinhardtii (Ness et al. 2015b), Bacillus subtilis (Sung et al. 2015), and humans (Aggarwala and Voight 2016). In mammals, the most conspicuous effect is the high mutability of CpG dinucleotides resulting from cytosine deamination (Coulondre et al. 1978; Fryxell and Zuckerkandl 2000), which leads to an 80% reduction in the frequency of CpG dinucleotides in the human genome (Lander et al. 2001). Gene expression also affects the rate of mutation, but the direction of its effect is controversial. Martincorena et al.’s (2012) analysis of polymorphisms suggested that the mutation rate in Escherichia coli is lower in highly expressed genes. However, analysis of MA lines in Escherichia coli suggests that the mutation rate actually increases with gene expression (Chen and Zhang 2013). Such a pattern has been observed in two other species, Saccharomyces cerevisiae and humans (Polak and Arndt 2008; Park et al. 2012). This phenomenon is known as transcription-associated mutagenesis (TAM) (Kim and Jinks-Robertson 2012).
In this study, we provide the first estimates of the spontaneous mutation rate in four species of haploid green algae [Chlorophyta, Mamiellophyceae (Marin and Melkonian 2010)]: Ostreococcus tauri RCC4221 (Blanc-Mathieu et al. 2014), O. mediterraneus RCC2590 (Subirana et al. 2013), Micromonas pusilla RCC299 (Worden et al. 2009), and Bathycoccus prasinos RCC1105 (Moreau et al. 2012), with compact genomes containing 83–84% coding sequences. Green algae constitute one of the most important photosynthetic groups on Earth, with a ubiquitous distribution in the world’s oceans (de Vargas et al. 2015; Vannier et al. 2016), and play a fundamental role in food webs and biogeochemical cycles (Collins et al. 2014).
These green algae span a large evolutionary divergence, as revealed by a high proportion of species-specific genes, and high amino-acid divergence between orthologous genes (Šlapeta et al. 2006; Jancek et al. 2008). Their genome size ranges from 13 to 21 Mb and their average GC content ranges from 48% to 63%.
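The drift-barrier condition stated earlier in the Introduction (selection on a mutation-rate modifier is ineffective when γ*U*s < 1/Ne) can be illustrated with a few lines of Python. This is only a sketch: the function name and all numerical values below are hypothetical and are not taken from the study.

```python
def selection_is_effective(gamma, U, s, Ne):
    """Return True when selection on a mutation-rate modifier exceeds drift.

    gamma: proportional decrease in the mutation rate caused by the modifier
    U:     genomic mutation rate (mutations per genome per generation)
    s:     average strength of selection against deleterious mutations
    Ne:    effective population size
    """
    return gamma * U * s > 1.0 / Ne


# Hypothetical values for illustration only (not estimates from the paper).
gamma, U, s = 0.1, 0.01, 0.01            # gamma * U * s is about 1e-5
small_pop = selection_is_effective(gamma, U, s, Ne=1e4)   # drift term 1e-4
large_pop = selection_is_effective(gamma, U, s, Ne=1e7)   # drift term 1e-7
print(small_pop, large_pop)
```

With these made-up numbers the modifier is effectively neutral in the smaller population but visible to selection in the larger one, which is the qualitative expectation behind lower mutation rates in species with larger Ne.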

Results

Mutation Rates in Mamiellophyceae

To estimate spontaneous mutation rates, we sequenced 150 mutation accumulation lines in four species of green algae (supplementary table S1, Supplementary Material online, 36–40 lines per species). All together, we found 238 single nucleotide mutations and 48 indels, summarized in table 1. The numbers of synonymous and nonsynonymous mutations are as expected if mutations are randomly distributed across sites for all species (table 2), consistent with an absence of selection against nonsynonymous mutations. We thus assume that selection played a minimal role in the pattern of mutations in these MA experiments.
Table 1

Summary of Spontaneous Mutation Rates in Four Mamiellophyceae Species.

Species                     TotGen   G (Mb)   BS   Ins   Del   μbs (× 10−10)   μID (× 10−10)   μtot (× 10−10)
Ostreococcus tauri          17,250   12.46    91     5     8   4.19            0.60            4.79 (3.91–5.80)
Ostreococcus mediterraneus   8,380   13.34    54     3     8   4.92            1.00            5.92 (4.57–7.55)
Bathycoccus prasinos         4,145   14.96    22     5     5   3.02            1.37            4.39 (3.00–6.20)
Micromonas pusilla           4,994   20.99    71     2    12   8.15            1.61            9.76 (7.80–12.07)

Note.—BS is the number of base-substitution mutations, Ins the number of insertions and Del the number of deletions. G is the genome size in Mb and μ the mutation rate per nucleotide per genome per generation. TotGen is the total number of generations accumulated per species. Confidence interval for μ is given under the assumption of a Poisson distribution of the mutations.
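As a rough illustration of how a per-site rate and its Poisson confidence interval can be obtained from mutation counts such as those in Table 1, the sketch below computes an exact two-sided Poisson interval by bisection using only the standard library. The function names are hypothetical, and using the genome size G in place of the number of callable sites is a simplifying assumption, so the result will only approximate the published values.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), with each term computed in log space."""
    if lam <= 0.0:
        return 1.0
    return sum(math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
               for i in range(k + 1))


def _bisect(f, lo, hi, iters=200):
    """Root of a decreasing function f on [lo, hi], found by bisection."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def poisson_exact_ci(k, alpha=0.05):
    """Exact two-sided confidence interval for the mean of a Poisson count k."""
    search_max = 2.0 * k + 50.0
    # Lower limit solves P(X <= k - 1 | lam) = 1 - alpha / 2.
    lower = 0.0 if k == 0 else _bisect(
        lambda lam: poisson_cdf(k - 1, lam) - (1.0 - alpha / 2.0),
        0.0, search_max)
    # Upper limit solves P(X <= k | lam) = alpha / 2.
    upper = _bisect(lambda lam: poisson_cdf(k, lam) - alpha / 2.0,
                    0.0, search_max)
    return lower, upper


def mutation_rate(n_mut, sites, generations, alpha=0.05):
    """Rate per site per generation with its Poisson confidence limits."""
    denom = sites * generations
    lo, hi = poisson_exact_ci(n_mut, alpha)
    return n_mut / denom, lo / denom, hi / denom


# O. tauri counts from Table 1 (BS + Ins + Del); G stands in for callable sites.
mu, mu_lo, mu_hi = mutation_rate(91 + 5 + 8, 12.46e6, 17250)
print(f"{mu:.2e} ({mu_lo:.2e} to {mu_hi:.2e})")
```

The interval scales with the count alone, so the same helper applies to base substitutions or indels separately by passing the corresponding count.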

Table 2

Mutation Rate Variation between Coding and Intergenic Sequences.

Species            % coding genome   μ × 10−10 coding regions   μ × 10−10 intergenic regions   Numbers of mutations syn:nonsyn
O. tauri           81.6              3.9                        8.9                            19:42
O. mediterraneus   84.4              5.0                        11.7                           9:34
B. prasinos        83.1              3.4                        14.7                           5:10
M. pusilla         81.9              8.2                        16.1                           15:41

Note.—The bias of mutation toward intergenic sequences is significant (Chi-squared test, P-value < 0.01). Syn and nonsyn are the synonymous and nonsynonymous point mutations.
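The size of the intergenic excess can be read directly off Table 2 by dividing the intergenic rate by the coding rate. The short sketch below does this arithmetic; the dictionary and function names are illustrative, and the values are simply transcribed from the table.

```python
# Rates in units of 1e-10 mutations per site per generation, from Table 2:
# species -> (coding regions, intergenic regions)
TABLE2 = {
    "O. tauri": (3.9, 8.9),
    "O. mediterraneus": (5.0, 11.7),
    "B. prasinos": (3.4, 14.7),
    "M. pusilla": (8.2, 16.1),
}


def fold_increase(mu_coding, mu_intergenic):
    """Ratio of the intergenic to the coding mutation rate."""
    return mu_intergenic / mu_coding


folds = {species: fold_increase(*rates) for species, rates in TABLE2.items()}
for species, fold in folds.items():
    print(f"{species}: {fold:.1f}-fold")
```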

The base substitution mutation rate (μbs) and the insertion-deletion mutation rate (μID) per nucleotide per generation were estimated on callable sites (G*), which represented 97–99% of the complete genome sequence (supplementary table S2, Supplementary Material online). There is no difference in G* between coding and intergenic regions. The total mutation rate, μtot, is the sum of μbs and μID. Mutation rates varied over 2-fold from 4.4 × 10−10 mutations per site per generation in B. prasinos to 9.8 × 10−10 in M. pusilla. The complete list of mutations for each of the four MA experiments is provided in supplementary tables S3–S8, Supplementary Material online. There is no significant difference in mutation rates between chromosomes (Chi-squared test, ns).
The number of plastid genomes per cell is known to depend on tissue type in plants (Ma and Li 2015) and on growth and redox status in the freshwater green alga C. reinhardtii (Lau et al. 2000). The coverage of chloroplastic (cpDNA) and mitochondrial DNA (mtDNA) relative to the coverage of the nuclear genome can be taken as a proxy for cpDNA, mtDNA, and nuclear genome copy number stoichiometry in Mamiellophyceae. O. tauri and B. prasinos have on average a 2:3:1 ratio of mtDNA, cpDNA, and nuclear genome coverage, whereas these ratios are 3:4:1 in O. mediterraneus and 4:7:1 in M. pusilla RCC299. No mutation candidate was found in organellar genomes, even when the mutation detection threshold was lowered to take the genome copy number into account. This implies that the spontaneous mutation rates of organelles are less than 9.6 × 10−10 for the mitochondria and 4.8 × 10−10 for the chloroplast, so that we cannot estimate the ratio of nuclear to organellar mutation rates in Mamiellophyceae. The organelle mutation rates have been estimated to be lower than the nuclear mutation rates in higher plants (Wolfe et al. 1987; Smith 2015), while similar mutation rates in the chloroplast and the nuclear genome have recently been estimated in the green alga Chlamydomonas reinhardtii (Ness et al. 2015a).

Nonrandom Mutation Events in the Genome

The analysis of the distribution of mutations across these four species reveals significant deviations from a uniform distribution of mutations along the genome. First, mutation events tend to cluster within adjacent nucleotides: of our 238 base substitution mutations across all species, 32 occurred within 30 bp of another mutation. These clustered mutations probably represent single mutational events since all adjacent mutations are found within the same strain. These multinucleotide mutations (MNM) were not located in homopolymeric tracts. Second, there is an excess of mutations in intergenic regions, with a two-fold increase as compared to coding regions (Chi-squared test, P-value < 0.01) (table 2). In addition, transcription levels seem to affect mutability: in intergenic regions, mutated sites have on average 2.6–5.7 fold less RNAseq read coverage (for B. prasinos and O. tauri, respectively) than nonmutated sites (supplementary table S9, Supplementary Material online, Wilcoxon test, P-value < 0.001). However, there is no significant RNAseq coverage variation between mutated and nonmutated sites in coding sequences in B. prasinos and O. tauri. Third, there are significantly more deletions than insertions over the four experiments (Binomial test, P-value < 0.05). The deletion over insertion ratio is 2.2, and the average length of insertions versus deletions is 1.9 and 9.8 bp, respectively. This corresponds to a net loss ranging from 0.004 bp per genome per generation in O. tauri to 0.03 bp per genome per generation in M. pusilla. This deletion bias has been previously noted in species such as Chlamydomonas reinhardtii (Ness et al. 2015b), with a net loss per genome per generation of 0.022 bp (for 72.5% of the genome). Finally, mutations are over-represented in subtelomeric regions, defined as lying within 1 kb of the start of the AAACCCT telomere repeats (22 mutations identified, considering indels and substitutions). These mutations do not occur in homopolymeric tracts, and were identified in the four species. The overrepresentation of mutations in subtelomeric regions is highly significant in O. tauri (10 vs. 94 mutations in subtelomeric versus nonsubtelomeric regions, Fisher exact test, P-value < 10−12), B. prasinos (8 vs. 24 mutations, Fisher exact test, P-value < 10−12) and M. pusilla (4 vs. 81 mutations, Fisher exact test, P-value < 10−4). It is nonsignificant in O. mediterraneus (1 vs. 64 mutations in subtelomeric versus nonsubtelomeric regions).
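The deletion bias reported above can be checked with an exact binomial test on the insertion and deletion counts of Table 1. The sketch below is a standard-library implementation of a two-sided exact test under the null hypothesis that insertions and deletions are equally likely; it is not necessarily the exact procedure used by the authors, and the function name is illustrative.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial test: total probability of all outcomes
    that are no more likely than the observed count k."""
    pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)   # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= threshold))


deletions = 8 + 8 + 5 + 12    # Del column of Table 1, summed over species
insertions = 5 + 3 + 5 + 2    # Ins column of Table 1, summed over species
p_value = binomial_two_sided_p(deletions, deletions + insertions)
ratio = deletions / insertions
print(f"deletion:insertion ratio = {ratio:.1f}, P = {p_value:.3f}")
```

The pooled counts (33 deletions, 15 insertions) reproduce the deletion over insertion ratio of 2.2 quoted in the text.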

The Direction of Base-Substitution Mutations

Mutations between the four nucleotides are biased from GC to AT (fig. 1). The equilibrium GC content (GCeq), that is the GC content at which the number of mutations from GC to AT equals the number from AT to GC, is substantially lower than the observed GC content (GCobs) in all Mamiellophyceae (GCeq = 36.8% vs. GCobs = 59.0% for O. tauri, 43.5% vs. 56.0% for O. mediterraneus, 46.2% vs. 63.8% for M. pusilla, and 36.8% vs. 48.0% for B. prasinos). This discrepancy implies either that the mutational spectrum has recently changed in all four species, or that nonmutational processes are acting to maintain the GC content above its mutational equilibrium value. Interestingly, one chromosome has a GC content that is 10 percentage points lower than the others in Mamiellophyceae: 51.3% in M. pusilla, 49.9% in O. mediterraneus, 54.3% in O. tauri, and 41.9% in B. prasinos (Piganeau et al. 2011). Despite this variation, no significant difference in the distribution of GC to AT and AT to GC mutations between high and low GC content chromosomes was observed. Surprisingly, this GC bias seems to be stronger in coding sequences: the mutation rate from GC to AT is higher than the mutation rate from AT to GC in coding regions, whereas it is not significantly biased in intergenic regions (supplementary table S10, Supplementary Material online).
Fig. 1

Mutation patterns in the four species. GC to AT bias is significant in O. tauri and M. pusilla (Binomial test, P = 0.0001 and 0.02, respectively).

Inter-genomic Variation in the Mutation Rate

Several ecological and biological factors have been proposed to explain the variation in the mutation rate between species, such as genome size and effective population size (Lynch et al. 2016 for a review). We compiled the available estimates of the spontaneous mutation rate from the literature (supplementary table S11, Supplementary Material online), and combined them with the mutation rate estimates from this study. There is a significant decrease of the mutation rate with the effective population size (n = 18, Pearson correlation, ρ = −0.78, P  < 0.0001) (fig. 2, supplementary table S11, Supplementary Material online), and the one Mamiellophyceae for which we have diversity data (O. tauri), and hence an estimate of Ne fits in with this pattern. This negative correlation supports the drift barrier hypothesis (Lynch 2010; Sung et al. 2012; Lynch et al. 2016). There is a negative correlation between genome size (G) and mutation rate (μ) in bacteria (n = 10, Pearson correlation, ρ = −0.95, P  = 0.001) and all micro-organisms (n = 19, Pearson correlation, ρ = −0.57, P  = 0.01). In contrast, the correlation between G and μ in multicellular eukaryotes is positive (supplementary fig. S1 and table S11, Supplementary Material online) (n = 10, Pearson correlation, ρ = 0.67, P  < 0.03).
Fig. 2

Mutation rates versus effective population size (n = 18, Pearson correlation, ρ = −0.78, P < 0.0001, data from supplementary table S11, Supplementary Material online).

A biased mutation pattern can potentially cause differences between the observed genomic GC content and the equilibrium GC content (supplementary table S12, Supplementary Material online). This in turn can alter the mutation rate. For example, if the mutation pattern is biased towards AT, then elevating the GC content above its mutational equilibrium raises the mutation rate above its value at equilibrium, μeq. In Mamiellophyceae, the deviation from the equilibrium GC content leads to a modest increase in the mutation rate, from 2% (O. mediterraneus) to 12% (O. tauri). However, using data from MA experiments in 20 other species (supplementary table S12, Supplementary Material online), the increase of μobs relative to μeq can be much larger: up to 64% in the eukaryote A. thaliana, and 160% in the prokaryote Mesoplasma florum. More generally, we find that the mutation rate is positively correlated with the ratio of the observed GC-content to its equilibrium value (fig. 3, Pearson correlation, n = 23, ρ = 0.55, P = 0.007).
Fig. 3

Correlation between mutation rate and the ratio of the observed versus the equilibrium GC-content (n = 23, Pearson correlation, ρ = 0.55, P = 0.007, data from supplementary table S12, Supplementary Material online).

Discussion

We have performed mutation accumulation experiments in four species of pico-phytoplankton followed by whole genome sequencing of 150 lines. In total, we have observed 238 point mutations and 48 indels, which we have used to analyse various aspects of the mutation rate. The genome coverage of each mutation accumulation line is higher than 97% and mutation rates vary from μ = 4.4 × 10−10 to 9.8 × 10−10 mutations per nucleotide per generation.

Within Genome Variation of Mutation Rate

Coding Sequences Bias

We observed a 2- to 3-fold difference in the mutation rate between coding and intergenic regions. There are three possible explanations for this observation. First, this could simply reflect selection against spontaneous mutations in coding regions. The MA experiment was designed such that all but the most strongly deleterious mutations would accumulate. However, strongly deleterious mutations will lead to line loss, which is something we observed (Krasovec et al. 2016). In coding regions, approximately one third of nucleotide positions are synonymous, and mutations at these positions are thus expected to have little consequence on fitness. If selection occurred during the MA experiments, mutations in coding regions should be biased towards synonymous mutations, but there is no excess of synonymous mutations in any of the MA experiments (Chi-squared test, ns) (table 2), consistent with a lack of selection against nonsynonymous mutations during the experiment. Second, the lower mutation rate in coding regions could reflect a difference in the efficiency of mismatch repair (MMR) between coding and intergenic regions (Kunkel and Erie 2015), provided the MMR efficiency is optimized in coding regions of the genome (Lee et al. 2012; Foster et al. 2015). Indeed, an experiment comparing wild type and MMR deficient lines in E. coli shows that the difference in mutation rates between coding and intergenic sequences vanishes in MMR deficient lines (Foster et al. 2015). Seven genes of the MutS repair machinery have been predicted in the four Mamiellophyceae species of this study [MutS gene family from the picoPLAZA website (Vandepoele et al. 2013)]. Third, the lower mutation rate in coding regions could be due to transcription-coupled DNA repair (TCR) (Hanawalt and Spivak 2008). This system allows the repair of lesions in the DNA that are encountered during transcription and hence is more likely to act in the regions of the genome that are expressed. Consistent with TCR, mutated intergenic sites have ∼3- to 6-fold lower transcription rates than nonmutated intergenic sites in B. prasinos and O. tauri (supplementary table S9, Supplementary Material online). Genes coding for TCR components have been identified in these four species [rad26 gene family from the picoPLAZA website (Vandepoele et al. 2013)]. However, this bias is not observed in coding sequences, suggesting a threshold effect: once a minimum level of transcription is reached, the probability of repair by TCR becomes too high to detect significant variation between two transcription rates above the threshold. Interestingly, the transcription effect on the mutation rate varies between species, depending on the balance between TCR and TAM (Lynch et al. 2016). Our data suggest that TCR in Mamiellophyceae species compensates for and significantly reduces the TAM effect. In conclusion, both MMR and TCR are likely mechanisms involved in lower mutation rates in coding regions in Mamiellophyceae.
It is worth noting that the difference in the mutation rate between coding and intergenic regions may affect genome-wide mutation rate estimates in other species, if coding and intergenic sites are unequally represented in re-sequenced MA lines. De novo mutations were estimated from only 78% of the genome in A. thaliana (Ossowski et al. 2010), 75% in C. reinhardtii (Ness et al. 2012, 2015b) and 46% in Heliconius melpomene (Keightley et al. 2014b). These estimates might thus be biased if there is a tendency for de novo mutations to be easier to identify in the coding fraction of the genome.

Multinucleotide Mutations

Multinucleotide mutations (MNM) are known to be responsible for ∼2% to ∼16% of the total number of mutations in humans, S. cerevisiae, D. melanogaster, A. thaliana, and C. elegans (Schrider et al. 2011). Multinucleotide mutations might have implications for the molecular clock, and for the evolution of amino acid sequences by allowing jumps between amino acids that are separated by more than one mutation (Schrider et al. 2011; Besenbacher et al. 2016). In the case of Mamiellophyceae, ∼5% of mutation events are MNM. Several hypotheses have been proposed to explain the origin of MNM, such as a defect in transcription or translation of a polymerase (Ninio 1991), replication timing (Stamatoyannopoulos et al. 2009) or simply that a single mutation may induce other mutations at adjacent sites (Tian et al. 2008). Our data do not permit us to explore these hypotheses, but the MNM rate in Mamiellophyceae has to be taken into account to re-assess estimates of speciation times in these genera (Šlapeta et al. 2006).

Inter-Specific Mutation Rate Variation

The genomic mutation rate varies by about 10,000-fold across the tree of life. As previously reported, the effective population size appears to be a key correlate of mutation rate variation, probably because of the drift barrier, which imposes a lower limit on the mutation rate that can be reached by selection (Lynch 2010; Sung et al. 2012a,b; Lynch et al. 2016). The increase of the mutation rate with genome size observed in multicellular eukaryotes might thus simply result from the negative correlation between genome size and the effective population size (Lynch and Conery 2003). Microbial eukaryotes have large effective population sizes (supplementary table S11, Supplementary Material online), and it is thus expected that mutation rates decrease with genome size, as reported for bacteria. The available data do not support this, as there is no significant relationship between genome size and mutation rates in microbial eukaryotes (n = 10, supplementary fig. S1, Supplementary Material online, Pearson correlation ρ = −0.47, ns), in sharp contrast to the strong negative relationship in bacteria (n = 10, supplementary fig. S1, Supplementary Material online, Pearson correlation ρ = −0.95, P = 0.001). Differences in the life cycle, such as the frequency of clonal versus sexual reproduction, or in genome biology, such as the nontranscription of the germline genome in ciliates (as opposed to the somatic genome) (Sung et al. 2012a,b; Long et al. 2016), may blur the expected relationship between μ and genome size in microbial eukaryotes.
In addition to the effect of effective population size, the departure from equilibrium GC base composition is responsible for a substantial part of mutation rate variation between species (Smith et al. 2002), as a consequence of the positive relationship between the mutation rate and the departure of the GC content from equilibrium. Indeed, most species have a higher GC content than expected from the GC->AT versus AT->GC mutation rates, and this increases the mutation rate. The mechanisms responsible for increasing the genomic GC content thus contribute to increasing the spontaneous mutation rate in most species. Two mechanisms could move the GC content of a genome away from its equilibrium value: selection or biased gene conversion. Selection can act on protein coding sequences, synonymous codon use (Ikemura 1981) and gene regulatory sequences, in a manner which is expected to lead to less biased base composition than the mutational spectrum would cause. The GC bias is stronger in coding sequences, constituting indirect evidence supporting selection on coding sequence composition. Biased gene conversion, a by-product of recombination, has been identified in many organisms from bacteria (Lassalle et al. 2015) and yeast (Harrison and Charlesworth 2011; Lesecque et al. 2013) to humans (Duret and Arndt 2008; Duret and Galtier 2009). There is indirect evidence of recombination and biased gene conversion in O. tauri (Jancek et al. 2008; Grimsley et al. 2010), so that GC biased gene conversion is likely involved in increasing the GC content of Mamiellophyceae genomes.

Conclusion

Our study provides four novel spontaneous mutation rate estimates in unicellular eukaryotes. Spontaneous mutation rates were assessed over 97–99.5% of the nuclear genomes; they are higher in intergenic than in coding sequences, and 5% of mutation events affect multiple nucleotides. Combined with previous spontaneous mutation rate estimates from 20 species, our data also provide evidence that the spontaneous mutation rate increases with the deviation of the genome GC-content from its equilibrium GC-content.

Materials and Methods

Mutation Accumulation Experiments

Mutation accumulation experiments were performed on four haploid marine green algae (Chlorophyta): Ostreococcus tauri RCC4221, O. mediterraneus RCC2590, Micromonas pusilla RCC299, and Bathycoccus prasinos RCC1105. All strains are maintained in the Roscoff Culture Collection (RCC), in France (http://roscoff-culture-collection.org/). Classically, in MA experiments of unicellular organisms, a colony of cells is transferred to a fresh agar plate at each bottleneck to allow the separation of the cells and the random sampling of a new cell. However, this is not possible in these pico-algae as they do not grow on the surface of gelled media, in contrast to Saccharomyces cerevisiae, Dictyostelium discoideum or Chlamydomonas reinhardtii (Wloch et al. 2001; Hall et al. 2013; Morgan et al. 2014). Nevertheless, they are easily cultured in liquid medium in the laboratory. We therefore developed an experimental protocol combining flow cytometry, which has the advantage of counting individual cells while verifying cell size and fluorescence, and transfer of single cells in liquid media (Krasovec et al. 2016). Briefly, MA lines were started from one single cell from a clonal population and maintained in L1 liquid medium in 24 wells of a microtiter plate with a one-cell bottleneck every 14 days. Serial bottlenecks largely removed the influence of natural selection: the average effective population size, estimated with the harmonic mean of cell number, varied between 6 and 9 across the four species (supplementary table S1, Supplementary Material online). Cell concentrations of MA lines were measured by flow cytometry using a FACSCanto II flow cytometer (Becton Dickinson, Franklin Lakes, NJ), relative to their natural chlorophyll fluorescence (FL3 acquisition at 670 nm) and side scatter (SSC) acquisitions. Depending on cell concentration, the volume corresponding to one cell was inoculated into a new well plate with new media. 
The number of generations per day, D, was estimated each 14 days at bottleneck time as follows: Nt is the number of cells in the well at bottleneck time, and t = 14 days. MA experiments were performed over a period of 224–378 days depending on the species. MA lines accumulated between 80 and 500 independent generations (supplementary table S1, Supplementary Material online).
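The two quantities described in this subsection, the number of generations per day and the effective population size obtained as a harmonic mean of cell numbers, can be sketched as follows. Both functions rest on assumptions that are not stated in the text: exponential growth by binary division from a single cell, and synchronous doubling between bottlenecks. The function names and the cell count used in the example are hypothetical, and the expression for D may differ from the one the authors used.

```python
import math
from statistics import harmonic_mean


def generations_per_day(n_t, t_days=14, n_0=1):
    """Divisions per day, assuming exponential growth by binary division
    from n_0 cells to n_t cells over t_days (an assumption of this sketch)."""
    return math.log2(n_t / n_0) / t_days


def effective_population_size(n_generations):
    """Harmonic mean of the population sizes 1, 2, 4, ... within one
    bottleneck cycle, assuming synchronous doubling from a single cell."""
    sizes = [2 ** g for g in range(n_generations)]
    return harmonic_mean(sizes)


d = generations_per_day(2 ** 14)        # hypothetical count of 16,384 cells
ne = effective_population_size(14)      # 14 doublings between bottlenecks
print(f"D = {d:.2f} generations per day, Ne = {ne:.1f}")
```

Because the harmonic mean is dominated by the smallest values, a one-cell bottleneck keeps the effective population size in the single digits even when wells reach thousands of cells, which is in line with the range of 6 to 9 reported above.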

Sequencing

DNA of ancestral types and MA lines was extracted as described previously (Winnepenninckx et al. 1993) and sequenced with Illumina technology by GATC biotech® (Konstanz, Germany). Two different sequencing technologies were used: MiSeq for O. tauri and O. mediterraneus, and HiSeq for B. prasinos and M. pusilla. Reads from ancestral types and MA lines were aligned to the reference genomes (M. pusilla: GCA_000151265.1; O. tauri: GCF_000214015.2; B. prasinos: ORCAE database, Sterck et al. 2012; O. mediterraneus: S Yau et al. in preparation) using BWA (Li and Durbin 2010), and SAMtools (Li et al. 2009) was used to obtain bam and mpileup files. The four ancestral types and 150 MA lines were sequenced: 40 for O. tauri, 37 for O. mediterraneus, 37 for M. pusilla, and 36 for B. prasinos.

Mutation Calling

Mutations were called from mpileup files (Li et al. 2009) using GATK (DePristo et al. 2011). The final mutation candidates were filtered to remove low mapping quality regions (MQ < 50), low coverage regions (<5 reads), and mutations shared between all MA lines. The number of callable sites per genome above these thresholds was computed to estimate the per base pair mutation rate [97–99% of the genomes are callable (supplementary table S2)]. All alignments around mutations were checked manually. All mutation candidates were compared with the ancestral type to discard spurious candidates that result from substitutions between the reference genome sequence and the genome of the strain at the start of the MA experiment. Additionally, 22 mutation candidates were chosen randomly for Sanger re-sequencing. All re-sequencing confirmed the predicted mutations (true positive rate = 100%, false negative rate < 1 × 10−4). The type of mutation (nonsynonymous, synonymous, intronic, or intergenic) was extracted with snpEff (Cingolani et al. 2012) using the annotation files of each genome [available via ORCAE (Sterck et al. 2012)]. The same method was used for base substitutions and indel mutations.
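The filtering steps described above (mapping quality, coverage, variants shared by all lines, and variants already present in the ancestor) can be pictured with the following sketch. It does not call GATK or any other tool: the `Candidate` record, its field names, and the function are hypothetical stand-ins for whatever the variant caller reports, and only the thresholds (MQ < 50, fewer than 5 reads) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    line: str      # MA line identifier
    chrom: str     # chromosome or scaffold name
    pos: int       # position of the variant
    alt: str       # alternative allele
    mapq: float    # mapping quality at the site
    depth: int     # number of reads covering the site


def filter_candidates(candidates, n_lines, ancestral_variants,
                      min_mapq=50, min_depth=5):
    """Keep candidates passing quality filters that are neither shared by
    all MA lines nor already present in the ancestral type."""
    carriers = {}
    for c in candidates:
        carriers.setdefault((c.chrom, c.pos, c.alt), set()).add(c.line)
    kept = []
    for c in candidates:
        key = (c.chrom, c.pos, c.alt)
        if c.mapq < min_mapq or c.depth < min_depth:
            continue                      # low mapping quality or coverage
        if len(carriers[key]) == n_lines:
            continue                      # shared between all MA lines
        if key in ancestral_variants:
            continue                      # differs from reference in ancestor
        kept.append(c)
    return kept


# Toy example with two MA lines (values are made up).
example = [
    Candidate("L1", "chr1", 100, "T", 60, 20),
    Candidate("L1", "chr1", 200, "A", 30, 20),
    Candidate("L2", "chr2", 300, "G", 60, 3),
]
print(filter_candidates(example, n_lines=2, ancestral_variants=set()))
```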

Mutation Rate at Equilibrium GC Content

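Let $R_1$ be the rate of mutation from GC to AT, $R_2$ the rate from AT to GC, $R_3$ the rate of transversions between A and T, and $R_4$ the rate of transversions between G and C:

$$R_1 = \frac{\mathrm{GC} \to \mathrm{AT}}{N_{GC}}, \quad R_2 = \frac{\mathrm{AT} \to \mathrm{GC}}{N_{AT}}, \quad R_3 = \frac{\mathrm{AT} \to \mathrm{TA}}{N_{AT}}, \quad R_4 = \frac{\mathrm{GC} \to \mathrm{CG}}{N_{GC}}$$

$N_{GC}$ or $N_{AT}$ is the number of GC or AT sites in the genome. Then it is straightforward to show that the GC content at mutational equilibrium (Sueoka 1962) is:

$$GC_{eq} = \frac{R_2}{R_1 + R_2}$$

Assuming that $R_1$, $R_2$, $R_3$, and $R_4$ are constant, the expected mutation rate at equilibrium is

$$\mu_{eq} = GC_{eq}\,(R_1 + R_4) + (1 - GC_{eq})\,(R_2 + R_3)$$

With […], it may be written […], where […].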
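As a worked illustration of these definitions, the short Python sketch below computes the equilibrium GC content as R2 / (R1 + R2) and the expected mutation rate at a given GC content as GC × (R1 + R4) + (1 − GC) × (R2 + R3). The function names are hypothetical, and any rate values passed in are for illustration only, not values from this study.

```python
def gc_equilibrium(r1, r2):
    """GC content at mutational equilibrium (Sueoka 1962).

    r1: rate of GC -> AT mutations per GC site
    r2: rate of AT -> GC mutations per AT site
    """
    return r2 / (r1 + r2)


def mutation_rate_at_gc(gc, r1, r2, r3, r4):
    """Expected per-site mutation rate for a genome with GC content gc.

    GC sites mutate at rate r1 + r4 (GC -> AT plus G <-> C transversions);
    AT sites mutate at rate r2 + r3 (AT -> GC plus A <-> T transversions).
    """
    return gc * (r1 + r4) + (1.0 - gc) * (r2 + r3)


def equilibrium_mutation_rate(r1, r2, r3, r4):
    """Expected mutation rate once GC content has reached equilibrium."""
    return mutation_rate_at_gc(gc_equilibrium(r1, r2), r1, r2, r3, r4)
```

Comparing `mutation_rate_at_gc` at the observed GC content with `equilibrium_mutation_rate` gives the relative change in mutation rate attributable to the departure from equilibrium GC content, under the assumption that the four rates are constant.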

Mutation Spectrum Analysis

To investigate the effect of sequence context, we extracted the 10 bp on either side of each mutated site and used binomial tests to assess whether a particular trinucleotide, either NXN or NNX, where X is the mutated site, has a significantly higher or lower mutation rate. To investigate whether gene expression affects the mutation rate, we used STAR (Dobin et al. 2013) to compute the coverage of the genome by RNAseq data, available for B. prasinos (Moreau et al. 2012) and O. tauri (Blanc-Mathieu et al. 2014). Statistical analyses were performed with R (version 3.1.1) (R Development Core Team, 2014).
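The trinucleotide context test can be sketched as an exact two-sided binomial test. The analyses in this study were run in R; the Python version below is only an illustration under stated assumptions: the expected proportion of mutations in a context is taken to be that context's frequency among all trinucleotide windows in the genome, and the function names are hypothetical.

```python
from math import comb


def trinucleotide_fraction(genome, tri):
    """Fraction of overlapping 3-bp windows in genome that equal tri."""
    windows = len(genome) - 2
    hits = sum(1 for i in range(windows) if genome[i:i + 3] == tri)
    return hits / windows


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def binom_test_two_sided(k, n, p):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than
    the observed count k (a small relative tolerance absorbs float ties).
    """
    observed = binom_pmf(k, n, p)
    probs = [binom_pmf(i, n, p) for i in range(n + 1)]
    total = sum(q for q in probs if q <= observed * (1 + 1e-7))
    return min(1.0, total)
```

Here `k` would be the number of mutations observed in a given trinucleotide context, `n` the total number of mutations, and `p` the value returned by `trinucleotide_fraction`. Testing many contexts would call for a multiple-testing correction, which this sketch leaves out.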

Supplementary Material

Supplementary data are available at Molecular Biology and Evolution online.