Hongan Long, David J Winter, Allan Y-C Chang, Way Sung, Steven H Wu, Mariel Balboa, Ricardo B R Azevedo, Reed A Cartwright, Michael Lynch, Rebecca A Zufall.
Abstract
Mutation is the ultimate source of all genetic variation and is, therefore, central to evolutionary change. Previous work on Paramecium tetraurelia found an unusually low germline base-substitution mutation rate in this ciliate. Here, we tested the generality of this result among ciliates using Tetrahymena thermophila. We sequenced the genomes of 10 lines of T. thermophila that had each undergone approximately 1,000 generations of mutation accumulation (MA). We applied an existing mutation-calling pipeline and developed a new probabilistic mutation detection approach that directly models the design of an MA experiment and accommodates the noise introduced by mismapped reads. Our probabilistic mutation-calling method provides a straightforward way of estimating the number of sites at which a mutation could have been called if one was present, providing the denominator for our mutation rate calculations. From these methods, we find that T. thermophila has a germline base-substitution mutation rate of 7.61 × 10⁻¹² per site per cell division, which is consistent with the low base-substitution mutation rate in P. tetraurelia. Over the course of the evolution experiment, genomic exclusion lines derived from the MA lines experienced a fitness decline that cannot be accounted for by germline base-substitution mutations alone, suggesting that other genetic or epigenetic factors must be involved. Because selection can only operate to reduce mutation rates based upon the "visible" mutational load, asexual reproduction with a transcriptionally silent germline may allow ciliates to evolve extremely low germline mutation rates.
Keywords: Oligohymenophorea; drift-barrier hypothesis; macronucleus; microbial eukaryote; micronucleus; mutation accumulation
Year: 2016 PMID: 27635054 PMCID: PMC5585995 DOI: 10.1093/gbe/evw223
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Fig. 1. Mutation rate estimates for unicellular eukaryotes. Base-substitution mutation rates per nucleotide per generation estimated for different unicellular eukaryotes: T. thermophila (this paper), P. tetraurelia (Sung, Tucker, et al. 2012), C. reinhardtii (Ness et al. 2015), D. discoideum (Saxer et al. 2012), Sa. cerevisiae (Zhu et al. 2014), and Sc. pombe (Farlow et al. 2015). The phylogenetic tree was retrieved from the Open Tree of Life (Hinchliff et al. 2015); branch lengths are arbitrary. Error bars are 95% confidence intervals (ignoring uncertainty in the number of sites at which a mutation could be detected and the number of generations for which each MA experiment ran).
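The caption does not say how the 95% intervals were computed. One common choice for a rate built from a small mutation count is an exact Poisson interval on the count, rescaled by the same denominator as the point estimate. The sketch below illustrates that choice only; it is not taken from the paper. The count of 5 (the substitutions listed in the summary table) and the use of the abstract's point estimate for rescaling are assumptions made for illustration, and `exact_poisson_interval` is a hypothetical helper name.

```python
from math import exp, sqrt

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    term = exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total

def find_root(f, lo, hi, iterations=200):
    """Bisection for a monotone f whose sign differs at lo and hi."""
    f_lo = f(lo)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_poisson_interval(k, alpha=0.05):
    """Exact two-sided interval for the Poisson mean given an observed count k."""
    if k == 0:
        lower = 0.0
    else:
        # Smallest mean for which observing k or more events has probability alpha/2.
        lower = find_root(lambda lam: 1 - poisson_cdf(k - 1, lam) - alpha / 2, 0.0, float(k))
    # Largest mean for which observing k or fewer events has probability alpha/2.
    upper = find_root(lambda lam: poisson_cdf(k, lam) - alpha / 2,
                      float(k), k + 10 * sqrt(k + 1) + 10)
    return lower, upper

count = 5          # assumption: substitutions listed in the summary table
rate = 7.61e-12    # point estimate quoted in the abstract
lower, upper = exact_poisson_interval(count)
# Rescale the interval on the count to an interval on the rate.
rate_interval = (rate * lower / count, rate * upper / count)
print(rate_interval)
```

Under these assumptions the interval spans roughly 2.5 × 10⁻¹² to 1.8 × 10⁻¹¹. Like the error bars in the figure, it ignores uncertainty in the number of callable sites and in the number of generations.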
Fig. 2. Experimental design in relation to the parameters of the probabilistic mutation-detection model. A complete description of the experiment is presented in Long et al. (2013). Here, we describe how the experiment relates to the parameters used in our probabilistic mutation-calling model. Specifically, the ancestral line, with average heterozygosity θ and genome-wide nucleotide frequencies π, is used to generate a set of MA lines. Each of these lines accumulates mutations at a rate μ per nucleotide site per generation for 1000 generations. Genomic exclusion, an auto-diploidization process, is used to generate lines with macronuclei representing one haploid copy of each MA line (and multiple copies of the ancestral line, in order to detect ancestral heterozygosity). The macronuclear genomes of these genomic exclusion lines are then sequenced with a sequencing error rate of ε, with overdispersion caused by library preparation and other correlated errors modeled by separate parameters for the ancestral and descendant lines. A full description of this model and its parameters is given in the subsection of the Materials and Methods labeled "Probabilistic approach using accuMUlate".
Fig. 3. Illustration of a single history in the accuMUlate method. In our model, a history is a unique combination of states (i.e., the genotypes of ancestral and MA lines, results of genomic exclusions and errors introduced during sequencing) generated during an MA experiment. Here we illustrate one such history by giving values to the different states in a model reflecting the same experimental design as fig. 2 and show how we calculate the probability that this history occurred and generated the observed sequencing data. Because we treat sites in the reference genome independently, we describe the process for a single site. Specifically, we consider a history in which an ancestor that is heterozygous with genotype A/T is used to establish three MA lines. One of those lines experiences a mutation from A/T to A/C, and the C allele of this mutant is passed on to a new macronuclear genome through genomic exclusion. The only data we observe for this locus is the set of bases mapped to this site that pass our filtering steps. We represent this data as vectors containing the number of A, C, G, and T bases mapped to a given site (the base counts). We can use equation (3) to calculate the probability that this sequencing data was generated by the specific history shown here. To do this, we first calculate the probability that the ancestor would have genotype A/T and that the observed sequencing data from the ancestor could be generated from this genotype (using equations (6) and (4), respectively). Next, we consider the MA (descendant) lines, calculating the probability that the three descendant lines would have genotypes A, T and C and that the observed sequencing data could be generated from these genotypes. In this case, we use the Felsenstein (1981) model of nucleotide substitution to calculate the probabilities that genomic exclusions generated from the MA lines would have these genotypes.
We use the same genotype likelihood model (equation 4) to calculate the probability that the sequencing data was generated from MA lines with these genotypes. Because each of the descendant lines is independent of each other, the overall probability of the history is simply the product of the probabilities for the ancestral and all descendant lines (equation 3). We calculate the probability of a site containing at least one mutation by repeating this procedure for all possible histories at a given site (i.e. all possible combinations of genotypes) and keeping track of those histories that contain one or more mutations (equation 2).
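The sum over histories described in this caption can be sketched for a single site as follows. This is a simplified illustration, not the accuMUlate implementation. It assumes a plain multinomial read model with error rate `eps` in place of the overdispersed genotype likelihood, a simple heterozygosity prior built from θ and π, and it counts a line as mutated whenever its sampled allele differs from the one it inherited. All function names and parameter values are hypothetical.

```python
from itertools import combinations_with_replacement
from math import expm1

BASES = "ACGT"

def read_likelihood(counts, alleles, eps):
    """Multinomial likelihood (up to a constant) of base counts given the alleles present."""
    lik = 1.0
    for base, n in zip(BASES, counts):
        p = sum((1 - eps) if base == a else eps / 3 for a in alleles) / len(alleles)
        lik *= p ** n
    return lik

def f81(src, dst, pi, mu, t):
    """Felsenstein (1981) probability that allele src is dst after t generations."""
    beta = 1 / (1 - sum(p * p for p in pi.values()))
    change = -expm1(-beta * mu * t)  # 1 - exp(-beta * mu * t), computed accurately
    return (1 - change) * (src == dst) + change * pi[dst]

def prob_any_mutation(anc_counts, desc_counts, pi, theta, mu, t, eps):
    """Posterior probability that at least one descendant line carries a mutation."""
    het_norm = 1 - sum(p * p for p in pi.values())
    total = 0.0        # probability of the data summed over all histories
    no_mutation = 0.0  # same sum, restricted to histories without any mutation
    for a1, a2 in combinations_with_replacement(BASES, 2):
        if a1 == a2:
            prior = (1 - theta) * pi[a1]
        else:
            prior = theta * 2 * pi[a1] * pi[a2] / het_norm
        ancestor = prior * read_likelihood(anc_counts, (a1, a2), eps)
        all_lines, unmutated_lines = 1.0, 1.0
        for counts in desc_counts:
            line_total, line_unmutated = 0.0, 0.0
            for inherited in (a1, a2):  # genomic exclusion keeps one allele
                for genotype in BASES:
                    term = (0.5 * f81(inherited, genotype, pi, mu, t)
                            * read_likelihood(counts, (genotype,), eps))
                    line_total += term
                    if genotype == inherited:
                        line_unmutated += term
            # Descendant lines are independent, so their probabilities multiply.
            all_lines *= line_total
            unmutated_lines *= line_unmutated
        total += ancestor * all_lines
        no_mutation += ancestor * unmutated_lines
    return 1 - no_mutation / total

# Illustrative parameter values only (assumptions, not estimates from the paper).
pi = {"A": 0.39, "C": 0.11, "G": 0.11, "T": 0.39}
params = dict(pi=pi, theta=0.001, mu=1e-11, t=1000, eps=0.001)

ancestor_reads = (10, 0, 0, 10)  # counts of A, C, G, T: looks like A/T
mutant_site = [(20, 0, 0, 0), (0, 0, 0, 20), (0, 20, 0, 0)]  # third line reads C
clean_site = [(20, 0, 0, 0), (0, 0, 0, 20), (20, 0, 0, 0)]

p_mutant = prob_any_mutation(ancestor_reads, mutant_site, **params)
p_clean = prob_any_mutation(ancestor_reads, clean_site, **params)
print(p_mutant, p_clean)
```

With the base counts patterned on the A/T ancestor and A, T, C descendants from the caption, the site with a line reading only C receives a probability close to 1, while the site whose lines all match an ancestral allele receives a probability close to 0. The sketch multiplies raw likelihoods, so a practical version would work in log space to avoid underflow at high coverage.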
Summary of sequencing data and detected mutations
| Line | Coverage | Generations | Fitness (a) | Callable, initial (b) | Callable, final (b) | Scaffold | Substitution | Feature | Locus | Effect |
|---|---|---|---|---|---|---|---|---|---|---|
| M5 | 64.17 | 1000 | 0.56 | 0.92 | 0.88 | scf_8254658 | g.334881C>T | Exon | TTHERM_00675900A | Synonymous (gaC>gaT, D>D) |
| M19 | 53.05 | 1000 | 0.64 | 0.93 | 0.88 | — | — | — | — | — |
| M20 | 34.42 | 1000 | 0.44 | 0.93 | 0.88 | scf_8254594 | g.239327A>T | Intron | TTHERM_00286840A | — |
| M25 | 30.88 | 1000 | 0.57 | 0.93 | 0.88 | scf_8254607 | g.179325C>T | Exon | TTHERM_00439220A | Non Synonymous (Gtt>Att, V>I) |
| M28 | 50.65 | 200 | 0.65 | 0.92 | 0.87 | — | — | — | — | — |
| M29 | 31.36 | 1000 | 0.49 | 0.93 | 0.88 | scf_8254365 | g.304140T>G | Intergenic | — | — |
| M40 | 16.84 | 1000 | 0.57 | 0.83 | 0.63 | scf_8254002 | g.27830G>A | Exon | TTHERM_01128590A | Non Synonymous (tGt>tAt, C>Y) |
| M50 | 106.65 | 1000 | 0.38 | 0.93 | 0.88 | — | — | — | — | — |
Note: No mutations were detected in lines M19, M28, or M50.
a. Fitness data from Long et al. (2013), using exponential growth rate as the fitness metric, normalized by dividing by the ancestral growth rate.
b. The proportion of all sites in the MAC genome (104 Mb) from which a mutation could have been called if one was present. "Initial" refers to the first analysis (reads with mapping quality < 13 removed, with its own set of parameter values); "final" refers to the subsequent analysis (reads with mapping quality < 30 removed, putative mutations supported by fewer than 3 reads in forward and reverse orientation removed, and a second set of parameter values).
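The abstract notes that the callable-site count supplies the denominator of the rate calculation. As a rough worked example, the rows of the table above can be combined into a per-site, per-generation rate: the number of called mutations divided by the sum over lines of callable sites times generations. The sketch below uses only the lines and values shown in the table and the 104 Mb genome size from footnote b. The names `LINES` and `mutation_rate` are hypothetical, and using the final callable fractions is an assumption.

```python
GENOME_SITES = 104e6  # MAC genome size from the table footnote (104 Mb)

# (line, generations, final callable fraction, mutations called), copied from the table
LINES = [
    ("M5", 1000, 0.88, 1),
    ("M19", 1000, 0.88, 0),
    ("M20", 1000, 0.88, 1),
    ("M25", 1000, 0.88, 1),
    ("M28", 200, 0.87, 0),
    ("M29", 1000, 0.88, 1),
    ("M40", 1000, 0.63, 1),
    ("M50", 1000, 0.88, 0),
]

def mutation_rate(lines, genome_sites):
    """Mutations per callable site per generation, pooled across lines."""
    mutations = sum(n for _, _, _, n in lines)
    site_generations = sum(gens * frac * genome_sites for _, gens, frac, _ in lines)
    return mutations / site_generations

rate = mutation_rate(LINES, GENOME_SITES)
print(f"{rate:.2e}")
```

This gives a value of about 7.9 × 10⁻¹², the same order as the 7.61 × 10⁻¹² reported in the abstract. Exact agreement should not be expected, because the published estimate rests on the authors' own accounting of lines, callable sites, and generations rather than on the rounded values in this table.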