Tom R Booker.
Abstract
Characterizing the distribution of fitness effects (DFE) for new mutations is central in evolutionary genetics. Analysis of molecular data under the McDonald-Kreitman test has suggested that adaptive substitutions make a substantial contribution to between-species divergence. Methods have been proposed to estimate the parameters of the distribution of fitness effects for positively selected mutations from the unfolded site frequency spectrum (uSFS). Such methods perform well when beneficial mutations are mildly selected and frequent. However, when beneficial mutations are strongly selected and rare, they may make little contribution to standing variation and will thus be difficult to detect from the uSFS. In this study, I analyze uSFS data from simulated populations subject to advantageous mutations with effects on fitness ranging from mildly to strongly beneficial. As expected, frequent, mildly beneficial mutations contribute substantially to standing genetic variation and parameters are accurately recovered from the uSFS. However, when advantageous mutations are strongly selected and rare, there are very few segregating in populations at any one time. Fitting the uSFS in such cases leads to underestimates of the strength of positive selection and may lead researchers to false conclusions regarding the relative contribution adaptive mutations make to molecular evolution. Fortunately, the parameters for the distribution of fitness effects for harmful mutations are estimated with high accuracy and precision. The results from this study suggest that the parameters of positively selected mutations obtained by analysis of the uSFS should be treated with caution and that variability at linked sites should be used in conjunction with standing variability to estimate parameters of the distribution of fitness effects in the future.
Keywords: adaptation; beneficial mutations; distribution of fitness effects; population genetics
Year: 2020 PMID: 32371451 PMCID: PMC7341129 DOI: 10.1534/g3.120.401052
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Estimates of the parameters of positive selection obtained from the uSFS for nonsynonymous sites
| Common name | Scientific name | γ | p |
|---|---|---|---|
| House mouse | | 14.5 | 0.0030 |
| Fruit fly | | 23.0 | 0.0045 |
| Humans | | 0.0064 | 0.000025 |
DFE-alpha implements the analysis methods described by Schneider; polyDFE implements the methods described by Tataru.
Castellano estimated the mean fitness effect for an exponential distribution of advantageous mutational effects.
Parameters of positive selection assumed in simulations and the proportion of polyDFE runs for which modeling positive selection gave a significantly better fit to the data
| γ | p | γp | | | | |
|---|---|---|---|---|---|---|
| 10 | 0.0001 | 0.001 | 0.02 | 0.07 | 0.11 | 0.71 |
| 50 | 0.0001 | 0.005 | 0.98 | 0.86 | 0.10 | 0.77 |
| 100 | 0.0001 | 0.01 | 0.98 | 0.02 | 0.03 | 0.58 |
| 500 | 0.0001 | 0.05 | 1.00 | 0.39 | 0.00 | 0.99 |
| 1,000 | 0.0001 | 0.10 | 1.00 | 1.00 | 0.00 | 0.71 |
| 10 | 0.001 | 0.01 | 0.99 | 0.96 | 0.15 | 0.71 |
| 50 | 0.001 | 0.05 | 1.00 | 1.00 | 0.06 | 0.98 |
| 100 | 0.001 | 0.10 | 1.00 | 1.00 | 0.00 | 0.97 |
| 500 | 0.001 | 0.50 | 1.00 | 1.00 | 0.00 | 0.94 |
| 1,000 | 0.001 | 1.00 | 1.00 | 1.00 | 0.00 | 0.71 |
| 10 | 0.01 | 0.10 | 1.00 | 1.00 | 0.03 | 0.80 |
| 50 | 0.01 | 0.50 | 1.00 | 1.00 | 0.02 | 0.99 |
| 100 | 0.01 | 1.00 | 1.00 | 1.00 | 0.02 | 0.95 |
| 500 | 0.01 | 5.00 | 1.00 | 1.00 | 0.00 | 0.72 |
| 1,000 | 0.01 | 10.0 | 1.00 | 1.00 | 0.00 | 0.41 |
Figure 1. Population genetic summary statistics collated across all simulated genes. α is the observed proportion of substitutions fixed by positive selection. πs/π0 is genetic diversity relative to neutral expectation (π = 0.01). S./S is the proportion of segregating nonsynonymous sites that are advantageous in the simulated datasets.
Figure 2. The uSFS for advantageous mutations under different combinations of positive selection parameters. The three bar charts show observed uSFS from simulations that model positive selection parameters that yield similar α. The lines in each panel show the expected frequency spectra for different strengths of beneficial mutations and were obtained using Equation 2 from Tataru.
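The expected spectra mentioned in this caption can be illustrated with a minimal sketch. It uses the standard diffusion (Poisson random field) density for a single class of selected mutations and integrates it against binomial sampling. This is an assumption made for illustration and is not claimed to reproduce Equation 2 from Tataru exactly. The names `sfs_density` and `expected_sfs`, the scaled mutation rate `theta`, and the convention that the scaled selection coefficient `S` enters as written are all hypothetical; whether `S` corresponds to 2Ns or 4Ns depends on the fitness parameterization.

```python
import math

def sfs_density(x, S, theta=1.0):
    """Expected density of derived alleles at frequency x (0 < x < 1).

    Assumed form: theta * (1 - exp(-S(1-x))) / ((1 - exp(-S)) * x * (1-x)).
    Intended for neutral or beneficial S; very large negative S would overflow.
    """
    if abs(S) < 1e-8:
        return theta / x  # neutral limit of the expression below
    # -expm1(-a) equals 1 - exp(-a) and is numerically stable for small a
    num = -math.expm1(-S * (1.0 - x))
    den = -math.expm1(-S)
    return theta * num / (den * x * (1.0 - x))

def expected_sfs(n, S, theta=1.0, steps=20000):
    """Expected number of sites with i = 1..n-1 derived copies in a sample of n."""
    h = 1.0 / steps
    spectrum = []
    for i in range(1, n):
        binom = math.comb(n, i)
        total = 0.0
        for k in range(steps):
            x = (k + 0.5) * h  # midpoint rule never evaluates x = 0 or x = 1
            total += sfs_density(x, S, theta) * binom * x**i * (1.0 - x)**(n - i)
        spectrum.append(total * h)
    return spectrum

neutral = expected_sfs(10, 0.0)    # neutral expectation is theta / i
selected = expected_sfs(10, 50.0)  # beneficial class, enriched at high frequency
```

Under this sketch, a beneficial class shifts weight toward high-frequency derived alleles relative to the neutral spectrum, which is the signal that uSFS-based methods rely on.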
Figure 3. Estimates of the parameters of advantageous mutations and the proportion of adaptive substitutions they imply from simulated datasets. A) γ is the inferred selective effect of a new advantageous mutation; B) p is the proportion of new mutations that are beneficial. In A and B, the horizontal dashed gray lines indicate the simulated values in each case; C) αDFE is the proportion of adaptive substitutions expected under the inferred DFE. The dashed lines in C indicate α, the proportion of adaptive substitutions observed in the simulated datasets. Error bars indicate the 95% range of 100 bootstrap replicates.
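As a rough illustration of how a proportion of adaptive substitutions follows from γ and p, the sketch below uses the textbook relative fixation rate S / (1 - exp(-S)) for a mutation with scaled effect S. It is a simplification and not the calculation performed by any particular software: the function names are hypothetical, and `u_other`, the mean fixation rate of all non-beneficial mutations relative to neutral, is an assumed input whose value here (0.1) is chosen only for illustration.

```python
import math

def relative_fixation_rate(S):
    """Fixation rate relative to a neutral mutation, for scaled effect S.

    Intended for S >= 0 here; very large negative S would overflow.
    """
    if abs(S) < 1e-8:
        return 1.0  # neutral limit
    return S / -math.expm1(-S)  # S / (1 - exp(-S))

def alpha_from_dfe(gamma, p, u_other):
    """Expected proportion of adaptive substitutions.

    gamma   : scaled effect of beneficial mutations (single class, assumed)
    p       : proportion of new mutations that are beneficial
    u_other : assumed mean relative fixation rate of the remaining mutations
    """
    adaptive = p * relative_fixation_rate(gamma)
    other = (1.0 - p) * u_other
    return adaptive / (adaptive + other)

# Two parameter combinations with the same product gamma * p
a_frequent = alpha_from_dfe(10.0, 0.01, u_other=0.1)
a_rare = alpha_from_dfe(1000.0, 0.0001, u_other=0.1)
```

Because the relative fixation rate approaches S for strongly beneficial mutations, the adaptive contribution scales roughly with the product of γ and p, so quite different parameter combinations can imply nearly the same α.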
Figure 4. The likelihood surface for the γ and p parameters for three simulated datasets. Hue indicates differences in log likelihood between a particular parameter combination and the best-fitting model. Best-fitting models are indicated by red points; the true parameters are given above the plots and indicated by the white plus signs on the likelihood surface. The relation γp = 0.1 is shown as a turquoise line and is constant across the three datasets shown.