Kai Zeng, Benjamin C Jackson, Henry J Barton.
Abstract
It is known that the effective population size (Ne) and the mutation rate (u) vary across the genome. Here, we show that ignoring this heterogeneity may lead to biased estimates of past demography. To solve the problem, we develop new methods for jointly inferring past changes in population size and detecting variation in Ne and u between loci. These methods rely on either polymorphism data alone or both polymorphism and divergence data. In addition to inferring demography, we can use the methods to study a variety of questions: 1) comparing sex chromosomes with autosomes (for finding evidence for male-driven evolution, an unequal sex ratio, or sex-biased demographic changes) and 2) analyzing multilocus data from within autosomes or sex chromosomes (for studying determinants of variability in Ne and u). Simulations suggest that the methods can provide accurate parameter estimates and have substantial statistical power for detecting differences in Ne and u. As an example, we use the methods to analyze a polymorphism data set from Drosophila simulans. We find clear evidence for rapid population expansion. The results also indicate that the autosomes have a higher mutation rate than the X chromosome and that the sex ratio is probably female-biased. The new methods have been implemented in a user-friendly package.
Year: 2019 PMID: 30428070 PMCID: PMC6409433 DOI: 10.1093/molbev/msy212
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Mean (standard deviation; SD) of the MLEs for the Parameters of Two Different Two-Locus Models.
| Model | Parameters | | | |
|---|---|---|---|---|
| Model 1 (true) | 0.65 | 0.75 | 10 | 0.1 |
| Mean (SD) | 0.653 (0.05) | 0.752 (0.03) | 10.0 (0.3) | 0.10 (0.003) |
| Model 2 (true) | 0.9 | 0.75 | 0.2 | 0.05 |
| Mean (SD) | 0.89 (0.08) | 0.742 (0.10) | 0.20 (0.01) | 0.051 (0.005) |
Note.—Definitions of the symbols can be found in supplementary table S1, Supplementary Material online. The population size increases in Model 1 but decreases in Model 2. In both models, the X-autosome ratio of Ne differs before and after the population size change (as measured by r1 and r2). The results are based on 100 simulation replicates. The sample size is 100. Both loci are 5 Mb long. The mean numbers of X-linked and autosomal polymorphic sites are 23,187 and 40,734 under Model 1, and 8,040 and 15,296 under Model 2.
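For orientation on the X-autosome ratio of Ne measured by r1 and r2, the short sketch below evaluates the standard textbook expectation for a randomly mating population with separate sexes. It is not taken from the paper: the function name and the breeding numbers are hypothetical, and the model the authors fit may parameterize the ratio differently. With equal numbers of breeding females and males the expected ratio is 0.75; an excess of females pushes it above 0.75 and an excess of males pushes it below.

```python
def x_to_autosome_ne_ratio(n_females: float, n_males: float) -> float:
    """Expected Ne_X / Ne_A under random mating (standard textbook result).

    Assumption: Ne for each chromosome class depends only on the numbers of
    breeding females and males; selection and demography are ignored.
    """
    # Autosomes: Ne_A = 4 * Nm * Nf / (Nm + Nf)
    ne_autosome = 4.0 * n_males * n_females / (n_males + n_females)
    # X chromosome: Ne_X = 9 * Nm * Nf / (4 * Nm + 2 * Nf)
    ne_x = 9.0 * n_males * n_females / (4.0 * n_males + 2.0 * n_females)
    return ne_x / ne_autosome


# Hypothetical breeding numbers: equal, female-biased, and male-biased.
for females, males in [(500, 500), (800, 200), (200, 800)]:
    ratio = x_to_autosome_ne_ratio(females, males)
    print(f"Nf={females}, Nm={males}: Ne_X/Ne_A = {ratio:.3f}")
```

Under this simple expectation, an estimated ratio above 0.75 is the kind of signal that would be read as pointing toward a female-biased sex ratio.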
Power (%) of the Three Likelihood Ratio Tests.
| Model | Test 1 | Test 2 | Test 3 |
|---|---|---|---|
| Model 1 | 100 | 84 | 83 |
| Model 2 | 67 | 98 | 78 |
Note.—Model 1 and Model 2 are the same as those used in table 1, as are the number of replicates, sample size, and locus length. Each sample was analyzed using the likelihood ratio tests described in the main text. The values above are the percentages of replicates in which the null model is rejected at the 5% significance level.
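The power values can be read as a simple rejection frequency across simulation replicates. The sketch below illustrates that calculation under assumptions that are not stated in this excerpt: the test statistic is taken to be twice the log-likelihood difference, it is compared with a chi-square distribution, and the degrees of freedom are supplied by the user. The function names and the log-likelihood values are hypothetical.

```python
import math


def chi2_sf(stat: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable (df = 1 or 2 only)."""
    if stat <= 0.0:
        return 1.0
    if df == 1:
        # P(Z^2 > x) = erfc(sqrt(x / 2)) for a standard normal Z.
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        # A chi-square with 2 df is an exponential with mean 2.
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only supports df = 1 or df = 2")


def lrt_power(loglik_null, loglik_alt, df, alpha=0.05):
    """Percentage of replicates in which the null model is rejected."""
    rejections = 0
    for l0, l1 in zip(loglik_null, loglik_alt):
        stat = 2.0 * (l1 - l0)  # likelihood ratio test statistic
        if chi2_sf(stat, df) < alpha:
            rejections += 1
    return 100.0 * rejections / len(loglik_null)


# Hypothetical maximized log-likelihoods from four simulated data sets.
null_fits = [-100.0, -100.0, -100.0, -100.0]
alt_fits = [-97.0, -99.5, -95.0, -99.9]
print(lrt_power(null_fits, alt_fits, df=1))
```

In practice the reference distribution should match the test actually used; the chi-square approximation here is only a common default.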
Fig. 1. MLEs obtained by fitting the simplified model to simulated data from 20 loci. Each locus is 5 kb long. The solid blue line in each plot shows the true parameter values across loci. The population size expanded recently with parameters and . The results are based on 100 replicates. The sample size is 100.
Parameter Estimates Obtained by Applying the Simplified Model to Simulated Data Sets Containing Either 20 or 500 Loci.
| Data | Mean (SD) | |||
|---|---|---|---|---|
| True (20 loci) | 10 | 0.5 | 2 | 8 |
| With div | 10.1 (0.26) | 0.51 (0.06) | 2.0 (0.2) | 8.1 (0.9) |
| No div | 10.1 (0.26) | 0.51 (0.08) | — | — |
| True (500 loci) | 10 | 0.5 | 2 | 6 |
| With div | 10.0 (0.04) | 0.50 (0.04) | 2.0 (0.04) | 6.0 (0.5) |
| No div | 10.0 (0.04) | 0.50 (0.05) | — | — |
Note.—The cases with 20 loci are the same as those presented in figure 1. The locus length is 5 kb, and the results are based on 100 replicates and a sample size of 100. For the cases with 500 loci, θ and f were sampled from the gamma distributions described in the main text. The locus length is 10 kb, and the results are based on 50 replicates and a sample size of 50. The demographic model is the same in all cases, and is characterized by g1 and .
Parameter Estimates Obtained by Fitting Two Models to the uSFS from Drosophila simulans.
| Model | MLE and 95% CI of Parameters | |||||||
|---|---|---|---|---|---|---|---|---|
| No pol err | 0.015 | 0.024 | 1.99 | 1.00 | 11.88 | 0.40 | | |
| | [0.013, 0.016] | [0.019, 0.028] | [1.26, 2.80] | [0.78, 1.18] | [9.35, 14.93] | [0.33, 0.49] | | |
| With pol err | 0.011 | 0.019 | 1.91 | 1.03 | 12.60 | 0.67 | 0.06 | 0.05 |
| | [0.010, 0.013] | [0.014, 0.025] | [1.24, 2.63] | [0.77, 1.33] | [9.99, 15.43] | [0.51, 0.86] | [0.05, 0.07] | [0.05, 0.06] |
Note.—Both models have H = 2 epochs. The second model contains two extra parameters, X and A, for modeling polarization errors in the X-linked and autosomal data sets, respectively. The 95% CIs were obtained by analyzing 100 bootstrap samples. The bootstrap samples were generated by sampling the short introns with replacement, while keeping the numbers of X-linked and autosomal introns the same as in the real data set.
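The resampling scheme described in the note can be sketched as a stratified bootstrap: introns are drawn with replacement separately within the X-linked and autosomal sets, so each bootstrap sample keeps the original number of introns in each class. The code below is a minimal illustration and not the authors' implementation. The per-intron records, the toy `diversity_ratio` statistic (standing in for refitting the full model), and the percentile rule for the interval are all assumptions.

```python
import random


def resample_introns(x_introns, a_introns, rng):
    """Resample with replacement within each chromosome class."""
    x_sample = [rng.choice(x_introns) for _ in range(len(x_introns))]
    a_sample = [rng.choice(a_introns) for _ in range(len(a_introns))]
    return x_sample, a_sample


def percentile_ci(estimates, level=0.95):
    """Simple percentile interval from a list of bootstrap estimates."""
    ordered = sorted(estimates)
    lo_idx = int((1.0 - level) / 2.0 * (len(ordered) - 1))
    hi_idx = len(ordered) - 1 - lo_idx
    return ordered[lo_idx], ordered[hi_idx]


def diversity_ratio(x_introns, a_introns):
    """Toy statistic (hypothetical): X-linked over autosomal diversity."""
    x_div = sum(p for p, _ in x_introns) / sum(s for _, s in x_introns)
    a_div = sum(p for p, _ in a_introns) / sum(s for _, s in a_introns)
    return x_div / a_div


def bootstrap_ci(x_introns, a_introns, estimator, n_boot=100, seed=1):
    """Apply the estimator to n_boot stratified bootstrap samples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        x_sample, a_sample = resample_introns(x_introns, a_introns, rng)
        estimates.append(estimator(x_sample, a_sample))
    return percentile_ci(estimates)


# Hypothetical data: (polymorphic sites, total sites) for each short intron.
x_introns = [(2, 60), (1, 55), (3, 62), (1, 58), (2, 61)]
a_introns = [(3, 60), (4, 57), (2, 63), (5, 59), (3, 61), (4, 60)]
print(bootstrap_ci(x_introns, a_introns, diversity_ratio))
```

Resampling within each class, rather than from the pooled introns, is what keeps the numbers of X-linked and autosomal introns fixed across bootstrap samples.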