Ran Li, Minxian Wang, Li Jin, Yungang He.
Abstract
Testing for random mating of a population is important in population genetics, because deviations from randomness of mating may indicate inbreeding, population stratification, natural selection, or sampling bias. However, current methods use only observed numbers of genotypes and alleles, and do not take advantage of the fact that the advent of sequencing technology provides an opportunity to investigate this topic in unprecedented detail. To address this opportunity, a novel statistical test for random mating is required in population genomics studies for which large sequencing datasets are generally available. Here, we propose a Monte-Carlo-based-permutation test (MCP) as an approach to detect random mating. Computer simulations used to evaluate the performance of the permutation test indicate that its type I error is well controlled and that its statistical power is greater than that of the commonly used chi-square test (CHI). Our simulation study shows the power of our test is greater for datasets characterized by lower levels of migration between subpopulations. In addition, test power increases with increasing recombination rate, sample size, and divergence time of subpopulations. For populations exhibiting limited migration and having average levels of population divergence, the statistical power approaches 1 for sequences longer than 1 Mbp and for samples of 400 individuals or more. Taken together, our results suggest that our permutation test is a valuable tool to detect random mating of populations, especially in population genomics studies.
Year: 2013 PMID: 23940765 PMCID: PMC3734302 DOI: 10.1371/journal.pone.0071496
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. A diagram of the MCP method.
In this case, a sample of n = 4 individuals was drawn from a population of interest. The 2n gamete sequences of the individuals are denoted by C1-C8 as follows: C1 and C2 are from individual 1, C3 and C4 are from individual 2, and so on. These sequences were permuted N times (where N is any positive integer), and each permuted ordering was divided into n consecutive pairs, giving N new datasets. The ξ statistic was calculated for each permutation, which allowed us to derive the null distribution of the statistic. Locating the observed ξ on this null distribution gives the p value of the test.
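The permutation scheme in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the ξ statistic is not defined in this record, so the hypothetical function `within_individual_diff` (the mean number of sites at which an individual's two gametes differ) stands in for it, and the two-sided p value and the names `mcp_test`, `n_perm`, and `seed` are likewise choices made for this example.

```python
import numpy as np


def within_individual_diff(gametes):
    """Stand-in statistic (an assumption, not the paper's xi): the mean number
    of sites at which the two gametes of an individual differ."""
    first, second = gametes[0::2], gametes[1::2]
    return float(np.mean(np.sum(first != second, axis=1)))


def mcp_test(gametes, n_perm=999, statistic=within_individual_diff, seed=0):
    """Permutation test on a (2n, L) array; rows 2i and 2i+1 form individual i."""
    rng = np.random.default_rng(seed)
    observed = statistic(gametes)
    null = np.empty(n_perm)
    for k in range(n_perm):
        # Shuffle the 2n gametes, then re-pair consecutive rows into n individuals.
        null[k] = statistic(gametes[rng.permutation(len(gametes))])
    # Two-sided Monte Carlo p value; the +1 counts the observed dataset itself.
    center = null.mean()
    n_extreme = np.sum(np.abs(null - center) >= abs(observed - center))
    return observed, null, (n_extreme + 1) / (n_perm + 1)


# Toy data: two fully diverged subpopulations with mating only within each,
# so every individual carries two identical gametes.
sub_a = np.zeros((40, 50), dtype=int)
sub_b = np.ones((40, 50), dtype=int)
structured = np.vstack([sub_a, sub_b])
observed, null, p_value = mcp_test(structured, n_perm=199)
print(observed, p_value)
```

In this toy sample the observed statistic sits far below every permuted value, so the p value should be small. Passing a different `statistic` callable is the intended way to swap in the actual ξ once its definition is at hand.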
Parameters and terminology.
| Symbols | Explanation | “Steady states” |
|---|---|---|
| N | Effective population size | 5000 |
| r | Recombination rate per site per generation | 10⁻⁸ |
| l | Sequence length | 1 Mbp |
| μ | Mutation rate per site per generation | 10⁻⁸ |
| | Divergence time of two subpopulations (generations) | 400 |
| n | Sample size (number of individuals) | 400 |
| θ | 4Nμl | 200 |
| ρ | 4Nrl | 200 |
| | Migration rate per generation | 0 |
| | 4N × migration rate per generation | 0 |
| β | Significance level | 0.05 or 0.01 |
Type I error of the MCP method with different parameters.
| Significance levels | | | | | |
|---|---|---|---|---|---|
| 0.05 | 0.034 | 0.054 | 0.041 | 0.069 | 0.048 |
| 0.01 | 0.004 | 0.016 | 0.004 | 0.016 | 0.015 |
| Significance levels | | | | | |
| 0.05 | 0.027 | 0.048 | 0.048 | 0.048 | 0.054 |
| 0.01 | 0.0047 | 0.011 | 0.012 | 0.011 | 0.017 |
| Significance levels | ρ = 100 | ρ = 200 | ρ = 400 | ρ = 800 | ρ = 1000 |
| 0.05 | 0.048 | 0.051 | 0.039 | 0.043 | 0.038 |
| 0.01 | 0.004 | 0.011 | 0.010 | 0.007 | 0.005 |
| Significance levels | θ = 100 | θ = 200 | θ = 400 | θ = 800 | θ = 1000 |
| 0.05 | 0.054 | 0.057 | 0.043 | 0.042 | 0.061 |
| 0.01 | 0.010 | 0.012 | 0.008 | 0.004 | 0.010 |
We varied one parameter at a time to estimate the type I error of the MCP test across sequence lengths l, sample sizes n, recombination rates ρ = 4Nrl, and mutation rates θ = 4Nμl, at two significance levels, 0.05 and 0.01. While one parameter was varied, the others were held at their “steady state” values: sequence length l = 1 Mbp; effective population size N = 5000; recombination rate ρ = 4Nrl = 4 × 5000 × 10⁻⁸ × l; sample size n = 400 individuals from a random mating population; and mutation rate θ = 4Nμl = 4 × 5000 × 10⁻⁸ × l.
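As a quick check of the scaling, the snippet below evaluates θ = 4Nμl and ρ = 4Nrl at the "steady state" values and then inverts the formula for the ρ values listed in the table. It is only an arithmetic sketch: the helper names `scaled_rate` and `per_site_rate` are hypothetical, and the assumption that ρ is varied through r alone, with N and l fixed, is ours rather than something stated in this record.

```python
def scaled_rate(n_e, rate, length_bp):
    """Population-scaled rate 4 * N * rate * l (theta with mu, rho with r)."""
    return 4 * n_e * rate * length_bp


def per_site_rate(scaled, n_e, length_bp):
    """Invert the scaling: the per-site rate implied by a scaled value."""
    return scaled / (4 * n_e * length_bp)


N_E = 5000        # effective population size N
LENGTH_BP = 1e6   # sequence length l = 1 Mbp
MU = R = 1e-8     # per-site mutation and recombination rates per generation

theta = scaled_rate(N_E, MU, LENGTH_BP)
rho = scaled_rate(N_E, R, LENGTH_BP)
print(f"theta = {theta:.0f}, rho = {rho:.0f}")

# Assumption: rho is varied through r alone, with N and l held fixed.
for rho_value in (100, 200, 400, 800, 1000):
    print(rho_value, per_site_rate(rho_value, N_E, LENGTH_BP))
```

Both scaled rates come out at 200 (up to floating-point rounding), matching the "steady state" column of the parameter table.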
Figure 2. Statistical power of MCP and CHI tests under varying: (A) sequence length, (B) recombination rate, (C) sample size, (D) migration rate, (E) population divergence, and (F) mutation rate.