Exploring population size changes using SNP frequency spectra.

Abstract

Inferring demographic history is an important task in population genetics. Many existing inference methods are based on predefined simplified population models, which are more suitable for hypothesis testing than exploratory analysis. We developed a novel model-flexible method called stairway plot, which infers changes in population size over time using SNP frequency spectra. This method is applicable for whole-genome sequences of hundreds of individuals. Using extensive simulation, we demonstrate the usefulness of the method for inferring demographic history, especially recent changes in population size. We apply the method to the whole-genome sequence data of 9 populations from the 1000 Genomes Project and show a pattern of fluctuations in human populations from 10,000 to 200,000 years ago.

Year: 2015 PMID: 25848749 PMCID: PMC4414822 DOI: 10.1038/ng.3254

Source DB: PubMed Journal: Nat Genet ISSN: 1061-4036 Impact factor: 38.330

Inferring human demographic history using genetic information can shed light on important prehistoric evolutionary events such as population bottlenecks, expansions, migrations, and admixture, among others. It is also the foundation of many population genetics analyses, as demographic history is one of the most important forces shaping the polymorphic pattern of our genome[1]. Many of the methods available for inferring demographic history with genome-scale data are model-constrained[2-5]; that is, researchers need to pre-define a demographic model (for example, a constant-size phase followed by an exponential growth phase beginning at a certain time point) and the number of parameters to be estimated before estimating the demographic history. Parameters of the models are then estimated by fitting the expected polymorphic pattern (e.g. a SNP frequency spectrum) given a set of parameters to that of the observed data, either through extensive simulation[2] or diffusion approximation[3]. On the other hand, model-flexible methods (sometimes also called “model-free” methods), such as the skyline plot[6] and its derivatives[7-13], are not restricted to a specific demographic model and typically explore a larger model space than model-constrained methods. Therefore, model-flexible methods can infer significantly more detailed demographic history and may be more suitable for exploratory or hypothesis-generating analysis. However, the skyline plot and its derivatives are based on the full likelihood of DNA sequences, and at the current stage can only be applied to recombination-free loci such as mitochondrial DNA[14,15]. Recently, Li and Durbin[16] proposed a model-flexible method based on the pairwise sequentially Markovian coalescent (PSMC) framework, which specifically models the recombination between two sequences and therefore can analyze autosomes.
However, the PSMC method also has its limitations: i) it still requires the users to have a rough idea of the population history in order to determine the number of parameters to estimate; ii) it requires high-quality sequence data for its application; and iii) it tends to produce biased estimation for recent population histories[17]. We developed a new method called stairway plot. It uses a flexible multi-epoch model as used in the skyline plot methods[7,8], which has worked well in previous demographic inference applications[8,13]. However, instead of calculating the likelihood of the whole sequence, our method calculates the expected composite likelihood of a given SNP frequency spectrum (SFS)[18-20]. Composite likelihood calculation treats each SNP as an independent locus, which significantly reduces the computational burden. This simplified likelihood is a good approximation when the number of SNPs is large, and it has worked well in a population parameter estimation application[18]. Therefore, the stairway plot has both the model flexibility of the skyline plot methods and the computational efficiency that makes it applicable to hundreds of individuals. The number of parameters to be estimated is systematically determined by the standard likelihood ratio test, and can range from 1 to n-1, where n is the number of sequences in the sample. As the method is based on the SFS, it has the potential to be applied to pooled sequence data[22] and even to species whose reference genomes are not yet available[23]. Details of the stairway plot method can be found in Online Methods. We evaluated the stairway plot using extensive simulation and demonstrated the usage of the method for exploratory demographic inference. Compared to the PSMC method, the stairway plot produced more accurate estimations for recent population size changes.
Although it has limited inference accuracy and resolution for more ancient histories, within its applicable range its performance was still comparable to that of the PSMC method. We applied our method to the genomes of nine populations (CEU, GBR, TSI, FIN, CHB, CHS, JPT, YRI, LWK) from the 1000 Genomes Project[24] that are not recently admixed, inferred demographic histories of the populations, and provided interesting hypotheses for future studies, such as that the ancestors of the FIN population (Finnish in Finland) potentially experienced a recent bottleneck between 10-20 thousand years ago (kya)[25].

RESULTS

Simulation Studies

We validated the stairway plot using extensive coalescent simulations and compared its demographic estimations to those of the PSMC method (see Online Methods). More specifically, for each pre-defined demographic model, we simulated 200 independent samples with the ms[26] or MaCS[27] software. For each simulated sample, we used the stairway plot and the PSMC method to infer the demographic history. For the PSMC method, we used the pre-tuned parameters for estimating human population history as suggested by its authors. Along the estimated time span, we calculated the medians and the 2.5th and 97.5th percentiles of the 200 inferred population sizes for the stairway plot and the PSMC method, respectively, and used them to measure the overall accuracy (medians) and dispersion (2.5th and 97.5th percentiles) of the two methods. We compared the performances of the stairway plot and the PSMC method using six different models inspired by previously estimated human population histories. Without loss of generality, one could use the expected number of mutations per base pair (bp) to measure time, and θ per bp to measure population size, where θ = 4Nμ, N is the effective population size and μ is the mutation rate per bp per generation. Dividing by μ and 4μ, one can easily convert the above time measure and population size measure to the number of generations and the number of individuals, respectively. Throughout this paper, we assumed a mutation rate of 1.2 × 10⁻⁸ per bp per generation[28-30] and a generation time of 24 years[31]. Model 1 assumes a constant effective population size of 10,000 individuals. For this model, the medians of the inferred histories of both methods fitted the true model well. Compared to the stairway plot, the PSMC method can infer more ancient history. As to dispersion, that of the stairway plot was smaller (in absolute terms) than that of the PSMC method for more recent history, while the opposite was observed for more ancient history.
The last two observations were generally true for all models we studied; therefore, for the following models we will focus on the accuracy of the two methods for inferring recent histories. Model 2 assumes a sudden population size increase at one time point and an otherwise constant population size, which mimics a previously estimated model for an African population[32]. For this model, the median of the stairway plot's inference fitted the true model almost perfectly, while that of the PSMC method did not fit very well. Model 3 assumes an exponential growth of population size with a rate of 0.004 per generation[32] (i.e. r = 0.004). Model 4 is another exponential growth model, which mimics the estimated recent growth of a population with European ancestry[3]. In both cases, while the stairway plot fitted the true model reasonably well, the PSMC estimates were strongly biased upward. Model 5 is based on an estimated human population demographic history[4] with a faster exponential growth rate (r = 0.01288). Model 6 is a model tested in the PSMC publication[16]. Again, the stairway plot was a better fit to the recent population history than the PSMC. For inferring more ancient population size changes, we compared the performances of the two methods using four additional models tested in the original PSMC publication plus a population split model. As we mentioned previously, the stairway plot had a shorter upper time limit and a larger dispersion for ancient history inference compared to the PSMC method. The former is a disadvantage of the stairway plot, but the latter correctly reflects the uncertainty of our inferences. As to the PSMC method, although it had a smaller dispersion for ancient history inferences, the true histories often fell outside its 95% inference ranges.
The stairway plot might produce an artificial bottleneck when the time spans of the last few θ estimations (see Online Methods) overlap with ancient population size fluctuations (see Discussion for its recognizable pattern). Overall, within the applicable time spans of the stairway plots, roughly up to the last 10 steps of the plot, the performance of the stairway plot for inferring ancient population size was comparable to that of the PSMC. Many factors can affect the inference of the stairway plot. Using simulation, we studied the impact of SNP number (or sequence length), sample size and recombination rate. In short, increasing the sample size can significantly improve the inference accuracy (median), especially for inferring recent population growth, while the most obvious effect of a larger SNP number or recombination rate is a reduction in the inference dispersion. The underlying true demographic history determines the information contained in the sample SFS, so the inference results will also be affected by it. There are known caveats related to that: some bottlenecks of the studied population may be missing from the plot due to limited inference power. For example, when two bottlenecks are close to each other, or a very deep bottleneck follows an ancient bottleneck, the stairway plot may not be able to infer the more ancient one (see more explanation in Discussion).

Application to the 1000 Genomes Project Data

We applied the stairway plot to the whole genome sequences of nine populations (LWK, YRI, CEU, GBR, TSI, FIN, CHB, CHS, JPT) from the 1000 Genomes Project[24]. We restricted our analysis to the genomic regions that are at least 50 kilobases away from any coding regions based on the RefSeq database[33] to avoid potential impacts from natural selection[34]. We also removed regions that are outside the strict mask of the 1000 Genomes Project[24] to reduce artifacts due to mapping errors. Finally, only sites whose ancestral alleles have been inferred with high confidence (see Online Methods) were included in the analysis. Because all the SNPs are from intergenic regions and were called with low-depth sequencing, many of the SNPs at the rare end of the spectrum were not observed. We adjusted the SFSs by using the empirical transition probabilities from the SFSs of the high-depth-sequenced exome regions to the SFSs of the low-depth-sequenced exome regions, with the assumption that the SFS bias due to low depth is systematic and universal across the genome (see Online Methods for details). For each population, 200 bootstrap SFSs were created from the adjusted SFS, and for each bootstrap SFS the stairway plot was used to infer the demographic history. The median inferred population size in each time interval based on the 200 estimations was used to construct a single inferred history of population size. As likely artificial bottlenecks were observed for all nine populations, only more recent histories up to 200-300 kya were taken as results.
As a higher mutation rate or a shorter generation time will lower our time estimates (and, conversely, a lower mutation rate or a longer generation time will raise them), we also provide lower and upper estimates for time ranges assuming an (ape-like) generation time of 20 years[35,36] with a mutation rate of 1.4 × 10⁻⁸ per bp per generation[37], or a generation time of 30 years[38] with a mutation rate of 1.0 × 10⁻⁸ per bp per generation[29,30,39], respectively (in brackets in the following text). The estimations and their 95% bootstrap ranges for the nine populations show several easily observed patterns: (1) Non-African populations all showed severe bottlenecks between 50-70 kya (36-105 kya), which are most likely due to modern humans' out-of-Africa (OOA) migration. (2) All non-African populations except the FIN also showed a shallower and more recent bottleneck between 20-30 kya (14-45 kya), followed by size recoveries. (3) The FIN did not show an obvious bottleneck between 20-30 kya, potentially due to limited inference power (see Discussion for details), and its size recovery began at around 15 kya (11-23 kya). (4) Compared to the non-African populations, the two African populations show wider and shallower bottlenecks between 50-70 kya (36-105 kya) and no bottlenecks between 20-30 kya (14-45 kya). (5) Both African populations also show bottlenecks between 100-200 kya (71-300 kya), probably associated with the origin of anatomically modern humans[40]. This bottleneck is not observed in non-African populations, also likely due to limited inference power (see Discussion).
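
The bracketed ranges above follow from simple rescaling: the SFS fixes time in units of mutations per bp, so a date in years changes in proportion to generation time divided by mutation rate. The sketch below reproduces that arithmetic; the function name `rescale_kya` and the constants are illustrative names, not part of the published software.

```python
# Baseline assumptions stated in the text: 1.2e-8 mutations per bp per
# generation and 24 years per generation.
BASE_MU = 1.2e-8
BASE_GEN_YEARS = 24.0


def rescale_kya(kya, mu, gen_years):
    """Re-express a date obtained under the baseline assumptions
    under a different mutation rate and generation time."""
    # Time in mutations per bp is the quantity the data actually determine.
    t_mut = kya / BASE_GEN_YEARS * BASE_MU
    # Convert back to (thousands of) years under the alternative assumptions.
    return t_mut / mu * gen_years


if __name__ == "__main__":
    for lo, hi in [(50, 70), (20, 30), (100, 200)]:
        lower = rescale_kya(lo, mu=1.4e-8, gen_years=20.0)  # fast clock, short generations
        upper = rescale_kya(hi, mu=1.0e-8, gen_years=30.0)  # slow clock, long generations
        print(f"{lo}-{hi} kya -> ({lower:.0f}-{upper:.0f} kya)")
```

The lower bound scales dates by (20/1.4)/(24/1.2) ≈ 0.71 and the upper bound by (30/1.0)/(24/1.2) = 1.5, which is how 50-70 kya becomes 36-105 kya.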

DISCUSSION

Here we reported the development of a novel model-flexible method called stairway plot for inferring population demographic histories, which is designed for exploratory or hypothesis-generating analysis. There are several other model-flexible methods, including the family of skyline plot methods[6-13] and the PSMC method, whose advantages and limitations were briefly discussed in the Introduction. New developments in this area include the diCal method[17] and the multiple sequentially Markovian coalescent (MSMC)[41]. The diCal method extends the PSMC by modeling the configurations of multiple sequences, and showed improvement over the PSMC on inferring recent population histories. However, diCal requires the users to provide haplotypes (i.e. phased sequence data) and a mutation matrix (i.e. relative mutation rates) for the four nucleotide bases, which may introduce biases into the estimation if not properly estimated. Besides, the computational intensity limits diCal's application to ~10 sequences. MSMC is another extension of the PSMC method. Instead of modeling all the coalescent events of multiple sequences, it focuses on the first coalescent event and the external branches of coalescent trees. However, due to the modeling and computational complexity, its application is currently limited to roughly 8 phased sequences. Our stairway plot method is based on the composite likelihood of the SFS, and therefore has the advantages of efficient computation and applicability to a broader range of sequence data, such as low-depth sequence[24], pooled sequence[22] and potentially even reference-free transcriptome data[23]. At the current stage, it can be applied to hundreds of unphased sequences. Compared to the PSMC method, the stairway plot can take advantage of larger sample sizes and provide more accurate inference for recent population histories.
However, the stairway plot is still limited in inferring ancient histories, for which the PSMC, diCal or MSMC methods may perform better. Therefore, we recommend the complementary usage of the stairway plot with the PSMC, diCal or MSMC. The application of our stairway plot to nine populations from the 1000 Genomes Project provided some observations worth further and more careful investigation. First, we observed a bottleneck between 10-20 kya in the FIN, which was not observed in other European populations; and vice versa, we observed a bottleneck between 20-30 kya in all European populations except the FIN. One explanation of this pattern is that FIN ancestors separated from those of other European populations as early as 30 kya. Another possibility is that the FIN also experienced the same bottleneck as other European populations, as the shape of its 95% inference ranges suggests a population size decrease around 30 kya. We did some preliminary simulation experiments to investigate the two possibilities. The results showed that if a population experienced two consecutive bottlenecks, one between 10-20 kya and another between 20-30 kya, our method was not able to infer both bottlenecks. Instead, the plot tended to suggest the more recent bottleneck, which more or less matches the pattern we observed for the FIN. Although we cannot rule out the first explanation, our preliminary analysis suggests the second explanation might be true; that is, the FIN may have experienced the same bottleneck between 20-30 kya as other European populations, but it may also have experienced an additional bottleneck between 10-20 kya. Second, we observed that African populations have a bottleneck between 100-200 kya, which is missing in the plots of non-African populations.
Again, one possible explanation is that ancestors of all non-African populations separated from those of the African populations as early as 200 kya[41], and an alternative explanation is that our method does not have sufficient power to infer that ancient bottleneck with the non-African samples. Because our estimation of population sizes depends on the gene lineages available for coalescence during a period of time, the fewer gene lineages available during the period, the less information is available for inferring population sizes. As all non-African populations experienced a deep OOA bottleneck between 50-70 kya, many gene lineages of the samples may not have survived the bottleneck to be available for inferring more ancient population histories. Although we cannot rule out the first explanation, the simulation experiments we described above supported the alternative explanation; that is, a simulated population with a deep OOA bottleneck did not show an ancient bottleneck between 100-200 kya although the true model has one. However, such an ancient bottleneck can be inferred if the population does not have a deep OOA bottleneck. Those results also emphasize that inferred bottlenecks need to be interpreted carefully and that hypothesis testing is necessary before any conclusions are formulated. There are many ways the stairway plot can be further improved. As our method models the “average” behavior of many independent coalescent trees, the expectations of coalescent times, E(t_k), are the “building blocks” for the steps observed in the stairway plot. By nature E(t_k) is inversely proportional to k(k-1) (see Online Methods). Reflected in the stairway plot, the step size, which is proportional to E(t_k), is typically much larger when k is small (corresponding to ancient histories) than when k is large (corresponding to recent histories).
Put another way, we only model ancient demographic histories using a small number of parameters (or steps in the plot). When the ancient demographic history is complex, the small number of steps overlapping that complex history may fit the data poorly. A typical result is an artificial bottleneck, which occurs only at the last few (< 10) steps of the plot with a distinguishable pattern: a population decrease beginning at the second step (θ3) and a lowest point typically around the third step (θ4). Here we caution users of the stairway plot that when such a pattern is observed, the true demographic changes of the population studied may not be correctly reflected. Considering the lower resolution of the stairway plot for ancient histories, we suggest comparing estimations from various methods (such as the PSMC/MSMC method and diCal) when applicable, and avoiding over-interpretation of the history inferred from the last 10 steps of the stairway plot. One possible improvement of the stairway plot for the estimation of ancient histories is to integrate the composite likelihood into a Bayesian framework[8,9], which smooths the θ estimations into continuous probability estimations. Further smoothness can be achieved with a smoothing prior based on a Gaussian Markov random field, in which the smoothness is informed by the data[10]. Another possible improvement, for the estimation of the demographic history of a fast-growing population such as the human population, is to use a different null model. Generally speaking, the underlying null model of the stairway plot is a population of constant size during a certain time period. If an instantaneous size change at a certain point within the period (defined by coalescent times) creates an alternative model with a significantly larger likelihood, the alternative model will replace the null model for further model refinement.
This procedure produces a stairway-like inferred population model for a population with a fast size increase or decrease. Assuming an exponential growth model[42] as the null model, or a hybrid of the null models of constant size and exponential growth, may reduce the number of parameters to be estimated for such populations, and therefore improve the accuracy of estimations. In addition, a more efficient optimization search algorithm for the number and values of θs should further reduce the computational intensity so that the stairway plot method can be applied to even larger sample sizes.

ONLINE METHODS

Composite likelihood of a SFS

We assume a random sample of $n$ sequences is taken from a population whose size may change instantaneously at the time points coinciding with coalescent events of the gene genealogy of the $n$ sequences. Let $t_k$ be the $k$-coalescent time, that is, the time during which the genealogy has $k$ lineages; then its probability density is

$$P(t_k \mid N_k) = \frac{k(k-1)}{4N_k} \exp\left(-\frac{k(k-1)}{4N_k}\, t_k\right),$$

where $N_k$ is the effective size of the population during $t_k$. We assume $N_k$ remains constant during $t_k$, and $N_{k-1}$ or $N_{k+1}$ may be equal to or different from $N_k$. With a given $N_k$, a realization of $t_k$ from an independent coalescent tree follows the above distribution. If we summarize a large number of independent coalescent trees, the average of the observed $t_k$ will approach its expectation $E(t_k \mid N_k) = 4N_k/(k(k-1))$. Let $p_i$ be the probability (or the expectation from a large number of independent coalescent trees) that a nucleotide site is a SNP of size $i$ ($n-1 \ge i \ge 1$); then $p_i$ can be expressed as a function of the $\theta_k$, where $\theta_k = 4N_k\mu$ and $\mu$ is the mutation rate per bp per generation[43]. In more detail,

$$p_i = \sum_{k=2}^{n-i+1} \frac{\theta_k}{k-1}\, P(k, i),$$

where

$$P(k, i) = \binom{n-i-1}{k-2} \bigg/ \binom{n-1}{k-1}$$

is the probability that a lineage present while the genealogy has $k$ lineages has exactly $i$ descendants in the sample. For simplicity, we define SNP size 0 as the size of monomorphic sites, and its probability is

$$p_0 = 1 - \sum_{i=1}^{n-1} p_i.$$

Assuming each site is from an independent coalescent tree (i.e. unlinked), the number of SNPs of size $i$, $\xi_i$, can be modeled with a multinomial distribution, and the composite likelihood of observing $\xi_0, \xi_1, \ldots, \xi_{n-1}$ can be written as

$$L(\theta_2, \ldots, \theta_n) = \frac{l!}{\prod_{i=0}^{n-1} \xi_i!} \prod_{i=0}^{n-1} p_i^{\xi_i},$$

where

$$l = \sum_{i=0}^{n-1} \xi_i.$$

Theoretically, it is possible to use a subset of the SNP sizes for the likelihood calculation, at the cost of losing the information contained in the excluded SNP size bins. When missing data exist, we can separate the whole SNP spectrum into $l_n$ sites with $n$ observed alleles, $l_{n-1}$ sites with $n-1$ observed alleles, $l_{n-2}$ sites with $n-2$ observed alleles, and so on. The composite likelihood of the whole data set is

$$L = \prod_{j} L_j,$$

where $L_j$ denotes the composite likelihood of the $l_j$ sites with $j$ observed alleles.
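
As a minimal sketch of the calculation described above (not the authors' Java implementation), the following computes the SNP size probabilities and the multinomial composite log-likelihood. It assumes the standard coalescent result that a lineage present while the genealogy has k lineages has i sampled descendants with probability C(n-i-1, k-2)/C(n-1, k-1); the function names and the dictionary layout are illustrative choices.

```python
from math import comb, lgamma, log


def sfs_probabilities(thetas, n):
    """Return {i: p_i} for i = 0..n-1.

    thetas maps k (2..n) to theta_k = 4 * N_k * mu per bp.
    """
    probs = {}
    for i in range(1, n):
        total = 0.0
        for k in range(2, n - i + 2):
            # Probability that a lineage at the k-lineage stage has i descendants.
            weight = comb(n - i - 1, k - 2) / comb(n - 1, k - 1)
            # theta_k / (k - 1) is the expected number of mutations per bp
            # accumulated on all k branches during t_k.
            total += thetas[k] / (k - 1) * weight
        probs[i] = total
    probs[0] = 1.0 - sum(probs.values())  # monomorphic sites
    return probs


def composite_log_likelihood(counts, thetas, n):
    """Multinomial log-likelihood of an SFS; counts maps size i to xi_i."""
    probs = sfs_probabilities(thetas, n)
    total_sites = sum(counts.values())
    ll = lgamma(total_sites + 1)  # log(l!)
    for i, c in counts.items():
        ll += c * log(probs[i]) - lgamma(c + 1)
    return ll


if __name__ == "__main__":
    n = 5
    constant = {k: 0.001 for k in range(2, n + 1)}
    # Under a constant size the probabilities reduce to theta / i.
    print(sfs_probabilities(constant, n))
```

Working with the log-likelihood avoids numerical underflow when the number of sites is in the hundreds of millions.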

Estimating θs

We used a Java library for numerical optimization called SwarmOps[44] to search for the θs that maximize the composite likelihood of a given SFS. We used a specialized Genetic Algorithm method for real-valued search spaces called Differential Evolution (DE)[45] if the number of sequences was smaller than 200. Otherwise, we used a Pattern Search (PS) method[44,46]. We used default behavior parameters for DE, and 5000×d and 50×d iterations for DE and PS, respectively, where $d$ is the number of different θs to be estimated. As there are a total of $n-1$ different θs that can be estimated, we try to minimize the number of different θs to be estimated by using “break points” to group them. That is, in an ordered series $\theta_2, \theta_3, \ldots, \theta_n$, break points are inserted that separate the θs into contiguous groups. Any two consecutive θs that are not separated by a break point belong to the same group. We assume the θs within the same group have the same value, while those belonging to different groups may have different values. We also modeled the autocorrelation between the values of adjacent groups of θs following previous successful practices[8]. The procedure for finding the best grouping of θs fitting the observed SFS is as follows: (1) Begin with a single θ, i.e. $\theta_2 = \theta_3 = \cdots = \theta_n$. Obtain $L_1$ as the likelihood calculated with this single θ estimation, that is, for a population model of constant size. (2) Increase $d$ by 1; for each candidate break point between $\theta_k$ and $\theta_{k+1}$, let $\theta_l = \theta_k$ for all $l \le k$ and $\theta_m = \theta_{k+1}$ for all $m > k$; use SwarmOps to find the estimates of the two θ values that maximize $L$; calculate the $L$ corresponding to that specific break point and its θ estimates; and find the break point with the largest $L$ and designate that likelihood as $L_2$. The procedure stops if $-2\ln(L_1/L_2) < 3.84$ (i.e. a likelihood ratio test with one degree of freedom and α = 0.05); otherwise, we accept the new split.
(3) Increase $d$ by 1 and repeat the practice: based on the best θ break point(s) associated with $L_{d-1}$, find an additional break point associated with the largest $L$ and designate that likelihood as $L_d$; stop when $-2\ln(L_{d-1}/L_d) < 3.84$. Because this procedure is not an exhaustive search of the whole parameter space, it is not guaranteed to find the global optimum, especially when the underlying true model is complex. Based on our experiments and observations, including three example experiments, the estimation results are typically acceptable approximations of the global optimum.
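
The greedy break-point search can be sketched as follows. This is an illustration under simplifying assumptions rather than the published implementation: a small multiplicative pattern search stands in for SwarmOps, the autocorrelation between adjacent groups is ignored, the multinomial constant is dropped from the log-likelihood, and all function names are hypothetical.

```python
from math import comb, log


def log_lik(counts, group_thetas, breaks, n):
    """Log-likelihood (constant dropped) of an SFS given grouped thetas.

    counts[i] is the number of sites of size i (i = 0..n-1); a break k
    means theta changes between theta_k and theta_{k+1}.
    """
    theta, g = {}, 0
    for k in range(2, n + 1):
        theta[k] = group_thetas[g]
        if k in breaks:
            g += 1
    probs = [0.0] * n
    for i in range(1, n):
        probs[i] = sum(
            theta[k] / (k - 1) * comb(n - i - 1, k - 2) / comb(n - 1, k - 1)
            for k in range(2, n - i + 2)
        )
    probs[0] = 1.0 - sum(probs[1:])
    if min(probs) <= 0.0:
        return float("-inf")
    return sum(c * log(p) for c, p in zip(counts, probs))


def pattern_search(f, x0, step=2.0, tol=1e-4):
    """Maximize f over positive vectors with multiplicative coordinate moves."""
    x, best = list(x0), f(list(x0))
    while step - 1.0 > tol:
        improved = False
        for j in range(len(x)):
            for factor in (step, 1.0 / step):
                trial = x[:j] + [x[j] * factor] + x[j + 1:]
                val = f(trial)
                if val > best:
                    x, best, improved = trial, val, True
        if not improved:
            step = 1.0 + (step - 1.0) / 2.0  # shrink the move size
    return x, best


def stairway_groups(counts, n, threshold=3.84):
    """Add break points greedily until the likelihood ratio test fails."""
    start = sum(counts[1:]) / sum(counts)  # rough starting value for theta
    breaks = []
    thetas, best = pattern_search(lambda t: log_lik(counts, t, [], n), [start])
    while len(breaks) < n - 2:
        candidates = []
        for k in range(2, n):
            if k in breaks:
                continue
            trial = sorted(breaks + [k])
            pos = trial.index(k)
            x0 = thetas[:pos + 1] + thetas[pos:]  # split one group in two
            t, ll = pattern_search(lambda x, b=trial: log_lik(counts, x, b, n), x0)
            candidates.append((ll, trial, t))
        ll, trial, t = max(candidates)
        if 2.0 * (ll - best) < threshold:  # -2 ln(L_{d-1} / L_d) < 3.84
            break
        breaks, thetas, best = trial, t, ll
    return breaks, thetas, best
```

Calling `stairway_groups(counts, n)` returns the accepted break points, one θ per group, and the final log-likelihood. Each accepted break point adds one free parameter, which is why every comparison uses the one-degree-of-freedom threshold of 3.84.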

Determining the population size at a given time point

Without loss of generality, we use θ to measure population size and mutations per bp to measure time (counted back from the time point when the sample was taken). They can be easily converted to the number of individuals and the number of generations if divided by 4μ and μ, respectively. Given $\theta_k$ per bp, the expected length of $t_k$ is $\theta_k/(k(k-1))$. Let

$$T_k = \sum_{j=k}^{n} \frac{\theta_j}{j(j-1)}$$

(with $T_{n+1} = 0$); then the stairway plot infers that θ at a time $T$ with $T_k < T \le T_{k-1}$ equals $\theta_{k-1}$.
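
The mapping from θ estimates to the steps of the plot can be sketched as below, a minimal illustration that assumes the default mutation rate and generation time used in this paper; the function name `stairway_steps` and the tuple layout are hypothetical.

```python
def stairway_steps(thetas, n, mu=1.2e-8, gen_years=24.0):
    """Turn theta_k (k = 2..n, per bp) into plot steps.

    Returns a list of (start_years, end_years, individuals) ordered from
    the most recent step (k = n) to the most ancient one (k = 2).
    """
    steps = []
    t_start = 0.0  # time in mutations per bp, counted back from sampling
    for k in range(n, 1, -1):
        # Expected length of the k-lineage stage in mutations per bp.
        t_end = t_start + thetas[k] / (k * (k - 1))
        steps.append((
            t_start / mu * gen_years,  # mutations per bp -> generations -> years
            t_end / mu * gen_years,
            thetas[k] / (4.0 * mu),    # theta -> number of individuals
        ))
        t_start = t_end
    return steps


if __name__ == "__main__":
    # Constant population of 10,000 individuals sampled with n = 3 sequences.
    theta = 4 * 10_000 * 1.2e-8
    for step in stairway_steps({2: theta, 3: theta}, n=3):
        print(step)
```

No population size is reported beyond the end of the last step, which is one reason the plot has a shorter time span than PSMC.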

PSMC estimation

The PSMC estimations were conducted using the default parameters tuned for human populations. To measure its dispersion, for each simulated sample or bootstrap sample of multiple individuals, we inferred population size changes using PSMC. Then, at each time point along the population history, we calculated the 2.5th and 97.5th percentiles of the population size estimations from all inferred histories.
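
Summarizing many inferred histories at common time points can be sketched as follows. The step representation, the half-open interval convention, and the linear-interpolation percentile rule are assumptions made for illustration; the source does not state which percentile definition was used.

```python
from statistics import median


def size_at(history, t):
    """Population size of a step-function history at time t, or None if
    t lies outside the inferred time span.

    history is a list of (start, end, size) steps.
    """
    for start, end, size in history:
        if start <= t < end:
            return size
    return None


def percentile(sorted_vals, q):
    """Percentile with linear interpolation between order statistics."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def summarize(histories, times):
    """Return (time, 2.5th percentile, median, 97.5th percentile) rows."""
    rows = []
    for t in times:
        vals = sorted(
            v for v in (size_at(h, t) for h in histories) if v is not None
        )
        if vals:  # skip time points no history covers
            rows.append((t, percentile(vals, 2.5), median(vals), percentile(vals, 97.5)))
    return rows
```

The same summary applies to the 200 simulated replicates and to the 200 bootstrap replicates of the real data.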

The simulation data

Sequence data were simulated using either the ms[26] or MaCS[27] software. Detailed simulation commands are given with the original publication. If not specified, all sequences were simulated assuming a mutation rate (μ) of 1.2 × 10⁻⁸ per bp per generation[28-30] and a recombination rate of ρ = 0.8μ per bp per generation. Please note that we used a smaller estimate of the recombination rate, as a recent study suggested that the average recombination rate for humans is about the same as the mutation rate[47].

The 1000 Genomes Project data

The 1000 Genomes Project phase 1 whole genome SNP calls of the nine populations (LWK, YRI, CEU, GBR, TSI, FIN, CHB, CHS, JPT) were downloaded from the 1000 Genomes Project FTP sites. Regions that are within 50 kb of any known coding genes (based on the RefSeq database[33]) or that are outside the 1000 Genomes Project phase 1 strict mask were removed. Sites whose ancestral alleles were not inferred with high confidence based on the 1000 Genomes Project phase 1 annotation were also removed. The total number of sites in the human genome that passed our filtering is 650,351,035. For each population we calculated the SFS only from the retained sites. Because intergenic regions were sequenced with low depth, many of the alleles with low frequencies were not observed. We adjusted the first 20 minor allele frequency bins of each SFS for each population to obtain the most likely true SFS, using the empirical transition probabilities that were based on the SFS of the high-depth sequence data of the exome regions and the SFS of the low-depth sequence data of the same regions.
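
Tallying an unfolded SFS from the retained sites can be sketched as follows. The input format (one derived-allele count per site) and the choice to place sites fixed for the derived allele in the size-0 bin are assumptions for illustration; the low-depth adjustment described above is not reproduced here.

```python
from collections import Counter


def unfolded_sfs(derived_counts, n):
    """Tally sites by the number of derived alleles among n sequences.

    Returns a list xi where xi[i] is the number of sites of size i for
    i = 0..n-1. Sites with 0 or n derived copies are monomorphic in the
    sample and are both counted in xi[0] (an assumption of this sketch).
    """
    tally = Counter(derived_counts)
    xi = [tally.get(i, 0) for i in range(n)]
    xi[0] += tally.get(n, 0)
    return xi


if __name__ == "__main__":
    # Seven hypothetical sites in a sample of four sequences.
    print(unfolded_sfs([0, 1, 1, 2, 4, 0, 3], n=4))
```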

43 in total

1. Calibrating a coalescent simulation of human genome sequence variation.

Authors: Stephen F Schaffner; Catherine Foo; Stacey Gabriel; David Reich; Mark J Daly; David Altshuler
Journal: Genome Res Date: 2005-11 Impact factor: 9.043

2. Fast and flexible simulation of DNA sequence data.

Authors: Gary K Chen; Paul Marjoram; Jeffrey D Wall
Journal: Genome Res Date: 2008-11-24 Impact factor: 9.043

3. Power of deep, all-exon resequencing for discovery of human trait genes.

Authors: Gregory V Kryukov; Alexander Shpunt; John A Stamatoyannopoulos; Shamil R Sunyaev
Journal: Proc Natl Acad Sci U S A Date: 2009-02-06 Impact factor: 11.205

4. Genetic markers and population history: Finland revisited.

Authors: Jukka U Palo; Ismo Ulmanen; Matti Lukka; Pekka Ellonen; Antti Sajantila
Journal: Eur J Hum Genet Date: 2009-04-15 Impact factor: 4.246

5. Smooth skyride through a rough skyline: Bayesian coalescent-based inference of population dynamics.

Authors: Vladimir N Minin; Erik W Bloomquist; Marc A Suchard
Journal: Mol Biol Evol Date: 2008-04-11 Impact factor: 16.240

6. Bayesian coalescent inference of major human mitochondrial DNA haplogroup expansions in Africa.

Authors: Quentin D Atkinson; Russell D Gray; Alexei J Drummond
Journal: Proc Biol Sci Date: 2009-01-22 Impact factor: 5.349

7. Generation time and effective population size in Polar Eskimos.

Authors: Shuichi Matsumura; Peter Forster
Journal: Proc Biol Sci Date: 2008-07-07 Impact factor: 5.349

Review 8. Reconstructing human origins in the genomic era.

Authors: Daniel Garrigan; Michael F Hammer
Journal: Nat Rev Genet Date: 2006-09 Impact factor: 53.242

9. Bayesian inference of population size history from multiple loci.

Authors: Joseph Heled; Alexei J Drummond
Journal: BMC Evol Biol Date: 2008-10-23 Impact factor: 3.260

10. Assessing the evolutionary impact of amino acid mutations in the human genome.

Authors: Adam R Boyko; Scott H Williamson; Amit R Indap; Jeremiah D Degenhardt; Ryan D Hernandez; Kirk E Lohmueller; Mark D Adams; Steffen Schmidt; John J Sninsky; Shamil R Sunyaev; Thomas J White; Rasmus Nielsen; Andrew G Clark; Carlos D Bustamante
Journal: PLoS Genet Date: 2008-05-30 Impact factor: 5.917
