Fundamental limits on the accuracy of demographic inference based on the sample frequency spectrum.

Jonathan Terhorst, Yun S Song.

Abstract

The sample frequency spectrum (SFS) of DNA sequences from a collection of individuals is a summary statistic that is commonly used for parametric inference in population genetics. Despite the popularity of SFS-based inference methods, little is currently known about the information theoretic limit on the estimation accuracy as a function of sample size. Here, we show that using the SFS to estimate the size history of a population has a minimax error of at least O(1/log s), where s is the number of independent segregating sites used in the analysis. This rate is exponentially worse than known convergence rates for many classical estimation problems in statistics. Another surprising aspect of our theoretical bound is that it does not depend on the dimension of the SFS, which is related to the number of sampled individuals. This means that, for a fixed number s of segregating sites considered, using more individuals does not help to reduce the minimax error bound. Our result pertains to populations that have experienced a bottleneck, and we argue that it can be expected to apply to many populations in nature.
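To make the abstract's two central objects concrete, the following is a minimal sketch, not taken from the paper. It assumes haplotypes are encoded as 0/1 lists (1 = derived allele), and it uses 1/sqrt(s) as a stand-in for the "classical" convergence rates the abstract alludes to. The function names, the toy data, and the choice to ignore constant factors are all illustrative assumptions.

```python
import math


def sample_frequency_spectrum(haplotypes):
    """Return [c_1, ..., c_{n-1}], where c_k is the number of sites at which
    exactly k of the n sampled haplotypes carry the derived allele (coded 1)."""
    n = len(haplotypes)
    sfs = [0] * (n - 1)
    for site in zip(*haplotypes):  # iterate over columns, i.e. sites
        k = sum(site)
        if 0 < k < n:  # only segregating sites contribute
            sfs[k - 1] += 1
    return sfs


def log_rate(s):
    """Illustrative 1/log(s) rate (constants ignored; an assumption)."""
    return 1.0 / math.log(s)


def root_rate(s):
    """Illustrative 1/sqrt(s) parametric rate, used here only for comparison."""
    return 1.0 / math.sqrt(s)


# Hypothetical toy data: 4 haplotypes (rows) observed at 5 sites (columns).
toy_haplotypes = [
    [0, 1, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0],
]
toy_sfs = sample_frequency_spectrum(toy_haplotypes)
print("SFS:", toy_sfs, "segregating sites s =", sum(toy_sfs))

# Compare how slowly 1/log(s) shrinks relative to 1/sqrt(s).
for s in (10**3, 10**6, 10**9):
    print(f"s={s:>10}  1/log(s)={log_rate(s):.4f}  1/sqrt(s)={root_rate(s):.6f}")
```

Under these assumptions, halving a 1/log s error requires squaring the number of segregating sites (from s to s^2), whereas halving a 1/sqrt(s) error requires only four times as many. This is one way to read the phrase "exponentially worse".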

Keywords:  demographic inference; minimax rate; population genetics

Year: 2015; PMID: 26056264; PMCID: PMC4485089; DOI: 10.1073/pnas.1503717112

Source DB: PubMed; Journal: Proc Natl Acad Sci U S A; ISSN: 0027-8424; Impact factor: 11.205


  22 in total

1.  The Site Frequency Spectrum for General Coalescents.

Authors:  Jeffrey P Spence; John A Kamm; Yun S Song
Journal:  Genetics       Date:  2016-02-16       Impact factor: 4.562

2.  Accuracy of Demographic Inferences from the Site Frequency Spectrum: The Case of the Yoruba Population.

Authors:  Marguerite Lapierre; Amaury Lambert; Guillaume Achaz
Journal:  Genetics       Date:  2017-03-24       Impact factor: 4.562

3.  Geometry of the Sample Frequency Spectrum and the Perils of Demographic Inference.

Authors:  Zvi Rosen; Anand Bhaskar; Sebastien Roch; Yun S Song
Journal:  Genetics       Date:  2018-07-31       Impact factor: 4.562

4.  Coalescence times for three genes provide sufficient information to distinguish population structure from population size changes.

Authors:  Simona Grusea; Willy Rodríguez; Didier Pinchon; Lounès Chikhi; Simon Boitard; Olivier Mazet
Journal:  J Math Biol       Date:  2018-07-20       Impact factor: 2.259

5.  Coalescent Processes with Skewed Offspring Distributions and Nonequilibrium Demography.

Authors:  Sebastian Matuszewski; Marcel E Hildebrandt; Guillaume Achaz; Jeffrey D Jensen
Journal:  Genetics       Date:  2017-11-10       Impact factor: 4.562

6.  Inference of complex population histories using whole-genome sequences from multiple populations.

Authors:  Matthias Steinrücken; Jack Kamm; Jeffrey P Spence; Yun S Song
Journal:  Proc Natl Acad Sci U S A       Date:  2019-08-06       Impact factor: 11.205

Review 7.  Inference of population history using coalescent HMMs: review and outlook.

Authors:  Jeffrey P Spence; Matthias Steinrücken; Jonathan Terhorst; Yun S Song
Journal:  Curr Opin Genet Dev       Date:  2018-07-26       Impact factor: 5.578

8.  Robust and scalable inference of population history from hundreds of unphased whole genomes.

Authors:  Jonathan Terhorst; John A Kamm; Yun S Song
Journal:  Nat Genet       Date:  2016-12-26       Impact factor: 38.330

9.  GADMA: Genetic algorithm for inferring demographic history of multiple populations from allele frequency spectrum data.

Authors:  Ekaterina Noskova; Vladimir Ulyantsev; Klaus-Peter Koepfli; Stephen J O'Brien; Pavel Dobrynin
Journal:  Gigascience       Date:  2020-03-01       Impact factor: 6.524

10.  Efficiently inferring the demographic history of many populations with allele count data.

Authors:  Jack Kamm; Jonathan Terhorst; Richard Durbin; Yun S Song
Journal:  J Am Stat Assoc       Date:  2019-07-22       Impact factor: 5.033
