DESCARTES' RULE OF SIGNS AND THE IDENTIFIABILITY OF POPULATION DEMOGRAPHIC MODELS FROM GENOMIC VARIATION DATA.

Anand Bhaskar, Yun S Song.

Abstract

The sample frequency spectrum (SFS) is a widely used summary statistic of genomic variation in a sample of homologous DNA sequences. It provides a highly efficient dimensional reduction of large-scale population genomic data, and its mathematical dependence on the underlying population demography is well understood, thus enabling the development of efficient inference algorithms. However, it has recently been shown that very different population demographies can generate the same SFS for arbitrarily large sample sizes. Although in principle this nonidentifiability issue poses a thorny challenge to statistical inference, the population size functions involved in the counterexamples are arguably not biologically realistic. Here, we revisit this problem and examine the identifiability of demographic models under the restriction that the population sizes are piecewise-defined, where each piece belongs to some family of biologically motivated functions. Under this assumption, we prove that the expected SFS of a sample uniquely determines the underlying demographic model, provided that the sample is sufficiently large. We obtain a general bound on the sample size sufficient for identifiability; the bound depends on the number of pieces in the demographic model and also on the type of population size function in each piece. In the cases of piecewise-constant, piecewise-exponential and piecewise-generalized-exponential models, which are often assumed in population genomic inferences, we provide explicit formulas for the bounds as simple functions of the number of pieces. Lastly, we obtain analogous results for the "folded" SFS, which is often used when there is ambiguity as to which allelic type is ancestral. Our results are proved using a generalization of Descartes' rule of signs for polynomials to the Laplace transform of piecewise continuous functions.
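
For orientation, the classical rule that the abstract refers to states that the number of positive real roots of a real polynomial (counted with multiplicity) is at most the number of sign changes in its coefficient sequence, and differs from it by an even number. The following is a minimal sketch of that classical polynomial statement only; it does not implement the Laplace-transform generalization used in the paper, and the function names are illustrative assumptions rather than anything taken from the authors.

```python
def sign_changes(coeffs):
    """Count sign changes in a coefficient sequence, skipping zero entries.

    `coeffs` lists the polynomial coefficients in order of degree
    (either ascending or descending gives the same count).
    """
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def possible_positive_root_counts(coeffs):
    """Counts of positive real roots allowed by the classical rule.

    The rule permits v, v - 2, v - 4, ... down to 0 or 1,
    where v is the number of sign changes.
    """
    v = sign_changes(coeffs)
    return list(range(v, -1, -2))


# (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6 has three positive roots.
print(possible_positive_root_counts([1, -6, 11, -6]))  # [3, 1]

# x^2 + 1 has no sign changes, hence no positive real roots.
print(possible_positive_root_counts([1, 0, 1]))  # [0]
```

The rule gives only an upper bound with a parity constraint: for x^2 - 2x + 2 the coefficient sequence has two sign changes, yet the polynomial has no real roots at all, which is consistent with the allowed counts [2, 0].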

Keywords: Population genetics; coalescent theory; frequency spectrum; identifiability; population size

MSC classification: Primary 62B10; secondary 92D15

Year: 2014; PMID: 28018011; PMCID: PMC5175586; DOI: 10.1214/14-AOS1264

Source DB: PubMed; Journal: Ann Stat; ISSN: 0090-5364; Impact factor: 4.028


