
Inferring population mutation rate and sequencing error rate using the SNP frequency spectrum in a sample of DNA sequences.

Xiaoming Liu, Taylor J Maxwell, Eric Boerwinkle, Yun-Xin Fu.

Abstract

One challenge in analyzing samples of DNA sequences is accounting for the non-negligible number of polymorphisms produced by sequencing error when the error rate is high or the sample size is large. These artificial sequence variations bias the observed single nucleotide polymorphism (SNP) frequency spectrum, which in turn may bias estimators of the population mutation rate θ = 4Nμ for diploids. In this paper, we propose a new approach based on the generalized least squares (GLS) method to estimate θ, given the SNP frequency spectrum of a random sample of DNA sequences from a population. With this approach, the error rate ε can be either known or unknown; in the latter case, ε can be estimated given an estimate of θ. Using coalescent simulation, we compared our estimators with other estimators of θ. The results showed that the GLS estimators are more efficient than other θ estimators in the presence of sequencing error, and that the estimate of ε is usable in practice when θ per bp is small. We demonstrate the application of the estimators with sequences of a 10-kb noncoding region sampled from a human population and provide suggestions for choosing θ estimators in the presence of sequencing error.
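
The bias described in the abstract can be illustrated with a minimal Python sketch. It does not implement the paper's GLS estimator. It uses two textbook estimators (Watterson's, based on the number of segregating sites, and Tajima's, based on pairwise differences) and a simple singleton-excluding variant of Watterson's estimator, all computed from an unfolded SNP frequency spectrum. The assumptions are loud ones: the spectrum is set to its neutral expectation E[xi_i] = theta / i, sequencing errors are assumed to add only extra singletons, and the values of `theta` and `error_singletons` are hypothetical.

```python
def harmonic(n):
    """a_n = sum_{i=1}^{n-1} 1/i for a sample of n sequences."""
    return sum(1.0 / i for i in range(1, n))


def watterson_theta(sfs):
    """Watterson's estimator: number of segregating sites divided by a_n.

    sfs[i - 1] is the number of sites where the derived allele
    appears in exactly i of the n sequences (i = 1 .. n-1).
    """
    n = len(sfs) + 1
    return sum(sfs) / harmonic(n)


def tajima_pi(sfs):
    """Tajima's estimator: mean number of pairwise differences."""
    n = len(sfs) + 1
    pairs = n * (n - 1) / 2
    return sum(i * (n - i) * x for i, x in enumerate(sfs, start=1)) / pairs


def watterson_theta_no_singletons(sfs):
    """Watterson-type estimator that ignores the singleton class.

    Under neutrality E[S - xi_1] = theta * (a_n - 1), so dividing by
    (a_n - 1) keeps the estimator unbiased in the error-free case.
    """
    n = len(sfs) + 1
    return sum(sfs[1:]) / (harmonic(n) - 1.0)


# Hypothetical illustration values (not from the paper).
n = 10                    # sample size
theta = 5.0               # true theta for the whole region
error_singletons = 4.0    # extra singletons assumed to come from errors

# Neutral expectation of the unfolded spectrum: E[xi_i] = theta / i.
expected_sfs = [theta / i for i in range(1, n)]

# Assumption: errors only inflate the singleton class.
noisy_sfs = list(expected_sfs)
noisy_sfs[0] += error_singletons

for label, sfs in (("error-free", expected_sfs), ("with error", noisy_sfs)):
    print(label,
          round(watterson_theta(sfs), 3),
          round(tajima_pi(sfs), 3),
          round(watterson_theta_no_singletons(sfs), 3))
```

Under these assumptions all three estimators recover theta on the error-free spectrum, while on the noisy spectrum Watterson's estimator is inflated more than Tajima's and the singleton-excluding variant is unaffected. Real sequencing errors can also alter other frequency classes, which is one reason a model-based estimator such as the one proposed in the paper may be preferable.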

Year:  2009        PMID: 19318520      PMCID: PMC2734145          DOI: 10.1093/molbev/msp059

Source DB:  PubMed          Journal:  Mol Biol Evol        ISSN: 0737-4038            Impact factor:   16.240


References: 21 in total

1.  A genome sequencing center in every lab.

Authors:  Michael E Zwick
Journal:  Eur J Hum Genet       Date:  2005-11       Impact factor: 4.246

2.  The patterns of natural variation in human genes. (Review)

Authors:  Dana C Crawford; Dayna T Akey; Deborah A Nickerson
Journal:  Annu Rev Genomics Hum Genet       Date:  2005       Impact factor: 8.929

3.  Sequencing errors and molecular evolutionary analysis.

Authors:  A G Clark; T S Whittam
Journal:  Mol Biol Evol       Date:  1992-07       Impact factor: 16.240

4.  Inference of population genetic parameters in metagenomics: a clean look at messy data.

Authors:  Philip L F Johnson; Montgomery Slatkin
Journal:  Genome Res       Date:  2006-09-05       Impact factor: 9.043

5.  Accounting for bias from sequencing error in population genetic estimates.

Authors:  Philip L F Johnson; Montgomery Slatkin
Journal:  Mol Biol Evol       Date:  2007-11-02       Impact factor: 16.240

6.  Incorporating experimental design and error into coalescent/mutation models of population history.

Authors:  Bjarne Knudsen; Michael M Miyamoto
Journal:  Genetics       Date:  2007-06-11       Impact factor: 4.562

7.  Testing for neutrality in samples with sequencing errors.

Authors:  Guillaume Achaz
Journal:  Genetics       Date:  2008-06-18       Impact factor: 4.562

8.  Population genetic analysis of shotgun assemblies of genomic sequences from multiple individuals.

Authors:  Ines Hellmann; Yuan Mang; Zhiping Gu; Peter Li; Francisco M de la Vega; Andrew G Clark; Rasmus Nielsen
Journal:  Genome Res       Date:  2008-04-14       Impact factor: 9.043

9.  Statistical properties of segregating sites.

Authors:  Y X Fu
Journal:  Theor Popul Biol       Date:  1995-10       Impact factor: 1.570

10.  A phylogenetic estimator of effective population size or mutation rate.

Authors:  Y X Fu
Journal:  Genetics       Date:  1994-02       Impact factor: 4.562

Cited by: 7 in total

1.  Characterizing bias in population genetic inferences from low-coverage sequencing data.

Authors:  Eunjung Han; Janet S Sinsheimer; John Novembre
Journal:  Mol Biol Evol       Date:  2013-11-27       Impact factor: 16.240

2.  Measuring the rates of spontaneous mutation from deep and large-scale polymorphism data.

Authors:  Philipp W Messer
Journal:  Genetics       Date:  2009-06-15       Impact factor: 4.562

3.  Estimating population genetic parameters and comparing model goodness-of-fit using DNA sequences with error.

Authors:  Xiaoming Liu; Yun-Xin Fu; Taylor J Maxwell; Eric Boerwinkle
Journal:  Genome Res       Date:  2009-12-01       Impact factor: 9.043

4.  jPopGen Suite: population genetic analysis of DNA polymorphism from nucleotide sequences with errors.

Authors:  Xiaoming Liu
Journal:  Methods Ecol Evol       Date:  2012-03-02       Impact factor: 7.781

5.  Site frequency spectra from genomic SNP surveys.

Authors:  Ganeshkumar Ganapathy; Marcy K Uyenoyama
Journal:  Theor Popul Biol       Date:  2009-04-14       Impact factor: 1.570

6.  Atypical AT skew in Firmicute genomes results from selection and not from mutation.

Authors:  Catherine A Charneski; Frank Honti; Josephine M Bryant; Laurence D Hurst; Edward J Feil
Journal:  PLoS Genet       Date:  2011-09-15       Impact factor: 5.917

7.  Estimating DNA polymorphism from next generation sequencing data with high error rate by dual sequencing applications.

Authors:  Ziwen He; Xinnian Li; Shaoping Ling; Yun-Xin Fu; Eric Hungate; Suhua Shi; Chung-I Wu
Journal:  BMC Genomics       Date:  2013-08-07       Impact factor: 3.969
