
Methods to impute missing genotypes for population data.

Zhaoxia Yu, Daniel J Schaid.

Abstract

For large-scale genotyping studies, it is common for most subjects to have some missing genetic markers, even if the missing rate per marker is low. This compromises association analyses, because different numbers of subjects contribute to single-marker and multi-marker analyses. In this paper, we consider eight methods to infer missing genotypes: two haplotype reconstruction methods (local expectation maximization, EM, and fastPHASE), two k-nearest neighbor methods (the original k-nearest neighbor, KNN, and a weighted k-nearest neighbor, wtKNN), three linear regression methods (backward variable selection, LM.back; least angle regression, LM.lars; and singular value decomposition, LM.svd), and a regression tree, Rtree. We evaluate their accuracy using single nucleotide polymorphism (SNP) data from the HapMap project under a variety of conditions and parameters. We find that fastPHASE has the lowest error rates across different analysis panels and marker densities. LM.lars gives slightly less accurate estimates of missing genotypes than fastPHASE but performs better than the other methods.
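
To make the k-nearest neighbor idea named in the abstract concrete, the following is a minimal Python sketch of KNN genotype imputation, together with a mask-and-compare error rate of the kind an accuracy evaluation might use. It is not the authors' implementation. The 0/1/2 genotype coding, the use of None for a missing call, the distance measure, the majority vote, and the function names (distance, knn_impute, error_rate) are all assumptions made for illustration, and the toy data are invented.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing genotype call


def distance(a, b):
    """Mean absolute genotype difference over markers observed in both subjects."""
    shared = [(x, y) for x, y in zip(a, b) if x is not MISSING and y is not MISSING]
    if not shared:
        return float("inf")  # no shared markers: treat as maximally distant
    return sum(abs(x - y) for x, y in shared) / len(shared)


def knn_impute(genotypes, k):
    """Return a copy of `genotypes` (subjects x markers, coded 0/1/2) in which
    each missing entry is replaced by the most common genotype among the k
    nearest subjects that are typed at that marker."""
    imputed = [row[:] for row in genotypes]
    for i, row in enumerate(genotypes):
        for j, g in enumerate(row):
            if g is not MISSING:
                continue
            # Candidate donors: other subjects with marker j observed.
            donors = [
                (distance(row, other), idx)
                for idx, other in enumerate(genotypes)
                if idx != i and other[j] is not MISSING
            ]
            if not donors:
                continue  # nobody is typed at this marker; leave it missing
            donors.sort()  # nearest first, ties broken by subject index
            votes = Counter(genotypes[idx][j] for _, idx in donors[:k])
            imputed[i][j] = votes.most_common(1)[0][0]
    return imputed


def error_rate(truth, imputed, masked):
    """Fraction of masked (subject, marker) positions imputed incorrectly."""
    wrong = sum(1 for i, j in masked if imputed[i][j] != truth[i][j])
    return wrong / len(masked)


# Toy example (invented data): hide two known genotypes, impute, and score.
truth = [
    [0, 0, 1, 2],
    [0, 0, 1, 2],
    [2, 2, 1, 0],
    [2, 2, 1, 0],
    [0, 0, 1, 2],
]
masked = [(0, 3), (2, 0)]
observed = [row[:] for row in truth]
for i, j in masked:
    observed[i][j] = MISSING

imputed = knn_impute(observed, k=1)
print("error rate:", error_rate(truth, imputed, masked))
```

The sketch measures similarity only on markers that both subjects have observed, so a subject with several missing calls can still be matched to neighbors. The choice of k matters even on this small example, and a weighted variant would replace the plain vote with one that weights donors or markers; the weighting used by wtKNN in the paper is not described in the abstract and is not reproduced here.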

Year: 2007    PMID: 17851696    DOI: 10.1007/s00439-007-0427-y

Source DB: PubMed    Journal: Hum Genet    ISSN: 0340-6717    Impact factor: 4.132


