EMINIM: an adaptive and memory-efficient algorithm for genotype imputation.

Hyun Min Kang, Noah A Zaitlen, Eleazar Eskin.

Abstract

Genome-wide association studies have proven to be a highly successful method for identifying genetic loci for complex phenotypes in both humans and model organisms. These large-scale studies rely on the collection of hundreds of thousands of single nucleotide polymorphisms (SNPs) across the genome. Standard high-throughput genotyping technologies capture only a fraction of the total genetic variation. Recent efforts have shown that it is possible to "impute" with high accuracy the genotypes of SNPs that were not collected in the study, provided that they are present in a reference data set that contains both the SNPs collected in the study and additional SNPs. Here we introduce a novel hidden Markov model (HMM)-based technique for the imputation problem that addresses several shortcomings of existing methods. First, our method is adaptive: it estimates population genetic parameters from the data, so it can be applied to model organisms that have very different evolutionary histories. Compared to previous methods, our method is up to ten times more accurate on model organisms such as mouse. Second, the memory usage of our algorithm scales with the number of collected markers rather than the number of known SNPs. This matters because of the size of the reference data sets currently being generated. We compare our method with existing methods on mouse and human data sets and show that it has comparable or better performance and much lower memory usage. The method is available for download at http://genetics.cs.ucla.edu/eminim.
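
To make the idea of HMM-based imputation concrete, the following is a minimal sketch of a generic haplotype-copying HMM (in the spirit of Li and Stephens-style models). It is not the EMINIM algorithm: the abstract does not give EMINIM's model details, and this sketch is neither adaptive nor memory-efficient. The function name `impute_haplotype` and the parameters `switch_prob` and `error_prob` are hypothetical and are fixed by hand here, whereas the abstract says EMINIM estimates population genetic parameters from the data. The hidden state is the reference haplotype being copied at each marker, and untyped markers are filled in from the forward-backward posterior.

```python
from typing import List, Optional, Sequence


def impute_haplotype(
    reference: Sequence[Sequence[int]],
    observed: Sequence[Optional[int]],
    switch_prob: float = 0.05,   # assumed per-marker chance of switching template
    error_prob: float = 0.01,    # assumed genotyping/mutation mismatch rate
) -> List[float]:
    """Return P(allele == 1) at every marker for one study haplotype.

    reference: K haplotypes x M markers, alleles coded 0/1.
    observed:  M entries, 0/1 where typed, None where untyped.
    """
    n_haps = len(reference)
    n_markers = len(observed)

    def emission(k: int, m: int) -> float:
        # Untyped markers carry no evidence about the hidden state.
        if observed[m] is None:
            return 1.0
        return 1.0 - error_prob if reference[k][m] == observed[m] else error_prob

    def normalize(values: List[float]) -> List[float]:
        total = sum(values)
        return [v / total for v in values]

    # Forward pass (rescaled at each marker to avoid underflow).
    forward = [normalize([emission(k, 0) / n_haps for k in range(n_haps)])]
    for m in range(1, n_markers):
        prev = forward[-1]  # sums to 1, so the switch mass is switch_prob / K
        row = [
            ((1.0 - switch_prob) * prev[k] + switch_prob / n_haps) * emission(k, m)
            for k in range(n_haps)
        ]
        forward.append(normalize(row))

    # Backward pass with the same transition structure.
    backward = [[1.0] * n_haps for _ in range(n_markers)]
    for m in range(n_markers - 2, -1, -1):
        nxt = [backward[m + 1][k] * emission(k, m + 1) for k in range(n_haps)]
        jump = switch_prob * sum(nxt) / n_haps
        backward[m] = normalize(
            [(1.0 - switch_prob) * nxt[k] + jump for k in range(n_haps)]
        )

    # Posterior over templates, then the probability of allele 1 at each marker.
    dosages = []
    for m in range(n_markers):
        post = normalize([forward[m][k] * backward[m][k] for k in range(n_haps)])
        dosages.append(sum(p for k, p in enumerate(post) if reference[k][m] == 1))
    return dosages


if __name__ == "__main__":
    panel = [[0, 0, 0, 0, 0], [1, 1, 1, 1, 1]]   # toy reference panel
    study = [1, None, 1, None, 1]                # typed at markers 0, 2, 4 only
    print(impute_haplotype(panel, study))
```

Note that this sketch keeps a forward table over every marker in the reference panel, so its memory grows with the number of known SNPs. That is the cost the abstract says EMINIM avoids by scaling with the number of collected markers instead.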

Year:  2010        PMID: 20377463      PMCID: PMC3198882          DOI: 10.1089/cmb.2009.0199

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  8 in total

1.  Imputation aware meta-analysis of genome-wide association studies.

Authors:  Noah Zaitlen; Eleazar Eskin
Journal:  Genet Epidemiol       Date:  2010-09       Impact factor: 2.135

2.  Leveraging reads that span multiple single nucleotide polymorphisms for haplotype inference from sequencing data.

Authors:  Wen-Yun Yang; Farhad Hormozdiari; Zhanyong Wang; Dan He; Bogdan Pasaniuc; Eleazar Eskin
Journal:  Bioinformatics       Date:  2013-07-03       Impact factor: 6.937

3.  Hap-seq: an optimal algorithm for haplotype phasing with imputation using sequencing data.

Authors:  Dan He; Buhm Han; Eleazar Eskin
Journal:  J Comput Biol       Date:  2013-02       Impact factor: 1.479

4.  Imputation of single-nucleotide polymorphisms in inbred mice using local phylogeny.

Authors:  Jeremy R Wang; Fernando Pardo-Manuel de Villena; Heather A Lawson; James M Cheverud; Gary A Churchill; Leonard McMillan
Journal:  Genetics       Date:  2012-02       Impact factor: 4.562

5.  Fine-scale maps of recombination rates and hotspots in the mouse genome.

Authors:  Hadassa Brunschwig; Liat Levi; Eyal Ben-David; Robert W Williams; Benjamin Yakir; Sagiv Shifman
Journal:  Genetics       Date:  2012-05-04       Impact factor: 4.562

6.  Quick, "imputation-free" meta-analysis with proxy-SNPs.

Authors:  Christian Meesters; Markus Leber; Christine Herold; Marina Angisch; Manuel Mattheisen; Dmitriy Drichel; André Lacour; Tim Becker
Journal:  BMC Bioinformatics       Date:  2012-09-12       Impact factor: 3.169

7.  Sequence imputation of HPV16 genomes for genetic association studies.

Authors:  Benjamin Smith; Zigui Chen; Laura Reimers; Koenraad van Doorslaer; Mark Schiffman; Rob Desalle; Rolando Herrero; Kai Yu; Sholom Wacholder; Tao Wang; Robert D Burk
Journal:  PLoS One       Date:  2011-06-23       Impact factor: 3.240

8.  Whole-genome haplotyping approaches and genomic medicine.

Authors:  Gustavo Glusman; Hannah C Cox; Jared C Roach
Journal:  Genome Med       Date:  2014-09-25       Impact factor: 11.117
