SNiPer-HD: improved genotype calling accuracy by an expectation-maximization algorithm for high-density SNP arrays.

Jianping Hua, David W Craig, Marcel Brun, Jennifer Webster, Victoria Zismann, Waibhav Tembe, Keta Joshipura, Matthew J Huentelman, Edward R Dougherty, Dietrich A Stephan.

Abstract

MOTIVATION: The technology to genotype single nucleotide polymorphisms (SNPs) at extremely high densities provides for hypothesis-free genome-wide scans for common polymorphisms associated with complex disease. However, we find that some errors introduced by commonly employed genotyping algorithms may lead to inflation of false associations between markers and phenotype.
RESULTS: We have developed a novel SNP genotype calling program, SNiPer-High Density (SNiPer-HD), for highly accurate genotype calling across hundreds of thousands of SNPs. The program employs an expectation-maximization (EM) algorithm with parameters based on a training sample set. This choice of algorithm allows highly accurate genotyping for most SNPs. We also introduce a quality control metric for each assayed SNP, which correlates with genotype class separation in the calling algorithm, so that poorly behaving SNPs can be filtered out. SNiPer-HD is superior to the standard dynamic modeling algorithm and is complementary and non-redundant to other algorithms, such as BRLMM. Implementing multiple algorithms together may provide highly accurate genotype calls, without inflation of false positives due to systematically miscalled SNPs. A reliable and accurate set of SNP genotypes for increasingly dense panels will eliminate some false association signals and false negative signals, allowing for rapid identification of disease susceptibility loci for complex traits.
AVAILABILITY: SNiPer-HD is available at TGen's website: http://www.tgen.org/neurogenomics/data.
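
To illustrate the general idea of EM-based genotype calling with a class-separation quality metric, the following minimal sketch fits a three-component one-dimensional Gaussian mixture (AA, AB, BB) to per-sample allele contrast values for a single SNP. It is not SNiPer-HD's actual model: the abstract does not specify the input features, the mixture form, or the formula of the quality metric. The function names, the contrast scale, the initial parameters (standing in here for values that would be derived from a training sample set), and the separation score are all assumptions made for illustration.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_genotype_calls(contrasts, init_means=(-0.8, 0.0, 0.8),
                      init_sd=0.2, n_iter=50, min_sd=0.02):
    """Fit a 3-component Gaussian mixture by EM and call genotypes.

    init_means/init_sd are hypothetical starting parameters; in a real
    caller they would come from a training sample set.
    """
    k = len(init_means)
    n = len(contrasts)
    means = list(init_means)
    sds = [init_sd] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each genotype class per sample.
        resp = []
        for x in contrasts:
            dens = [weights[j] * normal_pdf(x, means[j], sds[j])
                    for j in range(k)]
            total = sum(dens) or 1e-300  # guard against full underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and spread of each class.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-9:
                continue  # empty class: keep its starting parameters
            means[j] = sum(r[j] * x for r, x in zip(resp, contrasts)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, contrasts)) / nj
            sds[j] = max(math.sqrt(var), min_sd)  # floor avoids collapse
            weights[j] = nj / n
    labels = ("AA", "AB", "BB")
    # Calls use the responsibilities from the final E-step.
    calls = [labels[max(range(k), key=lambda j: r[j])] for r in resp]
    return calls, means, sds


def separation_score(means, sds):
    """Hypothetical QC metric: smallest gap between adjacent class means,
    scaled by the sum of their standard deviations. Low values suggest
    overlapping genotype clusters."""
    order = sorted(range(len(means)), key=lambda j: means[j])
    return min((means[b] - means[a]) / (sds[a] + sds[b])
               for a, b in zip(order, order[1:]))


if __name__ == "__main__":
    # Made-up contrast values for one well-behaved SNP.
    demo = [-0.82, -0.79, -0.85, -0.80, 0.02, -0.03,
            0.01, 0.00, 0.78, 0.81, 0.83, 0.80]
    calls, means, sds = em_genotype_calls(demo)
    print(calls)
    print(separation_score(means, sds))
```

Under these assumptions, a SNP whose separation score falls below some chosen threshold would be flagged for filtering; the threshold itself would have to be calibrated on real data.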

Year:  2006        PMID: 17062589     DOI: 10.1093/bioinformatics/btl536

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


