MOTIVATION: Preliminary results on data produced with the Affymetrix large-scale genotyping platforms show that improved genotype calling algorithms are needed. There is evidence that some existing algorithms lead to an increased error rate in heterozygous genotypes and to a disproportionately large rate of missing genotypes among heterozygotes. Non-random errors and missing data can increase the number of false discoveries in genetic association studies. Therefore, assessing the performance of an algorithm requires evaluating not only the missing data (call) and error rates, but also the proportion of heterozygotes among the missing data and among the errors.

RESULTS: We introduce a novel genotype calling algorithm (GEL) for the Affymetrix GeneChip arrays. The algorithm uses likelihood calculations based on distributions inferred from the observed data. A key ingredient in accurate genotype calling is weighting the information from each probe quartet according to the quality and reliability of the data in the quartet and to prior information on the quartet's performance.

AVAILABILITY: The GEL software is implemented in R and is available by request from the corresponding author at nicolae@galton.uchicago.edu.
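The idea of weighting each probe quartet's contribution to a genotype likelihood can be illustrated with a minimal sketch. This is not the GEL model, which is implemented in R and whose distributions are inferred from the observed data. Everything here is an assumption made for illustration: the per-quartet "relative allele signal", the Gaussian form, the parameter values in `GENOTYPE_PARAMS`, the flat prior, the `min_posterior` threshold, and the function names.

```python
import math

# Hypothetical (mean, sd) of a per-quartet relative allele signal, i.e. the
# fraction of signal attributed to allele A. Illustrative values only; a real
# caller would estimate such distributions from the data.
GENOTYPE_PARAMS = {"AA": (0.85, 0.10), "AB": (0.50, 0.10), "BB": (0.15, 0.10)}


def normal_logpdf(x, mean, sd):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def call_genotype(signals, weights, min_posterior=0.95):
    """Return (call, posteriors) from weighted per-quartet log-likelihoods.

    signals: one relative allele signal per probe quartet (assumed in [0, 1]).
    weights: one non-negative reliability weight per quartet (assumed given).
    call is None (a missing genotype) when no posterior reaches min_posterior.
    """
    if len(signals) != len(weights):
        raise ValueError("signals and weights must have the same length")

    # Weighted sum of log-likelihoods: unreliable quartets count for less.
    loglik = {
        genotype: sum(w * normal_logpdf(x, mean, sd) for x, w in zip(signals, weights))
        for genotype, (mean, sd) in GENOTYPE_PARAMS.items()
    }

    # Normalize under a flat prior, subtracting the maximum for numerical stability.
    top = max(loglik.values())
    total = sum(math.exp(v - top) for v in loglik.values())
    posteriors = {g: math.exp(v - top) / total for g, v in loglik.items()}

    best = max(posteriors, key=posteriors.get)
    call = best if posteriors[best] >= min_posterior else None
    return call, posteriors


# Three consistent quartets and one down-weighted outlier.
call, posteriors = call_genotype([0.50, 0.52, 0.48, 0.90], [1.0, 1.0, 1.0, 0.1])
```

In this sketch the outlying quartet barely moves the result because its weight is small. Raising `min_posterior` trades more missing calls for fewer errors, so the heterozygote share of both should be monitored when tuning it.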