MOTIVATION: Analysis of many thousands of single nucleotide polymorphisms (SNPs) across the whole genome is crucial for efficiently mapping disease genes and for understanding susceptibility to disease, drug efficacy and side effects in different populations and individuals. High-density oligonucleotide microarrays make such analysis possible at reasonable cost. It requires accurate, reliable methods for feature extraction, classification, statistical modeling and filtering.

RESULTS: We propose modified partitioning around medoids as a classification method for relative allele signals. We use the average silhouette width, separation and other quantities as quality measures for genotyping classification. We form robust statistical models based on the classification results and use these models to make genotype calls and to calculate quality measures for the calls. We apply our algorithms to several different genotyping microarrays. We use reference types, informative Mendelian relationships in families, and leave-one-out cross-validation to verify our results. The concordance rates with the single base extension reference types are 99.36% for the SNPs on autosomes and 99.64% for the SNPs on sex chromosomes. The concordance of the leave-one-out test is over 99.5% and is 99.9% higher for AA, AB and BB cells. We also provide a method to determine the gender of a sample based on the heterozygous call rate of SNPs on the X chromosome. See http://www.affymetrix.com for further information. The microarray data will also be available from the Affymetrix web site.

AVAILABILITY: The algorithms will be available commercially in the Affymetrix software package.
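As a rough illustration of the classification step, the sketch below clusters one SNP's relative allele signals into three groups with a plain partitioning-around-medoids swap search and scores the result by average silhouette width. It is not the modified variant proposed in the abstract, whose details are not given here. The one-dimensional signal (assumed to lie between 0 and 1), the medoid initialisation, and the function names are all assumptions made for the example.

```python
from itertools import product

def total_cost(points, medoids):
    """Sum of distances from each point to its nearest medoid."""
    return sum(min(abs(p - points[m]) for m in medoids) for p in points)

def pam(points, k):
    """Plain PAM swap search on 1-D values; returns (medoid indices, labels)."""
    # Initialisation is an assumption: spread the medoids over the sorted values.
    order = sorted(range(len(points)), key=lambda i: points[i])
    medoids = [order[(2 * j + 1) * len(order) // (2 * k)] for j in range(k)]
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        # Try replacing each medoid with each non-medoid; keep swaps that lower the cost.
        for slot, cand in product(range(k), range(len(points))):
            if cand in medoids:
                continue
            trial = medoids[:slot] + [cand] + medoids[slot + 1:]
            cost = total_cost(points, trial)
            if cost < best - 1e-12:
                medoids, best, improved = trial, cost, True
    labels = [min(range(k), key=lambda j: abs(p - points[medoids[j]]))
              for p in points]
    return medoids, labels

def average_silhouette_width(points, labels):
    """Mean of (b - a) / max(a, b) over all points; needs at least two clusters."""
    widths = []
    for i, p in enumerate(points):
        own = [abs(p - q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            widths.append(0.0)  # usual convention for a singleton cluster
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(abs(p - q) for j, q in enumerate(points) if labels[j] == c)
            / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        widths.append((b - a) / (max(a, b) or 1.0))
    return sum(widths) / len(widths)

# Hypothetical relative allele signals for one SNP across nine samples.
signals = [0.05, 0.08, 0.10, 0.48, 0.50, 0.53, 0.90, 0.93, 0.96]
medoids, labels = pam(signals, k=3)
print(labels, round(average_silhouette_width(signals, labels), 3))
```

An average silhouette width close to 1 indicates tight, well-separated clusters. A low value would mark the SNP's clustering as unreliable, which is the sense in which the abstract uses it as a quality measure.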
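The gender check mentioned in the abstract relies on males carrying a single X chromosome, so heterozygous (AB) calls on X should be rare for male samples. A minimal sketch of that idea follows. The call labels, the treatment of no-calls, and especially the 0.05 threshold are illustrative assumptions, not values taken from the paper.

```python
def x_het_rate(calls):
    """Fraction of AB calls among called X-chromosome SNPs; no-calls are skipped."""
    called = [c for c in calls if c in ("AA", "AB", "BB")]
    if not called:
        raise ValueError("no genotype calls on the X chromosome")
    return called.count("AB") / len(called)

def infer_gender(calls, threshold=0.05):
    """Call 'female' when the X heterozygous rate exceeds a hypothetical threshold."""
    return "female" if x_het_rate(calls) > threshold else "male"

# Hypothetical X-chromosome calls for two samples.
print(infer_gender(["AA", "BB", "AA", "NoCall", "BB"]))
print(infer_gender(["AA", "AB", "BB", "AB"]))
```

In practice the threshold would have to be tuned on samples of known gender, since genotyping errors can produce occasional AB calls in males.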