MOTIVATION: In both genome-wide association studies (GWAS) and pathway analysis, the modest sample size relative to the number of genetic markers presents formidable computational, statistical and methodological challenges for accurately identifying markers/interactions and for building phenotype-predictive models. RESULTS: We address these objectives via maximum entropy conditional probability modeling (MECPM), coupled with a novel model structure search. Unlike neural networks and support vector machines (SVMs), MECPM makes explicit, and is determined by, the interactions that confer phenotype-predictive power. Our method identifies both a marker subset and the multiple k-way interactions between these markers. Additional key aspects are: (i) evaluation of a select subset of up to five-way interactions while retaining relatively low complexity; (ii) flexible single nucleotide polymorphism (SNP) coding (dominant, recessive) within each interaction; (iii) no mathematical interaction form assumed; (iv) model structure and order selection based on the Bayesian Information Criterion, which fairly compares interactions at different orders and automatically sets the experiment-wide significance level; (v) MECPM directly yields a phenotype-predictive model. MECPM was compared with a panel of methods on datasets with up to 1000 SNPs and up to eight embedded penetrance function (i.e. ground-truth) interactions, including a five-way interaction, involving fewer than 20 SNPs. MECPM achieved improved sensitivity and specificity for detecting both ground-truth markers and interactions, compared with previous methods. AVAILABILITY: http://www.cbil.ece.vt.edu/ResearchOngoingSNP.htm
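As a rough illustration of aspects (ii) and (iv), the sketch below shows one way dominant/recessive SNP coding, a k-way interaction indicator, and a BIC score could be written. It is not the MECPM implementation: the genotype encoding (minor-allele count 0, 1 or 2), the function names `code_snp`, `interaction_feature` and `bic`, and the use of the textbook BIC form are all assumptions made for this example.

```python
import math


def code_snp(genotype: int, mode: str) -> int:
    """Binarize one SNP genotype (assumed to be a minor-allele count 0/1/2)."""
    if mode == "dominant":
        # Dominant coding: at least one copy of the minor allele.
        return int(genotype >= 1)
    if mode == "recessive":
        # Recessive coding: two copies of the minor allele.
        return int(genotype == 2)
    raise ValueError(f"unknown coding mode: {mode!r}")


def interaction_feature(genotypes, terms) -> int:
    """Indicator for a k-way interaction.

    `terms` is a list of (snp_index, mode) pairs, so each SNP in the
    interaction can carry its own coding. Returns 1 only if every coded
    SNP in the interaction equals 1.
    """
    return int(all(code_snp(genotypes[i], mode) for i, mode in terms))


def bic(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """Textbook BIC: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


# Hypothetical individual with genotypes at three SNPs.
person = [2, 1, 0]
two_way = [(0, "recessive"), (1, "dominant")]
print(interaction_feature(person, two_way))

# At equal fit, the model with more parameters receives the larger (worse) BIC.
print(bic(-100.0, 3, 200) < bic(-100.0, 5, 200))
```

Because the penalty term grows with the number of parameters, a higher-order interaction is retained under such a score only if it improves the likelihood enough to offset the added complexity, which is the sense in which a BIC-style criterion can compare interactions of different orders.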
Authors: Zaher Dawy; Bernhard Goebel; Joachim Hagenauer; Christophe Andreoli; Thomas Meitinger; Jakob C Mueller Journal: IEEE/ACM Trans Comput Biol Bioinform Date: 2006 Jan-Mar Impact factor: 3.710
Authors: Charles Kooperberg; Joshua C Bis; Kristin D Marciante; Susan R Heckbert; Thomas Lumley; Bruce M Psaty Journal: Am J Epidemiol Date: 2006-11-02 Impact factor: 4.897
Authors: David J Hunter; Peter Kraft; Kevin B Jacobs; David G Cox; Meredith Yeager; Susan E Hankinson; Sholom Wacholder; Zhaoming Wang; Robert Welch; Amy Hutchinson; Junwen Wang; Kai Yu; Nilanjan Chatterjee; Nick Orr; Walter C Willett; Graham A Colditz; Regina G Ziegler; Christine D Berg; Saundra S Buys; Catherine A McCarty; Heather Spencer Feigelson; Eugenia E Calle; Michael J Thun; Richard B Hayes; Margaret Tucker; Daniela S Gerhard; Joseph F Fraumeni; Robert N Hoover; Gilles Thomas; Stephen J Chanock Journal: Nat Genet Date: 2007-05-27 Impact factor: 38.330
Authors: Casimiro Castillejo-López; Angélica M Delgado-Vega; Jerome Wojcik; Sergey V Kozyrev; Elangovan Thavathiru; Ying-Yu Wu; Elena Sánchez; David Pöllmann; Juan R López-Egido; Serena Fineschi; Nicolás Domínguez; Rufei Lu; Judith A James; Joan T Merrill; Jennifer A Kelly; Kenneth M Kaufman; Kathy L Moser; Gary Gilkeson; Johan Frostegård; Bernardo A Pons-Estel; Sandra D'Alfonso; Torsten Witte; José Luis Callejas; John B Harley; Patrick M Gaffney; Javier Martin; Joel M Guthridge; Marta E Alarcón-Riquelme Journal: Ann Rheum Dis Date: 2011-10-06 Impact factor: 19.103