An algorithm for learning maximum entropy probability models of disease risk that efficiently searches and sparingly encodes multilocus genomic interactions.

David J Miller, Yanxin Zhang, Guoqiang Yu, Yongmei Liu, Li Chen, Carl D Langefeld, David Herrington, Yue Wang.

Abstract

MOTIVATION: In both genome-wide association studies (GWAS) and pathway analysis, the modest sample size relative to the number of genetic markers presents formidable computational, statistical and methodological challenges for accurately identifying markers/interactions and for building phenotype-predictive models.
RESULTS: We address these objectives via maximum entropy conditional probability modeling (MECPM), coupled with a novel model structure search. Unlike neural networks and support vector machines (SVMs), MECPM makes explicit, and is determined by, the interactions that confer phenotype-predictive power. Our method identifies both a marker subset and the multiple k-way interactions between these markers. Additional key aspects are: (i) evaluation of a select subset of up to five-way interactions while retaining relatively low complexity; (ii) flexible single nucleotide polymorphism (SNP) coding (dominant, recessive) within each interaction; (iii) no mathematical interaction form assumed; (iv) model structure and order selection based on the Bayesian Information Criterion, which fairly compares interactions at different orders and automatically sets the experiment-wide significance level; (v) MECPM directly yields a phenotype-predictive model. MECPM was compared with a panel of methods on datasets with up to 1000 SNPs and up to eight embedded penetrance function (i.e. ground-truth) interactions, including a five-way interaction, involving fewer than 20 SNPs. MECPM achieved improved sensitivity and specificity for detecting both ground-truth markers and interactions, compared with previous methods.
AVAILABILITY: http://www.cbil.ece.vt.edu/ResearchOngoingSNP.htm
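
As a rough illustration of points (ii) and (iv), the sketch below shows one way dominant/recessive SNP coding and a textbook BIC score could be written. It is not the authors' implementation: the function names (`code_snp`, `interaction_feature`, `bic`), the encoding of genotypes as minor-allele counts 0/1/2, and the use of the standard BIC formula are all assumptions made for illustration; the paper's exact criterion and search procedure may differ.

```python
import math


def code_snp(genotype, mode):
    """Binary-code a genotype given as a minor-allele count (0, 1 or 2).

    Assumed convention: 'dominant' fires on one or more minor alleles,
    'recessive' fires only on two.
    """
    if mode == "dominant":
        return 1 if genotype >= 1 else 0
    if mode == "recessive":
        return 1 if genotype == 2 else 0
    raise ValueError(f"unknown coding mode: {mode}")


def interaction_feature(genotypes, terms):
    """Evaluate a hypothetical k-way interaction feature.

    `terms` is a list of (snp_index, mode) pairs; the feature is 1 only
    when every participating SNP is 1 under its own coding.
    """
    return int(all(code_snp(genotypes[i], mode) == 1 for i, mode in terms))


def bic(log_likelihood, n_params, n_samples):
    """Textbook BIC: -2 * log-likelihood + (number of parameters) * ln(sample size).

    Lower is better; the penalty grows with the number of model parameters.
    """
    return -2.0 * log_likelihood + n_params * math.log(n_samples)
```

For example, `interaction_feature([2, 1, 0], [(0, "recessive"), (1, "dominant")])` evaluates a two-way feature in which each SNP carries its own coding, and candidate models with different numbers of interaction terms could then be ranked by `bic`.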

Year:  2009        PMID: 19608708      PMCID: PMC3140808          DOI: 10.1093/bioinformatics/btp435

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


