A comparative analysis of methods for predicting clinical outcomes using high-dimensional genomic datasets.

Xia Jiang, Binghuang Cai, Diyang Xue, Xinghua Lu, Gregory F Cooper, Richard E Neapolitan.

Abstract

OBJECTIVE: The objective of this investigation is to evaluate binary prediction methods for predicting disease status using high-dimensional genomic data. The central hypothesis is that the Bayesian network (BN)-based method called efficient Bayesian multivariate classifier (EBMC) will do well at this task because EBMC builds on BN-based methods that have performed well at learning epistatic interactions.
METHOD: We evaluate how well eight methods perform binary prediction using high-dimensional discrete genomic datasets containing epistatic interactions. The methods are as follows: naive Bayes (NB), model averaging NB (MANB), feature selection NB (FSNB), EBMC, logistic regression (LR), support vector machines (SVM), Lasso, and extreme learning machines (ELM). We use one hundred simulated 1000-single nucleotide polymorphism (SNP) datasets, ten 10,000-SNP datasets, six semi-synthetic datasets, and two real genome-wide association study (GWAS) datasets in our evaluation.
RESULTS: In fivefold cross-validation studies, the SVM performed best on the 1000-SNP datasets, while the BN-based methods performed best on the other datasets, with EBMC exhibiting the best overall performance. In-sample testing indicates that LR, SVM, Lasso, ELM, and NB tend to overfit the data.
DISCUSSION: EBMC performed better than NB when there were several strong predictors, whereas NB performed better when there were many weak predictors. Furthermore, for all BN-based methods, prediction capability did not degrade as the dimension increased.
CONCLUSIONS: Our results support the hypothesis that EBMC performs well at binary outcome prediction using high-dimensional discrete datasets containing epistatic-like interactions. Future research using more GWAS datasets is needed to further investigate the potential of EBMC. Published by the BMJ Publishing Group Limited.
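
The evaluation protocol described in the abstract (fivefold cross-validation compared against in-sample testing on discrete SNP data) can be illustrated with a minimal sketch. This is not the authors' code or their simulation model. The data generator, the risk values 0.8 and 0.2, the dataset size, and all function names below are hypothetical choices made for illustration, and only naive Bayes (one of the eight methods named above) is implemented, using the Python standard library.

```python
import math
import random

def simulate_snp_data(n_samples, n_snps, rng):
    """Hypothetical generator: genotypes in {0, 1, 2}; only SNPs 0 and 1 affect risk."""
    X, y = [], []
    for _ in range(n_samples):
        row = [rng.choice((0, 1, 2)) for _ in range(n_snps)]
        # Assumed epistatic-like rule: risk is raised only if both SNPs carry a minor allele.
        risk = 0.8 if (row[0] > 0 and row[1] > 0) else 0.2
        X.append(row)
        y.append(1 if rng.random() < risk else 0)
    return X, y

def train_nb(X, y, n_values=3, alpha=1.0):
    """Fit a discrete naive Bayes model with Laplace smoothing (alpha)."""
    n_snps = len(X[0])
    class_counts = [0, 0]
    counts = [[[0] * n_values for _ in range(n_snps)] for _ in range(2)]
    for row, label in zip(X, y):
        class_counts[label] += 1
        for j, v in enumerate(row):
            counts[label][j][v] += 1
    log_prior = [math.log((class_counts[c] + alpha) / (len(y) + 2 * alpha))
                 for c in range(2)]
    log_lik = [[[math.log((counts[c][j][v] + alpha)
                          / (class_counts[c] + n_values * alpha))
                 for v in range(n_values)]
                for j in range(n_snps)]
               for c in range(2)]
    return log_prior, log_lik

def predict_score(model, row):
    """Return the log-odds of the positive class for one genotype row."""
    log_prior, log_lik = model
    s = [log_prior[c] + sum(log_lik[c][j][v] for j, v in enumerate(row))
         for c in range(2)]
    return s[1] - s[0]

def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison; ties count as 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def k_fold_auroc(X, y, rng, k=5):
    """Mean held-out AUROC over k folds (k=5 gives fivefold cross-validation)."""
    idx = list(range(len(y)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    aucs = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        model = train_nb([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores = [predict_score(model, X[i]) for i in test_idx]
        aucs.append(auroc(scores, [y[i] for i in test_idx]))
    return sum(aucs) / k

rng = random.Random(0)
X, y = simulate_snp_data(500, 50, rng)

# In-sample testing: train and score on the same data (optimistic by construction).
full_model = train_nb(X, y)
in_sample_auroc = auroc([predict_score(full_model, r) for r in X], y)

# Fivefold cross-validation: every sample is scored by a model that never saw it.
cv_auroc = k_fold_auroc(X, y, rng)

print(f"in-sample AUROC: {in_sample_auroc:.3f}")
print(f"fivefold CV AUROC: {cv_auroc:.3f}")
```

Comparing the two printed values gives a rough sense of how much a method's in-sample score overstates its held-out performance, which is the kind of gap the in-sample testing in RESULTS refers to. The sketch uses AUROC as the performance measure; the abstract does not say which measure the study used.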

Keywords:  Bayesian Classifier; Bayesian Networks; Epistasis; Genomics; High-Dimensional Data; Prediction

Year: 2014   PMID: 24737607   PMCID: PMC4173174   DOI: 10.1136/amiajnl-2013-002358

Source DB: PubMed   Journal: J Am Med Inform Assoc   ISSN: 1067-5027   Impact factor: 4.497


  5 in total

1.  Comparison of machine learning classifiers for influenza detection from emergency department free-text reports.

Authors:  Arturo López Pineda; Ye Ye; Shyam Visweswaran; Gregory F Cooper; Michael M Wagner; Fuchiang Rich Tsui
Journal:  J Biomed Inform       Date:  2015-09-16       Impact factor: 6.317

2.  Novel Application of Junction Trees to the Interpretation of Epigenetic Differences among Lung Cancer Subtypes.

Authors:  Arturo Lopez Pineda; Vanathi Gopalakrishnan
Journal:  AMIA Jt Summits Transl Sci Proc       Date:  2015-03-23

3.  Computational methods for ubiquitination site prediction using physicochemical properties of protein sequences.

Authors:  Binghuang Cai; Xia Jiang
Journal:  BMC Bioinformatics       Date:  2016-03-03       Impact factor: 3.169

4.  On Predicting lung cancer subtypes using 'omic' data from tumor and tumor-adjacent histologically-normal tissue.

Authors:  Arturo López Pineda; Henry Ato Ogoe; Jeya Balaji Balasubramanian; Claudia Rangel Escareño; Shyam Visweswaran; James Gordon Herman; Vanathi Gopalakrishnan
Journal:  BMC Cancer       Date:  2016-03-04       Impact factor: 4.430

5.  Careful feature selection is key in classification of Alzheimer's disease patients based on whole-genome sequencing data.

Authors:  Marlena Osipowicz; Bartek Wilczynski; Magdalena A Machnicka
Journal:  NAR Genom Bioinform       Date:  2021-07-27
