
Learning mixed graphical models with separate sparsity parameters and stability-based model selection.

Andrew J Sedgewick, Ivy Shi, Rory M Donovan, Panayiotis V Benos.

Abstract

BACKGROUND: Mixed graphical models (MGMs) are graphical models learned over a combination of continuous and discrete variables. Mixed variable types are common in biomedical datasets. MGMs consist of a parameterized joint probability density, which implies a network structure over these heterogeneous variables. The network structure reveals direct associations between the variables, and the joint probability density allows one to ask arbitrary probabilistic questions about the data. This information can be used for feature selection, classification and other important tasks.
RESULTS: We studied the properties of MGM learning and applications of MGMs to high-dimensional data (biological and simulated). Our results show that MGMs reliably uncover the underlying graph structure, and when used for classification, their performance is comparable to popular discriminative methods (lasso regression and support vector machines). We also show that imposing separate sparsity penalties for edges connecting different types of variables significantly improves edge recovery performance. To choose these sparsity parameters, we propose a new efficient model selection method, named Stable Edge-specific Penalty Selection (StEPS). StEPS is an extension of an earlier method, StARS, to mixed variable types. In terms of edge recovery, StEPS-selected MGMs outperform models selected using standard techniques, including AIC, BIC and cross-validation. In addition, we use a heuristic search that is linear in the size of the sparsity value search space, as opposed to the cubic grid search required by other model selection methods. We applied our method to clinical and mRNA expression data from the Lung Genomics Research Consortium (LGRC), and the learned MGM correctly recovered connections between the diagnosis of obstructive or interstitial lung disease, two diagnostic breathing tests, and cigarette smoking history. Our model also suggested biologically relevant mRNA markers that are linked to these three clinical variables.
CONCLUSIONS: MGMs are able to accurately recover dependencies between sets of continuous and discrete variables in both simulated and biomedical datasets. Separation of sparsity penalties by edge type is essential for accurate network edge recovery. Furthermore, our stability-based method for model selection determines sparsity parameters faster and more accurately (in terms of edge recovery) than other model selection methods. With the ongoing availability of comprehensive clinical and biomedical datasets, MGMs are expected to become a valuable tool for investigating disease mechanisms and answering an array of critical healthcare questions.
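
ILLUSTRATION: The abstract does not give the StEPS procedure in detail, so the following is only a minimal sketch of the stability idea it builds on, not the authors' implementation. It assumes the StARS-style instability score (the mean of 2θ(1−θ) over edges, where θ is the fraction of subsample graphs containing an edge) and assumes that StEPS computes this score separately for continuous-continuous, continuous-discrete and discrete-discrete edges so that each penalty can be chosen on its own. The function names, the edge-type labels "cc", "cd", "dd" and the threshold `beta=0.05` are hypothetical choices made for this example.

```python
from itertools import combinations


def edge_instability(adjacencies, edges):
    """Mean of 2*theta*(1-theta) over `edges`; theta is the fraction of
    subsample graphs (adjacency matrices) in which the edge is present."""
    if not edges:
        return 0.0
    n = len(adjacencies)
    total = 0.0
    for i, j in edges:
        theta = sum(1 for adj in adjacencies if adj[i][j]) / n
        total += 2.0 * theta * (1.0 - theta)
    return total / len(edges)


def instability_by_edge_type(adjacencies, is_discrete):
    """Split variable pairs by type and score each group separately.
    Assumption: this mirrors the 'edge-specific' part of StEPS."""
    groups = {"cc": [], "cd": [], "dd": []}
    for i, j in combinations(range(len(is_discrete)), 2):
        n_disc = int(is_discrete[i]) + int(is_discrete[j])
        groups[("cc", "cd", "dd")[n_disc]].append((i, j))
    return {k: edge_instability(adjacencies, e) for k, e in groups.items()}


def select_penalty(lambdas, instabilities, beta=0.05):
    """Scan from the sparsest model (largest lambda) toward denser ones and
    keep the smallest lambda reached before instability first exceeds beta.
    If even the largest lambda exceeds beta, the largest lambda is returned."""
    pairs = sorted(zip(lambdas, instabilities), reverse=True)
    chosen = pairs[0][0]
    for lam, inst in pairs:
        if inst > beta:
            break
        chosen = lam
    return chosen


# Toy example: variables 0 and 1 are continuous, variable 2 is discrete.
subsample_graphs = [
    [[0, 1, 1], [1, 0, 0], [1, 0, 0]],
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
]
scores = instability_by_edge_type(subsample_graphs, [False, False, True])
best = select_penalty([0.1, 0.2, 0.4], [0.2, 0.04, 0.0])
```

Under these assumptions, choosing each of the three penalties independently over k candidate values costs on the order of 3k model fits per round, whereas a joint grid over all three penalties costs k³, which is the linear-versus-cubic contrast the abstract describes.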

Year:  2016        PMID: 27294886      PMCID: PMC4905606          DOI: 10.1186/s12859-016-1039-0

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.307


  24 in total

1.  Effects of emotional maltreatment on semantic network activity during cognitive reappraisal.

Authors:  Sang Won Lee; Seungho Kim; Seung Jae Lee; Hyunsil Cha; Huijin Song; Seunghee Won; Yongmin Chang; Bumseok Jeong
Journal:  Brain Imaging Behav       Date:  2021-06       Impact factor: 3.978

2.  piMGM: incorporating multi-source priors in mixed graphical models for learning disease networks.

Authors:  Dimitris V Manatakis; Vineet K Raghu; Panayiotis V Benos
Journal:  Bioinformatics       Date:  2018-09-01       Impact factor: 6.937

3.  Lipidomic Signatures Align with Inflammatory Patterns and Outcomes in Critical Illness.

Authors:  Junru Wu; Anthony Cyr; Danielle Gruen; Tyler Lovelace; Panayiotis Benos; Tianmeng Chen; Francis Guyette; Mark Yazer; Brian Daley; Richard Miller; Brian Harbrecht; Jeffrey Claridge; Herb Phelan; Brian Zuckerbraun; Matthew Neal; Pär Johansson; Jakob Stensballe; Rami Namas; Yoram Vodovotz; Jason Sperry; Timothy Billiar; PAMPer Study Group
Journal:  Res Sq       Date:  2021-01-08

Review 4.  Biomedical Informatics on the Cloud: A Treasure Hunt for Advancing Cardiovascular Medicine.

Authors:  Peipei Ping; Henning Hermjakob; Jennifer S Polson; Panagiotis V Benos; Wei Wang
Journal:  Circ Res       Date:  2018-04-27       Impact factor: 17.367

Review 5.  Gaussian and Mixed Graphical Models as (multi-)omics data analysis tools.

Authors:  Michael Altenbuchinger; Antoine Weihs; John Quackenbush; Hans Jörgen Grabe; Helena U Zacharias
Journal:  Biochim Biophys Acta Gene Regul Mech       Date:  2019-10-19       Impact factor: 4.490

6.  A million variables and more: the Fast Greedy Equivalence Search algorithm for learning high-dimensional graphical causal models, with an application to functional magnetic resonance images.

Authors:  Joseph Ramsey; Madelyn Glymour; Ruben Sanchez-Romero; Clark Glymour
Journal:  Int J Data Sci Anal       Date:  2016-12-01

7.  Scoring Bayesian Networks of Mixed Variables.

Authors:  Bryan Andrews; Joseph Ramsey; Gregory F Cooper
Journal:  Int J Data Sci Anal       Date:  2018-01-11

8.  Neurological Complications Acquired During Pediatric Critical Illness: Exploratory "Mixed Graphical Modeling" Analysis Using Serum Biomarker Levels.

Authors:  Vineet K Raghu; Christopher M Horvat; Patrick M Kochanek; Ericka L Fink; Robert S B Clark; Panayiotis V Benos; Alicia K Au
Journal:  Pediatr Crit Care Med       Date:  2021-10-01       Impact factor: 3.971

9.  Integrated Theory- and Data-driven Feature Selection in Gene Expression Data Analysis.

Authors:  Vineet K Raghu; Xiaoyu Ge; Panos K Chrysanthis; Panayiotis V Benos
Journal:  Proc Int Conf Data Eng       Date:  2017-05-18

10.  A Pipeline for Integrated Theory and Data-Driven Modeling of Biomedical Data.

Authors:  Vineet K Raghu; Xiaoyu Ge; Arun Balajiee; Daniel J Shirer; Isha Das; Panayiotis V Benos; Panos K Chrysanthis
Journal:  IEEE/ACM Trans Comput Biol Bioinform       Date:  2021-06-03       Impact factor: 3.702
