Markov Boundary Discovery with Ridge Regularized Linear Models.

Eric V Strobl, Shyam Visweswaran.

Abstract

Ridge regularized linear models (RRLMs), such as ridge regression and the SVM, are a popular group of methods used in conjunction with coefficient hypothesis testing to discover explanatory variables with a significant multivariate association to a response. However, many investigators are reluctant to interpret the selected variables causally because the capabilities of RRLMs in causal inference are incompletely understood. Under reasonable assumptions, we show that a modified form of RRLMs can get "very close" to identifying a subset of the Markov boundary by providing a worst-case bound on the space of possible solutions. The results hold for any convex loss, even when the underlying functional relationship is nonlinear and the solution is not unique. Our approach combines ideas from Markov boundary and sufficient dimension reduction theory. Experimental results show that the modified RRLMs are competitive with state-of-the-art algorithms in discovering part of the Markov boundary from gene expression data.
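
To make the general workflow in the abstract concrete (fit a ridge regularized linear model, then test each coefficient to select variables), the following is a minimal sketch on synthetic data. It is not the authors' modified RRLM and does not reproduce their bound. The squared loss, the penalty `lam`, the permutation test (which shuffles the response and therefore uses a crude global null), the 0.05 threshold, and the helper names `ridge_fit` and `permutation_pvalues` are all assumptions chosen for illustration.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge coefficients for centered data (no intercept)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def permutation_pvalues(X, y, lam=1.0, n_perm=500, seed=0):
    """Per-coefficient p-values from a permutation test (illustrative choice).

    The response is shuffled to break every X-y association, and each
    observed |coefficient| is compared with its permutation distribution.
    """
    rng = np.random.default_rng(seed)
    observed = np.abs(ridge_fit(X, y, lam))
    exceed = np.zeros(X.shape[1])
    for _ in range(n_perm):
        null_coef = np.abs(ridge_fit(X, rng.permutation(y), lam))
        exceed += null_coef >= observed
    # Add-one correction keeps p-values strictly positive.
    return (exceed + 1) / (n_perm + 1)


# Synthetic data (hypothetical): only columns 0 and 3 influence the response.
rng = np.random.default_rng(42)
n, p = 300, 8
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.standard_normal(n)

# Center so that the no-intercept ridge fit is appropriate.
X = X - X.mean(axis=0)
y = y - y.mean()

pvals = permutation_pvalues(X, y)
selected = np.flatnonzero(pvals < 0.05)
print("p-values:", np.round(pvals, 3))
print("selected columns:", selected)
```

In this toy setting the selected columns should include the two variables that generate the response; unrelated columns can still be selected occasionally at the 0.05 level, since no multiple-testing correction is applied.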

Keywords:  Markov boundary; linear models; ridge regularization

Year:  2015        PMID: 27170915      PMCID: PMC4861166          DOI: 10.1515/jci-2015-0011

Source DB:  PubMed          Journal:  J Causal Inference        ISSN: 2193-3685

