
Bayesian Hyper-LASSO Classification for Feature Selection with Application to Endometrial Cancer RNA-seq Data.

Lai Jiang, Celia M T Greenwood, Weixin Yao, Longhai Li.

Abstract

Feature selection is needed in many modern scientific problems that involve high-dimensional data. A typical example is identifying gene signatures related to a particular disease from high-dimensional gene expression data. Gene expression may have grouping structure: for example, co-regulated genes with similar biological functions tend to have similar expression levels. It is therefore preferable to take grouping structure into account when selecting features. In this paper, we propose a Bayesian Robit regression method with Hyper-LASSO priors (abbreviated BayesHL) for feature selection in high-dimensional genomic data with grouping structure. BayesHL has two main features: it discards unrelated features more aggressively than LASSO, and it selects features within groups automatically, without a pre-specified grouping structure. We apply BayesHL to gene expression analysis to identify subsets of genes that contribute to the 5-year survival outcome of endometrial cancer (EC) patients. The results show that BayesHL outperforms alternative methods (including LASSO, group LASSO, supervised group LASSO, penalized logistic regression, random forest, neural network, XGBoost and knockoff) in predictive power, sparsity and the ability to uncover grouping structure, and that it provides insight into the mechanisms by which multiple genetic pathways lead to differentiated EC survival outcomes.
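The contrast between a LASSO penalty and a heavier-tailed prior can be illustrated with a minimal sketch. The abstract does not state the form of the Hyper-LASSO prior, so the code below assumes, purely for illustration, a Student-t prior on a single coefficient and compares it with the Laplace prior that corresponds to LASSO. The function names and the `scale` and `df` parameters are hypothetical and are not taken from BayesHL.

```python
import math


def laplace_penalty(beta, scale=1.0):
    """Negative log-density of a Laplace prior, up to an additive constant."""
    return abs(beta) / scale


def student_t_penalty(beta, scale=1.0, df=1.0):
    """Negative log-density of a Student-t prior, up to an additive constant.

    Assumption: a Student-t prior stands in for a heavy-tailed prior here;
    this is not claimed to be the prior used by BayesHL.
    """
    return 0.5 * (df + 1.0) * math.log1p(beta * beta / (df * scale * scale))


def laplace_shrinkage(beta, scale=1.0):
    """Magnitude of the penalty gradient: constant for every nonzero beta."""
    return 0.0 if beta == 0 else 1.0 / scale


def student_t_shrinkage(beta, scale=1.0, df=1.0):
    """Magnitude of the penalty gradient: decays toward 0 as |beta| grows."""
    return (df + 1.0) * abs(beta) / (df * scale * scale + beta * beta)


if __name__ == "__main__":
    # Print how strongly each prior pulls coefficients of different sizes
    # toward zero.
    for beta in (0.1, 1.0, 10.0, 100.0):
        print(beta, laplace_shrinkage(beta), student_t_shrinkage(beta))
```

Under these assumptions, the Laplace penalty pulls every nonzero coefficient toward zero with the same force, while the pull of the Student-t penalty fades for large coefficients. A heavy-tailed prior can therefore use a small scale to suppress weak features without over-shrinking strong signals, which is one plausible reading of "more aggressive" selection.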

Year: 2020  PMID: 32546735  PMCID: PMC7297975  DOI: 10.1038/s41598-020-66466-z

Source DB: PubMed  Journal: Sci Rep  ISSN: 2045-2322  Impact factor: 4.379

