
Hierarchical Feature Selection Incorporating Known and Novel Biological Information: Identifying Genomic Features Related to Prostate Cancer Recurrence.

Yize Zhao, Matthias Chung, Brent A Johnson, Carlos S Moreno, Qi Long.

Abstract

Our work is motivated by a prostate cancer study aimed at identifying mRNA and miRNA biomarkers that are predictive of cancer recurrence after prostatectomy. It has been shown in the literature that incorporating known biological information on pathway memberships and interactions among biomarkers improves feature selection of high-dimensional biomarkers in relation to disease risk. Biological information is often represented by graphs or networks, in which biomarkers are represented by nodes and interactions among them are represented by edges; however, biological information is often not fully known. For example, the role of microRNAs (miRNAs) in regulating gene expression is not fully understood and the miRNA regulatory network is not fully established, in which case new strategies are needed for feature selection. To this end, we treat unknown biological information as missing data (i.e., missing edges in graphs), different from commonly encountered missing data problems where variable values are missing. We propose a new concept of imputing unknown biological information based on observed data and define the imputed information as the novel biological information. In addition, we propose a hierarchical group penalty to encourage sparsity and feature selection at both the pathway level and the within-pathway level, which, combined with the imputation step, allows for incorporation of known and novel biological information. While it is applicable to general regression settings, we develop and investigate the proposed approach in the context of semiparametric accelerated failure time models motivated by our data example. Data application and simulation studies show that incorporation of novel biological information improves performance in risk prediction and feature selection and the proposed penalty outperforms the extensions of several existing penalties.
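The two ideas in the abstract, imputing unknown edges from observed data and penalizing at both the pathway level and the within-pathway level, can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not give the imputation rule or the penalty's form, so the sketch assumes a simple correlation-threshold rule for unknown edges and substitutes a sparse-group-lasso-style penalty (a group L2 term plus an element-wise L1 term). The function names `impute_edges` and `hierarchical_penalty`, the `threshold` value, and the two tuning parameters are hypothetical.

```python
import numpy as np


def impute_edges(X, known_adj, threshold=0.5):
    """Fill unknown edges (NaN entries) using absolute sample correlation.

    ASSUMPTION: thresholding |correlation| is a stand-in rule chosen for
    illustration; it is not the imputation procedure from the paper.
    X is an (n_samples, n_features) array; known_adj is a square float
    array with 1/0 for known edges/non-edges and NaN where unknown.
    """
    corr = np.corrcoef(X, rowvar=False)
    adj = known_adj.astype(float)  # astype returns a copy by default
    unknown = np.isnan(adj)
    adj[unknown] = (np.abs(corr[unknown]) >= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops
    return adj


def hierarchical_penalty(beta, groups, lam_group, lam_within):
    """Sparse-group-lasso-style penalty over pathway groups.

    ASSUMPTION: a generic two-level penalty, not the paper's penalty.
    The group L2 term encourages dropping whole pathways; the L1 term
    encourages sparsity among features inside a retained pathway.
    """
    total = 0.0
    for idx in groups:
        b = beta[idx]
        total += lam_group * np.sqrt(len(idx)) * np.linalg.norm(b)
        total += lam_within * np.abs(b).sum()
    return total


if __name__ == "__main__":
    # Toy data: features 0 and 1 move together, feature 2 does not.
    X = np.array([[1.0, 1.0, 1.0],
                  [2.0, 2.0, -1.0],
                  [3.0, 3.0, 1.0],
                  [4.0, 4.0, -1.0]])
    known = np.full((3, 3), np.nan)
    known[0, 2] = known[2, 0] = 1.0  # one edge assumed known in advance
    print(impute_edges(X, known))

    beta = np.array([3.0, 4.0, 0.0, 0.0])
    print(hierarchical_penalty(beta, [[0, 1], [2, 3]], 1.0, 1.0))
```

In a full analysis the penalty would be added to a model-specific loss (for the paper, a semiparametric accelerated failure time objective) and minimized over `beta`; that step is omitted here because the abstract does not specify it.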


Keywords: Hierarchical feature selection; Imputation; Missing data; Regularization; Semiparametric AFT model

Year:  2017        PMID: 28435175      PMCID: PMC5394568          DOI: 10.1080/01621459.2016.1164051

Source DB:  PubMed          Journal:  J Am Stat Assoc        ISSN: 0162-1459            Impact factor:   5.033


