Survival analysis for high-dimensional, heterogeneous medical data: Exploring feature extraction as an alternative to feature selection.

Sebastian Pölsterl, Sailesh Conjeti, Nassir Navab, Amin Katouzian.

Abstract

BACKGROUND: In clinical research, the primary interest is often the time until an adverse event occurs, which is the subject of survival analysis. Applying survival analysis to electronic health records is challenging for two main reasons: (1) patient records consist of high-dimensional feature vectors, and (2) feature vectors are a mix of categorical and real-valued features, which implies varying statistical properties among features. To learn from high-dimensional data, researchers can choose from a wide range of methods in the fields of feature selection and feature extraction. Whereas feature selection is well studied, little work has focused on utilizing feature extraction techniques for survival analysis.
RESULTS: We investigate how well feature extraction methods can deal with features having varying statistical properties. In particular, we consider multiview spectral embedding algorithms, which have been developed specifically for these situations. We propose to use random survival forests to accurately determine local neighborhood relations from right-censored survival data. We evaluated 10 combinations of feature extraction methods and 6 survival models with and without intrinsic feature selection in the context of survival analysis on 3 clinical datasets. Our results demonstrate that for small sample sizes - fewer than 500 patients - models with built-in feature selection (Cox model with ℓ1 penalty, random survival forest, and gradient boosted models) outperform feature extraction methods by a median margin of 6.3% in concordance index (interquartile range: [-1.2%; 14.6%]).
CONCLUSIONS: If the number of samples is insufficient, feature extraction methods are unable to reliably identify the underlying manifold, which makes them of limited use in these situations. For large sample sizes - in our experiments, 2500 samples or more - feature extraction methods perform as well as feature selection methods.
Copyright © 2016 Elsevier B.V. All rights reserved.
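
The abstract reports its comparison in terms of the concordance index. As a reader aid, the following is a minimal sketch of Harrell's concordance index for right-censored data, not the authors' evaluation code. The function name `concordance_index`, the toy data, and the choice to skip pairs with tied observed times are assumptions made for illustration only.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C for right-censored data (illustrative sketch).

    times       -- observed time per subject (event or censoring time)
    events      -- True if the event was observed, False if censored
    risk_scores -- model output; a higher score should mean shorter survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped
        # Let `a` be the subject with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the order is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # model ranks the earlier event as riskier
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the second one is censored.
times = [5, 8, 12, 20]
events = [True, False, True, True]
risk = [0.9, 0.4, 0.1, 0.6]
c = concordance_index(times, events, risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a margin such as the one quoted in the abstract is a difference between two such values.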

Keywords:  Censoring; Dimensionality reduction; Feature extraction; Feature selection; Spectral embedding; Survival analysis

Year:  2016        PMID: 27664504     DOI: 10.1016/j.artmed.2016.07.004

Source DB:  PubMed          Journal:  Artif Intell Med        ISSN: 0933-3657            Impact factor:   5.326


  7 in total

1.  Machine learning for optimized individual survival prediction in resectable upper gastrointestinal cancer.

Authors:  Jin-On Jung; Nerma Crnovrsanin; Naita Maren Wirsik; Henrik Nienhüser; Leila Peters; Felix Popp; André Schulze; Martin Wagner; Beat Peter Müller-Stich; Markus Wolfgang Büchler; Thomas Schmidt
Journal:  J Cancer Res Clin Oncol       Date:  2022-05-26       Impact factor: 4.553

2.  A Selective Review on Random Survival Forests for High Dimensional Data.

Authors:  Hong Wang; Gang Li
Journal:  Quant Biosci       Date:  2017

Review 3.  A Complete Process of Text Classification System Using State-of-the-Art NLP Models.

Authors:  Varun Dogra; Sahil Verma; Pushpita Chatterjee; Jana Shafi; Jaeyoung Choi; Muhammad Fazal Ijaz
Journal:  Comput Intell Neurosci       Date:  2022-06-09

4.  A comparison of machine learning methods for survival analysis of high-dimensional clinical data for dementia prediction.

Authors:  Annette Spooner; Emily Chen; Arcot Sowmya; Perminder Sachdev; Nicole A Kochan; Julian Trollor; Henry Brodaty
Journal:  Sci Rep       Date:  2020-11-23       Impact factor: 4.379

5.  SurvNet: A Novel Deep Neural Network for Lung Cancer Survival Analysis With Missing Values.

Authors:  Jianyong Wang; Nan Chen; Jixiang Guo; Xiuyuan Xu; Lunxu Liu; Zhang Yi
Journal:  Front Oncol       Date:  2021-01-20       Impact factor: 6.244

6.  Body fat prediction through feature extraction based on anthropometric and laboratory measurements.

Authors:  Zongwen Fan; Raymond Chiong; Zhongyi Hu; Farshid Keivanian; Fabian Chiong
Journal:  PLoS One       Date:  2022-02-22       Impact factor: 3.240

Review 7.  Computational Analysis of High-Dimensional DNA Methylation Data for Cancer Prognosis.

Authors:  Ran Hu; Xianghong Jasmine Zhou; Wenyuan Li
Journal:  J Comput Biol       Date:  2022-06-06       Impact factor: 1.549

