
Benchmark of filter methods for feature selection in high-dimensional gene expression survival data.

Andrea Bommert, Thomas Welchowski, Matthias Schmid, Jörg Rahnenführer

Abstract

Feature selection is crucial for the analysis of high-dimensional data, but benchmark studies for data with a survival outcome are rare. We compare 14 filter methods for feature selection based on 11 high-dimensional gene expression survival data sets. The aim is to provide guidance on the choice of filter methods for other researchers and practitioners. We analyze the accuracy of predictive models that employ the features selected by the filter methods. Also, we consider the run time, the number of selected features for fitting models with high predictive accuracy as well as the feature selection stability. We conclude that the simple variance filter outperforms all other considered filter methods. This filter selects the features with the largest variance and does not take into account the survival outcome. Also, we identify the correlation-adjusted regression scores filter as a more elaborate alternative that allows fitting models with similar predictive accuracy. Additionally, we investigate the filter methods based on feature rankings, finding groups of similar filters.
© The Author(s) 2021. Published by Oxford University Press.
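
The variance filter described in the abstract can be illustrated with a minimal sketch. It ranks features by their sample variance and keeps the top ones, without ever looking at the survival outcome. The function name `variance_filter`, the parameter `n_features`, and the toy expression matrix are assumptions made for illustration; they are not taken from the paper and do not reproduce the authors' implementation or data.

```python
from statistics import variance

def variance_filter(rows, n_features):
    """Return the column indices of the n_features highest-variance features.

    rows: list of samples, each a list of feature values (hypothetical layout).
    The survival outcome is deliberately not an argument: this filter ignores it.
    """
    columns = list(zip(*rows))                       # transpose to one tuple per feature
    variances = [variance(col) for col in columns]   # sample variance of each feature
    ranked = sorted(range(len(columns)),
                    key=lambda j: variances[j],
                    reverse=True)                    # highest variance first
    return ranked[:n_features]

# Toy expression matrix (made-up values): 4 samples x 4 features.
X = [
    [1.0, 10.0, 5.0, 0.0],
    [1.1, 20.0, 5.0, 1.0],
    [0.9, 30.0, 5.0, 2.0],
    [1.0, 40.0, 5.0, 3.0],
]

selected = variance_filter(X, n_features=2)
print(selected)
```

In this toy example the second and fourth columns vary the most, so their indices are returned. How many features to keep is a tuning choice that the sketch leaves to the caller.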

Keywords:  benchmark; feature selection; filter methods; high-dimensional data; survival analysis

Year:  2022        PMID: 34498681      PMCID: PMC8769710          DOI: 10.1093/bib/bbab354

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   11.622


