Robust biomarker identification for cancer diagnosis with ensemble feature selection methods.

Thomas Abeel, Thibault Helleputte, Yves Van de Peer, Pierre Dupont, Yvan Saeys.

Abstract

MOTIVATION: Biomarker discovery is an important topic in biomedical applications of computational biology, including applications such as gene and SNP selection from high-dimensional data. Surprisingly, the stability with respect to sampling variation or robustness of such selection processes has received attention only recently. However, robustness of biomarkers is an important issue, as it may greatly influence subsequent biological validations. In addition, a more robust set of markers may strengthen the confidence of an expert in the results of a selection method.
RESULTS: Our first contribution is a general framework for the analysis of the robustness of a biomarker selection algorithm. Secondly, we conducted a large-scale analysis of the recently introduced concept of ensemble feature selection, where multiple feature selections are combined in order to increase the robustness of the final set of selected features. We focus on selection methods that are embedded in the estimation of support vector machines (SVMs). SVMs are powerful classification models that have shown state-of-the-art performance on several diagnosis and prognosis tasks on biological data. Their feature selection extensions also offered good results for gene selection tasks. We show that the robustness of SVMs for biomarker discovery can be substantially increased by using ensemble feature selection techniques, while at the same time improving upon classification performances. The proposed methodology is evaluated on four microarray datasets showing increases of up to almost 30% in robustness of the selected biomarkers, along with an improvement of approximately 15% in classification performance. The stability improvement with ensemble methods is particularly noticeable for small signature sizes (a few tens of genes), which is most relevant for the design of a diagnosis or prognosis model from a gene signature.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
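
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: feature rankings are computed on bootstrap samples and combined by summed rank (one possible way of combining multiple feature selections), and robustness is estimated as the mean pairwise Jaccard overlap of signatures selected on random subsamples. A simple univariate score stands in for the SVM-embedded selection used in the paper, and every function name, parameter, and the toy data are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean, pstdev


def score_features(X, y):
    """Order features best-first by a simple univariate separation score.

    Stand-in for an SVM-based ranker (an assumption, not the paper's method).
    """
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        spread = pstdev(pos) + pstdev(neg) + 1e-9
        scores.append(abs(mean(pos) - mean(neg)) / spread)
    return sorted(range(n_features), key=lambda j: -scores[j])


def bootstrap(X, y, rng):
    """Stratified bootstrap: resample with replacement inside each class."""
    idx = []
    for label in (0, 1):
        members = [i for i, v in enumerate(y) if v == label]
        idx += [rng.choice(members) for _ in members]
    return [X[i] for i in idx], [y[i] for i in idx]


def ensemble_select(X, y, k, n_bags, rng):
    """Combine rankings from n_bags bootstrap samples by summed rank."""
    n_features = len(X[0])
    rank_sum = [0] * n_features
    for _ in range(n_bags):
        Xb, yb = bootstrap(X, y, rng)
        for rank, j in enumerate(score_features(Xb, yb)):
            rank_sum[j] += rank
    return sorted(range(n_features), key=lambda j: rank_sum[j])[:k]


def jaccard(a, b):
    """Overlap of two signatures: |intersection| / |union|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


def stability(X, y, selector, n_subsamples, fraction, rng):
    """Mean pairwise Jaccard overlap of signatures from random subsamples."""
    n = len(X)
    signatures = []
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), int(fraction * n))
        signatures.append(selector([X[i] for i in idx], [y[i] for i in idx]))
    return mean(jaccard(a, b) for a, b in combinations(signatures, 2))


def make_toy_data(rng, n_per_class=20, n_features=30, n_informative=3):
    """Synthetic two-class data; only the first n_informative features differ."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0, 1) for _ in range(n_features)]
            for j in range(n_informative):
                row[j] += 3.0 * label
            X.append(row)
            y.append(label)
    return X, y


rng = random.Random(0)
X, y = make_toy_data(rng)
signature = ensemble_select(X, y, k=3, n_bags=20, rng=rng)
stab = stability(
    X, y,
    lambda Xs, ys: ensemble_select(Xs, ys, 3, 20, rng),
    n_subsamples=5, fraction=0.9, rng=rng,
)
print("signature:", sorted(signature), "stability:", round(stab, 3))
```

Replacing `score_features` with a ranking derived from a linear SVM's weights would bring the sketch closer to the setting described in the abstract. Setting `n_bags=1` gives a single-run baseline whose stability can be compared against the ensemble.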

Year:  2009        PMID: 19942583     DOI: 10.1093/bioinformatics/btp630

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  101 in total

1.  Standards affecting the consistency of gene expression arrays in clinical applications. (Review)

Authors:  Steven A Enkemann
Journal:  Cancer Epidemiol Biomarkers Prev       Date:  2010-03-23       Impact factor: 4.254

2.  Improving biomarker list stability by integration of biological knowledge in the learning process.

Authors:  Tiziana Sanavia; Fabio Aiolli; Giovanni Da San Martino; Andrea Bisognin; Barbara Di Camillo
Journal:  BMC Bioinformatics       Date:  2012-03-28       Impact factor: 3.169

3.  Design of a multi-signature ensemble classifier predicting neuroblastoma patients' outcome.

Authors:  Andrea Cornero; Massimo Acquaviva; Paolo Fardin; Rogier Versteeg; Alexander Schramm; Alessandra Eva; Maria Carla Bosco; Fabiola Blengio; Sara Barzaghi; Luigi Varesio
Journal:  BMC Bioinformatics       Date:  2012-03-28       Impact factor: 3.169

4.  A classification framework applied to cancer gene expression profiles.

Authors:  Hussein Hijazi; Christina Chan
Journal:  J Healthc Eng       Date:  2013       Impact factor: 2.682

5.  Radiomics in nuclear medicine: robustness, reproducibility, standardization, and how to avoid data analysis traps and replication crisis.

Authors:  Alex Zwanenburg
Journal:  Eur J Nucl Med Mol Imaging       Date:  2019-06-25       Impact factor: 9.236

6.  Robust clinical marker identification for diabetic kidney disease with ensemble feature selection.

Authors:  Xing Song; Lemuel R Waitman; Yong Hu; Alan S L Yu; David C Robbins; Mei Liu
Journal:  J Am Med Inform Assoc       Date:  2019-03-01       Impact factor: 4.497

7.  Empirical evaluation of consistency and accuracy of methods to detect differentially expressed genes based on microarray data.

Authors:  Dake Yang; Rudolph S Parrish; Guy N Brock
Journal:  Comput Biol Med       Date:  2013-12-13       Impact factor: 4.589

8.  Cross-validation of existing signatures and derivation of a novel 29-gene transcriptomic signature predictive of progression to TB in a Brazilian cohort of household contacts of pulmonary TB.

Authors:  Samantha Leong; Yue Zhao; Rodrigo Ribeiro-Rodrigues; Edward C Jones-López; Carlos Acuña-Villaorduña; Patricia Marques Rodrigues; Moises Palaci; David Alland; Reynaldo Dietze; Jerrold J Ellner; W Evan Johnson; Padmini Salgame
Journal:  Tuberculosis (Edinb)       Date:  2020-01-07       Impact factor: 3.131

9.  Identification and optimization of classifier genes from multi-class earthworm microarray dataset.

Authors:  Ying Li; Nan Wang; Edward J Perkins; Chaoyang Zhang; Ping Gong
Journal:  PLoS One       Date:  2010-10-28       Impact factor: 3.240

10.  Discriminative and informative features for biomolecular text mining with ensemble feature selection.

Authors:  Sofie Van Landeghem; Thomas Abeel; Yvan Saeys; Yves Van de Peer
Journal:  Bioinformatics       Date:  2010-09-15       Impact factor: 6.937
