Are random forests better than support vector machines for microarray-based cancer classification?

Alexander Statnikov, Constantin F Aliferis.

Abstract

Cancer diagnosis and clinical outcome prediction are among the most important emerging applications of gene expression microarray technology, with several molecular signatures on their way toward clinical deployment. Using the most accurate decision support algorithms available for microarray gene expression data is critical to developing the best possible molecular signatures for patient care. As suggested by a large body of literature to date, support vector machines can be considered "best of class" algorithms for classification of such data. Recent work, however, found that random forest classifiers outperform support vector machines. In the present paper we point to several biases of this prior work and conduct a new unbiased evaluation of the two algorithms. Our experiments using 18 diagnostic and prognostic datasets show that support vector machines outperform random forests, often by a large margin.

Year:  2007        PMID: 18693924      PMCID: PMC2655823     

Source DB:  PubMed          Journal:  AMIA Annu Symp Proc        ISSN: 1559-4076


  10 in total

1.  Support vector machine classification and validation of cancer tissue samples using microarray expression data.

Authors:  T S Furey; N Cristianini; N Duffy; D W Bednarski; M Schummer; D Haussler
Journal:  Bioinformatics       Date:  2000-10       Impact factor: 6.937

2.  HITON: a novel Markov Blanket algorithm for optimal variable selection.

Authors:  C F Aliferis; I Tsamardinos; A Statnikov
Journal:  AMIA Annu Symp Proc       Date:  2003

3.  A comprehensive evaluation of multicategory classification methods for microarray gene expression cancer diagnosis.

Authors:  Alexander Statnikov; Constantin F Aliferis; Ioannis Tsamardinos; Douglas Hardin; Shawn Levy
Journal:  Bioinformatics       Date:  2004-09-16       Impact factor: 6.937

4.  Prediction of cancer outcome with microarrays: a multiple random validation strategy.

Authors:  Stefan Michiels; Serge Koscielny; Catherine Hill
Journal:  Lancet       Date:  2005 Feb 5-11       Impact factor: 79.321

5.  GEMS: a system for automated cancer diagnosis and biomarker discovery from microarray gene expression data.

Authors:  Alexander Statnikov; Ioannis Tsamardinos; Yerbolat Dosbayev; Constantin F Aliferis
Journal:  Int J Med Inform       Date:  2005-08       Impact factor: 4.046

Review 6.  Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors.

Authors:  F E Harrell; K L Lee; D B Mark
Journal:  Stat Med       Date:  1996-02-28       Impact factor: 2.373

7.  Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach.

Authors:  E R DeLong; D M DeLong; D L Clarke-Pearson
Journal:  Biometrics       Date:  1988-09       Impact factor: 2.571

8.  Converting a breast cancer microarray signature into a high-throughput diagnostic test.

Authors:  Annuska M Glas; Arno Floore; Leonie J M J Delahaye; Anke T Witteveen; Rob C F Pover; Niels Bakx; Jaana S T Lahti-Domenici; Tako J Bruinsma; Marc O Warmoes; René Bernards; Lodewyk F A Wessels; Laura J Van't Veer
Journal:  BMC Genomics       Date:  2006-10-30       Impact factor: 3.969

9.  Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data.

Authors:  Baolin Wu; Tom Abbott; David Fishman; Walter McMurray; Gil Mor; Kathryn Stone; David Ward; Kenneth Williams; Hongyu Zhao
Journal:  Bioinformatics       Date:  2003-09-01       Impact factor: 6.937

10.  Gene selection and classification of microarray data using random forest.

Authors:  Ramón Díaz-Uriarte; Sara Alvarez de Andrés
Journal:  BMC Bioinformatics       Date:  2006-01-06       Impact factor: 3.169

  16 in total

1.  Improving classification performance with discretization on biomedical datasets.

Authors:  Jonathan L Lustgarten; Vanathi Gopalakrishnan; Himanshu Grover; Shyam Visweswaran
Journal:  AMIA Annu Symp Proc       Date:  2008-11-06

2.  Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests.

Authors:  João Maroco; Dina Silva; Ana Rodrigues; Manuela Guerreiro; Isabel Santana; Alexandre de Mendonça
Journal:  BMC Res Notes       Date:  2011-08-17

3.  Comparative analyses between retained introns and constitutively spliced introns in Arabidopsis thaliana using random forest and support vector machine.

Authors:  Rui Mao; Praveen Kumar Raj Kumar; Cheng Guo; Yang Zhang; Chun Liang
Journal:  PLoS One       Date:  2014-08-11       Impact factor: 3.240

Review 4.  Pathological bases for a robust application of cancer molecular classification.

Authors:  Salvador J Diaz-Cano
Journal:  Int J Mol Sci       Date:  2015-04-17       Impact factor: 5.923

5.  Classification of Benign and Malignant Thyroid Nodules Using a Combined Clinical Information and Gene Expression Signatures.

Authors:  Bing Zheng; Jun Liu; Jianlei Gu; Jing Du; Lin Wang; Shengli Gu; Juan Cheng; Jun Yang; Hui Lu
Journal:  PLoS One       Date:  2016-10-24       Impact factor: 3.240

6.  Random Forest Segregation of Drug Responses May Define Regions of Biological Significance.

Authors:  Qasim Bukhari; David Borsook; Markus Rudin; Lino Becerra
Journal:  Front Comput Neurosci       Date:  2016-03-09       Impact factor: 2.380

7.  Gene expression profiles for predicting metastasis in breast cancer: a cross-study comparison of classification methods.

Authors:  Mark Burton; Mads Thomassen; Qihua Tan; Torben A Kruse
Journal:  ScientificWorldJournal       Date:  2012-11-28

8.  Filtered selection coupled with support vector machines generate a functionally relevant prediction model for colorectal cancer.

Authors:  Musa Nur Gabere; Mohamed Aly Hussein; Mohammad Azhar Aziz
Journal:  Onco Targets Ther       Date:  2016-06-01       Impact factor: 4.147

9.  Integration of RNA-Seq data with heterogeneous microarray data for breast cancer profiling.

Authors:  Daniel Castillo; Juan Manuel Gálvez; Luis Javier Herrera; Belén San Román; Fernando Rojas; Ignacio Rojas
Journal:  BMC Bioinformatics       Date:  2017-11-21       Impact factor: 3.169

10.  MLACP: machine-learning-based prediction of anticancer peptides.

Authors:  Balachandran Manavalan; Shaherin Basith; Tae Hwan Shin; Sun Choi; Myeong Ok Kim; Gwang Lee
Journal:  Oncotarget       Date:  2017-08-19