Using random forest for reliable classification and cost-sensitive learning for medical diagnosis.

Fan Yang, Hua-zhen Wang, Hong Mi, Cheng-de Lin, Wei-wen Cai.

Abstract

BACKGROUND: Most machine-learning classifiers output label predictions for new instances without indicating how reliable those predictions are. This limits their applicability in critical domains such as medical diagnosis, where incorrect predictions have serious consequences. Moreover, the default assumption of equal misclassification costs is most likely violated in medical diagnosis.
RESULTS: In this paper, we present a modified random forest classifier that is incorporated into the conformal predictor scheme. A conformal predictor is a transductive learning scheme that uses Kolmogorov complexity to test the randomness of a particular sample with respect to the training set. Our method is well calibrated: the performance can be set prior to classification, and the accuracy rate is exactly equal to the predefined confidence level. To address the cost-sensitive problem, we further extend our method to a label-conditional predictor, which takes into account the different misclassification costs of different classes and allows a separate confidence level to be specified for each class. Extensive experiments on benchmark datasets and real-world applications show that the resulting classifier is well calibrated and able to control the specific risk of each class.
CONCLUSION: Using the random forest outlier measure to design the nonconformity measure benefits the resulting predictor. In addition, the label-conditional classifier turns out to be an alternative approach to the cost-sensitive learning problem, one that relies on label-wise predefined confidence levels. The goal of minimizing the risk of misclassification is achieved by specifying a different confidence level for each class.
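The conformal and label-conditional ideas described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the random-forest-based nonconformity measure, so the code below substitutes a toy measure (distance to the class mean on a single numeric feature) and assumes the standard conformal p-value, i.e. the fraction of examples whose nonconformity score is at least as large as that of the new example. The names `TRAIN`, `nonconformity`, `p_value`, and `prediction_set`, and the data values, are hypothetical.

```python
from statistics import mean

# Hypothetical toy data: (feature value, label). Not taken from the paper.
TRAIN = [(10, "healthy"), (12, "healthy"), (14, "healthy"), (11, "healthy"),
         (30, "sick"), (32, "sick"), (28, "sick")]


def nonconformity(x, label, examples):
    """Toy nonconformity score: distance from x to the mean of its class.

    Assumption: this is only a stand-in for the random forest outlier
    measure mentioned in the abstract.
    """
    same_class = [xi for xi, yi in examples if yi == label]
    return abs(x - mean(same_class))


def p_value(train, x_new, label, label_conditional=False):
    """Transductive conformal p-value for the hypothesis 'x_new has label'."""
    # Tentatively add the new example with the candidate label.
    bag = train + [(x_new, label)]
    scores = [(nonconformity(xi, yi, bag), yi) for xi, yi in bag]
    new_score = scores[-1][0]
    if label_conditional:
        # Compare only against examples that carry the candidate label.
        scores = [(s, y) for s, y in scores if y == label]
    at_least_as_strange = sum(1 for s, _ in scores if s >= new_score)
    return at_least_as_strange / len(scores)


def prediction_set(train, x_new, labels, significance, label_conditional=False):
    """Keep every label whose p-value exceeds that label's significance level.

    significance maps label -> epsilon, where confidence = 1 - epsilon.
    """
    return {y for y in labels
            if p_value(train, x_new, y, label_conditional) > significance[y]}


if __name__ == "__main__":
    labels = ["healthy", "sick"]
    # One shared significance level for all classes.
    print(prediction_set(TRAIN, 13, labels, {"healthy": 0.2, "sick": 0.2}))
    # Stricter level for "sick": that label is harder to rule out.
    print(prediction_set(TRAIN, 13, labels, {"healthy": 0.2, "sick": 0.05},
                         label_conditional=True))
```

In this sketch, lowering the significance level for one class makes the predictor more reluctant to exclude that label, which is one way to read the abstract's statement that class-specific confidence levels control class-specific risk.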

Year: 2009    PMID: 19208122    PMCID: PMC2648734    DOI: 10.1186/1471-2105-10-S1-S22

Source DB: PubMed    Journal: BMC Bioinformatics    ISSN: 1471-2105    Impact factor: 3.169
