Letter to the editor: on the stability and ranking of predictors from random forest variable importance measures.

Kristin K Nicodemus.   

Abstract

A recent study examined the stability of rankings from random forests using two variable importance measures, mean decrease accuracy (MDA) and mean decrease Gini (MDG), and concluded that rankings based on the MDG were more robust than those based on the MDA. However, few studies have examined how data-specific characteristics affect ranking stability. Rankings based on the MDG measure showed sensitivity to within-predictor correlation and to differences in category frequencies, even when the number of categories was held constant, and thus may produce spurious results. The MDA measure was robust to these data characteristics. Further, under strong within-predictor correlation, MDG rankings were less stable than those using MDA.
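To make the two measures concrete, the following is a minimal Python sketch of the quantities behind them. It is not the letter's simulation and does not grow a forest: `gini_decrease` computes the impurity decrease of a single binary split (a random forest accumulates such decreases over all splits on a predictor and averages over trees to obtain the MDG), and `permutation_decrease` computes the drop in accuracy of a fixed classifier after one predictor is shuffled (a random forest does this per tree on out-of-bag samples to obtain the MDA). All function names, the toy data, and the stand-in classifier `predict` are assumptions made for illustration.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(x, y):
    """Impurity decrease from splitting labels y on a binary (0/1) predictor x."""
    n = len(y)
    left = [yi for xi, yi in zip(x, y) if xi == 0]
    right = [yi for xi, yi in zip(x, y) if xi == 1]
    return gini(y) - len(left) / n * gini(left) - len(right) / n * gini(right)


def accuracy(predict, rows, y):
    """Fraction of rows for which predict(row) equals the true label."""
    return sum(predict(row) == yi for row, yi in zip(rows, y)) / len(y)


def permutation_decrease(predict, rows, y, j, rng):
    """Drop in accuracy after shuffling column j across rows."""
    shuffled = [row[j] for row in rows]
    rng.shuffle(shuffled)
    permuted = [row[:j] + (v,) + row[j + 1:] for row, v in zip(rows, shuffled)]
    return accuracy(predict, rows, y) - accuracy(predict, permuted, y)


# Hypothetical toy data: column 0 determines the class, column 1 is noise.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)] * 25
y = [row[0] for row in rows]


def predict(row):
    """Stand-in classifier (an assumption): predicts the class from column 0."""
    return row[0]


rng = random.Random(0)

print(gini_decrease([r[0] for r in rows], y))          # 0.5 (informative predictor)
print(gini_decrease([r[1] for r in rows], y))          # 0.0 (noise predictor)
print(permutation_decrease(predict, rows, y, 1, rng))  # 0.0 (noise predictor)
print(permutation_decrease(predict, rows, y, 0, rng))  # accuracy lost by shuffling column 0
```

The sketch only shows what each measure counts. It does not reproduce the sensitivity to correlation or category frequencies described above, which arises when many splits are selected across many trees.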

Year:  2011        PMID: 21498552      PMCID: PMC3137934          DOI: 10.1093/bib/bbr016

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   11.622


  7 in total

1.  Letter to the editor: Stability of Random Forest importance measures.

Authors:  M Luz Calle; Víctor Urrea
Journal:  Brief Bioinform       Date:  2010-03-31       Impact factor: 11.622

2.  Predictor correlation impacts machine learning algorithms: implications for genomic studies.

Authors:  Kristin K Nicodemus; James D Malley
Journal:  Bioinformatics       Date:  2009-05-21       Impact factor: 6.937

3.  Stability and aggregation of ranked gene lists (review).

Authors:  Anne-Laure Boulesteix; Martin Slawski
Journal:  Brief Bioinform       Date:  2009-09       Impact factor: 11.622

4.  The behaviour of random forest permutation-based variable importance measures under predictor correlation.

Authors:  Kristin K Nicodemus; James D Malley; Carolin Strobl; Andreas Ziegler
Journal:  BMC Bioinformatics       Date:  2010-02-27       Impact factor: 3.169

5.  Performance of random forest when SNPs are in linkage disequilibrium.

Authors:  Yan A Meng; Yi Yu; L Adrienne Cupples; Lindsay A Farrer; Kathryn L Lunetta
Journal:  BMC Bioinformatics       Date:  2009-03-05       Impact factor: 3.169

6.  Bias in random forest variable importance measures: illustrations, sources and a solution.

Authors:  Carolin Strobl; Anne-Laure Boulesteix; Achim Zeileis; Torsten Hothorn
Journal:  BMC Bioinformatics       Date:  2007-01-25       Impact factor: 3.169

7.  Conditional variable importance for random forests.

Authors:  Carolin Strobl; Anne-Laure Boulesteix; Thomas Kneib; Thomas Augustin; Achim Zeileis
Journal:  BMC Bioinformatics       Date:  2008-07-11       Impact factor: 3.169

  23 in total

1.  GIS-based groundwater potential mapping using boosted regression tree, classification and regression tree, and random forest machine learning models in Iran.

Authors:  Seyed Amir Naghibi; Hamid Reza Pourghasemi; Barnali Dixon
Journal:  Environ Monit Assess       Date:  2015-12-19       Impact factor: 2.513

2.  Study becomes insight: Ecological learning from machine learning.

Authors:  Qiuyan Yu; Wenjie Ji; Lara Prihodko; C Wade Ross; Julius Y Anchang; Niall P Hanan
Journal:  Methods Ecol Evol       Date:  2021-08-06       Impact factor: 8.335

3.  Automatic health record review to help prioritize gravely ill Social Security disability applicants.

Authors:  Kenneth Abbott; Yen-Yi Ho; Jennifer Erickson
Journal:  J Am Med Inform Assoc       Date:  2017-07-01       Impact factor: 4.497

4.  Data mining in the Life Sciences with Random Forest: a walk in the park or lost in the jungle?

Authors:  Wouter G Touw; Jumamurat R Bayjanov; Lex Overmars; Lennart Backus; Jos Boekhorst; Michiel Wels; Sacha A F T van Hijum
Journal:  Brief Bioinform       Date:  2012-07-10       Impact factor: 11.622

5.  Random KNN feature selection - a fast and stable alternative to Random Forests.

Authors:  Shengqiao Li; E James Harner; Donald A Adjeroh
Journal:  BMC Bioinformatics       Date:  2011-11-18       Impact factor: 3.169

6.  Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data.

Authors:  Enrico Glaab; Jaume Bacardit; Jonathan M Garibaldi; Natalio Krasnogor
Journal:  PLoS One       Date:  2012-07-11       Impact factor: 3.240

7.  Unravelling the GSK3β-related genotypic interaction network influencing hippocampal volume in recurrent major depressive disorder.

Authors:  Becky Inkster; Andy Simmons; James H Cole; Erwin Schoof; Rune Linding; Tom Nichols; Pierandrea Muglia; Florian Holsboer; Philipp G Sämann; Peter McGuffin; Cynthia H Y Fu; Kamilla Miskowiak; Paul M Matthews; Gwyneth Zai; Kristin Nicodemus
Journal:  Psychiatr Genet       Date:  2018-10       Impact factor: 2.458

8.  A multi-hazard map-based flooding, gully erosion, forest fires, and earthquakes in Iran.

Authors:  Soheila Pouyan; Hamid Reza Pourghasemi; Mojgan Bordbar; Soroor Rahmanian; John J Clague
Journal:  Sci Rep       Date:  2021-07-21       Impact factor: 4.379

9.  An AUC-based permutation variable importance measure for random forests.

Authors:  Silke Janitza; Carolin Strobl; Anne-Laure Boulesteix
Journal:  BMC Bioinformatics       Date:  2013-04-05       Impact factor: 3.169

10.  Combining techniques for screening and evaluating interaction terms on high-dimensional time-to-event data.

Authors:  Murat Sariyar; Isabell Hoffmann; Harald Binder
Journal:  BMC Bioinformatics       Date:  2014-02-26       Impact factor: 3.169
