All Models are Wrong, but Many are Useful: Learning a Variable's Importance by Studying an Entire Class of Prediction Models Simultaneously.

Aaron Fisher, Cynthia Rudin, Francesca Dominici.

Abstract

Variable importance (VI) tools describe how much covariates contribute to a prediction model's accuracy. However, important variables for one well-performing model (for example, a linear model f(x) = x^T β with a fixed coefficient vector β) may be unimportant for another model. In this paper, we propose model class reliance (MCR) as the range of VI values across all well-performing models in a prespecified class. Thus, MCR gives a more comprehensive description of importance by accounting for the fact that many prediction models, possibly of different parametric forms, may fit the data well. In the process of deriving MCR, we show several informative results for permutation-based VI estimates, based on the VI measures used in Random Forests. Specifically, we derive connections between permutation importance estimates for a single prediction model, U-statistics, conditional variable importance, conditional causal effects, and linear model coefficients. We then give probabilistic bounds for MCR, using a novel, generalizable technique. We apply MCR to a public data set of Broward County criminal records to study the reliance of recidivism prediction models on sex and race. In this application, MCR can be used to help inform VI for unknown, proprietary models.
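
The idea of a range of VI values across well-performing models can be illustrated with a small sketch. It is not the authors' estimator: it assumes squared-error loss, measures a single model's reliance as the ratio of permuted loss to original loss, and replaces the full model class with a hypothetical finite list of three hand-picked candidates. The names `model_reliance`, `empirical_mcr`, and `eps`, the synthetic data, and the rule "within `eps` of the best candidate" are all assumptions made for illustration.

```python
import numpy as np

def mse(predict, X, y):
    """Mean squared error of a prediction function on (X, y)."""
    return float(np.mean((y - predict(X)) ** 2))

def model_reliance(predict, X, y, col, rng, n_perm=20):
    """Permutation-based reliance on column `col`:
    average loss after shuffling that column, divided by the original loss."""
    base = mse(predict, X, y)
    permuted = []
    for _ in range(n_perm):
        Xp = X.copy()
        Xp[:, col] = rng.permutation(Xp[:, col])  # break the link between col and y
        permuted.append(mse(predict, Xp, y))
    return float(np.mean(permuted)) / base

def empirical_mcr(models, X, y, col, eps, rng):
    """Range of reliance over candidates whose loss is within `eps` of the best
    candidate (an illustrative stand-in for the set of well-performing models)."""
    losses = [mse(m, X, y) for m in models]
    best = min(losses)
    good = [m for m, loss in zip(models, losses) if loss <= best + eps]
    reliances = [model_reliance(m, X, y, col, rng) for m in good]
    return min(reliances), max(reliances)

# Synthetic data: x2 is a near-copy of x1, so either one predicts y well.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
y = x1 + 0.5 * rng.normal(size=n)
X = np.column_stack([x1, x2])

# Hypothetical candidate models with fixed coefficients.
models = [
    lambda X: X[:, 0],                       # uses x1 only
    lambda X: X[:, 1],                       # uses x2 only
    lambda X: 0.5 * X[:, 0] + 0.5 * X[:, 1]  # splits the weight
]

lo, hi = empirical_mcr(models, X, y, col=0, eps=0.05, rng=rng)
print(f"reliance on x1 ranges from {lo:.2f} to {hi:.2f}")
```

In this setup the model that uses only x2 is unaffected by shuffling x1, so its reliance ratio is 1, while the model that uses only x1 loses most of its accuracy. All three candidates fit about equally well, which is the situation where reporting a range is more informative than reporting the VI of one model.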

Keywords:  Rashomon; U-statistics; conditional variable importance; interpretable models; permutation importance; transparency

Year:  2019        PMID: 34335110      PMCID: PMC8323609     

Source DB:  PubMed          Journal:  J Mach Learn Res        ISSN: 1532-4435            Impact factor:   5.177


  21 in total

1.  Matching methods for causal inference: A review and a look forward.

Authors:  Elizabeth A Stuart
Journal:  Stat Sci       Date:  2010-02-01       Impact factor: 2.901

Review 2.  Risk Assessment in Criminal Sentencing.

Authors:  John Monahan; Jennifer L Skeem
Journal:  Annu Rev Clin Psychol       Date:  2015-12-11       Impact factor: 18.561

3.  Racial differences in marijuana-users' risk of arrest in the United States.

Authors:  Rajeev Ramchand; Rosalie Liccardo Pacula; Martin Y Iguchi
Journal:  Drug Alcohol Depend       Date:  2006-04-05       Impact factor: 4.492

4.  Reinforcement Learning Trees.

Authors:  Ruoqing Zhu; Donglin Zeng; Michael R Kosorok
Journal:  J Am Stat Assoc       Date:  2015-04-16       Impact factor: 5.033

5.  Classification with correlated features: unreliability of feature ranking and solutions.

Authors:  Laura Tolosi; Thomas Lengauer
Journal:  Bioinformatics       Date:  2011-05-16       Impact factor: 6.937

6.  Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments.

Authors:  Alexandra Chouldechova
Journal:  Big Data       Date:  2017-06       Impact factor: 2.128

7.  Prediction uncertainty and optimal experimental design for learning dynamical systems.

Authors:  Benjamin Letham; Portia A Letham; Cynthia Rudin; Edward P Browne
Journal:  Chaos       Date:  2016-06       Impact factor: 3.642

8.  Misuse of DeLong test to compare AUCs for nested models.

Authors:  Olga V Demler; Michael J Pencina; Ralph B D'Agostino
Journal:  Stat Med       Date:  2012-03-13       Impact factor: 2.373

9.  Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead.

Authors:  Cynthia Rudin
Journal:  Nat Mach Intell       Date:  2019-05-13

10.  Bias in random forest variable importance measures: illustrations, sources and a solution.

Authors:  Carolin Strobl; Anne-Laure Boulesteix; Achim Zeileis; Torsten Hothorn
Journal:  BMC Bioinformatics       Date:  2007-01-25       Impact factor: 3.169

  30 in total

Review 1.  A roadmap for multi-omics data integration using deep learning.

Authors:  Mingon Kang; Euiseong Ko; Tesfaye B Mersha
Journal:  Brief Bioinform       Date:  2022-01-17       Impact factor: 11.622

2.  A novel analytical framework for risk stratification of real-world data using machine learning: A small cell lung cancer study.

Authors:  Luca Marzano; Adam S Darwich; Salomon Tendler; Asaf Dan; Rolf Lewensohn; Luigi De Petris; Jayanth Raghothama; Sebastiaan Meijer
Journal:  Clin Transl Sci       Date:  2022-07-29       Impact factor: 4.438

3.  Metabolite, protein, and tissue dysfunction associated with COVID-19 disease severity.

Authors:  Ali Rahnavard; Brendan Mann; Abhigya Giri; Ranojoy Chatterjee; Keith A Crandall
Journal:  Sci Rep       Date:  2022-07-16       Impact factor: 4.996

4.  Scrutinizing XAI using linear ground-truth data with suppressor variables.

Authors:  Rick Wilming; Céline Budding; Klaus-Robert Müller; Stefan Haufe
Journal:  Mach Learn       Date:  2022-04-13       Impact factor: 5.414

5.  Construction and Evaluation of Robust Interpretation Models for Breast Cancer Metastasis Prediction.

Authors:  Nahim Adnan; Maryam Zand; Tim H M Huang; Jianhua Ruan
Journal:  IEEE/ACM Trans Comput Biol Bioinform       Date:  2022-06-03       Impact factor: 3.702

6.  Machine learning v. traditional regression models predicting treatment outcomes for binge-eating disorder from a randomized controlled trial.

Authors:  Lauren N Forrest; Valentina Ivezaj; Carlos M Grilo
Journal:  Psychol Med       Date:  2021-11-25       Impact factor: 10.592

7.  Cox-nnet v2.0: improved neural-network based survival prediction extended to large-scale EMR data.

Authors:  Di Wang; Zheng Jing; Kevin He; Lana X Garmire
Journal:  Bioinformatics       Date:  2021-01-30       Impact factor: 6.937

8.  Shapley variable importance cloud for interpretable machine learning.

Authors:  Yilin Ning; Marcus Eng Hock Ong; Bibhas Chakraborty; Benjamin Alan Goldstein; Daniel Shu Wei Ting; Roger Vaughan; Nan Liu
Journal:  Patterns (N Y)       Date:  2022-02-22

9.  Cultural differences in music features across Taiwanese, Japanese and American markets.

Authors:  Kongmeng Liew; Yukiko Uchida; Igor de Almeida
Journal:  PeerJ Comput Sci       Date:  2021-08-03

10.  Mapping environmental suitability of Haemagogus and Sabethes spp. mosquitoes to understand sylvatic transmission risk of yellow fever virus in Brazil.

Authors:  Sabrina L Li; André L Acosta; Sarah C Hill; Oliver J Brady; Marco A B de Almeida; Jader da C Cardoso; Arran Hamlet; Luis F Mucci; Juliana Telles de Deus; Felipe C M Iani; Neil S Alexander; G R William Wint; Oliver G Pybus; Moritz U G Kraemer; Nuno R Faria; Jane P Messina
Journal:  PLoS Negl Trop Dis       Date:  2022-01-07