Efficient nonparametric statistical inference on population feature importance using Shapley values.

Brian D Williamson, Jean Feng.

Abstract

The true population-level importance of a variable in a prediction task provides useful knowledge about the underlying data-generating mechanism and can help in deciding which measurements to collect in subsequent experiments. Valid statistical inference on this importance is a key component in understanding the population of interest. We present a computationally efficient procedure for estimating and obtaining valid statistical inference on the Shapley Population Variable Importance Measure (SPVIM). Although the computational complexity of the true SPVIM scales exponentially with the number of variables, we propose an estimator based on randomly sampling only Θ(n) feature subsets given n observations. We prove that our estimator converges at an asymptotically optimal rate. Moreover, by deriving the asymptotic distribution of our estimator, we construct valid confidence intervals and hypothesis tests. Our procedure has good finite-sample performance in simulations, and for an in-hospital mortality prediction task produces similar variable importance estimates when different machine learning algorithms are applied.
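The abstract does not spell out the estimator, so the following is only a minimal sketch of the general idea of approximating Shapley values by sampling rather than enumerating all feature subsets. It uses plain permutation sampling, which is a simpler, well-known technique and not the SPVIM estimator proposed in the paper. The names `exact_shapley`, `sampled_shapley`, and `toy_value` are hypothetical, and `toy_value` is a made-up stand-in for the predictiveness (for example, R-squared) achievable with a given feature subset.

```python
import itertools
import math
import random


def exact_shapley(value, p):
    """Exact Shapley values by enumerating every subset (cost grows as 2^p)."""
    phi = [0.0] * p
    for j in range(p):
        others = [k for k in range(p) if k != j]
        for size in range(p):
            # Shapley weight for a subset of this size that excludes feature j.
            weight = (math.factorial(size) * math.factorial(p - size - 1)
                      / math.factorial(p))
            for subset in itertools.combinations(others, size):
                s = frozenset(subset)
                phi[j] += weight * (value(s | {j}) - value(s))
    return phi


def sampled_shapley(value, p, num_permutations, seed=0):
    """Monte Carlo approximation: average marginal gains over random orderings."""
    rng = random.Random(seed)
    phi = [0.0] * p
    for _ in range(num_permutations):
        order = list(range(p))
        rng.shuffle(order)
        s = frozenset()
        prev = value(s)
        for j in order:
            s = s | {j}
            cur = value(s)
            phi[j] += cur - prev  # gain from adding feature j in this ordering
            prev = cur
    return [total / num_permutations for total in phi]


def toy_value(s):
    """Hypothetical predictiveness of feature subset s (assumed, not from the paper)."""
    base = {0: 0.3, 1: 0.2, 2: 0.0}
    v = sum(base[j] for j in s)
    if 0 in s and 1 in s:
        v += 0.1  # extra predictiveness when features 0 and 1 are used together
    return v


exact = exact_shapley(toy_value, 3)
approx = sampled_shapley(toy_value, 3, num_permutations=2000)
```

In this toy setting the interaction gain of 0.1 is split evenly between features 0 and 1, and the importances sum to the predictiveness of the full feature set. The sampled version evaluates `p` nested subsets per ordering instead of all `2^p` subsets, which is the kind of saving that makes sampling-based estimators attractive when there are many variables.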

Year:  2020        PMID: 33884372      PMCID: PMC8057672     

Source DB:  PubMed          Journal:  Proc Mach Learn Res


  5 in total

1.  A model for immunological correlates of protection.

Authors:  Andrew J Dunning
Journal:  Stat Med       Date:  2006-05-15       Impact factor: 2.373

2.  From Local Explanations to Global Understanding with Explainable AI for Trees.

Authors:  Scott M Lundberg; Gabriel Erion; Hugh Chen; Alex DeGrave; Jordan M Prutkin; Bala Nair; Ronit Katz; Jonathan Himmelfarb; Nisha Bansal; Su-In Lee
Journal:  Nat Mach Intell       Date:  2020-01-17

3.  Definitions, methods, and applications in interpretable machine learning.

Authors:  W James Murdoch; Chandan Singh; Karl Kumbier; Reza Abbasi-Asl; Bin Yu
Journal:  Proc Natl Acad Sci U S A       Date:  2019-10-16       Impact factor: 11.205

4.  Predicting In-Hospital Mortality of ICU Patients: The PhysioNet/Computing in Cardiology Challenge 2012.

Authors:  Ikaro Silva; George Moody; Daniel J Scott; Leo A Celi; Roger G Mark
Journal:  Comput Cardiol (2010)       Date:  2012

5.  Nonparametric variable importance assessment using machine learning techniques.

Authors:  Brian D Williamson; Peter B Gilbert; Marco Carone; Noah Simon
Journal:  Biometrics       Date:  2020-12-08       Impact factor: 2.571

  3 in total

1.  Explaining a series of models by propagating Shapley values.

Authors:  Hugh Chen; Scott M Lundberg; Su-In Lee
Journal:  Nat Commun       Date:  2022-08-03       Impact factor: 17.694

2.  Marginal Contribution Feature Importance - an Axiomatic Approach for Explaining Data.

Authors:  Amnon Catav; Boyang Fu; Yazeed Zoabi; Ahuva Weiss-Meilik; Noam Shomron; Jason Ernst; Sriram Sankararaman; Ran Gilad-Bachrach
Journal:  Proc Mach Learn Res       Date:  2021-07

3.  Clinical artificial intelligence quality improvement: towards continual monitoring and updating of AI algorithms in healthcare. (Review)

Authors:  Jean Feng; Rachael V Phillips; Ivana Malenica; Andrew Bishara; Alan E Hubbard; Leo A Celi; Romain Pirracchio
Journal:  NPJ Digit Med       Date:  2022-05-31