Hidden Stratification Causes Clinically Meaningful Failures in Machine Learning for Medical Imaging.

Luke Oakden-Rayner, Jared Dunnmon, Gustavo Carneiro, Christopher Ré.

Abstract

Machine learning models for medical image analysis often suffer from poor performance on important subsets of a population that are not identified during training or testing. For example, overall performance of a cancer detection model may be high, but the model may still consistently miss a rare but aggressive cancer subtype. We refer to this problem as hidden stratification, and observe that it results from incompletely describing the meaningful variation in a dataset. While hidden stratification can substantially reduce the clinical efficacy of machine learning models, its effects remain difficult to measure. In this work, we assess the utility of several possible techniques for measuring hidden stratification effects, and characterize these effects both via synthetic experiments on the CIFAR-100 benchmark dataset and on multiple real-world medical imaging datasets. Using these measurement techniques, we find evidence that hidden stratification can occur in unidentified imaging subsets with low prevalence, low label quality, subtle distinguishing features, or spurious correlates, and that it can result in relative performance differences of over 20% on clinically important subsets. Finally, we discuss the clinical implications of our findings, and suggest that evaluation of hidden stratification should be a critical component of any machine learning deployment in medical imaging.
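The relative performance difference mentioned in the abstract can be illustrated with a small sketch. It assumes that subset labels (for example, a cancer subtype) are available for the test cases, and it uses sensitivity as the metric. The function names, the toy data, and the choice of metric are illustrative assumptions, not the authors' measurement procedure.

```python
from collections import defaultdict


def sensitivity_by_subset(y_true, y_pred, subset_labels):
    """Return (overall sensitivity, {subset: sensitivity}) over positive cases only."""
    hits = defaultdict(int)    # correctly detected positives per subset
    totals = defaultdict(int)  # all positives per subset
    for truth, pred, subset in zip(y_true, y_pred, subset_labels):
        if truth == 1:
            totals[subset] += 1
            hits[subset] += int(pred == 1)
    total_positives = sum(totals.values())
    if total_positives == 0:
        raise ValueError("no positive cases to evaluate")
    overall = sum(hits.values()) / total_positives
    per_subset = {name: hits[name] / totals[name] for name in totals}
    return overall, per_subset


def relative_gap(overall, subset_value):
    """Relative drop of a subset's metric compared with the overall metric."""
    return (overall - subset_value) / overall


# Hypothetical toy data: 90 common-subtype and 10 rare-subtype positives,
# followed by 50 negatives. The model finds 85 of the common cases but
# only 5 of the rare ones.
y_true = [1] * 100 + [0] * 50
subsets = ["common"] * 90 + ["rare"] * 10 + ["none"] * 50
y_pred = [1] * 85 + [0] * 5 + [1] * 5 + [0] * 5 + [0] * 50

overall, per_subset = sensitivity_by_subset(y_true, y_pred, subsets)
rare_gap = relative_gap(overall, per_subset["rare"])

print(f"overall sensitivity: {overall:.2f}")          # 0.90
print(f"rare-subtype sensitivity: {per_subset['rare']:.2f}")  # 0.50
print(f"relative gap on rare subtype: {rare_gap:.2f}")        # 0.44
```

In this toy case the aggregate sensitivity looks acceptable while half of the rare-subtype cases are missed, which is the pattern the abstract calls hidden stratification. The sketch only works when the subset is already labeled; the unidentified subsets the abstract is concerned with would first have to be surfaced by some other means.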

Keywords:  Computing methodologies → Machine learning; convolutional neural networks; hidden stratification; machine learning

Year: 2020; PMID: 33196064; PMCID: PMC7665161; DOI: 10.1145/3368555.3384468

Source DB: PubMed; Journal: Proc ACM Conf Health Inference Learn (2020)

