Hidden Stratification Causes Clinically Meaningful Failures in Machine Learning for Medical Imaging.

Luke Oakden-Rayner¹, Jared Dunnmon², Gustavo Carneiro¹, Christopher Ré².

Abstract

Machine learning models for medical image analysis often suffer from poor performance on important subsets of a population that are not identified during training or testing. For example, overall performance of a cancer detection model may be high, but the model may still consistently miss a rare but aggressive cancer subtype. We refer to this problem as hidden stratification, and observe that it results from incompletely describing the meaningful variation in a dataset. While hidden stratification can substantially reduce the clinical efficacy of machine learning models, its effects remain difficult to measure. In this work, we assess the utility of several possible techniques for measuring hidden stratification effects, and characterize these effects both via synthetic experiments on the CIFAR-100 benchmark dataset and on multiple real-world medical imaging datasets. Using these measurement techniques, we find evidence that hidden stratification can occur in unidentified imaging subsets with low prevalence, low label quality, subtle distinguishing features, or spurious correlates, and that it can result in relative performance differences of over 20% on clinically important subsets. Finally, we discuss the clinical implications of our findings, and suggest that evaluation of hidden stratification should be a critical component of any machine learning deployment in medical imaging.
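The gap between overall and subset performance described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the counts are synthetic, the subclass names `common` and `rare` are hypothetical, and recall is used only as an example metric (the abstract does not say which metric underlies the reported 20% figure). The helper functions `recall_by_subclass` and `relative_gap` are not taken from the paper.

```python
from collections import defaultdict


def recall_by_subclass(y_true, y_pred, subclass):
    """Return (overall_recall, {subclass: recall}) computed over positive cases."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, subclass):
        if truth == 1:
            totals[group] += 1
            hits[group] += int(pred == 1)
    overall = sum(hits.values()) / sum(totals.values())
    per_subclass = {group: hits[group] / totals[group] for group in totals}
    return overall, per_subclass


def relative_gap(overall, subset):
    """Relative drop of a subset's score with respect to the overall score."""
    return (overall - subset) / overall


# Synthetic, hypothetical data: 100 positive cases, 95 of a common subtype
# and 5 of a rare subtype. The model finds 90 of the common cases but only
# 1 of the rare ones.
y_true = [1] * 100
y_pred = [1] * 90 + [0] * 5 + [1] * 1 + [0] * 4
subclass = ["common"] * 95 + ["rare"] * 5

overall, per_subclass = recall_by_subclass(y_true, y_pred, subclass)
rare_gap = relative_gap(overall, per_subclass["rare"])

print(f"overall recall: {overall:.2f}")
for group, value in per_subclass.items():
    print(f"{group} recall: {value:.2f}")
print(f"relative gap on rare subtype: {rare_gap:.0%}")
```

In this toy setting the aggregate recall looks acceptable while the rare subtype is missed most of the time, which is the pattern the abstract calls hidden stratification. The sketch presumes subclass labels are available; when they are not recorded, the subsets have to be identified by some other means before such a comparison is possible.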

Keywords: Computing methodologies → Machine learning; convolutional neural networks; hidden stratification; machine learning

Year: 2020 PMID: 33196064 PMCID: PMC7665161 DOI: 10.1145/3368555.3384468

Source DB: PubMed Journal: Proc ACM Conf Health Inference Learn (2020)
