Inconsistent Performance of Deep Learning Models on Mammogram Classification.

Xiaoqin Wang, Gongbo Liang, Yu Zhang, Hunter Blanton, Zachary Bessinger, Nathan Jacobs.

Abstract

OBJECTIVES: Performance of recently developed deep learning models for image classification surpasses that of radiologists. However, there are questions about model performance consistency and generalization in unseen external data. The purpose of this study is to determine whether the high performance of deep learning on mammograms can be transferred to external data with a different data distribution.
MATERIALS AND METHODS: Six deep learning models (three published models with high performance and three models designed by us) were evaluated on four different mammogram data sets, including three public (Digital Database for Screening Mammography, INbreast, and Mammographic Image Analysis Society) and one private data set (UKy). The models were trained and validated on either Digital Database for Screening Mammography alone or a combined data set that included Digital Database for Screening Mammography. The models were then tested on the three external data sets. The area under the receiver operating characteristic curve (auROC) was used to evaluate model performance.
RESULTS: The three published models reported auROC scores between 0.88 and 0.95 on the validation data set. Our models achieved auROC scores between 0.71 (95% confidence interval [CI]: 0.70-0.72) and 0.79 (95% CI: 0.78-0.80) on the same validation data set. However, the auROC of all six models on the three external test data sets was significantly lower, ranging only from 0.44 (95% CI: 0.43-0.45) to 0.65 (95% CI: 0.64-0.66).
CONCLUSION: Our results demonstrate performance inconsistency across the data sets and models, indicating that the high performance of deep learning models on one data set cannot be readily transferred to unseen external data sets, and these models need further assessment and validation before being applied in clinical practice.
Copyright © 2020 American College of Radiology. Published by Elsevier Inc. All rights reserved.
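
The abstract reports auROC values with 95% confidence intervals but does not say how the intervals were computed. As an illustration only, the sketch below computes auROC from its rank-based (Mann-Whitney) definition and attaches a percentile bootstrap interval. The bootstrap is an assumption, not the authors' stated method, and the function names `auroc` and `bootstrap_ci` are hypothetical.

```python
import random


def auroc(labels, scores):
    """auROC as the probability that a positive case outscores a negative one.

    labels: 0/1 ints (1 = positive); scores: higher means more likely positive.
    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for auROC (assumed method, see lead-in)."""
    auroc(labels, scores)  # fail early if either class is missing
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one class; auROC undefined, draw again
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper
```

For example, `auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` gives 0.75, because three of the four positive-negative pairs are ranked correctly. Calling `bootstrap_ci` separately on a validation set and on each external test set would make the kind of gap described in the RESULTS paragraph directly comparable.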

Keywords:  Deep learning; mammogram; performance inconsistency

Year:  2020        PMID: 32068005     DOI: 10.1016/j.jacr.2020.01.006

Source DB:  PubMed          Journal:  J Am Coll Radiol        ISSN: 1546-1440            Impact factor:   5.532


  17 in total

Review 1.  Deep learning in breast radiology: current progress and future directions.

Authors:  William C Ou; Dogan Polat; Basak E Dogan
Journal:  Eur Radiol       Date:  2021-01-15       Impact factor: 5.315

2.  Geographic Distribution of US Cohorts Used to Train Deep Learning Algorithms.

Authors:  Amit Kaushal; Russ Altman; Curt Langlotz
Journal:  JAMA       Date:  2020-09-22       Impact factor: 56.272

Review 3.  Radiology artificial intelligence: a systematic review and evaluation of methods (RAISE).

Authors:  Brendan S Kelly; Conor Judge; Stephanie M Bollard; Simon M Clifford; Gerard M Healy; Awsam Aziz; Prateek Mathur; Shah Islam; Kristen W Yeom; Aonghus Lawlor; Ronan P Killeen
Journal:  Eur Radiol       Date:  2022-04-14       Impact factor: 5.315

Review 4.  Shifting machine learning for healthcare from development to deployment and from models to data.

Authors:  Angela Zhang; Lei Xing; James Zou; Joseph C Wu
Journal:  Nat Biomed Eng       Date:  2022-07-04       Impact factor: 25.671

5.  Contrastive Cross-Modal Pre-Training: A General Strategy for Small Sample Medical Imaging.

Authors:  Gongbo Liang; Connor Greenwell; Yu Zhang; Xin Xing; Xiaoqin Wang; Ramakanth Kavuluru; Nathan Jacobs
Journal:  IEEE J Biomed Health Inform       Date:  2022-04-14       Impact factor: 7.021

6.  External Validation of Deep Learning Algorithms for Radiologic Diagnosis: A Systematic Review.

Authors:  Alice C Yu; Bahram Mohajer; John Eng
Journal:  Radiol Artif Intell       Date:  2022-05-04

7.  Diagnostic Accuracy and Failure Mode Analysis of a Deep Learning Algorithm for the Detection of Intracranial Hemorrhage.

Authors:  Andrew F Voter; Ece Meram; John W Garrett; John-Paul J Yu
Journal:  J Am Coll Radiol       Date:  2021-04-03       Impact factor: 6.240

8.  How Clinicians Perceive Artificial Intelligence-Assisted Technologies in Diagnostic Decision Making: Mixed Methods Approach.

Authors:  Deana Shevit Goldin; Hyeyoung Hah
Journal:  J Med Internet Res       Date:  2021-12-16       Impact factor: 5.428

9.  Deep Learning Systems for Pneumothorax Detection on Chest Radiographs: A Multicenter External Validation Study.

Authors:  Yee Liang Thian; Dianwen Ng; James Thomas Patrick Decourcy Hallinan; Pooja Jagmohan; Soon Yiew Sia; Cher Heng Tan; Yong Han Ting; Pin Lin Kei; Geoiphy George Pulickal; Vincent Tze Yang Tiong; Swee Tian Quek; Mengling Feng
Journal:  Radiol Artif Intell       Date:  2021-04-14

10.  Big data and predictive analytics in healthcare in Bangladesh: regulatory challenges.

Authors:  Shafiqul Hassan; Mohsin Dhali; Fazluz Zaman; Muhammad Tanveer
Journal:  Heliyon       Date:  2021-05-29