Clinton J Wang1, Charlie A Hamm1,2, Lynn J Savic1,2, Marc Ferrante1, Isabel Schobert1,2, Todd Schlachter1, MingDe Lin1, Jeffrey C Weinreb1, James S Duncan1,3, Julius Chapiro4, Brian Letzen1. 1. Department of Radiology and Biomedical Imaging, Yale School of Medicine, 333 Cedar Street, New Haven, CT, 06520, USA. 2. Institute of Radiology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, 10117, Berlin, Germany. 3. Department of Biomedical Engineering, Yale School of Engineering and Applied Science, New Haven, CT, 06520, USA. 4. Department of Radiology and Biomedical Imaging, Yale School of Medicine, 333 Cedar Street, New Haven, CT, 06520, USA. j.chapiro@googlemail.com.
Abstract
OBJECTIVES: To develop a proof-of-concept "interpretable" deep learning prototype that justifies aspects of the predictions of a pre-trained hepatic lesion classifier.

METHODS: A convolutional neural network (CNN) was engineered and trained to classify six hepatic tumor entities using 494 lesions on multi-phasic MRI, as described in Part 1. A subset of each lesion class was labeled with up to four key imaging features per lesion. A post hoc algorithm inferred the presence of these features in a test set of 60 lesions by analyzing activation patterns of the pre-trained CNN model. Feature maps were generated to highlight the regions in the original image corresponding to particular features. Additionally, each identified feature was assigned a relevance score denoting its relative contribution to the predicted lesion classification.

RESULTS: The interpretable deep learning system achieved 76.5% positive predictive value and 82.9% sensitivity in identifying the correct radiological features present in each test lesion. The model misclassified 12% of lesions. Correct features were identified more often in correctly classified lesions than in misclassified lesions (85.6% vs. 60.4%). Feature maps were consistent with the original image voxels contributing to each imaging feature, and feature relevance scores tended to reflect the most prominent imaging criteria for each class.

CONCLUSIONS: This interpretable deep learning system demonstrates proof of principle for illuminating portions of a pre-trained deep neural network's decision-making by analyzing its inner layers and automatically describing the features contributing to its predictions.

KEY POINTS:
• An interpretable deep learning system prototype can explain aspects of its decision-making by identifying relevant imaging features and showing where these features are found on an image, facilitating clinical translation.
• By providing feedback on the importance of various radiological features in performing differential diagnosis, interpretable deep learning systems have the potential to interface with standardized reporting systems such as LI-RADS, validating ancillary features and improving clinical practicality.
• An interpretable deep learning system could potentially add quantitative data to radiologic reports and serve radiologists with evidence-based decision support.
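The METHODS above outline a post hoc algorithm that infers imaging features from the activation patterns of a pre-trained CNN, renders feature maps over the input image, and assigns each feature a relevance score. The abstract does not specify the algorithm itself, so the Python sketch below illustrates one plausible realization using Grad-CAM-style gradient weighting; the stand-in network (TinyLesionCNN), the linear feature probes (feature_probe), and all shapes are hypothetical placeholders, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyLesionCNN(nn.Module):
    """Hypothetical stand-in for the pre-trained 6-class hepatic lesion classifier."""
    def __init__(self, n_classes=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):
        a = self.features(x)                    # inner-layer activations (B, 32, H/2, W/2)
        return self.head(a.mean(dim=(2, 3))), a

model = TinyLesionCNN().eval()
feature_probe = nn.Linear(32, 4)   # hypothetical probes for 4 labeled imaging features

x = torch.randn(1, 3, 64, 64)      # stand-in for a multi-phasic MRI lesion patch
logits, acts = model(x)
pred_class = logits.argmax(dim=1).item()

# Infer the presence of each imaging feature from pooled inner-layer activations.
feat_scores = feature_probe(acts.mean(dim=(2, 3)))

# Feature map: gradient-weighted activations (Grad-CAM style) highlight the
# image regions that drive one inferred imaging feature (index 0 here).
grads = torch.autograd.grad(feat_scores[0, 0], acts, retain_graph=True)[0]
weights = grads.mean(dim=(2, 3), keepdim=True)   # per-channel importance
cam = F.relu((weights * acts).sum(dim=1))        # (1, H/2, W/2) spatial map
cam = cam / (cam.max() + 1e-8)                   # normalize to [0, 1]

# Relevance scores: each detected feature's relative contribution to the
# prediction, approximated here by normalizing positive probe scores to sum to 1.
relevance = F.relu(feat_scores[0])
relevance = relevance / (relevance.sum() + 1e-8)
print(pred_class, cam.shape, relevance.tolist())

In the study's setting, the probes would presumably be fit to the radiologist-labeled imaging features from the training subset, and the reported metrics, positive predictive value (TP / (TP + FP)) and sensitivity (TP / (TP + FN)), would be computed by comparing the inferred feature lists against those labels.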
Keywords: Artificial intelligence; Deep learning; Liver cancer