
Clinically focused multi-cohort benchmarking as a tool for external validation of artificial intelligence algorithm performance in basic chest radiography analysis.

Jan Rudolph, Balthasar Schachtner, Nicola Fink, Vanessa Koliogiannis, Vincent Schwarze, Sophia Goller, Lena Trappmann, Boj F Hoppe, Nabeel Mansour, Maximilian Fischer, Najib Ben Khaled, Maximilian Jörgens, Julien Dinkel, Wolfgang G Kunz, Jens Ricke, Michael Ingrisch, Bastian O Sabel, Johannes Rueckel.

Abstract

The number of artificial intelligence (AI) algorithms that evaluate [supine] chest radiographs ([S]CXRs) has increased remarkably in recent years. Because training and validation are often performed on subsets of the same overall dataset, external validation is mandatory to reproduce results and reveal potential training errors. We applied multi-cohort benchmarking to CheXNet, a publicly accessible AI algorithm for (S)CXR analysis. The benchmark comprises three clinically relevant study cohorts that differ in patient positioning ([S]CXRs), in the applied reference standards (CT-/[S]CXR-based), and in whether algorithm classification can also be compared with the reading performance of different medical experts. The study cohorts are: [1] 563 CXRs acquired in the emergency unit and evaluated by 9 readers (radiologists and non-radiologists) for 4 common pathologies; [2] 6,248 SCXRs annotated by radiologists for pneumothorax presence, pneumothorax size, and the presence of inserted thoracic tube material, which allowed subgroup and confounding bias analysis; and [3] 166 patients whose SCXRs were evaluated by radiologists for underlying causes of basal lung opacities, each case correlated with a computed tomography scan acquired close in time (SCXR and CT within < 90 min). CheXNet exceeded, without statistical significance, the radiology resident (RR) consensus in the detection of suspicious lung nodules (cohort [1], AUC AI/RR: 0.851/0.839, p = 0.793) and the radiological readers in the detection of basal pneumonia (cohort [3], AUC AI/reader consensus: 0.825/0.782, p = 0.390) and basal pleural effusion (cohort [3], AUC AI/reader consensus: 0.762/0.710, p = 0.336) in SCXRs; some of these AUC values were higher than those originally published ("Nodule": 0.780, "Infiltration": 0.735, "Effusion": 0.864). The "Infiltration" classifier proved highly dependent on patient positioning (best in CXR, worst in SCXR).
The pneumothorax SCXR cohort [2] revealed poor algorithm performance in CXRs without inserted thoracic material and in the detection of small pneumothoraces, which can be explained by a known systematic confounding error in the algorithm training process. The differences in algorithm performance compared with the original publication demonstrate the benefit of clinically relevant external validation. Our multi-cohort benchmarking enables confounders, different reference standards, and patient positioning to be taken into account, and allows AI performance to be compared with that of medical readers with different qualifications.
© 2022. The Author(s).
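The abstract reports AUC comparisons between the algorithm and reader consensus, plus a subgroup analysis by inserted thoracic tube material, but it does not state how these were computed. The following is only a minimal sketch of how such an evaluation could be set up, not the authors' method. All function names (`auc`, `subgroup_auc`, `paired_bootstrap_auc_diff`) are hypothetical, the toy data are invented for illustration and are not study data, and the paired bootstrap is an assumed stand-in for whatever statistical test the study actually used.

```python
import random

def auc(labels, scores):
    """AUC via the Mann-Whitney statistic: the fraction of (positive, negative)
    pairs in which the positive case scores higher; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def subgroup_auc(labels, scores, groups, group):
    """AUC restricted to the cases belonging to one subgroup."""
    idx = [i for i, g in enumerate(groups) if g == group]
    return auc([labels[i] for i in idx], [scores[i] for i in idx])

def paired_bootstrap_auc_diff(labels, scores_a, scores_b, n_boot=500, seed=0):
    """Percentile 95% interval for AUC(a) - AUC(b). Cases are resampled with
    replacement, and both score sets use the same resampled cases (paired)."""
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:  # skip resamples that contain only one class
            continue
        a = auc(ys, [scores_a[i] for i in idx])
        b = auc(ys, [scores_b[i] for i in idx])
        diffs.append(a - b)
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]

# Invented toy data (NOT from the study): 1 = pathology present, 0 = absent.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
ai_scores = [0.9, 0.8, 0.3, 0.25, 0.7, 0.35, 0.2, 0.1]
reader_scores = [1.0, 0.67, 0.67, 0.33, 0.33, 0.33, 0.0, 0.0]
groups = ["tube", "tube", "no_tube", "no_tube",
          "tube", "no_tube", "no_tube", "no_tube"]

overall = auc(labels, ai_scores)                                # 0.75
with_tube = subgroup_auc(labels, ai_scores, groups, "tube")     # 1.0
without_tube = subgroup_auc(labels, ai_scores, groups, "no_tube")
ci_low, ci_high = paired_bootstrap_auc_diff(labels, ai_scores, reader_scores)
```

In this toy example the subgroup without tube material scores lower than the subgroup with it, which is the kind of gap a confounding analysis looks for. An interval from `paired_bootstrap_auc_diff` that spans zero would be in line with a non-significant difference, although with this few cases the interval is too unstable to interpret.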

Year: 2022; PMID: 35896763; PMCID: PMC9329327; DOI: 10.1038/s41598-022-16514-7

Source DB: PubMed; Journal: Sci Rep; ISSN: 2045-2322; Impact factor: 4.996

