Keelin Murphy, Henk Smits, Arnoud J G Knoops, Michael B J M Korst, Tijs Samson, Ernst T Scholten, Steven Schalekamp, Cornelia M Schaefer-Prokop, Rick H H M Philipsen, Annet Meijers, Jaime Melendez, Bram van Ginneken, Matthieu Rutten.
Abstract
Background: Chest radiography may play an important role in triage for coronavirus disease 2019 (COVID-19), particularly in low-resource settings.

Purpose: To evaluate the performance of an artificial intelligence (AI) system for detection of COVID-19 pneumonia on chest radiographs.

Materials and Methods: An AI system (CAD4COVID-XRay) was trained on 24 678 chest radiographs, including 1540 used only for validation while training. The test set consisted of a set of continuously acquired chest radiographs (n = 454) obtained in patients suspected of having COVID-19 pneumonia between March 4 and April 6, 2020, at one center (223 patients with positive reverse transcription polymerase chain reaction [RT-PCR] results, 231 with negative RT-PCR results). Radiographs were independently analyzed by six readers and by the AI system. Diagnostic performance was analyzed with the receiver operating characteristic curve.

Results: For the test set, the mean age of patients was 67 years ± 14.4 (standard deviation) (56% male). With RT-PCR test results as the reference standard, the AI system correctly classified chest radiographs as COVID-19 pneumonia with an area under the receiver operating characteristic curve of 0.81. The system significantly outperformed each reader (P < .001 using the McNemar test) at their highest possible sensitivities. At their lowest sensitivities, only one reader significantly outperformed the AI system (P = .04).

Conclusion: The performance of an artificial intelligence system in the detection of coronavirus disease 2019 on chest radiographs was comparable with that of six independent readers.

© RSNA, 2020.
Year: 2020 PMID: 32384019 PMCID: PMC7437494 DOI: 10.1148/radiol.2020201874
Source DB: PubMed Journal: Radiology ISSN: 0033-8419 Impact factor: 11.105
Properties of Training, Validation, and Test Sets
Figure 1: Top: Images in a 74-year-old man with positive reverse transcription polymerase chain reaction (RT-PCR) test results for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) viral infection. (a) Frontal chest radiograph. (b) Artificial intelligence (AI) system heat map overlaid on (a) shows pneumonia-related features. The AI system score for this subject is 99.8. Bottom: Images in a 30-year-old man with negative RT-PCR test results for SARS-CoV-2 viral infection. (c) Frontal chest radiograph. (d) AI system heat map overlaid on (c). The AI system score for this subject is 0.2.
Figure 2: Receiver operating characteristic (ROC) curve for the artificial intelligence (AI) system and points for each reader (point locations are specified in the figure legend). Reference standard is reverse transcription polymerase chain reaction (RT-PCR) test result. The 95% confidence intervals are shown as a shaded area for the ROC curve, and crosshairs indicate each reader point. The AI system operating points discussed in the text are shown at sensitivities of 60%, 75%, and 85%. The test data set has 454 patients (223 with positive RT-PCR results and 231 with negative RT-PCR results). AUC = area under the ROC curve.
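For readers unfamiliar with the AUC summarized in this caption, the following is a minimal sketch of how an area under the ROC curve can be computed from per-image scores and binary reference labels. It is not the study's code: the function name `roc_auc` and the toy scores and labels are assumptions made for illustration, and the sketch uses the rank-based (Mann-Whitney) formulation rather than whatever software the authors used.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): scores on a 0-100 scale,
# label 1 = positive reference test, label 0 = negative reference test.
toy_scores = [95.0, 80.0, 35.0, 60.0, 5.0, 35.0]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
print(f"toy AUC = {toy_auc:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for a test set of a few hundred images; a rank-based implementation would be preferable for much larger sets.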
AI System Specificities at Sensitivities Fixed to Match Reader Performance at Various Score Cutoff Values
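The comparison named in this table title, fixing the AI system's sensitivity to a reader's sensitivity and then reading off the specificity, can be sketched as below. This is an illustrative assumption about the general procedure, not the authors' implementation; `specificity_at_sensitivity` and the toy data are hypothetical, and an image is treated as flagged when its score is greater than or equal to the threshold.

```python
def specificity_at_sensitivity(scores, labels, target_sensitivity):
    """Return (threshold, sensitivity, specificity) for the highest score
    threshold whose sensitivity reaches the target."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Try thresholds from strictest to most lenient; stop at the first
    # one that flags enough positive cases.
    for threshold in sorted(set(pos), reverse=True):
        sensitivity = sum(s >= threshold for s in pos) / len(pos)
        if sensitivity >= target_sensitivity:
            specificity = sum(s < threshold for s in neg) / len(neg)
            return threshold, sensitivity, specificity
    raise ValueError("target sensitivity must be at most 1")


# Hypothetical toy data (NOT from the study).
toy_scores = [90.0, 70.0, 50.0, 20.0, 60.0, 40.0, 30.0, 10.0]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
result = specificity_at_sensitivity(toy_scores, toy_labels, 0.75)
print(result)
```

Choosing the highest qualifying threshold gives the system the best specificity available at the required sensitivity, which keeps the comparison with a single reader operating point fair.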
Figure 3: Receiver operating characteristic (ROC) curves for the artificial intelligence system and each reader individually. Reference standard in each case is the consensus reading of the remaining five readers. The 95% confidence intervals are shown as a shaded area for the ROC curve. AUC = area under ROC curve.
PPVs and NPVs for Readers, AI System, and Consensus Reading
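As a reminder of how the positive and negative predictive values named in this table title are defined, here is a minimal sketch that derives them from the four cells of a confusion matrix. The function name `predictive_values` and the counts in the example are hypothetical and do not reproduce any value from the study.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from confusion-matrix counts.

    PPV = TP / (TP + FP): fraction of flagged cases that are truly positive.
    NPV = TN / (TN + FN): fraction of unflagged cases that are truly negative.
    """
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("PPV/NPV undefined when no cases are flagged or unflagged")
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Hypothetical counts (NOT from the study).
ppv, npv = predictive_values(tp=8, fp=2, tn=6, fn=4)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Unlike sensitivity and specificity, both values depend on the proportion of positive cases in the evaluated population, so they do not transfer directly to settings with a different prevalence.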