Tom Finck¹, Julia Moosbauer², Monika Probst¹, Sarah Schlaeger¹, Madeleine Schuberth¹, David Schinz¹, Mehmet Yiğitsoy², Sebastian Byas², Claus Zimmer¹, Franz Pfister², Benedikt Wiestler¹.
Abstract
BACKGROUND: Most artificial intelligence (AI) systems are restricted to solving a pre-defined task, thus limiting their generalizability to unselected datasets. Anomaly detection relieves this shortfall by flagging all pathologies as deviations from a learned norm. Here, we investigate whether diagnostic accuracy and reporting times can be improved by an anomaly detection tool for head computed tomography (CT), tailored to provide patient-level triage and voxel-based highlighting of pathologies.Entities:
Keywords: anomaly detection; classification; computed tomography; decision support; machine learning; neuroradiology
Year: 2022 PMID: 35204543 PMCID: PMC8871235 DOI: 10.3390/diagnostics12020452
Source DB: PubMed Journal: Diagnostics (Basel) ISSN: 2075-4418
Figure 1. Reporting interface with and without AI support. Note that AI support provided patient-level predictions of “normal” (green), “inconclusive” (white), or “pathological” (red), as highlighted by (1). Furthermore, pixel-wise segmentations of suspected pathology (3), as well as the distribution of anomalous pixels within the stack of CT images (2), were available.
Figure 2. Study workflow. The test set consisted of 80 head CTs (40 normal scans and 40 scans showing common intracranial pathologies, as elaborated in the methods section). All readers completed a run with and without AI support, in a randomized and alternating order. Analyzed endpoints were (i) the diagnostic accuracy to discriminate between normal and pathology-showing CT, (ii) reporting times, and (iii) the subjectively assessed diagnostic confidence in patient-level labels.
Figure 3. Classification completeness with/without AI support (upper panel). Patient-level misclassification into “normal” or “pathological” was significantly reduced from 11/320 cases to 3/320 cases (p < 0.0001) once AI support was available. The lower panel gives the mean (± standard deviation) reporting time per scan with (54.9 ± 7.1 s) and without (65.1 ± 8.9 s) AI support (p < 0.0001).
Classification completeness for experienced and inexperienced readers with/without AI support.

| | No AI Support | AI Support |
|---|---|---|
| *Patient-level (n = 320)* | | |
| All | 11/320 | 3/320 |
| Experienced | 4/160 | 0/160 |
| Inexperienced | 7/160 | 3/160 |
| *Finding-level (n = 356)* | | |
| All | 30/356 | 17/356 |
| Experienced | 10/356 | 6/356 |
| Inexperienced | 20/356 | 11/356 |
Reporting times (mean ± standard deviation) in seconds for experienced/inexperienced readers, as well as for subgroups by experience level and ground truth. GT: ground truth; RT: reporting time; Δ RT: relative reduction in mean reporting time.

| Subgroups | No AI Support (s) | AI Support (s) | Δ RT | p-Value |
|---|---|---|---|---|
| All | 65.1 ± 8.9 | 54.9 ± 7.1 | 15.7% | 0.0001 |
| All—GT Normal | 59.6 ± 7.8 | 46.3 ± 5.0 | 22.3% | 0.0001 |
| All—GT Pathological | 70.3 ± 10.3 | 63.3 ± 8.6 | 10.0% | 0.016 |
| Experienced | 66.6 ± 11.2 | 55.3 ± 8.7 | 17.0% | 0.0065 |
| Experienced—GT Normal | 56.9 ± 7.2 | 46.2 ± 7.3 | 18.2% | 0.0071 |
| Experienced—GT Pathological | 76.2 ± 13.9 | 64.3 ± 10.2 | 15.6% | 0.021 |
| Inexperienced | 63.4 ± 7.5 | 54.4 ± 5.5 | 13.9% | 0.017 |
| Inexperienced—GT Normal | 62.3 ± 8.4 | 46.3 ± 2.7 | 25.7% | 0.0001 |
| Inexperienced—GT Pathological | 64.4 ± 6.7 | 62.3 ± 7.1 | 3.0% | 0.29 |
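The Δ RT column is the relative reduction in mean reporting time between the two runs. As a sanity check (not part of the original study), a minimal Python sketch reproduces the reported percentages from the table's mean values:

```python
def delta_rt(no_ai: float, ai: float) -> float:
    """Relative reduction in mean reporting time, in percent (rounded to 1 decimal)."""
    return round(100 * (no_ai - ai) / no_ai, 1)

# Mean reporting times (s) taken from the table above.
rows = {
    "All": (65.1, 54.9),                    # table reports 15.7%
    "All - GT Normal": (59.6, 46.3),        # table reports 22.3%
    "All - GT Pathological": (70.3, 63.3),  # table reports 10.0%
}

for name, (no_ai, ai) in rows.items():
    print(f"{name}: {delta_rt(no_ai, ai)}%")
```

Running the check confirms that each Δ RT entry matches the relative difference of the two mean reporting times in its row.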