Kwang Nam Jin1,2, Eun Young Kim3, Young Jae Kim4, Gi Pyo Lee4, Hyungjin Kim2,5, Sohee Oh6, Yong Suk Kim7, Ju Hyuck Han8, Young Jun Cho9,10.
Abstract
OBJECTIVES: We aimed to evaluate a commercial artificial intelligence (AI) solution on a multicenter cohort of chest radiographs and to compare physicians' ability to detect and localize referable thoracic abnormalities with and without AI assistance.
Keywords: Artificial intelligence; Cohort studies; Diagnosis; Radiography; Thorax
Year: 2022 PMID: 34973101 PMCID: PMC9038825 DOI: 10.1007/s00330-021-08397-5
Source DB: PubMed Journal: Eur Radiol ISSN: 0938-7994 Impact factor: 7.034
Fig. 1 Flow diagram of the study population and study design for AI augmentation test
Demographic characteristics of all patients who reported in the respiratory outpatient clinics
| Characteristic | Institution B | Institution G | Institution K | Total | Dataset for AI augmentation test a | p value b |
|---|---|---|---|---|---|---|
| No. of patients | 2536 | 1470 | 2000 | 6006 | 230 | |
| Female | 1166 (46) | 643 (44) | 798 (40) | 2607 (43) | 107 (47) | 0.53 |
| Male | 1370 (54) | 827 (56) | 1202 (60) | 3398 (57) | 123 (54) | 0.50 |
| Age (years) | 61 ± 16 | 61 ± 14 | 61 ± 16 | 61 ± 16 | 60 ± 16 | 0.21 |
| Interval between CXR and CT scan (d) | 3 ± 9 | 3 ± 11 | 1 ± 7 | 2 ± 9 | 2 ± 9 | 0.42 |
| No. of PA images | 2536 (99) | 1421 (97) | 1952 (98) | 5908 (98) | 229 (99) | 0.15 |
Note.—Except where indicated, data are mean (± SD) or number (%). AI, artificial intelligence; CXR, chest radiograph; PA, posteroanterior; SD, standard deviation
a The dataset for the AI augmentation test was randomly selected from 6,006 images
b Comparison of proportions or means between the entire population and the randomly sampled dataset using the chi-squared test or t-test
Referable thoracic abnormalities on chest radiographs found in the respiratory outpatient clinics
| Variables | Institution B | Institution G | Institution K | Total (entire dataset) | Dataset for AI augmentation test | p value a |
|---|---|---|---|---|---|---|
| Intended lesions b | | | | | | |
| Nodule/mass | 446 (33.9) | 259 (22.1) | 468 (29.7) | 1173 (27.5) | 41 (23.7) | 0.79 |
| Consolidation | 341 (25.9) | 212 (18.1) | 366 (23.2) | 919 (21.6) | 35 (20.2) | 0.99 |
| Pneumothorax | 5 (0.4) | 2 (0.2) | 8 (0.5) | 15 (0.4) | 2 (1.2) | 0.87 |
| Total | 792 (60.1) | 473 (40.4) | 842 (53.4) | 2107 (49.4) | 78 (45.1) | |
| Non-intended lesions | | | | | | |
| Atelectasis or fibrosis | 93 (7.1) | 62 (5.3) | 185 (11.7) | 340 (8.0) | 15 (8.7) | 0.90 |
| Bronchiectasis | 217 (16.5) | 286 (24.4) | 107 (6.8) | 610 (14.3) | 27 (15.6) | 0.80 |
| Cardiomegaly | 21 (1.6) | 48 (4.1) | 67 (4.3) | 136 (3.2) | 4 (2.3) | 0.94 |
| Diffuse interstitial lung opacities | 115 (8.7) | 73 (6.2) | 65 (4.1) | 253 (5.9) | 10 (5.8) | 0.99 |
| Mediastinal lesion | 11 (0.8) | 27 (2.3) | 36 (2.3) | 74 (1.7) | 4 (2.3) | 0.93 |
| Pleural effusion | 81 (6.2) | 29 (2.5) | 76 (4.8) | 186 (4.4) | 7 (4.0) | 0.99 |
| Other | 188 (14.3) | 172 (14.7) | 198 (12.6) | 558 (13.1) | 28 (16.2) | 0.61 |
| Total | 726 (55.1) | 697 (59.6) | 734 (46.6) | 2157 (50.6) | 95 (54.9) | |
| Total of intended or non-intended lesions | 1518 | 1170 | 1576 | 4264 | 173 | N/A |
| No. of patients with any type of lesions | 1317 (52) | 889 (61) | 1131 (57) | 3337 (56) | 137 (60) | 0.36 |
| No. of lesion types per patient c | 1.2 (1–3) | 1.3 (1–4) | 1.4 (1–5) | 1.3 (1–5) | 1.3 (1–4) | 0.83 d |
Note.—Except where indicated, data are numbers of patients, with percentages in parentheses. AI, artificial intelligence; N/A, not applicable
a Except where indicated, comparison of proportions between the total patient population and the randomly sampled dataset for each lesion type using the Chi-squared test
b Intended abnormalities were defined as the lesion types targeted by the AI solution used in this study
c Number of lesion types per subject was calculated for subjects with intended or non-intended lesions. The numbers in parentheses are ranges
d t-test was performed for comparison of the means between the entire subject dataset and the observer performance test dataset
AUC for each physician and averaged AUCs for chest radiographs (n = 230) from respiratory outpatient clinics unaided and with AI assistance
| Observer group | Physician No | Unaided | AI-assisted | Difference |
|---|---|---|---|---|
| Thoracic radiologists | 1 | 0.903 | 0.909 | 0.006 |
| | 2 | 0.900 | 0.912 | 0.012 |
| | 3 | 0.859 | 0.861 | 0.002 |
| Board-certified radiologists | 4 | 0.892 | 0.923 | 0.031 |
| | 5 | 0.854 | 0.863 | 0.009 |
| | 6 | 0.862 | 0.872 | 0.010 |
| Radiology residents | 7 | 0.871 | 0.888 | 0.017 |
| | 8 | 0.820 | 0.872 | 0.052 |
| | 9 | 0.843 | 0.878 | 0.035 |
| Pulmonologists | 10 | 0.839 | 0.887 | 0.048 |
| | 11 | 0.863 | 0.882 | 0.019 |
| | 12 | 0.825 | 0.884 | 0.059 |
| Average a | | 0.861 (0.827, 0.895) | 0.886 (0.854, 0.918) | 0.025 (0.009, 0.041) |
Note. AUC, area under the receiver operating characteristic curve; AI, artificial intelligence; CI, confidence interval. Numbers in parentheses are 95% CIs
a Values in parentheses in the last line of the table are 95% confidence intervals. The p value between the observed average values was .003. The Dorfman-Berbaum-Metz test was used to compare the AUCs between unaided and AI-assisted readings
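The averaged AUCs and the averaged difference in the last row of the table can be cross-checked as plain arithmetic means of the twelve per-reader values. The following is a minimal Python sketch of that check, assuming the per-reader values are transcribed correctly from the table; the variable names are illustrative. It does not reproduce the Dorfman-Berbaum-Metz analysis, the confidence intervals, or the p value.

```python
from statistics import mean

# Per-reader AUCs transcribed from the table above: reader -> (unaided, AI-assisted).
reader_aucs = {
    1: (0.903, 0.909), 2: (0.900, 0.912), 3: (0.859, 0.861),     # thoracic radiologists
    4: (0.892, 0.923), 5: (0.854, 0.863), 6: (0.862, 0.872),     # board-certified radiologists
    7: (0.871, 0.888), 8: (0.820, 0.872), 9: (0.843, 0.878),     # radiology residents
    10: (0.839, 0.887), 11: (0.863, 0.882), 12: (0.825, 0.884),  # pulmonologists
}

unaided = [u for u, _ in reader_aucs.values()]
assisted = [a for _, a in reader_aucs.values()]

# Per-reader change in AUC with AI assistance, rounded to the table's precision.
differences = [round(a - u, 3) for u, a in reader_aucs.values()]

mean_unaided = mean(unaided)
mean_assisted = mean(assisted)
mean_difference = mean(differences)

# Number of readers whose AUC was higher with AI assistance.
readers_improved = sum(1 for d in differences if d > 0)

print(f"unaided {mean_unaided:.3f}, AI-assisted {mean_assisted:.3f}, "
      f"difference {mean_difference:.3f}, improved readers {readers_improved}/12")
```

The simple means agree with the reported averages to three decimals. Whether the difference is statistically significant depends on reader and case variability, which is what the multi-reader multi-case test in the footnote accounts for.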
AUAFROC for each physician and averaged AUAFROCs for chest radiographs (n = 230) from respiratory outpatient clinics unaided and with AI assistance
| Observer group | Physician No | Unaided | AI-assisted | Difference |
|---|---|---|---|---|
| Thoracic radiologists | 1 | 0.845 | 0.857 | 0.012 |
| | 2 | 0.839 | 0.863 | 0.024 |
| | 3 | 0.774 | 0.787 | 0.013 |
| Board-certified radiologists | 4 | 0.821 | 0.856 | 0.035 |
| | 5 | 0.782 | 0.796 | 0.014 |
| | 6 | 0.803 | 0.800 | −0.003 |
| Radiology residents | 7 | 0.818 | 0.844 | 0.026 |
| | 8 | 0.751 | 0.809 | 0.058 |
| | 9 | 0.787 | 0.837 | 0.050 |
| Pulmonologists | 10 | 0.763 | 0.796 | 0.033 |
| | 11 | 0.807 | 0.816 | 0.009 |
| | 12 | 0.768 | 0.803 | 0.035 |
| Average a | | 0.797 (0.758, 0.835) | 0.822 (0.783, 0.861) | 0.025 (0.009, 0.042) |
Note. AUAFROC, area under the alternative free-response receiver operating characteristic curve; AI, artificial intelligence; CI, confidence interval
a Values in parentheses in the last line of the table are 95% confidence intervals. The p value between the observed average values was .003. The Dorfman-Berbaum-Metz test was used to compare the AUAFROCs between unaided and AI-assisted readings
Fig. 2 Graphs showing receiver operating characteristic curves (a) and jackknife alternative free-response receiver operating characteristic curves (b) of each physician and AI solution for referable thoracic abnormalities on chest radiographs. TPF, true-positive fraction; FPF, false-positive fraction; LLF, lesion localization fraction; AI, artificial intelligence; GR, general radiologist; P, pulmonologist; RR, radiology resident; TR, thoracic radiologist
Observer group averaged AUC and AUAFROC for chest radiographs (n = 230) from respiratory outpatient clinics
| Observer group | AUC, unaided | AUC, AI-assisted | AUC p value a | AUC p value b | AUAFROC, unaided | AUAFROC, AI-assisted | AUAFROC p value a | AUAFROC p value b |
|---|---|---|---|---|---|---|---|---|
| Thoracic radiologists | 0.887 (0.841, 0.934) | 0.894 (0.840, 0.947) | 0.207 | 0.581 | 0.820 (0.746, 0.893) | 0.835 (0.757, 0.914) | 0.026 | 0.601 |
| Board-certified radiologists | 0.870 (0.829, 0.906) | 0.886 (0.826, 0.946) | 0.141 | 0.123 | 0.801 (0.758, 0.845) | 0.817 (0.755, 0.879) | 0.294 | 0.116 |
| Radiology residents | 0.845 (0.796, 0.893) | 0.879 (0.846, 0.912) | 0.070 | 0.033 | 0.785 (0.723, 0.848) | 0.830 (0.788, 0.872) | 0.045 | 0.104 |
| Pulmonologists | 0.842 (0.801, 0.848) | 0.884 (0.853, 0.915) | 0.034 | 0.012 | 0.779 (0.731, 0.828) | 0.805 (0.765, 0.845) | 0.071 | 0.037 |
Note. Numbers in parentheses are 95% CIs. AUC, area under the receiver operating characteristic curve; ROC, receiver operator characteristic; AUAFROC, area under the alternative free-response receiver operating characteristic curves; CI, confidence interval
a Comparison of AUCs or AUAFROCs between unaided and AI-assisted readings in each observer group
b Comparison of AUCs or AUAFROCs between unaided and AI standalone performance
The standalone performance of the AI solution was an AUC of 0.901 (0.860, 0.941) and an AUAFROC of 0.836 (0.789, 0.883). The Dorfman-Berbaum-Metz test was used to compare the AUC and AUAFROC between unaided and AI-assisted readings
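The observer-group AUCs above can be approximately recovered by averaging the three per-reader AUCs within each group, which also shows how far each group's unaided average sits from the standalone AI value. The sketch below is a rough consistency check under that assumption, not the authors' analysis. The per-reader values are transcribed from the per-physician AUC table, the standalone AUC is read as 0.901, and all names are illustrative. Simple means may differ from the reported group averages in the third decimal (for example, about 0.869 computed versus 0.870 reported for board-certified radiologists).

```python
from statistics import mean

# Per-reader AUCs (unaided, AI-assisted), transcribed from the per-physician AUC table.
readers_by_group = {
    "Thoracic radiologists": [(0.903, 0.909), (0.900, 0.912), (0.859, 0.861)],
    "Board-certified radiologists": [(0.892, 0.923), (0.854, 0.863), (0.862, 0.872)],
    "Radiology residents": [(0.871, 0.888), (0.820, 0.872), (0.843, 0.878)],
    "Pulmonologists": [(0.839, 0.887), (0.863, 0.882), (0.825, 0.884)],
}

# Standalone AUC of the AI solution, read from the note above as 0.901.
AI_STANDALONE_AUC = 0.901


def group_summary(pairs):
    """Return (mean unaided AUC, mean AI-assisted AUC) for one observer group."""
    unaided = mean(u for u, _ in pairs)
    assisted = mean(a for _, a in pairs)
    return unaided, assisted


summary = {name: group_summary(pairs) for name, pairs in readers_by_group.items()}

# Group with the largest average gain from AI assistance.
largest_gain_group = max(summary, key=lambda name: summary[name][1] - summary[name][0])

for name, (unaided, assisted) in summary.items():
    print(f"{name}: unaided {unaided:.3f}, AI-assisted {assisted:.3f}, "
          f"gain {assisted - unaided:+.3f}, "
          f"standalone AI minus unaided {AI_STANDALONE_AUC - unaided:+.3f}")
```

In this check the average gain is smallest for thoracic radiologists and largest for pulmonologists, matching the pattern of p values in the table. The significance of each comparison still rests on the Dorfman-Berbaum-Metz test, which this sketch does not implement.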
Fig. 3 A 54-year-old woman with pneumonia in the right lower lung zone. Chest radiography demonstrated ill-defined ground-glass opacity or consolidation in the right para-hilar area, which was marked with a white outline as the reference standard. a The AI solution correctly detected the lesion with a probability value of 69%. b Chest CT without contrast enhancement shows consolidation and tiny ill-defined nodules in the right middle lobe. c Among the 12 observers, seven could detect the lesions without AI assistance. With the use of an AI solution, all observers could detect the lesions. The AI solution led to accurate detection of pneumonia on chest radiographs in the case of five observers (42%), including two pulmonologists, one thoracic radiologist, one general radiologist, and one radiology resident
Fig. 4 A 56-year-old man with adenocarcinoma of the right upper lobe. A chest radiograph shows a faint nodular opacity in the right upper lung zone. a The AI solution correctly detected the lesion with a probability value of 63%. b Chest CT with contrast enhancement demonstrated a spiculated nodule in the right upper lobe. c Among the 12 observers, two observers, including one pulmonologist and one radiology resident, could detect the lesion without AI assistance (unaided reading). In addition, two observers, one thoracic radiologist and one pulmonologist, marked a false-positive lesion in unaided reading. With the use of an AI solution, observers could detect the lesions. The false-positive lesion marked on unaided reading was withdrawn by two observers in AI-assisted reading. Regarding visual certainty for the lesion, three observers, including two thoracic radiologists and one pulmonologist, rated a higher score in AI-assisted reading than in unaided reading