Catherine M Jones1,2, Luke Danaher2, Michael R Milne3,2, Cyril Tang1, Jarrel Seah1,4, Luke Oakden-Rayner5, Andrew Johnson1, Quinlan D Buchlak1,6, Nazanin Esmaili6,7.
Abstract
OBJECTIVES: Artificial intelligence (AI) algorithms have been developed to detect imaging features on chest X-ray (CXR), with a comprehensive AI model capable of detecting 124 CXR findings recently developed. The aim of this study was to evaluate the real-world usefulness of the model as a diagnostic assistance device for radiologists.
Keywords: deep learning; machine learning; chest X-ray
Year: 2021 PMID: 34930738 PMCID: PMC8689166 DOI: 10.1136/bmjopen-2021-052902
Source DB: PubMed Journal: BMJ Open ISSN: 2044-6055 Impact factor: 2.692
Figure 1 Flow diagram illustrating the AI-assisted reporting process described in this study. AI, artificial intelligence; CXR, chest X-ray; PACS, picture archiving and communication system; RIS, radiological information system.
Figure 2 Example of the modified user interface used by the participating radiologists in this study. The red box highlights the feedback options added to the interface for this study.
List of review options presented to the radiologist with each case
| Review option | Description |
| Rejected clinical finding | A model-detected finding disputed by the radiologist |
| Missed clinical finding | A model-detected finding missed by the radiologist |
| Add additional findings | Finding(s) identified by the radiologist but not identified by the model |
| These findings significantly impacted my report | A binary yes/no question on the effect of the model output on the radiologist's report |
| These findings may impact patient management | A binary yes/no question on the effect of the model output on patient management, as perceived by the reporting radiologist |
| These findings led to additional imaging recommendations | A binary yes/no question on whether the radiologist recommended further imaging based on the model output |
Demographics and results for the eleven radiologists involved in this study
| Radiologist ID | No of years post-training | Cases reported (% outpatient) | Significant report impact (%) | Patient management changes (%) | Imaging recommendations (%) |
| 1 | 19 | 136 (21.3) | 1 (0.7) | 1 (0.7) | 0 (0.0) |
| 2 | 1 | 325 (46.2) | 4 (1.2) | 0 (0.0) | 1 (0.3) |
| 3 | 4 | 230 (86.1) | 20 (8.6) | 14 (6.1) | 10 (4.3) |
| 4 | 6 | 375 (22.7) | 3 (1.0) | 0 (0.0) | 1 (0.2) |
| 5 | 4 | 186 (45.7) | 22 (11.8) | 9 (4.8) | 8 (4.3) |
| 6 | 20 | 333 (11.1) | 3 (1.0) | 2 (0.6) | 1 (0.3) |
| 7 | 3 | 312 (48.4) | 15 (4.8) | 8 (2.5) | 1 (0.3) |
| 8 | 26 | 408 (39.7) | 10 (2.4) | 5 (1.2) | 4 (1.0) |
| 9 | 9 | 214 (43.0) | 6 (2.8) | 2 (0.9) | 2 (0.9) |
| 10 | 6 | 159 (98.1) | 1 (0.6) | 1 (0.6) | 1 (0.6) |
| 11 | 5 | 294 (40.1) | 7 (2.4) | 1 (0.3) | 0 (0.0) |
| Total | | 2972 | | | |
Percentages (%) represent the associated value as a proportion of the total case number for that radiologist.
Figure 3 Counts of critical findings for the cases seen by the radiologists, defined as the number of critical findings agreed plus the number of critical findings added. 1513 cases returned zero critical findings.
Breakdown of the critical findings detected by the model and the level of radiologist agreement with each, including the number of findings reportedly missed by the model (and added by the radiologist) or missed by the radiologist
| Critical finding | Displayed by model | Radiologist agreed with finding (%) | Radiologist rejected finding (%) | Added in by radiologist | Missed by radiologist |
| Acute aortic syndrome | 2 | 2 (100.0) | 0 (0.0) | 0 | 0 |
| Acute humerus fracture | 5 | 5 (100.0) | 0 (0.0) | 0 | 0 |
| Acute rib fracture | 54 | 39 (72.2) | 15 (27.8) | 0 | 5 |
| Cardiomegaly | 1008 | 979 (97.1) | 29 (2.9) | 0 | 0 |
| Cavitating mass | 14 | 13 (92.9) | 1 (7.1) | 0 | 0 |
| Cavitating mass internal content | 6 | 5 (83.3) | 1 (16.7) | 0 | 0 |
| Diffuse airspace opacity | 13 | 13 (100.0) | 0 (0.0) | 0 | 0 |
| Diffuse lower airspace opacity | 153 | 148 (96.7) | 5 (3.3) | 0 | 0 |
| Diffuse perihilar airspace opacity | 45 | 45 (100.0) | 0 (0.0) | 0 | 0 |
| Diffuse upper airspace opacity | 2 | 2 (100.0) | 0 (0.0) | 0 | 0 |
| Focal airspace opacity | 341 | 321 (94.1) | 20 (5.9) | 0 | 2 |
| Hilar lymphadenopathy | 8 | 6 (75.0) | 2 (25.0) | 0 | 0 |
| Inferior mediastinal mass | 8 | 7 (87.5) | 1 (12.5) | 0 | 0 |
| Loculated effusion | 87 | 80 (92.0) | 7 (8.0) | 0 | 1 |
| Lung collapse | 11 | 10 (90.9) | 1 (9.1) | 0 | 0 |
| Malpositioned CVC | 85 | 78 (91.8) | 7 (8.2) | 0 | 1 |
| Malpositioned ETT | 52 | 43 (82.7) | 9 (17.3) | 0 | 0 |
| Malpositioned NGT | 39 | 31 (79.5) | 8 (20.5) | 0 | 0 |
| Malpositioned PAC | 13 | 9 (69.2) | 4 (30.8) | 0 | 0 |
| Multifocal airspace opacity | 125 | 120 (96.0) | 5 (4.0) | 0 | 1 |
| Multiple pulmonary masses | 43 | 38 (88.4) | 5 (11.6) | 0 | 0 |
| Pneumomediastinum | 5 | 5 (100.0) | 0 (0.0) | 1 | 0 |
| Pulmonary congestion | 220 | 215 (97.7) | 5 (2.3) | 1 | 0 |
| Segmental collapse | 292 | 290 (99.3) | 2 (0.7) | 0 | 1 |
| Shoulder dislocation | 1 | 0 (0.0) | 1 (100.0) | 0 | 0 |
| Simple effusion | 687 | 650 (94.6) | 37 (5.4) | 0 | 1 |
| Simple pneumothorax | 90 | 77 (85.6) | 13 (14.4) | 1 | 1 |
| Single pulmonary mass | 41 | 38 (92.7) | 3 (7.3) | 1 | 1 |
| Single pulmonary nodule | 105 | 95 (90.5) | 10 (9.5) | 3 | 5 |
| Subcutaneous emphysema | 53 | 51 (96.2) | 2 (3.8) | 0 | 1 |
| Subdiaphragmatic gas | 7 | 7 (100.0) | 0 (0.0) | 1 | 0 |
| Superior mediastinal mass | 37 | 32 (86.5) | 5 (13.5) | 0 | 0 |
| Tension pneumothorax | 11 | 7 (63.6) | 4 (36.4) | 0 | 0 |
| Tracheal deviation | 133 | 133 (100.0) | 0 (0.0) | 0 | 0 |
| Total | 3796 | 3594 (94.7) | 202 (5.3) | 8 | 20 |
Percentages (%) represent the associated value as a proportion of the total number of findings displayed by the model.
CVC, central venous catheter; ETT, endotracheal tube; NGT, nasogastric tube; PAC, pulmonary artery catheter.
Factors affecting AI model influence on report, patient management, or imaging recommendation
| Predictor | Outcome | OR (adjusted CI) | P value | Benjamini-adjusted threshold | Significance |
| No of critical findings | Report | 1.306 (1.132 to 1.507) | 0 | 0.0042 | Yes |
| No of critical findings | Patient management | 1.267 (1.056 to 1.521) | 0.001 | 0.0083 | Yes |
| No of critical findings | Imaging recommendation | 1.319 (1.035 to 1.681) | 0.004 | 0.0125 | Yes |
| Lateral CXR | Imaging recommendation | 6.495 (1.297 to 32.530) | 0.005 | 0.0167 | Yes |
| Lateral CXR | Patient management | 2.158 (0.837 to 5.565) | 0.061 | 0.0208 | No |
| Lateral CXR | Report | 1.542 (0.848 to 2.805) | 0.105 | 0.025 | No |
| Radiologist experience | Report | 0–5 years: Baseline | 0.120 | 0.0292 | No |
| Radiologist experience | Patient management | 0–5 years: Baseline | 0.262 | 0.0333 | No |
| Radiologist experience | Imaging recommendation | 0–5 years: Baseline | 0.516 | 0.0458 | No |
| Inpatient/outpatient | Imaging recommendation | 1.550 (0.613 to 3.919) | 0.326 | 0.0375 | No |
| Inpatient/outpatient | Report | 0.794 (0.476 to 1.323) | 0.358 | 0.0417 | No |
| Inpatient/outpatient | Patient management | 0.818 (0.408 to 1.640) | 0.572 | 0.0500 | No |
Significance testing by the Benjamini-Hochberg algorithm to account for multiple hypotheses. ORs derived from stepwise logistic regression coefficients, with CIs calculated at the Benjamini-adjusted thresholds. Radiologist experience was analysed as a categorical variable, with ORs representing the effect of changing experience level from the baseline (0–5 years) to a different level.
AI, artificial intelligence; CXR, chest X-ray.
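The Benjamini-adjusted thresholds in the table above follow the standard step-up pattern k·α/m for m = 12 hypotheses at α = 0.05 (0.05/12 ≈ 0.0042, 2 × 0.05/12 ≈ 0.0083, and so on). A minimal sketch of the Benjamini-Hochberg procedure, applied to the 12 p values as printed in the table (the function name and code are illustrative, not taken from the study's analysis code):

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans (in the input order) marking which
    hypotheses are rejected at false discovery rate `alpha`.
    """
    m = len(p_values)
    # Rank p values in ascending order; the threshold for rank k is k * alpha / m.
    order = sorted(range(m), key=lambda i: p_values[i])
    max_k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * alpha / m:
            max_k = rank  # largest rank whose p value passes its threshold
    # Step-up rule: reject every hypothesis ranked at or below max_k.
    significant = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= max_k:
            significant[i] = True
    return significant

# P values in the row order of the factors table.
p = [0.000, 0.001, 0.004, 0.005, 0.061, 0.105,
     0.120, 0.262, 0.516, 0.326, 0.358, 0.572]
print(sum(benjamini_hochberg(p)))  # prints 4
```

Running this reproduces the table's significance column: the four smallest p values (the three critical-findings predictors plus lateral CXR for imaging recommendation) are retained, and the remaining eight predictors are not.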