| Literature DB >> 35136123 |
Deniz Alis1, Ceren Alis2, Mert Yergin3, Cagdas Topel4, Ozan Asmakutlu4, Omer Bagcilar5, Yeseren Deniz Senli6, Ahmet Ustundag6, Vefa Salt6, Sebahat Nacar Dogan7, Murat Velioglu8, Hakan Hatem Selcuk9, Batuhan Kara9, Caner Ozer10, Ilkay Oksuz10, Osman Kizilkilic6, Ercan Karaarslan1.
Abstract
To investigate the performance of a joint convolutional neural network–recurrent neural network (CNN-RNN) model with an attention mechanism in identifying and classifying intracranial hemorrhage (ICH) on a large multi-center dataset, and to test its performance on a prospective independent sample of consecutive real-world patients. All consecutive patients who underwent emergency non-contrast-enhanced head CT in five different centers were retrospectively gathered. Five neuroradiologists created the ground-truth labels. The development dataset was divided into training and validation sets. After the development phase, we integrated the deep learning model into an independent center's PACS environment for over six months to assess its performance in a real clinical setting. Three radiologists created the ground-truth labels of the testing set by majority voting. A total of 55,179 head CT scans of 48,070 patients, 28,253 men (58.77%), with a mean age of 53.84 ± 17.64 years (range 18–89), were enrolled in the study. The validation sample comprised 5211 head CT scans, with 991 annotated as ICH-positive. The model's binary accuracy, sensitivity, and specificity on the validation set were 99.41%, 99.70%, and 98.91%, respectively. During the prospective implementation, the model yielded an accuracy of 96.02% on 452 head CT scans with an average prediction time of 45 ± 8 s. The joint CNN-RNN model with an attention mechanism yielded excellent diagnostic accuracy in assessing ICH and its subtypes on a large-scale sample, and it was seamlessly integrated into the radiology workflow. Despite slightly decreased performance on the sample of consecutive real-world patients, it provided decisions within a minute.
Year: 2022 PMID: 35136123 PMCID: PMC8826390 DOI: 10.1038/s41598-022-05872-x
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. The flowchart of the study (the image was created by the authors using Microsoft PowerPoint v16). We obtained consecutive non-contrast-enhanced CT scans referred from the emergency service in five different tertiary care centers. Data from four centers were used for training, and data from the remaining center were used for validation. The final model was integrated into the picture archiving and communication system (PACS) on a dedicated embedded unit. The model's performance was assessed on consecutive emergency non-contrast head CT scans for over six months. The diagnostic and inference performance of the system was documented.
Figure 2. A diagram of the joint convolutional neural network (CNN)–recurrent neural network (RNN) with an attention mechanism (the image was created by the authors using Microsoft PowerPoint v16). We used InceptionResNetV2 as the feature extractor with its top prediction layer removed. The extracted features were stacked per scan and fed into the bi-directional RNN. We placed an attention layer between the two RNN layers, which enables the RNN to focus on the slices most relevant to identifying ICH and its subtypes.
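The slice-level attention described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the feature dimension, the single-vector scoring function, and the random inputs are all assumptions for demonstration.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the slice axis."""
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(slice_features, w, b):
    """Weight per-slice RNN outputs by learned attention scores.

    slice_features: (n_slices, d) array of bi-directional RNN outputs,
    one row per CT slice. w: (d,) scoring vector, b: scalar bias
    (stand-ins for the learned attention parameters).
    Returns the attention weights and the pooled scan-level feature.
    """
    scores = slice_features @ w + b      # one relevance score per slice
    weights = softmax(scores)            # normalize so weights sum to 1
    context = weights @ slice_features   # attention-weighted sum over slices
    return weights, context

# Toy example: a scan with 32 slices and 64-dim features per slice.
rng = np.random.default_rng(0)
feats = rng.standard_normal((32, 64))
w, b = rng.standard_normal(64), 0.0
weights, context = attention_pool(feats, w, b)
```

In the paper's pipeline, an attention-weighted representation of this kind lets the downstream classification layers emphasize the slices that actually contain hemorrhage.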
Characteristics of the study sample.
| Variables | Study sample | Training | Validation | Testing |
|---|---|---|---|---|
| Age, years, median (IQR) | 54 (43–65) | 54 (43–65) | 53 (40–60) | 48 (36–57) |
| Male sex, n (%) | 28,253 (58.77%) | 25,079 (57.70%) | 3174 (68.85%) | – |
| Patients with ICH, n (%) | 13,224 (27.50%) | 9226 (21.22%) | 612 (13.27%) | 130 (34.21%) |
| Head CT scans, n | 55,179 | 49,968 | 5211 | 452 |
| ICH-positive scans, n (%) | 15,733 (28.51%) | 14,742 (29.50%) | 991 (19.01%) | 167 (36.9%) |
| IPH, n (%) | 10,080 (18.26%) | 9422 (18.85%) | 658 (12.62%) | 86 (19.02%) |
| IVH, n (%) | 5963 (10.78%) | 5535 (11.07%) | 418 (8.02%) | 38 (8.4%) |
| SAH, n (%) | 9555 (17.31%) | 8955 (17.92%) | 600 (11.51%) | 48 (10.61%) |
| SDH, n (%) | 7473 (13.54%) | 7022 (14.05%) | 451 (8.65%) | 76 (16.81%) |
| EDH, n (%) | 1237 (2.24%) | 1116 (2.33%) | 71 (1.35%) | 14 (3.1%) |
| CT slices, n | 2,255,271 | 2,255,271 | 212,873 | – |
| ICH-positive slices, n (%) | 188,067 (8.33%) | 175,664 (7.78%) | 12,403 (5.82%) | – |
| ICH-negative slices, n (%) | 2,067,204 (91.67%) | 2,079,607 (92.22%) | 200,470 (94.18%) | – |
*EDH epidural hemorrhage, ICH intracranial hemorrhage, IPH intra-parenchymal hemorrhage, IVH intraventricular hemorrhage, SAH subarachnoid hemorrhage, SDH subdural hemorrhage.
Diagnostic performance of the unified CNN-RNN model on the training, validation, and testing sets.
| Set | ICH subtype | Sensitivity % (95% CI) | Specificity % (95% CI) | Precision % (95% CI) | Accuracy % (95% CI) | AUROC (95% CI) | TP | FN | FP | TN |
|---|---|---|---|---|---|---|---|---|---|---|
| Training | ICH-Binary | 97.72 (97.47–97.96) | 98.49 (98.36–98.61) | 96.45 (96.15–96.74) | 98.26 (98.15–98.37) | 0.992 (0.991–0.993) | 14,406 | 336 | 531 | 34,695 |
| Training | IPH | 93.19 (92.67–93.69) | 99.27 (99.18–99.37) | 96.74 (96.37–97.10) | 98.12 (98.00–98.24) | 0.990 (0.989–0.991) | 8780 | 642 | 296 | 40,250 |
| Training | IVH | 91.83 (91.11–92.55) | 99.54 (98.47–99.60) | 96.14 (95.62–96.66) | 98.69 (98.53–98.78) | 0.993 (0.991–0.994) | 5083 | 452 | 204 | 44,229 |
| Training | SAH | 81.37 (80.56–82.17) | 98.57 (98.45–98.68) | 92.53 (91.95–93.11) | 95.49 (95.30–95.66) | 0.978 (0.976–0.980) | 7287 | 1668 | 588 | 40,425 |
| Training | SDH | 87.35 (86.57–88.13) | 98.94 (98.84–99.04) | 93.09 (92.48–93.71) | 97.31 (97.17–97.48) | 0.956 (0.954–0.958) | 6134 | 888 | 455 | 42,491 |
| Training | EDH | 78.47 (76.11–80.82) | 98.94 (98.84–99.01) | 63.28 (60.79–65.76) | 98.43 (98.32–98.53) | 0.988 (0.987–0.989) | 915 | 251 | 531 | 48,271 |
| Validation | ICH-Binary | 99.70 (99.35–100) | 99.34 (99.09–99.58) | 97.24 (96.24–98.25) | 99.41 (99.19–99.61) | 0.998 (0.996–0.999) | 988 | 3 | 28 | 4192 |
| Validation | IPH | 95.90 (94.38–97.41) | 99.28 (99.02–99.42) | 95.03 (93.38–96.68) | 98.85 (98.55–99.13) | 0.998 (0.997–1) | 631 | 27 | 33 | 4520 |
| Validation | IVH | 95.22 (93.16–97.26) | 99.56 (99.37–99.74) | 94.99 (92.90–97.08) | 99.21 (98.97–99.45) | 0.998 (0.992–1) | 398 | 20 | 21 | 4772 |
| Validation | SAH | 84.50 (81.60–87.39) | 99.09 (98.81–99.20) | 92.35 (90.13–94.57) | 97.41 (96.47–97.84) | 0.991 (0.981–0.999) | 507 | 93 | 42 | 4569 |
| Validation | SDH | 91.13 (88.50–93.75) | 99.33 (99.10–99.55) | 92.78 (90.37–95.19) | 98.62 (98.30–98.90) | 0.974 (0.972–0.976) | 411 | 40 | 32 | 4728 |
| Validation | EDH | 74.61 (64.48–84.73) | 98.83 (98.49–99.16) | 51.56 (41.80–61.11) | 98.50 (98.16–98.83) | 0.980 (0.970–0.983) | 53 | 18 | 50 | 5090 |
| Testing | ICH-Binary | 96.41 (93.58–99.22) | 95.79 (93.45–98.18) | 93.06 (89.28–96.85) | 96.02 (94.21–97.82) | 0.961 (0.941–0.982) | 161 | 6 | 12 | 273 |
| Testing | IPH | 82.56 (74.53–90.57) | 97.54 (95.95–99.12) | 88.75 (81.83–95.67) | 94.69 (92.62–96.75) | 0.905 (0.888–0.925) | 71 | 15 | 9 | 357 |
| Testing | IVH | 86.84 (66.94–97.58) | 98.31 (97.06–99.55) | 82.50 (70.72–94.28) | 97.35 (95.86–98.82) | 0.925 (0.900–0.950) | 33 | 5 | 7 | 407 |
| Testing | SAH | 91.67 (83.84–99.48) | 86.14 (82.76–89.50) | 44.00 (34.27–53.73) | 86.73 (83.69–89.85) | 0.889 (0.863–0.925) | 44 | 4 | 56 | 348 |
| Testing | SDH | 88.16 (80.89–95.42) | 90.16 (87.14–93.17) | 64.42 (55.22–73.62) | 89.82 (87.03–92.61) | 0.891 (0.870–0.910) | 67 | 9 | 37 | 339 |
| Testing | EDH | 71.40 (47.72–95.07) | 99.98 (99.84–100) | 90.91 (73.92–100) | 98.89 (97.15–99.90) | 0.980 (0.96–1) | 10 | 4 | 1 | 437 |

TP/FN/FP/TN denote scan counts from the confusion matrix against the reference-standard labels (TP true-positive, FN false-negative, FP false-positive, TN true-negative).
*EDH epidural hemorrhage, ICH intracranial hemorrhage, IPH intra-parenchymal hemorrhage, IVH intraventricular hemorrhage, SAH subarachnoid hemorrhage, SDH subdural hemorrhage.
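The per-class metrics in the table follow directly from each 2×2 confusion matrix. A quick Python check against the reported validation ICH-binary counts (TP = 988, FN = 3, FP = 28, TN = 4192):

```python
def metrics(tp, fn, fp, tn):
    """Standard diagnostic metrics from a 2x2 confusion matrix (in %)."""
    sensitivity = 100 * tp / (tp + fn)   # recall on ICH-positive scans
    specificity = 100 * tn / (tn + fp)   # recall on ICH-negative scans
    precision = 100 * tp / (tp + fp)     # positive predictive value
    accuracy = 100 * (tp + tn) / (tp + fn + fp + tn)
    return sensitivity, specificity, precision, accuracy

# Validation-set ICH-binary confusion matrix from the table above.
sens, spec, prec, acc = metrics(tp=988, fn=3, fp=28, tn=4192)
print(round(sens, 2), round(spec, 2), round(prec, 2), round(acc, 2))
# → 99.7 99.34 97.24 99.41
```

These reproduce the validation ICH-binary row of the table (99.70, 99.34, 97.24, and 99.41, respectively).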
Figure 3. A 68-year-old woman with known hypertension (the images were created by the authors using open-source software, Matplotlib v3.5, Python v3). The non-contrast head CT scan (right) shows a right thalamic hematoma extending into the adjacent ventricular system. The NormGrad method (middle) generates more delicate saliency maps than Grad-CAM (left), highlighting the thalamic hematoma and its ventricular extension. The average quality scores were 3.6 and 2 points for NormGrad and Grad-CAM, respectively. Note that the observers evaluated saliency maps with the same color spectrum; the current color maps are adjusted for representative purposes.
Figure 4. A 71-year-old man with a recent history of head trauma (the images were created by the authors using open-source software, Matplotlib v3.5, Python v3). The non-contrast head CT scan (right) shows a subdural hematoma along the left tentorium cerebelli. The NormGrad method (middle) generates finer saliency maps than Grad-CAM (left), highlighting the subdural hematoma. The average quality scores were 3.8 and 1.8 points for NormGrad and Grad-CAM, respectively. Note that the observers evaluated saliency maps with the same color spectrum; the current color maps are adjusted for representative purposes.
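For context, Grad-CAM builds its saliency map by weighting each feature-map channel with its spatially averaged gradient and rectifying the weighted sum, whereas NormGrad aggregates norms of gradient-activation products, which tends to produce the finer maps seen in these figures. Below is a minimal NumPy sketch of the Grad-CAM computation only; the activation and gradient arrays are random stand-ins, not outputs of the authors' network.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM saliency map from one convolutional layer.

    activations: (C, H, W) feature maps; gradients: (C, H, W) gradients
    of the class score w.r.t. those maps (random stand-ins here).
    """
    weights = gradients.mean(axis=(1, 2))  # global-average-pool the gradients: one weight per channel
    cam = np.maximum((weights[:, None, None] * activations).sum(axis=0), 0)  # ReLU of the weighted sum
    if cam.max() > 0:
        cam /= cam.max()                   # normalize to [0, 1] for display
    return cam

rng = np.random.default_rng(1)
acts = rng.standard_normal((64, 16, 16))
grads = rng.standard_normal((64, 16, 16))
cam = grad_cam(acts, grads)
```

In practice the resulting low-resolution map is upsampled to the CT slice size and overlaid as a heat map, as in the figures above.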
Figure 5. Representative images of different patients in whom the model predictions were wrong (the images were created by the authors using open-source software, Matplotlib v3.5, Python v3). The original image (upper left) and the corresponding NormGrad map (upper right) of a false-positive prediction are shown. In addition, the model overlooked a minor subarachnoid hemorrhage in the left frontal lobe (lower left), and it missed a minor subarachnoid hemorrhage in the frontal lobe together with a subdural hemorrhage in the frontotemporal area (lower right).