Gitaek Kwon, Jongbin Ryu, Jaehoon Oh, Jongwoo Lim, Bo-Kyeong Kang, Chiwon Ahn, Junwon Bae, Dong Keon Lee.
Abstract
This study aimed to verify a deep convolutional neural network (CNN) algorithm to detect intussusception in children using a human-annotated data set of plain abdominal X-rays from affected children. From January 2005 to August 2019, 1449 images were collected from plain abdominal X-rays of patients ≤ 6 years old who were diagnosed with intussusception, while 9935 images were collected from patients without intussusception, at three tertiary academic hospitals (A, B, and C data sets). Single Shot MultiBox Detector and ResNet were used for abdominal detection and intussusception classification, respectively. The diagnostic performance of the algorithm was analysed using internal and external validation tests. The internal test values after training with two hospital data sets were 0.946 to 0.971 for the area under the receiver operating characteristic curve (AUC), 0.927 to 0.952 for the highest accuracy, and 0.764 to 0.848 for the highest Youden index. The values from the external tests using the remaining data set were all lower (P-value < 0.001). The mean values of the internal test with all data sets were 0.935 and 0.743 for the AUC and Youden index, respectively. Detection of intussusception by deep CNN and plain abdominal X-rays could aid in screening for intussusception in children.
Year: 2020 PMID: 33067505 PMCID: PMC7567788 DOI: 10.1038/s41598-020-74653-1
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Flow chart of data collection and analysis. ED, emergency department.
Baseline characteristics of participants who provided images for the data sets.
| Data set | Variable | Positive images | Negative images | P-value |
|---|---|---|---|---|
| A | Images, n | 318 | 3525 | |
| A | Participants, n | 161 | 1760 | |
| A | Age, months, mean [s.d.] | 21.8 [12.0] | 24.0 [17.6] | 0.12 |
| A | Sex, male, n (%) | 101 (62.7) | 1207 (68.6) | 0.13 |
| B | Images, n | 716 | 5210 | |
| B | Participants, n | 361 | 2615 | |
| B | Age, months, mean [s.d.] | 22.2 [17.9] | 32.8 [17.9] | < 0.001* |
| B | Sex, male, n (%) | 228 (63.2) | 1461 (55.9) | 0.01* |
| C | Images, n | 415 | 1200 | |
| C | Participants, n | 208 | 602 | |
| C | Age, months, mean [s.d.] | 20.9 [16.8] | 31.1 [23.8] | < 0.001* |
| C | Sex, male, n (%) | 136 (65.4) | 305 (50.7) | < 0.001* |
| Total | Images, n | 1449 | 9935 | |
| Total | Participants, n | 730 | 4977 | |
| Total | Age, months, mean [s.d.] | 21.7 [15.8] | 29.5 [19.3] | < 0.001* |
| Total | Sex, male, n (%) | 465 (63.7) | 2973 (59.7) | 0.04* |
Continuous variables are presented as mean [standard deviation] and categorical variables as n (%).
The independent t-test or the Kruskal–Wallis test was used to compare the positive and negative groups, depending on normality. Categorical variables were analysed using a chi-squared test.
*P-values < 0.05 were considered statistically significant.
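The chi-squared comparison described above can be illustrated with a minimal sketch. It assumes a Pearson chi-squared test without continuity correction applied to participant counts (the source does not state either detail), and the function name `chi2_2x2` is hypothetical. The counts are the male and non-male participants from the first block of the baseline table (the block with 318 positive and 3525 negative images).

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = 0.0
    for observed, row, col in ((a, row1, col1), (b, row1, col2),
                               (c, row2, col1), (d, row2, col2)):
        expected = row * col / n
        stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the chi-squared tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Positive group: 101 of 161 participants male; negative group: 1207 of 1760 male.
stat, p_value = chi2_2x2(101, 161 - 101, 1207, 1760 - 1207)
print(f"chi2 = {stat:.2f}, p = {p_value:.2f}")
```

Under these assumptions the p-value rounds to the 0.13 reported for that block.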
Diagnostic performance matrix of the internal and external validation tests with optimal cut-off values (Phase 1).
| Test | Prediction | (A) Positive | (A) Negative | (B) Positive | (B) Negative | (C) Positive | (C) Negative |
|---|---|---|---|---|---|---|---|
| Internal validation | Predicted positive | 188 | 166 | 214 | 122 | 136 | 141 |
| Internal validation | Predicted negative | 18 | 1581 | 13 | 1161 | 13 | 805 |
| External validation | Predicted positive | 329 | 446 | 301 | 1817 | 466 | 822 |
| External validation | Predicted negative | 86 | 754 | 17 | 1708 | 250 | 4388 |
The optimal cut-off value was estimated based on the highest Youden index in the internal validation tests.
(A) External validation with set C after training and internal validation with sets A + B, (B) External validation with set A after training and internal validation with sets B + C, (C) External validation with set B after training and internal validation with sets C + A. Positive, with intussusception; negative, without intussusception. The Youden index is sensitivity + specificity − 1.
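As a check on how the matrix relates to the reported metrics, the following sketch derives sensitivity, specificity, and the Youden index from the experiment (A) counts. The function name `diagnostic_metrics` is illustrative and not taken from the source.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity and Youden index from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positives / all actual positives
    specificity = tn / (tn + fp)   # true negatives / all actual negatives
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden": sensitivity + specificity - 1,
    }

# Experiment (A): internal validation (sets A + B) and external validation (set C).
internal_a = diagnostic_metrics(tp=188, fp=166, fn=18, tn=1581)
external_a = diagnostic_metrics(tp=329, fp=446, fn=86, tn=754)

for name, m in (("internal", internal_a), ("external", external_a)):
    print(name, {k: round(v, 3) for k, v in m.items()})
```

The rounded values agree with the sensitivity, specificity, and Youden index reported for experiment (A) in the Phase 1 outcomes table.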
Outcomes of the internal validation test after the training with two data sets and of the external validation test using the excluded data set (Phase 1).
| Experiment | Training and internal validation data set | Internal AUC (95% CI) | Highest accuracy | Highest Youden index | Internal Sen | Internal Spe | Optimal cut-off value | External validation data set | External AUC (95% CI) | External Youden index | External Sen | External Spe | P-value of difference between the two validations (95% CI) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| (A) | A + B | 0.966 (0.955, 0.975) | 0.952 | 0.818 | 0.913 | 0.905 | 0.02 | C | 0.811 (0.784, 0.835) | 0.421 | 0.793 | 0.628 | < 0.001* (0.128, 0.183) |
| (B) | B + C | 0.971 (0.959, 0.980) | 0.943 | 0.848 | 0.943 | 0.905 | 0.06 | A | 0.895 (0.874, 0.913) | 0.431 | 0.947 | 0.485 | < 0.001* (0.059, 0.102) |
| (C) | C + A | 0.946 (0.926, 0.961) | 0.927 | 0.764 | 0.913 | 0.851 | 0.01 | B | 0.844 (0.828, 0.858) | 0.493 | 0.651 | 0.842 | < 0.001* (0.080, 0.125) |
(A) External validation with set C after training and internal validation with sets A + B, (B) External validation with set A after training and internal validation with sets B + C, (C) External validation with set B after training and internal validation with sets C + A, (D) Internal validation after training with sets A + B + C. Positive, with intussusception; negative, without intussusception. AUC, area under the receiver operating characteristic curve (ROC). Accuracy, the fraction of the correct predictions over the total number of predictions. The Youden index, sensitivity + specificity – 1—that is, the vertical distance between the 45° line and the point on the ROC curve. In the external validation tests, we selected the optimal cut-off value based on the highest Youden index value in the internal validation tests. CI, confidence interval. Sen, sensitivity. Spe, specificity.
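The cut-off procedure described in this note (pick the threshold with the highest Youden index on the internal validation set, then reuse it unchanged on the external set) can be sketched as follows. This is an illustration only: the scores and labels are made-up toy values, and the helper names `youden_at` and `best_cutoff` are hypothetical rather than the authors' code.

```python
def youden_at(scores, labels, cutoff):
    """Youden index when scores >= cutoff are predicted positive (label 1)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn) + tn / (tn + fp) - 1

def best_cutoff(scores, labels):
    """Candidate cut-offs are the observed scores; return the one maximising Youden."""
    return max(sorted(set(scores)), key=lambda c: youden_at(scores, labels, c))

# Toy internal validation data (hypothetical classifier scores, 1 = intussusception).
internal_scores = [0.01, 0.02, 0.05, 0.03, 0.40, 0.90]
internal_labels = [0, 0, 1, 0, 1, 1]
cutoff = best_cutoff(internal_scores, internal_labels)

# The cut-off is fixed and then applied to the external data without re-tuning.
external_scores = [0.02, 0.04, 0.06, 0.03, 0.08, 0.70]
external_labels = [0, 1, 0, 0, 1, 1]
external_youden = youden_at(external_scores, external_labels, cutoff)
print(cutoff, round(external_youden, 3))
```

Because the threshold is tuned only on internal data, the external Youden index can fall well below the internal one, which is the pattern the table reports.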
*P-values < 0.05 indicate a statistically significant difference.
Figure 2. Receiver operating characteristic (ROC) curves of internal and external validation tests in Phase 1 and 2 experiments. (A) External validation with set C after training and internal validation with sets A + B, (B) External validation with set A after training and internal validation with sets B + C, (C) External validation with set B after training and internal validation with sets C + A, (D) Internal validation after training with sets A + B + C.
Outcomes on the internal validation test after training with all data sets (Phase 2).
| Test | Highest Youden index | Sen | Spe | AUC (95% CI) |
|---|---|---|---|---|
| 1st | 0.737 | 0.816 | 0.921 | 0.936 (0.918–0.950) |
| 2nd | 0.731 | 0.784 | 0.936 | 0.946 (0.931–0.958) |
| 3rd | 0.760 | 0.817 | 0.943 | 0.949 (0.934–0.961) |
| 4th | 0.726 | 0.844 | 0.882 | 0.922 (0.904–0.937) |
| 5th | 0.760 | 0.817 | 0.943 | 0.949 (0.934–0.960) |
| Mean (95% CI) | 0.743 (0.722–0.763) | 0.816 (0.789–0.842) | 0.925 (0.893–0.957) | 0.935 (0.928–0.941) |
AUC, area under the receiver operating characteristic curve (ROC). The Youden index, the sensitivity + specificity – 1—that is, the vertical distance between the 45° line and the point on the ROC curve. In the internal validation tests, after training with all data sets, we selected the outcome values based on the highest Youden index.
CI, confidence interval. Sen, sensitivity. Spe, specificity.
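The mean row of the Phase 2 table can be approximated from the five listed test values. The sketch below assumes a Student's t interval with 4 degrees of freedom; the source does not state how its intervals were computed, so this is a plausible reconstruction rather than the authors' method, and `mean_ci` is a hypothetical helper name.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t with 4 degrees of freedom (n = 5 tests).
T_CRIT_DF4 = 2.776

def mean_ci(values, t_crit=T_CRIT_DF4):
    """Mean with a t-based confidence interval (assumed method, see lead-in)."""
    mean = statistics.mean(values)
    half_width = t_crit * statistics.stdev(values) / math.sqrt(len(values))
    return mean, mean - half_width, mean + half_width

# Highest Youden index of the 1st to 5th internal validation tests.
youden = [0.737, 0.731, 0.760, 0.726, 0.760]
mean_j, low_j, high_j = mean_ci(youden)
print(f"{mean_j:.3f} ({low_j:.3f}, {high_j:.3f})")
```

With the rounded table values as input, the result lands within about 0.001 of the reported 0.743 (0.722–0.763); small differences are expected because the published figures were presumably computed from unrounded values.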
Figure 3. Class activation map (CAM) for images that were true positive in the 2nd internal validation test using all data sets. The images in the odd rows are the original images, while those in the even rows are images with CAM applied. CAM highlighted unidentified areas in the images of the 6th row, whereas it highlighted the correct areas in the 2nd and 4th rows.
Figure 4. Intussusception screening system architecture. The proposed architecture consists of the abdomen detection model (top) and the intussusception classification model (bottom). The abdomen detection model detects the abdominal region from the entire X-ray image. The intussusception classification model detects intussusception on X-ray images that were cropped by the abdomen detection model.
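The two-stage flow in this caption (detect the abdomen, crop, then classify the crop) can be sketched as a small pipeline. Everything here is a hypothetical stand-in: `fake_detector` and `fake_classifier` replace the Single Shot MultiBox Detector and ResNet models named in the abstract, the image is a toy 4 × 4 grid, and the cut-off is arbitrary.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive

def crop(image: Image, box: Box) -> Image:
    """Cut the detected region out of the full image."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]

def screen(image: Image,
           detect_abdomen: Callable[[Image], Box],
           classify: Callable[[Image], float],
           cutoff: float) -> Tuple[float, bool]:
    """Stage 1 finds the abdomen; stage 2 scores the cropped region."""
    box = detect_abdomen(image)
    score = classify(crop(image, box))
    return score, score >= cutoff

# Placeholder models, for illustration only.
def fake_detector(image: Image) -> Box:
    return (1, 1, 3, 3)  # pretend the abdomen is the central 2 x 2 block

def fake_classifier(patch: Image) -> float:
    pixels = [p for row in patch for p in row]
    return sum(pixels) / len(pixels)  # mean intensity as a dummy score

image = [[0.0, 0.0, 0.0, 0.0],
         [0.0, 0.2, 0.4, 0.0],
         [0.0, 0.6, 0.8, 0.0],
         [0.0, 0.0, 0.0, 0.0]]
score, flagged = screen(image, fake_detector, fake_classifier, cutoff=0.02)
print(round(score, 2), flagged)
```

Passing the detector and classifier in as callables keeps the stages independent, so either model could be swapped without touching the other.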