Pranav Rajpurkar, Allison Park, Jeremy Irvin, Andrew Y Ng, Bhavik N Patel, Chris Chute, Michael Bereket, Domenico Mastrodicasa, Curtis P Langlotz, Matthew P Lungren.
Abstract
The development of deep learning algorithms for complex tasks in digital medicine has relied on the availability of large labeled training datasets, usually containing hundreds of thousands of examples. The purpose of this study was to develop a 3D deep learning model, AppendiXNet, to detect appendicitis, one of the most common life-threatening abdominal emergencies, using a small training dataset of fewer than 500 CT exams. We explored whether pretraining the model on a large collection of natural videos would improve its performance over training from scratch. AppendiXNet was pretrained on Kinetics, a large collection of approximately 500,000 YouTube video clips, each annotated with one of 600 human action classes, and then fine-tuned on a small dataset of 438 CT scans annotated for appendicitis. We found that pretraining the 3D model on natural videos significantly improved the performance of the model from an AUC of 0.724 (95% CI 0.625, 0.823) to 0.810 (95% CI 0.725, 0.895). The application of deep learning to detect abnormalities on CT examinations using video pretraining could generalize effectively to other challenging cross-sectional medical imaging tasks when training data are limited.
Year: 2020 PMID: 32127625 PMCID: PMC7054445 DOI: 10.1038/s41598-020-61055-6
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. Data Set Selection Flow Diagram.
Demographic information for the training, development, and test datasets.
| Statistic | Training | Development | Test |
|---|---|---|---|
| Studies, N | 438 | 106 | 102 |
| Patients, N | 435 | 105 | 102 |
| Female, N (%) | 253 (58.2) | 64 (61.0) | 64 (62.7) |
| Age, Y mean (SD) | 38.2 (15.6) | 39.2 (17.3) | 38.4 (15.7) |
| Abnormal, N (%) | 255 (58.2) | 53 (50.0) | 51 (50.0) |
Figure 2. AppendiXNet Training. AppendiXNet was first pretrained on Kinetics, a large collection of labeled YouTube videos. After pretraining, the network was fine-tuned on the appendicitis task after replacing the final fully connected layer with one that produces a single output.
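The head replacement described in the Figure 2 caption can be sketched in PyTorch as follows. This is a minimal illustration, not the authors' implementation: `Tiny3DNet` is a hypothetical stand-in for a video-pretrained 3D network, `replace_head` is an illustrative helper, and the three-channel input is an assumption, since this record does not say how grayscale CT volumes were fed to the network.

```python
import torch
from torch import nn


class Tiny3DNet(nn.Module):
    """Hypothetical stand-in for a video-pretrained 3D CNN (not AppendiXNet)."""

    def __init__(self, num_classes: int = 600):
        super().__init__()
        # A single 3D convolution followed by global pooling, for brevity.
        self.features = nn.Sequential(
            nn.Conv3d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
            nn.Flatten(),
        )
        # Final fully connected layer sized for the 600 Kinetics classes.
        self.fc = nn.Linear(8, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc(self.features(x))


def replace_head(model: Tiny3DNet, num_outputs: int = 1) -> Tiny3DNet:
    """Swap the final layer for a new one with `num_outputs` units."""
    model.fc = nn.Linear(model.fc.in_features, num_outputs)
    return model


# In practice, pretrained weights would be loaded before this step.
model = replace_head(Tiny3DNet(num_classes=600))

# Assumed input layout: (batch, channels, slices, height, width).
volume = torch.randn(2, 3, 16, 32, 32)
logits = model(volume)
probabilities = torch.sigmoid(logits)  # one appendicitis score per exam
```

All layers other than `fc` keep their existing weights, so fine-tuning starts from the pretrained features and only the new single-output layer starts from random initialization.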
Performance measures of AppendiXNet on the independent test set with and without pretraining.
| Model | AUC | Specificity | Sensitivity | Accuracy |
|---|---|---|---|---|
| Pretrained on video images | 0.810 (0.725, 0.895) | 0.667 (0.530, 0.780) | 0.784 (0.654, 0.875) | 0.725 (0.632, 0.803) |
| Not pretrained on video images | 0.724 (0.625, 0.823) | 0.353 (0.236, 0.490) | 0.784 (0.654, 0.875) | 0.569 (0.472, 0.661) |
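The point estimates in the table above can be related to confusion-matrix counts. The sketch below is an inference from the reported figures, not something stated in this record: with 51 abnormal and 51 normal test studies, the pretrained model's values are consistent with 40 true positives and 34 true negatives, and the bracketed intervals are numerically consistent with 95% Wilson score intervals. The function name `wilson_interval` and the counts are assumptions made for illustration.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion."""
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    )
    return center - half_width, center + half_width


# Counts inferred from the table for the pretrained model (assumption).
tp, fn = 40, 11  # 51 abnormal test studies
tn, fp = 34, 17  # 51 normal test studies

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = (tp + tn) / (tp + fn + tn + fp)

sens_ci = wilson_interval(tp, tp + fn)
spec_ci = wilson_interval(tn, tn + fp)
acc_ci = wilson_interval(tp + tn, tp + fn + tn + fp)

for name, value, ci in [
    ("Sensitivity", sensitivity, sens_ci),
    ("Specificity", specificity, spec_ci),
    ("Accuracy", accuracy, acc_ci),
]:
    print(f"{name}: {value:.3f} ({ci[0]:.3f}, {ci[1]:.3f})")
```

Under the same assumption, the model without pretraining would correspond to 40 true positives and 18 true negatives, which explains why sensitivity is identical in both rows while specificity and accuracy differ.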
Figure 3. Interpreting AppendiXNet Predictions. CT (grayscale; left) and corresponding gradient-weighted class activation map (Grad-CAM) (colored; right) are provided. (A) Example of true positive. CT image (left) shows acute appendicitis (dashed circle). The deep learning model correctly predicted appendicitis, as seen on the corresponding Grad-CAM. (B) Example of false positive. CT image (left) shows a normal appendix (dashed circle). The deep learning model incorrectly predicted appendicitis, as seen on the corresponding Grad-CAM, where it focused on a loop of distal ileum that drove the false positive prediction. (C) Example of false negative. CT image (left) shows early acute appendicitis (dashed circle). The deep learning model incorrectly provided a low predicted probability for appendicitis, as seen on the corresponding Grad-CAM, where it focused on a loop of distal ileum.
Area under the receiver operating characteristic curve (AUC) of different models on the development set with and without pretraining.
| Model | Not pretrained, AUC (95% CI) | Pretrained, AUC (95% CI) |
|---|---|---|
| AppendiXNet | 0.743 (0.649, 0.837) | 0.826 (0.742, 0.909) |
| Average of 2D ResNet-18 | 0.704 (0.605, 0.803) | 0.763 (0.672, 0.854) |
| Average of 2D ResNet-34 | 0.740 (0.644, 0.835) | 0.802 (0.715, 0.888) |
| LRCN ResNet-18 | 0.706 (0.605, 0.806) | 0.778 (0.690, 0.867) |
| LRCN ResNet-34 | 0.488 (0.376, 0.600) | 0.787 (0.699, 0.875) |
| SE-ResNeXt-50 | 0.503 (0.391, 0.614) | 0.721 (0.625, 0.817) |
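For readers unfamiliar with the metric, the AUC reported in these tables equals the probability that a randomly chosen abnormal exam receives a higher score than a randomly chosen normal exam, with ties counted as one half. The sketch below computes it directly from that definition; the function name `pairwise_auc` and the labels and scores are hypothetical toy values, not data from the study, and this is not necessarily how the authors computed their results.

```python
def pairwise_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy example: 1 = appendicitis, 0 = normal.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = pairwise_auc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly
```

An AUC near 0.5, as in the non-pretrained LRCN ResNet-34 and SE-ResNeXt-50 rows, means the scores rank abnormal and normal exams no better than chance.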