Emily N Boice, Sofia I Hernandez-Torres, Eric J Snider.
Abstract
Ultrasound imaging is essential in emergency medicine and combat casualty care, where it often serves as a critical triage tool. However, identifying injuries such as shrapnel embedded in tissue or a pneumothorax can be challenging without extensive ultrasonography training, which may not be available in prolonged field care or emergency medicine scenarios. Artificial intelligence can simplify this by automating image interpretation, but only if it can be deployed for use in real time. We previously developed a deep learning neural network model specifically designed to identify shrapnel in ultrasound images, termed ShrapML. Here, we expand on that work to further optimize the model and compare its performance to that of conventional models trained on the ImageNet database, such as ResNet50. Through Bayesian optimization, the model's parameters were further refined, resulting in an F1 score of 0.98. We compared the proposed model to four conventional models, DarkNet-19, GoogleNet, MobileNetv2, and SqueezeNet, which were down-selected based on speed and testing accuracy. Although MobileNetv2 achieved a higher accuracy than ShrapML, there was a tradeoff between accuracy and speed, with ShrapML being 10× faster than MobileNetv2. In conclusion, real-time deployment of algorithms such as ShrapML can reduce the cognitive load for medical providers in high-stress emergency or military medicine scenarios.
Keywords: artificial intelligence; deep learning; emergency medicine; image interpretation; military medicine; shrapnel; ultrasound imaging
Year: 2022 PMID: 35621904 PMCID: PMC9144026 DOI: 10.3390/jimaging8050140
Source DB: PubMed Journal: J Imaging ISSN: 2313-433X
Summary of Bayesian-optimized hyperparameters for ShrapML.
| Hyperparameter | Range of Values | Type |
|---|---|---|
| Number of CNN layers | 1–6 | Integer only |
| CNN filters | 4–32 | Integer only |
| Dropout rate | 25–75% | Real number |
| Fully connected layer filters | 8–256 | Integer only |
| Solver type | RMSprop, ADAM, SGDM | Categorical |
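The search space in the table above can be written down as a small data structure. The sketch below is only illustrative: the key names are hypothetical, and uniform random sampling stands in for the Bayesian acquisition step, which is not reproduced here.

```python
import random

# Search space copied from the hyperparameter table above.
# Key names are hypothetical labels chosen for this sketch.
SEARCH_SPACE = {
    "num_cnn_layers": ("int", 1, 6),
    "cnn_filters": ("int", 4, 32),
    "dropout_rate": ("real", 0.25, 0.75),
    "fc_filters": ("int", 8, 256),
    "solver": ("cat", ["RMSprop", "ADAM", "SGDM"]),
}


def sample_config(rng):
    """Draw one candidate configuration uniformly from SEARCH_SPACE."""
    config = {}
    for name, spec in SEARCH_SPACE.items():
        kind = spec[0]
        if kind == "int":
            config[name] = rng.randint(spec[1], spec[2])  # bounds are inclusive
        elif kind == "real":
            config[name] = rng.uniform(spec[1], spec[2])
        else:
            config[name] = rng.choice(spec[1])
    return config


def in_search_space(config):
    """Return True if every hyperparameter lies inside its allowed range."""
    if set(config) != set(SEARCH_SPACE):
        return False
    for name, spec in SEARCH_SPACE.items():
        value = config[name]
        if spec[0] == "cat":
            if value not in spec[1]:
                return False
        elif not (spec[1] <= value <= spec[2]):
            return False
        elif spec[0] == "int" and not isinstance(value, int):
            return False
    return True


if __name__ == "__main__":
    print(sample_config(random.Random(0)))
```

A real Bayesian optimizer would replace `sample_config` with a proposal driven by a surrogate model of validation loss; the range check stays the same either way.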
Figure 1. Representative ultrasound images for preprocessed baseline (A) and shrapnel of varying sizes, 2 mm (B), 4 mm (C), 6 mm (D), 8 mm (E), and 10 mm (F), acquired in the gelatin phantom.
Summary of image classifier model architecture for shrapnel identification in ultrasound image datasets. The algorithm architecture details identify the layer counts as layers with weights: Conv layers and fully connected (FC) or dense layers. This column also contains information about various modules in use (Fire, MBConv, Bottleneck, etc.).
| Classifier Algorithm | Architecture Details | Source of Training Images | Parameters (in Millions) | Year First Published |
|---|---|---|---|---|
| ShrapML | 8 layers: 6 Conv, 2 FC | Ultrasound datasets | 0.43 | 2022 |
| AlexNet | 8 layers: 5 Conv, 3 FC | ImageNet | 62.3 | 2012 |
| DarkNet19 | 19 layers: 19 Conv | ImageNet | 20.8 | 2016 |
| DarkNet53 | 53 layers: 52 Conv, 1 FC | ImageNet | 41.6 | 2018 |
| EfficientNetB0 | 82 layers: 1 Conv, 16 MBConv modules | ImageNet | 5.3 | 2020 |
| GoogleNet | 22 layers: 22 Conv | ImageNet | 7 | 2014 |
| InceptionNetv3 | 101 layers: 99 Conv, 2 FC | ImageNet | 23.9 | 2015 |
| MobileNetv2 | 53 layers: 3 Conv, 7 Bottleneck modules | ImageNet | 3.5 | 2019 |
| ResNet50 | 50 layers: 50 Conv | ImageNet | 25.6 | 2015 |
| ResNet101 | 101 layers: 101 Conv | ImageNet | 44.6 | 2015 |
| SqueezeNet | 18 layers: 2 Conv, 8 Fire modules | ImageNet | 1.24 | 2016 |
| VGG16 | 16 layers: 13 Conv, 3 FC | ImageNet | 138 | 2014 |
Summary of results of the Bayesian optimization of ShrapML. Iterations 33, 83, and 71 were the three highest-performing iterations, based on validation loss. Iterations 231 and 74 were representative medium- and poor-performing iterations. The last column shows results for 10 epochs of training with the original ShrapML model. The single-color heat map indicates best-performing models in green, with the worst being uncolored for each performance metric row.
| Model Feature | Iteration 33 | Iteration 83 | Iteration 71 | Iteration 231 | Iteration 74 | Original ShrapML |
|---|---|---|---|---|---|---|
| FC Nodes | 252 | 214 | 250 | 57 | 8 | 256 |
| CNN Nodes | 32 | 5 | 5 | 23 | 32 | 16 |
| Dropout Rate | 31.2% | 36.4% | 25.6% | 72.8% | 59.8% | 55.0% |
| Solver | ADAM | RMSprop | RMSprop | SGDM | ADAM | RMSprop |
| # Layers | 6 | 6 | 6 | 4 | 3 | 5 |
| Time to 10 Epochs | 40:52 | 09:21 | 09:20 | 28:53 | 38:45 | 24:36 |
| Validation Accuracy | 93.7% | 93.4% | 93.7% | 77.8% | 54.4% | 87.7% |
| Validation Loss | 0.1753 | 0.2056 | 0.2296 | 0.4815 | 0.6898 | 0.3448 |
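The ranking described in the caption (lowest validation loss first) can be reproduced from the table values with a short sort. This is only a sketch of the selection criterion, not the authors' tooling; the variable names are hypothetical.

```python
# Validation loss after 10 epochs, copied from the table above.
validation_loss = {
    "Iteration 33": 0.1753,
    "Iteration 83": 0.2056,
    "Iteration 71": 0.2296,
    "Iteration 231": 0.4815,
    "Iteration 74": 0.6898,
    "Original ShrapML": 0.3448,
}

# Lower loss is better, so sort the names in ascending order of loss.
ranked = sorted(validation_loss, key=validation_loss.get)
top_three = ranked[:3]

if __name__ == "__main__":
    for name in ranked:
        print(f"{name}: {validation_loss[name]:.4f}")
```

Sorting by loss rather than accuracy separates Iterations 33 and 71, which share the same validation accuracy (93.7%) but differ in loss.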
Figure 2. Network architecture for the optimized ShrapML model.
Figure 3. Final backend analysis of the trained Bayesian-optimized ShrapML model, including (A) the confusion matrix and (B) ROC analysis from test image sets.
Summary of performance metrics for ShrapML.
| Accuracy | Area Under ROC | F1 Score | Precision | Recall | Specificity |
|---|---|---|---|---|---|
| 0.9761 | 0.9985 | 0.9765 | 0.9645 | 0.9889 | 0.9631 |
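The metrics in this table follow the standard binary-classification definitions. The sketch below shows how they are derived from confusion-matrix counts and checks that the reported F1 score is consistent with the reported precision and recall. The function names are hypothetical, and the underlying counts are not given in this record, so none are assumed.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also called sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall as reported in the table above.
    print(round(f1_score(0.9645, 0.9889), 4))  # 0.9765
```

The recomputed value agrees with the F1 score of 0.9765 listed in the table.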
Test accuracy and training time for the 12 models after an initial training run of five epochs. The five selected models are indicated in bold.
| Model | Test Accuracy | Training Time (min) |
|---|---|---|
| AlexNet | 0.50 | 10.6 |
| **DarkNet19** | | |
| DarkNet53 | 0.68 | 36.4 |
| EfficientNetB0 | 0.81 | 29.5 |
| **GoogleNet** | | |
| InceptionNetV3 | 0.58 | 22.0 |
| **MobileNetv2** | | |
| ResNet50 | 0.75 | 26.6 |
| ResNet101 | 0.83 | 41.1 |
| **ShrapML** | | |
| **SqueezeNet** | | |
| VGG16 | 0.83 | 67.1 |
Figure 4. Confusion matrix analysis after 100 training epochs for (A) DarkNet-19, (B) GoogleNet, (C) MobileNetv2, (D) ShrapML, and (E) SqueezeNet.
Summary of performance metrics for each of the five models trained using 100 epochs. The single-color heat map indicates the best-performing models in green, with the worst being uncolored for each performance metric row.
| Metric | DarkNet-19 | GoogleNet | MobileNetv2 | ShrapML | SqueezeNet |
|---|---|---|---|---|---|
| Accuracy | 0.973 | 0.971 | 0.998 | 0.966 | 0.955 |
| AUC | 0.998 | 0.997 | 1.000 | 0.996 | 0.993 |
| F1 | 0.973 | 0.972 | 0.998 | 0.967 | 0.956 |
| Precision | 0.999 | 0.953 | 0.999 | 0.958 | 0.943 |
| Recall | 0.947 | 0.992 | 0.998 | 0.976 | 0.969 |
| Specificity | 0.999 | 0.950 | 0.998 | 0.956 | 0.941 |
| Testing Image Inference Time (ms) | 121.40 | 68.80 | 104.00 | 10.20 | 21.90 |
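The speed tradeoff can be made concrete by converting the inference times in the last row into relative speedups and approximate throughput. This is a rough sketch: the throughput figure assumes images are processed one at a time with no batching or other overhead, which the record does not state, and the helper names are hypothetical.

```python
# Per-image inference time in milliseconds, copied from the table above.
inference_ms = {
    "DarkNet-19": 121.40,
    "GoogleNet": 68.80,
    "MobileNetv2": 104.00,
    "ShrapML": 10.20,
    "SqueezeNet": 21.90,
}


def slowdown_vs(model, baseline="ShrapML"):
    """How many times longer `model` takes per image than `baseline`."""
    return inference_ms[model] / inference_ms[baseline]


def images_per_second(model):
    """Approximate throughput, assuming sequential single-image inference."""
    return 1000.0 / inference_ms[model]


if __name__ == "__main__":
    for name in sorted(inference_ms, key=inference_ms.get):
        print(f"{name}: {slowdown_vs(name):.1f}x ShrapML time, "
              f"{images_per_second(name):.1f} images/s")
```

With these values, MobileNetv2 takes about 10.2 times as long per image as ShrapML, matching the 10× figure quoted in the abstract.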