Gregor Urban (1), Saman Porhemmat (1), Maya Stark (2), Brian Feeley (3), Kazunori Okada (2), Pierre Baldi (1).
Abstract
Total Shoulder Arthroplasty (TSA) is a type of surgery in which the damaged ball of the shoulder joint is replaced with a prosthesis. Many years later, this prosthesis may need servicing or replacement. In some situations, for example when the patient has moved to another country, the model and manufacturer of the prosthesis may be unknown to the patient and the primary care physician. Correctly identifying the implant's model before surgery is required for selecting the appropriate equipment and procedure. We present a novel way to automatically classify shoulder implants in X-ray images. We employ deep learning models and compare their performance to alternative classifiers such as random forests and gradient boosting. We find that deep convolutional neural networks significantly outperform the other classifiers if and only if out-of-domain data such as ImageNet is used to pre-train the models. On a data set containing X-ray images of shoulder implants from 4 manufacturers and 16 different models, deep learning identifies the correct manufacturer with an accuracy of approximately 80% in 10-fold cross-validation, while the other classifiers achieve an accuracy of 56% or less. We believe that this approach will be a useful tool in clinical practice and is likely applicable to other kinds of prostheses.
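The evaluation protocol described in the abstract (10-fold cross-validation of a multi-class classifier) can be sketched with scikit-learn. This is a minimal illustration, not the authors' pipeline: the random feature matrix and the four-way manufacturer labels below are placeholders for the real X-ray data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Placeholder data: 600 "images" reduced to 128-dim feature vectors,
# each labelled with one of 4 manufacturers (the real study uses X-ray pixels).
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 128))
y = rng.integers(0, 4, size=600)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")

# The tables below report the mean over the 10 folds and, in parentheses,
# the standard deviation of the mean (std / sqrt(10)).
mean_acc = scores.mean()
sdm = scores.std(ddof=1) / len(scores) ** 0.5
```

Stratified folds keep the class proportions balanced across folds, which matters here because the 16 implant models are unevenly represented in the data set.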
Keywords: Computer vision; Deep learning; Orthopedics; Total shoulder arthroplasty; X-ray imaging
Year: 2020 PMID: 32368331 PMCID: PMC7186366 DOI: 10.1016/j.csbj.2020.04.005
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Fig. 1. Architecture of the custom CNN model. Convolutional layers are denoted as Conv, max pooling layers as Pool, and fully connected layers as FC.
Fig. 2. Examples from the data set: shoulder implants of three different manufacturers. Left to right: Cofield, DePuy, Zimmer.
Performance measures for non-deep learning classifiers. Shown are averages across 10-fold cross-validation, and standard deviation of the mean in parentheses. All methods were trained using data augmentation.
| Classifier | Accuracy [%] | Precision | Recall | F1-Score | AUC |
|---|---|---|---|---|---|
| Random Forest | 56 (1.0) | 0.62 (.03) | 0.36 (.02) | 0.51 (.03) | 0.78 (.01) |
| Logistic Regression | 53 (1.0) | 0.44 (.05) | 0.31 (.01) | 0.41 (.03) | 0.73 (.01) |
| Gradient Boosting | 55 (1.0) | 0.58 (.04) | 0.34 (.01) | 0.48 (.02) | 0.75 (.01) |
| KNN | 52 (1.0) | 0.49 (.04) | 0.31 (.01) | 0.43 (.02) | 0.73 (.01) |
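The columns in the tables (macro-averaged precision, recall, F1, and one-vs-rest AUC for the multi-class problem) can be computed with scikit-learn's metrics. The tiny hand-made probability matrix below is illustrative only; it stands in for the output of any of the classifiers above.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

y_true = np.array([0, 0, 1, 1, 2, 2, 3, 3])
# Predicted class probabilities from a hypothetical 4-way classifier.
proba = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.60, 0.20, 0.10, 0.10],
    [0.10, 0.80, 0.05, 0.05],
    [0.20, 0.50, 0.20, 0.10],
    [0.10, 0.10, 0.70, 0.10],
    [0.40, 0.30, 0.20, 0.10],  # misclassified: true class 2, argmax is 0
    [0.10, 0.10, 0.10, 0.70],
    [0.05, 0.05, 0.10, 0.80],
])
y_pred = proba.argmax(axis=1)

acc = accuracy_score(y_true, y_pred)
prec = precision_score(y_true, y_pred, average="macro", zero_division=0)
rec = recall_score(y_true, y_pred, average="macro")
f1 = f1_score(y_true, y_pred, average="macro")
# Multi-class AUC: one ROC per class against the rest, then macro-averaged.
auc = roc_auc_score(y_true, proba, multi_class="ovr", average="macro")
```

Macro averaging weights all four manufacturers equally regardless of how many images each contributes, which is the conservative choice for an imbalanced data set.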
Performance measures for convolutional neural networks pre-trained on ImageNet. All models were trained with data augmentation and evaluated both with and without test-time data augmentation; each architecture therefore appears twice in the table, once per evaluation mode. Shown are averages across 10-fold cross-validation and standard deviation of the mean in parentheses.
| Classifier | Accuracy [%] | Precision | Recall | F1-Score | AUC |
|---|---|---|---|---|---|
| VGG-16 | 74.0 (2.3) | 0.72 (.03) | 0.68 (.02) | 0.69 (.03) | 0.93 (.01) |
| VGG-19 | 76.2 (1.6) | 0.75 (.03) | 0.69 (.03) | 0.70 (.03) | 0.93 (.01) |
| ResNet-50 | 75.4 (1.5) | 0.75 (.02) | 0.70 (.02) | 0.71 (.02) | 0.93 (.01) |
| ResNet-152 | 75.6 (2.0) | 0.73 (.03) | 0.69 (.02) | 0.70 (.03) | 0.92 (.01) |
| NASNet | 80.4 (.8) | 0.80 (.01) | 0.75 (.02) | 0.76 (.02) | 0.94 (.00) |
| DenseNet-201 | 79.6 (.9) | 0.79 (.01) | 0.74 (.02) | 0.74 (.01) | 0.94 (.01) |
| VGG-16 | 75.2 (1.7) | 0.74 (.02) | 0.67 (.03) | 0.68 (.03) | 0.93 (.01) |
| VGG-19 | 76.2 (1.9) | 0.75 (.03) | 0.68 (.02) | 0.69 (.03) | 0.93 (.01) |
| ResNet-50 | 75.2 (1.8) | 0.77 (.02) | 0.67 (.03) | 0.70 (.02) | 0.92 (.01) |
| ResNet-152 | 74.5 (1.4) | 0.71 (.03) | 0.69 (.03) | 0.69 (.03) | 0.91 (.00) |
| NASNet | 78.8 (1.8) | 0.78 (.02) | 0.73 (.03) | 0.73 (.03) | 0.93 (.01) |
| DenseNet-201 | 78.9 (2.0) | 0.79 (.03) | 0.74 (.03) | 0.76 (.03) | 0.93 (.01) |
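Test-time data augmentation, referenced in the caption above, averages a model's predicted probabilities over several transformed views of each image. A minimal sketch with a stand-in predictor (the real study uses trained CNNs; `predict_proba` below is a hypothetical placeholder):

```python
import numpy as np

def predict_proba(image):
    """Stand-in for a trained CNN: maps an image to a 4-way probability
    vector via a softmax over simple image statistics. A real model
    replaces this function entirely."""
    logits = np.array([image.mean(), image.std(), image.max(), image.min()])
    e = np.exp(logits - logits.max())
    return e / e.sum()

def predict_with_tta(image):
    """Average predictions over the original image and a horizontal flip."""
    views = [image, image[:, ::-1]]
    return np.mean([predict_proba(v) for v in views], axis=0)

rng = np.random.default_rng(1)
xray = rng.random((224, 224))   # placeholder for a preprocessed X-ray
probs = predict_with_tta(xray)
```

Averaging valid probability vectors yields another valid probability vector, so the test-time-augmented prediction can be evaluated with exactly the same metrics as the plain one.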
Performance measures for convolutional neural networks without pre-training. Shown are averages across 10-fold cross-validation and standard deviation of the mean in parentheses.
| Classifier | Accuracy [%] | Precision | Recall | F1-Score | AUC |
|---|---|---|---|---|---|
| VGG-16 | 55.6 (1.7) | 0.46 (.02) | 0.42 (.02) | 0.42 (.02) | 0.78 (.01) |
| VGG-19 | 57.0 (1.6) | 0.50 (.03) | 0.43 (.02) | 0.43 (.02) | 0.78 (.01) |
| ResNet-50 | 53.8 (1.7) | 0.39 (.06) | 0.34 (.03) | 0.31 (.04) | 0.74 (.02) |
| ResNet-152 | 53.4 (1.2) | 0.38 (.03) | 0.36 (.02) | 0.34 (.02) | 0.77 (.01) |
| NASNet | 51.8 (1.5) | 0.22 (.04) | 0.29 (.02) | 0.23 (.03) | 0.71 (.02) |
| DenseNet-201 | 54.0 (1.3) | 0.46 (.02) | 0.40 (.02) | 0.39 (.02) | 0.79 (.01) |
| Custom CNN | 56.0 (1.4) | 0.42 (.02) | 0.42 (.02) | 0.41 (.02) | 0.78 (.01) |
Fig. 3. Receiver Operating Characteristic (ROC) curve for the Random Forest.
Fig. 4. Receiver Operating Characteristic (ROC) curve for NASNet.
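For a multi-class problem like this one, ROC curves such as those in Figs. 3 and 4 are typically computed one-vs-rest: the labels are binarized per class and each class's predicted probability is scored against the rest. A sketch with placeholder classifier outputs:

```python
import numpy as np
from sklearn.metrics import roc_curve, auc
from sklearn.preprocessing import label_binarize

y_true = np.array([0, 1, 2, 3, 0, 1, 2, 3])
rng = np.random.default_rng(2)
proba = rng.dirichlet(np.ones(4), size=8)  # placeholder probability outputs

# One ROC curve (and AUC) per manufacturer class, one-vs-rest.
y_bin = label_binarize(y_true, classes=[0, 1, 2, 3])
aucs = {}
for c in range(4):
    fpr, tpr, _ = roc_curve(y_bin[:, c], proba[:, c])
    aucs[c] = auc(fpr, tpr)
```

The macro-averaged AUC reported in the tables is the mean of these per-class values.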
Performance of MLP classifiers trained on features extracted from CNNs pre-trained on ImageNet. Shown are averages across 10-fold cross-validation and standard deviation of the mean in parentheses. All MLPs were trained using data augmentation.
| Features | Accuracy [%] | Precision | Recall | F1-Score | AUC |
|---|---|---|---|---|---|
| VGG-16 | 72.3 (1.0) | 0.77 (.01) | 0.61 (.02) | 0.65 (.02) | 0.90 (.01) |
| VGG-19 | 72.2 (2.0) | 0.78 (.02) | 0.64 (.03) | 0.67 (.03) | 0.91 (.01) |
Performance measures for convolutional neural networks without using any data augmentation. Shown are averages across 10-fold cross-validation and standard deviation of the mean in parentheses.
| Classifier | Accuracy [%] | Precision | Recall | F1-Score | AUC |
|---|---|---|---|---|---|
| VGG-16 | 58.7 (2.5) | 0.54 (.03) | 0.45 (.03) | 0.45 (.04) | 0.81 (.02) |
| VGG-19 | 63.6 (1.6) | 0.61 (.02) | 0.53 (.03) | 0.54 (.03) | 0.84 (.01) |
| ResNet-50 | 59.6 (2.2) | 0.56 (.02) | 0.49 (.02) | 0.49 (.02) | 0.83 (.01) |
| ResNet-152 | 59.5 (1.2) | 0.54 (.03) | 0.47 (.02) | 0.48 (.02) | 0.83 (.01) |
| NASNet | 64.5 (3.4) | 0.62 (.05) | 0.52 (.04) | 0.54 (.04) | 0.85 (.02) |
| DenseNet-201 | 65.9 (2.4) | 0.65 (.03) | 0.55 (.03) | 0.57 (.03) | 0.86 (.02) |
| Custom CNN | 50.8 (2.4) | 0.39 (.04) | 0.32 (.01) | 0.30 (.02) | 0.73 (.01) |
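The gap between this table and the augmented results above shows how much data augmentation matters on a data set this small. A minimal augmentation pass (random flips and small translations) can be sketched in NumPy; real pipelines typically also rotate and rescale, and the exact transforms used in the paper are not specified here.

```python
import numpy as np

def augment(image, rng):
    """Return a randomly flipped and shifted copy of a 2-D image."""
    out = image
    if rng.random() < 0.5:
        out = out[:, ::-1]                  # horizontal flip
    dy, dx = rng.integers(-10, 11, size=2)  # shift by up to 10 pixels
    out = np.roll(out, (int(dy), int(dx)), axis=(0, 1))
    return out

rng = np.random.default_rng(4)
xray = rng.random((224, 224))               # placeholder X-ray image
augmented = [augment(xray, rng) for _ in range(8)]  # 8 extra training views
```

Each augmented copy keeps the original label, effectively multiplying the number of training examples per implant model.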