Jakub Olczak, Niklas Fahlberg, Atsuto Maki, Ali Sharif Razavian, Anthony Jilert, André Stark, Olof Sköldenberg, Max Gordon.
Abstract
Background and purpose - Recent advances in artificial intelligence (deep learning) have shown remarkable performance in classifying non-medical images, and the technology is believed to be the next technological revolution. So far it has never been applied in an orthopedic setting, and in this study we sought to determine the feasibility of using deep learning for skeletal radiographs.
Methods - We extracted 256,000 wrist, hand, and ankle radiographs from Danderyd's Hospital and identified 4 classes: fracture, laterality, body part, and exam view. We then selected 5 openly available deep learning networks that were adapted for these images. The most accurate network was benchmarked against a gold standard for fractures. We furthermore compared the network's performance with that of 2 senior orthopedic surgeons who reviewed images at the same resolution as the network.
Results - All networks exhibited an accuracy of at least 90% when identifying laterality, body part, and exam view. The final accuracy for fractures was estimated at 83% for the best performing network. The network performed similarly to senior orthopedic surgeons when presented with images at the same resolution as the network. Cohen's kappa between the 2 reviewers under these conditions was 0.76.
Interpretation - This study supports the use of artificial intelligence for orthopedic radiographs, where it can perform at a human level. While the current implementation lacks important features that surgeons require, e.g. risk of dislocation, classifications, measurements, and combining multiple exam views, these problems have technical solutions that are waiting to be implemented for orthopedics.
Year: 2017 PMID: 28681679 PMCID: PMC5694800 DOI: 10.1080/17453674.2017.1344459
Source DB: PubMed Journal: Acta Orthop ISSN: 1745-3674 Impact factor: 3.717
Figure 1. Two images from the dataset. The area within the red box is the section presented to the network in order to classify the image. The left image shows a wrist fracture, while the right image shows no apparent fracture.
Raw image and label data for a total of 256,458 images. 70% were reserved for training, 20% for validation, and 10% for testing.
| Label | n (%) |
|---|---|
| Fracture | |
| No | 111,275 (43) |
| Yes | 143,183 (56) |
| Missing | 2,000 (1) |
| Side | |
| Left | 120,377 (47) |
| Right | 132,511 (52) |
| Missing | 3,570 (1) |
| Exam view | |
| Distal | 7,136 (3) |
| AP | 55,916 (22) |
| Oblique | 44,962 (18) |
| Proximal | 6,776 (3) |
| Radial | 6,946 (3) |
| Lateral | 67,465 (26) |
| Ulnar | 7,014 (3) |
| Missing | 60,243 (24) |
| Exam body part | |
| Finger | 390 (0.2) |
| Thumb | 76 (0) |
| Scaphoid | 27,962 (11) |
| Hand | 5,614 (2) |
| Wrist | 65,264 (25) |
| Ankle | 98,002 (38) |
| Missing | 59,150 (23) |
3 different types
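The 70/20/10 division described in the table caption above could be sketched as follows. This is only an illustration: the function name `split_indices`, the fixed seed, and the per-image random shuffle are assumptions. The source does not say whether the split was made per image, per exam, or per patient.

```python
import random


def split_indices(n_items, fractions=(0.7, 0.2, 0.1), seed=0):
    """Shuffle item indices and cut them into train/validation/test parts.

    Hypothetical helper: the source only reports the 70/20/10 proportions.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = int(n_items * fractions[0])
    n_val = int(n_items * fractions[1])
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder, so no image is dropped
    return train, val, test


# 256,458 is the total number of images reported in the table caption.
train, val, test = split_indices(256_458)
print(len(train), len(val), len(test))
```

When several images come from the same patient or exam, splitting on a patient or exam identifier instead of on single images avoids near-duplicate images landing in both the training and the test set.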
Figure 2. Performance of the 5 networks. An epoch is 1 pass over all images.
Observer fracture outcome compared with gold standard
| Category | Accuracy (%) | 95% CI (%) |
|---|---|---|
| Labels | 83 | 79–87 |
| VGG 16 layers | 83 | 80–87 |
| Reviewer 1 | 82 | 78–86 |
| Reviewer 2 | 82 | 78–85 |
4 labels were missing outcome and were excluded from the analysis for this category.
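A 95% confidence interval for an accuracy such as those in the table above can be approximated by treating accuracy as a single proportion. The sketch below uses the Wilson score interval, which is a swapped-in standard method: the source does not state how its intervals were computed. The counts (332 correct out of 400) are hypothetical values chosen to give 83%.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts: 332 correct classifications out of 400 images (83%).
low, high = wilson_interval(332, 400)
print(f"accuracy 83%, 95% CI {100 * low:.1f}-{100 * high:.1f}%")
```

The interval narrows roughly with the square root of the number of reviewed images, so quadrupling the sample would about halve its width.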
Outcomes compared between observers. Accuracy is the percentage of outcomes where both observers agree, presented with Cohen’s kappa
| Observer | Label | Network | Reviewer 1 | Reviewer 2 | Gold standard |
|---|---|---|---|---|---|
| Label | – | 80 (0.6) | 76 (0.5) | 74 (0.5) | 83 (0.7) |
| Network | 80 (0.6) | – | 84 (0.7) | 86 (0.7) | 83 (0.7) |
| Reviewer 1 | 76 (0.5) | 84 (0.7) | – | 90 (0.8) | 82 (0.6) |
| Reviewer 2 | 74 (0.5) | 86 (0.7) | 90 (0.8) | – | 82 (0.6) |
| Gold standard | 83 (0.7) | 83 (0.7) | 82 (0.6) | 82 (0.6) | – |
4 labels were missing outcome and were excluded from the analysis for this category.
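Cohen's kappa, reported alongside raw agreement in the table above, corrects the observed agreement for the agreement expected by chance given each observer's label frequencies. A minimal sketch of the calculation follows. The function name `cohens_kappa` and the toy ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two observers rating the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the observers' marginal label frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    labels = count_a.keys() | count_b.keys()
    expected = sum(count_a[c] * count_b[c] for c in labels) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Toy data (hypothetical): 1 = fracture, 0 = no fracture, 100 images.
# 40 both "fracture", 50 both "no fracture", 5 + 5 disagreements.
observer_a = [1] * 40 + [0] * 50 + [1] * 5 + [0] * 5
observer_b = [1] * 40 + [0] * 50 + [0] * 5 + [1] * 5
kappa = cohens_kappa(observer_a, observer_b)
print(f"agreement 90%, kappa {kappa:.2f}")
```

Because kappa subtracts chance agreement, two observer pairs with the same raw agreement can have different kappa values when their label frequencies differ.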
Manual review of classifications where the network failed
| Error | n (%) |
|---|---|
| Fracture | |
| Correctly classified | 276 (69) |
| Misclassified | 124 (31) |
| Laterality | |
| Correct laterality | 52 (26) |
| Misclassified | 8 (4) |
| Marker missing | 140 (70) |
| Body part | |
| Correct body part | 17 |
| Related body part | 51 |
| Unrelated body part | 15 |
| Invalid image | 3 |
| Exam view | |
| Correct view | 110 (55) |
| Misclassified | 90 (45) |
| Misclassified, unrelated view | 12 (6) |
| Misclassified, closely related view | 78 (39) |
| Closely related, ankle: mix-up between AP and mortise | 22 (11) |
| Closely related, ankle: mix-up between oblique and lateral | 23 (12) |
| Closely related, scaphoid: mix-up between supination and pronation | 14 (7) |
| Closely related, scaphoid: mix-up between distal and proximal | 7 (4) |
| Closely related, miscellaneous | 12 (6) |