E A Murphy, B Ehrhardt, C L Gregson, O A von Arx, A Hartley, M R Whitehouse, M S Thomas, G Stenhouse, T J S Chesser, C J Budd, H S Gill.
Abstract
Hip fractures are a major cause of morbidity and mortality in the elderly, and incur high health and social care costs. Given projected population ageing, the number of incident hip fractures is predicted to increase globally. As fracture classification strongly determines the chosen surgical treatment, differences in fracture classification influence patient outcomes and treatment costs. We aimed to create a machine learning method for identifying and classifying hip fractures, and to compare its performance to experienced human observers. We used 3659 hip radiographs, classified by at least two expert clinicians. The machine learning method was able to classify hip fractures with 19% greater accuracy than humans, achieving overall accuracy of 92%.
Year: 2022 PMID: 35136091 PMCID: PMC8825848 DOI: 10.1038/s41598-022-06018-9
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Hip fracture types.
Figure 2. Performance assessment of CNN1 based on the Jaccard index J, which measures the agreement between two images. J = 0 means no agreement and J = 1 means total agreement; J > 0.5 is considered good agreement.
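The Jaccard index in the Figure 2 caption can be illustrated with a minimal sketch. It assumes the two images are equal-sized binary masks (for example, a predicted region and a reference region); the function name `jaccard_index`, the toy masks, and the convention for two empty masks are illustrative assumptions, not taken from the paper.

```python
def jaccard_index(mask_a, mask_b):
    """Jaccard index J = |A intersect B| / |A union B| for two binary masks.

    Masks are nested lists of 0/1 values with identical shape (assumption).
    """
    if len(mask_a) != len(mask_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(mask_a, mask_b)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1

    # Assumed convention: two empty masks are treated as total agreement.
    return 1.0 if union == 0 else intersection / union


# Hypothetical toy masks: two 2x2 regions that overlap in one column.
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
reference = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]

j = jaccard_index(predicted, reference)
print(f"J = {j:.2f}")  # 2 shared pixels out of 6 in the union
```

In this toy case J falls below the 0.5 threshold that the caption treats as good agreement.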
Figure 3. Expert fracture classification process and agreement for Dataset 2.
CNN2 performance assessment.
| Predicted | Actual: No fracture | Actual: Trochanteric | Actual: Intracapsular | Total |
|---|---|---|---|---|
| No fracture | 304 | 12 | 13 | 329 |
| Trochanteric | 1 | 169 | 6 | 176 |
| Intracapsular | 15 | 14 | 198 | 227 |
| Total | 320 | 195 | 217 | |
| Precision | 0.92 | 0.96 | 0.87 | |
| Precision 95% CI | 0.89 to 0.95 | 0.92 to 0.98 | 0.82 to 0.91 | |
| Recall | 0.95 | 0.87 | 0.91 | |
| Recall 95% CI | 0.92 to 0.97 | 0.81 to 0.91 | 0.87 to 0.95 | |
| F1 | 0.94 | 0.91 | 0.89 | |
Precision = (number correctly predicted as class A)/(number predicted as class A). Recall = (number correctly predicted as class A)/(number actually of class A). F1 varies from 1 = perfect classifier for class A, to 0 = no image was correctly identified as class A.
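As a check on these definitions, the sketch below recomputes the per-class metrics from the confusion-matrix counts in the table. It takes rows as predicted classes and columns as actual classes, which is the orientation that reproduces the reported precision and recall. F1 is computed with the standard harmonic-mean definition, which the table note does not spell out; the function and variable names are illustrative.

```python
CLASSES = ["No fracture", "Trochanteric", "Intracapsular"]

# Rows = predicted class, columns = actual class (counts from the table).
CONFUSION = [
    [304, 12, 13],
    [1, 169, 6],
    [15, 14, 198],
]


def per_class_metrics(matrix):
    """Return a list of (precision, recall, f1) tuples, one per class."""
    n = len(matrix)
    results = []
    for k in range(n):
        correct = matrix[k][k]
        predicted_as_k = sum(matrix[k])                   # row total
        actually_k = sum(matrix[r][k] for r in range(n))  # column total
        precision = correct / predicted_as_k if predicted_as_k else 0.0
        recall = correct / actually_k if actually_k else 0.0
        # Standard F1: harmonic mean of precision and recall.
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        results.append((precision, recall, f1))
    return results


def overall_accuracy(matrix):
    """Fraction of all images on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[k][k] for k in range(len(matrix))) / total


for name, (p, r, f1) in zip(CLASSES, per_class_metrics(CONFUSION)):
    print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
print(f"Overall accuracy: {overall_accuracy(CONFUSION):.2f}")
```

The diagonal sums to 671 of 732 images, which rounds to the 92% overall accuracy quoted in the abstract.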
Figure 4. Receiver operating characteristic (ROC) curves illustrating the trade-off between true-positive and false-positive rates for the three classes of hip fracture, as predicted by CNN2. AUC = area under the curve, given with its 95% confidence interval (CI).
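For readers unfamiliar with ROC curves, the following sketch shows how the curve and its AUC can be obtained for one class treated one-vs-rest. The labels and scores are hypothetical toy values, not data from the paper, and the trapezoidal AUC is a generic approach rather than the authors' stated procedure.

```python
def roc_points(labels, scores):
    """Sweep a decision threshold from high to low and collect (FPR, TPR).

    labels: 1 for the class of interest, 0 for every other class.
    scores: classifier confidence for the class of interest.
    Assumes at least one positive and one negative example.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        threshold = scores[order[i]]
        # Handle tied scores together so they yield a single ROC point.
        while i < len(order) and scores[order[i]] == threshold:
            if labels[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical toy example: 3 positives and 3 negatives.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

curve = roc_points(toy_labels, toy_scores)
print(f"AUC = {auc(curve):.3f}")  # 8 of 9 positive/negative pairs ranked correctly
```

Repeating this for each class in turn gives one curve per class, as in the figure.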
Figure 5. Activation maps for representative examples of the No fracture, Trochanteric and Intracapsular classes. Dark red indicates regions of high contribution and dark blue indicates regions of low contribution. Custom Python code, based on the code provided by Selvaraju et al.[41] (downloaded from GitHub, https://github.com/ramprs/grad-cam), was used to generate the activation maps.
Ground-truth classification according to musculoskeletal experts for Dataset 2.
| Class | NHFD subclass | Subtotal | Total |
|---|---|---|---|
| Intracapsular | Displaced | 864 | 1089 |
| | Undisplaced | 207 | |
| | Unable to determine subclass | 18 | |
| Trochanteric | Grade A1/A2 | 818 | 993 |
| | Grade A3 | 151 | |
| | Unable to determine subclass | 24 | |
| Subtrochanteric | | | 114 |
| Unfractured | | | 1603 |
| Not classifiable | | | 168 |