Nitchanant Kitcharanant, Pojchong Chotiyarnwong, Thiraphat Tanphiriyakun, Ekasame Vanitcharoenkul, Chantas Mahaisavariya, Wichian Boonyaprapa, Aasis Unnanuntana
Abstract
BACKGROUND: Fragility hip fracture increases morbidity and mortality in older adult patients, especially within the first year. Identification of patients at high risk of death facilitates modification of associated perioperative factors that can reduce mortality. Various machine learning algorithms have been developed and are widely used in healthcare research, particularly for mortality prediction. This study aimed to develop and internally validate 7 machine learning models to predict 1-year mortality after fragility hip fracture.
Keywords: Fragility hip fracture; Machine learning; Mortality prediction
Year: 2022 PMID: 35610589 PMCID: PMC9131628 DOI: 10.1186/s12877-022-03152-x
Source DB: PubMed Journal: BMC Geriatr ISSN: 1471-2318 Impact factor: 4.070
Fig. 1 Machine learning development process. (a) Three continuous and 11 categorical predictors of 1-year mortality were entered into the computational process. (b) Stratified random sampling was used to split patients in a 70:30 ratio into a training dataset and a testing dataset. (c) The training dataset was used to identify the optimal hyperparameters, defined as those giving the highest accuracy in fivefold internal cross-validation of each model. (d) The performance of all algorithms was evaluated on the separate, unseen testing dataset
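Step (b) of the figure can be illustrated with a small sketch. This is not the authors' code; it is a minimal standard-library Python example of a stratified 70:30 split. The helper name `stratified_split`, the per-class rounding rule, and the seed value of 8 are assumptions made for illustration. The class counts (62 deaths among 492 patients) are taken from the tables in this article.

```python
import random


def stratified_split(labels, test_fraction=0.30, seed=8):
    """Split indices into train/test while preserving class proportions.

    Hypothetical helper: each class is shuffled separately and the same
    fraction of every class is assigned to the testing set.
    """
    rng = random.Random(seed)  # seed value is an assumption
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)  # assumed rounding rule
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Outcome labels with the reported class counts:
# 62 deceased (1) and 430 survived (0) among 492 patients.
labels = [1] * 62 + [0] * 430
train_idx, test_idx = stratified_split(labels)

train_deaths = sum(labels[i] for i in train_idx)
test_deaths = sum(labels[i] for i in test_idx)
print(len(train_idx), len(test_idx), train_deaths, test_deaths)
# prints: 344 148 43 19
```

Under these assumptions the split gives 43 deaths in the training set and 19 in the testing set, which is consistent with the death counts reported for the two groups in the baseline characteristics table.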
Comparison of the demographic and clinical characteristics of all patients, and of those in the training and testing groups
| Patient characteristics | Total (n = 492) | Testing (n = 148) | Training (n = 344) | P value |
|---|---|---|---|---|
| Age (years), mean ± SD | 78.4 ± 9.8 | 78.0 ± 10.1 | 78.6 ± 9.7 | 0.511 |
| Female sex, n (%) | 355 (72.2%) | 104 (70.3%) | 251 (73.0%) | 0.584 |
| Body mass index (kg/m²), mean ± SD | 22.3 ± 3.9 | 22.2 ± 4.1 | 22.4 ± 3.9 | 0.634 |
| Charlson comorbidity index (CCI) score, n (%) | | | | |
| - < 3 | 39 (7.9%) | 13 (8.8%) | 26 (7.6%) | 0.716 |
| - ≥ 3 | 453 (92.1%) | 135 (91.2%) | 318 (92.4%) | |
| Pre-injury ambulatory status, n (%) | | | | |
| - Bedridden | 15 (3.0%) | 7 (4.7%) | 8 (2.3%) | 0.556 |
| - Indoor dependent | 43 (8.7%) | 10 (6.8%) | 33 (9.6%) | |
| - Outdoor dependent | 16 (3.3%) | 5 (3.4%) | 11 (3.2%) | |
| - Indoor independent | 145 (29.5%) | 45 (30.4%) | 100 (29.1%) | |
| Assistive device, n (%) | | | | |
| - No ambulation | 15 (3.0%) | 7 (4.7%) | 8 (2.3%) | 0.839 |
| - Without assistive device | 259 (52.6%) | 76 (51.4%) | 183 (53.2%) | |
| - Wheelchair | 10 (2.0%) | 2 (1.4%) | 8 (2.3%) | |
| - Walker | 91 (18.5%) | 28 (18.9%) | 63 (18.3%) | |
| - Quad cane | 3 (0.6%) | 1 (0.7%) | 2 (0.6%) | |
| - Tripod cane | 27 (5.5%) | 9 (6.1%) | 18 (5.2%) | |
| - Single cane | 87 (17.7%) | 25 (16.9%) | 62 (18.0%) | |
| Type of fracture, n (%) | | | | |
| - Femoral neck fracture | 248 (50.4%) | 77 (52.0%) | 171 (49.7%) | 0.883 |
| - Intertrochanteric fracture | 241 (49.0%) | 70 (47.3%) | 171 (49.7%) | |
| - Subtrochanteric fracture | 3 (0.6%) | 1 (0.7%) | 2 (0.6%) | |
| Treatment, n (%) | | | | |
| - Conservative treatment | 32 (6.5%) | 11 (7.4%) | 21 (6.1%) | 0.791 |
| - Dynamic hip screw | 36 (7.3%) | 12 (8.1%) | 24 (7.0%) | |
| - Cephalomedullary nailing | 202 (41.1%) | 56 (37.8%) | 146 (42.4%) | |
| - Multiple screw fixation | 19 (3.9%) | 8 (5.4%) | 11 (3.2%) | |
| - Hemiarthroplasty | 195 (39.6%) | 59 (39.9%) | 136 (39.5%) | |
| - Total hip arthroplasty | 8 (1.6%) | 2 (1.4%) | 6 (1.7%) | |
| Comorbidities, n (%) | | | | |
| - Chronic kidney disease stage 4 or severe | 130 (26.4%) | 37 (25.0%) | 93 (27.0%) | 0.658 |
| - Heart disease | 123 (25.0%) | 44 (29.7%) | 79 (23.0%) | 0.114 |
| - Cerebrovascular accident | 103 (20.9%) | 35 (23.6%) | 68 (19.8%) | 0.336 |
| - Lung disease | 34 (6.9%) | 13 (8.8%) | 21 (6.1%) | 0.332 |
| - Dementia | 81 (16.5%) | 16 (10.8%) | 65 (18.9%) | 0.033 |
| Time to surgery | ||||
| - ≤ 48 h | 233 (50.7%) | 69 (50.4%) | 164 (50.8%) | 1.000 |
| - > 48 h | 227 (49.3%) | 68 (49.6%) | 159 (49.2%) | |
| Death, n (%) | 62 (12.6%) | 19 (12.8%) | 43 (12.5%) | 1.000 |
A P value < 0.05 indicates statistical significance
Abbreviation: SD Standard deviation
Comparison of the demographic and clinical characteristics of patients who died and those who survived
| Patient characteristics | Deceased (n = 62) | Survived (n = 430) | P value |
|---|---|---|---|
| Age (years), mean ± SD | 81.5 ± 8.5 | 77.9 ± 9.9 | |
| Male sex, n (%) | 25 (40.3%) | 112 (26.0%) | |
| Body mass index (kg/m²), mean ± SD | 21.8 ± 4.2 | 22.4 ± 3.9 | 0.123 |
| Charlson comorbidity index (CCI), n (%) | | | |
| - < 3 | 0 (0.0%) | 39 (9.1%) | |
| - ≥ 3 | 62 (100.0%) | 391 (90.9%) | |
| Pre-injury ambulatory status, n (%) | | | |
| - Bedridden | 1 (1.6%) | 14 (3.3%) | |
| - Indoor dependent | 10 (16.1%) | 33 (7.7%) | |
| - Outdoor dependent | 1 (1.6%) | 15 (3.5%) | |
| - Indoor independent | 24 (38.7%) | 121 (28.1%) | |
| - Outdoor independent | 26 (42.0%) | 247 (57.4%) | |
| Assistive device, n (%) | | | |
| - No ambulation | 1 (1.6%) | 14 (3.3%) | 0.421 |
| - Without assistive device | 26 (41.9%) | 233 (54.2%) | |
| - Wheelchair | 2 (3.2%) | 8 (1.9%) | |
| - Walker | 17 (27.4%) | 74 (17.2%) | |
| - Quad cane | 0 (0.0%) | 3 (0.7%) | |
| - Tripod cane | 4 (6.5%) | 23 (5.3%) | |
| - Single cane | 12 (19.4%) | 75 (17.4%) | |
| Type of fracture, n (%) | | | |
| - Femoral neck fracture | 20 (32.3%) | 228 (53.0%) | |
| - Intertrochanteric fracture | 42 (67.7%) | 199 (46.3%) | |
| - Subtrochanteric fracture | 0 (0.0%) | 3 (0.7%) | |
| Treatment, n (%) | | | |
| - Conservative treatment | 13 (21.0%) | 19 (4.4%) | |
| - Dynamic hip screw | 5 (8.1%) | 31 (7.2%) | |
| - Cephalomedullary nailing | 27 (43.5%) | 175 (40.7%) | |
| - Multiple screw fixation | 2 (3.2%) | 17 (4.0%) | |
| - Hemiarthroplasty | 15 (24.2%) | 180 (41.8%) | |
| - Total hip arthroplasty | 0 (0.0%) | 8 (1.9%) | |
| Time to surgery | |||
| - ≤ 48 h | 19 (38.8%) | 214 (52.1%) | 0.096 |
| - > 48 h | 30 (61.2%) | 197 (47.9%) | |
| Comorbidities, n (%) | | | |
| - Chronic kidney disease stage 4 or severe | 44 (71.0%) | 86 (20.0%) | |
| - Heart disease | 41 (66.1%) | 82 (19.1%) | |
| - Cerebrovascular accident | 24 (38.7%) | 79 (18.4%) | |
| - Lung disease | 17 (27.4%) | 17 (4.0%) | |
| - Dementia | 27 (43.5%) | 54 (12.6%) | |
A P value < 0.05 indicates statistical significance
Abbreviation: SD Standard deviation
Comparison of the performance of each model, by confusion matrix and evaluation measures
| Models (actual outcome) | Testing dataset, predicted deceased | Testing dataset, predicted survived | Accuracy [95% CI] | P value | Positive predictive value [95% CI] | P value | Negative predictive value [95% CI] | P value | Specificity [95% CI] | P value | Sensitivity [95% CI] | P value | AUC [95% CI] | P value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Deceased | 13 | 6 | 0.95 [0.91–0.98] | | 0.93 [0.66–1.00] | | 0.96 [0.91–0.98] | | 0.99 [0.96–1.00] | | 0.68 [0.43–0.87] | | 0.99 [0.95–1.00] | |
| Survived | 1 | 128 | | | | | | | | | | | | |
| Deceased | 11 | 8 | 0.93 [0.88–0.97] | 0.11 | 0.85 [0.55–0.98] | 0.49 | 0.94 [0.89–0.97] | 0.31 | 0.98 [0.95–1.00] | 0.56 | 0.58 [0.33–0.80] | 0.32 | 0.98 [0.94–1.00] | 0.47 |
| Survived | 2 | 127 | | | | | | | | | | | | |
| Deceased | 13 | 6 | 0.94 [0.89–0.97] | 0.5 | 0.81 [0.54–0.96] | 0.32 | 0.95 [0.90–0.98] | 0.95 | 0.98 [0.93–1.00] | 0.32 | 0.68 [0.43–0.87] | 1 | 0.92 [0.83–1.00] | |
| Survived | 3 | 126 | | | | | | | | | | | | |
| Deceased | 7 | 12 | 0.91 [0.85–0.95] | | 0.88 [0.47–1.00] | 0.69 | 0.91 [0.86–0.95] | | 0.99 [0.96–1.00] | 1 | 0.37 [0.16–0.62] | | 0.95 [0.88–1.00] | |
| Survived | 1 | 128 | | | | | | | | | | | | |
| Deceased | 7 | 12 | 0.89 [0.83–0.94] | | 0.64 [0.31–0.89] | | 0.91 [0.85–0.95] | | 0.97 [0.92–0.99] | 0.18 | 0.37 [0.16–0.62] | | 0.91 [0.82–1.00] | |
| Survived | 4 | 125 | | | | | | | | | | | | |
| Deceased | 5 | 14 | 0.90 [0.84–0.94] | | 0.83 [0.36–1.00] | 0.58 | 0.90 [0.84–0.95] | | 0.99 [0.96–1.00] | 1 | 0.26 [0.09–0.51] | | 0.94 [0.86–1.00] | |
| Survived | 1 | 128 | | | | | | | | | | | | |
| Deceased | 5 | 14 | 0.90 [0.84–0.94] | | 0.83 [0.36–1.00] | 0.58 | 0.90 [0.84–0.95] | | 0.99 [0.96–1.00] | 1 | 0.26 [0.09–0.51] | | 0.81 [0.68–0.93] | |
| Survived | 1 | 128 | | | | | | | | | | | | |
A P value < 0.05 indicates statistical significance
Abbreviation: CI Confidence interval
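The point estimates in the performance table can be reproduced from the confusion-matrix counts. The sketch below is illustrative rather than the authors' code. It assumes that each "Deceased"/"Survived" row pair gives the actual outcome and that the two count columns give the predicted class (deceased first, then survived). The function name `confusion_metrics` is hypothetical. The counts used are those of the first model listed in the table.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Compute standard evaluation measures from a 2x2 confusion matrix.

    tp: deceased predicted as deceased   fn: deceased predicted as survived
    fp: survived predicted as deceased   tn: survived predicted as survived
    """
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "positive predictive value": tp / (tp + fp),
        "negative predictive value": tn / (tn + fn),
        "specificity": tn / (tn + fp),
        "sensitivity": tp / (tp + fn),
    }


# First model in the table: actual deceased row (13, 6),
# actual survived row (1, 128). The row/column reading is an assumption.
metrics = confusion_metrics(tp=13, fn=6, fp=1, tn=128)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
# accuracy: 0.95
# positive predictive value: 0.93
# negative predictive value: 0.96
# specificity: 0.99
# sensitivity: 0.68
```

These values agree with the first five point estimates reported for that model, which supports the assumed reading of the rows and columns. The confidence intervals and the area under the ROC curve cannot be derived from the four counts alone and are not reproduced here.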
Fig. 2 Receiver operating characteristic (ROC) curves of the (a) Random Forests (RF) algorithm; (b) Gradient Boosting (GB) algorithm; (c) Artificial Neural Network (ANN) algorithm; (d) Logistic Regression (LR) algorithm; (e) Naive Bayes (NB) algorithm; (f) Support Vector Machine (SVM) algorithm; (g) K-Nearest Neighbors (KNN) algorithm; and (h) all algorithms
Fig. 3 Characteristics of the selected model (Random Forests model): SHAP value summary graph of the top 20 variables and their impact on the prediction
The best-tuned hyperparameters for each model
| Classifier models | Hyperparameters |
|---|---|
| Gradient Boosting | max_depth = 10, max_features = 'sqrt', min_samples_split = 50, n_estimators = 800, random_state = 8, learning_rate = 0.5, subsample = 0.5 |
| Random Forests | max_depth = 60, max_features = 'sqrt', min_samples_split = 5, min_samples_leaf = 4, n_estimators = 400, random_state = 8 |
| Artificial Neural Network | activation = 'identity', alpha = 0.0001, batch_size = 'auto', hidden_layer_sizes = 7, learning_rate = 'adaptive', learning_rate_init = 0.001, max_iter = 500, solver = 'lbfgs' |
| Logistic Regression | C = 0.4, multi_class = 'multinomial', random_state = 8, solver = 'saga' |
| Naive Bayes | alpha = 1.0, fit_prior = True, class_prior = None |
| Support Vector Machine | C = 0.1, degree = 4, kernel = 'poly', probability = True, random_state = 8 |
| K-Nearest Neighbors | n_neighbors = 3 |