Literature DB >> 29893799

Estimating an Individual's Probability of Revision Surgery After Knee Replacement: A Comparison of Modeling Approaches Using a National Data Set.

Parham Aram1, Lea Trela-Larsen2, Adrian Sayers2,3, Andrew F Hills1, Ashley W Blom2, Eugene V McCloskey4,5, Visakan Kadirkamanathan1, Jeremy M Wilkinson4,5.   

Abstract

Tools that provide personalized risk prediction of outcomes after surgical procedures help patients make preference-based decisions among the available treatment options. However, it is unclear which modeling approach provides the most accurate risk estimation. We constructed and compared several parametric and nonparametric models for predicting prosthesis survivorship after knee replacement surgery for osteoarthritis. We used 430,455 patient-procedure episodes between April 2003 and September 2015 from the National Joint Registry for England, Wales, Northern Ireland, and the Isle of Man. The flexible parametric survival and random survival forest models most accurately captured the observed probability of remaining event-free. The concordance index for the flexible parametric model was the highest (0.705, 95% confidence interval (CI): 0.702, 0.707) for total knee replacement and was 0.639 (95% CI: 0.634, 0.643) for unicondylar knee replacement and 0.589 (95% CI: 0.586, 0.592) for patellofemoral replacement. The observed-to-predicted ratios for both the flexible parametric and the random survival forest approaches indicated that models tended to underestimate the risks for most risk groups. Our results show that the flexible parametric model has a better overall performance compared with other tested parametric methods and has better discrimination compared with the random survival forest approach.

Entities:  

Mesh:

Substances:

Year:  2018        PMID: 29893799      PMCID: PMC6166214          DOI: 10.1093/aje/kwy121

Source DB:  PubMed          Journal:  Am J Epidemiol        ISSN: 0002-9262            Impact factor:   4.897


Shared decision-making between patient and doctor is fundamental to good clinical practice (1, 2) and improves patient knowledge about medical treatments and their associated benefits and risks (3). Decision aids fill the gap between population-level data and its application to the patient’s individual circumstances to better inform patients making choices about health-care interventions (4–6). The use of decision aids in controlled settings enhances patient participation in the process, improves their knowledge and satisfaction, and reduces decisional conflict (1, 7–9). Patient engagement through shared decision-making reduces inequalities in health between patient groups and benefits health-care economies through improved clinical outcomes and better resource utilization (10). Osteoarthritis is the most prevalent musculoskeletal disease and is a leading cause of chronic pain and disability worldwide (11–13). In the United Kingdom alone, 9 million people currently seek treatment for osteoarthritis with a total indirect cost to the economy of £14.8 billion (approximately $19.6 billion) per annum (14, 15). Each year almost 100,000 individuals undergo knee replacement surgery in England and Wales (16), with a direct cost of £546 million (approximately $722 million) for the inpatient stay alone (14). The National Joint Registry for England, Wales, Northern Ireland, and the Isle of Man (NJR) (http://www-new.njrcentre.org.uk/) was established in 2003 to collect audit data on all total hip and knee replacement surgery in these regions, for which it has a completeness rate of 97% (17). Evidence-based decision-making in the setting of joint replacement surgery, where such decisions are preference-sensitive (6, 18), enable the patient to arrive at an informed choice among several alternative treatments (5). The development of a personalized decision aid in this setting requires the generation of a time-to-event model that incorporates individual characteristics, prosthesis choice, and other fixed and modifiable risk factors. The choice of such models is potentially large, including semiparametric Cox models, parametric survival models, flexible parametric survival models (FPMs), and random survival forests (RSFs). These models can be adapted to provide an estimate of the absolute risk of the outcome of interest for each individual. We used the NJR data set to assess the performance of these methods for individual prediction of the risk of prosthesis revision over an 8-year interval after knee replacement.

METHODS

Study population

Our base data set included 787,106 knee replacements carried out in England and Wales between April 2003 and September 2015. We excluded procedures for which osteoarthritis was not the only indication for surgery (29,918), the patient’s body mass index (BMI, calculated as height (h)/weight (m)2) was below 15 or above 55 (2,485), the patient was younger than 30 years or older than 100 years (262), or the American Society of Anesthesiologists grade was 4 or 5 (2,782), indicating severe comorbidities. We conducted a complete-case analysis and excluded procedures with missing data on any of the study covariates, namely: BMI (316,828 missing), knee replacement procedures (10,648 missing), and chemical and mechanical thromboprophylaxis (1,589 missing). This resulted in 430,455 cases with complete information. Separate models were constructed for each of the procedures being considered—total knee replacement (TKR), unicondylar knee replacement (UKR), or patellofemoral replacement (PFR)—due to differences in survival performance characteristics of the different prosthesis categories (19).

Outcome and covariates

The outcome of interest in our time-to-event models was time to first revision surgery. We linked primary knee replacement procedures to revision procedures recorded in the NJR using a unique patient identifier and side (left or right knee). Patient death with a nonrevised prosthesis was considered to be a censoring event. Analysis covariates included age, BMI, sex, American Society of Anesthesiologists grade, chemical and mechanical thromboprophylaxis, and operation type (unilateral/same-day bilateral), based on their known association with prosthesis revision (19–21). The revision of each side of both simultaneous and sequential bilateral procedures was considered independently, with separate times to event for each side. Sequential bilateral procedures performed on different dates were considered to be independent unilateral operations. Previous research has shown that ignoring the potential dependence between procedures in the same patient does not lead to bias (22).

Modeling approaches

In standard parametric methods a distribution for time-to-event data is assumed where the unknown parameters are inferred using the maximum likelihood estimation. Here, we considered exponential, Weibull, and log-logistic distributions. The exponential distribution is defined by a single scale parameter and assumes a constant hazard over time. The Weibull distribution is a 2-parameter distribution with scale and shape parameters producing increasing (shape parameter >1) and decreasing (shape parameter <1) monotonic hazard functions (23). The Weibull and exponential models are proportional hazards models. The 2-parameter log-logistic model is a proportional odds model that can produce a decreasing monotonic (shape parameter ≤1) or unimodal (shape parameter >1) hazard function, depending on the shape parameter (24). If the estimation of the time-to-event distribution itself is not required, the semiparametric Cox model can be used to estimate the effect of covariates on the baseline hazard function. The Cox model assumes proportional hazards and can be fitted by maximizing a partial likelihood function (25, 26). The standard parametric models explained above place specific constraints on the shape of the hazard function. The FPM offers an alternative approach such that restrictions on the shape of the hazard function are relaxed (27). In this approach the baseline cumulative hazard or odds function is modeled as a flexible function of log time using restricted cubic splines. Restricted cubic splines are piecewise third-order polynomials that are smoothly joined together at break points or knots (28). The complexity of the baseline distribution is determined by the number and position of knots in the spline function. Optimal placement of knots is not essential; thus a simple centile-based approach can be adopted (28). The model is fitted with either a proportional hazards or odds assumption using maximum likelihood estimation. The RSF algorithm (29) is a machine learning tool for modeling time-to-event data and is an extension of random forest classifiers and regressors introduced by Breiman (30). The RSF is a distribution-free method, and its tree-based architecture can take possible interaction effects into account through hierarchical splitting. The RSF approach also accounts for nonlinearity by dichotomizing continuous variables at split points (29). In RSF B bootstraps are drawn from the original data set, and each bootstrap sample is used as a root node to grow a survival tree. A subset of covariates is randomly selected at each node of the tree. The node is then split into 2 left and right daughter nodes using a covariate that gives the maximum survival difference between daughter nodes. This can be done through a measure of separation such as the log-rank test (31–33). For continuous covariates splits over all possible values are considered, and an optimal cutoff is then chosen. The tree is grown until each terminal node contains at least a prespecified number of unique cases. For every tree the cumulative hazard function for each terminal node can be calculated using the Nelson-Aalen estimator (34, 35). This gives a series of estimators that correspond to different terminal nodes that define the cumulative hazard function for the tree. The estimated tree’s hazard function for an individual is the Nelson-Aalen estimator for the individual’s terminal node, and an average cumulative hazard function is calculated across all trees in the random forest. It is recommended that between 64 and 128 trees be used to achieve a balance between model performance, processing time, and memory use (36).

Overall model performance

We used the Akaike information criterion (AIC), a measure that compromises between goodness-of-fit and model complexity (37), to provide an overall measure of the performance of the parametric models. We also compared model predictions by averaging the time-to-event estimates for individuals at each time point and comparing with the population-based estimation (Kaplan-Meier).

Model validation

We applied repeated m-fold cross-validation to measure the performance of candidate models’ overall predictive value, discrimination ability, and calibration (38). In m-fold cross-validation, the data set is randomly assigned into m partitions of approximately equal size. The model is then constructed m times using m − 1 of the partitions and tested on the remaining part of the data. The m test results are then averaged to compute an overall performance measure. This ensures that all available data is used for training and testing the models. In repeated cross-validation the above procedure is performed several times. This reduces the variation of the m-fold cross-validation due to the random partitioning (39) and also allows the computation of confidence intervals for performance measures. Overall validation performance: We evaluated the overall performance of models using the time-dependent Brier score, a commonly used tool in clinical outcomes analysis (40). The Brier score is a proper score function that evaluates the accuracy of probabilistic forecasts, and is calculated as the weighted average of squared distances between the observed outcome and predicted probability of that outcome at fixed time points (41). The weights are introduced to incorporate information from censored data and calculated using a model for either marginal or conditional censoring distribution. Time-dependent Brier scores can be integrated over time to provide a summary measure of overall performance. The nearer the Brier score is to zero for a set of predictions, the better the predictions match the observed outcomes. Discrimination: We evaluated the discrimination capability of our models using an extension of Harrell’s concordance index (C index) (42). The Harrell’s C index is the proportion of pairs of subjects in which the one with the shorter time to event is associated with a higher predicted risk. This ignores pairs where the shorter times to event are censored to produce a result that depends on the censoring distribution. This is addressed by introducing a weighted C index, where the weights are similar to that of the Brier score (43). Calibration: The models were further validated using a calibration process. Calibration is used to test the agreement between the predicted risks and the observed risks for different risk groups. These risk groups can be formed by dividing the predicted risk into quantiles. The observed risk for each group can then be computed using Kaplan-Meier method within that risk group (44).

Statistical analysis

We implemented different time-to-event models for each of the TKR, UKR, and PFR procedures with the same set of covariates. We performed a complete-case analysis assuming that data were missing at random, and used only cases with complete data on the covariates of interest. In parametric models a linear combination of the covariate vector is used to form the risk score. We also investigated nonlinear associations of age and BMI with the outcome using first-degree and second-degree fractional polynomials (45). The number of unknown parameters in the baseline hazard function depends on the chosen model: 1 for the exponential model and 2 for Weibull and log-logistic models. For the FPMs we used AIC values as guidance for selection of the scale, proportional hazards or odds, and the number of knots as proposed by Royston and Parmar (27). In the RSF approach each random forest was computed using 100 bootstraps samples and the log-rank splitting rule. The parametric models, estimated by maximum likelihood, were compared using AIC values. We also compared average (over individuals) prediction of each model with Kaplan-Meier estimates. We then selected the models that could capture the overall survival pattern and further evaluated them using 50 repeats of 5-fold cross-validation by comparing the Brier score, C index, and calibration plot. We also performed our evaluation using 50 repeats of stratified 5-fold cross-validation (46) where each fold contained the same proportion of revised and unrevised cases as in the original data. The statistical analyses were carried out using R (R Foundation for Statistical Computing, Vienna, Austria) (packages: randomForestSRC (47), survival (48, 49), flexsurv (50) and pec (51)).

RESULTS

Baseline characteristics of the complete data set are given in Table 1.
Table 1.

Baseline Characteristics of the Patient-Procedure Episodes in the Complete Data Set From the National Joint Registry for England, Wales, Northern Ireland, and the Isle of Man, 2003–2015

CharacteristicTKRUKRPFR
No.%PTIRNo.%PTIRNo.%PTIR
Outcome
 Unrevised381,32298.436,00995.54,93793.1
 Revised6,1371.61,6844.53666.9
Age, years70.2 (9.1)a0.4564.0 (9.7)a1.2559.6 (11.4)a1.90
BMIb70.2 (9.1)a0.4530.1 (5.0)a1.2529.5 (5.3)a1.90
Sex
 Female221,17857.10.4117,54246.51.304,14878.21.73
 Male166,28142.90.5020,15153.51.211,15521.82.53
ASA physical status
 P1 (healthy patient)39,07510.10.498,17921.71.321,378261.80
 P2 (mild systemic disease)286,69374.00.4426,43270.11.223,50366.11.95
 P3 (severe systemic disease)61,69115.90.493,0828.21.374228.01.82
Chemical prophylaxis
 None23,4186.00.432,8637.61.314077.72.18
 Aspirin only27,9967.20.424,40711.71.1674514.01.63
 LMWH ± aspirin248,12464.00.4521,51857.11.292,94955.62.05
 Other/other combinations87,92122.70.478,90523.61.191,20222.71.52
Mechanical prophylaxisc
 None23,4186.00.471,2733.41.682494.72.75
 Active84,58921.80.468,47622.51.171,23423.31.45
 Passive125,23932.30.4411,82031.41.221,48828.12.31
 Both148,76138.40.4515,77541.91.272,23142.11.76
 Other/other combinations5,4521.40.353490.91.631011.91.14
Operation type
 Unilateral381,65098.50.4535,54294.31.294,79190.32.02
 Simultaneous bilateral5,8091.50.312,1515.70.755129.70.80

Abbreviations: ASA, American Society of Anesthesiologists; BMI, body mass index; LMWH, low molecular-weight heparin; PFR, patellofemoral replacement; PTIR, patient-time incident rate; SD, standard deviation; TKR, total knee replacement; UKR, unicondylar knee replacement.

a Values are expressed as mean (SD).

b Weight (kg)/height (m)2.

c In mechanical prophylaxis, “active” includes foot pump and calf compression whereas “passive” is thromboembolic disease (TED) stockings.

Baseline Characteristics of the Patient-Procedure Episodes in the Complete Data Set From the National Joint Registry for England, Wales, Northern Ireland, and the Isle of Man, 2003–2015 Abbreviations: ASA, American Society of Anesthesiologists; BMI, body mass index; LMWH, low molecular-weight heparin; PFR, patellofemoral replacement; PTIR, patient-time incident rate; SD, standard deviation; TKR, total knee replacement; UKR, unicondylar knee replacement. a Values are expressed as mean (SD). b Weight (kg)/height (m)2. c In mechanical prophylaxis, “active” includes foot pump and calf compression whereas “passive” is thromboembolic disease (TED) stockings. For the FPMs we used proportional hazards scale with 3 interior knots for TKR and UKR models and 1 interior knot for the PFR model. For TKR and UKR models the internal knots were placed at quartiles of the log uncensored survival times, which resulted in 5 parameters in the baseline hazard function. For the PFR model, the internal knot was placed at the median of the log uncensored survival times, giving 3 parameters in the baseline hazard function. Partial dependence analysis based on predictions from RSF (52) suggested nonlinear associations between age and BMI and the outcome. We further analyzed these associations with the FPM using fractional polynomial fitting (45). The results are shown in Web Table 1 (available at https://academic.oup.com/aje), where only powers with the largest deviance differences are reported. The results show that the reduction in deviance is not significant (P ≥ 0.05) compared with the case where untransformed variables were used. In RSF, age and BMI were always selected for splits, but other results for other variables were less stable. Mechanical prophylaxis, chemical prophylaxis, and American Society of Anesthesiologists grade were moderately selected for splitting nodes, while sex and operation type were selected in a small fraction of the resamples. The 3 parametric proportional hazards models, log-logistic model, and the semiparametric Cox model for TKR are presented in Table 2 (UKR and PFR are shown in Web Tables 2 and 3). The hazard ratios from the parametric proportional-hazards models were in close agreement with the Cox semiparametric model. Note, the hazard ratio estimates of the FPM approach are closer to that of the Cox model compared with other proportional hazards models. This is expected given that the Cox model and the FPM should give unbiased hazard ratios whereas the hazard ratios conditional on a specific parametric model could be biased if the distribution is misspecified. The odds ratios of the log-logistic model also showed a consistent behavior with respect to hazard ratios.
Table 2.

Parametric and Semiparametric Cox Models of Prosthesis Survivorship for Total Knee Replacement Using Data From the National Joint Registry for England, Wales, Northern Ireland, and the Isle of Man, 2003–2015

CharacteristicExponential ModelWeibull ModelFPMCox ModelLog-Logistic Model
HR95% CIHR95% CIHR95% CIHR95% CIOR95% CI
Age, years0.9550.953, 0.9580.9530.950, 0.9560.9550.953,0.9580.9550.953, 0.9580.9530.950, 0.956
BMIa1.0091.004, 1.0141.0091.004, 1.0141.0081.003, 1.0131.0081.003, 1.0131.0091.004, 1.014
Sex
 Female1.000Referent1.000Referent1.000Referent1.000Referent1.000Referent
 Male1.2111.151, 1.2741.2221.158, 1.2891.2071.148, 1.2701.2071.148, 1.2701.2241.160, 1.291
ASA physical status
 P2 (mild systemic disease)1.000Referent1.000Referent1.000Referent1.000Referent1.000Referent
 P1 (healthy patient)0.9250.854, 1.0030.9240.849, 1.0050.9320.860, 1.0100.9320.860, 1.0100.9230.848, 1.005
 P3 (severe systemic disease)1.2291.146, 1.3191.2401.152, 1.3351.2251.142, 1.3141.2241.141, 1.3121.2421.154, 1.338
Chemical prophylaxis
 LMWH ± aspirin1.000Referent1.000Referent1.000Referent1.000Referent1.000Referent
 Aspirin only0.9310.851, 1.0180.9380.854, 1.0300.9790.895, 1.0710.9800.896, 1.0720.9390.855, 1.033
 None0.9690.884, 1.0630.9820.891, 1.0811.0280.938, 1.1281.0290.938, 1.1280.9830.891, 1.083
 Other/other combinations1.0340.966, 1.1061.0200.950, 1.0960.9690.905, 1.0370.9630.900, 1.0301.0180.948, 1.094
Mechanical prophylaxisb
 Both1.000Referent1.000Referent1.000Referent1.000Referent1.000Referent
 Active0.9930.927, 1.060.9910.921, 1.0650.9820.917, 1.0530.9820.917, 1.0530.9900.921, 1.065
 Passive0.9730.916, 1.0340.9740.914, 1.0380.9790.921, 1.0410.9810.923, 1.0420.9740.914, 1.039
 None1.0170.924, 1.1201.0300.931, 1.1391.0680.938, 1.1281.0680.969, 1.1761.0310.931, 1.142
 Other/other combinations0.7840.613, 1.0040.7760.600, 1.0060.7970.623, 1.0200.7970.622, 1.0200.7740.598, 1.004
Operation type
 Unilateral1.000Referent1.000Referent1.000Referent1.000Referent1.000Referent
 Simultaneous bilateral0.6020.480, 0.7560.5890.464, 0.7480.6100.486, 0.7650.6090.486, 0.7640.5870.463, 0.746

Abbreviations: ASA, American Society of Anesthesiologists; BMI, body mass index; CI, confidence interval; FPM, flexible parametric model; HR, hazard ratio; LMWH, low molecular-weight heparin; OR, odds ratio.

a Weight (kg)/height (m)2.

b In mechanical prophylaxis, “active” includes foot pump and calf compression whereas “passive” is thromboembolic disease (TED) stockings.

Parametric and Semiparametric Cox Models of Prosthesis Survivorship for Total Knee Replacement Using Data From the National Joint Registry for England, Wales, Northern Ireland, and the Isle of Man, 2003–2015 Abbreviations: ASA, American Society of Anesthesiologists; BMI, body mass index; CI, confidence interval; FPM, flexible parametric model; HR, hazard ratio; LMWH, low molecular-weight heparin; OR, odds ratio. a Weight (kg)/height (m)2. b In mechanical prophylaxis, “active” includes foot pump and calf compression whereas “passive” is thromboembolic disease (TED) stockings.

Overall performance

The AIC values, degrees of freedom, and deviances (twice the negative likelihood) for the parametric models are shown in Table 3, where the FPM is preferred (lowest value) by the AIC. The RSF is not included in Tables 2 and 3 because it is a nonparametric approach and is not fitted via the maximum likelihood algorithm; hence AIC cannot be calculated.
Table 3.

Model Fit Statistics for Different Parametric Models Using Data From the National Joint Registry for England, Wales, Northern Ireland, and the Isle of Man, 2003–2015

ModelTKRUKRPFR
Degrees of FreedomDevianceAICDegrees of FreedomDevianceAICDegrees of FreedomDevianceAIC
Exponential model1477,27677,3041417,92917,957143,5473,575
Weibull model1577,25877,2881517,92617,956153,5353,565
Log-logistic model1577,25177,2811517,92217,952153,5313,561
FPM1876,60676,6421817,82917,865163,5053,537

Abbreviations: AIC, Akaike information criterion; FPM, flexible parametric model; PFR, patellofemoral replacement; TKR, total knee replacement; UKR, unicondylar knee replacement.

Model Fit Statistics for Different Parametric Models Using Data From the National Joint Registry for England, Wales, Northern Ireland, and the Isle of Man, 2003–2015 Abbreviations: AIC, Akaike information criterion; FPM, flexible parametric model; PFR, patellofemoral replacement; TKR, total knee replacement; UKR, unicondylar knee replacement. The averaged predicted survival curves over all individuals along with the observed (Kaplan-Meier) curve over time are plotted in Figure 1. The results show that the FPM and the RSF method captured the observed probabilities of remaining event-free accurately. The averaged hazard curves for the parametric models are also given in Figure 2, showing that the FPM can capture the increase and decrease of the hazard rate in the early and the later stages after primary surgery. This may explain its lower AIC values compared with the other parametric models. Figure 1 also suggests that there is insufficient information after year 8; thus only data up to this time point was used in subsequent analyses.
Figure 1.

Observed and predicted probabilities of remaining event-free, using different models and data from the National Joint Registry for England, Wales, Northern Ireland, and the Isle of Man, 2003–2015. A) Total knee replacement; B) unicondylar knee replacement; C) patellofemoral replacement. Predicted probabilities of remaining event-free were obtained from different models: exponential model, Weibull model, log-logistic model, flexible parametric model (FPM), and random survival forest (RFS). The observed probability of remaining event-free was obtained from the Kaplan-Meier estimator.

Figure 2.

Hazard estimates for different parametric models, using data from the National Joint Registry for England, Wales, Northern Ireland, and the Isle of Man, 2003–2015. A) Total knee replacement; B) unicondylar knee replacement; C) patellofemoral replacement. FPM, flexible parametric model.

Observed and predicted probabilities of remaining event-free, using different models and data from the National Joint Registry for England, Wales, Northern Ireland, and the Isle of Man, 2003–2015. A) Total knee replacement; B) unicondylar knee replacement; C) patellofemoral replacement. Predicted probabilities of remaining event-free were obtained from different models: exponential model, Weibull model, log-logistic model, flexible parametric model (FPM), and random survival forest (RFS). The observed probability of remaining event-free was obtained from the Kaplan-Meier estimator. Hazard estimates for different parametric models, using data from the National Joint Registry for England, Wales, Northern Ireland, and the Isle of Man, 2003–2015. A) Total knee replacement; B) unicondylar knee replacement; C) patellofemoral replacement. FPM, flexible parametric model.

Repeated m-fold cross-validation

Only the FPM and RSF approaches were considered for further comparison given their performance in the previous analysis. The integrated Brier score of the FPM and the RSF at 5 and 8 years are shown in Table 4. FPM and RSF yielded almost identical integrated Brier scores.
Table 4.

Integrated Brier Score Using Data From the National Joint Registry for England, Wales, Northern Ireland, and the Isle of Man, 2003–2015

Model and ProcedureIntegrated Brier Score
At 5 Years95% CIAt 8 Years95% CI
FPM
 TKR0.0140.014, 0.0140.0200.020, 0.020
 UKR0.0360.036, 0.0360.0520.052, 0.052
 PFR0.0580.058, 0.0590.0740.073, 0.075
RSF
 TKR0.0150.015, 0.0150.0200.020, 0.020
 UKR0.0370.037, 0.0370.0520.052, 0.052
 PFR0.0590.059, 0.0590.0730.072, 0.074

Abbreviations: CI, confidence interval; FPM, flexible parametric model; PFR, patellofemoral replacement; RSF, random survival forest; TKR, total knee replacement; UKR, unicondylar knee replacement.

Integrated Brier Score Using Data From the National Joint Registry for England, Wales, Northern Ireland, and the Isle of Man, 2003–2015 Abbreviations: CI, confidence interval; FPM, flexible parametric model; PFR, patellofemoral replacement; RSF, random survival forest; TKR, total knee replacement; UKR, unicondylar knee replacement. The C indexes of the FPM and the RSF at 8 years are presented in Table 5. The FPM model had a higher C index across all procedures, with the greatest contrast versus the RSF models being for TKR, followed by UKR.
Table 5.

C Index at 8 Years Using Data From the National Joint Registry for England, Wales, Northern Ireland, and the Isle of Man, 2003–2015

ModelTKRUKRPFR
C Index95% CIC Index95% CIC Index95% CI
FPM0.7050.702, 0.7070.6390.634, 0.6430.5890.586, 0.592
RSF0.6600.655, 0.6660.6160.610, 0.6210.5790.575, 0.582

Abbreviations: CI, confidence interval; FPM, flexible parametric model; PFR, patellofemoral replacement; RSF, random survival forest; TKR, total knee replacement; UKR, unicondylar knee replacement.

C Index at 8 Years Using Data From the National Joint Registry for England, Wales, Northern Ireland, and the Isle of Man, 2003–2015 Abbreviations: CI, confidence interval; FPM, flexible parametric model; PFR, patellofemoral replacement; RSF, random survival forest; TKR, total knee replacement; UKR, unicondylar knee replacement. Calibration was assessed by dividing the data into deciles of predicted risk of experiencing prosthesis revision within 8 years. Calibration plots were then constructed (Figure 3) to compare observed and average predicted risks for each decile. The absolute probabilities of prosthesis revision along with observed-to-predicted ratios of each decile for different models are also presented in Table 6.
Figure 3.

Calibration plots of prosthesis revision showing predicted risks (black bars) and observed risks (white bars) for different risk groups, using data from the National Joint Registry for England, Wales, Northern Ireland, and the Isle of Man, 2003–2015. A) Total knee replacement, results from the flexible parametric model; B) unicondylar knee replacement, results from the flexible parametric model; C) patellofemoral replacement, results from the flexible parametric model; D) total knee replacement, results from the random survival forest; E) unicondylar knee replacement, results from the random survival forest; F) patellofemoral replacement, results from the random survival forest.

Table 6.

Observed Versus Predicted Risks of Prosthesis Revision for Different Risk Groups Using Data From the National Joint Registry for England, Wales, Northern Ireland, and the Isle of Man, 2003–2015

Model and Risk DecileTKRUKRPFR
Predicted Probability, Mean (SD)aRatio of Observed to PredictedPredicted Probability, Mean (SD)aRatio of Observed to PredictedPredicted Probability, Mean (SD)aRatio of Observed to Predicted
FPM
 11.47 (0.0006)1.165.33 (0.0106)1.325.67 (0.0581)1.28
 21.89 (0.0005)1.046.79 (0.0073)1.188.83 (0.0483)1.37
 32.19 (0.0005)1.007.67 (0.0070)0.9610.47 (0.0511)1.05
 42.48 (0.0005)0.848.41 (0.0065)1.0211.77 (0.0453)1.05
 52.79 (0.0005)0.979.11 (0.0055)1.1613.02 (0.0409)1.01
 63.14 (0.0006)1.249.85 (0.0066)1.1414.35 (0.0436)0.98
 73.53 (0.0005)1.1210.70 (0.0077)0.9215.84 (0.0406)1.04
 84.04 (0.0008)1.1611.72 (0.0081)1.3417.67 (0.0562)1.01
 94.77 (0.0008)1.3613.1 (0.0116)1.1320.18 (0.0701)0.92
 106.71 (0.0017)1.4416.41 (0.0245)1.1625.99 (0.1186)0.98
RSF
 10.64 (0.0041)3.134.00 (0.0325)1.846.70 (0.1375)1.25
 21.16 (0.0056)1.975.71 (0.0272)1.389.05 (0.1053)1.22
 31.59 (0.0071)1.566.82 (0.0243)1.1810.59 (0.1141)1.11
 42.02 (0.0087)1.357.84 (0.0265)1.1311.96 (0.1249)1.09
 52.49 (0.0116)1.288.88 (0.0253)1.1313.26 (0.1231)1.02
 63.03 (0.0123)1.1310.01 (0.0287)1.1914.62 (0.1146)1.00
 73.68 (0.0131)1.0411.23 (0.0358)1.2316.09 (0.1194)1.03
 84.56 (0.0168)1.0312.64 (0.0420)1.1317.72 (0.1517)1.00
 95.93 (0.0271)1.0514.52 (0.0470)0.8819.69 (0.1695)1.05
 109.83 (0.0745)0.8719.09 (0.0700)0.9023.20 (0.2855)0.90

Abbreviations: FPM, flexible parametric model; PFR, patellofemoral replacement; RSF, random survival forest; SD, standard deviation; TKR, total knee replacement; UKR, unicondylar knee replacement.

a Predicted probabilities (%) are expressed as mean (SD).

Observed Versus Predicted Risks of Prosthesis Revision for Different Risk Groups Using Data From the National Joint Registry for England, Wales, Northern Ireland, and the Isle of Man, 2003–2015 Abbreviations: FPM, flexible parametric model; PFR, patellofemoral replacement; RSF, random survival forest; SD, standard deviation; TKR, total knee replacement; UKR, unicondylar knee replacement. a Predicted probabilities (%) are expressed as mean (SD). Calibration plots of prosthesis revision showing predicted risks (black bars) and observed risks (white bars) for different risk groups, using data from the National Joint Registry for England, Wales, Northern Ireland, and the Isle of Man, 2003–2015. A) Total knee replacement, results from the flexible parametric model; B) unicondylar knee replacement, results from the flexible parametric model; C) patellofemoral replacement, results from the flexible parametric model; D) total knee replacement, results from the random survival forest; E) unicondylar knee replacement, results from the random survival forest; F) patellofemoral replacement, results from the random survival forest. The observed-to-predicted ratios indicate that the models tended to underestimate the risk in majority of cases. This underestimation may suggest that additional factors associated with revision are absent from the data set. However, the observation that RSF both underestimates the risks in the low-risk groups and overestimates the risk in the highest-risk decile suggests an overfitting bias despite the ensemble averaging over all trees. We present additional analyses using 50 repeats of stratified 5-fold cross-validation in Web Tables 4 and 5; the results are similar to those in this section.

DISCUSSION

Here we have presented a comparative evaluation of alternative survivorship models for knee replacement using the world’s largest knee-replacement clinical data set. A variety of performance metrics were used to evaluate the generated models. The flexible parametric survival model outperformed other methods although its predictive ability was, at best, modest. The FPM and RSF gave identical integrated Brier scores; however, FPM had a higher C index. The observed-to-predicted ratios indicated that both models tended to underestimate the risks in majority of risk groups. Brier scores close to zero indicate that models are able to calculate underlying risks by usefully extracting information from data. The C index uses individual predicted probabilities to distinguish unrevised from revised cases, and our C index results show that the models are capable of providing meaningful individual predictions, with a range from 0.59 to 0.71 depending on the model chosen. The main disadvantage of parametric methods is that the assumed underlying distribution may be misspecified. The FPM incorporates a parametric distribution with flexible complexity to minimize the problem of model misspecification. However, there is no theoretical basis for the number and locations of the knots for the estimation of the baseline scale (28). Other popular flexible methods include piecewise exponential models (53), Bayesian survival models (54), and alternative spline-based approaches (55). The RSF algorithm does not make any modeling assumptions and can handle nonlinear effects and interactions. However, categorization using data-dependent splits gives a suboptimal representation of a continuous variable (56), and the optimal setting of tuning parameters such as the number of trees, the splitting rule, and the number of randomly selected variables for each node split may also represent challenges with this method. Alternative machine learning techniques in modeling time-to-event data are Survival-SVM (57) and other ensemble schemes such as boosting methods (58). We carried out a complete-case analysis assuming that data were missing at random and thus used only patients with complete data on the covariates of interest. Approximately 41.8% of data were excluded, most due to missing BMI data (40.2%). We consider that this is unlikely to affect the results of our comparative study, but this could be addressed using multiple imputation techniques (59, 60). However, results from previous studies using imputed BMI have produced results almost identical to those of selective complete-case analysis (21). The constructed models do not consider the competing risk of death, thus possibly biasing estimates of the prosthesis revision probability. These models can be further extended to accommodate competing risks in the calculation of the absolute risk for each individual (61, 62). Here we assumed a proportional-hazards spline model where time-dependent effects were not considered. This may also have caused bias in the risk estimates (63) of prosthesis revision. The flexible model can be further extended for possible improvement in fit by adding terms for interactions between covariates and the effect of time (27). Finally, an external validation to assess the generalizability and transportability of the model among different populations is required (64). We created different algorithms to model time to event for the 3 knee replacement procedures because the demographic characteristics of the patient populations undergoing each, while overlapping, are distinct, and our aim was to model individual time-to-event estimates based on real-world data. However, the observed differences in revision events between the procedure types raises the separate question of whether this differential revision rate is a function of the procedure, the prosthesis, the patient, or a combination. One approach to model this would be to select random data sets from the overlapping variable characteristics within the cohorts and to estimate a joint model with indicator variables for the different procedures. An alternative modeling approach, such as propensity score matching, might also be employed. However, with both approaches the residual challenge of unobserved confounding would remain (65). Our findings indicate that predictive algorithms based upon the largest current knee replacement and surgical outcomes data set have a modest ability to predict individual survival performance. Further variables not captured within routinely collected clinical audit data sets, such as time between prosthesis insertion and diagnosis of failure rather than time to revision surgery, and the development of novel algorithm methodologies may enhance predictive ability in the future. However, use of current data-driven point estimates of prosthesis performance despite modest discriminatory ability may still be sufficient to help inform preference-based decision-making, although clinical trials of their implementation will be required to confirm their utility. Click here for additional data file.
  41 in total

1.  Paternalism or partnership? Patients have grown up-and there's no going back.

Authors:  A Coulter
Journal:  BMJ       Date:  1999-09-18

2.  Flexible parametric proportional-hazards and proportional-odds models for censored survival data, with application to prognostic modelling and estimation of treatment effects.

Authors:  Patrick Royston; Mahesh K B Parmar
Journal:  Stat Med       Date:  2002-08-15       Impact factor: 2.373

3.  Flexible regression models with cubic splines.

Authors:  S Durrleman; R Simon
Journal:  Stat Med       Date:  1989-05       Impact factor: 2.373

4.  Mid-term equivalent survival of medial and lateral unicondylar knee replacement: an analysis of data from a National Joint Registry.

Authors:  P N Baker; S S Jameson; D J Deehan; P J Gregg; M Porter; K Tucker
Journal:  J Bone Joint Surg Br       Date:  2012-12

5.  The global burden of hip and knee osteoarthritis: estimates from the global burden of disease 2010 study.

Authors:  Marita Cross; Emma Smith; Damian Hoy; Sandra Nolte; Ilana Ackerman; Marlene Fransen; Lisa Bridgett; Sean Williams; Francis Guillemin; Catherine L Hill; Laura L Laslett; Graeme Jones; Flavia Cicuttini; Richard Osborne; Theo Vos; Rachelle Buchbinder; Anthony Woolf; Lyn March
Journal:  Ann Rheum Dis       Date:  2014-02-19       Impact factor: 19.103

6.  Random Survival Forest in practice: a method for modelling complex metabolomics data in time to event analysis.

Authors:  Stefan Dietrich; Anna Floegel; Martina Troll; Tilman Kühn; Wolfgang Rathmann; Anette Peters; Disorn Sookthai; Martin von Bergen; Rudolf Kaaks; Jerzy Adamski; Cornelia Prehn; Heiner Boeing; Matthias B Schulze; Thomas Illig; Tobias Pischon; Sven Knüppel; Rui Wang-Sattler; Dagmar Drogan
Journal:  Int J Epidemiol       Date:  2016-09-01       Impact factor: 7.196

7.  Incorporating patients' preferences into medical decisions.

Authors:  J P Kassirer
Journal:  N Engl J Med       Date:  1994-06-30       Impact factor: 91.245

8.  Evaluating the yield of medical tests.

Authors:  F E Harrell; R M Califf; D B Pryor; K L Lee; R A Rosati
Journal:  JAMA       Date:  1982-05-14       Impact factor: 56.272

9.  Years lived with disability (YLDs) for 1160 sequelae of 289 diseases and injuries 1990-2010: a systematic analysis for the Global Burden of Disease Study 2010.

Authors:  Theo Vos; Abraham D Flaxman; Mohsen Naghavi; Rafael Lozano; Catherine Michaud; Majid Ezzati; Kenji Shibuya; Joshua A Salomon; Safa Abdalla; Victor Aboyans; Jerry Abraham; Ilana Ackerman; Rakesh Aggarwal; Stephanie Y Ahn; Mohammed K Ali; Miriam Alvarado; H Ross Anderson; Laurie M Anderson; Kathryn G Andrews; Charles Atkinson; Larry M Baddour; Adil N Bahalim; Suzanne Barker-Collo; Lope H Barrero; David H Bartels; Maria-Gloria Basáñez; Amanda Baxter; Michelle L Bell; Emelia J Benjamin; Derrick Bennett; Eduardo Bernabé; Kavi Bhalla; Bishal Bhandari; Boris Bikbov; Aref Bin Abdulhak; Gretchen Birbeck; James A Black; Hannah Blencowe; Jed D Blore; Fiona Blyth; Ian Bolliger; Audrey Bonaventure; Soufiane Boufous; Rupert Bourne; Michel Boussinesq; Tasanee Braithwaite; Carol Brayne; Lisa Bridgett; Simon Brooker; Peter Brooks; Traolach S Brugha; Claire Bryan-Hancock; Chiara Bucello; Rachelle Buchbinder; Geoffrey Buckle; Christine M Budke; Michael Burch; Peter Burney; Roy Burstein; Bianca Calabria; Benjamin Campbell; Charles E Canter; Hélène Carabin; Jonathan Carapetis; Loreto Carmona; Claudia Cella; Fiona Charlson; Honglei Chen; Andrew Tai-Ann Cheng; David Chou; Sumeet S Chugh; Luc E Coffeng; Steven D Colan; Samantha Colquhoun; K Ellicott Colson; John Condon; Myles D Connor; Leslie T Cooper; Matthew Corriere; Monica Cortinovis; Karen Courville de Vaccaro; William Couser; Benjamin C Cowie; Michael H Criqui; Marita Cross; Kaustubh C Dabhadkar; Manu Dahiya; Nabila Dahodwala; James Damsere-Derry; Goodarz Danaei; Adrian Davis; Diego De Leo; Louisa Degenhardt; Robert Dellavalle; Allyne Delossantos; Julie Denenberg; Sarah Derrett; Don C Des Jarlais; Samath D Dharmaratne; Mukesh Dherani; Cesar Diaz-Torne; Helen Dolk; E Ray Dorsey; Tim Driscoll; Herbert Duber; Beth Ebel; Karen Edmond; Alexis Elbaz; Suad Eltahir Ali; Holly Erskine; Patricia J Erwin; Patricia Espindola; Stalin E Ewoigbokhan; Farshad Farzadfar; Valery Feigin; David T Felson; Alize Ferrari; Cleusa P Ferri; Eric M Fèvre; Mariel M Finucane; Seth Flaxman; Louise Flood; Kyle Foreman; Mohammad H Forouzanfar; Francis Gerry R Fowkes; Richard Franklin; Marlene Fransen; Michael K Freeman; Belinda J Gabbe; Sherine E Gabriel; Emmanuela Gakidou; Hammad A Ganatra; Bianca Garcia; Flavio Gaspari; Richard F Gillum; Gerhard Gmel; Richard Gosselin; Rebecca Grainger; Justina Groeger; Francis Guillemin; David Gunnell; Ramyani Gupta; Juanita Haagsma; Holly Hagan; Yara A Halasa; Wayne Hall; Diana Haring; Josep Maria Haro; James E Harrison; Rasmus Havmoeller; Roderick J Hay; Hideki Higashi; Catherine Hill; Bruno Hoen; Howard Hoffman; Peter J Hotez; Damian Hoy; John J Huang; Sydney E Ibeanusi; Kathryn H Jacobsen; Spencer L James; Deborah Jarvis; Rashmi Jasrasaria; Sudha Jayaraman; Nicole Johns; Jost B Jonas; Ganesan Karthikeyan; Nicholas Kassebaum; Norito Kawakami; Andre Keren; Jon-Paul Khoo; Charles H King; Lisa Marie Knowlton; Olive Kobusingye; Adofo Koranteng; Rita Krishnamurthi; Ratilal Lalloo; Laura L Laslett; Tim Lathlean; Janet L Leasher; Yong Yi Lee; James Leigh; Stephen S Lim; Elizabeth Limb; John Kent Lin; Michael Lipnick; Steven E Lipshultz; Wei Liu; Maria Loane; Summer Lockett Ohno; Ronan Lyons; Jixiang Ma; Jacqueline Mabweijano; Michael F MacIntyre; Reza Malekzadeh; Leslie Mallinger; Sivabalan Manivannan; Wagner Marcenes; Lyn March; David J Margolis; Guy B Marks; Robin Marks; Akira Matsumori; Richard Matzopoulos; Bongani M Mayosi; John H McAnulty; Mary M McDermott; Neil McGill; John McGrath; Maria Elena Medina-Mora; Michele Meltzer; George A Mensah; Tony R Merriman; Ana-Claire Meyer; Valeria Miglioli; Matthew Miller; Ted R Miller; Philip B Mitchell; Ana Olga Mocumbi; Terrie E Moffitt; Ali A Mokdad; Lorenzo Monasta; Marcella Montico; Maziar Moradi-Lakeh; Andrew Moran; Lidia Morawska; Rintaro Mori; Michele E Murdoch; Michael K Mwaniki; Kovin Naidoo; M Nathan Nair; Luigi Naldi; K M Venkat Narayan; Paul K Nelson; Robert G Nelson; Michael C Nevitt; Charles R Newton; Sandra Nolte; Paul Norman; Rosana Norman; Martin O'Donnell; Simon O'Hanlon; Casey Olives; Saad B Omer; Katrina Ortblad; Richard Osborne; Doruk Ozgediz; Andrew Page; Bishnu Pahari; Jeyaraj Durai Pandian; Andrea Panozo Rivero; Scott B Patten; Neil Pearce; Rogelio Perez Padilla; Fernando Perez-Ruiz; Norberto Perico; Konrad Pesudovs; David Phillips; Michael R Phillips; Kelsey Pierce; Sébastien Pion; Guilherme V Polanczyk; Suzanne Polinder; C Arden Pope; Svetlana Popova; Esteban Porrini; Farshad Pourmalek; Martin Prince; Rachel L Pullan; Kapa D Ramaiah; Dharani Ranganathan; Homie Razavi; Mathilda Regan; Jürgen T Rehm; David B Rein; Guiseppe Remuzzi; Kathryn Richardson; Frederick P Rivara; Thomas Roberts; Carolyn Robinson; Felipe Rodriguez De Leòn; Luca Ronfani; Robin Room; Lisa C Rosenfeld; Lesley Rushton; Ralph L Sacco; Sukanta Saha; Uchechukwu Sampson; Lidia Sanchez-Riera; Ella Sanman; David C Schwebel; James Graham Scott; Maria Segui-Gomez; Saeid Shahraz; Donald S Shepard; Hwashin Shin; Rupak Shivakoti; David Singh; Gitanjali M Singh; Jasvinder A Singh; Jessica Singleton; David A Sleet; Karen Sliwa; Emma Smith; Jennifer L Smith; Nicolas J C Stapelberg; Andrew Steer; Timothy Steiner; Wilma A Stolk; Lars Jacob Stovner; Christopher Sudfeld; Sana Syed; Giorgio Tamburlini; Mohammad Tavakkoli; Hugh R Taylor; Jennifer A Taylor; William J Taylor; Bernadette Thomas; W Murray Thomson; George D Thurston; Imad M Tleyjeh; Marcello Tonelli; Jeffrey A Towbin; Thomas Truelsen; Miltiadis K Tsilimbaris; Clotilde Ubeda; Eduardo A Undurraga; Marieke J van der Werf; Jim van Os; Monica S Vavilala; N Venketasubramanian; Mengru Wang; Wenzhi Wang; Kerrianne Watt; David J Weatherall; Martin A Weinstock; Robert Weintraub; Marc G Weisskopf; Myrna M Weissman; Richard A White; Harvey Whiteford; Steven T Wiersma; James D Wilkinson; Hywel C Williams; Sean R M Williams; Emma Witt; Frederick Wolfe; Anthony D Woolf; Sarah Wulf; Pon-Hsiu Yeh; Anita K M Zaidi; Zhi-Jie Zheng; David Zonies; Alan D Lopez; Christopher J L Murray; Mohammad A AlMazroa; Ziad A Memish
Journal:  Lancet       Date:  2012-12-15       Impact factor: 79.321

10.  No bias of ignored bilaterality when analysing the revision risk of knee prostheses: analysis of a population based sample of 44,590 patients with 55,298 knee prostheses from the national Swedish Knee Arthroplasty Register.

Authors:  Otto Robertsson; Jonas Ranstam
Journal:  BMC Musculoskelet Disord       Date:  2003-02-05       Impact factor: 2.362

View more
  9 in total

1.  A comparison of survival models for prediction of eight-year revision risk following total knee and hip arthroplasty.

Authors:  Alana R Cuthbert; Lynne C Giles; Gary Glonek; Lisa M Kalisch Ellett; Nicole L Pratt
Journal:  BMC Med Res Methodol       Date:  2022-06-06       Impact factor: 4.612

2.  Hot spots and trends in knee revision research since the 21st century: a bibliometric analysis.

Authors:  Kelei Zhai; Weifeng Ma; Tao Huang
Journal:  Ann Transl Med       Date:  2021-03

3.  Patient relevant outcomes of unicompartmental versus total knee replacement: systematic review and meta-analysis.

Authors:  Hannah A Wilson; Rob Middleton; Simon G F Abram; Stephanie Smith; Abtin Alvand; William F Jackson; Nicholas Bottomley; Sally Hopewell; Andrew J Price
Journal:  BMJ       Date:  2019-02-21

4.  Artificial Learning and Machine Learning Decision Guidance Applications in Total Hip and Knee Arthroplasty: A Systematic Review.

Authors:  Cesar D Lopez; Anastasia Gazgalis; Venkat Boddapati; Roshan P Shah; H John Cooper; Jeffrey A Geller
Journal:  Arthroplast Today       Date:  2021-09-03

Review 5.  Artificial intelligence in diagnosis of knee osteoarthritis and prediction of arthroplasty outcomes: a review.

Authors:  Lok Sze Lee; Ping Keung Chan; Chunyi Wen; Wing Chiu Fung; Amy Cheung; Vincent Wai Kwan Chan; Man Hong Cheung; Henry Fu; Chun Hoi Yan; Kwong Yuen Chiu
Journal:  Arthroplasty       Date:  2022-03-05

6.  Using machine learning for the personalised prediction of revision endoscopic sinus surgery.

Authors:  Mikko Nuutinen; Jari Haukka; Paula Virkkula; Paulus Torkki; Sanna Toppila-Salmi
Journal:  PLoS One       Date:  2022-04-29       Impact factor: 3.752

7.  JointCalc: A web-based personalised patient decision support tool for joint replacement.

Authors:  Evgeny Zotov; Andrew F Hills; Fabio L de Mello; Parham Aram; Adrian Sayers; Ashley W Blom; Eugene V McCloskey; J Mark Wilkinson; Visakan Kadirkamanathan
Journal:  Int J Med Inform       Date:  2020-06-25       Impact factor: 4.046

8.  What Is the Effect of Using a Competing-risks Estimator when Predicting Survivorship After Joint Arthroplasty: A Comparison of Approaches to Survivorship Estimation in a Large Registry.

Authors:  Alana R Cuthbert; Stephen E Graves; Lynne C Giles; Gary Glonek; Nicole Pratt
Journal:  Clin Orthop Relat Res       Date:  2021-02-01       Impact factor: 4.755

9.  Personalized estimation of one-year mortality risk after elective hip or knee arthroplasty for osteoarthritis.

Authors:  Lea Trela-Larsen; Gard Kroken; Christoffer Bartz-Johannessen; Adrian Sayers; Parham Aram; Eugene McCloskey; Visakan Kadirkamanathan; Ashley W Blom; Stein Atle Lie; Ove Nord Furnes; J Mark Wilkinson
Journal:  Bone Joint Res       Date:  2020-11       Impact factor: 5.853

  9 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.