Ben Van Calster, Kirsten Van Hoorde, Yvonne Vergouwe, Shabnam Bobdiwala, George Condous, Emma Kirk, Tom Bourne, Ewout W Steyerberg.
Abstract
BACKGROUND: Risk models often perform poorly at external validation in terms of discrimination or calibration. Updating methods are needed to improve performance of multinomial logistic regression models for risk prediction.
Keywords: Calibration; Discrimination; Model updating; Multicategory outcome; Multinomial logistic regression; Prediction models; Risk models
Year: 2017 PMID: 31093534 PMCID: PMC6457140 DOI: 10.1186/s41512-016-0002-x
Source DB: PubMed Journal: Diagn Progn Res ISSN: 2397-7523
Descriptive statistics for the case study of multicategory outcome prediction: original development data of model M4 (n = 197), the temporal updating data at SGH (n = 1422), and the geographical updating data at QCCH (n = 873)
| | Original development data (SGH) | Temporal updating (SGH) | Geographical updating (QCCH) |
|---|---|---|---|
| Age (years) | 30 (25–33) | 31 (26–35) | 32 (27–32) |
| Initial hCG (IU/L) | 265 (76–618) | 410 (154–941) | 530 (197–1563) |
| hCG ratio | 0.80 (0.33–1.99) | 1.04 (0.39–2.10) | 0.65 (0.34–1.49) |
| Initial progesterone (nmol/L) | 17 (4–66) | 21 (4–61) | 9 (3–34) |
| Outcome, N (%) | | | |
| Failed | 109 (55%) | 717 (50%) | 502 (58%) |
| IUP | 76 (39%) | 577 (41%) | 245 (28%) |
| Ectopic | 12 (6%) | 128 (9%) | 126 (14%) |
Data are expressed as median (interquartile range) or as N (%). In the temporal updating data, progesterone was missing in 47 patients (3.3%), and in the geographical updating data, progesterone was missing in 109 patients (12.5%)
SGH St George’s Hospital, QCCH Queen Charlotte’s and Chelsea Hospital, IUP intra-uterine pregnancy, hCG human chorionic gonadotropin
Updating methods for multinomial logistic regression models with the numbers of parameters that are estimated for updating in general and in the case study
| Category | Method and description | Number of parameters (general = case study) |
|---|---|---|
| Original | 0. No adjustments | 0 = 0 |
| Recalibration | 1. Intercept recalibration: adjust intercepts | ( |
| | 2. Logistic recalibration: adjust intercepts and slopes | |
| | 3. Refitting: re-estimation of individual coefficients | ( |
| Revision | 4. Penalized refitting using recalibrated coefficients from method 2 as offset | ( |
| | 5. Refitting including functional form: method 3, but hCGr modeled with rcs | ( |
| Extension | 6. Extension: similar to method 3 but log(progesterone) added | ( |
| | 7. Penalized extension: similar to method 5 but log(progesterone) added | ( |
hCGr human chorionic gonadotropin ratio, rcs restricted cubic spline, k number of outcome categories, q number of variables (including additional nonlinear and interaction terms, but excluding intercepts) in original model, q ′ number of variables when changing functional form of one or more predictors, m number of variables related to added markers
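The parameter counts in the table above did not survive extraction, so the following Python sketch is only an illustration of how methods 1 and 2 can be read. It assumes that intercept recalibration adds one correction per linear predictor and that logistic recalibration regresses each updated linear predictor on all original linear predictors. The function names and the formulas in `n_parameters` are assumptions, chosen because they reproduce the degrees of freedom (8, 6, 2) reported for the closed testing procedure with k = 3 and q = 3.

```python
import math

def probabilities(lps):
    """Turn k-1 linear predictors (vs a reference category) into k probabilities."""
    exps = [math.exp(lp) for lp in lps]
    denom = 1.0 + sum(exps)
    return [1.0 / denom] + [e / denom for e in exps]

def intercept_recalibration(lps, alphas):
    # Method 1 (assumed form): shift each linear predictor by its own correction.
    return [a + lp for a, lp in zip(alphas, lps)]

def logistic_recalibration(lps, alphas, betas):
    # Method 2 (assumed form): intercept plus a weighted sum of ALL original
    # linear predictors, so betas is a (k-1) x (k-1) matrix of slopes.
    return [a + sum(b * lp for b, lp in zip(row, lps))
            for a, row in zip(alphas, betas)]

def n_parameters(k, q):
    # Assumed general counts; k = outcome categories, q = variables in the model.
    return {
        "original": 0,
        "intercept_recalibration": k - 1,
        "logistic_recalibration": (k - 1) + (k - 1) ** 2,
        "refitting": (k - 1) * (q + 1),
    }

counts = n_parameters(k=3, q=3)
# Differences from refitting give the df of the three closed-testing steps.
df_steps = [counts["refitting"] - counts[m]
            for m in ("original", "intercept_recalibration", "logistic_recalibration")]

# Hypothetical linear predictors and recalibration parameters, for illustration only.
lps = [1.2, -0.4]
updated = logistic_recalibration(lps, alphas=[0.3, 0.1], betas=[[1.1, 0.1], [0.0, 0.9]])
probs = probabilities(updated)
```

With zero intercept corrections and an identity slope matrix, `logistic_recalibration` returns the original linear predictors, which is the "no updating" case.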
Description of the closed testing procedure for updating of multinomial logistic regression models
| Step | Procedure |
|---|---|
| 1. Original model vs refitting | H0: both models have the same fit, log |
| 2. Intercept recalibration vs refitting | H0: both models have the same fit, log |
| 3. Logistic recalibration vs refitting | H0: both models have the same fit, log |
Each test is performed at the prespecified overall alpha level
H0 null hypothesis, L likelihood, df degrees of freedom
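A minimal Python sketch of such a closed testing procedure is given below. It assumes the usual stopping rule: the simplest model whose likelihood ratio test against refitting is not significant is selected, and refitting is selected otherwise. The −2 log-likelihood values are hypothetical, and `chi2_sf` is a small stand-in for a statistical library's chi-square tail function; only the degrees of freedom (8, 6, 2) come from the case study.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability via the series for the incomplete gamma."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    for n in range(1, 10000):
        term *= z / (a + n)
        total += term
        if term < 1e-16 * total:
            break
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def closed_testing(minus2ll, df, alpha=0.05):
    """Return the selected updating method and the p-value of the last test run.

    minus2ll: -2 log-likelihood per method on the updating data.
    df: degrees of freedom of each test against refitting.
    """
    p = None
    for name in ("original", "intercept", "logistic"):
        delta = minus2ll[name] - minus2ll["refit"]  # difference in -2 log-likelihood
        p = chi2_sf(delta, df[name])
        if p >= alpha:          # H0 (same fit) not rejected: keep the simpler model
            return name, p
    return "refit", p

# Hypothetical fit statistics, for illustration only.
minus2ll = {"original": 1250.0, "intercept": 1230.0, "logistic": 1203.0, "refit": 1200.0}
df = {"original": 8, "intercept": 6, "logistic": 2}
selected, p_value = closed_testing(minus2ll, df)
```

The series in `chi2_sf` is adequate for moderate test statistics; for very large differences a library routine would be the safer choice.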
Polytomous discrimination index, pairwise c-statistics, and Brier score on the updating data after correction for optimism using bootstrapping
| Updating method | PDI | Pairwise c-statistic | Pairwise c-statistic | Pairwise c-statistic | Brier |
|---|---|---|---|---|---|
| Temporal updating (SGH) | |||||
| No updating | 0.87 (0.85–0.90) | 0.99 (0.98–0.99) | 0.92 (0.89–0.94) | 0.89 (0.85–0.93) | 0.172 (0.155–0.190) |
| Intercept recalibration | 0.87 (0.84–0.89) | 0.99 (0.98–0.99) | 0.92 (0.89–0.94) | 0.89 (0.85–0.92) | 0.165 (0.148–0.183) |
| Logistic recalibration | 0.88 (0.85–0.90) | 0.99 (0.98–>0.99) | 0.93 (0.91–0.95) | 0.91 (0.88–0.94) | 0.158 (0.143–0.173) |
| Refitting | 0.88 (0.85–0.90) | 0.99 (0.99–>0.99) | 0.93 (0.91–0.95) | 0.91 (0.88–0.94) | 0.157 (0.141–0.172) |
| Penalized refitting | 0.88 (0.85–0.90) | 0.99 (0.98–>0.99) | 0.93 (0.91–0.95) | 0.91 (0.88–0.94) | 0.158 (0.142–0.172) |
| Refitting + rcs | 0.88 (0.86–0.91) | 0.99 (0.98–>0.99) | 0.93 (0.91–0.95) | 0.92 (0.89–0.95) | 0.153 (0.137–0.168) |
| Extension | 0.89 (0.87–0.92) | 0.99 (0.99–>0.99) | 0.93 (0.92–0.95) | 0.93 (0.90–0.95) | 0.150 (0.135–0.165) |
| Penalized extension | 0.89 (0.87–0.92) | 0.99 (0.99–>0.99) | 0.93 (0.92–0.95) | 0.93 (0.90–0.95) | 0.150 (0.135–0.165) |
| Geographical updating (QCCH) | |||||
| No updating | 0.80 (0.77–0.83) | 0.95 (0.93–0.97) | 0.91 (0.88–0.94) | 0.84 (0.79–0.88) | 0.286 (0.258–0.314) |
| Intercept recalibration | 0.80 (0.77–0.83) | 0.95 (0.93–0.97) | 0.91 (0.88–0.94) | 0.84 (0.79–0.88) | 0.278 (0.247–0.310) |
| Logistic recalibration | 0.80 (0.77–0.83) | 0.96 (0.94–0.97) | 0.93 (0.90–0.95) | 0.84 (0.79–0.88) | 0.267 (0.243–0.291) |
| Refitting | 0.80 (0.77–0.83) | 0.96 (0.94–0.97) | 0.94 (0.92–0.96) | 0.84 (0.80–0.88) | 0.266 (0.243–0.291) |
| Penalized refitting | 0.80 (0.77–0.83) | 0.96 (0.94–0.97) | 0.94 (0.91–0.95) | 0.84 (0.79–0.88) | 0.265 (0.242–0.289) |
| Refitting + rcs | 0.82 (0.79–0.85) | 0.96 (0.94–0.97) | 0.94 (0.92–0.96) | 0.85 (0.81–0.89) | 0.261 (0.237–0.284) |
| Extension | 0.81 (0.78–0.84) | 0.96 (0.94–0.97) | 0.94 (0.92–0.96) | 0.84 (0.80–0.88) | 0.262 (0.238–0.287) |
| Penalized extension | 0.81 (0.78–0.84) | 0.96 (0.94–0.97) | 0.94 (0.92–0.96) | 0.84 (0.80–0.88) | 0.263 (0.239–0.287) |
PDI polytomous discrimination index, FPUL failed pregnancy of unknown location, IUP intra-uterine pregnancy, EP ectopic pregnancy, SGH St. George’s Hospital, QCCH Queen Charlotte’s and Chelsea Hospital, rcs restricted cubic splines
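As a rough illustration of two of the measures in the table above, the Python sketch below computes a multicategory Brier score and a pairwise c-statistic. It assumes one common definition of each: the Brier score as the mean over patients of the squared differences summed over categories, and the pairwise c-statistic based on the conditional risk p_i / (p_i + p_j) among patients in categories i and j. Whether these match the exact definitions used for the table is an assumption, and the toy data are made up.

```python
def brier_multicategory(probs, outcomes):
    """Mean over patients of sum_c (p_c - y_c)^2, with y_c the outcome indicator."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        total += sum((p_c - (1.0 if c == y else 0.0)) ** 2
                     for c, p_c in enumerate(p))
    return total / len(probs)

def pairwise_c(probs, outcomes, i, j):
    """c-statistic for category i vs j using the conditional risk of i."""
    risk_i = [p[i] / (p[i] + p[j]) for p, y in zip(probs, outcomes) if y == i]
    risk_j = [p[i] / (p[i] + p[j]) for p, y in zip(probs, outcomes) if y == j]
    concordant = 0.0
    for a in risk_i:
        for b in risk_j:
            if a > b:
                concordant += 1.0   # patient with outcome i ranked higher
            elif a == b:
                concordant += 0.5   # ties count for one half
    return concordant / (len(risk_i) * len(risk_j))

# Toy predictions; category order 0 = FPUL, 1 = IUP, 2 = EP is an arbitrary choice.
probs = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]]
outcomes = [0, 1, 2, 0]
brier = brier_multicategory(probs, outcomes)
c_fpul_iup = pairwise_c(probs, outcomes, 0, 1)
```

Correcting such estimates for optimism, as done for the table, would additionally require refitting the updating method in bootstrap samples, which this sketch does not attempt.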
Fig. 1 Calibration curves for the original M4 model on the temporal (a) and geographical (b) updating data
Fig. 2 Calibration intercepts and slopes (with 95% CI) after correction for optimism using bootstrapping. Results for temporal updating are shown in a–d and for geographical updating in e–h
Results of the closed testing procedure
| Step | df | Temporal updating (SGH) | Geographical updating (QCCH) |
|---|---|---|---|
| 1. Original model vs refitting | 8 | Δ | Δ |
| 2. Intercept recalibration vs refitting | 6 | Δ | Δ |
| 3. Logistic recalibration vs refitting | 2 | Δ | Δ |
df degrees of freedom, SGH St. George’s Hospital, QCCH Queen Charlotte’s and Chelsea Hospital, Δℓ difference in −2 log-likelihood
Fig. 3 Reclassification plots for logistic recalibration (method 2) vs refitting (method 3) on the temporal (a, c, e) and geographical (b, d, f) updating data: the predicted probability of FPUL when the reference standard is FPUL (a, b), the predicted probability of IUP when the reference standard is IUP (c, d), the predicted probability of EP when the reference standard is EP (e, f)
Parameters of M4 by updating method
| Updating method | Intercept | Log(hCGm) | hCGrc | hCGrc² | Log(prog) |
|---|---|---|---|---|---|
| Temporal updating (SGH), FPUL vs IUP | |||||
| No updating | 5.88 | −1.18 | −5.56 | 2.05 | na |
| Intercept recalibration | 7.07 | −1.18 | −5.56 | 2.05 | na |
| Logistic recalibration | 9.12 | −1.34 | −6.32 | 1.70 | na |
| Refitting | 6.19 | −0.86 | −7.37 | 2.19 | na |
| Penalized refitting | 7.87 | −1.14 | −6.42 | 1.74 | na |
| Refitting + rcs | 24.08 | −0.87 | Replaced by rcs | Replaced by rcs | na |
| Extension | 10.90 | −0.77 | −5.49 | 1.65 | −1.67 |
| Penalized extension | 9.84 | −0.84 | −5.99 | 1.80 | −1.20 |
| Temporal updating (SGH), EP vs IUP | |||||
| No updating | 0.39 | −0.06 | −0.26 | −3.93 | na |
| Intercept recalibration | 0.99 | −0.06 | −0.26 | −3.93 | na |
| Logistic recalibration | 5.29 | −0.70 | −3.31 | −0.36 | na |
| Refitting | 4.01 | −0.50 | −4.04 | 0.33 | na |
| Penalized refitting | 4.86 | −0.64 | −3.33 | −0.33 | na |
| Refitting + rcs | 14.44 | −0.48 | Replaced by rcs | Replaced by rcs | na |
| Extension | 7.83 | −0.47 | −2.66 | −0.14 | −1.21 |
| Penalized extension | 6.83 | −0.50 | −3.05 | −0.05 | −0.84 |
| Geographical updating (QCCH), FPUL vs IUP | |||||
| No updating | 5.88 | −1.18 | −5.56 | 2.05 | na |
| Intercept recalibration | 6.15 | −1.18 | −5.56 | 2.05 | na |
| Logistic recalibration | 5.08 | −0.82 | −3.86 | 0.61 | na |
| Refitting | 4.74 | −0.75 | −4.04 | 0.06 | na |
| Penalized refitting | 4.86 | −0.78 | −3.90 | 0.50 | na |
| Refitting + rcs | 9.01 | −0.73 | Replaced by rcs | Replaced by rcs | na |
| Extension | 5.67 | −0.66 | −3.38 | −0.06 | −0.52 |
| Penalized extension | 5.45 | −0.73 | −3.65 | 0.38 | −0.32 |
| Geographical updating (QCCH), EP vs IUP | |||||
| No updating | 0.39 | −0.06 | −0.26 | −3.93 | na |
| Intercept recalibration | 1.25 | −0.06 | −0.26 | −3.93 | na |
| Logistic recalibration | 1.70 | −0.20 | −0.91 | −1.33 | na |
| Refitting | 3.81 | −0.51 | −0.42 | −1.59 | na |
| Penalized refitting | 3.22 | −0.43 | −0.67 | −1.39 | na |
| Refitting + rcs | 0.79 | −0.51 | Replaced by rcs | Replaced by rcs | na |
| Extension | 4.61 | −0.45 | 0.07 | −1.75 | −0.40 |
| Penalized extension | 3.89 | −0.43 | −0.41 | −1.45 | −0.21 |
hCGm the average of hCG at 48 hours and hCG at presentation, hCGrc the centered ratio of hCG at 48 hours and hCG at presentation, prog progesterone, FPUL failed pregnancy of unknown location, IUP intra-uterine pregnancy, EP ectopic pregnancy, SGH St. George’s Hospital, QCCH Queen Charlotte’s and Chelsea Hospital, rcs restricted cubic splines
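To show how the coefficients in the table translate into predicted risks, the Python sketch below evaluates the "No updating" rows with IUP as the reference category. Several points are assumptions rather than facts from the table: that the logarithm is the natural log, that the last hCG column is the square of hCGrc, and that the caller supplies the already centered ratio (the centering constant is not given here). The input values in the example are hypothetical.

```python
import math

# "No updating" coefficients copied from the table; IUP is the reference category.
M4_ORIGINAL = {
    "FPUL": {"intercept": 5.88, "log_hcgm": -1.18, "hcgrc": -5.56, "hcgrc2": 2.05},
    "EP":   {"intercept": 0.39, "log_hcgm": -0.06, "hcgrc": -0.26, "hcgrc2": -3.93},
}

def m4_probabilities(log_hcgm, hcgrc, coefs=M4_ORIGINAL):
    """Multinomial logistic probabilities for FPUL, IUP and EP.

    log_hcgm: log of the mean hCG (natural log assumed).
    hcgrc: centered hCG ratio, supplied already centered (assumption).
    """
    lps = {
        cat: c["intercept"] + c["log_hcgm"] * log_hcgm
             + c["hcgrc"] * hcgrc + c["hcgrc2"] * hcgrc ** 2  # squared term assumed
        for cat, c in coefs.items()
    }
    exps = {cat: math.exp(lp) for cat, lp in lps.items()}
    denom = 1.0 + sum(exps.values())      # the 1.0 is the reference category, IUP
    result = {cat: e / denom for cat, e in exps.items()}
    result["IUP"] = 1.0 / denom
    return result

# Hypothetical patient values, for illustration only.
risks = m4_probabilities(log_hcgm=6.0, hcgrc=0.0)
```

Passing a different coefficient dictionary, for example one built from the "Refitting" rows, gives the risks under that updating method; the extension rows would need an extra term for log(progesterone).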