Romin Pajouheshnia1, Linda M Peelen2, Karel G M Moons2,3, Johannes B Reitsma2,3, Rolf H H Groenwold2.
Abstract
BACKGROUND: Prognostic models often show poor performance when applied to independent validation data sets. We illustrate how treatment use in a validation set can affect measures of model performance and present the uses and limitations of available analytical methods to account for this using simulated data.
Year: 2017 PMID: 28709404 PMCID: PMC5513339 DOI: 10.1186/s12874-017-0375-8
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Fig. 1a-b: Risk distributions in two simulated validation sets. 50% of individuals received an effective treatment (relative odds reduction on treatment: 0.5; see Table 2, scenarios 2 and 1, respectively, for details). (a) The model was validated on the combined treatment and control group of a randomised trial. (b) The model was validated using data from a non-randomised setting where the probability of receiving treatment depended on an individual’s (untreated) outcome risk. Black lines represent the observed risks in the validation set, after treatment. Grey lines represent the risks of the same individuals had they (hypothetically) remained untreated.
A summary of fifteen simulated scenarios
| Scenario | Data generating model (development and validation sets)† | Sample size of data sets | % outcome (before treatment) | Baseline risk in the absence of treatment (Risk) | Treatment allocation model | % treated in validation set | Treatment effect model |
|---|---|---|---|---|---|---|---|
| 1 (Default) | logit(Y) = −1.50 + 1*X1 + 1*X2 + 0*U | 1000 | 20 | 1 / (1 + exp(1.50 − 1*X1 − 1*X2 − 0*U)) | P(Tr) = 1 / (1 + exp(1.95 − 10*Risk)) | 50 | ORTr = 0.5 |
| 2 | - | - | - | - | P(Tr) = 0.50 | - | - |
| 3 | - | - | - | - | - | - | ORTr = 1 / (1 + exp(−1 + 5*Risk)) |
| 4 | - | - | - | - | P(Tr) = 1 / (1 + exp(18 − 100*Risk)) | - | - |
| 5 | - | - | - | - | P(Tr) = 1 / (1 + exp(3.30 − 10*Risk)) | 25 | ORTr = 0.3 |
| 6 | - | - | - | - | P(Tr) = 1 / (1 + exp(3.30 − 10*Risk)) | 25 | - |
| 7 | - | - | - | - | P(Tr) = 1 / (1 + exp(3.30 − 10*Risk)) | 25 | ORTr = 0.8 |
| 8 | - | - | - | - | - | - | ORTr = 0.3 |
| 9 | - | - | - | - | - | - | ORTr = 0.8 |
| 10 | - | - | - | - | P(Tr) = 1 / (1 + exp(0.70 − 10*Risk)) | 75 | ORTr = 0.3 |
| 11 | - | - | - | - | P(Tr) = 1 / (1 + exp(0.70 − 10*Risk)) | 75 | - |
| 12 | - | - | - | - | P(Tr) = 1 / (1 + exp(0.70 − 10*Risk)) | 75 | ORTr = 0.8 |
| 13 | logit(Y) = −1.55 + 1*X1 + 1*X2 + 1*U | - | - | 1 / (1 + exp(1.55 − 1*X1 − 1*X2 − 1*U)) | P(Tr) = 1 / (1 + exp(1.90 − 10*Risk)) | - | - |
| 14 | logit(Y) = −1.70 + 1*X1 + 1*X2 + 2*U | - | - | 1 / (1 + exp(1.70 − 1*X1 − 1*X2 − 2*U)) | P(Tr) = 1 / (1 + exp(1.80 − 10*Risk)) | - | - |
| 15 | logit(Y) = −2.15 + 1*X1 + 1*X2 + 4*U | - | - | 1 / (1 + exp(2.15 − 1*X1 − 1*X2 − 4*U)) | P(Tr) = 1 / (1 + exp(1.55 − 10*Risk)) | - | - |
Abbreviations: P(Tr): probability of treatment; Risk: baseline risk of an individual in the validation set, prior to treatment; ORTr: relative effect of treatment on the risk of outcome Y
†Predictors X1, X2, and U were independent random draws from a normal distribution (mean = 0, variance = 0.2); the binary outcome Y was sampled from a binomial distribution with outcome probability derived from the data generating model
Scenario 1 is the default scenario on which all other scenarios are based. Where cells are empty (“-”), the default parameter value from scenario 1 was used
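The default scenario can be sketched in a few lines of Python. This is a minimal illustration, not the authors' simulation code: the function name `simulate_scenario1`, the dictionary keys, and the seed are hypothetical, and it assumes that ORTr acts multiplicatively on each individual's untreated odds of the outcome.

```python
import math
import random


def expit(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate_scenario1(n=1000, or_treatment=0.5, seed=1):
    """Simulate one validation set under scenario 1 (hypothetical helper)."""
    rng = random.Random(seed)
    sd = math.sqrt(0.2)  # predictors are normal with mean 0, variance 0.2
    rows = []
    for _ in range(n):
        x1 = rng.gauss(0.0, sd)
        x2 = rng.gauss(0.0, sd)
        u = rng.gauss(0.0, sd)  # coefficient of U is 0 in scenario 1
        # Baseline (untreated) risk: 1 / (1 + exp(1.50 - X1 - X2 - 0*U))
        risk = expit(-1.50 + 1 * x1 + 1 * x2 + 0 * u)
        # Treatment allocation: P(Tr) = 1 / (1 + exp(1.95 - 10*Risk))
        p_treat = expit(-1.95 + 10 * risk)
        treated = rng.random() < p_treat
        # Assumption: the treatment odds ratio multiplies the untreated odds
        odds = risk / (1.0 - risk)
        if treated:
            odds *= or_treatment
        risk_obs = odds / (1.0 + odds)
        y = 1 if rng.random() < risk_obs else 0
        rows.append({"x1": x1, "x2": x2, "risk": risk,
                     "treated": treated, "risk_obs": risk_obs, "y": y})
    return rows


if __name__ == "__main__":
    data = simulate_scenario1(n=1000)
    print("mean untreated risk:", sum(r["risk"] for r in data) / len(data))
    print("fraction treated:", sum(r["treated"] for r in data) / len(data))
```

Other scenarios in the table would follow by changing the intercepts, the allocation model, or `or_treatment`; scenario 3, for example, would replace the fixed odds ratio with a function of `risk`.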
Possible methods to account for the effects of treatment in a validation set
| Approach | Implementation | Key considerations |
|---|---|---|
| | 1. Exclude any individual who received treatment between the point of prediction and the assessment of the outcome from the analysis. | - Provides correct estimates of performance in the (untreated) target population if treatment use is not associated with other prognostic factors.† |
| | 1. Fit a propensity score (PS) model for treatment in the validation set using logistic regression: | - Provides correct estimates of performance in (untreated) target population if treatment use is or is not associated with other prognostic factors, provided key assumptions of IPW are met.† |
| | 1. Calculate the linear predictor of the prognostic model: | - Does not affect discrimination. |
| | 1. Refit the original prognostic model using the full validation data, including an indicator term for treatment use and treatment interaction terms. | - Can lead to an over-estimation of model discrimination. |
Abbreviations: X: design matrix (predictor values) for individual i; Y: outcome for individual i; LP: linear predictor; PS: propensity score; Tr: treatment
represent coefficients of the treatment propensity model for individual i
represent coefficients of the original prognostic model for individual i
represent coefficients of the updated prognostic model for individual i
*Interaction terms between treatment use and predictors should be included where necessary
†Estimates will be correct providing all other modelling assumptions are met
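The exclusion and weighting approaches can be illustrated with a short sketch. It assumes the propensity scores have already been estimated (for example by logistic regression, as the table describes) and are passed in as plain lists. The function names and the nearest-rank percentile rule used for truncation are assumptions made for illustration, not details taken from the source.

```python
import math


def ipw_weights(treated, ps, truncate_pct=None):
    """Inverse probability of treatment weights.

    treated: list of bools; ps: estimated P(treatment) for each individual.
    Treated individuals get 1/PS, untreated individuals get 1/(1 - PS).
    If truncate_pct is given (e.g. 98), weights above that percentile are
    capped at the percentile value (nearest-rank rule, an assumption).
    """
    weights = [1.0 / p if t else 1.0 / (1.0 - p) for t, p in zip(treated, ps)]
    if truncate_pct is not None:
        ordered = sorted(weights)
        k = max(0, math.ceil(truncate_pct * len(ordered) / 100) - 1)
        cap = ordered[k]
        weights = [min(w, cap) for w in weights]
    return weights


def exclude_treated(values, treated):
    """Keep only the entries belonging to untreated individuals."""
    return [v for v, t in zip(values, treated) if not t]


if __name__ == "__main__":
    treated = [True, False, False, True]
    ps = [0.8, 0.2, 0.5, 0.5]
    w = ipw_weights(treated, ps)
    print(w)                            # weights for all four individuals
    print(exclude_treated(w, treated))  # weights kept for "IPW, exclude"
```

Combining the two helpers corresponds to weighting the data and then restricting the analysis to untreated individuals; passing `truncate_pct=98` additionally limits the influence of individuals with extreme weights.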
Fig. 2a-d: Risk distributions in two simulated validation sets, before and after applying different approaches to correct for treatment use. 50% of individuals received an effective treatment (relative odds reduction on treatment: 0.5; see Table 2, scenarios 2 and 1, respectively, for details). (a) The model was validated on the combined treatment and control group of a randomised trial. (b-d) The model was validated using data from a non-randomised setting where the probability of receiving treatment depended on an individual’s (untreated) outcome risk. Solid black lines represent the observed risks in the validation set after treatment. Dashed black lines represent the risks observed after applying correction methods to the data: (a-b) the exclusion of treated individuals, (c) IPW, (d) IPW followed by the exclusion of treated individuals. Grey lines represent the risks of the same individuals had they remained untreated.
Estimated calibration in the validation set (observed:expected (O:E) ratio) across fifteen different simulated scenarios
| Scenario | Reference: untreated | Ignore treatment | Exclude treated | IPW | IPW, exclude | IPWtrunc, exclude |
|---|---|---|---|---|---|---|
| 1 | 1.00 (0.09) | 0.76 (0.07) | 1.01 (0.13) | 0.79 (0.09) | 1.00 (0.13) | 1.00 (0.12) |
| 2 | 1.00 (0.09) | 0.79 (0.07) | 1.00 (0.11) | 0.79 (0.07) | 1.00 (0.11) | 1.00 (0.11) |
| 3 | 1.01 (0.09) | 0.69 (0.07) | 1.00 (0.13) | 0.76 (0.09) | 1.00 (0.13) | 1.00 (0.12) |
| 4 | 1.00 (0.09) | 0.72 (0.07) | 1.01 (0.16) | 0.74 (0.30) | 0.98 (0.44) | 1.00 (0.17) |
| 5 | 1.00 (0.09) | 0.80 (0.08) | 1.00 (0.13) | 0.68 (0.07) | 1.00 (0.10) | 1.00 (0.10) |
| 6 | 1.00 (0.09) | 0.87 (0.08) | 1.01 (0.10) | 0.79 (0.08) | 1.00 (0.10) | 1.00 (0.10) |
| 7 | 1.00 (0.09) | 0.96 (0.09) | 1.01 (0.10) | 0.93 (0.10) | 1.00 (0.10) | 1.00 (0.10) |
| 8 | 1.00 (0.09) | 0.63 (0.06) | 1.01 (0.12) | 0.68 (0.08) | 1.00 (0.13) | 1.00 (0.12) |
| 9 | 1.00 (0.09) | 0.91 (0.08) | 1.01 (0.12) | 0.92 (0.09) | 1.00 (0.13) | 1.00 (0.12) |
| 10 | 1.00 (0.09) | 0.49 (0.06) | 1.00 (0.17) | 0.68 (0.11) | 1.00 (0.20) | 1.00 (0.18) |
| 11 | 1.00 (0.09) | 0.66 (0.07) | 1.00 (0.17) | 0.79 (0.11) | 1.00 (0.20) | 1.00 (0.18) |
| 12 | 1.01 (0.09) | 0.88 (0.08) | 1.01 (0.17) | 0.92 (0.12) | 1.00 (0.20) | 1.00 (0.18) |
| 13 | 1.00 (0.09) | 0.75 (0.07) | 0.90 (0.12) | 0.76 (0.08) | 0.87 (0.12) | 0.88 (0.11) |
| 14 | 1.00 (0.09) | 0.74 (0.07) | 0.70 (0.10) | 0.72 (0.07) | 0.67 (0.10) | 0.67 (0.09) |
| 15 | 1.00 (0.09) | 0.76 (0.07) | 0.39 (0.07) | 0.74 (0.07) | 0.38 (0.07) | 0.38 (0.07) |
Abbreviations: Exclude: exclusion of treated individuals from the analysis; IPW: inverse (treatment) probability weighting; IPWtrunc: IPW with weight truncation at the 98th percentile
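As a rough illustration of the calibration measure reported above, the observed:expected ratio can be computed as the (optionally weighted) number of observed events divided by the (optionally weighted) sum of predicted risks. The function name `oe_ratio` and its signature are hypothetical; how the authors implemented the weighted version is not stated here.

```python
def oe_ratio(y, predicted, weights=None):
    """Observed:expected ratio.

    y: list of 0/1 outcomes; predicted: model-predicted risks;
    weights: optional per-individual weights (e.g. IPW weights).
    """
    if weights is None:
        weights = [1.0] * len(y)
    observed = sum(w * yi for w, yi in zip(weights, y))
    expected = sum(w * p for w, p in zip(weights, predicted))
    return observed / expected


if __name__ == "__main__":
    y = [1, 0, 0, 0]
    predicted = [0.5, 0.25, 0.25, 0.0]
    print(oe_ratio(y, predicted))                        # unweighted
    print(oe_ratio(y, predicted, weights=[2, 1, 1, 1]))  # weighted
```

A ratio below 1 means fewer events were observed than the model predicted, which is the pattern in the "Ignore treatment" column when an effective treatment lowers the observed risk.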
Estimated discrimination in the validation set (c-index) across fifteen different simulated scenarios
| Scenario | Reference: untreated | Ignore treatment | Exclude treated | IPW | IPW, exclude | IPWtrunc, exclude |
|---|---|---|---|---|---|---|
| 1 | 0.67 (0.02) | 0.63 (0.02) | 0.65 (0.03) | 0.66 (0.03) | 0.66 (0.05) | 0.65 (0.04) |
| 2 | 0.67 (0.02) | 0.66 (0.02) | 0.67 (0.03) | 0.66 (0.02) | 0.67 (0.03) | 0.67 (0.03) |
| 3 | 0.67 (0.02) | 0.59 (0.03) | 0.65 (0.03) | 0.64 (0.03) | 0.66 (0.05) | 0.65 (0.04) |
| 4 | 0.67 (0.02) | 0.59 (0.03) | 0.60 (0.04) | 0.59 (0.08) | 0.57 (0.15) | 0.60 (0.05) |
| 5 | 0.67 (0.02) | 0.62 (0.02) | 0.65 (0.03) | 0.66 (0.03) | 0.67 (0.03) | 0.66 (0.03) |
| 6 | 0.67 (0.02) | 0.64 (0.02) | 0.65 (0.03) | 0.66 (0.03) | 0.66 (0.03) | 0.66 (0.03) |
| 7 | 0.67 (0.02) | 0.66 (0.02) | 0.65 (0.03) | 0.67 (0.03) | 0.67 (0.03) | 0.66 (0.03) |
| 8 | 0.67 (0.02) | 0.60 (0.03) | 0.65 (0.03) | 0.66 (0.03) | 0.66 (0.05) | 0.65 (0.04) |
| 9 | 0.67 (0.02) | 0.65 (0.02) | 0.65 (0.03) | 0.66 (0.03) | 0.66 (0.05) | 0.65 (0.04) |
| 10 | 0.67 (0.02) | 0.61 (0.03) | 0.65 (0.05) | 0.66 (0.05) | 0.66 (0.08) | 0.65 (0.06) |
| 11 | 0.67 (0.02) | 0.64 (0.03) | 0.65 (0.05) | 0.66 (0.05) | 0.66 (0.08) | 0.65 (0.06) |
| 12 | 0.67 (0.02) | 0.66 (0.02) | 0.65 (0.05) | 0.66 (0.04) | 0.66 (0.08) | 0.65 (0.06) |
| 13 | 0.66 (0.02) | 0.63 (0.02) | 0.63 (0.03) | 0.65 (0.03) | 0.64 (0.05) | 0.63 (0.04) |
| 14 | 0.65 (0.02) | 0.63 (0.02) | 0.60 (0.04) | 0.62 (0.03) | 0.61 (0.04) | 0.60 (0.04) |
| 15 | 0.62 (0.02) | 0.61 (0.03) | 0.57 (0.05) | 0.58 (0.03) | 0.57 (0.05) | 0.57 (0.05) |
Abbreviations: Exclude: exclusion of treated individuals from the analysis; IPW: inverse (treatment) probability weighting; IPWtrunc: IPW with weight truncation at the 98th percentile
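For a binary outcome, the c-index is the proportion of event/non-event pairs in which the individual with the event has the higher predicted risk, with ties counted as one half. The sketch below shows one way to compute it, with optional weights. Weighting each pair by the product of the two individuals' weights is an assumption for illustration, and the function name `c_index` is hypothetical.

```python
def c_index(y, predicted, weights=None):
    """Concordance index for a binary outcome.

    Compares every event with every non-event. Each pair contributes the
    product of the two individuals' weights (all 1.0 when unweighted).
    Raises ZeroDivisionError if there are no events or no non-events.
    """
    if weights is None:
        weights = [1.0] * len(y)
    events = [(p, w) for yi, p, w in zip(y, predicted, weights) if yi == 1]
    nonevents = [(p, w) for yi, p, w in zip(y, predicted, weights) if yi == 0]
    concordant = 0.0
    total = 0.0
    for p_event, w_event in events:
        for p_non, w_non in nonevents:
            pair_weight = w_event * w_non
            total += pair_weight
            if p_event > p_non:
                concordant += pair_weight
            elif p_event == p_non:
                concordant += 0.5 * pair_weight
    return concordant / total


if __name__ == "__main__":
    y = [1, 1, 0, 0]
    predicted = [0.9, 0.4, 0.5, 0.1]
    print(c_index(y, predicted))
    print(c_index(y, predicted, weights=[1, 3, 1, 1]))
```

The pairwise loop is quadratic in the sample size, which is acceptable for validation sets of 1000 individuals like those in the table but would need a rank-based implementation for much larger data.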
Fig. 3: Calibration curves calculated in a treated validation set, following different approaches to account for the effects of treatment. Scenario 1: P(treatment) increases with risk, fixed treatment effect; scenario 2: randomised treatment, fixed treatment effect; scenario 3: P(treatment) increases with risk, treatment effect increases with risk; scenario 4: 18% baseline risk threshold for treatment, fixed treatment effect. Plots were based on sets of 1 million individuals.
Fig. 4: Calibration curves calculated in a treated validation set, following different approaches to account for the effects of treatment, in the presence of an unmeasured predictor (U) associated with both the outcome and the probability of receiving treatment. Scenario 13: U has a weak association with the outcome (log(OR) = 1); scenario 14: U has a moderate association with the outcome (log(OR) = 2); scenario 15: U has a strong association with the outcome (log(OR) = 4). Plots were based on sets of 1 million individuals.