Laura Schummers, Katherine P Himes, Lisa M Bodnar, Jennifer A Hutcheon.
Abstract
BACKGROUND: Compelled by the intuitive appeal of predicting each individual patient's risk of an outcome, there is a growing interest in risk prediction models. While the statistical methods used to build prediction models are increasingly well understood, the literature offers little insight to researchers seeking to gauge a priori whether a prediction model is likely to perform well for their particular research question. The objective of this study was to inform the development of new risk prediction models by evaluating model performance under a wide range of predictor characteristics.
Keywords: Area under the receiver operating characteristic curve; Discrimination; Epidemiologic methods; Model performance; Risk classification; Risk prediction model
Year: 2016 PMID: 27655140 PMCID: PMC5031287 DOI: 10.1186/s12874-016-0223-2
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Clinical characteristics and risk factors for preeclampsia included in clinical prediction model in our data set and those from a previously published cohort study of preeclampsia risk
| Predictors in our data set | Prevalence n (%) | Crude odds ratio (95 % CI) |
|---|---|---|
| Maternal age a | ||
| <20 | 1,624 (2.2) | 1.0 (0.8, 1.2) |
| 20–29 | 32,140 (42.7) | REF |
| 30–39 | 38,444 (51.1) | 1.0 (0.9, 1.0) |
| ≥40 | 3,017 (4.0) | 1.4 (1.3, 1.6) |
| Prepregnancy body mass index a | ||
| 25–29 | 46,979 (62.5) | REF |
| 30–34 | 17,692 (23.5) | 1.6 (1.5, 1.7) |
| 35–39 | 6,968 (9.3) | 2.1 (1.9, 2.3) |
| ≥40 | 3,586 (4.8) | 2.8 (2.5, 3.1) |
| Maternal height <60 in. | 4,280 (5.7) | 0.9 (0.8, 1.0) |
| Nulliparity | 32,571 (43.3) | 2.5 (2.4, 2.6) |
| Pre-existing diabetes | 769 (1.0) | 2.9 (2.4, 3.5) |
| Smoking | 8,411 (11.2) | 0.9 (0.8, 1.0) |
| History of stillbirth | 713 (0.9) | 0.8 (0.6, 1.1) |
| History of neonatal death | 281 (0.4) | 1.0 (0.6, 1.5) |
| History of spontaneous abortion | 18,046 (24.0) | 1.0 (0.9, 1.0) |
a Prepregnancy body mass index and maternal age at birth were modeled using restricted cubic splines
Risk stratification capacity of the original model: observed vs. predicted risk
| Predicted risk (%) | No. of births per stratum (% of sample) | Observed risk, n (%) | Likelihood ratio (95 % CI) |
|---|---|---|---|
| <3.0 | 5,788 (7.7) | 134 (2.3) | 0.3 (0.2–0.3) |
| 3.0–5.5 | 21,654 (28.8) | 876 (4.0) | 0.4 (0.4–0.4) |
| 5.5–12.0 a | 33,178 (44.1) | 2,846 (8.6) | 1.0 (1.0–1.1) |
| 12.0–15.0 | 5,960 (7.9) | 800 (13.4) | 1.7 (1.6–1.8) |
| >15.0 | 8,645 (11.5) | 1,660 (19.2) | 2.7 (2.6–2.9) |
| Total | 75,225 (100.0 %) | 6,313 (8.4) | - |
a Given a baseline risk of 8.4 %, this category is clinically equivalent to the population average risk
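The stratum-specific likelihood ratios in this table can be approximated from the counts alone. The following is a minimal sketch, assuming the usual definition (the share of all cases falling in a stratum divided by the share of all non-cases falling in that stratum); the function name and data layout are illustrative and not taken from the paper. Totals are summed from the stratum rows, so small differences from the published ratios are to be expected, and confidence intervals are not computed.

```python
# Counts copied from the table above: (births in stratum, observed cases).
STRATA = {
    "<3.0": (5788, 134),
    "3.0-5.5": (21654, 876),
    "5.5-12.0": (33178, 2846),
    "12.0-15.0": (5960, 800),
    ">15.0": (8645, 1660),
}


def stratum_likelihood_ratios(strata):
    """Return {stratum: likelihood ratio} from (births, cases) counts.

    Assumed definition: P(stratum | case) / P(stratum | non-case).
    """
    total_births = sum(births for births, _ in strata.values())
    total_cases = sum(cases for _, cases in strata.values())
    total_noncases = total_births - total_cases

    ratios = {}
    for label, (births, cases) in strata.items():
        share_of_cases = cases / total_cases
        share_of_noncases = (births - cases) / total_noncases
        ratios[label] = share_of_cases / share_of_noncases
    return ratios


if __name__ == "__main__":
    for label, lr in stratum_likelihood_ratios(STRATA).items():
        print(f"{label:>10}: LR = {lr:.2f}")
```

Under this definition the middle stratum comes out close to 1, which matches the footnote's reading that it carries no information beyond the population average risk.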
Fig. 1 Discrimination performance (measured by area under the receiver operating characteristic curve) of risk prediction models according to simulated predictor characteristics. The original risk prediction model was augmented with simulated predictors with prevalence from 5 to 40 % and odds ratios ranging from 1 to 16: (a) one added simulated predictor per model; (b) three added simulated predictors per model; (c) five added simulated predictors per model
Fig. 2 Proportion of population classified into a clinically distinct risk group (predicted risk <3.0 % or >15.0 %) from risk prediction models according to simulated predictor characteristics. The original risk prediction model was augmented with simulated predictors with prevalence from 5 to 40 % and odds ratios ranging from 1 to 16: (a) one added simulated predictor per model; (b) three added simulated predictors per model; (c) five added simulated predictors per model
Fig. 3 Histogram of predicted risk for each observation based on the original risk prediction model plus (a) one simulated predictor with an odds ratio of 1.5 and 5 % prevalence, and (b) five simulated predictors with odds ratios of 6 and 40 % prevalence. Green bars indicate a clinically distinct low risk group (predicted risk <3.0 %); brown bars indicate uninformative predicted risk (3.0–15.0 %); blue bars indicate a clinically distinct high risk group (predicted risk >15.0 %)
Fig. 4 Overall model performance (measured by the proportion of variability in the outcome explained by the predictors, or Nagelkerke’s r2) of risk prediction models according to simulated predictor characteristics. The original risk prediction model was augmented with simulated predictors with prevalence from 5 to 40 % and odds ratios ranging from 1 to 16: (a) one added simulated predictor per model; (b) three added simulated predictors per model; (c) five added simulated predictors per model
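The way discrimination depends on the number, prevalence, and odds ratio of predictors can be explored with a small simulation. The sketch below is not the authors' procedure: it omits the original clinical model entirely, assumes independent binary predictors that all share one odds ratio, uses the true linear predictor as the risk score instead of a fitted model, and fixes the risk among women with no simulated predictor at 8.4 %. The function names, sample size, and seed are hypothetical choices made for illustration.

```python
import numpy as np


def auc(scores, y):
    """Area under the ROC curve, counting tied scores as half-concordant."""
    scores = np.asarray(scores, dtype=float)
    y = np.asarray(y, dtype=float)
    # Group observations by distinct score value (sorted ascending).
    _, group = np.unique(scores, return_inverse=True)
    cases = np.bincount(group, weights=y)
    noncases = np.bincount(group, weights=1.0 - y)
    # Non-cases scoring strictly below each group of cases.
    noncases_below = np.cumsum(noncases) - noncases
    concordant = np.sum(cases * (noncases_below + 0.5 * noncases))
    return float(concordant / (cases.sum() * noncases.sum()))


def simulate_auc(n_predictors, prevalence, odds_ratio,
                 n=50_000, reference_risk=0.084, seed=0):
    """Simulate binary predictors and an outcome, then return the AUC.

    Assumption: reference_risk is the risk when no simulated predictor
    is present, so the overall outcome rate rises with the odds ratio.
    """
    rng = np.random.default_rng(seed)
    x = rng.random((n, n_predictors)) < prevalence
    n_present = x.sum(axis=1)
    intercept = np.log(reference_risk / (1.0 - reference_risk))
    logit = intercept + n_present * np.log(odds_ratio)
    risk = 1.0 / (1.0 + np.exp(-logit))
    y = (rng.random(n) < risk).astype(float)
    return auc(logit, y)


if __name__ == "__main__":
    for k in (1, 3, 5):
        for prev in (0.05, 0.20, 0.40):
            for oratio in (2.0, 6.0, 16.0):
                result = simulate_auc(k, prev, oratio)
                print(f"k={k} prevalence={prev:.2f} OR={oratio:>4}: AUC={result:.3f}")
```

Because the original model is left out, the absolute AUC values from this sketch are not comparable to those in the figures; only the direction of the trends is of interest.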
Model performance measures according to odds ratio, number, and prevalence of simulated predictors
| OR of simulated predictors | Number of simulated predictors added to original model | Prevalence of simulated predictors | Proportion of population (%) with informative likelihood ratio a | Proportion of population (%) assigned to clinically distinct risk group b | AUC | Nagelkerke’s r2 (%) |
|---|---|---|---|---|---|---|
| 2 | 3 | 10 % | 0.0 | 27.2 | 0.71 | 10.0 |
| 2 | 3 | 20 % | 0.0 | 29.9 | 0.73 | 11.7 |
| 2 | 3 | 40 % | 0.0 | 34.8 | 0.74 | 12.9 |
| 2 | 5 | 10 % | 0.0 | 28.9 | 0.73 | 11.8 |
| 2 | 5 | 20 % | 0.0 | 33.3 | 0.75 | 14.6 |
| 2 | 5 | 40 % | 0.0 | 39.2 | 0.77 | 16.6 |
| 6 | 3 | 10 % | 0.0 | 54.4 | 0.83 | 29.4 |
| 6 | 3 | 20 % | 63.6 | 63.6 | 0.87 | 37.1 |
| 6 | 3 | 40 % | 70.2 | 70.0 | 0.88 | 37.1 |
| 6 | 5 | 10 % | 66.9 | 66.9 | 0.88 | 40.6 |
| 6 | 5 | 20 % | 72.0 | 72.0 | 0.92 | 50.7 |
| 6 | 5 | 40 % | 73.8 | 73.8 | 0.93 | 51.2 |
a Defined as the proportion of the population classified into a stratum with a likelihood ratio <0.10 or >10.0
b Defined as the proportion of the population classified into a stratum with predicted risk meaningfully different from the baseline rate of pre-eclampsia in the population (<0.03 or >0.15)
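Two of the performance measures in this table can be computed directly from a vector of predicted risks and observed outcomes. The sketch below is illustrative only: the function names are hypothetical, the thresholds are the <0.03 and >0.15 cut-offs from footnote b, and Nagelkerke's r2 is computed with the standard formula (the Cox-Snell statistic rescaled by its maximum attainable value), which presumes the predicted risks come from a maximum likelihood fit.

```python
import numpy as np


def proportion_clinically_distinct(predicted_risk, low=0.03, high=0.15):
    """Share of observations with predicted risk below `low` or above `high`."""
    risk = np.asarray(predicted_risk, dtype=float)
    return float(np.mean((risk < low) | (risk > high)))


def nagelkerke_r2(y, predicted_risk):
    """Nagelkerke's r2 for a binary outcome and predicted probabilities.

    Requires both outcome classes to be present in y. For predictions
    that are not maximum likelihood estimates the value can be negative.
    """
    y = np.asarray(y, dtype=float)
    # Clip to avoid log(0) when predictions are exactly 0 or 1.
    p = np.clip(np.asarray(predicted_risk, dtype=float), 1e-12, 1 - 1e-12)
    n = y.size

    loglik_model = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    p_null = y.mean()  # intercept-only model predicts the overall rate
    loglik_null = n * (p_null * np.log(p_null) + (1 - p_null) * np.log(1 - p_null))

    cox_snell = 1.0 - np.exp(2.0 * (loglik_null - loglik_model) / n)
    max_attainable = 1.0 - np.exp(2.0 * loglik_null / n)
    return float(cox_snell / max_attainable)


if __name__ == "__main__":
    risks = [0.01, 0.05, 0.10, 0.20]
    outcomes = [0, 0, 0, 1]
    print(proportion_clinically_distinct(risks))
    print(nagelkerke_r2(outcomes, risks))
```

Multiplying either result by 100 gives the percentage scale used in the table.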