Menelaos Pavlou, Gareth Ambler, Shaun Seaman, Maria De Iorio, Rumana Z Omar.
Abstract
Risk prediction models are used to predict a clinical outcome for patients using a set of predictors. We focus on predicting low-dimensional binary outcomes typically arising in epidemiology, health services and public health research where logistic regression is commonly used. When the number of events is small compared with the number of regression coefficients, model overfitting can be a serious problem. An overfitted model tends to demonstrate poor predictive accuracy when applied to new data. We review frequentist and Bayesian shrinkage methods that may alleviate overfitting by shrinking the regression coefficients towards zero (some methods can also provide more parsimonious models by omitting some predictors). We evaluated their predictive performance in comparison with maximum likelihood estimation using real and simulated data. The simulation study showed that maximum likelihood estimation tends to produce overfitted models with poor predictive performance in scenarios with few events, and penalised methods can offer improvement. Ridge regression performed well, except in scenarios with many noise predictors. Lasso performed better than ridge in scenarios with many noise predictors and worse in the presence of correlated predictors. Elastic net, a hybrid of the two, performed well in all scenarios. Adaptive lasso and smoothly clipped absolute deviation performed best in scenarios with many noise predictors; in other scenarios, their performance was inferior to that of ridge and lasso. Bayesian approaches performed well when the hyperparameters for the priors were chosen carefully. Their use may aid variable selection, and they can be easily extended to clustered-data settings and to incorporate external information.
Keywords: Bayesian regularisation; overfitting; rare events; shrinkage
Year: 2015 PMID: 26514699 PMCID: PMC4982098 DOI: 10.1002/sim.6782
Source DB: PubMed Journal: Stat Med ISSN: 0277-6715 Impact factor: 2.373
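As a rough illustration of the shrinkage idea described in the abstract, the sketch below fits a logistic regression by plain gradient ascent, once without a penalty (approximating the MLE) and once with a ridge penalty on the slopes. The toy data, the function name `fit_logistic`, the penalty weight and the optimiser settings are all hypothetical choices made for this example; they are not taken from the paper.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, ridge_lambda=0.0, lr=0.5, n_iter=5000):
    """Maximise log-likelihood minus ridge_lambda * sum(beta_j ** 2).

    Plain gradient ascent; the intercept is left unpenalised.
    ridge_lambda = 0 gives (approximately) the MLE.
    """
    n, p = len(X), len(X[0])
    intercept, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        grad0, grad = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            linpred = intercept + sum(b * x for b, x in zip(beta, xi))
            resid = yi - sigmoid(linpred)
            grad0 += resid
            for j in range(p):
                grad[j] += resid * xi[j]
        intercept += lr * grad0 / n
        for j in range(p):
            # The ridge term pulls each slope towards zero.
            beta[j] += lr * (grad[j] - 2.0 * ridge_lambda * beta[j]) / n
    return intercept, beta

# Hypothetical toy data: 12 patients, 2 predictors, 5 events.
X = [(-1.5, 0.3), (-1.0, -0.8), (-0.7, 1.1), (-0.4, -0.2),
     (-0.1, 0.6), (0.0, -1.2), (0.3, 0.9), (0.5, -0.5),
     (0.8, 0.1), (1.0, 1.3), (1.4, -0.9), (1.8, 0.4)]
y = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1]

_, beta_mle = fit_logistic(X, y)
_, beta_ridge = fit_logistic(X, y, ridge_lambda=1.0)

print("MLE:  ", [round(b, 2) for b in beta_mle])
print("Ridge:", [round(b, 2) for b in beta_ridge])
```

Replacing the squared term with an absolute-value penalty would give the lasso, which can set coefficients exactly to zero; that variant needs a different optimiser and is not shown here.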
Penile cancer case study: standardised regression coefficient estimates and percentage shrinkage (in parentheses) compared with MLE.
| | MLE | RIDGE | LASSO | ENET | ALASSO | SCAD | BLASSO | SSVS ( | SSVS ( |
|---|---|---|---|---|---|---|---|---|---|
| age | 1.03 | 0.60 (42) | 0.56 (46) | 0.58 (44) | 0.58 (44) | 0.31 (70) | 0.75 (28) | 0.60 (42) | 0.89 (14) |
| depthin | 0.85 | 0.54 (37) | 0.60 (29) | 0.59 (30) | 0.69 (19) | 0.44 (48) | 0.67 (21) | 0.58 (31) | 0.88 (−4) |
| Ki67 | −0.20 | −0.06 (70) | 0.00 (100) | 0.00 (100) | 0.00 (100) | 0.00 (100) | 0.00 (100) | −0.01 (93) | 0.00 (100) |
| Mcm2 | −0.58 | −0.01 (98) | 0.00 (100) | 0.00 (100) | 0.00 (100) | 0.00 (100) | 0.00 (100) | 0.01 (98) | 0.00 (100) |
| Ki67‐g95 | 0.70 | 0.17 (75) | 0.00 (100) | 0.00 (100) | 0.00 (100) | 0.00 (100) | 0.00 (100) | 0.12 (82) | 0.00 (100) |
| lymphnode | 1.34 | 0.91 (32) | 1.01 (25) | 1.00 (25) | 1.12 (16) | 1.27 (5) | 1.17 (13) | 0.95 (29) | 1.32 (2) |
| vascinv | 0.35 | 0.23 (34) | 0.08 (79) | 0.10 (72) | 0.00 (100) | 0.00 (100) | 0.00 (100) | 0.18 (51) | 0.00 (100) |
| extent | 0.36 | 0.24 (33) | 0.13 (65) | 0.15 (59) | 0.00 (100) | 0.01 (97) | 0.00 (100) | 0.19 (49) | 0.00 (100) |
| ploidy | 0.71 | 0.37 (49) | 0.22 (69) | 0.25 (65) | 0.08 (89) | 0.00 (100) | 0.39 (45) | 0.30 (57) | 0.38 (46) |
For SSVS, c = 10 was the value selected using cross‐validation. MLE, maximum likelihood estimation; BE, backwards elimination; LSF, linear shrinkage factor; ENET, elastic net; ALASSO, adaptive lasso; BLASSO, Bayesian lasso; SSVS, stochastic search variable selection; SCAD, smoothly clipped absolute deviation.
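The percentage shrinkage in parentheses appears to be the relative reduction of each estimate compared with the MLE. The sketch below assumes that definition (the formula and the function name `percentage_shrinkage` are not stated in the source) and applies it to the `age` row. Because the table shows rounded coefficients, recomputed percentages for other rows may differ from the printed ones by a point.

```python
def percentage_shrinkage(mle, shrunk):
    """Relative reduction (in %) of a shrunken estimate compared with the MLE.

    Assumed definition: 100 * (1 - shrunk / mle). A negative value means
    the estimate is larger in magnitude than the MLE.
    """
    return 100.0 * (1.0 - shrunk / mle)

# Values copied from the 'age' row of the penile cancer table.
age_mle = 1.03
age_estimates = {"RIDGE": 0.60, "LASSO": 0.56, "ENET": 0.58}

for method, estimate in age_estimates.items():
    print(method, round(percentage_shrinkage(age_mle, estimate)))
```

Under this definition a coefficient set exactly to zero has 100% shrinkage, which matches the entries for predictors dropped by the lasso-type methods.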
Figure 1. Performance measures for the penile cancer data: calibration, discrimination and predictive accuracy for EPV = 3 (left) or 5 (right). The number on top of each graph is the median number of predictors selected by each method. The red horizontal line is the median value for MLE. The blue horizontal line is the optimal calibration slope. EPV, events per variable; N, number of observations; RPMSE, root predictive mean squared error; MLE, maximum likelihood estimation; BE, backwards elimination; LSF, linear shrinkage factor; ENET, elastic net; ALASSO, adaptive lasso; BLASSO, Bayesian lasso; SSVS, stochastic search variable selection; MACE, Monte Carlo simulation error (for the median). The number of data sets (for each method) where the calibration slope could not be estimated for EPV = 3 was as follows: LSF: 2; Ridge: 1; Lasso: 1; ENET: 2; BLASSO: 3; ALASSO: 1; SCAD: 2; SSVS: 0.
Figure 2. (a) Distribution of coverage probabilities of out-of-sample predictions (MLE, ridge, lasso, Bayesian lasso and SSVS) for 128 patients for the penile cancer data when EPV = 5. (b), (c) Coverage probability versus true risks. Dashed lines (for all plots) show the nominal coverage (90%). Dotted vertical lines for plots (b) and (c) show the 15th/50th/85th percentiles of the true risks. MLE, maximum likelihood estimation; SSVS, stochastic search variable selection.
Figure 3. Percentage median bias of the estimates for each of the six coefficients for the penile cancer data with EPV = 3: frequentist (left) and Bayesian (right) shrinkage methods and MLE. The numbers on top of each graph are the true values of the six regression coefficients. MLE, maximum likelihood estimation; ENET, elastic net; ALASSO, adaptive lasso; BLASSO, Bayesian lasso; SSVS, stochastic search variable selection; SCAD, smoothly clipped absolute deviation.
Modelling the probability of a thromboembolic event.
| | MLE | BE | RIDGE | LASSO | ENET | ALASSO | SCAD | BLASSO | SSVS |
|---|---|---|---|---|---|---|---|---|---|
| Intercept | −3.29 | −3.27 | −3.26 | −3.26 | −3.27 | −3.36 | −3.31 | −3.23 | −3.25 |
| age | 0.26 | 0.27 | 0.15 | 0.13 | 0.17 | 0.05 | 0.10 | 0.15 | 0.15 |
| la diameter | 0.36 | 0.35 | 0.24 | 0.30 | 0.26 | 0.43 | 0.39 | 0.25 | 0.31 |
| mwt | 0.49 | 0.47 | 0.19 | 0.21 | 0.23 | 0.07 | 0.21 | 0.21 | 0.25 |
| mwt2 | −0.34 | −0.32 | −0.15 | −0.16 | −0.18 | −0.07 | −0.15 | −0.18 | −0.21 |
| peak lvot | −0.10 | 0.00 | −0.03 | 0.00 | −0.04 | 0.00 | 0.00 | 0.00 | 0.00 |
| fs | −0.13 | 0.00 | −0.05 | 0.00 | −0.07 | 0.00 | 0.00 | 0.00 | −0.05 |
| lvef | −0.16 | 0.00 | −0.07 | −0.01 | −0.08 | 0.00 | 0.00 | 0.00 | −0.08 |
| af | 0.14 | 0.16 | 0.13 | 0.11 | 0.14 | 0.07 | 0.12 | 0.13 | 0.10 |
| stroke history | 0.22 | 0.22 | 0.17 | 0.17 | 0.18 | 0.19 | 0.20 | 0.17 | 0.16 |
| female | −0.18 | −0.20 | −0.10 | −0.09 | −0.11 | −0.08 | −0.11 | −0.10 | −0.10 |
| NYHA class II | 0.13 | 0.00 | 0.09 | 0.02 | 0.10 | 0.00 | 0.00 | 0.00 | 0.07 |
| NYHA class III/IV | 0.10 | 0.00 | 0.08 | 0.01 | 0.09 | 0.00 | 0.00 | 0.00 | 0.05 |
| vascular disease | −0.07 | 0.00 | −0.03 | 0.00 | −0.03 | 0.00 | 0.00 | 0.00 | 0.00 |
| hypertension | −0.20 | −0.22 | −0.10 | −0.06 | −0.12 | −0.01 | −0.05 | −0.11 | −0.11 |
| diabetes | −0.12 | 0.00 | −0.06 | −0.01 | −0.07 | 0.00 | 0.00 | 0.00 | −0.07 |
| Number of predictors retained | 15 | 8 | 15 | 12 | 15 | 8 | 8 | 8 | 13 |
Standardised coefficients of logistic regression model estimated by MLE, BE and the shrinkage methods using a sample of 2082 patients and 75 events (EPV = 5). MLE, maximum likelihood estimation; BE, backwards elimination; LSF, linear shrinkage factor; ENET, elastic net; ALASSO, adaptive lasso; BLASSO, Bayesian lasso; SSVS, stochastic search variable selection; SCAD, smoothly clipped absolute deviation.
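The EPV quoted in this caption can be checked with simple arithmetic. The sketch below assumes that EPV is the number of events divided by the number of regression coefficients excluding the intercept; this matches the 15 predictor rows of the table, but the helper name `events_per_variable` is only illustrative.

```python
def events_per_variable(n_events, n_coefficients):
    """Events per variable: events divided by the number of
    regression coefficients (intercept not counted, by assumption)."""
    return n_events / n_coefficients

# Figures from the caption: 75 events, 15 candidate predictors.
epv = events_per_variable(75, 15)
print(epv)
```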
External validation of the models for the probability of a thromboembolic event.
| Performance measure | Calibration slope (s.e.) | C‐statistic (s.e.) | Brier score |
|---|---|---|---|
| MLE | 0.78 (0.12) | 0.704 (0.025) | 0.03396 |
| BE | 0.85 (0.13) | 0.703 (0.025) | 0.03381 |
| RIDGE | 1.27 (0.18) | 0.724 (0.024) | 0.03357 |
| LASSO | 1.23 (0.17) | 0.718 (0.024) | 0.03359 |
| ENET | 1.22 (0.17) | 0.723 (0.024) | 0.03358 |
| ALASSO | 1.07 (0.16) | 0.715 (0.026) | 0.03361 |
| SCAD | 1.07 (0.16) | 0.697 (0.026) | 0.03370 |
| BLASSO | 1.26 (0.17) | 0.723 (0.024) | 0.03360 |
| SSVS | 1.19 (0.17) | 0.718 (0.024) | 0.03359 |
Performance measures were evaluated on a validation data set with 2739 patients and 97 events. MLE, maximum likelihood estimation; BE, backwards elimination; LSF, linear shrinkage factor; ENET, elastic net; ALASSO, adaptive lasso; BLASSO, Bayesian lasso; SSVS, stochastic search variable selection; SCAD, smoothly clipped absolute deviation.
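The three measures in this table can be computed from observed outcomes and predicted risks. The sketch below is one common way of doing so, not necessarily how the authors computed them: the Brier score is the mean squared difference between outcome and predicted risk, the C-statistic is the proportion of event/non-event pairs ranked correctly (ties count one half), and the calibration slope is the slope from a logistic regression of the outcome on the logit of the predicted risk, fitted here by Newton-Raphson. The data and all function names are hypothetical.

```python
import math

def brier_score(y, p):
    """Mean squared difference between outcome (0/1) and predicted risk."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)

def c_statistic(y, p):
    """Share of event/non-event pairs where the event has the higher risk."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    score = sum(1.0 if e > n else 0.5 if e == n else 0.0
                for e in events for n in nonevents)
    return score / (len(events) * len(nonevents))

def calibration_slope(y, p, n_iter=25):
    """Slope b in logit P(y = 1) = a + b * logit(p), by Newton-Raphson."""
    lp = [math.log(pi / (1.0 - pi)) for pi in p]
    a, b = 0.0, 1.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, x in zip(y, lp):
            mu = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = mu * (1.0 - mu)
            g0 += yi - mu
            g1 += (yi - mu) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = gradient.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return b

# Hypothetical validation data: 10 patients, 4 events.
y = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]
p = [0.05, 0.10, 0.15, 0.20, 0.30, 0.40, 0.55, 0.60, 0.75, 0.90]

brier = brier_score(y, p)
c_stat = c_statistic(y, p)
slope = calibration_slope(y, p)
print(round(brier, 3), round(c_stat, 3), round(slope, 2))
```

A calibration slope of 1 is the ideal value. A slope below 1, as for MLE in the table, indicates predictions that are too extreme, while a slope above 1 indicates predictions that are not extreme enough.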
Figure 4. Calibration plot for the thromboembolism data. The numbers of patients are 983, 946, 574 and 236 for risk groups 1 to 4, respectively. MLE, maximum likelihood estimation; SSVS, stochastic search variable selection.
Recommended methods according to their predictive performance in data sets with few events.
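| Method | No noise predictors | Noise predictors: non-sparse | Noise predictors: sparse | Correlated predictors |
|---|---|---|---|---|
| LSF | [mark lost] | [mark lost] | × | × |
| BE | × | × | × | × |
| RIDGE | [mark lost] | [mark lost] | × | [mark lost] |
| LASSO | × | [mark lost] | [mark lost] | [mark lost] |
| ENET | [mark lost] | [mark lost] | [mark lost] | [mark lost] |
| ALASSO | × | [mark lost] | [mark lost] | × |
| SCAD | × | [mark lost] | [mark lost] | × |
| BLASSO (b) | [mark lost] | [mark lost] | [mark lost] | [mark lost] |
| SSVS (c) | [mark lost] | [mark lost] | [mark lost] | [mark lost] |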
Performed best in simulations.
Performed well but was not the best method in simulations.
× Not recommended for the particular scenario.
(a) Tends to overfit the model in the presence of weak effects.
(b) Sensitive to the threshold selection for hard shrinkage.
(c) When the spike and slab variances are chosen appropriately using cross‐validation.
MLE, maximum likelihood estimation; BE, backwards elimination; LSF, linear shrinkage factor; ENET, elastic net; ALASSO, adaptive lasso; BLASSO, Bayesian lasso; SSVS, stochastic search variable selection; SCAD, smoothly clipped absolute deviation.