Oswaldo Gressani, Christel Faes, Niel Hens.
Abstract
The mixture cure model for analyzing survival data is characterized by the assumption that the population under study is divided into a group of subjects who will experience the event of interest over some finite time horizon and another group of cured subjects who will never experience the event irrespective of the duration of follow-up. When using the Bayesian paradigm for inference in survival models with a cure fraction, it is common practice to rely on Markov chain Monte Carlo (MCMC) methods to sample from posterior distributions. Although computationally feasible, the iterative nature of MCMC often implies long sampling times to explore the target space with chains that may suffer from slow convergence and poor mixing. Furthermore, extra effort has to be invested in diagnostic checks to monitor the reliability of the generated posterior samples. This article proposes a sampling-free strategy for fast and flexible Bayesian inference in the mixture cure model that combines Laplace approximations and penalized B-splines (P-splines). A logistic regression model is assumed for the cure proportion and a Cox proportional hazards model with a P-spline approximated baseline hazard is used to specify the conditional survival function of susceptible subjects. Laplace approximations to the conditional posterior of the latent vector are based on analytical formulas for the gradient and Hessian of the log-likelihood, resulting in a substantial speed-up in approximating posterior distributions. The spline specification yields smooth estimates of survival curves, and functions of latent variables, together with their associated credible intervals, are estimated in seconds. A fully stochastic algorithm based on a Metropolis-Langevin-within-Gibbs (MLWG) sampler is also suggested as an alternative to the proposed Laplacian-P-splines mixture cure (LPSMC) methodology. The statistical performance and computational efficiency of LPSMC are assessed in a simulation study.
Results show that LPSMC is an appealing alternative to MCMC for approximate Bayesian inference in standard mixture cure models. Finally, the novel LPSMC approach is illustrated on three applications involving real survival data.
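As an illustration of the model structure described in the abstract, the population survival function of a mixture cure model can be written as S_p(t) = 1 - p(x) + p(x) S_u(t | z), where p(x) is the probability of being susceptible (logistic in x) and S_u is the survival function of the susceptibles (Cox proportional hazards in z). The sketch below is a minimal illustration of that structure, not the authors' LPSMC implementation: the Weibull baseline is an assumed stand-in for the P-spline baseline hazard, and all function names, coefficients, and covariate values are made up for the example.

```python
import math

def incidence(beta, x):
    """Probability of being susceptible (uncured), logistic in x."""
    eta = sum(b * v for b, v in zip(beta, [1.0] + list(x)))  # intercept first
    return 1.0 / (1.0 + math.exp(-eta))

def baseline_survival(t, shape=1.5, scale=2.0):
    """Assumed Weibull baseline S0(t); a stand-in for a P-spline baseline."""
    return math.exp(-((t / scale) ** shape))

def latency_survival(t, gamma, z):
    """Cox proportional hazards: S_u(t | z) = S0(t) ** exp(z' gamma)."""
    risk = math.exp(sum(g * v for g, v in zip(gamma, z)))
    return baseline_survival(t) ** risk

def population_survival(t, beta, x, gamma, z):
    """Mixture cure model: S_p(t) = 1 - p(x) + p(x) * S_u(t | z)."""
    p = incidence(beta, x)
    return 1.0 - p + p * latency_survival(t, gamma, z)

# Illustrative (made-up) coefficients and covariates.
beta, gamma = [0.5, -1.0, 0.8], [-0.2, 0.3]
x = z = [0.4, 1.0]
cure_fraction = 1.0 - incidence(beta, x)
for t in (0.0, 1.0, 5.0, 50.0):
    print(t, round(population_survival(t, beta, x, gamma, z), 4))
```

The population survival starts at 1 and levels off at the cure fraction as follow-up grows, which is the plateau that the right tail of a Kaplan-Meier curve is expected to show under this model.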
Keywords: Approximate Bayesian inference; Laplace approximation; Metropolis-adjusted Langevin algorithm; P-splines; Survival analysis
Year: 2022 PMID: 35699121 PMCID: PMC9542184 DOI: 10.1002/sim.9373
Source DB: PubMed Journal: Stat Med ISSN: 0277-6715 Impact factor: 2.497
FIGURE 1 Approximated (normalized) posterior of the log penalty. The dashed line is the modal value obtained with the bracketing algorithm using a step size
Parameters for the incidence, latency, and censoring rate yielding different cure and censoring levels
| Scenario | | | | | | Cure | | Censoring | Plateau |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.70 | | 0.95 | | 0.25 | | 0.16 | | |
| 2 | 1.25 | | 0.45 | | 0.20 | | 0.05 | | |
Note: The last column indicates the percentage of observations in the plateau located at the right tail of the Kaplan‐Meier curve.
Numerical results for replications of sample size and under two different cure‐censoring scenarios
| Scenario | Parameters | Mean | Bias | ESE | RMSE | | |
|---|---|---|---|---|---|---|---|
| Scenario 1 ( | | 0.721 (0.743) | 0.021 (0.043) | 0.249 (0.274) | 0.250 (0.277) | 91.0 (88.4) | 96.4 (94.4) |
| | | | | 0.240 (0.255) | 0.242 (0.267) | 91.6 (89.8) | 94.8 (94.2) |
| | | 0.954 (1.044) | 0.004 (0.094) | 0.390 (0.407) | 0.390 (0.417) | 90.6 (90.4) | 94.2 (95.4) |
| | | | | 0.092 (0.094) | 0.092 (0.094) | 89.6 (87.8) | 94.4 (94.8) |
| | | 0.242 (0.234) | | 0.184 (0.188) | 0.184 (0.188) | 89.0 (89.2) | 96.4 (94.8) |
| Scenario 1 ( | | 0.703 (0.715) | 0.003 (0.015) | 0.184 (0.181) | 0.184 (0.181) | 89.6 (90.0) | 95.6 (95.0) |
| | | | | 0.167 (0.172) | 0.167 (0.176) | 90.2 (88.2) | 95.0 (93.8) |
| | | 0.950 (0.978) | 0.000 (0.028) | 0.269 (0.280) | 0.268 (0.281) | 90.8 (88.4) | 95.0 (93.8) |
| | | | | 0.064 (0.065) | 0.064 (0.065) | 89.6 (89.0) | 94.8 (93.8) |
| | | 0.254 (0.247) | 0.004 ( | 0.127 (0.123) | 0.127 (0.123) | 90.8 (91.4) | 96.0 (95.2) |
| Scenario 2 ( | | 1.281 (1.275) | 0.031 (0.025) | 0.229 (0.239) | 0.230 (0.240) | 91.4 (91.0) | 95.6 (94.2) |
| | | | | 0.182 (0.197) | 0.183 (0.200) | 91.0 (88.2) | 95.4 (93.2) |
| | | 0.430 (0.480) | | 0.330 (0.333) | 0.330 (0.334) | 89.0 (90.2) | 94.8 (95.0) |
| | | | | 0.074 (0.075) | 0.074 (0.075) | 88.8 (90.2) | 94.4 (95.0) |
| | | 0.194 (0.196) | | 0.151 (0.157) | 0.151 (0.157) | 87.2 (87.8) | 94.8 (93.4) |
| Scenario 2 ( | | 1.248 (1.270) | | 0.160 (0.164) | 0.160 (0.165) | 91.4 (90.8) | 95.6 (95.8) |
| | | | 0.002 ( | 0.129 (0.132) | 0.129 (0.133) | 89.8 (88.6) | 95.2 (94.4) |
| | | 0.459 (0.467) | 0.009 (0.017) | 0.223 (0.238) | 0.223 (0.239) | 91.0 (89.8) | 95.6 (94.4) |
| | | | 0.000 ( | 0.054 (0.052) | 0.054 (0.052) | 88.0 (89.6) | 95.6 (94.0) |
| | | 0.199 (0.195) | | 0.105 (0.111) | 0.105 (0.111) | 89.6 (87.8) | 95.0 (92.6) |
Note: Results in parentheses are for the MLWG sampler with chain length 10 000 and burn‐in 3000.
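The Mean, Bias, ESE, and RMSE columns are standard Monte Carlo performance measures. A minimal sketch of how such quantities are typically computed from replicated estimates is given below. The function name, the input format, and the toy numbers are assumptions, and the paper's exact conventions (for example the divisor used for the ESE) are not stated in this record. With the population form of the standard deviation, RMSE² = Bias² + ESE², which agrees with the first row of the table: sqrt(0.021² + 0.249²) ≈ 0.250.

```python
import math

def mc_summary(estimates, intervals, truth):
    """Mean, bias, ESE, RMSE and coverage (%) over simulation replicates.

    estimates: list of point estimates, one per replicate
    intervals: list of (lower, upper) credible interval bounds
    truth:     true parameter value used to generate the data
    """
    s = len(estimates)
    mean = sum(estimates) / s
    bias = mean - truth
    # Empirical standard error: spread of the estimates around their mean
    # (population form with divisor s; this divisor is an assumption).
    ese = math.sqrt(sum((e - mean) ** 2 for e in estimates) / s)
    # Root mean square error: spread of the estimates around the truth.
    rmse = math.sqrt(sum((e - truth) ** 2 for e in estimates) / s)
    # Coverage: share of intervals that contain the true value.
    covered = sum(lo <= truth <= hi for lo, hi in intervals)
    return {"mean": mean, "bias": bias, "ese": ese, "rmse": rmse,
            "coverage": 100.0 * covered / s}

# Toy replicates (made-up numbers, not taken from the paper).
est = [0.65, 0.72, 0.80, 0.69, 0.74]
ci = [(0.30, 1.00), (0.40, 1.05), (0.71, 1.20), (0.35, 1.02), (0.45, 1.10)]
out = mc_summary(est, ci, truth=0.70)
print(out)
```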
FIGURE 2 Estimated baseline survival curves (gray) for replications under different scenarios with LPSMC. The black curve is the target baseline and the dashed curve is the pointwise median of the 500 gray curves
FIGURE 3 Boxplots of the ASE for the incidence in different scenarios
Estimated coverage probability of and credible intervals for and with replications of sample size and B‐splines at selected quantiles
| | Nominal | Scenario | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| Baseline | 90 (LPSMC) | 1 | 88.5 | 86.5 | 90.0 | 92.0 | 93.0 | 93.5 | 81.0 | 41.5 |
| | 90 (MLWG) | 1 | 86.0 | 84.0 | 81.0 | 81.0 | 88.0 | 90.5 | 85.5 | 61.5 |
| | 95 (LPSMC) | 1 | 93.5 | 94.0 | 94.5 | 94.5 | 95.0 | 95.5 | 90.0 | 49.0 |
| | 95 (MLWG) | 1 | 92.0 | 88.5 | 88.5 | 90.5 | 93.5 | 95.0 | 92.5 | 74.5 |
| | 90 (LPSMC) | 2 | 85.0 | 83.5 | 86.0 | 88.0 | 88.5 | 89.0 | 70.5 | 18.5 |
| | 90 (MLWG) | 2 | 84.0 | 80.5 | 82.0 | 86.0 | 87.5 | 90.0 | 79.5 | 40.5 |
| | 95 (LPSMC) | 2 | 93.0 | 92.5 | 91.0 | 94.5 | 94.5 | 92.0 | 79.0 | 30.5 |
| | 95 (MLWG) | 2 | 93.0 | 90.0 | 90.0 | 92.0 | 95.5 | 94.0 | 87.0 | 52.5 |
| Uncured | 90 (LPSMC) | 1 | 88.5 | 89.5 | 88.5 | 89.5 | 94.0 | 91.5 | 77.5 | 22.5 |
| | 90 (MLWG) | 1 | 82.0 | 76.0 | 75.5 | 79.5 | 86.0 | 88.5 | 81.5 | 51.0 |
| | 95 (LPSMC) | 1 | 96.0 | 93.5 | 95.0 | 96.5 | 98.0 | 97.0 | 85.5 | 34.0 |
| | 95 (MLWG) | 1 | 89.0 | 83.5 | 84.5 | 87.0 | 93.0 | 95.5 | 90.0 | 60.5 |
| | 90 (LPSMC) | 2 | 87.0 | 81.5 | 85.5 | 90.0 | 88.5 | 87.0 | 68.0 | 10.0 |
| | 90 (MLWG) | 2 | 74.5 | 76.0 | 79.0 | 88.0 | 90.5 | 89.5 | 75.5 | 27.0 |
| | 95 (LPSMC) | 2 | 91.5 | 90.0 | 91.0 | 93.0 | 95.0 | 93.0 | 78.0 | 17.5 |
| | 95 (MLWG) | 2 | 85.0 | 83.5 | 89.0 | 92.5 | 96.0 | 94.0 | 82.0 | 42.5 |
Note: Results for the MLWG sampler with chain length 10 000 and burn‐in 3000 are also shown.
Empirical computation times for LPSMC and MLWG (with chain length 500) for different combinations of sample size and B‐spline dimension
| Method | | | | | | |
|---|---|---|---|---|---|---|
| LPSMC | | 0.289 | 0.333 | 0.572 | 0.777 | 0.970 |
| | | 0.466 | 0.549 | 0.889 | 1.238 | 1.517 |
| | | 1.042 | 1.110 | 1.848 | 3.057 | 4.164 |
| | | 2.214 | 2.614 | 3.733 | 5.041 | 6.294 |
| MLWG | | 1.561 | 1.808 | 2.514 | 3.250 | 3.801 |
| | | 1.943 | 2.157 | 2.978 | 3.835 | 4.486 |
| | | 2.818 | 3.065 | 4.588 | 5.997 | 7.422 |
| | | 5.311 | 5.932 | 7.572 | 9.417 | 10.982 |
Note: Results correspond to average wall clock times (elapsed real times) in seconds over replications.
FIGURE 4 Kaplan‐Meier estimated survival for the e1684 ECOG dataset. A cross indicates a censored observation
Estimation results for the e1684 dataset with LPSMC, MLWG, and smcure
| Model | Parameter | Estimate | SD | CI |
|---|---|---|---|---|
| LPSMC (incidence) | | 1.219 | 0.244 | [0.819; 1.620] |
| | | | 0.284 | [ |
| | | | 0.281 | [ |
| | | 0.016 | 0.011 | [ |
| MLWG (incidence) | | 1.355 | 0.375 | [0.884; 1.949] |
| | | | 0.329 | [ |
| | | | 0.325 | [ |
| | | 0.019 | 0.016 | [ |
| smcure (incidence) | | 1.365 | 0.329 | [0.823; 1.906] |
| | | | 0.333 | [ |
| | | | 0.343 | [ |
| | | 0.020 | 0.016 | [ |
| LPSMC (latency) | | 0.092 | 0.170 | [ |
| | | | 0.169 | [ |
| | | | 0.006 | [ |
| MLWG (latency) | | 0.058 | 0.183 | [ |
| | | | 0.188 | [ |
| | | | 0.006 | [ |
| smcure (latency) | | 0.099 | 0.175 | [ |
| | | | 0.177 | [ |
| | | | 0.007 | [ |
Note: The first column indicates the model part, the second and third columns the parameter and its estimate. The fourth column is the (posterior) SD of the estimate and the last column the confidence/credible interval.
FIGURE 5 Kaplan‐Meier estimated survival for the breast cancer data. A cross indicates a censored observation
Estimation results for the breast cancer data with LPSMC
| Model | Parameter | Estimate | SD | CI |
|---|---|---|---|---|
| (Incidence) | | | 0.576 | [ |
| | | | 0.010 | [ |
| | | 0.181 | 0.281 | [ |
| (Latency) | | | 0.008 | [ |
| | | | 0.245 | [ |
Note: The first column indicates the model part, the second and third columns the parameter and its posterior (mean) estimate. The fourth column is the posterior SD of the estimate and the last column the credible interval.
FIGURE 6 Estimated survival of the susceptibles for two age categories and
FIGURE 7 Kaplan‐Meier estimated survival for the ZNA data. A cross indicates a censored observation
Estimation results for the ZNA data with LPSMC and MLWG
| Model | Parameter | Estimate | SD | CI |
|---|---|---|---|---|
| LPSMC (incidence) | | | 0.791 | [ |
| | | 0.031 | 0.010 | [0.011; 0.051] |
| MLWG (incidence) | | | 0.780 | [ |
| | | 0.029 | 0.010 | [0.009; 0.049] |
| LPSMC (latency) | | 0.011 | 0.007 | [ |
| MLWG (latency) | | 0.012 | 0.007 | [ |
Note: The first column indicates the model part, the second and third columns the parameter and its posterior (mean) estimate. The fourth column is the posterior SD of the estimate and the last column the credible interval.
Estimation of the cure proportion for different age categories and associated approximate credible interval
| Age | Cure proportion | CI |
|---|---|---|
| 20 | 0.879 | [0.682; 0.958] |
| 30 | 0.843 | [0.658; 0.932] |
| 50 | 0.742 | [0.607; 0.837] |
| 80 | 0.532 | [0.466; 0.592] |
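In a mixture cure model with logistic incidence, the cure proportion at a given covariate value is one minus the probability of being susceptible. The sketch below illustrates this for a single age covariate. The slope 0.031 is the second LPSMC incidence coefficient in the ZNA table, assumed here to be the age effect. The intercept did not survive in that table, so the value -2.60 is back-solved from the age-20 row and should be read as an assumption, as should the function name. The output only approximately tracks the tabulated estimates, which may be based on posterior summaries rather than this plug-in formula.

```python
import math

def cure_proportion(age, beta0, beta_age):
    """Cure proportion 1 - p(age), with p(age) = logistic(beta0 + beta_age * age)."""
    eta = beta0 + beta_age * age
    # 1 - 1 / (1 + exp(-eta)) simplifies to 1 / (1 + exp(eta)).
    return 1.0 / (1.0 + math.exp(eta))

BETA_AGE = 0.031   # reported incidence coefficient (assumed to be the age effect)
BETA0 = -2.60      # assumption: back-solved from the age-20 row, not reported here

for age in (20, 30, 50, 80):
    print(age, round(cure_proportion(age, BETA0, BETA_AGE), 3))
```

Because the age coefficient is positive on the incidence (susceptibility) scale, the cure proportion decreases with age, as in the table.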
Metropolis‐Langevin‐within‐Gibbs algorithm to explore
| 1: Set |
| 2: |
| 3: ( |
| 4: Compute the Langevin diffusion: |
| 5: Generate proposal: |
| 6: Compute the acceptance probability |
| 7: Draw |
| 8: |
| 9: ( |
| 10: Draw |
| 11: Draw |
| 12: ( |
| 13: Set |
| 14: |
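The formulas in the algorithm box above did not survive in this record, so the following is only a generic sketch of a Metropolis-Langevin-within-Gibbs scheme on a toy normal model, not the authors' sampler for the mixture cure model. It mirrors the surviving step labels: a Langevin drift, a proposal, an acceptance probability, a uniform draw, and then direct draws of the remaining parameter from its full conditional. The model, priors, step size, and all names are assumptions chosen for illustration.

```python
import math
import random

def mlwg(y, n_iter=4000, burn_in=1000, step=0.005, a=1.0, b=1.0, seed=1):
    """Metropolis-Langevin-within-Gibbs sketch for y_i ~ N(mu, 1/tau).

    Priors (assumed): mu ~ N(0, 10^2), tau ~ Gamma(a, rate=b).
    mu is updated by a Metropolis-adjusted Langevin step, tau by a Gibbs draw.
    """
    rng = random.Random(seed)
    n, sum_y = len(y), sum(y)

    def log_post(mu, tau):      # log p(mu | tau, y) up to a constant
        return -0.5 * tau * sum((v - mu) ** 2 for v in y) - 0.5 * mu ** 2 / 100.0

    def grad(mu, tau):          # derivative of log_post with respect to mu
        return tau * (sum_y - n * mu) - mu / 100.0

    def log_q(to, frm, tau):    # log density of proposing `to` from `frm`
        drift = frm + 0.5 * step * grad(frm, tau)
        return -((to - drift) ** 2) / (2.0 * step)

    mu, tau, draws, accepted = 0.0, 1.0, [], 0
    for it in range(n_iter):
        # Langevin proposal: gradient drift plus Gaussian noise.
        prop = mu + 0.5 * step * grad(mu, tau) + math.sqrt(step) * rng.gauss(0, 1)
        log_alpha = (log_post(prop, tau) - log_post(mu, tau)
                     + log_q(mu, prop, tau) - log_q(prop, mu, tau))
        # Accept or reject; 1 - random() lies in (0, 1], so the log is defined.
        if math.log(1.0 - rng.random()) < log_alpha:
            mu, accepted = prop, accepted + 1
        # Gibbs step: conjugate Gamma full conditional for the precision.
        rate = b + 0.5 * sum((v - mu) ** 2 for v in y)
        tau = rng.gammavariate(a + 0.5 * n, 1.0 / rate)  # second argument is a scale
        if it >= burn_in:
            draws.append((mu, tau))
    return draws, accepted / n_iter

data_rng = random.Random(0)
y = [data_rng.gauss(2.0, 1.0) for _ in range(200)]   # simulated toy data
draws, acc_rate = mlwg(y)
post_mean_mu = sum(d[0] for d in draws) / len(draws)
print(round(post_mean_mu, 3), round(acc_rate, 2))
```

The step size here is tuned by hand for the toy data. In practice it would need tuning or adaptation, and the chain would still require the convergence and mixing checks that the abstract identifies as a cost of MCMC.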