Haro Aida1, Kenichi Hayashi2, Ayano Takeuchi3, Daisuke Sugiyama4, Tomonori Okamura3.
Abstract
Survival analysis is a set of methods for statistical inference concerning the time until the occurrence of an event. One of the main objectives of survival analysis is to evaluate the effects of different covariates on event time. Although the proportional hazards model is widely used in survival analysis, it assumes that the ratio of the hazard functions is constant over time. This assumption is likely to be violated in practice, leading to erroneous inferences and inappropriate conclusions. The accelerated failure time model is an alternative to the proportional hazards model that does not require such a strong assumption. Moreover, it is sometimes plausible to consider the existence of cured patients or long-term survivors. The survival regression models in such contexts are referred to as cure models. In this study, we consider the accelerated failure time cure model with frailty for uncured patients. Frailty is a latent random variable representing patients' characteristics that cannot be described by observed covariates. This enables us to flexibly account for individual heterogeneities. Our proposed model assumes a shifted gamma distribution for frailty to represent uncured patients' heterogeneities. We construct an estimation algorithm for the proposed model, and evaluate its performance via numerical simulations. Furthermore, as an application of the proposed model, we use a real dataset, Specific Health Checkups, concerning the onset of hypertension. Results from a model comparison suggest that the proposed model is superior to existing alternatives.
Keywords: accelerated failure time model; cure model; epidemiological research; frailty model; survival data analysis
Year: 2022 PMID: 35893205 PMCID: PMC9332026 DOI: 10.3390/healthcare10081383
Source DB: PubMed Journal: Healthcare (Basel) ISSN: 2227-9032
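The abstract combines three ingredients: a cured fraction, an accelerated failure time (AFT) structure for uncured patients, and a shifted gamma frailty. The following is a minimal simulation sketch of how such data could be generated, not the authors' estimation algorithm. Every modelling detail is an assumption made for illustration: the logistic link for the uncured probability, the unit-exponential baseline, the frailty acting as a power on the baseline survival function, the single covariate, and all names and parameter values.

```python
import numpy as np


def simulate_cure_aft_frailty(n, beta, gamma, shape, scale, shift,
                              censor_time, seed=0):
    """Simulate right-censored data from an illustrative mixture cure AFT model.

    All modelling choices below are assumptions for illustration only.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)  # one hypothetical covariate

    # Uncured probability through a logistic link (assumed).
    p_uncured = 1.0 / (1.0 + np.exp(-(gamma[0] + gamma[1] * x)))
    uncured = rng.random(n) < p_uncured

    # Shifted gamma frailty: a gamma variate moved to the right by `shift`.
    w = shift + rng.gamma(shape, scale, size=n)

    # Baseline survival S0(t) = exp(-t); solving S0(t)**w = u gives t = -log(u) / w.
    u = 1.0 - rng.random(n)  # uniform on (0, 1], so log(u) is finite
    baseline_time = -np.log(u) / w

    # AFT structure: covariates rescale time by exp(beta0 + beta1 * x).
    event_time = np.where(uncured,
                          baseline_time * np.exp(beta[0] + beta[1] * x),
                          np.inf)  # cured subjects never experience the event

    time = np.minimum(event_time, censor_time)  # administrative censoring
    event = event_time <= censor_time
    return x, time, event, uncured


x, time, event, uncured = simulate_cure_aft_frailty(
    n=2000, beta=(0.5, -0.5), gamma=(1.0, 0.5),
    shape=2.0, scale=0.5, shift=0.1, censor_time=20.0)
```

In this sketch cured subjects are always censored at `censor_time`, which is what produces the plateau in the survival curve that motivates cure models.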
Selected related literature and elements of our proposed method, showing how the present study relates to existing studies. The symbol ✓ means “considered”. PH and AFT denote the proportional hazards and accelerated failure time models, respectively.
| Literature | Cured Patients | Uncured Model | Frailty |
|---|---|---|---|
| Sy and Taylor | ✓ | PH | – |
| Vaupel | – | PH | gamma |
| Pan | – | AFT | gamma, log-normal |
| Chen et al. | – | AFT | generalized gamma |
| Yu | ✓ | PH | gamma |
| He | ✓ | AFT | generalized gamma |
| Present study | ✓ | AFT | shifted gamma |
Figure 1. Graph of probability for .
Means and standard deviations (SD) of estimated values in setting (i).
| Parameter | True Value | Mean (SD) | | | | | |
|---|---|---|---|---|---|---|---|
| | −0.5 | | | | | | |
| | 0.5 | | | | | | |
| | −0.8 | | | | | | |
| | 3 | | | | | | |
| | 1 | | | | | | |
| | 2 | | | | | | |
| | | | | | | | |
| | 0.1 | | | | | | |
| | 0.5 | | | | | | |
| | −1 | | | | | | |
| | 0.6 | | | | | | |
Means and standard deviations (SD) of estimated values in setting (ii).
| Parameter | True Value | Mean (SD) | | | | | |
|---|---|---|---|---|---|---|---|
| | −0.5 | | | | | | |
| | 0.5 | | | | | | |
| | −0.8 | | | | | | |
| | 3 | | | | | | |
| | 1 | | | | | | |
| | 2 | | | | | | |
| | 3 | | | | | | |
| | 0.1 | | | | | | |
| | 0.5 | | | | | | |
| | −1 | | | | | | |
| | 0.6 | | | | | | |
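The simulation tables summarize each parameter by the mean and standard deviation of its estimates over repeated simulated datasets. A minimal sketch of that kind of summary is given below. It uses a toy exponential-rate estimator as a stand-in rather than the proposed model, and the function names, sample size, and number of replications are all hypothetical.

```python
import numpy as np


def monte_carlo_summary(estimator, simulate, true_value, n_rep=500, seed=0):
    """Mean, SD, and bias of an estimator over repeated simulated datasets."""
    rng = np.random.default_rng(seed)
    estimates = np.array([estimator(simulate(rng)) for _ in range(n_rep)])
    return {
        "mean": estimates.mean(),
        "sd": estimates.std(ddof=1),  # sample SD across replications
        "bias": estimates.mean() - true_value,
    }


# Toy stand-in: maximum likelihood estimate of an exponential rate.
true_rate = 0.5
summary = monte_carlo_summary(
    estimator=lambda sample: 1.0 / sample.mean(),
    simulate=lambda rng: rng.exponential(scale=1.0 / true_rate, size=200),
    true_value=true_rate,
)
```

A mean close to the true value together with an SD that shrinks as the sample size grows is the pattern such tables are usually read for.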
Figure 2. Histogram of the estimated values of in setting (ii). The red dotted line represents the true value.
Figure 3. Histogram of the estimated values of in setting (ii). The red dotted line represents the true value.
Figure 4. Kaplan–Meier estimator of the survival function for the onset of hypertension for each gender. The black and red lines represent the estimators for males and females, respectively.
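For readers unfamiliar with the Kaplan–Meier estimator used in the survival curves, a small from-scratch sketch follows. It is an illustration on a four-subject toy dataset, not the code or data behind the figure, and the function name is hypothetical.

```python
import numpy as np


def kaplan_meier(time, event):
    """Return distinct event times and the Kaplan-Meier survival estimate at each."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)

    event_times = np.unique(time[event])  # sorted distinct times with an event
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(time >= t)               # still under observation at t
        n_events = np.sum((time == t) & event)    # events occurring exactly at t
        s *= 1.0 - n_events / at_risk             # product-limit update
        survival.append(s)
    return event_times, np.array(survival)


# Toy data: events at 1, 2, and 3; one subject censored at time 2.
km_times, km_surv = kaplan_meier([1, 2, 2, 3], [1, 1, 0, 1])
```

When a fraction of subjects is cured, the estimated curve levels off at a positive value instead of approaching zero, which is the visual cue that a cure model may be appropriate.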
AIC for the regression model of the onset of hypertension in males. The asterisk (*) indicates that variable selection is performed for the uncured probability.
| Model | Distribution | Number of Parameters | AIC |
|---|---|---|---|
| Proportional hazard (PH) | Exponential | 19 | 7120.407 |
| | Weibull | 20 | 7030.615 |
| AFT | Log-normal | 20 | 7002.352 |
| | Generalized gamma | 21 | 6999.232 |
| Mixture cure + PH | Exponential | 38 | 7103.040 |
| | Weibull | 39 | 7047.378 |
| Mixture cure + AFT | Log-normal | 39 | 7005.633 |
| | Generalized gamma * | 40 | 7004.674 |
| | Generalized gamma * | 29 | 6992.934 |
| Mixture cure + AFT frailty | Generalized gamma | 41 | 7003.011 |
| | Generalized gamma | 30 | |
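The model comparison rests on AIC = 2k − 2 log L, where k is the number of parameters and L the maximized likelihood, with smaller values preferred. The sketch below takes the parameter counts and AIC values shown for males, recovers the implied log-likelihoods, and ranks the models. The dictionary labels are shorthand introduced here, and the 30-parameter row is left out because its AIC does not appear in this copy of the table, so the ranking covers only the rows with a visible AIC.

```python
# (number of parameters, AIC) for the male models with a visible AIC value.
male_models = {
    "PH, exponential": (19, 7120.407),
    "PH, Weibull": (20, 7030.615),
    "AFT, log-normal": (20, 7002.352),
    "AFT, generalized gamma": (21, 6999.232),
    "Mixture cure + PH, exponential": (38, 7103.040),
    "Mixture cure + PH, Weibull": (39, 7047.378),
    "Mixture cure + AFT, log-normal": (39, 7005.633),
    "Mixture cure + AFT, generalized gamma (40 parameters)": (40, 7004.674),
    "Mixture cure + AFT, generalized gamma (29 parameters)": (29, 6992.934),
    "Last model family in the table, generalized gamma (41 parameters)": (41, 7003.011),
}


def log_likelihood_from_aic(k, aic):
    """Invert AIC = 2k - 2 log L to obtain the maximized log-likelihood."""
    return k - aic / 2.0


# Lowest AIC among the listed rows, and each model's gap to it.
best = min(male_models, key=lambda name: male_models[name][1])
delta_aic = {name: aic - male_models[best][1]
             for name, (k, aic) in male_models.items()}
log_lik = {name: log_likelihood_from_aic(k, aic)
           for name, (k, aic) in male_models.items()}
```

Comparing the 40-parameter and 29-parameter rows shows how AIC rewards dropping parameters that contribute little to the likelihood: the smaller model has the lower AIC.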
AIC for the regression model of the onset of hypertension in females. The asterisk (*) indicates that variable selection is performed for the uncured probability.
| Model | Distribution | Number of Parameters | AIC |
|---|---|---|---|
| Proportional hazard (PH) | Exponential | 19 | 11,804.01 |
| | Weibull | 20 | 11,644.16 |
| AFT | Log-normal | 20 | 11,596.45 |
| | Generalized gamma | 21 | 11,586.00 |
| Mixture cure + PH | Exponential | 38 | 11,798.48 |
| | Weibull | 39 | 11,669.49 |
| Mixture cure + AFT | Log-normal | 39 | 11,609.63 |
| | Generalized gamma * | 40 | 11,600.23 |
| | Generalized gamma * | 27 | 11,579.07 |
| Mixture cure + AFT frailty | Generalized gamma | 41 | 11,596.26 |
| | Generalized gamma | 26 | |
Estimation results for the regression coefficients when variable selection is performed by applying the proposed model in males. CI is the confidence interval. The asterisk (*) and dagger (†) indicate that the p-value is below one of two significance thresholds.
| Covariate | Inference of | | |
|---|---|---|---|
| | Estimates | 95% CI | p-value |
| age | | | |
| waist | | | |
| exe1h_day | | | |
| exe30_2_week | | | |
| sleep_good | | | |
| walk_speed | | | |
| eat_speed_n | | | |
| eat_speed_f | | | |
| eat_b_sleep | | | |
| snacking | | | |
| breakfast | | | |
| weight_move | | | |
| plus10kg | | | |
| smoking | | | |
| drink_amount2 | | | |
| drink_amount3 | | | 0.054 * |
| drink_amount4 | | | |
| drink_amount5 | | | |
| | | | |
| | | | |
| Intercept | | | |
| age | | | |
| eat_speed_n | | | |
| eat_speed_f | | | 0.0664 * |
| kanshyoku (snacking) | | | |
| plus10kg | | | |
| drink_amount2 | | | 0.0605 * |
| drink_amount4 | | | |
Results of parameter estimation when variable selection is performed by applying the proposed model in males.
| Parameter | Distribution | | | |
|---|---|---|---|---|
| | Generalized Gamma | Shifted Gamma | | |
| | | | | |
| Estimates | | | | |
| Standard error | | | | |
Estimation results for the regression coefficients when variable selection is performed by applying the proposed model in females. CI is the confidence interval. The asterisk (*) and dagger (†) indicate that the p-value is below one of two significance thresholds.
| Covariate | Inference of | | |
|---|---|---|---|
| | Estimates | 95% CI | p-value |
| age | | | |
| waist | | | |
| exe1h_day | | | |
| exe30_2_week | | | |
| sleep_good | | | |
| walk_speed | | | |
| eat_speed_n | | | |
| eat_speed_f | | | |
| eat_b_sleep | | | |
| snacking | | | |
| breakfast | | | |
| weight_move | | | |
| plus10kg | | | |
| smoking | | | |
| drink_amount2 | | | |
| drink_amount3 | | | |
| drink_amount4 | | | |
| drink_amount5 | | | |
| | | | |
| | | | |
| Intercept | | | <0.001 |
| age | | | <0.001 |
| eat_speed_f | | | |
| plus10kg | | | |
Results of parameter estimation when variable selection is performed by applying the proposed model in females.
| Parameter | Distribution | | | |
|---|---|---|---|---|
| | Generalized Gamma | Shifted Gamma | | |
| | | | | |
| Estimates | | | | |
| Standard error | | | | |
Figure 5. Probability density function of the estimated shifted gamma distribution. The black and red lines show the results for males and females, respectively.
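To make the shape of such a density concrete, the sketch below evaluates a shifted gamma probability density function. It assumes the common definition in which a gamma variable with given shape and scale is moved to the right by a shift; the parameterization used in the study may differ, and the parameter values are placeholders rather than the fitted estimates.

```python
import math


def shifted_gamma_pdf(w, shape, scale, shift):
    """Density of shift + Gamma(shape, scale) at w (assumed parameterization)."""
    z = w - shift
    if z <= 0.0:
        return 0.0  # no mass at or below the shift (boundary point treated as 0)
    # Work on the log scale for numerical stability.
    log_pdf = ((shape - 1.0) * math.log(z) - z / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


# Placeholder parameters: shape 2, scale 0.5, shift 0.1.
grid_step = 0.001
total_mass = sum(shifted_gamma_pdf(i * grid_step, 2.0, 0.5, 0.1) * grid_step
                 for i in range(20000))  # Riemann sum over [0, 20)
```

Under this definition the shift acts as a lower bound on the frailty, so every uncured subject carries at least that baseline level of the latent effect.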