Comparison of competing risks models based on cumulative incidence function in analyzing time to cardiovascular diseases.

Minoo Dianatkhah¹, Mehdi Rahgozar², Mohammad Talaei³, Masoud Karimloua², Masoumeh Sadeghi⁴, Shahram Oveisgharan⁵, Nizal Sarrafzadegan⁶.

Abstract

BACKGROUND: Competing risks arise when a subject is exposed to more than one cause of failure. The data consist of the time at which the subject failed and an indicator of which risk caused the failure.
METHODS: Cardiovascular disease data from the Isfahan Cohort Study were analyzed with three approaches (Fine and Gray, binomial, and pseudo-value), all of which are based directly on the cumulative incidence function. The validity of the proportionality assumption is the basis for selecting the appropriate model. For example, the Fine and Gray model requires the proportionality assumption to hold. In the binomial approach, a parametric, non-parametric, or semi-parametric model is chosen according to the validity of the assumption. The pseudo-value approach, however, does not require proportionality.
RESULTS: After the models were fitted to the data, slight differences in parameter and variance estimates were seen among the models. The results showed that the semi-parametric multiplicative model and the two models based on the pseudo-value approach could be used for fitting this kind of data.
CONCLUSION: We recommend considering competing risks models instead of standard survival methods when subjects are exposed to more than one cause of failure.

Keywords: Binomial Approach; Cardiovascular Diseases; Competing Risks; Cumulative Incidence Function; Fine and Gray Model; Pseudo-value Approach

Year: 2014 PMID: 24963307 PMCID: PMC4063516

Source DB: PubMed Journal: ARYA Atheroscler ISSN: 1735-3955

Introduction

Problems involving competing risks are common in medical research, where K (K > 0) competing causes of failure may occur. The occurrence of any one of the risks causes failure or death and precludes the occurrence of the other competing risks.1,2 For such data, one observes only the failure time and the cause of failure for each subject in the study. Methods for estimating the probability of failure for events that are subject to competing risks are not new, yet it is still quite common to see inappropriate methods used to estimate such probabilities for endpoints that are subject to competing risks.1 Generally, two types of analysis can be performed when competing risks are present: modeling the cause-specific hazards, and modeling the sub-distribution hazard or cumulative incidence function.3,4 Cox regression modeling for each event is an example of the first type. In such a model, a subject who has failed from another competing risk is treated as censored. This method is valid if the censoring distributions are independent.5 Multi-state models, which do not require the existence of potential failure times, and the Aalen additive hazards model are other examples of the first type of modeling.6,7 Klein modeled covariate effects using this method.7 Examples of the second type are the Fine and Gray method,8 the binomial approach suggested by Scheike and Zhang,9 and the pseudo-value approach suggested by Klein and Andersen.10,11 These approaches are introduced in the Materials and Methods section. We fitted these three methods to the cardiovascular disease (CVD) data of the Isfahan Cohort Study (ICS), which are also described there.12,13 The Results section presents the results, and the Discussion section briefly discusses the findings.

Materials and Methods

The most common model for competing risks is formulated in terms of potential failure times. There are $K$ competing risks, denoted by $D_1, \ldots, D_K$, and for each risk there is a potential failure time $X_i$, $i = 1, \ldots, K$. One observes $T = \min(X_1, \ldots, X_K)$ and a variable $\varepsilon = j$, $j = 1, \ldots, K$, where $T = X_j$, which identifies the risk that caused the event. Competing risk probabilities can be summarized by the cumulative incidence function for the $j$th competing risk. This function is defined as the probability of experiencing risk $j$ prior to time $t$ in the presence of all competing risks:

$$F_j(t) = P(T \le t, \varepsilon = j) = \int_0^t h_j(u) \exp\left\{-\sum_{k=1}^{K} \int_0^u h_k(v)\,dv\right\} du \qquad (1)$$

This quantity depends on all the cause-specific hazard rates $h_j(t)$, $j = 1, \ldots, K$, not just the crude hazard rate of the cause of interest. When covariates are available, it is common in the medical sciences to study their effects on competing risks quantities.14-18 One solution is direct regression modeling of the cumulative incidence function. Here, we discuss three approaches that focus on this topic.

Fine and Gray Model

The first approach, suggested by Fine and Gray,8 is a proportional sub-distribution hazards model:

$$\gamma(t; Z) = \gamma_0(t) \exp(\beta^{T} Z)$$

where $\gamma$ and $\gamma_0$ are the hazard and baseline hazard of the sub-distribution, and $Z$ and $\beta$ are vectors of covariates and coefficients, respectively. The partial likelihood is given by:

$$L(\beta) = \prod_{i=1}^{r} \frac{\exp(\beta^{T} Z_i)}{\sum_{j \in R_i} w_{ij} \exp(\beta^{T} Z_j)}$$

where the product runs over the $r$ time points $t_1 < \ldots < t_r$ at which an event of interest occurs, and $Z_i$ is the covariate vector of the subject failing at $t_i$. The risk set $R_i$ is formed of those who did not experience an event by time $t_i$ and those who experienced a competing risk event by time $t_i$. Thus, those who experienced other types of events remain in the risk set all the time. The weights are defined as:

$$w_{ij} = \frac{\hat{G}(t_i)}{\hat{G}(\min(t_i, T_j))}$$

where $T_j$ is the observed time of subject $j$ and $\hat{G}$ is the Kaplan-Meier estimate of the survivor function of the censoring distribution.3 This model is valid if the proportionality assumption holds.

Binomial Approach

The second method is the direct binomial approach suggested by Scheike and Zhang,9 which models the cumulative incidence function by a general class of models:

$$h\{F_1(t; X, Z)\} = X^{T} \eta(t) + g(Z, \beta, t)$$

where $h$ and $g$ are the known link and regression functions, respectively, $\eta(t)$ is the vector of unknown regression functions, and $\beta$ is the vector of regression parameters.
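As a rough illustration of the cumulative incidence function defined earlier in this section, the following Python sketch computes its usual nonparametric estimate for one cause. It is not taken from the paper, whose analyses were run in R; the function name `cumulative_incidence`, the coding of censored subjects as cause 0, and the toy data are assumptions made for this example.

```python
import numpy as np

def cumulative_incidence(time, cause, event_of_interest=1):
    """Nonparametric cumulative incidence estimate for one cause.

    time  : observed times, T = min(failure time, censoring time)
    cause : 0 for censored subjects (an assumed coding), otherwise the
            code of the cause that produced the failure
    Returns the distinct event times and the estimate at those times.
    """
    time = np.asarray(time, dtype=float)
    cause = np.asarray(cause, dtype=int)
    event_times = np.unique(time[cause != 0])
    surv = 1.0   # probability of being free of ALL causes just before t
    cif = 0.0
    estimates = []
    for t in event_times:
        at_risk = np.sum(time >= t)
        d_interest = np.sum((time == t) & (cause == event_of_interest))
        d_any = np.sum((time == t) & (cause != 0))
        # The increment uses the all-cause survival, so every cause matters.
        cif += surv * d_interest / at_risk
        surv *= 1.0 - d_any / at_risk
        estimates.append(cif)
    return event_times, np.array(estimates)

# Hypothetical toy data: 8 subjects, two causes, 0 = censored.
toy_time = [2, 3, 3, 5, 6, 8, 9, 12]
toy_cause = [1, 2, 0, 1, 0, 2, 1, 0]

times_1, cif_1 = cumulative_incidence(toy_time, toy_cause, event_of_interest=1)
times_2, cif_2 = cumulative_incidence(toy_time, toy_cause, event_of_interest=2)
print(times_1, cif_1)
print(times_2, cif_2)
```

Because each increment is weighted by the probability of being free of every cause, the estimate for one cause changes when the hazard of a competing cause changes, which is the dependence described in the text.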
We use the semi-parametric multiplicative model, in which $X = (1, x_1, \ldots, x_p)$ is a $(p+1)$-dimensional covariate and $Z$ is a $q$-dimensional covariate. These flexible models allow the covariates $X$ to have time-varying effects and the covariates $Z$ to have constant effects. The model suggests testing whether a specific covariate $x_j$ has a constant effect over time, through the hypothesis $H_0: \eta_j(t) \equiv \eta_j$. This leads to a very useful goodness-of-fit test for model validation, which shows exactly where non-proportionality is present. The approach is to start out with a model in which all effects are initially parametric or non-parametric, and then to reduce model complexity by successive testing until an appropriate semi-parametric model that fits the data is found. In brief, for this approach the model is chosen according to the proportionality assumption.

Pseudo-value Approach

The third method of direct modeling of the cumulative incidence function is based on a pseudo-value approach.11 For this model, a grid of time points $\tau_1, \ldots, \tau_M$ is selected. At each grid point, the cumulative incidence function is estimated from the complete data set, giving $\hat{F}_1(\tau_h)$, and from the sample of size $n - 1$ obtained by deleting the $i$th observation, giving $\hat{F}_1^{(-i)}(\tau_h)$. The pseudo-value for the $i$th subject at time $\tau_h$ is then defined as:

$$\hat{\theta}_{ih} = n \hat{F}_1(\tau_h) - (n - 1) \hat{F}_1^{(-i)}(\tau_h)$$

These are the pseudo-values known from jackknife techniques. When there is no censoring, $n \hat{F}_1(t)$ is the number of events of the type of interest occurring prior to $t$, so the pseudo-value reduces to the indicator that subject $i$ has experienced the event of interest by $\tau_h$. In this case, the pseudo-values of different subjects are independent. When there is censoring, the pseudo-values are close to these indicators and are therefore approximately independent. This allows us to use results from generalized linear models to model the effects of covariates:

$$g(\theta_{ih}) = \alpha_h + \beta^{T} Z_i$$

where $g(\cdot)$ is a link function. Possible choices are the logit link, $g(x) = \log(x/(1-x))$, or the complementary log-log link applied to $1 - x$, $g(x) = \log(-\log(1-x))$. Unlike the Fine and Gray model, this approach does not require the proportionality assumption.
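The leave-one-out construction of the pseudo-values can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' R code: the helper names `cif_at` and `pseudo_values`, the coding of censored subjects as cause 0, and the toy data are all hypothetical.

```python
import numpy as np

def cif_at(time, cause, grid, event_of_interest=1):
    """Nonparametric cumulative incidence estimate evaluated on a time grid."""
    time = np.asarray(time, dtype=float)
    cause = np.asarray(cause, dtype=int)
    grid = np.asarray(grid, dtype=float)
    surv, cif = 1.0, 0.0
    values = np.zeros(len(grid))
    for t in np.unique(time[cause != 0]):
        at_risk = np.sum(time >= t)
        cif += surv * np.sum((time == t) & (cause == event_of_interest)) / at_risk
        surv *= 1.0 - np.sum((time == t) & (cause != 0)) / at_risk
        values[grid >= t] = cif   # step function: holds until the next event
    return values

def pseudo_values(time, cause, grid, event_of_interest=1):
    """Jackknife pseudo-values: n * F_hat - (n - 1) * F_hat without subject i."""
    time = np.asarray(time, dtype=float)
    cause = np.asarray(cause, dtype=int)
    n = len(time)
    full = cif_at(time, cause, grid, event_of_interest)
    result = np.empty((n, len(grid)))
    for i in range(n):
        keep = np.arange(n) != i
        leave_one_out = cif_at(time[keep], cause[keep], grid, event_of_interest)
        result[i] = n * full - (n - 1) * leave_one_out
    return result

# Hypothetical toy data with censoring (cause 0) and a two-point grid.
toy_time = [2, 3, 3, 5, 6, 8, 9, 12]
toy_cause = [1, 2, 0, 1, 0, 2, 1, 0]
print(pseudo_values(toy_time, toy_cause, grid=[4.0, 10.0]))
```

The resulting matrix has one row per subject and one column per grid point. Those rows would then serve as responses in a generalized linear model with the chosen link, typically fitted with generalized estimating equations because each subject contributes several grid points.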
To select the appropriate link function when the factor is categorical, one crude way is to look at plots of the differences in the transformed estimates of the cumulative incidence functions between each category and the baseline category. For a factor with two categories, the cumulative incidence functions for the two groups are estimated separately (ignoring other covariates). Then $g(F_{1h}(t)) - g(F_{10}(t))$ is plotted, where $F_{10}(t)$ and $F_{1h}(t)$ are the estimated cumulative incidence functions for the baseline and the other category, respectively, and $g(\cdot)$ is either the logit or the complementary log-log transform. If the link chosen for the plot is correct, the curves should be approximately horizontal.

Data

To compare these three approaches, we used the data of the Isfahan Cohort Study. The ICS is a community-based, ongoing longitudinal study of 6504 adults aged 35 and older at baseline, aimed at developing an Iranian cardiovascular disease risk chart. Participants lived in both urban and rural areas of three cities and their associated district villages in central Iran (Isfahan, Arak, Najafabad). Several risk factors for cardiovascular disease, such as smoking status, lipids, blood pressure, and anthropometric measurements, were measured at baseline. Participants were followed for 5 years from January 1997 to September 2001. End of study for each subject was confirmed if one of the cardiovascular disease (CVD) events (non-fatal myocardial infarction, fatal myocardial infarction, non-fatal stroke, fatal stroke, sudden cardiac death, or unstable angina) occurred or if the subject died of a cause unrelated to CVD. Finally, the data of 5515 participants who had at least one follow-up after baseline were included in the analysis. The CVD event (the event of interest) has one competing risk, namely death unrelated to CVD.12-19

Results

Of the 5515 subjects (2815 females and 2700 males) in the ICS data, 5.13% had one of the CVD events mentioned above and 1.5% died of causes unrelated to CVD. The CVD events comprised non-fatal myocardial infarction (n = 52), fatal myocardial infarction (n = 19), sudden cardiac death (n = 46), non-fatal stroke (n = 40), fatal stroke (n = 14), and unstable angina (n = 112). Moreover, 2133 subjects were 35 to 44 years old, 2449 were 45 to 64, and 933 were 65 or older at baseline. The three competing risks approaches (Fine and Gray, binomial, and pseudo-value), which are based directly on the cumulative incidence function, were fitted to the ICS data with R software.3,5,20,21

As is common in the medical literature, parametric models were studied first. Table 1 shows the results. Counting the two age categories separately, the Fine and Gray model has the largest number of significant covariates (8) and the smallest variances. In contrast, the multiplicative model has the smallest number of significant covariates (6) and the largest variances, and 7 covariates are significant in the logit and complementary log-log models. In the Fine and Gray model, all covariates are significant (P < 0.05) except abdominal obesity (P = 0.76) and low high-density lipoprotein cholesterol (low HDL-C) (P = 0.20). For the multiplicative model, age, abdominal obesity, hypertension, diabetes mellitus, and current smoking status are significant (P < 0.05). In the logit and complementary log-log models, age, hypertension, high low-density lipoprotein cholesterol (high LDL-C), low HDL-C, diabetes mellitus, and current smoking status are significant (P < 0.05). Slight differences among the models are seen in the parameter estimates. In addition, for the Fine and Gray, logit, and multiplicative models, the exponentiated coefficients can be interpreted as the odds in favor of a category of a factor relative to the baseline category.

Table 2 shows the results of fitting the non-parametric multiplicative model. This model differs from the parametric models in that its coefficients have time-varying effects. The table also shows the results of the goodness-of-fit (constant effect) test. The effects of age (65 years and older), abdominal obesity, and diabetes mellitus differ significantly from constant (P < 0.05). This implies that the Fine and Gray model and the parametric and non-parametric multiplicative models are not appropriate, because the proportionality assumption is violated. Therefore, fitting the semi-parametric model is necessary, as it allows covariates with constant and non-constant effects to be present simultaneously in the model. We use this model later to predict the cumulative incidence function for specific subjects. Table 3 shows the semi-parametric model results. For this model, age (65 years and older), abdominal obesity, and diabetes mellitus do not have parameter estimates, because their effects are not constant over time.

Figure 1 shows the goodness-of-fit plot for hypertension with the logit and complementary log-log transforms. Both plots are approximately horizontal, meaning that both links are suitable. Because of the differences in variance estimation between these two models, the complementary log-log model is preferred.

Table 1

Results of fitting parametric models on Isfahan Cohort Study (ICS) data

Covariate			Fine and Gray model	Logit model	Complementary log-log model on 1-F1(t)	Multiplicative model
Sex**		B	0.482	0.320	0.265	0.186
		SE (b)	0.146	0.191	0.180	0.221
		P	(0.001)*	(0.094)	(0.142)	(0.400)
Age***	45-64	B	0.828	0.790	0.770	1.190
		SE (b)	0.188	0.252	0.246	0.276
		P	(< 0.001)*	(0.002)*	(0.002)*	(< 0.001)*
	≥ 65	B	1.475	1.438	1.372	1.900
		SE (b)	0.198	0.259	0.251	0.278
		P	(< 0.001)*	(< 0.001)*	(< 0.001)*	(< 0.001)*
Abdominal obesity		B	-0.04	-0.165	-0.168	-0.460
		SE (b)	0.151	0.200	0.188	0.244
		P	(0.760)	(0.409)	(0.372)	(0.050)*
Hypertension		B	0.980	1.154	1.099	1.190
		SE (b)	0.129	0.158	0.150	0.202
		P	(< 0.001)*	(< 0.001)*	(< 0.001)*	(< 0.001)*
High LDL-C		B	0.455	0.412	0.381	0.194
		SE (b)	0.124	0.163	0.154	0.200
		P	(< 0.001)*	(0.012)*	(0.013)*	(0.313)
Low HDL-C		B	0.162	0.376	0.353	0.336
		SE (b)	0.153	0.168	0.157	0.210
		P	(0.200)	(0.025)*	(0.024)*	(0.109)
Diabetes mellitus		B	0.592	0.600	0.513	0.733
		SE (b)	0.153	0.191	0.177	0.225
		P	(< 0.001)*	(0.002)*	(0.004)*	(0.001)*
Hypertriglyceridemia		B	0.340	0.253	0.233	0.119
		SE (b)	0.137	0.177	0.167	0.224
		P	(0.013)*	(0.153)	(0.163)	(0.597)
Smoking		B	0.391	0.585	0.533	0.607
		SE (b)	0.153	0.198	0.184	0.233
		P	(0.010)*	(0.003)*	(0.003)*	(0.009)*

* Significant at α = 0.05 level;

** Females are the reference group;

*** Ages 35 to 44 are the reference group

SE: Standard error; LDL-C: Low-density lipoprotein cholesterol; HDL-C: High-density lipoprotein cholesterol
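The P values in Table 1 can be cross-checked against the reported coefficients and standard errors. The sketch below assumes, as a guess not stated in the text, that the P values come from a two-sided Wald test based on the normal approximation; the function name `wald_p_value` is hypothetical, and the inputs are the Sex row and the hypertension logit coefficient copied from Table 1.

```python
import math

def wald_p_value(b, se):
    """Two-sided p-value for H0: coefficient = 0 under a normal approximation."""
    z = b / se
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2.0))

# Sex row of Table 1: (model, B, SE, reported P).
sex_row = [
    ("Fine and Gray", 0.482, 0.146, 0.001),
    ("Logit", 0.320, 0.191, 0.094),
    ("Complementary log-log", 0.265, 0.180, 0.142),
    ("Multiplicative", 0.186, 0.221, 0.400),
]

for model, b, se, reported in sex_row:
    print(f"{model}: computed P = {wald_p_value(b, se):.3f}, reported P = {reported:.3f}")

# Exponentiated logit coefficient for hypertension (B = 1.154 in Table 1),
# read as an odds ratio relative to subjects without hypertension.
hypertension_odds_ratio = math.exp(1.154)
print(f"hypertension odds ratio (logit model): {hypertension_odds_ratio:.2f}")
```

Small discrepancies in the third decimal are to be expected, because the tabulated coefficients and standard errors are themselves rounded.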

Table 2

P-values for non-parametric model on Isfahan Cohort Study (ICS) data

Covariate		Multiplicative Model
		H0: η(t)=0	H0: Constant effect
Sex		0.358	0.264
Age	45-64	< 0.001*	0.280
	≥ 65	< 0.001*	< 0.001*
Abdominal obesity		0.002*	0.016*
Hypertension		< 0.001*	0.096
High LDL-C		0.170	0.508
Low HDL-C		0.118	0.490
Diabetes mellitus		< 0.001*	0.024*
Hypertriglyceridemia		0.240	0.578
Smoking		0.012*	0.084

* Significant at α = 0.05 level

LDL-C: Low-density lipoprotein cholesterol; HDL-C: High-density lipoprotein cholesterol

Table 3

Results of fitting semi-parametric model on Isfahan Cohort Study (ICS) data

Covariate		Multiplicative Model
		b	SE (b)	P
Sex		0.142	0.225	0.527
Age	45-64	1.090	0.225	< 0.001*
	≥ 65	-	-	< 0.001*
Abdominal obesity		-	-	< 0.001*
Hypertension		1.190	0.201	< 0.001*
High LDL-C		0.213	0.202	0.292
Low HDL-C		0.375	0.225	0.081
Diabetes mellitus		-	-	< 0.001*
Hypertriglyceridemia		0.110	0.236	0.640
Smoking		0.635	0.234	0.006*

* Significant at α = 0.05 level;

SE: Standard error; LDL-C: Low-density lipoprotein cholesterol; HDL-C: High-density lipoprotein cholesterol

Figure 1

Difference in cumulative incidence function for logit and complementary log-log transform in hypertension
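The link-selection check behind this figure, in which the difference of transformed cumulative incidence curves should be roughly horizontal under the right link, can be sketched as follows. The two curves are hypothetical and were constructed so that the complementary log-log difference is exactly constant; they are not the ICS estimates, and the function names are assumptions for this example.

```python
import numpy as np

def logit(x):
    return np.log(x / (1.0 - x))

def cloglog(x):
    # Complementary log-log transform applied to 1 - x.
    return np.log(-np.log(1.0 - x))

def transformed_difference(link, cif_group, cif_baseline):
    """g(F_1h(t)) - g(F_10(t)) on a common time grid."""
    return link(cif_group) - link(cif_baseline)

months = np.arange(6, 61, 6)

# HYPOTHETICAL cumulative incidence curves (not estimated from any data).
cif_baseline = 1.0 - np.exp(-0.001 * months)
cif_group = 1.0 - np.exp(-0.003 * months)

diff_cloglog = transformed_difference(cloglog, cif_group, cif_baseline)
diff_logit = transformed_difference(logit, cif_group, cif_baseline)

# A flat curve (small spread over time) supports the corresponding link.
spread_cloglog = diff_cloglog.max() - diff_cloglog.min()
spread_logit = diff_logit.max() - diff_logit.min()
print("cloglog spread:", spread_cloglog)
print("logit spread:  ", spread_logit)
```

With estimated curves, neither difference would be exactly flat, so the comparison is usually made by eye from the plot rather than from a single spread value.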

Sometimes it is important to get an idea of the cumulative incidence probability for specific patients. Therefore, computing the predicted cumulative incidence function for a given set of covariate values is very common.22,23 For example, suppose that physicians want to know the cumulative incidence function for male patients older than 65 with abdominal obesity, hypertension, high LDL-C, low HDL-C, diabetes mellitus, hypertriglyceridemia, and smoking. Figure 2 shows the predicted cumulative incidence function over 60 months for the two appropriate models, the complementary log-log and semi-parametric multiplicative models. The predicted values from the first model are lower than those from the second model for about 35 months (between the 15th and 58th months).
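A prediction of this kind can be sketched for the complementary log-log pseudo-value model by inverting the link. This sketch assumes the link has its usual form g(x) = log(-log(1 - x)) and that the linear predictor is a time-specific intercept plus the covariate effects. The coefficients are the complementary log-log column of Table 1, but the intercept is not reported in the text, so the value used below is purely hypothetical and the resulting probability is illustrative only.

```python
import math

# Complementary log-log coefficients (B) copied from Table 1.
cloglog_b = {
    "male": 0.265,
    "age_65_plus": 1.372,
    "abdominal_obesity": -0.168,
    "hypertension": 1.099,
    "high_ldl_c": 0.381,
    "low_hdl_c": 0.353,
    "diabetes": 0.513,
    "hypertriglyceridemia": 0.233,
    "smoking": 0.533,
}

def predicted_cif(alpha_h, covariates, coefficients):
    """Invert g(x) = log(-log(1 - x)) at the linear predictor alpha_h + b'Z."""
    linear_predictor = alpha_h + sum(
        coefficients[name] * value for name, value in covariates.items()
    )
    return 1.0 - math.exp(-math.exp(linear_predictor))

# Profile from the text: male, 65 or older, with every listed risk factor present.
profile = {name: 1 for name in cloglog_b}

# HYPOTHETICAL intercept for the 60-month grid point (not given in the paper).
HYPOTHETICAL_ALPHA_60 = -6.0

risk = predicted_cif(HYPOTHETICAL_ALPHA_60, profile, cloglog_b)
print(f"illustrative 60-month cumulative incidence: {risk:.3f}")
```

Repeating the calculation with the intercept of each grid point would trace out a predicted curve of the kind the text describes for Figure 2.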

Figure 2

Predictions for cardiovascular diseases (CVD) cumulative incidence function for Isfahan Cohort Study (ICS) data using semi-parametric multiplicative and complementary log-log model

Discussion

Data from studies with competing risks outcomes present challenges to the data analyst. Some articles analyze such data with standard survival models. A criticism that can be leveled at these models is their assumption that, upon removal of one cause of failure, the risk of failure from the remaining causes is unchanged. In human studies this assumption is rarely true.3-5 Here we have used three approaches (Fine and Gray, binomial, and pseudo-value) that are based directly on the cumulative incidence function; the choice among them depends on the validity of the proportionality assumption. This collection of models offers a rich variety from which a user can choose an appropriate model for analyzing the data. We saw that the Fine and Gray and parametric multiplicative models were not able to describe the cumulative incidence function for the ICS data. This lack of flexibility was detected using the goodness-of-fit approach, which showed that the non-proportionality can primarily be attributed to the effects of covariates. A similar conclusion was reached for the non-parametric multiplicative model. The semi-parametric multiplicative model could be a good choice for these data. With the pseudo-value approach, two link functions (logit and complementary log-log) were used in the generalized linear model. Unlike the Fine and Gray and multiplicative models, this approach is more flexible, in that proportionality need not be assumed. Goodness-of-fit plots showed that both link functions are suitable for the hypertension groups, but they differed in variance estimation, and the complementary log-log function seems more appropriate. The prediction plots for the ICS data from the semi-parametric multiplicative and complementary log-log models were quite similar over the 5 years, but slight differences in the regression parameters were found between the two models.

Conclusion

Inappropriate statistical methods are not rare in the biomedical literature.5 The competing risks problem is a critical issue in survival analysis. We recommend considering competing risks models instead of simply using standard survival methods when subjects are exposed to more than one cause of failure. In future studies like the ICS, the use of competing risks models is suggested, because a large number of deaths unrelated to CVD will occur during the years of follow-up, and the use of standard survival methods can lead to incorrect or at least imprecise estimates. As described, the semi-parametric multiplicative and complementary log-log models are the two appropriate models proposed for fitting such data.

14 in total

Review 1. Estimation of failure probabilities in the presence of competing risks: new representations of old estimators.

Authors: T A Gooley; W Leisenring; J Crowley; B E Storer
Journal: Stat Med Date: 1999-03-30 Impact factor: 2.373

2. Competing risks as a multi-state model.

Authors: Per Kragh Andersen; Steen Z Abildstrom; Susanne Rosthøj
Journal: Stat Methods Med Res Date: 2002-04 Impact factor: 3.021

3. Regression modeling of competing risks data based on pseudovalues of the cumulative incidence function.

Authors: John P Klein; Per Kragh Andersen
Journal: Biometrics Date: 2005-03 Impact factor: 2.571

4. Analysing and interpreting competing risk data.

Authors: Melania Pintilie
Journal: Stat Med Date: 2007-03-15 Impact factor: 2.373

5. Modelling competing risks in cancer studies.

Authors: John P Klein
Journal: Stat Med Date: 2006-03-30 Impact factor: 2.373

6. SAS and R functions to compute pseudo-values for censored data regression.

Authors: John P Klein; Mette Gerster; Per Kragh Andersen; Sergey Tarima; Maja Pohar Perme
Journal: Comput Methods Programs Biomed Date: 2008-01-15 Impact factor: 5.428

7. Modeling cumulative incidence function for competing risks data.

Authors: Mei-Jie Zhang; Xu Zhang; Thomas H Scheike
Journal: Expert Rev Clin Pharmacol Date: 2008-05-01 Impact factor: 5.045

8. The analysis of failure times in the presence of competing risks.

Authors: R L Prentice; J D Kalbfleisch; A V Peterson; N Flournoy; V T Farewell; N E Breslow
Journal: Biometrics Date: 1978-12 Impact factor: 2.571

9. Impact of metabolic syndrome on ischemic heart disease - a prospective cohort study in an Iranian adult population: Isfahan Cohort Study.

Authors: M Talaei; M Sadeghi; T Marshall; G N Thomas; P Kabiri; S Hoseini; N Sarrafzadegan
Journal: Nutr Metab Cardiovasc Dis Date: 2010-12-30 Impact factor: 4.222

10. The Isfahan cohort study: rationale, methods and main findings.

Authors: N Sarrafzadegan; M Talaei; M Sadeghi; R Kelishadi; S Oveisgharan; N Mohammadifard; A R Sajjadieh; P Kabiri; T Marshall; G N Thomas; A Tavasoli
Journal: J Hum Hypertens Date: 2010-11-25 Impact factor: 3.012
