
Maximum likelihood estimation for semiparametric regression models with multivariate interval-censored data.

Donglin Zeng, Fei Gao, D. Y. Lin

Abstract

Interval-censored multivariate failure time data arise when there are multiple types of failure or there is clustering of study subjects and each failure time is known only to lie in a certain interval. We investigate the effects of possibly time-dependent covariates on multivariate failure times by considering a broad class of semiparametric transformation models with random effects, and we study nonparametric maximum likelihood estimation under general interval-censoring schemes. We show that the proposed estimators for the finite-dimensional parameters are consistent and asymptotically normal, with a limiting covariance matrix that attains the semiparametric efficiency bound and can be consistently estimated through profile likelihood. In addition, we develop an EM algorithm that converges stably for arbitrary datasets. Finally, we assess the performance of the proposed methods in extensive simulation studies and illustrate their application using data derived from the Atherosclerosis Risk in Communities Study.


Keywords:  Current-status data; EM algorithm; Multivariate failure time data; Nonparametric likelihood; Profile likelihood; Proportional hazards; Proportional odds; Random effects

Year:  2017        PMID: 29391606      PMCID: PMC5787874          DOI: 10.1093/biomet/asx029

Source DB:  PubMed          Journal:  Biometrika        ISSN: 0006-3444            Impact factor:   2.445


1. Introduction

Multivariate failure time data arise when each study subject may experience multiple events or when study subjects are sampled in clusters such that the failure times are potentially correlated (Kalbfleisch & Prentice, 2002, Ch. 10). The failure times are interval-censored if the events or failures can only be determined through periodic examination. In the special case of one examination per subject, the observations are called current-status data (Huang, 1996). An example of interval-censored multiple-event data is an HIV/AIDS study where laboratory tests were performed periodically on each patient to detect the presence of cytomegalovirus in the blood and urine (Goggins & Finkelstein, 2000). An example of interval-censored clustered data is a study of pandemic H1N1 influenza where blood samples of family members were collected at different time-points to determine whether there is infection with the influenza virus (Kor et al., 2013). Such data allow characterization of the dependence of related events and evaluation of the effects of covariates on the multivariate outcome. The fact that failure times are never exactly observed, together with their dependence, makes the analysis theoretically and computationally challenging. Several methods for regression analysis of interval-censored multiple-event data have been proposed. Specifically, Goggins & Finkelstein (2000), Kim & Xue (2002), Chen et al. (2007), Tong et al. (2008) and Chen et al. (2013) constructed estimating equations for marginal models by assuming that all subjects are examined at a common set of time-points. Chen et al. (2009) and Chen et al. (2014) considered a frailty proportional hazards model for current-status data and interval-censored data, respectively. The former assumed a piecewise-constant baseline hazard function, while the latter assumed a common set of examination times for all subjects. 
All the aforementioned work avoids the difficult task of nonparametric estimation by parameterizing the failure time distribution or estimating the survival probabilities at fixed time-points. Wang et al. (2008) studied sieve estimation of a copula proportional hazards model for bivariate current-status data with univariate examination time, which was parameterized by a proportional hazards model. Wen & Chen (2013) established asymptotic theory for the nonparametric maximum likelihood estimation of a gamma-frailty proportional hazards model for bivariate interval-censored data and constructed a self-consistency equation, which involves an artificial tuning constant and may have multiple solutions. Wang et al. (2015) developed an EM algorithm for spline-based sieve estimation of the same model, but for bivariate current-status data. The literature on interval-censored clustered data is relatively limited. Cook & Tolusso (2009) and Kor et al. (2013) constructed estimating functions for a copula proportional hazards model with a piecewise-constant baseline hazard function for current-status and interval-censored data, respectively. Chang et al. (2007) established a profile likelihood theory for a gamma-frailty proportional hazards model with current-status family data, and Wen & Chen (2011) developed a self-consistency algorithm similar to that in Wen & Chen (2013). In this paper, we provide efficient estimation methods for a broad class of semiparametric transformation models with random effects for general interval-censored multivariate failure time data. Our work advances the study of multivariate interval-censored data in several directions. First, we deal with the most general form of interval censoring, allowing each subject to have an arbitrary sequence of examination times, and we do not model the examination times. Second, our models accommodate time-dependent covariates and include both proportional and non-proportional hazards structures. 
Third, our models allow multiple random effects and treat multiple events and clustered data in a unified framework. Fourth, we estimate the failure time distribution in a completely nonparametric manner and avoid any tuning parameters, which are required by sieve methods. Fifth, we establish a rigorous asymptotic theory for the nonparametric maximum likelihood estimators under mild conditions. Finally, we devise an EM algorithm that involves only low-dimensional parameters in each iteration and performs well in a wide variety of situations. The present paper also substantially extends our recent work on univariate interval-censored data (Zeng et al., 2016). We expand our previous numerical algorithm to handle unobserved random effects and multiple baseline hazard functions. We address new theoretical challenges generated by the presence of random effects, especially in proving the Donsker property of relevant functions in the form of integration over random effects. In addition, the asymptotic theory of Zeng et al. (2016) hinges on the assumption that a subset of study subjects is examined at the study endpoint; here we remove that restrictive assumption and formulate new arguments to prove the consistency of the estimators. Finally, we show that the covariance matrix for the finite-dimensional parameters can be estimated consistently by the inverse empirical covariance matrix of the individual contributions to the gradient of the profile loglikelihood function. This estimator is always positive semidefinite and is numerically more stable than the Hessian matrix used by Zeng et al. (2016) and others.

2. Data, model and likelihood

We consider a general framework for modelling multivariate failure time data that encompasses both multiple events and clustered data. Suppose that there are $n$ independent clusters with $n_i$ subjects in the $i$th cluster and that each subject can potentially experience $K$ types of events. It is assumed that $n_i$ is small relative to $n$. For $i = 1, \ldots, n$, $k = 1, \ldots, K$ and $l = 1, \ldots, n_i$, let $T_{ikl}$ denote the type-$k$ failure time for the $l$th subject of the $i$th cluster, and let $X_{ikl}(\cdot)$ denote the corresponding vector of possibly time-dependent covariates. We specify that the cumulative hazard function of $T_{ikl}$ takes the form
$$\Lambda_{ikl}(t \mid X_{ikl}; b_{ik}) = G_k\Big[\int_0^t \exp\{\beta_k^{\mathrm T} X_{ikl}(s) + b_{ik}^{\mathrm T} \tilde X_{ikl}(s)\}\, \mathrm d\Lambda_k(s)\Big], \qquad (1)$$
where $\tilde X_{ikl}$ contains 1 and covariates that may be part of $X_{ikl}$, $b_{ik}$ is a vector of random effects from the multivariate normal distribution with mean zero and covariance matrix indexed by unknown parameters $\gamma$, $\beta_k$ is a set of unknown regression parameters, $\Lambda_k(\cdot)$ is an arbitrary increasing function with $\Lambda_k(0) = 0$, and $G_k$ is a specific transformation function. It is assumed that the $T_{ikl}$ $(k = 1, \ldots, K;\ l = 1, \ldots, n_i)$ are independent conditional on the random effects and covariates. By letting $\beta_k$ and $b_{ik}$ depend on $k$, model (1) allows the regression parameters and random effects to differ among the $K$ types of events; see Lin (1994). In addition, the dependence of $\tilde X_{ikl}$ on $l$ allows for subject-specific random effects. Often the random effects do not depend on $k$, and then $\gamma$ consists of the upper diagonal elements of the common covariance matrix $\Sigma$. An example in which the random-effect distribution depends on $k$ is given in the Supplementary Material. A variety of transformations can be generated through the log-Laplace transform
$$G_k(x) = -\log \int_0^\infty \exp(-xt) f_k(t)\, \mathrm dt, \qquad (2)$$
where $f_k$ is a density function with support on $[0, \infty)$. The choice of the gamma density with mean 1 and variance $r$ for $f_k$ yields the class of logarithmic transformations $G(x) = r^{-1} \log(1 + rx)$ (Chen et al., 2002), which includes the proportional odds model ($r = 1$) and can be extended to include the proportional hazards model by letting $r \downarrow 0$. Suppose that $T_{ikl}$ is monitored at a sequence of positive examination time-points $U_{ikl,1} < \cdots < U_{ikl,M_{ikl}}$. We assume that the examination times are independent of $T_{ikl}$ and the random effects conditional on $X_{ikl}$. Let $(L_{ikl}, R_{ikl}]$ be the shortest observed time interval that brackets $T_{ikl}$, i.e., $L_{ikl} = \max\{U_{ikl,j} : U_{ikl,j} \le T_{ikl}\}$ and $R_{ikl} = \min\{U_{ikl,j} : U_{ikl,j} > T_{ikl}\}$, where $U_{ikl,0} = 0$ and $U_{ikl,M_{ikl}+1} = \infty$. Then the likelihood concerning the parameters $\theta \equiv (\beta^{\mathrm T}, \gamma^{\mathrm T})^{\mathrm T}$ and $\Lambda_1, \ldots, \Lambda_K$ is
$$L_n = \prod_{i=1}^n \int_b \prod_{k=1}^K \prod_{l=1}^{n_i} \Big( \exp\Big[-G_k\Big\{\int_0^{L_{ikl}} e^{\beta_k^{\mathrm T} X_{ikl}(s) + b^{\mathrm T} \tilde X_{ikl}(s)}\, \mathrm d\Lambda_k(s)\Big\}\Big] - \exp\Big[-G_k\Big\{\int_0^{R_{ikl}} e^{\beta_k^{\mathrm T} X_{ikl}(s) + b^{\mathrm T} \tilde X_{ikl}(s)}\, \mathrm d\Lambda_k(s)\Big\}\Big] \Big) \phi(b)\, \mathrm db, \qquad (3)$$
in which the second exponential term is taken to be 0 if $R_{ikl} = \infty$.
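The role of the transformation parameter can be made concrete with a small numerical sketch. The code below is not from the paper; it implements the logarithmic family $G_r(x) = r^{-1}\log(1+rx)$ with an illustrative baseline $\Lambda(t) = t$ and a fixed linear predictor, and checks the two special cases noted above: $r \downarrow 0$ recovers the proportional hazards transformation $G(x) = x$, while $r = 1$ gives the proportional odds model, under which the failure odds ratio between two covariate values is constant over time.

```python
import math

def G_log(x, r):
    """Logarithmic transformation G_r(x) = log(1 + r*x)/r, with G_0(x) = x."""
    if r == 0:
        return x  # proportional hazards limit
    return math.log(1.0 + r * x) / r

def survival(t, r, eta, Lam=lambda t: t):
    """S(t) = exp[-G_r{e^eta * Lambda(t)}] with linear predictor eta.
    Lambda(t) = t is an illustrative baseline, not a fitted one."""
    return math.exp(-G_log(math.exp(eta) * Lam(t), r))

# r -> 0 approaches the proportional hazards transformation G(x) = x
assert abs(G_log(2.0, 1e-8) - 2.0) < 1e-6

# r = 1: S(t) = 1 / (1 + e^eta * Lambda(t)), the proportional odds form
t, eta = 1.5, 0.7
S = survival(t, 1.0, eta)
assert abs(S - 1.0 / (1.0 + math.exp(eta) * t)) < 1e-12

# Under r = 1, the failure odds ratio between eta and 0 is constant in t: e^eta
for tt in (0.5, 1.0, 3.0):
    S1, S0 = survival(tt, 1.0, eta), survival(tt, 1.0, 0.0)
    odds_ratio = ((1 - S1) / S1) / ((1 - S0) / S0)
    assert abs(odds_ratio - math.exp(eta)) < 1e-10
```

The survival function here corresponds to model (1) with the random effect absorbed into the linear predictor; it is a sketch of the transformation family only, not of the full clustered model.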

3. Nonparametric maximum likelihood estimation

We adopt the nonparametric maximum likelihood estimation approach. For each $k$, let $0 < t_{k1} < \cdots < t_{km_k}$ be the ordered sequence of all the $L_{ikl}$ and all the finite $R_{ikl}$. The estimator for $\Lambda_k$ is a step function which jumps only at those time-points, with respective jump sizes $\lambda_{k1}, \ldots, \lambda_{km_k}$. We introduce a latent variable $\xi_{ik}$ with density $f_k$ as given in (2). Then (3) can be written as
$$L_n = \prod_{i=1}^n \int_b \prod_{k=1}^K \prod_{l=1}^{n_i} \int_0^\infty \Big\{ \exp\Big(-\sum_{j: t_{kj} \le L_{ikl}} \xi \mu_{iklj}\Big) - \exp\Big(-\sum_{j: t_{kj} \le R_{ikl}} \xi \mu_{iklj}\Big) \Big\} f_k(\xi)\, \mathrm d\xi\, \phi(b)\, \mathrm db, \qquad (4)$$
where $\mu_{iklj} = \lambda_{kj} \exp\{\beta_k^{\mathrm T} X_{ikl}(t_{kj}) + b^{\mathrm T} \tilde X_{ikl}(t_{kj})\}$ and the second exponential term is 0 if $R_{ikl} = \infty$. To make the maximization of the likelihood more tractable, we introduce independent Poisson random variables $W_{iklj}$ with means $\xi_{ik} \mu_{iklj}$. Let $A_{ikl} = \sum_{j: t_{kj} \le L_{ikl}} W_{iklj}$ and $B_{ikl} = \sum_{j: L_{ikl} < t_{kj} \le R_{ikl}} W_{iklj}$. Because the joint probability that $A_{ikl} = 0$ and $B_{ikl} > 0$ given $\xi_{ik}$ and $b_i$ is exactly the bracketed term in (4), the likelihood arising from the observations $\{A_{ikl} = 0, B_{ikl} > 0\}$ (or $\{A_{ikl} = 0\}$ if $R_{ikl} = \infty$) is the same as (4). Therefore, we develop an EM algorithm to maximize (4) by treating the $W_{iklj}$, $\xi_{ik}$ and $b_i$ as complete data. Conditional on $\xi_{ik}$ and $b_i$, the failure time $T_{ikl}$ follows a proportional hazards model. Let $N_{ikl}(t)$ be a Poisson process with value 0 at $t = 0$ and intensity function the same as the hazard function of $T_{ikl}$. Clearly, $T_{ikl}$ is the first time at which $N_{ikl}$ jumps from 0 to 1, so that $T_{ikl}$ falling in the interval $(L_{ikl}, R_{ikl}]$ is equivalent to $N_{ikl}$ taking no jump before $L_{ikl}$ but at least one jump between $L_{ikl}$ and $R_{ikl}$. Thus, $A_{ikl}$ and $B_{ikl}$ are indeed the counts of $N_{ikl}$ before $L_{ikl}$ and between $L_{ikl}$ and $R_{ikl}$, respectively. The complete-data loglikelihood is the sum of the Poisson loglikelihood of the $W_{iklj}$ and the loglikelihoods of the $\xi_{ik}$ and $b_i$. In the M-step, we solve the score equation for $\beta$ using the one-step Newton–Raphson method, where $\widehat E(\cdot)$ denotes the conditional expectation given the observed data; we then calculate the jump sizes explicitly as
$$\lambda_{kj} = \frac{\sum_{i,l} \widehat E(W_{iklj})}{\sum_{i,l} I(t_{kj} \le R_{ikl}^*)\, \widehat E\big[\xi_{ik} \exp\{\beta_k^{\mathrm T} X_{ikl}(t_{kj}) + b_i^{\mathrm T} \tilde X_{ikl}(t_{kj})\}\big]},$$
where $R_{ikl}^*$ denotes $R_{ikl}$ if it is finite and $L_{ikl}$ otherwise, and maximize the expected loglikelihood of the $b_i$ to estimate $\gamma$. If the random-effect covariance matrices are the same across event types and unstructured, then the latter step becomes $\hat\Sigma = n^{-1} \sum_{i=1}^n \widehat E(b_i b_i^{\mathrm T})$. In the E-step, we evaluate the conditional expectations involved in the M-step, using the fact that the joint density of $\xi_{ik}$ and $b_i$ given the observed data is proportional to the corresponding complete-data likelihood contribution. In addition, the conditional mean of $W_{iklj}$ for $L_{ikl} < t_{kj} \le R_{ikl}$ given $\xi_{ik}$, $b_i$ and the observed data is $\xi_{ik} \mu_{iklj} \big/ \big[1 - \exp\big\{-\sum_{j': L_{ikl} < t_{kj'} \le R_{ikl}} \xi_{ik} \mu_{iklj'}\big\}\big]$, and it is 0 for $t_{kj} \le L_{ikl}$. We use Gaussian quadrature to approximate the integrals over $\xi_{ik}$ and $b_i$. Starting with zero regression parameters, equal jump sizes, and the covariance matrix set to the identity, we iterate between the E-step and the M-step until convergence to obtain the nonparametric maximum likelihood estimators $\hat\theta$ and $\hat\Lambda_1, \ldots, \hat\Lambda_K$. The high-dimensional jump-size parameters are calculated explicitly in the M-step.
We show in the Supplementary Material that each iteration of the algorithm guarantees an increase in the likelihood. Due to the presence of random effects, the conditional expectations in this EM algorithm are more tedious to evaluate than those in Zeng et al. (2016).
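The key zero-truncation correction in the E-step can be checked in isolation. In the scalar analogue of the latent-Poisson device (hypothetical scalar mean $\mu$ standing in for the paper's indexed quantities), if $W \sim \text{Poisson}(\mu)$ is the count falling in the bracketing interval and the observed event is $W \ge 1$, the conditional expectation is $E(W \mid W \ge 1) = \mu/(1 - e^{-\mu})$. The sketch below verifies this identity by direct summation.

```python
import math

def truncated_poisson_mean(mu, kmax=200):
    """E(W | W >= 1) for W ~ Poisson(mu), computed by direct summation."""
    p_ge1 = 1.0 - math.exp(-mu)
    total = 0.0
    pk = math.exp(-mu)          # P(W = 0)
    for k in range(1, kmax):
        pk *= mu / k            # P(W = k) via the Poisson recurrence
        total += k * pk
    return total / p_ge1

for mu in (0.1, 1.0, 3.5):
    closed_form = mu / (1.0 - math.exp(-mu))
    assert abs(truncated_poisson_mean(mu) - closed_form) < 1e-10
```

In the algorithm proper this expectation is evaluated jointly with the conditional distribution of the latent variable and the random effects, but the truncation correction takes the same form.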

4. Asymptotic properties

Let $\theta = (\beta^{\mathrm T}, \gamma^{\mathrm T})^{\mathrm T}$. We establish the asymptotic properties of the estimators under the following regularity conditions, wherein we omit the subscript $i$ when referring to a random variable for a cluster.

Condition 1. The true value of $\theta$, denoted by $\theta_0$, lies in the interior of a known compact set $\Theta$ within which the random-effect covariance matrix is positive definite with eigenvalues bounded away from zero.

Condition 2. The true value of $\Lambda_k$, denoted by $\Lambda_{0k}$, is continuously differentiable with positive derivative in $[\sigma_k, \tau_k]$, the union of the supports of the examination times for the $k$th type of event.

Condition 3. With probability one, $X_{kl}(\cdot)$ has bounded total variation. If there exist a deterministic function $h_0(t)$ and a constant vector $h_1$ such that $h_1^{\mathrm T} X_{kl}(t) = h_0(t)$ with probability 1, then $h_0 = 0$ and $h_1 = 0$. The same holds for $\tilde X_{kl}(\cdot)$.

Condition 4. The cluster size is bounded by a positive constant and is independent of the failure times and random effects conditional on the covariates.

Condition 5. For each event type, the number of examination times is positive, and any two adjacent examination times are separated by at least a positive constant $\eta$. The conditional densities of the examination times given the covariates have continuous second-order partial derivatives and are continuously differentiable functionals of the covariate processes; in addition, the expected number of examinations is finite.

Condition 6. The transformation function $G_k$ is twice continuously differentiable on $[0, \infty)$ with $G_k(0) = 0$, $G_k'(x) > 0$ and $G_k(x) \to \infty$ as $x \to \infty$, together with suitable growth bounds on its derivatives.

Condition 7 (identifiability). If two parameter sets yield the same likelihood for almost every observation, then the finite-dimensional parameters coincide and $\Lambda_k = \tilde\Lambda_k$ on $[\sigma_k, \tau_k]$ for $k = 1, \ldots, K$.

Condition 8 (nondegenerate information). If there exist a vector $v$ and functions $h_k$ such that the score along the submodel $\big(\theta_0 + \epsilon v,\ \Lambda_{0k} + \epsilon \int h_k\, \mathrm d\Lambda_{0k}\big)$ is zero with probability one, then $v = 0$ and $h_k = 0$ for $k = 1, \ldots, K$.

Conditions 1–4 are standard for multivariate failure time regression. Condition 5 requires that two adjacent examination times be separated by at least $\eta$; otherwise, the data may contain exact observations, which would need a different treatment. This condition also requires smoothness of the joint density of the examination times. Unlike Zeng et al. (2016), we do not require a subset of study subjects to be examined at the end of the study. Condition 6 holds for both the logarithmic family $G(x) = r^{-1}\log(1 + rx)$ $(r \ge 0)$ and the Box–Cox family $G(x) = \{(1 + x)^{\rho} - 1\}/\rho$ $(\rho \ge 0)$, where $G(x) = x$ when $r = 0$ and $G(x) = \log(1 + x)$ when $\rho = 0$. Condition 7 pertains to parameter identifiability, and Condition 8 says that the Fisher information along any submodel at the true parameter values is nonsingular. If the covariates are time-independent, the equations in Conditions 7 and 8 reduce to linear-independence requirements on the covariates; in particular, they hold if any symmetric matrix $A$ satisfying $(1, X^{\mathrm T}) A (1, X^{\mathrm T})^{\mathrm T} = 0$ with probability one must be the zero matrix. We state the strong consistency and weak convergence of the nonparametric maximum likelihood estimators in Theorems 1 and 2, respectively.

Theorem 1. Under Conditions 1–7, $|\hat\theta - \theta_0| \to 0$ and $\sup_{t \in [\sigma_k, \tau_k]} |\hat\Lambda_k(t) - \Lambda_{0k}(t)| \to 0$ $(k = 1, \ldots, K)$ almost surely.

Theorem 2. Under Conditions 1–8, $n^{1/2}(\hat\theta - \theta_0)$ converges in distribution to a zero-mean normal random vector whose covariance matrix attains the semiparametric efficiency bound.

The proofs of the theorems are given in the Appendix. In the proof of Theorem 1, a major challenge is to show the uniform boundedness of $\hat\Lambda_k$ without assuming a positive probability of examination at the study endpoint. To address this challenge, we first obtain a subsequence of estimators that converges on any compact subset of the interior of $[\sigma_k, \tau_k]$; we then show that the limit is the true parameter value by deriving the covering number of the class of loglikelihood functions. In the proof of Theorem 2, we use the bounded inverse theorem to establish the convergence rates of the $\hat\Lambda_k$ in terms of $n$ and the Euclidean distance of the other parameter estimators, and we show that the rates obtained are sufficient for the asymptotic normality and efficiency of the estimators. Let $pl_n(\theta)$ denote the profile loglikelihood, which is obtained by using the above EM algorithm but updating only the jump sizes in the M-step. One may estimate the covariance matrix of $\hat\theta$ by the negative inverse of the Hessian matrix of $pl_n$ at $\hat\theta$, with the Hessian determined by second-order numerical differences and a perturbation constant of order $n^{-1/2}$ (Murphy & van der Vaart, 2000; Zeng et al., 2016). The estimated matrix, however, may fail to be positive semidefinite, especially in small samples.
We propose instead to estimate the covariance matrix of $\hat\theta$ by $\big\{\sum_{i=1}^n \hat g_i \hat g_i^{\mathrm T}\big\}^{-1}$, where the $j$th component of $\hat g_i$ is $\{pl_{ni}(\hat\theta + h_n e_j) - pl_{ni}(\hat\theta)\}/h_n$, $e_j$ is the $j$th canonical basis vector, $h_n$ is a perturbation constant of order $n^{-1/2}$, and $pl_{ni}$ is the profile loglikelihood function for the $i$th cluster. Thus, we estimate the information matrix for $\theta$ by the empirical covariance matrix of the gradient of the profile loglikelihood. We approximate this gradient by a first-order numerical difference, which is quicker to calculate than its second-order counterpart. The resulting covariance matrix estimator is guaranteed to be positive semidefinite and turns out to be more robust with respect to the choice of the perturbation constant than the estimator based on the second-order numerical difference. The consistency of this covariance estimator is stated in the following theorem.

Theorem 3. Under Conditions 1–8, $n\big\{\sum_{i=1}^n \hat g_i \hat g_i^{\mathrm T}\big\}^{-1}$ converges in probability to the limiting covariance matrix of $n^{1/2}(\hat\theta - \theta_0)$.
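The recipe behind this variance estimator can be illustrated in a deliberately simple parametric setting (not the paper's semiparametric one): form per-unit score approximations by first-order numerical differences with perturbation $h \approx n^{-1/2}$, and invert the sum of their outer products. For a Normal$(\theta, 1)$ sample the result should be close to the textbook variance $1/n$ of the sample mean, and it is nonnegative by construction.

```python
import random, math

random.seed(1)
n = 2000
theta0 = 0.3
data = [random.gauss(theta0, 1.0) for _ in range(n)]

def loglik_i(theta, x):
    """Loglikelihood of one observation under N(theta, 1)."""
    return -0.5 * (x - theta) ** 2 - 0.5 * math.log(2 * math.pi)

theta_hat = sum(data) / n          # MLE of the mean
h = n ** -0.5                      # perturbation constant of order n^{-1/2}

# First-order numerical difference of each unit's loglikelihood
scores = [(loglik_i(theta_hat + h, x) - loglik_i(theta_hat, x)) / h for x in data]
info = sum(s * s for s in scores)  # empirical sum of squared score approximations
var_hat = 1.0 / info               # estimated variance of theta_hat

assert var_hat > 0                        # nonnegative by construction
assert abs(var_hat - 1.0 / n) < 0.2 / n   # close to the true variance 1/n
```

In the paper's setting the per-unit loglikelihood is replaced by each cluster's profile loglikelihood, evaluated via the EM algorithm; this sketch only demonstrates the outer-product construction and its positivity.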

5. Simulation studies

To evaluate the performance of the proposed methods, we conducted two series of simulation studies. The first series pertained to clustered data, with cluster sizes 1, 2 and 3 arising with probabilities 0.2, 0.7 and 0.1, respectively. We considered model (1) with a single type of event and a random intercept. We generated two independent cluster-level covariates, the first being Ber(0.5) and the second Un(0, 1), and set the corresponding regression parameters $\beta_1$ and $\beta_2$ to 0.5 and $-0.5$, respectively. We adopted the class of logarithmic transformations indexed by the parameter $r$ and generated the random effect from $N(0, \sigma^2)$ with $\sigma^2 = 0.5$. We generated five potential examination times for each subject, with the first drawn from a uniform distribution and successive examination times separated by uniformly distributed gaps. We assumed that the study ended at time 5, beyond which no examinations occurred. We simulated 10 000 replicates. Table 1 summarizes the results on the estimation of $\beta_1$, $\beta_2$ and $\sigma^2$ for various values of $r$ and $n$, and Fig. 1 displays the corresponding results for the estimation of $\Lambda$. The biases of all parameter estimators are small and decrease as $n$ increases. The variance estimator for $\hat\beta$ is accurate, while the variance of $\hat\sigma^2$ tends to be overestimated. The confidence intervals for both $\beta$ and $\sigma^2$ have proper coverage probabilities. Additional studies revealed that the variance estimator for $\hat\sigma^2$ and the corresponding confidence intervals become more accurate as $n$ increases.
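The data-generating scheme for the first series can be sketched as follows, under the simplest member of the transformation family ($r = 0$, proportional hazards) with an illustrative baseline $\Lambda(t) = t$; the exact examination-time distributions are not reproduced here, so the uniform first-exam and gap distributions below are our assumptions.

```python
import random, math

random.seed(7)
beta = (0.5, -0.5)
sigma2 = 0.5

def simulate_cluster():
    """One cluster: size in {1,2,3} w.p. (0.2, 0.7, 0.1), shared covariates/frailty."""
    size = random.choices((1, 2, 3), weights=(0.2, 0.7, 0.1))[0]
    x = (float(random.random() < 0.5), random.random())   # Ber(0.5), Un(0,1)
    b = random.gauss(0.0, math.sqrt(sigma2))              # random intercept
    rate = math.exp(beta[0] * x[0] + beta[1] * x[1] + b)  # PH with Lambda(t) = t
    subjects = []
    for _ in range(size):
        t = random.expovariate(rate)                      # latent failure time
        exams = []
        u = random.random()                               # assumed first-exam law
        for _ in range(5):
            if u > 5.0:                                   # study ends at time 5
                break
            exams.append(u)
            u += random.random()                          # assumed gap law
        left = max([0.0] + [e for e in exams if e <= t])
        right = min([math.inf] + [e for e in exams if e > t])
        subjects.append((left, right))                    # observed bracket (L, R]
    return x, subjects

clusters = [simulate_cluster() for _ in range(200)]
intervals = [iv for _, subs in clusters for iv in subs]
assert all(l < r for l, r in intervals)                   # every interval brackets T
assert any(math.isinf(r) for _, r in intervals)           # some right-censored
```

Only the bracketing interval and the covariates are retained, mirroring the interval-censored observation scheme; the latent failure time itself is discarded.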
Table 1.

Parameter estimation results for simulation studies with clustered data 

                      n = 100                  n = 200                  n = 400
 r               Bias    SE   SEE  CP     Bias    SE   SEE  CP     Bias    SE   SEE  CP
 0    β1 = 0.5   0.014 0.263 0.258 94    0.005 0.182 0.180 95    0.002 0.127 0.126 95
      β2 = -0.5 -0.008 0.404 0.399 95   -0.005 0.278 0.277 95   -0.003 0.194 0.194 95
      σ2 = 0.5  -0.024 0.369 0.384 96   -0.009 0.244 0.259 97   -0.001 0.166 0.177 97
 0.5  β1 = 0.5   0.014 0.302 0.299 95    0.004 0.210 0.208 95    0.002 0.147 0.146 95
      β2 = -0.5 -0.010 0.483 0.479 95   -0.007 0.333 0.331 95   -0.004 0.233 0.232 95
      σ2 = 0.5  -0.027 0.457 0.486 96   -0.010 0.309 0.330 96    0.001 0.214 0.228 96
 1    β1 = 0.5   0.015 0.341 0.341 95    0.004 0.237 0.235 95    0.002 0.166 0.165 95
      β2 = -0.5 -0.012 0.558 0.552 95   -0.008 0.382 0.381 95   -0.005 0.268 0.266 95
      σ2 = 0.5  -0.036 0.558 0.607 95   -0.018 0.380 0.412 95   -0.001 0.265 0.286 95

SE, empirical standard error; SEE, mean standard error estimator; CP, empirical coverage percentage of the 95% confidence interval. For σ², Bias and SEE are based on the median instead of the mean, and the confidence interval is based on the log transformation. Each entry is based on 10 000 replicates.

Fig. 1.

Estimation of Λ for clustered data: the solid and dashed curves show the true values and averaged estimates, respectively, where each estimate is based on 10 000 replicates.

The second series of studies was concerned with multiple events. We considered model (1) with cluster size 1, two types of events and a shared random intercept, and focused on the logarithmic families indexed by $r_1$ and $r_2$. For each subject, we generated covariates and random effects from the same distributions as in the first series of studies, set the regression parameters of the two event types to fixed true values, and generated examination times in the same manner as in the first series. The results for the second series are presented in the Supplementary Material; the basic conclusions are the same as those from the first series. The variance estimation was based on first-order numerical differentiation with a perturbation constant of order $n^{-1/2}$, and the results are quite stable over a range of perturbation constants. We also evaluated variance estimation based on second-order numerical differentiation and found that the resulting variance estimates may be negative when $n$ is small and the perturbation constant is far from $n^{-1/2}$. The two variance estimation methods produced similar estimates in most cases. We recommend a perturbation constant of order $n^{-1/2}$ for both the first-order and the second-order numerical differences.
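Both series of studies require repeated evaluation of likelihood integrals over the random effects, which, as noted in §3, the algorithm approximates by Gaussian quadrature. A one-dimensional Gauss–Hermite check (our own toy integrand, not one from the paper): after the change of variables $b = \sqrt{2}\,\sigma z$, the nodes and weights for $e^{-z^2}$ approximate $E\{g(b)\}$ for $b \sim N(0, \sigma^2)$.

```python
import numpy as np

def gh_expectation(g, sigma, npts=20):
    """E{g(b)} for b ~ N(0, sigma^2) via Gauss-Hermite quadrature."""
    z, w = np.polynomial.hermite.hermgauss(npts)  # nodes/weights for weight exp(-z^2)
    b = np.sqrt(2.0) * sigma * z                  # change of variables
    return np.sum(w * g(b)) / np.sqrt(np.pi)

sigma = np.sqrt(0.5)  # the simulation's random-effect variance
# Exact moment-generating function: E(e^b) = exp(sigma^2 / 2)
approx = gh_expectation(np.exp, sigma)
assert abs(approx - np.exp(sigma**2 / 2)) < 1e-10
```

With a smooth integrand such as this, twenty nodes are ample; the paper's integrals additionally involve the latent transformation variable, handled by quadrature in the same way.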

6. An example

The Atherosclerosis Risk in Communities Study recruited a cohort of 14 751 Caucasian and African-American individuals from four U.S. communities: Forsyth County, North Carolina; Jackson, Mississippi; suburbs of Minneapolis, Minnesota; and Washington County, Maryland (The ARIC Investigators, 1989). The participants underwent a baseline examination in 1987–1989, three follow-up examinations at approximately three-year intervals, and a further examination in 2011–2013. One important objective of the study was to investigate risk factors for diabetes and hypertension. The definition of diabetes was a fasting glucose level of 126 mg/dL or above, a nonfasting glucose level of 200 mg/dL or above, self-reported physician diagnosis of diabetes, or use of diabetic medication. The definition of hypertension was systolic blood pressure of 140 mmHg or higher, diastolic blood pressure of 90 mmHg or higher, or use of antihypertensive medication. Both events were determined at the examination times and thus interval-censored. We related the incidence of diabetes and hypertension to race, gender, community and five baseline risk factors: age, body mass index, glucose level, systolic blood pressure and diastolic blood pressure. We excluded 5890 individuals with prevalent diabetes or hypertension and 124 individuals with unknown status at baseline. After removing another two individuals with missing values of baseline risk factors, we were left with a total of 8735 individuals. We fitted model (1) with two types of events, logarithmic transformations indexed by $(r_1, r_2)$, and a shared random effect. The loglikelihood is maximized at $(r_1, r_2) = (2.1, 1.3)$, which is the combination that would be selected by the Akaike information criterion. Table 2 shows the regression analysis results for $(r_1, r_2) = (0, 0)$, $(1, 1)$ and $(2.1, 1.3)$; the p-values are similar across the three combinations.
The results indicate that African-Americans are more likely to develop diabetes and hypertension than Caucasians; baseline body mass index is positively associated with the risk of both diabetes and hypertension; and baseline glucose level is positively associated with the risk of diabetes but not hypertension. Not surprisingly, baseline systolic and diastolic blood pressures are positively associated with the risk of hypertension. Because $n$ is large, some of the p-values are extremely small.
Table 2.

Regression analysis results for the Atherosclerosis Risk in Communities Study 

                                                     Diabetes                     Hypertension
 (r1, r2)   Risk factor                       Estimate Std error p-value   Estimate Std error p-value
 (0, 0)     Jackson                            -0.188    0.194    0.332     -0.251    0.139    0.070
            Minneapolis suburbs                -0.436    0.085   <10^-4     -0.129    0.054    0.018
            Washington County                   0.131    0.081    0.106      0.094    0.055    0.087
            Age                                -0.015    0.006    0.011      0.016    0.004   <10^-4
            Male                               -0.082    0.060    0.172     -0.268    0.041   <10^-4
            Caucasian                          -0.563    0.192    0.003     -0.569    0.138   <10^-4
            Body mass index (kg/m2)             0.088    0.006   <10^-4      0.021    0.004   <10^-4
            Derived glucose value (mg/dl)       0.108    0.003   <10^-4      0.0003   0.002    0.914
            Systolic blood pressure (mmHg)      0.006    0.003    0.070      0.072    0.003   <10^-4
            Diastolic blood pressure (mmHg)     0.005    0.005    0.271      0.014    0.003   <10^-4
 (1, 1)     Jackson                            -0.189    0.240    0.432     -0.311    0.163    0.056
            Minneapolis suburbs                -0.526    0.101   <10^-4     -0.164    0.070    0.019
            Washington County                   0.149    0.097    0.123      0.113    0.072    0.114
            Age                                -0.016    0.007    0.025      0.022    0.005   <10^-4
            Male                               -0.099    0.072    0.170     -0.303    0.053   <10^-4
            Caucasian                          -0.722    0.237    0.002     -0.773    0.163   <10^-4
            Body mass index (kg/m2)             0.108    0.008   <10^-4      0.030    0.006   <10^-4
            Derived glucose value (mg/dl)       0.130    0.004   <10^-4     -0.0004   0.003    0.906
            Systolic blood pressure (mmHg)      0.008    0.004    0.053      0.093    0.003   <10^-4
            Diastolic blood pressure (mmHg)     0.005    0.006    0.351      0.020    0.004   <10^-4
 (2.1, 1.3) Jackson                            -0.201    0.277    0.467     -0.337    0.166    0.043
            Minneapolis suburbs                -0.607    0.116   <10^-4     -0.174    0.075    0.021
            Washington County                   0.161    0.112    0.150      0.119    0.077    0.126
            Age                                -0.016    0.008    0.044      0.024    0.005   <10^-4
            Male                               -0.114    0.084    0.178     -0.312    0.057   <10^-4
            Caucasian                          -0.875    0.271    0.001     -0.844    0.168   <10^-4
            Body mass index (kg/m2)             0.127    0.010   <10^-4      0.033    0.006   <10^-4
            Derived glucose value (mg/dl)       0.150    0.005   <10^-4     -0.0006   0.003    0.864
            Systolic blood pressure (mmHg)      0.010    0.005    0.036      0.101    0.004   <10^-4
            Diastolic blood pressure (mmHg)     0.004    0.007    0.496      0.022    0.004   <10^-4
The regression parameters have different interpretations under different transformation models. Under the proportional odds model, the regression parameters pertain to the log hazard ratios at baseline, and the hazard ratios decrease over time. Therefore, estimates of the regression parameters tend to have larger magnitudes under the proportional odds model than under the proportional hazards model. The variance component was estimated at 0.591, 0.646 and 0.758 under the proportional hazards, proportional odds and selected models, respectively, and the corresponding standard error estimates were 0.057, 0.087 and 0.111. Thus, there is strong evidence for dependence between diabetes and hypertension. Figure 2 shows the predicted development of diabetes and hypertension for a Caucasian female and an African-American female with all other risk factors equal. The risk of both diseases is considerably higher for the African-American individual than for the Caucasian individual. The three models yield appreciably different estimates of disease-free probabilities.
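The remark that proportional odds estimates tend to be larger in magnitude can be seen directly from the model. Under $G(x) = \log(1 + x)$ and ignoring the random effect for simplicity, the hazard ratio between covariate values $x = 1$ and $x = 0$ starts at $e^{\beta}$ and decays toward 1 over time. A sketch with an illustrative baseline $\Lambda(t) = t$ (hypothetical values, not fitted ones):

```python
import math

def hazard_ratio_po(t, beta, Lam=lambda t: t):
    """Hazard ratio h(t | x=1)/h(t | x=0) under the proportional odds model
    G(x) = log(1 + x): h(t | x) is proportional to e^{beta x}/{1 + e^{beta x} Lambda(t)}."""
    num = math.exp(beta) / (1.0 + math.exp(beta) * Lam(t))
    den = 1.0 / (1.0 + Lam(t))
    return num / den

beta = 0.7
assert abs(hazard_ratio_po(0.0, beta) - math.exp(beta)) < 1e-12  # e^beta at baseline
hrs = [hazard_ratio_po(t, beta) for t in (0.0, 1.0, 5.0, 50.0)]
assert all(a > b for a, b in zip(hrs, hrs[1:]))  # monotone decreasing for beta > 0
assert abs(hrs[-1] - 1.0) < 0.1                  # approaches 1 as t grows
```

Because the covariate effect on the hazard attenuates over time under proportional odds, a larger baseline log hazard ratio is needed to produce the same overall separation in the survival curves, which is consistent with the larger estimates in Table 2 as $(r_1, r_2)$ increases.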
Fig. 2.

Estimation of disease-free probabilities for an African-American female and a Caucasian female residing in Forsyth County, North Carolina, of age 53 years, with a body mass index of 30 kg/m2, glucose level of 97 mg/dl, systolic blood pressure of 125 mmHg and diastolic blood pressure of 70 mmHg: (a) diabetes; (b) hypertension. In each panel the upper solid, dashed and dotted curves represent the Caucasian individual under the proportional hazards, proportional odds and selected models, respectively; the lower solid, dashed and dotted curves pertain to the African-American individual under the proportional hazards, proportional odds and selected models, respectively.


7. Remarks

The proposed EM algorithm, which is used for both parameter estimation and variance estimation, performs remarkably well in practical settings, as demonstrated by the simulation studies and the real-data example. We have not encountered nonconvergence with any simulated or empirical datasets. The computing time depends on the number of subjects, the number of distinct interval endpoints and the number of covariates, as well as on the convergence criterion. For the results presented in this paper, the convergence criterion was that the maximal relative change in the parameter estimates at two successive iterations be less than a prespecified small tolerance. With this criterion, it took less than half a second to analyse one simulated dataset. It took about 10 hours to analyse the Atherosclerosis Risk in Communities Study data, which involve 8735 subjects with 10 covariates and 2240 or 2303 distinct interval endpoints for diabetes or hypertension, respectively; the computing time was shortened to about one hour when the distinct values were reduced to 133 for diabetes and 138 for hypertension by rounding the examination times to the nearest month. Software implementing the proposed methods is available at http://dlin.web.unc.edu/software. We have assumed that the support of the examination times for the $k$th type of event is an interval. This assumption can be relaxed to let the support consist of several intervals or a finite number of discrete time-points; the asymptotic results continue to hold, although the consistency of $\hat\Lambda_k$ in Theorem 1 should then be stated over the support of the examination times, and in the proofs the integration over the interval should be changed to integration over the support. The framework presented in this paper can be extended to other types of multivariate data. In particular, model (1) can be extended to panel count data (Zhang, 2002) by treating model (1) as specifying the intensity function of a counting process rather than the hazard function of a failure time.
In addition, model (1) can be combined with a generalized linear mixed model that shares the random effects to jointly model longitudinal and survival data (Henderson et al., 2000; Zeng & Lin, 2007). There are new theoretical and computational challenges in estimating such multivariate models with interval-censored data.
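The computational saving from rounding examination times reported above comes purely from reducing the number of distinct interval endpoints, and hence the number of jump-size parameters to update in each M-step. A sketch with synthetic endpoint data (not the ARIC times):

```python
import random

random.seed(3)
# Synthetic continuous examination times spread over ~24 years, in years
times = [random.uniform(0.0, 24.0) for _ in range(5000)]

rounded = [round(t * 12.0) / 12.0 for t in times]  # round to the nearest month

n_exact, n_rounded = len(set(times)), len(set(rounded))
assert n_rounded < n_exact        # far fewer distinct endpoints after rounding
assert n_rounded <= 24 * 12 + 1   # at most one distinct value per month
```

Since the number of jump sizes per event type equals the number of distinct endpoints, coarsening the time scale shrinks the M-step proportionally, at the cost of a slight discretization of the censoring intervals.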
References (10 of 12 shown)

1.  Joint modelling of longitudinal measurements and event time data.

Authors:  R Henderson; P Diggle; A Dobson
Journal:  Biostatistics       Date:  2000-12       Impact factor: 5.899

2.  Efficient estimation for the proportional hazards model with bivariate current status data.

Authors:  Lianming Wang; Jianguo Sun; Xingwei Tong
Journal:  Lifetime Data Anal       Date:  2008-06       Impact factor: 1.588

3.  Second-order estimating equations for the analysis of clustered current status data.

Authors:  Richard J Cook; David Tolusso
Journal:  Biostatistics       Date:  2009-07-27       Impact factor: 5.899

4.  Regression analysis of multivariate interval-censored failure time data with application to tumorigenicity experiments.

Authors:  Xingwei Tong; Man-Hua Chen; Jianguo Sun
Journal:  Biom J       Date:  2008-06       Impact factor: 2.207

5.  A frailty model approach for regression analysis of multivariate current status data.

Authors:  Man-Hua Chen; Xingwei Tong; Jianguo Sun
Journal:  Stat Med       Date:  2009-11-30       Impact factor: 2.373

6.  Maximum likelihood estimation for semiparametric transformation models with interval-censored data.

Authors:  Donglin Zeng; Lu Mao; D Y Lin
Journal:  Biometrika       Date:  2016-05-24       Impact factor: 2.445

7.  A method for analyzing clustered interval-censored data based on Cox's model.

Authors:  Chew-Teng Kor; Kuang-Fu Cheng; Yi-Hau Chen
Journal:  Stat Med       Date:  2012-08-22       Impact factor: 2.373

8.  Cox regression analysis of multivariate failure time data: the marginal approach.

Authors:  D Y Lin
Journal:  Stat Med       Date:  1994-11-15       Impact factor: 2.373

9.  A proportional hazards model for multivariate interval-censored failure time data.

Authors:  W B Goggins; D M Finkelstein
Journal:  Biometrics       Date:  2000-09       Impact factor: 2.571

10.  The Atherosclerosis Risk in Communities (ARIC) Study: design and objectives. The ARIC investigators.

Authors: 
Journal:  Am J Epidemiol       Date:  1989-04       Impact factor: 4.897

