Sara J Baart, Eric Boersma, Dimitris Rizopoulos.
Abstract
Studies with longitudinal measurements are common in clinical research. Particular interest lies in studies where the repeated measurements are used to predict a time-to-event outcome, such as mortality, in a dynamic manner. If event rates in a study are low, however, and most information is to be expected from the patients experiencing the study endpoint, it may be more cost-efficient to use only a subset of the data. One way of achieving this is by applying a case-cohort design, which selects all cases and only a random sample of the noncases. In the standard way of analyzing data in a case-cohort design, the noncases who were not selected are completely excluded from analysis; however, the overrepresentation of the cases will lead to bias. We propose to include survival information of all patients from the cohort in the analysis. We approach the fact that we do not have longitudinal information for a subset of the patients as a missing data problem and argue that the missingness mechanism is missing at random. Hence, results obtained from an appropriate model, such as a joint model, should remain valid. Simulations indicate that our method performs similarly to fitting the model on a full cohort, both in terms of parameter estimates and predictions of survival probabilities. Estimating the model on the classical version of the case-cohort design shows clear bias and worse performance of the predictions. The procedure is further illustrated in data from a biomarker study on acute coronary syndrome patients, BIOMArCS.
Keywords: case-cohort design; joint models; longitudinal data
Year: 2019 PMID: 30706536 PMCID: PMC6590325 DOI: 10.1002/sim.8113
Source DB: PubMed Journal: Stat Med ISSN: 0277-6715 Impact factor: 2.373
Figure 1. A graphical representation of the case‐cohort design
Characteristics of the simulated data sets based on 200 replications of each scenario
| | | Size subcohort: | | | | Size subcohort: | | | |
|---|---|---|---|---|---|---|---|---|---|
| % Events | | Scenario | FC | CCI | CCII | Scenario | FC | CCI | CCII |
| 20% | patients | 1 | 2000 | 2000 | 900 | 2 | 2000 | 2000 | 700 |
| | events | | 400 | 400 | 400 | | 400 | 400 | 400 |
| | event rate, % | | 20% | 20% | 40% | | 20% | 20% | 60% |
| | measurements | | 15 000 | 7000 | 7000 | | 19 000 | 6000 | 6000 |
| 5% | patients | 3 | 2000 | 2000 | 700 | 4 | 1900 | 1900 | 400 |
| | events | | 100 | 100 | 100 | | 100 | 100 | 100 |
| | event rate, % | | 5% | 5% | 15% | | 5% | 5% | 25% |
| | measurements | | 11 000 | 4500 | 4500 | | 9000 | 2000 | 2000 |
Abbreviations: CCI, case‐cohort design, retain all survival information; CCII, case‐cohort design, classical version; FC, full cohort.
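The relationship between the three data sets (FC, CCI, CCII) can be illustrated with a short sketch. This is not the authors' simulation code: the toy cohort, the function name `make_case_cohort`, the field names, and the random seeds are all assumptions made for illustration, and the subcohort fraction of 1/6 is the value the Figure 2 caption reports for scenario 2. The subcohort is drawn at random from the whole cohort and all cases are then added, so the noncases that keep their measurements form a random sample of the noncases.

```python
import random


def make_case_cohort(cohort, subcohort_fraction, seed=0):
    """Return (cci, ccii) versions of a full cohort.

    cohort: list of dicts with keys "id", "time", "event", "biomarker".
    All names here are hypothetical and chosen for illustration.
    """
    rng = random.Random(seed)
    n_sub = round(len(cohort) * subcohort_fraction)
    # Random subcohort drawn from the whole cohort, regardless of case status.
    subcohort_ids = set(rng.sample([p["id"] for p in cohort], n_sub))

    cci, ccii = [], []
    for p in cohort:
        selected = p["event"] or p["id"] in subcohort_ids
        if selected:
            # Cases and subcohort members keep their longitudinal measurements.
            cci.append(dict(p))
            ccii.append(dict(p))
        else:
            # CCI: keep survival information, treat the biomarker as missing.
            # CCII: the patient is dropped entirely.
            cci.append({**p, "biomarker": None})
    return cci, ccii


def event_rate(data):
    """Fraction of patients in the data set who experienced the event."""
    return sum(p["event"] for p in data) / len(data)


# Toy full cohort (hypothetical values): 2000 patients, 400 events (20%).
data_rng = random.Random(1)
full_cohort = [
    {
        "id": i,
        "time": data_rng.uniform(0.0, 10.0),
        "event": i < 400,
        "biomarker": [data_rng.gauss(0.0, 1.0) for _ in range(3)],
    }
    for i in range(2000)
]

cci, ccii = make_case_cohort(full_cohort, subcohort_fraction=1 / 6)

print(len(cci), round(event_rate(cci), 2))    # all patients, cohort event rate
print(len(ccii), round(event_rate(ccii), 2))  # fewer patients, inflated event rate
```

In this toy setup the CCII data set contains every case but only part of the noncases, so its event rate exceeds that of the cohort, while the CCI data set retains the cohort event rate and marks the unselected noncases' measurements as missing.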
Results from estimating a joint model on simulated data based on 200 replications per scenario
| | | Size subcohort: | | | | Size subcohort: | | | |
|---|---|---|---|---|---|---|---|---|---|
| % Events | Parameter | Scenario | FC | CCI | CCII | Scenario | FC | CCI | CCII |
| 20% | α | 1 | 0.975 | 0.971 | 0.849 | 2 | 0.976 | 0.966 | 0.799 |
| | bias | | −0.025 | −0.029 | −0.151 | | −0.024 | −0.034 | −0.201 |
| | (2.5%‐97.5%) | | (0.89‐1.07) | (0.88‐1.07) | (0.76‐0.94) | | (0.89‐1.07) | (0.88‐1.06) | (0.71‐0.89) |
| | coverage | | 92% | 91% | 13% | | 92% | 88% | 4% |
| | β₁ | | 1.003 | 0.996 | 1.087 | | 1.004 | 0.986 | 1.139 |
| | β₂ | | 0.319 | 0.331 | 0.558 | | 0.324 | 0.357 | 0.713 |
| | β₃ | | 0.110 | 0.104 | 0.142 | | 0.109 | 0.097 | 0.154 |
| | β₄ | | 0.104 | 0.105 | 0.092 | | 0.102 | 0.099 | 0.092 |
| | γ | | −1.979 | −1.987 | −1.774 | | −1.979 | −1.978 | −1.676 |
| 5% | α | 3 | 0.856 | 0.845 | 0.727 | 4 | 0.858 | 0.835 | 0.649 |
| | bias | | −0.144 | −0.155 | −0.273 | | −0.142 | −0.165 | −0.351 |
| | (2.5%‐97.5%) | | (0.74‐0.99) | (0.72‐0.98) | (0.61‐0.86) | | (0.74‐0.99) | (0.71‐0.97) | (0.53‐0.78) |
| | coverage | | 38% | 33% | 1% | | 39% | 32% | 0% |
| | β₁ | | 1.003 | 0.993 | 1.062 | | 1.005 | 0.990 | 1.127 |
| | β₂ | | 0.331 | 0.343 | 0.474 | | 0.334 | 0.371 | 0.638 |
| | β₃ | | 0.108 | 0.099 | 0.127 | | 0.106 | 0.087 | 0.146 |
| | β₄ | | 0.101 | 0.103 | 0.055 | | 0.100 | 0.107 | 0.023 |
| | γ | | −2.730 | −2.760 | −2.421 | | −2.771 | −2.806 | −2.238 |
The bias is the difference between the value estimated by each model and the simulated parameter value (estimate minus simulated value). The coverage is the percentage of simulations in which the true simulated value falls within the credible interval.
Simulated values of the parameters: α = 1, β₁ = 1, β₂ = 0.3, β₃ = 0.1, β₄ = 0.1, γ = −2.
Abbreviations: CCI, case‐cohort design, retain all survival information; CCII, case‐cohort design, classical version; FC, full cohort.
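The bias and coverage described in the table notes can be computed as in the following minimal sketch. The function name `bias_and_coverage` and the four replication results are hypothetical and are not taken from the study. The sketch assumes that each replication yields a point estimate and a credible interval for a parameter whose simulated value is known.

```python
from statistics import mean


def bias_and_coverage(estimates, intervals, true_value):
    """Bias of the mean estimate and coverage of the intervals.

    estimates: point estimates, one per simulation replication.
    intervals: (lower, upper) credible limits, one pair per replication.
    """
    # Bias: mean estimate minus the simulated (true) parameter value.
    bias = mean(estimates) - true_value
    # Coverage: fraction of replications whose interval contains the true value.
    covered = [lower <= true_value <= upper for lower, upper in intervals]
    coverage = sum(covered) / len(covered)
    return bias, coverage


# Hypothetical results from four replications (illustration only).
estimates = [0.95, 1.02, 0.90, 0.98]
intervals = [(0.85, 1.05), (0.93, 1.11), (0.80, 0.99), (0.90, 1.06)]

bias, coverage = bias_and_coverage(estimates, intervals, true_value=1.0)
print(round(bias, 4), coverage)  # -0.0375 0.75
```

Here three of the four intervals contain the true value of 1, giving a coverage of 75%, and the mean estimate of 0.9625 gives a bias of −0.0375.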
Figure 2. Predictive accuracy measures from scenario 2 (event rate: 20%; size subcohort: 1/6). AUC, area under the ROC curve; CCI, case‐cohort design, retain all survival information; CCII, case‐cohort design, classical version; PE, prediction error
Results from estimating a joint model for repeated TnI values and the combined study endpoint on two versions of the case‐cohort design in the BIOMArCS data
| | CCI | | CCII | |
|---|---|---|---|---|
| Intercept | 8.87 | (7.98, 9.66) | 8.98 | (8.26, 9.78) |
| Slope ( | −6.35 | (−7.15, −5.56) | −6.34 | (−7.07, −5.63) |
| Δ Slope ( | −6.77 | (−7.55, −5.97) | −6.76 | (−7.46, −6.08) |
| Sex | 0.54 | (0.15, 0.93) | 0.48 | (0.11, 0.88) |
| Association | 0.30 | (0.10, 0.50) | 0.33 | (0.14, 0.53) |
| Sex, survival | −0.43 | (−1.04, 0.21) | −0.44 | (−1.07, 0.15) |
| AUC | 0.551 | (0.420‐0.695) | 0.533 | (0.438‐0.633) |
| PE | 0.014 | (0.007‐0.031) | 0.017 | (0.011‐0.032) |
β₃ indicates the difference between the slope estimates before and after 30 days. The coefficient for the slope after 30 days is given by (β₂ + β₃).
The area under the ROC curve (AUC) and prediction error (PE) are calculated using longitudinal measurements up to t = 60 (days) to predict events in (60, 100]. The measures are corrected using Harrell's optimism correction and are shown with their 2.5% and 97.5% confidence limits.
Abbreviations: AUC, area under the ROC curve; CCI, case‐cohort design, retain all survival information; CCII, case‐cohort design, classical version; CI, credible interval; PE, prediction error.
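The prediction window used for the AUC and PE (measurements up to t = 60, events in (60, 100]) can be illustrated with a simplified sketch. This is not the estimator used in the study. It assumes each patient already has a predicted risk of an event in the window, it discards patients censored inside the window instead of weighting them, and it applies no optimism correction. The function name `window_auc_pe`, the field names, and the toy data are hypothetical.

```python
def window_auc_pe(patients, t_start=60.0, t_end=100.0):
    """Simplified AUC and prediction error for the window (t_start, t_end].

    patients: dicts with "time" (event or censoring time), "event" (bool),
    and "risk" (predicted probability of an event in the window).
    """
    # Only patients still at risk at t_start contribute.
    at_risk = [p for p in patients if p["time"] > t_start]
    # Cases: event inside the window. Controls: still event-free at t_end.
    cases = [p["risk"] for p in at_risk if p["event"] and p["time"] <= t_end]
    controls = [p["risk"] for p in at_risk if p["time"] > t_end]
    # Patients censored inside the window fall in neither group (simplification).

    # AUC: probability that a case has a higher predicted risk than a control,
    # counting ties as one half.
    pairs = [(c, k) for c in cases for k in controls]
    auc = sum(1.0 if c > k else 0.5 if c == k else 0.0 for c, k in pairs) / len(pairs)

    # PE: mean squared difference between observed status and predicted risk.
    scored = [(r, 1.0) for r in cases] + [(r, 0.0) for r in controls]
    pe = sum((y - r) ** 2 for r, y in scored) / len(scored)
    return auc, pe


# Toy data (hypothetical values, times in days).
patients = [
    {"time": 50.0, "event": True, "risk": 0.9},    # event before t_start: excluded
    {"time": 70.0, "event": True, "risk": 0.6},    # case
    {"time": 90.0, "event": True, "risk": 0.2},    # case
    {"time": 120.0, "event": False, "risk": 0.1},  # control
    {"time": 150.0, "event": True, "risk": 0.3},   # control (event after window)
    {"time": 80.0, "event": False, "risk": 0.5},   # censored in window: excluded
]

auc, pe = window_auc_pe(patients)
print(round(auc, 3), round(pe, 3))  # 0.75 0.225
```

A full analysis would need to account for censoring within the window and for optimism, as the table note describes for the reported values.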