Henrik Støvring, Ivar S Kristiansen.
Abstract
BACKGROUND: To preserve patient anonymity, health register data may be provided as binned data only. As an example, we consider how to estimate mean survival time after a diagnosis of metastatic colorectal cancer from Norwegian register data on time to death or censoring binned into 30-day intervals. All events occurring in the first three months (90 days) after diagnosis were removed to achieve comparability with a clinical trial. The aim of the paper is to develop and implement a simple yet flexible method for analyzing such interval-censored and truncated data.
Year: 2011 PMID: 21867515 PMCID: PMC3748025 DOI: 10.1186/1756-0500-4-308
Source DB: PubMed Journal: BMC Res Notes ISSN: 1756-0500
Deaths and censorings among patients with colorectal cancer, Norway, 1991-2005
| FU-Year | 1991-1996 E | 1991-1996 C | 1997-2001 E | 1997-2001 C | 2002-2005 E | 2002-2005 C |
|---|---|---|---|---|---|---|
| .25 | 583 | 0 | 471 | 1 | 360 | 1 |
| 1 | 393 | 0 | 355 | 3 | 287 | 132 |
| 2 | 125 | 0 | 143 | 0 | 89 | 97 |
| 3 | 53 | 0 | 76 | 0 | 33 | 67 |
| 4 | 33 | 0 | 31 | 0 | 6 | 29 |
| 5 | 26 | 0 | 19 | 36 | ||
| 6 | 9 | 0 | 6 | 20 | ||
| 7 | 9 | 0 | 4 | 25 | ||
| 8 | 5 | 0 | 2 | 25 | ||
| 9 | 1 | 0 | 0 | 12 | ||
| 10 | 3 | 13 | ||||
| 11 | 1 | 13 | ||||
| 12 | 0 | 7 | ||||
| 13 | 0 | 10 | ||||
| 14 | 0 | 7 | ||||
| 15 | 1 | 7 | ||||
| 16 | ||||||
| Total | 1242 | 57 | 1107 | 122 | 775 | 326 |
Event counts in the dataset provided by the Cancer Registry of Norway, stratified by period of onset. FU-Year is the lower limit of the follow-up interval, with the interval extending to the subsequent limit. E is the count of events, C the count of censorings. The original data were provided as monthly counts but are here collapsed into annual counts to improve legibility.
Figure 1. Censoring distribution functions for sensitivity analysis. Distribution functions used in the sensitivity analysis of the impact of assuming different shapes of the censoring distribution. G is the conditional distribution of Z on each of the intervals [t_j; t_{j+1}), j = 1, ..., m, given that Z belongs to the j'th interval.
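The idea of imputing a time inside an interval under an assumed shape G can be sketched with inverse-transform sampling. This is a minimal illustration, not the authors' implementation: it assumes Z denotes the censoring time, the function name `impute_censoring_time` is made up for this example, and the three shapes are hypothetical stand-ins rather than the functions shown in the figure.

```python
import random


def impute_censoring_time(lower, upper, inv_cdf, rng):
    """Draw one time in [lower, upper) by inverse-transform sampling.

    inv_cdf maps a uniform draw u in [0, 1) to a position z in [0, 1)
    within the interval, i.e. it is the inverse of the assumed shape G.
    """
    u = rng.random()
    return lower + (upper - lower) * inv_cdf(u)


# Hypothetical shapes on the unit interval (assumptions, not from the paper).
shapes = {
    "uniform, G(z) = z": lambda u: u,
    "late-heavy, G(z) = z**2": lambda u: u ** 0.5,
    "early-heavy, G(z) = 1 - (1 - z)**2": lambda u: 1 - (1 - u) ** 0.5,
}

rng = random.Random(1)
means = {}
for name, inv_cdf in shapes.items():
    # Impute many censoring times in the interval [1, 2) and average them.
    draws = [impute_censoring_time(1.0, 2.0, inv_cdf, rng) for _ in range(20000)]
    means[name] = sum(draws) / len(draws)
```

Under the uniform shape the imputed times average out near the interval midpoint, while the two skewed shapes pull the average towards the end or the start of the interval.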
Parameterization of distributions
| Model | S(y) | Parameter restrictions | Mean |
|---|---|---|---|
| Weibull: | exp (- | | |
| Gamma: | | | |
| Gompertz: | | | No closed formula |
| Log-Logistic: | | | |
| Log-Normal: | | | |
† I(a, x) is the incomplete gamma function.
‡ Φ(x) is the standard normal cdf, Φ(x) = ∫_{-∞}^{x} (2π)^{-1/2} exp(-u^2/2) du.
For each distribution the survivor function S(y) is given accompanied by parameter restrictions, if applicable, and the formula for the mean survival as a function of parameters.
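As a worked illustration of how a survivor function and its mean formula fit together, the display below writes out one common Weibull parameterization. This specific form is an assumption: it is inferred from the surviving fragment `exp (-` and from the fact that it approximately reproduces the Weibull mean of 1.196 reported for 1991-1996 under multiple imputation. It is not copied from the original table.

```latex
S(y) = \exp\left(-\lambda y^{\gamma}\right), \qquad \lambda > 0,\ \gamma > 0,
\qquad
E[Y] = \int_0^\infty S(y)\,dy = \lambda^{-1/\gamma}\,\Gamma\!\left(1 + \frac{1}{\gamma}\right).

% Plug-in check with \log(\lambda) = 0.2662 and \gamma = 0.4945 (1991-1996, MI):
E[Y] \approx e^{-0.2662/0.4945}\,\Gamma(3.022) \approx 0.584 \times 2.04 \approx 1.19 .
```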
Simulation results comparing analyses with multiple imputation or single midpoint imputation in terms of bias, coverage probability and relative efficiency
| Parameter | Censoring | Estimation | n | RB (1 month) | Cov (1 month) | SEIF (1 month) | RB (2 months) | Cov (2 months) | SEIF (2 months) | RB (6 months) | Cov (6 months) | SEIF (6 months) | RB (12 months) | Cov (12 months) | SEIF (12 months) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| log(λ) | 3% | MC | 500 | 3.0 | 95.5 | 0.8 | -3.2 | 94.9 | 1.6 | 7.8 | 94.9 | 6.4 | 17.7 | 94.7 | 16.8 |
| | | | 1,000 | 0.0 | 94.7 | 0.8 | 3.5 | 94.3 | 1.7 | 2.6 | 95.6 | 6.4 | 14.5 | 94.9 | 16.7 |
| | | | 10,000 | 0.4 | 95.4 | 0.8 | -0.3 | 94.9 | 1.7 | 2.9 | 95.5 | 6.4 | 14.2 | 91.0 | 16.8 |
| | | MI | 500 | 3.0 | 95.5 | 0.8 | -3.4 | 94.9 | 1.6 | 5.7 | 94.9 | 6.4 | 11.1 | 95.0 | 16.8 |
| | | | 1,000 | -0.1 | 94.8 | 0.8 | 3.3 | 94.3 | 1.7 | 1.0 | 95.4 | 6.4 | 8.2 | 94.8 | 16.7 |
| | | | 10,000 | 0.3 | 95.4 | 0.8 | -0.3 | 94.9 | 1.7 | 1.3 | 95.6 | 6.4 | 8.0 | 94.0 | 16.8 |
| | 6% | MC | 500 | 2.3 | 95.7 | 0.7 | 0.6 | 94.7 | 1.6 | 9.4 | 95.4 | 6.3 | 29.6 | 94.0 | 16.6 |
| | | | 1,000 | 0.4 | 94.8 | 0.7 | 1.2 | 95.5 | 1.6 | 9.1 | 95.7 | 6.3 | 29.6 | 94.0 | 16.7 |
| | | | 10,000 | 1.1 | 94.7 | 0.8 | 0.7 | 94.6 | 1.7 | 9.0 | 94.0 | 6.3 | 30.7 | 77.6 | 16.8 |
| | | MI | 500 | 2.2 | 95.7 | 0.7 | 0.4 | 94.7 | 1.6 | 6.9 | 95.2 | 6.3 | 19.7 | 94.6 | 16.8 |
| | | | 1,000 | 0.3 | 94.7 | 0.7 | 1.1 | 95.5 | 1.6 | 7.2 | 95.7 | 6.3 | 21.3 | 94.6 | 16.9 |
| | | | 10,000 | 1.0 | 94.7 | 0.8 | 0.5 | 94.6 | 1.7 | 6.8 | 95.0 | 6.4 | 22.5 | 85.7 | 17.0 |
| | 9% | MC | 500 | 0.4 | 95.4 | 0.8 | -0.9 | 95.3 | 1.6 | 11.1 | 95.4 | 6.2 | 49.7 | 92.8 | 16.9 |
| | | | 1,000 | 0.6 | 95.4 | 0.7 | 0.9 | 95.4 | 1.5 | 12.2 | 95.0 | 6.2 | 46.7 | 91.8 | 16.8 |
| | | | 10,000 | 0.7 | 95.1 | 0.8 | 1.5 | 94.9 | 1.6 | 12.2 | 91.9 | 6.2 | 45.7 | 57.1 | 16.8 |
| | | MI | 500 | 0.3 | 95.4 | 0.8 | -0.9 | 95.4 | 1.6 | 8.3 | 95.4 | 6.3 | 39.3 | 93.8 | 17.2 |
| | | | 1,000 | 0.6 | 95.4 | 0.7 | 0.5 | 95.4 | 1.5 | 9.5 | 95.1 | 6.3 | 35.9 | 93.3 | 17.1 |
| | | | 10,000 | 0.5 | 95.1 | 0.8 | 1.2 | 94.8 | 1.6 | 9.4 | 93.4 | 6.3 | 34.9 | 73.7 | 17.1 |
| γ | 3% | MC | 500 | 0.6 | 95.0 | 2.0 | 1.0 | 94.8 | 4.1 | 3.9 | 94.5 | 13.5 | 7.6 | 94.8 | 30.2 |
| | | | 1,000 | 0.8 | 95.4 | 2.1 | 2.1 | 95.1 | 4.4 | 1.6 | 95.0 | 13.6 | 6.9 | 95.0 | 30.3 |
| | | | 10,000 | -0.2 | 95.3 | 2.0 | 0.5 | 95.1 | 4.2 | 1.6 | 94.9 | 13.6 | 6.0 | 90.7 | 30.3 |
| | | MI | 500 | 0.4 | 94.8 | 2.0 | 0.9 | 94.8 | 4.1 | 2.0 | 94.5 | 13.4 | 2.6 | 94.8 | 29.8 |
| | | | 1,000 | 0.6 | 95.4 | 2.1 | 2.0 | 95.0 | 4.4 | -0.0 | 94.8 | 13.5 | 2.1 | 95.0 | 29.9 |
| | | | 10,000 | -0.4 | 95.2 | 2.0 | 0.4 | 95.2 | 4.2 | -0.1 | 95.1 | 13.4 | 1.2 | 94.4 | 29.8 |
| | 6% | MC | 500 | 2.4 | 95.7 | 2.0 | 1.7 | 94.6 | 4.1 | 6.4 | 94.8 | 14.1 | 12.8 | 94.1 | 31.6 |
| | | | 1,000 | -0.4 | 95.5 | 2.0 | 1.8 | 95.4 | 4.4 | 4.5 | 94.6 | 14.1 | 13.3 | 93.9 | 31.8 |
| | | | 10,000 | 0.1 | 94.3 | 2.2 | 0.7 | 94.6 | 4.5 | 3.7 | 92.9 | 14.3 | 12.8 | 79.8 | 31.9 |
| | | MI | 500 | 2.2 | 95.6 | 2.0 | 1.5 | 94.6 | 4.1 | 4.1 | 94.9 | 14.0 | 6.8 | 94.4 | 31.3 |
| | | | 1,000 | -0.6 | 95.5 | 1.9 | 1.7 | 95.4 | 4.5 | 2.6 | 94.4 | 14.1 | 7.3 | 94.8 | 31.6 |
| | | | 10,000 | -0.0 | 94.2 | 2.2 | 0.6 | 94.6 | 4.5 | 1.8 | 94.2 | 14.2 | 6.8 | 91.3 | 31.7 |
| | 9% | MC | 500 | 3.9 | 95.6 | 2.6 | 3.0 | 95.7 | 4.7 | 6.1 | 95.4 | 15.2 | 20.3 | 92.4 | 33.8 |
| | | | 1,000 | 1.9 | 95.5 | 2.2 | 1.8 | 94.4 | 4.7 | 7.2 | 95.4 | 15.0 | 21.0 | 92.3 | 33.6 |
| | | | 10,000 | -0.1 | 95.5 | 2.3 | 1.1 | 95.0 | 4.7 | 5.4 | 92.4 | 15.0 | 20.6 | 57.1 | 33.6 |
| | | MI | 500 | 3.7 | 95.6 | 2.6 | 2.8 | 95.7 | 4.7 | 3.5 | 95.6 | 15.2 | 12.9 | 93.5 | 33.7 |
| | | | 1,000 | 1.7 | 95.6 | 2.2 | 1.6 | 94.4 | 4.7 | 4.8 | 95.4 | 15.0 | 13.3 | 94.6 | 33.5 |
| | | | 10,000 | -0.3 | 95.4 | 2.3 | 0.9 | 95.2 | 4.7 | 3.2 | 94.2 | 15.0 | 12.9 | 80.4 | 33.5 |
Simulation results for estimation of log(λ) and γ from datasets with Weibull distributed event times, constant censoring rates, ten years of follow-up, and varying widths of the intervals inducing interval censoring. Results for each setting represent analyses of 2,500 generated datasets. Column headers in months indicate the width of the intervals inducing interval censoring; Censoring refers to the annual proportion of censoring; Estimation refers to the estimation procedure: MC is single imputation of the midpoint for censoring events (Equation 5), MI is multiple imputation (Equation 4). RB is median relative bias in percent, Cov is coverage probability of nominal 95% confidence intervals, and SEIF is the percent increase in median standard error relative to an analysis with ordinary censoring but no interval censoring.
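The three summary measures defined above can be computed from the per-dataset estimates roughly as follows. This is a sketch under stated assumptions: the function name `summarize` is hypothetical, the confidence intervals are taken to be Wald intervals (estimate ± 1.96 standard errors), and the input numbers are toy values, not simulation output from the paper.

```python
import statistics


def summarize(estimates, std_errors, truth, reference_std_errors):
    """Return (RB, Cov, SEIF), all in percent, for one simulation setting."""
    z = 1.96  # normal quantile for nominal 95% intervals (Wald form assumed)
    # Relative bias of each estimate, summarized by its median.
    rel_bias = [100 * (est - truth) / abs(truth) for est in estimates]
    rb = statistics.median(rel_bias)
    # Share of datasets whose interval contains the true value.
    covered = [abs(est - truth) <= z * se for est, se in zip(estimates, std_errors)]
    cov = 100 * sum(covered) / len(covered)
    # Increase in median standard error relative to the reference analysis
    # (ordinary censoring, no interval censoring).
    seif = 100 * (
        statistics.median(std_errors) / statistics.median(reference_std_errors) - 1
    )
    return rb, cov, seif


# Toy inputs for five simulated datasets with true parameter value 1.0.
rb, cov, seif = summarize(
    estimates=[0.98, 1.02, 1.05, 0.95, 1.10],
    std_errors=[0.05] * 5,
    truth=1.0,
    reference_std_errors=[0.04] * 5,
)
```

Using medians rather than means for RB and SEIF follows the definitions in the table note and keeps the summaries robust to the occasional poorly converged fit.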
Goodness-of-fit tests and estimated mean survival times for five parametric distributions, patients with metastatic colorectal cancer, Norway 1991-2005
| Period | Model | χ2 | d.f. | p-value | Mean | 90% CI |
|---|---|---|---|---|---|---|
| 1991-1996 | Gamma | 225.1 | 62 | 0.0000 | -† | - |
| | Gompertz | 75.9 | 55 | 0.0324 | - | |
| | Log-Logistic | 81.3 | 56 | 0.0151 | 2.29 | (2.06; 2.58) |
| | Log-Normal | 116.9 | 58 | 0.0000 | 1.61 | (1.49; 1.75) |
| | Weibull | 179.7 | 60 | 0.0000 | 1.19 | (1.06; 1.34) |
| 1997-2001 | Gamma | 154.2 | 60 | 0.0000 | 1.25 | (0.91; 1.50) |
| | Gompertz | 79.2 | 53 | 0.0114 | - | |
| | Log-Logistic | 80.8 | 55 | 0.0135 | 3.20 | (2.80; 3.79) |
| | Log-Normal | 98.2 | 56 | 0.0004 | 2.06 | (1.92; 2.22) |
| | Weibull | 137.5 | 60 | 0.0000 | 1.71 | (1.56; 1.87) |
| 2002-2005 | Gamma | 46.9 | 36 | 0.1053 | 1.89 | (1.77; 2.02) |
| | Gompertz | 44.5 | 35 | 0.1312 | - | |
| | Log-Logistic | 43.1 | 34 | 0.1358 | 3.14 | (2.76; 3.65) |
| | Log-Normal | 45.3 | 35 | 0.1142 | 2.39 | (2.24; 2.54) |
| | Weibull | 46.4 | 37 | 0.1392 | 1.92 | (1.81; 2.05) |
| All | Gamma | 319.7 | 81 | 0.0000 | 1.23 | (1.06; 1.38) |
| | Gompertz | 135.8 | 70 | 0.0000 | - | |
| | Log-Logistic | 127.5 | 74 | 0.0001 | 2.87 | (2.67; 3.11) |
| | Log-Normal | 170.0 | 77 | 0.0000 | 2.01 | (1.93; 2.09) |
| | Weibull | 277.1 | 79 | 0.0000 | 1.64 | (1.55; 1.73) |
† Could not be meaningfully estimated due to unreliable parameter estimates (the estimated standard error of the α parameter approached infinity).
Goodness-of-fit χ2 test statistics, degrees of freedom (d.f.), and p-values are given for each period and for all data pooled. The mean survival times are based on 10,000 draws from a bivariate normal distribution given by the estimated parameters and their covariance. For each draw a mean is computed with the appropriate formula given in Table 2, and the median together with the 5% and 95% percentiles of this posterior distribution are used as estimate and confidence interval, respectively.
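The draw-based procedure described in this note can be sketched for the Weibull case as follows. Several pieces are assumptions rather than facts from the source: the Weibull form S(y) = exp(-λ y^γ) with mean λ^(-1/γ) Γ(1 + 1/γ), the function names, and above all the correlation `rho`, which is a purely illustrative placeholder because the estimated covariance is not reported here. The point estimates and standard errors are the 1991-1996 multiple-imputation values from the Weibull table.

```python
import math
import random
import statistics


def weibull_mean(log_lam, gamma):
    """Mean of S(y) = exp(-lam * y**gamma), an assumed parameterization."""
    return math.exp(-log_lam / gamma) * math.gamma(1 + 1 / gamma)


def simulate_mean(log_lam, gamma, se_log_lam, se_gamma, rho,
                  n_draws=10_000, seed=1):
    """Median and 5%/95% percentiles of the mean over bivariate normal draws."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_draws):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        # Correlated pair via the 2x2 Cholesky factor, written out by hand.
        draw_log_lam = log_lam + se_log_lam * z1
        draw_gamma = gamma + se_gamma * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)
        if draw_gamma > 0:  # skip draws outside the parameter space
            means.append(weibull_mean(draw_log_lam, draw_gamma))
    cuts = statistics.quantiles(means, n=20)  # cut points at 5%, 10%, ..., 95%
    return statistics.median(means), cuts[0], cuts[-1]


# rho = -0.9 is a hypothetical value chosen only for illustration.
median, p5, p95 = simulate_mean(0.2662, 0.4945, 0.0669, 0.0287, rho=-0.9)
```

The width of the resulting interval depends strongly on the assumed correlation, so reproducing the published intervals would require the actual covariance matrix of the estimates.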
Figure 2. Goodness-of-fit plots. Difference between expected and observed counts of events over the time scale. χ2 is the goodness-of-fit test statistic value with d.f. degrees of freedom and associated p-value. Note that the plot for all periods pooled has a differently scaled Y-axis than the period-specific plots.
Weibull parameter estimates and estimated mean survival times, patients with metastatic colorectal cancer, Norway 1991-2005
| Period | Estimation | log(λ) (s.e.) | γ (s.e.) | Mean | s.e. | Median | 5% | 95% |
|---|---|---|---|---|---|---|---|---|
| 1991-1996 | MI | 0.2662 (0.0669) | 0.4945 (0.0287) | 1.196 | 0.086 | 1.194 | 1.057 | 1.340 |
| | MC | 0.2662 (0.0669) | 0.4944 (0.0287) | 1.196 | 0.086 | 1.194 | 1.057 | 1.340 |
| | MA | 0.1982 (0.0799) | 0.3741 (0.0256) | 1.203 | 0.086 | 1.201 | 1.064 | 1.346 |
| | NT | -0.6520 (0.0358) | 0.9200 (0.0185) | 2.116 | 0.065 | 2.115 | 2.011 | 2.223 |
| 1997-2001 | MI | -0.0700 (0.0679) | 0.5962 (0.0334) | 1.709 | 0.095 | 1.706 | 1.556 | 1.867 |
| | MC | -0.0700 (0.0679) | 0.5962 (0.0334) | 1.709 | 0.095 | 1.706 | 1.556 | 1.867 |
| | MA | -0.1752 (0.0685) | 0.5801 (0.0319) | 1.713 | 0.094 | 1.710 | 1.561 | 1.871 |
| | NT | -0.8462 (0.0399) | 1.0051 (0.0230) | 2.317 | 0.071 | 2.314 | 2.204 | 2.435 |
| 2002-2005 | MI | -0.5902 (0.0664) | 0.9425 (0.0516) | 1.926 | 0.074 | 1.925 | 1.806 | 2.051 |
| | MC | -0.5904 (0.0664) | 0.9428 (0.0516) | 1.925 | 0.074 | 1.925 | 1.806 | 2.050 |
| | MA | -0.6118 (0.0659) | 0.9506 (0.0509) | 1.926 | 0.074 | 1.925 | 1.807 | 2.050 |
| | NT | -1.0586 (0.0467) | 1.3282 (0.0386) | 2.043 | 0.057 | 2.042 | 1.952 | 2.138 |
| All | MI | -0.0650 (0.0385) | 0.6099 (0.0197) | 1.641 | 0.052 | 1.641 | 1.555 | 1.728 |
| | MC | -0.0650 (0.0385) | 0.6099 (0.0197) | 1.641 | 0.052 | 1.641 | 1.555 | 1.728 |
| | MA | -0.0722 (0.0424) | 0.5048 (0.0178) | 1.645 | 0.052 | 1.645 | 1.559 | 1.731 |
| | NT | -0.8148 (0.0229) | 1.0132 (0.0133) | 2.224 | 0.040 | 2.223 | 2.159 | 2.291 |
Weibull parameter estimates and estimated mean survival times for the three observation periods, for survival after colorectal cancer, based on data from the Cancer Registry of Norway. Mean, s.e., median, 5%, and 95% percentiles all refer to the distribution of computed means obtained when sampling from the distribution of estimates. All subjects with event times smaller than three months are excluded by design. MI is multiple imputation (Equation 4); MC means replacing censored observations with the midpoint of their interval (Equation 5); MA means replacing all events and censored observations with the midpoint of their interval (Equation 6); NT means multiple imputation while ignoring the truncation (Equation 4 with the term S(M)^{-1} omitted).
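To make the roles of interval censoring, midpoint imputation, and the truncation term S(M)^{-1} concrete, here is a minimal log-likelihood sketch for binned counts. Equations 4 to 6 are not reproduced in this section, so this is not the authors' exact likelihood: it assumes the Weibull form S(y) = exp(-λ y^γ), places censorings at interval midpoints in the spirit of MC, and takes M = 0.25 years as the truncation time. The function names and the `truncate` flag are hypothetical. The data are the annual 2002-2005 counts from the table of deaths and censorings.

```python
import math


def survivor(y, log_lam, gamma):
    """Weibull survivor function S(y) = exp(-lam * y**gamma) (assumed form)."""
    return math.exp(-math.exp(log_lam) * y ** gamma)


def log_likelihood(log_lam, gamma, rows, truncation=0.25, truncate=True):
    """Log-likelihood for binned counts.

    rows holds tuples (lower, upper, events, censorings) per interval.
    With truncate=True every contribution is divided by S(M).
    """
    s_m = survivor(truncation, log_lam, gamma) if truncate else 1.0
    total = 0.0
    for lower, upper, events, censorings in rows:
        # An event is only known to lie in [lower, upper).
        p_event = survivor(lower, log_lam, gamma) - survivor(upper, log_lam, gamma)
        # Censorings are placed at the interval midpoint (the MC idea).
        p_cens = survivor((lower + upper) / 2, log_lam, gamma)
        total += events * math.log(p_event / s_m)
        total += censorings * math.log(p_cens / s_m)
    return total


# Annual counts for 2002-2005: (lower, upper, events, censorings).
rows_2002_2005 = [
    (0.25, 1, 360, 1),
    (1, 2, 287, 132),
    (2, 3, 89, 97),
    (3, 4, 33, 67),
    (4, 5, 6, 29),
]

ll_reported = log_likelihood(-0.5902, 0.9425, rows_2002_2005)
ll_far_off = log_likelihood(1.0, 0.5, rows_2002_2005)
ll_no_trunc = log_likelihood(-0.5902, 0.9425, rows_2002_2005, truncate=False)
```

Evaluated at the reported 2002-2005 estimates, the sketch gives a clearly higher log-likelihood than at a deliberately distant parameter pair. Setting `truncate=False` drops the S(M) term, which is the change that defines the NT variant.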
Results of sensitivity analysis with respect to the assumed shape of the interval specific censoring distribution
| G | log(λ), 2002-2005 | γ, 2002-2005 | log(λ), All | γ, All |
|---|---|---|---|---|
| z | -0.5902 (0.0664) | 0.9425 (0.0516) | -0.0650 (0.0385) | 0.6099 (0.0197) |
| | -0.5884 (0.0666) | 0.9383 (0.0516) | -0.0642 (0.0386) | 0.6090 (0.0197) |
| 1 - (1 - | -0.5924 (0.0663) | 0.9473 (0.0517) | -0.0658 (0.0385) | 0.6109 (0.0197) |
| | -0.5926 (0.0663) | 0.9475 (0.0517) | -0.0659 (0.0385) | 0.6109 (0.0197) |
| | -0.5882 (0.0666) | 0.9381 (0.0516) | -0.0642 (0.0386) | 0.6090 (0.0197) |
Weibull parameter estimates for the period 2002-2005 and all periods joined together, respectively, under four different assumptions regarding the shape of the censoring distribution on each interval, G.