Patrick Royston, Mahesh K B Parmar.
Abstract
BACKGROUND: Designs and analyses of clinical trials with a time-to-event outcome almost invariably rely on the hazard ratio to estimate the treatment effect and implicitly, therefore, on the proportional hazards assumption. However, the results of some recent trials indicate that there is no guarantee that the assumption will hold. Here, we describe the use of the restricted mean survival time as a possible alternative tool in the design and analysis of these trials.
Year: 2013 PMID: 24314264 PMCID: PMC3922847 DOI: 10.1186/1471-2288-13-152
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
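The restricted mean survival time (RMST) named in the abstract is commonly defined as the area under the survival curve from time 0 up to a chosen horizon t*. The following is a minimal sketch of that definition, assuming a Kaplan-Meier estimate of the survival function; the function names `kaplan_meier` and `rmst` and the toy data are illustrative and do not come from the paper.

```python
def kaplan_meier(times, events):
    """Return the distinct event times and the Kaplan-Meier survival
    estimate immediately after each of them.

    times  : follow-up time for each patient
    events : 1/True if the event was observed, 0/False if censored
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    surv, s = [], 1.0
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if e and u == t)
        s *= 1.0 - deaths / at_risk
        surv.append(s)
    return event_times, surv


def rmst(times, events, t_star):
    """Area under the Kaplan-Meier step function on [0, t_star]."""
    event_times, surv = kaplan_meier(times, events)
    area, prev_t, height = 0.0, 0.0, 1.0
    for t, s in zip(event_times, surv):
        if t >= t_star:
            break
        area += (t - prev_t) * height  # rectangle up to this event time
        prev_t, height = t, s
    area += (t_star - prev_t) * height  # last rectangle, up to the horizon
    return area


# Toy data (hypothetical): four patients, all events observed.
toy_times = [1, 2, 3, 4]
toy_events = [1, 1, 1, 1]
rmst_full = rmst(toy_times, toy_events, 4.0)
rmst_short = rmst(toy_times, toy_events, 2.5)
```

With no censoring, the result equals the sample mean of min(T, t*), which gives a quick sanity check. The treatment effect would then be the difference in RMST between the two trial arms at the same t*.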
Figure 1. Example of sample sizes as a function of the time horizon for PH (solid lines) and non-PH (dashed lines) trial designs. The designs assume recruitment over K1 = 5 yr and follow-up over K2 = 3 yr.
Sample size calculations for hypothetical trials with proportional or non-proportional hazards of the treatment effect
| | | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 7 | 8 | 424 | 368 | 415 | 359 | 4.4 | 324 | 286 | 412 | 364 |
| 2 | 6 | 8 | 426 | 363 | 422 | 359 | 4.5 | 324 | 281 | 406 | 351 |
| 3 | 5 | 8 | 432 | 360 | 431 | 359 | 4.4 | 325 | 275 | 399 | 337 |
| 4 | 4 | 8 | 440 | 356 | 444 | 359 | 4.5 | 325 | 266 | 393 | 322 |
| 5 | 3 | 7.5 | 463 | 360 | 462 | 359 | 4.3 | 328 | 258 | 389 | 305 |
| 6 | 2 | 7.0 | 488 | 359 | 490 | 359 | 4.1 | 332 | 245 | 391 | 288 |
| 7 | 1 | 6.7 | 532 | 359 | 533 | 360 | 3.8 | 351 | 237 | 406 | 273 |
See the text for further details.
Operating characteristics of the test of RMST difference
| PH | 1 | 7 | 8 | 424 | 89.6 | 4.6 | 90.7 | 4.6 |
| | 3 | 5 | 8 | 432 | 90.1 | 4.6 | 90.3 | 4.7 |
| | 5 | 3 | 7.5 | 463 | 90.9 | 5.1 | 90.7 | 4.9 |
| | 7 | 1 | 6.7 | 532 | 90.5 | 5.4 | 90.3 | 5.1 |
| Non-PH | 1 | 7 | 4.4 | 324 | 89.4 | 4.5 | 81.2 | 4.8 |
| | 3 | 5 | 4.4 | 325 | 90.0 | 5.0 | 83.2 | 5.2 |
| | 5 | 3 | 4.3 | 328 | 89.5 | 4.3 | 84.4 | 4.5 |
| | 7 | 1 | 3.8 | 351 | 91.6 | 4.8 | 83.5 | 5.5 |
Significance level and power are presented as percentages. Sample size calculations are for hypothetical trials with proportional hazards (PH) or non-proportional hazards (non-PH) of the treatment effect. Results are given for the RMST test and for the logrank test using the RMST designed sample sizes. Details are as given in Table 1.
Design parameters for the SORCE trial
| 1 | 0.779 | 0.75 | 0.65 |
| 3 | 0.635 | 0.75 | 0.75 |
| 5 | 0.576 | 0.75 | 0.85 |
| 7 | 0.532 | 0.75 | 0.9 |
| 10 | 0.488 | 0.75 | 1.0 |
| 13 | 0.454 | 0.75 | 1.0 |
“DFS prob.” denotes survival probabilities for the disease-free survival outcome. PH and non-PH refer respectively to designs with (as per protocol) and without (hypothetical) proportional hazards of the treatment effect.
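As an illustration of how design parameters of this kind can be turned into a research-arm survival curve, the sketch below applies a hazard ratio to each interval between tabulated time points. It rests on several assumptions that the table itself does not state: that the first column is time in years, that the second is the control-arm DFS probability at that time, that the third and fourth are the hazard ratios under the PH and non-PH designs, and that each hazard ratio applies over the interval ending at its row. The function name `research_arm_survival` is hypothetical.

```python
import math

# Rows copied from the SORCE design table; the column meanings are ASSUMED.
YEARS = [1, 3, 5, 7, 10, 13]
DFS_CONTROL = [0.779, 0.635, 0.576, 0.532, 0.488, 0.454]
HR_PH = [0.75, 0.75, 0.75, 0.75, 0.75, 0.75]
HR_NON_PH = [0.65, 0.75, 0.85, 0.9, 1.0, 1.0]


def research_arm_survival(control_surv, hazard_ratios):
    """Scale each interval's control-arm cumulative hazard increment by the
    hazard ratio for that interval, then convert back to survival."""
    out, cum_hazard, prev = [], 0.0, 1.0
    for s0, hr in zip(control_surv, hazard_ratios):
        # Control-arm cumulative hazard added over this interval.
        increment = math.log(prev) - math.log(s0)
        cum_hazard += hr * increment
        out.append(math.exp(-cum_hazard))
        prev = s0
    return out


dfs_ph = research_arm_survival(DFS_CONTROL, HR_PH)
dfs_non_ph = research_arm_survival(DFS_CONTROL, HR_NON_PH)

for year, s0, s_ph, s_nph in zip(YEARS, DFS_CONTROL, dfs_ph, dfs_non_ph):
    print(f"{year:>2} yr  control={s0:.3f}  PH={s_ph:.3f}  non-PH={s_nph:.3f}")
```

Under a constant hazard ratio, this reduces to raising the control-arm survival probability to the power of the hazard ratio. Under the non-PH values, the two arms share the same hazard once the ratio reaches 1.0, so the survival curves stop diverging on the log scale from that point on.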
Total sample size ( ) and for hypothetical trials based on the design of SORCE
| PH | 5 | 3 | 8 | 1656 | 608 | 1790 | 658 |
| | 5 | 5 | 10 | 1509 | 610 | 1627 | 658 |
| | 5 | 8 | 13 | 1378 | 612 | 1488 | 662 |
| non-PH | 5 | 3 | 5.4 | 1621 | 602 | 1280 | 476 |
| | 5 | 5 | 6.0 | 1803 | 751 | 1266 | 528 |
| | 5 | 8 | 8.0 | 2008 | 934 | 1488 | 692 |
Recruitment (K1) is assumed to be for 5 years in all the designs, whereas follow-up (K2) is varied over 3, 5 and 8 years.
Figure 2. Percent maturity () and power curves as a function of for the RE04 trial. Vertical lines show .
Comparison of four measures of the treatment effect in a trial
| 1. Is easily interpreted | | no | yes | yes | yes |
| 2. Does not assume proportional hazards | | no | yes | yes | yes |
| 3. Reflects entire survival history | | yes | no | yes | no |
| 4. Is a measure of survival time | | no | yes | yes | no |
| 5. Can be used with all models | | no | yes | yes | yes |
| 6. Can be calculated in any dataset | | yes | no | yes | yes |
| 7. Does not require a time point to be specified | | yes | yes | no | no |
| 8. Does not change with extended follow-up | | no | yes | yes | yes |
| 9. Is routinely associated with a clinically meaningful time point | | no | no | yes | yes |
a. The measure is the difference in the given statistic between trial arms. ADS = absolute difference in survival.
HR, RMST and derived statistics on survival for four randomized controlled trials in various cancer sites conducted by the Medical Research Council
| | 6.8 | 6.7 | 6.4 | 10.8 |
| | 6.1 | 6.7 | 6.3 | 10.8 |
| | 0.72 | 0.39 | 0.082 | 0.11 |
| | 1394 (188) | 962 (483) | 749 (417) | 802 (655) |
| Design HR | 0.75 | 0.75 | 0.75 | 0.75 |
| Achieved HR (Cox model) | 1.16 | 0.85 | 0.82 | 0.85 |
| (logrank test) | 0.3 | 0.07 | 0.04 | 0.03 |
| (G-T test) | 0.06 | 0.7 | 0.6 | 0.1 |
| RMST in control arm | 5.39 | 3.69 | 2.46 | 2.68 |
| RMST in research arm | 5.28 | 4.02 | 2.79 | 3.13 |
| Diff. | -0.11 (0.10) | 0.33 (0.18) | 0.33 (0.17) | 0.46 (0.26) |
| | 0.3 | 0.07 | 0.05 | 0.08 |
| non-PH | 0.07 | 0.4 | 0.7 | 0.04 |
See text for details.
a. Events occurring in the interval (0,t*).
b. Grambsch-Therneau test.
c. From a flexible parametric hazards model with a time-dependent treatment effect.
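The "Diff." row of the table above can be cross-checked with a simple Wald calculation. The sketch below assumes that the figures in parentheses are standard errors of the RMST difference and that the row of values 0.3, 0.07, 0.05, 0.08 that follows holds the corresponding two-sided P-values; neither is stated explicitly in the extracted table. The helper name `two_sided_p` is hypothetical, and the paper may have used a different test.

```python
import math

# (difference, assumed standard error) for the four trials, from the "Diff." row.
RMST_DIFFS = [(-0.11, 0.10), (0.33, 0.18), (0.33, 0.17), (0.46, 0.26)]


def two_sided_p(estimate, se):
    """Two-sided P-value for a normal (Wald) test of estimate / se."""
    z = estimate / se
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2.0))


p_values = [two_sided_p(diff, se) for diff, se in RMST_DIFFS]

for (diff, se), p in zip(RMST_DIFFS, p_values):
    print(f"diff={diff:+.2f}  se={se:.2f}  z={diff / se:+.2f}  p={p:.3f}")
```

After rounding, the computed values agree with the tabulated row, which supports reading the parenthesised figures as standard errors.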
Figure 3. Evolution over time ( ) of -statistics for RMST (truncated, solid lines; non-truncated, short dashed lines) and Cox (truncated, long dashed lines) tests in four randomized controlled trials in cancer.