Wei Xu, Shao Hui Huang, Jie Su, Shivakumar Gudi, Brian O'Sullivan.
Abstract
PURPOSE: To facilitate the understanding of statistical principles and methods among clinicians involved in cancer research.
Keywords: Cancer; Clinical research; Data analysis; Statistical models; Statistics; Study design
Year: 2021 PMID: 33532634 PMCID: PMC7829109 DOI: 10.1016/j.ctro.2021.01.006
Source DB: PubMed Journal: Clin Transl Radiat Oncol ISSN: 2405-6308
Null Hypothesis and Alternative Hypothesis of Superiority, Non-inferiority and Equivalence Studies.
| Type of study | Null hypothesis | Alternative hypothesis | Type of test |
|---|---|---|---|
| Superiority Study | The experimental arm has the same performance as the control arm | The experimental arm has different performance compared to the control arm | Two-sided* or one-sided** |
| Non-Inferiority Study | The experimental arm is inferior to the control arm | The experimental arm is at least as effective as the control arm | One-sided** |
| Equivalence Study | The experimental arm has different performance compared to the control arm | The experimental arm is equivalent to the control arm | Two-sided* |
* Two-sided test means bi-directional (either better or worse effect) on the performance of the primary endpoint.
** One-sided test means uni-directional (i.e., better effect) on the performance of the primary endpoint.
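
To make the non-inferiority row concrete, the following is a minimal sketch of a one-sided test on a binary endpoint. It assumes an unpooled Wald z-test on the difference in success proportions; the function name `noninferiority_p_value`, the counts, and the 10-percentage-point margin are hypothetical and are not taken from the source.

```python
from math import sqrt
from statistics import NormalDist


def noninferiority_p_value(success_exp, n_exp, success_ctl, n_ctl, margin):
    """One-sided Wald z-test (illustrative assumption, not the source's method).

    H0: p_exp - p_ctl <= -margin   (experimental arm is inferior)
    H1: p_exp - p_ctl >  -margin   (experimental arm is non-inferior)
    """
    p_exp = success_exp / n_exp
    p_ctl = success_ctl / n_ctl
    # Unpooled standard error of the difference in proportions.
    se = sqrt(p_exp * (1 - p_exp) / n_exp + p_ctl * (1 - p_ctl) / n_ctl)
    # Shift the observed difference by the margin before standardizing.
    z = (p_exp - p_ctl + margin) / se
    # Upper-tail probability: only one direction counts as evidence.
    return 1.0 - NormalDist().cdf(z)


# Hypothetical counts: 170/200 successes (experimental) vs 168/200 (control).
p_value = noninferiority_p_value(170, 200, 168, 200, margin=0.10)
print(f"one-sided p = {p_value:.4f}")
```

A small p-value here rejects the null hypothesis of inferiority. The margin must be fixed in advance on clinical grounds, because the conclusion depends directly on it.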
Variables Required for Sample Size Calculation.
| Key Parameters | Definition | Conventional Value | Relationship to Sample Size |
|---|---|---|---|
| Significance Level (α) | The probability of a false-positive result | 0.05 or 0.10, one-sided or two-sided; multiplicity adjustment is needed when dealing with multiple tests | α ↓ ⇒ samples ↑ |
| Statistical Power (1-β) | The probability of a true-positive result | 0.80 or 0.90 | power ↑ ⇒ samples ↑ |
| Effect Size (θ) | Minimal clinically meaningful difference | Continuous outcome: mean difference; binary outcome: odds ratio (OR); time-to-event outcome: hazard ratio (HR) | effect size ↑ ⇒ samples ↓ |
| Variance (standard deviation, STD) | The variability of the continuous outcome measure | Only used for continuous outcomes | STD ↓ ⇒ samples ↓ |

Example: Changes in Sample Size Due to Change of Assumption (CCTG HN.6 Trial [NCT00820248]).
| Assumptions | Estimated Sample Size |
|---|---|
| Assumption 1: Effect size (HR 0.7), 2-year PFS 45% for control arm, alpha 0.05, beta 0.2, recruitment 3.2 years, additional follow-up 3 years | 320 (final sample size estimation) |
| Assumption 2: Larger effect size (HR 0.65), no change in other assumptions (larger difference in hazard rates between treatment arms, which translates into a larger difference in the actuarial rate of event manifestation) | 224 (smaller samples) |
| Assumption 3: Longer recruitment (5 years), no change in other assumptions (more events manifest within the total length of the trial) | 304 (smaller samples) |
| Assumption 4: Longer follow-up (5 years), no change in other assumptions (more events manifest within the total length of the trial) | 282 (smaller samples) |
| Assumption 5: Larger statistical power (0.9), no change in other assumptions (less chance of a false negative) | 430 (larger samples) |
| Assumption 6: Higher PFS for both the control arm (2-year PFS 60%) and the treatment arm with the same hazard ratio, no change in other assumptions (i.e., lower hazard rates for both treatment and control arms) | 400 (larger samples) |
Abbreviation: PFS: progression-free survival.
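
The direction of the relationships in the table can be illustrated for a time-to-event endpoint with a short sketch. It assumes the Schoenfeld approximation for the required number of events under 1:1 allocation and a two-sided α; the source does not state which method or sidedness was used for the trial, so the function `required_events` is hypothetical and is not expected to reproduce the sample sizes listed above.

```python
from math import ceil, log
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.05, power=0.80):
    """Schoenfeld approximation for the number of events (assumed method).

    Assumes 1:1 allocation and a two-sided significance level alpha.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(4 * (z_alpha + z_beta) ** 2 / log(hazard_ratio) ** 2)


# Larger effect size (HR further from 1) -> fewer events needed;
# higher power -> more events needed.
for hr, power in [(0.70, 0.80), (0.65, 0.80), (0.70, 0.90)]:
    print(f"HR={hr:.2f}, power={power:.2f}: {required_events(hr, power=power)} events")
```

This formula gives events, not patients. Converting events into a number of patients additionally depends on the baseline event rate, the recruitment period, and the length of follow-up, which is why those assumptions change the sample size even when the hazard ratio, α, and power are held fixed.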
Definition of Commonly Used Oncologic Outcome Endpoint and Analytic Procedure.
| Study endpoint | Endpoint definition |
|---|---|
| Overall survival (OS) | From date of diagnosis (or date of treatment or date of randomization for RCTs) to date of death from any cause or last follow-up. The event is death due to any cause. |
| Cause-specific survival (CSS) | From date of diagnosis (or date of treatment or date of randomization for RCTs) to date of death due to index cancer or last follow-up. The event is death due to index cancer. Death due to other causes can be treated as a competing risk event. |
| Recurrence-free (relapse-free) survival (RFS) | From date of diagnosis (or date of treatment or date of randomization for RCTs) to date of first relapse or date of death or last follow-up. The event is first recurrence. Usually, death without any recurrence can be treated as a competing risk event. |
| Progression/Disease-free survival (PFS/DFS) | From date of treatment to date of first recurrence (relapse) or date of death or last follow-up. The event is first recurrence or death. |
| Local failure (LF), regional failure (RF), distant failure (DF) | From date of treatment to date of local or regional or distant failure or date of death or last follow-up. The event is local or regional or distant failure. Usually, death without failure can be treated as a competing risk event. |

Definition of Event, Censor, and Competing Risk.
| First Event | OS | CSS | RFS | PFS/DFS | LC | RC | DC |
|---|---|---|---|---|---|---|---|
| None (alive, no disease) | Censor | Censor | Censor | Censor | Censor | Censor | Censor |
| Local (primary site) failure | N/A | N/A | Event | Event | Event | N/A | Competing risk |
| Regional (lymph node) failure | N/A | N/A | Event | Event | N/A | Event | Competing risk |
| Distant (remote sites) metastasis | N/A | N/A | Event | Event | N/A | N/A | Event |
| Death due to index cancer | Event | Event | Competing risk | Event | Competing risk | Competing risk | Competing risk |
| Death due to other causes | Event | Competing risk | Competing risk | Event | Competing risk | Competing risk | Competing risk |
Abbreviations: N/A: not applicable; OS: overall survival; CSS: cause-specific survival; RFS: recurrence-free survival; PFS: progression-free survival; DFS: disease-free survival; LC: local control; RC: regional control; DC: distant control.
Fig. 1. Actuarial Rate of Locoregional Failure Estimated by Kaplan-Meier Method vs Competing Risk Method in HPV-negative OPC Patients Treated at Princess Margaret Cancer Centre, Toronto, Canada.
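
The contrast between the two estimators in the figure can be sketched on a toy data set. The ten observations below are hypothetical and are not the Princess Margaret cohort; `one_minus_km` censors competing deaths (the naive Kaplan-Meier approach), while `cumulative_incidence` is a plain-Python version of the nonparametric cumulative incidence (Aalen-Johansen type) estimator.

```python
# Hypothetical toy data: (time, status) with
# 0 = censored, 1 = locoregional failure, 2 = death without failure.
data = [(1, 1), (2, 2), (3, 1), (4, 2), (5, 0),
        (6, 1), (7, 2), (8, 0), (9, 1), (10, 0)]


def one_minus_km(data, event=1):
    """1 - Kaplan-Meier, treating competing events as censored."""
    surv = 1.0
    for t in sorted({u for u, s in data if s == event}):
        at_risk = sum(1 for u, _ in data if u >= t)
        d = sum(1 for u, s in data if u == t and s == event)
        surv *= 1 - d / at_risk
    return 1 - surv


def cumulative_incidence(data, event=1):
    """Cumulative incidence of `event`, accounting for competing events."""
    event_free = 1.0  # probability of no event of any type just before t
    cif = 0.0
    for t in sorted({u for u, s in data if s != 0}):
        at_risk = sum(1 for u, _ in data if u >= t)
        d_event = sum(1 for u, s in data if u == t and s == event)
        d_any = sum(1 for u, s in data if u == t and s != 0)
        cif += event_free * d_event / at_risk
        event_free *= 1 - d_any / at_risk
    return cif


km_estimate = one_minus_km(data)
cif_estimate = cumulative_incidence(data)
print(f"1 - KM: {km_estimate:.3f}, cumulative incidence: {cif_estimate:.3f}")
```

On these toy data the Kaplan-Meier complement is about 0.69 while the cumulative incidence is 0.50. The gap arises because censoring a patient who died without failure implicitly assumes that patient could still fail later.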
Common pitfalls in study design, analysis, and reporting.
| Stage of the study | Type of pitfall | Consequence | Correction |
|---|---|---|---|
| Study design | Study population with inclusions and exclusions not described; initiation time of intervention not specified or not consistent across the trial | Introduces bias into comparison and analysis | Clearly define study cohort and be mindful of potential lead time bias |
| | No sample size calculation and power analysis | Too few samples, or too low statistical power, or waste of resources | Conduct sample size calculation and power analysis before data collection |
| | No multiplicity adjustment | Sample size underestimated, or inflation of Type I error | Conduct multiple comparison adjustment using more stringent Type I error control |
| | No control group or inappropriate control group | Introduces bias into comparison and analysis | Identify matched control group |
| | No detailed statistical analysis plan in study design | Introduces bias or incorrect statistical test is used | Develop comprehensive statistical analysis plan |
| Statistical Modeling and Analysis | Incorrect statistical models and tests on study endpoints | Introduces bias, misleading results and incorrect conclusions | Carefully identify correct statistical models in statistical analysis plan |
| | No model assumption checking and model diagnosis | Inappropriate statistical models and tests are conducted | Carefully check model assumptions and conduct model diagnosis |
| | Treating observations within the same patient as independent samples | Underestimates or overestimates within-subject variation, provides misleading results | Use appropriate statistical models to incorporate both within-subject and between-subject variations |
| | Use association tests (e.g., chi-square test or linear regression) to evaluate agreement | Provides incorrect conclusion on agreement | Conduct appropriate test on agreement such as kappa coefficient or correlation coefficient |
| | Use logistic regression on time-to-event outcomes | Ignores follow-up time, provides misleading results and conclusions | Use survival analysis models on time-to-event outcomes |
| Statistical Report and Manuscript | Use categorization on continuous factor without discussion of cut-off selection | Provides incomplete information on study evaluation | Conduct exploratory analysis on different cut-offs, explore both continuous and categorized variable |
| | Use standard error to describe variability in a population | Standard error describes the variability of a parameter estimate, not of the population | Provide standard deviation to describe variability in a population |
| | Use approximate p-values such as P < 0.05 or P > 0.05 | Incomplete information | Provide exact p-values in the report |
| | Provide p-values without corresponding confidence interval | Incomplete information | Provide both p-value and corresponding confidence interval |
| | Provide odds ratio or hazard ratio without specifying reference category | Provides incomplete information and potentially the wrong direction of association | Specify the reference group for both the comparison variable and outcome |
| | No distinction between statistical significance and clinical significance | Conclusion drawn based only on statistical significance | Draw conclusion based on both statistical and clinical significance |
| | Failure to report all the analyses that have been conducted and/or undertaking unplanned subset analysis | Potentially misleading conclusions due to selection bias or fishing | Provide all the analysis results that have been conducted for the study including subgroup and sensitivity analysis |
| | Interpreting “non-significance” as “no association” or “no effect” | Potentially misleading conclusion due to small study or limited sample size | Report both p-values and parameter estimates to provide useful information for future studies |
| | Inappropriate use of graphs and tables | Provides misleading information and conclusion | Use appropriate graphs and tables to illustrate the analysis results |
| | Claiming superiority based on unplanned subgroup and interaction analysis | Over-interpretation and drawing conclusions based on exploratory analysis results; potential false-positive inflation due to multiple comparisons | Restrict unplanned subgroup analysis to hypothesis generation; report interaction analysis results with the ratio of HRs |
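
The association-versus-agreement pitfall listed in the table can be shown with a small sketch. The two raters and their ratings are hypothetical; `cohen_kappa` implements the unweighted Cohen's kappa for two raters, and `pearson_r` is included only to show the strength of association on the same data.

```python
from math import sqrt


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two raters on the same subjects."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    categories = set(a) | set(b)
    # Agreement expected by chance from each rater's marginal frequencies.
    expected = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    return (observed - expected) / (1 - expected)


def pearson_r(a, b):
    """Pearson correlation, used here as a measure of association."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical ratings: rater B always gives the opposite call to rater A.
rater_a = [1, 1, 1, 0, 0, 0]
rater_b = [0, 0, 0, 1, 1, 1]

kappa = cohen_kappa(rater_a, rater_b)
r = pearson_r(rater_a, rater_b)
print(f"kappa = {kappa:.2f}, |r| = {abs(r):.2f}")
```

In this extreme case the ratings are perfectly associated (|r| = 1) even though the raters never agree (kappa = -1), so a significant association test says nothing about whether two measurements can be used interchangeably.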