
Power estimations for non-primary outcomes in randomised clinical trials.

Janus Christian Jakobsen¹,², Christian Ovesen³, Per Winkel¹, Jørgen Hilden⁴, Christian Gluud¹, Jørn Wetterslev¹.

Abstract

OBJECTIVE AND METHODS: It is rare that trialists report power estimations of non-primary outcomes. In the present article, we will describe how to define a valid hierarchy of outcomes in a randomised clinical trial, to limit problems with Type I and Type II errors, using considerations on the clinical relevance of the outcomes and power estimations.
CONCLUSION: Power estimations of non-primary outcomes may guide trialists in classifying non-primary outcomes as secondary or exploratory. The power estimations are simple and if they are used systematically, more appropriate outcome hierarchies can be defined, and trial results will become more interpretable. © Author(s) (or their employer(s)) 2019. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.

Keywords: Clinical Trials; Quality In Health Care

Year: 2019 PMID: 31175197 PMCID: PMC6588976 DOI: 10.1136/bmjopen-2018-027092

Source DB: PubMed Journal: BMJ Open ISSN: 2044-6055 Impact factor: 2.692

To avoid problems with Type I errors (false rejection of a true null hypothesis) and Type II errors (false acceptance of a false null hypothesis), and rash interpretations of the results of a randomised clinical trial, it is essential to (1) limit the number of outcomes;1 (2) adjust CIs and thresholds for significance according to the number of outcome comparisons;1 and (3) define an outcome hierarchy (outcomes classified according to their type and how they ought to be interpreted). Clinical success has many aspects and both beneficial and harmful effects ought to be interpreted, so selecting a single outcome variable is rarely feasible.1 We have previously summarised how to adjust CIs and thresholds for significance if there are multiple outcome comparisons.1 The European Medicines Agency has recently, conservatively and wisely, suggested using Bonferroni corrections.2 The present paper will describe how to define a valid hierarchy of outcomes in a randomised clinical trial, to limit problems with Type I and Type II errors, using power estimations of the non-primary outcomes. Our focus in the present paper is the overall outcome of a trial. Therefore, a Type I error will be defined as the case when the overall conclusion of a trial is that an intervention is effective when it is not. A Type II error will be defined as the case when the overall conclusion of a trial is that an intervention is not effective when it is. In order to maintain simplicity, we will focus on dichotomous and continuous outcomes, but the described principles may be used for most other types of outcomes as well.3
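As an illustration of point (2), a plain Bonferroni correction can be sketched in a few lines of Python. This is only a minimal sketch of the simplest adjustment; the function name `bonferroni_adjust` is hypothetical, and the scheme summarised in reference 1 may differ in detail.

```python
from statistics import NormalDist
from typing import Tuple


def bonferroni_adjust(alpha: float, n_comparisons: int) -> Tuple[float, float, float]:
    """Return (adjusted alpha, matching CI level, two-sided critical z).

    Plain Bonferroni: the overall Type I error risk is split equally
    across the outcome comparisons. Hypothetical helper for illustration.
    """
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    adjusted_alpha = alpha / n_comparisons
    ci_level = 1.0 - adjusted_alpha
    # Critical value used to widen the CI under a normal approximation.
    z_crit = NormalDist().inv_cdf(1.0 - adjusted_alpha / 2.0)
    return adjusted_alpha, ci_level, z_crit


# Illustrative call: three outcome comparisons at an overall 5% level.
adjusted_alpha, ci_level, z_crit = bonferroni_adjust(0.05, 3)
print(f"threshold {adjusted_alpha:.4f}, CI level {ci_level:.2%}, z {z_crit:.3f}")
```

With three comparisons, each outcome is tested against a threshold of 0.05/3 and reported with a correspondingly wider CI.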

Summary of fundamental considerations when defining outcome hierarchies in randomised clinical trials

Before considering power estimations of non-primary outcomes, we will briefly summarise what we believe are fundamental and essential considerations when defining outcome hierarchies in randomised clinical trials. It is recommended to prespecify primary and secondary outcomes, including how and when they are assessed (http://www.consort-statement.org/checklists/view/32--consort-2010/80-outcomes).3 To limit problems with multiplicity and difficulties with interpreting the trial results, it is often optimal to use only one primary outcome, and the sample size should be based on this outcome.1 The primary outcome in a randomised clinical trial should be the outcome with the highest degree of clinical relevance for the patients, that is, a patient-centred outcome. All primary and secondary outcomes in a randomised clinical trial should either be outcomes that are important for the decision to use the intervention or sufficiently validated surrogate outcomes for such important outcomes.2 4 5 History has shown us that we cannot rely on surrogate outcomes unless they are validated.4 The most often cited example is the Cardiac Arrhythmia Suppression Trial (known as CAST), in which two drugs that suppressed ventricular arrhythmias (a surrogate outcome correlated with a bad prognosis) were initially approved by the Food and Drug Administration, only to have the CAST demonstrate that, compared with placebo, individuals who had arrhythmias after myocardial infarctions and received antiarrhythmic drugs were 2.5 times as likely to die.6 It is necessary to validate a surrogate outcome before we can be confident that it can be used in clinical trials or practice.4 7 Such validation requires randomised clinical trials that assess both the surrogate and clinical outcome and show that both are changed by the intervention in a comparable manner.4 7 8 Moreover, a validated surrogate for one drug cannot guarantee that the surrogate outcome will not mislead when new drugs are being tested.8 Non-validated surrogate outcomes should always be classified as ‘exploratory outcomes’, until formal validation has been proved and accepted by the scientific community.

When planning a randomised clinical trial, it is essential to estimate the required sample size.1 9–11 However, the majority of randomised clinical trials have difficulties in obtaining the stipulated sample size,12 and trials with too small sample sizes often suggest intervention effect sizes far from the ‘true’ effect sizes shown in subsequent larger trials and meta-analyses.1 13 Even most Cochrane systematic reviews with meta-analyses do not have sufficient power.14 15

Power estimations of non-primary outcomes

Consider a single randomised clinical trial. If the estimated sample size has not been reached, the risks of Type I errors (false rejection of the null hypothesis) and Type II errors (false acceptance of the null hypothesis) should be estimated when interpreting the trial results.1 16 The threshold for statistical significance (and consequently the CI) should be adjusted to the fraction of the preplanned number of participants randomised.1 16 Clearly, there is no safeguard against all kinds of bias, but adjustment schemes in common use at the very least protect against the dangers of premature or repeated testing.1 Such adjustments should, ideally, be common practice in all high quality trials.1 16 Analogous problems arise with non-primary outcomes when the information is deemed insufficient; that is, when statistical power is not known, the data cannot unreservedly be analysed as if based on a dataset large enough to draw conclusions about a minimal important difference (MID).17 If MID effect estimates, as well as null effect, are included in the naïve 95% CI, then this indicates that more information may be needed. However, if MID effect estimates are not included in the naïve 95% CI, then it is unclear if more data are needed to uncover a worthwhile effect or if there is in fact no worthwhile difference between the groups.1 When null effect is excluded in the naïve 95% CI and it is unclear whether there is enough information, it will also be difficult to interpret the analysis results. 
Trial results tend to show spuriously beneficial or spuriously harmful effect estimates if there is insufficient information.1 Inspecting the unadjusted naïve 95% CI when the sample size has not been reached will not suffice, as such CIs would be inappropriately narrow, as stated above.1 16 In order to estimate the statistical power of an analysis, it is necessary to decide on an MID,1 17 an incidence in the control group when assessing a dichotomised outcome or an SD when assessing a continuous outcome, and an acceptable risk of Type I error adjusted according to the number of outcome comparisons.1 2 Alternatively, the sequence in which the secondary outcomes are tested may be prespecified and carried out without adjustment, but stopped when the first null hypothesis is not rejected, after which the rest of the assessments become exploratory.1 Most statistical software can easily calculate both sample sizes and the power of non-primary outcomes.18
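The quantities listed above are enough for a rough power estimate. The following Python sketch uses standard normal-approximation formulas for two equally sized groups; it is an assumption-laden illustration, not the authors' software. The function names, the equal allocation and the use of an absolute risk reduction as the MID are all hypothetical choices. The number of participants per group is taken as given, since it is normally fixed by the primary outcome.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_continuous(mid: float, sd: float, n_per_group: int,
                     alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-group comparison of means.

    Normal approximation with a common SD; the negligible probability of
    rejecting in the wrong direction is ignored.
    """
    se = sd * sqrt(2.0 / n_per_group)  # standard error of the mean difference
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return _STD_NORMAL.cdf(abs(mid) / se - z_crit)


def power_dichotomous(p_control: float, mid_abs: float, n_per_group: int,
                      alpha: float = 0.05) -> float:
    """Approximate power of a two-sided comparison of two proportions.

    mid_abs is the absolute risk reduction regarded as the MID
    (an assumption of this sketch; a relative MID would need converting).
    """
    p_exp = p_control - mid_abs
    if not (0.0 < p_control < 1.0 and 0.0 < p_exp < 1.0):
        raise ValueError("proportions must lie strictly between 0 and 1")
    # Unpooled standard error of the difference in proportions.
    se = sqrt(p_control * (1.0 - p_control) / n_per_group
              + p_exp * (1.0 - p_exp) / n_per_group)
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return _STD_NORMAL.cdf(abs(mid_abs) / se - z_crit)


# Illustrative inputs only: alpha is divided by three outcome comparisons.
alpha_adjusted = 0.05 / 3
print(power_continuous(mid=0.5, sd=1.0, n_per_group=64, alpha=alpha_adjusted))
print(power_dichotomous(p_control=0.30, mid_abs=0.10, n_per_group=300,
                        alpha=alpha_adjusted))
```

Dedicated software typically uses the t distribution or exact methods, so its results will differ slightly from this approximation, particularly for small groups.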

Power analysis should be part of standard trial methodology

For the reasons stated above, we recommend estimating, at the protocol stage, the statistical power of all non-primary outcomes for confirming or rejecting an MID. If the power is less than 80% (or 90%), then this outcome should be classified as an ‘exploratory outcome’ together with the non-validated surrogate outcomes.19 Alternatively, the CI and the thresholds for significance for the outcome in question may be adjusted due to sparse data,1 16 or the sample size could be reconsidered and increased so that the power of the non-primary outcome in question becomes 80% (or 90%).1 16

We searched for all randomised clinical trials published in the British Medical Journal during 2017 and found 10. Only one randomised clinical trial briefly mentioned that ‘A trial of this size will also give more than 80% power to detect important differences in secondary outcomes…’.20 None of the remaining nine trials reported any considerations of power of non-primary outcomes, and it is generally rare that trialists report power estimations of non-primary outcomes. As we have described, trial results always ought to be interpreted in the light of the required sample size and the obtained sample size, and without power estimations it will be difficult to draw valid conclusions from non-primary outcome results. It is simple to estimate the power of outcome tests, so it is striking that this is not done regularly by trialists. Of course, MIDs (together with a measure of variance and an acceptable risk of Type I error) need to be estimated to estimate the power of an outcome comparison, which might seem troublesome. Nevertheless, MIDs need to be defined for all important outcomes regardless of the use of power estimations, otherwise it will be difficult to judge if statistically significant results are also clinically meaningful for patients.1 All the necessary quantities (MIDs, estimations of proportion in the control group, SD) in the power estimations may possibly be estimated on the basis of a systematic review of studies, performed before the trial is conducted.

Considerations on the clinical relevance of outcomes and power estimations seem an important tool that may help define appropriate outcome hierarchies. In addition to estimating a required sample size, we believe that future trialists, when planning a randomised clinical trial, ought to estimate the power of all non-primary outcomes and consider estimating the power of subgroup comparisons. Power estimations of non-primary outcomes may guide trialists in classifying non-primary outcomes as secondary or exploratory. The power estimations are simple and if they are used systematically, more appropriate outcome hierarchies can be defined, and trial results will become more interpretable.21
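The classification rule recommended in this section can be written down compactly. The Python sketch below assumes that the power of each non-primary outcome has already been estimated at the protocol stage. The `Outcome` type, the `classify` function, the outcome names and the power values are all hypothetical and serve only to illustrate the rule.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    name: str
    power: float            # estimated power to confirm or reject the MID
    validated: bool = True  # False for a non-validated surrogate outcome


def classify(outcome: Outcome, min_power: float = 0.80) -> str:
    """Classify a non-primary outcome as 'secondary' or 'exploratory'.

    Non-validated surrogates are always exploratory; otherwise the outcome
    is secondary only if its estimated power reaches the chosen threshold
    (80% here; 90% is the stricter alternative mentioned in the text).
    """
    if not outcome.validated or outcome.power < min_power:
        return "exploratory"
    return "secondary"


# Hypothetical outcomes with made-up power values, for illustration only.
outcomes = [
    Outcome("serious adverse events", power=0.85),
    Outcome("quality of life", power=0.62),
    Outcome("biomarker level", power=0.95, validated=False),
]
for outcome in outcomes:
    print(outcome.name, "->", classify(outcome))
```

An outcome that falls short of the threshold need not stay exploratory: as noted above, the sample size can be increased or the significance threshold adjusted for sparse data instead.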

16 in total

1. CONSORT 2010 statement: updated guidelines for reporting parallel group randomized trials.

Authors: Kenneth F Schulz; Douglas G Altman; David Moher
Journal: Ann Intern Med Date: 2010-03-24 Impact factor: 25.391

2. Although not consistently superior, the absolute approach to framing the minimally important difference has advantages over the relative approach.

Authors: Yuqing Zhang; Shiyuan Zhang; Lehana Thabane; Toshi A Furukawa; Bradley C Johnston; Gordon H Guyatt
Journal: J Clin Epidemiol Date: 2015-03-11 Impact factor: 6.437

Review 3. Evidence-based clinical practice: Overview of threats to the validity of evidence and how to minimise them.

Authors: Silvio Garattini; Janus C Jakobsen; Jørn Wetterslev; Vittorio Bertelé; Rita Banzi; Ana Rath; Edmund A M Neugebauer; Martine Laville; Yvonne Masson; Virginie Hivert; Michaela Eikermann; Burc Aydin; Sandra Ngwabyt; Cecilia Martinho; Chiara Gerardi; Cezary A Szmigielski; Jacques Demotes-Mainard; Christian Gluud
Journal: Eur J Intern Med Date: 2016-05-06 Impact factor: 4.487

4. Mortality and morbidity in patients receiving encainide, flecainide, or placebo. The Cardiac Arrhythmia Suppression Trial.

Authors: D S Echt; P R Liebson; L B Mitchell; R W Peters; D Obias-Manno; A H Barker; D Arensberg; A Baker; L Friedman; H L Greene
Journal: N Engl J Med Date: 1991-03-21 Impact factor: 91.245

5. Sample size calculation for clinical trials: the impact of clinician beliefs.

Authors: P M Fayers; A Cuschieri; J Fielding; J Craven; B Uscinska; L S Freedman
Journal: Br J Cancer Date: 2000-01 Impact factor: 7.640

6. Statistical analysis plan for the EuroHYP-1 trial: European multicentre, randomised, phase III clinical trial of the therapeutic hypothermia plus best medical treatment versus best medical treatment alone for acute ischaemic stroke.

Authors: Per Winkel; Philip M Bath; Christian Gluud; Jane Lindschou; H Bart van der Worp; Malcolm R Macleod; Istvan Szabo; Isabelle Durand-Zaleski; Stefan Schwab
Journal: Trials Date: 2017-11-29 Impact factor: 2.279

7. Upright versus lying down position in second stage of labour in nulliparous women with low dose epidural: BUMPES randomised controlled trial.

Authors:
Journal: BMJ Date: 2017-10-18

Review 8. A reinvestigation of recruitment to randomised, controlled, multicenter trials: a review of trials funded by two UK funding agencies.

Authors: Ben G O Sully; Steven A Julious; Jon Nicholl
Journal: Trials Date: 2013-06-09 Impact factor: 2.279

9. The thresholds for statistical and clinical significance - a five-step procedure for evaluation of intervention effects in randomised clinical trials.

Authors: Janus Christian Jakobsen; Christian Gluud; Per Winkel; Theis Lange; Jørn Wetterslev
Journal: BMC Med Res Methodol Date: 2014-03-04 Impact factor: 4.615

10. Power analysis for random-effects meta-analysis.

Authors: Dan Jackson; Rebecca Turner
Journal: Res Synth Methods Date: 2017-04-04 Impact factor: 5.273
