Interim analysis: A rational approach of decision making in clinical trial.

Abstract

Interim analysis, especially of sizeable trials, keeps the decision process free of conflict of interest while considering cost, resources, and meaningfulness of the project. Whenever necessary, such interim analysis can also prompt termination, an appropriate modification of sample size or study design, or even an early declaration of success. Given the extraordinary size and complexity of trials today, this rational approach helps to analyze and predict the outcomes of a clinical trial by incorporating what is learned during the course of a study or a clinical development program. Such an approach can also fill the gap between unmet medical needs and the interventions currently being tested by directing resources toward relevant and optimized clinical trials rather than fulfilling only business and profit goals.

Keywords: Clinical trial operation method; decision making; interim analysis; rational approach

Year: 2016 PMID: 27833889 PMCID: PMC5052936 DOI: 10.4103/2231-4040.191414

Source DB: PubMed Journal: J Adv Pharm Technol Res ISSN: 0976-2094

INTRODUCTION

Interim analysis is one of the reliable rational approaches to clinical trials that incorporate what is learned during the course of a clinical study and how it is completed, without compromising the validity or integrity of the trial. This method may encompass potential changes in all program-related resources and activities, including changes in logistical, monitoring, and recruitment procedures. On a practical level, the study requires the ability not only to measure the outcomes of interest continuously but also to make data and summarized information about those measurements available in a timely manner to different audiences according to their role in the study. In a clinical context, this means not just continuously tracking trial data collected on case report forms but also generating performance metrics that enable refinements in operations. Interest in this approach has mounted as a result of the soaring cost of clinical research and numerous trial failures, particularly costly and well-publicized failures of major late-stage trials. The simplest result of such an interim analysis is early stopping for futility or continuation of the study. This rational approach also allows clinical researchers to employ the same basic management principles as typical modern businesses, using real-time data and analysis to inform decisions that continually optimize operations.

INTERIM ANALYSIS AND STOPPING RULE

There are a number of practical and theoretical justifications for implementing this approach in clinical trials through a variety of group sequential designs, which allow a limited number of planned analyses while maintaining a prespecified overall type I error rate and the blind of the study. It is highly desirable that the interim analyses be conducted by a body independent of the one charged with the day-to-day activities of the clinical trial. There are a number of prospective statistical strategies for stopping a clinical trial early for a positive result.[1,2] Flexible strategies and other statistical procedures, such as stochastic limitation or conditional power approaches, address negative stopping:[3,4] they allow for the early termination of a clinical trial if, given the trial information available so far, the probability of reaching statistical significance in favor of the new treatment is small. There are also Bayesian or semi-Bayesian counterparts for each of these frequentist approaches.[5,6,7] Stopping rules for interim analyses based on limited data require more stringent P values for stopping than later analyses, which can have stopping P values close to the nominal level of significance. The use of practices for which there are no documented statistical strategies creates serious problems during the review process. The Guidance for Industry on Adaptive Design Clinical Trials for Drugs and Biologics was released in 2010 by the Food and Drug Administration (FDA). It included a definition of an adaptive design similar to that of the Adaptive Design Scientific Working Group: a study that includes a prospectively planned opportunity for modification of one or more specified aspects of the study design and hypotheses based on the analysis of data (usually interim data) from participants in the study.
Thus, both the Adaptive Design Scientific Working Group and FDA support the notion that changes are based on prespecified decision rules. However, FDA defines this more generally: “The term prospective here means that the adaptation was planned before data were examined in an unblinded manner by any personnel involved in planning the revision. This can include plans that are introduced or made final after the study has started if the blinded state of the personnel involved is unequivocally maintained when the modification plan is proposed.”[8]
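The conditional power idea mentioned above can be illustrated with a short calculation. The following is only a sketch under stated assumptions, not the procedure of any cited reference: it uses the standard Brownian-motion (B-value) approximation, a single fixed one-sided final critical value with no adjustment for earlier looks, and the "current trend" estimate of the treatment effect unless another drift is supplied. The function name and its parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def conditional_power(z_interim, info_fraction, alpha=0.025, drift=None):
    """Probability of exceeding the final critical value, given the interim z.

    z_interim     : z statistic observed at the interim look
    info_fraction : fraction t of the planned information accrued (0 < t < 1)
    alpha         : one-sided significance level of the final test (assumed)
    drift         : assumed expected final z; defaults to the current trend
    """
    if not 0.0 < info_fraction < 1.0:
        raise ValueError("info_fraction must lie strictly between 0 and 1")
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha)       # final critical value
    b_value = z_interim * sqrt(info_fraction)       # B(t) = Z(t) * sqrt(t)
    if drift is None:
        drift = b_value / info_fraction             # current-trend estimate
    remaining = 1.0 - info_fraction
    expected_final = b_value + drift * remaining    # E[B(1) | B(t)]
    return 1.0 - _STD_NORMAL.cdf((z_crit - expected_final) / sqrt(remaining))


if __name__ == "__main__":
    # Halfway through the trial, compare a weak and a strong interim trend.
    for z in (0.5, 1.0, 2.0):
        print(f"z = {z:.1f}: conditional power = {conditional_power(z, 0.5):.3f}")
```

A small conditional power at an interim look is the kind of evidence that a prespecified futility rule would act on; the threshold itself is a design choice that belongs in the protocol.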

PLANNED AND UNPLANNED INTERIM ANALYSES

Nominal P values need to be adjusted after planned or unplanned interim analyses, and this need is not lessened by the fact that such interim analyses were prompted by information external to the clinical trial operations. These are perhaps the most difficult to handle, and yet the most common, interim analysis issues that statistical reviewers face during the review process. The known works in this area are those of Geller, Pocock, Hughes, and Emerson. According to these, one approach is to assume that the accumulating data are continuously being looked at, but that interim analyses are carried out only when the data look interesting. This is equivalent to a continuous sequential design, and the repeated significance testing sequential designs of Armitage, McPherson, and Rowe may be appropriate. This ad hoc approach has been adopted by a number of statistical reviewers faced with problems of unplanned interim analyses during the review process of clinical trials. Besides this ad hoc approach, the more flexible alpha-spending function approach has also been suggested as a candidate for retrospective adjustment of P values due to unplanned interim analyses when the exact number of unplanned interim analyses actually carried out is known. An example of multiple looks for a comparative trial in which two treatments are being compared for efficacy is as follows. The hypotheses are H0: p2 = p1 versus H1: p2 > p1. A standard design says that for 80% power with alpha of 0.05, we need about 100 patients per arm, based on the assumption of p2 = 0.50 and p1 = 0.30, which gives a difference of 0.20. What happens, then, if we find P < 0.05 before all patients are enrolled? Why can we not look at the data a few times in the middle of the trial and conclude that one treatment is better if we see P < 0.05? When the trial is designed to find a difference between 0.30 and 0.50 but the true rates are closer, as in the simulated data of Figures 1–3 (p1 = 0.40 and p2 = 0.50), we would not expect to conclude that there is evidence for a difference.
However, if we look after every four patients, we get a scenario in which we would stop at 96 patients and conclude that there is a significant difference [Figures 1–3].[9]
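The figure of "about 100 patients per arm" quoted above can be checked with a textbook normal-approximation formula for comparing two proportions. This is a sketch only: the article does not say which formula or sidedness was used, so the function below (a hypothetical helper) takes the two-sided, uncorrected version as its default assumption.

```python
from math import ceil, sqrt
from statistics import NormalDist


def patients_per_arm(p1, p2, alpha=0.05, power=0.80, two_sided=True):
    """Approximate sample size per arm for detecting p2 versus p1.

    Uses the usual normal approximation without continuity correction;
    the choice of a two-sided test is an assumption, not the source's.
    """
    norm = NormalDist()
    tail = alpha / 2.0 if two_sided else alpha
    z_alpha = norm.inv_cdf(1.0 - tail)   # critical value of the test
    z_beta = norm.inv_cdf(power)         # quantile matching the target power
    p_bar = (p1 + p2) / 2.0              # pooled rate under the null
    numerator = (
        z_alpha * sqrt(2.0 * p_bar * (1.0 - p_bar))
        + z_beta * sqrt(p1 * (1.0 - p1) + p2 * (1.0 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


if __name__ == "__main__":
    print(patients_per_arm(0.30, 0.50))                   # 93
    print(patients_per_arm(0.30, 0.50, two_sided=False))  # 74
```

The two-sided result of 93 per arm is consistent with the rounded "about 100" in the text; a continuity correction or allowance for dropout would push the number higher.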

Figure 1

Plots show simulated data where p1 = 0.40 and p2 = 0.50

Figure 2

Plots show simulated data where p1 = 0.40 and p2 = 0.50. If we look after every ten patients, we get the scenario where we would not stop until all the 200 patients were observed and would conclude that there is no significant difference (P = 0.40)

Figure 3

If we look after every forty patients, we get the scenario where we would not stop either. If we wait until the end of the trial (n = 200), we estimate p1 to be 0.45 and p2 to be 0.52. The P value for the test of a difference is 0.40, which is not significant
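The effect of looking every 4, 10, or 40 patients can be explored with a small simulation. The sketch below is not the simulation behind the figures; it assumes 1:1 allocation, an unadjusted two-sided z-test at the 0.05 level at every look, and no look before 10 patients per arm (because the normal approximation is unreliable with very few patients). All function names, the seed, and the number of simulated trials are hypothetical choices.

```python
import random
from math import sqrt
from statistics import NormalDist

Z_CRIT = NormalDist().inv_cdf(0.975)  # unadjusted two-sided 5% threshold


def z_statistic(successes_1, successes_2, n_per_arm):
    """Pooled two-proportion z statistic with equal arm sizes."""
    pooled = (successes_1 + successes_2) / (2 * n_per_arm)
    variance = pooled * (1.0 - pooled) * 2.0 / n_per_arm
    if variance == 0.0:
        return 0.0
    return (successes_2 - successes_1) / n_per_arm / sqrt(variance)


def crosses_at_any_look(arm_1, arm_2, look_every, min_per_arm=10):
    """True if any unadjusted look gives |z| > Z_CRIT.

    look_every counts patients in both arms together; the final analysis
    is included when the arm size is a multiple of look_every // 2.
    """
    n_per_arm = len(arm_1)
    step = max(1, look_every // 2)
    for n in range(step, n_per_arm + 1, step):
        if n < min_per_arm:
            continue  # skip looks where the approximation is too crude
        if abs(z_statistic(sum(arm_1[:n]), sum(arm_2[:n]), n)) > Z_CRIT:
            return True
    return False


def rejection_rates(p1, p2, total_patients=200, schedules=(4, 10, 40, 200),
                    n_trials=2000, seed=1):
    """Fraction of simulated trials that declare a difference, per schedule."""
    rng = random.Random(seed)
    n_per_arm = total_patients // 2
    hits = {k: 0 for k in schedules}
    for _ in range(n_trials):
        arm_1 = [rng.random() < p1 for _ in range(n_per_arm)]
        arm_2 = [rng.random() < p2 for _ in range(n_per_arm)]
        for k in schedules:
            hits[k] += crosses_at_any_look(arm_1, arm_2, k)
    return {k: hits[k] / n_trials for k in schedules}


if __name__ == "__main__":
    # Under the null (no true difference), any rejection is a false positive.
    for k, rate in rejection_rates(0.40, 0.40).items():
        print(f"look every {k:>3} patients: rejection rate = {rate:.3f}")
```

With no true difference between arms, the single final analysis should reject in roughly 5% of simulated trials, while the schedules with many unadjusted looks reject noticeably more often.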

Would we have messed up if we looked early on?

Every time we look at the data and consider stopping, we introduce the chance of falsely rejecting the null hypothesis. In other words, every time we look at the data, we have the chance of a type I error. If we look at the data multiple times and use an alpha of 0.05 as our criterion for significance, we have a 5% chance of stopping each time when the null hypothesis is true. Under a true null hypothesis and just two looks at the data, we can “approximate” the error rates as follows: the probability of stopping at the first look is 0.05, the probability of stopping at the second look is 0.95 × 0.05 = 0.0475, and the total probability of stopping is 0.0975. Pocock bounds control the overall error by applying the same, stricter nominal significance level at every look; as a result, we can obtain P < 0.05 at the final look but not declare statistical significance. O′Brien-Fleming bounds use more conservative stopping boundaries at early stages. These bounds spend little alpha at the time of the interim looks and lead to boundary values at the final stage that are close to those from the fixed sample design, avoiding the problem with the Pocock bounds. The classical Pocock and O′Brien-Fleming boundaries require a prespecified number of equally spaced looks. However, a Data Safety Monitoring Board (DSMB) may require more flexibility. In that case, one could instead specify an alpha-spending function that determines the rate at which the overall type I error is to be spent during the trial. At each interim look, the type I error is partitioned according to this alpha-spending function to derive the corresponding boundary values. Because the number of looks has to be neither prespecified nor equally spaced, an O′Brien-Fleming type alpha-spending function has become the most common approach to monitoring efficacy in clinical trials.
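To make the alpha-spending idea concrete, the sketch below evaluates the two spending functions usually attributed to Lan and DeMets, one of O′Brien-Fleming type and one of Pocock type, at unequally spaced looks. It shows only how much type I error is released at each look; turning those amounts into boundary values requires numerical integration over the joint distribution of the test statistics, which is left to dedicated software. The function names and the example look times are hypothetical.

```python
from math import e, log, sqrt
from statistics import NormalDist

_NORM = NormalDist()


def obf_type_spending(t, alpha=0.05):
    """O'Brien-Fleming-type cumulative alpha spent at information fraction t."""
    if not 0.0 < t <= 1.0:
        raise ValueError("t must lie in (0, 1]")
    z = _NORM.inv_cdf(1.0 - alpha / 2.0)
    return 2.0 * (1.0 - _NORM.cdf(z / sqrt(t)))


def pocock_type_spending(t, alpha=0.05):
    """Pocock-type cumulative alpha spent at information fraction t."""
    if not 0.0 < t <= 1.0:
        raise ValueError("t must lie in (0, 1]")
    return alpha * log(1.0 + (e - 1.0) * t)


def alpha_increments(spending, info_fractions, alpha=0.05):
    """Type I error released at each look (looks need not be equally spaced)."""
    spent_so_far = 0.0
    increments = []
    for t in info_fractions:
        cumulative = spending(t, alpha)
        increments.append(cumulative - spent_so_far)
        spent_so_far = cumulative
    return increments


if __name__ == "__main__":
    looks = [0.2, 0.5, 0.9, 1.0]  # hypothetical, unequally spaced looks
    for name, fn in (("OBF-type", obf_type_spending),
                     ("Pocock-type", pocock_type_spending)):
        parts = ", ".join(f"{a:.5f}" for a in alpha_increments(fn, looks))
        print(f"{name}: {parts}")
```

Both functions spend the full alpha by the end of the trial; the O′Brien-Fleming-type function releases almost none of it at early looks, whereas the Pocock-type function releases it much more evenly.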
Some investigators have suggested that using “P” to denote “statistical significance” as a way to denote the detection of an “effect” is inappropriate, and offer other solutions such as the provision of effect size estimates and their precision from confidence intervals.[10,11] Given the lack of standard statistical methods for retrospective adjustment of P values due to unplanned interim analyses, unplanned interim analyses should be avoided, as they can compromise the results of a well-planned clinical trial. The conduct of a clinical trial is only justified if the clinical investigators consider ethical aspects in advance and if an external Ethical Committee has approved the conduct of the study according to a defined protocol. A great deal of recent discussion in the clinical trials literature has focused on response-adaptive randomization in two-arm trials; however, this represents a fairly specific and relatively infrequently used type of adaptive clinical trial (ACT).[12,13,14,15,16,17,18,19]

OPERATIONAL REQUIREMENT WHILE CONDUCTING THE TRIAL

Trials need to be carefully monitored so that decisions to stop early, whether based on trial data or external evidence, can be properly made and documented. What can be done in practical terms? First, make a realistic assessment of possible scenarios, using general experience from clinical trials. Rigorous assessment of directly relevant trials should be carried out, using techniques such as meta-analysis. Subjective beliefs about the likely relative efficacy of the treatments and the clinical benefits that would be required before a new treatment would be used routinely can also be documented at this stage, although these can be surprisingly variable, as illustrated by some work on a trial of treatment. During an ongoing trial, different individuals become unblinded to data at different time points, and the regulatory document leaves some gray areas that merit further discussion. For instance, investigators typically remain blinded until the end of the study, whereas DSMB members may be partially or fully unblinded at the time of the first interim analysis. If an investigator proposes a design change after the first interim analysis based on external factors, such as the release of results from a similar trial, one could argue that the impetus for the proposed adaptation was not based on the results of unblinded data, which would fit the FDA definition for a valid adaptive design.[8] However, if the proposed adaptation has to be reviewed and approved by the DSMB, the fact that its members have seen unblinded data would seem to imply that the definition may not be met. The role of a blinded versus unblinded statistician in the process may also be important in determining whether the definition has been met. Further clarification of these types of areas is needed in the future to ensure that researchers and regulatory authorities agree on what constitutes a valid adaptive design.
The implementation of these methods required the development of a structure to support DSMBs, which are relatively standard for modern clinical trials. This also required substantial training of clinical trialists to ensure that they understand the intricacies of the methods, as well as the potential pitfalls associated with the use of the methods. During the design of a clinical trial, several important design decisions must be made. Although study success depends on their accuracy, there may be limited information to guide the decisions. This approach addresses the uncertainty by allowing a review of accumulating data during an ongoing trial, and modifying trial characteristics accordingly if the interim information suggests that some of the original decisions may not be valid. However, it is well known that implementing many of the proposed approaches will require the clinical trials community to address several statistical, logistical, and operational hurdles. Mechanisms for stopping the trial must be identified, and criteria for stopping a trial should be explicit. Mortality and excess toxicity are obvious end points to monitor, but more complex features such as quality of life are much more difficult to assess and analyze. A particular dilemma arises when considering which end points to monitor because only short-term results, such as tumour response, acute morbidity, and early deaths, are available quickly, whereas the real value of many trials is their potential to give information on long-term survival and late morbidity. By definition, decisions to stop have to be made primarily on the early information, and it is of importance to assess to what extent this can act as surrogate information for the long-term outcomes. Monitoring for toxicity is always worthwhile, but monitoring for efficacy is likely to be most beneficial when mature data are accruing fast relative to the entry of new patients. If a trial does stop early, what are the priorities? 
The surviving trial patients should be informed of the position, which will be much easier if they gave genuine informed consent. The next priority should be the release of full results, quickly, via peer-reviewed journals, although this is difficult given the current constraints of most journals. From the statistical viewpoint, monitoring methods can be classified according to whether the method is frequentist or Bayesian,[20] and comprehensive reviews of statistical aspects of monitoring can be found in studies by Whitehead,[21] Jennison and Turnbull,[22] and Piantadosi.[23] However, regardless of the specific method used, a key issue is that statistical rules are only a part of the question, as they tend to oversimplify the information relevant to the decision that must be taken. The decision to stop a trial before the prespecified final analysis should be guided not only by statistical considerations but also by practical issues (toxicity, ease of administration, costs, etc.) and clinical considerations. For this reason, it is preferable to refer to statistical methods as guidelines, rather than rules.[24] The recognized potential ethical benefits of ACTs include a higher probability of participants receiving an effective intervention, optimized resource utilization, and accelerated treatment discovery. Ethical challenges voiced include developing procedures so that trial participants can make informed decisions about taking part in ACTs, and the plausible, though unlikely, risk of research personnel altering enrollment patterns.[25]

CONCLUSIONS

The decision to conduct an interim analysis should be based on sound scientific reasoning that is guided by clinical and statistical integrity, standard operating practices for interim analyses, and regulatory concerns. Such a decision should not be based on natural tendencies toward operational or academic curiosity. Therefore, unplanned interim analyses should be avoided, as they can compromise the results of a well-planned clinical trial. Good performance metrics enable greater understanding of the study progress, far tighter control, more effective allocation of resources such as monitoring time, faster enrollment, and, in the larger scheme of things, shorter timelines and lower costs in operations and in the decision-making process.

Financial support and sponsorship

Nil.

Conflicts of interest

There are no conflicts of interest.

19 in total

1. Commentary on Hey and Kimmelman.

Authors: Marc Buyse
Journal: Clin Trials Date: 2015-02-03 Impact factor: 2.486

2. Multiplicity in randomised trials II: subgroup and interim analyses.

Authors: Kenneth F Schulz; David A Grimes
Journal: Lancet Date: 2005 May 7-13 Impact factor: 79.321

3. Are outcome-adaptive allocation trials ethical?

Authors: Spencer Phillips Hey; Jonathan Kimmelman
Journal: Clin Trials Date: 2015-02-03 Impact factor: 2.486

Review 4. The place of experimental design and statistics in the 3Rs.

Authors: Richard M A Parker; William J Browne
Journal: ILAR J Date: 2014

5. Commentary on Hey and Kimmelman.

Authors: Edward L Korn; Boris Freidlin
Journal: Clin Trials Date: 2015-02-03 Impact factor: 2.486

6. Commentary on Hey and Kimmelman.

Authors: Scott Brian Saxman
Journal: Clin Trials Date: 2015-02-03 Impact factor: 2.486

7. Commentary on Hey and Kimmelman.

Authors: Donald A Berry
Journal: Clin Trials Date: 2015-02-03 Impact factor: 2.486

8. The fickle P value generates irreproducible results.

Authors: Lewis G Halsey; Douglas Curran-Everett; Sarah L Vowler; Gordon B Drummond
Journal: Nat Methods Date: 2015-03 Impact factor: 28.547

9. The what, why and how of Bayesian clinical trials monitoring.

Authors: L S Freedman; D J Spiegelhalter; M K Parmar
Journal: Stat Med Date: 1994 Jul 15-30 Impact factor: 2.373

10. A unified method for monitoring and analysing controlled trials.

Authors: J Grossman; M K Parmar; D J Spiegelhalter; L S Freedman
Journal: Stat Med Date: 1994-09-30 Impact factor: 2.373
