Ya-Guang Peng1, Xiao-Lu Nie2, Jing-Jing Feng3, Xiao-Xia Peng1. 1. Center for Clinical Epidemiology and Evidence-based Medicine, Beijing Children's Hospital, Capital Medical University, Beijing 100045; Key Laboratory of Major Diseases in Children, Ministry of Education, Beijing 100045, China. 2. Center for Clinical Epidemiology and Evidence-based Medicine, Beijing Children's Hospital, Capital Medical University, Beijing 100045, China. 3. Center for Medical Safety and Risk Administration, National Institute of Hospital Administration, Beijing 100091, China.
Unlike trials conducted for regulatory approval of new interventions, which aim to test efficacy, comparative effectiveness research (CER) directly compares existing health-care interventions (against active controls) to examine which treatment works best, for whom, and under what settings.[1] CER is therefore indispensable in helping consumers, clinicians, purchasers, and policy makers make evidence-based decisions. The volume of CER has risen markedly since 2004, and most CER has been performed as randomized clinical trials (RCTs) with active-controlled interventions.
The major advantages of RCTs for demonstrating causality are well known: random assignment of the intervention lowers the risk of selection bias and minimizes the influence of baseline confounding, and blinding lowers the risk of performance bias and detection bias. Whether in Phase II and III clinical trials or in CER, properly designed and conducted RCTs can provide the most definitive causal inference. Compared with Phase II and III RCTs, however, CER faces a challenge that deserves careful consideration: the cleanest comparison occurs only when standardized interventions are delivered, i.e., when there are no unexpected co-interventions, such as medications, supplementary therapies, and behaviors, during the trial. RCTs are expected to be free only from baseline confounding, not from postrandomization confounding, when we study the long-term effects of sustained clinical interventions in typical patients and care settings.[2]
In addition, the control group in CER receives active treatment, and CER aims to examine the effectiveness of an intervention rather than its efficacy, so it is difficult to maintain sustained standardized interventions as in premarket clinical trials.
The baseline balance achieved by randomization can be broken, and an imbalance of interventions between trial arms may arise or be magnified by uncontrollable concomitant therapies. For example, unexpected concomitant therapies can occur when patients admitted to the Intensive Care Unit are enrolled as participants. Such postrandomization confounding may affect the estimated effect size.
We therefore attempt to define postrandomization confounding more clearly so that this potential source of bias receives serious consideration. Postrandomization confounding can be regarded as an error introduced by imbalanced concomitant therapies after random assignment in RCTs, especially in trials that aim to compare effectiveness rather than efficacy. Its impact on causal inference is shown in Figure 1.
Figure 1
The impact of postrandomization confounding on causal inference in comparative effectiveness research.
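The mechanism can be illustrated with a minimal simulation sketch. This is not an analysis from any trial discussed here. It assumes a simple additive model in which the outcome improves by `true_effect` in the active arm and by `rescue_effect` for anyone who takes a concomitant therapy after randomization; every name and parameter value below is hypothetical.

```python
import random
from statistics import mean


def simulate_trial(n_per_arm=20000, true_effect=2.0, rescue_effect=1.5,
                   p_rescue_active=0.2, p_rescue_control=0.6, seed=0):
    """Return the naive between-arm difference in mean improvement.

    Assumed additive model (hypothetical): improvement equals the
    treatment effect (active arm only) plus the concomitant-therapy
    effect (if taken) plus individual noise.
    """
    rng = random.Random(seed)

    def arm_outcomes(is_active, p_rescue):
        outcomes = []
        for _ in range(n_per_arm):
            took_rescue = rng.random() < p_rescue  # decided after randomization
            improvement = true_effect if is_active else 0.0
            improvement += rescue_effect if took_rescue else 0.0
            improvement += rng.gauss(0.0, 1.0)  # individual variation
            outcomes.append(improvement)
        return outcomes

    active = arm_outcomes(True, p_rescue_active)
    control = arm_outcomes(False, p_rescue_control)
    return mean(active) - mean(control)


# Same concomitant-therapy uptake in both arms versus higher uptake in controls.
balanced = simulate_trial(p_rescue_active=0.4, p_rescue_control=0.4)
imbalanced = simulate_trial(p_rescue_active=0.2, p_rescue_control=0.6)
print(f"balanced co-intervention:   {balanced:.2f}")
print(f"imbalanced co-intervention: {imbalanced:.2f}")
```

Under this assumed model, the naive difference is expected to be `true_effect + rescue_effect * (p_rescue_active - p_rescue_control)`. With balanced uptake it should recover roughly 2.0, whereas heavier uptake in the control arm pulls it toward roughly 1.4, even though randomization balanced the arms at baseline.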
Current Examples of the Postrandomization Confounding
The examples of postrandomization confounding in RCT-based CER discussed here support the view that postrandomization confounding can actually affect the results of CER. When designing RCTs for CER, we should not place undue emphasis on the known advantages of random assignment and blinding while ignoring other potential biases. We also hope to draw attention to how CER should be designed and reported with caution.
One RCT of acupuncture, a traditional Chinese medicine therapy, for seasonal allergic rhinitis (SAR) illustrates potential postrandomization confounding by concomitant medication. Participants diagnosed with SAR were randomly allocated to real acupuncture (RA) or sham acupuncture (SA), with well-designed randomization and blinding procedures, to assess the primary outcome, i.e., the severity of SAR symptoms. Symptom relief medications, namely the short-acting antihistamine loratadine or a decongestant nasal spray, were available to participants when needed. Four weeks of acupuncture treatment was judged a safe and effective option for the clinical management of SAR because it reduced scores on SAR symptom severity measures.[3] However, the efficacy of acupuncture might have been underestimated because of potential bias from postrandomization confounding by concomitant medication. Figure 2 shows the weekly symptom relief medication score, in which one loratadine tablet counted as one point and two sprays per nostril of oxymetazoline also counted as one point, together with the decrease from baseline to follow-up in the spector sneezing score, one of the primary symptom severity outcomes. From the third treatment period onward, the between-group mean difference in the weekly symptom relief medication score widened to about one point, indicating that participants in the RA group used less medication to relieve symptoms than those in the SA group. 
Meanwhile, the reduction in symptom severity in the RA group was significant during the same period (P = 0.02 for the sneezing score and P = 0.006 for the itchiness of ears and palate score). It can be inferred that the benefit in symptom score may have been estimated conservatively in this trial, because the between-group difference in symptom score could have been offset by the greater weekly use of symptom relief medication in the SA group; this, too, may be bias from postrandomization confounding.
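The scoring rule described above (one point per loratadine tablet, one point per oxymetazoline dose of two sprays per nostril) can be written as a short sketch. The function name and the diary entries are hypothetical and are not data from the trial; they only show how a between-group gap in the medication score would be computed.

```python
from statistics import mean


def weekly_medication_score(loratadine_tablets: int, oxymetazoline_doses: int) -> int:
    """Weekly symptom relief medication score.

    One loratadine tablet = 1 point; one oxymetazoline dose
    (two sprays per nostril) = 1 point, as described in the text.
    """
    if loratadine_tablets < 0 or oxymetazoline_doses < 0:
        raise ValueError("counts must be non-negative")
    return loratadine_tablets + oxymetazoline_doses


# Hypothetical one-week diaries: (loratadine tablets, oxymetazoline doses).
real_acupuncture = [(1, 0), (0, 1), (2, 0), (0, 0)]
sham_acupuncture = [(2, 1), (1, 1), (3, 0), (1, 0)]

ra_mean = mean(weekly_medication_score(t, d) for t, d in real_acupuncture)
sa_mean = mean(weekly_medication_score(t, d) for t, d in sham_acupuncture)
print(f"RA mean score: {ra_mean}, SA mean score: {sa_mean}, "
      f"difference: {sa_mean - ra_mean}")
```

A positive difference means the sham group relied more on relief medication, which is the kind of imbalance that could mask part of the between-group difference in symptom scores.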
Figure 2
Weekly symptom relief medication score (left) and symptom severity over time using the spector sneezing score (right) as an example.
In another large-scale multicenter RCT, the Systolic Blood Pressure Intervention Trial (SPRINT), the interventions were aimed at lowering blood pressure (BP), and the trial examined the balance of benefit and harm at different BP target values. The groups were therefore defined by BP target value (<140 mmHg in the standard treatment group vs. <120 mmHg in the intensive treatment group) rather than by different interventions.[4] SPRINT showed that targeting a systolic BP of <120 mmHg, compared with <140 mmHg, reduced the rates of major cardiovascular events (MACE) and death from any cause.
Standardizing the pharmaceutical algorithms in such a trial is clearly imperative. Accordingly, the administration of all antihypertensive medications in the trial was evidence-based. For example, angiotensin-converting enzyme inhibitors (ACE-I) and angiotensin receptor blockers (ARB) are recommended interchangeably by guidelines as first-line antihypertensive treatments.[5,6,7] Evidence that ACE-I[8,9,10,11] and ARB[12,13] prevent cardiovascular events and death shows that these classes of medication can influence the risk of MACE. Strikingly, the appendix results of SPRINT showed that the proportion of participants using more than three antihypertensive agents at the most recent visit was 56.1% in the intensive group and 24.1% in the standard group.[4] Moreover, on average one additional antihypertensive drug was used in the intensive treatment group. There were also differences in the use of ACE-I (76.7% vs. 55.2%) and ARB (8.7% vs. 4.0%).[4] This imbalance in concomitant drugs after randomization leads us to question whether the reduction in MACE in SPRINT can be clearly attributed to the target BP values or to the effects of the antihypertensive agents themselves on MACE. The apparent difference in pharmaceutical algorithms between groups could introduce postrandomization confounding.
What Needs to Happen Next? From Extensive Use to Wise Use
At the upper level of the evidence hierarchy, RCTs have always been a competitive and persuasive study design, and more and more stakeholders prefer them. However, they are not suited to every setting. In recent decades, RCT methodology has indeed improved with progress in clinical research and more rigorous statistical approaches, in design, conduct, statistical analysis, reporting, and even assessment of the risk of bias. The three main principles of controls, randomization, and blinding receive most of the attention in RCTs, but other conditions may be overlooked, such as clean comparison with standardized interventions and follow-up procedures. Postrandomization confounding is one of the potential biases introduced by neglecting the variables on which the comparison is conditioned.
RCTs can still be regarded as the first choice for comparing the effectiveness of different interventions because they can provide the most definitive causal inference. What we propose is that RCTs may be more applicable to strict premarket clinical trials than to universal use, or even overuse, particularly in some CER. The following scenarios deserve more cautious consideration when RCTs are designed for CER. First, the results of CER may be open to challenge when some unavoidable supplementary therapies or behaviors occur after randomization, such as the symptom relief medications that were not restricted but were treated as a secondary outcome in the RCT evaluating the effectiveness of acupuncture for SAR. Second, the bias is more likely to occur in long-term trials conducted in critical care settings[14] or in long-term chronic disease prevention,[2,15] because it is difficult to sustain a clean intervention in the face of unexpected disease progression. 
Third, in some special situations, such as the SPRINT study, trial arms are defined by target BP values rather than by medical interventions, so an imbalance in the treatments or behaviors used to reach the targets is bound to occur. This leaves it unclear whether the effect size of the RCT is attributable to the different target values or to the difference in medication.
Therefore, potential confounding and bias due to postrandomization confounding in CER should be considered more carefully and reported. Restrictions on concomitant therapies, or stratification in both assignment and statistical analysis, could be considered to deal with this problem. Moreover, meticulous design of the target value settings would help produce sound results. Most importantly, a standard framework for the design, conduct, and reporting of RCT-based CER will be required to control this kind of confounding.
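As a rough illustration of stratification at the analysis stage, the sketch below compares a crude between-arm difference with a size-weighted average of within-stratum differences, where strata are defined by use of a concomitant therapy. The records, arm labels, and function names are hypothetical, and this is one simple option rather than a recommended procedure: stratifying on a variable measured after randomization has its own pitfalls when use of that therapy is itself affected by the assigned treatment.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (arm, used_concomitant_therapy, symptom_reduction).
records = (
    [("active", False, 3.0)] * 8 + [("active", True, 4.5)] * 2
    + [("control", False, 1.0)] * 3 + [("control", True, 2.5)] * 7
)


def crude_difference(rows):
    """Mean outcome in the active arm minus mean outcome in the control arm."""
    active = [y for arm, _, y in rows if arm == "active"]
    control = [y for arm, _, y in rows if arm == "control"]
    return mean(active) - mean(control)


def stratified_difference(rows):
    """Size-weighted average of the within-stratum arm differences.

    Assumes both arms are represented in every stratum.
    """
    strata = defaultdict(list)
    for row in rows:
        strata[row[1]].append(row)  # group by concomitant-therapy use
    total = len(rows)
    return sum(len(s) / total * crude_difference(s) for s in strata.values())


crude = crude_difference(records)
adjusted = stratified_difference(records)
print(f"crude difference:      {crude:.2f}")
print(f"stratified difference: {adjusted:.2f}")
```

In these made-up data the control arm uses the concomitant therapy more often, so the crude difference (1.25) is smaller than the within-stratum difference (2.00) recovered by stratification.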
Authors: Robert H Eckel; John M Jakicic; Jamy D Ard; Janet M de Jesus; Nancy Houston Miller; Van S Hubbard; I-Min Lee; Alice H Lichtenstein; Catherine M Loria; Barbara E Millen; Cathy A Nonas; Frank M Sacks; Sidney C Smith; Laura P Svetkey; Thomas A Wadden; Susan Z Yanovski; Karima A Kendall; Laura C Morgan; Michael G Trisolini; George Velasco; Janusz Wnek; Jeffrey L Anderson; Jonathan L Halperin; Nancy M Albert; Biykem Bozkurt; Ralph G Brindis; Lesley H Curtis; David DeMets; Judith S Hochman; Richard J Kovacs; E Magnus Ohman; Susan J Pressler; Frank W Sellke; Win-Kuang Shen; Sidney C Smith; Gordon F Tomaselli Journal: Circulation Date: 2013-11-12 Impact factor: 29.690
Authors: Paul A James; Suzanne Oparil; Barry L Carter; William C Cushman; Cheryl Dennison-Himmelfarb; Joel Handler; Daniel T Lackland; Michael L LeFevre; Thomas D MacKenzie; Olugbenga Ogedegbe; Sidney C Smith; Laura P Svetkey; Sandra J Taler; Raymond R Townsend; Jackson T Wright; Andrew S Narva; Eduardo Ortiz Journal: JAMA Date: 2014-02-05 Impact factor: 56.272
Authors: Charlie Changli Xue; Anthony Lin Zhang; Claire Shuiqing Zhang; Cliff DaCosta; David F Story; Frank C Thien Journal: Ann Allergy Asthma Immunol Date: 2015-06-11 Impact factor: 6.347
Authors: Giuseppe Mancia; Robert Fagard; Krzysztof Narkiewicz; Josep Redón; Alberto Zanchetti; Michael Böhm; Thierry Christiaens; Renata Cifkova; Guy De Backer; Anna Dominiczak; Maurizio Galderisi; Diederick E Grobbee; Tiny Jaarsma; Paulus Kirchhof; Sverre E Kjeldsen; Stéphane Laurent; Athanasios J Manolis; Peter M Nilsson; Luis Miguel Ruilope; Roland E Schmieder; Per Anton Sirnes; Peter Sleight; Margus Viigimaa; Bernard Waeber; Faiez Zannad Journal: J Hypertens Date: 2013-07 Impact factor: 4.844