Tessa Kennedy-Martin, Sarah Curtis, Douglas Faries, Susan Robinson, Joseph Johnston.
Abstract
Randomized controlled trials (RCTs) are conducted under idealized and rigorously controlled conditions that may compromise their external validity. A literature review was conducted of published English-language articles reporting studies that assessed external validity by comparing the patient samples included in RCTs of pharmaceutical interventions with patients from everyday clinical practice. The review focused on publications in the fields of cardiology, mental health, and oncology.
Four databases were searched (MEDLINE, EMBASE, Science Citation Index, and Cochrane Methodology Register). Double-abstract review and data extraction were performed as per protocol specifications.
Out of 5,456 de-duplicated abstracts, 52 studies met the inclusion criteria (cardiology, n = 20; mental health, n = 17; oncology, n = 15). Studies either compared the baseline characteristics (demographic, socioeconomic, and clinical parameters) of RCT-enrolled patients with those of a real-world population, or assessed the proportion of real-world patients who would have been eligible for RCT inclusion following the application of RCT inclusion/exclusion criteria. Many of the included studies concluded that RCT samples are highly selected and have a lower risk profile than real-world populations, with the frequent exclusion of elderly patients and patients with co-morbidities. Calculation of ineligibility rates in individual studies showed that a high proportion of the general disease population was often excluded from trials. The majority of studies (n = 37 [71.2 %]) explicitly concluded that RCT samples were not broadly representative of real-world patients and that this may limit the external validity of the RCT. Authors made a number of recommendations to improve external validity.
Findings from this review indicate that there is a need to improve the external validity of RCTs so that physicians treating patients in real-world settings have appropriate evidence on which to base their clinical decisions. This goal could be achieved by modifying trial design to include a more representative patient sample and by supplementing RCT evidence with data generated from observational studies. In general, a thoughtful approach to clinical evidence generation is required in which the trade-offs between internal and external validity are considered in a holistic and balanced manner.
Year: 2015 PMID: 26530985 PMCID: PMC4632358 DOI: 10.1186/s13063-015-1023-4
Source DB: PubMed Journal: Trials ISSN: 1745-6215 Impact factor: 2.279
Table 2. Key results and main author conclusions from Method A studies
| Study | Real-world data source | Key differences (real-world versus RCT patients) | Main author conclusions<sup>b</sup> |
|---|---|---|---|
| Cardiology | |||
| Badano et al., 2003 | MC-PR | Older, more female, higher rates of concomitant diabetes, greater LVF clinical impairment | Different |
| Björklund et al., 2004 | MC-PR | Older, more female and more CV risk factors | Different |
| Costantino et al., 2009<sup>a</sup> | SC-PR | Older, more female, lower NYHA class | Different |
| Dhruva et al., 2008 | ID | Older, more female | Different |
| Ezekowitz et al., 2012 | MC-PR | Older, more female, more co-morbidities/prior cancer | Different |
| Golomb et al., 2012 | MC-PR | Increased self-rated physical activity with increasing age | Different |
| Hutchinson-Jaffe et al., 2010 | MC-PR | Older, more female, more co-morbidities, less guideline-recommended treatment/procedures | Different |
| Melloni et al., 2010 | MC-PR | More female | Different |
| Steinberg et al., 2007 | MC-PR | Older, more co-morbidities/CVD history | NE |
| Uijen et al., 2007<sup>a</sup> | MC-PR | Older, more female, higher CVD risk | Different |
| Wagner et al., 2011 | ID | Older, more chronic diseases | NE |
| Mental health | |||
| Kushner et al., 2009 | MC-PR | Greater depression severity (some scales), lower preference for novel experiences | NE |
| Rabinowitz et al., 2003<sup>a</sup> | MC-PR | No major differences | Similar |
| Riedel et al., 2005 | SC-PR | Older, longer duration of illness, more internistic co-morbidities/hospitalizations | Similar |
| Surman et al., 2010<sup>a</sup> | SC-PR | More co-morbidities, anxiety/depression, alcohol/substance dependence | Different |
| Zarin et al., 2005<sup>a</sup> | MC-PR | Older, more female/Caucasian | Different |
| Oncology | |||
| Baquet et al., 2009 | MC-PR | Fewer females (non-sex-specific tumor RCTs), fewer males (sex-specific tumor RCTs) | NE |
| Elting et al., 2006 | SC-PR | Older, more females/chronic co-morbidities, worse health/performance status | Different |
| Fraser et al., 2011<sup>a</sup> | MC-PR | Worse disease prognosis, more drug-related toxicity, lower drug dose intensity | Different |
| Jennens et al., 2006 | MC-PR | Older | Different |
| Kalata et al., 2009 | MC-PR | Older, more females, worse prognosis | Different |
| Mengis et al., 2003<sup>a</sup> | SC-PR | Older, worse performance status, more infections/AML-MDS subtypes | Different |
| van der Linden et al., 2014 | MC-PR | Older, more females, poor prognostic factors | Different |
| Yennurajalingam et al., 2013 | SC-PR | Older, more males, higher symptom intensity scores | Different |
| Yessaian et al., 2005 | MC-PR | No major differences | Similar |
Please see Additional files 2 and 3 for more detailed results
<sup>a</sup> Studies that employed Methods A and B. In these studies, RCT and real-world populations were compared; the authors then used the eligibility criteria from the RCT of interest to determine how many patients would hypothetically have been eligible or ineligible for that trial. Results presented in this table are for Method A only (see Table 3 for Method B results).
<sup>b</sup> Different: authors explicitly comment, in their opinion, that there were meaningful differences between populations that suggested they were not representative, that the data could not be extrapolated or were not applicable to real-world settings, and/or that external validity is impacted; NE: authors do not explicitly comment on external validity or do not comment on external validity despite demonstration of differences in baseline characteristics; Similar: authors comment that populations are similar and/or that RCT results are generalizable to the overall disease population.
AML acute myeloid leukemia, CV cardiovascular, CVD cardiovascular disease, ID insurance data, LVF left ventricular function, MC-PR patient records - multicenter (including multicenter registries), MDS myelodysplastic syndrome, NYHA New York Heart Association, RCT randomized controlled trial, SC-PR patient records - single center
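Method A studies compared baseline characteristics such as age and sex between a real-world population and RCT-enrolled patients. The sketch below shows one way such a comparison might look. It is not the procedure of any included study: the function names, the choice of a standardized mean difference and a two-proportion z-test, and every number are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def standardized_mean_difference(real_world, trial):
    """Difference in means divided by the pooled standard deviation.

    Positive values mean the real-world sample has the higher mean.
    """
    pooled_sd = sqrt((stdev(real_world) ** 2 + stdev(trial) ** 2) / 2)
    return (mean(real_world) - mean(trial)) / pooled_sd


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for a difference between two proportions.

    x1/n1: events and sample size in the real-world population.
    x2/n2: events and sample size in the RCT sample.
    Returns (z statistic, two-sided p-value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical ages (years); not taken from any study in the table.
real_world_ages = [72, 68, 75, 80, 66, 71]
trial_ages = [61, 58, 65, 63, 59, 60]
smd = standardized_mean_difference(real_world_ages, trial_ages)

# Hypothetical counts of female patients: 45/100 real-world vs 25/100 in the RCT.
z, p_value = two_proportion_z_test(45, 100, 25, 100)

print(f"Age SMD: {smd:.2f}")
print(f"Female proportion: z = {z:.2f}, p = {p_value:.4f}")
```

A positive standardized difference for age and a small p-value for the proportion of women would correspond to the "Older, more female" pattern that recurs in the table above.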
Table 3. Key results and main author conclusions from Method B studies
| Study | Real-world data source | % ineligibility<sup>a</sup> | Key differences (ineligible versus eligible patients) | Main author conclusions<sup>b</sup> |
|---|---|---|---|---|
| Cardiology | ||||
| Bahit et al., 2003 | MC-PR | 33.6 | Older, more females/previous MI, lower ASA use, longer LOS | Different |
| Bosch et al., 2008 | SC-PR | 41.2 | Older, higher risk profile | Different |
| Collet et al., 2003 | SC-PR | 34.0 | Older, more females, higher risk score, fewer in-hospital procedures | NE |
| Costantino et al., 2009<sup>c</sup> | SC-PR | 66.2 | ND | Different |
| Fortin et al., 2006 | MC-PR | 1.4–65.5 | ND | NE |
| Koeth et al., 2009 | MC-PR | 46.4 | Older, more females, more diabetes/hypertension, less guideline-recommended treatment | Different |
| Krumholz et al., 2003 | MC-PR | 84.5 (NRMI); 90.6 (CCP) | ND | Similar |
| Lenzen et al., 2005 | MC-PR | 61.6 | Older, more females, more co-morbid hypertension/ACS/renal impairment, less guideline-recommended treatment at baseline | Different |
| Masoudi et al., 2003 | ID | 67.0 | ND | Different |
| Steg et al., 2007 | MC-PR | 33.6 | Older, history of MI, diabetes, TIA, PAD, and CABG, less guideline-recommended treatment/procedures, high risk score | Different |
| Uijen et al., 2007<sup>c</sup> | MC-PR | 53.0 | ND | Different |
| Mental health | ||||
| Blanco et al., 2008 | GP | 75.8 | ND | Different |
| Goedhard et al., 2010 | SC-PR | 69.8 | Older, more Axis II personality disorders | Different |
| Hoertel et al., 2013 | GP | 58.2 (bipolar); 55.8 (mania) | ND | Different |
| Keitner et al., 2003 | SC-PR | 85.5 | ND | Different |
| Khan et al., 2005 | GP | 98.2 | ND | Different |
| Rabinowitz et al., 2003<sup>c</sup> | MC-PR | 33.0 | ND | Similar |
| Seemuller et al., 2010 | MC-PR | 69.0 | Younger, trend to younger age at disease onset | Similar |
| Storosum et al., 2004 | SC-PR | 83.8<sup>d</sup> | ND | Different |
| Surman et al., 2010<sup>c</sup> | SC-PR | 61.0 | More lifetime co-morbidity, lower overall functioning/SES | Different |
| Talamo et al., 2008 | SC-PR | 77.6 | Few differences | Similar |
| van der Lem et al., 2011 | SC-PR | 75.5–81.2<sup>e</sup> | ND | NE |
| Wisniewski et al., 2009 | MC-PR | 77.8 | Older, less educated, more black/Hispanic, longer disease duration, history of suicide and substance abuse, more atypical features | Different |
| Zarin et al., 2005<sup>c</sup> | MC-PR | 55.0 (bipolar); 38.0 (schizophrenia) | More co-morbidity, lower global functioning, greater use of antipsychotic medication | Different |
| Zetin and Hoepner, 2007 | SC-PR | 91.4 | ND | Different |
| Zimmerman et al., 2004 | SC-PR | 65.8 | ND | Different |
| Oncology | ||||
| Clarey et al., 2012 | SC-PR | 31.0–76.0 | ND | Different |
| Filion et al., 2012 | SC-PR | –<sup>f</sup> | ND | Similar |
| Fraser et al., 2011<sup>c</sup> | MC-PR | 14.9 | ND | Different |
| Mengis et al., 2003<sup>c</sup> | SC-PR | 87.0 | ND | Different |
| Mol et al., 2013 | MC-PR | 21.5 | Worse performance status, higher alkaline phosphatase, less primary tumor resection | Similar |
| Somer et al., 2008 | SC-PR | 71.0 | ND | Different |
| Terschüren et al., 2010 | MC-PR | 35.9 (HL); 70.4 (hgNHL) | ND | Different |
| Vardy et al., 2009 | MC-PR | 65.0–72.0 | ND | Different |
Please see Additional files 2 and 4 for more detailed results
<sup>a</sup> Percentage of patients not eligible for RCT inclusion following the application of eligibility criteria.
<sup>b</sup> Different: authors explicitly comment, in their opinion, that there were meaningful differences between populations that suggested they were not representative, that the data could not be extrapolated or were not applicable to real-world settings, and/or that external validity is impacted; NE: authors do not explicitly comment on external validity or do not comment on external validity despite demonstration of differences in baseline characteristics; Similar: authors comment that populations are similar and/or that RCT results are generalizable to the overall disease population.
<sup>c</sup> Studies that employed Methods A and B. In these studies, RCT samples and real-world populations were compared; the authors then used the eligibility criteria from the RCT of interest to determine how many patients would hypothetically have been eligible or ineligible for that trial. Results presented in this table are for Method B only (see Table 2 for Method A results).
<sup>d</sup> Percentage of manic episodes, not number of ineligible patients.
<sup>e</sup> 75.5 % based on application of stringent criteria using the Mittman regression equation to calculate HAM-D; 81.2 % based on application of stringent criteria using the Hawley or Zimmerman regression equation to calculate HAM-D.
<sup>f</sup> Inclusion/exclusion criteria were categorized to identify criteria that might impede RCT recruitment; if any individual category was not met by > 10 % of patients with breast cancer from a retrospective cohort, then the criterion was considered a barrier to recruitment.
ACS acute coronary syndrome, ASA aspirin, CABG coronary artery bypass graft, CCP cooperative cardiovascular project, GP general population data, HL Hodgkin’s lymphoma, hgNHL high-grade non-Hodgkin’s lymphoma, ID insurance data, LOS length of stay, MC-PR patient records - multicenter (including multicenter registries and observational studies), MI myocardial infarction, ND not determined, NRMI National Registry of Myocardial Infarction, PAD peripheral arterial disease, SC-PR patient records - single center, SES socioeconomic status, TIA transient ischemic attack
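Method B studies applied RCT eligibility criteria to real-world patient records and reported the percentage of patients who would have been ineligible. The sketch below illustrates that calculation under stated assumptions. The `Patient` fields, the two exclusion rules, and the five records are hypothetical and do not reproduce the criteria of any trial in the table.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical fields; real records would carry many more variables.
    age: int
    renal_impairment: bool


# Hypothetical exclusion criteria: name -> rule returning True if the
# patient would be excluded from the trial.
EXCLUSION_CRITERIA = {
    "age over 75": lambda p: p.age > 75,
    "renal impairment": lambda p: p.renal_impairment,
}


def ineligibility_summary(patients, criteria):
    """Return (% ineligible, count of patients failing each criterion).

    A patient is ineligible if any exclusion rule applies. A patient who
    fails several rules is counted once in the percentage but once per
    rule in the reasons tally.
    """
    reasons = {name: 0 for name in criteria}
    ineligible = 0
    for patient in patients:
        failed = [name for name, rule in criteria.items() if rule(patient)]
        if failed:
            ineligible += 1
        for name in failed:
            reasons[name] += 1
    percent = round(100 * ineligible / len(patients), 1)
    return percent, reasons


# Hypothetical real-world cohort.
cohort = [
    Patient(age=80, renal_impairment=False),
    Patient(age=60, renal_impairment=True),
    Patient(age=82, renal_impairment=True),
    Patient(age=55, renal_impairment=False),
    Patient(age=70, renal_impairment=False),
]

percent_ineligible, reason_counts = ineligibility_summary(cohort, EXCLUSION_CRITERIA)
print(percent_ineligible, reason_counts)
```

Tallying reasons separately from the headline percentage mirrors the additional analysis of common reasons for trial ineligibility that many Method B studies reported.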
Fig. 1. Study selection for a literature review assessing the external validity of randomized controlled trials
Study design overview of included publications
| Characteristic, number (%) | Cardiology | Mental health | Oncology | Total |
|---|---|---|---|---|
| Total number of studies | 20 (38.5) | 17 (32.7) | 15 (28.8) | 52 (100) |
| Geography | | | | |
| USA | 8<sup>a</sup> (40.0) | 10 (58.8) | 5 (33.3) | 23 (44.2) |
| The Netherlands | 1 (5.0) | 3 (17.6) | 2 (13.3) | 6 (11.5) |
| Germany | 1 (5.0) | 2 (11.8) | 2 (13.3) | 5 (9.6) |
| Canada | 3 (15.0) | 1 (5.9) | 1 (6.7) | 5 (9.6) |
| Other | 7 (35.0) | 1 (5.9) | 5 (33.3) | 13 (25.0) |
| Method<sup>b</sup> | | | | |
| A only | 9 (45.0) | 2 (11.8) | 7 (46.7) | 18 (34.6) |
| B only | 9 (45.0) | 12 (70.6) | 6 (40.0) | 27 (51.9) |
| A and B | 2 (10.0) | 3 (17.6) | 2 (13.3) | 7 (13.5) |
| Comparisons made, Method A<sup>c,d</sup> | | | | |
| Demographics | 10 (90.9) | 5 (100) | 8 (88.9) | 23 (92.0) |
| Clinical characteristics | 8 (72.7) | 5 (100) | 7 (77.8) | 20 (80.0) |
| Treatments and procedures | 4 (36.4) | 2 (40.0) | 3 (33.3) | 9 (36.0) |
| Other<sup>e</sup> | 1 (9.1) | 1 (20.0) | 0 (0.0) | 2 (8.0) |
| Additional analyses undertaken, Method B<sup>d</sup> | | | | |
| Comparison of baseline characteristics, eligible vs ineligible patients | 6 (54.5) | 6 (40.0) | 1 (12.5) | 13 (38.2) |
| Common reasons for trial ineligibility | 7 (63.6) | 14 (93.3) | 8 (100) | 29 (85.3) |
<sup>a</sup> Includes one study conducted in the USA and Canada.
<sup>b</sup> Method A, formal statistical comparison of baseline characteristics between a real-world patient population and patients enrolled in a randomized controlled trial (RCT) in the same disease area; Method B, determination of the proportion of real-world patients who would have been trial eligible or ineligible by review of individual patient medical records followed by application of RCT eligibility criteria.
<sup>c</sup> Each study made multiple comparisons.
<sup>d</sup> Percentages calculated based on total number of studies employing method (for example, Method A studies plus Method A/B studies).
<sup>e</sup> Other comparisons included physical activity relative to “others the same age” (n = 1 cardiology study) and personality traits (n = 1 mental health study).
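The percentages in the lower half of the study design table use a method-specific denominator: studies that used the method alone plus studies that used both methods. The short sketch below reproduces that arithmetic from the counts in the table; the dictionary layout and function names are illustrative assumptions.

```python
# Study counts by method, copied from the "Method" rows of the table.
METHOD_COUNTS = {
    "Cardiology": {"A only": 9, "B only": 9, "A and B": 2},
    "Mental health": {"A only": 2, "B only": 12, "A and B": 3},
    "Oncology": {"A only": 7, "B only": 6, "A and B": 2},
}


def method_denominator(field, method):
    """Number of studies in a field that employed Method 'A' or 'B'."""
    counts = METHOD_COUNTS[field]
    return counts[f"{method} only"] + counts["A and B"]


def percent(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the table."""
    return round(100 * numerator / denominator, 1)


# Denominators per field and overall.
a_denominators = {f: method_denominator(f, "A") for f in METHOD_COUNTS}
b_denominators = {f: method_denominator(f, "B") for f in METHOD_COUNTS}
total_a = sum(a_denominators.values())
total_b = sum(b_denominators.values())

# Example rows: demographics comparisons (Method A) and common reasons
# for trial ineligibility (Method B).
demographics_cardiology = percent(10, a_denominators["Cardiology"])
demographics_total = percent(23, total_a)
reasons_total = percent(29, total_b)

print(a_denominators, b_denominators, total_a, total_b)
print(demographics_cardiology, demographics_total, reasons_total)
```

The Method B total of 34 studies obtained this way matches the number of Method B studies cited in the Fig. 2 caption.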
Fig. 2. Proportion of real-world patients ineligible for randomized controlled trials (RCTs) after application of inclusion/exclusion criteria (Method B studies). Some individual studies reported multiple ineligibility rates derived from the application of selection criteria from a number of different RCTs to a single real-world population. Hence, in the 34 studies that employed Method B, 54 different ineligibility rates were calculated
Recommendations for managing external validity issues made by included studies
| Category | Recommendation |
|---|---|
| Patient populations | Broadening of RCT inclusion and exclusion criteria |
| Patient populations | Selection of patients from more appropriate settings/populations to achieve a more representative sample (for example, prospective use of registry data; a priori estimation of patient eligibility by application of trial exclusion criteria to the target population) |
| Patient populations | Conduct of RCTs in specific patient subgroups |
| Patient populations | Standardization of inclusion/exclusion criteria and diagnostic and screening assessments across RCTs in a given medical condition |
| Intervention | Broader range of RCT treatments (that is, different and realistic dosing regimens, use of concurrent therapy, and appropriate duration of treatment); comparison of new treatments with treatments as usual rather than to a prescribed dose of a particular medicine |
| Reporting | Improved reporting of populations and results (that is, greater transparency in the reporting of how exclusion criteria are operationalized and how this influences eligibility, and of the rate and major characteristics of excluded patients) |
| Reporting | Collection, reporting, and comparison of data from patients within and outside of the trial |
| Analysis | Development of statistical analysis plans and power calculation adjustment to ensure adequate powering for subgroup analyses |
| Generation of supportive data | Conduct of observational studies after the demonstration of treatment efficacy at the RCT level |
| Generation of supportive data | Development of large patient registries in specific disease areas |
| Generation of supportive data | Adoption of pragmatic studies |
| Clinical practice recommendations | Prospective auditing of drug efficacy and safety in everyday practice settings and comparison of these data with RCT results |
| Clinical practice recommendations | Provision of more detailed product information to include the criteria by which patients were selected in pivotal RCTs |
RCT randomized controlled trial