Daniel J Niven, T Jared McCormick, Sharon E Straus, Brenda R Hemmelgarn, Lianne Jeffs, Tavish R M Barnes, Henry T Stelfox.
Abstract
BACKGROUND: The ability to reproduce experiments is a defining principle of science. Reproducibility of clinical research has received relatively little scientific attention. However, it is important as it may inform clinical practice, research agendas, and the design of future studies.
Keywords: Adoption; Critical care; De-adoption; ICU; Intensive care; Replication research; Reproducibility
Year: 2018 PMID: 29463308 PMCID: PMC5820784 DOI: 10.1186/s12916-018-1018-6
Source DB: PubMed Journal: BMC Med ISSN: 1741-7015 Impact factor: 8.775
Table 1 Reproducibility framework, terms, and definitions
| Reproducibility component | Definition |
|---|---|
| Unique clinical practice | A specific intervention applied to patients with a specific target condition (e.g., therapeutic hypothermia for patients with traumatic brain injury) |
| Reported effect of clinical practice | |
| Efficacy | For the primary outcome, statistically significant increased risk of a positive outcome, or decreased risk of a negative outcome |
| Harm | For the primary outcome or any pre-specified secondary or safety outcome, statistically significant increased risk of a negative outcome, or decreased risk of a positive outcome<sup>a</sup> |
| Lack of efficacy | For the primary outcome, a non-statistically significant change |
| Type of results reproducibility | |
| Re-test reproduction attempt | For a given clinical practice, a study that re-examined the results of an original study in another group of participants using methodology identical to that of the original study<sup>b</sup> |
| Approximate reproduction attempt | For a given clinical practice, a study that re-examined the results of an original study in another group of participants using methodology with minor changes to the population, setting, treatment, outcomes, and/or analyses relative to the original study<sup>b</sup> |
| Reproducibility classification | |
| Original study | First randomized controlled trial to examine the effects of a clinical practice<sup>c</sup> |
| Reproduction attempt | Re-test or approximate reproduction attempt for an original study |
| Consistent effect estimate between original study and reproduction attempt | Clinical practice effect reported in the reproduction attempt was congruent with that in the original study: |
| Inconsistent effect estimate between original study and reproduction attempt | Clinical practice effect reported in the reproduction attempt was different from that in the original study: |
<sup>a</sup> Where there was a significant positive effect for the primary outcome, and a significant negative effect for a safety outcome, practice classification was based on the relative importance of each outcome. For example, if survival was improved, but there was an increased incidence of adverse drug reaction, the practice was classified as having efficacy.
<sup>b</sup> Sample size of reproduction attempt was required to be at least 90% that of the original study [14].
<sup>c</sup> Early phase trials did not count as an original study; these were defined as those for which the main objective was to evaluate the feasibility of processes (recruitment, randomization, blinding, outcome assessment, etc.) required to examine the effect of the clinical practice in a later phase clinical trial [53].
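The definitions in the table and its footnotes can be read as a small set of decision rules. The sketch below is one hedged interpretation, not the authors' procedure: the `Trial` fields, the function names, and the `harm_outweighs_benefit` flag (standing in for the reviewer judgement described in footnote a) are all hypothetical. Because the rows defining consistent and inconsistent effect estimates are incomplete in this record, the sketch assumes that "consistent" means both studies fall in the same effect category.

```python
from dataclasses import dataclass
from enum import Enum


class Effect(Enum):
    EFFICACY = "efficacy"
    HARM = "harm"
    LACK_OF_EFFICACY = "lack of efficacy"


@dataclass
class Trial:
    # All field names are illustrative assumptions, not taken from the source.
    n_participants: int
    primary_significant: bool              # primary outcome statistically significant
    primary_favourable: bool               # direction of the primary effect favours the practice
    significant_harm_signal: bool = False  # significant harm on a pre-specified secondary/safety outcome
    harm_outweighs_benefit: bool = False   # reviewer judgement on relative importance (footnote a)


def classify_effect(t: Trial) -> Effect:
    """Map a trial's reported results onto the three effect categories."""
    if t.primary_significant and t.primary_favourable:
        # Benefit on the primary outcome plus a harm signal: decided by relative importance.
        if t.significant_harm_signal and t.harm_outweighs_benefit:
            return Effect.HARM
        return Effect.EFFICACY
    if t.primary_significant or t.significant_harm_signal:
        # Significant unfavourable primary result, or significant harm elsewhere.
        return Effect.HARM
    return Effect.LACK_OF_EFFICACY


def eligible_reproduction(original: Trial, attempt: Trial, min_ratio: float = 0.9) -> bool:
    """Footnote b: the attempt needs at least 90% of the original sample size."""
    return attempt.n_participants >= min_ratio * original.n_participants


def compare(original: Trial, attempt: Trial) -> str:
    """Assumption: 'consistent' means both trials land in the same effect category."""
    if not eligible_reproduction(original, attempt):
        return "not eligible"
    same = classify_effect(original) is classify_effect(attempt)
    return "consistent" if same else "inconsistent"
```

For example, a 100-participant original study showing efficacy followed by a 400-participant attempt with a non-significant primary outcome would be labelled "inconsistent" under these assumptions.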
Fig. 1 Details of the study selection process. Detailed legend: <sup>a</sup> Studies included from primary search informed the secondary searches (dashed line). Studies identified in the secondary searches that were published before the corresponding study in the primary search were classified as the original study for that practice, whereas those published after the corresponding study in the primary search were classified as a reproduction attempt. <sup>b</sup> Studies were excluded if they did not meet eligibility criteria or did not represent an original study or reproduction attempt for any study that was included from the primary search. <sup>c</sup> Classification as original study or reproduction attempt determined after analyzing final cohort of articles in context of reproducibility framework (Table 1)
Characteristics of included studies classified according to reproduction attempts
| | Practices WITHOUT a reproduction attempt | Practices with CONSISTENT EFFECT between original study and reproduction attempt | | Practices with INCONSISTENT EFFECT between original study and reproduction attempt | |
|---|---|---|---|---|---|
| Characteristic, n (%)<sup>a</sup> | Original study | Original study | Reproduction attempt | Original study | Reproduction attempt |
| Primary electronic search | 93 (100) | 20 (67) | 25 (64) | 29 (78) | 51 (81) |
| Secondary electronic search | 0 (0) | 10 (33) | 14 (36) | 8 (22) | 12 (19) |
| Continent of origin | |||||
| North America | 44 (47) | 13 (43) | 12 (31) | 18 (49) | 20 (32) |
| Europe | 42 (45) | 15 (50) | 21 (54) | 15 (41) | 37 (59) |
| Australasia | 7 (8) | 1 (3) | 4 (10) | 2 (5) | 4 (6) |
| Other | 0 (0) | 1 (3) | 2 (5) | 2 (5) | 2 (3) |
| Year of publication<sup>e,f</sup> | |||||
| Before 1980 | 3 (3) | 0 (0) | 0 (0) | 1 (3) | 0 (0) |
| 1980–1989 | 7 (8) | 2 (7) | 1 (3) | 4 (11) | 3 (5) |
| 1990–1999 | 25 (27) | 14 (47) | 6 (15) | 13 (35) | 12 (19) |
| 2000–2009 | 20 (22) | 11 (37) | 21 (54) | 17 (46) | 21 (33) |
| 2010 or later | 38 (41) | 3 (10) | 11 (28) | 2 (5) | 27 (43) |
| Participating center type<sup>f</sup> | |||||
| University affiliated | 38 (41) | 17 (57) | 19 (49) | 29 (78) | 23 (37) |
| Mixed university affiliated and non-affiliated | 11 (12) | 5 (17) | 11 (28) | 1 (3) | 13 (21) |
| Unclear | 44 (47) | 8 (27) | 9 (23) | 7 (19) | 27 (43) |
| No. of centres, mean (95% CI)<sup>e,f</sup> | 7.2 (5.1–9.9) | 4.0 (2.2–7.0) | 9.1 (5.4–15.3) | 2.9 (1.8–4.6) | 16.2 (11.1–23.4) |
| 1<sup>e,f</sup> | 25 (26) | 11 (37) | 9 (23) | 17 (46) | 7 (11) |
| 2–4<sup>e,f</sup> | 13 (14) | 8 (27) | 4 (10) | 8 (22) | 5 (8) |
| 5–9<sup>e,f</sup> | 12 (13) | 3 (10) | 5 (13) | 6 (16) | 5 (8) |
| ≥ 10<sup>e,f</sup> | 44 (47) | 8 (27) | 21 (54) | 6 (16) | 46 (73) |
| No. of participants, mean (95% CI)<sup>e,f</sup> | 362.6 (266.3–493.6) | 155.9 (105.5–230.4) | 344.8 (223.9–531.2) | 146.7 (96.5–222.8) | 548.5 (408.8–735.7) |
| < 100<sup>d,e,f,g</sup> | 17 (18) | 9 (30) | 8 (21) | 16 (43) | 3 (5) |
| 100–499<sup>d,e,f,g</sup> | 40 (43) | 15 (50) | 12 (31) | 16 (43) | 29 (46) |
| 500–999<sup>d,e,f,g</sup> | 16 (17) | 6 (20) | 13 (33) | 1 (3) | 13 (21) |
| ≥ 1000<sup>d,e,f,g</sup> | 20 (22) | 0 (0) | 6 (15) | 4 (11) | 18 (29) |
| Target condition | |||||
| General critical illness | 10 (11) | 2 (7) | 2 (5) | 5 (14) | 12 (19) |
| Respiratory | 24 (26) | 13 (43) | 19 (49) | 13 (35) | 23 (37) |
| ARDS | 4 (4) | 5 (17) | 8 (21) | 5 (14) | 7 (11) |
| Mechanical ventilation (excluding ARDS) | 11 (12) | 3 (10) | 4 (10) | 4 (11) | 8 (13) |
| Respiratory failure (without ventilation) | 9 (10) | 5 (17) | 7 (18) | 4 (11) | 8 (13) |
| Sepsis | 13 (14) | 6 (20) | 7 (18) | 7 (19) | 14 (22) |
| Nosocomial complications | 11 (12) | 5 (17) | 3 (8) | 3 (8) | 3 (5) |
| Neurological | 12 (13) | 2 (7) | 1 (3) | 5 (14) | 8 (13) |
| Acute kidney injury | 6 (6) | 1 (3) | 5 (13) | 3 (8) | 3 (5) |
| General resuscitation | 9 (10) | 0 (0) | 1 (3) | 1 (3) | 0 (0) |
| Trauma | 3 (3) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Other | 5 (5) | 1 (3) | 1 (3) | 0 (0) | 0 (0) |
| Type of intervention | |||||
| Drug | 48 (52) | 14 (47) | 16 (41) | 18 (49) | 26 (41) |
| Device/procedure | 20 (22) | 13 (43) | 20 (51) | 14 (38) | 23 (37) |
| Protocol | 11 (12) | 2 (7) | 2 (5) | 4 (11) | 13 (21) |
| Other | 14 (15) | 1 (3) | 1 (3) | 1 (3) | 1 (1) |
| Intervention effect estimate<sup>f</sup> | |||||
| Lack of efficacy | 51 (55) | 16 (53) | 20 (51) | 10 (27) | 38 (60) |
| Efficacy | 31 (33) | 11 (37) | 15 (38) | 23 (62) | 16 (25) |
| Harm | 11 (12) | 3 (10) | 4 (10) | 4 (11) | 9 (14) |
| Funding | |||||
| Non-commercial | 46 (49) | 12 (40) | 23 (59) | 9 (24) | 29 (46) |
| Commercial | 17 (18) | 5 (17) | 4 (10) | 11 (30) | 15 (24) |
| Both commercial and non-commercial | 14 (15) | 2 (7) | 4 (10) | 7 (19) | 8 (13) |
| Not reported | 15 (16) | 10 (33) | 8 (21) | 10 (27) | 11 (17) |
| None | 1 (1) | 1 (3) | 0 (0) | 0 (0) | 0 (0) |
| Study stopped early | |||||
| Futility | 2 (2) | 0 (0) | 3 (8) | 0 (0) | 5 (8) |
| Benefit | 1 (1) | 0 (0) | 2 (5) | 4 (11) | 2 (3) |
| Harm | 2 (2) | 1 (3) | 1 (3) | 1 (3) | 4 (6) |
| Recruitment/lack of funding | 1 (1) | 1 (3) | 0 (0) | 0 (0) | 3 (5) |
ARDS acute respiratory distress syndrome, ICU intensive care unit, IQR interquartile range, RCT randomized controlled trial
<sup>a</sup> Continuous data are reported as geometric mean (95% confidence interval) and nominal data as number (%).
<sup>b</sup> The 275 included articles described 158 unique practices that were examined in 283 studies. A ‘study’ is a comparison of an intervention with control. The number of studies exceeds the number of included articles because 8 articles each reported 2 separate studies [34–41]. Twenty-one studies were excluded from the data in this table: 13 because the reproduction attempt was not yet completed, and the others because they belonged to the following 8 practices, for which representative studies did not consistently meet our criteria for results reproducibility: chlorhexidine skin antiseptic for central venous catheter insertion, naloxone for patients with sepsis, stress ulcer prophylaxis for prevention of gastrointestinal bleeding, systemic steroids in ARDS, pulmonary surfactant in ARDS, reduction of ventilator-associated pneumonia by various methods, trophic enteral nutrition, and daily interruption of sedatives. Data refer to 262 studies unless otherwise stated.
<sup>c</sup> Primary electronic search: New England Journal of Medicine, The Lancet, JAMA. Secondary electronic search: Annals of Internal Medicine, BMJ, American Journal of Respiratory and Critical Care Medicine, Chest, Critical Care Medicine, Intensive Care Medicine, Critical Care, clinicaltrials.gov, controlled-trials.com, bibliographies of included studies.
<sup>d</sup> P < 0.05 for comparison of reproduction attempts between practices with consistent and inconsistent effect estimates.
<sup>e</sup> P < 0.05 for comparison between original evaluation and reproduction attempt among practices demonstrating consistent effects.
<sup>f</sup> P < 0.05 for comparison between original evaluation and reproduction attempt among practices demonstrating inconsistent effects.
<sup>g</sup> P < 0.05 for comparison of original evaluations between practices with consistent and inconsistent effect estimates.
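Footnote a states that continuous characteristics are summarized as a geometric mean with a 95% confidence interval. The record does not say how the interval was computed, so the sketch below uses one common approach as an assumption: average the log-transformed values, build a normal-approximation interval on the log scale, and exponentiate. The function name and the sample data are hypothetical and are not drawn from the study.

```python
import math
from statistics import NormalDist, mean, stdev


def geometric_mean_ci(values, level=0.95):
    """Geometric mean with a normal-approximation CI computed on the log scale."""
    logs = [math.log(v) for v in values]          # requires strictly positive values
    m = mean(logs)
    se = stdev(logs) / math.sqrt(len(logs))       # standard error of the mean log
    z = NormalDist().inv_cdf(0.5 + level / 2)     # about 1.96 for a 95% interval
    return math.exp(m), math.exp(m - z * se), math.exp(m + z * se)


# Hypothetical numbers of centres per trial (illustration only).
centres = [1, 1, 2, 4, 8, 16, 32]
gm, lo, hi = geometric_mean_ci(centres)
print(f"{gm:.1f} ({lo:.1f}-{hi:.1f})")
```

An interval built this way is symmetric on the log scale, so on the original scale the upper limit sits further from the geometric mean than the lower limit does.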
Fig. 2 Classification of included articles and clinical practices according to the assessment of reproducibility. Detailed legend: <sup>a</sup> The numbers of clinical practices with consistent (n = 28) and inconsistent (n = 35) effect estimates between original and reproduction attempts do not sum to 66 because three practices could not be categorized, as their single reproduction attempt was in progress [38, 42, 43]. <sup>b</sup> Practices wherein all reproduction attempts demonstrated similar effect estimates (e.g., all lack of efficacy). <sup>c</sup> Practices wherein effect estimates from each reproduction attempt differed from the previous attempt. <sup>d</sup> Each box represents the way in which the reproduction attempt changed the results of the original study (e.g., efficacy/harm represents practices wherein the original study demonstrated efficacy but reproduction attempt demonstrated harm)
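The counts quoted in the footnotes to the characteristics table and in the Fig. 2 legend can be cross-checked with simple arithmetic. The sketch below is only a consistency check on numbers stated in this record; the variable names are illustrative, and the per-column totals are derived here by adding the primary and secondary search rows rather than being quoted from the source.

```python
# Articles and studies, as stated in the table footnote.
articles = 275
articles_reporting_two_studies = 8
studies = articles + articles_reporting_two_studies   # each such article adds one extra study
excluded_from_table = 21
studies_in_table = studies - excluded_from_table

# (primary search, secondary search) counts for each table column.
search_counts = {
    "no attempt / original": (93, 0),
    "consistent / original": (20, 10),
    "consistent / attempt": (25, 14),
    "inconsistent / original": (29, 8),
    "inconsistent / attempt": (51, 12),
}
column_n = {name: primary + secondary
            for name, (primary, secondary) in search_counts.items()}

# Practices with a reproduction attempt, as stated in the Fig. 2 legend.
practices_with_attempt = {"consistent": 28, "inconsistent": 35, "in progress": 3}

print(studies, studies_in_table)              # 283 262
print(column_n)
print(sum(column_n.values()))                 # 262
print(sum(practices_with_attempt.values()))   # 66
```

The derived column totals (93, 30, 39, 37, and 63) add up to the 262 studies that the footnote says the table describes.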
Fig. 3 Map of studies with consistent effect estimates between original study and reproduction attempt. Detailed legend: Hydroxyethyl starch was examined in both general critically ill and septic patients and therefore appears twice in the figure. AKI acute kidney injury, ARDS acute respiratory distress syndrome, COPD chronic obstructive pulmonary disease, CRRT continuous renal replacement therapy, CVC central venous catheter, IRRT intermittent renal replacement therapy, NIV non-invasive ventilation, PEEP positive end-expiratory pressure, RCT randomized clinical trial