Characteristics of Randomized Clinical Trials in Surgery From 2008 to 2020: A Systematic Review.

N Bryce Robinson, Stephen Fremes, Irbaz Hameed, Mohamed Rahouma, Viola Weidenmann, Michelle Demetres, Mahmoud Morsi, Giovanni Soletti, Antonino Di Franco, Marco A Zenati, Shahzad G Raja, David Moher, Faisal Bakaeen, Joanna Chikwe, Deepak L Bhatt, Paul Kurlansky, Leonard N Girardi, Mario Gaudino.

Abstract

Importance: Randomized clinical trials (RCTs) provide the highest level of evidence to evaluate 2 or more surgical interventions. Surgical RCTs, however, face unique challenges in design and implementation.

Objective: To evaluate the design, conduct, and reporting of contemporary surgical RCTs.

Evidence Review: A literature search performed in the 2 journals with the highest impact factor in general medicine as well as 6 key surgical specialties was conducted to identify RCTs published between 2008 and 2020. All RCTs describing a surgical intervention in both experimental and control arms were included. The quality of included data was assessed by establishing an a priori protocol containing all the details to extract. Trial characteristics, fragility index, risk of bias (Cochrane Risk of Bias 2 Tool), pragmatism (Pragmatic Explanatory Continuum Indicator Summary 2 [PRECIS-2]), and reporting bias were assessed.

Findings: A total of 388 trials were identified. Of them, 242 (62.4%) were registered; discrepancies with the published protocol were identified in 81 (33.5%). Most trials used superiority design (329 [84.8%]), and intention-to-treat as primary analysis (221 [56.9%]) and were designed to detect a large treatment effect (50.0%; interquartile range [IQR], 24.7%-63.3%). Only 123 trials (31.7%) used major clinical events as the primary outcome. Most trials (303 [78.1%]) did not control for surgeon experience; only 17 trials (4.4%) assessed the quality of the intervention. The median sample size was 122 patients (IQR, 70-245 patients). The median follow-up was 24 months (IQR, 12.0-32.0 months). Most trials (211 [54.4%]) had some concern of bias and 91 (23.5%) had high risk of bias. The mean (SD) PRECIS-2 score was 3.52 (0.65) and increased significantly over the study period. Most trials (212 [54.6%]) reported a neutral result; reporting bias was identified in 109 of 211 (51.7%). The median fragility index was 3.0 (IQR, 1.0-6.0). Multiplicity was detected in 175 trials (45.1%), and only 35 (20.0%) adjusted for multiple comparisons.

Conclusions and Relevance: In this systematic review, the size of contemporary surgical trials was small and the focus was on minor clinical events. Trial registration remained suboptimal and discrepancies with the published protocol and reporting bias were frequent. Few trials controlled for surgeon experience or assessed the quality of the intervention.

Year:  2021        PMID: 34190996      PMCID: PMC8246313          DOI: 10.1001/jamanetworkopen.2021.14494

Source DB:  PubMed          Journal:  JAMA Netw Open        ISSN: 2574-3805


Introduction

In surgery, the decision to perform one type of operation instead of another is based on evaluation of the patient by the operating surgeon, and subjectivity inherent to the choice of treatment is unlikely to be neutralized even using complex statistical adjustment.[1] For this reason, treatment allocation bias and unmeasured confounders may be associated with the treatment effect seen in comparative observational surgical studies more than in other medical fields. Only randomized clinical trials (RCTs) can reliably evaluate the true effects of different surgical interventions.[2,3] Previous analyses of surgical RCTs have been limited to 1 or few specialties, few trial characteristics, and short time spans.[4,5] We describe the characteristics of the design, conduct, and reporting of RCTs in 6 key surgical specialties published between 2008 and 2020 to evaluate surgical trials in the current era.

Methods

Search Strategy and Definitions

A literature search was performed by a medical librarian (M.D.) to identify all adult surgical RCTs published between January 1, 2008, and January 1, 2020, in the 2 journals with the highest impact factor in general medicine and in each of the following surgical specialties: cardiothoracic, general, neurosurgery, orthopedic, transplant, and vascular. The specialties were selected from those recognized by the American College of Surgeons[6] with the aim of providing an overview of surgical specialties. The full search strategy is available in eTable 1 in the Supplement. A surgical RCT was defined as a trial that involved a surgical intervention in both the experimental and control arms. A surgical intervention was defined as any procedure performed by a trained surgical specialist with the goal of correcting deformities or defects, repairing injuries, or for the cure of certain diseases, as specified by the National Center for Biotechnology Information.[7] Trials evaluating nonsurgical interventions or medical treatments and trials with at least 1 nonsurgical arm were excluded because they are generally designed by nonsurgical trialists. Endovascular and percutaneous procedures were also excluded. In case of multiple reports from the same trial, the report of the primary analysis was selected. This study was prospectively registered on the international prospective register of systematic reviews (CRD42020162797) and followed Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) reporting guideline.[8] The protocol was previously published.[9]

Extraction of Trial Data

Two reviewers (N.B.R. and I.H.) independently screened the citations retrieved from the literature search and extracted all data following previously described methods and using a predefined data collection form.[10,11,12] A third reviewer (M.G.) resolved any discrepancy. Additional details of trial data extraction are available in the eMethods in the Supplement. Primary trial outcomes were classified as major or minor clinical end points based on a published classification scheme (eTable 2 in the Supplement).[13] Trials were evaluated for pragmatism using the Pragmatic Explanatory Continuum Indicator Summary 2 (PRECIS-2) tool, which uses a 5-point ordinal scale (ranging from very pragmatic to very explanatory) across 9 domains of trial design, including eligibility, recruitment, setting, organization, intervention delivery, intervention adherence, follow-up, primary outcome, and analysis. Trial primary outcomes were assessed for multiplicity and adjustment. When applicable, sponsor details, classification of the outcome as favorable or unfavorable, and identification of a discrepancy in the registered and reported trial outcome were collected. Superiority-design trials with at least 1 significant dichotomous primary outcome were eligible for fragility index (FI) calculation. In trials showing no significant difference in the primary outcome, reporting bias was appraised. Risk of bias was assessed using the Cochrane Risk of Bias Tool Version 2 (RoB 2). Detailed methods are available in the eMethods in the Supplement.
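The FI calculation mentioned above can be illustrated with a minimal Python sketch. It follows one commonly described procedure: in the arm with fewer events, nonevents are converted to events one at a time until a two-sided Fisher exact test is no longer significant. The exact procedure used by the authors is described in their eMethods and may differ; the function names and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1 with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


def fragility_index(events_a, n_a, events_b, n_b, alpha=0.05):
    """Count nonevent-to-event switches needed to lose significance.

    Returns None if the starting result is not significant, or if
    significance is never lost within the modified arm.
    """
    def p_value(ea, eb):
        return fisher_exact_two_sided(ea, n_a - ea, eb, n_b - eb)

    if p_value(events_a, events_b) >= alpha:
        return None
    flip_a = events_a <= events_b  # modify the arm with fewer events
    room = (n_a - events_a) if flip_a else (n_b - events_b)
    for flips in range(1, room + 1):
        ea = events_a + flips if flip_a else events_a
        eb = events_b if flip_a else events_b + flips
        if p_value(ea, eb) >= alpha:
            return flips
    return None


# Hypothetical trial: 0 of 5 events in one arm vs 5 of 5 in the other.
print(fragility_index(0, 5, 5, 5))
```

A small FI means that reclassifying only a few patients would change the conclusion, which is why the comparison with the number of patients lost to follow-up is informative.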

Statistical Analysis

Categorical variables were reported as counts and percentages. Following assessment of normality by visual inspection and Shapiro-Wilk normality test, continuous variables were reported as mean (SD) when normally distributed or median (interquartile range [IQR]) when not. Based on normality of data, independent samples t test or the Mann-Whitney U test was used to compare continuous variables. Categorical variables were compared using χ2 or Fisher exact tests. The P for trend (linear regression) was used to evaluate variations during the study period. Two-sided significance testing was used and a P value <.05 was considered significant without adjustment for multiple testing. All analyses were performed using R version 3.6.2 (R Project for Statistical Computing) within RStudio.
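The reporting rule for continuous variables can be sketched as follows. This is an illustration in Python rather than the authors' R code; the normality decision is passed in as a flag because the Shapiro-Wilk test is not part of the Python standard library, and the inclusive quantile method and the sample values are assumptions made for the example.

```python
from statistics import mean, median, quantiles, stdev


def summarize(values, is_normal):
    """Format a continuous variable as mean (SD) or median (IQR).

    `is_normal` stands in for the outcome of the normality assessment
    (visual inspection plus a formal test), which is not reproduced here.
    """
    if is_normal:
        return f"mean (SD) {mean(values):.2f} ({stdev(values):.2f})"
    # Quartiles by linear interpolation over the observed data points.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return f"median (IQR) {median(values):.1f} ({q1:.1f}-{q3:.1f})"


# Hypothetical values, not data from the review.
print(summarize([1, 2, 3, 4, 5], is_normal=False))
print(summarize([1, 2, 3, 4, 5], is_normal=True))
```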

Results

From the 6699 articles screened, a total of 388 trials were included in the analysis (eFigure 1 in the Supplement). The number of published surgical trials did not significantly change during the study period (Figure 1A). Details of the 388 included clinical trials are available in eTable 3 in the Supplement.
Figure 1.

Randomized Clinical Trials During the Study Period

A-C, By year, outcome, and study design.

Trial Characteristics

Trial characteristics are summarized in Table 1. One hundred twenty-five trials (32.2%) investigated general surgery interventions; 116 (29.9%), orthopedic surgery; 93 (23.9%), cardiothoracic surgery; 21 (5.4%), neurosurgery; 19 (4.9%), vascular surgery; and 1 (0.3%), transplantation. A breakdown of general surgery trials by subspecialty is available in eTable 4 in the Supplement and results by specialty are provided in eTable 5 in the Supplement. Journal characteristics are available in the eResults in the Supplement.
Table 1.

Trial Characteristics

Characteristic | No. (%) (N = 388)
Journal of publication
  American Journal of Transplantation | 1 (0.3)
  Annals of Surgery | 103 (26.5)
  Arthroscopy | 47 (12.1)
  European Journal of Vascular and Endovascular Surgery | 10 (2.6)
  JAMA Surgery | 9 (2.3)
  Journal of Bone and Joint Surgery | 63 (16.3)
  Journal of Neurosurgery | 10 (2.6)
  Journal of Vascular Surgery | 9 (2.3)
  Neurosurgery | 10 (2.6)
  The Annals of Thoracic Surgery | 32 (8.2)
  The Journal of Thoracic and Cardiovascular Surgery | 44 (11.3)
  The Lancet | 29 (7.5)
  The New England Journal of Medicine | 21 (5.4)
  Journal of Heart and Lung Transplantation | 0
Location
  Africa | 4 (1.0)
  Asia | 86 (22.2)
  Australia | 12 (3.1)
  Europe | 195 (50.3)
  North America | 65 (16.8)
  South America | 6 (1.5)
  Multiple continents | 20 (5.2)
Specialty
  General surgery | 125 (32.2)
  Orthopedic surgery | 116 (29.9)
  Cardiothoracic surgery | 93 (23.9)
  Neurosurgery | 21 (5.4)
  Vascular surgery | 19 (4.9)
  Obstetrics and gynecology | 10 (2.6)
  Urology | 2 (0.5)
  Otolaryngology | 1 (0.3)
  Transplant | 1 (0.3)
Multicenter trial | 167 (43.0)

One hundred ninety-five (50.3%) trials originated from Europe; 86 (22.2%), from Asia; 65 (16.8%), from North America; 12 (3.1%), from Australia; 6 (1.5%), from South America; 4 (1.0%), from Africa; and 20 (5.2%), from multiple continents. One hundred sixty-seven trials (43.0%) were multicenter. The median projected sample size was 144 patients (IQR, 85-299 patients) and did not change significantly over the course of the study period (157 patients in 2008 [IQR, 116-361 patients] vs 150 patients in 2019 [IQR, 90-224 patients]; P = .64 for trend). The largest sample size was 15 935 patients, and 17 studies (4.4%) enrolled more than 1000 patients. Of note, the median enrolled sample size was smaller than the projected sample size (122 patients; IQR, 70-245 patients) and 63 trials (20% of trials that reported a sample size calculation) did not reach their projected sample size.

Trial Design

The details of trial design are summarized in Table 2. Two hundred forty-two (62.4%) trials were registered a priori; trial registration significantly increased over the study period (16.2% [6 of 37] registered in 2008 vs 89.7% [35 of 39] registered in 2019; P < .001) (eFigure 2, eTable 6 in the Supplement). In 81 (33.5%) of the registered trials, 1 or more discrepancies between the registered and published primary outcomes were found. Trials with discrepancies were significantly less likely to use intention-to-treat as the main analysis (59.7% vs 76.5%; P = .03) and had significantly lower mean (SD) PRECIS-2 scores (3.38 [0.63] vs 3.63 [0.66]; P = .01) (eTable 7 in the Supplement). There was no significant change in the rate of publication of trials with discrepancies during the study period (50.0% [3 of 6] trials with discrepancy in 2008 vs 20% [7 of 35] trials in 2019; P = .46 for trend) (Figure 1B).
Table 2.

Trial Design of 388 Included Trials

Variable | Frequency, No. (%)
Registration in trials registry | 242 (62.4)
Discrepancy between registered and primary outcome | 81 (33.5)
Superiority design | 329 (84.8)
Power, median (IQR), % | 80.0 (80.0-90.0)
Estimated relative treatment effect, median (IQR), % | 50.0 (24.7-63.3)
  Trials with a major clinical end point as primary outcome | 50.0 (24.5-67.9)
  Trials with a minor clinical end point as primary outcome | 46.6 (25.0-57.1)
Intention-to-treat as the primary analysis | 221 (56.9)
Noninferiority design | 55 (14.2)
Both noninferiority and superiority design | 3 (0.8)
Use of composite primary outcome | 82 (21.1)
Major clinical event as primary end point | 123 (31.7)
No. of patients screened, median (IQR) | 204 (105-465)
Sample size, median (IQR)
  Projected | 144 (86-299)
  Final | 122 (70-245)
Duration of follow-up, median (IQR), mo | 24.0 (12.0-32.0)
Type of primary outcome
  Time to event | 181 (46.7)
  Quality of life | 50 (12.8)
  Other scales | 157 (40.5)
Randomization
  Computer generated | 213 (54.9)
  Envelope | 90 (23.2)
  Random number table | 36 (9.3)
  Telephone call to randomization center | 10 (2.6)
  Drawing of lots | 2 (0.5)
  Date of birth | 2 (0.5)
  Flip of a coin | 1 (0.3)
  No details given | 34 (8.7)
Blinding
  None | 74 (19.1)
  Outcome assessor only | 61 (15.7)
  Patient and outcome assessor | 60 (15.4)
  Patient only | 32 (8.3)
  Patient, outcome assessor, data analyst | 18 (4.6)
  Outcome assessor and data analyst | 8 (2.1)
  Data analyst only | 6 (1.5)
  Patient, surgeon, outcome assessor, data analyst | 1 (0.3)
  No details given | 128 (33.0)
Control for surgeons’ experience
  None | 303 (78.1)
  Surgeons’ experience cut-off | 60 (15.5)
  Pretrial training | 25 (6.4)
Monitoring of the intervention
  None | 371 (95.6)
  Photo | 4 (1.0)
  Video | 9 (2.3)
  Site visit | 3 (0.8)
  Data monitoring of outcomes | 1 (0.3)
Details of the experimental procedure
  None | 41 (10.6)
  Limited | 226 (58.2)
  Detailed | 121 (31.2)
Risk of bias assessment
  Low risk | 86 (22.2)
  Some concerns | 211 (54.4)
  High risk | 91 (23.5)
Funding
  External | 288 (74.2)
  Industry | 96 (33.3)
  Industry sponsor involved in the analysis | 51 (53.1)
Conflicts of interest
  First author with study sponsor | 34 (35.4)
  Last author with study sponsor | 29 (30.2)
PRECIS-2 score, mean (SD) (a) | 3.52 (0.65)

Abbreviations: IQR, interquartile range; PRECIS-2, Pragmatic Explanatory Continuum Indicator Summary 2.

(a) PRECIS-2 uses a 5-point ordinal scale (ranging from very pragmatic to very explanatory) across 9 domains of trial design, including eligibility, recruitment, setting, organization, intervention delivery, intervention adherence, follow-up, primary outcome, and analysis.

A total of 329 trials (84.8%) used a superiority design. Trials that used a superiority design were significantly more likely to use intention-to-treat as the main analysis (58.0% [185 of 319] vs 42.3% [22 of 52]; P = .004) and had significantly higher PRECIS-2 score (mean [SD] score 3.55 [0.62] vs 3.24 [0.77]; P = .002). Trials using superiority design were more likely to have a high risk of bias compared with noninferiority trials (26.3% [84 of 319] vs 11.5% [6 of 52]; P = .02) (eTable 8 in the Supplement). There was no significant change in the use of the noninferiority design during the study period (16.2% [6 of 37] noninferiority design in 2008 vs 10.8% [4 of 37] noninferiority design in 2019; P = .98 for trend) (Figure 1C).

Trials had a median power of 80% (IQR, 80.0%-90.0%) and were designed to detect an estimated relative treatment effect of 50% (IQR, 24.7%-63.3%) without significant differences between trials that used major vs minor clinical events in the primary outcome (50.0%; IQR, 24.5%-67.9% vs 46.6%; IQR, 25.0%-57.1%; P = .12) (eFigure 3 in the Supplement). Slightly more than half of the trials (221 [56.9%]) used intention-to-treat as the primary analysis. The most common type of primary outcome was time to event (181 [46.7%]); 50 (12.8%) were quality-of-life scores, and 157 (40.5%) were based on other ordinal scales. Eighty-two trials (21.1%) used a primary composite outcome. Only 123 trials (31.7%) used major clinical events in the primary outcome.

Most trials (303 [78.1%]) did not control for surgeon experience; 60 (15.5%) used an experience cut-off, and 25 (6.4%) used pretrial training. Most trials (371 [95.6%]) did not assess the quality of the intervention. Of the 17 trials that assessed the quality of the intervention, 4 trials (23.5%) used intraoperative images, 9 trials (52.9%) used video recording, 3 trials (17.7%) used site visits, and 1 trial (5.9%) used data monitoring of outcomes. Details of the trial intervention were limited in most reported trials (226 [58.2%]), detailed in 121 trials (31.2%), and not specified in 41 (10.6%).

Most trials (288 [74.2%]) reported an external funding source. Ninety-six trials (33.3%) were industry funded, and of those 51 (53.1%) reported involvement of industry in the analysis. In industry-funded trials, 34 (35.4%) reported that first authors and 29 (30.2%) reported that last authors disclosed a potentially relevant relationship with industry.

Risk of bias was assessed in all trials. Most trials (211 [54.4%]) had some concern for bias, 86 (22.2%) had low risk of bias, and 91 (23.5%) had high risk of bias. Risk of bias by domain is reported in eTable 9 in the Supplement. The mean (SD) PRECIS-2 score was 3.52 (0.65) and increased significantly during the study period (mean [SD] PRECIS-2 was 3.34 [0.60] in 2008 vs 4.01 [0.62] in 2019; P < .001 for trend) (Figure 2A). The recruitment domain had the highest mean (SD) score (4.41 [0.78]), and flexibility in delivery domain had the lowest (2.90 [1.35]) (eTable 10 in the Supplement). The mean (SD) score varied widely among specialties and was highest in general surgery (3.70 [0.53]) and lowest in vascular surgery (3.21 [0.50]) (eTable 11 in the Supplement).
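To see why a trial designed around a relative treatment effect of about 50% at 80% power can be small, the sketch below applies the standard normal-approximation formula for comparing two proportions. It is an illustration only: the 20% control event rate is a hypothetical value that does not come from the review, and the included trials may have used other sample size methods.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(control_rate, relative_reduction, power=0.80, alpha=0.05):
    """Approximate patients per arm for a two-sided test of two proportions."""
    treated_rate = control_rate * (1 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = (control_rate * (1 - control_rate)
                + treated_rate * (1 - treated_rate))
    effect = control_rate - treated_rate
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical 20% control event rate (an assumption, not a reported figure).
large_effect = n_per_group(0.20, 0.50)   # 50% relative reduction
modest_effect = n_per_group(0.20, 0.25)  # 25% relative reduction
print(large_effect, modest_effect)
```

Under these assumptions, halving the targeted relative effect raises the required enrollment per arm more than fourfold, which is consistent with small trials being powered only for large effects.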
Figure 2.

Evaluation of Randomized Clinical Trials

A, Evaluation using the Pragmatic Explanatory Continuum Indicator Summary 2 (PRECIS-2) Tool. B, Evaluation using the Fragility Index. C, Evaluation with reporting bias.

Trial Implementation

Data on trial implementation are summarized in Table 3. Two hundred seventy-three (70.4%) trials reported details of the number of patients screened. The median percentage of screened patients who were enrolled was 76.8% (IQR, 45.1%-95.2%).
Table 3.

Trial Implementation and Reporting

Variable | No. (%)
Screened patients included, median (IQR), % | 76.8 (45.1-95.2)
Patients lost to follow up, median (IQR) | 4.0 (0.0-17.0)
Sample size lost to follow up, median (IQR), % | 3.3 (0.0-10.7)
Fragility index, median (IQR) | 3.0 (1.0-6.0)
Fragility index minus patients lost to follow up, median (IQR) | 0.0 (0.0-3.0)
Crossovers, median (IQR), No. | 1.0 (0.0-6.5)
Crossover, median (IQR), % | 0.5 (0.0-3.0)
Trials
  With a favorable outcome | 166 (42.7)
  With a neutral outcome | 212 (54.6)
Multiplicity | 175 (45.1)
  Multiple treatment groups | 13 (7.4)
  Multiple outcomes | 66 (37.7)
  Multiple analyses of the same outcome | 66 (37.7)
  Multiple outcomes + multiple analyses of the same outcome | 23 (13.2)
  Multiple treatment groups + multiple outcomes | 4 (2.3)
  Multiple treatment groups + multiple analyses of the same outcome | 3 (1.7)
Adjusted for multiple comparisons | 35 (20.0)
  Bonferroni correction | 25 (71.4)
  Tukey test | 7 (20.0)
  Dunn test | 1 (2.9)
  Gatekeeping or hierarchical testing | 1 (2.9)
  Modified α value | 1 (2.9)
Reporting bias present | 109/211 (51.7)
Extent of reporting bias
  None | 102 (48.3)
  In 1 section other than conclusion | 11 (5.2)
  In conclusion only | 34 (16.1)
  In 2 sections | 33 (15.6)
  In all sections | 31 (14.7)
Citations, median (IQR), No. | 36 (15-91)

Abbreviation: IQR, interquartile range.

Most trials (213 [54.9%]) used a computer-generated randomization sequence. One hundred eighty-six (47.9%) trials used blinding. When blinding was used, the outcome assessors alone were most often blinded (61 [32.8%]), followed by patients and outcome assessors (60 [32.3%]). Patients only were blinded in 32 trials (17.2%). In 74 (19.1%) trials blinding was explicitly not used, and in 128 (33.0%) trials no details on blinding were given. Crossover rates were generally low. The median percentage of crossover between treatment arms was 0.5% (IQR, 0.0%-3.0%). The median follow-up time was 24.0 months (IQR, 12.0-32.0 months). The median percentage of patients lost to follow-up was 3.7% (IQR, 0.0%-10.7%). Sixty-two trials (16.0%) were eligible for calculation of the FI. The median FI was 3.0 (IQR, 1.0-6.0) (eFigure 4 in the Supplement). The median FI minus loss to follow-up was 0.0 (IQR, 0.0-3.0) (eFigure 5 in the Supplement). In 20 trials (32.2%), the number of patients lost to follow-up was higher than the FI. The FI did not significantly change over time (median FI, 1.0 [IQR, 0.5-2.5] in 2008 vs 3.0 [IQR, 2.0-13.0] in 2019; P = .72 for trend) (Figure 2B).

Trial Reporting and Citations

Data on reporting and citations are summarized in Table 3. Ten trials (2.6%) did not report the results for the primary outcome: 7 trials (70.0%) were interim analyses and 3 trials (30.0%) did not explicitly define a primary outcome. One hundred sixty-six trials (42.7%) were reported as favorable and 212 (54.6%) as neutral. No difference in the rate of favorable results was found for trials at higher vs lower risk of bias (27.1% for favorable trials vs 20.8% for neutral trials; P = .15).

Multiplicity was detected in 175 trials (45.1%); there was a nonsignificant decrease during the study period (multiplicity identified in 51.8% [19 of 37] of trials in 2008 vs 30.8% [12 of 39] in 2019; P = .06 for trend). Of the trials in which multiplicity was detected, only 35 (20.0%) adjusted for multiple comparisons. Details of multiplicity and adjustment are given in Table 3.

Two hundred eleven trials were eligible for the analysis of reporting bias. Reporting bias was identified in approximately half of the studies (109 of 211 trials [51.7%]). General surgery trials had the highest rate of reporting bias (65.1% [41 of 63]) and cardiothoracic surgery trials had the lowest (24.3% [9 of 37]) (eTable 12 in the Supplement). No differences in trial design or in study sponsor were found between trials with and without reporting bias. Details of reporting bias appraisal are summarized in eTable 13 in the Supplement. The rate of reporting bias did not significantly change over the course of the study period (reporting bias identified in 52.3% [11 of 21] of eligible trials in 2008 vs 42.8% [9 of 21] in 2019; P = .27 for trend) (Figure 2C).

The median number of citations in included trials was 36 (IQR, 15-91). Trials published in The New England Journal of Medicine had the highest number of citations (median 268.0; IQR, 168.0-401.0) and trials published in the Journal of Neurosurgery had the lowest number of citations (median 14.0; IQR, 3.2-27.2) (eTable 14 in the Supplement). The median number of citations was highest in general surgery trials (50.0; IQR, 20.0-114.0) and lowest in neurosurgery trials (21.0; IQR, 5.0-36.0) (eTable 15 in the Supplement). The median number of citations was similar for trials with a favorable vs neutral outcome (36.0; IQR, 15.2-89.5 for favorable trials vs 32.5; IQR, 15.0-91.0 for neutral trials; P = .90).
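The concern behind the multiplicity findings can be made concrete with a short sketch. It shows how the chance of at least 1 false-positive result grows with the number of comparisons, assuming independent tests, and how a Bonferroni correction (the adjustment most often used among the trials that adjusted) tightens the per-comparison threshold. The function names and P values are hypothetical.

```python
def familywise_error(m, alpha=0.05):
    """Chance of at least 1 false positive across m independent tests."""
    return 1 - (1 - alpha) ** m


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each P value against the Bonferroni threshold alpha / m."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


# Hypothetical P values for 3 primary comparisons in a single trial.
p_values = [0.01, 0.04, 0.20]
print(round(familywise_error(len(p_values)), 3))
print(bonferroni_significant(p_values))
```

In this example the second comparison would count as significant at the unadjusted .05 level but not after correction.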

Discussion

In this analysis, we systematically evaluated the surgical trials published in the 2 highest impact factor journals in 6 key surgical specialties between 2008 and 2020. We included 388 trials. The average number of published trials per year did not increase during the study period (average 32 per year). General surgery was the most frequently investigated specialty, and about half of the trials were performed in Europe. Of the trials, only 62.4% were registered a priori, and in 33.5% of the preregistered trials 1 or more discrepancies between the registered and published primary outcome were found. This discrepancy rate is higher than what has been reported in previous analyses limited to medical specialties[14] but consistent with the rate described in an analysis of orthopedic trials.[15] Changes in trial primary outcomes have been shown to be associated with larger treatment effects[16] and, together with the lack of trial registration, may raise concerns of selective reporting. Most trials (84.8%) used a superiority design, and this proportion remained constant over the study period, in contrast with the increasing use of the noninferiority design that has been described in other fields.[17,18] While most of the trials (74.2%) reported an external funding source, industry sponsorship was limited to 33.3%, a rate much lower than the 53.2% reported, for example, among cardiovascular trials.[13] The trials were generally small, with a median sample size of 122 patients, and the median follow-up was 24 months. Most trials were designed to evaluate a fairly large treatment effect (50%); only one-third of them (31.7%) used major clinical events as the primary outcome. Most trials (54.6%) were neutral.
A prior analysis reported that most (57.0%) cardiovascular RCTs reported a positive outcome.[13] Other analyses have similarly found that most published trials reported positive findings.[19,20] Slightly more than one-half of the trials (56.9%) used intention-to-treat as the main analysis. Although intention-to-treat estimates may be biased toward the null in case of high rate of crossover or other protocol deviations, the comparability between groups (the most important strength of RCTs) is assured only when the randomized allocation is preserved. Most trials did not adopt any method to control for surgeons’ experience (78.1%) or to assess the quality of the intervention performed (95.6%) and provided only limited details of the trial intervention (58.2%). There have been multiple examples of important trials in surgery in which failure to assure adequate delivery and quality of the tested interventions has significantly affected the outcomes.[21,22] Blinding (in particular blinding of patients) is challenging in surgical trials. We found that approximately half (47.9%) of trials used some form of blinding, and that blinding of the outcome assessors and of the assessors and the patients was the most frequent strategy used. It is notable that 33.0% of trials did not report any information about blinding, and 19.1% explicitly did not use any form of blinding. Crossover and loss to follow-up rates were low, suggesting excellent trial implementation and high commitment of the investigators to the trial protocol. Trial pragmatism, as measured by the PRECIS-2 score, increased significantly during the study period, similar to what has been reported for cardiovascular trials.[23] Also, trials enrolled a large proportion of the screened patients (76.8%). 
As in previous analyses in other fields,[24,25] the robustness of the results of surgical trials was relatively low, with the change in condition of only 3 patients needed to switch the statistical significance for most of them. Twenty-three trials (37.1%) had an FI equal to or less than 1 and in 20 trials (32.2%) the number of patients lost to follow-up was higher than the FI, a notable finding because the event rate in patients lost to follow-up has been reported to be higher than in patients who remain in the study.[26] Multiplicity was found in 45.1% of the trials and of them a small number adjusted for multiple testing (20.0%). This low rate is consistent with the results of a recent analysis limited to trials in the cardiovascular field[27] and may raise questions because of the potential inflation of the type I error. In more than half of the 212 trials with neutral results, we found evidence of interpretation bias, a finding consistent with previous analyses.[28,29] While industry sponsorship has been associated with reporting bias in trials of cardiovascular interventions,[13] an association was not confirmed among surgical trials. In addition, the median number of citations of surgical trials during the study period was 36, but with important variations based on the journal of publication and the surgical specialty. Notably, the number of citations was similar for trials with positive or neutral results.

Limitations

This study has limitations. We could not capture unpublished trials and cannot exclude publication bias. We have included only a limited number of surgical specialties and a limited number of journals, and it is possible that trials in specialties or journals not included in our analysis have different characteristics. The FI, the PRECIS-2 score, and the methods used for calculation of reporting bias have been extensively used in trials analyses, but have been analyzed and found to have several important limitations.[30] Also, we have used only univariate analysis and have not adjusted for multiplicity, so we cannot exclude the risk of spurious associations and confounders.

Conclusions

In this systematic review, the number of surgical RCTs published from 2008 to 2020 was relatively small and the average annual number did not increase over time. Trial sizes were generally small and designed to detect a large effect in outcomes of secondary clinical importance. A substantial proportion of trials were not registered a priori; among those registered, discrepancies between the registered protocol and the published report were frequent. Blinding and control for surgeons’ experience were adopted in less than half of the trials and the results were relatively fragile. These data suggest that improvements in the design, implementation, and reporting of randomized clinical trials in surgery are warranted.
  28 in total

1.  Association of funding and conclusions in randomized drug trials: a reflection of treatment effect or adverse events?

Authors:  Bodil Als-Nielsen; Wendong Chen; Christian Gluud; Lise L Kjaergard
Journal:  JAMA       Date:  2003-08-20       Impact factor: 56.272

2.  Reporting and interpretation of randomized controlled trials with statistically nonsignificant results for primary outcomes.

Authors:  Isabelle Boutron; Susan Dutton; Philippe Ravaud; Douglas G Altman
Journal:  JAMA       Date:  2010-05-26       Impact factor: 56.272

Review 3.  Head-to-head randomized trials are mostly industry sponsored and almost always favor the industry sponsor.

Authors:  Maria Elena Flacco; Lamberto Manzoli; Stefania Boccia; Lorenzo Capasso; Katina Aleksovska; Annalisa Rosso; Giacomo Scaioli; Corrado De Vito; Roberta Siliquini; Paolo Villari; John P A Ioannidis
Journal:  J Clin Epidemiol       Date:  2015-02-07       Impact factor: 6.437

Review 4.  Noninferiority Designed Cardiovascular Trials in Highest-Impact Journals.

Authors:  Behnood Bikdeli; John W Welsh; Yasir Akram; Natdanai Punnanithinont; Ike Lee; Nihar R Desai; Sanjay Kaul; Gregg W Stone; Joseph S Ross; Harlan M Krumholz
Journal:  Circulation       Date:  2019-06-10       Impact factor: 29.690

5.  Design, Conduct, and Analysis of Surgical Randomized Controlled Trials: A Cross-sectional Survey.

Authors:  Jiajie Yu; Wenwen Chen; Shidong Chen; Pengli Jia; Guanyue Su; Youping Li; Xin Sun
Journal:  Ann Surg       Date:  2019-12       Impact factor: 12.969

Review 6.  Randomized Trials in Cardiac Surgery: JACC Review Topic of the Week.

Authors:  Mario Gaudino; A Pieter Kappetein; Antonino Di Franco; Emilia Bagiella; Deepak L Bhatt; Andreas Boening; Mary E Charlson; Marcus Flather; Annetine C Gelijns; Frederick Grover; Stuart J Head; Peter Jüni; Andre Lamy; Marissa Miller; Alan Moskowitz; Wilko Reents; A Laurie Shroyer; David P Taggart; Derrick Y Tam; Marco A Zenati; Stephen E Fremes
Journal:  J Am Coll Cardiol       Date:  2020-04-07       Impact factor: 24.094

Review 7.  Trends in the Explanatory or Pragmatic Nature of Cardiovascular Clinical Trials Over 2 Decades.

Authors:  Nariman Sepehrvand; Wendimagegn Alemayehu; Debraj Das; Arjun K Gupta; Pishoy Gouda; Anukul Ghimire; Amy X Du; Sanaz Hatami; Hazal E Babadagli; Sanam Verma; Zakariya Kashour; Justin A Ezekowitz
Journal:  JAMA Cardiol       Date:  2019-11-01       Impact factor: 14.676

8.  Systematic Evaluation of the Robustness of the Evidence Supporting Current Guidelines on Myocardial Revascularization Using the Fragility Index.

Authors:  Mario Gaudino; Irbaz Hameed; Giuseppe Biondi-Zoccai; Derrick Y Tam; Stephen Gerry; Mohamed Rahouma; Faiza M Khan; Dominick J Angiolillo; Umberto Benedetto; David P Taggart; Leonard N Girardi; Filippo Crea; Marc Ruel; Stephen E Fremes
Journal:  Circ Cardiovasc Qual Outcomes       Date:  2019-12-11

9.  The fragility of trial results involves more than statistical significance alone.

Authors:  Stephen D Walter; Lehana Thabane; Matthias Briel
Journal:  J Clin Epidemiol       Date:  2020-04-13       Impact factor: 7.407

10.  Prevalence of Multiplicity and Appropriate Adjustments Among Cardiovascular Randomized Clinical Trials Published in Major Medical Journals.

Authors:  Muhammad Shahzeb Khan; Maaz Shah Khan; Zunaira Navid Ansari; Tariq Jamal Siddiqi; Safi U Khan; Irbaz Bin Riaz; Zain Ul Abideen Asad; John Mandrola; James Wason; Haider J Warraich; Gregg W Stone; Deepak L Bhatt; Samir R Kapadia; Ankur Kalra
Journal:  JAMA Netw Open       Date:  2020-04-01
