
Development of an algorithm to provide awareness in choosing study designs for inclusion in systematic reviews of healthcare interventions: a method study.

Abstract

OBJECTIVES: To develop an algorithm that aims to provide guidance and awareness for choosing multiple study designs in systematic reviews of healthcare interventions.
DESIGN: Method study: (1) To summarise the literature base on the topic. (2) To apply the integration of various study types in systematic reviews. (3) To devise decision points and outline a pragmatic decision tree. (4) To check the plausibility of the algorithm by backtracking its pathways in four systematic reviews.
RESULTS: (1) The results of our systematic review of the published literature have already been published. (2) We recaptured the experience from our four previously conducted systematic reviews that required the integration of various study types. (3) We chose length of follow-up (long, short), frequency of events (rare, frequent) and types of outcome as decision points (death, disease, discomfort, disability, dissatisfaction) and aligned the study design labels according to the Cochrane Handbook. We also considered practical or ethical concerns, and the problem of unavailable high-quality evidence. While applying the algorithm, disease-specific circumstances and aims of interventions should be considered. (4) We confirmed the plausibility of the pathways of the algorithm.
CONCLUSIONS: We propose that the algorithm can help to bring key features of a systematic review with multiple study designs to the attention of anyone who is planning to conduct a systematic review. It aims to increase awareness, and we think that it may reduce the time burden on review authors and may contribute to the production of a higher quality review. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.


Keywords: EPIDEMIOLOGY; MEDICAL EDUCATION & TRAINING; STATISTICS & RESEARCH METHODS


Year: 2015 PMID： 26289450 PMCID： PMC4550722 DOI： 10.1136/bmjopen-2014-007540

Source DB: PubMed Journal: BMJ Open ISSN： 2044-6055 Impact factor: 2.692

We developed an algorithm to provide guidance for allocating various study designs to specific research questions. This can be viewed as a response to the lack of comprehensive guidance in major published methods documents. The terms used for defining the critical decision points of the algorithm, such as length of follow-up, frequency of events and types of outcome, need to be interpreted in the context of the disease. Disease-specific circumstances and aims of interventions always have to be taken into account when applying the algorithm. We could follow and confirm the appropriateness of the pathways when applying the algorithm to four selected systematic reviews. The plausibility check was based on systematic reviews that were already completed. This approach differs from everyday working conditions and may be biased by the subjective expectations of the authors. We encourage independent evaluation of the algorithm.

Introduction

When evaluating healthcare interventions, questions about different categories of intervention (such as medicinal versus non-medicinal therapy) and different categories of outcome (such as intended effects, adverse events or health-related quality of life) may sometimes be best answered by multiple study designs. Some designs have features that match the requirements of specific parts of a research question particularly well. Exclusively using data from randomised controlled trials (RCTs) to evaluate whether an intervention might work has a number of limitations.1 For example, RCTs may not be appropriate to estimate the incidence of rare (adverse) events. Other study designs, for example registry analyses, may incorporate data from many more participants. These analyses may therefore complement the information on rare but important events. We have gathered some examples of research questions that cannot, or can only with great difficulty, be investigated in RCTs (table 1). A practical concern may arise with low numbers of patients with a rare disease. For example, it might be difficult to conduct an RCT in patients with acquired severe aplastic anaemia. An ethical concern may arise with the treatment of severe diseases or with life-threatening treatments. For example, it might be inappropriate to conduct an RCT to evaluate a new experimental treatment in patients with pancreatic cancer.
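The point about rare events can be made concrete with a small calculation. The following is a minimal sketch, assuming independent participants with the same event risk: the probability of observing at least one event among n participants is 1 − (1 − p)^n. The function name and the sample sizes are hypothetical; the yearly risk of 17 per 100 000 is borrowed from the lung cancer example for lifelong non-smokers in table 7 and is treated here, as a simplification, as a per-participant probability over 1 year.

```python
def prob_at_least_one_event(risk: float, n: int) -> float:
    """Probability of observing at least one event among n independent
    participants who each carry the same event risk."""
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be a probability between 0 and 1")
    if n < 0:
        raise ValueError("n must not be negative")
    return 1.0 - (1.0 - risk) ** n


# Risk taken from table 7 (17 deaths per 100 000 per year); the sample
# sizes below are hypothetical and only contrast a trial-sized sample
# with a registry-sized sample.
yearly_risk = 17 / 100_000
for n in (500, 5_000, 500_000):
    print(n, round(prob_at_least_one_event(yearly_risk, n), 3))
```

Under these assumptions, a sample of a few hundred participants would rarely contain even a single event, whereas a registry-sized sample would almost certainly contain some.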

Table 1

Research questions that cannot or that can only with difficulty be investigated in RCTs

Topic	Reason
Research questions that in certain circumstances cannot be investigated in RCTs
Life-threatening intervention, for example, intervention with high early treatment-related mortality	Allocation to intervention group endangers life
Certain second-line interventions reserved for refractory patients who did not respond to first-line standard therapy	Ultima ratio and therefore no control group by definition. Example: Haematopoietic stem cell transplantation from unrelated versus related donors for patients with acquired severe aplastic anaemia
Pregnant women	Ethical concerns against inclusion in experiments
Infants	Ethical concerns against inclusion in experiments
Interventions that have been shown to produce a dramatic effect	The benefit of one particular intervention, such as insulin to treat diabetes mellitus, is so large that omitting it would amount to a neglect of healthcare unless the new treatment also has a dramatic effect
Lack of consent to participate	Deceiving persons is not legal
Studies that do not comply with the Declaration of Helsinki	The set of ethical principles regarding human experimentation is regarded as the cornerstone document of human research ethics
Research questions that can with difficulty be investigated in RCTs
Rare adverse events and other rare safety outcomes	Number of study participants is too low
Allocation of alternative interventions is dominated by patients’ preferences	Treatment group is chosen by a patient because of specific expectations of effectiveness, adverse events or health-related quality of life

RCTs, randomised controlled trials.

The methods of conducting systematic reviews of healthcare interventions are major components of ‘evidence-based’ medicine (EBM). In 2000, Sackett et al2 defined EBM as the integration of individual clinical expertise, patient values and expectations, and ‘best’ external research evidence. This definition may be visualised by three overlapping circles in a Venn diagram.3 The area of intersection, where all three resources meet, represents EBM. To distinguish more valid from less valid information, the ‘levels of evidence’ specify a hierarchical order for various research designs based on their internal validity. The highest level of valid data, that is, the ‘best’ evidence, however, is not always available. In table 2, we present the classification of some of the major study designs for intended effects of therapy. There appears to be some variation in the hierarchy among authors and institutions issuing ‘evidence-based’ guidelines or systematic reviews. All authors agree that RCTs have the highest ‘level of evidence’ with respect to minimising the risk of bias. The prospective non-randomised controlled clinical trial (CCT) has an experimental design, and its internal validity should be regarded as lower than that of a randomised trial but higher than that of an observational study. Prospective cohort studies have a potential for a lower risk of bias than retrospective cohort studies because they have a lower risk of recall bias and confounding.11 In cohort studies groups are defined by exposure, whereas in case–control studies groups are defined by outcome status. Both points are acknowledged in the ‘hierarchy of evidence’ by some but not all authors. Case series and case reports are descriptions of one or more individual cases. Some authors combine both designs in one category while others place case series at a higher level.

Table 2

Definition, classification and hierarchy of study designs for intended effects of therapy

Category	Contr	Prosp	Design	Description	I	II	III	IV	V	VI	VII	VIII
Experimental	Yes	Yes	Randomised controlled trial	Random, concealed allocation of participants to an intervention and a control group	1	1	1	1	1	1	1	1
Experimental	Yes	Yes	Prospective non-randomised controlled clinical trial	The method of allocation by the researcher falls short of genuine randomisation and fails to conceal the allocation sequence	2	2	NR	NR	2	1	NR	NR
Observational	Yes	Yes	Prospective cohort study	Comparison of outcome rates between treatment groups (intervention vs comparator)	3	2	2	2	3	2	2	2
Observational	Yes	Yes	Nested case–control study	Case–control study nested in a prospective cohort study, combines advantage of two study designs	NR	NR	NR	NR	NR	NR	NR	NR
Observational	Yes	No	Retrospective cohort study	Comparison of outcome rates between treatment groups (intervention vs comparator)	3	3	3	2	3	2	2	2
Observational	Yes	No	Case–control study	Comparison of treatment rates between outcome groups (cases vs controls)	4	4	3	3	3	2	2	2
Observational	No	No	Registry analysis	Description of the outcome in many patients collected with a wide range of settings and patients’ characteristics	NR	NR	NR	NR	NR	NR	NR	NR
Observational	No	No	Case series	Description of the outcome in a number of 1 or more cases of an intervention	5	5	4	4	3	3	3	3
Observational	No	No	Case report	Description of the outcome in a number of 1 or more cases of an intervention	5	5	5	NR	NR	4	3	3
Observational	No	No	Health outcomes research		NR	NR	NR	2	NR	NR	NR	NR
Observational	No	No	Ecological study		NR	NR	NR	2	NR	NR	NR	NR
Observational	No	No	Cross-sectional study	Intervention and outcome data collected at one particular time.	5	NR	NR	NR	NR	3	NR	NR
Observational	No	No	Within-group comparison	Also known as before-and-after study. Comparison of outcomes before and after an intervention.	5	NR	NR	NR	2	3	NR	NR
Others	NA	No	Expert opinion		5	NR	5	5	NR	4	4	4
Others	NA	No	Consensus recommendation		5	NR	5	NR	NR	NR	4	NR
Others	NA	No	Pathophysiological study		5	NR	5	5	NR	4	NR	NR
Others	NA	No	Animal study		5	NR	NR	NR	NR	NR	NR	NR

Experimental: In an experimental study, the researcher allocates participants to different treatment groups. Observational: In an observational study, the participants are not allocated by the researcher. Control: yes: with control group (comparative study), no: no control group (single-arm study).

Evidence level by some institutions or authors: I: Present review; II: Vandenbroucke 2008;4 III: Gemeinsamer Bundesausschuss (G-BA) Federal Joint Committee 2013;5 IV: Centre for Evidence-based Medicine (CEBM) 2009;6 V: Centre for Reviews and Dissemination (CRD);7 VI: Khan 2011;8 VII: National Institute for Health and Clinical Excellence (NICE): levels of evidence were specified in 20049 but not in the updated versions in 2008 and 2013; VIII: Scottish Intercollegiate Guidelines Network (SIGN) 2011.10

CCS, case–control study; Contr, control group; CR, case report; CS, case series; HOR, Health outcome research; NA, not applicable; NCC, nested case–control study; NRCCT, non-randomised controlled clinical trial; NR, not reported; PCS, prospective cohort study; Prosp, prospective design; RCS, retrospective cohort study; RCT, randomised controlled trial.

In figure 1 we show a study design classification tree that includes the main features distinguishing study designs from one another, in line with the reports of the Centre for Evidence-Based Medicine (CEBM), the National Institute for Health and Care Excellence (NICE) and the Centre for Reviews and Dissemination (CRD).7 12 13 Examples of distinguishing study characteristics are a concurrent versus a historical control group, and whether or not participants are allocated to the treatment groups by the investigator. The within-group comparison has also been referred to as a before-and-after study in a previous CRD report.

Figure 1

Decision algorithm to help define study designs.
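The distinguishing features named in the notes to table 2 (allocation by the researcher, randomisation, presence of a control group, groups defined by exposure or by outcome, prospective design) can be turned into a small labelling function. This is a minimal sketch and not a reproduction of figure 1: the class name, field names and the order of the checks are assumptions, and designs such as registry analyses or nested case–control studies are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StudyFeatures:
    """Hypothetical feature set, loosely following the columns of table 2."""
    allocated_by_researcher: bool        # experimental vs observational
    randomised: bool = False             # random, concealed allocation
    has_control_group: bool = False      # comparative vs single-arm study
    groups_defined_by: str = "exposure"  # "exposure" or "outcome"
    prospective: bool = False


def label_design(f: StudyFeatures) -> str:
    """Return a study design label for the given features (simplified)."""
    if f.allocated_by_researcher:
        if f.randomised:
            return "randomised controlled trial"
        return "non-randomised controlled clinical trial"
    if not f.has_control_group:
        return "case series or case report"
    if f.groups_defined_by == "outcome":
        return "case-control study"
    if f.prospective:
        return "prospective cohort study"
    return "retrospective cohort study"


print(label_design(StudyFeatures(allocated_by_researcher=False,
                                 has_control_group=True,
                                 prospective=True)))
```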

We observed a lack of clear and comprehensive guidance to optimise the choice of study designs in systematic reviews. A simple and clear algorithm could remind authors instantly about important issues that should be considered. Consequently, we aimed to develop such an algorithm. The algorithm could raise awareness of the prospects of multiple study designs, especially for less experienced authors. It could bring attention to issues that could be overlooked but are pertinent to healthcare. We expect that it may reduce the time burden for review authors and may facilitate the production of higher quality reviews.

Methods

The objectives of the study may be subdivided into four items, which are used to structure the text.

Objective 1

First, we systematically reviewed the literature about the advantages and disadvantages of integrating multiple study designs in systematic reviews. The criteria for considering publications, the search methods for identifying them, and the data collection and analysis are described in a precursor article that is closely connected to the present paper.14 These results form the information base for the topics of the current paper.

Objective 2

Second, we have conducted systematic reviews that required the integration of multiple study designs. We reflected on what could be learnt from this experience and what could be helpful information for future authors. The experience gathered in these reviews was woven into checking the plausibility of the algorithm.

Objective 3

Third, we wanted to know if major not-for-profit publishers of systematic reviews have included guidelines on the integration of multiple study designs in their manuals. We non-systematically searched the internet sites of 12 selected high-profile institutions and extracted the relevant statements. The access dates are provided with the references. We conceived how a decision tree could look that contains major characteristics of clinical studies, provides easy-to-follow pathways and ends in recommended study designs. The resulting algorithm should depict the necessary information in a clear and straightforward way and combine all parts on a single page. We observed that the length of follow-up and the frequency of events are essential components of every outcome assessment and we introduced those items as binary decision points. Furthermore, we were convinced on theoretical grounds that both components are critical in the process of choosing the appropriate study design. Fletcher15 classified outcomes into the following five simple categories: death, disease, discomfort, disability and dissatisfaction (table 3) and we judged that this classification fits well into the algorithm. We did not consider economic outcomes.

We selected study design labels from the Cochrane Handbook to maintain a common language and a reference for their descriptions.16 We combined similar design concepts of the Cochrane Handbook that appeared redundant for the purpose of the algorithm. For example, the experimental design comprises mainly two types of design, the randomised controlled trial (RCT) and the non-randomised controlled trial (CCT).17 We used the term ‘CCT’ to combine the quasi-randomised controlled trial, the non-randomised controlled trial and the controlled before-and-after study. We used the term ‘cohort studies’ to combine the prospective and the retrospective cohort study designs. We used the term ‘case series’ to combine the case series study design and the uncontrolled before-and-after comparison design. The term ‘registry analyses’ is not listed in the Cochrane Handbook, though it may be classified as a retrospective subtype of cohort studies. Registries generally collect data that are confined to a specific disease, a specific intervention or a specific outcome. One example is the registry-based studies of the European Society for Blood and Marrow Transplantation, which registers data from patients who have received bone marrow or haematopoietic stem cell transplants.18 Therefore, we wanted to accentuate this type of data selection and analysis and introduced the term ‘registry analyses’ into the algorithm. We did not consider the historically controlled design because changes over time are expected to cause serious systematic differences between treatment groups. We also did not consider the cross-sectional study design due to the lack of observation over time.
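The grouping of design labels described above can be written down as a simple lookup. This is a minimal sketch: the dictionary keys paraphrase the design names mentioned in the text and may not match the exact Cochrane Handbook wording, and the names `ALGORITHM_TERM`, `NOT_CONSIDERED` and `to_algorithm_term` are hypothetical.

```python
from typing import Optional

# Design label as named in the text -> term used in the algorithm.
ALGORITHM_TERM = {
    "randomised controlled trial": "RCT",
    "quasi-randomised controlled trial": "CCT",
    "non-randomised controlled trial": "CCT",
    "controlled before-and-after study": "CCT",
    "prospective cohort study": "cohort studies",
    "retrospective cohort study": "cohort studies",
    "case series": "case series",
    "uncontrolled before-and-after comparison": "case series",
    # Not listed in the Cochrane Handbook; introduced by the authors.
    "registry analysis": "registry analyses",
}

# Designs the authors explicitly left out of the algorithm.
NOT_CONSIDERED = frozenset({
    "historically controlled study",
    "cross-sectional study",
})


def to_algorithm_term(design: str) -> Optional[str]:
    """Map a design label to its algorithm term.

    Returns None for designs that were not considered and raises
    KeyError for labels this sketch does not know.
    """
    key = design.strip().lower()
    if key in NOT_CONSIDERED:
        return None
    return ALGORITHM_TERM[key]


print(to_algorithm_term("Controlled before-and-after study"))
```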

Table 3

Outcomes of disease

Outcome	Description	Type of outcome
Death	A bad outcome if untimely	Investigator reported
Disease	A set of symptoms, physical signs, and laboratory abnormalities	Investigator reported
Discomfort	Symptoms such as pain, nausea, dyspnoea, itching, and tinnitus	Participant-reported disease-related symptoms
Disability	Impaired ability to go about usual activities at home, work, or recreation	Participant-reported disease-related impaired function
Dissatisfaction	Emotional reaction to disease and its care, such as sadness or anger	Participant-reported disease-related bother about impaired function and generic health-related quality of life


Objective 4

Fourth, we checked the plausibility of the algorithm by backtracking its pathways with four previously conducted systematic reviews. The first author of the present paper was also the leading author of these systematic reviews and could apply his knowledge about all the details of the history of these systematic reviews. The second author checked whether the results of the plausibility check appeared sensible. The structured research question of each systematic review and the simulated pathways for choosing multiple study designs were described in detail. For this purpose, we structured the information such as the inclusion criteria by using the PICOTS-SD typology: participants (P), interventions (I), comparators (C) and outcomes (O), timing (T), setting (S), and study design (SD).7 19–21
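A PICOTS-SD frame can be captured as a small record, which makes it easy to check that no element of the research question was left empty. This is a minimal sketch; the class name, field names and the helper `missing_elements` are hypothetical, and the example values are copied from Example 4 in table 4.

```python
from dataclasses import dataclass, fields
from typing import List, Tuple


@dataclass(frozen=True)
class PicotsSdFrame:
    """One structured research question in the PICOTS-SD typology."""
    participants: str
    interventions: str
    comparators: str
    outcomes: Tuple[str, ...]
    timing: str
    setting: str
    study_designs: Tuple[str, ...]


def missing_elements(frame: PicotsSdFrame) -> List[str]:
    """Return the names of PICOTS-SD elements that were left empty."""
    return [f.name for f in fields(frame) if not getattr(frame, f.name)]


# Example 4 (negative pressure wound therapy), values taken from table 4.
example_4 = PicotsSdFrame(
    participants="Patients with chronic wounds",
    interventions="NPWT",
    comparators="Conventional gauze dressing",
    outcomes=("Complete wound closure",
              "Severe adverse events such as bleeding"),
    timing="6-month FU",
    setting="General hospitals",
    study_designs=("RCT",),
)

print(missing_elements(example_4))
```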

Results

We identified ‘49 studies that compared the effect sizes between randomised and non-randomised controlled trials, which were statistically different in 35%’.14 We concluded: ‘The risk of presenting uncertain results without knowing for sure the direction and magnitude of the effect holds true for both non-randomised and randomised controlled trials. The integration of multiple study designs in systematic reviews is required if patients should be informed on the many facets of patient relevant issues of healthcare interventions’.14 The following four systematic reviews were used. The PICOTS-SD frames of these papers are shown in table 4.

Table 4

PICOTS-SD frames of the included systematic reviews

Examples	References	P	I	C	O	T	S	SD
Example 1: Non-rhabdomyosarcoma soft tissue sarcomas (NRSTS)	22 23	Patients with high-risk NRSTS	High-dose chemotherapy (HDCT) followed by autologous haematopoietic stem cell transplantation (autoHSCT)	Standard-dose chemotherapy (SDCT)	Overall survival (OS), treatment-related mortality (TRM)	5-year follow-up (FU)	Units in university hospitals specialised in transplantation	Randomised controlled trials (RCT)
Example 2: Acquired severe aplastic anaemia (SAA)	24	Patients with acquired SAA	Allogeneic haematopoietic stem cell transplantation (alloHSCT) from HLA-matched related donors	Immunosuppressive therapy (IST) using ciclosporin A (CSA) and antithymocyte globulin (ATG)	OS, treatment-related mortality (TRM)	5-year FU	Units in university hospitals specialised in transplantation	RCT, comparative clinical studies
Example 3: Localised prostate cancer	25 26	Patients with localised prostate cancer	Permanent interstitial low-dose rate brachytherapy (LDR-BT)	Radical prostatectomy (RP), external beam radiotherapy (EBRT), or no primary therapy (NPT)	OS, function and bother as well as health-related quality of life	5-year FU	Surgery and radiotherapy units in general hospitals	RCT
Example 4: Negative pressure wound therapy (NPWT)	27	Patients with chronic wounds	NPWT	conventional gauze dressing	Complete wound closure, severe adverse events such as bleeding	6-month FU	General hospitals	RCT

Example 1: Non-rhabdomyosarcoma soft tissue sarcomas.22 This systematic review evaluated autologous haematopoietic stem cell transplantation (autoHSCT) following high-dose chemotherapy (HDCT) versus standard-dose chemotherapy (SDCT) in patients with non-rhabdomyosarcoma soft tissue sarcomas (NRSTS).

Example 2: Acquired severe aplastic anaemia.24 This systematic review evaluated allogeneic haematopoietic stem cell transplantation (alloHSCT) from matched sibling donors (MSD) versus immunosuppressive therapy (IST) in patients with acquired severe aplastic anaemia (SAA).

Example 3: Localised prostate cancer.25 This systematic review evaluated permanent interstitial low-dose-rate brachytherapy (LDR-BT) versus radical prostatectomy (RP) versus external beam radiotherapy (EBRT) and no primary therapy (NPT) in patients with localised prostate cancer.

Example 4: Negative pressure wound therapy.27 This systematic review evaluated negative pressure wound therapy (NPWT) versus standard wound dressing in patients with wounds.

We looked at a sample of 12 high-profile not-for-profit publishers of systematic reviews detailed in table 5. Of these, 10 have published guidance about their methodological procedure for preparing systematic reviews.7 12 13 16 28–33 A range of other books or guidance documents on systematic reviews exist.8 34–38 We present the major statements of their methods guidance with respect to choosing the appropriate research design in online supplementary table S1. We did not identify an algorithm or comprehensive guidance focusing on finding the appropriate research design in any of these methods guidance documents. We propose an algorithm, which is shown in figure 2. The algorithm has four decision points.

Table 5

Methods guidance by publishers of systematic reviews

ID (Reference)	Country	Name of Institution	Title of handbook
AHRQ 200928	USA	Agency for Healthcare Research and Quality	Methods (section of a completed report)
ASERNIP-S 200929	Australia	Australian Safety and Efficacy Register of New Interventional Procedures—Surgical; Royal Australasian College of Surgeons	General Guidelines for Assessing, Approving and Introducing New Surgical Procedures into a Hospital or Health Service
CADTH 200330	Canada	Canadian Agency for Drugs and Technologies in Health	Guidelines for Authors of CADTH Health Technology Assessment Reports
CEBM 201412	UK	Centre for Evidence-Based Medicine	Study designs
Cochrane 201116	UK, World	The Cochrane Collaboration	Cochrane Handbook for Systematic Reviews of Interventions; V.5.1.0; (updated March 2011)
CRD 20117	UK	Centre for Reviews and Dissemination	Systematic Reviews. CRD's guidance for undertaking reviews in healthcare
HAS 200731	France	French National Authority for Health	General method for assessing health technologies
IQWiG 201332	Germany	Institute for Quality and Efficiency in Health Care	Methoden V.4.0
MRC 200833	UK	Medical Research Council	Developing and evaluating complex interventions: new guidance
MSAC	Australia	Medical Services Advisory Committee	No handbook found
NICE 201313	UK	National Institute for Health and Clinical Excellence	Guide to the Methods of Technology Appraisal
OHTAC	Canada	Ontario Health Technology Advisory Committee	No handbook found

ID, Identifier.

Figure 2

Algorithm Explanation:

RCT: prospective randomised controlled trial, allocating experimental units via random assignment to a treatment or control condition with concealment of the allocation procedure

CCT: prospective non-randomised controlled clinical trial, allocating experimental units via non-random assignment to a treatment or control condition

Discomfort etc: discomfort, disability and dissatisfaction

Nested case–control study: case–control study nested within a prospective, observational cohort study

Cohort study: prospective, observational cohort study

Case series: retrospective, observational tabulation of data from participants without consecutive enrolment

Case report: retrospective, observational report of data from one to three participants

Registry analysis: retrospective, observational analysis of participants data from various sources transferred to and collected in a database

CCT, controlled clinical trial; nested case contr, nested case–control study within a prospective cohort study; RCT, randomised controlled trial.

First, it should be decided whether the outcomes are typically evaluated at an early or late time point after start of treatment. The cut-off between a short and long follow-up depends on the type of disease, intervention and outcome and we list some examples that range from 30 days to 5 years (table 6).

Table 6

Short versus long follow-up depending on the type of disease and intervention/exposure

Diagnosis and intervention	Short (early) follow-up	Long (late) follow-up	Reference
Shortening the duration and reducing the severity of the common cold treated by vitamin C	<3 days	≥1 week	39
Early vs late radiation morbidity	<30 days	≥30 days	40
Cancer-specific survival after recurrence in patients with recurrent renal cell carcinoma	<5 years	≥5 years	41
Early or late diagnosis on patient survival in gastric cancer	<3 years	≥3 years	42
Early or late mortality after isolated first coronary bypass surgery in multivessel disease in patients with diabetes	<30 days	≥30 days	43
Early or late major adverse cardiac events after percutaneous coronary intervention in cardiac patients	<6 months	≥6 months	44
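Because the cut-off between short and long follow-up depends on the clinical context, this first decision point can be sketched as a comparison against a context-specific threshold. The function name and dictionary are hypothetical; the cut-offs are taken from table 6, and converting years to days with 365 days per year is an assumption made only for this illustration.

```python
def classify_follow_up(observed_days: float, cutoff_days: float) -> str:
    """Return "short" below the context-specific cut-off, otherwise "long"."""
    if observed_days < 0 or cutoff_days <= 0:
        raise ValueError("observed_days must be >= 0 and cutoff_days > 0")
    return "short" if observed_days < cutoff_days else "long"


# Cut-offs from table 6, expressed in days (years converted with 365
# days per year, which is an assumption of this sketch).
CUTOFF_DAYS = {
    "radiation morbidity": 30,
    "mortality after coronary bypass surgery": 30,
    "survival in gastric cancer": 3 * 365,
    "cancer-specific survival in recurrent renal cell carcinoma": 5 * 365,
}

# The same 1-year follow-up is "long" in one context and "short" in another.
print(classify_follow_up(365, CUTOFF_DAYS["radiation morbidity"]))
print(classify_follow_up(365, CUTOFF_DAYS["survival in gastric cancer"]))
```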

Second, it should be decided whether the events of interest are regarded as rare or frequent. The cut-off between a rare and frequent event depends on the type of disease, intervention and outcome and we list some examples in table 7. A rare disease may be defined according to the Office of Rare Diseases Research (ORDR): “In the United States, a rare disease is generally considered to be a disease that affects fewer than 200 000 people”.49

Table 7

Rare versus frequent events depending on the type of disease and intervention/exposure

Diagnosis and intervention	Rare event	Frequent event	Reference
Dying from lung cancer: lifelong non-smokers vs current smokers (≥25 cigarettes per day)	17/100 000/year	415/100 000/year	45
Prevalence of chronic obstructive pulmonary disease in 2118 lifelong never-smokers without vs with exposure to environmental tobacco smoke (ever at home and at both previous and current work)	4.2%	14.7%	46
Maternal mortality goal of the Healthy People objective in 2000	<3.3/100 000 live births	≥3.3/100 000 live births	47
Relative risk of lung cancer in current smokers vs non-smoker	Not applicable	24.0	45
Relative risk of lung cancer in non-smoker exposed vs not exposed to environmental tobacco smoke	Not applicable	2.4	48
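The second decision point can be sketched in the same way as a comparison against a context-specific threshold. The function and constant names are hypothetical; the two thresholds used are the ORDR definition quoted in the text (fewer than 200 000 people affected in the United States) and the maternal mortality goal from table 7 (3.3 per 100 000 live births). Other contexts would need their own cut-offs.

```python
US_RARE_DISEASE_THRESHOLD = 200_000  # people affected, per the ORDR quotation


def is_rare_disease_us(people_affected: int) -> bool:
    """Apply the ORDR definition quoted in the text."""
    if people_affected < 0:
        raise ValueError("people_affected must not be negative")
    return people_affected < US_RARE_DISEASE_THRESHOLD


def classify_event_frequency(rate_per_100k: float,
                             cutoff_per_100k: float) -> str:
    """Return "rare" below the context-specific cut-off, else "frequent"."""
    if rate_per_100k < 0 or cutoff_per_100k <= 0:
        raise ValueError("rate must be >= 0 and cut-off must be > 0")
    return "rare" if rate_per_100k < cutoff_per_100k else "frequent"


# Maternal mortality goal from table 7: 3.3 per 100 000 live births.
MATERNAL_MORTALITY_CUTOFF = 3.3
print(classify_event_frequency(2.0, MATERNAL_MORTALITY_CUTOFF))
print(is_rare_disease_us(150_000))
```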

Third, the type of outcome of interest needs to be considered. We list some examples of outcomes, which may depend on length of follow-up and frequency of events (table 8). Additional examples for outcomes of respiratory tract disease are shown in table 9.

Table 8

Examples for outcomes depending on lengths of follow-up and frequency of events

Outcome	Short follow-up, rare events	Short follow-up, frequent events	Long follow-up, rare events	Long follow-up, frequent events
Death	Population: Hypertrophic cardiomyopathy; Intervention: Physical activity; Aim: Incidence of sudden death after recreational sports; Practical: Ethical concerns against experimental allocation	Population: Malignancies; Intervention: Unrelated and mismatched haematopoietic stem cell transplantation; Aim: Incidence of death due to graft rejection; Practical: Ethical concerns may arise about randomising patients	Population: Acquired severe aplastic anaemia; Intervention: Long-term immunosuppressive therapy; Aim: Incidence of secondary malignancies; Practical: Ethical concerns against randomised allocation	Population: High-risk neuroblastoma; Intervention: Retinoic acid as postconsolidation therapy after high-dose chemotherapy followed by autologous haematopoietic stem cell transplantation; Aim: Event-free survival in comparable groups; Practical: Randomised allocation is required to provide comparable groups
Disease	Population: Neuroblastoma; Intervention: Watchful waiting; Aim: Incidence of spontaneous regression; Practical: Ethical concerns against experimental allocation	Population: Malignancies; Intervention: Unrelated and mismatched haematopoietic stem cell transplantation; Aim: Incidence of acute graft vs host disease; Practical: Randomised allocation is required to provide comparable groups	Population: Malignancies; Intervention: Radiotherapy; Aim: Incidence of myelodysplastic syndrome; Practical: Ethical concerns against randomised allocation	Population: Diabetic foot ulcer; Intervention: Negative pressure wound therapy; Aim: Incidence of wound closure; Practical: Randomised allocation is required to provide comparable groups
Discomfort, disability, dissatisfaction	Population: Acquired severe aplastic anaemia; Intervention: Matched sibling donor haematopoietic stem cell transplantation; Aim: Incidence of graft failure as an early and rather unexpected serious complication only observed in the transplant group, not in the non-transplant group; Practical: Ethical concerns against experimental allocation	Population: Low-risk localised prostate cancer; Intervention: Radical prostatectomy; Aim: Incidence of erectile dysfunction and urinary incontinence; Practical: Randomised allocation provides comparable groups	Population: Advanced prostate cancer; Intervention: Hormonal androgen deprivation therapy; Aim: Incidence of emotional distortion; Practical: Randomised allocation provides comparable groups	Population: Diabetic foot ulcer; Intervention: Negative pressure wound therapy; Aim: Incidence of amputation; Practical: Randomised allocation is required to provide comparable groups

Table 9

Examples for outcomes of respiratory tract diseases depending on lengths of follow-up and frequency of events

Outcome	Short follow-up		Long follow-up
Outcome	Rare events	Frequent events	Rare events	Frequent events
Death	Viral infection may aggravate to acute myocarditis and subsequent heart failure	Lack of nourishment and lack of medicines may cause general susceptibility to life-threatening disease	Infection may affect organs such as the heart. Fibrous replacement of organ tissue may result in late arrhythmia and subsequent cardiac arrest	Lung cancer is the most common cause of cancer-related death in men and women
Disease	Bacterial infection may aggravate to community-acquired pneumonia	Infection may develop to acute sinusitis that may worsen and prolong the condition	Streptococcal pharyngitis may be complicated by chronic rheumatic heart disease	Long-term exposure to tobacco smoke is the most frequent cause of lung cancer
Discomfort, disability, dissatisfaction	Common cold may confine patients to bed and cause sick leave	Acute sinusitis may cause drowsiness, headache and sleepiness	Streptococcal pharyngitis may be complicated by rheumatic fever, which may have an involuntary movement disorder called Sydenham’s chorea as a main symptom	In non-smokers, secondhand smoke may be the cause of about 20% of cases of chronic obstructive pulmonary disease, which is characterised by shortness of breath and cough

Fourth, the recommended study design for inclusion in a systematic review is assigned. We used the following study design labels: RCT, CCT, nested case–control study, cohort study, case–control study, case series, case report and registry analysis. These and alternative study design labels are described in table 2.

Practical or ethical concerns may emerge as reasons to override the earlier decisions or to switch to a more appropriate study design. We remind the reader at the bottom of figure 2 to reconsider the chosen path. This step is included to allow flexible handling of the algorithm. Examples are shown in table 8. Ethical concerns are primarily associated with objections against experimental allocation.

We conducted a plausibility check of the algorithm’s pathways by backtracking four of our own previously published systematic reviews. We marked the pathways by boxes that are filled in with a coloured background or that have a coloured frame line.

Example 1: Non-rhabdomyosarcoma soft tissue sarcomas. In the first version of this systematic review, we assumed a long follow-up and frequent events regarding death by disease or complication (NRSTS in figure 3).23 Thus, an RCT would be the best choice for all outcomes, but we found neither an RCT nor any other comparative study. Instead, we identified only single-arm studies. We estimated overall survival and described adverse events but were unable to draw conclusions on the benefit of the intervention of interest. In a planned update 2 years later, we were able to identify a single RCT.22 Using different study types allowed us to report estimates of overall survival in the first version, when RCTs were lacking. The advantage also carried over to the update, because the reporting of adverse events considerably exceeded the scope of a single RCT.

Figure 3

Algorithm with pathways backtracked in four completed systematic reviews.

Example 2: Acquired severe aplastic anaemia. In this systematic review, we assumed a long follow-up and frequent events regarding death by disease or complication (SAA in figure 3).24 Thus, an RCT would be the best choice for all outcomes. As we did not identify any RCT, we included other comparative study designs. We tried to lower the risk of bias imposed by the non-randomised design. Eligible studies needed to be prospective non-randomised controlled trials, to meet the requirements of ‘Mendelian randomisation’, and to be confined to human leucocyte antigen (HLA)-matched sibling donors. Using different study types enabled the evaluation despite the lack of RCTs. It should be noted that these study data were generated more than 10 years ago and may not be applicable to the current medical care status.

Example 3: Localised prostate cancer. In this systematic review, we assumed a long follow-up and rare events concerning death (PCa: OS in figure 3).25 Localised prostate cancer, as opposed to advanced prostate cancer, is believed to be associated with a very good overall survival regardless of the intervention. While invasive interventions may not improve overall survival, they may impair the health-related quality of life considerably. For example, radical prostatectomy may promise to completely remove the malignant tumour but may also disrupt erectile function in a considerable proportion of patients. According to the algorithm, a cohort study would be appropriate to estimate long-term overall survival. Concerning patient-reported outcomes such as discomfort, disability and dissatisfaction, we assumed a short follow-up and frequent events (PCa: HRQL in figure 3). According to the algorithm, an RCT would be the best choice to evaluate the patient-reported outcomes. As only a single RCT was available, data on discomfort, disability and dissatisfaction were sparse, and the inclusion of CCTs expanded the results considerably.26 Evaluating overall survival needed a different approach than evaluating patient-reported outcomes. The obvious reluctance of patients and physicians alike to participate in RCTs supported the consideration of other study designs, though restrictive inclusion criteria were necessary to ensure a minimal level of quality.

Example 4: Negative pressure wound therapy. In this systematic review, we assumed a short follow-up and frequent events concerning complete wound closure (NPWT: Closure in figure 3), and we assumed a long follow-up and rare events concerning the outcome of severe adverse events (NPWT: AE in figure 3).27 According to the algorithm, an RCT or a CCT would have been appropriate to evaluate the successful treatment of the disease. In 2009, the US Food and Drug Administration issued a report on six deaths and 77 other complications that were reported within a 2-year period in connection with NPWT.50 Many of the deaths occurred in outpatient care or care homes and were caused by bleeding complications. The consideration of registry analyses and case reports was very helpful in drawing attention to possibly dangerous and life-threatening events.
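The pathways backtracked in the four examples above can be summarised as a small lookup table. The following Python sketch is an illustration only and does not reproduce the full algorithm of figure 2: it encodes just the follow-up, event frequency and study designs stated in the example text. The names `Pathway`, `BACKTRACKED` and `designs_for` are hypothetical, and for the NPWT adverse-event pathway the listed designs are those the text reports as helpful, which is assumed here to match the algorithm's recommendation.

```python
from typing import List, NamedTuple, Tuple


class Pathway(NamedTuple):
    follow_up: str            # "short" or "long"
    events: str               # "rare" or "frequent"
    outcome: str              # outcome as described in the example text
    designs: Tuple[str, ...]  # study designs named for this pathway


# Keys are the pathway labels used in figure 3 (hypothetical data structure).
BACKTRACKED = {
    "NRSTS": Pathway("long", "frequent", "death", ("RCT",)),
    "SAA": Pathway("long", "frequent", "death", ("RCT",)),
    "PCa: OS": Pathway("long", "rare", "death", ("cohort study",)),
    "PCa: HRQL": Pathway(
        "short", "frequent",
        "discomfort, disability, dissatisfaction", ("RCT",),
    ),
    "NPWT: Closure": Pathway("short", "frequent", "disease", ("RCT", "CCT")),
    # Assumption: the text only says these designs were "very helpful".
    "NPWT: AE": Pathway(
        "long", "rare", "severe adverse events",
        ("registry analysis", "case report"),
    ),
}


def designs_for(follow_up: str, events: str) -> List[str]:
    """Collect the designs named in the examples for one combination."""
    found: List[str] = []
    for pathway in BACKTRACKED.values():
        if (pathway.follow_up, pathway.events) == (follow_up, events):
            for design in pathway.designs:
                if design not in found:
                    found.append(design)
    return found


if __name__ == "__main__":
    for key in [("short", "frequent"), ("long", "frequent"), ("long", "rare")]:
        print(key, "->", designs_for(*key))
```

The lookup deliberately returns an empty list for short follow-up with rare events, because none of the four reviews followed that pathway. It also does not model the override step: in Examples 1 and 2 the recommended RCTs were unavailable, and other designs were included for practical reasons.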

Discussion

In a separate paper, we concluded that “the integration of multiple study designs in systematic reviews is required and that the risk of presenting uncertain results without knowing for sure the direction and magnitude of the effect holds true for both nonrandomized and randomized controlled trials”.14 Our results appear to be in agreement with those of other authors. A Cochrane review compared RCTs versus historically or concurrently controlled non-randomised trials in 2007.51 The authors concluded that, on average, the non-randomised controlled trials tend to result in larger estimates of effect than RCTs. The latest update of this Cochrane review in 2011 amended the research question and compared RCTs versus concurrently controlled non-randomised trials and excluded historically controlled ones.52 The authors concluded that “the results of randomized and non-randomized controlled trials sometimes differed”, namely, “in some instances nonrandomized studies yielded larger estimates of effect and in other instances randomized trials yielded larger estimates of effect”. It appears that the earlier firm statement of larger estimates in non-randomised controlled trials gave way to a less definite message.

We reported our experience gained during the conduct of four of our systematic reviews. These systematic reviews required the inclusion of multiple study designs to accomplish the planned evaluation of healthcare interventions. They cover a few selected topics. Experiences or conclusions derived from these papers are far from representative and should not be generalised. They were conducted by the person who is also first author of the present work. Consequently, the inferences based on the four papers and reported in the present paper may be subjective. Thus, further research by other authors and concerning other topics is recommended.
We did not identify an existing algorithm or comprehensive guidance focused on finding the appropriate research design. Therefore, we developed an algorithm, which aims to guide systematic reviewers in the reasonable inclusion of various study designs in their planned systematic reviews of healthcare interventions. The proposed algorithm cannot be applied without considering disease-specific circumstances and aims of interventions. The terms used for defining the critical decision points of the algorithm, such as short versus long follow-up, need to be interpreted in the context of the disease and may be unclear and not useful if used as general terms. We provided examples to show that short versus long follow-up can vary considerably depending on the disease. Similarly, we provided examples to show that the definition of rare versus frequent events has to be interpreted in the context of the type of intervention or exposure as well as the type of event. The outcomes include hard and soft outcomes as well as physician-reported and patient-reported outcomes, and they are likely to suit the purpose of the algorithm. Nevertheless, the types of outcomes have been arbitrarily chosen from a handbook of clinical epidemiology. An alternative selection of other outcomes might also be acceptable for the understanding and usefulness of the algorithm.

Study design labels and descriptions of study design features are not used consistently. Hartling 2010, while testing a tool on study design classification, reported that reviewers disagreed considerably on fundamental design characteristics, such as whether the design was experimental or observational and whether there was a control group involved or not.53 Lopez-Alcade 2011 reported that “Cochrane review groups did not use common study design labels and did not explicitly describe all study design features suggested by the Cochrane Handbook”.54

We are confident that the algorithm is a tool that helps to bring seminal features of a systematic review to the attention of anyone who is planning to conduct a systematic review. It has the potential to help reviewers reorientate themselves towards the major features of the studies eligible for an evaluation of a healthcare intervention. The benefit is the provision of awareness, and it is certainly not a new regulation. The intention is to provide a guide and a decision support tool that might be used fully or partially by persons who are going to prepare a systematic review. While preparing a systematic review, it may be important at an early time point to identify the relevant and the most appropriate study designs necessary to find answers for a variety of prespecified outcomes. It might also be of interest for persons who evaluate the quality of systematic reviews and might want to check whether all study designs that should have been considered were in fact considered. Therefore, we think that it may reduce the time burden on review authors and contribute to the production of a higher quality review.

The plausibility check is a crude approach to judging whether the theory-based algorithm could be sensibly applied in practice. Thus, further research could facilitate a more objective and statistically measurable testing of the usefulness of the algorithm. It is recommended to let various systematic reviewers backtrack the algorithm independently and to apply the algorithm to more systematic reviews with different topics.
We could follow and confirm the appropriateness of the pathways for all described examples of systematic reviews. In one example, the algorithm selected RCTs as the best choice, but due to the lack of RCTs it was decided to rely on non-randomised studies. This example showed that it is important to build flexibility into the algorithm, which enables the systematic reviewer to extend or change the inclusion criteria to other study designs in case unexpected conditions emerge or practical concerns exist.

While conducting systematic reviews, we observed the critical importance of case reports and registry analysis for the evaluation of serious adverse events. We mentioned above the Food and Drug Administration (FDA) report on adverse events after NPWT, which covered 2 years from 2007 to 2009. In a recent update, the FDA included two additional years, covering a total of 4 years from 2007 to 2011.55 The adverse events increased to 12 deaths and 174 injuries. With respect to the added cases, bleeding was again the major cause of the most serious adverse events, and the majority of adverse events occurred at home or in long-term care facilities. All RCTs on NPWT were conducted in hospitals and were unable to provide this information. In France between 1998 and 2004, 21 drugs were reported to have been withdrawn from the market for safety reasons. The withdrawal of 19 of 21 drugs was based on case reports and only 1 case was supported by an RCT.56 In the European Community between 2002 and 2011, case reports contributed to the withdrawal of 18 of 19 drugs.57
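To put the withdrawal figures quoted above on a common scale, the counts can be converted into percentages. The short Python sketch below is only an arithmetic illustration using the numbers stated in the text; the names `WITHDRAWALS` and `share_percent` are hypothetical.

```python
# Counts quoted in the text: (withdrawals involving case reports, all withdrawals).
WITHDRAWALS = {
    "France, 1998 to 2004": (19, 21),
    "European Community, 2002 to 2011": (18, 19),
}


def share_percent(part: int, total: int) -> float:
    """Return part/total as a percentage rounded to one decimal place."""
    return round(100 * part / total, 1)


if __name__ == "__main__":
    for setting, (with_case_reports, total) in WITHDRAWALS.items():
        pct = share_percent(with_case_reports, total)
        print(f"{setting}: {with_case_reports}/{total} = {pct}%")
```

Under these counts, case reports were involved in roughly nine out of ten withdrawals in both settings.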

Conclusions

We are confident that the algorithm can help to bring seminal features of a systematic review to the attention of anyone who is planning to conduct a systematic review. It aims to provide awareness, and we think that it may reduce the time burden on review authors and may contribute to the production of a higher quality review.

27 in total

1. GRADE guidelines: 2. Framing the question and deciding on important outcomes.

Authors: Gordon H Guyatt; Andrew D Oxman; Regina Kunz; David Atkins; Jan Brozek; Gunn Vist; Philip Alderson; Paul Glasziou; Yngve Falck-Ytter; Holger J Schünemann
Journal: J Clin Epidemiol Date: 2010-12-30 Impact factor: 6.437

2. Systematic reviews incorporating evidence from nonrandomized study designs: reasons for caution when estimating health effects.

Authors: B C Reeves; J van Binsbergen; C van Weel
Journal: Eur J Clin Nutr Date: 2005-08 Impact factor: 4.016

3. The nature of the scientific evidence leading to drug withdrawals for pharmacovigilance reasons in France.

Authors: Pascale Olivier; Jean-Louis Montastruc
Journal: Pharmacoepidemiol Drug Saf Date: 2006-11 Impact factor: 2.890

4. The accumulated evidence on lung cancer and environmental tobacco smoke.

Authors: A K Hackshaw; M R Law; N J Wald
Journal: BMJ Date: 1997-10-18

5. Comparison of ICD-9-based, retrospective, and prospective assessments of perioperative complications: assessment of accuracy in reporting.

Authors: Peter G Campbell; Jennifer Malone; Sanjay Yadla; Rohan Chitale; Rani Nasser; Mitchell G Maltenfort; Alex Vaccaro; John K Ratliff
Journal: J Neurosurg Spine Date: 2010-12-10

Review 6. Autologous hematopoietic stem cell transplantation following high-dose chemotherapy for non-rhabdomyosarcoma soft tissue sarcomas.

Authors: Frank Peinemann; Lesley A Smith; Mandy Kromp; Carmen Bartel; Nicolaus Kröger; Michael Kulig
Journal: Cochrane Database Syst Rev Date: 2011-02-16

Review 7. Randomisation to protect against selection bias in healthcare trials.

Authors: R Kunz; G Vist; A D Oxman
Journal: Cochrane Database Syst Rev Date: 2007-04-18

8. Poor health-related quality of life is a predictor of early, but not late, cardiac events after percutaneous coronary intervention.

Authors: Susanne S Pedersen; Elisabeth J Martens; Johan Denollet; Ad Appels
Journal: Psychosomatics Date: 2007 Jul-Aug Impact factor: 2.386

9. Mortality from cancer in relation to smoking: 50 years observations on British doctors.

Authors: R Doll; R Peto; J Boreham; I Sutherland
Journal: Br J Cancer Date: 2005-02-14 Impact factor: 7.640

10. Observational research, randomised trials, and two views of medical science.

Authors: Jan P Vandenbroucke
Journal: PLoS Med Date: 2008-03-11 Impact factor: 11.069
