Monika Mueller, Maddalena D'Addario, Matthias Egger, Myriam Cevallos, Olaf Dekkers, Catrina Mugglin, Pippa Scott.
Abstract
BACKGROUND: Systematic reviews and meta-analyses of observational studies are frequently performed, but no widely accepted guidance is available at present. We performed a systematic scoping review of published methodological recommendations on how to systematically review and meta-analyse observational studies.
Keywords: Meta-analysis; Methods; Observational studies; Recommendation; Systematic review
Year: 2018 PMID: 29783954 PMCID: PMC5963098 DOI: 10.1186/s12874-018-0495-9
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Methodological key items for systematic reviews or meta-analyses of observational studies
| Key item | Description |
|---|---|
| Protocol development | A protocol is written in the preliminary stages of a research synthesis to describe the rationale of the review and the methods that will be used to minimise the potential for bias in the review process. |
| Research question | The research question is defined a priori as for any research project. It sets the scope of the review and guides subsequent decisions about the methods to be used to answer the particular research question. |
| Search strategy | The search strategy refers to the methods employed to conduct a methodologically sound search and might include information such as the data sources used and the specific terms applied in distinct databases. The search locates articles relevant to answering the a priori defined research question. |
| Study eligibility | Study eligibility is assessed according to pre-defined eligibility criteria related to the study itself such as the study design, the study population, as well as the exposure/s and outcome/s of interest but also to aspects such as the language and year of publication. Usually two reviewers assess each study for eligibility to reduce errors and bias. Specifying which features should be covered by eligibility criteria might be more difficult for observational studies than for RCTs as observational studies cover a broader range of research questions and have more variability in design. |
| Data extraction | Data extraction is performed according to a standardised form that has been finalised during pilot extraction. Usually two reviewers extract data for each study to reduce errors and bias. Data extraction for observational studies might be less straightforward than for RCTs because multiple analyses may have been conducted (e.g. unadjusted and adjusted, with analyses adjusting for different sets of potential confounders), and each observational study design will have different data to be extracted. |
| Considering different study designs | Before starting evidence synthesis of observational studies, reviewers must consider which study designs to include as well as how to approach the analysis of data from different study designs. This adds complexity over evidence synthesis that considers RCTs only. |
| Risk of bias assessment | A risk of bias assessment of all primary studies included is important for all systematic reviews and meta-analyses. This assessment allows a better understanding of how bias may have affected the results of studies, and subsequently the results of evidence synthesis. Risk of bias assessment of observational studies may be more complex than in RCTs since observational studies are likely to be prone to bias and confounding. |
| Publication bias | Publication bias needs to be considered in any systematic review and meta-analysis as only about half of all completed research projects reach publication in an indexed journal. |
| Heterogeneity | The term heterogeneity refers to differences in results between studies. When heterogeneity exists between studies, it is important to understand why as this will alter the conclusions drawn by the review. An exploration of heterogeneity might be particularly important when reviewing observational studies given the range of study designs and the potential risk of bias in observational studies. |
| Statistical analysis | Statistical analysis in the context of meta-analysis refers to the mathematical analysis and combination of the results of the included primary studies. Important aspects to consider are whether to pool data to provide a single effect in light of observed heterogeneity and how to choose the statistical model to be employed (e.g. fixed or random-effects model). These decisions might need more careful consideration when reviewing observational studies given the range of study designs and the potential risk of bias in observational studies. |
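The heterogeneity and statistical analysis items above can be illustrated with a minimal sketch. It assumes each study contributes an effect estimate on an additive scale (for example a log odds ratio) and its variance, pools them with inverse-variance weights, and reports Cochran's Q, I² and a DerSimonian-Laird between-study variance. The DerSimonian-Laird estimator is one common choice picked here for illustration, not something the source prescribes, and the function name `pool_effects` and the input numbers are hypothetical.

```python
import math


def pool_effects(effects, variances):
    """Inverse-variance pooling under fixed-effect and random-effects models.

    effects   : per-study estimates on an additive scale (e.g. log odds ratios)
    variances : per-study sampling variances
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)

    # Fixed-effect estimate: assumes one common true effect.
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # DerSimonian-Laird moment estimate of between-study variance (assumed choice).
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I^2: share of total variation attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects estimate: weights also account for between-study variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    random = sum(w * y for w, y in zip(re_weights, effects)) / sum_re

    return {
        "fixed": fixed,
        "se_fixed": math.sqrt(1.0 / sum_w),
        "random": random,
        "se_random": math.sqrt(1.0 / sum_re),
        "q": q,
        "df": df,
        "tau2": tau2,
        "i2": i2,
    }


if __name__ == "__main__":
    # Hypothetical log odds ratios and variances from three studies.
    result = pool_effects([0.0, 0.5, 1.0], [0.01, 0.04, 0.25])
    for name, value in result.items():
        print(f"{name}: {value:.4f}")
```

With heterogeneous inputs such as these, the random-effects estimate moves away from the fixed-effect estimate toward the smaller studies, which is one reason the choice of model deserves care when study designs and risk of bias vary.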
Fig. 1 Flow chart of article selection
Study characteristics and recommendations by key item
| Authors, year | Study designs targeted<sup>a</sup> | Protocol | Research question | Search | Eligibility | Extraction | Study designs | Risk of bias | Publication bias | Heterogeneity | Statistics |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Abrams, 1995 | Cohort and case-control | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ |
| Armstrong, 2007 | Observational | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | ✓ | ✗ |
| Ashford, 2009 | Observational | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ | ✓ |
| Austin, 1997 | Cohort and case-control | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ |
| Balshem, 2011 | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ |
| Blair, 1995 | Observational | ✓ | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Brockwell, 2001 | Not specified | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| Chaiyakunapruk, 2014 | Cohort and case-control | ✗ | ✓ | ✓ | ✓ | ✗ | ✗ | ✓ | ✗ | ✓ | ✓ |
| Chambers, 2009 | Case series | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ |
| Colditz, 1995 | Cohort and case-control | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ |
| Davey Smith, 1997 | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ |
| Davey Smith, 1998 | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✓ |
| Doria, 2005 | Observational and RCT | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ |
| Dwyer, 2001 | Cohort and case-control | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ |
| Egger, 1997a | Observational | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Egger, 1997b | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ |
| Fraser, 2006 | Observational | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Friedenreich, 1994 | Case-control | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| Furlan, 2006 | Observational | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Golder, 2008 | Observational | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Greenland, 1994 | Cohort and case-control | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ | ✓ | ✓ |
| Guyatt, 2011a | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ |
| Guyatt, 2011b | Observational | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Guyatt, 2011c | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ |
| Guyatt, 2011d | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ |
| Hartemink, 2006 | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| Haynes, 2005 | Cohort | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Herbison, 2006 | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ |
| Hernandez, 2016 | Cohort, case-control and cross-sectional | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ |
| Higgins, 2013 | Observational | ✗ | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ | ✗ | ✗ |
| Horton, 2010 | Cross-sectional | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Ioannidis, 2011 | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ |
| Khoshdel, 2006 | Observational and RCT | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ | ✓ | ✗ | ✓ | ✓ |
| Kuper, 2006 | Observational | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Lau, 1997 | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ | ✓ | ✓ |
| Lemeshow, 2005 | Observational | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Loke, 2011 | Observational and RCT | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ |
| Loke, 2007 | Cohort, case-control and cross-sectional | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ | ✓ | ✗ | ✗ | ✓ |
| MacDonald-Jankowski, 2001 | Observational and RCT | ✗ | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ |
| Mahid, 2006 | Observational and RCT | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ | ✓ |
| Manchikanti, 2009 | Observational | ✗ | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | ✗ | ✓ | ✓ |
| Martin, 2000 | Cohort and case-control | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✓ |
| McCarron, 2010 | Observational and RCT | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| Moola, 2015 | Observational and RCT | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ |
| Moreno, 1996 | Case-control | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✓ |
| Munn, 2015 | Observational | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ |
| Naumann, 2007 | Not specified | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Normand, 1999 | Observational and RCT | ✗ | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ | ✓ |
| Norris, 2013 | Observational and RCT | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ |
| O’Connor, 2014 | Observational | ✓ | ✓ | ✗ | ✓ | ✗ | ✓ | ✗ | ✗ | ✓ | ✓ |
| Pladevall-Vila, 1996 | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ |
| Prevost, 2000 | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| Price, 2004 | Observational | ✗ | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ | ✓ | ✓ | ✗ |
| Raman, 2012 | Observational and RCT | ✗ | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ | ✓ | ✗ |
| Ravani, 2015 | Observational | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | ✗ | ✓ | ✓ |
| Robertson, 2014 | Observational and RCT | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ |
| Rosenthal, 2001 | Observational and RCT | ✗ | ✓ | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | ✗ |
| Sagoo, 2009 | Observational | ✓ | ✗ | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ | ✓ | ✓ |
| Salanti, 2005 | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✓ | ✓ |
| Salanti, 2009 | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| Sanderson, 2007 | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ |
| Schünemann, 2013 | Observational | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Shamliyan, 2012 | Observational and RCT | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ |
| Shuster, 2007 | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| Simunovic, 2009 | Observational and case-control | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ |
| Smith, 1995 | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| Souverein, 2012 | Observational and RCT | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✓ |
| Stansfield, 2016 | Not specified | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Sterne, 2016 | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ |
| Stroup, 2000 | Observational | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | ✓ | ✗ | ✓ | ✓ |
| Sutton, 2002a | Observational and RCT | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✓ | ✗ | ✗ |
| Sutton, 2002b | Observational and RCT | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ |
| Tak, 2010 | Cohort and case-control | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✓ | ✓ |
| Takkouche, 1999 | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ |
| Thomas, 2004 | Observational | ✗ | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ |
| Thompson, 2002 | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| Thompson, 2011 | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| Thompson, 2014 | Observational | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Thornton, 2000 | Observational | ✗ | ✗ | ✓ | ✓ | ✗ | ✓ | ✗ | ✓ | ✓ | ✓ |
| Tufanaru, 2015 | Observational | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | ✓ |
| Tweedie, 1995 | Cohort and case-control | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| Valentine, 2013 | Observational and RCT | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | ✗ |
| Verde, 2015 | Observational and RCT | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✓ |
| Weeks, 2007 | Cohort and case-control | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| Wells, 2013 | Observational and RCT | ✓ | ✗ | ✗ | ✓ | ✗ | ✓ | ✓ | ✗ | ✗ | ✗ |
| West, 2002 | Cohort and case-control | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ |
| Wille-Jorgensen, 2008 | Observational and RCT | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Winegardner, 2007 | Observational, Cohort and case-control | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✓ | ✗ | ✓ | ✓ |
| Wong, 2008 | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ |
| Wong, 1996 | Cohort | ✗ | ✗ | ✗ | ✓ | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Zeegers, 2000 | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ |
| Zingg, 2016 | Observational and cohort | ✗ | ✗ | ✗ | ✓ | ✗ | ✓ | ✓ | ✗ | ✗ | ✓ |
| Zwahlen, 2008 | Observational | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | ✓ |
<sup>a</sup> Describes the study designs toward which articles target their recommendations. Articles that target “observational” or “non-randomised” studies are categorised under observational. “Not specified” refers to articles that do not name study designs, but provide recommendations applicable to observational studies
Summary of recommendations from 93 publications by key item
| Key item | No. of articles providing recommendation | Topic of recommendation | N articles addressing area (%)<sup>a</sup> |
|---|---|---|---|
| Protocol development | 16 | Need for protocol to be written in advance | 12 (75%) |
| | | Items to be included in protocol | 11 (69%) |
| Research question | 20 | Scope of research question | 20 (100%) |
| Search strategy | 33 | General methods for conducting searches in context of observational studies | 22 (67%) |
| | | Specific challenges in searching for observational studies | 12 (36%) |
| Study eligibility | 22 | Specifying eligibility criteria | 22 (100%) |
| | | Assessment of eligibility | 6 (27%) |
| Data extraction | 9 | Methods for data extraction | 9 (100%) |
| Dealing with different study designs | 25 | Inclusion of different study designs in a single review | 10 (40%) |
| | | Combining results from different study designs in a single meta-analysis | 15 (60%) |
| Risk of bias assessment | 39 | Methods to assess the risk of bias in individual studies | 39 (100%) |
| Publication bias | 20 | Inclusion of unpublished studies | 5 (25%) |
| | | Methods to assess publication bias | 7 (35%) |
| Heterogeneity | 39 | Measurement of heterogeneity | 39 (100%) |
| | | Exploring potential causes of heterogeneity | 16 (41%) |
| Statistical analysis | 52 | Deciding to combine results in a single effect estimate | 20 (38%) |
| | | Choosing fixed or random effects meta-analysis | 16 (31%) |
<sup>a</sup> Percentages do not add up to 100% because articles can contribute recommendations to more than one topic and only the most frequent areas of recommendation for each key item are listed
Key items with conflicting recommendations
| | Recommendations in favour | Recommendations against |
|---|---|---|
| Research question | | |
| Should we formulate the research question as precisely as possible? | “A focused research question is essential. The question that is asked needs to be as scientifically precise as possible.” | “Thus, questions that the review addresses may be broad or narrow in scope, with each one of them associated with their own advantages and disadvantages. While the questions may be refined based on the data which is available during the review, it is essential to guard against bias and modifying questions, as post-hoc questions are more susceptible to the bias than those asked a priori and data-driven questions can generate false conclusions based on spurious results.” |
| Study eligibility | | |
| Should we include studies of all languages? | “Ideally, it would be best to include all studies regardless of language of publication. However, for practical reasons, many meta-analyses limit themselves to English language studies. Although this decreases the number of studies, it does not appear to bias the effect size”. | “Including papers in all languages may actually introduce more bias into a meta-analysis”. |
| Should we avoid multiple inclusions? | “authors must be careful to avoid the multiple inclusion of studies from which more than one publication has arisen”. | “It is important that each entry in a meta-analysis represents an independent sample of data. Thus, for example, multiple reports of the same study need to be merged to obtain a single “best” answer for that study” |
| Considering different study designs | | |
| Should we include both RCT and NRS in a single systematic review? | “When both randomized and non-randomized evidence are available, we favor a strategy of including NRS and RCTs in the same systematic review but synthesizing their results separately.” | “Ideally, researchers should consider including only controlled trials with proper randomisation of patients that report on all initially included patients according to the intention to treat principle and with an objective, preferably blinded, outcome assessment.” |
| Should we pool results of different study designs in a single meta-analysis if results are similar over the different study designs? | “If the meta-analysis includes some randomized experiments and some observational studies, we can meta-analyze them separately and combine their results if they are quite similar, borrowing strength for the randomized experiments from the similar results of the nonrandomized studies.” | “Generally, separate meta-analyses should be performed on studies of different designs. It is not usually advisable to combine studies of different designs in a single meta-analysis unless it can be determined that study design has little or no influence on study characteristics such as quality of data, specificity of exposure, and uniformity of diagnoses. In reality, study design is usually one of the most important determinants of data quality, exposure specificity, and diagnostic criteria. Similarly, studies with very different statistical techniques, different comparison populations, or different diagnostic categories should generally not be lumped into a single analysis.” |
| Risk of bias assessment | | |
| Should we use scales and summary scores to assess the quality of studies? | “The methodological quality of the recruited studies must be checked before analysis. There are several checklists and score systems to facilitate decision about the quality of a study”. | “We do not recommend the use of quality scoring for the simple reason that it would be impossible to treat different study characteristics … that are related to quality as if they are of equal importance or interchangeable and can be measured by a single score”. |
| Publication bias | | |
| Should we assess publication bias with a funnel plot? | “Bias can be detected visually by drawing a funnel plot”. | “Important, but graphical attempts to detect publication bias can be influenced by the subjective expectations of the analyst”. |
| Statistical analysis | | |
| Should we use statistical measures of heterogeneity to decide on statistical model? | “Failing to reject the null-hypothesis assumes that there is homogeneity across the studies and differences between studies are due to random error. In this case a fixed-effect analysis is appropriate” | “In taking account of heterogeneity when summarizing effect measures from observational studies many authors recommend formal tests of heterogeneity. However, the available tests often lack statistical power. This means that the possible existence should be considered even where the available tests fail to demonstrate it” |
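The disagreement in the last row of the table, on whether a non-significant heterogeneity test justifies a fixed-effect model, can be explored with a small sketch. It computes Cochran's Q, its chi-square p-value, and a DerSimonian-Laird between-study variance for the same data, so the test result and the variance estimate can be compared side by side. The study values are hypothetical, the names `chi2_sf` and `q_test` are illustrative, and the chi-square tail probability is computed with a standard recurrence to keep the example free of external libraries.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form base cases: df = 1 (odd) or df = 2 (even).
    if df % 2 == 1:
        sf, k = math.erfc(math.sqrt(half)), 1
    else:
        sf, k = math.exp(-half), 2
    # Recurrence: sf(x; k + 2) = sf(x; k) + half^(k/2) * exp(-half) / Gamma(k/2 + 1).
    while k < df:
        sf += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return sf


def q_test(effects, variances):
    """Cochran's Q with p-value, plus a DerSimonian-Laird tau^2 (assumed estimator)."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    return {"q": q, "df": df, "p": chi2_sf(q, df), "tau2": tau2}


if __name__ == "__main__":
    # Hypothetical log odds ratios and variances from three studies.
    out = q_test([0.0, 0.4, 0.8], [0.04, 0.04, 0.25])
    print(f"Q = {out['q']:.3f} on {out['df']} df, p = {out['p']:.3f}")
    print(f"tau^2 = {out['tau2']:.4f}")
```

For these hypothetical inputs the Q test gives a p-value of roughly 0.19, so it would not reject homogeneity at conventional thresholds, yet the estimated between-study variance is above zero. This is consistent with the caution quoted above that a failed test does not demonstrate the absence of heterogeneity, especially when few studies are available.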