
Rating the certainty in evidence in the absence of a single estimate of effect.

M Hassan Murad¹, Reem A Mustafa²,³, Holger J Schünemann³, Shahnaz Sultan⁴, Nancy Santesso³.

Abstract

When studies measure or report outcomes differently, it may not be feasible to pool data across studies to generate a single effect estimate (ie, perform meta-analysis). Instead, only a narrative summary of the effect across different studies might be available. Regardless of whether a single pooled effect estimate is generated or whether data are summarised narratively, decision makers need to know the certainty in the evidence in order to make informed decisions. In this guide, we illustrate how to apply the constructs of the GRADE (Grading of Recommendation, Assessment, Development and Evaluation) approach to assess the certainty in evidence when a meta-analysis has not been performed and data were summarised narratively. Published by the BMJ Publishing Group Limited.

Keywords: EPIDEMIOLOGY; STATISTICS & RESEARCH METHODS

Year: 2017 PMID: 28320705 PMCID: PMC5502230 DOI: 10.1136/ebmed-2017-110668

Source DB: PubMed Journal: Evid Based Med ISSN: 1356-5524

Background

Practitioners of evidence-based medicine need to know the level of certainty in the evidence they are applying to patient care. Whether they are using recommendations from a clinical practice guideline based on a systematic review of the literature or using the results directly from a systematic review, they need to know how trustworthy the evidence is with regard to the benefits and harms of a treatment or a diagnostic test. This construct is called certainty or quality of evidence.

The GRADE (Grading of Recommendation, Assessment, Development and Evaluation) approach is a modern framework for rating the certainty in evidence.1 Using GRADE, randomised controlled trials and observational studies are considered to generate high and low certainty evidence, respectively. This initial grade that is based on study design is modified using several key domains such as the methodological limitations of the studies, indirectness of the evidence to the question at hand, imprecision of estimates, inconsistency of the evidence, and the likelihood of publication bias. A body of evidence about a specific outcome is downgraded or upgraded to a final rating of high, moderate, low or very low. High certainty in evidence means that the investigators are very confident that the effect they found across studies is close to the true effect, and very low means that they have very little confidence in the effect.1

Often, a single pooled effect estimate from a meta-analysis is available and is used for assessing the certainty in evidence. However, when studies measure outcomes differently or report outcomes in ways that cannot be standardised and meta-analysed, or in situations of urgency, only a narrative synthesis might be available. Consider a systematic review of self-management programmes in patients with chronic obstructive pulmonary disease.2 There were five randomised trials that informed the effect of the intervention on respiratory symptoms. The individual studies presented their results using different tools and measures which precluded pooling. Two trials3 4 used the Borg scale to assess respiratory symptoms. One of these trials3 also presented results for respiratory symptom severity. The third trial5 presented results as the proportion of days rated in patients' diaries as having mild, moderate or severe respiratory symptoms. The fourth trial6 presented results about mean breathlessness and sputum production scores over 2-week periods and the fifth trial7 presented results as breathlessness, sputum volume and sputum colour during exacerbations. These studies could not be pooled, but the evidence could be summarised narratively.

There is some guidance about how to synthesise the effects of interventions narratively.8 Guidance about how to grade certainty in this evidence is needed. A judgement on the certainty in evidence is still required because certainty is a key component of decision-making. Providing decision makers (patients, clinicians and policymakers) with evidence of unknown trustworthiness compromises their ability to transform evidence to action.9 Decision makers would want to know how confident we are in the effect of these programmes to improve respiratory symptoms before offering such programmes to patients with chronic obstructive pulmonary disease.

The approach

We provide suggestions on the use of GRADE to rate the certainty of evidence when a meta-analysis has not been performed, and instead a narrative summary of the effect was provided. The approach leverages the meaning of the constructs that represent GRADE domains to produce judgements on how these constructs affect our certainty. In table 1, we explain how the GRADE domains (methodological limitations of the studies or risk of bias, indirectness, imprecision, inconsistency and the likelihood of publication bias) can be applied without a single pooled estimate. Note that this guidance does not address meta-narrative reviews10–13 (which answer questions about conceptual underpinnings and understanding of a phenomenon) or qualitative systematic reviews14 (which summarise themes from focus groups and interviews); rather, we address evidence synthesis of quantitative estimates of effect not amenable to meta-analysis (and thus summarised narratively).

Table 1

Applying the GRADE approach when evidence for an effect is summarised narratively (a meta-analysis is not available)

GRADE domain	How to apply the GRADE domain to evidence that has been summarised narratively
Methodological limitations of the studies	Make a judgement on the risk of bias across studies for an individual outcome. A sensitivity analysis is not possible to determine if the effect changes when studies at high risk of bias are excluded. It is possible to consider the size of a study, its risk of bias and the impact it would have on the summarised effect.
Indirectness	Make a global judgement on how dissimilar the research evidence is to the clinical question at hand (in terms of population, interventions and outcomes across studies).
Imprecision	Consider the optimal information size (or the total number of events for binary outcomes and the number of participants in continuous outcomes) across all studies. A threshold of 400 or less is concerning for imprecision.15 Results may also be imprecise when the CIs of all the studies or of the largest studies include no effect and clinically meaningful benefits or harms.
Inconsistency	Judge inconsistency by evaluating the consistency of the direction and primarily the difference in the magnitude of effects across studies (since statistical measures of heterogeneity are not available). Widely differing estimates of the effects indicate inconsistency.
Likelihood of publication bias	Publication bias can be suspected when the body of evidence consists of only small positive studies or when studies are reported in trial registries but not published. Statistical evaluation of publication bias is not possible in this case. Publication bias is more likely if the search of the systematic review is not comprehensive.
Factors that can raise certainty in evidence: large effect; dose–response gradient; plausible confounders or other biases that increase the certainty in the effect	If one of the three domains that can increase certainty in a body of evidence (typically from non-randomised studies) is noted, consider rating up the grade of certainty, particularly if it is noted in the majority of studies.
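
The information-size part of the imprecision row can be illustrated with a short calculation. The following is a minimal sketch, not part of the GRADE guidance itself: the function names are hypothetical, the sample sizes are those reported for the five trials in the self-management example, and the check covers only the 400 threshold, not the separate question of whether CIs include both no effect and meaningful benefit or harm.

```python
from typing import Sequence

# Rule-of-thumb threshold quoted in table 1 (events for binary outcomes,
# participants for continuous outcomes).
OIS_THRESHOLD = 400


def total_information_size(study_sizes: Sequence[int]) -> int:
    """Sum the participants (or events) contributed by each study."""
    return sum(study_sizes)


def information_size_is_concerning(
    study_sizes: Sequence[int], threshold: int = OIS_THRESHOLD
) -> bool:
    """Return True when the total is at or below the threshold."""
    return total_information_size(study_sizes) <= threshold


# Participants per trial in the self-management programme example.
trial_sizes = [46, 56, 129, 235, 157]

print(total_information_size(trial_sizes))          # 623
print(information_size_is_concerning(trial_sizes))  # False
```

A total above the threshold does not by itself rule out imprecision; the judgement still depends on the CIs reported by the individual studies.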

Example

In table 2, we again refer to the systematic review of self-management programmes in patients with chronic obstructive pulmonary disease2 and illustrate how we applied the GRADE approach. The outcome of interest in this table is respiratory symptoms which were not pooled in meta-analysis. Evidence derived from five randomised trials showed small to no reductions in respiratory symptoms and was judged to warrant low certainty (rated down for methodological limitations of the included studies and inconsistency). Based on this assessment, decision makers can conclude that self-management programmes may slightly reduce respiratory symptoms. This evidence could also be presented to decision makers in a summary of findings table (typically used in guideline development and generated using GRADEpro which allows narrative summaries of the evidence; https://gradepro.org). Table 3 shows one row of a summary of findings table with explanatory notes. The certainty of evidence in table 3 summarises the GRADE judgements about the different domains (all detailed in table 2) that collectively determined the certainty in evidence for one outcome (respiratory symptoms).

Table 2

Illustrative example of rating the certainty in evidence in the absence of a single estimate of effect

GRADE domain	Judgement	Concerns about certainty domains
Methodological limitations of the studies	One out of five trials7 had low risk of bias in the three items assessed (sequence generation, allocation concealment and blinding) but it was the smallest study (46 participants). Two other trials (56 and 129 participants)3 5 did not report on any of the risk of bias items, which made a judgement impossible and was itself concerning. The remaining two trials4 6 (235 and 157 participants) explicitly reported lack of blinding, unclear sequence generation and allocation concealment. Therefore, we judged the trials to have serious methodological limitations.	Serious
Indirectness	The patients, intervention and comparators in the studies all provide direct evidence to the clinical question at hand. All interventions included an educational component (with some variation in the direct respiratory therapy component). The type and severity of the symptoms (outcome) was assessed using different scales in different trials. We judged the evidence to have no serious indirectness but noted some variability in the intervention and outcome measure.	Not serious
Imprecision	The total number of patients included in all the trials was ∼600. Some trials reported small reductions, and other trials reported ‘non-significant results’ likely because of enrolling a small number of participants which resulted in wide CIs that included meaningful benefits and no effects. We judged the evidence to have borderline imprecision.	Not serious, borderline
Inconsistency	The direction and magnitude of effect varied across the different trials. Overall the results showed either a small reduction in symptoms or no change. Two trials3 4 showed a small effect on dyspnoea, significant at the 5% level, using the Borg scale, in favour of the self-management education programme. The third trial5 found no significant between-group differences in the proportion of days rated in symptom diaries as having mild, moderate or severe respiratory symptoms. In the fourth trial,6 no significant between-group differences were seen in mean breathlessness and sputum production scores over 2-week periods. However, small statistically significant differences in mean cough and sputum colour scores were seen in favour of the intervention group. In the fifth trial,7 no significant differences were found between the scores of the intervention and control group during exacerbations (breathlessness, sputum volume and sputum colour). We judged the evidence to have serious inconsistency.	Serious
Publication bias	We did not strongly suspect publication bias because both negative and positive trials were published, and the search for studies was comprehensive.	Not suspected

The outcome of interest is respiratory symptoms. Data are derived from a systematic review of self-management programmes in patients with chronic obstructive pulmonary disease.
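The way the domain judgements in table 2 combine into the overall rating can be sketched as simple bookkeeping. This is an illustrative simplification under stated assumptions, not a substitute for the judgement GRADE requires: the function name and the dictionary keys are hypothetical, each "serious" concern is assumed to rate down by one level, and "not serious", "borderline" and "not suspected" are assumed to rate down by zero.

```python
from typing import Mapping

# Ordered from lowest to highest certainty.
LEVELS = ["very low", "low", "moderate", "high"]


def rate_certainty(
    randomised: bool, levels_down: Mapping[str, int], levels_up: int = 0
) -> str:
    """Start from the design-based grade, then rate down and up.

    Randomised trials start at "high" and observational studies at "low",
    as described in the Background. The result is clamped to the scale.
    """
    start = LEVELS.index("high") if randomised else LEVELS.index("low")
    score = start - sum(levels_down.values()) + levels_up
    score = max(0, min(score, len(LEVELS) - 1))
    return LEVELS[score]


# Judgements from table 2, encoded as levels rated down (assumed mapping).
table_2 = {
    "methodological limitations": 1,  # Serious
    "indirectness": 0,                # Not serious
    "imprecision": 0,                 # Not serious, borderline
    "inconsistency": 1,               # Serious
    "publication bias": 0,            # Not suspected
}

print(rate_certainty(True, table_2))  # low
```

Starting from high certainty for randomised trials and rating down once each for methodological limitations and inconsistency gives the low certainty reported for respiratory symptoms.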

Table 3

Illustrative example of how the summary of findings can be presented to guideline developers

Outcome	Effect	Number of participants (studies)	Certainty in the evidence*
Respiratory symptoms (assessed using a variety of scales)	Most studies showed small reductions in symptoms or no effect.	623 (5 randomised trials)	LOW†‡ ⊕⊕OO (due to serious risk of bias and imprecision)

The outcome of interest is respiratory symptoms (for which a single pooled effect estimate was not available and only a narrative synthesis of the evidence was provided).

*Commonly used symbols to describe certainty in evidence in evidence profiles: high certainty ⊕⊕⊕⊕, moderate certainty ⊕⊕⊕O, low certainty ⊕⊕OO and very low certainty ⊕OOO.

†Serious risk of bias across studies because of unclear or inadequate blinding, sequence generation and allocation concealment.

‡Serious imprecision and inconsistency were considered together as there were small effects, or ‘no effects’ reported in studies (likely due to wide CIs).

Discussion

Evidence-based practice is founded on making decisions using the best available evidence, whether it is based on a pooled single effect estimate, or on a narrative review of the individual studies informing each outcome. Stakeholders require that such evidence is appraised and the certainty in the effect is determined in order to inform decision-making. One of the greatest strengths of the GRADE approach is that it provides a systematic method to assess the certainty in evidence and a transparent documentation of the judgements used to assess the body of evidence. While typically it is thought to only apply to results that have been statistically aggregated, evaluating the certainty of evidence can also be performed when results have been narratively summarised.16 In this setting, some certainty domains can be applied directly. For other domains, we have provided additional guidance in which the meaning and connotation of those domains can be used. Taken together, an overall assessment of the evidence can be determined. Stakeholders engaged in shared decision-making in a patient–physician dyad, in guidelines development, or in public health and policy, can then use the summarised effect and the certainty in the evidence to make informed decisions.

References (15 in total)

1. Managing chronic obstructive pulmonary disease in the community. A randomized controlled trial of home-based pulmonary rehabilitation for elderly housebound patients.

Authors: Anne-Marie Boxall; Louise Barclay; Allyn Sayers; Gideon A Caplan
Journal: J Cardiopulm Rehabil Date: 2005 Nov-Dec

2. GRADE guidelines 6. Rating the quality of evidence--imprecision.

Authors: Gordon H Guyatt; Andrew D Oxman; Regina Kunz; Jan Brozek; Pablo Alonso-Coello; David Rind; P J Devereaux; Victor M Montori; Bo Freyschuss; Gunn Vist; Roman Jaeschke; John W Williams; Mohammad Hassan Murad; David Sinclair; Yngve Falck-Ytter; Joerg Meerpohl; Craig Whittington; Kristian Thorlund; Jeff Andrews; Holger J Schünemann
Journal: J Clin Epidemiol Date: 2011-08-11

3. Using GRADE to respond to health questions with different levels of urgency.

Authors: Kristina A Thayer; Holger J Schünemann
Journal: Environ Int Date: 2016-04-26

4. GRADE Evidence to Decision (EtD) frameworks: a systematic and transparent approach to making well informed healthcare choices. 1: Introduction.

Authors: Pablo Alonso-Coello; Holger J Schünemann; Jenny Moberg; Romina Brignardello-Petersen; Elie A Akl; Marina Davoli; Shaun Treweek; Reem A Mustafa; Gabriel Rada; Sarah Rosenbaum; Angela Morelli; Gordon H Guyatt; Andrew D Oxman
Journal: BMJ Date: 2016-06-28

5. Evaluation of a self-management plan for chronic obstructive pulmonary disease.

Authors: P B Watson; G I Town; N Holbrook; C Dwan; L J Toop; C J Drennan
Journal: Eur Respir J Date: 1997-06

6. Economic evaluation of a comprehensive self-management programme in patients with moderate to severe chronic obstructive pulmonary disease.

Authors: E Monninkhof; P van der Valk; T Schermer; J van der Palen; C van Herwaarden; G Zielhuis
Journal: Chron Respir Dis Date: 2004

7. Humanistic outcomes in the hypertension and COPD arms of a multicenter outcomes study.

Authors: G A Gourley; T S Portner; D R Gourley; E L Rigolosi; J M Holt; D K Solomon; G E Bass; W R Wicke; R L Braden
Journal: J Am Pharm Assoc (Wash) Date: 1998 Sep-Oct

8. Health assessment of commercial drivers: a meta-narrative systematic review.

Authors: Abd Moain Abu Dabrh; Belal Firwana; Clayton T Cowl; Lawrence W Steinkraus; Larry J Prokop; Mohammad Hassan Murad
Journal: BMJ Open Date: 2014-03-06

9. RAMESES publication standards: meta-narrative reviews.

Authors: Geoff Wong; Trish Greenhalgh; Gill Westhorp; Jeanette Buckingham; Ray Pawson
Journal: BMC Med Date: 2013-01-29

10. Using qualitative evidence in decision making for health and social interventions: an approach to assess confidence in findings from qualitative evidence syntheses (GRADE-CERQual).

Authors: Simon Lewin; Claire Glenton; Heather Munthe-Kaas; Benedicte Carlsen; Christopher J Colvin; Metin Gülmezoglu; Jane Noyes; Andrew Booth; Ruth Garside; Arash Rashidian
Journal: PLoS Med Date: 2015-10-27
