David Atkins, Peter A Briss, Martin Eccles, Signe Flottorp, Gordon H Guyatt, Robin T Harbour, Suzanne Hill, Roman Jaeschke, Alessandro Liberati, Nicola Magrini, James Mason, Dianne O'Connell, Andrew D Oxman, Bob Phillips, Holger Schünemann, Tessa Tan-Torres Edejer, Gunn E Vist, John W Williams.
Abstract
BACKGROUND: Systems that are used by different organisations to grade the quality of evidence and the strength of recommendations vary. They have different strengths and weaknesses. The GRADE Working Group has developed an approach that addresses key shortcomings in these systems. The aim of this study was to pilot test and further develop the GRADE approach to grading evidence and recommendations.
Year: 2005 PMID: 15788089 PMCID: PMC1084246 DOI: 10.1186/1472-6963-5-25
Source DB: PubMed Journal: BMC Health Serv Res ISSN: 1472-6963 Impact factor: 2.655
Example of an evidence profile quality assessment given to the evaluators for them to grade in the pilot study. Example question: Should depressed patients in primary care be treated with SSRIs rather than tricyclics?
| Studies | Design | Quality | Consistency | Directness |
| Outcome: Depression severity | ||||
| 8 trials Citalopram | RCTs | No serious flaws | No important inconsistency | Some uncertainty about relevance (outcome measure) |
| Outcome: Transient side effects | ||||
| 8 trials Citalopram | RCTs | No serious flaws | No important inconsistency | Some uncertainty about relevance (outcome measure) |
| Outcome: Poisoning fatalities | ||||
| Office for National Statistics (British) | Observational data (national statistics) | Serious flaw, population based | Only one study | Direct |
Example of an evidence profile summary of findings given to the evaluators for them to grade in the pilot study. Example question: Should depressed patients in primary care be treated with SSRIs rather than tricyclics?
| Outcome | SSRI | Tricyclics | Effect, relative (95% CI) | NNT/NNH | Quality | Relative importance |
| Depression severity | 5044 patients | 4510 patients | WMD 0.034 (-0.007 to 0.075) | No difference | ||
| Transient side effects | 1948/7032 (28%) | 2072/6334 (33%) | RRR 13% (5% to 20%) | 20 | ||
| Poisoning fatalities* | 1/100,000 per year of treatment | 58/100,000 per year of treatment | RRR 98% | 1754 | ||
* Uncertainty about baseline risk: Fatality data may be influenced by which pills are given to whom, and it is uncertain if changing antidepressant would deter suicide attempts
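The NNT/NNH column in the table above can be reproduced from the absolute event rates in the two arms. The following is a minimal sketch, assuming the usual definition (number needed to treat or harm is the reciprocal of the absolute risk difference); the function names are illustrative and do not come from the paper. The relative effects (RRR) in the table are not recomputed here, because they may be pooled estimates that differ from a crude ratio of the totals.

```python
def absolute_risk_difference(comparator_risk: float, treatment_risk: float) -> float:
    """Difference in event risk between comparator and treatment arms."""
    return comparator_risk - treatment_risk


def number_needed(comparator_risk: float, treatment_risk: float) -> float:
    """Number needed to treat (or harm): 1 / |absolute risk difference|."""
    diff = absolute_risk_difference(comparator_risk, treatment_risk)
    if diff == 0:
        raise ValueError("no risk difference, so the number needed is undefined")
    return 1.0 / abs(diff)


# Transient side effects: tricyclics 2072/6334 vs SSRI 1948/7032
side_effects = number_needed(2072 / 6334, 1948 / 7032)

# Poisoning fatalities: 58 vs 1 per 100,000 per year of treatment
fatalities = number_needed(58 / 100_000, 1 / 100_000)

print(round(side_effects), round(fatalities))
```

Rounded to whole patients, these crude calculations agree with the tabulated values of 20 and 1754.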
Summary of the judgements made by the 17 evaluators for Example 1 of the pilot study. Should depressed patients in primary care be treated with SSRIs rather than tricyclics?
| Rater | Quality: Depression severity | Quality: Transient side effects | Quality: Poisoning fatalities | Relative importance: Depression severity | Relative importance: Transient side effects | Relative importance: Poisoning fatalities | Overall quality | Balance benefits vs harm | Recommendation |
| 1 | H | H | Vl | 7 | 9 | 9 | H or Vl | Uncertain net benefit | Don't do it |
| 2 | M | M | Vl | 9 | 6 | 7 | Vl | Net benefit | Probably do it |
| 3 | H | H | M | 8 | 7 | 9 | H | Uncertain net benefit | Toss up |
| 4 | H | H | L | 6 | 5 | 6 | |||
| 5 | M | M | M | 9 | 6 | 8 | M | Net benefit | Do it |
| 6 | M | M | Vl | 9 | 6 | 9 | Vl | Net benefit | Do it |
| 7 | M | M | L | 8 | 7 | 8 | L | Net benefit | Do it |
| 8 | H | H | Vl | 9 | 5 | 3 | H | Net benefit | Probably do it |
| 9 | M | M | L | 9 | 6 | 8 | L | Net benefit | Probably do it |
| 10 | M | M | L | 9 | 7 | 8 | L | Net benefit | Probably do it |
| 11 | H | H | L | 8 | 5 | 7 | L | Trade offs | Probably do it |
| 12 | H | H | L | 8 | 5 | 7 | L | Trade offs | Probably do it |
| 13 | M | H | M | 9 | 7 | 9 | M | Net benefit | Do it |
| 14 | M | M | L | 9 | 9 | 5 OR 9 | M | Net benefit | Probably do it |
| 15 | M | M | Vl | 9 | 6 | 8 | Vl | Uncertain net benefit | Toss up |
| 16 | M | M | Vl | 9 | 5 | 9 | Vl | Not net benefit | Don't do it |
| 17 | M | M | M | 9 | 9 | 9 | M | Net benefit | Toss up |
Results, summary of the judgements made by the 17 evaluators of the quality for each of the outcomes presented in the 12 examples in the pilot study.
| Outcome | High | Moderate | Low | Very low | Consensus | Comments |
| Depression severity | 6/17 | 10/17 | - | - | Moderate | |
| Transient side effects | 7/17 | 10/17 | - | - | High | Changed to little uncertainty |
| Poisoning fatalities | - | 4/17 | 7/17 | 6/17 | Moderate | Upgraded for very strong association |
| Stroke | 15/17 | 2/17 | - | - | High | |
| Extracranial hemorrhage | 16/16 | - | - | - | High | |
| All cause mortality | 12/17 | 5/17 | - | - | - | Agreed to remove this outcome |
| Pain at rest | 16/17 | 1/17 | - | - | High | |
| Pain at motion | 15/17 | 2/17 | - | - | Moderate | Uncertainty about directness of outcome measure |
| Mobility | 3/17 | 14/17 | - | - | Moderate | |
| Quality of life | 1/17 | 11/17 | 5/17 | - | Moderate | |
| Dropout due to side effects | 14/17 | 3/17 | - | - | High | |
| Serious GI complications | - | 3/17 | 8/17 | 6/17 | - | Need more information before consensus |
| All cause mortality | 2/17 | 15/17 | - | - | Moderate | |
| Non-fatal stroke | 17/17 | - | - | - | High | |
| Non-fatal MI | 17/17 | - | - | - | High | |
| Death | 13/17 | 4/17 | - | - | High | |
| Non-fatal MI | 11/16 | 5/16 | - | - | High | |
| All cause death | 12/17 | 5/17 | - | - | Moderate | If reporting bias, otherwise high |
| Major bleeding | 15/17 | 2/17 | - | - | High | |
| Recurrent thromboembolism | 6/17 | 11/17 | - | - | High | |
| Clinical cure | - | 13/17 | 4/17 | - | Moderate | |
| Dropout due to side effects | - | 10/17 | 7/17 | - | Moderate | |
| Relapse | 2/17 | 11/17 | 4/17 | - | Moderate | |
| Tuberculosis | 2/17 | 10/17 | 5/17 | - | Moderate | |
| TB death | 8/17 | 8/17 | 1/17 | - | High | |
| TB meningitis | 1/17 | 4/17 | 12/17 | - | Moderate | Strong association |
| Serious adverse events | 1/12 | - | - | 11/12 | - | No data, outcome removed |
| Condition unchanged | 5/17 | 10/17 | 2/17 | - | - | No consensus regarding sparse data |
| Poor outcome- surgeon rated | 9/17 | 8/17 | - | - | - | Need bias information before consensus |
| 2nd procedure needed | 9/17 | 8/17 | - | - | Moderate | |
| No success – objective rater | 5/17 | 10/17 | 2/17 | - | - | No consensus regarding sparse data |
| Risks & side effects | 1/15 | 2/15 | 2/15 | 10/15 | - | No data, outcome removed |
| Dental caries – start | - | - | 8/17 | 9/17 | Very low | |
| Dental caries – stop | - | - | 5/17 | 12/17 | Very low | |
| Dental fluorosis | - | 1/17 | 3/17 | 13/17 | Very low | |
| Bone fracture | - | 1/17 | 3/17 | 13/17 | Very low | |
| Cancer mortality | - | 1/17 | 2/17 | 14/17 | Very low | |
| All injuries | - | 1/17 | 12/17 | 4/17 | Very low | Question changed |
| Correct use early | 8/17 | 5/17 | 4/17 | - | High | Question changed |
| Correct use follow up | 2/17 | 8/17 | 6/17 | 1/17 | High | Question changed |
| Possession of seat | 7/17 | 6/17 | 3/17 | - | High | Question changed |
| CHD | 11/13 | 2/13 | - | - | High | |
| Breast cancer | 11/13 | 2/13 | - | - | High | |
| Stroke | 11/13 | 2/13 | - | - | High | |
| Colorectal cancer | 11/13 | 2/13 | - | - | High | |
| Endometrial cancer | 11/13 | 2/13 | - | - | High | |
| Hip fracture | 11/13 | 2/13 | - | - | High |
Results, summary of the judgements made by the 17 evaluators of the overall quality in the 12 examples in the pilot study
| Example | High | Moderate | Low | Very low | Consensus | Comments |
| 1 | 2/15 | 4/15 | 5/15 | 4/15 | Moderate | |
| 2 | 12/17 | 5/17 | - | - | High | |
| 3 | 1/17 | 6/17 | 5/17 | 5/17 | Need more information before consensus | |
| 4 | 4/17 | 13/17 | - | - | High | Based on the new rule |
| 5 | 12/16 | 4/16 | - | - | High | |
| 6 | 7/17 | 10/17 | - | - | High | Based on new rule |
| 7 | - | 11/17 | 6/17 | - | Moderate | |
| 8 | - | 6/17 | 3/17 | 8/17 | High | Based on new rule |
| 9 | 2/16 | 3/16 | 5/16 | 6/16 | High/Moderate depending on whether there are fatal flaws | |
| 10 | - | 1/17 | 4/17 | 12/17 | Very low | |
| 11 | 1/17 | 3/17 | 8/17 | 5/17 | High | Changed question |
| 12 | 11/13 | 2/13 | - | - | High | |
Results, kappa agreement among the evaluators for each of the 12 examples in the pilot study
| 1 | 3 | 0.436 | 0.149 | 0.031 |
| 2 | 3 | 0.769 | 0.075 | 0.053 |
| 3 | 6 | 0.643 | 0.441 | 0.024 |
| 4 | 3 | 0.926 | 0.823 | 0.050 |
| 5 | 2 | 0.608 | -0.044 | 0.065 |
| 6 | 3 | 0.618 | 0.163 | 0.050 |
| 7 | 3 | 0.520 | -0.028 | 0.044 |
| 8 | 3 | 0.451 | 0.146 | 0.036 |
| 9 | 4 | 0.441 | -0.022 | 0.037 |
| 10 | 5 | 0.579 | 0.005 | 0.034 |
| 11 | 4 | 0.377 | 0.112 | 0.027 |
| 12 | 7 | 0.718 | -0.083 | 0.043 |
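The table above reports kappa values without stating which multi-rater statistic was used. As an illustration only, the sketch below computes Fleiss' kappa, one common choice when a fixed number of raters assign items to categories; that choice, the function name, and the example counts are all assumptions and are not taken from the paper.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa; counts[i][j] = raters who put item i into category j."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    # Observed agreement: share of agreeing rater pairs, averaged over items.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items

    # Chance agreement from the overall proportion of ratings per category.
    n_categories = len(counts[0])
    p_j = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_bar - p_e) / (1.0 - p_e)


# Hypothetical counts (High, Moderate, Low, Very low) for three outcomes,
# each rated by 17 evaluators; these are NOT the pilot study data.
hypothetical = [
    [12, 5, 0, 0],
    [2, 11, 4, 0],
    [0, 3, 6, 8],
]
print(round(fleiss_kappa(hypothetical), 3))
```

Fleiss' kappa requires the same number of ratings for every item, so outcomes with missing judgements would need to be handled separately before applying a calculation of this kind.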
Results, summary of the judgements made by the 17 evaluators about the balance between benefits and harms for each of the 12 examples in the pilot study
| Example | Net benefit | Trade offs | Uncertain net benefits | Not net benefits | Consensus |
| 1 | 10/16 | 2/16 | 3/16 | 1/16 | Net benefit |
| 2 | 11/16 | 4/16 | 1/16 | - | Net benefit |
| 3 | 2/17 | 8/17 | 7/17 | - | Need more information |
| 4 | 15/16 | - | 1/16 | - | Net benefit |
| 5 | 13/17 | - | 4/17 | - | Net benefit |
| 6 | 13/17 | 2/17 | 2/17 | - | Net benefit |
| 7 | 4/17 | 3/17 | 9/17 | 1/17 | Uncertain net benefits |
| 8 | 7/16 | - | 9/16 | - | Net benefit |
| 9 | 2/16 | 8/16 | 6/16 | - | Uncertain benefit/trade offs |
| 10 | 2/17 | 4/17 | 10/17 | 1/17 | No consensus |
| 11 | 12/17 | - | 5/17 | - | Net benefit |
| 12 | - | 2/13 | 1/13 | 10/13 | No consensus |
Results, summary of the recommendations made by the 17 evaluators for each of the 12 examples in the pilot study
| Example | Do it | Probably do it | Toss up | Probably don't do it | Don't do it | Consensus |
| 1 | 4/16 | 7/16 | 3/16 | - | 2/16 | Probably do it |
| 2 | 6/16 | 8/16 | 2/16 | - | - | Do it |
| 3 | - | 6/15 | 7/15 | 2/15 | - | Need more information |
| 4 | 13/15 | 2/15 | - | - | - | Do it |
| 5 | 11/16 | 5/16 | - | - | - | Do it |
| 6 | 11/17 | 5/17 | 1/17 | - | - | Do it |
| 7 | 1/17 | 7/17 | 2/17 | 6/17 | 1/17 | Probably do it |
| 8 | 2/15 | 7/15 | 4/15 | 2/15 | - | Do it |
| 9 | 1/17 | 4/17 | 8/17 | 4/17 | - | Probably don't do it/Toss up |
| 10 | - | 2/17 | 6/17 | 7/17 | 2/17 | No consensus |
| 11 | 7/17 | 8/17 | 2/17 | - | - | Do it |
| 12 | - | - | - | 4/13 | 9/13 | No consensus |
Example of a modified GRADE evidence profile quality assessment. Tables 9 and 10 show what Tables 1 and 2 became after incorporating the improvements made based on the pilot study experience.
Question: Should depressed patients be treated with SSRIs rather than tricyclics?
Setting: Primary care
Patients: Moderately depressed adult patients
Reference: North of England Evidence Based Guideline Development Project. Evidence based clinical practice guideline: the choice of antidepressants for depression in primary care. Newcastle upon Tyne: Centre for Health Services Research, 1997.
| Studies | Design | Quality | Consistency | Directness | SD | SA | RB | DR | PC |
| Outcome: Depression severity | |||||||||
| 8 trials Citalopram | RCTs | No serious limitations | No important inconsistency | Some uncertainty about directness (outcome measure)* | No | No | No | No | No |
| Outcome: Transient side effects | |||||||||
| 8 trials Citalopram | RCTs | No serious limitations | No important inconsistency | Direct | No | No | No | No | No |
| Outcome: Poisoning fatalities | |||||||||
| Office for National Statistics (British) | Observational data | Serious limitation** | Only one study | Direct | No | ++ | No | No | No |
*There was uncertainty about the directness of the outcome measure because of the short duration of the trials.
**It is possible that people at lower risk were more likely to have been given SSRIs, and it is uncertain whether changing antidepressant would have deterred suicide attempts.
SD = Sparse data (Yes or No)
SA = Strong association (No, + = strong, ++ = very strong)
RB = Reporting bias (Yes or No)
DR = Dose response (Yes or No)
PC = All plausible confounders would have reduced the effect (Yes or No)
CI = confidence interval
WMD = weighted mean difference
RRR = relative risk reduction
Example of a modified GRADE evidence profile summary of findings. Tables 9 and 10 show what Tables 1 and 2 became after incorporating the improvements made based on the pilot study experience.
| Outcome | SSRI | Tricyclics | Effect, relative (95% CI) | Effect, absolute | Quality | Relative importance |
| Depression severity | 5044 patients | 4510 patients | WMD 0.034 (-0.007 to 0.075) | No difference | Moderate | Critical |
| Transient side effects | 1948/7032 (28%) | 2072/6334 (33%) | RRR 13% (5% to 20%) | 5 per 100 | High | Critical |
| Poisoning fatalities*** | 1/100,000 per year of treatment | 58/100,000 per year of treatment | RRR 98% (97% to 99%) | 6 per 10,000 | Moderate | Critical |
***There is uncertainty about the baseline risk for poisoning fatalities.
Modified GRADE quality assessment criteria
| High | Randomised trial | ||
| Moderate | Quasi-randomised trial | ||
| Low | Observational study | ||
| Very low | Any other evidence | ||
* 1 = move up or down one grade (for example from high to moderate)
2 = move up or down two grades (for example from high to low)
The highest possible score is High (4) and the lowest possible score is Very low (1). Thus, for example, randomised trials with a strong association would not move up a grade.
** A relative risk of > 2 (< 0.5), based on consistent evidence from two or more observational studies, with no plausible confounders
*** A relative risk of > 5 (< 0.2) based on direct evidence with no major threats to validity
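The grading rule described in the footnotes above can be sketched in code: start from the grade implied by the study design, move up or down by one or two grades, and keep the result between Very low (1) and High (4). This is a minimal sketch under stated assumptions, not the Working Group's own procedure. The function names are hypothetical, the adjustments are supplied by the caller as signed integers, and summing all adjustments before bounding the score is an assumption. The relative risk thresholds come from the footnotes; the further conditions they attach (consistency, confounders, directness) are not checked here.

```python
GRADES = {1: "Very low", 2: "Low", 3: "Moderate", 4: "High"}

# Starting score by study design, following the criteria table above.
START = {
    "randomised trial": 4,
    "quasi-randomised trial": 3,
    "observational study": 2,
    "any other evidence": 1,
}


def association_strength(relative_risk: float) -> str:
    """Classify an association by the relative risk thresholds only."""
    if relative_risk > 5 or relative_risk < 0.2:
        return "very strong"
    if relative_risk > 2 or relative_risk < 0.5:
        return "strong"
    return "none"


def grade(design: str, adjustments: list[int]) -> str:
    """Apply signed grade moves (+/-1 or +/-2), bounded to the 1..4 range."""
    score = START[design] + sum(adjustments)
    return GRADES[max(1, min(4, score))]


# Poisoning fatalities: observational data with a serious limitation
# (assumed -1) and a very strong association (assumed +2).
print(association_strength(1 / 58))            # crude RR from 1 vs 58 per 100,000
print(grade("observational study", [-1, +2]))

# A randomised trial cannot move above High, whatever the upgrade.
print(grade("randomised trial", [+1]))
```

Under these assumed magnitudes the poisoning fatalities outcome comes out as Moderate, which is consistent with the quality shown for that outcome in the modified summary of findings.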