Dawid Pieper, Anja Jacobs, Beate Weikert, Alba Fishta, Uta Wegewitz.
Abstract
BACKGROUND: Inter-rater reliability (IRR) is mainly assessed based on only two reviewers of unknown expertise. The aim of this paper is to examine differences in the IRR of the Assessment of Multiple Systematic Reviews (AMSTAR) and R(evised)-AMSTAR depending on the pair of reviewers.
Keywords: AMSTAR; Clinimetrics; Measurement properties; Observer variation; Reliability and validity; Systematic review
Year: 2017 PMID: 28693497 PMCID: PMC5504630 DOI: 10.1186/s12874-017-0380-y
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
AMSTAR checklist
| 1. Was an ‘a priori’ design provided? |
| 2. Was there duplicate study selection and data extraction? |
| 3. Was a comprehensive literature search performed? |
| 4. Was the status of publication (i.e. grey literature) used as an inclusion criterion? |
| 5. Was a list of studies (included and excluded) provided? |
| 6. Were the characteristics of the included studies provided? |
| 7. Was the scientific quality of the included studies assessed and documented? |
| 8. Was the scientific quality of the included studies used appropriately in formulating conclusions? |
| 9. Were the methods used to combine the findings of studies appropriate? |
| 10. Was the likelihood of publication bias assessed? |
| 11. Was the conflict of interest included? |
Experience of reviewers
| | Reviewer 1 | Reviewer 2 | Reviewer 3 | Reviewer 4 | Reviewer 5 |
|---|---|---|---|---|---|
| Working experience (in years) | 13 | 7 | 5 | 7 | 7 |
| Number of SRs assessed with either AMSTAR, R-AMSTAR or OQAQ | 15 | 1 | 100 | 10 | 10 |
| Number of SRs assessed with any other tool | 35 | 5 | 10 | 5 | 50 |
Years of collaboration for each pair
| Pair of reviewers | Collaboration (in years) |
|---|---|
| 1&2 | 0 |
| 1&3 | 0 |
| 1&4 | 5 |
| 1&5 | 3 |
| 2&3 | 0 |
| 2&4 | 3 |
| 2&5 | 0 |
| 3&4 | 0 |
| 3&5 | 0 |
| 4&5 | 3 |
Pairwise inter-rater reliability of AMSTAR and R-AMSTAR
| Pair of reviewers | AMSTAR: Cohen’s κ | AMSTAR: Holsti’s r | R-AMSTAR: Cohen’s κ | R-AMSTAR: Holsti’s r |
|---|---|---|---|---|
| 1&2 | 0.55 | 0.87 | 0.37 | 0.81 |
| 1&3 | 0.56 | 0.82 | 0.45 | 0.84 |
| 1&4 | 0.47 | 0.86 | 0.46 | 0.84 |
| 1&5 | 0.41 | 0.82 | 0.32 | 0.77 |
| 2&3 | 0.52 | 0.89 | 0.49 | 0.83 |
| 2&4 | 0.69 | 0.98 | 0.67 | 0.89 |
| 2&5 | 0.50 | 0.88 | 0.39 | 0.82 |
| 3&4 | 0.53 | 0.89 | 0.67 | 0.87 |
| 3&5 | 0.43 | 0.83 | 0.44 | 0.80 |
| 4&5 | 0.52 | 0.88 | 0.45 | 0.81 |
| min | 0.41 | 0.82 | 0.32 | 0.77 |
| max | 0.69 | 0.98 | 0.67 | 0.89 |
| mean | 0.52 | 0.87 | 0.47 | 0.83 |
| median | 0.52 | 0.88 | 0.45 | 0.82 |
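To make the two statistics in the table above concrete, the following is a minimal Python sketch of how Cohen's κ and Holsti's r can be computed for one pair of reviewers. The ratings are hypothetical (not the study data), the function names are illustrative, and the sketch assumes that both reviewers rate the same set of reviews with the same categories, in which case Holsti's r reduces to the proportion of identical ratings.

```python
from collections import Counter


def holsti_r(a, b):
    """Holsti's r = 2M / (N1 + N2), where M is the number of identical ratings."""
    if len(a) != len(b):
        raise ValueError("both reviewers must rate the same reviews")
    agreements = sum(x == y for x, y in zip(a, b))
    return 2 * agreements / (len(a) + len(b))


def cohens_kappa(a, b):
    """Cohen's kappa = (p_o - p_e) / (1 - p_e) for two raters."""
    if len(a) != len(b):
        raise ValueError("both reviewers must rate the same reviews")
    n = len(a)
    # Observed agreement: share of reviews rated identically.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance-expected agreement from each reviewer's marginal frequencies.
    count_a, count_b = Counter(a), Counter(b)
    p_e = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / n**2
    if p_e == 1:
        raise ValueError("kappa is undefined when both raters are constant")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings of one checklist item across 8 reviews.
rev_a = ["yes", "yes", "no", "yes", "no", "no", "yes", "yes"]
rev_b = ["yes", "no", "no", "yes", "no", "yes", "yes", "yes"]

print(holsti_r(rev_a, rev_b))      # 0.75 (6 of 8 ratings agree)
print(cohens_kappa(rev_a, rev_b))  # about 0.47 (p_o = 0.75, p_e = 34/64)
```

In this example raw agreement is 0.75 while κ is noticeably lower, the same pattern seen in the table, because κ discounts the agreement expected by chance.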
Pairwise inter-rater reliability (Cohen’s kappa) for AMSTAR on item-level
| Pair | 1&2 | 1&3 | 1&4 | 1&5 | 2&3 | 2&4 | 2&5 | 3&4 | 3&5 | 4&5 | Mean | Median |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| I1: Was an ‘a priori’ design provided? | | | | | | | | | | | 1.00 | 1.00 |
| I2: Was there duplicate study selection and data extraction? | 0.33 | 0.71 | 0.50 | 0.33 | 0.61 | | 0.16 | 0.75 | 0.20 | 0.13 | 0.46 | 0.42 |
| I3: Was a comprehensive literature search performed? | 0.45 | 0.45 | 0.45 | 0.20 | 0.18 | | 0.29 | 0.18 | 0.29 | 0.29 | 0.38 | 0.29 |
| I4: Was the status of publication (i.e. grey literature) used as an inclusion criterion? | 0.30 | 0.19 | 0.38 | 0.14 | 0.48 | | 0.33 | 0.63 | 0.54 | 0.50 | 0.44 | 0.43 |
| I5: Was a list of studies (included and excluded) provided? | 0.87 | 0.64 | 0.75 | 0.75 | 0.75 | | | | 0.63 | 0.75 | 0.78 | 0.75 |
| I6: Were the characteristics of the included studies provided? a | | | | | | | | | | | | |
| I7: Was the scientific quality of the included studies assessed and documented? a | | | | | | | | | 0.33 | | | |
| I8: Was the scientific quality of the included studies used appropriately in formulating conclusions? | | | −0.07 | 0.33 | | −0.07 | 0.33 | −0.07 | 0.33 | −0.11 | 0.37 | 0.33 |
| I9: Were the methods used to combine the findings of studies appropriate? | | 0.19 | | 0.18 | 0.07 | 0.33 | | 0.33 | 0.19 | | 0.31 | 0.33 |
| I10: Was the likelihood of publication bias assessed? | 0.29 | 0.71 | 0.53 | 0.53 | 0.33 | 0.73 | 0.73 | 0.33 | 0.33 | | 0.55 | 0.53 |
| I11: Was the conflict of interest included? | 0.24 | 0.16 | 0.24 | 0.20 | 0.29 | 0.59 | 0.35 | | 0.48 | 0.67 | 0.40 | 0.32 |
| Mean | 0.55 | 0.56 | 0.47 | 0.41 | 0.52 | 0.69 | 0.50 | 0.53 | 0.43 | 0.52 | 0.52 | 0.52 |
| Median | 0.45 | 0.64 | 0.45 | 0.33 | 0.48 | | 0.35 | 0.63 | 0.33 | 0.50 | 0.50 | |
I: Item
a For items 6 and 7, pairwise inter-rater reliability could not be calculated (except for item 7, pair 3&5) because at least one reviewer of each pair scored “yes” for all 16 reviews, resulting in a constant variable.
Highest values per item are marked in bold, lowest values are marked in italics
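The footnote on items 6 and 7 can be illustrated with a short Python sketch. When one reviewer gives the same score to every review, that reviewer's ratings form a constant variable: observed agreement and chance-expected agreement coincide, so κ carries no information (it evaluates to 0, or to 0/0 if both reviewers are constant). The ratings below are hypothetical, not the study data, and the helper name `agreement_terms` is illustrative.

```python
from collections import Counter


def agreement_terms(a, b):
    """Return (p_o, p_e): observed and chance-expected agreement for two raters."""
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    p_e = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / n**2
    return p_o, p_e


# Hypothetical case: reviewer A scores "yes" on all 16 reviews,
# reviewer B scores "yes" on 14 of them and "no" on 2.
rev_a = ["yes"] * 16
rev_b = ["yes"] * 14 + ["no"] * 2

p_o, p_e = agreement_terms(rev_a, rev_b)
# p_o = 14/16 = 0.875 and p_e = (16 * 14) / 256 = 0.875
kappa = (p_o - p_e) / (1 - p_e)
print(p_o, p_e, kappa)  # 0.875 0.875 0.0

# If both reviewers are constant, p_o = p_e = 1 and kappa becomes 0/0.
p_o_const, p_e_const = agreement_terms(rev_a, rev_a)
print(p_o_const, p_e_const)  # 1.0 1.0
```

Raw agreement in the first case is high (0.875), yet κ is 0, which is why a constant rater makes κ uninformative for that item.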
Number of overruled assessments for each reviewer at item-level for AMSTAR
| Item | Rev1 | Rev2 | Rev3 | Rev4 | Rev5 |
|---|---|---|---|---|---|
| I1: Was an ‘a priori’ design provided? | 0 | 5 | 0 | 0 | 0 |
| I2: Was there duplicate study selection and data extraction? | 5 | 1 | 4 | 0 | 7 |
| I3: Was a comprehensive literature search performed? | 2 | 1 | 3 | 1 | 2 |
| I4: Was the status of publication (i.e. grey literature) used as an inclusion criterion? | 7 | 3 | 6 | 1 | 5 |
| I5: Was a list of studies (included and excluded) provided? | 2 | 0 | 2 | 1 | 3 |
| I6: Were the characteristics of the included studies provided? | 0 | 0 | 0 | 0 | 3 |
| I7: Was the scientific quality of the included studies assessed and documented? | 1 | 1 | 3 | 1 | 0 |
| I8: Was the scientific quality of the included studies used appropriately in formulating conclusions? | 1 | 1 | 1 | 1 | 4 |
| I9: Were the methods used to combine the findings of studies appropriate? | 4 | 4 | 7 | 4 | 3 |
| I10: Was the likelihood of publication bias assessed? | 5 | 3 | 11 | 1 | 2 |
| I11: Was the conflict of interest included? | 7 | 2 | 1 | 0 | 2 |
| Total | 34 | 21 | 38 | 10 | 29 |
I: Item; Rev: Reviewer