Robert C Lorenz (1), Katja Matthias (2), Dawid Pieper (3), Uta Wegewitz (4), Johannes Morche (2), Marc Nocon (2), Olesja Rissling (2), Jacqueline Schirm (2), Anja Jacobs (2).
1. Federal Joint Committee (Healthcare), Medical Consultancy Department, Gutenbergstr. 13, 10587 Berlin, Germany; Division of Social and Preventive Medicine, University of Potsdam, Research Focus Cognitive Sciences, Am Neuen Palais 10, Potsdam 14469, Germany. Electronic address: robert.lorenz@g-ba.de.
2. Federal Joint Committee (Healthcare), Medical Consultancy Department, Gutenbergstr. 13, 10587 Berlin, Germany.
3. Witten/Herdecke University, School of Medicine, Faculty of Health, Evidence-based Health Services Research, IFOM - Institute for Research in Operative Medicine, Ostmerheimer Str. 200, 51109 Cologne, Germany.
4. Federal Institute for Occupational Safety and Health, Nöldnerstr. 40-42, 10317 Berlin, Germany.
Abstract
OBJECTIVES: The objectives of this study were to determine the interrater reliability (IRR) of assessment of multiple systematic reviews (AMSTAR) 2 for reviews of pharmacological or psychological interventions for the treatment of major depression, to compare it with that of AMSTAR and risk of bias in systematic reviews (ROBIS), and to assess the convergent validity between the appraisal tools.
STUDY DESIGN AND SETTING: Two groups of four raters were each assigned one of two samples of 30 systematic reviews. All eight raters applied AMSTAR 2 to their sample. Each group also applied either AMSTAR or ROBIS. Fleiss' kappa and Gwet's AC1 were calculated, and agreement between the tools was assessed.
RESULTS: The median kappa values, used as a measure of IRR, indicated moderate agreement for AMSTAR 2 (median = 0.51), substantial agreement for AMSTAR (median = 0.62), and fair agreement for ROBIS (median = 0.27). Validity results showed a positive association between AMSTAR and AMSTAR 2 (r = 0.91) as well as between ROBIS and AMSTAR 2 (r = 0.84). For the overall rating, AMSTAR 2 showed high concordance with ROBIS and lower concordance with AMSTAR.
CONCLUSION: The IRR of AMSTAR 2 was found to be slightly lower than the IRR of AMSTAR and higher than the IRR of ROBIS. Validity measurements indicate that AMSTAR 2 is closely related to both ROBIS and AMSTAR.
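The abstract names Fleiss' kappa and Gwet's AC1 but does not say how they were computed. The following is a minimal Python sketch of the standard textbook formulas for both statistics, assuming complete data (every rater rates every review on a given item). The function names and the example ratings are hypothetical and are not taken from the study. The verbal labels assume the Landis and Koch benchmark scale, which matches the wording used in the results (0.51 moderate, 0.62 substantial, 0.27 fair).

```python
from collections import Counter


def category_counts(ratings, categories):
    """Convert per-review rating lists into per-review counts per category."""
    return [[Counter(item)[c] for c in categories] for item in ratings]


def observed_agreement(counts):
    """Mean proportion of agreeing rater pairs across reviews."""
    n = sum(counts[0])  # raters per review (assumed constant)
    per_item = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts]
    return sum(per_item) / len(per_item)


def category_shares(counts):
    """Overall share of all ratings that fall into each category."""
    total = len(counts) * sum(counts[0])
    return [sum(row[j] for row in counts) / total for j in range(len(counts[0]))]


def fleiss_kappa(counts):
    """Fleiss' kappa; undefined (division by zero) if only one category is ever used."""
    p_o = observed_agreement(counts)
    p_e = sum(p * p for p in category_shares(counts))
    return (p_o - p_e) / (1 - p_e)


def gwet_ac1(counts):
    """Gwet's AC1; same observed agreement, different chance-agreement term."""
    p_o = observed_agreement(counts)
    shares = category_shares(counts)
    p_e = sum(p * (1 - p) for p in shares) / (len(shares) - 1)
    return (p_o - p_e) / (1 - p_e)


def landis_koch_label(value):
    """Verbal label on the Landis and Koch scale (assumed benchmark)."""
    if value < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial")]:
        if value <= upper:
            return label
    return "almost perfect"


if __name__ == "__main__":
    # Hypothetical ratings: 5 reviews, 4 raters, one yes/no appraisal item.
    ratings = [
        ["yes", "yes", "yes", "yes"],
        ["yes", "yes", "no", "yes"],
        ["no", "no", "no", "no"],
        ["yes", "no", "no", "yes"],
        ["no", "no", "no", "yes"],
    ]
    counts = category_counts(ratings, ["yes", "no"])
    kappa = fleiss_kappa(counts)
    print(f"Fleiss' kappa = {kappa:.2f} ({landis_koch_label(kappa)})")
    print(f"Gwet's AC1    = {gwet_ac1(counts):.2f}")
```

Both statistics share the same observed agreement and differ only in how chance agreement is estimated, which is why they can diverge when one response category dominates. In a study like this one, such a computation would be repeated per appraisal item and the median taken across items.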