Frank Soboczenski (1), Thomas A Trikalinos (2), Joël Kuiper (3), Randolph G Bias (4), Byron C Wallace (5), Iain J Marshall (6)
1. School of Population Health & Environmental Sciences, Faculty of Life Sciences and Medicine, King's College London, 3rd Floor, Addison House, Guy's Campus, London, SE1 1UL, UK. frank.soboczenski@kcl.ac.uk.
2. Center for Evidence Synthesis in Health, Brown University, Providence, USA.
3. Vortext Systems, Groningen, Netherlands.
4. School of Information, University of Texas at Austin, Austin, USA.
5. Khoury College of Computer Sciences, Northeastern University, Boston, USA.
6. School of Population Health & Environmental Sciences, Faculty of Life Sciences and Medicine, King's College London, 3rd Floor, Addison House, Guy's Campus, London, SE1 1UL, UK.
Abstract
OBJECTIVE: Assessing risks of bias in randomized controlled trials (RCTs) is an important but laborious task when conducting systematic reviews. RobotReviewer (RR), an open-source machine learning (ML) system, semi-automates bias assessments. We conducted a user study of RobotReviewer, evaluating the time saved and the usability of the tool.
MATERIALS AND METHODS: Systematic reviewers applied the Cochrane Risk of Bias tool to four randomly selected RCT articles. Reviewers judged whether an RCT was at low or high/unclear risk of bias for each bias domain in the Cochrane tool (Version 1), and highlighted the article text justifying their decision. For a random two of the four articles, the process was semi-automated: users were provided with ML-suggested bias judgments and text highlights, which they could amend if necessary. We recorded the time taken for the task and the ML suggestions, measured usability via the System Usability Scale (SUS), and collected qualitative feedback.
RESULTS: For 41 volunteers, semi-automation was quicker than manual assessment (mean 755 vs. 824 s; relative time 0.75, 95% CI 0.62-0.92). Reviewers accepted 301/328 (91%) of the ML Risk of Bias (RoB) judgments and 202/328 (62%) of the text highlights without change. Overall, ML-suggested text highlights had a recall of 0.90 (SD 0.14) and a precision of 0.87 (SD 0.21) with respect to the users' final versions. Reviewers assigned the system a mean SUS score of 77.7, corresponding to a rating between "good" and "excellent".
CONCLUSIONS: Semi-automation (where humans validate machine learning suggestions) can improve the efficiency of evidence synthesis. Our system was rated highly usable and expedited bias assessment of RCTs.