Anna Noel-Storr (1), Gordon Dooley (2), Lisa Affengruber (3), Gerald Gartlehner (4). 1. Radcliffe Department of Medicine, University of Oxford, Oxford, UK. Electronic address: anna.noel-storr@rdm.ox.ac.uk. 2. Metaxis Ltd, Oxford, UK. 3. Department for Evidence-based Medicine and Evaluation, Danube University Krems, Krems an der Donau, Austria; Department of Family Medicine, Care and Public Health Research Institute, Maastricht University, Maastricht, The Netherlands. 4. Department for Evidence-based Medicine and Evaluation, Danube University Krems, Krems an der Donau, Austria; RTI International, Research Triangle Park, NC, USA.
Abstract
OBJECTIVES: To assess the feasibility of a modified workflow that uses machine learning and crowdsourcing to identify studies for potential inclusion in a systematic review.
STUDY DESIGN AND SETTING: This was a substudy of a larger randomized study; the main study assessed the performance of single versus dual screening of search results. This substudy assessed the performance of a modified version of Cochrane's Screen4Me workflow, which uses crowdsourcing and machine learning, in identifying relevant randomized controlled trials (RCTs) for a published Cochrane review. We included participants who had signed up for the main study but were not eligible for randomization to its two main arms. The records were put through the modified workflow, in which a machine learning classifier divided the data set into "Not RCTs" and "Possible RCTs." The records deemed "Possible RCTs" were then loaded into a task created on the Cochrane Crowd platform, and participants classified those records as either "Potentially relevant" or "Not relevant" to the review. Using a prespecified agreement algorithm, we calculated the performance of the crowd in correctly identifying the studies that were included in the review (sensitivity) and correctly rejecting those that were not (specificity).
RESULTS: The RCT machine learning classifier did not reject any of the included studies. In total, 112 crowd participants were included in this substudy; of these, 81 completed the training module and went on to screen records in the live task. Applying the Cochrane Crowd agreement algorithm, the crowd achieved 100% sensitivity and 80.71% specificity.
CONCLUSIONS: Using a crowd to screen search results for systematic reviews can be an accurate method, provided the agreement algorithm in place is robust.
TRIAL REGISTRATION: Open Science Framework: https://osf.io/3jyqt.
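The abstract does not spell out the prespecified agreement algorithm; Cochrane Crowd tasks typically resolve a record once enough independent classifications agree. As a rough illustration only, here is a minimal Python sketch of one such rule, in which a record is resolved after a run of k consecutive agreeing classifications. The function name `resolve_record` and the threshold `k = 4` are illustrative assumptions, not the study's actual parameters.

```python
from typing import Optional

def resolve_record(classifications: list[str], k: int = 4) -> Optional[str]:
    """Resolve a record once k consecutive crowd classifications agree.

    `classifications` is the ordered list of judgements for one record,
    each "Potentially relevant" or "Not relevant". Returns the agreed
    label, or None if the record is still unresolved (in practice such
    records would be escalated, e.g. to an expert screener).
    NOTE: illustrative rule only; the study's prespecified algorithm is
    described in the full paper.
    """
    run_label: Optional[str] = None
    run_length = 0
    for label in classifications:
        if label == run_label:
            run_length += 1
        else:
            run_label, run_length = label, 1
        if run_length == k:
            return run_label
    return None

# Example: three agreeing votes, one dissent, then four agreeing votes.
votes = ["Not relevant"] * 3 + ["Potentially relevant"] + ["Not relevant"] * 4
assert resolve_record(votes) == "Not relevant"
```

A consecutive-agreement rule of this kind trades throughput for robustness: a single dissenting classification resets the run, so borderline records accumulate more judgements before being resolved.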
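Sensitivity and specificity here follow their standard screening definitions, with the review's final inclusion decisions as the reference standard. A minimal sketch of the arithmetic, assuming crowd and reference labels keyed by record ID (the helper `screening_performance` is hypothetical; no study data are reproduced):

```python
def screening_performance(crowd: dict[str, bool],
                          review: dict[str, bool]) -> tuple[float, float]:
    """Compute crowd sensitivity and specificity against the review's decisions.

    Both dicts map record IDs to booleans: True = relevant/included,
    False = not relevant/excluded. `review` is the reference standard
    (the studies ultimately included in the published Cochrane review).
    """
    tp = sum(crowd[r] and review[r] for r in review)          # included studies the crowd kept
    fn = sum(not crowd[r] and review[r] for r in review)      # included studies the crowd rejected
    tn = sum(not crowd[r] and not review[r] for r in review)  # excluded records the crowd rejected
    fp = sum(crowd[r] and not review[r] for r in review)      # excluded records the crowd kept
    return tp / (tp + fn), tn / (tn + fp)
```

The reported 100% sensitivity corresponds to a false-negative count of zero: no study included in the review was rejected at the crowd stage, which is the critical property for systematic review screening, since crowd-rejected records never reach the author team.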