Allison Gates, Michelle Gates, Daniel DaRosa, Sarah A Elliott, Jennifer Pillay, Sholeh Rahman, Ben Vandermeer, Lisa Hartling.
Abstract
BACKGROUND: We evaluated the benefits and risks of using the Abstrackr machine learning (ML) tool to semi-automate title-abstract screening and explored whether Abstrackr's predictions varied by review or study-level characteristics.
Keywords: Artificial intelligence; Efficiency; Machine learning; Methods; Systematic reviews; Text mining
Year: 2020 PMID: 33243276 PMCID: PMC7694314 DOI: 10.1186/s13643-020-01528-x
Source DB: PubMed Journal: Syst Rev ISSN: 2046-4053
Characteristics of the included reviews
| Review name | Review type | Research question type | Intervention/exposure type | Included study designs |
|---|---|---|---|---|
| | Systematic | | | |
| | Systematic | | | |
| | Systematic | | | |
| | Rapid | | | |
| | Systematic | | | |
| | Rapid | | | |
| | Rapid | | | |
| | Systematic | | | |
| | Systematic | | | |
| | Systematic | | | |
| | Systematic | | | |
| | Rapid | | | |
| | Systematic | | | |
| | Systematic | | | |
| | Systematic | | | |
| | Rapid | | | |
nRCT non-randomized controlled trial, RCT randomized controlled trial, UTI urinary tract infection, VBAC vaginal birth after cesarean section
Screening characteristics of the included reviews
| Review | Screening workload (human reviewers), n | Included at title and abstract (human reviewers), n (%) | Included in final report (human reviewers), n (%) | Training set in Abstrackr, included/excluded (% included) | Predicted relevant in Abstrackr, n (%) |
|---|---|---|---|---|---|
| Activity and pregnancy | 2928 | 236 (8) | 98 (3) | 10/190 (5) | 319 (12) |
| Antipsychotics | 12156 | 1177 (10) | 127 (1) | 15/185 (8) | 2117 (18) |
| Brain injury | 6262 | 518 (8) | 40 (1) | 11/189 (6) | 2126 (35) |
| Concussion | 1439 | 46 (3) | 5 (< 1) | 3/197 (2) | 638 (51) |
| Diabetes | 47141 | 698 (1) | 205 (< 1) | 104/196 (52) | 5187 (11) |
| Digital technologies for pain | 2662 | 207 (8) | 64 (2) | 15/185 (8) | 321 (13) |
| Experiences of bronchiolitis | 651 | 88 (14) | 28 (4) | 13/187 (7) | 111 (25) |
| Experiences of UTIs | 1493 | 25 (2) | 4 (< 1) | 3/197 (2) | 864 (67) |
| Treatments for bronchiolitis | 5861 | 518 (9) | 137 (2) | 12/188 (6) | 656 (12) |
| VBAC | 5092 | 807 (16) | 21 (< 1) | 25/175 (13) | 1490 (30) |
| Visual acuity | 11229 | 224 (2) | 1 (< 1) | 4/296 (1) | 3639 (33) |
| Community gardening | 1536 | 153 (10) | 32 (2) | 55/145 (28) | 139 (10) |
| Depression safety | 964 | 44 (5) | 8 (1) | 7/193 (4) | 449 (59) |
| Depression treatments | 1583 | 418 (26) | 179 (11) | 43/157 (22) | 904 (65) |
| Preterm delivery | 451 | 96 (21) | 34 (8) | 47/153 (24) | 95 (38) |
| Workplace stress | 767 | 141 (18) | 59 (8) | 36/164 (18) | 210 (37) |
UTI urinary tract infection, VBAC vaginal birth after cesarean
a Retrospective screening data
b The training set was 200 records for all reviews except diabetes and visual acuity, for which it was 300
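The percentages in the "Predicted relevant" column appear to use the records remaining after the training set as the denominator. This is an inference from the reported numbers, not something the table states, and the function name below is hypothetical. A minimal sketch of that arithmetic:

```python
def predicted_relevant_pct(workload: int, training_size: int, predicted: int) -> int:
    """Percent of non-training records predicted relevant.

    Assumption: the denominator is the screening workload minus the
    training set (200 or 300 records, per footnote b).
    """
    remaining = workload - training_size
    return round(100 * predicted / remaining)


# (screening workload, training set size, predicted relevant), taken from the table
rows = {
    "Activity and pregnancy": (2928, 200, 319),
    "Diabetes": (47141, 300, 5187),
    "Experiences of UTIs": (1493, 200, 864),
}

for review, values in rows.items():
    print(review, predicted_relevant_pct(*values))
```

For these three reviews the sketch gives 12, 11, and 67, matching the table.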
Proportion missed, workload savings, and estimated time savings for each systematic review
| Systematic review | Records missed, single reviewer, n (%) | Records missed, simulation, n (%) | Workload savings, n (%) | Estimated time savings, h (d) |
|---|---|---|---|---|
| Activity and pregnancy | 11 (11) | 1 (1) | 2536 (43) | 21 h (3 d) |
| Antipsychotics | 4 (3) | 3 (2) | 10508 (43) | 88 h (11 d) |
| Brain injury | 2 (5) | 0 (0) | 4193 (33) | 35 h (4 d) |
| Concussion | 0 (0) | 0 (0) | 635 (22) | 5 h (< 1 d) |
| Digital technologies for pain | 0 (0) | 0 (0) | 2271 (43) | 19 h (2 d) |
| Experiences of bronchiolitis | 12 (43) | 3 (11) | 389 (30) | 3 h (< 1 d) |
| Experiences of UTIs | 0 (0) | 0 (0) | 448 (15) | 4 h (< 1 d) |
| Treatments for bronchiolitis | 10 (7) | 1 (1) | 5300 (45) | 44 h (6 d) |
| VBAC | 5 (24) | 3 (14) | 3750 (37) | 31 h (4 d) |
| Visual acuity | 0 (0) | 0 (0) | 7418 (33) | 62 h (8 d) |
d days, h hours, UTI urinary tract infection, VBAC vaginal birth after cesarean
The diabetes review was excluded because the screening data were not in a format amenable to analysis
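The savings figures can be reproduced from the screening workloads in the preceding table under three assumptions that are inferred from the numbers rather than stated here: workload savings are expressed relative to dual independent screening (two reviewers per record), screening takes 0.5 minutes per record, and a work day is 8 hours. The constants and function names below are hypothetical. A minimal sketch:

```python
MINUTES_PER_RECORD = 0.5  # assumed screening rate, inferred from the table
HOURS_PER_DAY = 8         # assumed length of a work day
REVIEWERS = 2             # assumed dual independent screening


def savings(workload: int, records_saved: int) -> tuple[int, int, int]:
    """Return (workload savings %, hours saved, days saved), each rounded."""
    pct = round(100 * records_saved / (REVIEWERS * workload))
    hours = records_saved * MINUTES_PER_RECORD / 60
    days = hours / HOURS_PER_DAY
    return pct, round(hours), round(days)


def missed_pct(missed: int, included_final: int) -> int:
    """Records missed as a percent of those included in the final report."""
    return round(100 * missed / included_final)


# A workload of 2928 with 2536 records saved (first row of the table)
print(savings(2928, 2536))
# 12 records missed out of 28 included in the final report
print(missed_pct(12, 28))
```

With a workload of 2928 and 2536 records saved, the sketch returns 43%, 21 h, and 3 d, which agrees with the first row. The same assumptions reproduce the other rows when paired with the workloads of the systematic reviews in the order they are listed in the screening table, with diabetes omitted.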
Select review characteristics, stratified by Abstrackr’s relevance predictions
| Review characteristic | All studies, n | Correctly predicted as relevant, n (%) | Incorrectly predicted as irrelevant, n (%) | P value (a) |
|---|---|---|---|---|
| Review type: Systematic | 689 | 601 (87) | 88 (13) | 0.37 |
| Review type: Rapid | 113 | 95 (84) | 18 (16) | |
| Research question type: Single | 297 | 246 (83) | 51 (17) | 0.01 |
| Research question type: Multiple | 505 | 450 (89) | 55 (11) | |
| Intervention/exposure type: Simple | 403 | 346 (86) | 57 (14) | 0.47 |
| Intervention/exposure type: Complex | 399 | 350 (88) | 49 (12) | |
| Included study designs: Single (only randomized trials) | 129 | 122 (95) | 7 (5) | 0.003 |
| Included study designs: Single (only systematic reviews) | 72 | 57 (79) | 15 (21) | |
| Included study designs: Multiple | 601 | 517 (86) | 84 (14) | |
a Fisher's exact test
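For the two-row comparisons, Fisher's exact test can be sketched with the standard library alone. The software and exact procedure used for the reported P values are not given here, so this is only an illustration of the two-sided test on a 2x2 table; the function name is hypothetical, and comparisons with three or more rows would need a generalized version.

```python
from math import comb


def fisher_exact_2x2(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        # Hypergeometric probability of k in the top-left cell with margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is no more likely than the observed one
    # (small tolerance guards against floating-point ties).
    total = sum(
        p for p in map(prob, range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


# Systematic reviews: 601 correct, 88 incorrect; rapid reviews: 95 correct, 18 incorrect
p_value = fisher_exact_2x2(601, 88, 95, 18)
print(f"{p_value:.2f}")
```

The table reports 0.37 for this comparison; small differences are possible if the original analysis used a different two-sided convention.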
Study design and study-level risk of bias, stratified by Abstrackr’s relevance predictions
| Study characteristic | All studies, n | Correctly predicted as relevant, n (%) | Incorrectly predicted as irrelevant, n (%) | P value (a) |
|---|---|---|---|---|
| Study design: Trial | 483 | 438 (91) | 45 (9) | 0.0006 |
| Study design: Observational | 214 | 169 (79) | 45 (21) | |
| Study design: Mixed methods | 2 | 2 (100) | 0 (0) | |
| Study design: Qualitative | 15 | 14 (93) | 1 (7) | |
| Study design: Review | 88 | 73 (83) | 15 (17) | |
| Risk of bias: Low | 120 | 96 (80) | 24 (20) | 0.039 |
| Risk of bias: High or unclear | 500 | 438 (88) | 62 (12) | |
a Fisher's exact test
Publication year and journal impact factor, stratified by Abstrackr’s relevance predictions
| Study characteristic | All studies | Correctly predicted as relevant | Incorrectly predicted as irrelevant | Mean difference (95% CI) (a) | P value (b) |
|---|---|---|---|---|---|
| Publication year | 2008 (7) | 2008 (7) | 2006 (10) | 1.77 (0.27, 3.26) | 0.02 |
| Journal impact factor | 4.87 (8.49) | 4.91 (8.39) | 4.61 (9.14) | 0.30 (−1.44, 2.03) | 0.74 |
a Mean difference between correctly identified studies and those erroneously excluded
b Unpaired t test