Allison Gates<sup>1</sup>, Cydney Johnson<sup>1</sup>, Lisa Hartling<sup>2</sup>.
Abstract
BACKGROUND: Machine learning tools can expedite systematic review (SR) processes by semi-automating citation screening. Abstrackr semi-automates citation screening by predicting relevant records. We evaluated its performance for four screening projects.
Keywords: Automation; Machine learning; Methodology; Systematic review
Year: 2018 PMID: 29530097 PMCID: PMC5848519 DOI: 10.1186/s13643-018-0707-8
Source DB: PubMed Journal: Syst Rev ISSN: 2046-4053
PICOS (participants, interventions, comparators, outcomes, study design) characteristics of the screening projects
| Characteristic | Bronchiolitis | Antipsychotics | Diabetes | Child Health SRs |
|---|---|---|---|---|
| Participants | Infants ≤ 24 months | Children and young adults ≤ 24 years | Any age | Children ≤ 18 years |
| Intervention | Pharmacologic | Pharmacologic | Multicomponent behavioral program | Any |
| Comparator | Placebo; active pharmacologic comparator | Placebo; no treatment; active pharmacologic comparator | Usual or standard care; active comparator | Any (including non-comparative SRs) |
| Outcomes | Rate of admission or length of stay; change in clinical severity score; oxygen saturation; respiratory rate; heart rate; symptoms; QoL; pulmonary function | Intermediate effectiveness outcomes; adverse effects | Behavioral; clinical; health (e.g., quality of life); diabetes-related health care utilization; program acceptability; harms | Health outcomes relevant to children, including the accuracy of diagnostic tests and outcomes measured in adults related to exposures during childhood |
| Study design | RCTs | RCTs; NRCTs; controlled cohort studies; controlled before-after studies | RCTs; NRCTs; prospective comparative studies; prospective cohort studies; controlled before-after studies | Non-Cochrane: SRs; meta-analyses; network meta-analyses; individual patient data meta-analyses |
NRCT non-randomized controlled trial, QoL quality of life, RCT randomized controlled trial, SR systematic review
Screening workload and proportion of records included by screening project, as performed by the human reviewers
| Screening characteristics | Antipsychotics | Bronchiolitis | Child Health SRs | Diabetes |
|---|---|---|---|---|
| Records retrieved by the searches, n | 12,763 | 5893 | 5243 | 47,141 |
| Accepted after title and abstract screening<sup>a</sup>, n (%) | 808 (6.3) | 520 (8.8) | 3143 (59.9) | 698 (1.5) |
| Accepted after full-text screening<sup>b</sup>, n (%) | 135 (1.1) | 155 (2.6) | 1598 (30.5) | 205 (0.4) |
SR systematic review
<sup>a</sup>Based on dual independent screening by two human reviewers
<sup>b</sup>Records included in the final report
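The percentages in parentheses can be reproduced from the counts in the table. The following is a minimal sketch, assuming each percentage is the accepted count divided by the records retrieved for that project. The dictionary layout and the helper name `percent_of_retrieved` are illustrative and do not come from the article.

```python
# Counts transcribed from the screening workload table above.
projects = {
    "Antipsychotics": {"retrieved": 12763, "title_abstract": 808, "full_text": 135},
    "Bronchiolitis": {"retrieved": 5893, "title_abstract": 520, "full_text": 155},
    "Child Health SRs": {"retrieved": 5243, "title_abstract": 3143, "full_text": 1598},
    "Diabetes": {"retrieved": 47141, "title_abstract": 698, "full_text": 205},
}


def percent_of_retrieved(count: int, retrieved: int) -> float:
    """Return count as a percentage of retrieved records, to one decimal place.

    Assumption: the denominator is the number of records retrieved by the searches.
    """
    return round(100 * count / retrieved, 1)


# Map each project to (% accepted at title/abstract, % accepted at full text).
inclusion_rates = {
    name: (
        percent_of_retrieved(counts["title_abstract"], counts["retrieved"]),
        percent_of_retrieved(counts["full_text"], counts["retrieved"]),
    )
    for name, counts in projects.items()
}

for name, (title_abstract_pct, full_text_pct) in inclusion_rates.items():
    print(f"{name}: {title_abstract_pct}% after title/abstract, {full_text_pct}% after full text")
```

Under this assumption the computed values agree with the tabulated percentages for all four projects, which suggests that records retrieved is the denominator used throughout the table.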
Descriptive characteristics of the title and abstract screening processes in Abstrackr, across three trials
| Characteristic | Antipsychotics | Bronchiolitis | Child Health SRs | Diabetes |
|---|---|---|---|---|
| Screened by human<sup>b</sup> | | | | |
| N records | 277 (32) | 607 (340) | 210 (10) | 323 (206) |
| % records | 2.2 (0.3) | 10.3 (5.8) | 4.0 (0.2) | 0.7 (0.4) |
| Accepted by human<sup>c</sup> | | | | |
| N records | 19 (3) | 56 (35) | 118 (20) | 111 (74) |
| % records | 6.9 (1.1) | 9.0 (0.9) | 56.1 (6.9) | 34.1 (1.6) |
| Predicted as relevant by Abstrackr<sup>d</sup> | | | | |
| N records | 4259 (1281) | 1163 (123) | 4535 (173) | 5187 (1430) |
| % records | 34.1 (10.2) | 22.0 (0.9) | 90.1 (3.6) | 11.0 (3.0) |
All values are mean (SD) across three trials. Standard deviations for proportions (% records) relate to the range of values observed across trials, and not the mean variance across trials
SR systematic review
<sup>a</sup>Included some duplicates as three EndNote libraries were combined to create the dataset
<sup>b</sup>Before Abstrackr produced predictions
<sup>c</sup>Based on the decisions of two independent human reviewers for each screening project
<sup>d</sup>Records that Abstrackr predicted as relevant for further inspection following title and abstract screening (equivalent to “accepted as relevant”)
Fig. 1 Abstrackr’s mean sensitivity and specificity across three trials for each project
Abstrackr’s mean performance across three trials for each of the screening projects
| Performance metric | Antipsychotics | Bronchiolitis | Child Health SRs | Diabetes |
|---|---|---|---|---|
| Precision, % (SD) | 15.1 (2.6) | 38.1 (2.6) | 64.7 (2.0) | 14.8 (2.6) |
| False negative rate, % (SD) | 21.2 (8.3) | 7.3 (2.2) | 3.5 (1.4) | 17.9 (2.3) |
| Proportion missed, % (SD) | 0.1 (0.1) | 0.1 (0.1) | 6.4 (1.7) | 0.1 (0.01) |
| Workload savings, % (SD) | 64.5 (9.8) | 70.0 (3.7) | 9.5 (3.5) | 88.4 (2.7) |
Standard deviations for proportions relate to the range of values observed across trials and not the mean variance across trials
SD standard deviation
<sup>a</sup>Included some duplicates, as three EndNote libraries were combined to create the dataset
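This excerpt does not define how workload savings was calculated. The sketch below tests one plausible definition, stated here as an assumption: the records that Abstrackr did not predict as relevant (and that humans therefore would not screen), as a percentage of all records retrieved. The inputs are the mean counts from the tables above. The function name `workload_savings` and the data layout are hypothetical.

```python
# Mean values across three trials, transcribed from the tables above.
trial_means = {
    "Antipsychotics": {"retrieved": 12763, "screened": 277, "predicted_relevant": 4259, "reported_savings": 64.5},
    "Bronchiolitis": {"retrieved": 5893, "screened": 607, "predicted_relevant": 1163, "reported_savings": 70.0},
    "Child Health SRs": {"retrieved": 5243, "screened": 210, "predicted_relevant": 4535, "reported_savings": 9.5},
    "Diabetes": {"retrieved": 47141, "screened": 323, "predicted_relevant": 5187, "reported_savings": 88.4},
}


def workload_savings(retrieved: int, screened: int, predicted_relevant: int) -> float:
    """Percentage of all records that would not need manual screening.

    Assumed definition (not stated in the excerpt): records neither screened
    by a human before prediction nor predicted as relevant, over all records.
    """
    predicted_irrelevant = retrieved - screened - predicted_relevant
    return 100 * predicted_irrelevant / retrieved


for name, row in trial_means.items():
    estimate = workload_savings(row["retrieved"], row["screened"], row["predicted_relevant"])
    print(f"{name}: estimated {estimate:.1f}%, reported {row['reported_savings']}%")
```

Estimates computed from mean counts need not match the reported means exactly, because a ratio of means differs from a mean of per-trial ratios. Under the assumed definition, the estimates fall within 0.2 percentage points of the reported values for all four projects.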