Allison Gates<sup>1</sup>, Samantha Guitard<sup>1</sup>, Jennifer Pillay<sup>1</sup>, Sarah A Elliott<sup>1</sup>, Michele P Dyson<sup>1</sup>, Amanda S Newton<sup>2</sup>, Lisa Hartling<sup>3</sup>.
Abstract
BACKGROUND: We explored the performance of three machine learning tools designed to facilitate title and abstract screening in systematic reviews (SRs) when used to (a) eliminate irrelevant records (automated simulation) and (b) complement the work of a single reviewer (semi-automated simulation). We evaluated user experiences for each tool.
Keywords: Automation; Machine learning; Systematic reviews; Usability; User experience
Year: 2019; PMID: 31727150; PMCID: PMC6857345; DOI: 10.1186/s13643-019-1222-2
Source DB: PubMed; Journal: Syst Rev; ISSN: 2046-4053
Population, intervention, comparator, outcome, and study design (PICOS) criteria for the systematic reviews
| Criteria | Antipsychotics | Bronchiolitis | Visual Acuity |
|---|---|---|---|
| Population | Children and young adults aged ≤ 24 years experiencing a psychiatric disorder or behavioral issues outside the context of a disorder | Infants and young children aged < 24 months experiencing their first episode of wheeze or diagnosed with bronchiolitis or RSV | Community-dwelling adults aged ≥ 65 years with unrecognized impaired visual acuity or vision-related functional limitations |
| Intervention | Any Food and Drug Administration-approved first- or second-generation antipsychotic | Any bronchodilator, any corticosteroid, hypertonic saline, oxygen therapy, antibiotics, heliox | Vision screening tests (alone or within multicomponent screening/assessment) performed by primary healthcare professionals |
| Comparators | Placebo, no treatment, any other antipsychotic, the same antipsychotic in a different dose | Placebo, usual care, no treatment, normal saline, or another intervention of interest | No screening, delayed screening, attention control, screening involving all components of intervention except vision component, usual care |
| Outcomes | Intermediate and effectiveness outcomes, adverse effects and major adverse effects, adverse effects limiting treatment, specific adverse events, persistence and reversibility of adverse effects | Outpatient admissions, inpatient length of stay, change in clinical score, oxygen saturation, respiratory rate, heart rate, pulmonary function, adverse events, escalation of care, length of illness, duration of oxygen therapy | Benefits (e.g., mortality, adverse consequences of poor vision), harms (e.g., serious adverse events), implementation factors (e.g., uptake of referrals) |
| Study designs | RCTs and nRCTs, controlled cohort studies, controlled before-after studies | RCTs | RCTs, controlled experimental and observational studies |
nRCT: non-randomized controlled trial; RCT: randomized controlled trial; RSV: respiratory syncytial virus
Characteristics of the reviews and screening predictions for each tool
| Characteristic | Antipsychotics, n (%) | Bronchiolitis, n (%) | Visual Acuity, n (%) |
|---|---|---|---|
| Screening workload<sup>a</sup> | 12,156 | 5861 | 11,229 |
| Included by title/abstract<sup>b</sup> | 1178 (10) | 518 (9) | 224 (2) |
| Included in the review<sup>b</sup> | 127 (1) | 137 (2) | 1 (< 1) |
| Includes/excludes in training set | Abstrackr, 15/185 | Abstrackr, 12/188 | Abstrackr<sup>c</sup>, 4/296 |
| | DistillerSR, 14/186 | DistillerSR, 14/186 | DistillerSR, 2/198 |
| | RobotAnalyst, 20/180 | RobotAnalyst, 15/185 | RobotAnalyst, 3/197 |
| Screened by tool<sup>d</sup> | 11,956 (98) | 5661 (97) | 11,029 (98) |
| Predicted relevant by Abstrackr | 2117 (18) | 656 (12) | 3639 (33) |
| Predicted relevant by DistillerSR | 7 (< 1) | 83 (1) | 0 (0) |
| Predicted relevant by RobotAnalyst | 3488 (29) | 1082 (19) | 3221 (29) |
<sup>a</sup> Total number of records retrieved via the electronic searches. Each record was screened by two reviewers.
<sup>b</sup> Included following the initial screening by two independent reviewers (retrospective).
<sup>c</sup> All training sets were 200 records, with the exception of the Visual Acuity review, which required a 300-record training set in Abstrackr before predictions were produced.
<sup>d</sup> After a 200-record training set.
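The percentages in the "Predicted relevant" rows appear to use the "Screened by tool" counts as their denominators. The following minimal Python sketch recomputes them from the counts in the table. The helper name `format_cell` and the rule of printing "< 1" for non-zero values below 1% are illustrative assumptions chosen to match the table's presentation; they are not taken from the authors' analysis code.

```python
# Counts copied from the table above; "screened" is the "Screened by tool" row.
screened = {"Antipsychotics": 11956, "Bronchiolitis": 5661, "Visual Acuity": 11029}
predicted_relevant = {
    "Abstrackr": {"Antipsychotics": 2117, "Bronchiolitis": 656, "Visual Acuity": 3639},
    "DistillerSR": {"Antipsychotics": 7, "Bronchiolitis": 83, "Visual Acuity": 0},
    "RobotAnalyst": {"Antipsychotics": 3488, "Bronchiolitis": 1082, "Visual Acuity": 3221},
}


def format_cell(count, denominator):
    """Format a count as 'n (percent)', mirroring the table's '< 1' convention.

    Hypothetical helper: the '< 1' rule for small non-zero percentages is an
    assumption inferred from how the table is printed.
    """
    pct = 100 * count / denominator
    if 0 < pct < 1:
        return f"{count} (< 1)"
    return f"{count} ({round(pct)})"


# Print one line per tool and review for comparison with the table.
for tool, by_review in predicted_relevant.items():
    for review, count in by_review.items():
        print(f"{tool:12s} {review:15s} {format_cell(count, screened[review])}")
```

Comparing the printed cells with the table is a quick check that the rows were transcribed consistently.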
Fig. 1 Proportion missed (percent) by tool and systematic review, automated simulation
Fig. 2 Workload savings (percent) by tool and systematic review, automated simulation
Fig. 3 Estimated time savings (days) by tool and systematic review, automated simulation
Fig. 4 Proportion missed (percent) by tool and systematic review, semi-automated simulation
Fig. 5 Workload savings (percent) by tool and systematic review, semi-automated simulation
Fig. 6 Estimated time savings (days) by tool and systematic review, semi-automated simulation
System Usability Scale responses for each item, per tool<sup>a</sup>
| Item | Abstrackr | DistillerSR | RobotAnalyst |
|---|---|---|---|
| I think that I would like to use the tool frequently | 3.5 (1) | 4 (0.5) | 1 (1) |
| I found the tool to be unnecessarily complex | 2 (1) | 3.5 (1.25) | 3 (0.5) |
| I thought the tool was easy to use | 4 (1.25) | 2.5 (2) | 2 (1.5) |
| I think that I would need the support of a technical person to be able to use the tool | 1 (1) | 2.5 (1.25) | 4 (1.25) |
| I found the various functions in the tool were well integrated | 4 (1.25) | 3.5 (2.25) | 3 (1.25) |
| I thought there was too much inconsistency in the tool | 2 (0.25) | 1 (1.25) | 4 (1.25) |
| I would imagine that most people would learn to use the tool very quickly | 4.5 (1) | 3 (1.25) | 3 (0.25) |
| I found the tool very cumbersome to use | 2 (0.5) | 3 (1.25) | 5 (0) |
| I felt very confident using the tool | 4 (1) | 3.5 (1.25) | 2 (2.25) |
| I needed to learn a lot of things before I could get going with the tool | 2 (0.25) | 3 (0.5) | 2.5 (1) |
| Overall score (/100) | 79 (23) | 64 (31) | 31 (8) |
<sup>a</sup> Likert-like scale: 1 = strongly disagree, 3 = neutral, and 5 = strongly agree. Values represent the median (interquartile range) of responses.
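For readers unfamiliar with how an overall score out of 100 arises from ten 1-to-5 items, the sketch below applies the standard System Usability Scale scoring rule: odd-numbered (positively worded) items contribute the response minus 1, even-numbered (negatively worded) items contribute 5 minus the response, and the sum is multiplied by 2.5. The record does not state the scoring procedure, so treating this as the method used here is an assumption; the function name `sus_score` is illustrative.

```python
def sus_score(responses):
    """Return the System Usability Scale score (0-100) for one respondent.

    `responses` holds the ten item ratings (1-5) in questionnaire order.
    This follows the standard SUS rule; that the table's overall scores were
    derived this way is an assumption, not stated in the record.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    total = 0.0
    for item_number, rating in enumerate(responses, start=1):
        if not 1 <= rating <= 5:
            raise ValueError("each rating must lie between 1 and 5")
        if item_number % 2 == 1:
            total += rating - 1   # positively worded item
        else:
            total += 5 - rating   # negatively worded item
    return total * 2.5


# Item medians for Abstrackr, copied from the table above (illustration only).
abstrackr_item_medians = [3.5, 2, 4, 1, 4, 2, 4.5, 2, 4, 2]
print(sus_score(abstrackr_item_medians))
```

Applying the rule to the Abstrackr item medians gives 77.5 rather than the tabulated 79. That is expected if the overall score is a median of per-respondent scores rather than a score computed from item medians, since the two need not coincide.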