Iain J Marshall, Byron C Wallace.
Abstract
Technologies and methods to speed up the production of systematic reviews by reducing the manual labour involved have recently emerged. Automation has been proposed or used to expedite most steps of the systematic review process, including search, screening, and data extraction. However, how these technologies work in practice and when (and when not) to use them is often not clear to practitioners. In this practical guide, we provide an overview of current machine learning methods that have been proposed to expedite evidence synthesis. We also offer guidance on which of these are ready for use, their strengths and weaknesses, and how a systematic review team might go about using them in practice.
Keywords: Evidence synthesis; Machine learning; Natural language processing
Year: 2019 PMID: 31296265 PMCID: PMC6621996 DOI: 10.1186/s13643-019-1074-9
Source DB: PubMed Journal: Syst Rev ISSN: 2046-4053
Fig. 1 Classifying text using machine learning, in this example logistic regression with a ‘bag of words’ representation of the texts. The system is ‘trained’, learning a coefficient (or weight) for each unique word in a manually labelled set of documents (typically in the 1000s). In use, the learned coefficients are used to predict a probability for an unknown document
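The prediction step in Fig. 1 can be sketched in a few lines. The coefficient values below are invented for illustration (a trained model would have one learned weight per vocabulary word, often tens of thousands):

```python
import math

# Hypothetical learned coefficients for a handful of words; real models
# learn one weight per unique word in the training corpus.
coef = {"random": 2.1, "randomized": 1.8, "trial": 0.9, "systematic": -2.4}
intercept = -1.0

def predict_proba(text: str) -> float:
    """Score a document: sum the weights of the words it contains,
    then squash the score through the logistic (sigmoid) function."""
    words = set(text.lower().split())  # 'bag of words': presence only, order ignored
    score = intercept + sum(w for token, w in coef.items() if token in words)
    return 1.0 / (1.0 + math.exp(-score))

p = predict_proba("Patients were randomized to treatment or placebo")
# 'randomized' contributes +1.8, giving a probability of roughly 0.69
```

The same function gives a low probability for "A systematic review of trials", because the large negative weight on 'systematic' dominates the score.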
Fig. 2 Bag of words modelling for classifying RCTs. Top left: Example of bag of words for three articles. Each column represents a unique word in the corpus (a real example would likely contain columns for 10,000s of words). Top right: Document labels, where 1 = relevant and 0 = irrelevant. Bottom: Coefficients (or weights) are estimated for each word (in this example using logistic regression). Here, large positive weights increase the predicted probability that an unseen article is an RCT when it contains the words ‘random’ or ‘randomized’. The presence of the word ‘systematic’ (with a large negative weight) would reduce the predicted probability that an unseen document is an RCT
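The training side of Fig. 2 can also be sketched. This toy example fits logistic regression by plain gradient descent on four invented document labels; the corpus and hyperparameters are illustrative only, not the authors' setup:

```python
import math

# Tiny labelled corpus: 1 = RCT report, 0 = not an RCT (invented examples).
docs = [
    ("patients were randomized to aspirin or placebo", 1),
    ("a random allocation sequence was generated", 1),
    ("a systematic review of blood pressure trials", 0),
    ("systematic review and meta analysis of statins", 0),
]

vocab = sorted({w for text, _ in docs for w in text.split()})

def featurize(text):
    """Binary bag-of-words vector: 1 if the vocabulary word is present."""
    words = set(text.split())
    return [1.0 if w in words else 0.0 for w in vocab]

X = [featurize(text) for text, _ in docs]
y = [label for _, label in docs]

# Fit logistic regression coefficients by stochastic gradient descent.
weights = [0.0] * len(vocab)
bias = 0.0
lr = 0.5
for _ in range(500):
    for xi, yi in zip(X, y):
        z = bias + sum(w * x for w, x in zip(weights, xi))
        p = 1.0 / (1.0 + math.exp(-z))
        err = p - yi
        bias -= lr * err
        weights = [w - lr * err * x for w, x in zip(weights, xi)]

w = dict(zip(vocab, weights))
# As in Fig. 2: 'random'/'randomized' end up with positive weights,
# 'systematic' with a negative weight.
```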
Fig. 3 Schematic of a typical data extraction process. The above illustration concerns the example task of extracting the study sample size. In general, these tasks involve labelling individual words. The word (or ‘token’) at position t is represented by a vector. This representation may encode which word is at this position and likely also communicates additional features, e.g. whether the word is capitalized or if the word is (inferred to be) a noun. Models for these kinds of tasks attempt to assign labels to all T words in a document and for some tasks will attempt to maximize the joint likelihood of these labels to capitalize on correlations between adjacent labels
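A minimal, rule-based stand-in for the token labelling in Fig. 3: each token is checked against simple features (is it a number? what are its neighbouring words?) and assigned a label. Real systems learn these decisions from annotated text rather than using hand-written rules; the keyword list here is an assumption for illustration:

```python
import re

def label_sample_size(text: str):
    """Label each token 'N' if it looks like a sample size, else 'O'."""
    tokens = text.split()
    labels = ["O"] * len(tokens)  # 'O' = outside any entity of interest
    for t, tok in enumerate(tokens):
        if re.fullmatch(r"\d+", tok):
            # Neighbouring tokens serve as context features for position t.
            window = tokens[max(0, t - 1): t + 2]
            if any(w.lower() in {"patients", "participants", "subjects"} for w in window):
                labels[t] = "N"  # candidate sample-size token
    return list(zip(tokens, labels))

tagged = label_sample_size("We enrolled 250 patients across 12 centres")
# '250' is labelled 'N' (adjacent to 'patients'); '12' stays 'O'
```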
Fig. 4 Typical workflow for semi-automated abstract screening. The asterisk indicates that with uncertainty sampling, the articles predicted with least certainty are presented first. Labelling these first aims to improve model accuracy with fewer manual labels
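The uncertainty-sampling step marked in Fig. 4 amounts to a simple ordering rule, sketched here (article IDs and probabilities are invented): given model-predicted relevance probabilities for unscreened abstracts, present first those closest to 0.5, where the model is least sure.

```python
def uncertainty_order(predictions):
    """Sort (article_id, predicted_probability) pairs so the least
    certain predictions (probability nearest 0.5) come first."""
    return sorted(predictions, key=lambda item: abs(item[1] - 0.5))

queue = uncertainty_order([("a1", 0.97), ("a2", 0.52), ("a3", 0.08), ("a4", 0.40)])
# a2 (0.52) and a4 (0.40) are shown to the reviewer first; the model is
# already confident about a1 and a3, so labelling them teaches it less.
```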
Examples of machine learning systems available for use in systematic reviews
| Task | Example tools | Comments |
|---|---|---|
| Search—finding RCTs | RobotSearch, Cochrane Register of Studies, RCT tagger | • Validated machine learning filters available for identifying RCTs and suitable for fully automatic use • Conventional topic-specific keyword search strategy still needed • No widely available tools for non-RCT designs currently |
| Search—literature exploration | Thalia | Allows search of PubMed for concepts (i.e. chemicals, diseases, drugs, genes, metabolites, proteins, species, and anatomical entities) |
| Screening | Abstrackr, EPPI-Reviewer, RobotAnalyst, SWIFT-Review, Colandr, Rayyan | • Screening systems automatically sort a search retrieval by relevance • RobotAnalyst and SWIFT-Review also allow … |
| Data extraction | ExaCT, RobotReviewer, NaCTeM text mining tools for automatically extracting concepts relating to genes and proteins (NEMine), yeast metabolites (Yeast MetaboliNER), and anatomical entities (AnatomyTagger) | • These prototype systems automatically extract data elements (e.g. sample sizes, descriptions of PICO elements) from free text |
| Bias assessment | RobotReviewer | • Automatic assessment of biases in reports of RCTs • System recommended for … |
Machine learning: computer algorithms which ‘learn’ to perform a specific task through statistical modelling of (typically large amounts of) data
Natural language processing: computational methods for automatically processing and analysing ‘natural’ (i.e. human) language texts
Text classification: automated categorization of documents into groups of interest
Data extraction: the task of identifying key bits of structured information from texts
Crowd-sourcing: decomposing work into micro-tasks
Micro-tasks: discrete units of work that together complete a larger undertaking
Semi-automation: using machine learning to …
Human-in-the-loop: workflows in which humans remain involved, rather than being replaced
Supervised learning: estimating model parameters using manually labelled data
Distantly supervised: learning from pseudo, noisy ‘labels’ derived automatically by applying rules to existing databases or other structured data
Unsupervised: learning without any labels (e.g. clustering data)