Guy Tsafnat, Paul Glasziou, George Karystianis, Enrico Coiera.
Abstract
BACKGROUND: Screening candidate studies for inclusion in a systematic review is time-consuming when conducted manually. Automation tools could reduce the human effort devoted to screening. Existing methods use supervised machine learning, training classifiers to identify relevant words in the abstracts of candidate articles that a human reviewer has previously labelled for inclusion or exclusion. Such classifiers typically reduce the number of abstracts requiring manual screening by about 50%.
Keywords: Automation of systematic reviews; Evidence screening; Study characterisation; Study selection
Year: 2018 PMID: 29695296 PMCID: PMC5918752 DOI: 10.1186/s13643-018-0724-7
Source DB: PubMed Journal: Syst Rev ISSN: 2046-4053
High-level steps of a systematic review
| Step 1: Conduct a broad search of the literature |
| Step 2: Screen the search results for relevant articles and exclude all others |
| Step 3: Extract study characteristics from included studies |
| Step 4: Synthesise the studies based on extracted characteristics and report on findings |
The three systematic reviews used in this study
| Name | O | N | n | I | Topic |
|---|---|---|---|---|---|
| Hamra 2014 [ | 604 | 615 | 615 | 17 (2.7%) | Outdoor particulate matter exposure and lung cancer |
| Johnson 2014 [ | 3023 | 3023 | 2470 | 17 (0.7%) | PFOA effects on fetal growth |
| Thayer 2013* [ | | 2054 | 1880 | 11 (0.6%) | Bisphenol A (BPA) exposure and obesity |
The list of relevant studies was provided by the authors
O: original number of articles reported in the original review; N: number of articles in our search results; n: number of articles with abstracts; I: number of included articles in the systematic review
*Thayer 2013 provides a search protocol and no search results
Screening rules tested in this study and the rationale behind each
| No. | Screening rule | Rationale |
|---|---|---|
| 1 | All 4 PECO Terms | A well-written abstract of an observational study should mention all PECO elements. PECO elements are chosen (rather than country and study type) because they are regularly used to retrieve studies relevant to a systematic review question. |
| 2 | Any 3 PECO Terms | We expect a higher recall and lower precision than rule 1 as this allows one PECO element to be missed. |
| 3 | Any 2 PECO Terms | We expect a higher recall and lower precision than rule 2 as this allows an additional PECO element to be missed. |
| 4 | PEO | We expect the same or lower recall and higher precision than rule 2 because confounders are often omitted from observational study abstracts. |
| 5 | PE | We expect the same or lower recall and higher precision than rule 3 because of an assumption that abstracts of observational studies should mention the population the study was conducted on and the exposure that was measured. |
| 6 | EO | We expect the same or lower recall and higher precision than rule 3 because of a belief that abstracts of observational studies should mention the exposure that was studied and the outcomes that were observed. |
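The six rules in the table above can be sketched as simple set tests. This is a minimal illustration, not the authors' implementation: it assumes that an upstream extraction step has already produced the set of PECO elements found in an abstract, that "Any 3" and "Any 2" mean "at least 3" and "at least 2", and the names `RULES` and `screen` are hypothetical.

```python
from typing import Callable, Dict, Set

# P = population, E = exposure, C = confounders, O = outcome
PECO: Set[str] = {"P", "E", "C", "O"}

# Each rule takes the set of PECO elements found in an abstract
# and returns True if the abstract should be kept for manual review.
RULES: Dict[str, Callable[[Set[str]], bool]] = {
    "All 4 PECO Terms": lambda found: len(found & PECO) == 4,
    "Any 3 PECO Terms": lambda found: len(found & PECO) >= 3,  # assumed "at least 3"
    "Any 2 PECO Terms": lambda found: len(found & PECO) >= 2,  # assumed "at least 2"
    "PEO": lambda found: {"P", "E", "O"} <= found,
    "PE": lambda found: {"P", "E"} <= found,
    "EO": lambda found: {"E", "O"} <= found,
}


def screen(found: Set[str]) -> Dict[str, bool]:
    """Apply every screening rule to one abstract's extracted elements."""
    return {name: rule(found) for name, rule in RULES.items()}


if __name__ == "__main__":
    # An abstract that mentions population, exposure and outcome but no confounders.
    for name, keep in screen({"P", "E", "O"}).items():
        print(f"{name}: {'include' if keep else 'exclude'}")
```

Under these assumptions, an abstract missing only confounders fails rule 1 but passes rules 2 to 6, which matches the rationale that looser rules trade precision for recall.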
Summary of screening workload savings for three systematic reviews
| Screening rule | TP | FP | FN | TN | Pr | Re | Work saved |
|---|---|---|---|---|---|---|---|
| Hamra 2014 ( | | | | | | | Max = 97.2% |
| All 4 PECO Terms | 5 | 5 | 12 | 593 | 50% | 29% | 98.4% |
| Any 3 PECO Terms | 12 | 24 | 5 | 574 | 33% | 71% | 94.1% |
| Any 2 PECO Terms | 17 | 89 | 0 | 509 | 16% | 100% | 82.8% |
| PEO | 11 | 17 | 6 | 581 | 39% | 60% | 95.4% |
| PE | 11 | 13 | 6 | 585 | 46% | 65% | 96.1% |
| EO | 17 | 65 | 0 | 533 | 21% | 100% | 86.7% |
| Johnson 2014 ( | | | | | | | Max = 99.3% |
| All 4 PECO Terms | 3 | 1 | 14 | 2455 | 75% | 18% | 99.8% |
| Any 3 PECO Terms | 14 | 12 | 3 | 2441 | 54% | 82% | 98.9% |
| Any 2 PECO Terms | 16 | 60 | 1 | 2393 | 21% | 94% | 96.9% |
| PEO | 13 | 49 | 4 | 2413 | 25% | 76% | 97.5% |
| PE | 13 | 5 | 4 | 1551 | 72% | 76% | 99.3% |
| EO | 16 | 11 | 1 | 2442 | 59% | 94% | 98.9% |
| Thayer 2013 ( | | | | | | | Max = 99.4% |
| All 4 PECO Terms | 7 | 20 | 13 | 1840 | 26% | 35% | 98.6% |
| Any 3 PECO Terms | 9 | 83 | 2 | 1786 | 10% | 82% | 95.1% |
| Any 2 PECO Terms | 11 | 304 | 0 | 1565 | 3% | 100% | 83.2% |
| PEO | 7 | 116 | 4 | 1753 | 6% | 64% | 93.5% |
| PE | 14 | 45 | 6 | 1815 | 24% | 70% | 96.9% |
| EO | 11 | 195 | 0 | 1674 | 5% | 100% | 89.0% |
| Average ( | | | | | | | Max = 99.1% |
| All 4 PECO Terms | 15 | 26 | 39 | 4888 | 37% | 28% | 99.2% |
| Any 3 PECO Terms | 35 | 119 | 10 | 4801 | 23% | 78% | 96.9% |
| Any 2 PECO Terms | 44 | 453 | 1 | 4467 | 9% | 98% | 90.0% |
| PEO | 31 | 182 | 14 | 4747 | 15% | 69% | 95.7% |
| PE | 38 | 63 | 16 | 3951 | 38% | 70% | 98.0% |
| EO | 44 | 271 | 1 | 4649 | 14% | 98% | 93.7% |
Work saved is the proportion of the n references that the rule excludes and that therefore need no manual screening (i.e., 1 − (TP + FP)/n)
TP true positive, FP false positive, FN false negative, TN true negative, Pr precision, Re recall
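As a worked check of the table's columns, the sketch below recomputes precision, recall and work saved from the four confusion-matrix counts. Precision and recall use their standard definitions, and work saved follows the footnote formula 1 − (TP + FP)/n. The `max_work_saved` value, 1 − (TP + FN)/n, is an assumption about how the "Max" figures were derived (it reproduces the 97.2% shown for Hamra 2014); the function name is hypothetical.

```python
from typing import Dict


def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute screening metrics from confusion-matrix counts."""
    n = tp + fp + fn + tn
    return {
        # Share of abstracts flagged by the rule that are truly relevant.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of truly relevant abstracts that the rule flagged.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of abstracts the rule excluded, so nobody has to read them.
        "work_saved": 1 - (tp + fp) / n,
        # Assumed ceiling: a perfect rule flags only the relevant abstracts.
        "max_work_saved": 1 - (tp + fn) / n,
    }


if __name__ == "__main__":
    # Hamra 2014, "Any 2 PECO Terms" row: TP=17, FP=89, FN=0, TN=509
    m = screening_metrics(17, 89, 0, 509)
    for key, value in m.items():
        print(f"{key}: {value:.1%}")
    # precision: 16.0%
    # recall: 100.0%
    # work_saved: 82.8%
    # max_work_saved: 97.2%
```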
Summary of dictionaries used in the extraction algorithm
| Dictionary name | Size | Description | Example |
|---|---|---|---|
| Adjectives | 32 | Descriptive adjectives for the participant population | Non-smoking, postmenopausal |
| Controls | 70 | Nouns that refer to the participant population | Participants, students |
| Countries | 464 | Names of countries worldwide along with their respective nationalities | Malaysian, USA, American |
| Effect | 41 | Utilised epidemiological study metrics | Hazard ratio, adjusted odds ratio |
| Numbers | 178 | Numbers described by words | One, thousand |
| Related | 24 | Verbs indicating an association between exposure and outcome | Related, linked |
| Relations | 26 | Nouns indicating an association between exposure and outcome | Correlation, association |
| States | 56 | Names of the states and territories in the USA | Missouri, New York |
| Study types | 21 | Various epidemiological study designs (both observational and experimental) | Placebo-controlled, case control |
| PFOA | 51 | Variations of perfluorooctanoate (PFOA) mentions | PFOA, perfluorooctanoic acid |
| BPA | 56 | Variations of bisphenol A (BPA) mentions | BPA, urinary BPA concentrations |
| Folic acid | 57 | Variations of folic acid mentions | Folic acid, folic acid supplement |
| Air pollutants | 73 | Variations of outdoor particulate matter exposure mentions | Ambient no2, particulate air pollution |
| Fetal growth | 87 | Variations of fetal growth mentions | Birth weight, ponderal index |
| Twinning | 28 | Variations of twinning mentions | Twinning, twin pregnancies |
| Lung cancer | 229 | Variations of lung cancer and related comorbidities mentions | lc, lung cancer |
| Confounders | 74 | Various concept mentions as confounders in | Mode of delivery, fertility treatment use |
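Dictionary matching of this kind can be illustrated with a small case-insensitive lookup. This is a rough sketch only: the entries are just the examples shown in the table (the full dictionaries are not reproduced here), the study itself uses GATE rather than Python, and `DICTIONARIES` and `lookup` are hypothetical names.

```python
import re
from typing import Dict, List, Tuple

# Illustrative subsets only: the example terms listed in the table above.
DICTIONARIES: Dict[str, List[str]] = {
    "effect": ["hazard ratio", "adjusted odds ratio"],
    "relations": ["correlation", "association"],
    "study types": ["placebo-controlled", "case control"],
    "fetal growth": ["birth weight", "ponderal index"],
}


def lookup(text: str) -> List[Tuple[int, str, str]]:
    """Return (start offset, dictionary name, matched text), ordered by offset."""
    hits = []
    for name, terms in DICTIONARIES.items():
        for term in terms:
            # Whole-word, case-insensitive match of the dictionary term.
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                hits.append((match.start(), name, match.group(0)))
    return sorted(hits)


if __name__ == "__main__":
    sentence = "We found an association between PFOA and birth weight (adjusted odds ratio 1.2)."
    for start, name, matched in lookup(sentence):
        print(start, name, matched)
```

Each hit carries the dictionary name, which plays a role similar to the `majorType` that the GATE rules test with `Lookup.majorType`.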
Selection of semantic rules used for extraction in GATE format, each with example phrases they match
| Characteristic | Example | Rule |
|---|---|---|
| Study design | We conducted a hospital-based prospective cohort study | (verb) ({Token.string==~“(?i)a”}\|{Token.string==~“(?i)an”}) ({Token})[0,1] (study) (types)? |
| Population | Cohort of 665 Danish pregnant women | ({Token.string==~“(?i)cohort”}\|{Token.string==~“(?i)sample”}\|{Token.string==~“(?i)total”}\|{Token.string==~“(?i)samples”}\|{Token.string==~“(?i)cross-sectional”}\|{Token.string==~“(?i)subsample”}) {Token.string==~“(?i)of”} (population) |
| Exposure | Perfluorinated compounds in relation to birth weight | (folic_variations) {Lookup.majorType==“relations”} (birthoutcomes) |
| Outcome | association with miscarriage | {Lookup.majorType==“relations”} ({Token.string==~“(?i)between”}\|(with)) ({Token.string==~“(?i)the”})? (birthoutcomes) |
| Confounding factor | Adjusting for covariates, including maternal pre-pregnancy BMI, smoking, education, and birth weight | (adjustment) ({Token.string==~“(?i)for”}\|{Token.string==~“(?i)by”}) ({Token})[0,3] (cnf) |
| Country | In Alberta, Canada | {Token.string==~“(?i)in”} ({Token}) [ {Token.string==“,”} ({Lookup.majorType==“countries”}\|{Lookup.majorType==“states”}) |
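To make the shape of such a rule concrete, the Population rule can be loosely approximated with a regular expression. This is a sketch under stated assumptions, not a translation of the GATE grammar: the `(population)` macro is not defined in the table, so here it is assumed to be any text up to the next full stop, comma or semicolon, and `POPULATION_RE` and `extract_population` are hypothetical names.

```python
import re
from typing import Optional

# Trigger words taken from the Population rule in the table above.
TRIGGERS = ["cohort", "sample", "total", "samples", "cross-sectional", "subsample"]

# Longest triggers first so "samples" is tried before "sample".
_alternation = "|".join(
    re.escape(t) for t in sorted(TRIGGERS, key=len, reverse=True)
)

POPULATION_RE = re.compile(
    r"\b(?:" + _alternation + r")\s+of\s+"
    # Assumption: the population phrase runs until the next . , or ;
    r"(?P<population>[^.,;]+)",
    re.IGNORECASE,
)


def extract_population(sentence: str) -> Optional[str]:
    """Return the phrase following '<trigger> of', or None if no trigger matches."""
    match = POPULATION_RE.search(sentence)
    return match.group("population").strip() if match else None


if __name__ == "__main__":
    print(extract_population("Cohort of 665 Danish pregnant women"))
    # 665 Danish pregnant women
```

A token-based grammar can constrain the population phrase far more tightly (for example with the Numbers, Countries, Adjectives and Controls dictionaries), which a plain regular expression like this one does not attempt.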