Iain J Marshall, Benjamin Nye, Joël Kuiper, Anna Noel-Storr, Rachel Marshall, Rory Maclean, Frank Soboczenski, Ani Nenkova, James Thomas, Byron C Wallace.
Abstract
OBJECTIVE: Randomized controlled trials (RCTs) are the gold standard method for evaluating whether a treatment works in health care but can be difficult to find and make use of. We describe the development and evaluation of a system to automatically find and categorize all new RCT reports.
Keywords: automatic database curation; evidence based medicine; randomized controlled trials; research synthesis
Year: 2020 PMID: 32940710 PMCID: PMC7727361 DOI: 10.1093/jamia/ocaa163
Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 4.497
Figure 1. How articles are retrieved, annotated, and stored. ICTRP: International Clinical Trials Registry Platform; PICO: populations, interventions/comparators, and outcomes; RCT: randomized controlled trial.
Summary of Trialstreamer components
| Component | Model architecture | How used in Trialstreamer | Data |
|---|---|---|---|
| RCT vs non-RCT study classifier | Ensemble of support vector machine and convolutional neural network models | Previously validated model | Training and parameter tuning on 280 000 abstracts with crowdsourced labels from Cochrane Crowd |
| Human vs non-human study classifier | Support vector machine model, based on Cohen et al | New model trained and validated, based on prior method | 467 153 abstracts of RCTs from PubMed, using MeSH term “Humans” |
| Sample size extraction | Multilayer perceptron model for classifying integers in abstracts | New model trained and validated | Trained on 8935 abstracts with sample sizes (6315 taken from structured results data in ClinicalTrials.gov; 2620 manually labelled by our team) |
| PICO text spans | LSTM-CRF model | Previously validated model used unchanged | 5000 abstracts with PICO span annotations from Nye et al |
| PICO concepts | Rule-based concept extraction, following a previously described method | Reimplementation of previously described method | 972 371 selected concepts (CUIs) and their associated text from the UMLS Metathesaurus 2019 edition. Concepts included were from the source vocabularies: SNOMED CT, RxNorm, MeSH, MedDRA, and the World Health Organization ATC classification system |
| Risk of bias | Logistic regression model with L2 regularization, using bag-of-words representation (unigrams, bigrams and trigrams) of the title and abstract text to generate an overall score | New model trained and validated | 13 463 abstracts of RCTs with Cochrane Risk of Bias tool assessments in the Cochrane Library: 60% used for training; 40% withheld for evaluation |
ATC: Anatomical Therapeutic Chemical; CUI: concept unique identifier; LSTM-CRF: long short-term memory conditional random fields; MedDRA: Medical Dictionary for Regulatory Activities; MeSH: Medical Subject Headings; RCT: randomized controlled trial; SNOMED CT: Systematized Nomenclature of Medicine Clinical Terms; UMLS: Unified Medical Language System.
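As a rough illustration of the risk-of-bias component described above (an L2-regularized logistic regression over unigram, bigram, and trigram counts of title plus abstract text), the following is a minimal scikit-learn sketch. The toy documents, labels, and all parameter values below are our own illustrative assumptions, not the authors' implementation or training data.

```python
# Minimal sketch of an L2-regularised logistic regression over
# unigram/bigram/trigram bag-of-words features, in the spirit of the
# risk-of-bias model. Toy corpus and labels are invented for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy "title + abstract" texts (invented).
docs = [
    "randomised double blind placebo controlled trial of drug A",
    "open label uncontrolled pilot study of drug B",
    "double blind randomised trial with allocation concealment",
    "retrospective chart review without blinding",
]
labels = [1, 0, 1, 0]  # 1 = low risk of bias (invented labels)

model = make_pipeline(
    CountVectorizer(ngram_range=(1, 3)),              # uni/bi/trigrams
    LogisticRegression(penalty="l2", C=1.0, max_iter=1000),
)
model.fit(docs, labels)

# predict_proba yields the probability of being at low risk of bias.
probs = model.predict_proba(["double blind randomised placebo trial"])[:, 1]
```

In the deployed system this score is what allows trials to be ranked or filtered by estimated methodological quality; the sketch above only demonstrates the model family named in the table.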
Figure 2. Screenshot of the Trialstreamer web interface home page.
Figure 3. Trialstreamer search results.
Summary of evaluation performance of models used in the Trialstreamer system
| Component | Recall (95% CI) | Precision (95% CI) | C-statistic (95% CI) | Brier score (95% CI) |
|---|---|---|---|---|
| RCT classifier (balanced threshold) | | | | |
| MeSH indexed articles | 0.97 (0.96-0.98) | 0.52 (0.48-0.56) | 0.99 (0.98-0.99) | 0.01 (0.01-0.01) |
| Not MeSH indexed | 0.94 (0.92-0.96) | 0.50 (0.44-0.57) | 0.98 (0.98-0.98) | 0.01 (0.01-0.01) |
| Human vs nonhuman classifier | 1.00 (1.00-1.00) | 1.00 (1.00-1.00) | 0.95 (0.94-0.96) | 0.003 (0.003-0.004) |
| Sample size extraction | 0.79 (0.77-0.82) | 0.88 (0.85-0.90) | n/a | n/a |
| PICO text spans | | | | |
| Population (n = 200) | 0.66 | 0.78 | n/a | n/a |
| Interventions (n = 200) | 0.65 | 0.61 | n/a | n/a |
| Outcomes (n = 200) | 0.63 | 0.69 | n/a | n/a |
| PICO concepts | | | | |
| | | | | |
| Population (n = 1107) | 0.73 (0.70-0.75) | 0.28 (0.26-0.29) | n/a | n/a |
| Interventions (n = 954) | 0.78 (0.75-0.80) | 0.52 (0.49-0.55) | n/a | n/a |
| Outcomes (n = 503) | 0.64 (0.61-0.68) | 0.25 (0.23-0.27) | n/a | n/a |
| | | | | |
| Population (n = 1107) | 0.78 (0.76-0.81) | 0.30 (0.29-0.32) | n/a | n/a |
| Interventions (n = 954) | 0.85 (0.82-0.87) | 0.57 (0.54-0.60) | n/a | n/a |
| Outcomes (n = 503) | 0.65 (0.62-0.69) | 0.30 (0.27-0.33) | n/a | n/a |
| Risk of bias (probability of being at low risk of bias) | 0.46 (0.40-0.52) | 0.44 (0.41-0.48) | 0.80 (0.79-0.81) | 0.10 (0.10-0.11) |
MeSH: Medical Subject Headings; n/a: Not applicable; PICO: populations, interventions/comparators, and outcomes; RCT: randomized controlled trial.
The PICO text span accuracy results are taken from Nye et al, and we present them here for convenience.
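The table above reports four metrics: recall, precision, the C-statistic (equivalent to the area under the ROC curve for a binary outcome), and the Brier score (mean squared difference between predicted probability and outcome). A minimal sketch of how these would be computed with scikit-learn follows; the labels and probabilities are invented toy values, not data from the evaluation.

```python
# Sketch: computing recall, precision, C-statistic (ROC AUC), and
# Brier score for a probabilistic binary classifier. Toy values only.
from sklearn.metrics import (
    recall_score, precision_score, roc_auc_score, brier_score_loss,
)

y_true = [1, 0, 1, 1, 0, 0, 1, 0]                   # gold labels (invented)
y_prob = [0.9, 0.2, 0.8, 0.6, 0.4, 0.1, 0.7, 0.3]  # model probabilities
y_pred = [int(p >= 0.5) for p in y_prob]            # thresholded at 0.5

recall = recall_score(y_true, y_pred)          # sensitivity
precision = precision_score(y_true, y_pred)    # positive predictive value
c_statistic = roc_auc_score(y_true, y_prob)    # ranking quality
brier = brier_score_loss(y_true, y_prob)       # calibration + sharpness
```

Note that recall and precision depend on the chosen decision threshold (hence the "balanced threshold" row for the RCT classifier), whereas the C-statistic and Brier score are computed from the raw probabilities.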
Figure 4. Calibration plot of the risk of bias model.
Figure 5. Counts of all randomized controlled trials (RCTs) in PubMed, estimated by manual indexing (yellow) vs automation (blue). CI: confidence interval.
Figure 6. Histogram of the number of trial participants in all randomized controlled trials (RCTs) in PubMed, as extracted by our sample size extraction model.
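The sample size extractor described earlier classifies integers found in abstracts; a natural first step is generating the candidate integers themselves. The sketch below shows one plausible candidate-generation step (the regex, function name, and example text are our own assumptions, not the authors' code); in the full system, a multilayer perceptron would then score each candidate as the trial's sample size or not.

```python
# Hypothetical candidate-generation step for sample size extraction:
# find every integer in an abstract, normalising comma-grouped forms
# like "1,248". The downstream MLP classifier is not shown here.
import re

def integer_candidates(abstract: str) -> list[int]:
    """Return all integers mentioned in the text, commas stripped."""
    return [int(m.replace(",", ""))
            for m in re.findall(r"\d{1,3}(?:,\d{3})+|\d+", abstract)]

text = ("We randomised 1,248 participants (624 per arm) to 12 weeks "
        "of treatment.")
print(integer_candidates(text))  # -> [1248, 624, 12]
```

Distinguishing the true enrolment figure (1248) from per-arm counts, durations, and other numbers is exactly the classification problem the trained model addresses.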