Aaron M Cohen, Zackary O Dunivin, Neil R Smalheiser.
Abstract
The Medical Subject Heading 'Humans' is manually curated and indicates human-related studies within MEDLINE. However, newly published MEDLINE articles may take months to be indexed, and non-MEDLINE articles lack consistent, transparent indexing of this feature. Therefore, for up-to-date and broad literature searches, there is a need for an independent automated system to identify whether a given publication is human-related, particularly when it lacks Medical Subject Headings. One million MEDLINE records published in 1987–2014 were randomly selected. Text-based features from the title, abstract, author name and journal fields were extracted. A linear support vector machine was trained to estimate the probability that a given article should be indexed as Humans and was evaluated on records from 2015 to 2016. Overall accuracy was high: area under the receiver operating characteristic curve = 0.976, F1 = 95% relative to MeSH indexing. Manual review of cases of extreme disagreement with MEDLINE showed 73.5% agreement with the automated prediction. We have tagged all articles indexed in PubMed with predictive scores and have made the information publicly available at http://arrowsmith.psych.uic.edu/evidence_based_medicine/index.html. We have also made available a web-based interface to allow users to obtain predictive scores for non-MEDLINE articles. This will assist in the triage of clinical evidence for writing systematic reviews.
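As a rough illustration of what "text-based features from the title, abstract and journal fields" could look like, the sketch below builds field-prefixed n-gram counts for one record. The tokenizer, the feature-name format, the record keys and the function names are all assumptions made for this example; the abstract does not describe the authors' actual extraction code.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (an assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngram_features(text, field, max_n):
    """Count n-grams of length 1..max_n, prefixed with the field name and n."""
    tokens = tokenize(text)
    feats = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            gram = " ".join(tokens[i:i + n])
            feats[f"{field}:{n}:{gram}"] += 1
    return feats


def record_features(record):
    """Combine title, abstract and journal features for one record (hypothetical keys)."""
    feats = Counter()
    feats.update(ngram_features(record.get("title", ""), "title", max_n=2))
    feats.update(ngram_features(record.get("abstract", ""), "abstract", max_n=3))
    journal = record.get("journal")
    if journal:
        # The journal name is treated as a single categorical feature.
        feats[f"journal:{journal}"] = 1
    return feats
```

A sparse count vector of this kind is the usual input to a linear classifier such as a linear support vector machine; how the scores were turned into probabilities is not stated in the abstract and is not sketched here.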
Year: 2018 PMID: 30184195 PMCID: PMC6146117 DOI: 10.1093/database/bay079
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Forward selection process results, showing best performing feature included at each stage using 5 × 2 cross-validation on the training data set
| Stage | Feature | AUC | MCC |
|---|---|---|---|
| 1 | Abstract Bigrams | 0.955 | 0.771 |
| 2 | Abstract Unigrams | 0.967 | 0.813 |
| 3 | Journal Name | 0.969 | 0.823 |
| 4 | Title Unigrams | 0.972 | 0.831 |
| 5 | Title Bigrams | 0.973 | 0.834 |
| 6 | Abstract Trigrams | 0.974 | 0.837 |
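The forward selection summarized in the table can be sketched as a greedy loop: at each stage, add the candidate feature set that gives the best score when combined with those already selected, and stop when no candidate improves the score. In the sketch below, `score_fn` is a hypothetical callable standing in for the 5 × 2 cross-validated evaluation, and the `min_gain` stopping rule is an assumption, since the stopping criterion is not given here.

```python
def forward_select(candidates, score_fn, min_gain=0.0):
    """Greedy forward selection.

    candidates: list of feature-set names.
    score_fn:   callable taking a list of names and returning a score
                (higher is better), e.g. a cross-validated AUC.
    Returns a list of (stage, feature, score) tuples in selection order.
    """
    selected = []
    remaining = list(candidates)
    best_score = float("-inf")
    history = []
    while remaining:
        # Score every remaining candidate together with the current selection.
        scored = [(score_fn(selected + [c]), c) for c in remaining]
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        # Stop once the best candidate no longer improves the score.
        if top_score - best_score <= min_gain:
            break
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
        history.append((len(selected), top_feature, top_score))
    return history
```

Each returned tuple corresponds to one row of a table like the one above: the stage number, the feature added at that stage, and the score after adding it.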
Comparison of performance results predicted by cross-validation and actual results predicted on the test data set
| Dataset | AUC | MCC | F1 | Recall | Precision | Brier Score | Error Rate |
|---|---|---|---|---|---|---|---|
| Training | 0.975 | 0.841 | 0.944 | 0.940 | 0.949 | 0.059 | 0.073 |
| Test | 0.976 | 0.833 | 0.950 | 0.946 | 0.955 | 0.056 | 0.070 |
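Most of the columns in this table follow from standard definitions. The sketch below computes them from true labels and predicted probabilities; the 0.5 decision threshold and the function name are assumptions for illustration, and AUC is omitted because it depends on the ranking of scores rather than on a single threshold.

```python
import math


def binary_metrics(y_true, y_prob, threshold=0.5):
    """Compute threshold-based metrics and the Brier score for binary labels."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0  # assumed decision threshold
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    n = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    # Brier score: mean squared difference between probability and outcome.
    brier = sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / n
    error_rate = (fp + fn) / n
    return {"precision": precision, "recall": recall, "f1": f1,
            "mcc": mcc, "brier": brier, "error_rate": error_rate}
```

As a consistency check on the table itself, the harmonic mean of the reported test precision (0.955) and recall (0.946) is approximately 0.950, which matches the reported F1.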
Figure 1. Probabilistic tagger confidence score calibration plot. The x-axis represents the predicted probability score, and the y-axis shows the proportion of articles within a similar probability score range that were assigned the Humans MeSH term. Numbers next to the dots show the number of samples included in the probability score range used to calculate the MeSH Humans proportion. The dotted line x = y shows perfect calibration for comparison.
Figure 2. Probabilistic tagger predicted probability score distribution over articles in the test set, consisting of articles published in 2015–2016 and assigned the Humans MeSH term. The plot shows the distribution of the model's probability estimates for these articles against the percentage of test-set articles assigned the MeSH Humans term.
Figure 3. Probabilistic tagger predicted probability score distribution over articles in the test set, consisting of articles published in 2015–2016 and NOT assigned the Humans MeSH term. The plot shows the distribution of the model's probability estimates for these articles against the percentage of test-set articles NOT assigned the MeSH Humans term.
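The data behind a calibration plot like Figure 1 can be derived by grouping articles into probability-score ranges and comparing the mean predicted score in each range with the observed proportion carrying the Humans MeSH term. The sketch below uses ten equal-width bins; the bin count, the bin boundaries and the function name are assumptions, since the captions do not state how the score ranges were chosen.

```python
def calibration_bins(y_true, y_prob, n_bins=10):
    """Group predictions into equal-width probability bins.

    Returns one tuple per non-empty bin:
    (bin_low, bin_high, count, mean_predicted, observed_proportion).
    """
    # Each bin accumulates [count, positives, sum of predicted probabilities].
    bins = [[0, 0, 0.0] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx][0] += 1
        bins[idx][1] += y
        bins[idx][2] += p
    rows = []
    for i, (count, positives, prob_sum) in enumerate(bins):
        if count:
            rows.append((i / n_bins, (i + 1) / n_bins, count,
                         prob_sum / count, positives / count))
    return rows
```

For a well-calibrated model, the mean predicted score and the observed proportion in each row are close, which corresponds to points lying near the x = y line.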
Comparison of manual review for cases of extreme disagreement between the MEDLINE assigned Humans MeSH term and the model’s predictive probability scores. One hundred cases of extreme prediction disagreement were selected randomly from articles with the MEDLINE Humans assignment but predictive tagger probabilities <0.01, and another 100 cases lacking the MEDLINE Humans term but having predictive tagger probabilities >0.99
| Disagreement type | Manual review: Humans | Manual review: Not Humans | Manual review: Uncertain | Totals |
|---|---|---|---|---|
| Humans MeSH term assigned, tagger probability score < 0.01 | 2 | 97 | 1 | 100 |
| Humans MeSH term not assigned, tagger probability score > 0.99 | 50 | 41 | 9 | 100 |
| Totals | 52 | 138 | 10 | 200 |
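The 73.5% agreement figure quoted in the abstract appears to follow directly from this table: the reviewer sided with the tagger in 97 of the low-score cases (judged Not Humans) and in 50 of the high-score cases (judged Humans). The short calculation below reproduces it, assuming that uncertain cases are kept in the denominator; the dictionary keys are labels invented for this example.

```python
# Manual review counts copied from the table above.
review = {
    "mesh_humans_tagger_below_0.01": {"Humans": 2, "Not Humans": 97, "Uncertain": 1},
    "no_mesh_humans_tagger_above_0.99": {"Humans": 50, "Not Humans": 41, "Uncertain": 9},
}

# The reviewer agrees with the tagger when a low-score article is judged
# Not Humans, or when a high-score article is judged Humans.
agree_with_tagger = (
    review["mesh_humans_tagger_below_0.01"]["Not Humans"]
    + review["no_mesh_humans_tagger_above_0.99"]["Humans"]
)
total_cases = sum(sum(row.values()) for row in review.values())
agreement = agree_with_tagger / total_cases

print(f"{agree_with_tagger}/{total_cases} = {agreement:.1%}")
```

The agreement is heavily asymmetric: 97% in the cases where MEDLINE assigned Humans but the tagger score was below 0.01, versus 50% where MEDLINE did not assign Humans but the score was above 0.99.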