Benjamin Elsworth, Tom R Gaunt.
Abstract
SUMMARY: The field of literature-based discovery is growing in step with the volume of literature being produced. From modern natural language processing algorithms to high-quality entity tagging, the methods and their impact are developing rapidly. One annotation object that arises from these approaches, the subject-predicate-object triple, is proving to be very useful in representing knowledge. We have implemented efficient search methods and an application programming interface to create fast and convenient functions for using triples extracted from the biomedical literature by SemMedDB. By refining these data, we have identified a set of triples that focus on the mechanistic aspects of the literature, and we provide simple methods to explore both enriched triples from single queries and overlapping triples across two query lists.
Year: 2021 PMID: 32810207 PMCID: PMC8088324 DOI: 10.1093/bioinformatics/btaa726
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Data flow of the MELODI Presto overlap function. Two lists of queries (Q1 and Q2) are first checked for previous enrichment analysis. For queries that have not been previously analysed, the text of each missing term is run as a PubMed query and the returned IDs are matched to the MELODI Presto database for enrichment. The results of previous analyses are loaded from a local store. Overlapping elements between each pair of enriched triple sets (one from each query list, Q1 and Q2) are then identified and returned.
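The data flow in Fig. 1 can be illustrated with a minimal Python sketch. This is not the MELODI Presto implementation or its API: the names `Triple`, `enrich` and `overlap`, the in-memory cache, and the toy data are all hypothetical. Two simplifying assumptions are made: "enrichment" is reduced to collecting every triple attached to the returned IDs (no statistical test is applied), and two triples are taken to overlap when the object of the Q1 triple equals the subject of the Q2 triple.

```python
from collections import namedtuple
from itertools import product

# A subject-predicate-object triple (hypothetical representation).
Triple = namedtuple("Triple", ["subject", "predicate", "object"])


def enrich(query, cache, search_pubmed, triples_by_pmid):
    """Return the triple set for one query, reusing earlier results.

    `search_pubmed` is any callable mapping query text to a list of IDs;
    `triples_by_pmid` stands in for the triple database.
    Assumption: enrichment is simplified to a plain lookup here.
    """
    if query not in cache:
        found = set()
        for pmid in search_pubmed(query):
            found.update(triples_by_pmid.get(pmid, ()))
        cache[query] = found  # stored so the query is not analysed again
    return cache[query]


def overlap(q1, q2, cache, search_pubmed, triples_by_pmid):
    """Pair every Q1 query with every Q2 query and collect linked triples.

    Assumption: a Q1 triple and a Q2 triple overlap when the object of
    the first equals the subject of the second.
    """
    results = []
    for a, b in product(q1, q2):
        set_a = enrich(a, cache, search_pubmed, triples_by_pmid)
        set_b = enrich(b, cache, search_pubmed, triples_by_pmid)
        for t1 in sorted(set_a):
            for t2 in sorted(set_b):
                if t1.object == t2.subject:
                    results.append((a, t1, t2, b))
    return results


# Toy stand-ins for PubMed search results and the triple database.
toy_pubmed = {"drug A": [1, 2], "disease B": [3]}
toy_triples = {
    1: [Triple("drug A", "INHIBITS", "gene X")],
    2: [Triple("drug A", "TREATS", "symptom Y")],
    3: [Triple("gene X", "CAUSES", "disease B")],
}

cache = {}
hits = overlap(
    ["drug A"],
    ["disease B"],
    cache,
    lambda query: toy_pubmed.get(query, []),
    toy_triples,
)
for query_a, t1, t2, query_b in hits:
    print(query_a, "|", t1, "|", t2, "|", query_b)
```

In this toy run the two queries are linked through the shared entity "gene X". A second call with the same `cache` would skip the search step for both queries, mirroring the local store in the figure.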