Dimitar Hristovski, Dejan Dinevski, Andrej Kastrin, Thomas C Rindflesch.
Abstract
BACKGROUND: The proliferation of the scientific literature in the field of biomedicine makes it difficult to keep abreast of current knowledge, even for domain experts. While general Web search engines and specialized information retrieval (IR) systems have made important strides in recent decades, the problem of accurate knowledge extraction from the biomedical literature is far from solved. Classical IR systems usually return a list of documents that have to be read by the user to extract relevant information. This tedious and time-consuming work can be lessened with automatic Question Answering (QA) systems, which aim to provide users with direct and precise answers to their questions. In this work we propose a novel methodology for QA based on semantic relations extracted from the biomedical literature.
Year: 2015 PMID: 25592675 PMCID: PMC4307891 DOI: 10.1186/s12859-014-0365-3
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. The user interface of the question answering tool SemBT.
Top 15 semantic relations extracted with SemRep
| Semantic relation | Unique relations | Instances |
|---|---|---|
| PROCESS_OF | 739217 | 12908669 |
| LOCATION_OF | 2033678 | 9560753 |
| PART_OF | 1132906 | 8736983 |
| TREATS | 993417 | 5435929 |
| ISA | 306793 | 3752806 |
| COEXISTS_WITH | 1149505 | 2496628 |
| AFFECTS | 1008068 | 2124063 |
| INTERACTS_WITH | 956926 | 1824826 |
| USES | 225686 | 1391375 |
| ASSOCIATED_WITH | 544318 | 1316494 |
| CAUSES | 525082 | 1164132 |
| ADMINISTERED_TO | 140881 | 1079967 |
| STIMULATES | 442904 | 845725 |
| INHIBITS | 424125 | 749490 |
| AUGMENTS | 384689 | 742944 |
Only the 15 relations with the highest instance counts are shown (for the full table see Additional file 1). For each semantic relation, its name, the number of unique relations and the number of instances are shown.
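To make the difference between the two count columns concrete, the short sketch below divides instances by unique relations for three rows copied from the table. The helper name `instances_per_unique` is an illustrative assumption, and the ratio is only a rough reading of how often, on average, each unique relation is asserted in the literature; it is not a statistic reported in the source.

```python
# Counts copied from three rows of the table: (unique relations, instances).
counts = {
    "PROCESS_OF": (739217, 12908669),
    "TREATS": (993417, 5435929),
    "INHIBITS": (424125, 749490),
}


def instances_per_unique(unique: int, instances: int) -> float:
    """Average number of instances behind each unique relation (illustrative helper)."""
    return instances / unique


for name, (unique, instances) in counts.items():
    print(f"{name}: {instances_per_unique(unique, instances):.2f} instances per unique relation")
```

Frequent relations such as PROCESS_OF are restated many times per unique relation, while INHIBITS relations are each asserted fewer than twice on average.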
UMLS semantic types and their corresponding abbreviations that can be used for posing questions
| Abbreviation | Semantic type | Occurrences in relations | Occurrences in relation instances |
|---|---|---|---|
| dsyn | Disease or Syndrome | 2603234 | 12591865 |
| aapp | Amino Acid, Peptide, or Protein | 4345793 | 11503829 |
| podg | Patient or Disabled Group | 199404 | 9258258 |
| bpoc | Body Part, Organ, or Organ Component | 1392967 | 8711584 |
| gngm | Gene or Genome | 3422406 | 6946503 |
| topp | Therapeutic or Preventive Procedure | 991912 | 5108457 |
| neop | Neoplastic Process | 802419 | 4650747 |
| cell | Cell | 807092 | 4346530 |
| mamm | Mammal | 456140 | 4230378 |
| phsu | Pharmacologic Substance | 1100842 | 4226225 |
| orch | Organic Chemical | 1550322 | 3485563 |
| bacs | Biologically Active Substance | 951043 | 2922549 |
| fndg | Finding | 592051 | 2906281 |
| patf | Pathologic Function | 667378 | 2828014 |
| popg | Population Group | 339502 | 2419899 |
| aggp | Age Group | 179259 | 2191630 |
| tisu | Tissue | 291668 | 1634166 |
| celc | Cell Component | 349980 | 1606480 |
| humn | Human | 94088 | 1537291 |
| sosy | Sign or Symptom | 275794 | 1400138 |
| diap | Diagnostic Procedure | 271705 | 1385473 |
| orgf | Organism Function | 292346 | 1307166 |
| inpo | Injury or Poisoning | 200300 | 1076107 |
| celf | Cell Function | 370243 | 1004738 |
| mobd | Mental or Behavioral Dysfunction | 186453 | 997544 |
Also shown is how many times each semantic type appears as an argument in semantic relations and in semantic relation instances. Only the 25 most frequent of the 133 semantic types are shown.
Figure 2. Faceting, filtering and argument expansion used together to get the factors that predispose various neoplasms.
Figure 3. The first few instances of the semantic relation “Donepezil-TREATS-Alzheimer’s disease” shown as highlighted sentences.
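A semantic relation such as "Donepezil-TREATS-Alzheimer's disease" can be read as a subject-predicate-object triple, and a question as the same triple with one slot left open. The following is a minimal sketch of that idea, not the SemBT implementation: the `SemanticRelation` class, the `find` function and the in-memory list are all assumptions made for illustration, and the second triple is a made-up placeholder.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SemanticRelation:
    subject: str
    predicate: str
    obj: str


# Toy store. The first triple is the one named in the Figure 3 caption;
# the second is a hypothetical placeholder so the filter has something to reject.
RELATIONS = [
    SemanticRelation("Donepezil", "TREATS", "Alzheimer's disease"),
    SemanticRelation("ExampleDrug", "TREATS", "ExampleDisease"),  # hypothetical
]


def find(relations: List[SemanticRelation],
         subject: Optional[str] = None,
         predicate: Optional[str] = None,
         obj: Optional[str] = None) -> List[SemanticRelation]:
    """Return relations matching every argument that is not None (None acts as a wildcard)."""
    return [
        r for r in relations
        if (subject is None or r.subject == subject)
        and (predicate is None or r.predicate == predicate)
        and (obj is None or r.obj == obj)
    ]


# "What treats Alzheimer's disease?" becomes the pattern (?, TREATS, Alzheimer's disease).
answers = find(RELATIONS, predicate="TREATS", obj="Alzheimer's disease")
print([r.subject for r in answers])
```

Leaving a different slot open turns the same lookup into a different question, for example `find(RELATIONS, subject="Donepezil", predicate="TREATS")` for "What does Donepezil treat?".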