K E Ravikumar, Majid Rastegar-Mojarad, Hongfang Liu.
Abstract
Extracting semantically meaningful relationships from biomedical literature is a challenging task. The BioCreative V Track 4 challenge organized, for the first time, a comprehensive shared task to test the robustness of text-mining algorithms in extracting semantically meaningful assertions from evidence statements in biomedical text. In this work, we tested the ability of a rule-based semantic parser to extract Biological Expression Language (BEL) statements from evidence sentences culled from biomedical literature as part of the BioCreative V Track 4 challenge. The system achieved an overall best F-measure of 21.29% in extracting the complete BEL statement. For relation extraction, the system achieved an F-measure of 65.13% on the test data set. Our system achieved the best performance in five of the six criteria that were adopted for evaluation by the task organizers. The inability to derive semantic inferences and limitations in the rule sets that map textual extractions to BEL functions were among the reasons for the low performance in extracting the complete BEL statement. After the shared task, we also evaluated the impact of different NER components on the extraction of BEL statements from the test data set, and made a single change to the rule sets that translate relation extractions into BEL statements. These changes improved BELMiner's overall performance in extracting BEL statements on the test set by over 20%. The system is available as a REST API at http://54.146.11.205:8484/BELXtractor/finder/. Database URL: http://54.146.11.205:8484/BELXtractor/finder/.
Year: 2017 PMID: 28365720 PMCID: PMC5467463 DOI: 10.1093/database/baw156
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. System architecture.
Figure 2. Illustration of the individual steps of the rule-based semantic parser.
Mapping NLP system output to BEL functions
| Event/Entities | BEL function |
|---|---|
| phosphorylation of | p(HGNC:PDE3B,pmod(P,S,273)) |
| translocation of | tloc(p(HGNC:HSF1)) |
| expression of | act(p(HGNC:ICAM1)) |
| truncal | path(MESHD:Obesity) |
| interaction of | complex(p(HGNC:CCNA1),p(HGNC:E2F1)) |
| | complex(p(HGNC:CCNA1),p(HGNC:CDK2)) |
| | act(p(HGNC:GK)) |
| activates STAT3 | increases act(p(HGNC:STAT3)) |
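To make the nesting in these BEL functions concrete, here is a minimal sketch of how such strings could be composed once an event and its entities have been extracted. The helper names (`protein`, `pmod`, `wrap`) are hypothetical and are not part of BELMiner; only the resulting BEL strings are taken from the table above.

```python
def protein(namespace: str, symbol: str, modification: str = "") -> str:
    """Build a protein abundance term such as p(HGNC:HSF1)."""
    body = f"{namespace}:{symbol}"
    if modification:
        body += f",{modification}"
    return f"p({body})"


def pmod(kind: str, residue: str, position: int) -> str:
    """Build a protein modification argument such as pmod(P,S,273)."""
    return f"pmod({kind},{residue},{position})"


def wrap(function: str, *terms: str) -> str:
    """Wrap one or more terms in a BEL function such as tloc(...) or complex(...)."""
    return f"{function}({','.join(terms)})"


# Reproduce three rows of the table from their parts (illustrative only).
phosphorylation = protein("HGNC", "PDE3B", pmod("P", "S", 273))
translocation = wrap("tloc", protein("HGNC", "HSF1"))
interaction = wrap("complex", protein("HGNC", "CCNA1"), protein("HGNC", "E2F1"))

# A causal relation prefixes the object term with the relation keyword.
activation = "increases " + wrap("act", protein("HGNC", "STAT3"))

for term in (phosphorylation, translocation, interaction, activation):
    print(term)
```

Composing terms from small builders keeps the parentheses balanced by construction, which matters because a single malformed function invalidates the whole BEL statement.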
Verbs for causal relations
| S. No | Verb categories | Verbs |
|---|---|---|
| 1 | decreases | reduce, decrease, suppress, block, down-regulate, down-regulation, inhibit |
| 2 | increases | increase, induce, activate, enhance, up-regulate, up-regulation |
| 3 | directlyIncreases | increase verbs preceded by the adverb “directly” |
| 4 | directlyDecreases | decrease verbs preceded by the adverb “directly” |
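The verb categories above amount to a lookup with one contextual rule, which can be sketched as follows. This is an illustration under stated assumptions, not the BELMiner implementation: the function names are hypothetical, and the suffix stripping is a deliberately naive stand-in for whatever morphological analysis the actual parser performs.

```python
import re
from typing import Iterator, Optional

# Verb lists copied from the table above.
DECREASE_VERBS = {"reduce", "decrease", "suppress", "block",
                  "down-regulate", "down-regulation", "inhibit"}
INCREASE_VERBS = {"increase", "induce", "activate", "enhance",
                  "up-regulate", "up-regulation"}


def _base_forms(token: str) -> Iterator[str]:
    """Yield the token and naive de-inflected variants (assumption: English -s/-d/-es/-ed)."""
    yield token
    for suffix in ("s", "d", "es", "ed"):
        if token.endswith(suffix):
            yield token[: -len(suffix)]


def classify_relation(phrase: str) -> Optional[str]:
    """Return the BEL relation category for the first causal verb found, or None."""
    tokens = re.findall(r"[a-z][a-z-]*", phrase.lower())
    for i, token in enumerate(tokens):
        # "directly" immediately before the verb selects the direct variant.
        direct = i > 0 and tokens[i - 1] == "directly"
        for form in _base_forms(token):
            if form in INCREASE_VERBS:
                return "directlyIncreases" if direct else "increases"
            if form in DECREASE_VERBS:
                return "directlyDecreases" if direct else "decreases"
    return None


print(classify_relation("activates STAT3"))
print(classify_relation("directly inhibits"))
print(classify_relation("binds to"))
```

A production system would need proper lemmatization and negation handling; the sketch only shows how the four categories in the table relate to each other.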
Performance of BELMiner on BioCreative BEL task (with and without gold standard entities)
| Class | Run | Gold-standard entities: Pre (%) | Gold-standard entities: Rec (%) | Gold-standard entities: F-mes (%) | NER entities: Pre (%) | NER entities: Rec (%) | NER entities: F-mes (%) |
|---|---|---|---|---|---|---|---|
| Term (T) | Run1 | 91.8 | 74.67 | 82.35 | 82.03 | 59.33 | 68.86 |
| Term (T) | Run2 | 92.51 | 70.00 | 79.70 | 83.33 | 50.00 | 62.5 |
| Function-Secondary (FS) | Run1 | 51.47 | 62.50 | 56.45 | 50.77 | 58.93 | 54.55 |
| Function-Secondary (FS) | Run2 | 51.61 | 57.14 | 54.24 | 54.72 | 51.79 | 53.21 |
| Function | Run1 | 25.53 | 36.36 | 30.00 | 27.78 | 37.88 | 32.05 |
| Function | Run2 | 27.06 | 34.85 | 30.46 | 30.67 | 34.85 | 32.62 |
| Relation-Secondary (RS) | Run1 | | | | | | |
| Relation-Secondary (RS) | Run2 | | | | | | |
| Relation | Run1 | | | | | | |
| Relation | Run2 | | | | | | |
| Statement | Run1 | 32.09 | 21.29 | 25.60 | 26.42 | 13.86 | 18.18 |
| Statement | Run2 | 32.09 | 21.29 | 25.60 | 26.42 | 13.86 | 18.18 |
Pre, precision; Rec, recall; F-mes, F-measure.
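The F-measure column can be recomputed from the precision and recall columns. The sketch below assumes the balanced F-measure (the harmonic mean of precision and recall, often written F1); the table does not state the weighting, but the reported values are consistent with it.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall (both in %)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure) copied from the table above.
rows = {
    "Term Run1, gold-standard entities": (91.8, 74.67, 82.35),
    "Term Run1, NER entities": (82.03, 59.33, 68.86),
    "Statement Run1, gold-standard entities": (32.09, 21.29, 25.60),
}

for label, (pre, rec, reported) in rows.items():
    print(f"{label}: computed {f_measure(pre, rec):.2f}, reported {reported:.2f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, the low statement-level recall dominates the statement-level F-measure even where precision is higher.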
Performance of BELMiner on BioCreative BEL task (Post Shared Task Improvements)
| Class | Run | NER entities: Pre (%) | NER entities: Rec (%) | NER entities: F-mes (%) |
|---|---|---|---|---|
| Term (T) | Run2 | 83.89 | 50.33 | 62.92 |
| Function secondary (FS) | Run2 | 85.19 | 41.07 | 55.42 |
| Function | Run2 | 71.43 | 30.3 | 42.55 |
| Relation-Secondary (RS) | Run2 | 93.13 | 60.4 | 73.27 |
| Relation | Run2 | 69.37 | 38.12 | 49.20 |
| Statement | Run2 | 59.6 | 29.21 | 39.2 |
Pre, precision; Rec, recall; F-mes, F-measure.
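As a rough check on the size of the post-task improvement, the sketch below subtracts the submitted Run2 F-measures (entities from NER) from the post-task ones for the classes that have values in both tables. Reading the abstract's "over 20%" as an absolute gain in statement-level F-measure is an assumption; the variable names are illustrative.

```python
# F-measure (%) for Run2 with entities from NER, copied from the two tables.
submitted = {"Term": 62.5, "Function-Secondary": 53.21,
             "Function": 32.62, "Statement": 18.18}
post_task = {"Term": 62.92, "Function-Secondary": 55.42,
             "Function": 42.55, "Statement": 39.2}

# Absolute gain in percentage points per evaluation class.
gains = {name: round(post_task[name] - submitted[name], 2) for name in submitted}

for name, gain in gains.items():
    print(f"{name}: +{gain:.2f} points")
```

On these figures the gain is small at the term level and largest at the statement level, where F-measure rises from 18.18% to 39.2%.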
Figure 3. Impact of different NER components of BELMiner on BioCreative BEL task.