Zhijing Li1,2, Yuchen Lian1,2, Xiaoyong Ma1,2, Xiangrong Zhang3, Chen Li4,5.
Abstract
BACKGROUND: Semantic resources such as knowledge bases contain high-quality structured knowledge and therefore require significant effort from domain experts to curate. Using these resources to reinforce information retrieval from unstructured text may further exploit the potential of both the unstructured text resources and the curated knowledge.
Keywords: Attention mechanism; Bio-text-mining; Biological semantic relation; Knowledge base
Year: 2020 PMID: 32448122 PMCID: PMC7245897 DOI: 10.1186/s12859-020-3540-8
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
The details of the data
| Category | Values |
|---|---|
| Entity type | Gene, Protein |
| Relation type | Is_Functionally_Equivalent_To, Interacts_With, Has_Sequence_Identical_To, Transcribes_Or_Translates_To, Regulates_Expression, Is_Linked_To, Binds_To, Regulates_Molecule_Activity |
Results of entity extraction without KB information
| Model | P | R | F1 |
|---|---|---|---|
| SVM | 0.5914 | 0.5754 | 0.5833 |
| CRF | 0.5911 | 0.5915 | 0.5913 |
| BiLSTM | 0.5896 | 0.5760 | 0.5827 |
| BiLSTM-CRF | 0.6231 | 0.5825 | 0.6021 |
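The F1 values in the table above are consistent with the standard harmonic mean of precision and recall, $F_1 = 2PR/(P+R)$. As a quick sanity check, the sketch below recomputes F1 from the reported P and R for the four models. The function name `f1_score` and the `rows` dictionary are illustrative only and are not part of the original system.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R) pairs copied from the table above.
rows = {
    "SVM": (0.5914, 0.5754),
    "CRF": (0.5911, 0.5915),
    "BiLSTM": (0.5896, 0.5760),
    "BiLSTM-CRF": (0.6231, 0.5825),
}

# Recompute F1 and round to the four decimals used in the table.
recomputed = {model: round(f1_score(p, r), 4) for model, (p, r) in rows.items()}

for model, f1 in recomputed.items():
    print(f"{model}: F1 = {f1:.4f}")
```

For these four rows, the recomputed values agree with the reported F1 column to four decimal places.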
Results of entity extraction based on different models with KB information
| Model | P | R | F1 |
|---|---|---|---|
| BiLSTM-Uni-Bio | 0.6314 | 0.6154 | 0.6233 |
| BiLSTM-CRF-Uni-Bio | 0.6354 | 0.6282 | 0.6318 |
Results of relation extraction without KB information
| Model | P | R | F1 |
|---|---|---|---|
| SVM (Litway) | 0.4564 | 0.4343 | 0.4451 |
| BiLSTM-attention | 0.5024 | 0.4495 | 0.4512 |
| RNN-CNN | 0.5133 | 0.4199 | 0.4619 |
| BiLSTM (the whole sentences) | 0.4335 | 0.3654 | 0.3966 |
| BiLSTM | 0.4828 | 0.3930 | 0.4333 |
Results of relation extraction based on different models with KB information
| Model | P | R | F1 |
|---|---|---|---|
| BiLSTM (the whole sentences)-Uni-Bio | 0.4533 | 0.4006 | 0.4254 |
| BiLSTM-Uni-Bio | 0.5172 | 0.4575 | 0.4681 |
Results of relation extraction (BiLSTM-Uni-Bio)
| Relations | P | R | F1 |
|---|---|---|---|
| ThemeOf | 0.49 | 0.58 | 0.53 |
| CauseOf | 0.51 | 0.32 | 0.35 |
Results of relation extraction (the best system)
| Relations | P | R | F1 |
|---|---|---|---|
| ThemeOf | 0.50 | 0.51 | 0.51 |
| CauseOf | 0.55 | 0.22 | 0.32 |
Results of relation extraction (BioCreative data)
| Model | P | R | F1 |
|---|---|---|---|
| Rule-based | 0.3890 | 0.3010 | 0.3394 |
| CNN | 0.3653 | 0.2561 | 0.3011 |
| RNN-CNN | 0.3711 | 0.3288 | 0.3486 |
| CHEMPROT | 0.3732 | 0.3280 | 0.3491 |
| CNN-KB | 0.3602 | 0.3337 | 0.3464 |
| BiLSTM | 0.3215 | 0.3381 | 0.3296 |
| BiLSTM-KB | 0.3875 | 0.3157 | 0.3479 |
| KBLSTM | 0.3716 | 0.3276 | 0.3482 |
| BiLSTM-Uni-Bio | 0.3671 | 0.3331 | 0.3493 |
Fig. 1 An example of the relation extraction task in the BioNLP-2016 competition
Fig. 2 Flow chart of the proposed system. The system comprises preprocessing, word embedding, prior knowledge from UniProt KB, entity representation, BiLSTM, Bio-information retrieval (BioModels), and entity and relation extraction. For the prior knowledge from UniProt KB, we use the Bioservices, urllib, and BeautifulSoup tools to carry out a series of processing steps. For the Bio-information retrieval (BioModels) part, we apply the attention mechanism to import the prior knowledge into the system. We use this method for both entity extraction and relation extraction, with the main focus on relation extraction
Fig. 3 Entity representation. The entity representation includes information from both the KB and the scientific literature
An example of obtaining the information of the entity “Contactin-2” from UniProt
| Entity | Contactin-2 |
|---|---|
| Function | In conjunction with another transmembrane protein, CNTNAP2, contributes to the organization of axonal domains at nodes of Ranvier by maintaining voltage-gated potassium channels at the juxtaparanodal region. May be involved in cell adhesion. |
| Recommended name | Contactin-2 |
| Alternative names | Axonal glycoprotein TAG-1; Axonin-1; Transient axonal glycoprotein 1; TAX-1 |
| Gene names | CNTN2 |
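A record like the one above could be held in a small structure before it is turned into features. The sketch below is only an illustration: the class `UniProtEntry`, its field names, and the `surface_forms` helper are hypothetical, and the source does not say how the authors store the retrieved UniProt data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UniProtEntry:
    """Hypothetical container for the UniProt fields shown in the table."""
    entity: str
    function: str
    recommended_name: str
    alternative_names: List[str] = field(default_factory=list)
    gene_names: List[str] = field(default_factory=list)

    def surface_forms(self) -> List[str]:
        """All names the entity may appear under, de-duplicated, in order."""
        seen, forms = set(), []
        candidates = [self.entity, self.recommended_name,
                      *self.alternative_names, *self.gene_names]
        for name in candidates:
            if name not in seen:
                seen.add(name)
                forms.append(name)
        return forms


# Values copied from the table above.
contactin2 = UniProtEntry(
    entity="Contactin-2",
    function=("In conjunction with another transmembrane protein, CNTNAP2, "
              "contributes to the organization of axonal domains at nodes of "
              "Ranvier by maintaining voltage-gated potassium channels at the "
              "juxtaparanodal region. May be involved in cell adhesion."),
    recommended_name="Contactin-2",
    alternative_names=["Axonal glycoprotein TAG-1", "Axonin-1",
                       "Transient axonal glycoprotein 1", "TAX-1"],
    gene_names=["CNTN2"],
)

print(contactin2.surface_forms())
```

Collecting the recommended name, alternative names, and gene names in one list is one plausible way to match different mentions of the same entity in text.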
Fig. 4 An example of the process of extracting the entities related to “ATERF1” from BioModels. The figure walks through one specific search for the entities related to a given entity
Fig. 5 The architecture of the bio-information retrieval from BioModels. The attention mechanism introduces the information from BioModels into the BiLSTM architecture
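To make the idea of attention over retrieved knowledge concrete, here is a minimal pure-Python sketch. It assumes dot-product scoring between a BiLSTM hidden state and the vectors of related entities; the scoring function actually used by the authors is not given in this record, and the names `softmax`, `attend`, `hidden_state`, and `related_entities`, as well as the toy vectors, are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a non-empty list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(hidden: Sequence[float],
           related: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Dot-product attention (an assumed scoring function).

    Returns the attention weights over the related-entity vectors and the
    weighted sum of those vectors (the context vector).
    """
    if not related:
        raise ValueError("need at least one related-entity vector")
    scores = [sum(h * r for h, r in zip(hidden, vec)) for vec in related]
    weights = softmax(scores)
    context = [sum(w * vec[i] for w, vec in zip(weights, related))
               for i in range(len(hidden))]
    return weights, context


# Toy 2-dimensional vectors, for illustration only.
hidden_state = [1.0, 0.0]
related_entities = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

weights, context = attend(hidden_state, related_entities)
print("weights:", [round(w, 4) for w in weights])
print("context:", [round(c, 4) for c in context])
```

In this toy case the related entity whose vector is most similar to the hidden state receives the largest weight, so it contributes most to the context vector that would be combined with the BiLSTM representation.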
Fig. 6 BiLSTM model for entity and relation extraction. The flow chart shows how the BiLSTM network predicts the types of entities and relations