Yijia Zhang1,2, Wei Zheng1,3, Hongfei Lin1, Jian Wang1, Zhihao Yang1, Michel Dumontier4.
Abstract
Motivation: Adverse events resulting from drug-drug interactions (DDIs) pose a serious health issue. The ability to automatically extract DDIs described in the biomedical literature could further ongoing pharmacovigilance efforts. Most neural network-based methods focus on the sentence sequence to identify these DDIs; however, the shortest dependency path (SDP) between the two entities also contains valuable syntactic and semantic information. Effectively exploiting such information may improve DDI extraction.
Year: 2018 PMID: 29077847 PMCID: PMC6030919 DOI: 10.1093/bioinformatics/btx659
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. An illustration of the SDP. The example sentence is from the DDI extraction 2013 corpus. ‘Drug0’ and ‘Drug1’ denote the two targeted drug entities. The Stanford parser is used to syntactically parse the sentence and generate the dependency syntactic graph. The nodes and edges on the shortest path between ‘Drug0’ and ‘Drug1’ are shown in bold. The SDP between ‘Drug0’ and ‘Drug1’ can be extracted from this graph; its nodes and edges denote the tokens and dependency relations on the path, respectively
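The SDP extraction illustrated in Fig. 1 can be sketched in a few lines of code. The snippet below is a minimal illustration rather than the authors' pipeline: it assumes spaCy as a stand-in for the Stanford parser and uses networkx for the shortest-path search; the function name `shortest_dependency_path` and the first-match entity lookup are simplifications.

```python
# Minimal SDP sketch: spaCy stands in for the Stanford parser used in the paper;
# networkx finds the shortest path over the dependency graph.
import networkx as nx
import spacy

nlp = spacy.load("en_core_web_sm")

def shortest_dependency_path(sentence, entity0="Drug0", entity1="Drug1"):
    """Return the token texts on the SDP between two entity mentions."""
    doc = nlp(sentence)
    # Build an undirected graph over token indices from the dependency parse.
    graph = nx.Graph()
    for token in doc:
        for child in token.children:
            graph.add_edge(token.i, child.i, dep=child.dep_)
    # Locate the two entity mentions (first match of each, for simplicity).
    src = next(tok.i for tok in doc if tok.text == entity0)
    dst = next(tok.i for tok in doc if tok.text == entity1)
    return [doc[i].text for i in nx.shortest_path(graph, source=src, target=dst)]

# Entity mentions are pre-masked as 'Drug0' and 'Drug1', as in the corpus.
print(shortest_dependency_path("Drug0 may increase the serum concentration of Drug1."))
```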
Fig. 2. An overview of our hierarchical RNNs model on the sentence sequence and the SDP
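Fig. 2 combines a Bi-LSTM channel over the sentence sequence with a channel over the SDP. The PyTorch sketch below shows one plausible way to wire such a two-channel model; the hidden sizes, max-pooling and single linear classifier are assumptions for illustration and do not reproduce the authors' exact hierarchy or attention layers.

```python
# A minimal two-channel Bi-LSTM sketch over the sentence sequence and the SDP.
# Layer sizes and the pooling/classifier choices are illustrative assumptions.
import torch
import torch.nn as nn

class HierarchicalBiLSTM(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden=128, num_classes=5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.seq_lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.sdp_lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        # 5 classes: Advice, Effect, Mechanism, Int, Negative.
        self.classifier = nn.Linear(4 * hidden, num_classes)

    def forward(self, sequence_ids, sdp_ids):
        seq_out, _ = self.seq_lstm(self.embed(sequence_ids))   # (B, T, 2H)
        sdp_out, _ = self.sdp_lstm(self.embed(sdp_ids))        # (B, S, 2H)
        # Max-pool each channel over time and concatenate the two representations.
        seq_vec = seq_out.max(dim=1).values
        sdp_vec = sdp_out.max(dim=1).values
        return self.classifier(torch.cat([seq_vec, sdp_vec], dim=-1))

model = HierarchicalBiLSTM(vocab_size=10000)
logits = model(torch.randint(0, 10000, (2, 30)), torch.randint(0, 10000, (2, 8)))
print(logits.shape)  # torch.Size([2, 5])
```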
Statistics of the DDI extraction 2013 corpus
| Corpus | Advice | Effect | Mechanism | Int | Negative |
|---|---|---|---|---|---|
| Training set | 826 | 1687 | 1319 | 188 | 23772 |
| Test set | 221 | 360 | 302 | 96 | 4737 |
| Total | 1047 | 2047 | 1621 | 284 | 28554 |
The evaluation of different RNN models on performance

| RNNs model | Precision | Recall | F-score | Δ |
|---|---|---|---|---|
| Simple RNNs | 0.657 | 0.576 | 0.614 | |
| GRUs | 0.733 | 0.715 | 0.724 | +0.110 |
| LSTMs | 0.741 | 0.718 | 0.729 | +0.005 |

Note: ‘Δ’ denotes the corresponding improvement in F-score over the previous row.
The effect of the embedding features on performance

| Embedding feature | Precision | Recall | F-score | Δ |
|---|---|---|---|---|
| Word | 0.688 | 0.717 | 0.703 | |
| Word+POS | 0.717 | 0.713 | 0.715 | +0.012 |
| Word+POS+Position | 0.741 | 0.718 | 0.729 | +0.014 |

Note: ‘Word’, ‘POS’ and ‘Position’ denote word embedding, POS embedding and position embedding, respectively. ‘Δ’ denotes the corresponding improvement in F-score over the previous row.
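The ‘Word+POS+Position’ setting concatenates three embedding types per token. Below is a minimal sketch of such an input layer with two position channels (distance to each target drug); the vocabulary sizes and dimensions are assumptions for illustration, not the paper's configuration.

```python
# Per-token input built by concatenating word, POS and position embeddings.
# All sizes below are illustrative assumptions.
import torch
import torch.nn as nn

class TokenFeatures(nn.Module):
    def __init__(self, vocab=10000, pos_tags=50, max_dist=200,
                 word_dim=100, pos_dim=20, dist_dim=20):
        super().__init__()
        self.word = nn.Embedding(vocab, word_dim)
        self.pos = nn.Embedding(pos_tags, pos_dim)
        # Two position channels: distance to Drug0 and distance to Drug1,
        # shifted by max_dist so negative offsets remain valid indices.
        self.dist = nn.Embedding(2 * max_dist + 1, dist_dim)
        self.max_dist = max_dist

    def forward(self, word_ids, pos_ids, dist_to_drug0, dist_to_drug1):
        d0 = self.dist(dist_to_drug0 + self.max_dist)
        d1 = self.dist(dist_to_drug1 + self.max_dist)
        return torch.cat([self.word(word_ids), self.pos(pos_ids), d0, d1], dim=-1)

feats = TokenFeatures()
x = feats(torch.randint(0, 10000, (2, 30)), torch.randint(0, 50, (2, 30)),
          torch.randint(-20, 20, (2, 30)), torch.randint(-20, 20, (2, 30)))
print(x.shape)  # torch.Size([2, 30, 160])
```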
The effect of the model strategy on performance

| Model | Precision | Recall | F-score |
|---|---|---|---|
| B-LSTMs | 0.760 | 0.656 | 0.704 |
| Joint-LSTMs | 0.734 | 0.697 | 0.715 |
| SDP Bi-LSTMs | 0.592 | 0.474 | 0.526 |
| Sequence Bi-LSTMs | 0.702 | 0.691 | 0.696 |
| Hierarchy Bi-LSTMs | 0.725 | 0.689 | 0.707 |
| Hierarchy Bi-LSTMs + Att. | 0.730 | 0.703 | 0.717 |
| Hierarchy Bi-LSTMs + Att. + SDP | 0.741 | 0.718 | 0.729 |

Note: ‘Att.’ denotes the use of the embedding attention mechanism.
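The ‘Att.’ rows apply attention at the embedding layer. The sketch below shows one generic form of such embedding attention, weighting each token embedding by its dot-product relevance to a drug-entity embedding; the scoring function is an assumption, not necessarily the formulation used in the paper.

```python
# Generic embedding attention: re-weight token embeddings by relevance to an entity.
import torch
import torch.nn.functional as F

def embedding_attention(token_emb, entity_emb):
    """token_emb: (B, T, D); entity_emb: (B, D). Returns re-weighted embeddings."""
    # Dot-product relevance of every token to the entity, normalized over the sentence.
    scores = torch.einsum("btd,bd->bt", token_emb, entity_emb)
    weights = F.softmax(scores, dim=1)            # (B, T)
    return token_emb * weights.unsqueeze(-1)      # scale each token embedding

tok = torch.randn(2, 30, 100)
ent = torch.randn(2, 100)
print(embedding_attention(tok, ent).shape)  # torch.Size([2, 30, 100])
```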
Performance comparison with other state-of-the-art methods on DDI extraction 2013 corpus
| Methods | System | Advice | Effect | Mechanism | Int | Precision | Recall | F-score |
|---|---|---|---|---|---|---|---|---|
| Feature-based methods | UTurku | 0.630 | 0.600 | 0.582 | 0.507 | 0.732 | 0.499 | 0.594 |
| | | 0.725 | 0.662 | 0.693 | 0.483 | — | — | 0.670 |
| | | 0.774 | 0.696 | 0.736 | 0.524 | 0.737 | 0.687 | 0.711 |
| Kernel-based methods | FBK-irst | 0.692 | 0.628 | 0.679 | | 0.646 | 0.656 | 0.651 |
| | WBI | 0.632 | 0.610 | 0.618 | 0.510 | 0.642 | 0.579 | 0.609 |
| | | 0.714 | 0.713 | 0.669 | 0.516 | — | — | 0.684 |
| Neural networks-based methods | SCNN (2016) | — | — | — | — | 0.725 | 0.651 | 0.686 |
| | | 0.782 | 0.682 | 0.722 | 0.510 | | 0.653 | 0.702 |
| | | 0.777 | 0.693 | 0.702 | 0.464 | 0.757 | 0.647 | 0.698 |
| | Joint-LSTMs | 0.794 | 0.676 | | 0.431 | 0.734 | 0.697 | 0.715 |
| | | — | — | — | — | 0.737 | 0.708 | 0.722 |
| Our method | | | | 0.740 | 0.543 | 0.741 | 0.718 | 0.729 |
Note: The Advice, Effect, Mechanism and Int columns give per-type F-scores; the Precision, Recall and F-score columns give the overall performance. In the original table, the highest value in each column is shown in bold. ‘—’ denotes that the value is not provided in the corresponding paper.
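The overall columns score the four positive DDI types, with ‘Negative’ treated as no interaction. A minimal sketch of such a micro-averaged evaluation is given below; it mirrors the usual DDIExtraction 2013 metric, which is assumed here rather than quoted from the paper.

```python
# Micro-averaged precision/recall/F1 over the positive DDI types
# (Advice, Effect, Mechanism, Int); 'Negative' counts as no relation.
def micro_prf(gold, pred, negative_label="Negative"):
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if p != negative_label and p == g:
            tp += 1
        elif p != negative_label and p != g:
            fp += 1
            if g != negative_label:
                fn += 1          # a positive instance was mislabeled
        elif p == negative_label and g != negative_label:
            fn += 1              # a positive instance was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

gold = ["Advice", "Effect", "Negative", "Int", "Mechanism"]
pred = ["Advice", "Negative", "Negative", "Effect", "Mechanism"]
print(micro_prf(gold, pred))  # (0.667, 0.5, 0.571) approximately
```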