Masaki Asada, Makoto Miwa, Yutaka Sasaki.
Abstract
MOTIVATION: Neural methods to extract drug-drug interactions (DDIs) from literature require a large number of annotations. In this study, we propose a novel method to effectively utilize external drug database information as well as information from large-scale plain text for DDI extraction. Specifically, we focus on drug description and molecular structure information as the drug database information.
Year: 2021 PMID: 33098410 PMCID: PMC8289381 DOI: 10.1093/bioinformatics/btaa907
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Overview of our method. (A) illustrates how to encode input sentences, drug descriptions and drug molecular structures. (B) and (C) show the prediction layer when the drug description representation and the drug molecular structure representation are used.
An example of preprocessing
| Mention1 | Mention2 | Preprocessed input sentence |
|---|---|---|
| | | |
| | | |
| | | |
Note: The input sentence contains three target drug pairs.
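To make the note concrete: a sentence with three drug mentions has three unordered mention pairs, so it produces three classifier inputs. The sketch below assumes a common masking convention in which the target pair and the remaining mentions are replaced by placeholder tokens. The token names `DRUG1`, `DRUG2` and `DRUGOTHER`, the helper `make_pair_inputs`, and the example sentence are illustrative assumptions, not content recovered from the table.

```python
from itertools import combinations


def make_pair_inputs(tokens, mention_idxs):
    """Return one (mention1, mention2, masked_sentence) triple per mention pair.

    tokens: list of words in the sentence.
    mention_idxs: token indices of (single-token) drug mentions.
    Placeholder token names are assumed for illustration.
    """
    results = []
    for i, j in combinations(sorted(mention_idxs), 2):
        masked = list(tokens)
        for k in mention_idxs:
            if k == i:
                masked[k] = "DRUG1"      # first target mention
            elif k == j:
                masked[k] = "DRUG2"      # second target mention
            else:
                masked[k] = "DRUGOTHER"  # any non-target drug mention
        results.append((tokens[i], tokens[j], " ".join(masked)))
    return results


# Hypothetical sentence with three drug mentions (token indices 0, 3 and 5).
sentence = "DrugA interacts with DrugB and DrugC".split()
pairs = make_pair_inputs(sentence, [0, 3, 5])
for mention1, mention2, text in pairs:
    print(mention1, mention2, text, sep=" | ")
```

Each pair gets its own copy of the sentence, so the classifier always sees which two mentions it is asked about.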
Fig. 2. Illustration of molecular fingerprints. This figure shows the extraction of several fingerprint subgraphs from a molecular structure when the radius is 2.
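The idea of radius-based fingerprint subgraphs can be sketched in a few lines. The code below is a simplified stand-in, not the authors' implementation: it treats a molecule as a plain adjacency list and collects, for each atom, the set of atoms reachable within `radius` bonds. Real fingerprints also encode atom and bond types and map each subgraph to an identifier. The function names and the three-atom toy molecule are assumptions made for illustration.

```python
from collections import deque


def radius_subgraph(adjacency, center, radius):
    """Atoms within `radius` bonds of `center`, found by breadth-first search."""
    distance = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if distance[node] == radius:
            continue  # do not expand beyond the requested radius
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return frozenset(distance)


def fingerprint_subgraphs(adjacency, max_radius):
    """Map (atom, radius) to its subgraph for every radius from 0 to max_radius."""
    return {
        (atom, r): radius_subgraph(adjacency, atom, r)
        for atom in adjacency
        for r in range(max_radius + 1)
    }


# Toy molecule: a C-C-O chain (heavy atoms only), indexed 0, 1, 2.
adjacency = {0: [1], 1: [0, 2], 2: [1]}
subgraphs = fingerprint_subgraphs(adjacency, max_radius=2)
for (atom, r), atoms in sorted(subgraphs.items()):
    print(f"atom {atom}, radius {r}: {sorted(atoms)}")
```

With radius 0 every subgraph is a single atom; larger radii capture progressively more of each atom's neighborhood.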
Statistics of SemEval-2013 dataset
| | Train (DrugBank) | Train (MEDLINE) | Test (DrugBank) | Test (MEDLINE) |
|---|---|---|---|---|
| # documents | 572 | 142 | 158 | 33 |
| # sentences | 5675 | 1301 | 973 | 326 |
| # drug pairs | 26 005 | 1787 | 5265 | 451 |
| # positive pairs | 3789 | 232 | 884 | 95 |
| # negative pairs | 22 216 | 1555 | 4381 | 356 |
| Mechanism | 1257 | 62 | 278 | 24 |
| Effect | 1535 | 152 | 298 | 62 |
| Advice | 818 | 8 | 214 | 7 |
| Int. | 179 | 10 | 94 | 2 |
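The counts in this table can be cross-checked mechanically: positive and negative pairs should sum to the number of drug pairs, and the four DDI types should sum to the positive pairs. The sketch below copies the numbers from the table; the dictionary layout and the helper names `is_consistent` and `positive_rate` are illustrative assumptions.

```python
# Counts copied from the statistics table; keys are (split, corpus).
stats = {
    ("train", "DrugBank"): {
        "pairs": 26005, "positive": 3789, "negative": 22216,
        "types": {"Mechanism": 1257, "Effect": 1535, "Advice": 818, "Int.": 179},
    },
    ("train", "MEDLINE"): {
        "pairs": 1787, "positive": 232, "negative": 1555,
        "types": {"Mechanism": 62, "Effect": 152, "Advice": 8, "Int.": 10},
    },
    ("test", "DrugBank"): {
        "pairs": 5265, "positive": 884, "negative": 4381,
        "types": {"Mechanism": 278, "Effect": 298, "Advice": 214, "Int.": 94},
    },
    ("test", "MEDLINE"): {
        "pairs": 451, "positive": 95, "negative": 356,
        "types": {"Mechanism": 24, "Effect": 62, "Advice": 7, "Int.": 2},
    },
}


def is_consistent(column):
    """Check that the pair counts and the per-type counts add up."""
    pairs_ok = column["positive"] + column["negative"] == column["pairs"]
    types_ok = sum(column["types"].values()) == column["positive"]
    return pairs_ok and types_ok


def positive_rate(column):
    """Fraction of drug pairs that carry an interaction label."""
    return column["positive"] / column["pairs"]


for key, column in stats.items():
    print(key, is_consistent(column), f"{positive_rate(column):.1%}")
```

The positive rate also makes the class imbalance visible: in every column, most candidate pairs are negative.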
Fig. 3. Linking between mentions and DrugBank entries
Hyper-parameters for CNNs
| Parameter | Value |
|---|---|
| Word embedding size | 768 |
| Initial learning rate | 5e-5 |
| Number of fine-tuning epochs | 3 |
| L2 weight decay | 0.01 |
| Dropout rate | 0.1 |
| Mini-batch size | 32 |
| Word position embedding size | 10 |
| Convolution window size | 5 |
| Convolution filter size | 768 |
| Convolution window size for description | 3 |
| Convolution filter size for description | 20 |
Hyper-parameters for GNNs
| Parameter | Value |
|---|---|
| Molecular embedding size | 50 |
| Number of hidden layers | 5 |
| Radius | 1 |
Evaluation on DDI extraction from texts on the test set
| Method | P | R | F (%) |
|---|---|---|---|
| | 75.29 | 60.37 | 67.01 |
| BioBERT | — | — | 78.8 |
| Text-only (word2vec CNN) | 71.97 | 68.44 | 70.16 |
| Text-only (SciBERT linear) | 80.28 | 81.92 | 81.09 |
| Text-only (SciBERT CNN) | 83.10 | 80.38 | 81.72 |
| + Desc | 84.05 | 81.81 | 82.91 |
| + Mol (radius = 0) | 83.29 | 82.02 | 82.65 |
| + Mol (radius = 1) | 83.57 | 82.12 | 82.84 |
| + Mol (radius = 2) | 83.66 | 81.10 | 82.36 |
| + Desc + Mol (radius = 1) | | | |
| + Desc + Mol (radius = 0,1,2) | 84.51 | 82.53 | 83.51 |
| + Mol (radius = 0,1,2) | 84.69 | 82.53 | 83.60 |
Note: We define the Text-only (SciBERT CNN) model as our baseline model. The best score is shown in bold.
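The three numeric columns of this table are consistent with precision, recall and F-score, where the F-score is the harmonic mean of the other two. The sketch below recomputes F from the first two columns for a few rows copied from the table and compares it with the reported value; the function name `f_score` and the choice of rows are illustrative.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (method, P, R, reported F) copied from the test-set table.
rows = [
    ("Text-only (SciBERT linear)", 80.28, 81.92, 81.09),
    ("Text-only (SciBERT CNN)", 83.10, 80.38, 81.72),
    ("+ Desc", 84.05, 81.81, 82.91),
    ("+ Mol (radius = 1)", 83.57, 82.12, 82.84),
]

# Largest difference between the recomputed and the reported F-score.
max_gap = max(abs(f_score(p, r) - f) for _, p, r, f in rows)

for name, p, r, f in rows:
    print(f"{name}: recomputed {f_score(p, r):.2f}, reported {f:.2f}")
```

Small differences in the last digit are expected, because the published precision and recall values are themselves rounded.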
Evaluation on DDI extraction from texts on the development set
| Method | P | R | F (%) |
|---|---|---|---|
| Text-only (SciBERT CNN) | 83.55 | 80.19 | 81.84 |
| + Desc | 83.19 | 82.31 | 82.75 |
| + Mol (radius = 0) | 83.73 | 81.25 | 82.47 |
| + Mol (radius = 1) | 82.85 | 83.90 | 83.37 |
| + Mol (radius = 2) | 82.88 | 83.58 | 83.23 |
| + Desc + Mol (radius = 1) | | | |
Performance on individual DDI types in F-scores
| Method | Mech. | Effect | Adv. | Int. (%) |
|---|---|---|---|---|
| Text-only | 86.18 | 79.12 | 88.34 | 55.94 |
| + Desc | | 81.08 | | |
| + Mol (radius = 0) | | 81.20 | 90.67 | |
| + Mol (radius = 1) | 86.33 | 80.48 | | |
| + Mol (radius = 2) | | | 88.58 | 57.34 |
| + Desc + Mol (radius = 1) | 87.61 | 82.05 | 90.79 | 58.74 |
Note: The best score for each type is shown in bold and the scores lower than the baseline model are shown with underlines.
Individual F-scores on 5-fold cross-validated training dataset
| Fold | Method | Mech. | Effect | Adv. | Int. (%) |
|---|---|---|---|---|---|
| Fold 1 | Text-only | 84.60 | | 85.80 | 68.29 |
| | + Desc | | | | |
| | + Mol (radius = 1) | | | | |
| | + Desc + Mol (radius = 1) | | | | |
| Fold 2 | Text-only | 83.46 | 83.26 | 78.80 | |
| | + Desc | 84.15 | | 81.99 | |
| | + Mol (radius = 1) | | | 81.64 | |
| | + Desc + Mol (radius = 1) | | 83.38 | | |
| Fold 3 | Text-only | 84.91 | 59.21 | | 91.43 |
| | + Desc | | | | 91.67 |
| | + Mol (radius = 1) | | 86.24 | | |
| | + Desc + Mol (radius = 1) | | 87.25 | | 92.96 |
| Fold 4 | Text-only | 76.81 | 81.56 | 78.01 | 79.45 |
| | + Desc | 77.54 | 82.47 | | 81.16 |
| | + Mol (radius = 1) | | 84.03 | | |
| | + Desc + Mol (radius = 1) | 77.35 | | 79.40 | |
| Fold 5 | Text-only | 81.97 | 81.76 | | 76.54 |
| | + Desc | 84.95 | 83.02 | | |
| | + Mol (radius = 1) | 86.09 | 83.74 | | |
| | + Desc + Mol (radius = 1) | | | | |
| Average | Text-only | 82.34 | 76.99 | 81.67 | |
| | + Desc | 83.09 | 84.39 | | |
| | + Mol (radius = 1) | 82.47 | 83.57 | | |
| | + Desc + Mol (radius = 1) | | | | |
Note: We used the micro-averaged F-score to calculate the average of the folds. The best score for each type is shown in bold and the scores lower than the baseline model are shown with underlines.
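One common reading of micro-averaging over folds is to pool the true positives, false positives and false negatives of all folds before computing a single F-score, rather than averaging per-fold F-scores. The sketch below contrasts the two under that assumption. The counts are hypothetical and chosen only to make the difference visible; they are not taken from the paper.

```python
def f_from_counts(tp, fp, fn):
    """F-score from true-positive, false-positive and false-negative counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def micro_f(fold_counts):
    """Pool the counts of all folds, then compute one F-score."""
    tp = sum(c[0] for c in fold_counts)
    fp = sum(c[1] for c in fold_counts)
    fn = sum(c[2] for c in fold_counts)
    return f_from_counts(tp, fp, fn)


def macro_f(fold_counts):
    """Compute an F-score per fold, then take the unweighted mean."""
    scores = [f_from_counts(*c) for c in fold_counts]
    return sum(scores) / len(scores)


# Hypothetical (tp, fp, fn) counts for two folds of very different size.
folds = [(80, 20, 20), (10, 0, 30)]
micro = micro_f(folds)
macro = macro_f(folds)
print(f"micro-averaged F: {micro:.3f}, macro-averaged F: {macro:.3f}")
```

Pooling weights each fold by its number of instances, so a fold with few examples of a rare type does not dominate the average.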
Comparisons of F-scores on different parts of the test set
| Method | MEDLINE | DrugBank | Overall (%) |
|---|---|---|---|
| Text-only (SciBERT CNN) | 74.57 | 82.44 | 81.72 |
| + Desc | 74.41 | 83.75 | 82.91 |
| + Mol (radius = 0) | 75.00 | 83.41 | 82.65 |
| + Mol (radius = 1) | 73.98 | 83.71 | 82.84 |
| + Mol (radius = 2) | 74.57 | 83.15 | 82.36 |
| + Desc + Mol (radius = 1) | | | |
Accuracy of binary classification on the DrugBank pairs
| | | Accuracy (%) |
|---|---|---|
| Description | SciBERT | 91.05 |
| Molecular structure | GNN (radius = 0) | 67.58 |
| | GNN (radius = 1) | 82.21 |
| | GNN (radius = 2) | 89.36 |
Evaluation on DDI extraction from texts with or without pre-training of GNNs for the molecular structure and CNNs for the description
| | Method | P | R | F (%) |
|---|---|---|---|---|
| | SciBERT | 83.10 | 80.38 | 81.72 |
| w/ pre-training | + Desc | | 79.26 | 81.85 |
| | + Mol (radius = 0) | 82.69 | 81.00 | 81.83 |
| | + Mol (radius = 1) | 84.51 | 80.28 | 82.34 |
| | + Mol (radius = 2) | 82.36 | 80.28 | 81.74 |
| w/o pre-training | + Desc | 84.05 | 81.81 | 82.91 |
| | + Mol (radius = 0) | 83.29 | 82.02 | 82.65 |
| | + Mol (radius = 1) | 83.57 | 82.12 | 82.84 |
| | + Mol (radius = 2) | 83.66 | 81.10 | 82.36 |
Fig. 4. F-scores for different sentence lengths on the 5-fold cross-validated training dataset. We used the micro-averaged F-score to calculate the average of the folds.