Huiwei Zhou, Chengkun Lang, Zhuang Liu, Shixian Ning, Yingyu Lin, Lei Du.
Abstract
BACKGROUND: Automatic extraction of chemical-disease relations (CDR) from unstructured text is essential for disease treatment and drug development. Meanwhile, biomedical experts have built many highly structured knowledge bases (KBs), which contain prior knowledge about chemicals and diseases. This prior knowledge provides strong support for CDR extraction, and how to make full use of it is worth studying.
Keywords: Attention mechanism; CDR extraction; Context features; Gating units; Knowledge representations
Year: 2019 PMID: 31113357 PMCID: PMC6528333 DOI: 10.1186/s12859-019-2873-7
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 The dependency tree of sentence 1 with chemical “pilocarpine” and disease “seizures”
Fig. 2 The framework of the knowledge-guided convolutional networks
Table 1 Statistics of the CDR dataset
| Dataset | Articles | Chemical Men | Chemical ID | Disease Men | Disease ID | CID |
|---|---|---|---|---|---|---|
| Training | 500 | 5203 | 1467 | 4182 | 1965 | 1038 |
| Development | 500 | 5347 | 1507 | 4244 | 1865 | 1012 |
| Test | 500 | 5385 | 1435 | 4424 | 1988 | 1066 |
Men, ID and CID denote the number of mentions, MeSH IDs and CID relations, respectively
Table 2 Settings of hyper-parameters
| Description | Value |
|---|---|
| TransE epochs | 500 |
| Word embedding dimension | 100 |
| Entity/relation embedding dimension | 100 |
| Filter number | 100 |
| Mini-batch size | 20 |
| Learning rate of intra-sentence instances | 0.0001 |
| Learning rate of inter-sentence instances | 0.0002 |
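
The hyper-parameter settings listed above can be gathered into a single configuration object. The following is a minimal sketch only: the values are copied from the table, but the dictionary keys and the `learning_rate` helper are hypothetical names introduced for illustration and are not taken from the authors' code.

```python
# Values copied from the hyper-parameter table; all key names are hypothetical.
HYPER_PARAMS = {
    "transe_epochs": 500,          # TransE epochs
    "word_embedding_dim": 100,     # word embedding dimension
    "kb_embedding_dim": 100,       # entity/relation embedding dimension
    "num_filters": 100,            # filter number
    "batch_size": 20,              # batch size
    "lr_intra_sentence": 0.0001,   # learning rate, intra-sentence instances
    "lr_inter_sentence": 0.0002,   # learning rate, inter-sentence instances
}


def learning_rate(level: str) -> float:
    """Return the learning rate for 'intra' or 'inter' sentence-level instances."""
    key = f"lr_{level}_sentence"
    if key not in HYPER_PARAMS:
        raise ValueError(f"unknown level: {level!r} (expected 'intra' or 'inter')")
    return HYPER_PARAMS[key]


print(learning_rate("intra"), learning_rate("inter"))
```

Keeping the two learning rates behind one lookup makes it explicit that intra-sentence and inter-sentence instances are trained with different step sizes.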
Table 3 Effects of different prior knowledge on performance on the CDR dataset
| Method | Intra-sentence P (%) | Intra-sentence R (%) | Intra-sentence F (%) | Inter-sentence P (%) | Inter-sentence R (%) | Inter-sentence F (%) | Document P (%) | Document R (%) | Document F (%) |
|---|---|---|---|---|---|---|---|---|---|
| | 70.61 | | | | | | | | |
| | | 57.04 | 63.43 | 57.71 | 10.88 | 18.31†† | 68.82 | 67.92 | 68.37† |
| | 60.99 | 53.38 | 56.93†† | 40.57 | 8.07 | 13.46†† | 57.21 | 61.44 | 59.25†† |
| | 58.03 | 53.56 | 55.71†† | 47.69 | 5.82 | 10.37†† | 56.82 | 59.38 | 58.07†† |
The description and analysis of Table 3 can be found in the subsection “Effects of prior knowledge”. The markers † and †† denote P-values < 0.05 and < 0.01, respectively, from a pairwise t-test against KCN. The highest scores are highlighted in bold
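
The P, R and F columns in the result tables are related by a simple formula. As a minimal sketch, assuming F is the balanced F1-score (the harmonic mean of precision and recall, which a later table note names explicitly as "F1-score"), the reported F values can be recomputed from P and R. The three (P, R, F) triples below are copied from one fully preserved row of the table above; the function name `f_score` is a hypothetical helper.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F1: harmonic mean of precision and recall (same units as inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F) at intra-sentence, inter-sentence and document level,
# taken from a single row of the table above (percentages).
reported = [
    (60.99, 53.38, 56.93),
    (40.57, 8.07, 13.46),
    (57.21, 61.44, 59.25),
]

for p, r, f in reported:
    print(f"P={p:.2f} R={r:.2f} computed F={f_score(p, r):.2f} reported F={f:.2f}")
```

The same check applies to any row in which all three values of a level are present.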
Table 4 Influences of curated CDR articles on the relation extraction results
| Method | Intra-sentence P (%) | Intra-sentence R (%) | Intra-sentence F (%) | Inter-sentence P (%) | Inter-sentence R (%) | Inter-sentence F (%) | Document P (%) | Document R (%) | Document F (%) |
|---|---|---|---|---|---|---|---|---|---|
| | | 60.41 | | | 12.57 | 21.09 | | 72.98 | |
| - | 64.44 | 53.38 | 58.39 | 45.41 | 8.35 | 14.10 | 60.98 | 61.73 | 61.35 |
| - | 64.07 | 60.23 | 62.09 | 54.63 | 11.07 | 18.41 | 62.40 | 71.29 | 66.55 |
| - | 65.53 | 48.87 | 55.99 | 49.71 | 8.16 | 14.02 | 61.76 | 57.41 | 59.50 |
| Only KB | 59.44 | | 62.48 | 31.34 | | | 50.41 | | 63.90 |
The description and analysis of Table 4 can be found in the subsection “Influences of the curated articles in the CDR dataset”. The highest scores are highlighted in bold
Table 5 Effects of each component of architecture on performance on the CDR dataset
| Method | Intra-sentence P (%) | Intra-sentence R (%) | Intra-sentence F (%) | Inter-sentence P (%) | Inter-sentence R (%) | Inter-sentence F (%) | Document P (%) | Document R (%) | Document F (%) |
|---|---|---|---|---|---|---|---|---|---|
| | | 60.41 | | | | | | | |
| | 67.71 | | 63.96† | 60.95 | 9.66 | 16.68†† | 66.70 | 70.26 | 68.43†† |
| | 63.37 | 52.25 | 57.28†† | 42.55 | 9.38 | 15.37†† | 58.98 | 61.63 | 60.28†† |
The description and analysis of Table 5 can be found in the subsection “Effects of architecture”. The markers † and †† denote P-values < 0.05 and < 0.01, respectively, from a pairwise t-test against KCN. The highest scores are highlighted in bold
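
The significance markers in these tables come from a pairwise t-test against KCN. The record does not say how the test was set up, so the following is only a sketch under the assumption that it is a standard paired t-test over matched scores (for example, scores of two systems on the same runs or folds). The function name `paired_t_statistic` is hypothetical, and no per-run scores from the paper are reproduced here.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """t statistic of a paired t-test on two equally long lists of matched scores.

    The statistic is mean(d) / (stdev(d) / sqrt(n)) with d = a - b, and it has
    n - 1 degrees of freedom. The P-value is then read from the t distribution.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    return mean(diffs) / (sd / math.sqrt(n))
```

A larger absolute t value with the same number of pairs corresponds to a smaller P-value, which is what the † (P < 0.05) and †† (P < 0.01) markers summarize.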
Table 6 Effects of different parameter sharing strategies on performance on the CDR dataset
| Method | Intra-sentence P (%) | Intra-sentence R (%) | Intra-sentence F (%) | Inter-sentence P (%) | Inter-sentence R (%) | Inter-sentence F (%) | Document P (%) | Document R (%) | Document F (%) |
|---|---|---|---|---|---|---|---|---|---|
| | | 60.41 | | 65.37 | | | 69.65 | | |
| | 70.09 | 60.23 | 64.78 | 62.96 | 9.57 | 16.61†† | 69.02 | 69.79 | 69.40† |
| | 70.39 | 60.23 | 64.91† | | 10.23 | 17.78†† | | 70.45 | 70.25†† |
| | 69.15 | | 64.81† | 63.19 | 10.79 | 18.43†† | 68.18 | 71.76 | 69.93†† |
The description and analysis of Table 6 can be found in the subsection “Effects of sharing parameters”. The markers † and †† denote P-values < 0.05 and < 0.01, respectively, from a pairwise t-test against KCN. The highest scores are highlighted in bold
Table 7 Effects of different gating mechanisms in the gated convolutional layer on performance on the CDR dataset
| Method | Intra-sentence P (%) | Intra-sentence R (%) | Intra-sentence F (%) | Inter-sentence P (%) | Inter-sentence R (%) | Inter-sentence F (%) | Document P (%) | Document R (%) | Document F (%) |
|---|---|---|---|---|---|---|---|---|---|
| | 70.61 | | | | | | 69.65 | | |
| | | 59.29 | 64.92†† | 62.05 | 11.35 | 19.19†† | | 70.64 | 70.31†† |
| | 71.38 | 59.66 | 65.00†† | 60.11 | 10.32 | 17.61†† | 69.46 | 69.98 | 69.72†† |
The description and analysis of Table 7 can be found in the subsection “Effects of gating mechanisms”. The markers † and †† denote P-values < 0.05 and < 0.01, respectively, from a pairwise t-test against KCN. The highest scores are highlighted in bold
Fig. 3 The attention visualization of a negative instance
Fig. 4 The gating visualization of a positive instance
Table 8 Comparison with previous systems of CDR extraction
| Method | | System | P (%) | R (%) | F (%) |
|---|---|---|---|---|---|
| without KBs | Feature-based | Gu et al. [ | 62.00 | 55.10 | 58.30 |
| | Neural network-based | Nguyen et al. [ | 57.00 | 68.60 | 62.30 |
| | | Le et al. [ | 58.02 | 76.20 | |
| | | Verga et al. [ | 55.60 | 70.80 | 62.10 |
| with KBs | Feature-based | Pons et al. [ | 73.10 | 67.60 | 70.20 |
| | | Peng et al. [ | 68.15 | 66.04 | 67.08 |
| | | ♠Peng et al. [ | 71.07 | 72.61 | |
| | Neural network-based | Li et al. [ | 59.97 | 81.49 | 69.09 |
| | | Zhou et al. [ | 60.51 | 80.48 | 69.08 |
| | | | 69.65 | 72.98 | |
| | | ♠ | 72.12 | 68.67 | 70.35 |
The description and analysis of Table 8 can be found in the subsection “Comparison with previous works”. The marker ♠ indicates that the system uses additional weakly labeled data for training. The highest F1-score of each subgroup is highlighted in bold
Table 9 Comparison with previous systems under the three conditions
| Condition | System | P (%) | R (%) | F (%) |
|---|---|---|---|---|
| (I) | | 62.15 | 46.28 | 53.70 |
| | | 48.24 | 66.89 | 56.05 |
| | | 56.82 | 59.38 | 58.07 |
| (II) | | 68.55 | 59.10 | 63.48 |
| | | 60.51 | 80.48 | 69.08 |
| | | 69.65 | 72.98 | |
| (III) | | 59.95 | 45.78 | 51.91 |
| | | 60.30 | 55.72 | 57.92 |
| | | 62.68 | 57.04 | 59.73 |
The description and analysis of Table 9 can be found in the subsection “Influences of CDR triples on previous works”. The three conditions are (I) without KBs; (II) with KBs; (III) with KBs but with the CDR triples in the CDR test set removed. The highest F1-score is highlighted in bold
Fig. 5 The error distribution of origins of FPs and FNs. FNs-IC denotes incorrectly classified FNs and FNs-MC denotes missing classified FNs