Guiduo Duan, Jiayu Miao, Tianxi Huang, Wenlong Luo, Dekun Hu.
Abstract
Relation extraction is a popular subtask in natural language processing (NLP). In the task of entity relation joint extraction, overlapping entities and multi-type relation extraction in overlapping triplets remain a challenging problem. The classification of relations by sharing the same probability space will ignore the correlation information among multiple relations. A relational-adaptive entity relation joint extraction model based on multi-head self-attention and densely connected graph convolution network (which is called MA-DCGCN) is proposed in the paper. In the model, the multi-head attention mechanism is specifically used to assign weights to multiple relation types among entities so as to ensure that the probability space of multiple relation is not mutually exclusive. This mechanism also predicts the strength of the relationship between various relationship types and entity pairs flexibly. The structure information of deeper level in the text graph is extracted by the densely connected graph convolution network, and the interaction information of entity relation is captured. To demonstrate the superior performance of our model, we conducted a variety of experiments on two widely used public datasets, NYT and WebNLG. Extensive results show that our model achieves state-of-the-art performance. Especially, the detection effect of overlapping triplets is significantly improved compared with the several existing mainstream methods.Entities:
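The abstract's key idea is that each relation type gets its own attention head and its own (non-exclusive) probability, instead of one softmax shared by all relations. A minimal numpy sketch of that idea follows; the shapes, weight names, and scoring form are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def relation_scores(h_subj, h_obj, W_q, W_k):
    """Score every relation type independently with its own head (sketch).

    h_subj, h_obj : (d,) representations of the subject/object entity
    W_q, W_k      : (R, d, d) one query/key projection per relation type
                    (hypothetical parameter names, not from the paper)
    Returns a (R,) vector of sigmoid scores: each relation type has its own
    probability space, so several relations can fire for one entity pair --
    this is what makes overlapping triplets expressible.
    """
    q = W_q @ h_subj                               # (R, d) per-relation queries
    k = W_k @ h_obj                                # (R, d) per-relation keys
    logits = (q * k).sum(-1) / np.sqrt(h_subj.shape[0])  # scaled dot-product
    return 1.0 / (1.0 + np.exp(-logits))           # sigmoid, NOT a shared softmax

rng = np.random.default_rng(0)
d, R = 8, 4                                        # hidden size, relation types
scores = relation_scores(rng.normal(size=d), rng.normal(size=d),
                         rng.normal(size=(R, d, d)), rng.normal(size=(R, d, d)))
print(scores.shape)                                # (4,) -- one score per relation
```

Because each score is an independent sigmoid rather than a slice of one softmax, the scores need not sum to 1, which is exactly the "non-mutually-exclusive probability space" property the abstract describes.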
Keywords: DCGCN; entity relation joint extraction; graph convolutional networks; overlapping triplets detection; relational-adaptive mechanism
Year: 2021 PMID: 33796016 PMCID: PMC8008121 DOI: 10.3389/fnbot.2021.635492
Source DB: PubMed Journal: Front Neurorobot ISSN: 1662-5218 Impact factor: 2.650
An example of an overlapping triplet.
| SingleEntityOverlap | EntityPairOverlap |
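The concrete example triplets did not survive extraction, so here is a small hedged sketch (with hypothetical triples) of the two overlap categories the table contrasts: EntityPairOverlap (two triples share both entities) and SingleEntityOverlap (two triples share exactly one entity).

```python
from itertools import combinations

def overlap_class(triples):
    """Classify a sentence's (subject, relation, object) triples (sketch).

    EPO (EntityPairOverlap): some pair of triples shares both entities.
    SEO (SingleEntityOverlap): some pair shares exactly one entity.
    The example triples below are hypothetical, not from the paper.
    """
    for (s1, _, o1), (s2, _, o2) in combinations(triples, 2):
        if {s1, o1} == {s2, o2}:                   # same entity pair, new relation
            return "EntityPairOverlap"
    for (s1, _, o1), (s2, _, o2) in combinations(triples, 2):
        if len({s1, o1} & {s2, o2}) == 1:          # exactly one shared entity
            return "SingleEntityOverlap"
    return "Normal"

epo = [("Newton", "born_in", "England"), ("Newton", "died_in", "England")]
seo = [("Newton", "born_in", "England"), ("Newton", "wrote", "Principia")]
print(overlap_class(epo))   # EntityPairOverlap
print(overlap_class(seo))   # SingleEntityOverlap
```

These are the two categories (plus Normal) used to partition the datasets in the statistics table below.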
Figure 1. The MA-DCGCN model for joint entity and relation extraction.
Figure 2. An illustration of a three-layer densely connected graph convolutional network.
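The defining feature of a densely connected GCN block (as in Figure 2) is that each layer receives the concatenation of the input features and all previous layers' outputs, so deeper layers see every earlier level of graph structure. A minimal numpy sketch under assumed shapes (the growth size and weight layout are illustrative, not the paper's configuration):

```python
import numpy as np

def dcgcn_block(A, X, weights):
    """Densely connected GCN block (sketch).

    A       : (n, n) normalized adjacency matrix of the text graph
    X       : (n, d) initial node features
    weights : list of L weight matrices; layer l maps the concatenation of
              X and all previous layer outputs to a new hidden representation.
    Dense connections expose deeper-level structural information from the
    text graph to every subsequent layer.
    """
    outputs = [X]
    for W in weights:
        H = np.concatenate(outputs, axis=1)        # dense connectivity
        outputs.append(np.maximum(A @ H @ W, 0))   # graph conv + ReLU
    return np.concatenate(outputs, axis=1)         # all levels, concatenated

n, d, g = 5, 6, 4                                  # nodes, input dim, growth rate
A = np.eye(n)                                      # toy graph: self-loops only
X = np.random.default_rng(1).normal(size=(n, d))
Ws = [np.ones((d + l * g, g)) * 0.1 for l in range(3)]   # 3 dense layers
out = dcgcn_block(A, X, Ws)
print(out.shape)                                   # (5, 18) = d + 3*g per node
```

Each layer's weight matrix grows with the number of concatenated inputs (d + l*g columns in), which is why dense blocks are usually kept to a few layers, consistent with the layer-count study in the table below.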
Statistics about the datasets.
| Category | NYT Train | NYT Test | WebNLG Train | WebNLG Test |
| Normal | 37,013 | 3,266 | 1,596 | 246 |
| EPO | 9,782 | 978 | 227 | 26 |
| SEO | 14,735 | 1,297 | 3,406 | 457 |
| All | 56,195 | 5,000 | 5,019 | 703 |
| Relation types | 24 | | 246 | |
Results of comparison with mainstream methods on NYT and WebNLG datasets.
| Model | NYT Prec. | NYT Rec. | NYT F1 | WebNLG Prec. | WebNLG Rec. | WebNLG F1 |
| NovelTagging | 62.4 | 31.7 | 42.0 | 52.5 | 19.3 | 28.3 |
| CopyRe | 61.0 | 56.6 | 58.7 | 37.7 | 36.4 | 37.1 |
| GraphRel | 63.9 | 60.0 | 61.9 | 44.7 | 41.1 | 42.9 |
| CopyMTL | 75.7 | 68.7 | 72.0 | 58.0 | 54.9 | 56.4 |
| OrderRL | 77.9 | 67.2 | 72.1 | 66.3 | 59.9 | 61.6 |
| HRL | 78.1 | 77.1 | 77.6 | - | - | 28.6 |
| AntNRE | 80.2 | 53.5 | 64.2 | 80.4 | 45.4 | 58.0 |
| ImprovingGCN | 83.2 | 64.7 | 72.8 | 66.4 | 62.7 | 64.5 |
| Ours | 81.3 | 76.7 | 79.4 | - | - | 67.4 |
Bold marks the highest score among all models.
Ablation tests on the NYT dataset.
| Model | F1 |
| ALL | 79.4 |
| - char embedding | 78.1 |
| - context embedding | 77.8 |
| - BiGCN | 77.6 |
| - Multi-head Attention | 76.9 |
| - DCGCN | 78.2 |
F1 score for different numbers of GCN layers.
| Block | Layers | F1 |
| 1st-GCN | 1 | 76.8 |
| | 2 | |
| | 3 | 77.3 |
| 2nd-DCGCN | 2 | 79.1 |
| | 3 | |
| | 4 | 79.3 |
| 3rd-DCGCN | 3 | 79.3 |
Bold marks the optimal setting.
Figure 3. F1 score for different classes of overlapping triples. (A) F1 of normal triples. (B) F1 of EntityPairOverlap. (C) F1 of SingleEntityOverlap.
Figure 4. F1 score for different numbers of triples on the two datasets. The X-axis represents the number of triples in a sentence. (A) F1 for different numbers of triples on NYT. (B) F1 for different numbers of triples on WebNLG.