Literature DB entry: PMID 33286937
Min Zhang¹,², Guohua Geng¹, Sheng Zeng¹, Huaping Jia².
Abstract
Knowledge graph completion can make knowledge graphs more complete, which is a meaningful research topic. However, existing methods do not make full use of entity semantic information. Another challenge is that a deep model requires large-scale manually labelled data, which greatly increases manual labour. To alleviate the scarcity of labelled data in the field of cultural relics and to capture the rich semantic information of entities, this paper proposes a model based on Bidirectional Encoder Representations from Transformers (BERT) with entity-type information for the knowledge graph completion of Chinese texts on cultural relics. In this work, the knowledge graph completion task is treated as a classification task: the entities, relations and entity-type information are integrated into a textual sequence, and Chinese characters are used as the token unit, with the input representation constructed by summing the token, segment and position embeddings. A large amount of unlabelled data is used to pre-train the model, and then a small amount of labelled data is used to fine-tune the pre-trained model. The experimental results show that the BERT-KGC model with entity-type information enriches the semantic information of the entities, reduces the ambiguity of entities and relations to some degree, and achieves better performance than the baselines on the triple classification, link prediction and relation prediction tasks using 35% of the labelled data of cultural relics.
Keywords: Bidirectional Encoder Representations from Transformers (BERT); cultural relics; entity type; knowledge graph completion; link prediction
Year: 2020 PMID: 33286937 PMCID: PMC7597339 DOI: 10.3390/e22101168
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Figure 1. Overall fine-tuning procedures for BERT-KGC.
Figure 2. The input representation of BERT-KGC.
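As a rough illustration of the input construction the abstract and Figure 2 describe (a triple plus entity-type strings packed into one character-level sequence, whose representation sums the token, segment and position embeddings), the sketch below uses the HuggingFace `transformers` library with the `bert-base-chinese` checkpoint. The segment layout, helper name and example triple are illustrative assumptions, not details taken from the paper.

```python
# A minimal sketch of packing a (head, relation, tail) triple plus
# entity-type strings into a single BERT input sequence. The exact
# layout used by the paper is not reproduced here; this only shows
# the general mechanism.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")

def encode_triple(head, head_type, relation, tail, tail_type, max_len=64):
    # Entity-type information is appended to each entity mention, so the
    # model sees e.g. "青铜鼎 器物" rather than the bare entity name.
    text_a = f"{head} {head_type} {relation}"
    text_b = f"{tail} {tail_type}"          # second segment -> segment embedding B
    return tokenizer(
        text_a,
        text_b,
        padding="max_length",
        truncation=True,
        max_length=max_len,
        return_tensors="pt",
    )

# Hypothetical cultural-relics triple: (bronze ding, excavated at, Yinxu).
batch = encode_triple("青铜鼎", "器物", "出土于", "殷墟", "遗址")
print(batch["input_ids"].shape)             # torch.Size([1, 64])
```

Because `bert-base-chinese` tokenizes Chinese text character by character, this matches the paper's character-level token unit; the summed token, segment and position embeddings are produced inside the model's embedding layer.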
Summary statistics of CCR20.
| Dataset | # Rel | # Ent | # Train | # Dev | # Test |
|---|---|---|---|---|---|
| CCR20 | 16 | 34,877 | 69,642 | 2908 | 3069 |
Performance comparison of BERT-KGC and the baselines for triple classification.
| Method | P | R | F1 |
|---|---|---|---|
| TransE (Bordes et al. 2013) | 70.3 | 72.2 | 71.2 |
| TransH (Wang et al. 2014) | 72.3 | 70.4 | 71.3 |
| TransR (Lin et al. 2015b) | 75.5 | 77.6 | 76.5 |
| TEKE (Wang and Li 2016) | 75.8 | 78.3 | 77.1 |
| NTN (Socher et al. 2013) | 76.0 | 79.4 | 77.7 |
| TransD (Ji et al. 2015) | 76.7 | 80.9 | 78.7 |
| ConvE (Dettmers et al. 2018) | 77.3 | 81.6 | 79.4 |
| ConvKB (Nguyen et al. 2017) | 77.9 | 84.3 | 81.1 |
| CapsE (Nguyen et al. 2019) | 80.4 | 84.4 | 82.3 |
| TKRL (Xie et al. 2016) | 81.1 | 85.0 | 83.3 |
| KG-BERT(a) (Yao et al. 2019) | – | – | – |
| BERT-KGC | – | – | – |
Note: The results are for BERT-KGC and the baseline methods in triple classification, reported in percentages. The baseline results were obtained from the original papers. Bold denotes the best result, while the second-best score is underlined.
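Since the paper treats completion as a classification task, a minimal sketch of how such a triple classifier could be scored is shown below; the model class, checkpoint, example triple and the 0.5 decision threshold are illustrative assumptions, not the paper's exact setup.

```python
# A minimal sketch of "KGC as classification": the triple-as-text sequence
# is scored by a BERT sequence classifier whose two classes stand for
# implausible / plausible. The classification head is randomly initialised
# here and would be trained during fine-tuning.
import torch
from transformers import BertForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-chinese", num_labels=2
)
model.eval()

# Hypothetical triple with entity-type strings appended to each entity.
batch = tokenizer("青铜鼎 器物 出土于", "殷墟 遗址", return_tensors="pt")

with torch.no_grad():
    logits = model(**batch).logits          # shape: [1, 2]
    prob_true = torch.softmax(logits, dim=-1)[0, 1].item()

print("plausible" if prob_true > 0.5 else "implausible")
```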
Performance comparison of BERT-KGC and baselines for link prediction.
| Method | MR | Hits@10 |
|---|---|---|
| TransE (Bordes et al. 2013) | 2594 | 47.3 |
| TransH (Wang et al. 2014) | 2965 | 46.2 |
| TransR (Lin et al. 2015b) | 3272 | 48.4 |
| NTN (Socher et al. 2013) | 3315 | 47.2 |
| TransD (Ji et al. 2015) | 3303 | 47.3 |
| ConvE (Dettmers et al. 2018) | 3298 | 49.3 |
| ConvKB (Nguyen et al. 2017) | 2592 | 50.3 |
| CapsE (Nguyen et al. 2019) | 2945 | 51.3 |
| TKRL (Xie et al. 2016) | 2108 | 51.9 |
| KG-BERT(a) (Yao et al. 2019) | – | – |
| BERT-KGC | – | – |
Note: MR denotes the mean rank of the correct entities (lower is better). Hits@10 is the proportion of correct entities ranked in the top 10, reported in %. Bold denotes the best result, while the second-best score is underlined.
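For reference, the two link-prediction metrics in this table can be computed from the 1-based rank that each correct entity receives among all scored candidates; this is a generic sketch of the standard definitions, not code from the paper.

```python
# MR and Hits@10 from a list of 1-based ranks, one per test triple:
# for each triple, every candidate entity is scored, candidates are
# sorted, and the rank of the correct entity is recorded.
def mr_and_hits_at_10(ranks):
    """ranks: list of 1-based ranks of the correct entity per test triple."""
    mr = sum(ranks) / len(ranks)                               # mean rank (lower is better)
    hits10 = 100.0 * sum(r <= 10 for r in ranks) / len(ranks)  # in %
    return mr, hits10

# Example: three test triples whose correct entities ranked 1st, 7th, 42nd
# give MR = 50/3 ≈ 16.7 and Hits@10 = 2/3 ≈ 66.7%.
print(mr_and_hits_at_10([1, 7, 42]))
```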
Performance comparison of BERT-KGC and the baselines for relation prediction.
| Method | MR | Hits@10 |
|---|---|---|
| TransE (Bordes et al. 2013) | 379 | 78.2 |
| TransH (Wang et al. 2014) | 370 | 79.4 |
| TransR (Lin et al. 2015b) | 369 | 78.9 |
| NTN (Socher et al. 2013) | 387 | 77.6 |
| TransD (Ji et al. 2015) | 332 | 79.1 |
| ConvE (Dettmers et al. 2018) | 259 | 79.4 |
| ConvKB (Nguyen et al. 2017) | 264 | 80.6 |
| CapsE (Nguyen et al. 2019) | 260 | 81.8 |
| TKRL (Xie et al. 2016) | 246 | 80.2 |
| KG-BERT(b) (Yao et al. 2019) | – | – |
| BERT-KGC | – | – |
Note: Bold denotes the best result, while the second-best score is underlined.
Figure 3. The influence of the training data proportion on triple classification (%). Panels (a–c) show the precision, recall and F1-score of triple classification (%), respectively.