| Literature DB >> 34065740 |
Chonghao Chen, Jianming Zheng, Honghui Chen.
Abstract
Fact verification aims to verify the authenticity of a given claim based on the retrieved evidence from Wikipedia articles. Existing works mainly focus on enhancing the semantic representation of evidence, e.g., introducing the graph structure to model the evidence relation. However, previous methods can't well distinguish semantic-similar claims and evidences with distinct authenticity labels. In addition, the performances of graph-based models are limited by the over-smoothing problem of graph neural networks. To this end, we propose a graph-based contrastive learning method for fact verification abbreviated as CosG, which introduces a contrastive label-supervised task to help the encoder learn the discriminative representations for different-label claim-evidence pairs, as well as an unsupervised graph-contrast task, to alleviate the unique node features loss in the graph propagation. We conduct experiments on FEVER, a large benchmark dataset for fact verification. Experimental results show the superiority of our proposal against comparable baselines, especially for the claims that need multiple-evidences to verify. In addition, CosG presents better model robustness on the low-resource scenario.Entities:
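The label-supervised contrastive task described in the abstract can be illustrated with a supervised contrastive (SupCon-style) loss that pulls together representations sharing a veracity label and pushes apart those with different labels. The sketch below is an illustrative assumption, not the paper's exact formulation; the function name and temperature value are hypothetical:

```python
import numpy as np

def supervised_contrastive_loss(embeddings, labels, temperature=0.5):
    """Sketch of a label-supervised contrastive loss: for each anchor,
    same-label samples are treated as positives, all other samples as
    negatives (general SupCon form, not CosG's exact loss)."""
    # L2-normalize so the dot product is cosine similarity
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(labels)
    mask_self = ~np.eye(n, dtype=bool)          # exclude i == i pairs
    exp_sim = np.exp(sim) * mask_self
    log_prob = sim - np.log(exp_sim.sum(axis=1, keepdims=True))
    labels = np.asarray(labels)
    pos_mask = (labels[:, None] == labels[None, :]) & mask_self
    pos_counts = pos_mask.sum(axis=1)
    keep = pos_counts > 0                       # anchors with >= 1 positive
    # mean negative log-probability over each anchor's positives
    loss = -(log_prob * pos_mask).sum(axis=1)[keep] / pos_counts[keep]
    return loss.mean()
```

Under such a loss, embeddings that cluster by label yield a lower value than the same embeddings with shuffled labels, which is the discriminative effect the abstract aims for.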
Keywords: contrastive learning; entity graph; fact verification; graph neural network
Year: 2021 PMID: 34065740 PMCID: PMC8156189 DOI: 10.3390/s21103471
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Examples of fact verification. The key evidence sentences used to verify the claims are highlighted, and the bold tokens denote the key entities in the cases. [Docname, linenum] indicates that the evidence is extracted from line “linenum” of article “Docname”.
Figure 2. Overview of the CosG model.
Figure 3. Detailed structure of the CosG model.
Comparison of graph-based semantic representation learning methods. Our model mainly adopts the co-occurrence method to build the entity graph. In addition, we introduce two types of contrastive tasks to further learn discriminative representations of samples.
| Method | Advantage | Disadvantage |
|---|---|---|
| Full-connect-based methods [ | This method can fully explore the relations of non-consecutive semantic units in the text. | This method easily introduces considerable noise during graph feature aggregation. |
| Dependency-structure-based methods [ | This method can precisely capture long-range relations between semantic units through their dependencies. | This method is computationally inefficient owing to the complex dependency tree structure and the different types of edges. |
| Co-occurrence-based methods [ | This method can well extract the relations of related semantic units while reducing both the noise from irrelevant nodes and the computing overhead of handling different kinds of edges. | This method may lead to the loss of some potential semantic relations. |
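The co-occurrence construction favored in the table above can be sketched as follows: entities become nodes, and an undirected edge links two entities whenever they appear in the same evidence sentence. This is an illustrative sketch under that assumption; the paper's exact graph construction may differ, and the function name is hypothetical:

```python
from itertools import combinations

def build_cooccurrence_graph(sentences_entities):
    """Build an undirected entity graph from per-sentence entity lists.

    sentences_entities: list of lists, each inner list holding the
    entities mentioned in one evidence sentence. Two entities are
    connected iff they co-occur in at least one sentence.
    """
    nodes = set()
    edges = set()
    for entities in sentences_entities:
        nodes.update(entities)
        # every pair of distinct entities in the sentence gets an edge;
        # sorting makes the undirected edge representation canonical
        for a, b in combinations(sorted(set(entities)), 2):
            edges.add((a, b))
    return sorted(nodes), sorted(edges)
```

Note that entities from different sentences stay unconnected unless they co-occur somewhere, which is how this construction avoids the noise that a fully connected graph would introduce.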
Statistics of FEVER.
| Split | SUPPORTED | REFUTED | NEI |
|---|---|---|---|
| Training | 80,035 | 29,775 | 35,659 |
| Dev | 6666 | 6666 | 6666 |
| Test | 6666 | 6666 | 6666 |
Overall performance (%) on the development (dev) set and the blind test set. The results produced by the best baseline and the best performer in each column are underlined and boldfaced, respectively.
| Model | Dev LA | Dev FEVER Score | Test LA | Test FEVER Score |
|---|---|---|---|---|
| ColumbiaNLP | 58.77 | 50.83 | 57.45 | 49.06 |
| QED | 44.70 | 43.90 | 50.12 | 43.42 |
| Athene | 68.49 | 64.74 | 65.46 | 61.58 |
| UCL MRG | 69.66 | 65.41 | 67.62 | 62.52 |
| UNC NLP | 69.72 | 66.49 | 68.21 | 64.21 |
| BERT Concat | 73.67 | 68.89 | 71.01 | 65.64 |
| BERT Pair | 73.30 | 68.90 | 69.75 | 65.18 |
| SR-MRS | 75.12 | 70.18 | — | 67.26 |
| GEAR | 74.84 | 70.69 | 71.60 | 67.10 |
| RoEG | — | 71.47 | — | — |
| CosG | — | — | — | — |
Computational complexity and efficiency. The training and dev times of SR-MRS are each set to 1 unit; the table then reports the relative time cost of each model against SR-MRS.
| Method | Complexity | Training Time | Dev Time |
|---|---|---|---|
| UNC NLP | — | 1.15 | 1.37 |
| SR-MRS | — | 1.00 | 1.00 |
| GEAR | — | 1.89 | 1.32 |
| CosG | — | 1.12 | 1.31 |
Results of an ablation study on the development set (%).
| Model | LA | FEVER Score |
|---|---|---|
| CosG (2-layer) | 76.95 | 74.12 |
| — | 76.24 | 72.94 |
| — | 75.72 | 72.76 |
| — | 76.37 | 73.19 |
| — | 75.82 | 72.90 |
| — | 75.01 | 72.14 |
Figure 4. Performance on the easy and difficult development sets (%).
Figure 5. Performance on the development set where models are trained with different numbers of samples (%).