Heng-Yang Lu, Chenyou Fan, Xiaoning Song, Wei Fang.
Abstract
BACKGROUND: Rumor detection is a popular research topic in natural language processing and data mining. Since the outbreak of COVID-19, related rumors have been widely posted and spread on online social media, seriously affecting people's daily lives, the national economy, and social stability. It is therefore both theoretically and practically essential to detect and refute COVID-19 rumors quickly and effectively. Because COVID-19 was an emergent event that escalated rapidly, related rumor instances were scarce and distinct in its early stage, which makes the detection task a typical few-shot learning problem. However, traditional rumor detection techniques focus on existing events with enough training instances, so they fail on emergent events such as COVID-19. Developing a new few-shot rumor detection framework has therefore become critical and urgent for stopping rumor outbreaks at an early stage.
Keywords: COVID-19; Few-shot learning; Multi-modality; Rumor detection; Social media
Year: 2021 PMID: 34497874 PMCID: PMC8384041 DOI: 10.7717/peerj-cs.688
Source DB: PubMed Journal: PeerJ Comput Sci ISSN: 2376-5992
Figure 1. Example of the Sina Weibo page, which contains a rumor microblog.
Figure 2. Workflow of the rumor judgement by the official Sina Weibo community management center.
Statistics of events for the COVID-19 rumor dataset after removing duplicates.
| Group | Event | Rumor (Modality 1) | Rumor (Modality 2) | Non-rumor (Modality 1) | Non-rumor (Modality 2) |
|---|---|---|---|---|---|
| COVID-19 irrelevant | MH370 | 134 | 239 | 263 | 156 |
| COVID-19 irrelevant | College entrance exams | 591 | 945 | 148 | 153 |
| COVID-19 irrelevant | Olympics | 82 | 199 | 174 | 149 |
| COVID-19 irrelevant | Urban managers | 150 | 305 | 95 | 71 |
| COVID-19 irrelevant | Cola | 420 | 407 | 216 | 285 |
| COVID-19 irrelevant | Child trafficking | 173 | 258 | 95 | 54 |
| COVID-19 irrelevant | Waste oil | 58 | 91 | 134 | 122 |
| COVID-19 irrelevant | Accident | 83 | 168 | 101 | 77 |
| COVID-19 irrelevant | Earthquake | 59 | 133 | 118 | 85 |
| COVID-19 irrelevant | Typhoon | 65 | 163 | 108 | 90 |
| COVID-19 irrelevant | Rabies | 43 | 77 | 102 | 70 |
| COVID-19 relevant | Lockdown the city | 25 | 59 | 87 | 89 |
| COVID-19 relevant | Zhong Nanshan | 22 | 48 | 56 | 44 |
| COVID-19 relevant | Wuhan | 70 | 161 | 168 | 154 |
| Total | | 1,975 | 3,253 | 1,865 | 1,599 |
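As a quick consistency check, the Total row can be recomputed from the per-event rows. The sketch below transcribes the counts from the table; the names `EVENT_COUNTS`, `REPORTED_TOTALS`, and `column_totals` are illustrative and do not come from the paper.

```python
# Per-event counts transcribed from the table above, in column order:
# (rumor modality 1, rumor modality 2, non-rumor modality 1, non-rumor modality 2).
# All names here are hypothetical helpers for this check only.
EVENT_COUNTS = {
    "MH370": (134, 239, 263, 156),
    "College entrance exams": (591, 945, 148, 153),
    "Olympics": (82, 199, 174, 149),
    "Urban managers": (150, 305, 95, 71),
    "Cola": (420, 407, 216, 285),
    "Child trafficking": (173, 258, 95, 54),
    "Waste oil": (58, 91, 134, 122),
    "Accident": (83, 168, 101, 77),
    "Earthquake": (59, 133, 118, 85),
    "Typhoon": (65, 163, 108, 90),
    "Rabies": (43, 77, 102, 70),
    "Lockdown the city": (25, 59, 87, 89),
    "Zhong Nanshan": (22, 48, 56, 44),
    "Wuhan": (70, 161, 168, 154),
}

# Totals as printed in the last row of the table.
REPORTED_TOTALS = (1975, 3253, 1865, 1599)


def column_totals(counts):
    """Sum each of the four count columns across all events."""
    return tuple(sum(column) for column in zip(*counts.values()))


if __name__ == "__main__":
    totals = column_totals(EVENT_COUNTS)
    print("computed:", totals)
    print("matches reported totals:", totals == REPORTED_TOTALS)
```

The four column sums agree with the printed totals, which suggests the column alignment of the reconstructed rows is correct.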
Figure 3. Workflow of COMFUSE.
Figure 4. Illustrations of word embeddings with BERT.
Figure 5. Structure of Bi-GRUs.
Figure 6. Workflow of one meta-learning iteration.
Statistics of instances in both datasets for experiments.
| Split | Weibo: Training set | Weibo: Validation set | Weibo: Test set | PHEME: Training set | PHEME: Validation set | PHEME: Test set |
|---|---|---|---|---|---|---|
| split 0 | 1,676 | 1,736 | 428 | 776 | 889 | 877 |
| split 1 | 2,429 | 983 | 428 | 889 | 776 | |
| split 2 | 2,719 | 693 | 428 | 995 | 889 | |
| total number of instances | 3,840 | | | 2,207 | | |
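The Weibo columns can be sanity-checked against the stated total. The sketch below assumes that the Weibo test set of 428 instances, which the table lists only once, is shared by all three splits; that reading is an inference from the table layout, not something this page states. All names are illustrative.

```python
# Weibo (training, validation) sizes per split, transcribed from the table.
WEIBO_TRAIN_VAL = {
    "split 0": (1676, 1736),
    "split 1": (2429, 983),
    "split 2": (2719, 693),
}

# Assumption: the single listed test-set size applies to every split.
WEIBO_TEST = 428

# Total number of Weibo instances as printed in the table.
WEIBO_TOTAL = 3840


def split_sizes(train_val, test_size):
    """Return train + validation + test for each split."""
    return {name: train + val + test_size for name, (train, val) in train_val.items()}


if __name__ == "__main__":
    for name, size in split_sizes(WEIBO_TRAIN_VAL, WEIBO_TEST).items():
        print(name, size, size == WEIBO_TOTAL)
```

Under that assumption every split sums to 3,840. The PHEME columns are not checked here because their layout in this copy of the table is ambiguous.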
Figure 7. (A-G) Statistics of length per text of the Weibo dataset.
Figure 8. (A-G) Statistics of length per text of the PHEME dataset.
Figure 9. Experimental results for different pad sizes of source posts, with the pad size of comments fixed at 32, on the Weibo dataset.
Figure 10. Experimental results for different pad sizes of comments, with the pad size of source posts fixed at 100, on the Weibo dataset.
Classification accuracy on the Weibo dataset for COVID-19 rumor detection.
| Method | split 0 | split 1 | split 2 | average |
|---|---|---|---|---|
| DT-EMB | 57.51% | 56.82% | 56.47% | 56.93% |
| SEQ-CNNs | 66.76% | 66.89% | 68.48% | 67.38% |
| SEQ-Bi-GRUs | 65.89% | 71.09% | 69.81% | 68.93% |
| GAN-GRU-early | 60.74% | 47.09% | 51.45% | 53.00% |
| BiGCN-early | 71.88% | 63.46% | 63.50% | 66.28% |
| COMFUSE-post-only | 73.48% | 75.22% | 71.17% | 73.29% |
| COMFUSE-com-only | 70.44% | 74.59% | 74.11% | 73.17% |
| COMFUSE | | | | |
Classification accuracy on the PHEME dataset for rumor detection on the latest events.
| Method | split 0 | split 1 | split 2 | average |
|---|---|---|---|---|
| DT-EMB | 56.92% | 56.67% | 56.85% | 56.81% |
| SEQ-CNNs | 63.06% | 61.42% | 63.67% | 62.72% |
| SEQ-Bi-GRUs | 63.25% | 60.83% | 62.97% | 62.35% |
| GAN-GRU-early | 53.47% | 50.43% | 56.03% | 53.31% |
| BiGCN-early | 67.94% | 57.94% | 59.08% | 61.65% |
| COMFUSE-post-only | 63.56% | 65.58% | 64.17% | 64.44% |
| COMFUSE-com-only | 58.36% | 60.47% | 57.39% | 58.74% |
| COMFUSE | | | | |
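The "average" column in the PHEME table appears to be the arithmetic mean of the three split accuracies, rounded to two decimals. A minimal sketch of that check follows, using the values transcribed from the table; `PHEME_ACCURACY` and `mean_accuracy` are hypothetical names, and the COMFUSE row is omitted because its values are missing from this copy.

```python
# (split 0, split 1, split 2) accuracies in percent and the reported average,
# transcribed from the PHEME table. The COMFUSE row is left out because its
# values are not present in this copy of the table.
PHEME_ACCURACY = {
    "DT-EMB": ((56.92, 56.67, 56.85), 56.81),
    "SEQ-CNNs": ((63.06, 61.42, 63.67), 62.72),
    "SEQ-Bi-GRUs": ((63.25, 60.83, 62.97), 62.35),
    "GAN-GRU-early": ((53.47, 50.43, 56.03), 53.31),
    "BiGCN-early": ((67.94, 57.94, 59.08), 61.65),
    "COMFUSE-post-only": ((63.56, 65.58, 64.17), 64.44),
    "COMFUSE-com-only": ((58.36, 60.47, 57.39), 58.74),
}


def mean_accuracy(values):
    """Arithmetic mean of the per-split accuracies, rounded to two decimals."""
    return round(sum(values) / len(values), 2)


if __name__ == "__main__":
    for method, (splits, reported) in PHEME_ACCURACY.items():
        computed = mean_accuracy(splits)
        agrees = abs(computed - reported) < 1e-9
        print(f"{method}: computed {computed:.2f}, reported {reported:.2f}, agrees: {agrees}")
```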