| Literature DB >> 35308294 |
Wang Gao¹, Lin Li², Xiaohui Tao³, Jing Zhou¹, Jun Tao¹.
Abstract
Every epidemic affects the lives of many people around the world and can lead to terrible consequences. Recently, many tweets about the COVID-19 pandemic have been shared publicly on social media platforms. Analyzing these tweets helps emergency response organizations prioritize their tasks and make better decisions. However, most of these tweets are non-informative, which makes it challenging to build an automated system for detecting useful information on social media. Furthermore, existing methods ignore unlabeled data and topic background knowledge, both of which can provide additional semantic information. In this paper, we propose a novel Topic-Aware BERT (TABERT) model to address these challenges. TABERT first leverages a topic model to extract the latent topics of tweets. Second, a flexible framework combines the topic information with the output of BERT. Finally, we adopt adversarial training to achieve semi-supervised learning, so that a large amount of unlabeled data can be used to improve the model's internal representations. Experimental results on a dataset of COVID-19 English tweets show that our model outperforms classic and state-of-the-art baselines.
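The abstract describes combining topic information with the output of BERT through a flexible framework. A minimal sketch of one plausible fusion scheme follows; the dimensions, the ReLU projection, and fusion by concatenation are assumptions for illustration, not the paper's exact design:

```python
import numpy as np


def fuse_topic_and_bert(cls_embedding, topic_dist, w_topic, w_out, b_out):
    """Hypothetical fusion of a topic-model distribution with a BERT
    sentence embedding, followed by a linear classification head.

    cls_embedding: (batch, hidden) sentence vectors, e.g. BERT's [CLS] output
    topic_dist:    (batch, num_topics) per-tweet topic proportions, e.g. from LDA
    w_topic:       (num_topics, hidden) projection of topics into embedding space
    w_out, b_out:  classifier weights over the concatenated features
    """
    # Project the topic distribution into the embedding space (ReLU nonlinearity).
    topic_feat = np.maximum(topic_dist @ w_topic, 0.0)
    # Concatenate the two views of the tweet and classify.
    fused = np.concatenate([cls_embedding, topic_feat], axis=-1)
    return fused @ w_out + b_out
```

In a real model these weights would be trained jointly with BERT; the sketch only shows the shape of the fusion step.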
Keywords: Adversarial training; Informative tweet identification; Social media; Topic model
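The abstract also mentions adversarial training as the mechanism for exploiting unlabeled data. The core step in FGM-style adversarial training is perturbing inputs (or embeddings) along the loss gradient, scaled to a fixed norm. The sketch below computes that perturbation for a simple logistic model, where the gradient has a closed form; the model, `epsilon`, and the binary setup are illustrative assumptions, not the paper's architecture:

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fgm_perturbation(x, y, w, epsilon=0.1):
    """FGM-style adversarial perturbation for a logistic model p = sigmoid(x @ w).

    For binary cross-entropy, the gradient of the loss w.r.t. the input x
    is (p - y) * w, so the perturbation can be computed analytically here.
    """
    p = sigmoid(x @ w)                      # predicted probabilities, shape (batch,)
    grad = np.outer(p - y, w)               # dL/dx, shape (batch, dim)
    norm = np.linalg.norm(grad, axis=-1, keepdims=True) + 1e-12
    return epsilon * grad / norm            # L2-normalized step of size epsilon
```

Training then minimizes the loss on both the clean input `x` and the perturbed input `x + fgm_perturbation(x, y, w)`, which smooths the model's representations; with deep models the gradient comes from autograd rather than a closed form.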
Year: 2022 PMID: 35308294 PMCID: PMC8924578 DOI: 10.1007/s11280-022-01034-1
Source DB: PubMed Journal: World Wide Web ISSN: 1386-145X Impact factor: 2.716
Fig. 1: Architecture of TABERT combining topic information and BERT
Fig. 2: Architecture of TABERT with adversarial training
Fig. 3: Experimental results of seven methods
Fig. 4: Experimental results of different topic models
Fig. 5: F1 scores for different ratios of annotated data
Fig. 6: Number of tweets for five actionable information types
Table: Performance of TABERT and BERT in actionable information mining
| Model | Precision | Recall | F1 |
|---|---|---|---|
| BERT | 0.6114 | 0.6037 | 0.5976 |
| TABERT | 0.6530 | 0.6306 | 0.6373 |