Jyun-Yu Jiang, Yichao Zhou, Xiusi Chen, Yan-Ru Jhou, Liqi Zhao, Sabrina Liu, Po-Chun Yang, Jule Ahmar, Wei Wang.
Abstract
The outbreak of the novel coronavirus, COVID-19, has become one of the most severe pandemics in human history. In this paper, we propose to leverage social media users as social sensors to simultaneously predict pandemic trends and suggest potential risk factors, so that public health experts can understand how the disease is spreading and recommend proper interventions. More precisely, we develop novel deep learning models to recognize important entities and their relations over time, thereby establishing dynamic heterogeneous graphs that describe the observations of social media users. A dynamic graph neural network model can then forecast the trends (e.g. newly diagnosed cases and death rates) and identify high-risk events from social media. Based on the proposed computational method, we also develop a web-based system that domain experts without any computer science background can easily interact with. We conduct extensive experiments on large-scale datasets of COVID-19-related tweets provided by Twitter, which show that our method can accurately predict new cases and death rates. We also demonstrate the robustness of our web-based pandemic surveillance system and its ability to retrieve essential knowledge and derive accurate predictions across a variety of circumstances. Our system is available at http://scaiweb.cs.ucla.edu/covidsurveiller/. This article is part of the theme issue 'Data science approaches to infectious disease surveillance'.
Keywords: knowledge graph; natural language processing; pandemic surveillance; social media mining
Year: 2021 PMID: 34802278 PMCID: PMC8607148 DOI: 10.1098/rsta.2021.0125
Source DB: PubMed Journal: Philos Trans A Math Phys Eng Sci ISSN: 1364-503X Impact factor: 4.226
Figure 1. Illustration of our proposed framework for pandemic surveillance, COVID-19 Surveiller.
Figure 2. The pipeline of constructing activity nodes from free-text data. (Online version in colour.)
Figure 3. Time series prediction model. (Online version in colour.)
Performance of the short-term (1 day and 7 days ahead) and long-term (14 days and 28 days ahead) forecasts of new confirmed case numbers and fatalities. All the improvements of our method over the baseline methods are statistically significant at a 99% confidence level in paired t-tests. Our method achieves 5.6%, 9.5%, 9.4% and 5.6% lower MAE than the best baseline, MPNN + LSTM, when forecasting the new confirmed case numbers for 1, 7, 14 and 28 days ahead, respectively.
| method | confirmed case, 1 day ahead | confirmed case, 7 days ahead | confirmed case, 14 days ahead | confirmed case, 28 days ahead | fatality, 1 day ahead | fatality, 7 days ahead | fatality, 14 days ahead | fatality, 28 days ahead |
|---|---|---|---|---|---|---|---|---|
| | — | 768.43 | 978.53 | 2472.09 | — | 15.49 | 18.59 | 26.18 |
| | — | 755.36 | 1099.76 | 1591.01 | — | 14.24 | 15.60 | 19.06 |
| | — | 1123.72 | 1253.14 | 1534.64 | — | 18.91 | 19.85 | 24.36 |
| | 604.18 | 802.98 | 961.30 | 1300.49 | 19.32 | 21.91 | 24.47 | 29.20 |
| | 791.07 | 991.05 | 1341.80 | 2019.24 | 16.59 | 18.65 | 22.22 | 31.77 |
| | 1262.33 | 1248.08 | 1235.20 | 1204.19 | 18.04 | 17.94 | 17.77 | 17.74 |
| | 485.52 | 567.74 | 825.41 | 1304.11 | 12.13 | 12.90 | 14.87 | 19.73 |
| MPNN + LSTM | 455.68 | 523.77 | 672.05 | 967.12 | 12.17 | 12.79 | 14.57 | 20.01 |
| ours | 430.01 | 474.16 | 608.98 | 913.20 | 11.78 | 11.85 | 13.24 | 18.26 |
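The relative MAE reductions quoted in the table caption can be checked directly against the confirmed-case columns. The sketch below assumes that the row directly above "ours" is the best baseline named in the caption (MPNN + LSTM) and that the table values are MAE; the constant and function names are illustrative only and do not come from the paper.

```python
# MAE values copied from the table: new confirmed cases at 1, 7, 14 and 28 days ahead.
HORIZONS = (1, 7, 14, 28)
# Assumption: the row directly above "ours" is the best baseline (MPNN + LSTM).
BASELINE_MAE = (455.68, 523.77, 672.05, 967.12)
OURS_MAE = (430.01, 474.16, 608.98, 913.20)


def relative_reduction(baseline: float, ours: float) -> float:
    """Return the percentage by which `ours` is lower than `baseline`."""
    return 100.0 * (baseline - ours) / baseline


# One rounded percentage per forecast horizon.
reductions = [
    round(relative_reduction(b, o), 1)
    for b, o in zip(BASELINE_MAE, OURS_MAE)
]

for horizon, pct in zip(HORIZONS, reductions):
    print(f"{horizon:>2} days ahead: {pct:.1f}% lower MAE")
```

Under these assumptions the computed reductions come out at 5.6%, 9.5%, 9.4% and 5.6%, matching the figures reported in the caption.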