Mining Social Media Data to Predict COVID-19 Case Counts.

Maksims Kazijevs, Furkan A Akyelken, Manar D Samad.

Abstract

The unpredictability and unknowns surrounding the ongoing coronavirus disease (COVID-19) pandemic have had unprecedented consequences, taking a heavy toll on the lives and economies of all countries. There have been efforts to predict COVID-19 case counts (CCC) using epidemiological data and numerical tokens online, which may allow early preventive measures to slow the spread of the disease. In this paper, we use state-of-the-art natural language processing (NLP) algorithms to numerically encode COVID-19-related tweets originating from eight cities in the United States and predict city-specific CCC up to eight days in the future. A city-embedding is proposed to obtain a time series representation of daily tweets posted from a city, which is then used to predict case counts using a custom long short-term memory (LSTM) model. The universal sentence encoder yields the best normalized root mean squared error (NRMSE), 0.090 (0.039), averaged across all cities in predicting CCC six days in the future. The R² scores in predicting CCC are above 0.70 and often above 0.80, which suggests a strong correlation between the actual and model-predicted CCC values. Our analyses show that the NRMSE and R² scores are consistently robust across different cities and different numbers of time steps in the time series data. The results show that the LSTM model can learn the mapping between the NLP-encoded tweet semantics and the case counts, which implies that social media text can be directly mined to identify the future course of the pandemic.
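The data flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how tweets are pooled into a city-embedding or how the NRMSE is normalized, so mean pooling and range normalization are assumptions made here, and the function names `city_embedding`, `make_windows`, `nrmse`, and `r2_score` are hypothetical. The sentence encoder and the LSTM itself are left out; the sketch only shows how daily vectors might be windowed and how the two reported metrics are commonly computed.

```python
import numpy as np


def city_embedding(tweet_vectors):
    """Collapse one day's tweet vectors, shape (n_tweets, dim), into one vector.

    Mean pooling is an assumption; the abstract does not state the pooling rule.
    """
    return np.asarray(tweet_vectors, dtype=float).mean(axis=0)


def make_windows(daily_vectors, case_counts, time_steps, horizon):
    """Pair each window of `time_steps` consecutive days with the case count
    observed `horizon` days after the last day of that window."""
    daily_vectors = np.asarray(daily_vectors, dtype=float)
    case_counts = np.asarray(case_counts, dtype=float)
    last_start = len(daily_vectors) - time_steps - horizon
    if last_start < 0:
        raise ValueError("series too short for this time_steps and horizon")
    X, y = [], []
    for start in range(last_start + 1):
        end = start + time_steps  # exclusive end of the input window
        X.append(daily_vectors[start:end])
        y.append(case_counts[end - 1 + horizon])
    return np.stack(X), np.array(y)


def nrmse(y_true, y_pred):
    """RMSE divided by the range of the true values (assumed convention)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return rmse / (y_true.max() - y_true.min())


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```

With ten days of data, three time steps, and a six-day horizon, `make_windows` returns two samples shaped (samples, time_steps, dim), which is the layout recurrent layers typically take as input.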

Keywords:  LSTM; Twitter; natural language processing; pandemic prediction; social media

Year:  2022        PMID: 36148026      PMCID: PMC9490453          DOI: 10.1109/ichi54592.2022.00027

Source DB:  PubMed          Journal:  IEEE Int Conf Healthc Inform        ISSN: 2575-2626


  5 in total

1.  Social media in Ebola outbreak. (Review)

Authors:  L Hossain; D Kam; F Kong; R T Wigand; T Bossomaier
Journal:  Epidemiol Infect       Date:  2016-03-04       Impact factor: 4.434

2.  A performance comparison of supervised machine learning models for Covid-19 tweets sentiment analysis.

Authors:  Furqan Rustam; Madiha Khalid; Waqar Aslam; Vaibhav Rupapara; Arif Mehmood; Gyu Sang Choi
Journal:  PLoS One       Date:  2021-02-25       Impact factor: 3.240

3.  Data Mining and Content Analysis of the Chinese Social Media Platform Weibo During the Early COVID-19 Outbreak: Retrospective Observational Infoveillance Study.

Authors:  Jiawei Li; Qing Xu; Raphael Cuomo; Vidya Purushothaman; Tim Mackey
Journal:  JMIR Public Health Surveill       Date:  2020-04-21

4.  An interactive web-based dashboard to track COVID-19 in real time.

Authors:  Ensheng Dong; Hongru Du; Lauren Gardner
Journal:  Lancet Infect Dis       Date:  2020-02-19       Impact factor: 25.071

5.  Public discourse and sentiment during the COVID 19 pandemic: Using Latent Dirichlet Allocation for topic modeling on Twitter.

Authors:  Jia Xue; Junxiang Chen; Chen Chen; Chengda Zheng; Sijia Li; Tingshao Zhu
Journal:  PLoS One       Date:  2020-09-25       Impact factor: 3.240
