Abir El Azzaoui, Sushil Kumar Singh, Jong Hyuk Park.
Abstract
The world is experiencing a pandemic caused by the spread of COVID-19, a novel coronavirus disease. Infection rates and deaths are increasing rapidly. At the same time, people no longer rely on traditional news channels to stay informed about the epidemic. Instead, citizens of smart cities increasingly rely on Social Network Services (SNS) to follow the latest news and information about the outbreak, share their opinions, and describe their feelings and symptoms. In this paper, we propose an SNS Big Data Analysis Framework for COVID-19 Outbreak Prediction in a Smart Sustainable Healthy City, using Twitter as the platform. Over 10,000 tweets were collected over two months. Of the users, 38% are aged 18 to 29 and 26% are aged 30 to 49; 56% are male and 44% are female. The tweets originate from the USA and are written in English. Natural Language Processing (NLP) is used to filter the tweets. The results show an outbreak cluster predicted seven days earlier than the confirmed cases, with an indicator of 0.989. Analyzing data from SNS platforms makes it possible to predict future outbreaks several days earlier and supports a scientifically grounded reduction of the infection rate in a smart sustainable healthy city environment.
Keywords: Big Data Analysis; COVID-19; NLP; SNS; Smart Healthy City
Year: 2021 PMID: 33996386 PMCID: PMC8103782 DOI: 10.1016/j.scs.2021.102993
Source DB: PubMed Journal: Sustain Cities Soc ISSN: 2210-6707 Impact factor: 7.587
Fig. 1. Coronavirus hashtag’s report on Twitter.
Related work comparison
| Research work | Year | | Other Platform | Method/Software/Hardware | SNS Monitoring | Sentiment Detection | Fake News Detection | Limitations |
|---|---|---|---|---|---|---|---|---|
| Yoo et al. | 2016 | Yes | No | Structural equation modeling | Yes | No | No | Cannot make causal inferences regarding the relationships among the key variables. |
| Shahi et al. | 2020 | Yes | No | NLP, | No | No | Yes | Human-annotated categories for only three languages. |
| Shahi et al. | 2020 | Yes | No | Python | No | No | Yes | The dataset excludes less viral misinformation. |
| Massaad et al. | 2020 | Yes | No | Google Colab | No | Yes | No | The data were collected in only one country. |
| Pui et al. | 2020 | Yes | No | Google Colab | No | Yes | No | The results were not developed. |
| Htet et al. | 2018 | Yes | No | HBase | No | Yes | No | Hadoop MR, as used, has some drawbacks in the transaction between input and output. |
| Barbosa et al. | 2010 | Yes | No | Support vector machines | No | Yes | No | Sentences with antagonistic sentiments cannot be analyzed. |
| Proposed framework | 2020 | Yes | No | Google Colab | Yes | Yes | Yes | Tested on only one SNS platform. |
Fig. 2. SNS Big Data Analysis Framework Overview.
Fig. 3. Process Flow of the Proposed Framework.
Subjectivity and Polarity Metric
| | Subjectivity | Subjectivity | Polarity | Polarity |
|---|---|---|---|---|
| Scale | 0 | 1 | −1 | 1 |
| Explanation | Objective | Subjective | Negative | Positive |
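The two scales in the table can be illustrated with a small lexicon-based scorer. The following is a minimal sketch only: the word list, its scores, the averaging rule, and the 0.5 subjectivity threshold are all hypothetical choices made for illustration, and the source does not state that the framework computes its scores this way.

```python
import re

# Hypothetical toy lexicon (an assumption, not taken from the paper):
# word -> (polarity in [-1, 1], subjectivity in [0, 1]).
TOY_LEXICON = {
    "good": (0.7, 0.6),
    "great": (0.8, 0.75),
    "bad": (-0.7, 0.67),
    "terrible": (-1.0, 1.0),
    "sick": (-0.7, 0.86),
    "worried": (-0.5, 0.9),
}


def score_text(text):
    """Return (polarity, subjectivity) averaged over lexicon words in text."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    if not hits:
        # No opinion words found: treat the text as neutral and objective.
        return 0.0, 0.0
    polarity = sum(p for p, _ in hits) / len(hits)
    subjectivity = sum(s for _, s in hits) / len(hits)
    return polarity, subjectivity


def label(polarity, subjectivity):
    """Map scores to the table's end points (0.5 threshold is an assumption)."""
    if polarity > 0:
        tone = "positive"
    elif polarity < 0:
        tone = "negative"
    else:
        tone = "neutral"
    stance = "subjective" if subjectivity >= 0.5 else "objective"
    return tone, stance


if __name__ == "__main__":
    for tweet in ["I feel sick and worried", "Cases reported on Monday"]:
        pol, sub = score_text(tweet)
        print(tweet, "->", label(pol, sub))
```

Averaging keeps both scores inside their stated ranges, because every lexicon entry already lies within them.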
Fig. 4. Example of Subjectivity and Polarity results.
Fig. 5. Detailed Methodology Flowchart.
Fig. 6. Categories of USA population in COVID-19 case.
Fig. 7. Prediction results compared with confirmed results.
Fig. 8. Median Prediction Factor.
Fig. 9. Predicted and Confirmed COVID-19 Cases Based on Days.
Comparative result analysis based on different models.
| Model Name | Model Description | Prediction Results |
|---|---|---|
| Logistic | | 0.999 |
| Linear | | 0.557 |
| Logarithmic | | 0.289 |
| Quadratic | | 0.88 |
| Cubic | | 0.982 |
| Compound | | 0.977 |
| Power | | 0.702 |
| Exponential | | 0.977 |
| SNS Analysis | | 0.989 |
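The model names in the table correspond to common curve-fitting families. The source does not say which metric the "Prediction Results" column reports, so the sketch below assumes a goodness-of-fit score in the style of the coefficient of determination (R²) and compares only two of the families, linear and exponential. The daily counts are made up for illustration and are not the paper's data.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def r_squared(ys, predictions):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def linear_model(xs, ys):
    """Fitted values of y = a + b*x."""
    a, b = fit_line(xs, ys)
    return [a + b * x for x in xs]


def exponential_model(xs, ys):
    """Fitted values of y = c * exp(k*x), via regression of ln(y) on x (y > 0)."""
    ln_c, k = fit_line(xs, [math.log(y) for y in ys])
    return [math.exp(ln_c + k * x) for x in xs]


# Made-up daily counts that roughly double every two days (illustration only).
days = [1, 2, 3, 4, 5, 6, 7, 8]
cases = [10, 14, 21, 29, 40, 57, 80, 113]

scores = {
    "Linear": r_squared(cases, linear_model(days, cases)),
    "Exponential": r_squared(cases, exponential_model(days, cases)),
}

if __name__ == "__main__":
    for name, score in scores.items():
        print(f"{name}: {score:.3f}")
```

On growth-like data of this kind the exponential family scores higher than the linear one, which is the sort of ranking such a comparison table is meant to expose.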
Fig. 10. Comparative result analysis.
Comparison based on key considerations.
| Research work | Year | | Method/Software/Hardware | Sustainability | Security | Availability | Integrity |
|---|---|---|---|---|---|---|---|
| Yoo et al. | 2016 | Yes | Structural equation modeling | No | No | Yes | Yes |
| Shahi et al. | 2020 | Yes | NLP, | Yes | No | Yes | Yes |
| Shahi et al. | 2020 | Yes | Python | No | Yes | Yes | No |
| Massaad et al. | 2020 | Yes | Google Colab | No | No | Yes | Yes |
| Pui et al. | 2020 | Yes | Google Colab | Yes | Yes | No | No |
| Htet et al. | 2018 | Yes | HBase | Yes | No | Yes | Yes |
| Barbosa et al. | 2010 | Yes | Support vector machines | Yes | Yes | No | Yes |
| Proposed framework | 2020 | Yes | Google Colab | Yes | Yes | Yes | Yes |