Sikakollu Prasanth, Uttam Singh, Arun Kumar, Vinay Anand Tikkiwal, Peter H J Chong.
Abstract
The recent outbreak of COVID-19 has brought the entire world to a standstill. The rapid pace at which the virus has spread across the world is unprecedented. The sheer number of infected cases and fatalities in such a short period of time has overwhelmed medical facilities across the globe. The rapid spread of the novel coronavirus makes it imperative that its spread be forecast well in advance in order to plan for eventualities. Accurate early forecasting of the number of cases would assist governments and other organizations in strategizing and preparing for newly infected cases well in advance. In this work, a novel method of forecasting future cases of infection is proposed, based on data mined from the internet search terms of people in the affected region. The study uses Google Trends data for specific search terms related to the COVID-19 pandemic, along with European Centre for Disease Prevention and Control (ECDC) data on COVID-19 spread, to forecast future trends of daily new cases, cumulative cases and deaths for India, USA and UK. For this purpose, a hybrid GWO-LSTM model is developed, in which the network parameters of a Long Short Term Memory (LSTM) network are optimized using the Grey Wolf Optimizer (GWO). The results of the proposed model are compared with baseline models including Auto Regressive Integrated Moving Average (ARIMA), and the proposed model is observed to achieve much better results in forecasting the future trends of the spread of infection. Using the proposed hybrid GWO-LSTM model incorporating online big data from Google Trends, a reduction in Mean Absolute Percentage Error (MAPE) of up to about 98% has been observed. Further, a reduction in MAPE of 74% was observed for models incorporating Google Trends, confirming the efficacy of using public sentiment, in the form of online search frequencies of relevant terms, in forecasting pandemic numbers.
Keywords: Auto regressive integrated moving average (ARIMA); COVID-19; Deep learning; Forecasting; Google trends; Grey wolf optimization (GWO); Long short term memory (LSTM); Optimization; Pandemic
Year: 2020 PMID: 33110297 PMCID: PMC7580652 DOI: 10.1016/j.chaos.2020.110336
Source DB: PubMed Journal: Chaos Solitons Fractals ISSN: 0960-0779 Impact factor: 5.944
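The abstract describes optimizing LSTM network parameters with the Grey Wolf Optimizer. The following is a minimal sketch of a basic GWO loop only, not the authors' implementation. The function names, the two tuned parameters (`units`, `log10_lr`), their bounds, and the toy loss are all hypothetical; in practice the objective would train an LSTM and return its validation error.

```python
import random


def grey_wolf_optimize(objective, bounds, n_wolves=20, n_iters=100, seed=0):
    """Minimize `objective` over the box `bounds` with a basic Grey Wolf Optimizer."""
    rng = random.Random(seed)
    # Random initial pack inside the search box.
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best_pos, best_val = None, float("inf")

    for t in range(n_iters + 1):
        # Rank the pack; the three best wolves (alpha, beta, delta) lead the hunt.
        ranked = sorted(wolves, key=objective)
        leaders = [list(w) for w in ranked[:3]]
        alpha_val = objective(leaders[0])
        if alpha_val < best_val:
            best_pos, best_val = list(leaders[0]), alpha_val
        if t == n_iters:
            break  # the last pass only evaluates the final positions

        # Exploration factor decays linearly from 2 to 0.
        a = 2.0 * (1.0 - t / n_iters)
        for wolf in wolves:
            for d, (lo, hi) in enumerate(bounds):
                estimate = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    distance = abs(C * leader[d] - wolf[d])
                    estimate += leader[d] - A * distance
                # Move to the average of the three leader-guided estimates,
                # clipped to the allowed range.
                wolf[d] = min(hi, max(lo, estimate / 3.0))

    return best_pos, best_val


def toy_validation_loss(params):
    """Hypothetical stand-in for LSTM validation error (assumed, not from the paper)."""
    units, log10_lr = params
    return ((units - 64.0) / 32.0) ** 2 + (log10_lr + 3.0) ** 2


# Assumed search ranges: hidden units in [8, 256], log10(learning rate) in [-5, -1].
bounds = [(8.0, 256.0), (-5.0, -1.0)]
best_params, best_loss = grey_wolf_optimize(toy_validation_loss, bounds)
print(best_params, best_loss)
```

Swapping `toy_validation_loss` for a function that trains and scores a model would turn this into a hyperparameter search, at the cost of one training run per wolf per iteration.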
Google Trends search terms.
| S.No. | Search terms |
|---|---|
| 1. | Coronavirus symptoms |
| 2. | Coronavirus |
| 3. | Covid |
| 4. | Handwash |
| 5. | Healthcenter |
| 6. | Mask |
| 7. | Positive cases |
| 8. | Sanitizer |
| 9. | Coronavirus Vaccine |
Fig. 1. Proposed workflow for forecasting future COVID-19 trends.
Fig. 2. Plots depicting the relationship between the highly correlated GT terms and NC data for India, USA and UK.
Fig. 3. Architecture of an LSTM cell.
Correlation values between Google Trends data and ECDC data for India.
| S.No. | Search term | TCC Pearson | TCC Spearman | NC Pearson | NC Spearman | TCD Pearson | TCD Spearman |
|---|---|---|---|---|---|---|---|
| 1. | Sanitizer | ||||||
| 2. | Coronavirus | 0.174 | 0.166 | 0.127 | |||
| 3. | Covid | 0.374 | 0.539 | ||||
| 4. | Health center | 0.037 | 0.071 | 0.026 | 0.054 | 0.058 | 0.071 |
| 5. | Mask | 0.020 | 0.018 | ||||
| 6. | Handwash | 0.026 | 0.017 | ||||
| 7. | Coronavirus Symptoms | 0.292 | 0.294 | 0.266 | |||
| 8. | Positive Cases | ||||||
| 9. | Coronavirus Vaccine | 0.362 | 0.370 | 0.370 | |||
Correlation values between Google Trends data and ECDC data for USA.
| S.No. | Search term | TCC Pearson | TCC Spearman | NC Pearson | NC Spearman | TCD Pearson | TCD Spearman |
|---|---|---|---|---|---|---|---|
| 1. | Sanitizer | ||||||
| 2. | Coronavirus | 0.049 | |||||
| 3. | Covid | ||||||
| 4. | Health center | ||||||
| 5. | Mask | 0.541 | 0.851 | ||||
| 6. | Handwash | ||||||
| 7. | Coronavirus Symptoms | 0.318 | |||||
| 8. | Positive Cases | ||||||
| 9. | Coronavirus Vaccine | ||||||
Correlation values between Google Trends data and ECDC data for UK.
| S.No. | Search term | TCC Pearson | TCC Spearman | NC Pearson | NC Spearman | TCD Pearson | TCD Spearman |
|---|---|---|---|---|---|---|---|
| 1. | Sanitizer | ||||||
| 2. | Coronavirus | ||||||
| 3. | Covid | ||||||
| 4. | Health center | ||||||
| 5. | Mask | ||||||
| 6. | Handwash | ||||||
| 7. | Coronavirus Symptoms | ||||||
| 8. | Positive Cases | ||||||
| 9. | Coronavirus Vaccine | ||||||
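The correlation tables above report Pearson and Spearman coefficients between Google Trends series and ECDC series. As a minimal sketch of how such coefficients can be computed, the snippet below uses plain Python. The two short series are made-up illustrative values, not data from the study, and the helper names are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation: covariance divided by the product of spreads."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Illustrative values only (assumed): weekly search interest and new cases.
search_interest = [10, 20, 30, 40, 50]
new_cases = [1, 4, 9, 16, 25]

r_pearson = pearson(search_interest, new_cases)
r_spearman = spearman(search_interest, new_cases)
print(round(r_pearson, 3), round(r_spearman, 3))
```

Because the toy case counts rise monotonically but not linearly with search interest, the Spearman value is 1 while the Pearson value is slightly lower, which is the kind of gap visible between the two columns in the tables.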
Fig. 4. Comparison of performance of different models for forecasting different parameters in India: (a) Daily new cases, (b) Total cumulative cases, (c) Total deaths.
Fig. 5. Comparison of performance of different models for forecasting different parameters in USA: (a) Daily new cases, (b) Total cumulative cases, (c) Total deaths.
Fig. 6. Comparison of performance of different models for forecasting different parameters in United Kingdom: (a) Daily new cases, (b) Total cumulative cases, (c) Total deaths.
Comparison of RMSE and MAPE values using different models and features for India.
| S.No. | Trend | Model | Inputs used | nRMSE | MAPE |
|---|---|---|---|---|---|
| 1. | Total cumulative cases (TCC) | ECDC - A (2, 2, 0) | TCC | 0.485 | 41.698 |
| | | ECDC - L | TCC | 0.035 | 14.161 |
| | | ECDC - GT - L | TCC, GT - Covid, Vaccine | 0.016 | 4.154 |
| | | ECDC - GT - GWO - L | TCC, GT - Covid, Vaccine | 0.013 | 3.452 |
| 2. | Daily new cases (NC) | ECDC - A (2, 1, 0) | NC | 0.274 | 22.591 |
| | | ECDC - L | NC | 0.140 | 20.923 |
| | | ECDC - GT - L | NC, GT - Covid, Vaccine | 0.037 | 9.259 |
| | | ECDC - GT - GWO - L | NC, GT - Covid, Vaccine | 0.032 | 7.140 |
| 3. | Total cumulative deaths (TCD) | ECDC - A (0, 2, 1) | TCD | 0.309 | 27.535 |
| | | ECDC - L | TCD | 0.074 | 13.752 |
| | | ECDC - GT - L | TCD, GT - Covid, Vaccine | 0.027 | 6.885 |
| | | ECDC - GT - GWO - L | TCD, GT - Covid, Vaccine | 0.001 | 0.304 |
Comparison of RMSE and MAPE values using different models and features for USA.
| S.No. | Trend | Model | Inputs used | nRMSE | MAPE |
|---|---|---|---|---|---|
| 1. | Total cumulative cases (TCC) | ECDC - A (0, 2, 1) | TCC | 0.109 | 10.46 |
| | | ECDC - L | TCC | 0.135 | 12.914 |
| | | ECDC - GT - L | TCC, GT - Covid, Mask | 0.011 | 3.831 |
| | | ECDC - GT - GWO - L | TCC, GT - Covid, Mask | 0.012 | 3.132 |
| 2. | Daily new cases (NC) | ECDC - A (1, 1, 0) | NC | 0.169 | 15.571 |
| | | ECDC - L | NC | 0.157 | 13.262 |
| | | ECDC - GT - L | NC, GT - Covid, Mask | 0.138 | 12.637 |
| | | ECDC - GT - GWO - L | NC, GT - Covid, Mask | 0.132 | 11.78 |
| 3. | Total cumulative deaths (TCD) | ECDC - A (2, 2, 3) | TCD | 0.099 | 9.517 |
| | | ECDC - L | TCD | 0.250 | 14.55 |
| | | ECDC - GT - L | TCC, GT - Covid, Mask | 0.014 | 3.746 |
| | | ECDC - GT - GWO - L | TCD, GT - Covid, Mask | 0.009 | 2.565 |
Comparison of RMSE and MAPE values using different models and features for UK.
| S.No. | Trend | Model | Inputs used | nRMSE | MAPE |
|---|---|---|---|---|---|
| 1. | Total cumulative cases (TCC) | ECDC - A (2, 2, 0) | TCC | 0.088 | 8.808 |
| | | ECDC - L | TCC | 0.089 | 8.993 |
| | | ECDC - GT - L | TCC, GT - Covid, Vaccine | 0.027 | 7.136 |
| | | ECDC - GT - GWO - L | TCC, GT - Covid, Vaccine | 0.006 | 1.695 |
| 2. | Daily new cases (NC) | ECDC - A (2, 1, 0) | NC | 0.168 | 16.931 |
| | | ECDC - L | NC | 0.195 | 19.632 |
| | | ECDC - GT - L | NC, GT - Covid, Vaccine | 0.0363 | 9.236 |
| | | ECDC - GT - GWO - L | NC, GT - Covid, Vaccine | 0.027 | 6.945 |
| 3. | Total cumulative deaths (TCD) | ECDC - A (0, 2, 1) | TCD | 0.077 | 7.749 |
| | | ECDC - L | TCD | 0.058 | 3.12 |
| | | ECDC - GT - L | TCD, GT - Covid, Vaccine | 0.025 | 2.484 |
| | | ECDC - GT - GWO - L | TCD, GT - Covid, Vaccine | 0.009 | 1.442 |
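The comparison tables above score each model by nRMSE and MAPE. The sketch below shows one common way to compute both metrics. This record does not state how the RMSE was normalized, so dividing by the range of the actual values is an assumption; the function names and the sample series are illustrative and not taken from the study.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be non-zero)."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def nrmse(actual, predicted):
    """RMSE divided by the range of the actual values (normalization is assumed)."""
    squared_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    rmse = (squared_error / len(actual)) ** 0.5
    return rmse / (max(actual) - min(actual))


# Illustrative values only (assumed): observed and forecast case counts.
actual_cases = [100.0, 200.0, 400.0]
forecast_cases = [110.0, 180.0, 400.0]

mape_value = mape(actual_cases, forecast_cases)
nrmse_value = nrmse(actual_cases, forecast_cases)
print(round(mape_value, 3), round(nrmse_value, 4))
```

Other normalizations, such as dividing by the mean of the actual values, are also in use and would change the nRMSE scale but not the ranking of models on a fixed test set.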
Percentage improvement in MAPE values for proposed model (ECDC-GT-GWO-L) when compared to other three models for India.
| S.No. | Model | TCC | NC | TCD |
|---|---|---|---|---|
| 1. | ECDC-A | 91.71 | 68.39 | 98.89 |
| 2. | ECDC - L | 75.61 | 65.87 | 97.78 |
| 3. | ECDC - GT - L | 16.8 | 22.89 | 95.57 |
Percentage improvement in MAPE values for proposed model (ECDC-GT-GWO-L) when compared to other three models for USA.
| S.No. | Model | TCC | NC | TCD |
|---|---|---|---|---|
| 1. | ECDC-A | 70.05 | 24.34 | 73.04 |
| 2. | ECDC - L | 75.74 | 11.17 | 82.37 |
| 3. | ECDC - GT - L | 18.24 | 6.78 | 31.53 |
Percentage improvement in MAPE values for proposed model (ECDC-GT-GWO-L) when compared to other three models for UK.
| S.No. | Model | TCC | NC | TCD |
|---|---|---|---|---|
| 1. | ECDC-A | 80.07 | 58.97 | 81.38 |
| 2. | ECDC - L | 81.14 | 64.62 | 53.76 |
| 3. | ECDC - GT - L | 76.23 | 24.80 | 41.93 |
Percentage improvement in MAPE values for ECDC-GT-L when compared to ECDC-L for India, USA, UK.
| S.No. | Country | TCC | NC | TCD |
|---|---|---|---|---|
| 1. | India | 70.66 | 55.74 | 49.93 |
| 2. | USA | 70.33 | 4.71 | 74.25 |
| 3. | UK | 20.64 | 52.95 | 53.78 |
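The percentage-improvement tables above appear to follow the usual relative-reduction formula, (baseline MAPE − proposed MAPE) / baseline MAPE × 100. The sketch below applies that formula to the India MAPE values listed earlier for ECDC - A and ECDC - GT - GWO - L. The function name is hypothetical, and the formula is inferred from the reported numbers rather than stated in this record.

```python
def pct_improvement(baseline_mape, proposed_mape):
    """Relative reduction in MAPE, in percent, of the proposed model over a baseline."""
    return 100.0 * (baseline_mape - proposed_mape) / baseline_mape


# MAPE values for India: ECDC - A (ARIMA) versus ECDC - GT - GWO - L.
india_arima = {"TCC": 41.698, "NC": 22.591, "TCD": 27.535}
india_proposed = {"TCC": 3.452, "NC": 7.140, "TCD": 0.304}

india_improvement = {
    trend: pct_improvement(india_arima[trend], india_proposed[trend])
    for trend in india_arima
}
for trend, value in india_improvement.items():
    print(trend, round(value, 2))
```

The results agree with the first row of the India improvement table to within rounding in the last digit, which supports reading the three unlabeled columns as TCC, NC and TCD.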