Hossein Abbasimehr1, Reza Paki1.
Abstract
The COVID-19 pandemic has confronted people around the world with numerous problems. Given the negative impacts of COVID-19 on all aspects of people's lives, especially health and the economy, accurately forecasting the number of infected cases can help governments make sound decisions about the interventions to be taken. In this study, we propose three hybrid approaches for forecasting COVID-19 time series, each combining a deep learning model, namely multi-head attention, long short-term memory (LSTM), or a convolutional neural network (CNN), with the Bayesian optimization algorithm. All models are designed around the multiple-output forecasting strategy, which allows multiple future time points to be forecast. The Bayesian optimization method automatically selects the best hyperparameters for each model and enhances forecasting performance. We conducted our experiments on publicly available epidemiological data from Johns Hopkins University's Coronavirus Resource Center and evaluated the proposed models against the benchmark model. The experimental results show that the deep learning models outperform the benchmark model for both short-term and long-horizon forecasting. In particular, the mean SMAPE of the best deep learning model is 0.25 for short-term forecasting (10 days ahead), and the best deep learning model obtains a mean SMAPE of 2.59 for long-horizon forecasting.
Keywords: Bayesian optimization; CNN; COVID-19; Deep learning; LSTM; Multi-head attention
Year: 2020 PMID: 33281305 PMCID: PMC7699029 DOI: 10.1016/j.chaos.2020.110511
Source DB: PubMed Journal: Chaos Solitons Fractals ISSN: 0960-0779 Impact factor: 5.944
Summary of studies on COVID-19 infection forecasting.
| Reference | Modeling techniques | Country | Date |
|---|---|---|---|
| | ANFIS | China | 21 January, 2020 to 18 February, 2020 |
| | Logistic model, Bertalanffy model and Gompertz model | China | 15 January, 2020 to 4 April, 2020 |
| | Gompertz and Logistic | China, South Korea, Italy, and Singapore | Until 27 March, 2020 |
| | Gompertz, Logistic, and Artificial Neural Networks | Mexico | February 27, 2020 to May 8, 2020 |
| | ANN, ARIMA | Iran | Training set: 19 February, 2020 to 24 March, 2020; test set: 25 March, 2020 to 31 March, 2020 |
| | Fuzzy Fractal | Ten countries: US, United Kingdom, Turkey, Spain, Mexico, Italy, Iran, Germany, France, and Belgium | July 22, 2020 to 7 August, 2020 |
| | An ensemble of neural network models with fuzzy aggregation | Mexico and 12 states in Mexico | Not available |
| | ARIMA, nonlinear autoregression neural network (NARNN), and LSTM | Denmark, Belgium, Germany, France, United Kingdom, Finland, Switzerland and Turkey | Until 3 May, 2020 |
| | Bi-directional LSTM, Stacked LSTM, and Convolutional LSTM | India (32 Indian states) | March 14, 2020 to May 14, 2020 |
| | ARIMA, support vector regression (SVR), LSTM, GRU, and Bi-LSTM | Ten countries: Brazil, China, Germany, India, Israel, Italy, Russia, Spain, UK, USA | Until June 27, 2020 |
| | LSTM | Russia, Peru and Iran | Until July 7, 2020 |
Fig. 1. The general procedure of the proposed models.
Fig. 2. The proposed attention-based model (ATT_BO).
Fig. 3. The structure of the LSTM [27].
Fig. 4. The proposed LSTM-based model.
Fig. 5. The proposed CNN-based model.
The description of data.
| Dataset | Countries | Time period |
|---|---|---|
| Dataset 1 | US, United Kingdom, Turkey, Spain, Mexico, Italy, Iran, Germany, France, Belgium | January 20, 2020–August 1, 2020 |
| Dataset 2 | US, Brazil, India, Russia, South Africa, Mexico, Peru, Chile, Colombia, Iran | January 20, 2020–August 3, 2020 |
Fig. 6. The process of instance generation.
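The instance-generation step for the multiple-output strategy can be illustrated with a sliding window over a univariate series. This is a minimal sketch, not the authors' code: the function name `make_instances`, the window length of 5, and the toy series are assumptions made for illustration; only the 10-day horizon comes from the short-term setting described in the abstract.

```python
from typing import List, Sequence, Tuple


def make_instances(
    series: Sequence[float], window: int, horizon: int
) -> List[Tuple[List[float], List[float]]]:
    """Slice a series into (input window, multi-step target) pairs.

    Each instance uses `window` past values as input and the next
    `horizon` values as a single multi-output target.
    """
    instances = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        x = list(series[start:start + window])
        y = list(series[start + window:start + window + horizon])
        instances.append((x, y))
    return instances


# Hypothetical 20-day cumulative case series (illustrative values only).
cases = [float(v) for v in range(100, 120)]

# Assumed window of 5 days; horizon of 10 days as in the short-term setting.
pairs = make_instances(cases, window=5, horizon=10)
print(len(pairs), pairs[0])
```

With this layout a model emits all `horizon` values in one forward pass, rather than feeding its own one-step predictions back in recursively.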
The range of hyperparameters used in the Bayesian optimization process.
| Model | Hyperparameter range |
|---|---|
| ATT_BO | Activation function: (ReLU, Linear) |
| LSTM_BO | Activation function: (ReLU, Linear, Tanh) |
| | Dropout rate: (0.0, 0.1, 0.2, 0.3, 0.4, 0.5) |
| | Number of neurons: (32, 64, 128, 256) |
| CNN_BO | Size of kernel: (2, 3, 4, 5, 6) |
| | Stride: (1, 2) |
| | Number of neurons: (32, 64, 128, 256) |
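The ranges in the table above can be written down as a discrete search space. The sketch below is illustrative only: the dictionary keys are hypothetical names, and it assumes that each hyperparameter row listed beneath a model name belongs to that model. The random draw is a stand-in for the proposal step; it is not Bayesian optimization, which would instead pick the next candidate using a surrogate model fitted to earlier trials.

```python
import random

# Search space transcribed from the hyperparameter table.
# Assumption: rows without a model name belong to the model above them.
SEARCH_SPACE = {
    "ATT_BO": {
        "activation": ["relu", "linear"],
    },
    "LSTM_BO": {
        "activation": ["relu", "linear", "tanh"],
        "dropout_rate": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
        "num_neurons": [32, 64, 128, 256],
    },
    "CNN_BO": {
        "kernel_size": [2, 3, 4, 5, 6],
        "stride": [1, 2],
        "num_neurons": [32, 64, 128, 256],
    },
}


def grid_size(space):
    """Number of distinct configurations in a discrete search space."""
    size = 1
    for choices in space.values():
        size *= len(choices)
    return size


def sample_config(space, rng):
    """Draw one configuration uniformly at random (stand-in for a BO proposal)."""
    return {name: rng.choice(choices) for name, choices in space.items()}


rng = random.Random(0)
for model, space in SEARCH_SPACE.items():
    print(model, grid_size(space), sample_config(space, rng))
```

Under these assumptions the spaces are small enough to enumerate, so the benefit of a guided search lies mainly in reducing the number of full training runs.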
The performance of the proposed methods in terms of SMAPE on Dataset 1.
| Country | ATT_BO | LSTM_BO | CNN_BO | Fuzzy fractal |
|---|---|---|---|---|
| US | 0.4082 | 0.5325 | 0.2776 | 1.0755 |
| UK | 0.0464 | 0.056 | 0.0504 | 1.0147 |
| Turkey | 0.0412 | 0.0475 | 0.0984 | 0.0085 |
| Spain | 0.6536 | 0.62 | 0.6119 | 0.3572 |
| Mexico | 0.5171 | 0.5668 | 0.5684 | 0.693 |
| Italy | 0.0438 | 0.1117 | 0.0626 | 1.5343 |
| Iran | 0.0685 | 0.1313 | 0.0577 | 1.5343 |
| Germany | 0.1562 | 0.2321 | 0.1823 | 0.1174 |
| France | 0.3956 | 0.3169 | 0.313 | 0.2894 |
| Belgium | 0.2754 | 0.4366 | 0.2519 | 0.4281 |
The performance of the proposed methods in terms of MAPE on Dataset 1.
| Country | ATT_BO | LSTM_BO | CNN_BO | Fuzzy fractal |
|---|---|---|---|---|
| US | 0.317 | 0.5314 | 0.276 | 1.0691 |
| UK | 0.0402 | 0.0542 | 0.0456 | 1.0214 |
| Turkey | 0.0412 | 0.0182 | 0.0984 | 0.0085 |
| Spain | 0.4977 | 0.5947 | 0.6025 | 0.3581 |
| Mexico | 0.4389 | 0.5355 | 0.5187 | 0.6901 |
| Italy | 0.0409 | 0.1114 | 0.0624 | 0.0551 |
| Iran | 0.0538 | 0.1269 | 0.0428 | 1.5196 |
| Germany | 0.1461 | 0.2128 | 0.1804 | 0.1173 |
| France | 0.3208 | 0.3033 | 0.3088 | 0.2893 |
| Belgium | 0.2609 | 0.39432 | 0.2491 | 0.4287 |
The performance of the proposed methods in terms of RMSE on Dataset 1.
| Country | ATT_BO | LSTM_BO | CNN_BO | Fuzzy fractal |
|---|---|---|---|---|
| US | 20023.27 | 26415.24 | 15181.05 | 27609.68 |
| UK | 164.8403 | 193.7 | 180.567 | 3494.91 |
| Turkey | 99.43 | 115.42 | 240.66 | 27.303 |
| Spain | 2320.7 | 2273.54 | 2269.22 | 1398.52 |
| Mexico | 2511.54 | 2825.99 | 2781.7 | 3069.18 |
| Italy | 126.71 | 298.89 | 173.7 | 168.08 |
| Iran | 243.015 | 417.944 | 198.11 | 5135.7 |
| Germany | 395.75 | 537.7 | 436.9 | 333.42 |
| France | 1035.42 | 910.24 | 894.56 | 782.001 |
| Belgium | 230.52 | 369.39 | 208.2 | 312.61 |
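The three error measures reported in these tables can be sketched as follows. The formulas are not reproduced in this record, so the code uses the commonly used definitions and assumes that SMAPE and MAPE are expressed in percent; the function names and the two-point example are illustrative.

```python
import math


def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in percent (assumed scale)."""
    terms = [
        abs(a - p) / ((abs(a) + abs(p)) / 2.0)
        for a, p in zip(actual, predicted)
    ]
    return 100.0 * sum(terms) / len(terms)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumed scale)."""
    terms = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(terms) / len(terms)


def rmse(actual, predicted):
    """Root mean squared error, in the units of the series (number of cases)."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative values only, not taken from the datasets.
actual = [100.0, 200.0]
predicted = [110.0, 190.0]
print(smape(actual, predicted), mape(actual, predicted), rmse(actual, predicted))
```

SMAPE and MAPE are scale-free, which makes them comparable across countries, whereas RMSE grows with the size of the case counts; this is consistent with the US showing by far the largest RMSE values.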
The performance of all methods in terms of Mean SMAPE, Mean MAPE, Mean RMSE, Rank SMAPE, Rank MAPE, and Rank RMSE (the best results are marked bold) on Dataset 1.
| Method | ATT_BO | LSTM_BO | CNN_BO | Fuzzy fractal |
|---|---|---|---|---|
| Mean SMAPE | 0.2606 | 0.3051 | 0.2474 | 0.7052 |
| Mean MAPE | 0.2157 | 0.2883 | 0.2385 | 0.5557 |
| Mean RMSE | 2715.12 | 3435.80 | 2256.47 | 4233.14 |
| Rank SMAPE | 2.1 | 3 | 2.2 | 2.7 |
| Rank MAPE | 2 | 3 | 2.4 | 2.6 |
| Rank RMSE | 2 | 3.3 | 2.1 | 2.6 |
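The mean and rank rows can be derived from the per-country tables. The sketch below uses the SMAPE values for Dataset 1 exactly as listed above and assumes a simple ranking rule (rank 1 for the lowest error in each country, then averaged over countries); the record does not state how the authors computed their ranks, so that rule is an assumption.

```python
METHODS = ["ATT_BO", "LSTM_BO", "CNN_BO", "Fuzzy fractal"]

# SMAPE on Dataset 1, one row per country, columns ordered as in METHODS.
SMAPE_D1 = {
    "US": [0.4082, 0.5325, 0.2776, 1.0755],
    "UK": [0.0464, 0.056, 0.0504, 1.0147],
    "Turkey": [0.0412, 0.0475, 0.0984, 0.0085],
    "Spain": [0.6536, 0.62, 0.6119, 0.3572],
    "Mexico": [0.5171, 0.5668, 0.5684, 0.693],
    "Italy": [0.0438, 0.1117, 0.0626, 1.5343],
    "Iran": [0.0685, 0.1313, 0.0577, 1.5343],
    "Germany": [0.1562, 0.2321, 0.1823, 0.1174],
    "France": [0.3956, 0.3169, 0.313, 0.2894],
    "Belgium": [0.2754, 0.4366, 0.2519, 0.4281],
}


def mean_per_method(table):
    """Average each method's error over all countries."""
    rows = list(table.values())
    return [sum(row[j] for row in rows) / len(rows) for j in range(len(rows[0]))]


def average_rank(table):
    """Rank methods within each country (1 = lowest error), then average."""
    rows = list(table.values())
    totals = [0] * len(rows[0])
    for row in rows:
        order = sorted(range(len(row)), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            totals[j] += rank
    return [t / len(rows) for t in totals]


means = dict(zip(METHODS, mean_per_method(SMAPE_D1)))
ranks = dict(zip(METHODS, average_rank(SMAPE_D1)))
for name in METHODS:
    print(name, round(means[name], 4), ranks[name])
```

Rounded to four decimals, the means reproduce the Mean SMAPE row. The average ranks agree with the Rank SMAPE row for ATT_BO and Fuzzy fractal, while LSTM_BO and CNN_BO each differ from the reported values by 0.1 under this ranking rule.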
Fig. 7. The actual and predicted number of cases for 10 days (22 July to 1 August) for the US.
Fig. 8. The actual and predicted number of cases for 10 days (22 July to 1 August) for the UK.
Fig. 9. The actual and predicted number of cases for 10 days (22 July to 1 August) for Turkey.
Fig. 10. The actual and predicted number of cases for 10 days (22 July to 1 August) for Spain.
Fig. 11. The actual and predicted number of cases for 10 days (22 July to 1 August) for Mexico.
Fig. 13. The actual and predicted number of cases for 10 days (22 July to 1 August) for Iran.
Fig. 14. The actual and predicted number of cases for 10 days (22 July to 1 August) for Germany.
Fig. 15. The actual and predicted number of cases for 10 days (22 July to 1 August) for France.
Fig. 16. The actual and predicted number of cases for 10 days (22 July to 1 August) for Belgium.
The performance of the proposed methods in terms of SMAPE on Dataset 2 (the best results are marked bold).
| Country | ATT_BO | LSTM_BO | CNN_BO |
|---|---|---|---|
| US | 0.8117 | 0.9946 | |
| Brazil | 4.1811 | 3.4828 | |
| India | 0.8735 | 1.1117 | |
| Russia | 0.7723 | 1.7461 | |
| South Africa | 8.1889 | 9.4018 | |
| Mexico | 1.1996 | 1.5866 | |
| Peru | 4.3358 | 3.5637 | |
| Chile | 2.4096 | 3.2176 | |
| Colombia | 4.8034 | 4.2347 | |
| Iran | 1.0407 | 1.4831 |
The performance of the proposed methods in terms of MAPE on Dataset 2 (the best results are marked bold).
| Country | ATT_BO | LSTM_BO | CNN_BO |
|---|---|---|---|
| US | 0.8105 | 0.9883 | |
| Brazil | 4.2974 | 3.5924 | |
| India | 0.878 | 1.08 | |
| Russia | 0.7681 | 1.7688 | |
| South Africa | 8.8127 | 10.2508 | |
| Mexico | 1.1919 | 1.5692 | |
| Peru | 4.21 | 3.4734 | |
| Chile | 2.3693 | 3.147 | |
| Colombia | 4.9383 | 4.3363 | |
| Iran | 1.0485 | 1.4977 |
The performance of the proposed methods in terms of RMSE on Dataset 2 (the best results are marked bold).
| Country | ATT_BO | LSTM_BO | CNN_BO |
|---|---|---|---|
| US | 36223.39 | 43230.38 | |
| Brazil | 110229.9 | 98053.65 | |
| India | 13834.01 | 16290.89 | |
| Russia | 7054.7 | 15662.22 | |
| South Africa | 55611.48 | 65740.78 | |
| Mexico | 5360.93 | 7080.47 | |
| Peru | 20003.09 | 17376.54 | |
| Chile | 9388.29 | 13168.58 | |
| Colombia | 12356.65 | 10936.55 | |
| Iran | 3279.91 | 4802.84 |
The performance of all methods in terms of Mean SMAPE, Mean MAPE, Mean RMSE, Rank SMAPE, Rank MAPE, and Rank RMSE on Dataset 2 (the best results are marked bold).
| Method | ATT_BO | LSTM_BO | CNN_BO |
|---|---|---|---|
| Mean SMAPE | 2.7801 | 3.0348 | |
| Mean MAPE | 2.8512 | 3.1181 | |
| Mean RMSE | 26648.92 | 26868.01 | |
| Rank SMAPE | 2 | 2.6 | |
| Rank MAPE | 2 | 2.6 | |
| Rank RMSE | 2 | 2.5 |
Fig. 17. The actual and predicted number of cases for the test set: US.
Fig. 18. The actual and predicted number of cases for the test set: Brazil.
Fig. 19. The actual and predicted number of cases for the test set: India.
Fig. 20. The actual and predicted number of cases for the test set: Russia.
Fig. 21. The actual and predicted number of cases for the test set: South Africa.
Fig. 22. The actual and predicted number of cases for the test set: Mexico.
Fig. 23. The actual and predicted number of cases for the test set: Peru.
Fig. 24. The actual and predicted number of cases for the test set: Chile.
Fig. 25. The actual and predicted number of cases for the test set: Colombia.
Fig. 26. The actual and predicted number of cases for the test set: Iran.