Hossein Abbasimehr, Reza Paki, Aram Bahrini.
Abstract
The COVID-19 pandemic has disrupted the economy and businesses and has affected all facets of people's lives. Forecasting the number of infected cases is critical for making accurate decisions on the measures needed to control the outbreak. While deep learning models have proved effective in this context, time series augmentation can improve their performance. In this paper, we use time series augmentation techniques to create new time series that take into account the characteristics of the original series, which we then use to generate enough samples to fit deep learning models properly. The proposed method is applied to COVID-19 time series forecasting using three deep learning techniques: (1) long short-term memory, (2) gated recurrent units, and (3) convolutional neural networks. In terms of symmetric mean absolute percentage error and root mean square error, the proposed method significantly improves the performance of long short-term memory and convolutional neural networks, while the improvement for gated recurrent units is moderate. Finally, we present a summary of the top augmentation model as well as a visual representation of the actual and forecasted data for each country.
Keywords: Augmentation methods; COVID-19 pandemic; Deep learning; Time series forecasting
Year: 2021; PMID: 34658536; PMCID: PMC8502508; DOI: 10.1007/s00521-021-06548-9
Source DB: PubMed; Journal: Neural Comput Appl; ISSN: 0941-0643; Impact factor: 5.102
Fig. 1 Structure of an LSTM unit
Fig. 2 Structure of a GRU unit [39]
Fig. 3 Structure of CNN for time series
Fig. 4 An example of time series
Fig. 5 Proposed schema
Architecture of three deep learning models used to evaluate the proposal
| Benchmarking models | Architecture |
|---|---|
| LSTM | LSTM layer |
| | Dense |
| | Output |
| GRU | GRU layer |
| | Dense |
| | Output |
| CNN | 1D convolution layer |
| | Dense |
| | Output |
Fig. 6 Architecture of the utilized models
Statistical properties of the daily data of the COVID-19 cases for the USA, Brazil, India, France, and the UK
| Country | USA | Brazil | India | France | UK |
|---|---|---|---|---|---|
| Sample size | 432 | 397 | 424 | 430 | 423 |
| Mean | 70,052 | 31,574 | 28,395 | 11,088 | 10,277 |
| Median | 47,043 | 28,629 | 18,537 | 4,321.5 | 4,329 |
| Mode | 0 | 0 | 0 | 0 | 0 |
| Standard deviation | 68,074.5 | 23,178.8 | 27,378.6 | 14,657.6 | 13,657.4 |
| Skewness | 1.31 | 0.45 | 0.83 | 2.24 | 1.88 |
| Standard error of skewness | 0.12 | 0.12 | 0.12 | 0.12 | 0.12 |
| Skewness / standard error of skewness | 11.15 | 3.68 | 6.98 | 19.01 | 15.82 |
| Kurtosis | 0.76 | 7.43 | 3.37 | ||
| Standard error of kurtosis | 0.23 | 0.24 | 0.24 | 0.24 | 0.24 |
| Kurtosis / standard error of kurtosis | 3.26 | 31.60 | 14.21 | | |
| Min | 0 | 0 | 0 | 0 | 0 |
| Max | 300,416 | 100,158 | 97,894 | 106,091 | 68,192 |
| Range | 300,416 | 100,158 | 97,894 | 106,091 | 68,192 |
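The standard errors reported for skewness and kurtosis depend only on the sample size. The sketch below assumes the large-sample formulas used by common statistics packages; the record does not state which formulas the authors applied, so the function names and the choice of formula are illustrative assumptions. With a sample size of 432, the sketch gives values that round to the 0.12 and 0.23 listed for the USA.

```python
import math


def se_skewness(n: int) -> float:
    """Standard error of sample skewness for a sample of size n (assumed formula)."""
    if n < 3:
        raise ValueError("n must be at least 3")
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def se_kurtosis(n: int) -> float:
    """Standard error of sample excess kurtosis for a sample of size n (assumed formula)."""
    if n < 4:
        raise ValueError("n must be at least 4")
    return 2.0 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))


def ratio_to_standard_error(statistic: float, standard_error: float) -> float:
    """Ratio of a statistic to its standard error, a rough gauge of departure from normality."""
    return statistic / standard_error


if __name__ == "__main__":
    n_usa = 432  # sample size listed for the USA
    print(round(se_skewness(n_usa), 2))
    print(round(se_kurtosis(n_usa), 2))
    # Skewness of 1.31 is itself rounded, so the ratio is only approximate.
    print(ratio_to_standard_error(1.31, se_skewness(n_usa)))
```

A ratio far above roughly 2 in absolute value is usually read as a sign that the series is not normally distributed, which is the case for every country in the table.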
Statistical properties of the daily data of the COVID-19 cases for Russia, Italy, Spain, Turkey, and Germany
| Country | Russia | Italy | Spain | Turkey | Germany |
|---|---|---|---|---|---|
| Sample size | 423 | 423 | 422 | 383 | 427 |
| Mean | 10,566 | 8,351 | 7,965 | 8,376 | 6,521 |
| Median | 8,764 | 2,843 | 1,931 | 2,026 | 1,898 |
| Mode | 0 | 0 | 0 | 987 | 0 |
| Standard deviation | 8,348.8 | 9,959.9 | 12,779.9 | 42,585.6 | 8,722.4 |
| Skewness | 0.68 | 1.16 | 2.85 | 18.45 | 1.76 |
| Standard error of skewness | 0.12 | 0.12 | 0.12 | 0.13 | 0.12 |
| Skewness / standard error of skewness | 5.75 | 9.77 | 23.97 | 147.56 | 14.88 |
| Kurtosis | 0.41 | 11.14 | 353.45 | 2.90 | |
| Standard error of kurtosis | 0.24 | 0.24 | 0.24 | 0.25 | 0.24 |
| Kurtosis / standard error of kurtosis | 1.74 | 47.00 | 1,419.46 | 12.29 | |
| Min | 0 | 0 | 0 | 0 | 0 |
| Max | 29,499 | 40,902 | 93,822 | 823,225 | 49,044 |
| Range | 29,499 | 40,902 | 93,822 | 823,225 | 49,044 |
Range of the hyperparameters
| Hyperparameter | Range |
|---|---|
| Common hyperparameters | Lag: [10, 11, 12, 13, 14, 15] |
| | Learning rate: [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05] |
| | Dense activation function: [ReLU, Linear] |
| | Output activation function: [ReLU, Linear] |
| LSTM | Activation function: [ReLU, Linear] |
| | Dropout rate: [0.0, 0.1, 0.2, 0.3, 0.4, 0.5] |
| | Number of units: [4, 8, 16, 32, 64, 128] |
| CNN_FE | Kernel size: [2, 3, 4] |
| | Number of filters: [32, 64, 128, 256] |
| GRU | Activation function: [ReLU, Linear] |
| | Number of units: [4, 8, 16, 32, 64, 128] |
| | Dropout rate: [0.0, 0.1, 0.2, 0.3, 0.4, 0.5] |
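The ranges above can be written down as a search space. The following is a minimal sketch, assuming the common hyperparameters are combined with the model-specific ones; the record does not say how the space was searched, so the random sampling shown here is only one possible strategy, and all dictionary keys and function names are hypothetical.

```python
import math
import random

# Ranges copied from the table; key names are illustrative.
COMMON = {
    "lag": [10, 11, 12, 13, 14, 15],
    "learning_rate": [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05],
    "dense_activation": ["relu", "linear"],
    "output_activation": ["relu", "linear"],
}

MODEL_SPECIFIC = {
    "LSTM": {
        "activation": ["relu", "linear"],
        "dropout_rate": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
        "units": [4, 8, 16, 32, 64, 128],
    },
    "GRU": {
        "activation": ["relu", "linear"],
        "dropout_rate": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
        "units": [4, 8, 16, 32, 64, 128],
    },
    "CNN_FE": {
        "kernel_size": [2, 3, 4],
        "filters": [32, 64, 128, 256],
    },
}


def search_space(model: str) -> dict:
    """Merge the common ranges with the ranges specific to one model."""
    return {**COMMON, **MODEL_SPECIFIC[model]}


def grid_size(space: dict) -> int:
    """Number of configurations in the full Cartesian grid."""
    return math.prod(len(values) for values in space.values())


def sample_config(space: dict, rng: random.Random) -> dict:
    """Draw one configuration uniformly at random (an assumed search strategy)."""
    return {name: rng.choice(values) for name, values in space.items()}


if __name__ == "__main__":
    rng = random.Random(0)
    for model in MODEL_SPECIFIC:
        space = search_space(model)
        print(model, grid_size(space), sample_config(space, rng))
```

Counting the grid first is a cheap way to decide whether an exhaustive search is feasible or whether sampling is the more practical choice.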
An example of the sample generation function
| Input | Output |
|---|---|
| . | . |
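The cells of this example did not survive in the record. As an illustration only, the sketch below assumes the sample generation function is a one-step-ahead sliding window whose width is the lag hyperparameter: each input holds `lag` consecutive observations and the output is the observation that follows. The function name and the single-step horizon are assumptions, not details confirmed by the source.

```python
from typing import List, Sequence, Tuple


def make_samples(series: Sequence[float], lag: int) -> Tuple[List[List[float]], List[float]]:
    """Turn a series into (input window, next value) pairs using a sliding window."""
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must be at least 1 and smaller than the series length")
    inputs: List[List[float]] = []
    outputs: List[float] = []
    for start in range(len(series) - lag):
        inputs.append(list(series[start:start + lag]))  # lag consecutive observations
        outputs.append(series[start + lag])             # the value to forecast
    return inputs, outputs


if __name__ == "__main__":
    x, y = make_samples([1, 2, 3, 4, 5, 6], lag=3)
    for window, target in zip(x, y):
        print(window, "->", target)
```

Under this assumption a series of length n yields n minus lag samples, which is why short series leave few training samples for a deep learning model.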
LSTM results for ten countries in terms of SMAPE and RMSE for regular and augmentation approaches. LSTM_Aug is obtained following the proposed approach
| Country | SMAPE | RMSE | ||
|---|---|---|---|---|
| LSTM_Aug | LSTM | LSTM_Aug | LSTM | |
| USA | 1.73 | 570575.55 | ||
| Brazil | 0.76 | 101239.26 | ||
| India | 0.77 | 112064.7 | ||
| France | 0.66 | 45009.65 | ||
| Russia | 1.23 | 59394.68 | ||
| UK | 2.32 | 106723.4 | ||
| Italy | 0.62 | 22976.17 | ||
| Spain | 2.37 | 86450.37 | ||
| Turkey | 2.13 | 78740.49 | ||
| Germany | 0.73 | 24476.58 | ||
| Mean | 1.31 | 120057.52 | ||
Bold values indicate the best results
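The two error measures used in these result tables can be sketched as follows. The record does not give the exact SMAPE variant, so this assumes one common definition (mean of the absolute error divided by the average of the absolute actual and forecast values, expressed as a percentage); the function names and the handling of zero denominators are assumptions.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Root mean square error between actual and forecasted values."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def smape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Symmetric mean absolute percentage error, in percent (assumed variant)."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, f in zip(actual, forecast):
        denominator = (abs(a) + abs(f)) / 2.0
        if denominator > 0:  # a pair of zeros contributes no error
            total += abs(a - f) / denominator
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    actual = [100.0, 200.0]
    forecast = [110.0, 190.0]
    print(rmse(actual, forecast))
    print(smape(actual, forecast))
```

RMSE is expressed in the units of the series, so it is not comparable across countries with very different case counts, whereas SMAPE is scale-free.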
Results of CNN for ten countries in terms of SMAPE and RMSE for regular and augmentation approaches. CNN_Aug is obtained following the proposed approach
| Country | SMAPE | RMSE | ||
|---|---|---|---|---|
| CNN_Aug | CNN | CNN_Aug | CNN | |
| USA | 0.32 | 104899.61 | ||
| Brazil | 0.67 | 88552.5 | ||
| India | 0.66 | 102096.63 | ||
| France | 0.51 | 33311.58 | ||
| Russia | 0.48 | 23084.47 | ||
| UK | 0.93 | 41051.61 | ||
| Italy | 0.50 | 19769.05 | ||
| Spain | 1.41 | 54012.22 | ||
| Turkey | 1.22 | 45970.07 | ||
| Germany | 0.73 | 26058.33 | ||
| Mean | 0.73 | 51860.22 | ||
Bold values indicate the best results
GRU results for ten countries in terms of SMAPE and RMSE for regular and augmentation approaches. GRU_Aug is obtained following the proposed approach
| Country | SMAPE | RMSE | ||
|---|---|---|---|---|
| GRU_Aug | GRU | GRU_Aug | GRU | |
| USA | 0.44 | 152119.5 | ||
| Brazil | 0.77 | 101927.57 | ||
| India | 0.72 | 109283.11 | ||
| France | 0.60 | 40609.86 | ||
| Russia | 0.93 | 47059.80 | ||
| UK | 0.52 | 24989.78 | ||
| Italy | 0.43 | 18585.69 | ||
| Spain | 4.1 | 145384.97 | ||
| Turkey | 1.64 | 64087.62 | ||
| Germany | 0.76 | 25800.14 | ||
| Mean | 1.01 | 62664.06 | ||
Bold values indicate the best results
Top model for each country
| Country | Top model |
|---|---|
| USA | CNN_Aug |
| Brazil | CNN_Aug |
| India | LSTM_Aug |
| France | CNN_Aug |
| Russia | CNN_Aug |
| UK | LSTM_Aug |
| Italy | CNN_Aug |
| Spain | CNN_Aug |
| Turkey | CNN_Aug |
| Germany | CNN_Aug |
Fig. 7 Actual and forecasted number of cases for the test set: USA
Fig. 8 Actual and forecasted number of cases for the test set: Brazil
Fig. 9 Actual and forecasted number of cases for the test set: India
Fig. 10 Actual and forecasted number of cases for the test set: France
Fig. 11 Actual and forecasted number of cases for the test set: Russia
Fig. 12 Actual and forecasted number of cases for the test set: UK
Fig. 13 Actual and forecasted number of cases for the test set: Italy
Fig. 14 Actual and forecasted number of cases for the test set: Spain
Fig. 15 Actual and forecasted number of cases for the test set: Turkey
Fig. 16 Actual and forecasted number of cases for the test set: Germany