
A novel approach based on combining deep learning models with statistical methods for COVID-19 time series forecasting.

Hossein Abbasimehr¹, Reza Paki¹,², Aram Bahrini³.

Abstract

The COVID-19 pandemic has disrupted the economy and businesses and impacted all facets of people's lives. It is critical to forecast the number of infected cases to make accurate decisions on the necessary measures to control the outbreak. While deep learning models have proved to be effective in this context, time series augmentation can improve their performance. In this paper, we use time series augmentation techniques to create new time series that take into account the characteristics of the original series, which we then use to generate enough samples to fit deep learning models properly. The proposed method is applied in the context of COVID-19 time series forecasting using three deep learning techniques, (1) the long short-term memory, (2) gated recurrent units, and (3) convolutional neural network. In terms of symmetric mean absolute percentage error and root mean square error measures, the proposed method significantly improves the performance of long short-term memory and convolutional neural networks. Also, the improvement is average for the gated recurrent units. Finally, we present a summary of the top augmentation model as well as a visual representation of the actual and forecasted data for each country.


Keywords: Augmentation methods; COVID-19 pandemic; Deep learning; Time series forecasting

Year: 2021 PMID: 34658536 PMCID: PMC8502508 DOI: 10.1007/s00521-021-06548-9

Source DB: PubMed Journal: Neural Comput Appl ISSN: 0941-0643 Impact factor: 5.102

Introduction

Temporary interventions such as social distancing, self-isolating, quarantining, and shutting down nonessential activities have been strategies for the governments to prevent the virus from spreading. It is essential to forecast the number of infected cases using different data types to notify public health decision-makers by estimating the likely impact of the COVID-19 pandemic and plan accordingly [1-4]. Deep learning models have demonstrated successful performance in language and image processing tasks [6-8]. Also, they exhibited state-of-the-art performance in forecasting complex time series data [5, 9–12]. The main advantage of deep learning models is their ability to learn representations from raw input data. Among the most popular deep learning algorithms, long short-term memory (LSTM) and bidirectional LSTM (Bi-LSTM) [13] have been used in [14-17], with significant results in COVID-19 forecasting. LSTM is a special type of recurrent neural networks (RNNs), which is developed to learn temporal information from sequential data [18]. Despite the fact that deep learning algorithms can reach acceptable performance in time series forecasting, particularly in COVID-19 forecasting applications, their forecasting capability is primarily dependent on the amount of data available to fit their parameters appropriately [12, 19]. Another challenge with deep learning for time series forecasting is that, even though adequate data samples are available, data from the distant past are typically less useful for forecasting [12]. In other words, in predicting, recent observations of an individual series are more valuable. This may be due to shifts in patterns that formerly occurred in a series. To overcome the aforementioned issue and increase the performance of deep learning models in time series forecasting, we propose exploiting time series augmentation techniques [19-21] to generate new series with similar temporal dependencies as the original series. 
We then extract new samples from the augmented time series to enhance model training. Three deep learning models based on the LSTM, gated recurrent units (GRU) [22], and convolutional neural network (CNN) [23] are used to see whether the proposed approach is useful. A multi-step-ahead forecasting strategy [24] is used to develop the models, allowing them to predict the number of cases for the next few days. It is a preferable alternative to single-step-ahead forecasting for long-horizon forecasting [25]. The proposed models are applied to COVID-19 data from the top 10 countries with the most reported confirmed cases from January 20, 2020, until March 28, 2021. We show that the proposed method significantly improves the performance of the LSTM-based and CNN-based models but has an average improvement on the GRU performance. To evaluate the effectiveness of the proposed model, we visualize the forecasting results and provide statistical characteristics of the data to enable governments to make long-term decisions on how to deal with the pandemic. The remainder of this paper is organized as follows. Section 2 provides a brief review on COVID-19 time series forecasting and the description of the employed deep learning methods. In Sect. 3, we present the proposed approach and the architectures of the designed models. Section 4 assesses the usefulness of the proposed method via the experimental study. Discussions are provided in Sect. 5, and finally, the paper concludes in Sect. 6 with some suggestions for future work in this area.

Related work

This section first presents a review of the COVID-19 time series forecasting methods and then describes the utilized models throughout the study.

COVID-19 Forecasting

Various approaches, mostly mathematical, statistical, machine learning, and deep learning models have been utilized in previous studies [3, 4, 14, 16, 17, 26]. Rahimi et al. [27] provided a review of widely used forecasting models on COVID-19 data. Here, we concentrate mainly on COVID-19 time series forecasting studies and present a brief review in this context. Al-Qaness et al. [28] presented an improved adaptive neuro-fuzzy inference method (ANFIS) that uses an enhanced flower pollination algorithm (FPA) by the salp swarm algorithm (SSA) to forecast the COVID-19 cases in China. Their model is more potent in terms of mean absolute percentage error (MAPE), root mean squared relative error (RMSRE), coefficient of determination, and computing time. Torrealba-Rodriguez et al. [3] used Gompertz, logistic, and artificial neural network (ANN) models. Their results from the infected cases in Mexico showed a high coefficient of determination between the studied data and those obtained by the proposed models. Similar studies which considered Gompertz and logistic models can be found in [3, 29–32]. Castillo and Melin [26] studied an approach based on fuzzy fractal for data from 10 countries by combining (1) fractal dimension to evaluate the complexity of the dynamics in the time series and (2) fuzzy logic to reflect the uncertainty forecasting. Melin et al. [33] introduced a multiple ensemble neural network model with fuzzy logic response aggregation. Their experiments on the data of Mexico infected cases show the superiority of their proposed model over the single ANN. Kırbaş et al. [15] used autoregressive integrated moving average (ARIMA), nonlinear autoregression neural network (NARNN), and LSTM approaches to study the data of 8 European countries. Shahid et al. [16] proposed forecast models with ARIMA, support vector regression (SVR), LSTM, Bi-LSTM in 10 significantly affected countries. Leila et al. 
[34] applied ANN and ARIMA models, and Petropoulos and Makridakis [35] implemented exponential smoothing forecasting to predict the infected cases. Arora et al. [14] utilized recurrent neural network (RNN)-based variants such as deep LSTM, convolutional LSTM, and Bi-LSTM for the cases in India. For Russia, Peru, and Iran, Wang et al. [16] used LSTM networks and rolling updating mechanisms to feed new forecasting outcomes into model training for the next iteration. The study of Hasan [36] suggested a hybrid model consisting of ensemble empirical mode decomposition (EEMD) and artificial neural network (ANN), which outperformed conventional statistical analysis. Machine learning algorithms were used by Li et al. [37] in predicting mortality in confirmed cases of COVID-19. Their results indicated that the gradient boosting decision tree (GBDT) outperformed the logistic regression (LR) models, that this comparison appeared to be independent of disease severity, and that the 5-index LR (LR-5) model predicted death with a high area under the curve (AUC). A review of the previous studies indicates that computational intelligence methods, especially deep learning methods, have attracted growing attention in COVID-19 time series forecasting. Even though deep neural networks have performed reasonably well when applied to COVID-19 time series data, in this study we aim to enhance their predictive power by feeding them with more data. In general, the performance of a deep learning model is largely determined by the number of samples used in the training phase. The inherent problem in time series forecasting is that time series are often short, and accordingly, the number of extracted samples is small. To address this problem, we propose generating new time series with characteristics similar to those of the original time series using statistical data augmentation methods. The series obtained via augmentation are used to create new samples. In this way, a sufficient number of instances is provided for model learning.

Description of the employed models

Unlike a feed-forward ANN, an RNN can capture the temporal dependencies in a sequence, so we use RNNs for the sequence processing task. However, the key issue with RNNs is the vanishing/exploding gradient problem, which makes them difficult to train. Two architectures with gating mechanisms, the LSTM [38] and GRU [39], have been proposed to solve this problem. In addition, we use a CNN, briefly discussed below, as another deep learning unit in our experiments.

LSTM

In this section, we explain the structure and mechanism of the LSTM unit. As illustrated in Fig. 1, each LSTM unit is comprised of a memory cell C, an input gate i, an output gate o and a forget gate f.

Fig. 1

Structure of an LSTM unit

Considering the following parameters, the learning procedure of LSTM is described below: $x_t$ is the input vector at time step $t$; $b_i$, $b_o$, $b_f$, and $b_c$ are the bias vectors of the input gate, output gate, forget gate, and memory cell; $W_i$, $W_o$, $W_f$, and $W_c$ are the corresponding weight matrices; and $U_i$, $U_o$, $U_f$, and $U_c$ are the corresponding recurrent weights.

The output of the LSTM unit is computed as follows:

$$h_t = o_t \odot \tanh(C_t)$$

where $o_t$ is the output gate that regulates the outgoing information of the LSTM unit and $C_t$ is the memory. $o_t$ is computed by

$$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)$$

where $\sigma$ is the logistic sigmoid and $h_{t-1}$ is the output vector (hidden state) of time $t-1$. The memory cell is updated as follows:

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$

where $\tilde{C}_t$, the newly computed memory, is obtained as follows:

$$\tilde{C}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)$$

In fact, the memory cell is a combination of the previous memory multiplied by the forget gate, and the new memory regulated by the input gate, $i_t$. $f_t$ and $i_t$ are computed as follows:

$$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)$$

$$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)$$
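As a rough illustration of the gating mechanism described above (the forget gate scales the previous memory, the input gate scales the newly computed memory, and the output gate regulates the hidden state), the following NumPy sketch performs a single LSTM step. The function name `lstm_step`, the dictionary layout of the weights (keys `i`, `f`, `o`, `c`), and the toy dimensions are assumptions made for this example; it is not the Keras implementation used in the experiments.

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. W, U, b are dicts keyed by 'i', 'f', 'o', 'c' (assumed layout)."""
    i_t = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])    # input gate
    f_t = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])    # forget gate
    o_t = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])    # output gate
    c_new = np.tanh(W["c"] @ x_t + U["c"] @ h_prev + b["c"])  # newly computed memory
    c_t = f_t * c_prev + i_t * c_new                          # updated memory cell
    h_t = o_t * np.tanh(c_t)                                  # hidden state (unit output)
    return h_t, c_t

# Toy usage: 1 input feature, 4 hidden units, random weights (illustrative only).
rng = np.random.default_rng(0)
n_in, n_hidden = 1, 4
W = {k: rng.normal(size=(n_hidden, n_in)) for k in "ifoc"}
U = {k: rng.normal(size=(n_hidden, n_hidden)) for k in "ifoc"}
b = {k: np.zeros(n_hidden) for k in "ifoc"}
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for value in [0.1, 0.2, 0.3]:  # feed a short sequence one value at a time
    h, c = lstm_step(np.array([value]), h, c, W, U, b)
```

Because the output gate lies in (0, 1) and the hyperbolic tangent in (-1, 1), every entry of the hidden state stays strictly between -1 and 1.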

GRU

GRU is another variant of RNN that uses a gating mechanism to regulate the flow of information inside the unit. Unlike LSTM, GRU does not contain a memory cell. As Fig. 2 portrays, the GRU has two gates, a reset gate $r$ and an update gate $z$. The reset gate decides how to combine the new input with the previous hidden state, $h_{t-1}$. Also, the update gate determines how much the unit updates its hidden state.

Fig. 2

Structure of a GRU unit [39]
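The text does not spell out the GRU update equations, so the sketch below assumes one common formulation: the reset gate scales the previous hidden state before the candidate state is computed, and the update gate interpolates between the previous hidden state and the candidate. The function name `gru_step` and the weight dictionaries (keys `r`, `z`, `h`) are hypothetical, and some libraries swap the roles of the update gate and its complement.

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, W, U, b):
    """One GRU step under the assumed formulation; W, U, b are dicts keyed by 'r', 'z', 'h'."""
    r_t = sigmoid(W["r"] @ x_t + U["r"] @ h_prev + b["r"])            # reset gate
    z_t = sigmoid(W["z"] @ x_t + U["z"] @ h_prev + b["z"])            # update gate
    h_new = np.tanh(W["h"] @ x_t + U["h"] @ (r_t * h_prev) + b["h"])  # candidate state
    # Assumed convention: z_t keeps the old state, (1 - z_t) admits the candidate.
    return z_t * h_prev + (1.0 - z_t) * h_new

# Toy usage with random weights (illustrative only).
rng = np.random.default_rng(0)
n_in, n_hidden = 1, 3
W = {k: rng.normal(size=(n_hidden, n_in)) for k in "rzh"}
U = {k: rng.normal(size=(n_hidden, n_hidden)) for k in "rzh"}
b = {k: np.zeros(n_hidden) for k in "rzh"}
h = np.zeros(n_hidden)
for value in [0.1, 0.2, 0.3]:
    h = gru_step(np.array([value]), h, W, U, b)
```

Note that, unlike the LSTM unit, no separate memory cell is carried between steps; the hidden state is the only recurrent quantity.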

CNN

CNN has shown promise in a variety of fields, including machine vision [23]. A CNN’s convolutional layers extract new features from the input data: each layer contains convolution kernels (i.e., small windows) that slide over the input and perform convolution operations, as shown in Fig. 3 [40]. The features obtained by convolution are typically more discriminative than the raw input data, resulting in better forecasting.

Fig. 3

Structure of CNN for time series
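The sliding-window operation described above can be sketched for a univariate series as follows. This is a minimal illustration with a single hand-picked kernel and no padding; the function name `conv1d_valid` is hypothetical, and a trained layer would learn many kernels rather than use a fixed one.

```python
import numpy as np

def conv1d_valid(series, kernel, bias=0.0):
    """Slide `kernel` over `series` (no padding) and return one feature per window."""
    series = np.asarray(series, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    k = len(kernel)
    # Each output value is the dot product of the kernel with one window of the input.
    return np.array([series[s:s + k] @ kernel for s in range(len(series) - k + 1)]) + bias

# A kernel of size 2 with weights (-1, 1) extracts day-to-day differences.
features = conv1d_valid([1, 2, 4, 7, 11], kernel=[-1, 1])
```

With a kernel of size k and an input of length n, this produces n - k + 1 features.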

Proposed method

Deep learning methods such as LSTM, CNN, and GRU have been applied successfully in the time series forecasting context. The performance of these techniques depends mainly on having enough data to fit their parameters suitably [12]. The number of samples extracted from a short time series may be insufficient to achieve an optimal model [19]. These methods should be appropriately regularized to prevent overfitting. Another difficulty with time series forecasting is that, even if the series is long and adequate data are available, observations from the distant past are usually less informative for prediction. In other words, recent observations of an individual series are more useful in forecasting, possibly because the patterns in a series change over time. The commonly used procedure of data preparation for a time series forecasting task is illustrated in Fig. 4. As shown, a given time series is divided into an in-sample part and an out-of-sample part according to a certain ratio, for example, 80/20. The out-of-sample part (test data) is used to evaluate the obtained model. The in-sample part is further divided into the train data and the validation data. The validation data are used to tune the model’s hyperparameters and to evaluate a model fit on the train data. Setting aside separate validation data excludes the most recent in-sample observations from the train data, so the recent patterns in the data are not captured. One simple solution is to include the validation data in model training. However, overfitting may then occur, which usually leads to a loss of accuracy on the test data. In this study, we propose to use time series augmentation methods to avoid model overfitting and improve accuracy.

Fig. 4

An example of time series

Specifically, we utilize a time series augmentation technique to create new series with the same temporal dependencies that exist in the original series. The augmented time series is used to create a new validation set. The overall procedure of the proposed idea is illustrated in Fig. 5. The proposed model contains preprocessing and model training phases. First, in the preprocessing phase, a time series augmentation technique is applied, and then the sample generation procedure is carried out. In the modeling phase, the deep learning models are trained on the generated samples, and the best model is selected. In the model training process, we adopt the Bayesian optimization algorithm to fine-tune the hyperparameters of each model.

Fig. 5

Proposed schema

To explain our proposal, we describe its procedure using Algorithm 1. To augment a time series, we apply the method proposed in [20]. This algorithm first applies the Box–Cox transformation to the series and then decomposes the series into trend, seasonal, and remainder components using STL or Loess [41]. It then bootstraps the remainder using the moving block bootstrap (MBB) [42], adds the trend and seasonal components back to the bootstrapped remainder, and finally applies the inverse Box–Cox transformation. As illustrated in Algorithm 1, lines 1-7 show the procedure of computing bootstrapped series. In lines 8-10, the bootstrapped series are aggregated, and then, for the original series and the augmented series, instances in input–output format are created using a lag (Lag) and an output window (Output_Window). Line 11 concatenates the two validation sets. In lines 12-18, the benchmarking models are trained and evaluated, and the best model in terms of RMSE is returned.
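To make the augmentation steps more concrete, here is a simplified NumPy-only sketch of the transform, decompose, bootstrap, and back-transform pipeline. It is not the R forecast implementation used in the study. As explicit simplifications, the Box–Cox parameter `lam` is fixed rather than estimated, a centered moving average stands in for the STL/Loess decomposition (so no separate seasonal component is extracted), and the function names and default values are hypothetical.

```python
import numpy as np

def box_cox(y, lam):
    """Box-Cox transform for a strictly positive series."""
    y = np.asarray(y, dtype=float)
    return np.log(y) if lam == 0 else (y ** lam - 1.0) / lam

def inv_box_cox(z, lam):
    """Inverse Box-Cox transform; values are clipped so the result stays real."""
    z = np.asarray(z, dtype=float)
    if lam == 0:
        return np.exp(z)
    return np.maximum(lam * z + 1.0, 0.0) ** (1.0 / lam)

def moving_block_bootstrap(x, block_size, rng):
    """Resample x by concatenating randomly chosen contiguous blocks."""
    x = np.asarray(x, dtype=float)
    n = len(x)  # requires n >= block_size
    n_blocks = n // block_size + 2
    starts = rng.integers(0, n - block_size + 1, size=n_blocks)
    stacked = np.concatenate([x[s:s + block_size] for s in starts])
    offset = rng.integers(0, block_size)  # random offset, then trim to length n
    return stacked[offset:offset + n]

def augment(series, lam=0.5, trend_window=7, block_size=14, seed=0):
    """Return one bootstrapped version of `series` (trend_window must be odd)."""
    rng = np.random.default_rng(seed)
    z = box_cox(series, lam)
    pad = trend_window // 2
    # Assumption: a centered moving average replaces the STL/Loess trend estimate.
    trend = np.convolve(np.pad(z, pad, mode="edge"),
                        np.ones(trend_window) / trend_window, mode="valid")
    remainder = z - trend
    boot = moving_block_bootstrap(remainder, block_size, rng)
    return inv_box_cox(trend + boot, lam)

# Toy usage on a synthetic upward-trending series with a weekly pattern.
t = np.arange(120, dtype=float)
toy_series = 1000.0 + 10.0 * t + 50.0 * np.sin(2 * np.pi * t / 7)
augmented = augment(toy_series, seed=1)
```

Resampling blocks rather than individual points is what preserves the short-range temporal dependence of the remainder in the augmented series.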

Architecture of the utilized deep learning models

Three state-of-the-art deep learning models are employed to explore whether the forecasting performance of the proposed scheme is better than the performance of the regular approach. The list of benchmarking models along with their architectures is provided in Table 1.

Table 1

Architecture of three deep learning models used to evaluate the proposal

Benchmarking models	Architecture
LSTM	LSTM layer
	Dense
	Output
GRU	GRU layer
	Dense
	Output
CNN	1D convolution layer
	Dense
	Output

Fig. 6 illustrates the full architectures of the proposed methods. As can be observed from the figure, the dense and output layers are the same for LSTM, CNN, and GRU. Every model learns a representation (a feature vector) of the input data and feeds it into the fully connected (dense) layer; afterward, the predictions are computed using the output layer.

Fig. 6

Architecture of the utilized models

Hyperparameter selection procedure using Bayesian optimization

The choice of optimal hyperparameters is essential in obtaining a forecasting model with high accuracy [43]. Deep learning-based models usually contain several hyperparameters. Although grid search is a popular strategy for finding the optimal hyperparameters, it exhaustively evaluates all hyperparameter combinations and therefore requires considerable computational time and resources, especially for deep learning models. The main reason for using Bayesian hyperparameter optimization is that it does not evaluate all hyperparameter combinations, so less training time and fewer resources are needed. Bayesian hyperparameter optimization uses Bayesian models based on Gaussian processes to predict good tuning parameters [43]. The study of Wu et al. [43] indicated that the Bayesian optimization-based method could find the optimal hyperparameters for popular machine learning algorithms. In line with [11, 12, 43], the Bayesian optimization technique [44, 45] is used to tune the hyperparameters in all of the experiments in this study. The Bayesian optimization algorithm uses the error on the validation data to determine the appropriateness of each model.

Experimental study

In this study, we use the R forecast package (version 8.14) to generate the augmented version of each time series. The deep learning models are implemented with Keras [46], the Python deep learning library.

Dataset

The Humanitarian Data Exchange (HDX) [47] is the source of the data utilized in this study. The Coronavirus Resource Center at Johns Hopkins University has compiled and released a credible source of COVID-19 reported cases on HDX so that scientists can model the disease’s spread and conduct data analysis [48]. The dataset contains the daily record of confirmed cases in time series format, including temporal patterns. In this study, the experiments were conducted using the time series data for the ten countries with the highest number of confirmed cases from January 20, 2020, to March 28, 2021. These countries are the USA, Brazil, India, France, Russia, the UK, Italy, Spain, Turkey, and Germany. The last 28 days of each series are used as the test set, and the remaining days are used as the training data. The validation set has the same size as the test set (28 days).

Statistical properties of the data

In Tables 2 and 3, we report statistical properties of the daily data for the aforementioned ten countries with the highest number of COVID-19 cases to better interpret the dataset.

Table 2

Statistical properties of the daily data of the COVID-19 cases for the USA, Brazil, India, France, and the UK

Country	USA	Brazil	India	France	UK
Sample size	432	397	424	430	423
Mean	70,052	31,574	28,395	11,088	10,277
Median	47,043	28,629	18,537	4,321.5	4,329
Mode	0	0	0	0	0
Standard deviation	68,074.5	23,178.8	27,378.6	14,657.6	13,657.4
Skewness	1.31	0.45	0.83	2.24	1.88
Standard error of skewness	0.12	0.12	0.12	0.12	0.12
$Z_{Skewness}$	11.15	3.68	6.98	19.01	15.82
Kurtosis	0.76	-0.53	-0.42	7.43	3.37
Standard error of kurtosis	0.23	0.24	0.24	0.24	0.24
$Z_{Kurtosis}$	3.26	-2.16	-1.78	31.60	14.21
Min	0	0	0	0	0
Max	300,416	100,158	97,894	106,091	68,192
Range	300,416	100,158	97,894	106,091	68,192

Table 3

Statistical properties of the daily data of the COVID-19 cases for Russia, Italy, Spain, Turkey, and Germany

Country	Russia	Italy	Spain	Turkey	Germany
Sample size	423	423	422	383	427
Mean	10,566	8,351	7,965	8,376	6,521
Median	8,764	2,843	1,931	2,026	1,898
Mode	0	0	0	987	0
Standard deviation	8,348.8	9,959.9	12,779.9	42,585.6	8,722.4
Skewness	0.68	1.16	2.85	18.45	1.76
Standard error of skewness	0.12	0.12	0.12	0.13	0.12
$Z_{Skewness}$	5.75	9.77	23.97	147.56	14.88
Kurtosis	-0.57	0.41	11.14	353.45	2.90
Standard error of kurtosis	0.24	0.24	0.24	0.25	0.24
$Z_{Kurtosis}$	-2.41	1.74	47.00	1,419.46	12.29
Min	0	0	0	0	0
Max	29,499	40,902	93,822	823,225	49,044
Range	29,499	40,902	93,822	823,225	49,044

The sample size refers to the number of observations included in the experiment for each country; it is not necessarily equal across countries because it is counted from the day the first COVID-19 cases were reported. The mean, or average, is the best-known measure of central tendency and is equal to the sum of all the values in the dataset divided by the number of observations. It is worth noting that the total number of cases during the study period can be obtained by multiplying the sample size by the sample mean. Besides the mean, two other measures of central tendency are the median and the mode. The median is the middle value of the dataset after it has been arranged in order of magnitude. An essential property of the median is that it is robust, i.e., less affected by outliers and skewed data. The mode, on the other hand, is the most frequent number of daily cases in our dataset. It does not give a fair measure of central tendency when compared to the median and mean [50]. The obtained mode for most of the countries is zero. The reason could be days without any new cases or failures to report cases on holidays or weekends. The standard deviation, the square root of the sample variance, measures the amount of variation or dispersion of the dataset in the same unit as the mean. A small standard deviation implies that the values tend to be close to the mean of the dataset, whereas a high standard deviation suggests that the values are spread out over a wider range [51]. Besides standard deviation, two measures of distribution shape are (1) skewness, which measures the degree of asymmetry, and (2) kurtosis, which measures the heaviness of the distribution tails, also known as the “tailedness” or the “peakedness.” For a country dataset with one mode (unimodal), a positive skewness shows that the data are asymmetric and skewed to the right, a negative skewness shows that the data are asymmetric and skewed to the left, and a symmetric dataset always has zero skewness. To provide a comparison to the normal distribution, it is common to use an adjusted version known as the excess kurtosis, which is the kurtosis minus 3. A dataset with zero excess kurtosis is called “mesokurtic”; one with a positive excess kurtosis is called “leptokurtic,” indicating heavy tails with large outliers; and one with a negative excess kurtosis is called “platykurtic,” indicating a flatter peak and lighter tails [52]. A Z-score for skewness and kurtosis can be obtained by dividing the skewness or excess kurtosis value by its standard error; these are shown as $Z_{Skewness}$ and $Z_{Kurtosis}$ in Tables 2 and 3. As the studied sample size of the countries is large, either an absolute skewness larger than 2 or an absolute kurtosis larger than 7 can be used as a reference value for determining substantial non-normality [53]. It is worth mentioning that the models utilized in this study are based on neural networks and deep learning. These methods are nonparametric and model the data without prior assumptions about their distribution [54]. Finally, the range for each country is the difference between the dataset’s largest and smallest observations, which expresses the dataset’s dispersion.
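The shape statistics and their Z-scores can be sketched in plain Python as follows. The text does not state which estimators were used, so the bias-adjusted sample skewness and excess kurtosis and the usual large-sample standard-error formulas are assumed here; all function names are hypothetical. Under that assumption, a sample size of 432 gives standard errors that round to the 0.12 and 0.23 listed for the USA in Table 2.

```python
import math

def _central_moment(values, order):
    """Central moment of the given order (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** order for v in values) / n

def sample_skewness(values):
    """Bias-adjusted sample skewness (assumed estimator)."""
    n = len(values)
    g1 = _central_moment(values, 3) / _central_moment(values, 2) ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1

def sample_excess_kurtosis(values):
    """Bias-adjusted sample excess kurtosis (assumed estimator)."""
    n = len(values)
    g2 = _central_moment(values, 4) / _central_moment(values, 2) ** 2 - 3.0
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6.0)

def se_skewness(n):
    """Standard error of skewness for sample size n."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def se_kurtosis(n):
    """Standard error of excess kurtosis for sample size n."""
    return 2.0 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))

# Z-score for the USA skewness reported in Table 2 (skewness 1.31, n = 432).
z_skewness_usa = 1.31 / se_skewness(432)
```

Because the tabulated skewness and kurtosis values are rounded, Z-scores recomputed this way can differ slightly in the last digit from those in Tables 2 and 3.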

Measures of evaluation

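The two forecasting performance measures used in the comparison are (1) the symmetric mean absolute percentage error (SMAPE), which is defined as

$$\mathrm{SMAPE} = \frac{1}{n}\sum_{t=1}^{n}\frac{|\hat{y}_t - y_t|}{(|\hat{y}_t| + |y_t|)/2}$$

and (2) the root mean square error (RMSE), which is obtained by

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}(\hat{y}_t - y_t)^2}$$

where $\hat{y}_t$ and $y_t$ are the predicted and observed values at time point $t$, respectively, and $n$ is the number of forecasted points.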
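A minimal NumPy sketch of the two measures is given below. Whether the reported SMAPE values are scaled by 100 is not stated here, so the function returns the unscaled ratio by default and offers a hypothetical `as_percent` flag; treating a point where both the prediction and the observation are zero as zero error is also an assumption of this sketch.

```python
import numpy as np

def smape(y_true, y_pred, as_percent=False):
    """Symmetric mean absolute percentage error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2.0
    # Assumption: a term with zero denominator (both values zero) contributes 0.
    ratio = np.divide(np.abs(y_pred - y_true), denom,
                      out=np.zeros_like(denom), where=denom != 0)
    value = float(ratio.mean())
    return 100.0 * value if as_percent else value

def rmse(y_true, y_pred):
    """Root mean square error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

# Toy usage on a two-point forecast.
example_smape = smape([100, 200], [110, 180])
example_rmse = rmse([100, 200], [110, 180])
```

SMAPE is scale-free, which makes it comparable across countries, whereas RMSE is expressed in the unit of the series and grows with the size of the case counts.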

Hyperparameter selection

Table 4 lists the search range of all hyperparameters utilized in the implemented models. The lag hyperparameter, which is used to transform an input time series into samples suitable for deep learning techniques, has a significant impact on obtaining models that can forecast future values with minimum error [49]. Another important hyperparameter is the learning rate, which regulates how the weights are adjusted during model training. Additionally, the models utilized throughout this study have different key hyperparameters that influence the forecasting accuracy of the obtained models. The hyperparameters specific to each model are provided in Table 4. Also, as outlined in the previous section (see Table 1), each of the utilized deep learning techniques contains a dense layer that follows the sequence-capturing layer (e.g., LSTM, CNN, or GRU) and an output layer, which produces the outputs. These layers are common to all the utilized models, and their ranges are also given in Table 4.

Table 4

Range of the hyperparameters

Hyperparameter	Range
Common hyperparameters	Lag: [10, 11, 12, 13, 14, 15]
	Learning rate: [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05]
	Dense activation function: [ReLU, Linear]
	Output activation function: [ReLU, Linear]
LSTM	Activation function: [ReLU, Linear]
	Dropout rate: [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
	Number of units: [4, 8, 16, 32, 64, 128]
CNN_FE	Kernel size: [2, 3, 4]
	Number of filters: [32, 64, 128, 256]
GRU	Activation function: [ReLU, Linear]
	Number of units: [4, 8, 16, 32, 64, 128]
	Dropout rate: [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
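To give a feel for why an exhaustive grid search is costly here, the following sketch counts the combinations implied by the ranges in Table 4, assuming the grid consists of exactly the listed values. The dictionary keys are illustrative names, not identifiers from the original implementation.

```python
from math import prod

# Ranges copied from Table 4; the key names are hypothetical.
common = {
    "lag": [10, 11, 12, 13, 14, 15],
    "learning_rate": [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05],
    "dense_activation": ["relu", "linear"],
    "output_activation": ["relu", "linear"],
}
model_specific = {
    "LSTM": {
        "activation": ["relu", "linear"],
        "dropout_rate": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
        "units": [4, 8, 16, 32, 64, 128],
    },
    "CNN": {
        "kernel_size": [2, 3, 4],
        "filters": [32, 64, 128, 256],
    },
    "GRU": {
        "activation": ["relu", "linear"],
        "units": [4, 8, 16, 32, 64, 128],
        "dropout_rate": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
    },
}

def grid_size(space):
    """Number of combinations in a full grid over `space`."""
    return prod(len(values) for values in space.values())

grid_sizes = {name: grid_size({**common, **specific})
              for name, specific in model_specific.items()}
```

Under these assumptions, a full grid requires 10,368 training runs for each of the LSTM and GRU models and 1,728 for the CNN model, per country and per repetition.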


Data preprocessing

According to the methodology shown in Fig. 5 and the procedure described in Algorithm 1, a new time series is first generated via augmentation. Next, the original and the augmented series are transformed into samples in input–output format. Then, the resulting samples are split into a train set, a validation set, and a test set following the holdout procedure. Finally, a new validation set is created by concatenating the validation samples corresponding to the original series and the augmented one. It should be noted that this study adopts multi-output forecasting, and the sample generation process is performed using the lag (size of the input window) and the output window. In all experiments, the output window is set to 7 days. Following the multi-output forecasting strategy, for a time series $t_1, t_2, \ldots, t_n$ with Lag $L = 5$ and Output_window $O = 2$, the created instances are shown in Table 5.

Table 5

An example of the sample generation function

Input	Output
$t_1, t_2, t_3, t_4, t_5$	$t_6, t_7$
$t_2, t_3, t_4, t_5, t_6$	$t_7, t_8$
$t_3, t_4, t_5, t_6, t_7$	$t_8, t_9$
...	...
$t_{n-(L+O)+1}, t_{n-(L+O)+2}, t_{n-(L+O)+3}, t_{n-(L+O)+4}, t_{n-(L+O)+5}$	$t_{n-1}, t_{n}$
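The sample generation illustrated in Table 5 amounts to sliding a window of length lag plus output window over the series one step at a time. A minimal NumPy sketch follows; the function name `make_samples` is hypothetical.

```python
import numpy as np

def make_samples(series, lag, output_window):
    """Turn a 1-D series into (input, output) pairs for multi-output forecasting."""
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - (lag + output_window) + 1
    if n_samples <= 0:
        raise ValueError("series is shorter than lag + output_window")
    # Row s holds `lag` consecutive values; the matching output holds the next values.
    X = np.array([series[s:s + lag] for s in range(n_samples)])
    Y = np.array([series[s + lag:s + lag + output_window] for s in range(n_samples)])
    return X, Y

# Toy usage mirroring Table 5: a series of 9 points, lag 5, output window 2.
X, Y = make_samples(np.arange(1, 10), lag=5, output_window=2)
```

A series of length n yields n - (lag + output window) + 1 samples, which is why short series produce few training instances.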

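The sample generation shown in the table can be sketched as a sliding window. This is a minimal illustration, not the authors' code: the function name `make_samples` and its parameters are hypothetical, and the input length of 5 and output length of 2 are read off the table rows.

```python
from typing import List, Sequence, Tuple


def make_samples(
    series: Sequence[float], input_len: int = 5, output_len: int = 2
) -> List[Tuple[List[float], List[float]]]:
    """Slide a window of input_len + output_len over the series, one step at a time."""
    window = input_len + output_len
    samples = []
    for start in range(len(series) - window + 1):
        inputs = list(series[start:start + input_len])
        outputs = list(series[start + input_len:start + window])
        samples.append((inputs, outputs))
    return samples


# Toy series t_1..t_9, represented by the indices 1..9.
toy = list(range(1, 10))
samples = make_samples(toy)
for inputs, outputs in samples:
    print(inputs, "->", outputs)
```

With n = 9 observations the window fits three times, and the last sample ends at the final two observations, as in the last row of the table.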

Evaluation of the effectiveness of the proposed approach

In this section, we investigate whether the proposed approach improves the forecasting accuracy of deep learning models based on LSTM, CNN, and GRU. We run our experiments on the data of the aforementioned ten countries. All experiments are repeated ten times, and the average performance measures are reported. Table 6 shows the results obtained using the deep learning model based on LSTM. As can be seen, the model obtained using the proposed approach (LSTM_Aug) yields a lower error in terms of both SMAPE and RMSE for eight countries: the USA, Brazil, India, France, Russia, the UK, Spain, and Turkey. The regular LSTM is better only for Italy and Germany. The mean SMAPE for LSTM_Aug is 0.82, compared with 1.31 for LSTM. Likewise, the mean RMSE for LSTM_Aug (72713.56) is considerably lower than that of LSTM (120057.52). These results indicate that the proposed approach substantially improves the performance of LSTM.
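For readers who want to reproduce the two error measures, the following sketch shows one common way to compute them. It is an assumption that this matches the paper's definitions: SMAPE has several variants, and the percent-scaled form below (absolute error divided by the mean of the absolute actual and forecasted values) is only one of them. The function names are illustrative.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Root mean square error between two equally long sequences."""
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def smape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Symmetric MAPE in percent (assumed variant).

    Each term is |a - f| / ((|a| + |f|) / 2); a term is undefined when both
    values are zero, which does not occur for cumulative case counts.
    """
    terms = [abs(a - f) / ((abs(a) + abs(f)) / 2) for a, f in zip(actual, forecast)]
    return 100 * sum(terms) / len(terms)


# Toy values, not data from the study.
actual = [100.0, 200.0]
forecast = [110.0, 190.0]
print(f"RMSE:  {rmse(actual, forecast):.2f}")
print(f"SMAPE: {smape(actual, forecast):.2f}%")
```

RMSE is expressed in the units of the series (number of cases), which is why it differs by orders of magnitude between countries, whereas SMAPE is scale-free and can be averaged across countries more meaningfully.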

Table 6

LSTM results for ten countries in terms of SMAPE and RMSE for regular and augmentation approaches—LSTM_Aug is obtained following the proposed approach

Country	SMAPE (LSTM_Aug)	SMAPE (LSTM)	RMSE (LSTM_Aug)	RMSE (LSTM)
USA	0.87	1.73	291935.32	570575.55
Brazil	0.62	0.76	83105.03	101239.26
India	0.50	0.77	81741.89	112064.7
France	0.60	0.66	40719.57	45009.65
Russia	0.86	1.23	41813.95	59394.68
UK	0.46	2.32	22279.42	106723.4
Italy	0.62	0.44	22976.17	18834.81
Spain	1.29	2.37	51592.85	86450.37
Turkey	1.71	2.13	66494.82	78740.49
Germany	0.73	0.65	24476.58	21542.29
Mean	0.82	1.31	72713.56	120057.52

Bold values indicate the best results


Convolution model

The results of the experiments using the deep learning model based on CNN are given in Table 7. The best values are shown in boldface. In terms of SMAPE, the CNN_Aug model performs better in 9 of the 10 countries. The same holds for RMSE, where CNN_Aug beats CNN in 9 cases; India is the only exception on both measures. To give a comprehensive report on the performance of the models, the mean SMAPE and mean RMSE are also computed. The mean SMAPE for CNN_Aug is 0.63, which is lower than that for CNN (0.73), and the mean RMSE is likewise lower for CNN_Aug (46518.71 versus 51860.22). The results indicate that CNN_Aug outperforms CNN and that the proposed data preparation strategy enhances the accuracy of CNN-based deep learning models.

Table 7

Results of CNN for ten countries in terms of SMAPE and RMSE for regular and augmentation approaches. CNN_Aug is obtained following the proposed approach

Country	SMAPE (CNN_Aug)	SMAPE (CNN)	RMSE (CNN_Aug)	RMSE (CNN)
USA	0.24	0.32	81157.55	104899.61
Brazil	0.54	0.67	73194.5	88552.5
India	0.66	0.52	102096.63	81892.72
France	0.49	0.51	30442.08	33311.58
Russia	0.32	0.48	15545.49	23084.47
UK	0.89	0.93	39566.78	41051.61
Italy	0.35	0.50	14129.16	19769.05
Spain	1.16	1.41	47599.51	54012.22
Turkey	1.04	1.22	39365.34	45970.07
Germany	0.62	0.73	22090.1	26058.33
Mean	0.63	0.73	46518.71	51860.22

Bold values indicate the best results
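The win count and the mean SMAPE quoted for Table 7 can be recomputed from the tabulated values. The sketch below is only a convenience check: the numbers are copied from the SMAPE columns of Table 7, and the variable names are illustrative.

```python
# SMAPE values copied from Table 7 as (CNN_Aug, CNN) pairs.
table7_smape = {
    "USA": (0.24, 0.32),
    "Brazil": (0.54, 0.67),
    "India": (0.66, 0.52),
    "France": (0.49, 0.51),
    "Russia": (0.32, 0.48),
    "UK": (0.89, 0.93),
    "Italy": (0.35, 0.50),
    "Spain": (1.16, 1.41),
    "Turkey": (1.04, 1.22),
    "Germany": (0.62, 0.73),
}

# Count the countries where the augmented model has the lower SMAPE.
aug_wins = sum(1 for aug, reg in table7_smape.values() if aug < reg)

# Means over the ten countries for each model.
mean_aug = sum(aug for aug, _ in table7_smape.values()) / len(table7_smape)
mean_reg = sum(reg for _, reg in table7_smape.values()) / len(table7_smape)

print(f"CNN_Aug better in {aug_wins} of {len(table7_smape)} countries")
print(f"mean SMAPE: CNN_Aug={mean_aug:.2f}, CNN={mean_reg:.2f}")
```

The same check can be repeated for Tables 6 and 8 by swapping in their columns. Small differences from the reported means are possible there because the tables list values rounded to two decimals.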


GRU model

Table 8 provides the results of the experiments using the deep learning method based on GRU. As with the previous models, we compare the model obtained using the regular experimental setting (GRU) with the model obtained using the proposed augmentation approach (GRU_Aug). GRU_Aug and GRU perform similarly: each achieves the lower error in terms of SMAPE and RMSE in 5 of the 10 countries. This can be attributed to the fact that the GRU model uses a different gating mechanism with fewer trainable parameters, which makes it less prone to overfitting.

Table 8

GRU results for ten countries in terms of SMAPE and RMSE for regular and augmentation approaches—GRU_Aug is obtained following the proposed approach

Country	SMAPE (GRU_Aug)	SMAPE (GRU)	RMSE (GRU_Aug)	RMSE (GRU)
USA	0.28	0.44	97535.03	152119.5
Brazil	0.75	0.77	99050.16	101927.57
India	0.53	0.72	83931.57	109283.11
France	0.60	0.55	40609.86	36529.07
Russia	0.54	0.93	26955.22	47059.80
UK	0.51	0.52	24700.29	24989.78
Italy	0.43	0.36	18585.69	15907.25
Spain	4.1	1.25	145384.97	50215.92
Turkey	1.64	1.48	64087.62	57313.48
Germany	0.76	0.66	25800.14	21647.34
Mean	1.01	0.77	62664.06	61699.28

Bold values indicate the best results


Overall comparison of the proposed method with regular approach

To provide an overall description of the results, we summarize the experiments and show the top augmentation model for each country in Table 9. As can be seen from the table, for all ten countries, the models based on the proposed augmentation approach show the top accuracy in terms of both SMAPE and RMSE. This demonstrates the effectiveness of the proposed approach in increasing the forecasting accuracy of the deep learning methods. CNN_Aug performs best overall and is the top model for eight countries: the USA, Brazil, France, Russia, Italy, Spain, Turkey, and Germany. LSTM_Aug achieves the best accuracy for the remaining two countries, India and the UK. As Table 9 shows, GRU_Aug is not the top model for any country.

Table 9

Top model for each country

Country	Top model
USA	CNN_Aug
Brazil	CNN_Aug
India	LSTM_Aug
France	CNN_Aug
Russia	CNN_Aug
UK	LSTM_Aug
Italy	CNN_Aug
Spain	CNN_Aug
Turkey	CNN_Aug
Germany	CNN_Aug
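As an illustration of how such a summary can be derived, the sketch below picks, for each country, the model with the lowest SMAPE among the six models in Tables 6, 7, and 8. Ranking by SMAPE alone is an assumption made for this example; the values are copied from the tables, and the variable names are illustrative.

```python
from collections import Counter

MODELS = ["LSTM_Aug", "LSTM", "CNN_Aug", "CNN", "GRU_Aug", "GRU"]

# SMAPE per country, in the order given by MODELS (Tables 6, 7, and 8).
smape_by_country = {
    "USA": [0.87, 1.73, 0.24, 0.32, 0.28, 0.44],
    "Brazil": [0.62, 0.76, 0.54, 0.67, 0.75, 0.77],
    "India": [0.50, 0.77, 0.66, 0.52, 0.53, 0.72],
    "France": [0.60, 0.66, 0.49, 0.51, 0.60, 0.55],
    "Russia": [0.86, 1.23, 0.32, 0.48, 0.54, 0.93],
    "UK": [0.46, 2.32, 0.89, 0.93, 0.51, 0.52],
    "Italy": [0.62, 0.44, 0.35, 0.50, 0.43, 0.36],
    "Spain": [1.29, 2.37, 1.16, 1.41, 4.10, 1.25],
    "Turkey": [1.71, 2.13, 1.04, 1.22, 1.64, 1.48],
    "Germany": [0.73, 0.65, 0.62, 0.73, 0.76, 0.66],
}

# For each country, select the model whose SMAPE is smallest.
top_model = {
    country: MODELS[values.index(min(values))]
    for country, values in smape_by_country.items()
}
counts = Counter(top_model.values())

for country, model in top_model.items():
    print(f"{country}\t{model}")
print(dict(counts))
```

Under this SMAPE-only ranking the selection agrees with Table 9 for every country.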


Visualizing the results

To further demonstrate the forecasting ability of the obtained models, the actual and forecasted values for each country are visualized in Figs. 7–16. In all figures, the actual values are shown in red, while the forecasts of the best deep learning model are shown in black. As Figs. 7–16 indicate, the predicted cases for the USA, Brazil, France, Russia, the UK, Italy, Turkey, and Germany are very close to the actual values, and the error is small. The two curves also overlap at some time points, which demonstrates the power of the proposed approach. The plot for India (Fig. 9) indicates that from time point 1 to time point 15 the forecasted values are very close to the real values, but after time point 15 the error increases. Furthermore, as shown in Fig. 14, the error is rather substantial at various time points for Spain. This is primarily due to the noise in the country’s input data.

Fig. 7

Actual and forecasted number of cases for test set—USA

Fig. 8

Actual and forecasted number of cases for test set—Brazil

Fig. 9

Actual and forecasted number of cases for test set—India

Fig. 10

Actual and forecasted number of cases for test set—France

Fig. 11

Actual and forecasted number of cases for test set—Russia

Fig. 12

Actual and forecasted number of cases for test set—UK

Fig. 13

Actual and forecasted number of cases for test set—Italy

Fig. 14

Actual and forecasted number of cases for test set—Spain

Fig. 15

Actual and forecasted number of cases for test set—Turkey

Fig. 16

Actual and forecasted number of cases for test set—Germany


Discussion

In this study, we proposed a method that uses augmentation techniques to enhance time series forecasting. To test the effectiveness of the proposed idea experimentally, we selected three deep learning methods: LSTM, GRU, and CNN. Furthermore, because accurate forecasting of COVID-19 infections is important, we chose the data of the ten countries with the highest numbers of infections. The experiments demonstrated that the models obtained by combining LSTM with the proposed idea greatly outperform the regular LSTM model. Similarly, the proposed method significantly improves the performance of the CNN models. For GRU, the proposed method performs on par with the regular model, with each being better in 5 of the 10 countries.

Assumptions

As in any time series forecasting task, in this study we use the past values of the series to train the models. We also assume that optimal hyperparameters have been chosen for the models used.

Implications of the results

Unlike one-step-ahead forecasting, where a forecasting model uses the previous observations to predict a single time step, the multi-step-ahead forecasting strategy [24] used in this study forecasts two or more steps. In COVID-19 forecasting, multi-step-ahead forecasting is attractive to policymakers because a longer forecasting window reveals the trend of the pandemic more effectively and is therefore more useful to governments. In terms of SMAPE, the models generated following the proposed idea perform well: the mean SMAPE values for LSTM_Aug, CNN_Aug, and GRU_Aug are 0.82, 0.63, and 1.01, respectively.

Practical implications

As mentioned previously, in this study we formulate forecasting the number of infected cases as a time series forecasting problem in which past observations of a series are used to predict future time points. The proposed models forecast the number of infected cases over a longer horizon with lower error than their regular counterparts. Governments can use the forecasts to make appropriate decisions in controlling the pandemic.

Limitations

In this study, we did not have access to other sources of information, such as the interventions implemented by each country or COVID-19 vaccination data. The models were trained using only the time series of infections. Another limitation of this study relates to hyperparameter selection for the deep learning methods. Because these methods have complex architectures, they are computationally expensive to train. Investigating every hyperparameter configuration, as the grid search method does, may therefore not be practicable. For this reason, we used the Bayesian optimization algorithm to search for the optimal hyperparameters.
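To make the cost of exhaustive search concrete, the sketch below counts the configurations in a small grid. The search space is entirely hypothetical and is not the one used in this study; applying the ten repetitions of each experiment to every grid point is likewise an assumption made for illustration.

```python
from itertools import product

# Hypothetical search space; none of these values come from the study.
grid = {
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "hidden_units": [16, 32, 64, 128],
    "num_layers": [1, 2, 3],
    "batch_size": [16, 32, 64],
    "dropout": [0.0, 0.2, 0.5],
}

# Grid search evaluates the Cartesian product of all candidate values.
configs = list(product(*grid.values()))
n_configs = len(configs)

# Assumption: every configuration is trained ten times, for each of ten countries.
repeats = 10
countries = 10
total_runs = n_configs * repeats * countries

print(f"{n_configs} configurations, {total_runs} training runs in total")
```

Even this modest grid requires tens of thousands of training runs, and the count multiplies with every additional hyperparameter. A sequential method such as Bayesian optimization instead evaluates a limited budget of configurations chosen on the basis of earlier results.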

Conclusion and future work

A new scheme based on time series augmentation was proposed in this study to improve the performance of deep learning techniques in time series forecasting. The main idea of the proposed method is to use a time series augmentation technique to create new time series with the same properties as the original series. We then use the generated series to obtain enough samples to train the deep learning methods properly. The proposed method is applied to COVID-19 time series forecasting for the 10 most affected countries using the LSTM, GRU, and CNN models. According to the findings of the experiments, in the majority of countries the LSTM_Aug model outperformed the standard LSTM model, and the CNN_Aug model performed significantly better than the regular CNN. In addition, GRU_Aug performed on par with the regular GRU. Overall, the models obtained following the proposed idea perform well and improve on their regular LSTM and CNN counterparts. As future work, we intend to evaluate the proposed method using other time series augmentation approaches, such as dynamic time warping barycenter averaging.
