Short-term prediction of COVID-19 spread using grey rolling model optimized by particle swarm optimization.

Zeynep Ceylan

Abstract

The prediction of the spread of coronavirus disease 2019 (COVID-19) is vital in taking preventive and control measures to reduce human health damage. The Grey Modeling (1,1) is a popular approach used to construct a predictive model with a small-sized data set. In this study, a hybrid model based on grey prediction and rolling mechanism optimized by particle swarm optimization algorithm (PSO) was applied to create short-term estimates of the total number of confirmed COVID-19 cases for three countries, Germany, Turkey, and the USA. A rolling mechanism that updates data in equal dimensions was applied to improve the forecasting accuracy of the models. The PSO algorithm was used to optimize the Grey Modeling parameters (1,1) to provide more robust and efficient solutions with minimum errors. To compare the accuracy of the predictive models, a nonlinear autoregressive neural network (NARNN) was also developed. According to the analysis results, Grey Rolling Modeling (1,1) optimized by PSO algorithm performs better than the classical Grey Modeling (1,1), Grey Rolling Modelling (1,1), and NARNN models for predicting the total number of confirmed COVID-19 cases. The present study can provide an important basis for countries to allocate health resources and formulate epidemic prevention policies effectively.
© 2021 Elsevier B.V. All rights reserved.

Keywords:  COVID-19; Grey modeling (1,1); NARNN; Particle swarm optimization; Prediction; Rolling mechanism

Year:  2021        PMID: 34121965      PMCID: PMC8186943          DOI: 10.1016/j.asoc.2021.107592

Source DB:  PubMed          Journal:  Appl Soft Comput        ISSN: 1568-4946            Impact factor:   6.725


Introduction

Millions of people have been infected, and many have died, as a result of the new coronavirus disease 2019 (COVID-19), which emerged at the end of 2019 and affected the whole world in a short time. As a first significant warning against the severe effects of the epidemic, the World Health Organization (WHO) declared a global pandemic on March 11, 2020, and the world was alarmed [1]. The rapid spread, high aggressiveness, and severe health impact of COVID-19 pose a significant threat to the health and safety of all humanity. While the epidemic has passed its peak in many countries, it continues to spread rapidly worldwide. Although the outbreak was brought under control in China, where it started, Europe and America became the new epidemic centres [2]. COVID-19 has become a top priority for researchers worldwide [3], [4]. In general, an epidemic follows a cycle that resembles a bell curve. At first, the epidemic spreads quickly. After a while, it reaches a plateau, and then the spread rate (number of infected people) begins to decrease. The time needed to reach the turning point depends on the dynamics of each country and may differ. The shape of the bell curve is related to the policies that countries apply against the epidemic. In countries that strictly follow the rules and intervene effectively, the curve has a broader peak with a lower height. In countries that react late or implement the wrong strategies, a sharp curve emerges, which means more infected people and greater loss of life and income. Considering the size and impact of the outbreak, effective healthcare system management is essential to provide fast and accurate treatment for people with COVID-19 symptoms. Thus, it is very important to predict the expected intensity of the epidemic and to know the number of cases in advance in order to provide adequate and timely health care.
For this purpose, various studies in the literature have used different mathematical models to predict the spread of the COVID-19 epidemic [5], [6], [7]. Grey prediction models have emerged as a rapid and straightforward modelling and forecasting tool in grey system theory. Grey Modelling (1,1), abbreviated as GM (1,1), is the simplest time-series prediction model in grey system theory. It is the most commonly used grey prediction model in the literature because of its computational efficiency and its ability to characterize an unknown system using a limited amount of data [8]. In recent years, the GM (1,1) model has shown satisfactory results in various fields such as environment [9], energy [10], manufacturing [11], transportation [12], and medicine and health [13]. As shown in Table 1, the classical GM (1,1) model and its derived models have been used successfully in medicine to predict the spread of various infectious diseases. For example, Gao et al. (2007) used the GM (1,1) model to forecast the malaria epidemic situation in the Longgang area of Shenzhen [13]. Ding et al. (2011) compared the results of the GM (1,1) model and the D-R algorithm to estimate the trend of H1N1 cases in Mainland China [14]. Ren et al. (2012) analysed tuberculosis incidence rates and deaths in HIV-negative patients in the USA and Germany using the GM (1,1) model [15]. Shen et al. (2013) utilized the grey model to examine typhoid and paratyphoid fever epidemic peaks in China [16]. Guo et al. (2014) used the linear model, the traditional GM (1,1) model, and the GM (1,1) model with the self-memory principle (SMGM(1,1)) to forecast the incidence rates of two notifiable diseases in China [17]. Zhang et al. (2014) used various grey models to forecast the incidence of Hepatitis B in Xinjiang, China [18]. Zhang et al. (2017) applied different grey models to predict the spread of human Echinococcosis in Xinjiang, China. They reported that the dynamic epidemic prediction model could identify the future tendency of the Echinococcosis outbreak [19].
Table 1

Summary of studies on the prediction of epidemic diseases using grey prediction models.

| Reference | Disease | Method(s) | Country |
|---|---|---|---|
| Gao et al. [13] | Malaria | GM (1,1) | China |
| Ding et al. [14] | H1N1 | GM (1,1), D-R algorithm | China |
| Ren et al. [15] | Tuberculosis | GM (1,1), D-R algorithm | US, Germany |
| Shen et al. [16] | TPF | GM (1,1), DGM | China |
| Guo et al. [17] | Dysentery and Gonorrhea | GM (1,1), SMGM (1,1), and Linear Model | China |
| Zhang et al. [18] | HBV | GM (1,1), GVM, NGBM (1,1), PSO-NNGBM (1,1), and HWES | China |
| Zhang et al. [19] | Echinococcosis | GM (1,1), PECGM (1,1), FGM (1,1), and SARIMA | China |
| Yang et al. [20] | TPF | GM (1,1) | China |
| Wang et al. [21] | HBV | GM (1,1), ARIMA | China |
| Gao et al. [22] | TPF | GM (1,1), SARIMA | China |
| Şahin and Şahin [23] | COVID-19 | GM (1,1), NGBM (1,1), and FANGBM (1,1) | Italy, UK, and the USA |
| Luo et al. [24] | COVID-19 | GM (1,1), GVM, ARGM (1,1), ONGM (1,1), ENGM (1,1), ARIMA, NGBM (1,1), GRM (1,1), and GERM (1,1,e^{at}) | China, Italy, Britain, and Russia |
| Zhao et al. [25] | COVID-19 | Rolling-GVM | China |
| This study | COVID-19 | GM (1,1), Rolling-GM (1,1), Rolling-PSO-GM (1,1), and NARNN | Germany, Turkey, and the USA |

H1N1: Influenza A Virus Subtype; HBV: Hepatitis B Virus; TPF: Typhoid and Paratyphoid Fevers; DGM: Discrete Grey Model; SMGM(1,1): GM(1,1) model with self-memory principle; GVM: Grey Verhulst Model; NGBM(1,1): Nonlinear Grey Bernoulli Model; PSO-NNGBM(1,1): Nash Nonlinear Grey Bernoulli Model Optimized by Particle Swarm Optimization; HWES: Holt–Winters Exponential Smoothing; PECGM(1,1): Grey-Periodic Extensional Combinatorial Model; FGM(1,1): Modified Grey Model using Fourier Series; ARIMA: Autoregressive Integrated Moving Average; SARIMA: Seasonal Autoregressive Integrated Moving Average; FANGBM(1,1): Fractional Nonlinear Grey Bernoulli Model; ARGM(1,1): Autoregressive Grey Model; ONGM(1,1): Optimized NGM(1,1,k,c) Model; ENGM(1,1): Exact Nonhomogeneous Grey Model; GRM(1,1): Grey Richards Model; GERM(1,1,e^{at}): Grey Extend Richards Model; Rolling-GVM: Grey Verhulst Models with a Rolling Mechanism; NARNN: Nonlinear Autoregressive Neural Network; Rolling-GM(1,1): GM(1,1) Model with a Rolling Mechanism; Rolling-PSO-GM(1,1): Grey Modelling (1,1) Optimized by Particle Swarm Optimization with a Rolling Mechanism; COVID-19: Coronavirus Disease 2019.

Yang et al. (2018) applied the GM (1,1) model to predict the incidence trend of typhoid and paratyphoid fevers in Wuhan, China, to help decision-makers take prevention and control measures [20]. Wang et al. (2018) compared the success of the GM (1,1) model with the Autoregressive Integrated Moving Average (ARIMA) model for the prediction of hepatitis B in China [21]. Gao et al. (2020) analysed the long-term cumulative incidence of both typhoid and paratyphoid fevers in China using both GM (1,1) and Seasonal Autoregressive Integrated Moving Average (SARIMA) models [22]. Şahin and Şahin (2020) [23], Luo et al. (2020) [24], and Zhao et al. (2020) [25] studied the prediction of cumulative COVID-19 cases using grey prediction models.
As shown in Table 1, the grey prediction model is a common tool for predicting the spread of different epidemics. This can be attributed to the structure of the GM (1,1) model, which works efficiently with little data. Despite its wide use, the prediction efficiency of the traditional GM (1,1) model is still being improved [26]. The rolling mechanism developed by Akay and Atak (2007) is one of the most effective methods used to increase the predictive ability of the GM (1,1) model [27]. It relies on the most recent data to handle noisy sequences. Thus, the rolling mechanism was used in this study, as recent data represent the latest epidemiological trend and characteristics of the epidemic. In addition, the classical GM (1,1) parameters were optimized using Particle Swarm Optimization (PSO), a meta-heuristic algorithm inspired by nature. The main reason for using the PSO algorithm is that its quick convergence and its ability to find optimal solutions in a reasonable time give the Rolling-GM (1,1) model an advantage [28]. The prediction capability of the grey rolling model optimized by the PSO algorithm, abbreviated as Rolling-PSO-GM (1,1), was also compared with common prediction models such as the Nonlinear Autoregressive Neural Network (NARNN), GM (1,1), and Rolling-GM (1,1). The COVID-19 epidemic has reached a peak in many countries, and much more is known about the virus now than at the beginning of the outbreak. For this reason, forecasting studies based on data from the onset of the epidemic need to be updated and compared. This study aims to accurately estimate the total number of confirmed COVID-19 cases while the epidemic is still ongoing. To the best of the author’s knowledge, the Rolling-PSO-GM (1,1) model is used here for the first time to estimate the number of COVID-19 cases. Thus, this study is expected to contribute to the literature in this respect. By presenting a case study on Turkey, Germany, and the USA, this study can serve as a promising alternative and a guide for estimating the number of COVID-19 cases in other countries. Furthermore, the results are expected to provide governments with effective guidance in the decision-making process for the prevention and control of epidemics.

Material and methods

Data collection

In this study, the total confirmed cases of COVID-19 in three countries, Germany, Turkey, and the USA, were analysed. The COVID-19 data were taken from the Johns Hopkins University website (https://github.com/CSSEGISandData/COVID-19). The 40-day dataset was divided into two parts: the in-sample dataset and the out-of-sample dataset. Data from April 26, 2020, to May 30, 2020, were used as the in-sample dataset for model construction. Data from May 31, 2020, to June 4, 2020, were used as the out-of-sample dataset to determine how well the developed models perform on new data. All analyses were performed using MATLAB version 2019b software.
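The split described above can be checked with a few lines of date arithmetic. This is only an illustrative Python sketch (the study itself used MATLAB); the constant and function names are hypothetical, and the out-of-sample start of May 31 is an assumption taken from the period stated in Table 5.

```python
from datetime import date, timedelta

# Study window as described in the text. The out-of-sample start (May 31)
# is an assumption based on the period given in Table 5.
IN_SAMPLE_START = date(2020, 4, 26)
IN_SAMPLE_END = date(2020, 5, 30)
OUT_SAMPLE_START = date(2020, 5, 31)
OUT_SAMPLE_END = date(2020, 6, 4)


def days_inclusive(start, end):
    """Return every calendar day from start to end, both ends included."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


in_sample_days = days_inclusive(IN_SAMPLE_START, IN_SAMPLE_END)
out_sample_days = days_inclusive(OUT_SAMPLE_START, OUT_SAMPLE_END)

print(len(in_sample_days), len(out_sample_days))  # 35 5
```

The two windows are contiguous and together cover the 40 days mentioned in the text.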

Nonlinear autoregressive neural network

Linear models are poorly suited to this problem because most time-series data exhibit high variability and transient behaviour [29]. To overcome the limitations of linear models, nonlinear approaches that recognize time-series patterns and the nonlinear characteristics of the data are required. Artificial Intelligence (AI) models are frequently used in the literature because they adapt quickly to changes in the system and can successfully predict nonlinear problems [30]. The Nonlinear Autoregressive Neural Network (NARNN) is an AI prediction method that predicts the next values of a one-dimensional series from its historical values [31]. It is commonly used in different fields such as wind forecasting [32], global solar radiation forecasting [33], disease prevalence prediction [29], and power prediction [34]. In this study, the NARNN model was built and used for short-term prediction of COVID-19 spread in Germany, Turkey, and the USA. The NARNN model can be expressed mathematically as follows [35]:

$$y(t) = f\big(y(t-1), y(t-2), \ldots, y(t-d)\big) \tag{1}$$

where $y(t)$ is the current response, $f$ is the nonlinear function, which is approximated during the training stage of the network by calculating the optimal weights of the network and the corresponding bias, $y(t-1), y(t-2), \ldots, y(t-d)$ are the historical responses, and $d$ is the time delay parameter. In the NARNN model, a closed-loop network is used to perform a multi-step prediction. The output of the closed-loop NARNN model is expressed as follows [35]:

$$\hat{y}(t+p) = f\big(y(t-1), y(t-2), \ldots, y(t-d)\big) \tag{2}$$

where $p$ is the number of forecast steps in the future. The basic NARNN framework is shown in Fig. 1.
Fig. 1

The structure of the NARNN model.

The NARNN model has a structure consisting of an input layer, an output layer, and several hidden layers. There are two parameters of the model that need to be specified, i.e., delay order and the number of hidden layer nodes.
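To make the role of the delay order concrete, the following Python sketch shows how a one-dimensional series can be arranged into lagged inputs and how a closed-loop forecast feeds each prediction back as an input. It is a simplified illustration rather than the MATLAB toolbox network used in the study: `make_lagged`, `closed_loop_forecast`, and `toy_f` are hypothetical names, and `toy_f` is a stand-in for the trained nonlinear function.

```python
def make_lagged(series, d):
    """Pair each value y(t) with its d predecessors [y(t-1), ..., y(t-d)]."""
    inputs, targets = [], []
    for t in range(d, len(series)):
        inputs.append([series[t - j] for j in range(1, d + 1)])
        targets.append(series[t])
    return inputs, targets


def closed_loop_forecast(f, history, d, steps):
    """Multi-step forecast in which each prediction is fed back as an input."""
    window = list(history[-d:])
    forecasts = []
    for _ in range(steps):
        lags = window[::-1]              # most recent first: y(t-1), ..., y(t-d)
        y_next = f(lags)
        forecasts.append(y_next)
        window = window[1:] + [y_next]   # drop the oldest value, append the forecast
    return forecasts


def toy_f(lags):
    """Hypothetical stand-in for the trained network: linear extrapolation."""
    return 2 * lags[0] - lags[1]


inputs, targets = make_lagged([1, 2, 3, 4, 5, 6], d=4)
print(inputs, targets)                                         # [[4, 3, 2, 1], [5, 4, 3, 2]] [5, 6]
print(closed_loop_forecast(toy_f, [1, 2, 3, 4], d=4, steps=3))  # [5, 6, 7]
```

In an actual NARNN, `toy_f` would be replaced by the trained network, and the number of hidden nodes would be chosen alongside the delay order.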

Grey prediction model with rolling mechanism

The grey system theory (GST), which deals with systems having uncertain and incomplete information, was proposed by Julong Deng in the 1980s [36]. Grey prediction in the GST is used to investigate a large amount of unknown information using a small amount of information in a system containing incomplete data [37], [38]. In the grey prediction model GM (n,m), n indicates the order of the differential equations, and m indicates the number of variables. Because it is easy to calculate, the most commonly used model is the GM (1,1) model, which represents the first-order model with a single variable. The GM (1,1) model has many advantages compared to traditional prediction methods because it does not require knowing whether the prediction variables fit the normal distribution, and it does not need many statistical samples [20], [39]. Therefore, the GM (1,1) model has been used successfully for prediction problems in many disciplines [40]. Grey prediction has three basic operations: the accumulated generating operator (AGO), the inverse accumulating operator (IAGO), and the grey model (GM). The steps of the GM (1,1) model are as follows [41].

Step 1. The non-negative raw sequence with n samples is presented in Eq. (3):

$$X^{(0)} = \left\{x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n)\right\} \tag{3}$$

A monotonically increasing series is generated by using a one-time accumulating generation operation (1-AGO):

$$X^{(1)} = \left\{x^{(1)}(1), x^{(1)}(2), \ldots, x^{(1)}(n)\right\} \tag{4}$$

where

$$x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i), \quad k = 1, 2, \ldots, n \tag{5}$$

Step 2. A first-order grey differential equation is formed to obtain the GM (1,1) model:

$$x^{(0)}(k) + a z^{(1)}(k) = b \tag{6}$$

where

$$z^{(1)}(k) = \alpha x^{(1)}(k) + (1-\alpha) x^{(1)}(k-1), \quad k = 2, 3, \ldots, n \tag{7}$$

Here $a$ and $b$ are called the developing coefficient and the driving coefficient, respectively. $\alpha$ is a dynamic parameter in practical applications and is taken as 0.5 in the original GM (1,1) model. $a$ and $b$ are the two parameters of the GM (1,1) model and can be estimated using the least-squares method:

$$[a, b]^{T} = \left(B^{T}B\right)^{-1}B^{T}Y \tag{8}$$

where $Y$ is the constant vector, and $B$ is the accumulated matrix:

$$Y = \begin{bmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(n) \end{bmatrix}, \qquad B = \begin{bmatrix} -z^{(1)}(2) & 1 \\ -z^{(1)}(3) & 1 \\ \vdots & \vdots \\ -z^{(1)}(n) & 1 \end{bmatrix} \tag{9}$$

Step 3. After calculating the $a$ and $b$ coefficients, the GM (1,1) model can be established by solving the differential equation, which gives Eq. (10), where the initial condition is taken as $\hat{x}^{(1)}(1) = x^{(0)}(1)$:

$$\hat{x}^{(1)}(k+1) = \left(x^{(0)}(1) - \frac{b}{a}\right)e^{-ak} + \frac{b}{a}, \qquad \hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k) \tag{10}$$

Typically, in the original GM (1,1) model, all data are used for prediction. However, in the case of chaotic data, it is recommended to use the latest data to improve the prediction accuracy of the GM (1,1) model. To achieve this, the grey prediction with a rolling mechanism (Rolling-GM (1,1)) model is used, which relies on the most recent data and removes the oldest data at each rolling step [27]. In the Rolling-GM (1,1) model, $x^{(0)}(k+1)$ is forecasted by applying the original GM (1,1) model to $X^{(0)} = \left\{x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(k)\right\}$, where $k < n$. After the result is found, the forecasting procedure is repeated, but the newly estimated entry $x^{(0)}(k+1)$ is added at the end of the sequence, and the oldest entry $x^{(0)}(1)$ is removed. Next, $\left\{x^{(0)}(2), x^{(0)}(3), \ldots, x^{(0)}(k+1)\right\}$ is used to predict $x^{(0)}(k+2)$. Fig. 2 shows the flow chart of the Rolling-GM (1,1) model for this study [42].
Fig. 2

Forecasting procedure of the Rolling-GM (1,1) model for this study.
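The GM (1,1) steps and the rolling mechanism can be sketched in a few lines of Python. This is an illustrative reimplementation under stated assumptions, not the author's MATLAB code: the function names are hypothetical, the least-squares estimate is written in its closed two-parameter form, the standard time-response solution and IAGO are used for the one-step forecast, and the rolling loop slides over observed values with a window of four days (the window size used in the Results section).

```python
import math


def gm11_fit(x0, alpha=0.5):
    """Least-squares estimate of (a, b) in x0(k) + a*z1(k) = b."""
    x1, total = [], 0.0
    for value in x0:                       # 1-AGO: cumulative sums
        total += value
        x1.append(total)
    z = [alpha * x1[k] + (1 - alpha) * x1[k - 1] for k in range(1, len(x0))]
    y = x0[1:]
    n = len(z)
    z_mean, y_mean = sum(z) / n, sum(y) / n
    sxx = sum((zi - z_mean) ** 2 for zi in z)
    sxy = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
    a = -sxy / sxx                         # regressing y on z gives slope -a
    b = y_mean + a * z_mean                # intercept of the same regression
    return a, b


def gm11_predict_next(x0, alpha=0.5):
    """One-step-ahead forecast of the original series (assumes a != 0)."""
    a, b = gm11_fit(x0, alpha)
    c = x0[0] - b / a

    def x1_hat(k):                         # accumulated forecast; k = 0 is the first point
        return c * math.exp(-a * k) + b / a

    n = len(x0)
    return x1_hat(n) - x1_hat(n - 1)       # IAGO: difference of accumulated values


def rolling_gm11(series, window=4):
    """Forecast each point from the `window` observations that precede it."""
    return [gm11_predict_next(series[t - window:t])
            for t in range(window, len(series) + 1)]


# Cumulative confirmed cases in Germany, 26-29 April 2020 (Table 4).
germany = [157770, 158758, 159912, 161539]
print(round(gm11_predict_next(germany)))   # 162871
```

For this window the sketch reproduces the Rolling-GM (1,1) value listed for 30 April in Table 4 (162,871). Calling `rolling_gm11` on a longer series returns one forecast per window position, the last of which lies one step beyond the data.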

Parameter optimization

The prediction performance of the GM (1,1) model depends on the two parameters calculated in Eq. (8), namely the developing coefficient $a$ and the driving coefficient $b$. In the classical GM (1,1) model, these parameters are calculated by the least-squares estimation method. However, both parameters can also be obtained by intelligent optimization algorithms, which can improve the predictive performance of the classical GM (1,1) model. In this study, the optimal values of the $a$ and $b$ coefficients in each rolling period were calculated using the PSO algorithm to increase the prediction accuracy of the COVID-19 spread. The optimization process of the $a$ and $b$ parameters of the GM (1,1) model by the PSO algorithm is shown in Fig. 3.
Fig. 3

Flowchart of GM (1,1) model optimized by PSO algorithm.


Particle swarm optimization

Particle Swarm Optimization (PSO), initially proposed by Kennedy and Eberhart [43], is a population-based optimization algorithm inspired by the social behaviour of flocking birds and schooling fish. The PSO algorithm has been successfully applied to a variety of real-world applications because of its computational efficiency in solving complex optimization problems and its rapid convergence to a reasonably good solution [44], [45]. In the PSO algorithm, the system starts with a population of random potential solutions, and m particles are generated randomly in the D-dimensional solution space. $X_i = (x_{i1}, x_{i2}, \ldots, x_{iD})$ and $V_i = (v_{i1}, v_{i2}, \ldots, v_{iD})$, where $i = 1, 2, \ldots, m$, represent the position (direction) and velocity of the particles, respectively. Each position $X_i$ in the flock is scored according to how well it solves the problem. The personal best (pbest) is the local best of the current generation for each particle, whereas the global best (gbest) is the best among the local bests in the current generation. Each particle moves in the direction of its previous best ($pbest$) and the global best ($gbest$) position in the swarm with a certain velocity to find the gbest position. Eqs. (11), (12) show how a particle's velocity and position are updated [46]:

$$v_{id}^{t+1} = w\,v_{id}^{t} + c_1 r_1 \left(pbest_{id}^{t} - x_{id}^{t}\right) + c_2 r_2 \left(gbest_{d}^{t} - x_{id}^{t}\right) \tag{11}$$

$$x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1} \tag{12}$$

where $w$ represents the inertia weight used to balance the local and global search capabilities of the algorithm, and $c_1$ and $c_2$ are the acceleration coefficients representing learning behaviour: $c_1$ enables the particle to benefit from its own experience, and $c_2$ provides the experience of the partner particles in the flock. $r_1$ and $r_2$ are uniform random numbers ranging from 0 to 1, $t$ refers to the number of iterations, $pbest_{id}$ is the personal best position of particle $i$ in the $d$th dimension, and $gbest_{d}$ is the gbest position achieved so far in the flock. In the rest of this study, the developed model is abbreviated as Rolling-PSO-GM (1,1). The details of the PSO algorithm are given as follows.
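The velocity and position updates in Eqs. (11) and (12) can be illustrated with a short Python sketch. It is not the author's MATLAB implementation, and several choices are assumptions made only for illustration: the function name `pso_minimize`, a linearly decreasing inertia weight between the maximum and minimum values, zero initial velocities, clamping of positions to the search box, a reduced iteration count, and a toy quadratic objective in place of the forecasting error of the grey model.

```python
import random


def pso_minimize(objective, bounds, n_particles=70, n_iter=200,
                 w_max=0.8, w_min=0.1, c1=1.0, c2=1.0, v_max=2.0, seed=0):
    """Minimise `objective` over the box `bounds` with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for t in range(n_iter):
        # Assumed schedule: inertia weight decreases linearly from w_max to w_min.
        w = w_max - (w_max - w_min) * t / max(n_iter - 1, 1)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))        # velocity clamp
                lo, hi = bounds[d]
                pos[i][d] = max(lo, min(hi, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:                            # update personal best
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:                           # update global best
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def bowl(p):
    """Toy objective (hypothetical): quadratic bowl with its minimum at (1, -2)."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best, best_val = pso_minimize(bowl, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
print(best, best_val)
```

To tune a grey model in this way, the objective would take a candidate pair (a, b) and return a forecasting error for the current rolling window; which error measure the study minimised is not stated in this section.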

Performance evaluation metrics

Prediction accuracy is an important criterion for evaluating the performance of a forecasting model. In this study, three metrics were calculated to compare the accuracy and reliability of the models: mean absolute deviation (MAD), root mean square error (RMSE), and mean absolute percent error (MAPE, %). MAD and RMSE both measure the average magnitude of the forecast errors. MAPE expresses the forecast error as a percentage of the actual value. MAD, RMSE, and MAPE are calculated using Eqs. (13), (14), (15):

$$\mathrm{MAD} = \frac{1}{n}\sum_{k=1}^{n}\left|x^{(0)}(k) - \hat{x}^{(0)}(k)\right| \tag{13}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{k=1}^{n}\left(x^{(0)}(k) - \hat{x}^{(0)}(k)\right)^{2}} \tag{14}$$

$$\mathrm{MAPE} = \frac{1}{n}\sum_{k=1}^{n}\left|\frac{x^{(0)}(k) - \hat{x}^{(0)}(k)}{x^{(0)}(k)}\right| \times 100 \tag{15}$$

where $n$ is the number of observations, and $x^{(0)}(k)$ and $\hat{x}^{(0)}(k)$ are the actual and predicted data at time $k$, respectively. The model with the lowest MAD, RMSE, and MAPE values performs best.
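As a worked example, the three metrics can be computed for a single forecast series. The Python sketch below uses hypothetical function names and takes its input from the Germany out-of-sample rows of Table 4 (actual values and the Rolling-PSO-GM (1,1) forecasts for 31 May to 4 June).

```python
import math


def mad(actual, predicted):
    """Mean absolute deviation."""
    return sum(abs(x - p) for x, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((x - p) ** 2 for x, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percent error, in percent."""
    return 100.0 * sum(abs((x - p) / x) for x, p in zip(actual, predicted)) / len(actual)


# Germany, out-of-sample (31 May to 4 June), values copied from Table 4.
actual = [183410, 183594, 183879, 184121, 184472]
predicted = [183688, 183959, 184289, 184500, 184824]

print(round(mad(actual, predicted), 3))    # 356.8
print(round(rmse(actual, predicted), 3))   # 359.487
print(round(mape(actual, predicted), 3))   # 0.194
```

These values agree with the Germany out-of-sample entries for the Rolling-PSO-GM (1,1) model in Table 5.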

Results

The classical GM (1,1), Rolling-GM (1,1), and NARNN models were employed to verify the effectiveness of the Rolling-PSO-GM (1,1) model in estimating the number of confirmed COVID-19 cases in Germany, Turkey, and the USA. In models with a rolling mechanism, a small value of $k$ is preferred for the rolling window if a prediction series includes a significant versatility [27]. In this case, data from the last four periods ($x^{(0)}(k-3)$, $x^{(0)}(k-2)$, $x^{(0)}(k-1)$, and $x^{(0)}(k)$) are selected to forecast the $x^{(0)}(k+1)$ point. Therefore, the rolling mechanism in this study was built by selecting the last four days of cumulative COVID-19 case data. That is, the total number of confirmed COVID-19 cases for each fifth day is calculated from the total confirmed cases of the previous four days. For example, the total number of COVID-19 cases on April 30 is estimated using data from April 26, 2020, to April 29, 2020. Thus, the $a$ and $b$ parameter values in the Rolling-GM (1,1) model change at each rolling stage. The NARNN model was implemented using the Neural Network Toolbox in MATLAB version 2019b. The 35-day dataset was divided into three main subsets: training (70%, 25 data points), validation (15%, 5 data points), and test (15%, 5 data points). The delay parameter was set to four so that the NARNN results could be compared with the grey rolling models. The NARNN model was run five times with a single hidden layer of ten neurons. The traditional Levenberg–Marquardt backpropagation algorithm was used to train the model. This algorithm was chosen because it is known as the fastest backpropagation training algorithm in the literature [29]. For the classical GM (1,1) model, all COVID-19 data from April 26 to May 30, 2020, were used as inputs of the grey model, and the fixed values of the parameters $a$ and $b$ were then calculated. For the Rolling-PSO-GM (1,1) model, the optimum values of the $a$ and $b$ parameters were obtained by the PSO algorithm at each rolling stage. Table 2 shows the parameters calculated by the three grey prediction models.
Table 2

The parameter values calculated by GM (1,1), Rolling-GM (1,1), and Rolling-PSO-GM (1,1) models.

| Country | Date | GM (1,1) a | GM (1,1) b | Rolling-GM (1,1) a | Rolling-GM (1,1) b | Rolling-PSO-GM (1,1) a | Rolling-PSO-GM (1,1) b |
|---|---|---|---|---|---|---|---|
| Germany | 31-May | −0.0041 | 160,780.5359 | −0.0027 | 181,532.5496 | −0.0027 | 181,455.2106 |
| | 1-June | | | −0.0023 | 182,242.0345 | −0.0019 | 182,380.4666 |
| | 2-June | | | −0.0026 | 182,516.5239 | −0.0019 | 182,747.0767 |
| | 3-June | | | −0.0024 | 183,083.9341 | −0.0014 | 183,382.3125 |
| | 4-June | | | −0.0025 | 183,457.0084 | −0.0016 | 183,535.6261 |
| Turkey | 31-May | −0.01036 | 117,195.8141 | −0.0066 | 159,432.2875 | −0.0064 | 159,452.5876 |
| | 1-June | | | −0.0064 | 160,557.0853 | −0.0062 | 160,592.7040 |
| | 2-June | | | −0.0065 | 161,533.6708 | −0.0057 | 161,825.6768 |
| | 3-June | | | −0.0064 | 162,620.9347 | −0.0059 | 162,680.5650 |
| | 4-June | | | −0.0065 | 163,645.0932 | −0.0059 | 163,679.1199 |
| USA | 31-May | −0.0164 | 1,033,121.5836 | −0.0137 | 1,692,244.8785 | −0.0135 | 1,693,002.5918 |
| | 1-June | | | −0.0136 | 1,716,018.1400 | −0.0133 | 1,717,202.3190 |
| | 2-June | | | −0.0137 | 1,739,384.9457 | −0.0131 | 1,740,763.5549 |
| | 3-June | | | −0.0136 | 1,763,534.7679 | −0.0127 | 1,765,423.0989 |
| | 4-June | | | −0.0137 | 1,787,680.8312 | −0.0123 | 1,789,983.3062 |
According to the PSO operation steps, the parameters of the PSO algorithm are set as follows. In Eq. (11), $c_1$ and $c_2$, the cognitive and social acceleration coefficients, respectively, are set to a small value ($c_1 = c_2 = 1$) to avoid missing the optimal solution. In addition, the maximum inertia weight was set to 0.8 based on the study by Shi and Eberhart [32]. The maximum iteration number is set to 2000. All initial parameter values of PSO are given in Table 3.
Table 3

The initial parameters of the PSO algorithm.

| Parameter | Value |
|---|---|
| Maximum number of iterations (epochs) to train | 2000 |
| Maximum inertia weight, $w_{\max}$ | 0.8 |
| Minimum inertia weight, $w_{\min}$ | 0.1 |
| Acceleration coefficients, $c_1$, $c_2$ | 1 |
| Number of particles | 70 |
| The maximum velocity | 2 |
| Minimum global error gradient | $10^{-25}$ |
Table 4 shows the forecasting values of the NARNN, GM (1,1), Rolling-GM (1,1), and Rolling-PSO-GM (1,1) models used to predict the spread of COVID-19 in the three countries. To directly compare the performance of the classical GM (1,1) model with the rolling-based forecasting models, the values from April 26 to April 29, 2020, were not taken into account.
Table 4

Comparison of reported and predicted COVID-19 cases for the countries.

| Date | Germany Actual | Germany NARNN | Germany GM (1,1) | Germany Rolling-GM (1,1) | Germany Rolling-PSO-GM (1,1) | Turkey Actual | Turkey NARNN | Turkey GM (1,1) | Turkey Rolling-GM (1,1) | Turkey Rolling-PSO-GM (1,1) | USA Actual | USA NARNN | USA GM (1,1) | USA Rolling-GM (1,1) | USA Rolling-PSO-GM (1,1) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| In-sample | | | | | | | | | | | | | | | |
| 26-Apr | 157,770 | | | | | 110,130 | | | | | 971,078 | | | | |
| 27-Apr | 158,758 | | | | | 112,261 | | | | | 994,265 | | | | |
| 28-Apr | 159,912 | | | | | 114,653 | | | | | 1,018,926 | | | | |
| 29-Apr | 161,539 | | | | | 117,589 | | | | | 1,046,737 | | | | |
| 30-Apr | 163,009 | 162,819 | 163,748 | 162,871 | 162,948 | 120,204 | 120,534 | 122,709 | 120,265 | 120,347 | 1,076,224 | 1,099,297 | 1,111,092 | 1,073,548 | 1,074,002 |
| 01-May | 164,077 | 163,988 | 164,418 | 164,607 | 164,559 | 122,392 | 122,753 | 123,987 | 123,135 | 123,079 | 1,110,464 | 1,121,550 | 1,129,477 | 1,105,854 | 1,106,259 |
| 02-May | 164,967 | 164,743 | 165,091 | 165,428 | 165,361 | 124,375 | 124,762 | 125,279 | 124,938 | 124,867 | 1,138,228 | 1,148,564 | 1,148,166 | 1,143,081 | 1,143,768 |
| 03-May | 165,664 | 165,485 | 165,767 | 165,985 | 165,951 | 126,045 | 126,496 | 126,584 | 126,550 | 126,520 | 1,162,685 | 1,171,283 | 1,167,164 | 1,171,625 | 1,170,557 |
| 04-May | 166,152 | 166,169 | 166,445 | 166,495 | 166,525 | 127,659 | 128,151 | 127,902 | 127,965 | 127,912 | 1,186,067 | 1,193,261 | 1,186,477 | 1,190,273 | 1,189,709 |
| 05-May | 167,007 | 166,914 | 167,126 | 166,782 | 166,857 | 129,491 | 130,035 | 129,235 | 129,344 | 129,279 | 1,210,577 | 1,216,569 | 1,206,109 | 1,210,939 | 1,210,653 |
| 06-May | 168,162 | 168,150 | 167,810 | 167,622 | 167,683 | 131,744 | 132,085 | 130,581 | 131,216 | 131,239 | 1,235,666 | 1,240,133 | 1,226,066 | 1,235,110 | 1,235,451 |
| 07-May | 169,430 | 169,512 | 168,496 | 169,127 | 169,135 | 133,721 | 134,285 | 131,941 | 133,770 | 133,835 | 1,263,402 | 1,263,741 | 1,246,353 | 1,261,179 | 1,261,608 |
| 08-May | 170,588 | 170,643 | 169,186 | 170,637 | 170,622 | 135,569 | 136,112 | 133,316 | 135,934 | 135,888 | 1,290,151 | 1,290,387 | 1,266,976 | 1,290,288 | 1,290,851 |
| 09-May | 171,324 | 171,341 | 169,878 | 171,833 | 171,792 | 137,115 | 137,729 | 134,704 | 137,546 | 137,503 | 1,315,099 | 1,315,525 | 1,287,941 | 1,318,484 | 1,318,084 |
| 10-May | 171,879 | 170,715 | 170,573 | 172,349 | 172,279 | 138,657 | 139,104 | 136,108 | 138,895 | 138,844 | 1,333,970 | 1,338,717 | 1,309,252 | 1,342,058 | 1,341,652 |
| 11-May | 172,576 | 170,637 | 171,271 | 172,558 | 172,564 | 139,771 | 140,183 | 137,525 | 140,229 | 140,101 | 1,353,397 | 1,355,104 | 1,330,915 | 1,357,438 | 1,356,434 |
| 12-May | 173,171 | 173,191 | 171,971 | 173,182 | 173,210 | 141,475 | 141,887 | 138,958 | 141,189 | 141,118 | 1,376,122 | 1,375,093 | 1,352,938 | 1,372,892 | 1,373,339 |
| 13-May | 174,098 | 174,085 | 172,675 | 173,838 | 173,870 | 143,114 | 142,757 | 140,406 | 142,810 | 142,905 | 1,397,085 | 1,400,335 | 1,375,324 | 1,397,186 | 1,397,695 |
| 14-May | 174,478 | 174,480 | 173,382 | 174,809 | 174,850 | 144,749 | 144,552 | 141,868 | 144,827 | 144,754 | 1,424,243 | 1,419,663 | 1,398,081 | 1,419,764 | 1,420,215 |
| 15-May | 175,233 | 175,264 | 174,091 | 175,226 | 175,135 | 146,457 | 146,889 | 143,346 | 146,416 | 146,433 | 1,449,498 | 1,448,977 | 1,421,215 | 1,447,964 | 1,448,931 |
| 16-May | 175,752 | 175,776 | 174,803 | 175,741 | 175,803 | 148,067 | 148,427 | 144,839 | 148,147 | 148,001 | 1,473,514 | 1,472,656 | 1,444,731 | 1,476,775 | 1,476,611 |
| 17-May | 176,369 | 175,943 | 175,519 | 176,432 | 176,406 | 149,435 | 149,685 | 146,348 | 149,772 | 149,698 | 1,491,829 | 1,493,201 | 1,468,637 | 1,499,013 | 1,498,270 |
| 18-May | 176,551 | 176,552 | 176,237 | 176,924 | 176,732 | 150,593 | 150,741 | 147,873 | 150,987 | 150,927 | 1,513,816 | 1,508,073 | 1,492,938 | 1,514,400 | 1,513,456 |
| 19-May | 177,778 | 175,809 | 176,958 | 177,024 | 176,952 | 151,615 | 151,572 | 149,413 | 151,907 | 151,860 | 1,534,871 | 1,532,351 | 1,517,641 | 1,533,803 | 1,534,378 |
| 20-May | 178,473 | 176,656 | 177,682 | 178,314 | 178,487 | 152,587 | 152,462 | 150,969 | 152,740 | 152,664 | 1,557,933 | 1,554,757 | 1,542,753 | 1,557,028 | 1,557,203 |
| 21-May | 179,021 | 179,019 | 178,409 | 179,530 | 179,442 | 153,548 | 153,435 | 152,542 | 153,603 | 153,565 | 1,583,798 | 1,578,484 | 1,568,280 | 1,580,169 | 1,581,012 |
| 22-May | 179,710 | 180,489 | 179,139 | 179,670 | 179,701 | 154,500 | 154,444 | 154,131 | 154,526 | 154,481 | 1,607,109 | 1,606,756 | 1,594,230 | 1,608,418 | 1,608,141 |
| 23-May | 179,986 | 179,994 | 179,872 | 180,309 | 180,256 | 155,686 | 155,437 | 155,737 | 155,467 | 155,550 | 1,628,212 | 1,628,082 | 1,620,609 | 1,632,715 | 1,632,168 |
| 24-May | 180,328 | 180,348 | 180,608 | 180,539 | 180,470 | 156,827 | 156,407 | 157,359 | 156,728 | 156,796 | 1,648,158 | 1,646,710 | 1,647,425 | 1,651,264 | 1,650,785 |
| 25-May | 180,600 | 180,597 | 181,347 | 180,627 | 180,594 | 157,814 | 157,503 | 158,998 | 158,012 | 157,874 | 1,666,505 | 1,666,766 | 1,674,684 | 1,669,281 | 1,668,453 |
| 26-May | 181,200 | 180,716 | 182,089 | 180,919 | 181,039 | 158,762 | 158,331 | 160,655 | 158,915 | 158,876 | 1,685,956 | 1,685,503 | 1,702,394 | 1,686,265 | 1,686,034 |
| 27-May | 181,524 | 181,558 | 182,834 | 181,583 | 181,601 | 159,797 | 159,416 | 162,328 | 159,745 | 159,777 | 1,704,489 | 1,706,408 | 1,730,563 | 1,705,015 | 1,704,499 |
| 28-May | 182,196 | 182,195 | 183,582 | 182,034 | 182,052 | 160,979 | 160,550 | 164,019 | 160,784 | 160,897 | 1,727,357 | 1,726,176 | 1,759,198 | 1,723,970 | 1,725,182 |
| 29-May | 182,922 | 182,823 | 184,333 | 182,638 | 182,700 | 162,120 | 161,517 | 165,728 | 162,076 | 162,117 | 1,751,612 | 1,750,280 | 1,788,307 | 1,747,751 | 1,748,437 |
| 30-May | 183,189 | 183,264 | 185,087 | 183,616 | 183,367 | 163,103 | 162,359 | 167,454 | 163,302 | 163,172 | 1,775,428 | 1,774,423 | 1,817,898 | 1,775,458 | 1,775,591 |
| Out-of-sample | | | | | | | | | | | | | | | |
| 31-May | 183,410 | 183,538 | 185,844 | 183,764 | 183,688 | 163,942 | 163,093 | 169,199 | 164,202 | 164,139 | 1,794,465 | 1,793,849 | 1,847,978 | 1,800,058 | 1,799,322 |
| 01-Jun | 183,594 | 183,421 | 186,605 | 184,135 | 183,959 | 164,769 | 164,006 | 170,961 | 165,234 | 165,096 | 1,811,393 | 1,806,855 | 1,878,556 | 1,824,673 | 1,823,222 |
| 02-Jun | 183,879 | 183,223 | 187,368 | 184,644 | 184,289 | 165,555 | 164,579 | 172,742 | 166,322 | 166,023 | 1,832,782 | 1,814,934 | 1,909,639 | 1,849,832 | 1,846,742 |
| 03-Jun | 184,121 | 183,057 | 188,135 | 185,063 | 184,500 | 166,422 | 167,017 | 174,542 | 167,384 | 167,062 | 1,852,788 | 1,820,198 | 1,941,237 | 1,875,169 | 1,869,405 |
| 04-Jun | 184,472 | 182,724 | 188,904 | 185,544 | 184,824 | 167,410 | 165,874 | 176,360 | 168,474 | 168,042 | 1,874,156 | 1,823,506 | 1,973,358 | 1,900,935 | 1,891,680 |
As seen in Table 4, the Rolling-GM (1,1) and Rolling-PSO-GM (1,1) models outperform the original GM (1,1) model for all countries. In terms of prediction errors, the NARNN achieved better prediction results than the classical GM (1,1) model. However, it performed worse than the Rolling-GM (1,1) and Rolling-PSO-GM (1,1) models. The main reason for this may be that the rolling-based prediction models have a mechanism that updates the data at each rolling stage. The PSO algorithm significantly improved the prediction performance of the Rolling-GM (1,1) model in both the model building and model testing stages. This shows that optimizing the $a$ and $b$ parameters can improve the prediction accuracy of the grey model. Fig. 4 presents the forecasting results of the cumulative cases of COVID-19 for Germany, Turkey, and the USA using the Rolling-PSO-GM (1,1) model. There appears to be good agreement between the reported confirmed cases and the predicted cases.
Fig. 4

Actual and predicted values of COVID-19 data in (a) Germany, (b) Turkey, and (c) the USA.

From Table 5, it is seen that the Rolling-PSO-GM (1,1) model gives the lowest MAD, RMSE, and MAPE (%) values for Germany: 222.935, 292.748, and 0.129% in the in-sample dataset and 356.800, 359.487, and 0.194% in the out-of-sample dataset, respectively. It also performs best for Turkey, with MAD 201.161, RMSE 266.677, and MAPE 0.148% in the in-sample dataset and MAD 452.800, RMSE 484.517, and MAPE 0.273% in the out-of-sample dataset. Similarly, it has the best performance for the USA, with MAD 2484.806, RMSE 3292.779, and MAPE 0.184% in the in-sample dataset and MAD 12,957.400, RMSE 13,723.065, and MAPE 0.703% in the out-of-sample dataset. Clearly, the analysis results confirm that the Rolling-PSO-GM (1,1) model is more accurate and provides a significant improvement over the NARNN, traditional GM (1,1), and Rolling-GM (1,1) models.
Table 5

The performance evaluation metrics to compare NARNN, GM (1,1), Rolling-GM (1,1), and Rolling-PSO-GM (1,1) models.

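| Sample | Country | Performance criteria | NARNN | GM (1,1) | Rolling-GM (1,1) | Rolling-PSO-GM (1,1) |
|---|---|---|---|---|---|---|
| In-sample (26 April–30 May) | Germany | MAD | 318.355 | 847.323 | 264.484 | 222.935 |
| In-sample (26 April–30 May) | Germany | RMSE | 659.427 | 977.334 | 328.544 | 292.748 |
| In-sample (26 April–30 May) | Germany | MAPE (%) | 0.182 | 0.482 | 0.153 | 0.129 |
| In-sample (26 April–30 May) | Turkey | MAD | 372.484 | 1970.355 | 245.129 | 201.161 |
| In-sample (26 April–30 May) | Turkey | RMSE | 407.328 | 2253.889 | 304.895 | 266.677 |
| In-sample (26 April–30 May) | Turkey | MAPE (%) | 0.262 | 1.353 | 0.177 | 0.148 |
| In-sample (26 April–30 May) | USA | MAD | 3698.258 | 19 369.065 | 2898.677 | 2484.806 |
| In-sample (26 April–30 May) | USA | RMSE | 5937.719 | 21 968.532 | 3702.010 | 3292.779 |
| In-sample (26 April–30 May) | USA | MAPE (%) | 0.296 | 1.347 | 0.212 | 0.184 |
| Out-of-sample (31 May–4 June) | Germany | MAD | 753.800 | 3476.000 | 734.800 | 356.800 |
| Out-of-sample (31 May–4 June) | Germany | RMSE | 965.841 | 3547.349 | 779.714 | 359.487 |
| Out-of-sample (31 May–4 June) | Germany | MAPE (%) | 0.409 | 1.889 | 0.399 | 0.194 |
| Out-of-sample (31 May–4 June) | Turkey | MAD | 943.800 | 7141.200 | 703.600 | 452.800 |
| Out-of-sample (31 May–4 June) | Turkey | RMSE | 996.882 | 7261.774 | 765.461 | 484.517 |
| Out-of-sample (31 May–4 June) | Turkey | MAPE (%) | 0.569 | 4.306 | 0.424 | 0.273 |
| Out-of-sample (31 May–4 June) | USA | MAD | 21 248.400 | 77 036.800 | 17 016.600 | 12 957.400 |
| Out-of-sample (31 May–4 June) | USA | RMSE | 28 167.551 | 78 671.177 | 18 527.636 | 13 723.065 |
| Out-of-sample (31 May–4 June) | USA | MAPE (%) | 1.144 | 4.190 | 0.922 | 0.703 |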

Discussion

Grey system theory requires only a limited amount of data to understand the behaviour of unknown systems, which distinguishes it from other time-series methods. Grey prediction models are widely used for short-term prediction because of their simple calculation and high predictive accuracy. In recent years, grey modelling (1,1) has become increasingly popular in both medicine and public health research. Therefore, in this study, a hybrid model based on grey modelling (1,1) and a rolling mechanism optimized by PSO was used for short-term prediction of COVID-19 spread. The study results give information about the estimated number of COVID-19 cases that may occur in the next few days. Since the outbreak started on different dates in different countries, viewing the curve of each country allows us to compare countries more easily. Comparisons between countries can provide preliminary information about where and how the pandemic grew at any given time. According to the results obtained by the Rolling-PSO-GM (1,1) model, the number of confirmed new cases is expected to average 303, 988, and 23 250 over the next five days (31 May–4 June) for Germany, Turkey, and the USA, respectively. Germany was one of the first countries to respond to the epidemic, and it has seen a consistent drop in new cases. On the other hand, the total number of COVID-19 cases continues to increase in the USA, where COVID-19 has caused approximately 112,000 deaths. The country's health system is struggling to cope with the epidemic. The estimates indicate that the trend in the USA is far behind that of Germany and Turkey. In all three countries, comprehensive prevention and control measures need to be maintained and strengthened to reduce the spread of COVID-19.
The present study should be interpreted in light of some limitations. The data used in this study were taken from the official website of Johns Hopkins University. They include positive cases that have been confirmed by state, national, or local labs. However, it is difficult to estimate the actual spread of COVID-19 with this information, because there are likely to be more cases in these countries that have not yet been detected. Thus, it was assumed that the reported data in this study might be slightly lower than the actual number of COVID-19 cases.

Conclusion

There is limited data on the growth trajectory of the COVID-19 outbreak. Besides, the epidemiological features of the new coronavirus are not fully known. Therefore, predicting the spread of COVID-19 is essential because it gives an idea of what might happen in the coming days. Public health professionals and policymakers need estimates when making important decisions. These decisions include how best to utilize resources within a health system, which social distancing measures to implement, and what other measures can be taken to reduce the impact of COVID-19. The forecasts provide preliminary information that shows what might happen in the coming days and whether preventive measures are working. However, traditional methods have limitations, such as requiring large amounts of statistical data and a structured system. These requirements pose a significant challenge for researchers trying to predict the spread of COVID-19 accurately. To overcome this problem, grey system theory was adopted to predict the spread of COVID-19 in this study. The GM (1,1), Rolling-GM (1,1), and Rolling-PSO-GM (1,1) models were used to predict the cumulative number of COVID-19 cases in Germany, Turkey, and the USA. In addition, the NARNN model, a neural-network-based forecasting technique, was implemented to compare against the performance of the developed grey models. The analysis results showed that combining the rolling mechanism and the PSO algorithm with the classical GM (1,1) model significantly improved the prediction accuracy for the COVID-19 spread. Therefore, the Rolling-PSO-GM (1,1) model was selected as the best prediction model, having the lowest prediction errors for all countries. However, the Rolling-PSO-GM (1,1) model cannot take into account the economic and social factors affecting the spread of COVID-19. Also, the effect of climate change was ignored for short-term predictions. These disadvantages could be addressed by using the GM (1, n) model, which can be an important research topic in future studies. In addition, in future studies, the parameters of the classical grey prediction model can be optimized using other techniques such as the genetic algorithm or the grey wolf optimizer. The performance of these optimization techniques on the short-term prediction of COVID-19 spread can then be compared.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.