Mohammed N Alenezi1, Fawaz S Al-Anzi2, Haneen Alabdulrazzaq1, Ammar Alhusaini2, Abdullah F Al-Anzi3. 1. Computer Science & Information Systems Department, Public Authority for Applied Education & Training, Kuwait. 2. Computer Engineering Department, Kuwait University, Kuwait. 3. Electrical & Computer Engineering Department, American University, Kuwait.
Abstract
Today, the world is fighting a dangerous epidemic caused by the novel coronavirus, also known as COVID-19. Everyone has been impacted, and countries are trying to recover from the social, economic, and health devastation of COVID-19. Recent epidemiology research has concentrated on using different prediction models to estimate the numbers of infected, recovered, and deceased cases around the world. This study is primarily focused on evaluating two common prediction models: Susceptible-Infected-Recovered (SIR) and Susceptible-Exposed-Infected-Recovered (SEIR). The SIR and SEIR models were compared in estimating the outbreak to identify the better-fitting model for forecasting future spread in Kuwait. Based on the results of the comparison, the SEIR model was selected for predicting COVID-19 infected, recovered, and cumulative cases. The data needed for estimation was collected from official sites of the Kuwait Government between 24 February and 1 December 2020. This study presents estimated values for peak dates and the expected eradication of COVID-19 in Kuwait. The proposed estimation model is simulated on the collected data using the Python programming language. The simulation was performed with various basic reproduction numbers (between 3 and 5.2), initial exposed populations, and incubation rates. The results show that the SEIR model was better suited than the SIR model for predicting both infection and recovery cases, with $R_0$ values ranging from 3 to 4, $E_0 = 80$, and $\alpha = 0.2$.
The world is now undergoing a challenging period due to the unparalleled spread of an infectious virus, named coronavirus or COVID-19. The effects of the coronavirus family can range from a simple cold to more dangerous diseases such as Middle East Respiratory Syndrome (MERS) and Severe Acute Respiratory Syndrome (SARS). COVID-19, also in this family, was reported on 31 December 2019 in Wuhan, China. Due to the rapid transmission of the virus, the Chinese Government implemented several containment measures to keep the outbreak from spreading, including a complete lockdown and suspension of all forms of transportation to Wuhan during late January 2020 [1]. Thailand was the second country to report cases, on 13 January 2020 [2]. COVID-19 has now spread to 218 countries and territories. As of 6 December 2020, nearly 66.6 million infected cases, 42.8 million recovered cases, and 1.53 million deceased cases were reported due to COVID-19 [3].
On 30 January 2020, the total world count of confirmed cases reached 8,096. COVID-19 was declared an epidemic and a global public health emergency by the World Health Organization (WHO) [4]. With its rapid spread becoming a serious challenge across the world, the WHO declared COVID-19 a global pandemic on 11 March 2020 [5]. Most impacted countries imposed a full or partial lockdown to control the spread. One of the best measures for reducing the possibility of spreading is social distancing. This involves people maintaining a distance of 1 meter, wearing masks that cover the mouth and nose, and wearing gloves to limit transmission. The pandemic quickly impacted various areas such as economics, education, and politics, along with public health, all over the world. 
Furthermore, we saw an increase in poverty and unemployment rates.
When considering how best to control the spread of COVID-19, the virus's vigorous infectious behavior, the ambiguity in its transmission methods, its extended incubation period, and the complications in detecting it must all be taken into account. Most countries began joint efforts to prevent the transmission of the virus and to eradicate it [6]. Pharmaceutical companies, research departments, and health departments are developing multiple vaccines and treatments to prevent COVID-19 and bolster recoveries. There are now vaccines in the later stages of development and testing.
Kuwait faces a serious threat due to the novel coronavirus, and its impact is seen everywhere. COVID-19 was first discovered in Kuwait on 24 February 2020; the first five cases were citizens who had returned from abroad. To limit localized infection rates, the Kuwait Government imposed precautionary measures such as quarantines, banning flights to and from selected countries, closing down retail shops, and announcing a public holiday from 12 March 2020 onward. A partial curfew was implemented on 22 March 2020 between 5:00 PM and 4:00 AM daily, and the curfew timings were changed twice (to 5:00 PM to 6:00 AM, and again to 4:00 PM to 8:00 AM). Finally, the government imposed a complete lockdown from 10 May 2020 to 30 May 2020 [7].
In Kuwait, 142,993 individuals had been infected with COVID-19 as of 1 December 2020, of whom 138,507 had recovered, 881 were deceased, and 3,605 were receiving treatment. As of 1 June 2020, the population of Kuwait was approximately 4,776,407, according to data collected from the Public Authority for Civil Information (PACI) [8]. The Ministry of Health (MOH) and a small number of private hospitals and clinics are Kuwait's primary healthcare providers. 
The total bed capacity is around 8,200, of which about 7,118 beds are in MOH hospitals and 1,082 in private sector hospitals [9].
It is essential to generate an accurate estimation model for forecasting the impacts of COVID-19 in fields such as health, social affairs, and economics, so that decision-makers can issue appropriate mandates to handle these uncertainties. The estimation model helps to forecast what is likely to happen in the near future. These predictions support authorities in taking precautionary and preventive measures, such as arranging sufficient care and treatment plans. Due to the nature of pandemics, it is not possible to make precise estimates. However, researchers and scientists have tried to model this pandemic using scientifically proven estimation methods. Based on the estimation results, authorities decide how best to deal with the situation at hand.
The SIR and SEIR estimation models are two simple and effective compartmental models used for modeling a pandemic. The Susceptible-Infected-Recovered (SIR) model is a popular model used for forecasting a pandemic [10]. The SIR model divides the total population into three compartments: the Susceptible population, the Infected population, and the Recovered population. The Infected population is the total number of individuals who are infected with COVID-19 and are capable of spreading the virus to others. People who are susceptible to infection but not yet infected are in the Susceptible compartment. Those who are deceased or recovered are in the Recovered compartment. These are considered the three progressive phases of an epidemic. The SIR model is effective for estimating the percentage of the population who might need medical support. In the SIR estimation model, an individual who has recovered from the virus has acquired immunity and will not be reinfected.
The SEIR model is an extension of the SIR model that adds a compartment labeled Exposed to the existing SIR compartments. 
Those who have been exposed to infected individuals but have not yet become infectious are categorized as the Exposed population [11]. SEIR initially places the entire population in the Susceptible compartment. It also assumes that people who have recovered have acquired lifelong immunity against the virus.
Here, we model the COVID-19 outbreak using the SIR and SEIR models, then analyze and compare both in terms of accuracy to find the better model for predicting the outbreak, using Kuwait as a case study. The selected model is then used to predict the COVID-19 spread in Kuwait with different basic reproduction numbers ($R_0$). For estimating the epidemic, we assumed that the country has a constant population during the estimation period (ignoring deaths, births, and migration). This research mainly aims to identify the best estimation method for modeling the COVID-19 pandemic, considering the infected, recovered, and cumulative cases for various $R_0$ values, which are explained in the upcoming sections. All information required for this study was collected from government-authorized sites for the period of 24 February to 1 December 2020 [7]. Our study also tries to identify the best estimates for the peak dates and the expected decline of this pandemic in Kuwait.
The rest of the paper is organized as follows. The “Background” section discusses various estimation models available for modeling pandemics. In section “SEIR-based prediction”, we describe the SEIR model, a deterministic compartmental model, which we used to model the COVID-19 pandemic in this research. A comparison of the two popularly used deterministic compartmental models, SIR and SEIR, is performed and illustrated in section “Comparison of SIR and SEIR”. Finally, the analysis of the results and the conclusions are provided in sections “Results and discussion” and “Conclusion”, respectively.
Background
Regression models
Antonio Guterres, the UN Secretary-General, described COVID-19 as the most hazardous calamity since the Second World War. The coronavirus outbreak created a frightening global predicament. It severely impacted the daily lives of people around the world. It has also affected the economic, health, social, and political aspects of every impacted country. Most countries imposed travel bans and other restrictions; large-scale quarantines were set up all over the world in an attempt to impede the spread of COVID-19.
Many researchers have modeled and forecasted the COVID-19 pandemic. Most of the studies focused on tracing the spread of COVID-19 to analyze and predict its infection rate, recovery rate, and expected eradication. For evaluating and forecasting the COVID-19 pandemic, researchers used various models such as deterministic compartmental models (DCM), agent-based models (ABM), and logistic growth models [12]. Although there are a variety of models available for forecasting various pandemics, statistical models produce better results. Regression-based models are one such family, and its members have been used by various researchers in their studies. Linear regression of various orders (2nd, 3rd, and so on), Locally Weighted Linear Regression (LOESS), the Generalized Linear Model (GLM), Poisson regression, and logistic regression are all examples of regression models. Regression-based methods are distinguished mainly by the number and type of independent and dependent parameters and by the shape of the regression curve.
Regression analysis is the most common method used to analyze and predict the relationship between a dependent variable and one or more independent variables. Independent variables can be used to estimate the target or dependent variable from previous values. Regression analysis examines the relationship between those variables and predicts future values based on this analysis. The regression model can be linear or nonlinear. 
If the model contains only linear parameters, then it is known as linear regression. We can also represent a polynomial regression as linear regression. The relationship between the dependent and independent variables in a polynomial regression of second-order (also known as a second-order model or quadratic model) with one explanatory variable is given by Eq. (1) and with two variables is explained in Eq. (2), respectively.
In standard notation, these models read:
$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon \tag{1}$$
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \beta_{12} x_1 x_2 + \varepsilon \tag{2}$$
where $x$, $x_1$, and $x_2$ are explanatory variables or features, and $y$ is the dependent variable. The value of the dependent variable $y$ may be the number of death cases, recovered cases, confirmed cases, etc. $x$ is used to represent features such as gender, region, age, or the number of tests conducted. $\beta_0$, $\beta_1$, and $\beta_2$ are constants that represent the bias or intercept ($\beta_0$) and the slopes or weights ($\beta_1$ and $\beta_2$); the remaining $\beta$ coefficients in Eq. (2) weight the squared and interaction terms. $\varepsilon$ represents the error of the model: in real-world situations the regression model may not be able to estimate the correct target value, and $\varepsilon$ captures the noise in these relationships. In the quadratic model, the linear effect parameter and the quadratic effect parameter are $\beta_1$ and $\beta_2$, respectively. When $x = 0$, the value of $y$ gives the intercept value ($\beta_0$).
Eq. (3) represents the relationship between the independent and dependent variables using a third-order polynomial regression model. In a polynomial regression model, the relationship between the target and the independent variable is non-linear or curvilinear.
$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \varepsilon \tag{3}$$
where $\beta_0$, …, $\beta_3$ are the coefficients of the variables. A polynomial regression can be represented as a linear regression with multiple explanatory variables when $x^i$ is represented by $x_i$ for all $i$ (1, 2, 3, …). The third-order polynomial regression of one independent variable in Eq. (3) can thus be viewed as a linear regression model with three independent variables, as in Eq. (4), where $x$, $x^2$, and $x^3$ are rewritten as $x_1$, $x_2$, and $x_3$, respectively.
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon \tag{4}$$
A regression model that is used to analyze and predict the influence of several continuous independent parameters on target parameters is referred to as the Generalized Linear Model (GLM) [13]. It can be viewed as a generalization of existing regression models such as linear or polynomial regression. The GLM cannot predict the exact value of the target from the explanatory variables. As before, the difference is treated as the error of the model. 
The main components of the GLM are a random component, a link function $g()$, and a linear predictor. The conditional distribution of the target parameter is referred to as the random component. Eq. (5) represents a linear relationship between the independent and dependent parameters, which is viewed as the linear predictor, for $i = 1, 2, \ldots, n$; the link function describes how the mean relates to the linear predictor.
A non-parametric regression model used for smoothing the regression curve or line in volatile time series is known as Locally Weighted Linear Regression (LOESS) [14]. A scatter plot is used to obtain the best-fitting data. The regression curve is smoothed with the help of local subsets. The first step in the LOESS method is to choose a smoothing parameter. After choosing the parameter, the model selects the $k$ nearest neighbors of the point of the independent variable to be smoothed. The LOESS algorithm is applied to each point in turn, reassigning the weights of its nearest neighbors.
A regression model used for predicting a discrete dependent or response parameter is known as Poisson regression [15]. This model's main assumption is that the response variable consists of positive counts and follows the Poisson distribution. The model is mainly applicable for analyzing rates that take positive counts as values. It is similar to logistic regression, which is mainly used for calculating ratios with values between 0 and 1.
The logistic regression model is also a regression model used to predict or analyze the target or response parameter using the explanatory or independent parameters under consideration [16]. The logistic regression model is well suited for analyzing and estimating the growth of pandemic or epidemic diseases. The model assumes that an epidemic grows exponentially in its initial stage, then reaches a phase of steady increase, after which its rate of growth diminishes. 
The logistic regression-based epidemic model calculates the count of infected cases using Eq. (6), given the initial number of cases as the initial condition [16]. The model is expressed in terms of the count of infected people, the infection rate, and the final epidemic size $K$. The rate of change of infection at time $t$ is calculated using Eq. (7) [16]. The time at which the epidemic reaches its maximum growth rate is given in Eq. (8). Eqs. (9), (10) estimate the peak count of infected cases and the maximum rate of growth at the peak period.
Eq. (11) is used for fitting the regression model with the actual confirmed cases of infection. The value represents the actual estimate of infection where .
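The logistic growth description above can be illustrated with a short Python sketch. It assumes the textbook logistic form dC/dt = r C (1 - C/K) with closed-form solution C(t) = K / (1 + A e^{-rt}), A = (K - C0)/C0; the symbols `C0`, `r`, and `A`, the function names, and the numeric values are illustrative assumptions and are not taken from Eqs. (6)-(11) or from the Kuwait data.

```python
import math

def logistic_cases(t, K, r, C0):
    """Cumulative cases under logistic growth with C(0) = C0 (assumed textbook form)."""
    A = (K - C0) / C0
    return K / (1.0 + A * math.exp(-r * t))

def logistic_growth_rate(t, K, r, C0):
    """Rate of change dC/dt = r * C * (1 - C/K) at time t."""
    C = logistic_cases(t, K, r, C0)
    return r * C * (1.0 - C / K)

def peak_time(K, r, C0):
    """Time at which the growth rate is largest: t_p = ln(A) / r."""
    A = (K - C0) / C0
    return math.log(A) / r

# Hypothetical parameters, chosen only to exercise the functions.
K, r, C0 = 10_000.0, 0.2, 10.0
t_p = peak_time(K, r, C0)
cases_at_peak = logistic_cases(t_p, K, r, C0)       # equals K / 2 in this model
rate_at_peak = logistic_growth_rate(t_p, K, r, C0)  # equals r * K / 4 in this model
print(t_p, cases_at_peak, rate_at_peak)
```

In this form the growth rate peaks when half of the final epidemic size has been reached, which is why the peak count and the maximum growth rate depend only on K and r.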
Deterministic compartmental models
There exist numerous models for analyzing and estimating the epidemic spread. Deterministic compartmental models (DCM) are non-linear models used for modeling the epidemic spread. They mainly use differential equations for modeling the outbreak. The most commonly used DCMs are the Susceptible Infected Recovered (SIR) model and the Susceptible Exposed Infected Recovered (SEIR) model. The Autoregressive Integrated Moving Average (ARIMA) model, a statistical time-series model, is also reviewed in this section.
Susceptible-infected-recovered (SIR) model
The SIR model assumes that the total population is the sum of three compartments, namely Susceptible (S), Infected (I), and Recovered (R), as given in Eq. (12) [17]. Many researchers have used the SIR model to analyze and estimate various diseases such as HIV and Ebola [18], [19]. In the SIR model, the total population is represented by $N$ [10].
$$N = S + I + R \tag{12}$$
The Susceptible population is the subset of the total population who are healthy but at risk of becoming infected. Persons infected by the disease are known as the Infected population. Those who have recovered from the disease have acquired immunity and are referred to as the Recovered population. The total deceased population is also counted as recovered in the SIR model [20].
The SIR model works on the assumption that the total population is constant during the period of epidemic analysis and prediction, which means that no deaths and births are considered in that duration. The model describes the changes in the Susceptible (S), Infected (I), and Recovered (R) populations with the differential equations given in Eqs. (13), (14), (15), respectively [21], [22], written here in their standard form:
$$\frac{dS}{dt} = -\frac{\beta S I}{N} \tag{13}$$
$$\frac{dI}{dt} = \frac{\beta S I}{N} - \gamma I \tag{14}$$
$$\frac{dR}{dt} = \gamma I \tag{15}$$
The SIR model is used to model the pandemic given some initial conditions. The values of the initial susceptible population, the initial infected population, and the initial recovered population are denoted by $S(0)$, $I(0)$, and $R(0)$, respectively.
The infection rate is the rate at which the susceptible population becomes infected per day and is denoted by $\beta$. The recovery rate, $\gamma$, is the rate of recovery from the infection with acquired immunity [23]. The ratio of the infection rate to the recovery rate, as in Eq. (16), is referred to as the basic reproduction number and is denoted by $R_0$.
$$R_0 = \frac{\beta}{\gamma} \tag{16}$$
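As a minimal sketch of how the SIR equations above can be stepped forward in time, the following Python function uses a one-day forward-Euler update of the standard SIR form with the infection term scaled by N. The function name, the daily step, and the example values of `gamma` and the reproduction number are assumptions made for illustration; they are not the authors' fitted parameters.

```python
def simulate_sir(N, infected0, recovered0, beta, gamma, days):
    """Forward-Euler SIR simulation with a one-day step.

    Returns a list of (S, I, R) tuples, one per day, starting at day 0.
    """
    S = float(N - infected0 - recovered0)
    I = float(infected0)
    R = float(recovered0)
    history = [(S, I, R)]
    for _ in range(days):
        new_infections = beta * S * I / N   # flow S -> I during one day
        new_recoveries = gamma * I          # flow I -> R during one day
        S = S - new_infections
        I = I + new_infections - new_recoveries
        R = R + new_recoveries
        history.append((S, I, R))
    return history

# Hypothetical parameters: a 10-day infectious period and a reproduction number of 3.
gamma = 0.1
r_naught = 3.0
beta = r_naught * gamma   # from R0 = beta / gamma
sir_history = simulate_sir(N=4_776_000, infected0=5, recovered0=0,
                           beta=beta, gamma=gamma, days=200)
peak_day = max(range(len(sir_history)), key=lambda d: sir_history[d][1])
print("day of peak active infections:", peak_day)
```

Because each daily flow leaves one compartment and enters another, the sum S + I + R stays equal to N throughout, which matches the constant-population assumption.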
Susceptible-exposed-infected-recovered (SEIR) model
The SEIR model is an advancement of the SIR estimation model that introduces a new compartment, Exposed (E), in addition to the established compartments of SIR. The SEIR model assumes that the total population is the sum of these four compartments, as shown in Eq. (17). It initially considers the entire population to be in the susceptible compartment [11]. Those who have been exposed to infected persons but have not yet become infectious are labeled as the Exposed population. The SEIR estimation model also assumes that the total population is constant over the entire duration of estimation.
$$N = S + E + I + R \tag{17}$$
Eqs. (18), (19), (20), (21) give the rate of change of $S$, $E$, $I$, and $R$ with time in the SEIR model [24], written here in their standard form:
$$\frac{dS}{dt} = -\frac{\beta S I}{N} \tag{18}$$
$$\frac{dE}{dt} = \frac{\beta S I}{N} - \alpha E \tag{19}$$
$$\frac{dI}{dt} = \alpha E - \gamma I \tag{20}$$
$$\frac{dR}{dt} = \gamma I \tag{21}$$
The rate at which exposed persons become infectious is referred to as the incubation rate, $\alpha$ [2]. $T_{\mathrm{inf}}$ and $T_{\mathrm{inc}}$ denote the serial or infectious period and the incubation or latent period, respectively. The value of $\beta$ is calculated as the ratio of the reproduction number to the infectious period, and $\alpha$ and $\gamma$ as the reciprocals of the incubation period and the serial period, respectively, as shown in Eqs. (22), (23), (24) [25], [24].
$$\beta = \frac{R_0}{T_{\mathrm{inf}}} \tag{22}$$
$$\alpha = \frac{1}{T_{\mathrm{inc}}} \tag{23}$$
$$\gamma = \frac{1}{T_{\mathrm{inf}}} \tag{24}$$
The reciprocal of the infection rate is known as the contact period ($1/\beta$).
There are several measures available for evaluating the generated estimation model. Some of them are the Residual Sum of Squares (RSS), the Coefficient of Determination ($R^2$), and the Root Mean Squared Error (RMSE). RSS estimates the error between the actual and estimated values. It is a statistical measure of the variance in the actual data values that is not explained by the generated estimation model. The RSS measure helps to identify the optimal values of the infection and recovery rates ($\beta$ and $\gamma$) by quantifying the error obtained with the selected $\beta$ and $\gamma$. Eq. (25) shows how the RSS measure is calculated.
$$RSS = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \tag{25}$$
Another statistical measure used for model evaluation is the coefficient of determination ($R^2$), which is used as a goodness-of-fit measure. It is the percentage of the variance in the target variable that is explained by the independent parameters. It measures the strength of the relationship between the generated prediction model and the actual target variable. The value of $R^2$ is between 0 and 1 (ranging from 0 to 100%). The method for calculating the $R^2$ value is given in Eq. (26).
$$R^2 = 1 - \frac{RSS}{TSS} \tag{26}$$
where TSS refers to the total sum of squares, which is calculated as the sum of the squared deviations of the target parameter from its mean, as given in Eq. (27).
$$TSS = \sum_{i=1}^{n} (y_i - \bar{y})^2 \tag{27}$$
RMSE represents the standard deviation of the actual values from the estimated data points or the regression line. RMSE is calculated using Eq. (28).
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2} \tag{28}$$
where $\hat{y}_i$ refers to the estimated value at point $i$ and $y_i$ is the corresponding actual value.
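The three evaluation measures described above can be sketched in plain Python as follows. This is an illustrative implementation of the usual definitions (RSS as the sum of squared residuals, R² = 1 - RSS/TSS, RMSE as the square root of the mean squared residual), not the sklearn-based code used in the study; the function names and the sample data are hypothetical.

```python
import math

def rss(actual, estimated):
    """Residual sum of squares between actual and estimated values."""
    return sum((a - e) ** 2 for a, e in zip(actual, estimated))

def tss(actual):
    """Total sum of squares of the actual values around their mean."""
    mean = sum(actual) / len(actual)
    return sum((a - mean) ** 2 for a in actual)

def r_squared(actual, estimated):
    """Coefficient of determination, 1 - RSS / TSS."""
    return 1.0 - rss(actual, estimated) / tss(actual)

def rmse(actual, estimated):
    """Root mean squared error."""
    return math.sqrt(rss(actual, estimated) / len(actual))

# Hypothetical cumulative counts, used only to exercise the functions.
actual_cases = [1.0, 2.0, 3.0, 4.0]
estimated_cases = [1.0, 2.0, 3.0, 5.0]
print(rss(actual_cases, estimated_cases),
      r_squared(actual_cases, estimated_cases),
      rmse(actual_cases, estimated_cases))
```

A lower RSS or RMSE and an R² closer to 1 indicate a closer fit, which is how competing parameter choices or competing models can be ranked against the same observed series.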
Autoregressive integrated moving average (ARIMA) model
Another statistical model for analyzing and forecasting the future growth of time-dependent information is known as ARIMA. It is the most commonly used statistics-based method for estimating and analyzing changes in time-dependent data [26]. The AutoRegression part (the AR in ARIMA) is used to identify the relationship between the observed data and lagged observations. Differencing is used to make the time series stationary; it is considered a pre-processing step (the Integrated part). The Moving Average (MA) part models the dependency between the observed data and the residual errors of lagged observations. Lag polynomials are used in the ARIMA model as shown in Eqs. (29), (30) [27], where $p$, $d$, and $q$ in these equations should be greater than or equal to 0. The ARIMA model with $d = 0$ (ARIMA($p$, 0, $q$)) is the ARMA($p$, $q$) model; when $d$ and $q$ are equal to 0 (ARIMA($p$, 0, 0)), it is the AR($p$) model; and finally, when $p$ and $d$ are equal to 0 (ARIMA(0, 0, $q$)), it is the MA($q$) model. In almost all cases, the value of $d$ is 1 (the time-series data is differenced once). The ARIMA model with $p = 0$, $q = 0$, and $d = 1$ is a special case known as the Random Walk model, and the corresponding series value is estimated using Eq. (31)
[27].
Many researchers have analyzed and estimated the COVID-19 epidemic outbreak all over the world using different estimation models. They all tried to forecast the peak values and the expected ending time of this epidemic. The spread of COVID-19 in Italian regions was analyzed and estimated by Distante et al. [25]. The peak infection values and the peak period were estimated using the SEIR estimation model. They studied the epidemic spread and concluded that the outbreak would reach its maximum in Italy's northern regions by the end of March and in the southern regions of Italy by the first week of April 2020. They calculated the basic reproduction number using two different methods based on the daily cases and the studied duration. Their estimation was almost correct, and the outbreak started diminishing at the end of March.
Peirlinck et al. [24] performed an analysis of the COVID-19 outbreak, especially in China and the United States, to demonstrate the effectiveness of mathematical models for estimating the outbreak growth and other parameters. They also provided some guidelines for controlling the outbreak successfully. They evaluated the effects of relaxing preventive measures such as total lockdown, travel restrictions, and shelter-in-place orders for an entire or specific population, as well as the potential of vaccination. For their study, they integrated data from the initial stages of the outbreak in the United States and China to estimate the various periods of the epidemic, such as the infectious, latent, and contact periods, and the value of the basic reproduction number. To estimate the parameters of the COVID-19 outbreaks in these two countries, they combined a global network model with an SEIR-based local epidemic estimation model.
Alenezi et al. [28] used the SIR model, with various values of $R_0$, to predict peak dates for Kuwait. According to their results, Kuwait would reach its peak between July 23rd and August 22nd of 2020. 
They also found that the lockdown, as well as other preventive measures taken by Kuwait's government, proved to be effective in reducing the number of cases.
Syed and Sibgatullah [29] analyzed and estimated the COVID-19 outbreak in Pakistan using the SIR estimation model. They based the analysis on data collected from the National Database of their country. They forecasted the peak value and the time at which COVID-19 would reach its peak in Pakistan, estimating the peak on 26 May 2020. Their conclusion was that unless the authorities imposed strict policies to control the epidemic growth, 90% of the total population would be affected by the epidemic before the last week of July.
An SIR-based estimation model was generated for modeling the growth of the COVID-19 epidemic in Bangladesh by Rahman et al. [20]. They analyzed and forecasted the spread of the coronavirus. They studied the impact of various preventive measures imposed by their government for controlling the outbreak, such as social distancing. They forecasted the final size of infection in their country at 3,782,558 and estimated that the epidemic would reach its peak value on the 92nd day. Their study concluded that social distancing has an effective impact on controlling the epidemic's spread, and that strict social distancing is one of the best measures to control the epidemic's growth.
Batista [30] analyzed and estimated the spread of the COVID-19 epidemic in China, South Korea, and the rest of the world using an SIR-based estimation model. The study aimed to estimate the final size of the outbreak in these regions. He forecasted these estimates using both the SIR and logistic models and evaluated the models using the $R^2$ score.
He et al. [31] proposed an SEIR-based model for analyzing and forecasting COVID-19 that takes into account control measures such as quarantine and hospitalization. They modeled the epidemic using information collected from Hubei Province. 
They used a particle swarm optimization algorithm to identify the various parameters of the proposed model. Their study found that the parameters may change depending on the scenario. They suggested that quarantine and treatment are the best methods for controlling the epidemic.
Lounis and Azevedo [32] modeled COVID-19 in Algeria using the classical and generalized SEIR models. They tried to forecast 100 days ahead based on the official confirmed cases in Algeria between April 2020 and early August 2020. They forecasted the counts of cumulative infections and deaths up to November 2020.
A model's suitability for prediction depends on the problem at hand. Compartmental models such as SIR and SEIR are widely used to predict a pandemic's spread because they are deterministic models and can work easily with a large population size. They can also be used to analyze the effect of various control strategies imposed by the authorities. The ARIMA model, on the other hand, efficiently manages outliers and can be used for both seasonal and non-seasonal data. The accuracy of ARIMA depends on how the observed data (training set) and/or parameters are modeled. For the purpose of this study, the SIR and SEIR models were chosen.
SEIR-based prediction
The SEIR model is an extension of the SIR model used to analyze and forecast the epidemic outbreak. The main parameters of the SEIR-based estimation model are the incubation rate ($\alpha$), the infection rate ($\beta$), and the recovery rate ($\gamma$). Globally, almost 218 countries have been affected by COVID-19. This research aims to compare the SIR and SEIR estimation models using the number of infection and recovery cases between 24 February 2020 and 28 May 2020, and to forecast and model the COVID-19 epidemic using the better model.
The Python programming language is used for simulating the SEIR-based estimation model. Python provides a vast number of predefined modules or tools for a wide variety of applications. The Python tools or modules mainly used for modeling the COVID-19 outbreak are Matplotlib, math, xlsxwriter, xlrd, and sklearn [33]. Matplotlib is a Python plotting library mainly used to create static, animated, or interactive figures. All the graphs used here are plotted using Matplotlib. The generated estimation model is evaluated using different measures such as RSS, RMSE, and $R^2$. The RMSE and $R^2$ measures are computed using the sklearn module, which provides an effective platform for machine learning. The sklearn module is built on SciPy, Matplotlib, and NumPy. The development environment used for developing the Python-based SEIR model is Python's Integrated Development and Learning Environment (IDLE). The data required for modeling the COVID-19 outbreak in Kuwait is collected mainly from authorized sources such as the Kuwait Government's official websites related to COVID-19 [7] and the WHO [3]. Fig. 1 depicts the collected information on confirmed daily infections and recoveries in Kuwait.
Fig. 1
Kuwait daily cases of infection and recovery from 24th February 2020 to 28th May 2020 [7].
The actual values of the infection rates ($\beta$) and recovery rates ($\gamma$) are calculated per day (time) from the collected data and are shown in Fig. 2, Fig. 3. The recovery and infection rates are expressed as a fraction of the population already infected and represent the percentage of newly (daily) recovered and infected populations. For example, a recovery rate of 0.15 means that 15% of the currently infected population at time $t$ recovers at time $t$. Eqs. (32), (33) estimate the $\beta$ and $\gamma$ values for any time $t$.
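One way to read the daily rates described above is as the day's new infections and new recoveries divided by the number of currently infected people on that day. The sketch below follows that reading; the exact form of Eqs. (32), (33) is not reproduced here, so the formula, the function name, and the sample figures are assumptions for illustration and are not the Kuwait data.

```python
def daily_rates(new_infected, new_recovered, active_infected):
    """Return per-day (beta, gamma) lists as fractions of the active infected count.

    beta[t]  = new infections on day t / active infected on day t
    gamma[t] = new recoveries on day t / active infected on day t
    Days with no active infections get a rate of 0.0 to avoid dividing by zero.
    """
    betas, gammas = [], []
    for infected, recovered, active in zip(new_infected, new_recovered, active_infected):
        if active > 0:
            betas.append(infected / active)
            gammas.append(recovered / active)
        else:
            betas.append(0.0)
            gammas.append(0.0)
    return betas, gammas

# Hypothetical three-day series.
betas, gammas = daily_rates(
    new_infected=[20, 12, 0],
    new_recovered=[15, 6, 0],
    active_infected=[100, 120, 0],
)
print(betas, gammas)
```

With these sample figures the first day's recovery rate is 0.15, which corresponds to the worked example in the text: 15% of the currently infected population recovers on that day.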
Fig. 2
The infection rates from 24th February 2020 to 28th May 2020.
Fig. 3
The recovery rates from 24th February 2020 to 28th May 2020.
The latent or incubation period is between 1 and 14 days. It is difficult to find the exact value of the incubation period, so the study is conducted with various values of the latent period and hence of the incubation rate. The values of $\alpha$ considered are 1/4, 1/5, 1/6, 1/12, and 1/13, with various exposed population values (47, 80, 94, and 477).
Eqs. (34), (35), (36), (37) estimate the values of the four main compartments $S$, $E$, $I$, and $R$ of the SEIR estimation model at any time $t$ as the sum of their values at time ($t-1$) and the rates of change calculated using Eqs. (18), (19), (20), (21), respectively [34].
The cumulative infection and recovery counts are calculated from the collected information about actual infection and recovery cases in Kuwait. These values are compared with the forecasted values of both infection and recovery using the $R^2$ score, RSS, and RMSE measures. The initial values of all compartments are set based on the confirmed cases from the first day of the outbreak reported in Kuwait. The total population is taken here as 4,776,000. The entire remaining population is considered to be susceptible. The initial value of the exposed population cannot be predicted precisely, so this study is conducted with some assumed values for it: 47, 80, 94, and 477. The susceptible population varies accordingly: 4,775,948, 4,775,915, 4,775,901, and 4,775,518. The initial values of the infected population and the recovered population are 5 and 0, respectively.
The ratio of the infection rate to the recovery rate is referred to as the basic reproduction number $R_0$. The transmission rate of an outbreak is determined by the value of $R_0$. $R_0$ also determines an epidemic's growth: whether or not the infection forms an outbreak in that country or in the global population. 
If the value of R0 is less than 1, the epidemic will not become an outbreak and will diminish quickly. Otherwise, it will grow exponentially and severely affect a significant percentage of the total population [22], [35]. The value of R0 is used to identify the number of infections to be expected during the initial stages of the epidemic [35]. An infected person makes β contacts on average and recovers within 1/γ days, based on R0 = β/γ [36]. This research is performed with different values of β and γ.
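The update scheme and initial conditions described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: it assumes the standard SEIR rate equations (dS/dt = -βSI/N, dE/dt = βSI/N - αE, dI/dt = αE - γI, dR/dt = γI) in place of Eqs. (18), (19), (20), (21), which are not reproduced in this section, and it assumes a one-day forward step. The function names are hypothetical.

```python
N = 4_776_000          # total population used in the study
I_INIT, R_INIT = 5, 0  # initial infected and recovered counts


def basic_reproduction_number(beta, gamma):
    """R0 as the ratio of the infection rate to the recovery rate."""
    return beta / gamma


def initial_susceptible(e0, n=N, i0=I_INIT, r0=R_INIT):
    """S(0) = N - E(0) - I(0) - R(0)."""
    return n - e0 - i0 - r0


def simulate_seir(beta, gamma, alpha, e0, days, n=N, i0=I_INIT, r0=R_INIT):
    """Advance S, E, I, R one day at a time (assumed standard SEIR form).

    Each new value is the previous value plus the daily rate of change.
    Returns a list of (S, E, I, R) tuples for day 0 .. days.
    """
    s = float(initial_susceptible(e0, n, i0, r0))
    e, i, r = float(e0), float(i0), float(r0)
    trajectory = [(s, e, i, r)]
    for _ in range(days):
        newly_exposed = beta * s * i / n   # S -> E
        newly_infectious = alpha * e       # E -> I
        newly_recovered = gamma * i        # I -> R
        s, e, i, r = (
            s - newly_exposed,
            e + newly_exposed - newly_infectious,
            i + newly_infectious - newly_recovered,
            r + newly_recovered,
        )
        trajectory.append((s, e, i, r))
    return trajectory
```

Calling `initial_susceptible` with E0 = 47, 80, 94, and 477 reproduces the susceptible counts listed above, and `simulate_seir(0.128, 0.0333, 0.2, 80, 95)` would cover the 95 days from 24 February to 28 May 2020 under these assumptions.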
Comparison of SIR and SEIR
A comparison between the SIR and SEIR models was conducted for infection and recovery cases. Fig. 4, Fig. 5, Fig. 6, Fig. 7 show how well the estimated models fit the actual values of infection and recovery. For the estimation of both infection and recovery, the SEIR model outperforms the SIR model.
Fig. 4
Comparison of actual and estimated cumulative infection between 24th February 2020 and 28th May 2020 using SIR model.
Fig. 5
Comparison of actual and estimated cumulative infection between 24th February 2020 and 28th May 2020 using SEIR model.
Fig. 6
Comparison of actual and estimated cumulative recovery between 24th February 2020 and 28th May 2020 using SIR model.
Fig. 7
Comparison of actual and estimated cumulative recovery between 24th February 2020 and 28th May 2020 using SEIR model.
The emerging COVID-19 outbreak is modeled using SIR and SEIR models for various R0 values. The estimated values obtained using the SEIR model are close to the actual reported numbers, especially for the recovery cases.

The evaluation measures for both models on infection and recovery, shown in Fig. 8, Fig. 9, Fig. 10, indicate that the SEIR model gives a more accurate estimate for both infection and recovery. The SEIR model with parameters E0 = 80 and α = 0.2 surpasses the SIR model for modeling the outbreak.
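The three evaluation measures can be sketched in Python as follows. This is an illustration under assumptions: R2 is taken to be the usual coefficient of determination, RSS the residual sum of squares, and RMSE the square root of RSS divided by the number of observations. The paper's exact definitions are not reproduced in this section, and the function names are hypothetical.

```python
import math


def rss(actual, predicted):
    """Residual sum of squares between observed and estimated counts."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def rmse(actual, predicted):
    """Root mean squared error: sqrt(RSS / n)."""
    return math.sqrt(rss(actual, predicted) / len(actual))


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - RSS / total sum of squares."""
    mean = sum(actual) / len(actual)
    total = sum((a - mean) ** 2 for a in actual)
    return 1.0 - rss(actual, predicted) / total
```

Under these definitions RMSE² × n = RSS. The first row of Table 1(a) is consistent with this for n = 95, the number of days from 24 February to 28 May 2020: 676.9646103² × 95 ≈ 43,536,703, which matches the reported RSS.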
Fig. 8
Evaluation results of SEIR and SIR based on R2 values.
Fig. 9
Evaluation results of SEIR and SIR based on RMSE values.
Fig. 10
Evaluation results of SEIR and SIR based on RSS values.
Based on the comparison performed in section “Comparison of SIR and SEIR”, the SEIR estimation model outperforms the SIR model for predicting both infection and recovery. Hence, the SEIR model is selected for forecasting the future values of infection, recovery, peak dates, and peak values. The actual values of both infection and recovery cases are collected from recognized sources such as the official websites of the Kuwait Government [7].

Fig. 11, Fig. 12 illustrate the actual daily infection and recovery counts collected from government-authorized sources from 29 May 2020 to 1 December 2020. The daily infection count gradually decreases, which is attributed to the government’s preventive measures and the social distancing guidelines followed by residents.
Fig. 11
The daily cases of infection from 29th May 2020 to 1st December 2020.
Fig. 12
The daily cases of recovery from 29th May 2020 to 1st December 2020.
Fig. 13, Fig. 14 depict the daily infection and recovery rates from 29 May 2020 to 1 December 2020. The infection rate, and hence the daily count, has decreased. Moreover, due to the successful control of COVID-19, the recovery rate has increased.
Fig. 13
The infection rates from 29th May 2020 to 1st December 2020.
Fig. 14
The recovery rates from 29th May 2020 to 1st December 2020.
The SEIR model has shown better results overall for predicting both infection and recovery cases. The basic reproduction number R0 has decreased from its initial values, which suggests that the government’s preventive measures are successful.
Results and discussion
The SEIR-based estimation is performed on the confirmed cases between 24 February 2020 and 1 December 2020. The first five COVID-19 cases in Kuwait were reported on 24 February 2020. The infection and recovery rates change over time. In the early stages, the infection rate and the number of infections increased slowly and then decreased. The recovery rate and the number of recovered cases increased slowly. The main problem encountered when using the SEIR model for COVID-19 in Kuwait is the lack of methods for measuring the exact value of the initial exposed population. This research therefore assumes values for the initial exposed population and the incubation rate. It is performed with various values of R0, E0, and α. The values considered for E0 are 47, 80, 94, and 477, and those for α are 1/4, 1/5, 1/6, 1/12, and 1/13. Of these, E0 = 80 and α = 1/5 gave better performance than the other values. The forecasted cumulative infection and recovery using the SEIR model with E0 = 80 and α = 0.2 for different R0 values are illustrated in Fig. 15, Fig. 16.
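The sweep over the assumed E0 and α values can be sketched in Python as follows. This is an illustration only, not the authors' procedure: it assumes the standard SEIR equations with a one-day step and constant β and γ, so it ignores the effect of containment measures, and the peak days it returns are not the peak dates reported in the study. The function names are hypothetical.

```python
from itertools import product

N = 4_776_000                        # total population used in the study
I_INIT, R_INIT = 5, 0                # initial infected and recovered counts
E0_VALUES = (47, 80, 94, 477)        # assumed initial exposed populations
ALPHA_VALUES = (1 / 4, 1 / 5, 1 / 6, 1 / 12, 1 / 13)  # incubation rates


def infected_curve(beta, gamma, alpha, e0, days):
    """Daily size of the infected compartment (assumed standard SEIR form)."""
    s, e, i, r = float(N - e0 - I_INIT - R_INIT), float(e0), float(I_INIT), float(R_INIT)
    curve = [i]
    for _ in range(days):
        newly_exposed = beta * s * i / N
        newly_infectious = alpha * e
        newly_recovered = gamma * i
        s, e, i, r = (
            s - newly_exposed,
            e + newly_exposed - newly_infectious,
            i + newly_infectious - newly_recovered,
            r + newly_recovered,
        )
        curve.append(i)
    return curve


def peak(curve):
    """Return (day index, value) of the maximum of a daily curve."""
    day = max(range(len(curve)), key=curve.__getitem__)
    return day, curve[day]


def sweep(beta, gamma, days=1000):
    """Peak of the infected compartment for every (E0, alpha) combination."""
    return {
        (e0, alpha): peak(infected_curve(beta, gamma, alpha, e0, days))
        for e0, alpha in product(E0_VALUES, ALPHA_VALUES)
    }
```

For example, `sweep(0.128, 0.0333)` uses the β and γ of the second row of Table 1 and returns a dictionary keyed by (E0, α). A day index can be converted to a calendar date by counting from 24 February 2020.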
Fig. 15
The forecasted cumulative infection highlighting the rise, peak, and predicted decline of COVID-19 using the SEIR model with various R0 values, with E0 = 80 and α = 0.2.
Fig. 16
The forecasted cumulative recovery highlighting the rise, peak, and predicted decline of COVID-19 using the SEIR model with various R0 values, with E0 = 80 and α = 0.2.
The preventive measures issued by the Kuwait Government to control the growth of COVID-19, such as quarantine, curfew, and travel and entry restrictions, have largely come to fruition. The number of confirmed cases decreased gradually, and recovered cases increased. Social distancing and other precautionary measures reduce the spread of the virus.

The containment measures issued by the Kuwait Government had a positive impact on the daily infection count. Precautionary measures such as a full or partial curfew, travel ban, social distancing, wearing masks and gloves, and using hand sanitizer had a positive effect in decreasing the infection count. The daily infection count is gradually decreasing. The preventive measures controlled the growth of COVID-19 in Kuwait to some extent.

The study is performed with various R0 values and different values of E0 and α. Based on the evaluation performed on various values of the initial exposed population and incubation rate, E0 = 80 and α = 0.2 give better results. Fig. 17, Fig. 18, Fig. 19, Fig. 20 illustrate the forecasted growth of infection and recovery for various values of E0 and α using various R0 values.
Fig. 17
Forecasted infection and recovery for various R0 values at fixed E0 and α.
Fig. 18
Forecasted infection and recovery for various R0 values at fixed E0 and α.
Fig. 19
Forecasted infection and recovery for various R0 values at fixed E0 and α.
Fig. 20
Forecasted infection and recovery for various R0 values at fixed E0 and α.
Conclusion
Analyzing and forecasting the spread of an outbreak while it is happening is essential to help authorities determine the precautions and containment measures needed to control the transmission of the disease. Several mathematical and compartmental models have been commonly used in epidemiology research to model the spread of epidemics, including HIV, Ebola, and COVID-19. Factors such as population size and purpose of prediction can affect a model’s suitability. The main focus of this study was to analyze and compare the SIR and SEIR models and find the more suitable of the two for forecasting future values, while taking into consideration the impact of the preventive measures implemented by the Kuwait Government. In this research, we evaluated and compared both models’ performance and selected the SEIR model as the more suitable model based on the data collected for the period of 24 February 2020 to 1 December 2020 from Kuwait government-authorized sources. The Python programming language was used for simulating the SEIR model with various values of the basic reproduction number R0, initial exposed population E0, and incubation rate α. The results showed that the SEIR model fits the infection and recovery cases for values of R0 ranging from 3 to 4, an incubation rate of 0.2, and an initial exposed population of 80. In our evaluation of the estimation models, we used accuracy measures such as R2, RMSE, and RSS. It should be noted that the data collected for analyzing and estimating the spread do not account for any external factors that might influence the number of infection and recovery cases. The results of our study have shown that containment measures such as travel restrictions and lockdowns helped to control the spread of COVID-19. Other precautionary measures such as social distancing and wearing masks have also helped in curbing the spread of COVID-19.
Declaration of Competing Interest
We, the authors, declare that we have no conflict of interest in this research study.
Table 1
Evaluation measures of the SEIR-based model using various R0 values at fixed E0 and α.

(a) Cumulative Infection
β      γ       R0    R2           RMSE         RSS
0.117  0.0225  5.2   0.978062532  676.9646103  43536702.95
0.128  0.0333  3.84  0.957640202  940.6967818  84066491.35
0.129  0.0337  3.83  0.96320421   876.7419419  73024261.11
0.13   0.0355  3.66  0.942013235  1100.619384  115079487.8
0.132  0.035   3.77  0.974455512  730.5018494  50695130.44
0.119  0.024   4.96  0.978480988  670.4769885  42706242.25
0.135  0.04    3.38  0.919980388  1292.917868  158805478.3
0.138  0.04    3.45  0.966003817  842.7287255  67468211.95

(b) Cumulative Recovery
β      γ       R0    R2           RMSE         RSS
0.117  0.0225  5.2   0.923005422  587.2567376  32762695.21
0.128  0.0333  3.84  0.991850371  191.0587178  3467826.196
0.129  0.0337  3.83  0.991278533  197.6481329  3711154.523
0.13   0.0355  3.66  0.99062722   204.895374   3988300.856
0.132  0.035   3.77  0.976923149  321.5038754  9819650.479
0.119  0.024   4.96  0.950042743  473.0393082  21257787.77
0.135  0.04    3.38  0.975310644  332.5468342  10505802.71
0.138  0.04    3.45  0.923937882  583.6898527  32365915.2
Table 2
Evaluation measures of the SEIR-based model using various R0 values at fixed E0 and α.

(a) Cumulative Infection
β      γ       R0    R2           RMSE         RSS
0.117  0.0225  5.2   0.968207131  814.9624151  63095555.11
0.128  0.0333  3.84  0.913933067  1340.882983  170806881.5
0.129  0.0337  3.83  0.923848169  1261.284018  151129550.6
0.13   0.0355  3.66  0.893024409  1494.910059  212301828.1
0.132  0.035   3.77  0.947109522  1051.142181  104965489.1
0.119  0.024   4.96  0.968217636  814.827754   63074705.52
0.135  0.04    3.38  0.869391213  1651.805527  259203842.3
0.138  0.04    3.45  0.936184401  1154.612194  126647285.3

(b) Cumulative Recovery
β      γ       R0    R2           RMSE         RSS
0.117  0.0225  5.2   0.824998007  885.3590573  74466762.73
0.128  0.0333  3.84  0.961397342  415.8211155  16426184.01
0.129  0.0337  3.83  0.97125032   358.8512339  12233549.77
0.13   0.0355  3.66  0.971088583  359.8592116  12302371.96
0.132  0.035   3.77  0.990274007  208.7204041  4138599.672
0.119  0.024   4.96  0.864380085  779.3986876  57708919.85
0.135  0.04    3.38  0.989660219  215.2056173  4399778.484
0.138  0.04    3.45  0.988680649  225.1690567  4816604.891
Table 3
Evaluation measures of the SEIR-based model using various R0 values at fixed E0 and α.

(a) Cumulative Infection
β      γ       R0    R2           RMSE         RSS
0.117  0.0225  5.2   0.95090185   1012.757063  97439302.54
0.128  0.0333  3.84  0.908568457  1382.040406  181453389.9
0.129  0.0337  3.83  0.911870041  1356.858316  174901126.6
0.13   0.0355  3.66  0.885134429  1549.057912  227960139.3
0.132  0.035   3.77  0.91933363   1298.132355  160089023.1
0.119  0.024   4.96  0.952725219  993.7736599  93820678.28
0.135  0.04    3.38  0.842937537  1811.377287  311703329.4
0.138  0.04    3.45  0.8858056    1544.525633  226628145.9

(b) Cumulative Recovery
β      γ       R0    R2           RMSE         RSS
0.117  0.0225  5.2   0.960294113  421.7211554  16895629.63
0.128  0.0333  3.84  0.844871707  833.5726237  66010115.31
0.129  0.0337  3.83  0.828196351  877.2313208  73105805.06
0.13   0.0355  3.66  0.816860847  905.708697   77929283.16
0.132  0.035   3.77  0.77015303   1014.652496  97804370.27
0.119  0.024   4.96  0.951107165  467.9727301  20804855.24
0.135  0.04    3.38  0.739539011  1080.113214  110831232.8
0.138  0.04    3.45  0.65147389   1249.44055   148304660.3
Table 4
Evaluation measures of the SEIR-based model using various R0 values at fixed E0 and α.

(a) Cumulative Infection
β      γ       R0    R2           RMSE         RSS
0.117  0.0225  5.2   0.892130755  1501.141173  214075358
0.128  0.0333  3.84  0.949823908  1023.814165  99578567.23
0.129  0.0337  3.83  0.950873963  1013.044636  97494646.29
0.13   0.0355  3.66  0.940970548  1110.470673  117148786
0.132  0.035   3.77  0.952724097  993.7854435  93822903.22
0.119  0.024   4.96  0.90485911   1409.796153  188814893.3
0.135  0.04    3.38  0.917363518  1313.888851  163998871.7
0.138  0.04    3.45  0.943160794  1089.674324  112802062.6

(b) Cumulative Recovery
β      γ       R0    R2           RMSE         RSS
0.117  0.0225  5.2   0.927619393  569.388968   30799360.71
0.128  0.0333  3.84  0.683428718  1190.78593   134707257.4
0.129  0.0337  3.83  0.6517672    1248.914691  148179850.9
0.13   0.0355  3.66  0.635589188  1277.59606   155063910.8
0.132  0.035   3.77  0.545645138  1426.578015  193336859
0.119  0.024   4.96  0.895267311  684.9187641  44565802.78
0.135  0.04    3.38  0.504216174  1490.198474  210965691.7
0.138  0.04    3.45  0.347030615  1710.189937  277851214
Table 5
Evaluation measures of the SEIR-based model using various R0 values at fixed E0 and α.