
A study on the efficiency of the estimation models of COVID-19.

Mohammed N Alenezi¹, Fawaz S Al-Anzi², Haneen Alabdulrazzaq¹, Ammar Alhusaini², Abdullah F Al-Anzi³.

Abstract

Today, the world is fighting a dangerous epidemic caused by the novel coronavirus, also known as COVID-19. Every country has been affected, and all are trying to recover from the social, economic, and health devastation of COVID-19. Recent epidemiology research has concentrated on using different prediction models to estimate the numbers of infected, recovered, and deceased cases around the world. This study is primarily focused on evaluating two common prediction models: Susceptible-Infected-Recovered (SIR) and Susceptible-Exposed-Infected-Recovered (SEIR). The SIR and SEIR models were compared in estimating the outbreak in order to identify the better-fitting model for forecasting future spread in Kuwait. Based on the results of the comparison, the SEIR model was selected for predicting COVID-19 infected, recovered, and cumulative cases. The data needed for estimation was collected from official sites of the Kuwait Government between 24 February and 1 December 2020. This study presents estimated values for peak dates and the expected eradication of COVID-19 in Kuwait. The proposed estimation model is simulated on the collected data using the Python programming language. The simulation was performed with various values of the basic reproduction number (between 3 and 5.2), the initial exposed population, and the incubation rate. The results show that the SEIR model was better suited than the SIR model for predicting both infection and recovery cases, with $R_0$ values ranging from 3 to 4, $E_0 = 80$, and $\alpha = 0.2$.


Keywords: Epidemiology; Mathematical Modeling

Year: 2021 PMID: 34131557 PMCID: PMC8192278 DOI: 10.1016/j.rinp.2021.104370

Source DB: PubMed Journal: Results Phys ISSN: 2211-3797 Impact factor: 4.476

Introduction

The world is now undergoing a challenging period due to an unparalleled spread of an infectious virus, named coronavirus or COVID-19. The effects from the family of coronaviruses can range from a simple cold to more dangerous forms similar to Middle East Respiratory Syndrome (MERS) and Severe Acute Respiratory Syndrome (SARS). COVID-19, also in this family, was reported on 31 December 2019 in Wuhan, China. Due to the expeditious transmission of the virus, the Chinese Government implemented several containment measures to control this outbreak from spreading, including a complete lockdown and suspension of all forms of transportation to Wuhan during late January 2020 [1]. Thailand was the second country to report cases on 13 January 2020 [2]. COVID-19 has now spread to 218 countries and territories. As of 6 December 2020, nearly 66.6 million infected cases, 42.8 million recovered cases, and 1.53 million deceased cases were reported due to COVID-19 [3]. On 30 January 2020, the total world count of confirmed cases reached 8,096. COVID-19 was declared an epidemic and a global public health emergency by the World Health Organization (WHO) [4]. With its rapid spreading becoming a serious challenge across the world, the WHO announced COVID-19 as a global pandemic on 11 March 2020 [5]. Most impacted countries imposed a full or partial lockdown for controlling the spread. One of the best measures to be implemented to reduce the possibility of spreading is social distancing. This involves people maintaining a distance of 1 meter, wearing masks covering the mouth and nose, and gloves to limit transmission. The pandemic quickly impacted various areas such as economics, education, politics, etc., along with public health all over the world. Furthermore, we saw an increase in poverty and unemployment rates. 
When considering how best to control the spread of COVID-19, the virus’s highly infectious behavior, the ambiguity in its transmission methods, its extended incubation period, the complications in detecting it, etc., must be taken into account. Most countries began joint efforts to prevent transmission of the virus and to eradicate it [6]. Pharmaceutical companies, research departments, and health departments are developing multiple vaccines and treatments to prevent COVID-19 and to support recovery from it. There are now vaccines in the later stages of development and testing. Kuwait faces a serious threat due to the novel coronavirus, and its impact is seen everywhere. COVID-19 was first detected in Kuwait on 24 February 2020; the first five cases were citizens who had returned from abroad. To limit local infection rates, the Kuwait Government imposed precautionary measures such as quarantines, banning flights to and from selected countries, closing retail shops, and announcing a public holiday from 12 March 2020 onward. A partial curfew was implemented on 22 March 2020 between 5:00 PM and 4:00 AM daily, and the curfew timings were changed twice (from 5:00 PM to 6:00 AM and again from 4:00 PM to 8:00 AM). Finally, the government imposed a complete lockdown from 10 May 2020 to 30 May 2020 [7]. In Kuwait, 142,993 individuals had been infected with COVID-19 as of 1 December 2020, of whom 138,507 had recovered, 881 had died, and 3,605 were receiving treatment. As of 1 June 2020, the population of Kuwait was approximately 4,776,407, according to data collected from the Public Authority for Civil Information (PACI) [8]. The Ministry of Health (MOH) and a small number of private hospitals and clinics are Kuwait’s primary healthcare providers. The total bed capacity is around 8,200: 7,118 beds from MOH and 1,082 from private sector hospitals [9].
It is essential to build an accurate estimation model for forecasting the impact of COVID-19 in fields such as health, society, and the economy, so that decision-makers can issue appropriate mandates to handle these uncertainties. An estimation model helps to forecast what is likely to happen in the near future. These predictions support authorities in taking precautionary and preventive measures, such as arranging sufficient care and treatment plans. Due to the nature of pandemics, it is not possible to make precise estimates. However, researchers and scientists have tried to model this pandemic using scientifically proven estimation methods. Based on the estimation results, authorities decide how best to deal with the situation at hand. The SIR and SEIR estimation models are two simple and effective compartmental models used for modeling a pandemic. The Susceptible-Infected-Recovered (SIR) model is a popular model used for forecasting a pandemic [10]. The SIR model divides the total population into three compartments: the Susceptible population, the Infected population, and the Recovered population. The Infected population is the total number of individuals who are infected with COVID-19 and are capable of spreading the virus to others. People who are susceptible to infection but not yet infected are in the Susceptible compartment. Those who are deceased or recovered are in the Recovered compartment. These are considered the three progressive phases of an epidemic. The SIR model is effective for estimating the percentage of the population who might need medical support. In the SIR estimation model, an individual who has recovered from the virus has acquired immunity and will not be reinfected. The SEIR model is an extension of the SIR model that adds a compartment labeled Exposed to the existing SIR compartments.
Those who have been exposed to infected individuals but have not yet become infectious are categorized as the Exposed population [11]. SEIR considers the entire population to be in the susceptible compartment. It also assumes that people who have recovered have acquired lifelong immunity against the virus. Here, we model the COVID-19 outbreak using the SIR and SEIR models, and analyze and compare both in terms of accuracy to find the best model for predicting the outbreak, using Kuwait as a case study. The selected model is then used to predict the COVID-19 spread in Kuwait with different basic reproduction numbers ($R_0$). For estimating the epidemic, we assumed that the country has a constant population during the estimation period (ignoring deaths, births, and migration during that period). This research is mainly done to identify the best estimation method for modeling the COVID-19 pandemic, considering the infected, recovered, and cumulative cases for various $R_0$ values, which will be explained in the upcoming sections. All information needed for this study was collected from government-authorized sites for the period of 24 February to 1 December 2020 [7]. Our study also tries to identify the best values for peak dates and the expected decline of this pandemic in Kuwait. The rest of the paper is organized as follows. The “Background” section discusses various estimation models available for modeling pandemics. In section “SEIR-based prediction”, we describe the SEIR model, a deterministic compartmental model, which we used to model the COVID-19 pandemic in this research. A comparison of the two popular deterministic compartmental models, SIR and SEIR, is performed and illustrated in section “Comparison of SIR and SEIR”. Finally, the analysis of the results and the conclusions are provided in sections “Results and discussion” and “Conclusion”, respectively.

Background

Regression models

Antonio Guterres, the UN Secretary-General, described COVID-19 as the most hazardous calamity since the Second World War. The coronavirus outbreak created a frightening global predicament. It severely impacted the daily lives of people around the world. It has also affected the economic, health, social, and political aspects of every impacted country. Most countries imposed travel bans and other restrictions; large-scale quarantines were set up all over the world in an attempt to impede the spread of COVID-19. Many researchers have modeled and forecasted the COVID-19 pandemic. Most of the studies focused on tracing the spread of COVID-19 to analyze and predict its infection rate, recovery rate, and expected eradication. For evaluating and forecasting the COVID-19 pandemic, researchers used various models such as deterministic compartmental models (DCM), agent-based models (ABM), and logistic growth models [12]. Although a variety of models are available for forecasting pandemics, statistical models produce better results. Regression-based models form one such family, and various researchers have used them in their studies. Linear regression of various orders (2nd, 3rd, and so on), Locally Weighted Linear Regression (LOESS), the Generalized Linear Model (GLM), Poisson regression, and logistic regression are all examples of regression models. Regression-based methods are distinguished mainly by the number and type of independent and dependent parameters and by the shape of the regression curve. Regression analysis is the most common method used to predict and analyze the relationship between two or more dependent and independent variables. Independent variables can be used for estimating the target or dependent variable from previous values. Regression analysis is used to analyze the relationship between those variables and to predict future values based on this analysis. The regression model can be linear or nonlinear.
If the model is linear in its parameters, then it is known as linear regression. A polynomial regression can therefore also be represented as a linear regression. The relationship between the dependent and independent variables in a second-order polynomial regression (also known as a second-order model or quadratic model) is given by Eq. (1) for one explanatory variable and by Eq. (2) for two explanatory variables, where $x$, $x_1$, and $x_2$ are explanatory variables or features, and $y$ is the dependent variable. The dependent variable $y$ may be the number of death cases, recovered cases, confirmed cases, etc. $x$ is used to represent features such as gender, region, age, number of tests conducted, etc. $\beta_0$, $\beta_1$, and $\beta_2$ are constants which represent the bias or intercept ($\beta_0$) and the slopes or weights ($\beta_1$, $\beta_2$, and so on). $\varepsilon$ represents the possible error in this model: in real-world situations, the regression model may not be able to estimate the correct target value, and $\varepsilon$ captures the noise in these relationships. In the quadratic model, the linear effect parameter and the quadratic effect parameter are $\beta_1$ and $\beta_2$, respectively. When $x = 0$, the value of $y$ gives the intercept ($\beta_0$). Eq. (3) represents the relationship between the independent and dependent variables using a third-order polynomial regression model, where $\beta_0$, …, $\beta_3$ are the coefficients of the variables. The relationship between the target and independent variables is non-linear or curvilinear in the polynomial regression model. The polynomial regression can be represented as a linear regression with multiple explanatory variables when $x^i$ is represented by $x_i$ for all $i$ (1, 2, 3, …). For example, the third-order polynomial regression equation in one independent variable given in Eq. (3) can be viewed as a linear regression model with three independent variables, as given in Eq. (4): $x$, $x^2$, and $x^3$ are rewritten as $x_1$, $x_2$, and $x_3$, respectively.
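The rewriting of a cubic model as a linear model with three explanatory variables can be illustrated with a short sketch. It assumes the standard forms $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \varepsilon$ for Eq. (3) and $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$ for Eq. (4). The helper names `solve_linear_system` and `fit_linear` and the synthetic data are hypothetical and are not taken from the study; ordinary least squares via the normal equations is used here only because it needs no external library.

```python
def solve_linear_system(a, b):
    """Solve a*z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z


def fit_linear(features, y):
    """Ordinary least squares for y = b0 + b1*x1 + ... + bk*xk."""
    rows = [[1.0] + list(f) for f in features]  # prepend the intercept column
    k = len(rows[0])
    # Normal equations: (X^T X) b = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Noise-free synthetic data from a known cubic (illustrative values only).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [2.0 + 0.5 * x - 0.3 * x**2 + 0.1 * x**3 for x in xs]

# Rewrite x, x^2, x^3 as three separate features x1, x2, x3.
cubic_features = [(x, x**2, x**3) for x in xs]
b0, b1, b2, b3 = fit_linear(cubic_features, ys)
```

Because the data are generated without noise, the fitted coefficients recover the cubic that produced them, which is the sense in which the polynomial model is "linear" in its parameters.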
A regression model that is used to analyze and predict the influence of numerous continuous independent parameters on various target parameters is referred to as the Generalized Linear Model (GLM) [13]. It can be regarded as a generalization of existing regression models such as linear or polynomial regression. GLM is not able to predict the exact value of the target from the explanatory variables; here, too, the difference is treated as a possible error. The main components of GLM are a random component, a link function g(), and a linear predictor. The conditional distribution of the target parameter is referred to as the random component. Eq. (5) represents a linear relationship between the independent and dependent parameters, which is viewed as the linear predictor, with i taking the values 1, 2, …, n. The link function describes how the mean relates to the linear predictor. A non-parametric regression model used for smoothing the regression curve or line in volatile time series is known as Locally Weighted Linear Regression (LOESS) [14]. A scatter plot is used to find the best fit to the data. The regression curve is smoothed with the help of local subsets. The first step in the LOESS method is to choose a smoothing parameter. After the parameter is chosen, the model selects the k nearest neighbors of the point to be smoothed. The LOESS algorithm is applied to each point of the data, reassigning weights to its nearest neighbors. A regression model used for predicting a discrete dependent or response parameter is known as Poisson regression [15]. This model’s main assumption is that the response variable consists of positive counts and follows the Poisson distribution. This model is mainly applicable for analyzing rates that take positive counts as values. It is similar to logistic regression, which is mainly used for calculating ratios with values between 0 and 1.
The logistic regression model is also a regression model used to predict or analyze the target or response parameter using the explanatory or independent parameters under consideration [16]. The logistic regression model is best suited for analyzing and estimating the growth of pandemic or epidemic diseases. The model assumes that an epidemic grows exponentially in the initial stage, then reaches a phase of steady increase, after which its rate of growth diminishes. The logistic regression-based epidemic model calculates the count of infected cases using Eq. (6), with $C(0) = C_0$ as the initial condition [16], where $C$, $r$, and $K$ denote the count of infected people, the infection rate, and the final epidemic size, respectively. The rate of change of infection at time t is calculated using Eq. (7) [16]. The time at which the epidemic reaches its maximum growth rate is given in Eq. (8). Eqs. (9), (10) estimate the peak count of infected cases and the maximum rate of growth at the peak. Eq. (11) is used for fitting the regression model to the actual confirmed cases of infection, that is, to the reported infection counts at each observation time.
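The logistic growth quantities described above can be sketched as follows. The sketch assumes the standard logistic solution $C(t) = K / (1 + A e^{-rt})$ with $A = (K - C_0)/C_0$, from which the peak time $\ln(A)/r$, the case count $K/2$ at the peak, and the maximum growth rate $rK/4$ follow. The function names and the parameter values are hypothetical and chosen only for illustration; they are not fitted values from the study.

```python
import math


def logistic_cases(t, c0, r, k):
    """Cumulative cases C(t) = K / (1 + A*exp(-r*t)), with A = (K - C0) / C0."""
    a = (k - c0) / c0
    return k / (1.0 + a * math.exp(-r * t))


def logistic_rate(t, c0, r, k):
    """Growth rate dC/dt = r * C * (1 - C / K) at time t."""
    c = logistic_cases(t, c0, r, k)
    return r * c * (1.0 - c / k)


def peak_time(c0, r, k):
    """Time of maximum growth rate, t_p = ln(A) / r."""
    return math.log((k - c0) / c0) / r


# Hypothetical parameters (assumptions, not results from the paper).
c0, r, k = 5.0, 0.1, 100_000.0
t_p = peak_time(c0, r, k)
cases_at_peak = logistic_cases(t_p, c0, r, k)  # analytically K / 2
rate_at_peak = logistic_rate(t_p, c0, r, k)    # analytically r * K / 4
```

In practice $r$ and $K$ would be obtained by fitting the curve to reported cumulative cases; the sketch only evaluates the closed-form expressions for given parameters.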

Deterministic compartmental models

There exist numerous models for analyzing and estimating epidemic spread. Deterministic compartmental models (DCM) are non-linear models used for modeling epidemic spread. They mainly use differential equations to model the outbreak. The most commonly used DCMs are the Susceptible Infected Recovered (SIR) model and the Susceptible Exposed Infected Recovered (SEIR) model. The Autoregressive Integrated Moving Average (ARIMA) model, a statistical time-series model, is also reviewed in this section.

Susceptible-infected-recovered (SIR) model

The SIR model assumes that the total population is the sum of three compartments, namely Susceptible (S), Infected (I), and Recovered (R), as given in Eq. (12) [17]. Many researchers have used the SIR model to analyze and estimate the spread of various diseases such as HIV and Ebola [18], [19]. In the SIR model, the total population is represented by N [10]. The Susceptible population is the subset of the total population who are healthy but at risk of becoming infected. Persons infected by the disease form the Infected population. Those who have recovered from the disease have acquired immunity and are referred to as the Recovered population. The deceased population is also counted as recovered in the SIR model [20]. The SIR model works on the assumption that the total population is constant during the period of epidemic analysis and prediction, which means that no births or deaths are considered over that duration. The model expresses the changes in the Susceptible (S), Infected (I), and Recovered (R) populations as the differential equations given in Eqs. (13), (14), (15), respectively [21], [22]. The SIR model is used to model the pandemic from some initial conditions. The initial susceptible, infected, and recovered populations are denoted by $S(0)$, $I(0)$, and $R(0)$, respectively. The infection rate, indicated by $\beta$, is the rate at which the susceptible population becomes infected per day. In contrast, the recovery rate, $\gamma$, is the rate of recovery from the infection with acquired immunity [23]. The ratio of the infection rate to the recovery rate, given in Eq. (16), is referred to as the basic reproduction number and is denoted by $R_0$.
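A minimal sketch of the SIR dynamics follows, assuming the commonly used population-normalized form $dS/dt = -\beta S I / N$, $dI/dt = \beta S I / N - \gamma I$, $dR/dt = \gamma I$ and a one-day forward Euler step. The function name `simulate_sir`, the recovery rate of 0.1, and the choice $R_0 = 3$ are assumptions made for illustration; the population figure and the five initial infections are the Kuwait values quoted elsewhere in the text.

```python
def simulate_sir(n, i0, rec0, beta, gamma, days):
    """Integrate the SIR equations with a one-day forward Euler step.

    Returns a list of (S, I, R) tuples, one per day, starting at day 0.
    """
    s, i, r = n - i0 - rec0, i0, rec0
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = beta * s * i / n  # flow from S to I
        new_recoveries = gamma * i         # flow from I to R
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


N_TOTAL = 4_776_000       # total population used in the text
GAMMA = 0.1               # hypothetical recovery rate (assumption)
BASIC_R0 = 3.0            # hypothetical basic reproduction number
BETA = BASIC_R0 * GAMMA   # from R_0 = beta / gamma

sir_history = simulate_sir(N_TOTAL, i0=5, rec0=0, beta=BETA, gamma=GAMMA, days=200)
peak_infected = max(i for _, i, _ in sir_history)
```

Every flow leaving one compartment enters another, so the sum S + I + R stays equal to N at each step, which mirrors the constant-population assumption stated above.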

Susceptible-exposed-infected-recovered (SEIR) model

The SEIR model is an advancement of the SIR estimation model which adds a new compartment, Exposed (E), to the established compartments of SIR. The SEIR model assumes that the total population is the sum of these four compartments, as shown in Eq. (17). It considers the entire population to be in the susceptible compartment [11]. Those who have been exposed to infected persons but have not yet become infectious are labeled as the Exposed population. The SEIR estimation model also assumes that the total population is constant over the entire duration of estimation. Eqs. (18), (19), (20), (21) give the rates of change of $S$, $E$, $I$, and $R$ with time in the SEIR model [24]. The rate at which exposed persons become infectious is referred to as the incubation rate, $\alpha$ [2]. $1/\gamma$ and $1/\alpha$ are referred to as the serial or infectious period and the incubation or latent period, respectively. The value of $\beta$ is calculated as the ratio of the reproduction number to the infectious period, and $\alpha$ and $\gamma$ as the reciprocals of the incubation period and the serial period, respectively, as shown in Eqs. (22), (23), (24) [25], [24]. The reciprocal of the infection rate is known as the contact period ($1/\beta$). There are several measures available for evaluating the generated estimation model. Some of them are the Residual Sum of Squares (RSS), the Coefficient of Determination ($R^2$), and the Root Mean Squared Error (RMSE). RSS estimates the error between the actual and estimated values. It is a statistical measure of the variance in the actual data values that is not explained by the generated estimation model. RSS helps to identify the optimal values of the infection and recovery rates ($\beta$ and $\gamma$) by estimating the error obtained with the selected $\beta$ and $\gamma$. Eq. (25) explains how RSS is calculated. Another statistical measure used for model evaluation is the coefficient of determination ($R^2$), which is used as a goodness-of-fit measure. It is the percentage of the variance in the target variable that is explained by the independent parameters.
It measures the strength of the relationship between the generated prediction model and the actual target variable. The value of $R^2$ is between 0 and 1 (ranging from 0 to 100%). The method for calculating $R^2$ is given in Eq. (26), where TSS refers to the total sum of squares, which is calculated as the sum of squared deviations of the target parameter from its overall mean and is given in Eq. (27). RMSE represents the standard deviation of the residuals, that is, of the differences between the actual values and the estimated data points or regression line. RMSE is calculated using Eq. (28), where $\hat{y}_i$ refers to the estimated value at point $i$ and $y_i$ is the corresponding actual value.
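The three evaluation measures can be written out directly. The sketch below assumes the usual definitions $\mathrm{RSS} = \sum_i (y_i - \hat{y}_i)^2$, $\mathrm{TSS} = \sum_i (y_i - \bar{y})^2$, $R^2 = 1 - \mathrm{RSS}/\mathrm{TSS}$, and $\mathrm{RMSE} = \sqrt{\mathrm{RSS}/n}$. It uses plain Python rather than a library so that each formula is visible; the function names and the sample numbers are hypothetical.

```python
import math


def rss(actual, estimated):
    """Residual sum of squares between actual and estimated values."""
    return sum((a - e) ** 2 for a, e in zip(actual, estimated))


def tss(actual):
    """Total sum of squares: squared deviations of actual values from their mean."""
    mean = sum(actual) / len(actual)
    return sum((a - mean) ** 2 for a in actual)


def r_squared(actual, estimated):
    """Coefficient of determination, R^2 = 1 - RSS / TSS."""
    return 1.0 - rss(actual, estimated) / tss(actual)


def rmse(actual, estimated):
    """Root mean squared error, sqrt(RSS / n)."""
    return math.sqrt(rss(actual, estimated) / len(actual))


# Hypothetical values for illustration only.
actual_cases = [1.0, 2.0, 3.0, 4.0]
estimated_cases = [1.5, 2.0, 2.5, 4.0]

example_rss = rss(actual_cases, estimated_cases)       # 0.25 + 0 + 0.25 + 0 = 0.5
example_r2 = r_squared(actual_cases, estimated_cases)  # 1 - 0.5 / 5.0 = 0.9
example_rmse = rmse(actual_cases, estimated_cases)     # sqrt(0.5 / 4)
```

Note that on data where the model fits worse than the mean of the actual values, this definition of $R^2$ can become negative, so the 0 to 1 range applies to models that fit at least as well as that baseline.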

Autoregressive integrated moving average (ARIMA) model

Another statistical model for analyzing and forecasting the future growth of time-dependent information is known as ARIMA. It is the most commonly used statistical method for estimating and analyzing changes in time-dependent data [26]. The AutoRegression model (the AR in ARIMA) is used to identify the relationship between the observed data and lagged observations. Differencing is used to make the time series stationary; it is considered a pre-processing step (Integrated). The Moving Average (MA) part models the dependency between the observed data and the residual errors of lagged observations. Lag polynomials are used in the ARIMA model as shown in Eqs. (29), (30) [27], where $p$, $d$, and $q$ should be greater than or equal to 0. The ARIMA model with $d = 0$ (ARIMA($p$, 0, $q$)) is the ARMA($p$, $q$) model; when $d$ and $q$ are equal to 0 (ARIMA($p$, 0, 0)), it is the AR($p$) model; and finally, when $p$ and $d$ are equal to 0 (ARIMA(0, 0, $q$)), it is the MA($q$) model. In almost all cases, the value of $d$ is 1 (the time series is differenced once). The ARIMA model with $p = 0$, $q = 0$, and $d = 1$ is a special case known as the Random Walk model, and the corresponding forecast is estimated using Eq. (31) [27].

Many researchers have analyzed and estimated the COVID-19 epidemic outbreak all over the world using different estimation models. They all tried to forecast the peak values and the expected ending time of this epidemic. The spread of COVID-19 in Italian regions was analyzed and estimated by Distante et al. [25]. The peak infection values and periods were estimated using the SEIR estimation model. They studied the epidemic spread and concluded that the outbreak would reach its maximum in Italy’s northern regions by the end of March and in the southern regions of Italy by the first week of April 2020. They calculated the basic reproduction number using two different methods based on daily cases and the studied duration. Their estimation was almost correct, and the outbreak started diminishing at the end of March. Peirlinck et al.
[24] performed an analysis of the COVID-19 outbreak, especially in China and the United States, to demonstrate the effectiveness of mathematical models for estimating the outbreak growth and other parameters. They also provide some guidelines for controlling the outbreak successfully. They evaluate the effects of relaxing preventive measures such as total lockdown, travel restrictions, and shelter-in-place orders for an entire or specific population, as well as the potential of vaccination. For their studies, they integrate data from the initial stages of the outbreak in the United States and China to estimate the various periods of the epidemic, such as the infectious, latent, and contact periods, and the value of the basic reproduction number. To estimate the parameters of the COVID-19 outbreaks in these two countries, they combined a global network model with an SEIR-based local epidemic estimation model. Alenezi et al. [28] used the SIR model, with various values of $R_0$, to predict peak dates for Kuwait. According to their results, Kuwait reaches its peak between July 23rd and August 22nd of 2020. They also found that the lockdown and the other preventive measures taken by Kuwait’s government have proven to be effective in reducing the number of cases. Syed and Sibgatullah [29] analyzed and estimated the COVID-19 outbreak in Pakistan using the SIR estimation model. Their analysis was based on data collected from the National Database of their country. They forecasted the peak value and the time at which COVID-19 would reach its peak in Pakistan, estimating the peak on 26 May 2020. Their conclusion was that unless the authorities imposed strict policies to control the epidemic growth, 90% of the total population would be affected by the epidemic before the last week of July. An SIR-based estimation model for the growth of the COVID-19 epidemic in Bangladesh was developed by Rahman et al. [20]. They analyzed and forecasted the spread of coronavirus.
They studied and analyzed the impact of various preventive measures, such as social distancing, imposed by their government to control the outbreak. They forecasted a final size of infection in their country of 3,782,558 and predicted that the epidemic would reach its peak on the 92nd day. Their study concluded that social distancing has an effective impact on controlling the epidemic’s spread, and that strict social distancing is one of the best measures to control the epidemic’s growth. Batista [30] analyzed and estimated the COVID-19 epidemic spread in China, South Korea, and the rest of the world using an SIR-based estimation model. The aim of the study was to estimate the final size of the outbreak in these regions. He forecasted these estimates using both the SIR and logistic models and evaluated his model using the $R^2$ score. He et al. [31] proposed an SEIR-based model for analyzing and forecasting COVID-19 under some control measures, including quarantine and hospitalization. They modeled the epidemic using information collected from Hubei Province. They used a particle swarm optimization algorithm to identify the various parameters of the proposed model. Their study found that the parameters may change depending on the scenario. They suggested that quarantine and treatment are the best methods for controlling the epidemic. Lounis and Azevedo [32] modeled COVID-19 in Algeria using the classical and generalized SEIR models. They tried to forecast 100 days ahead based on the official confirmed cases in Algeria between April 2020 and early August 2020. They forecasted the counts of cumulative infections and deaths up to November 2020.

A model’s suitability for prediction depends on the problem at hand. Compartmental models such as SIR and SEIR are widely used to predict a pandemic’s spread because they are deterministic models and can work easily with a large population size.
They can also be used to analyze the effect of various control strategies imposed by the authorities. The ARIMA model, on the other hand, handles outliers efficiently and can be used for both seasonal and non-seasonal data. The accuracy of ARIMA depends on how the observed data (training set) and/or the parameters are modeled. For the purpose of this study, the SIR and SEIR models were chosen.

SEIR-based prediction

The SEIR model is an extension of the SIR model used to analyze and forecast an epidemic outbreak. The main parameters of the SEIR-based estimation model are the incubation rate ($\alpha$), the infection rate ($\beta$), and the recovery rate ($\gamma$). Globally, almost 218 countries have been affected by COVID-19. This research aims to compare the SIR and SEIR estimation models using the numbers of infection and recovery cases between 24 February 2020 and 28 May 2020, and to forecast and model the COVID-19 epidemic using the better model. The Python programming language is used for simulating the SEIR-based estimation model. Python provides a vast number of predefined modules or tools for a wide variety of applications. The Python tools or modules mainly used for modeling the COVID-19 outbreak are Matplotlib, math, xlsxwriter, xlrd, and sklearn [33]. Matplotlib is a Python plotting library mainly used to create static, animated, or interactive figures. All the graphs used here are plotted using Matplotlib. The generated estimation model is evaluated using different measures such as RSS, RMSE, and $R^2$. The RMSE and $R^2$ measures are computed using the sklearn module, which provides an effective platform for machine learning. The sklearn module is built on SciPy, Matplotlib, and NumPy. The development environment used for developing the Python-based SEIR model is Python’s Integrated Development and Learning Environment (IDLE). The data required for modeling the COVID-19 outbreak in Kuwait is collected mainly from authorized sources such as the Kuwait Government’s official websites related to COVID-19 [7] and the WHO [3]. Fig. 1 depicts the collected data on confirmed daily infections and recoveries in Kuwait.

Fig. 1

Kuwait daily cases of infection and recovery from 24th February 2020 to 28th May 2020 [7].

The actual values of the infection rate ($\beta$) and the recovery rate ($\gamma$) are calculated per day (time) from the collected data and are shown in Figs. 2 and 3. The recovery and infection rates are fractions of the population already infected: they express the newly (daily) recovered and newly infected populations as a percentage of that population. For example, a recovery rate of 0.15 means that 15% of the population infected at time t recovers at time t. Eqs. (32), (33) estimate the $\beta$ and $\gamma$ values for any time t.
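One possible reading of this description is sketched below: each day's rate is that day's new infections (or new recoveries) divided by the number of currently infected individuals on the same day. This is an interpretation of the prose, not a transcription of Eqs. (32), (33), and the function name `daily_rates` and the sample counts are hypothetical.

```python
def daily_rates(new_infected, new_recovered, active):
    """Return per-day infection and recovery rates as fractions of active cases.

    new_infected, new_recovered: daily counts of new infections and recoveries.
    active: number of currently infected individuals on each day.
    Days with no active cases get a rate of 0.0 to avoid division by zero.
    """
    betas, gammas = [], []
    for infected, recovered, current in zip(new_infected, new_recovered, active):
        if current > 0:
            betas.append(infected / current)
            gammas.append(recovered / current)
        else:
            betas.append(0.0)
            gammas.append(0.0)
    return betas, gammas


# Hypothetical three-day sample, not data from the Kuwait records.
betas, gammas = daily_rates(
    new_infected=[10, 20, 30],
    new_recovered=[0, 15, 30],
    active=[50, 100, 200],
)
# Day 2 in this sample: 15 recoveries out of 100 active cases gives 0.15,
# matching the worked example of a recovery rate of 0.15 in the text.
```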

Fig. 2

The infection rates from 24th February 2020 to 28th May 2020.

Fig. 3

The recovery rates from 24th February 2020 to 28th May 2020.

The latent or incubation period is between 1 and 14 days. It is difficult to find the exact value of the incubation period, so the study is conducted with various values of the latent period and hence of the incubation rate. The estimated values of $\alpha$ are 1/4, 1/5, 1/6, 1/12, and 1/13, with various exposed population values (47, 80, 94, and 477). Eqs. (34), (35), (36), (37) estimate the values of the four main compartments $S$, $E$, $I$, and $R$ of the SEIR estimation model at any time $t$ as the sum of their values at time ($t - 1$) and the rates of change of these values calculated using Eqs. (18), (19), (20), (21), respectively [34]. The cumulative infection and recovery counts are calculated from the collected information about actual infection and recovery cases in Kuwait. These values are studied and compared with the forecasted values of both infection and recovery using the $R^2$ score, RSS, and RMSE measures. The initial values of all compartments are set based on the confirmed cases from the first day of the outbreak reported in Kuwait. The total population is taken here as 4,776,000. Apart from the initially exposed and infected individuals, the entire population is considered to be susceptible. The initial value of the exposed population cannot be determined precisely. This study is therefore conducted with some assumed values for the exposed population: 47, 80, 94, and 477. Based on these assumed values, the susceptible population varies accordingly: 4,775,948, 4,775,915, 4,775,901, and 4,775,518 (in each case the total population minus the exposed population and the five infected individuals). The initial values of the infected population and the recovered population are 5 and 0, respectively. The ratio of the infection rate to the recovery rate is referred to as the basic reproduction number $R_0$. The transmission rate of an outbreak is determined by the value of $R_0$. An epidemic’s growth is determined by $R_0$; it may or may not develop into an outbreak in that country or in the global population.
The value of is less than 1, then it will not become an outbreak and diminish suddenly. Otherwise, it will be emerging exponentially and severely affected a significant percentage of the total population [22], [35]. The value of is used to identify the number of infections to be expected during the initial stages of the epidemic [35]. The infected person makes contacts on average and he or she is recovered within days based on [36]. This research is performed based on different values of and .
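The time-stepping update described above (new compartment value = previous value + rate of change) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes the standard textbook SEIR rate equations, a one-day step, and a hypothetical recovery rate `GAMMA`, none of which are confirmed by this section. The function name `simulate_seir` and its parameters are likewise illustrative. The population, initial exposed, and initial infected values are the ones stated in the text.

```python
from typing import List, Tuple

State = Tuple[float, float, float, float]  # (S, E, I, R)


def simulate_seir(population: float, exposed0: float, infected0: float,
                  recovered0: float, r_naught: float, alpha: float,
                  gamma: float, days: int) -> List[State]:
    """Forward-Euler SEIR run with a one-day time step (textbook form)."""
    beta = r_naught * gamma  # follows from R0 = beta / gamma
    # Everyone who is not exposed, infected, or recovered starts as susceptible.
    s = float(population - exposed0 - infected0 - recovered0)
    e, i, r = float(exposed0), float(infected0), float(recovered0)
    history = [(s, e, i, r)]
    for _ in range(days):
        # Daily flows between compartments, all computed from the previous state.
        new_exposed = beta * s * i / population
        new_infectious = alpha * e
        new_recovered = gamma * i
        # New value = previous value + rate of change (step of one day).
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        history.append((s, e, i, r))
    return history


GAMMA = 0.1  # hypothetical recovery rate, NOT taken from the study

# Values from the text: N = 4,776,000, E0 = 80, I0 = 5, recovered = 0, alpha = 0.2.
growing = simulate_seir(4_776_000, 80, 5, 0, r_naught=3.5, alpha=0.2,
                        gamma=GAMMA, days=100)
# A reproduction number below 1, for contrast: the outbreak dies out.
fading = simulate_seir(4_776_000, 80, 5, 0, r_naught=0.8, alpha=0.2,
                       gamma=GAMMA, days=100)

print("initial susceptible:", growing[0][0])
print("infected after 100 days, R0=3.5:", round(growing[-1][2]))
print("infected after 100 days, R0=0.8:", round(fading[-1][2]))
```

The first printed value reproduces the initial susceptible population of 4,775,915 listed for an exposed population of 80. The two runs contrast the R0 > 1 and R0 < 1 regimes discussed in the text.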

Comparison of SIR and SEIR

A comparison between the SIR and SEIR models was conducted for infection and recovery cases. Fig. 4, Fig. 5, Fig. 6, Fig. 7 show how well the estimated models fit the actual values of infection and recovery. For the estimation of both infection and recovery, the SEIR model outperforms the SIR model.

Fig. 4

Comparison of actual and estimated cumulative infection between 24th February 2020 and 28th May 2020 using SIR model.

Fig. 5

Comparison of actual and estimated cumulative infection between 24th February 2020 and 28th May 2020 using SEIR model.

Fig. 6

Comparison of actual and estimated cumulative recovery between 24th February 2020 and 28th May 2020 using SIR model.

Fig. 7

Comparison of actual and estimated cumulative recovery between 24th February 2020 and 28th May 2020 using SEIR model.

The emerging COVID-19 outbreak is modeled using the SIR and SEIR models for various R0 values. The values estimated using the SEIR model are close to the actual reported numbers, especially for the recovery cases. On the evaluation measures for both models shown in Fig. 8, Fig. 9, Fig. 10, the SEIR model gives a more accurate estimate of both infection and recovery. The SEIR model with parameters E0 = 80 and α = 0.2 surpasses the SIR model for modeling the outbreak.
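The evaluation measures used for this comparison can be written out in a few lines of Python. The sketch below uses the usual definitions of the residual sum of squares (RSS), the root mean squared error (RMSE), and the coefficient of determination (R²). That R² is the "score" meant in the text is an assumption. The input numbers are made-up toy values for illustration only, not Kuwait data.

```python
import math


def rss(actual, predicted):
    """Residual sum of squares: sum of squared prediction errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared error."""
    return math.sqrt(rss(actual, predicted) / len(actual))


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - RSS / total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    total_ss = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - rss(actual, predicted) / total_ss


# Toy series (illustrative only, not observed case counts).
actual = [10, 20, 30, 40]
predicted = [12, 18, 33, 39]

print("RSS :", rss(actual, predicted))
print("RMSE:", round(rmse(actual, predicted), 4))
print("R2  :", round(r2_score(actual, predicted), 4))
```

Lower RSS and RMSE, and an R² closer to 1, indicate a closer fit between the estimated and actual series.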

Fig. 8

Evaluation results of SEIR and SIR based on R² values.

Fig. 9

Evaluation results of SEIR and SIR based on RMSE values.

Fig. 10

Evaluation results of SEIR and SIR based on RSS values.

Based on the comparison performed in section “Comparison of SIR and SEIR”, the SEIR estimation model outperforms the SIR model for predicting both infection and recovery. Hence, the SEIR model is selected for forecasting the future values of infection and recovery, the peak dates, and the peak values. The actual values of both infection and recovery cases are collected from recognized sources such as the official websites of the Kuwait Government [7]. Fig. 11, Fig. 12 illustrate the actual daily infection and recovery counts collected from the government-authorized sources from 29 May 2020 to 1 December 2020. The daily infection count decreases gradually, owing to the government’s preventive measures and the social distancing guidelines followed by residents.

Fig. 11

The daily cases of infection from 29th May 2020 to 1st December 2020.

Fig. 12

The daily cases of recovery from 29th May 2020 to 1st December 2020.

Fig. 13, Fig. 14 depict the daily infection and recovery rates from 29 May 2020 to 1 December 2020. The infection rate, and hence the daily count, has decreased. Moreover, due to the successful control of COVID-19, the recovery rate has increased.

Fig. 13

The infection rates from 29th May 2020 to 1st December 2020.

Fig. 14

The recovery rates from 29th May 2020 to 1st December 2020.

The SEIR model has shown better results overall for predicting both infection and recovery cases. The basic reproduction number R0 has decreased from its initial values, which suggests that the government’s preventive measures have been successful.

Results and discussion

The SEIR-based estimation is performed on the confirmed cases between 24 February 2020 and 1 December 2020. The first five COVID-19 cases in Kuwait were reported on 24 February 2020. The infection and recovery rates changed over time. In the early stages, the infection rate and the number of infections increased slowly and then decreased, while the recovery rate and the number of recovered cases increased slowly. The main problem encountered when using the SEIR model for COVID-19 in Kuwait is the lack of a method for measuring the exact value of the initial exposed population. This research therefore assumes values for the initial exposed population and the incubation rate. It is performed with various values of R0, E0, and α. The values of E0 are 47, 80, 94, and 477, and the values of α are 1/4, 1/5, 1/6, 1/12, and 1/13. Of these, E0 = 80 and α = 0.2 gave better performance than the other values. The forecasted cumulative infection and recovery using the SEIR model with E0 = 80 and α = 0.2 for different R0 values are illustrated in Fig. 15, Fig. 16.
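Selecting the initial exposed population and the incubation rate from the candidate values listed above amounts to a small grid search. The sketch below shows the mechanics under several assumptions that are not from the study: a textbook forward-Euler SEIR model, a hypothetical recovery rate `GAMMA`, a fixed `R_NAUGHT` of 3.5, cumulative infections taken as I(t) + R(t), and RMSE as the selection criterion. Because the reported case series is not reproduced here, the "observed" series is synthetic: it is generated by the same model purely so the example runs.

```python
import math
from itertools import product

POPULATION = 4_776_000  # total population stated in the text
INFECTED0 = 5           # first reported cases, per the text
GAMMA = 0.1             # hypothetical recovery rate (assumption)
R_NAUGHT = 3.5          # one value inside the 3 to 4 range discussed


def cumulative_infections(exposed0, alpha, days):
    """Return I(t) + R(t) for each day of a one-day forward-Euler SEIR run."""
    beta = R_NAUGHT * GAMMA
    s = float(POPULATION - exposed0 - INFECTED0)
    e, i, r = float(exposed0), float(INFECTED0), 0.0
    series = [i + r]
    for _ in range(days):
        new_exposed = beta * s * i / POPULATION
        new_infectious = alpha * e
        new_recovered = GAMMA * i
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        series.append(i + r)
    return series


def rmse(actual, predicted):
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def grid_search(observed, exposed_grid, alpha_grid):
    """Score every (E0, alpha) pair and return the one with the lowest RMSE."""
    days = len(observed) - 1
    scored = [(rmse(observed, cumulative_infections(e0, a, days)), e0, a)
              for e0, a in product(exposed_grid, alpha_grid)]
    best_rmse, best_e0, best_alpha = min(scored)
    return best_e0, best_alpha, best_rmse


EXPOSED_GRID = [47, 80, 94, 477]              # candidate E0 values from the text
ALPHA_GRID = [1/4, 1/5, 1/6, 1/12, 1/13]      # candidate alpha values from the text

# Synthetic stand-in for the observed series (NOT real case data).
observed = cumulative_infections(80, 1/5, 60)

best_e0, best_alpha, best_rmse = grid_search(observed, EXPOSED_GRID, ALPHA_GRID)
print("best E0:", best_e0, "best alpha:", best_alpha, "RMSE:", best_rmse)
```

With real data, `observed` would be replaced by the reported cumulative counts, and the same loop could be repeated for each R0 value of interest.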

Fig. 15

The forecasted cumulative infection highlighting the rise, peak, and predicted decline of COVID-19 using the SEIR model with various R0 values, E0 = 80, and α = 0.2.

Fig. 16

The forecasted cumulative recovery highlighting the rise, peak, and predicted decline of COVID-19 using the SEIR model with various R0 values, E0 = 80, and α = 0.2.

The preventive measures issued by the Kuwait Government to control the growth of COVID-19, such as quarantine, curfew, and travel and entry restrictions, largely came to fruition. The number of confirmed cases decreased gradually, and recovered cases increased. Precautionary measures such as a full or partial curfew, travel bans, social distancing, wearing masks and gloves, and using hand sanitizer had a positive effect on the daily infection count, which is gradually decreasing. The preventive measures controlled the growth of COVID-19 in Kuwait to some extent. The study is performed with various R0 values and different values of the initial exposed population and incubation rate. Based on the evaluation over these values, E0 = 80 and α = 0.2 give better results. Fig. 17, Fig. 18, Fig. 19, Fig. 20 illustrate the forecasted growth of infection and recovery for various values of E0 and α using various R0 values.
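How a peak date can be read off a forecast for several R0 values is sketched below. It is an illustration under stated assumptions rather than a reproduction of the study's forecasts. It uses a textbook forward-Euler SEIR model, a hypothetical recovery rate `GAMMA`, the E0 = 80 and α = 0.2 setting from the text, and no change in behaviour or containment measures over time. The dates it prints therefore should not be read as the paper's results.

```python
from datetime import date, timedelta

START = date(2020, 2, 24)  # first reported cases in Kuwait, per the text
POPULATION = 4_776_000     # total population stated in the text
GAMMA = 0.1                # hypothetical recovery rate (assumption)


def infected_curve(r_naught, exposed0=80, infected0=5, alpha=0.2, days=400):
    """Active infections I(t) from a one-day forward-Euler SEIR run."""
    beta = r_naught * GAMMA  # from R0 = beta / gamma
    s = float(POPULATION - exposed0 - infected0)
    e, i = float(exposed0), float(infected0)
    curve = [i]
    for _ in range(days):
        new_exposed = beta * s * i / POPULATION
        new_infectious = alpha * e
        new_recovered = GAMMA * i
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        curve.append(i)
    return curve


def find_peak(curve):
    """Return (day index, value) of the maximum of the curve."""
    peak_day = max(range(len(curve)), key=lambda d: curve[d])
    return peak_day, curve[peak_day]


peaks = {}
for r_naught in (3.0, 3.5, 4.0):
    day, value = find_peak(infected_curve(r_naught))
    peaks[r_naught] = (day, value)
    print(f"R0={r_naught}: peak on {START + timedelta(days=day)} "
          f"with about {value:,.0f} active infections")
```

In this simplified setting, a larger R0 produces an earlier and higher peak, which is why forecasts are reported separately for each R0 value.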

Fig. 17

Forecasted infection and recovery for various R0 values with a fixed E0 and α.

Fig. 18

Forecasted infection and recovery for various R0 values with a fixed E0 and α.

Fig. 19

Forecasted infection and recovery for various R0 values with a fixed E0 and α.

Fig. 20

Forecasted infection and recovery for various R0 values with a fixed E0 and α.


Conclusion

Analyzing and forecasting the spread of an outbreak while it is happening is essential to help authorities determine the precautions and containment measures needed to control transmission of the disease. Several mathematical and compartmental models have been commonly used in epidemiology research to model the spread of epidemics, including HIV, Ebola, and COVID-19. Factors such as population size and the purpose of the prediction can affect a model’s suitability. The main focus of this study was to analyze and compare the SIR and SEIR models and find the more suitable one for forecasting future values, while taking into consideration the impact of the preventive measures implemented by the Kuwait Government. We evaluated and compared the performance of both models and selected the SEIR model as the more suitable one, based on data collected from Kuwait government-authorized sources for the period of 24 February 2020 to 1 December 2020. The Python programming language was used to simulate the SEIR model for various values of the basic reproduction number R0, the initial exposed population E0, and the incubation rate α. The results showed that the SEIR model fits the infection and recovery cases for R0 values ranging from 3 to 4, an incubation rate of 0.2, and an initial exposed population of 80. The estimation models were evaluated using accuracy measures such as R², RMSE, and RSS. It should be noted that the data collected for analyzing and estimating the spread does not account for external factors that might influence the number of infection and recovery cases. The results of our study indicate that containment measures such as travel restrictions and lockdowns helped control the spread of COVID-19. Other precautionary measures, such as social distancing and wearing masks, have also helped curb the spread.

Declaration of Competing Interest

We, the authors, declare that we have no conflict of interest in this research study.

Table 1