Olusegun Michael Otunuga1. 1. Department of Mathematics, Marshall University, One John Marshall Drive, Huntington, WV, USA.
Abstract
We derive the time-dependent probability distribution for the number of infected individuals at a given time in a stochastic Susceptible-Infected-Susceptible (SIS) epidemic model. The mean, variance, skewness, and kurtosis of the distribution are obtained as a function of time. We study the effect of noise intensity on the distribution and later derive and analyze the effect of changes in the transmission and recovery rates of the disease. Our analysis reveals that the time-dependent probability density function exists if the basic reproduction number is greater than one. It converges to the Dirac delta function on the long run (entirely concentrated on zero) as the basic reproduction number tends to one from above. The result is applied using published COVID-19 parameters and also applied to analyze the probability distribution of the aggregate number of COVID-19 cases in the United States for the period: January 22, 2020-March 23, 2021. Findings show that the distribution shifts concentration to the right until it concentrates entirely on the carrying infection capacity as the infection growth rate increases or the recovery rate reduces. The disease eradication and disease persistence thresholds are calculated.
We derive the time-dependent probability distribution for the number of infected individuals at a given time in a stochastic Susceptible-Infected-Susceptible (SIS) epidemic model. The mean, variance, skewness, and kurtosis of the distribution are obtained as a function of time. We study the effect of noise intensity on the distribution and later derive and analyze the effect of changes in the transmission and recovery rates of the disease. Our analysis reveals that the time-dependent probability density function exists if the basic reproduction number is greater than one. It converges to the Dirac delta function on the long run (entirely concentrated on zero) as the basic reproduction number tends to one from above. The result is applied using published COVID-19 parameters and also applied to analyze the probability distribution of the aggregate number of COVID-19 cases in the United States for the period: January 22, 2020-March 23, 2021. Findings show that the distribution shifts concentration to the right until it concentrates entirely on the carrying infection capacity as the infection growth rate increases or the recovery rate reduces. The disease eradication and disease persistence thresholds are calculated.
Entities:
Keywords:
COVID-19; Hypergeometric; Infection; Kummer; Laguerre function; Probability density function; Stochastic differential equation; Whittaker function
Several advances have been made in the field of mathematical modeling in studying the behavior of certain dynamical systems whose state evolves with time [1], [10], [23], [38], [43], [44], [49], [51], [54], [56], [57]. Most of these dynamical processes in nature are far from equilibrium and so using stationary distribution to describe and analyze the evolution of such processes has a great limitation [8], [9], [30], [55]. Although studying the evolution of such random dynamical systems using time-dependent probability density function is challenging, significant analysis and estimates about the process can be deduced from such study. The exact solutions in terms of probability density function of dynamical systems can only be obtained for a restricted class of dynamical systems [12], [13], [21], [31], [41].Apart from Fokker-Planck equation (FPE), a partial differential equation that describes the time evolution of probability density function, there are other methods such as the moment or cumulant equations approach [17] that can be used to analyze the distribution of nonlinear systems under random perturbations. Because of the complexity of solving partial differential equations, many approximate methods have been developed numerically to solve the FPE equation for situations far from equilibrium. A simple thermodynamically consistent matrix numerical method (MNM) is developed by Holubec et al. [26] for solving over-damped FPEs with time-dependent coefficients. Paola at al. [17] proposed an approximate method called the Taylor moments for the probabilistic description of response of nonlinear system driven by a stochastic process. Other methods such as the stochastic linearization (SL) method [53], the equivalent non-linear equations method (ENLE) [14], the moment differential equation method (MDE) Monte Carlo simulation using a non-Gaussian closure approximation [40], the path integral approach, perturbation technique and Galerkin method [32], [35], [60] have been developed for approximating the time-dependent probability density of some general first-order and second-order non-linear stochastic systems.In this work, we develop the time-dependent probability density function for the number of infections of certain diseases satisfying a particular stochastic SIS epidemic model. The state transition probability density function for the dynamic process is governed by the FPE equation [19], [58]. The result obtained is used to analyze the distribution of the aggregate daily infection count of COVID-19 cases for the United States collected from the Center of Disease Control and Prevention (CDC) for the period: 01/22/2020-03/23/2021.The organization of the paper is as follows. In Section 2, we formulate a stochastic differential equation (SDE) used in modeling the number of infection at a given time by extending the well known deterministic SIS epidemic model with vital dynamics to a stochastic case. The existence and uniqueness of the solution of the SDE is also discussed. The threshold under which the disease dies out using the SDE model is obtained and analyzed. In Section 3, we derive the closed form time-dependent probability density function for the number of infection at a given time for the stochastic SIS model. Properties of the distribution, namely, mean, variance, skewness, kurtosis functions are also derived in Section 3. Special cases of the distribution are also discussed in this section. Numerical results are carried out in Section 4 to verify our claim. Discussion and summary of the work done are presented in Section 5.
Formulation of model
Our main focus in this work is on the derivation of time-dependent probability density function for the number of infected individuals following the general stochastic SIS model. To do this, we first study the SIS deterministic epidemic modelwhere and represent the population of infected and susceptible individuals, respectively,
and is the recruitment rate, is the transmission rate, is the natural death rate and is the temporary recovery rate. To maintain a constant population, we assume . With this, the population size remains constant over time so that . This reduces the model governing in (2.1) to the formThe model (2.1) has been studied extensively by Brauer et al. [11], Hethcote and Yorke [25], Lajmanovich and Yorke [33] and Nold [45].Many factors such aspneumonia seasonality, mobility, testing rates, mask use per capital weather [27], social behavior, strain-specific factors [42], and public health intervention [34] can act as external fluctuations affecting the contact/transmission rate of the disease. For this reason, we assume the contact/transmission rate of the disease is affected by environmental perturbations, such as stated above, changing the infection transmission rate to the formwhere each infected individual makes infectious contacts with each other in the interval
is a standard Wiener process on a filtered probability space (), the filtration function is right-continuous and each with contains all -null sets in . By substituting (2.3) into (2.1), we reduce the deterministic epidemic model (2.1) to the Stratonovich stochastic modelwhere denotes the Stratonovich integral [2]. We used the Stratonovich calculus instead of the Itô calculus following the arguments made in [7], [37]. We also assume the initial process is measurable and independent of . The solution of (2.4) with Itô calculus has been studied extensively by Gray [22]. In their work, Dalal et al. [16] showed that the introduction of stochastic noise can stabilize system that is unstable. In this work, we are interested in studying how the system behaves in the presence of noise in the transmission rate of the disease. This behavior is studied by analyzing the time-dependent probability density function of the process and . The stochastic differential equation has a unique positive solution [22], [39], [47], [59] in the feasible regionfor all with probability one.The model governing in (2.4) reduces to the Itô formDefineThe number is referred to the reproduction number for (2.1). This is the average number of infection cases produced by an infectious individual when introduced into a completely susceptible population. The number is the stochastic version of for the model (2.6).It was shown in Méndez et al. [39] and Otunuga [48] that the boundaries and of (2.6) are unattainable at all times if regardless of the noise intensity. Following a similar Theorem in [22], we show in Theorem 1 that the disease will die out exponentially almost surely on the long run in the presence of noise if .For any given initial condition
the solution
of the stochastic differential Eq.
(2.6)
tends to zero exponentially almost surely if
.It follows from (2.6) thatIf we have from (2.5) thatThe result follows from the fact thatusing the large number theorem for martingales [24]. □We see in Theorem 1 that replacing the condition with the condition for the disease to die out exponentially almost surely is also valid. We note here that epidemic advances can still grow initially, leading to transient epidemic advance ifA numerical result is derived later in Section 4 to confirm Theorem 1 by showing that the stationary probability distribution of concentrates entirely on zero if (which also implies ). Likewise, it can be shown that the stochastic differential Eq. (3.2) has a unique stationary distribution if . The proof of this is similar to the proof given in Theorem 6.2 of the work of Gray et al. [22], so we omit the proof here. We give the closed form representation of the time-dependent and stationary probability density function for in the next section and confirm that these distributions exist if .It was shown in Otunuga [48] that the solution of the system (2.4) is obtained aswhereandLet be the probability density for the process satisfying (2.6) at time given initial point . In his work, Otunuga [48] derived the probability density function for the specific case where the resulting Fokker Planck differential equation for has only discrete eigen-values and satisfyingis a non-negative integer. In this work, we extend the work of Otunuga [48] to include both discrete and continuous eigen-values.
Derivation of time-dependent probability distribution for the general stochastic SIS epidemic model
Using the transformationwe reduce (2.4) to the formBy setting the stochastic differential equation governing in (3.2) reduces to the Stratonovich stochastic differential equationThe Itô equivalent of (3.3) is given byUsing the change of state variablein (3.4), we obtain the Itô stochastic differential equationwhere and are defined in (2.10).Let be the probability density function for the process satisfying (3.6) at time given initial point and let denotes the corresponding limiting or stationary distribution. We give the closed form expression for and in the following theorem.Assume
and
. Then the solution
satisfying
(3.6)
has a unique stationary and time-dependent probability density function
and
respectively, obtained as
where
is the Laguerre polynomial
is the Pochhammer symbol
, and
is the Whittaker function
.It follows from (3.6) and the Fokker Planck equation that satisfieswith boundary conditionWe assume a solution of the formwhere is a constant. The stationary density function corresponding to (3.7) is obtained asprovided thatWe convert (3.7)-(3.8) to Sturm-Liouville equationby substituting . By substituting (3.10) into (3.12), we see that the differential equation in (3.12) reduces toUsing the transformationwe further reduce (3.13) into a differential equation of the formwhereThe differential Eq. (3.15) is the well known Whittaker differential equation (see Section 13.14 of [46]) with solutionwhere
are constants and ; are the Whittaker functions [5], [46]. It now follows from (3.14) that the function satisfying (3.12) is obtained asUsing relations 6.9.2 and 6.9.4 in Bateman [5], the Sturm-Liousville equation (3.12) has eigenvalueswith corresponding eigenfunctionwhere
is the Laguerre polynomial of degree and is a normalizing factor obtained asThe Sturm-Liouville equation also has continuous range of eigenvaluewith corresponding eigenfunctionwhere is the Kummer function [46], is the imaginary unit, and is a constant derived using the orthogonal conditionwhere is the Dirac delta function. According to Szmytkowski [52], the functions and with are orthogonal on the positive real semi-axis with the weight in the senseIt follows immediately from (3.25) and (3.26) thatThe principal solution is obtained as
□We note here that the distribution is the Gamma distribution with shape parameter . Since and it follows from (3.27) thatUsing the fact that the distribution is the Gamma distribution, it follows immediately that the limiting mean, mode, variance, skewness, kurtosis function of are obtained asrespectively.Using the following integrals regarding Laguerre polynomial (these follows from the Chu-Vandermonde identities (see [46] Section 16.1.1)),where is the Pochhammer’s symbol, it can be shown that andwhere is the Hypergeometric regularized function [46]. □In order to confirm the validity of the solution (3.27), we show that the solution corresponds to an initial value . That is, for any real valued continuous function we have . To do this, we show that for
. We use (3.29) and equation (44) in Becker [6] to show that
□Let and be the time-dependent probability density function for the number of infected and susceptible individuals at time given initial points and respectively. Also, letand be the corresponding stationary density function. We give the closed form expression for
and in the following theorem.Assume
and
. Then the solution
given in
(2.8)
and satisfying
(2.4)
has a unique stationary and time-dependent probability density function
respectively, obtained asUsing the transformationin (3.5), we obtain the probability stationary function and the density function of the number of infection at a given time time with initial condition asandLet . Since it follows that the density probability function of the number of susceptible individuals at time with initial condition can be obtained as
Eq. (3.30) follows using the transformation (3.1)
□The condition given in (3.10) is equivalent to . It follows that the probability distribution and do not exist if .Following similar approach in Remark 1, it can be shown that and . Also, using the substitution it can be shown that .
Properties of the stationary distribution
One can easily check thatThe cummulative density function, for the distribution in (3.30) is obtained aswhere is the lower incomplete gamma distribution.The mean and variance, denoted and respectively, of the distribution are obtained aswhere is the upper incomplete gamma function and is the Kummer/confluent hypergeometric function [46]. We see later in (3.40) that these results correspond to the limiting mean and variance of the distribution . It follows from (3.35) that the median is the number that satisfies the equationThe following theorem shows interval where the stationary density function is decreasing and increasing.DefineIf
then
and the probability stationary distribution
is increasing and decreasing on the interval
and
respectively, with maximum value attained at
.If
and
then
is increasing and decreasing on the interval
and
respectively. In this case,
has local maximum at
but no absolute maximum value on
.If
then
is decreasing on the interval
.We note that on and on . If then
. Hence, attains its maximum value at and (i) follows. Likewise, if and then and (ii) follows. Finally, if then on and (iii) follows. □
Special case where
The condition is equivalent to . For this case, reduces toIf are samples of independent identically distributed random variables, the maximum likelihood estimate of is
Properties of the density function
The th moment, denoted of the distribution at time is obtained aswhere
and withusing identities (13.14.5), (13.6.19) and (13.16.6) in [46]. The mean variance skewness and kurtosis of the distribution can easily be calculated from (3.37). The limiting th moment, is obtained aswhere is defined in (3.21) and is the Kummer/Congruent Hypergeometric function[46]. It follows immediately from (3.39) thatwhere is the incomplete gamma function. The values obtained in (3.40) are the mean, variance, skewness, and kurtosis, respectively, of the stationary density function in (3.30). This is confirmed numerically in Section 4.
Special cases
In this section, we briefly discuss few interesting cases of the expression for given in (3.30) involving particular values for the parameters
and .
Case for small
If then is approximatelywith .
Case for large
The constant is called the harvesting effort of the system (2.4). It follows from (3.11) that . For large so that it follows from (2.8) that . That is, large value of leads the infected population to extinct. Also, this condition is equivalent to so that and the probability density function reduces to that in (3.30). This is verified numerically in Section 4.
Case for large noise intensity
As the noise intensity in the transmission rate of disease increases (that is, as ), we see that
and the probability distribution converges to as and . This effect is shown numerically in Section 4.
Numerical simulation
In this section, numerical simulation is carried out to verify our claim. First, we use published and assumed COVID-19 epidemiological parameters in Subsection 4.1 to confirm our results. The properties of the distribution are investigated numerically in Subsection 4.2. In Subsection 4.3, the obtained time-dependent and stationary probability density function are used to analyze the distribution of aggregate number of COVID-19infection cases in the United States.
Analysis using published data
We apply the distribution using published and assumed parameters. According to recent survey conducted by CDC1
on people with mild COVID-19 cases, one-third did not return to normal health within two to three weeks of testing positive, and recovery takes six weeks or more for those with severe cases. For this reason, we set to be in the range for our analysis. The CIA World Factbook2
reported that the current population of the United States as at July 2020 is 332,639,102. So, we set unit representing 332,639,102 individuals. The total number of infected cases3
as of November 30, 2020 is 13386251. So is about and increasing. For this reason, we suggest . Also, the annual U.S. birth rates for the year 2020 is 12.4 births per 1000 population, while the death rate is 8.3 death per 1000 population and life expectancy is 80.3 years4
. Therefore, we set to be in the range . The parameters used are associated with COVID-19 data and are given in Table 1
below. A MATHEMATICA program that evaluates the Whittaker and Laguarre functions and performs the integration in (3.32) is available from the author upon request.
Table 1
Parameter values associated with COVID-19 data.
Parameter
Description
Value
Source
γ
temporary recovery rate (day−1)
[1/30,1/2]
CDC1
μ
death rate (day−1)
[833650000,180.3×365]
CIA4
Λ
recruitment rate (day−1)
μ
β
transmission rate (day−1)
[0.5,1.5],[0.1,2]
[18], Assumed
σ
noise intensity
[0.01,8]
Assumed
N
population Size
1
CIA2
I0
initial infection cases
[0.04,0.05]
CDC3
Parameter values associated with COVID-19 data.
Verification of the result obtained
We note from (3.1) that the time-dependent probability density function reduces to if the population size is converted to fractions, that is, if . For the rest of the numerical analysis in this subsection, we set .In order to verify the correctness of the density function and the stationary density function obtained in (3.30), we first discretize the stochastic model (3.4) using the Milstein scheme [20] as follows:where
are independent standard normal variable,
for sample size
for simulations. For large number of simulations the probability density function derived from the histogram of the random variable is plotted and compared with the graph of the probability density function for in Fig. 1
(a) below. To check the correctness of the result for we simulate the probability density function derived from the histogram of the random variable for large and large number of simulations . The simulated density function is now compared with the graph of in Fig. 1(b) below
Fig. 1
Comparison of the exact density functions and with simulated distribution of and respectively.
Comparison of the exact density functions and with simulated distribution of and respectively.The red and blue curves in Fig. 1(a) are the trajectories of the exact probability density function and simulated distribution of respectively, for simulations at time using the parameters
. The red and blue curves in Fig. 1(b) are the trajectories of the exact probability density function and simulated distribution of respectively, for simulations, using the same parameters above.
Behavior of the distribution of as the transmission rate changes
In this section, we analyze numerically the behavior of the time-dependent distribution and as the transmission rate increases.Fig. 2
(a), (b) and (c) is the graph of the probability density function for the case where
and respectively. Here, we use
and . Fig. 2(d) shows the stationary probability distribution which is the limiting distribution . We see here that the distributions for the three cases skewed left. This is evident in Figs. 9(b) and 10(b) as we can see there that the skewness plot is negative. We see from Fig. 2(d) that as . Also, we observe that as increases, the stationary density function shifts concentration to the right. The concentration moves closer to as grows larger. This is because as the transmission rate increases, we expect the number of infection to increase until everyone in the population becomes infected. This can be confirmed analytically using (3.30).
Fig. 2
Graphs of the probability density function and as increases.
Fig. 9
Graphs of the mean, variance, skewness, kurtosis function with time.
Fig. 10
Graphs of the mean, variance, skewness, kurtosis function with time.
Graphs of the probability density function and as increases.Stationary density function for the case where and respectively.Effect of changing on the stationary density function .
Behavior of the distribution of as and .
As shown in Theorem 1, the disease will die out on the long run if (implying also). Fig. 3(a) and (b) show the behavior of the distribution as disease dies out in the population using parameters
and varying so that in 3(a) and in 3(b). The result shows that on the long run, the density function of converges to the Dirac delta function as .
Fig. 3
Stationary density function for the case where and respectively.
Effect of change in temporary recovery rate
It follows from condition (3.11) that .Fig. 4 (a) and (b) shows the effect of increase in the parameter on the stationary density function using parameters
with in Fig. 4(a), and in Fig. 4(b), respectively. We see here that as increases to from the left, the distribution shifts concentration to the left until it concentrates entirely on 0. This shows that the disease dies out as . We also see the effect of large noise in the system as increases. We see that the distribution becomes taller and narrower in the presence of small noise, and flatter and wider in the presence of large noise. In Fig. 4(c), we see that the stationary distribution concentrates more on the final size of the epidemic as the recovery rate decreases to zero. This analysis shows that the disease dies out as the recovery rate reaches the threshold . The number of infected reaches maximum as the number of those who recovered declines to zero.
Fig. 4
Effect of changing on the stationary density function .
Distribution for the case where .
Fig. 5
(a) and (b) show the probability and stationary density functions and respectively, for the case where or equivalently, using parameters in Table 1 with
. We see here that as the stationary distribution . This follows directly from (3.30).
Fig. 5
Probability and stationary density functions and respectively, for the case where .
Probability and stationary density functions and respectively, for the case where .Fig. 6
shows the probability density function for the case where using parameters in Table 1 with
. The analysis also shows that if the number of infection is not harvested in any way (in this case, by recovering), then as for some with . Also, we see that . This is true because if
if as t→∞
Fig. 6
Probability density function for the case where .
Probability density function for the case where .Effect of noise intensity on the stationary density function .
Effect of noise in the system
Fig. 7 shows the effect of noise intensity on the stationary density function using
and varying . Fig. 7(a) shows the distribution as while Fig. 7(b) shows the distribution as . We see from Fig. 7(a) that the stationary density function concentrates entirely on the number as . This is because the solutionof the deterministic equivalent of (3.4) where converges to if . The number is the endemic equilibrium of the deterministic equivalent of (3.4) where . Endemic persists in the deterministic system if . This is confirmed in Fig. 7(a). The graph shows that the Dirac delta function, as . This is not the case for large . In fact, as the stationary density function as and as .
Fig. 7
Effect of noise intensity on the stationary density function .
Plot of the mean, variance, skewness, and kurtosis function using published parameters
Fig. 8
(a), (b), (c) and (d) is the graph of the mean, variance, skewness, and kurtosis, respectively, of the distribution at time using parameters in Table 1 with
and . The dashed line is the horizontal line representing the limits
of the mean, variance, skewness, and kurtosis functions, respectively, of the distribution given in (3.40). Fig. 8(a) shows that if the transmission rate is and the temporary recovery rate is as low as the average number of infection cases will rise to on the long run if proper care and prevention is not taken. The variance function converges to 0.0039 on the long run. This shows the magnitude of the spread of the data away from the mean is small. The skewness function is a positive function, suggesting that the distribution of the data is skewed right.
Fig. 8
Graphs of the mean, variance, skewness, kurtosis function with time.
Graphs of the mean, variance, skewness, kurtosis function with time.Graphs of the mean, variance, skewness, kurtosis function with time.Graphs of the mean, variance, skewness, kurtosis function with time.Fig. 9 (a), (b), (c) and (d) is the graph of the mean, variance, skewness, and kurtosis, respectively, of the distribution at time using parameters in Table 1 with
and . The dashed line is as described in Fig. 8. Fig. 9(a) shows that if the transmission rate is and the temporary recovery rate is as low as the average number of infection cases will rise to on the long run if proper care and prevention is not taken. The variance function converges to 0.012 on the long run. The skewness function is a negative function, suggesting that the distribution is left skewed.Fig. 10 (a), (b), (c) and (d) is the graph of the mean, variance, skewness, and kurtosis, respectively, of the distribution at time using parameters in Table 1 with
and . The dashed line is as described in Fig. 8. Fig. 10(a) shows that if the transmission rate is and the temporary recovery rate is as low as the average number of infection cases will rise to on the long run if proper care and prevention is not taken. The variance function converges to 0.0064 on the long run. The skewness function is a negative function. By comparing the results derived in Figs. 9 and 10, it follows that the average number of cases increases and the variance decreases on the long run as the transmission rate increases.
Probability distribution of aggregate number of COVID-19 infections in the United States
Several epidemiological models [1], [10], [23], [38], [43], [44], [49], [51], [57] have been developed to study the transmission of the COVID-19 virus. As it is well known, the generalized logistic equation is widely used in interpreting the aggregate number of COVID-19infection trajectories in several countries [49], [50], [57]. We see in our work that model (2.2) is a logistic differential equation. In the case of modeling the aggregate number of infected cases, is referred to as the aggregate infection growth rate, is the aggregate infection carrying capacity, and serves as the aggregate infection harvesting effort. In this section, we study the trajectory and the distribution of the aggregate counts of COVID-19 cases reported by the Centers for Disease Control and Prevention (CDC)5
for some states in the United States using model (2.6) and the time-dependent density function (3.30), respectively. We assume the aggregate number of infection follows the model (2.6), where in this case represents the aggregate infection cases at time . The advantage of using model (2.6) over most models available in literature is that it considers presence of external perturbations/disturbances in the contact rate of the disease that can be caused by many factors such aspneumonia seasonality, mobility, testing rates, mask use per capital weather [27], social behavior, strain-specific factors [42], and public health intervention [34]. Also, in the presence of this noise, the harvesting function is not constant as widely assumed, but non-linear.Figs. 11
, 12
, and 13
shows the plot of the aggregate infection counts of COVID-19 cases for the states and territories: AK, AR, AZ, CA, CO, CT, DC, DE, GA, GU, HI, IA, ID, IL, IN, KS, KY, LA, MD, ME, MI, MN, MO, MP, MS, MT, NC, ND, NE, NH, NJ, NM, NV, OH, OK, OR, PA, RI, SC, SD, TN, TX, UT, VA, VT, WI, WV, WY, in the United States for the period 01/22/2020 to 03/23/2021 using model (2.6), and the parameters in the model estimated for the deterministic version of the model case using the Non-linear Least Squares estimation scheme [15], [36]. We also estimated the initial point for better fit.
Fig. 11
Real and simulated aggregate counts of COVID-19 infection for the states and territories: AK, AR, AZ, CA, CO, CT, DC, DE, GA, GU, HI, IA, ID, IL, IN, KS in the United States.
Fig. 12
Real and simulated aggregate counts of COVID-19 infection for the states and territories: KY, LA, MD, ME, MI, MN, MO, MP, MS, MT, NC, ND, NE, NH, NJ, NM in the United States.
Fig. 13
Real and simulated aggregate counts of COVID-19 infection for the states and territories: NV, OH, OK, OR, PA, RI, SC, SD, TN, TX, UT, VA, VT, WI, WV, WY in the United States.
Real and simulated aggregate counts of COVID-19infection for the states and territories: AK, AR, AZ, CA, CO, CT, DC, DE, GA, GU, HI, IA, ID, IL, IN, KS in the United States.Real and simulated aggregate counts of COVID-19infection for the states and territories: KY, LA, MD, ME, MI, MN, MO, MP, MS, MT, NC, ND, NE, NH, NJ, NM in the United States.Real and simulated aggregate counts of COVID-19infection for the states and territories: NV, OH, OK, OR, PA, RI, SC, SD, TN, TX, UT, VA, VT, WI, WV, WY in the United States.The parameter estimates
and of the aggregate infection carrying capacity the infection growth rate the infection recovery rate and the initial population size are estimated along with the root mean square error (RMSE) of the estimation and reported in Table 2
. is the number of non-zero aggregate infection days from January 22, 2020 to March 23, 2021. The trajectory of the aggregate infection counts, together with the estimated trajectory for each of the 48 states and territories studied are reported in Fig. 11, Fig. 12, and 13. The red and blue curves are the trajectory for the real and simulated COVID-19 aggregate count data, respectively. The estimated parameter here represents the carrying capacity of the aggregate infection count for each states. We show in later section that as the aggregate harvesting effort reduces, or as the aggregate infection growth rate increases due to no proper public health intervention, the aggregate COVID-19 count estimate converges to .
Table 2
Parameter estimates and the RMSE for the period of Jan 22, 2020-March 23, 2021.
State
N^
β^
γ^
I0
T
RMSE
AK
116955
0.06495
0.03196
11.7
373
1169.2
AR
1387119
0.05572
0.03857
3073.4
374
8295.1
AZ
6207611
0.05191
0.03901
10109.9
419
42953.8
CA
18525258
0.04999
0.03442
18040.2
419
180570.9
CO
1287227
0.06487
0.04007
528.3
381
17744.6
CT
4265220
0.05993
0.04875
8086.5
377
16566.6
DC
1139045
0.06526
0.05747
3.501.3
379
1926.8
DE
993497
0.06256
0.05049
2217.8
375
3.887.0
GA
7996680
0.06166
0.04890
23805.8
386
38483.0
GU
14309
0.06804
0.03079
2.4421
371
152.2
HI
88343
0.066541
0.045843
231.96
381
1164.9
IA
1047217
0.058366
0.037049
1664.8
378
10774
ID
490785
0.055512
0.033361
688.45
372
4195.1
IL
4441504
0.053073
0.035285
5387.3
422
47364
IN
1783274
0.057553
0.033006
898.27
380
24196
KS
868640
0.058924
0.035784
615.67
378
7801
KY
1386548
0.06343
0.041157
797.04
379
7928.9
LA
3399215
0.063446
0.051774
19790
377
17743
MD
4032036
0.062989
0.0521
16646
381
15558
ME
152787
0.0656
0.040101
24.428
374
2044
MI
2316851
0.057358
0.037583
3746.3
376
32390
MN
1243941
0.061734
0.035266
555.81
380
20819
MO
1580962
0.056665
0.033793
1439.2
379
10266
MP
857
0.066576
0.053355
10.391
358
2.8125
MS
1939174
0.063662
0.049823
7495.3
374
9953.3
MT
188925
0.064356
0.029378
14.373
375
1290
NC
4994376
0.056624
0.041666
10277
384
26445
ND
174226
0.064338
0.026613
14.732
374
2473.8
NE
655227
0.062605
0.040601
714.66
380
7454
NH
232013
0.065474
0.038755
24.024
384
3291.7
NJ
21306864
0.064247
0.055493
45087
381
44812
NM
511161
0.062123
0.036503
208.01
375
7326.1
NV
1387057
0.065965
0.048652
2906.5
381
11652
OH
2680011
0.060963
0.035346
871.78
376
35223
OK
1386379
0.056107
0.035141
1358.7
379
10747
OR
655383
0.063132
0.043888
674.69
386
4765.3
PA
3720047
0.06246
0.041921
2585.5
380
44433
RI
559165
0.051781
0.035041
975.64
384
6601.1
SC
4294273
0.060356
0.047188
10037
381
20568
SD
245266
0.067793
0.035468
43.639
376
3069.5
TN
3173919
0.05427
0.036698
6150.5
381
26324
TX
17156629
0.05962
0.045363
43642
381
78597
UT
1095558
0.054204
0.03238
1074.7
378
9331.2
VA
5879402
0.056827
0.044575
13170
378
19814
VT
132684
0.076231
0.058152
46.712
378
607.51
WI
1372907
0.058849
0.030678
477.69
383
13913
WV
327686
0.064602
0.034811
36.89
369
3243.5
WY
99777
0.070869
0.031364
2.2229
374
1245.1
Parameter estimates and the RMSE for the period of Jan 22, 2020-March 23, 2021.
Distribution of the aggregate infection count in the United States for March 23, 2021
Here, we study the effect of small disturbances in the system by setting and plot the distribution of the aggregate number of infection that occurred on March 23, 2021 for the state of AK, ID, KY, ND, SD, WI, WV, WY. Let be the number of non-zero days from January 22, 2020 (first day CDC started recording data) to March 23, 2021. We plot the curve for the distribution together with the density function obtained from the histogram of the random variable
in (4.1) for simulations. The value of for each of the states analyzed are reported in Table 3
. We confirm the result derived in Fig. 7(a), that is, the probability density function of the aggregate count of COVID-19 for March 23, 2021 concentrates entirely on the number for small . This result is shown in Fig. 14
. The analysis shows that the aggregate count of COVID-19 for the states analyzed converges to asymptotically on the day March 23, 2021. This number is estimated and reported in Table 3.
Table 3
Parameter estimate for the day March 23, 2021.
State
AK
ID
KY
ND
SD
WI
WV
WY
N¯
59355
195541
486138
102066
116825
656407
150941
55572
Fig. 14
Probability distribution of aggregate number of counts that occurred on March 23, 2021.
Parameter estimate for the day March 23, 2021.Probability distribution of aggregate number of counts that occurred on March 23, 2021.
Distribution of the final size of the aggregate infection count in the United States
In this section, we study and analyze the distribution of the final size of the aggregate infection count in the United States by studying how the distribution behaves as . According to (3.34), we know thatWe verify this by plotting, for large say, the distribution together with . The verification plot is shown in Fig. 15
. For this reason, we analyze the distribution of the final size of the aggregate infection count using the stationary probability density function .
Fig. 15
Comparison of the probability density function and .
Comparison of the probability density function and .Fig. 15 shows the comparison of the stationary density function (in red color) and the limiting distribution (in blue color).As evidenced in Fig. 2(d), we see that the stationary probability density function concentrates more on the final size of the epidemic as the infection growth rate increases. We also see a similar behavior in Figs. 4(c) and 7 (b) as the aggregate harvesting effort reduces and as the noise intensity increases, respectively. For this reason, we study the effect of high infection growth rate low recovery rate (low aggregate harvesting effort ), and effect of large noise on the distribution of the aggregate COVID-19 counts in the United States in the following subsections. This is done by plotting the stationary probability density function of the final size of the aggregate infection count for large, small, and large
, and respectively. We see that the graph of concentrates on the estimate (obtained in Table 2), which is the aggregate carrying capacity of the COVID-19 count in the United States as public health intervention reduced.Fig. 16
shows the distribution of the final size of the aggregate infection count as the infection growth rate increases using the estimated parameters in Table 2 with large noise intensity and varying infection growth rates (blue curve), (black curve), and (red curve) for each states analyzed. The figure shows that if proper intervention is not taken (resulting in increase in infection growth rate and constant recovery rate ), the aggregate count of COVID-19infection will rise to the estimated aggregate carrying capacity value in Table 2 for each respective states and territories. The analysis was done for the 48 states and territories mentioned in Table 2 and similar results obtained for each states and territories. For this reason and in order to minimize space, we only report the plots for the state of AK, ID, KY, ND, SD, WI, WV, WY.
Fig. 16
Distribution of the final size of the aggregate infection as the infection growth rate increases for the states: AK, ID, KY, ND, SD, WI, WV, WY in the United States.
Distribution of the final size of the aggregate infectionas the infection growth rate increases for the states: AK, ID, KY, ND, SD, WI, WV, WY in the United States.Fig. 17
shows the distribution of the final size of the aggregate infection count as the aggregate harvesting effort decreases using the estimated parameters in Table 2 with large noise intensity and varying recovery rate (blue curve), (black curve), and (red curve) for each states analyzed. The figure shows that if proper intervention is not taken (resulting in decline in recovery rate and constant aggregate growth rate ), the aggregate count of COVID-19infection will rise to the estimated aggregate carrying capacity value in Table 2 for each respective states and territories. The analysis was done for the 48 states and territories mentioned in Table 2 and similar results obtained for each states and territories. We only report the plots for the state of AK, ID, KY, ND, SD, WI, WV, WY here.
Fig. 17
Distribution of the final size of the aggregate infection as the aggregate harvesting rate reduces with large noise for the states: AK, ID, KY, ND, SD, WI, WV, WY in the United States.
Distribution of the final size of the aggregate infectionas the aggregate harvesting rate reduces with large noise for the states: AK, ID, KY, ND, SD, WI, WV, WY in the United States.
Results, conclusion and discussion
By assuming the contact rate of certain diseases is affected by environmental perturbations, we convert the general deterministic SIS epidemic model with vital dynamics and constant population to a stochastic SIS model and used this model to study the dynamics of some infectious diseases. The study in this work involves analyzing the distribution of numbers of infections whose dynamic follows the SIS stochastic model (2.6). To do this, we derive the closed-form time-dependent probability density function of the number of infection at time given the initial point . The stationary probability density function of the process is also derived and analyzed and our studies show that converges to on the long run. We showed that these distibutions exist if the average number, of individuals infected by an infectious individual in a completely susceptible population is more than one. Also, we showed that the disease will die out if . Using the Milstein scheme, the stochastic model (2.6) is discretized for sample size and simulations and the correctness of the obtained probability density functions and are verified in Section 4.1.1 by comparing them to the probability density function derived from the histogram of the discretized processes and (for large ), respectively, in (4.1), for
and using published epidemiological data. The effect of changes in the transmission rate and temporary recovery rate on the distribution and is studied. It was shown that as approaches 1 (from above), the distribution approaches the Dirac delta function and concentrates entirely on zero. That is, on the long run, the number of infection declines and the disease eventually dies out as the reproduction number approaches 1 (from above). Also, as the recovery rate decreases to zero, the distribution of the number of infection shifts concentration to the right until it concentrates entirely on the infection carrying capacity . That is, the infected population converges to the infection carrying capacity as the recovery rate reduces significantly. A similar analysis done on the transmission rate shows that the infected population converges to the infection carrying capacity as increases. The effect of noise on the system is also analyzed. It was shown that on the long run, the probability density function shifts concentration on the disease endemic equilibrium point as the noise intensity approaches zero. This is because (2.2) has an endemic equilibrium that is globally stable if . Some properties of the time-dependent probability density function, namely the mean, variance, skewness, and Kurtosis functions are derived and analyzed as a function of time.Similar analysis described above is done using the real aggregate counts of COVID-19 cases reported by the Centers for Disease Control and Prevention for the states: AK, AR, AZ, CA, CO, CT, DC, DE, GA, GU, HI, IA, ID, IL, IN, KS, KY, LA, MD, ME, MI, MN, MO, MP, MS, MT, NC, ND, NE, NH, NJ, NM, NV, OH, OK, OR, PA, RI, SC, SD, TN, TX, UT, VA, VT, WI, WV, WY in the United States. We study the trajectory and the distribution of the aggregate counts using the time-dependent density function and stationary distribution given in (3.30). The validity of the derived time-dependent density function is verified by comparing the distribution for the aggregate count of COVID-19 for the day March 23, 2021 with the histogram of the distribution of the random variable
for simulations, where is the number of non-zero days from January 1, 2020 to March 23, 2021. We also show numerically using the real COVID-19 aggregate counts for AK, ID, KY, ND, SD, WI, WV, WY that on the long run, the time-dependent density function converges to the stationary density function . We use the distribution to confirm the estimate of the aggregate count for the month of March 23, 2021 in Table 3 and compare the result with the real data. By studying the distribution of the final size of the aggregate infection counts, our analysis shows that on the long run, if proper intervention is not taken such that the harvesting effort decreases significantly to 0.01 (or less), the aggregate counts of COVID-19 cases for each states may increase to the value reported in Table 2. A similar analysis shows that the aggregate counts of COVID-19 cases for each states may increase to the value if the infection growth rate increases to 0.3 (or more). These studies are confirmed by showing that the distribution shifts concentration on the value as the harvesting effort decreases or the infection growth rate increases. More studies are still ongoing on the distribution of the aggregate count of infectionas data are being updated daily. Further extension of the result obtained in this work using fractional order cases as discussed in Jajarmi et al. [28], [29] is under investigation. For more readings on fractional derivatives, we direct readers to the papers [3], [4], [51], [56].
Availability of data and materials
The data used in the analysis can be found on the CDC website.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
CRediT authorship contribution statement
Olusegun Michael Otunuga: Conceptualization, Data curation, Formal analysis, Software, Writing - original draft, Writing - review & editing.
Declaration of Competing Interest
The author declares that there is no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.