
A phenomenological model for COVID-19 data taking into account neighboring-provinces effect and random noise.

Julia Calatayud, Marc Jornet, Jorge Mateu.

Abstract

We model the incidence of the COVID-19 disease during the first wave of the epidemic in Castilla-Leon (Spain). Within-province dynamics may be governed by a generalized logistic map, but this lacks spatial structure. To couple the provinces, we relate the daily new infections through a density-independent parameter that entails positive spatial correlation. Pointwise values of the input parameters are fitted by an optimization procedure. To accommodate the significant variability in the daily data, with abruptly increasing and decreasing magnitudes, random noise is incorporated into the model, and its parameters are calibrated by maximum likelihood estimation. The calculated paths of the stochastic response and the probabilistic regions are in good agreement with the data.
© 2022 Netherlands Society for Statistics and Operations Research.

Keywords:  COVID‐19 infections; generalized logistic differential equation; parameter calibration; spatial correlation; stochastic modeling

Year:  2022        PMID: 36247018      PMCID: PMC9538456          DOI: 10.1111/stan.12278

Source DB:  PubMed          Journal:  Stat Neerl        ISSN: 0039-0402            Impact factor:   1.239


INTRODUCTION

Phenomenological (or statistical) models are often useful to reproduce and forecast the course of an epidemic when insight is limited, treatments and interventions change rapidly, and data are scarce, uncertain and vary abruptly (Chowell et al., 2016; Pell, Kuang, Viboud, & Chowell, 2018). In these circumstances, some mechanistic models, based on specific laws of transmission, may not work well. The main example of a phenomenological model is the logistic growth curve. Devised by P.F. Verhulst in 1838 as an extension of the Malthusian exponential model, it has become an essential tool in biology, ecology and epidemiology for fitting growth phenomena. Examples of logistic epidemic modeling include Ebola (Chowell, Simonsen, Viboud, & Kuang, 2014) and COVID-19 (Wang, Zheng, Li, & Zhu, 2020). Generalizations of the logistic equation, designed to capture other growth profiles, have been suggested and applied to tumor growth (Birch, 1999; Marusic, Bajzer, Vuk-Pavlovic, & Freyer, 1994; Sachs, Hlatky, & Hahnfeldt, 2001; Spratt, Meyer, & Spratt, 1996) and to diseases such as SARS (Hsieh, 2009; Hsieh, Lee, & Chang, 2004), dengue fever (Hsieh & Ma, 2009), influenza H1N1 (Hsieh, 2010), Ebola (Pell et al., 2018), Zika (Chowell et al., 2016), and COVID-19 (Aviv-Sharon & Aharoni, 2020; Lee, Lei, & Mallick, 2020; Pelinovsky, Kurkin, Kurkina, Kokoulina, & Epifanova, 2020; Wu, Darcet, Wang, & Sornette, 2020).
Though phenomenological, these models may be extended to incorporate some spatial effects. Logistic models can be extended to heterogeneous space either by including logistic growth as the reaction term in a reaction-diffusion partial differential equation model, or by modeling space as a collection of discrete patches among which populations can disperse (Zhang, DeAngelis, & Ni, 2021). The second approach yields a coupled system of ordinary differential equations, which is simpler than a mechanistic compartmental system.
Phenomenological models may also incorporate stochastic effects, to deal with the uncertainty associated with data collection and with the phenomenon itself (Smith, 2013). For the COVID-19 disease, some examples include a frequentist approach for the derivative of the logistic map with Gaussian error (Shen, 2020), a Bayesian approach for the Gompertz curve (Berihuete, Sánchez-Sánchez, & Suárez-Llorens, 2021), and a phenomenological model based on the spatiotemporal evolution of a Gaussian probability density function (Benítez et al., 2020).
In this paper, the aim is to model COVID-19 data phenomenologically, taking into account spatial and stochastic effects. We work with daily new infections through the derivative of a generalized logistic map, which generalizes the simple logistic map of Shen (2020). By coupling, we include a spatial structure in the phenomenological model of differential equations, with a positive correlation of cases between nearby regions. To our knowledge, such a model has not been utilized in mathematical epidemiology. Finally, we also incorporate random noise into the deterministic solution, in order to capture the highly irregular dynamics of the data.
Our case study is the Spanish autonomous community of Castilla-Leon, divided into nine regions called provinces. It is the largest community in Spain by area, it is located in the northwest of Spain, and it has a population of around 2.5 million. In Figure 1, we show the location of Castilla-Leon among the autonomous communities of Spain (left map), as well as the nine provinces of Castilla-Leon (right map). In Table 1, codes for the nine provinces are shown, as well as their populations (year 2019, rounded to the nearest thousand). Our aim is to model the first wave of the COVID-19 epidemic, from March 1, 2020 to June 22, 2020 (114 days), with recorded data on daily new infections for the provinces.
The cases have been retrieved from the open data portal of Castilla-Leon: https://datosabiertos.jcyl.es/web/es/datos-abiertos-castilla-leon.html.
FIGURE 1

Location of Castilla‐Leon among the autonomous communities of Spain (left map), and the nine provinces of Castilla‐Leon (right map). Source: Mathematica®, built‐in function GeoGraphics

TABLE 1

The nine provinces of Castilla‐Leon, their codes and populations

Province      Index   Inhabitants
Leon          1       462,000
Palencia      2       160,000
Burgos        3       355,000
Soria         4       89,000
Segovia       5       154,000
Avila         6       159,000
Salamanca     7       332,000
Zamora        8       173,000
Valladolid    9       520,000

DETERMINISTIC MODEL

Given an index $i$ that identifies the province, we consider the proportion of cumulative infections and the proportion of new infections in province $i$, both scaled by the total population of the province. These proportions depend on the day $t$. As suggested in Shen (2020), the proportion of new infections is assumed to be the derivative of the proportion of cumulative infections. The key idea is that the differential equation model is set for the cumulative proportion, while the parameter calibration is conducted for its derivative (scaled daily new infections), to avoid serial correlation in the errors for cumulative cases and biased parameter estimates. Within-province dynamics may be governed by a generalized logistic differential equation model, referred to as (1). Its parameters are the growth rate $a_i$, the local asymptotic equilibrium $K_i$, and the flexibility coefficient $b_i$, which are assumed to be time invariant. The saturation effect from $K_i$ implicitly captures public health interventions, without complex mechanistic assumptions about the transmission process. The parameter $b_i$ allows for more flexible S-shaped growth profiles than the classical logistic formulation by Verhulst. It reflects the asymmetry of the curve of daily new infections with respect to the peak (new infections rise quicker than they decrease). When $b_i = 1$ and $b_i \to 0$, the logistic and the Gompertz curves are retrieved, respectively. The reader may consult an extensive list of references for the generalized and classical logistic equations, with a variety of applications, in the Introduction section.
Spatial structure, in which individuals interact more intensely with neighbors, may be incorporated as follows. Two provinces $i$ and $j$ are regarded as neighbors whenever they are adjacent. The complete phenomenological model, referred to as (2), augments the local model (1) of each province with a coupling term that involves its neighboring provinces. To couple the provinces, we have related the daily new infections through a density-independent parameter $D$ that entails positive spatial correlation: when infections in some province $j$ adjacent to $i$ increase rapidly at time $t$, infections in province $i$ grow quicker too. Again, no mechanistic assumptions are made.
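As a reading aid, one form of (1) and (2) that is consistent with the verbal description and with the parameter names in Table 2 can be sketched as follows. The symbols $C_i$ (cumulative proportion), $I_i = C_i'$ (daily new proportion) and $j \sim i$ (adjacency), as well as the exact placement of $a_i$ and $b_i$ and the shape of the coupling term, are assumptions made here for illustration and are not confirmed as the authors' notation.

```latex
\begin{align}
C_i'(t) &= a_i\, C_i(t)\left[1-\left(\frac{C_i(t)}{K_i}\right)^{b_i}\right], \tag{1}\\
C_i'(t) &= a_i\, C_i(t)\left[1-\left(\frac{C_i(t)}{K_i}\right)^{b_i}\right]
          + D \sum_{j \sim i} C_j'(t), \qquad i = 1,\dots,9. \tag{2}
\end{align}
% Under the assumed form (1), the substitution u_i = (C_i/K_i)^{b_i} gives the
% classical logistic equation u_i' = a_i b_i\, u_i (1 - u_i), hence
\begin{equation*}
C_i(t) = K_i\left[1 + Q_i\, e^{-a_i b_i t}\right]^{-1/b_i},
\qquad Q_i = \left(\frac{K_i}{C_i(0)}\right)^{b_i} - 1 .
\end{equation*}
```

Under this assumed form, the local model is of Bernoulli type and has the closed-form solution shown, whereas the coupled system is implicit in the derivatives and has to be solved numerically.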
To construct (2), some ideas were taken from the theory of populations dispersing among discrete patches (Zhang et al., 2021). After isolating the derivatives in (2) symbolically, each of them is written as a linear combination of the local generalized logistic terms; this is somewhat similar to the coupled model investigated a few decades ago in Kendall and Fox (1998) from the system dynamics viewpoint. In contrast to the local model (1), which belongs to the class of Bernoulli ordinary differential equations, the coupled system (2) does not have a closed-form solution; numerical methods are required to solve it. The next section details the calibration of the 28 parameters in the coupled generalized logistic model (2).
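The isolation of the derivatives and the numerical solution can be illustrated with a minimal sketch. It assumes a local term of the form a*C*(1 - (C/K)^b) and a coupling of the form C_i' = f_i(C_i) + D * sum of C_j' over neighbors j, neither of which is confirmed by the text. The three-patch layout, the function names and the Runge-Kutta integrator are hypothetical choices; the authors used Mathematica.

```python
import math

def local_growth(c, a, b, k):
    """Assumed local term f(c) = a * c * (1 - (c / k) ** b)."""
    return a * c * (1.0 - (c / k) ** b)

def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[r][j] -= factor * aug[col][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def derivatives(c, params, neighbors, d):
    """Isolate x = C' from the implicit system x_i = f_i(c_i) + d * sum_{j ~ i} x_j."""
    n = len(c)
    f = [local_growth(c[i], *params[i]) for i in range(n)]
    matrix = [[(1.0 if i == j else 0.0) - (d if j in neighbors[i] else 0.0)
               for j in range(n)] for i in range(n)]
    return solve_linear(matrix, f)

def integrate(c0, params, neighbors, d, t_end, dt=0.05):
    """Classical fourth-order Runge-Kutta integration from t = 0 to t_end."""
    def shifted(base, slope, h):
        return [u + h * s for u, s in zip(base, slope)]
    c = list(c0)
    for _ in range(round(t_end / dt)):
        k1 = derivatives(c, params, neighbors, d)
        k2 = derivatives(shifted(c, k1, dt / 2), params, neighbors, d)
        k3 = derivatives(shifted(c, k2, dt / 2), params, neighbors, d)
        k4 = derivatives(shifted(c, k3, dt), params, neighbors, d)
        c = [u + dt / 6 * (p + 2 * q + 2 * r + s)
             for u, p, q, r, s in zip(c, k1, k2, k3, k4)]
    return c

def richards(t, c0, a, b, k):
    """Closed-form solution of the assumed local model (no coupling)."""
    q = (k / c0) ** b - 1.0
    return k * (1.0 + q * math.exp(-a * b * t)) ** (-1.0 / b)

# Hypothetical toy setting: three patches in a row (0 - 1 - 2).
NEIGHBORS = {0: {1}, 1: {0, 2}, 2: {1}}
# (a, b, K) triples borrowed from Table 2 purely as plausible magnitudes.
PARAMS = [(0.939, 0.0724, 0.0195), (0.963, 0.0330, 0.0243), (0.922, 0.0213, 0.0291)]
C0 = [1 / 462000, 1 / 160000, 1 / 355000]  # one infected individual per patch

local_only = integrate(C0, PARAMS, NEIGHBORS, 0.0, 10.0)
coupled = integrate(C0, PARAMS, NEIGHBORS, 0.13, 10.0)
print("no coupling:", local_only)
print("D = 0.13   :", coupled)
```

With D = 0 the integrator can be checked against the closed-form curve, and with D > 0 every patch grows faster in this toy setting, which is the positive spatial correlation described above.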

Calibration of the deterministic model

The initial conditions in (2) are fixed as the initial data; if any of them is 0, it is set to the reciprocal of the province's population (one infected individual). The 28 parameters ($D$, together with $a_i$, $b_i$ and $K_i$ for the nine provinces) are fitted by least-squares optimization for the scaled daily new infections, as recommended by Shen (2020); the objective function, referred to as (3), measures the squared discrepancy between the model output and the observed data. The minimum attained gives a measure of how good the fit is. For computations, Mathematica® has been used. The system of ordinary differential equations (2) is parametrically solved with the built-in function ParametricNDSolveValue, with the option Method -> {"EquationSimplification" -> "Residual"} to isolate the derivatives. The minimization is carried out with the routine FindMinimum. This function was run for 800 iterations, taking 7.5 hr, and the algorithm converged.
The estimates of the parameters that minimize (3) are presented in Table 2. The powers $b_i$ are closer to 0 than to 1, so the model for each province is more similar to a Gompertz curve than to a classical logistic curve. The coefficients $K_i$ provide the maximum level of infection at the first wave under no neighboring-provinces effect; for example, in Leon it would have been $K_1 = 0.0195$ and in Soria $K_4 = 0.0439$ (Table 2). The value of (3) is 0.0000498. It is observed that $D > 0$, so there is indeed a spatial effect. In fact, if $D = 0$ is imposed and local generalized logistic curves are fitted for each province, then the value of (3) becomes 0.0000523, that is, greater. When $D = 0$, the provinces for which the least-squares error increases are Zamora, Palencia, Avila, and Valladolid, in decreasing order of magnitude; this means that these four provinces were the most susceptible to their neighbors during the first wave of the epidemic. Of course, these assertions on $D$ are conditional on the validity of the generalized logistic model (1) to describe within-province dynamics, so that any deviation from it is due to spatial factors; in general, the validity of the model is a necessary assumption when performing sensitivity analysis.
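The idea of calibrating on daily new infections rather than on cumulative counts can be illustrated with a small sketch for a single province without coupling. It assumes the local form C' = a*C*(1 - (C/K)^b) and uses synthetic, noise-free data; the brute-force scan over K is a hypothetical stand-in for FindMinimum, and all names are illustrative.

```python
import math

def richards_daily(t, c0, a, b, k):
    """Daily new proportion C'(t) for the assumed local model."""
    q = (k / c0) ** b - 1.0
    c = k * (1.0 + q * math.exp(-a * b * t)) ** (-1.0 / b)  # closed-form C(t)
    return a * c * (1.0 - (c / k) ** b)

def sum_of_squares(a, b, k, c0, observed):
    """Least-squares objective on scaled daily new infections (days 0, 1, 2, ...)."""
    return sum((richards_daily(t, c0, a, b, k) - y) ** 2
               for t, y in enumerate(observed))

# Synthetic data: generated from the model itself, so the true K is known.
C0 = 1 / 462000
TRUE_A, TRUE_B, TRUE_K = 0.939, 0.0724, 0.0195
observed = [richards_daily(t, C0, TRUE_A, TRUE_B, TRUE_K) for t in range(114)]

# Scan K on a coarse grid while holding a and b fixed (illustration only).
candidates = [0.0100 + 0.0005 * j for j in range(41)]
best_k = min(candidates,
             key=lambda k: sum_of_squares(TRUE_A, TRUE_B, k, C0, observed))
print("K minimizing the objective on the grid:", best_k)
```

In the full problem all 28 parameters are varied jointly and each evaluation of the objective requires a numerical solution of the coupled system, which is consistent with the long running time reported.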
TABLE 2

Parameter estimates of (2) that minimize (3)

Parameter   Estimate   Parameter   Estimate
D           0.130      b5          0.0979
a1          0.939      b6          0.0421
a2          0.963      b7          0.0883
a3          0.922      b8          0.0262
a4          1.00       b9          0.00606
a5          0.900      K1          0.0195
a6          0.874      K2          0.0243
a7          0.927      K3          0.0291
a8          0.362      K4          0.0439
a9          0.411      K5          0.0381
b1          0.0724     K6          0.0347
b2          0.0330     K7          0.0284
b3          0.0213     K8          0.0347
b4          0.0492     K9          0.0429
In Figure 2, the fit of the deterministic model to the data is plotted. The deterministic model renders a smooth, averaged fit. However, the abrupt variation between consecutive days is not captured, and this is the reason for incorporating stochasticity in the following section.
FIGURE 2

Fit of the deterministic model (2) to the number of daily new cases, by province

A STOCHASTIC MODEL

In order to capture the highly irregular dynamics of the data, stochasticity is incorporated into the coupled generalized logistic model (2) (Smith, 2013). Based on inspection of the deterministic fit, a Gaussian white noise error is added to the scaled number of daily new cases, which gives the stochastic model referred to as (4). Here the power $\lambda_i$ is independent of $t$, the noise is an uncorrelated process with variance independent of $t$, and the resulting process is the new stochastic response. This new response is highly irregular: it is neither jointly measurable nor right/left-continuous on any interval. The parameter $\sigma_i$ controls the dispersion of the random error; the higher its value, the larger the variability the error exhibits. The mean of (4) is the output of the deterministic model (2).
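A minimal simulation sketch may help to visualize such a response. It assumes the form Y(t) = m(t) + m(t)^lambda * eps(t), with m the deterministic daily proportion and eps(t) independent N(0, sigma^2) variables; this specific form is a guess compatible with the description (a power lambda and a dispersion sigma), not a transcription of (4). The mean curve, the seed and the function names are hypothetical.

```python
import math
import random

def richards_daily(t, c0, a, b, k):
    """Deterministic daily proportion from the assumed local model."""
    q = (k / c0) ** b - 1.0
    c = k * (1.0 + q * math.exp(-a * b * t)) ** (-1.0 / b)
    return a * c * (1.0 - (c / k) ** b)

def simulate_path(mean_curve, lam, sigma, rng):
    """One realization of Y(t) = m(t) + m(t)**lam * eps(t), eps(t) ~ N(0, sigma^2)."""
    return [m + m ** lam * rng.gauss(0.0, sigma) for m in mean_curve]

def pointwise_summary(paths, lower=0.025, upper=0.975):
    """Monte Carlo mean and empirical quantiles for each day."""
    n = len(paths)
    summary = []
    for values in zip(*paths):
        ordered = sorted(values)
        summary.append((sum(values) / n,
                        ordered[int(lower * (n - 1))],
                        ordered[int(upper * (n - 1))]))
    return summary

rng = random.Random(2022)  # fixed seed so the sketch is reproducible
# Magnitudes borrowed from Tables 2 and 3 (province 2) for illustration only.
mean_curve = [richards_daily(t, 1 / 160000, 0.963, 0.0330, 0.0243) for t in range(114)]
LAM, SIGMA = 0.549, 0.0127
paths = [simulate_path(mean_curve, LAM, SIGMA, rng) for _ in range(2000)]
summary = pointwise_summary(paths)
print("day 30: mean, 2.5% and 97.5% quantiles:", summary[30])
```

Because the noise has zero mean, the Monte Carlo mean stays close to the deterministic curve, while individual paths jump from day to day in the way the data do.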

Calibration of the stochastic model

Given the estimates of $D$, $a_i$, $b_i$, and $K_i$ from Table 2, both parameters $\lambda_i$ and $\sigma_i$ of the stochastic model (4) are calibrated for each province $i$ by means of maximum likelihood (Rossi, 2018). The likelihood of the observed time series is built from the probability density function of the stochastic response, evaluated at the observed data. By maximizing it, the infinitesimal probability around the observed data is also maximized. After applying the logarithm, the maximization problem is more conveniently expressed in terms of the log-likelihood. We use Mathematica® with the built-in instruction NMinimize, over a prescribed region for $\lambda_i$ and $\sigma_i$. If some value of the deterministic output is a small negative number or zero, it is replaced by a small positive value. In Table 3, the optimal values of $\lambda_i$ and $\sigma_i$ are reported. These are then plugged into (4).
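The likelihood step can be sketched under the same assumed noise form, Y(t) = m(t) + m(t)^lambda * eps(t) with independent eps(t) ~ N(0, sigma^2), so that each observation is Gaussian with mean m(t) and standard deviation sigma * m(t)^lambda. For fixed lambda the maximizing sigma is available in closed form, which leaves a one-dimensional search over lambda; the grid search below replaces NMinimize, and the mean curve and the range [0, 1] for lambda are hypothetical.

```python
import math
import random

def neg_log_likelihood(lam, sigma, mean_curve, observed):
    """Gaussian negative log-likelihood with standard deviation sigma * m**lam."""
    total = 0.0
    for m, y in zip(mean_curve, observed):
        sd = sigma * m ** lam
        total += (0.5 * math.log(2.0 * math.pi) + math.log(sd)
                  + (y - m) ** 2 / (2.0 * sd ** 2))
    return total

def sigma_given_lambda(lam, mean_curve, observed):
    """For fixed lam, sigma^2 = mean of (y - m)^2 / m^(2 lam) maximizes the likelihood."""
    n = len(observed)
    return math.sqrt(sum((y - m) ** 2 / m ** (2.0 * lam)
                         for m, y in zip(mean_curve, observed)) / n)

def fit_noise(mean_curve, observed, grid_size=101):
    """Profile the likelihood over lam on an evenly spaced grid in [0, 1]."""
    best = None
    for j in range(grid_size):
        lam = j / (grid_size - 1)
        sigma = sigma_given_lambda(lam, mean_curve, observed)
        nll = neg_log_likelihood(lam, sigma, mean_curve, observed)
        if best is None or nll < best[2]:
            best = (lam, sigma, nll)
    return best

# Synthetic example: hypothetical bell-shaped mean curve, known lam and sigma.
rng = random.Random(0)
mean_curve = [1e-6 + 1e-4 * math.exp(-((t - 30) / 15) ** 2) for t in range(114)]
TRUE_LAM, TRUE_SIGMA = 0.5, 0.01
observed = [m + m ** TRUE_LAM * rng.gauss(0.0, TRUE_SIGMA) for m in mean_curve]

lam_hat, sigma_hat, nll_hat = fit_noise(mean_curve, observed)
print("estimated lambda and sigma:", lam_hat, sigma_hat)
```

The mean curve must be strictly positive for the power and the logarithm to be defined, which is in line with the replacement of non-positive deterministic values mentioned above.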
TABLE 3

Optimal values of λi and σi for the stochastic model

Province i   1          2        3        4        5
λi           0.0701     0.549    0.757    0.539    0.00168
σi           0.000365   0.0127   0.0871   0.0216   0.000345

Province i   6          7          8       9
λi           0.645      0.0863     0.837   0.764
σi           0.0399     0.000586   0.223   0.0943
In Figure 3, the fit of the stochastic model is illustrated. We have taken the stochastic response of (4), whose statistics are determined with Monte Carlo simulation. We show the mean (which is approximately equal to the deterministic fit) and probabilistic intervals, as well as an example of a randomly realizable path. One appreciates the similarity in pattern between the realizable path and the data, which justifies the need for stochasticity.
FIGURE 3

Fit of the stochastic model, for each province $i$. The mean and probabilistic intervals are shown, as well as an example of a randomly realizable path.

CONCLUSIONS

As shown in this paper, a phenomenological model may be useful for faithfully capturing the dynamics of an epidemic. In the case study of Castilla-Leon (Spain) and the first wave of COVID-19, we have used a coupled system of generalized logistic differential equations. The coupling comes from the spatial effect due to neighboring provinces of Castilla-Leon. The calibration of the 28 parameters is based on the daily new infections through the derivative of the system. The process is computationally intensive, but optimal parameters can be obtained at the end. The model yields a smooth, averaged curve that follows the pattern of the data. However, stochasticity is needed to obtain realizable paths that resemble the abrupt changes of the data. It is incorporated into the model via a random noise error, whose parameters are determined by maximum likelihood estimation.
The main limitation when rigorously fitting models of coupled differential equations is the time needed to execute the optimization procedure. Once the optimal parameters are available, it is simple to incorporate random noise and to estimate the dispersion parameters. Several extensions of the present paper may be devised, but all of them depend on optimizing the parameters of coupled differential equations more efficiently. This is not easy, due to the well-known curse of dimensionality. Future work could deal with several waves of infection at once (through a sum of logistic responses), estimate the deterministic and random error parameters at once (through an intensive likelihood maximization procedure), divide the space into finer subregions, or add mechanistic processes of infection. Nonetheless, it is important to emphasize that, sometimes, rather than augmenting the complexity of a simple but satisfactory model with mechanistic considerations, it might be better to treat the error as random and to apply a stochastic fit.

FUNDING INFORMATION

Julia Calatayud has been supported by a postdoctoral contract from Universitat Jaume I, Spain (Acció 3.2 del Pla de Promoció de la Investigació de la Universitat Jaume I per a l'any 2021). Jorge Mateu has been supported by the grant PID2019‐107392RB‐I00 from Spanish Ministry of Science and the grant AICO/2019/198 from Generalitat Valenciana.

CONFLICT OF INTEREST

The authors declare that there is no conflict of interest regarding the publication of this article.

REFERENCES

1. B E Kendall; G A Fox. Spatial structure, environmental heterogeneity, and population dynamics: analysis of the coupled logistic map. Theor Popul Biol, 1998-08.
2. Bruce Pell; Yang Kuang; Cecile Viboud; Gerardo Chowell. Using phenomenological models for forecasting the 2015 Ebola challenge. Epidemics, 2016-11-19.
3. Bo Zhang; Donald L DeAngelis; Wei-Ming Ni. Carrying Capacity of Spatially Distributed Metapopulations. Trends Ecol Evol, 2020-10-28.
4. Ke Wu; Didier Darcet; Qian Wang; Didier Sornette. Generalized logistic growth modeling of the COVID-19 outbreak: comparing the dynamics in the 29 provinces in China and in the rest of the world. Nonlinear Dyn, 2020-08-19.
5. Gerardo Chowell; Doracelly Hincapie-Palacio; Juan Ospina; Bruce Pell; Amna Tariq; Sushma Dahal; Seyed Moghadas; Alexandra Smirnova; Lone Simonsen; Cécile Viboud. Using Phenomenological Models to Characterize Transmissibility and Forecast Patterns and Final Burden of Zika Epidemics. PLoS Curr, 2016-05-31.
6. J S Spratt; J S Meyer; J A Spratt. Rates of growth of human neoplasms: Part II. J Surg Oncol, 1996-01.
7. Se Yoon Lee; Bowen Lei; Bani Mallick. Estimation of COVID-19 spread curves integrating global data and borrowing information. PLoS One, 2020-07-29.
8. Peipei Wang; Xinqi Zheng; Jiayang Li; Bangren Zhu. Prediction of epidemic trends in COVID-19 with logistic model and machine learning technics. Chaos Solitons Fractals, 2020-07-01.
9. Ying-Hen Hsieh. Pandemic influenza A (H1N1) during winter influenza season in the southern hemisphere. Influenza Other Respir Viruses, 2010-07.
