Literature DB >> 35469192

Spatialized epidemiological forecasting applied to Covid-19 pandemic at departmental scale in France.

Matthieu Oliver¹, Didier Georges², Clémentine Prieur¹.

Abstract

In this paper, we present a spatialized extension of a SIR model that accounts for undetected infections and recoveries as well as the load on hospital services. The spatialized compartmental model we introduce is governed by a set of partial differential equations (PDEs) defined on a spatial domain with complex boundary. We propose to solve the set of PDEs defining our model by using a meshless numerical method based on a finite difference scheme in which the spatial operators are approximated by using radial basis functions. Such an approach is reputed as flexible for solving problems on complex domains. Then we calibrate our model on the French department of Isère during the first period of lockdown, using daily reports of hospital occupancy in France. Our methodology allows to simulate the spread of Covid-19 pandemic at a departmental level, and for each compartment. However, the simulation cost prevents from online short-term forecast. Therefore, we propose to rely on reduced order modeling to compute short-term forecasts of infection number. The strategy consists in learning a time-dependent reduced order model with few compartments from a collection of evaluations of our spatialized detailed model, varying initial conditions and parameter values. A set of reduced bases is learnt in an offline phase while the projection on each reduced basis and the selection of the best projection is performed online, allowing short-term forecast of the global number of infected individuals in the department. The original approach proposed in this paper is generic and could be adapted to model and simulate other dynamics described by a model with spatially distributed parameters of the type diffusion-reaction on complex domains. Also, the time-dependent model reduction techniques we introduced could be leveraged to compute control strategies related to such dynamics.

Entities: Chemical

Keywords: COVID-19; Data assimilation; Epidemiological forecasting; Model reduction; Pandemic modeling and simulation

Year: 2022 PMID： 35469192 PMCID： PMC9020576 DOI： 10.1016/j.sysconle.2022.105240

Source DB: PubMed Journal: Syst Control Lett ISSN： 0167-6911 Impact factor: 2.742

Introduction

Over the past year, the spread of Covid-19 has had more serious consequences than expected. This pandemic has brought about massive organizational changes in many countries, with repeated lockdowns and the use of remote solutions for work, study and personal life. It has also had massive impacts on hospital care services with the need to expand hospital capacity and postpone operations for non-Covid-19 patients. Finally, the pandemic has killed several million people around the world [see, e.g., 1]. One of the most difficult specifics to model regarding the spread of Covid-19 is the amount of asymptomatic infected individuals, as described, for example, in [2], which makes the pandemic difficult to track and has led to massive testing campaigns to assess the evolution of the pandemic and the effectiveness of health measures. The efficiency of different strategies of lockdown and testing has been studied [see, e.g., 3]. The incubation period of more than 10 days resulted in a lag between the implementation of restriction rules and their effect on hospital admissions, and made the effectiveness of restrictions more difficult to measure. Forecasting the number of infections and modeling hospital needs is critical to pandemic management and has been the subject of numerous articles over the past year. Since the start of the epidemic, many studies have been carried out to model and forecast the spread of Covid-19 [see, e.g., 4, for a review on optimization studies]. The asymptomatic infected individuals were typically taken into account by building compartmental models that are more detailed than the usual susceptible–infected–recovered (SIR) model [3], [5], [6], [7], [8]. These works focus specifically on the effect of lockdown on the propagation of Covid-19 by taking into account detected and undectected infected individuals as well as hospitalizations. Forecasting the number of infections of Covid-19 has been studied in [5] by using a 3-step approach on the transmission rate during a pandemic outbreak with linear, then exponential increase in infections and finally an exponential decrease of the transmission rate due to major public interventions and social distancing measures. In this paper, we focus on deterministic models which can be interpreted as limits for a large population size of stochastic models [see, e.g., 9]. Most of deterministic compartmental models encountered in the literature on COVID-19 pandemic consist of systems of ordinary differential equations (ODEs). Then a common way to tackle spatial variation in such systems is to define regional compartments on homogeneous geographical sub-areas and to add coupling terms [10], [11], [12]. However, the number of parameters to describe such coupled systems of ODEs increases rapidly with the number of sub-areas. Then, PDE-based compartmental models is a natural alternative to describe more finely spatial heterogeneity. There exist a few papers analyzing COVID-19 pandemic with such models. Let us mention [13], [14], [15] based on diffusion–reaction compartmental models, or [16] [see also 17] in which the authors couple a transport dynamic of the commuter population at large spatial scales, based on kinetic equations, with a diffusion model for non commuters at the urban scale. In the present paper, our aim is to propose a spatially distributed compartmental model, accounting for spatial heterogeneity of the dynamics while including several features of the recent COVID-19 outbreak. We also propose to exploit meshless numerical scheme properties to simulate our model on a complex spatial domain. Finally we rely on very recent time-dependent model reduction techniques to lower short-term forecast complexity. More precisely, we propose a spatialized version of a detailed compartmental model inspired by [8], we then restrict the spatial domain to the French department of Isère and propose a calibration procedure over the period of the first French confinement, extending the whole methodology exposed in [12]. This procedure can be adapted to any scale, provided that corresponding data is available. It allows to reproduce the macroscopical behavior of Covid-19 pandemic on different geographical areas. A preliminary step to the calibration and a second contribution of this paper is the simulation of the model. Our model is defined as a set of spatio-temporal partial differential equations over a complex domain, namely the French department of Isère in our study. We propose a meshless method based on a finite difference scheme relying on radial basis function approximation of the spatial operators (the RBF-FD method, see [18]) to solve it, as such an approach is known to be flexible to solve problems on complex domains. To the best of our knowledge, it is the first time RBF-FD is applied in the framework of epidemiology to study the spatio-temporal spread of a disease. Our methodology allows to simulate the spread of Covid-19 pandemic at a departmental level, and for each compartment. However the simulation cost prevents from online short-term forecast. Therefore we propose as a final contribution of the paper to rely on reduced order modeling tools introduced in [19] to compute short-term forecasts of infection number. The strategy consists in learning a time-dependent reduced order model with few compartments from a collection of evaluations of our spatialized detailed model, by varying initial conditions, initial times and parameter values. A set of reduced bases is learnt in an offline phase while the projection on each reduced basis and the selection of the best projection is performed online, allowing short-term forecast of the global number of infections in the department. Finally, we would like to stress that the original approach we introduce in this paper is generic and not limited to the analysis of COVID-19 dynamics. Indeed, it could be adapted to model and simulate other dynamics described by a model with spatially distributed parameters of the type diffusion-reaction on complex domains. Also, the time-dependent model reduction techniques we introduced could be leveraged to compute control strategies related to such dynamics. The paper is organized as follows. In Section 2, we present an extension of the classical SIR epidemiological model that models the infections, recoveries and hospitalizations spatially over a given geographical area of interest. In Section 3, we present the meshless RBF-FD scheme for solving the set of differential equations from model proposed in Section 2. In Section 4, the model is calibrated using hospital data from Covid-19 outbreak in the department of Isère, France. In Section 5, we use model order reduction techniques to build a surrogate model and extrapolate the number of infected individuals on a 10-day horizon.

Spatialized model

In this section, we present the usual susceptible–infected–recovered (SIR) compartmental model, then we add compartments to adapt to Covid-19 specificities. Finally, we propose an extension of this detailed model to account for the spatial spread of Covid-19 pandemic on a geographical setting, in our framework the Isère department in France.

Usual SIR model

The model presented in this paper is an extension of the usual SIR epidemiological model from [20]. The typical SIR model describes a population of constant size, where each individual can be either Susceptible to a disease, Infected with the disease or Recovered from the disease. The transmission of the disease is ruled by the following equations: with where is the transmission rate of the disease and is the recovery rate.

The model

More detailed models such as the model in [8] can be better suited to model Covid-19 pandemic as they account for detected and undetected infected individuals and , detected and undetected recovered individuals and as well as hospitalized individuals , individuals receiving intensive care and deceased individuals . This approach is particularly useful in the case of Covid-19 because undetected infections and hospital saturation both play a key role in the handling of the pandemic. The flow diagram of this model is sketched out in Fig. 1.

Fig. 1

Flow diagram of the pandemic model described by the set of equations in (2).

The dynamics of each compartment is modeled by the following equations: with Flow diagram of the pandemic model described by the set of equations in (2). This model was calibrated in [12] to fit the Covid-19 outbreak at the regional scale in France. In this paper, we aim to enrich this model to better account for spatial non-homogeneity of Covid-19 propagation.

Spatialization of the model

In our model, we keep the same variables as in the model from [8], but each state variable now depends on location and time . The governing equations of our model are then: where denotes the spatial gradient operator. Note that each of the compartments , , and now feature a diffusion term which models local displacement of individuals inside geographical area of interest. On the contrary, we assume that individuals from compartments , , and are isolated and do not transmit the disease. It means that we neglect, e.g., the spread of the virus by individuals tested positive () during eventual transportation to hospital. The diffusion coefficient , that may depend on the position and vary in time due to changes in sanitary measures, is modeled as follows: where is the local population density, is the transmission rate and is a positive constant. It seems indeed reasonable (see, e.g., arguments in the french mobility report [21]) to assume that local movement is more intensive in high density regions with low sanitary restrictions. As a first approximation, we assume that parameter is constant, independent from compartments , , or . This parameter will be calibrated with other uncertain parameters (see Section 4). Parameters are derived from the probability of moving from compartment to and the average duration in compartment . These variables are more precisely described in [12]. Parameter corresponds to the proportion of undetected infected that are being tested positive, hence moving to compartment . This models virological testing such as PCR tests. corresponds to serological testing of recovered people from that are moved to when tested. These parameters are time-dependent because the testing policies and capabilities have changed during the pandemic. Moreover they can also be used for optimal control, because individuals from isolate themselves and do not spread the virus. In this work, we considered that the serological testing was negligible: . For the calibration presented in Section 4, virological testing was set to because there was very few testing during the period on which we calibrate the model. However, in the forecasting part of the study that is described in Section 5, this parameter is varied over a wide range of values to replicate different phases of the testing policies and capabilities. In this model, we do not account for the loss of immunity (which could be done by adding a transition from the recovered compartment to the susceptible one), this is because our modeling is done at scale of days to a couple of months, at which scale the loss of immunity is neglected.

A meshless approach for solving partial differential equations

In this work, we used the radial basis function finite difference (RBF-FD) method from [18] to solve the set of partial differential equations defined by (3) in Section 2. To the best of our knowledge, this is the first implementation of the RBF-FD approach to simulate an epidemiological model. Meshless approaches are particularly well-suited to our framework because the domain on which we solve PDEs is a geographical area that has a complex boundary. In the following, we take points denoted in the interior of the domain. We also take points denoted over the boundary. In the following of this section, we focus on compartment of susceptible individuals . Note that all the compartments involved in (3) are treated in a similar way.

Radial basis functions

Radial basis functions (RBF) are functions that depend on the distance between and . We will use the set of points to get an approximation of variable . For each point we denote the indices of the nearest neighbors of in . The set of points indexed by defines the RBF-FD stencil of . A local approximation of around can then be obtained from the radial basis functions . In this paper, to compute the approximation of , we used the inverse multiquadric functions: Eq. (4) gives a local approximation of around : Similarly, we denote the approximation of whose dynamic is governed by reaction–diffusion PDEs of system (3).

Radial basis function finite differences

The approximation of the variables around stencil points allows us to locally discretize the diffusion operators that appears in the set of Eqs. (3).

Operator discretization on the domain

Using (4), we can approximate the diffusion operator for variable appearing in (3) as follows: The sum can be written in matrix form with containing the operator values at points in the stencil of , indexed by , is a symmetric matrix containing the values of the radial basis functions at each couple of points in the stencil of , is a vector containing the values of at each point of the stencil of : The same diffusion operator approximations as given by (5) hold for the compartment variables governed by reaction–diffusion PDEs of system (3).

Discretization of the boundary conditions

Boundary conditions can be defined by operator , leading for variable to: where is the boundary term that is imposed on point . Applying the same method as in the previous section to the boundary points, we obtain for points of the boundary: where and are defined by (6), (7) respectively, and is the vector containing the values of the radial basis functions between and the nearest neighbors of : . Similar boundary operator approximations as given by (9) hold for other compartment variables governed by reaction–diffusion PDEs of System (3).

Reducing the equations to a system of ordinary differential equations (ODEs)

We finally apply (5) to the diffusion terms in (3) which leads, for points in the interior of the domain, to: where is the source term of compartment in (3). Additionally, for points on the boundary, the boundary condition for each compartment variable can be written as . This set of equations with the boundary conditions can be summarized as the following algebraic–differential equations: where G is the vector of boundary conditions for the compartment variables at each boundary point: and are the vectors of variables , , , evaluated at each point of the interior and boundary of the domain respectively, is the vector of variables , , , evaluated at each of the points (interior and on the boundary) of the domain, and matrices , , , contain the coefficients of the discretized diffusion and boundary operators using (5), (9). When applying Dirichlet conditions as proposed in what follows, we have and , leading to the following system of nonlinear ODEs:

Application to the epidemiological model

The methodology described in the previous section allows us to solve the set of differential Eqs. (3) on a domain given a set of collocation points in the interior of the domain and over its boundary. In our application we want to model the spread of Covid-19 in the department of Isère, France. However our methodology is generic and could be applied to any geographical area of interest, provided the relevant data are available.

Collocation points

For generating the collocation points in the interior of the domain, we used a two-dimensional Halton sequence [see, e.g., 22] defined on a square and only selected points in the interior of the region, as shown in Fig. 2. We thus get a low discrepancy sequence of points in the interior of the domain.

Fig. 2

1000 collocation points in the interior of the domain obtained by Halton sequence.

Each collocation point was given the population density value of the closest city taken from [23]. A polygon describing the boundary of the department was obtained from [24] then interpolated to obtain points along the boundary. 1000 collocation points in the interior of the domain obtained by Halton sequence.

Boundary conditions

As shown in Fig. 3, the population density is very heterogeneous over the domain, with the south-east being very rural with densities under and the north-east and center of the department being a lot more urbanized. We chose to apply null boundary condition on the boundary points with density . On the rest of the boundary we assumed that the population of each compartment was equal to the average population of that compartment over the domain. More precisely, for with , we assumed .

Fig. 3

Population density attributed to each collocation point.

Initial conditions

At , the proportion of individuals in each compartment should be defined in every point of the domain, which represents parameters. These proportions are not known everywhere nor for each compartment and thus have to be calibrated. To reduce the number of parameters to be calibrated, the proportions of infected and hospitalized individuals are modeled by a uniform distribution over the domain. For , we assume: where and are the initial proportions of infected and hospitalized individuals in the domain. Then we used a numerical integrator based on backward differentiation formulas to solve the stiff system of ordinary differential equations expressed in (11). The time step was set equal to one day.

Model calibration

In this section, we describe the calibration procedure we propose for our spatialized detailed model. It extends the procedure in [12], based on hospital data from [25].

Initial calibration

The calibration procedure we propose for the spatialized model described by (3) is initialized with the calibration results of its non-spatialized version performed in [12].

Variables to fit

We use data from [25], namely the number of hospitalizations , of intensive care hospitalizations , the death toll and the number of recoveries from hospitalization . Variables , and are already involved in the model description (3). Time evolution of the number of recoveries from hospitalization is described by: Because of the weekly fluctuations of the data, a rolling average of 7 days is applied to the data.

Calibration period

Following [12], we calibrate our model over the first wave of Covid-19 in France which took place between March 17th and May 11th, as sanitary restrictions remained stable during this period. Moreover, we use hospital data which are reliable data, as opposed to test data that are subject to changes in testing capabilities and policies.

Calibration parameters

Our model depends on numerous uncertain parameters that are presented in this section. Because of the amount of uncertain parameters, we only calibrate a selection of them, fixing remaining parameters at the value calibrated in [12] as detailed hereafter. Except parameters related to the transmission rate , the parameters we calibrate in this section are assumed to be constant over time. These parameters are disease-related. These parameters will not be re-calibrated for forecast. Only those related to time dependent transmission rate will be re-calibrated in Section 5.

Transmission rate

The transmission rate is one of the most sensitive parameters to tune the model [see, e.g., 26], hence its parametrization has to be chosen carefully. We used the one proposed in [5] with an additional non-zero asymptotic value : . Note that under this form, the transmission rate only depends on time. This assumption is motivated by the fact that sanitary measures were the same throughout the region of interest during the first wave of Covid-19 in France, which corresponds to the calibration period we have chosen. This parametrization of is well-suited for our calibration period: after time , the transmission rate decreases exponentially at speed from its original value to an asymptotic value . An example can be seen in Fig. 4. In the current form, the calibration of the transmission rate depends on 4 parameters. Note that for a larger scale implementation of our model, the transmission rate could benefit from a region-wise parametrization, depending on the scale at which sanitary measures are enforced.

Fig. 4

Transmission rate over time with coefficients , , and .

Optimal calibration parameters. Transmission rate over time with coefficients , , and .

Diffusion coefficient

The diffusion coefficient appearing in (3) accounts for the non-homogeneity of the transmission of the disease. It depends on a single parameter to be calibrated.

Compartment transition parameters

The parameters appearing in (3) depend on transition times and probabilities. The detailed formulas to obtain are written in [12]. The transition times and probabilities coming into play are: - , and which denote the respective probabilities of needing no hospital care, hospitalization and intensive care, - , , which denote the respective probabilities of needing intensive care or dying while hospitalized or in intensive care, - and which denote the recovery times of asymptomatic and lightly symptomatic individuals with no hospitalization, - , , , , and which denote the average transition times between compartments , , and . This totals 14 constants to be calibrated to account for recoveries. As detailed in Section 3.4.3, the initial conditions are reduced to the initial proportions and of and . If the calibration was done in a later phase of the pandemic, we could add initial proportions of recovered individuals () as well as intensive care occupation ().

Calibration process

In the above, we presented the variables to fit and the corresponding French hospital data, the set of parameter values for the initialization, the time period over which we perform the calibration and finally all the parameters to tune, including initial conditions. In this section, we detail the optimization procedure and we present the results of the calibration.

Loss function

As we use hospital data aggregated on the department of Isère to fit our model, we first compute , , and by aggregating on the department of Isère simulated data obtained from our spatialized model. We then define the loss function to be minimized over the set of parameters as: with , which is length of the calibration period, which corresponds to the first lockdown in France.

Minimization method

For minimizing the calibration loss function (13), we use a stochastic optimization Python package noisyopt proposed by [27]. We also give the solver bounds for each parameter equal to where is the nominal value obtained from Section 4.1.

Results

Only a subset of parameters is calibrated with the aforementioned procedure. Parameters , , , , , , and are kept to the value obtained in the calibration of the non-spatialized version of our model performed in [12]: , , , , , , and . This last point is commented in Section 4.5.4. The optimal parameters resulting from the data fitting as obtained on Fig. 5 are given in Table 1.

Fig. 5

Fitting of compartments , , and in Isère during the first lockdown in France (March–May) using the spatialized model described by (3).

Table 1

Optimal calibration parameters.

β0	μ	τ	β∞	i0	h0	Nih	Nhd	Nhu	Nud	Nhr	Nur
0.257	0.109	10.3	0.00875	5.25e −3	4e −5	9.5	1.5	0.65	1.8	1.56	1.03

Fitting of compartments , , and in Isère during the first lockdown in France (March–May) using the spatialized model described by (3). Evolution of obtained from hospital data between March 2020 and July 2021. We can see that the dates of the sanitary measures match the change of variations of . At the very end of the profile we can see the start of the “delta wave” that occurred during the redaction of this work. Evolution of the infection number over days with 50 sets of parameter values from the training set.

Calibration limitations

The calibration was performed under lockdown condition thus the transmission rate was assumed exponentially decreasing. This assumption does not hold in the general case; the evolution of the transmission rate can follow more complex variations as will be shown in the forecasting of that is carried out in Section 5. We also assumed that the probabilities of moving from one compartment to the other and the average duration in each compartment remained constant regardless of the sanitary measures, as being more disease-related parameters. Note however that this assumption may not hold when studying subsequent variants of Covid-19 and their propagation in a partially vaccinated population, hence these parameters should be included in the calibration procedure for studying this later phase of the pandemic.

Forecasting using reduced order modeling

As shown in Section 4, our spatialized model is able to reproduce the outbreak of Covid-19 at the scale of the department while giving a very localized information on hospital needs and infection number. However this model is computationally expensive compared to a non-spatialized model, this is why we propose to rely on reduced order modeling to fasten the online computations required for short-term forecasts of the infection number. In this section, we extend the model reduction tools introduced in [19] to the treatment of our spatialized model. In the following of this section, we explain how to build a reduced basis for extrapolating variables and defined in (14). We then detail how it is possible to use the reduced basis to compute online the -day forecasts of the global infection number in the department of Isère from an observation period of days.

Fitting epidemiological data with a time dependent SIR model

In this section, we project our spatialized 8-compartment model into a time-dependent SIR model described by (1). This is done by aggregating the spatial output of each compartment over the domain and regrouping separate compartments in either , or as follows: Starting from an initial epidemiological state , the evolution of the SIR model is ruled by parameters and . As a consequence, forecasting the total number of infections and recoveries over the domain can be done by extrapolating and . Let , and denote the evolution of , , observed over a time period. We then define: Under mild assumptions that are detailed in [19], Eqs. (14) allow a simple SIR modeling of the observed data starting from the initial state . Note that in (14), one can replace observation data by simulation data from a model that outputs , and over time, leading to the definition of and that we will use for forecasting as detailed in Section 5.5.2. It is possible to infer and from the data provided by [25], applying an adjustment factor (see [19][Section 4.1] for more details): , , with computed from the literature [28], [29] and assessing the proportion of individuals needing hospitalization when infected with Covid-19. The evolution of is shown in Fig. 6, on which one observes the influence of lockdown periods and sanitary measures. Note that takes negative values around July, 2020. This reflects the very low tendency of hospital data in Isère during the summer of 2020. During that period, the SIR dynamics did not fit well the hospital data, leading to negative values for . However this only happened in a short interval of time during which the pandemic was stable.

Fig. 6

Evolution of obtained from hospital data between March 2020 and July 2021. We can see that the dates of the sanitary measures match the change of variations of . At the very end of the profile we can see the start of the “delta wave” that occurred during the redaction of this work.

Subsets of (left) and (right) obtained by applying (14) to the SIR trajectories outputted by the detailed model.

Detailed model evaluations

Reduced order modeling is a way to fasten online computations. In the following, we detail the reduced order modeling approach we adapted for this study. A key stone in this approach was the construction of a reduced basis from a set of evaluations of the detailed model. We built different reduced bases for extrapolating the short-term evolution of and , by moving the initial time. We then selected the most suited reduced basis and used it for actual forecasting. Note that for the construction of the reduced bases, the detailed model described in (3) was slightly modified by considering that and do not depend on time. Then the model was run with different set of initial conditions and parameter values. The range of values we considered for initial conditions and for parameters and is provided in Table 2. Even if the bounds for each parameter were chosen to be consistent with the sanitary condition at the start of the second wave of Covid-19 in France (September, 2020), the idea is that the range of admissible values is chosen wide enough to match different calibration periods. Each parameter was sampled from a Halton sequence of length . The use of a Halton sequence allows to better explore the interval in which each parameter is defined. Other parameters were fixed from Table 1 in Section 4.5.3.

Table 2

Bounds for the training set of parameters.

Parameter	β0	λ1	i0−	i0+	r0−	r0+	h0	u0	kdiff
Min	0.05	0.1	0.03	0.0015	0.03	0.01	9e −5	8e −6	1e −8
Max	0.15	0.5	0.06	0.0025	0.06	0.1	1.1e −4	1.2e −5	5e-7

Then, the detailed model was run with these combinations of parameter values over a period of days, producing a set of detailed outputs – the number of individuals in each of the 8 compartments over time – that are collapsed into SIR outputs using the following rules: Bounds for the training set of parameters. For each run, we computed a realization of (, ) using (14) on the outputs. The set of realizations of and are respectively denoted and in the following sections. The infection number over time is shown in Fig. 7 for a selection of sets of parameter values. In Sections 5.3, 5.4 we detail the construction of the reduced order model from the set of realizations of .

Fig. 7

Evolution of the infection number over days with 50 sets of parameter values from the training set.

On most of the simulations in Fig. 7, we observe a wave growing over 50 days after the start of the simulation, as the bounds in Table 2 were chosen to be coherent with the sanitary conditions in France at the start of the second wave in France (September, 2020). However, recall that we widen the bounds in order to build a more flexible set of reduced bases, which explains why we observe other scenarios, such as simulations with a wave starting around 110 days after the starting time, or even simulations without any wave. Even though Fig. 8 was obtained from detailed model evaluations with not depending on time, the projection to a simple SIR model of the output as described in (15) led to a set of realizations of time-dependent .

Fig. 8

Subsets of (left) and (right) obtained by applying (14) to the SIR trajectories outputted by the detailed model.

Simulation output reduction

From the set of realizations of ( shown in Fig. 8 we built respective sets of functions and defined on for and that can reproduce the typical variations of the detailed model. We used two different methods for building and : one based on Singular Value Decomposition and another one on a greedy algorithm. In the following, we detail the construction of from , the same steps were applied to construct from .

Singular value decomposition

This approach consists in discretizing the functions in at a 1-day timestep. We then obtain a matrix denoted containing all the in . We then compute eigenvalues and eigenvectors of the matrix . The positive and ordered eigenvalues of are shown in Fig. 9. Then is composed with the eigenvectors with the largest eigenvalues. The basis we obtained for is shown in Fig. 10. We notice in Fig. 9 that the magnitude of the 10 largest eigenvalues is significantly larger than the rest, which is why with this approach we never used more than 10 functions to build afterwards.

Fig. 9

Eigenvalues of matrix .

Fig. 10

The set of functions obtained through the Spectral Values Decomposition-based algorithm with .

As can be seen in Fig. 10, the elements in are not constrained to remain positive. Section 5.3.2 tackles this issue by constructing in a greedy way a basis of positive functions. Eigenvalues of matrix . The set of functions obtained through the Spectral Values Decomposition-based algorithm with .

Greedy algorithm

The method is based on the algorithm called Enlarged Nonnegative Greedy (ENG) presented in [19]. We first use a greedy algorithm to build a set of nonnegative functions selected from . This set is initialized as follows: . Then, while has elements (), we add the function that has the maximal distance from : We thus obtain a set of nonnegative elements of that will be used to build the set using the enlarge cone algorithm from [19]: Note that, due to numerical integration errors, some of the curves in Fig. 8 take a few values below zero. However in practice, applying Algorithm 1 to these functions, we obtained positive functions in as can be seen in Fig. 11. The same happened for (defined from as from ).

Fig. 11

The set of functions obtained through the greedy algorithm combined with Algorithm 1 for .

Building a set of reduced bases for short-term forecasting by moving the initial time

In this section, we detail the way (obtained from SVD or ENG) was used to build a set of reduced bases for extrapolating .

Extraction of a reduced basis

Given a set of functions defined on days, we extract a set of reduced bases by applying a sliding window to these functions. More precisely, we first choose a window step of days. Then, for , with the integer part of , we denote the set of functions corresponding to the restriction of to the window , for (see Fig. 12).

Fig. 12

Extraction of a reduced basis from the functions obtained with the detailed model evaluations and the reduced order modeling algorithms presented in Sections 5.3.1, 5.3.2.

Enrichment of the reduced bases

In Section 5.4.1 we built a set of reduced bases for extrapolating from the reduction of our detailed model evaluations. Additionally, we can add well-chosen functions to these reduced bases thanks to prior knowledge of the evolution of . As can be seen in Fig. 6, the variations of are mostly exponential. Starting at time and given a -day evolution of , we can fit the following exponential function: to using least squares and add the function to the reduced basis. This is illustrated in Fig. 13. In the following, this raw exponential extrapolation will be used as benchmark extrapolation of and or in combination with the reduced bases obtained from and .

Fig. 13

Exponential fit and forecast of the observed in Isère department over a 20-day window starting on March 17th, 2020. Here .

Short-term forecasts of the infection number

We now have potential candidates for the reduced basis of extrapolation. In this section, we present the way we selected the best reduced basis, and then how we used it to get a forecast on the infection number.

Selection of the best forecast for beta

In order to compute a forecast on , we divided the calibration period in and . Then for each reduced basis , we defined the loss function: By optimizing it, we computed a function defined on fitting at best on . We then computed: and selected the function with the lowest evaluation of the loss for extrapolating on .

Short-term forecast of the infection number

By applying the methodology presented in Section 5.5.1 to the parameter , we obtained a calibration-forecast function as presented in Fig. 14. The methodology was applied in parallel to the parameter . From and , we could run the time-dependent SIR model defined by (1) with initial conditions and parameters . The model output on the infection number is shown in Fig. 15.

Fig. 14

10-day fit followed by a 10-day forecast of using the SVD basis presented in 5.3.1.

Fig. 15

Forecast of the infection number in Isère department using the SVD reduced basis with .

10-day fit followed by a 10-day forecast of using the SVD basis presented in 5.3.1. Forecast of the infection number in Isère department using the SVD reduced basis with .

Comparison of the SVD and greedy algorithms

To evaluate the error of our models, we generated 10-day forecasts every 5 days between March 2020 and May 2021. In this section and in the following, the forecast error for a calibration starting at is: We compared the SVD and Greedy algorithms with and without the addition of an exponential extrapolation of , the number of basis elements was set to . The average error on the full timeframe are shown in Fig. 16.

Fig. 16

10-day forecast error for the greedy algorithm (), the enriched greedy algorithm (), the SVD algorithm (), the enriched SVD algorithm () and the exponential benchmark algorithm (). We can see that the enrichment does not improve significantly the forecasting score, and that the SVD and ENG algorithms perform very similarly.

The ENG and SVD algorithms performed very similarly, and the exponential enrichment of the reduced bases did not improve significantly the forecasting score. Therefore in the following we computed 10 day forecasts with the SVD algorithm, as it is much faster. Also, in the following sections, we did not use the exponential enrichment. 10-day forecast error for the greedy algorithm (), the enriched greedy algorithm (), the SVD algorithm (), the enriched SVD algorithm () and the exponential benchmark algorithm (). We can see that the enrichment does not improve significantly the forecasting score, and that the SVD and ENG algorithms perform very similarly.

Building an aggregated model

In this section, we chose to focus on the SVD algorithm that we evaluated on 10-day forecasts. The choice of is crucial because keeping fewer basis functions may not be sufficient to capture the complexity of the problem while taking a too large span of basis functions could lead to overfitting over the calibration and deteriorating the forecast accuracy of the reduced basis. We computed the forecast error on the infection number during a full year of 10-day forecasts for models built with 1 to 8 SVD basis functions in the reduced bases for both and . The average output of these 8 models was also used as a forecast model, the so-called aggregated model, denoted by Agg hereafter. The error of each of these 9 models was computed using the forecast error introduced in Section 5.6. The results are shown in Fig. 17 for the SVD algorithm. We can see that our aggregated model performed better than any individual model, and that this model reached a 5.5% error on 10 day forecasts. The visualization of some individual model predictions can be seen in Fig. 19. Moreover, the aggregated model seems more robust than the exponential extrapolation, preventing from very large errors as can be seen, e.g., in Fig. 18.

Fig. 17

Fig. 19

Visualization of the individual model forecasts in different phases of the pandemic. The bottom left plot illustrates the fact that the aggregated model performs better than any individual model.

Fig. 18

Comparison of forecasts obtained with the aggregated and exponential models at periods where the dynamics of Covid-19 changed quickly. We can see that the exponential extrapolation of is more prone to exploding forecasts while the aggregated model has a more robust behavior. Additionally, the third plot shows an example of a change in the dynamics that creates a large error as the forecasts cannot anticipate a brutal change, e.g., in sanitary restrictions.

Forecast error of the aggregated model and of each individual model obtained with . The aggregated model is the average of models 3, 4 and 5. We can see that the aggregated model performs slightly better than any of these individual models. Comparison of forecasts obtained with the aggregated and exponential models at periods where the dynamics of Covid-19 changed quickly. We can see that the exponential extrapolation of is more prone to exploding forecasts while the aggregated model has a more robust behavior. Additionally, the third plot shows an example of a change in the dynamics that creates a large error as the forecasts cannot anticipate a brutal change, e.g., in sanitary restrictions. Visualization of the individual model forecasts in different phases of the pandemic. The bottom left plot illustrates the fact that the aggregated model performs better than any individual model.

Epidemiological forecast using the detailed model

To evaluate the efficiency of the reduced modeling approach for forecasting, we compared its cost with the one consisting in forecasting the infection number directly from the detailed model. All the computation times mentioned below were obtained on a regular laptop. In the reduced basis approach, the offline phase consists of evaluations of the detailed model over days. The detailed model runs in , which totals s. The online phase has a cost of s, which is the time for projecting and onto the reduced bases and , as described in 5.5.1. The cost of running the reduced model itself is negligible because it is run only once, with the optimal extrapolations of and . Hence, the total cost of the reduced model approach for forecasts is . In the direct approach, the detailed model is evaluated over days and the optimization of the 9 parameters presented in Section 5.2 required evaluations, totaling a direct time of s per forecast. For forecasts, the gain is . With forecasts, the cost of the offline phase is compensated by the gain in the online phase.

Discussion

This study has shown that extrapolating and using the exponential function as presented in 5.4.2 is a reasonable solution for forecasting the infection number at a 10-day timescale. However our approach led to a slightly better forecasting score with a model that is more robust, hence more reliable. We would like to point out that the forecasting score has an underlying error due to the changes in dynamics of the epidemic with sanitary measures that are taken. In fact, the model has large overestimations of the infection number every time a new measure is taken, as can be seen in the bottom plot of Fig. 18. As a consequence, the forecasting error could not be reduced to a very low value without the introduction of the sanitary measures in the model. The model would perform better with no changes in the dynamics, however this model is still very useful because it gives the evolution of the pandemic if no measures were taken, which can help to measure the efficiency of the restriction rules.

Conclusion

In this paper, a spatialized extension of a detailed multi-compartmental epidemiological model has been proposed, allowing to reproduce the evolution of hospital needs at the scale of the department of Isère in France during the Covid-19 outbreak, while giving a detailed information about the geographical heterogeneity of the sanitary conditions. The partial differential equations that define the model were solved with a very efficient meshless scheme on a very irregular domain. From this model we built a reduced basis for extrapolating the transmission rate and recovery rate obtained from hospital data and we used them in a surrogate model to output forecasts of the global infection number in the department of Isère. A simple exponential extrapolation proved to be efficient for the extrapolation of the transmission and recovery rates but the aggregation of surrogate models using different reduced bases gave a more robust forecast. The work presented in the present paper could be used to evaluate the efficiency of restriction rules that are taken by comparing the forecasts as a reference for the evolution of the pandemic with the actual evolution of the infection number when restriction rules are taken. Also, testing policies could be evaluated by controlling parameters and and by evaluating the dynamics of the pandemic while more people are being isolated after being tested. Note that the spatialized detailed model we introduced could be completed to account, e.g., for vaccination or emergence of a variant. We could add compartments for the vaccinated population by calibrating their own probability of transmission and severity of symptoms. In particular, this could allow to compare different vaccination policies. The spatialized model could take into account age categories using contact, testing and vaccination rates as well as symptom severity in each age category. This could be solved with the same approach, but would benefit from more precise hospital data involving the age of patients for calibration.

CRediT authorship contribution statement

Matthieu Oliver: Formal analysis, Software, Validation, Data curation, Co-writing. Didier Georges: Conceptualization, Methodology, Investigation, Supervision, Validation, Co-writing, Reviewing and editing. Clémentine Prieur: Conceptualization, Methodology, Investigation, Supervision, Validation, Co-writing, Reviewing and editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

16 in total

1. Diffusion-reaction compartmental models formulated in a continuum mechanics framework: application to COVID-19, mathematical analysis, and numerical study.

Authors: Alex Viguerie; Alessandro Veneziani; Guillermo Lorenzo; Davide Baroli; Nicole Aretz-Nellesen; Alessia Patton; Thomas E Yankeelov; Alessandro Reali; Thomas J R Hughes; Ferdinando Auricchio
Journal: Comput Mech Date: 2020-08-13 Impact factor: 4.014

2. Optimization in the Context of COVID-19 Prediction and Control: A Literature Review.

Authors: Elizabeth Jordan; Delia E Shin; Surbhi Leekha; Shapour Azarm
Journal: IEEE Access Date: 2021-09-17 Impact factor: 3.476

3. Spread and dynamics of the COVID-19 epidemic in Italy: Effects of emergency containment measures.

Authors: Marino Gatto; Enrico Bertuzzo; Lorenzo Mari; Stefano Miccoli; Luca Carraro; Renato Casagrandi; Andrea Rinaldo
Journal: Proc Natl Acad Sci U S A Date: 2020-04-23 Impact factor: 11.205

4. Years of life lost to COVID-19 in 81 countries.

Authors: Héctor Pifarré I Arolas; Enrique Acosta; Guillem López-Casasnovas; Adeline Lo; Catia Nicodemo; Tim Riffe; Mikko Myrskylä
Journal: Sci Rep Date: 2021-02-18 Impact factor: 4.379

5. Delay differential equations for the spatially resolved simulation of epidemics with specific application to COVID-19.

Authors: Nicola Guglielmi; Elisa Iacomini; Alex Viguerie
Journal: Math Methods Appl Sci Date: 2022-01-18 Impact factor: 3.007

6. Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy.

Authors: Giulia Giordano; Franco Blanchini; Raffaele Bruno; Patrizio Colaneri; Alessandro Di Filippo; Angela Di Matteo; Marta Colaneri
Journal: Nat Med Date: 2020-04-22 Impact factor: 87.241

7. Asymptomatic coronavirus infection: MERS-CoV and SARS-CoV-2 (COVID-19).

Authors: Jaffar A Al-Tawfiq
Journal: Travel Med Infect Dis Date: 2020-02-27 Impact factor: 6.211

8. Predicting the number of reported and unreported cases for the COVID-19 epidemics in China, South Korea, Italy, France, Germany and United Kingdom.

Authors: Z Liu; P Magal; G Webb
Journal: J Theor Biol Date: 2020-09-25 Impact factor: 2.691

9. Forecasting the daily and cumulative number of cases for the COVID-19 pandemic in India.

Authors: Subhas Khajanchi; Kankan Sarkar
Journal: Chaos Date: 2020-07 Impact factor: 3.642