Literature DB >> 34922976

Optimal control of the SIR model with constrained policy, with an application to COVID-19.

Abstract

This article considers the optimal control of the SIR model with both transmission and treatment uncertainty. It follows the model presented in Gatto and Schellhorn (2021). We make four significant improvements on the latter paper. First, we prove the existence of a solution to the model. Second, our interpretation of the control is more realistic: while in Gatto and Schellhorn (2021) the control α is the proportion of the population that takes a basic dose of treatment, so that α>1 occurs only if some patients take more than a basic dose, in our paper, α is constrained between zero and one, and represents thus the proportion of the population undergoing treatment. Third, we provide a complete solution for the moderate infection regime (with constant treatment). Finally, we give a thorough interpretation of the control in the moderate infection regime, while Gatto and Schellhorn (2021) focused on the interpretation of the low infection regime. Finally, we compare the efficiency of our control to curb the COVID-19 epidemic to other types of control.

Entities: Chemical

Keywords: COVID-19; Epidemiology; Population control; SIR model; Stochastic optimal control

Mesh：

Year: 2021 PMID： 34922976 PMCID： PMC8675184 DOI： 10.1016/j.mbs.2021.108758

Source DB: PubMed Journal: Math Biosci ISSN： 0025-5564 Impact factor: 2.144

Introduction

This article extends the analysis of the model presented in [1]. In that article the authors gave analytical expressions for the optimal proportion of infected to undergo treatment in a pandemic. Analytical approaches allow to better understand the form of the solution compared to numerical approaches, which are currently more prevalent in this domain. It was possible for the authors to find formulae because of the type of objective function they chose. Rather than minimize an expected value of the number of infected or the cumulated number of infected (under a constraint on the dispersion of the results), we chose to minimize the expected value of a convex increasing function of the number of infected at the horizon. The quadratic utility function is a special case of the latter, and we indeed show results for a whole family of functions, called isoelastic functions. Each function in this family is characterized by a single parameter, called the risk-aversion parameter, and varying this parameter as we shall see enables us to better understand how the optimal control depends smoothly on the level of aversion to risk, where risk is understood as the probabilistic uncertainty of the result. In contradistinction, the optimal control of deterministic epidemiological models is often of the bang–bang type (see for instance [2], [3]). Several authors, such as [3], [4], [5], [6] use Pontryagin’s maximum principle. Some authors (e.g., [7]) use dynamic programming. Laarabi et al. [8] use the same framework, but add delay. The literature on the control of stochastic epidemiologic models tends to be more sparse and more recent. A variety of models and numerical methods has been proposed: Markov chain [9], backward SDE solved by the 4-step scheme [10], simulated annealing [11], stochastic programming [12], genetic algorithms [13]. Wang et al. [14] consider time-varying parameters. The contributions of this article are fourfold. First, we prove existence of a solution. Second, whereas in [1] the optimal control has the interpretation of the proportion of the population that takes a basic dose of treatment, so that occurs only if a proportion of the population takes more than a basic dose of treatment. In the low infection regime part of our paper, is constrained to be between zero and one, and represents thus the proportion of the population undergoing treatment. The latter interpretation is much more realistic, as it is uncommon to ration treatment. Third, we provide a complete solution for the moderate infection regime (with constant treatment). The final improvement is a thorough numerical analysis and sensitivity analysis of the moderate infection regime, while [1] focused exclusively on the interpretation of the control in the low infection regime. This enables us to discover some errors in the second-order term of the solution in [1], which we correct here. Finally, we compare the efficiency of our control to curb the COVID-19 pandemic to other types of control. Our optimal control is, as expected, superior to a full control (or no control), in terms of expected utility. It is clearly superior in the case of low infection, but the benefit is less pronounced in the case of moderate infection. There are two possible reasons for that. First, we included only few terms in the analytic series of the optimal control. Adding more terms would have yielded slightly better results. More importantly, the quality of the treatment available to COVID-19 patients in the US in 2020 was probably not sufficient to make a difference if the number of infected had climbed to more than 1%. The structure of the article is as follows. In Section 2 we briefly introduce the model in [1], and provide a proof of existence of the solution. In Section 3, we show our results for the low infection regime. In Section 4, we extend and analyze the solution in the moderate infection regime. Section 5 shows our experimental results when applying our methodology to the COVID-19 in the US in 2020. We draw the conclusion in Section 6.

A stochastic SIR model with treatment uncertainty

Let , , be the proportions of susceptible, infected, and out of infection (recovered, and dead), respectively. Let be the transmission rate and be the death rate. In the SIR model, the rate of decrease of the proportion of susceptible is equal to the constant transmission rate time . As in [1], we add a term , where is white noise, in order to model the error in the transmission rate: Infected patients are either treated or not treated against the disease. In both cases they either recover or die. We denote by () the constant death rate without (with) treatment and by the recovery rate of the treatment. The optimal policy is a progressively measurable process that represents the proportion of the infected population that receives treatment, thus . This constraint is an important addition to the model in [1]. Depending whether the individual is treated or not, there are then four different ways for an infected individual to exit the pool of infected: not treated and recover not treated and die treated and recover treated and died. Thus, the “out of infection rate” will be: For simplicity, we assume that the Brownian motion driving transmission uncertainty () is independent from the Brownian motion driving treatment uncertainty (). Usually (people die faster without treatment than with treatment), but not necessarily. Most of the time (treatment is better than no treatment), but not necessarily. We relax this requirement somewhat by requiring: In order to keep the treatment rate within bounds, we model it as an Ornstein–Uhlenbeck process: with the mean-reversion rate and the long run value of the treatment rate . It is well-known that is Gaussian, with variance equal to: Thus, if mean-reversion is large compared to volatility , constraint (2) is satisfied. We simplify (1) by: Putting everything together, the dynamics of the infected is: We try to minimize a measure of the infected over our horizon . This article focuses on the solution of the Mayer problem, i.e., the objective is expressed as a measure of . Another possible control would have been the time-integral of the number of infected over the horizon (the Lagrange problem). Both figures have their merit,1 and we leave for future research the control of (a measure) of the Lagrange and Bolza problems. Rather than trying to minimize the expected value of the infected, namely , we include in our objective the risk caused by the uncertainty of the model and its observations. Decision-makers are notoriously risk-averse. For this reason, Morgenstern and Von Neumann [15] introduced a class of utility functions that bears their names. A decision-maker in epidemiology that is averse to risk will thus minimize the expected utility of the infected, namely where is increasing and convex. Alternately, one can maximize the negative thereof, i.e., maximize the expected value of a concave and decreasing function of . The policy obtained in maximizing the expected value of a concave utility function can be showed, under certain conditions, to maximize the expected value of the outcome (here ) under a constraint on the dispersion of the outcome. Out of the universe of concave decreasing utility functions, we choose the power utility function: The coefficient is often called the risk-aversion parameter . When the decision-maker is risk-neutral, meaning that the uncertainty does not have an influence on her decisions. It is straightforward to check that this power utility function is concave in when , which we will assume. The more negative the more risk-averse is the decision-maker. Taking for instance , we see that the objective is to which returns the same policy as: The importance of analytic formulations is that other figures of interest in this model, like the expected number of deaths from treatment can be analytically calculated, and depend on . Thus, a decision-maker can calibrate its risk-aversion parameter on other goals. Expected number of deaths is only one type of goal and economic factors that can be easily added. We define Our controlled SIR model is thus: The relative sign of our volatilities and is important. We will assume without loss of generality that . The sign of is the sign of covariance between the measured value of today’s treatment rate and the change in value of the treatment rate between today and a future date. An example may help illustrate the difference. Suppose that over a week one performs daily measurements of the treatment recovery rate as well as daily forecasts of the evolution of the treatment recovery rate over the next day. The two quantities measured each day are proportional to the same white noise 1 day. One then calculates weekly estimates of and of over these 7 daily observations. Since we arbitrarily choose , a negative shows a correlation of ＋1 between the measurement (of today’s treatment rate) and the forecast. Fig. 1 is a depiction of our model.

Fig. 1

A stochastic SIR model.

Let be the time when the infection stops, i.e., the first time when either or or both are equal to zero. We naturally require the control to go down to zero at that time, for definiteness reasons. The next result considers an open-loop control. A stochastic SIR model. For a given progressively measurable control , with and given initial values there exists a unique solution of (3) (4) (5) up to time . The proof of Theorem 1, included in Appendix A, follows the proof of a theorem of Yamada and Watanabe (1971), as exposed in the book by Karatzas and Shreve [16, Prop. 2.13, Sec. 5.2]. We showed in a companion document that the probability that either or is equal to zero over a finite interval is zero, following the method of proof in [17], [18]. It is thus unlikely that a simple discretization of our model results in negative values.2 For notational simplicity we define the impact of treatment risk : as well as the long run impact of the treatment risk : We define and . For simplicity we write . In the absence of bounds on , Gatto and Schellhorn [1] show that, when a smooth optimal closed-loop control exists, there exists a probability measure 3 under which is a martingale and is the optimal state. Moreover, has the explicit form: and is twice continuously differentiable. In the sequel we will shorten this statement by saying that “ solves the martingale problem (6)”.

Results in the low infection regime

We assume close to one and . Thus the term: is assumed constant. With this simplification, we give an analytical solution to the constrained problem, i.e., the case where , a significant improvement over [1], who considered the unconstrained case. We consider first the case where the treatment rate is constant, and then the case where it follows an Ornstein–Uhlenbeck process.

Constant treatment rate

Let . The problem is: The following constant control is optimal: The proof is in Appendix B, and follows closely [19].

Treatment rate as Ornstein–Uhlenbeck process

The problem is In the low infection regime our solution will depend on a kernel with , while in the moderate infection regime it will also depend on two other kernels and that are closely related. In order to unify notation we define the kernels. Define and, for where We provide an explicit formula for in Appendix C. Gatto and Schellhorn [1, Prop. 1] provide an explicit solution to the PDE that in (6) satisfies, but with some typos in the expression of , which we correct here.

Proposition 1 in [1]

If then solves the martingale problem (6) , where where The corresponding control is given by: This control has some very clear properties. It is decomposed into a myopic policy and a hedging policy, namely the second term of (15). Recall that is most often negative. Both myopic control and hedging policies are thus inversely proportional to the degree of risk aversion and to the volatility which corresponds to the contemporaneous transmission measurement error. The hedging policy gives protection against the risk of making decisions too soon. As expected, its magnitude decreases as time approaches the horizon . This is a typical feature of the Mayer problem, which is usually attenuated in the Bolza problem.

Results in the moderate infection regime

We first handle the Ornstein–Uhlenbeck treatment rate case, which was presented in [1, Prop. 2]. The problem is defined in Section 2. We rewrite here for convenience, We further define From this, we can calculate: Let . If then solves the martingale problem (6) , where: where satisfies: The corresponding control where: The proof is in Appendix D. We refer to [1] for a discussion of . In the moderate infection regime, the control (which is the control in the low infection regime) dominates the corrections due to transmission of the disease. Our asymptotic expansion thus indicates how the optimal control should change as the pandemic progresses. The sign of is determined by the signs of and More specifically, is positive if and (19) are both positive or negative. is negative if one of them is positive and the other one is negative. It is obvious that the magnitude of both and decrease with time and are equal to zero when . Therefore, the importance of decreases as time increases. To further discuss the sign of (19), we rewrite it by Thus, suppose , , and are all positive, (19) is positive, and vice versa. In the following cases, we provide two simple cases that we can easily discuss the sign of : if , , then is positive. if , , then is negative. In the following, we discuss the full expansion of the solution in Theorem 4. Consider equation (57) in [1]: This time we use full asymptotic expansion: and obtain: The terms of our asymptotic expansion are thus determined by: We use the Ansatz: We have showed that and in the proof of Theorem 4. By the same process, we can also calculate the expressions for , in the sequel. The problem is: The HJB equation of this problem is obtained by simplifying (6), i.e., making the value function independent from . Let , the solution kernels for are given by: where Let , then solves the martingale problem (6) , where: where , , can be obtained by (47) , and satisfies: The corresponding control is equal to , where and are equal to The proof is in Appendix E, where we also provide a formula for . Observe that is always positive because the signs of and are the same. The signs of and are determined by the sign of .

Application to COVID-19

We use the same weekly US COVID-19 data from June 7, 2020 to November 1, 2020 as in ([1] Sec. 5.1) and the parameters in Table 1 are estimated using the COVID-19 data set. In the following, we show both simulation plots and one scenario real data plots under low infection regime with constant treatment, low infection regime with OU treatment, moderate infection regime with constant treatment, moderate infection regime with OU treatment, respectively. We compare three types of treatment:

Table 1

Parameters.

Treatment parameter	Symbol	Value
Death rate/no treatment	μ0	0.0575
Death rate	μ1	0.0575
Recovery rate/no treatment	K0	0.2559
Recovery rate at time 0	K1(0)	0.2559
Long run value of recovery rate	k¯1	0.4612
Volatility of the measurement of today’s recovery rate	σ	0.4418
Volatility of changes in the recovery rate	σk	−1.1647
Speed of mean-reversion of the recovery rate	λk	0.7692
Transmission rate	β	0.025
Proportion of infected at time 0	ɛ	0.01
Time step	Δt	0.001
Volatility of the measurement of today’s susceptible rate	σS	2.17

no control, i.e., . full control, i.e., . optimal control, i.e. the control from Theorems 2, 3, 4, 5. The Github repository for implementing the models and generating the plots can be found at https://github.com/yujiading/optimal-control-sir-model. Parameters.

Simulations

In Figs. 2, 3, 4, and 5 we use the Euler scheme to simulate scenarios using the parameters in Table 1 and calculate the expected values of and utility of for each regime. The risk-aversion parameters that are considered are between and . In all the regimes, the expected utility of our control is higher than full control and no control, as expected. This effect is more pronounced in the low infection than in the moderate infection regime. When becomes more negative, the expected utility of no control is higher than that of full control. This is because when becomes more negative, the decision-maker becomes more risk averse and trades off expected value of against dispersion of . In the low infection regime, the expected number of infected is lower with full control than with no control, as expected, and, with the optimal control, it depends on the level of risk-aversion, as expected. When treatment is risky the more risk-averse a decision maker, the less he or she is likely to invest in treatment.

Fig. 2

Expected infection and expected utility of optimal control, full control, and no control of low infection regime with constant treatment. Using simulation scenarios.

Fig. 3

Expected infection and expected utility of optimal control, full control, and no control of low infection regime with OU treatment. Using simulation scenarios.

Fig. 4

Expected infection and expected utility of optimal control, full control, and no control of moderate infection regime with constant treatment. Using simulation scenarios.

Fig. 5

Expected infection and expected utility of optimal control, full control, and no control of moderate infection regime with OU treatment. Using simulation scenarios.

Expected infection and expected utility of optimal control, full control, and no control of low infection regime with constant treatment. Using simulation scenarios. Expected infection and expected utility of optimal control, full control, and no control of low infection regime with OU treatment. Using simulation scenarios. Expected infection and expected utility of optimal control, full control, and no control of moderate infection regime with constant treatment. Using simulation scenarios. Expected infection and expected utility of optimal control, full control, and no control of moderate infection regime with OU treatment. Using simulation scenarios.

One scenario real data

In Fig. 6, we use the one scenario COVID-19 data as introduced above to plot the infections with optimal control, full control, and no control for each of the regimes. We can see that as gamma varies, some plots show contrafactual results. In fact, it is possible for a particular scenario to result in a better or worse outcome. This does not contradict our theoretical results, as in stochastic control the goal is not to optimize a single scenario.

Fig. 6

Infection of optimal control, full control, and no control of low infection regime with constant treatment, low infection regime with OU treatment, moderate infection regime with constant treatment, and moderate infection regime with OU treatment. Using real COVID-19 data.

Conclusion

We showed that a stochastic optimal control approach enables to more efficiently use treatment in an pandemic such as COVID-19. On a theoretical level, we show that, in a first approximation, the control does not depend on the transmission of the pandemic, i.e., the control in the moderate infection regime resembles the control in the low infection regime. The influence of key parameters of the problems, namely risk-aversion and volatility, are clearly demonstrated in our formulas. On a practical level, we show that an optimal control would have been better than a full control, had it been available in the low infection regime which we experienced in summer 2020 in the US for COVID-19. However the relative poor efficiency of the treatment that we experienced then would have translated into a poor performance of any type of control, had the pandemic moved into a moderate infection regime. In that regime, the influence of the type of control would have turned out not to be significant. In the Mayer problem, which we study here, the control depends significantly on the horizon. The study of the full Bolza problem remains to be done. Many other interesting problems remain to be solved. For instance, we showed optimality of the constrained control only in the constant, low infection regime case. Verification theorems need to be worked out in the multiple treatment case or the Ornstein–Uhlenbeck case. Optimal vaccination is another area where we believe a similar asymptotic approach can be used. Finally, Bertozzi et al. [20] use Hawkes processes to model COVID-19. The control of Hawkes processes remains a largely open problem that deserves attention, in particular for its application to epidemiology.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

10 in total

Optimal control of the SIR model with constrained policy, with an application to COVID-19.

Introduction

A stochastic SIR model with treatment uncertainty

Results in the low infection regime

Constant treatment rate

Treatment rate as Ornstein–Uhlenbeck process

Proposition 1 in [1]

Results in the moderate infection regime

Application to COVID-19

Simulations

One scenario real data

Conclusion

Declaration of Competing Interest

1. Optimal vaccination policies for stochastic epidemics among a population of households.

2. Finding optimal vaccination strategies for pandemic influenza using genetic algorithms.

3. Optimal vaccination schedules using simulated annealing.

4. Finding optimal vaccination strategies under parameter uncertainty using stochastic programming.

5. Optimal control applied to vaccination and treatment strategies for various epidemiological models.

6. Optimal control and sensitivity analysis of an influenza model with treatment and vaccination.

7. Optimal Control of a Delayed SIRS Epidemic Model with Vaccination and Treatment.

8. The challenges of modeling and forecasting the spread of COVID-19.

9. Time-optimal control strategies in SIR epidemic models.

10. Optimal control of the SIR model in the presence of transmission and treatment uncertainty.