Closed-form expressions and nonparametric estimation of COVID-19 infection rate.

Mauro Bisiacco¹, Gianluigi Pillonetto¹, Claudio Cobelli¹,².

Abstract

Quantitative assessment of the infection rate of a virus is key to monitoring the evolution of an epidemic. However, this variable is not accessible to direct measurement, and its estimation requires the solution of a difficult inverse problem. In particular, being the result not only of biological but also of social factors, the transmission dynamics can vary significantly over time. This makes questionable the use of parametric models, which may be unable to capture their full complexity. In this paper we exploit compartmental models which include important COVID-19 peculiarities (such as the presence of asymptomatic individuals) and allow the infection rate to assume any continuous-time profile. We show that these models are universal, i.e. capable of reproducing exactly any epidemic evolution, and extract from them closed-form expressions of the infection rate time-course. Building upon these expressions, we then design a regularized estimator able to reconstruct COVID-19 transmission dynamics in continuous time. Using real data collected in Italy, our technique proves to be a useful tool to monitor COVID-19 transmission dynamics and to predict and assess the effect of lockdown restrictions.

Keywords: COVID-19; Compartmental models; Epidemic spread; Infection rate; Nonparametric estimation; System identification

Year: 2022 PMID: 35400084 PMCID: PMC8976198 DOI: 10.1016/j.automatica.2022.110265

Source DB: PubMed Journal: Automatica (Oxf) ISSN: 0005-1098 Impact factor: 6.150

Introduction

Hundreds of countries and territories around the world have been affected by the spread of COVID-19 (Velavan and Meyer, 2020, Wittkowski, 2020). Even though many efforts are now devoted to vaccine administration, important tools to contain the pandemic, also in view of possible variants of the virus, are social distancing measures coupled with precautions such as masks, massive testing and tracing, or more severe restrictions such as lockdowns (Lavezzo, Franchin, Ciavarella, et al., 2020). A better understanding of COVID-19 dynamics would be fundamental to monitor, and possibly increase, the effectiveness of such actions. Estimating the effect of a lockdown on people’s behavior is in fact key to informing health-care decisions on emergency management. Knowledge of the infection rate of a virus is key to achieving this goal. It makes it possible to monitor the epidemic evolution by assessing the transmissibility and contagiousness of pathogenic agents. Unfortunately, this epidemiological variable is not accessible to direct measurement and its estimation is far from trivial. In fact, the infection rate depends not only on the biological characteristics of the virus but also on all the factors that influence human behavior and the contact rate among people. Hence, it can change significantly over time due to social organization, risk perception, use of masks and seasonality (people tend to take fewer precautions during the summer season and to stay mostly in enclosed spaces when coming back to work). An important class of models to describe COVID-19 dynamics relies on the so-called compartmental models, where the population is assumed to be well mixed and divided into categories. The most popular description is the SIR model, which includes three compartments containing susceptible (S), infected (I) and removed (R) individuals (Kermack & McKendrick, 1927).
Since its inception, many SIR variants have been proposed in the literature to describe even more complex dynamics, e.g. the SEIR model, which also includes the class of exposed (E), i.e. people who host the infection but cannot yet transmit the disease (Bootsma and Ferguson, 2007, Capasso and Serio, 2007, Korobeinikov and Maini, 2005, Liu et al., 1987). Additional differential equations have also been included to describe how people can react to knowledge of infections and risk of death, as well as to account for the increase in vaccination rate and its consequences on the susceptible fraction of individuals (Buonomo et al., 2008, Funk et al., 2009, Kiss et al., 2010, Samanta and Chattopadhyay, 2014, Yu et al., 2017). As far as the COVID-19 pandemic is concerned, new SIR-type models have been developed along this line, e.g. to describe the effects of delays and risk perception of individuals (Anastassopoulou et al., 2020, Casella, 2020, Lin et al., 2020, Weitz et al., 2020), as well as the SAIR model, which includes the class of asymptomatic people who play a major role in transmitting the virus (Bi and Beck, 2021, Lavezzo et al., 2020, Sadeghi et al., 2021, Wang et al., 2020). The above-mentioned compartmental models use quite simple parametric descriptions of the infection rate, which is hereafter written as a function of time to stress its time dependence. For instance, just two parameters are introduced in Bertozzi et al., 2020, Gatto et al., 2020, Lavezzo et al., 2020 to describe the different levels that the infection rate may assume before and after the lockdown is set. While approximate expressions of the infection rate for the SIR model have been developed in [19], to better capture the complexity of the problem, in this work we use a SAIR model where (along the lines of Pillonetto, Bisiacco, Palù, and Cobelli (2021), which considers the SIR model) the infection rate can assume any temporal profile.
Well-posedness of the problem will then be restored through a nonparametric regularized strategy which also incorporates knowledge of COVID-19 dynamics coming from a recent seroprevalence study performed in Italy. The paper is organized as follows. In Section 2 we introduce the SAIR model of COVID-19 dynamics. In Section 3 we report some important properties of such models and describe our infection rate estimator. In Section 4 case studies regarding the Italian scenario are illustrated. Conclusions then end the paper, while all the mathematical details are gathered in Appendix A.

SAIR model and nonparametric description of COVID-19 infection rate

The classical SIR model includes the classes S(t), I(t) and R(t), which evolve in time and contain, respectively, susceptible, infected and removed people. They are normalized, so that their sum is equal to one for any t. In the SAIR model, the class A(t) contains two kinds of infected people. The first are asymptomatic/paucisymptomatic individuals, who can recover directly at their own recovery rate. The others move to the second class of infected, I(t), at a given transition rate and then recover at a separate recovery rate. This leads to a SAIR model in which the time-varying infection rate describes the temporal evolution of the interaction between the susceptible and the two classes of infected. For models including several classes of infected, a scalar reproduction number can still be defined in terms of the eigenvalue of a matrix which includes all their effects (van den Driessche & Watmough, 2002). In this work, we instead introduce different functions to monitor the evolution of different classes of infected. Such functions can be seen as generalized time-varying reproduction numbers. In particular, from the second differential equation one can define a first generalized reproduction number, given in (2), such that values smaller than one imply that A(t), the class which feeds both I(t) and R(t), decreases over time. This gives crucial information, e.g. to assess whether restraints are effective, predicting that the number of people in intensive care will soon be under control. After summing the second and third equations, one can also introduce a second reproduction number, given in (3), where values smaller than one now indicate that the total number of infected is decreasing. In Appendix A we will also study an even more complex variant of this model where two infection rates describe separately the interaction of the susceptible with A(t) and with I(t), respectively. Finally, a measurement equation is introduced that relates the observable output to the model state through an unknown scalar multiplier. This parameter is necessary since the true number of infected is never perfectly known during an epidemic. The observable measurements instead represent epidemiological data, e.g. the number of hospitalized or diagnosed infected.
Data regarding the diagnosed infected have some limitations. They do not give accurate information on how many subjects were affected by the virus on a given day, due to delays in swab processing. In addition, the number of swabs performed and the criteria used to select the people who are tested may vary in time. The number of hospitalized people and, even more, of patients in critical care appears more reliable and informative on COVID-19 dynamics. For this reason, in our model the multiplier formalizes the assumption, which appears statistically reasonable, that the true number of infected is proportional to the number of people in intensive care. Reconsidering (3), this assumption implies that a value of the reproduction number smaller than one ensures that the number of people in intensive care is decreasing at instant t.
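As a hedged illustration of the compartmental structure described in this section, the following Python sketch integrates one plausible form of a time-varying SAIR model and evaluates two inflow/outflow ratios that play the role of the generalized reproduction numbers. The symbols `a`, `eps`, `gamma1` and `gamma2`, the specific right-hand sides, the step-shaped infection rate and all numerical values are assumptions made for illustration; they are not taken from the paper's equations.

```python
def simulate_sair(a, eps, gamma1, gamma2, a0, days, dt=0.01):
    """Forward-Euler integration of an assumed time-varying SAIR model.

    a(t)   : infection rate (callable of time in days)
    eps    : rate at which people in A develop symptoms and move to I
    gamma1 : recovery rate of asymptomatic people in A
    gamma2 : recovery rate of people in I
    a0     : initial fraction in A (I and R start at zero)
    Returns one (S, A, I, R) tuple per day, starting at day 0.
    """
    S, A, I, R = 1.0 - a0, a0, 0.0, 0.0
    steps_per_day = round(1.0 / dt)
    daily = [(S, A, I, R)]
    for k in range(days * steps_per_day):
        new_inf = a(k * dt) * S * (A + I)      # S interacts with both infected classes
        dS = -new_inf
        dA = new_inf - (eps + gamma1) * A
        dI = eps * A - gamma2 * I
        dR = gamma1 * A + gamma2 * I
        S, A, I, R = S + dt * dS, A + dt * dA, I + dt * dI, R + dt * dR
        if (k + 1) % steps_per_day == 0:
            daily.append((S, A, I, R))
    return daily


def reproduction_numbers(a_t, S, A, I, eps, gamma1, gamma2):
    """Inflow/outflow ratios for class A alone and for A + I together."""
    inflow = a_t * S * (A + I)
    r_a = inflow / ((eps + gamma1) * A)          # < 1 means A is decreasing
    r_tot = inflow / (gamma1 * A + gamma2 * I)   # < 1 means A + I is decreasing
    return r_a, r_tot


def lockdown_rate(t):
    """Hypothetical infection rate: 0.5 before day 10, 0.05 afterwards."""
    return 0.5 if t < 10 else 0.05


EPS, G1, G2 = 0.2, 0.1, 0.05                     # illustrative rates (1/day)
traj = simulate_sair(lockdown_rate, EPS, G1, G2, a0=1e-4, days=40)
for day in (5, 30):
    S, A, I, R = traj[day]
    print(day, reproduction_numbers(lockdown_rate(day), S, A, I, EPS, G1, G2))
```

Under these assumptions, both ratios exceed one during the uncontrolled phase and fall below one after the hypothetical restriction, which is the qualitative behavior the generalized reproduction numbers are meant to capture.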

Universal models and infection rate estimator

Universal models and infection rate closed-form expressions

We summarize the main findings (described in detail in Appendix A) which are instrumental for the developments that follow. First, the models enriched with a nonparametric description of the infection rate are extremely expressive. In fact, even the apparently simple time-varying SIR turns out to be universal. Borrowing the term universal from the machine learning literature (Hastie et al., 2001, Micchelli et al., 2006), this means that, given any smooth and positive function, there exists a time-varying SIR whose output is that function. Second, using the SAIR model, the infection rate admits the closed-form expression (4). Such expressions then allow the design of both smoothers and filters (real-time estimators) of the infection rate temporal profile.
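To make the idea of a closed-form infection rate concrete, here is a minimal sketch for the simpler time-varying SIR case, not the SAIR expression (4). Assuming S' = -a(t) S I, I' = a(t) S I - gamma I and R(0) = 0, one gets S(t) = 1 - I(t) - gamma times the integral of I, hence a(t) = (I' + gamma I) / (S I), which involves only the output and its first derivative. The function names, the sigmoid-shaped `true_rate` and the numerical values are hypothetical.

```python
import math


def simulate_sir(a, gamma, i0, days, dt):
    """Euler steps of S' = -a(t) S I, I' = a(t) S I - gamma I, with R(0) = 0."""
    s, i = 1.0 - i0, i0
    traj = [i]
    for k in range(int(round(days / dt))):
        new_inf = a(k * dt) * s * i
        s, i = s - dt * new_inf, i + dt * (new_inf - gamma * i)
        traj.append(i)
    return traj


def infection_rate_from_output(i_traj, gamma, dt):
    """Recover a(t) from I(t) alone via a = (I' + gamma I) / (S I),
    with S = 1 - I - gamma * integral of I (since R' = gamma I and R(0) = 0)."""
    a_hat = []
    integral = 0.0
    for k in range(1, len(i_traj) - 1):
        integral += 0.5 * dt * (i_traj[k - 1] + i_traj[k])   # trapezoid rule up to t_k
        di = (i_traj[k + 1] - i_traj[k - 1]) / (2.0 * dt)    # central difference
        s = 1.0 - i_traj[k] - gamma * integral
        a_hat.append((di + gamma * i_traj[k]) / (s * i_traj[k]))
    return a_hat   # a_hat[j] estimates a at time (j + 1) * dt


def true_rate(t):
    """Hypothetical smooth drop of the infection rate from 0.5 to 0.1 around day 20."""
    return 0.1 + 0.4 / (1.0 + math.exp((t - 20.0) / 3.0))


DT, GAMMA = 0.005, 0.1
i_traj = simulate_sir(true_rate, GAMMA, i0=1e-4, days=60, dt=DT)
a_hat = infection_rate_from_output(i_traj, GAMMA, DT)
max_err = max(abs(a_hat[j] - true_rate((j + 1) * DT)) for j in range(len(a_hat)))
print(f"largest reconstruction error: {max_err:.2e}")
```

The sketch uses noise-free simulated data, so plain finite differences suffice; with real data the derivative would have to come from a regularized estimate.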

Infection rate estimator

The infection rate has to be reconstructed from real epidemiological data. For this purpose the SAIR model will be adopted, exploiting (4). However, since our model is universal, model complexity control is a fundamental issue. First, the output has to be regularized. In fact, (4) requires knowledge of the entire continuous-time system output, but this output is actually known only in noisy and sampled form, using the day as time unit. The measurement noise affecting the intensive care data here describes both possible delays/mistakes in reporting the number of people in intensive care and inevitable modeling errors. Interestingly, as in the SIR case, (4) shows that also when adopting the more complex SAIR model the infection rate depends only on the output and its first-order derivative (ill-conditioning increases with the derivative order). A smooth reconstruction of these two signals can then be obtained by a spline estimator (Bottegal and Pillonetto, 2018, Wahba, 1990) implemented e.g. by a Kalman smoother (Aravkin, Burke, Ljung, Lozano, & Pillonetto, 2017) (or a Kalman filter to obtain an on-line infection rate estimator), where the regularization parameter is estimated from data using marginal likelihood optimization (Aravkin et al., 2012, Bell and Pillonetto, 2004, Pillonetto et al., 2014, Pillonetto and Saccomani, 2006). This defines the first step of our estimator, as graphically illustrated in the left part of Fig. 1. As case studies, Fig. 2 displays the estimates of the output (left) and of its derivative (right) in Lombardy (top) and Italy (bottom).
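The regularization step can be sketched as follows. Note that this is a stand-in technique: instead of the spline estimator implemented by a Kalman smoother with marginal-likelihood tuning, the sketch uses a discrete penalized least-squares (Whittaker-type) smoother with a hand-picked regularization parameter `lam`, followed by finite differences for the derivative. All names and values are hypothetical.

```python
import math
import random


def whittaker_smooth(y, lam):
    """Minimise sum (y_i - z_i)^2 + lam * sum (z_i - 2 z_{i+1} + z_{i+2})^2 over z."""
    n = len(y)
    # Normal equations (I + lam * D^T D) z = y, with D the second-difference operator.
    m = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    for r in range(n - 2):
        stencil = ((r, 1.0), (r + 1, -2.0), (r + 2, 1.0))
        for p, wp in stencil:
            for q, wq in stencil:
                m[p][q] += lam * wp * wq
    z = list(y)
    # The matrix is symmetric positive definite and pentadiagonal, so Gaussian
    # elimination without pivoting only touches two rows/columns past the pivot.
    for col in range(n):
        for row in range(col + 1, min(col + 3, n)):
            f = m[row][col] / m[col][col]
            for c in range(col, min(col + 3, n)):
                m[row][c] -= f * m[col][c]
            z[row] -= f * z[col]
    for row in range(n - 1, -1, -1):               # back-substitution
        acc = z[row]
        for c in range(row + 1, min(row + 3, n)):
            acc -= m[row][c] * z[c]
        z[row] = acc / m[row][row]
    return z


def central_derivative(z, dt=1.0):
    """Central differences inside the record, one-sided differences at the ends."""
    d = [(z[k + 1] - z[k - 1]) / (2.0 * dt) for k in range(1, len(z) - 1)]
    return [(z[1] - z[0]) / dt] + d + [(z[-1] - z[-2]) / dt]


# Synthetic daily "intensive care" counts: a bell-shaped wave plus Gaussian noise.
rng = random.Random(0)
clean_wave = [100.0 * math.exp(-((k - 30) / 10.0) ** 2) for k in range(60)]
noisy_wave = [v + rng.gauss(0.0, 5.0) for v in clean_wave]
smooth_wave = whittaker_smooth(noisy_wave, lam=10.0)
slope = central_derivative(smooth_wave)
```

Larger values of `lam` give smoother profiles at the price of more bias near the peak; choosing it from data is precisely the role played by marginal likelihood optimization in the estimator described above.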

Fig. 1

The figure illustrates the approach for the infection rate reconstruction developed in this paper. The first block of our procedure receives as input epidemiological data given e.g. by the number of diagnosed infected or hospitalized people. It then reconstructs smooth continuous-time profiles of the output of our compartmental model and of its derivative using a spline estimator implemented by a Kalman smoother. The second step combines these estimates and the prior information on the model parameters by exploiting (4). The final result is the continuous-time reconstruction of the infection rate, including lower and upper bounds on its temporal profile.

Fig. 2

Continuous-time estimates of the SAIR output and of its derivative (apart from a normalization factor) in Lombardy (top) and Italy (bottom). The left panels also report the intensive care data (circles).

Second, model complexity is regulated by introducing uncertainty intervals on the system parameters, supported by the literature (Gatto et al., 2020, Giordano et al., 2020, Lavezzo et al., 2020, Worldometer, 2021). The interval assumed for the rate of transition from the first to the second class of infected roughly means that infected people show the first symptoms after three to seven days, while the interval for the recovery rate of asymptomatic people implies that they heal in one to three weeks. The interval for the recovery rate of symptomatic people implies that they recover in an interval ranging from one week to almost two months. A further constraint among these parameters is also adopted. As regards the multiplier, which defines how many people are infected for each individual present in intensive care, an interval is also set. Its lower bound follows from simple considerations on the time-courses of diagnosed infected and people in intensive care in Lombardy and Italy. Its upper bound means that almost 3% of people were infected in Lombardy when the number of people in intensive care reached the peak of the first infection wave. In addition, with the multiplier at its upper bound and using parameter values in the intervals specified above, the time-varying SAIR predicts a total (cumulative) number of infected which can reach 15%.
Hence, the upper bound for the multiplier is robust, since recent studies have assessed the presence of antibodies against SARS-CoV-2 in 7.5% of the population in Lombardy and 2.5% in Italy, e.g. see La Repubblica (Italian newspaper) (2020). Then, as illustrated in the right part of Fig. 1, the inputs to the last block of our estimator are the smoothed profiles of the output and of its derivative and the uncertainty intervals for the model parameters. The closed-form expression in (4) is finally used to compute, for any time instant, the minimum and maximum value of the infection rate compatible with the values which the model parameters can assume, obtaining lower and upper bounds of the infection rate. Hence, bounds around the reproduction numbers displayed in (2), (3) also become immediately available.
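The last block of the estimator can be pictured with the following sketch, which scans a grid of parameter values and keeps the smallest and largest infection rate compatible with them at a single time instant. For simplicity it uses the time-varying SIR form a = (y' + gamma y) / (y S), with infected fraction I = c y and S = 1 - c y - c gamma times the integral of y, rather than the SAIR expression (4). The grid search, the names `gamma_range` and `c_range`, and the numbers in the usage example are assumptions.

```python
import math


def infection_rate_bounds(y, dy, int_y, gamma_range, c_range, grid=25):
    """Lower and upper bound of the infection rate at one time instant.

    y, dy       : smoothed output and its derivative at that instant
    int_y       : integral of the smoothed output from time 0 to that instant
    gamma_range : (min, max) interval for the recovery rate
    c_range     : (min, max) interval for the multiplier linking y to the infected
    Returns (inf, -inf) if no parameter pair on the grid is compatible with the data.
    """
    lo, hi = math.inf, -math.inf
    for p in range(grid):
        gamma = gamma_range[0] + (gamma_range[1] - gamma_range[0]) * p / (grid - 1)
        for q in range(grid):
            c = c_range[0] + (c_range[1] - c_range[0]) * q / (grid - 1)
            s = 1.0 - c * y - c * gamma * int_y    # susceptible fraction implied by the data
            if s <= 0.0:
                continue                           # this parameter pair contradicts the data
            a = (dy + gamma * y) / (y * s)
            lo, hi = min(lo, a), max(hi, a)
    return lo, hi


# Hypothetical smoothed values at one day: output, its derivative, its running integral.
lower, upper = infection_rate_bounds(
    y=1e-4, dy=2e-5, int_y=1e-3, gamma_range=(0.05, 0.15), c_range=(50.0, 200.0)
)
print(lower, upper)
```

Repeating the call for every time instant of the smoothed profiles would yield bound curves of the kind reported in the figures; a finer grid or an analytical treatment of monotonicity could replace the brute-force scan.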

Case studies: infection rate estimation in Lombardy and Italy

To test our procedure, we consider the Italian scenario, focusing in particular on the Lombardy region, the most affected by the outbreak. Results regarding the infection rate are in Fig. 3. It is worth noting that Italy was the first country in Europe to set nationwide restrictions, introducing a lockdown on the whole territory on March 9, 2020. Restrictions were first further reinforced and then gradually relaxed. A second wave of infection affected the country after the summer season and forced the authorities to introduce new restrictions in the whole territory in October. In particular, new restrictions were set around mid-October 2020, and a new lockdown (milder than the first one) was introduced in some regions, including Lombardy, on November 5, 2020.

Fig. 3

Lower and upper bounds of the infection rate time-course in Lombardy (left) and Italy (right). The temporal interval extends from March 1, 2020, until January 15, 2021. Dashed vertical lines indicate the period of the first lockdown, the point in August when the infection rate started increasing significantly, and the beginning of the second lockdown.

The infection rate profiles displayed in Fig. 3 appear realistic and informative. Before the lockdown (the initial interval on the x-axis) the infection rate was around 0.5 both in Lombardy and Italy. Then, it quickly decreased: this describes the rapid change in people’s behavior near and after the beginning of the lockdown. The curves then kept decreasing, a bit more slowly in Lombardy. In the first days of April (after 30 on the x-axis) the transmission rate was below 0.1 and the epidemic was under control, with the infection rate reaching its lowest value a few days before the end of the lockdown. At the end of the restrictions, the infection rate was inside the same interval both in Lombardy and in Italy. After the end of the lockdown, in May, June and partly in July the situation was still under control, with the upper bound not exceeding 0.1. But the curve rapidly increased at the very end of July (153 on the x-axis), in particular in Lombardy. A peak was reached on August 29 in Italy, with the upper bound close to 0.2 and the lower bound larger than 0.1. In September the curve decreased, but in October the situation became critical again, especially in Lombardy, where the upper bound reached 0.28 a few days before November. Then, the effect of the restrictions became evident. The infection rate decreased significantly in Lombardy and, by the last day considered here, January 15, 2021, the upper bound had decreased from 0.28 to 0.1. The imposed restrictions seemed more effective in Lombardy than in Italy, where the decrease is slower, and our estimator provides a precise quantification of this phenomenon. Fig. 4 displays the bounds for the generalized reproduction number defined in (2). Its profile is similar to that followed by the infection rate. It is interesting to note that during the summer season the maximum of this reproduction number was obtained in August, around 1.25 in both Lombardy and Italy. In October the peaks reached the values 1.7 in Lombardy and 1.3 in Italy.
Then, the restrictions proved effective, bringing this reproduction number below the critical threshold of 1. The results thus suggest that the number of infected in A(t) (the compartment containing asymptomatic people and feeding both I(t) and R(t)) is currently decreasing. Our model thus suggests that the total number of infected and, hence, the number of people in critical care, should also start decreasing. This is confirmed by Fig. 5, where the temporal profile of the reproduction number defined in (3) is displayed. Its upper bound became smaller than 1 in January 2021, in both Lombardy and Italy.

Fig. 4

Lower and upper bounds of the generalized reproduction number defined by (2) in Lombardy (left) and Italy (right). Values smaller than one indicate that the class of infected A(t), which also contains asymptomatic people and feeds both I(t) and R(t), decreases over time.

Fig. 5

Lower and upper bounds of the generalized reproduction number defined by (3) in Lombardy (left) and Italy (right). Values smaller than one indicate that the entire class of infected and, hence, the number of people in intensive care, decreases over time.

Conclusions

We feel that this work represents a significant addition to the COVID-19 literature. Models like those described in Flaxman et al., 2020, Gatto et al., 2020, Giordano et al., 2020 certainly have a broader scope than recovering the infection rate. In principle, their complexity makes it possible to describe system dynamics in more depth, e.g. spatial models can overcome the (approximate) homogeneity assumption underlying the compartmental class used here. However, a limitation is that they contain a large number of parameters, and their identification requires the introduction of significant and possibly delicate prior information. Our approach instead requires the user to specify just a few intervals in which the SAIR parameters can take their values. Also, identification of the currently used models of COVID-19 requires sophisticated numerical procedures, e.g. nonconvex optimization procedures, which may get trapped in local minima, or stochastic simulation techniques like MCMC, which are powerful but may be difficult to implement, computationally demanding and subject to uncertain convergence (Gilks, Richardson, & Spiegelhalter, 1996). Here we have shown that, even when asymptomatic people are included in the model, the infection rate admits a closed-form expression which depends only on the SAIR output and its first-order derivative. This makes it possible to design an estimator which does not suffer from local minima and is extremely efficient. Finally, unlike other model-based approaches, our technique returns continuous-time nonparametric estimates, making it possible to monitor the epidemic evolution in detail. We have also obtained estimates of different reproduction numbers to monitor the evolution of different classes of infected, e.g. the class containing asymptomatic people or the one containing all the infected, which make it possible to predict when the number of people in intensive care will start decreasing.
Code implementing the proposed approach is freely available in the folder Nonparametric infection rate reconstruction at the website http://www.dei.unipd.it/~giapi/ under the Software entry.

References

1. Reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission.

Authors: P van den Driessche; James Watmough
Journal: Math Biosci Date: 2002 Nov-Dec Impact factor: 2.144

2. The effect of public health measures on the 1918 influenza pandemic in U.S. cities.

Authors: Martin C J Bootsma; Neil M Ferguson
Journal: Proc Natl Acad Sci U S A Date: 2007-04-06 Impact factor: 11.205

3. The impact of information transmission on epidemic outbreaks.

Authors: Istvan Z Kiss; Jackie Cassell; Mario Recker; Péter L Simon
Journal: Math Biosci Date: 2009-12-03 Impact factor: 2.144

4. Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe.

Authors: Seth Flaxman; Swapnil Mishra; Axel Gandy; H Juliette T Unwin; Thomas A Mellan; Helen Coupland; Charles Whittaker; Harrison Zhu; Tresnia Berah; Jeffrey W Eaton; Mélodie Monod; Azra C Ghani; Christl A Donnelly; Steven Riley; Michaela A C Vollmer; Neil M Ferguson; Lucy C Okell; Samir Bhatt
Journal: Nature Date: 2020-06-08 Impact factor: 49.962

5. The spread of awareness and its impact on epidemic outbreaks.

Authors: Sebastian Funk; Erez Gilad; Chris Watkins; Vincent A A Jansen
Journal: Proc Natl Acad Sci U S A Date: 2009-03-30 Impact factor: 11.205

6. The challenges of modeling and forecasting the spread of COVID-19.

Authors: Andrea L Bertozzi; Elisa Franco; George Mohler; Martin B Short; Daniel Sledge
Journal: Proc Natl Acad Sci U S A Date: 2020-07-02 Impact factor: 11.205

7. The COVID-19 epidemic.

Authors: Thirumalaisamy P Velavan; Christian G Meyer
Journal: Trop Med Int Health Date: 2020-02-16 Impact factor: 2.622

8. Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy.

Authors: Giulia Giordano; Franco Blanchini; Raffaele Bruno; Patrizio Colaneri; Alessandro Di Filippo; Angela Di Matteo; Marta Colaneri
Journal: Nat Med Date: 2020-04-22 Impact factor: 87.241

9. Unique epidemiological and clinical features of the emerging 2019 novel coronavirus pneumonia (COVID-19) implicate special control measures.

Authors: Yixuan Wang; Yuyi Wang; Yan Chen; Qingsong Qin
Journal: J Med Virol Date: 2020-03-29 Impact factor: 20.693

10. Universal features of epidemic models under social distancing guidelines.

Authors: Mahdiar Sadeghi; James M Greene; Eduardo D Sontag
Journal: Annu Rev Control Date: 2021-04-23 Impact factor: 6.091