G Nakamura1,2, B Grammaticos1,2, C Deroulers1,2, M Badoual1,2. 1. Université Paris-Saclay, CNRS/IN2P3, IJCLab, 91405 Orsay, France. 2. Université de Paris, IJCLab, 91405 Orsay, France.
Abstract
The severe acute respiratory syndrome COVID-19 has been at the center of the ongoing global health crisis in 2020. The high prevalence of mild cases facilitates under-notification outside hospital environments, and the number of those who are or have been infected remains largely unknown, leading to poor estimates of the crude mortality rate of the disease. Here we use a simple model to describe the number of accumulated deaths caused by COVID-19. The close connection between the proposed model and an approximate solution of the SIR model provides estimates of epidemiological parameters. We find values for the crude mortality rate between $10^{-4}$ and $10^{-3}$, which are lower than estimates obtained from laboratory-confirmed patients. We also calculate quantities of practical interest such as the basic reproduction number and its subsequent increment after relaxation of lockdown and other control measures.
Outbreaks of infectious diseases have been a common occurrence throughout history, often linked to or followed by disruptions in societies and human activities [1]. There are several ways to measure the impact of outbreaks, but death tolls are the most relevant whenever the disease threatens lives. For instance, 32 million people died between 1981 and 2019 in the ongoing HIV epidemic, 700 000 in 2018 alone [2]. Aside from this large-scale epidemic, the world has experienced several other recent outbreaks with varying degrees of severity and scale: Zika fever, whose symptoms are mild but can produce long-lasting effects in newborns (microcephaly) [3], [4]; Ebola virus disease, with a high mortality rate estimated between 20 and 75% [5], [6]; and swine flu (H1N1), which became a pandemic in 2009-2010 although with a lower mortality rate than regular flu [7]. In 2019-2020, the severe acute respiratory syndrome COVID-19 emerged as the most recent pandemic, caused by the virus named SARS-CoV-2 [8], [9]. Because the virus is novel and humans have had no previous exposure to it, they have no immunity against this threat, leading to an increased number of infections. At the time of this writing, the specifics of the pathogen transmission are still being investigated, as well as the complete infection process once the virus enters the host. However, it has been shown that the main human-to-human transmission mode is the spreading of contaminated droplets, similar to other flu-like diseases [8]. In sharp contrast with H1N1, however, the mortality rate of COVID-19 was first estimated in the range 1-4%, with worse outcomes among the elderly [10], [11].
This situation creates a unique scenario where healthcare facilities and workers can be overwhelmed in a short period, ultimately leaving patients of COVID-19, as well as of other diseases, untreated [12].
To make matters even worse, asymptomatic patients can spread the pathogen for an extended period of time, showing no or only mild symptoms during the course of the infection. As a result, laboratory tests to detect the viral load are necessary to identify the correct number of cases outside hospital environments. Similar to the H1N1 pandemic, the required number of tests far exceeds the amount currently available in most countries. Without timely tracking of new cases, contact tracing becomes a challenging task, hindering estimates of new cases per infection, summarized by the basic reproduction number $R_0$. The significance of this parameter lies in the fact that it provides a glimpse of the values of the transmission rates. Those can then be used in compartmental models, i.e., mathematical models that describe the evolution of epidemics assuming nearly homogeneous populations [13], [14]. Earlier estimates for $R_0$ using epidemiological data from Wuhan, China, set $R_0$ between 1.5 and 5.7 without additional measures to restrict the spreading [15], [16], [17]. With measures in place, such as lockdown, self-isolation, and social distancing, $R_0$ was estimated to be around 1.05 [15]. More importantly, the insufficient number of tests, in addition to the long waiting time for lab results, affects the quality of epidemic models.
In the absence of mass testing, the death toll can be used as an alternative metric to probe the extent of the epidemic. The medical staff can assess the cause of death from clinical reports, which may or may not contain test results, to the best of their knowledge. Additional tests may also be appended to reports to further specify the cause of death.
Both numbers of cases and deaths are publicly available as part of a global effort to tackle the pandemic.
Here, we study the evolution of COVID-19 deaths in order to reduce the issues caused by the limited number of laboratory-confirmed tests. We show that the accumulated deaths can be effectively described by simple functions, namely, sigmoids whose parameters are explained in terms of the SIR epidemic model. The SIR model is a compartmental model with the following health states: susceptible, infective, and removed [18]. The removed state represents those who have passed away or have developed immunity, either by recovering from the disease or by any other means such as vaccination. The SIR model was chosen in preference to epidemic models with additional health states or reinfections because it is the simplest model that addresses immunity: it requires fewer parameters and its solutions can be approximated by simpler functions. Among our results, we show that crude mortality rates can be computed from the parameters of sigmoids and that the rates can vary by up to one order of magnitude, depending on the severity of the outbreak in a given region. The paper is organized as follows. Sec. 2 contains the description of the data and variables used throughout the text. Sec. 3 explains how the SIR model is reduced from a system of differential equations to a single non-linear differential equation, with emphasis on the expansion around equilibrium. Data is modeled in Sec. 4 via sigmoidal functions, whose parameters are explained in terms of the epidemiological parameters of the SIR model. Time windows are addressed in Sec. 5, with special emphasis on the crude mortality rate and the quantitative effects of the confinement. In Sec. 6, we investigate the evolution of $R_0$ once control measures are relaxed, using the results from previous sections as input. Final comments and conclusions are listed in Sec. 7.
Data
The European Center for Disease Prevention and Control (ECDC) provides COVID-19 data updated on a daily schedule [19]. The daily reports present the new confirmed cases and new deaths as time series. The dataset also lists the population size of each geographical region according to the 2019 World Bank census.
To better grasp the nature of the data, consider the number of accumulated deaths in France, as shown in Fig. 1. Similar to other European countries, France was heavily affected by the pandemic, passing the mark of 30 000 deaths, with a sharp increase in deaths of infected patients around April-May. These deaths are linked to infections that took place 3 to 6 weeks earlier. On March 16, the French government implemented measures to mitigate the propagation of the disease. The measures included confinement of non-essential workers and temporary closures of schools and universities as well as commercial stores and services. The impact of these measures did not show up immediately in the data (see Fig. 1b); it appeared only after around 4 weeks, as a reduction in the number of daily fatalities. The death toll experienced rapid growth between March and April (see Fig. 1). This regime is the hallmark of epidemics and denotes the exponential phase. In general, the growth rate in compartmental epidemic models is summarized by $R_0 = \beta/\gamma$, the ratio of the transmission rate $\beta$ to the removal rate $\gamma$. As the name implies, the transmission rate dictates the average number of persons a given infective typically infects during a fixed time interval. The inverse of the removal rate is the characteristic time during which a person remains contagious. Using Fig. 1 as reference, the number of accumulated deaths describes a monotonic function with early exponential growth followed by a rapid deceleration. This pattern suggests sigmoids as ideal candidates to model the data.
Fig. 1
Evolution of COVID-19 deaths in France. a) Accumulated COVID-19 deaths (circles). Countrywide measures to increase social distance and enforce confinement were introduced on March 16, and progressively removed starting from May 11 on a per-region basis. b) Asymmetry of the daily deaths (dotted line) in France induced by measures to control and reduce the spreading of the virus among the population. (Solid line) Bézier curve for the 7-day moving average of daily deaths.
In addition to interventions designed to curb local transmissions, travel restrictions were put in place to reduce the circulation of the virus between cities and to/from different countries. Although a clear quantification of the impact of these restrictions still requires the complete picture of the epidemic, they serve as a mechanism to address and limit the influence of spatial effects for a fixed period of time. After that, we must assume that spatial effects are part of the spreading dynamics and become relevant once again. Finally, we used France as a concrete example, but in reality the vast majority of countries adopted similar measures in the same time frame, with few notable exceptions like Sweden.
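The two quantities plotted in Fig. 1, accumulated deaths and a 7-day moving average of daily deaths, can be obtained from a daily time series with a few lines of NumPy. The sketch below is only illustrative: the array `daily` holds hypothetical values, not ECDC data, and the plain moving average stands in for the smoothed curve of the figure.

```python
import numpy as np

# Hypothetical daily death counts (illustrative values only, not ECDC data).
daily = np.array([1, 3, 2, 6, 9, 12, 18, 27, 30, 41, 55, 48, 60, 72], dtype=float)

# Accumulated deaths: running sum of the daily series.
accumulated = np.cumsum(daily)

# 7-day moving average. mode="valid" keeps only full windows,
# so the output has len(daily) - 6 entries.
window = 7
smoothed = np.convolve(daily, np.ones(window) / window, mode="valid")

print(accumulated[-1], smoothed.round(2))
```

Dividing `accumulated` by the population size of the region gives the accumulated deaths per capita, which is the quantity modeled in the following sections.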
SIR model
We seek general features that can be used to model outbreaks with the available data. These features concern the spreading regime of the disease among the population but exclude demographics and spatial effects. Demographics entail the various age brackets, their distribution, their average number of contacts with elements of the population, and effects of the disease on each age bracket. Spatial effects arise due to heterogeneous distributions of population densities, as well as patterns and routes of transmissions, or imported cases. Although the inclusion of data portraying the characteristics of each age bracket and the spatial distribution of the population provides a far more realistic description of the problem, it also implies additional parameters in the model and methods to extract them.
Instead, simpler models involving statistically equivalent individuals are useful to explain general features of the outbreak. That is the case of the classical compartmental models. In the case of COVID-19, the health-disease process can be crudely described by a progression through a sequence of different health stages. Starting from a healthy individual, exposure to the pathogen may trigger an infection with either mild or harsh symptoms. In both cases, virus shedding can occur before symptoms become apparent, resulting in asymptomatic transmissions. If the infection persists, severe clinical symptoms may follow, including pneumonia and shortness of breath, which may require hospitalization and, eventually, lead to death.
For the sake of simplicity, let us assume the spreading dynamics is approximated by the standard SIR model in a homogeneous population of size $N$ [20], [21]. The model comprises a population whose subjects can be classified in three distinct health states, namely, susceptible, infective, and removed. The removed state includes individuals that have either died or recovered from the disease. The latter are assumed to be no longer infective, nor susceptible to become sick again because of some kind of immunity. The fraction of individuals in each compartment is, respectively, $S(t)$, $I(t)$, and $R(t)$ at time instant $t$. The dynamics goes as follows. Infective subjects in the population transmit the pathogen to susceptible ones, under adequate conditions. The transmission occurs with rate $\beta$ and we assume homogeneous mixing of the population, that is, each person in the population is statistically equivalent to any other [14]. Once infected, the person remains infective for an average period $1/\gamma$, where $\gamma$ is the removal rate. The dynamical equations that describe the model are:
$$\frac{dS}{dt} = -\beta S I, \tag{1a}$$
$$\frac{dI}{dt} = \beta S I - \gamma I, \tag{1b}$$
$$\frac{dR}{dt} = \gamma I, \tag{1c}$$
with the constraint $S + I + R = 1$, i.e., conservation of the population size. The model simplifies or neglects aspects of the COVID-19 pandemic such as differentiation between asymptomatic and symptomatic transmission or age-dependent rates [22], [23]. In fact, research on the biological characteristics of the pandemic is still ongoing [24], but evidence indicates re-infections should be minimal in recovered patients [25]. Therefore, we make a case for an approximate description of the problem via the SIR model over the inclusion of extra complexities and uncertainties, aiming to capture dominant aspects.
The system of differential equations (1a-1c) can be further reduced to a single first-order differential equation as follows. From (1c) and (1a), one finds $S = S_0\, e^{-R_0 R}$. The constant $S_0$ depends on the initial conditions $S(0)$ and $R(0)$. Usually, we are more interested in scenarios in which $R(0) = 0$ and thus $S_0 = S(0)$, similar to the onset of an emerging disease. The complete expression for $S_0$ must be used for different initial conditions, which becomes a problem whenever the ratio $\beta/\gamma$ is unknown. Ignoring constant solutions, it can be shown [26] that the equation can be further reduced to
$$\frac{dR}{dt} = \gamma \left( 1 - R - S_0\, e^{-R_0 R} \right), \tag{2}$$
whose general solution can be obtained by quadrature
$$t = \frac{1}{\gamma} \int_{R(0)}^{R(t)} \frac{dR'}{1 - R' - S_0\, e^{-R_0 R'}}. \tag{3}$$
The stationary condition in (2) gives the value $R_\infty$ as a solution of the transcendental equation $R_\infty = 1 - S_0\, e^{-R_0 R_\infty}$.
The solution $R(t)$ can be obtained by inverting (3), for which a general method remains unknown. To circumvent the issue, one may expand (2) around points of interest, for example, around $R = 0$ or $R = R_\infty$. Each expansion has advantages and issues. The former was first introduced by Kermack and McKendrick [18] to study small outbreaks and replaces $e^{-R_0 R}$ by its expansion around $R = 0$ up to second order in (2). The linear term in the expansion dictates the initial exponential growth of $R$ with an effective rate $\gamma (R_0 S_0 - 1)$, while the quadratic term produces the asymptotic value. The remaining constant term accounts for a temporal shift but, in general, approximates the onset of the epidemic.
Crucially, the condition $R_0 R \ll 1$ must hold throughout the full duration of the outbreak, which is incompatible with large-scale epidemics. Examples are quite common, in fact. One may consider the common cold or seasonal influenza, whose typical reproduction number [27] implies that a sizable fraction of the population becomes part of the removed compartment by the end of the outbreak, as illustrated in Fig. 2. Unfortunately, for epidemics, the competition between the early exponential growth and the relaxation towards the asymptotic solution is often difficult to assess near the onset and requires higher powers of $R$.
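As a minimal numerical sketch of the discussion above, the standard SIR equations ($dS/dt = -\beta S I$, $dI/dt = \beta S I - \gamma I$, $dR/dt = \gamma I$) can be integrated directly and the long-time value of $R$ compared with the root of the final-size relation $R_\infty = 1 - S(0)\,e^{-R_0 R_\infty}$, valid for $R(0) = 0$. The rates and initial conditions below are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

# Hypothetical rates (per day), giving R0 = beta / gamma = 1.5.
beta, gamma = 0.3, 0.2
R0 = beta / gamma
S_init, I_init, R_init = 1.0 - 1e-4, 1e-4, 0.0

def sir_rhs(t, y):
    """Right-hand side of the SIR system for fractions (S, I, R)."""
    S, I, R = y
    return [-beta * S * I, beta * S * I - gamma * I, gamma * I]

# Integrate long enough for the outbreak to die out.
sol = solve_ivp(sir_rhs, (0.0, 600.0), [S_init, I_init, R_init],
                rtol=1e-10, atol=1e-12)
R_end = sol.y[2, -1]

def final_size_residual(R):
    """Residual of the transcendental final-size equation (R(0) = 0)."""
    return 1.0 - S_init * np.exp(-R0 * R) - R

# The residual is positive at R = 0 and negative at R = 1.
R_inf = brentq(final_size_residual, 0.0, 1.0)

print(f"R at end of integration: {R_end:.6f}")
print(f"root of final-size equation: {R_inf:.6f}")
```

With these illustrative parameters more than half of the population ends up in the removed compartment, which shows how quickly the small-$R$ condition fails even for moderate reproduction numbers.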
Fig. 2
SIR model. a) Numerical solution of the SIR model (empty circles). Good agreement is found between the SIR model and the sigmoidal curve (5), whose parameters were found by least-square fitting (line); the agreement becomes less accurate for increasing values of $R_0$. b) The curve that represents the infective fraction of the population in the SIR model is symmetrical, with its peak at the center, as long as the parameters remain constant.
Alternatively, we can get a better picture of the problem by expanding around $R_\infty$. There are two immediate advantages in this approach. First, it already carries information about the asymptotic solution, so the constraint $R_0 R \ll 1$ is no longer necessary. In practice, this means the expansion can be used to model large-scale epidemics. Second, as we shall see further below, the convergence to the asymptotic solution does not depend on the initial conditions, unlike the growth rate in the previous expansion. One could argue that this is just a matter of a shift in time. However, communicable diseases with varying degrees of symptoms complicate the problem and introduce a new source of uncertainty to the data. More specifically, patients with mild symptoms are unlikely to seek immediate care and be included in surveillance databases during the early stages of the outbreak. The missing data leads to the under-notification of cases and outcomes which, coupled with large ratios, creates large deviations in the predictions of the model, except when the onset of the outbreak is known a priori.
As in the previous case, we shall limit the expansion around $R_\infty$ to quadratic terms in order to produce analytical results and look for possible applications. The expansion of (2) near $R_\infty$, together with the transcendental equation for $R_\infty$, gives Eq. (4). Keeping terms up to second order converts (4) into a Bernoulli equation with relaxation time $\tau$. Solving it and transforming back to $R$, we find the approximate sigmoidal solution (5) (see Fig. 2).
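The comparison between the numerical SIR solution and a sigmoid can be sketched as follows. The three-parameter logistic used here is an assumed stand-in for the sigmoidal curve (5), whose exact parametrization is not reproduced; the names `R_inf`, `t0`, and `tau`, as well as the rates and initial conditions, are hypothetical choices for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import curve_fit

beta, gamma = 0.3, 0.2  # hypothetical rates (per day), R0 = 1.5

def sir_rhs(t, y):
    S, I, R = y
    return [-beta * S * I, beta * S * I - gamma * I, gamma * I]

# Synthetic "data": removed fraction R(t) sampled once per day.
t = np.linspace(0.0, 300.0, 301)
sol = solve_ivp(sir_rhs, (0.0, 300.0), [1.0 - 1e-4, 1e-4, 0.0],
                t_eval=t, rtol=1e-9, atol=1e-12)
R = sol.y[2]

def logistic(t, R_inf, t0, tau):
    """Assumed sigmoid: plateau R_inf, inflection at t0, timescale tau."""
    return R_inf / (1.0 + np.exp(-(t - t0) / tau))

# Initial guess: plateau from the last point, inflection where R crosses
# half of the plateau, and a timescale of ten days.
p0 = [R[-1], t[np.argmax(R > 0.5 * R[-1])], 10.0]
popt, _ = curve_fit(logistic, t, R, p0=p0)

max_err = np.max(np.abs(logistic(t, *popt) - R))
print("fitted (R_inf, t0, tau):", popt)
print("largest absolute deviation:", max_err)
```

The fit is not exact because the early growth rate and the late relaxation rate of the SIR solution differ, while a logistic curve is symmetric around its inflection point.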
Effective model
The sigmoidal expression in (5) satisfies the requirements listed in Sec. 2 and is an excellent candidate to model accumulated deaths. However, the removed compartment corresponding to $R(t)$ holds both recovered and deceased fractions of the population, i.e., all the infected who are unable to spread the disease. We can simplify this issue by assuming the existence of a simple relation between $R(t)$ and $g(t)$, the accumulated number of deaths divided by the population size at the time instant $t$. To keep the model as simple as possible, we neglect temporal delays and impose
$$g(t) = f\, R(t), \tag{6}$$
where the crude mortality rate $f$ is the ratio between deceased and infected. As such, it can also be used as an estimator for the likelihood of dying after contracting the disease.
The equilibrium value $g_\infty$ varies from country to country and can be used to characterize the outbreak, more specifically, to assess the impact of the outbreak in the afflicted population. The monotonic nature of cumulative quantities together with the upper bound $g_\infty$ restricts the possible functional forms for $g(t)$. Sigmoids are natural candidates to describe $g(t)$ since they are bounded and monotonic. Here we consider the sigmoidal expression (7), in tandem with (5). The timescale $\tau$ dictates the relaxation of $g(t)$ towards $g_\infty$ around the inflection point. Equation (7) has 3 or 4 parameters depending on whether $g(t)$ must pass through the initial entry or not. The former case adds a constraint among the parameters; the latter does not. Since the data are noisy, we do not require the fitting curve to pass through the initial entry.
The relationship between $R(t)$ and $g(t)$ is consistent if the system of equations (8a-8c) holds. This system connects the epidemiological parameters with the parameters of the sigmoid curve (5), which can be estimated by a fitting procedure, for instance, least squares. However, there are more variables than equations, so at least one epidemiological variable must be fixed. There are two reasonable choices, namely, either $S_0$ or $\gamma$.
The first choice should be selected if the data includes the onset of the outbreak, because $S_0 \approx 1$: the majority of the population should be in the susceptible state at the early stage of the epidemic. However, if the beginning of the outbreak is unknown, this assumption is no longer valid. Alternatively, one may consider an estimate for the removal rate $\gamma$ or its probability distribution. In this case, the input parameter to solve (8a-8c) depends solely on the characteristics of the disease and local demographics. It requires neither details concerning the disease spreading nor the moment in which the outbreak begins. Therefore, evidence-based values for $\gamma$ from patient data are far more suitable for the purposes of this study, and shall be used hereafter.
Time windows
Fig. 3 shows the French death toll from March 16 to May 25. We use the opportunity to check whether the sigmoid (7) or the SIR model (1) can describe the data. It becomes clear that the agreement between the data and (7), obtained with a standard non-linear least-square fitting procedure, is poor. In particular, the fitted curve becomes negative in early March. Parameter optimization of the complete SIR model (1) produces satisfactory results, except for the first 20 days. Furthermore, the optimized removal period $1/\gamma$ does not match the signature value for COVID-19. If $1/\gamma$ is held fixed at that value during the optimization, the optimal set of parameters becomes highly sensitive to the initial guess in the optimization algorithm.
Fig. 3
Issues when one tries to model the data with a single time window using French COVID-19 deaths from March 16 to May 25. Parameter optimization of the SIR model (dashed line) agrees well with input data (circles), especially after April 5. The optimization neglects the effects of control measures to reduce the transmission rate, while reducing the removal rate to day. The fit by a sigmoidal curve (solid line) underestimates and the curve turns negative in early March.
The experiment suggests that the SIR model (1) approximates the data, but naive optimizations can return unrealistic epidemiological parameters. On the other hand, the sigmoid (7) only describes a portion of the data. This was not the case with the synthetic SIR data shown in Fig. 2. The inability to reproduce the entire real-world dataset can be explained by assuming that the epidemiological parameters changed at some point during the course of the data collection. In a sense, the simplicity of (7) lacks the flexibility to balance two sets of epidemiological parameters, which can be exploited as a tool to detect changes in epidemiological regimes. Indeed, the assumption finds support in the data of the daily number of deaths (see Fig. 1b and Fig. 4). An asymmetry is observed around the peak of daily deaths, which is in sharp contrast with the corresponding curve in the theoretical SIR model (see Fig. 2b). The asymmetry can be explained by a variation in the transmission rate. Unlike in other recent epidemics, several countries have adopted control measures such as lockdowns of non-essential workers and flight and other travel restrictions. These efforts effectively reduce the transmission rate of COVID-19 once they are in place. Thus, the data must be divided into non-overlapping time windows, each with its own set of epidemiological parameters.
Fig. 4
Global evolution of COVID-19. Accumulated deaths in millions (full circles). Daily deaths in the thousands (line with circles).
As a first example, consider the Danish death toll between April 4 and May 31 as depicted in Fig. 5. The time window is within the second epidemiological regime observed in Fig. 4, which simplifies the analysis. The data are fitted via (7) using the trust region reflective algorithm (Python/SciPy). It is convenient to fit the data using a reparametrization restricted to a finite interval, so that the infection lasts at least one day in the mathematical model. In the fitting procedure, the bounds of one parameter depend on the time interval between the inflection point and the initial entry; when the counting starts after the inflection point, the parameter space of that parameter is limited between 0 and 1. The remaining parameters are bounded as well.
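A bounded least-square fit with the trust region reflective algorithm can be sketched with `scipy.optimize.curve_fit`, which accepts `bounds` and `method="trf"`. The example below is not the fit reported in the text: the data are synthetic, the three-parameter logistic is an assumed stand-in for (7), and the parameter names, true values, and bounds are hypothetical. The inverse timescale `rate` is restricted to the interval between 0 and 1 per day, in the spirit of the bounded reparametrization described above.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(t, g_inf, rate, t0):
    """Assumed logistic form: plateau g_inf, inverse timescale rate, inflection t0."""
    return g_inf / (1.0 + np.exp(-rate * (t - t0)))

# Synthetic accumulated deaths over 58 days with 1% multiplicative noise.
rng = np.random.default_rng(0)
t = np.arange(0.0, 58.0)
true_params = (600.0, 1.0 / 12.0, 10.0)   # hypothetical values
g_obs = sigmoid(t, *true_params) * (1.0 + 0.01 * rng.standard_normal(t.size))

# Box constraints: plateau in [0, 1e4] deaths, rate in [0, 1] per day,
# inflection point within 60 days of the first entry.
lower = [0.0, 0.0, -60.0]
upper = [1.0e4, 1.0, 60.0]
p0 = [g_obs[-1], 0.1, 0.0]

popt, pcov = curve_fit(sigmoid, t, g_obs, p0=p0,
                       bounds=(lower, upper), method="trf")
g_inf_fit, rate_fit, t0_fit = popt
print(f"g_inf = {g_inf_fit:.1f}, tau = {1.0 / rate_fit:.2f} days, t0 = {t0_fit:.2f}")
```

The square roots of the diagonal of `pcov` give rough standard errors of the fitted parameters, which can be checked against the sensitivity to the initial guess `p0`.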
Fig. 5
COVID-19 deaths in Denmark from April 4 to May 31. The sigmoidal fitting (line) of input data (circles) followed by the resolution of (8a-8b) produce and .
The fit of (7) is in excellent agreement with the data, with an equilibrium value corresponding to approximately 600 deaths. Next, we solve the system (8a-8c) for $1/\gamma$ distributed according to some probability distribution. In practical terms, the width of the probability distribution forms the basis to compute uncertainties of the transmission and crude mortality rates. However, the finer details of the distribution of $1/\gamma$ remain unknown, so we resort to a uniform distribution for virus shedding, whose interval lies between 3 and 7 days [22], [28]. By doing so, we overestimate the uncertainties of the remaining epidemiological parameters of the SIR model. Our estimate for the crude mortality rate is compatible with the estimate obtained by screening antibodies of 20 640 blood donors below the age of 70 in Denmark [29]. Also, the asymptotic value places the fraction of infected well below the threshold for herd immunity.
We can now move to more complicated cases, in which the data themselves contain artifacts and the fitting procedure can be tricky. That is the case of France, for instance. The lockdown was issued on March 16, and unaccounted deaths in nursing homes were added to the official statistics on April 3 and 4, producing a large fluctuation in the number of daily deaths. In this case, we separate the time windows according to the number of deaths. The first time window lies between 100 and 5000 deaths (March 16 - April 3), whereas the remaining days until May 31 comprise the second time window, as shown in Fig. 6.
Fig. 6
Evolution of COVID-19 in France between March 16 and May 31. Reduced number of data points for clarity. Data (full circles) and sigmoidal fit (solid line) between March 16 and April 3, with and crude mortality rate . The effects of confinement emerge shortly after April 3, with a significant reduction in as indicated by the sigmoidal fit (dashed line) of data in the second time window (empty squares). The mortality rate remains nearly unchanged .
The curve fitting in the second time window is far more stable and insensitive to initial guesses or bounds, and yields a time scale that approaches the typical recovery time for mild COVID-19 infections. The solution of the system (8a-8c) returns $R_0 = 0.74 \pm 0.08$ for the second time window (Table 1), in agreement with the decline of new infections. Hereafter, an index 1 or 2 labels quantities of the first and second time windows, respectively. In addition, the crude mortality rate in the second time window reads $f = (4.67 \pm 1.16) \times 10^{-3}$, one order of magnitude higher than in Denmark or Germany (see Table 1).
Table 1
Parameters of the SIR model and sigmoid from April 4 to May 31 for selected countries, using a uniform distribution for the removal period $1/\gamma$.

| Country | $\beta$ | $R_0$ | $f$ ($10^{-3}$) | $S_0$ | $g_\infty$ ($10^{-4}$) |
|---|---|---|---|---|---|
| France | 0.17±0.07 | 0.74±0.08 | 4.67±1.16 | 0.97±0.02 | 4.40±0.02 |
| Italy | 0.19±0.07 | 0.84±0.07 | 4.17±0.78 | 0.96±0.02 | 5.75±0.01 |
| Spain | 0.19±0.07 | 0.86±0.06 | 1.93±0.54 | 0.87±0.08 | 6.02±0.02 |
| UK | 0.18±0.07 | 0.82±0.07 | 5.86±1.08 | 0.97±0.01 | 6.26±0.05 |
| Germany | 0.18±0.07 | 0.83±0.07 | 0.39±0.11 | 0.90±0.06 | 1.05±0.01 |
| Belgium | 0.17±0.07 | 0.76±0.08 | 4.75±1.35 | 0.93±0.04 | 8.27±0.02 |
The curve fitting in the first time window, which includes the phase with exponential growth, can be challenging, though, even more so if the fitting data contain a large number of entries around the inflection point induced by the introduction of control measures. This is not the natural inflection point that occurs in an uncontrolled epidemic with constant parameters. Due to the significance of inflection points in the curve fitting, the induced inflection point artificially lowers the fitted asymptotic value in the first time window. Furthermore, multiple solutions can be found for (8a-8c), several of which are not realistic. Instead of using brute force, it is far more convenient to approximate the solution and set $S_0 \approx 1$. In that case, (8c) is discarded and the remaining equations determine the other parameters. The crude mortality rate in the first time window shares the same order of magnitude as that of the second window, indicating that overall the French health care system remained responsive throughout the epidemic. If control measures had not been implemented, the expected number of deaths would have soared and reached 240 000, assuming the mortality rate would have remained roughly the same.
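The entries of Table 1 can be cross-checked against each other with a short calculation. The sketch below takes the French values of $R_0$, $S_0$, and $g_\infty$ from the table and assumes three relations that are our reading of the model rather than a reproduction of (8a-8c): the final-size equation $R_\infty = 1 - S_0 e^{-R_0 R_\infty}$, the proportionality $g_\infty = f R_\infty$, and the relaxation time $\tau = 1/[\gamma (1 - R_0 (1 - R_\infty))]$ obtained by linearizing the reduced SIR equation around $R_\infty$. The removal period of 5 days is also an assumption, taken as the midpoint of the 3 to 7 day interval.

```python
import numpy as np
from scipy.optimize import brentq

# France, April 4 to May 31 (central values copied from Table 1).
R0, S0, g_inf = 0.74, 0.97, 4.40e-4
gamma = 1.0 / 5.0  # assumption: midpoint of the 3-7 day removal period

def final_size_residual(R):
    """Residual of R = 1 - S0 * exp(-R0 * R)."""
    return 1.0 - S0 * np.exp(-R0 * R) - R

# Residual is positive at R = 0 (equal to 1 - S0) and negative at R = 1.
R_inf = brentq(final_size_residual, 0.0, 1.0)

# Crude mortality rate implied by g_inf = f * R_inf.
f = g_inf / R_inf

# Relaxation time from the linearization around R_inf (assumed relation).
tau = 1.0 / (gamma * (1.0 - R0 * (1.0 - R_inf)))

print(f"R_inf = {R_inf:.4f}, f = {f:.2e}, tau = {tau:.1f} days")
```

Under these assumptions the implied crude mortality rate falls within the uncertainty quoted for France in Table 1, and the relaxation time is of the order of two weeks.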
What happens next?
In the previous section, the second window was delimited from April 4 to May 31, a period that included lockdowns, prohibition of large gatherings, and other non-pharmaceutical interventions to control the virus circulation. The window also coincides with the period in which spatial effects have less influence on the spreading dynamics. In general, spatial effects are important in outbreaks because they change the chain of transmissions when compared to the random mixing hypothesis. In essence, they introduce correlations between persons, which breaks a central assumption in compartmental models. The second window can be seen as the regime in which compartmental models come closest to describing real data. Therefore, estimates of epidemiological parameters produced during the second window should approximate their correct values and remain roughly unchanged throughout the full interval.
What happens after the restrictions are lifted or relaxed? With reduced power to limit virus circulation, it is natural to assume that the transmission rate should increase. According to our findings so far, (7) should be unable to handle two sets of epidemiological parameters, deviating from the data. Indeed, this outcome is seen in Fig. 7 when one attempts to fit (7) via the least-square procedure from April 4 to September 1, i.e., when the second window is extended by three months. By late July, the fitted curve and the data grow at different rates, with the latter being approximately linear.
Fig. 7
Deaths in France between March 16 and September 1. The shaded region indicates the second time window. Reduced density of data points (circles) for clarity. The least-square fit of (7) deviates from the data around late June. (inset) Linear fit approximates the growth observed between June 30 and September 1.
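The comparison summarized in the caption of Fig. 7 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the generic logistic form stands in for (7), whose exact expression is not reproduced in this section, and the data are synthetic (a saturating curve plus a slow linear tail), not the French death toll. The names sigmoid, restart and tail are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit


def sigmoid(t, K, t0, tau):
    """Generic logistic curve (assumed form): K is the capacity, tau the time scale."""
    return K / (1.0 + np.exp(-(t - t0) / tau))


# Synthetic cumulative deaths (hypothetical numbers, NOT real data):
# a saturating outbreak followed by slow linear growth after day `restart`.
t = np.arange(0, 150, dtype=float)
restart = 90.0
tail = 15.0 * np.clip(t - restart, 0.0, None)
deaths = sigmoid(t, 30000.0, 10.0, 9.0) + tail

# Least-squares fit of a single sigmoid over the whole (extended) window.
popt, _ = curve_fit(sigmoid, t, deaths, p0=(deaths[-1], 10.0, 10.0))
residual = deaths - sigmoid(t, *popt)

# Linear fit restricted to the late segment, as in the inset of Fig. 7.
late = t >= restart
slope, intercept = np.polyfit(t[late], deaths[late], 1)

print("fitted capacity K:", popt[0])
print("residual on the last day:", residual[-1])
print("late-window slope (deaths per day):", slope)
```

Because a single sigmoid saturates at its capacity, it cannot follow the linear tail, so the residual grows towards the end of the window while the straight line describes the late segment well.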
The next question concerns the new parameters in the third window, which starts around June 30 and lasts until September 15, the last day in our data. Here we exploit the early growth observed during the last days in Fig. 7 and match that behavior with the expansion of the SIR model around the beginning of the third window. In what follows, primed (non-primed) variables or parameters belong to the third (second) window.
We start by expanding (2) around the beginning of the third window, . As briefly explained in Sec. 3, the expansion produces, among other corrections, a linear term whose coefficient is . The linear fit of the data produces in which is the linear coefficient and is the crude mortality rate for the window. The constant depends on and also on and which again are not known directly from the data. With hospitals and other healthcare services operating below their maximum capacity, it is reasonable to assume that both removal and crude mortality rates remain unchanged, and so that and . Analogously, we evaluate using the results from the previous time window as input. After plugging in the expressions we find (9).
It is convenient to redefine and so that (9) becomes which is the equation satisfied by Lambert W functions. The solution is obtained numerically or graphically and gives the updated value . Table 2 displays the values of R0′ obtained for selected European countries. The analysis shows that the relaxation of control measures led to an increase in the transmission rate for all countries investigated.
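As a hedged illustration of the numerical step, the sketch below solves the defining equation of the Lambert W function, w exp(w) = z, with SciPy. It does not reproduce (9) or the redefinitions used in the text; the value z = -0.2 and the helper name solve_w_exp_w are purely hypothetical. For -1/e < z < 0 two real branches exist, so the physically meaningful root has to be selected from context.

```python
import numpy as np
from scipy.special import lambertw


def solve_w_exp_w(z, branch=0):
    """Return the real solution w of w * exp(w) = z on the requested branch."""
    w = lambertw(z, k=branch)  # SciPy returns a complex number
    if abs(w.imag) > 1e-12:
        raise ValueError("no real solution on this branch")
    return float(w.real)


z = -0.2  # hypothetical argument, chosen inside (-1/e, 0)
w_principal = solve_w_exp_w(z, branch=0)   # branch with w >= -1
w_lower = solve_w_exp_w(z, branch=-1)      # branch with w <= -1

# Both roots satisfy the defining equation.
print(w_principal, w_principal * np.exp(w_principal))
print(w_lower, w_lower * np.exp(w_lower))
```

A graphical solution, as mentioned in the text, amounts to plotting w exp(w) against w and reading off the intersections with the horizontal line at z.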
Table 2
Updated estimates for R0′ in the third time window (June 30 to September 15) using input parameters obtained from the data analysis of the previous window.
Country     R0 (2nd window)    R0′ (3rd window)
France      0.74               1.01 ± 0.02
Italy       0.84               1.01 ± 0.02
Spain (a)   0.86               1.08 ± 0.06
UK          0.82               1.01 ± 0.01
Germany     0.83               1.05 ± 0.06
Belgium     0.76               1.02 ± 0.04
(a) From July 30 to September 1.
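The entries of Table 2 can be compared directly. The short sketch below computes, for each country, the change in the basic reproduction number between the second and third windows using only the central values listed in the table. The dictionary name table2 is hypothetical, and the quoted uncertainties are ignored for simplicity.

```python
# Central values from Table 2: (R0 in the 2nd window, R0' in the 3rd window).
table2 = {
    "France": (0.74, 1.01),
    "Italy": (0.84, 1.01),
    "Spain": (0.86, 1.08),
    "UK": (0.82, 1.01),
    "Germany": (0.83, 1.05),
    "Belgium": (0.76, 1.02),
}

# Absolute increase of the basic reproduction number after relaxation.
increase = {country: round(r3 - r2, 2) for country, (r2, r3) in table2.items()}

# Check that every country moved from below 1 to above 1.
crossed_threshold = all(r2 < 1.0 < r3 for r2, r3 in table2.values())

for country, delta in increase.items():
    print(f"{country}: +{delta:.2f}")
print("all countries crossed R0 = 1:", crossed_threshold)
```

In every listed country the central estimate moves from below one to slightly above one, consistent with the statement that the transmission rate increased once the control measures were relaxed.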
Conclusion
The tragic developments in the COVID-19 pandemic have exposed flawed aspects in protocols used to assess large-scale epidemics. Despite the various improvements in the global capacity to produce laboratory tests, usually based on real-time reverse-transcription polymerase chain reaction (rRT-PCR), the majority of afflicted countries were unable to enforce early mass testing policies. This shortcoming was also experienced in 2002 during the SARS epidemic, but on a smaller scale. The lack of mass testing contributed to keeping the number of cases unknown, affecting the accuracy of disease spreading models. In this paper, we resort to the death toll as our primary data source, as death certificates are mandatory and contain other medical assessments linking the cause of death with COVID-19 outbreaks. We emphasize that this methodology is itself not immune to sub-notification, nor to unforeseen delays in death certificates.
We model the death toll via the sigmoid curve in (7), which requires the parameters (capacity) and (time scale). The capacity limits the growth of the outbreak, which is expected to vary with geographic region, healthcare quality, and efforts to control the spreading of the virus. Together with the crude mortality rate which also depends on healthcare and population demographics, they give access to a far more credible estimate for the infected fraction of the population . To keep the model as simple as possible, we also neglect temporal delays between and . The inclusion of said delay may introduce additional effects, but those are not expected to be dominant in this case since spatial effects are not being investigated.
The advantage of our approach lies in the curve fitting of a monotonic curve (death toll) to a sigmoid, in sharp contrast to complex optimization schemes for models with multiple health states [22], [23].
Epidemiological parameters are extracted by solving the algebraic system of equations (8a-8c) for transmission rate, crude mortality rate and initial condition, respectively, and . Alternatively, the epidemiological parameters calculated from the curve fitting can be used as educated guesses in optimization algorithms, reducing the likelihood of obtaining unrealistic optimal solutions. We stress that if is known, then the fraction of reported cases can be easily computed . We find that ranges from to depending on geographic location. Such values lie well below recent estimates [30] using only rRT-PCR tests from hospitalized cases (), whereas they are more in line with the values obtained from antibody screening with large sample size in Denmark [29]. Interestingly, countries lightly afflicted by the epidemic exhibit lower even though the transmission of the virus is similar to neighbouring countries. Taking Germany or Denmark as examples, we see that both have while maintaining compatible with other European countries. The result is likely attributable to healthcare resources and services being either more accessible or more efficient, including early surveillance.
The sigmoidal fitting becomes far more reliable once the outbreak has been active for some time, as the data start to move away from the exponential phase. The distance from the inflection point is another important factor that affects the quality of the fit, especially . The introduction of lockdowns and other social distancing measures reduces the transmission rate and effectively creates a new inflection point, which is different from the one expected without control measures. Thus, the fitting curve may converge to an incorrect equilibrium value if the fitting data include points around the induced inflection point. This can only be solved by introducing disjoint time windows for different regimes of as Fig. 6 shows.
The difference between equivalent parameters in subsequent windows returns the effectiveness of interventions. Taking France as a concrete example, confinement reduced the basic reproduction number by while lifting and relaxation of restrictions resulted in . We stress that parameters calculated within the second time window do not require additional data other than the removal rate. That is not the general case, especially in regimes with exponential growth and an unknown number of susceptible and infected individuals. Strict control measures mitigate uncertainties concerning and that often arise in curve fitting procedures. As a result, parameters extracted within time windows with strict control measures can also be used as input in subsequent windows, which greatly simplifies the analysis if they exhibit exponential growth.
Concerning deviations in parameters estimated through (8a-8c), they can be traced to the uncertainties in the removal rate. In general, the inverse of the removal rate describes the average time required for an infective person to change to the removed state. For COVID-19, virus shedding occurs more prominently between 3 and 7 days, with a peak at day 5 since the onset of symptoms [28]. The exact probability distribution for the removal rate remains an open issue, so our analyses assume a uniform distribution.
Finally, the SIR model rests on the random mixing hypothesis, but deviations are expected, with stronger effects in populations with varying demographics or populations at risk. In particular, communicable respiratory diseases become major issues in correctional facilities, given the lack of adequate environmental and sanitary conditions. The combination of higher transmission rate of pathogens and reduced healthcare can increase the crude mortality rate for incarcerated individuals. By a similar argument, outbreaks in nursing homes may affect estimates of epidemiological parameters because the disease becomes disproportionately more lethal for older patients. In this study, spatial effects are neglected to understand the disease spreading of the average population with the minimal number of parameters possible.
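The way an uncertain removal rate propagates into derived quantities can be sketched with a simple Monte Carlo estimate. The sketch rests on several assumptions that are not taken from the analysis above: the removal rate is drawn uniformly between 1/7 and 1/3 per day (motivated by the 3 to 7 day shedding interval), the transmission rate beta is a hypothetical fixed value, and the standard SIR relation R0 = beta / gamma replaces the full solution of (8a-8c).

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility

# Assumed range for the removal rate gamma (per day), uniform distribution.
gamma_min, gamma_max = 1.0 / 7.0, 1.0 / 3.0
gamma = rng.uniform(gamma_min, gamma_max, size=100_000)

beta = 0.15  # hypothetical transmission rate (per day), for illustration only

# Standard SIR definition of the basic reproduction number.
r0 = beta / gamma
r0_mean, r0_std = r0.mean(), r0.std()

print(f"R0 = {r0_mean:.2f} +/- {r0_std:.2f}")
```

Sampling the infectious period 1/gamma uniformly instead of gamma itself would give a different spread, which is one reason the choice of distribution matters for the quoted uncertainties.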
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.