M. Hashem Pesaran (1, 2), Cynthia Fan Yang (3). (1) University of Southern California, Los Angeles, California, USA. (2) Trinity College, Cambridge, UK. (3) Florida State University, Tallahassee, Florida, USA.
Abstract
This paper develops an individual-based stochastic network SIR model for the empirical analysis of the Covid-19 pandemic. It derives moment conditions for the number of infected and active cases for single as well as multigroup epidemic models. These moment conditions are used to investigate the identification and estimation of the transmission rates. The paper then proposes a method that jointly estimates the transmission rate and the magnitude of under-reporting of infected cases. Empirical evidence on six European countries matches the simulated outcomes once the under-reporting of infected cases is addressed. It is estimated that the number of actual cases could be between 4 and 10 times higher than the reported numbers in October 2020, declining to between 2 and 3 times in April 2021. The calibrated models are used in the counterfactual analyses of the impact of social distancing and vaccination on the epidemic evolution and the timing of early interventions in the United Kingdom and Germany.
Since the outbreak of Covid‐19, many researchers in epidemiology, behavioral sciences, and economics have applied various forms of compartmental models to study the disease transmission and potential outcomes under different intervention policies. The compartmental models are a major group of epidemiological models that categorize a population into several types or groups, such as susceptible (S), infected (I), and removed (recovered or deceased, R). Compartmental models owe their origin to the well‐known SIR model pioneered by Kermack and McKendrick (1927) and have been developed in a number of important directions, allowing for multi‐category (multi‐location), under a variety of contact networks and transmission channels.
In this paper, we develop a new stochastic network SIR model in which individual‐specific infection and recovery processes are modelled, allowing for group heterogeneity and latent individual characteristics that distinguish individuals in terms of their degrees of resilience to becoming infected. The model is then used to derive individual‐specific conditional probabilities of infection and recovery. In this respect, our modelling approach is to be distinguished from the individual‐based models in epidemiology that specify the transition probabilities of individuals from one state to another.
In modelling the infection process, we consider an individual's contact pattern with others in the network, plus an individual‐specific latent factor assumed to be exponentially distributed. The time from infection to recovery (or death) is assumed to be geometrically distributed. The individual processes are shown to aggregate up to the familiar multigroup SIR model. We allow for group heterogeneity and, in line with the literature, assume contact probabilities are homogeneous within groups but could differ across groups.
We then derive the probabilities of individuals within a given group being in a particular state at a given time, conditional on contact patterns, exposure intensities, and unobserved characteristics. These conditional probabilities are aggregated up to form a set of moment conditions that can be taken to the data on the number of infected and active cases both at the aggregate and group (or regional) levels. We make use of the moment conditions to investigate the identification of the underlying structural parameters. Most importantly, we show that while one cannot distinguish between average contact numbers and the degree of exposure to the virus upon contact, it is nevertheless possible to identify the basic and effective reproduction numbers from relatively short time series observations on infections and recoveries. Using Monte Carlo simulations, the small sample properties of the proposed estimator are shown to be satisfactory, with a high degree of precision even when using 2 and 3 weeks of rolling observations.
However, in practice, estimation of the transmission rate must take account of the well‐known measurement problem where the number of infected cases is often grossly under‐reported. This problem is further complicated since the degree of under‐reporting varies over time and tends to fall as society becomes more familiar with the disease and testing becomes more widespread.
To deal with this mismeasurement problem, we propose a new method that jointly estimates the transmission rates and the multiplication factor that measures the degree of under‐reporting. Equipped with daily estimates of the transmission rates, we are then able to calibrate our epidemic model and investigate its properties under different network topologies, group numbers, and different social distancing and vaccination strategies.
We apply the proposed joint estimation approach to examine how well the outcome of the proposed epidemic model matches the Covid‐19 evidence in the case of six European countries (Austria, France, Germany, Italy, Spain, and the United Kingdom) from March 2020 to April 2021. We provide rolling estimates of the transmission rates and related effective reproduction numbers, as well as estimates of the multiplication factors. We then use the estimated transmission rates to calibrate the model parameters across the six countries. The stochastically simulated outcomes are shown to be reasonably close to the reported cases once the under‐reporting issue has been addressed. We estimate that the degree of under‐reporting declined from a multiple of 4–10 to 2–3 times during the study period across the countries considered.
Finally, we illustrate the use of our model for two different counterfactual exercises. First, we consider the effects of vaccination on the evolution of the epidemic using a multigroup setup, where we also evaluate the implications of age‐based vaccine prioritization on the outcomes. Our model allows each individual to have their own degree of immunity, with vaccination increasing this individual‐specific immunity by a factor of 20 in the case of the Moderna or Pfizer‐BioNTech vaccines, which are shown to be 94–95% effective (Oliver et al., 2020, 2021). The multigroup model is particularly helpful in examining the implications of different vaccine prioritization strategies.
Second, we investigate the potential outcomes if the first lockdown in Germany had been delayed for 1 or 2 weeks; and if the lockdown in the United Kingdom had started 1 or 2 weeks earlier. Such counterfactual analyses can be achieved by shifting the estimated transmission rates forward or backward. We show that early intervention is critical in managing the infection and controlling the total number of infected and active cases.
The problem of how to balance the public health risks from the spread of the epidemic with the economic costs associated with lockdowns and other mandatory social‐distancing regulations will not be addressed in this paper. However, the proposed network SIR model with its individual‐based architecture is eminently suited to this purpose. The proposed model can be combined with behavioral assumptions about how individuals trade off infection risk and economic well‐being, thus generalizing the aggregate framework proposed in Chudik et al. (2021) to individual‐based SIR models.
The rest of the paper is organized as follows. Section 2 introduces the basic concepts and the classical multigroup SIR model. Section 3 lays out our individual‐based stochastic model on a network. Section 4 explains the calibration of our model to basic and effective reproduction numbers. Section 5 documents the properties of the model. Section 6 discusses the estimation of the transmission rate. Section 7 presents the calibration of the model to Covid‐19 evidence. Section 8 concentrates on the counterfactual analyses, and Section 9 concludes.
To save space, a detailed review of the related literature is given in an online supplement, where we also provide supplementary theoretical derivations, additional estimation results, counterfactual outcomes, and data sources.
BASIC CONCEPTS AND THE MULTIGROUP SIR MODEL
We consider a population of $n$ individuals susceptible to the spread of a disease with some initially infected individuals. We suppose that the susceptible population can be categorized into $L$ groups of size $n_\ell$, $\ell = 1, 2, \ldots, L$, with $L$ fixed such that $n = \sum_{\ell=1}^{L} n_\ell$. It is further assumed that the group shares, $w_\ell = n_\ell / n > 0$, for all $\ell$ and as $n \to \infty$. The grouping could be based on demographic factors (age and/or gender), or other observed characteristics such as contact locations and/or schedules, mode of transmission, genetic susceptibility, group‐specific vaccination coverage, as well as socioeconomic factors (see, e.g., Hethcote, 2000). Individual $i$ in group $\ell$ will be referred to as individual $(i, \ell)$, with $i = 1, 2, \ldots, n_\ell$ and $\ell = 1, 2, \ldots, L$. It is assumed that $n$ is relatively large but remains fixed over the course of the epidemic measured in days.
Suppose that individual $(i, \ell)$ becomes infected on day $t = t^{*}_{i\ell}$, and let $x_{i\ell}(t)$ take the value of unity for all $t \geq t^{*}_{i\ell}$, and zero otherwise. In this way, we follow the convention that once an individual becomes infected, he/she is considered as infected thereafter, irrespective of whether that individual recovers or dies. Specifically, we set
$$x_{i\ell}(t) = \begin{cases} 0, & \text{for } t < t^{*}_{i\ell}, \\ 1, & \text{for } t \geq t^{*}_{i\ell}. \end{cases} \tag{1}$$
The event of recovery or death of individual $(i, \ell)$ at time $t$ will be represented by $y_{i\ell}(t)$, which will be equal to zero unless the individual is “removed” (recovered or dead). An individual $(i, \ell)$ is considered to be “active” if he/she is infected and not yet recovered. We denote the active indicator by $z_{i\ell}(t)$, which is formally defined by
$$z_{i\ell}(t) = \left[1 - y_{i\ell}(t)\right] x_{i\ell}(t). \tag{2}$$
$z_{i\ell}(t)$ takes the value of 0 if individual $(i, \ell)$ has not been infected, or has been infected but recovered/dead. It takes the value of 1 if he/she is infected and not yet recovered. Any individual $(i, \ell)$ who has not been infected is viewed as “susceptible” and indicated by $s_{i\ell}(t)$, where
$$s_{i\ell}(t) = 1 - x_{i\ell}(t). \tag{3}$$
It then readily follows that the total (cumulative) number of those “infected” in group $\ell$ at the end of day $t$ is given by
$$C_{\ell t} = \sum_{i=1}^{n_\ell} x_{i\ell}(t), \tag{4}$$
where the summation is over all individuals in group $\ell$. The total number of “recovered” in group $\ell$ in day $t$ is given by
$$R_{\ell t} = \sum_{i=1}^{n_\ell} y_{i\ell}(t). \tag{5}$$
The total number of “active” cases (individuals who are infected and not yet removed) in group $\ell$ in day $t$ is
$$I_{\ell t} = \sum_{i=1}^{n_\ell} z_{i\ell}(t) = C_{\ell t} - R_{\ell t}. \tag{6}$$
The number of “susceptible” individuals in group $\ell$ in day $t$ is
$$S_{\ell t} = \sum_{i=1}^{n_\ell} s_{i\ell}(t) = n_\ell - C_{\ell t}. \tag{7}$$
Our model does not distinguish between recovery and death. Once an individual is removed (recovered or dead), following the SIR literature, we assume that he/she cannot be infected again. Under this assumption, recovery and death have the same effects on the evolution of the epidemic, and accordingly in what follows we shall not differentiate between recovery and death and refer to their total as “removed.”
The classic multigroup SIR model in discretized form can be written as
for
and
, where
, and
are defined as above,
is the recovery rate which is assumed to be time‐invariant and the same for all people in group
, and
is the transmission coefficient between
and
. Note that individuals in group
may transmit the disease to individuals in group
, with the new infections in group
given by
.
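To make the bookkeeping of this section concrete, the following Python sketch aggregates individual indicators into group totals. It uses only the relations stated above: an active case is one that is infected but not yet removed, and a susceptible individual is one that has never been infected. The function name and the toy arrays are illustrative assumptions and are not part of the model specification.

```python
import numpy as np


def group_aggregates(x, y):
    """Aggregate individual indicators for one group on one day.

    x[i] = 1 if individual i has ever been infected, else 0.
    y[i] = 1 if individual i has been removed (recovered or dead), else 0.
    Removal requires prior infection, so y[i] <= x[i] must hold.
    """
    x = np.asarray(x, dtype=int)
    y = np.asarray(y, dtype=int)
    if np.any(y > x):
        raise ValueError("an individual cannot be removed without having been infected")
    z = (1 - y) * x  # active indicator: infected and not yet removed
    s = 1 - x        # susceptible indicator: never infected
    return {
        "C": int(x.sum()),  # cumulative infected
        "R": int(y.sum()),  # removed
        "I": int(z.sum()),  # active
        "S": int(s.sum()),  # susceptible
    }


# Toy example (hypothetical data): five individuals in one group.
agg = group_aggregates(x=[1, 1, 0, 1, 0], y=[0, 1, 0, 0, 0])
print(agg)
```

By construction the three compartments exhaust the group, so S + I + R equals the group size on every day.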
AN INDIVIDUAL‐BASED STOCHASTIC EPIDEMIC MODEL ON A NETWORK
We now depart from the literature by explicitly modelling the individual indicators, $x_{i\ell}(t)$ and $y_{i\ell}(t)$ (and hence $z_{i\ell}(t)$), and then simulate and aggregate up to match the theoretical predictions with realized aggregated outcomes. In this section, we first describe the infection and recovery processes at the individual level; we then show how they lead to the moment conditions at group levels, and finally derive the relation between aggregated outcomes from our model and the multigroup SIR deterministic model.
Modelling the infection and recovery processes
As an attempt to better integrate individual decisions to mitigate their health risk within the epidemic model, we propose to directly model
for each individual
, as compared with modelling the group aggregates
, and
. We follow the micro‐econometric literature and model the infection process using the latent variable,
, which determines whether individual
becomes infected. Specifically, we begin with the following Markov switching process for individual
:
where
is the indicator function that takes the value of unity if
holds and zero otherwise. We suppose that
is composed of two different components. The first component relates to the contact pattern of individual
with all other individuals in the active set, denoted by
, both within (when
) and outside of his/her group (when
). The second component is an unobserved individual‐specific infection threshold variable,
. Formally, we set
where the first component depends on the pattern of contacts,
, whether the contacted individuals are infectious,
, and the exposure intensity parameter, denoted by
.
is the contact network matrix, such that
if individual
is in contact with individual
at time
.
is an infectious indicator, already defined by (2), and takes the value of unity if individual
is infected and not yet recovered, zero otherwise. The exposure intensity parameter,
, is group‐specific and depends on the average duration of contacts, whether the contacting individuals are wearing facemasks, and if they follow other recommended precautions.
The multigroup structure of the first component of (12) covers a wide range of observable characteristics and can be extended to allow for differences in age, location, and medical pre‐conditions. There are also many unobservable characteristics that lead to different probabilities of infection, even for individuals with the same contact patterns and exposure intensities. To allow for such latent factors, we have introduced the individual‐specific positive random variable (
) which represents the individual's degree of resilience to becoming infected and varies across
and
. Ceteris paribus, an individual with a low value of
is more likely to become infected.
is assumed to be independently distributed over
and
, and follows an exponential distribution with the cumulative distribution function given by
where
.To complete the specification of the infection process, we also need to model
, namely the recovery indicator. We assume that recovery depends on the number of days since infection. Specifically, the recovery process for individual
is defined by
where
if individual
recovers at time
, having been infected exactly at time
and not before, and
otherwise. The analysis of recovery simplifies considerably if we assume time to removal, denoted by
, follows the geometric distribution (for
)Then the probability of recovery at time
having remained infected for
days (also known as the “hazard function”) is given by
which is the same across all individuals within a given group and, most importantly, does not depend on the number of days since infection.
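The memoryless property claimed here can be checked numerically. The sketch below assumes the standard geometric form for the time to removal, with probability mass gamma * (1 - gamma)^(k - 1) on day k = 1, 2, ..., and computes the hazard as the probability of removal on day k divided by the probability of still being active at the start of day k. The function names and the value of gamma are illustrative assumptions.

```python
def geometric_pmf(k, gamma):
    """P(time to removal = k) for k = 1, 2, ... under a geometric distribution."""
    return gamma * (1.0 - gamma) ** (k - 1)


def survival(k, gamma):
    """P(time to removal >= k), computed as one minus the mass on days 1..k-1."""
    return 1.0 - sum(geometric_pmf(j, gamma) for j in range(1, k))


def hazard(k, gamma):
    """P(removed on day k | still active at the start of day k)."""
    return geometric_pmf(k, gamma) / survival(k, gamma)


gamma = 1.0 / 14.0  # illustrative value: an average of 14 days to removal
for k in (1, 5, 20, 50):
    print(k, round(hazard(k, gamma), 6))
```

The hazard comes out equal to gamma for every k, which is why the recovery condition does not need to track the number of days since infection.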
Therefore, using (16) in (14), the recovery micro‐moment condition simplifies to
which implies that
where
and
.We assume that
, the elements of the
network matrix
, are independent draws with
, namely, the probability of contacts is homogeneous within groups but differs across groups. Let
be the
th row of
. Also, let
be a column vector consisting of
, for
and
. Then using (11), we have
where
, which represents the net exposure effect. Since in general individual contact patterns are not observed, we also need to derive
. To this end, we first note that
and since by assumption
are independently distributed, we then haveHowever, recall that
if individual
is currently infected (namely if at time
he/she is a member of the active set,
), otherwise
. In the latter case
, and hence
where
. See also (6).
Moment conditions at group and aggregate levels
We will first derive the moment conditions at the group level. Denote the per capita infected and recovered in group
by
and
, respectively, and note that
. Let
, where
is the mean daily contacts from group
for an individual in group
.
To preserve the symmetry of contact probabilities, the mean contact numbers must satisfy the so‐called reciprocity condition,
(see, e.g., Willem et al., 2020). That is, the total number of contacts that people in group
have with people in group
must be the same as the number of contacts that people in group
have with those in group
. In practice,
is often quite large, with
relatively small (often less than 30). Therefore, it is reasonable to assume that
is fixed in
and hence
. Then we haveSuppose that
is small enough such that
is sufficiently accurate. Also recall that
, for all
, then
rises at the same rate as
. It follows that
and henceLet
, where
and
. Let
, where the expectations are taken with respect to the distribution of
for a given
, and
refers to the parameters of the distribution of
over
in group
. Note that
is the moment generating function of
, assumed to be the same across all individuals. Then using the above result in the micro infection moment conditions, (19), givesLet
and note that since
is a subset of
, then
and the moment condition (23) also implies that
for
. Averaging the above conditions over
for a given group
, and recalling that
, we obtainWe will return to the heterogeneous
in the counterfactual analysis of vaccination to be discussed in Section 8, where
is associated with the vaccine effectiveness for individual
. In order to derive analytical results and achieve identification in estimation, in what follows, we assume
for all
in group
. Also note that
and
are not separately identified. Without loss of generality, we normalize
. Under these conditions,
, the group‐level infection moment condition can be written as
which can be written equivalently as (recall that
)Also aggregating the micro recovery moment conditions, (18), we haveTo sum up, in per capita terms, we obtain the following
dimensional system of moment conditions (for
)Given time series data on
and
, the above moment conditions can be used to estimate the structural parameters,
and
.In relating the theory to the data, one may need to further aggregate across groups to the population level if group‐level data are unavailable or unreliable. It is interesting to note that the multigroup model does not lead to a model for the aggregates,
and
, without additional restrictions. To see this, using (22) in (28) and under the assumption that
, we obtain
where
. The approximation holds since
is small. Notice that
and
. If we multiply both sides of (30) by
and sum across
, we obtainIt is now clear that the group moment condition for infected cases, (28), does not aggregate up to the moment conditions in terms of
and
unless
is the same across all
and
. It is also straightforward to see that the group moment condition for recovery, (29), does not aggregate up either unless
for all
.In the case of a single group, we have
, and
, for all
and
. Then (31) simplifies toAlso, if
for all
, the recovery moment condition, (29), becomesGiven aggregate data on
, and
, one can estimate
and
using the moment conditions (32) and (33), respectively. Interestingly, it can be shown that the multigroup SIR model given by (8)–(10) is a linearized‐deterministic version of the above moment conditions. The relationship between our model and the classical SIR model is set out in Section S2.1 of the online supplement.
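The aggregate recovery condition lends itself to a simple numerical check. Because each active case is removed with the same probability on any given day, expected new removals equal that probability times the number of active cases, so the recovery rate can be recovered by least squares through the origin. The sketch below is an illustration under that assumption only; the function names, the simulated data, and the choice of a least-squares estimator are assumptions made for exposition and are not claimed to be the estimator used in the paper.

```python
import numpy as np


def simulate_removals(i0=10_000, gamma=1.0 / 14.0, n_days=30, seed=0):
    """Simulate daily removals from an initial pool of active cases.

    No new infections are added, so the active pool only shrinks.
    Returns (active at start of each day, new removals during each day).
    """
    rng = np.random.default_rng(seed)
    active = np.empty(n_days, dtype=np.int64)
    new_removed = np.empty(n_days, dtype=np.int64)
    current = i0
    for t in range(n_days):
        active[t] = current
        new_removed[t] = rng.binomial(current, gamma)  # each case removed w.p. gamma
        current -= new_removed[t]
    return active, new_removed


def estimate_gamma(active, new_removed):
    """Least squares through the origin of new removals on active cases."""
    a = np.asarray(active, dtype=float)
    d = np.asarray(new_removed, dtype=float)
    return float(a @ d / (a @ a))


active, new_removed = simulate_removals()
print(estimate_gamma(active, new_removed))  # should be close to 1/14
```

With reported data, the same regression would use the daily change in the removed series as the dependent variable and the lagged active series as the regressor.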
BASIC AND EFFECTIVE REPRODUCTION NUMBERS
In this section, we consider the calibration of our model to a given basic reproduction number assuming no intervention, and derive the effective reproduction numbers in terms of mean contact patterns, exposure intensities, and the recovery rate. We also consider the problem of identifying contact patterns from the exposure rates in single and multigroup contexts.
Basic reproduction number
The basic reproduction number, denoted by $\mathcal{R}_0$, is defined as “the average number of secondary cases produced by one infected individual during the infected individual's entire infectious period assuming a fully susceptible population” (Del Valle et al., 2013). By construction, $\mathcal{R}_0$ measures the degree to which an infectious disease spreads when left unchecked. The infection spreads if $\mathcal{R}_0 > 1$ and abates if $\mathcal{R}_0 < 1$.
In order to derive $\mathcal{R}_0$
for our multigroup model, we suppose that on day 1 a fraction
of each group
becomes infected, which represents the equivalent of one individual becoming infected as required by the definition of
. That is, on day 1,
, for
. Also, in view of our model of the recovery, the probability that an individual infected on day 1 remains infected on day
is given by
, for
.
Hence, we have
where
. Now using (26) for
we haveDue to the large number of possibilities that follow after the second day of the epidemic, it is not possible to derive similar analytical expressions for
. But since the weights of these future expected values decay geometrically, and at the start of the epidemic the number of infected is likely to be very small relative to the susceptible population, we think it is reasonable to follow the literature (Elliott & Gourieroux, 2020; Farrington & Whitaker, 2003) and assume that
, for
.
Under this assumption, the following approximate expression for
obtains:
In the case where the recovery rates are the same across the groups (
), the above expression simplifies further and we have
.
Now using (35) gives
To see how the above result relates to the well‐known expression
, consider the case of a single group with
as
is large. Then the expression in (36) reduces to
where the last result follows by
. Hence the model can be calibrated to any choice of
and
by setting the average number of contacts,
, and/or the exposure intensity parameter,
. It is clear that
and
are not separately identified—only their product is identified. In addition, we would obtain the standard result
for SIR models if we set
.Returning to the multigroup case, expression (37) continues to apply if the population is homogeneous in the sense that
, for all
and
. But in the more realistic case of group heterogeneity, we can use (36) to calibrate
and/or
for given choices of
and
. Since we have assumed that
is large and
is fixed, (36) can be further simplified with a linear approximation derived as follows. Let
and use a similar argument as in (20) to obtain (recall that
)Then
. Using this in (36) givesAs before setting
, for
sufficiently large, the above expression can be written equivalently as
where
and
are the aggregate and group‐specific transmission rates, respectively.Similar to the case of a single group, Equation (38) implies that
and
are not separately identified; only their products are identified (or equivalently,
are identified). To see this more formally, consider the simple case of two groups
. Then for sufficiently large
, using (38) with
, we have
where the last line follows by the symmetry of contact probabilities:
. It is clear from (40) that only
, and
can be identified given
, and
. More generally, for finite
are identified for any
and
.
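The identification point of this subsection can be illustrated for a single group. The sketch below assumes the linearized relation discussed above, in which the transmission rate is the product of the exposure intensity and the mean number of daily contacts, and the basic reproduction number is the transmission rate divided by the recovery rate. The function names and all numerical values are hypothetical and chosen only for illustration.

```python
def implied_r0(tau, k, gamma):
    """Basic reproduction number implied by exposure intensity tau,
    mean daily contacts k, and recovery rate gamma (linearized, single group)."""
    beta = tau * k  # transmission rate
    return beta / gamma


def calibrate_tau(r0, gamma, k):
    """Exposure intensity that delivers a target R0 for given gamma and k."""
    return gamma * r0 / k


gamma = 1.0 / 14.0  # illustrative recovery rate
# Two different (tau, k) pairs with the same product give the same R0,
# so data on infections alone cannot tell them apart.
print(implied_r0(tau=0.02, k=10, gamma=gamma))
print(implied_r0(tau=0.01, k=20, gamma=gamma))
print(calibrate_tau(r0=3.0, gamma=gamma, k=10))
```

In a simulation one can therefore fix the contact pattern and move only the exposure intensity, or do the reverse, without changing the implied reproduction number.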
Effective reproduction numbers and mitigation policies
In reality, the average number of secondary cases will vary over time as a result of the decline in the number of susceptible individuals (due to immunity or death) and/or changes in behavior (due to mitigation strategies such as social distancing, quarantine measures, travel restrictions, and wearing of facemasks). The effective reproduction number, which we denote by $\mathcal{R}_{et}$, is the expected number of secondary cases produced by one infected individual in a population that includes both susceptible and non‐susceptible individuals at time $t$. In a multigroup setting, we represent “one infected individual” by the vector of population proportions,
. The evolution of
is determined by the remaining number of susceptibles by groups,
, for
. Formally,
is defined byIn the absence of any interventions, using (26), we haveRecalling that
, then for
sufficiently large we have the following approximate expression for
:
Setting
, we can alternatively write
as (recall that
and
)
where
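is already defined by (39).
In the case of a single group or when $\beta_\ell$ is homogeneous across groups, the above expression simplifies to $\mathcal{R}_{et} = (1 - c_t)\beta/\gamma$, which can be written equivalently as $\mathcal{R}_{et} = (1 - c_t)\mathcal{R}_0$. In the absence of any interventions, $\mathcal{R}_{et}$ declines as $c_t$ rises, and $\mathcal{R}_{et}$ falls below 1 when $c_t > (\mathcal{R}_0 - 1)/\mathcal{R}_0$. The value $c^{*} = (\mathcal{R}_0 - 1)/\mathcal{R}_0$ is often referred to as the herd immunity threshold. For the multigroup case, using (39) and (43), the condition for herd immunity is more complicated and is given by (for $n$ sufficiently large)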
and the herd immunity threshold becomes
This formula clearly shows that for herd immunity to apply, the group‐specific infection rate,
, must be sufficiently large—shielding one group requires higher infection rates in other groups with larger population weights. To see this, let us consider a simple example of two groups (
) with a homogeneous transmission rate across the two groups (
), and note that
. Suppose that policymakers want to shield Group 1, which may comprise elderly people, from infection. In the extreme case where all individuals in Group 1 are protected, namely,
, then herd immunity requires
, which is higher than the threshold value of
where the population groups are treated symmetrically.
Social intervention might be necessary if the herd immunity threshold is too high and could lead to significant hospitalization and deaths. In such cases, intervention becomes necessary to reduce the transmission rates
, thus introducing independent policy‐induced reductions in the transmission rates. In the presence of social policy interventions, the effective reproduction number for the multigroup can be written as
where (using (39))
. Reductions in
can come about either by reducing the average number of contacts within and across groups,
, or by reducing the group‐specific exposure intensity parameter,
, or both. Since only the product of
and
is identified, in our simulations we fix the contact patterns and calibrate the desired value of
by setting the value of
for each
to achieve a desired
number. Of course, one would obtain equivalent results if the average number of contacts is assumed to be time‐varying and the exposure intensity parameter is assumed constant. In the case of a single group or when
for all
, we have
where
is the herding component. It is also worth bearing in mind that at the outset of epidemic outbreaks the value of
is close to zero, which ensures that
.
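The single-group relations and the two-group shielding example of this section can be sketched as follows. The code assumes a transmission rate that is homogeneous across groups, so that the effective reproduction number is the basic reproduction number scaled by the population-weighted share of susceptibles. The function names, the population weight, and the value of the basic reproduction number are hypothetical.

```python
def effective_r(c_t, beta_t, gamma):
    """Effective reproduction number for a single group:
    the susceptible share (1 - c_t) times beta_t / gamma."""
    return (1.0 - c_t) * beta_t / gamma


def herd_immunity_threshold(r0):
    """Share infected at which the effective reproduction number falls to 1."""
    if r0 <= 1.0:
        return 0.0  # the epidemic does not take off in the first place
    return (r0 - 1.0) / r0


def shielded_threshold(r0, w_unshielded):
    """Required infected share in the unshielded group when the other group
    is fully protected and both share the same transmission rate.
    A value above 1 means herd immunity cannot be reached by shielding alone."""
    return herd_immunity_threshold(r0) / w_unshielded


r0 = 3.0  # hypothetical basic reproduction number
print(herd_immunity_threshold(r0))                # symmetric treatment
print(shielded_threshold(r0, w_unshielded=0.8))   # shielding 20% of the population
```

Shielding a group always raises the infected share required in the remaining groups, and the required share grows as the shielded group's population weight grows.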
CALIBRATION AND SIMULATION OF THE MODEL
Although it is difficult to obtain an analytical solution to the individual‐based stochastic epidemic model, we can study its properties by simulations. This section focuses on the baseline scenario of no containment measures or mutation of the virus so that the transmission rate is constant. We will discuss simulation results with time‐varying transmission rates under social distancing and vaccination in Section 8. In light of the recent studies on the value of
for Covid‐19, we set
.
For the recovery rate, in view of the World Health Organization guidelines of 2 weeks of self‐isolation, we set
.
It follows that $\beta = \gamma \mathcal{R}_0$.
We consider dividing the population into
age groups:
,
, and 65+ years old, and of course, one can readily consider a different number of groups based on other characteristics if such data are available. We use the data on Germany as an illustration. The social contact surveys by Mossong et al. (2008) provide rich data on the contact patterns in Germany. We update the contact matrix by age with the most recent population data such that the reciprocity condition,
, is satisfied. The population shares for the five age groups are
, 0.17, 0.28, 0.20,
, and the resulting (pre‐pandemic) contact matrix is
where the element,
, represents the average number of daily contacts reported by participants in group
with someone in group
. The larger diagonal values in (46) indicate that people tend to mix more with others of the same age group—a phenomenon well documented by contact surveys across different countries. In order to calibrate
across groups, we match the ratio of infection probabilities of groups with the ratio of reported cases. Specifically, denoting the reference group by
and the ratio of reported infections of group
to group
by
, then
should match the ratio of the related probabilities, namely,
Using (22), we now have
For the purpose of calibration, we further assume that
, where
can be viewed as the latent common driver of the epidemic at time
. It then follows that
We now use data on infected cases in Germany by the five age groups at the end of 2020 (before the rollout of Covid‐19 vaccines) to calibrate the relative transmission rates by groups. Setting the first age group as the reference group (
), we obtain
, which in conjunction with (47) yields
. To calibrate
, we use (38) and obtain
setting
and
.
For each replication, the simulation begins with 1/1000 of the total population randomly infected on day
; that is,
and
, where
denotes the
th replication, for
.
Then from
onwards, the infection and recovery processes follow (11) and (14), respectively, for
. The proportion of infections for each age group is computed as
, and the daily new cases are computed by
. The aggregate infections and new cases are computed as
and
, respectively. Details on the generation of random networks are given in Section S3.1 of the online supplement. Note that the contact network randomly changes every day (and also across replications). This feature captures the random nature of many encounters an individual has on a daily basis. We consider
$B = 1000$ replications and set the population size to $n = 10{,}000$. We also tried larger population sizes, but, as will be seen below, the interquartile range of the simulated new cases is already very tight when $n = 10{,}000$. Some simulation results for $n = 50{,}000$ and $n = 100{,}000$ are provided in Figure S1 of the online supplement.
Figure 1 displays the simulated proportion of group‐specific and aggregate new cases in fan chart style with the 10th, 25th, 50th, 75th, and 90th percentiles over 1000 replications. The mean values are very close to the median and not shown. We also report the maximum proportion of infected for each group averaged across replications, that is,
, and the maximum proportion of aggregate infected,
. The duration of the epidemic, denoted by
, is computed as the number of days to reach zero active cases averaged across replications.
FIGURE 1
Simulated group‐specific and aggregate new cases when there are no containment measures with
. Note: The duration of the epidemic is
days.
, for
, and
. The number of replications is
. Population size is
10,000
Figure 1 shows that if the disease transmits at a fixed
$\mathcal{R}_0$, the youngest age group will have the lowest maximum proportion of infections, ending up with 62% infected in comparison with over 90% infected in the other groups. The uncontrolled epidemic is expected to end about 215 days after the outbreak. The daily new cases for the five groups peak around the same time (about 50 days on average), with the highest daily infection ranging from 2.0% in the youngest group to 3.7% in the middle‐aged group (Group 3). As a whole, the maximum aggregate infection rate will reach 90%, with daily new cases peaking at 2.9% of the population.
In order to examine whether the number of groups affects the aggregate outcomes, we carry out simulations using a single group model and compare the results with the aggregate outcomes using the multigroup model. When there is only one group, the contact network reduces to the Erdős‐Rényi random network (simply referred to as the random network below), where each pair of nodes (or individuals) is connected at random with a uniform probability $p = k/(n-1)$, where $k$ is the mean degree of the network (or the mean number of contacts per individual).
We set the average number of contacts to
based on the literature on social contacts in the pre‐Covid period, and then set the exposure intensity parameter to
, where, as before,
and
. Figure S2 of the online supplement shows that the simulated aggregate outcomes are very close under the single‐ and multi‐group models. This finding is reasonable because the simulations were performed with the same fixed
. The heterogeneity in
and
affects the epidemic curves for each group but does not seem to impact the aggregate outcomes. This result suggests that an aggregate analysis may be justified if the primary focus is on the spread of the infection across the population as a whole rather than on particular age/type groups.

Lastly, to investigate whether the results are robust to different network topologies, we considered another widely used contact network, the power law random network, in which a small number of nodes (individuals) may have a relatively high number of links (contacts). Figure S4 of the online supplement shows that the simulation outcomes obtained by the Erdős‐Rényi and the power law networks with the same average number of contacts are very similar.
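The single group random network described above can be illustrated with a minimal sketch. It assumes the standard Erdős‐Rényi relation in which the link probability equals the mean degree divided by n − 1; the function names, the population size of 500, and the mean degree of 10 are illustrative choices, not values taken from the paper.

```python
import random

def erdos_renyi_edges(n, mean_degree, seed=0):
    """Return the edge list of an Erdős-Rényi graph with the given mean degree."""
    rng = random.Random(seed)
    # Assumed link probability: each node has n - 1 potential contacts.
    p = mean_degree / (n - 1)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:  # each pair is linked independently
                edges.append((i, j))
    return edges

def average_degree(n, edges):
    """Each edge adds one to the degree of two nodes."""
    return 2 * len(edges) / n

n, k = 500, 10  # illustrative values only
edges = erdos_renyi_edges(n, k)
print(round(average_degree(n, edges), 2))  # should be close to k
```

The realized average degree fluctuates around the target because links are drawn independently; the fluctuation shrinks as n grows.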
ESTIMATION OF TRANSMISSION RATES
The previous section investigates the properties of the model, assuming the transmission rates are given. This section turns to detailing how to estimate the transmission rate using data on infected cases. We first derive the method of moments estimation of the transmission rate when there are no measurement errors and present the finite sample properties of the estimators using Monte Carlo techniques. We then allow for under‐reporting of infected cases and propose a recursive joint estimation of the transmission rate and the degree of under‐reporting by a simulated method of moments.
Estimation without measurement errors
Let us first consider the case of a single group, and recall that the moment condition for this case is given by (32), which is replicated here for convenience as (48). We can estimate the transmission rate,
, using (48) by nonlinear least squares (NLS) given time series data on
. The recovery rate,
, can be estimated using the recovery equation, (29),
. Nevertheless, in reality,
is often not recorded in a timely manner, and
is estimated from the hospitalization data. We therefore set
in our estimation and calibration exercises
and discuss the properties of the moment estimator of
in the online supplement.

In the absence of any interventions (voluntary or mandatory), we have
, where, as before,
is the basic reproduction number. It follows that
can be estimated by
, where
is the NLS estimate of
using (48). Under social interventions, the recovery equation holds (since
is unaffected), but the moment condition for
now depends on the time‐varying transmission rate,
. For
in the range of 1/14 to 1/21, it is reasonable to use 2 or 3 weeks rolling windows when estimating
. For a window of size
, we have the estimator given in (49). Note that even though the time series
over the course of the epidemic are non‐stationary, the rolling estimation is based on short‐
series (
or 21).

To examine the finite sample performance of
, we estimate
using the simulated data generated from the single group stochastic SIR model on a random network with mean contact
and assuming 1/1000 of the population is randomly infected on day 1. The true value of the transmission rate is set to
such that
. We consider population sizes
10,000, 50,000, and 100,000, and set the number of replications to
. Recall that
is fixed and
. To alleviate noise induced by zero and near‐zero observations at the start and final stages of the epidemic, the rolling estimation of
is carried out over the 4th–15th weeks after the outbreak.

Since the value of
is quite small, we present the estimation results in terms of
. Table 1 summarizes the bias and root mean square error (RMSE) of the 2‐weekly rolling estimates of
, averaged over the four non‐overlapping 3‐weekly subsamples, for different population sizes. The bias is computed as
, and the RMSE is computed by
, where
and
is the estimate of
in the
th simulated sample. As can be seen from Table 1, although
tends to slightly underestimate
, its bias and RMSE are quite small in all experiments and sub‐samples. The RMSE declines as the population size
increases, but
can be estimated reasonably well even with
10,000. Comparing the results over different epidemic stages, the RMSE is relatively larger at the early and late stages of the epidemic. This finding is not surprising since it is difficult to obtain precise estimates when
and
are near zero. We also considered the 3‐weekly rolling estimates, reported in Table S1 of the online supplement; they are very close to the 2‐weekly estimates, with slightly better performance in the early and late stages of the epidemic. We will hereafter focus mainly on the 2‐weekly rolling estimation.
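The rolling estimation described above can be sketched as follows. The excerpt does not reproduce the moment condition (48), so the functional form used here, (1 − c_{t+1})/(1 − c_t) = exp(−β i_t), is an assumption that is consistent with a discrete-time SIR model, not a statement of the paper's exact equation. The golden-section search, the bracket [0, 2], the deterministic data generator, and all function names are likewise illustrative.

```python
import math

def nls_beta(c, i, lo=0.0, hi=2.0, tol=1e-10):
    """Least-squares fit of beta in the ASSUMED condition
    (1 - c[t+1]) / (1 - c[t]) = exp(-beta * i[t])."""
    def ssr(beta):
        return sum(((1 - c[t + 1]) / (1 - c[t]) - math.exp(-beta * i[t])) ** 2
                   for t in range(len(c) - 1))
    # Golden-section search; assumes a single minimum inside [lo, hi].
    phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        x1 = b - phi * (b - a)
        x2 = a + phi * (b - a)
        if ssr(x1) < ssr(x2):
            b = x2
        else:
            a = x1
    return (a + b) / 2

def rolling_beta(c, i, window=14):
    """One estimate per window of `window` consecutive daily transitions."""
    estimates = []
    for end in range(window, len(c)):
        seg_c = c[end - window:end + 1]  # window + 1 levels give `window` transitions
        seg_i = i[end - window:end]
        estimates.append(nls_beta(seg_c, seg_i))
    return estimates

def simulate(beta, gamma, days, i0=0.001):
    """Deterministic per capita paths, used only to check the estimator."""
    c, i = [i0], [i0]
    for _ in range(days):
        c_next = 1 - (1 - c[-1]) * math.exp(-beta * i[-1])
        i.append((1 - gamma) * i[-1] + c_next - c[-1])
        c.append(c_next)
    return c, i

gamma = 1 / 14
c, i = simulate(beta=3 * gamma, gamma=gamma, days=60)
r0_hat = [b / gamma for b in rolling_beta(c, i)]
print(round(min(r0_hat), 4), round(max(r0_hat), 4))
```

Because the check data are generated without noise from the same assumed condition, every window should return a value of β/γ very close to the value used in the simulation. With stochastic network data the estimates would vary across windows and replications.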
TABLE 1
Finite sample properties of the 2‐weekly rolling estimates of
, in the case where it is fixed at
3‐weekly subsamples (weeks since the outbreak)
Population | Statistic | 4th–6th | 7th–9th | 10th–12th | 13th–15th
n = 10,000 | Bias | −0.0108 | −0.0038 | −0.0006 | 0.0014
n = 10,000 | RMSE | 0.0988 | 0.0563 | 0.1048 | 0.2436
n = 50,000 | Bias | −0.0017 | −0.0001 | −0.0010 | −0.0012
n = 50,000 | RMSE | 0.0405 | 0.0251 | 0.0481 | 0.1070
n = 100,000 | Bias | −0.0002 | 0.0005 | −0.0003 | −0.0009
n = 100,000 | RMSE | 0.0282 | 0.0172 | 0.0335 | 0.0771
Note: The true value of
is set to
, where
and
so that
. We fix
and estimate
using (49). The number of replications is
.
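For readers who want to reproduce summary statistics of this kind, the sketch below computes bias and RMSE across replications. It assumes the conventional definitions (mean estimate minus true value, and the square root of the mean squared deviation from the true value); the function name and the four sample estimates are hypothetical.

```python
import math

def bias_and_rmse(estimates, true_value):
    """Bias and root mean square error of estimates across replications."""
    b = len(estimates)
    bias = sum(estimates) / b - true_value
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / b)
    return bias, rmse

# Hypothetical estimates from four replications, true value 3.0
bias, rmse = bias_and_rmse([2.9, 3.1, 3.0, 2.8], true_value=3.0)
print(round(bias, 4), round(rmse, 4))
```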
Similar moment conditions can also be used to estimate the parameters of the multigroup model. If time series data on
, for
(
is finite) are available, we can estimate
using the moment conditions (30), namely,
Then, as we have discussed in Section 4,
is identifiable from (38) for given values of
and
.
Estimation allowing for measurement errors
It is widely recognized that in practice
and
are under‐reported. The magnitude of under‐reporting is measured by the multiplication factor (MF) in the literature (see, e.g., Gibbons et al., 2014). It is expected that the MF will decline over time since data quality will improve as more testing is conducted, but, in any case, MF is certainly greater than one. Denoting the multiplication factor by
, and denoting the observed values of
and
by
and
, respectively, we have
and
(assuming that
, where
is the observed value of
). Then the moment condition in terms of the observed values (
and
) can be written as (50), from which it can be seen that
is not identified when
and
are very small in the early stage of the epidemic. When
becomes large enough, we can estimate
by the simulated method of moments based on (50). In practice,
varies slowly, and it is reasonable to assume
within a short time interval (2 or 3 weeks). Then we have
where
denotes the simulated value of
in the
th replication, and
is the total number of replications. Solving (51) for
yields (52). It is now clear that one can estimate
by (52) for given values of
, and estimate
by (49) if
is known. Accordingly, we propose a method that estimates
and
jointly. The algorithm is described in detail in Section S4.1 of the online supplement. We apply the procedure recursively using 2‐ and 3‐weekly rolling windows in the next section to examine how the transmission rates and under‐reporting of cases changed over time in a number of countries and evaluate how our model matches the Covid‐19 evidence.
MATCHING THE MODEL WITH EVIDENCE FROM A NUMBER OF EUROPEAN COUNTRIES
We now assess how our model matches with the recorded cases in six European countries: Austria, France, Germany, Italy, Spain, and the United Kingdom, while taking account of under‐reporting of infections.
The Covid‐19 outbreak in Europe began with Italy in early February 2020, with the recorded number of infections accelerating rapidly from 21 February onward. A rapid rise in infections took place about 1 week later in Spain, France, and Germany, followed by the United Kingdom and Austria at the end of February.

To estimate
, we need observations on per capita infected and active cases,
and
. Using the recorded number of infected cases,
, and population data, the per capita cases,
, are readily available, where as before we use the tilde symbol to indicate observed values. Since
, we can obtain
if the number of removed (the sum of those who recovered and the deceased) cases,
, is available. Unfortunately, the recovery data is either not reported or is subject to severe measurement error/reporting issues in many countries. For all six countries, we therefore estimate the number of removed using the recursion
, for
, where the recovery rate
is set to 1/14, and values of
are generated starting with
. We then compute
by subtracting the estimated
from the recorded
.
That is, in this empirical exercise, we only need data on Covid‐19 cases per capita,
. To alleviate the wide fluctuations in the data due to irregular update schedules and reporting/recording delays, we smooth the series by taking the 7‐day moving average before they are used in the estimation and calibration.

We adopt the joint estimation approach proposed in the last section to calibrate and evaluate our stochastic network model. The procedure is applied recursively using 2‐ and 3‐weekly rolling windows. Recall that in the early stage of an epidemic,
is small, and MF is not identified. We choose
as the threshold value.
When
, we use an initial guess of the multiplication factor, MF
, in the estimation of
. Since the early Covid data are quite noisy, we start the rolling estimation when the daily new cases exceed one per 100,000 people and use the estimates below
in the calibration. Specifically, the simulations begin with 1/1000 of the population randomly infected on day 1. To render the calibrations comparable across the countries in our sample, during the first week after the outbreak we set the value of
such that
equals its first estimate,
, that is less than 3. Then, from the second week onwards, we set
to the rolling estimates computed from the realized data (with MF
) until
reaches
on day
. As shown in Section 5, it makes little difference to the aggregate outcomes whether we carry out the simulations using single‐ or multi‐group models. Since we are interested in comparing the calibrated outcomes with realized cases, we conduct simulations using the single group model with the Erdős‐Rényi random network in this exercise. The first estimate of MF is computed as the ratio of the average calibrated cases to realized cases on day
. When
, we perform the joint rolling estimation of
and
using (S9) and (S10) of the online supplement. We present the 2‐weekly estimation and calibration results in the main paper. The results using the 3‐weekly rolling windows are very close and are given in the online supplement.
Since the moment condition (50) used in the joint estimation was derived assuming no vaccination, we end the joint estimation when the recorded share of the population fully vaccinated reaches 10%. The population size in simulations is set to
50,000. To ease the computational burden, the number of replications is set to
.

Figure 2 shows the evolution of
over the period March 2020–April 2021 for the six countries, where
and
is displayed in Figure 3.
Figure 2 also marks the start and end dates of the respective lockdowns.
It should be noted that the epidemic tends to expand (contract) if
is above (below) unity.
Among the six countries, Italy started the first national lockdown on 9 March 2020, followed by Spain, Austria, and France about a week later, and Germany and the United Kingdom 2 weeks later (on 23 March 2020). As can be seen from Figure 2,
fell below one in mid to late April 2020 in all these countries except for the United Kingdom, which took a bit longer before falling below unity in early May. On average, it took 36 days to bring
down below one from the start of the lockdown, with Germany being the fastest (27 days) and the United Kingdom being the slowest (47 days). By the end of May 2020,
was brought down below 0.5 in all six countries except for the United Kingdom, where the lowest value of
occurred in early July. As lockdowns eased, not surprisingly, the transmission rates started to rise. This new surge in estimates of
led some of the countries to announce their second lockdowns in early November 2020. The estimates of
fell below one again in December 2020, but the second trough in
is higher than the first in all countries except France. As the pandemic progressed,
displayed different patterns (timing and magnitudes of peaks and troughs) across the six countries due to different containment measures. By late April 2021 (the end of our sample),
is estimated to be close to one in all these countries, but it appears to be rising in Spain and the United Kingdom and falling in the other countries.
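The estimates discussed above rest on a few data transformations described earlier in this section: 7‐day smoothing, a recursive estimate of removed cases with the recovery rate set to 1/14, and active cases obtained as infected minus removed. The sketch below illustrates them under stated assumptions. The excerpt does not show the recursion or the formula for the effective reproduction number, so the forms used here (removed cases updated as (1 − γ) times the previous value plus γ times the previous cumulative cases, and an effective reproduction number equal to (1 − MF × reported cases per capita) × β/γ) are assumptions, as are the trailing moving average and all function names.

```python
def moving_average(x, window=7):
    """Trailing moving average; early values average the days available so far."""
    out = []
    for t in range(len(x)):
        seg = x[max(0, t - window + 1):t + 1]
        out.append(sum(seg) / len(seg))
    return out

def removed_and_active(cum_cases, gamma=1 / 14):
    """ASSUMED recursion R[t] = (1 - gamma) * R[t-1] + gamma * C[t-1], R[0] = 0,
    followed by active cases computed as C[t] - R[t]."""
    removed = [0.0]
    for t in range(1, len(cum_cases)):
        removed.append((1 - gamma) * removed[-1] + gamma * cum_cases[t - 1])
    active = [c - r for c, r in zip(cum_cases, removed)]
    return removed, active

def effective_r(beta, c_reported, mf, gamma=1 / 14):
    """ASSUMED form: the susceptible share is 1 - mf * c_reported."""
    return (1 - mf * c_reported) * beta / gamma

# Hypothetical cumulative reported cases per capita
cum = moving_average([0.000, 0.001, 0.003, 0.006, 0.010, 0.015, 0.021, 0.028])
removed, active = removed_and_active(cum)
print(round(effective_r(beta=3 / 14, c_reported=cum[-1], mf=5.0), 3))
```

The multiplication factor of 5 in the usage line mirrors the initial guess mentioned in the notes to Figure 2; the case series itself is made up for illustration.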
FIGURE 2
Rolling estimates of the effective reproduction numbers (
) for selected European countries, with start and end dates of the respective lockdowns. Note:
, where
is the reported number of infections per capita and
.
weeks. The joint estimation starts when
. The initial guess estimate of the multiplication factor is 5. The simulation uses the single group model with the random network and population size
50,000. The number of replications is 500. The number of removed (recoveries + deaths) is estimated recursively using
for all countries, with
. Red (green) arrows indicate the start (end) dates of the respective lockdowns
FIGURE 3
Rolling estimates of the multiplication factor for selected European countries. Note: See the notes to Figure 2
Figure 3 plots the estimated MF for the six countries. The results offer evidence of substantial under‐reporting in the pandemic's early stages, with the magnitude of under‐reporting falling over time in all these countries. A closer inspection of the figure shows that the number of cases in Austria, Germany, and Italy in late October to mid November 2020 was underestimated by a factor of 5–6, which declined to 2–3 in late April 2021. About a quarter of actual infections were recorded in Spain in September 2020, compared with about a half being detected in late April 2021. France and the United Kingdom have a greater level of under‐reporting in the early stages: the number of cases was underestimated by as much as a factor of 9–10, which fell to 2–4 during the study period. Overall, the magnitude of these estimated MF seems reasonable and comparable to the estimates obtained by other approaches in the literature.
Figure 4 presents the calibrated new cases and the 7‐day moving average of the reported new cases multiplied by the estimated MF. The fan charts depict the 10th through the 90th percentiles of the calibrated data. It can be seen that once we have taken account of under‐reporting, the calibrated cases match the recorded cases fairly well. It is noteworthy that our model is able to capture the multiple waves of Covid‐19 cases over the course of the epidemic. Lastly, it is interesting to see how the total cases per capita compare across the six countries, with and without adjustments for under‐reporting. Figure S11 of the online supplement displays the reported total cases and the case numbers after adjusting for under‐reporting using the MF estimates. The results show that the number of total cases could have been underestimated by a factor of three to five in these countries as of early August 2021. These comparisons clearly show the importance of adjusting the number of infected cases for under‐reporting, which can be reasonably estimated using our joint estimation procedure.
FIGURE 4
Realized and calibrated number of new cases of Covid‐19 for selected European countries. Note: Realized series (7‐day moving average) multiplied by the estimated multiplication factor is displayed in red. See also the notes to Figure 2
COUNTERFACTUAL EXERCISES
Having shown that the outcomes of the calibrated model closely match the evidence, we now demonstrate how the model can be used for two counterfactual analyses of interest. First, we investigate the impact of social distancing and vaccination on the evolution of the epidemic. To simplify the exposition, we consider an epidemic with two waves and investigate if the second wave can be avoided by vaccination. Second, we consider counterfactual outcomes that could have resulted from different timing of the early interventions in Germany and the United Kingdom.
Social distancing and vaccination
In order to understand the impact of non‐pharmaceutical interventions and vaccination on controlling the epidemic, we perform counterfactual analyses using the multigroup model with the five age groups introduced in Section 5. The different age groups in the model also allow us to consider the implications of prioritizing the Covid‐19 vaccine by age.

In reality, the transmission rate varies over time due to both voluntary and mandatory social distancing as well as other mitigation measures such as vaccination. Here we use social distancing to refer broadly to all types of non‐pharmaceutical interventions (including lockdown measures). We assume that, in the absence of a vaccination program, the (scaled) transmission rate,
, equals 3 in the first 2 weeks after the epidemic outbreak, falls to 0.9 linearly over the next 3 weeks, and remains at 0.9 for 8 weeks. When social distancing is relaxed, the transmission rate increases to 1.5 linearly over the next 3 weeks and remains at 1.5 thereafter. Note that the effective reproduction number,
, could still fall below unity due to herding.
Also note that
, where
. We assume that
has the same rate of change as
, for all
.

To model vaccination, we assume, for simplicity, that a single‐dose vaccine with efficacy of
becomes available when
. The vaccine takes full effect immediately, and its immune protection does not wane over time.
The effectiveness of vaccination can be measured by the parameter
, which is the mean of the infection threshold variable,
, defined in (13). We assume that
takes the value
if individual
is not vaccinated and takes the value
after vaccination. In the case of a single group, the probability of an individual getting infected when the proportion of active cases is
, for any given value of
, is
which declines with
. By definition, the vaccine efficacy should equal the percentage reduction in the probability of infection. Then the value of
associated with efficacy
is given by (54). Using the result in (21), we have (55). Combining (53), (54), and (55), we obtain
, which simplifies to (56). This result also holds in the multigroup model.
Intuitively, (56) states that an individual becomes
times more immune after vaccination relative to his/her level of immunity before vaccination.

In simulations, individuals' degrees of resilience,
, are i.i.d. draws from an exponential distribution with mean
before (after) vaccination. Without loss of generality, we normalize
. The Pfizer‐BioNTech and Moderna vaccines have been shown to have 95% and 94.1% efficacy rates in preventing symptomatic laboratory‐confirmed Covid‐19 infection, respectively (Oliver et al., 2020, 2021). Accordingly, using (56) we have
for
, namely, Pfizer and Moderna vaccines increase the level of immunity by a factor of 20.
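As a hedged numerical illustration of this result, the sketch below assumes that the post‐vaccination mean threshold equals the pre‐vaccination mean divided by one minus the efficacy. This form is inferred from the factor of 20 quoted for 95% efficacy, since equation (56) is not reproduced in this excerpt. The function names are hypothetical, and the pre‐vaccination mean is normalized to one as in the text.

```python
import random

def immunity_factor(efficacy):
    """Ratio of post- to pre-vaccination mean threshold (ASSUMED form 1 / (1 - efficacy))."""
    if not 0 <= efficacy < 1:
        raise ValueError("efficacy must lie in [0, 1)")
    return 1 / (1 - efficacy)

def draw_thresholds(vaccinated, efficacy, mu0=1.0, seed=0):
    """Draw one exponential infection threshold per individual.
    vaccinated is a list of booleans; mu0 is the pre-vaccination mean."""
    rng = random.Random(seed)
    mu_v = mu0 * immunity_factor(efficacy)
    # random.expovariate takes the rate, which is 1 / mean.
    return [rng.expovariate(1 / (mu_v if v else mu0)) for v in vaccinated]

print(round(immunity_factor(0.95), 2))   # 95% efficacy
print(round(immunity_factor(0.941), 2))  # 94.1% efficacy

sample = draw_thresholds([True] * 20000, efficacy=0.95)
print(round(sum(sample) / len(sample), 1))  # sample mean, should be near 20
```

Under this assumed form, 94.1% efficacy corresponds to a factor of about 17 rather than 20, so the factor of 20 in the text is best read as the value for 95% efficacy.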
We suppose that 75% of the population is vaccinated over 12 weeks, with an equal number of people vaccinated each day.
We consider two vaccination schemes: random vaccination and vaccination in decreasing age order. Under the former, people are randomly selected without replacement for vaccination irrespective of their age. Under the latter, older people are vaccinated first. Individuals within an age group are randomly selected for vaccination on each day when their group is eligible. After all people in the oldest group have been vaccinated, the second‐oldest group becomes eligible. This process continues until 75% of the population is vaccinated. In both schemes, we assume that vaccination eligibility does not depend on whether an individual is susceptible, infected, or recovered.

Let us first consider the random vaccination scheme. Figure 5 compares the aggregate outcomes when there are (a) no containment measures, (b) social distancing only, (c) vaccination only, and (d) combined social distancing and vaccination. Specifically, the transmission rate,
, is fixed at 3 in the absence of social distancing (i.e., in cases (a) and (c)). In case (c), the vaccination starts from the 4th week after the outbreak. In case (d), the vaccination starts during the last month of social distancing (i.e., the 10th week after the outbreak). In cases (c) and (d), 75% of the population is randomly vaccinated over 12 weeks, and the vaccine efficacy is set at
.
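The scenario inputs described above can be sketched in code. The first function traces the scaled transmission rate under social distancing (3 for 2 weeks, a linear fall to 0.9 over 3 weeks, 0.9 for 8 weeks, a linear rise to 1.5 over 3 weeks, and 1.5 thereafter). The other two build a vaccination order under either scheme and split it into equal daily batches over 12 weeks. The day‐level discretization, the function names, and the equal group sizes in the usage lines are illustrative assumptions, not details taken from the paper.

```python
import random

def scaled_transmission_rate(day):
    """Scaled transmission rate on a given day (day 1 = outbreak)."""
    if day <= 14:    # first 2 weeks
        return 3.0
    if day <= 35:    # linear fall over the next 3 weeks
        return 3.0 + (0.9 - 3.0) * (day - 14) / 21
    if day <= 91:    # held at 0.9 for 8 weeks
        return 0.9
    if day <= 112:   # linear rise over the next 3 weeks
        return 0.9 + (1.5 - 0.9) * (day - 91) / 21
    return 1.5

def vaccination_order(groups, scheme, coverage=0.75, seed=0):
    """Indices of individuals to vaccinate, in order.
    groups[j] is the age group of individual j (larger = older)."""
    if scheme not in ("random", "age"):
        raise ValueError("scheme must be 'random' or 'age'")
    rng = random.Random(seed)
    order = list(range(len(groups)))
    rng.shuffle(order)  # random order irrespective of age
    if scheme == "age":
        # Stable sort: oldest group first, random order kept within a group.
        order.sort(key=lambda j: -groups[j])
    return order[:round(coverage * len(groups))]

def daily_batches(order, days=84):
    """Split the order into `days` batches whose sizes differ by at most one."""
    n = len(order)
    return [order[d * n // days:(d + 1) * n // days] for d in range(days)]

groups = [g for g in range(1, 6) for _ in range(200)]  # five equal groups (illustrative)
by_age = vaccination_order(groups, "age")
batches = daily_batches(by_age)
print(len(by_age), len(batches), scaled_transmission_rate(20))
```

Neither scheme checks whether an individual is susceptible, infected, or recovered, in line with the eligibility assumption stated in the text.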
FIGURE 5
Simulated number of new cases when there are (a) no containment measures, (b) social distancing only, (c) vaccination only, and (d) combined social distancing and vaccination. Note: The average number of new cases over 1000 replications is displayed. The simulations use the multigroup model with the five age groups as described in Section 5. Population size is
10,000. Under social distancing, the transmission rate,
, equals 3 in the first 2 weeks after the outbreak, falls to 0.9 linearly over the next 3 weeks, and remains at 0.9 for 8 weeks. When social distancing is relaxed, the transmission rate increases to 1.5 linearly over the next 3 weeks and remains at 1.5 afterward. In the absence of social distancing (i.e., in cases (a) and (c)), the transmission rate,
, is fixed at 3. In case (c), the vaccination starts from the 4th week after the outbreak. In case (d), the vaccination starts during the last month of social distancing (i.e., the 10th week after the outbreak). In cases (c) and (d), 75% of the population is randomly vaccinated over 12 weeks, and the vaccine efficacy is
.
denotes the maximum proportion of infected and is computed as
, with
replications.
is the duration of the epidemic
The results show that social distancing can quickly bring down the daily case rate and thus reduce the demands on the healthcare system. However, as social distancing restrictions are relaxed, a second wave is expected to emerge. The second wave may have a higher peak than the first wave if the transmission rates rise too fast due to increasing contact rates or the exposure intensity. Comparing graphs (a) and (b) reveals that social distancing alone can reduce the maximum proportion of cases from 90% in an uncontrolled epidemic to 50%. Nonetheless, the duration of the epidemic could more than double, and an enduring epidemic may entail high social and economic costs. If vaccination is the only containment tool, it must be implemented soon enough to curb the spread of the disease, especially for a highly contagious disease with
about 3 (or even higher as evidenced by the new variants). Graph (c) shows that even if (in a very unlikely scenario) a highly effective vaccine becomes available during the 4th week after the outbreak, 59% of the population could end up getting infected. In reality, developing a new vaccine takes considerable time. Therefore, vaccination is not a substitute for non‐pharmaceutical interventions, which are necessary to slow the spread of the disease, allowing more time for vaccine development. Vaccination can effectively prevent the second wave if it is rolled out during the last month of social distancing, as shown in Graph (d). Under the assumption that 75% of the population end up getting vaccinated with an efficacy of
, the maximum proportion of infected could reduce to 12%, and the number of highest daily new infections could be more than 10 and 7 times lower than that in cases (a) and (c), respectively. We also examined the implications of different vaccination coverages, start times, speeds of delivery, and vaccine efficacies. The results are summarized in Section S7.1 of the online supplement, where we provide counterfactual outcomes assuming (i) 50% versus 75% vaccination coverage, (ii) vaccination starts at the end of social distancing versus during the last month of social distancing, (iii) 75% of the population getting vaccinated over 8 versus 12 weeks, and (iv)
versus 0.66.

Figure 6 compares the simulation outcomes under random vaccination and vaccination in decreasing age order for each age group and the entire population, assuming the same social distancing policy as described above. In this experiment, the vaccination starts during the last month of social distancing. Seventy‐five percent of the population is vaccinated over 12 weeks, and the vaccine efficacy is
.
The results show that the maximum proportion of infected in the oldest group is reduced by 2 percentage points if the old are vaccinated first, compared with random vaccination. Not surprisingly, the cost of protecting the elderly is reflected in the higher infection rates in the younger age groups, increasing the maximum cases by 1, 4, and 2 percentage points for age groups 1 to 3, respectively. The vaccine effectively curbs the spread of the disease and prevents the second surge of cases in the two senior groups. A comparison of the aggregate outcomes reveals that prioritizing the old results in a higher level of overall infections and a longer duration of the epidemic, owing to higher contact rates of the younger population. Of course, how to prioritize vaccines is a complex decision requiring further information on the rates of hospitalization and death in each age group. It also requires evaluating the social and economic costs of high infection rates and lockdown measures among young people.
FIGURE 6
Simulated average number of group‐specific and aggregate new cases using the multigroup model with social distancing, assuming random vaccination or vaccination in decreasing age order. Note: The average number of new cases over
replications is displayed. Population size is
10,000. The social distancing policy is the same as that considered in Figure 5. The vaccination starts during the last month of social distancing (i.e., the 10th week after the outbreak). Seventy‐five percent of the population is vaccinated over 12 weeks. The vaccine efficacy is
. The duration of the epidemic is
days under random vaccination and
days under vaccination by decreasing age order.
, for
, and
Counterfactual outcomes of early interventions in United Kingdom and Germany
We now turn to different counterfactual outcomes that could have resulted from different timing of the first lockdowns in Germany and the United Kingdom, focusing on the first wave of Covid‐19 that leveled off at the end of June 2020 in both countries.
In particular, we investigate the quantitative effect of bringing forward the lockdown in the United Kingdom on the number of infected cases, as compared with the effect of delaying the lockdown in Germany. To this end, we shift the estimated
values backward or forward by 1 or 2 weeks. As shown in Figure 7, if the German lockdown had been delayed by 1 week, the maximum proportion of infected cases would have increased from 2.2% to 5.0%, and the maximum proportion of active cases would have risen from 0.6% to 1.5%. In contrast, if the UK lockdown had been brought forward by 1 week, the model predicts that the maximum proportion of infected cases would have fallen from 5.3% to 2.3%, and the maximum proportion of active cases would have fallen from 1.2% to 0.5%. These results suggest that the United Kingdom could have achieved a similarly low level of infected cases per capita as Germany if it had implemented social distancing sooner. The maximum proportion of infected (active cases) is estimated to rise further to 10.8% (3.2%) if the German lockdown had been delayed by 2 weeks, and the maximum proportion of infected (active cases) is estimated to decrease further to 1.1% (0.3%) if the UK lockdown had been brought forward by 2 weeks.
In summary, this counterfactual exercise shows that it is critical to take measures to lower the effective reproduction number as early as possible if a policymaker aims to control the number of infected and active cases.
FIGURE 7
Counterfactual number of infected and active cases for Germany and UK under different lockdown scenarios. Note: The simulation uses the single group model with the Erdős‐Rényi random network and begins with 1/1000 of the population randomly infected on day 1. The population size used in the simulation is
50,000. The recovery rate is
. The number of removed (recoveries + deaths) is estimated recursively using
for both countries, with
, where
is the reported number of infections.
is the 2‐weekly rolling estimate computed by (49) assuming MF
. The mean of
, for
replications, is displayed in the last column
CONCLUDING REMARKS
This paper has developed a stochastic network SIR model for empirical analyses of the Covid‐19 pandemic across countries or regions. Moment conditions are derived for the number of infected and active cases for the single group as well as multigroup models. It is shown how these moment conditions can be used to identify the structural parameters and provide rolling estimates of the transmission rate in different phases of the epidemic. To allow for time‐varying under‐reporting of cases, the paper proposes a method that jointly estimates the transmission rate and the multiplication factor using a simulated method of moments approach. In empirical applications to six European countries, the estimates of the transmission rate are used to calibrate the proposed epidemic model. It is shown that the simulated outcomes are reasonably close to the reported cases once the under‐reporting of cases is addressed. The multiplication factors are found to decline over the course of the pandemic. It is estimated that the actual number of infections could have been 4 to 10 times higher than the number of reported cases around October 2020, but only 2 to 3 times higher in April 2021. The multigroup model is used for counterfactual analyses of the impact of social distancing and vaccination on the evolution of the epidemic. It is shown that lockdown measures are needed to slow down the spread of a highly contagious disease such as Covid‐19, buying time for the development of vaccines and treatments. Vaccination, if introduced early enough, can prevent additional waves of the epidemic as social distancing is eased after lockdowns. The calibrated model is also used for empirically‐based counterfactual analyses of the first lockdowns in Germany and the United Kingdom. It is shown that the United Kingdom could have achieved an outcome similar to that experienced by Germany during the first wave if it had started the lockdown just 1 week earlier. Almost symmetrically, Germany would have experienced much higher infection rates (similar to the UK's experience) if it had started the lockdown 1 week later.
OPEN RESEARCH BADGES
This article has been awarded an Open Data Badge for making publicly available the digitally‐shareable data necessary to reproduce the reported results. The data are available in the JAE Data Archive at http://qed.econ.queensu.ca/jae/datasets/pesaran003/