Strong spatial embedding of social networks generates nonstandard epidemic dynamics independent of degree distribution and clustering.

David J Haw¹, Rachael Pung¹, Jonathan M Read², Steven Riley³.

Abstract

Some directly transmitted human pathogens, such as influenza and measles, generate sustained exponential growth in incidence and have a high peak incidence consistent with the rapid depletion of susceptible individuals. Many do not. While a prolonged exponential phase typically arises in traditional disease-dynamic models, current quantitative descriptions of nonstandard epidemic profiles are either abstract, phenomenological, or rely on highly skewed offspring distributions in network models. Here, we create large socio-spatial networks to represent contact behavior using human population-density data, a previously developed fitting algorithm, and gravity-like mobility kernels. We define a basic reproductive number $R_0$ for this system, analogous to that used for compartmental models. Controlling for $R_0$, we then explore networks with a household-workplace structure in which between-household contacts can be formed with varying degrees of spatial correlation, determined by a single parameter from the gravity-like kernel. By varying this single parameter and simulating epidemic spread, we are able to identify how more frequent local movement can lead to strong spatial correlation and, thus, induce subexponential outbreak dynamics with lower, later epidemic peaks. Also, the ratio of peak height to final size was much smaller when movement was highly spatially correlated. We investigate the topological properties of our networks via a generalized clustering coefficient that extends beyond immediate neighborhoods, identifying very strong correlations between fourth-order clustering and nonstandard epidemic dynamics. Our results motivate the observation of both incidence and socio-spatial human behavior during epidemics that exhibit nonstandard incidence patterns.

Keywords: clustering; epidemics; networks; subexponential

Year: 2020 PMID: 32900923 PMCID: PMC7519285 DOI: 10.1073/pnas.1910181117

Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205

Epidemics are frequently conceptualized as resulting from the transmission of a pathogen across a network. Directly transmitted pathogens propagate through susceptible human populations and create directed infection trees with an offspring-like process (1). Each node may be a different type [e.g., children may be more infectious than adults (2)], and individuals with many contacts are more likely to cause infection than those with fewer contacts (3). Although difficult to observe, infection trees describe a real biological process: These pathogens do not reproduce outside of a human host, so the founding pathogen population for an infectee comes directly from their infector. Further, we can conceptualize that infection trees occur when a true offspring process is constrained to pass through a social network (4, 5), with infection occurring according to a specified probability when an edge exists between a susceptible and an infectious individual. The properties of different contact network types can be described by distributions associated with their topology (5). First-order network properties are associated with first-order connections, as defined by the degree distribution. For finite random networks of reasonable size, the degree distribution is well-approximated by a Poisson in which variance is equal to the mean. In contrast, for finite, scale-free networks, the offspring distribution is power-law-like, with a much higher variance. Further, distributions of second-order phenomena describe connections of length two. For example, the local clustering coefficient is a second-order property, defined to be the neighborhood density of a given node (5). For a limited set of network types, we can use analytical expressions for higher moments of the degree distribution to calculate key properties of their potential epidemics, such as the probability of epidemic establishment and cumulative incidence (6, 7).
Although these higher-order moments are tractable for some special cases, they are seldom the primary target of theoretical studies. Semi-empirical networks that arise from detailed simulations (8) may have complex higher moments; however, their impact on epidemic dynamics is obscured by the variance of their offspring distribution (e.g., ref. 9). Here, we explicitly control our network-generation algorithm so as to have nontrivial higher-order structure, while maintaining a Poisson degree distribution and a prespecified clustering coefficient. Epidemics can also be understood in terms of compartmental models, which are more tractable mathematically and are equivalent to large network models with very simple topologies (10). Key features of epidemic incidence curves are often explained by dynamics associated with these models (11, 12). Numerical solutions to multitype susceptible–infectious–removed-like compartmental models are easier to obtain than for many topologies of network and can explain the initial growth phase (13), the timing and amplitude of the peak (14), the epidemic duration (15), and the total number of cases (16). These models can efficiently describe many different types of complexity, such as age-specific susceptibility and transmissibility (17), behavioral risk groups (18), and, with increasing frequency, geographical location (19). The basic reproductive number has been defined for both compartmental models and network models. For compartmental models, the reproduction number is conditional on the system having a well-defined period of exponential growth (20) and is defined as the average number of new infections generated by a typically infectious individual in an otherwise susceptible population (20). The word “typically” is somewhat overloaded in this definition: During the exponential phase, a system with heterogeneous population will reach a steady-state distribution of infectives, corresponding to the eigenstate of the renewal process.
For network models, the basic reproduction number is most frequently defined as the expected ratio of cases between the first (seed) and second generations of infection. In homogeneous networks, this is equal to the product of the average degree and the probability of transmission per link per generation. However, many studies of epidemics on networks involve high-variance degree distributions (9, 21), and so this quantity must be modified to account for excess degree (21, 22). Here, we use to denote the expected first-generation ratio if a network is homogeneous, defined to be the expected number of cases in the second generation divided by the number in the first generation. Our is therefore consistent with , as defined in ref. 21, although we choose not to adjust for overdispersion, because we condition our network construction on this distribution having low variance. The reproduction number for networks has also been defined to be more consistent with its definition for compartmental models. In ref. 23, was defined as an asymptotic property of epidemics that were guaranteed to have an exponential phase when they occurred on infinitely large networks. We define our to be a finite-network approximation to this in ref. 23. This is well-defined during periods of exponential growth. Both compartmental and network models can be embedded in space (19). Each node can have a location in space, while each compartment can refer to a single unit of space. Node density can be assigned according to known population densities, and compartments can be assigned equal spatial areas, but different numbers of hosts. In general, the risk of infection passing between two people decreases as the distance between their home location increases. The propensity of nodes to form links across space or for infection to spread between compartments can be quantified by using mobility models borrowed from geography (24), such as the gravity and radiation models. 
Here, we are specifically interested in how the overall topology of a spatially embedded network model can be driven by different movement assumptions and, thus, drive the gross features of the epidemics that occur on the network.
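Two statements from the introduction can be checked numerically: that the degree distribution of a finite random network is close to Poisson (variance roughly equal to the mean), and that the first-generation ratio on a homogeneous network is the product of mean degree and per-link transmission probability. The following is a minimal sketch using only the Python standard library; the network size, link probability, and transmission probability are illustrative values chosen here and are not taken from the paper.

```python
import random
from statistics import mean, pvariance


def random_graph_degrees(n_nodes, link_prob, seed=0):
    """Degrees of a random graph in which each pair is linked independently."""
    rng = random.Random(seed)
    degree = [0] * n_nodes
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            if rng.random() < link_prob:
                degree[i] += 1
                degree[j] += 1
    return degree


def first_generation_ratio(mean_degree, transmission_prob):
    """Expected cases per seed on a homogeneous network (no excess-degree adjustment)."""
    return mean_degree * transmission_prob


# Illustrative values only (assumptions, not the paper's parameters).
degrees = random_graph_degrees(n_nodes=1000, link_prob=0.01)
mean_k = mean(degrees)
var_k = pvariance(degrees)
ratio = first_generation_ratio(mean_k, transmission_prob=0.2)

print(f"mean degree {mean_k:.2f}, variance {var_k:.2f}, first-generation ratio {ratio:.2f}")
```

A variance-to-mean ratio near one is the low-variance regime in which no adjustment for overdispersion is applied.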

Results

We used an existing variant of the Metropolis–Hastings algorithm (10) to create a spatially embedded bipartite network of homes and workplaces consistent with the population density of Monrovia, Liberia, and with three illustrative movement scenarios (). An individual’s propensity to choose a given workplace was determined by the distance between their home and workplace and parameters of a gravity-like kernel. The kernel was inversely proportional to distance raised to the power , with movement scenarios generated solely by changing the value of : a control value that removed the embedding and produced a nonspatial model; a wide kernel with typical of developed populations (10, 25); and a highly local kernel with representing less-developed populations ( compared with rural Huangshan in ref. 26). The resulting distributions of distances from home to work were driven strongly by our choice of , with 95% of journeys less than 24.12 km for ; less than 12.91 km for ; and less than 6.68 km for . Workplace links were dissolved into links between individuals in different households, resulting in a network of cliques (households) that were linked according to . The choice of movement kernel used to create the household–workplace networks affected gross features of simulated epidemics, even when controlling for other aspects of the network topology (Fig. 1). Unipartite contact networks between households were obtained from the bipartite network of households and workplaces and were dependent on three parameters: mean household size , mean number of workplace links , and probability of forming a link in the workplace . The mean workplace size and mean degree of the network were determined by these parameters: . Across a broad range of plausible values for , , and , very local movement () produced later epidemics than did typical developed-population movement () or spatially random mixing (; Fig. 1). 
Similarly, time to extinction was later for very local movement () compared with more frequent, longer-distance movement () or the absence of spatial embedding (). We calculated the coefficient of variation of the degree distribution for each network, independently of (21).
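The effect of the kernel power on workplace choice can be illustrated with a small sketch. It assumes the simplest possible form, a choice probability proportional to distance raised to a negative power, and ignores any other terms the fitting algorithm of ref. 10 may include. The names `gravity_weights`, `choose_workplace`, and `min_distance` (a floor that avoids division by zero) are hypothetical and introduced here for illustration only.

```python
import math
import random


def gravity_weights(home, workplaces, power, min_distance=0.1):
    """Normalized choice probabilities proportional to distance ** (-power)."""
    weights = []
    for wx, wy in workplaces:
        distance = max(math.hypot(wx - home[0], wy - home[1]), min_distance)
        weights.append(distance ** (-power))
    total = sum(weights)
    return [w / total for w in weights]


def choose_workplace(home, workplaces, power, rng):
    """Sample the index of one workplace according to the gravity-like kernel."""
    probs = gravity_weights(home, workplaces, power)
    return rng.choices(range(len(workplaces)), weights=probs, k=1)[0]


# Hypothetical layout: one home and three workplaces at 1, 2, and 10 distance units.
home = (0.0, 0.0)
workplaces = [(1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
rng = random.Random(1)

for power in (0.0, 2.0, 6.0):
    probs = gravity_weights(home, workplaces, power)
    print(power, [round(p, 4) for p in probs], choose_workplace(home, workplaces, power, rng))
```

With a power of zero every workplace is equally likely, which corresponds to removing the spatial embedding; larger powers concentrate almost all of the probability on the nearest workplace.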

Fig. 1.

For each set of parameters drawn from the Latin hypercube, and for , we show relationships between and peak size (A), peak size and final size (B), and and (C) (with the line shown in black).

Each simulation was assigned a value of , the average number of cases in the first generation per seed infection. For moderate to high values of the first-generation ratio , there was very little difference in the final size of the outbreak for the different movement assumptions. However, for low values of , the average final size of the outbreak was substantially smaller for more local kernels. This was driven by a higher probability of extinction when more local movement was assumed. The difference in final size driven by was no longer present when we controlled for extinction (). The choice of movement scenario had a substantial impact on peak incidence, even when was high and there was little difference in the final sizes (Figs. 1 and 2, rows 1 and 2). For example, for parameters with first-generation ratios in the range , average peak daily incidence as a fraction of the total population was for random spatial movement, for movement assumptions typical of developed populations, and when highly local movement was assumed. The relationship between peak height and first-generation ratio appeared to be strongly linear, with correlation coefficients , and 0.9806 for , and 6, respectively.

Fig. 2.

Columns correspond to network structures with , and 6 and simulations with . Exponential growth in real time is indicated by straight lines (second row) and horizontal lines (third row); horizontal lines in the bottom row indicate exponential growth by generation. show results for a wider range of values for .

The relationship between peak incidence and final size for the three movement scenarios illustrates further how clustering within the network directly affects gross features of an epidemic. Peak incidence was observed prior to final size during an epidemic. For the same peak height, local movement gave substantially larger final sizes. For peak daily incidences in the range , the final size of the outbreak was 68% when random spatial movement was assumed, 74% when movement was assumed to be typical of developed populations, and 84% when highly local movement was assumed. For all movement scenarios, the basic reproductive number was smaller than the first-generation ratio and different from the expected number of secondary cases generated by a single seed in an otherwise-susceptible population. The duration of the exponential phase can be seen when incidence is plotted on a log scale: A constant gradient of log incidence is evidence of exponential growth (Fig. 2, third row). However, in a network model with clearly defined generations, the generation ratio can also be used to define exponential growth: If the ratio of incidence between generation and is the same as the ratio between generations and , then we can claim to have identified a period of exponential growth ( and Fig. 2). The value of that constant observed ratio is the basic reproductive number (20). Incidence grew exponentially for a much shorter time for highly local movement than it did for a wider movement kernel, or for nonspatial networks, even when we controlled for to be within a narrow range (e.g., ; Fig. 2).
Despite this being a relatively large population, there was no obvious period of exponential growth when we assumed highly local movement. Therefore, given that the basic reproductive number is defined for a genuine renewal process—and its implied exponential growth (20)—it could be argued that does not exist for some of these networks for our model parameters. However, we did assign a value of for all simulations based on the most similar subset of consecutive early generations (). The amplitude of the difference was not driven in any obvious way by the underlying assumptions used to create the networks. These patterns were not specific to the range of values for (). Analysis of the higher-order structure of the networks suggests that movement scenarios were driving the observed characteristics of epidemics, such as peak timing and attack rate via increased fourth-order clustering. We use the term first-order clustering for the quantity typically described as the local clustering coefficient (5): the link density of the immediate neighborhood of a given node. By extension, we defined order- clustering coefficient to be the expected proportion of neighbors within steps on the network who were also neighbors of each other within steps (Fig. 3). We found no relationship between our assumed pattern of movement () and first- or second-order clustering coefficients. There was a weak relationship between and third-order clustering and then a very strong relationship between and fourth-order clustering. Patterns between epidemic properties and fourth-order clustering for individuals were similar to those between epidemic properties and second-order clustering of households, as would be expected, given the bipartite algorithm used to create individual-level networks.

Fig. 3.

A schematic showing the generalization of clustering coefficient to higher orders : measures the density of paths of length between the up-to- neighbors of node (where node is shown in gray).

Final size increased with spatial correlation, despite peak size displaying the opposite trend for controlled or . There was a strong linear relationship between order- clustering and peak size/final size that could be explained by , the strength of spatial embedding, when we control for (Fig. 4). The gradient of the relationship decreased with order of clustering. Second-order household clustering showed the same relationship with peak size as did fourth-order individual clustering (Fig. 4). These strong linear relationships only existed when we effectively controlled for , rather than , and became less noisy when we reduced the interval used to define .

Fig. 4.

(A) The 25th, 50th, and 75th percentiles of order- clustering on networks constructed with different values of and and . Plot shows mean values over three different networks for each parameter set. (B) Using peak size as a crude metric for subexponential growth (given a fixed range for ), we see linear trends emerging with higher orders of clustering. Plot shows one point per network, with three networks generated for each parameter set, and the mean peak size over 10 independently simulated epidemics. All points are numbered with the corresponding value of . (C) Similarly for the household-only networks. Solid lines show linear fits to data, and dotted lines show CIs. Values of linear correlation coefficient and gradient of fits are given in .

We conducted a number of sensitivity analyses for these network-simulation results. Analytic approximations for degree distribution and expected first-order clustering in our networks are given in and are independent of . We confirmed these relationships in by computing these quantities on a set of networks that differ in . shows the relationship between and clustering order 1 to 4 on networks generated by using a uniform population density. shows the relationship between order- clustering and peak size for different values of . shows clustering orders 1 to 4 on networks with different and , and provides an illustration of the relationship between higher-order clustering and rewiring probability on a commonly used network model with spatial embedding: the Watts–Strogatz Small World Network (5). Finally, we mapped our network model onto a deterministic metapopulation framework so as to relate our simulations of incidence to prior analytic approximations of traveling spatial waves (see for analytic construction). Fig. 5 shows the results of simulating on a grid of evenly spaced households of size , where a single continuous variable describes prevalence in each household, and spatial coupling between households used in the force of infection is exactly the kernel used in the construction of our spatially embedded networks. We simulate with randomly spaced seeds (as above) and with a central seed (the center-most four households), tracking global incidence and local time of peak incidence. The former case yielded global incidence curves similar to those generated in our network model (which was seeded similarly). The latter case allowed us to identify four distinct stages in the propagation of spatial waves that contribute to observed subexponential outbreak dynamics in more complex, network-based systems. shows local peak timing in each case, and shows simulation results in one spatial dimension with and , alongside statistical properties of prevalence, which further clarify these growth phases (cf. figure legends for details and for mathematical analysis).

Fig. 5.

Mean-field approximation with , using a grid of uniformly spaced households. (A) Seeding in 10 randomly selected households (the same households are used in each simulation). (B) Seeding in the center only. Incidence is given as a proportion of the total population for ranging from two to six. shows time of peak incidence in the case seeded as above.
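The deterministic metapopulation idea can be sketched in one spatial dimension. The code below is not the construction used for Fig. 5: it assumes a plain susceptible-infectious-removed model per patch, forward-Euler time stepping, a row-normalized coupling of the form (1 + distance) ** (-power), and a single central seed. All parameter values and the function name `simulate_line` are hypothetical.

```python
def simulate_line(n_patches=21, power=3.0, beta=0.5, gamma=0.25,
                  dt=0.2, steps=1500, seed_prevalence=0.01):
    """Deterministic SIR on a line of patches coupled by a power-law kernel.

    Returns global incidence per time step (as a fraction of the total
    population) and the final susceptible fraction in each patch.
    """
    # Coupling kernel: assumed form, offset by 1 so the self-term is finite.
    kernel = []
    for a in range(n_patches):
        row = [(1.0 + abs(a - b)) ** (-power) for b in range(n_patches)]
        total = sum(row)
        kernel.append([w / total for w in row])

    susceptible = [1.0] * n_patches
    infectious = [0.0] * n_patches
    centre = n_patches // 2
    susceptible[centre] -= seed_prevalence
    infectious[centre] += seed_prevalence

    incidence = []
    for _ in range(steps):
        new_cases = []
        for a in range(n_patches):
            force = beta * sum(kernel[a][b] * infectious[b] for b in range(n_patches))
            new_cases.append(force * susceptible[a] * dt)
        for a in range(n_patches):
            recoveries = gamma * infectious[a] * dt
            susceptible[a] -= new_cases[a]
            infectious[a] += new_cases[a] - recoveries
        incidence.append(sum(new_cases) / n_patches)
    return incidence, susceptible


for power in (0.0, 3.0, 6.0):
    incidence, susceptible = simulate_line(power=power)
    peak_step = max(range(len(incidence)), key=incidence.__getitem__)
    print(f"power={power}: peak at t={peak_step * 0.2:.1f}, peak incidence={max(incidence):.5f}")
```

Comparing the printed peak time and peak height across kernel powers gives a rough feel for how local coupling reshapes the global incidence curve in this toy setting.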

Discussion

We have shown that nonstandard epidemic dynamics can arise from strongly spatially embedded social networks. Using a flexible algorithm of assigning individuals to households and then creating social networks with widely varying topologies, we can explain the absence of exponential growth and increased attack rate for a given peak height in terms of higher-order social structure, while maintaining a standard low-variance offspring distribution. We observed consistent patterns when we controlled for the basic reproductive number, as measured as directly as possible from a constant ratio of incidence between generations. The algorithm we used (10) captures the key social contexts of home and workplace while using few parameters, which has allowed us to isolate specific relationships within the epidemic dynamics, across a broad range of network topologies. However, its simplicity is a potential limitation. Specifically, an individual only belongs to a single workplace (which may represent a school or social club). In reality, people will gather nonhousehold contacts from a variety of sources. Also, our networks are not dynamic, which may limit the generalizability of the results to short-generation-time pathogens. Accurate empirical data about higher-order social contacts would allow us to address some of these issues. There are a number of different approaches to gathering social-contact data, including contact diaries, mobile phone applications, and tag-based location tracking (27). Diary methods and current analytical approaches can provide accurate estimates of first-order moments [degree distribution (28)] and valuable insights into second-order moments [clustering (29)]. However, these data and current analytical approaches are limited for the estimation of higher-order moments. It seems likely that either high-resolution mobile-phone location data (30) or very-high-coverage tag-based studies will be needed to reveal these patterns (31). 
In addition, further work is needed on the use of algorithms similar to that used here to explicitly fit fully enumerated social networks to egocentric sample data from a subset of the population (or low-coverage nonegocentric data) (32). Our results can be compared with other disease-dynamic models that produce nonstandard incidence profiles. Different functional forms have been suggested for the force-of-infection term in compartmental models that give polynomial growth in the early stages of an epidemic (20, 33). However, the key features of these model structures may be captured by a more straightforward underlying process (34). Faster-than-exponential growth can be achieved with very-high-variance offspring distributions, which have been inferred by diary studies of social contacts (9). There is also an extensive literature on much more abstract grid-based models of infectious disease that produce nonstandard epidemic dynamics because of very local spatial processes [cellular automata (35)]. We note that short periods of super-exponential growth were observed in our results for the simplified two-dimensional metapopulation example (Fig. 5), arising from accelerating spatial waves of incidence, not driven by the variance of the offspring distribution. Prospective forecasting of infectious-disease incidence during outbreaks (36) and seasonal epidemics (37) is an active area of public health research. Although nonmechanistic (38) and simple compartmental models (39, 40) have proven most reliable up to now, modern computing capacity enables studies to explore the possibility that incidence forecasts can be improved by the incorporation of realistic social-network topology (41, 42). For example, incidence of Ebola in West Africa in 2013 to 2016 and currently in Central Africa exhibits strong spatial clustering and highly nonstandard incidence dynamics, with short periods of exponential growth followed by low sustained peaks in incidence (43).
Future forecasting studies should explore the possibility that sparse population density and short distances between contacts result in higher-order clustering in the social networks and the resulting nonstandard incidence profiles.

Materials and Methods

The Model.

We simulated 10 independent epidemics for each of 200 parameter sets drawn from a Latin hypercube, each seeded in 10 randomly selected individuals, and for each . The ranges of values used in the Latin hypercube are given in , and complete parameter sets for all networks are given in . Our simulations allowed us to track disease incidence and disease generation of each infection. We simulated an epidemic on the network to reflect the natural history of Ebola, with a latent period of 9.7 d and a serial interval of 15.3 d. The generation time was calibrated by varying the relative infectiousness of a short period before the onset of symptoms. Global transmissibility was tuned to the value of drawn from the Latin hypercube. For each time step, the probability of infection was calculated for each edge in the network. The algorithm progresses in real time with small time steps, so that it can be compared with results from compartmental models. Details of the network-simulation algorithm are given in ref. 10.
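Latin hypercube sampling, used above to draw the 200 parameter sets, can be sketched with the standard library alone. Each parameter's range is cut into as many equal strata as there are samples, one value is drawn per stratum, and the strata are shuffled independently per parameter. The parameter names and ranges below are placeholders invented for this example; the ranges actually used are given in the paper's supporting material and are not reproduced here.

```python
import random


def latin_hypercube(n_samples, ranges, seed=0):
    """Draw n_samples parameter sets with one draw per stratum for every parameter."""
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in ranges.items():
        # One uniform draw inside each of n_samples equal-width strata of [0, 1).
        points = [(k + rng.random()) / n_samples for k in range(n_samples)]
        # Shuffle so that strata are paired at random across parameters.
        rng.shuffle(points)
        columns[name] = [low + u * (high - low) for u in points]
    return [{name: columns[name][i] for name in ranges} for i in range(n_samples)]


# Placeholder names and ranges (assumptions for illustration only).
hypothetical_ranges = {
    "mean_household_size": (2.0, 6.0),
    "mean_workplace_links": (1.0, 10.0),
    "workplace_link_prob": (0.1, 1.0),
    "first_generation_ratio": (1.2, 3.0),
}

parameter_sets = latin_hypercube(200, hypothetical_ranges)
print(len(parameter_sets), parameter_sets[0])
```

Compared with independent uniform draws, this guarantees that every parameter's range is covered evenly even with a modest number of parameter sets.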

Assigning to each Simulation.

For each simulation output, we calculated the mean reproductive ratio for each generation. For generations one to nine and for each possible consecutive string of three, four, or five values, we performed a linear regression fit. We defined as the mean reproductive ratio over the set of values for which the gradient of this fit was closest to zero (and all values that remained larger than one). This allowed us to assign a value to every simulation output.
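The window-selection procedure described above can be sketched as follows. This is one reading of the text rather than the authors' code: it takes the per-generation reproductive ratios as given, considers every run of three, four, or five consecutive generations, discards runs in which any ratio is at or below one, and returns the mean of the run whose least-squares gradient is closest to zero. Returning `None` when no run qualifies is an assumption of this sketch, as are the function names.

```python
from statistics import mean


def slope(values):
    """Least-squares gradient of values against their index 0, 1, 2, ..."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(values))
    denominator = sum((x - x_bar) ** 2 for x in range(n))
    return numerator / denominator


def assign_reproductive_number(generation_ratios, window_lengths=(3, 4, 5)):
    """Mean ratio over the flattest run of consecutive generations with all ratios > 1."""
    best = None  # (absolute gradient, window)
    for length in window_lengths:
        for start in range(len(generation_ratios) - length + 1):
            window = generation_ratios[start:start + length]
            if min(window) <= 1.0:
                continue  # keep only runs in which incidence is still growing
            flatness = abs(slope(window))
            if best is None or flatness < best[0]:
                best = (flatness, window)
    return None if best is None else mean(best[1])


# Hypothetical ratios for generations one to nine.
ratios = [3.0, 2.4, 2.0, 2.0, 2.0, 1.6, 1.2, 0.9, 0.7]
print(assign_reproductive_number(ratios))
```

In the hypothetical series, the three consecutive generations with a ratio of 2.0 form the flattest run, so the estimate is their mean.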

Higher-Order Clustering.

We computed our higher-order clustering coefficients on a subset of 1,000 nodes in each network, chosen at random. The algorithm involved storing the network structure as lists of neighbors for each node and performing an effective contact-tracing procedure. Though it is possible to compute these metrics for all nodes via successive multiplication of adjacency matrices, this procedure becomes computationally expensive in higher orders as networks become large.
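A neighbor-list implementation of the order-n clustering coefficient might look like the sketch below. It follows the verbal definition in the Results (the proportion of pairs of up-to-n-step neighbors of a node that are themselves within n steps of each other) and uses breadth-first search as the contact-tracing step. One detail is an assumption of this sketch: paths between two neighbors are not allowed to pass through the focal node, which keeps the order-1 value equal to the usual local clustering coefficient. Function names are hypothetical.

```python
import random
from collections import deque


def within_steps(neighbors, source, max_steps, excluded=None):
    """Nodes reachable from source in at most max_steps, never passing through excluded."""
    seen = {source}
    if excluded is not None:
        seen.add(excluded)
    reached = set()
    frontier = deque([(source, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == max_steps:
            continue
        for nb in neighbors[node]:
            if nb not in seen:
                seen.add(nb)
                reached.add(nb)
                frontier.append((nb, dist + 1))
    return reached


def order_n_clustering(neighbors, node, order):
    """Fraction of pairs of up-to-order neighbors of node lying within order steps of each other."""
    ball = within_steps(neighbors, node, order)
    if len(ball) < 2:
        return 0.0
    linked = 0  # counts ordered pairs (j, k)
    for j in ball:
        reach_j = within_steps(neighbors, j, order, excluded=node)
        linked += sum(1 for k in ball if k != j and k in reach_j)
    return linked / (len(ball) * (len(ball) - 1))


def sampled_mean_clustering(neighbors, order, sample_size=1000, seed=0):
    """Mean order-n clustering over a random subset of nodes."""
    rng = random.Random(seed)
    nodes = sorted(neighbors)
    sample = rng.sample(nodes, min(sample_size, len(nodes)))
    return sum(order_n_clustering(neighbors, v, order) for v in sample) / len(sample)


# Small hypothetical examples: a six-node ring, and a triangle with one pendant node.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
triangle_plus_pendant = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}

print(order_n_clustering(ring, 0, 1), order_n_clustering(ring, 0, 2))
print(order_n_clustering(triangle_plus_pendant, 0, 1))
```

Working from neighbor lists on a sample of nodes avoids forming powers of the full adjacency matrix, which is the cost the paragraph above refers to.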

37 in total

1. Modelling disease outbreaks in realistic urban social networks.

Authors: Stephen Eubank; Hasan Guclu; V S Anil Kumar; Madhav V Marathe; Aravind Srinivasan; Zoltán Toroczkai; Nan Wang
Journal: Nature Date: 2004-05-13 Impact factor: 49.962

2. Collective dynamics of 'small-world' networks.

Authors: D J Watts; S H Strogatz
Journal: Nature Date: 1998-06-04 Impact factor: 49.962

3. Smallpox transmission and control: spatial dynamics in Great Britain.

Authors: Steven Riley; Neil M Ferguson
Journal: Proc Natl Acad Sci U S A Date: 2006-08-07 Impact factor: 11.205

4. A general theory of early growth?: Comment on: "Mathematical models to characterize early epidemic growth: A review" by Gerardo Chowell et al.

Authors: Thomas House
Journal: Phys Life Rev Date: 2016-08-17 Impact factor: 11.025

5. Ebola virus disease among children in West Africa.

Authors: Junerlyn Agua-Agum; Archchun Ariyarajah; Isobel M Blake; Anne Cori; Christl A Donnelly; Ilaria Dorigatti; Christopher Dye; Tim Eckmanns; Neil M Ferguson; Robert A Fowler; Christophe Fraser; Tini Garske; Wes Hinsley; Thibaut Jombart; Harriet L Mills; Srinivas Murthy; Gemma Nedjati Gilani; Pierre Nouvellet; Louise Pelletier; Steven Riley; Dirk Schumacher; Anita Shah; Maria D Van Kerkhove
Journal: N Engl J Med Date: 2015-03-26 Impact factor: 91.245