The COVID-19 pandemic of 2020 has led to a significant increase in research in the area of modeling and simulation of infectious diseases. There are numerous aspects associated with this epoch-changing event that is now facing humanity. Macroscale (planetary) disease propagation, in addition to the related issues of logistical and political responses, is a central issue. Accordingly, the objective of this work is to develop a computationally-amenable agent-based model to investigate the behavior of an infected population by directly working at the individual-to-individual level of interaction. The wide-spread availability of computational power now raises the possibility that robust agent-based modeling can play a significant role in the analysis of infectious disease propagation. The key feature of agent-based modeling is that discrete entities (agents) are used to directly represent a population (Fig. 1). This enables the detailed analysis of epidemiological population dynamics and the ability to investigate the emergent structure SIR-type (Susceptible-Infected-Removed/Recovered) populations, as well more complex extensions, due to initially localized infections within a population on a global planetary scale, including the effects of social responses.
Fig. 1
A model problem of a “planet” with a population, which experiences sudden localized infections. Left: a model schematic and right: a computational model (blue representing currently uninfected and green representing infected)
A model problem of a “planet” with a population, which experiences sudden localized infections. Left: a model schematic and right: a computational model (blue representing currently uninfected and green representing infected)
Classical basic models
Before proceeding with the construction of an agent-based approach, it is useful to review basic concepts in the analysis of population dynamics, which dates back over two centuries to the work of Thomas Malthus. In 1798, he postulated that a population, denoted p, at a future time (), is related to the current population (at time t) bywhere , where b is a birth rate parameter and where d is a death rate parameter. One may writeleading to, in the limit as ,Integrating and applying the initial condition yieldsVariants/extensions of this simple model include interacting subpopulations. One family of models are of particular interest in this work, namely the so-called SIR-type (Susceptible-Infected/Infectious-Removed/Recovered), described next. Subsets of the population are assigned either S, I, or R status. The genesis of such models is the paper from 1927 of Kermack and McKendrick [35].
SIR sub-population models
SIR models identify three subpopulation classes of individuals, with an assumption that the overall population is constant, since the pandemic/epidemic time scales are faster than birth time scales. The following are key:A crucial question is, under what conditions does an infection grow, i.e. an “epidemic” occur? From Eq. 1.6, ifand thusEquation 1.10 provides the threshold for the susceptible populationto allow growth to occur. The parameter is sometimes called the infectious contact rate, while its reciprocal is called the recovery/removal rate. This simple model is one of the most basic to describe epidemics. For reviews, we refer the reader to Murray [42].S “Susceptible”, that can contract diseases,I = “Infected”, that can transmit the disease and who are infected andR = “Recovered/Removed” (dead or immune), where the typical assumptions include: (a) the gain in the infected class is at a rate proportional to the number of infected and susceptible, that is , (b) the rate of removal of the infected is proportional to the number of infected, , and (c) the incubation period is short enough to be negligible.In addition, if it is assumed that the various classes are uniformly mixed, where for the susceptible population: while for the infected subpopulation and for the removed populationAdding all of the populations together yields where where p is the total population, , , , is the infection rate and is the death rate.
Generalization of the SIR family of models
The basic SIR model can be extended in the following ways:The SIS model: Susceptible-Infected-Susceptible (again), typified by the common cold.The SIRD model: Susceptible-Infected-Recovered-Deceased, which distinguished between recovered and dead.The MSIR model: Maternal-Susceptible-Infected-Recovered, where the “M class” stands for immunity derived from the mother.The SEIR model: Susceptible-Exposed-Infected-Recovered, which distinguishes between “infected” and “exposed”.The SEIS model: Susceptible-Exposed-Infected-Susceptible (again), typified by the common cold, yet distinguishes between “infected” and “exposed”.The MSEIR model: Maternal-Susceptible-Exposed-Infected-Recovered, which incorporates features of the models above.The MSEIRS model: Maternal-Susceptible-Exposed-Infected-Recovered-Susceptible, which incorporates all of the features of the models above.The classical process of developing a continuum model from an inherently discrete system, which is then re-discretized into nodes. Information is lost in this “homogenization” process (Zohdi [53])For overviews and details on these various models, we refer the reader to Kermack and McKendrick [35], Murray [42], Hethcote[30, 31], Harko et al. [29], Baily [3], Altizer and Nunn [1], Miller [40], Miller [41], Osemwinyen and Diakhaby [44], Brauer and Castillo [12] Anderson [2], Barlett [5], May and Anderson [38], Capasso [15] and Vynnycky and White [48]. Various features can be included, such asVirtually all subsequent, more complex spatio-temporal extensions construct homogenized continuum (PDE-based) models, incorporating the above models. These approaches require extensive, complex, discretization techniques and are of limited value for studies on population dynamics with underlying complex interaction between subpopulations (Fig. 2). Such models have limited predictive capability and are computationally expensive, due to the extremely fine discretization needed to achieve tolerable numerical accuracy. Independent of the numerical difficulties, such modeling approaches attempt to develop continuum type field equations, by passing to the spatio-temporal limit as , make somewhat unrealistic assumptions in order to obtain tractable partial differential equations to enable qualitative estimates of the true population behavior. One must question the process of first homogenizing an inherently discrete population’s characteristics in order to develop PDE-based continuum models and then re-discretizing them into nodal values. This process is not bijective, in other words, one does not recover the original discrete system (Fig. 2). Information is lost in this process. Also, because of the simplifying assumptions on interaction, births, age structuring, etc., that are typically made, in order to obtain tractable field equations, the resulting discrete equations are not as physically meaningful as the true discrete population interaction that it is based upon. Essentially, in dealing with small subpopulations, or populations with complex mixing and high heterogeneity, the assumptions behind regularization techniques leading to continuum models may be difficult to justify. This motivates developing agent based methods, which are based on direct accounting of interactions between individuals or population subgroups.
Fig. 2
The classical process of developing a continuum model from an inherently discrete system, which is then re-discretized into nodes. Information is lost in this “homogenization” process (Zohdi [53])
Variable contact rates,Adult vaccinations,Child vaccinations,Newborn vaccinations,Effects of age andVector transmission (for example from mosquitos).A schematic of the growth of subpopulations
Agent-based models and objectives
Agent based models attempt to model the interaction between individuals directly, incorporating stochastic methods and have been used across many different disciplines, such as: biology, business, economics, social sciences, robotics and technology. The objective is to obtain deeper insight into the connection between micro-interaction and emergent macroscale behavior. The method first appeared in the 1940’s, but was computationally infeasible until the 1990’s. Although there are many forms of agent-based models, computationally, they are quite similar to particle interaction models in mathematical physics. The use of the term agent is often attributed to Holland and Miller’s 1991 paper “Artificial Adaptive Agents in Economic Theory” [33]. The 1990’s and early 2000’s led to many approaches, often correlated to the concepts of swarms of agents and aggregate movement. For reviews of the literature, see Bonabeau [11]. The attractive features of the approaches are the ability to model individual interactions nonphenomenologically, and to allow the system to evolve autonomously. The utility of such approaches is that one can trivially modify the “rules of engagement”, population sizes, reproduction rates, etc, and provide quantitative spatial and temporal information. Clearly, such a computational technique is easy to implement, and it is no extra effort to increase the number of population character parameters. We refer the reader to Zohdi [49, 50, 53–55], for reviews. Our objective in the current paper is to apply similar concepts to pandemic modeling, where the interaction is via infection transmission. Specifically, we develop a robust agent-based computational framework to investigate the emergent structure of Susceptible-Infected-Removed/Recovered (SIR)-type populations and more complex extensions on a global planetary scale, where each agent on the surface of the planet is initially uninfected. Infections are then seeded on the planet in localized regions. Contracting an infection depends on the characteristics of each agent—i.e. their susceptibility and contact with the seeded, infected, agents. Agent mobility on the planet is dictated by social policies, for example such as “shelter in place”, “complete lockdown”, etc. The global population is then allowed to evolve according to infected states of agents, over many time periods, leading to an SIR population. The work illustrates the construction of the computational framework and provides numerical examples.Remarks: In the related field of astro-biology, there are concerns about Earth-based microbes potentially infecting other worlds, by being carried on spacecraft missions. The recent discovery of water on Mars (as well as the moons of Saturn and Jupiter) has heightened such concerns. It has been reported that, on Mars, liquid water can possibly form seasonally in locations where snow is present on soils, in the presence of saline, producing brine (Martinez and Renno [37]). Since terrestrial bacteria can grow brine, infection of the Marian biosphere could be possible. The reverse is also true, since returning spacecraft from Mars could bring back non-terrestrial organisms back to Earth. NASA and other space agencies have consistently reported that Bacillus spores, subjected to years vacuum in space, cosmic radiation and extreme temperatures, can survive, if they are shielded by the exterior of a spacecraft. We refer readers to Beaty [6], Fischer et al. [23], Martinez and Renno [37], Summons et al. [46], Michalski [39] and Debus [18] for details.The infection zone and flow chart
Direct agent-based interaction models
Following Zohdi [53], we now construct a model problem based on discrete rule-driven interaction between agents of subpopulations. One can consider an agent as an individual or a small group of individuals (a “meta-individual”).
Agent-to-agent interaction and rules of engagement
Consider the following construction, for the “rules of engagement” for intermeshed infected and uninfected subpopulations, which are in close proximity to one another (Fig. 4):
Fig. 4
The infection zone and flow chart
If two agents of the subpopulations, denoted and , come within a certain distance, then two are said to engage in “contact”.If the uninfected agent is susceptible, then the agent becomes infected.The “susceptibility” of the uninfected subpopulations is heterogeneous (preset at the beginning of the simulation).The incubation period for an infected agent to either recover or die is .Once an agent of the population perishes or becomes immune, it cannot affect the rest of the population.Spherical coordinates used for agent placement. In this example, agent high-density concentration centers are at the (z-poles)
Algorithm
The algorithm is as follows:The relative ease at which one can generate such a population, and step it through several time periods, is rather obvious.Step 1: Select:(a) The number of agents () in the populations.(b) The infection distance.(c) The susceptibility of the agents.(d) The total simulation time T.(e) The cycle time = . The number of time-periods is .Step 2: Generate the initial population locations on the globe.Step 3: For each population, loop over each agent in the infection radius. If so, according to the “rules of engagement” in the previous section, compute the interaction of the pair.Step 4: Compute the survivors and deaths of the existing agents for the time period.Step 5: Repeat steps 2–4 for the next time period until the overall simulation time is complete.
Computational acceleration
There are a variety of techniques to accelerate the computations. The primary computational expense is neighbor-to-neighbor contact checks (an operation). To mitigate this, one can construct so-called “interaction” or “Verlet” lists of neighboring agents that an agent may be in contact with at any given time. One can retain this Verlet list for a preset number of time steps and then update it when appropriate. The approach is relatively straightforward to implement and can speed up the computations dramatically (see Pöschel and Schwager [45] and Zohdi [51, 52]). Alternative computational acceleration approaches can be achieved via sorting and binning methods, which proceed by partitioning the whole domain into bins. The agents are sorted by the bins in which they reside. The agent interaction proceeds, bin by bin, where the agents within a bin potentially only interact with agents within their bin or neighboring bins. Parallel processing is a further acceleration that can be employed whereby groups of agents or bins are sent to each processor and updated periodically, sharing the agent information between processors every few time-steps. In this work, we only implemented the Verlet list algorithm.Starting from left to right and top to bottom, the progressive growth of at SIR population. Shown are after t = 0, t = 0.2T, t = 0.4T, t = 0.6T, t = 0.8T and t = T time, with mobility parameters: andEvolution of infected and dead/recovered with mobility parameters: and 0.005
A model problem
As an example, consider a population with 20,000 agents (Fig. 6). We consider six cases of increasingly mobile subpopulations that are initially uniformly mixed across the globe, with agents being infected or uninfected and being susceptible or not susceptible. We employed the following parameters:Figure 6 illustrates, starting from left to right and top to bottom, the progressive growth of SIR subpopulations. Shown are results at t = 0, t = 0.2T, t = 0.4T, t = 0.6T, t = 0.8T and t = T time, with mobility parameters: and . Figure 7 shows the evolution of infected and dead/recovered with mobility parameters: and . With increases in mobility, there is an immediate and direct impact on the spread of the infection, emanating primarily from the dense population centers at the z-poles. The utility of the presented computational approach is that one can trivially modify the “rules of engagement”, population sizes, etc., and provide quantitative spatial and temporal information. Clearly, such a computational technique is easy to implement, and it is no extra effort to increase the number of population character parameters.
This is straightforward to implement. Numerous additional features can be added easily. For example, one could add population growth in a variety of ways, such as, algorithmically:
Fig. 6
Starting from left to right and top to bottom, the progressive growth of at SIR population. Shown are after t = 0, t = 0.2T, t = 0.4T, t = 0.6T, t = 0.8T and t = T time, with mobility parameters: and
Fig. 7
Evolution of infected and dead/recovered with mobility parameters: and 0.005
Globe radius, .Total simulation time: .The infection distance: .The population initially infected = .The population that is susceptible = .The incubation period for an agent to either recover or die is .Update cycle time = .The location of each agent was achieved using a random spherical coordinate scheme (see Fig. 9), whereby (inclination angle) is a random number between and (azimuth angle) is a random number between , yielding the following in Cartesian coordinates: and and
Fig. 9
Agent-obstacle-target model problem
The mobility is given by (the location at an instant of time later) and and This produces dense population centers at the z-poles. The updated values for the angles are given by where , being a random number between and is a mobility amplitude parameter and where , being a random number between and is a mobility amplitude parameter.If an agent of a population survives beyond a certain number of time periods, it then produces offspring, and then perishes.The offspring are placed within an “offspring” radius, centered at the spatial location of the parent. The number of children possible that an individual can have, at maturity, is given by where is a random number and where M is the maximum number of children possible. The function “integer” extracts the nearest integer from .After giving birth to the offspring once, the agent cannot have offspring again.Creating a “forbidden” zoneAgent-obstacle-target model problemWe note that, if desired, incorporation of “forbidden regions” i.e. “uninhabitable zones” within the domain is relatively easily to enforce by checking at each time step whether an individual has entered such an area (Fig. 8). If so, then the individual is moved back outside, and a new position is recalculated with a different trajectory. Another extension to the overall modeling is to provide more detail on the movement of the agents. For example, as described at the outset of this paper, we could have allowed for them (as well as the parents) to move according to more physical rules, based on pandemic events and stimuli. It is here where one can draw on swarm movement models discussed earlier in the paper. This is discussed further next.
Fig. 8
Creating a “forbidden” zone
Extensions
One direction in which to extend this research is in the swarm-like movement of agents in response to pandemic events. The origins of swarm modeling are in the description of biological groups (flocks of birds, schools of fish, crowds of human beings, etc.) responses to predators or prey (Breder 1952 [13]). Early approaches that rely on decentralized organization can be found in Beni [8], Brooks [14], Dudek et al. [20], Cao et al. [16], Liu and Passino [36] and Turpin et al [47]. Usual models incorporate a tradeoff between long-range interaction and short-range repulsion between individuals, dependent on the relative distance between individuals (see Gazi and Passino [24-26], Bender and Fenton [7] or Kennedy and Eberhart [34]). The most basic model is to treat each individual as a point mass (Zohdi [49]), which we adopt here, and to allow the system to evolve, based on Newtonian mechanics, using a combination of short-range and long-range interaction forces (Gazi and Passino [24], Bender and Fenton [7], Kennedy and Eberhart [34] and Zohdi [49, 50, 53–55]).1 For some creatures, the “visual field” of individuals may play a significant role, while if the agents are robots or Unmanned Autonomous Vehicles (UAVs) the communication can be electronic. However, in some systems, agents interact with a specific set of other agents, regardless of whether they are far away (Feder [21]). This appears to be the case for Starlings (Sturnus vulgaris). In Ballerini et al. [4], the authors concluded, that such birds communicate with a certain number of birds surrounding it and that that interactions are governed by topological distance and not metric distance. Interested readers are referred to Ballerini et al. [4]. We refer the reader to Zohdi [49, 50, 53–55] for reviews and provide an example of such a formulation next.
An example of a swarm formulation
In order to illustrate how swarm movement is modeled, following Zohdi [49, 50, 53–55], we treat the agents as point masses, i.e. we ignore their dimensions. For each agent ( in total) the equations of motion arewhere the position of a point (agent) in space is given by the vector , the velocity is given by , the acceleration is given by , and where represents the total forces acting on an agent i, represents the interaction between agent i and desired targets, represents the interaction between agent i and obstacles and represents the interaction between agent i and other agents. In the context of a pandemic, examples of targets could be hospitals, food distribution centers, etc., while obstacles could be physical barriers, such as buildings.
Agent-target interaction
Consider agent-target interactionwhere is the position vector to target j and the direction to each target isFor each agent (i), we compute a weighted direction to each targetwhere the are weights reflecting the importance of the target, are decay parameters, which is summed (and normalized later in the analysis) to give an overall direction to move towards
Agent-obstacle interaction
Now consider agent-obstacle interactionwhere is the position vector to obstacle j and the direction to each obstacle isFor each agent (i), we compute a weighted direction to each obstaclewhere the are weights reflecting the importance of the obstacle, are decay parameters, which is summed (and normalized later in the analysis) to give an overall direction to move towards
Agent–agent interaction
Now consider agent(i)–agent(j) interactionand the direction to each agentFor each agent (i), we compute a weighted direction to each agentwhere the are weights reflecting the importance of the agents, are decay parameters, which is summed (and normalized later in the analysis) to give an overall direction to move towards
Summation of interactions
We now aggregate the contributions by weighting their overall importance with weights for agent/target interaction, , agent/obstacle interaction, and agent/agent interaction, :2normalize the resultThe forces are then constructed by multiplying the thrust force available by the propulsion system (foot, bicycle, car, etc.), , by the overall normal directionWe then integrate the equations of motion:yieldingandNote that ifthen we define and the velocity is rescaledwith .
An algorithm for movement in a region
Consider the following algorithm:Initialize the locations of the targets: , i = 1, 2,... = targets.Initialize the locations of the obstacles: , i = 1, 2,... = obstacles.Initialize the locations of the agents: , i = 1, 2,... = agents.For each agent (i), determine the distance and directed normal to each target, obstacle and other agents.For each agent (i), determine interaction functions , , and .For each agent (i), determine force acting upon it, .For each agent (i), integrate the equations of motion (checking constraints) to produce and .Determine if any targets have been reached by checking the distance between agents and targets For any , if any agent has satisfied this criteria, then immobilize i (fix ).The entire process is then repeated for the next time step.
Preliminary numerical example
As a preliminary example, consider the following parameters:The vector of system parameter inputs iswas given by a test parameter vectorselected within the following intervals:Figure 10 illustrates the results of this parameter choice.
Fig. 10
Starting from left to right and top to bottom, the progressive movement of a group of agents (blue) avoiding obstacles (green) to get to the target (red)
Mass = 75 kg,100 agents,1 target,10 obstacles,s,s,Initial agent velocity, m/s,Initial agent domain (10, 10, 0) m,Thrust force available by the system, Nt,Domain of (500, 500, 0) m,Maximum velocity agent m/s.Overall weights: ,Target weights: ,Obstacle weights: ,Agent weights: andDecay coefficients: , , .Starting from left to right and top to bottom, the progressive movement of a group of agents (blue) avoiding obstacles (green) to get to the target (red)
Parameter estimation via machine-learning
There are many parameters in the system, warranting the use a Machine Learning Algorithm. Here we follow Zohdi [50, 54, 55] in order to optimize behavior by minimizing a cost function. For example, let us consider minimizing the following cost functionwhere represents the number agents that reached the target within a specific time and represents the total number of agents in the system. In other words, the system is being driven to the parameters generating the best case scenario (all agents reaching the target). The design vector of system parameters is:Cost functions associated with optimization of complex behavior are oftentimes nonconvex in design parameter space and often nonsmooth, as is the case for the system of interest. Their minimization is usually difficult with direct application of gradient methods. This motivates nonderivative search methods, for example those found in Machine Learning Algorithms (MLA’s). One of the most basic MLA’s are so-called Genetic Algorithms (GA’s). Typically, one will use a GA first in order to isolate multiple local minima, and then use a gradient based algorithm in these locally convex regions or reset the GA to concentrate its search over these more constrained regions. GA’s are typically the simplest scheme to start the analysis, and one can, of course, use more sophisticated methods if warranted. For a review of GA’s, see the pioneering work of John Holland (Holland [32]), as well as Goldberg [27], Davis [17], Onwubiko [43] and Goldberg and Deb [28]. The GA approach is extremely well-suited for nonconvex, nonsmooth, multicomponent, multistage systems, and involves the following essential concepts:Population generation: Generate system population:Performance evaluation: Compute fitness/performance of each genetic string: and rank themMating process: Mate pairs/produce offspring: where (Fig. 11)
Fig. 11
The basic action of a machine learning-based genetic algorithm (Zohdi [55])
Gene elimination: Eliminate poorly performing genetic strings, keep top parents and generated offspringPopulation regeneration: Repeat the process with the new gene pool and new random genetic stringsSolution post-processing: Employ gradient-based methods afterwards in the local “valleys”-if smooth enough
Algorithmic specifics
Following Zohdi [50, 54, 55], the algorithm is as follows:Step 1: Randomly generate a population of S starting genetic strings,Step 2: Compute fitness of each string , (i = 1, ..., S)Step 3: Rank genetic strings: , (i = 1, ..., S)Step 4: Mate nearest pairs and produce two offspring, (i = 1, ..., S)Step 5: Eliminate the bottom strings and keep top parents and top K offspring (K offspring + K parents )Step 6: Repeat steps 1–6 with top gene pool (K offspring and K parents), plus M new, randomly generated, stringsNote: and are random numbers, such that , , which are different for each component of each genetic stringOption: Rescale and restart search around best performing parameter set every few generationsRemark: The system parameter search is conducted within the constrained ranges of , and , etc. These upper and lower limits would, in general, be dictated by what is physically feasible.The basic action of a machine learning-based genetic algorithm (Zohdi [55])As another example, one can use such algorithms to search for parameter sets that yield subpopulations (such as infected (I) and recovered (R)) having similar stable sizes after many time-periods. Mathematically speaking, this can be expressed by writing , whereand where , after a set number of time periods for a given . Clearly, there are numerous possibilities. Typically, for populations with a finite number of agents, there will be slight variations in the performance for different random starting configurations. In order to stabilize the objective function’s value with respect to the randomness of the starting configuration, for a given parameter selection (), a regularization procedure is applied within the genetic algorithm, whereby the performances of a series of different random starting configurations are averaged until the (ensemble) average converges, i.e. until the following condition is met:where index i indicates a different starting random configuration () that has been generated and Z indicates the total number of configurations tested. In order to implement this in the genetic algorithm, in step 2, one simply replaces compute with ensemble compute, which requires a further inner loop to test the performance of multiple starting configurations. Development of such stochastic Machine Learning Algorithms for complex, multiscale, systems is currently under investigation by the author.
Authors: M Ballerini; N Cabibbo; R Candelier; A Cavagna; E Cisbani; I Giardina; V Lecomte; A Orlandi; G Parisi; A Procaccini; M Viale; V Zdravkovic Journal: Proc Natl Acad Sci U S A Date: 2008-01-28 Impact factor: 11.205
Authors: Roger E Summons; Jan P Amend; David Bish; Roger Buick; George D Cody; David J Des Marais; Gilles Dromart; Jennifer L Eigenbrode; Andrew H Knoll; Dawn Y Sumner Journal: Astrobiology Date: 2011-03-09 Impact factor: 4.335
Authors: Mathias Peirlinck; Kevin Linka; Francisco Sahli Costabal; Jay Bhattacharya; Eran Bendavid; John P A Ioannidis; Ellen Kuhl Journal: Comput Methods Appl Mech Eng Date: 2020-09-08 Impact factor: 6.756
Authors: Georgii P Romanov; Anna A Smirnova; Vladimir I Zamyatin; Aleksey M Mukhin; Fedor V Kazantsev; Vera G Pshennikova; Fedor M Teryutin; Aisen V Solovyev; Sardana A Fedorova; Olga L Posukh; Sergey A Lashin; Nikolay A Barashkov Journal: Biology (Basel) Date: 2022-02-07
Authors: Mathias Peirlinck; Kevin Linka; Francisco Sahli Costabal; Jay Bhattacharya; Eran Bendavid; John P A Ioannidis; Ellen Kuhl Journal: medRxiv Date: 2020-08-29
Authors: Malú Grave; Alex Viguerie; Gabriel F Barros; Alessandro Reali; Alvaro L G A Coutinho Journal: Arch Comput Methods Eng Date: 2021-07-27 Impact factor: 7.302