
The CP-ABM approach for modelling COVID-19 infection dynamics and quantifying the effects of non-pharmaceutical interventions.

Aleksandar Novakovic^1,2, Adele H Marshall^1,2.

Abstract

The motivation for this research is to develop an approach that reliably captures the disease dynamics of COVID-19 for an entire population in order to identify the key events driving change in the epidemic through accurate estimation of daily COVID-19 cases. This has been achieved through the new CP-ABM approach, which uniquely incorporates Change Point detection into an Agent Based Model, using genetic algorithms for calibration and an infection-centric procedure for computational efficiency. The CP-ABM is applied to the Northern Ireland population, where it successfully captures patterns in COVID-19 infection dynamics over both waves of the pandemic and quantifies the significant effects of non-pharmaceutical interventions (NPIs) on a national level for lockdowns and mask wearing. To our knowledge, there is no other approach to date that has captured NPI effectiveness and infection spreading dynamics for both waves of the COVID-19 pandemic for an entire country population.


Keywords: Agent based model; COVID-19; Change point detection; Genetic algorithm; Non pharmaceutical interventions

Year: 2022 PMID: 35601479 PMCID: PMC9107333 DOI: 10.1016/j.patcog.2022.108790

Source DB: PubMed Journal: Pattern Recognit ISSN: 0031-3203 Impact factor: 8.518

Introduction

On the 30th January 2020, the World Health Organization declared a global outbreak, and on the 11th March of the same year, it declared the pandemic of coronavirus disease 2019 (COVID-19). COVID-19 is caused by a respiratory virus named SARS-CoV-2, whose main mode of human-to-human transmission is direct, indirect or close contact with an infected person through their infected secretions (droplets, aerosols, saliva), which are produced when an infected person sneezes, coughs, talks or sings [1]. One of the main characteristics of this virus is that people who get infected do not develop symptoms (i.e. become symptomatic or clinical) immediately upon contracting it. The period of time that passes between exposure to the virus and symptom onset is called the incubation period [1]. This incubation period varies from person to person, lasting for a median of 5.1 days (95% CI = [4.5, 5.8]), but in some instances its duration can be as long as 14 days [2]. However, studies have also demonstrated that infected people without symptoms, that is, those who are in the incubation period and not yet symptomatic (i.e. presymptomatic or preclinical carriers) and those who never develop any symptoms (i.e. asymptomatic or subclinical carriers), can spread the disease too [3]. Given that subclinical people are not always tested, numerous research findings have suggested that the total number of infections is very likely greater than the number of reported cases [4]. The true proportion of COVID-19 transmissions that is accounted for by preclinical or subclinical people is unknown, and this has big implications for prevention [5]. To fight the global pandemic, nations around the world have been introducing a range of different non-pharmaceutical interventions (NPIs) that aim to control and reduce the rapid spread of the virus. 
These NPIs include, but are not limited to, partial and/or total lockdowns on a regional and/or national level, promoting social distancing, and making the wearing of face masks mandatory on public transport, indoors and/or outdoors. Although these measures are proven to be effective in reducing the spread of disease [6], their enforcement affects sociopolitical, economic and all other aspects of life. Given the complexity of societies and the differences in the interventions that different nations are taking in fighting COVID-19, it is very difficult to predict their short and medium term impact on both a national and global level [7,8]. Some interventions, such as regional and total national lockdowns, can have a devastating impact on national economies and on the mental health of society, particularly of those most vulnerable. It is therefore particularly important to quantify the effect that these NPIs have on reducing the spread of the virus, so that taking similar approaches can be justified in future decisions.

Related work

When it comes to modelling the spread of COVID-19 the three approaches most commonly found in the literature are the compartmental, AI and agent-based modelling approaches, each with their own advantages and disadvantages.

Compartmental and AI modelling approaches

Compartmental models belong to the group of equation based models, and provide a theoretical framework for describing disease dynamics and analysing a specific outbreak or epidemic within a closed and well mixed homogeneous population [9,10]. Each compartment represents one disease status, and each individual in the population can be in exactly one compartment at a given time but can move from one compartment to another depending on the model parameters [11]. A compartmental model consists of a system of differential equations, with each differential equation representing a single compartment in the model [12]. There are many types of these models, but given that there is a known incubation period for COVID-19, the SEIR (Susceptible-Exposed-Infected-Recovered) compartmental models [13] are most frequently used for modelling its disease dynamics, as for instance in the research by Kuniya [14]. By adding new compartments, many refinements of the standard SEIR COVID-19 models have been made in order to create more realistic models that include, for example, super spreaders [15], preclinical and subclinical [16], quarantined [17], and other types of patient. Compartmental models represent the majority of those found in the literature for simulating COVID-19 disease dynamics [8]. Their main advantage is their capability to capture large scale infection dynamics at a macro level (e.g. a country-wide or continent-wide pandemic) with relatively low computational overhead. However, this top down approach is also usually listed in the literature as their biggest limitation, as they are unable to capture more refined information on the spread of disease, such as the interaction between individuals [10]. 
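For orientation, the textbook form of an SEIR system, with one differential equation per compartment, can be written as below. This is a generic sketch and not the specific model of [13] or [14]; the symbols $\beta$ (transmission rate), $\sigma$ (rate of leaving the exposed compartment), $\gamma$ (recovery rate) and $N$ (total population size) are assumptions introduced here for illustration.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta \frac{S I}{N}, \\
\frac{dE}{dt} &= \beta \frac{S I}{N} - \sigma E, \\
\frac{dI}{dt} &= \sigma E - \gamma I, \\
\frac{dR}{dt} &= \gamma I, \qquad S + E + I + R = N .
\end{aligned}
```

Because the four right-hand sides sum to zero, the population size $N$ stays constant, which reflects the closed population assumption mentioned above.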
More recently, with increasing data availability, AI approaches have been gaining traction within COVID-19 modelling, in particular for diagnostic purposes from medical images [18], [19], [20] and for forecasting the spread of the disease. An example of the former is Wang et al. [18], who proposed a new framework that utilises deep learning to differentiate and localise COVID-19 from chest X-ray images of community acquired pneumonia patients. An example of the latter is the work of ArunKumar et al. [21], who use deep layer Recurrent Neural Networks (RNNs) with Gated Recurrent Units (GRUs) and Long Short-Term Memory (LSTM) cells to estimate 60 day forecasts of COVID-19 cases for the top ten countries. However, AI models require a large amount of training data and, consequently, extensive training time. When it comes to modelling infection dynamics, the focus of this paper, AI models share the limitations of compartmental models in being a top down approach. These include an inability to capture more refined information on the spread of disease, such as the interaction between individuals, the change in their attitudes in response to the NPIs and the effect that this may have on the spread dynamics, as well as the difficulty in quantifying NPI effectiveness [22].

Agent based modelling approaches

Agent Based Models (ABMs) take the opposite approach to the macro level view of the compartmental and AI models, by creating a micro level view and simulating it up to the macro level. ABMs are computer based simulations that consist of heterogeneous and adaptive individual entities called agents, which are uniquely identifiable and capable of acting autonomously and interacting with one another in the simulated environment [23,24]. Their main characteristic is that, by defining the set of micro-rules that describe the agents’ behaviour in the simulated environment, they are able to capture emergent macro effects and realistically model a real world system [25]. More precisely, when it comes to epidemiological modelling, the behaviour of the agents combined with the transmission pattern and disease progression leads to emergent population dynamics such as a disease outbreak or pandemic [24]. For the results of ABMs to be applicable on a population level, the characteristics of the agents, their behaviour and the characteristics of the disease in the simulated environment should be as close as possible to those found in the real world. However, this capability of ABMs to mimic real world scenarios comes at a price: the more detail the models capture, the more computing power is required to run the simulations. Therefore, scalability when creating highly detailed models is often listed in the literature as one of the main limitations of ABMs [10]. The majority of the previously published ABMs used for modelling the spread of COVID-19 have focused on a very small time interval, almost all of them covering the first wave only. 
The limiting factor is likely to have been due to the big issue of scalability as national level models need to have the number of agents in the model matching the number of individuals in the population (based on their census). It is not surprising therefore that no papers can be found reporting the development of an ABM at a national level for both the first and second waves and only two that fit the first wave of the pandemic for their entire population [26,27]. Chang et al. [26] use an ABM to model the best COVID-19 strategies for intervention in Australia, such as restricting international travel, school closures and social distancing. The AceMod, an established Australian-census ABM, was calibrated by using real data until the end of June 2020 for the age-dependent fraction of symptomatic cases and simulations conducted for the period of March 2020 to August 2020. The model runs on a high performance cluster with 4264 cores with over 24 million software agents which mimic the population of Australia. Hinch et al. [27] developed an ABM for evaluating the impact of NPIs in the UK population based on the UK 2011 Census data for household size and age structure of the population of 65 million. The OpenABM-Covid19 model simulates the UK population from the date of the first national lockdown on the 23rd March 2020 until 60 days later. The authors report on the model realistically simulating an outbreak, the corresponding hospitalization and ICU admissions for a default population of 1 million inhabitants in an urban environment where each day takes approximately 3 seconds to run and requires 5Gb of memory (reduced to 1.7Gb if contact-tracing is disabled) on a 2015 MacBook Pro. The model has been developed using the C language but is run via Python using a SWIG-interface. A framework of over 200 tests are used to validate the model. 
There is no detail provided on how the validation is performed except to report that the expected outputs match the specified set of input parameters. More commonly, due to high computational costs, researchers using ABMs typically focus on modelling the COVID-19 spread at a regional or city level or by using a smaller number of agents than the entire population and then extrapolate the results. Examples of such are as follows: Urmia city, Iran is represented by an ABM with 750,805 agents for each individual of the population based on the Census 2017 data [28]. A chi-square test was performed to test whether the average number of weekly infected agents for 100 runs of the ABM were significantly different to the actual observed data. The results indicated that this was not significant and hence the model was considered a suitable representation of Urmia city. No further calibration was performed. In the Chernivtsi region of Ukraine, an ABM was developed to predict the spread of COVID-19 using 1000 agents and further applied to regions in Slovakia, Turkey and Serbia using between 500-1000 agents [29]. The authors include visual comparisons of the real data and forecasted spread to demonstrate the approach's suitability and report that statistical results, and sensitivity analysis were also conducted. The method is noted to be slower and disadvantaged due to the dependence on the random number generator which can produce different simulation results for the same initial parameters. This is overcome by parallelizing the models on different cores/computers. In Italy, the Calabria region was modelled using an ABM with a closed population of 250 agents moving within a square section of 250 × 250 m2 [9]. The number of agents was kept low to minimize the computational cost, which was reported to have an average CPU time for the simulations over the 90 day period to be 3.5 h on a computer with four cores and 6Gb of RAM. 
The model was assessed visually by comparing the simulated number of cases with those reported for Cambia and other regions of Italy. In Brazil, an ABM was created to replicate a closed society consisting of 300 agents made up of 5 agent types (people, houses, businesses, the government and the healthcare system) and simulated for a 2 month period [8]. The paper also models the financial impact, however there is no information provided on the calibration of the model. Across Ireland an ABM previously developed for modelling the spread of measles was extended into a hybrid model to represent COVID-19 infection for 12 small towns, with one agent representing each individual in the populations ranging in size from 73 to the largest town at 1782 people [10]. The overall hybrid model was compared with the ABM but no calibration with the real data was discussed. In France, an ABM was constructed for 500,000 agents and run for a period of 360 days [30]. The results were then extrapolated for the entire 64 million population of France. The model contained 194 parameters, 192 of which were estimated from the literature with the remaining two calibrated using the Nelder Mead method. To the best of our knowledge, this paper is the only one that fully calibrates their model using a number of different methods from visual inspection of the fit of the predicted to the observed data for the ICU admissions, bed occupancy and mortality (both daily and cumulative) to an evaluation of the resulting R² and Pearson's R estimates, ranging between 0.87 and 0.93. The Cochran–Mantel–Haenszel test was also carried out to compare the observed and predicted age distribution of deceased people with the conclusion that there was no significant difference. 
Other examples of ABMs for COVID-19 applications model the spread in facilities, such as the model developed by Cuevas [31], who conducts a number of experiments to mimic different scenarios in a facility with between 10 and 1000 agents, and the model developed by Araya [32], who simulates 100 construction worker agents working on one specific project for every hour of their working day over a period of 3 working months. Due to the lack of data, both models are hypothetical in nature and based on subject matter expert information rather than data.

Research motivation

The model presented in this paper is motivated by challenges faced by the authors when modelling the infection spread of COVID-19 for the Northern Ireland COVID-19 Modelling Group. Previous models listed a large number of fixed parameters which requires making a large number of assumptions which may be unrealistic or are too specific to the population in question to be easily transferable to Northern Ireland. There is only one paper [30] that clearly describes the calibration performed on their model and in many no description or calibration appears to be performed. Likewise, for those papers that consider a subsection of their population and wish to expand that to the entire population, it is unclear how extrapolation is performed if any. Additionally, all of the models previously reported are considering only wave one of COVID-19 pandemic and run the models for a short period of time. The main contributions of this paper are as follows: We propose a novel methodology based on the hybridisation of change point detection and agent based modelling techniques for modelling COVID-19 infection dynamics, and quantifying the effects of non-pharmaceutical interventions on a national level. The proposed methodology is programming language agnostic and enables researchers to develop models that can be run and calibrated on consumer grade hardware and not necessarily just on supercomputers. We demonstrate effectiveness of the methodology introduced by modelling the spread of the new daily confirmed COVID-19 cases in Northern Ireland during the period between the 9th March and 15th November 2020. This is accompanied by a detailed description of the entire calibration and validation process, as well as interpretations of the fitted parameters. 
We are able to successfully capture the role that subclinical (also known as asymptomatic) patients have in the spread of the virus and estimate their relative infectiousness, as well as successfully quantify the effects that the wearing of masks, one regional lockdown and two national lockdowns have had on reducing the spread of the virus. As far as we are aware, this study is the first of its kind to successfully capture COVID-19 infection dynamics over such a long period of time and to successfully model both infection waves. The paper proceeds as follows. In Section 2 we introduce the CP-ABM methodology and provide detailed descriptions of the algorithms of each of its components. In Section 3 we describe how we applied this methodology in Northern Ireland (as our case study) and provide a thorough discussion of the obtained results, drawing parallels with the outcomes found in other independent research studies and literature. Finally, in Section 4 we give our closing remarks and lay out directions for our future research.

Methodology

The main objective of the CP-ABM modelling approach is to realistically simulate COVID-19 infection transmission within a closed society living in a shared environment during the observed time period. The model assumes that disease transmission may occur only when an infectious (preclinical, clinical or subclinical) person comes into direct contact with a susceptible person, and that changes in the contact behaviour of people are the driving force behind COVID-19 infection rates [33]. The CP-ABM model consists of two components: the change point detection (CP) component and the agent based modelling (ABM) component. The CP component is used to identify the key events that result in changes in the contact behaviour (i.e. changes in the mean daily contacts) of individuals living in the modelled society and that affect disease transmission the most during the observed time period. The ABM component takes the outputs from the CP component and adds other NPIs that do not directly influence the mean daily contacts but that have an influence on infection dynamics (such as the mandatory wearing of masks), in order to appropriately adjust the behaviour of its agents over the course of the simulation and produce realistic results (Fig. 1).

Fig. 1

A schematic overview of the CP-ABM methodology.

An overview of the change point detection (CP) component

Many timeseries datasets can exhibit a sudden change in structure, such as unexpected jumps or drops in level or volatility [34]. The indexes that denote the moments in time in which the characteristics of timeseries abruptly change are called change points. Change points usually indicate changes in the underlying data generation mechanism, and given that the changes in social mixing patterns have a detrimental impact on the COVID-19 spreading dynamics, in our work we associate change points with the moments in the timeseries that correspond to those non-pharmaceutical interventions (NPIs) that played a critical role in governing contact behaviour between individuals in the population. In order to explain how the CP component works we need to introduce some basic definitions. Let us assume that the observed data for the confirmed daily COVID-19 cases is represented by the sequence of observations $D = \langle d_k \rangle_{k=1}^{N}$, where $N \in \mathbb{N}^+$. Also, let $E = \langle e_k \rangle_{k=1}^{Q}$, with $Q \le N$, be an increasing sequence of integers that can take the values between 1 and N (inclusive), representing the indexes of occurrence of all NPIs that were used for regulating social-mixing patterns in $D$ (e.g. lockdowns, opening/closing schools, etc.). The main task of the CP component is to find $T = \langle t^{(k)} \rangle_{k=1}^{M}$, an increasing sequence of integers that can take the values between 1 and N-1 (inclusive), that represent the change points in $D$ and correspond to the key NPIs from $E$ that were most influential on the spread of the disease. These change points segregate the timeseries $D$ into $M+1$ non-overlapping segments $d_{(t^{(i-1)}+1):t^{(i)}}$, $i = 1, \dots, M+1$, where $t^{(0)} = 0$ and $t^{(M+1)} = N$. In practice the exact number of change points is usually unknown [34], hence the problem of estimating $T$ can be formulated as a model selection problem where the objective is to find the best segmentation that minimizes the following quantification criteria:

$$\sum_{i=1}^{M+1} C\left(d_{(t^{(i-1)}+1):t^{(i)}}\right) + \mathrm{pen}(M) \qquad (1)$$

where $C(\cdot)$ represents a cost function that measures a goodness of fit for segment $d_{(t^{(i-1)}+1):t^{(i)}}$, while $\mathrm{pen}(M)$ is a carefully selected penalty parameter that increases with $M$ [35]. This penalty parameter is used to prevent overfitting by controlling the number of change points and thus penalizing for complexity. The choice of $C$ most often depends on the assumptions of how the distribution of the underlying dataset is parametrised. When it comes to the spread of COVID-19, a growing body of evidence suggests that the number of confirmed cases follows a power-law distribution [36]. Fagan et al. [37] demonstrate that no single change point detection method fully captures the behaviour of the heavy-tailed nature of the data of power-law distributions. To mitigate these issues, they propose a hybrid solution where the optimal segmentation that minimizes (1) is found by utilising the ED-PELT non-parametric change point detection algorithm in which the empirical distribution of $D$ is used to define $C$ [38], in conjunction with two penalty choices for $\mathrm{pen}(M)$: the modified Bayesian information criterion (mBIC) [39] and the change points for a range of penalties (CROPS) [35]. A schematic overview of the CP component that we use is shown in Fig. 2. The approach of Fagan et al. [37] is utilised in the approximation phase of our algorithm for the CP Detection Component to approximate the locations where the statistical properties of $D$ change. The pseudo-code of the approximation phase along with the identification, confirmation and segmentation phases is provided in Algorithm 1. The main task of the approximation phase (lines 2-9) is to find the optimal number of change points and their location in $D$, and in the case of misalignment to adjust the positions of detected change points to match the nearest NPIs from $E$ that were crucial in influencing social-mixing patterns in the population (identification phase, lines 10-19). This component also enables the modeller to include any additional NPIs of interest that are not fully captured by the previous two phases (NPI confirmation phase, lines 20-24) and to produce the final segmentation (segmentation phase, line 25) to inform the ABM component of the model to appropriately adapt the behaviour of its agents during the simulation.
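To illustrate how a penalised criterion of the kind in (1) selects both the number and the location of change points, the following minimal sketch solves it exactly by dynamic programming. It is not ED-PELT: as a simplifying assumption it swaps in a least-squares cost (sum of squared deviations from the segment mean) and a penalty that grows linearly with the number of change points. The function names and the toy series are hypothetical.

```python
from itertools import accumulate


def segment_cost(prefix, prefix_sq, start, end):
    """Sum of squared deviations from the mean of series[start:end]."""
    n = end - start
    total = prefix[end] - prefix[start]
    total_sq = prefix_sq[end] - prefix_sq[start]
    return total_sq - total * total / n


def penalised_change_points(series, penalty):
    """Minimise (sum of segment costs) + penalty * (number of change points).

    A returned change point t means a segment ends after the t-th
    observation (1-based), so t lies between 1 and N-1.
    """
    n = len(series)
    prefix = [0.0] + list(accumulate(series))
    prefix_sq = [0.0] + list(accumulate(x * x for x in series))

    best = [0.0] * (n + 1)   # best[e]: minimal penalised cost of series[:e]
    last = [0] * (n + 1)     # last[e]: start of the final segment in that optimum
    best[0] = -penalty       # the first segment carries no penalty
    for end in range(1, n + 1):
        best[end], last[end] = min(
            (best[start] + segment_cost(prefix, prefix_sq, start, end) + penalty, start)
            for start in range(end)
        )

    # Walk back through the stored segment starts to recover the change points.
    change_points = []
    end = n
    while last[end] > 0:
        end = last[end]
        change_points.append(end)
    return sorted(change_points)


# Toy series with three visibly different levels (assumed data, not real cases).
daily_cases = [5, 6, 5, 6, 5, 40, 41, 39, 40, 41, 12, 11, 12, 13, 12]
print(penalised_change_points(daily_cases, penalty=50.0))  # -> [5, 10]
```

Raising the penalty far enough removes all change points, which is the overfitting control described above. PELT-type algorithms target the same minimisation but prune candidate segment starts to avoid the quadratic cost of this exhaustive search.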

Fig. 2

A schematic overview of the processes inside the Change Point Detection Component.

Algorithm 1

The pseudo-code of the Change Point Detection (CP) Component of the CP-ABM model.

CP-COMPONENT($D$, $E$)
Input:	$D = \langle d_k \rangle_{k=1}^{N}$, $N \in \mathbb{N}^+$ ← a sequence representing a timeseries dataset containing the daily numbers of confirmed COVID-19 cases; $E = \langle e_k \rangle_{k=1}^{Q}$, $Q \le N$ ← an increasing sequence of integers taking values between 1 and N, representing all indexes in $D$ that correspond to NPIs that affected social mixing patterns
Output:	$T = \langle t^{(k)} \rangle_{k=1}^{M}$ ← an ordered sequence of change points representing the key NPIs from $E$ that affected disease spreading dynamics the most
1.	$T \leftarrow \langle\rangle$; $k \leftarrow 1$	▹ Initialisation
2.	$\hat{T}_{mBIC} \leftarrow$ estimated set of change points by applying ED-PELT with mBIC penalty (Eq. 1)	▹ Approximation Phase (Fig. 2a) [37]
3.	$\hat{T}_{CROPS} \leftarrow$ estimated set of change points by applying ED-PELT with CROPS (Eq. 1)
4.	if $|\hat{T}_{mBIC}| = 0$ and $|\hat{T}_{CROPS}| > 2$ then
5.	    return $T$
6.	else
7.	    $\hat{T} \leftarrow \hat{T}_{mBIC} \cap \hat{T}_{CROPS}$
8.	    $\hat{T}' \leftarrow (\hat{T}_{mBIC} \cup \hat{T}_{CROPS}) \setminus \hat{T}$	▹ interpretation required (steps 14-19)
9.	end if
10.	for each $\hat{t} \in \hat{T}$ do	▹ Identification Phase (Fig. 2b)
11.	    $t \leftarrow \arg\min_{e \in E} |\hat{t} - e|$
12.	    $T_k \leftarrow t$; $k \leftarrow k + 1$
13.	end for
14.	for each $\hat{t}' \in \hat{T}'$ do
15.	    $t \leftarrow \arg\min_{e \in E} |\hat{t}' - e|$
16.	    if interpretation leads to conclusion that $t$ is relevant then
17.	        $T_k \leftarrow t$; $k \leftarrow k + 1$
18.	    end if
19.	end for
20.	for each $e \in E \setminus T$ do	▹ Confirmation Phase (Fig. 2c)
21.	    if interpretation leads to conclusion that $e$ is relevant then
22.	        $T_k \leftarrow e$; $k \leftarrow k + 1$
23.	    end if
24.	end for
25.	return sort($T$)	▹ Segmentation Phase (Fig. 2d)
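The identification, confirmation and segmentation phases of Algorithm 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the two change point sets are taken as given inputs (the ED-PELT runs are not reproduced), the modeller's interpretation step is represented by a hypothetical `is_relevant` callback, and a set is used for $T$ so that two detected change points snapping to the same NPI are recorded once.

```python
def nearest_npi(change_point, npi_indexes):
    """Snap a detected change point to the closest NPI index (lines 11 and 15)."""
    return min(npi_indexes, key=lambda e: abs(change_point - e))


def cp_component(t_mbic, t_crops, npi_indexes, is_relevant):
    """Combine two change point estimates and align them with known NPIs.

    t_mbic, t_crops: sets of change points from the approximation phase.
    npi_indexes: increasing list of NPI indexes (E in Algorithm 1).
    is_relevant: hypothetical callback standing in for the modeller's
        interpretation (lines 16 and 21).
    """
    if len(t_mbic) == 0 and len(t_crops) > 2:      # line 4, as given in Algorithm 1
        return []
    agreed = t_mbic & t_crops                      # line 7
    disputed = (t_mbic | t_crops) - agreed         # line 8

    # Identification phase: change points found by both penalties are kept.
    selected = {nearest_npi(t, npi_indexes) for t in agreed}        # lines 10-13
    # Change points found by only one penalty need interpretation.
    for t in disputed:                                              # lines 14-19
        candidate = nearest_npi(t, npi_indexes)
        if is_relevant(candidate):
            selected.add(candidate)
    # Confirmation phase: NPIs not yet selected can still be added.
    for e in npi_indexes:                                           # lines 20-24
        if e not in selected and is_relevant(e):
            selected.add(e)
    return sorted(selected)                                         # line 25


# Hypothetical example: four NPIs, of which the modeller judges two relevant.
npis = [20, 45, 80, 120]
result = cp_component({19, 47}, {19, 82}, npis, is_relevant=lambda e: e in {80, 120})
print(result)  # -> [20, 80, 120]
```

In the example, index 19 is detected under both penalties and snaps to the NPI at 20, index 82 snaps to 80 and is accepted on interpretation, index 47 snaps to 45 and is rejected, and the NPI at 120 is added in the confirmation phase.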


An overview of the agent-based model component

The main objective of the ABM component is to realistically simulate COVID-19 infection dynamics inside a closed heterogeneous population inhabiting a shared closed environment. It also aims to estimate the effectiveness of the key NPIs that lead to changes in population mixing dynamics, as well as the effect that mask wearing has had on the disease spread during the observation time period. This ABM component consists of two types of entities: agents representing the age-stratified individuals inside the simulated population, and cells representing the environment in which these agents exist and interact. There are 7 cells in total, with each cell describing one of the mutually exclusive infection statuses $\varsigma$: $S_s$ (shielded susceptible), $S$ (susceptible), $E$ (exposed), $I_S$ (infectious and subclinical), $I_P$ (infectious and preclinical), $I_C$ (infectious and clinical) and $R$ (recovered). Each cell can contain multiple agents, with agents inheriting the infection status of the cell in which they are located (e.g. all agents on the cell with infection status $S$ are susceptible, while those on the cell with status $E$ are exposed, etc.). Every agent in the set representing all individuals in the simulation belongs to exactly one of the seven cells. An overview of the attributes characterising cells and agents is provided in Table 1.

Table 1

Overview of the entities and their attributes of the ABM component.

Entity	Attribute	Symbol	Possible Values	Description
cell	Status	$\varsigma$	$S_s, S, E, I_S, I_P, I_C, R$	Infection status representation
	n-agents	$\eta_\varsigma$	$\mathbb{N}^+$	Number of agents residing at the cell
agent	Age	$g$	$[0, 90]$	Age in years
	Status	$\varsigma$	$S_s, S, E, I_S, I_P, I_C, R$	Agent's infection status inherited from the status of the cell in which it is located
	Susceptibility	$s_g$	0.008 if $g \in [0,9]$; 0.007 if $g \in [10,19]$; 0.04 if $g \in [20,29]$; 0.06 if $g \in [30,39]$; 0.08 if $g \in [40,49]$; 0.16 if $g \in [50,59]$; 0.38 if $g \in [60,69]$; 0.52 if $g \in [70,90]$	Age stratified susceptibility to infection [40]
	Duration	$\delta_\varsigma$	$\sim \lceil \Gamma(\mu=3.0, k=4) \rceil$ if $\varsigma = E$; $\lceil \Gamma(\mu=2.1, k=4) \rceil$ if $\varsigma = I_P$; $\lceil \Gamma(\mu=2.9, k=4) \rceil$ if $\varsigma = I_C$; $\lceil \Gamma(\mu=5.0, k=4) \rceil$ if $\varsigma = I_S$	Duration in days that an agent spends in the $E$, $I_S$, $I_P$, $I_C$ cells before moving to another cell in the infection dynamics workflow (Fig. 3) [16]
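The agent attributes in Table 1 could be sampled along the following lines. This is a sketch under one stated assumption: $\Gamma(\mu, k)$ is read as a gamma distribution with mean $\mu$ and shape $k$, so the scale passed to the sampler is $\mu / k$. The ceiling follows the notation of Table 1, and the function and constant names are hypothetical.

```python
import math
import random

# (mean mu in days, shape k) per infection status, copied from Table 1.
DURATION_PARAMS = {"E": (3.0, 4), "IP": (2.1, 4), "IC": (2.9, 4), "IS": (5.0, 4)}

# (inclusive upper age bound, susceptibility) per age band, copied from Table 1.
SUSCEPTIBILITY_BANDS = [
    (9, 0.008), (19, 0.007), (29, 0.04), (39, 0.06),
    (49, 0.08), (59, 0.16), (69, 0.38), (90, 0.52),
]


def sample_duration(status, rng):
    """Draw the number of days an agent spends in the cell with this status."""
    mu, k = DURATION_PARAMS[status]
    # random.gammavariate takes (shape, scale); scale = mu / k gives mean mu
    # (assumed reading of the Gamma(mu, k) notation).
    return math.ceil(rng.gammavariate(k, mu / k))


def susceptibility(age):
    """Look up the age stratified susceptibility s_g for an age in [0, 90]."""
    if age < 0:
        raise ValueError("age must be non-negative")
    for upper, value in SUSCEPTIBILITY_BANDS:
        if age <= upper:
            return value
    raise ValueError("age outside [0, 90]")


rng = random.Random(42)
incubation_days = [sample_duration("E", rng) for _ in range(20000)]
print(min(incubation_days), sum(incubation_days) / len(incubation_days))
print(susceptibility(35), susceptibility(90))
```

Because of the ceiling, every sampled duration is at least one day and the sample mean sits roughly half a day above $\mu$.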

The ABM component conducts an iterative process (i.e. simulation) that is implemented through a series of discrete steps that we call ticks, with each tick in the simulation representing one day in the real world. The simulation starts by initialising the input parameters: the baseline relative infectiousness for each agent type, the number of agents in each state at the beginning of the simulation, the mean daily contacts in each segment between change points, the probability of exposed agents developing a subclinical infection at the end of the incubation period, and the variations attributed to additional behavioural effects. The output of the model is the estimated number of daily cases, which is then used to evaluate the effectiveness of the NPIs by comparing the estimated values from a number of different scenarios against the true observed data. The model is also able to implicitly capture the number of agents on each cell and track their interactions and behaviours at each tick, but in this paper we focus on the number of estimated daily cases. In order to produce outputs that are as realistic as possible, the number of agents used in the simulation and their age stratification should be initialised to match publicly available demographic data of the modelled population or geographical region. It is important to highlight that the size of the agent set remains constant within the simulated time period, that is, there is no change in the population demographics. The number of agents to be randomly allocated to each cell during the initialisation phase is controlled by the parameter $\eta_\varsigma$ (Table 1). In reality, the true number of agents with exposed, preclinical, subclinical and recovered infection status is unknown, so the parameters $\eta_\varsigma$, where $\varsigma \in \{E, I_P, I_S, R\}$, need to be either assumed or estimated through calibration. 
In addition, some susceptible agents may be protected and shielded from all other types of agent. Such individuals cannot be infected and are described as shielded susceptible. Shielded agents are located at the cell with status $S_s$, and the time at which the shielding occurs depends on the scenario being modelled. It has been assumed that the agents’ social mixing patterns are the driving force behind COVID-19 infection spread and that these patterns are directly affected by the key NPIs that are captured by the CP component as the change points $T$. These change points segregate the simulation time period into non-overlapping segments, with the agents’ social mixing behaviour at each tick being controlled by the mean daily contact rate of the segment in which that tick falls. The mean daily contact rates on these segments are usually not known in advance, and therefore they need to be estimated via calibration. Bearing in mind the modelling assumption that the infection can only spread throughout the population when there is an effective contact between infectious and susceptible individuals, the high computational costs can be reduced in the ABM component by employing a novel infection centric approach which captures the interactions of infectious agents only, i.e. the interactions of only those agents that are able to transmit the disease to others during the simulated time period. At every tick, each infectious agent encounters a number of other agents, where that number is drawn from a Poisson distribution whose mean is the mean daily contact rate of the current segment [41]. Disease transmission from the infectious agent to an encountered agent may happen only if the encountered agent is susceptible (and not shielding), and if a number drawn at random from the uniform distribution on $[0, 1]$ is less than the probability of infection of the encountered agent. In the case of successful disease transmission, the encountered agent becomes exposed and moves from the cell with status $S$ to the cell with status $E$ for a period of $\delta_E$ ticks, during which it is not infectious. 
At the end of the incubation period δaE, this agent will develop a subclinical infection with probability PE→IS and move to cell IS for a period of δaIS ticks, after which it will recover (have status recovered). Otherwise, the agent will develop a preclinical infection and move to cell IP. After spending a period of δaIP ticks in the IP cell, the agent will develop a clinical infection and move to cell IC, where it will stay for a duration of δaIC ticks until recovering. The recovery of the agent (from either subclinical or clinical infection) means that it is no longer infectious or prone to reinfection, and it moves to the cell with infection status R, where it remains until the end of the simulation time period. The durations δaE, δaIS, δaIP and δaIC (measured in ticks) represent the time that each infected agent spends in the cells E, IS, IP and IC, respectively, and are approximated separately for each agent by rounding a randomly drawn number from the gamma distribution to the nearest integer, with the μ and k parameters provided in Table 1. The entire infection dynamics workflow is depicted in Fig. 3.
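The duration sampling described above can be sketched in a few lines of Python. This is only an illustration, not the authors' NetLogo code: it assumes that the μ = 3.0 and k = 4 values quoted in Algorithm 2 are the mean and shape of the gamma distribution (so the scale is μ/k), and the function name `sample_duration_ticks` is hypothetical.

```python
import random


def sample_duration_ticks(mean, shape, rng):
    """Draw a gamma-distributed duration and round it to whole ticks (days).

    Assumption: `mean` is the distribution mean and `shape` is k,
    so the scale parameter is mean / shape.
    """
    scale = mean / shape
    return int(round(rng.gammavariate(shape, scale)))


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Incubation durations for 10,000 hypothetical exposed agents.
incubation = [sample_duration_ticks(3.0, 4, rng) for _ in range(10_000)]
mean_incubation = sum(incubation) / len(incubation)
print(f"mean incubation duration: {mean_incubation:.2f} ticks")
```

Because every agent receives its own draw, agents exposed on the same tick will generally leave the exposed cell on different ticks.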

Fig. 3

An overview of the infection dynamics workflow where the solid arrows represent possible transitions of agents between the cells and the dashed arrows indicate interactions between the agents where the transmission of infection may occur.

We always consider infectious agents to be all those agents with subclinical or preclinical infection, i.e. all agents that are located at the cells IS and IP at tick τ. Given that the NPIs usually require isolation of symptomatic patients, we introduced the parameter QτIC that controls whether the agents with a developed clinical infection are also going to be involved in spreading infection at tick τ or not. The parameter QτIC therefore determines the set of all infectious agents located at the infectious cells CτI during tick τ. Overall, the relative infectiousness of each infectious agent aI is affected by two main factors. The first factor is the baseline relative infectiousness rbςI of the agents, which depends on the cell in which they are located. We assume that the baseline relative infectiousness of agents with preclinical and clinical infection is 1 (corresponding to 100%) and that the agents who develop subclinical infection are less infectious than those who develop preclinical and clinical infection, as suggested by the evidence from the literature [5,16]. Because there is no clear agreement in the literature about how much smaller their relative infectiousness is, the value needs to be obtained via calibration. Once initialised, the baseline relative infectiousness remains constant throughout the simulation for all infectious agents.
The second factor that has a significant impact on the overall relative infectiousness of the agents is the introduction and relaxation of non-pharmaceutical interventions such as social distancing, mask wearing enforcement or hygiene promoting rules, whose effects cannot be captured solely by a decrease or increase of the mean daily contacts mτ. To capture this extra variation that is attributed to these external behavioural effects, we introduce a behavioural parameter bτ that takes a value of 0 if there is no such non-pharmaceutical intervention taking place at tick τ, and otherwise takes a real value from the [0,1] interval that is again obtained by calibration. The value of the parameter varies from intervention to intervention, and requires separate fitting for each intervention. The probability that the susceptible agent a will become infected at tick τ depends on its age stratified susceptibility to infection sag (Table 1) and on the overall relative infectiousness of the infectious agent that it encountered, and is thus calculated as PaS→E = rbςI · (1 − bτ) · sag (line 11 of Algorithm 2). Pseudo-code describing the entire process of the COVID-19 infection spread at each tick is provided in Algorithm 2. The algorithm starts by initialising the cells CτI containing the infectious agents and the cells Cτ′ containing the agents that these infectious agents can contact at tick τ. The only contact that cannot be made at any point in the simulation is the contact between the infectious agents and those that are shielded, that is, those that are located at cell SS. If the parameter QτIC requires agents to isolate after having developed a clinical infection, then the agents located at the cell IC are excluded from the contact list too (lines 1-2). The algorithm proceeds with the infection spreading phase as outlined above (lines 3-19).

Algorithm 2

The pseudo-code of the Agent Based Modelling (ABM) component of the CP-ABM model.

INFECTIONSPREAD(τ)
Input:	QτIC; rbςI; bτ; mτ
1.	CτI ← {IS, IP} if QτIC = TRUE; {IS, IP, IC} if QτIC = FALSE	▹ Initialisation
2.	Cτ′ ← C ∖ {SS, IC} if QτIC = TRUE; C ∖ {SS} if QτIC = FALSE	▹ Initialisation
3.	for each ςI ∈ CτI do	▹ Infection spreading
4.	    AτςI ← set of all agents located at the infectious cell ςI at tick τ
5.	    if AτςI ≠ ∅ then
6.	        for each aI ∈ AτςI do
7.	            NτaI ← Poisson(mτ)	▹ number of contacts agent aI has at tick τ
8.	            ANC′ ← set of NτaI randomly selected agents from C′ that are contacted by the agent aI
9.	            for each aς′ ∈ ANC′ do
10.	                if ς′ = "S" then
11.	                    PaS→E ← rbςI · (1 − bτ) · sag
12.	                    if U(0,1) < PaS→E then
13.	                        Move agent a from cell "S" to cell "E" and set δaE to [Γ(μ=3.0, k=4)] ticks
14.	                    end if
15.	                end if
16.	            end for
17.	        end for
18.	    end if
19.	end for
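
As a rough illustration of Algorithm 2, the sketch below implements one tick of infection spreading in Python. It is not the authors' NetLogo implementation: agents are plain dictionaries, the helper names (`poisson`, `infection_spread`) are hypothetical, the age-stratified susceptibility value is a placeholder rather than a value from Table 1, and Γ(μ=3.0, k=4) is assumed to mean a gamma distribution with mean 3.0 and shape 4.

```python
import math
import random


def poisson(mean, rng):
    """Knuth's Poisson sampler; adequate for the small means used here."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def infection_spread(agents, isolate_clinical, rb, b_tau, m_tau, susceptibility, rng):
    """One tick of infection spreading, driven only by the infectious agents."""
    # Lines 1-2: infectious cells and the cells whose agents can be contacted.
    infectious_cells = {"IS", "IP"} if isolate_clinical else {"IS", "IP", "IC"}
    excluded = {"SS", "IC"} if isolate_clinical else {"SS"}
    contactable = [a for a in agents if a["cell"] not in excluded]
    infectious = [a for a in agents if a["cell"] in infectious_cells]
    newly_exposed = 0
    # Lines 3-19: only infectious agents initiate contacts.
    for source in infectious:
        n_contacts = min(poisson(m_tau, rng), len(contactable))
        for target in rng.sample(contactable, n_contacts):
            if target["cell"] != "S":
                continue  # only susceptible, non-shielded agents can be infected
            p_infect = rb[source["cell"]] * (1.0 - b_tau) * susceptibility[target["age"]]
            if rng.random() < p_infect:
                target["cell"] = "E"
                # Incubation duration in ticks, assuming mean 3.0 and shape 4.
                target["incubation"] = int(round(rng.gammavariate(4, 3.0 / 4)))
                newly_exposed += 1
    return newly_exposed


rng = random.Random(1)
agents = [{"cell": "S", "age": "adult"} for _ in range(1000)]
for a in agents[:10]:
    a["cell"] = "IP"          # ten preclinical (infectious) agents
for a in agents[10:110]:
    a["cell"] = "SS"          # one hundred shielded agents
rb = {"IS": 0.39, "IP": 1.0, "IC": 1.0}
susceptibility = {"adult": 0.5}  # hypothetical value, not taken from Table 1
new_cases = infection_spread(agents, True, rb, 0.0, 7.6, susceptibility, rng)
print("newly exposed agents this tick:", new_cases)
```

The work done in the inner loops scales with the number of infectious agents and their contacts rather than with all pairs of agents, which is the point of the infection centric approach. The sketch still scans the whole population once per tick to build the contact list, which a production implementation would avoid by maintaining the cell membership incrementally.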

Results and discussion

To demonstrate the effectiveness of the CP-ABM methodology, we applied it to Northern Ireland (NI), one of the four countries that make up the United Kingdom. The dataset used consists of 251 days of observations of confirmed COVID-19 cases that were collected in NI in the period between the 9th March and 15th November 2020. In this period the devolved NI government introduced and/or lifted many interventions with the goal of preventing and controlling the spread of disease. We used the CP component described in Section 2.1 to identify which of the decisions affecting social-mixing patterns were the major drivers behind the changes in the COVID-19 infection rates. These key NPIs are used in the ABM component of the model (Section 2.2) to inform how infectious agents should adjust their contact behaviour as the simulation progresses. Table 2 contains a timeline of the occurrences of these interventions and the tick at which the agents should change their mean daily contact behaviour in the simulation. In the simulation time period the two tick intervals (19, 144] and (221, 251] correspond to the two national lockdowns that were introduced in March and October respectively, whereas the tick interval (206, 221] corresponds to a regional lockdown in the Derry and Strabane local government district (LGD) in October 2020. In NI the mandatory wearing of masks commenced on the 10th August 2020 and remained in place until the end of the observational study. The wearing of masks is incorporated into the model through its impact on the overall relative infectiousness, not the mean daily contacts, and is captured by the behavioural parameter bτ. An overview of all the behavioural and mean daily contact parameters that were used in the simulations is provided in Table 3. This requires fitting a total of 22 parameters, the process of which is described in Section 3.1.
If we were interested in modelling a shorter time window, this would require fitting fewer mean daily contact and behavioural parameters, and hence fewer parameters overall. This is entirely driven by the number of key NPIs identified by the CP component during the modelled time period. The timeline of these events plotted against the daily numbers of confirmed COVID-19 cases is illustrated in Fig. 4.

Table 2

Timeline of the key NPIs extracted by the CP component that most impact social mixing.

Date	Tick (τ)	Non-Pharmaceutical Intervention
28 Mar 2020	19	From the following day the first full national lockdown was imposed, preventing people from leaving home without reasonable cause and forcing businesses and schools to close. The restrictions also introduced the shielding programme for the elderly and vulnerable groups.
18 May 2020	70	Lockdown measures eased to allow a group of six to meet outdoors from the following day.
22 Jun 2020	105	Further lockdown measures eased to allow up to six people to meet indoors from the following day.
02 Jul 2020	115	Easing of the lockdown measures continues. From the following day all businesses in the hospitality sector (e.g. restaurants, hotels, pubs and cafes) are allowed to re-open.
31 Jul 2020	144	From the following day the entire shielding programme is suspended. This day can be considered as the end of the first national lockdown.
23 Aug 2020	167	Schools are reopened from the following day with pupils in grades seven, twelve and fourteen returning first.
13 Sep 2020	188	From the following day, freshers week in NI universities started.
01 Oct 2020	206	Due to increased number of confirmed COVID-19 cases, a localised circuit breaker lockdown was imposed on Derry and Strabane local government district from the following day limiting people's movement and reducing the operation of hospitality and leisure sectors.
16 Oct 2020	221	The circuit breaker lockdown was expanded to the entire nation including the closure of hospitality, close contact services and schools from the following day.
01 Nov 2020	237	From the following day, the schools were permitted to reopen.
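
The tick column of Table 2 can be reproduced from the dates with a short Python sketch. It assumes that tick 0 corresponds to the 9th March 2020, the first day of the observed dataset, which is consistent with every row of the table; the names `START` and `tick_of` are illustrative.

```python
from datetime import date

START = date(2020, 3, 9)  # assumption: first day of the dataset is tick 0


def tick_of(day):
    """Number of days (ticks) elapsed since the start of the simulation."""
    return (day - START).days


# Dates of the key NPIs as listed in Table 2.
npi_dates = [
    date(2020, 3, 28), date(2020, 5, 18), date(2020, 6, 22), date(2020, 7, 2),
    date(2020, 7, 31), date(2020, 8, 23), date(2020, 9, 13), date(2020, 10, 1),
    date(2020, 10, 16), date(2020, 11, 1),
]
npi_ticks = [tick_of(d) for d in npi_dates]
print(npi_ticks)
```

Since each intervention takes effect from the following day, a change point at tick τ closes the segment that ends at τ and the new behaviour applies from tick τ + 1.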

Table 3

Overview of all input parameters and their values for the ABM component.

Parameter	Value	Description
rbςI	1 if ςI = IP; 0.39 if ςI = IS	Baseline relative infectiousness of agents with developed subclinical, preclinical or clinical infection status [calibrated].
PE→IS	0.38	Probability of exposed agents developing a subclinical infection at the end of the incubation period. Otherwise the agent will develop a preclinical infection [calibrated].
m(0,19]	7.6	Mean daily contacts of the agents on the segment (τi,τj] [calibrated].
m(19,70]	5.4
m(70,105]	5.5
m(105,115]	6.3
m(115,144]	6.4
m(144,167]	6.7
m(167,188]	6.8
m(188,206]	7.4
m(206,221]	7.1
m(221,237]	5.6
m(237,251]	5.7
b(0,19]	0	Associated behavioural effect parameter of the agents on the segment (τi,τj] [calibrated].
b(19,144]	0.18
b(144,153]	0
b(153,206]	0.4
b(206,221]	0.41
b(221,251]	0.48
η0S	1,893,604	Number of agents at cell S calculated on initialisation as 1,893,667 − (η0SS + η0E + η0IS + η0IP + η0IC + η0R).
η0SS	0	Number of agents at cell SS on initialisation [observed].
η0E	29	Number of agents at cell E on initialisation [calibrated].
η0IS	3	Number of agents at cell IS on initialisation [calibrated].
η0IP	15	Number of agents at cell IP on initialisation [calibrated].
η0IC	16	Number of agents at cell IC on initialisation [observed].
η0R	0	Number of agents at cell R on initialisation [assumed].
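
To show how the piecewise-constant parameters in Table 3 might be looked up during a simulation, the sketch below maps a tick to its segment value and recomputes the number of initially susceptible agents. The values are copied from Table 3; the names (`segment_value`, `M_BREAKS` and so on) and the use of a binary search are illustrative choices, not part of the published model.

```python
from bisect import bisect_left

# Right-hand ends of the segments (τi, τj] and their calibrated values.
M_BREAKS = [19, 70, 105, 115, 144, 167, 188, 206, 221, 237, 251]
M_VALUES = [7.6, 5.4, 5.5, 6.3, 6.4, 6.7, 6.8, 7.4, 7.1, 5.6, 5.7]
B_BREAKS = [19, 144, 153, 206, 221, 251]
B_VALUES = [0.0, 0.18, 0.0, 0.40, 0.41, 0.48]


def segment_value(tick, breaks, values):
    """Return the value attached to the segment (lo, hi] that contains `tick`."""
    if not 0 < tick <= breaks[-1]:
        raise ValueError(f"tick {tick} is outside the simulated period")
    # bisect_left returns the first segment whose right end is >= tick.
    return values[bisect_left(breaks, tick)]


# Initial cell counts from Table 3; S is whatever remains of the population.
TOTAL_AGENTS = 1_893_667
initial_counts = {"SS": 0, "E": 29, "IS": 3, "IP": 15, "IC": 16, "R": 0}
eta0_S = TOTAL_AGENTS - sum(initial_counts.values())

print(segment_value(20, M_BREAKS, M_VALUES))   # mean daily contacts at tick 20
print(segment_value(154, B_BREAKS, B_VALUES))  # behavioural effect at tick 154
print(eta0_S)
```

Using the right-hand end of each segment as the search key keeps the half-open convention (τi, τj] explicit: tick 19 still belongs to the first segment, while tick 20 belongs to the second.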

Fig. 4

The number of new daily confirmed COVID-19 cases plotted alongside the timeline of NPIs identified by the CP component. The green dashed vertical lines indicate at which points the NPIs were made in this observation period (as outlined in Table 2) and the shaded backgrounds represent the duration of key NPIs such as the national and regional lockdowns, the wearing of facemasks and a period of no lockdown restrictions at all.

In order to model the COVID-19 infection dynamics as realistically as possible, the number of agents and their age characteristics used in the simulations conducted by the ABM component have been set to match exactly the publicly available demographic projections for the NI population for 2019 [42], provided by the Northern Ireland Statistics and Research Agency (NISRA). More specifically, the simulation involved 1,893,667 agents, of which 224,851 agents represented the elderly population aged 70+ years. At the beginning of the simulation, during the initialisation step (τ = 0), all non-elderly agents were randomly distributed across the cells as specified by the η0 parameters (Table 3). The parameter η0IC was set to correspond to the number of confirmed clinical cases that were reported on the 9th March, and given that we take this date as the starting point of our simulation, we assumed that there were no recovered cases at that point in time and therefore no agents were initialised at the cell with status R. The shielding programme in NI was active between the 29th March and the 31st July (inclusive) and required the isolation of all vulnerable population groups such as people with comorbidities, terminal illness or those aged 70 and older.
Given that we do not take into account factors other than age for modelling the spread of disease, to mimic real world behaviour in all of our simulations, the agents that represent the remaining susceptible elderly population enter the shielded state, moving from the cell with status S to the cell with status SS when the shielding programme starts and staying there until it ends (inclusive). It is important to highlight that while shielded, the elderly agents cannot be contacted by the infectious agents and thus cannot become infected. At the end of the shielding period, elderly agents become susceptible again and return to the cell with status S. To reflect the high level of adherence of the NI population to NPIs, we assumed that as soon as the agents develop clinical infection they go into isolation, where they cannot be involved in any kind of interaction with others (i.e. they can neither pass infection to others nor be contacted by other infectious agents), and hence it was specified that QτIC = TRUE for all τ. This implies that in our simulations we assume that the infection is transmitted only via agents who developed either preclinical or subclinical infection. The true number of these agents at the beginning of the simulation is unknown, so we used calibration to define how many agents need to be initialised at the cells with status IS and IP. All remaining agents (including all of those representing the elderly population) were initialised at the cell with status S.

Calibration and validation

For the purpose of the ABM component calibration we utilised the standard generational genetic algorithm (GA) [43] with population size 50, 15% mutation rate and 85% crossover rate, using tournament selection with tournament size 2. Each individual in the GA population is called a chromosome, and consists of a sequence of 20 value-encoded genes that represent values of input parameters for the ABM component. To address the stochastic nature of the ABM component and to identify the emerging infection dynamics pattern that it produces, the simulation runs are repeated 50 times (each time with a different random number generator seed) with the input parameter settings matching those of the corresponding chromosome. This enables the average fitness score to be calculated as follows:

$$\bar{F} = \frac{1}{50}\sum_{i=1}^{50} \mathrm{MSE}_i$$

where MSE_i is the mean square error of the difference between the estimated values of daily confirmed cases produced by the i-th simulation run and the actual observed data. This fitness function describes how closely the ABM component fits the data, with lower values indicating a better fit, and is calculated for each chromosome in the population separately. The GA iterates from one population of chromosomes to another, minimising the fitness function by giving preference to the chromosomes with lower fitness values, which are used for reproduction during the tournament selection process at the end of each generation. To identify the optimal values of the input parameters of the ABM component for the NI model, the GA ran continuously for almost 18 days and stopped when there was no further improvement in the fitness score. The fitness score of the best fit ABM component was . An overview of the input parameters and their values that correspond to the best fitted model is provided in Table 3.
All estimates produced by the ABM component are obtained by running the simulations 50 times using the same random number generator seeds as the ones used by the GA with the best solution (a choice made for reproducibility), and averaging the results obtained. To ensure that the ABM component produces realistic patterns and is fit for purpose, the patterns in its output had to agree with the patterns in the observed dataset, which was validated against several different criteria. Firstly, all GA estimated input parameters relating to the mean daily contacts are within the 95% confidence interval (CI) of the values reported by the independently conducted national COMIX study on social mixing patterns covering the same time period. Likewise, there is a good visual fit between the estimated cumulative and daily confirmed COVID-19 cases and the actual observations (Fig. 5 a)). This is supported by the R² and Pearson's r metrics, which were in both instances higher than 0.93 and 0.96, respectively. Furthermore, the Kolmogorov-Smirnov test that was used to compare the distributions of the estimated and actual observations of weekly numbers of confirmed COVID-19 cases (Fig. 5 b)) was non-significant. And finally, during the observed time period 47,162 cumulative confirmed COVID-19 cases were reported in total in NI, with the ABM component estimate being 47,164, indicating that there is no statistically significant difference between the estimated and actual observations (Fig. 5 c)).
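
The seed-averaged fitness used during calibration can be sketched as follows. This is a minimal Python illustration under stated assumptions: `toy_simulate` is a hypothetical stand-in for the ABM component (the real model is run through NetLogo and BehaviorSearch), and the observed series is synthetic.

```python
import random


def mse(estimated, observed):
    """Mean square error between two equally long series of daily cases."""
    return sum((e - o) ** 2 for e, o in zip(estimated, observed)) / len(observed)


def average_fitness(simulate, params, observed, seeds):
    """Average the MSE over repeated runs, one run per random seed."""
    scores = [mse(simulate(params, seed), observed) for seed in seeds]
    return sum(scores) / len(scores)


def toy_simulate(params, seed):
    """Hypothetical stand-in for the ABM: a noisy constant daily case count."""
    rng = random.Random(seed)
    return [params["level"] + rng.gauss(0, 1) for _ in range(30)]


observed = [10.0] * 30   # synthetic "observed" daily cases
seeds = range(50)        # 50 runs, each with its own seed

good = average_fitness(toy_simulate, {"level": 10.0}, observed, seeds)
bad = average_fitness(toy_simulate, {"level": 14.0}, observed, seeds)
print(f"fitness near the truth: {good:.2f}, fitness far from it: {bad:.2f}")
```

Reusing the same seeds for every chromosome means that two parameter sets are compared on the same random streams, so differences in fitness reflect the parameters rather than sampling noise.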

Fig. 5

The estimated number of new COVID-19 cases against time compared with the actual observed number of (a) daily, (b) weekly, and (c) cumulative confirmed COVID-19 cases.

Interpretation of fitted parameters

Following the validation of the model, we can focus our attention on the other ABM input parameters that have been estimated by the GA. We have estimated that the overall proportion (PE→IS) of people who contract COVID-19 and develop a subclinical infection is 38%, which falls within the 95% prediction interval of the proportions reported in the study by Buitrago-Garcia et al. [5]. There are no clear guidelines on the baseline relative infectiousness of people with developed subclinical infection; however, given that the model validates well against several different criteria and that the estimated PE→IS is in line with the findings reported in the literature, we estimate that their baseline relative infectiousness is 39%. The main effect attributed to lockdowns is the change in social mixing patterns captured by the mean daily contacts parameter. In addition, lockdowns can also lead to a change in other behavioural aspects of the population, which has been captured by the behavioural effects parameter. We assume that these additional behavioural changes are a direct result of other NPIs that were not directly impacting social mixing patterns but did have an impact on keeping the disease spread under control. These measures include, but are not limited to, the introduction of hand sanitising and 2 metre distancing rules in public places, special visors for employees, increased marketing campaigns and signage promoting hand hygiene. The introduction of the first national lockdown at the end of March 2020 (Table 2) led to a sharp decrease in the mean daily contacts from 7.6 to 5.4, and likewise the second national lockdown in the middle of October reduced the mean daily contacts from 7.1 to 5.6. Aside from the lockdowns, the most striking change in the mean daily contacts occurs on the 22nd June 2020, when there was a sharp increase from 5.5 to 6.3.
According to the NPIs at the time, this corresponds directly to the relaxation that permitted up to 6 people from different households to meet indoors. The other most striking change is on the 13th September 2020, when the mean daily contacts increased from 6.8 to 7.4, corresponding to the return of students to universities. In addition to the reduction in the mean daily contacts, we should draw our attention to the behavioural effects parameters. During the first lockdown, the captured behavioural effects were attributed to the national lockdown only and reported as b(19,144] = 0.18. After the first lockdown, the wearing of masks was introduced and made compulsory, which is captured by the behavioural effects parameter. The wearing of masks continued to be compulsory for the duration of the study, which means any further behavioural effects attributed to other lockdowns are in fact a mixed effect of wearing masks and lockdown. For a limited period of time (between the 10th August and the 1st October) there was an isolated effect attributed only to wearing masks, which was captured by the parameter b(153,206] = 0.40. With this in mind, the resulting behavioural effects parameter for the regional lockdown, b(206,221] = 0.41, implies an additional 0.01 effect for the lockdown on top of the 0.40 effect for the masks and the previously reported reduction in mean daily contacts. Likewise, we can follow the same reasoning for the second national lockdown, where the behavioural effect of b(221,251] = 0.48 represents an additional 0.08 effect for the lockdown on top of the 0.40 effect for the masks and the previously reported reduction in the mean daily contacts.

Evaluation of NPI effectiveness

In the period between the 10th August (tick 154) and 1st October (tick 206) only the wearing of masks was compulsory (blue coloured area in Fig. 4); therefore, the number of cumulative confirmed COVID-19 cases in this time period can be used to quantify the effectiveness of wearing masks alone (when there is no mixed effect). The quantification of its effectiveness was performed by investigating what the estimated numbers of cumulative confirmed cases would have been if the mask wearing rules had never been introduced, and comparing the estimated results with the actual data for the period between ticks 154 and 205 (inclusive). This can be achieved by setting the behavioural parameter corresponding to the time interval (153, 206] to 0, and running a series of 50 ABM component simulations for the 206 ticks, using the same random number generator seeds as the ones used by the GA with the best solution (a choice made for reproducibility). The estimated number of cumulative confirmed cases is averaged across all simulations and the percentage increase on the i-th day is calculated relative to the actual data using:

$$\Delta_i = \frac{P_i - A_i}{A_i} \times 100\% \tag{3}$$

where A_i and P_i are the actual and projected numbers of cumulative confirmed cases on day i, respectively. These results have been further validated by comparing the CP-ABM estimations with the ones found in the literature. In Germany, Mitze et al. [44] suggested that 20 days after becoming mandatory, face masks reduced the number of new COVID-19 infections by 45%. This is almost identical to what we have found with the CP-ABM model, where we estimate approximately a 46% increase in the number of reported cases on the 28th August, the 19th day after mask wearing became mandatory, had masks not been introduced. We extend Mitze et al.’s findings by estimating the impact beyond 20 days, noting that the number of reported cases would be double what was reported after only 24 days, and more than triple after only 30 days into the estimation time period.
After that, with each additional day that passes, the estimated number of confirmed cases rapidly grows towards a near complete spread, with the projected numbers being almost 21 times higher on the 1st October compared to the actual number that was reported in NI at the time, when mask wearing did take place. This translates to 267,915 (95% CI = [258,861; 276,969]) estimated confirmed cases on the 1st October if masks were never introduced, compared to the 12,792 confirmed cases that were actually observed. Daily cumulative growth projections of the number of confirmed COVID-19 cases in the scenario when masks were never introduced are illustrated in Fig. 6 a).
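
As a quick check of the mask scenario figures quoted above, the percentage increase can be recomputed in Python. The sketch assumes the usual definition of a percentage increase relative to the actual value, (projected − actual) / actual × 100; the function name is illustrative.

```python
def percentage_increase(actual, projected):
    """Percentage by which the projected cumulative cases exceed the actual ones."""
    return (projected - actual) / actual * 100.0


# Cumulative confirmed cases on the 1st October, as quoted in the text.
actual_cases = 12_792
projected_cases = 267_915  # scenario in which masks were never introduced

increase = percentage_increase(actual_cases, projected_cases)
ratio = projected_cases / actual_cases
print(f"increase: {increase:.0f}%  (projected / actual = {ratio:.1f})")
```

A ratio of just under 21 agrees with the statement that the projected numbers are almost 21 times higher than the observed ones.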

Fig. 6

The estimated percentage increase in cumulative confirmed COVID-19 cases (Eq. (3)) in Northern Ireland if: (a) the mask wearing rules were never introduced, (b) there was no first national lockdown, (c) there was neither a regional lockdown in the Derry and Strabane local government district nor a second national lockdown, and (d) there was no second national lockdown.

A similar approach was taken for quantifying the effectiveness of the lockdowns. Given the assumption that the NPIs introduced during lockdowns lead to changes in social-mixing patterns and other aspects of people's behaviour, the effectiveness of the lockdowns was estimated by altering both the mean daily contacts and behavioural effect parameters. Experiments are then simulated to consider what would have happened if lockdowns were never introduced, by setting the mean daily contacts and behavioural parameters during the lockdown to be equal to their respective values prior to lockdown and keeping them constant for the entire evaluation period. We conducted three experiments in total. In the first experiment we explore the impact of the first national lockdown by projecting the number of cumulative confirmed cases in the scenario in which the first national lockdown was never introduced, and hence perform the simulations with the mean daily contacts and behavioural effects parameters set to match those prior to the lockdown, i.e. m(0,19] = 7.6 and b(0,19] = 0, respectively. In the second experiment we explore the impact of the regional lockdown by projecting the number of cumulative confirmed cases in the scenario in which the regional lockdown and second national lockdown did not occur. This required setting the mean daily contacts to be the same as the ones that preceded the regional lockdown, m(188,206] = 7.4. As the behavioural effects of these two lockdowns are mixed effects, we nullified the part attributed to the lockdown and left the remainder that equated to only wearing masks, assuming that b(153,206] = 0.40.
Finally, in the third experiment we explore the impact of the second national lockdown by projecting the number of cumulative confirmed cases if the second national lockdown was never introduced, with only the regional lockdown taking place. Therefore, the mean daily contacts and behavioural effects parameters were set to match those of the regional lockdown, i.e. m(206,221] = 7.1 and b(206,221] = 0.41. For all three experiments the simulations were run in the same manner as when we estimated the effectiveness of the masks. The simulations for the first experiment were set up to run until tick 144, corresponding to the 31st July (the end of the first lockdown period), and for the other two experiments until tick 251, which is the 15th November (the end of the observed time period). The corresponding projected percentage increase in the number of cumulative confirmed cases relative to the observed data was calculated using Eq. (3). The projections for the first experiment focus on the time interval concerning the first lockdown, illustrated in Fig. 6 b). Similarly, the projections for the second experiment cover the time interval that overlaps with both the regional and second national lockdowns, illustrated in Fig. 6 c). The projections for the third experiment are based on the time interval that concerns the second national lockdown, as shown in Fig. 6 d). In the first experiment, if there was no national lockdown (Fig. 6 b)), we would have seen nearly 50 times more confirmed cases on the 31st July than what was actually reported in NI on that date. To be specific, the estimated number of confirmed cases would have been 295,252 (95% CI = [294,838; 295,666]) compared to the 5,917 confirmed cases reported at that time in the actual dataset. For the second and third experiments, the projections are not as stark due to the ongoing behavioural effect associated with wearing masks that was not present in the first lockdown.
In the case that there was neither a regional nor a second national lockdown, the estimated number of cumulative confirmed cases would have been almost 78% greater than the number reported on the 15th November in NI (Fig. 6 c)). This translates to an estimated 83,818 cases (95% CI = [79,750; 87,886]) in comparison to the 47,162 confirmed cases on the 15th November in NI. From Fig. 6 d) we can observe that having only a regional lockdown in the Derry and Strabane local government district was not a strong enough intervention on its own to prevent the spread of disease throughout the country; however, it did slow the rise in cases, as the projected number of cumulative confirmed cases is almost 47% greater than observed, as opposed to almost 78% greater when there is neither a regional nor a second national lockdown (Fig. 6 c)). The figures also demonstrate that without the second national lockdown being introduced, we would have expected an increase of 47% in the number of confirmed cases on the 15th November.
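
The three lockdown experiments amount to overriding parts of the calibrated parameter schedules. The Python sketch below shows one way this could be expressed; the dictionaries reuse the values from Table 3, while the helper `hold_constant` and the reading of which segments each experiment overrides are assumptions made for illustration.

```python
# Calibrated schedules from Table 3, keyed by the segment (start, end].
MEAN_CONTACTS = {
    (0, 19): 7.6, (19, 70): 5.4, (70, 105): 5.5, (105, 115): 6.3,
    (115, 144): 6.4, (144, 167): 6.7, (167, 188): 6.8, (188, 206): 7.4,
    (206, 221): 7.1, (221, 237): 5.6, (237, 251): 5.7,
}
BEHAVIOUR = {
    (0, 19): 0.0, (19, 144): 0.18, (144, 153): 0.0,
    (153, 206): 0.40, (206, 221): 0.41, (221, 251): 0.48,
}


def hold_constant(schedule, start, end, value):
    """Copy of `schedule` with every segment inside (start, end] set to `value`."""
    return {
        (lo, hi): value if start <= lo and hi <= end else original
        for (lo, hi), original in schedule.items()
    }


# Experiment 1: no first national lockdown (run until tick 144).
exp1_m = hold_constant(MEAN_CONTACTS, 19, 144, 7.6)
exp1_b = hold_constant(BEHAVIOUR, 19, 144, 0.0)

# Experiment 2: neither regional nor second national lockdown; masks only.
exp2_m = hold_constant(MEAN_CONTACTS, 206, 251, 7.4)
exp2_b = hold_constant(BEHAVIOUR, 206, 251, 0.40)

# Experiment 3: regional lockdown only, no second national lockdown.
exp3_m = hold_constant(MEAN_CONTACTS, 221, 251, 7.1)
exp3_b = hold_constant(BEHAVIOUR, 221, 251, 0.41)

print(exp2_m[(221, 237)], exp2_b[(221, 251)])
```

Each scenario is a copy, so the calibrated schedules stay untouched and the same seeds can be reused to compare a counterfactual run against the baseline.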

Implementation details

The NI COVID-19 infection dynamics model, built using the CP-ABM methodology that we introduce in this paper, is implemented using a combination of multiple technologies. The entire CP component algorithm and the statistical analysis of all the results are implemented using the R v4.0.3 programming language. The ABM component of the model is implemented using the NetLogo v6.1.1 programming language. The calibration of input parameters was performed using the BehaviorSearch tool [45], which interfaces with NetLogo to automate the exploration of the ABM parameter spaces using GAs [46]. The automated pipeline that connects all of these tools together was implemented using the Microsoft PowerShell v5.1 programming language. The GA calibration and NPI effectiveness quantification experiments presented in this paper were executed on an HP Z8 desktop workstation with 44 CPU (Intel Xeon) cores and 1.5 TB of DDR4 RAM. To reduce the running time, all experiments were parallelised and executed one per CPU core. The average running time of the ABM component simulations depends on the values of its input parameters. The calibrated ABM component simulation, with input parameters as in Table 3, can run on any common grade hardware that has the NetLogo programming language installed. The average running time per simulation was about 123 seconds, as tested on a Dell Precision 5750 laptop with an Intel i7-10875H CPU with 16 CPU cores and 64 GB of DDR4 RAM. However, using the parallelised approach to run 50 simulations with the same input parameters across all 16 cores takes approximately 45 minutes to complete. This performance is a significant improvement on what was reported in the literature [26,27] and previously outlined in Section 1.1.2.

Conclusion

This paper introduces the CP-ABM, a new state of the art methodology that is capable of accurately mimicing the disease dynamics for COVID-19. The application of the CP-ABM to the Northern Ireland population is included as a demonstration of its suitability and effectiveness for this purpose. The model covers the time period of 251 days and to the best of our knowledge, this study is the first of its kind that was able to successfully capture COVID-19 infection dynamics during such a long period of time and successfully model both infection waves. Genetic algorithms were used for calibration purposes to ensure that the exact infection dynamics patterns can be replicated by the CP-ABM. This was validated through the existing literature. We also capture the role played by the subclinical people in the infection dynamics workflow. The CP-ABM approach uniquely incorporates change point detection into agent based models in order to identify key events which leads to changes in contact behaviour between people which is the driving force behind the COVID-19 infection rates [43]. The ABM component, unlike the top down approaches of the equation based modelling techniques, allows the simulation of each individual's behaviour in the population. The resulting key events from the CP component are used in the ABM component to appropriately direct the adaptive behaviour of its agents to produce realistic results. The demand for high computational power has limited the application of ABMs on a population level for COVID-19. However, we overcome this issue by employing an efficient infection centric modelling approach which aims to capture the interactions of only those agents who are able to transmit the disease to others during the simulation. The proposed methodology enables researchers to develop models that can be run and calibrated on consumer grade hardware as it is programming language agnostic and does not necessarily require supercomputers. 
The new CP-ABM methodology is also able to quantify the effects of non-pharmaceutical interventions at a national level, which we demonstrated in use cases that estimate the effectiveness of masks and of regional and national lockdowns. This was achieved by simulating scenarios that consider what would have happened if those NPIs had never been introduced, and comparing the resulting projections of confirmed daily cases with the real observations. We find that the NPIs in Northern Ireland considered in our experiments were necessary and highly effective in preventing an even more extreme outbreak in the country. Although not explored in this paper, the model can be extended to capture the number of agents on each cell and their interactions at each tick during simulations. This can be used in future work to consider the interaction between agents in different settings, to evaluate scenarios involving vaccines and new COVID-19 variants, and to analyse in detail clusters that change dynamically over time. We demonstrated the CP-ABM capabilities by implementing the model on the NI population, but the model is not unique to NI and can therefore be applied to other countries and geographical areas (both large and small). The methodology is also not COVID-19 specific and hence may be used as a basis for creating epidemiological models that capture infection dynamics and assess NPI effects for other diseases in both human and animal health. Our research is implemented in the NetLogo language, a higher-level language for rapid development of agent based models. The speed could be significantly improved if the algorithm were implemented in a lower-level language such as C, C++ or Rust.
Also in this paper, we use standard genetic algorithms to calibrate the ABM component's hyperparameters; in the future we wish to explore other optimisation strategies to investigate whether they would yield results faster.
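The counterfactual comparison described above can be sketched in a few lines of Python. The function name is hypothetical and both daily series are invented purely for illustration; they are not results from the paper.

```python
def cases_averted(observed, counterfactual):
    """Compare a no-NPI projection with observations over the same days.

    Returns (total cases averted, fraction of counterfactual cases averted).
    """
    if len(observed) != len(counterfactual):
        raise ValueError("series must cover the same days")
    total_counterfactual = sum(counterfactual)
    averted = total_counterfactual - sum(observed)
    fraction = averted / total_counterfactual if total_counterfactual else 0.0
    return averted, fraction


# Made-up example data: observed daily cases versus a simulated no-NPI scenario.
observed_cases = [10, 12, 15, 14, 13]
no_npi_projection = [10, 14, 20, 28, 40]

averted, fraction = cases_averted(observed_cases, no_npi_projection)
print(f"averted: {averted} cases ({fraction:.1%} of the projection)")
```

In practice the counterfactual series would come from repeated stochastic simulation runs, so it would be natural to apply the same comparison to each run and report a range rather than a single number.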
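For readers unfamiliar with genetic algorithm calibration, the Python sketch below shows the general idea on a toy problem. It is not BehaviorSearch and not the authors' configuration: the one-parameter exponential model stands in for an ABM run, and the selection, crossover and mutation choices, as well as all names and values, are assumptions made for illustration.

```python
import random


def rmse(simulated, observed):
    """Root mean squared error between two equally long series."""
    return (sum((s - o) ** 2 for s, o in zip(simulated, observed)) / len(observed)) ** 0.5


def toy_model(growth, days, start=5.0):
    """Hypothetical stand-in for one ABM run: simple exponential growth."""
    return [start * (1.0 + growth) ** d for d in range(days)]


def calibrate(observed, bounds, pop_size=20, generations=30,
              mutation_sd=0.02, seed=1):
    """Minimal GA: truncation selection, averaging crossover, Gaussian mutation, elitism."""
    rng = random.Random(seed)
    low, high = bounds

    def error(growth):
        return rmse(toy_model(growth, len(observed)), observed)

    population = [rng.uniform(low, high) for _ in range(pop_size)]
    history = []  # best error at the start of each generation
    for _ in range(generations):
        ranked = sorted(population, key=error)
        history.append(error(ranked[0]))
        parents = ranked[: pop_size // 2]
        children = [ranked[0]]  # elitism: the best candidate always survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = (a + b) / 2.0 + rng.gauss(0.0, mutation_sd)
            children.append(min(high, max(low, child)))
        population = children
    best = min(population, key=error)
    return best, error(best), history


# Synthetic "observations" generated with a known growth rate of 0.1.
target = toy_model(0.1, days=20)
best_growth, best_error, history = calibrate(target, bounds=(0.0, 0.3))
print(f"best growth rate: {best_growth:.4f}, RMSE: {best_error:.4f}")
```

Because the best candidate is carried over unchanged, the best error can never get worse from one generation to the next, which makes it easy to compare this baseline with any alternative optimisation strategy.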

Funding

This research has been funded by Queen's University Belfast.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

35 in total

1. A modified Bayes information criterion with applications to the analysis of comparative genomic hybridization data.

Authors: Nancy R Zhang; David O Siegmund
Journal: Biometrics Date: 2007-03 Impact factor: 2.571

Review 2. A Survey on Mathematical, Machine Learning and Deep Learning Models for COVID-19 Transmission and Diagnosis.

Authors: Christopher Clement John; VijayaKumar Ponnusamy; Sriharipriya Krishnan Chandrasekaran; Nandakumar R
Journal: IEEE Rev Biomed Eng Date: 2022-01-20

3. Mathematical modeling of COVID-19 transmission dynamics with a case study of Wuhan.

Authors: Faïçal Ndaïrou; Iván Area; Juan J Nieto; Delfim F M Torres
Journal: Chaos Solitons Fractals Date: 2020-04-27 Impact factor: 5.944

4. Modeling and analysis of different scenarios for the spread of COVID-19 by using the modified multi-agent systems - Evidence from the selected countries.

Authors: Yaroslav Vyklyuk; Mykhailo Manylich; Miroslav Škoda; Milan M Radovanović; Marko D Petrović
Journal: Results Phys Date: 2020-12-09 Impact factor: 4.476

5. Modeling the spread of COVID-19 on construction workers: An agent-based approach.

Authors: Felipe Araya
Journal: Saf Sci Date: 2020-09-29 Impact factor: 4.877

6. Quantifying the impact of physical distance measures on the transmission of COVID-19 in the UK.

Authors: Christopher I Jarvis; Kevin Van Zandvoort; Amy Gimma; Kiesha Prem; Petra Klepac; G James Rubin; W John Edmunds
Journal: BMC Med Date: 2020-05-07 Impact factor: 8.775

7. The effect of control strategies to reduce social mixing on outcomes of the COVID-19 epidemic in Wuhan, China: a modelling study.

Authors: Kiesha Prem; Yang Liu; Timothy W Russell; Adam J Kucharski; Rosalind M Eggo; Nicholas Davies; Mark Jit; Petra Klepac
Journal: Lancet Public Health Date: 2020-03-25

8. MetaCOVID: A Siamese neural network framework with contrastive loss for n-shot diagnosis of COVID-19 patients.

Authors: Mohammad Shorfuzzaman; M Shamim Hossain
Journal: Pattern Recognit Date: 2020-10-17 Impact factor: 7.740

9. Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship, Yokohama, Japan, 2020.

Authors: Kenji Mizumoto; Katsushi Kagaya; Alexander Zarebski; Gerardo Chowell
Journal: Euro Surveill Date: 2020-03
