Jacqueline D Seufert1, Andre Python2, Christoph Weisser3, Elías Cisneros4,5, Krisztina Kis-Katos3, Thomas Kneib3. 1. Katholieke Universiteit Leuven Leuven Belgium. 2. Center for Data Science Zhejiang University Hangzhou PR China. 3. University of Göttingen, Campus Institute Data Science Göttingen Germany. 4. University of Göttingen Göttingen Germany. 5. University of Texas at Austin Austin USA.
Abstract
A rapid response to global infectious disease outbreaks is crucial to protect public health. Ex ante information on the spatial probability distribution of early infections can guide governments to better target protection efforts. We propose a two-stage statistical approach to spatially map the ex ante importation risk of COVID-19 and its uncertainty across Indonesia based on a minimal set of routinely available input data related to the Indonesian flight network, traffic and population data, and geographical information. In a first step, we use a generalised additive model to predict the ex ante COVID-19 risk for 78 domestic Indonesian airports based on data from a global model on the disease spread and covariates associated with Indonesian airport network flight data prior to the global COVID-19 outbreak. In a second step, we apply a Bayesian geostatistical model to propagate the estimated COVID-19 risk from the airports to all of Indonesia using freely available spatial covariates including traffic density, population and two spatial distance metrics. The results of our analysis are illustrated using exceedance probability surface maps, which provide policy-relevant information accounting for the uncertainty of the estimates on the location of areas at risk and those that might require further data collection.
A rapid response to global infectious disease outbreaks is crucial to protect public health. Ex ante information on the spatial probability distribution of early infections can guide governments to better target protection efforts. We propose a two-stage statistical approach to spatially map the ex ante importation risk of COVID-19 and its uncertainty across Indonesia based on a minimal set of routinely available input data related to the Indonesian flight network, traffic and population data, and geographical information. In a first step, we use a generalised additive model to predict the ex ante COVID-19 risk for 78 domestic Indonesian airports based on data from a global model on the disease spread and covariates associated with Indonesian airport network flight data prior to the global COVID-19 outbreak. In a second step, we apply a Bayesian geostatistical model to propagate the estimated COVID-19 risk from the airports to all of Indonesia using freely available spatial covariates including traffic density, population and two spatial distance metrics. The results of our analysis are illustrated using exceedance probability surface maps, which provide policy-relevant information accounting for the uncertainty of the estimates on the location of areas at risk and those that might require further data collection.
In December 2019, local hospitals in the Chinese city of Wuhan in Hubei province reported a surge of pneumonia cases, which was later confirmed to have been caused by the virus species severe acute respiratory syndrome‐related coronavirus (SARS‐CoV‐2, CSG, 2020; Zhu et al., 2020). The disease associated with the newly discovered virus became known as COVID‐19. By the end of December 2021, the virus has spread across the globe, infecting about 277 million people and leading to 5.4 million deaths (WHO, 2022).Indonesia's population has been particularly affected by the virus, with an estimated total of about 4.3 million confirmed cases and 144,000 deaths on by the end of the year 2021 (WHO, 2022). The demographic characteristics and geographical configuration of Indonesia make it extremely vulnerable to the spread of contagious diseases. It is the largest archipelago in the world, has the fourth largest population and represents a major hub connecting East Asia, South Asia and Oceania. The impact of COVID‐19 on the Indonesian public health has been exacerbated by a high prevalence of comorbidities and clinical risk factors. About 39% of its population regularly consumes tobacco, and the country counts the seventh largest number of diabetic cases (10.7 million) as well as the third largest number (29.1 million) of prediabetic cases globally (Brake et al., 2020; Del Rio & Malani, 2020).The estimation of national and local infection rates in Indonesia remains challenging for two main reasons. First, the country lacks of sufficient testing capacities (Rachman et al., 2020). While health centres are accessible in the metropolitan area of Jakarta and in the province of Bali, their spatial coverage is sparse and they are poorly accessible in remote provinces such as Papua and West Papua (Rachman et al., 2020). As of 7 December 2020, Indonesia has tested on average about 0.13 individuals per 1000 inhabitants, which is a relatively low testing rate in comparison with neighbouring countries, such as the Philippines (0.3), Malaysia (0.54) and Singapore (5.15) (Hasell et al., 2020). Second, asymptomatic and mild cases have been systematically excluded from testing (van Empel et al., 2020). Since about 80% of COVID‐19 cases are asymptomatic or exhibit mild symptoms (Vital Surveillances, 2020), a large number of infected individuals have not been detected, which partly explains the discrepancy between the reported and real numbers of COVID‐19 cases in Indonesia (van Empel et al., 2020).Global studies have shown that population density is a key factor that determines human‐to‐human transmission of infectious diseases (Hughes, 2020; Kucharski et al., 2020). The COVID‐19 transmission among humans tends to occur indoors, especially during cold weather periods (Ahlawat et al., 2020). COVID‐19 initially spread from Wuhan to other regions through the worldwide air and sea traffic network (Python et al., 2022; Zhang et al., 2020). The role of global transport has also been observed previously in the spread of the severe acute respiratory syndrome (SARS), another coronavirus disease (Tatem et al., 2006). Due to modern transport systems' capacity to carry contagious diseases over long distances in a short time, local strategies are often rendered ineffective. In a global study, Gunthe and Patra (2020) found a strong positive correlation between importation risk, estimated from global air traffic and the official reported numbers of COVID‐19 cases imported from abroad. Maier and Brockmann (2020) assessed the risk for major airports in the early stages of the COVID‐19 pandemic and provide insights into the most likely transmission routes from a location in the impacted region. This model is the foundation of the first stage of the approach proposed in this paper.For the case of Indonesia, quantitative research has mainly focused on examining the role of meteorological factors on the spread of COVID‐19 in the metropolitan area of Jakarta, but has not yet paid attention to socio‐economic drivers of the disease. Tosepu et al. (2020) documented a weak negative correlation between temperature and virus infections in Jakarta. Asyary and Veruswati (2020) showed a weak positive correlation between sunlight exposure and COVID‐19 recovery. Rendana (2020) found a low negative correlation between wind speed and early COVID‐19 infections.We aim to provide a method to assess the ex ante risk of COVID‐19 at a fine spatial scale across Indonesia which can be applied when the availability of data is limited (or biased). Spatial regression models based on current surveillance and prevalence survey data would be an obvious choice for this; the sparse spatial coverage and the selection bias associated with the systematic exclusion of asymptomatic cases would reduce the reliability of such models based exclusively on official COVID‐19 data. As an alternative or complementary approach, we suggest a two‐step modelling procedure that relies on a minimal set of routinely available input data related to the Indonesian flight network, traffic and population data, and geographical information allowing us to assess the ex ante spatial importation risk (i.e. the risk before the virus started to spread in Indonesia) as well as its uncertainty.The general workflow of our approach is summarised in Figure 1. In the first step, we utilise the importation risk data for all international airports of Indonesia available from the global model on disease spread by Maier and Brockmann (2020), to build an initial risk assessment for all flight connections associated with these airports based on information on flight connections and flight volume. We then complement these data with various characteristics of the Indonesian domestic flight network based on flight data from December 2019 to January 2020, which is prior to the global COVID‐19 outbreak (aviationstack, 2020). The data cover specifically a period from 12 January 2019 to 1 July 2020. The data set contains all available flight connections to, from and between Indonesian airports during this period. The data are not publicly available but can be purchased from specialised providers. With this information, we estimate an additive model with the connection‐specific risk as response variable and the network characteristics as potential determinants. This Airport Network Model finally serves as the basis for predicting the risk for all flight connections from which, in turn, the importation risk for all 78 Indonesian airports can be determined. As the output of our first step, we obtain localised importation risks at each domestic airport.
FIGURE 1
Workflow for predicting the ex ante COVID‐19 infection risk in Indonesia at a fine spatial scale.
Workflow for predicting the ex ante COVID‐19 infection risk in Indonesia at a fine spatial scale.In the second step, we aim at propagating the ex ante risk from the domestic airports to all of Indonesia at a 5 × 5 resolution. For this, we implement a Bayesian geostatistical model (the Spatial Propagation Model) using a set of relevant spatial covariates that describe the local traffic and population density and spatial proximity to the airports and highly populated islands. The Spatial Propagation Model is validated through a leave‐one‐out cross‐validation approach. Furthermore, the predictions are compared with official COVID‐19 infections data between March and May 2020 (Kawal COVID, 2021).The methodology suggested in this paper can help policymakers design and implement preventative measures in the presence of data sparsity and selection bias in the early stages of an epidemic disease outbreak. By assessing the ex ante distribution of COVID‐19 risk while accounting for its uncertainty, it not only provides policymakers with early information on the potential spatial spread of the disease but also highlights areas where early testing may be the most helpful. Hence, it can help reduce the socioeconomic damages of potential future outbreaks of COVID‐19 variants or other contagious diseases.This paper is organised as follows: Section 2 implements the prediction of the ex ante importation risk at the 78 Indonesian airports while Section 3 deals with the spatial propagation of the ex ante risk to all of Indonesia. Section 4 discusses the findings and limitations of our study and Section 5 provides a summary of the findings and suggests potential future research paths that could extend our work.
AIRPORT NETWORK MODEL: PREDICTING THE EX ANTE IMPORTATION RISK AT AIRPORTS
General strategy
Before starting to discuss the ingredients of the Airport Network Model in detail, we summarise the general strategy visualised in Figure 2. Our aim is to determine ex ante importation risks for all 78 Indonesian airports forming the flight network represented in Figure 2a where the vertices in the outer circle represent the airports while the edges connecting the vertices represent the 438 flight connections between these airports. Ex ante COVID‐19 importation risks are available for the seven international Indonesian airports (green vertices in Figure 2a) from Maier and Brockmann (2020) (data freely available here: http://rocs.hu‐berlin.de/corona/docs/model/importrisk). A total of 132 flights are associated with these international airports (green edges (training) in Figure 2b) such that a risk weight can be assigned to the corresponding edges in the network. To achieve this, we proceed as illustrated in Figure 2c, we calculate the total number of outgoing flights for each airport, and assign a corresponding risk weight to each outgoing edge (i.e. flight connection) for the seven international airports. Assume, for example that airport 1 (green vertex) has a COVID‐19 risk equal to 0.8 and the following number of departure flights to its connected airports (with a total number of departure flights of 10): 4 (edge 1‐2), 3 (edge 1‐3), 2 (edge 1‐4) and 1 (edge 1‐5). Hence, the relative risk weights are assigned to the connections (green edges) as follows: 0.8 × 4/10 (edge 1‐2), 0.8 × 3/10 (edge 1‐3), 0.8 × 2/10 (edge 1‐4) and 0.8 × 1/10 (edge 1‐5).
FIGURE 2
Airport Network Model workflow. (a) Initial network with the seven international airports highlighted in green, (b) Training network of flights associated with these seven airports, (c) Connection risk assignment, (d) All domestic flight connections, (e) Predicted risk for these connections and (f) Predicted risk in all airports.
Airport Network Model workflow. (a) Initial network with the seven international airports highlighted in green, (b) Training network of flights associated with these seven airports, (c) Connection risk assignment, (d) All domestic flight connections, (e) Predicted risk for these connections and (f) Predicted risk in all airports.Using the risk value attributed to each of the 132 connections associated with the seven airports, we implement a generalised additive model (GAM, cf. Section 2.3 below for details) to predict the COVID‐19 risk for the 306 remaining connections (red edges in Figure 2d) using various network characteristics as covariates. Based on the predicted risk weights that are obtained for all 438 connections (green edges in Figure 2e), the final step of the Airport Network Model consists of assigning COVID‐19 risks to the 78 airports, including those not in Maier and Brockmann (2020)'s data (71 red vertices in Figure 2e). We calculate the COVID‐19 risk for each airport (green vertices, Figure 2 (f)), as a normalised strength defined as the weighted sum of all incident flight connections risk.
Network characteristics
The Airport Network Model predicts the ex ante COVID‐19 risk for the 78 airports of the domestic Indonesian network based on various characteristics of this network. These network characteristics are calculated from flight connection data (available for a charge from aviationstack, 2020), which contain the locations of all Indonesian domestic airports along with their domestic flight frequencies for the period from 1 December 2019 to 7 January 2020. Most flights transit through Jakarta airport (CGK) and a few other large airports such as Bali (DPS), Sedati (SUB) and Makassar (UPG). Figure A1 illustrates the distribution of the frequency of flights.
FIGURE A1
Indonesian domestic flight network. Flight connections and their frequency (denoted by line thickness) among the 78 domestic Indonesian airports from December 2019 to early January 2020. Data source: aviationstack (2020).
From the primary data sources (aviationstack, 2020; Maier & Brockmann, 2020), we extract two types of metrics: flight (or connection‐specific) metrics that describe airport connectivity, flight frequency, volume and geographical distances between each respective Indonesian airport, and network metrics that describe the role of each airport within the structure of the Indonesian flight network.The network characteristics considered in the following can be roughly grouped into metrics associated with vertices or nodes (i.e. airports) and edges (i.e. flight connections). Vertex metrics include variables associated with the centrality of each airport, such as incoming and outcoming strength, eigenvector centrality, betweenness centrality and transitivity. Edge metrics include measures of connectivity between airports, such as topological distance, similarity, connectivity and Euclidean distance. Furthermore, we consider the structure of the Indonesian flight network by including indices of traffic volume and the frequency of departure flights.For all metrics discussed in the following, let A denote the adjacency matrix encoding the network structure of the N = 78 airports of the Indonesian flight network. It is an (N × N) matrix with one encoding a direct flight connection, and zero otherwise. Note that we are considering a directed network representation, where the row index in A refers to the departure airport while the column index represents the destination. An airport (vertex in the network) is considered incident on a flight connection (edge in the network), if it is the destination of that edge.
Vertex metrics
Strength: We consider two measures of strength representing the total incoming traffic () and total outgoing traffic () for each airport v (Kolaczyk, 2009).Eigenvector centrality: The eigenvector centrality (, also called eigencentrality or prestige score) is defined as the largest eigenvalue of the adjacency matrix A of the network, representing a measure for the total influence of a vertex in a network (see Kolaczyk, 2009, for details). Compared to the strength, it does not only consider direct flight connections, but also indirect influences via the neighbours. In a flight network, airports connected with influential hubs have a higher eigenvector centrality than those linked with less connected hubs, since they may be exposed to a larger number of indirect connections.Transitivity: The transitivity (, also called clustering coefficient) is a measure for the number of links between the vertices within the local neighbourhood of a vertex relative to the number of possible links between them (see Wang et al., 2017, for details).Betweenness: The number of shortest paths between any pair of vertices that pass through a vertex v defines the betweenness centrality (, Freeman, 1978). This metric provides information on the mediating role of a vertex in a network. In our case study, a high value of is expected in Jakarta airport (CGK in Figure A1) since a large number of domestic flights in Indonesia go through this airport and it therefore has a central role in the Indonesian flight network.
Edge metrics
Shortest path distance: The shortest path distance () is defined as the minimum number of edges between a pair of airports u and v (Kolaczyk, 2009, p. 17).Similarity: Similarity () combines information from the adjacency matrix with information on the degree (i.e. the number of incoming edges) of the vertices in a network, see Dice (1945). Therefore, this metric provides information on the number of mutual connections an airport v has with an airport u.Connectivity: The airport connectivity () between two vertices u and v represents the minimum number of vertices that have to be removed from a network to isolate airport u from airport v (White & Harary, 2001). Connectivity is useful to assess the cohesiveness of vertices within a network.In addition, we consider the Euclidean distance (in km) between each pair of airports (), the number of flight connections per route () and the proportion of departures per route ().
Additive model for the flight‐specific risk weights
We use a GAM to estimate the ex ante COVID‐19 risk weights for each domestic flight connection in Indonesia and subsequently predict the COVID‐19 infection risk at each airport. Initial risk values for each flight connection l = {1, …, 132} associated with the seven main airports in Indonesia are derived from Maier and Brockmann (2020) (see Appendix A.2 for further details). We log‐transform the initial risks due to data imbalances—a large number of observations have low risk values—such that a normal distribution can be reasonably assumed. Models based on a normal distribution exhibit a better predictive performance compared to a gamma distribution (see Appendix A.2.1 for detailed results). As potential covariates, we consider the metrics discussed in the previous section. Colizza et al. (2006) point out that flight metrics are unlikely to be linearly associated with airport risk. Furthermore, Zhang and Fu (2009) argue that a link between flight connectivity and risk is more likely to be non‐linear. Accordingly, we consider smooth, non‐linear effects of the vertex metrics via penalised splines (Wood, 2006) while restricting the edge metrics to linear effects.To select the covariates, we carry out a cross‐validation procedure and split our data into a training (70%) and a testing (30%) set. From 4139 models, we pre‐select 20 candidates based on the Akaike information criterion (AIC) with a stepwise procedure. Based on an out‐of‐sample procedure, we compute the root mean squared error (RMSE) and mean average error (MAE) metrics for each pre‐selected candidate (further details in Appendix A.2.2).
Results
We report the results of three models with the best predictive performance scores. The Parametric Model includes all covariates while assuming linear effects for all of them. The Full Model includes all investigated covariates (with linear effects for the vertex metrics and smooth, non‐linear effects for the edge metrics). The Stepwise Model includes three edge metrics and four vertex metric covariates, leading to the model equation:
where represents the overall intercept, are parametric effects of the vertex metrics betweeness BET, transitivity TRA, inward strength IN.STR, and outward strength OUT.STR, and , and are non‐linear effects of the edge metrics connectivity CONNECT, number of flights NB.FLY and proportion of departures DEP.FLY.Table 1 provides a summary of the results for our three model specifications. The results show that the Full Model and Stepwise Model have both a higher predictive performance compared to the Parametric Model. Their RMSE and MAE scores are similar. Following the parsimony principle, we choose the Stepwise Model since it exhibits a lower degree of complexity for a similar predictive performance.
TABLE 1
Airport Network Model. Summary for the Parametric Model, the Full Model, and the (Stepwise Model)
Parameter estimation: mean (st. errors)
Parameters
Type
Parametric Model
Full Model
Stepwise Model
Linear covariates
IN.STR
Vertex metric
−35.56*
−56.10***
−38.74***
(14.88)
(13.92)
(10.52)
OUT.STR
Vertex metric
34.09*
54.79***
36.06***
(14.51)
(14.27)
(10.30)
BET
Vertex metric
−2.87**
−3.26***
−2.92***
(0.94)
(0.67)
(0.66)
TRA
Vertex metric
‐30.36∗
−49.29***
−38.07***
(12.83)
(10.75)
(9.00)
CEN
Vertex metric
−3.23*
(1.60)
CONNECT
Edge metric
0.18***
(0.03)
NB.FLY
Edge metric
0.00
(0.00)
DEP.FLY
Edge metric
7.02*
(2.31)
intercept
−10.19***
−4.60**
−6.93***
(1.69)
(1.64)
(1.19)
Smooth terms
SIM
Edge metric
0.00
(9.00)
CONNECT
Edge metric
1.23**
1.10*
(9.00)
(9.00)
NB.FLY
Edge metric
5.47***
5.31***
(9.00)
(9.00)
DEP.FLY
Edge metric
1.74*
2.12***
(9.00)
(9.00)
G.DIST.AIRPORT
Edge metric
0.00
(9.00)
E.DIST.AIRPORT
Edge metric
0.00
(9.00)
Predictive performance
In‐sample
AIC
517.25
424.87
425.96
R2
0.59
0.81
0.80
Out‐of‐sample
RMSE
0.007937
0.000148
0.000129
MAE
0.000774
0.000044
0.000043
Num. obs.
132
132
132
***p < 0.001; **p < 0.01; *p < 0.05.
Airport Network Model. Summary for the Parametric Model, the Full Model, and the (Stepwise Model)***p < 0.001; **p < 0.01; *p < 0.05.The results of the Stepwise Model indicate that all parametric coefficients are highly significant (p < 0.001). The estimated means of the coefficients associated with IN.STR and OUT.STR are about −44 and 42, respectively. Hence, a one percentage point increase in in‐strength or out‐strength, corresponds to a log‐risk reduction of about 44% or a log‐risk increase of 42%, respectively, while keeping the other covariates constant. In line with the theory of importation risk (Maier & Brockmann, 2020), the results show that airports with a larger frequency of departing flights are more likely to act as hubs and end up at risk. The negative sign of IN.STR shows that more remote airports that count a relatively larger number of landing but fewer departing flights are associated with a lower risk.Airports that have a larger ability to control the traffic flow passing through the network, and hence demonstrate higher levels of betweenness (BET), are expected to show a smaller importation risk of the disease. According to the results, a one percentage point increase in BET corresponds to a (log) risk reduction of about 2.7%. Airports with high values of betweenness might be associated with stronger authority over neighbouring clusters or may be closer to the periphery of clusters. High values of transitivity are associated with high potential for clustering. Therefore, we expect a higher risk in areas that are outside clusters, which is confirmed by the negative effect of TRA. We expect an increase of risk along the smooth terms (visualised in Figure 3) measuring connectivity (CONNECT), the number of flight connections per route (NB.FLY) and proportion of departures per route (DEP.FLY). More connected and busy airports are likely to be at higher risk due to the potential spread of the virus via their many flight connections (Coelho et al., 2020; Daon et al., 2020).
FIGURE 3
Airport Network Model. Mean estimated marginal effect (red lines) on the log COVID‐19 risk of three edge metric covariates. Top‐left: number of flight connections per route (NB.FLY); top‐right: connectivity (CONNECT); bottom‐left: proportion of departures per route (DEP.FLY). The blue dashed lines indicate 95% pointwise confidence intervals.
Airport Network Model. Mean estimated marginal effect (red lines) on the log COVID‐19 risk of three edge metric covariates. Top‐left: number of flight connections per route (NB.FLY); top‐right: connectivity (CONNECT); bottom‐left: proportion of departures per route (DEP.FLY). The blue dashed lines indicate 95% pointwise confidence intervals.The results (top‐right, Figure 3) corroborate that a higher connectivity between airports is associated with an increase in the resilience of the network, which provides opportunities for the virus to spread easily and hence increases the risk (White & Harary, 2001). A higher traffic volume tends to be associated with higher risk prevalence (Bogoch et al., 2016), which appears in line with the positive trend (red line) observed in the effects of the number of flights on COVID‐19 risk (top‐left). However, the evidence of a positive trend in both variables is weak given the relatively large confidence intervals (CIs) (blue dashed lines).The estimated effects of the departing flight volume (bottom‐left) on COVID‐19 are not conclusive. The absence of a clear tendency is illustrated by the wide CIs. The large uncertainty observed in this analysis might result from the relatively low number of observations (132 flight connections) and the presence of outliers.
Risk estimates
Based on the specifications of the selected Airport Network Model (Equation 1), we perform out‐of‐sample predictions to compute the risk values for all 438 connections within the full Indonesian network (cf. Figure 2e). The predicted values are then used as edge weights to calculate the normalised ex ante COVID‐19 risk for each of the 78 airports on the Indonesian archipelago (cf. Figure 2f).Figure 4 shows the value of the mean (log) COVID‐19 risk (red dots) predicted at each airport along with uncertainty (bars) given by 95% CIs which provide a lower and upper estimate of (log) COVID‐19 risk. The main airports such as Jakarta (CGK), Sultan Hasanuddin International Airport (UPG) and Surabaya (SUB) exhibit the highest predicted (log) COVID‐19 risk.
FIGURE 4
Predicted ex ante COVID‐19 risk at airports. Ex ante COVID‐19 (log) risk (x‐axis) predicted for all (78) domestic airports in Indonesia (y‐axis). Red dots correspond to the mean estimation of the log ex ante COVID‐19 risk. Bars represent 95% confidence intervals, which provide a lower and upper estimate of (log) COVID‐19 risk.
Predicted ex ante COVID‐19 risk at airports. Ex ante COVID‐19 (log) risk (x‐axis) predicted for all (78) domestic airports in Indonesia (y‐axis). Red dots correspond to the mean estimation of the log ex ante COVID‐19 risk. Bars represent 95% confidence intervals, which provide a lower and upper estimate of (log) COVID‐19 risk.
SPATIAL PROPAGATION MODEL: A HIERARCHICAL BAYESIAN GEOSTATISTICAL MODEL TO PREDICT EX ANTE COVID‐19 INFECTION RISK
The results of the Airport Network Model provide an estimate of the ex ante COVID‐19 risk along with uncertainty for each connection in the domestic Indonesian airport network. To account for the uncertainty in the estimation of the ex ante COVID‐19 risk at the airports, we build three scenarios, which provide a lower (scenario I), mean (scenario II) and upper (scenario III) estimation of the ex ante COVID‐19 risk at the airports. For each scenario (I‐III), we extend the prediction of ex ante COVID‐19 risk over the entire archipelago at a fine spatial scale using a hierarchical Bayesian geostatistical model. The models include a set of spatial correlates associated with the risk of transmission and account for spatial dependencies (Diggle & Giorgi, 2018).
Data
We propagate the ex ante COVID‐19 risk from each airport into a 5 × 5 km spatial grid which covers the Indonesian archipelago. The Spatial Propagation Model considers four covariates related to an intensified spread of COVID‐19: airport access, traffic intensity, proximity to Java and Bali, and population. We measure airport access () as the travel distance from each grid cell to the nearest airport. Travel distances are computed following Weiss et al. (2018) high‐resolution friction maps. The spread of an infectious disease is expected to increase with the intensity of commuting and travelling (cf. Xu et al., 2019). We approximate travel intensity () using Meijer et al. (2018) road network density maps. Proximity to Java is proxied by the Euclidean distance to the islands Java and Bali (). We gather population size data from the High Resolution Settlement Layer (HRSL) population data set (Facebook Connectivity Lab and Center for International Earth Science Information Network, 2021) that we aggregated at pixel‐level (5 × 5 km). Larger population density increases the pool of individuals to be infected and quickens the spread of the virus (Bhadra et al., 2021). Population is considered to be a major driver associated with the transmission of the COVID‐19 at various spatial scales (Python et al., 2022; Zhang et al., 2020). Therefore, we expect more vulnerability to COVID‐19 outbreaks close to densely populated areas.All covariates are scaled multiplicatively to values from 0.01 to 0.99, which avoids numerical difficulties during the fitting process (Gómez‐Rubio, 2020) and ease the comparison among the covariate estimates. Figure 5 shows the values (log scale for illustration purposes) of the three distance‐based risk and population covariates aggregated within 5 × 5 km grid‐cells in Indonesia.
FIGURE 5
Covariates used in the Spatial Propagation Model. (a) log population size (), (b) log travel time to the nearest airport (), (c) log traffic density (), and (d) log Euclidean distance to Java and Bali (). All covariates are defined on a 5 × 5 km grid. Grey pixels are regions without population which are excluded in this study. Green dots in panel (b) indicate the location of the investigated airports (78) from which log COVID‐19 risk are taken as response in the Spatial Propagation Model.
Covariates used in the Spatial Propagation Model. (a) log population size (), (b) log travel time to the nearest airport (), (c) log traffic density (), and (d) log Euclidean distance to Java and Bali (). All covariates are defined on a 5 × 5 km grid. Grey pixels are regions without population which are excluded in this study. Green dots in panel (b) indicate the location of the investigated airports (78) from which log COVID‐19 risk are taken as response in the Spatial Propagation Model.
Model specification
The model selection procedure (Section 3.3) results in the following specification of the Spatial Propagation Model:
where is the ex ante COVID‐19 log‐risk value observed in the 78 grid cells i associated with the domestic airports (see Section 2). We assume that follows a Gaussian distribution with expectation and precision of the Gaussian errors .In addition to the selected covariates assumed to be associated with an intensified spread of the disease, our model features a spatial Gaussian process, to account for spatial dependencies. More precisely, we consider a stationary Gaussian field with a Matérn covariance function (Eq. 3):
where denotes the Euclidean distance between two locations , is the marginal variance of the Gaussian random field, is the modified Bessel function of second kind and order λ > 0 (quantifying the degree of smoothness of the process) and κ > 0 is the scaling parameter associated with range which can be interpreted as the distance above which the spatial correlation becomes negligible (Lindgren & Rue, 2015).For computationally efficient inference, we consider the stochastic partial differential equation (SPDE) approach available in the R package R‐INLA utilising Integrated Nested Laplace Approximations (INLA; Rue et al., 2009; Lindgren et al., 2011). R‐INLA has been widely adopted in epidemiological modelling (see, e.g. Bhatt et al., 2015; Cameletti et al., 2013; Golding et al., 2017; Noor et al., 2014). INLA can accommodate a large class of models (Rue et al., 2009) is often faster than approaches based on Markov chain Monte Carlo, and provides diagnostics that are easily interpreted and reproducible (Wang et al., 2018). For further details on INLA, see Lindgren et al. (2011), Rue et al. (2009) and Bakka et al. (2018) review on R‐INLA applications.The Bayesian nature of the investigated spatial models allows us to flexibly set informative priors in order to mitigate the problem of having a low sample size (78 airports). However, small‐sample estimates are ineluctably sensitive to the specification of the prior distribution. Therefore, we take particular care to set reasonable priors to obtain reliable and interpretable results. To do so, we use prior knowledge to impose relatively strict priors on the fixed effects parameters. Using R‐INLA default flat priors (range between −93 and 93) on the variance of the fixed effects would lead to implausible or inaccurate estimates. We assume an a priori negative effect of and , since it is likely that COVID‐19 risk is higher in densely populated areas. Note that is considered during the model selection procedure but the selected model with the highest predictive performance does not include it (Section 3.3). Similarly, we assume an a priori positive effect of and on COVID‐19 risk (cf. Python et al., 2022; Xu et al., 2019; Zhang et al., 2020). Further details on the prior specifications are provided in Appendix A.3.2.As we do not have particular a priori knowledge on the extent and magnitude of the spatial structure of the spread of COVID‐19 in Indonesia, we use penalised complexity priors (Fuglstad et al., 2019; Simpson et al., 2017) for the spatial effect. Penalised complexity priors are set on the two hyperparameters and associated with the spatial random field ξ (Equation 2). This requires us to provide a threshold and an a priori probability that the true values of each investigated parameter are above or below the threshold. We choose the following weakly informative priors on the range and marginal variance of ξ: , with measuring the maximum Euclidean distance between two airports from a set of all observed airports (78) across Indonesia and . The sensitivity to the choice of different priors is discussed in Appendix A.3.5.
Model selection procedure
We use the widely applicable information criterion (WAIC) (Watanabe & Opper, 2010) and two predictive performance metrics to select the model: the conditional predictive ordinate (CPO, Pettit, 1990) and the probability integral transform (PIT, Dawid, 1984). is the posterior probability of observing conditioned on all other observations that is
where contains the observed (log) risk with the observation omitted. We compare the CPO resulting from seven models, which use a different subset from all possible non‐null subsets composed of three covariates (results in Appendix, Table A2).
TABLE A2
Model predictive performance. This table provides the conditional predictive ordinate (CPO) values for seven model specifications along with the widely applicable information criterion (WAIC). For the two models with highest CPO and lowest WAIC, we report the Kolmogorov–Smirnov (K‐S) test on the uniformity of PITs as well as their kurtosis. All model specifications include an intercept and a spatial field (not shown)
Model
CPO
WAIC
K‐S
Kurtosis
1
ACCESS + TRAFFIC + POP
−108.594
215.232
0.995
1.827
2
TRAFFIC + POP
−108.326
214.548
0.900
1.859
3
ACCESS + DIST.JAVA + TRAFFIC + POP
−110.162
218.315
4
DIST.JAVA + TRAFFIC + POP
−111.063
220.787
5
ACCESS + DIST.JAVA + POP
−116.044
230.438
6
DIST.JAVA + POP
−116.699
231.435
7
ACCESS + POP
−116.196
230.917
8
POP
−116.565
229.608
9
ACCESS + DIST.JAVA + TRAFFIC
−111.930
222.707
10
DIST.JAVA + TRAFFIC
−113.816
226.785
11
ACCESS + TRAFFIC
−110.811
220.167
12
TRAFFIC
−111.388
221.575
13
ACCESS + DIST.JAVA
−120.499
239.357
14
DIST.JAVA
−121.445
242.540
15
ACCESS
−122.733
240.160
is given by the conditional (given all other observations) cumulative distribution function of a replicate observation evaluated at the observed value , i.e.
If the forecast is perfect and the distribution of is continuous, has a uniform distribution (Wang et al., 2018).We also use CPO as a cross‐validation measure of fit using the log pseudo marginal likelihood (LPML) which is defined as (Gneiting et al., 2007):
Larger values of LPML indicate a better fit in terms of predictive performance.The results of the best performing model predicting the ex ante COVID‐19 risk at Indonesian airports are shown in Table 2 (detailed results are shown in Appendix A.2.2). The selected model includes the following covariates (on a log scale): distance to airports (), traffic intensity () and population ()). Table 2 reports the mean, median, 95% lower bound CI, 95% upper bound CI and the standard deviation (std) of the estimated parameters. None of the CIs associated with the fixed effects coefficients contain zero.
TABLE 2
Spatial Propagation Model: Baseline scenario. Mean, median, 95% credible intervals with lower () and upper () bounds, and standard deviations (std) associated with the covariate effects (Fixed effects), precision of the unstructured errors () and hyperparameters of the spatial field (range and marginal variance )
Parameters
Mean
Median
95% CIl
95% CIu
STD
Fixed effects
log(ACCESS)
−0.50
−0.50
−0.76
−0.25
0.13
log(TRAFFIC)
0.46
0.46
0.21
0.71
0.13
log(POP)
0.19
0.19
0.00
0.38
0.10
intercept
−3.16
−3.16
−4.45
−1.91
0.65
Random effects
Spatial range
ρξ
2.89
1.04
0.12
17.34
6.43
Marginal variance
σξ2
1.54
1.29
0.26
4.21
1.05
Precision
τϵ
1.17
1.16
0.77
1.60
0.21
Spatial Propagation Model: Baseline scenario. Mean, median, 95% credible intervals with lower () and upper () bounds, and standard deviations (std) associated with the covariate effects (Fixed effects), precision of the unstructured errors () and hyperparameters of the spatial field (range and marginal variance )As expected, the logged travel time to the nearest airport () has a negative effect, which is in line with the literature (Knipl et al., 2013). A location 1% farther away from an airport decreases the (log) COVID‐19 risk by 0.5% on average while keeping all other explanatory variables constant. Both logged traffic intensity () and logged population () are associated with an increase in the logged COVID‐19 risk, which is consistent with the literature (Bhadra et al., 2021; Python et al., 2022; Zhang et al., 2020).The resulting estimation of the spatial range () provides complementary information on the initial spread of COVID‐19 risk. The estimated average distance above which spatial dependencies become negligible is 2.89 degrees, which correspond to about 300 km. This suggests that the ex ante COVID‐19 risk observed in Jakarta is unlikely to be associated with the risk observed in Bali—953 kilometres away from the capital.
Model validation
In order to systematically validate our model, we use a leave‐one‐out cross‐validation (LOOCV) procedure (Vehtari et al., 2017). We visually assess the uniformity of the PITs using a Q–Q plot, which plots the sample quantiles together with the theoretical quantiles associated with the PIT estimated for each of the 78 hold‐out data. The good alignment of the observations (dots) with the assumption of uniformity (diagonal line) in the Q–Q plot illustrated in Figure 6 support the assumption that the cross‐validated PITs follow a uniform distribution. These results suggest a relatively good model fit.
FIGURE 6
Model validation. Q–Q plot assessing the uniformity of the PITs associated with the predictions of ex ante COVID‐19 risks () for i = {1, …, 78} hold‐out data in a leave‐one‐out cross‐validation (LOOCV) procedure. Perfect uniformity is represented by the diagonal line. Dots represent the values of the theoretical quantiles (x‐axis) of the estimated PITs along with their corresponding estimated sample quantiles (y‐axis).
Model validation. Q–Q plot assessing the uniformity of the PITs associated with the predictions of ex ante COVID‐19 risks () for i = {1, …, 78} hold‐out data in a leave‐one‐out cross‐validation (LOOCV) procedure. Perfect uniformity is represented by the diagonal line. Dots represent the values of the theoretical quantiles (x‐axis) of the estimated PITs along with their corresponding estimated sample quantiles (y‐axis).Figure 7 illustrates the uncertainty associated with the predictions of ex ante COVID‐19 risk. For each of the hold out points, the figure shows the median (green points) and predicted 95% CI (red segments) along with the true value (blue points). While most true values (blue points) fall into the predicted 95% CI, one can observe that most CIs are relatively wide, which suggests a relatively high uncertainty associated with the predictions. Note that the model seems to underestimate the risk for extremely high risk values (see the blue points that lie above the red segments in Figure 7).
FIGURE 7
Model validation. Mean (green points) and predicted 95% CI (red segments) together with the true observed values, , (blue points) for each hold‐out airport i = {1, …, 78}, produced by a leave‐one‐out cross‐validation (LOOCV) procedure.
Model validation. Mean (green points) and predicted 95% CI (red segments) together with the true observed values, , (blue points) for each hold‐out airport i = {1, …, 78}, produced by a leave‐one‐out cross‐validation (LOOCV) procedure.As an ex post plausibility check for our modelling approach, we compare our risk predictions with external data on early COVID‐19 incidence in Indonesia. For this purpose, we rank Indonesian regencies by the total number of cases that are reported by the middle of March, April and May 2020. The comparison is imperfect as the official surveillance data is likely to under‐report COVID‐19 cases in the first months that followed the initial spread. Furthermore, bias is likely to affect the spatial distribution of the data and increase over time since spatially targeted public health interventions are carried out more frequently over time, which distorts the observed ranking of COVID‐19 cases at regency level especially in the later months. Other determinants of the spatial spread of the disease factors, including internal migration linkages and local public health capacity, may also introduce additional biases. These limitations notwithstanding, we observe a relatively good fit between the regency‐level average of the predicted risk and incidence data in mid‐March, mid‐April and mid‐May (see results in Appendix A.3.4).
Mapping the ex ante COVID‐19 risk across Indonesia
Within a Bayesian framework, we use the posterior distribution estimation of the parameters. The output of the two‐step modelling process consists of an estimation of the posterior distribution associated with the parameters of the Spatial Propagation Model which is used to make spatial inference about the ex ante COVID‐19 risk in Indonesia. We consider three scenarios to account for the uncertainty in the estimation of the first step (Airport Network Model) that is propagated in the second step (Spatial Propagation Model). For each scenario, we map the posterior median and the standard deviation of the (log) ex ante COVID‐19 risk into a fine spatial grid (5 × 5 resolution). Figure 8 illustrates the results. We observe that the patterns are consistent among the scenarios. As expected, the highest median risk values are observed in populated areas such as the islands of Java and Bali, while remote areas such as Papua show a smaller initial importation risk (panels (a–c)). However, areas with spatially sparse observations in regions such as Papua and northern Kalimantan exhibit higher uncertainty in the estimation of COVID‐19 risk, with high standard deviation values (panels (d–f)) relative to the median.
FIGURE 8
Maps of the ex ante (log) COVID‐19 risk in Indonesia. The maps show the (a–c) median and (d‐f) standard deviation of the (log) ex ante COVID‐19 risk for three scenarios (I: lower (first column), II: mean (second column), III upper estimations (third column) of the response) in Indonesia predicted at fine spatial grid (5 × 5 ).
Maps of the ex ante (log) COVID‐19 risk in Indonesia. The maps show the (a–c) median and (d‐f) standard deviation of the (log) ex ante COVID‐19 risk for three scenarios (I: lower (first column), II: mean (second column), III upper estimations (third column) of the response) in Indonesia predicted at fine spatial grid (5 × 5 ).While these maps help to visualise the results, they are especially difficult to interpret for policymakers. To determine areas with high COVID‐19 risk, end users need to consult at least two maps: (a) identifying locations with high median risk (based on, e.g., panels (a–c) of Figure 8); and (b) checking if the uncertainty about the prediction (based on, e.g. panels (d–f)) is small enough to avoid a false identification of high‐risk areas. Equally difficult is the detection of areas at low COVID‐19 risk or areas that require further data collection.To simplify the interpretation of the maps, we calculate the exceedance probability for each grid cell k = {1, …, 388,740} that covers the Indonesian archipelago. The resulting maps provide an estimated probability that the ex ante COVID‐19 risk is larger than a threshold value c, . This allows for the easy identification of high‐risk areas. In practice, thresholds have to be chosen according to the policymakers' needs and can vary with their strategies and available resources.Figure 9 illustrates such exceedance probability surface maps. We use two thresholds which can be adjusted e.g. in a policy‐making context. Panels (a–c) show the probability of the log COVID‐19 risk exceeding −2.2 for scenarios I–III. Panels (d–f) show the probability of the log COVID‐19 risk exceeding −1 for scenarios I–III.
FIGURE 9
Exceedance probability maps. Probability of log COVID‐19 risk exceeding −2.2 in scenarios I (a), II (b) and III (c). Probability of log COVID‐19 risk exceeding −1 in scenarios I (d), II (e) and III (f). The spatial resolution is 5 × 5 .
Exceedance probability maps. Probability of log COVID‐19 risk exceeding −2.2 in scenarios I (a), II (b) and III (c). Probability of log COVID‐19 risk exceeding −1 in scenarios I (d), II (e) and III (f). The spatial resolution is 5 × 5 .With very limited resources, policymakers may favour a high threshold in the exceedance probability map to highlight regions at elevated risk requiring direct intervention. Once more resources are available, the threshold can be reduced, increasing the area considered at high risk. When choosing a relatively high threshold, the regions considered at elevated risk include the metropolitan area around Jakarta and regions between Surabaya and Malang (Figure 9a–c). With a lower threshold, the spatial coverage of high risk areas is extended to almost all territories of Java, South Sumatra, South Kalimantan, Makassar, Bali and Mataram (Figure 9d‐f). The choice of the threshold is crucial. Consequently, in practical applications, it should be selected with care by end users.Besides, maps showing the (log) probability of COVID‐19 risk not exceeding a relatively large threshold (Figure 9a–c) highlight larger areas at low risk than those using smaller thresholds (Figure 9d–f). For a given threshold, areas with a low exceedance probability in the interval (0,0.25] are informative as well. They may highlight areas where the spatial distribution of the observations is too sparse such that additional data collection might be required. The information on statistical uncertainty conveyed by the exceedance probability maps is crucial for policymakers who need further information from additional sampling in areas that require surveillance.
DISCUSSION
The first step (Airport Network Model) and the second step (Spatial Propagation Model) exhibit important limitations since both models rely on simplifying assumptions and relatively scarce data. Estimates from the Airport Network Model crucially depend on the quality of the initial COVID‐19 risk estimates for the seven main entry airports provided by Maier and Brockmann (2020). Furthermore, the skewed distribution of the risk data makes the choice of an appropriate likelihood function for the Airport Network Model challenging. To mitigate this issue, we log the risk values such that normality assumptions appear valid for the resulting transformed data (see the diagnostics plot in Appendix A.2.1). Quantifying uncertainty associated with the estimates from the Airport Network Model remains challenging. To account for it, we computed the estimates under three scenarios, which provide a lower, mean and upper estimation of COVID‐19 risk based on the estimated mean and 95% CIs. For each corresponding scenario, we fit a Spatial Propagation Model and compared the results. The relatively high consistency in the results suggests that the Spatial Propagation Model is robust to variations of the response (i.e. the output of the Airport Network Model).Our Airport Network Model relies on airports' traffic volumes to derive Indonesia's COVID‐19 importation risk. This seems a reasonable assumption for island nations like Indonesia, Australia or New Zealand, but would face more significant limitations in case of countries where a substantial part of international travel occurs via land. Even in our setting, approximating the number of passengers at risk of being infected by COVID‐19 only by the number of flight connections ignores a potential association between airplane capacity and urbanisation. Small airplanes are more likely to fly to provincial airports. Therefore, our simplified approach may overestimate the risk in remote areas. We also ignored maritime transport routes such as ferries that are widely used across the Indonesian archipelago, but are a less substantial means of international travel. However, as the island of the arrival airport might not always coincide with the final destination of passengers, our approach does not capture maritime transmission routes. We also exclude Singapore as a final possible destination although it hosts a large share of international flights to Indonesia. Some passengers who enter Indonesia from Singapore via ferry will not necessarily enter the domestic Indonesian flight network. Despite these limitations, it appears reasonable to suppose that the COVID‐19 importation risk is still fairly well described by airplane connectivity because cross‐border traffic in Indonesia predominantly occurs via air transportation.We identified several shortcomings associated with the Spatial Propagation Model. It assumes both stationarity (and hence invariance to translation) and isotropy (invariance to rotation). Processes impacting the ex ante COVID‐19 risk might be non‐stationary, with the risk being clustered in one area, but incurring randomly in other regions, or anisotropic such that the ex ante risk increases systematically in some directions. Future studies investigating non‐stationary models as well as dynamic models including cell phone or real traffic data might be able to capture more complex spatio‐temporal processes associated with the ex ante spread of COVID‐19 through the SPDE/GMRF‐approach implemented in R‐INLA.The hierarchical Bayesian geostatistical model allows us to set up informative priors to benefit from prior knowledge and mitigate issues associated with the relatively low sample size of the observed data. While our prior sensitivity analysis (see Appendix A.3.5) suggests that limited changes in the prior specifications do not drastically alter the main results, the model ineluctably remains sensitive to important variations in the priors. In this context, we put particular emphasis on choosing relevant priors in line with the literature to increase the reliability of our results while ensuring that they remain interpretable.By mainly focusing on transport networks, the Spatial Propagation Model ignores potential drivers of the ex ante COVID‐19 risk such as the proportion of people with pre‐existing conditions or with a lifestyle associated with a higher risk. To mitigate the effects of the omitted variable bias, the spatial dependence structure, represented by a Gaussian Markov random field, is introduced in the model, which partially accounts for unobserved drivers.
CONCLUSION
This paper examined the log‐transformed ex ante risk of COVID‐19 in Indonesia by applying a Bayesian hierarchical geostatistical model based on risk values obtained through a network analysis of flight connections in Indonesia. The novelty of this paper lies in its ability to systematically capture the ex ante COVID‐19 risk both in urban as well as rural areas in Indonesia in the presence of scarcity of spatial data, which typically occurs in the early stages of a disease outbreak. Our approach provides tools for policymakers to identify areas that would require further surveillance and areas vulnerable to a potential future outbreak of a new COVID‐19 variant or another similar contagious respiratory disease.The suggested spatial statistical approach adds to the exploratory literature by assessing the role of major infrastructural and socio‐economic drivers of COVID‐19 (Bhadra et al., 2021; Dadar et al., 2020; Meijer et al., 2018). We found that the island of Java was exposed to a high initial risk. This seems plausible because of the important socioeconomic role of the island in the archipelago. Furthermore, we identified other metropolitan areas at risk to be affected at the beginning of the pandemic. Our findings corroborate that urban areas tend to be more exposed than peripheral islands to an initial spread of COVID‐19. Sparsely populated areas and those with a low traffic density—both effects are associated with less degree of urbanisation—seem less at risk. Our study brings support to previous findings showing the vulnerability of urban centres to COVID‐19 risk and their role in facilitating its spread (Carozzi et al., 2020).A rigorous model validation remains challenging in the context of predicting risk ex ante, which is inherently affected by the absence of data. However, our results show a good alignment of predictions with external data at a fine spatial scale, for several months after the initial spread of COVID‐19 in Indonesia. Future work collecting additional data based on a design aiming at reducing various sources of bias would improve the reliability of the procedure used to assess the validity of our model.Within a Bayesian model, we generated exceedance probability risk maps to illustrate estimates of the ex ante COVID‐19 risk and its uncertainty in different scenarios and at policy‐relevant spatial scales across Indonesia. A collaboration between the analyst who calibrates the model and policymakers is required to define sensible thresholds to identify local hotspots and areas that require further data collection (Moraga, 2019). We believe that initiating and maintaining a close dialogue between statisticians and policymakers is crucial to ensure successful interventions aiming at mitigating COVID‐19 risk in vulnerable areas.
TABLE A1
Model comparison summary. This table provides the root mean squared error (RMSE) and mean average error (MAE) scores and the Akaike information criterion (AIC) computed using maximum likelihood estimator (MLE) and computed to compare models based on a log‐normal and gamma distribution functions
Distribution
AICMLE
AICmodels
RMSE
MAE
Log‐normal
−2716.11
−2931.93
0.89
0.60
Gamma
−2653.97
−2826.49
1.49
1.16
TABLE A3
Correlation test at regency level: summary. Wilcoxon rank test and Spearman rank correlation are calculated between the mean prediction of the Spatial Propagation Model and external data set. Time period: model prediction: December 2019–February 2021; the reported infections: mid‐March, mid‐April and mid‐May 2020
Wilcoxon test p‐value
Spearman rank correlation
Data collection time
0.008
0.730
mid‐March 2020
<0.001
0.535
mid‐April 2020
<0.001
0.478
mid‐May 2020
TABLE A4
Spatial Propagation Model: prior sensitivity analysis. This table shows (for each combination of and ) the mean, median, 95% credible intervals with lower () and upper () bounds, and the standard deviation (STD) associated with the estimation of the covariate coefficients (Fixed effects), precision of the random effects () associated with the unstructured errors, and the hyperparameters of the spatial field (range and marginal variance ). Nb. of obs.: 78 domestic airports in Indonesia
Parameters
σ0
ρ0
Mean
Median
95% CIl
95% CIu
STD
Fixed effects
ACCESS
1.75
0.24
−0.50
−0.50
−0.76
−0.25
0.13
1.75
0.71
−0.51
−0.51
−0.77
−0.25
0.13
2.00
0.48
−0.50
−0.50
−0.76
−0.25
0.13
2.25
0.24
−0.50
−0.50
−0.76
−0.25
0.13
2.25
0.71
−0.50
−0.50
−0.76
−0.25
0.13
TRAFFIC
1.75
0.24
0.46
0.46
0.21
0.70
0.12
1.75
0.71
0.47
0.47
0.22
0.72
0.13
2.00
0.48
0.46
0.46
0.21
0.71
0.13
2.25
0.24
0.19
0.19
0.00
0.38
0.10
2.25
0.71
0.19
0.19
0.00
0.37
0.09
POP
1.75
0.24
0.19
0.19
0.00
0.38
0.10
1.75
0.71
0.19
0.19
−0.00
0.38
0.10
2.00
0.48
0.19
0.19
0.00
0.38
0.10
2.25
0.24
−0.31
−0.31
−0.51
−0.12
0.10
2.25
0.71
−0.32
−0.32
−0.52
−0.12
0.10
intercept
1.75
0.24
−3.15
−3.15
−4.44
−1.91
0.64
1.75
0.71
−3.16
−3.15
−4.46
−1.90
0.65
2.00
0.48
−3.16
−3.16
−4.45
−1.91
0.65
2.25
0.24
−3.16
−3.15
−4.45
−1.91
0.65
2.25
0.71
−3.15
−3.14
−4.47
−1.89
0.66
Random effects
Spatial range
ρξ
1.75
0.24
2.84
2.51
0.67
6.97
1.66
1.75
0.71
1.35
1.10
0.29
3.88
0.96
2.00
0.48
1.54
1.29
0.26
4.21
1.05
2.25
0.24
2.80
2.59
0.82
5.99
1.35
2.25
0.71
0.63
0.45
0.03
2.20
0.60
Marginal variance
σξ2
1.75
0.24
0.97
0.51
0.12
4.68
1.43
1.75
0.71
6.73
1.69
0.08
47.39
17.90
2.00
0.48
2.89
1.04
0.12
17.34
6.43
2.25
0.24
0.63
0.39
0.12
2.55
0.72
2.25
0.71
9.89
2.79
0.31
63.64
27.47
Precision
τϵ
1.75
0.24
1.17
1.17
0.75
1.60
0.22
1.75
0.71
1.23
1.21
0.80
1.81
0.26
2.00
0.48
1.17
1.16
0.77
1.60
0.21
2.25
0.24
1.19
1.19
0.75
1.61
0.22
2.25
0.71
1.20
1.19
0.80
1.64
0.22
TABLE A5
Spatial Propagation Model: Lower Bound (scenario I). Mean, median, 95% credible intervals with lower (CI
) and upper (CI
) bounds, and standard deviations (std) associated with the covariate effects (Fixed effects), precision of the unstructured errors (), and hyperparameters of the spatial field (range and marginal variance )
Parameters
Mean
Median
95% CIl
95% CIu
STD
Fixed effects
log(ACCESS)
−0.51
−0.51
−0.77
−0.26
0.13
log(TRAFFIC)
0.48
0.48
0.22
0.72
0.13
log(POP)
0.19
0.19
−0.00
0.38
0.10
intercept
−3.18
−3.18
−4.48
−1.92
0.65
Random effects
Spatial range
ρξ
1.64
1.39
0.34
4.40
1.07
Marginal variance
σξ2
2.51
1.06
0.17
14.00
4.80
Precision
τϵ
1.17
1.16
0.75
1.65
0.23
TABLE A6
Spatial Propagation Model: Upper Bound (scenario III). Mean, median, 95% credible intervals with lower () and upper () bounds, and standard deviations (std) associated with the covariate effects (Fixed effects), precision of the unstructured errors () and hyperparameters of the spatial field (range and marginal variance )
Authors: D J Weiss; A Nelson; H S Gibson; W Temperley; S Peedell; A Lieber; M Hancher; E Poyart; S Belchior; N Fullman; B Mappin; U Dalrymple; J Rozier; T C D Lucas; R E Howes; L S Tusting; S Y Kang; E Cameron; D Bisanzio; K E Battle; S Bhatt; P W Gething Journal: Nature Date: 2018-01-10 Impact factor: 49.962