
A city cluster risk-based approach for Sars-CoV-2 and isolation barriers based on anonymized mobile phone users' location data.

Julio Cezar Soares Silva1, Diogo Ferreira de Lima Silva2, Afonso de Sá Delgado Neto3, André Ferraz3, José Luciano Melo3, Nivan Roberto Ferreira Júnior1, Adiel Teixeira de Almeida Filho1.   

Abstract

Given the recent outbreak of Sars-CoV-2, several countries started to seek different strategies to control contamination and minimize fatalities, which are usually the primary objectives of all strategies. Secondary objectives are related to economic factors, ensuring that society is able to keep its essential activities and avoid supply disruptions. This paper presents an application of anonymized mobile phone users' location data to estimate population flow amongst cities with an origin-destination matrix. The work includes a clustering analysis of cities, which may enable policymakers (and epidemiologists) to develop public policies giving the appropriate consideration to each set of cities within a Province or State. Risk measures are included to analyze the severity of the spread among the clusters, which can be ranked. Intelligence can then be obtained from the analysis, and some clusters could be isolated to avoid contagion while keeping their economic activities. This analysis is reproducible for other states of Brazil and other countries and can be adapted for districts within a city, especially considering the possibility of a second wave of the COVID-19 pandemic.
© 2020 Elsevier Ltd. All rights reserved.


Keywords:  COVID-19; Clustering; Networks; Public health; Sars-CoV-2; Weighted directed graphs

Year:  2020        PMID: 33178556      PMCID: PMC7644257          DOI: 10.1016/j.scs.2020.102574

Source DB:  PubMed          Journal:  Sustain Cities Soc        ISSN: 2210-6707            Impact factor:   7.587


Introduction

The rapid worldwide spread of the novel COVID-19 calls for studies regarding the causes, the current global/local situation, the effects of this epidemic disease, and the possible solutions in both socio-economic and health aspects. Intending to contain the advance of this pandemic, several governments have imposed social restrictions on their populations, including policies such as quarantine and lockdowns (Gao & Yu, 2020; Yang et al., 2020). In parallel, the academic discussion on the impact of border/travel control, traveler targeting, and local governments' responses advances (Anzai et al., 2020; Shrivastava & Shrivastava, 2020; Sirkeci & Yucesahin, 2020; Wells et al., 2020). Furthermore, several studies have been conducted with the intent of detecting patterns to understand/predict the spread of the disease (Ahmad et al., 2020; Magesh, Niveditha, Rajakumar, RamMohan, & Natrayan, 2020; Sethy, Behera, Ratha, & Biswas, 2020; Sujath, Chatterjee, & Hassanien, 2020). Among the strategies related to combating COVID-19, the use of mobility data can be highlighted. Allam and Jones (2020) argued that, in order to obtain a better global understanding and cooperative management of COVID-19, it is necessary to develop communication amongst the smart cities network, such as sharing information from IoT devices so that health professionals can establish better protocols and policies. Also, the availability of public information and smart city technologies is of critical importance for achieving healthier cities (Pineda & Corburn, 2020). Analyzing mobility data allows one to infer spatial dependencies amongst individuals (Delmelle & Delmelle, 2012; Vinayak et al., 2018) and supports integration with network analysis (Broach, Dill, & Gliebe, 2012; Merchán, Winkenbach, & Snoeck, 2020).
Moreover, the use of real mobility data is crucial for the development of smart cities and more resilient environments, which offers support to health monitoring solutions for policymakers, while respecting citizen privacy protection (Anisetti et al., 2018; Bibri, 2018; Rathore et al., 2018). For instance, cities can be ranked so that the government can achieve objectives prioritizing those most critical (Akande, Cabral, Gomes, & Casteleyn, 2019), and smart applications based on Internet of Things (IoT) devices can be used to monitor individuals and thus reduce the spread of infectious diseases, such as COVID-19 (Min-Allah & Alrashed, 2020). COVID-19 is the third type of zoonotic coronavirus that causes a large-scale outbreak, joining the outbreaks caused by Severe Acute Respiratory Syndrome (SARS) and Middle East Respiratory Syndrome (MERS) in the last two decades (Ceylan, 2020; Chang et al., 2020; Sobral, Duarte, da Penha Sobral, Marinho, & de Souza Melo, 2020; Wu, Leung, & Leung, 2020). While MERS caused 858 deaths from 2494 infected people to date (World Health Organization, 2020a) and SARS caused more than 800 deaths and 8000 cases (Chang et al., 2020; Wu et al., 2020) since its first outbreak, the COVID-19 outbreak has already reached much larger numbers, causing enormous concerns around the world. As of 2020-05-07, there had been 3,833,957 cases and 268,999 deaths associated with COVID-19 (CCSE-JHU, 2020). In Brazil, the Supreme Federal Court authorized governors and mayors to determine the rules that should be followed by the citizens of their respective states and cities (Brígido, 2020). This decision creates the need for a local analysis for each state to justify its position.
Moreover, the potential of coronaviruses to result in large case clusters via super spreading (Wu et al., 2020), as in the catastrophic situation caused by the novel COVID-19 in the Lombardy region of Italy (Giordano et al., 2020; Sebastiani, Massa, & Riboli, 2020), where more than 70,000 cases were reported, and worries regarding a second wave of infection (Liu, Eggo, & Kucharski, 2020; Xu & Li, 2020) call attention to the importance of studies focused on microregion scenarios and economic recovery in this context (Hart & Halden, 2020). In this paper, the state of Pernambuco, located in northeastern Brazil, is analyzed. To date, Pernambuco is amongst the states with the highest number of deaths caused by COVID-19 in Brazil, with 845 (Ministério da Saúde, 2020). Furthermore, about 98 % of its intensive care units are occupied, and a collapse of the health system is near. Real anonymized data regarding movements between 192 cities located in Pernambuco from 2020-01-01 to 2020-03-31 were collected and used as input for a network model. From the obtained population mobility network, weighted directed graphs were considered, which enabled us to analyze the data before and after the first isolation policies were adopted. Therefore, the contributions of this paper are threefold. First, it discusses the COVID-19 outbreak within one of the most affected states of Brazil. Second, it presents a clustering analysis based on real data collected from mobile phone location, used to describe the dynamics of travels made by phone users between pairs of cities located in Pernambuco. Plots of the clusters give a visual illustration of how cities interact with each other. Third, this paper uses the statistics of the cities' population and the number of COVID-19 active cases to calculate risk measures that can be used by authorities when deciding on the inclusion/exclusion of isolation barriers.
Thus, the paper contributes by proposing the integration of anonymized mobile location data to estimate mobility patterns through an origin-destination matrix. The bi-directional mobility graph is obtained to enable a persistence analysis of the network to find natural clusters of cities. Then, an analysis is provided integrating a traditional epidemic measure, the Force of Infection (Keeling & Rohani, 2008), and a proposed Risk exposure measure that considers each node's strength and the actual epidemic data provided by official sources. This enables insights such as those presented in Section 3.1.2 and Section 4, which support policymakers' decisions and help epidemiologists establish isolation protocols. The paper is organized into five sections. Section 1 introduces the topics discussed in the paper. Section 2 presents how the data were collected and the methods used to analyze them. The results of a case study are reported and discussed in Section 3, followed by a discussion on the insights and limitations in Section 4. Finally, Section 5 concludes the paper and presents some suggestions for the continuity of this work.

Related works

Mobile location data has been included as one of the main digital tools/technologies used for combating COVID-19 in a recent review (Budd et al., 2020). In this review, Budd et al. (2020) explain that the use of location data mainly addresses a public-health need to interrupt community transmission. The effort of large global technology companies such as Google and Apple in generating and making available community mobility reports to aid the fight against COVID-19 highlights the importance of this kind of study (Apple, 2020; Google, 2020). Table 1 compares state-of-the-art studies involving COVID-19 and mobile phone location data with this work.
Table 1

Recent articles using mobility data to combat COVID-19.

| Paper | Source of data | Place of interest | Purpose | Analysis |
|---|---|---|---|---|
| (Kraemer et al., 2020) | Baidu mobility data (T. Hu, Guan et al., 2020) | China | Investigate when travel restrictions are effective | Descriptive statistics. Difference in confirmed cases data originated from individuals with and without travel history to China. Risk measure: transmission risk from travelers. |
| (Chinazzi et al., 2020) | Baidu mobility data (T. Hu, Guan et al., 2020) | China | Study the impact of control interventions on COVID-19 spread locally (China) and internationally. | Descriptive statistics. Evaluated different travel restrictions and transmissibility scenarios. Risk measure: risk of importing cases from mainland China. |
| (Jia et al., 2020b) | One of the largest carriers in China | China | Investigate the impact of social distancing in mobility. Forecast confirmed cases distribution and identification of high-risk areas. Develop tools to support risk assessment and resource allocation planning. | Modeled the effect of outflow distribution from Wuhan. COVID-19 evolution is characterized by a spatiotemporal hazard function. Risk measures: daily risk score for prefectures; outflow from a source of risk. |
| (Zhang et al., 2020b) | Hunan Provincial Center for Disease Control and Prevention, China | China | Impact of age differences in the transmission of the COVID-19. Investigation of mixing patterns change due to social distancing. | Statistics related to contact frequency given demographic characteristics and location. Contact matrix. Mixing pattern effects in the basic reproduction number. Impacts on the basic reproduction number due to removal of school contacts. |
| (B. Hu, Qiu et al., 2020) | Wayz Inc | China | Analyzed the spatiotemporal association between COVID-19 spread dynamics and human movement | Spatiotemporal data of two categories of location-based service data of mobile phones. Statistical tools to measure correlation and spatial stratified heterogeneity. |
| (Zhou et al., 2020) | China Unicom | China | Model to support policymakers to define the optimal configuration of mobility restrictions | SEIR model to analyze the potential effects of different mobility restrictions. Risk measure: force of infection. |
| (Pepe et al., 2020) | Cuebiq Inc | Italy | Effectiveness of the control measures imposed by the Italian government in mobility | Mobility and proximity networks. Developed metrics related to individual proximity and mobility. |
| (Aleta et al., 2020) | Cuebiq Inc | USA | Investigation of different strategies effectiveness to relax social-distancing | Compartmental models. Agent-based approach. |
| (Hill et al., 2020) | Cuebiq Inc | USA | Understand the impact of religiosity on mobility during the COVID-19 outbreak | Robust regression. Descriptive statistics analysis. Direct and moderation effects of mobility. Sensitivity analysis. |
| (Peixoto et al., 2020) | In Loco Company | Brazil | Evaluate and forecast the spatiotemporal risk of infection in São Paulo and Rio de Janeiro | Metapopulation SI model. Risk measure: rank of infection. |
| This article | In Loco Company | Brazil | Define strategic locations to place isolation barriers | Descriptive statistics: before and after the pandemic. Clustering analysis: before and after the pandemic. Risk measures: force of infection; city exposure risk measure; cluster exposure risk measure. |

The effect of travel restrictions in Wuhan to prevent or delay case importations in other Chinese cities as well as internationally was discussed by Chinazzi et al. (2020), who used Baidu mobility data. Kraemer et al. (2020) also used real-time mobility data from Wuhan in their study, where the impact of control measures in the region was analyzed. Zhang et al. (2020a) used contact-tracing data from one of the largest operators in China to investigate the impact of age differences on COVID-19 transmission and the modification of mixing patterns due to social distancing. Jia et al. (2020a) studied the effect of the quarantine in ceasing mobility and showed that the outflow distribution from Wuhan predicts where infections by COVID-19 mostly occur. The authors also developed a spatio-temporal risk model to identify regions with a high risk of infection at an early stage. Zhou et al. (2020) used anonymized mobile phone data as an input for an SEIR model to study COVID-19 evolution dynamics when different mobility restrictions were imposed. Hu, Qiu et al. (2020) studied the correlation between COVID-19 spatiotemporal data and two categories of location-based services data of mobile devices to understand the effect of human movement on the COVID-19 spread.
Mobility data was also used to understand the COVID-19 spread and mitigation strategies in other countries. Pepe et al. (2020) used mobile location data provided by Cuebiq Inc to develop individual proximity and mobility metrics to study the impact of the social distancing measures imposed by the Italian government on the reduction of the COVID-19 spread. Aleta et al. (2020) used mobile location and census data to build agent-based models of Boston’s metropolitan area to understand the impact of strategies to relax social-distancing. They used compartmental models to study the evolving dynamics of COVID-19. Hill, Gonzalez, and Burdette (2020) used mobile phone location and census data to develop a robust regression model that analyzed the impact of religion on human mobility during the COVID-19 pandemic.
Concerning Brazil, only one study using mobile device location data has been developed so far. Peixoto, Marcondes, Peixoto, and Oliva (2020) used a metapopulation SI model to study the spatiotemporal dynamics of COVID-19 in two Brazilian states. By ranking cities from São Paulo and Rio de Janeiro with respect to the rank of infection measure, the authors obtained a risk map for the evolution of the pandemic spread. Our study brings innovation because it explores the COVID-19 risk of infection in groups of cities with a relatively strong relationship (clusters), therefore offering methods to mitigate infection risk, in this case the strategic placement of isolation barriers, while maintaining both economic and social interactions amongst the cities of each cluster. This study also adopted both the risk of infection of an individual present in a group of infected cities and the risk of contamination of cities containing no active cases.

Methods

Data collection

The data was provided by In Loco Company, which collects anonymized location data from about 60 million devices around the world, enabling mobile apps to provide location-aware services while securing the privacy of their users. In Pernambuco, more than 2 million devices are registered and were used in this application, which is a significant sample considering the 9.5 million population of Pernambuco. However, for privacy reasons, In Loco Company is unable to associate a user with any external information. Nevertheless, most of the apps (e.g., e-commerce stores, shopping stores, and online sales of used products) that have In Loco Company technology embedded are used homogeneously across social classes and therefore represent a significant portion of the company's database. Having such a stratified sample of users enables In Loco Company to sell mobility-based business intelligence applications for a wide variety of purposes. The performance of In Loco Company technology was amongst the best infrastructure-free technologies in IPSN 2014 (Lymberopoulos & Liu, 2017), with an average location error of 2.81 m for the indoor location. It relies only on the smartphone's sensors, including Wi-Fi, accelerometer, and magnetometer, in order to enhance the GPS precision of location events, giving context to their nature. Also, the embedded Software Development Kit (SDK) of In Loco Company is specially designed to capture location at crucial moments of the user's path, that is, only moments that actually establish a visit to a specific place (a sufficiently steady state of movement for a sufficient amount of time in that location), gathering the information only once and when needed. Also, the absence of a user's internet connection at the moment of location collection is not a limitation. The SDK is programmed to store location visits and wait for the moment in which sending the visit information is possible.
Therefore, movement noise is diminished, and locations may be gathered even if the device is offline or if its owner only connects to the internet when there is a Wi-Fi connection available. Using such rich yet anonymized information of the user's location enables us to build the origin-destination matrix by relying first on the distribution of pairs of distinct location transitions on users' paths. The same source of anonymized mobile user’s location data has been used for the main social isolation index (Queiroz et al., 2020) used in Brazil by policymakers for establishing the COVID-19 outbreak control with social distancing.

Data preprocessing

Fig. 1 presents a diagram that summarizes the source in which data was collected, how it was preprocessed, and used as input for the experiments.
Fig. 1

Data sources, preprocess, and resulting input.

The origin-destination matrix characterizes the overall flow distribution between pairs of locations. Each entry $p_{ij}$ of the origin-destination matrix represents the relative flow estimate of people moving from city $i$ to city $j$. If $p_{ij} = 0.2$, then a 20 % flow was estimated with respect to the population of city $i$. Only the intracity movement was not estimated using that technique. Instead, the idea is to consider the rate of users that did not leave their current city on a given day (which can be estimated by clustering visits by time and location and choosing the most frequent location in moments of rest, such as the night). The data collected thus include an origin-destination matrix for each day $t$. In this paper, the data used regard the behavior of the population of Pernambuco concerning travels between cities on different dates (2020-01-01 to 2020-03-31). Therefore, for each city $i$, the percentage of people that remained in the city and the percentage that left city $i$ for each of the other cities were calculated for each of the 91 dates collected. Devices from other states found within these 192 cities during the data collection were also considered because of their mobility from one city to another, even if it was inter-state mobility. The objective of this study is to consider worst-case scenarios of the COVID-19 spread; thus, we have estimated the worst-case flow for each pair of cities in the state. We considered the worst-case flow between $i$ and $j$ as $w_{ij} = \max_{t} p_{ij}^{t}$, where $t$ indexes the origin-destination matrices in the database. These scenarios were evaluated in two different periods: before the pandemic, and after the start of the pandemic and the establishment of isolation protocols. An extract using the indexes [1:7, 1:7] from an OD matrix is shown in Table 2.
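Table 2

OD-matrix sample.

| Origin \ Destination | Abreu e Lima | Afogados da Ingazeira | Afrânio | Agrestina | Alagoinha | Aliança | Altinho |
|---|---|---|---|---|---|---|---|
| Abreu e Lima | 0.928528 | 0 | 0 | 0 | 0 | 0.000153 | 0 |
| Afogados da Ingazeira | 0 | 0.979777 | 0 | 0 | 0 | 0 | 0 |
| Afrânio | 0 | 0 | 0.991713 | 0 | 0 | 0 | 0 |
| Agrestina | 0.00032 | 0 | 0 | 0.964455 | 0 | 0 | 0.0131 |
| Alagoinha | 0 | 0 | 0 | 0 | 0.975818 | 0 | 0 |
| Aliança | 0.001155 | 0 | 0 | 0 | 0 | 0.965856 | 0 |
| Altinho | 0 | 0 | 0 | 0.029395 | 0 | 0 | 0.970642 |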
Furthermore, statistics involving the COVID-19 spread in Pernambuco were collected from the Brazilian Ministry of Health reports (Ministério da Saúde, 2020) and the Secretariat for Planning and Management of Pernambuco (Seplag, 2020). Also, information regarding the population and coordinates of the cities was collected from the Brazilian Institute of Geography and Statistics (IBGE, 2019).
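The worst-case flow described in this section (the largest daily flow observed for each pair of cities within a period) can be illustrated with a minimal sketch. It assumes that the daily origin-destination matrices are available as plain nested lists keyed by date; the function name `worst_case_flow`, the two-city toy data, and the date boundaries passed in the usage lines are illustrative assumptions, not the authors' code or data.

```python
from datetime import date


def worst_case_flow(daily_od, start, end):
    """Element-wise maximum of the daily OD matrices with start <= day <= end.

    daily_od maps a date to a square matrix (list of rows); entry [i][j] is the
    estimated share of city i's population that moved to city j on that day.
    """
    selected = [m for day, m in daily_od.items() if start <= day <= end]
    if not selected:
        raise ValueError("no OD matrix in the requested period")
    n = len(selected[0])
    return [[max(m[i][j] for m in selected) for j in range(n)] for i in range(n)]


# Toy data for two hypothetical cities on three days (NOT data from the paper).
daily_od = {
    date(2020, 3, 10): [[0.90, 0.10], [0.05, 0.95]],
    date(2020, 3, 14): [[0.80, 0.20], [0.02, 0.98]],
    date(2020, 3, 20): [[0.97, 0.03], [0.01, 0.99]],
}

# One worst-case matrix per period, split at the start of social distancing.
before = worst_case_flow(daily_od, date(2020, 1, 1), date(2020, 3, 15))
after = worst_case_flow(daily_od, date(2020, 3, 16), date(2020, 3, 31))
```

Only the off-diagonal entries are meaningful as worst-case travel flows here; the diagonal simply carries the largest observed share of people who stayed in their city.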

Network analysis and risk exposure amongst cities

In this section, definitions that may be necessary for a better understanding of the results and analysis of this article are presented. The sources cited here can be consulted by readers who want a more complete and in-depth explanation of the measures and definitions. Following Bondy and Murty (2008), let $G = (V, A)$ be a directed graph, or digraph, where $V$ represents a set of vertices and $A$, disjoint from $V$, consists of a set of directed arcs together with an incidence function that associates each arc of $G$ with an ordered pair of vertices. Let each vertex (or node) $v_i \in V$ represent one city, where the set of vertices (cities) is obtained from the origin-destination data collected from anonymized mobile phone location data. A proportion of the population of city $v_i$ travels to each of the other cities $v_j$. Let this proportion be the weight $w_{ij}$ of the associated arc $(v_i, v_j)$, with tail in $v_i$ and head in $v_j$. Then, the digraph, together with the weights of its arcs, is called a weighted directed graph. Consider four cities $v_1$, $v_2$, $v_3$, and $v_4$. A directed graph is illustrated in Fig. 2.
Fig. 2

Directed Graph.

Let $\theta$ be a threshold used to filter the connections amongst cities based on the data. Let $w_{ij}$ be the weight associated with the arc that has its tail in city $v_i$ and head in city $v_j$. Then, a graph constrained by the threshold $\theta$ can be obtained from the original graph using the following rule: an arc between two vertices $v_i$ and $v_j$ exists if and only if $w_{ij} \geq \theta$ or $w_{ji} \geq \theta$. In other words, this threshold represents the fractional volume of people that implies a well-established communication amongst cities, therefore indicating that there is a relevant proportion of people that travel from city $v_i$ to $v_j$. This well-established communication is associated with frequent travel related to family, work, and the consumption of services and goods. For instance, analyzing Fig. 2, if we assume that only four of the flows between pairs of cities are above the threshold, the newly constructed graph could be expressed as in Fig. 3.
Fig. 3

Directed Graph considering a threshold $\theta$.

Therefore, this threshold implies a risk exposure considering the connectedness amongst cities. Thus, depending on the tolerance level described by a threshold $\theta$, cities within a province may be divided into clusters considering the dynamics of the flow of people amongst those cities.
Force of Infection Measure (FOI) (Keeling & Rohani, 2008): Intending to simplify the management of infected clusters, a proposition is to rank them in order of importance according to the FOI of a cluster, represented here as $\lambda$, which is the rate at which a single individual contracts the disease (Keeling & Rohani, 2008) and is given by Eq. (1). In Eq. (1), $I_i$ is the number of infectious individuals from a city $i$ and $N_i$ is the population size of $i$. The parameter $\beta$ is the product of the contact rate ($\kappa$) and the probability of transmission ($t$), therefore $\beta = \kappa t$. For a more conservative analysis, the probability of transmission was considered equal to 100 %, and the mean strength of the cluster (Barrat, Barthélemy, Pastor-Satorras, & Vespignani, 2004) was used as a proxy for the contact rate.
City Risk Exposure Measure: given by Eq. (2). This measure uses the likelihood of one individual from a city having the infection. In this equation, the contamination within a city and its flow of people to different cities are assumed to be independent events.
Cluster Risk Exposure Measure: given by Eq. (3).
The following intersections of events may occur: city $i$ is infected by cities $j$ and $k$ (two cities simultaneously), city $i$ is infected by more than two cities simultaneously, and so on. For both risk measures, a conservative approach has been assumed, therefore including the redundant intersections of events, i.e., being contaminated by more than one city at the same time.
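The measures in this section can be sketched in code under explicit assumptions, since the equations themselves are not reproduced in this text. The sketch below uses the textbook frequency-dependent force of infection (the rate parameter times infectious individuals over population), pooled over the cities of a cluster; it takes the mean out-strength of arcs inside the cluster as the contact-rate proxy; and it approximates a city's exposure by summing, over the other cities, the inflow weight times that city's share of infectious individuals, without subtracting overlapping events. All three functional forms, the function names, and the toy numbers are assumptions for illustration and may differ from the paper's Eqs. (1) to (3).

```python
def mean_strength(weights, cluster):
    """ASSUMED proxy: mean out-strength of the cities, counting arcs inside the cluster only."""
    strengths = [sum(weights[i][j] for j in cluster if j != i) for i in cluster]
    return sum(strengths) / len(strengths)


def force_of_infection(cluster, infectious, population, contact_rate, p_transmission=1.0):
    """ASSUMED pooled form: beta * sum(I_i) / sum(N_i), with beta = contact_rate * p_transmission."""
    beta = contact_rate * p_transmission
    total_infectious = sum(infectious[i] for i in cluster)
    total_population = sum(population[i] for i in cluster)
    return beta * total_infectious / total_population


def city_exposure(i, weights, infectious, population):
    """ASSUMED conservative sum over origin cities j of w[j][i] * I_j / N_j (overlaps not removed)."""
    return sum(
        weights[j][i] * infectious[j] / population[j]
        for j in range(len(population))
        if j != i
    )


# Toy inputs for three hypothetical cities (NOT data from the paper).
weights = [
    [0.90, 0.20, 0.01],
    [0.10, 0.88, 0.02],
    [0.03, 0.04, 0.93],
]
infectious = [10, 0, 5]
population = [1000, 500, 2000]

cluster = [0, 1]
kappa = mean_strength(weights, cluster)  # contact-rate proxy
foi = force_of_infection(cluster, infectious, population, kappa)  # p_transmission = 100 %
exposure_city_1 = city_exposure(1, weights, infectious, population)  # city with no active cases
```

Ranking clusters then amounts to sorting them by the value returned from `force_of_infection`, and cities without active cases can be compared through `city_exposure`.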

Case study

The state of Pernambuco has an estimated population of more than 9.5 million people. The first occurrence of COVID-19 was reported on March 12, and until March 31, 87 cases had been observed. In April, the number of cases reported per day began to multiply, as shown in (Ministério da Saúde, 2020). In the third week of March, the governor of Pernambuco started to declare the first social isolation rules. Shopping malls, schools, and universities are examples of organizations that stopped operating. Only essential businesses were to remain open. Despite these efforts, the number of occurrences has not stagnated, and the growth of the cumulative number of cases and deaths reported since the first occurrence can be found in the official Brazilian sources (Ministério da Saúde, 2020). With the increasing number of cases, the Pernambuco government announced investments in new intensive care and infirmary units dedicated to treating COVID-19 patients. Nevertheless, the occupation of those units has been high and reached 99 % between April 20 and April 21. Since April 22, the occupation has been between 96 % and 98 %. Fig. 4 details the evolution of the number of units and their occupation in Pernambuco.
Fig. 4

Number and Occupation of ICU and infirmary units in Pernambuco. Source: (Seplag, 2020).

The state publishes a report at the end of each day, in which the cities are divided according to their location into 12 health regions. However, no information regarding movements between cities is presented in this report. As mentioned in Section 2.1, mobile phone location data have been used to obtain information about travels between cities in Pernambuco. Therefore, the percentage of the population of city $i$ that traveled to city $j$ on a given date was inferred. Fig. 5 presents the time series of the unidirectional flow of people traveling from Recife to Caruaru and from Caruaru to Garanhuns between January 01 and March 31. The dates are represented on the x-axis and the percentage of the population that traveled between two cities is shown on the y-axis. The vertical red line indicates the moment when social isolation policies were implemented in the state. In addition, the chart peaks indicate the maximum flows before (in red) and after (in green) social distancing. As can be observed, the flow between the cities decreased on March 15, indicating the effect of the implemented policies. Section 3.2 presents the identification and ranking of clusters and isolated cities concerning the adopted risk measures based on the maximum flow calculated for each pair of cities.
Fig. 5

Historic people flow from Recife-Caruaru and Caruaru-Garanhuns between Jan 01 and Mar 31.


Network clustering

The experiments developed in this section were performed using the python-igraph library (Csardi & Nepusz, 2006), a Python library developed for network science that contains functions to build graphs and develop quantitative analyses, and also offers graph visualization tools. Also, data concerning the active coronavirus infectious individuals in Pernambuco were collected from the Secretariat for Planning and Management of Pernambuco (Seplag, 2020). Experiments were performed to evaluate the connection dynamics and the economic recovery possibilities of the clusters produced as the connection threshold $\theta$ increases from 0 (complete network with original connections) to 0.2. As the threshold grows from 0 to 0.2, clusters of cities with relatively strong connections are revealed. We considered that a representative cluster set consists of a good number of disjoint regions that cover the majority of the cities of the state. There was no reason to investigate thresholds beyond 0.2, since we could not obtain a representative set of clusters when the threshold is greater than that value. Experiments were performed in two scenarios, one considering data collected before 2020-03-15 and the other considering data collected after it, as this date marks the starting point of the social distancing interventions in Pernambuco. Both scenarios considered the maximum flow of travels amongst the cities during the respective periods (January 01 to March 15 and March 16 to March 31). Fig. 6 illustrates the generation of clusters as the threshold $\theta$ decreases. It can be highlighted that as $\theta$ is relaxed, more connections between cities are found. If $\theta = 0$, then the complete network associated with the original data is obtained. Of course, as $\theta$ increases, the cities with strong dependency (not necessarily mutual) form isolated clusters, and cities with weak dependency get isolated due to the inferred mobility pattern.
Fig. 6

Network structures for different threshold values.

Since the generated clusters are fully separated, one can use them to continuously classify regions into two categories: those in which policymakers and epidemiologists may propose a protocol for border/travel control when necessary, and those where social restrictions must be maintained or implemented. Policymakers and epidemiologists may also consider that, with specific protocols, some regions can receive the necessary care and border controls together with a plan for the gradual recovery of activities. In both categories, policymakers should consider border controls so that infected individuals do not leave their original region. Such an isolation barrier protocol is particularly important when a region/cluster is classified as feasible for economic recovery planning, in order to prevent the entrance of infectious individuals. The following subsections aim to answer three questions: “How to establish isolation barriers to contain the outbreak while minimizing social-economic losses?”, “What are the possibilities of representative mobility clusters available for the state?”, and “Once the representative set of clusters is selected, how to assess risk and establish priorities for mitigation actions?”

An exploratory analysis of cities’ network persistence

This subsection examines how the clusters of strongly related cities were affected by the government's isolation protocols and which representative sets of clusters are available for the state. Clusters are structures with a relatively higher probability of exchanging infected individuals amongst the cities belonging to them. When a threshold $l$ is used to construct a cluster, the probability of mobility inside it is greater than or equal to $l$, and the probability of mobility coming from outside it is less than $l$. A decrease in the threshold reduces the anomalous behavior that comes from outside each cluster. Nevertheless, if the threshold is too small, few clusters are constructed, and it is more difficult to control the spread inside a cluster. When a few cities with no reported COVID-19 cases are placed within the same cluster, travel barriers could be used to protect/isolate them. The first section of the supplementary material presents a more detailed exploratory analysis of the persistence of the state’s networks and the characteristics of the generated clusters. Fig. 7 shows how the population included or not in clusters with more than one city behaves as the threshold varies. Considering a threshold of $l = 0.15$, for instance, 186 (172 before isolation) cities in individual clusters cover about 70 % (60.44 %) of the total population. This can be interpreted as a large proportion of individuals having less than a 15 % chance of mobility between cities.
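The two quantities tracked in this analysis (the number of cities left in individual clusters and the share of the state population they cover) can be computed as in the following sketch. The helper name `isolated_city_coverage`, the clusters, and the populations are hypothetical and serve only to illustrate the calculation.

```python
def isolated_city_coverage(clusters, population):
    """Return (number of single-city clusters, share of population in them).

    clusters   : list of sets of city names
    population : dict mapping city name -> number of inhabitants
    """
    isolated = [next(iter(c)) for c in clusters if len(c) == 1]
    total = sum(population.values())
    covered = sum(population[city] for city in isolated)
    return len(isolated), covered / total


# Hypothetical clusters and populations, for illustration only.
clusters = [{"A", "B"}, {"C"}, {"D"}]
population = {"A": 500, "B": 300, "C": 150, "D": 50}

n_isolated, share = isolated_city_coverage(clusters, population)
print(n_isolated, f"{share:.1%}")
```

Repeating this calculation for each threshold value yields the kind of curve summarized in Fig. 7.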
Fig. 7

The number of isolated cities and the state population covered by those isolated cities when varying the threshold.

Thus, the chosen threshold also implies a likelihood below $l$ of exceptional communication with cities not included in any cluster plotted in the above figures. Another way is to fix (or accept) a relatively smaller threshold value and to create more clusters to subdivide the state, based on local communications amongst cities. These clusters can be used to manage the minimization of the COVID-19 spread and to plan a gradual economic recovery of the cities. To show the clusters on the map of Pernambuco, Fig. 8 plots the vertices of the directed graph at the coordinates of their respective cities. In this figure, a threshold of $l = 0.025$ is used for the data regarding the flows after the implementation of social isolation rules. Two areas of the state are highlighted with a zoomed view below the main map.
Fig. 8

Cluster generated by l = 0.025 plotted in the state area.


Risk measurement results

The objective of this section is to show how to prioritize clusters and manage mitigation actions in the state. The first step of the risk analysis was to classify clusters as not infected or infected. A cluster is infected if it includes at least one infected city. The classification results are presented in Fig. 9. Six clusters were found to be not infected ($C_k^0$) and 12 clusters were found to be infected ($C_k^1$). These clusters were colored according to their condition and indexed with a number k inside a colored square.
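The classification rule stated above (a cluster is infected if at least one of its cities is infected) can be sketched as follows. The function name `classify_clusters`, the clusters, and the case counts are hypothetical; cities missing from the case data are assumed to have zero active cases.

```python
def classify_clusters(clusters, active_cases):
    """Split clusters into (non_infected, infected) lists.

    A cluster counts as infected when at least one of its cities
    has a positive number of active cases.
    """
    non_infected, infected = [], []
    for cluster in clusters:
        # Assumption: a city absent from active_cases has zero cases.
        if any(active_cases.get(city, 0) > 0 for city in cluster):
            infected.append(cluster)
        else:
            non_infected.append(cluster)
    return non_infected, infected


# Hypothetical clusters and case counts, for illustration only.
clusters = [{"A", "B"}, {"C", "D"}, {"E", "F", "G"}]
active_cases = {"A": 0, "B": 3, "E": 0}

healthy, infected = classify_clusters(clusters, active_cases)
print("non-infected:", healthy)
print("infected:", infected)
```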
Fig. 9

Clusters after classification: six non-infected clusters ($C_k^0$) and twelve infected clusters ($C_k^1$). Green polygons indicate healthy clusters and red polygons indicate infected clusters. A square near each cluster indicates its index k (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article).

Once the clusters are separated, their risk can be assessed to guide decisions using the metrics detailed in Section 2. The FOI measure is used to evaluate the criticality of infected clusters of cities or of an isolated infected city, so epidemiologists may use this information for planning social isolation policies. In contrast, the Risk Exposure measures are used for non-infected cities and clusters and support decisions on isolation barrier policies that keep economic activities running and prevent infection. Table 3 presents the 12 infected clusters, ordered by their respective FOI. In this table, the first column indicates the clusters' indexes, and the reader can check the cities that constitute each cluster in Table 1 in the Supplementary Material. The second column presents the FOI of each cluster, given by Eq. (1), where the estimated actual active cases for each city were extrapolated from the number of officially reported cases. This extrapolation was necessary since only a small portion of the infected population effectively goes to a hospital and takes a COVID-19 test. There are reports that the persons seeking hospital and health support are those with more severe symptoms (Day, 2020; World Health Organization, 2020b), and the same is being reported for Pernambuco. We assumed that the registered cases are only 20 % of the total number of cases that can transmit the disease, since there is a general perception of underreporting due to the small number of tests.
The third column presents the difference between the FOI of consecutive clusters, and the fourth column presents the ratio between consecutive differences. Finally, the fifth column shows the cumulative ratio, and its final value summarizes the relative risk difference across all pairs. When the sign '-' appears in an entry of the table, there is not enough information to calculate the associated output. This happens in the last three columns, because calculating their values requires at least one row below the row being calculated.
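The column definitions above can be made concrete with a short sketch. It assumes the three derived columns are, in order, the difference between consecutive ranked values, the ratio of each difference to the next one, and the running sum of those ratios expressed as a percentage. The function name `ranking_table` is hypothetical, and the input values are the first four FOI entries of Table 3 as published (rounded), so the computed ratios differ slightly from the published ones.

```python
def ranking_table(values):
    """Build rows (label, value, diff, diff_ratio, cumulative_percent).

    values : dict mapping label -> risk measure (for example FOI)
    Entries that cannot be computed are None, mirroring the '-' sign.
    """
    ranked = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    # Difference between each value and the next one in the ranking.
    diffs = [ranked[i][1] - ranked[i + 1][1] for i in range(len(ranked) - 1)]

    rows, cumulative = [], 0.0
    for i, (label, value) in enumerate(ranked):
        diff = diffs[i] if i < len(diffs) else None
        ratio = cum = None
        # The ratio needs the next difference, i.e. two rows below this one.
        if i + 1 < len(diffs) and diffs[i + 1] != 0:
            ratio = diffs[i] / diffs[i + 1]
            cumulative += 100.0 * ratio
            cum = cumulative
        rows.append((label, value, diff, ratio, cum))
    return rows


# First four FOI values of Table 3, as published (rounded).
foi = {"C1": 0.003282, "C2": 0.000718, "C3": 0.000645, "C4": 0.000481}

rows = ranking_table(foi)
for row in rows:
    print(row)
```

The last row has no difference and the last two rows have no ratio, which corresponds to the '-' entries described above.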
Table 3

Force of Infection of infected clusters. Data is ranked in descending order.

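| $C_k^1$ | $FOI(C_k^1)$ | $FOI(C_k^1) - FOI(C_{k+1}^1)$ | $\frac{FOI(C_k^1) - FOI(C_{k+1}^1)}{FOI(C_{k+1}^1) - FOI(C_{k+2}^1)}$ | Cumulative percentage (DiffRatio) |
|---|---|---|---|---|
| $C_1^1$ | 0.003282 | 0.002564 | 35.23451 | 3,523.45 % |
| $C_2^1$ | 0.000718 | 7.28E-05 | 0.443572 | 3,567.81 % |
| $C_3^1$ | 0.000645 | 0.000164 | 12.70199 | 4,838.01 % |
| $C_4^1$ | 0.000481 | 1.29E-05 | 0.091584 | 4,847.17 % |
| $C_5^1$ | 0.000468 | 0.000141 | 6.426344 | 5,489.80 % |
| $C_6^1$ | 0.000327 | 2.19E-05 | 1.670476 | 5,656.85 % |
| $C_7^1$ | 0.000305 | 1.31E-05 | 0.484689 | 5,705.32 % |
| $C_8^1$ | 0.000292 | 2.71E-05 | 2.793969 | 5,984.71 % |
| $C_9^1$ | 0.000265 | 9.7E-06 | 0.157326 | 6,000.45 % |
| $C_{10}^1$ | 0.000255 | 6.17E-05 | 0.460822 | 6,046.53 % |
| $C_{11}^1$ | 0.000194 | 0.000134 | - | - |
| $C_{12}^1$ | 5.99E-05 | - | - | - |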
When analyzing Table 3, it can be observed that Cluster $C_1^1$ is the most critical and should be prioritized by the policymakers. Policymakers can also visualize the risk gap amongst clusters in order to prioritize initiatives under a resource constraint and to highlight how clusters differ in city sizes and populations. For example, in the fourth column, the difference from Cluster $C_1^1$ to Cluster $C_2^1$ is more than 35 times the difference from Cluster $C_2^1$ to Cluster $C_3^1$. This magnitude is expressive and shows the relative severity of cluster $C_1^1$, which contains Recife. Furthermore, the fifth column shows the cumulative ratio, and its final value shows the size of the risk range that needs to be mitigated. This additional information is useful given the resource limitations faced by the managers. Some clusters contribute to the risk management process with relatively more impact. This is the case for the clusters at the top of the ranking, which together contain more than half of the DiffRatio, or risk range, to be mitigated. Thus, the public manager can discuss with epidemiologists which of these clusters to address first and, depending on the budget, choose more than one for simultaneous risk mitigation. In Table 4, the metrics discussed above for Table 3 are presented for isolated infected cities, which are not within any cluster. In this case, Cachoeirinha is the most critical city. As discussed before, the impact of each isolated city on the risk mitigation process can also be seen. The first three cities (Cachoeirinha, Ipubi, and Tupanatinga) can be addressed first in order to make the risk mitigation process more efficient.
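One way to read the cumulative column under a budget constraint is to ask how many top-ranked clusters are needed to cover a given share of the total DiffRatio. The sketch below is an illustration of that reading, not a procedure stated by the authors. The function name `prefix_covering` and the `target_share` parameter are hypothetical, and the DiffRatio values are transcribed from the fourth column of Table 3.

```python
def prefix_covering(diff_ratios, target_share=0.5):
    """Return the shortest ranked prefix whose DiffRatio sum reaches the target.

    diff_ratios  : list of (label, diff_ratio) pairs in ranking order
    target_share : fraction of the total DiffRatio to be covered
    """
    total = sum(ratio for _, ratio in diff_ratios)
    running, chosen = 0.0, []
    for label, ratio in diff_ratios:
        chosen.append(label)
        running += ratio
        if running >= target_share * total:
            break
    return chosen, running / total


# DiffRatio values transcribed from Table 3 (fourth column).
table3 = [
    ("C1", 35.23451), ("C2", 0.443572), ("C3", 12.70199), ("C4", 0.091584),
    ("C5", 6.426344), ("C6", 1.670476), ("C7", 0.484689), ("C8", 2.793969),
    ("C9", 0.157326), ("C10", 0.460822),
]

for target in (0.5, 0.9):
    chosen, share = prefix_covering(table3, target)
    print(target, chosen, f"{share:.1%}")
```

A greedy prefix is used here only because the table is already ranked; a manager with different costs per cluster would need a different selection rule.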
Table 4

Force of Infection of isolated cities.

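| $v_j$ | $FOI(v_j)$ | $FOI(v_j) - FOI(v_{j+1})$ | $\frac{FOI(v_j) - FOI(v_{j+1})}{FOI(v_{j+1}) - FOI(v_{j+2})}$ | Cumulative percentage (DiffRatio) |
|---|---|---|---|---|
| Cachoeirinha (35) | 0.003357 | 0.001127208 | 0.978682 | 97.87 % |
| Ipubi (35) | 0.00223 | 0.001151761 | 1.758373 | 273.71 % |
| Tupanatinga (15) | 0.001078 | 0.000655015 | 12.13323 | 1,487.03 % |
| Inajá (5) | 0.000423 | 5.39852E-05 | 1.67463 | 1,654.49 % |
| Panelas (5) | 0.000369 | 3.22371E-05 | 5.69073 | 2,223.56 % |
| Ibimirim (5) | 0.000337 | 5.66484E-06 | 0.144104 | 2,237.98 % |
| São Bento do Una (10) | 0.000331 | 3.93107E-05 | - | - |
| São José do Belmonte (5) | 0.000292 | - | - | - |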
Metrics regarding non-infected clusters and cities are presented in Table 5 and Table 6, respectively. Table 5 details the non-infected clusters ordered by their risk exposure ($R(C_k^0)$). In Table 5, the second column contains the risk of infection of the non-infected clusters. The cities within each cluster are presented in Table 2 in the Supplementary Material. It is interesting to observe that one of these clusters, formed by seven cities, is free of active cases. As in Table 3, the third, fourth, and fifth columns display the differences and ratios for each pair of clusters, to give a better notion of how they relate from a risk perspective.
Table 5

Cluster Exposure Risk measure calculated for all "healthy" clusters.

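| $C_k^0$ | $R(C_k^0)$ | $R(C_k^0) - R(C_{k+1}^0)$ | $\frac{R(C_k^0) - R(C_{k+1}^0)}{R(C_{k+1}^0) - R(C_{k+2}^0)}$ | Cumulative percentage (DiffRatio) |
|---|---|---|---|---|
| $C_1^0$ | 0.000263 | 0.000129 | 3.367017 | 336.70 % |
| $C_2^0$ | 0.000134 | 3.84E-05 | 84.68562 | 8,805.26 % |
| $C_3^0$ | 9.56E-05 | 4.54E-07 | 0.023019 | 8,807.57 % |
| $C_4^0$ | 9.51E-05 | 1.97E-05 | 0.345152 | 8,842.08 % |
| $C_5^0$ | 7.54E-05 | 5.71E-05 | - | - |
| $C_6^0$ | 1.83E-05 | - | - | - |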
Table 6

City Exposure Risk Measure calculated for not infected isolated cities.

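| $v_j$ | $R(v_j)$ | $R(v_j) - R(v_{j+1})$ | $\frac{R(v_j) - R(v_{j+1})}{R(v_{j+1}) - R(v_{j+2})}$ | Cumulative percentage (DiffRatio) |
|---|---|---|---|---|
| Sertânia | 0.000108 | 3.15678E-05 | 4.81209 | 481 % |
| Araripina | 7.63E-05 | 6.56011E-06 | 1.158786 | 597 % |
| Custódia | 6.97E-05 | 5.66119E-06 | 0.280576 | 625 % |
| Águas Belas | 6.41E-05 | 2.01771E-05 | 2.213617 | 847 % |
| Iati | 4.39E-05 | 9.11497E-06 | 0.690857 | 916 % |
| Santa Maria da Boa Vista | 3.48E-05 | 1.31937E-05 | 4.134569 | 1,329 % |
| Santa Terezinha | 2.16E-05 | 3.19108E-06 | 0.462765 | 1,375 % |
| Parnamirim | 1.84E-05 | 6.89567E-06 | 15.73421 | 2,949 % |
| Betânia | 1.15E-05 | 4.3826E-07 | 0.111906 | 2,960 % |
| Casa Nova | 1.11E-05 | 3.91633E-06 | 0.682667 | 3,028 % |
| Cedro | 7.15E-06 | 5.73681E-06 | 249.3917 | 27,967 % |
| Sento Sé | 1.42E-06 | 2.30032E-08 | 0.017471 | 27,969 % |
| Remanso | 1.39E-06 | 1.31665E-06 | 44.25532 | 32,395 % |
| Campo Alegre de Lourdes | 7.56E-08 | 2.97512E-08 | - | - |
| Pilão Arcado | 4.58E-08 | - | - | - |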
It can be seen that the clusters that contribute most to the exposure risk mitigation are $C_1^0$ and $C_2^0$, since they are associated with almost all of the risk range to be mitigated. Table 6 presents the city exposure risk, calculated for non-infected isolated cities. According to this measure, Sertânia is the city with the highest risk of becoming infected, based on the location flows used in this application. The first eleven cities (Sertânia to Cedro) are those that contributed most to the variation in the range of risk to be mitigated and should be given special attention. This section presented a risk mitigation process that considers groups of cities and minimal intervention on their economic relationships. The tools to prioritize infected clusters ($C_k^1$) and healthy clusters ($C_k^0$) were risk measures that quantify a range of risk to be mitigated. For each type of cluster, there is a subset of clusters that contributes more efficiently to the quantified range of risk to be mitigated. Also, not all cities were allocated to clusters under the adopted threshold. However, using the proposed tools, one can also investigate the situation of the infected and healthy isolated cities. Suppose an isolated city is healthy, has low exposure risk, and is geographically close to other healthy clusters. In that case, one can think of ways to allocate this city to a cluster without increasing that cluster’s exposure risk, by considering the links (or strong relationships) between the cities of the cluster and the isolated city. Finally, since the COVID-19 reports are made available to the public daily, these risk tables must be recalculated on a daily basis.

Further discussions

When analyzing a biological calamity scenario such as the ongoing COVID-19 pandemic, it is essential to consider that people's behavior is affected, including the flows of people traveling between cities. For this analysis, data before and after the implementation of social isolation policies in the state were considered. In the calculation of the risk measures, only the networks produced with data compatible with the social isolation behavior were used. The networks were constructed considering the maximum observed flow within the observed isolation state, ensuring a pessimistic approach. A pessimistic approach was also adopted when calculating the risk measures: the number of reported COVID-19 cases within each city was considered to be only 20 % of the total cases due to under-reporting. Therefore, an estimated number of infected people was used, considerably higher than the number found in the reports. The entire cluster is considered contaminated if a single COVID-19 case is reported for any one city of the cluster. In this paper, the focus is on the spread of the disease amongst the cities of the state. Therefore, the decisions based on these metrics are related to the implementation of isolation barriers to isolate non-infected cities. The risk exposure metrics show which cities that are not yet infected have a mobility relationship with infected cities. For instance, Table 6 shows that Sertânia is the most exposed non-infected city. This result can be used to support a decision of the governor or mayor (policymaker) regulating the flow of people entering the city. On the other hand, the FOI is applied to clusters and cities already infected. As this metric is based on the proportion of the city population infected at the moment, it can be used to support social isolation or lockdown policies. 
Nevertheless, this is a multidimensional decision that should consider the city's healthcare system and the socio-economic impact of totally closing the city on its population. For example, in a low-income Brazilian neighborhood, a small house with one or two rooms may be the home of a whole family. All these aspects should be taken into account by policymakers, although they are not the focus of this paper. Epidemiologists must support the definition of lockdown policies, helping policymakers to assess all consequences when defining a lockdown procedure. It is essential to highlight that Pernambuco is similar to other states in Brazil, where the capital (here, Recife) is a center of gravity in terms of mobility and concentration of population. Therefore, it is the node with the highest strength within the graph. Since our approach was based on the persistence of the network, it is important to notice that only those arcs with a weight higher than 2.5 % were considered to build the final cluster scenario. Since Recife is about 80 km from the nearest border of Pernambuco, there is minimal flow to other cities compared to the overall mobility. The data collected considered all mobile devices that were found within the cities of Pernambuco, and this includes devices from persons arriving from cities located in other states, including those that border Pernambuco. The complete database of mobile phone devices monitored by In Loco Company has 60 million devices. From this database, only about 2 million devices had a visit record in Pernambuco during the data collection. A 2.5 % share of the mobility flow is not considered when building these clusters of cities. So, there could be an error associated with this unconsidered population flow, but it would not significantly change the clustering analysis, since the analysis was based on the network persistence built from such a large dataset. 
Our contribution is to provide information so that policymakers may effectively control this 2.5 % of population flow, for instance by establishing further contact tracing or quarantine protocols, instead of controlling all population flow without a strategic priority. With this perspective, epidemiologists can structure isolation and barrier protocols that ensure the safety of the essential population flows required for food and other essential supply chain activities. There are open issues and challenges to be addressed in future research. The main challenges related to this research are the integration of the isolation barriers and our analysis with SEIR compartment models, Bayesian SEIR models, and other quantitative simulation models. Also, one could perform spatiotemporal clustering to analyze different cluster dynamics. Finally, another issue concerns how to connect healthy isolated cities or small clusters to relatively big healthy clusters in a way that does not increase the cluster’s exposure risk.

Conclusions

In this paper, networks based on mobile phone location data were used to analyze the isolation of cities in one of the most affected states of Brazil. Mobility thresholds were used to construct clusters and investigate the behavior of 192 cities before and after social distancing policies were implemented. It was therefore possible to observe which cities connect weakly and which connect strongly with each other. Among the contributions of this approach, the clustering scenarios based on weighted flows can be used to aid rule-makers when deciding whether to isolate one city or a set of cities with border controls. Further developments of this study include the risk assessment of clusters, which involves ranking them from the most critical to the least critical, depending on their situation, based on the adopted risk measures. Isolated clusters with no cases can receive priority when economic recovery policies start to be implemented. Therefore, economically viable clusters can be generated, preserving strong and weak relationships amongst cities. Thus, the idea is that our proposition may be used to provide intelligence, so policymakers may consider it together with the epidemiologists' perspective and define regions of interest for isolation and progressive economic recovery. The isolation protocols and the criteria for locking down and opening areas shall be defined by epidemiologists, considering the timeline of infections and the available facilities for medical treatment. Future work can include the analysis of small-world effects and variations of clustering algorithms, such as those for weighted directed graphs. In this case, thresholds regarding a minimum amount of flow are not used to transform the original directed graph, so different clusters can be found and analyzed. Furthermore, aspects regarding the hospital situation of each city or region can be added to the risk analysis. 
Also, the analysis made in this paper is reproducible for other states of Brazil and other countries.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. The authors declare that they have no conflicting interests. Although the following authors, Afonso de Sá Delgado Neto, André Ferraz, and José Luciano Melo, work for In Loco Company, neither they nor their company shall receive financial benefits from this study.