
Digital contact tracing and network theory to stop the spread of COVID-19 using big-data on human mobility geolocalization.

Matteo Serafino1,2, Higor S Monteiro3, Shaojun Luo1, Saulo D S Reis3, Carles Igual4, Antonio S Lima Neto5,6, Matías Travizano7, José S Andrade3, Hernán A Makse1.   

Abstract

The spread of COVID-19 caused by the SARS-CoV-2 virus has become a worldwide problem with devastating consequences. Here, we implement a comprehensive contact tracing and network analysis to find an optimized quarantine protocol to dismantle the chain of transmission of coronavirus with minimal disruptions to society. We track billions of anonymized GPS human mobility datapoints to monitor the evolution of the contact network of disease transmission before and after mass quarantines. As a consequence of the lockdowns, people's mobility decreases by 53%, which results in a drastic disintegration of the transmission network by 90%. However, this disintegration did not halt the spreading of the disease. Our analysis indicates that superspreading k-core structures persist in the transmission network to prolong the pandemic. Once the k-cores are identified, an optimized strategy to break the chain of transmission is to quarantine a minimal number of 'weak links' with high betweenness centrality connecting the large k-cores.

Year:  2022        PMID: 35404949      PMCID: PMC9053778          DOI: 10.1371/journal.pcbi.1009865

Source DB:  PubMed          Journal:  PLoS Comput Biol        ISSN: 1553-734X            Impact factor:   4.779


Introduction

In the absence of a vaccine or treatment for COVID-19, state-sponsored lockdowns have been implemented worldwide to halt the spread of the ongoing pandemic, creating large social and economic disruptions [1-4]. In addition, some countries have also implemented digital contact tracing protocols to track the contacts of infected people and reinforce quarantines by targeting those at high risk of becoming infected [5-14]. Here we develop, calibrate, and deploy a contact tracing algorithm to track the chain of disease transmission across society using big data from mobile phone geolocalization. We then search for intelligent quarantine protocols to halt the epidemic spreading with minimal social disruptions [15-20]. Mobile phones or similar devices provide digital sources of information on human mobility and therefore offer a promising way to automate outbreak location detection. Mobile datasets generally consist of an ID associated with each user, a timestamp of the user location, and a location provided as latitude/longitude, which places the user in space. In [21] the authors propose a method to identify outbreak locations of point-source outbreaks from geo-located GPS movement data of affected individuals as recorded from mobile phones. In [22] the authors investigate whether the observed discrepancies between mobile phone datasets affect the results of epidemic simulations. Ferretti et al. [13] showed that a contact tracing app can achieve epidemic control if used by enough people, without resorting to mass quarantines. Other works combined cross-sectional survey and GPS data. For example, in [23] the authors define a contact tracing strategy that is likely to identify a sufficient proportion of infected individuals such that subsequent spread could be prevented. The proposed solutions often rely on using GPS data alone or combining GPS with self-reported infections (through a mobile app or questionnaire). Our study uses two complementary datasets.
The first includes data from the ‘Grandata-United Nations Development Programme partnership to combat COVID-19 with data’ [24]. It is composed of anonymized global positioning system (GPS) data from a compilation of hundreds of mobile applications (apps) across Latin America that allow us to track the trajectories of people (users). The data identify each mobile phone device with a unique encrypted mobile ID and specify its latitude and longitude through time, encoded by a geohash with 12-digit precision. Typically, this dataset generates ≈ 450 million GPS location data points per day across Latin America (S1 Appendix, section 1). Our analysis is focused on the state of Ceará, Brazil, where we track the geolocation of over a quarter million unique users generating over half a billion GPS datapoints during the three-month period of our study. The second dataset is an anonymized list of confirmed COVID-19 patients obtained from the Health Department authorities of the City of Fortaleza, Ceará, Brazil. The dataset contains the geohash of the residential address, the SARS-CoV-2 test detection date, and the first day of symptoms for each patient infected with COVID-19 in the city of Fortaleza over the studied period, which starts with patient zero arriving in the city and being detected on March 8, 2020. This dataset is used with the consent of the local health authorities in Fortaleza, Ceará, and its scope restricts our reconstruction of the chain of transmission of the virus to the state of Ceará. We cross-match the location of the residential address of each patient with the GPS geolocation from the mobile phone dataset, thus obtaining the encrypted mobile ID of the patients (S1 Appendix, section 7). We then trace the geolocalized trajectories of COVID-19 patients during a period of -14/+7 days from the onset of symptoms to look for contacts of the infected person.
These contacts define the chain of transmission of the disease which is obtained using the model described below.

Results

COVID-19 model

The COVID-19 spreading model is represented by a Susceptible-Exposed-Infectious-Recovered (SEIR) process [16] and considers the epidemiological profile depicted in Fig 1A. An epidemiological profile is characterized by the incubation time (time from exposure, E, to onset of symptoms), the latency (from exposure to onset of infectiousness, I), the infectious period (the period over which the patient is contagious), and the extent of the disease (from the onset of symptoms until recovery or death, R). The values of the corresponding times for SARS-CoV-2 depicted in Fig 1A are obtained from the literature [25-27]. A crucial feature of the epidemiological profile is that the onset of the infectiousness period occurs before the onset of symptoms. In other words, the latency is shorter than the incubation period, and the patient becomes contagious before she/he starts to feel the symptoms of the disease. Furthermore, the peak of infectiousness, that is, when the patient is at her/his most contagious stage, occurs, on average, about a day or two before the onset of symptoms, according to the study done in [25] at the beginning of the pandemic. These numbers are crucial to understanding the rapid spread of COVID-19. They imply that when the patient starts showing the symptoms of the disease, she/he has already transmitted the virus to the majority of the people she/he infects during the previous two days. Therefore, even if the patient isolates him/herself after feeling sick, the main transmissions have already occurred.
Fig 1

Infectiousness profiles of SARS-CoV-1 and SARS-CoV-2.

(A) Infectiousness profile of coronavirus SARS-CoV-2 responsible for COVID-19. The COVID-19 pandemic is modeled by a SEIR model. From exposure (E), the virus incubates for 5.2 days on average (12.5 days at the 95th percentile); symptoms start 2 days after the onset of infectiousness (I), and the disease lasts up to 17 days until recovery (R). We use a window of -14/+7 days from the first symptoms to detect infectious contacts and exposures. (B) Infectiousness profile of coronavirus SARS-CoV-1 responsible for SARS-2003. Data obtained from [25]. As opposed to COVID-19, we note that in this case the latency is longer than the incubation period, and the peak of infectiousness then appears after the onset of symptoms. Thus, when patients are isolated upon presenting their first symptoms, the transmission of the disease is interrupted. In this case, isolating the patients after the symptoms is an effective way to control the pandemic. On the contrary, COVID-19 in (A) is characterized by a latency shorter than incubation and, even more troublesome, by a pre-symptomatic peak of transmission appearing before the onset of symptoms. Thus, in this case, even if the patient isolates after the symptoms appear, most of her/his infections have already occurred. This indicates that the only way to stop the chain of transmission of COVID-19 is by going into the past, before symptoms, and performing contact tracing to capture and isolate the contacts of the infected person before the symptoms have appeared. This crucial difference in the epidemiological profiles of these two coronaviruses might explain why SARS was contained successfully in 2003, producing around 8,000 infections and 800 deaths, while COVID-19 kept spreading, reaching a much larger worldwide population of 250 million infections and 5 million deaths as of November 2021.

This peculiar feature of the new coronavirus implies that the only way to stop the chain of transmission (in the absence of vaccines) is to perform contact tracing to track the past contagious contacts of the patient and isolate them. That is, we need to go back in time to identify the contacts that have already occurred before the patient reports the symptoms to the health authorities. Without contact tracing, the chain of transmission cannot be broken, even if the patient enters into isolation after the onset of symptoms. This situation is exacerbated by the existence of asymptomatic cases, i.e., infected people who do not feel symptoms and can potentially transmit the disease without knowing it. The relation between latency and incubation of SARS-CoV-2 is inverted in the coronavirus SARS-CoV-1 responsible for the SARS pandemic in 2003. As we see in the epidemiological profile of SARS-CoV-1 in Fig 1B, in this case the latency is longer than the incubation period, according to studies reported in [25]. Patients of SARS 2003 become contagious a few days after the appearance of symptoms. In this case, upon isolation of the patient after reporting symptoms, the chain of transmission can be successfully broken. Thus there is no need to perform contact tracing back in time before the onset of symptoms since all contagious contacts happen during the manifestation of the disease. This situation could explain why SARS in 2003 was contained successfully without spreading worldwide, as this coronavirus infected “only” about 8,000 people with around 800 deaths worldwide. On the other hand, the new coronavirus spread to all continents, infecting 250 million people and causing 5 million deaths across the world as of November 2021. Furthermore, countries that implemented effective early contact tracing protocols, like South Korea and China [5-12], were able to contain the pandemic more successfully than countries that did not implement contact tracing protocols.
Inspired by this evidence, our study is an attempt to scientifically show how contact tracing works in a real setting. The infectiousness period of an infected person starts 2 days before and lasts up to 5 days after the onset of symptoms [25]. In this paper, we add two extra days on each side to be conservative in capturing the contacts, since these numbers come from statistical estimations of the different periods characterizing the epidemiological profile of the disease (see Ref. [25]). Thus, in principle, to trace those people potentially infected by COVID-19 patients, we track contacts from 4 days before to 7 days after the reported date of first symptoms (see Fig 1A). In addition, we extend the tracing period further back in time to also consider exposures that could come from asymptomatic cases (S1 Appendix, section 7). An exposure starts the incubation period of the infected person and can occur up to 12.5 days before the onset of symptoms (5.2 days on average, 95th percentile 12.5 days [26, 27], Fig 1A). To conservatively trace these exposure events, we add ∼2 days to this incubation period and obtain the widely used 14-day period. Hence, to trace transmission and exposure cases, we perform contact tracing over -14/+7 days from the onset of symptoms (Fig 1A). As noted above, the peak of infectiousness as well as 44% (95% confidence interval, 25–69%) of infected cases occur during the pre-symptomatic stage [25]. Thus, performing contact tracing is essential to stop the spreading of the disease.
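The window arithmetic described above can be made explicit with a short sketch. The day counts are the ones quoted in the text; the function name `tracing_windows`, the constant names, and the example onset date are hypothetical and serve only as an illustration, not as the authors' implementation.

```python
from datetime import date, timedelta

# Day counts quoted in the text, relative to the onset of symptoms.
INFECTIOUS_BEFORE = 2    # infectiousness starts 2 days before onset
INFECTIOUS_AFTER = 5     # and lasts up to 5 days after onset
SAFETY_MARGIN = 2        # extra days added to be conservative
INCUBATION_P95 = 12.5    # 95th percentile of the incubation period


def tracing_windows(onset: date) -> dict:
    """Return the transmission window and the full tracing window."""
    transmission = (
        onset - timedelta(days=INFECTIOUS_BEFORE + SAFETY_MARGIN),  # -4 days
        onset + timedelta(days=INFECTIOUS_AFTER + SAFETY_MARGIN),   # +7 days
    )
    # 12.5 + ~2 days, truncated to the 14-day period used in the text.
    exposure_days = int(INCUBATION_P95 + SAFETY_MARGIN)
    full = (onset - timedelta(days=exposure_days), transmission[1])
    return {"transmission": transmission, "full": full}


# Hypothetical onset date, chosen only for the example.
w = tracing_windows(date(2020, 3, 20))
print("transmission window:", w["transmission"])
print("full -14/+7 window: ", w["full"])
```

For an onset on March 20, the sketch yields a transmission window from March 16 to March 27 and a full tracing window starting on March 6.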

Contact model

The GPS geolocation of the trajectories of both infected and susceptible people is used to trace several layers of contacts in the transmission network using the following model (S1 Appendix, section 2). A contact at timestamp n is initiated with an infected user (source) at time t0 (see Fig 2A). The timestamp n enumerates each GPS datapoint, while t refers to the actual time attached to that point. At t0 we draw a contact area as a circle of radius r centered on the source position. We then gather all the GPS datapoints from susceptible users (targets) that enter the contact area from t0 to t0 + T, where T is the total exposure time. We follow the trajectories of source and target within the time-space area and compute the probability of infection at timestamp n as p[n] = p_s[n] ⋅ p_t[n], where p_s[n] is the spatial component and p_t[n] is the temporal component. When the average overlap between source and target is zero, then p_s[n] = 1, and when the overlap is 2r, then p_s[n] = 0. On the other hand, when the exposure time is ≥ T, then p_t[n] = 1, and it decreases to p_t[n] = 0 as the exposure time decreases (S1 Appendix, section 2). The probability p_s[n] quantifies the contact probability for two users in the same area defined by r. A contact requires not only a spatial overlap but also a temporal overlap, p_t[n], which quantifies the probability that two users met based on the time they both spent in the same area. We then combine these two probabilities for each timestamp n into their product.
Fig 2

COVID-19 contact model.

(A) Contact area used in the contact tracing model. The grey person is at the first datapoint of the source at t0. We collect all datapoints for every user in a T = 30 min forward window (t1, t2, t3, …, t0 + T) within an 8 m circle from the initial position. For each target (green and red) we compute the average position and the time spent inside the contact area (red part of the trajectory line). (B) Partial transmission tree of the outbreak of confirmed SARS-CoV-2 infections identified by contact tracing during calibration in the month of March 2020. Links go from the source of infection to the target. The colors represent the day of first symptoms for each node, and node size is the out-degree.

Contacts with low probability of infection p[n], but repeated throughout time, can also infect the target. To incorporate this effect in the model, we define the probability of infection for a series of repeated contacts, P[n], through a recursive formula from time 1 to n with P[0] = 0. The iteration of contacts between source and target, P[n], generates a higher probability of infection than a single contact p[n]. This means that there is a difference between a short single contact between two people and short repeated contacts between the same people. The latter scenario should carry a larger probability of infection than the former. While the distribution of p[n] is homogeneous without a clear threshold for an infectious contact, P[n] presents a very polarized distribution where the values accumulate at the extremes: P = 0 or P = 1 (see S1 Fig). Thus, P[n] is a better indicator than p[n] to separate infectious from non-infectious contacts. A contact is then considered infectious when this probability exceeds a certain threshold, P[n] > p.
The hyperparameters of the contact model (T, r, p) are obtained by calibrating the model using only the contacts between infected people to reproduce the basic reproduction number R0 = 2.78 in Ceará in the month of March, 2020 (S1 Appendix, section 3). We obtain T = 30 min, r = 8 m and p = 0.9. Thus, a contact is defined with probability one when exposure is at least 30 minutes within a distance ≪ 8m. This calibration procedure provides the partial transmission tree of the outbreak from patient zero to the end of the calibration period shown in Fig 2B.
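The contact model can be sketched in a few lines, under stated assumptions. The text fixes only the end points of the spatial and temporal components (written here as `spatial_component` and `temporal_component`), so the linear ramps between them are an assumption. The recursive formula for repeated contacts is not reproduced in this text; the sketch assumes independent contacts, P[n] = 1 - (1 - P[n-1])(1 - p[n]) with P[0] = 0, which is one recursion consistent with the description and not necessarily the one used by the authors. All function names are hypothetical; T, r and the threshold p are the calibrated values quoted above.

```python
T_MIN = 30.0     # total exposure time T in minutes (calibrated value)
R_M = 8.0        # contact radius r in metres (calibrated value)
P_THRESH = 0.9   # infection threshold p (calibrated value)


def spatial_component(overlap_m: float, r: float = R_M) -> float:
    """1 at overlap 0 and 0 at overlap 2r; the linear ramp is an assumption."""
    return min(1.0, max(0.0, 1.0 - overlap_m / (2.0 * r)))


def temporal_component(exposure_min: float, T: float = T_MIN) -> float:
    """1 once exposure >= T, decreasing to 0 (linear ramp is an assumption)."""
    return min(1.0, max(0.0, exposure_min / T))


def contact_probability(overlap_m: float, exposure_min: float) -> float:
    """Single-contact probability: product of the two components."""
    return spatial_component(overlap_m) * temporal_component(exposure_min)


def cumulative_probability(ps):
    """Assumed recursion P[n] = 1 - (1 - P[n-1]) * (1 - p[n]), P[0] = 0."""
    P, history = 0.0, []
    for p in ps:
        P = 1.0 - (1.0 - P) * (1.0 - p)
        history.append(P)
    return history


# Hypothetical example: the same short contact (10 min) repeated 8 times.
single = contact_probability(overlap_m=2.0, exposure_min=10.0)
history = cumulative_probability([single] * 8)
for n, P in enumerate(history, start=1):
    print(n, round(P, 3), "infectious" if P > P_THRESH else "not infectious")
```

In this toy run a single short contact stays well below the threshold, while the accumulated probability of the repeated contacts eventually exceeds it, which is the qualitative behaviour the text attributes to P[n].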

Transmission network model

Next, we create the contact network of coronavirus transmission by first tracing the trajectories of confirmed COVID-19 patients to search for contacts -14/+7 days from the onset of symptoms using the above model. From the first contact layer, we add four layers of contacts to constitute the contact network of transmission that is used to monitor the progression of the pandemic. The time-varying network is aggregated to a snapshot defined over a time window of a week [16] (S1 Appendix, section 7). We find that other aggregation windows give similar results as presented. Next, we analyze the spatio-temporal properties of the contact network. The government of the State of Ceará imposed a mass quarantine on March 19, 2020 which led to a decrease in people’s mobility by 56.5% as shown in Fig 3A. During the lockdown, only the displacements of essential workers were allowed. A large decrease in mobility is also observed across all Latin America, see [24].
Fig 3

Structural components of transmission networks across the lockdown.

(A) Evolution for different metrics in Ceará, Brazil, previous to the mass quarantine (grey area), right after the imposed quarantine (yellow area) and later. The plot shows the root mean square displacement (MSD) normalized by the maximum value over the total period (blue), the cumulative number of cases (green) and the size of the GCC normalized by the maximum value over the total period (black). The uncertainty corresponds to the standard error (SE). The mobility data is showcased in the Grandata-United Nations Development Programme map shown in https://covid.grandata.com. The initial rise in GCC is due to the lack of data before March 1. (B) The plot shows the 0.5-core size (red), the 0.5-shell size (cyan) all normalized by their respective maximum value pre-lockdown. While the size of the 0.5-shell is reduced drastically during the lockdown, the 0.5-core was not reduced as much and keeps increasing, contributing to sustain the pandemic. The 0.5-core seems to follow the trend in the MSD, which we plot again to show this trend.


Giant connected component (GCC)

To understand the effect of the lockdown on the contact network, we reason by analogy with a “bond percolation” process [16, 17, 28]. In bond percolation, the network connectivity is reduced by removing a small fraction of links (bonds) between nodes, and the global disruption in network connectivity is monitored by studying the normalized size of the giant connected component (S1 Appendix, section 4). Following this analogy, the lockdown acts as a percolation process, and therefore we monitor the GCC of the transmission network before and after the lockdown. We find a large decrease in the size of the GCC [16, 28] within 6 days of the implementation of the lockdown on March 19, when the GCC is almost fully dismantled, losing 89.6% of its pre-lockdown size (Fig 3A). Despite the disintegration of the GCC, the cumulative number of cases kept growing, albeit at a lower rate (Fig 3A). We find that the mass quarantine was able to reduce the basic reproduction number from R0 = 2.78 before lockdown to an effective reproduction number of R = 1.2 after the lockdown (Fig 3A). Despite this disruption in the network connectivity, R did not decrease below one, as would have been needed to curb the spread of the disease. The drastic reduction in the GCC is visually apparent in the contact networks in Fig 4. Before lockdown on March 19 (Fig 4A), the network is a strongly connected, unstructured “hairball”. Eight days into the lockdown, on March 27 (Fig 4B), the network has been untangled into a set of strongly connected modules linked by tenuous paths of contacts. This structure is even more pronounced a few weeks later on April 28 (Fig 4C).
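The bond-percolation picture can be illustrated on a toy network. The following is a minimal sketch in plain Python, assuming nodes are labelled 0..N-1; the function names and the ring network are hypothetical and unrelated to the Ceará data.

```python
import random
from collections import defaultdict, deque


def gcc_fraction(n_nodes, edges):
    """Largest connected component size divided by the number of nodes."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, best = set(), 0
    for start in range(n_nodes):
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:                      # breadth-first search
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best / n_nodes


def remove_bonds(edges, fraction, seed=0):
    """Keep a random subset of links after removing the given fraction."""
    rng = random.Random(seed)
    keep = len(edges) - int(round(fraction * len(edges)))
    return rng.sample(edges, keep)


# Toy ring of 10 nodes: removing two links splits it into two halves.
ring = [(i, (i + 1) % 10) for i in range(10)]
cut = [e for e in ring if e not in {(2, 3), (7, 8)}]
print(gcc_fraction(10, ring))  # 1.0
print(gcc_fraction(10, cut))   # 0.5
print(gcc_fraction(10, remove_bonds(ring, 0.3)))  # depends on the random draw
```

Tracking `gcc_fraction` while links are removed is the monitoring step described above; in the paper the links disappear because of the lockdown, not through random sampling.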
Fig 4

Evolution of GCC and k-cores over the quarantine.

Disease transmission networks in the state of Ceará over time before and after the lockdown on March 19, 2020. (A) Transmission network on March 19 (pre-lockdown). A highly connected hairball network is observed. The disconnected components of the 7-core (the maximal k-core in this network) are colored. These components are well connected into the hairball network, as expected since mobility and connectivity are high. (B) The pre-quarantine hairball in (A) has been untangled and the k-cores have emerged 8 days into the lockdown on March 27. Here, we color the nodes according to layers of the transmission network starting at COVID-19 patients (black nodes). Node size reflects degree. (C) Network on April 28 including the components of the 5-core in different colors (the maximal k-core for this network). Visible is the high betweenness centrality node representing the weak link of this k-core. (D) We plot the locations of the contacts constituting the components of the 5-core of April 28 in (C) on the map of Fortaleza. The size of the circles in the map corresponds to the number of contacts inside each location. The colors correspond to the clusters of the 5-core in (C). The 5-core sustaining transmission is composed of clusters of contacts localized in hospitals, large warehouses and business buildings. Hospital 3, one of the largest in Fortaleza, constitutes the maximal k-core of the pandemic. The underlying map comes from the Folium library of Python: https://github.com/python-visualization/folium which relies on the OpenStreetMap project [29].


Superspreading k-core structures

The highly connected modules found in Fig 4B and 4C are k-core structures [30-33] of higher complexity than the GCC (which is a 1-core), which are known to sustain an outbreak even when the GCC has been disintegrated [16, 33]. The k-core of a graph is the maximal subgraph in which all nodes have a degree (number of connections) greater than or equal to k [30-33]. The k-shell is the periphery of the k-core and is composed of all the nodes that belong to the k-core but not to the (k+1)-core (S1 Appendix, section 5). The k-core is obtained by iteratively pruning the nodes with degree smaller than k. For instance, the 3-core is obtained by removing the 1-shell and 2-shell in a k-shell decomposition process (S1 Appendix, section 5). Thus, all nodes in a k-core have degree at least k and are connected to other nodes that also have degree at least k. K-cores are nested and can be made of disconnected components (see S4 Fig). High k-cores are those with large k, up to a maximal value k_max, and constitute the innermost, most important part of the network. In theory, the high k-cores are known from network science studies to be the reservoir of disease transmission persistence [16, 33]. On the contrary, low peripheral k-shells (see S2 Fig) do not contribute as much to the spread as the high inner k-cores. Fig 3B shows that despite the disappearance of the GCC, there is a significant maximal k-core that was not dismantled by the mass quarantine. The figure shows that the outer k-shells of the transmission network (i.e., the 0.5-shell defined as the union of the k-shells with k below half of k_max) are disintegrated in the lockdown, decreasing by 91% with respect to their pre-quarantine size, in tandem with the GCC. However, the inner k-core (i.e., the 0.5-core defined as the k-core with k above half of k_max) persists in the lockdown.
The figure shows that the decrease of the 0.5-core is only 50%, compared to the 91% decrease of the 0.5-shell; the former even increases slightly at the end of April, following the same trend as mobility (see Fig 3B). This process is visually corroborated by the evolution of the networks from Fig 4A to 4C, where we observe the disappearance of the peripheral k-shells and the persistence of the maximal k-core. Indeed, the nonessential contacts in the peripheral k-shells may have been pruned first during social distancing. Using numerical simulations, we corroborate previous results indicating that the infection can persist in these high k-cores of the network while virus persistence in outer k-shells is less important [16, 33]. We use a SIR model on the transmission network (see S15 Fig) showing that the maximal k-cores of the network sustain the spreading of the disease more efficiently than the outer k-shells. Thus, the maximal k-core components of the contact network are plausible drivers of disease transmission. Apart from this structural explanation (i.e., k-core), epidemiological factors may also play a role in the persistence of the disease, such as a transition of the disease to vulnerable communities with high demographic density, or with many inhabitants per household, where isolation is poorly fulfilled. When we plot the geolocation of the contacts forming the maximal k-core on the map of Ceará, we find that these contacts take place in highly transited areas of the capital Fortaleza, such as hospitals, business buildings, warehouses, and large condominiums (see Fig 4D). These contacts generate superspreading k-core events that generalize the conventional notion of superspreaders, which refers mainly to individuals with a large number of transmission contacts [34-36]. However, connections are not everything [18, 19].
K-core superspreaders not only generate a large number of transmission contacts, but their contacts are also highly connected people, and so forth.
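The k-shell decomposition by iterative pruning described in this section can be sketched in plain Python. The function names and the toy graph are hypothetical. The split into a 0.5-core and a 0.5-shell uses the threshold k ≥ 0.5 k_max for the inner part; the handling of a node exactly at the boundary is an assumption of this sketch.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def core_numbers(adj):
    """Assign each node its k-shell index by iterative pruning."""
    degree = {u: len(nbs) for u, nbs in adj.items()}
    remaining, core, k = set(adj), {}, 0
    while remaining:
        candidates = [u for u in remaining if degree[u] <= k]
        if not candidates:
            k += 1                      # nothing left to prune at this level
            continue
        while candidates:
            u = candidates.pop()
            if u not in remaining:      # already pruned via a duplicate entry
                continue
            remaining.discard(u)
            core[u] = k                 # u is in the k-core, not the (k+1)-core
            for v in adj[u]:
                if v in remaining:
                    degree[v] -= 1
                    if degree[v] <= k:
                        candidates.append(v)
    return core


def half_core_split(core, ratio=0.5):
    """Inner part (shell index >= ratio * k_max) and outer part (the rest)."""
    k_max = max(core.values())
    inner = {u for u, k in core.items() if k >= ratio * k_max}
    return inner, set(core) - inner


# Toy graph: a 4-clique (0-3), node 4 tied to 0 and 1, node 5 hanging off 4.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (4, 0), (4, 1), (5, 4)]
core = core_numbers(build_adjacency(edges))
inner, outer = half_core_split(core)
print(core)          # clique nodes in the 3-shell, node 4 in the 2-shell, node 5 in the 1-shell
print(inner, outer)
```

In the toy graph the clique forms the maximal 3-core, and only the pendant node 5 ends up in the outer part, mirroring the distinction between peripheral k-shells and inner k-cores drawn above.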

Optimized quarantine

The existence of k-cores in the transmission network suggests that a more structured quarantine could be deployed to either isolate or destroy those cores that help maintain the spread of the virus. We perform an optimal percolation analysis [18-20] to find the minimal number of people that must be quarantined to dismantle the transmission network. We compare different strategies to find the best among them to break the network by ranking the nodes based on (1) the number of contacts (hub-removal) [16, 18, 19], (2) the largest k-shells and then by the degree inside the k-shells [16, 33], (3) the collective influence algorithm for optimal percolation [20], (4) the generalized k-core strategy [37], and (5) betweenness centrality [38-41]. Fig 5B shows the normalized size of the GCC versus the fraction of removed nodes for the different strategies, as well as for a random null model of removal, in a typical network under lockdown on April 28 (March 19 pre-lockdown results are plotted in S15 Fig). While the disease can persist in the k-cores (Fig 5A), quarantining people directly inside the maximal k-core is not an optimal strategy. The reason is that k-cores are populated by hyper-connected hubs that require many removals to break the GCC [40] (around 7%, see Fig 5B). For the same reason, directly removing the hubs is not the optimal strategy either, since the hubs are within the maximal k-core and not outside. A collective influence strategy [20] improves over hub-removal since it takes into account how hubs are spatially distributed, yet it is far from optimal. A generalized k-core strategy, which consists of sequentially removing the nodes in the k-leaf (where k = k), has been recently reported to be more suitable to study spreading behavior [37]. Fig 5B shows that, in this case, it performs similarly to the k-core strategy. The reason for this can be found in the tree structure of the network and its low average degree.
Clearly, Fig 5B shows that the best strategy is to quarantine people by their betweenness centrality. By removing just the top 1.6–2% of the high betweenness centrality people, the GCC is disintegrated. This result is consistent with the particular structure of the transmission networks seen in Figs 4B and 4C and 5.
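The attack procedure described above (rank the nodes, remove the top one, re-compute the metric, and track the GCC) can be illustrated with a minimal sketch. It is not the code used in this study: the function names, the toy graph of two cliques joined by a single bridge node, and the choice of Brandes' algorithm to obtain betweenness centrality are all assumptions made for illustration.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm, unweighted, undirected)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack, dist = [], {s: 0}
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)  # number of shortest paths from s
        sigma[s] = 1
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:  # accumulate dependencies from the farthest nodes back to s
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}  # each pair was counted twice


def degree(adj):
    """Hub-removal score: the number of contacts of each node."""
    return {v: len(nb) for v, nb in adj.items()}


def gcc_size(adj):
    """Number of nodes in the largest connected component."""
    seen, best = set(), 0
    for s in adj:
        if s in seen:
            continue
        seen.add(s)
        queue, size = deque([s]), 0
        while queue:
            v = queue.popleft()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best


def attack(adj, score, steps):
    """Remove the top-scoring node `steps` times, re-computing scores after each removal.

    Returns the GCC size, normalized by the original number of nodes, after each step.
    """
    adj = {v: set(nb) for v, nb in adj.items()}  # work on a copy
    n0 = len(adj)
    curve = [gcc_size(adj) / n0]
    for _ in range(steps):
        scores = score(adj)
        target = max(sorted(scores), key=scores.get)  # sorted() makes ties deterministic
        for w in adj.pop(target):
            adj[w].discard(target)
        curve.append(gcc_size(adj) / n0)
    return curve


def two_cliques_with_bridge(size=5):
    """Hypothetical toy network: two cliques linked only through node 'b'."""
    adj = {}

    def link(u, v):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    for prefix in ("a", "c"):
        nodes = [f"{prefix}{i}" for i in range(size)]
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                link(u, v)
    link("a0", "b")
    link("b", "c0")
    return adj
```

On this toy network, `attack(two_cliques_with_bridge(), betweenness, 1)` removes the bridge node and leaves a largest component of 5 of the 11 nodes, whereas `attack(two_cliques_with_bridge(), degree, 1)` removes a clique hub and leaves 6 of 11. The example only shows the mechanism; it says nothing about the removal fractions measured on the real networks.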
Fig 5

Weak links and k-cores.

(A) Average size of the infected population, M [33], in an outbreak, averaged over all starting nodes in a k-shell, as a function of the probability of infection β for a SIR model on the network in Fig 4C during the lockdown. The black curve is the average value over the whole network. The average divides the k-shell contributions to the spreading of the virus into two groups: above and below the average. The 0.5-cores have maximal spreading and the 0.5-shells have minimal spreading. Error bars correspond to a 95% confidence interval. (B) Optimal percolation analysis performed over the network in Fig 4C during the lockdown, following different attack strategies, and their effect on the size of the largest connected component G(q) versus the fraction of removed nodes, q. Nodes are removed (in order of increasing efficiency): randomly (blue); by the highest k-shell followed by high degree inside the k-shell [33]; by highest degree (orange); by collective influence (red) [20]; by the highest generalized k-core (brown) [37]; and by the highest value of betweenness centrality (green) [38, 39]. After each removal we re-compute all metrics. The best strategy among those studied is removing the nodes by the highest value of betweenness centrality. (C)-(D) Effect of removing three high betweenness centrality nodes shown in Fig 5B in the network of Fig 4C. (C) We show the 2-core component of the network after the removal of 12 high betweenness centrality nodes. The red node is the one with the highest betweenness centrality value (next node to remove, 13th) and the blue node is the 14th removal. Different k-cores and k-shells are in different colors. (D) Network k-cores are disintegrated after the removal of the high BC nodes.
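
The outbreak-size measurement of panel (A) can be illustrated with a minimal sketch. It assumes a discrete-time SIR process in which each infected node infects each susceptible neighbor with probability β and then recovers after one step. This recovery rule, the function names and the number of runs are assumptions for illustration, not details taken from this study.

```python
import random


def sir_outbreak_size(adj, start, beta, rng):
    """Run one SIR outbreak from `start`; return the number of nodes ever infected."""
    infected = [start]
    ever_infected = {start}  # infected or already recovered
    while infected:
        newly_infected = []
        for v in infected:
            for w in adj[v]:
                if w not in ever_infected and rng.random() < beta:
                    ever_infected.add(w)
                    newly_infected.append(w)
        infected = newly_infected  # the previous generation recovers
    return len(ever_infected)


def mean_outbreak_size(adj, start_nodes, beta, runs=200, seed=0):
    """Average outbreak size M over the given starting nodes (e.g. one k-shell)."""
    rng = random.Random(seed)  # fixed seed so that repeated calls agree
    total = sum(
        sir_outbreak_size(adj, s, beta, rng)
        for s in start_nodes
        for _ in range(runs)
    )
    return total / (len(start_nodes) * runs)
```

Passing the nodes of a single k-shell as `start_nodes` and sweeping `beta` gives one curve of M versus β per shell, which is the kind of comparison the panel describes.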

The betweenness centrality of a node is proportional to the number of shortest paths in the network going through that node. 
Thus, given the particular structure of the networks in Figs 4B, 4C and 5C, the high betweenness centrality nodes are the bottlenecks of the network, i.e., loosely-connected bridges between the largely-connected k-core components. These connectors are the “weak links”, a fundamental concept in sociology proposed by Granovetter [42], according to which strong ties (i.e., contacts in the k-cores) clump together forming clusters. A strategically located weak tie between these densely “knit clumps” then becomes the crucial bridge that transmits the disease (or information [42]) between k-cores. These weak links are people traveling among the different k-core components, allowing the disease to escape the cores into the rest of society. These bridges are displayed in the network of Fig 5C as the yellow, blue and red nodes. The removal of these high betweenness centrality people disconnects the k-core components of the network entirely, as shown in Fig 5D, halting the disease transmission from one core to the other [40, 43]. An important finding is that quarantining the large superspreading k-cores is neither optimal (as shown in Fig 5B, green curve) nor practical, since they are mainly composed of essential workers who need to remain operational (Fig 4D). Thus, the best strategy, in conjunction with a mass quarantine, is to disconnect these k-cores from the rest of the social network (Fig 5C and 5D), rather than quarantining the people inside the k-cores. This can be done by quarantining the high betweenness centrality weak links while preserving the operational k-cores. However, individuals belonging to the maximal k-cores should be tested at a higher frequency to promptly detect their infectiousness before symptoms start, to help control the spreading inside the k-cores.

Conclusion

Isolating the k-core structures by quarantining the high betweenness centrality weak links in the transmission network proves to be an effective way to dismantle the GCC of the disease while keeping essential k-cores working. While destroying the strong links and cores is a less manageable task to execute and control, isolating the weak links between cores is a more feasible task that will ensure the dismantling of the GCC. In other words, if one core is infected, the disease will be contained within that core and will not extend to the rest of society. It is worth stressing that the optimal strategy to break the transmission of the virus depends on the particular spreading dynamics of the disease, patterns of mobility, and strength of the quarantine applied to each region. As we show in Fig 5B, every centrality measure can, with a certain degree of disruption, dismantle the chain of transmission of the virus. As we can see from the same figure, betweenness centrality (BC) requires the smallest number of isolated nodes to dismantle the chain of transmission among the studied centralities. The reason why BC performs better than the other centralities can be found in the particular structure of the contact network left after the quarantine. As we show in Fig 4C, a k-core structure appears due to the strict lockdown, during which only essential workers were allowed to go out. The lockdown essentially removes the majority of the links, leaving only those inside the k-cores plus their weak links. These k-cores, which represent the virus reservoir, are generally located in hospitals, warehouses, and some particular condominiums, since they are composed mainly of the essential workers who are allowed to circulate during the quarantine. The k-cores are connected by a few links, which work as bridges for the virus transmission. 
This particular network structure explains why a BC-based ranking is able to break the transmission chain with fewer removals than other centralities, since BC better identifies the bridges that connect the k-cores. Thus, in the particular case of Fortaleza, we found that betweenness centrality provides the best ranking among the studied centralities to break the transmission chain. However, in another pandemic, or even the same pandemic under a different quarantine protocol, the particular network structure that we found in Fortaleza may not appear. Therefore we do not expect that BC will always be the best method to break the transmission chain, and each particular case should be analyzed independently. However, the strategy proposed here, which uses contact tracing and network theory, is valid for any pandemic. This includes building and monitoring the GCC of transmission as a function of time by combining GPS data with patient-list data and then testing different centralities with the objective of finding the best strategy to break the GCC. Each pandemic and quarantine may lead to a different network structure with its concomitant optimal centrality. The proposed protocol is then to investigate all centralities as done in this study and find the strategy that breaks the chain of transmission with the fewest removals. As governments around the world have been trying to roll out digital contact tracing apps to curb the spread of coronavirus [5-12], our modeling suggests possible intelligent quarantine protocols that could become key in future phases of reopening economies across the world and, in particular, in developing countries where resources are scarce. Overall, our network-based optimized protocol is reproducible in any setting and could become an efficient solution to halt the progress of the COVID-19 pandemic worldwide, drawing upon effective quarantines with minimal disruptions.

Ethics statement

This study was approved by the Institutional Review Board (IRB) at City College of New York (Approval No: 2020–0423) and by the Comitê de Ética em Pesquisa at Universidade Federal do Ceará in accordance with Resolution CNS No 510. Patient data was used with the approval and consent from the Epidemiological Surveillance Department, Fortaleza Health Secretariat and the Mayor of the Prefeitura de Fortaleza, Ceará, Brazil.

Detailed description of the data acquisition and treatment (Section 1), of the theoretical tools mentioned in the main text (Sections 2, 3, 4, 5 and 6), and discussion on the extent to which the present results would hold under implementation in terms of robustness to data quality and coverage, sampling bias on demographics such as coverage of location, socio-economic status, age and gender, and privacy (Section 7). (PDF)

Transmission probability.

(A) Probability distribution of p[n] = p[n] ⋅ p[n] (orange) and the recursive form P[n] defined in Eq (1) (blue). The P[n] are polarized to 0 and 1, becoming the best thresholded metric to use to consider a contact as infectious. (B) Average value < P[n] > as a function of the time window T of the spatio-temporal contact area. P[n] has a peak at T = 30 min; it decreases for T > 30 min and increases for T < 30 min as a function of T. The decreasing behaviour is what is expected; thus, 30 min is the minimum bound for the correct value of T. (TIFF)

Network structure under k-shell decomposition.

(A) A sample network with 3 shells. The k-shell index is not necessarily associated with other centralities. Here, the hub of the network (black), with degree k = 7, is in the 1-shell (k-shell index 1). The two top nodes in betweenness centrality, highlighted in red, belong to the 2-shell and the 3-shell, respectively. The 1-core is equivalent to the GCC. (B) The nodes with k-shell index 1 form the 1-shell, (C) the nodes with k-shell index 2 form the 2-shell, and (D) the nodes with k-shell index 3 form the 3-shell, which is also the 3-core. (TIFF)

K-cores of a network.

(A) We start the k-shell decomposition with a network configuration where every node has at least degree k = 1. This set of nodes forms a 1-core. (B) Then, every node with k = 1 is iteratively removed to obtain the 2-core. As one can see, the removal of these nodes changes the degree distribution. Thus, nodes are removed until all remaining nodes are left with k ≥ 2. (C) Following the k-shell decomposition nodes are removed until we obtain the 3-core. The 3-core can be made of multiple disconnected clusters. (TIFF)
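
The pruning procedure described in this caption can be sketched in a few lines. The function names and the adjacency-dictionary representation are assumptions made for illustration; this is the standard iterative pruning, not necessarily the implementation used in this study.

```python
def k_shell_decomposition(adj):
    """Return {node: k-shell index} by iteratively pruning low-degree nodes.

    At level k, nodes with at most k remaining neighbors are removed one by one;
    removals lower the degree of their neighbors, which may then be pruned too.
    """
    degree = {v: len(nb) for v, nb in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        prune = [v for v in remaining if degree[v] <= k]
        if not prune:
            k += 1  # nothing left to prune at this level: move to the next shell
            continue
        while prune:
            v = prune.pop()
            if v not in remaining:
                continue  # already removed (a node can be queued twice)
            remaining.discard(v)
            shell[v] = k
            for w in adj[v]:
                if w in remaining:
                    degree[w] -= 1
                    if degree[w] <= k:
                        prune.append(w)
    return shell


def k_core(shell, k):
    """The k-core is the union of all shells with index >= k."""
    return {v for v, ks in shell.items() if ks >= k}
```

For a star with one hub and seven leaves, the hub ends up in the 1-shell despite its degree of 7, because it is pruned as soon as its leaves are gone. This is the situation that the sample network in the previous supporting figure illustrates.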

K-cores decomposition.

Example of k-core and k-shell structure in the network plotted in Fig 3B obtained during the lockdown. Here the colors are set by the k-shell occupancy of each node. Each k-core is composed of the k-shell plus the (k+1)-core. The k-cores are nested structures. For instance, the 5-core in (E) is composed of the 5-shell (yellow nodes) and the 6-core, which, in turn, is composed of the 6-shell (in red) and the 7-core (in purple). Since the 7-core is the maximal k-core for this network, the 7-core is also the 7-shell. In this network the 0.5-core is the 4-core and the 0.5-shell is composed of the 1-shell plus the 2-shell and the 3-shell. We notice how a given k-core can be composed of many disconnected components. For instance, the 6-core is composed of 5 disconnected components. This is important, since each component of a given k-core can be localized in different areas of the map, like different hospitals; see, for instance, Fig 3C and 3D. It is also visually apparent that to destroy this network, a direct ‘attack’ on the high k-cores is not optimal. Instead, removing the high BC nodes that populate the lower k-shells is the best strategy. We plot each k-core in turn: (A) 1-core, (B) 2-core, (C) 3-core, (D) 4-core, (E) 5-core, (F) 6-core and (G) 7-core. (TIFF)

Degree distribution of the contact network.

Degree distribution of the contact network before (blue) and after (orange) the quarantine. (TIFF)

Evolution of the maximum k-core.

Evolution of the maximum k-core index versus time before the quarantine (grey area), right after the quarantine (yellow area) and later. We see how the maximum k-core index drops drastically after the mass quarantine. (TIFF)

Contact layers.

Contact layers of pre-symptomatic and asymptomatic cases captured by the model. Our treatment of asymptomatic cases is to increase the exposure period to -14 days to account for possible two-chains of infection as shown in the figure. Contacts between -2 days and -14 days from the day of first symptoms are more likely to be an exposure from an asymptomatic infected person. Contacts from -2 days to +7 days from first symptoms are considered to be transmission contacts from the patient. (TIFF)
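
The two contact layers can be expressed as a simple rule on the day offset of a contact relative to the patient's first symptoms. The sketch below is illustrative only: the function name and labels are hypothetical, and assigning the shared boundary at -2 days to the transmission layer is an assumption, since the caption lists that day in both windows.

```python
def contact_layer(day_offset):
    """Classify a contact by its offset (in days) from the patient's first symptoms.

    Returns "transmission" for -2 <= offset <= +7, "exposure" for -14 <= offset < -2,
    and None for contacts outside the traced window.
    """
    if -2 <= day_offset <= 7:
        return "transmission"  # the patient may have infected the contact
    if -14 <= day_offset < -2:
        return "exposure"  # the patient may have been exposed by the contact
    return None
```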

Sampling bias-coverage.

(A) Probability density function and (B) cumulative distribution function of the fraction of the population per neighborhood in Fortaleza relative to the total population. We show the real distributions and the distributions from the app GPS data. Both distributions pass a two-sample KS test, indicating that we cannot reject the hypothesis that they come from the same distribution under the test. (TIFF)
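
For reference, the two-sample Kolmogorov-Smirnov (KS) statistic used in these sampling-bias checks is the largest gap between the two empirical cumulative distribution functions. The sketch below computes it together with the standard large-sample critical value; the function names and the 5% significance level are assumptions for illustration, and the caption does not state which implementation or level was used.

```python
import math
from bisect import bisect_right


def ks_statistic(x, y):
    """Largest absolute difference between the empirical CDFs of samples x and y."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    return max(
        abs(bisect_right(xs, v) / n - bisect_right(ys, v) / m)
        for v in xs + ys
    )


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample critical value: reject 'same distribution' if D exceeds it."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    return c_alpha * math.sqrt((n + m) / (n * m))


def same_distribution_not_rejected(x, y, alpha=0.05):
    """True when the KS test does not reject the same-distribution hypothesis."""
    return ks_statistic(x, y) <= ks_critical_value(len(x), len(y), alpha)
```

Note that a result of `True` only means the test found no evidence of a difference; with small samples the critical value is large and the test has little power.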

Sampling bias-HDI.

(A) Probability density function and (B) cumulative distribution function of the fraction of the population per neighborhood with a given HDI in Fortaleza relative to the total population. We show the real distributions and the distributions from the app GPS data. A two-sample KS test indicates that we cannot reject the hypothesis that the real and GPS samples come from the same distribution, indicating lack of sampling bias under this test. (TIFF)

Sampling bias-age.

(A) PDF and (B) CDF of age distribution in the GPS geolocalized data compared with the real patient data. We cannot reject the hypothesis that both samples come from the same distribution under KS statistical testing. (TIFF)

Sampling bias-gender.

(A) PDF and (B) CDF of gender distribution in the GPS geolocalized data compared with the real patient data suggesting lack of bias. (TIFF)

K-core persistence.

Persistence of people in the k-cores in the temporal networks. We plot the percentage of people in the cores from network to network. The persistence is calculated by the overlap of people in the k-shells from a time of observation to the next (three days later in this particular example). (TIFF)
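
One way to compute such an overlap is sketched below. The caption does not give the exact normalization, so this sketch assumes that persistence is the percentage of people present in a core at one observation who are still in it at the next; the function name and input format are likewise hypothetical.

```python
def persistence(snapshots):
    """Percentage of members retained between consecutive observations.

    `snapshots` is a list of sets of anonymized user IDs, one set per observation
    time (for example, the members of a given k-core every three days).
    """
    retained = []
    for previous, current in zip(snapshots, snapshots[1:]):
        if not previous:
            retained.append(0.0)  # nothing to persist from an empty core
            continue
        retained.append(100.0 * len(previous & current) / len(previous))
    return retained
```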

Robustness to false positive.

Normalized efficacy of betweenness centrality (BC) as a function of false positives in the report of infected people. A false positive is an individual who reported having symptoms but was not infected with COVID-19. We plot the relative error in the determination of the minimal number of people to quarantine versus the false positive rate. The measure starts to deviate from linear behaviour beyond the error bars around a 20% false positive rate. (TIFF)

GPS pings distribution.

Distribution of the time interval between GPS pings during all day and separated by day and night. (TIFF)

Weak links and k-cores pre-quarantine.

(A) Amount of infected population (see [33]) when the spreading starts in a given node in a k-shell as a function of the probability of infection β for a SIR model on the network of March 19 in Fig 3A in pre-quarantine Ceará. The black curve is the average value over all the starting nodes in the network. The average divides the shell contributions to the spreading of the virus into two groups, above and below the average. The 0.5-core, composed of the 6-core (in this network), which contains nodes from the 6-shell to the 12-shell, has maximal spreading. The 0.5-shell, which is composed of the remaining shells from the 1-shell to the 5-shell, has minimal spreading, below the average. (B) Optimal percolation analysis performed over the network in Fig 3A before the quarantine on March 19 in Ceará with different attack strategies and their effect on the size of the largest connected component G(q) versus the fraction of removed nodes, q. Depending on the strategy, nodes are removed: randomly (blue), by the highest value of betweenness centrality (green) [38, 39], degree (orange), collective influence (red) [20], and by the highest k-shell followed by high degree inside the k-shell [33]. After each removal we re-compute all the metrics. The best strategy among those studied is removing the nodes directly by the highest value of betweenness centrality. (TIFF)

Size of the GCC over time.

The number of nodes (blue) and edges (orange) in the GCC versus time. The initial increase in the number of nodes is artificial because we perform contact tracing 14 days back for each patient and our data collection started on March 1. Thus the networks in the first two weeks have relatively fewer contacts than the rest. (TIFF)

Size of the 0.5-core over time.

Evolution of maximum 0.5-core size versus time normalized by the size of the GCC. The proportion of these maximum k-cores keeps increasing after the quarantine. (TIFF)

21 Sep 2021

Dear Prof. Makse,

Thank you very much for submitting your manuscript "Superspreading k-cores at the center of COVID-19 pandemic persistence" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments. We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation. Please prepare and submit your revised manuscript within 60 days.

Sincerely,
Benjamin Muir Althouse, Associate Editor, PLOS Computational Biology
Thomas Leitner, Deputy Editor, PLOS Computational Biology

Reviewer's Responses to Questions

Comments to the Authors:

Reviewer #1: This paper implements a comprehensive contact tracing network analysis to find an optimized quarantine protocol to dismantle the chain of transmission of coronavirus with minimal disruptions to society. The authors track billions of anonymized GPS human mobility datapoints to monitor the evolution of the contact network of disease transmission before and after mass quarantines. The results revealed here are timely and interesting. However, there are some issues that need to be clarified. Two data sets are chosen to be studied in the paper. One is the Grandata united nations development programme partnership to combat covid and the other is an anonymized list of confirmed covid patients obtained by the health department authorities from the two countries. There are some mobility data available covering a wide range of regions out there. The authors may want to comment on the choice of the datasets and limit their conclusion about the findings. The infectiousness period of an infected person starts two days before and lasts up to five days after the onset of symptoms. However, this is not directly related to the problem and data studied in this paper. The decision to add two days to the limits should be commented on. Both cc and k core have been studied in this paper. However, it was recently reported that the generalized k core is more suitable for the study of spreading behavior. Therefore, this aspect should be discussed and compared thoroughly. Another concern is regarding the optimal quarantine. The inconsistencies between different centralities should be explained better. Do you expect the same qualitative results for other similar infectious disease or covid related data in other capabilities? In other words, how general are the obtained results? The resilience of the method should not be overlooked.

Reviewer #2: The presented manuscript proposed an interesting contention strategy for the spreading phenomena in contact networks obtained from real GPS human mobility data. The authors create a contact tracing network that presumes to have the full contagion network. Then, they conclude k-core structures persist in the transmission network even when an extreme measure such as a lockdown is applied, maintaining the spreading activity. This suggests that an optimized isolation measure can be found to avoid contagion. After trying different centrality measures the authors found that the betweenness centrality is the best at breaking the transmission network. I found this preprint very interesting, well-written, and easy to follow. Furthermore, it contains enough novel results to be published in PLOS Computational Biology.

Reviewer #3:

# Summary

The manuscript conducts a contact tracing network analysis matching GPS and confirmed cases data applied to the case of COVID-19. They propose an effective way to break the network of transmission based on data from the state of Ceará, in Brazil. This GPS data allow monitoring the mobility of users and build a contact network identifying temporal changes before and after the lockdown and the persistence of $k$-cores, linked to super-spreading events. The main finding is that it is possible to break the transmission tree quarantining those with high betweenness centrality, linking the maximum $k$-cores with the rest of the population. The work is original and of high importance, with rigorous network and statistical analyses, very well documented in the Supplementary Material. The authors state that the research followed ethical guidelines to treat personal data, not allowing to identify anyone, and has the approval of the Epidemiological Surveillance Department of Ceará.

# Some points

- I had some questions about the sampling bias and the fact that only a few confirmed cases were linked to the GPS data. I suggest emphasizing that these analyses were also performed and cite the Supplementary Material more often in the main text.
- line 18: typo -> "teh state"
- line 20: "both states"? Only Ceará is mentioned before. As I could see in the SM, there is a "Puebla" state that was not used. By the way, in the caption of Table S1 "Puebla" is also mentioned.
- line 40: I suggest changing $\sim$ by $\approx$ if the authors agree.
- line 49: what is the difference between timestamp $n$ and time $t$?
- lines 151-158: in Ref. DOI:10.1103/PhysRevE.98.012310 the authors show that the maximum $k$-core can be the driver of disease transmission in contact networks. I was wondering if it has relation with the case presented here.
- How is the degree distribution $P(k)$ of the network? Does it keep the shape over time, taking snapshots aggregating the network over different time windows before and after the lockdown?
- page 25 of SM: typo? -> "These GSP-based apps"
- Notation: The notation about $k$-cores and $k$-shells can be improved. Sometimes they appear as 1-shell (line 128) or 1 $k$-shell (Fig. 4a), while 0.5-kcore is also used. I would suggest to use 1-shell and 0.5-core, instead, to match the $k$-shell and $k$-core pattern, respectively.

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

Do you want your identity to be public for this peer review?

Reviewer #1: No
Reviewer #2: No
Reviewer #3: No

Submitted filename: report.pdf

17 Nov 2021

Submitted filename: Response_To_Reviewers.pdf

25 Jan 2022

Dear Prof. Makse,

We are pleased to inform you that your manuscript 'Digital contact tracing and network theory to stop the spread of COVID-19 using big-data on human mobility geolocalization' has been provisionally accepted for publication in PLOS Computational Biology. Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards.

Best regards,
Benjamin Althouse, Associate Editor, PLOS Computational Biology
Thomas Leitner, Deputy Editor, PLOS Computational Biology

Reviewer's Responses to Questions

Comments to the Authors:

Reviewer #1: I have no further comment on this paper.

Reviewer #2: The authors have addressed the other reviewers' concerns; I think the manuscript deserves to be published.

Reviewer #3: The authors revised the whole manuscript and I am satisfied with the answers to the questions raised in the review process.

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

Do you want your identity to be public for this peer review?

Reviewer #1: No
Reviewer #2: No
Reviewer #3: No

1 Apr 2022

PCOMPBIOL-D-21-00586R1
Digital contact tracing and network theory to stop the spread of COVID-19 using big-data on human mobility geolocalization

Dear Dr Makse,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

With kind regards,
Olena Szabo
PLOS Computational Biology