
Using mobile phone data to predict the spatial spread of cholera.

Linus Bengtsson1, Jean Gaudart2, Xin Lu3, Sandra Moore4, Erik Wetter5, Kankoe Sallah2, Stanislas Rebaudet4, Renaud Piarroux4.   

Abstract

Effective response to infectious disease epidemics requires focused control measures in areas predicted to be at high risk of new outbreaks. We aimed to test whether mobile operator data could predict the early spatial evolution of the 2010 Haiti cholera epidemic. Daily case data were analysed for 78 study areas from October 16 to December 16, 2010. Movements of 2.9 million anonymous mobile phone SIM cards were used to create a national mobility network. Two gravity models of population mobility were implemented for comparison. Both were optimized based on the complete retrospective epidemic data, available only after the end of the epidemic spread. Risk of an area experiencing an outbreak within seven days showed strong dose-response relationship with the mobile phone-based infectious pressure estimates. The mobile phone-based model performed better (AUC 0.79) than the retrospectively optimized gravity models (AUC 0.66 and 0.74, respectively). Infectious pressure at outbreak onset was significantly correlated with reported cholera cases during the first ten days of the epidemic (p < 0.05). Mobile operator data is a highly promising data source for improving preparedness and response efforts during cholera outbreaks. Findings may be particularly important for containment efforts of emerging infectious diseases, including high-mortality influenza strains.

Year:  2015        PMID: 25747871      PMCID: PMC4352843          DOI: 10.1038/srep08923

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Re-occurring infectious disease outbreaks due to cholera, measles and other preventable infectious diseases contribute to a major disease burden affecting low- and middle-income countries12. Concurrently, outbreaks of new infectious diseases with pandemic potential pose a considerable threat to human life and development34. Response to, and ideally containment of5, an infectious disease outbreak can be greatly improved if health care response and outbreak control measures can be focused to areas predicted to be at the highest risk of experiencing new outbreaks67. Accurate models of the geographic distribution of epidemic risk could significantly enhance the population-level effects of interventions implemented to control the spread of transmittable diseases8. Considerable progress has been made in predicting temporal evolution of epidemics once outbreaks have progressed beyond a small initial group of cases79. However, predicting spatial transmission routes of epidemics has proven to be remarkably difficult, due to the importance of rare, long-distance transmission events10, limited data on population mobility, unknown population immunity levels9, low sensitivity and specificity of case reports11 and limited access to accurate and spatiotemporally resolved case data12. Empirical data has provided key insight into the spatial spread of measles in England13 and Niger14 as well as into influenza spread in the USA and Europe111215. While population mobility plays a key role in such modelling studies1016, it has not been possible, until now, to study detailed and concurrent data on both population mobility and spatiotemporal distribution of cases. Instead, empirical studies have used either models of population mobility, preferentially gravity models17, or census data on work-home commuting as proxies for total mobility during outbreaks11. 
Although highly significant correlations exist between these mobility patterns and retrospective data on epidemic spread, large unexplained variations remain101819. It is also not clear how to choose and properly parameterize mobility models across contexts in new outbreaks. This is especially problematic during the critical early outbreak phases, when interventions have the greatest effect but limited data are available to fit transmission models. Anonymous mobile operator data may provide a new source of large-scale empirical data on which to build more accurate models of infectious disease spread. Mobile phone operators register the mobile phone tower closest to the mobile user at the time of each call and text message. This allows individual phones to be localized at a resolution equal to the coverage area of the mobile phone tower (typically one to ten km²). In the public health field, these data have notably been used as a proxy for nationwide mobility patterns in malaria modelling studies2021. However, the extent to which this type of data accurately reflects movements of infectious persons, and its utility in predicting the spatial spread of infectious agents, have not been evaluated. The largest cholera epidemic to strike a single country in recent history was the 2010 Haitian outbreak22. The first confirmed cholera patient in Haiti developed symptoms on October 14, 2010 in a hamlet 60 km north of the capital Port-au-Prince. The epidemic first spread explosively along the nearby Artibonite River (Fig. 1) and subsequently, over a period of two months, throughout the entire country23.
Figure 1

Mobile phone mobility network.

The average absolute number of mobile phones moving between the study areas (October 15 to December 19, 2010). Thicker, bluer lines indicate larger numbers of travelers. The original outbreak location (Mirebalais), the Artibonite River (dark blue) and Port-au-Prince (PAP) are depicted (visualisation using Gephi and ArcGIS).

The 2010 Haiti cholera epidemic provides a unique opportunity to explore the influence of population mobility on the spatial evolution of a large-scale cholera outbreak. First, cholera had not previously affected the country for at least a century, thereby rendering epidemic development unbiased by differential population immunity. Second, the circumstances and location of the onset of the cholera epidemic are well understood232425. Third, daily case reporting based on WHO criteria was initiated very early throughout the country, and the notification system was highly effective22232425. In this study, we utilized data on the movement of 2.9 million anonymised mobile phone SIM cards in Haiti during the early phase of the Haitian cholera epidemic together with highly spatiotemporally resolved case data. We used these data to test the hypothesis that mobile operator data could be used to dynamically predict the spatial evolution of the epidemic from outbreak onset.

Methods

Data collection

Cholera case data

As soon as the epidemic was recognized, the Haitian government, with support from the US Centers for Disease Control and Prevention, implemented a nationwide monitoring program22. Each day, government and non-governmental health facilities in Haiti reported probable cholera cases (ambulatory patients, hospital admissions and deaths) to the Directorate of Health in each of the ten administrative departments (Haitian provinces). Probable cases were defined according to a modified WHO definition as “acute watery diarrhoea, with or without vomiting, in persons of all ages”22. Vibrio cholerae O1 infection was confirmed via bacterial culture for early cases in all departments. We have previously validated case data from the National Cholera Surveillance system by carrying out field investigations and comparing case reports with data available from registers in cholera treatment centres managed by Haitian teams, Doctors without Borders and medical brigades from Cuba23. The daily case reports per health facility enabled us to determine daily case numbers per commune while the epidemic spread throughout the country (from October 14 onwards, for 64 days). We defined the end of the study period as December 16, when the peak of the epidemic was reached and all but one commune had reported at least one case. In 62 of the 140 communes, including the eight communes within the Port-au-Prince metropolitan area, there may have been patients who sought healthcare in neighbouring communes. We merged all such communes with their neighbours into single areas, thereby creating a total of 78 study areas throughout the country (see S1 for details). One study area was excluded due to absence of mobile network coverage. To predict the spatial spread of cholera, we defined a study area as having acquired a novel local outbreak if five or more cases were recorded on any given day, thereby avoiding misclassification of spurious cases of diarrhoea as new cholera outbreaks. 
For sensitivity analyses of the outbreak definition, see S5.
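
The outbreak definition above can be illustrated with a short sketch. It assumes the daily case counts for one study area are available as a simple list ordered by day; the function name and the example counts are hypothetical and are not taken from the study data.

```python
from typing import Optional, Sequence


def outbreak_onset_day(daily_cases: Sequence[int], threshold: int = 5) -> Optional[int]:
    """Return the index of the first day with at least `threshold` reported
    cases, or None if the area never meets the outbreak definition."""
    for day, cases in enumerate(daily_cases):
        if cases >= threshold:
            return day
    return None


# Hypothetical daily counts for one study area (illustration only).
example_counts = [0, 1, 0, 3, 6, 12, 9]
print(outbreak_onset_day(example_counts))
```

Changing `threshold` gives a simple way to explore how sensitive the onset date is to the outbreak definition.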

Mobile phone data

The analysed anonymous mobile phone data consisted of the last outgoing call or text message each day from October 15 to December 19, 2010 for all 2.9 million users belonging to the largest mobile operator, Digicel Haiti26. Bias in population mobility estimates stemming from differential ownership of mobile phones between socio-economic groups has been evaluated in Kenya, where only minor bias was found27. Mobility patterns based on the mobile phone dataset used in this study have previously been shown to approximate mobility patterns from a representative survey of 2,500 households in the capital Port-au-Prince, Haiti, during the same year (2010)28. S2 provides further details on the mobile operator dataset.

Analyses

We used the mobile phone data to construct a mobility matrix M, with elements $m_{ij}$ indicating the average daily proportion of mobile phones relocating from study area i to j, comparing their last registered location on day t with their last registered location on day t-1. The mobility network built on the basis of M displays strong connectivity both between Port-au-Prince and large parts of the country and between other urban areas and their surrounding countryside (Fig. 1). We calculated the infectious pressure $P_j(t)$ sustained by each study area j during the period from October 21 (seven days after the disease onset of the first case in Haiti) to December 16, according to Eq. 1, in which $c_i(t)$ is the number of reported cases in study area i on day t:

$$P_j(t) = \sum_{i \neq j} m_{ij} \sum_{s=t-6}^{t} c_i(s) \qquad (1)$$

We thus assumed that a) the number of infectious individuals in a study area was proportional to the cumulative number of reported cases in the area during the preceding seven days (approximating the generation time of cholera, see also S5 and below)2930 and b) the proportion of mobile phone movements between study areas was representative of the movements of infectious persons between study areas. For comparative purposes, we implemented a gravity model of population mobility. Gravity models have previously been used to model mobility of infectious persons in a large number of studies in Haiti and elsewhere8161731 and assume that population mobility between areas depends positively on their population sizes and negatively on the distance between them. We calculated the infectious pressure according to Eq. 1, replacing $m_{ij}$ with the following gravity model31:

$$m_{ij} = \mu \, \frac{H_j \, e^{-d_{ij}/\delta}}{\sum_{k \neq i} H_k \, e^{-d_{ik}/\delta}} \qquad (2)$$

in which μ is the average daily proportion of the population in area i that moves out of the area. The remaining ratio is the estimated probability that an individual leaving study area i goes to study area j. $H_j$ denotes the population in study area j, and $d_{ij}$ denotes the distance between the population-weighted centroids of study areas i and j. 
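
As a rough illustration of the infectious pressure calculation, the sketch below weights each source area's recent case count by the proportion of phones moving from that area to the destination. Several details are assumptions made for the example rather than statements about the authors' implementation: the seven-day window is taken to cover day t and the six preceding days, movements within an area (i equal to j) are excluded, and the function name, mobility matrix and case counts are hypothetical.

```python
from typing import List, Sequence


def infectious_pressure(
    mobility: Sequence[Sequence[float]],
    cases: Sequence[Sequence[int]],
    day: int,
    window: int = 7,
) -> List[float]:
    """Infectious pressure on every study area on a given day.

    mobility[i][j]: average daily proportion of phones moving from area i to j.
    cases[i][t]:    reported cases in area i on day t.
    """
    n_areas = len(mobility)
    # Assumed window: day t plus the preceding (window - 1) days.
    first_day = max(0, day - window + 1)
    recent_cases = [sum(cases[i][first_day:day + 1]) for i in range(n_areas)]
    # Sum over source areas i, excluding the destination itself (assumption).
    return [
        sum(mobility[i][j] * recent_cases[i] for i in range(n_areas) if i != j)
        for j in range(n_areas)
    ]


# Hypothetical three-area example: only area 0 has reported cases so far.
mobility = [
    [0.90, 0.08, 0.02],
    [0.05, 0.90, 0.05],
    [0.01, 0.04, 0.95],
]
cases = [
    [10, 20, 30],
    [0, 0, 0],
    [0, 0, 0],
]
print(infectious_pressure(mobility, cases, day=2))
```

In this toy example the area receiving the larger share of travellers from the affected area ends up with the higher pressure, which is the behaviour the model relies on.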
In the absence of detailed mobility data, values for μ and δ (the scaling parameter) in Eq. 2 are unknown and need to be assigned. However, parameter values vary widely between studies. As appropriate values for Haitian mobility are unknown, we chose to optimize the model based on the retrospective case data from the complete study period. Note that this optimisation thus could not have been performed until after the spatial spread of the epidemic was complete. Our comparison model thus performs better than a model that could have been developed during the epidemic. We produced two separate optimisations. In the first, we optimised the gravity model by choosing values for μ and δ (0.154 and 122, respectively) that minimised the residual sum of squares between reported daily cholera cases in each study area and the estimated pressure from the gravity model31. In the second, we chose parameter values (0.158 and 3.5, respectively) that maximised the area under the curve (AUC) among all possible ROC curves. The ROC curve in this analysis depicts the sensitivity and specificity, using increasing thresholds of infectious pressure, to predict outbreak occurrence32 (see Results and Fig. 2b). We denote infectious pressures calculated from mobile phone movements by P and from the optimised gravity models by P1 and P2, respectively (see S3).
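
To make the roles of μ and δ concrete, the following sketch builds a gravity-type mobility matrix. It assumes an exponential distance kernel, exp(-d/δ), normalised over all other areas, which is one common form of gravity model and may differ in detail from the one used in the study. The populations and distances are hypothetical; μ and δ are set to the values reported for the first optimisation.

```python
import math
from typing import List, Sequence


def gravity_mobility(
    populations: Sequence[float],
    distances: Sequence[Sequence[float]],
    mu: float,
    delta: float,
) -> List[List[float]]:
    """Gravity-type mobility matrix (assumed exponential distance kernel).

    Entry [i][j] is mu times the probability that a person leaving area i
    travels to area j; that probability grows with the population of j and
    decays with the distance between i and j.
    """
    n_areas = len(populations)
    matrix = [[0.0] * n_areas for _ in range(n_areas)]
    for i in range(n_areas):
        weights = [
            populations[k] * math.exp(-distances[i][k] / delta) if k != i else 0.0
            for k in range(n_areas)
        ]
        total = sum(weights)
        if total == 0:
            continue  # no reachable destination for this area
        for j in range(n_areas):
            matrix[i][j] = mu * weights[j] / total
    return matrix


# Hypothetical populations and centroid distances (km) for three areas.
populations = [1000, 5000, 5000]
distances = [
    [0, 10, 100],
    [10, 0, 95],
    [100, 95, 0],
]
gravity = gravity_mobility(populations, distances, mu=0.154, delta=122)
print([round(value, 4) for value in gravity[0]])
```

Each row of the resulting matrix sums to μ, so μ controls how many people leave an area while δ controls how strongly distance discourages travel.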
Figure 2

(a) Relationship between infectious pressure, calculated from the mobile phone data (P), and the risk of areas experiencing a new outbreak within seven days. Ninety-five percent confidence intervals based on a binomial distribution are included. (b) ROC curve (sensitivity and specificity) for predicting outbreak occurrence within seven days at increasing thresholds of infectious pressure (red: P, green: P1, black: P2). Random guesses would yield values along the diagonal line.

Results

Outbreak risk

For an efficient response to a developing epidemic, it is important to rapidly focus intervention resources on areas at highest risk. By utilizing data for all days of the study period, we plotted the proportion of non-infected communes that experienced an outbreak within seven days, for various intervals of infectious pressure, P (Fig. 2a). The risk of a study area experiencing a new outbreak correlated closely with the infectious pressure. Above a pressure level of 22 (P), all areas (six study areas) experienced outbreaks within seven days (see S5 for P1, P2 and sensitivity analyses).
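
A dose-response summary of this kind can be sketched by grouping observations into pressure intervals and computing the share that experienced an outbreak within seven days. The interval edges and the observations below are hypothetical, chosen only to show the mechanics; confidence intervals are omitted.

```python
from typing import List, Optional, Sequence, Tuple


def outbreak_risk_by_bin(
    pressures: Sequence[float],
    outbreak_within_7d: Sequence[int],
    bin_edges: Sequence[float],
) -> List[Tuple[int, Optional[float]]]:
    """For each interval [edge_k, edge_k+1), return the number of observations
    and the proportion followed by an outbreak (None for an empty interval)."""
    results = []
    for low, high in zip(bin_edges, bin_edges[1:]):
        outcomes = [
            outcome
            for pressure, outcome in zip(pressures, outbreak_within_7d)
            if low <= pressure < high
        ]
        risk = sum(outcomes) / len(outcomes) if outcomes else None
        results.append((len(outcomes), risk))
    return results


# Hypothetical observations: pressure on a not-yet-infected area on a given
# day, and whether an outbreak followed within seven days (1) or not (0).
pressures = [0.5, 1.5, 3.0, 12.0, 30.0]
outcomes = [0, 0, 1, 1, 1]
print(outbreak_risk_by_bin(pressures, outcomes, [0, 2, 22, float("inf")]))
```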

Sensitivity and specificity of outbreak predictions

Building upon this strong correlation between infectious pressure and outbreak risk, we created a binary test to predict an outbreak occurring within the upcoming seven days, based solely on thresholds of infectious pressure. We predicted an outbreak to occur whenever the infectious pressure exceeded a given threshold, and plotted the corresponding sensitivity and specificity of each threshold (Fig. 2b). We compared the model based on the mobile operator mobility data (P) with the gravity models (P1 and P2), of which P2 was optimised specifically to yield the maximum possible area under the curve (AUC) in this analysis. Comparing these ROC curves, the P model clearly performs better than the P1 model and slightly better than the P2 model, yielding a higher specificity for a given level of sensitivity. Note that both gravity models rely on parameter optimisations that could not have been performed until the epidemic spread was completed. Analyses of ranks of infectious pressure yielded similar results (S4). As the generation time of cholera in Haiti is uncertain and may have deviated from the seven-day period assumed in this study (Eq. 1)29, we additionally calculated infectious pressure based on other time periods (three, five and nine days), which did not alter the results (S5).
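
The threshold-based test and its ROC curve can be sketched as follows. This is a generic illustration and not the authors' code: an outbreak is predicted when the pressure is at or above a threshold, each distinct pressure value is used as a threshold, and the AUC is obtained with the trapezoidal rule. The scores and labels are hypothetical.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (1 - specificity, sensitivity) pairs for decreasing thresholds.

    Requires at least one positive and one negative label.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        true_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
        false_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
        points.append((false_pos / negatives, true_pos / positives))
    return points


def area_under_curve(points: Sequence[Tuple[float, float]]) -> float:
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical pressures and outcomes (1 = outbreak within seven days).
scores = [0.5, 2.0, 8.0, 25.0]
labels = [0, 0, 1, 1]
print(area_under_curve(roc_points(scores, labels)))
```

An AUC of 0.5 corresponds to random guessing (the diagonal in Fig. 2b), while values closer to 1 indicate better separation of areas that did and did not experience outbreaks.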

Early outbreak incidence

In addition to predicting the risk of a new outbreak occurring in an area, an effective health care response requires good estimates of the number of cases that are likely to occur if an outbreak takes place. However, we may expect the local evolution of an epidemic, after its start, to be largely dependent on local environmental and behavioural factors. One may thus hypothesize that the infectious pressure leading to the seeding of an outbreak would have little further influence on the number of cases in an area. However, this did not seem to be the case. We correlated, for each newly infected area, the infectious pressure sustained by the area at outbreak onset with the average daily number of cases during the first D days of the new outbreak. For all values of D from one to ten days, we found a linear correlation (r) of approximately 0.3 (Fig. 3). Correlation coefficients were significant (p < 0.05) for all periods for the P model and non-significant for all ten periods for the P1 and P2 models (one outlier excluded).
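
The correlation analysis can be outlined with a small sketch that computes the average daily case count over the first D days of an outbreak and the Pearson correlation coefficient r. The helper names and the numbers are hypothetical and only illustrate the calculation; significance testing is not included.

```python
import math
from typing import Sequence


def mean_early_cases(daily_cases: Sequence[int], onset: int, d: int) -> float:
    """Average daily number of cases over the first d days from outbreak onset."""
    window = daily_cases[onset:onset + d]
    return sum(window) / len(window)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson linear correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for four newly infected areas: pressure at onset and
# daily case counts starting on the onset day.
pressure_at_onset = [2.0, 5.0, 9.0, 20.0]
early_counts = [[5, 6, 4], [7, 9, 8], [6, 12, 15], [10, 25, 40]]
early_means = [mean_early_cases(counts, onset=0, d=3) for counts in early_counts]
print(round(pearson_r(pressure_at_onset, early_means), 3))
```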
Figure 3

Correlation (r) between infectious pressure at outbreak onset and average daily number of reported cases during the first D days of the outbreak (one to ten days from onset).

Red: P, solid green: P1, black: P2.

Discussion

Our results show that the risk of epidemic onset of cholera in a given area and the initial intensity of local outbreaks could have been anticipated during the early days of the Haitian epidemic using case reports and the mobility patterns of mobile phones. We show that the specificity and sensitivity of predictions of epidemic spread were better than or comparable to those of currently available optimized mobility models. Most importantly, the predictions based on the mobile operator data did not rely on retrospective optimization of model parameters and could thus be available from the start of an outbreak. This is important as gravity model parameters are highly context specific3334. These results indicate that outbreak preparedness and response to epidemic agents, such as cholera, can be enhanced. The findings may have particular importance for improving early containment efforts of emerging infectious diseases, such as high-mortality strains of pandemic influenza, and the response to vaccine-preventable diseases, such as measles, in low-income settings. Cholera is known to be disseminated not only via human movement, but also by water and sometimes food contamination35. Even stronger predictive power may thus be achieved for infectious agents that exhibit exclusive person-to-person transmission. Furthermore, in this study, we focused on an extremely large and rapid cholera epidemic. It is likely that surveillance data would be more accurate for a disease that exhibits obvious symptoms and longer generation times, which may reduce reporting bias and delays. In particular, mobile operator data may represent a powerful tool for the containment of measles, which exhibits a high mortality rate, is preventable with vaccine, is readily identifiable based on simple clinical criteria and displays a sufficiently long generation time. 
Mobile phone-based connectivity matrices may also be very useful for early containment of emerging infectious agents, such as the localized appearance of high-mortality influenza strains with pandemic potential6. Mobile operator data may be especially advantageous in settings where poor road quality renders distance a weak measure of connectivity. Access to operator data is a prerequisite for the future use of the method. However, the study demonstrates that, for the purpose of predicting future outbreaks, mobile phone operators may not need to provide access to their complete customer databases, but rather only to aggregated data on mobility between areas. Such aggregated connectivity matrices can be made available for preparedness purposes before outbreaks. They can then be repeatedly updated during the outbreak response to take into account changes in population mobility, which would not be captured by a gravity model. In newly infected areas, infectious pressure based on the mobile phone model (but not the gravity models) correlated positively with initial incidence. The correlation between mobility and initial incidence was unexpected, as cholera transmission depends on a number of local socio-economic and environmental conditions. There may be a causal connection between mobility and outbreak size (if rapid outbreaks were caused by multiple seeding events). Alternatively, areas between which there is high mobility may have environmental and behavioural similarities. Although the findings indicate an important new use of mobility data, a policy-relevant tool for predicting early case numbers would need to incorporate additional variables to strengthen correlations. Some study limitations should be noted. Reporting errors and delays are likely to have occurred in the case data and may have reduced the predictive accuracy for both the phone and gravity models. This will also be the case in future applications of the method. 
The case data in the study should be interpreted as reflecting relative rather than absolute differences in case load per area, as the Haitian cholera reporting system excluded asymptomatic and mild infections, as well as some severely ill patients who did not reach health facilities36. Short-range movements may be under-recorded in the mobile data since they take place over shorter time intervals. However, weighting short- and long-term movements differently did not change the prediction results (S5). Frequency of mobile phone use varies throughout the country, and differential mobility between phone and non-phone users may exist. However, studies evaluating bias in mobility estimates based on mobile operator data provide strong support for operator data currently being the best measure of nationwide mobility patterns272837. Although this study focuses only on the influence of human mobility, future mobile phone-based models focusing on cholera may benefit from including data on spatial distributions of access to water and sanitation38, bacterial transmission via waterways31, agricultural practices39, differential population immunity levels40 and interactions between infectiousness and mobility and between infectiousness and phone use. In summary, the results show that mobile phone mobility patterns in Haiti during the 2010 cholera outbreak enabled predictions of epidemic spread that did not require retrospective optimization of model parameters and could thus be available at outbreak onset. The results imply that anonymous mobile phone data may represent a key data source both to increase our understanding of the mechanisms underlying the spatial spread of infectious agents and to provide an important policy-relevant tool for future outbreak preparedness and response efforts.

Author Contributions

L.B. wrote the main manuscript text and all authors provided edits. X.L., L.B., R.P., K.S. and J.G. performed analyses. E.W., S.M. and S.R. provided additional input and participated in data collection. X.L. prepared the figures.