epicontacts: Handling, visualisation and analysis of epidemiological contacts.

V P Nagraj, Nistara Randhawa, Finlay Campbell, Thomas Crellen, Bertrand Sudre, Thibaut Jombart.

Abstract

Epidemiological outbreak data is often captured in line list and contact format to facilitate contact tracing for outbreak control. epicontacts is an R package that provides a unique data structure for combining these data into a single object in order to facilitate more efficient visualisation and analysis. The package incorporates interactive visualisation functionality as well as network analysis techniques. Originally developed as part of the Hackout3 event, it is now developed, maintained and featured as part of the R Epidemics Consortium (RECON). The package is available for download from the Comprehensive R Archive Network (CRAN) and GitHub.

Keywords:  R; contact tracing; outbreaks

Year:  2018        PMID: 31240097      PMCID: PMC6572866          DOI: 10.12688/f1000research.14492.2

Source DB:  PubMed          Journal:  F1000Res        ISSN: 2046-1402


Introduction

In order to study, prepare for, and intervene against disease outbreaks, infectious disease modellers and public health professionals need an extensive data analysis toolbox. Disease outbreak analytics involve a wide range of tasks that need to be linked together, from data collection and curation to exploratory analyses and more advanced modelling techniques used for incidence forecasting [1, 2] or to predict the impact of specific interventions [3, 4]. Recent outbreak responses suggest that for such analyses to be as informative as possible, they need to rely on a wealth of available data, including timing of symptoms, characterisation of key delay distributions (e.g. incubation period, serial interval), and data on contacts between patients [5–8]. The latter type of data is particularly important for outbreak analysis, not only because contacts between patients are useful for unravelling the drivers of an epidemic [9, 10], but also because identifying new cases early can reduce ongoing transmission via contact tracing, i.e. follow-up of individuals who reported contacts with known cases [11, 12]. However, curating contact data and linking them to existing line lists of cases is often challenging, and tools for storing, handling, and visualising contact data are often missing [13, 14]. Here, we introduce epicontacts, an R [15] package providing a suite of tools aimed at merging line lists and contact data, and providing basic functionality for handling, visualising and analysing epidemiological contact data. Maintained as part of the R Epidemics Consortium (RECON), the package is integrated into an ecosystem of tools for outbreak response using the R language.

Use cases

Those interested in using epicontacts should have a line list of cases as well as a record of contacts between individuals. Both datasets must be in tabular format, with rows and columns. At minimum, the line list requires one column with a unique identifier for every case, and the contact list needs two columns for the source and destination of each pair of contacts. The datasets can include arbitrary features of cases or contacts beyond these columns. Once loaded into R and stored as data.frame objects, these datasets can be passed to the make_epicontacts() function (see the ‘Methods’ section for more detail). For an example of data prepared in this format, users can refer to the outbreaks R package, where a contact list has the following structure:

## 'data.frame': 3800 obs. of 3 variables:
##  $ infector: chr "d1fafd" "cac51e" "f5c3d8" "0f58c4" ...
##  $ case_id : chr "53371b" "f5c3d8" "0f58c4" "881bd4" ...
##  $ source  : Factor w/ 2 levels "funeral","other": 2 1 2 2 2 1 2 2 2 2 ...

The data handling, visualization, and analysis methods described below represent the bulk of epicontacts features. More examples of how the package can be used, as well as demonstrations of additional features, can be found through the RECON learn platform.
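As a minimal sketch of the input format described in this section, the following builds a toy line list and contact list in base R. All identifiers, dates and column names are hypothetical and chosen only for illustration. The optional last step assumes the epicontacts package is installed and relies on the defaults of make_epicontacts(), which take the first line list column as the identifier and the first two contact columns as the pair.

```r
# Hypothetical toy data: identifiers, dates and attributes are made up.
linelist <- data.frame(
  id    = c("a1", "b2", "c3", "d4"),
  onset = as.Date(c("2015-05-11", "2015-05-15", "2015-05-18", "2015-05-21")),
  sex   = c("M", "F", "F", "M"),
  stringsAsFactors = FALSE
)

contacts <- data.frame(
  from     = c("a1", "a1", "b2"),
  to       = c("b2", "c3", "d4"),
  exposure = c("household", "hospital", "hospital"),
  stringsAsFactors = FALSE
)

# Minimum requirements: unique case identifiers in the line list,
# and two columns describing each pair in the contact list.
stopifnot(
  !anyDuplicated(linelist$id),
  all(c("from", "to") %in% names(contacts))
)

# Optional step, run only if epicontacts is available.
if (requireNamespace("epicontacts", quietly = TRUE)) {
  x <- epicontacts::make_epicontacts(
    linelist = linelist,
    contacts = contacts,
    directed = TRUE  # assumption: "from" infected "to" in this toy example
  )
  print(x)
}
```

Extra columns such as onset, sex and exposure are carried along as case or contact attributes, which is what later makes attribute-based subsetting and pairwise comparisons possible.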

Methods

Operation

epicontacts is released as an open-source R package. A stable release is available for Windows, Mac and Linux operating systems via the CRAN repository. The latest development version of the package is available through the RECON GitHub organization. At minimum, users must have R installed. No other system dependencies are required.

Implementation

epicontacts includes a novel data structure to accommodate line list and contact list datasets in a single object. This object is constructed with the make_epicontacts() function and includes attributes from the original datasets. Once combined, these are mapped internally in a graph paradigm as nodes and edges. The epicontacts data structure also includes a logical attribute for whether or not the resulting network is directed.

The package takes advantage of R’s generic functions, which call specific methods depending on the class of an object. This is implemented in several places, including the summary.epicontacts() and print.epicontacts() methods, which are called when the summary() and print() functions, respectively, are used on an epicontacts object.

The package does not include built-in data, as exemplary contact and line list datasets are available in the outbreaks package [16]. The example that follows will use the mers_korea_2015 dataset from outbreaks, which includes initial data collected by the Epidemic Intelligence group at the European Centre for Disease Prevention and Control (ECDC) during the 2015 outbreak of Middle East respiratory syndrome (MERS-CoV) in South Korea. Note that the data used here were provided in outbreaks for teaching purposes, and therefore do not include the complete line list or contacts from the outbreak.

epicontacts implements two interactive network visualisation packages: visNetwork and threejs [17, 18]. These frameworks provide R interfaces to the vis.js and three.js JavaScript libraries respectively. Their functionality is incorporated in the generic plot() method (Figure 1) for an epicontacts object, which can be toggled between the two with the “type” parameter. Alternatively, the visNetwork interactivity is accessible via vis_epicontacts() (Figure 2), and threejs through graph3D() (Figure 3). Each function has a series of arguments that can also be passed through plot(). Both share a color palette, and users can specify node, edge and background colors. However, vis_epicontacts() also allows “node_shape” to be specified by a line list attribute, and that shape to be customized with an icon from the Font Awesome icon library. The principal distinction between the two is that graph3D() is a three-dimensional visualisation, allowing users to rotate clusters of nodes to better inspect their relationships.
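The construction, method dispatch and plotting steps described above might look roughly as follows. This is a sketch rather than the authors' exact code: it assumes the epicontacts and outbreaks packages are installed, that mers_korea_2015 is a list with linelist and contacts elements, and that the default column positions of make_epicontacts() fit this dataset. The object name merskor15 is arbitrary.

```r
# Both packages are optional here; the sketch is skipped if either is missing.
have_pkgs <- requireNamespace("epicontacts", quietly = TRUE) &&
  requireNamespace("outbreaks", quietly = TRUE)

if (have_pkgs) {
  library(epicontacts)
  library(outbreaks)

  # Combine the line list and the contact list into a single object.
  # Assumption: the identifier is the first line list column and the
  # first two contact columns hold each pair (the function defaults).
  merskor15 <- make_epicontacts(
    linelist = mers_korea_2015$linelist,
    contacts = mers_korea_2015$contacts,
    directed = FALSE
  )

  # Generic functions dispatch on the class of the object.
  print(class(merskor15))
  print(merskor15)           # print() calls print.epicontacts()
  print(summary(merskor15))  # summary() calls summary.epicontacts()

  # Interactive plots are returned as objects that render in a viewer
  # or browser when printed in an interactive session.
  p_default <- plot(merskor15)             # visNetwork by default
  p_vis     <- vis_epicontacts(merskor15)  # explicit visNetwork call
}
```

graph3D(merskor15) is the three-dimensional counterpart of the last call. It is left out of the sketch only to keep it short.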
Figure 1. The generic plot() method for an epicontacts object will use the visNetwork method by default.

Figure 2. The vis_epicontacts() function explicitly calls visNetwork to make an interactive plot of the contact network.

Figure 3. The graph3D() function generates a three-dimensional network plot.

Subsetting is a typical preliminary step in data analysis. epicontacts provides a customized subset method to filter line lists or contacts based on values of particular attributes from nodes, edges or both. If users are interested in returning only contacts that appear in the line list (or vice versa), the thin() function implements such logic.

For analysis of pairwise contact between individuals, the get_pairwise() function searches the line list based on the specified attribute. If the given column is a numeric or date object, the function will return a vector containing the difference of the values of the corresponding “from” and “to” contacts. This can be particularly useful, for example, if the line list includes the date of onset of each case: the difference between the onset dates of each pair of contacts would approximate the serial interval for the outbreak [19]. For factors, character vectors and other non-numeric attributes, the default behavior is to print the associated line list attribute for each pair of contacts. The function includes a further parameter to pass an arbitrary function to process the specified attributes. In the case of a character vector, this can be helpful for tabulating information about different contact pairings with table().
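The logic of thinning and pairwise comparison can be sketched in base R on toy data, followed by what are assumed to be the corresponding epicontacts calls. The data are hypothetical. The argument names node_attribute, what and f reflect the package documentation as understood here and should be checked against the installed version. The base R part only mirrors the idea for a directed network and is not the package's implementation.

```r
# Hypothetical toy data (made-up identifiers and dates).
linelist <- data.frame(
  id    = c("a1", "b2", "c3", "d4"),
  onset = as.Date(c("2015-05-11", "2015-05-15", "2015-05-18", "2015-05-21")),
  sex   = c("M", "F", "F", "M"),
  stringsAsFactors = FALSE
)
contacts <- data.frame(
  from = c("a1", "a1", "b2", "zz"),  # "zz" does not appear in the line list
  to   = c("b2", "c3", "d4", "a1"),
  stringsAsFactors = FALSE
)

# Thinning contacts: keep only pairs whose members are both in the line list.
known <- contacts$from %in% linelist$id & contacts$to %in% linelist$id
contacts_thin <- contacts[known, ]

# Pairwise difference in onset dates ("to" minus "from"), in days.
onset_from <- linelist$onset[match(contacts_thin$from, linelist$id)]
onset_to   <- linelist$onset[match(contacts_thin$to, linelist$id)]
serial_interval <- as.integer(onset_to - onset_from)

# Tabulating a categorical attribute across the two members of each pair.
sex_pairs <- table(
  from = linelist$sex[match(contacts_thin$from, linelist$id)],
  to   = linelist$sex[match(contacts_thin$to, linelist$id)]
)

# Assumed package equivalents, run only if epicontacts is installed.
if (requireNamespace("epicontacts", quietly = TRUE)) {
  library(epicontacts)
  x <- make_epicontacts(linelist, contacts, directed = TRUE)
  x_female <- subset(x, node_attribute = list(sex = "F"))  # filter on a node attribute
  x_thin   <- thin(x, what = "contacts")                   # drop contacts outside the line list
  print(get_pairwise(x_thin, "onset"))                     # numeric differences
  print(get_pairwise(x_thin, "sex", f = table))            # custom processing function
}
```

On the toy data, the pair involving the unknown case "zz" is dropped before any comparison, which is why thinning is a sensible step before estimating delays such as the serial interval.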

Discussion

Benefits

While there are software packages available for epidemiological contact visualisation and analysis, none aim to accommodate line list and contact data as purposively as epicontacts [20–22]. Furthermore, this package strives to solve a problem of plotting dense graphs by implementing interactive network visualisation tools. A static plot of a network with many nodes and edges may be difficult to interpret. However, by rotating or hovering over an epicontacts visualisation, a user may better understand the data.

Future considerations

The maintainers of epicontacts anticipate new features and functionality. Future development could involve performance optimization for visualising large networks, as generating these interactive plots is resource intensive. Additionally, attention may be directed towards inclusion of alternative visualisation methods.

Conclusions

epicontacts provides a unified interface for processing, visualising and analyzing disease outbreak data in the R language. The package and its source are freely available on CRAN and GitHub. By developing functionality with line list and contact list data in mind, the authors aim to enable more efficient epidemiological outbreak analyses.

Software availability

Software available from: https://CRAN.R-project.org/package=epicontacts

Source code available from: https://github.com/reconhub/epicontacts

Archived source code as at time of publication: https://zenodo.org/record/1210993 [23]

Software license: GPL 2

Reviewer reports

The authors present an R package that helps in the visualisation and analysis of epidemiological contacts. It is a well-presented summary of the capabilities of the software and brings network theory tools to academics from the areas of public health and epidemiology. For version 2 of the manuscript, the authors follow the Introduction with Use cases, which benefits the understanding of how data has to be prepared for input. I followed the examples, and the output that they present about the ebola data set is outdated. I suggest including the version of the outbreaks package used in the examples. For those not familiar with R, presenting the data with the str command is not clear; using the head command instead could be a better option. This is something minor and not needed but would improve the understanding of the software on the most critical point, which is data input. As the Use cases section is presented before the Methods section in version 2, the last paragraph of the Use cases section needs to be changed from "methods described above" to "methods described below". The visualisations produced from epicontacts are impressive. For the manuscript to become more accessible to mainstream readers, it would need to include the output from the analysis and interpretation.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

This is good software that could help in continuous visualization of contacts and their progression in disease tracking. It is user friendly for those who are not computer specialists and still want to visualize data. Data visualization is pertinent to disease monitoring, and what the authors have done will help epidemiologists and public health specialists involved in outbreak response to quickly visualize progression and spread of disease from primary to secondary contacts and how the disease is evolving among contacts. The software will actually achieve its purpose as stated in the conclusion of the write-up. Good work done.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

The article describes an R-based software tool aimed to facilitate analysis of data from outbreaks that include line lists of cases and case-contact data. The R package, epicontacts, is part of a larger suite of tools housed at the R Epidemics Consortium (RECON). The epicontacts package has the ability to merge data about cases in a line list with case-contact details, which then allows the user to describe and visualize contact networks, incubation periods, and serial intervals within an outbreak. The codes and methods for analysis are partly described in the article, and the authors should provide a link to the package's documentation, either at CRAN or RECON webpages, where readers could learn more about the package and its options. The output of the package provided in the article was interesting and intriguing. I felt that it was only partly explained and the article could benefit from the authors annotating the output and its interpretation a bit further. I have explored the RECON website and found the RECON Learn modules to be quite helpful in providing annotation of the epicontacts output and some guidance on interpretation. I would recommend that the authors consider either expanding the annotation of the output in this article or explicitly directing readers to the RECON Learn website for further instruction.

Additional suggestions: Consider moving the section of the article called "Use cases" to before the "Data handling" subsection of the "Implementation" section. I felt that the description of the input datasets under "Use cases" was very informative and would have been organizationally more helpful had it been placed earlier in the article. Consider describing the sample outbreak data in a bit further detail. It appears to be data describing the MERS outbreak that occurred in South Korea in 2015. I think the description should include whether the data are simulated or from a real outbreak (if from a real outbreak, then a reference to the outbreak description should be included), the scenario of the outbreak, how many cases, how many contacts, place of the outbreak, duration of the outbreak, and a brief description of the demographic details included in the dataset. This amount of detail would allow the reader to translate the details of the outbreak from your text to the output provided by epicontacts.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.