Chaoran Chen, Sarah Nadeau, Ivan Topolsky, Niko Beerenwinkel, Tanja Stadler.
Abstract
The SARS-CoV-2 pandemic led to a huge increase in global pathogen genome sequencing efforts, and the resulting data are becoming increasingly important for detecting variants of concern, monitoring outbreaks, and quantifying transmission dynamics. However, this rapid up-scaling in data generation brought with it many IT infrastructure challenges. In this paper, we report on the development of an improved system for genomic epidemiology. We (i) highlight key challenges that were exacerbated by the pandemic situation, (ii) provide data infrastructure design principles to address them, and (iii) give an implementation example developed by the Swiss SARS-CoV-2 Sequencing Consortium (S3C) in response to the COVID-19 pandemic. Finally, we discuss remaining challenges to data infrastructure for genomic epidemiology. Improving these infrastructures will help better detect, monitor, and respond to future public health threats.
Keywords: Data infrastructure; Genomic epidemiology; Microservices; Relational database; SARS-CoV-2
Year: 2022; PMID: 35605437; PMCID: PMC9107180; DOI: 10.1016/j.epidem.2022.100576
Source DB: PubMed; Journal: Epidemics; ISSN: 1878-0067; Impact factor: 5.324
Fig. 1. An illustration of how three key entities – tests, plates, and sequences – are stored in database tables and the mapping table that links the information from each.
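The layout described in the Fig. 1 caption can be sketched roughly as follows. This is a minimal illustration using Python's built-in `sqlite3` module, not the consortium's actual schema: every table name, column name, and key choice below (`tests`, `plates`, `sequences`, `test_plate_mapping`, `well_position`, and so on) is an assumption made for the example.

```python
import sqlite3


def build_schema(conn: sqlite3.Connection) -> None:
    """Create three entity tables plus one mapping table (all names hypothetical)."""
    conn.executescript(
        """
        CREATE TABLE tests (
            test_id    TEXT PRIMARY KEY,
            order_date TEXT
        );
        CREATE TABLE plates (
            plate_name      TEXT PRIMARY KEY,
            sequencing_date TEXT
        );
        CREATE TABLE sequences (
            sample_name TEXT PRIMARY KEY,
            seq         TEXT
        );
        -- The mapping table records which test ended up in which plate well
        -- and which sequence was produced from it.
        CREATE TABLE test_plate_mapping (
            test_id       TEXT REFERENCES tests(test_id),
            plate_name    TEXT REFERENCES plates(plate_name),
            well_position TEXT,
            sample_name   TEXT REFERENCES sequences(sample_name),
            PRIMARY KEY (plate_name, well_position)
        );
        """
    )


def linked_records(conn: sqlite3.Connection) -> list:
    """Join the three entities through the mapping table."""
    return conn.execute(
        """
        SELECT t.test_id, p.plate_name, m.well_position, s.sample_name
        FROM test_plate_mapping AS m
        JOIN tests     AS t ON t.test_id     = m.test_id
        JOIN plates    AS p ON p.plate_name  = m.plate_name
        JOIN sequences AS s ON s.sample_name = m.sample_name
        ORDER BY t.test_id
        """
    ).fetchall()
```

Keeping the links in a separate mapping table means each entity can be loaded independently, for example when plate information arrives before the sequence does, and joined only when a combined view is needed.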
Fig. 2. Containerized microservices operate autonomously to add or extract data from the database.
Fig. 3. A SQL query that finds the samples with the S:N501Y mutation.
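The query itself is not reproduced in this record. As a hedged sketch of what such a lookup might look like, the example below assumes a hypothetical table `aa_mutations` with one row per sample and amino-acid change; the table, its columns, the helper names, and the demo data are all invented for illustration and are not taken from the paper.

```python
import sqlite3


def parse_aa_mutation(label: str) -> tuple:
    """Split a label such as 'S:N501Y' into (gene, ref_aa, position, alt_aa)."""
    gene, change = label.split(":")
    return gene, change[0], int(change[1:-1]), change[-1]


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory database with made-up rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE aa_mutations (
            sample_name TEXT,
            gene        TEXT,
            position    INTEGER,
            ref_aa      TEXT,
            alt_aa      TEXT
        )
        """
    )
    conn.executemany(
        "INSERT INTO aa_mutations VALUES (?, ?, ?, ?, ?)",
        [
            ("sample_A", "S", 501, "N", "Y"),
            ("sample_B", "S", 484, "E", "K"),
            ("sample_C", "S", 501, "N", "Y"),
            ("sample_C", "S", 484, "E", "K"),
            ("sample_D", "N", 203, "R", "K"),
        ],
    )
    return conn


def samples_with_mutation(conn: sqlite3.Connection, label: str) -> list:
    """Return the sorted names of samples that carry the given mutation."""
    gene, _ref, position, alt = parse_aa_mutation(label)
    rows = conn.execute(
        """
        SELECT DISTINCT sample_name
        FROM aa_mutations
        WHERE gene = ? AND position = ? AND alt_aa = ?
        ORDER BY sample_name
        """,
        (gene, position, alt),
    ).fetchall()
    return [row[0] for row in rows]
```

Calling `samples_with_mutation(demo_connection(), "S:N501Y")` on the demo data returns the two samples that carry the change. Passing the values as query parameters rather than formatting them into the SQL string keeps the lookup safe against malformed input.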