Lingqiao Song1, Hanshi Liu1, Fiona S L Brinkman2, Erin Gill2, Emma J Griffiths3, William W L Hsiao2, Sarah Savić-Kallesøe2, Sandrine Moreira4, Gary Van Domselaar5, Ma'n H Zawati1, Yann Joly1.
Abstract
COVID-19 was declared to be a pandemic in March 2020 by the World Health Organization. Timely sharing of viral genomic sequencing data accompanied by a minimal set of contextual data is essential for informing regional, national, and international public health responses. Such contextual data is also necessary for developing and improving clinical therapies and vaccines, and enhancing the scientific community's understanding of the SARS-CoV-2 virus. The Canadian COVID-19 Genomics Network (CanCOGeN) was launched in April 2020 to coordinate and upscale existing genomics-based COVID-19 research and surveillance efforts. CanCOGeN is performing large-scale sequencing of both the genomes of SARS-CoV-2 virus samples (VirusSeq) and affected Canadians (HostSeq). This paper addresses the privacy concerns associated with sharing the viral sequence data with a pre-defined set of contextual data describing the sample source and case attributes of the sequence data in the Canadian context. Currently, the viral genome sequences are shared by provincial public health laboratories and their healthcare and academic partners with the Canadian National Microbiology Laboratory and with publicly accessible databases. However, data sharing delays and the provision of incomplete contextual data often occur because publicly releasing such data triggers privacy and data governance concerns. The CanCOGeN Ethics and Governance Expert Working Group has therefore investigated several privacy issues cited by CanCOGeN data providers/stewards. This paper addresses these privacy concerns and offers insights primarily in the Canadian context, although similar privacy considerations also exist in other jurisdictions. We maintain that sharing viral sequencing data and its limited associated contextual data in the public domain generally does not pose insurmountable privacy challenges.
However, privacy risks associated with reidentification should be actively monitored due to advancements in reidentification methods and the evolving pandemic landscape. We also argue that during a global health emergency such as COVID-19, privacy should not be used as a blanket measure to prevent such genomic data sharing due to the significant benefits it provides towards public health responses and ongoing research activities.
Keywords: COVID-19; contextual data; data-sharing strategy; genomic (or scientific) governance; health information access; metadata; privacy; viral sequence
Year: 2022 PMID: 35401651 PMCID: PMC8988250 DOI: 10.3389/fgene.2021.716541
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
MIxS Compliance and Implementation Metadata Standards (Genomics Standards Consortium, 2021).
| Field Name | Definition |
|---|---|
| sample collector sample ID | The user-defined name for the sample. |
| sample collected by | The name of the agency that collected the original sample. |
| sequence submitted by | The name of the agency that generated the sequence. |
| sample collection date | The date on which the sample was collected. |
| geo_loc_name (country) | The country where the sample was collected. |
| geo_loc_name (state/province/territory) | The province/territory where the sample was collected. |
| organism | Taxonomic name of the organism. |
| isolate | Identifier of the specific isolate. |
| isolation source | The material sampled (this information is encoded by 7 additional fields, which need only be filled as applicable, depending on sample type: anatomical material, anatomical site, body product, environmental material, environmental site, collection device, collection method). |
| host (scientific name) | The taxonomic, or scientific, name of the host. |
| host disease | The name of the disease experienced by the host. |
| host age | Age of host at the time of sampling. |
| host gender | The gender of the host at the time of sample collection. |
| sequencing instrument | The model of the sequencing instrument used. |
| consensus sequence software name | The name of software used to generate the consensus sequence. |
| consensus sequence software version | The version of the software used to generate the consensus sequence. |
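
As an illustration of how a submitter might check a record against the field list above before sharing it, the following is a minimal sketch. The field names are copied from the table; the helper `missing_fields`, the choice to treat every field as required, and the example values are assumptions made for this sketch and are not part of the CanCOGeN specification.

```python
# Field names copied from the table above. Treating all of them as
# required is an assumption of this sketch, not a rule from the source.
CONTEXTUAL_FIELDS = [
    "sample collector sample ID",
    "sample collected by",
    "sequence submitted by",
    "sample collection date",
    "geo_loc_name (country)",
    "geo_loc_name (state/province/territory)",
    "organism",
    "isolate",
    "isolation source",
    "host (scientific name)",
    "host disease",
    "host age",
    "host gender",
    "sequencing instrument",
    "consensus sequence software name",
    "consensus sequence software version",
]


def missing_fields(record):
    """Return the table fields that are absent or blank in `record`.

    `record` is a dict mapping field names to values (hypothetical shape).
    The result preserves the order of the table.
    """
    missing = []
    for name in CONTEXTUAL_FIELDS:
        value = record.get(name)
        # A field counts as missing if it is absent, None, or whitespace only.
        if value is None or str(value).strip() == "":
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical, partially filled record used only for illustration.
    example = {
        "geo_loc_name (country)": "Canada",
        "organism": "Severe acute respiratory syndrome coronavirus 2",
        "host (scientific name)": "Homo sapiens",
    }
    for name in missing_fields(example):
        print("missing:", name)
```

A check of this kind only reports completeness; it says nothing about whether a given value (for example, host age or sample collection date) is appropriate to release.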