Hatairat Yingtaweesittikul1,2, Karrie Ko3,4,5,6,7, Nurdyana Abdul Rahman4,6, Shireen Yan Ling Tan4,6, Niranjan Nagarajan3,7, Chayaporn Suphavilai3.
Abstract
Background: The ongoing COVID-19 pandemic is a global health crisis caused by the spread of SARS-CoV-2. Establishing links between known cases is crucial for the containment of COVID-19. In the healthcare setting, the ability to rapidly identify potential healthcare-associated COVID-19 clusters is critical for healthcare worker and patient safety. Increasing sequencing technology accessibility has allowed routine clinical diagnostic laboratories to sequence SARS-CoV-2 in clinical samples. However, these laboratories often lack specialized informatics skills required for sequence analysis. Therefore, an on-site, intuitive sequence analysis tool that enables clinical laboratory users to analyze multiple genomes and derive clinically relevant information within an actionable timeframe is needed.
Keywords: SARS-CoV-2; nanopore sequencing; outbreak tracking; web application; whole genome sequencing
Year: 2021 PMID: 34970567 PMCID: PMC8712659 DOI: 10.3389/fmed.2021.790662
Source DB: PubMed Journal: Front Med (Lausanne) ISSN: 2296-858X
Figure 1. CalmBelt overview. (A) On-site sequencing protocol based on the Nanopore sequencing platform and the ARTIC protocol for sequencing and generating draft genomes. (B) Rapid analysis pipeline. Multiple SARS-CoV-2 genomes are aligned to the reference genome. A distance matrix calculated from the whole-genome alignment is used for hierarchical clustering. CalmBelt automatically generates an interactive tree capturing whole-genome similarity, collection date (x-axis), and case information (color).
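The distance-matrix and clustering step in panel (B) can be illustrated with a minimal sketch. The caption does not state which distance metric or linkage method CalmBelt uses, so this example assumes a mismatch count over aligned positions (skipping `N` and gap characters) and average linkage. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations

IGNORED = {"N", "-"}  # assumption: ambiguous bases and gaps do not count as differences


def pairwise_distance(a, b):
    """Count positions where two aligned sequences carry different called bases."""
    return sum(
        1
        for x, y in zip(a, b)
        if x != y and x not in IGNORED and y not in IGNORED
    )


def distance_matrix(seqs):
    """Map each unordered pair of sample names to its pairwise distance."""
    return {
        frozenset((p, q)): pairwise_distance(seqs[p], seqs[q])
        for p, q in combinations(seqs, 2)
    }


def average_linkage(seqs):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    dist = distance_matrix(seqs)

    def linkage(c1, c2):
        # Mean distance over all cross-cluster sample pairs.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in seqs]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((c1, c2, linkage(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Toy aligned sequences (made up for illustration, not real genomes).
aligned = {
    "ref": "ACGTACGTAC",
    "case1": "ACGTACGTAC",
    "case2": "ACGTACGTAT",
    "case3": "TCGAACGNAT",
}
merges = average_linkage(aligned)
for left, right, d in merges:
    print(left, right, round(d, 2))
```

The merge list is enough to draw a dendrogram. For real data, a library routine such as SciPy's hierarchical clustering would be the more practical choice.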
Figure 2. Statistics of SARS-CoV-2 genomes in Singapore (January 2020–June 2021). (A) A time series presenting the proportion of different clades based on GISAID criteria and the newly defined lineage names according to WHO. (B) Nucleotide diversity across SARS-CoV-2 genomic loci. (C,D) t-SNE plots visualizing clusters of SARS-CoV-2 genomes, where the distances are based on high-diversity loci. (E,F) Dendrograms presenting different SARS-CoV-2 genomes along with collection date and lineage information.
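As a rough illustration of the per-locus nucleotide diversity in panel (B), the sketch below computes, for each alignment column, the probability that two distinct sequences drawn at random carry different bases. The caption does not give the exact formula used in the figure, so this definition is an assumption, and the function names are hypothetical.

```python
from collections import Counter

CALLED_BASES = {"A", "C", "G", "T"}


def site_diversity(column):
    """Probability that two distinct sequences differ at this alignment column."""
    counts = Counter(base for base in column if base in CALLED_BASES)
    n = sum(counts.values())
    if n < 2:
        return 0.0  # not enough called bases to compare
    identical_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - identical_pairs / (n * (n - 1))


def diversity_profile(aligned_seqs):
    """Per-site diversity for a list of equal-length aligned sequences."""
    return [site_diversity(column) for column in zip(*aligned_seqs)]


# Toy alignment (made up): only the last column varies.
profile = diversity_profile(["ACGT", "ACGA", "ACGA", "ACGT"])
print(profile)
```

Sites with the highest values would be candidates for the "high-diversity loci" on which the distances in panels (C,D) are based.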
Figure 3. Relationship of 12 SARS-CoV-2 genomes observed in the community (C), healthcare workers (H), and dormitory residents (D). (A) A similarity tree based on a combination of whole-genome similarity and collection date. (B) A phylogenetic tree based on multiple sequence alignment and a predefined mutation rate model (MAFFT and IQ-TREE 2), where ** indicates zero distance.
Figure 4. A genome similarity tree showing the relationship among 89 randomly selected, de-identified positive cases labeled as dormitory residents. These positive cases were detected in 12 dormitories (D1–D12) during the foreign worker dormitory outbreak in Singapore (April–August 2020). Each data point represents a positive case, and colors represent different de-identified dormitories.
Figure 5. Inspecting mutations in a specific region for each month. (A) Mutation frequencies of three genomic regions in the N gene (PCR primers US-CDC-N1 to N3). (B–D) CalmBelt reports the number of cases for each amino acid change (missense mutation) according to user input. (D) Number of cases harboring co-occurring amino acid changes (L452R and P681R in the S gene).
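The monthly count of cases carrying co-occurring amino acid changes, as in panel (D), can be sketched as follows. The record layout (`month`, `changes`) and the sample records are hypothetical and are not taken from CalmBelt or from the study data.

```python
from collections import Counter


def count_cooccurrence(cases, required_changes):
    """Count cases per month that carry every change in required_changes."""
    required = set(required_changes)
    counts = Counter()
    for case in cases:
        if required <= set(case["changes"]):  # subset test: all required changes present
            counts[case["month"]] += 1
    return dict(counts)


# Made-up records for illustration only.
cases = [
    {"month": "2021-04", "changes": {"S:L452R", "S:P681R"}},
    {"month": "2021-04", "changes": {"S:L452R"}},
    {"month": "2021-05", "changes": {"S:L452R", "S:P681R", "N:D377Y"}},
    {"month": "2021-05", "changes": {"S:L452R", "S:P681R"}},
]
monthly = count_cooccurrence(cases, ["S:L452R", "S:P681R"])
print(monthly)
```

Passing a single change instead of a pair gives the per-change counts described for the other panels.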