Tanner Wiegand1, Artem Nemudryi1, Anna Nemudraia1, Aidan McVey1, Agusta Little1, David N Taylor2, Seth T Walk1, Blake Wiedenheft1.
Abstract
In late December of 2019, high-throughput sequencing technologies enabled rapid identification of SARS-CoV-2 as the etiological agent of COVID-19, and global sequencing efforts are now a critical tool for monitoring the ongoing spread and evolution of this virus. Here, we provide a short retrospective analysis of SARS-CoV-2 variants by analyzing a subset (n = 97,437) of all publicly available SARS-CoV-2 genomes (n = ~11.9 million) that were randomly selected but equally distributed over the course of the pandemic. We plot the appearance of new variants of concern (VOCs) over time and show that the mutation rates in Omicron (BA.1) and Omicron sub-lineages (BA.2-BA.5) are significantly elevated compared to previously identified SARS-CoV-2 variants. Mutations in Omicron are primarily restricted to the spike and nucleocapsid proteins, while 24 other viral proteins (including those involved in SARS-CoV-2 replication) are generally conserved. Collectively, this suggests that the genetic distinction of Omicron primarily arose from selective pressures on the spike, and that the fidelity of replication of this variant has not been altered.
Keywords: BA.4; BA.5; COVID-19; Omicron; SARS-CoV-2; viral surveillance
Year: 2022 PMID: 36146815 PMCID: PMC9505243 DOI: 10.3390/v14092009
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.818
Figure 1. Phylogenetic relationship of named SARS-CoV-2 variants. Variants of concern (VOCs) are represented by colored nodes. The phylogenetic tree was adapted from data provided by NextStrain, CoVariants (i.e., covariants.org, http://covariants.org (accessed on 18 July 2022)), and Pangolin (i.e., cov-lineages.org, http://cov-lineages.org (accessed on 18 July 2022)) [1,8,13].
Figure 2. (A) Non-synonymous mutations acquired over time in 26 SARS-CoV-2 protein sequences extracted from 97,437 genomes from 19 December 2021, to 17 June 2022 (GISAID accessions available at: https://github.com/WiedenheftLab/Omicron (accessed on 18 July 2022) and DOI: https://doi.org/10.55876/gis8.220721mv (accessed on 18 July 2022)). Genomes included in this analysis are a random sampling of 11,336,176 SARS-CoV-2 genomes from GISAID, which were quality filtered using NextClade (“good” overall QC status) and then sampled with the Filter utility in NextStrain (selecting up to 120 genomes per country per month of the pandemic) [1,14]. Variants of concern (VOCs) are shown in bold and are colored as in Figure 1. Vertical lines mark the date the first sequence for each lineage was identified. The time elapsed between first detection and VOC designation by the WHO is shown as dotted lines above the graph. Dots are colored to match the variant names, and grey circles represent genomes from non-VOC lineages. A linear regression line is shown in black. Omicron variants deviate from the trend. (B) Synonymous mutations in the SARS-CoV-2 genomes shown in panel (A). (C) Non-synonymous mutations in the Omicron RNA replication proteins (n = 13,094 genomes) are shown as red lines on schematic representations of each protein, and the frequency of each mutation is shown as a vertical line (red). Most non-synonymous mutations are found in less than 1% of Omicron sequences. (D) Schematic depiction of SARS-CoV-2 protein coding sequences, with each gene colored according to its respective amino acid mutation rate. These rates are normalized to account for the length of each protein (i.e., substitutions/amino acids in protein/month of the pandemic).
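The sampling cap and the rate normalization described in the Figure 2 legend can be illustrated with a minimal sketch. This is not the authors' pipeline, which used NextClade and the Filter utility in NextStrain; the function names, record fields (`country`, `month`), and all numbers below are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def cap_per_group(records, cap=120, seed=0):
    """Randomly keep at most `cap` records per (country, month) group.

    `records` is a list of dicts with hypothetical keys "country" and "month".
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["country"], rec["month"])].append(rec)

    sampled = []
    for key in sorted(groups):
        members = groups[key]
        if len(members) > cap:
            # Down-sample only the groups that exceed the cap.
            members = rng.sample(members, cap)
        sampled.extend(members)
    return sampled


def normalized_rate(substitutions, protein_length, months):
    """Amino acid substitutions per residue per month, as in panel (D)."""
    return substitutions / protein_length / months


# Illustrative input: one over-represented group and one small group.
records = (
    [{"country": "A", "month": "2021-01", "id": i} for i in range(300)]
    + [{"country": "B", "month": "2021-01", "id": i} for i in range(5)]
)
subset = cap_per_group(records, cap=120)

# Illustrative numbers only: 12 substitutions in a 400-residue protein
# observed over 30 months.
rate = normalized_rate(12, 400, 30)
```

Capping each country-month group keeps heavily sequenced regions and periods from dominating the subset, and dividing by protein length and elapsed months allows proteins of very different sizes to be compared on one scale.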
Figure 3. SARS-CoV-2 genomic sequences submitted to the GISAID data repository between January 2020 and June 2022 [12].