Mercedes Paz, Fabián Aldunate, Rodrigo Arce, Irene Ferreiro, Juan Cristina.
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a novel virus that belongs to the family Coronaviridae. This virus causes a respiratory illness known as coronavirus disease 2019 (COVID-19) and is responsible for the COVID-19 pandemic. Due to its massive circulation around the world and its capacity to mutate, genomic studies are much needed in order to reveal new variants of concern (VOCs). On November 26th, 2021, the WHO announced that a new SARS-CoV-2 VOC, named Omicron, had emerged. In order to gain insight into the emergence, spread and evolution of Omicron SARS-CoV-2 variants, a comprehensive phylogenetic study was performed. The results of these studies revealed significant differences in codon usage among the S genes of SARS-CoV-2 VOCs Alpha, Beta, Gamma, Delta and Omicron, which can be linked to SARS-CoV-2 genotypes. The Omicron variant did not evolve out of one of the early VOCs; instead, it belongs to a completely different genetic lineage from previous ones. Strains classified as Omicron variants evolved from ancestors that existed around May 15th, 2020, suggesting that this VOC may have been circulating undetected for a period of time until its emergence was observed in South Africa. A rate of evolution of 5.61 × 10−4 substitutions/site/year was found for the Omicron strains included in these analyses. The results of these studies demonstrate that S genes carry suitable genetic information for clear assignment of emerging VOCs to their specific genotypes.
Keywords: COVID-19; Coronavirus; Omicron; SARS-CoV-2; evolution
Year: 2022 PMID: 35331836 PMCID: PMC8937608 DOI: 10.1016/j.virusres.2022.198753
Source DB: PubMed Journal: Virus Res ISSN: 0168-1702 Impact factor: 6.286
Fig. 1. PCA of codon usage in Spike proteins from SARS-CoV-2 VOC strains. (A) Position of the Spike proteins in the plane formed by the first two principal components of the PCA. SVD was used to calculate principal components and unit variance scaling was applied. The proportion of variance explained by each axis is shown in parentheses. Prediction ellipses are such that, with probability 0.95, a new observation from the same group will fall inside the ellipse. Genotypes are indicated at the right of the figure. N = 256 data points. (B) Heatmap of codon usage in Spike proteins. Unit variance scaling was applied. Each column corresponds to a different Spike protein from SARS-CoV-2 VOC strains, whose genotype is shown in the upper part of the figure. Both rows and columns are clustered using correlation distance and average linkage.
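The procedure named in this caption (codon usage per sequence, unit variance scaling, principal components via SVD) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `codon_frequencies` and `pca_unit_variance` and the toy sequences are hypothetical, and the choice of relative codon frequencies as the input measure is an assumption.

```python
from itertools import product

import numpy as np

# All 64 codons in a fixed order (DNA alphabet; U is mapped to T below).
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_frequencies(seq):
    """Relative frequency of each of the 64 codons in an in-frame coding sequence."""
    seq = seq.upper().replace("U", "T")
    counts = dict.fromkeys(CODONS, 0)
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        if codon in counts:  # skip codons containing ambiguous bases
            counts[codon] += 1
    total = sum(counts.values())
    return np.array([counts[c] / total if total else 0.0 for c in CODONS])


def pca_unit_variance(X, n_components=2):
    """PCA via SVD after centering each column and scaling it to unit variance."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    std = centered.std(axis=0, ddof=1)
    keep = std > 0  # constant columns cannot be scaled, so they are dropped
    Z = centered[:, keep] / std[keep]
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # sample coordinates
    explained = S ** 2 / np.sum(S ** 2)  # proportion of variance per axis
    return scores, explained[:n_components]


# Toy input (hypothetical sequences, not real Spike genes).
toy = ["ATGGCTGCTTAA", "ATGGCCGCCTAA", "ATGGCAGCTTGA", "ATGGCGGCGTAG"]
X = np.array([codon_frequencies(s) for s in toy])
scores, explained = pca_unit_variance(X)
print(scores.shape, explained)
```

With real data, each row of `X` would hold the codon usage of one Spike gene, and the two columns of `scores` would give the coordinates plotted in panel (A).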
Fig. 2. PCA of nucleotide composition in Spike proteins from SARS-CoV-2 VOCs. (A) PCA of nucleotide frequencies at first, second and third codon positions in the S protein from SARS-CoV-2 strains. SVD was used to calculate principal components and unit variance scaling was applied. The proportion of variance explained by each axis is shown in parentheses. Prediction ellipses are such that, with probability 0.95, a new observation from the same group will fall inside the ellipse. Genotypes are indicated at the right of the figure. N = 256 data points. (B) Heatmap of nucleotide frequencies in Spike proteins. Frequencies for A, C, U and G at first, second and third codon positions are indicated 1 through 3. Unit variance scaling was applied. Each column corresponds to a different Spike protein from SARS-CoV-2 VOC strains, whose genotype is shown in the upper part of the figure. Both rows and columns are clustered using correlation distance and average linkage.
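The twelve variables described in this caption (A, C, U and G at codon positions 1 through 3) could be computed along these lines. The function name `positional_nucleotide_frequencies` and the labels `A1` ... `U3` are hypothetical, chosen only to mirror the caption's "1 through 3" notation.

```python
def positional_nucleotide_frequencies(seq):
    """Frequencies of A, C, G and U at codon positions 1, 2 and 3.

    Returns a dict with twelve keys such as 'A1', 'C2', 'U3'.
    """
    seq = seq.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    freqs = {}
    for pos in range(3):
        # Every third base starting at this codon position; ambiguous bases skipped.
        bases = [b for b in seq[pos:usable:3] if b in "ACGU"]
        n = len(bases)
        for base in "ACGU":
            freqs[f"{base}{pos + 1}"] = bases.count(base) / n if n else 0.0
    return freqs


# Toy input (hypothetical sequence, not a real Spike gene).
print(positional_nucleotide_frequencies("ATGGCTTAA"))
```

One such twelve-value vector per sequence would form a row of the matrix that is then scaled and decomposed for panel (A).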
Bayesian coalescent inference of Omicron SARS-CoV-2 strains.
| Parameter | Value | HPD | ESS |
|---|---|---|---|
| Posterior | -41527.05 | -41604.75 to -41432.15 | 214.50 |
| Prior | 318.78 | 255.70 to 397.14 | 209.40 |
| Likelihood | -41845.83 | -41863.00 to -41830.62 | 314.10 |
| Rate of evolution | 5.613 × 10−4 | 3.051 × 10−4 to 9.014 × 10−4 | 208.10 |
| tMRCA | 1.539 | 0.806 to 2.251 | 210.70 |
| tMRCA South Africa | 0.310 | 0.120 to 0.570 | |
See Supplementary Material Table 1 for strains included in this analysis.
The rate of evolution is indicated in substitutions/site/year.
tMRCA, time to the most recent common ancestor, is shown in years.
tMRCA South Africa, time to the most recent common ancestor for all strains isolated in South Africa. The dates estimated for the tMRCAs are indicated in bold.
In all cases, the mean values are shown.
HPD, highest posterior density interval.
ESS, effective sample size.
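Two quick checks on the table can be sketched in code. First, the first three values are consistent with the usual Bayesian identity log posterior = log prior + log likelihood (318.78 + (-41845.83) = -41527.05). Second, a tMRCA expressed in years before the most recent sample can be turned into a calendar date. The sampling date used below is a hypothetical placeholder, since the source does not state it, and the function name `tmrca_to_date` is likewise an assumption.

```python
from datetime import date, timedelta


def tmrca_to_date(most_recent_sample, tmrca_years):
    """Calendar date lying `tmrca_years` before the most recent sample.

    Uses an average year length of 365.25 days, so the result is approximate.
    """
    return most_recent_sample - timedelta(days=round(tmrca_years * 365.25))


# Consistency check on the first three table values (log scale).
log_prior, log_likelihood, log_posterior = 318.78, -41845.83, -41527.05
print(abs((log_prior + log_likelihood) - log_posterior) < 0.01)

# HYPOTHETICAL most recent sampling date: an assumption, not taken from the source.
assumed_last_sample = date(2021, 12, 1)
for label, years in [("mean", 1.539), ("HPD lower", 0.806), ("HPD upper", 2.251)]:
    print(label, tmrca_to_date(assumed_last_sample, years))
```

Because the result shifts day for day with the assumed sampling date, the output is only meaningful once the true date of the most recent strain in the data set is supplied.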
Fig. 3. Bayesian MCMC phylogenetic tree analysis of Omicron SARS-CoV-2 strains. A maximum clade credibility tree obtained using the HKY+I model, a strict molecular clock and the Birth-Death Skyline Serial population model is shown. The tree is rooted at the most recent common ancestor (MRCA). The two main clades containing strains isolated in South Africa are shown in blue and red, respectively. Time to the MRCA is shown in years at the bottom of the figure. The bar at the bottom of the tree denotes time in years.