Yali Hou1, Shilei Zhao1, Qi Liu1, Xiaolong Zhang1, Tong Sha1, Yankai Su1, Wenming Zhao1, Yiming Bao1, Yongbiao Xue2, Hua Chen3.
Abstract
SARS-CoV-2 is a new RNA virus affecting humans that has spread extensively through world populations since its first outbreak in December 2019. Whether the transmissibility and pathogenicity of SARS-CoV-2 in humans after zoonotic transfer are actively evolving, driven by adaptation to the new host and environments, is still under debate. Understanding the evolutionary mechanisms underlying the epidemiological and pathological characteristics of COVID-19 is essential for predicting the epidemic trend and for guiding disease control and treatment. Using novel strategies for identifying natural selection from within-species polymorphisms, together with 3,674,076 SARS-CoV-2 genome sequences from 169 countries as of December 30, 2021, we demonstrate with population genetic evidence that during the course of the SARS-CoV-2 pandemic in humans, (i) SARS-CoV-2 genomes are overall conserved under purifying selection, especially the 14 genes related to viral RNA replication, transcription, and assembly; and (ii) ongoing positive selection is actively driving the evolution of 6 genes (e.g., S, ORF3a, and N) that play critical roles in molecular processes involving pathogen-host interactions, including viral invasion into and egress from host cells and viral inhibition or evasion of the host immune response, possibly leading to high transmissibility and mild symptoms in SARS-CoV-2 evolution. Based on an established haplotype phylogenetic relationship of 138 viral clusters, a spatial and temporal landscape of 556 critical mutations is constructed; these mutations were selected for their divergence among viral haplotype clusters or for their repeated increase in frequency within at least 2 clusters. Among them, multiple mutations that potentially alter the transmissibility, pathogenicity, and virulence of SARS-CoV-2 are highlighted as warranting attention.
Keywords: COVID-19; Darwinian selection; Natural selection; SARS-CoV-2; Viral evolution
Year: 2022 PMID: 35760317 PMCID: PMC9233880 DOI: 10.1016/j.gpb.2022.05.009
Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN: 1672-0229 Impact factor: 6.409
Chi-squared tests to compare the Nm/Sm ratios between different mutation groups
| Mutation group | Criterion | Nm | Sm | Nm/Sm |
|---|---|---|---|---|
| Variations with low frequency of derived alleles | < 0.0001 | 45,292 | 14,305 | 3.17 |
| Variations with high frequency of derived alleles (a) | > 0.001 | 998 | 778 | 1.28 |
| Variations with high frequency of derived alleles (b) | 0.0001 < f < 0.001 | 4407 | 3181 | 1.39 |
| Variations with high frequency of derived alleles (c) | 0.001 < f < 0.01 | 870 | 722 | 1.20 |
| Widespread variations | > 30 countries | 3606 | 2395 | 1.51 |
| Non-widespread variations | < 15 countries | 41,398 | 11,347 | 3.65 |
| Long-time spanning mutations | > 300 days | 27,855 | 14,482 | 1.92 |
| Short-time spanning mutations | < 150 days | 14,998 | 1967 | 7.62 |
Note: Nm, nonsynonymous mutations; Sm, synonymous mutations.
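As a rough illustration of how two mutation groups in the table can be compared, the sketch below runs a Pearson chi-squared test on a 2x2 table of Nm and Sm counts for the low-frequency group (< 0.0001) and the high-frequency group (a) (> 0.001). The pairing of groups, the absence of a continuity correction, and the function name `chi2_2x2` are assumptions made for illustration; the record does not state exactly how the authors set up their tests.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table
    [[a, b], [c, d]]. Returns (chi2 statistic, p-value) with 1 degree of freedom."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# (Nm, Sm) counts taken from the table above.
low_freq = (45292, 14305)   # derived allele frequency < 0.0001
high_freq = (998, 778)      # derived allele frequency > 0.001, group (a)

ratio_low = low_freq[0] / low_freq[1]
ratio_high = high_freq[0] / high_freq[1]
chi2, p_value = chi2_2x2(*low_freq, *high_freq)

print(f"Nm/Sm low-frequency:  {ratio_low:.2f}")
print(f"Nm/Sm high-frequency: {ratio_high:.2f}")
print(f"chi2 = {chi2:.1f}, p = {p_value:.3g}")
```

A lower Nm/Sm ratio among high-frequency variants than among rare ones is the pattern expected when purifying selection removes nonsynonymous mutations before they can spread.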
Figure 1. Evidence of natural selection acting on the SARS-CoV-2 genome. A. Genes showing significant signals of positive selection in this study are marked in red, and those showing significant signals of negative selection in this study are in blue. B. The genetic diversity of each gene, which is indicated by Theta (w), calculated as nucleotide diversity per site in the sequences. C. The mutation frequency spectrum. D. The gene structures.
Figure 2. Trends of the Nm/Sm ratio with increasing allele frequency for genes with strong evidence of positive or purifying selection. A. Significantly increasing trends of the Nm/Sm ratio with increasing allele frequency for the N, ORF3a, and S genes, indicative of positive selection. B. Nonsignificant trends for the ORF7a, NSP8, and NSP9 genes, indicating no evidence of selection. C. Significantly decreasing trends of the Nm/Sm ratio with increasing allele frequency for the NSP1, NSP7, and M genes, indicative of purifying selection. Nm/Sm, nonsynonymous vs. synonymous mutations.
Figure 3. The genealogical relationship of worldwide haplotype clusters of SARS-CoV-2. The nodes represent different haplotype clusters, with node sizes proportional to the number of sequences belonging to each cluster. The number of line segments separated by dots between adjacent nodes indicates the Hamming distance between clusters. Within each node, its geographical distribution is presented. The listed mutations differentiating adjacent clusters are marked in purple for those within genes under positive selection, in cyan for those within genes under purifying selection, and in red for those that occurred repeatedly, at least twice, in distinct phylogenetic relationships.
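The Hamming distance used in Figure 3 counts the positions at which two equal-length haplotypes differ. A minimal sketch follows; the function name `hamming_distance` and the two toy haplotype strings are hypothetical and are not taken from the study's data.

```python
def hamming_distance(hap_a, hap_b):
    """Count the positions at which two equal-length haplotypes differ."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must have the same length")
    return sum(x != y for x, y in zip(hap_a, hap_b))

# Toy haplotypes over six variable sites (illustrative only).
cluster_1 = "ACGTAC"
cluster_2 = "ACCTAA"

distance = hamming_distance(cluster_1, cluster_2)
print(distance)  # 2: the haplotypes differ at the third and sixth sites
```

In a network drawn this way, two clusters separated by a distance of 2 would be joined by an edge with two line segments.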