Qiong Zhang1,2,3, Huai-Lan Guo3,4, Jing Wang3,4, Yao Zhang3,4, Ping-Ji Deng3,4, Fei-Feng Li3,4.
Abstract
Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) is the causative agent of the coronavirus disease 2019 (COVID-19) pandemic. In this study, we conducted a comparative analysis of the structural genes of SARS-CoV-2 and other CoVs. We found that the sequence of the E gene was the most evolutionarily conserved across 200 SARS-CoV-2 isolates. The E gene and M gene sequences of SARS-CoV-2 and NC014470 CoV were closely related and fell within the same branch of a phylogenetic tree. The absolute diversity of E gene and M gene sequences of SARS-CoV-2 isolates was similar to that of common CoVs (C-CoVs) infecting other organisms. The absolute diversity of the M gene sequence of the KJ481931 CoV, which can infect humans, was similar to that of SARS-CoV-2 and C-CoVs infecting other organisms. The M gene sequences of KJ481931 CoV (infecting humans), SARS-CoV-2 and NC014470 CoV (infecting other organisms) were closely related, falling within the same branch of a phylogenetic tree. Patterns of variation and evolutionary characteristics of the N gene and S gene were very similar. These data may be of value for understanding the origins and intermediate hosts of SARS-CoV-2.
Keywords: common coronaviruses (C-CoVs); evolution; intermediate hosts; severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2); structural gene
Year: 2022 PMID: 35464844 PMCID: PMC9024071 DOI: 10.3389/fgene.2022.801902
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.772
Analysis of structural gene sequences of 200 severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) isolates.
| Genes | Size (nt) | Variations | Variance rate (%) | Gene size variance rate | SNPs | Mutations | For further analysis |
|---|---|---|---|---|---|---|---|
| E gene | 228 | 2 | 1 | 0.44/10,000 | MT263389, MT259248 | - | MT263389, MT259248 |
| M gene | 669 | 9 | 4.5 | 0.67/10,000 | MT259252, MT263384, MT263410, MT263389, MT263443, MT263388, MT263422, MT263447 | MT263397 | MT263410, MT263389, MT263397 |
| N gene | 908 | 28 | 14 | 1.54/10,000 | MT263398 | MT259237, MT259269, MT259274, MT263429 | MT263410, MT263074, MT263422, MT259237, MT259269, MT256917, MT263386, MT263411, MT258382, MT263398, MT259274, MT259270, MT263429, MT259267, MT263421, MT256924, LC534419, MT263435, MT263395 |
| S gene | 3,822 | 89 | 44.5 | 1.16/10,000 | MT259262, MT263410, MT259257, MT263441, MT263469, MT263386, MT259287, MT263074, MT259269, MT259227 | MT263414, MT263460, MT263384, MT259249 | MT263410, MT263074-3, MT263466, MT263384, MT263443, MT259269, MT263386, MT259249, MT263414, MT259262, MT259257, MT259236, MT259282, MT263441, MT262915, MT259287, MT251973, MT263393, MT263385, MT259253, MT263457, MT263420, MT259227 |
Notes: Variations include single nucleotide polymorphisms (SNPs) and mutations.
Variance rate = (variations/200) × 100%.
Gene size variance rate = (variations/200/gene size) × 10,000, expressed per 10,000.
There were two variations each in the MT256917 and MT256918 CoVs.
There were two mutations in the MT263466 CoV.
No variation controls for further analysis of structural genes.
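The two rate formulas in the notes can be checked against the table with a short calculation. The following is a minimal sketch, not the authors' code: the function and variable names are hypothetical, and the gene sizes and variation counts are copied from the table above.

```python
# Illustrative recomputation of the table's two rate columns.
# All names here are assumptions made for this sketch; the input
# numbers (gene size, variation count, 200 isolates) come from the table.
N_ISOLATES = 200

GENES = {
    "E": {"size_nt": 228, "variations": 2},
    "M": {"size_nt": 669, "variations": 9},
    "N": {"size_nt": 908, "variations": 28},
    "S": {"size_nt": 3822, "variations": 89},
}


def variance_rate(variations, n_isolates=N_ISOLATES):
    """Variance rate as a percentage: (variations / isolates) x 100."""
    return variations / n_isolates * 100


def gene_size_variance_rate(variations, size_nt, n_isolates=N_ISOLATES):
    """Variations per isolate per nucleotide, scaled to 'per 10,000'."""
    return variations / n_isolates / size_nt * 10_000


for gene, row in GENES.items():
    vr = variance_rate(row["variations"])
    gsvr = gene_size_variance_rate(row["variations"], row["size_nt"])
    print(f"{gene} gene: variance rate {vr:.1f}%, "
          f"gene size variance rate {gsvr:.2f}/10,000")
```

Normalising by gene length is what allows the short E gene (228 nt) and the long S gene (3,822 nt) to be compared on the same scale.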
FIGURE 1. Absolute diversity and variations in the structural genes of 200 severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) isolates. The similarity and absolute diversity of the structural gene sequences were very high. Two SARS-CoV-2 isolates had two single nucleotide polymorphisms (SNPs) within the E gene, nine isolates had three variations (one mutation and two SNPs) within the M gene, 28 strains had 22 variations (13 mutations and nine SNPs) within the N gene, and 89 strains had 25 variations (16 mutations and nine SNPs) within the S gene.
FIGURE 2. Evolutionary characteristics of the structural genes of 200 severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) isolates. (A,B): The two isolates with single nucleotide polymorphisms (SNPs) in the E gene were evolutionarily distinct. (C,D): The nine isolates with variations in the M gene were evolutionarily distinct. (E,F): The 28 isolates with variations in the N gene were evolutionarily distinct. (G,H): The 88 strains with variations in the S gene were evolutionarily distinct.
FIGURE 3. Evolutionary characteristics and absolute diversity of structural genes in severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) isolates and common coronaviruses (C-CoVs) that infect humans. (A): The E gene sequences of SARS-CoV-2 isolates were evolutionary intermediates between KJ481931 and MG011357. (C): The M gene sequences of SARS-CoV-2 isolates were evolutionary intermediates between KJ481931 and a group of C-CoVs (KP209309, KY581691, KY581689, KY581686, KP209307, KP209313, and KP209306). (B,D): The absolute diversities of the E and M gene sequences within the KJ481931 C-CoV were similar to those of the E and M gene sequences of SARS-CoV-2 isolates. (E,G): The N and S gene sequences of SARS-CoV-2 isolates were evolutionarily distinct. (F,H): The absolute diversities of the N and S gene sequences of SARS-CoV-2 isolates differed from those of all C-CoVs that infect humans.
FIGURE 4. Evolutionary characteristics and absolute diversities of structural genes in severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) isolates and common coronaviruses (C-CoVs) that infect other organisms. (A,C): The E and M gene sequences of SARS-CoV-2 isolates were evolutionary intermediates between C-CoVs that infect other organisms. (A): In terms of E gene sequences, SARS-CoV-2 isolates were most closely related to the C-CoVs NC014470, DQ415914, NC026011, NC006213, JN874559 and U00735. (B): The absolute diversities of the E gene sequences of the C-CoVs NC014470, DQ415914, NC026011, NC006213, JN874559, and U00735 were similar to those of the E gene sequences of SARS-CoV-2 isolates. (C): In terms of M gene sequences, SARS-CoV-2 isolates were most closely evolutionarily related to the C-CoVs NC014470, EF065513 and NC030886. (D): The absolute diversities of the M gene sequences of the C-CoVs NC014470, EF065513 and NC030886 were similar to those of the M gene sequences of SARS-CoV-2 isolates. (E,G): In terms of N and S gene sequences, SARS-CoV-2 isolates were most closely evolutionarily related to the C-CoV NC014470, forming a separate clade. (F): The absolute diversity of N gene sequences of SARS-CoV-2 isolates was similar to that of the C-CoV NC014470. (H): The absolute diversity of the S gene sequence of the C-CoV NC014470 was similar to those of other C-CoVs.
Analysis of structural gene sequences of common coronaviruses (C-CoVs) evolutionarily related to severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2).
| Genes | C-CoVs infecting humans | C-CoVs infecting other organisms |
|---|---|---|
| E gene | KJ481931, MG011357 | NC014470, DQ415914, NC026011, NC006213, JN874559, U00735 |
| M gene | KJ481931, KP209309, KY581691, KY581689, KY581686, KP209307, KP209313, KP209306 | NC014470, EF065513, NC030886 |
| N gene | KJ156911, KJ156905 | NC014470 |
| S gene | KJ481931, MG011344 | NC014470 |
FIGURE 5. Evolutionary characteristics and absolute diversities of structural genes of severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) isolates and common coronaviruses (C-CoVs). (A,B,D,E): The E and M gene sequences of SARS-CoV-2 isolates and C-CoVs could be grouped into three clades (CI, CII and CIII). (A,B): The E gene sequences of SARS-CoV-2 isolates and C-CoV NC014470 were evolutionary intermediates between C-CoVs that infect humans and those that infect other organisms. (C): The absolute diversities of E gene sequences of SARS-CoV-2 isolates were more similar to those of C-CoVs that infect other organisms. (D,E): The M gene sequences of SARS-CoV-2 isolates, NC014470 (infecting other organisms) and KJ481931 (infecting humans) were closely related and grouped together in the same branch of a phylogenetic tree. The M gene sequences of SARS-CoV-2 isolates were evolutionary intermediates between those of NC014470 (infecting other organisms) and KJ481931 (infecting humans). (F): The absolute diversities of M gene sequences of SARS-CoV-2 isolates were more similar to those of C-CoVs infecting other organisms. (F) (Green box): The absolute diversity of M gene sequences of KJ481931 (infecting humans) was more similar to those of M gene sequences from SARS-CoV-2 isolates and C-CoVs that infect other organisms. (G,H,J,K): The N and S gene sequences of SARS-CoV-2 strains grouped closely together on the same branch of an evolutionary tree. (G,H,J,K): The N and S gene sequences of NC014470 were located between those of SARS-CoV-2 isolates and C-CoVs that infect humans. (I,L): The absolute diversities of N and S gene sequences of SARS-CoV-2 isolates were unlike those of C-CoVs.