Mohamed Emam1, Mariam Oweda1, Agostinho Antunes2, Mohamed El-Hadidi1.
Abstract
The human β-coronavirus SARS-CoV-2 epidemic started in late December 2019 in Wuhan, China. It causes COVID-19, which has become a pandemic. Each of the five known human β-coronaviruses has four major structural proteins (E, M, N and S) and 16 non-structural proteins encoded by ORF1a and ORF1b together (ORF1ab) that are involved in virus pathogenicity and infectivity. Here, we performed detailed positive selection analyses for these six genes among the four previously known human β-coronaviruses and within 38 SARS-CoV-2 genomes to assess signatures of adaptive evolution using maximum likelihood approaches. Our results suggest that three genes (E, S and ORF1ab) show strong signatures of positive selection among human β-coronaviruses, affecting codons located in functionally important protein domains. The E protein-coding gene showed signatures of positive selection at two sites, Asp 66 and Ser 68, located in the C-terminal part of a putative transmembrane α-helical domain, which is preferentially composed of hydrophilic residues. Such substitutions at the Asp and Ser sites (hydrophilic residues) increase the stability of the transmembrane domain in SARS-CoV-2. Moreover, substitutions were found in the S1 N-terminal domain of the spike (S) protein, all of them located on the S protein surface, suggesting their importance in viral transmissibility and survival. Furthermore, evidence of strong positive selection was detected in three of the SARS-CoV-2 non-structural proteins (NSP1, NSP3, NSP16), which are encoded by ORF1ab and play vital roles in suppressing the host translation machinery, in viral replication and transcription, and in inhibiting the host immune response. These results offer insight into the role of positive selection in the SARS-CoV-2 encoded proteins, which will allow a better understanding of the virulent pathogenicity of the virus and may help identify targets for drug or vaccine design.
Keywords: Drug targeting; Maximum likelihood; Pathogenicity; Positive selection; SARS-COV-2
Year: 2021 PMID: 34118359 PMCID: PMC8190378 DOI: 10.1016/j.virusres.2021.198472
Source DB: PubMed Journal: Virus Res ISSN: 0168-1702 Impact factor: 3.303
Positively selected sites in the E, S and ORF1ab proteins among human β-coronaviruses (HBC): residue positions, amino acid changes and the posterior probability for each codon. The accession numbers of the HBC sequences are: NC_045512.2: SARS-CoV-2 Wuhan-Hu-1, NC_004718.3: SARS coronavirus Tor2, FJ882947.1: SARS coronavirus wtic-MB, FJ882926.1: SARS coronavirus ExoN1, NC_006577.2: Human coronavirus HKU1, KF430201.1: Human coronavirus HKU1-18, KF686342.1: Human coronavirus HKU1-11, NC_006213.1: Human coronavirus OC43, KF530099.1: Human coronavirus OC43-971-5, MK303621.1: Human coronavirus OC43 MDS4, NC_019843.3: Middle East respiratory syndrome coronavirus (MERS), MN723542.1: MERS Riyadh-KSA-036D1N, MG757605.1: MERS KSA-036D1N.
| Protein | Position | Positively selected AA | Substitution | Posterior probability (BEB) | Posterior probability (NEB) |
|---|---|---|---|---|---|
| E Protein | 66 | ASN (N) | VAL (V), LYS (K) and SER (S) | 0.948 | 0.997 |
| | 68 | SER (S) | PRO (P) and GLU (E) | 0.978 | 0.993 |
| S Protein | 26 | PRO (P) | ARG (R), PHE (F), SER (S), LYS (K) and ASN (N) | 0.875 | 0.990 |
| | 148 | ASN (N) | LYS (K) and PRO (P) | 0.851 | 0.984 |
| | 153 | MET (M) | ARG (R), PHE (F), TYR (Y) and THR (T) | 0.802 | 0.902 |
| ORF1ab Protein | | | | | |
| Nsp1 | 138 | ALA (A) | ILE (I), CYS (C), ARG (R) and TYR (Y) | 0.842 | 0.970 |
| Nsp3 | 196 | MET (M) | LEU (L), VAL (V) and GLU (E) | 0.823 | 0.939 |
| | 1229 | VAL (V) | GLU (E), GLY (G), SER (S) and THR (T) | 0.807 | 0.923 |
| Nsp16 | 216 | ARG (R) | LYS (K) and SER (S) | 0.820 | 0.939 |
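The posterior probabilities in the table can be screened against a cutoff. The sketch below copies the (protein, position, BEB, NEB) values from the table and lists the sites at or above a chosen threshold. The 0.95 cutoff is only a commonly used convention for codeml site output, not a threshold stated in this record, and the name `sites_above` is hypothetical.

```python
# (protein, position, BEB, NEB) values copied from the table above.
SITES = [
    ("E", 66, 0.948, 0.997),
    ("E", 68, 0.978, 0.993),
    ("S", 26, 0.875, 0.990),
    ("S", 148, 0.851, 0.984),
    ("S", 153, 0.802, 0.902),
    ("Nsp1", 138, 0.842, 0.970),
    ("Nsp3", 196, 0.823, 0.939),
    ("Nsp3", 1229, 0.807, 0.923),
    ("Nsp16", 216, 0.820, 0.939),
]


def sites_above(sites, cutoff, method="BEB"):
    """Return (protein, position) pairs whose posterior probability
    under the chosen method (BEB or NEB) is at least `cutoff`.

    The cutoff is an assumption supplied by the caller, not a value
    taken from the source.
    """
    column = {"BEB": 2, "NEB": 3}[method]
    return [(row[0], row[1]) for row in sites if row[column] >= cutoff]


beb_hits = sites_above(SITES, 0.95, method="BEB")
neb_hits = sites_above(SITES, 0.95, method="NEB")
```

Comparing `beb_hits` with `neb_hits` shows how much the choice of method matters for the same cutoff, since the NEB probabilities in the table are uniformly higher than the BEB ones.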
Fig. 1. I-TASSER model of the SARS-CoV-2 E protein (QHD43418). Positively selected residues with P < 0.05 are shown as transparent spheres and are marked by the corresponding labels.
Fig. 2. PDB structure of the S protein (6XR8). Positively selected residues with P < 0.05 are shown as transparent spheres and are marked by the corresponding labels.
CODEML output: likelihood ratio test (LRT) results for the M7 vs M8 and M8 vs M8a model comparisons, with the P-value for each of the studied genes. HBC: human β-coronaviruses; SARS-CoV-2: severe acute respiratory syndrome coronavirus 2.
| Gene | Model 7 (lnL) null model | Model 8 (lnL) alt model | Model 8a (lnL) null model | LRT (M7 vs M8) | P-value | LRT (M8 vs M8a) | P-value |
|---|---|---|---|---|---|---|---|
| M gene (HBC) | -3194.123 | -3194.123 | -3193.271 | 0 | 1 | -1.704336 | 1 |
| N gene (HBC) | -6751.231 | -6744.739 | -6743.957 | 12.983 | 0.0022 | -1.564974 | 1 |
| ORF1a (HBC) | -62127.61 | -62101.35 | -62109.86 | 52.519172 | 1.182e-11 | 17.016658 | 0.00011 |
| ORF1ab (HBC) | -95269.129 | -95229.14 | -95231.61 | 79.963936 | 2.595e-17 | 4.92846 | 0.03962 |
| S gene (HBC) | -19554.79 | -19539.19 | -19543.94 | 31.196074 | 3.364e-07 | 9.493334 | 0.004124 |
| E gene (HBC) | -1198.212 | -1193.354 | -1225.632 | 9.715472 | 0.00932 | 64.554818 | 5.633e-15 |
| E gene (SARS-CoV-2) | -299.1647 | -298.6350 | -298.8755 | 1.059568 | 1 | 0.481106 | 0.48792 |
| M gene (SARS-CoV-2) | -904.370 | -903.7801 | -903.780 | 1.179642 | 1 | -0.000224 | 1 |
| ORF1ab (SARS-CoV-2) | -28191.49555 | -28189.43693 | -28190.8 | 4.11724 | 1 | 2.780354 | 0.954 |
| S gene (SARS-CoV-2) | -5042.16 | -5042.16 | -5042.16 | -4.4E-05 | 1 | -0.000 | 1 |
| N gene (SARS-CoV-2) | -1711.67 | -1711.67 | -1711.67 | -0.000674 | 1 | -0.0004 | 1 |
| ORF3a (SARS-CoV-2) | -1114.01 | -1113.97 | -1113.97 | 0.06619 | 1 | -0.003 | 1 |
| ORF6 (SARS-CoV-2) | -224.847 | -224.847 | -224.8472 | 0.000682 | 1 | -0.000 | 1 |
| ORF7 (SARS-CoV-2) | -478.682 | -478.682 | -478.683 | -2E-06 | 1 | 0.0010 | 1 |
| ORF8 (SARS-CoV-2) | -488.540 | -487.950 | -488.495 | 1.179918 | 1 | 1.09062 | 1 |
| ORF10 (SARS-CoV-2) | -148.138 | -148.138 | -148.138 | 0 | 1 | 0 | 1 |
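The LRT columns follow from the log-likelihoods as twice the difference between the alternative and null models, 2(lnL_alt − lnL_null). The sketch below recomputes the statistic for the ORF1a (HBC) row and converts it to an unadjusted chi-square P-value. The degrees of freedom (2 for M7 vs M8, 1 for M8 vs M8a) are common choices for these comparisons and are assumptions here, since the record does not state them. The tabulated P-values may also include a multiple-testing adjustment, so the raw values need not match them. The function names `lrt` and `chi2_pvalue` are hypothetical.

```python
import math


def lrt(lnl_null, lnl_alt):
    """Likelihood ratio statistic: 2 * (lnL_alt - lnL_null)."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_pvalue(stat, df):
    """Upper-tail chi-square probability for df = 1 or 2 (closed forms).

    Slightly negative statistics, which can arise from optimisation
    noise, are treated as 0 and therefore give P = 1.
    """
    stat = max(stat, 0.0)
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("only df = 1 or 2 are handled in this sketch")


# ORF1a (HBC) row: lnL values copied from the table above.
stat_m7_m8 = lrt(lnl_null=-62127.61, lnl_alt=-62101.35)
p_m7_m8 = chi2_pvalue(stat_m7_m8, df=2)    # df = 2 is an assumption

stat_m8_m8a = lrt(lnl_null=-62109.86, lnl_alt=-62101.35)
p_m8_m8a = chi2_pvalue(stat_m8_m8a, df=1)  # df = 1 is an assumption
```

The recomputed statistics agree with the LRT columns of the ORF1a (HBC) row to the precision of the rounded lnL values. The negative entries elsewhere in the table correspond to cases where the null model fitted marginally better than the alternative.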
Fig. 3. ECDF for comparison among SARS-CoV-2, SARS-CoV, HCoV-HKU1, HCoV-OC43 and MERS-CoV (3′-UTR and 5′-UTR).
Fig. 4. I-TASSER model of the SARS-CoV-2 NSP3 (QHD43415_3). Positively selected residues with P < 0.05 are shown as transparent spheres and are marked by the corresponding labels.
Fig. 5. PDB crystal structure of NSP16 (6w75*). Positively selected residues with P < 0.05 are shown as transparent spheres and are marked by the corresponding labels. * This accession number contains the crystal structure of the NSP16-NSP10 complex; only NSP16 is shown in this figure, as it is the main focus.