H Y Lim Tung1, Pierre Limtung2. 1. Peptide and Protein Chemistry Research Laboratory, Nacbraht Biomedical Research Institute, 3164 21st Street Suite 122, Astoria (NYC), NY, 11106, USA. Electronic address: hyltung2010@nacbrahtbiomedresins.org. 2. Peptide and Protein Chemistry Research Laboratory, Nacbraht Biomedical Research Institute, 3164 21st Street Suite 122, Astoria (NYC), NY, 11106, USA.
Abstract
SARS-CoV-2 is the etiologic agent of COVID-19. There is currently no effective means of preventing infections by SARS-CoV-2, except through restriction of population movement and contact. An understanding of the origin, evolution and biochemistry (molecular biology) of SARS-CoV-2 is a prerequisite to its control. Mutations in the phosphorylation sites of SARS-CoV-2 encoded nucleocapsid protein isolated from various populations and locations, are described. Mutations occurred in the phosphorylation sites, all located within a stretch which forms a phosphorylation dependent interaction site, including C-TAK1 phosphorylation sites for 14-3-3. The consequences of these mutations are discussed and a structure-based model for the role of protein 14-3-3 in the sequestration and inhibition of SARS-CoV-2 nucleocapsid protein's function is presented. It is proposed that the phosphorylation of SARS-CoV-2 nucleocapsid protein and its sequestration by Protein 14-3-3 is a cellular response mechanism for the control and inhibition of the replication, transcription and packaging of the SARS-CoV-2 genome.
SARS-CoV-2 is the etiologic agent of COVID-19, which is now considered a pandemic [[1], [2], [3], [4]]. Currently, the only effective means of avoiding morbidity due to SARS-CoV-2 infection is population control and contact restriction, which appear to be effective in some parts of the world but not in others. However, the number of deaths is not uniform across populations and regions of the world [5,6]. Although medical infrastructure and response strategies and protocols play an important role in mitigating illness and death from SARS-CoV-2 infection, the possibility that variability in the number of deaths following infection is due to SARS-CoV-2 mutations cannot be ruled out.
Several reports have documented mutations in SARS-CoV-2 encoded proteins, in particular the Spike protein [7], which determines the infectivity of SARS-CoV-2 by binding to ACE2 [8,9]. Other reports have described mutations in other proteins encoded by SARS-CoV-2, including the nucleocapsid protein [[10], [11], [12], [13], [14]]. SARS-CoV-2 nucleocapsid protein has an essential role in the replication, transcription and assembly of the SARS-CoV-2 genome [15,16]. There is also interest in developing drugs and vaccines that target the SARS-CoV-2 nucleocapsid protein [17,18], which has also been shown to control the cell cycle of the host cell [19,20]. Cell cycle control is regulated in large part by protein phosphorylation and dephosphorylation mechanisms involving a large number of protein kinases and protein phosphatases [[21], [22], [23], [24], [25]]. Development of drugs and vaccines that target SARS-CoV-2 nucleocapsid protein must take into account the mutations in this protein that occur in strains/sub-strains isolated from various populations and locations, and their functional significance.
Here, we describe mutations of the phosphorylation sites of SARS-CoV-2 nucleocapsid protein from strains isolated from several populations and locations. We identify the binding of the phosphorylation rich region of SARS-CoV-2 nucleocapsid protein to protein 14-3-3, an important signaling molecule that controls the cell cycle, cell survival and cell death [[26], [27], [28], [29], [30]]. We present a structure-based model of phosphorylation dependent binding and sequestration of SARS-CoV-2 nucleocapsid protein by protein 14-3-3. The consequences of these mutations are also discussed.
Materials and methods
The sequences of SARS-CoV-2 nucleocapsid protein from SARS-CoV-2 strains/sub-strains isolated from various countries were curated from the National Center for Biotechnology Information (NCBI) [31] and analyzed for alignments and identities by using the blastp program [32].
Rendering of the structure of SARS-CoV-2 nucleocapsid phosphoprotein encompassing amino acids 123 to 310, which contains the phosphorylation rich stretch (amino acids 180–210), was performed by the QUARK2 Protein Analysis Program pursuant to Xu and Zhang [33,34, https://zhanglab.ccmb.med.umich.edu/QUARK2/]. The structures of SARS-CoV-2 nucleocapsid protein rendered in this work and of protein 14-3-3 (1YZ5), based on the structure determination of Benzinger et al. [35], were visualized and analyzed by the CCP4 Molecular Graphics Program Version 2.10.11 as described by McNicholas et al. [36] and the ZMM Molecular Modeling Program as described by Garden and Zhorov [37].
Phosphorylation of SARS-CoV-2 nucleocapsid protein on serines 197 and 206 was modeled using FOLD-X pursuant to Ref. [38]. SARS-CoV-2 nucleocapsid protein phosphorylated on serines 197 and 206 was then analyzed for binding with protein 14-3-3 by protein-protein docking experiments pursuant to Pierce et al. [39].
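The "identities" figure that an alignment program reports can be illustrated with a minimal sketch. The code below is not blastp and does no alignment of its own: it assumes the two sequences are already aligned and of equal length. The reference string is the 180–210 stretch quoted in this paper; the variant, the function name `percent_identity` and the gap convention are assumptions made only for illustration.

```python
# Minimal sketch: percent identity between two PRE-ALIGNED sequences.
# Assumption: sequences are already aligned (equal length, '-' marks a gap).
# This is an illustration only, not a re-implementation of blastp.

# Phosphorylation rich stretch (amino acids 180-210) as quoted in the paper.
REF_STRETCH = "SQASSRSSSRSRNSSRNSTPGSSRGTSPARM"


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return the percentage of aligned columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # skip columns that are gaps in both sequences
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical variant carrying a serine-to-leucine change at residue 197
# (index 17 of the stretch, because the stretch starts at residue 180).
variant = REF_STRETCH[:17] + "L" + REF_STRETCH[18:]

identity = percent_identity(REF_STRETCH, variant)
print(f"identity over residues 180-210: {identity:.2f}%")
```

A single substitution in a 31-residue window lowers identity by about three percentage points, so per-position comparison is more informative than the overall score for a short stretch like this one.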
Results
Fig. 1 shows the protein sequences of SARS-CoV-2 encoded nucleocapsid protein of SARS-CoV-2 strains/sub-strains isolated from various countries, including China (NC_045512), Iran (MT459928; MT447177), Spain (MT655134; MT292671), India (MT451879), Israel (MT276598), Russia (MT510643), Italy (MT527178; MT531537), Poland (MT511066), Bangladesh (MT509958), Greece (MT459920) and Czech Republic (MT517420). Using the sequence of the SARS-CoV-2 nucleocapsid protein of one strain isolated from China as reference, mutations were identified in the SARS-CoV-2 nucleocapsid protein of strains/sub-strains isolated from several countries, including Iran, Spain, India, Israel, Russia, Italy, Poland, Bangladesh and Czech Republic. SARS-CoV-2 nucleocapsid proteins of several SARS-CoV-2 strains/sub-strains have undergone mutations of serines 186, 197 and 202 to phenylalanine, leucine and asparagine respectively. Serine 186 is a phosphorylation site for CKI, which is involved in the control of a wide variety of cellular processes, including the control of the cell cycle [[40], [41], [42]]. Serine 197 is a phosphorylation site for Aurora A and B kinases, which are involved in the control of the cell cycle and proper chromosome segregation [[43], [44], [45]]. Next to serine 197 is threonine 198, which is a phosphorylation site for proline directed protein kinases, including Cdk1 and GSK-3 [[46], [47], [48], [49], [50], [51], [52], [53]]. Serine 202 is a phosphorylation site for GSK-3, which is involved in a wide variety of cellular processes, including the control of the cell cycle [[48], [49], [50], [51]]. In addition, arginine 203 and glycine 204 are frequently mutated to lysine and arginine respectively. Arginine 203 and glycine 204 (RG 203–204) lie sandwiched between two phosphorylation sites: upstream phospho-serine 202, which is phosphorylated by GSK-3, and downstream phospho-threonine 205, which is phosphorylated by PKA [52,53].
Of particular interest, further downstream of RG 203–204 lies serine 206 which is a phosphorylation site for C-TAK1 [54,55]. Serine 197 is also a phosphorylation site for C-TAK1 [54,55].
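The bookkeeping behind these observations can be sketched in a few lines: compare a variant stretch with the reference position by position, name each substitution, and look up which kinase sites named in the text lose their serine or threonine. This is only a sketch under stated assumptions. The reference string is the 180–210 stretch quoted in this paper and the kinase table restates the assignments made in the text; the function names and the particular combination of substitutions in the example variant are hypothetical.

```python
# Minimal sketch: name substitutions in the 180-210 stretch and flag the
# phosphorylation sites (as assigned in the text) that lose their Ser/Thr.
# Function names and the example variant are hypothetical illustrations.

REF_STRETCH = "SQASSRSSSRSRNSSRNSTPGSSRGTSPARM"  # residues 180-210
STRETCH_START = 180

# Site -> kinases, restating the assignments made in the text.
SITE_KINASES = {
    186: ["CKI"],
    197: ["Aurora A/B", "C-TAK1"],
    198: ["Cdk1", "GSK-3"],
    202: ["GSK-3"],
    205: ["PKA"],
    206: ["C-TAK1"],
}


def apply_substitutions(seq, substitutions, start=STRETCH_START):
    """Apply substitutions written like 'S197L' and return the new sequence."""
    residues = list(seq)
    for sub in substitutions:
        old, pos, new = sub[0], int(sub[1:-1]), sub[-1]
        idx = pos - start
        if residues[idx] != old:
            raise ValueError(f"{sub}: reference has {residues[idx]} at {pos}")
        residues[idx] = new
    return "".join(residues)


def list_substitutions(ref, var, start=STRETCH_START):
    """Return substitutions such as 'S197L' for two equal-length sequences."""
    if len(ref) != len(var):
        raise ValueError("sequences must have the same length")
    return [f"{r}{start + i}{v}" for i, (r, v) in enumerate(zip(ref, var)) if r != v]


def affected_sites(substitutions):
    """Map each lost Ser/Thr phosphorylation site to the kinases named in the text."""
    hits = {}
    for sub in substitutions:
        pos, new = int(sub[1:-1]), sub[-1]
        if pos in SITE_KINASES and new not in "ST":
            hits[pos] = SITE_KINASES[pos]
    return hits


# Hypothetical variant combining three substitutions described in the text.
variant = apply_substitutions(REF_STRETCH, ["S197L", "R203K", "G204R"])
observed = list_substitutions(REF_STRETCH, variant)
print(observed)
print(affected_sites(observed))
```

In this example only the change at residue 197 removes a phosphorylatable residue. The changes at 203 and 204 alter the residues flanking serine 202, threonine 205 and serine 206, which a direct site lookup like this one does not capture.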
Fig. 1
Protein sequence alignment of SARS-CoV-2 encoded nucleocapsid protein of SARS-CoV-2 isolated from different countries, including China, Iran, Spain, India, Israel, Russia, Italy, Poland, Bangladesh, Greece and Czech Republic (between amino acids 180–210). Amino acids in bold letters are mutations observed in this work.
Phosphorylation sites 186, 197, 202, 205 and 206 are located in a stretch that is sandwiched between the RNA binding domain and the dimerization domain of SARS-CoV-2 nucleocapsid protein [56]. Together with the nearby phospho-serine 188 and phospho-threonine 198, and in particular phospho-serines 197 and 206 [54,55], the phosphorylation rich stretch forms a phosphorylation dependent binding domain for protein 14-3-3, which is an important signaling molecule involved in the control of a wide variety of cellular processes, including the control of the cell cycle, cell survival and cell death [[26], [27], [28], [29], [30], [57], [58], [59], [60], [61], [62], [63]]. No structure has been described for the full-length SARS-CoV-2 nucleocapsid protein. Only the structures of the N terminal and C terminal portions, which lack the phosphorylation rich region, have been described [[64], [65], [66], [67], [68], [69]]. A model structure encompassing amino acid residues 123 to 310 was therefore rendered pursuant to Xu and Zhang's method of protein structure prediction [33,34] (Fig. 2). The phosphorylation rich domain of SARS-CoV-2 nucleocapsid protein is localized at the surface of the protein (Fig. 2).
Fig. 2
A rendering of the structure of SARS-CoV-2 nucleocapsid protein (amino acids 123–309). The structure of SARS-CoV-2 nucleocapsid protein (amino acids 123–309) depicted here was rendered pursuant to Xu and Zhang [33,34]. The phosphorylation rich stretch encompassing amino acids 180 to 210 is shown as spheres in blue and is localized at the surface of SARS-CoV-2 nucleocapsid protein. The sequence of the phosphorylation rich domain of SARS-CoV-2 nucleocapsid protein is: SQASSRSSSRSRNSSRNSTPGSSRGTSPARM.
To show that SARS-CoV-2 nucleocapsid protein can bind protein 14-3-3 in a phosphorylation dependent manner, protein-protein docking experiments were performed between protein 14-3-3, as the receptor protein, and SARS-CoV-2 nucleocapsid protein phosphorylated on serines 197 and 206, which together with their surrounding amino acid motifs form recognition sites for protein 14-3-3. Fig. 3 shows that the phosphorylation rich stretch encompassing amino acid residues 180 to 210 forms a binding site for protein 14-3-3, as it fits into the protein binding groove of protein 14-3-3 [70,71]. The binding free energy of the complex of protein 14-3-3 with SARS-CoV-2 nucleocapsid protein phosphorylated on serines 197 and 206 was estimated to be −21.72 kcal/mol. Protein-protein docking experiments between protein 14-3-3 and unphosphorylated SARS-CoV-2 nucleocapsid protein show only unstable complexes with high binding free energies, low binding affinities and high dissociation constants (data not shown). The molecular mass of the protein 14-3-3 dimer is ∼56 kDa while the molecular mass of SARS-CoV-2 nucleocapsid protein is ∼46 kDa, indicating that only the monomeric form of the latter can bind 1 mol of dimeric protein 14-3-3 (Fig. 3, Fig. 4). Interestingly, Fig. 3 and Fig. 4 show that only monomeric SARS-CoV-2 nucleocapsid protein can form a complex with protein 14-3-3, indicating that protein 14-3-3 acts to prevent the dimerization of SARS-CoV-2 nucleocapsid protein.
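To put the estimated binding free energy in more familiar terms, it can be converted to an apparent dissociation constant with the standard relation ΔG = RT ln(Kd), taking a 1 M standard state. The sketch below assumes a temperature of 298.15 K, which is not stated in the paper, and the function name is hypothetical. Because the free energy comes from a docking estimate rather than a measurement, the resulting Kd should be read as an order-of-magnitude illustration of that estimate, not as an experimental affinity.

```python
import math

# Minimal sketch: convert a binding free energy to an apparent dissociation
# constant via delta_G = R * T * ln(Kd), with a 1 M standard state.
# Assumption: T = 298.15 K (the paper does not state a temperature).

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_from_delta_g(delta_g_kcal_per_mol: float, temperature_k: float = 298.15) -> float:
    """Return the apparent Kd (in mol/L) for a binding free energy in kcal/mol."""
    return math.exp(delta_g_kcal_per_mol / (R_KCAL_PER_MOL_K * temperature_k))


# Docking estimate reported in the text for the phosphorylated complex.
kd_phospho = kd_from_delta_g(-21.72)
print(f"apparent Kd for delta_G = -21.72 kcal/mol: {kd_phospho:.1e} M")
```

The same relation shows why a less negative (higher) binding free energy corresponds to a higher dissociation constant and a weaker complex, which is the pattern reported for the unphosphorylated protein.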
Fig. 3
Protein-protein docking of protein 14-3-3 (as the receptor) and SARS-CoV-2 nucleocapsid protein. The phosphorylation rich domain encompassing amino acids 180 to 210 is shown as spheres in blue. The sequence of the phosphorylation rich domain of SARS-CoV-2 nucleocapsid protein is: SQASSRSSSRSRNSSRNSTPGSSRGTSPARM. The phosphorylation rich domain of SARS-CoV-2 nucleocapsid protein is in contact with, and fits into, the binding groove of the protein 14-3-3 dimer. Only monomeric SARS-CoV-2 nucleocapsid protein can bind the protein 14-3-3 dimer.
Fig. 4
Realistic representations of the structures of SARS-CoV-2 nucleocapsid protein (amino acids 123–309) (blue and red, on the right) and protein 14-3-3 (red, grey and blue, on the left). The phosphorylation rich stretch encompassing amino acids 180 to 210 fits into the binding groove of the protein 14-3-3 dimer. Only monomeric SARS-CoV-2 nucleocapsid protein can bind the protein 14-3-3 dimer.
Discussion
The sequences of several SARS-CoV-2 strains have been determined [[1], [2], [3], [4],31]. Lu et al. [2] reported that they sequenced SARS-CoV-2 genomes from nine individuals from the same location and found that they had 99.98% sequence identity. However, the sequences of SARS-CoV-2 isolated from different populations and environments must be analyzed to determine population- and environment-specific differences in infectivity and virulence. Analyses of mutations in SARS-CoV-2 encoded proteins from strains/sub-strains isolated from various populations and regions have been reported [[10], [11], [12], [13], [14]]. It is anticipated that the consequences of the identified mutations will be studied, as they may explain how SARS-CoV-2 encoded proteins interact with cellular proteins to maximize the viability and replication of the virus. A number of reports have documented the role of phosphorylation in the control of the activity of SARS-CoV and SARS-CoV-2 nucleocapsid protein in the cell. However, the functions of the phosphorylation reactions have not been delineated [19, 20, [72], [73], [74]].
The present work shows that SARS-CoV-2 encoded nucleocapsid protein has undergone certain mutations in strains/sub-strains of SARS-CoV-2 isolated from various populations and locations.
The mutations of serine 186, serine 197 and serine 202 of the SARS-CoV-2 nucleocapsid protein to phenylalanine, leucine and asparagine respectively are of major consequence because (i) serine 186 is a phosphorylation site for CKI, which is involved in the control of a wide variety of cellular processes, including the control of the cell cycle [[40], [41], [42]], (ii) serine 197 is a phosphorylation site for Aurora A and B kinases, which are involved in the control of the cell cycle and proper chromosome segregation, and for C-TAK1, which is also involved in the control of the cell cycle [[43], [44], [45],54,55], and (iii) serine 202 is a phosphorylation site for GSK-3, which has been shown to be involved in a wide variety of cellular processes, including the control of the cell cycle [[47], [48], [49], [50], [51], [52], [53]]. The mutation of serine 197 would also have major ramifications for the adjacent threonine 198, which is a phosphorylation site for proline-directed protein kinases, including Cdk1 and GSK-3 [[46], [47], [48], [49], [50], [51], [52], [53]]. The loss of phospho-serine 197 would prevent the regulation of phosphorylation of threonine 198 by GSK-3 [Tung, H.Y.L., Manuscript in preparation]. The mutations of arginine 203 and glycine 204 (RG 203–204) of the SARS-CoV-2 nucleocapsid protein will also affect the regulation of the phosphorylation of serines 202 and 206, which are phosphorylation sites for GSK-3 and C-TAK1 respectively [[46], [47], [48], [49], [50], [51], [52], [53], [54], [55]]. It is significant that the mutations take place within a stretch that is phosphorylation rich and forms a phosphorylation dependent binding site for protein 14-3-3 [54,55], an important signaling molecule that is involved in the control of the cell cycle, cell survival and cell death [[26], [27], [28], [29], [30]]. Of particular interest are phospho-serines 197 and 206, which are C-TAK1 phosphorylation sites recognized by protein 14-3-3.
Mutations of the phosphorylation sites described in this paper would result in the inability of protein 14-3-3 to bind to the SARS-CoV-2 nucleocapsid protein. The consequence would be the unfettered action of the nucleocapsid protein, including its ability to dimerize, bind to the SARS-CoV-2 genome, and control and up-regulate the replication, transcription and packaging of the SARS-CoV-2 genome [15,16]. Surjit et al. [74] have described the role of protein 14-3-3 in the translocation of SARS-CoV-2 nucleocapsid protein from the nucleus to the cytoplasm. In this work, structural analysis of SARS-CoV-2 nucleocapsid protein and protein 14-3-3 shows that only monomeric SARS-CoV-2 nucleocapsid protein can form a complex with protein 14-3-3. Here, we propose that in response to SARS-CoV-2 infection, cellular protein 14-3-3 acts to prevent the dimerization of SARS-CoV-2 nucleocapsid protein and thereby the replication, transcription and packaging of the SARS-CoV-2 genome. Protein 14-3-3 achieves this by binding to SARS-CoV-2 nucleocapsid protein and sequestering it in a phosphorylation dependent manner, following phosphorylation of SARS-CoV-2 nucleocapsid protein by several protein kinases that control the cell cycle, including C-TAK1 and possibly CKI, Cdk1, Cdk5 and GSK-3 (Manuscript in preparation). It is also proposed that the phosphorylation of SARS-CoV-2 nucleocapsid protein and its sequestration by protein 14-3-3 is a cellular response mechanism for the control and inhibition of the replication, transcription and packaging of the SARS-CoV-2 genome.