Weihong Zeng1, Guangfeng Liu2, Huan Ma3, Dan Zhao3, Yunru Yang3, Muziying Liu3, Ahmed Mohammed3, Changcheng Zhao4, Yun Yang4, Jiajia Xie5, Chengchao Ding4, Xiaoling Ma6, Jianping Weng7, Yong Gao4, Hongliang He8, Tengchuan Jin9. 1. Department of Obstetrics and Gynecology, The First Affiliated Hospital of USTC, Division of Molecular Medicine, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, 230001, China; Hefei National Laboratory for Physical Sciences at Microscale, Laboratory of Structural Immunology, CAS Key Laboratory of Innate Immunity and Chronic Disease, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, 230027, China. 2. National Center for Protein Science Shanghai, Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai, 201210, China. 3. Hefei National Laboratory for Physical Sciences at Microscale, Laboratory of Structural Immunology, CAS Key Laboratory of Innate Immunity and Chronic Disease, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, 230027, China. 4. Department of Infectious Diseases, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, 230001, China. 5. Department of Dermatology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, 230001, China. 6. Department of Laboratory Medicine, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, 230001, China. 7. Institute of Public Health, University of Science and Technology of China, Hefei, Anhui, 230026, China. 8. 
Department of Infectious Diseases, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, 230001, China. Electronic address: hhl725@ustc.edu.cn. 9. Department of Obstetrics and Gynecology, The First Affiliated Hospital of USTC, Division of Molecular Medicine, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, 230001, China; Hefei National Laboratory for Physical Sciences at Microscale, Laboratory of Structural Immunology, CAS Key Laboratory of Innate Immunity and Chronic Disease, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, 230027, China; CAS Center for Excellence in Molecular Cell Science, Chinese Academy of Science, Shanghai, 200031, China. Electronic address: jint@ustc.edu.cn.
Abstract
The nucleocapsid (N) protein is an important coronavirus antigen that participates in RNA packaging and virus particle release. In this study, we expressed the N protein of SARS-CoV-2 and characterized its biochemical properties. Static light scattering, size-exclusion chromatography, and small-angle X-ray scattering (SAXS) showed that the purified N protein is largely a dimer in solution. CD spectra showed that it has a high percentage of disordered regions at room temperature and is most structured at 55 °C, suggesting structural dynamics. A fluorescence polarization assay showed that it binds nucleic acid non-specifically, which raises a concern about using it as a diagnostic marker. Immunoblot assays confirmed the presence of IgA, IgM and IgG antibodies against the N antigen in the sera of COVID-19 patients, supporting the importance of this antigen in host immunity and diagnostics.
In December 2019, a new coronavirus (SARS-CoV-2 or 2019-nCoV), which causes a novel pneumonia now named COVID-19, broke out in Wuhan, China. The virus is spreading rapidly across the world and has had a great impact on health and the economy [1,2]. As of April 16, 2020, there were 83,797 confirmed cases of COVID-19 in China and over 1,954,724 cases globally in over 200 countries [3,4]. Studies on the virus are urgently needed in this severe situation.
The SARS-CoV-2 genome is composed of approximately 30,000 nucleotides and encodes four structural proteins: the spike (S), envelope (E), membrane (M), and nucleocapsid (N) proteins [5]. Among them, the N protein is highly immunogenic and abundantly expressed during infection [2,6]. Furthermore, the N protein is frequently used in vaccine development and serological assays [7]. At present, few reports focus on the SARS-CoV-2 N protein, and an updated understanding of it is urgently needed.
After infection, the N protein enters the host cell together with the viral RNA to facilitate its replication and to drive virus particle assembly and release [8]. The SARS-CoV N protein contains two distinct RNA-binding domains (the N-terminal domain [NTD] and the C-terminal domain [CTD]) linked by a poorly structured linkage region (LKR) containing a serine/arginine-rich (SR-rich) domain (SRD) [9,10]. Owing to their positively charged amino acids, the SARS-CoV N-NTD and N-CTD have been reported to bind the viral RNA genome [11,12]. The LKR is able to promote oligomerization [13,14]. However, the molecular properties of the SARS-CoV-2 N protein remain to be elucidated.
Serological studies found that specific antibodies against the N protein in the serum of SARS patients have higher sensitivity and longer persistence than those against other structural proteins of SARS-CoV [15,16]. Moreover, anti-N antibodies have been detected with high specificity in the early stage of infection [17]. Thus, any information generated from the analysis of this protein, whether in vivo or in vitro, will improve our understanding of COVID-19 and help us design better biological agents for the treatment or diagnosis of the disease.
In the present work, we found that the SARS-CoV-2 N protein is a dimer in solution, formed through a CTD-CTD interaction. Additionally, the N protein can bind non-specific dsDNA, probably through electrostatic interactions. Furthermore, we analyzed the immunogenicity of the N protein by detecting antibodies specific for it. Our work reveals new information on the mechanism and characteristics of the N protein, which may inform the development of N protein-based vaccines or diagnostic kits.
Results
SARS-CoV-2 N protein profile
To gain insight into the structure-function relationships of the SARS-CoV-2 N protein, we purified the full-length protein of 419 amino acids (Fig. 1A). It is predicted to have two well-folded domains: both the NTD and the CTD are rich in β-strands, while the CTD also has some short helices (Fig. 1B).
Fig. 1
Structural organization of SARS-CoV-2 N protein and sequence alignment.
(A) Domain structure of SARS-CoV-2 N protein. The domain boundaries were shown on the top and the different domains were labeled in different colors. (B) The predicted structure of SARS-CoV-2 N protein was presented. The NTD and CTD were highlighted in red and blue, respectively.
Sequence analysis showed that it has 90.52% identity to the SARS-CoV N protein, with the most conserved regions in the two core domains and the linker (Fig. S1A). Molecular evolutionary analysis of the N proteins showed that SARS-CoV-2 belongs to the lineage B betacoronaviruses and lies on the same branch as SARS-CoV and two bat coronaviruses (Fig. S1B). They are well separated from other coronaviruses, which is generally in agreement with the evolutionary tree of these coronaviruses [18].
The solution oligomerization state of SARS-CoV-2 N protein
To assess the oligomerization state of the N protein in solution, we used static light scattering (SLS) to determine its molecular weight. The SLS data gave a calculated molecular weight of 114 ± 0.7 kDa, indicating that the SARS-CoV-2 N protein forms dimers (Fig. 2A). Furthermore, DSS cross-linking verified that the N protein, with a theoretical molecular weight of 49.5 kDa including an extra 20 residues at the N-terminus, could form dimers (Fig. 2B). A small portion of higher-order oligomers was also observed by cross-linking.
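The dimer assignment rests on comparing the SLS-derived mass with the theoretical monomer mass. The short sketch below is an illustration of that arithmetic only, not the Astra analysis used in the study; the function name `estimate_oligomeric_state` is hypothetical, and the two input values are the ones quoted in the text.

```python
from typing import Tuple

# Values quoted in the text: SLS-derived mass and theoretical monomer mass
# (the monomer mass includes the extra 20 N-terminal residues).
MEASURED_MASS_KDA = 114.0
MONOMER_MASS_KDA = 49.5


def estimate_oligomeric_state(measured_kda: float, monomer_kda: float) -> Tuple[float, int]:
    """Return the raw mass ratio and the nearest whole number of subunits.

    Hypothetical helper for illustration; it simply divides the measured
    mass by the monomer mass and rounds to the nearest integer.
    """
    if monomer_kda <= 0:
        raise ValueError("monomer mass must be positive")
    ratio = measured_kda / monomer_kda
    return ratio, max(1, round(ratio))


ratio, n_subunits = estimate_oligomeric_state(MEASURED_MASS_KDA, MONOMER_MASS_KDA)
print(f"mass ratio = {ratio:.2f}, nearest oligomeric state = {n_subunits}")
```

The ratio is somewhat above 2, and the nearest integer state is the dimer, which agrees with the cross-linking result.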
Fig. 2
Oligomerization state and conformation analysis of the N protein.
(A) Static light scattering analysis of the oligomerization of the N protein. The molecular weight was calculated by Astra software and is shown in red.
(B) DSS cross-linking analysis of the oligomerization forms of the N-protein (1). The protein used for positive control was mCARD9-CARD with an MBP tag (52 kDa) which was reported to form dimers in solution (2) [33]. The MBP was used as a negative control (42 kDa) (3). (C) SAXS results for the protein. Scattering profile (points) and fitting with GNOM (solid lines). I, scattering intensity; q, scattering vector. Inset: the Guinier region of the scattering profile with the fitted line.
(D) Dimensionless Kratky plot showed that the protein was partially extended in solution. (E) A representative CORAL model in which the NTDs are shown in yellow and purple, respectively, and the CTD dimer is shown in green. The coiled coil regions are represented as dots. (F) Results from GNOM showing the pairwise distance distribution [P(r)] and the maximum distance. The radius of gyration is fitted to 59 Å, and r represents the pairwise distances.
The flexible linker is partially extended in solution
We further studied the conformation of the full-length protein by SAXS to obtain information on its shape. As shown in Fig. 2C, the radius of gyration of the molecule was 59 Å, much larger than that expected for a 99 kDa globular protein (Fig. 2F), and the Kratky plot showed that the protein was partially extended in solution (Fig. 2D). This is consistent with a model in which the NTD and CTD do not interact and the two NTDs in the dimer are likely to move freely in solution. A representative structure of NP45-365 based on CORAL simulations is shown in Fig. 2E. Due to the flexible nature of the linker region, this structure represents only a model of the conformational ensemble and does not represent a structure per se. However, the model captures features of the conformational ensemble and allows for the qualitative analysis of gross structural features. The most prominent feature of the model is that the flexible linker does not adopt a fully extended conformation, suggesting the existence of residual structures within the linker.
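The statement that 59 Å is much larger than expected for a 99 kDa globular protein can be checked with a rough estimate. The sketch below is not part of the original analysis. It assumes the most compact case, a solid sphere of uniform density, and a typical protein partial specific volume of 0.73 cm³/g; both are assumptions, and the function name `compact_sphere_rg` is hypothetical. For a solid sphere of radius R, the radius of gyration is sqrt(3/5)·R.

```python
import math

AVOGADRO = 6.02214076e23        # particles per mole
PARTIAL_SPECIFIC_VOLUME = 0.73  # cm^3/g, typical protein value (assumption)
CM3_TO_A3 = 1e24                # cubic centimetres to cubic angstroms


def compact_sphere_rg(mass_kda: float, vbar: float = PARTIAL_SPECIFIC_VOLUME) -> float:
    """Radius of gyration (in angstroms) of a solid sphere with the given mass.

    Hypothetical helper: treats the protein as a uniform sphere, which gives
    a lower bound on Rg for a folded particle of that mass.
    """
    mass_g_per_mol = mass_kda * 1000.0
    volume_a3 = mass_g_per_mol * vbar / AVOGADRO * CM3_TO_A3
    radius = (3.0 * volume_a3 / (4.0 * math.pi)) ** (1.0 / 3.0)
    return math.sqrt(3.0 / 5.0) * radius


MEASURED_RG_A = 59.0  # value reported from the SAXS data in the text
rg_sphere = compact_sphere_rg(99.0)
print(f"compact-sphere Rg = {rg_sphere:.1f} A, measured/compact = {MEASURED_RG_A / rg_sphere:.1f}")
```

Under these assumptions the measured radius of gyration is more than twice the compact-sphere value, which is in line with a partially extended dimer with flexible regions.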
Circular dichroism (CD) spectroscopic analysis
To characterize the conformational properties of the N protein, we used CD spectroscopy to analyze its secondary structure. The spectra shown in Fig. 3A demonstrated that the N protein is mainly composed of coils, which is consistent with the structural model in Fig. 1B and the SAXS results (Fig. 2E). Interestingly, the secondary structure content increased with temperature and then started to decrease above 55 °C.
Fig. 3
Conformational and functional analysis of the N protein
(A) CD spectrum analysis of the N protein (right) and thermal denaturation of the N protein monitored at Θ222 nm (left).
(B) Fluorescence polarization analysis of the N protein. The concentration of 5′-FAM double-stranded 14mer DNA was 20 nM, and the apparent Kd value was 191 ± 0.036 nM. (C) The electrostatic surface of the N protein generated by PyMOL, where the negatively charged regions are represented in red, neutral regions in white, and positively charged regions in blue.
The N protein binds non-specific nucleic acid with high affinity
To characterize the nucleic acid binding ability of the SARS-CoV-2 N protein, we used fluorescence polarization to assess the binding affinity of the protein for a non-specific nucleic acid (a double-stranded 14mer DNA probe with a fluorescent label). As shown in Fig. 3B, the N protein binds the dsDNA with an apparent Kd value of 191 ± 0.036 nM. Additionally, the electrostatic surface potential map generated with PyMOL (Fig. 3C) confirms that the SARS-CoV-2 N protein is a highly basic protein. The surfaces of both the NTD and the CTD display highly positively charged regions, which may facilitate binding to nucleic acids.
The N protein is an important viral antigen for SARS-CoV-2
To evaluate the N protein as a possible diagnostic marker for COVID-19, we used Western blotting and dot blotting to identify antibodies that specifically bind the N antigen. Western blot (Fig. 4A) and dot blot (Fig. 4B) analyses detected IgG, IgA and IgM antibodies against the N protein in a pool of sera from confirmed COVID-19 patients at different dilutions. This result further confirmed that the N protein is a potent antigen for host immunity and for disease diagnosis.
Fig. 4
Antigenicity of the N antigen
Western Blot (A) and Dot Blot (B) analysis of specific IgA, IgM, and IgG antibodies against the N protein after incubation with different dilutions of a serum pool from recovering COVID-19 patients, using anti-human IgA-Fc, IgM-μ chain, and IgG-Fc secondary antibodies, respectively.
Discussion
The nucleocapsid protein is an important structural protein of coronaviruses and is highly abundant in the virus. Its functions include entering the host cell, binding to the viral RNA genome, and forming the ribonucleoprotein core. The SARS-CoV-2 N protein shares high homology with the SARS-CoV N protein, with a sequence identity of 90.52%.
Our structural characterization of the recombinant full-length N protein showed that it has a high content of disordered regions in the absence of bound nucleic acid (Fig. 1B/3A). Notably, the linker of the SARS-CoV N protein is also highly disordered, as reported before [19,20]. This disordered region may help the protein bind transiently to different partners and maintain the correct conformation of the N protein [13,21,22].
According to SAXS modeling, the NTD seems to move freely in solution and the flexible linker is partially extended (Fig. 2E), while the CTD forms a dimer similar to other N proteins [23]. Static light scattering and DSS cross-linking strongly corroborated these results (Fig. 2A/B). Furthermore, the N protein of SARS-CoV-2 is highly positively charged (Fig. 3C), which may facilitate its binding to non-specific nucleic acid (Fig. 3B). These results further support the view that the functional conformation of the N protein confers the ability to bind nucleic acids.
Guo et al. confirmed by WB and ELISA that IgG in the sera of SARS-CoV and SARS-CoV-2 infected patients can bind the N antigen [24]. Another report showed that antibodies against the SARS-CoV-2 N protein and RBD protein began to rise on the 10th day after the onset of COVID-19 symptoms [25]. Importantly, using COVID-19 patients’ sera, we found IgG, IgA, and IgM antibodies against the N antigen in recovering patients (Fig. 4).
Overall, our study improves the understanding of the SARS-CoV-2 nucleocapsid protein and provides a basis for the future development of vaccines and diagnostic kits.
Experimental procedures
Patient serum samples
Serum samples were collected from recovering COVID-19 patients admitted to the First Affiliated Hospital of USTC between Jan 30 and Feb 23, 2020. All patients were confirmed to be infected with SARS-CoV-2 by real-time RT-PCR (rRT-PCR) on throat swab samples from the respiratory tract. Serum was prepared as described previously [26].
Molecular cloning, protein expression and purification
The coding sequence of the core N protein factor homology region (A1-A419) (NCBI accession code: ADI66791.1) was ligated into pET28a with a 6×His tag at the N-terminus. The recombinant plasmid was transformed into BL21 (DE3) bacteria for protein overexpression. The lysis supernatant was purified on a 5 ml Hisprep™ IMAC column (GE Healthcare), and ammonium sulfate was added to the eluted protein to a final concentration of 0.5 M. The protein was further purified on a 24 ml Superdex-200 gel filtration column. The UV–vis spectrum of the purified protein was acquired using a spectrophotometer (Jena).
Circular dichroism spectroscopic study
Circular dichroism (CD) spectra were acquired on a Chirascan Spectrometer (Applied Photophysics, Leatherhead, UK). The protein was buffer-exchanged into PBS. The procedure was as reported before [27].
Small-angle X-ray scattering (SAXS) and low-resolution model building
Purified full-length N protein was concentrated to 5 mg/ml using Amicon centrifugal concentrators (Millipore). To exclude concentration dependence, two concentrations of purified N protein, 1 mg/ml and 5 mg/ml (corresponding to 21.7 μM and 101 μM, respectively), were prepared and measured. The procedure followed a previous report [28]. The parameters are shown in Table S1.
Rigid body modeling using SAXS data
Modeling of the N protein was performed using three rigid bodies. The model of residues 46–171 was built with SWISS-MODEL [29] from the 1.7 Å resolution crystal structure of the RNA binding domain of nucleocapsid phosphoprotein from SARS coronavirus 2 (PDB: 6M3M.1.A). The model of the CTD dimer was built using the structure of SARS Coronavirus Nucleocapsid Protein (PDB: 2CJR.1.A) as a template.
SAXS-based rigid body modeling of complexes was performed with CORAL (COmplexes with RAndom Loops) [30]. CORAL fixed the CTD dimer and translated and rotated the atomic models of the NTD domains. The NTER and CTER loops were randomly generated from a library of self-avoiding random loops. A simulated annealing protocol was employed to find the optimal positions and orientations of the available high-resolution models of domains and the approximate conformations of the missing portions of the polypeptide chain(s).
Cross-linking
Disuccinimidyl suberate (DSS, Pierce) was used to cross-link closely spaced surface-exposed active amino groups of interacting proteins. The experimental protocol followed a previous report [31].
Immuno-blotting
For Western Blot, 0.5 μg of the N protein per well was analyzed by SDS-PAGE and transferred to a PVDF membrane (Millipore); for Dot Blot, 0.1 μg of the N protein per drop was spotted on a nitrocellulose membrane (Pall). The protein-coupled membranes were blocked with defatted milk at room temperature for 1 h and then incubated with different dilutions of virus-free sera of COVID-19 patients overnight at 4 °C. On the next day, the membranes were washed with PBST (0.1% v/v Tween 20). After that, the membranes were incubated with a secondary antibody, anti-IgA (Boster biological technology), anti-IgM-μ (Boster biological technology), or anti-IgG-Fc (Sino biological), for 1 h. Finally, the membranes were washed with PBST and detected with an ECL kit (abpbiotech) using a chemiluminescence apparatus (Bio-Rad).
Fluorescence polarization (FP)
The purified N protein and 5′-FAM fluorescently labeled dsDNA (5′-FAM-TCG TCG TTT TGT CG) were mixed at final concentrations of 6.25 μM and 20 nM, respectively, in PBS with 15 mM MgCl2. The mixture was then serially diluted with PBS containing 15 mM MgCl2 and 20 nM 5′-FAM dsDNA to obtain different protein concentrations. The procedure was as reported before [32].
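The titration described above can be sketched as a dilution series combined with a simple one-site binding model. This is an illustration under stated assumptions, not the fitting procedure of reference [32]: the two-fold dilution factor and the number of points are assumptions (the text gives neither), and the model treats free protein as equal to total protein, which is only approximate when the protein concentration approaches the 20 nM probe concentration. The names `dilution_series` and `fraction_bound` are hypothetical.

```python
from typing import List

TOP_PROTEIN_NM = 6250.0  # 6.25 uM starting protein concentration (from the text)
KD_NM = 191.0            # apparent Kd as quoted in the text
DILUTION_FACTOR = 2.0    # assumption: the text does not state the dilution factor
N_POINTS = 12            # assumption: number of titration points


def dilution_series(top_nm: float, factor: float, n_points: int) -> List[float]:
    """Protein concentrations (nM) for a serial dilution starting at top_nm."""
    return [top_nm / factor ** i for i in range(n_points)]


def fraction_bound(protein_nm: float, kd_nm: float) -> float:
    """One-site binding: fraction of probe bound, assuming free ~ total protein."""
    return protein_nm / (kd_nm + protein_nm)


# Print the expected fraction of labeled dsDNA bound at each protein concentration.
for conc in dilution_series(TOP_PROTEIN_NM, DILUTION_FACTOR, N_POINTS):
    print(f"{conc:10.2f} nM protein -> fraction bound {fraction_bound(conc, KD_NM):.3f}")
```

Because the probe is held at 20 nM in the diluent, only the protein concentration changes across the series. In this simple model half of the probe is bound when the protein concentration equals the apparent Kd.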
Author contributions
T.J. and H.H. provided the funding, designed the study, participated in data analysis and extensively reviewed the manuscript. W.Z. and H.M. designed the study, performed the experiments, analyzed the data and drafted the manuscript. Other authors participated in the experiments and reviewed the manuscript.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.