Jingqiang Wang, Jia Ji, Jia Ye, Xiaoqian Zhao, Jie Wen, Wei Li, Jianfei Hu, Dawei Li, Min Sun, Haipan Zeng, Yongwu Hu, Xiangjun Tian, Xuehai Tan, Ningzhi Xu, Changqing Zeng, Jian Wang, Shengli Bi, Huanming Yang.
Abstract
The Coronaviridae family is characterized by a nucleocapsid composed of the genomic RNA molecule in combination with the nucleoprotein (N protein) within the virion. The most striking physicochemical feature of the N protein of SARS-CoV is that it is a typical basic protein with a high predicted pI and high hydrophilicity, which is consistent with its function of binding to the ribophosphate backbone of the RNA molecule. The predicted high degree of phosphorylation of the N protein at multiple candidate sites suggests that phosphorylation is related to important functions, such as RNA binding and localization to the nucleolus of host cells. Further study shows that there is an SR-rich region in the N protein and that this region might be involved in protein-protein interaction. The abundant antigenic sites predicted in the N protein, as well as experimental evidence from synthesized polypeptides, indicate that the N protein is one of the major antigens of SARS-CoV. Compared with other viral structural proteins, the low variation rate of the N protein with regard to its size suggests its importance to the survival of the virus.
Year: 2003 PMID: 15626344 PMCID: PMC5172421 DOI: 10.1016/s1672-0229(03)01018-0
Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN: 1672-0229 Impact factor: 7.691
Fig. 1 The predicted distributions of GC content (A), electric charge (B), hydrophobicity (C) and secondary structure (D) in the N protein of SARS-CoV.
The Amino Acid Composition of the SARS-CoV N Protein
| Amino acid | Number | Percentage (%) |
|---|---|---|
| Ala, A | 34 | 8.06 |
| Phe, F | 13 | 3.08 |
| Gly, G | 45 | 10.66 |
| Ile, I | 11 | 2.61 |
| Leu, L | 26 | 6.16 |
| Met, M | 7 | 1.66 |
| Pro, P | 31 | 7.35 |
| Val, V | 11 | 2.61 |
| Trp, W | 5 | 1.18 |
| Subtotal | 183 | 43.36 |
| Cys, C | 0 | 0.00 |
| Asn, N | 25 | 5.92 |
| Gln, Q | 34 | 8.06 |
| Ser, S | 35 | 8.29 |
| Thr, T | 33 | 7.82 |
| Tyr, Y | 11 | 2.61 |
| Subtotal | 138 | 32.70 |
| His, H | 5 | 1.18 |
| Lys, K | 29 | 6.87 |
| Arg, R | 31 | 7.35 |
| Subtotal | 65 | 15.40 |
| Asp, D | 22 | 5.21 |
| Glu, E | 14 | 3.32 |
| Subtotal | 36 | 8.53 |
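
The four unlabeled totals in the table can be rechecked from the per-residue counts. The following is a minimal sketch, not the authors' procedure: the grouping simply follows the order of the table rows, and the group names (`nonpolar`, `polar`, `basic`, `acidic`) and the helper `group_summary` are labels assumed here for illustration.

```python
# Per-residue counts copied from the composition table above.
COUNTS = {
    "A": 34, "F": 13, "G": 45, "I": 11, "L": 26, "M": 7, "P": 31, "V": 11, "W": 5,
    "C": 0, "N": 25, "Q": 34, "S": 35, "T": 33, "Y": 11,
    "H": 5, "K": 29, "R": 31,
    "D": 22, "E": 14,
}

# Group names are assumed labels; membership follows the table's row order.
GROUPS = {
    "nonpolar": "AFGILMPVW",
    "polar": "CNQSTY",
    "basic": "HKR",
    "acidic": "DE",
}


def group_summary(counts, groups):
    """Return the total residue count and, per group, (count, percentage)."""
    total = sum(counts.values())
    summary = {}
    for name, residues in groups.items():
        n = sum(counts[r] for r in residues)
        summary[name] = (n, round(100 * n / total, 2))
    return total, summary


total, summary = group_summary(COUNTS, GROUPS)
print("total residues:", total)
for name, (n, pct) in summary.items():
    print(f"{name:9s} {n:4d} {pct:6.2f}%")
```

With these counts, the basic residues (65) outnumber the acidic ones (36) by 29, which agrees with the abstract's description of the N protein as a typical basic protein.
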
Fig. 2 The predicted phosphorylation sites on the N protein. We identified 33 potential phosphorylation sites in the N protein, including 22 serines, 8 threonines and 3 tyrosines. The average score of the serines is significantly higher than those of the other two residue types. The phosphorylation sites are concentrated in the middle of the N protein.
Fig. 3 The similarity chart of the N protein. Based on a multiple alignment of nineteen coronavirus N proteins, two conserved regions were found around a.a. 81-140 and a.a. 270-320 (all amino acid positions refer to the N protein of SARS-CoV, Isolate BJ01). The arrow indicates the most conserved domain, and its sequence and amino acid position are given. In contrast, the two termini of the N protein are more variable, particularly the C-terminus. The figure was generated by Plotcon in the EMBOSS package (http://www.hgmp.mrc.ac.uk/Software/EMBOSS/).
The Core Motif of SR-rich Region in the Coronavirus N Protein
| Coronavirus | Core motif of SR-rich region in the N protein |
|---|---|
| SARS Coronavirus BJ01 | |
| Murine Hepatitis Virus | |
| Puffinosis Virus | |
| Rat Sialodacryoadenitis Coronavirus | |
| Rat Coronavirus | |
| Equine Coronavirus | |
| Bovine Coronavirus | |
| Porcine Hemagglutinating Encephalomyelitis Virus | |
| HCoV-OC43 | |
| Turkey Coronavirus | |
| Avian Infectious Bronchitis Virus | |
| Porcine Respiratory Coronavirus | |
| Transmissible Gastroenteritis Virus | |
| Canine Enteric Coronavirus | |
| Canine Coronavirus | |
| Feline Infectious Peritonitis Virus | |
| Feline Coronavirus | |
| Human Coronavirus 229E | |
| Porcine Epidemic Diarrhea Virus | |
Bold letters indicate the marker SR. Normal letters indicate the amino acids between two SRs.
Square brackets indicate that the position may be occupied by any one of the enclosed amino acids.
Round brackets indicate amino acids that occur in some viruses but not in others.
Curly brackets indicate amino acids between SRs, X indicates any amino acid, and the subscript n indicates the number of amino acids between SRs.
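
The motif column is empty in this copy of the table, so the core motifs themselves cannot be reproduced here. To illustrate only the notation of SR markers separated by spacers of n residues, the sketch below scans a sequence for SR dipeptides and reports what lies between consecutive ones. The input is peptide N177 (a.a. 177-198) from the synthesized-peptide table in this document; the helper name `sr_spacers` is hypothetical, and this is not the authors' motif-finding method.

```python
import re


def sr_spacers(seq):
    """Return (positions, spacers) for a protein sequence.

    positions: 1-based start of every 'SR' dipeptide in seq.
    spacers:   the residues lying between consecutive 'SR' dipeptides
               (an empty string means two SRs are directly adjacent).
    """
    hits = [m.start() for m in re.finditer("SR", seq)]
    spacers = [seq[a + 2:b] for a, b in zip(hits, hits[1:])]
    return [h + 1 for h in hits], spacers


# Peptide N177, a.a. 177-198 of the SARS-CoV N protein (BJ01).
N177 = "SRGGSQASSRSSSRSRGNSRNS"
positions, spacers = sr_spacers(N177)

print("SR starts (within peptide):", positions)
print("spacer lengths n:", [len(s) for s in spacers])
```

Adding 176 to each reported position converts it to a position in the full-length protein, since the peptide starts at residue 177.
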
Fig. 4 The possible antigenic sites of the N protein. We predicted sixteen antigenic sites in the N protein and found that they cluster in the middle and at the C-terminus. There is a strong antigenic site (TALALLLLDR) located around Codons 218-227. In addition, three other strong antigenic sites are detected around Codons 156-166 (AATVLQLPQGT), Codons 347-363 (FKDNVILLNKHIDAYKT) and Codons 389-398 (KQPTVTLLPA).
The Predicted Antigenic Sites on the SARS-CoV N Protein
| No. | Start Position | Sequence | End Position |
|---|---|---|---|
| 1 | 52 | SWFTALTQ | 59 |
| 2 | 69 | RGQGVPI | 75 |
| 3 | 83 | DQIGYYR | 89 |
| 4 | 106 | SPRWYFYYLG | 115 |
| 5 | 118 | PEASLPY | 124 |
| 6 | 130 | GIVWVAT | 136 |
| 7 | 156 | AATVLQLPQGT | 166 |
| 8 | 218 | TALALLLLDR | 227 |
| 9 | 229 | NQLESKVSG | 237 |
| 10 | 243 | QGQTVTK | 249 |
| 11 | 267 | KQYNVTQ | 273 |
| 12 | 299 | YKHWPQIAQFAPSASAF | 315 |
| 13 | 323 | MEVTPSGTWLTYHGAIK | 339 |
| 14 | 347 | FKDNVILLNKHIDAYKT | 363 |
| 15 | 379 | EAQPLPQ | 385 |
| 16 | 389 | KQPTVTLLPA | 398 |
Positions are amino acid positions in the SARS-CoV N protein (BJ01).
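
Because each row gives a start position, a sequence and an end position, the table can be checked for internal consistency: with 1-based inclusive positions, the sequence length should equal end - start + 1. The sketch below performs only that check on the rows as printed; the helper name `inconsistent_rows` is hypothetical.

```python
# (site number, start, sequence, end), copied from the table above.
SITES = [
    (1, 52, "SWFTALTQ", 59),
    (2, 69, "RGQGVPI", 75),
    (3, 83, "DQIGYYR", 89),
    (4, 106, "SPRWYFYYLG", 115),
    (5, 118, "PEASLPY", 124),
    (6, 130, "GIVWVAT", 136),
    (7, 156, "AATVLQLPQGT", 166),
    (8, 218, "TALALLLLDR", 227),
    (9, 229, "NQLESKVSG", 237),
    (10, 243, "QGQTVTK", 249),
    (11, 267, "KQYNVTQ", 273),
    (12, 299, "YKHWPQIAQFAPSASAF", 315),
    (13, 323, "MEVTPSGTWLTYHGAIK", 339),
    (14, 347, "FKDNVILLNKHIDAYKT", 363),
    (15, 379, "EAQPLPQ", 385),
    (16, 389, "KQPTVTLLPA", 398),
]


def inconsistent_rows(rows):
    """Return the site numbers whose sequence length disagrees with
    end - start + 1 (positions are 1-based and inclusive)."""
    return [no for no, start, seq, end in rows if end - start + 1 != len(seq)]


bad = inconsistent_rows(SITES)
covered = sum(len(seq) for _, _, seq, _ in SITES)

print("inconsistent rows:", bad)
print("residues inside predicted sites:", covered)
```
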
The Synthesized Peptides Representing the N Protein of SARS-CoV and Their ELISA Result
| Peptide Number | Start Position | Sequence | End Position | ELISA Result |
|---|---|---|---|---|
| N1 | 1 | MSDNGPQSNQRSAPRITFGGPTD | 23 | ++ |
| N21 | 21 | PTDSTDNNQNGGRNGARPKQRR | 42 | ++ |
| N35 | 35 | GARPKQRRPQGLPNNTASWFTA | 56 | + |
| N99 | 99 | DGKMKELSPRWYFYYLGTGPEA | 120 | - |
| N161 | 161 | QLPQGTTLPKGFYAEGSRGGSQ | 182 | +++ |
| N177 | 177 | SRGGSQASSRSSSRSRGNSRNS | 198 | ++ |
| N196 | 196 | RNSTPGSSRGNSPARMASGGGE | 217 | - |
| N215 | 215 | GGETALALLLLDRLNQLESKVSGKG | 239 | ++ |
| N245 | 245 | QTVTKKSAAEASKKPRQKRTATKQ | 268 | ++ |
| N258 | 258 | KPRQKRTATKQYNVTQAFGRRG | 279 | + |
| N355 | 355 | NKHIDAYKTFPPTEPKKDKKKK | 376 | ++ |
| N371 | 371 | KDKKKKTDEAQPLPQRQKKQ | 390 | +++ |
| N385 | 385 | QRQKKQPTVTLLPAADMDDFSRQ | 407 | ++ |
| N401 | 401 | MDDFSRQLQNSMSGASADSTQA | 422 | - |
Positions are amino acid positions in the SARS-CoV N protein (BJ01).
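
One way to relate the ELISA results to the predicted antigenic sites is to ask which predicted sites each synthesized peptide overlaps. The sketch below is a plain interval-overlap comparison using the positions printed in the two tables; it is an illustration rather than an analysis performed in the source, and the helper name `overlapping_sites` is hypothetical.

```python
# Predicted antigenic sites: site number -> (start, end), 1-based inclusive.
SITES = {
    1: (52, 59), 2: (69, 75), 3: (83, 89), 4: (106, 115),
    5: (118, 124), 6: (130, 136), 7: (156, 166), 8: (218, 227),
    9: (229, 237), 10: (243, 249), 11: (267, 273), 12: (299, 315),
    13: (323, 339), 14: (347, 363), 15: (379, 385), 16: (389, 398),
}

# Synthesized peptides: name -> (start, end, ELISA result).
PEPTIDES = {
    "N1": (1, 23, "++"), "N21": (21, 42, "++"), "N35": (35, 56, "+"),
    "N99": (99, 120, "-"), "N161": (161, 182, "+++"), "N177": (177, 198, "++"),
    "N196": (196, 217, "-"), "N215": (215, 239, "++"), "N245": (245, 268, "++"),
    "N258": (258, 279, "+"), "N355": (355, 376, "++"), "N371": (371, 390, "+++"),
    "N385": (385, 407, "++"), "N401": (401, 422, "-"),
}


def overlapping_sites(start, end, sites):
    """Return the numbers of all sites sharing at least one residue
    with the closed interval [start, end]."""
    return [no for no, (s, e) in sites.items() if s <= end and start <= e]


overlaps = {
    name: overlapping_sites(start, end, SITES)
    for name, (start, end, _) in PEPTIDES.items()
}

for name, (start, end, elisa) in PEPTIDES.items():
    print(f"{name:5s} {start:3d}-{end:3d} ELISA {elisa:3s} sites {overlaps[name]}")
```

On these positions the correspondence is not one-to-one: N99 overlaps predicted sites 4 and 5 yet is ELISA-negative, while N1 and N21 react without overlapping any predicted site.
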
Fig. 5 The phylogenetic tree based on nineteen coronavirus N proteins. It reveals that SARS-CoV is closer to Group 2 than to Groups 1 and 3. Abbreviations: AIBV: avian infectious bronchitis virus; BCoV: bovine coronavirus; CCoV: canine coronavirus; CECoV: canine enteric coronavirus; ECoV: equine coronavirus; FCoV: feline coronavirus; FIPV: feline infectious peritonitis virus; HCoV-OC43: human coronavirus strain OC43; HCoV-229E: human coronavirus strain 229E; MHV: murine hepatitis virus; PEDV: porcine epidemic diarrhea virus; PHEV: porcine hemagglutinating encephalomyelitis virus; PRCoV: porcine respiratory coronavirus; PTGV: transmissible gastroenteritis virus; PV: puffinosis virus; RCoV: rat coronavirus; RSCoV: rat sialodacryoadenitis coronavirus; SARS-CoV: human severe acute respiratory syndrome-associated coronavirus isolate BJ01; TCoV: turkey coronavirus.
The Four Substitutions in the N Protein of SARS-CoV
| Nt Position in BJ01 | a.a. Position in the ORF | a.a. (Ratio) | Synonymous Substitution |
|---|---|---|---|
| 28,519 | 140 | L(16)/W(1) | No |
| 28,520 | 140 | L(17) | Yes |
| 28,560 | 154 | N(16)/Y(1) | No |
| 28,677 | 193 | G(16)/C(1) | No |
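
The nucleotide and amino acid positions in this table are linked by the reading frame: a nucleotide at genome position nt lies in codon (nt - start) // 3 + 1 of an ORF whose first nucleotide is at `start`. The source does not state the ORF start, so the sketch below searches for the start coordinates that are consistent with all four rows. The search window (28,000 to 28,200) and the function names are assumptions made for illustration, and the result is an inference from the table rather than a verified genome annotation.

```python
# (nt position in BJ01, a.a. position in the ORF), copied from the table above.
ROWS = [(28519, 140), (28520, 140), (28560, 154), (28677, 193)]


def consistent_orf_starts(rows, lo, hi):
    """Return every candidate ORF start (1-based nt coordinate) in [lo, hi]
    that places each nucleotide in the codon listed for it."""
    return [
        start
        for start in range(lo, hi + 1)
        if all((nt - start) // 3 + 1 == aa for nt, aa in rows)
    ]


def codon_phase(nt, start):
    """Return 1, 2 or 3: the position of the nucleotide within its codon."""
    return (nt - start) % 3 + 1


# Assumed search window; it only needs to lie upstream of the first row.
starts = consistent_orf_starts(ROWS, 28000, 28200)
print("consistent ORF starts:", starts)

for start in starts:
    for nt, aa in ROWS:
        print(f"nt {nt} -> codon {aa}, codon position {codon_phase(nt, start)}")
```

Under the inferred start, the single synonymous substitution (nt 28,520) falls on a third codon position, while the three non-synonymous substitutions fall on first or second positions, which matches the last column of the table.
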