Jingxiang Li, Chunqing Luo, Yajun Deng, Yujun Han, Lin Tang, Jing Wang, Jia Ji, Jia Ye, Fanbo Jiang, Zhao Xu, Wei Tong, Wei Wei, Qingrun Zhang, Shengbin Li, Wei Li, Hongyan Li, Yudong Li, Wei Dong, Jian Wang, Shengli Bi, Huanming Yang.
Abstract
The corona-like spikes, or peplomers, seen on the surface of the virion under the electron microscope are the most striking feature of coronaviruses. The S (spike) protein, with 1,255 amino acids, is the largest structural protein encoded by the viral genome. Its structure can be divided into three regions: a long N-terminal region on the exterior, a characteristic transmembrane (TM) region, and a short C-terminus in the interior of the virion. By comparison with the seventeen published SARS-CoV genome sequences, we detected fifteen nucleotide substitutions, eight (53.3%) of which are non-synonymous mutations leading to amino acid alterations with predicted physicochemical changes. The possible antigenic determinants of the S protein are predicted, and the result is confirmed by ELISA (enzyme-linked immunosorbent assay) with synthesized peptides. Another notable finding is that three disulfide bonds are identified between the C-terminus of the S protein and the N-terminus of the E (envelope) protein, based on the typical sequences and positions; if confirmed, this would establish a structural connection between these two important structural proteins. Phylogenetic analysis reveals several conserved regions that might be potent drug targets.
Year: 2003 PMID: 15626341 PMCID: PMC5172354 DOI: 10.1016/s1672-0229(03)01015-5
Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN: 1672-0229 Impact factor: 7.691
The Amino Acid Composition of the S Protein
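| Amino acid | Class | Number | Percentage (%) |
|---|---|---|---|
| Ala | Non-polar, neutral | 84 | 6.69 |
| Gly | Non-polar, neutral | 78 | 6.22 |
| Ile | Non-polar, neutral | 77 | 6.14 |
| Leu | Non-polar, neutral | 99 | 7.89 |
| Met | Non-polar, neutral | 20 | 1.59 |
| Phe | Non-polar, neutral | 83 | 6.61 |
| Pro | Non-polar, neutral | 57 | 4.54 |
| Trp | Non-polar, neutral | 11 | 0.88 |
| Val | Non-polar, neutral | 91 | 7.25 |
| Asn | Polar, neutral | 81 | 6.45 |
| Cys | Polar, neutral | 39 | 3.11 |
| Gln | Polar, neutral | 55 | 4.37 |
| Ser | Polar, neutral | 96 | 7.65 |
| Thr | Polar, neutral | 100 | 7.97 |
| Tyr | Polar, neutral | 54 | 4.30 |
| Arg | Basic | 39 | 3.11 |
| His | Basic | 15 | 1.20 |
| Lys | Basic | 60 | 4.78 |
| Asp | Acidic | 74 | 5.90 |
| Glu | Acidic | 42 | 3.35 |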
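The percentages in this table follow from dividing each residue count by the total length of the protein. The short Python sketch below is an illustration only, not part of the original analysis; it recomputes that arithmetic from the counts listed above. The names `counts` and `percentage` are chosen here for the example.

```python
# Residue counts copied from the amino acid composition table.
counts = {
    "Ala": 84, "Gly": 78, "Ile": 77, "Leu": 99, "Met": 20, "Phe": 83,
    "Pro": 57, "Trp": 11, "Val": 91, "Asn": 81, "Cys": 39, "Gln": 55,
    "Ser": 96, "Thr": 100, "Tyr": 54, "Arg": 39, "His": 15, "Lys": 60,
    "Asp": 74, "Glu": 42,
}

# Total number of residues; this should equal the stated protein length.
total = sum(counts.values())


def percentage(residue: str) -> float:
    """Share of the protein made up by one residue type, in percent."""
    return round(100.0 * counts[residue] / total, 2)
```

The counts sum to 1,255, which matches the protein length given in the abstract, and `percentage("Ala")` reproduces the 6.69 listed for alanine.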
Fig. 1 The predicted distributions of GC content (A), electric charge (B), hydrophobicity (C), helix (D), coil (E), signal peptide and transmembrane regions (F), and the positions of non-synonymous substitutions (inverted triangles) in the S protein.
Fig. 2 The predicted N-glycosylated and O-glycosylated sites in the S protein of SARS-CoV. The rectangles mark the different types of antigenic determinants.
The ELISA Results of the Synthesized Peptides of the S Protein
| Peptide | Position (a.a.) | Sequence | Positive |
|---|---|---|---|
| S67 | 67 - 89 | TGFHTINHTFDNPVIPFKDGIYF | - |
| S84 | 84 - 108 | KDGIYFAATEKSNVVRGWVFGSTMN | + |
| S101 | 101 - 122 | WVFGSTMNNKSQSVIIINNSTN | + |
| S118 | 118 - 142 | NNSTNVVIRACNFELCDNPFFAVSK | + |
| S260 | 260 - 281 | TTFMLKYDENGTITDAVDCSQN | - |
| S277 | 277 - 298 | DCSQNPLAELKCSVKSFEIDKG | + |
| S301 | 301 - 322 | QTSNFRVVPSGDVVRFPNITNL | +++ |
| S323 | 324 - 344 | PFGEVFNATKFPSVYAWERKK | + |
| S345 | 345 - 366 | ISNCVADYSVLYNSTFFSTFKC | - |
| S582 | 582 - 604 | SVITPGTNASSEVAVLYQDVNCT | + |
| S599 | 599 - 620 | QDVNCTDVSTAIHADQLTPAWR | +++ |
| S624 | 624 - 645 | TGNNVFQTQAGCLIGAEHVDTS | + |
| S645 | 645 - 667 | SYECDIPIGAGICASYHTVSLLR | ++ |
| S886 | 886 - 906 | YRFNGIGVTQNVLYENQKQIA | - |
| S1130 | 1130 - 1147 | FKEELDKYFKNHTSPDVD | +++ |
| S1211 | 1211 - 1232 | MVTILLCCMTSCCSCLKGACSC | - |
| S1227 | 1227 - 1248 | KGACSCGSCCKFDEDDSEPVLK | + |
| S1234 | 1234 - 1255 | SCCKFDEDDSEPVLKGVKLHYT | ++ |
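Each row of this table pairs a position range with a peptide sequence, so the two can be cross-checked: the range length should equal the sequence length, and where neighbouring peptides overlap, the shared residues should be identical. The following Python sketch illustrates that check on the first four rows only. It is a reader-side consistency check under these assumptions, not the authors' procedure, and the names `Peptide`, `span_matches` and `overlap_agrees` are hypothetical.

```python
from typing import NamedTuple


class Peptide(NamedTuple):
    label: str
    start: int  # first residue position (1-based, inclusive)
    end: int    # last residue position (inclusive)
    seq: str


# First four rows of the ELISA table, copied as listed.
PEPTIDES = [
    Peptide("S67", 67, 89, "TGFHTINHTFDNPVIPFKDGIYF"),
    Peptide("S84", 84, 108, "KDGIYFAATEKSNVVRGWVFGSTMN"),
    Peptide("S101", 101, 122, "WVFGSTMNNKSQSVIIINNSTN"),
    Peptide("S118", 118, 142, "NNSTNVVIRACNFELCDNPFFAVSK"),
]


def span_matches(p: Peptide) -> bool:
    """True if the position range has the same length as the sequence."""
    return p.end - p.start + 1 == len(p.seq)


def overlap_agrees(a: Peptide, b: Peptide) -> bool:
    """True if the tail of `a` equals the head of `b` over their overlap."""
    n = a.end - b.start + 1  # number of shared positions
    if n <= 0:
        return True  # the peptides do not overlap, nothing to compare
    return a.seq[-n:] == b.seq[:n]


spans_ok = all(span_matches(p) for p in PEPTIDES)
overlaps_ok = all(overlap_agrees(a, b) for a, b in zip(PEPTIDES, PEPTIDES[1:]))
```

For these four rows both checks pass; for instance, S84 and S101 share positions 101 to 108, and both list `WVFGSTMN` there.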
The Distribution of Substitutions in the S Protein
| Subregions (a.a.) | Position (a.a./nt) | Substitution (a.a./nt) | pI value | Hydrophobicity (%) | Hydrophilicity (%) | Charge (+) | Charge (-) | Isolate |
|---|---|---|---|---|---|---|---|---|
| 1 (60–115) | 77(21,702) | Asp-Gly (A/G) | 7.9–9.0 | (33.9) | 46.4–44.6 | (10.7) | 5.4–3.6 | BJ04, CUHK, HKU, SIN1, SIN2, SIN3, TOR2, SIN4, SIN5, TW1, Urbani, ZJ01 |
| 2 (115–145) | 144(21,902) | Met-Leu (A/C) | (5.8) | (38.7) | (38.7) | (6.5) | (6.5) | BJ03 |
| 3 (220–255) | 244(22,203) | Thr-Ile (C/T) | (8.0) | 33.3–36.1 | 33.3–30.6 | (5.6) | (2.8) | CUHK, HKU, SIN1, SIN2, SIN3, SIN4, SIN5, TOR2, TW1, Urbani, ZJ01 |
| 4 (255–320) | 311(22,403) | Gly-Arg (G/A) | 5.1–6.0 | (28.8) | 48.5–50.0 | 10.6–12.1 | (12.1) | BJ02 |
| 5 (560–580) | 577(23,201) | Ser-Ala (T/G) | (4.3) | (23.8) | 52.4–47.6 | (9.5) | (19.0) | TOR2 |
| 6 (830–880) | 860(24,050) | ValSer-LeuArg (AG/CT) | 4.1–4.6 | (33.3) | (27.5) | 2.0–3.9 | (5.9) | BJ03 |
| 7 (975–1,010) | 1,001(24,474) | Arg-Met (G/T) | 10.8–9.7 | 27.8–30.6 | 50.0–47.2 | 13.9–11.1 | (5.6) | BJ04 |
CUHK: CUHK-Su10; HKU: HKU-39849; SIN1: SIN2500; SIN2: SIN2677; SIN3: SIN2679; SIN4: SIN2748; SIN5: SIN2774
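Whether a nucleotide substitution is synonymous depends on whether the mutated codon still encodes the same amino acid. The Python sketch below illustrates that classification with the standard genetic code. The table lists only the base change and the resulting amino acid change, so the codon `GAT` used here is a hypothetical example chosen to be consistent with the Asp-Gly (A/G) row, and `classify` is a name introduced for this illustration rather than anything from the original analysis.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def classify(codon: str, index: int, new_base: str) -> tuple[str, str, str]:
    """Classify a single-base change at 0-based `index` within `codon`.

    Returns (kind, amino acid before, amino acid after).
    """
    mutated = codon[:index] + new_base + codon[index + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    kind = "synonymous" if before == after else "non-synonymous"
    return kind, before, after


# Hypothetical Asp codon GAT with the A/G change at the second position.
example = classify("GAT", 1, "G")

# Share of non-synonymous substitutions stated in the abstract: 8 of 15.
share = round(100 * 8 / 15, 1)
```

With the assumed codon, the A/G change turns Asp (D) into Gly (G) and is classified as non-synonymous, whereas a third-position change such as `GAT` to `GAC` leaves Asp unchanged.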
Pairwise Identity/Similarity Between Coronaviruses Based on the S Protein
AIBV: avian infectious bronchitis virus; BCoV: bovine coronavirus; CCoV: canine coronavirus; FCoV: feline coronavirus; FIPV: feline infectious peritonitis virus; HCoV-229E: human coronavirus 229E; HCoV-OC43: human coronavirus OC43; MHV: murine hepatitis virus; PEDV: porcine epidemic diarrhea virus; PHEV: porcine hemagglutinating encephalomyelitis virus; PRCoV: porcine respiratory coronavirus; PTGV: porcine transmissible gastroenteritis virus; RCoV: rat coronavirus; RSCoV: rat sialodacryoadenitis coronavirus; HECoV: human enteric coronavirus; SARS-CoV: the human SARS-associated coronavirus; TCoV: turkey coronavirus.
Fig. 3 A phylogenetic tree of the coronaviruses based on the S protein.
Fig. 4 The similarity among the S proteins of 13 coronaviruses and the dot-plot result between MHV and SARS-CoV (BJ01-04).
Fig. 5 A typical Cys conserved region of coronaviruses.