Jerome Rumdon Lon1, Yunmeng Bai1, Bingxu Zhong1, Fuqiang Cai1, Hongli Du2.
Abstract
BACKGROUND: To obtain antibodies that recognize natural proteins, the antigenic determinants (epitopes) of those proteins can be predicted and then realized as polypeptides. The polypeptides can be coupled with corresponding vectors to stimulate the immune system to produce the corresponding antibodies, which also makes this a simple and effective method of vaccine development. The discovery of epitopes therefore aids the development of a SARS-CoV-2 vaccine.
Keywords: Bioinformatics; Epitopes; Evolution; SARS-CoV-2
Year: 2020 PMID: 33121513 PMCID: PMC7594941 DOI: 10.1186/s12985-020-01437-4
Source DB: PubMed Journal: Virol J ISSN: 1743-422X Impact factor: 4.099
Fig. 1 The 3D structure prediction and Ramachandran plot analysis of the E protein. a The Ramachandran plot analysis of the 3D structure of the E protein (excluding Gly and Pro). All residues are located in the allowed regions, indicating that the structure is reliable from a thermodynamic point of view. b The 3D structure of the E protein predicted by homology modeling. It is a pentamer with ion channel activity [38]. Its head is short, and the middle of the tail is a transmembrane region that helps the E protein embed in the envelope of SARS-CoV-2
The plot statistics of the Ramachandran plot
| Plot statistics-E | Number of residues | % |
|---|---|---|
| Residues in most favoured regions [A, B, L] | 228 | 84.40 |
| Residues in additional allowed regions [a, b, l, p] | 38 | 14.10 |
| Residues in generously allowed regions [~ a, ~ b, ~ l, ~ p] | 4 | 1.50 |
| Residues in disallowed regions | 0 | 0.00 |
All of the residues are located in the allowed regions, which indicates that the model is reasonable at the energy level
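The percentage column can be checked directly from the residue counts. The following is a minimal sketch, assuming each percentage is simply the region's count divided by the sum of all four counts; the dictionary keys and the function name `region_percentages` are illustrative and not taken from the source.

```python
# Residue counts per Ramachandran region, copied from the table above.
counts = {
    "most favoured": 228,
    "additional allowed": 38,
    "generously allowed": 4,
    "disallowed": 0,
}


def region_percentages(region_counts):
    """Return each region's share of all plotted residues, in percent."""
    total = sum(region_counts.values())
    return {region: 100.0 * n / total for region, n in region_counts.items()}


percentages = region_percentages(counts)
for region, pct in percentages.items():
    print(f"{region}: {pct:.1f}%")
```

Under this assumption the four counts sum to 270 plotted residues, and the computed shares agree with the tabulated values to one decimal place.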
Fig. 2 The secondary structures and properties analysis of the S, E and M proteins. a Analysis of the S protein. It consists mostly of α-helix and β-sheet, with some turn and coil regions, several discontinuous high-flexibility fragments, a fluctuating surface probability with a few positive peaks, and several antigenicity regions with positive peaks. The S protein showed concentrated high antigenicity peaks at residues 600–800. b Analysis of the E protein. It consists mostly of α-helix and β-sheet, with some turn and coil regions, three high-flexibility fragments, few surface probability regions, and two antigenicity regions with positive peaks at the beginning and the end of the polypeptide chain, respectively. The E protein showed concentrated high antigenicity peaks at residues 60–70. c Analysis of the M protein. It consists mostly of α-helix and β-sheet, with some turn and coil regions, several high-flexibility fragments, few surface probability regions, two antigenicity regions with a single positive peak at the beginning and the middle of the peptide chain, respectively, and consecutive positive peaks at the end. The M protein showed concentrated high antigenicity peaks at residues 200–220. Interestingly, the high antigenicity peaks of all three proteins lay in regions where α-helix is relatively sparse, which may be because the α-helix structure prevents continuous residues from being located on the surface
The composition and the antigenic index of the epitopes of SARS-CoV-2
| Name | Position | Amino acid | Antigenic index |
|---|---|---|---|
| A | 601–605 | GTNTS | 0.525 |
| B | 656–660 | VNNSY | 0.575 |
| C | 676–682 | TQTNSPR | 0.675 |
| D | 808–813 | DPSKPS | 0.580 |
| E | 60–65 | SRVKNL | 0.588 |
| F | 60–65 | SRVKNL | 0.767 |
| G | 211–215 | SSSSD | 0.656 |
The scores of epitope E and epitope G were calculated by ElliPro; the others were calculated by BepiPred 2.0. Epitopes A, B, C and D belong to the S protein. Epitopes E and F belong to the E protein and coincide. Epitope G belongs to the M protein
Fig. 3 The predicted epitopes of the S and E proteins. a The predicted linear B-cell epitopes of the S protein. Epitopes A, B and C are located in the front part of the tail; epitope D is located in the back part of the tail, close to the transmembrane region. b The predicted B-cell epitopes of the E protein. Epitope E is the linear epitope and epitope F is the conformational epitope; the two coincide
The conservation of the epitopes in SARS-CoV-2 dataset
| Name | Position | Conservation score | Average |
|---|---|---|---|
| A | 601 | 0.482 | −0.199 |
| | 602 | −0.962 | |
| | 603 | −0.208 | |
| | 604 | −0.01 | |
| | 605 | −0.297 | |
| B | 656 | −0.332 | 0.114 |
| | 657 | 0.243 | |
| | 658 | −0.341 | |
| | 659 | 0.198 | |
| | 660 | 0.803 | |
| C | 676 | −0.297 | 0.093 |
| | 677 | 0.368 | |
| | 678 | −0.533 | |
| | 679 | 0.457 | |
| | 680 | 0.629 | |
| | 681 | 0.188 | |
| | 682 | −0.159 | |
| D | 808 | 1.842 | 0.369 |
| | 809 | 1.211 | |
| | 810 | −0.272 | |
| | 811 | −0.208 | |
| | 812 | 0.366 | |
| | 813 | −0.723 | |
| E/F | 60 | −0.936 | −0.434 |
| | 61 | −0.966 | |
| | 62 | 0.741 | |
| | 63 | −0.222 | |
| | 64 | −0.787 | |
| | 65 | −0.936 | |
| G | 211 | 0.086 | 0.282 |
| | 212 | 0.171 | |
| | 213 | −0.011 | |
| | 214 | 1.561 | |
| | 215 | −0.395 | |
The calculation was independent and based on the SARS-CoV-2 data set
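The Average column in the table above can be recomputed from the per-residue scores. The sketch below assumes that each average is the plain arithmetic mean of the listed conservation scores, which the source does not state explicitly; the names `scores` and `epitope_average` are illustrative.

```python
from statistics import mean

# Per-residue conservation scores, copied from the table above.
scores = {
    "A": {601: 0.482, 602: -0.962, 603: -0.208, 604: -0.01, 605: -0.297},
    "B": {656: -0.332, 657: 0.243, 658: -0.341, 659: 0.198, 660: 0.803},
    "C": {676: -0.297, 677: 0.368, 678: -0.533, 679: 0.457,
          680: 0.629, 681: 0.188, 682: -0.159},
    "D": {808: 1.842, 809: 1.211, 810: -0.272, 811: -0.208,
          812: 0.366, 813: -0.723},
    "E/F": {60: -0.936, 61: -0.966, 62: 0.741, 63: -0.222,
            64: -0.787, 65: -0.936},
    "G": {211: 0.086, 212: 0.171, 213: -0.011, 214: 1.561, 215: -0.395},
}


def epitope_average(per_residue):
    """Arithmetic mean of an epitope's conservation scores, to 3 decimals."""
    return round(mean(per_residue.values()), 3)


averages = {name: epitope_average(vals) for name, vals in scores.items()}
for name, avg in averages.items():
    print(f"{name}: {avg}")

# Mean of E/F restricted to positions 60-64, for comparison with the table.
ef_60_64 = round(mean(v for pos, v in scores["E/F"].items() if pos <= 64), 3)
print(f"E/F (60-64 only): {ef_60_64}")
```

Under this assumption the recomputed means for A, B, C, D and G match the tabulated averages. For E/F, the mean over all six listed positions is −0.518, while the tabulated −0.434 is reproduced by positions 60–64 alone; the source does not say whether position 65 was excluded from the average.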