Maryam Tohidinia, Fatemeh Sefid.
Abstract
Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered coronavirus that has spread worldwide. Coronaviruses are single-stranded, positive-sense RNA viruses with a genome of approximately 30 kb, the largest genome among RNA viruses. Most people infected with the COVID-19 virus experience mild to moderate respiratory illness and recover without requiring special treatment. Older people and those with underlying medical problems such as cardiovascular disease, diabetes, chronic respiratory disease, and cancer are more likely to develop serious illness. At this time, there are no specific vaccines or treatments for COVID-19, so there is an urgent need for vaccines and antiviral strategies. The spike protein is the major surface protein of the virus; it binds to a receptor on the host cell, a protein that acts as a doorway into the human cell. Putative antigenic epitopes of this protein may prove effective as novel vaccines for combating and eradicating COVID-19 infection. A combination of available bioinformatics tools was used to predict such peptides, which are important for the development of a vaccine. In conclusion, amino acids 250-800 were selected as containing effective B cell epitopes, T cell epitopes, and functional exposed amino acids for the design of a recombinant vaccine against the coronavirus.
Keywords: SARS-CoV-2/COVID-19; Spike protein; T and B cell epitopes prediction; Vaccine design
Year: 2020 PMID: 32835775 PMCID: PMC7441888 DOI: 10.1016/j.micpath.2020.104459
Source DB: PubMed Journal: Microb Pathog ISSN: 0882-4010 Impact factor: 3.738
Fig. 1 Flow chart showing an overview of the methodology pipeline.
Fig. 2 Alignment of the S sequence with 10 sequences obtained from a protein BLAST search against coronaviruses. The alignment was produced with the T-Coffee program and adjusted manually. Residue conservation is depicted by a blue-to-pink color scale.
Fig. 3 Prediction of topology and outer-membrane regions by the TMHMM server. The diagram shows the estimated preference of each residue to be located in the transmembrane region (red), inside (blue), or outside (pink). The schematic shows that amino acids 18 to 700 lie in the outside region.
Fig. 4 The best model of the spike glycoprotein predicted by the SWISS-MODEL server.
Fig. 5 Model validation. Both the global and the local quality estimates of the obtained model are reasonable.
Fig. 6 Model evaluation. (a) Ramachandran plot of the final S protein model. Number of residues in the favored region: 2718 (96.3%). Number of residues in the allowed region: 100 (3.5%). Number of residues in the outlier region: 3 (0.1%). (b) ProSA protein structure analysis results. Z-score = −9.7. The overall quality of the final model is acceptable.
Fig. 7 Functional residues at the protein surface predicted by InterProSurf. The top functional residues are shown as red balls in a space-filling model, and the next cluster is shown as green balls. (Color figure online).
Fig. 8 Graphical display of hydrophilicity, accessibility, antigenicity, flexibility, and beta-turn secondary structure along the protein sequence, as predicted by the IEDB and BcePred servers. The regions with the highest scores are shown in red and the remainder in blue.
Fig. 9 Epitope mapping on the 3D model, prepared with Discovery Studio Visualizer 2.5.5. Panels 1 to 8 show the 8 linear epitopes with the highest PI scores predicted by the ElliPro server.
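The Ramachandran percentages quoted in the Fig. 6 caption can be reproduced from the residue counts alone. The short sketch below is only an arithmetic check; the helper name `region_percentages` is an illustrative assumption and not part of the original analysis.

```python
# Residue counts per Ramachandran region, as quoted in the Fig. 6 caption.
counts = {"favored": 2718, "allowed": 100, "outlier": 3}

def region_percentages(region_counts):
    """Return each region's share of all residues, in percent (1 decimal)."""
    total = sum(region_counts.values())
    return {name: round(100 * n / total, 1) for name, n in region_counts.items()}

percentages = region_percentages(counts)
print(percentages)
```

The three counts sum to 2821 residues, and the rounded shares agree with the 96.3%, 3.5%, and 0.1% reported in the caption.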
8 linear epitopes with the highest PI score predicted by Ellipro server.
| No | Start | End | Peptide | Number of residues | Score |
|---|---|---|---|---|---|
| 1 | 239 | 265 | QTLLALHRSYLTPGDSSSGWTAGAAAY | 27 | 0.84 |
| 2 | 392 | 525 | FTNVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVC | 134 | 0.812 |
| 3 | 64 | 83 | WFHAIHVSGTNGTKRFDNPV | 20 | 0.787 |
| 4 | 702 | 718 | ENSVAYSNNSIAIPTNF | 17 | 0.708 |
| 5 | 554 | 564 | ESNKKFLPFQQ | 11 | 0.668 |
| 6 | 879 | 925 | AGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIAN | 47 | 0.651 |
| 7 | 577 | 584 | RDPQTLEI | 8 | 0.603 |
| 8 | 109 | 114 | TLDSKT | 6 | 0.518 |
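A simple consistency check for a table of linear epitopes is that each peptide's length equals `end - start + 1` for its 1-based, inclusive coordinates. The sketch below applies that check to a few rows copied from the table; the helper names `span_length` and `inconsistent_rows` are hypothetical and are not part of the original study.

```python
# Rows copied from the linear-epitope table: (No, start, end, peptide).
rows = [
    (1, 239, 265, "QTLLALHRSYLTPGDSSSGWTAGAAAY"),
    (3, 64, 83, "WFHAIHVSGTNGTKRFDNPV"),
    (6, 879, 925, "AGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIAN"),
    (7, 577, 584, "RDPQTLEI"),
]

def span_length(start: int, end: int) -> int:
    """Number of residues in a 1-based inclusive range (hypothetical helper)."""
    return end - start + 1

def inconsistent_rows(table):
    """Return the row numbers whose peptide length disagrees with its span."""
    return [no for no, start, end, pep in table
            if span_length(start, end) != len(pep)]

mismatches = inconsistent_rows(rows)
print(mismatches)  # an empty list means every sampled row is consistent
```

The same function can be run over the full table to flag any row whose coordinates and peptide length disagree.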
4 Discontinuous epitopes with the highest PI score predicted by Ellipro server.
| No | Residues | Number of residues | Score |
|---|---|---|---|
| 1 | A:D1139, A:P1140, A:L1141, A:Q1142, A:P1143, A:E1144, A:L1145, A:D1146 | 8 | 0.97 |
| 2 | A:Y707, A:S708, A:N709, A:N710, A:S711, A:I712, A:A713, A:I714, A:P715, A:T716, A:N717, A:F718, A:Y1067, A:P1069, A:A1070, A:Q1071, A:E1072, A:K1073, A:N1074, A:F1075, A:T1076, A:T1077, A:A1078, A:P1079, A:A1080, A:I1081, A:C1082, A:H1083, A:D1084, A:G1085, A:K1086, A:A1087, A:H1088, A:F1089, A:P1090, A:R1091, A:E1092, A:G1093, A:V1094, A:F1095, A:V1096, A:S1097, A:N1098, A:G1099, A:T1100, A:H1101, A:W1102, A:F1103, A:V1104, A:T1105, A:Q1106, A:R1107, A:F1109, A:Y1110, A:E1111, A:P1112, A:Q1113, A:I1114, A:I1115, A:T1116, A:T1117, A:D1118, A:N1119, A:T1120, A:F1121, A:V1122, A:S1123, A:G1124, A:N1125, A:C1126, A:D1127, A:V1128, A:V1129, A:I1130, A:G1131, A:I1132, A:V1133, A:N1134, A:N1135, A:T1136, A:V1137, A:Y1138 | 82 | 0.84 |
| 3 | A:F329, A:P330, A:N331, A:I332, A:T333, A:N334, A:L335, A:C336, A:P337, A:F338, A:G339, A:E340, A:V341, A:F342, A:N343, A:A344, A:T345, A:R346, A:F347, A:A348, A:S349, A:V350, A:Y351, A:A352, A:W353, A:N354, A:R355, A:K356, A:R357, A:I358, A:S359, A:N360, A:C361, A:V362, A:A363, A:D364, A:Y365, A:S366, A:V367, A:L368, A:N370, A:S371, A:A372, A:S373, A:F374, A:S375, A:T376, A:F377, A:K378, A:Y380, A:L387, A:C391, A:F392, A:T393, A:N394, A:V395, A:Y396, A:A397, A:D398, A:S399, A:F400, A:V401, A:I402, A:R403, A:G404, A:D405, A:E406, A:V407, A:R408, A:Q409, A:I410, A:A411, A:P412, A:G413, A:Q414, A:T415, A:G416, A:K417, A:I418, A:A419, A:D420, A:Y421, A:N422, A:Y423, A:K424, A:L425, A:P426, A:D427, A:D428, A:F429, A:T430, A:G431, A:C432, A:V433, A:I434, A:A435, A:W436, A:N437, A:S438, A:N439, A:N440, A:L441, A:D442, A:S443, A:K444, A:V445, A:G446, A:G447, A:N448, A:Y449, A:N450, A:Y451, A:L452, A:Y453, A:R454, A:L455, A:F456, A:R457, A:K458, A:S459, A:N460, A:L461, A:K462, A:P463, A:F464, A:E465, A:R466, A:D467, A:I468, A:S469, A:T470, A:E471, A:I472, A:Y473, A:Q474, A:A475, A:G476, A:S477, A:T478, A:P479, A:C480, A:N481, A:G482, A:V483, A:E484, A:G485, A:F486, A:N487, A:C488, A:Y489, A:F490, A:P491, A:L492, A:Q493, A:S494, A:Y495, A:G496, A:F497, A:Q498, A:P499, A:T500, A:N501, A:G502, A:V503, A:G504, A:Y505, A:Q506, A:P507, A:Y508, A:R509, A:V510, A:V511, A:V512, A:L513, A:S514, A:F515, A:E516, A:L517, A:L518, A:H519, A:A520, A:P521, A:A522, A:T523, A:V524, A:C525, A:G526, A:P527, A:E554, A:S555, A:N556, A:K557, A:Q564, A:R577, A:D578, A:P579, A:Q580, A:T581, A:L582, A:E583, A:I584 | 201 | 0.761 |
| 4 | A:F65, A:H66, A:A67, A:I68, A:H69, A:V70, A:S71, A:G72, A:T73, A:N74, A:G75, A:T76, A:K77, A:R78, A:F79, A:D80, A:N81, A:P82, A:V83, A:L84, A:S94, A:T95, A:E96, A:K97, A:S98, A:N99, A:I100, A:I101, A:R102, A:G103, A:W104, A:I105, A:T109, A:L110, A:D111, A:S112, A:K113, A:T114, A:Q115, A:L118, A:I119, A:V120, A:N121, A:N122, A:A123, A:T124, A:N125, A:V126, A:V127, A:I128, A:K129, A:V130, A:C131, A:E132, A:F133, A:Q134, A:F135, A:C136, A:N137, A:D138, A:P139, A:F140, A:L141, A:G142, A:V143, A:Y144, A:Y145, A:H146, A:K147, A:N148, A:N149, A:K150, A:S151, A:W152, A:M153, A:E154, A:S155, A:E156, A:F157, A:R158, A:V159, A:Y160, A:S161, A:S162, A:A163, A:N164, A:N165, A:C166, A:T167, A:F168, A:E169, A:Y170, A:V171, A:S172, A:Q173, A:P174, A:F175, A:L176, A:M177, A:D178, A:L179, A:E180, A:G181, A:K182, A:Q183, A:G184, A:N185, A:F186, A:K187, A:N188, A:L189, A:R190, A:T208, A:P209, A:I210, A:N211, A:L212, A:V213, A:R214, A:D215, A:L216, A:P217, A:Q239, A:T240, A:L241, A:L242, A:A243, A:L244, A:H245, A:R246, A:S247, A:Y248, A:L249, A:T250, A:P251, A:G252, A:D253, A:S254, A:S255, A:S256, A:G257, A:W258, A:T259, A:A260, A:G261, A:A262, A:A263, A:A264, A:Y265 | 149 | 0.742 |
Linear B-cell epitopes and T-cell epitopes of S proteins with their antigenicity scores.
| No | B-cell epitope | Position | B-cell epitope VaxiJen score | Promiscuous T-cell epitopes | Number of bound alleles (MHC I + MHC II) | T-cell epitope VaxiJen score |
|---|---|---|---|---|---|---|
| 1 | LNEVAKNLNESLIDLQELGK | 1186–1205 | 0.4281 | LNESLIDLQ | 1 + 7 | 1.0001 |
| | | | | LIDLQELGK | 8 + 14 | 0.9206 |
| 2 | ITSGWTFGAGAALQIPFAMQ | 882–901 | 0.6031 | FGAGAALQI | 29 + 23 | 0.6377 |
| | | | | WTFGAGAAL | 7 + 38 | 0.4918 |
| | | | | ITSGWTFGA | 27 + 9 | 0.4577 |
| 3 | QSIIAYTMSLGAENSVAYSN | 690–709 | 0.5222 | YTMSLGAEN | 11 + 6 | 0.9735 |
| | | | | LGAENSVAY | 3 + 14 | 0.4173 |
| 4 | GVSVITPGTNTSNQVA | 639–654 | 0.4651 | VSVITPGTN | 18 + 6 | 0.5501 |
| 5 | GWTAGAAAYYVGYLQP | 302–316 | 0.6210 | WTAGAAAYY | 25 + 1 | 0.6306 |
| 6 | HRSYLTPGDSSSGWTA | 290–305 | 0.6017 | LTPGDSSSG | 9 + 4 | 0.7579 |
| | | | | YLTPGDSSS | 27 + 1 | 0.5905 |
| 7 | TVEKGIYQTSNFRVQP | 352–367 | 0.6733 | YQTSNFRVQ | 8 + 9 | 0.7821 |
| 8 | ESPIRATRYSYNDRME | 22–37 | 1.0078 | IRATRYSYN | 10 + 8 | 1.5237 |
| 9 | GCLIGAEHVNNSYECD | 693–708 | 0.8480 | IGAEHVNNS | 20 + 29 | 1.2611 |
| | | | | LIGAEHVNN | 9 + 6 | 0.9246 |
| 10 | LQSYGFQPTNGVGYQP | 537–552 | 0.5258 | YGFQPTNGV | 39 + 33 | 1.0509 |
| | | | | LQSYGFQPT | 37 + 4 | 0.7917 |
| 11 | TRFQTLLALHRSYLTP | 281–296 | 0.5115 | LLALHRSYL | 39 + 33 | 0.5241 |
| 12 | FAMQMAYRFNGIGVTQ | 943–958 | 1.3096 | FAMQMAYRF | 36 + 8 | 1.0278 |
| | | | | MQMAYRFNG | 11 + 28 | 0.4868 |
| | | | | YRFNGIGVT | 15 + 12 | 1.7692 |
| 13 | EVRQIAPGQTGKIADY | 451–466 | 1.3837 | VRQIAPGQT | 13 + 39 | 0.8675 |
| 14 | QTLLALHRSYLTPGDSSSGWTAGAAAY | 239–265 | 0.4822 | LLALHRSYL | 34 + 40 | 0.5241 |
| | | | | LHRSYLTPG | 1 + 10 | 0.7761 |
| | | | | YLTPGDSSS | 7 + 2 | 0.5905 |
| 15 | FTNVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVC | 392–525 | 0.4563 | FELLHAPAT | 5 + 13 | 0.5409 |
| | | | | YFPLQSYGF | 2 + 23 | 0.5107 |
| | | | | YGFQPTNGV | 8 + 33 | 1.0509 |
| | | | | YQPYRVVVL | 8 + 32 | 0.5964 |
| | | | | VVVLSFELL | 8 + 15 | 1.0909 |
| | | | | VRQIAPGQT | 2 + 26 | 0.8675 |
| 16 | WFHAIHVSGTNGTKRFDNPV | 64–83 | 0.4100 | FHAIHVSGT | 15 + 39 | 0.9305 |
| | | | | IHVSGTNGT | 11 + 30 | 0.8621 |
| 17 | ENSVAYSNNSIAIPTNF | 702–718 | 0.4481 | VAYSNNSIA | 27 + 20 | 0.8537 |
| | | | | YSNNSIAIP | 4 + 5 | 0.5477 |
| 18 | AGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIAN | 879–925 | 0.5747 | FAMQMAYRF | 8 + 18 | 1.0278 |
| | | | | FGAGAALQI | 9 + 7 | 0.6377 |
| | | | | WTFGAGAAL | 22 + 1 | 0.4918 |
| | | | | MQMAYRFNG | 0 + 40 | 0.4868 |
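One way to read the "bound alleles" column is to add the MHC I and MHC II counts and rank the T-cell epitopes by the total, treating a higher total as more promiscuous. The sketch below does this for the three T-cell epitopes listed under B-cell epitope 2. The ranking criterion and the helper name `total_alleles` are assumptions made for illustration; the source does not state how the authors ranked the epitopes.

```python
# (T-cell epitope, "MHC I + MHC II" allele counts) for B-cell epitope 2.
rows = [
    ("FGAGAALQI", "29 + 23"),
    ("WTFGAGAAL", "7 + 38"),
    ("ITSGWTFGA", "27 + 9"),
]

def total_alleles(cell: str) -> int:
    """Parse a cell such as '29 + 23' and return the combined allele count."""
    mhc1, mhc2 = (int(part) for part in cell.split("+"))
    return mhc1 + mhc2

# Sort from the most to the least promiscuous epitope (assumed criterion).
ranked = sorted(rows, key=lambda r: total_alleles(r[1]), reverse=True)
for epitope, cell in ranked:
    print(epitope, total_alleles(cell))
```

Under this assumed criterion, FGAGAALQI (52 alleles in total) ranks ahead of WTFGAGAAL (45) and ITSGWTFGA (36).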
Common epitopes of both MHC class-1 and MHC class-II alleles with their mutation, toxicity, and allergenicity.
| No | T-cell epitope sequences | T-cell epitope coordinates | ToxinPred | Allergen prediction | Mutation position |
|---|---|---|---|---|---|
| 1 | LNESLIDLQ | 1212–1220 | Non-Toxin | Allergen | No Mutation |
| 2 | LIDLQELGK | 1216–1224 | Non-Toxin | No Allergen | No Mutation |
| 3 | FGAGAALQI | 888–896 | Non-Toxin | Allergen | No Mutation |
| 4 | WTFGAGAAL | 886–894 | Non-Toxin | Allergen | No Mutation |
| 5 | ITSGWTFGA | 882–890 | Non-Toxin | Allergen | No Mutation |
| 6 | YTMSLGAEN | 595–603 | Non-Toxin | Allergen | No Mutation |
| 7 | LGAENSVAY | 699–707 | Non-Toxin | Allergen | No Mutation |
| 8 | VSVITPGTN | 640–648 | Non-Toxin | Allergen | No Mutation |
| 9 | WTAGAAAYY | 303–311 | Non-Toxin | No Allergen | No Mutation |
| 10 | LTPGDSSSG | 295–303 | Non-Toxin | Allergen | No Mutation |
| 11 | YLTPGDSSS | 294–303 | Non-Toxin | Allergen | No Mutation |
| 12 | YQTSNFRVQ | 358–366 | Non-Toxin | No Allergen | No Mutation |
| 13 | IRATRYSYN | 25–33 | Non-Toxin | Allergen | No Mutation |
| 14 | IGAEHVNNS | 696–704 | Non-Toxin | No Allergen | No Mutation |
| 15 | LIGAEHVNN | 695–703 | Non-Toxin | Allergen | No Mutation |
| 16 | YGFQPTNGV | 540–548 | Non-Toxin | Allergen | No Mutation |
| 17 | LQSYGFQPT | 537–545 | Non-Toxin | Allergen | No Mutation |
| 18 | LLALHRSYL | 286–294 | Non-Toxin | No Allergen | No Mutation |
| 19 | FAMQMAYRF | 25–33 | Non-Toxin | Allergen | No Mutation |
| 20 | MQMAYRFNG | 943–950 | Non-Toxin | Allergen | No Mutation |
| 21 | YRFNGIGVT | 945–953 | Non-Toxin | No Allergen | No Mutation |
| 22 | VRQIAPGQT | 452–460 | Non-Toxin | Allergen | No Mutation |
| 23 | LLALHRSYL | 241–249 | Non-Toxin | No Allergen | No Mutation |
| 24 | LHRSYLTPG | 245–253 | Non-Toxin | Allergen | No Mutation |
| 25 | YLTPGDSSS | 249–257 | Non-Toxin | Allergen | No Mutation |
| 26 | FELLHAPAT | 515–523 | Non-Toxin | Allergen | No Mutation |
| 27 | YFPLQSYGF | 487–495 | Non-Toxin | Allergen | No Mutation |
| 28 | YGFQPTNGV | 493–501 | Non-Toxin | Allergen | No Mutation |
| 29 | YQPYRVVVL | 504–512 | Non-Toxin | No Allergen | No Mutation |
| 30 | VVVLSFELL | 510–518 | Non-Toxin | Allergen | No Mutation |
| 31 | VRQIAPGQT | 407–415 | Non-Toxin | Allergen | No Mutation |
| 32 | FHAIHVSGT | 949–957 | Non-Toxin | Allergen | No Mutation |
| 33 | IHVSGTNGT | 65–73 | Non-Toxin | Allergen | No Mutation |
| 34 | VAYSNNSIA | 405–413 | Non-Toxin | No Allergen | No Mutation |
| 35 | YSNNSIAIP | 407–415 | Non-Toxin | No Allergen | No Mutation |
| 36 | FGAGAALQI | 68–76 | Non-Toxin | Allergen | No Mutation |
| 37 | WTFGAGAAL | 886–897 | Non-Toxin | Allergen | No Mutation |
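A table like this lends itself to a simple shortlist: keep only the epitopes predicted to be non-toxic, non-allergenic, and free of mutations. The sketch below applies that filter to four rows copied from the table. The filter rule and the function name `is_safe_candidate` are illustrative assumptions, not the authors' stated selection procedure.

```python
# Rows copied from the table: (epitope, toxicity, allergenicity, mutation).
rows = [
    ("LNESLIDLQ", "Non-Toxin", "Allergen", "No Mutation"),
    ("LIDLQELGK", "Non-Toxin", "No Allergen", "No Mutation"),
    ("FGAGAALQI", "Non-Toxin", "Allergen", "No Mutation"),
    ("WTAGAAAYY", "Non-Toxin", "No Allergen", "No Mutation"),
]

def is_safe_candidate(row) -> bool:
    """True if the epitope is non-toxic, non-allergenic and unmutated."""
    _, toxicity, allergenicity, mutation = row
    return (toxicity == "Non-Toxin"
            and allergenicity == "No Allergen"
            and mutation == "No Mutation")

shortlist = [row[0] for row in rows if is_safe_candidate(row)]
print(shortlist)
```

Because every row in the table is labeled Non-Toxin and No Mutation, the allergenicity prediction is the column that actually separates candidates under this rule.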
Fig. 10 Pocket detection for the S protein by the GHECOM server. (a) Residue-based pocketness graph. The height of each bar shows the pocketness value [%] of a residue. The color of each bar indicates the pocket cluster number (red: cluster 1, blue: cluster 2, green: cluster 3, yellow: cluster 4, cyan: cluster 5). (b) Jmol view of the pocket structure, colored by pocketness.
Fig. 11 Binding-site probability, residue depth, and a 3D rendition of the predicted cavity from the DEPTH server. (a) Plot of residue depth and the probability of each residue forming a binding site. (b) A 3D rendition of the cavity prediction, shown using Jmol. Residues of the predicted binding cavity are colored black and the rest of the protein is colored white.
Fig. 12 Graphical display of immunogenic regions predicted by various methods and parameters. The regions with the highest scores are shown in red and the remainder in blue. Consensus predictions shared by most or all servers are shown in violet.
Average physicochemical properties of vaccine candidate and S protein.
| Candidate number | Number of amino acids | Molecular weight (Da) | VaxiJen antigenicity score | Allergen prediction | Instability index | Hydropathicity | Isoelectric point (pI) |
|---|---|---|---|---|---|---|---|
| 1 | 550 | 60488.03 | 0.5574 | NON-ALLERGEN | 27.80 | −0.179 | 6.04 |
| S | 1273 | 141178.47 | 0.4646 | NON-ALLERGEN | 33.01 | −0.079 | 6.24 |
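Hydropathicity values of this kind are commonly reported as the grand average of hydropathy (GRAVY): the mean Kyte-Doolittle hydropathy value over all residues of a sequence. Assuming that is the quantity in the table, the sketch below shows how such a value is computed. It is demonstrated on the short peptide RDPQTLEI from the linear-epitope table, not on the full candidate sequence, which is not reproduced here. The function name `gravy` is an illustrative choice.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(sequence: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence.upper()) / len(sequence)

# Example on a short peptide from the linear-epitope table.
print(round(gravy("RDPQTLEI"), 3))
```

A negative GRAVY value, as reported for both the candidate and the full S protein, indicates a sequence that is hydrophilic on average.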