Subhamoy Biswas, Smarajit Manna, Ashesh Nandy, Subhash C Basak.
Abstract
The design of vaccines using in silico analysis of genomic data of different viruses has taken many different paths, but the lack of a precise computational approach has constrained these efforts to alignment methods and some alignment-free techniques. In this work, a precise computational approach is established in which two new mathematical parameters are proposed to identify highly conserved, surface-exposed regions spread over a large part of the surface protein of the virus, so that possible peptide vaccine candidates can be determined from those regions. The first parameter, w, is the sum of the normalized measure of surface accessibility and the normalized measure of conservativeness; the second parameter is the area of a triangle formed by a mathematical model named 2D Polygon Representation. This method has been used to determine possible vaccine targets against SARS-CoV-2 by considering its surface-situated spike glycoprotein. The results of this model have been verified by a parallel analysis using the older approach of manually estimating the graphs describing the variation of conservativeness and surface exposure across the protein sequence. Furthermore, the method has been tested by applying it to find peptide vaccine candidates for Zika and Hendra viruses. A satisfactory consistency of the model results with previously established results for both test cases shows that the in silico alignment-free analysis proposed by the model is suitable not only for determining vaccine targets against SARS-CoV-2 but can also be extended to other viruses.
Keywords: Alignment-free sequence analysis; In silico drug design; Peptide vaccines; SARS-CoV-2; Viral epidemics
Year: 2021 PMID: 34276265 PMCID: PMC8270779 DOI: 10.1007/s10989-021-10251-7
Source DB: PubMed Journal: Int J Pept Res Ther ISSN: 1573-3149 Impact factor: 1.931
Fig. 1 Pictorial representation of the 2D Polygon model.
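The abstract's second parameter is the area of a triangle produced by the 2D Polygon Representation. How the three vertices are built from the sequence data is not recoverable from this record, so the sketch below covers only the final area step for three given (x, y) points, using the standard shoelace formula. The function name and the sample vertices are hypothetical.

```python
from typing import Tuple

Point = Tuple[float, float]


def triangle_area(p1: Point, p2: Point, p3: Point) -> float:
    """Area of the triangle p1-p2-p3 via the shoelace formula.

    The vertices are taken as given; deriving them from ASA/PV data
    is NOT shown here because this record does not describe it.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return 0.5 * abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))


# Hypothetical vertices, chosen only to make the arithmetic easy to check.
example_area = triangle_area((0.0, 0.0), (4.0, 0.0), (0.0, 3.0))
```

The absolute value makes the result independent of whether the vertices are listed clockwise or counter-clockwise.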
Top 10 peptides of length 12 for spike glycoprotein of SARS-CoV-2 given by w parameter
| Rank | Starting position of the peptides | Score | PV | Peptide (Length = 12) |
|---|---|---|---|---|
| 1 | 1027 | 19.27289 | 1 | TKMSECVLGQSK |
| 2 | 1026 | 18.86894 | 1 | ATKMSECVLGQS |
| 3 | 1030 | 18.82406 | 1 | SECVLGQSKRVD |
| 4 | 1028 | 18.62657 | 1 | KMSECVLGQSKR |
| 5 | 334 | 18.39318 | 1 | NLCPFGEVFNAT |
| 6 | 982 | 18.35727 | 1 | SRLDKVEAEVQI |
| 7 | 1031 | 18.17774 | 1 | ECVLGQSKRVDF |
| 8 | 1029 | 18.03411 | 1 | MSECVLGQSKRV |
| 9 | 985 | 18.00718 | 1 | DKVEAEVQIDRL |
| 10 | 1023 | 17.98025 | 1 | NLAATKMSECVL |
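A ranking like the one in the table above could be produced by a sliding-window score. The sketch below is a minimal illustration, assuming min-max normalisation of each per-residue profile and a plain sum over a 12-residue window; the abstract only states that w is the sum of the normalized surface accessibility and the normalized conservativeness, so the normalisation choice, the windowing, and every name here are assumptions. The PV column of the table is not modelled.

```python
from typing import List, Sequence, Tuple


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values to [0, 1]; a flat profile maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def rank_windows(
    accessibility: Sequence[float],
    conservation: Sequence[float],
    window: int = 12,
) -> List[Tuple[int, float]]:
    """Return (1-based start position, window score), best score first.

    Assumed per-residue score: normalized accessibility + normalized
    conservation (higher conservation value = more conserved).
    """
    per_residue = [
        a + c
        for a, c in zip(
            min_max_normalize(accessibility), min_max_normalize(conservation)
        )
    ]
    scored = [
        (start + 1, sum(per_residue[start:start + window]))
        for start in range(len(per_residue) - window + 1)
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy profiles (not real data): the exposed, conserved middle wins.
toy_ranking = rank_windows([0, 10, 10, 0], [0, 5, 5, 0], window=2)
```

With each normalized term bounded by 1, a 12-residue window under these assumptions can score at most 24, which is at least compatible with the scores near 19 listed in the table.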
Peptide zones for spike glycoprotein of SARS-CoV-2 predicted using the 2D Polygon Representation model and comparison with the eye-estimated regions (from the ASA and PV profiles) for the same, using all the 2812 full-length sequences of the protein used in the current analysis (the sequences available as of 8 May, 2020)
| Start–End position of the region (2D Polygon Representation) | Peptide stretch (2D model) | 2D Polygon Score | Start–End position of the region (eye-estimation) | Peptide stretch (eye-estimation) |
|---|---|---|---|---|
| 1021–1046** | SANLAATKMSECVLGQSKRVDFCGKG | 66.20194 | 1019–1049** | RASANLAATKMSECVLGQSKRVDFCGKGYHL |
| 329–345** | FPNITNLCPFGEVFNAT | 30.595 | 323–347** | TESIVRFPNITNLCPFGEVFNATRF |
| 975–1004** | SVLNDILSRLDKVEAEVQIDRLITGRLQSL | 64.50292 | 953–1007** | NQNAQALNTLVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTY |
| 458–470** | KSNLKPFERDIST | 17.15506 | 456–475** | FRKSNLKPFERDISTEIYQA |
| 414–440** | QTGKIADYNYKLPDDFTGCVIAWNSNN | 65.6183 | 412–448** | PGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGN |
| 523–546** | TVCGPKKSTNLVKNKCVNFNFNGL | 39.63615 | 523–551** | TVCGPKKSTNLVKNKCVNFNFNGLTGTGV |
| 386–400** | KLNDLCFTNVYADSF | 22.54376 | 349–399** | SVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADS |
| 373–386 | SFSTFKCYGVSPTK | 11.57532 | 349–399** | SVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADS |
| 860–878** | VLPPLLTDEMIAQYTSALL | 36.08575 | 858–877 | LTVLPPLLTDEMIAQYTSAL |
| 289–300 | VDCALDPLSETK | 14.40109 | – | – |
| 1088–1108 | HFPREGVFVSNGTHWFVTQRN | 17.0471 | – | – |
| 660–671 | YECDIPIGAGIC | 13.04917 | 655–671 | HVNNSYECDIPIGAGIC |
| 901–917 | QMAYRFNGIGVTQNVLY | 11.67947 | 907–923 | NGIGVTQNVLYENQKLI |
| 1148–1161 | FKEELDKYFKNHTS | 16.8394 | 1145–1161 | LDSFKEELDKYFKNHTS |
| 316–335 | SNFRVQPTESIVRFPNITNL | 13.16672 | – | – |
| 782–793 | FAQVKQIYKTPP | 7.009527 | 779–795 | QEVFAQVKQIYKTPPIK |
| – | – | – | 1168–1185 | DISGINASVVNIQKEIDR |
| – | – | – | 195–209 | KNIDGYFKIYSKHTP |
Double asterisks refer to those peptide regions which fell in the top 50th percentile as per the 2D Polygon score for SARS-CoV-2, Hendra and Zika viruses respectively
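The double-asterisk marking in the SARS-CoV-2 table above can be reproduced with a simple median cut. The sketch below assumes that "top 50th percentile" means "2D Polygon score above the median of the 16 listed zones"; that reading is an interpretation, although it does select exactly the eight starred zones. The scores are copied from the table; the variable names are illustrative.

```python
from statistics import median

# (start, end) -> 2D Polygon score, copied from the table above.
zone_scores = {
    (1021, 1046): 66.20194, (329, 345): 30.595, (975, 1004): 64.50292,
    (458, 470): 17.15506, (414, 440): 65.6183, (523, 546): 39.63615,
    (386, 400): 22.54376, (373, 386): 11.57532, (860, 878): 36.08575,
    (289, 300): 14.40109, (1088, 1108): 17.0471, (660, 671): 13.04917,
    (901, 917): 11.67947, (1148, 1161): 16.8394, (316, 335): 13.16672,
    (782, 793): 7.009527,
}

# Assumed rule: a zone is in the top half if it scores above the median.
cutoff = median(zone_scores.values())
top_half = sorted(zone for zone, score in zone_scores.items() if score > cutoff)
```

The cut is tight here: zone 458–470 (17.15506) is kept while zone 1088–1108 (17.0471) just misses it.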
Fig. 2 ASA and PV profiles of the full-length sequence of spike glycoprotein of SARS-CoV-2: The components shown in the figure are: ASA profile (blue), PV profile (red), predicted peptide stretches using the new approach (green) and peptide stretches determined using eye-estimation (black). (Color figure online)
Discontinuous epitopes of spike glycoprotein of SARS-CoV-2 based on the analysis using IEDB-AR Ellipro
| Residues | Number of residues | Score |
|---|---|---|
| A:Y707, A:S708, A:N709, A:N710, A:S711, A:I712, A:A713, A:I714, A:P715, A:T716, A:N717, A:F718, A:A783, A:Q784, A:V785, A:K786, A:Q787, A:I788, A:Y789, A:K790, A:T791, A:P792, A:P793, A:I794, A:K795, A:D796, A:F797, A:G798, A:G799, A:F800, A:P863, A:L864, A:L865, A:E868, A:M869, A:Q872, A:Y873, A:S875, A:A876, A:A879, A:G880, A:I882, A:T883, A:S884, A:G885, A:W886, A:T887, A:F888, A:G889, A:A890, A:G891, A:A892, A:A893, A:L894, A:Q895, A:I896, A:P897, A:F898, A:A899, A:M900, A:Q901, A:M902, A:A903, A:Y904, A:F906, A:N907, A:G908, A:I909, A:G910, A:V911, A:T912, A:Q913, A:N914, A:V915, A:L916, A:Y917, A:E918, A:N919, A:Q920, A:K921, A:L922, A:I923, A:A924, A:N925, A:L1034, A:G1035, A:Q1036, A:Q1071, A:E1072, A:K1073, A:N1074, A:F1075, A:T1076, A:T1077, A:A1078, A:P1079, A:A1080, A:I1081, A:C1082, A:H1083, A:D1084, A:G1085, A:K1086, A:A1087, A:H1088, A:F1089, A:P1090, A:R1091, A:E1092, A:G1093, A:V1094, A:F1095, A:V1096, A:S1097, A:N1098, A:G1099, A:T1100, A:H1101, A:W1102, A:F1103, A:V1104, A:T1105, A:Q1106, A:R1107, A:N1108, A:F1109, A:Y1110, A:E1111, A:P1112, A:Q1113, A:I1114, A:I1115, A:T1116, A:T1117, A:D1118, A:N1119, A:T1120, A:F1121, A:V1122, A:S1123, A:G1124, A:N1125, A:C1126, A:D1127, A:V1128, A:V1129, A:I1130, A:G1131, A:I1132, A:V1133, A:N1134, A:N1135, A:T1136, A:V1137, A:Y1138, A:D1139, A:P1140, A:L1141, A:Q1142, A:P1143, A:E1144, A:L1145, A:D1146, A:S1147 | 164 | 0.751 |
| A:R328, A:F329, A:P330, A:N331, A:I332, A:T333, A:N334, A:L335, A:C336, A:P337, A:F338, A:G339, A:E340, A:V341, A:F342, A:N343, A:A344, A:T345, A:R346, A:F347, A:A348, A:S349, A:V350, A:Y351, A:A352, A:W353, A:N354, A:R355, A:K356, A:R357, A:I358, A:S359, A:N360, A:C361, A:V362, A:A363, A:D364, A:V367, A:L368, A:S371, A:A372, A:S373, A:F374, A:S375, A:T376, A:Y380, A:T393, A:N394, A:V395, A:Y396, A:A397, A:D398, A:S399, A:F400, A:V401, A:I402, A:R403, A:G404, A:D405, A:E406, A:V407, A:R408, A:Q409, A:I410, A:A411, A:P412, A:G413, A:Q414, A:T415, A:G416, A:K417, A:I418, A:A419, A:D420, A:Y421, A:N422, A:Y423, A:K424, A:L425, A:P426, A:D427, A:D428, A:F429, A:V433, A:I434, A:A435, A:W436, A:N437, A:S438, A:N439, A:N440, A:L441, A:D442, A:N448, A:Y449, A:N450, A:Y451, A:L452, A:Y453, A:R454, A:L455, A:F456, A:R457, A:K458, A:S459, A:N460, A:L461, A:K462, A:P463, A:F464, A:E465, A:R466, A:D467, A:I468, A:S469, A:T470, A:F490, A:P491, A:L492, A:Q493, A:S494, A:Y495, A:G496, A:F497, A:Q498, A:P499, A:T500, A:N501, A:V503, A:G504, A:Y505, A:Q506, A:P507, A:Y508, A:R509, A:V510, A:V511, A:V512, A:L513, A:S514, A:E516, A:L517, A:L518, A:H519, A:A520, A:P521, A:A522, A:T523, A:V524, A:C525, A:G526, A:P527, A:K528, A:K529, A:S530, A:T531, A:N532, A:L533, A:V534, A:K535, A:N536, A:K537, A:N544, A:T553, A:E554, A:S555, A:N556, A:K557, A:F559, A:L560, A:P561, A:F562, A:Q563, A:V576, A:D578, A:P579, A:Q580, A:T581, A:L582, A:E583, A:I584, A:L585 | 182 | 0.735 |
| A:A27, A:Y28, A:T29, A:N30, A:S31, A:F32, A:F59, A:S60, A:N61, A:V62, A:T63, A:W64, A:F65, A:H66, A:A67, A:I68, A:H69, A:P82, A:V83, A:L84, A:P85, A:N87, A:F92, A:A93, A:S94, A:T95, A:E96, A:K97, A:S98, A:N99, A:I100, A:I101, A:R102, A:G103, A:W104, A:I105, A:F106, A:G107, A:T108, A:T109, A:L110, A:D111, A:S112, A:K113, A:S116, A:L117, A:L118, A:I119, A:V120, A:N121, A:N122, A:A123, A:T124, A:N125, A:V126, A:V127, A:I128, A:K129, A:V130, A:C131, A:E132, A:F133, A:Q134, A:F135, A:C136, A:N137, A:D138, A:P139, A:F140, A:L141, A:G142, A:V143, A:C166, A:T167, A:F168, A:E169, A:Y170, A:V171, A:S172, A:F186, A:K187, A:N188, A:L189, A:R190, A:E191, A:F192, A:G199, A:I203, A:S205, A:K206, A:H207, A:T208, A:P209, A:I210, A:N211, A:L212, A:V213, A:R214, A:D215, A:L216, A:P217, A:Q218, A:G219, A:L223, A:L226, A:V227, A:L229, A:P230, A:I231, A:G232, A:I233, A:N234, A:I235, A:T236, A:R237, A:F238, A:Q239, A:T240, A:L241, A:L242, A:A263, A:A264, A:Y265, A:Y266, A:V267 | 125 | 0.732 |
| A:E702, A:N703, A:S704, A:V705, A:A706 | 5 | 0.613 |
| A:N801, A:F802, A:S803, A:Q804, A:I805, A:L806, A:P807, A:D808, A:P809, A:S810, A:K811, A:S813, A:K814, A:R815 | 14 | 0.554 |
| A:D985, A:P986, A:P987 | 3 | 0.548 |
| A:T747, A:E748, A:S750, A:N751, A:L754, A:Q755, A:G757, A:S758 | 8 | 0.546 |
Linear epitopes of spike glycoprotein of SARS-CoV-2 based on the analysis using IEDB-AR Ellipro
| Start–End position of the linear epitopes predicted by Ellipro | Peptide stretch | Score |
|---|---|---|
| 1071–1147 | QEKNFTTAPAICHDG……PELDS | 0.882 |
| 92–192 | FASTEKSNIIRGWIFG……NLREF | 0.811 |
| 433–537 | VIAWNSNNLD……KSTNLVKNK | 0.767 |
| 328–364 | RFPNITNLCPFGEVF……NCVAD | 0.754 |
| 236–267 | TRFQTLLALHRSYL……AAYYV | 0.728 |
| 553–564 | TESNKKFLPFQQ | 0.728 |
| 393–428 | TNVYADSFVIRGDE……KLPDD | 0.718 |
| 60–86 | SNVTWFHAIHVSGT……PVLPF | 0.699 |
| 203–219 | IYSKHTPINLVRDLPQG | 0.686 |
| 702–718 | ENSVAYSNNSIAIPTNF | 0.678 |
| 576–585 | VRDPQTLEIL | 0.674 |
| 879–925 | AGTITSGWTFGAGAA……KLIAN | 0.629 |
| 783–815 | AQVKQIYKTPPIKDFGG……SKR | 0.622 |
| 371–376 | SASFST | 0.575 |
| 226–234 | LVDLPIGIN | 0.507 |
Summary of the IEDB-AR analysis of the 5 grouped peptide zones obtained for spike protein of SARS-CoV-2, which had good binding capacity
| Start–End | Grouped peptide zone | MHC II DP/DQ: Score | MHC II DP/DQ: Adjusted peptide | MHC II DP/DQ: Allele | MHC II DRB: Score | MHC II DRB: Adjusted peptide | MHC II DRB: Allele |
|---|---|---|---|---|---|---|---|
| 1021–1046 | SANLAATKMSECVLGQSKRVDFCGKG | 5* | SANLAATKMSECVLG | HLA-DQA1*01:02/DQB1*06:02 | 14 | MSECVLGQSKRVDFC | HLA-DRB1*03:01 |
| 975–1004 | SVLNDILSRLDKVEAEVQIDRLITGRLQSL | 3.8 | SRLDKVEAEVQIDRL | HLA-DQA1*03:01/DQB1*03:02 | 1.1* | VEAEVQIDRLITGRL | HLA-DRB1*03:01 |
| 458–470 | KSNLKPFERDIST | 14 | SNLKPFERDISTEIY | HLA-DQA1*03:01/DQB1*03:02 | 1.8* | SNLKPFERDISTEIY | HLA-DRB3*01:01 |
| 523–546 | TVCGPKKSTNLVKNKCVNFNFNGL | 16 | NLVKNKCVNFNFNGL | HLA-DPA1*02:01/DPB1*05:01 | 5.1* | PKKSTNLVKNKCVNF | HLA-DRB1*13:02 |
| 860–878 | VLPPLLTDEMIAQYTSALL | 12 | LLTDEMIAQYTSALL | HLA-DPA1*02:01/DPB1*01:01 | 1.6* | LLTDEMIAQYTSALL | HLA-DRB1*15:01 |
The star mark indicates the 15-length adjusted peptide that was chosen to form the intersecting zone corresponding to the particular peptide zone
The single asterisk marks represent the 15-length epitope chosen out of the two choices for every grouped peptide zone, such that the chosen ones form intersecting zones with the grouped peptide regions for SARS-CoV-2, Hendra and Zika viruses respectively
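In the table above, the asterisk always sits on the lower of the two scores in a row. A minimal sketch of that choice follows, assuming the Score columns are IEDB percentile ranks (where lower means stronger predicted binding). The cutoff of 10 is an inference, not something stated in this record: in the Hendra table further down, the rows where both scores exceed 10 carry no asterisk. The class and function names are hypothetical.

```python
from typing import NamedTuple, Optional


class Prediction(NamedTuple):
    score: float   # assumed to be an IEDB percentile rank (lower is better)
    peptide: str   # 15-length adjusted peptide
    allele: str


def pick_epitope(
    dp_dq: Prediction, drb: Prediction, cutoff: float = 10.0
) -> Optional[Prediction]:
    """Return the better-scoring prediction, or None if neither passes.

    The default cutoff is an assumption inferred from the tables.
    """
    best = min(dp_dq, drb, key=lambda p: p.score)
    return best if best.score <= cutoff else None


# Row 458-470 of the SARS-CoV-2 table above: the DRB prediction is starred.
chosen = pick_epitope(
    Prediction(14, "SNLKPFERDISTEIY", "HLA-DQA1*03:01/DQB1*03:02"),
    Prediction(1.8, "SNLKPFERDISTEIY", "HLA-DRB3*01:01"),
)

# Row 29-49 of the Hendra table: scores 13 and 14, no asterisk on either.
rejected = pick_epitope(
    Prediction(13, "KINDGLLDSKILGAF", "HLA-DQA1*01:02/DQB1*06:02"),
    Prediction(14, "KINDGLLDSKILGAF", "HLA-DRB1*03:01"),
)
```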
Best possible BLAST matches and their corresponding E values for the 4 intersecting peptide zones finally shortlisted as peptide vaccine candidates for SARS-CoV-2
| Intersecting peptide zone | Best BLAST match | E value of the match | Accession/PDB ID |
|---|---|---|---|
| SANLAATKMSECVLG | Crystal structure of influenza A NS1A protein in complex with F2F3 fragment of human cellular factor CPSF30, Northeast Structural Genomics Targets OR8C and HR6309A | 22 | 2RHK_C |
| VEAEVQIDRLITGRL | Chromosome 14 open reading frame 103 | 3.9 | EAW81633.1 |
| SNLKPFERDIST | Rho-associated, coiled-coil containing protein kinase 1 variant | 9.1 | AAI13115.1 |
| PKKSTNLVKNKCVNF | Chromosome 17, hCG 2,045,508 | 15 | EAW89537.1 |
Comparison of the final shortlists obtained from the newer and older approaches. A good level of matching was observed in 3 instances, indicating that the new approach is consistent in determining peptide candidates for SARS-CoV-2
| Start–End position (new approach, 2D Polygon model) | Peptide stretch (new approach, 2D Polygon model) | Start–End position (older method, eye-estimation) | Peptide stretch (older method, eye-estimation) |
|---|---|---|---|
| 1021–1035 | SANLAATKMSECVLG | 1019–1033 | RASANLAATKMSECV |
| – | – | 959–973 | LNTLVKQLSSNFGAI |
| 527–541 | PKKSTNLVKNKCVNF | 537–551 | KCVNFNFNGLTGTGV |
| 459–470 | SNLKPFERDIST | 460–474 | NLKPFERDISTEIYQ |
| – | – | 431–445 | GCVIAWNSNNLDSKV |
| – | – | 323–337 | TESIVRFPNITNLCP |
| 987–1001 | VEAEVQIDRLITGRL | – | – |
The matching peptide zones have been placed side-by-side for easy comparison
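The "3 instances" of matching in the comparison above can be checked by testing the two shortlists for positional overlap. The sketch below treats any shared residue positions as a match, which is an assumption; the record does not say what counts as "a good level of matching". The positions are copied from the table; the helper name is illustrative.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # inclusive (start, end) positions


def overlap_length(a: Interval, b: Interval) -> int:
    """Number of residue positions shared by two inclusive intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


new_approach: List[Interval] = [(1021, 1035), (527, 541), (459, 470), (987, 1001)]
eye_estimation: List[Interval] = [
    (1019, 1033), (959, 973), (537, 551), (460, 474), (431, 445), (323, 337),
]

# Every (new, old, shared positions) pair with at least one shared residue.
matches = [
    (new, old, overlap_length(new, old))
    for new in new_approach
    for old in eye_estimation
    if overlap_length(new, old) > 0
]
```

Under this reading, 1021–1035 shares 13 positions with 1019–1033, 527–541 shares 5 with 537–551, and 459–470 shares 11 with 460–474, while 987–1001 has no counterpart.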
Fig. 3 ASA and PV profiles of the full-length sequence of G glycoprotein of Hendra virus: The components shown in the figure are: ASA profile (blue), PV profile (red), predicted peptide stretches using the new approach (green) and peptide stretches determined in Dey et al. (2018) (black). (Color figure online)
The 8 grouped peptide zones obtained after grouping of the top 75 ranks for G glycoprotein
| Starting position | Ending position | Consolidated Peptide stretch |
|---|---|---|
| 29 | 49 | YGTMDIKKINDGLLDSKILGA** |
| 11 | 28 | NNNLSGKIKDQGKVIKNY** |
| 370 | 400 | LPRTEFQYNDSNCPIIHCKYSKAENCRLSMG** |
| 297 | 313 | VSHVGDPILNSTSWTES |
| 122 | 154 | ANIGLLGSKISQSTSSINENVNDKCKFTLPPLK** |
| 84 | 110 | KESLQSVQQQIKALTDKIGTEIGPKVS |
| 548 | 564 | QVPLAEDDTNAQKTITD |
| 593 | 604 | FAVKIPAQCSES |
The zones marked with double asterisks (**) are the ones which fell into the top 50th percentile as per the 2D Polygon analysis
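Grouping top-ranked windows into consolidated zones can be sketched as an interval merge. The exact grouping rule is not described in this record, so the version below assumes that overlapping or directly adjacent fixed-length windows are merged into one zone. It is demonstrated on the ten starting positions from the SARS-CoV-2 top-10 table earlier in this record, since the ranked Hendra windows are not listed; the function name is hypothetical.

```python
from typing import Iterable, List, Tuple


def merge_windows(starts: Iterable[int], length: int = 12) -> List[Tuple[int, int]]:
    """Merge fixed-length windows into inclusive (start, end) zones.

    Assumption: windows that overlap or touch belong to the same zone.
    """
    zones: List[List[int]] = []
    for start, end in sorted((s, s + length - 1) for s in starts):
        if zones and start <= zones[-1][1] + 1:
            zones[-1][1] = max(zones[-1][1], end)   # extend the current zone
        else:
            zones.append([start, end])              # open a new zone
    return [(start, end) for start, end in zones]


# Starting positions of the top 10 length-12 peptides for the spike protein.
top10_starts = [1027, 1026, 1030, 1028, 334, 982, 1031, 1029, 985, 1023]
merged = merge_windows(top10_starts)
```

Using only ten ranks gives three zones (334–345, 982–996, 1023–1042); the wider zones reported in the tables come from grouping more ranks.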
Summary of the IEDB-AR study for G glycoprotein where the best possible 15-length epitopes for both the MHC II DRB and DP/DQ analyses have been listed
| Start–End | Grouped peptide zone | MHC II DP/DQ: Score | MHC II DP/DQ: Adjusted peptide | MHC II DP/DQ: Allele | MHC II DRB: Score | MHC II DRB: Adjusted peptide | MHC II DRB: Allele |
|---|---|---|---|---|---|---|---|
| 29–49 | YGTMDIKKINDGLLDSKILGA | 13 | KINDGLLDSKILGAF | HLA-DQA1*01:02/DQB1*06:02 | 14 | KINDGLLDSKILGAF | HLA-DRB1*03:01 |
| 11–28 | NNNLSGKIKDQGKVIKNY | 37 | LSGKIKDQGKVIKNY | HLA-DPA1*02:01/DPB1*05:01 | 12 | LSGKIKDQGKVIKNY | HLA-DRB3*01:01 |
| 370–400 | LPRTEFQYNDSNCPIIHCKYSKAENCRLSMG | 28 | PRTEFQYNDSNCPII | HLA-DQA1*01:01/DQB1*05:01 | 6.1* | EFQYNDSNCPIIHCK | HLA-DRB3*02:02 |
| 122–154 | ANIGLLGSKISQSTSSINENVNDKCKFTLPPLK | 4.8* | VNDKCKFTLPPLKIH | HLA-DPA1*02:01/DPB1*14:01 | 5.8 | ANIGLLGSKISQSTS | HLA-DRB1*15:01 |
The regions marked with a single asterisk (*) are the 15-length epitopes chosen, out of the two choices for each grouped peptide zone, to form the intersecting zone corresponding to that peptide zone
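The "intersecting zone" of a grouped peptide zone and its chosen 15-length epitope appears to be the stretch the two sequences share. A minimal sketch follows, assuming the intersecting zone is the longest common substring of the two; that is an interpretation of the tables, not a rule stated in this record, and the function name is hypothetical. It uses `difflib.SequenceMatcher` from the Python standard library.

```python
from difflib import SequenceMatcher
from typing import Tuple


def intersecting_zone(zone_start: int, zone: str, epitope: str) -> Tuple[int, int, str]:
    """Return (start, end, peptide) of the longest stretch shared by both.

    Positions are 1-based and inclusive, in the zone's numbering.
    """
    matcher = SequenceMatcher(None, zone, epitope, autojunk=False)
    match = matcher.find_longest_match(0, len(zone), 0, len(epitope))
    shared = zone[match.a:match.a + match.size]
    start = zone_start + match.a
    return start, start + match.size - 1, shared


# The two starred rows of the Hendra G glycoprotein table above.
drb_hit = intersecting_zone(370, "LPRTEFQYNDSNCPIIHCKYSKAENCRLSMG", "EFQYNDSNCPIIHCK")
dp_hit = intersecting_zone(122, "ANIGLLGSKISQSTSSINENVNDKCKFTLPPLK", "VNDKCKFTLPPLKIH")
```

The second case shows why the shortlisted peptide can be shorter than 15 residues: the adjusted peptide VNDKCKFTLPPLKIH runs two residues past the end of the grouped zone, leaving the 13-residue VNDKCKFTLPPLK.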
Best possible BLAST matches and their corresponding E values for the 2 intersecting peptide zones finally shortlisted as peptide vaccine candidates for G protein of Hendra virus
| Intersecting zone | Best BLAST match | E value of the match | Accession/PDB ID |
|---|---|---|---|
| EFQYNDSNCPIIHCK | Immunoglobulin light chain junction region in Homo sapiens | 9.1 | MCC90139.1 |
| VNDKCKFTLPPLK | Coxsackievirus and adenovirus receptor isoform 4 precursor in Homo sapiens | 45 | NP_001193994.1 |
Comparison of the final shortlist for peptide candidates for G protein of Hendra virus obtained from the new approach with that given in Dey et al. (2018), after the IEDB-AR and BLAST analyses
| Final shortlist for G protein of Hendra virus as per the new method | Final shortlist as per Dey et al. (2018): India | Final shortlist as per Dey et al. (2018): Bangladesh | Final shortlist as per Dey et al. (2018): Malaysia |
|---|---|---|---|
| EFQYNDSNCPIIHCK (374–388) | PIAECQYSKPENCRL (383–397) | FKYNDSNCPIAECQY (375–389) | PITKCQYSKPENCRL (383–397) |
| VNDKCKFTLPPLK (142–154) | VNEKCKFTLPPLKIH (142–156) | VNEKCKFTLPPLKIH (142–156) | NVNEKCKFTLPPLKI (141–155) |
| – | KKINEGLLDSKILSA (35–49) | KINEGLLDSKILSAF (36–50) | YGTMDIKKINEGLLD (29–42) |
| – | AVSVVGDPILNSTYW (296–310) | VVGDPVLNSTYWSNS (299–313) | PILNSTYWSGSLMMT (303–317) |
The comparison shows that all the regions we predicted were present among the published results in Dey et al. (2018)
Fig. 4 ASA and PV profiles of the 251 aa fragment sequence of E protein of Zika virus: The components shown in the figure are: ASA profile (blue), PV profile (red), predicted peptide stretches using the new approach (green) and peptide stretches determined in Dey et al. (2017) (black). (Color figure online)
The 9 grouped peptide zones obtained after grouping of the top 50 ranks for E protein of Zika virus
| Starting position | Ending position | Consolidated Peptide stretch |
|---|---|---|
| 212 | 233 | PAQMAVDMQTLTPVGRLITANP** |
| 120 | 145 | AKRQTVVVLGSQEGAVHTALAGALEA** |
| 102 | 114 | GTPHWNNKEALVE** |
| 53 | 66 | FGSLGLDCEPRTGL** |
| 113 | 130 | VEFKDAHAKRQTVVVLGS** |
| 139 | 158 | LAGALEAEMDGAKGRLFSGH |
| 68 | 79 | FSDLYYLTMNNK |
| 38 | 51 | EVTPNSPRAEATLG |
| 188 | 201 | PAETLHGTVTVEVQ |
The zones marked with double asterisks (**) are the ones which fell into the top 50th percentile as per the 2D Polygon analysis
Summary of the IEDB-AR study for the E protein where the best possible 15-length epitopes for both the MHC II DRB and DP/DQ analyses have been listed
| Start–End | Grouped peptide zone | MHC II DP/DQ: Score | MHC II DP/DQ: Adjusted peptide | MHC II DP/DQ: Allele | MHC II DRB: Score | MHC II DRB: Adjusted peptide | MHC II DRB: Allele |
|---|---|---|---|---|---|---|---|
| 212–233 | PAQMAVDMQTLTPVGRLITANP | 20 | VDMQTLTPVGRLITA | HLA-DPA1*02:01/DPB1*14:01 | 8.3* | PAQMAVDMQTLTPVG | HLA-DRB4*01:01 |
| 120–145 | AKRQTVVVLGSQEGAVHTALAGALEA | 2* | SQEGAVHTALAGALE | HLA-DQA1*05:01/DQB1*03:01 | 7.7 | SQEGAVHTALAGALE | HLA-DRB1*09:01 |
| 102–114 | GTPHWNNKEALVE | 7.4* | TGTPHWNNKEALVEF | HLA-DQA1*03:01/DQB1*03:02 | 25 | TGTPHWNNKEALVEF | HLA-DRB1*13:02 |
| 53–66 | FGSLGLDCEPRTGL | 44 | FGSLGLDCEPRTGLD | HLA-DQA1*03:01/DQB1*03:02 | 5.1* | ATLGGFGSLGLDCEP | HLA-DRB1*15:01 |
| 113–130 | VEFKDAHAKRQTVVVLGS | 7.9* | DAHAKRQTVVVLGSQ | HLA-DPA1*02:01/DPB1*14:01 | 21 | VEFKDAHAKRQTVVV | HLA-DRB5*01:01 |
The regions marked with a single asterisk (*) are the 15-length epitopes chosen, out of the two choices for each grouped peptide zone, to form the intersecting zone corresponding to that peptide zone
Best possible BLAST matches and their corresponding E values for the 5 intersecting peptide zones to be finally shortlisted as peptide vaccine candidates for E protein of Zika virus
| Intersecting zone | Best BLAST match | E value of the match | Accession/PDB ID |
|---|---|---|---|
| PAQMAVDMQTLTPVG | MON1 homolog A (yeast), isoform CRA_a in Homo sapiens | 2.7 | EAW65037.1 |
| SQEGAVHTALAGALE | Protein PEAK3 in Homo sapiens | 22 | NP_940934.1 |
| GTPHWNNKEALVE | Tumor protein p63-regulated gene 1-like protein in Homo sapiens | 11 | NP_877429.2 |
| FGSLGLDCEP | Immunoglobulin heavy chain junction region in Homo sapiens | 5.3 | MBN4382203.1 |
| DAHAKRQTVVVLGS | Immunoglobulin heavy chain junction region in Homo sapiens | 2.8 | MOP52445.1 |
Comparison of the final shortlist for peptide candidates for E protein of Zika virus obtained from the new approach with that given in Dey et al. (2017), after the IEDB-AR and BLAST analyses
| Start–End position (new approach) | Peptide stretch (new approach) | Start–End position (Dey et al. (2017)) | Peptide stretch (Dey et al. (2017)) |
|---|---|---|---|
| 212–226 | PAQMAVDMQTLTPVG | 215–223 | MAVDMQTLT |
| 130–144 | SQEGAVHTALAGALE | – | – |
| 102–114 | GTPHWNNKEALVE | 97–106 | AGADTGTPHW |
| 53–62 | FGSLGLDCEP | – | – |
| 117–130 | DAHAKRQTVVVLGS | – | – |
| – | – | 43–50 | SPRAEATL |
| – | – | 180–190 | AAFTFSKVPAE |
The comparison shows that two of the five regions we predicted were present among the published results in Dey et al. (2017)