Seyed Mehdi Sadat1, Mohammad Reza Aghadadeghi2, Masoume Yousefi1, Arezoo Khodaei1, Mona Sadat Larijani1, Golnaz Bahramali3.
Abstract
The emerging Coronavirus Disease 2019 (COVID-19) pandemic has posed a serious threat to public health worldwide, creating an urgent demand for a vaccine. Because SARS-CoV-2 is an RNA virus, its high mutation rate complicates vaccine design. Bioinformatics tools have been widely used to take advantage of conserved regions and to assess immunogenicity. In this study, we aimed at an immunoinformatic evaluation of the conservancy and immunogenicity of SARS-CoV-2 proteins to design a preventive vaccine candidate. Spike, Membrane and Nucleocapsid amino acid sequences were obtained, and four possible fusion proteins were assessed and compared in terms of structural features, immunogenicity and population coverage. MHC-I and MHC-II T-cell epitopes and linear and conformational B-cell epitopes were evaluated. Among the predicted models, the truncated form of Spike fused to the M and N proteins with an AAY linker has a high rate of MHC-I and MHC-II epitopes with high antigenicity and an acceptable population coverage of 82.95% in Iran and 92.51% in Europe. This in silico study proposes truncated Spike-M-N SARS-CoV-2 as a potential preventive vaccine candidate for further in vivo evaluation.
Keywords: COVID-19; In silico; Protein; SARS-CoV-2; Vaccine design
Year: 2021 PMID: 33625681 PMCID: PMC7902242 DOI: 10.1007/s12033-021-00303-0
Source DB: PubMed Journal: Mol Biotechnol ISSN: 1073-6085 Impact factor: 2.695
Fig. 1 Schematic view of the applied methods in the study
Fig. 2 Schematic view of predicted constructs with the flexible spacer (AAY)
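The construct layout in Fig. 2 can be sketched as plain string assembly. This is only an illustration, not the authors' pipeline: the function name `build_fusion` and the placeholder segments are hypothetical, and only the AAY spacer is taken from the study.

```python
AAY_LINKER = "AAY"  # flexible spacer named in the study


def build_fusion(segments, linker=AAY_LINKER):
    """Join (name, sequence) segments with a linker.

    Returns the fused sequence and the 1-based (start, end) span of
    each segment inside the fused sequence.
    """
    parts = []
    spans = {}
    position = 1
    for index, (name, seq) in enumerate(segments):
        if index > 0:
            # A linker is placed between segments, never before the first one.
            parts.append(linker)
            position += len(linker)
        spans[name] = (position, position + len(seq) - 1)
        parts.append(seq)
        position += len(seq)
    return "".join(parts), spans


# Placeholder segments (hypothetical, NOT the real protein sequences).
segments = [("S_truncated", "SSSSSS"), ("M", "MMMMMM"), ("N", "NNNNNN")]
construct, spans = build_fusion(segments)
print(construct)
print(spans)
```

Tracking the spans alongside the sequence makes it easier to map a predicted epitope position in a fusion model back to the protein it came from.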
BALB/c MHC class I epitopes in predicted models
| Antigenicity score | Percentile rank | Allele | Length | End | Start | Peptide |
|---|---|---|---|---|---|---|
| Model 1 | | | | | | |
| 1.426 | 0.3 | H-2-Kd | 9 | 69 | 61 | CYGVSPTKL |
| 0.596 | 0.3 | H-2-Dd | 9 | 195 | 187 | YQPYRVVVL |
| 0.578 | 0.64 | H-2-Kd | 9 | 178 | 170 | CYFPLQSYG |
| 0.5003 | 0.7 | H-2-Kd | 9 | 40 | 32 | VYAWNRKRI |
| 0.5107 | 0.8 | H-2-Dd | 9 | 179 | 171 | YFPLQSYGF |
| 0.5453 | 0.92 | H-2-Ld | 9 | 174 | 166 | EGFNCYFPL |
| Model 2 | | | | | | |
| 1.4263 | 0.3 | H-2-Kd | 9 | 69 | 61 | CYGVSPTKL |
| 0.5964 | 0.3 | H-2-Dd | 9 | 195 | 187 | YQPYRVVVL |
| 0.4821 | 0.4 | H-2-Kd | 9 | 314 | 306 | SYFIASFRL |
| 0.578 | 0.64 | H-2-Kd | 9 | 178 | 170 | CYFPLQSYG |
| 0.734 | 0.68 | H-2-Ld | 9 | 550 | 542 | SPRWYFYYL |
| 0.5003 | 0.7 | H-2-Kd | 9 | 40 | 32 | VYAWNRKRI |
| 0.8519 | 0.7 | H-2-Kd | 9 | 625 | 617 | SQASSRSSS |
| 0.611 | 0.75 | H-2-Ld | 9 | 746 | 738 | WPQIAQFAP |
| 0.5107 | 0.8 | H-2-Dd | 9 | 179 | 171 | YFPLQSYGF |
| 0.5453 | 0.92 | H-2-Ld | 9 | 174 | 166 | EGFNCYFPL |
| Model 3 | | | | | | |
| 1.4177 | 0.2 | H-2-Kd | 9 | 898 | 890 | QYIKWPWYI |
| 1.4263 | 0.3 | H-2-Kd | 9 | 69 | 61 | CYGVSPTKL |
| 0.596 | 0.3 | H-2-Dd | 9 | 195 | 187 | YQPYRVVVL |
| 0.8274 | 0.62 | H-2-Kd | 9 | 396 | 388 | AYSNNSIAI |
| 0.578 | 0.64 | H-2-Kd | 9 | 178 | 170 | CYFPLQSYG |
| 0.5003 | 0.7 | H-2-Kd | 9 | 40 | 32 | VYAWNRKRI |
| 0.853 | 0.7 | H-2-Kd | 9 | 408 | 400 | FTISVTTEI |
| 1.29 | 0.7 | H-2-Kd | 9 | 445 | 437 | QYGSFCTQL |
| 0.5107 | 0.8 | H-2-Dd | 9 | 179 | 171 | YFPLQSYGF |
| 0.5453 | 0.92 | H-2-Ld | 9 | 174 | 166 | EGFNCYFPL |
| Model 4 | | | | | | |
| 1.4177 | 0.2 | H-2-Kd | 9 | 898 | 890 | QYIKWPWYI |
| 1.4263 | 0.3 | H-2-Kd | 9 | 69 | 61 | CYGVSPTKL |
| 0.5964 | 0.3 | H-2-Dd | 9 | 195 | 187 | YQPYRVVVL |
| 0.4821 | 0.4 | H-2-Kd | 9 | 363 | 355 | SYQTQTNSP |
| 0.8274 | 0.62 | H-2-Kd | 9 | 396 | 388 | AYSNNSIAI |
| 0.578 | 0.64 | H-2-Kd | 9 | 178 | 170 | CYFPLQSYG |
| 0.734 | 0.68 | H-2-Ld | 9 | 1296 | 1288 | SPRWYFYYL |
| 0.5003 | 0.7 | H-2-Kd | 9 | 40 | 32 | VYAWNRKRI |
| 0.8535 | 0.7 | H-2-Kd | 9 | 408 | 400 | FTISVTTEI |
| 1.2906 | 0.7 | H-2-Kd | 9 | 445 | 437 | QYGSFCTQL |
| 0.8519 | 0.7 | H-2-Kd | 9 | 1371 | 1363 | SQASSRSSS |
| 0.611 | 0.75 | H-2-Ld | 9 | 1492 | 1484 | WPQIAQFAP |
| 0.5107 | 0.8 | H-2-Dd | 9 | 179 | 171 | YFPLQSYGF |
| 0.5453 | 0.92 | H-2-Ld | 9 | 174 | 166 | EGFNCYFPL |
BALB/c MHC class II epitopes in predicted models
| Antigenicity score | Percentile rank | Allele | Length | End | Start | Peptide |
|---|---|---|---|---|---|---|
| Model 1 | | | | | | |
| 0.3676 | 0.07 | H2-IEd | 15 | 43 | 29 | FASVYAWNRKRISNC |
| 0.4243 | 0.1 | H2-IEd | 15 | 42 | 28 | RFASVYAWNRKRISN |
| 0.3086 | 0.14 | H2-IEd | 15 | 44 | 30 | ASVYAWNRKRISNCV |
| 0.4963 | 0.14 | H2-IEd | 15 | 41 | 27 | TRFASVYAWNRKRIS |
| 0.1089 | 0.17 | H2-IEd | 15 | 144 | 130 | NYNYLYRLFRKSNLK |
| 0.2254 | 0.17 | H2-IEd | 15 | 145 | 131 | YNYLYRLFRKSNLKP |
| 0.3301 | 0.3 | H2-IEd | 15 | 45 | 31 | SVYAWNRKRISNCVA |
| 0.1801 | 0.36 | H2-IEd | 15 | 143 | 129 | GNYNYLYRLFRKSNL |
| 0.0415 | 0.38 | H2-IEd | 15 | 146 | 132 | NYLYRLFRKSNLKPF |
| 0.0814 | 0.71 | H2-IEd | 15 | 147 | 133 | YLYRLFRKSNLKPFE |
| 0.0207 | 0.74 | H2-IEd | 15 | 142 | 128 | GGNYNYLYRLFRKSN |
| Model 2 | | | | | | |
| 0.4243 | 0.1 | H2-IEd | 15 | 42 | 28 | RFASVYAWNRKRISN |
| 0.4614 | 0.14 | H2-IEd | 15 | 534 | 520 | QIGYYRRATRRIRGG |
| 0.4963 | 0.14 | H2-IEd | 15 | 41 | 27 | TRFASVYAWNRKRIS |
| 0.4072 | 0.27 | H2-IEd | 15 | 322 | 308 | FIASFRLFARTRSMW |
| 0.6649 | 0.27 | H2-IEd | 15 | 535 | 521 | IGYYRRATRRIRGGD |
| 0.4424 | 0.3 | H2-IEd | 15 | 323 | 309 | IASFRLFARTRSMWS |
| 0.7304 | 0.57 | H2-IEd | 15 | 324 | 310 | ASFRLFARTRSMWSF |
| 0.7955 | 0.73 | H2-IEd | 15 | 325 | 311 | SFRLFARTRSMWSFN |
| 0.7387 | 0.94 | H2-IEd | 15 | 260 | 246 | LLQFAYANRNRFLYI |
| 0.8634 | 0.98 | H2-IEd | 15 | 416 | 402 | DSGFAAYSRYRIGNY |
| Model 3 | | | | | | |
| 0.4243 | 0.1 | H2-IEd | 15 | 42 | 28 | RFASVYAWNRKRISN |
| 0.4963 | 0.14 | H2-IEd | 15 | 41 | 27 | TRFASVYAWNRKRIS |
| Model 4 | | | | | | |
| 0.4243 | 0.1 | H2-IEd | 15 | 42 | 28 | RFASVYAWNRKRISN |
| 0.4614 | 0.14 | H2-IEd | 15 | 1280 | 1266 | QIGYYRRATRRIRGG |
| 0.4963 | 0.14 | H2-IEd | 15 | 41 | 27 | TRFASVYAWNRKRIS |
| 0.4072 | 0.27 | H2-IEd | 15 | 1068 | 1054 | FIASFRLFARTRSMW |
| 0.6649 | 0.27 | H2-IEd | 15 | 1281 | 1267 | IGYYRRATRRIRGGD |
| 0.4424 | 0.3 | H2-IEd | 15 | 1069 | 1055 | IASFRLFARTRSMWS |
| 0.7304 | 0.57 | H2-IEd | 15 | 1070 | 1056 | ASFRLFARTRSMWSF |
| 0.7955 | 0.73 | H2-IEd | 15 | 1071 | 1057 | SFRLFARTRSMWSFN |
| 0.7387 | 0.94 | H2-IEd | 15 | 1006 | 992 | LLQFAYANRNRFLYI |
| 0.8741 | 0.98 | H2-IEd | 15 | 1162 | 1148 | DSGFAAYSRYRIGNY |
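Because the End and Start columns appear in an unusual order in these tables, a quick consistency check on each row can catch transposed coordinates. The sketch below is an illustrative helper, not part of the study: the names `EpitopeRow` and `check_row` are hypothetical, and the third row is deliberately swapped to show what a failure looks like.

```python
from typing import NamedTuple


class EpitopeRow(NamedTuple):
    peptide: str
    start: int
    end: int
    length: int


def check_row(row):
    """Return a list of problems found in one epitope row (empty if consistent)."""
    problems = []
    if len(row.peptide) != row.length:
        problems.append("peptide length differs from Length column")
    if row.end < row.start:
        problems.append("End is smaller than Start (columns may be swapped)")
    elif row.end - row.start + 1 != row.length:
        problems.append("End - Start + 1 differs from Length column")
    return problems


rows = [
    EpitopeRow("CYGVSPTKL", 61, 69, 9),         # class I row from the table
    EpitopeRow("RFASVYAWNRKRISN", 28, 42, 15),  # class II row, Model 2
    EpitopeRow("RFASVYAWNRKRISN", 42, 28, 15),  # deliberately swapped for illustration
]
report = {index: check_row(row) for index, row in enumerate(rows)}
for index, problems in report.items():
    print(index, problems or "ok")
```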
Human MHC class I epitopes in predicted Model 4
| Antigenicity score | Percentile rank | Allele | Toxicity | End | Start | Peptide |
|---|---|---|---|---|---|---|
| 0.7571 | 0.01 | HLA-A*03:01 | Non-toxin | 1552 | 1544 | KTFPPTEPK |
| 0.7571 | 0.01 | HLA-A*11:01 | Non-toxin | 1552 | 1544 | KTFPPTEPK |
| 0.8132 | 0.01 | HLA-A*11:01 | Non-toxin | 755 | 747 | VTYVPAQEK |
| 0.7014 | 0.01 | HLA-A*03:01 | Non-toxin | 710 | 702 | ASANLAATK |
| 1.4278 | 0.01 | HLA-B*35:01 | Non-toxin | 586 | 578 | IPFAMQMAY |
| 0.5669 | 0.01 | HLA-B*57:01 | Non-toxin | 1457 | 1449 | KAYNVTQAF |
| 0.882 | 0.01 | HLA-B*51:01 | Non-toxin | 404 | 396 | IPTNFTISV |
| 0.8132 | 0.01 | HLA-A*03:01 | Non-toxin | 755 | 747 | VTYVPAQEK |
| 1.4177 | 0.02 | HLA-A*24:02 | Non-toxin | 898 | 890 | QYIKWPWYI |
| 1.7462 | 0.02 | HLA-B*57:01 | Non-toxin | 1291 | 1283 | KMKDLSPRW |
| 0.53 | 0.02 | HLA-B*50:01 | Non-toxin | 871 | 863 | KEIDRLNEV |
| 0.7052 | 0.02 | HLA-B*51:01 | Non-toxin | 402 | 394 | IAIPTNFTI |
| 1.2394 | 0.02 | HLA-B*27:02 | Non-toxin | 1070 | 1062 | ARTRSMWSF |
| 0.5107 | 0.03 | HLA-A*24:02 | Non-toxin | 179 | 171 | YFPLQSYGF |
| 1.1141 | 0.03 | HLA-B*27:02 | Non-toxin | 17 | 9 | VRFPNITNL |
| 1.6639 | 0.04 | HLA-A*02:01 | Non-toxin | 107 | 99 | KIADYNYKL |
| 0.8597 | 0.04 | HLA-B*50:01 | Non-toxin | 1100 | 1092 | LESELVIGA |
| 0.5781 | 0.04 | HLA-A*11:01 | Non-toxin | 517 | 509 | TLADAGFIK |
| 0.7785 | 0.05 | HLA-B*35:01 | Non-toxin | 1003 | 995 | FAYANRNRF |
| 0.9457 | 0.05 | HLA-B*35:01 | Non-toxin | 1136 | 1128 | VATSRTLSY |
| 0.6409 | 0.05 | HLA-B*50:01 | Non-toxin | 1102 | 1094 | SELVIGAVI |
| 0.7585 | 0.05 | HLA-B*51:01 | Non-toxin | 1257 | 1249 | FPRGQGVPI |
Human MHC class II epitopes in predicted Model 4
| Antigenicity score | Percentile rank | Allele | Toxicity | End | Start | Peptide |
|---|---|---|---|---|---|---|
| 0.6031 | 0.12 | HLA-DRB1*11:04 | Non-toxin | 1414 | 1400 | AALALLLLDRLNQLE |
| 0.5057 | 0.12 | HLA-DRB1*11:04 | Non-toxin | 1415 | 1401 | ALALLLLDRLNQLES |
| 0.5669 | 0.12 | HLA-DRB1*11:04 | Non-toxin | 1417 | 1403 | ALLLLDRLNQLESKM |
| 0.5531 | 0.12 | HLA-DRB1*11:04 | Non-toxin | 1413 | 1399 | DAALALLLLDRLNQL |
| 0.7357 | 0.12 | HLA-DRB1*11:04 | Non-toxin | 1416 | 1402 | LALLLLDRLNQLESK |
| 0.6286 | 0.12 | HLA-DRB1*11:04 | Non-toxin | 1418 | 1404 | LLLLDRLNQLESKMS |
| 0.6019 | 0.16 | HLA-DRB1*11:04 | Toxin | 918 | 904 | AGLIAIVMVTIMLCC |
| 0.6693 | 0.16 | HLA-DRB1*11:04 | Toxin | 919 | 905 | GLIAIVMVTIMLCCM |
| 0.7442 | 0.16 | HLA-DRB1*11:04 | Toxin | 921 | 907 | IAIVMVTIMLCCMTS |
| 0.6171 | 0.16 | HLA-DRB1*11:04 | Toxin | 920 | 906 | LIAIVMVTIMLCCMT |
| 0.6806 | 0.16 | HLA-DRB1*07:01 | Non-toxin | 409 | 395 | AIPTNFTISVTTEIL |
| 0.7719 | 0.4 | HLA-DRB1*07:01 | Non-toxin | 408 | 394 | IAIPTNFTISVTTEI |
| 1.1349 | 0.51 | HLA-DRB1*07:01 | Non-toxin | 411 | 397 | PTNFTISVTTEILPV |
| 0.8294 | 0.52 | HLA-DRB1*07:01 | Non-toxin | 410 | 396 | IPTNFTISVTTEILP |
| 1.1691 | 0.52 | HLA-DRB1*07:01 | Non-toxin | 412 | 398 | TNFTISVTTEILPVS |
| 0.6336 | 0.58 | HLA-DRB1*15:01 | Non-toxin | 445 | 431 | CSNLLLQYGSFCTQL |
| 0.9934 | 0.58 | HLA-DRB1*07:01 | Non-toxin | 1525 | 1511 | GTWLTYTGAIKLDDK |
| 0.6215 | 0.58 | HLA-DRB1*07:01 | Non-toxin | 1524 | 1510 | SGTWLTYTGAIKLDD |
| 1.2416 | 0.58 | HLA-DRB1*07:01 | Non-toxin | 1526 | 1512 | TWLTYTGAIKLDDKD |
| 0.8305 | 0.6 | HLA-DRB1*15:01 | Non-toxin | 446 | 432 | SNLLLQYGSFCTQLN |
| 0.6128 | 0.69 | HLA-DRB1*15:01 | Non-toxin | 19 | 5 | TESIVRFPNITNLCP |
| 1.2905 | 0.7 | HLA-DRB1*07:01 | Non-toxin | 1034 | 1020 | LACFVLAAVYRINWI |
| 0.7635 | 0.72 | HLA-DRB1*15:01 | Non-toxin | 444 | 430 | ECSNLLLQYGSFCTQ |
| 0.8668 | 0.75 | HLA-DRB1*15:01 | Non-toxin | 447 | 433 | NLLLQYGSFCTQLNR |
| 0.8548 | 0.76 | HLA-DRB1*07:01 | Non-toxin | 1031 | 1017 | PVTLACFVLAAVYRI |
| 1.0450 | 0.76 | HLA-DRB1*07:01 | Non-toxin | 1032 | 1018 | VTLACFVLAAVYRIN |
| 1.1115 | 0.88 | HLA-DRB1*07:01 | Non-toxin | 1035 | 1021 | ACFVLAAVYRINWIT |
| 1.3132 | 0.88 | HLA-DRB1*07:01 | Non-toxin | 1033 | 1019 | TLACFVLAAVYRINW |
| 0.6125 | 0.92 | HLA-DRB1*15:01 | Non-toxin | 20 | 6 | ESIVRFPNITNLCPF |
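A table like this is typically narrowed down by dropping peptides flagged as toxic and keeping strong binders with high antigenicity. The sketch below shows one way to do that on four rows copied from the table. The cutoffs `max_rank=1.0` and `min_antigenicity=0.6` are assumptions chosen for illustration, not values stated in the source, and `select_epitopes` is a hypothetical helper.

```python
# (peptide, allele, percentile_rank, antigenicity, toxicity), copied from the table above
rows = [
    ("LALLLLDRLNQLESK", "HLA-DRB1*11:04", 0.12, 0.7357, "Non-toxin"),
    ("IAIVMVTIMLCCMTS", "HLA-DRB1*11:04", 0.16, 0.7442, "Toxin"),
    ("TNFTISVTTEILPVS", "HLA-DRB1*07:01", 0.52, 1.1691, "Non-toxin"),
    ("ALALLLLDRLNQLES", "HLA-DRB1*11:04", 0.12, 0.5057, "Non-toxin"),
]


def select_epitopes(rows, max_rank=1.0, min_antigenicity=0.6):
    """Keep non-toxic rows within the (assumed) rank and antigenicity cutoffs.

    Results are ordered by ascending percentile rank (lower rank means
    stronger predicted binding), then by descending antigenicity.
    """
    kept = [
        row
        for row in rows
        if row[4] == "Non-toxin"
        and row[2] <= max_rank
        and row[3] >= min_antigenicity
    ]
    return sorted(kept, key=lambda row: (row[2], -row[3]))


selected = [row[0] for row in select_epitopes(rows)]
print(selected)
```

With these assumed cutoffs, the toxin-flagged peptide and the peptide with antigenicity 0.5057 are excluded. Different cutoffs would give a different shortlist.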
Population coverage of predicted Model 4 epitopes for combined human MHC class I and II alleles in different populations worldwide
| Average of epitope hits | MHC-I and MHC-II combined PPC^a (%) | Population |
|---|---|---|
| 4.92 | 82.95 | Iran |
| 4.86 | 74.74 | Southwest Asia |
| 7.03 | 79.57 | South Asia |
| 7.87 | 92.51 | Europe |
| 6.83 | 89.20 | North America |
| 3.21 | 64.37 | South America |
| 3.04 | 48.97 | Africa |
^a PPC: projected population coverage
Fig. 3 Graphical representation of B-cell epitope prediction by (a) Parker hydrophilicity prediction (threshold: 1.474), (b) Emini surface accessibility prediction (threshold: 1.000), (c) Karplus and Schulz flexibility prediction (threshold: 0.999), (d) Chou and Fasman beta-turn prediction (threshold: 1.004) and (e) Kolaskar and Tongaonkar antigenicity prediction (threshold: 1.0). The yellow regions above the threshold (red line) are predicted to be part of a B-cell epitope, whereas the green regions are not
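The threshold rule described in the caption can be sketched as a scan for consecutive positions whose score exceeds the cutoff. This is a simplified illustration under stated assumptions: the per-residue scores are hypothetical, `regions_above_threshold` is not a function of any prediction server, and any window smoothing the prediction tools apply before thresholding is omitted.

```python
def regions_above_threshold(scores, threshold, min_length=1):
    """Return 1-based (start, end) spans of consecutive positions scoring above threshold."""
    regions = []
    run_start = None
    for position, score in enumerate(scores, start=1):
        if score > threshold:
            if run_start is None:
                run_start = position  # a new run begins here
        elif run_start is not None:
            # The run ended at the previous position.
            if position - run_start >= min_length:
                regions.append((run_start, position - 1))
            run_start = None
    # Close a run that extends to the last residue.
    if run_start is not None and len(scores) - run_start + 1 >= min_length:
        regions.append((run_start, len(scores)))
    return regions


# Hypothetical per-residue scores, NOT taken from the study.
scores = [0.8, 1.1, 1.2, 0.9, 1.05, 1.3, 1.1, 0.7]
regions = regions_above_threshold(scores, threshold=1.0)
print(regions)
```

Raising `min_length` discards short runs, which is one simple way to keep only stretches long enough to be plausible linear epitopes.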
B-cell linear epitopes for selected Model 4
| Antigenicity | Length | B-cell epitope (start position) | Model |
|---|---|---|---|
| 1.2606 | 14 | VRQIAPGQTGKIAD (89) | Model 1 |
| 0.7136 | 12 | YGFQPTNGVGYQ (177) | |
| 0.8904 | 9 | NNLDSKVGG (121) | |
| 1.3668 | 6 | RVQPTE (1) | |
| 0.5455 | 53 | GTTLPKGFYAEGSRGGSQASSRSSSRSRNSSRNSTPGSSRGTSPARMAGNGGD (601) | Model 2 |
| 0.5302 | 38 | SKMSGKGQQQQGQTVTKKSAAEASKKPRQKRTATKAYN (669) | |
| 0.5605 | 30 | KTFPPTEPKKDKKKKADETQALPQRQKKQQ (798) | |
| 0.5570 | 28 | QHGKEDLKFPRGQGVPINTNSSPDDQIG (495) | |
| 1.1728 | 15 | AFGRRGPEQTQGNFG (710) | |
| 1.2606 | 14 | VRQIAPGQTGKIAD (89) | |
| 0.7136 | 12 | YGFQPTNGVGYQ (177) | |
| 0.8771 | 12 | RIRGGDGKMKDL (530) | |
| 2.1298 | 10 | KLDDKDPNFK (775) | |
| 0.8904 | 9 | NNLDSKVGG (121) | |
| 1.3668 | 6 | RVQPTE (1) | |
| 0.6838 | 6 | TDYKHW (733) | |
| 1.2606 | 14 | VRQIAPGQTGKIAD (89) | Model 3 |
| 0.7136 | 12 | YGFQPTNGVGYQ (177) | |
| 0.5322 | 12 | ILPDPSKPSKRS (487) | |
| 1.4039 | 11 | KNHTSPDVDLG (839) | |
| 0.8904 | 9 | NNLDSKVGG (121) | |
| 1.3668 | 6 | RVQPTE (1) | |
| 0.5455 | 53 | GTTLPKGFYAEGSRGGSQASSRSSSRSRNSSRNSTPGSSRGTSPARMAGNGGD (1347) | Model 4 |
| 0.5302 | 38 | SKMSGKGQQQQGQTVTKKSAAEASKKPRQKRTATKAYN (1415) | |
| 0.5605 | 30 | KTFPPTEPKKDKKKKADETQALPQRQKKQQ (1544) | |
| 1.1728 | 15 | AFGRRGPEQTQGNFG (1456) | |
| 1.2606 | 14 | VRQIAPGQTGKIAD (89) | |
| 0.7136 | 12 | YGFQPTNGVGYQ (177) | |
| 0.5322 | 12 | ILPDPSKPSKRS (487) | |
| 0.8771 | 12 | RIRGGDGKMKDL (1276) | |
| 1.4039 | 11 | KNHTSPDVDLG (839) | |
| 2.1298 | 10 | KLDDKDPNFK (1521) | |
| 0.8904 | 9 | NNLDSKVGG (121) | |
| 1.3668 | 6 | RVQPTE (1) | |
| 0.7417 | 6 | DSLSST (618) | |
| 0.6838 | 6 | TDYKHW (1479) |
Fig. 4 Sequence and structural analysis of Model 4. (a) Secondary structure predicted by the SOPMA tool, (b) three-dimensional structure visualized in PyMOL and (c) Ramachandran plot generated to validate the modeled 3D structure of the Model 4 protein, which indicates that 91.7% of residues are in the favored region
Discontinuous B-cell epitopes predicted by ElliPro for Model 4