Maryam Enayatkhani1, Mehdi Hasaniazad1, Sobhan Faezi2, Hamed Gouklani1, Parivash Davoodian1, Nahid Ahmadi3, Mohammad Ali Einakian4, Afsaneh Karmostaji1, Khadijeh Ahmadi1.
Abstract
At present, the novel coronavirus (2019-nCoV, the causative agent of COVID-19) has caused worldwide social and economic disruption. The disturbing statistics of this infection prompted us to develop an effective vaccine candidate against COVID-19. In this study, bioinformatics approaches were employed to design a novel multi-epitope vaccine against 2019-nCoV that can potentially trigger both CD4+ and CD8+ T-cell immune responses, and its biological activities were investigated with computational tools. Three known antigenic proteins from the virus (Nucleocapsid, ORF3a, and Membrane protein, hereafter called NOM) were selected and analyzed to predict potential immunogenic B- and T-cell epitopes, which were then validated using bioinformatics tools. Based on the in silico analysis, we constructed a multi-epitope vaccine candidate (NOM) with five epitope-rich domains containing highly scored T- and B-cell epitopes. After predicting and evaluating the tertiary structure of the protein candidate, the best predicted 3D model was used for docking studies with Toll-like receptor 4 (TLR4) and HLA-A*11:01. Next, molecular dynamics (MD) simulation was used to evaluate the stability of the designed fusion protein in complex with the TLR4 and HLA-A*11:01 receptors. The MD studies demonstrated that the NOM-TLR4 and NOM-HLA-A*11:01 docked models were stable during the simulation time. In silico evaluation showed that the designed chimeric protein could simultaneously elicit humoral and cell-mediated immune responses. Communicated by Ramaswamy H. Sarma.
Keywords: COVID-19; Coronavirus; Epitope; Immunoinformatics; Vaccine
Year: 2020 PMID: 32295479 PMCID: PMC7196925 DOI: 10.1080/07391102.2020.1756411
Source DB: PubMed Journal: J Biomol Struct Dyn ISSN: 0739-1102
Figure 1. Strategies employed in the overall study.
Amino acid sequences of proteins were retrieved from NCBI.
| Protein name | Accession number | FASTA sequence |
|---|---|---|
| Nucleocapsid protein | QIC53221.1 | MSDNGPQNQRNAPRITFGGPSDSTGSNQNGERSGARSKQRRPQGLPNNTASWFTALTQHGKEDLKFPRGQGVPINTNSSPDDQIGYYRRATRRIRGGDGKMKDLSPRYFYYLGTGPEAGLPYGANKDGIIWVATEGALNTPKDHIGTRNPANNAAIVLQLPQGTTLPKGFYAEGSRGGSQASSRSSSRSRNSSRNSTPGSSRGTSPARMAGNGGDAALALLLLDRLNQLESKMSGKGQQQQGQTVTKKSAAEASKKPRQKRTATKAYNVTQAFGRRGPEQTQGNFGDQELIRQGTDYKHWPQIAQFAPSASAFFGMSRIGMEVTPSGTWLTYTGAIKLDDKDPNFKDQVILLNKHIDAYKTFPPTEPKKDKKKKADETQALPQRQKKQQTVTLLPAADLDDFSKQLQQSMSSADSTQA |
| Membrane protein | QIC53216.1 | MADSNGTITVEELKKLLEQWNLVIGFLFLTWICLLQFAYANRNRFLYIIKLIFLWLLWPVTLACFVLAAVYRINWITGGIAIAMACLVGLMWLSYFIASFRLFARTRSMWSFNPETNILLNVPLHGTILTRPLLESELVIGAVILRGHLRIAGHHLGRCDIKDLPKEITVATSRTLSYYKLGASQRVAGDSGFAAYSRYRIGNYKLNTDHSSSSDNIALLV |
| ORF10 protein | QIC53212.1 | MGYINVFAFPFTIYSLLLCRMNSRNYIAQVDVVNFNLT |
| Envelope protein | QIC53206.1 | MYSFVSEETGTLIVNSVLLFLAFVVFLLVTLAILTALRLCAYCCNIVNVSLVKPSFYVYSRVKNLNSSRVPDLLV |
| ORF8 protein | QIC53210.1 | MKFLVFLGIITTVAAFHQECSLQSCTQHQPYVVDDPCPIHFYSKWYIRVGARKSAPLIELCVDEAGSKSPIQYIDIGNYTVSCLPFTINCQEPKLGSLVVRCSFYEDFLEYHDVRVVLDFI |
| ORF3a protein | QIC53205.1 | MDLFMRIFTIGTVTLKQGEIKDATPSDFVRATATIPIQASLPFGWLIVGVALLAVFQSASKIITLKKRWQLALSKGVHFVCNLLLLFVTVYSHLLLVAAGLEAPFLYLYALVYFLQSINFVRIIMRLWLCWKCRSKNPLLYDANYFLCWHTNCYDYCIPYNSVTSSIVITSGDGTTSPISEHDYQIGGYTEKWESGVKDCVVLHSYFTSDYYQLYSTQLSTDTGVEHVTFFIYNKIVDEPEEHVQIHTIDVSSGVVNPVMEPIYDEPTTTTSVPL |
HLA I antigenic epitopes predicted using Rankpep.
| Antigen | HLA_A0201 | HLA_A0204 | HLA_A0206 | HLA_B0702 | HLA_B51 | HLA_B5401 | HLA_B5301 |
|---|---|---|---|---|---|---|---|
| Nucleocapsid protein | 316-324 | 299-307 | 66-74 | 343-351 | 66-74 | 45-53 | |
| GMSRIGMEV | KHWPQIAQF | FPRGQGVPI | DPNFKDQVI | FPRGQGVPI | LPNNTASWF | ||
| 45-53 | |||||||
| LPNNTASWF | |||||||
| 66-74 | |||||||
| FPRGQGVPI | |||||||
| ORF3a | 39-47 | 39-47 | 45-53 | 35-43 | 41-49 | ||
| ASLPFGWLIV | ASLPFGWLI | WLIVGVALL | IPIQASLPF | LPFGWLIVG | |||
| 46-55 | 72-80 | 72-80 | 35-43 | ||||
| LIVGVALLAV | ALSKGVHFV | ALSKGVHFV | IPIQASLPF | ||||
| 64-73 | 220-228 | 82-90 | |||||
| TLKKRWQLAL | STDTGVEHV | NLLLLFVTV | |||||
| 79-87 | |||||||
| FVCNLLLLFV | |||||||
| Membrane protein | 16-24 | 56-64 | 21-29 | 58-66 | 58-66 | 58-66 | |
| LLEQWNLVI | LLWPVTLAC | NLVIGFLFL | WPVTLACFV | WPVTLACFV | WPVTLACFV | ||
| 15-23 | 96-104 | ||||||
| KLLEQWNLV | FIASFRLFA | FVLAAVYRI | |||||
| 65-73 | 89-97 | ||||||
| FVLAAVYRI | GLMWLSYFI | ||||||
| 96-104 | |||||||
| FIASFRLFA |
HLA II antigenic epitopes predicted using Rankpep.
| Antigen | HLADRB10101 | HLADRB10401 | HLADRB10402 | HLADRB10402 | HLADRB10701 | HLADRB10801 | HLADRB11101 | HLADRB11501 |
|---|---|---|---|---|---|---|---|---|
| Nucleocapsid protein | 298-305 | 52-60 | 346-354 | 41-49 | 87-95 | 50-58 | ||
| YKHWPQIAQ | WFTALTQHG | FKDQVILLN | RPQGLPNNT | YRRATRRIR | ASWFTALTQ | |||
| 354-362 | 86-94 | 300-308 | 348-356 | 86-94 | 360-368 | |||
| NKHIDAYKT | YYRRATRRI | HWPQIAQFA | DQVILLNKH | YYRRATRRI | YKTFPPTEP | |||
| 86-94 | 300-308 | 34-42 | ||||||
| YYRRATRRI | HWPQIAQFA | GARSKQRRP | ||||||
| 305-313 | 49-57 | 298-306 | ||||||
| AQFAPSASA | TASWFTALT | YKHWPQIAQ | ||||||
| 52-60 | 301-309 | |||||||
| WFTALTQHG | WPQIAQFAP | |||||||
| ORF3a | 211-219 | 211-219 | 45-53 | 62-70 | 59-67 | 62-70 | 77-85 | |
| YYQLYSTQL | YYQLYSTQL | WLIVGVALL | IITLKKRWQ | ASKIITLKK | IITLKKRWQ | VHFVCNLLL | ||
| 212-220 | 45-53 | 211-219 | 85-93 | 87-95 | 68-76 | 84-92 | ||
| YQLYSTQLS | WLIVGVALL | YYQLYSTQL | LLFVTVYSH | FVTVYSHLL | RWQLALSKG | LLLFVTVYS | ||
| 65-73 | 54-62 | |||||||
| LKKRWQLAL | AVFQSASKI | |||||||
| Membrane protein | 32-40 | 28-36 | 65-73 | |||||
| ICLLQFAYA | FLTWICLLQ | FLTWICLLQ | FAYANRNRF | FVLAAVYRI | RFLYIIKLI | WICLLQFAY | ||
| 65-73 | 55-63 | 48-56 | 55-63 | 39-47 | ||||
| FVLAAVYRI | WLLWPVTLA | IKLIFLWLL | IIKLIFLWL | WLLWPVTLA | YANRNRFLY | |||
| 76-84 | 65-73 | 90-98 | ||||||
| ITGGIAIAM | FVLAAVYRI | WPVTLACFV | LMWLSYFIA | |||||
| 80-88 | 71-79 | |||||||
| IAIAMACLV | YRINWITGG |
Predicted epitopes of N, ORF3a and M proteins via Bepipred and Kolaskar & Tongaonkar antigenicity.
| Antigen | Server | Amino acid Position | Sequence |
|---|---|---|---|
| Nucleocapsid protein | 1-51 | MSDNGPQNQRNAPRITFGGPSDSTGSNQNGERSGARSKQRRPQGLPNNTAS | |
| (QIC53221.1) | 58-85 | QHGKEDLKFPRGQGVPINTNSSPDDQIG | |
| 93-104 | RIRGGDGKMKDL | ||
| 306-310 | QFAPS | ||
| 323-331 | EVTPSGTWL | ||
| 338-347 | KLDDKDPNFK | ||
| 361-390 | KTFPPTEPKKDKKKKADETQALPQRQKKQQ | ||
| 52-59 | WFTALTQH | ||
| 69-75 | GQGVPIN | ||
| 83-89 | QIGYYRR | ||
| 299-315 | KHWPQIAQFAPSASAFF | ||
| 333-339 | YTGAIKL | ||
| 347-363 | KDQVILLNKHIDAYKTF | ||
| ORF3a | 17-28 | QGEIKDATPSDF | |
| (QIC53205.1) | 61-71 | KIITLKKRWQL | |
| 213-214 | QL | ||
| 216-225 | STQLSTDTGV | ||
| 44-58 | GWLIVGVALLAVFQS | ||
| 74-100 | SKGVHFVCNLLLLFVTVYSHLLLVAAG | ||
| 212-217 | YQLYST | ||
| Membrane protein (QIC53207.1) | 5-20 | NGTITVEELKKLLEQW | |
| 40-41 | AN | ||
| 132-137 | PLLESE | ||
| 180-191 | KLGASQRVAGDS | ||
| 29-38 | LTWICLLQFA | ||
| 46-71 | LYIIKLIFLWLLWPVTLACFVLAAVY | ||
| 83-91 | AMACLVGLM | ||
| 93-101 | LSYFIASFR |
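
The position ranges in these epitope tables are 1-based and inclusive, so each range should span exactly as many residues as the peptide next to it. A minimal Python sketch of that cross-check follows, using the Nucleocapsid 300–370 fragment listed among the five epitope-rich domains. The helper name `subsequence` and the choice of example ranges are illustrative assumptions, not part of the study's pipeline.

```python
# Nucleocapsid residues 300-370, as listed among the five epitope-rich domains.
N_300_370 = (
    "HWPQIAQFAPSASAFFGMSRIGMEVTPSGTWLTYTGAIKLDDK"
    "DPNFKDQVILLNKHIDAYKTFPPTEPKK"
)
FRAGMENT_START = 300  # 1-based position of the fragment's first residue


def subsequence(fragment: str, fragment_start: int, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of the parent protein."""
    if start < fragment_start or end < start:
        raise ValueError("range must be ascending and inside the fragment")
    lo = start - fragment_start
    hi = end - fragment_start + 1
    if hi > len(fragment):
        raise ValueError("range runs past the end of the fragment")
    return fragment[lo:hi]


# (start, end, peptide) examples taken from the Nucleocapsid rows of the tables.
checks = [
    (306, 310, "QFAPS"),
    (316, 324, "GMSRIGMEV"),
    (338, 347, "KLDDKDPNFK"),
]
for start, end, peptide in checks:
    found = subsequence(N_300_370, FRAGMENT_START, start, end)
    print(f"{start}-{end}: {found} (matches table: {found == peptide})")
```

The same helper shows that a nine-residue peptide such as EVTPSGTWL, which starts at position 323, must end at position 331.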
Figure 2. Schematic diagram of the vaccine candidate construct, which consists of epitopes from the N, ORF3a and M proteins of the COVID-19 virus linked together with AAA linkers.
Figure 3. Prediction and validation of the tertiary structure of the NOM recombinant protein: (a) predicted tertiary structure, (b) ProSA-web validation, (c) Ramachandran plot.
Conformational B-cell epitopes of the vaccine protein predicted using the ElliPro server.
| No. | Residues | Number of residues | Score |
|---|---|---|---|
| 1 | P1, S2, D3, S4, T5, G6, S7, N8, Q9, N10, G11, E12, S14, G15, A16, R17, S18, K19, Q20, R21, R22 | 21 | 0.945 |
| 2 | K123, L124, D125, D126, K127, D128, P129, N130, F131, K132, D133, Q134, V135, I136 | 14 | 0.819 |
| 3 | P23, Q24, G25, L26, P27, N28, N29, T30, A31, S32, W33, F34, T35, A36, L37, T38, Q39, H40, G41, K42, E43, D44, L45 | 23 | 0.779 |
| 4 | I181, I182, T183, L184, K185, K186, R187, W188, Q189, L190, A191, L192, S193, K194, G195, V196, H197, F198, V199, C200, N201, L203, F244, I245, Y246, N247, K248, I249, V250, D251, E252, P253, A254, A255, A256, W257, N258, L259, V260, I261, G262, F263, L264, F265, L266, T267, W268, I269, C270, L271, L272, Q273, F274, A275, Y276, A277, N278, R279, N280, R281, F282, L283, I285, I286, I289, L304, A305, A306, Y308, R309, I310, N311, W312, I313, T314, G315, G316, I317, I319 | 79 | 0.679 |
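
The residue lists in this table can be checked mechanically against the "Number of residues" column. The sketch below parses the labels for epitopes 1 and 2 (copied from the table), counts them, and reports any positions skipped inside each stretch. The function name `parse_residues` is a hypothetical helper introduced only for illustration.

```python
import re

# Residue lists for ElliPro epitopes 1 and 2, copied from the table above.
EPITOPES = {
    1: "P1, S2, D3, S4, T5, G6, S7, N8, Q9, N10, G11, E12, S14, G15, A16, "
       "R17, S18, K19, Q20, R21, R22",
    2: "K123, L124, D125, D126, K127, D128, P129, N130, F131, K132, D133, "
       "Q134, V135, I136",
}


def parse_residues(text: str) -> list[tuple[str, int]]:
    """Split 'P1, S2, ...' into (amino acid, position) tuples."""
    return [(aa, int(pos)) for aa, pos in re.findall(r"([A-Z])(\d+)", text)]


for number, text in EPITOPES.items():
    residues = parse_residues(text)
    positions = [pos for _, pos in residues]
    # Positions inside the first..last span that the epitope does not include.
    skipped = sorted(set(range(positions[0], positions[-1] + 1)) - set(positions))
    print(f"epitope {number}: {len(residues)} residues, skipped positions: {skipped}")
```

Discontinuous epitopes may legitimately skip positions, so a non-empty `skipped` list is not by itself an error.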
Figure 4. Three-dimensional representation of discontinuous epitopes of the NOM designed protein. The epitopes are represented by a yellow surface, and the bulk of the polyprotein is represented in grey sticks.
Figure 5. The RMSD values of the simulated monomer forms of the proteins throughout the 100 ns of production runs.
The rankings of the solution of the complexes of NOM protein and the immune receptors sorted by global energy (kJ/mol).
| Complex | Solution no. | Global energy | aVdW | rVdW | aElec | rElec | laElec | lrElec |
|---|---|---|---|---|---|---|---|---|
| CoVir NOM-TLR4 | 6 | −70.72 | −35.8 | 15.25 | −59.29 | 32.2 | −9.04 | 7.84 |
| CoVir NOM-TLR4 | 5 | −36.56 | −38.35 | 23.35 | −35.83 | 0 | −8.39 | 8.75 |
| CoVir NOM-TLR4 | 7 | −36.39 | −24.48 | 25.24 | 0 | 0 | 0 | 0 |
| CoVir NOM-TLR4 | 9 | −36.21 | −35.91 | 16.81 | −16.15 | 12.49 | −16.07 | 8.56 |
| CoVir NOM-TLR4 | 4 | −34.54 | −20.58 | 9.45 | −11.25 | 0 | −5.55 | 0 |
| CoVir NOM-TLR4 | 3 | −21.92 | −34.54 | 8.21 | −25.98 | 58.08 | −29.15 | 12.41 |
| CoVir NOM-TLR4 | 2 | −19.49 | −23.45 | 11.3 | −82.71 | 81.52 | −17.3 | 0 |
| CoVir NOM-TLR4 | 8 | −18.15 | −17.09 | 6.82 | 0 | 0 | 0 | 0 |
| CoVir NOM-TLR4 | 10 | −6.55 | −17.77 | 6.58 | −7.49 | 5.9 | −8.21 | 7.88 |
| CoVir NOM-TLR4 | 1 | 26.74 | −3.48 | 49.43 | 0 | 0 | −2.35 | 3.58 |
| CoVir NOM-HLA-A*11:01 | 2 | −26.26 | −30.64 | 19.51 | −26.6 | 52.43 | −24.6 | 8.53 |
| CoVir NOM-HLA-A*11:01 | 3 | −21.9 | −19.09 | 7.4 | −47.3 | 14.42 | −39.27 | 20.53 |
| CoVir NOM-HLA-A*11:01 | 1 | −5.38 | −5.26 | 0.43 | 0 | 0 | −5.89 | 3.65 |
| CoVir NOM-HLA-A*11:01 | 8 | −0.97 | −8.76 | 3.27 | 0 | 0 | 0 | 0 |
| CoVir NOM-HLA-A*11:01 | 5 | 0.15 | −25.97 | 7.15 | −37.47 | 90.84 | −26.96 | 28.97 |
| CoVir NOM-HLA-A*11:01 | 6 | 1.52 | −0.8 | 0.61 | 0 | 0 | 0 | 0 |
| CoVir NOM-HLA-A*11:01 | 9 | 1.74 | −28.7 | 15.11 | −90.47 | 136.21 | −32.16 | 26.54 |
| CoVir NOM-HLA-A*11:01 | 4 | 10.29 | −10.6 | 22.49 | −40.45 | 11.12 | −24.32 | 24.47 |
| CoVir NOM-HLA-A*11:01 | 7 | 18.39 | −21.1 | 10.69 | −35.18 | 99.31 | −30.65 | 40.48 |
| CoVir NOM-HLA-A*11:01 | 10 | 1498.09 | −34.31 | 2561.4 | −73.6 | 10.46 | −7.59 | 18.87 |
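
Ranking docking solutions by global energy, as this table does, can be sketched in a few lines of Python. The sketch keeps the two lowest-energy solutions per complex. That cut-off is an assumption inferred from the tables (the resulting solution numbers are the ones that appear in the MM-PBSA table), not a rule stated in the text, and `best_solutions` is a hypothetical helper name.

```python
# (complex, solution number, global energy) rows copied from the docking table.
SOLUTIONS = [
    ("NOM-TLR4", 6, -70.72), ("NOM-TLR4", 5, -36.56), ("NOM-TLR4", 7, -36.39),
    ("NOM-TLR4", 9, -36.21), ("NOM-TLR4", 4, -34.54), ("NOM-TLR4", 3, -21.92),
    ("NOM-TLR4", 2, -19.49), ("NOM-TLR4", 8, -18.15), ("NOM-TLR4", 10, -6.55),
    ("NOM-TLR4", 1, 26.74),
    ("NOM-HLA-A*11:01", 2, -26.26), ("NOM-HLA-A*11:01", 3, -21.9),
    ("NOM-HLA-A*11:01", 1, -5.38), ("NOM-HLA-A*11:01", 8, -0.97),
    ("NOM-HLA-A*11:01", 5, 0.15), ("NOM-HLA-A*11:01", 6, 1.52),
    ("NOM-HLA-A*11:01", 9, 1.74), ("NOM-HLA-A*11:01", 4, 10.29),
    ("NOM-HLA-A*11:01", 7, 18.39), ("NOM-HLA-A*11:01", 10, 1498.09),
]


def best_solutions(rows, n=2):
    """Return the n lowest-global-energy (solution, energy) pairs per complex."""
    ranked = {}
    # Sort all rows by energy, most negative (most favourable) first.
    for complex_name, solution, energy in sorted(rows, key=lambda row: row[2]):
        kept = ranked.setdefault(complex_name, [])
        if len(kept) < n:
            kept.append((solution, energy))
    return ranked


for complex_name, kept in best_solutions(SOLUTIONS).items():
    print(complex_name, kept)
```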
Figure 6. The total energy plots of the simulations.
Figure 7. The RMSD values of each protein in the simulated complexes throughout the 100 ns of production runs.
The Van der Waals, Electrostatic, Polar solvation, SASA and Binding Energy of protein complexes, kJ/mol, calculated by MMPBSA method.
| Complex | Van der Waals | Electrostatic | Polar solvation | SASA | Binding Energy |
|---|---|---|---|---|---|
| CoVir NOM-HLA-A*11:01, sol no 2 | −556 +/- 8 | −2991.1 +/- 28.5 | 1354.4 +/- 20.5 | −79.46 +/- 1.2 | −2267.7 +/- 6.5 |
| CoVir NOM-HLA-A*11:01, sol no 3 | −381.9 +/- 8.7 | −2705.2 +/- 11.5 | 729.4 +/- 10.9 | −57.1 +/- 1 | −2423.3 +/- 24.9 |
| CoVir NOM-TLR4, sol no 5 | −503 +/- 5.7 | −1598.9 +/- 10.8 | 1202 +/- 6.1 | −75.4 +/- 0.3 | −978.4 +/- 7.6 |
| CoVir NOM-TLR4, sol no 6 | −407.6 +/- 5.9 | −1406.8 +/- 23.4 | 1009.3 +/- 12.3 | −64.8 +/- 0.6 | −865.7 +/- 10.6 |
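
In MM-PBSA calculations the binding energy is commonly reported as the sum of the Van der Waals, electrostatic, polar solvation and SASA (non-polar solvation) terms. Assuming that decomposition applies to this table, the sketch below adds up the mean values of each row and compares the result with the reported binding energy. The uncertainties are omitted, and the tolerance-free comparison is only a consistency check, not a reproduction of the authors' calculation.

```python
# Mean energy terms in kJ/mol, copied from the MM-PBSA table:
# (Van der Waals, electrostatic, polar solvation, SASA, reported binding energy)
TERMS = {
    "NOM-HLA-A*11:01, sol no 2": (-556.0, -2991.1, 1354.4, -79.46, -2267.7),
    "NOM-HLA-A*11:01, sol no 3": (-381.9, -2705.2, 729.4, -57.1, -2423.3),
    "NOM-TLR4, sol no 5": (-503.0, -1598.9, 1202.0, -75.4, -978.4),
    "NOM-TLR4, sol no 6": (-407.6, -1406.8, 1009.3, -64.8, -865.7),
}

# Sum of the four component terms for each complex.
summed = {
    name: vdw + elec + polar + sasa
    for name, (vdw, elec, polar, sasa, _reported) in TERMS.items()
}

for name, total in summed.items():
    reported = TERMS[name][4]
    print(f"{name}: sum = {total:.1f}, reported = {reported:.1f}, "
          f"difference = {total - reported:+.1f}")
```

The sums land within a few kJ/mol of the reported binding energies. The small gaps are plausibly due to averaging or rounding, though the source does not say so.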
Figure 8. The RMSF values of each protein in the simulated complexes compared to the simulated monomer forms of the proteins throughout the 100 ns of production runs. a, The comparison of the RMSF values of NOM recombinant protein in the complexes with the monomer form. b, The comparison of the RMSF values of HLA-A*11:01 in the complexes with the simulated monomer form. c, The comparison of the RMSF values of TLR4 in the complexes with the simulated monomer form.
Figure 9. The graphical illustration of the monomer forms and the complex forms of the NOM recombinant protein and the HLA-A*11:01 and TLR4 immune receptors.
Five epitope-rich domains were selected
| Antigen | Position | Antigenic determinant |
|---|---|---|
| Nucleocapsid protein | 20–100 | PSDSTGSNQNGERSGARSKQRRPQGLPNNTASWFTALTQHGKEDLKFPRGQGVPINTNSSPDDQIGYYRRATRRIRGGDGK |
| Nucleocapsid protein | 300–370 | HWPQIAQFAPSASAFFGMSRIGMEVTPSGTWLTYTGAIKLDDKDPNFKDQVILLNKHIDAYKTFPPTEPKK |
| ORF3a | 40–100 | SLPFGWLIVGVALLAVFQSASKIITLKKRWQLALSKGVHFVCNLLLLFVTVYSHLLLVAAG |
| ORF3a | 210–240 | DYYQLYSTQLSTDTGVEHVTFFIYNKIVDEP |
| Membrane protein | 20–100 | WNLVIGFLFLTWICLLQFAYANRNRFLYIIKLIFLWLLWPVTLACFVLAAVYRINWITGGIAIAMACLVGLMWLSYFIASF |
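
The five domains above, joined with AAA linkers as described in the Figure 2 caption, can be assembled into a single chimeric sequence with a short Python sketch. The domain order used here (table order) is an assumption; it is supported by the fact that the resulting numbering reproduces residue labels from the ElliPro table, such as P1, K123, I181, F244 and W257.

```python
LINKER = "AAA"

# Epitope-rich domains in table order; the order of assembly is assumed.
DOMAINS = [
    ("Nucleocapsid 20-100",
     "PSDSTGSNQNGERSGARSKQRRPQGLPNNTASWFTALTQHGK"
     "EDLKFPRGQGVPINTNSSPDDQIGYYRRATRRIRGGDGK"),
    ("Nucleocapsid 300-370",
     "HWPQIAQFAPSASAFFGMSRIGMEVTPSGTWLTYTGAIKLDDK"
     "DPNFKDQVILLNKHIDAYKTFPPTEPKK"),
    ("ORF3a 40-100",
     "SLPFGWLIVGVALLAVFQSASKIITLKKRWQLALSKGVHFVCNLLLLFVTVYSHLLLVAAG"),
    ("ORF3a 210-240",
     "DYYQLYSTQLSTDTGVEHVTFFIYNKIVDEP"),
    ("Membrane 20-100",
     "WNLVIGFLFLTWICLLQFAYANRNRFLYIIKLIFLWLLWPVTLA"
     "CFVLAAVYRINWITGGIAIAMACLVGLMWLSYFIASF"),
]

# Join the domains with the linker between each consecutive pair.
construct = LINKER.join(sequence for _, sequence in DOMAINS)
print(f"construct length: {len(construct)} residues")

# Spot-check the numbering against residue labels from the ElliPro table.
for label in ("P1", "K123", "I181", "F244", "W257"):
    amino_acid, position = label[0], int(label[1:])
    print(label, "matches:", construct[position - 1] == amino_acid)
```

Any tags, adjuvants or terminal residues the authors may have added to the final construct are not listed in the table and are therefore not modelled here.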