Literature DB >> 28356762

In silico-based vaccine design against Ebola virus glycoprotein.

Raju Dash¹, Rasel Das², Md Junaid³, Md Forhad Chowdhury Akash⁴, Ashekul Islam⁵, Sm Zahid Hosen¹.

Abstract

Ebola virus (EBOV) is one of the lethal viruses, causing more than 24 epidemic outbreaks to date. Despite having available molecular knowledge of this virus, no definite vaccine or other remedial agents have been developed yet for the management and avoidance of EBOV infections in humans. Disclosing this, the present study described an epitope-based peptide vaccine against EBOV, using a combination of B-cell and T-cell epitope predictions, followed by molecular docking and molecular dynamics simulation approach. Here, protein sequences of all glycoproteins of EBOV were collected and examined via in silico methods to determine the most immunogenic protein. From the identified antigenic protein, the peptide region ranging from 186 to 220 and the sequence HKEGAFFLY from the positions of 154-162 were considered the most potential B-cell and T-cell epitopes, correspondingly. Moreover, this peptide (HKEGAFFLY) interacted with HLA-A*32:15 with the highest binding energy and stability, and also a good conservancy of 83.85% with maximum population coverage. The results imply that the designed epitopes could manifest vigorous enduring defensive immunity against EBOV.

Entities: CellLine Chemical Disease Mutation Species

Keywords: Ebola virus; epitope; glycoprotein; vaccine design

Year: 2017 PMID： 28356762 PMCID： PMC5367765 DOI： 10.2147/AABC.S115859

Source DB: PubMed Journal: Adv Appl Bioinform Chem ISSN： 1178-6949

Introduction

Ebola virus (EBOV) is an antisense-strand RNA virus from the Filoviridae family, and it is structurally filamentous.1 Although the initial discovery of EBOV was in 1976, till now more than 24 epidemics have been reported from Africa, mostly with the Zaire species (http://who.int/mediacentre/factsheets/fs103/en/).2–4 The genome of EBOV enciphers the seven structural proteins, ie, nucleoprotein (NP), viral structural proteins (VP35, VP40, VP30, and VP24), glycoprotein (GP), and RNA-dependent RNA polymerase (L).5 Among these, three different versions of glycoprotein are transcribed by the GP gene.6–9 Both attachment protein (GP1) and entry/fusion protein (GP2) are expressed from the full length of the GP chain, which are synthesized from messenger RNAs (mRNAs), containing an additional nontemplated adenosine. The soluble GP (sGP) is synthesized from the unedited RNA transcript. On the contrary, small soluble GP (ssGP) is translated during this process by adding two additional adenosine residues.10 The GPs are expressed virally on the virion surface, which plays a crucial role in the catalysis of membrane fusion and amalgamation to host cells. As a result, it is considered not only a crucial component for vaccines but also an essential target for developing inhibitors and antibodies of attachment and fusion.11–13 With the advances in genomics, proteomics, and the understanding of pathogens, the field of viral vaccine preparation has been recently expanded by a most promising approach, known as epitope-based vaccine design.14 Epitope represents the negligible immunogenic region of a protein sequence, which specifically elicits accurate immune responses.15 Various studies recently reported that the vaccination process based on epitope efficiently educes defensive immune responses against diverse pathogens.16–19 In this context, prediction of epitopes via in silico tools in vaccine designing process can significantly minimize the time and cost required in the development process. Thereby, based on the available GP sequences of EBOV, this study attempted to design effective epitope-based peptide vaccines (T-cell and B-cell epitope) using various in silico tools. These results offer new epitope vaccine candidates for vaccine development against EBOV.

Materials and methods

The methodologies used for peptide vaccine development are shown in Figure 1.

Figure 1

Graphical depiction of the methodologies used in peptide vaccine design.

Abbreviations: EBOV, Ebola virus; GP, glycoprotein; RMSF, root mean square function; RMSD, root mean square deviation; MD, molecular dynamics; HLA, human leukocyte antigen; IEDB, immune epitope database.

Protein sequence retrieval, evaluation analysis, and antigenic protein identification

All available sequences of the GP of EBOV were extracted from the UniProt database.20 After that, multiple sequence alignment was performed by using the ClustalW2 tool, and a phylogenetic tree was assembled by MEGA 6.021 software. And then, VaxiJen v2.022 was used to predict most efficient antigenic protein from the available protein sequences.

T-cell epitope identification and conservancy analysis

T-cell identification was done using the NetCTL 1.2 server,22 setting thresholds at 0.5, 0.89, and 0.94 for sensitivity and accuracy. MHC-I binding of the identified epitopes and epitope conservancy were then calculated using tools from the immune epitope database (IEDB).24–26 These tools calculate the half maximal inhibitory concentration (IC50) value of epitope binding to human leukocyte antigen (HLA) molecules using the stabilized matrix base method.26,27 The restriction for epitope identification was set to 12 MHC-I supertypes. Prior to the run, all the alleles were considered, and the length of the peptides was set at 9.0.

Prediction of population coverage and allergenicity assessment

The population coverage tool from IEDB was applied to determine the population coverage for every single epitope by selecting HLA alleles of the corresponding epitope. Allergenicity of the predicted epitope was calculated using AllerHunter,27 which can predict both nonallergens and allergens with a high level of accuracy, by comparing the input sequence with the sequence of known allergen.29

Molecular simulation analysis of HLA allele interaction

Design of the three-dimensional structure of epitope and HLA protein

The three-dimensional structures of all the five epitopes were predicted by a PEP-FOLD web-based server.30 For each sequence, this server predicted the five most provable structures, the best of which, having the lowest energy model, was chosen for further analysis. To validate the binding of identified epitope and HLA molecule, we considered the homology modeling as there is no relevant structure available in the protein data bank. We selected homology modeling using the most popular online protein fold recognition server, Phyre2,31 to generate the three-dimensional structure of HLA-A*32:1532 (accession id: AM422702). Then, ModRefiner33 was used to minimize and correct the hypothetical structure. The validation of the predicted structure was done using PROCHECK,34 verify 3D,35 ERRAT,36 PROVE,37 and QMEAN.38

Docking analysis

Molecular docking analysis was performed using AutoDock Vina,39 by considering HLA molecule as a protein and identified epitopes as ligands. First, we used the protein preparation wizard of UCSF Chimera40 to prepare the hypothetical protein for docking analysis by adding hydrogens and Gasteiger–Marsili charges.41 The prepared file was then converted into pdbqt format. The parameters used for the docking simulation were set to default. The size of the grid box in AutoDock Vina was kept at 36.3095, 54.3374, and 48.025, respectively, for X, Y, and Z. The energy range was kept at 4, according to the default setting. AutoDock Vina was implemented via the shell script offered by AutoDock Vina developers. Docking results were observed by negative score in kcal/mol, as binding affinity of ligands.39

Binding energy estimation and molecular dynamics (MD) simulation

The binding free energy of HLA-epitope complexes were calculated by using MM (CHARMm)42 – Generalized Born Surface Area (GBSA) and Poisson–Boltzmann Surface Area (PBSA) protocols, implemented in Accelrys Discovery Studio 2.5. Using implicit solvent models of GBSA and PBSA, the binding free energy (ΔGbind) for each epitope was calculated by maintaining salt concentration of 0.15 M. Default value was set for conformational entropy and ligand minimization. The distance cutoff value was set to 14.0 Å. The binding energy was calculated by using following equation: The entire dynamics simulation study for the HLA-epitope complex was accomplished in YASARA Dynamics software. Prior to simulation, the complex was cleaned and optimized the hydrogen bond network.43 After that, a cubic simulation cell was created with a periodic boundary condition, and the atoms of the complex were typed using the AMBER1444 force field. The pKa (acid dissociation constant) values of protein titratable amino acids were calculated and solvated the simulation box using the transferable intermolecular potential3 points (TIP3P) water model (density: 0.997 g/L−1). The system consistent with 46406 atoms was energy minimized using the steepest gradient approach (5000 cycles) followed by simulated annealing method. Restrained and unrestrained all-atom molecular dynamics simulation were performed in solvent using the PME method to describe long-range electrostatic interactions at a cut off distance of 8 Å at physiological conditions (298 K, pH 7.4, 0.9% NaCl).45 A multiple time step algorithm together with a simulation time step interval of 2.50 fs was chosen.46 Molecular dynamics simulations of 100 ns long were performed at constant temperature using a Berendsen thermostat and constant pressure. The MD trajectories were saved every 250 ps for analysis. The trajectories generated from the simulation were analyzed for the stability by various evaluative measures viz. RMSD, RMSF (RMS fluctuations), and initial and final protein backbone comparisons using YASARA structure built in macros and VMD software.

Identification of the B-cell epitope

To detect B-cell epitope, various tools from IEDB were used to identify the B-cell antigenicity, together with the Emini surface accessibility prediction,47 Kolaskar and Tongaonkar antigenicity scale,48 Karplus and Schulz flexibility prediction,49 and Bepipred linear epitope prediction analysis.50 Since antigenic parts of a protein belong to the beta turn regions,51 the Chou and Fasman beta turn prediction tool52 was also used.

Results

Evolutionary analysis of the GP sequences

A total of 46 GP sequences from the different variants of EBOV were collected from the UniProtKB database. Multiple sequence alignment analysis was then performed, and a phylogenetic tree (Figure 2) was constructed thereby. Using the unweighted pair-group method with arithmetic mean, a phylogram was constructed using the bootstrap with 1,000 replications in MEGA6.53 From the multiple sequence alignment analysis, it is clearly seen that protein sequences that isolated from various strains were having a close relationship. Also, from the multiple comparison result, the selected sequences of EBOV of the same subtype have 78%–99% similarity. This result also confers the possibilities of mutation in glycoprotein of all strains, which demonstrates a good agreement with the results from Veljkovic et al.54

Figure 2

Evolutionary divergence analysis of available glycoproteins of different strains of EBOV; results are represented in a phylogenetic tree.

Abbreviations: EBOV, Ebola virus; GP, glycoprotein; sGP, soluble GP.

Antigenic protein prediction

Protein sequences in this study were considered to screen out using VaxiJen web server for the identification of potent antigenic protein. As a corollary, UniProtKB id: Q9YMG2 was identified as the most potent antigenic protein having a maximum total prediction score of 0.5390. Here the threshold of 0.5 is considered as the potent antigenicity.55 This sequence was used for further analysis.

Prediction of potent T-cell epitope

On the basis of the high combinatorial score, the five best epitopes were predicted by the NetCTL server from the selected protein sequence in a preselected environment. The identified epitopes are represented in Table 1. In combination with several methods such as proteasomal cleavage/transporter associated with antigen processing (TAP)/MHC-I combined predictor, MHC-I processing of the NetCTL server calculates an overall score for each peptide’s intrinsic potential from a protein for the designing of T-cell epitope. Peptides with a higher score represent higher processing capabilities.

Table 1

Five most potent T-cell epitopes, according to the overall score predicted by the NetCTL server

Number	Epitopes	Overall score (nM)
1	ATEDPSSGY	2.8069
2	TEDPSSGYY	1.5503
3	LFEVDNLTY	1.1224
4	HKEGAFFLY	1.0887
5	LLQLNETIY	0.9753

The five T-cell epitopes were subjected to MHC-I binding prediction, using the stabilized matrix base method. The epitopes that elicited higher affinity (IC50 <200 nM) were subjected to afterward analysis (Table 2). Notably, proteins are transformed into peptides by proteasome complex, which cleaved the peptide bonds. By combining with Class I MHC molecules, these peptides were deported to the cell membrane, where they were introduced to T helper cells.

Table 2

Interaction, binding, and conservancy of identified T-cell epitopes

Epitope	Interacting MHC-I allele (proteasome score, TAP score, MHC-I score, processing score)	Epitope conservancy (%)
ATEDPSSGY	HLA-B15:02 (0.59) HLA-A26:02 (0.59), HLA-A32:07 (1.03), HLA-B15:17 (1.05), HLA-C03:03 (1.06), HLA-C05:01 (1.21), HLA-C12:03 (1.23), HLA-A68:23 (1.93)	66.23
TEDPSSGYY	HLA-B40:13 (0.52), HLA-B15:02 (0.64), HLA-A32:15 (0.69), HLA-A68:23 (0.7), HLA-A32:07 (0.77), HLA-C05:01 (0.87), HLA-B27:20 (1.21), HLA-C12:03 (1.52)	66.23
LFEVDNLTY	HLA-B40:13 (0.6), HLA-A32:15 (0.63), HLA-C03:03 (0.67), HLA-A29:02 (0.8), HLA-C14:02 (0.9), HLA-A68:23 (1.11), HLA-B27:20 (1.11), HLA-C12:03 (1.36), HLA-A*32:07 (1.66)	66.28
HKEGAFFLY	HLA-B15:02 (0.23), HLA-C03:03 (0.31), HLA-A32:15 (0.72), HLA-B15:03 (0.73), HLA-B27:20 (0.91), HLA-C12:03 (1.08), HLA-A68:23 (1.09), HLA-B40:13 (1.1), HLA-A*32:07 (1.22)	83.85
LLQLNETIY	HLA-B15:02 (0.45), HLA-B15:03 (0.7), HLA-C03:03 (0.71), HLA-C14:02 (0.71), HLA-A32:15 (0.74), HLA-A68:23 (1.11), HLA-C12:03 (1.17), HLA-A32:07 (1.26), HLA-B*27:20 (1.41)	60.38

Notes: MHC-I alleles that have an interacting affinity lower than 200 nm are represented, and total processing scores are shown as enclosed numbers.

Abbreviations: HLA, human leukocyte antigen; MHC-I, major histocompatibility complex I.

As shown in Table 2, a 9-mer T-cell epitope HKEGAFFLY was detected to interact with most MHC-I alleles, including HLA-B*15:02; HLA-C*03:03; HLA-A*32:15; HLA-B*15:03; HLA-B*27:20; HLA-C*12:03; HLA-A*68:23; HLA-B*40:13; and HLA-A*32:07 with higher affinity, among the five T-cell epitopes. Furthermore, this epitope retained the highest conservancy of 83.85%, according to the IEDB conservancy analysis, as tabulated in Table 2. As population coverage in vaccine design generally plays a crucial role, it was calculated in this study.

Population coverage and allergenicity analysis

The cumulative percentage of population coverage was obtained for the predicted epitope HKEGAFFLY. As shown in Table 3, the population coverage for East Africa was found to be 66.98%; in West and North Africa, it was 69.50% and 63.89%, respectively; and for Central Africa it was observed to be 75.93%. The population coverage was recorded at 55.88% for the East Asian region, which was a major hotspot for viral infection. For North America, the population coverage was found to be 58.69%. In current vaccine design pipeline, allergenicity is considered the most prominent barrier in vaccine designing, since most vaccines convert the immune system into an “allergic” reaction56 by inducting Type 2 T helper cells and immunoglobulin E. That is why we predicted allergenicity of the selected epitope by the AllerHunter web server, where the probability is >0.06. The epitope HKEGAFFLY was scored 0.00 (sensitivity =91.6%, specificity =89.3%), and was thus considered a nonallergen, according to the Food and Agriculture Organization/World Health Organization evaluation system of allergenicity prediction.

Table 3

Analysis of the population coverage for the proposed epitope against EBOV

Population	Coverage (%)a	Average hitb	PC90c
East Africa	66.98	1.00	0.23
West Africa	69.50	0.89	0.25
North Africa	63.89	1.13	026
Central Africa	75.93	1.04	0.23
North America	58.69	0.98	0.21
East Asia	55.88	2.11	0.29
Europe	67.57	1.00	0.23
Southeast Asia	63.00	1.45	0.30

Notes:

Projected population coverage.

Average number of epitope hits/HLA combinations recognized by the population.

Minimum number of epitope hits/HLA combinations recognized by 90% of the population.

Abbreviations: EBOV, Ebola virus; HLA, human leukocyte antigen.

Validation of predicted T-cell epitope

As described in the “Materials and methods” section, the hypothetical structure of HLA-A*32:15 protein was generated using the homology technique. The structure was then analyzed through various web-based protein validation software. As shown in Figure 3, the Ramachandran plot generated by the PROCHECK34 server showed that about 98.9% of the residues of protein are located in the most favored region, as against 0% in the outlier region and 1.1% in the generously allowed region. It should be noted that the protein model having >90% of the residues in the core and allowed regions can be considered a high-quality model.57 The hypothetical model was further analyzed using ERRAT and Verify 3D.58 For a good model, structure should retain an ERRAT score >80.00, against which the model in this study obtained an ERRAT score of 89.859.55 Verify 3D graph indicates that 100.00% of residues of this model had an averaged 3D-1D score of 0.2, which is good.59 Along with the QMEAN analysis, the protein model in our interest resulted in a Z-score of −1.33, and the total score was 0.636. This value denotes a higher quality of the model, where the acceptable score ranges between 0 and 1 (Figure 3B).38 On the basis of the results obtained from the aforementioned structural validation programs, the model (Figure 3C) showed much reliability and was considered for further study.

Figure 3

Evaluation of structural superiority by (A) Ramachandran plot, (B) QMEAN assessment, and (C) three-dimensional structure of final model of HLA-A*32:15.

Abbreviations: PDB, Protein Data Bank; HLA, human leukocyte antigen; SSE, secondary structure element; ACC, accessibility.

Molecular docking simulation revealed that the proposed epitopes bound in the cleft of the HLA-A*32:15 (Figure S2), where the highest binding affinity was −7.6 kcal/mol (Table S2, observed for the HKEGAFFLY epitope). The Chimera40 program was used to visualize the interactions of docked HLA-A–epitope complexes, as shown in Figures 4 and S2. Then, binding energy calculation was carried out to understand the binding of HLA with epitopes. Here the binding free energies of MM-GBSA and MM-PBSA are approximate free energies of binding, so a more negative value denotes stronger binding. From MM-GBSA analysis, the highest binding free energy was observed for HLA-A*32:15 with epitope (HKEGAFFLY) of –63.89 kj/mol (Table S1). On the contrary, the lowest binding free energy was obtained for ATEDPSSGY epitope, i.e. –44.86 kj/mol. In contrast of MM-GBSA, the HKEGAFFLY epitope was also resulted the highest binding energy of –38.48 kj/mol, while the lowest binding free energy was seen for TEDPSS-GYY epitope, –20.98 kj/mol. Since HKEGAFFLY epitope obtained the highest docking affinity and binding free energy, its complex subjected for molecular dynamics simulation.

Figure 4

Molecular interaction analysis of HLA-A*32:15 and epitope HKEGAFFLY complex, where the epitope HKEGAFFLY binds in the groove of the HLA-A*32:15, generated by Autodock Vina.

Notes: Here, blue color represents the corresponding distances of hydrogen bonding, the red stick model defines the epitope.

Abbreviation: HLA, human leukocyte antigen.

The 100 ns MD simulation of HLA-epitope (HLA-A*32:15-HKEGAFFLY) complex was carried out using AMBER14 force field, following the energy minimization protocol. The stability of the HLA-epitope complex by means of RMSD was calculated and rendered in Figure 5A. From the results, it is revealed that the HLA molecule was stabilized after 5 ns simulation and tended to remain in plateau phase thereafter for rest of the period. The RMSD value of HLA was observed to grow up quickly from 0.48 Å to 2.214 Å and remained stable in the range from 2.0 Å to 3 Å. In case of epitope, similar RMSD pattern was observed, where the order of magnitude was seen to fluctuate in some range. The average energy of the simulation was –578125.270 kj/mol; the average Coulombic charge and van der Waals interactions was –694749.662 kj/mol, 77122.511 kj/mol, respectively. We also calculated the contribution of each residue for both HLA and epitope in the simulation, in terms of RMSF and RMSD. As seen in Figure 6A, highest RMSD was observed for ARG residue at the position of 180 in HLA, while lowest RMSD observed for CYS100. However, this residue was also resulted highest RMSF value of 7.181 Å, while the rests of the residues were in lowest fluctuation. In case of epitope, the histidine residue at the first position and the tyrosine residue in 9th position were seen to be very much flexible, as these residues were resulted with highest RMSD and RMSF (Figure 6B). In the meanwhile, we calculated the number of hydrogen bond formed between the epitope and HLA molecule during the simulation. The results represented in Figure 5B, showed that hydrogen bond at initial stage was 236, and the range decreased to 160. During the simulation, the number of hydrogen bond was at a range of 160–210, which indicates the strong binding of epitope-HLA complex. Hence, all analyses lead to the conclusion that HKEGAFFLY is one of the most prominent T-cell epitopes for GP based designing of vaccine.

Figure 5

Time evolutions of (A) the backbone RMSD and (B) the number of hydrogen bonds formed between the epitope-HLA (HLA-A*32:15) complex in binding position. The results are presented with respect to the time (ps) of MD simulation, demonstrating the mechanical stability and flexibility of the complex.

Abbreviations: HLA, human leukocyte antigen; MD, molecular dynamics; RMSD, root mean square deviation.

Figure 6

(A) Plot showing the RMSF (black) and RMSD (red) values of C-alpha atoms from of MD simulation of HLA protein. (B) Plot showing the RMSF (black) and RMSD (red) values of C-alpha atoms from of MD simulation of 9-mer epitope. Here, Y axis represented the RMSD and RMSF values in molecular distance unit, ie, Angstrom (A), while X axis demonstrated the position of residue.

Abbreviations: RMSF, root mean square function; RMSD, root mean square deviation; MD, molecular dynamics; HLA, human leukocyte antigen.

B-cell epitope identification

For the identification of potential B-cell epitopes, amino acid scale base methods have been used in this study. Consistent with this protocol, we used diverse investigation processes for the calculation of an incessant B-cell epitope. According to the analysis of Kolaskar and Tongaonkar’s48 antigenicity prediction method, the average antigenicity was 1.028, while 1.225 and 0.894 were the maximum and minimum, respectively. The Kolaskar and Tongaonkar48 antigenicity prediction uses a semiempirical method to predict antigenicity on the basis of physico-chemical properties of the residues in a protein and their diversity in experimentally known epitopes, where values >1.00 were considered to denote a potential antigen. As summarized in Table 4 and Figure S3, 14 epitopes have been found to satisfy the threshold value, and also have the potentiality to express the B-cell response. Furthermore, the surface accessibility of the protein was also analyzed using the Emini surface accessibility prediction methods, since a potent B-cell epitope should be accessible through the surface.47 As shown in Figure S4 and Table 5, higher accessibility was found in regions 9–17 and 186–223 amino acid residues. Figure S5 represents the β-turns region identified by Chou and Fasman β-turn methods.52 According to the result, the region from 200 to 220 (in the region of 200–220 and 105–150) is regarded as β-turns as well as hydrophilic in nature. These are two properties required to be a potent B-cell epitope.60 Experimentally, antigenicity is related to the protein flexibility.61 That is why we implemented the Karplus and Schulz flexibility prediction method, where it was evident that the regions of 255–280 and 200–220 were regarded as the most flexible (Figure S6). Finally, based on the Hidden Markov model, the Bepipred linear epitope prediction tool was utilized to predict linear B-cell epitopes. The predicted result is rendered and tabulated in Figure S7 and Table 6. Hence, by comparing the foregoing results, the peptide sequences ranging from 186 to 220 are able to provoke the immune response as B-cell epitope for GP-based designing of vaccine.

Table 4

Prediction of antigenic region of the protein by Kolaskar and Tongaonkar antigenicity prediction method

Number	Start	End	Peptide	Length
1	4	11	TGILQLPR	8
2	17	56	TSFFLWVIILFQRTFSIPLGVIHNSTLQVSDVDKLVCRDK	40
3	63	69	LRSVGLN	7
4	76	82	ATDVPSA	7
5	89	99	RSGVPPKVVNY	11
6	118	126	GSECLPAAP	9
7	132	154	FPRCRYVHKVSGTGPCAGDFAFH	23
8	156	172	EGAFFLYDRLASTVIYR	17
9	177	189	AEGVVAFLILPQA	13
10	194	202	FSSHPLREP	9
11	211	221	SGYYSTTIRYQ	11
12	233	247	LFEVDNLTYVQLESR	15
13	249	259	TPQFLLQLNET	11
14	274	280	IWKVNPE	7

Table 5

Prediction of the Emini surface accessibility of the protein under study

Number	Start	End	Peptide	Length
1	9	17	LPRDRFKRT	9
2	56	63	KLSSTNQL	8
3	81	87	SATKRWG	7
4	111	117	LEIKKPD	7
5	188	193	QAKKDF	6
6	197	221	HPLREPVNATEDPSSGYYSTTIRYQ	25
7	227	232	TNETEY	6
8	244	249	LESRFT	6
9	263	270	SGKRSNTT	8

Table 6

Identification of the Bepipred linear epitope from the protein sequence

Number	Start	End	Peptide	Length
1	14	14	F	1
2	57	59	LSS	3
3	73	106	NGVATDVPSATKRWGFRSGVPPKVVNYEAGEWAE	34
4	114	131	KKPDGSECLPAAPDGIRG	18
5	141	148	VSGTGPCA	8
6	175	176	TF	2
7	191	193	KDF	3
8	198	215	PLREPVNATEDPSSGYYS	18
9	223	229	TGFGTNE	7
10	261	270	YTSGKRSNTT	10
11	279	285	PEIDTTI	7

Discussion

In recent trends, the primary focus of vaccine development is very much rely on GPs, as they are involved in cell attachment, fusion and entry as well as assist in invasion; and thus plays the role of pathogenesis of disease. The central role of GP of EBOV therefore makes an appropriate antigenic target for vaccine development; as a result, several reports have been published on this perspective.62–66 However, the information representing the population coverage in the worldwide are still limited. In such case, computational based epitiope screening is very much efficient in context of HLA class I molecules,67 and also much safe, high specificity and cost effective. Therefore, this study incorporated various immunoinformatics and molecular modelling tools to identify potential epitopes present in EBOV GPs. Initially, a set of 46 glycoprotein sequences from the different strains of EBOV has been subjected to perform multiple sequence alignment. Previous GP sequences analysis of different strains of each EBOV species revealed a high degree of sequence similarity,68,69 and thereby, it is believed that targeting GP from old strain could provide strong and cross reactive immunity against the new strain and previous outbreaks in 2014.70 Interestingly, in our molecular analysis, we have found ~98–99% conservation for the amino acid sequences of different strains within the species, which confers the degree of similarity and support the previous analysis. From this set of GPs, the most antigenic protein sequence was determined by Vaxijen server. Based on auto cross covariance (ACC), the Vaxijen server transform the protein sequence into uniform vectors of physicochemical properties of proteins. With 91% sensitive, 82% accuracy and 72 specificity, the l00-CV (leave one – out cross validation) was used to identify antigenicity of protein for viral species.71 The resultant antigenic protein (VaxiJen score ≥0.5) was then subjected for various immunoinformatics analysis, followed by IEDB web server. At the beginning five potent 9-mer epitopes have been predicted from NetCTL 1.2 server and selected for further study. Using the threshold of 0.5, the NetCTL 1.2 server predicts maximum number of epitopes without compromising the specificity or sensitivity levels, covering all 12 MHC class I supertypes.23 The five most potent epitopes are represented in Table 1, and the scores are the predicted MHC class I affinities in the form of –logIC50 and IC50 value. For MHC-I binding prediction, peptides with IC50 values <50 nM are considered high affinity, <500 nM for intermediate affinity, and <5000 nM for low affinity. Therefore, we selected maximum alleles having binding affinity <200 nM.72 It is advocated that T-cell epitope binding to specific multiple HLA supertypes are termed as promiscuous in vaccine design, since they effectively increase the coverage of higher proportions of human populations.73,74 According to the results, both HKEGAFFLY and LFEVDNLTY bind to the highest number of alleles. However, HKEGAFFLY represents highest conservancy and was hence considered as epitope of choice. We also validated each epitope by molecular docking simulation and MM-GBSA/MM-PBSA studies with HLA-A*32:15 protein, as it was found common in the results from MHC-I binding interaction analysis. Prior of docking simulation, the three dimensional structure of HLA molecule was prepared by using the Phyre2, followed by intensive mood. As a result, 179 residues (99%) of HLA-A*32:15 modelled at >90% accuracy. In these study, 2BCK_Chain A, crystal structure of HLA-A*2402 showed the highest similarity of 90% (Figure S1). The selected model has been chosen from the twenty models generated by Phyre2, on the basis of similarity and confidence level. Phyre2 is one of the best protein prediction servers that allows remote fold reorganization and homology detection. Using hidden Markov model (HMM), this server predicts the structure of given protein sequence by constructing backbone, loop modelling and adding side chains.75 However in intensive mode, additional ab initio approach is used for reconstruction of missing region, backbone and side chain.75 In docking simulation, among the other epitopes, HKEGAFFLY obtained highest binding affinity (Table S1, supplementary material). In addition, MM-PBSA and MM-GBSA techniques are frequently to re-rank docking poses from molecular docking study, as they achieve a much better performance than docking scoring functions.76 Nevertheless, the success rate of the absolute binding free energy prediction strongly depends on the systems.77 Thence, we used both of these solvation models to predict binding energy more accurately, where the results from MM-PBSA examine the accuracy and reliability of the results from MM-GBSA. Results of binding energy calculation are shown in Table S1. The relative magnitude of binding free energy obtained from GB methods is found to be consistent with those calculated using PB method, despite of the differences in the absolute value of salvation energy. As a corollary, these results also demonstrated the consistence with relative stabilities of HLA-epitope complexes. In previous published reports regarding in silico epitope identification of EBOV,78–81 the studies are limited to sequence-based scoring function techniques and some extend to docking simulation. These techniques have certain limitations, though these are very useful.82 In docking simulation of peptide and protein, it faces problem like peptide’s flexibility.83,84 Whereas, energy based approach like molecular mechanics and interaction energy scoring can add valuable information to sequence based results.85,86 Therefore, we have performed MD simulation study of 100 ns long to enhance the predictive power of the peptide affinity calculations to MHC molecule. In molecular dynamics simulation, both epitope and HLA protein were seen to achieve equilibration, while different fluctuations of RMSD were seen by the time evolution. Higher RMSD values of epitope indicate the flexibility in binding with HLA molecule, during the simulation. These results were further confirmed by the analysis of per residue contributions in dynamics simulation by means of RMSF and RMSD. Low values of RMSF indicate the core region of the HLA protein was stable, while high values of RMSD demonstrated the motion of the protein during the simulation. In like manner, the RMSD and RMSF profiles of eptiope confirm the synergic conformation changes to accommodate the binding pocket of HLA. The hydrogen bond occupancy analysis between the HLA-epitope further confirmed the stability of the complex during the simulation. Overall, these results evidently demonstrate that both HLA and epitope have remarkable conformation changes to facilitate the binding and formed stable complex in thermodynamic environment (Figure 7).

Figure 7

Conformational illustration of HLA-epitope complex. Here, cartoon structure of green color represents the starting confirmation the complex, while red color represents the confirmation of last step in 100 ns long MD simulation.

Abbreviations: MD, molecular dynamics; HLA, human leukocyte antigen.

It is one of important factors in vaccine design that the distribution of HLA varies according to the diverse ethnic groups and geographic regions around the world. Therefore, wide range of population coverage must be considered during the designing of an effective design. According to the results from population coverage analysis, the epitope HKEGAFFLY showed wide range of population coverage in different regions of the world (Table 3), where the highest coverage was observed Central Africa; one of the most EBOV infected areas. This result indicates that it will specifically bind with the prevalent HLA molecules in the target population, where the vaccine will be employed. In other aspects, the B-cell epitope stimulates minimal immune unity, which is very much strong enough to elicit a potent humoral immune response, causing no harmful side effects to human body. Thereby, we are also calculated and found that the sequences ranging from 186–220 as a B-cell epitope, by taking consideration of amino acid property, hydrophilicity, accessibility, flexibility, turns, exposed surface, polarity and antigenic propensity. This study could provide a solid base for vaccine design.

Conclusion

In recent years, most vaccines have been developed based on B-cell immunity; however, the current strategy relies mostly on T-cell epitope owing to long-lasting immunity. Both B-cell and T-cell epitopes are offered in this study for stimulating immunity in several ways. The resulting peptides showed B-cell and T-cell selectivity, better conservancy, population coverage, and significant interaction with MHC-1 allele with good affinity. Above all, the predicted epitopes are anticipated to offer long-term and high protective immunity against EBOV. Multiple sequence alignment and secondary structure predictions of query sequence (HLA*A15:02) and template sequence (PDB: 2BCK). Notes: Green color represents alpha helix, whereas beta sheet is represented by blue color. Abbreviations: HLA, human leukocyte antigen; PDB, Protein Data Bank. Visualization of the best docked HLA–epitope complexes obtained from molecular docking analysis. The complexes are representing the interaction of HLA (HLA*A15:02) with (A) ATEDPSSGY, (B) TEDPSSGYY, (C) LFEVDNLTY, and (D) LLQLNETIY epitope. Abbreviation: HLA, human leukocyte antigen. Kolaskar and Tongaonkar antigenicity prediction of the most antigenic protein. Notes: Here, the x-axis and y-axis represent sequence position and antigenic propensity, respectively. The threshold value is 1.000. The regions above the threshold are antigenic, shown in yellow while, green color reflects the polypeptide regions that could not satisfy the threshold margin. Emini surface accessibility prediction of the most antigenic protein. Notes: Here, the x-axis and y-axis represent sequence position and antigenic propensity, respectively. The threshold value is 1.000. The regions above the threshold are antigenic, shown in yellow while, green color reflects the polypeptide regions that could not satisfy the threshold margin. Chou and Fasman β-turn prediction of the most antigenic protein. Notes: Here, the x-axis and y-axis represent sequence position and antigenic propensity, respectively. The threshold value is 1.000. The regions above the threshold are antigenic, shown in yellow while, green color reflects the polypeptide regions that could not satisfy the threshold margin. Karplus and Schulz flexibility prediction. Notes: Here, x-axis and y-axis represent position and score, respectively. The threshold is 1.00. The flexible regions are shown in yellow while, green color reflects the polypeptide regions that could not satisfy the threshold margin. Bepipred linear epitope prediction of the most antigenic protein. Notes: Here, x-axis and y-axis represent position and score, respectively. The threshold is 0.35. The flexible regions are shown in yellow while, green color reflects the polypeptide regions that could not satisfy the threshold margin. Results of binding affinity and binding energy calculations of the selected epitopes

Table S1

Results of binding affinity and binding energy calculations of the selected epitopes

Epitope	Binding affinity (kcal/mol)	(ΔG) MM–PBSA	(ΔG) MM–GBSA
ATEDPSSGY	−6.8	−21.89	−44.86
TEDPSSGYY	−6.5	−20.98	−57.63
LFEVDNLTY	−6.3	−33.58	−55.96
HKEGAFFLY	−7.6	−38.48	−63.89
LLQLNETIY	−7.0	−37.25	−48.63

83 in total

Review 1. More than one reason to rethink the use of peptides in vaccine design.

Authors: Anthony W Purcell; James McCluskey; Jamie Rossjohn
Journal: Nat Rev Drug Discov Date: 2007-05 Impact factor: 84.694

2. Protein structure prediction on the Web: a case study using the Phyre server.

Authors: Lawrence A Kelley; Michael J E Sternberg
Journal: Nat Protoc Date: 2009 Impact factor: 13.491

3. An Adenovirus Vaccine Expressing Ebola Virus Variant Makona Glycoprotein Is Efficacious in Guinea Pigs and Nonhuman Primates.

Authors: Shipo Wu; Andrea Kroeker; Gary Wong; Shihua He; Lihua Hou; Jonathan Audet; Haiyan Wei; Zhe Zhang; Lisa Fernando; Geoff Soule; Kaylie Tran; Shengli Bi; Tao Zhu; Xuefeng Yu; Wei Chen; Xiangguo Qiu
Journal: J Infect Dis Date: 2016-08-04 Impact factor: 5.226

4. Rational Design, Synthesis, and Biological Evaluation of 7-Azaindole Derivatives as Potent Focused Multi-Targeted Kinase Inhibitors.

Authors: Bénédicte Daydé-Cazals; Bénédicte Fauvel; Mathilde Singer; Clémence Feneyrolles; Benoit Bestgen; Fanny Gassiot; Aurélia Spenlinhauer; Pierre Warnault; Nathalie Van Hijfte; Nozha Borjini; Gwénaël Chevé; Abdelaziz Yasri
Journal: J Med Chem Date: 2016-04-15 Impact factor: 7.446

Review 5. The multiple roles of sGP in Ebola pathogenesis.

Authors: Marc-Antoine de La Vega; Gary Wong; Gary P Kobinger; Xiangguo Qiu
Journal: Viral Immunol Date: 2015-02 Impact factor: 2.257

Review 6. Computations of standard binding free energies with molecular dynamics simulations.

Authors: Yuqing Deng; Benoît Roux
Journal: J Phys Chem B Date: 2009-02-26 Impact factor: 2.991

7. Induction of hepatitis A virus-neutralizing antibody by a virus-specific synthetic peptide.

Authors: E A Emini; J V Hughes; D S Perlow; J Boger
Journal: J Virol Date: 1985-09 Impact factor: 5.103

Review 8. Conserved epitopes of influenza A virus inducing protective immunity and their prospects for universal vaccine development.

Authors: Zuzana Staneková; Eva Varečková
Journal: Virol J Date: 2010-11-30 Impact factor: 4.099

9. Identification of Novel Potential Vaccine Candidates against Tuberculosis Based on Reverse Vaccinology.

Authors: Gloria P Monterrubio-López; Jorge A González-Y-Merchand; Rosa María Ribas-Aparicio
Journal: Biomed Res Int Date: 2015-04-15 Impact factor: 3.411

10. Ebola virus RNA editing depends on the primary editing site sequence and an upstream secondary structure.

Authors: Masfique Mehedi; Thomas Hoenen; Shelly Robertson; Stacy Ricklefs; Michael A Dolan; Travis Taylor; Darryl Falzarano; Hideki Ebihara; Stephen F Porcella; Heinz Feldmann
Journal: PLoS Pathog Date: 2013-10-17 Impact factor: 6.823

23 in total

1. Current progress of immunoinformatics approach harnessed for cellular- and antibody-dependent vaccine design.

Authors: Ada Kazi; Candy Chuah; Abu Bakar Abdul Majeed; Chiuan Herng Leow; Boon Huat Lim; Chiuan Yee Leow
Journal: Pathog Glob Health Date: 2018-03-12 Impact factor: 2.894

2. Enhanced In Vitro and In Vivo Potency of a T Cell Epitope in the Ebola Virus Glycoprotein Following Amino Acid Replacement at HLA-A*02:01 Binding Positions.

Authors: Sylvie Chabot; Yusra Gimie; Karam Obeid; Jaekwan Kim; Clement A Meseda; Krishnamurthy Konduru; Gerardo Kaplan; Li Sheng Fowler; Jerry P Weir; Keith Peden; Marian E Major
Journal: J Virol Date: 2022-09-07 Impact factor: 6.549

3. Mapping HLA-A2, -A3 and -B7 supertype-restricted T-cell epitopes in the ebolavirus proteome.

Authors: Wan Ching Lim; Asif M Khan
Journal: BMC Genomics Date: 2018-01-19 Impact factor: 3.969

Review 4. Epitope Prediction by Novel Immunoinformatics Approach: A State-of-the-art Review.

Authors: Ehsan Raoufi; Maryam Hemmati; Samane Eftekhari; Kamal Khaksaran; Zahra Mahmodi; Mohammad M Farajollahi; Monireh Mohsenzadegan
Journal: Int J Pept Res Ther Date: 2019-08-20 Impact factor: 1.931

5. Vaccine design of coronavirus spike (S) glycoprotein in chicken: immunoinformatics and computational approaches.

Authors: Eman A Awadelkareem; Sumaia A Ali
Journal: Transl Med Commun Date: 2020-08-27

6. Metal based donepezil analogues designed to inhibit human acetylcholinesterase for Alzheimer's disease.

Authors: Md Junaid; Nazrul Islam; Md Kamal Hossain; M Obayed Ullah; Mohammad A Halim
Journal: PLoS One Date: 2019-02-20 Impact factor: 3.240

7. Conserved B and T cell epitopes prediction of ebola virus glycoprotein for vaccine development: An immuno-informatics approach.

Authors: Bilal Ahmad; Usman Ali Ashfaq; Mahmood-Ur Rahman; Muhammad Shareef Masoud; Muhammad Zubair Yousaf
Journal: Microb Pathog Date: 2019-05-07 Impact factor: 3.738

8. Subtractive proteomics to identify novel drug targets and reverse vaccinology for the development of chimeric vaccine against Acinetobacter baumannii.

Authors: Vandana Solanki; Vishvanath Tiwari
Journal: Sci Rep Date: 2018-06-13 Impact factor: 4.379

9. Exploring Lassa Virus Proteome to Design a Multi-epitope Vaccine Through Immunoinformatics and Immune Simulation Analyses.

Authors: Sifat Bin Sayed; Zulkar Nain; Md Shakil Ahmed Khan; Faruq Abdulla; Rubaia Tasmin; Utpal Kumar Adhikari
Journal: Int J Pept Res Ther Date: 2020-01-02 Impact factor: 2.191

Review 10. Advances in Designing and Developing Vaccines, Drugs, and Therapies to Counter Ebola Virus.

Authors: Kuldeep Dhama; Kumaragurubaran Karthik; Rekha Khandia; Sandip Chakraborty; Ashok Munjal; Shyma K Latheef; Deepak Kumar; Muthannan Andavar Ramakrishnan; Yashpal Singh Malik; Rajendra Singh; Satya Veer Singh Malik; Raj Kumar Singh; Wanpen Chaicumpa
Journal: Front Immunol Date: 2018-08-10 Impact factor: 7.561