Two highly similar LAEDDTNAQKT and LTDKIGTEI epitopes in G glycoprotein may be useful for effective epitope based vaccine design against pathogenic Henipavirus.

Md Masud Parvege¹, Monzilur Rahman¹, Yead Morshed Nibir¹, Mohammad Shahnoor Hossain².

Abstract

Nipah virus and Hendra virus, two members of the genus Henipavirus, are newly emerging zoonotic pathogens that cause acute respiratory illness and severe encephalitis in humans. The lack of effective antiviral therapy underscores the urgency of developing a vaccine against these deadly viruses. In this study, we employed various computational approaches to identify epitopes that have potential for vaccine development. By analyzing the immune parameters of the conserved sequences of the G glycoprotein using various databases and bioinformatics tools, we identified two potential epitopes which may be used as peptide vaccines. Using different B cell epitope prediction servers, four highly similar B cell epitopes were identified. Immunoinformatics analyses revealed that LAEDDTNAQKT is a highly flexible B cell epitope that is accessible to antibody. Highly similar putative CTL epitopes were analyzed for their binding with the HLA-C 12*03 molecule. Docking simulation revealed that LTDKIGTEI has a significantly lower binding energy, which supports its potential for epitope-based vaccine design. Finally, cytotoxicity analysis also supported their potential as promising epitope-based vaccine candidates. In sum, our computational analysis indicates that either the LAEDDTNAQKT or the LTDKIGTEI epitope holds promise for the development of a universal vaccine against all kinds of pathogenic Henipavirus. Further in vivo and in vitro studies are necessary to validate these findings.

Keywords: Conserved epitopes; G glycoprotein; Henipavirus; Universal vaccine

Substances：
Epitopes
Glycoproteins

Year: 2016 PMID： 26970211 PMCID： PMC7172312 DOI： 10.1016/j.compbiolchem.2016.03.001

Source DB: PubMed Journal: Comput Biol Chem ISSN： 1476-9271 Impact factor: 2.877

Introduction

Henipavirus, a genus of the paramyxovirus family, has been responsible for several epidemics in some regions of Asia and Australia in the last few years (Aljofan, 2013, Field et al., 2007, Luby and Gurley, 2012). Viruses belonging to this group are deadly pathogens and are characterized by their zoonotic nature, broad host range, and ease of transmission from person to person (Mathieu and Horvat, 2015). Among the three established species of the genus Henipavirus, Nipah virus (NiV) and Hendra virus (HeV) cause lung disease and encephalitis in pigs, horses and humans, while the most recent member, Cedar virus (CedPV), is harmless (Luby et al., 2009, Middleton, 2014, Marsh et al., 2012). Pteropid fruit bats and microbats are the common reservoir of Henipavirus (Croser and Marsh, 2016). Upon contact with the bats, domestic animals develop infection and further transmit the virus to humans. Environmental changes, destruction of bat habitats through deforestation, and genetic diversification have been implicated as the reasons behind the emergence and re-emergence of the viruses in recent times (Clayton et al., 2013). Since its first emergence in 1994, HeV outbreaks have been observed at least once a year (Tulsiani et al., 2011). During all outbreak events, horses were infected first and subsequently transmitted the virus to humans. Approximately one hundred horses were affected by the virus, with a fatality rate of 70%. In humans, four out of seven infections were fatal, and all patients showed influenza-like symptoms. While some of them recovered from the viral infection, others developed pneumonitis, encephalitis, or even multi-organ failure (Blum et al., 2009). NiV was identified later, in 1999, during an outbreak in pig farms. The event was responsible for 257 human infections, including 105 confirmed deaths (Field et al., 2001).
Eight more outbreaks of NiV have occurred since then in Bangladesh and India (Chadha et al., 2006). In an outbreak in February 2011 in the northern region of Bangladesh, 17 out of 24 NiV-infected persons died (Chua et al., 2000, Chua et al., 2002, Hughes et al., 2009). The number of outbreaks and the re-emergence of Henipaviruses imply their high potency as pathogens that can evolve further through interplay between the genomes of the virus species during growth and multiplication in a common host. Henipaviruses are pleomorphic in nature and their size ranges from 40 to 600 nm in diameter. The genome of the Henipavirus is a ∼18.2 kb long, non-segmented, single-stranded, negative-sense RNA, which encodes six genes corresponding to six structural proteins (Wang et al., 2001). The core of the virus is composed of RNA, which is tightly bound to the nucleocapsid (N) protein and associated with the large (L) protein and the phosphoprotein (P). The F (fusion) and G (attachment) proteins are embedded within the lipid membrane. The role of the G glycoprotein is to attach the virus to the surface of a host cell through a highly conserved mammalian protein, EFN B2 (Bonaparte et al., 2005, Negrete et al., 2005). The F protein helps to fuse the viral membrane with the host cell membrane, which allows the release of the virion into the cell. The deadly nature of Henipaviruses becomes more frightening due to the unavailability of any approved vaccine or drug, either to induce protection against the virus or to control viral infection. None of the several attempts made so far has produced a successful treatment option. For example, ribavirin was tested in animal models infected with NiV and HeV, but showed no significant increase in the survival rate of the animals (Gurley et al., 2007, Homaira et al., 2010). Monoclonal antibodies have also been an attractive treatment option.
A neutralizing antibody, mAb m102.4, targeting the domain of the viral glycoprotein that binds to the ephrin-B2 and ephrin-B3 receptors of the host, has already been tested (Bishop et al., 2007, Negrete et al., 2006, Pasquale, 2008). In vivo experiments have shown enhanced survival of animals challenged with HeV and NiV (Plotkin, 1999). In addition to these post-exposure treatment options, some recent studies have successfully induced immunity against Hendra virus by injecting animals with a fragment of the G protein working as a subunit vaccine (Guillaume et al., 2004, Guillaume et al., 2006, Guillaume et al., 2009, Weingartl et al., 2006). Assessment of the vaccine in non-human primates revealed its potential for human use, but it has not been studied in humans so far (Mire et al., 2014). Nevertheless, no universal vaccine has yet been developed that works against all Henipaviruses irrespective of their genetic differences. With the advent of various bioinformatics tools and the availability of enormous sequence data, epitope-based vaccine design against highly conserved antigenic proteins has become easier. Computational prediction of conserved epitopes for the development of novel vaccines will not only save time, but also reduce the cost associated with the whole process. In the present study, we employed various bioinformatics tools to find highly conserved peptides and mapped the evolutionarily conserved epitopes. We predicted two highly conserved epitopes which may be used to develop universal vaccines to prevent all types of Henipavirus infections.

Materials and methods

Protein sequence retrieval

Sequences of the G glycoprotein of Henipaviruses were retrieved from the NCBI database (Benson et al., 2009), and a total of 40 sequences were found in the database. All the sequences were downloaded in FASTA format. Among the 40 sequences, 21 belong to Hendra virus (HeV) and 19 belong to Nipah virus (NiV). The sequences were selected from different isolates, all of them wild type, covering a wide geographical distribution. Their dates of isolation were also taken into account to cover as many past epidemics as possible. The length of the glycoprotein sequences was 604 amino acids for HeV and 602 amino acids for NiV.

Variability analysis of G glycoproteins

To determine the degree of conservation, all the retrieved sequences were aligned using the EBI Clustal Omega program (Sievers et al., 2011) and a multiple sequence alignment (MSA) was generated. The MSA was visualized using Jalview (Waterhouse et al., 2009). The absolute site variability in the MSA was calculated by Shannon entropy analysis (Shannon, 1948) using the Protein Variability Server (PVS) (Garcia-Boronat et al., 2008). PVS offers several variability metrics to compute absolute variation in an MSA.
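As an illustration of the per-site Shannon entropy measure, the following minimal sketch computes the entropy of each column of a small alignment. It is not the PVS implementation: the function names, the gap handling, and the four toy sequences are assumptions made for this example only.

```python
import math
from collections import Counter


def shannon_entropy(column):
    """Shannon entropy (in bits) of one alignment column; gaps ('-') are ignored."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    counts = Counter(residues)
    # H = sum over residues of p * log2(1/p); a fully conserved column gives 0.
    return sum((k / n) * math.log2(n / k) for k in counts.values())


def column_entropies(alignment):
    """Entropy of every column of an MSA given as equal-length strings."""
    return [shannon_entropy(col) for col in zip(*alignment)]


# Toy alignment (made-up sequences, not real G glycoprotein data).
toy_msa = ["LAEDDT", "LAEDDT", "LAQDDT", "LSEDDT"]
entropies = column_entropies(toy_msa)
conserved_sites = [i + 1 for i, h in enumerate(entropies) if h == 0.0]
```

Columns where every sequence carries the same residue have an entropy of zero, and the value grows as more residue types share a position, up to log2(20) ≈ 4.32 bits for 20 equally frequent amino acids.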

Prediction of antigenicity

In order to develop a subunit vaccine, it is imperative to identify those proteins which can induce protection from subsequent challenges. A reference G glycoprotein with accession number NP_047112 was tested for its antigenicity. The VaxiJen v2.0 server (Doytchinova and Flower, 2007) and the Kolaskar & Tongaonkar method (Kolaskar and Tongaonkar, 1990) were used to analyze the antigenic properties of the given sequence. VaxiJen uses an alignment-independent approach to predict the antigenicity of a given protein, based solely on the physicochemical properties of amino acids.

Linear B-cell prediction

B cell epitope prediction can be done with several software packages, each with its own strengths and weaknesses (Blythe and Flower, 2005, Yang and Yu, 2009). To minimize false positive predictions, four popular and widely used tools were utilized. For the prediction of B cell epitopes, the sequence was submitted to the BepiPred (Larsen et al., 2006), IEDB linear B cell prediction (Vita et al., 2010), BCPRED (El-Manzalawy et al., 2008), and ABCpred (Saha and Raghava, 2006) web servers. The BepiPred 1.0 server (Larsen et al., 2006) predicts epitopes based on a hidden Markov model combined with a propensity scale; the threshold for the prediction was set at 0.35. The IEDB linear B cell epitope prediction tool (Vita et al., 2010) was used with default parameters. Non-overlapping, fixed-length epitopes of 14 amino acids were predicted with a 75% specificity criterion using the BCPREDS server (El-Manzalawy et al., 2008). A threshold of 75% and a window length of 10 amino acids were set as parameters for the prediction of epitopes in the ABCpred server (Saha and Raghava, 2006). The predicted epitopes from these four servers were compared manually, and epitopes that were recognized by all the servers were selected for further analysis.

Surface accessible regions prediction

When an antibody or a cell surface receptor binds to an epitope, it induces an immune response in turn. An ideal epitope should therefore be accessible to an antibody or a cell surface receptor (Caoili, 2010). To determine the surface accessible regions, the Emini surface accessibility prediction tool of IEDB was used (Vita et al., 2010, Emini et al., 1985). The accessible regions were then compared with the selected B cell epitopes. Epitopes that were found to be surface accessible were selected for conservancy analysis.

Conservancy analysis of B cell epitope

An ideal epitope should be conserved so that it provides broader protection against multiple strains and, in some cases, even species. Amino acid residues that are crucial for critical functions are believed to have lower variability even under immune pressure. The conservancy of the surface accessible epitopes across all the glycoprotein sequences was analyzed with the IEDB epitope conservancy tool (Bui et al., 2007). For calculating the conservancy score, the sequence identity threshold was set at at least 80 percent.
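The idea of a conservancy score at an identity threshold can be sketched as follows. This is a simplified illustration, not the IEDB tool's algorithm: it assumes ungapped sliding-window comparison, and the function names and the three short protein fragments are made up for the example.

```python
def best_identity(epitope, protein):
    """Highest fraction of identical residues over all ungapped windows of the protein."""
    k = len(epitope)
    best = 0.0
    for i in range(len(protein) - k + 1):
        window = protein[i:i + k]
        matches = sum(a == b for a, b in zip(epitope, window))
        best = max(best, matches / k)
    return best


def conservancy(epitope, proteins, threshold=0.80):
    """Fraction of proteins containing the epitope at or above the identity threshold."""
    hits = sum(best_identity(epitope, p) >= threshold for p in proteins)
    return hits / len(proteins)


epitope = "LAEDDTNAQKT"
# Made-up fragments for illustration only (not real G glycoprotein sequences).
toy_proteins = [
    "MKLAEDDTNAQKTGG",  # exact match
    "MKLAEDDTNAQRSGG",  # 9 of 11 residues identical
    "MKLGQDETNSQRSGG",  # 5 of 11 residues identical
]
score = conservancy(epitope, toy_proteins)
```

With an 11-residue epitope, two substitutions still give 9/11 ≈ 81.8% identity, which is above an 80% threshold, whereas three substitutions (8/11 ≈ 72.7%) would fall below it.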

Prediction of flexibility and hydrophilicity of B cell epitopes

Earlier studies reported that the hydrophilicity and flexibility of a peptide correlate with its antigenicity (Novotny et al., 1986). Therefore, the conserved epitopes were subjected to the Karplus and Schulz (KS) flexibility prediction tool (Karplus and Schulz, 1985) and the Parker hydrophilicity prediction tool (Parker et al., 1986) for flexibility and hydrophilicity analyses, respectively (Tenzer et al., 2005). The KS method uses normalized B-values of Cα atoms in protein structures to predict protein flexibility. Because of its robustness, it has been widely used for the analysis of protein flexibility (Sharmin and Islam, 2014, Islam et al., 2012, Oany et al., 2014).

T cell epitope prediction and conservancy analysis

T cell epitopes were identified by the NetCTL prediction method (Larsen et al., 2005, Larsen et al., 2007) of IEDB. The threshold was set at 0.50, which gives a sensitivity and specificity of 0.89 and 0.94, respectively. NetCTL predicts T cell epitopes by integrating predictions of proteasomal cleavage, TAP transport efficiency, and MHC class I affinity. This combined algorithm provides overall prediction scores and, based on these scores, the top 10 epitopes were taken for further analysis. MHC-I alleles interacting with each of the selected epitopes were determined by the MHC-I prediction server (Peters and Sette, 2005) of IEDB. The stabilized matrix method (SMM) (Peters and Sette, 2005) was utilized to determine the half maximal inhibitory concentration (IC50) of peptide binding to MHC-I alleles. For the binding analysis, all the available MHC class I alleles were selected and the peptide length was set at 9 amino acids. Selected T cell epitopes were subjected to the IEDB conservancy analysis tool with a sequence identity threshold of 80 percent (Bui et al., 2007).
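The bookkeeping around these predictions can be sketched in a few lines: enumerating every overlapping 9-mer of a sequence, and keeping only allele predictions below an IC50 cutoff. This does not reproduce NetCTL or SMM, which run on the IEDB servers. The 200 nM cutoff is the one given in the header of Table 4; the function names, the 13-residue fragment, the allele labels, and the IC50 numbers are placeholders invented for this example.

```python
def nine_mers(sequence, k=9):
    """Yield (start, peptide) for every overlapping k-mer; start is 1-based."""
    for i in range(len(sequence) - k + 1):
        yield i + 1, sequence[i:i + k]


def strong_binders(predictions, cutoff_nm=200.0):
    """Keep (allele, ic50) pairs whose predicted IC50 (nM) is below the cutoff."""
    return [(allele, ic50) for allele, ic50 in predictions if ic50 < cutoff_nm]


# Made-up 13-residue fragment, for illustration only.
toy_sequence = "MLTDKIGTEIGPK"
peptides = list(nine_mers(toy_sequence))

# Placeholder predictions (hypothetical allele labels and IC50 values).
toy_predictions = [("allele_1", 35.0), ("allele_2", 180.0), ("allele_3", 950.0)]
binders = strong_binders(toy_predictions)
```

A sequence of length N yields N − 8 overlapping 9-mers, so a 604-residue protein gives 596 candidate peptides before any scoring is applied.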

Prediction of the 3D structure of conserved T cell epitopes and selected HLA-C 12*03

For the docking simulation assay, 3D structures of both the peptides and the corresponding alleles were required. The 3D structures of the selected peptides were generated using the PEP-FOLD Peptide Structure Prediction server (Thevenet et al., 2012, Maupetit et al., 2009). PEP-FOLD uses a de novo approach to predict peptide structure from amino acid sequences of 9 to 36 residues. The method is based on a hidden Markov model-derived structural alphabet (SA); it couples the predicted series of SA letters to a greedy algorithm and a coarse-grained force field (Thevenet et al., 2012). SA letters describe the conformations of four consecutive residues. The best models provided by the server were chosen for the docking study. HLA-C 12*03 was found to interact with most of the predicted T cell epitopes. For this reason, the 3D model of HLA-C 12*03 was generated using the SWISS-MODEL server (Biasini et al., 2014). The homology-modeling method comprises the following four steps: (i) template selection; (ii) target-template alignment; (iii) model building; and (iv) evaluation (Schwede et al., 2003). These steps can be repeated iteratively until a satisfactory model structure is obtained. The HLA-C 12*03 3D model was evaluated with the PROCHECK software (Laskowski et al., 1993) and the ProSA web tool (Wiederstein and Sippl, 2007). The Ramachandran plot (Laskowski et al., 1993), constructed using PROCHECK, assesses the stereochemical quality of a 3D structure by analyzing residue-by-residue geometry and overall structure geometry. The ProSA tool provides a Z-score, which is a measure of the quality of the model (Wiederstein and Sippl, 2007).

Docking simulation assay of conserved T cell epitopes with the HLA-C 12*03 allele

To investigate the interaction of the T cell epitopes with the corresponding MHC class I molecules, docking analysis was performed. Computer-simulated ligand docking is a powerful technique for evaluating the relative binding affinity of a ligand toward its receptor (Patronov and Doytchinova, 2013). The AutoDock tool from the MGL software package was utilized for docking (Morris et al., 1998, Morris et al., 2009). At first, both the allele (HLA-C 12*03) and ligand (epitope) files were converted into PDBQT format for use in the docking study. The grid box center was set at −15.059, −3.063, and −26.955 Å on the x-, y-, and z-axes, respectively, which allows the predicted epitopes to bind to the binding groove of the selected MHC class I molecule. The box size was set at 20, 40, and 40 Å in the x, y, and z dimensions, respectively. The 3D structure of the MHC class I H-2Kb molecule complexed with the octapeptide KVITFIDL (PKB1) was employed as a positive control for the binding assay. The complex structure was resolved at 2.3 Å by X-ray diffraction. The binding of the octapeptide to the H-2Kb molecule causes a large conformational change that is readily recognized by a T-cell receptor (TCR) (Reiser et al., 2016). This conformational change makes the complex a good platform for assessing the docking simulation assay. First, the octapeptide was separated from H-2Kb, and it was then docked with both H-2Kb and the selected MHC class I molecule using similar parameters. The binding energy of the control peptide was calculated and compared with the energies found for the predicted T cell epitopes. All the analyses were done at 1.0 Å spacing and complexes were visualized using Discovery Studio 4.1 (Anon., 2013).

Toxicity prediction of the peptides

The ease of manufacture, high specificity, and high penetration of peptides make peptide-based therapy promising. However, the toxicity of peptides is a bottleneck for the success of peptide-based therapy. An ideal epitope should have little or no toxicity, but high antigenicity (Vlieghe et al., 2010). Hence, to determine the toxicity of the selected epitopes, the ToxinPred web server was used (Gupta et al., 2013).

Validation of the workflow

Since the workflow used here included various computational tools developed on different platforms, validation of the workflow was required. Both positive and negative controls were used for this purpose. Five linear B cell epitopes have previously been identified and mapped for non-structural protein 1 (NS1) of West Nile virus (Sun et al., 2012). The protein sequence of NS1, as a positive control, was retrieved from the NCBI database and fed into the workflow. In parallel, as a negative control, a random sequence about 600 amino acids long was used to test the functionality of the designed workflow.

Results

Glycoprotein G is conserved in all pathogenic Henipavirus strains

The degree of variability or similarity of a specific protein provides important insights into its evolution, structure, function, and immunology. To determine the degree of conservation, MSA by Clustal Omega (Sievers et al., 2011) and protein variability analysis (Garcia-Boronat et al., 2008) were performed. From the MSA, the G glycoprotein was found to be well conserved in all sequences (additional Fig. 1). The absolute variability computed by PVS (Garcia-Boronat et al., 2008) revealed 11 fully conserved regions, which comprise more than 94% of the length of the G protein (Fig. 1).

Fig. 1

Protein variability index of G protein determined by using PVS server. The conservancy threshold was 1.0 in this analysis. X axis indicates the amino acid position in sequences and Y axis indicates the Shannon entropy.

G protein is highly antigenic

To be a vaccine candidate, a protein must be antigenic enough to provoke a sufficient immune response. Evaluation of the G protein sequence with accession number NP_047112 by the VaxiJen server (Doytchinova and Flower, 2007) identified it as a probable antigen with a value of 0.53. A window size of 7 amino acids was set to determine the antigenicity of the central amino acid for each residue of the G protein. The Kolaskar & Tongaonkar antigenicity prediction tool (Kolaskar and Tongaonkar, 1990) revealed that 450 of the 604 amino acid residues in the protein were above the threshold value of 1.0 (Fig. 2). The maximum and minimum scores were 1.23 and 0.98, for the residues at positions 297 and 77, respectively, with an average of 1.04.

Fig. 2

The G glycoprotein was found to be highly antigenic. The threshold is 1.00 and residues in yellow regions are antigenic in nature. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Nine B cell epitopes were recognized by all the prediction tools

Several B cell epitope prediction software packages are currently available. Each has its own dataset and utilizes a particular method for prediction. Since the epitopes predicted for a given protein differ from one package to another (Blythe and Flower, 2005, Yang and Yu, 2009), perfect identification of immunogenic regions in a given antigen is difficult, and prediction of false positive epitopes is a typical problem (Gao et al., 2012). Hence, we utilized four different software packages for B cell epitope prediction. The numbers of peptides recognized by BepiPred, IEDB, BCPRED and ABCpred were 17, 18, 16 and 19, respectively. Only epitopes recognized by all four prediction tools were taken forward for further analysis. Nine antigenic epitopes of the protein were found to be common to the four prediction tools (Table 1). The locations of these epitopes are 1: 74–80, 2: 163–176, 3: 271–279, 4: 322–332, 5: 342–359, 6: 375–380, 7: 529–534, 8: 551–561, 9: 582–587.

Table 1

Nine B cell epitopes found common in four prediction servers along with their lengths and sequence positions.

No	Sequence	Length	Start	End
1	TRTTDNQ	7	74	80
2	PNPLPFREYRPISQ	14	163	176
3	WTPPNPSTI	9	271	279
4	RPKSDSGDYNQ	11	322	332
5	RGKYDKVMPYGPSGIKQG	18	342	359
6	FQYNDS	6	375	380
7	NQTAEN	6	529	534
8	LAEDDTNAQKT	11	551	561
9	DTGDSV	6	582	587

Thirteen surface accessible peptides were predicted

An ideal epitope should be accessible to an antibody so that the antibody can bind to the epitope to induce an immune response (Caoili, 2010). At a threshold cutoff of 1.0, the surface accessibility of the G protein was determined with the Emini surface accessibility prediction tool (Emini et al., 1985), and 13 peptides were found to have scores above the threshold. These peptides were compared with the selected nine epitopes (Table 2 and Fig. 3). Seven epitopes were found to be surface accessible. These seven epitopes were analyzed for conservancy.

Table 2

Predicted surface accessible antigenic sites by using Emini surface accessibility prediction analysis.

No	Peptide	Length	Sequence position
1	IKNYYG	6	25–30
2	NYTRTTDN	8	72–79
3	KISQST	6	130–135
4	PLPFREYRPI	10	165–174
5	VWTPPNPS	8	270–277
6	TYHEDFY	7	285–291
7	RPKSDSGDYNQK	12	322–333
8	KVERGKYDKV	10	339–348
9	PRTEFQYNDS	10	371–380
10	KYSKAE	6	388–393
11	QASYSW	6	455–460
12	NSNQTAE	7	527–533
13	AEDDTNAQKT	10	552–561
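The comparison between the B cell epitopes of Table 1 and the surface accessible peptides of Table 2 can be approximated with a simple interval-overlap test on the sequence positions. The sketch below is only a first-pass illustration under that assumption; the comparison in this study was done manually, and the function and variable names are invented for the example.

```python
# Start/end positions copied from Table 1 (B cell epitopes) and Table 2 (surface peptides).
epitopes = [(74, 80), (163, 176), (271, 279), (322, 332), (342, 359),
            (375, 380), (529, 534), (551, 561), (582, 587)]
surface = [(25, 30), (72, 79), (130, 135), (165, 174), (270, 277), (285, 291),
           (322, 333), (339, 348), (371, 380), (388, 393), (455, 460),
           (527, 533), (552, 561)]


def overlaps(a, b):
    """True if two closed intervals (start, end) share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def accessible_epitopes(epitopes, surface):
    """1-based indices of epitopes that overlap any surface accessible peptide."""
    return [i + 1 for i, e in enumerate(epitopes)
            if any(overlaps(e, s) for s in surface)]


flagged = accessible_epitopes(epitopes, surface)
```

A bare overlap test of this kind is permissive: it flags every epitope except number 9 (582–587), including epitope 6 (375–380), which does not appear among the seven sequences of Table 3. The manual comparison therefore appears to have applied a criterion beyond positional overlap.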

Fig. 3

Surface accessibility of G protein. The horizontal red line indicates surface accessibility cutoff and yellow color regions above this cutoff are surface accessible epitopes. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Four B cell epitopes are highly similar among all sequences

The use of conserved epitopes provides broader protection across multiple strains, or even species, than epitopes derived from highly variable genomic regions. So, in an epitope-based vaccine setting, an ideal epitope should be highly similar or conserved. The conservancy of the seven B cell epitopes was evaluated by the IEDB conservancy analysis tool (Bui et al., 2007). Among the seven, four epitopes were found to be highly similar in all G protein sequences (Table 3).

Table 3

Consensus sequences between the predicted B cell epitopes and Emini surface peptides along with their conservancy using IEDB conservancy analysis.

No	Sequence	Length	Percent of protein sequence matches at identity ≥100%	Minimum identity (%)	Maximum identity (%)
1	NYTRTTDNQ	9	52.50 (21/40)	88.89	100.00
2	PLPFREYRPI	10	52.50 (21/40)	80.00	100.00
3	VWTPPNPSTI	10	52.50 (21/40)	70.00	100.00
4	RPKSDSGDYNQ	11	47.50 (19/40)	45.45	100.00
5	KVERGKYDKV	10	52.50 (21/40)	60.00	100.00
6	NSNQTAEN	8	52.50 (21/40)	87.50	100.00
7	LAEDDTNAQKT	11	52.50 (21/40)	81.82	100.00

LAEDDTNAQKT is highly flexible and accessible

Flexibility and accessibility are two key features that allow an epitope to induce an immune response (Novotny et al., 1986). Among the four highly similar epitopes, LAEDDTNAQKT and NYTRTTDNQ were found to be highly flexible in the Karplus and Schulz flexibility prediction analysis (Karplus and Schulz, 1985) (Fig. 4). These two epitopes were then assessed for their hydrophilicity by IEDB Parker hydrophilicity analysis (Parker et al., 1986). The LAEDDTNAQKT epitope was found to be hydrophilic in nature, whereas NYTRTTDNQ was hydrophobic (Fig. 5).

Fig. 4

Flexibility of the LAEDDTNAQKT epitope. Amino acids of this epitope were found to be above the threshold level.

Fig. 5

Hydrophilicity of the LAEDDTNAQKT epitope. Most of the residues of this selected epitope were found to be hydrophilic in nature. Residues above the cutoff 6.069 (horizontal red line) are in the yellow region. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

LTDKIGTEI and NSLGQPVFY are highly conserved T cell epitopes

With the selected parameter settings, the NetCTL server (Larsen et al., 2005) identified 27 potential T cell epitopes, but only 10 epitopes were chosen on the basis of high combinatorial scores. MHC class I alleles that interact with the selected epitopes were determined by the MHC-I binding prediction server based on IC50 values (Peters and Sette, 2005, Peters et al., 2003). The results are summarized in Table 4. Conservancy analysis of the 10 T cell epitopes revealed that the LTDKIGTEI and NSLGQPVFY epitopes had the highest similarity across all protein sequences (Table 4).

Table 4

Predicted T-cell epitopes along with their interacting MHC-I alleles.

Epitopes	Prediction score	Interacting MHC-I alleles with an affinity of IC₅₀ <200 nM	Percent of protein sequence matches at identity ≥100%	Identity (Min/Max) (%)
LTDKIGTEI	1.8091	C*12:03, A*68:23, C*05:01, A*32:07, A*32:07, C*15:02, C*14:02, B*27:20, A*02:50	52.50 (21/40)	88.89/100
NSLGQPVFY	1.2151	C*03:03, A*32:07, B*27:20, A*32:15, A*68:23, C*12:03, B*40:13	52.50 (21/40)	88.89/100
AVDNGFFAY	3.2584	C*05:01, C*12:03, A*68:23, A*80:01, A*32:07, A*32:15, B*35:01, A*29:02, B*15:02, A*30:02, B*27:20, A*11:01	52.50 (21/40)	66.67/100
SSTYHEDFY	2.5667	A*32:07, B*27:20, A*68:23, C*12:03, B*40:13, C*03:03, A*32:15, B*15:02, B*15:17	52.50 (21/40)	44.44/100
STYHEDFYY	2.3544	B*15:17, A*68:23, A*32:07, A*29:02, C*12:03, C*03:03, B*27:20, A*11:01, B*40:13, A*02:17, A*80:01, A*32:15, A*30:02, C*14:02, A*26:02, A*68:01, B*15:03	52.50 (21/40)	44.44/100
DSGDYNQKY	2.2951	C*12:03, A*32:07, A*68:23, B*40:13, B*27:20, C*07:01, A*26:02, A*32:15	47.50 (19/40)	33.33/100
ITKVERGKY	2.1445	A*68:23, C*12:03, A*30:02, B*15:17, A*32:07, B*27:20, A*32:15, C*03:03	50.00 (20/40)	44.44/100
CSSTYHEDF	1.7539	B*27:20, C*03:03, A*32:07, A*68:23, B*15:03, B*40:13, C*12:03, A*32:15, B*15:17, B*58:01, C*15:02, C*07:01	52.50 (21/40)	44.44/100
ITDCFLLEN	1.7215	C*12:03, A*68:23, B*27:20, C*05:01, A*32:07, C*15:02, A*02:11, B*40:13, A*02:06, A*02:50	52.50 (21/40)	77.78/100
TTDNQALIK	1.2856	C*12:03, A*68:23, C*05:01, A*32:07, A*11:01, A*32:15, B*27:20, C*14:02, B*40:13	52.50 (21/40)	77.78/100

3D structures of the predicted epitope peptides and HLA-C 12*03 allele were predicted and validated

For the molecular docking simulation assay, the 3D structures of both the epitopes and the MHC molecule are required. The PEP-FOLD Peptide Structure Prediction server was used to generate the 3D structures of the two T cell epitopes LTDKIGTEI and NSLGQPVFY (Fig. 6A and B). As both epitopes were predicted to interact with the HLA-C 12*03 allele, its 3D structure was generated by homology modeling (Biasini et al., 2014). A total of three models were generated by the SWISS-MODEL server. Based on the GMQE and QMEAN4 scores, the best model was selected, for which the template was 4nt6.1.A. The target and template sequences showed 97.44% sequence identity. The model was further validated using the Ramachandran plot and the Z-score. The Ramachandran plot generated by the PROCHECK software showed that 80% of residues were in the favorable region (Fig. 6D). To check whether the input structure is within the range of scores typically found for native proteins of similar size, the ProSA Z-score was calculated (Wiederstein and Sippl, 2007). The Z-score, indicative of overall model quality, was −8.81, which supported the quality of the generated model (Fig. 6E).

Fig. 6

Predicted 3-D structures using homology modelling of the T cell epitopes LTDKIGTEI (A) and NSLGQPVFY (B), and MHC class I molecule HLA-C 12*03 (C). Ramachandran plot along with statistics showing residues in the most favorable and disallowed regions (D). Z-score for quality of the 3D structure HLA-C 12*03 (E).

T cell epitope LTDKIGTEI bound with the HLA-C 12*03 allele with a binding energy of −6.8 kcal/mol

Computer-simulated ligand docking is a rapid and powerful technique to evaluate the relative binding affinity of a ligand toward its receptor. At first, binding models for both of the conserved T cell epitopes with HLA-C 12*03 were generated using the AutoDock tool (Morris et al., 2009). The free energy of binding was estimated by the AutoDock tool. In the docking analysis, the intermolecular energy included van der Waals, hydrogen-bond, desolvation, and electrostatic terms. The energy values calculated for the binding of the two epitopes, LTDKIGTEI and NSLGQPVFY, to the binding groove of HLA-C 12*03 were −6.8 and −2.1 kcal/mol, respectively (Fig. 7). For the binding groove of H-2Kb (control MHC allele), the binding energies were −7.5 and −2.2 kcal/mol, respectively.

Fig. 7

Docking simulation assay of the binding of predicted and control epitopes to MHC class I molecules HLA-C 12*03 and H-2Kb. Binding of “LTDKIGTEI” to the binding grooves (A) of the predicted structure of HLA-C 12*03 (binding energy: −6.8 kcal/mol) and (B) of the 3D structure of H-2Kb (binding energy: −7.5 kcal/mol); (C) binding of control peptide “KVITFIDL” to the predicted 3D structure of HLA-C 12*03 (−6.9 kcal/mol) and (D) H-2Kb (−7.7 kcal/mol).

The control peptide “KVITFIDL” bound to the HLA-C 12*03 and H-2Kb alleles with binding energies of −6.9 and −7.7 kcal/mol, respectively (Fig. 7). As a lower binding energy favors the formation of a stable interaction, it is reasonable to expect that LTDKIGTEI will readily interact with MHC-I molecules in vivo.

LTDKIGTEI and NSLGQPVFY are non-cytotoxic

Because of their advantages of high specificity, high penetration, and ease of manufacturing over small molecules, peptides have emerged as promising therapeutics against many fatal diseases (Thundimadathil, 2012). However, one of the bottlenecks that impede the efficacy of peptide-based therapies is their toxicity. Hence, the toxicity of the selected epitopes was assessed with the ToxinPred tool (Gupta et al., 2013). Both of the selected epitopes were found to be non-toxic to cells, which supports their potential as candidate vaccines.

Designed workflow concords with experimental results

Although bioinformatics has revolutionized biomedical research, some concern still prevails in the scientific community as to whether predictions agree with in vivo experimental results. The epitope prediction workflow used here was validated with positive and negative controls. Sun et al. (2012) mapped five linear epitopes, located at amino acids 21–36, 101–116, 191–206, 231–246, and 261–276 of the NS1 protein of West Nile virus. When the NS1 protein sequence was fed into the workflow, it identified all of these epitopes except the one at 21–36. A random sequence used as a negative control failed to pass the first step of the workflow.
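
The positive-control check described above amounts to asking how many experimentally mapped epitopes overlap a predicted one. A minimal sketch of that comparison follows. The exact boundaries of the predicted spans are not given in the text, so the `predicted` list simply reuses the mapped coordinates of the four recovered epitopes as stand-ins, and the helper `overlaps` is a hypothetical name.

```python
def overlaps(a, b):
    """True if two inclusive residue ranges (start, end) share any position."""
    return a[0] <= b[1] and b[0] <= a[1]


# Linear epitopes mapped in West Nile virus NS1 by Sun et al. (2012).
known = [(21, 36), (101, 116), (191, 206), (231, 246), (261, 276)]

# Assumption: stand-in spans for the workflow output (all except 21-36).
predicted = [(101, 116), (191, 206), (231, 246), (261, 276)]

recovered = [k for k in known if any(overlaps(k, p) for p in predicted)]
missed = [k for k in known if k not in recovered]
sensitivity = len(recovered) / len(known)

print(f"recovered {len(recovered)} of {len(known)} epitopes")
print(f"missed: {missed}")
print(f"sensitivity: {sensitivity:.2f}")
```

With four of five mapped epitopes recovered, the sensitivity on this single positive control is 0.8.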

Discussion

Henipavirus is an emerging zoonotic pathogen which causes severe encephalitis and respiratory illness in humans (Williamson and Torres-Velez, 2010). In the frequent outbreaks of this pathogen, the mortality rate has exceeded 70%. Swift human-to-human transmission, the unavailability of therapeutic options, and the widespread distribution of reservoirs have rendered Henipavirus a profound global public health concern (Eaton et al., 2006). Preventive measures are therefore essential. Developing a universal vaccine for all pathogenic members of the Henipavirus genus is a more promising and economical solution than developing an individual vaccine for each member. The concept of designing a universal vaccine to prevent virus infection has already been reported. In the case of influenza virus, a universal vaccine against the matrix 2 protein has been designed which is able to act against all influenza serotypes (Pica and Palese, 2013). An attempt to design a universal vaccine targeting the spike protein of coronaviruses has also been reported (Jones et al., 1993). At present, vaccination is widely regarded as the most effective method for the prevention of infectious diseases. Conventional vaccine development has relied on inactivated or live attenuated pathogens. Now, however, the increasing understanding of antigen recognition at the molecular level has opened a new window for vaccine design and development, one which surmounts the drawbacks of the traditional vaccine development process. The rationale behind the epitope-based vaccine is the chemical synthesis of identified B cell and T cell epitopes that are immunodominant and can generate specific immune responses (Patronov and Doytchinova, 2013). Epitopes have become suitable vaccine candidates owing to their comparative ease of construction and production, absence of infectious potential, and chemical stability (Naz and Dabir, 2007, Purcell et al., 2007).
In the current study, we aimed to identify potential epitopes which can be used to design universal vaccines for all three members of Henipavirus. We chose the G protein over other proteins for epitope prediction as it facilitates the attachment of Henipavirus to the host cell membrane (Bishop et al., 2007). Most vaccines are based on B cell immunity, but T cell epitope-based vaccines have recently drawn much more interest, as CD8+ T cells generate strong immune responses in the host against viral infection (Shrestha and Diamond, 2004). For this reason, we identified B cell epitopes as well as T cell epitopes. Surface glycoproteins undergo frequent mutations to evade host defense, so the first step in epitope-based vaccine design is to identify sequences of low variability. Hence, we first determined the degree of conservation of the G protein. MSA and absolute variability site analysis demonstrated that the G protein is highly conserved in Henipavirus. Several B cell epitope prediction methods have been developed in recent years, but their prediction performance is still far from ideal (Greenbaum et al., 2007). To avoid false positives, we utilized four different B cell epitope prediction methods, and only the nine epitopes which were common to all predictions were selected for further analysis. B cell epitopes must be accessible enough to bind to antibodies in order to elicit immune responses (Caoili, 2010). Here, we identified the surface peptides and compared these peptides with the selected B cell epitopes. From the conservation analysis of seven B cell epitopes, it was found that four B cell epitopes are fully conserved. Flexibility and accessibility are two key requirements for a B cell epitope to serve as a vaccine. LAEDDTNAQKT and NYTRTTDNQ, two conserved B cell epitopes, were found to be highly flexible in nature, but only the LAEDDTNAQKT epitope was found to be hydrophilic.
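
The two filtering ideas in this paragraph, keeping only peptides predicted by every method and then keeping only those that are fully conserved, can be sketched as set operations. Everything in the example is illustrative: the method names, the peptides other than LAEDDTNAQKT, and the three toy sequences are invented placeholders, not the servers, predictions, or G protein sequences used in the study.

```python
def consensus(predictions):
    """Peptides reported by every prediction method (set intersection)."""
    return set.intersection(*(set(p) for p in predictions.values()))


def fully_conserved(peptide, sequences):
    """True if the peptide occurs unchanged in every sequence."""
    return all(peptide in seq for seq in sequences)


# Hypothetical outputs of four B cell epitope predictors.
predictions = {
    "method_a": {"LAEDDTNAQKT", "AKTGSELDN", "PVNRGLSQ"},
    "method_b": {"LAEDDTNAQKT", "AKTGSELDN"},
    "method_c": {"LAEDDTNAQKT", "AKTGSELDN", "PVNRGLSQ"},
    "method_d": {"LAEDDTNAQKT", "AKTGSELDN"},
}

# Toy sequences (not real G protein sequences); the second carries an
# E->D substitution inside the made-up peptide AKTGSELDN.
toy_sequences = [
    "MKLAEDDTNAQKTGGAKTGSELDNAA",
    "MRLAEDDTNAQKTGSAKTGSDLDNAA",
    "MKLAEDDTNAQKTGGAKTGSELDNAA",
]

common = consensus(predictions)
conserved = sorted(p for p in common if fully_conserved(p, toy_sequences))

print("common to all methods:", sorted(common))
print("fully conserved:", conserved)
```

Requiring agreement across all methods trades sensitivity for specificity: a true epitope missed by a single predictor is discarded, which is consistent with the stated goal of avoiding false positives.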
The T cell epitope prediction server NetCTL identified 27 CD8+ epitopes and, based on the prediction scores, the top 10 T cell epitopes were evaluated further with immunoinformatics tools. To identify the interacting MHC-I alleles for each of the selected epitopes, a machine learning method, the support vector machine (SVM), was utilized along with the binding affinity. Two conserved T cell epitopes, LTDKIGTEI and NSLGQPVFY, were found to interact with the HLA-C 12*03 allele with high affinity. We further utilized computer-simulated molecular docking to investigate the intermolecular interaction. The docking simulation assay showed that the binding energy of the LTDKIGTEI epitope is −6.8 kcal/mol, which is very close to the binding energy of the control peptide. As lower binding energy favors stable intermolecular interaction, LTDKIGTEI could be a potential epitope. We strongly believe that the nontoxic property of these two epitopes can bypass the toxicity-related problem of peptide-based vaccines. Stability and antigenicity can be further enhanced by conjugating these epitopes with an adjuvant (Olesen et al., 2009). To address the questions regarding the actual immunogenicity, stability, efficacy, and delivery strategy of these epitopes inside the recipient's body, in vitro and in vivo experiments are essential.

Conclusion

This study identifies potential epitopes for the design of an epitope-based universal vaccine against all pathogenic members of Henipavirus. Our results are based on sequence analyses of a surface glycoprotein, and the predicted epitopes are candidate targets for universal vaccine design. The actual effectiveness of these two epitopes in eliciting an immune response needs to be tested in both in vitro and in vivo experiments.

Disclosure

There is no conflict of interest in this work.

Author contribution

Conceived and designed the experiments: MSH, MR and MMP. Analyzed the data: MMP, MR and YMN. Wrote the manuscript: MMP and MR. Agreed with manuscript results and conclusions: MMP, MR, YMN and MSH. Jointly developed the structure and arguments for the paper: MMP and MR. Made critical revisions and approved final version: MSH. All authors reviewed and approved the final manuscript.

73 in total

1. Identifying MHC class I epitopes by predicting the TAP transport efficiency of epitope precursors.

Authors: Björn Peters; Sascha Bulik; Robert Tampe; Peter M Van Endert; Hermann-Georg Holzhütter
Journal: J Immunol Date: 2003-08-15 Impact factor: 5.422

Review 2. The changing face of the henipaviruses.

Authors: Emma L Croser; Glenn A Marsh
Journal: Vet Microbiol Date: 2013-08-13 Impact factor: 3.293

Review 3. Synthetic therapeutic peptides: science and market.

Authors: Patrick Vlieghe; Vincent Lisowski; Jean Martinez; Michel Khrestchatisky
Journal: Drug Discov Today Date: 2009-10-30 Impact factor: 7.851

4. Identification of Hendra virus G glycoprotein residues that are critical for receptor binding.

Authors: Kimberly A Bishop; Tzanko S Stantchev; Andrew C Hickey; Dimple Khetawat; Katharine N Bossart; Valery Krasnoperov; Parkash Gill; Yan Ru Feng; Lemin Wang; Bryan T Eaton; Lin-Fa Wang; Christopher C Broder
Journal: J Virol Date: 2007-03-21 Impact factor: 5.103

5. A recombinant Hendra virus G glycoprotein subunit vaccine protects nonhuman primates against Hendra virus challenge.

Authors: Chad E Mire; Joan B Geisbert; Krystle N Agans; Yan-Ru Feng; Karla A Fenton; Katharine N Bossart; Lianying Yan; Yee-Peng Chan; Christopher C Broder; Thomas W Geisbert
Journal: J Virol Date: 2014-02-12 Impact factor: 5.103

6. AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility.

Authors: Garrett M Morris; Ruth Huey; William Lindstrom; Michel F Sanner; Richard K Belew; David S Goodsell; Arthur J Olson
Journal: J Comput Chem Date: 2009-12 Impact factor: 3.376

7. BEST: improved prediction of B-cell epitopes from antigen sequences.

Authors: Jianzhao Gao; Eshel Faraggi; Yaoqi Zhou; Jishou Ruan; Lukasz Kurgan
Journal: PLoS One Date: 2012-06-27 Impact factor: 3.240

8. Design of an epitope-based peptide vaccine against spike protein of human coronavirus: an in silico approach.

Authors: Arafat Rahman Oany; Abdullah-Al Emran; Tahmina Pervin Jyoti
Journal: Drug Des Devel Ther Date: 2014-08-21 Impact factor: 4.162

Review 9. Human vaccine research in the European Union.

Authors: Ole F Olesen; Anna Lonnroth; Bernard Mulligan
Journal: Vaccine Date: 2008-12-06 Impact factor: 3.641

Review 10. Hendra and Nipah viruses: different and dangerous.

Authors: Bryan T Eaton; Christopher C Broder; Deborah Middleton; Lin-Fa Wang
Journal: Nat Rev Microbiol Date: 2006-01 Impact factor: 60.633
