
In-silico identification of the vaccine candidate epitopes against the Lassa virus hemorrhagic fever.

Prabin Baral, Elumalai Pavadai, Bernard S Gerstman, Prem P Chapagain.

Abstract

Lassa virus (LASV), a member of the Arenaviridae, is an ambisense RNA virus that causes severe hemorrhagic fever with a high fatality rate in humans in West and Central Africa. Currently, no FDA approved drugs or vaccines are available for the treatment of LASV fever. The LASV glycoprotein complex (GP) is a promising target for vaccine or drug development. It is situated on the virion envelope and plays key roles in LASV growth, cell tropism, host range, and pathogenicity. In an effort to discover new LASV vaccines, we employ several sequence-based computational prediction tools to identify LASV GP major histocompatibility complex (MHC) class I and II T-cell epitopes. In addition, many sequence- and structure-based computational prediction tools were used to identify LASV GP B-cell epitopes. The predicted T- and B-cell epitopes were further filtered based on the consensus approach that resulted in the identification of thirty new epitopes that have not been previously tested experimentally. Epitope-allele complexes were obtained for selected strongly binding alleles to the MHC-I T-cell epitopes using molecular docking and the complexes were relaxed with molecular dynamics simulations to investigate the interaction and dynamics of the epitope-allele complexes. These predictions provide guidance to the experimental investigations and validation of the epitopes with the potential for stimulating T-cell responses and B-cell antibodies against LASV and allow the design and development of LASV vaccines.


Year: 2020 PMID: 32376973 PMCID: PMC7203123 DOI: 10.1038/s41598-020-63640-1

Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379

Introduction

Lassa virus (LASV), a member of the Arenaviridae[1], is an ambisense RNA virus that causes severe hemorrhagic Lassa fever in humans. LASV is endemic in West Africa, particularly in Sierra Leone, the Republic of Guinea, Nigeria, and Liberia[2,3]. Transmission of LASV to humans occurs through the urine or feces of infected Mastomys rats, and the virus spreads from human to human through direct contact with the blood, urine, feces, or other bodily secretions of an infected person. LASV can be fatal and no approved effective therapeutics are currently available. The development of therapeutics such as antibodies and vaccines for the treatment of LASV is therefore of significant urgency[4-6]. Of the four proteins encoded by the two RNA segments of the LASV genome, the glycoprotein (GP) is the only protein on the viral surface. GP results from the cleavage of a 75 kDa precursor polypeptide, GPC, by signal peptidase; the product is then glycosylated and further processed into GP1 and GP2[7]. GP1 is the receptor-binding subunit, and GP2 is the membrane-spanning fusion subunit[8-10]. The virion envelope protein spikes are composed of three heterotrimers, with each heterotrimer containing the signal peptide, GP1, and GP2[11,12], as shown in Fig. 1. A chalice-like GP trimer interacts with receptors on the cell surface, for example matriglycan, which mediates the entry of the virus into the host cell. In addition, GP interacts with ERGIC-53 in the exocytic pathway, which helps to form infectious virions[13]. GP is considered to be a key factor for LASV growth, cell tropism, host range and pathogenicity, and because it is the only protein situated on the LASV virion surface, it is a primary target for vaccine design[4].

Figure 1

3D structure of the LASV GP trimer consisting of the three GPs (GP-A, GP-B, GP-C). Each GP has a GP1 subunit and a GP2 subunit (zoomed view). Each monomer is colored differently in the GP trimer. In the zoomed view, the GP2 subunit is lightly shaded to differentiate it from the GP1 subunit, and some of the antibody binding sites (Site A, Site B) are highlighted (figure generated from the crystal structure of the LASV GP in the Protein Data Bank[21], PDB ID: 5VK2[4]).

The crystal structure of the trimeric LASV GP in complex with the 37.7H neutralizing antibody from a human survivor (PDB ID: 5VK2, Fig. 1) has been determined, thereby providing insight into the structural basis for antibody design. Analysis of the GP-37.7H antibody complex shows that the antibody simultaneously binds to two GP monomers at the base of the GP trimer. The binding involves four discontinuous regions of LASV GP: two in site A and two in site B. Site A contains residues 62 and 63 of the N-terminal loop of GP1 and residues 387 to 408 in the T-loop (residues 365–384) and HR2 (residues 400–412) regions of GP2. Site B contains residues 269 to 275 of the fusion peptide and residues 324 to 325 of HR1 (residues 311–355) of GP2[4,14]. Although the antibody predominantly binds to GP2, GP1 is required to maintain the proper prefusion conformation of GP2 for antibody binding[4]. Identification of epitopes is an essential step for understanding disease etiology, immunotherapy, immunodiagnostics, and the discovery and development of epitope-based vaccines. An epitope-based vaccine has fewer side effects compared to conventional vaccines. Experimental identification of a promiscuous epitope involves many expensive and time-consuming steps, including the production of antibodies to map antigenic regions on a target protein, animal models, and determination of the crystal structure of antigen-antibody complexes using X-ray crystallography.
Computational identification of epitopes is often employed as a powerful and fast approach to facilitate the identification of potential epitope candidates that can decrease the number of validation experiments and time[15,16]. Multi-epitope based vaccine development has already proven effective against several viral infections and cancer[17,18]. In this study, we have identified and characterized T and B-cell epitopes for the LASV GP using different sequence and structure-based computational epitope prediction methods. We then selected potential B and T-cell epitopes for the LASV GP based on a consensus approach, and the novelty of the epitopes was examined with the Immune Epitope Database (IEDB) tools. Subsequently, we identified strongly binding alleles to the MHC-I T-cell epitopes and modeled the allele structures and performed docking to understand the interaction between alleles and epitopes. We further investigated the stability and dynamics of the epitope-allele complexes using molecular dynamics simulations. Analyses of root-mean square deviations, hydrogen bond, interaction energy, and solvent accessibility showed that epitope-allele complexes are stable, indicating that the epitopes strongly bind to the alleles. The identified B and T-cell epitopes of LASV GP in the study can be useful for the development of effective vaccines against Lassa hemorrhagic fever.

Materials and methods

Selection of LASV GP sequence and 3D structure

The GP sequences of different LASV strains were obtained from the NIAID Virus Pathogen Database and Analysis Resource (ViPR)[19]. Subsequently, a multiple sequence alignment of these sequences was performed using Clustal Omega[20] to select a conserved LASV GP for sequence-based epitope prediction. The corresponding X-ray crystal structure of the Mouse/Sierra Leone/Josiah/1976 LASV GP was obtained from the Protein Data Bank (PDB ID: 5VK2)[4,21] for structure-based B-cell epitope prediction. The missing residues were modeled using CHARMM-GUI[22-24].

Prediction of B-cell epitopes

Sequence-based B-cell epitope prediction was performed separately with the BepiPred2.0[25], BCPREDS[26] and BcePred[27] servers. These servers take the primary sequence of LASV GP as input and predict epitopes based on the physico-chemical properties of amino acids. Structure-based B-cell epitope prediction for the LASV GP (PDB ID: 5VK2) was carried out separately with three different programs: ElliPro[28], Epitopia[29] and DiscoTope[30]. These servers take the 3D structure of a protein as input and predict epitope regions based on the geometry and solvent accessibility of the protein surface. The consensus epitopes from both sequence- and structure-based predictions were selected as potential epitopes for further analysis.

Prediction of T-cell epitopes

Sequence-based MHC-I T-cell epitope predictions for LASV GP were carried out by using three different servers, ProPred-I[31], CTLPred[32] and NetCTL1.2[33]. To predict their alleles, the consensus epitopes among these three prediction methods were analyzed using IEDB[34]. The epitopes that strongly bind to the alleles (lowest IC50) were selected for further analysis. Sequence-based MHC-II T-cell epitope predictions for LASV GP were performed with the use of three different servers: ProPred[35], NetMHCII2.3[36] and EpiTOP3.0[37]. The antigenicity score of the selected epitopes was predicted by VaxiJen 2.0[38].

Homology modeling and epitope-allele docking

The structures of HLA-A*02:06 (A1) [PDB ID: 3OXR[39]], HLA-A*02:03 (A2) [PDB ID: 3OX8[39]], and HLA-B*35:01 (A3) [PDB ID: 2CIK[40]] were obtained from the PDB. An experimental structure for the HLA-A*32:01 (A4) allele is not available; the sequence of this allele was therefore obtained from the UniProt database (UniProtKB ID: P01892), and its structure was modeled using Swiss-Model[41-43]. The selected consensus MHC-I epitopes were extracted from the crystal structure of LASV GP (PDB ID: 5VK2). The epitopes and the alleles were prepared for docking using AutoDockTools version 1.5.6[44]. AutoDock Vina 1.1.2[45] was used for peptide docking with a grid space that covered the entire allele. The best peptide-allele complexes were selected for further investigation based upon visual inspection of peptide-allele interactions and the AutoDock Vina criteria. The stability and dynamics of the selected peptide-allele complexes were further studied using molecular dynamics simulations.

Molecular dynamics simulations

All-atom, explicit-solvent molecular dynamics (MD) simulations were performed to investigate the stability and dynamics of the MHC-I T-cell epitope-allele complexes using the CHARMM36m force field[46] with the NAMD 2.12 software package[47]. The systems were minimized for 10,000 steps, followed by 200 ps of equilibration. This was followed by MD production runs of 200 ns at a temperature of 300 K using a 2 fs time-step. The long-range electrostatic interactions were calculated using the particle mesh Ewald (PME) method[48], while the covalent bonds involving hydrogen atoms were constrained using the SHAKE algorithm[49]. The temperature was controlled using Langevin temperature coupling with a friction coefficient of 1 ps⁻¹, and the pressure was controlled using the Nose-Hoover Langevin-piston method[50]. Visualization and rendering of trajectories and pictures were performed using VMD[51].
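As a quick sanity check on the simulation protocol, the run lengths above can be converted into integration step counts. This is only an arithmetic sketch: the variable names are illustrative, and applying the 2 fs time-step to the equilibration phase is an assumption, since the text states the time-step only for the production runs.

```python
# Convert the stated simulation lengths into integration step counts.
FS_PER_PS = 1_000
FS_PER_NS = 1_000_000

timestep_fs = 2          # stated for the production runs
production_ns = 200      # stated production length
equilibration_ps = 200   # stated equilibration length

production_steps = production_ns * FS_PER_NS // timestep_fs

# Assumption: the same 2 fs time-step was used during equilibration.
equilibration_steps = equilibration_ps * FS_PER_PS // timestep_fs

print(f"production steps per complex: {production_steps:,}")
print(f"equilibration steps (assumed 2 fs): {equilibration_steps:,}")
```

With four allele-epitope complexes, the production phase alone corresponds to four times this step count.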

Results and Discussion

The multiple sequence alignment of the 84 LASV GP sequences identified the LASV GP of the Mouse/Sierra Leone/Josiah/1976 strain [UniProtKB ID: P08669] as highly conserved, and we thus selected this strain for the sequence-based MHC-I and MHC-II T-cell epitope predictions and for both structure- and sequence-based B-cell epitope predictions. In addition, a search of the PDB for experimentally determined structures of this strain returned the 3.2 Å resolution crystal structure of the prefusion GP trimer of LASV in complex with the human neutralizing antibody 37.7H [PDB ID: 5VK2], shown in Fig. 1. This structure was used for the structure-based B-cell epitope prediction. A schematic representation of the epitope prediction cascade is shown in Fig. 2. We adopted multiple methods to predict and rank the epitopes because they use different criteria for their predictions. Some approaches incorporate similar properties, such as solvent accessible surface area, yet predict different epitopes. Previous studies[52,53] have suggested that a consensus approach improves the specificity and accuracy of epitope prediction because it can reduce false positives. Therefore, we employ a consensus approach: an epitope is considered if the predictions of at least two methods overlap by at least one residue. Our consensus approach selected several nonamer epitopes for MHC-I (Table S1). Although the predicted epitopes for MHC-II T-cell vary in length, the consensus core region between predicted MHC-II epitopes is a nonamer (Table S2), which is considered[54] an optimal length for the HLA.
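The consensus rule described above (support from at least two methods, with an overlap of at least one residue) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names, the method labels, and the example intervals are hypothetical.

```python
def overlaps(a, b):
    """True if two inclusive residue intervals share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


def consensus_regions(predictions, min_methods=2):
    """Keep predicted intervals that are supported by >= min_methods methods.

    predictions: dict mapping a method name to a list of (start, end)
    inclusive residue intervals predicted by that method.
    """
    hits = []
    for method, intervals in predictions.items():
        for interval in intervals:
            supporters = {method}
            for other, other_intervals in predictions.items():
                if other == method:
                    continue
                if any(overlaps(interval, o) for o in other_intervals):
                    supporters.add(other)
            if len(supporters) >= min_methods:
                hits.append((interval, sorted(supporters)))
    return hits


# Hypothetical predictions from three unnamed methods.
example = {
    "method_a": [(38, 46), (100, 108)],
    "method_b": [(40, 48)],
    "method_c": [(200, 208)],
}
for interval, methods in consensus_regions(example):
    print(interval, methods)
```

Only the two intervals around residues 38–48 survive here, because each is supported by a second method; the other two are predicted by a single method and are dropped.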

Figure 2

The workflow cascade for epitope identification of (a) T-cell and (b) B-cell.

Prediction of T-cell Epitopes

MHC-I T-cell epitope prediction with the LASV GP sequence was performed using three different methods separately: ProPred-I, CTLPred, and NetCTL1.2, and the results are shown in Supplementary Table S1. The epitopes predicted by at least two of the methods are listed in Table 1 along with their binding affinity (IC50), antigenicity, and allele. Among these four consensus epitopes, the nonamer E1 epitope FATCGLVGL shows the lowest average IC50 value of 34 nM against the A1 allele as predicted by the IEDB, and it also has a reasonable antigenicity score of 1.65. This is followed by the E3 epitope FSRPSPIGY, which has an average IC50 value of 88 nM against the A3 allele and a better antigenicity score of 2.50 than the FATCGLVGL epitope. Interestingly, the E4 epitope RRGTFTWTL is predicted by all three methods, though its IC50 and antigenicity scores are not as good as those of the other epitopes (Table 1). All four of these consensus epitopes were docked to the alleles, and we performed MD simulations to investigate the stability and dynamics of the allele-epitope complexes, as discussed later.

Table 1

Consensus prediction of the MHC-I T-cell epitopes.

Epitope	Sequence	Interval	ProPred-I	CTLPred	NetCTL 1.2	Antigenicity	IC₅₀ ANN (nM)	IC₅₀ SMM (nM)	Allele
E1	FATCGLVGL	38–46	✓	✓	×	1.65	11.91	55.79	HLA-A*02:06 (A1)
E2	IINHKFCNL	112–120	✓	✓	×	1.23	101.6	214.8	HLA-A*02:03 (A2)
E3	FSRPSPIGY	233–241	×	✓	✓	2.50	81.63	94.1	HLA-B*35:01 (A3)
E4	RRGTFTWTL	258–266	✓	✓	✓	1.04	109.6	727.7	HLA-A*32:01 (A4)

The epitope predicted by all three methods is highlighted in boldface.
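The average IC50 values quoted in the text can be reproduced from the ANN and SMM columns of Table 1. The short sketch below uses only the table values; the variable names are illustrative.

```python
# (ANN, SMM) predicted IC50 values in nM, copied from Table 1.
table1_ic50 = {
    "E1": (11.91, 55.79),
    "E2": (101.6, 214.8),
    "E3": (81.63, 94.1),
    "E4": (109.6, 727.7),
}

# Average the two predictors for each epitope.
mean_ic50 = {name: (ann + smm) / 2 for name, (ann, smm) in table1_ic50.items()}

# A lower IC50 means stronger predicted binding, so sort ascending.
ranked = sorted(mean_ic50, key=mean_ic50.get)

for name in ranked:
    print(f"{name}: {mean_ic50[name]:.1f} nM")
```

Sorting by the averaged value places E1 first and E3 second, consistent with the roughly 34 nM and 88 nM figures discussed for these two epitopes.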

MHC-II T-cell epitope prediction with the LASV GP sequence was performed using three different methods separately: ProPred, NetMHCII 2.3, and EpiTOP 3.0, and the results are shown in Supplementary Table S2. ProPred uses a quantitative matrix[35] approach and NetMHCII 2.3 uses an artificial neural network (ANN)[36], while EpiTOP 3.0 uses Quantitative Structure–Activity Relationship (QSAR) models[37] to predict the MHC-II T-cell epitopes. The epitopes that were predicted by at least two methods are listed in Table 2. Among these consensus MHC-II T-cell epitope predictions, the E9 and E13 epitopes were predicted by all three methods and have reasonable antigenicity scores of about 0.7, indicating that these two epitopes can be potential candidates for the design of MHC-II T-cell based vaccines. ProPred and EpiTOP 3.0 predict most epitopes as nonamers, whereas NetMHCII 2.3 predicts epitopes of varying lengths (Table 2). Interestingly, the 15-mer epitopes predicted by NetMHCII contain the consensus core nonamer epitopes, suggesting that the core region is responsible for strong binding of the epitope into the MHC-II binding site[55-57].

Table 2

Prediction of the MHC-II T-cell epitopes.

Epitope	Sequence	Interval	ProPred	NetMHCII 2.3	EpiTOP 3.0	Antigenicity
E5	MGQIVTFFQ	1–9	✓	×	✓	−0.1820
E6	VYELQTLEL	65–73	✓	×	✓	0.8600
E7	LNMTMPLSC	78–86	✓	×	✓	0.9390
E8	INHKFCNLS	113–121	✓	×	✓	1.5060
E9	MSIISTFHL	134–142	✓	✓	✓	0.7080
	LYDHALMSIISTFHL	128–142	×	✓	×	0.2896
	YDHALMSIISTFHLS	129–143	×	✓	×	0.4907
	DHALMSIISTFHLSI	130–144	×	✓	×	0.4809
	HALMSIISTFHLSIP	131–145	×	✓	×	0.1949
	ALMSIISTFHLSIPN	132–146	×	✓	×	0.2066
	LMSIISTFHLSIPNF	133–147	×	✓	×	0.2428
E10	FNQYEAMSC	147–155	✓	×	✓	0.5520
E11	ISVQYNLSH	162–170	✓	×	✓	1.1310
E12	LQTFMRMAW	188–196	✓	✓	×	0.2620
	VANGVLQTFMRMAWG	183–197	×	✓	×	0.1328
	ANGVLQTFMRMAWGG	184–198	×	✓	×	0.1683
	NGVLQTFMRMAWGGS	185–199	×	✓	×	0.0579
	GVLQTFMRMAWGGSY	186–200	×	✓	×	0.1572
	VLQTFMRMAWGGSYI	187–201	×	✓	×	0.1895
E13	MRMAWGGSY	192–200	✓	✓	✓	0.7630
	GVLQTFMRMAWGGSY	186–200	×	✓	×	0.1572
	VLQTFMRMAWGGSYI	187–201	×	✓	×	0.1895
	LQTFMRMAWGGSYIA	188–202	×	✓	×	0.1902
	QTFMRMAWGGSYIAL	189–203	×	✓	×	0.3470
	TFMRMAWGGSYIALD	190–204	×	✓	×	0.4434
	FMRMAWGGSYIALDS	191–205	×	✓	×	0.3543
E14	YQYLIIQNT	217–225	✓	✓	×	0.4720
	DCIMTSYQYLIIQNT	211–225	×	✓	×	0.6600
	CIMTSYQYLIIQNTT	212–226	×	✓	×	0.7075
	IMTSYQYLIIQNTTW	213–227	×	✓	×	0.6029
	TSYQYLIIQNTTWED	215–229	×	✓	×	0.6874
E15	LIIQNTTWE	220–228	✓	×	✓	0.9100
E16	IGYLGLLSQ	239–247	✓	×	✓	1.5300
E17	LLSQRTRDI	244–252	✓	×	✓	1.7310
E18	IYISRRRRG	252–260	✓	✓	×	1.5560
	SQRTRDIYISRRRRG	246–260	×	✓	×	1.6434
	QRTRDIYISRRRRGT	247–261	×	✓	×	1.5276
	RTRDIYISRRRRGTF	248–262	×	✓	×	1.7213
	TRDIYISRRRRGTFT	249–263	×	✓	×	1.4112
	RDIYISRRRRGTFTW	250–264	×	✓	×	1.5207
	DIYISRRRRGTFTWT	251–265	×	✓	×	1.4261
	IYISRRRRGTFTWTL	252–266	×	✓	×	1.2680
E19	WMLIEAELK	283–291	✓	×	✓	1.3250
E20	IQLINKAVN	334–342	✓	×	✓	0.7710
E21	LINDQLIMK	344–352	✓	×	✓	−0.0481
E22	LRDIMCIPY	355–363	✓	×	✓	1.0590
E23	LVSNGSYLN	387–395	✓	×	✓	0.3450

The epitopes predicted by all three methods are highlighted in boldface italics. The consensus core regions within the epitopes predicted by NetMHCII 2.3 are highlighted in boldface.
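The relationship between the NetMHCII 2.3 15-mers and the core nonamers in Table 2 can be checked with a sliding-window comparison. The sketch below uses the six 15-mers listed under E9; the helper names are illustrative, and the same check can be applied to the other groups in the table.

```python
def kmers(sequence, k=9):
    """All contiguous windows of length k in a peptide sequence."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def shared_cores(peptides, k=9):
    """k-mers that occur in every peptide of the list."""
    common = set(kmers(peptides[0], k))
    for peptide in peptides[1:]:
        common &= set(kmers(peptide, k))
    return sorted(common)


# 15-mers predicted by NetMHCII 2.3 that are grouped under E9 in Table 2.
e9_15mers = [
    "LYDHALMSIISTFHL",
    "YDHALMSIISTFHLS",
    "DHALMSIISTFHLSI",
    "HALMSIISTFHLSIP",
    "ALMSIISTFHLSIPN",
    "LMSIISTFHLSIPNF",
]

e9_core = "MSIISTFHL"
print(shared_cores(e9_15mers))
print(all(e9_core in peptide for peptide in e9_15mers))
```

The E9 nonamer is contained in every one of the six overlapping 15-mers, which is the sense in which it acts as their consensus core.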

In addition to the T-cell epitope predictions, we also predicted the linear B-cell epitopes for the LASV GP using the sequence-based methods BepiPred 2.0[25], BCPREDS[26], and BcePred[27]. BepiPred predicts epitopes with a random forest algorithm trained on epitopes annotated from antibody-antigen structures. BCPREDS predicts epitopes using a support vector machine (SVM) combined with different kernel methods, including string kernels, radial basis kernels, and subsequence kernels. BcePred locates B-cell epitopes using four physicochemical properties: hydrophilicity, polarity, exposed surface and beta-turns[27]. The epitope E30, containing 10 residues, was predicted by all three of these sequence-based methods (Table 3) but has a negative antigenicity score.

Table 3

Prediction of the B-cell epitopes.

Epitope	Sequence	Interval	BepiPred (seq.)	BCPREDS (seq.)	BcePred (seq.)	ElliPro (struct.)	Epitopia (struct.)	DiscoTope (struct.)	Rank	Antigenicity
E24	LSDAHKKNLYD	120–130	✓	×	✓	✓	✓	✓	5/6	0.74
E25	PNFNQYEA	145–152	✓	×	✓	✓	✓	×	4/6	0.4565
E26	DFNGGKI	156–162	×	✓	×	✓	✓	×	3/6	0.7315
E27	LSHSYAGDAANHCGT	168–182	✓	×	×	✓	✓	×	3/6	0.0814
E28	LDSGCGNWDCIMTSYQY	203–219	×	✓	×	✓	✓	×	3/6	1.0802
E29	ISRRRRGT	254–261	×	×	✓	✓	✓	✓	4/6	1.2517
E30	SDSEGKDTPG	267–276	✓	✓	✓	✓	✓	×	5/6	−0.0739
E31	NHTTTGRT	373–380	×	✓	✓	✓	✓	×	4/6	0.9941
E32	ETHFSDDIE	396–404	✓	×	✓	✓	✓	✓	5/6	0.4989
E33	MLQKEYMERQ	414–423	×	✓	✓	✓	✓	✓	5/6	−0.14

The epitopes predicted by either all three sequence- or structure-based methods are highlighted by boldface. Conformational epitopes chosen by all three structure-based methods are indicated in italics.
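The Rank column of Table 3 is the number of methods, out of six, that predicted each epitope. The sketch below recomputes it from the check marks in the table (1 for ✓, 0 for ×) and also lists the epitopes supported by all three structure-based methods; the data layout and variable names are illustrative.

```python
# Column order: BepiPred, BCPREDS, BcePred, ElliPro, Epitopia, DiscoTope.
# 1 = predicted (✓), 0 = not predicted (×), copied from Table 3.
table3 = {
    "E24": (1, 0, 1, 1, 1, 1),
    "E25": (1, 0, 1, 1, 1, 0),
    "E26": (0, 1, 0, 1, 1, 0),
    "E27": (1, 0, 0, 1, 1, 0),
    "E28": (0, 1, 0, 1, 1, 0),
    "E29": (0, 0, 1, 1, 1, 1),
    "E30": (1, 1, 1, 1, 1, 0),
    "E31": (0, 1, 1, 1, 1, 0),
    "E32": (1, 0, 1, 1, 1, 1),
    "E33": (0, 1, 1, 1, 1, 1),
}

# Rank = number of supporting methods out of six.
rank = {name: sum(flags) for name, flags in table3.items()}

# The last three columns are the structure-based methods.
all_structure = [name for name, flags in table3.items() if all(flags[3:])]
# The first three columns are the sequence-based methods.
all_sequence = [name for name, flags in table3.items() if all(flags[:3])]

for name in table3:
    print(f"{name}: {rank[name]}/6")
print("all structure-based methods:", all_structure)
print("all sequence-based methods:", all_sequence)
```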

We also performed structure-based B-cell epitope prediction using three representative methods based on structural and geometrical properties: ElliPro, Epitopia and DiscoTope. For this, the experimental 3D structure of LASV GP (PDB ID: 5VK2) with the modeled missing residues was used. ElliPro predicts linear and conformational epitopes by incorporating the antigenicity, solvent accessibility, and flexibility of protein structures[28]. Epitopia uses a machine learning algorithm to analyze the antigenic features of a protein structure and predicts the probable conformational epitope regions[29]. DiscoTope uses amino acid statistics, spatial information, and surface accessibility of the protein 3D structure to predict residue-by-residue conformational epitopes[30]. The E24, E29, E32 and E33 structure-based epitopes in Table 3 are especially interesting as potential candidates because they were predicted by all three methods. In Table 3, we also ranked each epitope by how many of the sequence- and structure-based methods predicted it. This rank does not always correlate with antigenicity: the highest antigenicity scores belong to E24, E26, E28, E29 and E31, which are not all among the top-ranked epitopes. Robinson et al.[14] have recently reported the cloning of many human monoclonal antibodies derived from memory B cells of Lassa fever survivors in West Africa. These antibodies specifically bind to both GP1 and GP2 epitopes of LASV. The comparison of our predicted B-cell epitopes with those epitopes shows that five consensus epitopes (Table 3) share similarity with the epitopes of Robinson et al. (Table S3), while the other five do not, indicating that our consensus epitope prediction strategy has identified new epitopes.

Epitope surface mapping

For efficacy of vaccines, the epitopes should be located on an accessible region of the protein so that the epitope will be able to bind with antibodies[53]. This is especially important for the six epitopes that we list in the Tables above that do not share any part of their sequence with known epitopes: E1, E4, E18, E22, E27, E29. In Fig. 3, we highlight the positions of these epitopes on LASV GP. We also highlight the positions of E2 and E3 because the four MHC-I T-cell epitopes have IC50 information readily available. Figure 3 shows that the E1, E2, E3, E4, E18, E22 and E27 epitopes are well located on the exposed regions and thus can interact well with the alleles.

Figure 3

Mapping of some representative epitopes on the LASV GP: (a) secondary structural elements, (b) surface accessibility. The location of the epitopes on the GP suggests that they are in the solvent-exposed region, indicating promiscuity, as they have easy access to alleles.

MHC-I T-cell Allele and epitope modeling and docking

Swiss-Model identified the 1.61 Å resolution crystal structure of the HLA class I antigen (PDB ID: 6EI2) as the best template for constructing models. The sequence identity between A4 and the template was 92%. The best model was then selected based on multiple validation methods, including GMQE (Global Model Quality Estimation) and QMEAN. The GMQE and QMEAN values[41,58] of the model are 0.75 and 0.6, respectively. In addition to these analyses, Ramachandran plots and ERRAT were also used for the model validation. Analysis of the Ramachandran plot[59] of the model shows that 99.6% of residues are in either favored or allowed regions (Supplementary Fig. S1), indicating that the backbone torsion angles of the model are acceptable. The ERRAT overall quality factor[60] was computed as 99, well above the value of 50 that is normally accepted as the threshold for a high-quality model. These analyses show that the model is within a high quality range and can be used for further analysis. Docking of the four consensus MHC-I epitopes (Table 1) was performed using AutoDock Vina, which enabled the docking of epitopes obtained from the sequence-based MHC-I T-cell prediction into the promising allele structures. The AutoDock Vina docking protocol has been previously demonstrated to successfully dock epitopes into allele structures[45]. Nevertheless, before docking the epitopes we validated the docking protocol by redocking the bound peptide into the allele crystal structure (PDB ID: 3OX8) to see whether the crystal-bound conformation of the peptide could be reproduced. The docked allele-epitope complex showed the same residue-epitope interactions observed in the epitope-bound crystal structure, indicating that the AutoDock Vina docking protocol was capable of reproducing the experimentally observed binding mode of the epitope. We applied AutoDock Vina to each of the four MHC-I allele-epitope complexes. The highest-ranked docking structures had the following binding affinities: −5.5 kcal/mol for A1::E1, −5.0 kcal/mol for A2::E2, −6.8 kcal/mol for A3::E3, and −6.0 kcal/mol for A4::E4. These epitope-allele docking complexes are shown in Fig. 4.

Figure 4

Snapshots of allele-epitope complexes. (a) A1::E1, (b) A2::E2, (c) A3::E3, and (d) A4::E4 at the beginning and end of the MD simulations: t = 0 (minimized structure), t = 200 ns. Allele is gold and epitope is green.

Dynamics of the allele-epitope complex

To investigate the dynamics and stability of the four MHC-I allele-epitope complexes, we performed 200 ns all-atom, explicit-solvent MD simulations. To quantify the stability of the allele-epitope complexes, we calculated the root mean square deviations (RMSD) of the backbone atoms of the allele-epitope complexes as a function of simulation time, as shown in Fig. 5. Figure 5 also includes curves of the RMSD of the backbone atoms of the allele alone and, separately, of the epitope alone. All alleles have an RMSD of approximately 2 Å relative to their initial structures, whereas the allele-epitope complexes have a slightly higher RMSD of approximately 2.5 Å, indicating that the epitopes make the complexes more flexible. Interestingly, in the case of A3::E3, the allele and the complex show almost the same RMSD, suggesting that the complex is especially stable. To pinpoint why the complexes show a higher RMSD, we further computed the RMSD of only the backbone atoms of the epitope in each complex. Figure 4 shows that the initial configuration of epitopes E1 and E4 is compact, and that both of these epitopes rearrange their configuration in the binding site and elongate during the 200 ns MD simulation. This elongated configuration is consistent with the investigations of Antunes et al.[61] on MHC-I epitopes.
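For reference, the RMSD of a set of backbone atoms relative to a reference structure can be computed as in the sketch below. It is a minimal illustration that assumes the frame has already been superimposed on the reference; it is not the analysis script used in the study, and the function name and example coordinates are hypothetical.

```python
import math


def rmsd(coords, reference):
    """RMSD between two equally sized lists of (x, y, z) coordinates in Å.

    Assumes the coordinates are already aligned to the reference, so no
    superposition is performed here.
    """
    if len(coords) != len(reference):
        raise ValueError("coordinate sets must contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for point, ref_point in zip(coords, reference)
        for a, b in zip(point, ref_point)
    )
    return math.sqrt(squared / len(coords))


# Hypothetical three-atom example: every atom displaced by 2 Å along x.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
frame = [(2.0, 0.0, 0.0), (3.5, 0.0, 0.0), (5.0, 0.0, 0.0)]
print(rmsd(frame, reference))
```

Applying such a function to each trajectory frame, separately for the allele atoms, the epitope atoms, and the full complex, yields curves of the kind described for Fig. 5.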

Figure 5

Root-mean-squared deviations (RMSD) calculated for the backbone atoms of allele (A), epitope (E) and complex (A + E) from MD simulations of MHC-I allele-epitope complexes.

Since the interactions between the protein and the epitope peptide are mostly non-covalent, we computed the number of hydrogen bonds and the interaction energy between the allele and epitope as a function of the MD simulation time. Hydrogen bonds were calculated between the protein interface atoms with a distance cut-off of 3.5 Å between the donor and acceptor heavy atoms and an angle cut-off of 30°. As shown in Fig. 6, the number of H-bonds fluctuates during the MD simulations for all the complexes. The A3 complex has the largest number of H-bonds. Table 4 shows that during the last 50 ns of the MD simulation trajectory, the A3 complex averages 2.5 H-bonds. Additional analysis of the hydrogen bonding between allele and epitope is listed in Supplementary Table S4.
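The geometric hydrogen-bond criterion can be sketched as follows. The text gives only the two cut-off values, so the interpretation of the angle is an assumption here: the donor-hydrogen-acceptor angle is required to deviate from linearity (180°) by no more than 30°. The function names and coordinates are hypothetical.

```python
import math


def angle_deg(a, b, c):
    """Angle in degrees at vertex b formed by the points a-b-c."""
    v1 = [x - y for x, y in zip(a, b)]
    v2 = [x - y for x, y in zip(c, b)]
    dot = sum(p * q for p, q in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    cos_theta = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cos_theta))


def is_hbond(donor, hydrogen, acceptor, dist_cutoff=3.5, angle_cutoff=30.0):
    """Geometric H-bond test.

    Distance is measured between donor and acceptor heavy atoms (Å).
    Assumption: the angle criterion is the deviation of the
    donor-hydrogen-acceptor angle from 180 degrees.
    """
    distance = math.dist(donor, acceptor)
    deviation = 180.0 - angle_deg(donor, hydrogen, acceptor)
    return distance <= dist_cutoff and deviation <= angle_cutoff


# Hypothetical geometries.
print(is_hbond((0, 0, 0), (1, 0, 0), (2.9, 0, 0)))  # linear and close
print(is_hbond((0, 0, 0), (1, 0, 0), (4.0, 0, 0)))  # linear but too far
print(is_hbond((0, 0, 0), (1, 0, 0), (1, 2, 0)))    # close but strongly bent
```

Counting the donor-acceptor pairs that satisfy this test in each frame gives a time series like the one described for Fig. 6a.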

Figure 6

(a) The number of allele-epitope intermolecular hydrogen bonds as a function of MD simulation time. (b) Interaction energy calculated between allele and epitopes as a function of simulation time.

Table 4

Allele–epitope interaction parameters calculated by averaging over the last 50 ns of the MD simulation trajectory.

Complex	Interaction Energy (kcal/mol)	No. of H-bonds
A1::E1	−53.53 ± 7.40	0.64 ± 0.54
A2::E2	−64.54 ± 10.88	1.49 ± 0.63
A3::E3	−74.85 ± 14.94	2.48 ± 0.50
A4::E4	−73.23 ± 27.07	1.51 ± 0.67

Figure 6b shows the interaction energy (electrostatic interaction + van der Waals contacts) throughout the entire MD simulation, and Table 4 lists the average over the last 50 ns. The A3::E3 and A4::E4 complexes display relatively stronger interaction energies than the A1::E1 and A2::E2 complexes. The comparison of RMSD, hydrogen bond, and interaction energy information indicates that the E3 epitope is an especially promising epitope candidate.

Novelty analysis

The novelty of the four MHC I T-cell epitopes in Table 1, the nineteen MHC II T-cell epitopes in Table 2, and the ten B-cell epitopes in Table 3 identified in this study was analyzed using IEDB[34]. The IEDB database contains epitopes that are annotated based on the scientific literature. The IEDB showed that the E1, E4, E18, E22, E27 and E29 epitopes, which lie in solvent-exposed regions of the protein (Fig. 3), have not been previously reported as LASV epitopes or vaccine candidates. In addition, this analysis indicates that 24 other epitopes (E2, E3, E5, E6, E7, E8, E10, E11, E12, E14, E15, E16, E17, E19, E20, E23, E24, E25, E26, E28, E30, E31, E32, E33) have partial segments of their sequence reported as subsets of other epitopes, whereas E9, E13 and E21 are exact matches to previously reported sequences. For these epitopes, a comparison showing the overlap between the epitopes predicted in this study and previously known epitopes documented in IEDB is given in Table S5. In addition to the epitopes in the IEDB, we compared our consensus predicted epitopes with previously reported predictions[62-67] in Table S6. This comparison shows a varying degree of overlap in the predicted sequences. The novelty results confirm that thirty epitopes have not been previously tested experimentally as LASV epitopes, suggesting that their therapeutic potential in designing vaccines against LASV can be further explored.
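The epitope counts in the novelty analysis can be cross-checked with a short tally. The three groups below are copied from the text; the variable names are illustrative.

```python
# Novelty categories as reported from the IEDB comparison.
no_overlap = ["E1", "E4", "E18", "E22", "E27", "E29"]
partial_overlap = [
    "E2", "E3", "E5", "E6", "E7", "E8", "E10", "E11", "E12", "E14",
    "E15", "E16", "E17", "E19", "E20", "E23", "E24", "E25", "E26",
    "E28", "E30", "E31", "E32", "E33",
]
exact_match = ["E9", "E13", "E21"]

all_epitopes = {f"E{i}" for i in range(1, 34)}  # E1 to E33
groups = [set(no_overlap), set(partial_overlap), set(exact_match)]

# The three categories should cover every epitope exactly once.
covers_all = set().union(*groups) == all_epitopes
disjoint = sum(len(group) for group in groups) == len(all_epitopes)

# Epitopes counted as new: no overlap or only partial overlap.
new_epitopes = no_overlap + partial_overlap

print(len(no_overlap), len(partial_overlap), len(exact_match))
print("new epitopes:", len(new_epitopes))
print("consistent partition:", covers_all and disjoint)
```

The 6 epitopes without overlap and the 24 with partial overlap add up to the thirty new epitopes, and together with the three exact matches they account for all 33 consensus epitopes (4 MHC-I, 19 MHC-II, 10 B-cell).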

Conclusion

LASV hemorrhagic fever is endemic in West Africa, and no approved effective therapeutics are currently available. Therefore, there is an urgent need for the discovery and development of potential antiviral therapeutics. The LASV GP spike has emerged as a promising selective target for the development of novel vaccines as it plays an essential role in the virus-host interaction. Several in-silico studies[62-67] have predicted LASV GP epitopes using a single prediction tool for each type of epitope. We have identified new T- and B-cell epitopes using a variety of computational approaches, including twelve epitope prediction methods, protein-peptide docking, and MD simulations. The MHC I and II T-cell epitopes were separately predicted from the LASV GP sequence using well-known prediction methods. The predicted MHC I T-cell epitopes were then prioritized based on the consensus score, binding affinity, and antigenicity, while the MHC II T-cell and B-cell epitopes were prioritized based on the consensus score. Novelty analysis of the 33 consensus-selected epitopes showed that thirty of these predicted epitopes have either no overlap or only a partial overlap with previously reported sequences. Within this list of new epitopes, six sequences have no overlap with any known experimentally tested epitopes in the IEDB. In addition, docking and MD simulations were performed to further validate the MHC I T-cell epitopes. The simulation results show that the MHC-I allele-epitope complexes are stable, with favorable hydrogen bonding and interaction energy. Of these, the epitope E3 segment (233FSRPSPIGY241) was found to be especially stable. This study demonstrates that the adopted consensus epitope prediction strategy is valuable for in-silico investigations of known epitopes and the identification of new epitopes. Experimental validation of these epitopes may lead to the design and development of effective LASV vaccines.
