A vaccine built from potential immunogenic pieces derived from the SARS-CoV-2 spike glycoprotein: A computational approximation.

Abstract

Coronavirus disease 2019 (COVID-19) represents a new global threat that demands a multidisciplinary effort to fight its etiological agent, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). In this regard, immunoinformatics may help to predict prominent immunogenic regions of critical SARS-CoV-2 structural proteins, such as the spike (S) glycoprotein, for use in prophylactic or therapeutic interventions against this highly pathogenic betacoronavirus. Accordingly, in this study, an integrated immunoinformatics approach was applied to identify cytotoxic T cell (CTC), T helper cell (THC), and linear B cell (BC) epitopes from the S glycoprotein in an attempt to design a high-quality multi-epitope vaccine. The best CTC, THC, and BC epitopes showed high viral antigenicity and lacked allergenic or toxic residues. In addition, the CTC and THC epitopes showed suitable interactions with HLA class I (HLA-I) and HLA class II (HLA-II) molecules, respectively. Remarkably, the SARS-CoV-2 receptor-binding domain (RBD) and its receptor-binding motif (RBM) harbour several potential epitopes. The structure prediction, refinement, and validation data indicate that the multi-epitope vaccine has an appropriate conformation and stability. Four conformational epitopes and an efficient binding between Toll-like receptor 4 (TLR4) and the vaccine model were observed. Importantly, the population coverage analysis showed that the multi-epitope vaccine could be used globally. Notably, computer-based simulations suggest that the vaccine model has a robust potential to evoke and maximize both immune effector responses and immunological memory to SARS-CoV-2. Further research is needed to comply with the mandatory international guidelines for human vaccine formulations.

Keywords: COVID-19; Epitope; SARS-CoV-2; Spike; Vaccine

Year: 2022 PMID: 35007561 PMCID: PMC8739792 DOI: 10.1016/j.jim.2022.113216

Source DB: PubMed Journal: J Immunol Methods ISSN: 0022-1759 Impact factor: 2.303

Introduction

On 31 December 2019, a dramatic increase in the number of patients with a potentially fulminant respiratory disease was reported in Wuhan, China. The etiological agent was eventually identified as a novel, highly pathogenic betacoronavirus: severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (Lu et al., 2020). Thereafter, SARS-CoV-2 caused an overwhelming wave of coronavirus disease 2019 (COVID-19) cases across Asia, Europe, Oceania, the Americas, and Africa, which led the World Health Organization to declare COVID-19 a pandemic on 11 March 2020 (World Health Organization, n.d.). Unfortunately, at the time of this research, no clinically approved treatment was available to fight SARS-CoV-2, whose rapid spread has generated explosive waves of COVID-19 all over the world (World Health Organization, n.d.).

The SARS-CoV-2 outer membrane is decorated with several structural proteins, including the S glycoprotein, the membrane protein, and the envelope protein (Lu et al., 2020). The S glycoprotein forms homotrimers containing both a receptor-binding domain (RBD) and a receptor-binding motif (RBM) (Lan et al., 2020a). The latter mediates contacts with human angiotensin-converting enzyme 2 (hACE2), thereby allowing SARS-CoV-2 entry into the host cell (Lan et al., 2020a). This critical role in viral pathogenesis makes the SARS-CoV-2 S glycoprotein an attractive target for vaccine development (Amanat and Krammer, 2020). Multi-epitope vaccines designed with immunoinformatics tools could help to elicit a protective immune response against SARS-CoV-2, as reported previously for other infectious agents (Zhou et al., 2009). In this regard, recent data indicate that the SARS-CoV-2 S glycoprotein harbours prominent immunologically active regions, which may serve as candidates for multi-epitope vaccine models (Bhattacharya et al., 2020). Accordingly, the present study aimed to design a multi-epitope vaccine construct against SARS-CoV-2 using an integrated in silico approach.

Materials and methods

Protein sequence retrieval

Because the SARS-CoV-2 S glycoprotein represents the major target for vaccine development (Amanat and Krammer, 2020), the present work focused only on this viral spike protein. The complete amino acid sequence of the SARS-CoV-2 S glycoprotein was retrieved from UniProt (http://www.uniprot.org/) in FASTA format (accession number: P0DTC2). Fig. 1 summarises the in silico experimental work.

Fig. 1

Overall experimental workflow (made with Biorender.com). The best epitopes predicted from the SARS-CoV-2 S glycoprotein were selected to design the multi-epitope vaccine construct, which was subjected to further in silico evaluations. CTC-E: cytotoxic T cell epitope. THC-E: T helper cell epitope. LBC-E: linear B cell epitope. IFN-g: interferon gamma. aa: amino acid. 6×-H: polyhistidine tag.

Prediction of allergenicity, toxicity, and viral antigenicity

Potential epitopes from the S glycoprotein were evaluated for allergenicity using AllergenFP (http://ddg-pharmfac.net/AllergenFP/index.html) (Dimitrov et al., 2014), whereas toxicity was predicted using the ToxinPred server (http://crdd.osdd.net/raghava/toxinpred/) (Gupta et al., 2013). Finally, viral antigenicity was calculated with the VaxiJen server (threshold: 0.5) (www.ddg-pharmfac.net/vaxijen/) (Doytchinova and Flower, 2007).
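The three checks above amount to a simple pass/fail screen on each peptide. The following is a minimal sketch of that screen, not the servers' own logic: the record type and function names are hypothetical, the server calls are represented by precomputed values copied from a few rows of Table 1, and treating a score equal to the 0.5 threshold as passing is an assumption.

```python
from dataclasses import dataclass

# Threshold quoted in the text for the antigenicity server. Treating a score
# equal to the threshold as passing is an assumption of this sketch.
ANTIGENICITY_THRESHOLD = 0.5


@dataclass(frozen=True)
class EpitopeScreen:
    """Hypothetical record holding the three server outputs for one peptide."""
    sequence: str
    antigenicity: float  # viral antigenicity score
    allergenic: bool     # allergenicity call
    toxic: bool          # toxicity call


def passes_screen(e: EpitopeScreen,
                  threshold: float = ANTIGENICITY_THRESHOLD) -> bool:
    """Keep peptides that are antigenic, non-allergenic, and non-toxic."""
    return e.antigenicity >= threshold and not e.allergenic and not e.toxic


# Values copied from four rows of Table 1 (E1, E2, E3, and E47).
candidates = [
    EpitopeScreen("EPLVDLPIG", 0.46, allergenic=False, toxic=False),
    EpitopeScreen("APGQTGKIA", 1.20, allergenic=True, toxic=False),
    EpitopeScreen("VVLSFELLH", 1.41, allergenic=False, toxic=False),
    EpitopeScreen("EGFNCYFPLQSYGFQ", 0.57, allergenic=False, toxic=True),
]

kept = [e.sequence for e in candidates if passes_screen(e)]
```

Of these four rows, only E3 (VVLSFELLH) passes all three checks, which agrees with its "Yes" entry in the table.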

Immunogenicity

CTC, THC, and BC epitopes were predicted, and the best were selected for the final vaccine design (Fig. 1). To this end, multiple prediction tools were used to improve the rate of true positives. Furthermore, the algorithm parameters were chosen based on the recommendations of the software developers/authors.

Prediction of CTC and THC epitopes

Peptides that interact with HLA-I and HLA-II molecules are commonly 9 and 15 amino acids in length, respectively (Owen et al., 2013). Consequently, 9-mer and 15-mer peptides were considered in this work as CTC and THC epitopes, respectively (Fig. 1). These epitopes were identified using the Immune Epitope Database and Analysis Resource (IEDB-AR) (http://tools.immuneepitope.org/main/) (Kim et al., 2012). Several methods were applied to cross-validate the peptides predicted to bind HLA molecules. The 9-mer peptides binding to HLA-I were predicted using the artificial neural network (ANN) method (Tenzer et al., 2005), the Consensus method (Moutaftsi et al., 2006), and the NetMHCpan method (Hoof et al., 2009). The 15-mer peptides binding to HLA-II were predicted using the Consensus method (Wang et al., 2008), the NetMHCIIpan method (Nielsen et al., 2008), and the SMM-align method (Nielsen et al., 2007). These algorithms generate a prediction output based on a percentile rank: peptides with a small percentile rank have high affinity for HLA alleles. IEDB-AR produces this percentile rank by comparing the IC50 of each predicted peptide against that of random peptides from the SWISSPROT database. In this work, epitopes were selected by following this guideline and by using a percentile rank cut-off ≤20, as recommended previously (Paul et al., 2015); this cut-off has also been applied successfully in other in silico studies focused on SARS-CoV-2 (Grifoni et al., 2020a; Marchan, 2020). In addition, peptides binding to HLA-II were also chosen for their potential to induce interferon-gamma (IFN-g) (Fig. 1), a cytokine necessary to fight viral infections (Owen et al., 2013). Epitopes with a high potential to induce the production of IFN-g were selected using the IFNepitope server (http://crdd.osdd.net/raghava/ifnepitope/) (Dhanda et al., 2013). This website harbours three models (motif based, SVM based, and hybrid approach), which have been trained on 10,433 experimentally validated IFN-gamma inducing and non-inducing MHC class II peptides (Dhanda et al., 2013).
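As an illustration of the two mechanical steps described above (enumerating overlapping 9-mer or 15-mer windows and applying the percentile rank cut-off ≤20), here is a minimal sketch. It does not call IEDB-AR: the function names are hypothetical and the percentile ranks in the example are invented purely for demonstration.

```python
from typing import Dict, Iterator, List, NamedTuple


class Window(NamedTuple):
    start: int    # position of the first residue (inclusive)
    end: int      # position of the last residue (inclusive)
    peptide: str


def sliding_windows(sequence: str, k: int, offset: int = 1) -> Iterator[Window]:
    """Yield every overlapping k-mer; `offset` is the position of sequence[0]."""
    for i in range(len(sequence) - k + 1):
        yield Window(offset + i, offset + i + k - 1, sequence[i:i + k])


def keep_binders(ranks: Dict[str, float], cutoff: float = 20.0) -> List[str]:
    """Return peptides with percentile rank <= cutoff, lowest rank first."""
    hits = [(rank, pep) for pep, rank in ranks.items() if rank <= cutoff]
    return [pep for rank, pep in sorted(hits)]


# Spike residues 511-525 (the 15-mer listed as E24 in Table 1).
fragment = "VVLSFELLHAPATVC"
nine_mers = list(sliding_windows(fragment, 9, offset=511))

# Invented percentile ranks, for demonstration only.
example_ranks = {"VVLSFELLH": 1.5, "VLSFELLHA": 42.0, "LSFELLHAP": 19.0}
selected = keep_binders(example_ranks)
```

The first window reproduces the coordinates given for E3 in Table 1 (VVLSFELLH, 511–519), and a smaller percentile rank sorts first because it indicates higher predicted affinity.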

Prediction of linear BC epitopes

BCPRED (http://ailab.ist.psu.edu/bcpred/) (Saha and Raghava, 2006) was used to predict linear BC epitopes based on several physicochemical properties: hydrophilicity, flexibility, accessibility, and antigenicity propensity (threshold = 1 for each parameter). Simultaneously, the S glycoprotein amino acid sequence was also subjected to iBCE-EL (http://thegleelab.org/iBCE-EL/) (Manavalan et al., 2018) and BepiPred-2.0 (http://www.cbs.dtu.dk/services/BepiPred/) (Jespersen et al., 2017) for additional predictions of linear BC epitopes.

Three-dimensional (3D) interaction between HLA alleles and viral peptides: Molecular docking

To evaluate the presentation of the best epitopes in the context of HLA molecules, a molecular docking study was conducted (Fig. 1). Taking into account that HLA-A*02:01 and HLA-DRB1*01:01 were predicted as common interacting HLA alleles, they were selected for this purpose. The molecular docking simulation and the Gibbs free energy (ΔG) of the HLA-viral peptide complexes were evaluated as recently reported (Marchan, 2020).

Design of the multi-epitope vaccine against SARS-CoV-2

High potential CTC, THC, and linear BC epitopes were selected to generate the amino acid sequence of the multi-epitope vaccine. The CTC and THC epitopes were linked together using AAY and GPGPG linkers, respectively, whereas linear BC epitopes were connected by KK linkers (Fig. 1). Moreover, a TLR4 agonist, known as RS09 (Sequence: APPHALS) (Shanmugam et al., 2012), was added as an adjuvant at the N-terminus by using an EAAAK linker (Fig. 1). For future validation studies, the vaccine molecule must be expressed in vitro and then purified. Therefore, a polyhistidine-tag (6×-H tag) was included at the C-terminus (Fig. 1), which would allow its purification (Loughran et al., 2017).

Physicochemical evaluation and general studies

The ProtParam tool (https://web.expasy.org/protparam/) (Wilkins et al., 1999) was used to examine relevant physicochemical parameters of the multi-epitope vaccine. To reconfirm its viral antigenicity and lack of allergenicity and toxicity, the web tools described in the section "Prediction of allergenicity, toxicity, and viral antigenicity" were applied. In addition, the vaccine solubility was predicted using the SOLpro server (http://scratch.proteomics.ics.uci.edu/) (Magnan et al., 2009).
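Several of the parameters examined here depend only on amino acid composition. The sketch below shows how four of them can be computed from a sequence using the standard published formulas (Kyte-Doolittle hydropathy for GRAVY, the Ikai aliphatic index, and the Trp/Tyr/cystine extinction coefficient). It is an illustration rather than ProtParam's own code; the function names are hypothetical, and the extinction coefficient assumes that all cysteine pairs form cystines.

```python
# Kyte-Doolittle hydropathy scale (standard published values).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean Kyte-Doolittle value."""
    return sum(KD[aa] for aa in seq) / len(seq)


def aliphatic_index(seq: str) -> float:
    """Aliphatic index from mole percentages of Ala, Val, Ile, and Leu."""
    def pct(aa: str) -> float:
        return 100.0 * seq.count(aa) / len(seq)
    return pct("A") + 2.9 * pct("V") + 3.9 * (pct("I") + pct("L"))


def extinction_coefficient(seq: str) -> int:
    """Molar extinction coefficient at 280 nm (M^-1 cm^-1), assuming all
    cysteine pairs form cystines."""
    return (5500 * seq.count("W") + 1490 * seq.count("Y")
            + 125 * (seq.count("C") // 2))


def charged_residues(seq: str) -> tuple:
    """Return (negative, positive) counts: (Asp + Glu, Arg + Lys)."""
    negative = seq.count("D") + seq.count("E")
    positive = seq.count("R") + seq.count("K")
    return negative, positive


# Worked example on one 15-mer from Table 1 (E30).
peptide = "TRFASVYAWNRKRIS"
summary = {
    "gravy": gravy(peptide),
    "aliphatic_index": aliphatic_index(peptide),
    "extinction": extinction_coefficient(peptide),
    "charges": charged_residues(peptide),
}
```

A negative GRAVY value indicates a hydrophilic sequence, and a positive charge balance (more Arg + Lys than Asp + Glu) is consistent with a basic theoretical pI.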

Structure prediction, refinement, and validation

PSIPRED (http://bioinf.cs.ucl.ac.uk/psipred/) (Buchan and Jones, 2019) and GalaxyWEB (http://galaxy.seoklab.org/) (Ko et al., 2012) were used to predict the secondary and tertiary structure, respectively, of the multi-epitope vaccine construct. The best model was refined with GalaxyWEB (Ko et al., 2012). The vaccine structure was validated by comparison with experimentally determined 3D protein structures. To this end, the vaccine structure was submitted to ProSA-web (https://prosa.services.came.sbg.ac.at/prosa.php), which provides a general quality score for a given structure (Wiederstein and Sippl, 2007). Furthermore, a Ramachandran plot was created on the PROCHECK website (https://servicesn.mbi.ucla.edu/PROCHECK/), whereby the protein structure can be validated according to the energetically allowed and disallowed dihedral angles psi and phi of its amino acid residues (Laskowski et al., 1993).

Prediction of conformational BC epitopes

B cells are also considered professional antigen-presenting cells, and they can initiate this function by recognizing the antigen through B cell receptors (Owen et al., 2013). Therefore, conformational epitopes of the multi-epitope vaccine construct were predicted with ElliPro (http://tools.iedb.org/ellipro/) (Ponomarenko et al., 2008), which represents the protein structure as an ellipsoid and calculates protrusion indexes for protein residues outside of this ellipsoid. A minimum score of 0.8 and a maximum distance of 6.0 Å were applied. The epitopes were visualized with the VMD software (Version 1.9.3) to illustrate their position and 3D structure, as previously reported (Marchan, 2020).

Innate immune receptor-Vaccine interaction: Molecular docking

Since TLR4 may serve as a sensor for the recognition of coronavirus S glycoproteins (Lester and Li, 2014), this germline-encoded pattern recognition receptor was selected for the docking study. The 3D structure of TLR4 was obtained from the PDB (accession number: 4G8A). The refined model of the multi-epitope vaccine was used as the ligand. The TLR4-vaccine docking simulation and its 3D visualization were performed as recently reported (Marchan, 2020).

In silico immune response simulations

To further characterize the potential immune response of the multi-epitope vaccine, immune simulations were performed using the C-ImmSim server (http://150.146.2.1/C-IMMSIM/index.php) (Rapin et al., 2010). Three injections were applied four weeks apart as described previously (Nain et al., 2020). Furthermore, 12 injections were applied four weeks apart to simulate repeated exposure to the potential immunogen. The Simpson index D was used to interpret the diversity of the immune response.
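For readers unfamiliar with the Simpson index, the sketch below computes it in its dominance form, D = Σ pᵢ², where pᵢ is the fraction of cells belonging to clone i. Which variant the simulation server reports is not stated here, so this form is an assumption; some sources report the complement 1 − D instead. The function name and clone labels are hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable


def simpson_index(clone_labels: Iterable[Hashable]) -> float:
    """Simpson index in dominance form: the sum of squared clone fractions.

    D is 1.0 when a single clone makes up the whole population and
    approaches 0 as the population spreads evenly over many clones,
    so a lower D means a more diverse response.
    """
    counts = Counter(clone_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one cell is required")
    return sum((n / total) ** 2 for n in counts.values())


# Hypothetical populations: one dominant clone versus four equal clones.
d_single = simpson_index(["clone_a"] * 4)
d_even = simpson_index(["clone_a", "clone_b", "clone_c", "clone_d"])
```

Under this convention, a rising D over the course of a simulation would indicate that a few clones are coming to dominate the response.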

Population coverage

Global population coverage of the multi-epitope vaccine construct was calculated with IEDB-AR (http://tools.iedb.org/population/) (Bui et al., 2006). The HLA allele genotypic frequencies available on IEDB-AR were obtained from the Allele Frequency Database (AFD) (http://www.allelefrequencies.net/). At present, AFD contains allele frequencies from 115 countries and 21 different ethnicities (http://www.allelefrequencies.net/). Those 115 countries were selected, and the HLA-I and HLA-II interacting alleles predicted in this work were included to perform a combined HLA analysis. A higher result means that the vaccine has a broader HLA coverage and, therefore, could be applied more widely across the world. The results were shown on a world map using Datawrapper (https://www.datawrapper.de/).
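The idea behind a combined HLA coverage figure can be sketched with a deliberately simplified model. The code below is not the IEDB-AR algorithm: it assumes Hardy-Weinberg proportions within each locus and independence between loci, the function names are hypothetical, and the allele frequencies are invented for demonstration.

```python
from typing import Iterable, List


def locus_coverage(covered_allele_freqs: Iterable[float]) -> float:
    """Fraction of individuals carrying at least one covered allele at a locus.

    With summed covered-allele frequency p and Hardy-Weinberg proportions,
    an individual lacks every covered allele with probability (1 - p)**2.
    """
    p = min(sum(covered_allele_freqs), 1.0)
    return 1.0 - (1.0 - p) ** 2


def combined_coverage(loci: List[List[float]]) -> float:
    """Coverage over several loci, assuming the loci are independent."""
    uncovered = 1.0
    for freqs in loci:
        uncovered *= 1.0 - locus_coverage(freqs)
    return 1.0 - uncovered


# Invented frequencies of covered alleles at two loci in one population.
hla_a = [0.25, 0.15]      # summed frequency 0.40 -> locus coverage 0.64
hla_drb1 = [0.10, 0.10]   # summed frequency 0.20 -> locus coverage 0.36
coverage = combined_coverage([hla_a, hla_drb1])
```

In this toy case the combined coverage is 1 − (0.36 × 0.64) ≈ 0.77, which shows why including alleles from more loci, or more alleles per locus, pushes the combined figure towards 100%.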

Results

Prediction of T cell epitopes

The in silico approach (Fig. 1) allowed the prediction of a total of 47 T cell epitopes, of which 7 cytotoxic T cell (CTC) and 11 T helper cell (THC) epitopes were identified as the best (Table 1). These epitopes showed potent viral antigenicity (ranging from 0.63 to 1.52) and lacked allergenic or toxic residues in their sequences (Table 1). Moreover, the THC epitopes were characterized by their potential capability to induce IFN-g (Table 1). Although “EGFNCYFPLQSYGFQ” (E47 in Table 1) could be categorized as a strong potential THC epitope, it was identified as a probable inducer of toxicity. Therefore, this epitope was not included in the amino acid sequence of the multi-epitope vaccine.

Table 1

Evaluation of potential T cell epitopes.

ID	Epitope sequence	Position (start–end)	Percentile rank	Viral antigenicity	Allergenicity	Toxicity	IFN-g inductor	Selected for vaccine design
	CTC-E (9-mer peptides):
E1	EPLVDLPIG	224–232	4.1	0.46	Negative	Negative	ne	No
E2	APGQTGKIA	411–419	1.8	1.20	Positive	Negative	ne	No
E3	VVLSFELLH	511–519	1.5	1.41	Negative	Negative	ne	Yes
E4	FPLQSYGFQ	490–498	0.4	0.46	Positive	Negative	ne	No
E5	ATRFASVYA	344–352	0.2	0.09	Negative	Negative	ne	No
E6	VDLPIGINI	227–235	0.01	1.38	Negative	Negative	ne	Yes
E7	FTISVTTEI	718–726	0.01	0.85	Negative	Negative	ne	Yes
E8	SVYAWNRKR	349–357	0.01	0.76	Positive	Negative	ne	No
E9	YLQPRTFLL	269–277	0.02	0.45	Positive	Negative	ne	No
E10	VRFPNITNL	327–335	0.03	1.11	Negative	Negative	ne	Yes
E11	FERDISTEI	464–472	0.04	0.74	Positive	Negative	ne	No
E12	TLDSKTQSL	109–117	0.04	1.06	Positive	Negative	ne	No
E13	KIADYNYKL	417–425	0.04	1.66	Positive	Negative	ne	No
E14	YLQPRTFLL	269–277	0.04	0.45	Positive	Negative	ne	No
E15	YSKHTPINL	204–212	0.05	1.05	Negative	Negative	ne	Yes
E16	VGYLQPRTF	267–275	0.06	1.22	Positive	Negative	ne	No
E17	TLKSFTVEK	302–310	0.06	0.08	Positive	Negative	ne	No
E18	FEYVSQPFL	168–176	0.07	0.63	Negative	Negative	ne	Yes
E19	GFQPTNGVG	496–504	5.9	0.64	Negative	Negative	ne	Yes
E20	VRQIAPGQT	407–415	18.0	0.86	Positive	Negative	ne	No

	THC-E (15-mer peptides):
E21	ITRFQTLLALHRSYL	235–249	0.02	0.11	Negative	Negative	Yes	No
E22	QSLLIVNNATNVVIK	115–129	0.02	0.43	Negative	Negative	No	No
E23	LSFELLHAPATVCGP	513–527	0.03	0.51	Negative	Negative	No	No
E24	VVLSFELLHAPATVC	511–525	0.03	0.86	Negative	Negative	Yes	Yes
E25	SLLIVNNATNVVIKV	116–130	0.03	0.47	Negative	Negative	Yes	No
E26	KTQSLLIVNNATNVV	113–127	0.17	0.63	Negative	Negative	Yes	Yes
E27	AIPTNFTISVTTEIL	713–727	0.40	0.68	Negative	Negative	Yes	Yes
E28	SFVIRGDEVRQIAPG	399–413	0.51	0.58	Negative	Negative	Yes	Yes
E29	TPINLVRDLPQGFSA	208–222	0.51	0.55	Negative	Negative	No	No
E30	TRFASVYAWNRKRIS	345–359	0.52	0.50	Negative	Negative	Yes	Yes
E31	IPTNFTISVTTEILP	714–728	0.52	0.83	Negative	Negative	No	No
E32	PTESIVRFPNITNLC	322–336	0.64	0.25	Negative	Negative	No	No
E33	ECSNLLLQYGSFCTQ	748–762	0.72	0.76	Negative	Negative	Yes	Yes
E34	LQIPFAMQMAYRFNG	894–908	0.73	0.72	Negative	Negative	No	No
E35	HTPINLVRDLPQGFS	207–221	0.74	0.39	Negative	Negative	No	No
E36	ADYSVLYNSASFSTF	363–377	0.85	0.22	Negative	Negative	No	No
E37	SKTQSLLIVNNATNV	112–126	0.99	0.62	Positive	Negative	No	No
E38	TDEMIAQYTSALLAG	866–880	1.10	0.16	Negative	Negative	Yes	No
E39	QMAYRFNGIGVTQNV	901–915	1.10	1.03	Negative	Negative	No	No
E40	QIPFAMQMAYRFNGI	895–909	0.44	0.96	Negative	Negative	No	No
E41	AALQIPFAMQMAYRF	892–906	0.44	0.91	Negative	Negative	No	No
E42	LEPLVDLPIGINITR	223–237	28.00	1.01	Negative	Negative	Yes	Yes
E43	DEVRQIAPGQTGKIA	405–419	27.00	0.98	Negative	Negative	Yes	Yes
E44	EVRQIAPGQTGKIAD	406–420	33.00	1.34	Negative	Negative	Yes	Yes
E45	VRQIAPGQTGKIADY	407–421	41.00	1.30	Negative	Negative	Yes	Yes
E46	RQIAPGQTGKIADYN	408–422	49.00	1.52	Negative	Negative	Yes	Yes
E47	EGFNCYFPLQSYGFQ	484–498	7.20	0.57	Negative	Positive	Yes	No

E: Epitope; CTC-E: cytotoxic T cell epitope; THC-E: T helper cell epitope; ne: not evaluated. Epitopes selected for the final vaccine design are marked "Yes" in the last column.

HLA-I and HLA-II interacting alleles

The selected CTC epitopes (Table 1) showed promiscuous affinity for several HLA-I alleles, including HLA-A*01:01, HLA-A*02:01, HLA-A*03:01, HLA-A*11:01, HLA-A*23:01, HLA-A*25:01, HLA-A*30:01, HLA-A*68:01, HLA-A*74:01, HLA-B*07:02, HLA-B*08:01, HLA-B*13:01, HLA-B*13:02, HLA-B*14:02, HLA-B*15:01, HLA-B*15:02, HLA-B*18:01, HLA-B*27:02, HLA-B*35:03, HLA-B*40:01, HLA-B*58:01, HLA-C*01:02, HLA-C*02:02, HLA-C*02:09, HLA-C*03:02, HLA-C*03:03, HLA-C*03:04, HLA-C*04:01, HLA-C*05:01, HLA-C*06:02, HLA-C*07:01, HLA-C*08:01, HLA-C*12:02, HLA-C*12:03, HLA-C*14:02, HLA-C*15:02, HLA-C*16:01, and HLA-C*17:01. Likewise, the selected THC epitopes (Table 1) commonly interacted with the following HLA-II alleles: HLA-DRB1*01:01, HLA-DRB1*01:03, HLA-DRB1*07:01, HLA-DRB1*15:01, HLA-DRB1*01:20, HLA-DRB3*02:02, HLA-DRB4*01:01, and HLA-DRB5*01:01.

Prediction of linear B cell (BC) epitopes

A total of 10 linear BC epitopes of varying amino acid lengths were predicted (Table 2). Most of the epitopes showed robust viral antigenicity (≥0.5) and were identified as non-allergenic and non-toxic (Table 2). However, only 7 epitopes were selected for the vaccine design because they were predicted simultaneously by 3 different web tools (BCPRED, iBCE-EL, and BepiPred-2.0) and also showed viral antigenicity ≥0.5 (Table 2). Interestingly, overlapping residues were observed between some linear BC and T cell epitopes.

Table 2

Evaluation of potential linear B cell epitopes.

ID	Epitope sequence	Position (start–end)	Predicted on BCPRED, iBCE-EL, and BepiPred-2.0	Viral antigenicity	Allergenicity	Toxicity	Selected for vaccine design
E1	TTLDSKTQSL	127–136	Yes	0.99	Negative	Non-toxin	Yes
E2	MDLEGKQGNFKNLREF	196–211	Yes	0.83	Negative	Non-toxin	Yes
E3	PDPSKPSKRS	826–835	Yes	0.84	Negative	Non-toxin	Yes
E4	ILDITPCSFGGVSVITPG	603–620	Yes	1.10	Negative	Non-toxin	Yes
E5	YQPYRVVVLSFELLH	524–538	Negative on iBCE-EL	0.97	Negative	Non-toxin	No
E6	FSTFKCYGVSPT	270–281	Yes	0.81	Negative	Non-toxin	Yes
E7	VYYHKNNKSWMESEFRVYSS	162–181	Yes	0.30	Negative	Non-toxin	No
E8	GDEVRQIAPGQTGKI	423–437	Negative on iBCE-EL	0.97	Negative	Non-toxin	No
E9	NLDSKVGGNYNY	459–470	Yes	1.09	Negative	Non-toxin	Yes
E10	GFQPTNGVGYQPYR	496–509	Yes	0.72	Negative	Non-toxin	Yes

E: epitope. Epitopes selected for the final vaccine design are marked "Yes" in the last column.

HLA allele-viral peptide interaction: Molecular docking

To evaluate the presentation of the best epitopes in the context of HLA, molecular docking simulations were conducted. For this purpose, HLA-A*02:01 and HLA-DRB1*01:01 were chosen as representative alleles. HLA-I and HLA-II alleles were docked with CTC and THC epitopes, respectively, using the ClusPro server, which has recently been applied to successfully dock epitopes from SARS-CoV-2 non-structural proteins into HLA molecules (Marchan, 2020). Inspection with the VMD software revealed different binding patterns in which the viral peptides interact with the active site residues of the HLA groove in a similar way to control peptides (Fig. 2A). Moreover, several viral peptides (e.g., E18 and E33) formed a bulge that projected from their respective HLA allele (Fig. 2B and C), which is relevant, for instance, to activate CTC against SARS-CoV-2 infected cells (Owen et al., 2013) (Fig. 2E). Importantly, each HLA-viral peptide complex showed robust potential interactions (free energy values below −7 kcal/mol) comparable to those of control peptides (Fig. 2D).
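To give a sense of scale for these free energy values, the standard thermodynamic relation ΔG = RT ln(Kd) converts a binding free energy into an approximate dissociation constant. The sketch below is only an interpretive aid and is not part of the docking workflow; the function name is hypothetical, and the temperature of 298.15 K is an assumption.

```python
import math

# Gas constant in kcal mol^-1 K^-1.
R_KCAL = 1.98720425864083e-3


def dissociation_constant(delta_g_kcal: float,
                          temperature_k: float = 298.15) -> float:
    """Approximate Kd (mol/L) from a binding free energy in kcal/mol,
    using delta_G = R * T * ln(Kd)."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


kd_at_minus_7 = dissociation_constant(-7.0)
kd_at_minus_8 = dissociation_constant(-8.0)
```

Under these assumptions, −7 kcal/mol corresponds to a Kd in the low micromolar range, and each additional −1 kcal/mol tightens binding by roughly a factor of five.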

Fig. 2

Screenshots of the HLA-viral peptide complexes. (A) Top view of HLA class I (HLA-A*02:01) and HLA class II (HLA-DRB1*01:01) presenting 9-mer and 15-mer viral peptides, respectively. HLA alleles, viral peptides, and control peptides are shown in grey, cyan, and green, respectively. Epitopes are named according to the nomenclature established in Table 1. (B–C) Representative lateral views of HLA alleles interacting with viral peptides. Importantly, the peptides formed a bulge that projects from the HLA groove, which could lead to a more direct interaction with immune cell receptors such as the T cell receptor. (D) Free energy for each HLA-viral peptide interaction. Note that the values of control peptides (CTRL+) are shown as green bars. (E) Diagram showing the presentation of viral peptides in the context of HLA-I alleles to cytotoxic T cells (CTC), which may target SARS-CoV-2 infected cells. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Design of the multi-epitope vaccine against SARS-CoV-2: General evaluations

To design the amino acid sequence of the multi-epitope vaccine, epitopes were organized using several linkers (Fig. 1). This sequence consists of 425 amino acid residues (Fig. 1). Of particular note, several epitopes selected for the vaccine design (E19, E42, E43, E44, and E45 in Table 1; E10 in Table 2) harbour residues that are usually involved in the interaction between the SARS-CoV-2 S glycoprotein and hACE2 (Lan et al., 2020a; Ortega et al., 2020; Lan et al., 2020b). For instance, N501, which is present in the amino acid sequence of E19 (Table 1) and E10 (Table 2), has recently been described as one of the critical hACE2-binding residues in SARS-CoV-2 (Lan et al., 2020a). The vaccine showed a strong viral antigenicity (0.64), and neither allergenic nor toxic residues were observed in its amino acid sequence. Furthermore, the physicochemical properties examined with the ProtParam tool, including molecular weight, theoretical pI, amino acid composition, atomic composition, extinction coefficient, estimated half-life, instability index, aliphatic index, and GRAVY, fell within ranges conventionally considered acceptable (Table 3).
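The linker scheme described in the methods (RS09 adjuvant plus EAAAK at the N-terminus, AAY between CTC epitopes, GPGPG between THC epitopes, KK between linear BC epitopes, and a 6×-H tag at the C-terminus) can be sketched as a simple string assembly. The epitope lists below are the rows marked as selected in Tables 1 and 2. The order of the three epitope blocks, the order of epitopes within each block, and the two junction linkers between blocks (GPGPG and KK) are assumptions of this sketch, since they are shown only in Fig. 1; the function name is hypothetical.

```python
ADJUVANT = "APPHALS"   # RS09, as given in the text
ADJUVANT_LINKER = "EAAAK"
HIS_TAG = "H" * 6

# Epitopes marked "Yes" in Table 1 (CTC and THC) and Table 2 (linear BC).
CTC = ["VVLSFELLH", "VDLPIGINI", "FTISVTTEI", "VRFPNITNL",
       "YSKHTPINL", "FEYVSQPFL", "GFQPTNGVG"]
THC = ["VVLSFELLHAPATVC", "KTQSLLIVNNATNVV", "AIPTNFTISVTTEIL",
       "SFVIRGDEVRQIAPG", "TRFASVYAWNRKRIS", "ECSNLLLQYGSFCTQ",
       "LEPLVDLPIGINITR", "DEVRQIAPGQTGKIA", "EVRQIAPGQTGKIAD",
       "VRQIAPGQTGKIADY", "RQIAPGQTGKIADYN"]
LBC = ["TTLDSKTQSL", "MDLEGKQGNFKNLREF", "PDPSKPSKRS",
       "ILDITPCSFGGVSVITPG", "FSTFKCYGVSPT", "NLDSKVGGNYNY",
       "GFQPTNGVGYQPYR"]


def assemble(ctc, thc, lbc, ctc_thc_junction="GPGPG", thc_lbc_junction="KK"):
    """Join the blocks in an assumed order: adjuvant, CTC, THC, BC, His tag.

    The two junction linkers are assumptions; the text only specifies the
    linkers used inside each block.
    """
    return (ADJUVANT + ADJUVANT_LINKER
            + "AAY".join(ctc) + ctc_thc_junction
            + "GPGPG".join(thc) + thc_lbc_junction
            + "KK".join(lbc) + HIS_TAG)


construct = assemble(CTC, THC, LBC)
length = len(construct)
negative = construct.count("D") + construct.count("E")
positive = construct.count("R") + construct.count("K")
```

Under these assumptions the assembled string has 425 residues, with 25 Asp + Glu and 43 Arg + Lys residues, which agrees with the values listed in Table 3. This agreement is consistent with the assumed layout but does not confirm the authors' exact epitope order.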

Table 3

Physicochemical parameters of the multiple-epitope vaccine construct.

Parameter	Value	Comment
Number of aa	425	Suitable
Molecular weight	44 kDa	Average
Theoretical pI	9.59	Slightly basic
Negatively charged residues (D + E)	25	–
Positively charged residues (R + K)	43	Net positive charge
Extinction coefficient (at 280 nm in water)	31,080 M⁻¹ cm⁻¹	–
The instability index (II)	26.42	Stable
Aliphatic index	74.82	Thermostable
GRAVY	−0.306	Hydrophilic
Solubility	0.703269	Soluble
Estimated half-life:		Satisfactory
Mammalian reticulocytes (in vitro)	4.4 h
Yeast (in vivo)	>20 h
Escherichia coli (in vivo)	>10 h

aa: amino acid; GRAVY: Grand average of hydropathicity.

The vaccine construct was analysed using the PSIPRED server to predict its secondary structure, which identified 309, 65, and 51 amino acids forming coil, helix, and strand regions, respectively (Fig. 3).

Fig. 3

Secondary structure prediction of the multi-epitope vaccine against SARS-CoV-2. This graphic was obtained from PSIPRED.

The predicted tertiary structure was subjected to refinement using the GalaxyRefine server. The output showed four potential models. Model 1 (Fig. 4A) was classified as the best according to the web tool and was therefore selected for further analysis. The Ramachandran plot (Fig. 4B) showed that 98.1% of residues were located in allowed regions, whereas the remaining 1.9% were located in disallowed regions. In addition, the Z-score (−2.4) (Fig. 4C) suggests that the vaccine structure is similar to native proteins of comparable size.

Fig. 4

Tertiary structure prediction and validation of the multi-epitope vaccine. (A) Tertiary structure after refinement shown as a space-filling model (inset) and the best conformational B cell epitopes (CE) identified in the vaccine construct. Of note, these epitopes (cyan atoms) showed high probability scores (>0.8), thereby suggesting considerable accessibility for antibodies (shown in turquoise). (B) Ramachandran plot of the 3D refined structure. (C) Z-score plot obtained from ProSA-web. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Prediction of conformational BC epitopes

Four conformational BC epitopes (CE) were predicted using ElliPro (Fig. 4A). These CE showed high probability scores (CE1: 0.911, CE2: 0.819, CE3: 0.803, and CE4: 0.803), suggesting considerable accessibility for antibodies (Fig. 4A). These results also support the immunogenic potential of the multi-epitope vaccine construct.

TLR4-vaccine model interaction: Molecular docking

The vaccine model interacted suitably (Fig. 5) with TLR4 chain B and the myeloid differentiation factor 2 (MD-2) molecule, which are known to initiate signalling cascades in vivo (Owen et al., 2013), thereby suggesting that such a vaccine model could elicit an appropriate immune response against SARS-CoV-2. Importantly, the adjuvant inserted in the vaccine sequence was located in the zone of interaction with TLR4 and MD-2, which indicates the relevance of this adjuvant to the potential efficacy of the present vaccine model.

Fig. 5

Docked complex of TLR4, MD-2 (myeloid differentiation factor 2), and the multi-epitope vaccine.

Immune response simulations

The immune response simulations with the multi-epitope vaccine construct (3 doses given 4 weeks apart) showed cell-mediated and humoral responses. As expected, an increased number and activity of macrophages and of natural killer (NK) cells, a relevant line of attack against viruses (Owen et al., 2013), were observed (Fig. 6). Regarding the adaptive immune response, CTC and THC populations showed a proliferative burst, effector cell generation, and a dramatic contraction in cell number (Fig. 6). Importantly, IL-2, which is necessary for T cell activation and optimal proliferation (Owen et al., 2013), was amplified after each dose (Fig. 6). Moreover, the vaccine model increased BC and plasma cell populations, particularly those of the immunoglobulin M (IgM) and IgG1 isotypes (Fig. 6). In this regard, titres of IgM, IgG1, and IgG2 were higher in the secondary and tertiary responses than in the primary response (Fig. 6). Of note, immunogen concentrations decreased after the antibody response (Fig. 6). Notably, repeated exposure with 12 injections (given 4 weeks apart) increased the IgG1 levels and stimulated CTC and THC populations (Fig. S1). Taken together, these results suggest that the multi-epitope vaccine could evoke and maximize both effector responses and immunological memory to SARS-CoV-2.

Fig. 6

Immune response simulations with the multi-epitope vaccine (3 doses given 4 weeks apart). (A) Innate immune response after in silico simulations. NK cells and macrophages were properly activated. (B) The adaptive immune response was clearly elicited after vaccine shots on the C-ImmSim server, including the cytotoxic T cell, T helper cell, B cell, and antibody responses as well as cytokine production (red arrows). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Population coverage

To investigate whether the multi-epitope vaccine may be used in different ethnic groups or globally, a population coverage analysis was performed. Remarkably, the multi-epitope vaccine construct showed a high global population coverage: 99.69%. For instance, several countries with positive reports of COVID-19 (>6000 cases) (World Health Organization, n.d.) obtained the highest values, including Australia, Brazil, Ecuador, Chile, China, France, Germany, India, Iran, Israel, Italy, Japan, Mexico, Morocco, Peru, Philippines, Russia, Singapore, South Korea, Spain, Sweden, USA, and UK, among others (Fig. 7).

Fig. 7

Combined HLA population coverage analysis of the multi-epitope vaccine against SARS-CoV-2.

Discussion

Immunoinformatics represents a valuable tool for overcoming the limitations in the selection of appropriate antigens and immunodominant epitopes (Sharma et al., 2019). Previous in silico reports have shown that the SARS-CoV-2 S glycoprotein contains potential epitopes (Gupta et al., 2013). Therefore, researchers have recently attempted to design epitope-based vaccine candidates against SARS-CoV-2 (Gupta et al., 2013). In the present study, high-potential B and T cell epitopes from the SARS-CoV-2 S glycoprotein were predicted, and the best were selected to design a high-quality multi-epitope vaccine candidate. Remarkably, this vaccine harbours 2 epitopes (E19 in Table 1 and E10 in Table 2) that could evoke immune responses against the SARS-CoV-2 RBM, the region mainly responsible for virus entry into human cells (Lan et al., 2020a), whereas 4 epitopes (E43, E44, E45, and E46 in Table 1) may direct the immune attack against other regions of the SARS-CoV-2 RBD. These results are consistent with in vitro data that have demonstrated the antigenicity of the SARS-CoV-2 S glycoprotein (Walls et al., 2020).

The T cell epitopes included in the vaccine sequence meet the relevant requirements for designing a suitable multi-epitope vaccine candidate. Firstly, they showed marked antigenicity and immunogenicity and lacked allergenic or toxic residues. Secondly, the THC epitopes were predicted to be potent inducers of IFN-g, a crucial cytokine for CTC activation (Owen et al., 2013). Thirdly, both CTC and THC epitopes interacted properly with the groove of HLA-I and HLA-II alleles, respectively, which is in agreement with other computer-based reports (Marchan, 2020) and suggests that the T cell epitopes identified and selected in the present study could be successfully presented in the context of HLA molecules.
In addition, most of the peptides arched away from the HLA alleles and are, therefore, more exposed, which suggests that they could interact more directly with the T cell receptor, possibly leading to proper activation of T cells (Owen et al., 2013).

The purpose of an adjuvant is to make a vaccine “detectable” for antigen-presenting cells such as dendritic cells (Owen et al., 2013). Here, the TLR4 agonist RS09 (Shanmugam et al., 2012) was included as an adjuvant in the multi-epitope vaccine sequence. The molecular docking simulation showed that the multi-epitope vaccine interacts appropriately with this innate immune receptor, in a similar way to previous works (Pandey et al., 2018).

Notably, this study shows, by immunoinformatics simulations, the induction of both innate and adaptive responses to SARS-CoV-2. In this regard, NK cell and macrophage activation, high production of typical antibodies (IgM and IgG) and cytokines (IFN-g and IL-2), and a proliferative burst of CTC and THC were observed after three injections. The generation and increase of plasma cells were also documented. Furthermore, B and T cell populations decreased along with immunogen levels. These data are comparable to those of previous investigations focused on vaccine development against Mycobacterium ulcerans (Nain et al., 2020) and filarial diseases (Shey et al., 2019). They also agree, at least partially, with a recent study that demonstrated a positive correlation between robust CD4+ THC responses and anti-SARS-CoV-2 IgG and IgA titres in COVID-19 convalescent patients (Grifoni et al., 2020b). These immune responses were directed to the SARS-CoV-2 S glycoprotein (Grifoni et al., 2020b).

Recently, Kar et al. (2020) reported a similar potential vaccine model, whose epitopes were also derived from the S glycoprotein. However, there are important differences to point out.
For instance, it is not clear why the authors docked the entire vaccine model with HLA class I and class II alleles, which are known to interact only with peptides of a relatively small amino acid length (Owen et al., 2013) (9-mer for HLA class I molecules and 15-mer for HLA class II molecules), as has been shown in the present and other similar works focused on SARS-CoV-2 and vaccine development (Marchan, 2020). In addition, the docking simulations between the vaccine model and TLR-4 showed that the vaccine does not interact with the region formed by TLR-4 and MD-2 (Kar et al., 2020), which is pivotal for antigen recognition and initiation of the immune response (Park et al., 2009). Importantly, the population coverage of the vaccine molecule reported in the present study is higher than that of the vaccine model from Kar et al. (2020) (99.69% vs 95.78%). On the other hand, Tahir Ul Qamar et al. (2020) selected seven SARS-CoV-2 proteins, including the S glycoprotein, to design a vaccine construct. Interestingly, they did not detect the S glycoprotein as antigenic on the VaxiJen website, which contrasts with previous reports in the literature wherein the S glycoprotein has been demonstrated to induce immune responses (Grifoni et al., 2020a). This result may be due to the fact that complex proteins harbour both antigenic and non-antigenic regions; to identify and select the former, it is necessary to apply an integrated approach in which different algorithms are used to validate the predicted output (Grifoni et al., 2020a; Marchan, 2020). This work has two limitations. A) The population coverage analysis did not include some countries, particularly from Africa, Central America, Eastern Europe, and Central Asia. This was mainly due to the unavailability of data on HLA allele frequencies. Nevertheless, the highest population coverage was observed in several of the countries worst hit by COVID-19 (e.g., Brazil, China, France, Italy, Iran, Peru, Spain, USA, etc.) 
(World Health Organization, n.d.). B) This study did not explore whether the epitopes used for vaccine design are conserved in other betacoronaviruses. However, previous reports have already demonstrated that SARS-CoV-2 shares 79.5% and 50% sequence identity with SARS-CoV and MERS-CoV, respectively (Lu et al., 2020). In summary, this research provides a novel multi-epitope vaccine built from high-potential epitopes derived from the SARS-CoV-2 S glycoprotein. This immunoinformatics study suggests that such a multi-epitope vaccine could simultaneously activate and generate robust humoral and cell-mediated responses against SARS-CoV-2, and the population coverage analysis indicates that it could be used globally. However, further rigorous in vitro and in vivo studies are imperative to confirm its immunogenic properties, safety, and efficacy, which would, of course, take months or even years.
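
The peptide-length argument made in the comparison with Kar et al. (2020), namely that HLA class I molecules bind 9-mers and HLA class II molecules bind 15-mers, can be illustrated with a minimal sketch. It only enumerates overlapping fixed-length peptides from a protein sequence, which is the kind of input a binding predictor or docking step would receive instead of a whole vaccine construct. The function name `sliding_peptides` and the toy sequence are hypothetical and are not taken from this study or from any prediction server.

```python
from typing import List

# Peptide lengths cited in the text: 9-mers for HLA class I, 15-mers for HLA class II.
HLA_I_LENGTH = 9
HLA_II_LENGTH = 15


def sliding_peptides(sequence: str, length: int) -> List[str]:
    """Return every overlapping peptide of the given length, in sequence order.

    Hypothetical helper for illustration; returns an empty list when the
    sequence is shorter than the requested length.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# Toy 20-residue sequence (an assumption for illustration only,
# NOT a fragment of the SARS-CoV-2 S glycoprotein).
toy_sequence = "ACDEFGHIKLMNPQRSTVWY"

class_i_candidates = sliding_peptides(toy_sequence, HLA_I_LENGTH)
class_ii_candidates = sliding_peptides(toy_sequence, HLA_II_LENGTH)

print(len(class_i_candidates), class_i_candidates[0])    # 12 ACDEFGHIK
print(len(class_ii_candidates), class_ii_candidates[0])  # 6 ACDEFGHIKLMNPQR
```

A sequence of N residues yields N - k + 1 candidate peptides of length k, so the number of candidates grows with the protein length while each candidate stays small enough to fit an HLA binding groove.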

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Declaration of Competing Interest

I have no conflict of interest to declare.

43 in total

1. The structural basis of lipopolysaccharide recognition by the TLR4-MD-2 complex.

Authors: Beom Seok Park; Dong Hyun Song; Ho Min Kim; Byong-Seok Choi; Hayyoung Lee; Jie-Oh Lee
Journal: Nature Date: 2009-03-01 Impact factor: 49.962

2. Purification of Polyhistidine-Tagged Proteins.

Authors: Sinéad T Loughran; Ronan T Bree; Dermot Walls
Journal: Methods Mol Biol Date: 2017

3. Modeling the MHC class I pathway by combining predictions of proteasomal cleavage, TAP transport and MHC class I binding.

Authors: S Tenzer; B Peters; S Bulik; O Schoor; C Lemmel; M M Schatz; P-M Kloetzel; H-G Rammensee; H Schild; H-G Holzhütter
Journal: Cell Mol Life Sci Date: 2005-05 Impact factor: 9.261

4. Structural basis and designing of peptide vaccine using PE-PGRS family protein of Mycobacterium ulcerans-An integrated vaccinomics approach.

Authors: Zulkar Nain; Mohammad Minnatul Karim; Monokesh Kumer Sen; Utpal Kumar Adhikari
Journal: Mol Immunol Date: 2020-02-29 Impact factor: 4.407