Muhammad Tahir Ul Qamar (1), Farah Shahid (2), Sadia Aslam (3), Usman Ali Ashfaq (4), Sidra Aslam (2), Israr Fatima (2), Muhammad Mazhar Fareed (2), Ali Zohaib (5), Ling-Ling Chen (6).
1. College of Life Science and Technology, Guangxi University, Nanning, P. R. China.
2. Department of Bioinformatics and Biotechnology, Government College University Faisalabad (GCUF), Faisalabad, Pakistan.
3. Jinnah Hospital, Lahore, Pakistan.
4. Department of Bioinformatics and Biotechnology, Government College University Faisalabad (GCUF), Faisalabad, Pakistan. usmancemb@gmail.com.
5. Department of Healthcare Biotechnology, Atta-ur-Rahman School of Applied Biosciences (ASAB), National University of Sciences and Technology (NUST), Islamabad, Pakistan.
6. College of Life Science and Technology, Guangxi University, Nanning, P. R. China. llchen@mail.hzau.edu.cn.
Abstract
BACKGROUND: Coronavirus disease 2019 (COVID-19), linked with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), causes severe illness and life-threatening pneumonia in humans. The current COVID-19 pandemic demands an effective vaccine that protects against the infection. Therefore, the present study aimed to design a multiepitope-based subunit vaccine (MESV) against COVID-19. METHODS: Structural proteins (Surface glycoprotein, Envelope protein, and Membrane glycoprotein) of SARS-CoV-2 are responsible for its prime functions. Sequences of these proteins were downloaded from GenBank, and several immunoinformatics and computational approaches were employed to predict B- and T-cell epitopes from the highly antigenic SARS-CoV-2 structural proteins in order to design an effective MESV. RESULTS: Predicted epitopes showed high antigenicity, conservation, substantial interactions with human leukocyte antigen (HLA) binding alleles, and a collective global population coverage of 88.40%. Taken together, a 276 amino acid long MESV was designed by connecting 3 cytotoxic T lymphocyte (CTL), 6 helper T lymphocyte (HTL) and 4 B-cell epitopes with a suitable adjuvant and linkers. The MESV construct was non-allergenic, stable, and highly antigenic. Molecular docking showed stable, high-affinity binding of MESV to human toll-like receptor 3 (TLR3). Furthermore, in silico immune simulation revealed a significant immunogenic response to MESV. Finally, MESV codons were optimized for in silico cloning into the Escherichia coli K-12 system to ensure increased expression. CONCLUSION: The MESV developed in this study is capable of generating an immune response against COVID-19. Therefore, if the designed MESV is further investigated experimentally, it would be an effective vaccine candidate against SARS-CoV-2 to control and prevent COVID-19.
Viruses can become a serious threat to life and cause irreparable loss to human beings. Hardly has the world learned to cope with one viral strain when another emerges and poses a threat to the future of humanity. A similar situation arose when a novel coronavirus (CoV) strain that had not previously been identified in humans was reported in December 2019 [1, 2]. Coronaviruses are the largest among RNA viruses belonging to the Coronaviridae, Roniviridae and Arteriviridae families. Coronaviridae are unsegmented, 3′ polyadenylated and 5′ capped positive-sense single-stranded RNA viruses that cause various respiratory diseases in humans [2, 3]. CoVs are classified into four genera: alpha, beta, delta, and gamma. Among them, alpha and beta CoVs have been reported to infect humans [4]. The recent CoV strain has received tremendous attention from researchers, as it has caused a global pandemic of coronavirus disease 2019 (COVID-19) [5]. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was identified as the causative agent of this pandemic [6]. Genome sequence studies have indicated that SARS-CoV-2 is closely related to SARS-CoV, the causative agent of the SARS disease in 2002/2003 [7]. Initial investigations indicated that SARS-CoV-2 spreads primarily through respiratory droplets from sneezing or coughing, through body contact and, to some extent, through fecal contact [8]. Symptoms of SARS-CoV-2 infection may appear within 14 days after exposure, although in some cases they take more than 14 days. Symptoms of patients with COVID-19 include fever, runny nose, cough, and dyspnea [9]. Although the entire genome sequence of the virus has been published, the origin and proliferation mechanism of the new coronavirus are still ambiguous, as stated by the World Health Organization [10].
Initial reports claimed that bats, snakes, pangolins or civets could be a possible animal source, but these claims are under debate and need substantial research to be proven [6, 11–13]. Researchers are currently working to identify the source of SARS-CoV-2, including possible intermediate animal vectors.

Samples taken from the respiratory system (throat swab or lung fluid) are helpful in diagnosing the infection in patients [14]. A dedicated clinical diagnostic test based on reverse transcription-PCR was developed [15]. Over 200 clinical trials are currently underway to test new and repurposed compounds against SARS-CoV-2 [16, 17]. Several medications such as hydroxychloroquine, remdesivir, and dexamethasone are being tested in clinical trials [18-21]. Several vaccines, including subunit vaccines [18, 22], nano-particle based vaccines, viral vector vaccines (adenovirus vector, Ankara vector), inactivated vaccines, fusion-protein based vaccines, recombinant protein vaccines, DNA vaccines, and live-attenuated vaccines, are also being developed and are in pre-clinical trials, but these vaccines are many months away from the market [23-27].

Immunoinformatics approaches can be applied to examine viral antigens, predict their epitopes and assess their immunogenicity [28, 29]. Moreover, this approach can be both time- and cost-effective [3, 30, 31]. Severe respiratory infection can also be resolved by T-cell responses and antibodies [32]. Furthermore, rapid identification, isolation, disease prevention, and control measures are required to hinder the spread of SARS-CoV-2 in homes, communities and healthcare units [33, 34]. In various studies, therapeutic approaches against the Ebola virus, Zika virus and Middle East respiratory syndrome coronavirus (MERS-CoV) were developed using immunoinformatics approaches [3, 31, 35].
The purpose of this study was to pinpoint potential T-cell and B-cell epitopes from SARS-CoV-2 structural proteins that can be joined through an adjuvant and linkers to design a multiepitope-based subunit vaccine (MESV). Several in silico approaches were used to validate the structural and physicochemical properties of the MESV. To examine the binding interaction and stability of the MESV with human immune receptors, molecular docking analysis was carried out. In addition, an in silico immune simulation was performed to validate the immunogenic potential of the designed MESV. Finally, the MESV codons were optimized for the Escherichia coli system and in silico cloning was performed to assess its expression. A flow chart of the methodology used in the present study is shown in Fig. 1.
Fig. 1
The schematic workflow used to develop MESV construct against SARS-CoV-2 structural proteins
Methods
Target proteins sequence and structural analyses
The main structural proteins of SARS-CoV-2, Surface glycoprotein (S [GenBank: QHD43416.1]), Envelope protein (E [GenBank: QHD43418.1]) and Membrane glycoprotein (M [GenBank: QHD43419.1]), were taken as targets for epitope screening and vaccine design. Their amino acid sequences were collected in FASTA format from GenBank (https://www.ncbi.nlm.nih.gov/genbank/) [36]. Allergenicity and antigenicity (at a threshold of 0.4) of the selected proteins were evaluated through AllerTOP v2.0 (https://www.ddg-pharmfac.net/AllerTOP/) and VaxiJen v2.0 (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html), respectively [37, 38]. The three-dimensional (3D) structure of the S protein was retrieved from the RCSB Protein Data Bank (PDB; https://www.rcsb.org/) [39]. However, the 3D structures of the other two proteins (E and M) were predicted using a homology modeling approach, as their resolved structures are not yet available. RaptorX (http://raptorx.uchicago.edu/) and MODELLER v.9.12 (https://salilab.org/modeller/) were employed for homology modeling [40]. Predicted models were then visualized with Chimera (https://www.cgl.ucsf.edu/chimera/) [41]. The GalaxyRefine server (http://galaxy.seoklab.org/) and ModRefiner (https://zhanglab.ccmb.med.umich.edu/ModRefiner/) were used to refine the predicted models [42, 43]. The refined structures also need to be validated against experimentally determined 3D protein structures. They were therefore submitted to ProSA-web (https://prosa.services.came.sbg.ac.at/prosa.php), which provides a quality score for a given structure [44]. A quality score outside the usual range of native proteins indicates a possible error in the protein structure. Ramachandran plots were created with the RAMPAGE server (http://mordred.bioc.cam.ac.uk/~rapper/rampage.php), which applies the principle of PROCHECK to validate protein structure [45]. Structural analysis was performed so that the positions of B-cell epitopes on the target proteins could be investigated later.
Prediction of B-cell and T-cell epitopes
B-cell epitopes help the immune system to detect viral infections. ABCpred (http://crdd.osdd.net/raghava/abcpred/) was used to predict 14-mer B-cell epitopes for the target proteins at a threshold of 0.51 [46]. Epitopes exposed on the outer surface were picked, and intracellular epitopes were removed. The VaxiJen server tested the antigenicity of the selected epitopes at a threshold of 0.5. B-cell epitope identification was based on antigenicity, flexibility, linear epitope predictions, hydrophilicity, and surface accessibility [47]. The Parker hydrophilicity prediction algorithm, Emini surface accessibility prediction method, Kolaskar and Tongaonkar antigenicity scale, and Karplus and Schulz flexibility prediction tool were used to analyze hydrophilicity, surface accessibility, antigenicity and flexibility, respectively [48]. As discontinuous epitopes are more evident and have more dominant properties than linear epitopes, the DiscoTope 2.0 server (http://www.cbs.dtu.dk/services/DiscoTope/) was used to predict discontinuous epitopes from the 3D structures of the surface glycoprotein, membrane protein and envelope protein [49]. The positions of epitopes on the 3D structures of the proteins were visualized with PyMOL (https://pymol.org/2/) [50].

T-cell epitopes play a crucial role in vaccine design. Predicting them computationally reduces cost and time compared with laboratory experiments [51]. The IEDB consensus method (http://tools.iedb.org/mhcii/) was used to predict 8–11 mer MHC class-I and 11–14 mer MHC class-II epitopes. The results of this method are valuable because a large number of human leukocyte antigen (HLA) alleles are used in the calculation. The sequence was supplied in FASTA format and all alleles were selected for prediction. Epitopes with a consensus score of less than 2 were considered good binders and were chosen for further analysis.
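The two numeric cut-offs quoted above (consensus score below 2, VaxiJen antigenicity at the 0.5 threshold) amount to a simple filter. The following is a minimal sketch of that filtering step, not the pipeline used by the authors: the `Epitope` class, its field names, the choice of "at or above" for the antigenicity cut-off, and all candidate values are assumptions made purely for illustration.

```python
from dataclasses import dataclass

# Thresholds quoted in the text; the constant names are illustrative.
MAX_CONSENSUS_SCORE = 2.0  # consensus score below 2 is treated as a good binder
MIN_ANTIGENICITY = 0.5     # VaxiJen threshold applied to epitopes


@dataclass(frozen=True)
class Epitope:
    sequence: str
    consensus_score: float  # hypothetical value, lower is better
    antigenicity: float     # VaxiJen-style score, higher is better


def shortlist(epitopes):
    """Keep epitopes that pass both the binding and the antigenicity cut-off."""
    return [
        e for e in epitopes
        if e.consensus_score < MAX_CONSENSUS_SCORE
        and e.antigenicity >= MIN_ANTIGENICITY
    ]


# Made-up candidates for illustration only (not results from the study).
candidates = [
    Epitope("AAAAAAAAA", consensus_score=0.8, antigenicity=0.9),
    Epitope("CCCCCCCCC", consensus_score=3.5, antigenicity=1.1),  # weak binder
    Epitope("DDDDDDDDD", consensus_score=1.2, antigenicity=0.3),  # low antigenicity
]
selected = shortlist(candidates)
print([e.sequence for e in selected])
```

Keeping the thresholds as named constants makes it easy to see how the shortlist changes when a stricter or looser cut-off is applied.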
Evaluation of predicted epitopes
Antigenicity and allergenicity of the selected epitopes were checked with VaxiJen v2.0 and AllergenFP v1.0, respectively [52]. The Protein Digest server (http://db.systemsbiology.net:8080/proteomicsToolkit/proteinDigest.html) was used to predict the enzymes that digest each epitope. ToxinPred (http://crdd.osdd.net/raghava/toxinpred/) was used to predict whether epitopes were toxic or non-toxic. Non-toxic epitopes were selected for further analysis [53].
Epitopes conservation and population coverage analysis
The degree of conservation of the predicted T-cell and B-cell epitopes within the protein sequences was analyzed with the IEDB conservancy analysis tool (http://tools.iedb.org/conservancy/). Epitopes showing conservancy among all 3 selected proteins were shortlisted for further analyses [54].

The expression and distribution of HLA alleles vary across the world’s ethnicities and regions, thereby affecting the development of an effective MESV [55]. Population coverage was calculated using the IEDB population coverage tool (http://tools.iedb.org/population/), and for this purpose MHC class-I and MHC class-II epitopes and their corresponding HLA-binding alleles were considered. This tool estimates the population coverage of each epitope for various regions of the world based on the distribution of HLA binding alleles [56].
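To make the idea of population coverage concrete, the sketch below estimates the fraction of individuals who carry at least one HLA allele that binds a selected epitope. It is a simplified illustration and not the algorithm of the IEDB tool: it assumes Hardy-Weinberg proportions within each locus and independence between loci, and the allele frequencies are hypothetical.

```python
def locus_coverage(allele_frequencies):
    """Fraction of diploid individuals carrying at least one listed allele.

    Assumes Hardy-Weinberg proportions: a person lacks every listed allele
    with probability (1 - p)^2, where p is the summed allele frequency.
    """
    p = min(sum(allele_frequencies), 1.0)
    return 1.0 - (1.0 - p) ** 2


def combined_coverage(loci):
    """Combine loci, assuming they are inherited independently."""
    uncovered = 1.0
    for frequencies in loci.values():
        uncovered *= 1.0 - locus_coverage(frequencies)
    return 1.0 - uncovered


# Hypothetical allele frequencies, not taken from the study or from IEDB.
example = {
    "HLA-B": [0.06, 0.04],
    "HLA-DRB1": [0.10, 0.05],
}
print(f"{combined_coverage(example):.4f}")
```

Adding epitopes restricted by further alleles can only raise the summed frequency per locus, which is why multi-epitope designs tend to reach higher coverage than single-epitope ones.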
Multi-epitope-based subunit vaccine (MESV) designing and evaluation
Epitopes with the following characteristics are generally preferred for designing a subunit vaccine: (a) highly antigenic, (b) immunogenic, (c) non-allergenic, (d) non-toxic, and (e) with significant population coverage. Therefore, only epitopes meeting these criteria were selected to construct the MESV. An adjuvant was attached with the EAAAK linker to the first cytotoxic T lymphocyte (CTL) epitope to improve the immune response. Other epitopes were linked using AAY, GPGPG, and KK linkers after validation of their interaction compatibility, to preserve their independent immunogenic activity. β-defensin was used as an adjuvant in the present research because it is a simple 45 amino acid long peptide that acts both as an immunomodulator and as an antimicrobial agent [57].

First, Blastp analysis was carried out using default parameters to confirm that the designed MESV sequence is non-homologous to the Homo sapiens proteome [58]. A protein with less than 37% identity is commonly regarded as non-homologous. Physicochemical properties of the designed MESV were assessed with the ProtParam tool [59]. ProtParam predicts various physicochemical properties (half-life, theoretical isoelectric point [pI], instability index, grand average of hydropathy, and aliphatic index) based on approximations from the pK values of the amino acids involved [60]. The AllerTOP v.2.0 server was used to analyze the allergenicity of the MESV construct [38]. The secondary structure of the MESV construct was evaluated using the PSIPRED workbench [58]. This analysis also reported structural features such as alpha helices, extended chains, degree of beta turns, and random coils.

The 3D structure of the MESV was predicted using the de novo modeling approach of the CABS-fold server (http://biocomp.chem.uw.edu.pl/CABSfold/), since the designed MESV was a series of epitopes and no appropriate template was available [61]. This server is based on the CABS modeling approach, which combines a multi-scale modeling pipeline with a replica exchange Monte Carlo scheme. The predicted MESV 3D structure was refined using the GalaxyRefine server [62]. Ramachandran plot analysis was carried out using the RAMPAGE server (http://mordred.bioc.cam.ac.uk/~rapper/rampage.php) [45] to confirm the quality of the refined MESV structure, followed by structural validation using the ProSA-web server [44]. The ERRAT server (https://servicesn.mbi.ucla.edu/ERRAT/) was also used to evaluate the statistics of non-bonded interactions in the MESV structure [63].

In addition, linear B-cell epitopes were predicted from the MESV using the ABCpred server [46]. The ElliPro tool (http://tools.iedb.org/ellipro/), provided by IEDB-AR v.2.22, was used to predict the conformational B-cell epitopes of the designed MESV using default settings (maximum distance: 6 Å; minimum score: 0.5). It predicts epitopes by estimating the residual protrusion index (PI), protein shape, and neighbor residue clustering [64].
Molecular docking of MESV with human immune receptors
An interaction between the antigenic molecule and the immune receptor molecule is essential to evoke an appropriate immune response. Molecular docking was performed to analyze the interaction between the MESV construct and human immune receptors. Toll-like receptor 3 (TLR3) has been thoroughly studied, and studies have found that it plays a key role in generating the antiviral immune response. GRAMM-X (http://vakser.compbio.ku.edu/resources/gramm/grammx/) was used to dock the MESV with TLR3 (PDB ID: 1ZIW) [65]. PyMOL was used to visualize the docked complexes [50]. Moreover, to obtain a conventional sketch of the interactions between the docked proteins, the online server PDBsum (http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/pdbsum/GetPage.pl?pdbcode=index.html) was used. It analyzes the protein-protein interactions between docked molecules [66].
Immunogenicity evaluation of the vaccine construct
An in silico immune simulation was performed using the C-ImmSim 10.1 server (http://150.146.2.1/C-IMMSIM/index.php?page=0) to validate the immunological responses to the designed MESV. C-ImmSim simulates the three main components of the functional mammalian immune system (thymus, lymph node, and bone marrow) [67]. The input parameters for the immune simulation were as follows: volume (10), HLA (A0101, A0101, B0702, B0702, DRB1_0101, DRB1_0101), random seed (12345), number of steps (100), and number of injections (1). The remaining parameters were left at their default values.
In silico cloning and codon optimization
Codon optimization is a method to improve the translation efficiency of foreign genes in a host when codon usage differs between the two organisms. After careful evaluation of the MESV properties and immune response, codon optimization was carried out, followed by in silico cloning. The Java Codon Adaptation Tool (http://www.jcat.de/) [69] was used to optimize the MESV codons for the commonly used prokaryotic expression system E. coli K12 [68]. Additional options were selected to avoid (i) rho-independent transcription terminators, (ii) prokaryotic ribosome binding sites, and (iii) restriction enzyme cleavage sites. The codon adaptation index (CAI) [70] and the GC (guanine and cytosine) content were assessed. Sticky ends of the HindIII and BamHI restriction sites were added at the start/N-terminal and end/C-terminal of the modified MESV sequence, respectively, to allow restriction and cloning. The modified nucleotide sequence of the MESV was then cloned into the E. coli pET30a (+) vector using the SnapGene tool (https://www.snapgene.com/) to assure its in vitro expression.
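Two of the checks described above are easy to express in code: computing GC content and confirming that the insert itself does not contain the recognition sites of the enzymes used for cloning. The sketch below is illustrative only. It uses the well-known recognition sequences of HindIII (AAGCTT) and BamHI (GGATCC); the function names and the toy insert are assumptions, and the flanking step simply appends full recognition sequences rather than modeling digested sticky ends.

```python
# Recognition sequences of the two enzymes named in the text.
RESTRICTION_SITES = {"HindIII": "AAGCTT", "BamHI": "GGATCC"}


def gc_content(dna):
    """Percentage of G and C bases in a DNA sequence."""
    dna = dna.upper()
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)


def internal_sites(dna):
    """Return the enzymes whose recognition site occurs inside the insert."""
    dna = dna.upper()
    return [name for name, site in RESTRICTION_SITES.items() if site in dna]


def add_flanking_sites(dna, five_prime="HindIII", three_prime="BamHI"):
    """Flank the insert with the two recognition sequences."""
    return RESTRICTION_SITES[five_prime] + dna.upper() + RESTRICTION_SITES[three_prime]


# Toy insert for illustration, not the optimized MESV sequence.
insert = "ATGGCTAAAGGTCCGCTG"
print(f"GC content: {gc_content(insert):.1f}%")
print("Internal sites:", internal_sites(insert))
print(add_flanking_sites(insert))
```

An insert that already contains one of the two sites would be cut internally during cloning, which is why such sites are avoided during optimization.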
Results
Sequence and structural analysis of the target proteins
All target structural proteins were found to be non-allergenic and highly antigenic. The E protein was the most antigenic, followed by the M and S proteins, with antigenicity values of 0.60, 0.51 and 0.46, respectively.

The 3D structure of the S protein was retrieved from the Protein Data Bank using ID: 6VYB [39]. The 3D structure of the E protein was determined using homology modeling. Chain A of the envelope small membrane protein of SARS-CoV (PDB ID: 5X29) was found to be the best template (percent identity 88.71%) for the E protein of SARS-CoV-2. However, no suitable template was found for the M protein, so its structure was predicted with RaptorX [71]. The models were visualized with Chimera (Additional file 1: Fig. S1). The quality factor (z-score) and Ramachandran plot values of the refined predicted models are given in Additional file 2: Table S1 (Additional file 1: Fig. S2–S3).
B-cell epitopes prediction from target proteins
In total, 23 linear epitopes (S-19, E-1, and M-3) were selected. Among the chosen linear epitopes, ‘ILPVSMTKTSVDCT’ of the S protein showed the highest antigenicity (1.6) and predicted score (Additional file 3: Table S2). The positions of the epitopes on their respective protein structures were visualized with PyMOL (Additional file 1: Fig. S4).

Identification of B-cell epitopes was based on antigenicity, flexibility, linear epitope predictions, hydrophilicity, and surface accessibility. The Parker hydrophilicity prediction algorithm, Emini surface accessibility prediction method, Kolaskar and Tongaonkar antigenicity scale, and Karplus and Schulz flexibility prediction tool were used to analyze hydrophilicity, surface accessibility, antigenicity and flexibility, respectively (Additional file 1: Fig. S5–S7).

To further improve the specificity and variety of B-cell epitopes, the DiscoTope 2.0 server was used, which calculates surface accessibility in terms of residue contact numbers and uses a novel amino acid propensity score to predict discontinuous epitopes. The 3D structures of the target proteins were used to predict discontinuous epitopes with 90% specificity, a threshold of −3.700 and a propensity score radius of 22.000 Å. Fifty-five discontinuous epitopes of the S protein, 1 epitope of the E protein and 22 epitopes of the M protein were calculated (Additional file 5: Table S4).
T-cell epitopes prediction from target proteins
Epitopes that bound multiple alleles and were highly antigenic, non-allergenic and 100% conserved were shortlisted after their antigenicity and allergenicity were checked. Based on these criteria, 9 MHC class-I (S-3, E-3, and M-3) and 7 MHC class-II (S-1, E-3 and M-3) epitopes were shortlisted (Additional file 6: Table S5). The Protein Digest server was used to estimate which enzymes digest the epitopes/peptides. Epitopes digestible by many enzymes are not stable. Epitopes digested by fewer enzymes, on the other hand, are very stable and are favored vaccine candidates (Additional file 7: Table S6).
Evaluation and selection of epitopes for further analyses
In total, three CTL epitopes (S-1 and M-2), six HTL epitopes (E-3 and M-3), and four B-cell epitopes (S-3 and M-1) were selected to construct the MESV (Table 1).
Table 1
Final selected epitopes from SARS-CoV-2 structural proteins used to design the multi-epitope-based subunit vaccine (MESV) construct
| Sr. No | Epitope | Protein | Position | HLA alleles | Antigenicity | Immunogenicity |
|---|---|---|---|---|---|---|
| MHC class-I | | | | | | |
| 1 | VRFPNITNLCPF | S | 327–338 | HLA-B*35:01 | 1.2 | 0.11 |
| 2 | YRINWITGGIAI | M | 71–82 | HLA-B*27:05 | 1.2 | 0.61 |
| 3 | SFRLFARTRSMW | M | 99–110 | HLA-B*57:01 | 0.6 | 0.05 |
| MHC class-II | | | | | | |
| 1 | LLFLAFVVFLLVTLA | E | 18–32 | HLA-DRB1*04:04 | 0.8 | 0.41 |
| 2 | AFVVFLLVTLAILTA | E | 22–36 | HLA-DRB1*04:01 | 0.6 | 0.39 |
| 3 | FVVFLLVTLAILTAL | E | 23–37 | HLA-DRB1*04:01 | 0.5 | 0.38 |
| 4 | VTLACFVLAAVYRIN | M | 60–74 | HLA-DRB1*04:08 | 1.0 | 0.36 |
| 5 | ASFRLFARTRSMWSF | M | 98–112 | HLA-DRB1*04:01 | 0.7 | 0.13 |
| 6 | FRLFARTRSMWSFNP | M | 100–114 | HLA-DRB1*04:01 | 0.8 | 0.11 |
| B-Cell | | | | | | |
| 1 | SPTKLNDLCFTNVY | S | 383 | – | 1.6 | 0.67 |
| 2 | EILDITPCSFGGVS | S | 583 | – | 1.6 | 0.81 |
| 3 | ILPVSMTKTSVDCT | S | 726 | – | 1.6 | 0.89 |
| 4 | LEQWNLVIGFLFLT | M | 17 | – | 0.9 | 0.67 |
The selected epitopes showed 88.40% coverage of the world population (Fig. 2). The results revealed that the predicted epitopes show promising population coverage in countries strongly affected by COVID-19, including Germany, France, Spain, Saudi Arabia, England, Italy, Iran, the Philippines, the United States, and Sweden.
Fig. 2
Population coverage of MESV epitopes around the globe predicted by IEDB population coverage tool
Designing of MESV
An MESV construct was then developed using all selected epitopes. Using the EAAAK linker, an adjuvant (45 amino acid long β-defensin) was attached at the beginning (the MESV N-terminal). The EAAAK linker reduces interaction with other protein regions by providing efficient separation, and improves stability [58, 72]. Epitopes were merged sequentially with AAY, GPGPG, and KK linkers, respectively, based on the compatibility of their interaction. The final MESV construct comprised 276 amino acids (Fig. 3).
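The assembly described above can be sketched as simple string concatenation using the epitopes listed in Table 1. This is an illustration under stated assumptions, not the published construct: the adjuvant sequence is not given in this section, so a 45-residue placeholder is used, and the exact linker placement (a GPGPG linker before the first HTL epitope and a KK linker before the first B-cell epitope) is one arrangement that is consistent with the reported total length.

```python
# Placeholder: the 45-residue beta-defensin sequence is not given in the text.
ADJUVANT = "X" * 45

# Epitopes as listed in Table 1.
CTL = ["VRFPNITNLCPF", "YRINWITGGIAI", "SFRLFARTRSMW"]
HTL = [
    "LLFLAFVVFLLVTLA", "AFVVFLLVTLAILTA", "FVVFLLVTLAILTAL",
    "VTLACFVLAAVYRIN", "ASFRLFARTRSMWSF", "FRLFARTRSMWSFNP",
]
BCELL = ["SPTKLNDLCFTNVY", "EILDITPCSFGGVS", "ILPVSMTKTSVDCT", "LEQWNLVIGFLFLT"]


def assemble(adjuvant, ctl, htl, bcell):
    """Join the blocks in the assumed order: adjuvant, CTL, HTL, B-cell."""
    return (
        adjuvant + "EAAAK"              # adjuvant joined to the first CTL epitope
        + "AAY".join(ctl)               # CTL epitopes separated by AAY
        + "GPGPG" + "GPGPG".join(htl)   # HTL epitopes, each preceded by GPGPG
        + "KK" + "KK".join(bcell)       # B-cell epitopes, each preceded by KK
    )


construct = assemble(ADJUVANT, CTL, HTL, BCELL)
print(len(construct))  # 276 under this assumed linker placement
```

Other linker placements can give the same total length, so the agreement with 276 residues supports this ordering only as a plausible reading of the description.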
Fig. 3
Schematic diagram of the MESV construct: it has 276 amino acids, consisting of an adjuvant (orange) linked at the N-terminal of the MESV with the aid of an EAAAK linker (yellow). AAY linkers (blue) were used to join the CTL epitopes, GPGPG linkers (green) to join the HTL epitopes and KK linkers (gray) to join the B-cell epitopes
Sequence and structural analyses of MESV
First, Blastp analysis was carried out against the Homo sapiens proteome, and the results revealed that the MESV shows no similarity (37% or higher) to any human protein. The vaccine construct was then tested for toxicity, allergenicity, and antigenicity. The MESV was found to be non-allergenic, highly antigenic (0.6737), and non-toxic. The mean half-life of the construct was calculated as 30 h in vitro, > 20 h in yeast and > 10 h in vivo. The molecular weight and theoretical pI of the vaccine were 3157.01 kDa and 10.31, respectively. The grand average of hydropathicity was calculated as 0.395; a positive score suggests a hydrophobic nature. The secondary structure analysis showed that 74 amino acids (26.81%) were involved in the formation of α-helices, 67 amino acids (24.27%) formed β-strands and 135 amino acids (48.91%) formed coils.

The CABS-fold server was used to predict the tertiary structure of the MESV (Fig. 4). The structure was refined with the GalaxyRefine server. Ramachandran plot analysis of the refined model showed that 89.4% of amino acids are in the favored region, 6.9% in the allowed region and 3.6% in the outlier region. Further analysis showed a qRMSD of 0.544, a MolProbity score of 2.356, 0.0% poor rotamers, a clash score of 17.7 and a z-score of −4.8. In the quality check by ERRAT, the refined model scored 82.4561.
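The grand average of hydropathy (GRAVY) reported above is the mean Kyte-Doolittle hydropathy value over all residues, so a positive value indicates a predominantly hydrophobic sequence. The sketch below computes it for one of the HTL epitopes from Table 1 as an illustration; it does not reproduce the value reported for the full construct, because the complete MESV sequence is not listed in this section. The function name is an assumption.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Mean hydropathy per residue; positive values indicate hydrophobicity."""
    sequence = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


# HTL epitope 1 from Table 1 (E protein, positions 18-32).
epitope = "LLFLAFVVFLLVTLA"
print(f"GRAVY = {gravy(epitope):.3f}")
```

A sequence containing a character outside the 20 standard residues raises a `KeyError`, which is a deliberate choice here so that malformed input is not silently averaged.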
Fig. 4
a MESV construct sequence. Epitopes sequence is in black. The adjuvant sequence is highlighted in brown color, EAAAK linker sequence is highlighted in blue, AAY linkers are highlighted with orange, GPGPG linkers are highlighted with green and KK linkers are highlighted with maroon color; b MESV construct refined 3D structure pipes representation (alpha helix: green; beta strands: blue; loops: gray); c Ramachandran plot analysis of predicted structure shows 89.4% residues are present in the favored region
B-cell epitopes screening from MESV
In addition to secreting cytokines, B-lymphocytes produce antibodies that provide humoral immunity. Eighteen linear/continuous (Additional file 8: Table S7) and six conformational/discontinuous epitopes (Additional file 9: Table S8) were predicted from the MESV sequence without altering the ABCpred 2.0 and ElliPro prediction parameters.
Molecular docking of MESV construct with TLR3
To start the immune response, an appropriate interaction between the antigenic molecule and the immune receptor molecule is needed. To assess the binding potential of the MESV to innate immune receptors, molecular docking of the designed MESV to a representative innate immune receptor, TLR3, was performed. The docking evaluation predicted the best complex to have a net global energy of −22.36 kJ/mol. Visual analysis of the complex showed that the MESV binds deep in the center of TLR3 and strongly favors hydrogen bonds and weak van der Waals interactions with specific TLR3 residues. PDBsum was used to gain insight into, and pin down, the MESV residues likely to form stable bonds with TLR3 (Fig. 5). Within 3 Å, the MESV was observed to form 14 hydrogen bonds with TLR3 residues.
Fig. 5
TLR3-MESV docked complex shown at the left in cartoon representation. Interacting residues of the MESV are highlighted at the right. The MESV vaccine construct is displayed in blue and TLR3 in green. Salt bridges are displayed as red lines, other contacts as orange lines, and hydrogen bonds as blue lines. The colors of the interacting residues indicate the characteristics of the amino acids (neutral: green, Cys: yellow, aromatic: pink, aliphatic: grey, positive: blue, negative: red, and Pro&Gly: orange)
Immunogenicity evaluation of MESV
Both the primary and the secondary simulated immune responses contributed significantly against the pathogen and may be consistent with an actual immune response. The in silico host immune system response to the antigen is shown in Fig. 6. The primary response was characterized by high IgG + IgG and IgM concentration, followed by IgM, IgG1 + IgG2 and IgG1 at both the secondary and primary stages, with concomitant antigen reduction. Additionally, a robust interleukin and cytokine response was observed. Together, these results indicate a successful immune response to the MESV and clearance of the antigen after subsequent encounters.
Fig. 6
In silico immune response using MESV as antigen. a The antibodies, and b cytokines and interleukins
In silico cloning within E. coli system
In silico cloning was performed to assure the expression of the SARS-CoV-2-derived MESV in widely used E. coli hosts. First, the codons of the MESV were adapted to the codon usage of the E. coli expression system (strain K12). The optimized MESV construct contains 828 nucleotides, with a CAI value of 1.0 (ideal range 0.8–1.0) and a GC content of 53.2% (optimal range 30–70%), indicating strong potential for reliable, high-level protein expression. In the next step, restriction sites for the buffer-compatible restriction enzymes BamHI and HindIII were attached to both ends of the optimized MESV nucleotide sequence to assist the purification/cloning process. Finally, the refined MESV sequence was cloned into the multiple cloning site of the pET30a (+) vector between these restriction sites. The clone was 6.23 kb long (Fig. 7).
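The codon adaptation index can be read as the geometric mean of the relative adaptiveness of each codon, where the most frequently used synonymous codon in the host has a weight of 1.0. A CAI of 1.0 therefore means that every position uses the host's preferred codon. The sketch below illustrates the calculation with hypothetical weights; the weight table is invented for the example and is not an E. coli K12 codon usage table. As a separate consistency check, 276 codons correspond to the reported 828 nucleotides.

```python
import math


def cai(codons, weights):
    """Geometric mean of the relative adaptiveness of each codon."""
    logs = [math.log(weights[codon]) for codon in codons]
    return math.exp(sum(logs) / len(logs))


# Hypothetical relative adaptiveness values (1.0 = preferred synonymous codon).
toy_weights = {"CTG": 1.0, "CTA": 0.25, "AAA": 1.0, "AAG": 0.25}

optimized = ["CTG", "AAA", "CTG"]    # only preferred codons
unoptimized = ["CTA", "AAG", "CTG"]  # two rarely used codons

print(f"optimized CAI:   {cai(optimized, toy_weights):.3f}")
print(f"unoptimized CAI: {cai(unoptimized, toy_weights):.3f}")
print("nucleotides for 276 residues:", 276 * 3)
```

Because the index is a geometric mean, a few rarely used codons lower the score noticeably, which is why optimizers replace them with preferred synonyms.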
Fig. 7
In silico cloning of codon optimized MESV into E. coli K12 expression system. The plasmid back-bone is kept in black color while the inserted DNA sequence is shown in green color
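The checks described for the cloning step can be illustrated with a minimal Python sketch. It is not the tool used in the study: the function names are hypothetical, the 18-nucleotide sequence is a made-up stand-in for the real optimized MESV sequence, and placing BamHI at the 5′ end and HindIII at the 3′ end is an assumption. The sketch computes GC content, refuses inserts that carry an internal BamHI (GGATCC) or HindIII (AAGCTT) site, and shows the arithmetic linking the 276-residue construct to the reported 828 nucleotides.

```python
# Recognition sites of the two enzymes named in the text (both palindromic,
# so scanning one strand is enough).
BAMHI = "GGATCC"
HINDIII = "AAGCTT"


def gc_content(seq: str) -> float:
    """Return the GC content of a nucleotide sequence as a percentage."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def flank_insert(cds: str) -> str:
    """Add a 5' BamHI and a 3' HindIII site (this orientation is an assumption)."""
    cds = cds.upper()
    # Only sites inside the insert itself are checked here.
    for name, site in (("BamHI", BAMHI), ("HindIII", HINDIII)):
        if site in cds:
            raise ValueError(f"internal {name} site would cut the insert")
    return BAMHI + cds + HINDIII


# 276 residues x 3 nucleotides per codon = 828 nt, as reported in the text.
EXPECTED_NT = 276 * 3

# Hypothetical 18-nt stand-in; NOT the real MESV coding sequence.
toy_cds = "ATGGCTAAACGTCTGGAA"
insert = flank_insert(toy_cds)
print(EXPECTED_NT, len(insert), round(gc_content(toy_cds), 1))
```

With the real optimized sequence, the same GC check would be compared against the 30–70% range quoted above.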
Discussion
CoVs have long been considered insignificant pathogens that cause “colds” in humans. In the twenty-first century, two highly pathogenic CoVs, SARS-CoV and MERS-CoV, emerged from animal reservoirs and caused deadly outbreaks. A new strain of CoV, officially named SARS-CoV-2, was identified recently and started the deadly global COVID-19 pandemic. The final dimension and impact of this pandemic are currently uncertain because of the rapidly changing situation [4]. After recombination among viral genomes, the novel virus infects host cells rapidly. No reliable medication is currently available for this infection. COVID-19 is a severe cause of morbidity and mortality worldwide. Unfortunately, the unavailability of vaccines against COVID-19 has cost many precious lives in different regions of the world. The emergence of COVID-19 has resulted in a significant global disease burden, for which preventive measures are urgently needed. To successfully eradicate the disease, researchers have been collecting data on CoVs to understand their transmission, pathophysiology, and biology [73]. The rapid development of structural and genomic databases, combined with computational tools, helps in the design and discovery of new vaccine candidates.
Recent advancements in immunological bioinformatics have produced a variety of tools and servers that can reduce the time and cost of traditional vaccine development. Because suitable antigen candidates and immunodominant epitopes are difficult to select, the development of effective multiepitope vaccines remains laborious. Thus, the prediction of appropriate antigenic epitopes of a targeted protein by immunoinformatics approaches is essential for designing a MESV [74].
Here, we explored the development of epitope-based vaccines targeting the structural proteins (S, M, and E) of SARS-CoV-2. These proteins play a crucial role in the replication cycle and in the structure of the virus particle. The S protein plays an important part in binding the virus to host cell surface receptors and in the subsequent fusion that promotes viral entry into the host cell [75-77]. The M and E proteins are important for replication, particle assembly within human cells, and viral entry [78, 79]. T- and B-cell epitopes of the target proteins were predicted to support the host’s immune response. The analysis was performed at the primary, secondary, and tertiary structural levels of the proteins. The IEDB analysis resource and ABCPred were used to predict conserved B-cell epitopes. The positions of the epitopes on the 3D structures of the proteins were visualized with PyMOL. The DiscoTope server was used to predict discontinuous epitopes. To further improve specificity and selectivity, the allergenicity, toxicity, and physicochemical properties of the predicted epitopes were checked. Digestion analysis indicated that the predicted peptides were stable and safe to use.
An appropriate MESV should be designed with B-cell, HTL, and CTL epitopes and should elicit effective responses to a specific virus [80]. A few groups have developed SARS-CoV-2 subunit vaccines, but they either used only a single protein for the vaccine design [15, 81, 82] or used CTL epitopes alone, without taking into account the importance of HTL or B-cell epitopes [83]. In contrast, we incorporated B-cell epitopes in addition to T-cell epitopes from multiple structural proteins, because of the roles they play in inducing antibody production and mediating its effector functions [84]. Besides, the humoral response of memory B-cells can be easily overcome by the onset of antigens, whereas cell-mediated immunity (T-cell immunity) in many cases leads to lifelong immunity [85]. CTLs limit pathogen spread through the secretion of specific antiviral cytokines and the identification and destruction of infected cells [86]. Therefore, the present vaccine construct has an advantage over previously reported constructs.
HLA alleles, which are highly polymorphic among ethnic groups, govern the response to T-cell epitopes. To achieve broader population coverage, T-cell epitopes should bind multiple alleles. We therefore used the selected HTL and CTL epitopes with their respective HLA alleles to predict their worldwide distribution. The results showed that the chosen epitopes and their corresponding alleles cover various geographical areas of the world. The selected epitopes cover 88.40% of the world population. France had the highest population coverage, at 94.94%. The SARS-CoV-2 epidemic has been especially severe in Germany, Spain, Saudi Arabia, England, and Iran. Vaccine candidates are therefore vital to protect individuals from SARS-CoV-2 infection in these geographical regions. The population coverage was 68.60% in China, where the virus first emerged and caused several outbreaks.
Vaccine candidates were chosen from CTL, HTL, and B-cell epitopes on the basis of their antigenicity, toxicity, immunogenicity, population coverage, and allergenicity. The MESV was designed by joining the HTL, CTL, and B-cell epitopes with GPGPG, AAY, and KK linkers, respectively. Linkers are an indispensable element in MESV development because they enhance folding, stabilization, and expression. Multi-epitope vaccines are poorly immunogenic when used alone and need to be coupled with an adjuvant [87]. Adjuvants are ingredients in a vaccine formulation that help protect against infection and influence specific immune responses as well as the growth, stability, and durability of antigens [88]. Therefore, a 45-amino-acid β-defensin adjuvant was attached at the N-terminus through the five-residue EAAAK linker. The EAAAK linker joins the adjuvant to the first epitope and facilitates efficient separation of the domains of the bifunctional fusion protein [89]. The final vaccine construct, including the adjuvant and linkers, was 276 amino acids long.
The analysis of the physicochemical characteristics of the MESV construct showed that it is stable, basic, and hydrophobic. The MESV was basic according to its theoretical pI value, which can ensure stable interaction at physiological pH. The calculated aliphatic index and instability index scores indicated that the vaccine protein may be stable and thermostable. The positive grand average of hydropathy score suggests that it is hydrophobic. The MESV was found to be immunogenic, strongly antigenic, and non-allergenic. This suggests that the epitope-based vaccine can elicit a strong immune response without allergic reactions.
The 3D structure prediction provides extensive knowledge of the spatial arrangement of essential protein components and provides excellent support for the study of protein functions, dynamics, and interactions with ligands and other proteins [90, 91]. After refinement, the desirable characteristics of the MESV construct improved considerably. The Ramachandran plot analysis showed that most residues lie in the favored and allowed regions, with very few residues in the disallowed region, which indicates a satisfactory overall quality of the model. The good quality of the designed MESV construct is further indicated by the RMSD value, poor rotamers, clash score, and MolProbity score. Various structure validation methods were used to detect errors in the modeled MESV construct. The ERRAT quality factor (82.4%) and z-score (−4.8) indicated that the overall structure of the refined MESV is of good quality.
An adequate interaction between the antigen molecule and the immune receptor molecules is important for triggering an immune response. The refined MESV construct was therefore docked against TLR3 to examine whether it binds adequately to elicit an immune response. Stable interactions were observed between the MESV and TLR3 in the molecular docking analysis, and less energy was needed for proficient binding.
A multi-epitope vaccine comprising B- and T-cell epitopes should, in principle, activate both humoral and cellular immune responses. In the immune simulation, our vaccine produced the highest levels of IFN-γ, together with substantial IL-10 and IL-2 activity. Antibodies also provide protection against extracellular SARS-CoV-2. We also observed elevated levels of active immunoglobulins, i.e., IgM, IgG, and their isotypes, which may be involved in isotype switching. Besides, the negligible Simpson index (D) suggests a diverse immune response, which is plausible because a subunit vaccine contains various B-cell and T-cell epitopes.
The translation efficiency of foreign genes inside the host system varies because of the incompatibility of mRNA codons, so codon optimization is required for higher expression [92]. The CAI value obtained was 1.0, and the GC content (53.2%) was within the optimum range, suggesting possible higher expression in the E. coli K12 system. The main aim of the MESV in silico cloning was to guide genetic engineers and molecular biologists on the expected expression level and the potential cloning sites in a particular expression system, i.e., the E. coli K12 system.
We applied a next-generation vaccine design approach in this research to create a MESV construct capable of generating immunological responses against SARS-CoV-2. We expect that our vaccine will produce both humoral and cell-mediated immune responses. The interaction between the receptor and the vaccine protein was stable, with high binding affinity. Moreover, the immune simulation showed effective immune responses consistent with those expected in real life. Thus, a MESV designed carefully using such a methodology could become an important asset in combating viral infections.
Computational/immunoinformatics approaches rely on experimental methodologies to generate the initial raw data for further analyses. The quality of the data and the efficiency of the computational algorithms applied can limit the accuracy of immunoinformatics predictions. Therefore, further in vivo and in vitro investigations are required to establish the real potential of the designed MESV to combat COVID-19.
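As a worked illustration of the construct layout described in this Discussion (β-defensin adjuvant, EAAAK linker, then epitopes separated by AAY, GPGPG, and KK linkers), the following Python sketch assembles a sequence string from lists of epitopes. It is only a sketch under stated assumptions: the function name is hypothetical, the order of the epitope groups and the linker used at the junction between groups are assumptions, and the peptides are placeholders rather than the epitopes selected in this study, so the resulting length is not expected to equal 276.

```python
# Linkers named in the text. The group order (CTL, HTL, B-cell) and the
# linker placed between groups are assumptions made for illustration.
EAAAK, AAY, GPGPG, KK = "EAAAK", "AAY", "GPGPG", "KK"


def assemble_construct(adjuvant, ctl, htl, bcell):
    """Join the adjuvant and epitope groups into one amino acid sequence."""
    parts = [adjuvant, EAAAK]
    for group, linker in ((ctl, AAY), (htl, GPGPG), (bcell, KK)):
        for epitope in group:
            parts.append(epitope)
            parts.append(linker)
    parts.pop()  # drop the linker after the final epitope
    return "".join(parts)


# Placeholder peptides only; NOT the real adjuvant or epitopes.
adjuvant = "X" * 45          # stands in for the 45-residue beta-defensin
ctl_epitopes = ["C" * 9] * 3     # 3 CTL epitopes
htl_epitopes = ["H" * 15] * 6    # 6 HTL epitopes
bcell_epitopes = ["B" * 12] * 4  # 4 B-cell epitopes

construct = assemble_construct(adjuvant, ctl_epitopes, htl_epitopes, bcell_epitopes)
print(len(construct))
```

Keeping the linkers as named constants makes it easy to compare alternative orderings before running antigenicity or allergenicity predictions on each candidate.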
Conclusions
Taken together, we characterized SARS-CoV-2 structural proteins (S, E, and M) for antigenic epitopes and proposed a potential MESV utilizing various immunoinformatics and computational approaches. The findings of this research could save time and related costs for the study of experimental epitope targets. The MESV can activate all host immune system components and has adequate physicochemical and structural properties. It also appears to interact very stably with an innate immune receptor TLR3, making it more likely to be introduced into the host immune system. To reveal its effectiveness in the fight against COVID-19, however, additional in vitro and in vivo experiments are warranted.
Additional file 1: Figure S1. 3D structural representation of SARS-CoV-2 structural proteins: (A) S protein, (B) E protein and (C) M protein. Figure S2. (a) The E protein contains α-helix (77.33%, 58) and random coil (22.66%, 17); (b) the z-score (0.41) of the E protein; (c) the Ramachandran plot of the refined structure shows 97.3, 2.7 and 0.0% residues in favored, allowed and disallowed regions, respectively. Figure S3. (a) The M protein contains α-helix (40.54%, 90), β-strand (24.32%, 54) and random coil (35.13%, 78); (b) the z-score (−3.88) of the M protein; (c) the Ramachandran plot of the refined structure shows 96.8, 2.7 and 0.5% residues in favored, allowed and disallowed regions, respectively. Figure S4. Specific sites of predicted linear B-cell epitopes on the 3D structure of SARS-CoV-2 proteins: (A) S protein, (B) E protein and (C) M protein. Figure S5. (A) Prediction of antigenic determinants of S protein using Kolaskar and Tongaonkar Antigenicity Scale; (B) Beta Turn Analyses in S protein using Chou and Fasman Beta Turn Prediction; (C) Hydrophilicity Prediction of S protein using Parker Hydrophilicity; (D) Surface Accessibility Analyses of S protein using Emini Surface Accessibility Scale; (E) Flexibility Analyses of S protein using Karplus and Schulz Flexibility Scale. Figure S6. (A) Prediction of antigenic determinants of E protein using Kolaskar and Tongaonkar Antigenicity Scale; (B) Beta Turn Analyses in E protein using Chou and Fasman Beta Turn Prediction; (C) Hydrophilicity Prediction of E protein using Parker Hydrophilicity; (D) Surface Accessibility Analyses of E protein using Emini Surface Accessibility Scale; (E) Flexibility Analyses of E protein using Karplus and Schulz Flexibility Scale. Figure S7. (A) Prediction of antigenic determinants of M protein using Kolaskar and Tongaonkar Antigenicity Scale; (B) Beta Turn Analyses in M protein using Chou and Fasman Beta Turn Prediction; (C) Hydrophilicity Prediction of M protein using Parker Hydrophilicity; (D) Surface Accessibility Analyses of M protein using Emini Surface Accessibility Scale; (E) Flexibility Analyses of M protein using Karplus and Schulz Flexibility Scale.
Additional file 2: Table S1. Structural details of the SARS-CoV-2 structural protein predicted models.
Additional file 3: Table S2. Linear B cell epitopes predicted through ABCPred 2.0 server (NT: nontoxic).
Additional file 4: Table S3. Emini surface accessibility of SARS-CoV-2 structural proteins.
Additional file 5: Table S4. Discontinuous epitopes predicted through DiscoTope 2.0 server.
Additional file 6: Table S5. MHC class-I allele and MHC class-II binding peptides with their antigenicity scores.
Additional file 7: Table S6. Digestion, allergenicity, toxicity and physicochemical profiling of selected peptides (NA: not allergic; NT: nontoxic).
Additional file 8: Table S7. Linear B cell epitopes predicted in vaccine construct.
Additional file 9: Table S8. Conformational epitopes in 3D structure of vaccine.