An immunoinformatics study on the spike protein of SARS-CoV-2 revealing potential epitopes as vaccine candidates.

Arafat Islam Ashik1, Mahedi Hasan1, Atiya Tahira Tasnim1, Md Belal Chowdhury1, Tanvir Hossain1, Shamim Ahmed1.   

Abstract

BACKGROUND: The pandemic situation of SARS-CoV-2 infection has sparked global concern due to the disease COVID-19 caused by it. Since the first cluster of confirmed cases in China in December 2019, the infection has been reported across the continents and inflicted upon a substantial number of populations.
METHOD: This study is focused on immunoinformatics analyses of the SARS-CoV-2 spike glycoprotein (S protein), which is key for viral attachment to human host cells. Computational analyses were carried out to predict the B-cell and T-cell (MHC class I and II) epitopes of S protein, and were extended to predict their immunogenic properties. The interaction and binding affinity of T-cell epitopes with HLA-B7 were also investigated by molecular docking.
RESULT: Three distinct epitopes for vaccine design were predicted from the sequence of S protein. The potential B-cell epitope was KNHTSPDVDLG, which had the highest antigenicity score (1.4039) among the B-cell epitopes. The T-cell epitope for human MHC class I was VVVLSFELL, with an antigenicity score of 1.0909 and the ability to bind 29 MHC-I alleles. The predicted T-cell epitope for human MHC class II was VVIGIVNNT, with an antigenicity score of 1.3063, fewer digesting enzymes, and the ability to bind 7 MHC-II alleles. All three peptides were predicted to be highly antigenic, non-allergenic, and non-toxic. Analyses of the physicochemical properties of these predicted epitopes indicate that they are stable enough for plausible vaccine design. Furthermore, molecular docking of the MHC class-I epitopes with human HLA-B7 indicates a stable, high-affinity interaction.
CONCLUSION: The present study posits three potential epitopes of S protein of SARS-CoV-2 predicted by immunoinformatic methods based on their immunogenic properties and interactions with the host counterpart that can facilitate the development of vaccine against SARS-CoV-2. This study can act as the springboard for the future development of the COVID-19 vaccine.
© 2020 The Author(s).

Keywords:  B-cell and T-cell epitope; Biochemistry; Bioinformatics; COVID-19; Epitope-based vaccine design; Immunology; SARS-CoV-2; Spike glycoprotein; Virology

Year:  2020        PMID: 32923731      PMCID: PMC7472982          DOI: 10.1016/j.heliyon.2020.e04865

Source DB:  PubMed          Journal:  Heliyon        ISSN: 2405-8440


Introduction

COVID-19, an ongoing pandemic, is the viral disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1]. SARS-CoV-2 is a coronavirus belonging to Coronaviridae, a family of (+) ssRNA viruses. It uses the human ACE2 (angiotensin-converting enzyme 2) protein as its receptor [2]. The origin of SARS-CoV-2 remains unsettled, since the link between the virus's speculated origin and Wuhan, China is yet to be established; studies both supporting and disputing a connection to a particular region of China have left the question open [3, 4, 5]. Although SARS-CoV-2 was initially speculated to have been transmitted to humans from seafood at the Huanan Seafood Wholesale Market in Wuhan, China, person-to-person transmission is now evident [4, 5, 6]. Most patients are given systematic antiviral therapy, which is effective only in alleviating the symptoms [3]. While antiviral drugs act only in individuals who are already infected, vaccination protects unaffected populations against possible viral attack and, as a consequence, gradually drives the eradication of the virus. Prevention by immunization is therefore presumably the most pragmatic way to tackle the virus, and considerable effort is being channeled into vaccine development. The basic architectural features of SARS-CoV-2 comprise its 30 kb RNA genome and four major types of structural protein. Within the coronavirus genome, the 3′-terminal ORFs encode the structural proteins: spike (S), envelope (E), membrane (M), and nucleocapsid (N) proteins [7]. All these structural and nonstructural accessory proteins of coronaviruses are translated from sub-genomic RNAs (sgRNAs) that are negative-stranded and synthesized by the replication-transcription complex (RTC).
The distinct feature of S protein that paves the way to viral attachment with the host cell surface receptor makes it a promising target for vaccine design [8, 9, 10]. S protein is a trimeric class I fusion protein having S1 and S2 subunits. Cryo-electron microscopy structure and 3D atomic-scale map of S protein have already been developed. In order to fuse with the host cell membrane, the S1 subunit of S protein binds to the receptor that leads the transition of the S2 subunit from pre-fusion conformation to post-fusion conformation. This phenomenon happens for all coronavirus class-I fusion S protein. This transition of S protein for membrane fusion represents it as a target for Ab-mediated neutralization [11]. In the mammalian body, virus-infected cells are usually killed by cytotoxic action of CTLs which are signaled by the presentation of epitopes via HLA class-I alleles [12]. Epitopes for helper T-cells also relay signals for B-cells and macrophages. In general, T-helper cell epitopes presented by human HLA class-II alleles activate the B-cells to produce antibodies by plasma cells. Some memory B-cells derived from plasma B-cell keep the invader's memory to exhibit faster recognition and quick response during future encounters [13, 14]. Therefore, viral epitopes are the ideal candidates for vaccine construction that can trigger the immune system to develop immunity against the virus. The early approach of immune therapeutics depended on the use of the whole antigen upon trial and error methods and some other cumbersome processes [15, 16]. The conventional ways of vaccine design were based on administering an entire disease-causing agent that consisted of the killed or live-attenuated organisms, detoxified toxins, etc. 
However, the drawbacks of conventional vaccine design, including difficulties in culturing, killing, or attenuating the virus, the need to identify genetic variations that give rise to newer strains, and the time-consuming nature of development, have diverted the focus of researchers to alternatives [17, 18]. To overcome these drawbacks, scientists have embraced modern approaches such as recombinant DNA technology, polysaccharide chemistry, rational vaccinology, structural vaccinology, RNA vaccines, and epitope-based vaccine design. Several computational programs have also been developed to make these approaches fast and accurate on the basis of genome data, protein 3D structures, simulation, and so on. Preliminary computational analyses also narrow the investigation down to a specific area, saving both time and cost [19]. Immunoinformatics, also known as computational immunology, uses advances in computational capacity to develop methods and resources for studying immune functions [20]. Epitope-based vaccine prediction for both B-cells and T-cells using immunoinformatics methods can facilitate the development of an anti-viral vaccine that may prove effective in preventing COVID-19. Moreover, an epitope-based vaccine is effective in both the pre-exposure and post-exposure periods, which is an advantage against COVID-19 since repeated infection of the same individual by SARS-CoV-2 has been reported in several cases [21, 22]. Considering these aspects of epitope-based vaccine design, this study focuses on predicting candidate epitopes for SARS-CoV-2 from the viral S protein and investigates the immunogenic properties of the candidate epitopes, since epitope-based vaccines are effective, safe, and stable [23].

Materials and methods

Acquisition of sequence data and physicochemical properties analysis

The most recent protein sequence of SARS-CoV-2 S protein was acquired from NCBI (National Center for Biotechnology Information) using the accession number: QIC53213.1 in FASTA format and the updated structure (released on February 2020) of the SARS-CoV-2 S protein was retrieved from PDB (ID: 6VSB). To analyze various physical and chemical parameters of retrieved S protein, an online tool ExPASy-ProtParam [24] was used (Supplementary Table S1).

B-cell epitope prediction

IEDB (Immune Epitope Database and Analysis Resource) [25], a freely accessible online resource, was used to predict the linear B-cell epitopes through its BepiPred linear epitope prediction mode. The epitopes were identified on the basis of standard parameters: antigenicity, surface accessibility, flexibility, hydrophilicity, and beta-turn propensity. These parameters were measured using the Kolaskar and Tongaonkar antigenicity scale, the Emini surface accessibility prediction tool, the Karplus and Schulz flexibility prediction tool, the Parker hydrophilicity prediction algorithm [26], and the Chou and Fasman beta-turn prediction algorithm [27], respectively. Protective antigens were predicted with the Vaxijen v2.0 server [28], which assigns each peptide a Vaxijen score; the default threshold value of 0.4 (70% accuracy) was used. The AllergenFP v.1.0 server [29] was used to examine the allergenicity of the predicted peptides. Furthermore, the DiscoTope 2.0 server [30] was used to predict discontinuous B-cell epitopes from the 3D structure of the S protein and its amino acid sequence in FASTA format, using the default threshold value of -3.7. This threshold corresponds to a specificity of 75%, meaning that 25% of non-epitope residues are predicted as part of epitopes. The method combines a calculation of surface accessibility (estimated in terms of contact numbers) with a novel epitope propensity amino acid score.

T-cell epitope prediction

Peptides binding MHC class-I and MHC class-II alleles were predicted to obtain the corresponding cytotoxic T-lymphocyte (CTL) and T helper cell epitopes. T-cell epitopes are crucial in vaccine design, and predicting them reduces the time and cost of wet-lab experiments.

MHC class-I alleles

A neural network-based MHC class-I binding peptide prediction server, nHLAPred [31], which contains the ComPred section, was used to predict MHC class-I binding peptides (CTL epitopes) for 67 MHC alleles. In this comprehensive method, the threshold and stringency were kept at 5%, and the proteasome and immunoproteasome filters were applied. From the server output, the epitopes that bound the largest number of alleles were selected, with priority given to those of highest antigenicity. The antigenicity of each epitope was predicted as a Vaxijen score by the Vaxijen v2.0 server.

MHC class-II alleles

ProPred [32], a freely accessible online server, was used to predict the MHC class-II binding epitopes of the antigen sequence. The method was applied to all alleles with a threshold of 5%, and the top 10 scorers were displayed. As for MHC class-I, the epitopes binding the largest number of alleles were selected from the ProPred output, taking antigenicity and allergenicity into account. Antigenicity was predicted from the Vaxijen scores of the corresponding epitopes.
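Both servers report nine-residue peptides together with their starting position in the antigen sequence. As a minimal sketch of that scanning step only (it does not reproduce the scoring of nHLAPred or ProPred), the following enumerates every overlapping 9-mer in a sequence. The function name and the 1-based numbering are assumptions made for illustration, and the 12-residue fragment is reassembled here from the three overlapping class-II peptides reported in Table 3 (FVFLVLLPL, FLVLLPLVS, LVLLPLVSS at positions 1, 3, and 4).

```python
def enumerate_kmers(sequence, k=9, first_position=1):
    """Return (start_position, peptide) for every overlapping k-mer.

    first_position is the number given to the first residue; 1 is an
    assumption chosen to match the positions listed in Table 3.
    """
    return [
        (i + first_position, sequence[i:i + k])
        for i in range(len(sequence) - k + 1)
    ]


# Illustrative fragment rebuilt from overlapping Table 3 peptides,
# not the full S protein sequence.
fragment = "FVFLVLLPLVSS"

for position, peptide in enumerate_kmers(fragment):
    print(position, peptide)
```

Each 9-mer produced this way would then be scored per allele by the prediction server; only peptides passing the chosen threshold are reported.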

T-cell epitopes and features profiling

Physicochemical properties of the selected MHC class-I and MHC class-II epitopes, including maximum allele binding, antigenicity, allergenicity, digestion, mutation, toxicity, hydrophobicity, hydrophilicity, hydropathicity, half-life, surface accessibility, polarity, charge, pI, and molecular weight, were determined. An online peptide digestion server (http://db.systemsbiology.net:8080/proteomicsToolkit/proteinDigest.html) was used to predict the digestion of the peptides by several digestive enzymes. The fewer the enzymes able to digest an epitope, the more stable, and thus the more favorable, it is [33]. ToxinPred [34], an online tool developed to predict and design toxic/non-toxic peptides, was used to determine the mutation position, toxicity, hydrophobicity, hydrophilicity, hydropathicity, pI value, charge, and molecular weight of the selected epitopes. Another tool, AllergenFP v.1.0, was used to assess the allergenicity of the selected epitopes. HLP [35], a web server for predicting the half-life of peptides in an intestine-like environment, was employed to determine half-life, HPLC parameters, surface accessibility, and flexibility. The SVM-based model of this server was used since the epitopes were 9 residues long.
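The idea of counting non-digesting enzymes can be illustrated with a small sketch. It assumes simplified textbook cleavage rules (for example, trypsin cutting after K or R, ignoring the proline exception) and covers only five enzymes; the digestion server's actual rule set is not given in the text and may differ, so the names and rules below are illustrative assumptions.

```python
# Simplified textbook cleavage rules (an assumption, not the server's rules).
# enzyme -> (recognised residues, side of the residue that is cut)
CLEAVAGE_RULES = {
    "Trypsin": ("KR", "C"),
    "Chymotrypsin": ("FYW", "C"),
    "Staph protease": ("E", "C"),
    "Cyanogen bromide": ("M", "C"),
    "AspN": ("D", "N"),
}


def has_cleavage_site(peptide, residues, side):
    """True if cutting at a recognised residue would split the peptide."""
    # Cutting after the last residue, or before the first, splits nothing.
    window = peptide[:-1] if side == "C" else peptide[1:]
    return any(aa in residues for aa in window)


def non_digesting_enzymes(peptide):
    """List the enzymes (from CLEAVAGE_RULES) with no site in the peptide."""
    return [
        name
        for name, (residues, side) in CLEAVAGE_RULES.items()
        if not has_cleavage_site(peptide, residues, side)
    ]


for epitope in ("VVVLSFELL", "VVIGIVNNT"):
    print(epitope, non_digesting_enzymes(epitope))
```

Under these simplified rules, VVIGIVNNT has no site for any of the five enzymes, whereas VVVLSFELL contains F and E and is therefore cut by chymotrypsin and Staph protease.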

Structure determination and interaction study with HLA-B7

The 3D structures of all selected MHC class-I epitopes were modeled with the online peptide structure prediction server PEP-FOLD [36]. The crystal structure of the human HLA-B7 allele was retrieved from PDB (PDB ID: 3VCL) [37]. The interaction pattern and binding affinity between HLA-B7 and the predicted MHC class-I epitopes of SARS-CoV-2 S protein were investigated using the molecular docking tool AutoDock Vina 4.2 [38]. AutoDock Vina is an open-source molecular docking program that significantly improves the average accuracy of binding mode predictions. The PEP-FOLD server predicted five 3D models for each epitope, and the best one was chosen according to its binding affinity with HLA-B7. The higher the binding affinity (that is, the more negative the docking score in kcal/mol), the more stable the interaction. PyMOL was used to study the binding pattern and interactions.

Results

B-cell epitopes of S protein

Linear B-cell epitopes were predicted with IEDB (BepiPred linear epitope prediction, B-cell analysis tool), which gave several probable epitopes of different lengths. Among them, epitopes with 10–14 amino acid residues were selected for further study since they can function as potential antigens. Various physicochemical properties of the initially determined epitopes were measured with several tools, and the epitopes were filtered by length, Vaxijen score for antigenicity, non-allergenicity, and non-toxicity (Supplementary Table S2). A total of 7 epitopes within this length range and with maximum allele binding ability were selected as the best antigenic epitopes: KNHTSPDVDLG, VRQIAPGQTGKIAD, ILPDPSKPSKRS, LTPGDSSSGWTAG, VITPGTNTSN, RTQLPPAYTNS, and YGFQPTNGVGYQ (Table 1). None of them were found to be toxic, but two of the selected peptides showed allergenicity. Therefore, the non-allergenic and non-toxic epitope ‘KNHTSPDVDLG’, with a length of 11 residues and the highest Vaxijen score (1.4039), was selected as a potential antigenic B-cell epitope.
Table 1

B-cell epitopes of S protein predicted via IEDB analysis resource along with their start position, end position, length, Vaxijen scores (antigenicity), allergenicity, and toxicity.

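| Start | End | Peptide | Length | Vaxijen score | Allergenicity | Toxicity |
|---|---|---|---|---|---|---|
| 1157 | 1167 | KNHTSPDVDLG | 11 | 1.4039 | Non-allergen | Non-toxin |
| 407 | 420 | VRQIAPGQTGKIAD | 14 | 1.2606 | Non-allergen | Non-toxin |
| 805 | 816 | ILPDPSKPSKRS | 12 | 0.5322 | Non-allergen | Non-toxin |
| 249 | 261 | LTPGDSSSGWTAG | 13 | 0.4950 | Non-allergen | Non-toxin |
| 597 | 606 | VITPGTNTSN | 10 | 0.4217 | Non-allergen | Non-toxin |
| 21 | 31 | RTQLPPAYTNS | 11 | 0.8710 | Allergen | Non-toxin |
| 495 | 506 | YGFQPTNGVGYQ | 12 | 0.7136 | Allergen | Non-toxin |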
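The selection rule described for Table 1 (keep peptides at or above the Vaxijen threshold of 0.4 that are neither allergenic nor toxic, then take the highest score) can be sketched as follows. The data are copied from Table 1; the `Epitope` type and `select_best` function are hypothetical names introduced only for this illustration.

```python
from typing import NamedTuple


class Epitope(NamedTuple):
    peptide: str
    vaxijen: float
    allergen: bool
    toxic: bool


# Values taken from Table 1.
TABLE_1 = [
    Epitope("KNHTSPDVDLG", 1.4039, False, False),
    Epitope("VRQIAPGQTGKIAD", 1.2606, False, False),
    Epitope("ILPDPSKPSKRS", 0.5322, False, False),
    Epitope("LTPGDSSSGWTAG", 0.4950, False, False),
    Epitope("VITPGTNTSN", 0.4217, False, False),
    Epitope("RTQLPPAYTNS", 0.8710, True, False),
    Epitope("YGFQPTNGVGYQ", 0.7136, True, False),
]

VAXIJEN_THRESHOLD = 0.4  # default threshold stated in the methods


def select_best(epitopes, threshold=VAXIJEN_THRESHOLD):
    """Return the highest-scoring non-allergenic, non-toxic epitope."""
    safe = [
        e for e in epitopes
        if e.vaxijen >= threshold and not e.allergen and not e.toxic
    ]
    return max(safe, key=lambda e: e.vaxijen)


print(select_best(TABLE_1).peptide)
```

Applied to the Table 1 values, this filter leaves five candidates and returns KNHTSPDVDLG, in agreement with the selection reported in the text.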
Some essential parameters, including antigenicity, surface accessibility, flexibility, hydrophilicity, and beta-turn propensity, were also measured. The Kolaskar and Tongaonkar antigenicity scale (threshold 1.048) gave a minimum score of 0.866 and a maximum score of 1.261, and our selected peptide showed an antigenicity score of 1.0163 (Supplementary Table S3). The Emini surface accessibility prediction tool (threshold 1.00) gave a minimum score of 0.042 and a maximum score of 6.051, and our selected peptide scored 1.33 in surface accessibility (Supplementary Table S4). The Karplus and Schulz flexibility prediction tool (threshold 0.993) gave a minimum score of 0.876 and a maximum score of 1.125, and our selected peptide showed a flexibility score of 1.033 (Supplementary Table S5). The Parker hydrophilicity prediction algorithm (threshold 1.238) gave a minimum score of -7.629 and a maximum score of 7.743, and our selected peptide showed a hydrophilicity score of 1.33 (Supplementary Table S6). The Chou and Fasman beta-turn prediction algorithm (threshold 0.997) gave a minimum score of 0.541 and a maximum score of 1.484, and our selected peptide showed a propensity score of 1.164 (Supplementary Table S7). These profiles are presented in Figure 1. In these analyses, the threshold value denotes the average of all predicted residue scores, and residues with scores above the threshold are predicted to be part of an epitope.
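The thresholding rule stated above (threshold equal to the mean of all residue scores, residues above it predicted as epitope) can be sketched in a few lines. The score list below is made-up toy data, not output from any IEDB scale, and the function name is a hypothetical one; positions are 0-based indices into the score list.

```python
def epitope_segments(scores):
    """Return (threshold, segments) for a per-residue score profile.

    threshold is the mean of all scores; segments are (start, end)
    index pairs, inclusive and 0-based, of consecutive residues whose
    score lies strictly above the threshold.
    """
    threshold = sum(scores) / len(scores)
    segments = []
    start = None
    for i, score in enumerate(scores):
        if score > threshold and start is None:
            start = i  # a run above the threshold begins
        elif score <= threshold and start is not None:
            segments.append((start, i - 1))  # the run ended at i - 1
            start = None
    if start is not None:
        segments.append((start, len(scores) - 1))  # run reaches the end
    return threshold, segments


# Toy profile for illustration only.
toy_scores = [0.9, 1.1, 1.2, 0.8, 0.7, 1.3, 1.4, 1.2, 0.6]
threshold, segments = epitope_segments(toy_scores)
print(round(threshold, 3), segments)
```

In the published analysis each scale is additionally smoothed over a sliding window by the IEDB tool before thresholding; that step is omitted here.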
Figure 1

Prediction of B-cell epitopes of S protein by different IEDB scales (A. Kolaskar and Tongaonkar antigenicity, B. Emini surface accessibility, C. Karplus and Schulz flexibility, D. Parker hydrophilicity, E. Chou and Fasman beta-turn). Regions above threshold (black line) are proposed to be a part of B cell epitope while regions below the threshold (black line) are not.

Alongside the linear (continuous) B-cell epitopes, discontinuous (conformational) B-cell epitopes of S protein were also predicted and are shown in Figure 2. The residues and other parameters of the discontinuous epitopes predicted from both the A-chain and the B-chain of S protein using DiscoTope are listed in Supplementary Table S8.
Figure 2

Discontinuous B-cell epitopes of S protein predicted by DiscoTope 2.0 server. Positions of predicted peptides are marked as yellow.

T-cell epitopes of S protein

A total of 14 T-cell epitopes of S protein were predicted using nHLAPred and ProPred for MHC class-I and MHC class-II alleles, respectively. The ComPred section of nHLAPred returned CTL epitopes for 67 MHC class-I alleles, with threshold and stringency kept at 5% and the proteasome and immunoproteasome filters applied (Supplementary Table S9). Among all epitopes predicted for the 67 alleles, 7 epitopes that possessed more than 13 binding alleles were selected for further study. These epitopes were chosen according to properties such as Vaxijen score for antigenicity, non-allergenicity, non-toxicity, and half-life, for better immunogenic response and stability (Supplementary Table S10). The seven selected epitopes are VVVLSFELL, TLDSKTQSL, FEYVSQPFL, GLTVLPPLL, WTFGAGAAL, VYDPLQPEL, and GGFNFSQIL. Among them, VVVLSFELL has the highest antigenicity, with a Vaxijen score of 1.0909. It was found to be non-allergenic (by AllergenFP v.1.0) and can bind a large number of alleles, including HLA-A∗0201, HLA-A∗0206, HLA-A11, HLA-A24, HLA-A3, HLA-A∗3101, HLA-A31, HLA-A∗0301, HLA-A∗3302, HLA-A2.1, HLA-B14, HLA-B27, HLA-B∗2705, HLA-B∗3501, HLA-B∗3701, HLA-B∗5101, HLA-B∗5102, HLA-B∗5301, HLA-B∗5401, HLA-B∗51, HLA-B7, HLA-B8, HLA-Cw∗0301, H2-Db, H2-Kb, H2-Kd, H2-Ld, HLA-B35, and Mamu-A∗01. Therefore, ‘VVVLSFELL’ was selected as the best MHC class-I epitope, as it had the highest Vaxijen score. The 7 initially selected epitopes, with their binding alleles, starting positions, antigenicity scores, and allergenicity, are given in Table 2.
Table 2

Positions of MHC class I allele binding peptides predicted by nHLAPred with their starting position in protein sequence, Vaxijen score (antigenicity) and allergenicity.

| Peptide | Position | MHC class I alleles | Vaxijen score | Allergenicity |
|---|---|---|---|---|
| VVVLSFELL | 510 | HLA-A∗0201, HLA-A∗0206, HLA-A∗1101, HLA-A11, HLA-A24, HLA-A∗2402, HLA-A3, HLA-A∗3101, HLA-A31, HLA-A∗0301, HLA-A∗3302, HLA-A68.1, HLA-A2.1, HLA-B14, HLA-B27, HLA-B∗2705, HLA-B∗3501, HLA-B∗3701, HLA-B∗3901, HLA-B∗5101, HLA-B∗5102, HLA-B∗5103, HLA-B∗5301, HLA-B∗5401, HLA-B∗51, HLA-B7, HLA-B∗0702, HLA-B8, HLA-Cw∗0301, H2-Db, H2-Kb, H2-Kd, H2-Ld, H-2Qa, HLA-B35, Mamu-A∗01, HLA-B∗2703, HLA-B∗2704, HLA-A∗6801, HLA-A∗6802 | 1.0909 | Non-allergen |
| TLDSKTQSL | 109 | HLA-A∗0202, HLA-A∗1101, HLA-A11, HLA-A∗2402, HLA-A3, HLA-A∗3101, HLA-A31, HLA-A∗0301, HLA-A20, HLA-A2.1, HLA-B14, HLA-B∗2702, HLA-B27, HLA-B∗2705, HLA-B∗3501, HLA-B∗3801, HLA-B∗3901, HLA-B∗5101, HLA-B∗5301, HLA-B∗5401, HLA-B∗51, HLA-B7, HLA-B∗0702, HLA-B8, HLA-Cw∗0602, H2-Db, H2-Kb, H2-Kd, H2-Ld, HLA-B35, HLA-A∗0204, HLA-A∗6801, HLA-A∗6802 | 1.0685 | Non-allergen |
| FEYVSQPFL | 168 | HLA-A∗0202, HLA-A∗0203, HLA-A∗0206, HLA-B∗2702, HLA-B∗3701, HLA-B40, HLA-B∗4403, HLA-B∗5201, HLA-B60, HLA-B61, HLA-Cw∗0301, H2-Dd, H2-Kk, HLA-B∗2704, HLA-A∗3301, HLA-B44 | 0.6324 | Non-allergen |
| GLTVLPPLL | 853 | HLA-A2, HLA-A∗0201, HLA-A∗0202, HLA-A∗0203, HLA-A∗0205, HLA-A∗1101, HLA-A11, HLA-A∗2402, HLA-A3, HLA-A31, HLA-A∗0301, HLA-A20, HLA-A2.1, HLA-B14, HLA-B∗2702, HLA-B27, HLA-B∗2705, HLA-B∗3501, HLA-B∗3701, HLA-B∗3902, HLA-B∗5101, HLA-B∗5301, HLA-B∗5401, HLA-B∗51, HLA-B62, HLA-B7, HLA-B∗0702, HLA-B8, HLA-Cw∗0602, H2-Db, H2-Kb, H2-Kd, H2-Ld, HLA-G, H-2Qa, HLA-A∗0204, HLA-B∗2704, HLA-A∗3301, HLA-A∗6801, HLA-A∗6802 | 0.6621 | Non-allergen |
| WTFGAGAAL | 882 | HLA-A∗0205, HLA-A24, HLA-A∗3101, HLA-A68.1, HLA-B∗2702, HLA-B∗3801, HLA-B∗3901, HLA-B40, HLA-B∗5801, HLA-B61, HLA-B7, HLA-Cw∗0401, HLA-A∗0204, HLA-B∗2704, HLA-B∗2902, HLA-B44 | 0.4918 | Non-allergen |
| VYDPLQPEL | 1133 | HLA-A∗0201, HLA-A∗0203, HLA-A∗0205, HLA-A∗1101, HLA-A11, HLA-A∗2402, HLA-A3, HLA-A31, HLA-A∗0301, HLA-A2.1, HLA-B14, HLA-B27, HLA-B∗2705, HLA-B∗3501, HLA-B∗3801, HLA-B∗3901, HLA-B∗5101, HLA-B∗5301, HLA-B∗5401, HLA-B∗51, HLA-B7, HLA-B∗0702, HLA-B8, HLA-Cw∗0602, HLA-Cw∗0702, H2-Db, H2-Dd, H2-Kb, H2-Kd, H2-Ld, HLA-B∗2703, HLA-B∗2704, HLA-A∗3301, HLA-A∗6801, HLA-A∗6802 | 0.4525 | Non-allergen |
| GGFNFSQIL | 794 | HLA-B∗3901, HLA-B∗3902, HLA-B40, HLA-B∗5102, HLA-B∗5103, HLA-B∗5201, HLA-B61, HLA-Cw∗0301, HLA-Cw∗0602, H2-Dd, H2-Kb, Mamu-A∗01, HLA-B∗2704 | 0.7967 | Allergen |
ProPred was run to predict the MHC class-II binding epitopes with a threshold of 5% and the top 10 scorers displayed (Supplementary Table S11). For MHC class-II, 7 epitopes were selected on the basis of their ability to bind more than 5 alleles and were then filtered on the same properties investigated for the MHC class-I epitopes (Supplementary Table S12). According to these properties, the 7 best-scoring epitopes are IGINITRFQ, VVIGIVNNT, VLSFELLNA, FKIYSKHTP, FVFLVLLPL, FLVLLPLVS, and LVLLPLVSS. The epitope ‘IGINITRFQ’ has the maximum Vaxijen score of 1.3386, but ‘VVIGIVNNT’ has a nearly identical Vaxijen score of 1.3063, which also indicates strong antigenicity. Both were found to be non-allergenic by AllergenFP v.1.0. As listed in Table 3, IGINITRFQ binds DRB1_0402, DRB1_0804, DRB1_0806, DRB1_1102, DRB1_1114, DRB1_1121, DRB1_1301, DRB1_1304, DRB1_1322, DRB1_1323, DRB1_1327, and DRB1_1328, and VVIGIVNNT binds DRB1_0301, DRB1_0306, DRB1_0307, DRB1_0308, DRB1_0309, DRB1_0311, and DRB1_1107. The 7 initially selected MHC class-II binding peptides, with their binding alleles, starting positions, antigenicity scores, and allergenicity, are presented in Table 3.
Table 3

Positions of MHC class II allele binding peptides predicted by ProPred with their starting position in protein sequence, vaxijen score (antigenicity), and allergenicity.

| Peptide | Position | MHC class II alleles | Vaxijen score | Allergenicity |
|---|---|---|---|---|
| VVIGIVNNT | 1123 | DRB1_0301, DRB1_0306, DRB1_0307, DRB1_0308, DRB1_0309, DRB1_0311, DRB1_1107 | 1.3063 | Non-allergen |
| IGINITRFQ | 230 | DRB1_0402, DRB1_0804, DRB1_0806, DRB1_1102, DRB1_1114, DRB1_1121, DRB1_1301, DRB1_1304, DRB1_1322, DRB1_1323, DRB1_1327, DRB1_1328 | 1.3386 | Non-allergen |
| VLSFELLNA | 511 | DRB1_1104, DRB1_1106, DRB1_1311, DRB1_1501, DRB1_1502, DRB1_1506 | 1.0433 | Non-allergen |
| FKIYSKHTP | 200 | DRB1_0801, DRB1_0802, DRB1_0804, DRB1_0806, DRB1_0813, DRB1_0817, DRB1_1114, DRB1_1120, DRB1_1302, DRB1_1323, DRB1_1502 | 0.9886 | Non-allergen |
| FVFLVLLPL | 1 | DRB1_0101, DRB1_0102, DRB1_0701, DRB1_0703, DRB1_0801, DRB1_0802, DRB1_0817, DRB1_1101, DRB1_1104, DRB1_1106, DRB1_1114, DRB1_1120, DRB1_1128, DRB1_1302, DRB1_1305, DRB1_1307, DRB1_1311, DRB1_1321, DRB1_1323, DRB1_1501, DRB1_1502, DRB1_1506, DRB5_0101, DRB5_0105 | 0.8601 | Non-allergen |
| FLVLLPLVS | 3 | DRB1_0101, DRB1_0102, DRB1_0401, DRB1_0408, DRB1_0421, DRB1_0426, DRB1_0802, DRB1_0813, DRB1_0817, DRB1_1101, DRB1_1104, DRB1_1106, DRB1_1114, DRB1_1120, DRB1_1128, DRB1_1302, DRB1_1305, DRB1_1307, DRB1_1311, DRB1_1321, DRB1_1323, DRB5_0101, DRB5_0105 | 0.4266 | Non-allergen |
| LVLLPLVSS | 4 | DRB1_0102, DRB1_0301, DRB1_0305, DRB1_0306, DRB1_0307, DRB1_0308, DRB1_0309, DRB1_0311, DRB1_0404, DRB1_0423, DRB1_0802, DRB1_0804, DRB1_0806, DRB1_0813, DRB1_1101, DRB1_1102, DRB1_1104, DRB1_1106, DRB1_1107, DRB1_1114, DRB1_1120, DRB1_1121, DRB1_1128, DRB1_1301, DRB1_1302, DRB1_1304, DRB1_1305, DRB1_1307, DRB1_1311, DRB1_1321, DRB1_1322, DRB1_1323, DRB1_1327, DRB1_1328 | 0.6523 | Allergen |

Features of T-cell epitopes

Important physicochemical properties of the selected CTL epitopes (MHC class-I) and T helper cell epitopes (MHC class-II) were determined using several tools. The peptide digestion server was used to predict the digestion of the selected peptides; the more enzymes that can digest a peptide, the less stable it is, and vice versa. The enzymes Trypsin, Clostripain, Cyanogen Bromide, Iodoso Benzoate, Proline Endopept, Trypsin K, Trypsin R, and AspN cannot digest the selected MHC class-I epitope VVVLSFELL. The non-digesting enzymes of the other MHC class-I binding epitopes were also identified (Supplementary Table S13). The enzymes Trypsin, Chymotrypsin, Clostripain, Cyanogen Bromide, Iodoso Benzoate, Proline Endopept, Staph Protease, Trypsin K, Trypsin R, and AspN cannot digest the selected MHC class-II epitope VVIGIVNNT. The non-digesting enzymes of the other MHC class-II binding epitopes were also determined (Supplementary Table S14). ToxinPred, a tool for predicting and designing toxic/non-toxic peptides, was used to determine the mutation position, toxicity, hydrophobicity, hydrophilicity, hydropathicity, pI value, charge, and molecular weight of the selected epitopes. None of these peptides were found to be toxic. HLP was used to predict half-life, HPLC parameters, surface accessibility, and flexibility with its SVM model. The physicochemical properties of the selected epitopes, including half-life, mutation, toxicity, hydrophobicity, hydrophilicity, hydropathicity, charge, pI, molecular weight, HPLC parameters, surface accessibility, flexibility, and polarity, are given in Table 4 for the MHC class-I binding CTL epitopes and in Table 5 for the MHC class-II binding T helper cell epitopes.
Table 4

Physicochemical properties of predicted MHC class I peptides.

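| Peptide | Half-life | Mutation | Toxicity | Hydrophobicity | Hydropathicity | Hydrophilicity | Charge | pI | MW | HPLC parameter | Surface accessibility | Flexibility | Polarity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| VVVLSFELL | 0.865 | NM | NT | 0.33 | 2.50 | -1.01 | -1.00 | 4.00 | 1018.40 | -3.733 | 32.178 | 3.570 | 53.670 |
| TLDSKTQSL | 2.795 | NM | NT | -0.26 | -0.70 | 0.27 | 0.00 | 6.19 | 992.22 | 2.967 | 51.278 | 4.090 | 101.290 |
| FEYVSQPFL | 0.154 | NM | NT | 0.07 | 0.32 | -0.79 | -1.00 | 4.00 | 1129.40 | -1.200 | 43.100 | 3.800 | 59.580 |
| GLTVLPPLL | 3.298 | NM | NT | 0.28 | 1.68 | -1.01 | 0.00 | 5.88 | 922.32 | -2.822 | 34.067 | 3.830 | 6.750 |
| WTFGAGAAL | 1.741 | NM | NT | 0.27 | 1.07 | -1.07 | 0.00 | 5.88 | 893.13 | -0.611 | 29.467 | 3.570 | 4.560 |
| VYDPLQPEL | 0.520 | NM | NT | -0.06 | -0.36 | -0.13 | -2.00 | 3.67 | 1073.34 | 0.444 | 48.289 | 4.050 | 99.940 |
| GGFNFSQIL | 1.603 | NM | NT | 0.13 | 0.59 | -0.88 | 0.00 | 5.88 | 982.24 | -0.522 | 35.689 | 3.980 | 9.880 |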

NM: No Mutation, NT: Non-Toxin, MW: Molecular Weight.

Table 5

Physicochemical properties of predicted MHC class II peptides.

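| Peptide | Half-life | Mutation | Toxicity | Hydrophobicity | Hydropathicity | Hydrophilicity | Charge | pI | MW | HPLC parameter | Surface accessibility | Flexibility | Polarity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| VVIGIVNNT | 1.071 | NM | NT | 0.20 | 1.50 | -0.90 | 0.00 | 5.88 | 928.24 | -0.244 | 34.044 | 3.990 | 9.110 |
| IGINITRFQ | 0.689 | NM | NT | -0.03 | 0.41 | -0.54 | 1.00 | 10.11 | 1061.39 | -0.567 | 42.989 | 4.150 | 61.370 |
| VLSFELLNA | 1.384 | NM | NT | 0.16 | 1.38 | -0.71 | -1.00 | 4.00 | 1005.31 | -1.900 | 36.678 | 3.610 | 56.790 |
| FKIYSKHTP | 1.136 | NM | NT | -0.19 | -0.90 | -0.13 | 2.50 | 9.72 | 1120.44 | 0.911 | 55.411 | 3.910 | 157.620 |
| FVFLVLLPL | 1.735 | NM | NT | 0.48 | 3.07 | -1.69 | 0.00 | 5.88 | 1060.53 | -6.722 | 28.922 | 3.350 | 4.340 |
| FLVLLPLVS | 0.771 | NM | NT | 0.39 | 2.67 | -1.38 | 0.00 | 5.88 | 1000.43 | -4.978 | 30.756 | 3.550 | 5.660 |
| LVLLPLVSS | 0.628 | NM | NT | 0.29 | 2.27 | -1.07 | 0.00 | 5.88 | 940.33 | -3.233 | 32.589 | 3.750 | 6.980 |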

NM: No Mutation, NT: Non-Toxin, MW: Molecular Weight.
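The hydropathicity column of Tables 4 and 5 can be checked independently. The sketch below assumes that hydropathicity is the mean Kyte-Doolittle hydropathy value of the residues (the GRAVY score); the text does not state which scale ToxinPred uses, so this is an assumption, although the values it yields agree with the tabulated ones for the peptides tried. The function name is hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(peptide):
    """Mean Kyte-Doolittle hydropathy of a peptide (GRAVY score)."""
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


for epitope in ("VVVLSFELL", "TLDSKTQSL", "FEYVSQPFL", "VVIGIVNNT"):
    print(epitope, f"{gravy(epitope):.2f}")
```

For VVVLSFELL the residue values sum to 22.5 over 9 residues, giving 2.50, the value listed in Table 4; VVIGIVNNT gives 13.5 / 9 = 1.50, matching Table 5.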

Molecular docking studies

The 3D structures of the CTL epitopes (MHC class-I binding peptides) were predicted with the PEP-FOLD server, which returned 5 models for each of the 7 epitopes. For each epitope, the model with the highest binding affinity for the HLA-B7 protein in molecular docking was selected as the best. All 7 MHC class-I epitopes, rather than only the best one identified by the other metrics, were included in the docking study to check whether the other epitopes could also bind HLA-B7. The binding affinity of the best model of each epitope and the corresponding interacting residues are listed in Table 6. Among the 7 epitopes, ‘FEYVSQPFL’ exhibited the highest binding affinity (-7.0 kcal/mol), while the candidate epitope of this study, ‘VVVLSFELL’, exhibited a binding affinity of -5.7 kcal/mol. PyMOL was used to visualize the binding pattern and interacting residues. Several residues of HLA-B7 were found to interact with the candidate epitope ‘VVVLSFELL’: ARG-35, PRO-47, ARG-48, GLU-50, GLU-53, TYR-67, ARG-239, and GOL-304. The best interactions for all 7 epitopes are visualized in Figure 3.
Table 6

Best binding affinity and interacting residues.

| Peptide | Binding affinity (kcal/mol) | Interacting residues |
|---|---|---|
| VVVLSFELL | -5.7 | ARG35, PRO47, ARG48, GLU50, GLU53, TYR67, ARG239, GOL304 |
| TLDSKTQSL | -5.8 | PHE22, GLN32, ARG48, GLU69, LYS178, ASP183, ARG239, THR240 |
| FEYVSQPFL | -7.0 | ARG12, HIS13, PRO14, LYS186, THR187, HIS188, LEU206, GLY207, LEU270, LEU272, ARG273, GLU275 |
| GLTVLPPLL | -6.0 | THR187, HIS188, PRO267, LEU270, THR271, LEU272, ARG273, GLU275 |
| WTFGAGAAL | -6.1 | ARG12, GLN32, ARG48, GLY237, ARG239 |
| VYDPLQPEL | -5.8 | ARG12, PRO47, ARG48, PRO50, GLU53, TYR67, ARG239 |
| GGFNFSQIL | -6.5 | LYS178, ARG181, ASP183, LYS186, THR187, HIS188, THR240, PRO267, GLU275 |
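Because docking scores are reported as negative energies, the strongest predicted binder is the peptide with the most negative value. A minimal sketch of ranking the Table 6 values this way follows; the dictionary and function names are hypothetical, and the numbers are copied from Table 6.

```python
# Best-model docking scores against HLA-B7, in kcal/mol (from Table 6).
BINDING_AFFINITY = {
    "VVVLSFELL": -5.7,
    "TLDSKTQSL": -5.8,
    "FEYVSQPFL": -7.0,
    "GLTVLPPLL": -6.0,
    "WTFGAGAAL": -6.1,
    "VYDPLQPEL": -5.8,
    "GGFNFSQIL": -6.5,
}


def rank_by_affinity(scores):
    """Sort peptides from strongest (most negative) to weakest binder."""
    return sorted(scores.items(), key=lambda item: item[1])


for peptide, score in rank_by_affinity(BINDING_AFFINITY):
    print(f"{peptide}  {score:.1f} kcal/mol")
```

The ranking places FEYVSQPFL first at -7.0 kcal/mol and the candidate epitope VVVLSFELL last of the seven at -5.7 kcal/mol, so its selection rests on the antigenicity and allele-binding criteria rather than on docking score alone.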
Figure 3

Representation of the interactions between HLA-B7 and the predicted MHC class I binding peptides: (A) VVVLSFELL, (B) TLDSKTQSL, (C) FEYVSQPFL, (D) GLTVLPPLL, (E) WTFGAGAAL, (F) VYDPLQPEL, and (G) GGFNFSQIL. The panels follow the order of the peptides in Table 6. The ribbon structure represents HLA-B7, the wire structure represents the peptide, and green dotted lines represent hydrogen bonds between interacting residues.

Discussion

Epitope-based vaccines overcome several obstacles associated with conventional vaccines, including undesired allergenic responses due to large protein size and time-consuming development, and they are also an effective means of treating viral infection in the post-exposure state [39, 40]. An epitope-based vaccine is a type of subunit vaccine that uses small-molecular-weight peptides, known as epitopes, which do not cause infection and are not prone to mutation. Constructing a vaccine by this approach is both time-saving and cost-effective, and it can generate strong immunogenic responses [41, 42]. Prediction of epitope-based vaccines using computational tools has been successfully reported for the Chikungunya virus, Zika virus, Ebola virus, Herpes simplex virus, Yellow fever virus, and others [43, 44, 45, 46, 47]. The focus of this study was to predict both B-cell and T-cell epitopes from the S protein of SARS-CoV-2 by immunoinformatics methods. The S protein of SARS-CoV-2 was selected for its immunogenicity and for its important role in viral attachment to the host cell surface [48, 49, 50]. In addition, epitopes of viral S proteins (such as the SARS-CoV S protein) are the most antigenic parts and elicit an intrinsic immunogenic response by signaling immune cells to increase both humoral and adaptive immunity [51, 52, 53]. B-cell epitope mapping is the keystone in the development of diagnostics, and it is an initial step in designing potent vaccines [54, 55, 56, 57]. Both continuous (linear) and discontinuous (conformational) B-cell epitopes are crucial to peptide-based vaccines and disease diagnosis [58, 59]. Unlike linear epitopes, whose residues occur consecutively in the primary sequence, the residues of discontinuous epitopes lie at distant positions in the primary sequence but are brought into proximity by peptide folding [60].
B-cell linear epitopes were predicted with the IEDB analysis-resource tool, which generated epitopes comprising more than nine amino acid residues. Various physicochemical properties of the initially determined epitopes were then measured with several tools, and the epitopes were filtered accordingly. Among them, the KNHTSPDVDLG peptide was selected as a strong immunogen and stable epitope on the basis of its highest antigenicity score (1.4039), non-allergenicity, and non-toxicity. Similarly, CTL epitopes for MHC class-I allele binding were predicted with nHLAPred, and MHC class-II peptides were predicted with the ProPred tool. These epitopes were also filtered according to their properties for better immunogenic response and stability. For the MHC class-I allele, the best antigenicity score (1.0909) was found for the VVVLSFELL peptide. For MHC class II, ‘IGINITRFQ’ had the maximum VaxiJen score (1.3386), but ‘VVIGIVNNT’ had a nearly identical VaxiJen score (1.3063) and the highest number of non-digesting enzymes. On the basis of their antigenicity and other properties, VVVLSFELL and VVIGIVNNT were selected as the best epitopes, expected to act as strong immunogens. As shown in Tables 1, 2, and 3, this study identified KNHTSPDVDLG as the B-cell epitope, VVVLSFELL as the T-cell epitope for MHC class-I binding, and VVIGIVNNT as the T-cell epitope for MHC class-II binding. These are predicted to be potential vaccine candidates because of their features and their highly antigenic nature, which can elicit an intrinsic immunogenic response. Furthermore, the interaction between the initially selected MHC class-I epitopes and the human HLA-B7 allele was analyzed to understand their binding affinities. Molecular docking of HLA-B7 with each epitope modeled by PEP-FOLD indicated a stable interaction between HLA-B7 and each ligand. Several significant hydrogen bonds were present in each docked complex (Figure 3), supporting the use of these peptides as potential immunogens.
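
The MHC class-II choice described above (preferring ‘VVIGIVNNT’ over the slightly higher-scoring ‘IGINITRFQ’) amounts to a two-stage rule: discard allergenic or toxic peptides, then break near-ties in antigenicity by the number of non-digesting enzymes. The sketch below illustrates one way such a rule could be written; it is not the authors' procedure. The antigenicity scores are the ones quoted in the text, whereas the `non_digesting` counts, the 0.05 `tolerance`, and the assumption that both peptides passed the allergenicity and toxicity filters are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    peptide: str
    antigenicity: float   # VaxiJen score as quoted in the text
    non_digesting: int    # HYPOTHETICAL count; real values are not given in this section
    allergenic: bool = False
    toxic: bool = False


def select_epitope(candidates, tolerance=0.05):
    """Pick an epitope: safety filter first, then near-tie break on enzyme count.

    `tolerance` (an assumed value) is the score gap within which two
    candidates are treated as having almost similar antigenicity.
    """
    safe = [c for c in candidates if not c.allergenic and not c.toxic]
    if not safe:
        raise ValueError("no non-allergenic, non-toxic candidates")
    top_score = max(c.antigenicity for c in safe)
    near_top = [c for c in safe if top_score - c.antigenicity <= tolerance]
    # Among near-ties, prefer more non-digesting enzymes, then higher score
    return max(near_top, key=lambda c: (c.non_digesting, c.antigenicity))


# Scores from the text; enzyme counts are placeholders for illustration only
mhc2_candidates = [
    Candidate("IGINITRFQ", antigenicity=1.3386, non_digesting=4),
    Candidate("VVIGIVNNT", antigenicity=1.3063, non_digesting=6),
]

if __name__ == "__main__":
    print(select_epitope(mhc2_candidates).peptide)
```

With a tolerance of zero the rule falls back to picking the single highest antigenicity score, which makes explicit that the final choice depends on how "almost similar" is defined.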
Although the candidate MHC class-I epitope ‘VVVLSFELL’ did not have the highest binding affinity for HLA-B7 among the epitopes tested, its score of -5.7 kcal/mol is still significant. Moreover, this epitope is preferred on account of its other properties, such as antigenicity, the number of MHC alleles it binds, and non-allergenicity. HLA-B7 was selected for the docking study as a representative MHC class-I protein, since HLA-B is the most prevalent serotype and occurs at a higher frequency in humans than the other MHC class-I molecules [61]. HLA-B7 also plays a major role in viral epitope binding in infections such as HIV and MERS-CoV [33, 62], and it is one of the common haplotypes targeted when predicting HLA binding to coronavirus-derived peptides [63, 64]. The physicochemical properties (peptide digestion, half-life, hydropathicity, flexibility, toxicity, etc.), immunogenic properties (allergenicity, antigenicity), and stable HLA allele binding capability reinforce the potential of these peptides as effective, safe, and stable vaccine candidates.
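
To give the docking scores a rough physical scale, a binding free energy can be converted to an apparent dissociation constant through the standard relation ΔG = RT ln(Kd). The snippet below is a back-of-the-envelope sketch under strong assumptions: it treats the reported docking score as if it were a true binding free energy (docking scores are scoring-function estimates, not measured values) and assumes a temperature of 298.15 K. The function name `apparent_kd` is illustrative.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def apparent_kd(delta_g_kcal_per_mol, temperature_k=298.15):
    """Apparent dissociation constant (mol/L) from dG = RT ln(Kd).

    Assumes the docking score can be read as a binding free energy,
    which is only an approximation.
    """
    return math.exp(delta_g_kcal_per_mol / (R_KCAL_PER_MOL_K * temperature_k))


if __name__ == "__main__":
    for peptide, score in (("VVVLSFELL", -5.7), ("FEYVSQPFL", -7.0)):
        kd_micromolar = apparent_kd(score) * 1e6
        print(f"{peptide}: {score} kcal/mol -> apparent Kd ~ {kd_micromolar:.1f} uM")
```

Under these assumptions both peptides fall in the micromolar range, and the 1.3 kcal/mol gap between them corresponds to roughly a ninefold difference in apparent Kd.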

Conclusion

This study has presented several B-cell and T-cell epitopes of the S protein of SARS-CoV-2, screened for an efficient immunogenic response according to their antigenic nature. This in silico approach to epitope-based vaccine design can save both time and cost by narrowing down the number of vaccine candidates and the associated trials. The study can serve as a stepping stone for further work to tackle the challenge posed by SARS-CoV-2.

Declarations

Author contribution statement

Arafat Islam Ashik: Performed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper. Mahedi Hasan: Performed the experiments; Wrote the paper. Atiya Tahira Tasnim: Contributed reagents, materials, analysis tools or data; Wrote the paper. Md. Belal Chowdhury: Analyzed and interpreted the data; Wrote the paper. Tanvir Hossain: Conceived and designed the experiments; Contributed reagents, materials, analysis tools or data. Shamim Ahmed: Conceived and designed the experiments; Contributed reagents, materials, analysis tools or data; Wrote the paper.

Funding statement

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Competing interest statement

The authors declare no conflict of interest.

Additional information

No additional information is available for this paper.