Understanding the B and T cell epitopes of spike protein of severe acute respiratory syndrome coronavirus-2: A computational way to predict the immunogens.

Yoya Vashi¹, Vipin Jagrit¹, Sachin Kumar².

Abstract

The 2019 novel severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) outbreak has caused a large number of deaths, with thousands of confirmed cases worldwide. The present study followed computational approaches to identify B- and T-cell epitopes for the spike (S) glycoprotein of SARS-CoV-2 by its interactions with the human leukocyte antigen alleles. We identified 24 peptide stretches on the SARS-CoV-2 S protein that are well conserved among the reported strains. The S protein structure further validated the presence of predicted peptides on the surface, of which 20 are surface exposed and predicted to have reasonable epitope binding efficiency. The work could be useful for understanding the immunodominant regions in the surface protein of SARS-CoV-2 and could potentially help in designing some peptide-based diagnostics. Also, identified T-cell epitopes might be considered for incorporation in vaccine designs.

Keywords: Diagnostics; Epitopes; SARS-CoV-2; Spike protein

Year: 2020 PMID: 32473352 PMCID: PMC7251353 DOI: 10.1016/j.meegid.2020.104382

Source DB: PubMed Journal: Infect Genet Evol ISSN: 1567-1348 Impact factor: 3.342

Introduction

The emergence of severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) has caused a pandemic that the World Health Organization has declared a public health emergency (WHO, 2020b). The disease spread rapidly across the globe and caused havoc worldwide (Wu and McGoogan, 2020). By the start of May 2020, SARS-CoV-2 had spread to 215 countries and infected 3,862,676 people (WHO, 2020a). The WHO is continuously monitoring and updating health-related plans to curtail the spread of the disease. The absence of a specific treatment and vaccine worsens the situation. The International Committee on Taxonomy of Viruses (ICTV) classified SARS-CoV-2 under the family Coronaviridae of the order Nidovirales. The genomic sequence of SARS-CoV-2 isolated from the bronchoalveolar lavage fluid of a patient from Wuhan, China, has a length of 29,903 nucleotides [GenBank accession number NC_045512] (Wu et al., 2020). SARS-CoV-2 has a positive-sense, single-stranded RNA genome with 5′ and 3′ untranslated regions. The genome codes for ORF1a, ORF1b, Spike (S), ORF3a, ORF3b, Envelope (E), Membrane (M), ORF6, ORF7a, ORF7b, ORF8, ORF9b, ORF14, Nucleocapsid (N), and ORF10 from 5′ to 3′ (Wu et al., 2020; Zhu et al., 2020). The S glycoprotein forms a homotrimer and mediates viral entry into host cells. The S protein is a potential target for therapeutic and vaccine design against SARS-CoV-2 infection in humans (Li, 2016; Tortorici et al., 2019). The S glycoprotein comprises two functional subunits: the S1 subunit is responsible for binding to the host cell receptor, and the S2 subunit is responsible for fusion of the virus with the cell membrane. Usually in CoVs, S is cleaved at the boundary between the S1 and S2 subunits, which remain non-covalently bound in the prefusion conformation, to activate the protein for membrane fusion via extensive irreversible conformational changes (Burkard et al., 2014; Park et al., 2016; Walls et al., 2017).
Unlike that of other SARS-CoVs, the S glycoprotein of SARS-CoV-2 harbors a furin cleavage site at the boundary between the S1 and S2 subunits (Walls et al., 2020). It is now evident that SARS-CoV-2 S uses angiotensin-converting enzyme 2 (ACE2) receptor-mediated entry into cells. Some studies suggest that the S proteins of SARS-CoV-2 and SARS-CoV bind human ACE2 with similar affinities (Letko et al., 2020; Walls et al., 2020), whereas others suggest that SARS-CoV-2 binds ACE2 with higher affinity than SARS-CoV (Tai et al., 2020; Wang et al., 2020; Wrapp et al., 2020). As the situation worsens, there is a growing need for therapeutics, vaccines, and diagnostics against SARS-CoV-2 to support effective disease management. Vaccines and diagnostic assays based on peptides have become increasingly important because of their advantages over conventional methods (Li et al., 2014; Mohanraj et al., 2017). The present study aimed to locate epitopes within a protein antigen that can elicit an immune response and could be selected for the synthesis of an immunogenic peptide. Using a computational approach, the S glycoprotein of SARS-CoV-2 was explored to identify immunodominant epitopes for the development of diagnostics and vaccines. The results could also help in understanding T- and B-cell responses to the SARS-CoV-2 surface protein.

Materials & methods

Collection of the targeted protein sequence

The amino acid sequences (n = 98) of the SARS-CoV-2 S protein available at the time of the study were downloaded from the National Center for Biotechnology Information (NCBI) database.

Identification of potential peptides

To identify an immunodominant region, it is essential to select the conserved regions within the S protein of SARS-CoV-2. All the sequences were compared with one another for variability using the Protein Variability Server with the Shannon method (Garcia-Boronat et al., 2008). The average solvent accessibility (ASA) profile was predicted for each sequence using the SABLE server (Adamczak et al., 2004). The BepiPred 1.0 Linear Epitope Prediction module incorporated in the Immune Epitope Database (IEDB) was used to predict potential epitopes within the S protein (Haste Andersen et al., 2006; Larsen et al., 2006; Ponomarenko and Bourne, 2007; Vita et al., 2019). The FASTA sequence of the targeted protein was used as input, and all parameters were left at their default values.
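As an illustration of the variability step, the following is a minimal sketch of per-position Shannon entropy over a set of aligned sequences. It is not the Protein Variability Server's implementation; the function name `shannon_variability`, the base-2 logarithm, and the toy alignment are assumptions made only for this example.

```python
import math
from collections import Counter


def shannon_variability(aligned_seqs):
    """Return the Shannon entropy (in bits) of each alignment column.

    Assumes all sequences are already aligned to the same length.
    An entropy of 0 means the column is fully conserved.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    entropies = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned_seqs)
        total = sum(counts.values())
        probs = [n / total for n in counts.values()]
        entropies.append(sum(-p * math.log2(p) for p in probs))
    return entropies


# Toy alignment (hypothetical data): only the third column varies.
toy_alignment = ["RTQLP", "RTQLP", "RTKLP", "RTQLP"]
toy_entropy = shannon_variability(toy_alignment)
print(toy_entropy)
```

Columns with entropy close to zero correspond to the conserved positions that the selection step favours.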

Identification of B-cell epitopes

We used two web-based tools for B-cell epitope prediction: the IEDB and ABCpred servers (Saha and Raghava, 2006). The S protein structure from the Protein Data Bank (PDB ID: 6VSB) was analyzed for linear and discontinuous B-cell epitopes using the ElliPro module on the IEDB server with default settings (Ponomarenko et al., 2008; Wrapp et al., 2020). In addition, the ABCpred server was used to detect B-cell epitopes using the artificial neural network (ANN) method.

Identification of T-cell epitopes

T-cell epitopes with a binding affinity towards major histocompatibility complex (MHC)-I and MHC-II alleles were selected to boost both cytotoxic T-cell and helper T-cell mediated immune responses. The IEDB server was used to predict the MHC-I and MHC-II binding epitopes for the targeted protein. The reference set of alleles was used for predicting the MHC-I and MHC-II T-cell epitopes (Karosiene et al., 2012; Nielsen et al., 2007; Nielsen et al., 2003; Peters and Sette, 2005; Sturniolo et al., 1999).

Results and discussion

In our study, we targeted the S glycoprotein of SARS-CoV-2 because it is exposed on the surface of the virus and interacts with the host receptor. At the time of the study, there were 98 sequences available for the targeted protein of SARS-CoV-2. The S glycoprotein sequence is 1273 amino acids long, except for that of the virus isolated from Kerala (India), which is 1272 amino acids long (GenBank accession number MT012098). Our interest here was to determine conserved regions first and then determine surface-exposed regions, which are potential epitopes for generating an immune response. We found that the S protein sequences in the analysis show little variability and are highly conserved, as shown in Fig. 1. However, we found 12 point mutations in the amino acid sequences collected. The mutated sites identified were as follows: positions 247 and 614 for sequence MT007544 (Australia), positions 145 and 408 for sequence MT012098 (India), position 49 for sequence MT027064 (USA), position 221 for sequence MT039890 (South Korea), position 28 for sequence MT049951 (China), position 797 for sequence MT093571 (Sweden), position 157 for sequence MT159716 (USA), positions 655 and 930 for sequence MT163720 (USA), and position 181 for sequence MT184910 (USA). Regions with a high ASA value are more surface exposed than others. We identified a total of 24 peptides of varying lengths, which were selected based on high ASA values (Table 1). The potential epitope regions were predicted using the sequence of the S protein of SARS-CoV-2 that showed the least variability (GenBank accession number NC_045512). The potential epitopes are represented by blue peaks, while green-colored slopes represent non-epitopic regions (Fig. 2).
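The point-mutation listing above can be pictured as a position-by-position comparison against a reference sequence. The sketch below is an assumption about how such a comparison might be done, not the authors' procedure; `point_mutations` and the toy sequences are hypothetical. It requires equal-length inputs, so a sequence with a deletion (such as the 1272-residue sequence mentioned above) would first need to be aligned.

```python
def point_mutations(reference, sequence):
    """List (position, reference_residue, variant_residue) differences.

    Positions are 1-based. Both sequences must have the same length;
    sequences with insertions or deletions must be aligned beforehand.
    """
    if len(reference) != len(sequence):
        raise ValueError("sequences differ in length; align them first")
    return [
        (pos, ref, alt)
        for pos, (ref, alt) in enumerate(zip(reference, sequence), start=1)
        if ref != alt
    ]


# Toy sequences (hypothetical data): a single substitution at position 8.
toy_reference = "RTQLPPAYTN"
toy_variant = "RTQLPPAHTN"
toy_mutations = point_mutations(toy_reference, toy_variant)
print(toy_mutations)
```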

Fig. 1

Table 1

Conserved regions selected for further analysis based on protein variability, average solvent accessibility, and antibody epitope prediction using the BepiPred 1.0 Linear Epitope Prediction module of IEDB.

Sl. No.	Start	End	Length	Peptide
1	21	38	18	RTQLPPAYTNSFTRGVYY
2	69	81	13	HVSGTNGTKRFDN
3	144	155	12	YYHKNNKSWMES
4	178	191	14	DLEGKQGNFKNLRE
5	249	261	13	LTPGDSSSGWTAG
6	278	287	10	KYNENGTITD
7	314	325	12	QTSNFRVQPTES
8	407	428	22	VRQIAPGQTGKIADYNYKLPDD
9	437	450	14	NSNNLDSKVGGNYN
10	461	485	25	LKPFERDISTEIYQAGSTPCNGVEG
11	493	506	14	QSYGFQPTNGVGYQ
12	521	533	13	PATVCGPKKSTNL
13	567	581	15	RDIADTTDAVRDPQT
14	597	607	11	VITPGTNTSNQ
15	625	648	24	HADQLTPTWRVYSTGSNVFQTRAG
16	654	661	8	EHVNNSYE
17	673	691	19	SYQTQTNSPRRARSVASQS
18	700	713	14	GAENSVAYSNNSIA
19	768	780	13	TGIAVEQDKNTQE
20	788	799	12	IYKTPPIKDFGG
21	805	816	12	ILPDPSKPSKRS
22	1134	1150	17	NNTVYDPLQPELDSFKE
23	1153	1171	19	DKYFKNHTSPDVDLGDISG
24	1255	1267	13	KFDEDDSEPVLKG
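A quick way to sanity-check a table like this is to confirm that, for each row, the inclusive span `End - Start + 1`, the stated length, and the number of residues in the peptide all agree. The sketch below does this for three rows copied from Table 1; the function name `inconsistent_rows` is hypothetical.

```python
# (serial number, start, end, stated length, peptide), copied from Table 1
rows = [
    (1, 21, 38, 18, "RTQLPPAYTNSFTRGVYY"),
    (10, 461, 485, 25, "LKPFERDISTEIYQAGSTPCNGVEG"),
    (16, 654, 661, 8, "EHVNNSYE"),
]


def inconsistent_rows(table_rows):
    """Return serial numbers of rows whose coordinates and lengths disagree."""
    bad = []
    for number, start, end, length, peptide in table_rows:
        span = end - start + 1  # coordinates are inclusive
        if not (span == length == len(peptide)):
            bad.append(number)
    return bad


print(inconsistent_rows(rows))
```

An empty result means every checked row is internally consistent.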

Fig. 2

Graphical representation of B-cell linear epitopes of spike protein of SARS-CoV-2. B-cell linear epitopes predicted using BepiPred 1.0 module incorporated in IEDB server using default threshold value (0.35).
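To show how a per-residue score profile and a threshold translate into candidate linear stretches, here is a minimal sketch. It does not reproduce BepiPred itself; the function `segments_above`, the `min_len` parameter, the choice to treat scores equal to the threshold as passing, and the toy scores are all assumptions for illustration. Only the threshold value 0.35 comes from the caption above.

```python
def segments_above(scores, threshold=0.35, min_len=1):
    """Return 1-based inclusive (start, end) runs where score >= threshold.

    Runs shorter than min_len residues are discarded.
    """
    segments = []
    start = None
    for pos, score in enumerate(scores, start=1):
        if score >= threshold:
            if start is None:
                start = pos  # a new run begins here
        elif start is not None:
            if pos - start >= min_len:
                segments.append((start, pos - 1))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(scores) - start + 1 >= min_len:
        segments.append((start, len(scores)))
    return segments


# Toy per-residue scores (hypothetical data, not BepiPred output).
toy_scores = [0.1, 0.4, 0.5, 0.6, 0.2, 0.36, 0.1, 0.9, 0.8]
toy_segments = segments_above(toy_scores, min_len=2)
print(toy_segments)
```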

Fig. 1. Profiles of average solvent accessibility (blue, in %) and amino acid sequence variability (green) for the 98 SARS-CoV-2 S protein sequences, plotted against amino acid position. A high ASA value means that the region is more solvent accessible, and therefore more surface exposed, than its neighbours.

The existence of B-cell linear and discontinuous (conformational) epitopes within the identified segments could help us to identify the peptides that can elicit an immune response (Purcell et al., 2007). We identified 18 linear epitopes, predicted by ElliPro (IEDB), which contained regions from 19 of our selected peptides (highlighted in red in Table 2). These B-cell linear epitopes are listed by position and score. Epitopes with high scores have more potential for antibody binding. Five of our selected peptides (peptide numbers 3, 5, 19, 23, and 24 in Table 1) were not predicted as potential linear B-cell epitopes. Some parts of our identified epitopes agreed with epitopes recognized in an earlier study (Ahmed et al., 2020), which further supports the credibility of our identified epitopes.

Table 2

IEDB ElliPro predicted linear epitopes for spike protein of SARS-CoV-2. Sequences that match our selected peptides are marked in red.

Using the same module, B-cell discontinuous epitopes were predicted, which gave 16 epitope regions that contained regions from 18 of our selected peptides (highlighted in red in Table S1). Six peptides (peptide numbers 3, 5, 14, 19, 23, and 24 in Table 1) were not predicted as discontinuous B-cell epitopes. To further confirm these predictions, we used the ABCpred server to detect B-cell epitopes, with a default threshold of 0.51. It identified various epitopes with different lengths and scores. Of those, the regions that contained our selected peptides are highlighted in red in Table 3. A high score indicates good binding affinity; most of our peptides scored more than 0.7 and were predicted as linear B-cell epitopes.

Table 3

ABCpred determination of B-cell binding affinities. Note that a high score indicates good binding affinity.

We used the IEDB server to determine the binding affinity for the human leukocyte antigen (HLA). As recommended by the IEDB server, reference HLA allele sets were used for the prediction of MHC-I and MHC-II T-cell epitopes, as they provide comprehensive coverage of the population. All the predictions were made using IEDB recommended procedures. The list of binding affinities for MHC-I T-cell epitopes is given in Table S2, where a low rank represents high binding affinity. Similarly, the list of binding affinities for MHC-II T-cell epitopes is given in Table 4. Regions from our selected peptides are highlighted in red. Epitopes with a percentile rank below 1%, indicating very high binding affinity, were selected. We also observed that some of the peptides we identified as potential B-cell epitopes were present as T-cell epitopes with good binding affinities.
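The percentile-rank cutoff described above amounts to a simple filter over prediction records. The sketch below assumes a record layout (`allele`, `peptide`, `percentile_rank`) that is hypothetical and does not mirror the IEDB output format; the allele names, peptide names, and ranks are placeholders, not results from the study.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    allele: str
    peptide: str
    percentile_rank: float


def strong_binders(predictions, max_rank=1.0):
    """Keep predictions with percentile rank below max_rank, best first.

    A lower percentile rank is read as higher predicted binding affinity.
    """
    kept = [p for p in predictions if p.percentile_rank < max_rank]
    return sorted(kept, key=lambda p: p.percentile_rank)


# Placeholder records (hypothetical data).
toy_predictions = [
    Prediction("ALLELE_A", "PEPTIDE_1", 0.25),
    Prediction("ALLELE_B", "PEPTIDE_2", 3.40),
    Prediction("ALLELE_A", "PEPTIDE_3", 0.90),
    Prediction("ALLELE_C", "PEPTIDE_4", 1.00),
]
toy_selected = strong_binders(toy_predictions)
for record in toy_selected:
    print(record.allele, record.peptide, record.percentile_rank)
```

The comparison is strict, so a record with a rank of exactly 1.00 is excluded, matching the "less than 1" wording.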

Table 4

IEDB prediction of binding affinity with MHC-II alleles. Peptides with a percentile rank less than 1.00 are shown here. A lower percentile rank indicates higher binding affinity. Sequences that match our selected peptides are marked in red.

Overall, it was found that the regions identified in Table 1 not only had good B-cell and T-cell affinities, but the majority of them also overlapped with discontinuous epitopes (Table S1). The peptide segments identified from the set of 98 sequences of the SARS-CoV-2 S glycoprotein appear to hold reasonable potential to act as immunogens. Peptide-based diagnostics and vaccines have previously been proposed against virus outbreaks (Dey et al., 2017; Ichihashi et al., 2011; Navalkar et al., 2015; Oany et al., 2014; Zhao et al., 2009). The availability of a 3D structure (6VSB) of the SARS-CoV-2 S glycoprotein provided an opportunity to inspect the predicted peptides. Placement of the peptide segments identified by ASA and conserved sequence analysis on the S glycoprotein showed that 20 of the regions we identified lie on the surface (Fig. 3). To limit recognition and evade the immune response of the host, coronaviruses use conformational masking and glycan shielding (Walls et al., 2019; Xiong et al., 2018). The SARS-CoV-2 S trimer also exists in multiple distinct conformational states, which is necessary for receptor engagement, leading to the initiation of fusogenic conformational changes (Walls et al., 2020). The considerable number of peptides at the surface region of the S glycoprotein allows for the potential use of those peptide regions as immunogens. Binding to the ACE2 receptor is a critical initial step for SARS-CoV-2 in entering target cells. Recent studies have also pointed out the vital role of ACE2 in mediating the entry of SARS-CoV-2 (Hoffmann et al., 2020).
The receptor-binding motif (RBM) is part of the receptor-binding domain (RBD) of SARS-CoV-2 and contains most of the contacting residues for ACE2 binding (Lan et al., 2020). Some of our identified peptides from Table 1 (peptide no. 7–12) fall in the regions of the RBD (amino acid no. 319–540) and RBM (amino acid no. 438–506), which makes them promising candidate regions.
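The overlap between the Table 1 peptides and the RBD and RBM ranges quoted above can be checked with a simple interval test. The sketch below uses the start and end positions of peptides 6 to 13 from Table 1 and the residue ranges given in the text; the names `overlaps` and `peptides_in` are hypothetical. Overlap here means sharing at least one residue, not full containment (peptide 7 starts at residue 314, before the RBD begins at 319).

```python
# Residue ranges quoted in the text (inclusive).
RBD = (319, 540)
RBM = (438, 506)

# Peptide number -> (start, end), taken from Table 1.
PEPTIDES = {
    6: (278, 287),
    7: (314, 325),
    8: (407, 428),
    9: (437, 450),
    10: (461, 485),
    11: (493, 506),
    12: (521, 533),
    13: (567, 581),
}


def overlaps(a, b):
    """True if inclusive intervals a and b share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def peptides_in(region):
    """Return the peptide numbers whose span overlaps the given region."""
    return [n for n, span in sorted(PEPTIDES.items()) if overlaps(span, region)]


print("RBD:", peptides_in(RBD))
print("RBM:", peptides_in(RBM))
```

With these coordinates, peptides 7 to 12 overlap the RBD, and peptides 9 to 11 also overlap the RBM.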

Fig. 3

Our selected peptides highlighted on the SARS-CoV-2 spike protein structure downloaded from PDB (ID: 6VSB).

The emergence of new viruses such as SARS-CoV-2 represents a substantial global disease burden. Over the past few months, there have been increased research efforts for the design and development of diagnostics and vaccines for SARS-CoV-2. Some related analyses have been reported in distinct, parallel studies (Baruah and Bose, 2020; Bhattacharya et al., 2020; Grifoni et al., 2020). Our study leverages the available resources and computational methods and adds to the ongoing research focused on the development of diagnostics and vaccines against SARS-CoV-2. Beyond those already reported, we have identified additional peptides, adding to the library of peptides that are likely to be recognized by human immune responses. Traditional vaccines based on antibody-mediated protection are often poor inducers of T-cell responses and, given high viral mutation rates, can have limited success (Rosendahl Huber et al., 2014). Peptide-based sensitive and rapid diagnostic kits are considered a better alternative to conventional serological tests, including those based on whole antigenic protein (Mohanraj et al., 2017). In our study, we predicted both B-cell and T-cell epitopes for conferring immunity in different ways. We speculate that the identified epitopes with good predicted binding efficiency have the potential to be immunodominant peptides. The study could help in using the predicted peptides as immunogens for the development of diagnostics and vaccines against SARS-CoV-2.

Conclusion

In the present study, peptide segments were identified on the S protein for the development of diagnostics and vaccines against SARS-CoV-2. The recent availability of 3D data on the SARS-CoV-2 S glycoprotein has helped the search. SARS-CoV-2, being an RNA virus, has a high mutation rate and undergoes active recombination (Yi, 2020). Although the peptides identified are potential candidates as immunogens for the development of peptide-based diagnostics and vaccines, further refinement and laboratory trials are essential steps that must be undertaken early, before the identified epitopes are rendered obsolete.

Declaration of Competing Interest

The authors declare no conflict of interest.
