Muhammad Saqib Sohail, Syed Faraz Ahmed, Ahmed Abdul Quadeer, Matthew R McKay.
Abstract
Growing evidence suggests that T cells may play a critical role in combating severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Hence, COVID-19 vaccines that can elicit a robust T cell response may be particularly important. The design, development and experimental evaluation of such vaccines is aided by an understanding of the landscape of T cell epitopes of SARS-CoV-2, which is largely unknown. Due to the challenges of identifying epitopes experimentally, many studies have proposed the use of in silico methods. Here, we present a review of the in silico methods that have been used for the prediction of SARS-CoV-2 T cell epitopes. These methods employ a diverse set of technical approaches, often rooted in machine learning. A performance comparison is provided based on the ability to identify a specific set of immunogenic epitopes that have been determined experimentally to be targeted by T cells in convalescent COVID-19 patients, shedding light on the relative performance merits of the different approaches adopted by the in silico studies. The review also puts forward perspectives for future research directions.
Keywords: Allergenicity; COVID-19; Computational prediction; Coronavirus; Immunogenicity; Immunoinformatics; Peptide-HLA binding; Reverse vaccinology; SARS-CoV; Toxicity
Year: 2021 PMID: 33465451 PMCID: PMC7832442 DOI: 10.1016/j.addr.2021.01.007
Source DB: PubMed Journal: Adv Drug Deliv Rev ISSN: 0169-409X Impact factor: 17.873
Fig. 1. Schematic illustration of T cell responses against SARS-CoV-2 and T cell epitope prediction using in silico approaches. (A) Viral peptides, derived from SARS-CoV-2 proteins after multiple intra-cellular processing steps, are presented on the surface of infected cells and antigen presenting cells via HLA class I and class II molecules, respectively. Naïve T cells, specialized in distinguishing foreign peptides from self-peptides via training in the thymus, scan these peptide-HLA complexes to determine if the peptides belong to a foreign microbe. Recognition of a foreign peptide leads to activation, proliferation, and differentiation of naïve T cells into effector cells. There are two main types of effector T cells: CD8+ T cells (or cytotoxic T lymphocytes; CTLs) that get activated by viral peptides bound to HLA class I molecules and help in killing the SARS-CoV-2 infected cells (right panel), while CD4+ T cells (or helper T lymphocytes) get activated by peptides bound to HLA class II molecules and help in further enhancing SARS-CoV-2-specific CD8+ T cell and antibody responses (left panel). These adaptive immune cells, activated by peptide-HLA complexes, can collectively mount a potent immune response against SARS-CoV-2. (B) In silico approaches analyze SARS-CoV-2 protein sequences to predict a number of potential HLA-I and HLA-II epitopes that can be used to guide experiments to characterize T cell responses in COVID-19 patients and to inform SARS-CoV-2 vaccine design.
Table 1. List of reviewed in silico SARS-CoV-2 T cell epitope prediction studies.
| No. | Study label | HLA-I epitope prediction | HLA-II epitope prediction | Immunogenicity | IFN-γ production | Conservation | Allergenicity | Toxicity | Autoimmunity | Vaccine construct |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | **Ahmed2020** | Using SARS-CoV immunological data | Using SARS-CoV immunological data | - | - | Y | - | - | - | - |
| 2 | **Grifoni2020** | Using SARS-CoV immunological data, NetMHCpan-4.0 | Using SARS-CoV immunological data, Tepitool | - | - | - | - | - | - | - |
| 3 | **Ranga2020** | Using SARS-CoV immunological data, NetCTL-1.2 | - | - | - | - | - | - | - | - |
| 4 | **Lee2020** | Using SARS-CoV immunological data, NetMHCpan-4.0 | Using SARS-CoV immunological data | iPred | - | - | - | - | - | - |
| 5 | Baruah2020 | NetCTL-1.2, NetChop, CTLPred | - | - | IFNepitope | - | - | - | - | - |
| 6 | Crooke2020 | NetCTL-1.2, NetMHCpan-4.0 | NetMHCIIpan-3.2 | Vaxijen-2.0 | - | - | AllerCatPro | ToxinPred | - | - |
| 7 | Ojha2020 | NetCTL-1.2 | IEDB (method NS) | - | - | - | - | - | - | Y |
| 8 | Wang2020 | NetMHCpan-4.0 | IEDB (recommended) | Vaxijen-2.0 | - | - | - | - | Y | - |
| 9 | Poran2020 | HLAthena | NeonMHC2 | Response against a few predicted epitopes tested in recovered patients | - | - | - | - | Y | - |
| 10 | UlQamar2020 | IEDB (consensus) | IEDB (consensus) | Vaxijen-2.0 | - | - | AllerTOP-2.0 | NS | - | Y |
| 11 | Gupta2020vr | NetCTLpan-1.1 | IEDB (recommended) | Vaxijen-2.0 | IFNepitope | - | AllerTOP-2.0 | ToxinPred | - | - |
| 12 | Enayatkhani2020 | RANKPEP | RANKPEP | - | - | - | AllerTOP-2.0 | - | - | Y |
| 13 | Ong2020 | Vaxign, IEDB (consensus) | Vaxign, IEDB (consensus) | - | - | - | - | - | Y | - |
| 14 | Abdelmageed2020 | IEDB (consensus) | IEDB (recommended) | - | - | - | - | - | - | - |
| 15 | Mukherjee2020 | Tepitool, NetMHCpan-4.0, nHLAPred, CTLPred | Tepitool | Vaxijen-2.0 | - | Y | AllerTOP-2.0, AlgPred | ToxinPred | Y | - |
| 16 | Vashi2020 | IEDB (method NS) | IEDB (method NS) | - | - | Y | - | - | - | - |
| 17 | Ahmad2020 | MHCPred | MHCPred | Vaxijen-2.0 | - | - | AllerTOP-2.0 | - | Y | Y |
| 18 | Naz2020 | Tepitool | IEDB (recommended) | Vaxijen-2.0 | - | - | AllerTOP-2.0 | - | - | Y |
| 19 | Chen2020 | NetMHCpan-4.0 | IEDB (recommended) | Vaxijen-2.0 | - | - | AllerTOP-2.0 | ToxinPred | - | Y |
| 20 | Martin2020 | NetCTL-1.2 | IEDB (recommended) | Vaxijen-2.0 | IFNepitope | - | AllerTOP-2.0 | ToxinPred | - | Y |
| 21 | Dong2020 | NetCTL-1.2 | IEDB (consensus) | - | IFNepitope | - | - | - | - | Y |
| 22 | Ghafouri2020 | IEDB (method NS) | IEDB (method NS) | Vaxijen-2.0 | - | - | AllerTOP-2.0 | ToxinPred | - | Y |
| 23 | Banerjee2020 | NetCTL-1.2 | NetMHCII-2.3 | - | - | - | - | - | - | Y |
| 24 | Samad2020 | NetCTL-1.2 | IEDB (consensus) | Vaxijen-2.0 | IFNepitope | - | AllerTOP-2.0 | ToxinPred | - | Y |
| 25 | Bhatnager2020 | NetMHCpan-4.0, CTLPred | IEDB (recommended 2.2) | - | IFNepitope | - | AlgPred, AllergenFP | NS | - | Y |
| 26 | Devi2020 | NetCTL-1.2 | IEDB (consensus) | Vaxijen-2.0 | IFNepitope | - | AllerTOP-2.0 | ToxinPred | - | Y |
| 27 | AbrahamPeele2020 | NetCTL-1.2 | IEDB (method NS) | Vaxijen-2.0 | IFNepitope | - | AllerTOP-2.0 | ToxinPred | - | Y |
| 28 | Ismail2020 | NetMHC-4.0, MHCPred | IEDB (consensus), MHCPred | Vaxijen-2.0 | IFNepitope | - | AllerTOP-2.0 | ToxinPred | Y | Y |
| 29 | Jakhar2020 | NetCTL-1.2, IEDB (method NS) | NetMHCIIpan-3.0 | Vaxijen-2.0 | IFNepitope | Y | - | ToxinPred | - | Y |
| 30 | Panda2020 | NetCTL-1.2 | - | Vaxijen-2.0 | - | - | - | - | - | - |
| 31 | Campbell2020 | pVACtools | pVACtools | - | - | - | - | - | - | - |
| 32 | Tilocca2020 | IEDB (method NS) | IEDB (method NS) | - | - | - | - | - | - | - |
| 33 | Santoni2020 | NetMHC-4.0, NetCTL-1.2 | - | - | - | - | - | - | Y | - |
| 34 | Dijkstra2020 | NetMHC | - | - | - | - | - | - | - | - |
| 35 | Prachar2020 | NetMHC-4.0 | NetMHCII-2.3 | - | - | - | - | - | - | - |
| 36 | Ramaiah2020 | - | IEDB (consensus) | - | - | - | - | - | - | - |
| 37 | Gupta2020 | NetMHCpan-4.0 | Sturniolo method | Vaxijen-2.0 | - | - | AllerTOP-2.0 | ToxinPred | - | - |
| 38 | Srivastava2020 | IEDB (consensus) | SMM-align, Sturniolo method | - | IFNepitope | - | - | ToxinPred | - | Y |
| 39 | Mitra2020 | NetMHC-4.0, NetCTL-1.2, IEDB (consensus) | MHCPred, NetMHCIIpan-3.2, IEDB (consensus) | Vaxijen-2.0 | IFNepitope | - | AllerTOP, AlgPred | ToxinPred | Y | Y |
| 40 | Singh2020 | NetCTL-1.2, IEDB (consensus) | NetMHCIIpan-3.2 | Vaxijen-2.0 | IFNepitope | - | AllerTOP-2.0 | - | Y | Y |
| 41 | Saha2020 | ProPred1 | ProPred | Vaxijen-2.0 | - | - | - | - | - | Y |
| 42 | Nerli2020 | NetMHCpan-4.0 | - | Electrostatic surface potential | - | - | - | - | - | - |
| 43 | Liu2020 | NetMHCpan-4.0, MHCflurry | NetMHCIIpan-4.0 | - | - | - | - | - | Y | - |
| 44 | Khan2020 | NetCTL-1.2 | PREDIVAC | Calis et al. | - | - | AlgPred | ToxinPred | Y | - |
| 45 | Banerjee2020a | - | IEDB (method NS) | Vaxijen-2.0 | - | - | - | - | - | - |
| 46 | Bojin2020 | IEDB (method NS) | IEDB (method NS) | - | - | - | - | - | - | - |
| 47 | NazneenAkhand2020 | IEDB (method NS) | IEDB (method NS) | Vaxijen-2.0 | IFNepitope | - | AllergenFP, AllerTOP | ToxinPred | - | Y |
| 48 | Feng2020 | NetMHCpan, iNeo-Pred | - | NS | - | - | - | NS | Y | - |
| 49 | Bhattacharya2020 | ProPred1 | ProPred | Vaxijen-2.0 | - | - | - | - | - | Y |
| 50 | Chauhan2020 | NetCTL-1.2 | IEDB (consensus), NetMHCIIpan-3.2 | Vaxijen-2.0 | IFNepitope | - | AlgPred, AllerTOP-2.0 | - | - | Y |
| 51 | Fast2020 | NetMHCpan-4.0 | MARIA | - | - | - | - | - | - | - |
| 52 | Joshi2020 | NetMHC-4.0, MHCPred | NetMHCIIpan-3.2, MHCPred | Vaxijen-2.0 | - | - | - | ToxinPred | - | - |
| 53 | Kar2020 | NetCTL-1.2, IEDB (consensus) | NetMHCIIpan-3.2 | Vaxijen-2.0 | IFNepitope | - | AllerTOP-2.0, AllergenFP | - | - | Y |
| 54 | Qamar2020 | IEDB (consensus) | IEDB (consensus) | Vaxijen-2.0 | - | Y | AllergenFP | ToxinPred | - | - |
| 55 | Ahammad2020 | NetCTL-1.2 | IEDB (consensus) | Vaxijen-2.0 | IFNepitope | - | AllerTOP-2.0 | ToxinPred | Y | Y |
| 56 | Kiyotani2020 | NetMHCpan-4.0, NetMHC-4.0 | NetMHCIIpan-3.1 | - | - | - | - | - | - | - |
| 57 | Sarkar2020 | NetMHCpan-4.0 | IEDB (Sturniolo) | Vaxijen-2.0 | - | - | AllerTOP-2.0, AllergenFP | ToxinPred | Y | Y |
| 58 | Romero-Lopez2020 | Tepitool | IEDB (recommended 2.2) | Calis et al. | - | - | - | - | - | - |
| 59 | Sanami2020 | ProPred1 | ProPred | Vaxijen-2.0 | IFNepitope | - | AllerTOP-2.0 | ToxinPred | Y | Y |
| 60 | Kalita2020 | NetCTL-1.2, IEDB (method NS) | NetMHCIIpan-3.0 | - | IFNepitope | - | - | ToxinPred | - | Y |
| 61 | Rahman2020 | SMM | SMM-align | - | IFNepitope | - | - | - | - | Y |
| 62 | Lin2020 | IEDB (consensus) | IEDB (method NS) | Vaxijen-2.0 | - | - | AllergenFP | ToxinPred | - | - |
| 63 | Yarmarkovich2020 | NetMHC-4.0 | NetMHCII-2.3 | - | - | - | - | - | Y | - |
| 64 | Yazdani2020 | NetMHC-4.0, CTLPred | IEDB (consensus), RANKPEP | Vaxijen-2.0 | IFNepitope | - | AllergenFP | - | - | Y |
| 65 | Lucchese2020 | Identified pentamers from the proteome | - | - | - | - | - | - | Y | - |
Studies that used SARS-CoV immunological data for prediction [1–4] are shown in bold font, while all other studies that used peptide-HLA binding prediction methods are in regular font. In silico studies [1–30] were obtained from PubMed, while the remaining studies [31–65] were obtained by searching Google Scholar.
HLA-I epitope prediction: ProPred1 [54], SMM [58,59], CTLPred [51], Tepitool [66], nHLAPred [67], NetMHC-4.0 [40], NetChop [48], NetCTL-1.2 [49], NetMHCpan [37], HLAthena [17], MHCflurry [41], NetMHCpan-4.0 [16], iNeo-Pred [152], Vaxign [57], RANKPEP [52], pVACtools [65], NetMHC [36], NetCTLpan-1.1 [39], MHCPred [62], IEDB (consensus) [63].
HLA-II epitope prediction: MHCPred [62], NeonMHC2 [19], NetMHCII-2.3 [42], NetMHCIIpan-3.0 [43], NetMHCIIpan-3.1 [44], NetMHCIIpan-3.2 [42], NetMHCIIpan-4.0 [18], SMM-align [60], RANKPEP [52], Sturniolo method [61], ProPred [55], pVACtools [65], PREDIVAC [56], IEDB (recommended) [43], MARIA [45], IEDB (consensus) [64], Vaxign [57], Tepitool [66], IEDB (recommended 2.2) [43].
Immunogenicity: Vaxijen-2.0 [79], Calis et al. [80], iPred [81], Electrostatic surface potential [147].
IFN-γ production: IFNepitope [85].
Allergenicity: AlgPred [91], AllerTOP [92], AllerTOP-2.0 [93], AllergenFP [94], AllerCatPro [95].
Toxicity: ToxinPred [98].
Lee2020 [21] and Grifoni2020 [22] reported two sets of epitopes: one set based on using SARS-CoV immunological data and the other using NetMHCpan-4.0. For the purpose of this analysis, we have only considered the set of epitopes predicted using SARS-CoV immunological data.
NS: Not specified.
Fig. 2. Summary and comparison of 61 in silico studies that have predicted SARS-CoV-2 T cell epitopes. (Top left panel) Heatmap shows the fraction of common epitopes predicted across each pair of studies. The fraction is computed relative to the number of epitopes predicted by the study indicated in each row (the total number of epitopes predicted in each study is shown within parentheses on the right). Four in silico studies that used SARS-CoV immunological data are indicated in bold font. Of the epitopes predicted by these studies, only the ones predicted based on homology with SARS-CoV epitopes were included. Study labels indicated in the figure correspond to those in Table 1. (Top right panel) Bar plots show the fraction of predicted epitopes for each HLA class in each study, with the total number shown within parentheses. (Bottom left panel) Heatmap shows the number of predicted epitopes derived from each SARS-CoV-2 protein for each in silico study. Each column in this heatmap corresponds to the study mentioned at the top of each column in the top left panel heatmap. Missing tiles indicate no predicted epitopes. (Bottom right panel) Bar plots show the fraction of predicted epitopes, across studies, derived from each SARS-CoV-2 protein, with the total number shown within parentheses. Predicted epitopes were assigned an HLA class based on the HLA allele (bearing 4-digit resolution or higher) reported against them, or “NA” otherwise.
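The row-normalized overlap described for the top left heatmap can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `overlap_fractions`, the study names, and the assignment of epitopes to studies are all hypothetical.

```python
from typing import Dict, Set


def overlap_fractions(predictions: Dict[str, Set[str]]) -> Dict[str, Dict[str, float]]:
    """For each (row, column) pair of studies, return the fraction of the
    row study's epitopes that the column study also predicted."""
    fractions: Dict[str, Dict[str, float]] = {}
    for row_study, row_epitopes in predictions.items():
        fractions[row_study] = {}
        for col_study, col_epitopes in predictions.items():
            if not row_epitopes:
                # A study with no epitopes has no defined fraction; use 0.0 here.
                fractions[row_study][col_study] = 0.0
                continue
            shared = row_epitopes & col_epitopes
            fractions[row_study][col_study] = len(shared) / len(row_epitopes)
    return fractions


# Toy input: hypothetical study names and epitope assignments.
toy = {
    "StudyA": {"YLQPRTFLL", "KLWAQCVQL", "LLYDANYFL", "RLQSLQTYV"},
    "StudyB": {"YLQPRTFLL", "RLQSLQTYV"},
}
matrix = overlap_fractions(toy)
print(matrix["StudyA"]["StudyB"])  # 2 of StudyA's 4 epitopes are shared
print(matrix["StudyB"]["StudyA"])  # both of StudyB's 2 epitopes are shared
```

Because each fraction is normalized by the row study's total, the resulting matrix is generally not symmetric, which is why the caption specifies the row as the reference.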
Fig. 3. Common in silico prediction methods that have been used by the reviewed SARS-CoV-2 studies. Only methods that were explicitly mentioned by at least 5 in silico SARS-CoV-2 studies (Table 1) are shown here. The methods are grouped according to the category shown in the legend.
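The "at least 5 studies" filter can be sketched as a simple count over the Table 1 entries. The function name `common_methods` is an assumption made for this example, and the input below uses only five rows copied from Table 1 (studies 20, 21, 23, 24 and 26), so the result reflects that subset rather than the full figure.

```python
from collections import Counter
from typing import Dict, Iterable, List


def common_methods(study_methods: Dict[str, Iterable[str]], min_studies: int = 5) -> List[str]:
    """Return methods named by at least `min_studies` studies, most used first."""
    counts: Counter = Counter()
    for methods in study_methods.values():
        # Count each method once per study, even if it is listed in several columns.
        counts.update(set(methods))
    kept = [(name, n) for name, n in counts.items() if n >= min_studies]
    kept.sort(key=lambda item: (-item[1], item[0]))  # by count, then alphabetically
    return [name for name, _ in kept]


# Five rows taken from Table 1 (a subset, for illustration only).
subset = {
    "Martin2020": ["NetCTL-1.2", "IEDB (recommended)", "Vaxijen-2.0", "IFNepitope", "AllerTOP-2.0", "ToxinPred"],
    "Dong2020": ["NetCTL-1.2", "IEDB (consensus)", "IFNepitope"],
    "Banerjee2020": ["NetCTL-1.2", "NetMHCII-2.3"],
    "Samad2020": ["NetCTL-1.2", "IEDB (consensus)", "Vaxijen-2.0", "IFNepitope", "AllerTOP-2.0", "ToxinPred"],
    "Devi2020": ["NetCTL-1.2", "IEDB (consensus)", "Vaxijen-2.0", "IFNepitope", "AllerTOP-2.0", "ToxinPred"],
}
print(common_methods(subset))                 # ['NetCTL-1.2']
print(common_methods(subset, min_studies=4))  # ['NetCTL-1.2', 'IFNepitope']
```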
Fig. 4. Summary of the experimentally-determined HLA-A*02:01-associated SARS-CoV-2 epitopes that were also predicted by in silico studies. (Left panel) List of 33 experimentally-determined HLA-A*02:01-associated epitopes that matched identically with epitopes predicted by in silico studies. (Middle panel) Number of convalescent COVID-19 patients bearing the HLA-A*02:01 allele whose blood sample responded (filled bar) and did not respond (empty bar) upon stimulation with the epitope. (Right panel) Number of in silico studies that predicted the epitope in the context of HLA-A*02:01. Orange represents the number of studies that used SARS-CoV immunological data, while purple represents the number of studies based on peptide-HLA binding prediction. The labels of the in silico studies (Table 1) predicting each epitope are listed on the right. Epitopes are colored according to the SARS-CoV-2 protein from which they are derived (counts shown in legend) and ordered in descending order of the number of patients whose samples responded. The two experimentally-determined HLA-A*02:01 epitopes which did not match identically with any epitope predicted by in silico studies were 906YLFDESGEFKL916 in ORF1a and 20FLAFVVFL27 in E. These epitopes were reported to induce a T cell response in 9/36 and 2/3 COVID-19 convalescent patients, respectively.
Approaches adopted by in silico studies that predicted at least half of the experimentally-determined HLA-A*02:01-associated epitopes.
| No. | Study label | Approach | Total number of predicted epitopes | Number of predicted epitopes matching experimentally-determined epitopes | Hit rate |
|---|---|---|---|---|---|
| 1 | Nerli2020 | Based on peptide-HLA binding prediction (involving NetMHCpan-4.0) | 722 | 30 | 4.2% |
| 2 | Ahmed2020 | Using SARS-CoV immunological data | 115 | 17 | 14.8% |
Hit rate represents the positive predictive value (i.e., the ratio of the number of predicted epitopes matching experimentally-determined epitopes to the total number of in silico predicted epitopes).
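As a quick check of the definition, the hit rates in the table can be recomputed from the two counts reported for each study. The helper name `hit_rate` is an assumption for this sketch; the counts are the ones listed in the table above.

```python
def hit_rate(num_matching: int, num_predicted: int) -> float:
    """Ratio of predicted epitopes that match experimentally-determined
    epitopes to the total number of predicted epitopes."""
    if num_predicted <= 0:
        raise ValueError("num_predicted must be a positive integer")
    return num_matching / num_predicted


# (matching, total predicted) as reported in the table.
reported = {
    "Nerli2020": (30, 722),
    "Ahmed2020": (17, 115),
}
for label, (matching, total) in reported.items():
    print(f"{label}: {hit_rate(matching, total):.1%}")
# Nerli2020: 4.2%
# Ahmed2020: 14.8%
```

The comparison shows the trade-off visible in the table: the larger prediction set recovers more of the experimentally-determined epitopes (30 versus 17) but at a lower hit rate.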
Distinct HLA alleles predicted, across in silico studies, to be associated with the 33 experimentally-determined HLA-A*02:01-restricted SARS-CoV-2 epitopes.
| No. | Epitope | Protein | Number of distinct HLA alleles | HLA alleles |
|---|---|---|---|---|
| 1 | 139LLYDANYFL147 | ORF3a | 1 | |
| 2 | 3886KLWAQCVQL3894 | ORF1a | 3 | |
| 3 | 269YLQPRTFLL277 | S | 12 | A*01:01, |
| 4 | 4094ALWEIQQVV4102 | ORF1a | 3 | |
| 5 | 1000RLQSLQTYV1008 | S | 6 | |
| 6 | 222LLLDRLNQL230 | N | 6 | |
| 7 | 2332ILFTRFFYV2340 | ORF1a | 4 | |
| 8 | 107YLYALVYFL115 | ORF3a | 4 | |
| 9 | 72ALSKGVHFV80 | ORF3a | 3 | |
| 10 | 1220FIAGLIAIV1228 | S | 8 | |
| 11 | 221LLLLDRLNQL230 | N | 2 | |
| 12 | 3403FLNGSCGSV3411 | ORF1a | 4 | |
| 13 | 417KIADYNYKL425 | S | 8 | |
| 14 | 821LLFNKVTLA829 | S | 4 | |
| 15 | 424KLPDDFTGCV433 | S | 3 | |
| 16 | 825FGDDTVIEV833 | ORF1a | 2 | |
| 17 | 3467VLAWLYAAV3475 | ORF1a | 3 | |
| 18 | 3639FLLPSLATV3647 | ORF1a | 5 | |
| 19 | 1062FLHVTYVPA1070 | S | 5 | |
| 20 | 983RLDKVEAEV991 | S | 6 | |
| 21 | 995RLITGRLQSL1004 | S | 2 | |
| 22 | 26FLFLTWICL34 | M | 4 | |
| 23 | 338KLDDKDPNF346 | N | 2 | |
| 24 | 386KLNDLCFTNV395 | S | 3 | |
| 25 | 112YLGTGPEAGL121 | N | 1 | |
| 26 | 316GMSRIGMEV324 | N | 5 | |
| 27 | 219LALLLLDRL227 | N | 1 | |
| 28 | 202KIYSKHTPI210 | S | 5 | |
| 29 | 857GLTVLPPLL865 | S | 4 | |
| 30 | 958ALNTLVKQL966 | S | 5 | |
| 31 | 996LITGRLQSL1004 | S | 4 | |
| 32 | 1185RLNEVAKNL1193 | S | 6 | |
| 33 | 976VLNDILSRL984 | S | 7 | A*01:01, |
Epitopes are listed in the same order as in Fig. 4.
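The epitope labels in this table appear to fuse the start position, the amino acid sequence, and the end position into one string (for example, 269YLQPRTFLL277). A minimal sketch for splitting such labels and checking that the positions agree with the sequence length is given below; the regular expression and the names `Epitope` and `parse_epitope` are assumptions made for this example.

```python
import re
from typing import NamedTuple


class Epitope(NamedTuple):
    start: int
    sequence: str
    end: int


# Leading digits, uppercase one-letter amino acid codes, trailing digits.
_LABEL = re.compile(r"^(\d+)([A-Z]+)(\d+)$")


def parse_epitope(label: str) -> Epitope:
    """Split a label such as '269YLQPRTFLL277' into (start, sequence, end)."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized epitope label: {label!r}")
    start, sequence, end = int(match.group(1)), match.group(2), int(match.group(3))
    # Positions are inclusive, so the span must equal the peptide length.
    if end - start + 1 != len(sequence):
        raise ValueError(f"positions do not match sequence length in {label!r}")
    return Epitope(start, sequence, end)


print(parse_epitope("269YLQPRTFLL277"))
# Epitope(start=269, sequence='YLQPRTFLL', end=277)
print(len(parse_epitope("221LLLLDRLNQL230").sequence))  # 10
```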
Fig. 5. Results obtained for the 324 experimentally-determined SARS-CoV-2 T cell epitopes [8,9,115,176–180,182] when they were provided as input to the computational tools most commonly used in the reviewed SARS-CoV-2 in silico studies for refinement of epitopes obtained by peptide-HLA binding prediction methods. Positive outcomes indicate the number of epitopes that the computational tool predicts to have the characteristic being tested (immunogenicity, IFN-γ production, allergenicity, toxicity); negative outcomes indicate the number predicted to lack it. “NA” indicates the number of epitopes that could not be analyzed by the specific tool. For Calis et al. [80], this was because the method is applicable to HLA-I epitopes only; for IFNepitope, it was because only a subset of the experimentally-determined epitopes had IFN-γ production information available.
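The three outcome categories in this figure can be tallied with a few lines of code. The sketch below assumes each tool's output has already been reduced to `True` (positive), `False` (negative), or `None` (not analyzable); the function name `tally_outcomes` and the toy values are hypothetical and are not results from the review.

```python
from collections import Counter
from typing import Iterable, Optional


def tally_outcomes(calls: Iterable[Optional[bool]]) -> Counter:
    """Count positive, negative, and NA outcomes for one tool.

    True  -> the tool predicts the characteristic (positive)
    False -> the tool predicts its absence (negative)
    None  -> the epitope could not be analyzed by the tool (NA)
    """
    counts = Counter(positive=0, negative=0, NA=0)
    for call in calls:
        if call is None:
            counts["NA"] += 1
        elif call:
            counts["positive"] += 1
        else:
            counts["negative"] += 1
    return counts


# Toy calls for six epitopes (hypothetical values).
toy_calls = [True, False, None, True, True, None]
summary = tally_outcomes(toy_calls)
print(summary["positive"], summary["negative"], summary["NA"])  # 3 1 2
```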