Syed Faraz Ahmed, Ahmed A Quadeer, Matthew R McKay.
Abstract
The beginning of 2020 has seen the emergence of the COVID-19 outbreak caused by a novel coronavirus, Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). There is an imminent need to better understand this new virus and to develop ways to control its spread. In this study, we sought to gain insights for vaccine design against SARS-CoV-2 by considering the high genetic similarity between SARS-CoV-2 and SARS-CoV, which caused the outbreak in 2003, and leveraging existing immunological studies of SARS-CoV. By screening the experimentally determined SARS-CoV-derived B cell and T cell epitopes in the immunogenic structural proteins of SARS-CoV, we identified a set of B cell and T cell epitopes derived from the spike (S) and nucleocapsid (N) proteins that map identically to SARS-CoV-2 proteins. As no mutation has been observed in these identified epitopes among the 120 available SARS-CoV-2 sequences (as of 21 February 2020), immune targeting of these epitopes may potentially offer protection against this novel virus. For the T cell epitopes, we performed a population coverage analysis of the associated MHC alleles and proposed a set of epitopes that is estimated to provide broad coverage globally, as well as in China. Our findings provide a screened set of epitopes that can help guide experimental efforts towards the development of vaccines against SARS-CoV-2.
Keywords: 2019 novel coronavirus; 2019-nCoV; B cell epitopes; COVID-19; Coronavirus; MERS-CoV; SARS-CoV; SARS-CoV-2; T cell epitopes; vaccine
Year: 2020 PMID: 32106567 PMCID: PMC7150947 DOI: 10.3390/v12030254
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.048
Table 1. Filtering criteria and corresponding number of Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV)-derived epitopes obtained from the Virus Pathogen Database and Analysis Resource (ViPR) database.
| Filtering Criteria | Epitope Type | Number of Epitopes |
|---|---|---|
| Positive T cell assays | T cell epitopes | 115 |
| Positive major histocompatibility complex (MHC) binding assays | T cell epitopes | 959 |
| Positive B cell assays | Linear B cell epitopes | 298 |
| Positive B cell assays | Discontinuous B cell epitopes | 6 |
Figure 1. Comparison of the similarity of structural proteins of SARS-CoV-2 with the corresponding proteins of SARS-CoV and MERS (Middle East Respiratory Syndrome)-CoV. (a) Percentage genetic similarity of the individual structural proteins of SARS-CoV-2 with those of SARS-CoV and MERS-CoV. The reference sequence of each coronavirus (Materials and Methods) was used to calculate the percentage genetic similarity. (b) Circular phylogram of the phylogenetic trees of the four structural proteins. All trees were constructed based on the available unique sequences using PASTA [31] and rooted with the outgroup Zaria Bat CoV strain (accession ID: HQ166910.1).
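The percentage similarity in panel (a) is computed between reference sequences; the exact definition is given in the paper's Materials and Methods, which are not reproduced here. As a rough illustration only, one common definition is percent identity over a pairwise alignment. The sketch below assumes the two sequences are already aligned (gaps written as `-`); the function name `percent_identity` and the toy strings are hypothetical and are not real viral sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns with identical residues.

    Assumption: both inputs come from the same pairwise alignment,
    so they have equal length and use '-' for gaps. Columns where
    both sequences have a gap are skipped; a gap against a residue
    counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


# Toy aligned strings (hypothetical, for illustration only).
toy_a = "MKT-LLVA"
toy_b = "MKTALIVA"
similarity = percent_identity(toy_a, toy_b)
```

Other conventions (for example, excluding every gapped column or scoring with a substitution matrix) give different numbers, so values from this sketch should not be expected to match the figure.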
Table 2. SARS-CoV-derived T cell epitopes obtained using positive T cell assays that are identical in SARS-CoV-2 (27 epitopes in total).
| Protein | IEDB ID | Epitope | MHC Allele ¹ | MHC Allele Class ¹ |
|---|---|---|---|---|
| N | 125100 | ILLNKHID | HLA-A*02:01 | I |
| N | 1295 | AFFGMSRIGMEVTPSGTW | NA | NA |
| N | 190494 | MEVTPSGTWL | HLA-B*40:01 | I |
| N | 21347 | GMSRIGMEV | HLA-A*02:01 | I |
| N | 27182 | ILLNKHIDA | HLA-A*02:01 | I |
| N | 2802 | ALNTPKDHI | HLA-A*02:01 | I |
| N | 28371 | IRQGTDYKHWPQIAQFA | NA | NA |
| N | 31166 | KHWPQIAQFAPSASAFF | NA | NA |
| N | 34851 | LALLLLDRL | HLA-A*02:01 | I |
| N | 37473 | LLLDRLNQL | HLA-A*02:01 | I |
| N | 37611 | LLNKHIDAYKTFPPTEPK | NA | NA |
| N | 38881 | LQLPQGTTL | HLA-A*02:01 | I |
| N | 3957 | AQFAPSASAFFGMSR | NA | II |
| N | 3958 | AQFAPSASAFFGMSRIGM | NA | NA |
| N | 55683 | RRPQGLPNNTASWFT | NA | I |
| N | 74517 | YKTFPPTEPKKDKKKK | NA | NA |
| S | 100048 | GAALQIPFAMQMAYRF | HLA-DRA*01:01, HLA-DRB1*07:01 | II |
| S | 100300 | MAYRFNGIGVTQNVLY | HLA-DRB1*04:01 | II |
| S | 100428 | QLIRAAEIRASANLAATK | HLA-DRB1*04:01 | II |
| S | 16156 | FIAGLIAIV | HLA-A*02:01 | I |
| S | 2801 | ALNTLVKQL | HLA-A*02:01 | I |
| S | 36724 | LITGRLQSL | HLA-A2 | I |
| S | 44814 | NLNESLIDL | HLA-A*02:01 | I |
| S | 50311 | QALNTLVKQLSSNFGAI | HLA-DRB1*04:01 | II |
| S | 54680 | RLNEVAKNL | HLA-A*02:01 | I |
| S | 69657 | VLNDILSRL | HLA-A*02:01 | I |
| S | 71663 | VVFLHVTYV | HLA-A*02:01 | I |
¹ NA: Not available.
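The screening criterion behind this table, an epitope that maps identically to the target protein and shows no mutation across the available sequences, can be sketched as an exact substring check. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, and the toy sequences are built by concatenating two epitope strings from the table with filler residues, so they are not real SARS-CoV-2 sequences.

```python
from typing import List


def maps_identically(epitope: str, protein: str) -> bool:
    """True if the epitope occurs as an exact, ungapped substring."""
    return epitope in protein


def conserved_epitopes(epitopes: List[str], sequences: List[str]) -> List[str]:
    """Keep only epitopes found identically in every sequence."""
    return [e for e in epitopes if all(maps_identically(e, s) for s in sequences)]


# Two epitope strings taken from the table above.
epitopes = ["GMSRIGMEV", "LLLDRLNQL"]

# Hypothetical toy sequences (NOT real viral sequences).
toy_sequences = [
    "AAA" + "GMSRIGMEV" + "CCC" + "LLLDRLNQL",
    "GMSRIGMEV" + "DDD" + "LLLDRLNQM",  # last residue of second epitope mutated
]

result = conserved_epitopes(epitopes, toy_sequences)
```

In this toy input only `GMSRIGMEV` survives, because the second sequence carries a substitution inside `LLLDRLNQL`. A real screen would read protein sequences from alignment or FASTA files rather than hard-coded strings.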
Table 3. Set of the SARS-CoV-derived spike (S) and nucleocapsid (N) protein T cell epitopes (obtained from positive MHC binding assays) that are identical in SARS-CoV-2 and that maximize estimated population coverage globally (87 distinct epitopes).
| Epitopes ¹ | MHC Allele Class | MHC Allele | Global Accumulated Population Coverage ² (%) | Accumulated Population Coverage in China (%) |
|---|---|---|---|---|
| | I | HLA-A*02:01 | 39.08 | 14.62 |
| | I | HLA-A*24:02 | 55.48 | 36.11 |
| DSFKEELDKY, LIDLQELGKY, | I | HLA-A*01:01 | 66.78 | 39.09 |
| GSFCTQLNR, GVVFLHVTY, AQALNTLVK, MTSCCSCLK, ASANLAATK, SLIDLQELGK, SVLNDILSR, TQNVLYENQK, CMTSCCSCLK, VQIDRLITGR, KTFPPTEPK, KTFPPTEPKK, LSPRWYFYY, ASAFFGMSR, ATEGALNTPK, QLPQGTTLPK, QQQGQTVTK, QQQQGQTVTK, SASAFFGMSR, SQASSRSSSR, TPSGTWLTY | I | HLA-A*03:01 | 76.14 | 41.68 |
| GSFCTQLNR, GVVFLHVTY, AQALNTLVK, MTSCCSCLK, ASANLAATK, SLIDLQELGK, SVLNDILSR, TQNVLYENQK, CMTSCCSCLK, VQIDRLITGR, KTFPPTEPK, KTFPPTEPKK, LSPRWYFYY, ASAFFGMSR, ATEGALNTPK, QLPQGTTLPK, QQQGQTVTK, QQQQGQTVTK, SASAFFGMSR, SQASSRSSSR, TPSGTWLTY | I | HLA-A*11:01 | 83.39 | 73.43 |
| GSFCTQLNR, GVVFLHVTY, AQALNTLVK, MTSCCSCLK, ASANLAATK, SLIDLQELGK, SVLNDILSR, TQNVLYENQK, CMTSCCSCLK, VQIDRLITGR, KTFPPTEPK, KTFPPTEPKK, LSPRWYFYY, ASAFFGMSR, ATEGALNTPK, QLPQGTTLPK, QQQGQTVTK, QQQQGQTVTK, SASAFFGMSR, SQASSRSSSR, TPSGTWLTY | I | HLA-A*68:01 | 85.71 | 74.25 |
| | I | HLA-A*23:01 | 87.72 | 74.87 |
| GSFCTQLNR, GVVFLHVTY, AQALNTLVK, MTSCCSCLK, ASANLAATK, SLIDLQELGK, SVLNDILSR, TQNVLYENQK, CMTSCCSCLK, VQIDRLITGR, KTFPPTEPK, KTFPPTEPKK, LSPRWYFYY, ASAFFGMSR, ATEGALNTPK, QLPQGTTLPK, QQQGQTVTK, QQQQGQTVTK, SASAFFGMSR, SQASSRSSSR, TPSGTWLTY | I | HLA-A*31:01 | 89.55 | 76.93 |
| FPNITNLCPF, APHGVVFLHV, FPRGQGVPI, APSASAFFGM | I | HLA-B*07:02 | 90.89 | 77.61 |
| GAALQIPFAMQMAYR, GWTFGAGAALQIPFA, IDRLITGRLQSLQTY, ISGINASVVNIQKEI, LDKYFKNHTSPDVDL, LGDISGINASVVNIQ, LGFIAGLIAIVMVTI, LNTLVKQLSSNFGAI, LQDVVNQNAQALNTL, LQSLQTYVTQQLIRA, LQTYVTQQLIRAAEI, AQKFNGLTVLPPLLT, PCSFGGVSVITPGTN, QIPFAMQMAYRFNGI, QQLIRAAEIRASANL, QTYVTQQLIRAAEIR, AYRFNGIGVTQNVLY, SSNFGAISSVLNDIL, TGRLQSLQTYVTQQL, WLGFIAGLIAIVMVT, CVNFNFNGLTGTGVL, DKYFKNHTSPDVDLG, IDAYKTFPPTEPKKD, MSRIGMEVTPSGTWL, NKHIDAYKTFPPTEP, VLQLPQGTTLPKGFY | II | HLA-DRB1*01:01 | 91.94 | 78.23 |
| FPRGQGVPI | I | HLA-B*08:01 | 92.85 | 78.41 |
| FPNITNLCPF, APHGVVFLHV, FPRGQGVPI, APSASAFFGM | I | HLA-B*35:01 | 93.53 | 79.23 |
| LQIPFAMQM, RVDFCGKGY | I | HLA-B*15:01 | 94.18 | 82.26 |
| FPNITNLCPF, APHGVVFLHV, FPRGQGVPI, APSASAFFGM | I | HLA-B*51:01 | 94.72 | 83.73 |
| YEQYIKWPWY | I | HLA-B*18:01 | 95.23 | 83.88 |
| GRLQSLQTY, RVDFCGKGY, VRFPNITNL | I | HLA-B*27:05 | 95.55 | 84 |
| MTSCCSCLK, SLIDLQELGK, CMTSCCSCLK, VQIDRLITGR, SASAFFGMSR, SQASSRSSSR | I | HLA-A*33:01 | 95.79 | 85.28 |
| LQIPFAMQM, RVDFCGKGY | I | HLA-B*58:01 | 95.99 | 86.45 |
| LQIPFAMQM, RVDFCGKGY | I | HLA-C*15:02 | 96.17 | 87.22 |
| VRFPNITNL | I | HLA-C*14:02 | 96.29 | 88.11 |
¹ Multiple SARS-CoV-derived epitopes that were determined using MHC binding assays are shown for each allele. Epitopes that were also tested for positive T cell response (listed also in Table 2) are shown in bold. Epitopes that lie within the SARS-CoV receptor-binding motif are underlined.
² Epitopes are ordered according to the estimated global accumulated population coverage.
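The accumulated coverage columns grow as each MHC allele is added in order. The sketch below shows one simplified way such an ordering could be produced: a greedy selection that, at each step, adds the allele giving the largest gain in covered population. It assumes Hardy-Weinberg proportions within a locus and independence between loci, which is a simplification and not necessarily the method used for the table. The allele labels and frequencies are hypothetical placeholders, not measured values.

```python
from typing import Dict, List, Tuple

# locus -> {allele label -> allele frequency}; hypothetical numbers.
Freqs = Dict[str, Dict[str, float]]


def coverage(selected: List[str], freqs: Freqs) -> float:
    """Fraction of individuals carrying at least one selected allele.

    Assumptions: Hardy-Weinberg proportions within each locus and
    independence between loci.
    """
    not_covered = 1.0
    for alleles in freqs.values():
        p = sum(f for a, f in alleles.items() if a in selected)
        not_covered *= (1.0 - p) ** 2  # neither chromosome carries a selected allele
    return 1.0 - not_covered


def greedy_order(freqs: Freqs) -> List[Tuple[str, float]]:
    """Order alleles by greedy gain; return (allele, accumulated coverage)."""
    remaining = [a for alleles in freqs.values() for a in alleles]
    chosen: List[str] = []
    ordered: List[Tuple[str, float]] = []
    while remaining:
        best = max(remaining, key=lambda a: coverage(chosen + [a], freqs))
        chosen.append(best)
        remaining.remove(best)
        ordered.append((best, coverage(chosen, freqs)))
    return ordered


# Hypothetical toy frequencies (placeholders, not population data).
toy_freqs: Freqs = {"A": {"A1": 0.2, "A2": 0.1}, "B": {"B1": 0.1}}
ordering = greedy_order(toy_freqs)
```

With the toy frequencies, the accumulated coverage rises from 0.36 to 0.51 to about 0.60 as alleles are added. Real coverage estimates depend on population-specific genotype data and on the MHC class, so this sketch illustrates the accumulation logic only.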
Table 4. SARS-CoV-derived linear B cell epitopes from S (23; 20 of which are located in subunit S2) and N (22) proteins that are identical in SARS-CoV-2 (45 epitopes in total).
| Protein | Subunit | IEDB ID | Epitope | Protein | IEDB ID | Epitope |
|---|---|---|---|---|---|---|
| S | S2 | 10778 | DVVNQNAQALNTLVKQL | N | 15814 | FFGMSRIGMEVTPSGTW |
| S | S2 | 11038 | EAEVQIDRLITGRLQSL | N | 21065 | GLPNNTASWFTALTQHGK |
| S | S2 | 12426 | EIDRLNEVAKNLNESLIDLQELGKYEQY | N | 22855 | GTTLPK |
| S | S2 | 14626 | EVAKNLNESLIDLQELG | N | 28371 | IRQGTDYKHWPQIAQFA |
| S | S2 | 18515 | GAALQIPFAMQMAYRFN | N | 31116 | KHIDAYKTFPPTEPKKDKKK |
| S | S1 | 18594 | GAGICASY | N | 31166 | KHWPQIAQFAPSASAFF |
| S | S2 | 2092 | AISSVLNDILSRLDKVE | N | 75235 | YNVTQAFGRRGPEQTQGNF |
| S | S2 | 22321 | GSFCTQLN | N | 33669 | KTFPPTEPKKDKKKK |
| S | S2 | 27357 | ILSRLDKVEAEVQIDRL | N | 37640 | LLPAAD |
| S | S1 | 30987 | KGIYQTSN | N | 38249 | LNKHIDAYKTFPPTEPK |
| S | S2 | 3176 | AMQMAYRF | N | 38648 | LPQGTTLPKG |
| S | S2 | 32508 | KNHTSPDVDLGDISGIN | N | 38657 | LPQRQKKQ |
| S | S2 | 41177 | MAYRFNGIGVTQNVLYE | N | 48067 | PKGFYAEGSRGGSQASSR |
| S | S2 | 462 | AATKMSECVLGQSKRVD | N | 50741 | QFAPSASAFFGMSRIGM |
| S | S2 | 47479 | PFAMQMAYRFNGIGVTQ | N | 50965 | QGTDYKHW |
| S | S2 | 50311 | QALNTLVKQLSSNFGAI | N | 51483 | QLPQGTTLPKGFYAE |
| S | S2 | 51379 | QLIRAAEIRASANLAAT | N | 51484 | QLPQGTTLPKGFYAEGSR |
| S | S1 | 52020 | QQFGRD | N | 51485 | QLPQGTTLPKGFYAEGSRGGSQ |
| S | S2 | 53202 | RASANLAATKMSECVLG | N | 63729 | TFPPTEPK |
| S | S2 | 54599 | RLITGRLQSLQTYVTQQ | N | 55683 | RRPQGLPNNTASWFT |
| S | S2 | 558417 | EIDRLNEVAKNLNESLIDLQELGKYEQY | N | 60379 | SQASSRSS |
| S | S2 | 59425 | SLQTYVTQQLIRAAEIR | N | 60669 | SRGGSQASSRSSSRSR |
| S | S2 | 9094 | DLGDISGINASVVNIQK | | | |
Table 5. SARS-CoV-derived discontinuous B cell epitopes (and associated known antibodies [39,40,41]) that have at least one site with an identical amino acid to the corresponding site in SARS-CoV-2.
| IEDB ID | Associated Known Antibody | SARS-CoV S Protein Residues ¹,² |
|---|---|---|
| 910052 | S230 | G446, P462, D463, |
| 77444 | m396 | T359, |
| 77442 | 80R | R426, S432, T433, |
¹ Residues are numbered according to the SARS-CoV S protein reference sequence, accession ID: NP_828851.1.
² Residues in the epitopes that are identical in the SARS-CoV-2 sequences are underlined.
Figure 2. Location of SARS-CoV S protein subunits and SARS-CoV-derived B cell epitopes on the protein structure (PDB ID: 5XLR). (a) Subunits S1 and S2 are indicated in purple and green, respectively. The receptor-binding motif lies within the S1 subunit and is indicated in orange. (b) Residues of the linear B cell epitopes that were identical in SARS-CoV-2 (Table 4) are shown in red. The dark and light shades reflect the surface and buried residues, respectively. (c) Location of discontinuous B cell epitopes that share at least one identical residue with corresponding SARS-CoV-2 sites (Table 5). Identical epitope residues are shown in red, while the remaining epitope residues are shown in blue. Both the side view (left panel) and the top view (right panel) of the structure are shown.