Richa Jain, Ankit Jain, Santosh Kumar Verma.
Abstract
COVID-19 is an infectious disease caused by the newly discovered coronavirus SARS-CoV-2 and is currently the most dangerous epidemic worldwide. To date, no licensed vaccine or specific, effective therapeutic agent is available to prevent or cure the disease, so developing an effective vaccine is an urgent need. This study aims to identify potential vaccine candidates by computationally screening the complete SARS-CoV-2 proteome. Of the 14 protein entries in UniProtKB, 4 proteins were selected for epitope prediction based on consensus antigenicity predictions and physicochemical criteria such as transmembrane domains, allergenicity, GRAVY value, toxicity, and stability index. Comprehensive analysis of these 4 antigens revealed that the spike protein (P0DTC2) and the nucleoprotein (P0DTC9) show the greatest potential for experimental immunogenicity analysis. These 2 proteins have several potential CD4+ and CD8+ T-cell epitopes, as well as a high probability of B-cell epitope regions, compared with a well-characterized antigen, matrix protein 1 of influenza A virus (H5N1). In addition, the epitope SIIAYTMSL predicted from the spike protein (P0DTC2) and the epitope SPRWYFYYL predicted from the nucleoprotein (P0DTC9) exhibited more than 60% population coverage in the target populations considered in this study (Europe, North America, South Asia, and Northeast Asia). These epitopes also exhibited highly significant TCR-pMHC interactions, with joint Z values of 4.51 and 4.37, respectively. This analysis therefore suggests that the predicted epitopes might be suitable vaccine candidates and should be subjected to further in vitro and in vivo studies.
Keywords: COVID-19; Epitope; MHC; SARS-CoV-2; Vaccine
Year: 2021 PMID: 33897313 PMCID: PMC8051835 DOI: 10.1007/s10989-021-10205-z
Source DB: PubMed Journal: Int J Pept Res Ther ISSN: 1573-3149 Impact factor: 1.931
List of proteins predicted to be antigenic with corresponding antigenic probabilities
| Protein no | UniProtKB id | Protein name | Antigenic probability (VaxiJen) | Antigenic probability (ANTIGENpro) | No. of TM regions predicted using TMHMM |
|---|---|---|---|---|---|
| 1 | P0DTC1 | Replicase polyprotein 1a (pp1a) | 0.47 | 0.64 | 14 |
| 2 | P0DTD1 | Replicase polyprotein 1ab (pp1ab) | 0.46 | 0.68 | 14 |
| 3 | P0DTC2 | Spike glycoprotein (S) | 0.46 | 0.71 | 1 |
| 4 | P0DTC3 | ORF3a protein (NS3a) | 0.49 | 0.40 | 3 |
| 5 | P0DTC7 | ORF7a protein | 0.64 | 0.40 | 1 |
| 6 | P0DTC9 | Nucleoprotein (N) | 0.50 | 0.93 | 0 |
| 7 | P0DTD2 | ORF9b protein | 0.90 | 0.74 | 0 |
| Control | Q9Q0L8 | Matrix protein 1 | 0.47 | 0.86 | 0 |
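
The four proteins that appear in the ProtParam table are the rows above with at most one predicted transmembrane region. The following is a minimal sketch of that screening step, assuming a cutoff of one TM region; the cutoff is inferred from the tables and is not stated in this record, and the names `CANDIDATES`, `MAX_TM_REGIONS`, and `screen_by_tm` are hypothetical.

```python
# Rows copied from the table above:
# (UniProtKB id, VaxiJen probability, ANTIGENpro probability, TM regions).
CANDIDATES = [
    ("P0DTC1", 0.47, 0.64, 14),
    ("P0DTD1", 0.46, 0.68, 14),
    ("P0DTC2", 0.46, 0.71, 1),
    ("P0DTC3", 0.49, 0.40, 3),
    ("P0DTC7", 0.64, 0.40, 1),
    ("P0DTC9", 0.50, 0.93, 0),
    ("P0DTD2", 0.90, 0.74, 0),
]

# Assumption: the cutoff is inferred from the tables, not quoted from the paper.
MAX_TM_REGIONS = 1


def screen_by_tm(rows, max_tm=MAX_TM_REGIONS):
    """Return the ids of proteins with at most `max_tm` predicted TM regions."""
    return [uid for uid, _vaxijen, _antigenpro, tm in rows if tm <= max_tm]


selected = screen_by_tm(CANDIDATES)
print(selected)  # ['P0DTC2', 'P0DTC7', 'P0DTC9', 'P0DTD2']
```

The antigenicity scores are carried along unused here because the record does not give the thresholds applied to them.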
Physicochemical parameters calculated using the ProtParam tool
| UniProtKB id | Protein name | Molecular weight (kDa) | Theoretical pI | Instability index | Aliphatic index | GRAVY |
|---|---|---|---|---|---|---|
| P0DTC2 | Spike glycoprotein (S) | 141.17 | 6.24 | 33.01 | 84.67 | −0.079 |
| P0DTC7 | ORF7a protein | 13.74 | 8.23 | 48.66 | 100.74 | 0.318 |
| P0DTC9 | Nucleoprotein (N) | 45.62 | 10.07 | 55.09 | 52.53 | −0.971 |
| P0DTD2 | ORF9b protein | 10.79 | 6.56 | 33.11 | 105.46 | −0.085 |
| Q9Q0L8 (Control) | Matrix protein 1 | 27.85 | 9.42 | 38.72 | 82.90 | −0.246 |
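
ProtParam conventionally reads an instability index below 40 as a predicted stable protein, and a negative GRAVY value as an overall hydrophilic one. The sketch below applies those two conventions to the values in the table above; how the authors weighed these parameters is not stated in this record, and the names `PROTPARAM`, `STABILITY_CUTOFF`, and `describe` are hypothetical.

```python
# Instability index and GRAVY copied from the table above.
PROTPARAM = {
    "P0DTC2": {"instability": 33.01, "gravy": -0.079},
    "P0DTC7": {"instability": 48.66, "gravy": 0.318},
    "P0DTC9": {"instability": 55.09, "gravy": -0.971},
    "P0DTD2": {"instability": 33.11, "gravy": -0.085},
    "Q9Q0L8": {"instability": 38.72, "gravy": -0.246},  # control
}

# ProtParam convention: an index below 40 predicts a stable protein.
STABILITY_CUTOFF = 40.0


def describe(entry):
    """Label one protein as (stable|unstable, hydrophilic|hydrophobic)."""
    stability = "stable" if entry["instability"] < STABILITY_CUTOFF else "unstable"
    hydropathy = "hydrophilic" if entry["gravy"] < 0 else "hydrophobic"
    return stability, hydropathy


summary = {uid: describe(values) for uid, values in PROTPARAM.items()}
for uid, labels in summary.items():
    print(uid, *labels)
```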
Selected CTL epitopes and their binding to different MHC class I supertypes
| Protein | Epitope | MHC I supertypes |
|---|---|---|
| P0DTC2 Spike glycoprotein (S) | AALQIPFAM | B7, B58 |
| | AIVMVTIML | A2, B7 |
| | DEDDSEPVL | B39, B44 |
| | EPVLKGVKL | B7, B8 |
| | ESNKKFLPF | A26, B62 |
| | FAMQMAYRF | B58, B62 |
| | FEYVSQPFL | B39, B44 |
| | FLHVTYVPA | A2, B8 |
| | FRKSNLKPF | B8, B27 |
| | FTISVTTEI | A2, A26, B58 |
| | FVFLVLLPL | A2, A26, B8, B62 |
| | GAAAYYVGY | A1, B58, B62 |
| | GAEHVNNSY | A1, B62 |
| | GQTGKIADY | B27, B62 |
| | IAIPTNFTI | A24, B58 |
| | IGAGICASY | B58, B62 |
| | IGIVNNTVY | B58, B62 |
| | ITDAVDCAL | A1, B39, B58 |
| | KGIYQTSNF | B58, B62 |
| | KIADYNYKL | A2, B39 |
| | KIYSKHTPI | A2, B8 |
| | KTSVDCTMY | A1, A3, B58, B62 |
| | KVTLADAGF | B58, B62 |
| | LLALHRSYL | A2, B8 |
| | LPFFSNVTW | B7, B58 |
| | LSETKCTLK | A1, A3 |
| | MTSCCSCLK | A1, A3 |
| | NGVEGFNCY | A26, B62 |
| | NLLLQYGSF | B8, B62 |
| | NTSNQVAVL | A26, B39 |
| | QIITTDNTF | A24, A26, B58, B62 |
| | QLTPTWRVY | A1, B62 |
| | RVVVLSFEL | A2, B7, B58, B62 |
| | SIIAYTMSL | A2, A26, B62 |
| | SLSSTASAL | A2, B7, B62 |
| | SPRRARSVA | B7, B8 |
| | STECSNLLL | A1, B39 |
| | STQDLFLPF | A1, A24, A26, B62 |
| | TFEYVSQPF | A24, B62 |
| | TLDSKTQSL | A2, B39 |
| | TLLALHRSY | A3, B62 |
| | TSNQVAVLY | A1, A3, A26, B58, B62 |
| | VLKGVKLHY | A1, A3, B62 |
| | VLPFNDGVY | A1, B62 |
| | VRFPNITNL | B27, B39 |
| | VVNQNAQAL | B7, B62 |
| | VYDPLQPEL | A24, B39 |
| | WTAGAAAYY | A1, A26, B58, B62 |
| | WTFGAGAAL | A26, B62 |
| | YLQPRTFLL | A2, B39, B58, B62 |
| | YQDVNCTEV | A1, A2, B39 |
| | YQPYRVVVL | A2, A24, B8, B39, B62 |
| | YVPAQEKNF | A26, B62 |
| P0DTC9 Nucleoprotein (N) | DLSPRWYFY | A1, A3, A26 |
| | FPRGQGVPI | B7, B8 |
| | KAYNVTQAF | A24, B7, B8, B58, B62 |
| | KMKDLSPRW | B58, B62 |
| | LSPRWYFYY | A1, A3, A26, B58, B62 |
| | QFAPSASAF | A24, B62 |
| | QKKQQTVTL | B8, B39 |
| | QRQKKQQTV | B8, B27 |
| | SPRWYFYYL | B7, B8 |
| | SSPDDQIGY | A1, A26, B62 |
| P0DTD2 ORF9b protein | GPKVYPIIL | B7, B8 |
| | KISEMHPAL | A2, B7, B8, B39, B58, B62 |
| | KVYPIILRL | A2, A3, B58 |
| | LRLGSPLSL | B27, B39 |
| | MARKTLNSL | B7, B8 |
| | RLVDPQIQL | A2, B62 |
| | SEMHPALRL | B39, B44 |
| | SLEDKAFQL | A2, B39 |
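
Every epitope in the table above binds at least two MHC class I supertypes, so the number of supertypes is one simple way to compare how promiscuous the epitopes are. The sketch below ranks a handful of rows copied from the table by that count; it illustrates how to read the table and is not the authors' scoring method, and the names `ROWS` and `supertype_count` are hypothetical.

```python
# A few rows copied from the table above: (protein, epitope, supertypes cell).
ROWS = [
    ("P0DTC2", "SIIAYTMSL", "A2, A26, B62"),
    ("P0DTC2", "YQPYRVVVL", "A2, A24, B8, B39, B62"),
    ("P0DTC2", "TSNQVAVLY", "A1, A3, A26, B58, B62"),
    ("P0DTC9", "SPRWYFYYL", "B7, B8"),
    ("P0DTC9", "KAYNVTQAF", "A24, B7, B8, B58, B62"),
    ("P0DTD2", "KISEMHPAL", "A2, B7, B8, B39, B58, B62"),
]


def supertype_count(cell):
    """Count the comma-separated supertypes in one table cell."""
    return len([name for name in cell.split(",") if name.strip()])


# Most promiscuous first; ties keep their table order (sorted is stable).
ranked = sorted(ROWS, key=lambda row: supertype_count(row[2]), reverse=True)
for protein, epitope, cell in ranked:
    print(f"{epitope}  {protein}  {supertype_count(cell)} supertypes")
```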
Population coverage analysis of optimized top scoring CTL epitopes for different target populations
| Protein | Target population | Epitope | Percentage coverage | Total HLA hits |
|---|---|---|---|---|
| P0DTC2 Spike glycoprotein (S) | Europe | RVVVLSFEL | 80.55% | 31 |
| | | SLSSTASAL | 77.12% | 31 |
| | | YQPYRVVVL | 75.44% | 24 |
| | | AIVMVTIML | 72.91% | 18 |
| | | FVFLVLLPL | 68.03% | 20 |
| | | TSNQVAVLY | 63.35% | 26 |
| | | YLQPRTFLL | 63.31% | 26 |
| | | SIIAYTMSL | 60.53% | 33 |
| | North America | RVVVLSFEL | 80.83% | 31 |
| | | SLSSTASAL | 80.83% | 31 |
| | | SIIAYTMSL | 78.10% | 33 |
| | | YQPYRVVVL | 74.61% | 24 |
| | | AIVMVTIML | 69.55% | 18 |
| | | TSNQVAVLY | 67.13% | 26 |
| | | VVNQNAQAL | 64.63% | 23 |
| | | YLQPRTFLL | 64.56% | 26 |
| | South Asia | TSNQVAVLY | 80.64% | 26 |
| | | KTSVDCTMY | 76.42% | 23 |
| | | VLKGVKLHY | 71.64% | 17 |
| | | LSETKCTLK | 66.63% | 10 |
| | | MTSCCSCLK | 66.63% | 10 |
| | | TLLALHRSY | 66.62% | 13 |
| | | SIIAYTMSL | 65.03% | 33 |
| | | RVVVLSFEL | 61.94% | 31 |
| | North East Asia | TSNQVAVLY | 82.88% | 26 |
| | | KTSVDCTMY | 81.61% | 23 |
| | | RVVVLSFEL | 79.33% | 31 |
| | | VLKGVKLHY | 78.68% | 17 |
| | | TLLALHRSY | 76.97% | 13 |
| | | SLSSTASAL | 73.78% | 31 |
| | | VVNQNAQAL | 68.07% | 23 |
| | | SIIAYTMSL | 66.22% | 33 |
| P0DTC9 Nucleoprotein (N) | Europe | KAYNVTQAF | 77.43% | 27 |
| | | LSPRWYFYY | 63.35% | 26 |
| | | SPRWYFYYL | 60.10% | 14 |
| | | FPRGQGVPI | 59.75% | 12 |
| | North America | KAYNVTQAF | 76.89% | 27 |
| | | LSPRWYFYY | 67.13% | 26 |
| | | DLSPRWYFY | 54% | 13 |
| | | SPRWYFYYL | 51.20% | 14 |
| | South Asia | LSPRWYFYY | 80.64% | 26 |
| | | DLSPRWYFY | 72.60% | 13 |
| | | KAYNVTQAF | 63.14% | 27 |
| | | SPRWYFYYL | 32.99% | 14 |
| | North East Asia | LSPRWYFYY | 82.88% | 26 |
| | | KAYNVTQAF | 76.27% | 27 |
| | | DLSPRWYFY | 64% | 13 |
| | | SPRWYFYYL | 25.03% | 14 |
| P0DTD2 ORF9b protein | Europe | KISEMHPAL | 88.15% | 38 |
| | | KVYPIILRL | 80.84% | 20 |
| | | GPKVYPIIL | 60.10% | 15 |
| | | RLVDPQIQL | 55.65% | 14 |
| | North America | KISEMHPAL | 85.77% | 38 |
| | | KVYPIILRL | 77.80% | 20 |
| | | RLVDPQIQL | 55.50% | 15 |
| | | GPKVYPIIL | 51.20% | 14 |
| | South Asia | KVYPIILRL | 76.77% | 20 |
| | | KISEMHPAL | 65.26% | 38 |
| | | GPKVYPIIL | 32.99% | 14 |
| | | RLVDPQIQL | 31.48% | 15 |
| | North East Asia | KVYPIILRL | 81.77% | 20 |
| | | KISEMHPAL | 81.09% | 38 |
| | | RLVDPQIQL | 64.31% | 15 |
| | | SLEDKAFQL | 37.92% | 13 |
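
The coverage table can be queried for epitopes that pass a given threshold in every target population. The sketch below does this for four epitopes whose values are copied from the table, using the 60% figure mentioned in the abstract as the threshold; the names `COVERAGE`, `covered_everywhere`, and `weakest_population` are hypothetical, and the selection of four epitopes is only for brevity.

```python
# Percentage coverage copied from the table above, keyed by epitope and population.
COVERAGE = {
    "SIIAYTMSL": {"Europe": 60.53, "North America": 78.10,
                  "South Asia": 65.03, "North East Asia": 66.22},
    "RVVVLSFEL": {"Europe": 80.55, "North America": 80.83,
                  "South Asia": 61.94, "North East Asia": 79.33},
    "KAYNVTQAF": {"Europe": 77.43, "North America": 76.89,
                  "South Asia": 63.14, "North East Asia": 76.27},
    "SPRWYFYYL": {"Europe": 60.10, "North America": 51.20,
                  "South Asia": 32.99, "North East Asia": 25.03},
}


def covered_everywhere(coverage, threshold=60.0):
    """Epitopes whose coverage exceeds `threshold` in every listed population."""
    return sorted(
        epitope
        for epitope, by_population in coverage.items()
        if min(by_population.values()) > threshold
    )


def weakest_population(coverage, epitope):
    """Return (population, coverage) for the lowest-coverage population."""
    by_population = coverage[epitope]
    name = min(by_population, key=by_population.get)
    return name, by_population[name]


print(covered_everywhere(COVERAGE))  # ['KAYNVTQAF', 'RVVVLSFEL', 'SIIAYTMSL']
print(weakest_population(COVERAGE, "SPRWYFYYL"))  # ('North East Asia', 25.03)
```

With the tabulated values, SIIAYTMSL exceeds 60% in all four populations, while SPRWYFYYL does so only for Europe.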
Fig. 1 PAComplex server showing pMHC-TCR interactions and homologous peptide for antigen P0DTC2
Fig. 2 PAComplex server showing pMHC-TCR interactions and homologous peptide for antigen P0DTC9
Fig. 3 Frequency logo for the peptide antigen family of homologous template peptide 1oga (GILGFVFTL) of top hit peptide (SIIAYTMSL)
Fig. 4 Frequency logo for the peptide antigen family of homologous template peptide 2vlr (GILGFVFTL) of top hit peptide (SPRWYFYYL)