Riska A Febrianti1, Erlia Narulita1,2.
Abstract
Objectives: This study aimed to produce a recombinant protein vaccine candidate based on an epitope of the spike protein from Indonesian SARS-CoV-2 to provide immunogenicity and protection against future infection.
Keywords: Indonesia; Recombinant protein; Reverse vaccinology; SARS-CoV-2; Spike protein
Year: 2022 PMID: 35250426 PMCID: PMC8881762 DOI: 10.1016/j.jtumed.2022.02.007
Source DB: PubMed Journal: J Taibah Univ Med Sci ISSN: 1658-3612
Accession numbers of the complete Indonesian SARS-CoV-2 genome sequences downloaded from GISAID.
| Number | Province | Accession number |
|---|---|---|
| 1 | East Java/Sidoarjo | EPI_ISL_956315 |
| 2 | Banten/Tangerang | EPI_ISL_947327 |
| 3 | Jakarta | EPI_ISL_953427 |
| 4 | West Java | EPI_ISL_747241 |
| 5 | Central Java | EPI_ISL_791988 |
| 6 | Special Region of Yogyakarta | EPI_ISL_911709 |
| 7 | Aceh | EPI_ISL_791981 |
| 8 | Bangka Belitung Islands | EPI_ISL_747237 |
| 9 | North Sumatra | EPI_ISL_756401 |
| 10 | Lampung | EPI_ISL_791978 |
| 11 | Riau Islands | EPI_ISL_791985 |
| 12 | South Sumatra/Palembang | EPI_ISL_833039 |
| 13 | West Sumatra | EPI_ISL_910014 |
| 14 | Bengkulu | EPI_ISL_791979 |
| 15 | Bali | EPI_ISL_775596 |
| 16 | East Nusa Tenggara/Kupang | EPI_ISL_766048 |
| 17 | West Nusa Tenggara | EPI_ISL_775598 |
| 18 | South Kalimantan | EPI_ISL_753699 |
| 19 | Central Kalimantan | EPI_ISL_538502 |
| 20 | East Kalimantan | EPI_ISL_791983 |
| 21 | North Kalimantan | EPI_ISL_803876 |
| 22 | West Kalimantan | EPI_ISL_911750 |
| 23 | North Sulawesi/Manado | EPI_ISL_574623 |
| 24 | South Sulawesi/Makassar | EPI_ISL_833502 |
| 25 | North Maluku | EPI_ISL_791986 |
| 26 | West Papua | EPI_ISL_775597 |
| 27 | Papua/Timika | EPI_ISL_574603 |
| 28 | Jakarta | EPI_ISL_1118931 |
| 29 | Jakarta | EPI_ISL_1118933 |
Figure 1. Determination of conserved regions of the spike protein sequence of Indonesian SARS-CoV-2. Conserved regions of the spike protein sequence are marked in red, while non-conserved regions are in white. Amino acid residues belonging to non-conserved regions are at positions 74, 149, 249, 398, 513, 583, 51, 700-725, 775-795, 813, 838, 924, 1126, and 1298.
Consensus sequence of the Indonesian SARS-CoV-2 spike protein.
| >ConsensusMFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEHVNNSYECDIPIGAGICASYQTQTNSPRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKDFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALNTLVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKxGCCSCGSCCKFDEDDSEPVLKGVKLHYT |
Predicted B cell epitopes and their conservancy analysis.
| No. | Start | End | Peptide | Length | Conservancy analysis (based on AVANA) |
|---|---|---|---|---|---|
| 1 | 13 | 37 | SQCVNLTTRTQLPPAYTNSFTRGVY | 25 | Conserved region |
| 2 | 59 | 81 | FSNVTWFHAIHVSGTNGTKRFDN | 23 | Non-conserved region |
| 3 | 97 | 98 | KS | 2 | Conserved region |
| 4 | 138 | 154 | DPFLGVYYHKNNKSWME | 17 | Non-conserved region |
| 5 | 177 | 189 | MDLEGKQGNFKNL | 13 | Conserved region |
| 6 | 206 | 221 | KHTPINLVRDLPQGFS | 16 | Conserved region |
| 7 | 250 | 260 | TPGDSSSGWTA | 11 | Conserved region |
| 8 | 293 | 296 | LDPL | 4 | Conserved region |
| 9 | 304 | 322 | KSFTVEKGIYQTSNFRVQP | 19 | Conserved region |
| 10 | 329 | 363 | FPNITNLCPFGEVFNATRFASVYAWNRKRISNCVA | 35 | Conserved region |
| 11 | 369 | 393 | YNSASFSTFKCYGVSPTKLNDLCFT | 25 | Conserved region |
| 12 | 404 | 426 | GDEVRQIAPGQTGKIADYNYKLP | 23 | Conserved region |
| 13 | 440 | 501 | NLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTN | 62 | Conserved region |
| 14 | 516 | 536 | ELLHAPATVCGPKKSTNLVKN | 21 | Conserved region |
| 15 | 555 | 562 | SNKKFLPF | 8 | Conserved region |
| 16 | 580 | 583 | QTLE | 4 | Non-conserved region |
| 17 | 602 | 606 | TNTSN | 5 | Conserved region |
| 18 | 617 | 632 | CTEVPVAIHADQLTPT | 16 | Conserved region |
| 19 | 635 | 643 | VYSTGSNVF | 9 | Conserved region |
| 20 | 656 | 666 | VNNSYECDIPI | 11 | Conserved region |
| 21 | 672 | 690 | ASYQTQTNSPRRARSVASQ | 19 | Conserved region |
| 22 | 695 | 710 | YTMSLGAENSVAYSNN | 16 | Non-conserved region |
| 23 | 748 | 748 | E | 1 | Conserved region |
| 24 | 773 | 779 | EQDKNTQ | 7 | Non-conserved region |
| 25 | 786 | 800 | KQIYKTPPIKDFGGF | 15 | Non-conserved region |
| 26 | 807 | 814 | PDPSKPSK | 8 | Non-conserved region |
| 27 | 828 | 842 | LADAGFIKQYGDCLG | 15 | Non-conserved region |
| 28 | 988 | 992 | EAEVQ | 5 | Conserved region |
| 29 | 1035 | 1043 | GQSKRVDFC | 9 | Conserved region |
| 30 | 1107 | 1118 | RNFYEPQIITTD | 12 | Conserved region |
| 31 | 1133 | 1172 | VNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGI | 40 | Conserved region |
| 32 | 1203 | 1206 | LGKY | 4 | Conserved region |
| 33 | 1252 | 1267 | SCCKFDEDDSEPVLKG | 16 | Conserved region |
| 34 | 1269 | 1269 | K | 1 | Conserved region |
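The conservancy labels in the table are consistent with a simple interval rule: an epitope counts as non-conserved when its start-end range overlaps any non-conserved position listed in the Figure 1 caption. The sketch below assumes that rule (the source does not state how AVANA results were mapped to the labels), and the function names are hypothetical. Positions are copied exactly as given in the caption.

```python
# Non-conserved positions copied as listed in the Figure 1 caption
# (1-based, inclusive ranges; single positions are stored as (p, p)).
NON_CONSERVED = [
    (74, 74), (149, 149), (249, 249), (398, 398), (513, 513),
    (583, 583), (51, 51), (700, 725), (775, 795), (813, 813),
    (838, 838), (924, 924), (1126, 1126), (1298, 1298),
]


def epitope_length(start: int, end: int) -> int:
    """Length of an epitope given 1-based inclusive coordinates."""
    return end - start + 1


def conservancy(start: int, end: int) -> str:
    """Assumed rule: any overlap with a non-conserved range marks the
    whole epitope as non-conserved."""
    overlaps = any(start <= hi and lo <= end for lo, hi in NON_CONSERVED)
    return "Non-conserved region" if overlaps else "Conserved region"


# Two rows from the table above: epitope 2 (59-81) and epitope 7 (250-260).
print(conservancy(59, 81), epitope_length(59, 81))
print(conservancy(250, 260), epitope_length(250, 260))
```

Under this assumption, epitope 7 (250-260) stays conserved because position 249 lies just outside its range.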
List of class I and class II human leukocyte antigen (HLA) alleles used^a.
| 56 HLA class I alleles possessed by the Indonesian population: HLA-A01:01, HLA-A02:01, HLA-A02:03, HLA-A02:06, HLA-A02:11, HLA-A03:01, HLA-A11:01, HLA-A11:04, HLA-A24:02, HLA-A24:07, HLA-A24:10, HLA-A26:01, HLA-A29:01, HLA-A30:01, HLA-A32:01, HLA-A33:03, HLA-A34:01, HLA-A74:01, HLA-B07:02, HLA-B07:05, HLA-B08:01, HLA-B13:01, HLA-B13:02, HLA-B15:01, HLA-B15:02, HLA-B15:10, HLA-B15:12, HLA-B15:13, HLA-B15:17, HLA-B15:21, HLA-B15:25, HLA-B15:32, HLA-B18:01, HLA-B18:02, HLA-B27:06, HLA-B35:01, HLA-B35:02, HLA-B35:03, HLA-B35:05, HLA-B35:30, HLA-B37:01, HLA-B38:02, HLA-B39:15, HLA-B40:01, HLA-B40:02, HLA-B40:06, HLA-B41:01, HLA-B44:03, HLA-B48:01, HLA-B51:01, HLA-B51:02, HLA-B52:01, HLA-B56:01, HLA-B56:02, HLA-B56:07, HLA-B57:01, HLA-B58:01 |
| 22 HLA class II alleles possessed by the Indonesian population |
Reference:.
Similarity to self-peptides, hydrophobicity, and novelty of candidate epitopes.
| Epitope candidates | Analysis of similarity to self-peptide | GRAVY score | Hydrophobicity | Epitope recency test | Selected epitopes (go to the next step) |
|---|---|---|---|---|---|
| VNLTTRTQL | No similarities in 7/9 or more were found without gaps | −0.200 | Hydrophilic | New epitope (no reference in database) | VNLTTRTQL |
| LTTRTQLPP | Highest similarity: 7/9 without gap | Eliminated because of high similarity to self-peptide | | | Eliminated |
| TRTQLPPAY | No similarities in 7/9 or more were found without gaps | −0.922 | Hydrophilic | New epitope (2021 database reference, SARS-CoV-2-specific) | TRTQLPPAY |
| LVRDLPQGF | Highest similarity: 7/9 without gap | Eliminated because of high similarity to self-peptide | | | Eliminated |
| FTVEKGIYQ | No similarities in 7/9 or more were found without gaps | −0.200 | Hydrophilic | New epitope (no reference in database) | FTVEKGIYQ |
| GIYQTSNFR | No similarities in 7/9 or more were found without gaps | −0.822 | Hydrophilic | New epitope (no reference in database) | GIYQTSNFR |
| FASVYAWNR | No similarities in 7/9 or more were found without gaps | −0.044 | Hydrophilic | New epitope (2020 database reference, SARS-CoV-2-specific) | FASVYAWNR |
| FNATRFASV | No similarities in 7/9 or more were found without gaps | 0.433 | Hydrophobic | Not tested because it is hydrophobic | Eliminated |
| VFNATRFAS | Highest similarity: 7/9 without gap | Eliminated because of high similarity to self-peptide | | | Eliminated |
| TRFASVYAW | No similarities in 7/9 or more were found without gaps | 0.267 | Hydrophobic | Not tested because it is hydrophobic | Eliminated |
| YAWNRKRIS | No similarities in 7/9 or more were found without gaps | −1.456 | Hydrophilic | New epitope (no reference in database) | YAWNRKRIS |
| VYAWNRKRI | No similarities in 7/9 or more were found without gaps | −0.900 | Hydrophilic | New epitope (no reference in database) | VYAWNRKRI |
| ATRFASVYA | No similarities in 7/9 or more were found without gaps | 0.567 | Hydrophobic | Not tested because it is hydrophobic | Eliminated |
| EVFNATRFA | Highest similarity: 7/9 without gap | Eliminated because of high similarity to self-peptide | | | Eliminated |
| FKCYGVSPT | No similarities in 7/9 or more were found without gaps | 0.089 | Hydrophobic | Not tested because it is hydrophobic | Eliminated |
| CYGVSPTKL | Highest similarity: 7/9 without gap | Eliminated because of high similarity to self-peptide | | | Eliminated |
| ASFSTFKCY | No similarities in 7/9 or more were found without gaps | 0.267 | Hydrophobic | Not tested because it is hydrophobic | Eliminated |
| STFKCYGVS | No similarities in 7/9 or more were found without gaps | 0.178 | Hydrophobic | Not tested because it is hydrophobic | Eliminated |
| FERDISTEI | Highest similarity: 7/9 without gap | Eliminated because of high similarity to self-peptide | | | Eliminated |
| LYRLFRKSN | Highest similarity: 7/9 without gap | Eliminated because of high similarity to self-peptide | | | Eliminated |
| IYQAGSTPC | Highest similarity: 7/9 without gap | Eliminated because of high similarity to self-peptide | | | Eliminated |
| CYFPLQSYG | Highest similarity: 7/9 without gap | Eliminated because of high similarity to self-peptide | | | Eliminated |
| YFPLQSYGF | Highest similarity: 7/9 without gap | Eliminated because of high similarity to self-peptide | | | Eliminated |
| YRLFRKSNL | Highest similarity: 7/9 without gap | Eliminated because of high similarity to self-peptide | | | Eliminated |
| FRKSNLKPF | Highest similarity: 7/9 without gap | Eliminated because of high similarity to self-peptide | | | Eliminated |
| YNYLYRLFR | Highest similarity: 7/9 without gap | Eliminated because of high similarity to self-peptide | | | Eliminated |
| YQAGSTPCN | No similarities in 7/9 or more were found without gaps | −0.833 | Hydrophilic | New epitope (no reference in database) | YQAGSTPCN |
| FNCYFPLQS | No similarities in 7/9 or more were found without gaps | 0.133 | Hydrophobic | Not tested because it is hydrophobic | Eliminated |
| DISTEIYQA | Highest similarity: 7/9 without gap | Eliminated because of high similarity to self-peptide | | | Eliminated |
| EIYQAGSTP | Highest similarity: 7/9 without gap | Eliminated because of high similarity to self-peptide | | | Eliminated |
| LLHAPATVC | Highest similarity: 7/9 without gap | Eliminated because of high similarity to self-peptide | | | Eliminated |
| TQTNSPRRA | Highest similarity: 7/9 without gap | Eliminated because of high similarity to self-peptide | | | Eliminated |
| QTNSPRRAR | Highest similarity: 7/9 without gap | Eliminated because of high similarity to self-peptide | | | Eliminated |
| FKNHTSPDV | No similarities in 7/9 or more were found without gaps | −1.133 | Hydrophilic | New epitope (no reference in database) | FKNHTSPDV |
| ELDSFKEEL | Highest similarity: 7/9 without gap | Eliminated because of high similarity to self-peptide | | | Eliminated |
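The GRAVY (grand average of hydropathy) scores in the table can be reproduced by averaging Kyte-Doolittle hydropathy values over the peptide. The sketch below is a minimal illustration, not the tool the authors used (the tool is not named in this section). The function names are hypothetical, and classifying by the sign of the score is an assumption that matches the labels in the table.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(peptide: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues of the peptide."""
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


def hydrophobicity_label(peptide: str) -> str:
    """Assumed rule: a negative GRAVY score is hydrophilic, otherwise hydrophobic."""
    return "Hydrophilic" if gravy(peptide) < 0 else "Hydrophobic"


for pep in ("FKNHTSPDV", "YAWNRKRIS", "FNATRFASV"):
    print(pep, round(gravy(pep), 3), hydrophobicity_label(pep))
```

Recomputing the scores this way is also a quick check on table values whose decimal separator was lost or altered in transcription.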
Antigenicity, allergenicity, and membrane topology of the epitope candidates.
| No. | Epitope candidates | Antigenicity score | Antigenicity | Allergenicity test | Topology | Potential epitope candidates |
|---|---|---|---|---|---|---|
| 1 | TRTQLPPAY | 1.2923 | Antigenic | Non-allergen | Inside | TRTQLPPAY |
| 2 | VNLTTRTQL | 1.3468 | Antigenic | Allergen | Inside | Eliminated |
| 3 | GIYQTSNFR | 0.5380 | Antigenic | Allergen | Inside | Eliminated |
| 4 | FTVEKGIYQ | −0.1987 | Non-antigenic | Allergen | Inside | Eliminated |
| 5 | FASVYAWNR | 0.0713 | Non-antigenic | Allergen | Inside | Eliminated |
| 6 | YAWNRKRIS | 0.8209 | Antigenic | Non-allergen | Inside | YAWNRKRIS |
| 7 | VYAWNRKRI | 0.5003 | Antigenic | Allergen | Inside | Eliminated |
| 8 | FKNHTSPDV | 0.4846 | Antigenic | Non-allergen | Outside | FKNHTSPDV |
| 9 | YQAGSTPCN | 0.4992 | Antigenic | Allergen | Inside | Eliminated |
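The elimination step in this table amounts to keeping candidates that are both antigenic and non-allergenic. The sketch below encodes the table rows and applies that filter. The `Candidate` type and function names are hypothetical, and the antigenic and allergen flags are taken from the table labels rather than recomputed. Treating "Outside" topology as the deciding criterion for the final choice is an assumption, although it agrees with the epitope characterized in the next table.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    peptide: str
    antigenicity_score: float
    antigenic: bool      # label from the table, not recomputed here
    allergen: bool       # label from the table
    topology: str        # "Inside" or "Outside"


CANDIDATES = [
    Candidate("TRTQLPPAY", 1.2923, True, False, "Inside"),
    Candidate("VNLTTRTQL", 1.3468, True, True, "Inside"),
    Candidate("GIYQTSNFR", 0.5380, True, True, "Inside"),
    Candidate("FTVEKGIYQ", -0.1987, False, True, "Inside"),
    Candidate("FASVYAWNR", 0.0713, False, True, "Inside"),
    Candidate("YAWNRKRIS", 0.8209, True, False, "Inside"),
    Candidate("VYAWNRKRI", 0.5003, True, True, "Inside"),
    Candidate("FKNHTSPDV", 0.4846, True, False, "Outside"),
    Candidate("YQAGSTPCN", 0.4992, True, True, "Inside"),
]


def potential_candidates(cands: List[Candidate]) -> List[Candidate]:
    """Keep candidates that are antigenic and not allergenic."""
    return [c for c in cands if c.antigenic and not c.allergen]


def surface_exposed(cands: List[Candidate]) -> List[Candidate]:
    """Assumed final criterion: keep candidates with 'Outside' topology."""
    return [c for c in cands if c.topology == "Outside"]


kept = potential_candidates(CANDIDATES)
print([c.peptide for c in kept])
print([c.peptide for c in surface_exposed(kept)])
```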
Characteristics of the selected epitope candidate (FKNHTSPDV).
| Characteristics | Information | Meaning |
|---|---|---|
| Similarity to self-peptide | No similarities in 7/9 or more were found without gaps | Not similar to self-peptide |
| Hydrophobicity | GRAVY score: −1.133 | Hydrophilic (soluble) |
| Molecular mass | 1044.13 Da | |
| Theoretical pI | 6.74 | |
| Stability | Stable protein | |
| Recency (based on the IEDB) | Not previously studied and not found in the database | New (novel) |
| Antigenicity | 0.4846 | Antigenic |
| Allergenicity | Non-allergen | |
| Topology | Outside | |
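The molecular mass in this table can be checked by summing average residue masses and adding one water molecule for the free termini. The sketch below uses standard average residue masses in daltons. The function name is hypothetical, and the source does not state which tool produced the reported value.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01524  # one H2O for the N- and C-terminal groups


def peptide_mass(peptide: str) -> float:
    """Average molecular mass of a linear, unmodified peptide in daltons."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER_MASS


print(round(peptide_mass("FKNHTSPDV"), 2))
```

For FKNHTSPDV this agrees with the 1044.13 reported in the table to two decimal places.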
Nucleotide sequences encoding the selected epitope and codon optimization results.
| Selected epitope (amino acid sequence) | Reverse-translated nucleotide sequence | Optimized sequence | CAI | GC content (%) |
|---|---|---|---|---|
| FKNHTSPDV | TTCAAGAACCACACCAGCCCCGACGTG | TTCAAAAACCACACTTCTCCGGACGTA | 0.87 | 44.44 |
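Two properties of the sequences in this table are easy to verify: both nucleotide sequences should translate back to the selected epitope, and the GC content of the optimized sequence should match the reported percentage. The sketch below uses the standard genetic code. The function names are hypothetical, and CAI is not recomputed because it depends on a host codon-usage table that this section does not give.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq: str) -> str:
    """Drop whitespace (e.g. 10-base display blocks) and upper-case."""
    return "".join(seq.split()).upper()


def translate(seq: str) -> str:
    """Translate a coding sequence codon by codon ('*' marks a stop)."""
    seq = clean(seq)
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def gc_content(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    seq = clean(seq)
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


reverse_translated = "TTCAAGAACCACACCAGCCCCGACGTG"
optimized = "TTCAAAAACC ACACTTCTCC GGACGTA"

print(translate(reverse_translated), translate(optimized))
print(round(gc_content(optimized), 2))
```

The optimized sequence contains 12 G or C bases out of 27, which gives the 44.44% shown in the table.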
Figure 2. Construction of plasmid pcDNA3.1(+) N-GST (Thrombin)-Epitope of Indonesian SARS-CoV-2. The red part of the plasmid construct is the part designed in this study. An ATG start codon, which encodes the amino acid methionine, was added to the inserted gene.