M Shaminur Rahman, M Nazmul Hoque, M Rafiul Islam, Salma Akter, A S M Rubayet Ul Alam, Mohammad Anwar Siddique, Otun Saha, Md Mizanur Rahaman, Munawar Sultana, Keith A Crandall, M Anwar Hossain.
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the etiologic agent of the ongoing pandemic of coronavirus disease 2019 (COVID-19), declared a public health emergency of international concern by the World Health Organization (WHO). An immuno-informatics approach along with comparative genomics was applied to design a multi-epitope-based peptide vaccine against SARS-CoV-2 combining the antigenic epitopes of the S, M, and E proteins. The tertiary structure was predicted, refined and validated using advanced bioinformatics tools. The candidate vaccine showed an average of ≥90.0% world population coverage for different ethnic groups. Molecular docking and dynamics simulation of the chimeric vaccine with the immune receptors (TLR3 and TLR4) predicted efficient binding. Immune simulation predicted a significant primary immune response with increased IgM and a secondary immune response with high levels of both IgG1 and IgG2. It also predicted increased proliferation of T-helper cells and cytotoxic T-cells, along with increased IFN-γ and IL-2 cytokines. Codon optimization and mRNA secondary structure prediction indicated that the chimera is suitable for high-level expression and cloning. Overall, the constructed recombinant chimeric vaccine candidate demonstrated significant potential and can be considered for clinical validation to fight against this global threat, COVID-19. ©2020 Rahman et al.
Keywords: B-cell Epitope; Chimeric Peptide Vaccine; Multi-epitope; SARS-CoV-2; T-cell Epitope
Year: 2020 PMID: 33194329 PMCID: PMC7394063 DOI: 10.7717/peerj.9572
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Linear epitopes on the surface of the spike (S) glycoprotein, predicted with ElliPro in the IEDB analysis resource on the basis of solvent accessibility and flexibility, are shown with their antigenicity scores.
The regions highlighted in green are the potential antigenic domains, while the region highlighted in yellow is the transmembrane domain of the S protein.
| No. | Chain | Start | End | Peptide | Number of residues | Score |
|---|---|---|---|---|---|---|
| 1 | A | 395 | 514 | VYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLS | 120 | 0.837 |
| 2 | A | 58 | 194 | FFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVF | 137 | 0.835 |
| 3 | A | 1067 | 1146 | YVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELD | 80 | 0.83 |
| 4 | A | 201 | 270 | FKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYL | 70 | 0.76 |
| 5 | A | 331 | 381 | NITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYG | 51 | 0.706 |
| 6 | A | 700 | 720 | GAENSVAYSNNSIAIPTNFTI | 21 | 0.668 |
| 7 | A | 27 | 35 | AYTNSFTRG | 9 | 0.66 |
| 8 | A | 909 | 936 | IGVTQNVLYENQKLIANQFNSAIGKIQD | 28 | 0.633 |
| 9 | A | 789 | 813 | YKTPPIKDFGGFNFSQILPDPSKPS | 25 | 0.6 |
| 10 | A | 623 | 642 | AIHADQLTPTWRVYSTGSNV | 20 | 0.598 |
| 11 | A | 891 | 907 | GAALQIPFAMQMAYRFN | 17 | 0.591 |
| 12 | A | 579 | 583 | PQTLE | 5 | 0.551 |
| 13 | A | 687 | 692 | VASQSI | 6 | 0.55 |
| 14 | A | 653 | 659 | AEHVNNS | 7 | 0.539 |
| 15 | A | 679 | 684 | NSPRRA | 6 | 0.521 |
| 16 | B | 1067 | 1146 | YVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELD | 80 | 0.826 |
| 17 | B | 89 | 194 | GVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVF | 106 | 0.816 |
| 18 | B | 58 | 87 | FFSNVTWFHAIHVSGTNGTKRFDNPVLPFN | 30 | 0.81 |
| 19 | B | 203 | 270 | IYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYL | 68 | 0.748 |
| 20 | B | 465 | 509 | ERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPYR | 45 | 0.727 |
| 21 | B | 436 | 458 | WNSNNLDSKVGGNYNYLYRLFRK | 23 | 0.672 |
| 22 | B | 700 | 720 | GAENSVAYSNNSIAIPTNFTI | 21 | 0.671 |
| 23 | B | 27 | 35 | AYTNSFTRG | 9 | 0.666 |
| 24 | B | 909 | 936 | IGVTQNVLYENQKLIANQFNSAIGKIQD | 28 | 0.641 |
| 25 | B | 624 | 643 | IHADQLTPTWRVYSTGSNVF | 20 | 0.617 |
| 26 | B | 328 | 365 | RFPNITNLCPFGEVFNATRFASVYAWNRKRISNCVADY | 38 | 0.608 |
| 27 | B | 891 | 907 | GAALQIPFAMQMAYRFN | 17 | 0.602 |
| 28 | B | 577 | 583 | RDPQTLE | 7 | 0.598 |
| 29 | B | 790 | 817 | KTPPIKDFGGFNFSQILPDPSKPSKRSF | 28 | 0.595 |
| 30 | B | 673 | 693 | SYQTQTNSPRRARSVASQSII | 21 | 0.567 |
| 31 | B | 526 | 537 | GPKKSTNLVKNK | 12 | 0.553 |
| 32 | B | 653 | 661 | AEHVNNSYE | 9 | 0.548 |
| 33 | B | 554 | 563 | ESNKKFLPFQ | 10 | 0.52 |
| 34 | C | 56 | 194 | LPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVF | 139 | 0.84 |
| 35 | C | 1067 | 1146 | YVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELD | 80 | 0.822 |
| 36 | C | 201 | 270 | FKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYL | 70 | 0.77 |
| 37 | C | 27 | 35 | AYTNSFTRG | 9 | 0.676 |
| 38 | C | 465 | 509 | ERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPYR | 45 | 0.675 |
| 39 | C | 700 | 720 | GAENSVAYSNNSIAIPTNFTI | 21 | 0.658 |
| 40 | C | 909 | 936 | IGVTQNVLYENQKLIANQFNSAIGKIQD | 28 | 0.633 |
| 41 | C | 437 | 458 | NSNNLDSKVGGNYNYLYRLFRK | 22 | 0.629 |
| 42 | C | 673 | 684 | SYQTQTNSPRRA | 12 | 0.619 |
| 43 | C | 790 | 817 | KTPPIKDFGGFNFSQILPDPSKPSKRSF | 28 | 0.602 |
| 44 | C | 891 | 907 | GAALQIPFAMQMAYRFN | 17 | 0.597 |
| 45 | C | 578 | 583 | DPQTLE | 6 | 0.589 |
| 46 | C | 620 | 631 | VPVAIHADQLTP | 12 | 0.579 |
| 47 | C | 329 | 362 | FPNITNLCPFGEVFNATRFASVYAWNRKRISNCV | 34 | 0.571 |
| 48 | C | 687 | 692 | VASQSI | 6 | 0.566 |
| 49 | C | 835 | 845 | KQYGDCLGDIA | 11 | 0.567 |
| 50 | C | 653 | 659 | AEHVNNS | 7 | 0.559 |
| 51 | C | 527 | 536 | PKKSTNLVKN | 10 | 0.546 |
| 52 | C | 635 | 642 | VYSTGSNV | 8 | 0.51 |
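Each row of the epitope table pairs an inclusive start and end position with a peptide and a residue count, so the three values can be cross-checked against one another. The following is a minimal sketch of such a check, assuming 1-based inclusive positions; the function names are illustrative and not part of ElliPro or the IEDB tools, and only a handful of rows are copied in as sample data.

```python
def span_length(start: int, end: int) -> int:
    """Number of residues in an inclusive, 1-based span."""
    return end - start + 1


# (start, end, peptide) for a few rows copied from the table above.
SAMPLE_ROWS = [
    (27, 35, "AYTNSFTRG"),
    (700, 720, "GAENSVAYSNNSIAIPTNFTI"),
    (909, 936, "IGVTQNVLYENQKLIANQFNSAIGKIQD"),
    (891, 907, "GAALQIPFAMQMAYRFN"),
]


def inconsistent_rows(rows):
    """Return the rows whose span does not match the peptide length."""
    return [
        (start, end, peptide)
        for start, end, peptide in rows
        if span_length(start, end) != len(peptide)
    ]


if __name__ == "__main__":
    # An empty list means every sampled row is internally consistent.
    print(inconsistent_rows(SAMPLE_ROWS))
```

A check of this kind flags rows in which a start or end position was mistyped, because the span then no longer matches the peptide length.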
Figure 1. The three-dimensional (3D) structure of the N-terminal domains (NTDs) and receptor-binding domains (RBDs) of the spike (S) proteins of SARS-CoV-2 (surface view).
The orange, cyan, and yellow regions represent the potential antigenic domains predicted by ElliPro in the IEDB analysis resource.
Figure 2. Predicted B-cell epitopes using the BepiPred-2.0 epitope predictor in the IEDB analysis resource web-based repository.
Yellow areas above the threshold (red line) are proposed to be part of B-cell epitopes in the (a) RBD and (b) NTD regions of the S protein, and in the (c) envelope (E) and (d) membrane (M) proteins of SARS-CoV-2.
B-cell epitopes predicted using BepiPred linear epitope prediction 2.0 in the IEDB analysis resource web server, along with their start and end positions, average scores, and antigenicity scores determined by VaxiJen 2.0.
| Position | Sequence | Average score | Antigenicity (VaxiJen 2.0) |
|---|---|---|---|
| 341-342 | VF | 0.502 | – |
| 344-349 | ATRFAS | 0.520 | −0.151 |
| 351-363 | YAWNRKRISNCVA | 0.522 | 0.394 |
| 372-378 | ASFSTFK | 0.527 | 0.087 |
| 382 | V | 0.464 | – |
| 402-427 | IRGDEVRQIAPGQTGKIADYNYKLPD | 0.575 | 0.932 |
| 440-485 | NLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEG | 0.554 | 0.210 |
| 493-516 | QSYGFQPTNGVGYQ | 0.535 | 0.670 |
| 72-81 | GTNGTKRFDN | 0.573 | 0.667 |
| 110-113 | LDSK | 0.511 | – |
| 146-155 | HKNNKSWMES | 0.573 | 0.174 |
| 161-162 | SS | 0.503 | – |
| 164 | N | 0.499 | – |
| 172-191 | SQPFLMDLEGKQGNFKNLRE | 0.553 | 0.749 |
| 199-218 | YRIGNYKLNTDHSSSSDNIA | 0.614 | 0.222 |
| 57-71 | YVYSRVKNLNSSRVP | 0.565 | 0.449 |
Figure 3. Design, construction and structural validation of the multi-epitope vaccine candidate (CoV-RMEN) for SARS-CoV-2.
(A) Structural domains and epitope rearrangement of CoV-RMEN, (B) secondary structure of CoV-RMEN as analyzed with the CFSSP (Chou and Fasman secondary structure prediction) server, (C) final tertiary structure of CoV-RMEN (surface view) obtained from homology modelling on Phyre2, in which domains and epitopes are represented in different colors (PADRE-smudge; membrane B-cell epitope, MBE-magenta; N-terminal domain, NTD-orange; receptor-binding domain, RBD-cyan; envelope B-cell epitope, EBE-blue; invasin-yellow), (D) validation of the refined model with Ramachandran plot analysis showing 94.7%, 4.8% and 0.5% of protein residues in favored, allowed, and disallowed (outlier) regions, respectively, (E) validation with ProSA-web, giving a Z-score of −6.17, and (F) the final predicted primary structure of CoV-RMEN.
Figure 4. Molecular docking of the top five MHC-I and MHC-II epitopes of the RBD and NTD domains with their respective HLA allele binders.
(A–E) and (K–O) represent the top five MHC-I epitopes of the RBD and NTD domains, respectively. (F–J) and (P–T) represent the top five MHC-II epitopes of the same domains bound to their respective HLA alleles. The protein-peptide docking was performed on the GalaxyWEB GalaxyPepDock server, followed by refinement using GalaxyRefineComplex, and the free energy (ΔG) of each complex was determined on the PRODIGY server. Ribbon structures represent HLA alleles and stick structures represent the respective epitopes. Light color represents the templates on which the allele and epitope structures were built. Further information on the molecular docking analysis is available in Data S1.
Figure 5. Molecular docking and dynamics of the CoV-RMEN vaccine with immune receptors (TLR2, TLR3 and TLR4).
Docked complexes for (A) CoV-RMEN and TLR2, (B) CoV-RMEN and TLR3, and (C) CoV-RMEN and TLR4. Magnified interfaces of the respective complexes are shown in (D), (E) and (F). Active residues of CoV-RMEN are colored magenta and those of the TLRs orange, in stick view. ΔG represents the binding affinity of the complexes. Molecular dynamics simulation of the (G) CoV-RMEN and TLR2, (H) CoV-RMEN and TLR3, and (I) CoV-RMEN and TLR4 complexes across a time window of 100 ps. The reasonably invariable RMSD value indicates stable complex formation.
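The RMSD traces in panels (G) to (I) report how far the atoms of each complex drift from a reference structure over the simulation. As a minimal sketch of that quantity, assuming two equally long lists of Cartesian coordinates that have already been superimposed (no alignment step is performed here), the calculation can be written as follows; the function name and the toy coordinates are illustrative and are not taken from the study.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-aligned coordinate sets.

    Each argument is a sequence of (x, y, z) tuples with atoms in the
    same order. No superposition is performed.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Toy example: two atoms, each displaced by 1 unit along x.
    reference = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
    frame = [(1.0, 0.0, 0.0), (2.0, 1.0, 1.0)]
    print(rmsd(reference, frame))
```

Computing this value for every frame of a trajectory against the starting structure gives a trace of the kind plotted in the figure; a trace that stays roughly flat is what the caption describes as a stable complex.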
Active interface amino acid residues and binding scores among Toll-like receptors (TLRs) and the constructed vaccine CoV-RMEN.
| Active residues of TLR | Active residues of CoV-RMEN | Binding score | ΔG (kcal mol−1) |
|---|---|---|---|
| V536, C537, S538, C539, E540, S543, E547, P567, R569, L570 | D72, Y75, L101, I103, I112, C150, F152, E153, Y154, V155, S156, F176, F178, R181, F182, L371 | −30.4 | −9.0 |
| D36, H39, K41, R643, F644, P646, F647, T650, C651, E652, S653, I654, W656, F657, V658, N659, W660, I661, N662, E663 | F43, S44, N45, V46, T47, W48, D72, Y75, F76, L101, I103, I112, F152, E153, Y154, V155, S156, Q157, F159, F178, R181, F182, L371 | −47.2 | −14.9 |
| P53, F54, S55, H68, G70, Y72, S73, F75, S76, Q99, S102, G124 | G39, L40, D72, Y75, F76, F90, L101, I103, I112, F152, Y154, S156, Q157, F159, R174, E175, F176, F178, R181, F182, P183, L371, P374, P380, G381 | −52.1 | −16.0 |
Figure 6. C-ImmSim presentation of an in silico immune simulation with the chimeric peptide.
(A) The immunoglobulins and the immunocomplex response to antigen (CoV-RMEN) inoculations (black vertical lines); specific subclasses are indicated as colored peaks, (B) concentration of cytokines and interleukins, and inset plot shows danger signal together with leukocyte growth factor IL-2, (C) B-cell populations after three injections, (D) evolution of B cell, (E) T-helper cell populations per state after injections, and (F) evolution of T-helper cell classes with the course of vaccination.
Figure 7. Population coverage of the selected T-cell epitopes and their respective HLA alleles.
The circular plots illustrate the relative abundance of the top 70 geographic regions and ethnic groups for the selected CTL and HTL epitopes that were used to construct the vaccine. Their corresponding MHC HLA alleles were used for population coverage analysis both individually (either MHC-I or MHC-II) and in combination (MHC-I and MHC-II). (A) Population coverage of the top seventy geographical regions out of 123 regions. (B) Population coverage of the top seventy ethnic groups selected from 146 ethnic groups. Regions and ethnic groups in the respective MHC-I and MHC-II epitopes are represented by different colored ribbons, and the inner blue bars indicate their respective relative coverages. Further information on the population coverage analysis is available in Data S1.
Figure 8. Codon optimization and mRNA structure of the CoV-RMEN gene for expression in E. coli.
(A) GC curve (average GC content: 50.26%) of the optimized CoV-RMEN gene, (B) percentage distribution of codons in computed codon quality groups, (C) relative distribution of codon usage frequency along the gene sequence to be expressed in E. coli, with a codon adaptation index (CAI) of 0.87 for the desired gene, (D) secondary structure and stability of the corresponding mRNA, and (E) resolved view of the start region in the mRNA structure of CoV-RMEN.
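The caption reports two summary statistics for the optimized gene: average GC content and the codon adaptation index. The sketch below shows how both quantities are commonly defined, with GC content as the percentage of G and C bases and CAI as the geometric mean of per-codon relative adaptiveness weights. It is an illustration under stated assumptions, not the tool used in the study: the function names are hypothetical, and the weight table in the usage example is a toy placeholder rather than real E. coli codon usage data.

```python
import math


def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def codon_adaptation_index(seq: str, weights: dict) -> float:
    """Geometric mean of relative adaptiveness weights over the codons.

    `weights` maps codons to values in (0, 1]. Codons missing from the
    table are skipped, as is any trailing partial codon.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no codons with known weights")
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    # Toy weights for illustration only (not real E. coli values).
    toy_weights = {"AAA": 1.0, "AAG": 0.25}
    print(gc_content("ATGC"))
    print(codon_adaptation_index("AAAAAG", toy_weights))
```

With a real host-specific weight table, a CAI close to 1 indicates that the gene mostly uses the codons the host prefers.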
Figure 9. In silico fusion cloning of CoV-RMEN.
The final vaccine candidate sequence was inserted into the pETite expression vector, where the red part represents the gene coding for the predicted vaccine and the black circle represents the vector backbone. The six-His tag and SUMO tag are located at the carboxy-terminal end.