Luis F Soto1, Ana C Romaní1, Gabriel Jiménez-Avalos2, Yshoner Silva3, Carla M Ordinola-Ramirez3, Rainer M Lopez Lapa3,4, David Requena5.
Abstract
Clostridium perfringens is a dangerous bacterium and a known biological warfare weapon associated with several diseases, whose lethal toxins can produce necrosis in humans. However, there is not yet a safe and fully effective vaccine against C. perfringens for humans. To address this problem, we computationally screened its whole proteome, identifying highly immunogenic proteins, domains, and epitopes. First, we found that the proteins with the highest epitope density are Collagenase A, Exo-alpha-sialidase, alpha-N-acetylglucosaminidase and hyaluronoglucosaminidase, which represent potential recombinant vaccine candidates. Second, we further explored the toxins, finding that the non-toxic domain of Perfringolysin O is enriched in CTL and HTL epitopes. This domain could be used as a subunit vaccine to combat gas gangrene. Third, we designed a multi-epitope protein containing 24 HTL-epitopes and 34 CTL-epitopes from extracellular regions of transmembrane proteins, and analyzed the structural properties of this novel protein using molecular dynamics. Altogether, we present a thorough immunoinformatic exploration of the whole proteome of C. perfringens, as well as promising whole-protein, domain-based and multi-epitope vaccine candidates. These can be evaluated in preclinical trials to assess their immunogenicity and protection against C. perfringens infection.
Keywords: Clostridium perfringens; epitope; immunoinformatics; molecular dynamics; toxin; vaccine
Year: 2022 PMID: 36110855 PMCID: PMC9469472 DOI: 10.3389/fimmu.2022.942907
Source DB: PubMed Journal: Front Immunol ISSN: 1664-3224 Impact factor: 8.786
Figure 1. Flowchart of the immunoinformatic exploration of pathogens for vaccine development. We present three workflows to identify: (A) immunogenic proteins for protein-based vaccines, (B) protein domains enriched in HTL epitopes for subunit vaccines, and (C) nested epitopes for the design of novel multi-epitope protein vaccines. The proteins obtained in these three approaches might be used as well in immunodiagnostic tests.
Figure 4. Selection of epitopes for the design of a multi-epitope construct. (A) An example of the pyramidal order for 5 epitopes, showing how they should be concatenated into a new protein to evaluate the presence of neoepitopes in all the connections in a single prediction step. (B) Allowed connections (without neoepitopes) are represented in the directed graph. The Hamiltonian path (in red) exemplifies a solution containing all the nodes. (C) Graph representing the nested epitopes (nodes) and their allowed connections (edges), selected in this study for the construction of the multi-epitope construct. Epitopes discarded from the design are marked in red.
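The Hamiltonian-path idea in panel (B) can be illustrated with a minimal sketch. It assumes the allowed connections are given as a directed adjacency map and searches by simple backtracking. The node names and the toy graph are hypothetical, and this is not necessarily the procedure the authors used.

```python
from typing import Dict, List, Optional, Set


def hamiltonian_path(allowed: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return one ordering that visits every node exactly once, following only
    allowed directed edges, or None if no such ordering exists."""
    # Collect every node that appears as a source or as a target.
    nodes = sorted(set(allowed) | {t for targets in allowed.values() for t in targets})

    def extend(path: List[str], seen: Set[str]) -> Optional[List[str]]:
        if len(path) == len(nodes):
            return path  # every node has been placed
        for nxt in sorted(allowed.get(path[-1], ())):
            if nxt not in seen:
                found = extend(path + [nxt], seen | {nxt})
                if found is not None:
                    return found
        return None  # dead end: backtrack

    for start in nodes:
        found = extend([start], {start})
        if found is not None:
            return found
    return None


# Hypothetical toy graph: an edge a -> b means "b may follow a
# without creating a neoepitope at the junction".
toy_allowed = {
    "Ep_A": {"Ep_B", "Ep_C"},
    "Ep_B": {"Ep_D"},
    "Ep_C": {"Ep_B"},
    "Ep_D": {"Ep_E"},
    "Ep_E": set(),
}
order = hamiltonian_path(toy_allowed)
print(order)
```

Backtracking is exponential in the worst case, which is acceptable for a graph of about a dozen epitopes but would need a smarter search for much larger designs.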
Figure 10. Results of the C-IMMSIM simulation for 350 days. (A) Schematic illustration of the vaccination trial, with three doses of the multi-epitope vaccine (green box) at days 3, 30 and 60, and the challenge (red box) at day 111. Dynamics of (B) HTL and (C) CTL populations. Memory and non-memory cells are represented with light-blue and green lines, respectively. (D) B cell populations, grouped by immunoglobulin isotype production. (E) Population of plasma B lymphocytes producing IgM, IgG1 and IgG2. (F) Antigen concentration and relative antibody responses. (G) Total population of NK cells. The first, second and third doses were inoculated.
Figure 2. Description of the predicted epitopes in C. perfringens. (A) Correlation between the number of epitopes and protein length. Blue: CTL-epitopes. Red: HTL-epitopes. (B) Correlation between the number of HTL- and CTL-epitopes. (C) Comparison of the number of epitopes per protein, for different epitope lengths of CTL- (8, 9 and 10 aa) and HTL- (15 aa) epitopes. (D) Lasso regression of the number of CTL- (blue) and HTL- (red) epitopes across the proteome of C. perfringens. (E) Correlation between protein length and number of CTL- and HTL-epitopes in the protein (top and bottom, respectively). Proteins with more than 500 CTL- and 200 HTL-epitopes (above the dashed line) are labeled. (F) Bar plot showing the number of CTL- and HTL-epitopes (top and bottom, respectively) by HLA allele, for the four proteins highlighted in panel (E). **** means p-value < 0.0001.
Figure 3. Evaluation of epitopes in C. perfringens toxins. (A) Correlation between the number of HTL-epitopes and protein length. (B) Correlation between HTL-epitope density and protein length. (C) Number of HTL-epitopes within toxins binding each of the HLA-II supertype alleles. (D) Location of the HTL-epitopes along enterotoxin D. Epitopes are colored in a gradient from yellow to red, representing the number of HLA alleles they bind. (E) Number of HTL-epitopes in the whole Perfringolysin O toxin (in gray) and its non-toxic domain (red) that are predicted to bind each of the HLA-II supertype alleles. (F) Structure of Perfringolysin O. The toxic domain is represented in blue and the non-toxic domain in red, with the most promiscuous epitopes highlighted in cyan.
Promiscuous HTL-epitopes in C. perfringens toxins.
| Epitopes | Alleles | Number of Alleles | Proteins |
|---|---|---|---|
| PENIKIIANGKVVVD | DRB1*03:01, DRB1*04:01, DRB1*04:05, DRB1*08:02, DRB1*11:01, DRB1*13:02, DRB3*02:02 | 7 | PHOSPHOLIPASE C |
| PKYIVIHDTDNRQAG | DRB1*03:01, DRB1*04:01, DRB1*04:05, DRB1*08:02, DRB1*13:02, DRB3*01:01 | 6 | ENTEROTOXIN D |
| RKPININIDLPGLKG-NPKYIVIHDTDNRQA | DRB1*03:01, DRB1*04:01, DRB1*04:05, DRB1*08:02, DRB1*13:02, DRB3*02:02 | 6 | PERFRINGOLYSIN O - ENTEROTOXIN D |
| MLEEFKYDPNQQLKS-LEEFKYDPNQQLKSF | DRB1*03:01, DRB1*04:01, DRB1*04:05, DRB1*13:02, DRB3*01:01, DRB3*02:02 | 6 | BETA2 TOXIN |
| LKSFEILNSQKIDNK | DRB1*04:01, DRB1*04:05, DRB1*08:02, DRB1*11:01, DRB1*13:02, DRB3*02:02 | 6 | BETA2 TOXIN |
| KYIVIHDTDNRQAGA | DRB1*03:01, DRB1*04:01, DRB1*04:05, DRB1*13:02, DRB3*01:01 | 5 | ENTEROTOXIN D |
| KRKPININIDLPGLK | DRB1*03:01, DRB1*04:01, DRB1*04:05, DRB1*13:02, DRB3*02:02 | 5 | PERFRINGOLYSIN O |
| EIRKVIKDNATFSTK-IRKVIKDNATFSTKN-NDNINIDLSNSNVAV-EMLEEFKYDPNQQLK-EEFKYDPNQQLKSFE | DRB1*03:01, DRB1*04:01, DRB1*13:02, DRB3*01:01, DRB3*02:02 | 5 | PERFRINGOLYSIN O - PERFRINGOLYSIN O - ENTEROTOXIN A - BETA2 TOXIN - BETA2 TOXIN |
| GEIFNIDGKEGSWYK | DRB1*03:01, DRB1*08:02, DRB1*11:01, DRB1*13:02, DRB3*01:01 | 5 | ENTEROTOXIN B |
| ENIKIIANGKVVVDK-NIKIIANGKVVVDKD | DRB1*03:01, DRB1*08:02, DRB1*11:01, DRB1*13:02, DRB3*02:02 | 5 | PHOSPHOLIPASE C |
| WNEKYSSTHTLPART-NEKYSSTHTLPARTQ-GSNYGVIGTLRNNDK-ASKSYITIVNEGSNN-SKSYITIVNEGSNNG | DRB1*04:01, DRB1*04:05, DRB1*08:02, DRB1*11:01, DRB3*02:02 | 5 | PERFRINGOLYSIN O - PERFRINGOLYSIN O - ENTEROTOXIN D - ENTEROTOXIN D - ENTEROTOXIN D |
| KQGIVKVNSALNMRS-KSFEILNSQKIDNKE | DRB1*04:01, DRB1*04:05, DRB1*08:02, DRB1*13:02, DRB3*02:02 | 5 | ENTEROTOXIN D - BETA2 TOXIN |
Characteristics of the predicted epitopes selected for the construction of the multi-epitope protein.
| Epitope ID | Overlapped epitope | Nested epitope | HLA-II alleles | Protein name | Protein ID | Position | CTL epitope | HLA-I alleles | Conservation |
|---|---|---|---|---|---|---|---|---|---|
| Ep_0 | IDGKEYKIANNALIGEGK | IDGKEYKIANNALIG | DRB1*13:02, DRB3*02:02, DRB1*04:05, DRB1*08:02, DRB1*04:01 | FtsX domain-containing protein | Q8XM39 | 453 | EYKIANNALI | A*24:02 | 160/161 |
| | | | | | | | KEYKIANNAL | B*40:01 | |
| | | | | | | | EYKIANNAL | A*24:02, B*08:01, B*39:01 | |
| | | | | | | | YKIANNALI | B*39:01 | |
| | | | | | | | KEYKIANNA | B*40:01 | |
| | | DGKEYKIANNALIGE | DRB1*13:02, DRB1*11:01, DRB3*02:02, DRB1*04:05, DRB1*08:02, DRB3*01:01, DRB1*04:01 | | | 454 | EYKIANNALI | A*24:02 | |
| | | | | | | | KEYKIANNAL | B*40:01 | |
| | | | | | | | EYKIANNAL | A*24:02, B*08:01, B*39:01 | |
| | | | | | | | YKIANNALI | B*39:01 | |
| | | | | | | | KEYKIANNA | B*40:01 | |
| | | GKEYKIANNALIGEG | DRB1*13:02, DRB1*11:01, DRB3*02:02, DRB1*04:05, DRB1*08:02, DRB3*01:01, DRB1*04:01 | | | 455 | EYKIANNALI | A*24:02 | |
| | | | | | | | KEYKIANNAL | B*40:01 | |
| | | | | | | | EYKIANNAL | A*24:02, B*08:01, B*39:01 | |
| | | | | | | | YKIANNALI | B*39:01 | |
| | | | | | | | KEYKIANNA | B*40:01 | |
| | | KEYKIANNALIGEGK | DRB1*13:02, DRB3*02:02, DRB1*04:05, DRB1*08:02, DRB1*04:01 | | | 456 | EYKIANNALI | A*24:02 | |
| | | | | | | | KEYKIANNAL | B*40:01 | |
| | | | | | | | EYKIANNAL | A*24:02, B*08:01, B*39:01 | |
| | | | | | | | YKIANNALI | B*39:01 | |
| | | | | | | | KEYKIANNA | B*40:01 | |
| Ep_1 | LYEKGFLHAKTIVADSS | LYEKGFLHAKTIVAD | DRB1*11:01, DRB3*02:02, DRB1*04:05, DRB1*08:02, DRB1*04:01 | Cardiolipin synthase | P0C2E2 | 387 | FLHAKTIV | B*08:01 | 89/89 |
| | | | | | | | LHAKTIVA | B*39:01 | |
| | | | | | | | FLHAKTIVA | A*02:01, B*08:01 | |
| | | YEKGFLHAKTIVADS | DRB1*11:01, DRB3*02:02, DRB1*04:05, DRB1*08:02, DRB1*04:01 | | | 388 | FLHAKTIV | B*08:01 | |
| | | | | | | | LHAKTIVA | B*39:01 | |
| | | | | | | | FLHAKTIVA | A*02:01, B*08:01 | |
| | | EKGFLHAKTIVADSS | DRB1*13:02, DRB1*11:01, DRB3*02:02, DRB1*04:05, DRB1*08:02, DRB1*04:01 | | | 389 | FLHAKTIV | B*08:01 | |
| | | | | | | | LHAKTIVA | B*39:01 | |
| | | | | | | | FLHAKTIVA | A*02:01, B*08:01 | |
| Ep_2* | EGKIVVIIDNSPSVIIL | EGKIVVIIDNSPSVI | DRB1*13:02, DRB3*02:02, DRB1*04:05, DRB1*03:01, DRB3*01:01, DRB1*04:01 | Stage V sporulation protein AF | Q8XLQ7 | 245 | VIIDNSPSV | A*02:01, A*26:01 | 85/86 |
| | | | | | | | IIDNSPSVI | A*02:01 | |
| | | GKIVVIIDNSPSVII | DRB1*13:02, DRB3*02:02, DRB1*04:05, DRB1*03:01, DRB3*01:01, DRB1*04:01 | | | 246 | VIIDNSPSV | A*02:01, A*26:01 | |
| | | | | | | | IIDNSPSVI | A*02:01 | |
| | | KIVVIIDNSPSVIIL | DRB1*13:02, DRB3*02:02, DRB1*03:01, DRB3*01:01, DRB1*04:01 | | | 247 | VIIDNSPSV | A*02:01, A*26:01 | |
| | | | | | | | IIDNSPSVI | A*02:01 | |
| | | | | | | | DNSPSVIIL | B*39:01 | |
| Ep_3 | GAERFVLISTDKAVNPT | GAERFVLISTDKAVN | DRB1*11:01, DRB3*02:02, DRB1*04:05, DRB1*08:02, DRB1*04:01 | Polysacc synt 2 domain-containing protein | Q8XN75 | 406 | FVLISTDKAV | A*02:01 | 30/32 |
| | | | | | | | ERFVLISTDK | B*27:05 | |
| | | | | | | | VLISTDKAV | A*02:01 | |
| | | | | | | | AERFVLIST | B*40:01 | |
| | | AERFVLISTDKAVNP | DRB1*13:02, DRB1*11:01, DRB3*02:02, DRB1*04:05, DRB1*08:02, DRB1*04:01 | | | 407 | FVLISTDKAV | A*02:01 | |
| | | | | | | | ERFVLISTDK | B*27:05 | |
| | | | | | | | VLISTDKAV | A*02:01 | |
| | | | | | | | AERFVLIST | B*40:01 | |
| | | ERFVLISTDKAVNPT | DRB1*13:02, DRB1*11:01, DRB1*04:05, DRB1*08:02, DRB1*04:01 | | | 408 | FVLISTDKAV | A*02:01 | |
| | | | | | | | ERFVLISTDK | B*27:05 | |
| | | | | | | | VLISTDKAV | A*02:01 | |
| Ep_4 | IKENEFVVDGSTRLSDL | IKENEFVVDGSTRLS | DRB1*04:01, DRB3*01:01, DRB3*02:02, DRB1*03:01, DRB1*13:02 | Probable hemolysin-related protein | Q8XPD3 | 339 | FVVDGSTRL | A*02:01, A*26:01 | 41/41 |
| | | KENEFVVDGSTRLSD | DRB1*04:01, DRB3*01:01, DRB3*02:02, DRB1*03:01, DRB1*13:02 | | | 340 | FVVDGSTRL | A*02:01, A*26:01 | |
| | | ENEFVVDGSTRLSDL | DRB1*13:02, DRB3*02:02, DRB1*03:01, DRB3*01:01, DRB1*04:01 | | | 341 | FVVDGSTRL | A*02:01, A*26:01 | |
| | | | | | | | DGSTRLSDL | B*08:01 | |
| Ep_5 | RHKDKIYIDTSPVNNLI | RHKDKIYIDTSPVNN | DRB1*04:01, DRB1*04:05, DRB3*01:01, DRB3*02:02, DRB1*03:01, DRB1*13:02 | TraG-D C domain-containing protein | Q93M96 | 158 | KIYIDTSPV | A*02:01 | 70/82 |
| | | HKDKIYIDTSPVNNL | DRB1*13:02, DRB3*02:02, DRB1*04:05, DRB1*03:01, DRB3*01:01, DRB1*04:01 | | | 159 | YIDTSPVNNL | A*02:01 | |
| | | | | | | | KIYIDTSPV | A*02:01 | |
| | | | | | | | IDTSPVNNL | B*40:01 | |
| | | KDKIYIDTSPVNNLI | DRB1*13:02, DRB3*02:02, DRB1*03:01, DRB3*01:01, DRB1*04:01 | | | 160 | YIDTSPVNNL | A*02:01 | |
| | | | | | | | KIYIDTSPV | A*02:01 | |
| | | | | | | | DTSPVNNLI | A*26:01 | |
| | | | | | | | IDTSPVNNL | B*40:01 | |
| Ep_6 | ASATYYIDEDSKIKTA | ASATYYIDEDSKIKT | DRB3*02:02, DRB1*04:05, DRB1*03:01, DRB3*01:01, DRB1*04:01 | FtsX domain-containing protein | Q8XM39 | 331 | ATYYIDEDSK | A*03:01 | 126/128 |
| | | | | | | | TYYIDEDSKI | A*24:02 | |
| | | | | | | | YIDEDSKIK | A*01:01 | |
| | | | | | | | YYIDEDSKI | A*24:02 | |
| | | SATYYIDEDSKIKTA | DRB3*02:02, DRB1*04:05, DRB1*03:01, DRB3*01:01, DRB1*04:01 | | | 332 | ATYYIDEDSK | A*03:01 | |
| | | | | | | | TYYIDEDSKI | A*24:02 | |
| | | | | | | | YIDEDSKIK | A*01:01 | |
| | | | | | | | YYIDEDSKI | A*24:02 | |
| Ep_7* | VPDNIVSNLKPIANKI | VPDNIVSNLKPIANK | DRB1*13:02, DRB1*11:01, DRB3*02:02, DRB1*03:01, DRB1*08:02 | FtsX domain-containing protein | Q8XM39 | 490 | VSNLKPIANK | A*03:01 | 136/147 |
| | | | | | | | VPDNIVSNL | B*07:02, B*08:01, B*39:01 | |
| | | | | | | | SNLKPIANK | A*03:01 | |
| | | PDNIVSNLKPIANKI | DRB1*13:02, DRB1*11:01, DRB3*02:02, DRB1*03:01, DRB1*08:02 | | | 491 | VSNLKPIANK | A*03:01 | |
| | | | | | | | SNLKPIANK | A*03:01 | |
| | | | | | | | NLKPIANKI | B*08:01 | |
| Ep_8 | LDYKFILDTNYIEAKL | LDYKFILDTNYIEAK | DRB3*02:02, DRB1*04:05, DRB1*03:01, DRB3*01:01, DRB1*04:01 | Spore germination protein KA | Q8XMP0 | 191 | FILDTNYIEA | A*02:01 | 42/43 |
| | | | | | | | ILDTNYIEAK | A*03:01 | |
| | | | | | | | ILDTNYIEA | A*03:01 | |
| | | | | | | | KFILDTNYI | A*24:02 | |
| | | | | | | | YKFILDTNY | B*15:01 | |
| | | DYKFILDTNYIEAKL | DRB3*02:02, DRB1*04:05, DRB1*03:01, DRB3*01:01, DRB1*04:01 | | | 192 | FILDTNYIEA | A*02:01 | |
| | | | | | | | ILDTNYIEAK | A*03:01 | |
| | | | | | | | ILDTNYIEA | A*03:01 | |
| | | | | | | | KFILDTNYI | A*24:02 | |
| | | | | | | | DTNYIEAKL | A*26:01 | |
| | | | | | | | YKFILDTNY | B*15:01 | |
| Ep_9 | LDDFITIEKANNSYTF | LDDFITIEKANNSYT | DRB1*13:02, DRB1*11:01, DRB3*02:02, DRB1*08:02, DRB1*04:01 | Cardiolipin synthase | Q8XP94 | 265 | ITIEKANNSY | A*01:01, A*26:01, B*15:01, B*58:01 | 114/115 |
| | | | | | | | TIEKANNSY | A*01:01, A*26:01, B*15:01 | |
| | | DDFITIEKANNSYTF | DRB1*13:02, DRB1*11:01, DRB3*02:02, DRB1*08:02, DRB1*04:01 | | | 266 | ITIEKANNSY | A*01:01, A*26:01, B*15:01, B*58:01 | |
| | | | | | | | IEKANNSYTF | B*40:01 | |
| | | | | | | | KANNSYTF | B*58:01 | |
| | | | | | | | TIEKANNSY | A*01:01, A*26:01, B*15:01 | |
| Ep_10 | SDNDYVIVNTEGGEFD | SDNDYVIVNTEGGEF | DRB1*11:01, DRB3*02:02, DRB1*04:05, DRB1*08:02, DRB1*04:01 | UPF0182 protein CPE0011 | Q8XPF2 | 461 | VIVNTEGGEF | B*15:01 | 173/174 |
| | | | | | | | IVNTEGGEF | A*26:01, B*15:01 | |
| | | DNDYVIVNTEGGEFD | DRB1*13:02, DRB1*11:01, DRB3*02:02, DRB1*04:05, DRB1*08:02, DRB1*04:01 | | | 462 | VIVNTEGGEF | B*15:01 | |
| | | | | | | | IVNTEGGEF | A*26:01, B*15:01 | |
Epitopes “2” and “7” (with asterisk) were not included in the final design. Conservation is represented as the number of sequences where the epitope is conserved over the total number of sequences analyzed.
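As a reading aid for the table, the sketch below checks two properties on a few rows copied from it: that each HLA-I (CTL) epitope is contained within its 15-mer HLA-II epitope, which is how "nested" is interpreted here, and how a conservation ratio such as 70/82 translates to a percentage. The function names are illustrative and not taken from the study.

```python
def is_nested(ctl_epitope: str, htl_epitope: str) -> bool:
    """True if the shorter CTL epitope occurs inside the HTL epitope."""
    return ctl_epitope in htl_epitope


def conservation_percent(ratio: str) -> float:
    """Convert a 'conserved/total' string from the table into a percentage."""
    conserved, total = (int(part) for part in ratio.split("/"))
    return 100.0 * conserved / total


# (epitope ID, nested HTL epitope, one CTL epitope, conservation) from the table.
rows = [
    ("Ep_0", "IDGKEYKIANNALIG", "EYKIANNALI", "160/161"),
    ("Ep_1", "LYEKGFLHAKTIVAD", "FLHAKTIV", "89/89"),
    ("Ep_5", "RHKDKIYIDTSPVNN", "KIYIDTSPV", "70/82"),
]

for name, htl, ctl, ratio in rows:
    print(name, is_nested(ctl, htl), round(conservation_percent(ratio), 1))
```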
Figure 5. Amino acid sequence, epitope sorting and AlphaFold2 3D model of the multi-epitope constructs MEP_6 (A) and MEP_12 (B). The confidence value (pLDDT) is categorized in 4 groups (orange, yellow, cyan, and blue), representing their percentage in pie charts.
Figure 6. Convergence analysis of the MD simulation. Cα-RMSD values and radius of gyration (Rg) of MEP_6 (A, B) and MEP_12 (C, D) during a simulation of 1.2 μs. Inset plots in (C, D) show the last 500 ns, where the RMSD and Rg converged. (E) Coverage (%) of the five most populated clusters obtained from the RMSD-based clustering. (F) Structural alignment between the AlphaFold2 model (gray) and the centroid of the most populated cluster (in orange) of MEP_12.
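For readers unfamiliar with the two convergence metrics, the following sketch shows how an RMSD and a radius of gyration can be computed from Cα coordinates. It assumes the two frames are already superimposed and treats all atoms as having equal mass. MD analysis tools normally handle the alignment and mass weighting, so this is an illustration of the definitions and not the analysis pipeline used in the study. The coordinates are a hypothetical two-atom toy.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(frame: Sequence[Coord], reference: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two pre-aligned coordinate sets."""
    if len(frame) != len(reference) or not frame:
        raise ValueError("coordinate sets must be non-empty and equally long")
    squared = sum(
        (p - q) ** 2 for u, v in zip(frame, reference) for p, q in zip(u, v)
    )
    return math.sqrt(squared / len(frame))


def radius_of_gyration(coords: Sequence[Coord]) -> float:
    """Unweighted radius of gyration: RMS distance from the geometric center."""
    if not coords:
        raise ValueError("coordinate set must be non-empty")
    n = len(coords)
    center = tuple(sum(c[i] for c in coords) / n for i in range(3))
    squared = sum((c[i] - center[i]) ** 2 for c in coords for i in range(3))
    return math.sqrt(squared / n)


# Hypothetical toy coordinates (in arbitrary length units).
reference = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (2.0, 0.0, 1.0)]

print(rmsd(frame, reference))         # every atom is displaced by 1 unit
print(radius_of_gyration(reference))  # both atoms sit 1 unit from the center
```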
Figure 7. Quality assessment of the refined centroid of the most populated cluster of MEP_12. (A) Ramachandran plot of the MEP_12 centroid, indicating the percentage of residues in favored (light blue) and allowed (blue) regions. (B) Scatterplot of the Z-scores of the MEP_12 centroid (black dot) and structures with experimental evidence obtained from NMR (blue) and X-ray crystallography (light blue). (C) ERRAT plot of the MEP_12 centroid. Bars represent the error value (white: error < 95%, yellow: 95% < error < 99%, red: error > 99%) of a nine-residue sliding window. The overall quality factor indicates the percentage of protein residues with error values lower than 95%.
Figure 8. Structural characteristics of MEP_12. (A) Alignment of the 3D structures of MEP_12 from all the centroids of the five most populated clusters. Cyan, purple, orange, green, and pink cartoons correspond to the centroids of clusters 1, 2, 3, 4 and 5, respectively. The twelve β-strands in the structure (left) are represented as arrows in the sequence (right). (B) Violin plots showing the distribution of Cα-RMSF values. The lowest Cα-RMSF distributions are colored in green. (C) Mean SASA by epitope. The error bars represent 95% confidence intervals. (D) Modeled structure of MEP_12, showing epitopes “3” and “6” in red and green, respectively.
Figure 9. Protein docking of MEP_12 against TLR1/TLR2 and TLR4/MD-2. Bar plots summarizing the dockings (A) MEP_12-TLR1/TLR2 and (C) MEP_12-TLR4/MD-2. The number of structures and the mean HADDOCK score by cluster are shown in orange and blue, respectively. Whiskers in the mean HADDOCK score bars represent one standard deviation. Structural representation of the most favorable binding mode of the selected cluster for dockings (B) MEP_12-TLR1/TLR2 and (D) MEP_12-TLR4/MD-2, showing the binding energy computed with PRODIGY. MEP_12 is colored in cyan; TLR1 and TLR4 in green; and TLR2 and MD-2 in orange. Glycosylated residues and attached glycans (cyan) are shown as sticks, and non-carbon atoms are colored following the CPK convention. The inset plots show a close-up of the residues involved in polar interactions (as cyan and green sticks). Hydrogen bonds and salt bridges are represented by blue and red lines, respectively, connecting the interacting atoms, which are labeled with amino acid and position.