Dina Schneidman-Duhovny, Natalia Khuri, Guang Qiang Dong, Michael B Winter, Eric Shifrut, Nir Friedman, Charles S Craik, Kathleen P Pratt, Pedro Paz, Fred Aswad, Andrej Sali.
Abstract
Accurate predictions of T-cell epitopes would be useful for designing vaccines, immunotherapies for cancer and autoimmune diseases, and improved protein therapies. The humoral immune response involves uptake of antigens by antigen presenting cells (APCs), APC processing and presentation of peptides on MHC class II (pMHCII), and T-cell receptor (TCR) recognition of pMHCII complexes. Most in silico methods predict only peptide-MHCII binding, resulting in significant over-prediction of CD4 T-cell epitopes. We present a method, ITCell, for prediction of T-cell epitopes within an input protein antigen sequence for given MHCII and TCR sequences. The method integrates information about three stages of the immune response pathway: antigen cleavage, MHCII presentation, and TCR recognition. First, antigen cleavage sites are predicted based on the cleavage profiles of cathepsins S, B, and H. Second, for each 12-mer peptide in the antigen sequence we predict whether it will bind to a given MHCII, based on the scores of modeled peptide-MHCII complexes. Third, we predict whether or not any of the top scoring peptide-MHCII complexes can bind to a given TCR, based on the scores of modeled ternary peptide-MHCII-TCR complexes and the distribution of predicted cleavage sites. Our benchmarks consist of epitope predictions generated by this algorithm, checked against 20 peptide-MHCII-TCR crystal structures, as well as epitope predictions for four peptide-MHCII-TCR complexes with known epitopes and TCR sequences but without crystal structures. ITCell successfully identified the correct epitopes as one of the 20 top scoring peptides for 22 of 24 benchmark cases. To validate the method using a clinically relevant application, we utilized five factor VIII-specific TCR sequences from hemophilia A subjects who developed an immune response to factor VIII replacement therapy. 
The known HLA-DR1-restricted factor VIII epitope was among the six top-scoring factor VIII peptides predicted by ITCell to bind HLA-DR1 and all five TCRs. Our integrative approach is more accurate than current single-stage epitope prediction algorithms applied to the same benchmarks. It is freely available as a web server (http://salilab.org/itcell).
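The three stages described in the abstract can be illustrated with a minimal Python sketch. It is not the ITCell implementation: `cleavage_ok`, `mhc_score`, and `tcr_score` are hypothetical placeholders for the cleavage predictor and for the scores of the modeled complexes. Using cleavage as a pre-filter, treating lower scores as better, and the default cutoff of 20 are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Peptide = Tuple[int, str]  # (start index in the antigen, peptide sequence)


def enumerate_kmers(sequence: str, k: int = 12) -> List[Peptide]:
    """Return every k-residue window of the antigen with its start index."""
    return [(i, sequence[i:i + k]) for i in range(len(sequence) - k + 1)]


def rank_candidates(
    sequence: str,
    cleavage_ok: Callable[[int, str], bool],  # hypothetical stage-1 filter
    mhc_score: Callable[[str], float],        # hypothetical stage-2 score
    tcr_score: Callable[[str], float],        # hypothetical stage-3 score
    top_n_mhc: int = 20,
    k: int = 12,
) -> List[Peptide]:
    """Filter by cleavage, keep the best MHCII binders, then order by TCR score.

    Assumption: lower scores are better for both scoring callables.
    """
    peptides = [p for p in enumerate_kmers(sequence, k) if cleavage_ok(*p)]
    best_mhc = sorted(peptides, key=lambda p: mhc_score(p[1]))[:top_n_mhc]
    return sorted(best_mhc, key=lambda p: tcr_score(p[1]))
```

Passing the scoring functions in as callables keeps the sketch independent of any particular modeling or scoring software.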
Year: 2018 PMID: 30399156 PMCID: PMC6219782 DOI: 10.1371/journal.pone.0206654
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Overview of the 3 steps of antigen processing that are modeled using the current approach.
Fig 2. A and B) Cleavage profiles for cathepsins H, S, and B. A) iceLogo representations of substrate specificity. Residues above the line are favored at a given position; residues below the line are disfavored. Statistically significant residues (p < 0.05) are colored according to their physicochemical properties. B) Heat map representations of residue preference, clustered and colored by Z-score at each position. Favored residues have Z > 0 and disfavored residues have Z < 0. Cleavage profiles are provided from P1-P4’ for cathepsin H, P4-P4’ for cathepsin S, and P4-P2’ for cathepsin B, based on their predominant experimentally determined cleavage preferences. C) Percentage of 12-mer peptides in the benchmark proteins that remain after filtering by the cleavage site predictor.
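One way a per-position Z-score profile of this kind could be used to score a candidate cleavage site is to sum the Z-scores of the residues flanking the scissile bond. The sketch below assumes exactly that; the source does not state how ITCell combines the profiles. `TOY_PROFILE` and its values are invented for illustration and do not correspond to any cathepsin. It follows the standard convention that P1 is the residue immediately N-terminal to the cleaved bond and P1' the residue immediately C-terminal to it.

```python
from typing import Dict, List, Optional

# Invented example values: position label -> {residue: Z-score}.
TOY_PROFILE: Dict[str, Dict[str, float]] = {
    "P2": {"L": 2.0, "P": -1.5},
    "P1": {"K": 1.0, "D": -2.0},
    "P1'": {"A": 0.5},
    "P2'": {"E": 0.8, "P": -1.0},
}
POSITIONS: List[str] = ["P2", "P1", "P1'", "P2'"]  # N- to C-terminal order


def site_score(
    sequence: str,
    bond_index: int,
    profile: Dict[str, Dict[str, float]] = TOY_PROFILE,
    positions: List[str] = POSITIONS,
) -> Optional[float]:
    """Sum Z-scores around the bond between residues bond_index-1 and bond_index.

    Residues absent from the profile contribute 0. Returns None when the
    window would extend past either end of the sequence.
    """
    n_before = sum(1 for p in positions if not p.endswith("'"))
    start = bond_index - n_before
    if start < 0 or start + len(positions) > len(sequence):
        return None
    window = sequence[start:start + len(positions)]
    return sum(profile[pos].get(res, 0.0) for pos, res in zip(positions, window))
```

A positive total would indicate a window dominated by favored residues (Z > 0), and a negative total one dominated by disfavored residues.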
Benchmark of epitope predictions based on crystal structures.
The epitope core residues within the peptides are bolded.
| PDB | Protein | Confirmed epitope peptide | Antigen length | MHCII type | TCR | ITCell Rank | Rank MHC | Rank TCR | Rank NetMHCIIpan (core) | Combined rank |
|---|---|---|---|---|---|---|---|---|---|---|
| 1j8h | HEMAGGLUTININ HA1 | | 317 | DRB1*0401 | HA1.7 | 1 | 4 | 3 | 1 | 1 |
| 1fyt | HEMAGGLUTININ HA1 | | 317 | DRB1*0101 | HA1.7 | 1 | 7 | 1 | 12 | 1 |
| 4e41 | TRIOSEPHOSPHATE ISOMERASE | G | 249 | DRB1*0101 | G4 | 3 | 1 | 72 | 2 | 1 |
| 2iam | TRIOSEPHOSPHATE ISOMERASE | G | 249 | DRB1*0101 | E8 | 1 | 1 | 16 | 2 | 1 |
| 2ian | TRIOSEPHOSPHATE ISOMERASE | G | 249 | DRB1*0101 | E8 | 1 | 1 | 9 | 2 | 1 |
| 1ymm | MYELIN BASIC PROTEIN | EN | 169 | DRB1*1501 | OB.1A12 | 13 | 5 | 61 | 1 | 1 |
| 2wbj | ENGA | DF | 304 | DRB1*1501 | OB.1A12 | 254 | 155 | 288 | 1 | 3 |
| 3pl6 | MYELIN BASIC PROTEIN | N | 169 | DQA1*0102-DQB1*0502 | Hy.1B11 | 1 | 2 | 4 | 1 | 1 |
| 4grl | PHOSPHOMANNOMUTASE | R | 463 | DQA1*0102-DQB1*0502 | Hy.1B11 | 20 | 12 | 74 | 6 | 7 |
| 4may | UL15 | | 735 | DQA1*0102-DQB1*0502 | Hy.1B11 | 3 | 3 | 24 | 10 | 1 |
| 3o6f | MYELIN BASIC PROTEIN | | 169 | DRB1*0401 | MS2-3C8 | 93 | 152 | 20 | 99 | 113 |
| 1zgl | MYELIN BASIC PROTEIN | V | 169 | DRB5*0101 | 3A6 | 18 | 6 | 52 | 1 | 1 |
| 4ozf | GLIADIN | | 291 | DQA1*0501-DQB1*0201 | JR5.1 | 20 | 33 | 27 | 218 | 185 |
| 4ozg | GLIADIN | | 291 | DQA1*0501-DQB1*0201 | D2 | 1 | 33 | 2 | 218 | 144 |
| 4ozh | GLIADIN | | 291 | DQA1*0501-DQB1*0201 | S16 | 11 | 33 | 15 | 218 | 175 |
| 4ozi | GLIADIN | | 291 | DQA1*0501-DQB1*0201 | S2 | 12 | 8 | 45 | 244 | 83 |
| 4gg6 | GLIADIN2 | | 307 | DQA1*0301-DQB1*0302 | SP3.4 | 1 | 1 | 81 | 53 | 1 |
| 4z7u | GLIADIN2 | P | 307 | DQA1*0301-DQB1*0302 | S13 | 1 | 1 | 11 | 53 | 1 |
| 4z7v | GLIADIN2 | | 307 | DQA1*0301-DQB1*0302 | L3-12 | 1 | 1 | 22 | 53 | 1 |
| 4z7w | GLIADIN2 | P | 307 | DQA1*0301-DQB1*0302 | T316 | 1 | 1 | 17 | 53 | 1 |
| Number of cases out of total with the correct epitope ranked #1 | | | | | | 10/20 | 3/14 | 1/20 | 5/14 | 13/20 |
| Number of cases out of total with the correct epitope among top 20 | | | | | | 18/20 | 11/14 | 10/20 | 10/14 | 15/20 |
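The two summary rows can be recomputed from the per-structure ranks. The short Python sketch below does this for the ITCell and combined rank columns, with the values transcribed from the 20 table rows in order; the helper name `count_within` is illustrative. The MHC and NetMHCIIpan columns are not recomputed here because their summaries are reported out of 14 cases, and the source does not state how those 14 are selected.

```python
from typing import List

# Ranks transcribed from the table above, in row order (1j8h ... 4z7w).
ITCELL_RANKS: List[int] = [
    1, 1, 3, 1, 1, 13, 254, 1, 20, 3,
    93, 18, 20, 1, 11, 12, 1, 1, 1, 1,
]
COMBINED_RANKS: List[int] = [
    1, 1, 1, 1, 1, 1, 3, 1, 7, 1,
    113, 1, 185, 144, 175, 83, 1, 1, 1, 1,
]


def count_within(ranks: List[int], cutoff: int) -> int:
    """Count the cases whose correct epitope is ranked at or above the cutoff."""
    return sum(1 for r in ranks if r <= cutoff)


for name, ranks in [("ITCell", ITCELL_RANKS), ("Combined", COMBINED_RANKS)]:
    top1 = count_within(ranks, 1)
    top20 = count_within(ranks, 20)
    print(f"{name}: ranked #1 in {top1}/{len(ranks)}, top 20 in {top20}/{len(ranks)}")
```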
Fig 3. Success rate for SOAP_PEP, NetMHCIIpan3.1 and the two scoring functions combined.
Benchmark of murine T-cell epitope predictions based on comparative models.
| Protein | Confirmed epitope peptide | Antigen length | MHCII type | TCR CDR3 sequence | TCR V | TCR J | ITCell Rank | Rank MHC | Rank TCR |
|---|---|---|---|---|---|---|---|---|---|
| Hsp60 | VLGGGCALLRCIPALDSLTPANED | 572 | g7 (NOD) | CASSLGGNQDTQYF | V12 | J2.5 | 11 | 72 | 28 |
| MBP | ASQKRPSQR | 250 | U (B10.PL/J) | CASSGTDQDTQYF | V16 | J2.5 | 2 | 14 | 7 |
| MBP | ASQKRPSQR | 250 | U (B10.PL/J) | CASGDAGGSYEQYF | V8.2 | J2.7 | 7 | 33 | 19 |
| hMDM2 | LLGDLFGV | 490 | HLA-A*0201 | CASGDWGYEQYF | V8.2 | J2.7 | 2 | 1 | 89 |
Fig 4. Prediction of FVIII epitopes.
A) Consensus-based determination of the epitope and its register in the MHCII cavity. CDR3 sequences of the five TCRs and their ranking with respect to different peptide registers in the epitope sequence. Only ranks among the top 100 scores are shown. Peptide FVIII-2194-2205 ranked among the top 100 scoring peptides predicted to bind both HLA-DR1 and each of the five indicated TCR-beta variable CDR3 sequences. The predicted consensus core of the epitope for the five TCRs is indicated by an arrow. B) Structural models for the five pMHCII-TCR complexes with the SYFTNMFATWSP 12-mer peptide. The identical MHCII (HLA-DR1) structures within each modeled complex were superimposed, allowing visualization of the different peptide and TCR structural variations. C) The TCR models contact the pMHCII by placing their CDR3 loops in the cavity between FVIII residues Phe2200 and Trp2203. D) The six highest scoring peptides with respect to all five TCRs all have a Trp residue at position 10.
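The consensus idea in panel A, keeping only peptides that rank within the top 100 for every TCR, can be sketched in a few lines of Python. The function name, the input layout (one rank dictionary per TCR), and the choice to order survivors by their worst rank across TCRs are assumptions for illustration; the source does not specify how ranks from the five TCRs are combined.

```python
from typing import Dict, List


def consensus_peptides(
    ranks_by_tcr: Dict[str, Dict[str, int]], cutoff: int = 100
) -> List[str]:
    """Return peptides ranked within `cutoff` for every TCR.

    `ranks_by_tcr` maps a TCR identifier to {peptide: rank}. Survivors are
    ordered by their worst (largest) rank across TCRs, then alphabetically.
    """
    if not ranks_by_tcr:
        return []
    tables = list(ranks_by_tcr.values())
    # Only peptides that were scored against every TCR can be in the consensus.
    shared = set(tables[0])
    for table in tables[1:]:
        shared &= set(table)
    kept = [p for p in shared if all(t[p] <= cutoff for t in tables)]
    return sorted(kept, key=lambda p: (max(t[p] for t in tables), p))
```

Ordering by the worst rank favors peptides that score consistently well against all TCRs over peptides that score very well against only some of them.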
Fig 5. A) Structural alignment of pMHCII complexes from the benchmark set. Alpha and beta chains are green and cyan, respectively. The peptide is shown in red with its core region in the box. B) Structural alignment of TCRs from the benchmark set. TCR alpha and beta chains are shown in yellow and blue, respectively. The variable TCR CDR3 loops are shown in the box. To address TCR loop variability, 10 models are used for each TCR sequence.
Fig 6. A) Orientations of TCRs with respect to MHCII in the 20 benchmark complexes. B) Orientations of TCRs with respect to MHCII in the 500 template structures generated by docking. C) Distribution of RMSD values between the 20 benchmark pMHCII-TCR structures. D) Distribution of RMSD values between the 500 template structures. Addition of docking-generated templates resulted in better coverage of orientation space, as indicated by the smoother RMSD distribution on the right.
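A pairwise RMSD distribution like the ones in panels C and D can be computed as sketched below in Python. The sketch assumes each structure is given as an equal-length list of matched (x, y, z) coordinates already placed in a common reference frame, for example after superimposing the MHCII chains. It performs no optimal superposition, and the source does not state which atoms or which frame were used.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(a: Sequence[Coord], b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two matched coordinate lists.

    Assumes both structures are already in the same reference frame.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
        for (x1, y1, z1), (x2, y2, z2) in zip(a, b)
    )
    return math.sqrt(squared / len(a))


def pairwise_rmsds(structures: Sequence[Sequence[Coord]]) -> List[float]:
    """RMSD for every unordered pair of structures."""
    return [rmsd(a, b) for a, b in combinations(structures, 2)]
```

For n structures this yields n(n-1)/2 values, so 20 benchmark complexes give 190 pairwise RMSDs and 500 templates give 124,750.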