Sofie Gielis, Pieter Moris, Wout Bittremieux, Nicolas De Neuter, Benson Ogunjimi, Kris Laukens, Pieter Meysman.
Abstract
High-throughput T cell receptor (TCR) sequencing allows the characterization of an individual's TCR repertoire and directly queries their immune state. However, it remains a non-trivial task to couple these sequenced TCRs to their antigenic targets. In this paper, we present a novel strategy to annotate full TCR sequence repertoires with their epitope specificities. The strategy is based on a machine learning algorithm to learn the TCR patterns common to the recognition of a specific epitope. These results are then combined with a statistical analysis to evaluate the occurrence of specific epitope-reactive TCR sequences per epitope in repertoire data. In this manner, we can directly study the capacity of full TCR repertoires to target specific epitopes of the relevant vaccines or pathogens. We demonstrate the usability of this approach on three independent datasets related to vaccine monitoring and infectious disease diagnostics by independently identifying the epitopes that are targeted by the TCR repertoire. The developed method is freely available as a web tool for academic use at tcrex.biodatamining.be.
Keywords: TCR repertoire analysis; cytomegalovirus (CMV); enrichment analysis; epitope specificity; immunoinformatics; infectious disease; vaccines; yellow fever virus (YFV)
Year: 2019 PMID: 31849987 PMCID: PMC6896208 DOI: 10.3389/fimmu.2019.02820
Source DB: PubMed Journal: Front Immunol ISSN: 1664-3224 Impact factor: 7.561
Figure 1. Overview of the number of trained prediction models for each virus and cancer type (TCRex version 0.3.0). The bars show the number of epitopes for which a prediction model with sufficient performance (i.e., AUC ≥ 0.7 and average precision ≥ 0.35) was trained.
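The performance filter in the Figure 1 caption can be illustrated with a minimal sketch. It assumes binary labels (1 = epitope-specific, 0 = background) and classifier scores from some held-out evaluation; the function names are hypothetical and this is not the TCRex implementation, only the two thresholds (AUC ≥ 0.7, average precision ≥ 0.35) come from the caption.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive example."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


def passes_quality_filter(labels, scores, min_auc=0.7, min_ap=0.35):
    """Apply the two thresholds quoted in the Figure 1 caption."""
    return (roc_auc(labels, scores) >= min_auc
            and average_precision(labels, scores) >= min_ap)


# Hypothetical toy evaluation set, not data from the paper.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.2]
print(roc_auc(labels, scores), average_precision(labels, scores))
print(passes_quality_filter(labels, scores))
```

The sketch assumes at least one positive and one negative example; tied scores are ranked in input order for the average precision.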
Validation of three epitope-specific models with the leave-one-study-out validation approach.
| NLVPMVATV | 4,723 | 89 | 10 | 3 | 2 | 0.54 | 0.11 | 0.98 | 0.83 |
| GILGFVFTL | 2,983 | 124 | 3 | 0 | 0 | 0.51 | 0.02 | 1 | 1 |
| GLCTLVAML | 1,142 | 66 | 12 | 3 | 0 | 0.59 | 0.18 | 1 | 1 |
For each epitope, the size of the training dataset, the size of the test datasets (i.e., the holdout validation dataset and the additional cancer dataset), and the number of identified epitope-specific TCRs in the test datasets are given together with the associated performance metrics for a BPR threshold of 0.01%. In addition, the number of identified TCRs in the holdout validation set that have a CDR3 beta sequence also present in the training data is shown. These TCRs, however, do not share their V and/or J genes with the training TCRs that match their CDR3 beta sequences, as TCRs matching on all three were filtered out of the holdout dataset.
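The role of the 0.01% BPR threshold can be sketched as follows. The sketch assumes that the BPR of a classifier score is the fraction of scores in a background (non-epitope-specific) repertoire that are at least as high; that definition, the helper names `make_bpr` and `select_hits`, and the toy data are assumptions for illustration rather than the tool's actual code.

```python
from bisect import bisect_left


def make_bpr(background_scores):
    """Return a function mapping a score to its assumed baseline prediction rate."""
    ordered = sorted(background_scores)
    n = len(ordered)

    def bpr(score):
        # Fraction of background scores that are >= the given score.
        return (n - bisect_left(ordered, score)) / n

    return bpr


def select_hits(tcr_scores, bpr, max_bpr=1e-4):
    """Keep TCRs whose score corresponds to a BPR of at most max_bpr (0.01% = 1e-4)."""
    return [tcr for tcr, score in tcr_scores.items() if bpr(score) <= max_bpr]


# Hypothetical background: 10,000 evenly spaced scores between 0 and 1.
bpr = make_bpr([i / 10000 for i in range(10000)])
# Hypothetical query TCRs with made-up scores.
print(select_hits({"tcr_a": 0.99995, "tcr_b": 0.9}, bpr))
```

Under this reading, a stricter threshold trades sensitivity for fewer false positives in a large repertoire.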
Figure 2. Percentage of unique identified LLWNGPMAV-specific TCRs pre- and post-vaccination. The boxplots show the proportion of unique LLWNGPMAV-specific T cells in pre- and post-vaccination PBMC samples for the nine volunteers from Dewitt et al. (1) (A) and the six volunteers from Pogorelyy et al. (2) (B). Epitope-specific T cells were identified with TCRex using a 0.01% BPR threshold. An increase in the number of unique epitope-specific cells was found for both studies.
Comparison of the epitope-specific TCR sequences identified with TCRex to the results from Pogorelyy et al. (2).
| P1 | 1,151 | 1,090 | 92 | 2 |
| P2 | 800 | 763 | 110 | 3 |
| Q1 | 576 | 547 | 79 | 7 |
| Q2 | 1,685 | 1,589 | 92 | 8 |
| S1 | 773 | 747 | 170 | 19 |
| S2 | 983 | 948 | 273 | 11 |
For each volunteer from Pogorelyy et al. (.
Figure 3. Public vs. TCRex-identified unique LLWNGPMAV-specific TCRs in the post-vaccination samples of Dewitt et al. (1) and Pogorelyy et al. (2). (A) Overview of the number of unique LLWNGPMAV-specific TCRs that were identified with TCRex in the post-vaccination PBMC samples for all nine volunteers (V1–V9) of Dewitt et al. (1) (dark blue) and the total number of public LLWNGPMAV-specific TCRs that were present in these repertoires (light blue). (B) Overview of the number of unique LLWNGPMAV-specific TCRs that were identified with TCRex in the post-vaccination PBMC samples for all six volunteers (P1–S2) of Pogorelyy et al. (2) (dark orange) and the total number of public LLWNGPMAV-specific TCRs that were present in these repertoires (light orange).
LLWNGPMAV-specific enrichment analysis results for all volunteers from Pogorelyy et al. (2).
| P1 | 77 | 0.007272 | 0.999 | 92 | 0.010400 | 0.441 | 8.17e-04 |
| P2 | 95 | 0.008119 | 0.999 | 110 | 0.009614 | 0.673 | 0.045 |
| Q1 | 61 | 0.008615 | 0.999 | 79 | 0.017915 | 2.80e-06 | 1.24e-08 |
| Q2 | 109 | 0.008440 | 0.999 | 92 | 0.015455 | 8.59e-05 | 1.64e-07 |
| S1 | 138 | 0.014311 | 2.40e-04 | 170 | 0.025194 | 6.06e-25 | 5.69e-11 |
| S2 | 180 | 0.013314 | 4.09e-04 | 273 | 0.017962 | 2.10e-18 | 2.03e-06 |
The table represents the adjusted p-values (Benjamini-Hochberg adjusted p-values for 6 tests) associated with the LLWNGPMAV-specific enrichment analyses for both full (i.e., derived from PBMC) pre- and post-vaccination TCR repertoires. In addition, the number of identified specific TCRs and the percentage of identified specific TCRs are given.
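An enrichment analysis of this kind can be sketched as below. The sketch assumes a one-sided binomial test in which each TCR in a repertoire is a false-positive hit with probability equal to the BPR threshold (0.01%), followed by Benjamini-Hochberg adjustment across repertoires. The choice of test, the function names, and the example counts are assumptions for illustration; the code is not claimed to reproduce the table values.

```python
import math


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1, summed in log space."""
    if k <= 0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)

    def log_pmf(i):
        return (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                + i * log_p + (n - i) * log_q)

    total = 0.0
    for i in range(k, n + 1):
        term = math.exp(log_pmf(i))
        total += term
        # Past the mean the terms only shrink; stop once they are negligible.
        if i > n * p and term <= total * 1e-17:
            break
    return min(total, 1.0)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical repertoires: (identified epitope-specific TCRs, repertoire size).
repertoires = [(170, 675_000), (77, 1_060_000), (95, 900_000)]
raw = [binom_upper_tail(k, n, 1e-4) for k, n in repertoires]
print(benjamini_hochberg(raw))
```

Under this null model, a repertoire is called enriched only when it contains clearly more hits than the roughly 0.01% expected from background predictions alone.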
LLWNGPMAV-specific enrichment analysis results for all volunteers from Dewitt et al. (1).
| 1 | 35 | 0.010813 | 0.621 | 32 | 0.009094 | 0.935 | 0.857 | 4 | 0.017189 | 0.265 | 0.316 |
| 2 | 48 | 0.012999 | 0.384 | 65 | 0.015565 | 1.56e-03 | 0.212 | 5 | 0.028991 | 0.056 | 0.116 |
| 3 | 41 | 0.012539 | 0.384 | 87 | 0.025967 | 9.29e-14 | 7.86e-09 | 181 | 0.550922 | 1.86e-239 | 5.00e-222 |
| 4 | 23 | 0.009778 | 0.856 | 34 | 0.012084 | 0.299 | 0.231 | 21 | 0.103128 | 2.56e-14 | 1.67e-14 |
| 5 | 8 | 0.012759 | 0.621 | 63 | 0.020710 | 7.22e-07 | 9.93e-04 | 71 | 0.423527 | 7.50e-87 | 1.56e-79 |
| 6 | 41 | 0.012111 | 0.384 | 32 | 0.012065 | 0.299 | 0.684 | 8 | 0.040422 | 2.31e-03 | 7.45e-03 |
| 7 | 11 | 0.006771 | 0.964 | 11 | 0.005344 | 0.992 | 0.857 | 2 | 0.011544 | 0.517 | 0.369 |
| 8 | 22 | 0.008229 | 0.964 | 27 | 0.010866 | 0.539 | 0.212 | 2 | 0.011869 | 0.517 | 0.404 |
| 9 | 13 | 0.006451 | 0.964 | 18 | 0.008163 | 0.938 | 0.284 | 4 | 0.021606 | 0.176 | 0.060 |
The table represents the adjusted p-values (Benjamini-Hochberg adjusted p-values for 9 tests) associated with the LLWNGPMAV-specific enrichment analyses for both full (i.e., derived from PBMC) pre- and post-vaccination TCR repertoires and the post-vaccination activated TCR repertoire. In addition, the number of identified specific TCRs and the percentage of identified specific TCRs are given.
Figure 4. Percentage of unique identified LLWNGPMAV-specific TCRs in the post-vaccination repertoires from Dewitt et al. (1). The box plots show the log-scaled proportion of unique LLWNGPMAV-specific T cells in post-vaccination PBMC samples and activated TCR repertoires [i.e., “CD3+ CD8+CD14−CD19−CD38+HLA-DR+ Ag-experienced, activated effector T cells” (1)] for the nine volunteers from Dewitt et al. (1). Epitope-specific TCRs were identified with TCRex using a 0.01% BPR threshold. An increase in the number of unique LLWNGPMAV-specific cells was found in the activated dataset.
Epitope-specific TCRs identified in the enriched CMV-related TCR dataset of Emerson et al. (15) (0.01% BPR threshold).
| V gene | CDR3 beta | J gene | Epitope | Pathogen | Score | BPR |
| TRBV07-06 | CASSLAPGATNEKLFF | TRBJ01-04 | NLVPMVATV | CMV | 0.99 | 0.0 |
| TRBV30 | CAWSVSDLAKNIQYF | TRBJ02-04 | NLVPMVATV | CMV | 0.91 | 0.0 |
| TRBV09 | CASSALGGAGTGELFF | TRBJ02-02 | NLVPMVATV | CMV | 0.84 | 0.0 |
| TRBV07-03 | CASSRLAGGTDTQYF | TRBJ02-03 | QIKVRVKMV | CMV | 0.54 | 2e-05 |
| TRBV07-09 | CASSLIGVSSYNEQFF | TRBJ02-01 | TPRVTGGGAM | CMV | 0.97 | 0.0 |
| TRBV04-03 | CASSPSRNTEAFF | TRBJ01-01 | TPRVTGGGAM | CMV | 0.92 | 0.0 |
| TRBV04-03 | CASSPQRNTEAFF | TRBJ01-01 | TPRVTGGGAM | CMV | 0.85 | 0.0 |
| TRBV04-03 | CASSPHRNTEAFF | TRBJ01-01 | TPRVTGGGAM | CMV | 0.85 | 0.0 |
| TRBV09 | CASSGQGAYEQYF | TRBJ02-07 | LLWNGPMAV | YFV | 0.98 | 0.0 |
Evaluation of the MHC-specificity of epitope-specific models.
| Epitope | Training MHC | Training dataset size | Test MHC | Test dataset size | Identified TCRs in test dataset |
| TPGPGVRYPL | HLA-B*42:01 | 63 | HLA-B*07:02 | 23 | 1 |
| TPQDLNTML | HLA-B*42:01 | 114 | HLA-B*81:01 | 40 | 7 |
The table lists the MHC background and size of the training and test datasets for the study on MHC bias and the number of identified TCRs in the test dataset (0.01% BPR threshold).