Damien Farrell, Gareth Jones, Christopher Pirson, Kerri Malone, Kevin Rue-Albrecht, Anthony J Chubb, Martin Vordermeier, Stephen V Gordon.
Abstract
The discovery of novel antigens is an essential requirement in devising new diagnostics or vaccines for use in control programmes against human tuberculosis (TB) and bovine tuberculosis (bTB). Identification of potential epitopes recognised by CD4+ T cells requires prediction of peptide binding to MHC class-II, an obligatory prerequisite for T cell recognition. To comprehensively prioritise potential MHC-II-binding epitopes from Mycobacterium bovis, the agent of bTB and zoonotic TB in humans, we integrated three binding prediction methods with the M. bovis proteome using a subset of human HLA alleles to approximate the binding of epitope-containing peptides to the bovine MHC class II molecule BoLA-DRB3. Two parallel strategies were then applied to filter the resulting set of binders: identification of the top-scoring binders or clusters of binders. Our approach was tested experimentally by assessing the capacity of predicted promiscuous peptides to drive interferon-γ secretion from T cells of M. bovis-infected cattle. Thus, 376 20-mer peptides were synthesised (270 predicted epitopes, 94 random peptides with low predictive scores and 12 positive controls of known epitopes). The results of this validation demonstrated significant enrichment (>24 %) of promiscuously recognised peptides predicted in our selection strategies, compared with randomly selected peptides with low prediction scores. Our strategy offers a general approach to the identification of promiscuous epitopes tailored to target populations where there is limited knowledge of MHC allelic diversity.
Keywords: MHC; Mycobacterium bovis; Tuberculosis; bovine; epitope
Year: 2016 PMID: 28348866 PMCID: PMC5320590 DOI: 10.1099/mgen.0.000071
Source DB: PubMed Journal: Microb Genom ISSN: 2057-5858
Fig. 1. Pocket profile concept for MHC binding prediction. The polymorphic binding pockets in chain B of the MHC-II complex. This concept is used in TEPITOPEpan to treat alleles as being composed of independent, modular pockets that can be extrapolated to other alleles with no binding data. The weight W between a query pocket, with pseudo-sequence q, and a pocket in the library, with pseudo-sequence l, is calculated as shown. S is the sequence similarity score. K denotes a sum over the entire pocket library. α is a positive parameter that determines the range of similarity scores that give high weights.
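The weighting formula itself appears only in the figure and is not reproduced in this text. The sketch below illustrates one plausible reading of the caption: similarity scores are weighted exponentially and normalised by a sum over the whole pocket library. The functional form, the function name `pocket_weights` and the similarity values are all assumptions made for illustration, not the published TEPITOPEpan formula.

```python
import math

def pocket_weights(similarities, alpha=10.0):
    """Normalised weights for library pockets from similarity scores S(q, l).

    ASSUMPTION: exponential weighting exp(alpha * S), normalised over the
    entire pocket library. The published formula may differ.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    # Subtract the maximum before exponentiating for numerical stability;
    # this does not change the normalised result.
    top = max(similarities.values())
    raw = {name: math.exp(alpha * (s - top)) for name, s in similarities.items()}
    total = sum(raw.values())  # sum over the entire pocket library
    return {name: value / total for name, value in raw.items()}

# Hypothetical similarity scores between one query pocket and three library pockets.
library = {"pocket_a": 0.9, "pocket_b": 0.7, "pocket_c": 0.2}
weights = pocket_weights(library, alpha=10.0)
sharper = pocket_weights(library, alpha=50.0)
```

Under this assumed form, a larger α concentrates the weight on the most similar library pockets, which matches the caption's description of α as controlling the range of similarity scores that receive high weights.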
Fig. 2. Schemes for peptide selection used in this study. In all selection strategies only binders above the cut-off in at least three alleles were considered. We chose 92 peptides from the top-scoring binders strategy and 176 from the binder clusters strategy for validation.
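The rule that a binder must exceed the cut-off in at least three alleles can be sketched as a simple filter. This is a minimal illustration, assuming that higher scores mean stronger predicted binding and that each allele has its own cut-off; the function name, peptide labels, allele labels and scores are hypothetical.

```python
from collections import defaultdict

def promiscuous_binders(predictions, cutoffs, min_alleles=3):
    """Keep peptides scoring above the cut-off in at least `min_alleles` alleles.

    predictions: iterable of (peptide, allele, score); higher = stronger (assumed).
    cutoffs: dict mapping allele -> score cut-off (hypothetical values).
    """
    bound = defaultdict(set)
    for peptide, allele, score in predictions:
        if score > cutoffs[allele]:
            bound[peptide].add(allele)
    return {
        peptide: sorted(alleles)
        for peptide, alleles in bound.items()
        if len(alleles) >= min_alleles
    }

# Hypothetical scores for two peptides across placeholder alleles.
cutoffs = {"allele_1": 0.5, "allele_2": 0.5, "allele_3": 0.5, "allele_4": 0.5}
predictions = [
    ("PEPTIDE_A", "allele_1", 0.9),
    ("PEPTIDE_A", "allele_2", 0.8),
    ("PEPTIDE_A", "allele_3", 0.7),
    ("PEPTIDE_A", "allele_4", 0.1),
    ("PEPTIDE_B", "allele_1", 0.9),
    ("PEPTIDE_B", "allele_2", 0.2),
    ("PEPTIDE_B", "allele_3", 0.6),
]
selected = promiscuous_binders(predictions, cutoffs)
```

Here only `PEPTIDE_A` passes, because `PEPTIDE_B` exceeds the cut-off in just two alleles.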
Summary of the two epitope selection strategies and control used in this study
| Strategy | Summary |
|---|---|
| Top-scoring binders | Highest scoring MHC-II binders in at least three alleles with at least one overlapping binder in both prediction algorithms (netMHCIIpan and TEPITOPEpan). |
| Binder clusters | Detects regions of densest binders using both TEPITOPEpan and netMHCIIpan predictors. Uses only overlapping binders from both methods and at least one MHC-I binder. Results vary depending on the order in which predictors are selected. |
| Random binders (control) | Baseline method using random binders from the entire set of predicted binders. Half were taken from TEPITOPEpan and half from netMHCIIpan without requiring that there is any overlap between predictors. |
All binders are defined as being present in at least three of the eight chosen HLA MHC-II alleles.
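The binder clusters idea can be illustrated with a simplified sketch: keep the binders from one predictor that overlap a binder from the other, then find the 20-residue window holding the most of them. The 9-residue binder length, the window scan and all positions are assumptions for illustration and are not the authors' implementation.

```python
def overlapping_binders(primary, secondary, binder_len=9):
    """Start positions from `primary` that overlap any binder in `secondary`.

    Two binders of equal length overlap when their starts differ by less
    than the binder length.
    """
    return sorted(
        a for a in set(primary)
        if any(abs(a - b) < binder_len for b in secondary)
    )

def densest_window(starts, binder_len=9, window=20):
    """Return (window_start, count) for the window containing most whole binders."""
    best_start, best_count = None, 0
    for w in sorted(set(starts)):
        count = sum(1 for s in starts if w <= s and s + binder_len <= w + window)
        if count > best_count:
            best_start, best_count = w, count
    return best_start, best_count

# Hypothetical binder start positions from two predictors on one protein.
predictor_1 = [10, 12, 15, 60]
predictor_2 = [11, 14, 90]

shared = overlapping_binders(predictor_1, predictor_2)
cluster = densest_window(shared)
shared_reversed = overlapping_binders(predictor_2, predictor_1)
```

Because `overlapping_binders` returns positions from whichever predictor is passed first, swapping the arguments gives a different set, which is one way the order of predictors can affect the result.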
Fig. 3. Similarity of human and bovine alleles. (a) Distributions of mean nearest-neighbour distance to TEPITOPE library alleles for the 700 HLA alleles covered by TEPITOPEpan (blue) and known BoLA alleles (green). Though most of the BoLA pseudo-sequences are too distant to be used, there is a substantial overlapping region with the HLA alleles covered by TEPITOPEpan. (b) BoLA vs HLA allele pseudo-sequence distances. The heatmap shows the nearest-neighbour distances between BoLA alleles (y-axis) and the closest HLA alleles covered by TEPITOPEpan. The colour scale represents pseudo-sequence distance on a scale of 0 to 1. Low values (red) indicate more closely matching alleles. The x-axis is sorted by mean distance. We chose the eight closest HLA alleles with nearest-neighbour distance ≤0.25 (leftmost on the x-axis) for both our prediction methods.
Subset of HLA-DR alleles used for predictions
| Reference | Nearest* | Mean |
|---|---|---|
| HLA-DRB1*0301 | 0.20 | 0.43 |
| HLA-DRB1*0401 | 0.18 | 0.40 |
| HLA-DRB1*0801 | 0.17 | 0.42 |
| HLA-DRB1*1101 | 0.16 | 0.37 |
| HLA-DRB1*1301 | 0.17 | 0.41 |
| HLA-DRB1*1401 | 0.18 | 0.40 |
| HLA-DRB3*0101 | 0.21 | 0.49 |
| HLA-DRB3*0201 | 0.16 | 0.45 |
*Nearest-neighbour distance is the pseudo-sequence distance to the closest BoLA allele. For efficiency, one allele subtype is chosen to represent all alleles for that allotype, i.e. HLA-DRB1*0301 represents DRB1*03. HLA allele nomenclature is explained on the IMGT/HLA website (Robinson et al., 2013).
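The "Nearest" and "Mean" columns can be read as summaries of a matrix of pseudo-sequence distances between each HLA allele and the known BoLA alleles. The sketch below shows that summary and a selection step with a ≤0.25 threshold. The allele labels and distances are hypothetical, and ranking candidates by nearest-neighbour distance is an assumption.

```python
def nearest_and_mean(distances):
    """Summarise each HLA allele by (nearest, mean) distance to BoLA alleles.

    distances: dict HLA allele -> dict BoLA allele -> distance in [0, 1].
    """
    summary = {}
    for hla, row in distances.items():
        values = list(row.values())
        summary[hla] = (min(values), sum(values) / len(values))
    return summary

def select_alleles(summary, max_nearest=0.25, n=8):
    """Pick up to `n` HLA alleles whose nearest BoLA distance is <= max_nearest."""
    eligible = [
        (nearest, hla)
        for hla, (nearest, _mean) in summary.items()
        if nearest <= max_nearest
    ]
    return [hla for _nearest, hla in sorted(eligible)[:n]]

# Hypothetical distance matrix with placeholder allele names.
distances = {
    "HLA_A": {"BoLA_1": 0.16, "BoLA_2": 0.50},
    "HLA_B": {"BoLA_1": 0.30, "BoLA_2": 0.45},
    "HLA_C": {"BoLA_1": 0.40, "BoLA_2": 0.21},
}
summary = nearest_and_mean(distances)
chosen = select_alleles(summary)
```

In this toy input `HLA_B` is excluded because its closest BoLA allele lies at 0.30, above the threshold.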
Fig. 4. BoLA allele frequency distributions. Common allele frequencies for Holstein (USA) (Dietz ), Holstein–Charolais (UK) (Baxter ) and Polish Holstein–Friesian cattle (Oprządek ). Note that some frequency values represent the combination of two subtypes (e.g. *2703 and *2707). In these cases the higher value was used. Alleles with frequencies less than 2 % are not shown for clarity.
Common BoLA-DRB3 allele frequencies
| DRB3.2 alleles | Holstein–Charolais | Holstein (USA) | Polish Holstein–Friesian |
|---|---|---|---|
| *24 | – | 0.14 | 0.21 |
| *08 | – | 0.14 | 0.14 |
| *22† | – | 0.14 | 0.12 |
| *27 | 0.2 | 0.04 | – |
| *11 | 0.16 | 0.09 | 0.03 |
| *16† | 0.1 | 0.10 | 0.08 |
| *23 | – | 0.09 | 0.08 |
| *06 | 0.06 | – | – |
| *09 | 0.06 | – | – |
| *01 | 0.06 | – | – |
| *02 | 0.07 | – | 0.03 |
| *05 | 0.04 | – | – |
| *36† | – | – | 0.04 |
| *12† | 0.06 | 0.03 | 0.03 |
| *07 | – | 0.05 | – |
| *28 | – | – | 0.06 |
| *03 | – | 0.04 | 0.03 |
| *10 | 0.05 | – | 0.02 |
| *26 | – | 0.02 | – |
Shown are frequencies derived from USA Holstein (Dietz ), Holstein–Charolais (UK) (Baxter ) and Polish Holstein–Friesian cattle (Oprządek ). †Alleles covered by prediction methods.
Fig. 5. Results for both sets of IGRA assays. (a) IGRA whole-blood assays in four animals. White data points are nil-subtracted mean OD values for all animals responding to a given peptide. The boxplots show the underlying distribution for the raw OD values (all animals). Results are grouped by each peptide's number of responders. Peptides inducing no responses are not shown. (b) PBMC IGRA assays in seven animals. White data points are nil-subtracted mean OD values with boxplots showing distribution for all data points.
Fig. 6. Enrichment of peptides containing epitopes predicted to bind to BoLA-DRB3. Responder frequencies (based on four whole-blood and seven PBMC samples) of all peptides tested were grouped by the epitope-selection strategy. Peptides were deemed promiscuous based on a cut-off of ≥26 % derived from the lowest positive control responder frequencies (purple). The cut-off level is indicated by the red dashed line. The binder clusters method showed a superior enrichment of high-response peptides at 27.3 %.
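The responder frequency and enrichment calculation can be sketched as follows, assuming each of the 11 samples (four whole-blood plus seven PBMC) counts equally towards a peptide's responder frequency. The function names and the example responder counts are hypothetical.

```python
def responder_frequency(responders, animals=11):
    """Percentage of tested animals responding to a peptide."""
    return 100.0 * responders / animals

def is_promiscuous(responders, animals=11, cutoff=26.0):
    """True when the responder frequency meets the >= cut-off (in percent)."""
    return responder_frequency(responders, animals) >= cutoff

def enrichment(responder_counts, animals=11, cutoff=26.0):
    """Percentage of peptides in a group that meet the promiscuity cut-off."""
    if not responder_counts:
        return 0.0
    hits = sum(is_promiscuous(r, animals, cutoff) for r in responder_counts)
    return 100.0 * hits / len(responder_counts)

# Hypothetical responder counts for four peptides from one selection strategy.
example_group = [0, 1, 3, 5]
group_enrichment = enrichment(example_group)
```

Under the equal-weight assumption, three responders out of 11 (27.3 %) is the smallest count that meets the ≥26 % cut-off, and 9, 8 and 7 responders round to 82, 73 and 64 %, the values listed for the most frequently recognized peptides.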
The 11 most frequently recognized peptides
| M. tuberculosis locus (Rv) | M. bovis locus (Mb) | Peptide sequence | Method | Responder frequency (%) | *Start position |
|---|---|---|---|---|---|
| Rv3732 | Mb3759 | PYVRDGWAFVAIRLTSTDLI | Binder clusters | 82 | 178 |
| Rv1822 | Mb1853 | DWADGKIARLLNQSSRLGAL | Binder clusters | 73 | 53 |
| Rv2140c | Mb2164c | PGGALTLVNDAGMRRYVGAA | Top-scoring binders | 73 | 102 |
| Rv3671c | Mb3695c | NEAAPTWLKTVPKRLSALLN | Binder clusters | 73 | 150 |
| Rv3863 | Mb3893 | LAADGIINAGALIAFEKGRS | Binder clusters | 73 | 183 |
| Rv1239c | Mb1271c | PTVIGGMVLICLFLYHVFRN | Binder clusters | 64 | 344 |
| Rv1591 | Mb1617 | TQAPPVFFARRPLQIALTLM | Binder clusters | 64 | 158 |
| Rv1762c | Mb1793c | EHLEFMAVGTAVRYTAKPGA | Binder clusters | 64 | 111 |
| Rv1833c | Mb1864c | VMSSPPVQYAILRRNFFVER | Binder clusters | 64 | 154 |
| Rv2412 | Mb2435 | RNKAVKSSLRTAVRAFREAA | Binder clusters | 64 | 20 |
| Rv3247c | Mb3275c | ASSVYAMATLFALDRAGAVH | Binder clusters | 64 | 62 |
Peptides with responder frequencies ≥60% of the mean value for the positive controls. *Start position is the location of the start of the peptide in the protein sequence. All peptides are 20 amino acids in length.
Fig. 7. Sequence views of predicted binder clusters and positive epitope regions in selected proteins. (a) Rv3676 is the most enriched protein with three positive peptides out of four predicted. The peptide positions in the sequence are shown as regions in light blue with the single negative in grey. Each box marks the position of a predicted binder. Each row is one of the eight chosen HLA-DR alleles (labelled on the y-axis) for each of three predictors. Colours represent each of the three predictors as shown in the legend. The plots were produced by the epitopepredict library. (b) Rv3854 had two positives from three predicted. The two positives (at positions 38 and 44) are overlapping and contain the same nine-mer core, LTINNVLLR.
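A short worked example shows why two 20-mers starting at positions 38 and 44 can share a nine-mer core. The sketch lists every start position at which a core of the given length lies fully inside both peptides. The function name is hypothetical and the calculation uses only the start positions and lengths, not the Rv3854 sequence.

```python
def shared_core_starts(start_a, start_b, peptide_len=20, core_len=9):
    """Start positions where a core of `core_len` fits inside both peptides."""
    first_possible = max(start_a, start_b)
    # The core must end no later than the end of the earlier peptide.
    last_possible = min(start_a, start_b) + peptide_len - core_len
    return list(range(first_possible, last_possible + 1))

# Start positions of the two positive Rv3854 peptides given in the caption.
placements = shared_core_starts(38, 44)

# Two 20-mers whose starts differ by 12 share only 8 residues, so no 9-mer fits.
no_fit = shared_core_starts(38, 50)
```

For starts 38 and 44 the peptides share 14 residues, so a nine-mer core can begin at any of six positions (44 to 49).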