Mareike Wendorff, Heli M Garcia Alvarez, Thomas Østerbye, Hesham ElAbd, Elisa Rosati, Frauke Degenhardt, Søren Buus, Andre Franke, Morten Nielsen.
Abstract
Human Leukocyte Antigen class II (HLA-II) molecules present peptides to T lymphocytes and play an important role in adaptive immune responses. Characterizing the binding specificity of single HLA-II molecules has profound implications for understanding cellular immunity, for identifying the causes of autoimmune diseases, and for immunotherapeutics and vaccine development. Here, novel high-density peptide microarray technology combined with machine learning techniques was used to address this task at an unprecedented level of throughput. Microarrays with over 200,000 defined peptides were assayed with four exemplary HLA-II molecules. Machine learning was applied to mine the signals. Comparing the identified binding motifs, and the power for predicting eluted ligand and CD4+ epitope datasets, with those obtained using NetMHCIIpan-3.2 confirmed the high quality of the chip readout. These results suggest that the proposed microarray technology offers a novel and unique platform for large-scale unbiased interrogation of peptide binding preferences of HLA-II molecules.
Keywords: HLA; MHC class II; antigen presentation; high-throughput; machine learning; peptide binding; prediction; ultra-high density peptide microarray
Year: 2020 PMID: 32903714 PMCID: PMC7438773 DOI: 10.3389/fimmu.2020.01705
Source DB: PubMed Journal: Front Immunol ISSN: 1664-3224 Impact factor: 7.561
Figure 1. Performance of the models on peptide microarray data and resulting motifs. (A) The Pearson correlation coefficient (PCC) and (B) the Spearman correlation coefficient (SCC) on the independent test dataset of the peptide microarray are shown. The pairwise p-values were calculated using a non-parametric bootstrap hypothesis test with 1,000,000 bootstrap iterations. *0.01 < p ≤ 0.05, **0.001 < p ≤ 0.01, ***0.0001 < p ≤ 0.001, and ****p ≤ 0.0001. Motif plots of (C) DRB1*01:03, (D) DRB1*03:01, (E) DRB1*15:01, and (F) DRB1*15:02 based on the top 1% of binding peptides (from a pool of 100,000 random natural peptides) generated with the deep learning model (PIA), the NNAlign model, and NetMHCIIpan-3.2 (8). (G) Pearson correlation coefficient (PCC) of the position-specific scoring matrices (PSSM) between the different models.
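The caption names a non-parametric bootstrap hypothesis test but does not spell out the resampling scheme. The following is a minimal sketch of one common variant, assuming paired resampling of test peptides with replacement and a one-sided comparison of the PCC of two models. The function names `pearson` and `bootstrap_pcc_pvalue`, the default iteration count, and the one-sided formulation are illustrative assumptions, not the authors' implementation.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # Degenerate resample with no variance: treat as uncorrelated.
        return 0.0
    return sxy / sqrt(sxx * syy)


def bootstrap_pcc_pvalue(measured, pred_a, pred_b, n_iter=2000, seed=0):
    """One-sided bootstrap p-value for 'model A has a higher PCC than model B'.

    Assumption: peptides are resampled with replacement, using the same
    indices for both models, and the p-value is the fraction of resamples
    in which A does NOT beat B.
    """
    rng = random.Random(seed)
    n = len(measured)
    not_better = 0
    for _ in range(n_iter):
        idx = [rng.randrange(n) for _ in range(n)]
        m = [measured[i] for i in idx]
        pcc_a = pearson(m, [pred_a[i] for i in idx])
        pcc_b = pearson(m, [pred_b[i] for i in idx])
        if pcc_a <= pcc_b:
            not_better += 1
    return not_better / n_iter
```

With a finite number of iterations the smallest non-zero p-value this sketch can report is `1 / n_iter`, which may be one reason a large iteration count such as the 1,000,000 quoted in the caption is needed to resolve the p ≤ 0.0001 threshold.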
Figure 2. Comparisons of prediction quality on (A) MS ligand and (B) epitope data. The center line inside the box indicates the median Frank and the triangle shows the mean Frank. The data points available in IEDB are represented using a jitter plot. The colored box covers the interquartile range. The whiskers extend to 1.5 times the interquartile range. Pairwise p-values were calculated using a Wilcoxon signed-rank test (applying Pratt's zero method). *0.01 < p ≤ 0.05, **0.001 < p ≤ 0.01, ***0.0001 < p ≤ 0.001, and ****p ≤ 0.0001.
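The caption reports Frank values without defining them. As the metric is commonly used in the NetMHCpan literature, Frank is the fraction of same-length peptides from the source protein that score higher than the known ligand or epitope, so 0 is a perfect prediction and about 0.5 is what a random predictor would give. The sketch below illustrates that definition under stated assumptions: the function name `frank`, the handling of ties, and the toy scoring function `count_leucines` are hypothetical and stand in for a real binding predictor.

```python
def frank(source_protein, ligand, score_fn):
    """Fraction of same-length source-protein peptides scoring above the ligand.

    Assumptions: all overlapping peptides of the ligand's length are used as
    the background, ties do not count as 'higher', and the ligand itself is
    included in the denominator.
    """
    k = len(ligand)
    candidates = [
        source_protein[i:i + k]
        for i in range(len(source_protein) - k + 1)
    ]
    if ligand not in candidates:
        raise ValueError("ligand does not occur in the source protein")
    ligand_score = score_fn(ligand)
    higher = sum(1 for pep in candidates if score_fn(pep) > ligand_score)
    return higher / len(candidates)


def count_leucines(peptide):
    """Toy placeholder score (NOT a binding predictor): number of 'L' residues."""
    return peptide.count("L")


# Toy example: the top-scoring window gets Frank 0.0.
best = frank("AAAALLLLAAAA", "LLLL", count_leucines)
```

In practice `score_fn` would wrap the predictions of the model under evaluation, and one Frank value per ligand or epitope would feed the paired comparison described in the caption.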