Yu-Hang Zhang, Zhihao Xing, Chenglin Liu, ShaoPeng Wang, Tao Huang, Yu-Dong Cai, Xiangyin Kong.
Abstract
During the display of the peptide/human leukocyte antigen (HLA)-I complex for further immune recognition, the cleaved and transported antigenic peptides have to bind to the HLA-I protein, and the binding affinity between peptide epitopes and HLA proteins directly influences immune recognition in humans. The key factors affecting binding affinity during the generation, selection and presentation of the HLA-I complex have not yet been fully identified. In this study, a new method for describing HLA class I-peptide interactions was proposed. Three hundred and forty features of HLA I proteins and peptide sequences were analyzed with four candidate algorithms to screen for the optimal classifier. Features derived from the optimal classifier were further selected and systematically analyzed, revealing the core regulators. The results validated the hypothesis that features of HLA I proteins and related peptides simultaneously affect the binding process, though with discrepant redundancy. In addition, the high relative ratio (16/20) of the amino acid composition features suggests a unique role of sequence signatures in the binding process. By integrating biological, evolutionary and chemical features of both HLA I molecules and peptides, this study may provide a new perspective on the underlying mechanisms of HLA I-mediated immune reactions.
Year: 2017 PMID: 28211542 PMCID: PMC5314381 DOI: 10.1038/srep42768
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
The composition of the protein and peptide features for each pair of HLA I-peptide interactions.
| | AAC | CD | EC | MV | 2nd_structure | Polarity | Total |
|---|---|---|---|---|---|---|---|
| Protein | 20 | 50 | 50 | 50 | 50 | 50 | 270 |
| Peptide | 20 | 10 | 10 | 10 | 10 | 10 | 70 |
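The group sizes in the table add up to the 340 features mentioned in the abstract (270 for the protein, 70 for the peptide). The following minimal sketch only illustrates that bookkeeping. The feature names and the protein-then-peptide ordering are hypothetical and are not taken from the study.

```python
# Feature-group sizes copied from the table above.
PROTEIN_GROUPS = {"AAC": 20, "CD": 50, "EC": 50, "MV": 50,
                  "2nd_structure": 50, "Polarity": 50}
PEPTIDE_GROUPS = {"AAC": 20, "CD": 10, "EC": 10, "MV": 10,
                  "2nd_structure": 10, "Polarity": 10}


def feature_names():
    """Return hypothetical names for the features of one HLA I-peptide pair.

    Assumption: protein features come first, then peptide features.
    The real ordering used by the authors is not stated in this record.
    """
    names = []
    for source, groups in (("protein", PROTEIN_GROUPS),
                           ("peptide", PEPTIDE_GROUPS)):
        for group, size in groups.items():
            names.extend(f"{source}_{group}_{i}" for i in range(1, size + 1))
    return names


names = feature_names()
protein_total = sum(PROTEIN_GROUPS.values())
peptide_total = sum(PEPTIDE_GROUPS.values())
print(protein_total, peptide_total, len(names))
```

Keeping the two dictionaries separate makes it easy to check how many selected features come from the protein side and how many from the peptide side.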
Figure 1. The IFS curves for the (A) Dagging, (B) NNA, (C) RF and (D) SVM algorithms, with the best performance marked by a red point. The X-axis indicates the number of features used to construct the classifiers, and the Y-axis indicates the corresponding MCC values.
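An IFS curve of this kind can be sketched as follows: given a ranked feature list, a classifier is built on the top 1, 2, ..., k features, and the MCC of each is recorded. The example below is a toy illustration under stated assumptions, not the authors' pipeline: the ranking is taken as given, a plain 1-nearest-neighbor classifier with leave-one-out evaluation stands in for the four algorithms and whatever validation scheme the study used, and the data are invented.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def nearest_neighbor_predict(train_x, train_y, query):
    """Return the label of the closest training sample (squared Euclidean)."""
    best = min(range(len(train_x)),
               key=lambda i: sum((a - b) ** 2
                                 for a, b in zip(train_x[i], query)))
    return train_y[best]


def loo_mcc(x, y):
    """Leave-one-out MCC of the 1-NN classifier on samples x with labels y."""
    tp = tn = fp = fn = 0
    for i in range(len(x)):
        pred = nearest_neighbor_predict(x[:i] + x[i + 1:],
                                        y[:i] + y[i + 1:], x[i])
        if pred == 1:
            tp, fp = tp + (y[i] == 1), fp + (y[i] == 0)
        else:
            tn, fn = tn + (y[i] == 0), fn + (y[i] == 1)
    return mcc(tp, tn, fp, fn)


def ifs_curve(x, y, ranked_features):
    """Return [(k, MCC)] for classifiers built on the top-k ranked features."""
    curve = []
    for k in range(1, len(ranked_features) + 1):
        cols = ranked_features[:k]
        subset = [[row[c] for c in cols] for row in x]
        curve.append((k, loo_mcc(subset, y)))
    return curve


# Toy data (invented): column 0 separates the classes, column 1 is noise.
X = [[1.0, 5.0], [0.9, -5.0], [1.1, 0.0],
     [0.0, 5.0], [0.1, -5.0], [-0.1, 0.0]]
Y = [1, 1, 1, 0, 0, 0]

curve = ifs_curve(X, Y, ranked_features=[0, 1])
best_k, best_mcc = max(curve, key=lambda point: point[1])
print(curve, best_k, best_mcc)
```

The point with the highest MCC plays the role of the red marker in each panel: it gives the number of top-ranked features at which that classifier performs best.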
The best classification performances based on the optimal classifiers derived from the four algorithms.
| Algorithm | Optimal features | SN | SP | ACC | MCC |
|---|---|---|---|---|---|
| Dagging | 334 | 0.040 | 0.995 | 0.875 | 0.123 |
| NNA | 153 | 0.461 | 0.936 | 0.876 | 0.414 |
| RF | 56 | 0.376 | 0.961 | 0.888 | 0.410 |
| SVM | 137 | 0.007 | 1.000 | 0.875 | 0.079 |
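Choosing the optimal classifier from these rows can be sketched as a simple maximum over one metric. The example below treats the last metric column as MCC, which is an assumption based on the Figure 1 caption (performance is plotted as MCC); the row values are copied from the table, and the function name is hypothetical.

```python
# Rows copied from the table above:
# (algorithm, number of optimal features, four metric values in table order).
RESULTS = [
    ("Dagging", 334, (0.040, 0.995, 0.875, 0.123)),
    ("NNA", 153, (0.461, 0.936, 0.876, 0.414)),
    ("RF", 56, (0.376, 0.961, 0.888, 0.410)),
    ("SVM", 137, (0.007, 1.000, 0.875, 0.079)),
]


def best_by_last_metric(rows):
    """Return the row whose last metric value is highest.

    Assumption: the last metric column is MCC.
    """
    return max(rows, key=lambda row: row[2][-1])


best = best_by_last_metric(RESULTS)
print(best[0], best[1])
```

Under that assumption the selected row is NNA with 153 features, which agrees with the 153 top features analyzed in Figure 2.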
Figure 2. The distributions of the relative ratios for the top 153 features derived from the mRMR feature list for (A) the peptide features and (B) the HLA I protein features.
Figure 3. The distribution of the relative ratios for the top 93 features derived from the MaxRel feature list for the HLA I protein features.