Vadim Astakhov, Artem Cherkasov.
Abstract
Prediction of peptide binding to HLA (human leukocyte antigen) finds application in peptide vaccine design. A number of statistical and structural models have been developed in recent years for HLA-binding peptide prediction. However, a Bayesian Network (BNT) model has not been available. In this study we describe a BNT model for HLA-A2 binding peptide prediction. The BNT model identifies HLA-A2 binding peptides with up to 99% accuracy, and its prediction accuracy is similar to that of the HMM (Hidden Markov Model) and the ANN (Artificial Neural Network). At the same time, the BNT has the advantage of performing more accurately than the HMM and ANN methods on smaller sets of empirical data. When the training set was reduced to 40% of the original data, identification of the HLA-A2 binding peptides by the BNT, ANN and HMM methods produced ARoc (area under the receiver operating characteristic curve) values of 0.88, 0.85 and 0.85, respectively. These results demonstrate certain advantages of Bayesian Networks for predicting HLA binding peptides from smaller datasets.
Year: 2005 PMID: 17597855 PMCID: PMC1891637 DOI: 10.6026/97320630001058
Source DB: PubMed Journal: Bioinformation ISSN: 0973-2063
Figure 1: (A) Representation of a nine-residue peptide by the Bayesian Network. (B) Dependence of the ARoc estimated by ANN, HMM and BNT on the training and testing set split. (C) Dependence of the prediction accuracy estimated by ANN, HMM and BNT on the training set size.
Performance of the ANN, HMM and BNT models on varying datasets:
| Prediction parameter | Set 1 (73 peptides): ANN | Set 1: HMM | Set 1: BNT | Set 2 (635 peptides): ANN | Set 2: HMM | Set 2: BNT |
|---|---|---|---|---|---|---|
| True Positives | 17 | 21 | 24 | 191 | 193 | 194 |
| True Negatives | 42 | 44 | 45 | 386 | 390 | 392 |
| False Positives | 6 | 4 | 3 | 30 | 26 | 24 |
| False Negatives | 8 | 4 | 1 | 28 | 26 | 25 |
| Matthews correlation coefficient | 0.73 | 0.85 | 0.86 | 0.78 | 0.86 | 0.89 |
| Specificity | 0.91 | 0.92 | 0.94 | 0.93 | 0.94 | 0.94 |
| Sensitivity | 0.68 | 0.84 | 0.96 | 0.87 | 0.88 | 0.89 |
| Correct predictions | 0.89 | 0.93 | 0.95 | 0.91 | 0.93 | 0.95 |

ARoc performance

| Training/testing set separation | ANN | HMM | BNT |
|---|---|---|---|
| 0.4/0.6 | 0.856 | 0.860 | 0.880 |
| 0.5/0.5 | 0.873 | 0.880 | 0.901 |
| 0.6/0.4 | 0.932 | 0.920 | 0.940 |
| 0.7/0.3 | 0.962 | 0.950 | 0.960 |
| 0.8/0.2 | 0.985 | 0.992 | 0.980 |
| 0.9/0.1 | 0.992 | 0.998 | 0.990 |
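
The derived rows of the performance table (sensitivity, specificity, correct predictions and the Matthews coefficient) are all functions of the four confusion-matrix counts. The record does not state which formulas the authors used, so the following is only a minimal sketch under the assumption of the standard textbook definitions. The `Confusion` class and the function names are hypothetical helpers introduced for illustration, and values recomputed this way need not coincide with the tabulated ones.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for one model on one dataset (hypothetical helper)."""
    tp: int  # true positives
    tn: int  # true negatives
    fp: int  # false positives
    fn: int  # false negatives

    @property
    def total(self) -> int:
        return self.tp + self.tn + self.fp + self.fn


def sensitivity(c: Confusion) -> float:
    """Fraction of actual binders predicted as binders: TP / (TP + FN)."""
    return c.tp / (c.tp + c.fn)


def specificity(c: Confusion) -> float:
    """Fraction of actual non-binders predicted as non-binders: TN / (TN + FP)."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: Confusion) -> float:
    """Fraction of all peptides classified correctly: (TP + TN) / total."""
    return (c.tp + c.tn) / c.total


def matthews(c: Confusion) -> float:
    """Matthews correlation coefficient (standard definition, assumed here)."""
    denom = sqrt((c.tp + c.fp) * (c.tp + c.fn) * (c.tn + c.fp) * (c.tn + c.fn))
    # By convention, return 0.0 when any marginal total is zero.
    return (c.tp * c.tn - c.fp * c.fn) / denom if denom else 0.0


# Counts copied from the Set 1 (73 peptides) columns of the table.
SET1 = {
    "ANN": Confusion(tp=17, tn=42, fp=6, fn=8),
    "HMM": Confusion(tp=21, tn=44, fp=4, fn=4),
    "BNT": Confusion(tp=24, tn=45, fp=3, fn=1),
}

if __name__ == "__main__":
    for name, c in SET1.items():
        print(
            f"{name}: n={c.total} "
            f"sens={sensitivity(c):.3f} spec={specificity(c):.3f} "
            f"acc={accuracy(c):.3f} mcc={matthews(c):.3f}"
        )
```

Keeping the raw counts in one place makes it easy to confirm that each column sums to the stated dataset size before comparing models.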