Guang Lan Zhang, Nikolai Petrovsky, Chee Keong Kwoh, J Thomas August, Vladimir Brusic.
Abstract
BACKGROUND: The transporter associated with antigen processing (TAP) is a critical component of the major histocompatibility complex (MHC) class I antigen processing and presentation pathway. TAP transports antigenic peptides into the endoplasmic reticulum, where it loads them into the binding groove of MHC class I molecules. Because peptides must first be transported by TAP in order to be presented on MHC class I, TAP binding preferences should have a significant impact on T-cell epitope selection.
DESCRIPTION: PRED(TAP) is a computational system that predicts peptide binding to human TAP. It uses artificial neural networks (ANN) and hidden Markov models (HMM) as predictive engines. Extensive testing was performed to validate the prediction models. The results showed that PRED(TAP) was both sensitive and specific and had good predictive ability (area under the receiver operating characteristic curve, Aroc > 0.85).
Year: 2006 PMID: 16719926 PMCID: PMC1524936 DOI: 10.1186/1745-7580-2-3
Source DB: PubMed Journal: Immunome Res ISSN: 1745-7580
Number of peptides in the training dataset
| Binding Affinity | Number of peptides |
| 0 | 26 |
| 1 | 52 |
| 2 | 48 |
| 3 | 48 |
| 4 | 53 |
| 5 | 55 |
| 6 | 40 |
| 7 | 87 |
| 8 | 61 |
| 9 | 16 |
| 10 | 7 |
| Sum | 493 |
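As a quick consistency check on the table above, the per-level counts can be tallied in a few lines. This is only a sketch: the names `counts`, `total` and `shares` are ours, and the values are copied from the table.

```python
# Number of training peptides per binding-affinity level (copied from the table).
counts = {0: 26, 1: 52, 2: 48, 3: 48, 4: 53, 5: 55,
          6: 40, 7: 87, 8: 61, 9: 16, 10: 7}

# The table reports a sum of 493.
total = sum(counts.values())

# Share of the training set at each affinity level, in percent.
shares = {level: 100.0 * n / total for level, n in counts.items()}

print(total)
for level, pct in shares.items():
    print(f"affinity {level:>2}: {counts[level]:>3} peptides ({pct:.1f}%)")
```

The tally agrees with the reported sum and shows that the extreme affinity levels (0, 9 and 10) are the least populated.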
Performance assessment of ANN/HMM models using 10-fold cross-validation
| ANN 180-2-1 | H | MH | LMH |
| 1st run | 0.95 | 0.95 | 0.89 |
| 2nd run | 0.95 | 0.94 | 0.88 |
| 3rd run | 0.95 | 0.94 | 0.88 |
| ANN 180-1-1 | H | MH | LMH |
| 1st run | 0.93 | 0.94 | 0.87 |
| 2nd run | 0.92 | 0.92 | 0.86 |
| 3rd run | 0.93 | 0.94 | 0.88 |
| HMM | H | M | L |
| 1st run | 0.90 | 0.90 | 0.87 |
| 2nd run | 0.89 | 0.90 | 0.87 |
| 3rd run | 0.89 | 0.90 | 0.86 |
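The values in the table above are areas under the ROC curve obtained by 10-fold cross-validation. The following is a minimal sketch of both ingredients, not the authors' code: `k_fold_indices` and `auc_roc` are hypothetical helper names, the fold assignment scheme is an assumption, and the toy scores are made up.

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def auc_roc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: 1 for binder, 0 for non-binder. A tie between a binder and a
    non-binder counts as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one binder and one non-binder")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# 493 peptides dealt into 10 folds; each fold is held out once for testing.
folds = k_fold_indices(493, k=10, seed=0)
print([len(f) for f in folds])

# Toy data, made up for illustration only.
toy_scores = [8.1, 7.4, 6.0, 5.2, 3.3, 2.0]
toy_labels = [1, 1, 0, 1, 0, 0]
print(auc_roc(toy_scores, toy_labels))
```

The pairwise form is quadratic in the number of peptides, which is acceptable for a dataset of a few hundred entries.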
Figure 1. Plot of sensitivity and specificity of the ANN model against thresholds in 10-fold cross-validation. The ANN model for prediction of A) LMH set, B) MH set, and C) H set.
Figure 2. Plot of sensitivity and specificity of the HMM model against thresholds in 10-fold cross-validation. The HMM model for prediction of A) LMH set, B) MH set, and C) H set.
Sensitivities and specificities of ANN and HMM models at the selection threshold 6.0
| Threshold | ANN | SE | SP |
| 6.0 | LMH | 0.50 | 1.00 |
| 6.0 | MH | 0.67 | 0.97 |
| 6.0 | H | 0.88 | 0.89 |
| Threshold | HMM | SE | SP |
| 6.0 | LMH | 0.66 | 0.86 |
| 6.0 | MH | 0.81 | 0.81 |
| 6.0 | H | 0.91 | 0.68 |
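Sensitivity (SE) and specificity (SP) at a fixed selection threshold can be computed as sketched below. The function name is hypothetical, the toy data are made up, and treating a score equal to the threshold as a predicted binder is an assumption, since the source does not say whether the cut-off is inclusive.

```python
def sensitivity_specificity(scores, labels, threshold=6.0):
    """Return (SE, SP) when peptides scoring >= threshold are called binders.

    labels: 1 for experimental binder, 0 for non-binder.
    """
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        called_binder = score >= threshold  # assumption: inclusive cut-off
        if label == 1:
            if called_binder:
                tp += 1
            else:
                fn += 1
        else:
            if called_binder:
                fp += 1
            else:
                tn += 1
    se = tp / (tp + fn) if (tp + fn) else float("nan")
    sp = tn / (tn + fp) if (tn + fp) else float("nan")
    return se, sp

# Toy data, made up for illustration only.
toy_scores = [8.1, 7.4, 6.0, 5.2, 3.3, 2.0]
toy_labels = [1, 1, 0, 1, 0, 0]
se, sp = sensitivity_specificity(toy_scores, toy_labels, threshold=6.0)
print(f"SE={se:.2f} SP={sp:.2f}")
```

Raising the threshold trades sensitivity for specificity, which is the pattern plotted in Figures 1 and 2.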
Performance assessment of ANN/HMM models when the dataset was partitioned into two parts with the training dataset containing two thirds of the data points randomly selected and the testing set containing the remaining one third of data points
| ANN 180-2-1 | H | MH | LMH |
| 1st run | 0.91 | 0.92 | 0.85 |
| 2nd run | 0.96 | 0.95 | 0.90 |
| 3rd run | 0.94 | 0.91 | 0.87 |
| HMM | H | M | L |
| 1st run | 0.88 | 0.88 | 0.86 |
| 2nd run | 0.86 | 0.88 | 0.83 |
| 3rd run | 0.91 | 0.90 | 0.82 |
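The two-thirds/one-third partition used for this assessment can be sketched as a seeded random split. The helper name `split_two_thirds` is hypothetical, and rounding the cut point down is an assumption; the source does not state how the 493 data points were divided exactly.

```python
import random

def split_two_thirds(items, seed=0):
    """Randomly partition items into a training set (2/3) and a test set (1/3)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3  # assumption: round the cut point down
    return shuffled[:cut], shuffled[cut:]

# Indices stand in for the 493 peptides of the dataset.
train, test = split_two_thirds(range(493), seed=1)
print(len(train), len(test))
```

Repeating the split with different seeds corresponds to the separate runs reported in the table.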
Amino acid positions of the top 5% predicted TAP binders in human papillomavirus type 16 E6 (P03126) by SVMTAP, TAPPred and PREDTAP. Positions marked "+" were selected by four prediction models; positions marked "*" were selected by three prediction models. The experimentally identified HLA-A*0301 binders are [1] 7–15, [2] 33–41, [3] 42–50, [4] 59–67, [5] 75–83, [6] 89–97, [7] 93–101 and [8] 125–133. Predictions marked [1]–[8] in the table lie within 16-mers containing the respective HLA-A*0301 binders.
| SVMTAP | TAPPred (SVM) | TAPPred (Cascade SVM) | PREDTAP (ANN) | PREDTAP (HMM) |
| 75+ [5] | 53+ [4] | 51 | 75+ [5] | 75+ [5] |
| 131 [8] | 68 | 60 [4] | 53+ [4] | 46* [3] |
| 53+ [4] | 80 [5] | 49 [3] | 46* [3] | 61 [4] |
| 150 | 81 [5] | 132 [8] | 68 | 47 [3] |
| 130 [8] | 75+ [5] | 93 [7] | 83 | 71 |
| 46* [3] | 131 [8] | 116 | 59 [4] | 53+ [4] |
| 134 | 134 | 67 | 42 [3] | 49 [3] |
| 146 | 51 | 40 [2,3] | 130 [8] | 83 [6] |
Amino acid positions of the top 5% predicted TAP binders in HPV 16 E7 (P03129) by SVMTAP, TAPPred and PREDTAP. Positions marked "+" were selected by four prediction models and those marked "*" were selected by three prediction models. The experimentally identified HLA-A*0201 binder is 89–97. Positions marked [1] lie within a 16-mer containing E7 89–97.
| SVMTAP | TAPPred (SVM) | TAPPred (Cascade SVM) | PREDTAP (ANN) | PREDTAP (HMM) |
| 49+ | 49+ | 58 | 50* | 49+ |
| 9* | 50* | 57 | 9* | 44 |
| 50* | 17 | 88 [1] | 49+ | 43 |
| 59 | 9* | 82 [1] | 48 | 71 |
| 7 | 59 | 67 | 76 | 3 |
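The "+" and "*" annotations in the HPV 16 E7 table can be reproduced by counting how many of the five models list each position. The sketch below uses the positions from that table, with the footnote marker for the 89–97 region left off positions 88 and 82; the names `top_positions`, `votes` and `consensus` are ours.

```python
from collections import Counter

# Top 5% positions per model, copied from the HPV 16 E7 table.
# 88 and 82 carry footnote marker 1 in the table; the marker is omitted here.
top_positions = {
    "SVMTAP": [49, 9, 50, 59, 7],
    "TAPPred (SVM)": [49, 50, 17, 9, 59],
    "TAPPred (Cascade SVM)": [58, 57, 88, 82, 67],
    "PREDTAP (ANN)": [50, 9, 49, 48, 76],
    "PREDTAP (HMM)": [49, 44, 43, 71, 3],
}

# One vote per model per position.
votes = Counter(pos for picks in top_positions.values() for pos in set(picks))

def consensus(votes, n_models):
    """Positions selected by exactly n_models prediction models."""
    return sorted(pos for pos, n in votes.items() if n == n_models)

print("four models (+):", consensus(votes, 4))   # [49]
print("three models (*):", consensus(votes, 3))  # [9, 50]
```

This counts exact position matches only; whether the original annotation also tolerated near-matches is not stated in the source.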
Amino acid positions of the top 3% predicted TAP binders in the tumor antigen KM-HN-1 (NP_689988.1) by SVMTAP, TAPPred and PREDTAP. Positions marked "+" were selected by four prediction models and those marked "*" were selected by three prediction models. Predicted TAP binders in proximity of known T-cell epitopes are marked [1] (196–204), [2] (499–508) and [3] (770–778).
| SVMTAP position | TAPPred (SVM) position | TAPPred (Cascade SVM) position | PREDTAP (ANN) position | PREDTAP (ANN) score | PREDTAP (HMM) position | PREDTAP (HMM) score |
| 660+ | 372+ | 674 | 195+ [1] | 8.15 | 682 | 7.01 |
| 372+ | 195+ [1] | 314 | 654* | 8.09 | 506+ [2] | 6.98 |
| 426* | 426* | 639 | 372+ | 8.06 | 372+ | 6.87 |
| 195+ [1] | 794 | 249 | 422 | 7.94 | 683 | 6.83 |
| 794 | 330 | 530 | 565 | 7.41 | 507+ | 6.65 |
| 199 | 317+ | 525 | 317+ | 6.53 | 195+ [1] | 6.62 |
| 654* | 660+ | 325 | 310 | 5.97 | 492 | 6.44 |
| 317+ | 331 | 206 | 378 | 5.59 | 310 | 6.41 |
| 371 | 652* | 12 | 468 | 5.54 | 660+ | 6.41 |
| 110 | 199 | 479 | 763 [3] | 5.48 | 16 | 6.37 |
| 760 | 198 | 537 | 337 | 5.36 | 468 | 6.33 |
| 198 | 371 | 112 | 426* | 5.33 | 317+ | 6.19 |
| 789 | 654* | 93 | 737 | 5.16 | 395* | 6.16 |
| 705 | 506+ [2] | 626 | 246 | 5.04 | 573 | 6.12 |
| 457 | 789 | 470 | 660+ | 4.98 | 223 | 6.12 |
| 507+ | 457 | 141 | 756 | 4.96 | 193 | 6.09 |
| 48 | 730 | 483 | 110 | 4.95 | 730 | 6.09 |
| 573 | 304 | 71 | 506+ [2] | 4.89 | 313 | 6.05 |
| 376 | 565 | 99 | 201 | 4.79 | 647 | 6.05 |
| 652* | 395* | 668 | 507+ | 4.76 | 510 | 6.01 |
| 395* | 760 | 781 | 365 | 4.64 | 652* | 6.01 |
| 780 | 318 | 57 | 653 | 4.55 | 15 | 6.01 |
| 455 | 764 [3] | 579 | 502 [2] | 4.52 | 814 | 6.01 |
| 506+ [2] | 63 | 124 | 492 | 4.44 | 676 | 5.94 |
| 324 | 507+ | 590 | 456 | 4.43 | 782 | 5.94 |
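Selecting the top few percent of predicted binders, as in the tables above, amounts to ranking all scored positions and keeping the highest-scoring fraction. A minimal sketch follows; `top_fraction` is a hypothetical helper, rounding the cut-off up is an assumption, and the example scores are made up.

```python
import math

def top_fraction(scores_by_position, fraction):
    """Return (position, score) pairs for the highest-scoring fraction.

    scores_by_position: dict mapping 9-mer start position -> predicted score.
    Ties are broken by the lower position. The number kept is rounded up,
    which is an assumption; the source does not state the rounding rule.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    k = math.ceil(fraction * len(scores_by_position))
    ranked = sorted(scores_by_position.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]

# Made-up scores for four positions; keep the top half.
toy = {1: 5.0, 2: 8.0, 3: 6.5, 4: 1.0}
print(top_fraction(toy, 0.5))
```

Applying the same function to each model's scores gives the per-model lists whose overlap is then marked "+" or "*".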
Figure 3. Examples of the PREDTAP output pages for a single protein. The sequence type chosen is "protein sequence". A) The input page. B) The main result page. The input sequence is decomposed into overlapping 9-mers, and a TAP binding score is predicted for each. C) Alignment view of the predicted TAP binding regions in the input protein.
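The decomposition of a protein into overlapping 9-mers described in the caption can be sketched as a sliding window. The function name is hypothetical, the 1-based position numbering is an assumption, and the example string is arbitrary rather than a real protein.

```python
def overlapping_kmers(sequence, k=9):
    """Yield (start position, k-mer) for every window of length k.

    Positions are 1-based (an assumption about the numbering convention).
    A sequence of length n yields n - k + 1 windows.
    """
    for i in range(len(sequence) - k + 1):
        yield i + 1, sequence[i:i + k]

example = "ACDEFGHIKLMN"  # arbitrary 12-residue string, not a real protein
windows = list(overlapping_kmers(example))
for start, kmer in windows:
    print(start, kmer)
```

Each window would then be passed to the predictor, giving one binding score per start position.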
Figure 4. An example of the PREDTAP output pages for a list of peptides. A) The input page. B) The main result page. All 9-mers in each peptide were submitted for prediction. The binding score reported for each input peptide is the highest score among its 9-mers, and that 9-mer is displayed as "Binding Core" in the result table.
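The per-peptide summary described in the caption (highest 9-mer score plus its "Binding Core") can be sketched as follows. `best_core` is a hypothetical helper, and `toy_score` is a made-up stand-in that only makes the example runnable; it is not the PREDTAP ANN or HMM model.

```python
def best_core(peptide, score_fn, k=9):
    """Return (best score, binding core) over all k-mers of a peptide."""
    if len(peptide) < k:
        raise ValueError(f"peptide shorter than {k} residues")
    cores = [peptide[i:i + k] for i in range(len(peptide) - k + 1)]
    core = max(cores, key=score_fn)  # first k-mer wins on ties
    return score_fn(core), core

def toy_score(kmer):
    # Placeholder scoring rule for illustration only: counts F and Y residues.
    # It does NOT reflect how PREDTAP scores peptides.
    return float(kmer.count("F") + kmer.count("Y"))

score, core = best_core("AAAAAAAAAFY", toy_score)
print(score, core)
```

With a real predictor substituted for `toy_score`, the returned pair corresponds to the score and "Binding Core" columns of the result table.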