Sibel Kalyoncu, Ozlem Keskin, Attila Gursoy.
Abstract
BACKGROUND: The PDZ domain is a well-conserved structural protein domain found in hundreds of otherwise unrelated signaling proteins. PDZ domains can bind to the C-terminal peptides of different proteins and act as glue, clustering different protein complexes together, targeting specific proteins and routing these proteins in signaling pathways. These domains are classified into classes I, II and III, depending on their binding partners and the nature of the bonds formed. The binding specificities of PDZ domains are crucial for understanding the complexity of signaling pathways. How these domains recognize and bind their partners is still an open question.
Year: 2010 PMID: 20591147 PMCID: PMC2909223 DOI: 10.1186/1471-2105-11-357
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Representative structure of a PDZ domain in complex with its ligand. (a) The common representation of a PDZ domain (α-1 syntrophin) with a peptide (in stick form) in its binding pocket. Peptide positions -1 and -3 (blue) point towards the solvent, whereas positions 0 and -2 (pink) point into the binding pocket. (b) The interaction of the peptide with the αB helix and the conserved GLGF segment (here GLGI) of the βA-βB loop (PDB ID: 2PDZ).
Seven amino acid classes used in our model.
| Class | Amino acid(s) | Volume (Å³) | Dipole (Debye) |
|---|---|---|---|
| 1 | Ala, Gly, Val | <50 | 0 |
| 2 | Ile, Leu, Phe, Pro | >50 | 0 |
| 3 | Tyr, Met, Thr, Ser | >50 | <1.0 |
| 4 | His, Asn, Gln, Trp | >50 | 1.0 < Dip. < 2.0 |
| 5 | Arg, Lys | >50 | 2.0 < Dip. < 3.0 |
| 6 | Asp, Glu | >50 | >3.0 |
| 7 | Cys* | >50 | <1.0 |
*Cys is differentiated from class 3 because it can form disulfide bonds
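The seven-class alphabet above can be applied to a domain sequence before counting class bigrams or trigrams. The following is a minimal sketch, not the authors' implementation: the function names `encode` and `ngram_frequencies` are hypothetical, and normalizing counts to frequencies over all 7^n possible n-grams is an assumption about how the feature vector might be built.

```python
from collections import Counter
from itertools import product

# Seven amino acid classes from the table above, as one-letter codes.
CLASS_MEMBERS = {
    "1": "AGV",   # Ala, Gly, Val
    "2": "ILFP",  # Ile, Leu, Phe, Pro
    "3": "YMTS",  # Tyr, Met, Thr, Ser
    "4": "HNQW",  # His, Asn, Gln, Trp
    "5": "RK",    # Arg, Lys
    "6": "DE",    # Asp, Glu
    "7": "C",     # Cys
}
AA_TO_CLASS = {aa: cls for cls, members in CLASS_MEMBERS.items() for aa in members}


def encode(sequence):
    """Translate a one-letter amino acid sequence into a string of class digits."""
    return "".join(AA_TO_CLASS[aa] for aa in sequence.upper())


def ngram_frequencies(sequence, n):
    """Return the frequency of every possible class n-gram (7**n keys).

    Assumption: frequencies are counts divided by the number of n-grams
    in the sequence; the paper's exact normalization is not given here.
    """
    encoded = encode(sequence)
    counts = Counter(encoded[i:i + n] for i in range(len(encoded) - n + 1))
    total = sum(counts.values())
    keys = ["".join(p) for p in product("1234567", repeat=n)]
    return {k: (counts[k] / total if total else 0.0) for k in keys}
```

For example, the conserved GLGF segment encodes to `1212`, so its bigram vector has non-zero entries only for `12` and `21`. With n = 2 the vector has 49 entries and with n = 3 it has 343.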
Prediction results for interaction prediction of PDZ domains for both trigram and bigram models.
| Model | Training TPR | Training FPR | Training Precision | Training Accuracy (%) | Validation TPR | Validation FPR | Validation Precision | Validation Accuracy (%) |
|---|---|---|---|---|---|---|---|---|
| Trigram | 0.89 | 0.075 | 0.85 | 91.4 | 0.61 | 0.042 | 0.92 | 79.8 |
| Bigram | 0.844 | 0.053 | 0.89 | 91.2 | 0.889 | 0.323 | 0.545 | 74.2 |

Training-set values are from 10-fold cross-validation.
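The columns in this table follow the standard confusion-matrix definitions. As a hedged illustration, the sketch below computes them from raw counts; the function name `binary_metrics` and the example counts in the usage note are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute TPR, FPR, precision and accuracy (%) from confusion-matrix counts.

    tp/fn: interacting pairs predicted as interacting / non-interacting.
    fp/tn: non-interacting pairs predicted as interacting / non-interacting.
    A ratio with a zero denominator is reported as 0.0.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "TPR": ratio(tp, tp + fn),        # recall / sensitivity
        "FPR": ratio(fp, fp + tn),        # 1 - specificity
        "Precision": ratio(tp, tp + fp),
        "Accuracy": 100.0 * ratio(tp + tn, tp + fp + tn + fn),
    }
```

With made-up counts `binary_metrics(tp=8, fp=2, tn=18, fn=2)`, TPR and precision are both 0.8 and FPR is 0.1. A high TPR paired with a high FPR, as in the bigram validation column, is what pulls precision down.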
Figure 2. Performance evaluation of the Random Forest trigram model. (a) ROC curve and (b) precision versus recall curve for interaction prediction; (c) ROC curve and (d) precision versus recall curve for classification.
Prediction results for class prediction of PDZ domains for both trigram and bigram models.
| Classes | TP Rate (Trigram) | TP Rate (Bigram) | FP Rate (Trigram) | FP Rate (Bigram) | Precision (Trigram) | Precision (Bigram) | Accuracy % (Trigram) | Accuracy % (Bigram) |
|---|---|---|---|---|---|---|---|---|
| Class I, Class II, Class I-II* | 0.907 | 0.895 | 0.081 | 0.093 | 0.911 | 0.902 | 90.7 | 89.5 |
| Class I, Class II | 0.918 | 0.956 | 0 | 0.200 | 1 | 0.915 | 93.8 | 90.8 |
| Class I, Class I-II | 0.900 | 0.955 | 0 | 0.227 | 1 | 0.894 | 92.4 | 89.4 |
| Class II, Class I-II | 1 | 0.813 | 0.107 | 0 | 0.812 | 1 | 92.7 | 92.7 |
*The first row shows multi-class learning; the remaining rows show binary-class learning for pairwise combinations of the three classes. For multi-class learning, weighted average results are shown.
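The footnote does not say how the multi-class average is weighted. A common convention, assumed here, is to weight each class's metric by the number of instances in that class. The sketch below illustrates that convention with a hypothetical helper `weighted_average`; the class sizes and per-class values in the usage note are invented for illustration only.

```python
def weighted_average(per_class_values, class_sizes):
    """Average per-class metric values, weighting each class by its instance count.

    Assumption: weights are class supports; the source does not state the weighting.
    """
    total = sum(class_sizes[c] for c in per_class_values)
    if total == 0:
        raise ValueError("class sizes must sum to a positive number")
    return sum(per_class_values[c] * class_sizes[c] for c in per_class_values) / total
```

For instance, with invented per-class TP rates `{"I": 0.9, "II": 0.8, "I-II": 1.0}` and class sizes `{"I": 50, "II": 30, "I-II": 20}`, the weighted TP rate is (45 + 24 + 20) / 100 = 0.89.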
Prediction results after feature reduction.
| Task | TPR (Trigram) | TPR (Bigram) | FPR (Trigram) | FPR (Bigram) | Precision (Trigram) | Precision (Bigram) | AUC (Trigram) | AUC (Bigram) | Accuracy % (Trigram) | Accuracy % (Bigram) |
|---|---|---|---|---|---|---|---|---|---|---|
| Interaction prediction | 0.744 | 0.786 | 0.096 | 0.07 | 0.798 | 0.851 | 0.905 | 0.948 | 85 | 88.1 |
| Classification* | 0.942 | 0.86 | 0.044 | 0.096 | 0.942 | 0.859 | 0.994 | 0.966 | 94.2 | 86 |
* Weighted average result for multi-class learning (Class I, Class II, Class I-II)
Figure 3. Critical sequence motifs. (a) Aligned sequences of 5 representative PDZ domains: α1-syntrophin(1/1) (PDB ID: 2pdz), NHERF1(1/2) (PDB ID: 1i92), Harmonin(2/3) (PDB ID: 2kbs), Pick1(1/1) (PDB ID: 2pku) and PTP-BL(2/5) (PDB ID: 1vj6). For each domain, the first row gives the aligned sequence and the second row gives the same sequence in the seven-class amino acid alphabet. Secondary structure positions of the PDZ sequences are represented graphically at the top (αA, αB, βA-βF). Three sequence motifs ("12", "16", "25") proposed to account for ligand specificity are highlighted in yellow. (b) Cartoon diagrams of these PDZ domains; motifs "12", "16" and "25" are colored in red and shown in stick form.