Wen-Yi Chu, Yu-Feng Huang, Chun-Chin Huang, Yi-Sheng Cheng, Chien-Kang Huang, Yen-Jen Oyang.
Abstract
This article presents the design of a sequence-based predictor named ProteDNA for identifying the sequence-specific binding residues in a transcription factor (TF). Protein-DNA interactions involve two types of binding mechanisms, namely sequence-specific binding and nonspecific binding. Sequence-specific bindings occur between protein sidechains and nucleotide bases and correspond to sequence-specific recognition of genes. Therefore, sequence-specific bindings are essential for correct gene regulation. In this respect, ProteDNA is distinctive since it has been designed to identify sequence-specific binding residues. To accommodate users with different application needs, ProteDNA operates under two modes, namely the high-precision mode and the balanced mode. According to the experiments reported in this article, under the high-precision mode ProteDNA delivers precision of 82.3%, specificity of 99.3%, sensitivity of 49.8% and accuracy of 96.5%. Under the balanced mode, ProteDNA delivers precision of 60.8%, specificity of 97.6%, sensitivity of 60.7% and accuracy of 95.4%. ProteDNA is available at the following websites: http://protedna.csbb.ntu.edu.tw/, http://protedna.csie.ntu.edu.tw/, http://bio222.esoe.ntu.edu.tw/ProteDNA/.
Year: 2009 PMID: 19483101 PMCID: PMC2703882 DOI: 10.1093/nar/gkp449
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Illustration of the function of ProteDNA. (a) Partial prediction output of ProteDNA for the polypeptide sequence of yeast TF GCN4 in PDB complex 1YSA. (b) The tertiary structure of the complex with PDB ID 1YSA. Residues colored red are the sequence-specific binding residues correctly identified by ProteDNA, while residues colored blue are the false negatives. In this case, there are no false positives.
Figure 2. Overview of the architecture of ProteDNA.
Figure 3. The webpage for submitting a job.
Overall performance of ProteDNA
| Type of the secondary structure element | No. of residues tested | TP | TN | FP | FN | Precision | Sensitivity | Specificity | Accuracy |
|---|---|---|---|---|---|---|---|---|---|
| Performance under the high-precision mode | | | | | | | | | |
| Helix | 33 769 | 1397 | 30 916 | 320 | 1136 | 0.814 | 0.552 | 0.990 | 0.957 |
| Sheet | 5396 | 0 | 5239 | 0 | 157 | NA | 0.000 | 1.000 | 0.971 |
| Coil | 21 286 | 355 | 20 401 | 57 | 473 | 0.862 | 0.429 | 0.997 | 0.975 |
| Overall | 60 451 | 1752 | 56 556 | 377 | 1766 | 0.823 | 0.498 | 0.993 | 0.965 |
| Performance under the balanced mode | | | | | | | | | |
| Helix | 33 769 | 1679 | 30 299 | 937 | 854 | 0.642 | 0.663 | 0.970 | 0.947 |
| Sheet | 5396 | 39 | 5208 | 31 | 118 | 0.557 | 0.248 | 0.994 | 0.972 |
| Coil | 21 286 | 417 | 20 052 | 406 | 411 | 0.507 | 0.504 | 0.980 | 0.962 |
| Overall | 60 451 | 2135 | 55 559 | 1374 | 1383 | 0.608 | 0.607 | 0.976 | 0.954 |
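The four derived columns in the table above follow from the TP, TN, FP and FN counts. The record does not spell out the formulas, so the sketch below assumes the standard definitions, which reproduce the tabulated values for the rows checked here. The function name `confusion_metrics` is illustrative and not part of ProteDNA.

```python
from typing import Dict, Optional


def confusion_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, Optional[float]]:
    """Derive the four reported metrics from raw confusion counts.

    Assumes the standard definitions. A metric with a zero denominator
    is returned as None, which corresponds to an "NA" cell in the table.
    """
    def ratio(numerator: int, denominator: int) -> Optional[float]:
        return numerator / denominator if denominator else None

    return {
        "precision": ratio(tp, tp + fp),      # TP / (TP + FP)
        "sensitivity": ratio(tp, tp + fn),    # TP / (TP + FN)
        "specificity": ratio(tn, tn + fp),    # TN / (TN + FP)
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }


# Counts copied from the table: the "Overall" and "Sheet" rows of the first block.
overall = confusion_metrics(tp=1752, tn=56556, fp=377, fn=1766)
sheet = confusion_metrics(tp=0, tn=5239, fp=0, fn=157)

for name, value in overall.items():
    print(f"{name}: {value:.3f}")
print("sheet precision:", sheet["precision"])  # no positive predictions, so undefined
```

The "Sheet" row shows why precision is reported as NA there: with no positive predictions, TP + FP is zero and the ratio is undefined.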
Breakdown of the experimental results with ProteDNA in respect of different types of TF-DNA bindings
| Type of TF-DNA bindings | No. of TFs involved | No. of residues tested | TP | TN | FP | FN | Precision | Sensitivity | Specificity | Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|
| Performance under the high-precision mode | | | | | | | | | | |
| Zipper-type | 146 | 9587 | 586 | 8667 | 128 | 206 | 0.821 | 0.740 | 0.985 | 0.965 |
| Helix-turn-helix | 220 | 27 063 | 510 | 25 455 | 149 | 949 | 0.774 | 0.350 | 0.994 | 0.959 |
| Zinc-coordinating | 152 | 12 105 | 598 | 11 098 | 86 | 323 | 0.874 | 0.649 | 0.992 | 0.966 |
| β-hairpin/ribbon | 38 | 2618 | 0 | 2488 | 0 | 130 | NA | 0.000 | 1.000 | 0.950 |
| Others | 44 | 9078 | 58 | 8848 | 14 | 158 | 0.806 | 0.269 | 0.998 | 0.981 |
| Overall | 600 | 60 451 | 1752 | 56 556 | 377 | 1766 | 0.823 | 0.498 | 0.993 | 0.965 |
| Performance under the balanced mode | | | | | | | | | | |
| Zipper-type | 146 | 9587 | 643 | 8496 | 299 | 149 | 0.683 | 0.812 | 0.966 | 0.953 |
| Helix-turn-helix | 220 | 27 063 | 769 | 24 994 | 610 | 690 | 0.558 | 0.527 | 0.976 | 0.952 |
| Zinc-coordinating | 152 | 12 105 | 610 | 10 925 | 259 | 311 | 0.702 | 0.662 | 0.977 | 0.953 |
| β-hairpin/ribbon | 38 | 2618 | 39 | 2365 | 123 | 91 | 0.241 | 0.300 | 0.951 | 0.918 |
| Others | 44 | 9078 | 74 | 8778 | 84 | 142 | 0.468 | 0.343 | 0.991 | 0.975 |
| Overall | 600 | 60 451 | 2135 | 55 558 | 1375 | 1383 | 0.608 | 0.607 | 0.976 | 0.954 |
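The "Overall" rows in the breakdown above appear to be obtained by pooling the per-type counts rather than by averaging the per-type rates. The sketch below checks this for the first block of the table; the dictionary and helper names are illustrative, and the counts are copied from the rows above.

```python
# (TP, TN, FP, FN) per binding type, copied from the first block of the table.
HIGH_PRECISION = {
    "Zipper-type": (586, 8667, 128, 206),
    "Helix-turn-helix": (510, 25455, 149, 949),
    "Zinc-coordinating": (598, 11098, 86, 323),
    "beta-hairpin/ribbon": (0, 2488, 0, 130),
    "Others": (58, 8848, 14, 158),
}


def pool(counts):
    """Sum TP, TN, FP and FN column-wise across all rows."""
    return tuple(sum(column) for column in zip(*counts.values()))


def sensitivity(tp, fn):
    """TP / (TP + FN), assuming the standard definition."""
    return tp / (tp + fn)


tp, tn, fp, fn = pool(HIGH_PRECISION)
print((tp, tn, fp, fn))  # (1752, 56556, 377, 1766), matching the "Overall" row

# Pooled rate versus the unweighted mean of the per-type rates.
pooled_sensitivity = sensitivity(tp, fn)
mean_sensitivity = sum(
    sensitivity(row[0], row[3]) for row in HIGH_PRECISION.values()
) / len(HIGH_PRECISION)
print(f"pooled: {pooled_sensitivity:.3f}, unweighted mean: {mean_sensitivity:.3f}")
```

Pooling weights each binding type by its number of residues, so the large helix-turn-helix group influences the overall sensitivity more than the small β-hairpin/ribbon group does.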
Performance delivered by alternative predictors of DNA-binding residues, where the F-score is the harmonic mean of precision and sensitivity
| Predictor | Sensitivity | Specificity | Accuracy | Precision | F-score |
|---|---|---|---|---|---|
| ProteDNA under the high-precision mode | 0.498 | 0.993 | 0.965 | 0.823 | 0.621 |
| ProteDNA under the balanced mode | 0.607 | 0.976 | 0.954 | 0.608 | 0.607 |
| Ahmad and Sarai | 0.682 | 0.660 | 0.664 | 0.308* | 0.425* |
| Yan and | 0.410 | 0.871 | 0.780 | 0.439* | 0.424* |
| BindN | 0.652 | 0.728 | 0.722 | 0.186* | 0.289* |
| DP-Bind | 0.791 | 0.786 | 0.800 | –* | –* |
The numbers with an asterisk are those that have been derived from the numbers reported in the related studies.
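The F-score column is described as the harmonic mean of precision and sensitivity. A minimal sketch of that calculation follows; the function name `f_score` is illustrative. Because the inputs here are the rounded table values, the result can differ from the tabulated F-score in the third decimal place.

```python
def f_score(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity.

    Returns 0.0 when both inputs are zero, where the harmonic mean
    is otherwise undefined.
    """
    total = precision + sensitivity
    if total == 0:
        return 0.0
    return 2 * precision * sensitivity / total


# (precision, sensitivity) pairs copied from the comparison table.
rows = {
    "ProteDNA, high-precision mode": (0.823, 0.498),
    "ProteDNA, balanced mode": (0.608, 0.607),
    "BindN": (0.186, 0.652),
}

for predictor, (precision, sens) in rows.items():
    print(f"{predictor}: F-score = {f_score(precision, sens):.3f}")
```

The harmonic mean penalizes an imbalance between the two inputs, which is why a predictor with high sensitivity but low precision still receives a low F-score.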