Yao Chi Chen, Jon D Wright, Carmay Lim.
Abstract
DR_bind is a web server that automatically predicts DNA-binding residues from a given protein structure based on (i) electrostatics, (ii) evolution and (iii) geometry. In contrast to machine-learning methods, DR_bind does not require a training data set or any parameters. It predicts DNA-binding residues by detecting a cluster of conserved, solvent-accessible residues that are electrostatically stabilized upon mutation to Asp(-)/Glu(-). The server requires as input the DNA-binding protein structure in PDB format and outputs a downloadable text file of the predicted DNA-binding residues, a 3D visualization of the predicted residues highlighted in the given protein structure, and a downloadable PyMol script for visualization of the results. Calibration on 83 and 55 non-redundant DNA-bound and DNA-free protein structures yielded a DNA-binding residue prediction accuracy/precision of 90/47% and 88/42%, respectively. Since DR_bind does not require any training using protein-DNA complex structures, it may predict DNA-binding residues in novel structures of DNA-binding proteins resulting from structural genomics projects with no conservation data. The DR_bind server is freely available with no login requirement at http://dnasite.limlab.ibms.sinica.edu.tw.
Year: 2012 PMID: 22661576 PMCID: PMC3394278 DOI: 10.1093/nar/gks481
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. An example of the Results page from DR_bind.
Comparison of the performance measures of DR_bind using our non-redundant data set of 83 DNA-bound and 55 DNA-free protein structures and the protein–DNA benchmark version 1.2 containing 47 DNA-bound and 47 DNA-free protein structures.
| Data set | I (bound) | II (free) | III (bound) | III (free) |
|---|---|---|---|---|
| No. of structures | 83 | 55 | 47 | 47 |
| TP | 728 | 419 | 468 | 417 |
| FP | 831 | 566 | 371 | 429 |
| TN | 18 128 | 11 596 | 6486 | 6435 |
| FN | 1362 | 792 | 702 | 693 |
| Precision | 0.47 | 0.43 | 0.56 | 0.49 |
| Sensitivity | 0.35 | 0.35 | 0.40 | 0.38 |
| Specificity | 0.96 | 0.95 | 0.95 | 0.94 |
| Accuracy | 0.90 | 0.90 | 0.87 | 0.86 |
| MCC | 0.35 | 0.33 | 0.40 | 0.35 |
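The derived rows of the table above follow from the four confusion counts. The sketch below assumes the standard definitions of precision, sensitivity, specificity, accuracy and the Matthews correlation coefficient (MCC); the source does not spell these formulas out, and the function name `binary_metrics` is only illustrative.

```python
from math import sqrt

def binary_metrics(tp, fp, tn, fn):
    """Return standard binary-classification measures from confusion counts."""
    total = tp + fp + tn + fn
    # Denominator of the Matthews correlation coefficient.
    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "precision": tp / (tp + fp),      # fraction of predicted residues that bind DNA
        "sensitivity": tp / (tp + fn),    # fraction of DNA-binding residues recovered
        "specificity": tn / (tn + fp),    # fraction of non-binding residues rejected
        "accuracy": (tp + tn) / total,    # fraction of all residues classified correctly
        "mcc": (tp * tn - fp * fn) / mcc_den,
    }

# Counts for Data set I (bound), copied from the table above.
data_set_i = binary_metrics(tp=728, fp=831, tn=18128, fn=1362)
for name, value in data_set_i.items():
    print(f"{name}: {value:.2f}")
```

Rounded to two decimals, these values agree with the Data set I (bound) column, which suggests the table uses the same definitions.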
Figure 2. The percent frequency of a precision value derived from 1000 random choices of (a) 40 DNA-bound structures from Data set I and (b) 25 DNA-free protein structures from Data set II. The solid, dashed, dotted and dashed–dotted curves correspond to precision values obtained using DR_bind, BindN+, NAPS and DNABINDPROT, respectively.
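The resampling described in the Figure 2 caption can be sketched roughly as follows. This is only an illustration under stated assumptions: the per-structure (TP, FP) counts are synthetic placeholders rather than values from the paper, precision is assumed to be pooled over the chosen structures (the caption does not say whether it is pooled or averaged), and the bin width and all function names are hypothetical.

```python
import random

def pooled_precision(counts):
    """Precision from (TP, FP) pairs summed over a subset of structures (assumed pooling)."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    return tp / (tp + fp) if (tp + fp) else 0.0

def precision_distribution(per_structure, subset_size, n_draws=1000, seed=0):
    """Precision values from repeated random choices of subset_size structures."""
    rng = random.Random(seed)
    return [
        pooled_precision(rng.sample(per_structure, subset_size))
        for _ in range(n_draws)
    ]

def percent_frequency(values, bin_width=0.05):
    """Percentage of draws falling into each precision bin on [0, 1]."""
    n_bins = int(round(1 / bin_width))
    counts = [0] * n_bins
    for v in values:
        counts[min(int(v / bin_width), n_bins - 1)] += 1
    return [100.0 * c / len(values) for c in counts]

# Synthetic placeholder (TP, FP) counts for 83 structures; NOT data from the paper.
placeholder_rng = random.Random(1)
placeholder = [
    (placeholder_rng.randint(0, 20), placeholder_rng.randint(0, 20))
    for _ in range(83)
]

values = precision_distribution(placeholder, subset_size=40)  # as in panel (a)
freq = percent_frequency(values)
```

With real per-structure counts for each server, plotting `freq` against the bin centres would give one curve per server, comparable in form to those described in the caption.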
Comparison of the performance measures of DR_bind, BindN+, NAPS and DNABINDPROT using the same data set of 83 DNA-bound or 55 DNA-free protein structures.
| Server | DR_bind | BindN+ | NAPS | DNABINDPROT |
|---|---|---|---|---|
| TP | 728 (419) | 1013 (542) | 328 (180) | 244 (169) |
| FP | 831 (566) | 1798 (1129) | 733 (459) | 1040 (772) |
| TN | 18 128 (11 596) | 17 161 (11 033) | 18 226 (11 703) | 17 919 (11 390) |
| FN | 1362 (792) | 1077 (669) | 1762 (1031) | 1846 (1042) |
| Precision | 0.47 (0.43) | 0.36 (0.32) | 0.31 (0.28) | 0.19 (0.18) |
| Sensitivity | 0.35 (0.35) | 0.48 (0.45) | 0.16 (0.15) | 0.12 (0.14) |
| Specificity | 0.96 (0.95) | 0.91 (0.91) | 0.96 (0.96) | 0.95 (0.94) |
| Accuracy | 0.90 (0.90) | 0.86 (0.87) | 0.88 (0.89) | 0.86 (0.86) |
| MCC | 0.35 (0.33) | 0.34 (0.31) | 0.16 (0.15) | 0.08 (0.09) |
a. The PDB entries are listed in Supplementary Table S1; the total number of residues in the data set is 21 049, out of which 2090 residues are DNA-binding (=TP+FN) and 18 959 residues are non-DNA-binding (=FP+TN).
b. Performance measures based on the DNA-free protein structures are in parentheses.
c. The PDB entries are listed in Supplementary Table S1; the total number of residues in the data set is 13 373, out of which 1211 residues are DNA-binding (=TP+FN) and 12 162 residues are non-DNA-binding (=FP+TN).
Comparison of the performance measures of DR_bind, BindN+ and NAPS using the same data set of 15 DNA-bound protein structures with no or insufficient close homologs.
| Server | DR_bind | BindN+ | NAPS |
|---|---|---|---|
| TP | 110 | 230 | 34 |
| FP | 122 | 618 | 115 |
| TN | 2585 | 2089 | 2592 |
| FN | 292 | 172 | 368 |
| Precision | 0.47 | 0.27 | 0.23 |
| Sensitivity | 0.27 | 0.57 | 0.08 |
| Specificity | 0.95 | 0.77 | 0.96 |
| Accuracy | 0.87 | 0.75 | 0.84 |
| MCC | 0.29 | 0.26 | 0.07 |
a. The PDB entries are listed in Supplementary Table S2; the total number of residues in the data set is 3109, out of which 402 residues are DNA-binding (=TP+FN) and 2707 residues are non-DNA-binding (=FP+TN).
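The identities in the footnote (DNA-binding = TP+FN, non-DNA-binding = FP+TN) give a quick consistency check on each server's column, since all servers were evaluated on the same residues. A minimal sketch, with the counts copied from the table above and an illustrative helper name `check_totals`:

```python
# Confusion counts per server, copied from the table above.
table = {
    "DR_bind": {"tp": 110, "fp": 122, "tn": 2585, "fn": 292},
    "BindN+": {"tp": 230, "fp": 618, "tn": 2089, "fn": 172},
    "NAPS": {"tp": 34, "fp": 115, "tn": 2592, "fn": 368},
}

# Residue totals stated in the footnote.
BINDING, NON_BINDING, TOTAL = 402, 2707, 3109

def check_totals(counts, binding, non_binding):
    """True if the counts reproduce the footnote's residue totals."""
    return (
        counts["tp"] + counts["fn"] == binding
        and counts["fp"] + counts["tn"] == non_binding
    )

results = {
    server: check_totals(counts, BINDING, NON_BINDING)
    for server, counts in table.items()
}
print(results)
```

Every server's column should pass this check; a failure would point to a transcription error in that column.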