Abstract
BACKGROUND: Protein-DNA interactions play important roles in many biological processes. Computational methods that can accurately predict DNA-binding sites on proteins will greatly expedite research on problems involving protein-DNA interactions.
Year: 2014 PMID: 25521807 PMCID: PMC4290685 DOI: 10.1186/1752-0509-8-S4-S10
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Contributions of features.
| Features | PSSM¹ | E_P² | Ent³ | StrCn⁴ | rASA⁵ | Cur⁶ | Poc⁷ | All⁸ |
|---|---|---|---|---|---|---|---|---|
| Accuracy (%) | 86.9 | 77.0 | 67.5 | 54.7 | 54.5 | 54.1 | 54.1 | 88.7 |
1 PSSM: position-specific scoring matrix; 2 E_P: electrostatic potential; 3 Ent: sequence entropy; 4 StrCn: structural conservation; 5 rASA: relative solvent accessibility; 6 Cur: surface curvature; 7 Poc: size of the pocket where the residue is located; and 8 All: all the seven attributes were used.
Figure 1. The ROC of the proposed method in predicting DNA-binding site residues.
Predictions by the top 1 patch.
| Unbound | Bound | TP² | FP³ | P⁴ |
|---|---|---|---|---|
| 1iknA,C | 1leiA,B | 6 | 0 | 0.4 |
| | 1zrfA,B | 6 | 0 | 2.4 |
| 1zzkA | 1zziA | 6 | 0 | 2.9 |
| | 1ztwA | 4 | 2 | 4.2 |
| 1a2pC | 1brnL | 6 | 0 | 4.5 |
| | 1m3qA | 6 | 0 | 4.9 |
| | 1cl8A,B | 4 | 2 | 8.0 |
| 1lqc | 1l1mA,B | 6 | 0 | 8.7 |
| | 1rfiB | 2 | 4 | 9.5 |
| | 1xyiA | 6 | 0 | 10.7 |
| 1qqiA | 1gxpA,B | 4 | 2 | 19.3 |
| | 1u1qA | 3 | 3 | 25.6 |
| | 1f5eP | 4 | 2 | 26.7 |
1 For the proteins in italic, the interfaces were predicted using only PSSM. For the others, all seven features were used; 2 TP: the number of the interface residues falling in the top 1 patch; 3 FP: the number of non-interface residues in the top 1 patch. 4 P: When a patch is randomly picked, the probability of it containing at least as many interface residues as the top 1 patch.
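The quantity P in footnote 4 can be illustrated with a short calculation. The sketch below is not the authors' procedure: it assumes, as a simplification, that a "randomly picked" patch is a uniformly random subset of surface residues, in which case P is a hypergeometric tail probability. The function name `patch_p_value` and the residue counts in the usage lines are hypothetical; only the patch size of 6 (TP + FP in every row of the table) comes from the source.

```python
from math import comb


def patch_p_value(n_surface: int, n_interface: int, patch_size: int, tp: int) -> float:
    """Probability that a random patch holds at least `tp` interface residues.

    Assumption (not from the source): the patch is a uniformly random
    subset of `patch_size` residues drawn from `n_surface` surface
    residues, of which `n_interface` are true interface residues.
    """
    if not 0 <= n_interface <= n_surface:
        raise ValueError("n_interface must lie between 0 and n_surface")
    if not 0 <= patch_size <= n_surface:
        raise ValueError("patch_size must lie between 0 and n_surface")

    total = comb(n_surface, patch_size)          # all possible patches
    upper = min(patch_size, n_interface)         # most hits a patch can hold
    favourable = sum(
        comb(n_interface, k) * comb(n_surface - n_interface, patch_size - k)
        for k in range(max(tp, 0), upper + 1)
    )
    return favourable / total


# Hypothetical protein: 100 surface residues, 20 of them at the interface.
# A patch of 6 residues with all 6 on the interface (TP = 6, FP = 0):
print(patch_p_value(n_surface=100, n_interface=20, patch_size=6, tp=6))
# The same protein with only 4 of 6 patch residues on the interface:
print(patch_p_value(n_surface=100, n_interface=20, patch_size=6, tp=4))
```

Real surface patches are spatially contiguous, so an empirical estimate over all candidate patches of a protein may differ from this idealized value.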
Figure 2. Tradeoff between coverage and accuracy for the proposed method.
Comparison with other methods.
| Proteins | Graph Kernel¹ | DISPLAR² | MV³ | | | |
|---|---|---|---|---|---|---|
| 1a2p | 0.44 | 0.33 | 0.45 | 0.45 | ||
| 1ikn | 0.46 | 0.38 | 0.46 | 0.39 | ||
| 1lqc | 0.95 | 0.56 | 0.95 | 0.49 | ||
| 1qqi | 0.70 | 0.67 | 0.70 | 0.33 | ||
| 1zzk | 0.37 | 0.47 | ||||
| 1qc9 | 0.55 | 0.41 | 0.55 | 0.23 | ||
| 2alc | 0.90 | 0.59 | 0.75 | 0.47 | ||
| 1ko9 | 0.48 | 0.80 | 0.50 | 0.89 | ||
| 1qzq | 0.57 | 0.63 | 0.57 | 0.24 | ||
| 1l3k | 0.44 | 0.56 | 0.44 | 0.63 | ||
| 1xx8 | 0.60 | 0.92 | 0.74 | 0.32 | ||
| 1g6n | 0.48 | 0.73 | 0.48 | 0.52 | ||
1 The method proposed in this study; 2 The DISPLAR method developed by Tjong and Zhou [7]; 3 The MV method developed by [9]; 4 The bold font shows the best performance among the three methods on each protein.