Inbal Paz, Efrat Kligun, Barak Bengad, Yael Mandel-Gutfreund.
Abstract
Gene expression is a multi-step process involving many layers of regulation. The main regulators of the pathway are DNA and RNA binding proteins. Although a large number of DNA and RNA binding proteins have been identified and extensively studied over the years, many others, some of which already have another known function, are still expected to await discovery. Here we present a new web server, BindUP, freely accessible at http://bindup.technion.ac.il/, for predicting DNA and RNA binding proteins using a non-homology-based approach. Our method is based on the electrostatic features of the protein surface and other general properties of the protein. BindUP predicts nucleic acid binding function given the protein's three-dimensional structure or a structural model. Additionally, BindUP provides information on the largest electrostatic surface patches, visualized on the server. The server was tested on several datasets of DNA and RNA binding proteins, including proteins which do not possess DNA or RNA binding domains and have no similarity to known nucleic acid binding proteins, achieving very high accuracy. BindUP can be run in either single or batch mode and can test hundreds of proteins simultaneously in a highly efficient manner.
Year: 2016 PMID: 27198220 PMCID: PMC4987955 DOI: 10.1093/nar/gkw454
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Examples of BindUP results pages. (A) A presentation of the largest positive patch, calculated on a structural model of NOL10, constructed by I-TASSER. The model is predicted to be NA-binding. (B) A presentation of the largest negative and positive patches, calculated on chain A of PDB ID: 3S30, which is predicted to be non-NA-binding. (C) A presentation of the largest positive patches, calculated on eight protein chains of PDB ID: 1AOI, displayed together with the DNA chains. (D) A presentation of the three largest positive patches, calculated on the two protein chains of PDB ID: 1QRV, displayed together with the DNA chains.
A summary of BindUP results tested on different datasets
| Dataset | Algorithm | Sensitivity | Specificity | AUC |
|---|---|---|---|---|
| BindUP_NA323 | BindUP | 0.71 | 0.96 | 0.94 |
| BindUP_NA230_struct | BindUP | 0.70 | 0.91 | 0.91 |
| BindUP_R127 | BindUP | 0.65 | 0.97 | 0.90 |
| BindUP_R127 | SPOT-Struct-RNA | 0.63 | 0.99 | |
| BindUP_D190 | BindUP | 0.74 | 0.95 | 0.96 |
| BindUP_D190 | SPOT-Struct-DNA | 0.57 | 1.00 | |
| RBscore_P627 | BindUP | 0.60 | 0.90 | 0.83 |
Sensitivity was calculated using the formula TP/(TP + FN). Specificity was calculated using the formula TN/(TN + FP). AUC was calculated using the Gist Support Vector Machine (SVM) classifier (http://www.chibi.ubc.ca/gist/). Results for SPOT-Struct-DNA and SPOT-Struct-RNA were obtained by running the independent datasets D190 and R127 on the respective web servers. RBscore_P627 was extracted from (15,45). The dataset was processed by removing protein structures that are defined as obsolete in the RCSB PDB database, as well as structures that overlap with the BindUP training set.
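The sensitivity and specificity formulas in the table note can be illustrated with a minimal Python sketch. The function names and the confusion-matrix counts are hypothetical, chosen only for illustration; they are not taken from the BindUP datasets or code.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true binders that are predicted as binders: TP / (TP + FN)."""
    if tp + fn == 0:
        raise ValueError("no positive examples: sensitivity is undefined")
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of non-binders that are predicted as non-binders: TN / (TN + FP)."""
    if tn + fp == 0:
        raise ValueError("no negative examples: specificity is undefined")
    return tn / (tn + fp)


# Hypothetical counts (assumed for illustration, not from the paper):
# 100 nucleic acid binding proteins and 100 non-binding proteins.
tp, fn = 71, 29   # binders predicted as binding / missed
tn, fp = 96, 4    # non-binders predicted as non-binding / falsely flagged

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"specificity = {specificity(tn, fp):.2f}")
```

A classifier can trade one measure against the other by moving its decision threshold, which is why the table reports both alongside the threshold-independent AUC.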