BindUP: a web server for non-homology-based prediction of DNA and RNA binding proteins.

Inbal Paz¹, Efrat Kligun¹, Barak Bengad¹, Yael Mandel-Gutfreund².

Abstract

Gene expression is a multi-step process involving many layers of regulation. The main regulators of the pathway are DNA and RNA binding proteins. Although a large number of DNA and RNA binding proteins have been identified and extensively studied over the years, many other proteins, some already known for a different function, are still expected to await discovery. Here we present a new web server, BindUP, freely accessible through the website http://bindup.technion.ac.il/, for predicting DNA and RNA binding proteins using a non-homology-based approach. Our method is based on the electrostatic features of the protein surface and other general properties of the protein. BindUP predicts nucleic acid binding function given the protein's three-dimensional structure or a structural model. Additionally, BindUP provides information on the largest electrostatic surface patches, visualized on the server. The server was tested on several datasets of DNA and RNA binding proteins, including proteins which do not possess DNA or RNA binding domains and have no similarity to known nucleic acid binding proteins, achieving very high accuracy. BindUP is applicable in either single or batch modes and can be applied for testing hundreds of proteins simultaneously in a highly efficient manner.

Year: 2016 PMID: 27198220 PMCID: PMC4987955 DOI: 10.1093/nar/gkw454

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

Nucleic acid binding proteins (NABPs), specifically DNA-binding proteins (DBPs) and RNA-binding proteins (RBPs), play a crucial role in all steps of the gene expression pathway, from RNA transcription via post-transcriptional regulation to protein translation (1,2). In recent years it has become apparent that both DBPs and RBPs are also involved in epigenetic regulation (3,4). Understanding the complexity of gene expression regulation requires identifying the RBPs and DBPs that are involved in these processes and defining their RNA and DNA binding sites. Over the years, high resolution structures of protein–DNA and protein–RNA complexes, solved by X-ray crystallography and nuclear magnetic resonance, have provided crucial information on the properties of DBPs and RBPs and on their modes of interaction with the nucleic acids. In recent years, there has been an enormous advance in the development of high-throughput experimental technologies for detecting NABPs. Recently, several high-throughput proteomic-based methodologies were developed for in vivo detection of RBPs in eukaryotes (5–11). These approaches (known as ‘RNA interactome capture’ experiments) were successfully employed to identify a large fraction of known RBPs, as well as to detect novel RBPs. While DBPs have been extensively studied and characterized (2), de novo detection of proteins which bind DNA is generally a hard task. In vitro methods for detection of RNA and DNA binding specificities, such as RNAcompete (12) and protein binding microarrays (13), respectively, have also been employed for validating nucleic acid binding preferences. Despite these experimental advances, the high-throughput nature of these techniques means that they tend to misclassify a substantial number of proteins (both false negatives and false positives).
It is thus of great importance to complement the experimental approaches, aiming to discover novel NABPs, with sophisticated computational approaches. Over the years many computational approaches have been developed for the identification and classification of DNA and RNA binding proteins and their binding sites (for recent reviews see (14–17)). The computational methods for classifying DNA and RNA binding proteins can be roughly divided into methods that are based on the protein structure (e.g. (18–21) applied for DNA and (22–25) for RNA) and those that rely on the amino acid sequence alone (such as (26–28)). Moreover, function prediction methods have been employed for predicting the function of NABPs from sequences based on structure, using fold recognition approaches, e.g. (25,29). Several of these methods have been implemented as web servers, such as DBD-Hunter (19), SPOT-Struct-DNA (30) and SPOT-Struct-RNA (25) for identifying DNA and RNA binding proteins from structure. In addition, the DBD-threader (29), PSSM-DT (16), RNApred (27) and SPOT-Seq-RNA (24) web servers are available for annotating NA-binding function given the protein sequence using template-based approaches. The vast majority of the computational approaches available to date for classifying NABPs (both structure-based and sequence-based methods) rely on homology and thus are less effective for identifying novel DNA and RNA binding proteins. Recently, Zhou et al. applied their template-based SPOT-Seq-RNA algorithm to the entire human proteome, correctly identifying 42.6% of all annotated RBPs (31). Similar results were achieved when testing their method on the RBPs which were discovered by the interactome capture experiment conducted in human HeLa cells (6). We have previously developed a machine learning approach named NAbind for classifying DNA and RNA binding proteins given the protein structures (20,23).
NAbind is based on extracting the largest positive electrostatic patch on the protein surface, as implemented in our PFPlus web server (32). Notably, while the majority of NABPs bind the DNA or RNA via a continuous positive interface, some proteins employ different strategies, for example the tRNA binding proteins that possess a significantly large negative patch and bind their tRNA substrate via two distinct positive patches (33). Such proteins, which do not rely on a large positive surface for NA-binding, are likely to be mispredicted by the algorithm. Nevertheless, the great advantage of the NAbind algorithm is that it is based on the overall physicochemical and structural features of the NABPs, learnt from the three-dimensional (3D) structures of known DBPs and RBPs, and does not rely on either sequence or structural homology to the known NABPs. NAbind has recently been trained on a large set of non-redundant DNA and RNA binding proteins (sharing <25% sequence identity) from the Protein Data Bank (PDB) (34) and has been modified to be applicable to both experimentally solved protein structures and low resolution structural models derived from protein sequences. Here we describe a new web server, BindUP, for predicting NABPs based on the electrostatic patches on the protein surfaces, employing the NAbind algorithm (20,23). The server was tested on a completely independent dataset of DNA and RNA binding protein structures, achieving an Area Under ROC Curve (AUC) value of 0.94, with 0.71 sensitivity and 0.96 specificity. Moreover, BindUP was successfully applied to non-homology-based structural models of novel RBPs. Further, we show that the positive electrostatic patches extracted by BindUP highly overlap with the NA-binding regions, suggesting that BindUP can also be employed for identifying binding interfaces.
The information on the largest positive patches, as well as the negative patches, is provided both graphically and in text. BindUP is significantly more efficient than PFPlus (32) in providing the electrostatic patch information, as all patches are pre-calculated and stored in a database. The server currently holds information for all 117 882 protein structures in the PDB (as of 18 April 2016). BindUP is applicable in either single or batch mode and can be applied for testing hundreds of proteins simultaneously in a highly efficient manner. BindUP is freely accessible via the website http://bindup.technion.ac.il/.

BindUP METHODOLOGY

The algorithm for predicting NABPs, given a 3D structure of a protein or a structural model, is based on our NAbind algorithm, originally developed and trained on high resolution structures of DBPs (20) and further on RBPs (23). The NAbind algorithm is based on the unique features of the proteins’ electrostatic surface patches, implemented in the PatchFinder web server (32). The PatchFinder algorithm (20) automatically assigns surface (positive and negative) patches by looking for adjacent points on the protein surface that meet a given electrostatic potential cut-off (2 or -2 kT/e, for positive and negative patches, respectively). The algorithm consists of several steps. In the first step, the electrostatic potential of the protein is calculated on a 3D grid, using the Poisson–Boltzmann equation. The electrostatic potential is calculated using the APBS software (35) with a grid spacing of 1 Å. Hydrogen atoms are added prior to the calculations using PDB2PQR (36). We then identify the grid points that fall on the protein surface, using the open-source DMS software (http://www.cgl.ucsf.edu/Overview/software.html#dms) to calculate the surface accessibility, based on the Lee and Richards algorithm (37), and ignore all non-surface points. We then extract continuous electrostatic patches on the protein surface by selecting all 3D patches of adjacent grid points which meet the defined cut-off. Finally, we select the largest electrostatic patches for each protein chain and assign the protein residues related to the positive and negative patches. As mentioned above, the NAbind algorithm employs the information from the largest positive and negative patches, extracted by PatchFinder, as well as other structural features of the protein. Among these features are the molecular weight, the overall surface accessibility and the dipole moment of the protein chain.
The patch features include the size of the patch (positive and negative), the overall potential and surface accessibility of the largest positive electrostatic patch as well as the overlap between the largest positive patch and the largest cleft on the protein surface (a detailed description of the features is found in (23)). To distinguish NABPs from non-NABPs, we use the GIST Support Vector Machine (SVM) classifier (http://www.chibi.ubc.ca/gist/), trained with a linear kernel function using the default parameters. All properties are fed into the input matrix as numeric features with no manipulations conducted on the matrix. The SVM was trained on a non-redundant set of 450 protein chains, including 90 DNA-binding, 60 RNA-binding and 300 non-NA-binding, extracted from the PDB database. The dataset was generated by selecting from the PDB all DNA and RNA binding protein chains and further removing redundancy by employing the BLASTClust program, which uses the BLAST local alignment algorithm for pairwise comparison and clustering (38). From each of the clusters (sharing <25% sequence identity between them) we then selected the representative protein structure with the best resolution. An equivalent dataset of proteins which do not bind nucleic acids was generated in a similar manner. Finally, the representative datasets of NA and non-NA binding proteins were manually curated, ensuring that the selected protein chains are the binding or non-binding chains, respectively. Notably, the uniqueness of the NAbind algorithm is that it does not rely on either sequence or structural homology and thus can be applied for predicting novel NABPs.
Nevertheless, it is important to note that NAbind was trained and tested on single NA-binding chains and thus it may fail in predicting NA-binding in cases where proteins bind the nucleic acid as large multimeric complexes and rely on the large electrostatic patch, induced by the complex formation, as for example in the case of the DnaQ-like 3'-5' exonuclease (39).
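
The patch-extraction step described above can be illustrated with a minimal sketch. This is not the PatchFinder implementation. It assumes that the surface grid points and their potentials are already available as a dictionary keyed by integer grid coordinates, that a point "meets" the cut-off when its potential is at least 2 kT/e (or at most -2 kT/e), and that two points are adjacent when they differ by at most one grid step along each axis; the adjacency rule and the function name `find_patches` are assumptions made for illustration only.

```python
from collections import deque
from itertools import product

# Offsets to the 26 neighbours of a grid point (assumed adjacency rule).
NEIGHBOR_OFFSETS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def find_patches(surface_potential, cutoff=2.0, positive=True):
    """Group surface grid points that pass the cut-off into connected patches.

    surface_potential: dict {(x, y, z): potential in kT/e}, surface points only.
    Returns a list of patches (sets of grid points), largest first.
    """
    if positive:
        selected = {p for p, v in surface_potential.items() if v >= cutoff}
    else:
        selected = {p for p, v in surface_potential.items() if v <= -cutoff}

    patches, seen = [], set()
    for start in selected:
        if start in seen:
            continue
        # Breadth-first search over adjacent selected points.
        patch, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBOR_OFFSETS:
                neighbor = (x + dx, y + dy, z + dz)
                if neighbor in selected and neighbor not in seen:
                    seen.add(neighbor)
                    patch.add(neighbor)
                    queue.append(neighbor)
        patches.append(patch)
    return sorted(patches, key=len, reverse=True)
```

The first element of the returned list corresponds to the "largest patch" in the sense used here, measured by the number of grid points; mapping grid points back to protein residues is a separate step that is not sketched.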

BindUP DESCRIPTION

Input

The BindUP server has two modes of usage, a single protein mode and a batch mode. In the single protein mode, BindUP predicts the NA-binding propensity for a given protein structure. The user can choose whether to calculate all the protein chains of the structure (each chain is calculated separately) or to select a specific chain identifier. The structure can be provided either as a PDB ID or as a user-defined coordinate file (of a known structure or a structural model) in PDB format. If the input is provided as a PDB ID, BindUP retrieves the results from a database of pre-calculated predictions. Other calculation options enable the user to control the type and number of electrostatic patches that will be displayed in the results. By default, BindUP displays the largest positive electrostatic patch. However, it is possible to choose whether to display only positive patches or negative patches (up to three patches together) or the combination of both (one positive and one negative patch). In the batch mode, BindUP gets a list of protein structures and calculates the requested electrostatic patches and the NA-binding prediction for each structure. The structures should be provided as PDB IDs only, pasted into the browser or uploaded as a text file. The number of entries is unlimited. The four-letter PDB ID may be followed by a chain identifier, to indicate that the calculation should be performed on the specific chain exclusively. Otherwise, the calculation will be performed on all the protein chains of the structure. The list may freely mix four-letter PDB ID entries (e.g. 1d66) with PDB IDs that include a chain identifier (e.g. 1d66A). In both modes it is optional to add an e-mail address to which the results will be automatically sent when the analysis is completed.
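
As an illustration of the batch input convention described above, the following sketch splits a pasted list into PDB IDs with optional chain identifiers. It is a hypothetical client-side helper, not part of the server: the function name `parse_batch_entries`, the assumption that entries are separated by whitespace, and the assumption that a chain identifier is a single alphanumeric character are all illustrative choices.

```python
import re

# Four-character PDB ID (a digit followed by three alphanumerics), optionally
# followed by one chain identifier character, e.g. "1d66" or "1d66A".
ENTRY_RE = re.compile(r"^([0-9][A-Za-z0-9]{3})([A-Za-z0-9])?$")


def parse_batch_entries(text):
    """Return a list of (pdb_id, chain) tuples; chain is None for 'all chains'."""
    entries = []
    for raw in text.split():  # assumed: whitespace-separated entries
        match = ENTRY_RE.match(raw)
        if match is None:
            raise ValueError(f"not a PDB ID or PDB ID + chain: {raw!r}")
        pdb_id, chain = match.groups()
        # PDB IDs are case-insensitive; chain identifiers are kept as given.
        entries.append((pdb_id.lower(), chain))
    return entries
```

A mixed list such as `"1d66\n1d66A"` yields one entry without a chain (all chains are calculated) and one entry restricted to chain A.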

Output

BindUP calculation is performed on each protein chain separately. The results, for each requested protein chain, include the NA-binding prediction and the requested electrostatic patches on the protein surface. In the single protein mode, BindUP results are provided both in a web-based presentation and in downloadable text files. In case the user has requested to calculate a specific chain or if the PDB file contains only one protein chain, BindUP presents the results for this chain exclusively (Figure 1A). In case BindUP calculates more than one protein chain, it initially presents the results for the first protein chain (Figure 1B). Using a drop-down menu, it allows the user to interactively switch between all the protein chains and display the results per each chain. The last option in the drop-down menu is ‘all’, which displays the results for all the protein chains together (Figure 1C and D). Both the graphic presentation and the downloadable text files change according to the chain selection. Notably, when selecting the ‘all’ option, the NA-prediction is not displayed, as results may differ between chains. In this case, the prediction appears in the results text file.

Figure 1.

Examples of BindUP results pages. (A) A presentation of the largest positive patch, calculated on a structural model of NOL10, constructed by I-TASSER. The model is predicted to be NA-binding. (B) A presentation of the largest negative and positive patches, calculated on chain A of PDB ID: 3S30, which is predicted to be non-NA-binding. (C) A presentation of the largest positive patches, calculated on eight protein chains of PDB ID: 1AOI, displayed together with the DNA chains. (D) A presentation of the three largest positive patches, calculated on the two protein chains of PDB ID: 1QRV, displayed together with the DNA chains.

The web-based presentation includes the NA-binding prediction for the selected chain and a visualization of the electrostatic patches, requested by the user, using Jmol, an open-source Java viewer for chemical structures in 3D (http://www.jmol.org/). In addition to the graphic presentation, BindUP provides two text files for download. The first file is a summary of the results. It contains the NA-binding prediction and the residues composing the electrostatic patches requested by the user. The second file is the coordinate PDB file, with the patch annotation inserted into the B-factor (temperature) column (the color-coding is described in the manual section of the website). These two downloadable files interactively change according to the user's choice. In the batch mode, the results for each protein structure are provided as downloadable text files only. The two text files are the same as described above and include the results for one chain or for all the protein chains, according to the input provided by the user. The first link in the results page refers to a summary text file, including the results for all the protein structures that have been submitted in the current job together.
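
Because the patch annotation is stored in the B-factor column of the downloadable coordinate file, it can be read back with a few lines of code. The sketch below groups residues by the value found in that column, using the fixed column positions of standard PDB ATOM records. The meaning of each value is defined in the server manual and is not assumed here; the function name `residues_by_bfactor` is hypothetical.

```python
from collections import defaultdict


def residues_by_bfactor(pdb_lines):
    """Group residues by the value stored in the B-factor column.

    Returns {value: sorted list of (chain, residue_number, residue_name)}.
    """
    groups = defaultdict(set)
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue  # skip HETATM, header and other records
        res_name = line[17:20].strip()   # columns 18-20
        chain = line[21]                 # column 22
        res_num = int(line[22:26])       # columns 23-26
        b_factor = float(line[60:66])    # columns 61-66
        groups[b_factor].add((chain, res_num, res_name))
    return {value: sorted(residues) for value, residues in groups.items()}
```

With a downloaded file, `residues_by_bfactor(open(path))` would return one residue list per annotation value, which can then be interpreted using the color-coding table in the manual.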

RESULTS AND DISCUSSION

In the last few years there has been a great advancement in experimental (in vivo and in vitro) technologies for the detection of RNA and DNA binding proteins (5,7,9–13,40). However, given the many new roles expected for these proteins, it is estimated that many other proteins are yet to be discovered. In previous studies we have developed the NAbind algorithm for predicting novel DNA and RNA binding proteins from the structure of the protein, without relying on either sequence or structural homology (20,23). Given the enormous expansion in the number of DBPs and RBPs in the PDB (including proteins bound in complex with the nucleic acid as well as proteins solved in the unbound state) and the advancement in methods for non-homology-based protein structure predictions (41), we have added to the algorithm an additional feature, enabling the prediction of NABPs given a structural model of the protein predicted from sequence. Furthermore, while our previous studies have considered DNA and RNA binding proteins separately (training the algorithms on each group of proteins independently), the current version of NAbind, implemented in our new web server BindUP, was trained on a mixed set of DBPs and RBPs and does not attempt to distinguish between the two types of NABPs. This is consistent with the growing knowledge of proteins that bind both DNA and RNA (42), as well as our previous work showing that DBPs and RBPs can be weakly distinguished when considering only double stranded DBPs versus single stranded RBPs (43) and recent studies showing that algorithms for predicting DNA and RNA binding sites are unable to distinguish between the binding sites of the different NABPs (15,17). We have tested BindUP on an independent set of 323 structures of DNA and RNA binding proteins (BindUP_NA323) and on a control set of an equal number of non NA-binding proteins extracted from the PDB. 
Overall, we achieve an AUC value of 0.94, with 0.71 sensitivity and 0.96 specificity (see Table 1 and detailed results in Supplementary Table S1). As expected, among the proteins we mispredicted are proteins that do not rely on a large continuous electrostatic patch to bind the nucleic acid, such as yeast aspartyl-tRNA synthetase (PDB ID: 1ASY), or proteins that bind nucleic acids as large multimeric complexes, such as Ebola virus matrix protein VP40 (PDB ID: 1H2C) that binds the RNA as an octamer. Notably, the proteins in our test set were completely independent from the proteins in the training set, sharing <25% sequence identity among themselves and with each of the proteins in the training set. To further ensure that BindUP does not rely on structural homology we used the CATH (44) structural classification to generate an additional test set of 230 structures of DNA and RNA binding proteins (BindUP_NA230_struct) and a control set of an equal number of non-NABPs, extracted from the PDB, which do not share structural homology (CATH ‘H-level’) with any protein in the training set. The results for the structurally non-redundant set were very similar to those achieved for BindUP_NA323, with an AUC value of 0.91, 0.70 sensitivity and 0.91 specificity (see Table 1 and detailed results in Supplementary Table S2). To further test BindUP on another independent set of NABP structures, we extracted from the literature a recently compiled dataset of 627 NABPs solved in complex with the nucleic acid (DNA or RNA), generated by Miao and Westhof (45). We further removed from this set proteins which overlapped with our training set and structures that are defined as obsolete in the RCSB PDB database, ending up with 535 NABP structures. As a control set we used an equal number of non-NABPs, extracted from the PDB.
In this case, BindUP achieved less accurate results with an AUC of 0.83 compared to 0.94 in our independent testing set, 0.6 sensitivity and 0.9 specificity (see Table 1 and detailed results in Supplementary Table S3). Given the fact that previous prediction algorithms considered DBPs and RBPs prediction separately, to compare our results with other NA-binding prediction servers, we tested BindUP on independent sets of DBPs (BindUP_D190) and RBPs (BindUP_R127). The two latter sets are subsets of BindUP_NA323, excluding six protein chains that are annotated as both DNA and RNA binding proteins. The results for the independent lists were AUC ROC values of 0.96 and 0.90 for DBPs and RBPs, respectively (Table 1, detailed results in Supplementary Tables S4 and S5). We further compared BindUP predictions to the only two available active servers on the www for predicting NA-binding function from the protein structure: SPOT-Struct-DNA (30) and SPOT-Struct-RNA (25), for DBPs and RBPs, respectively. As shown in Table 1, both programs achieved very similar results to BindUP, with slightly higher sensitivity values attained by BindUP. However, in comparison to SPOT-Struct-DNA and SPOT-Struct-RNA, BindUP runs on batch mode and thus can predict the NA-binding function for an unlimited number of structures in one session. Moreover, as all BindUP calculations are stored in a database, holding information for the entire set of protein structures in the PDB, it runs extremely fast, independent of the size of the proteins and the number of protein chains in the PDB structure. To our knowledge, BindUP is the only web server for prediction of NA-binding function from structure which runs efficiently in a batch mode.
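
For readers who want to reproduce this kind of evaluation on their own predictions, the sketch below computes sensitivity as TP/(TP + FN) and specificity as TN/(TN + FP), the definitions used for Table 1. The AUC function uses the standard rank-based formulation (the probability that a randomly chosen positive scores higher than a randomly chosen negative); it is a stand-in for illustration and not the Gist routine used by the authors. All function names are hypothetical.

```python
def sensitivity(tp, fn):
    """Fraction of true binders that are predicted as binders."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of true non-binders that are predicted as non-binders."""
    return tn / (tn + fp)


def roc_auc(scores_pos, scores_neg):
    """Rank-based AUC: P(score of a positive > score of a negative).

    Ties between a positive and a negative count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

The pairwise loop is quadratic in the number of examples, which is acceptable for test sets of a few hundred chains such as those used here.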

Table 1.

A summary of BindUP results tested on different datasets

Dataset	Algorithm	Sensitivity	Specificity	AUC
BindUP_NA323	BindUP	0.71	0.96	0.94
BindUP_NA230_struct	BindUP	0.70	0.91	0.91
BindUP_R127	BindUP	0.65	0.97	0.90
BindUP_R127	SPOT-Struct-RNA	0.63	0.99
BindUP_D190	BindUP	0.74	0.95	0.96
BindUP_D190	SPOT-Struct-DNA	0.57	1.00
RBscore_P627	BindUP	0.60	0.90	0.83

Sensitivity was calculated using the formula TP/(TP + FN). Specificity was calculated using the formula TN/(TN + FP). AUC was calculated using the Gist Support Vector Machine (SVM) classifier (http://www.chibi.ubc.ca/gist/). Results for SPOT-Struct-DNA and SPOT-Struct-RNA were obtained by running the independent datasets D190 and R127 on the respective web servers. RBscore_P627 was extracted from (15,45). The dataset was processed, removing protein structures that are defined as obsolete in the RCSB PDB database, as well as structures that overlap with the BindUP training set.

Clearly, the most important advantage of NAbind, implemented in BindUP, compared to other NA-binding prediction algorithms, is that it does not rely on homology and thus can help predict novel NABPs. While there is a relatively large number of ‘hypothetical proteins’ in the PDB resulting from structural genomics initiatives, the majority of protein structures in the PDB are still of known function. To examine the ability of BindUP to predict NA-binding function from sequence for novel proteins, we chose to test it on RBPs which were recently identified by the RNA interactome capture experiments (5,7,8), deliberately selecting the subset of proteins which do not possess known RNA or DNA binding domains and have no homologous structure in the PDB. To this end, we extracted a list of 131 protein orthologs, reported by Kwon et al. (8) to be common to the three RNA interactome capture experiments conducted in mammalian cells (5,7,8) and not previously annotated as RBPs. We further manually curated the list, removing proteins possessing domains annotated in PFAM (46) as related to NA-binding function as well as proteins that shared over 35% identity (>30 amino acid coverage) with any other protein in the PDB, ending up with a final list of 58 novel RBPs.
We further extracted all the domains from the proteins and used the I-TASSER non-homology-based protein structure modeler software (41), which we ran locally on our servers, to generate the structural model of the protein domains. The models of each of the domains were tested independently on BindUP. Overall, 86% of the RBPs in our test set had at least one domain predicted as NA-binding by BindUP (Supplementary Table S6). An example of a novel RBP, predicted correctly as NA-binding by BindUP, is Nol10. Nol10 is a WD-repeat nuclear protein with an unclear function, previously known to bind protein complexes in the nucleoli (47) and recently shown to bind RNA in the high throughput interactome capture experiments (5,7,8). We predicted the structure of its two domains using I-TASSER (41) and further submitted them to BindUP. Interestingly, while the Nol10 domains have no significant homology to other structures in the PDB and more so do not share similarity with any RNA binding domain, both domains were predicted by BindUP as NA-binding (Figure 1A). Overall, consistent with the fact that BindUP does not rely on homology, these results strongly support that BindUP can accurately identify novel NABPs. Moreover, while NAbind was originally developed for predicting NA-binding function from structure, we show that BindUP achieves highly accurate results given structural models of proteins that are predicted from sequence, using non-homology-based approaches. As aforementioned, the NAbind algorithm strongly depends on the features of the electrostatic patches on the protein surface, extracted by the PatchFinder algorithm, implemented in the PFPlus web server (32). In addition to predicting whether a protein binds nucleic acids, BindUP provides the information on the largest positive patches on the protein surface. 
We have previously shown that the largest positive patches on proteins, extracted by the PatchFinder algorithm, highly overlap with DNA and RNA binding interfaces (23,32). Moreover, we have shown that PatchFinder can also be applied to structural models, derived from a non-homology-based modeling method, showing high overlap with the known binding interfaces (48). Given the phenomenal growth in the number of protein-NA complexes in the PDB, we repeated the previous tests on the most up-to-date set in the literature of non-redundant protein chains which were solved in complex with nucleic acids (including protein–DNA and protein–RNA complexes) (45). We compared the largest positive patches, calculated by BindUP, to the known nucleic acid binding interface, extracted by the program Intervor, which employs the Voronoi interface model (49). This was done by calculating the overlap between the residues composing the one, two or three largest positive patches and the residues that are part of the known binding interface. The results are detailed in Supplementary Table S7. Overall, we show that the largest electrostatic patches highly overlap with the real NA-binding interface (extracted by Intervor). When considering the residues composing the largest positive patch only, the median sensitivity and specificity of the overlap are 0.65 and 0.86, respectively (Supplementary Table S7). When considering the residues in the three largest positive patches on the protein surface, as suggested in (23), the median sensitivity increases to 0.72 and accordingly the specificity reduces to 0.75. Taken together, BindUP is currently the most accurate and efficient web service for computational prediction of NA-binding function. BindUP is a non-homology-based predictor and can thus be applied for predicting novel NABPs given the protein structure or a structural model derived from sequence. In addition to providing the prediction of the protein function, i.e. whether it is an NA-binding protein or not, BindUP offers information on the largest continuous electrostatic patches on the protein surface. The latter have been shown to correspond with functional regions on the protein surface, specifically with the NA-binding interface in the case of the largest continuous positive patch.
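
The residue-level overlap between a patch and a known binding interface can be expressed with simple set operations, as in the hedged sketch below. The text does not state which residues serve as the pool of negatives, so the sketch assumes it is the full set of residues of the chain passed in as `all_residues`; this parameter and the function name `interface_overlap` are assumptions for illustration.

```python
def interface_overlap(patch_residues, interface_residues, all_residues):
    """Return (sensitivity, specificity) of a patch versus a known interface.

    Residues are any hashable identifiers, e.g. (chain, residue_number).
    all_residues is the assumed pool from which negatives are drawn.
    """
    patch = set(patch_residues)
    interface = set(interface_residues)
    universe = set(all_residues) | patch | interface

    tp = len(patch & interface)             # interface residues in the patch
    fn = len(interface - patch)             # interface residues missed
    fp = len(patch - interface)             # patch residues off the interface
    tn = len(universe - patch - interface)  # residues in neither set

    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec
```

To consider the two or three largest positive patches together, the union of their residue sets can be passed as `patch_residues`; per-chain values can then be summarized with `statistics.median` from the standard library.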

49 in total

1. The Protein Data Bank.

Authors: H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne
Journal: Nucleic Acids Res Date: 2000-01-01 Impact factor: 16.971

2. Analysis and prediction of DNA-binding proteins and their binding residues based on composition, sequence and structural information.

Authors: Shandar Ahmad; M Michael Gromiha; Akinori Sarai
Journal: Bioinformatics Date: 2004-01-22 Impact factor: 6.937

3. SVM based prediction of RNA-binding proteins using binding residues and evolutionary information.

Authors: Manish Kumar; M Michael Gromiha; Gajendra P S Raghava
Journal: J Mol Recognit Date: 2011 Mar-Apr Impact factor: 2.137

4. Compact, universal DNA microarrays to comprehensively determine transcription-factor binding site specificities.

Authors: Michael F Berger; Anthony A Philippakis; Aaron M Qureshi; Fangxue S He; Preston W Estep; Martha L Bulyk
Journal: Nat Biotechnol Date: 2006-09-24 Impact factor: 54.908

Review 5. A census of human RNA-binding proteins.

Authors: Stefanie Gerstberger; Markus Hafner; Thomas Tuschl
Journal: Nat Rev Genet Date: 2014-11-04 Impact factor: 53.242

6. Crystal structure of the protein from Arabidopsis thaliana gene At5g06450, a putative DnaQ-like exonuclease domain-containing protein with homohexameric assembly.

Authors: David W Smith; Mi Ra Han; Joon Sung Park; Kyung Rok Kim; Taeho Yeom; Ji Yeon Lee; Do Jin Kim; Craig A Bingman; Hyun-Jung Kim; Kyubong Jo; Byung Woo Han; George N Phillips
Journal: Proteins Date: 2013-06-17

7. SPOT-Seq-RNA: predicting protein-RNA complex structure and RNA-binding function by fold recognition and binding affinity prediction.

Authors: Yuedong Yang; Huiying Zhao; Jihua Wang; Yaoqi Zhou
Journal: Methods Mol Biol Date: 2014

8. Patch Finder Plus (PFplus): a web server for extracting and displaying positive electrostatic patches on protein surfaces.

Authors: Shula Shazman; Gershon Celniker; Omer Haber; Fabian Glaser; Yael Mandel-Gutfreund
Journal: Nucleic Acids Res Date: 2007-05-30 Impact factor: 16.971

9. Global analysis of yeast mRNPs.

Authors: Sarah F Mitchell; Saumya Jain; Meipei She; Roy Parker
Journal: Nat Struct Mol Biol Date: 2012-12-09 Impact factor: 15.369

10. DBD-Hunter: a knowledge-based method for the prediction of DNA-protein interactions.

Authors: Mu Gao; Jeffrey Skolnick
Journal: Nucleic Acids Res Date: 2008-05-31 Impact factor: 16.971
