Samad Jahandideh, Vinodh Srinivasasainagendra, Degui Zhi.
Abstract
RNA-protein interactions play an important role in various cellular processes, such as protein synthesis, gene regulation, post-transcriptional gene regulation, alternative splicing, and infection by RNA viruses. In this study, we used the Gene Ontology Annotation (GOA) and Structural Classification of Proteins (SCOP) databases to design an automatic procedure that captures structurally solved RNA-binding protein domains in different subclasses. We then applied tuned multi-class SVM (TMCSVM), Random Forest (RF), and multi-class ℓ1/ℓq-regularized logistic regression (MCRLR) to analyze and classify RNA-binding protein domains based on a comprehensive set of sequence and structural features, and compared the prediction accuracy of these three state-of-the-art methods. In our results, TMCSVM outperformed the other methods, suggesting its potential as a useful tool for multi-class prediction of RNA-binding protein domains. MCRLR, on the other hand, elucidates how much individual features contribute to the predictive accuracy for RNA-binding protein domain subclasses, which provides some biological insight into the roles of sequence and structure in protein-RNA interactions.
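The abstract does not spell out the ℓ1/ℓq penalty, so the following is only a minimal sketch under the common definition of a mixed norm: each feature has one weight per class, the ℓq norm is taken across classes for each feature, and those norms are summed. Under that assumption, a feature whose weights are all driven to zero drops out of every class at once, which is one way such a model can expose feature importance. The weight matrix, the feature names, and the helper functions below are hypothetical and are not taken from the paper.

```python
import math


def row_norm(weights, q=2.0):
    """l_q norm of one feature's weights across all classes."""
    if math.isinf(q):
        return max(abs(w) for w in weights)
    return sum(abs(w) ** q for w in weights) ** (1.0 / q)


def l1_lq_penalty(W, q=2.0):
    """Mixed l1/lq norm: sum over features (rows) of the lq norm across classes."""
    return sum(row_norm(row, q) for row in W)


def rank_features(W, names, q=2.0):
    """Pair each feature with its row norm and sort, largest norm first."""
    scored = [(name, row_norm(row, q)) for name, row in zip(names, W)]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical weight matrix: 3 features x 2 classes (illustrative values only).
W = [
    [3.0, 4.0],  # feature used by both classes
    [0.0, 0.0],  # feature zeroed out for every class
    [0.0, 1.0],  # feature used by one class only
]
names = ["feat_a", "feat_b", "feat_c"]  # hypothetical feature names

penalty = l1_lq_penalty(W, q=2.0)
ranking = rank_features(W, names, q=2.0)
```

In this toy setting, ranking features by row norm stands in for the importance analysis the abstract attributes to MCRLR; the authors' actual procedure may differ.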
Keywords: Multi-class ℓ(1)/ℓ(q)-regularized logistic regression; Prediction; RNA-binding domain; Random Forest; Tuned multi-class SVM
Year: 2012 PMID: 22884576 PMCID: PMC3867591 DOI: 10.1016/j.jtbi.2012.07.013
Source DB: PubMed Journal: J Theor Biol ISSN: 0022-5193 Impact factor: 2.691