Comprehensive comparative analysis and identification of RNA-binding protein domains: multi-class classification and feature selection.

Samad Jahandideh, Vinodh Srinivasasainagendra, Degui Zhi.

Abstract

RNA-protein interactions play an important role in various cellular processes, such as protein synthesis, gene regulation (including post-transcriptional regulation), alternative splicing, and infection by RNA viruses. In this study, we designed an automatic procedure that uses the Gene Ontology Annotation (GOA) and Structural Classification of Proteins (SCOP) databases to capture structurally solved RNA-binding protein domains in different subclasses. We then applied tuned multi-class SVM (TMCSVM), Random Forest (RF), and multi-class ℓ1/ℓq-regularized logistic regression (MCRLR) to analyze and classify RNA-binding protein domains based on a comprehensive set of sequence and structural features, and compared the prediction accuracy of these three methods. In our results, TMCSVM outperformed the other two methods, which suggests that it could be a useful tool for multi-class prediction of RNA-binding protein domains. MCRLR, in turn, reveals how much each feature contributes to the predictive accuracy for RNA-binding protein domain subclasses, which provides some biological insight into the roles of sequence and structure in protein-RNA interactions.
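The feature-selection role that the abstract assigns to MCRLR can be illustrated with a small sketch. This is not the authors' implementation: it assumes q = 2 (a group-lasso penalty over the per-class weights of each feature), uses proximal gradient descent as the solver, and runs on synthetic data with two informative features and one noise feature. All function names, the toy data, and the values of `lam`, `lr`, and `n_iter` are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_toy_data(n_per_class=30, seed=0):
    """Synthetic stand-in for domain features (assumption, not real data).

    Features 0 and 1 separate three hypothetical subclasses;
    feature 2 is pure noise.
    """
    rng = random.Random(seed)
    centroids = [(2.0, 0.0), (0.0, 2.0), (-2.0, -2.0)]
    X, y = [], []
    for label, (cx, cy) in enumerate(centroids):
        for _ in range(n_per_class):
            X.append([cx + rng.gauss(0, 0.5),
                      cy + rng.gauss(0, 0.5),
                      rng.gauss(0, 1.0)])
            y.append(label)
    return X, y


def fit_group_sparse_logreg(X, y, n_classes, lam=0.1, lr=0.1, n_iter=500):
    """Multi-class logistic regression with an l1/l2 penalty on feature rows.

    W[j][k] is the weight of feature j for class k. Each iteration takes a
    gradient step on the mean softmax loss, then shrinks every feature row
    toward zero (the proximal operator of lam * sum_j ||W[j]||_2).
    """
    n, d = len(X), len(X[0])
    W = [[0.0] * n_classes for _ in range(d)]
    for _ in range(n_iter):
        grad = [[0.0] * n_classes for _ in range(d)]
        for xi, yi in zip(X, y):
            scores = [sum(xi[j] * W[j][k] for j in range(d))
                      for k in range(n_classes)]
            p = softmax(scores)
            for k in range(n_classes):
                residual = p[k] - (1.0 if k == yi else 0.0)
                for j in range(d):
                    grad[j][k] += residual * xi[j] / n
        for j in range(d):
            row = [W[j][k] - lr * grad[j][k] for k in range(n_classes)]
            norm = math.sqrt(sum(v * v for v in row))
            shrink = max(0.0, 1.0 - lr * lam / norm) if norm > 0 else 0.0
            W[j] = [shrink * v for v in row]
    return W


def feature_importance(W):
    """Importance of each feature = l2 norm of its weights across classes."""
    return [math.sqrt(sum(v * v for v in row)) for row in W]


def predict(W, x):
    """Return the class with the highest linear score."""
    n_classes = len(W[0])
    scores = [sum(x[j] * W[j][k] for j in range(len(x)))
              for k in range(n_classes)]
    return scores.index(max(scores))


if __name__ == "__main__":
    X, y = make_toy_data()
    W = fit_group_sparse_logreg(X, y, n_classes=3)
    print("feature importance:", feature_importance(W))
    accuracy = sum(predict(W, x) == t for x, t in zip(X, y)) / len(y)
    print("training accuracy:", accuracy)
```

Because the penalty acts on a whole feature row at once, a feature that helps no subclass is pushed toward zero for all classes together, which is what makes the row norms usable as a rough importance ranking.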

Keywords:  Multi-class ℓ1/ℓq-regularized logistic regression; Prediction; RNA-binding domain; Random Forest; Tuned multi-class SVM

Year:  2012        PMID: 22884576      PMCID: PMC3867591          DOI: 10.1016/j.jtbi.2012.07.013

Source DB:  PubMed          Journal:  J Theor Biol        ISSN: 0022-5193            Impact factor:   2.691


  4 in total

1.  iSulf-Cys: Prediction of S-sulfenylation Sites in Proteins with Physicochemical Properties of Amino Acids.

Authors:  Yan Xu; Jun Ding; Ling-Yun Wu
Journal:  PLoS One       Date:  2016-04-22       Impact factor: 3.240

2.  RStrucFam: a web server to associate structure and cognate RNA for RNA-binding proteins from sequence information.

Authors:  Pritha Ghosh; Oommen K Mathew; Ramanathan Sowdhamini
Journal:  BMC Bioinformatics       Date:  2016-10-07       Impact factor: 3.169

3.  iSNO-PseAAC: predict cysteine S-nitrosylation sites in proteins by incorporating position specific amino acid propensity into pseudo amino acid composition.

Authors:  Yan Xu; Jun Ding; Ling-Yun Wu; Kuo-Chen Chou
Journal:  PLoS One       Date:  2013-02-07       Impact factor: 3.240

4.  Predictions of Protein-Protein Interfaces within Membrane Protein Complexes.

Authors:  Ebrahim Barzegari Asadabadi; Parviz Abdolmaleki
Journal:  Avicenna J Med Biotechnol       Date:  2013-07