RSARF: prediction of residue solvent accessibility from protein sequence using random forest method.

Ganesan Pugalenthi, Krishna Kumar Kandaswamy, Kuo-Chen Chou, Saravanan Vivekanandan, Prasanna Kolatkar.

Abstract

Predicting protein structure from amino acid sequence remains a challenging problem. A complete physicochemical understanding of protein folding is essential for accurate structure prediction. Knowledge of residue solvent accessibility provides useful insights for predicting protein structure and function. In this work, we propose a random forest method, RSARF, to predict residue accessible surface area from protein sequence information. Training and testing were performed using 120 proteins containing 22006 residues. For each residue, the buried or exposed state was computed using five thresholds (0%, 5%, 10%, 25%, and 50%). The prediction accuracies for the 0%, 5%, 10%, 25%, and 50% thresholds are 72.9%, 78.25%, 78.12%, 77.57%, and 72.07%, respectively. Further, a comparison of RSARF with other methods on a benchmark dataset containing 20 proteins shows that our approach is useful for predicting residue solvent accessibility from protein sequence without using structural information. The RSARF program, datasets and supplementary data are available at http://caps.ncbs.res.in/download/pugal/RSARF/.
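
The two-state labelling and accuracy figures described in the abstract can be illustrated with a minimal sketch. It is not the RSARF code: the abstract does not say how a residue sitting exactly at a threshold is treated, so the rule used here (exposed when relative solvent accessibility is strictly greater than the threshold) is an assumption, and the function names and example values are hypothetical.

```python
from typing import Dict, List

# The five thresholds named in the abstract, in percent.
THRESHOLDS = (0.0, 5.0, 10.0, 25.0, 50.0)


def label_states(rsa_percent: List[float], threshold: float) -> List[str]:
    """Label each residue 'E' (exposed) or 'B' (buried).

    Assumption: a residue is exposed when its relative solvent
    accessibility is strictly greater than the threshold.
    """
    return ["E" if rsa > threshold else "B" for rsa in rsa_percent]


def label_all_thresholds(rsa_percent: List[float]) -> Dict[float, List[str]]:
    """Apply every threshold to the same list of residues."""
    return {t: label_states(rsa_percent, t) for t in THRESHOLDS}


def accuracy(predicted: List[str], observed: List[str]) -> float:
    """Percentage of residues whose predicted state matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("at least one residue is required")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Hypothetical relative solvent accessibility values (percent), one per residue.
example_rsa = [0.0, 3.2, 7.5, 18.0, 42.0, 66.0]
states = label_all_thresholds(example_rsa)
```

Raising the threshold moves residues from the exposed class to the buried class, so each threshold defines a separate two-class problem; this is why the abstract reports one accuracy per threshold.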

Year: 2012; PMID: 21919860; DOI: 10.2174/092986612798472875

Source DB: PubMed; Journal: Protein Pept Lett; ISSN: 0929-8665; Impact factor: 1.890


