DROP: an SVM domain linker predictor trained with optimal features selected by random forest.

Teppei Ebina, Hiroyuki Toh, Yutaka Kuroda.

Abstract

MOTIVATION: Biologically important proteins are often large, multidomain proteins, which are difficult to characterize by high-throughput experimental methods. Efficient domain/boundary predictions are thus increasingly required in diverse areas of proteomics research for computationally dissecting proteins into readily analyzable domains.
RESULTS: We constructed a support vector machine (SVM)-based domain linker predictor, DROP (Domain linker pRediction using OPtimal features), which was trained with 25 optimal features. The optimal combination of features was identified from a set of 3000 features using a random forest algorithm complemented with stepwise feature selection. DROP demonstrated a prediction sensitivity and precision of 41.3% and 49.4%, respectively. These values were over 19.9% higher than those of control SVM predictors trained with non-optimized features, strongly suggesting the effectiveness of our feature selection method. In addition, the mean NDO-Score of DROP for predicting novel domains in seven CASP8 FM multidomain proteins was 0.760, which was higher than that of any of the 12 published CASP8 DP servers. Overall, these results indicate that the SVM prediction of domain linkers can be improved by identifying optimal features that best distinguish linker from non-linker regions.
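
The sensitivity and precision quoted above follow the usual definitions, TP / (TP + FN) and TP / (TP + FP). The sketch below shows that arithmetic on toy labels only. It assumes a per-position binary labelling (1 = linker, 0 = non-linker); the abstract does not state the unit of evaluation the authors used, and the function name and the data are hypothetical, not taken from DROP.

```python
def sensitivity_and_precision(true_labels, predicted_labels):
    """Return (sensitivity, precision) for binary labels, where 1 marks a linker.

    Assumption: one label per evaluated position; this is an illustrative
    convention, not necessarily the evaluation protocol used for DROP.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # linker, predicted linker
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # linker, missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-linker, predicted linker

    # Return 0.0 when a denominator is empty instead of dividing by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, precision


# Toy data (hypothetical), not results from the paper.
true_labels = [0, 0, 1, 1, 1, 0, 0, 1, 0, 0]
predicted_labels = [0, 1, 1, 1, 0, 0, 0, 0, 0, 1]

sens, prec = sensitivity_and_precision(true_labels, predicted_labels)
print(f"sensitivity={sens:.3f} precision={prec:.3f}")
```

Under these definitions, a sensitivity of 41.3% means that about four in ten true linkers were recovered, and a precision of 49.4% means that about half of the predicted linkers were correct.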

Year:  2010        PMID: 21169376     DOI: 10.1093/bioinformatics/btq700

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  27 in total

1.  iRSpot-GAEnsC: identifing recombination spots via ensemble classifier and extending the concept of Chou's PseAAC to formulate DNA samples.

Authors:  Muhammad Kabir; Maqsood Hayat
Journal:  Mol Genet Genomics       Date:  2015-08-30       Impact factor: 3.291

2.  IS-Dom: a dataset of independent structural domains automatically delineated from protein structures.

Authors:  Teppei Ebina; Yuki Umezawa; Yutaka Kuroda
Journal:  J Comput Aided Mol Des       Date:  2013-05-29       Impact factor: 3.686

3.  Fast H-DROP: A thirty times accelerated version of H-DROP for interactive SVM-based prediction of helical domain linkers.

Authors:  Tambi Richa; Soichiro Ide; Ryosuke Suzuki; Teppei Ebina; Yutaka Kuroda
Journal:  J Comput Aided Mol Des       Date:  2016-12-27       Impact factor: 3.686

4.  H-DROP: an SVM based helical domain linker predictor trained with features optimized by combining random forest and stepwise selection.

Authors:  Teppei Ebina; Ryosuke Suzuki; Ryotaro Tsuji; Yutaka Kuroda
Journal:  J Comput Aided Mol Des       Date:  2014-06-26       Impact factor: 3.686

5.  ThreaDomEx: a unified platform for predicting continuous and discontinuous protein domains by multiple-threading and segment assembly.

Authors:  Yan Wang; Jian Wang; Ruiming Li; Qiang Shi; Zhidong Xue; Yang Zhang
Journal:  Nucleic Acids Res       Date:  2017-07-03       Impact factor: 16.971

6.  DPClass: An Effective but Concise Discriminative Patterns-Based Classification Framework.

Authors:  Jingbo Shang; Wenzhu Tong; Jian Peng; Jiawei Han
Journal:  Proc SIAM Int Conf Data Min       Date:  2016-05

7.  DeepDom: Predicting protein domain boundary from sequence alone using stacked bidirectional LSTM.

Authors:  Yuexu Jiang; Duolin Wang; Dong Xu
Journal:  Pac Symp Biocomput       Date:  2019

8.  An integrative computational framework based on a two-step random forest algorithm improves prediction of zinc-binding sites in proteins.

Authors:  Cheng Zheng; Mingjun Wang; Kazuhiro Takemoto; Tatsuya Akutsu; Ziding Zhang; Jiangning Song
Journal:  PLoS One       Date:  2012-11-14       Impact factor: 3.240

9.  FunSAV: predicting the functional effect of single amino acid variants using a two-stage random forest model.

Authors:  Mingjun Wang; Xing-Ming Zhao; Kazuhiro Takemoto; Haisong Xu; Yuan Li; Tatsuya Akutsu; Jiangning Song
Journal:  PLoS One       Date:  2012-08-24       Impact factor: 3.240

10.  PROSPER: an integrated feature-based tool for predicting protease substrate cleavage sites.

Authors:  Jiangning Song; Hao Tan; Andrew J Perry; Tatsuya Akutsu; Geoffrey I Webb; James C Whisstock; Robert N Pike
Journal:  PLoS One       Date:  2012-11-29       Impact factor: 3.240
