Armadillo: domain boundary prediction by amino acid composition.

Michel Dumontier, Rong Yao, Howard J Feldman, Christopher W V Hogue.

Abstract

The identification and annotation of protein domains is a critical step in the accurate determination of molecular function. Both computational and experimental methods of protein structure determination may be hampered by large multi-domain proteins or flexible linker regions. Knowledge of domains and their boundaries may reduce the experimental cost of protein structure determination by allowing researchers to work on a set of smaller and possibly more tractable targets. Current domain prediction methods often rely on sequence similarity to conserved domains and are therefore ill suited to detecting domain structure in poorly conserved or orphan proteins. We present here a simple computational method to identify protein domain linkers and their boundaries from sequence information alone. Our domain predictor, Armadillo (http://armadillo.blueprint.org), uses any amino acid index to convert a protein sequence to a smoothed numeric profile from which domains and domain boundaries may be predicted. We derived an amino acid index called the domain linker propensity index (DLI) from the amino acid composition of domain linkers using a non-redundant structure dataset. The index indicates that Pro and Gly are favoured in linkers, whereas small hydrophobic residues are not. Armadillo predicts domain linker boundaries from Z-score distributions and obtains 35% sensitivity with DLI in a two-domain, single-linker dataset (a prediction counts as correct if it falls within ±20 residues of the linker). The combination of DLI and an entropy-based amino acid index increases the overall Armadillo sensitivity to 56% for two-domain proteins. Moreover, Armadillo achieves 37% sensitivity for multi-domain proteins, surpassing most other prediction methods. Armadillo provides a simple but effective method by which domain boundaries can be predicted with reasonable sensitivity.
Armadillo should prove to be a valuable tool for rapidly delineating protein domains in poorly conserved proteins or those with no sequence neighbors. Used as a first-line predictor, Armadillo could also supply predictions that improve the results of domain meta-predictors.
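
The pipeline outlined in the abstract (amino acid index, smoothed profile, Z-scores, boundary call, sensitivity within ±20 residues) might be sketched roughly as follows. This is an illustration under stated assumptions, not Armadillo's implementation. The abstract does not give the DLI values, the smoothing method, the window size, or the Z-score cutoff. `TOY_INDEX` is therefore a made-up placeholder that encodes only the qualitative statement that Pro and Gly favour linkers. The centered moving average, `window=5` and `z_cut=1.5` are arbitrary choices, and all function names are hypothetical.

```python
from statistics import mean, pstdev

# Placeholder index, NOT the published DLI: it only mirrors the abstract's
# qualitative claim that Pro and Gly are favoured in linkers.
TOY_INDEX = {"P": 1.0, "G": 1.0}
DEFAULT_SCORE = 0.0  # assumed score for residues missing from the toy index


def smoothed_profile(seq, index, window=5):
    """Map each residue to its index value, then apply a centered moving average."""
    raw = [index.get(aa, DEFAULT_SCORE) for aa in seq]
    half = window // 2
    profile = []
    for i in range(len(raw)):
        lo, hi = max(0, i - half), min(len(raw), i + half + 1)
        profile.append(mean(raw[lo:hi]))  # window is truncated at the termini
    return profile


def z_scores(profile):
    """Standardize the profile against its own mean and standard deviation."""
    mu, sigma = mean(profile), pstdev(profile)
    if sigma == 0:
        return [0.0] * len(profile)
    return [(x - mu) / sigma for x in profile]


def predict_linker_positions(seq, index=TOY_INDEX, window=5, z_cut=1.5):
    """Return 0-based positions whose smoothed Z-score reaches the assumed cutoff."""
    z = z_scores(smoothed_profile(seq, index, window))
    return [i for i, value in enumerate(z) if value >= z_cut]


def sensitivity(predicted, true_linkers, tolerance=20):
    """Fraction of true linker positions with a prediction within +/- tolerance."""
    if not true_linkers:
        return 0.0
    hits = sum(
        any(abs(p - t) <= tolerance for p in predicted) for t in true_linkers
    )
    return hits / len(true_linkers)
```

For a synthetic sequence such as `"A" * 30 + "PGPGPG" + "A" * 30`, `predict_linker_positions` flags positions inside the Pro/Gly stretch, and `sensitivity` counts a true linker position as recovered when any flagged position lies within 20 residues of it. Because the index is passed in as a plain dictionary, a second index (for example an entropy-based one) could be substituted or combined without changing the rest of the sketch.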

Year: 2005  PMID: 15978619  DOI: 10.1016/j.jmb.2005.05.037

Source DB: PubMed  Journal: J Mol Biol  ISSN: 0022-2836  Impact factor: 5.469

