The predictive power of the CluSTr database.

Robert Petryszak, Ernst Kretschmann, Daniela Wieser, Rolf Apweiler.

Abstract

SUMMARY: The CluSTr database employs a fully automatic single-linkage hierarchical clustering method based on a similarity matrix. To build the matrix, all-against-all pairwise comparisons between protein sequences are first computed using the Smith-Waterman algorithm. The statistical significance of the similarity scores is then assessed using a Monte Carlo analysis, yielding Z-values, which are used to populate the matrix. This paper describes automated annotation experiments that quantify the predictive power, and hence the biological relevance, of the CluSTr data. The experiments used the UniProt data-mining framework to derive annotation predictions from combinations of InterPro and CluSTr data. We show that this combination of data sources greatly increases the precision of predictions made by the data-mining framework, compared with the use of InterPro data alone. We conclude that the CluSTr approach to clustering proteins makes a valuable contribution to traditional protein classifications. AVAILABILITY: http://www.ebi.ac.uk/clustr/.
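
The pipeline summarized in the abstract (Smith-Waterman scores, Monte Carlo Z-values, single-linkage clustering) can be sketched roughly as follows. This is a minimal illustration, not the CluSTr implementation: the match/mismatch/linear-gap scoring, the shuffling scheme, the number of shuffles, and the function names `sw_score`, `z_value` and `single_linkage` are all assumptions made for the example. A production system would most likely use a substitution matrix and affine gap penalties.

```python
import random
from statistics import mean, pstdev


def sw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score (simplified: linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best


def z_value(a, b, shuffles=100, rng=None):
    """Monte Carlo Z-value: compare the observed score with scores obtained
    after shuffling b (assumed null model: composition-preserving shuffles)."""
    rng = rng or random.Random(0)
    observed = sw_score(a, b)
    letters = list(b)
    null_scores = []
    for _ in range(shuffles):
        rng.shuffle(letters)
        null_scores.append(sw_score(a, "".join(letters)))
    sd = pstdev(null_scores)
    return (observed - mean(null_scores)) / sd if sd > 0 else 0.0


def single_linkage(names, z, threshold):
    """Single-linkage clusters at one Z threshold: the connected components
    of the graph whose edges are the pairs with Z >= threshold."""
    parent = {n: n for n in names}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), score in z.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), set()).add(n)
    return sorted(sorted(c) for c in clusters.values())


if __name__ == "__main__":
    # Toy sequences, invented for the example.
    seqs = {
        "P1": "MKTAYIAKQRQISFVKSHFSRQ",
        "P2": "MKTAYIAKQRQISFVKSHFTRQ",
        "P3": "GGSWLLPEDNNCCHHWWYYPPA",
    }
    names = sorted(seqs)
    z = {
        (a, b): z_value(seqs[a], seqs[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
    for threshold in (3, 8, 20):
        print(threshold, single_linkage(names, z, threshold))
```

Running the clustering at a series of decreasing thresholds yields nested clusters, which is what makes the single-linkage result hierarchical: any two sequences joined at a high Z threshold remain together at every lower one.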

Year:  2005        PMID: 15961444     DOI: 10.1093/bioinformatics/bti542

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  20 in total

Review 1.  In silico characterization of proteins: UniProt, InterPro and Integr8.

Authors:  Nicola Jane Mulder; Paul Kersey; Manuela Pruess; Rolf Apweiler
Journal:  Mol Biotechnol       Date:  2007-10-04       Impact factor: 2.695

2.  GeMMA: functional subfamily classification within superfamilies of predicted protein structural domains.

Authors:  David A Lee; Robert Rentzsch; Christine Orengo
Journal:  Nucleic Acids Res       Date:  2009-11-18       Impact factor: 16.971

3.  Ultra-fast sequence clustering from similarity networks with SiLiX.

Authors:  Vincent Miele; Simon Penel; Laurent Duret
Journal:  BMC Bioinformatics       Date:  2011-04-22       Impact factor: 3.307

Review 4.  Ortholog identification in the presence of domain architecture rearrangement.

Authors:  Kimmen Sjölander; Ruchira S Datta; Yaoqing Shen; Grant M Shoffner
Journal:  Brief Bioinform       Date:  2011-06-28       Impact factor: 11.622

Review 5.  Protein function annotation by homology-based inference.

Authors:  Yaniv Loewenstein; Domenico Raimondo; Oliver C Redfern; James Watson; Dmitrij Frishman; Michal Linial; Christine Orengo; Janet Thornton; Anna Tramontano
Journal:  Genome Biol       Date:  2009-02-02       Impact factor: 13.583

6.  Large-scale collection and annotation of full-length enriched cDNAs from a model halophyte, Thellungiella halophila.

Authors:  Teruaki Taji; Tetsuya Sakurai; Keiichi Mochida; Atsushi Ishiwata; Atsushi Kurotani; Yasushi Totoki; Atsushi Toyoda; Yoshiyuki Sakaki; Motoaki Seki; Hirokazu Ono; Yoichi Sakata; Shigeo Tanaka; Kazuo Shinozaki
Journal:  BMC Plant Biol       Date:  2008-11-12       Impact factor: 4.215

7.  Efficient algorithms for accurate hierarchical clustering of huge datasets: tackling the entire protein space.

Authors:  Yaniv Loewenstein; Elon Portugaly; Menachem Fromer; Michal Linial
Journal:  Bioinformatics       Date:  2008-07-01       Impact factor: 6.937

8.  GFam: a platform for automatic annotation of gene families.

Authors:  Rajkumar Sasidharan; Tamás Nepusz; David Swarbreck; Eva Huala; Alberto Paccanaro
Journal:  Nucleic Acids Res       Date:  2012-07-11       Impact factor: 16.971

9.  PANTHER in 2013: modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees.

Authors:  Huaiyu Mi; Anushya Muruganujan; Paul D Thomas
Journal:  Nucleic Acids Res       Date:  2012-11-27       Impact factor: 16.971

10.  How to inherit statistically validated annotation within BAR+ protein clusters.

Authors:  Damiano Piovesan; Pier Luigi Martelli; Piero Fariselli; Giuseppe Profiti; Andrea Zauli; Ivan Rossi; Rita Casadio
Journal:  BMC Bioinformatics       Date:  2013-02-28       Impact factor: 3.169
