
Picasso: generating a covering set of protein family profiles.

A Heger, L Holm.

Abstract

MOTIVATION: Evolutionary classification leads to an economical description of protein sequence data because attributes of function and structure are inherited in protein families. This paper presents Picasso, a procedure for deriving a minimal set of protein family profiles that cover all known protein sequences.
RESULTS: Picasso starts from highly overlapping sequence neighbourhoods revealed by all-on-all pairwise Blast alignment. Overlaps are reduced by merging sequences or parts of sequences into multiple alignments. For maximum unification, the multiple alignments must reach into the twilight zone of sequence similarity. Sensitive and selective profile-profile comparison allows unification down to about 15% pairwise sequence identity. Families unified through a short conserved sequence motif are associated with multiple full-length alignments describing different subfamilies. Domains that are mobile modules are identified based on their association with different sets of neighbours. The result is 10000 unified domain families (excluding singletons) representing functionally related proteins and recovering classical prolific domain types in high numbers. The classification is useful, for example, in developing strategies for efficient database searching and for selecting targets to complete the map of all 3-D structures.
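The covering-set idea can be illustrated with a small sketch. This is not Picasso's procedure, which merges sequences into multiple alignments and unifies them by profile-profile comparison; it is a plain greedy set cover over hypothetical Blast neighbourhoods, and the function name, sequence identifiers, and neighbourhood contents are all invented for illustration.

```python
def greedy_cover(neighbourhoods):
    """Greedily choose neighbourhoods until every sequence is covered.

    `neighbourhoods` maps a query sequence id to the set of sequence ids
    it aligns with (assumed here to include the query itself).
    """
    uncovered = set().union(*neighbourhoods.values())
    chosen = []
    while uncovered:
        # Take the neighbourhood that covers the most still-uncovered
        # sequences; iterating in sorted order makes ties deterministic.
        best = max(sorted(neighbourhoods),
                   key=lambda q: len(neighbourhoods[q] & uncovered))
        chosen.append(best)
        uncovered -= neighbourhoods[best]
    return chosen


# Hypothetical, highly overlapping neighbourhoods (made-up data).
neighbourhoods = {
    "seqA": {"seqA", "seqB", "seqC"},
    "seqB": {"seqA", "seqB"},
    "seqC": {"seqA", "seqC", "seqD"},
    "seqD": {"seqC", "seqD"},
    "seqE": {"seqE"},
}
cover = greedy_cover(neighbourhoods)
print(cover)
```

Five overlapping neighbourhoods reduce to a smaller set that still covers every sequence. A greedy cover is generally not guaranteed to be minimal, and it ignores the partial-sequence (domain) merging the abstract describes.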


Year:  2001        PMID: 11294792     DOI: 10.1093/bioinformatics/17.3.272

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  19 in total

1.  ProtoNet: hierarchical classification of the protein space.

Authors:  Ori Sasson; Avishay Vaaknin; Hillel Fleischer; Elon Portugaly; Yonatan Bilu; Nathan Linial; Michal Linial
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

2.  Detecting distant homology with Meta-BASIC.

Authors:  Krzysztof Ginalski; Marcin von Grotthuss; Nick V Grishin; Leszek Rychlewski
Journal:  Nucleic Acids Res       Date:  2004-07-01       Impact factor: 16.971

3.  An assessment of substitution scores for protein profile-profile comparison.

Authors:  Xugang Ye; Guoli Wang; Stephen F Altschul
Journal:  Bioinformatics       Date:  2011-10-13       Impact factor: 6.937

4.  Structural similarity to bridge sequence space: finding new families on the bridges.

Authors:  Parantu K Shah; Patrick Aloy; Peer Bork; Robert B Russell
Journal:  Protein Sci       Date:  2005-05       Impact factor: 6.725

5.  A limited universe of membrane protein families and folds.

Authors:  Amit Oberai; Yungok Ihm; Sanguk Kim; James U Bowie
Journal:  Protein Sci       Date:  2006-07       Impact factor: 6.725

6.  LIBRUS: combined machine learning and homology information for sequence-based ligand-binding residue prediction.

Authors:  Chris Kauffman; George Karypis
Journal:  Bioinformatics       Date:  2009-09-28       Impact factor: 6.937

7.  Genome-wide comparative gene family classification.

Authors:  Christian Frech; Nansheng Chen
Journal:  PLoS One       Date:  2010-10-15       Impact factor: 3.240

8.  Refining homology models by combining replica-exchange molecular dynamics and statistical potentials.

Authors:  Jiang Zhu; Hao Fan; Xavier Periole; Barry Honig; Alan E Mark
Journal:  Proteins       Date:  2008-09

9.  CORAL: aligning conserved core regions across domain families.

Authors:  Jessica H Fong; Aron Marchler-Bauer
Journal:  Bioinformatics       Date:  2009-05-26       Impact factor: 6.937

10.  Proteomic properties reveal phyloecological clusters of Archaea.

Authors:  Nela Nikolic; Zlatko Smole; Anita Krisko
Journal:  PLoS One       Date:  2012-10-25       Impact factor: 3.240
