Picasso: generating a covering set of protein family profiles.

Abstract

MOTIVATION: Evolutionary classification leads to an economical description of protein sequence data because attributes of function and structure are inherited in protein families. This paper presents Picasso, a procedure for deriving a minimal set of protein family profiles that cover all known protein sequences.
RESULTS: Picasso starts from highly overlapping sequence neighbourhoods revealed by all-on-all pairwise Blast alignment. Overlaps are reduced by merging sequences or parts of sequences into multiple alignments. For maximum unification, the multiple alignments must reach into the twilight zone of sequence similarity. Sensitive and selective profile-profile comparison allows unification down to about 15% pairwise sequence identity. Families unified through a short conserved sequence motif are associated with multiple full-length alignments describing different subfamilies. Domains that are mobile modules are identified based on their association with different sets of neighbours. The result is 10000 unified domain families (excluding singletons) representing functionally related proteins and recovering classical prolific domain types in high numbers. The classification is useful, for example, in developing strategies for efficient database searching and for selecting targets to complete the map of all 3-D structures.
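
The idea of a small set of families that together cover all sequences can be illustrated with a greedy set-cover heuristic. This is only a minimal sketch on toy data, not the Picasso procedure itself: the abstract describes merging neighbourhoods into multiple alignments and profile-profile comparison, neither of which is modelled here. The function name `greedy_cover`, the sequence identifiers, and the neighbourhood contents are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_cover(neighbourhoods: Dict[str, Set[str]]) -> List[str]:
    """Greedily pick neighbourhoods until every sequence is covered.

    Illustrative heuristic only (an assumption, not the published method).
    """
    # Every sequence that appears in at least one neighbourhood must be covered.
    uncovered = set().union(*neighbourhoods.values())
    chosen: List[str] = []
    while uncovered:
        # Take the neighbourhood covering the most still-uncovered sequences;
        # ties go to the first one in dictionary order.
        best = max(neighbourhoods, key=lambda k: len(neighbourhoods[k] & uncovered))
        chosen.append(best)
        uncovered -= neighbourhoods[best]
    return chosen


# Hypothetical, highly overlapping neighbourhoods, keyed by a seed sequence.
neighbourhoods = {
    "seqA": {"seqA", "seqB", "seqC"},
    "seqB": {"seqA", "seqB"},
    "seqC": {"seqA", "seqC", "seqD"},
    "seqD": {"seqC", "seqD"},
    "seqE": {"seqE"},
}

cover = greedy_cover(neighbourhoods)
print(cover)
```

On this toy input three of the five overlapping neighbourhoods are enough to cover every sequence. Greedy selection does not guarantee a minimum cover in general; it is used here only because it is short and easy to follow.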

Substances: Proteins

Year: 2001 PMID: 11294792 DOI: 10.1093/bioinformatics/17.3.272

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

Cited by (19 in total)

1. ProtoNet: hierarchical classification of the protein space.

2. Detecting distant homology with Meta-BASIC.

3. An assessment of substitution scores for protein profile-profile comparison.

4. Structural similarity to bridge sequence space: finding new families on the bridges.

5. A limited universe of membrane protein families and folds.

6. LIBRUS: combined machine learning and homology information for sequence-based ligand-binding residue prediction.

7. Genome-wide comparative gene family classification.

8. Refining homology models by combining replica-exchange molecular dynamics and statistical potentials.

9. CORAL: aligning conserved core regions across domain families.

10. Proteomic properties reveal phyloecological clusters of Archaea.