Nicola J Mulder, Rolf Apweiler.
Abstract
With the large influx of raw sequence data from genome sequencing projects, there is a need for reliable automatic methods for protein sequence analysis and classification. The most useful tools use various methods for identifying motifs or domains found in previously characterized protein families. This article reviews the tools and resources available on the web for identifying signatures within proteins and discusses how they may be used in the analysis of new or unknown protein sequences.
Year: 2001 PMID: 11806833 PMCID: PMC150457 DOI: 10.1186/gb-2001-3-1-reviews2001
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Useful tools and resources for protein family, domain and motif analysis
| Database | Description | URL | Published reference |
|---|---|---|---|
| Blocks | Database of protein alignment blocks | | [19] |
| CDD | Conserved domain database | | [47] |
| CluSTr | Clusters of SWISS-PROT and TrEMBL proteins | | [34] |
| DOMO | Protein-domain database based on sequence alignments | | [24] |
| InterPro | Integrated documentation resource for protein families, domains and functional sites | | [49] |
| IProClass | Integrated protein classification database | | [45] |
| MetaFam | Database of protein family information | | [43] |
| Pfam | Collection of multiple sequence alignments and hidden Markov models | | [13] |
| PIR | Protein Information Resource | | [4] |
| PIR-ALN | Curated database of protein sequence alignments | | [26] |
| PRINTS-S | Compendium of protein fingerprints | | [11] |
| ProClass | Non-redundant protein database organized by family relationships | | [28] |
| ProDom | Automatic compilation of homologous domains | | [21] |
| PROSITE | Database of patterns and profiles describing protein families and domains | | [9] |
| ProtoMap | Automatic hierarchical classification of SWISS-PROT proteins | | [30] |
| SBASE | Curated protein domain library based on sequence clustering | | [51] |
| SMART | Simple Modular Architecture Research Tool, a collection of protein families and domains | | [15] |
| SWISS-PROT and TrEMBL | Protein sequence databases | | [2] |
| SYSTERS | Systematic re-searching method for sequence searching and clustering | | [32] |
| TIGRFAMs | Protein families based on hidden Markov models | | [17] |
Figure 1. An example of a MetaFam family entry [43,44]. This is the entry for SRP54, the GTP-binding signal recognition particle domain, and shows the links between related entries in ProDom, PIR superfamilies, Pfam, DOMO, Blocks+, SBASE and ProtoMap. The domain structure for the selected SWISS-PROT protein SR54_HALN1 is shown at the bottom of the entry.
Figure 2. (a) An example of an InterPro entry [49,50]. This is entry IPR000402, which describes the Na+, K+ ATPase β subunit. The entry integrates two PROSITE patterns and a Pfam HMM, all of which are diagnostic for the same family. It includes an abstract describing the family, and match lists of all SWISS-PROT and TrEMBL proteins that belong to it. (b) The output of an InterProScan sequence search. The Escherichia coli flagellar biosynthetic protein FliP was scanned and shown to hit InterPro entry IPR002039, which describes the FliP protein family. The results are shown in both a graphical and a tabular view, and include the amino-acid positions of the signature matches.
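As a rough illustration of how a pattern-type signature yields the amino-acid positions mentioned above, the following minimal sketch translates a PROSITE-style pattern into a regular expression and reports 1-based match coordinates. It is not how InterProScan or PROSITE are implemented, and it covers only simple patterns, not profiles, fingerprints or HMMs. The function names `prosite_to_regex` and `scan` and the toy sequence are hypothetical; the example pattern is the widely used P-loop (ATP/GTP-binding site motif A) consensus, chosen for illustration rather than taken from this article.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex.

    Supported syntax: '-' separators, 'x' (any residue), [ABC] (any of),
    {ABC} (none of), repeat counts such as x(4) or x(2,4), and the
    '<' / '>' anchors for the N- and C-terminus.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):
            prefix, element = "^", element[1:]
        if element.endswith(">"):
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count, e.g. "x(4)" -> ("x", "4").
        m = re.fullmatch(r"([^()]+)(?:\((\d+(?:,\d+)?)\))?", element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, count = m.groups()
        if core == "x":
            core = "."  # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"  # excluded residues
        # [ABC] classes and single residues are already valid regex.
        repeat = "{" + count + "}" if count else ""
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def scan(sequence: str, pattern: str):
    """Return (start, end, matched_text) tuples with 1-based positions.

    A lookahead is used so that overlapping matches are also reported.
    """
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [
        (m.start() + 1, m.start() + len(m.group(1)), m.group(1))
        for m in regex.finditer(sequence)
    ]


if __name__ == "__main__":
    p_loop = "[AG]-x(4)-G-K-[ST]"   # P-loop consensus pattern
    toy_sequence = "MSTAGPSGSGKTTLL"  # invented sequence, for illustration only
    for start, end, text in scan(toy_sequence, p_loop):
        print(f"match {text} at residues {start}-{end}")
```

Short patterns like this one match many unrelated proteins by chance, which is one reason an integrated entry that combines several signatures for the same family is more diagnostic than any single pattern.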