Anirban Bhaduri, Ganesan Pugalenthi, Ramanathan Sowdhamini.
Abstract
BACKGROUND: The functional selection and three-dimensional structural constraints of proteins in nature often relate to the retention of significant sequence similarity between proteins of similar fold and function despite poor sequence identity. Organization of structure-based sequence alignments for distantly related proteins provides a map of the conserved and critical regions of the protein universe that is useful for the analysis of folding principles, for the evolutionary unification of protein families and for maximizing the information return from experimental structure determination. The Protein Alignment organised as Structural Superfamily (PASS2) database represents continuously updated structural alignments for evolutionarily related, sequentially distant proteins.
DESCRIPTION: An automated and updated version of PASS2 is in direct correspondence with SCOP 1.63 and consists of sequences having identity below 40% among themselves. Protein domains have been grouped into 628 multi-member superfamilies and 566 single-member superfamilies. Structure-based sequence alignments for the superfamilies have been obtained using COMPARER, while initial equivalences have been derived from a preliminary superposition using LSQMAN or STAMP 4.0. The final sequence alignments have been annotated for structural features using JOY 4.0. The database is supplemented with sequence relatives belonging to different genomes, conserved spatially interacting and structural motifs, probabilistic hidden Markov models of superfamilies based on the alignments, and useful links to other databases. Probabilistic models and sensitive position-specific profiles obtained from reliable superfamily alignments aid annotation of remote homologues and are useful tools in structural and functional genomics. PASS2 presents the phylogeny of its members based on both sequence and structural dissimilarities. Clustering of members allows us to understand the diversification of the family members. The search engine has been improved for simpler browsing of the database.
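The 40% identity criterion mentioned in the description can be illustrated with a minimal Python sketch. It is not the PASS2 procedure: the abstract does not say how identity is computed or how members are selected, so the definition used here (identical residues divided by columns where both sequences have a residue), the greedy selection order, and the names `percent_identity`, `filter_below_identity` and the toy domain labels are all assumptions made for illustration.

```python
from __future__ import annotations


def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity over columns where both aligned sequences have a residue.

    Assumption: this is one common definition; the source does not state
    which definition PASS2 uses.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def filter_below_identity(aligned: dict[str, str], cutoff: float = 40.0) -> list[str]:
    """Greedily keep members whose identity to every kept member is below cutoff."""
    kept: list[str] = []
    for name, seq in aligned.items():
        if all(percent_identity(seq, aligned[k]) < cutoff for k in kept):
            kept.append(name)
    return kept


# Toy pre-aligned sequences (hypothetical, not taken from PASS2).
toy_alignment = {
    "dom_a": "ACDEFGHIKL",
    "dom_b": "ACDEFGHIKM",  # nearly identical to dom_a, so it is dropped
    "dom_c": "AC-WYTRPQS",  # distant from dom_a, so it is kept
}
representatives = filter_below_identity(toy_alignment)
print(representatives)
```

A greedy pass like this depends on input order; a real pipeline might instead cluster members first and pick one representative per cluster.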
Year: 2004 PMID: 15059245 PMCID: PMC407847 DOI: 10.1186/1471-2105-5-35
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Flowchart representation of the steps involved in the curation of PASS2 database. Listed are useful tools and additional derived information that may be obtained from PASS2.
Figure 2. Superposed structures of the cytochrome superfamily representatives: The cytochrome superfamily has six representative members in PASS2 (1a7va-, 1bbha-, 1cpq--, 1e85a-, 256ba-, 2ccya-) which have been superposed as explained (see Curation of Alignments section). The figure has been created using MOLSCRIPT [32].
Figure 3. Representative structure-based sequence alignment for the cytochrome superfamily. The six members have been aligned and represented incorporating the three-dimensional features of JOY [18].
Comparison of the number of hits obtained in HMMSearch using models derived from regular multiple sequence alignments and structure-based sequence alignments.
| Superoxide dismutase | 46609 | 152 | 137 |
| Anticodon-binding domain of class I aminoacyl-tRNA synthetases | 47323 | 220 | 182 |
| Cyclophilin (peptidylprolyl isomerase) | 50891 | 112 | 98 |
| Hemopexin-like domain | 50923 | 103 | 73 |
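The two hit counts in each row can be compared directly. The short Python sketch below does so, with one caveat: this extract of the table has no column headers, so the counts are referred to only by position (`first_count`, `second_count`), and no assumption is made about which alignment type each column represents. The variable names are illustrative.

```python
# Rows copied from the table: (superfamily, SCOP code, first_count, second_count).
rows = [
    ("Superoxide dismutase", 46609, 152, 137),
    ("Anticodon-binding domain of class I aminoacyl-tRNA synthetases", 47323, 220, 182),
    ("Cyclophilin (peptidylprolyl isomerase)", 50891, 112, 98),
    ("Hemopexin-like domain", 50923, 103, 73),
]

# Absolute difference between the two hit counts in each row.
differences = [first - second for _, _, first, second in rows]

# Difference expressed as a fraction of the first count.
relative = [(first - second) / first for _, _, first, second in rows]

for (name, scop, first, second), diff, rel in zip(rows, differences, relative):
    print(f"{scop}  {first:>4} vs {second:>4}  diff={diff:>3}  ({rel:.1%})  {name}")
```

In all four rows the first count exceeds the second; the gap is smallest for superoxide dismutase and largest for the hemopexin-like domain.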