Anirban Bhaduri, R Ravishankar, R Sowdhamini.
Abstract
Limitations in techniques for the elucidation of protein function have led to an increasing gap between the annotated proteins and those encoded in a genome. The functional selection and three-dimensional structural constraints of proteins in nature often relate to the retention of significant sequence similarity between proteins of similar fold and function despite poor sequence identity. We identify spatially interacting conserved regions, or motifs, within protein superfamilies that are critical for structure and/or function. Searching sequence databases with these descriptors as additional constraints is one approach to identifying putative additional members of superfamilies. Such constrained searches have been tested against proteins of known structure and show high specificity (93%) with a low error rate of 0.0004. This approach has been compared with other sensitive sequence search methods (e.g., PSI-BLAST, HMMsearch, and IMPALA). It has been extended to analyze the distribution of 11 superfamilies in 93 genomes, including the human genome. Copyright 2004 Wiley-Liss, Inc.
Year: 2004 PMID: 14997562 DOI: 10.1002/prot.10638
Source DB: PubMed Journal: Proteins ISSN: 0887-3585