iMOTdb--a comprehensive collection of spatially interacting motifs in proteins.

Ganesan Pugalenthi, Anirban Bhaduri, R Sowdhamini.

Abstract

Recognition of the conserved residues that represent a protein family is crucial for a clearer understanding of biological function as well as for better recognition of additional members in sequence databases. Functionally important residues are recognized well due to their high degree of conservation in closely related sequences and are annotated in functional motif databases. Structural motifs are central to the integrity of the fold and require careful analysis for their identification. We report the availability of a database of spatially interacting motifs in single protein structures as well as those among distantly related protein structures that belong to a superfamily. Spatial interactions amongst conserved motifs are automatically measured using sequence similarity scores and distance calculations. Interactions between pairs of conserved motifs are described in the form of pseudoenergies. The iMOTdb database provides information for 854,488 motifs corresponding to 60,849 protein structural domains and 22,648 protein structural entries.

Year:  2006        PMID: 16381866      PMCID: PMC1347487          DOI: 10.1093/nar/gkj125

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

The central question in the protein folding problem is how proteins spontaneously arrive at their unique three-dimensional fold. Anfinsen's hypothesis states that the entire information about the tertiary structure of a protein is contained in its amino acid sequence (1). Proteins are largely tolerant to mutations, and the large amount of information available for homologous protein families reveals that mutations are more likely in structurally variable regions (2–8). Structurally invariant regions point to solvent-buried residues that undergo only permitted amino acid exchanges, in which the exchanged amino acids retain similar chemical groups. We had earlier identified such structurally invariant residues amongst superfamilies where proteins are distantly related but retain similar biological functions (9,10). Functionally important residues can be recognized from mutagenesis experiments or simply from their high sequence and structural conservation among protein families and superfamilies. Information on such functional residues can be obtained from popular motif databases (11). However, conserved residues crucial for structural integrity are hard to recognize since they undergo permitted amino acid exchanges. We had earlier employed conserved residues that interact spatially with other motifs in the fold to recognize additional putative members of a protein family (12) and developed a web server for the automatic identification of spatially interacting conserved residues (13). There have been similar attempts by other groups to visualize conserved regions on protein structures (14). In this paper, we report the availability of a database containing spatially proximate conserved motifs, in which iMOT has been applied to the whole database of protein structural superfamilies (7,10) and to all structural entries in the Protein Data Bank (15).

CONTENTS OF THE DATABASE

This database provides interacting motifs for 60 849 protein structural domains, grouped into superfamilies, derived from the SCOP database, release 1.67 (7). A total of 1731 problematic SCOP entries could not be included in our database owing to spurious values in the calculations, a lack of spatial interactions between conserved residues, a lack of homologues, or the presence of only Cα coordinates. For each structural member of a superfamily considered in the SCOP database, homologous sequences are individually identified. Each alignment position is assigned an average similarity score derived from an amino acid exchange matrix (16). Contiguous residues with an average similarity score of more than 50 are treated as conserved residues or motifs. These motifs are mapped on to the structural superfamily member to examine their spatial proximity to each other. Pairs of conserved residues are further examined by calculating pseudoenergies that describe the strength of interactions (13). Spatially interacting motifs are then mapped on to the alignment of the superfamily to recognize those that are conserved throughout the superfamily [for details, please see the help web pages and (12,13)]. Interacting motifs are also provided for all 22 648 protein structures deposited in the PDB (May 2005 release). iMOTdb pertains to 854 488 motifs of 60 849 protein structural domains corresponding to the SCOP 1.67 database (7). Spatially interacting motifs identified in protein structures are mapped and colour-coded on the sequence alignment as well as on the structure [using MOLSCRIPT (17) and CHIME (MDL Information Systems, Inc.)]. The extent of spatial interaction between all possible pairs of motifs is provided as a symmetric matrix in which the values are expressed as pseudoenergies (13).
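The motif definition above (contiguous alignment positions whose average similarity score exceeds 50) can be illustrated with a minimal Python sketch. This is not the iMOT implementation: the function name `find_conserved_motifs`, the `min_length` parameter and the example scores are hypothetical, and the per-position scores are assumed to have been computed already from the exchange matrix.

```python
from typing import List, Tuple


def find_conserved_motifs(
    scores: List[float],
    threshold: float = 50.0,  # cutoff stated in the text ("more than 50")
    min_length: int = 1,      # hypothetical: the text gives no minimum motif length
) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, 0-based and inclusive, of contiguous
    alignment positions whose average similarity score exceeds threshold."""
    motifs: List[Tuple[int, int]] = []
    start = None  # start index of the run currently being extended, if any
    for i, score in enumerate(scores):
        if score > threshold:
            if start is None:
                start = i
        elif start is not None:
            # The run ended at position i - 1.
            if i - start >= min_length:
                motifs.append((start, i - 1))
            start = None
    # Close a run that extends to the last alignment position.
    if start is not None and len(scores) - start >= min_length:
        motifs.append((start, len(scores) - 1))
    return motifs


# Illustrative scores only; these are not taken from iMOTdb.
column_scores = [12.0, 55.0, 61.0, 70.0, 48.0, 50.0, 90.0, 88.0]
print(find_conserved_motifs(column_scores))  # prints [(1, 3), (6, 7)]
```

A score of exactly 50 is excluded here because the text specifies "more than 50".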
Pseudoenergies are classified, by benchmarking on known structural motifs, as strong (better than −125), medium (between −125 and −50) and weak (worse than −50) and are colour-coded accordingly. Structural information is provided for individual motifs, including their location in secondary structures, solvent accessibility patterns and positional variations amongst superfamily members (reflected as root mean square deviations). This database provides the user with an option to search genome databases using selected interacting motifs, as in the SCANMOT server (18), and using PHIBLAST (19). Hyperlinks to other online resources, such as PROSITE (11), CKAAPsDB (20), PRINTS (21) and eMOTIFS (22), are provided so that direct comparison of motif definitions and peptide signatures (23) may be possible.
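The three pseudoenergy classes can be expressed as a short Python sketch. The function name `classify_pseudoenergy` and the example matrix are hypothetical. The text does not say how the boundary values −125 and −50 themselves are assigned, so placing both in the medium class is an assumption made here.

```python
def classify_pseudoenergy(energy: float) -> str:
    """Map a pairwise motif pseudoenergy to the class described in the text.

    More negative values indicate stronger interactions. Assigning the
    boundary values -125 and -50 to "medium" is an assumption of this sketch.
    """
    if energy < -125:
        return "strong"
    if energy <= -50:
        return "medium"
    return "weak"


# Illustrative symmetric matrix for three motifs; values are not from iMOTdb.
pseudoenergies = [
    [0.0, -140.0, -30.0],
    [-140.0, 0.0, -75.0],
    [-30.0, -75.0, 0.0],
]

# The matrix is symmetric, so only the upper triangle needs to be visited.
for i in range(len(pseudoenergies)):
    for j in range(i + 1, len(pseudoenergies)):
        value = pseudoenergies[i][j]
        print(f"motif {i} / motif {j}: {value} ({classify_pseudoenergy(value)})")
```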

APPLICATIONS

Spatially interacting motifs can be critical for structure and/or function. They are useful in searching for distant homologues and establishing remote homologies among the largely unassigned sequences in genome databases. The availability of information on structural motifs for a large number of protein structures should provide useful starting points for detailed analysis, for the rational design of experiments in protein folding and site-directed mutagenesis, and for understanding mechanisms of action and conformational changes in proteins. iMOTdb database can be accessed from .

REFERENCES (22 in total)

1.  The Protein Data Bank.

Authors:  H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

2.  The EMOTIF database.

Authors:  J Y Huang; D L Brutlag
Journal:  Nucleic Acids Res       Date:  2001-01-01       Impact factor: 16.971

3.  The PROSITE database, its status in 2002.

Authors:  Laurent Falquet; Marco Pagni; Philipp Bucher; Nicolas Hulo; Christian J A Sigrist; Kay Hofmann; Amos Bairoch
Journal:  Nucleic Acids Res       Date:  2002-01-01       Impact factor: 16.971

4.  CKAAPs DB: a Conserved Key Amino Acid Positions DataBase.

Authors:  Wilfred W Li; Boojala V B Reddy; John G Tate; Ilya N Shindyalov; Philip E Bourne
Journal:  Nucleic Acids Res       Date:  2002-01-01       Impact factor: 16.971

5.  3MOTIF: visualizing conserved protein sequence motifs in the protein structure database.

Authors:  Steven P Bennett; Craig G Nevill-Manning; Douglas L Brutlag
Journal:  Bioinformatics       Date:  2003-03-01       Impact factor: 6.937

6.  SMoS: a database of structural motifs of protein superfamilies.

Authors:  Saikat Chakrabarti; K Venkatramanan; R Sowdhamini
Journal:  Protein Eng       Date:  2003-11

7.  The PRINTS database: a resource for identification of protein families.

Authors:  Terri K Attwood
Journal:  Brief Bioinform       Date:  2002-09       Impact factor: 11.622

8.  iMOT: an interactive package for the selection of spatially interacting motifs.

Authors:  A Bhaduri; G Pugalenthi; N Gupta; R Sowdhamini
Journal:  Nucleic Acids Res       Date:  2004-07-01       Impact factor: 16.971

9.  Conserved spatially interacting motifs of protein superfamilies: application to fold recognition and function annotation of genome data.

Authors:  Anirban Bhaduri; R Ravishankar; R Sowdhamini
Journal:  Proteins       Date:  2004-03-01

10.  PASS2: an automated database of protein alignments organised as structural superfamilies.

Authors:  Anirban Bhaduri; Ganesan Pugalenthi; Ramanathan Sowdhamini
Journal:  BMC Bioinformatics       Date:  2004-04-02       Impact factor: 3.169

