Yanli Wang, Kenneth J Addess, Jie Chen, Lewis Y Geer, Jane He, Siqian He, Shennan Lu, Thomas Madej, Aron Marchler-Bauer, Paul A Thiessen, Naigong Zhang, Stephen H Bryant.
Abstract
Three-dimensional (3D) structure is now known for a large fraction of all protein families. Thus, it has become rather likely that one will find a homolog with known 3D structure when searching a sequence database with an arbitrary query sequence. Depending on the extent of similarity, such neighbor relationships may allow one to infer biological function and to identify functional sites such as binding motifs or catalytic centers. Entrez's 3D-structure database, the Molecular Modeling Database (MMDB), provides easy access to the richness of 3D structure data and its large potential for functional annotation. Entrez's search engine offers several tools to assist biologist users: (i) links between databases, such as between protein sequences and structures, (ii) pre-computed sequence and structure neighbors, (iii) visualization of structure and sequence/structure alignment. Here, we describe an annotation service that combines some of these tools automatically, Entrez's 'Related Structure' links. For all proteins in Entrez, similar sequences with known 3D structure are detected by BLAST and alignments are recorded. The 'Related Structure' service summarizes this information and presents 3D views mapping sequence residues onto all 3D structures available in MMDB (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=structure).
Year: 2006 PMID: 17135201 PMCID: PMC1751549 DOI: 10.1093/nar/gkl952
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Non-identical protein sequences in Entrez have been classified into groups linked to related structures, at various levels of sequence similarity. Sequence identity is calculated from the BLAST alignments, and only those neighbor relationships that produce an aligned footprint of 50 residues or more are listed here. The analysis also excludes protein sequences that were obtained directly from MMDB. Forty-eight percent of sequences in Entrez protein have at least one structure neighbor with an extensive alignment footprint and at least 30% identical residues.
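The filtering criteria in the Figure 1 caption (an aligned footprint of at least 50 residues and at least 30% identical residues) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names are hypothetical, and two definitions are assumptions, namely that the footprint is the number of alignment columns where both sequences have a residue, and that percent identity is taken over those same columns. The source only says that identity is calculated from the BLAST alignments.

```python
from typing import Tuple


def alignment_stats(query_aln: str, subject_aln: str) -> Tuple[int, float]:
    """Return (footprint, percent identity) for one gapped pairwise alignment.

    Assumption: footprint = columns where neither row has a gap ('-'),
    and identity is computed over those columns only.
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned rows must have the same length")
    aligned = 0
    identical = 0
    for q, s in zip(query_aln, subject_aln):
        if q == "-" or s == "-":
            continue  # gap column: not part of the footprint
        aligned += 1
        if q.upper() == s.upper():
            identical += 1
    pct_identity = 100.0 * identical / aligned if aligned else 0.0
    return aligned, pct_identity


def is_extensive_neighbor(query_aln: str, subject_aln: str,
                          min_footprint: int = 50,
                          min_identity: float = 30.0) -> bool:
    """Apply the thresholds quoted in the caption (50 residues, 30% identity)."""
    footprint, pct_identity = alignment_stats(query_aln, subject_aln)
    return footprint >= min_footprint and pct_identity >= min_identity
```

Under these assumptions, a short alignment is rejected on footprint alone even when it is highly identical, which matches the caption's emphasis on an extensive alignment.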
Figure 2. A screenshot of the ‘Related Structure’ summary along with Entrez's document summary for protein NP_036676. Clicking on the ‘Related Structure’ option from the ‘Links’ pull-down menu launches the summary view.
Figure 3. A Cn3D view of the query sequence from Figure 2 aligned to chain A of the related structure 1O86 (PDB code). Residues in aligned regions are displayed in upper-case letters, with identical residue pairs rendered in red. Residues within a 5 Å contact radius of the bound drug lisinopril are highlighted in the 3D structure view and automatically mapped onto the aligned residues shown in the sequence alignment window. Side chains of these residues are displayed selectively and rendered as ball-and-stick models.
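The two steps described in the Figure 3 caption, selecting residues within a 5 Å contact radius of a bound ligand and mapping them onto the aligned query residues, can be sketched as follows. This is an illustration only, not Cn3D's implementation. The function names and the input layout (a dictionary of residue index to atom coordinates) are hypothetical, positions are 0-based, and the alignment rows are assumed to start at the first residue of both sequences.

```python
import math
from typing import Dict, Iterable, List, Sequence, Tuple

Coord = Tuple[float, float, float]


def residues_in_contact(residue_atoms: Dict[int, List[Coord]],
                        ligand_atoms: Sequence[Coord],
                        radius: float = 5.0) -> List[int]:
    """Return indices of residues with any atom within `radius` of any ligand atom."""
    hits = []
    for index, atoms in residue_atoms.items():
        if any(math.dist(a, l) <= radius for a in atoms for l in ligand_atoms):
            hits.append(index)
    return sorted(hits)


def map_to_query(structure_positions: Iterable[int],
                 query_aln: str, subject_aln: str) -> Dict[int, int]:
    """Map structure-chain positions to query positions through a gapped alignment.

    Positions that fall opposite a gap in the query have no counterpart
    and are left out of the result.
    """
    mapping = {}
    q_pos = s_pos = 0
    for q, s in zip(query_aln, subject_aln):
        if q != "-" and s != "-":
            mapping[s_pos] = q_pos
        if q != "-":
            q_pos += 1
        if s != "-":
            s_pos += 1
    return {p: mapping[p] for p in structure_positions if p in mapping}
```

In this sketch, contact residues that sit in an unaligned region of the structure simply do not appear in the mapped result, so only sites covered by the alignment carry over to the query sequence.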