Thomas Madej, Christopher J Lanczycki, Dachuan Zhang, Paul A Thiessen, Renata C Geer, Aron Marchler-Bauer, Stephen H Bryant.
Abstract
The computational detection of similarities between protein 3D structures has become an indispensable tool for the detection of homologous relationships, the classification of protein families and functional inference. Consequently, numerous algorithms have been developed that facilitate structure comparison, including rapid searches against a steadily growing collection of protein structures. To this end, NCBI's Molecular Modeling Database (MMDB), which is based on the Protein Data Bank (PDB), maintains a comprehensive and up-to-date archive of protein structure similarities computed with the Vector Alignment Search Tool (VAST). These similarities have been recorded on the level of single proteins and protein domains, comprising in excess of 1.5 billion pairwise alignments. Here we present VAST+, an extension to the existing VAST service, which summarizes and presents structural similarity on the level of biological assemblies or macromolecular complexes. VAST+ simplifies structure neighboring results and shows, for macromolecular complexes tracked in MMDB, lists of similar complexes ranked by the extent of similarity. VAST+ replaces the previous VAST service as the default presentation of structure neighboring data in NCBI's Entrez query and retrieval system. MMDB and VAST+ can be accessed via http://www.ncbi.nlm.nih.gov/Structure.
Year: 2013 PMID: 24319143 PMCID: PMC3965051 DOI: 10.1093/nar/gkt1208
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. This histogram displays the number of structures in MMDB (blue), categorized by the size of the biological assembly. Monomers, dimers and higher oligomers up to dodecamers are plotted as separate categories; the 13th category summarizes tridecamers and all higher oligomers. The y-axis is scaled logarithmically. Red columns indicate the number of structures in that category that have at least one complete biological assembly match according to VAST+.
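The binning described in the Figure 1 caption can be sketched in a few lines of Python. This is only an illustration: the function names and the sample sizes are hypothetical and are not taken from MMDB or from the authors' code.

```python
from collections import Counter

def size_category(n_molecules: int) -> int:
    """Map an assembly size to one of 13 categories.

    Categories 1..12 stand for monomers through dodecamers;
    category 13 collects tridecamers and all higher oligomers.
    """
    if n_molecules < 1:
        raise ValueError("an assembly must contain at least one molecule")
    return min(n_molecules, 13)

def assembly_histogram(sizes):
    """Return a 13-element list of counts, one per category."""
    counts = Counter(size_category(n) for n in sizes)
    return [counts.get(category, 0) for category in range(1, 14)]

# Hypothetical assembly sizes, invented purely for illustration.
sizes = [1, 1, 2, 2, 2, 4, 12, 13, 24, 60]
counts = assembly_histogram(sizes)
print(counts)
```

When such counts are drawn on a logarithmic y-axis, as in the figure, empty categories have no finite bar height and need separate handling by the plotting code.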
Figure 2. The VAST+ web service generates lists of structures that have 3D similarity to the query. Matches are evaluated with biological assemblies as the unit of comparison (referred to as Biological Units) and may summarize simultaneous alignment of several protein molecule pairs. The query structure ‘3O6F’ (24) currently yields 2712 structure neighbors. Only the 115 neighbors with a complete biological assembly match have been selected in this example (via the ‘display filters’ menu, shown as collapsed in this figure). The 115 complete matches have been sorted by RMSD, and the third-ranking match has been selected to provide more detail. The tabulated matches are shown with their PDB accession, descriptive text, the number of proteins aligned in the match, the total number of aligned residues, the sequence identity and the RMSD resulting from the simultaneous superimposition of all aligned molecules. In this example, the query ‘3O6F’ matches the structure ‘1J8H’ with a total of four aligned protein molecules, totaling 768 residues and resulting in a superposition with 2.55 Å RMSD. Of the residues in 3O6F and 1J8H that were spatially aligned by VAST, 80% are identical. The extended panel characterizing this selected match contains a table that lists pairs of matching/aligned proteins, and it provides schematic depictions of each biological assembly’s composition and interactions. The user can mouse over those schematics to identify individual molecules and their corresponding match in the other structure (as shown in this example). The individual protein match table contains action buttons that provide access to the pairwise sequence alignments as derived from the VAST superimposition and launch points for visualization of the structure superimposition with the protein structure viewer Cn3D (23).
Each “3D View” button will open a superposition of the complete biological assembly alignment with the 3D view centered on the selected protein molecule and its sequence data featured in the Cn3D sequence viewer window. Next to the Aligned Molecules table, an information box lists summary statistics that characterize the matched biological assembly.
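The filter-then-sort view described in the Figure 2 caption can be mimicked on a local list of match records. The sketch below is an assumption-laden illustration: the `Match` record and its field names are hypothetical and do not reflect any VAST+ data format, and only the 1J8H values come from the caption; the other two rows are invented.

```python
from dataclasses import dataclass

@dataclass
class Match:
    pdb_id: str            # PDB accession of the neighbor (hypothetical field name)
    aligned_proteins: int  # number of protein molecules aligned in the match
    aligned_residues: int  # total number of aligned residues
    seq_identity: float    # fraction of aligned residues that are identical
    rmsd: float            # RMSD in angstroms after simultaneous superposition
    complete: bool         # True if the whole biological assembly matched

def complete_matches_by_rmsd(matches):
    """Keep only complete assembly matches and sort them by ascending RMSD."""
    return sorted((m for m in matches if m.complete), key=lambda m: m.rmsd)

matches = [
    Match("1J8H", 4, 768, 0.80, 2.55, True),   # values quoted in the caption
    Match("HYP1", 2, 350, 0.45, 1.90, False),  # invented partial match
    Match("HYP2", 4, 760, 0.95, 1.10, True),   # invented complete match
]

ranked = complete_matches_by_rmsd(matches)
print([m.pdb_id for m in ranked])
```

Sorting by RMSD alone favors tight superpositions; a different key, such as the number of aligned residues, would rank the same records by extent of the match instead.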
Figure 3. Visualization of structurally matching biological assemblies, as rendered by the visualization tool Cn3D. Cn3D is a helper application for the web browser, available for Windows and OS X platforms. The query structure, PDB accession 3O6F, represents an autoreactive T-cell receptor (MS2-3C8, molecules rendered in green and brown) complexed with a self-peptide derived from myelin basic protein and the multiple sclerosis-associated MHC molecule HLA-DR4 (molecules rendered in magenta and blue) (24). The self-peptide was fused with the MHC molecule for the experiment, which explains why the query is represented as a biological assembly with only four components (Figure 2). The peptide is rendered in gray, as is the default for all unaligned segments in Cn3D visualization sessions launched from VAST+ results pages. The left panel shows 3O6F superimposed with the structure neighbor, PDB accession 1J8H (25), which contains a complex between HLA-DR3, an Influenza hemagglutinin peptide, and a human alpha/beta T-cell receptor. Molecules are rendered so that their colors match those of the corresponding query molecules. The structures of the two complexes match well, resulting in a superimposition of 768 amino acid residues at ∼2.6 Å RMSD. This demonstrates how well the autoreactive T-cell receptor complex mimics complexes that include foreign peptides, and it is thought that this binding mode is responsible for the autoimmune TCR escaping negative selection. The right panel shows the VAST+ alignment between 3O6F and the structure of a T-cell receptor from a patient with multiple sclerosis, complexed with a myelin basic protein-derived peptide and an HLA-DR2 MHC, PDB accession 1YMM (26). The conformations of the two complexes are different although their components are similar, and VAST+ does not consider the complete biological assemblies to match.
Instead, it reports the most extensive sub-structure match, which in this case involves both subunits of the MHC (molecules rendered in magenta and blue). The molecules corresponding to the TCR are rendered in gray and would not be displayed by default. The unusual conformation of the complex reported in 1YMM is thought to represent an alternative binding mode that helps autoimmune TCRs to escape negative selection.
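A single RMSD for a multi-molecule match, such as the ∼2.6 Å quoted above for 768 residues, can be understood as a root mean square over all aligned residue pairs pooled across the matched molecules. The sketch below assumes the neighbor's coordinates have already been transformed by one rigid-body superposition. The function name and the toy coordinates are hypothetical, and the code is not the VAST+ implementation.

```python
import math

def pooled_rmsd(aligned_pairs):
    """RMSD over all aligned positions, pooled across molecule pairs.

    aligned_pairs: list of (coords_a, coords_b) tuples, one per matched
    molecule pair, where each coords list holds (x, y, z) tuples for the
    aligned residues in the same order. Coordinates are assumed to be
    already superimposed in a common frame.
    """
    total_sq = 0.0
    n_positions = 0
    for coords_a, coords_b in aligned_pairs:
        if len(coords_a) != len(coords_b):
            raise ValueError("aligned coordinate lists must have equal length")
        for p, q in zip(coords_a, coords_b):
            total_sq += sum((a - b) ** 2 for a, b in zip(p, q))
            n_positions += 1
    if n_positions == 0:
        raise ValueError("no aligned positions")
    return math.sqrt(total_sq / n_positions)

# Toy coordinates in angstroms, invented for illustration.
pairs = [
    ([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]),
    ([(5.0, 5.0, 5.0)], [(5.0, 5.0, 7.0)]),
]
rmsd = pooled_rmsd(pairs)
print(round(rmsd, 3))
```

Pooling weights every aligned residue equally, so a large, well-fitting molecule can mask a poorly fitting small one; per-molecule RMSD values computed with the same function on single pairs would expose that.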
URLs for MMDB and VAST resources
| MMDB | Database home page | |
| MMDB FTP | Data distribution | |
| VAST | Identify structurally similar individual protein molecules | |
| VAST+ | Identify structurally similar macromolecular complexes | |
| VAST search | Input the 3D coordinates of a query structure to search for similar structures | |
| Cn3D | Molecular graphics viewer | |
| CBLAST | Find 3D structures that are related to a query protein via sequence comparison | |