Patrick May, Annika Kreuchwig, Thomas Steinke, Ina Koch.
Abstract
With the growing amount of experimental data, the number of known protein structures increases continuously. Classification of protein structures helps to understand the relationships between protein structure and function. The main classification methods based on secondary structures are SCOP, CATH and TOPS, which classify under different aspects and can therefore lead to different results. We developed a mathematically unique representation of protein structure topologies at a higher abstraction level, providing new aspects of classification and enabling fast searches through the data. The Protein Topology Graph Library (PTGL; http://ptgl.zib.de) aims to provide a database of protein secondary structure topologies, including search facilities, visualization as intuitive topology diagrams as well as in the 3D structure, and additional information. Secondary structure-based protein topologies are represented uniquely as undirected labeled graphs in four different ways, allowing for exploration under different aspects. The linear notations and the 2D and 3D diagrams of each notation facilitate a deeper understanding of protein topologies. Several search functions for topologies and sub-topologies, a BLAST search option, and links to SCOP, CATH and PDBsum support individual and large-scale investigation of protein structures. Currently, PTGL comprises topologies of 54,859 protein structures. Structural patterns for common motifs such as the TIM barrel or Jelly Roll are pre-implemented and can easily be searched.
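The graph representation described in the abstract can be illustrated with a minimal Python sketch. It is not PTGL's implementation. The vertex labels (`'H'` for helix, `'E'` for strand), the edge labels (`'p'`, `'a'`, `'m'` for parallel, antiparallel and mixed contacts), the helper names and the example data are all assumptions made for illustration. Treating each connected component of a protein graph as a folding graph is likewise an assumption that the abstract does not state.

```python
from collections import defaultdict

def build_graph(sse_types, contacts):
    """Build an undirected labeled graph.

    sse_types: list of SSE type labels in sequence order (hypothetical: 'H' or 'E').
    contacts:  iterable of (i, j, label) spatial contacts between SSE indices.
    Returns {vertex: {neighbor: edge_label}}.
    """
    adj = defaultdict(dict)
    for i, j, label in contacts:
        adj[i][j] = label
        adj[j][i] = label  # undirected: store both directions
    return {v: dict(adj[v]) for v in range(len(sse_types))}

def restrict(graph, sse_types, keep):
    """Keep only vertices whose SSE type is in `keep`, e.g. {'E'} for a beta graph."""
    return {
        v: {w: lab for w, lab in nbrs.items() if sse_types[w] in keep}
        for v, nbrs in graph.items()
        if sse_types[v] in keep
    }

def components(graph):
    """Return the connected components as sorted vertex lists."""
    seen, result = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            v = stack.pop()
            comp.append(v)
            for w in graph[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        result.append(sorted(comp))
    return result

# Made-up example: five SSEs and three spatial contacts.
types = ['E', 'E', 'H', 'E', 'E']
alpha_beta = build_graph(types, [(0, 1, 'a'), (1, 2, 'm'), (3, 4, 'p')])
beta = restrict(alpha_beta, types, {'E'})
```

Here `components(alpha_beta)` groups SSEs 0, 1 and 2 into one component and SSEs 3 and 4 into another, while the β graph drops the helix and leaves the two strand pairs.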
Year: 2009 PMID: 19906706 PMCID: PMC2808981 DOI: 10.1093/nar/gkp980
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. The ‘Topology Browser’ for protein 1G3E chain A. On the left side, the β protein graph is shown together with additional information for the SSEs. On the right side, the β folding graph A in KEY notation is shown. The linear notation is ‘(1,1,3,−1,−1)’, representing a barrel structure.
Figure 2. The TIM barrel motif as a 3D image (upper left), the corresponding KEY notation (middle left) and two search patterns (bottom left), both described in RED notation as β graphs. For visualization of the TIM barrel motif, the helices are shown in the β graph image. On the right side, the occurrences of the motif in CATH, SCOP and PTGL are depicted. The differences between PTGL and CATH or SCOP arise mainly from applying a stricter but more precise definition of the motif, which limits the risk of false positives.
PTGL content
| Graph type | Total | Different | Non-bifurcated | Barrels |
|---|---|---|---|---|
| α–β | 606 816 | 77 127 | 2432 | 666 |
| α | 636 820 | 29 984 | 680 | 191 |
| β | 199 122 | 13 418 | 2246 | 696 |
The first column indicates the type of folding graph. ‘Total’ gives the total number of folding graphs of that type; ‘Different’ the number of distinct graphs among the total; ‘Non-bifurcated’ the number of distinct non-bifurcated graphs, including barrels; and ‘Barrels’ the number of distinct barrel structures.
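The distinction between bifurcated, non-bifurcated and barrel graphs can be sketched with a simple degree check. The criteria used here are assumptions, since the text does not define them: a graph counts as bifurcated if any vertex has more than two neighbors, and a connected non-bifurcated graph counts as a barrel if every vertex has exactly two neighbors, which makes it a closed ring. The function name and the example graphs are hypothetical.

```python
def classify(adjacency):
    """Classify a connected folding graph given as {vertex: set_of_neighbors}.

    Assumed criteria (not taken from the source):
      - 'bifurcated':     some vertex has more than two neighbors
      - 'barrel':         at least three vertices, each with exactly two neighbors
      - 'non-bifurcated': everything else (open chains of SSEs)
    """
    degrees = [len(neighbors) for neighbors in adjacency.values()]
    if any(d > 2 for d in degrees):
        return "bifurcated"
    if len(degrees) >= 3 and all(d == 2 for d in degrees):
        return "barrel"
    return "non-bifurcated"

# Made-up examples: an open sheet, a closed five-strand ring, and a branched graph.
sheet = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
ring = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
branched = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
```

Under these assumptions a barrel is a special case of a non-bifurcated graph, which matches the table's note that the ‘Non-bifurcated’ counts include barrels.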