Cameron Mura, Stella Veretnik, Philip E Bourne.
Abstract
We suspect that there is a level of granularity of protein structure intermediate between the classical levels of "architecture" and "topology," as reflected in such phenomena as extensive three-dimensional structural similarity above the level of (super)folds. Here, we examine this notion of architectural identity despite topological variability, starting with a concept that we call the "Urfold." We believe that this model could offer a new conceptual approach for protein structural analysis and classification: indeed, the Urfold concept may help reconcile various phenomena that have been frequently recognized or debated for years, such as the precise meaning of "significant" structural overlap and the degree of continuity of fold space. More broadly, the role of structural similarity in sequence↔structure↔function evolution has been studied via many models over the years; by addressing a conceptual gap that we believe exists between the architecture and topology levels of structural classification schemes, the Urfold eventually may help synthesize these models into a generalized, consistent framework. Here, we begin by qualitatively introducing the concept.
Keywords: architecture; fold space; molecular evolution; protein structure classification; secondary structure; superfold; topology; β-sheet
Year: 2019 PMID: 31599042 PMCID: PMC6863707 DOI: 10.1002/pro.3742
Source DB: PubMed Journal: Protein Sci ISSN: 0961-8368 Impact factor: 6.725
Figure 1. Schematic representation of the Urfold concept, with respect to protein structure space. This diagram sketches the granularity of structural levels that are typically considered (a), ranging from coarsest (e.g., “α/β class”) to finer levels (e.g., “homologous superfamily” and below). Note that the terms used here (class, architecture, etc.) closely align with the usage in systems such as CATH, but they are not necessarily identical (the “c,” “a,” etc. in panel a are lowercase for this reason: we do not mean to imply, simply by using these terms, that the present work strictly adheres to any particular classification scheme). The exact position of the Urfold, between the topology (red) and architecture (yellow) levels, is currently indeterminate. These conceptual terms are elaborated in (b) and (c). Panel (b) shows the relationships, in terms of a hierarchical concept map or ontology, between (i) the various conceptual levels of protein structural entities found in most hierarchical classification systems (class, architecture, topology, etc.), in the vertical direction, and (ii) the grouping or “aggregation” function served by such terms as “superfamily” and “superfold” (and, now, “urfold”), represented in the mostly horizontal direction (semitransparent slabs, color matched to panel a). The “eye” icon in (b) gazes down (and through) the yellow slab, representing entities at the architecture level, whereupon we see a set of architecturally identical protein folds (SH3/Sm, OB, etc.) that can be grouped into the small β‐barrel (SBB) Urfold in (c); here, contour lines represent different thresholds, or stringencies, of clustering discrete entities at that given level along the structural classification hierarchy (the concept planes/slabs). In a sense, the Urfold concept is to the architecture level as the superfold concept is to the topology(/fold) level. The histogram in (d) roughly indicates the relative populations of these structural levels. A noticeable jump occurs between the upper levels in most classification schemes (CATH, SCOP, ECOD), and we suggest that the Urfold corresponds to structural entities lying within the architecture ↭ topology gap.
Figure 2. Some examples of putative Urfolds and analyses thereof. Many protein structures exhibit architectural similarity despite topological variability, irrespective of considerations of homology, a principle we term the Urfold. This concept is illustrated here using (within each panel) two or more examples of distinct folds that adopt equivalent architectures, suggesting them as putative Urfolds. All 3D structures are shown as cartoon ribbon diagrams, and PDB codes are indicated near each structure (light‐gray). The N′ and C′‐termini are marked in most cases (space permitting), and individual SSEs are color ramped from N′ → C′ along the visible spectrum (red → orange → yellow → ⋯). The helices are of secondary importance for the immediate purposes of (a) and (d), so in those two panels their color is either light‐tan (a) or a hue that is intermediate between the adjoining strands (d). Also in (a) and (d), individual β‐strand numbers appear on the cartoons. The strand layout for each β‐sheet is diagrammed underneath each representation, for example, as 5↓1↑2↓3↑4↓ for the SH3/Sm superfold in (d). For cases wherein we consider the helices to have a pivotal role in defining a particular Urfold (i.e., panels b and c), these schematic diagrams are also used to indicate the approximate location of each helix, for example, the “⦚31↓3↑⋯” for the KH domain of hnRNP K in (b). In general, the coloring and diagrammatic schemes are intended to expose the nature of the equivalencies and other mappings between the salient SSEs. Further descriptions of these putative Urfold examples are provided in the text.