
Intricate knots in proteins: Function and evolution.

Peter Virnau, Leonid A Mirny, Mehran Kardar.

Abstract

Our investigation of knotted structures in the Protein Data Bank reveals the most complicated knot discovered to date. We suggest that the occurrence of this knot in a human ubiquitin hydrolase might be related to the role of the enzyme in protein degradation. While knots are usually preserved among homologues, we also identify an exception in a transcarbamylase. This allows us to exemplify the function of knots in proteins and to suggest how they may have been created.

Year:  2006        PMID: 16978047      PMCID: PMC1570178          DOI: 10.1371/journal.pcbi.0020122

Source DB:  PubMed          Journal:  PLoS Comput Biol        ISSN: 1553-734X            Impact factor:   4.475


Introduction

Although knots are abundant and complex in globular homopolymers [1-3], they are rare and simple in proteins [4-8]. Sixteen methyltransferases in bacteria and viruses can be combined into the α/β knot superfamily [9], and several isozymes of carbonic anhydrase (I, II, IV, V) are known to be knotted. Apart from these two folds, only a few insular knots have been reported [5,6,10,11], some of which were derived from incomplete structures [6,11]. For the most part, knotted proteins contain simple trefoil knots (3₁) that can be represented by three essential crossings in a projection onto a plane (see Figure 1, left). Only three proteins were identified with four projected crossings (4₁, Figure 1, middle).
Figure 1

Examples of the Three Different Types of Knots Found in Proteins

Colors change continuously from red (first residue) to blue (last residue). A reduced representation of the structure, based on the algorithm described in [1,6,36], is shown in the lower row.

(Left) The trefoil knot (3₁) in the YBEA methyltransferase from E. coli (pdb code 1ns5; unpublished data) reveals three essential crossings in a projection onto a plane.

(Middle) The figure-eight knot (4₁) in the Class II ketol-acid reductoisomerase from Spinacia oleracea (pdb code 1yve [26]) features four crossings. (Only the knotted section of the protein is shown.)

(Right) The knot 5₂ in ubiquitin hydrolase UCH-L3 (pdb code 1xd3 [18]) reveals five crossings. Pictures were generated with Visual Molecular Dynamics (http://www.ks.uiuc.edu/Research/vmd) [43].

In this report we provide the first comprehensive review of knots in proteins, which considers all entries in the Protein Data Bank (http://www.pdb.org) [12], and not just a subset. This allows us to examine knots in homologous proteins. Our analysis reveals several new knots, all in enzymes. In particular, we discovered the most complicated knot found to date (5₂) in human ubiquitin hydrolase (Figure 1, right), and suggest that its entangled topology protects it against being pulled into the proteasome. We also noticed that knots are usually preserved among structural homologues. Sequence similarity appears to be a strong indicator for the preservation of topology, although differences between knotted and unknotted structures are sometimes subtle. Interestingly, we have also identified a novel knot in a transcarbamylase that is not present in homologues of known structure. We show that the presence of this knot alters the functionality of the protein, and suggest how the knot may have been created in the first place.

Mathematically, knots are rigorously defined only in closed loops [13]. Fortunately, both the N- and C-termini of open proteins are typically accessible from the surface and can be connected unambiguously: we reduce the protein to its Cα backbone, and draw two lines outward starting at the termini in the direction of the connection line between the center of mass of the backbone and the respective ends [5]. The lines are joined by a big loop, and the structure is topologically classified by the determination of its Alexander polynomial [1,13]. Applying this method to the Protein Data Bank in the version of January 3, 2006, we found 273 knotted structures in the 32,853 entries that contain proteins (Table S1). Knots formed by disulfide [14,15] or hydrogen bonds [7] were not included in the study.
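The closure and classification steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `close_backbone` and `knot_determinant`, the parameters `scale` and `n_arc`, and the use of NumPy are all choices made for this sketch. In place of the full Alexander polynomial Δ(t), the sketch computes only the knot determinant |Δ(−1)| from a projection of the closed curve onto the xy-plane. For reference, Δ(t) is t² − t + 1 for 3₁, t² − 3t + 1 for 4₁, and 2t² − 3t + 2 for 5₂, which gives determinants 3, 5, and 7; the unknot gives 1.

```python
import bisect
import numpy as np


def close_backbone(ca, scale=3.0, n_arc=8):
    """Close an open C-alpha trace (N x 3, N- to C-terminus) into a loop.

    Hypothetical helper: `scale` and `n_arc` are illustrative parameters.
    """
    ca = np.asarray(ca, dtype=float)
    center = ca.mean(axis=0)  # center of mass, equal weights assumed
    radius = scale * np.linalg.norm(ca - center, axis=1).max()
    # Unit vectors from the center of mass through each terminus.
    d_n = (ca[0] - center) / np.linalg.norm(ca[0] - center)
    d_c = (ca[-1] - center) / np.linalg.norm(ca[-1] - center)
    # "Big loop": points on a sphere, running from the C-side to the N-side.
    w = np.linspace(0.0, 1.0, n_arc)[:, None]
    arc = (1.0 - w) * d_c + w * d_n
    arc /= np.linalg.norm(arc, axis=1, keepdims=True)  # fails if d_c == -d_n
    return np.vstack([ca, center + radius * arc])


def _cross2(u, v):
    """z-component of the cross product of the xy-parts of u and v."""
    return u[0] * v[1] - u[1] * v[0]


def knot_determinant(points):
    """|Delta(-1)| of a closed polygon, from its projection onto the xy-plane.

    Assumes a generic projection (no crossing falls exactly on a vertex).
    """
    p = np.asarray(points, dtype=float)
    n = len(p)
    crossings = []  # (position along curve of under-strand, of over-strand)
    for i in range(n):
        a, r = p[i], p[(i + 1) % n] - p[i]
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # these two segments share a vertex
            c, q = p[j], p[(j + 1) % n] - p[j]
            denom = _cross2(r, q)
            if abs(denom) < 1e-12:
                continue  # parallel in projection
            s = _cross2(c - a, q) / denom
            t = _cross2(c - a, r) / denom
            if 0.0 < s < 1.0 and 0.0 < t < 1.0:
                z_i, z_j = a[2] + s * r[2], c[2] + t * q[2]
                pair = (i + s, j + t) if z_i < z_j else (j + t, i + s)
                crossings.append(pair)
    k = len(crossings)
    if k < 2:
        return 1  # zero or one crossing is always the unknot
    crossings.sort()  # order crossings by where the under-strand passes
    unders = [u for u, _ in crossings]
    m = np.zeros((k, k))
    for row, (_, over) in enumerate(crossings):
        # Arc `row` starts at under-crossing `row` and ends at the next one.
        over_arc = (bisect.bisect_right(unders, over) - 1) % k
        m[row, over_arc] += 2
        m[row, (row - 1) % k] -= 1  # arc entering the crossing from below
        m[row, row] -= 1            # arc leaving the crossing
    return int(round(abs(np.linalg.det(m[:-1, :-1]))))
```

The determinant is a weaker invariant than the polynomial: 4₁ and 5₁ both have determinant 5, so a full classification along the lines described in the text needs Δ(t) itself. The spherical arc in `close_backbone` is one possible reading of "joined by a big loop"; any path that stays far outside the protein would serve the same purpose.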

Results

For further analysis, we considered 36 proteins that contain knots as defined by rather stringent criteria discussed in the Materials and Methods section. These proteins can be classified into six distinct families (Table 1). Four of these families incorporate a deeply knotted section, which persists when 25 amino acids are cut off from either terminus. Interestingly, all knotted proteins thus identified are enzymes. Our investigation affirms that all members of the carbonic anhydrase fold (including the previously undetermined isozymes III, VII, and XIV) are knotted. In addition, we identify a novel trefoil in two bacterial transcarbamylase-like proteins (AOTCase in Xanthomonas campestris and SOTCase in Bacteroides fragilis) [16,17].
Table 1

List of Knotted PDB Entries (January 2006)


UCH-L3—The most complex protein knot.

One of our most intriguing discoveries is a fairly intricate knot with five projected crossings (5₂) in ubiquitin hydrolase (UCH-L3 [18]; see Figure 1, right). This knot is the first of its kind and, apart from carbonic anhydrases, the only one identified in a human protein. Human UCH-L3 also has a yeast homologue [6,19] with a sequence identity of 32% [20]. In the yeast structure, amino acids 63 to 77 are unstructured; if we bridge the unstructured region with the arc that is present in the human structure, we obtain the same knot with five crossings.

What may be the function of this knot? In eukaryotes, proteins are labeled for degradation by ubiquitin conjugation. UCH-L3 performs deconjugation of ubiquitin, thus rescuing proteins from degradation. The close association of the enzyme with ubiquitin should make it a prime target for degradation at the proteasome. We suggest that the knotted structure of UCH-L3 makes it resistant to degradation. In fact, the first step of protein degradation was shown to be ATP-dependent protein unfolding by threading through a narrow pore (~13 Å in diameter) of a proteasome [21,22]. Such threading into the degradation chamber depends on how easily a protein unfolds, with more stable proteins being released back into solution [23] and unstable ones being degraded. If ATP-dependent unfolding proceeds by pulling the C-terminus into a narrow pore [21], then a knot can sterically preclude such translocation, hence preventing protein unfolding and degradation. While the archaebacterial proteasome PAN was shown to process proteins from the C- to the N-terminus [21], it cannot be ruled out that some eukaryotic proteasomes process proteins in the N- to C-direction, thus requiring protection of both termini. Unfolding of a knotted protein by pulling may require a long time for global unfolding and untangling of the knot. Unknotted proteins, in contrast, have been shown to become unstable if a few residues are removed from their termini [24], suggesting that threading a few (5–10) residues into a proteasomal pore would be sufficient to unravel an unknotted structure.

At both termini, UCH-L3 contains loops entangled into the knot, protecting both ends against unfolding if pulled. It should also be noted that both N- and C-termini are stabilized by a number of hydrophobic interactions with the rest of the protein. The C-terminus is particularly stable: residues 223 to 229 are hydrophobic and form numerous contacts at 5 Å with the rest of the structure. We would like to stress that this hypothesis needs to be tested by experiments.

Different proteins may also provide different levels of protection against degradation, depending on structural details, the depth of the knot, and its complexity. Recently, a knot in the red/far-red light photoreceptor phytochrome A in Deinococcus radiodurans was identified [11] (see Materials and Methods). Although sequence similarity suggests that the knot may also be present in plant homologues, we cannot be certain. In plants, the red-absorbing form is rather stable (half-life of 1 wk), but the far-red–absorbing form is degraded upon photoconversion by the proteasome with a half-life of 1–2 h in seedlings (and somewhat longer in adult plants) [25].

Evolutionary aspects.

As expected, homologous structures tend to retain topological features. The trefoil knot in carbonic anhydrase can be found in isozymes ranging from bacteria and algae to humans (Table 1). Class II ketol-acid reductoisomerase comprises a figure-eight knot present in Escherichia coli [10] and spinach [26] (see Figure 1, middle), and S-adenosylmethionine synthetase contains a deep trefoil knot in E. coli [5,27] and rat [28]. It appears that particular knots have indeed been preserved throughout evolution, which suggests a crucial role for knots in protein enzymatic activity and binding.

UCH-L3 in human and yeast share only 33% [29] of their sequences, but contain the same five-crossing knot (5₂) as far as we can tell from the incomplete structure in yeast. Not only is it likely that all species in between have the same knot; the link between sequence and structure may also be used to predict candidates for knots among isozymes or related proteins for which the structure is unknown. For example, UCH-L4 in mouse has 96% sequence identity with human UCH-L3. The similarity with UCH-L6 in chicken is 86%, and with UCH-L1 about 55%. Indeed, a reexamination of the most recent Protein Data Bank entries revealed that UCH-L1 contains the same 5₂ knot as UCH-L3. (See the Update section; the structure was not yet part of the January Protein Data Bank release on which this paper is based.) Unfortunately, the method is not foolproof because differences between knotted and unknotted structures are sometimes subtle. As we will demonstrate in the next paragraph, a more reliable estimate has to consider the conservation of major elements of the knot, like loops and threads.

AOTCase—How a protein knot can alter enzymatic activity.

Somewhat surprisingly, we also identified a pair of homologues for which topology is not preserved. N-acetylornithine transcarbamylase (AOTCase [17]) is essential for arginine biosynthesis in several major pathogens. In other bacteria, animals, and humans, a homologous enzyme (OTCase) processes L-ornithine instead [30]. Both proteins have two active sites. The first one binds carbamyl phosphate to the enzyme. The second site binds acetylornithine in AOTCases and L-ornithine in OTCases, enabling a reaction with carbamyl phosphate to form acetylcitrulline or citrulline, respectively [17,31]. AOTCase in X. campestris has 41% sequence identity with OTCase from Pyrococcus furiosus [32] and 29% with human OTCase [31]. As demonstrated in Figure 2, AOTCase contains a deep trefoil knot which is not present in OTCase (Figure 2, right) and which modifies the second active site. The knot consists of a rigid proline-rich loop (residues 178–185), through which residues 252 to 256 are threaded and affixed. As elaborated in [17], the reaction product N-acetylcitrulline strongly interacts with the loop and with Lys252. Access to subsequent residues is, however, restricted by the knot. L-norvaline in Figure 2 (right) is very similar to L-ornithine but lacks the N-ε atom of the latter, which prevents a reaction with carbamyl phosphate. As the knot is not present in OTCase, the ligand has complete access to the dangling residues 263–268 and strongly interacts with them [31]. This leads to a rotation of the carboxyl group by roughly 110° around the Cα–Cβ bond [17].
Figure 2

Structures of Transcarbamylase from X. campestris with a Trefoil Knot and from Human without a Knot

(Left) Knotted section (residues 171–278) of N-acetylornithine transcarbamylase from X. campestris with reaction product N-acetylcitrulline (pdb code 1yh1 [17]) and interacting side chains.

(Right) Corresponding (unknotted) section (residues 189–286) in human ornithine transcarbamylase (pdb code 1c9y [31]) with inhibitor L-norvaline and carbamyl phosphate. Colors change continuously from red (first residue in the section) to blue (last residue in the section). The two proteins have an overall sequence identity of 29% [41]. Pictures were generated with VMD [43].

This example demonstrates how the presence of a knot can modify active sites and alter the enzymatic activity of a protein, in this case from processing L-ornithine to processing N-acetyl-L-ornithine. It is also easy to imagine how this alteration happened: a short insertion extends the loop and modifies the folding pathway of the protein.

Discussion

Nature appears to disfavour entanglements, and evolution has developed mechanisms to avoid knots. Human DNA wraps around histone proteins, and the rigidity of DNA allows it to form a spool when it is fed into a viral capsid. One end also stays in the loading channel and prevents subsequent equilibration [33]. Knotted proteins are rare, although the reason is far less well understood. Can the absence of entanglement be explained in terms of particular statistical ensembles, or is there an evolutionary bias? And how do these structures actually fold? Knots are ubiquitous in globular homopolymers [1-3,8], but rare in coil-like phases [1,34-36]. It is likely that even a flexible polymer will at least initially remain unknotted after a collapse from a swollen state. In proteins, the free energy landscape is considerably more complex, which may allow most proteins to stay unknotted. The secondary structure and the stiffness of the protein backbone may shift the length scale at which knots typically appear, too [8]. If knotted proteins are in fact more difficult to degrade, it might also be disadvantageous for most proteins to be knotted in the first place. Unfortunately, few experimental papers address folding and biophysical aspects of knots in proteins. In recent work [37], Jackson and Mallam reversibly unfolded and folded a knotted methyltransferase in vitro, indicating that chaperones are not a necessary prerequisite. In a subsequent study [38], the authors provide an extensive kinetic analysis of the folding pathway. In conclusion, we would like to express our hope that this report will inspire more experiments in this small but nevertheless fascinating field.

Materials and Methods

To determine whether a structure is knotted, we reduce the protein to its backbone, and draw two lines outward starting at the termini in the direction of the connection line between the center of mass of the backbone and the respective ends. These two lines are joined by a big loop, and the structure is classified by the determination of its Alexander polynomial [1,13].

To determine the size of the knotted core, we successively delete amino acids from the N-terminus until the protein becomes unknotted [1,6]. The procedure is repeated at the C-terminus starting with the last structure that contained the original knot. For each deletion, the outward-pointing line through the new terminus is parallel to the respective line computed for the full structure. The size thus determined should, however, only be regarded as a guideline. A better estimate can be achieved by looking at the structure.

In Table 1 we include knotted structures with no missing amino acids in the center of the protein. (A list of potentially knotted structures with missing amino acids can be found in Table S3.) Technically, the numbering of the residues in the mmCIF file has to be consecutive, and no two consecutive amino acids are allowed to be more than 6 Å apart. In addition, the knot has to persist when two amino acids are cut from either terminus. We have further excluded structures for which unknotted counterexamples exist (e.g., only one nuclear magnetic resonance structure among many is knotted or another structure of the same protein is unknotted). If a structure is fragmented, the knot has to appear in one fragment and in the resulting structure obtained from connecting missing sections by straight lines. Other knotted structures are only considered when at least one additional member of the same structural family [9] contains a knot according to the criteria above.

The enforcement of these rules leads to the exclusion of the bluetongue virus core protein [6] (4₁) and photoreceptor phytochrome A in D. radiodurans [11] (3₁), which have been previously identified as being knotted. Both structures are fragmented and become knotted only when a few missing fragments are connected by straight lines. In the viral core protein, the dangling C-terminus threads through a loose loop and becomes knotted in one out of two cases. On the other hand, the photoreceptor phytochrome A appears to contain a true knot. Notably, our analysis suggests that the thus connected structure of phytochrome A contains a figure-eight knot instead of a trefoil as reported in [11]. Moreover, we excluded a structure from the Autographa californica nuclear polyhedrosis virus, which contains a knot according to our criteria. However, the N-terminus is buried inside the protein and the knot only exists because of our specific connection to the outside.

To further validate our criteria, we implemented an alternative method [4,8,39] that relies on the statistical analysis of multiple random closures. We chose two points at random on a sphere (which has to be larger than the protein) and connected each with one terminus. The two points can be joined unambiguously, and the resulting loop was analyzed by calculating the Alexander polynomial. We repeated the procedure 1,000 times, and defined the knot as the majority type. Applying this analysis, we discovered 241 knotted structures in the Protein Data Bank. All 241 structures are also present in the 273 structures (Table S1) that were identified by our method, and the knot type is the same. The missing 32 structures (Table S2) are mostly shallow knots and were already rejected according to our extended criteria. The random closure also correctly discards rare structures with buried termini. In conclusion, the method used in this paper is considerably faster but requires a slightly increased inspection effort. Our observations agree with [8], which provides an extensive comparison of closures applied to proteins.

A complete listing of knotted Protein Data Bank structures is given in the Supporting Information.
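Three of the steps described in this section (the chain-continuity filter, the trimming procedure for the knotted core, and the majority vote over random closures) can be sketched as follows. This is a hedged illustration, not the authors' code. All function and parameter names are hypothetical, and the knot classifier is passed in as a callable (`is_knotted` or `classify`) that the reader must supply. The sketch also assumes that the 6 Å limit applies to consecutive Cα atoms, and it does not reproduce the detail that the outward lines are kept parallel to those of the full structure during trimming; that would have to live inside the supplied predicate.

```python
from collections import Counter

import numpy as np


def chain_is_continuous(res_ids, ca, max_gap=6.0):
    """True if residue numbers are consecutive and no C-alpha step exceeds max_gap (in Å)."""
    res_ids = np.asarray(res_ids)
    ca = np.asarray(ca, dtype=float)
    numbering_ok = np.all(np.diff(res_ids) == 1)
    steps = np.linalg.norm(np.diff(ca, axis=0), axis=1)
    return bool(numbering_ok and np.all(steps <= max_gap))


def knotted_core(ca, is_knotted):
    """Trim from the N-terminus, then the C-terminus, while the knot persists.

    Returns inclusive (start, end) indices, or None if the chain is unknotted.
    `is_knotted` is a user-supplied predicate on an array of coordinates.
    """
    ca = np.asarray(ca, dtype=float)
    if not is_knotted(ca):
        return None
    start, end = 0, len(ca) - 1
    while start < end and is_knotted(ca[start + 1:end + 1]):
        start += 1  # removing this N-terminal residue keeps the knot
    while end > start and is_knotted(ca[start:end]):
        end -= 1    # removing this C-terminal residue keeps the knot
    return start, end


def random_closure_vote(ca, classify, n_trials=1000, scale=2.0, n_arc=8, seed=0):
    """Majority knot type over random closures on a sphere around the chain.

    `classify` maps a closed polygon (M x 3 array) to a hashable knot label.
    """
    rng = np.random.default_rng(seed)
    ca = np.asarray(ca, dtype=float)
    center = ca.mean(axis=0)
    radius = scale * np.linalg.norm(ca - center, axis=1).max()
    w = np.linspace(0.0, 1.0, n_arc)[:, None]
    votes = Counter()
    for _ in range(n_trials):
        # Two uniformly random directions: d[0] for the N-terminus, d[1] for the C-terminus.
        d = rng.normal(size=(2, 3))
        d /= np.linalg.norm(d, axis=1, keepdims=True)
        # Join the two sphere points along the sphere, from the C-side to the N-side.
        arc = (1.0 - w) * d[1] + w * d[0]
        arc /= np.linalg.norm(arc, axis=1, keepdims=True)
        votes[classify(np.vstack([ca, center + radius * arc]))] += 1
    return votes.most_common(1)[0][0]
```

Passing the classifier as an argument keeps the geometry separate from the topology: the same three functions work with an Alexander-polynomial routine or with any other knot invariant the reader prefers.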

Update.

Recently, the structure of human UCH-L1 was solved and released [40]. The protein shares 55% sequence identity with UCH-L3 [41], and it contains the same five-crossing knot (5₂). UCH-L1 is highly abundant in the brain, comprising up to 2% of the total brain protein [42]. The structure of UCH-L1 was not yet part of the January Protein Data Bank edition on which the rest of this study is based. We also noticed several new structures of knotted transcarbamylase-like proteins.

Supporting Information

Table S1. List of Knotted Protein Data Bank Entries (79 KB DOC)

Table S2. List of Knotted Entries from Table S1 That Become Unknotted When Ends Are Connected by the Random Closure Method (28 KB DOC)

Table S3. List of Structures That Become Knotted When Missing Sections Are Joined by Straight Lines (35 KB DOC)

Accession Numbers

The Protein Data Bank (http://www.pdb.org) accession numbers for the structures discussed in this paper are human UCH-L3 (1xd3), UCH-L3 yeast homologue (1cmx), human UCH-L1 (2etl), photoreceptor phytochrome A in D. radiodurans (1ztu), class II ketol-acid reductoisomerase in E. coli (1yrl), class II ketol-acid reductoisomerase in spinach (1yve), S-adenosylmethionine synthetase in E. coli (1fug), S-adenosylmethionine synthetase in rat (1qm4), AOTCase from X. campestris (1yh1), SOTCase from B. fragilis (1js1), OTCase from P. furiosus (1a1s), OTCase from human (1c9y), bluetongue virus core protein (2btv), and baculovirus P35 protein in Autographa californica nuclear polyhedrosis virus (1p35).
References (10 of 33 listed)

1.  The Protein Data Bank.

Authors:  H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

2.  Structural basis for the specificity of ubiquitin C-terminal hydrolases.

Authors:  S C Johnston; S M Riddle; R E Cohen; C P Hill
Journal:  EMBO J       Date:  1999-07-15       Impact factor: 11.598

3.  A deeply knotted protein structure and how it might fold.

Authors:  W R Taylor
Journal:  Nature       Date:  2000-08-24       Impact factor: 49.962

4.  Crystal structure of N-acetylornithine transcarbamylase from Xanthomonas campestris: a novel enzyme in a new arginine biosynthetic pathway found in several eubacteria.

Authors:  Dashuang Shi; Hiroki Morizono; Xiaolin Yu; Lauren Roth; Ljubica Caldovic; Norma M Allewell; Michael H Malamy; Mendel Tuchman
Journal:  J Biol Chem       Date:  2005-02-24       Impact factor: 5.157

5.  A light-sensing knot revealed by the structure of the chromophore-binding domain of phytochrome.

Authors:  Jeremiah R Wagner; Joseph S Brunzelle; Katrina T Forest; Richard D Vierstra
Journal:  Nature       Date:  2005-11-17       Impact factor: 49.962

6.  Partitioning between unfolding and release of native domains during ClpXP degradation determines substrate selectivity and partial processing.

Authors:  Jon A Kenniston; Tania A Baker; Robert T Sauer
Journal:  Proc Natl Acad Sci U S A       Date:  2005-01-25       Impact factor: 11.205

7.  Probing nature's knots: the folding pathway of a knotted homodimeric protein.

Authors:  Anna L Mallam; Sophie E Jackson
Journal:  J Mol Biol       Date:  2006-05-02       Impact factor: 5.469

8.  Structural basis for conformational plasticity of the Parkinson's disease-associated ubiquitin hydrolase UCH-L1.

Authors:  Chittaranjan Das; Quyen Q Hoang; Cheryl A Kreinbring; Sarah J Luchansky; Robin K Meray; Soumya S Ray; Peter T Lansbury; Dagmar Ringe; Gregory A Petsko
Journal:  Proc Natl Acad Sci U S A       Date:  2006-03-13       Impact factor: 11.205

9.  Exploring the folding funnel of a polypeptide chain by biophysical studies on protein fragments.

Authors:  J L Neira; A R Fersht
Journal:  J Mol Biol       Date:  1999-01-22       Impact factor: 5.469

10.  Statistics of knots, geometry of conformations, and evolution of proteins.

Authors:  Rhonald C Lua; Alexander Y Grosberg
Journal:  PLoS Comput Biol       Date:  2006-05-19       Impact factor: 4.475
