Lindsay M McMorran1, David J Brockwell1, Sheena E Radford2. 1. Astbury Centre for Structural Molecular Biology, University of Leeds, Leeds LS2 9JT, UK; School of Molecular and Cellular Biology, University of Leeds, Leeds LS2 9JT, UK. 2. Astbury Centre for Structural Molecular Biology, University of Leeds, Leeds LS2 9JT, UK; School of Molecular and Cellular Biology, University of Leeds, Leeds LS2 9JT, UK. Electronic address: s.e.radford@leeds.ac.uk.
Abstract
Research into the mechanisms by which proteins fold into their native structures has been on-going since the work of Anfinsen in the 1960s. Since that time, the folding mechanisms of small, water-soluble proteins have been well characterised. By contrast, progress in understanding the biogenesis and folding mechanisms of integral membrane proteins has lagged significantly because of the need to create a membrane mimetic environment for folding studies in vitro and the difficulties in finding suitable conditions in which reversible folding can be achieved. Improved knowledge of the factors that promote membrane protein folding and disfavour aggregation now allows studies of folding into lipid bilayers in vitro to be performed. Consequently, mechanistic details and structural information about membrane protein folding are now emerging at an ever-increasing pace, using the panoply of methods developed for studies of the folding of water-soluble proteins. This review summarises current knowledge of the mechanisms of outer membrane protein biogenesis and folding into lipid bilayers in vivo and in vitro and discusses the experimental techniques utilised to gain this information. The emerging knowledge is beginning to allow comparisons to be made between the folding of membrane proteins and current understanding of the mechanisms of folding of water-soluble proteins.
The biological activity of many proteins is reliant on their ability to adopt a specific, three-dimensional structure. Failure to achieve this structure can have serious consequences, as evidenced by the prevalence of diseases caused by protein misfolding and aggregation [1], [2]. The information required for a polypeptide chain to attain its native structure was shown to be contained within its primary sequence by Anfinsen’s experiments on ribonuclease A [3], [4]. This small, globular protein was completely unfolded in chemical denaturants and shown to regain its native conformation spontaneously upon dialysis [3]. This seemingly simple conclusion has led to decades of scientific research to determine how the amino acid sequence of a protein confers its native structure.
If folding were to occur by random sampling of all possible conformations of the polypeptide chain, finding the native state would take an astronomically long time. Proteins, however, fold on biologically relevant timescales [5]. Levinthal suggested that this apparent paradox could be resolved if proteins fold via defined pathways [6]. Following this conclusion, several mechanisms were proposed to describe the pathways traversed by a protein en route to the native state. Analysis of the refolding kinetics of ribonuclease A revealed two distinct phases [7] and led to the suggestion of a “framework” mechanism of folding whereby secondary structural elements of proteins are formed prior to their docking to form the tertiary structure [8], [9], [10]. Further investigation of the refolding of ribonuclease A revealed that one of the phases observed was not, in fact, due to the presence of an observable folding intermediate, but arose as a consequence of proline cis–trans isomerisation [11].
This realisation, alongside the characterisation of the folding of chymotrypsin inhibitor 2 which revealed a simple two-state folding mechanism [12], [13], [14], saw the framework mechanism become disfavoured due to its implication that folding intermediates should be present [10], [15]. To explain folding in the absence of detectable intermediates, the nucleation-condensation mechanism was proposed by Fersht and co-workers [12]. This mechanism involves the formation of a small nucleus of structure stabilised by weak, possibly long-range, contacts and the subsequent rapid collapse around this folding nucleus to yield the native state [10], [12], [15]. A combination of simulation and experimental data on members of the homeodomain-like super-family of proteins revealed that the seemingly contrasting framework and nucleation-condensation folding models could be considered as two extremes of a single mechanism. In this unified model, the relative stability of the secondary and tertiary structure determines whether these elements are formed in series or in parallel [16] and, thus, two decades of conflicting evidence were resolved.
More recently it has been realised that proteins fold via a collection of parallel pathways which make up a funnel-shaped energy landscape (Fig. 1) [17], [18], [19], [20]. At the top of the funnel, the unfolded state represents a large ensemble of high-entropy conformations of the polypeptide chain. While unstructured, the polypeptide chain may be biased by weak, residual interactions which initiate folding [21]. Indeed, an unfolded variant of the bacterial immunity protein Im7 has been studied recently under non-denaturing conditions, revealing conformational restriction in the regions of the protein sequence which ultimately form the native helices, emphasising the importance of such interactions in the initiation of folding [22].
Similar conclusions have been drawn from other proteins and protein fragments under different denaturing conditions [23], [24], [25], [26], [27], [28]. As folding progresses, the polypeptide chain undergoes many reorganisations aiding the formation of stabilising interactions between side-chains, the protein backbone and the solvent as the protein approaches the native state [17]. The landscape view is an attractive one as it does not place restrictions on whether secondary structure must form before, or at the same time as, the tertiary structure. Additionally, the funnel-shaped landscape predicts the experimentally observed robustness of the folding process to destabilising mutations: if the final fold remains the most stable state relative to the unfolded ensemble, a mutation may block some of the pathways to the native state but alternative folding pathways can be utilised [17].
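As a rough illustration of the Levinthal argument made earlier in this section, the following Python sketch estimates how long an exhaustive, random search of conformational space would take. The chain length (100 residues), the number of conformations available to each residue (3) and the sampling rate (10^13 conformations per second) are illustrative assumptions chosen here, not values taken from Refs. [5], [6], and the function name is hypothetical.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def levinthal_search_time_years(n_residues, states_per_residue, samples_per_second):
    """Years needed to visit every chain conformation once, one at a time."""
    # Work in log10 so that very large conformation counts stay manageable.
    log10_conformations = n_residues * math.log10(states_per_residue)
    log10_seconds = log10_conformations - math.log10(samples_per_second)
    return 10 ** (log10_seconds - math.log10(SECONDS_PER_YEAR))


# Assumed, illustrative inputs: 100 residues, 3 states each, 1e13 samples/s.
years = levinthal_search_time_years(100, 3, 1e13)
print(f"Exhaustive search would take roughly {years:.1e} years")
```

Under these assumptions the estimate exceeds the age of the universe by many orders of magnitude, which is the sense in which a purely random search is incompatible with folding on biological timescales.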
Understanding the folding, stability and function of membrane proteins is an important area of research as these proteins represent 60% of current drug targets and have vital roles in the cell, including signalling, transport and biogenesis [45], [46]. In contrast with the wealth of information available about the folding of small, water-soluble proteins, the field of membrane protein folding has lagged significantly [47]. For water-soluble proteins, folding is driven by the need to bury hydrophobic side-chains in order to prevent aggregation and to facilitate the formation of stable structures [40]. In addition to the attainment of the native state, the folding of integral membrane proteins is complicated by the need to insert the polypeptide chain into a lipid membrane [47]. Following membrane insertion, most of the surface of an integral membrane protein is in contact with the membrane’s hydrophobic interior. Hydrophilic residues are either restricted to regions which contact the polar head groups of the membrane lipids or are exposed to the aqueous environment on either side of the membrane [47]. Additionally, the membrane environment in vivo is highly dynamic and heterogeneous with regions of varying lipid composition [48]. Recreating this environment in vitro has proved more difficult than the simple aqueous environment needed to fold water-soluble proteins and this, in part, has limited studies on the folding of integral membrane proteins [47], [48].
Classes of membrane proteins
The proteins present in biological membranes can be categorised into two families: the lipid-anchored proteins, which have a covalently-bound fatty-acid moiety through which a water-soluble protein is attached to a membrane, and the integral membrane proteins, which contain membrane-spanning regions. Only the folding mechanism of the latter will be described here. In contrast with lipid-associated proteins, the integral membrane proteins are constrained by the need to compensate for the energetic cost of burying peptide bonds in the lipid bilayer [49], estimated to be 1.2 kcal/mol per peptide bond [50]. As a consequence, it was predicted that membrane spanning regions would form regular secondary structural elements in order to maximise the hydrogen bonding potential of the peptide backbone [49]. Formation of secondary structure reduces the energetic cost of incorporation of peptide bonds into a bilayer by ≈0.4 kcal/mol per peptide bond for α-helical structure and ≈0.5 kcal/mol per peptide bond for β-sheet structure [50]. The first α-helical membrane protein structure was solved in 1975 by Henderson and Unwin using electron microscopy to generate a three-dimensional image of the purple membrane of Halobacterium salinarum [51]. The resulting 7 Å resolution image revealed the structure of bacteriorhodopsin (bR) to be a seven-helical transmembrane bundle [51]. The structural information available about bR has since been increased by structures at higher resolution obtained using electron microscopy (3 Å, [52]) and X-ray diffraction (1.43 Å, [53]). Since the structural elucidation of bR, a wide variety of helical transmembrane structures have been solved and deposited in the Protein Data Bank (PDB) [54]. These show a diverse range of size and function across the kingdoms of life. Some examples are depicted in Fig. 2a–c.
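The per-bond energetics quoted above can be turned into a simple worked example. The Python sketch below uses only the values given in the text (1.2 kcal/mol per buried peptide bond, reduced by ≈0.4 kcal/mol for α-helical and ≈0.5 kcal/mol for β-sheet structure [50]); the 20-bond segment length and the function name are illustrative assumptions, and side-chain contributions are deliberately ignored.

```python
# Values quoted in the text, in kcal/mol per peptide bond [50].
COST_PER_BOND = 1.2
REDUCTION_PER_BOND = {"none": 0.0, "helix": 0.4, "sheet": 0.5}


def backbone_burial_cost(n_peptide_bonds, structure="none"):
    """Approximate backbone cost (kcal/mol) of burying a segment in a bilayer."""
    # A KeyError is raised for an unknown structure label rather than guessing.
    per_bond = COST_PER_BOND - REDUCTION_PER_BOND[structure]
    return n_peptide_bonds * per_bond


# Hypothetical segment containing 20 peptide bonds.
for structure in ("none", "helix", "sheet"):
    cost = backbone_burial_cost(20, structure)
    print(f"{structure:>5}: {cost:.1f} kcal/mol")
```

Even with these simplifications, the calculation shows why regular secondary structure is expected in membrane-spanning regions: the saving accumulates with every peptide bond that is buried.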
Fig. 2
Example structures of integral membrane proteins. Structures of (a) the transmembrane segment of a glycophorin A monomer from human erythrocyte membranes solved by NMR spectroscopy (1AFO [228]); (b) bacteriorhodopsin, a seven-helical bundle from the purple membrane of Halobacterium salinarum (1C3W [229]); (c) calcium ATPase 1 from the sarcoplasmic reticulum membrane of Oryctolagus cuniculus, a ten-helical bundle with a large cytoplasmic domain (1IWO [230]); (d) PagP, an 8-stranded palmitoyl transferase enzyme from E. coli (1THQ [206]); (e) the 8-stranded transmembrane domain of OmpA, an ion channel from E. coli (1BXW [231]), with the C-terminal periplasmic domain (structure currently not determined) represented by a red circle; (f) the 10-stranded OM protease, OmpT, from E. coli (1I78 [232]); (g) the 12-stranded, colicin-secreting phospholipase A, OmpLA, from E. coli (1QD5 [233]); (h) the OmpF porin, a trimer comprised of three 16-stranded β-barrels, from E. coli (2ZFG [234]) and (i) the 24-stranded translocation domain of PapC from E. coli (3FIP [235]). Unless otherwise specified, all structures were solved using X-ray crystallography. Proteins are coloured rainbow: violet (N-terminus) to red (C-terminus). In (h), a single OmpF monomer is coloured, while the remaining monomers are shown in greyscale. The approximate position of the membrane is indicated in all images with grey shading. All images were generated from the PDB files using the accession numbers given in brackets using the UCSF Chimera molecular visualisation application [236].
In contrast with the ubiquitous distribution of α-helical transmembrane proteins, membrane proteins containing β-sheet secondary structure are found only in the outer membranes of Gram-negative bacteria, mitochondria and chloroplasts [55], [56].
In order to satisfy all of the possible hydrogen bonds in the peptide backbone, each β-strand associates laterally with its neighbours resulting in an overall cylindrical topology, known as a β-barrel [55]. With predominantly non-polar side chains exposed to the hydrophobic membrane interior and each of the backbone hydrogen bonds satisfied, the resulting β-barrel structures have high thermodynamic stability [50], [57]. The size of the β-barrel is highly variable, with known structures containing between 8 and 24 β-strands, and the proteins may contain periplasmic or extracellular domains [55], [56]. The regions between β-strands often alternate between tight turns on the periplasmic side of the membrane and longer, more flexible loops on the outer surface, which are exposed to the external environment (Fig. 2d–i) [55]. One almost entirely conserved structural feature across this family of protein structures is an even number of β-strands, with only one known exception to this rule discovered thus far: the 19-stranded human mitochondrial voltage-dependent anion-selective channel (hVDAC) [56], [58].
As more and more research is focused on the determination of membrane protein structure, a total of over 430 unique structures have now been reported in the membrane protein structure (mpstruc) database with both α-helical and β-barrel proteins represented [59]. Structure determination has been most successful using X-ray diffraction, although 105 of the reported structures were solved using nuclear magnetic resonance (NMR) spectroscopy and a further 14 structures have been solved using electron diffraction, demonstrating the utility of multiple techniques in the membrane protein structure determination toolbox [60], [61]. Almost 30 years after the first structure was reported, integral membrane proteins still account for only 1% of the protein structures in the PDB [62].
Significant progress is being made in this area, however, with 64 structures determined in 2012 and 50 in 2013 to date [59].
Biological membranes
Lipids in cells have three main functions: energy storage, signal transduction and forming the matrix of biological membranes – the approximately 30 Å thick layer which encloses the cell and organelles within eukaryotic cells [63]. Cellular membranes are composed of polar lipids, which self-associate into bilayers to shield the hydrophobic regions from the aqueous environment in a process driven entropically by water molecules [63]. The structure and composition of lipid bilayers varies greatly even within an organism, allowing the properties of different membranes to be tailored to a specific function. Modulation of lipid composition to adapt to different functional requirements implies the evolutionary advantage of an extensive and complex lipid repertoire [63]. In addition to varying lipid composition, membranes can have either a symmetrical or asymmetrical distribution of lipids between the two leaflets of the bilayer. Within a leaflet, favourable interactions between some of the lipid components can generate domains of specific lipid composition, known as lipid rafts, which are thought to be involved in localising membrane proteins [63]. The variable properties of biological membranes demonstrate the highly dynamic, heterogeneous and complex nature of the lipid environment in which integral membrane proteins must insert, fold and function.
The cell envelope of Gram-negative bacteria
A hallmark of Gram-negative bacteria, such as Escherichia coli, is the cell envelope, which is composed of two lipid membranes enclosing an aqueous compartment called the periplasm (Fig. 3a) [64], [65]. The inner membrane (IM) is a symmetric phospholipid bilayer composed of approximately 70% phosphatidylethanolamine (PE, Fig. 3d), 25% phosphatidylglycerol (PG, Fig. 3e) and 5% or less cardiolipin (Fig. 3f) and forms the barrier between the cytoplasm and the periplasm (Fig. 3a) [64], [65], [66]. There are two types of protein associated with the IM: lipoproteins, which undergo lipid modification of an N-terminal cysteine residue that anchors them to the periplasmic face of the IM, and the α-helical integral membrane proteins (reviewed in Ref. [65]). IM proteins are responsible for many cellular processes, including lipid synthesis and small molecule transport. This class of proteins is discussed at length elsewhere in this issue.
Fig. 3
The cell envelope of Gram-negative bacteria. (a) The cytoplasm of E. coli is surrounded by the inner membrane (IM), the periplasm and the outer membrane (OM). The IM is a symmetric phospholipid (shown in orange) bilayer containing integral α-helical membrane proteins. The OM is an asymmetric bilayer of phospholipid and lipopolysaccharide (LPS, shown in purple) and contains β-barrel integral membrane proteins. The periplasm is the aqueous compartment between the two membranes in which the peptidoglycan cell wall is found. Both membranes have associated lipoproteins on their periplasmic faces. (b) The lipid composition of the IM (light blue), inner leaflet of the OM (dark blue) and outer leaflet of the OM (white) in E. coli (percentages based on those reported in [64], [65], [66]). Structures of (c) LPS, (d) phosphatidylethanolamine, (e) phosphatidylglycerol and (f) cardiolipin are shown.
The periplasm is the compartment between the IM and the outer membrane (OM), which comprises around 10% of the total cell volume and contains soluble proteins, as well as a peptidoglycan cell wall. The peptidoglycan layer plays important roles in maintaining the shape of the cell and preventing lysis, while periplasmic proteins are involved in maintaining the integrity of the cell envelope. The processes which take place in the periplasm are independent of nucleotide hydrolysis, since no ATP is present in this compartment [65], [67]. When energy is required, the cell relies on complex coupling mechanisms which often use the proton-motive force across the inner membrane as the primary energy source [65].
The inner leaflet of the asymmetric OM is comprised of phospholipids and is similar in composition to the IM, although the PE content is enriched compared with the IM (Fig. 3b) [65], [66]. The outer leaflet of the OM consists of lipopolysaccharide (LPS), a glycolipid typically consisting of lipid A, a core oligosaccharide and an O-antigen (Fig. 3c) [64], [68].
The structure of the O-antigen is highly variable, even within a species, with approximately 170 variants being recorded in E. coli [69]. The high number of fatty acid chains on LPS compared with phospholipids, and the fact that these chains are saturated, facilitates tight packing of the LPS molecules in the OM [68]. This confers low fluidity to the membrane and it is this property that is responsible for the low permeability of the OM [68]. As a consequence, rapid diffusion of small, hydrophobic molecules across the OM is prevented and Gram-negative bacteria thus tend to be less susceptible to hydrophobic antibiotics than their Gram-positive counterparts [68]. As with the IM, integral membrane proteins and lipoproteins are associated with the OM, but in this membrane the integral outer membrane proteins (OMPs) usually have β-barrel structures [55], [65]. The barrel is formed by membrane-spanning β-strands which are held together by a lateral hydrogen bond network [55]. One reported exception, the polysaccharide translocon Wza of E. coli, has a barrel structure composed of laterally associated α-helices [70]. Several examples of OMP structures are shown in Fig. 2d–i.
OMP biogenesis in vivo
Periplasmic chaperones assisting OMP biogenesis
Following their synthesis in the cytosol, OMPs are targeted to the SecYEG translocon by the SecB chaperone, whereupon they are translocated across the IM through SecYEG in an unfolded state [71], [72]. The unfolded OMPs must be protected from aggregation and must be able to traverse the periplasm, including the peptidoglycan layer, and then correctly fold and insert into the OM [73]. These requirements suggest that transport across the periplasm and membrane insertion may be facilitated processes and, indeed, a number of periplasmic and OM-associated proteins have been implicated in the OMP assembly pathway [73]. These proteins can be roughly grouped into three categories: proteases; chaperones which stabilise unfolded and non-native conformations of their client proteins; and folding catalysts, which catalyse rate-limiting steps in folding (Fig. 4a and b) [73].
SurA was first identified in 1990 when it was shown to be required for the survival of E. coli in the stationary phase [75]. Initial characterisation described SurA as a parvulin-like peptidyl-prolyl isomerase (PPIase) involved in the proper assembly of major OMPs [67], [76], [77]. Trypsin digestion of OMPs in surA deletion mutants showed that not all OMPs have increased rates of proteolysis, leading to the conclusion that SurA is not an essential folding factor [76]. The amounts of FadL, LamB, OmpA, OmpC, OmpF, OmpX and LptD, however, were all found to be reduced in surA deletion strains [78]. Similar results were obtained using a proteomic analysis, which also revealed an upregulation of proteins under the control of the σE regulon in surA deletion strains [79].
Crystallisation of SurA revealed a four-domain protein with two PPIase domains (P1 and P2) sandwiched between the N- and C-terminal domains (Fig. 5a) [80]. PPIase domain P1 is packed against the core structure of the N- and C-terminal domains and does not show significant activity, while the more active P2 domain extends away from the core structure [77], [80], [81]. The PPIase activity of P2 has been shown to be increased in the presence of the adjacent chaperone domain, presumably as this domain facilitates substrate binding close to the active site of P2 [82]. Deletion of both PPIase domains, however, did not cause a significant loss of SurA function in vivo and the isolated PPIase domains failed to complement activity in surA deletion mutants [81]. This led to the conclusion that SurA functions mainly as a chaperone [81]. Interestingly, mutations which would be expected to cause a loss of PPIase function in the P1 domain, if this domain were active, destabilised SurA in vitro but increased chaperone activity in vivo [83]. This result suggests a regulatory function of the P1 domain, explaining its lack of significant PPIase activity [83].
Although Skp had previously been suggested to be a DNA-binding protein, an OMP and an LPS binding protein, it is now known that Skp is a periplasmic protein, as evidenced by the N-terminal signal sequence which targets Skp for translocation across the inner membrane [73]. Depletion of Skp in vivo led to a moderate reduction of OmpC, OmpF, OmpA and LamB in the OM fraction [95]. These cells, however, were still viable, demonstrating that Skp is not essential. Further genetic studies revealed that both the skp surA deletion mutant and the degP surA deletion mutant result in a synthetic lethal phenotype, leading to the prevailing hypothesis that the periplasmic chaperones Skp and DegP act on redundant pathways separate from SurA (Fig. 4a) [94]. Proteomic analysis of a skp deletion mutant suggested that none of the OMPs identified were affected significantly; however, depletion of SurA in the skp null strain reduced the levels of almost all OMPs, consistent with the hypothesis of parallel chaperone pathways [96]. While this hypothesis explains many of the observations above, the accumulation of protein aggregates in the periplasm of skp degP deletion mutants suggests that Skp may have a role in maintaining the solubility of at least some OMPs prior to folding [97]. Additionally, skp fkpA deletion mutants showed defects in the folding of LptD, an essential OMP involved in LPS synthesis [98]. Over-expression of SurA could not ameliorate these effects [98]. Together, these results suggest that the chaperone pathways in the periplasm may be inter-dependent.
The chaperone activity of Skp in OMP folding was confirmed recently by the finding that Skp binds with high affinity (Kd in the nM range) to unfolded OMPs [95], [99], [100]. Furthermore, Skp binding to OmpA was shown to occur via the transmembrane domain of OmpA [95]. NMR spectroscopy confirmed that the transmembrane domain of OmpA is Skp bound, while the periplasmic domain is free to fold independently in solution [100].
Interaction of unfolded OMPs with Skp is thought to occur early after translocation across the IM, as evidenced by the ability to cross-link Skp to the OMP PhoE at the periplasmic side of the IM in spheroplasts [101]. Skp has also been shown to be required for the release of newly-translocated OmpA from the periplasmic side of the IM in spheroplasts [97], suggesting that the N-terminal residues of the client protein may interact with Skp as they emerge from the translocon. This hypothesis was supported by Förster resonance energy transfer (FRET) experiments which showed that the N-terminal residues of OmpC enter the Skp cavity first [102]. Molecular dynamics (MD) simulations yielded similar results [102].
Skp is a trimeric protein in solution with a structure that has been described as “jellyfish-like” (Fig. 5b) [103], [104]. A β-barrel domain forms the body of the jellyfish, with long α-helical “tentacles” protruding from the β-barrel [103], [104]. Skp is a basic protein, with a theoretical isoelectric point (pI) around 9.5 [73]. The charge distribution of Skp is notable, with many basic residues clustered around the tips of the α-helical tentacles resulting in a large macrodipole moment [103], [104]. This unusual distribution of charges suggests a role of Skp in substrate delivery to negatively charged membranes. Additionally, a putative LPS-binding site was identified on the outer edge of the Skp α-helices, similar to an LPS-binding motif found on the OMP FhuA [104]. This finding led to the suggestion that Skp may act as a stress chaperone, acting to rescue OMPs which veer off the SurA chaperone pathway [105].
The α-helical domains of Skp define a central cavity enriched with hydrophobic residues, which has been shown to be involved in sequestering unfolded substrates to prevent their misfolding and aggregation [88], [90], [100], [102], [103], [104].
In contrast with SurA, no common binding motif has been identified for the interaction of Skp with its substrates [73], [90] and both electrostatic and hydrophobic interactions have been implicated as being important for binding [99], [102], [106]. In the absence of a specific binding motif, the high affinity interaction between Skp and its substrates has been shown to arise from the formation of a large number of simultaneous weak interactions between the chaperone and its substrate [90]. Despite a low sequence identity, Skp shares a striking structural resemblance to prefoldin [104], an ATP-independent, cytosolic chaperone in eukaryotes and archaea, which protects partially folded proteins from aggregation and passes clients to other chaperones to promote folding [103], [104]. The transient nature of the multitude of weak, local interactions between Skp and its substrates has also been hypothesised to facilitate handover of substrates to other chaperones [90], which could point to functional, as well as structural, similarities between these ATP-independent chaperones. Conformational analysis of Skp-bound OmpX revealed a highly flexible, but compact, ensemble which lacks secondary structure [90], in contrast with the molten globule-like conformations of substrates bound by ATP-dependent chaperones such as GroEL [107]. This difference in substrate binding provides further evidence that ATP-independent chaperones, such as Skp, rely on a thermodynamic gradient for substrate release and subsequent folding [90], [108]. In this model, the large free energy of folding of OMPs provides a “thermodynamic sink” that drives the release of these client proteins from Skp and enables their subsequent folding in the OM [108]. The ability of Skp to prevent aggregation of water-soluble proteins has also been shown using clients including lysozyme [104] and single chain antibodies [106], revealing that Skp is able to chaperone soluble proteins as well as OMPs.
Skp may, therefore, act as a universal chaperone assisting the folding of both soluble proteins and OMPs in the periplasm.
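To give a sense of the energy scale implied by the nanomolar affinity of Skp for unfolded OMPs described above, a dissociation constant can be converted into a standard binding free energy using ΔG° = RT ln Kd (1 M standard state). The Python sketch below is a minimal illustration of this textbook relationship; the temperature of 298.15 K and the specific Kd values are assumptions chosen for illustration, not measurements from Refs. [95], [99], [100], and the function name is hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from Kd, assuming a 1 M standard state."""
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)


# Assumed, illustrative dissociation constants spanning the nM range.
for kd in (1e-9, 1e-8, 1e-7):
    dg = binding_free_energy(kd)
    print(f"Kd = {kd:.0e} M -> dG = {dg:.1f} kcal/mol")
```

In the thermodynamic sink model discussed above, it is the free energy of OMP folding into the membrane that is proposed to drive release from the chaperone [108]; the calculation simply indicates the magnitude of binding energy that a nanomolar Kd represents.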
Other periplasmic folding factors
While SurA and Skp are currently the most studied of the periplasmic folding factors, other proteins have been shown to participate in OMP biogenesis. Three other periplasmic PPIase enzymes are currently known: PpiA, PpiD and FkpA [64]. While PpiA has been shown to have the highest activity of these proteins in vitro, its deletion had no detectable effect on the assembly of OMPs in vivo and, as yet, no evidence has directly linked PpiA to OMP biogenesis [64], [73]. In contrast, deletion of the inner membrane-anchored PpiD (Fig. 5a) was reported to cause a reduction in the levels of OmpC, OmpF, OmpA and LamB in the OM of the mutant cells and a ppiD surA deletion mutant was reported to be lethal, implicating PpiD in OMP folding [64], [73]. Later studies did not replicate these findings, however, and the role of PpiD in OMP assembly remains an open question [64].
FkpA, like SurA, is an example of a dual PPIase-chaperone folding factor and fkpA null mutants have been shown to be viable, but to display increased OM permeability and an up-regulation of periplasmic proteases [67]. Structural studies revealed that FkpA has a C-terminal PPIase domain, appended to an N-terminal chaperone domain that mediates dimerisation to form a V-shaped cleft (Fig. 5c) [109]. It has been hypothesised that substrate binding occurs in this cleft [109], but no data are currently available to support this. Until recently, chaperone activity of FkpA had been reported only for soluble protein substrates and the involvement of FkpA in OMP biogenesis was not well supported [64], [109]. Creation of a skp fkpA deletion mutant by Schwalm and co-workers, however, showed that folding of LptD is compromised in this strain, providing the first evidence that FkpA may indeed chaperone OMPs in the periplasm [98].
Another example of a dual-function chaperone in the periplasm is the protease-chaperone DegP which, alongside DegS, belongs to the HtrA family of proteases [73].
Temperature changes were initially thought to cause the switch between the two activities of DegP, with the chaperone activity dominating at 28 °C and the protease activity becoming dominant at 42 °C [73]. Structure–function analysis of DegP concluded that the resting state of this protein is a hexamer, in which the interactions between subunits block the protease sites and the central cavity is large enough to accommodate unfolded substrates [110]. Similarly to DegS, DegP recognises the C-terminal residues of misfolded OMPs via its PDZ domains and this was thought to cause structural reorganisation to larger proteolytically-active oligomers comprised of 12 or 24 DegP monomers [78], [111], [112]. Cryo-electron microscopy of DegP in the presence of OmpC revealed a tetrahedral arrangement of DegP trimers with the inner cavity filled by a cylindrical area of electron density into which the structure of folded OmpC could be modelled [111]. This led to the conclusion that the fate of DegP-bound substrates lies in their ability to adopt their native structure within the cavity, as only unfolded substrates can be degraded [111]. Recent data using cage-deficient mutants of DegP, which can only associate into trimers, revealed that these variants are able to bind and degrade substrates without the need to form higher order oligomers, suggesting that cage formation in response to substrate binding may be linked solely to the chaperone function of this protein [113].
The most recently discovered periplasmic chaperone, Spy, was identified by its over-expression in bacterial strains expressing unstable periplasmic proteins [114]. In vitro characterisation of Spy revealed it to be an effective chaperone, suppressing aggregation and protecting substrates from inactivation by tannins [114]. Spy has a novel α-helical cradle structure, which is unlike that of any known chaperone (Fig. 5d) [114].
It is not currently known whether Spy is involved in OMP biogenesis, but the discovery of a previously unknown chaperone only two years ago highlights that there is much still to be learned about the chaperone network in the E. coli periplasm. In addition to the complexity evident in a single species, studies on the periplasmic chaperones of Neisseria meningitidis show that SurA deletion causes no detectable defects in OMP assembly, while deletion of Skp lowers the levels of some porins, but not all OMPs [115]. Complementation of the skp null mutant was not possible using E. coli Skp, suggesting that periplasmic chaperones may act in a species-specific manner, making the derivation of generic principles of periplasmic chaperoning difficult to achieve [115].
Insertion into the outer membrane
Following chaperone-assisted transport across the periplasm, unfolded OMPs must insert and fold into the OM, a process which is assisted by the BAM complex [116]. The BAM complex in E. coli (Fig. 6a) is comprised of the outer membrane protein BamA (previously YaeT, Fig. 6b) and four accessory lipoproteins, BamB (YfgL, Fig. 6c), BamC (NlpB, Fig. 6d), BamD (YfiO, Fig. 6d) and BamE (SmpA, Fig. 6e) [105], [116]. BamA is an essential protein, the depletion of which causes an accumulation of aggregated OMPs in the periplasm, leading to cell death [117]. BamA belongs to the conserved Omp85 super-family first identified in N. meningitidis [118]. Homologues are also found in mitochondria (Sam50) and chloroplasts (Toc75) [119]. All of the BamA homologues identified thus far have a C-terminal transmembrane β-barrel domain and a water-soluble N-terminal region [116]. The structure of the BamA β-barrel in two different organisms has been elucidated recently, revealing it to be 16-stranded with a conserved VRGY motif in loop L6 which is thought to be functionally important (Fig. 6b) [120].
Application of different biophysical methods to the study of OMP folding into lipid bilayers
In recent years, many techniques have been developed to study the folding of water-soluble proteins and together these have yielded a near-atomistic view of the folding landscapes of many such proteins [31]. By contrast, the experimental toolbox for the study of membrane protein folding is more limited [141]. Much of the information available about OMP folding into lipid bilayers has resulted from the use of techniques such as cold SDS–PAGE (where samples are not boiled prior to loading, resulting in the differential migration of the folded and unfolded conformations of the OMP [142]) and tryptophan fluorescence emission and far-UV circular dichroism (CD) spectroscopies on a limited subset of proteins, as summarised below and in Table 1. In spite of the challenges faced, the development of new methodologies and the application of existing methods to study OMP folding are now beginning to yield knowledge of OMP folding mechanisms both in the presence and absence of chaperones.
Table 1
Summary of selected in vitro folding studies of outer membrane proteins (OMPs) into lipid bilayers.
Protein | Notes | Techniques used | Selected References
OmpA | Folding studies of OmpA are described in this review and elsewhere. | Cold SDS–PAGE, far-UV CD, Trp fluorescence | Kleinschmidt (2006) [171]; Otzen (2013) [58]
PagP | Folding studies of a C-terminally his-tagged construct of PagP (HT PagP) and untagged construct (PagP) are described in detail in this review. | | Bishop et al. (2000) [239]; Ahn et al. (2004) [206]; Huysmans et al. (2010) [157]
hVDAC | Human voltage-dependent anion-selective channel (hVDAC) can be folded to the native state in LUVs composed of lipids of varying acyl chain length (diC10:0PC to diC18:1PC). Folding yield was estimated to be 94% in diC12:0PC LUVs. Secondary structure content of hVDAC in diC12:0PC LUVs was not affected by changing pH from 7.0 to 3.0. | Cold SDS–PAGE, far-UV CD, Trp fluorescence, sucrose density gradient centrifugation and proteolysis | Shanmugavadivu et al. (2007) [225]
FomA | FomA can be folded to the native state in both diC10:0PC and diC18:1PC (LUVs and SUVs). Kinetic analysis of folding into diC10:0PC and diC18:1PC SUVs suggested that FomA folds via parallel pathways into both lipids. The folding halftime is dependent on acyl chain length and reaction temperature. | Cold SDS–PAGE, Trp fluorescence, far-UV CD | Pocanschi et al. (2006) [202]
OmpG | The porin OmpG reconstituted in native E. coli lipids is gated by conformational changes in extracellular loops in a pH-dependent manner (closed at pH 5.0). Unfolding under force reveals each β-hairpin unfolds individually. Refolding from this mechanically unfolded state also proceeds by sequential folding of individual β-hairpins. | Atomic force microscopy (AFM) | Sapra et al. (2009) [240]; Damaghi et al. (2010) [241]; Mari et al. (2010) [242]; Damaghi et al. (2011) [243]
OmpF | Refolding of urea-solubilised OmpF into diC14:0PC SUVs occurred at only 15% yield. Refolding kinetics were biphasic but much slower than OmpA. | Cold SDS–PAGE, Trp fluorescence, far-UV CD | Surrey et al. (1996) [244]
As well as the methods mentioned above, the quenching of tryptophan fluorescence has been utilised successfully to follow the folding and membrane insertion of OmpA [143], [144]. In this technique, lipids which are brominated at different positions in the acyl chain are introduced into liposomes. The depth of membrane insertion of an OMP during a folding reaction is measured by following the kinetics of tryptophan fluorescence quenching by the bromine atoms within the bilayer [144]. Additionally, the use of mutagenesis to create OmpA variants with only a single tryptophan residue allowed the insertion of different regions of the OmpA β-barrel into the bilayer to be monitored. For example, varying the location of a single tryptophan residue can yield information on whether individual secondary structural elements are inserted sequentially or simultaneously [143]. Alongside other spectroscopic techniques, these experiments provided the first evidence of how an OMP folds, revealing that the folding and membrane insertion of OmpA is a concerted process [143], [144].
NMR studies on unfolded water-soluble proteins have provided evidence that the starting point of folding is not a random coil structure [22], [23], [28], [145], [146]. Indeed, residual structure in the unfolded state has been suggested to be important in initiating folding by facilitating the collapse of the polypeptide chain into conformations able to fold efficiently to the native state [22]. In contrast with helical IM proteins, OMPs can be denatured in urea or guanidine hydrochloride (GuHCl), enabling studies of their folding using classic Anfinsen-style experiments [43]. Analysis of the urea denatured state of OmpX has shown that the protein is globally unfolded, but has two regions of non-random structure: one a hydrophobic cluster and the other a helical region [147].
Analysis of peptides corresponding to these regions showed independent binding of the clusters to detergent micelles, suggesting a role of residual structure in the unfolded state in the initiation of OMP folding and membrane insertion [148]. The application of NMR spectroscopy to folded membrane proteins is complicated by the need to find a suitable mimic of the membrane environment. Detergent micelles are widely used, but problems with long term stability and the maintenance of proteins in a functional state are commonly encountered [149]. As a result, development of alternative non-micellar systems such as bicelles, nanodiscs and amphipols, to stabilise the folded state of membrane proteins has become an active research area [149], [150], [151] and is reviewed elsewhere in this issue.
Detailed information about the folding mechanisms of water-soluble proteins has been obtained by measuring the folding and unfolding kinetics of the protein of interest, using a spectroscopic probe to monitor the reaction time-course in the presence of varying concentrations of denaturant. The logarithm of the kinetic rate constants acquired from these data can be plotted against the denaturant concentration, giving a characteristic v-shaped chevron plot [152]. If the limbs of the chevron remain linear over denaturant concentrations from 0 M to the highest concentration used (highly denaturing) and the free energies associated with the folding and unfolding events equate to the equilibrium unfolding free energy, the data suggest that the protein under study probably folds via a two-state mechanism [13], [14]. Deviations from linearity, termed rollover, in either limb can occur and have been interpreted to indicate the presence of a folding intermediate [152], [153], movement of the transition state [154] or aggregation [155].
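The chevron analysis described above can be illustrated with a minimal Python sketch of the standard two-state model, in which ln(k_f) decreases and ln(k_u) increases linearly with denaturant concentration. The function names, the m-value convention (units of M^-1) and all numerical parameters are assumptions chosen for illustration; they are not taken from any of the studies cited in this review.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def chevron_ln_kobs(denaturant_m, kf_water, ku_water, m_f, m_u):
    """Return ln(k_obs) for a two-state folder at one denaturant concentration (M).

    Assumes ln(k_f) falls and ln(k_u) rises linearly with denaturant,
    which gives the two straight limbs of the chevron.
    """
    k_fold = kf_water * math.exp(-m_f * denaturant_m)
    k_unfold = ku_water * math.exp(m_u * denaturant_m)
    return math.log(k_fold + k_unfold)


def kinetic_unfolding_free_energy(kf_water, ku_water, temperature_k=298.15):
    """Return the kinetically derived unfolding free energy (kJ/mol), RT ln(k_f / k_u)."""
    return R_KJ * temperature_k * math.log(kf_water / ku_water)


def denaturation_midpoint(kf_water, ku_water, m_f, m_u):
    """Return the denaturant concentration (M) at which k_f equals k_u."""
    return math.log(kf_water / ku_water) / (m_f + m_u)


# Hypothetical parameters (not measured values for any protein).
params = dict(kf_water=100.0, ku_water=0.01, m_f=2.0, m_u=0.5)
for conc in (0.0, 2.0, 4.0, 6.0, 8.0):
    print(f"{conc:4.1f} M  ln(k_obs) = {chevron_ln_kobs(conc, **params):6.2f}")
print("midpoint (M):", round(denaturation_midpoint(**params), 2))
print("dG from kinetics (kJ/mol):", round(kinetic_unfolding_free_energy(100.0, 0.01), 1))
```

In practice, the free energy returned by `kinetic_unfolding_free_energy` would be compared with the value from equilibrium denaturation; agreement between the two, together with linear limbs, is the two-state criterion referred to in the text.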
As well as being an indicator of folding mechanism (alongside equilibrium folding analysis), chevron plot analysis also yields information about the compactness of the transition state (or any populated intermediates) and their position on the folding reaction coordinate [152]. This detailed kinetic analysis is routine in the study of water-soluble protein folding and its recent application to study the folding of OmpA [156] and PagP [157] will be discussed later in this review.
Protein engineering methods coupled to kinetic analysis have been the most successful tool to probe the folding mechanisms of membrane proteins and have been applied to the IM proteins bR [44], [158], [159] (Fig. 2b), DsbB [160] and the OMP, PagP (Fig. 2d) [157]. Φ-Value analysis is a powerful protein engineering approach, which can be used to map the formation of contacts in the transition states and populated intermediates formed during protein folding (Fig. 7) [161], [162], [163]. In this approach, specific side-chain interactions are deleted by mutation, and the effect on the kinetic and thermodynamic parameters of the variant protein is measured and compared with wild-type [163]. The amino acid substitution may cause a change in the free energy of activation (kinetic), the equilibrium (thermodynamic) free energy of the folding reaction, or both. The ratio of these changes is the Φ-value, which is usually between 0 and 1, and gives a measure of the change of stability of the transition state (or intermediate), compared with the change in stability of the native state [163]. Both stabilities are measured relative to the denatured state, whose free energy is assumed to be unaffected by the mutation [163]. Φ-Values close to zero indicate that the transition state (or intermediate) is unstructured in the region of the amino acid substitution, while higher Φ-values indicate that the region is structured in the transition state (or intermediate) [161], [162].
Partial Φ-values are also observed and can be interpreted in several ways. For example, partial Φ-values may result if native contacts are partially formed in the transition state (or folding intermediate) or if multiple folding routes exist [163]. Φ-Values have provided a wealth of information on the structure of partially folded intermediates and transition states of water soluble proteins, revealing detailed information about their folding mechanisms [163]. Furthermore, the use of experimental Φ-values as constraints in MD simulations allows atomistic models of these ensembles to be created [164]. The main stumbling block for the application of Φ-value analysis to OMPs is the need to find experimental conditions that enable reversible folding for the protein of interest, which has proved difficult for many OMPs [58], [156], [165].
Fig. 7
Schematic of the principles of Φ-value analysis. (a) A mutation (shown as a red dot) is made in a region of the protein which is native-like in the transition state (‡) leading to equal destabilisation of ‡ and the native state (N) resulting in a Φ-value of 1 or (b) a mutation is made in a region of the protein which is unfolded in ‡ but structured in the native state, leading to destabilisation of N only and a Φ-value of 0. It is assumed that the mutation does not affect the free energy of the unfolded ensemble (U). Abbreviations: ΔΔG°U–N refers to the difference in the free energy of folding upon mutation and ΔΔG°U–‡ refers to the difference in the free energy between U and ‡ upon mutation. This image was adapted with permission from G.H.M. Huysmans.
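The two limiting cases shown in Fig. 7 can be reproduced with a short Python sketch of the Φ-value calculation, taking Φ as the ratio ΔΔG°U–‡ / ΔΔG°U–N. The function names and the rate constants are hypothetical, and the `min_ddg` cutoff (below which Φ is not reported because the ratio becomes unreliable) is an assumed illustrative value rather than a threshold taken from the cited work.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def ddg_transition_state(kf_wild_type, kf_variant, temperature_k=298.15):
    """Return ddG(U-TS) in kJ/mol from folding rate constants: RT ln(kf_wt / kf_variant)."""
    return R_KJ * temperature_k * math.log(kf_wild_type / kf_variant)


def phi_value(ddg_u_ts, ddg_u_n, min_ddg=2.5):
    """Return phi = ddG(U-TS) / ddG(U-N), or None if ddG(U-N) is below the cutoff.

    min_ddg (kJ/mol) is an assumed cutoff: very small changes in native-state
    stability make the ratio too noisy to interpret.
    """
    if abs(ddg_u_n) < min_ddg:
        return None
    return ddg_u_ts / ddg_u_n


rt = R_KJ * 298.15
ddg_native = 5.0  # hypothetical destabilisation of N by the mutation, kJ/mol

# Case (a): the transition state is destabilised as much as N, so folding slows.
kf_variant_a = 10.0 * math.exp(-ddg_native / rt)
print("case (a) phi:", phi_value(ddg_transition_state(10.0, kf_variant_a), ddg_native))

# Case (b): the folding rate constant is unchanged, so only N is destabilised.
print("case (b) phi:", phi_value(ddg_transition_state(10.0, 10.0), ddg_native))
```

Here ΔΔG°U–N is supplied directly, as it would normally come from equilibrium denaturation of wild-type and variant proteins, while ΔΔG°U–‡ is derived from the ratio of folding rate constants.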
Folding studies of OMPs in vitro
OmpA
The study of OMP folding in vitro has been pioneered by work on the monomeric, 8-stranded ion channel OmpA (Fig. 2e) from E. coli. Initial experiments using a combination of cold SDS–PAGE and protease digestion were conducted by Schweizer et al. as early as 1978 [166], which demonstrated that OmpA can fold in the presence of LPS and Triton X-100 detergent. It was not until over a decade later that Surrey and Jähnig reported the folding of OmpA into lipid bilayers [167], opening the door to studies of membrane protein folding in a more native-like environment. OmpA, solubilised in 8 M urea, was shown by these authors to insert spontaneously into small unilamellar vesicles (SUVs) of diC14:0PC upon rapid dilution from urea [167]. Detailed kinetic studies were carried out by these, and other, authors on the refolding of urea-unfolded OmpA into liposomes using cold SDS–PAGE, far-UV CD and tryptophan fluorescence emission [143], [168], [169]. Kinetic measurements of the formation of secondary structure by far-UV CD and tertiary structure by tryptophan fluorescence emission revealed that these structural elements form concomitantly [170]. Additionally, the rate of quenching of single tryptophan mutants of OmpA by brominated lipids suggested that the 8 β-strands penetrate the membrane simultaneously [144]. Based on these data a scheme for the refolding pathway of OmpA was proposed, beginning with collapse of the protein in aqueous solution, followed by adsorption to the membrane surface and folding to the native state by progressive penetration deeper into the membrane as the β-barrel forms (Fig. 8) [143], [144], [168], [169]. These experiments provided the first evidence of the concerted nature of OMP folding and membrane insertion [143], [144], [168], [169], [171].
The study of homologous proteins has been utilised widely in the field of protein folding [183], [184]. Application of this approach to water-soluble proteins includes the homeodomain-like super-family [16], [185], α-spectrin domains [37], [184], [186], [187], [188], [189], bacterial immunity proteins [190], [191], [192], ribosomal S6 proteins [193], [194] and immunoglobulin-like domains [195], [196], [197], [198], [199]. These studies have yielded valuable insights into the role of protein topology, amino acid sequence and secondary structure propensity on folding mechanisms [183], [184]. Recently, Burgess et al. adopted a screening approach using cold SDS–PAGE analysis to determine the folding yield of nine different OMPs (OmpX, OmpW, OmpA, PagP, OmpT, OmpLA, FadL, Omp85 (BamA) and OmpF) under different conditions [200]. The effects of pH, temperature, vesicle size and lipid composition were investigated, revealing that, in general, folding yield was increased by using pH values of 8–10, lipids with shorter acyl chains and smaller vesicle diameter [200]. These results accord with previous work on the effects of membrane thickness and curvature on the folding of OmpA [170]. A single condition to maximise the folding yield of all nine OMPs could not, however, be identified [200].
High temperature has been reported to increase the folding yield of OmpA [169], [201], FomA [202] and OmpX [201], but increasing temperature had varying effects on the folding yield of the OMPs in the Burgess study [200]. This variation was proposed to arise from the different aggregation propensities of the OMPs studied at higher temperature [200]. The aggregation propensity of the nine OMPs in the urea-unfolded state was investigated using sedimentation velocity analytical ultra-centrifugation, revealing that self-association did not correlate well with folding efficiency [203].
Some trends were observed, however, showing that addition of salt (50–400 mM) and lower pH (values 6–7) increase self-association, while the presence of ⩾4 M urea kept all the OMPs in a monomeric state [203]. The relatively low aggregation propensity of OmpA has since been attributed to the presence of the periplasmic domain, which has been shown to fold independently of the β-barrel domain [204]. A thorough screen to establish conditions for optimal folding of OmpLA was then conducted using the results of previous studies as a guide [200], [203], revealing that OmpLA folds reversibly into diC12:0PC LUVs at pH 3.8, 37 °C [165]. These conditions also allowed reversible folding of PagP and OmpW, but not the other OMPs studied [108]. The difficulty in finding conditions suitable for the folding of different OMPs thus makes comparative studies of these homologous proteins challenging. It is therefore difficult to draw out similarities and differences in the folding mechanism of a single protein class in the context of OMP folding mechanisms.
PagP
PagP is an OMP from E. coli that forms an 8-stranded β-barrel with a 19-residue amphipathic α-helix at its N-terminus (Fig. 2d) [205], [206], [207]. The β-barrel is tilted by approximately 25° to the membrane normal [205], [206], [208], stabilised in this position by the interactions of the aromatic girdles with the membrane interfacial region [209]. PagP is a palmitoyl transferase enzyme, which transfers a palmitate chain from a phospholipid to hexa-acylated lipid A and thus helps to reinforce the structure of the outer membrane [210]. Folding studies on a C-terminally His-tagged construct of PagP (HT PagP) revealed that the protein folds into both detergent micelles and liposomes in vitro [211]. Unlike OmpA, a high concentration of urea (7 M) was required to solubilise the protein and prevent aggregation prior to insertion [143], [211]. Far-UV CD was used to follow the formation of both β-sheet structure and tertiary structure (via a Cotton effect between tyrosine-26 and tryptophan-66) [211], [212]. In accordance with previous results on OmpA, secondary and tertiary structure were found to form concomitantly during HT PagP folding into either cyclofos-7 micelles or diC12:0PC liposomes (SUVs and LUVs) [170]. Mutants of HT PagP were then created to investigate the role of the N-terminal α-helix (residues 1–19) in folding, since the possession of a periplasmic α-helix is an unusual feature of an OMP [211]. This study demonstrated that the helix increases the stability of folded HT PagP in liposomes, but this effect was not so pronounced in detergent, illustrating the importance of the lipid bilayer in the stability of the native state and the importance of developing membrane-like mimics for studies of OMP stability [211]. One mutant in which a conserved residue in the α-helix, tryptophan-17, was replaced with alanine, unfolded fifty times more rapidly than the wild-type [211]. Moreover, in a HT PagP helix deletion construct, mutation of arginine-59, located in the interfacial region of the β-barrel domain, to tryptophan restored the folding and unfolding kinetics in liposomes to rates similar to those of the wild-type protein, demonstrating the importance of the aromatic girdle in the folding and stability of PagP [213].
By systematically varying the protein concentration and studying the folding of HT PagP under a range of lipid-to-protein ratios (LPRs), conditions were established under which the HT PagP unfolding transition is completely reversible in diC12:0PC LUVs [157].
Equilibrium stability studies and kinetic chevron plot analysis of HT PagP (un)folding revealed that the protein folds via a two-state mechanism over the range of urea concentrations studied (7.8–10 M). A Φ-value analysis was then undertaken for HT PagP using point mutants of 19 residues spread throughout the protein structure [157]. These experiments provided the first insights into the structural features of a transition state for OMP folding, suggesting a polarised transition state in which the N-terminal half of the protein remains largely unstructured, whilst the C-terminal half of the protein is native-like (Fig. 9) [157]. Interestingly, two negative Φ-values were observed, providing evidence for stabilisation of the transition state by non-native interactions [157]. The resulting mechanism of tilted insertion is consistent with the concerted folding and insertion suggested for OmpA [143], [144], [170]. It remains to be seen whether this mechanism is observed for other OMPs.
Fig. 9
Φ-Value analysis of HT PagP. ΦF-values determined from kinetic analysis of HT PagP variants are mapped onto a ribbon diagram (left) and a topology model (right). Regions with ΦF-values close to 0 are shown in red, regions with ΦF-values close to 1 are shown in blue, intermediate ΦF-values are shown in purple, ΦF-values less than 0 are shown in orange and undetermined ΦF-values are grey. Reproduced with permission from [157].
Kinetic analysis of HT PagP (un)folding over a range of urea concentrations (7.8–10 M) revealed that folding into diC12:0PC LUVs is best described by a burst phase followed by a single exponential phase [157], [211], [214]. Lowering the lipid-to-protein ratio, the HT PagP concentration, or inclusion of diC12:0PE in the diC12:0PC LUVs, resulted in a burst phase followed by two exponential phases [214]. Interrupted refolding experiments, in which folding is allowed to proceed for varying lengths of time before unfolding is re-initiated by dilution into high urea concentrations, were also carried out. These experiments revealed that the rate of formation of the native state is best described by two rate constants indicative of parallel folding pathways [214], a feature also seen in the folding of some water-soluble proteins [145], [197], [215], [216], [217], [218]. Since the unfolding kinetics of HT PagP are well described by single exponential kinetics under all conditions tested, there was insufficient evidence for an alternative native-like state of HT PagP [214]. The slower pathway most likely arises from a second population of folding-competent HT PagP molecules in solution ready to adsorb onto the lipid upon exposure of free membrane surface [214].
More recently, kinetic analysis of an untagged variant of PagP has been undertaken in the presence of SurA and Skp [92]. This PagP construct, initially reported by Burgess et al.
[200], was chosen for folding assays in the presence of soluble chaperones as it had been reported to be folding competent in urea concentrations as low as 1 M, in stark contrast with the high urea concentrations required for efficient folding of HT PagP [200], [211]. Interestingly, however, folding of PagP is not fully reversible under conditions which promote reversible folding of HT PagP, again highlighting the difficulty in generating OMPs suitable for equilibrium denaturation studies [92]. Nonetheless, the folding kinetics of PagP into both zwitterionic and negatively charged liposomes were investigated in the presence and absence of Skp or SurA (Fig. 10) [92]. These experiments showed that membrane composition and ionic strength of the buffer strongly influence the effect that Skp has on PagP folding, suggesting that electrostatic interactions play an important role in the mechanism of action of this chaperone [92], consistent with the previous results on Skp-mediated folding of OmpA [92], [182]. SurA, however, did not affect the observed folding rates of PagP, in contrast with the results observed for OmpT refolding [135], but consistent with the view that Skp and SurA may act by distinct mechanisms in partially redundant chaperone pathways [78], [105]. The ability of Skp to prevent the aggregation of HT PagP was also investigated, revealing that even under conditions in which aggregation is strongly favoured, Skp can rescue the folding and membrane insertion of HT PagP [92]. Together, these studies indicate the power of combining different methods to study the folding mechanism of an OMP and set the scope for future investigations into how OMPs fold both unassisted and assisted by folding factors in vitro and in vivo.
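The kinetic descriptions used above for HT PagP folding (a burst phase followed by one or two exponential phases) correspond to a simple functional form, sketched below in Python. This is a generic model of a folding transient, not the fitting procedure used in [157] or [214]; the function names, amplitudes and rate constants are hypothetical values chosen only to show how the single- and double-exponential cases differ.

```python
import math


def folding_trace(time_s, burst, amplitudes, rate_constants):
    """Return the signal at time_s: a burst-phase offset plus a sum of exponential phases.

    burst          -- signal change completed within the dead time of the experiment
    amplitudes     -- amplitude of each resolved exponential phase
    rate_constants -- observed rate constant (s^-1) of each phase
    """
    if len(amplitudes) != len(rate_constants):
        raise ValueError("each phase needs one amplitude and one rate constant")
    return burst + sum(
        a * (1.0 - math.exp(-k * time_s)) for a, k in zip(amplitudes, rate_constants)
    )


def half_time(rate_constant):
    """Return the half-time (s) of a single exponential phase."""
    return math.log(2.0) / rate_constant


# Hypothetical parameters: burst phase followed by a single exponential phase.
single = dict(burst=0.2, amplitudes=[0.8], rate_constants=[0.01])
# Hypothetical parameters: burst phase followed by two exponential phases.
double = dict(burst=0.2, amplitudes=[0.5, 0.3], rate_constants=[0.01, 0.001])

for t in (0.0, 100.0, 1000.0, 5000.0):
    print(f"t = {t:6.0f} s  single = {folding_trace(t, **single):.3f}"
          f"  double = {folding_trace(t, **double):.3f}")
```

Fitting measured transients to both forms and comparing the residuals is one way of deciding whether a second exponential phase is justified by the data.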
Early experiments on the folding of water-soluble proteins focused on small, single domain proteins as these provide relatively simple folding models [3], [11], [13], [14]. In the intervening decades, many details on the folding of these proteins have been elucidated as new experimental methods and computational techniques have been developed [31]. More complicated folding scenarios are beginning to be investigated experimentally, including studies of multidomain proteins, oligomeric proteins, folding in the presence of chaperones and in the cellular milieu [40], [41], [139], [219]. Even fifty years after Anfinsen’s experiments, the first simulations of protein folding in all-atom detail have only been reported in the last two years and these initial studies were limited to small (<100 residues), single domain proteins [30], [220], [221]. More recently, the characterisation of the folding and assembly of a dimeric protein in atomistic detail using equilibrium MD simulations has paved the way for simulation studies of the folding of oligomeric proteins [222]. Application of simulation techniques to study integral membrane proteins firstly requires the protein under study to be correctly positioned within the bilayer as this information is not obtained when membrane protein structures are determined by X-ray crystallography or NMR [223]. This is often achieved by first applying a coarse-grained approach to correctly position the protein within the membrane and using these results to guide atomistic simulations to obtain detailed information on lipid-protein interactions and protein dynamics [223]. 
The multiscale simulation approach has been successful in probing the lipid interactions of aquaporins, ion channels and G-protein coupled receptors, among others, but simulating the complexities of lipid organisation within membranes remains a challenge in this field [223], [224].
Similar to the evolution of studies of water-soluble protein folding, analysis of OMP folding mechanisms has, thus far, focused on a small subset of relatively simple proteins that are amenable to the array of kinetic and thermodynamic assays required to determine folding mechanisms [58], [159]. Recent trends, however, suggest that the field of membrane protein folding is already progressing towards more complex folding systems. Folding models utilised thus far include OmpA and Omp85 (a BamA homologue), both of which have soluble, periplasmic domains, the trimeric porin OmpF and the large, 19-stranded hVDAC [108], [200], [203], [225]. Moreover, the use of periplasmic chaperones to aid folding and membrane insertion has now been applied to several different folding systems [92], [108], [135], [182].
Membrane protein folding is complicated by the need to recreate a suitable membrane mimetic environment in vitro into which the protein can fold [47]. While early studies concentrated on simple micellar systems, these often do not provide a good mimic for the complex, heterogeneous environment of a membrane in vivo and can lead to inactivation or aggregation of the protein of interest [226]. The use of lipid-based mimics, usually synthetic liposomes, has provided a wealth of information based on the ability to introduce guest lipids to modulate surface charge, lateral pressure and membrane fluidity and to examine their effect on OMP folding and stability.
Studies to date have focused on relatively simple lipid mixtures, often with short acyl chains (∼C12), as these have been shown to promote the folding of OMPs in vitro, and at lipid-to-protein molar ratios (often 400:1 to 3200:1, or up to approximately 100:1 by weight) which far exceed those encountered in biological membranes (in the range 1:1 to 1:3 by weight [227]). While much has been learned about the interactions between OMPs and their surrounding lipid environment from these studies, these simple mimics do not accurately represent the folding situation in vivo. Indeed, the membrane environment of OMPs in vivo is characterised by variable lipid composition and asymmetric distributions of lipids, as well as being densely packed with the many proteins associated with, and inserted into, the bilayer [63].
As the experimental toolbox for the folding of OMPs continues to expand, the level of complexity attainable in in vitro models is likely to increase. Additionally, as the gap in understanding between the folding of OMPs and water-soluble proteins continues to decrease, the similarities and differences between the folding mechanisms of these two protein classes will emerge. This knowledge will be critical in understanding some of the fundamental biophysical questions which remain to be answered, such as the relationship between the primary sequence of a protein and its native structure in the context of water or lipid as the solvent, and how chaperones assist folding in vitro and in vivo. Moreover, these concepts will be critical in refining current knowledge of protein folding mechanisms and working towards a universal folding theory which encompasses all of nature’s proteins.