Xin Zhang1, Yu Liu, Joseph C Genereux, Chandler Nolan, Meha Singh, Jeffery W Kelly. 1. Department of Chemistry, ‡Department of Molecular and Experimental Medicine, and §Department of Chemical Physiology, ∥The Skaggs Institute for Chemical Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States.
Abstract
The biosynthesis of soluble, properly folded recombinant proteins in large quantities from Escherichia coli is desirable for academic research and industrial protein production. The basal E. coli protein homeostasis (proteostasis) network capacity is often insufficient to efficiently fold overexpressed proteins. Herein we demonstrate that a transcriptionally reprogrammed E. coli proteostasis network is generally superior for producing soluble, folded, and functional recombinant proteins. Reprogramming is accomplished by overexpressing a negative feedback deficient heat-shock response transcription factor before and during overexpression of the protein-of-interest. The advantage of transcriptional reprogramming versus simply overexpressing select proteostasis network components (e.g., chaperones and co-chaperones, which has been explored previously) is that a large number of proteostasis network components are upregulated at their evolved stoichiometry, thus maintaining the system capabilities of the proteostasis network that are currently incompletely understood. Transcriptional proteostasis network reprogramming mediated by stress-responsive signaling in the absence of stress should also be useful for protein production in other cells.
Production of large quantities of soluble, properly folded, and functional recombinant proteins-of-interest
remains a major challenge in both academic and industrial settings. Escherichia coli is an easily cultured organism often used
for recombinant protein production; however, the quantity obtained
of a soluble, folded, and functional protein-of-interest is often
undesirably low, because many proteins-of-interest aggregate to form
inclusion bodies in the cytoplasm when overexpressed.[1] This happens because the innate E. coli proteostasis network capacity is insufficient to support the efficient
folding of large quantities of a protein-of-interest as well as the
endogenous proteome during bacterial cell growth. Therefore, one potentially
general strategy to improve recombinant protein production is to enhance
the E. coli proteostasis network capacity
to facilitate the proper folding of recombinant proteins during their
overexpression, without compromising the folding of the endogenous
proteome.[2,3]

The proteostasis network capacity of E. coli is
determined by the concentration and relative stoichiometry of the
proteostasis network components, including chaperones, co-chaperones,
chaperonins, folding enzymes, and proteases.[4−7] Overexpression of select proteostasis
network components, such as the molecular chaperones (DnaK), and/or
co-chaperones (DnaJ and GrpE), and/or chaperonins (GroEL and GroES),
either alone or in combination, effectively increases the biosynthetic
yield of certain proteins-of-interest.[8−11] However, this approach is limited—individual
chaperones often handle specific protein substrates,[12,13] making it difficult to predict a priori a suitable
chaperone system for a specific protein. Thus, extensive experimentation
is required to determine the chaperone pathway required to fold a
specific protein.[11−13] The E. coli cytosolic proteostasis
network is transcriptionally regulated by the heat-shock response
(HSR) stress-responsive signaling pathway.[14] The advantage of transcriptional reprogramming versus overexpressing
select proteostasis network components is that the system capabilities
of the proteostasis network are maintained because the proteostasis
network components are upregulated at their evolved stoichiometry.[15−18]

The HSR transcriptional program is controlled by the transcription
factor σ32,[19] whose induction through
stress has been shown to elevate the mRNA and protein levels of the
majority of proteostasis network components.[20] Elevated temperature or the coexpression of σ32 has been used
previously to improve recombinant protein production.[21−23] However, these methods can be problematic: (i) an elevated temperature
causes the endogenous proteome to misfold, consuming much of the HSR-enhanced
folding capacity; (ii) the HSR or the coexpression of σ32 is
transient, due to a negative feedback pathway and degradation of the
σ32 transcription factor,[20,24] thus presenting
a timing challenge; and (iii) growing E. coli at
high temperature for an extended period compromises cellular health
and requires substantial electricity for large-scale production.

Herein, we transcriptionally reprogram E. coli’s
cytosolic proteostasis capacity at a permissive growth
temperature before and during production to maximize the quality and
quantity of recombinant proteins-of-interest (Figure 1a). To accomplish this goal, we overexpressed the previously
reported I54N mutant of σ32 (σ32-I54N)[25] that is insensitive to negative feedback regulation, affording
persistently high proteostasis network component concentrations at
the stoichiometry optimized by evolution (Figure 1b). σ32-I54N was subcloned into a pBAD vector, which
allows the expression of σ32-I54N to be regulated by L-arabinose. Addition of arabinose to the cell culture (final concentration
of 0.02% (w/v)) increased σ32-I54N levels over a period of ∼24
h, even at 37 °C, a temperature that is known not to induce a
HSR (Figure 2a, top panel). As expected, σ32-I54N
expression substantially increased the levels of major chaperones
(DnaK), co-chaperones (DnaJ and GrpE), and chaperonins (GroEL and
GroES) over a period of ∼24 h (Figure 2a). In contrast, the concentration of trigger factor, a cotranslational
chaperone not under σ32 regulation,[20] was largely unchanged over this time period (Figure 2a). Importantly, proteostasis network capacity could also
be enhanced when the cells were grown in minimal media (M9, Supplementary Figure S1), rendering this approach
suitable for the production of metabolically labeled proteins.
Figure 1
Enhancing cellular
proteostasis capacity for high-yield, high-quality
protein production in E. coli. (a) The proposed strategy
to increase production of soluble, folded, and functional recombinant
proteins by enhancing the proteostasis capacity of the E.
coli through overexpression of a negative feedback deficient
mutant of heat-shock factor σ32, σ32-I54N. (b) Induction
of σ32-I54N expression by the addition of arabinose results
in a persistent induction of cellular chaperones, co-chaperones, and
chaperonins at ambient temperatures.
Figure 2
Overexpression of σ32-I54N increased the cellular concentration
of major proteostasis network components in E. coli for durations that are suitable for recombinant protein expression.
(a) In LB media, σ32-I54N expression increased the cellular
concentration of σ32-I54N, chaperonins GroEL and GroES, chaperone
DnaK, and co-chaperone GrpE, as determined by Western blotting analyses
(experimental procedures outlined in the top panel). Trigger factor
is not regulated by σ32 and serves as a loading control. EV:
empty vector. (b,c) σ32-I54N expression for 1 h followed by
cell lysis increased the concentration of major heat-shock proteins
and maintained their naturally evolved stoichiometry, as quantified
by whole cell SILAC MudPIT proteomic analyses (experimental procedures
outlined in the top panel). Changes in heat-shock protein levels in
response to wild-type (WT) σ32 expression or thermal (42 °C)
stress are shown for comparison and more comprehensively in Supplementary Table S1. The HSR-regulated proteins
that belong to specific folding or degradation pathways are color
coded as green (DnaK/DnaJ/GrpE), blue (GroEL/GroES), and red (ClpX/P
and ClpA/P AAA+ proteases). Trigger factor is not regulated by σ32
and serves as a control.
Using a quantitative whole cell proteomics approach (Figure 2b, top panel; stable isotopic labeling by amino
acids in cell culture (SILAC) combined with multidimensional protein
identification technology (MudPIT) mass spectrometry), we found that
σ32-I54N expression produced a HSR-like transcriptionally remodeled
proteostasis network without perturbing the majority of the endogenous
cellular proteome (Figures 2b,c and 3a). First, σ32-I54N expression for 1 h (Figure 2b, top panel) resulted in an elevated level of HSR
regulated proteins within the σ32 regulon, including the major
chaperones, chaperonins, and the AAA+ proteases (Figure 2b,c; proteins with >1.5-fold upregulation
following σ32-I54N expression can be found in Supplementary Table S1, along with their fold changes following wild-type
σ32 expression or resulting from a thermal HSR). These results
demonstrate that the σ32-I54N mutant faithfully recapitulates
the transcriptional program of wild-type σ32 or a thermally induced
HSR; however, the fold change was higher with σ32-I54N than
with wild-type σ32 or heat-shock due to the loss of feedback
inhibition. Second, we found that the σ32-I54N HSR transcriptional
program largely maintained the proper stoichiometry of proteostasis
network components in comparison to the wild-type σ32 transcriptional
program or a thermal HSR (Figure 2b,c and Supplementary Table S1). This is evidenced by
the extent of the fold change of chaperones or chaperonins associated
with individual pathways. For example, components comprising the Hsp70
(DnaK, DnaJ, and GrpE) and the Hsp60 (GroEL and GroES) pathways were
increased by factors of ∼6 and ∼4, respectively
(green and blue in Figure 2b,c). Similarly,
the ClpX, ClpP and ClpA AAA+ proteases were increased by ∼2-fold
(red in Figure 2b,c). Since productive folding
of a protein-of-interest is the result of a collaboration between
a number of folding pathways competing with proteolysis,[5,16−18,26] it is crucial that
balance between these pathways be maintained as well as possible.
Third, σ32-I54N expression minimally perturbed the endogenous E. coli proteome, with the exception of those proteins in
the σ32 regulon—89% of the proteins detected in the SILAC
MudPIT proteomics study were changed by less than 50% in response
to σ32-I54N expression (Figure 3a). Consistent
with these data, σ32-I54N expression minimally perturbed E. coli growth (Figure 3b). Collectively,
these results indicate that σ32-I54N expression results in healthy E. coli exhibiting an enhanced proteostasis network capacity,
providing a pro-folding environment that also has the capacity to
degrade misfolded proteins, envisioned to minimize inclusion body
formation.
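The statement that 89% of detected proteins changed by less than 50% can be expressed as a simple fraction over per-protein fold changes. The following is a minimal sketch, assuming that "changed by less than 50%" means |FC − 1| < 0.5; the function name and the example values are hypothetical and are not taken from the proteomics data set.

```python
def fraction_unperturbed(fold_changes, tolerance=0.5):
    """Fraction of proteins whose fold change lies within `tolerance` of 1.

    Assumption: "changed by less than 50%" is read as |FC - 1| < 0.5.
    """
    if not fold_changes:
        raise ValueError("fold_changes must not be empty")
    unchanged = sum(1 for fc in fold_changes if abs(fc - 1.0) < tolerance)
    return unchanged / len(fold_changes)


# Hypothetical fold changes (sigma32-I54N / control), for illustration only.
example_fc = [0.9, 1.1, 1.3, 0.7, 6.0, 4.0, 2.0, 1.0, 1.2, 0.8]
print(f"{fraction_unperturbed(example_fc):.0%} of proteins within 50% of control")
```

Applied to the full list of quantified proteins, a value near 0.89 would correspond to the figure quoted above.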
Figure 3
σ32-I54N expression minimally perturbs the E. coli proteome, with the exception of the heat-shock response genes, and
thus does not affect cell growth. (a) Volcano plot relating the fold
change (FC) of the proteome in response to σ32-I54N expression
to the FC variability between duplicate SILAC MudPIT proteomics experiments.
Variability is expressed as $\log_2 \pi$, where $\pi = |\mathrm{FC} - 1| / \sigma_{\mathrm{FC}}$ and $\sigma_{\mathrm{FC}}$ is the standard deviation of FC. Experimental procedures are outlined
in Figure 2b, top panel. (b) Overexpression
of σ32-I54N did not affect growth of E. coli during recombinant protein overexpression at 37 °C in LB media.
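As an illustration of the quantity plotted in panel (a), the sketch below computes π = |FC − 1|/σFC for one protein from replicate fold changes. It assumes that FC is the mean of the replicate values, that σFC is the sample standard deviation, and that the fold-change axis is log2-scaled; none of these choices is specified in the caption, and the function name is hypothetical.

```python
import math
import statistics


def volcano_point(replicate_fcs):
    """Return (log2 FC, log2 pi) for one protein.

    pi = |FC - 1| / sigma_FC, following the Figure 3 caption.
    Assumptions: FC is the replicate mean; sigma_FC is the sample
    standard deviation (n - 1 denominator).
    """
    fc = statistics.mean(replicate_fcs)
    sigma_fc = statistics.stdev(replicate_fcs)
    if fc <= 0:
        raise ValueError("fold change must be positive to take log2")
    if sigma_fc == 0 or fc == 1:
        raise ValueError("log2(pi) is undefined for this protein")
    pi = abs(fc - 1.0) / sigma_fc
    return math.log2(fc), math.log2(pi)


# Hypothetical duplicate measurements for a strongly induced protein.
x, y = volcano_point([5.0, 7.0])
print(f"log2 FC = {x:.2f}, log2 pi = {y:.2f}")
```

Proteins with a large fold change and small replicate scatter end up far from the origin on both axes, which is how regulon members separate from the unperturbed bulk of the proteome.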
Next, we investigated whether
the HSR transcriptional program was
beneficial for recombinant protein production. Toward this end, the
pBAD vector encoding σ32-I54N was cotransformed with a pET29b(+) vector harboring the gene for the protein-of-interest into the BL21(DE3)
E. coli strain commonly used for protein overexpression,
which is deficient in Lon and OmpT proteases (Figure 4a). σ32-I54N expression was initiated by the addition
of L-arabinose (0.02%, w/v) after the culture reached an
OD600 of 0.4 at 37 °C (red pathway in Figure 4a). Transcriptional reprogramming to enhance the E. coli proteostasis network capacity (Figure 4a) was started 1 h before IPTG induction of the protein-of-interest,
which was expressed for 4 h. During the period of protein-of-interest
expression, L-arabinose was present in the culture to continuously
enhance E. coli proteostasis network capacity (red
pathway in Figure 4a). For comparison, D-glucose (0.02%, w/v) was used to inhibit σ32-I54N expression,
resulting in a basal E. coli proteostasis network
capacity (black pathway in Figure 4a). The
σ32-I54N transcriptional program moderately to substantially
increased the soluble concentration of three aggregation-prone recombinant
proteins exhibiting distinct structural scaffolds and organismal origins
(Figure 4b). A de novo designed,
mutation-destabilized retro-aldolase (RA) and the industrially important
endoxylanase (XynA) enzyme mainly form inclusion bodies, with only
a small soluble fraction, when produced in E. coli featuring a basal proteostasis network.[27−29] When these
proteins were overexpressed in E. coli featuring
an enhanced proteostasis network capacity, the solubility of RA and
XynA increased by 2- and 3-fold, respectively (Figure 4b), suggesting that the transcriptionally reprogrammed proteostasis
network is able to rescue aggregation-prone proteins from inclusion
body formation.
Figure 4
HSR transcriptional program increases the yield of soluble, folded,
and functional recombinant proteins. (a) Schematic showing recombinant
protein overexpression in a HSR transcriptionally enhanced E. coli proteostasis network. D-Glucose inhibits
σ32-I54N expression, resulting in a basal proteostasis capacity.
POI: protein-of-interest. IPTG: isopropyl β-D-1-thiogalactopyranoside.
(b) Western blotting analysis of soluble recombinant proteins produced
in the same number of E. coli cells featuring enhanced
(+) or basal (−) proteostasis capacities. Trigger factor serves
as a loading control. (c) Concentration of functional destabilized
RA mutant in lysates measured using the previously described RA folding
probe.[28] (d) Xylanase activity in lysates
measured using the EnzChek Ultra Xylanase Assay Kit.
(e) Concentration of native, tetrameric A25T-TTR in lysates measured
using the previously published TTR-tetramer folding probe.[28]
Using protein folding probes, we have recently shown that not all
soluble proteins are functional when overexpressed in E. coli.[28] Thus, the increase in soluble protein
may not reflect an increase in the levels of functional protein. The
functional concentration of a protein-of-interest can either be quantified
using protein folding probes as previously reported[28] or assessed directly using functional assays. Using the
RA folding probe, we found that the enhanced proteostasis network
capacity increased the functional RA concentration in cell lysates
by 2-fold (9.80 ± 0.41 μM vs 4.12 ± 0.36 μM;
Figure 4c and Supplementary
Figure S2), comparable to the 2-fold increase in the soluble
fraction. Similarly, for xylanase, the HSR transcriptional program
produced a ∼4-fold higher XynA activity, assessed using a fluorescence-based
xylanase activity assay, relative to XynA produced within a basal
proteostasis network (6.19 ± 0.45 μM/min vs 1.69 ±
0.34 μM/min; Figure 4d and Supplementary Figure S3). Thus, it appears that
the enhanced proteostasis network capacity improves the yield of recombinant
RA and XynA as folded and functional proteins.

We also assessed the solubility and folding of the A25T mutant
of human transthyretin (TTR). A25T-TTR is a soluble protein when overexpressed
in E. coli, and its solubility did not change when
A25T-TTR was expressed within an enhanced proteostasis network (Figure 4b). However, <50% of the soluble A25T-TTR assumed
its native tetrameric conformation when expressed within a basal proteostasis
network (Figure 4e).[28] Within a σ32-I54N-mediated enhanced proteostasis network, 39%
more of the soluble A25T-TTR formed native tetramers in comparison
to A25T-TTR produced in a basal proteostasis network (7.98 ± 0.11
μM versus 5.75 ± 0.13 μM; Figure 4e and Supplementary Figure S4),
suggesting that the enhanced proteostasis network capacity promotes
the folding and assembly of A25T-TTR into its functional quaternary
structure.

Collectively, our data show that the σ32-I54N
HSR-like reprogrammed
proteostasis network promotes the production of soluble, folded, and
functional recombinant proteins. This approach for recombinant protein
production differs from previous approaches. First, by avoiding an
environmental stress to induce the HSR, the cells maintain a healthy
cellular physiology. Second, the HSR transcriptional program increases
the cellular levels of chaperones, co-chaperones, chaperonins, and
proteases (except Lon and OmpT in the BL21(DE3) strain) in their naturally
evolved stoichiometry, which is important for maintaining the system
attributes of the cytosolic proteostasis network. Such a global enhancement
of E. coli cytosolic proteostasis capacity is able
to mediate folding of a variety of client proteins, eliminating the
necessity to know which chaperone or chaperonin pathway handles a
particular protein-of-interest. Third, the σ32-I54N transcription
factor is resistant to the feedback inhibition and degradation that
limits the proteostasis network enhancement that can be achieved by
wild-type σ32 overexpression. Therefore, higher concentrations
of most proteostasis network components can be achieved with σ32-I54N
reprogramming (Figure 2b), resulting in higher
quantities of XynA (Supplementary Figure S5) relative to WT σ32 reprogramming. Lastly, although σ32-I54N
transcriptional reprogramming also increases protease levels, it primarily
affords a pro-folding environment, suited to folding recombinant proteins.
The yield of XynA was minimally affected by lengthening the expression
period from 4 to 6 h, suggesting that the proteolytic capacity did
not override the pro-folding capacity (Supplementary
Figure S6). Thus, we expect this approach to be effective for
improving the yield of a variety of recombinant proteins without the
need for extensive optimization. However, optimization of the temperature,
the culture density at the time of HSR transcriptional program induction,
and the timing of protein-of-interest induction could further enhance
yield.

The principles that we have demonstrated for improved recombinant
protein overexpression in bacteria should be readily applicable to
other cellular expression systems, including eukaryotic cells.[15−18] Transcriptional reprogramming retains the system attributes of the
proteostasis network, enabling enhanced proteostasis network capacities
to be used to improve the yield of soluble, folded, and functional
recombinant proteins. Genetic strategies and chemical approaches for
the activation of stress responsive signaling in the absence of stress
are now becoming available for multiple organisms.[30−34] Thus, it is now practical to transcriptionally reprogram
proteostasis network capacity for improved production of
difficult-to-fold recombinant proteins.
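The fold improvements quoted for RA, XynA, and A25T-TTR follow from the ratios of the reported lysate measurements. The sketch below recomputes those ratios and propagates the reported uncertainties, assuming the ± values are independent standard deviations that combine in quadrature as relative errors; this error model is an assumption and is not stated in the text. The function name is hypothetical.

```python
import math


def fold_change(enhanced, enhanced_err, basal, basal_err):
    """Ratio enhanced/basal with first-order propagated uncertainty.

    Assumption: the two errors are independent, so their relative
    errors add in quadrature.
    """
    ratio = enhanced / basal
    rel_err = math.sqrt((enhanced_err / enhanced) ** 2 + (basal_err / basal) ** 2)
    return ratio, ratio * rel_err


# Values reported in the text: (enhanced, error, basal, error).
measurements = {
    "RA, functional concentration (uM)": (9.80, 0.41, 4.12, 0.36),
    "XynA, activity (uM/min)": (6.19, 0.45, 1.69, 0.34),
    "A25T-TTR, native tetramer (uM)": (7.98, 0.11, 5.75, 0.13),
}

results = {name: fold_change(*values) for name, values in measurements.items()}
for name, (ratio, err) in results.items():
    print(f"{name}: {ratio:.2f} +/- {err:.2f} fold")
```

The A25T-TTR ratio corresponds to the 39% increase reported above, and the RA and XynA ratios are consistent with the rounded 2-fold and ∼4-fold values.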
Methods
Recombinant Protein Overexpression in the Heat-Shock-Like Expression System
A pET29b(+) vector (kanamycin resistance)
carrying the gene encoding a protein-of-interest was transformed into the
BL21(DE3) strain harboring the pBAD-σ32-I54N vector (ampicillin
resistance). When cultures of BL21(DE3) cells bearing both vectors
reached an OD600 of 0.4, σ32-I54N expression was
induced with 0.02% (w/v) L-arabinose. After incubation for
1 h, isopropyl β-D-1-thiogalactopyranoside (IPTG, final
concentration of 1 mM) was added to induce overexpression of the protein-of-interest,
which could be expressed for as long as 24 h. During protein-of-interest
expression, L-arabinose was kept in the cell culture to ensure
that the E. coli proteostasis network capacity was
constantly enhanced.
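The induction sequence above can be laid out as a simple schedule with the inducer volumes worked out by dilution arithmetic. This is a minimal sketch: the culture volume and both stock concentrations are hypothetical values chosen for illustration, the 4 h expression time is the one used in the main text, and the small volume change caused by adding stock is neglected.

```python
def stock_volume_ml(final_conc, stock_conc, culture_volume_ml):
    """Volume of stock (mL) needed to reach final_conc in the culture.

    final_conc and stock_conc must share the same unit. The dilution
    caused by the added stock itself is neglected.
    """
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the target")
    return final_conc * culture_volume_ml / stock_conc


CULTURE_ML = 1000.0          # hypothetical culture volume
ARABINOSE_STOCK_PCT = 20.0   # hypothetical L-arabinose stock, % (w/v)
IPTG_STOCK_MM = 1000.0       # hypothetical IPTG stock, mM

# (hours after OD600 reaches 0.4, step, stock volume in mL or None)
schedule = [
    (0.0, "add L-arabinose to 0.02% (w/v)",
     stock_volume_ml(0.02, ARABINOSE_STOCK_PCT, CULTURE_ML)),
    (1.0, "add IPTG to 1 mM",
     stock_volume_ml(1.0, IPTG_STOCK_MM, CULTURE_ML)),
    (5.0, "harvest after 4 h of expression", None),
]

for hours, step, volume_ml in schedule:
    detail = "" if volume_ml is None else f" ({volume_ml:.2f} mL of stock)"
    print(f"t = {hours:g} h: {step}{detail}")
```

L-Arabinose is added once and left in the culture, so no separate step is needed to maintain σ32-I54N expression during protein-of-interest production.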