Vanessa E DeLey Cox1, Megan F Cole2, Eric A Gaucher3,4. 1. School of Chemistry and Biochemistry , Georgia Institute of Technology , Atlanta , Georgia 30332 , United States. 2. Department of Biology , Emory University , Atlanta , Georgia 30322 , United States. 3. School of Biology , Georgia Institute of Technology , Atlanta , Georgia 30332 , United States. 4. Department of Biology , Georgia State University , Atlanta , Georgia 30303 , United States.
Abstract
Noncanonical amino acid (ncAA) incorporation has led to significant advances in protein science and engineering. Traditionally, in vivo incorporation of ncAAs is achieved via amber codon suppression using an engineered orthogonal aminoacyl-tRNA synthetase:tRNA pair. However, as more complex protein products are targeted, researchers are identifying additional barriers limiting the scope of currently available ncAA systems. One barrier is elongation factor Tu (EF-Tu), a protein responsible for proofreading aa-tRNAs, which substantially restricts ncAA scope by limiting ncaa-tRNA delivery to the ribosome. Researchers have responded by engineering ncAA-compatible EF-Tus for key ncAAs. However, this approach fails to address the extent to which EF-Tu inhibits efficient ncAA incorporation. Here, we demonstrate an alternative strategy leveraging computational analysis to broaden EF-Tu's substrate specificity. Evolutionary analysis of EF-Tu and a naturally evolved specialized elongation factor, SelB, provide the opportunity to engineer EF-Tu by targeting amino acid residues that are associated with functional divergence between the two ancient paralogues. Employing amber codon suppression, in combination with mass spectrometry, we identified two EF-Tu variants with non-native substrate compatibility. Additionally, we present data showing these EF-Tu variants contribute to host organismal fitness, working cooperatively with components of native and engineered translation machinery. These results demonstrate the viability of our computational method and lend support to corresponding assumptions about molecular evolution. This work promotes enhanced polyspecific EF-Tu behavior as a viable strategy to expand ncAA scope and complements ongoing research emphasizing the importance of a comprehensive approach to further expand the genetic code.
Noncanonical amino acid (ncAA) incorporation has led to significant advances in protein science and engineering. Traditionally, in vivo incorporation of ncAAs is achieved via amber codon suppression using an engineered orthogonal aminoacyl-tRNA synthetase:tRNA pair. However, as more complex protein products are targeted, researchers are identifying additional barriers limiting the scope of currently available ncAA systems. One barrier is elongation factor Tu (EF-Tu), a protein responsible for proofreading aa-tRNAs, which substantially restricts ncAA scope by limiting ncaa-tRNA delivery to the ribosome. Researchers have responded by engineering ncAA-compatible EF-Tus for key ncAAs. However, this approach fails to address the extent to which EF-Tu inhibits efficient ncAA incorporation. Here, we demonstrate an alternative strategy leveraging computational analysis to broaden EF-Tu's substrate specificity. Evolutionary analysis of EF-Tu and a naturally evolved specialized elongation factor, SelB, provide the opportunity to engineer EF-Tu by targeting amino acid residues that are associated with functional divergence between the two ancient paralogues. Employing amber codon suppression, in combination with mass spectrometry, we identified two EF-Tu variants with non-native substrate compatibility. Additionally, we present data showing these EF-Tu variants contribute to host organismal fitness, working cooperatively with components of native and engineered translation machinery. These results demonstrate the viability of our computational method and lend support to corresponding assumptions about molecular evolution. This work promotes enhanced polyspecific EF-Tu behavior as a viable strategy to expand ncAA scope and complements ongoing research emphasizing the importance of a comprehensive approach to further expand the genetic code.
Genetic code expansion is a
central goal of protein research and engineering with a broad range
of applications. The ability to reliably incorporate noncanonical
amino acids (ncAAs) in a site-specific manner has expanded the protein
engineering toolbox to enable the functionalization of proteins with
affinity, spectroscopic, and chemical tags.[1] Consequently, bio-orthogonal modification of proteins with ncAAs
is a powerful and emerging tool critical to the development of both
fundamental protein science and applied biotechnologies. The most
common technique for the translation of proteins containing site-specific
ncAA mutations is amber codon suppression.[2] This technique leverages an orthogonal translation system (OTS),
consisting of a dedicated aminoacyl-tRNA synthetase (aaRS):tRNA pair,
which mediates incorporation of a specific ncAA in the target protein
at a repurposed amber codon.[3] In bacteria,
ncAA incorporation is typically accomplished via OTSs
developed from bio-orthogonal aaRS:tRNA pairs derived from Methanocaldococcus jannaschii or Methanosarcina species. Production yields of proteins containing ncAAs have been
improved through development of an engineered E. coli strain in which release factor 1 has been deleted and genomic amber
codons have been replaced with the ochre stop codon, allowing reassignment
of the amber codon to an ncAA.[4] Additional
advances promoting ncAA incorporation include cell-free translation
systems, optimized translation component concentrations, and genomic
incorporation of OTSs.[5−7] However, while these advances have led to incorporation
of some ncAAs at high yield, the routine application of the OTS strategy
is consistently hindered by considerable and recurring barriers.[8]Persistent challenges include cross-reactive
OTSs, incompatibility
with endogenous elongation factors, and discrimination by additional
translation components. These factors affect both the yield and purity
of ncAA-containing proteins as well as the fitness and viability of
the host microorganism.[9−13] Furthermore, the diversity of these modifications is reduced to
a specific set of ncAAs compatible with existing and engineered translation
machinery, thereby significantly reducing the readily available scope
of potential chemistries and applications. These challenges highlight
an immediate need to develop improved engineering strategies beyond
OTS development that will enable translation of increasingly complex
peptide products, with multisite incorporation of multiple ncAAs.[7,14]One obstacle limiting expansion of the genetic code is elongation
factor Tu (EF-Tu), a guanosine triphosphatase (GTPase).[10,15−18] EF-Tu serves two functions in translation. While most commonly recognized
for translocation of aa-tRNA complexes to the ribosome, it also plays
a critical role in quality control by proofreading aa-tRNAs.[19] All 20 aa-tRNAs associate with EF-Tu having
carefully tuned interactions that prevent misacylated tRNAs from being
efficiently delivered to the ribosome for translation.[20] Similar to misacylated tRNAs, ncaa-tRNAs are
non-native substrates and can be discriminated by EF-Tu, thus preventing
their incorporation into a translated protein.[21] Past efforts have typically circumvented EF-Tu’s
editing mechanism by targeting ncAAs that are tolerated as substrates.
For particularly intractable ncAAs, often with bulky or highly charged
side chains, orthogonal EF-Tus have been developed.[15,16,22,23] However, these
efforts fail to recognize EF-Tu’s comprehensive effect on translation.
Even OTSs that can mediate ncAA incorporation via wild-type EF-Tu benefit from an engineered EF-Tu.[15,24] As a result, engineering EF-Tu to accept an expanded set of ncaa-tRNA
substrates represents a unique opportunity for expanding ncAA incorporation.Within the framework of this strategy, there are two approaches
to broaden EF-Tu’s substrate acceptance. One is to knockout
EF-Tu’s proofreading capabilities and develop a variant that
can accommodate additional ncAAs as well as the canonical 20. However,
this method requires a trade-off between the degree of polyspecificity
desired to translate ncAAs and the specificity required for host organism
survival. Here, we present an alternative strategy: engineering a
novel EF-Tu with broader ncAA compatibilities to be used in complement
with native EF-Tu. This strategy parallels an evolved mechanism for
cellular cotranslational incorporation of selenocysteine (Sec), the
21st proteinogenic amino acid, which uses a dedicated elongation factor,
SelB, in concert with EF-Tu.[25]Computational
methods that exploit models of molecular evolution
have been previously leveraged to develop enzymes with expanded substrate
scope. These strategies are based on the concept that enzymes evolved
specialized activity from generic activities, a theory that is supported
by research demonstrating ancestral proteins exhibit broader substrate
compatibility than their modern counterparts.[26,27] In order to apply these methods to engineering an EF-Tu with enhanced
polyspecific substrate compatibility, we assume, on the basis of sequence
similarity, that SelB and EF-Tu are paralogues. This, in turn, suggests
EF-Tu and SelB share a common ancestor that exhibited greater substrate
promiscuity than the modern proteins. Motivated by this theory, EF-Tu
and SelB protein families were selected for computational analysis
to identify sites involved in functional divergence between EF-Tu
and SelB. This information was then utilized to engineer substrate-promiscuous
EF-Tus.Herein, we describe our efforts to transform the manner
in which
EF-Tu is utilized to incorporate ncAAs. Leveraging an evolutionary-based
method, reconstructing evolutionary adaptive paths (REAP), we engineered
EF-Tu variants to better accommodate three non-native substrates.
By mass spectrometry, we demonstrate two variants, from a collection
of eight, have expanded substrate capabilities. By monitoring cell
culture density, we also show these EF-Tu variants support host organism
fitness. These results lend credence to our choice of evolutionary-based
method and also suggest that EF-Tu and SelB had a common ancestor
with expanded substrate polyspecificity. We discuss how this approach
complements current research highlighting the advantages of improved
OTSs and promotes a more comprehensive approach critical to achieving
future goals that expand the genetic code.
Results and Discussion
Computational Approach to Protein Engineering
REAP
has been previously employed to guide development of enzyme
libraries with expanded substrate acceptance.[28,29] In brief, this method employs inferred evolutionary mutation rates
of amino acid positions to predict which amino acid replacements are
most likely to impart novel protein activity (Figure ).[30,31] REAP analysis is based
on the assumption that amino acids that impact function are conserved
during the evolution of a protein family and the corresponding assumption
that residues lacking conservation are likely not correlated to activity
or stability. REAP functions by ranking residues according to their
degree of conservation in one lineage compared to the degree of conservation
in another lineage. Amino acid sites with low inferred replacement
rates are predicted to have a high correlation to function and are
thus targeted during library design. Correspondingly, sites with high
replacements rates are predicted to have minimal influence on protein
behaviors and are excluded from library design. A central tenet of
this method is that a REAP-developed library can enrich the functional
diversity of a library while reducing the number of variants required
for testing.
Figure 1
General schematic illustrating REAP methodology. This
scheme shows
the comparison of two clades highlighted in blue and pink. Homologous
sequences from each clade are aligned and analyzed computationally
to identify Type I and Type II functional divergence. Results can
be used to estimate the probability that a mutation will affect protein
activity, leading to development of a functionally diverse protein
library.
General schematic illustrating REAP methodology. This
scheme shows
the comparison of two clades highlighted in blue and pink. Homologous
sequences from each clade are aligned and analyzed computationally
to identify Type I and Type II functional divergence. Results can
be used to estimate the probability that a mutation will affect protein
activity, leading to development of a functionally diverse protein
library.Conserved amino acid sites are
classified during REAP analysis
as exhibiting either Type I or Type II functional divergence. Type
I indicates an amino acid is conserved in only one lineage of a protein
phylogeny.[32,33] This indicates the residue is
critical for function in one protein family (where it is conserved),
but not the other (in which the site is variable). Alternatively,
amino acid sites exhibiting Type II functional divergence show conservation
in both branches of the phylogeny, although the amino acid identity
at the conserved position differs between families.[34] This type of divergence suggests that while the amino acid
position is important to protein activity in both families, its role
in protein function may differ.
Selection
of Relevant Amino Acid Residues
To design a small EF-Tu library,
REAP analysis compared EF-Tu and
SelB sequences. Examination of sequence similarity suggests that EF-Tu
and SelB can be classified as functionally divergent homologues, making
them appropriate protein families for a REAP application. EF-Tu and
SelB sequences from 19 prokaryotic families were aligned and evaluated
to identify amino acid positions predicted to influence substrate
compatibility (Figure ). The aligned sequences were analyzed via three
computational models using DIVERGE software.[35] Two models, which employ different parameters for analysis, were
used to identify Type I functional divergence.[32,33] Sites associated with Type II functional divergence were identified
using a third model.[34] Residues were ranked
according to their posterior probability (Type I) or posterior ratio
(Type II) producing a rank-ordered list of amino acid positions, with
the top-ranked sites being predicted to have a greater influence on
activity (Table S1).
Figure 2
Multiple sequence alignment
of EF-Tu and SelB sequences. Sites
selected via REAP analysis are shown. Type I (blue)
and Type II (red) are color-indicated.
Multiple sequence alignment
of EF-Tu and SelB sequences. Sites
selected via REAP analysis are shown. Type I (blue)
and Type II (red) are color-indicated.A preliminary list of targeted amino acid sites was produced
by
parsing the top-ranked residues according to their distance from the
target substrate (Figure A).[36] Because REAP identifies residues
based on conservation rates, a metric influenced by many factors,
the list of REAP-inferred sites was refined via distance
discrimination, which has been previously used to engineer EF-Tu variants.
Distances were calculated using the Cγ of the binding target
amino acid and the Cα of the EF-Tu residue. Residues exceeding
13 Å were removed from the list leaving 26 predicted positions
in close proximity to the target. Of the 26 sites, 7 residues were
excluded, thereby culling the final list to 19 residues (Figure B). Residues omitted
from the library included aliphatic residues between 12 and 13 Å
since they were not expected to have a significant effect on substrate
acceptance, and an alanine residue that was not eligible for mutation
in an alanine-scanning library. Methionine, having the highest entropy
rotamer of the amino acid side chains, was also excluded. Lastly,
although the Cα of Y76 was within 13 Å, the Cγ of
the side chain fell outside the distance cutoff, suggesting that substitution
of alanine would not impact target specificity.
Figure 3
Selection of mutations
in the REAP-designed EF-Tu library. (A)
Plot shows residues identified by REAP. Ranking of position versus distance from target. Black diamonds denote positions
selected for replacement. Gray circles indicate amino acid sites outside
the 13 Å distance cutoff. Colored squares represent residues
within the distance cutoff that were excluded from the library for
various reasons: aliphatic residues (blue), alanine (purple), methionine
(red), tyrosine (green). (B) Amino acids identified by REAP analysis
highlighted on crystal structure of EF-Tu (gray) complexed with tRNAPhe (purple). Inset highlights sites mutated to generate EF-Tu
library (blue). Residues not selected for library are also identified
(cyan). Phenylalanine (orange) is situated in the amino acid binding
pocket. Based on Protein Data Bank structure 1OB2.
Selection of mutations
in the REAP-designed EF-Tu library. (A)
Plot shows residues identified by REAP. Ranking of position versus distance from target. Black diamonds denote positions
selected for replacement. Gray circles indicate amino acid sites outside
the 13 Å distance cutoff. Colored squares represent residues
within the distance cutoff that were excluded from the library for
various reasons: aliphatic residues (blue), alanine (purple), methionine
(red), tyrosine (green). (B) Amino acids identified by REAP analysis
highlighted on crystal structure of EF-Tu (gray) complexed with tRNAPhe (purple). Inset highlights sites mutated to generate EF-Tu
library (blue). Residues not selected for library are also identified
(cyan). Phenylalanine (orange) is situated in the amino acid binding
pocket. Based on Protein Data Bank structure 1OB2.To gauge REAP’s ability to identify relevant
residues, we
also selected three decoy positions. These positions were chosen on
the basis of visual analysis of either a multiple sequence alignment
or EF-Tu’s crystal structure. Site N13 was chosen on the basis
of its conservation in EF-Tu proteins. This position was excluded
from the library because protein lengths were normalized for analysis
(see Materials and Methods). In addition,
two positions, V227 and V274, were selected on the basis of EF-Tu’s
crystal structure and their proximity to the target at 9.1 and 5.8
Å, respectively. Since distance discrimination is the prevailing
strategy used to select mutation sites in EF-Tu, these positions were
deemed likely candidates for mutation and were incorporated into the
library.Alanine scanning was employed to assess the functional
implications
of each position selected via REAP. Definitive evaluation
of EF-Tu mediated incorporation of non-native substrates would require
target protein purification via affinity chromatography
and confirmation via mass spectrometry, a low-throughput,
high-content workflow. The total library size was reduced by grouping
alanine replacements in combinations of 4, 8, or 12 mutations since
the REAP-derived library contained 22 targeted positions, a larger
number of amino acid positions than previous efforts (Figure A). By generating a small,
targeted library from the computational analysis, this comprehensive
strategy for EF-Tu variant analysis was feasible for all variants
that merited further investigation, even the entire library.
Figure 4
REAP-derived
library variants. (A) Chart of mutations made to each
EF-Tu variant. Sequence of wild-type E. coli EF-Tu is shown for reference. (B) EF-Tu (gray) with amino acid residues
mutated in variant EF-4A (inset). Protein is complexed with phenylalanine
(orange) and tRNAPhe (purple).
REAP-derived
library variants. (A) Chart of mutations made to each
EF-Tu variant. Sequence of wild-type E. coli EF-Tu is shown for reference. (B) EF-Tu (gray) with amino acid residues
mutated in variant EF-4A (inset). Protein is complexed with phenylalanine
(orange) and tRNAPhe (purple).
ncAA-Compatible EF-Tu Variants
To characterize
the EF-Tu library, we used an amber codon suppression assay requiring
the cotranslational insertion of an ncAA at an in-frame amber codon.[16] The target gene, chloramphenicol acetyltransferase
(CAT), contained an amber mutation at the permissive D112 position.
CAT confers antibiotic resistance to E. coli resulting in an assay that directly correlates ncAA incorporation
with cellular survival reported as half the maximal inhibitory concentration
(IC50). Rates of survival above wild-type EF-Tu (EF-coli)
indicate the REAP-engineered EF-Tu variant can facilitate incorporation
of the ncAA with greater efficiency than EF-coli.O-Phospho-l-serine (Sep) was a strong ncAA candidate for
our system, because it had been previously identified as an ncAA that
benefits from an engineered EF-Tu.[16] Different
strategies were employed to overcome this barrier to Sep incorporation,
but even OTSs that were somewhat compatible with wild-type EF-Tu showed
improved yields when paired with an engineered EF-Tu.[24,37] One effort to incorporate Sep developed an orthogonal triplet consisting
of tRNASep, SepRS, and EF-Sep to enable cotranslational
insertion of Sep.[16] This engineered triplet
provided a platform for assessing the substrate compatibility of our
modified EF-Tus. Our EF-Tu variants were assayed in combination with
the Sep-OTS, specifically tRNASep and SepRS.Of the
REAP-designed EF-Tu variants, variant EF-4A (N63A/D216A/K263A/N273A)
resulted in the highest IC50 values as determined by the
CAT translation assay. Variant EF-4C conferred survivability similar
to EF-coli with other variants presenting substantially lower IC50 values (Table S2). To deconvolute
the contribution of the four point mutations comprising EF-4A, single-mutation
variants were assayed (EF-N63A, EF-D216A, EF-K263A, and EF-N273A)
(Figure B). Of these
variants, EF-D216A showed improved survivability relative to both
EF-coli and the quadruple mutant EF-4A (Figure ). IC50 values associated with
variants EF-N63A and EF-N273A were not statistically distinguishable
from EF-coli. Variant EF-K263A presented IC50 values below
wild type (Table S2).
Figure 5
Characterization of EF-4A
and single-mutation EF-Tu variants. In vivo suppression via EF-Tu variants
with Sep-OTS (dark purple) or without SepRS (cyan) as measured by
synthesis of CAT (quantified by IC50 value). Data shown
represent triplicate averages except for EF-coli (Sep-OTS), EF-N63A
(Sep-OTS), and EF-N273A (no SepRS), which show data from five replicates.
EF-4A assayed without SepRS shows data from four replicates. All error
bars represent standard deviation. P-values are relative
to EF-coli.
Characterization of EF-4A
and single-mutation EF-Tu variants. In vivo suppression via EF-Tu variants
with Sep-OTS (dark purple) or without SepRS (cyan) as measured by
synthesis of CAT (quantified by IC50 value). Data shown
represent triplicate averages except for EF-coli (Sep-OTS), EF-N63A
(Sep-OTS), and EF-N273A (no SepRS), which show data from five replicates.
EF-4A assayed without SepRS shows data from four replicates. All error
bars represent standard deviation. P-values are relative
to EF-coli.However, host organism
survival conferred by CAT expression does
not exclusively require incorporation of Sep at the amber mutation.
Rather, bacteria survival could be a result of EF-Tu mediated incorporation
of any available aa-tRNA that can pair with the amber codon. To identify
this mechanism of survival, the CAT expression assay was performed
withholding either tRNASep or SepRS. In the event that
a misacylated tRNA was incorporated at the amber mutation, IC50 values would remain unchanged when SepRS was withheld. Conversely,
an endogenous tRNA mispairing with the amber codon would be indicated
by unchanged IC50 values when tRNASep was withheld.
Analysis of experiments lacking tRNASep showed no host
survival, thereby confirming that endogenous tRNAs are not capable
of mispairing with the amber codon. Complementary experiments withholding
SepRS showed largely unchanged IC50 values, indicating
tRNASep is cross-reactive with endogenous aaRSs (Figure ). EF-4A and EF-N273A
show improved host survival when SepRS was withheld, suggesting these
variants might have greater aptitude with a misacylated tRNA. In contrast,
EF-D216A shows somewhat decreased host survival, suggesting perhaps
it may have greater compatibility with the ncaa-tRNA. The compatibility
of EF-4A and EF-D216A with non-native substrates was further evaluated via electrospray ionization mass spectrometry (ESI-MS).
Mass Spectrometry Confirms Substrate-Promiscuous
EF Variants
In order to determine the specific substrate
compatibility of variants EF-4A and EF-D216A, CAT proteins expressed via these EF-Tu variants were purified, and ESI-MS was employed
to investigate the breadth of amino acids incorporated at the permissive
position. Analysis of CAT proteins translated via EF-4A and EF-D216A showed peaks consistent with incorporation of
both Sep and Ser at position 112, suggesting enhanced EF-Tu compatibility
with non-native substrates, specifically ncaa-tRNAs and misacylated
tRNAs (Figure ). EF-D216A
was additionally capable of mediating Gln incorporation at the amber
codon, making it compatible with three non-native substrates. Since
the aim of this effort was the expansion of EF-Tu’s substrate
scope, not the incorporation of a specific ncAA, it is not necessary
to distinguish between types of non-native substrates. To eliminate
post-translational dephosphorylation of Sep as a possible route to
Ser incorporation, ESI-MS was used to analyze CAT protein expressed via EF-Sep mediated translation. If a post-translational
modification were responsible, Ser112 incorporation would be evident
in this CAT protein as well; however, only peaks consistent with Sep
incorporation were evident. These data indicate that Ser incorporation
was not the result of a post-translational dephosphorylated Sep. Combined
with results from the CAT expression assay, these data are indicative
of EF-4A and EF-D216A having expanded substrate compatibility with
non-native substrates, specifically Sep-tRNASep, Ser-tRNASep, and Gln-tRNASep.
Figure 6
Mass spectrometry confirmed
amino acids incorporated at amber mutation
in CAT protein. Protein translation was mediated by EF-Tu variants
listed. The relevant region of the CAT amino acid sequence is shown
for reference with an “X” indicating the permissive
position. A representative group of protein spectra matched is shown
in Table S5.
Mass spectrometry confirmed
amino acids incorporated at amber mutation
in CAT protein. Protein translation was mediated by EF-Tu variants
listed. The relevant region of the CAT amino acid sequence is shown
for reference with an “X” indicating the permissive
position. A representative group of protein spectra matched is shown
in Table S5.This approach to EF-Tu engineering requires balancing the
expanded
polyspecificity desired for non-native substrate acceptance with the
risk of inaccurate translation of the target gene. In this case, EF-Tu
variants mediated expression of a mixed protein product, CAT proteins
with Sep, Ser, Gln at position 112. While this may initially seem
problematic, we argue this challenge is readily overcome by improvements
to synthetase and tRNA engineering. Our data align with prior work
that shows this particular orthogonal tRNA (o-tRNA), tRNASep, is cross-compatible with endogenous aaRSs, indicating that misincorporation,
while permitted by an engineered EF-Tu, is actually caused by a cross-reactive
OTS.[16] While a substrate-specific EF-Tu
variant can prevent misincorporation, this strategy merely shifts
the burden of accurate translation from the aaRS:tRNA pair to the
EF-Tu and fails to address the cross-compatible OTS as the underlying
cause.Since cross-reactive OTSs are common obstacles to genetic
expansion,
recent articles strongly advocate for more rigorous o-tRNA and orthogonal
aaRS (o-RS) engineering.[9,13,15,38] Improving the precision of OTS
engineering transfers the responsibility of accurate translation from
EF-Tu back to the OTS, a distribution of labor that mimics native
translation in which the primary responsibility for accuracy falls
to the canonical aaRS:tRNA pairs, not downstream translation components.[39] By mirroring the native distribution of responsibilities,
dedicated OTSs that ensure accurate tRNA acylation pave the way for
researchers to use translation components with expanded capabilities,
including substrate-promiscuous EF-Tus. This application of downstream
polyspecificity is reflected in the role of the ribosome, which is
known to exhibit broad substrate acceptance and still produce accurately
translated proteins due to the fact that translation components further
upstream ensure accurate tRNA acylation.[40−42] If a rigorously
engineered OTS were used, there is no evidence a promiscuous EF-Tu
would undermine accurate translation. Similarly, we would anticipate
that synthetic acylation methods (e.g., flexizymes)
would be compatible with an EF-Tu exhibiting alternative substrate
compatibility.[43]
EF-Tu Variant
Supports Organismal Fitness
While broader substrate acceptance
is a highly desirable feature
of an EF-Tu, there may be a limit to the degree of infidelity that
is possible for the cell to tolerate. An EF-Tu with enhanced polyspecificity
risks the potential of being so indiscriminate that it is detrimental
to host organism fitness. To examine the impact of engineered EF-Tus
on organismal fitness, we compared substrate-promiscuous variants
(EF-4A and EF-D216A) against an ncAA-specific variant (EF-Sep) and
the wild type (EF-coli). Each EF-Tu variant was expressed in bacteria
grown in 2xYT media to detect leaky expression of EF-Tu, 2xYT media
with 2% glucose added for catabolic repression, and 2xYT media with
0.5 mM IPTG for induction of EF-Tu expression. Each growth curve represents
triplicate averages that were subsequently fitted to modified growth
models that estimate the maximum specific growth rate and lag time.[44]Table S8 presents
these two parameters and their respective errors for each assay. Importantly,
the cell line used, BL21ΔserB, lacks the gene
encoding Sep phosphatase; as such, growth curves were not reproduced
using an alternative engineered cell line.[16]While growth curves for all EF-Tus were similar, cultures
in which EF-4A and EF-D216A expression had been induced suggested
these variants marginally improved host organism fitness. On average,
these variants showed somewhat elevated maximal growth rates and a
shorter lag time before entering an exponential growth phase relative
to EF-coli and EF-Sep (Figure ). They also demonstrated more reliable reproducibility with
consistently small standard deviations. While any benefit to the host
organism is minimal, it is significant that these substrate-promiscuous
variants do not impair host organism fitness. Rather, these data support
the application of engineered polyspecific EF-Tu variants for use
in concert with native translation machinery, further recommending
our strategy as a route to ncAA incorporation. These growth curves
suggest that expanding EF-Tu’s substrate scope is compatible
with the endogenous translation machinery and does not negatively
impact native translation. Hence, an EF-Tu variant with non-native
polyspecific behavior appears to be an asset to genetic code expansion.
Figure 7
Growth
assays for EF-Tu variants expressed in BL21ΔserB cell line. Triplicate averages are shown for EF-4A
(black triangles), EF-D216A (red circles), EF-Sep (blue circles),
and EF-coli (green diamonds). Cultures were grown in 2xYT media (A),
2xYT media with 2% glucose added (B), and 2xYT media with 0.5 mM IPTG
added (C). Error bars show standard deviation.
Growth
assays for EF-Tu variants expressed in BL21ΔserB cell line. Triplicate averages are shown for EF-4A
(black triangles), EF-D216A (red circles), EF-Sep (blue circles),
and EF-coli (green diamonds). Cultures were grown in 2xYT media (A),
2xYT media with 2% glucose added (B), and 2xYT media with 0.5 mM IPTG
added (C). Error bars show standard deviation.
Conclusion
As expansion of the genetic code targets
increasingly complex protein
products, EF-Tu discrimination of ncAAs is emerging as a critical
factor limiting ncAA scope. In this approach, we identified multiple
EF-Tu variants that facilitated incorporation of non-native substrates,
both ncaa-tRNAs and misacylated tRNAs. Computational methods rooted
in theories of molecular evolution guided development of a targeted
EF-Tu library. Compatibility of EF-Tu variants with non-native substrates
was assessed via OTS-mediated amber codon suppression
and confirmed via ESI-MS analysis of purified CAT
protein expressed via EF-Tu mediated translation.
Using these techniques, two EF-Tu variants with expanded substrate
compatibility were identified. Growth assays demonstrated that a cooperative
EF-Tu with expanded substrate scope is a viable addition to cellular
translation without sabotaging cell growth. These data suggest that
expanding EF-Tu’s substrate compatibility may be compatible
with the natural limits imposed by endogenous gene expression. This
research supports future goals to expand the genetic code including
multisite ncAA incorporation, multiple ncAA incorporation, and proteome-wide
incorporation, which can be impacted by EF-Tu’s proofreading
capabilities.[8]The strategy of employing
multiple elongation factors with different
substrate compatibilities in parallel is an evolved mechanism for
cellular cotranslational incorporation of Sec. Our results support
the application of this naturally occurring strategy to engineer the
genetic code and expand ncAA scope. Specifically, this research demonstrated
EF-Tu variants with expanded substrate compatibility can work effectively
in concert with endogenous translation machinery. The success of the
REAP-derived library also offers support to our underlying assumptions,
specifically that EF-Tu and SelB may be paralogues and may have once
shared a common ancestor that exhibited broader polyspecific activity.
Further analysis of the EF-Tu library emphasizes the impact of single
mutations to engineered EF-Tu variants. To generate the library, each
EF-Tu variant contained multiple mutations, a commonly used approach;
however, a single-mutation variant showed enhanced substrate compatibility
relative to the quadruple variant. Additionally, the single-mutation
variants offered improved insight into the contribution of individual
residues to EF-Tu behavior. These data suggest the possibility that
epistatic interactions among amino acid residues may limit researchers’
ability to identify sites influencing substrate acceptance, thus recommending
single-mutation variants for future engineering of EF-Tus compatible
with non-native substrates.Data generated by the EF-Tu library
also presented an opportunity
to evaluate how effectively REAP identified residues that expand substrate
acceptance. The most impactful mutation, D216, was ranked within the
top ten sites associated with Type II functional divergence and within
the top 15% (out of 279 total residues) overall. Only six sites selected
for library development ranked higher. Although the impact of D216
has been debated, our data support evidence that this position strongly
affects EF-Tu’s substrate specificity.[17,22] Of the residues that ranked higher than D216, site N273 was also
mutated in variant EF-4A. Although position N273 was not as impactful
as site D216, follow-up experiments suggest that N273 can also influence
substrate binding, contrary to previous findings. Since both residues
associated with expanded EF-Tu activity were ranked within the top
positions identified by REAP, these data lend validation to our computational
method. They also suggest that REAP can identify relevant positions
whose importance may be otherwise overlooked. Since this research
highlights the influence of individual mutations on EF-Tu, the contribution
of the individual decoy positions cannot be fully characterized, as
they were not evaluated singly in the context of the wild-type sequence.
However, because positions relevant to ncAA compatibility were identified via assessment of mutations in combination, we can conclude
that in combination, the decoy positions were not as effective at
expanding EF-Tu’s non-native substrate compatibility as those
identified by REAP, providing support for our methodology.This
work directly complements current research seeking to further
expand the breadth of non-natural protein translation. Prior work
targeting multisite ncAA incorporation has demonstrated the vital
importance of both improved OTS and EF-Tu engineering.[13,15,38] Additionally, more precisely
engineered OTSs could allow the translation machinery to accommodate
even a highly polyspecific EF-Tu with limited risk of inaccurate translation.
Expanded EF-Tu substrate acceptance also has the potential to reduce,
if not completely eliminate, EF-Tu:ncAA compatibility as a challenge
inhibiting ncAA incorporation. The substrate-promiscuous EF-Tus described
herein, for example, could be promising in combination with other
ncAAs. Additionally, they may be tractable platforms for development
of additional EF-Tus with novel function in the form of either further
expanded substrate compatibility or alternate substrate specificity.
Continued expansion of the genetic code to incorporate alternative
polymer chemistries, non-natural peptide backbone structures, and
increasingly exotic ncAAs is anticipated to demand increasingly extensive
and creative bioengineering solutions.[7,14] Components
that have previously been somewhat tolerant of ncAA incorporation,
like EF-Tu, are beginning to come to the forefront as obstacles that
must be addressed to achieve these challenging goals.[11,12] These concurrent efforts illustrating the urgent need for comprehensive
and creative strategies to expand the genetic code support the argument
for novel approaches to engineer EF-Tu.
Materials and Methods
REAP Library
The REAP alignment was generated using
38 sequences from 19 species of bacteria that express both EF-Tu and
SelB. Due to a large discrepancy in average sequence length between
EF-Tu and SelB sequences, we normalized the length of the 38 selected
sequences to create a more accurate phylogeny. Generally, 25 residues
were eliminated from the N-terminus of each SelB sequence and 343
residues were deleted from the C-terminus. For EF-Tu sequences, 47
residues were removed from the N-terminus; the C-terminus was not
adjusted. REAP DNA sequences are found in Table S4.REAP analysis was completed using DIVERGE2.0 software.[35] The multiple sequence alignment was generated
in Clustal Omega; the phylogeny was generated within DIVERGE2.0 using
a Poisson distribution. Output was calculated for Gu99, Gu01, and
Type II.[32−34]
In Vivo Assay
Variants
were assayed
using a system for the cotranslational insertion of Sep in
vivo. This system included an orthogonal triplet, a tRNA
(tRNASep), an aminoacyl-tRNASep synthetase (SepRS),
and EF-Tu variant (EF-Sep) specifically engineered for Sep. These
genes were located on two plasmids: pCAT112TAG-SepT (Addgene, plasmid
number 34624), and pKD-SepRS-EFSep (Addgene, plasmid number 34623).
The EF-Tu variant EF-Sep was used as a positive control and the standard
to which the REAP variants were compared. Wild-type E. coli EF-Tu, which is not compatible with Sep, was used as a negative
control. It is relevant to note that all experiments contained endogenous
wild-type EF-Tu. The BL21ΔserB cell line (Addgene,
bacterial strain number 34929), which critically lacks Sep phosphatase,
was used. All plasmids and cell lines described here were gifts from
Jesse Rinehart and Dieter Söll.
Calculate IC50
Plasmids of choice (pKD and
pCAT) were transformed into BL21ΔserB competent
cells (Addgene, catalog number 34929). A single colony was selected
from each transformation, grown overnight, and made into a glycerol
freezer stock (25% sterile glycerol, 25% sterile water, and 50% bacteria
culture). For each assay, glycerol freezer stocks were streaked out
and a single colony was picked and grown for ∼24 h. The culture
was then diluted to OD600 0.15 in media supplemented with
2 mM Sep, grown to OD600 0.6–0.8 and induced (0.5
mM IPTG, Sigma-Aldrich). Cultures were allowed to express for 20 h
and then diluted in saline and plated, in duplicate, on agar plates
with a range of chloramphenicol (Thermo Fisher Scientific) concentrations.
Colonies were counted daily.All liquid and solid cultures were
grown at 30 °C. All liquid cultures were grown in LB media supplemented
with 0.08% glucose. Kanamycin (25 μg/mL, kanamycin sulfate,
VWR) and tetracycline (10 μg/mL, tetracycline hydrochloride
98%, Alfa Aesar) were present in all liquid cultures and agar plates.
Sep (2 mM, O-phospho-l-serine, Sigma-Aldrich)
was present in agar plates used for the CAT assay.
Protein Purification
In order to purify the CAT protein,
a hexahistidine tag was added to the carboxyl-terminus of the CAT112TAG
gene (via Gibson assembly). The His-tag was added
to the carboxyl-terminus to prevent truncated peptides from being
purified. Appropriate glycerol freezer stocks were made as described
above. Glycerol freezer stocks were streaked out and a single colony
was picked and grown overnight. Then, 1–1.5 mL starter culture
was added to 0.5–3 L media supplemented with 2 mM Sep, grown
to OD 0.6–0.8 and induced (0.5 mM IPTG). Protein was expressed
for 20 h then spun down and frozen at −80 °C. Cultures
were resuspended in 5 mL protein extraction reagent (BugBuster, EMD
Millipore) and 2.5 μL Benzonase nuclease (250 U/μL purity
>90% EMD Millipore) per 1 g cell pellet. Resuspended pellets were
incubated at room temperature for 60 min on a rocking platform and
then spun down (11 419g). The supernatant
for each sample was collected and applied to 1.5 mL Ni-NTA resin (Superflow
prepacked columns, Qiagen) using a vacuum manifold (QIAvac 24 Plus,
Qiagen). All filter sterilized buffers contained 50 mM NaH2PO4 and 300 mM NaCl pH 8.0 with either 10, 20, or 500
mM imidazole added. Columns were prepped by decanting the storage
buffer, and then applying 10 mL of 10 mM imidazole buffer. Next, 30
mL of supernatant were applied to column, followed by 10 mL of 20
mM imidazole buffer. This step was repeated, applying 30 mL supernatant
followed by 10 mL of 20 mM imidazole buffer, until all supernatant
had been applied to the column, ending with 10 mL of 20 mM imidazole
buffer. Finally, protein was eluted in 0.5 mL aliquots with 500 mM
imidazole buffer (4.5 mL total). Eluate aliquots were run on an SDS-PAGE
gel to estimate protein concentration in aliquots. When deemed necessary,
aliquots were combined and concentrated using centrifugal concentrators
with a 10 000 MWCO membrane (Spin-X UF, Corning).
In-Gel Digestion
and Mass Spectrometry
In-gel digestion,
nano-LC–MS/MS, and peptide identification was performed as
previously described with the following modifications.[45] Protein digestion was performed using chymotrypsin.
Reverse phase chromatography was performed using an in-house packed
column (40 cm long × 75 μm ID × 360 OD, Dr. Maisch
GmbH ReproSil-Pur 120 C18-AQ 1.9 μm beads) and a 120 min gradient.
Raw files were searched using the Mascot algorithm (version 2.5.1)
against a protein database constructed of combining the FASTA file
for CAT protein (modified to generate 20 versions each with a different
natural or modified amino acid at position 112) with a contaminant
database (cRAP, downloaded 11–21–16 from http://www.thegpm.org) via Proteome Discoverer 2.1. Variable modifications include
oxidation of Met, carboxyamidomethylation of Cys, and phosphorylation
of Ser, Thr, or Tyr. Only peptide spectral matches with an expectation
value of less than 0.01 (“High Confidence”) were used
(Table S6). As a control, wild-type CAT
protein was translated via wild-type EF-Tu and as
expected, only the wild-type amino acid, aspartic acid, was translated
at position 112 (Table S7). CAT protein
translation mediated by variants EF-coli, EF-N63A, EF-K263A, and EF-N273A
were not analyzed using mass spectrometry because protein expression
levels were too low to isolate purified CAT protein.
Growth Curves
For each sample, glycerol freezer stocks
were streaked out. Three colonies were selected from each plate and
grown overnight in 2xYT media. The following day, 5 μL of the
overnight culture was diluted in 195 μL fresh 2xYT media, 2xYT
media supplemented with 2% glucose, or 2xYT media supplemented with
0.5 mM IPTG. These three media stocks contained 2 mM Sep. Samples
were grown 24 h with shaking in a SpectraMax M2e microplate reader
(Molecular Devices) and absorbance was measured (OD600)
at 10.25 min intervals. Three wells with only 200 μL 2xYT media
with 2 mM Sep (supplemented with nothing, glucose, or IPTG) served
as references for absorbance measurements. All liquid cultures were
grown at 30 °C in media supplemented with 25 μg/mL kanamycin
and 10 μg/mL tetracycline. During data analysis, OD600 values were averaged for the three blank reference cells and the
value was subtracted from the corresponding growth curves. Data sets
were normalized on the basis of the cultures’ starting density
with growth curves beginning at OD600 0.1 for t = 0 min.
Authors: Ka-Weng Ieong; Michael Y Pavlov; Marek Kwiatkowski; Anthony C Forster; Måns Ehrenberg Journal: J Am Chem Soc Date: 2012-10-22 Impact factor: 15.419
Authors: Daniel T Rogerson; Amit Sachdeva; Kaihang Wang; Tamanna Haq; Agne Kazlauskaite; Susan M Hancock; Nicolas Huguenin-Dezot; Miratul M K Muqit; Andrew M Fry; Richard Bayliss; Jason W Chin Journal: Nat Chem Biol Date: 2015-06-01 Impact factor: 15.040
Authors: Aditya M Kunjapur; Devon A Stork; Erkin Kuru; Oscar Vargas-Rodriguez; Matthieu Landon; Dieter Söll; George M Church Journal: Proc Natl Acad Sci U S A Date: 2018-01-04 Impact factor: 11.205
Authors: Rey W Martin; Benjamin J Des Soye; Yong-Chan Kwon; Jennifer Kay; Roderick G Davis; Paul M Thomas; Natalia I Majewska; Cindy X Chen; Ryan D Marcum; Mary Grace Weiss; Ashleigh E Stoddart; Miriam Amiram; Arnaz K Ranji Charna; Jaymin R Patel; Farren J Isaacs; Neil L Kelleher; Seok Hoon Hong; Michael C Jewett Journal: Nat Commun Date: 2018-03-23 Impact factor: 14.919
Authors: Seok Hoon Hong; Ioanna Ntai; Adrian D Haimovich; Neil L Kelleher; Farren J Isaacs; Michael C Jewett Journal: ACS Synth Biol Date: 2014-01-02 Impact factor: 5.110
Authors: Petar I Penev; Claudia Alvarez-Carreño; Eric Smith; Anton S Petrov; Loren Dean Williams Journal: PLoS Comput Biol Date: 2021-10-29 Impact factor: 4.475