Nina Bionda, Abby L Cryan, Rudi Fasan. Department of Chemistry, University of Rochester, Rochester, New York 14627, United States.
Abstract
Inspired by the biosynthetic logic of lanthipeptide natural products, a new methodology was developed to direct the ribosomal synthesis of macrocyclic peptides constrained by an intramolecular thioether bond. As a first step, a robust and versatile strategy was implemented to enable the cyclization of ribosomally derived peptide sequences via a chemoselective reaction between a genetically encoded cysteine and a cysteine-reactive unnatural amino acid (O-(2-bromoethyl)-tyrosine). Combination of this approach with intein-catalyzed protein splicing furnished an efficient route to achieve the spontaneous, post-translational formation of structurally diverse macrocyclic peptides in bacterial cells. The present peptide cyclization strategy was also found to be amenable to integration with split intein-mediated circular ligation, resulting in the intracellular synthesis of conformationally constrained peptides featuring a bicyclic architecture.
Methods for generating macrocyclic
peptides, and combinatorial libraries thereof, have attracted considerable
interest owing to the peculiar conformational and molecular recognition
properties of this structural class and their promise toward addressing
challenging drug targets.[1,2] To this end, one approach
has involved the generation and manipulation of natural cyclopeptide
scaffolds via the reconstruction and engineering of their respective
biosynthetic pathways.[3−7] Following an alternative approach, other groups, including ours,
have focused on implementing strategies to enable the macrocyclization
of ribosomally derived polypeptides of arbitrary sequence.[8,9] Particularly attractive features of the latter are their high combinatorial
potential and the possibility to interface the resulting libraries
of constrained peptides with powerful display platforms. Despite significant
contributions in this area,[10−18] these approaches have so far remained largely limited to the production
of macrocyclic peptides in vitro.[8,9] As
a notable exception, there is the split intein-mediated circular ligation
method (SICLOPPS) introduced by Benkovic and co-workers,[19] in which head-to-tail cyclic peptides are generated
via circularization of an internal peptide sequence upon a trans splicing reaction involving flanking domains from
the natural split intein DnaE (Supplementary Figure
S1). Enabling the synthesis of cyclic peptides in living cells,
this approach has provided a powerful tool for the discovery of cyclopeptide
inhibitors against a variety of protein targets upon integration with
an intracellular selection or reporter system.[20−22] A limitation
of this method, however, is that it gives access to only a single type of cyclic
peptide topology (i.e., N-to-C-end
cyclic peptide). Furthermore, cyclization efficiency is largely affected
by the composition of the target peptide sequence.[22−24] Thus, more
general and versatile methods to direct the ribosomal synthesis of
macrocyclic peptides would be highly desirable in order to expand
current capabilities toward exploring and exploiting peptide macrocycles
for drug discovery and chemical biology applications.

To this
end, we developed and report here an efficient strategy
for the production of structurally diverse thioether-linked macrocyclic
peptides in living bacterial cells (E. coli). Inspiration
for this approach was drawn from the ‘logic’ underlying
the biosynthesis of lanthipeptides, a class of ribosomally derived
polycyclic peptides constrained by intramolecular thioether bonds.[25,26] As schematically illustrated in Figure 1A,
these natural products are initially produced as linear precursor
polypeptides via ribosomal synthesis. Recognition of the N-terminal leader sequence by dehydratase enzymes then mediates the
conversion of Ser and Thr residues located within the core peptide
into dehydroalanine (Dha) and dehydrobutyrine (Dhb), respectively.
The signature thioether cross-links are subsequently formed via enzyme-assisted
Michael addition of cysteine sulfhydryl groups onto these electrophilic
α,β-unsaturated amino acid residues. Finally, removal
of the leader peptide by a downstream protease releases the mature
lanthipeptide.[25,26]
Figure 1
(A) Schematic representation
of lanthipeptide biosynthesis (i)
and our bioinspired strategy for ribosomal synthesis of thioether-bridged
macrocyclic peptides (ii) and bicyclic peptides (iii). (B) Strategy
for ribosomal synthesis of thioether-bridged macrocyclic peptides.
The linear precursor polypeptide comprises an N-terminal tail (pink),
the unnatural amino acid O-(2-bromoethyl)-tyrosine
(O2beY), a variable target sequence (green) containing the reactive
cysteine (purple), and GyrA intein (blue). Depending on the nature
of the “I–1” residue (orange), the macrocyclic
peptide can be released in vitro (path A) or directly in vivo (path B).
Inspired by this reaction
sequence, we sought to develop a strategy
for the ribosomal generation of conformationally constrained peptides
through the combination of peptide cyclization via an inter-side-chain
thioether linkage with proteolytic release of the resulting macrocyclic
peptide. Conceivably, achieving this goal in an enzyme-independent
manner would involve the challenge of building in the structural elements
necessary for promoting these transformations directly into the genetically
encoded, ribosomally produced precursor polypeptide. As outlined in
Figure 1A, we reasoned this task could be achieved
via (a) a chemoselective reaction between a cysteine and a ribosomally
incorporated unnatural amino acid bearing a cysteine-reactive side-chain
group and (b) spontaneous release of the macrocyclic peptide by means
of an intein-based protein splicing element.

To implement this idea, we envisioned that the target unnatural
amino acid (UAA) ought to satisfy several criteria. First, its side-chain
electrophilic group should be sufficiently reactive to enable efficient
peptide cyclization upon reaction with a proximal cysteine in intramolecular
settings, but not too reactive in order to avoid side reactions with
competing nucleophiles present in the cellular environment (e.g.,
glutathione). In addition, it should be amenable to protein incorporation
via an aminoacyl-tRNA synthetase (AARS)/tRNA pair with orthogonal
reactivity.[27]

With these considerations
in mind, we decided that O-(2-bromoethyl)-tyrosine
(O2beY, Figure 1B)
could meet the aforementioned requirements, as suggested by its sluggish
reactivity in the cyclization of cysteine-containing peptides in vitro[15] and its structural
similarity to O-propargyl-tyrosine (OpgY), for which
an amber suppressor AARS/tRNA pair was made available.[28] Accordingly, the envisioned approach would entail
cyclization of a precursor polypeptide via a thioether bond-forming
reaction between O2beY and a proximal cysteine, followed by intein-mediated
release of the macrocyclic peptide (Figure 1B).

To evaluate the feasibility of this design, a first series
of constructs
was utilized (entries 1–9, Figure 2A),
which encode 12- to 16-amino-acid-long target peptide sequences
and in which the O2beY and Cys residues are spaced from each other
by an increasing number of intervening residues (i.e., from Z+1 to
Z+12). In the respective genes, an amber stop codon (TAG) was introduced
after the initial Met-Gly to allow for the site-specific incorporation
of O2beY into the target sequence. The latter was then genetically
fused to an engineered variant of Mxe GyrA intein,
whose C-terminal Asn198 was mutated to Ala to abolish
its self-splicing activity while preserving its ability to form a
thioester bond at the N-terminal end via N → S acyl transfer (dotted box,
Figure 1B).
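The construct design described above can be sketched in a few lines of Python: an amber codon (TAG) placed after Met-Gly is decoded as the unnatural amino acid, and the position of the cysteine is then expressed relative to it using the 'Z' notation of Figure 2. This is only an illustrative sketch. The open reading frame below is a hypothetical fragment, not one of the constructs from this study, and the function names are invented for the example.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

UAA = "Z"  # one-letter placeholder for O2beY (assumption: mirrors the 'Z' label)


def translate_with_amber_suppression(dna: str) -> str:
    """Translate an ORF, decoding TAG as the UAA and stopping at TAA or TGA."""
    dna = dna.upper().replace(" ", "")
    peptide = []
    for pos in range(0, len(dna) - 2, 3):
        codon = dna[pos:pos + 3]
        if codon == "TAG":
            # Amber codon is read through by the orthogonal AARS/tRNA pair.
            peptide.append(UAA)
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


def cys_offset(peptide: str) -> int:
    """Position of the first Cys relative to the UAA (positive = downstream)."""
    return peptide.index("C") - peptide.index(UAA)


# Hypothetical ORF fragment: Met-Gly, amber codon, Ser-Asn-Ala-Cys, stop.
orf = "ATG GGT TAG AGC AAC GCG TGC TAA"
peptide = translate_with_amber_suppression(orf)
print(peptide, f"Z{cys_offset(peptide):+d}")  # MGZSNAC Z+4
```

In a real precursor the intein's own N-terminal cysteine would also appear downstream, so a fuller version would need to distinguish the target cysteine from the catalytic one.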
Figure 2
(A) Protein constructs used in this study.
Notes: a) The
residues involved in the thioether linkage are highlighted in green
(O2beY) and red (Cys). Initial methionine is omitted. b) Relative position of the cysteine with respect to the unnatural amino
acid O2beY (= ‘Z’). (B) Dependence of macrocyclization
efficiency on the relative position of the Cys residue with respect
to the unnatural amino acid O2beY (‘Z’). Proteins were
isolated after expression for 12 h at 27 °C (see Supporting Information for details).
To identify a viable AARS/tRNA pair for O2beY incorporation
into
these constructs, we initially tested the Methanocaldococcus
jannaschii tyrosyl-tRNA synthetase variant previously evolved
for recognition of the structurally related OpgY (OpgY-RS).[28] This choice was motivated by the ‘polyspecificity’
often exhibited by orthogonal AARSs toward related UAA structures.[29,30] Gratifyingly, using OpgY-RS, O2beY could be successfully incorporated
into a model protein consisting of Yellow Fluorescent Protein (YFP)
with an N-terminal amber stop codon (Figure 3B). To improve its efficiency toward O2beY incorporation,
OpgY-RS was subjected to further mutagenesis. To this end, a homology
model of this enzyme was first generated on the basis of the available
crystal structure of MjTyr-RS in complex with its
native substrate, tyrosine.[31] Then, the
unnatural amino acid O2beY was docked into the enzyme active site
(Figure 3A). Inspection of the model suggested
that an Ala32Gly mutation would expand the active site cavity to better
accommodate the 2-bromo-ethoxy group in O2beY. Rewardingly, the resulting
AARS variant (called O2beY-RS) was found to enable the ribosomal incorporation
of O2beY with significantly higher efficiency compared to OpgY-RS,
while maintaining discriminating selectivity against the natural amino
acids (Figure 3B). Comparison of the expression
yields of YFP(O2beY) versus wild-type YFP indicated that the efficiency
of amber stop codon suppression with O2beY-RS was excellent (85%).
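The suppression efficiency quoted above is a ratio of expression yields. A minimal sketch of that calculation follows. The yield values are hypothetical placeholders chosen to reproduce the reported 85% and are not measurements from this study.

```python
def suppression_efficiency(yield_uaa: float, yield_wild_type: float) -> float:
    """Yield of the amber-suppressed protein as a percentage of wild type."""
    if yield_wild_type <= 0:
        raise ValueError("wild-type yield must be positive")
    return 100.0 * yield_uaa / yield_wild_type


# Hypothetical yields in identical units (for example, mg per litre of culture).
print(suppression_efficiency(34.0, 40.0))  # 85.0
```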
Figure 3
Aminoacyl-tRNA
synthetase for O2beY incorporation. (A) Model of
OpgY-RS active site as generated based on the crystal structure of MjTyr-RS in complex with tyrosine (PDB code 1J1U). The bound substrate
(Tyr, yellow), docked O2beY (green), and mutated amino acid residues (red)
are highlighted. The van der Waals sphere of the Ala32 residue is
shown as a dotted sphere. (B) YFP-based screen indicating the relative
efficiency of OpgY and O2beY incorporation via amber stop codon suppression
with OpgY-RS and O2beY-RS. (C) MS spectrum of YFP protein incorporating
O2beY (YFP(O2beY)).
With a suitable AARS/tRNA
pair for O2beY incorporation in hand,
the constructs corresponding to entries 1–9 (Figure 2A) were produced in E. coli BL21(DE3)
cells using a dual plasmid system (see Supporting
Information for details). To better examine both the occurrence
and efficiency of the thioether bond-forming reaction according to
the strategy of Figure 1B, the aforementioned
constructs were designed to contain a Thr residue at the position
preceding the intein (‘I–1 site’). This substitution
minimizes premature hydrolysis of GyrA-fusion proteins during expression,[14,32] thereby facilitating analysis of the target peptide sequences after
chemically induced splicing of the intein from the purified proteins in vitro (Figure 1B, path A). This
procedure would also permit the isolation of any product resulting
from the unselective reaction of O2beY with other nucleophiles in vivo. Accordingly, after purification, the proteins were
treated with benzyl mercaptan to release the N-terminal
peptides. The reaction mixtures were then analyzed by LC–MS
to detect and quantify the amount of the desired thioether-linked
macrocyclic product as well as that of the uncyclized linear peptide,
as judged on the basis of the peak areas in the corresponding extracted-ion
chromatograms (see Figure 4A and Supplementary Figures S3–S10). As summarized
in Figure 2B, these studies revealed that the
macrocyclization had occurred with very high efficiency (80–95%)
across the constructs with Cys and O2beY being separated by one (Z+2)
to seven (Z+8) residues. Increasing
this distance (i.e., Cys at Z+10 and Z+12, entries 8 and 9 in Figure 2A) resulted in a noticeable increase of the acyclic
product (50–80%, Figure 2B), thus defining
the upper limits for the macrocycle size accessible through this method.
Interestingly, when the Cys was located immediately adjacent to the
unnatural amino acid (entry 1, Figure 2A),
minimal cyclization (5%) was observed. A similar lack of reactivity
was observed by Suga and co-workers in the context of in vitro translated peptides containing a cysteine-reactive N-terminal 2-chloroacetyl moiety,[33] and
this result can be rationalized here on the basis of the unfavorable
14-membered macrocycle formed when the O2beY/Cys pair is in an i/i+1 relationship. For each construct
tested, the identity of the macrocyclic product could be further confirmed
by analysis of the corresponding MS/MS fragmentation spectrum as illustrated
in Supplementary Figure S2.
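Two quantities from this analysis lend themselves to a short worked example: the cyclization efficiency estimated from extracted-ion chromatogram peak areas, and the ring size as a function of the Cys/O2beY spacing. The sketch below rests on stated assumptions. The efficiency is taken as the cyclic peak area over the sum of cyclic and linear areas, which presumes comparable ionization of the two species. The ring size starts from the 14-membered ring of the i/i+1 case and adds three backbone atoms (N, Cα, C) per intervening residue, which is our own counting convention rather than a figure reported here. The peak areas are hypothetical.

```python
def cyclization_efficiency(area_cyclic: float, area_linear: float) -> float:
    """Percent macrocycle from extracted-ion chromatogram peak areas."""
    total = area_cyclic + area_linear
    if total <= 0:
        raise ValueError("at least one peak area must be positive")
    return 100.0 * area_cyclic / total


def ring_size(offset: int) -> int:
    """Ring atoms in the thioether macrocycle for Cys at Z+offset or Z-offset.

    Assumption: 14 atoms for adjacent residues, plus 3 backbone atoms
    (N, C-alpha, C) for every residue between O2beY and Cys.
    """
    spacing = abs(offset)
    if spacing < 1:
        raise ValueError("Cys and O2beY must occupy different positions")
    return 14 + 3 * (spacing - 1)


# Hypothetical peak areas (arbitrary units) for one construct.
print(cyclization_efficiency(9.0e6, 1.0e6))  # 90.0
for z in (1, 2, 6, 8, 12):
    print(f"Z+{z}: {ring_size(z)}-membered ring")
```

Under this convention the well-behaved Z+2 to Z+8 constructs would correspond to 17- to 35-membered rings, and the poorly cyclizing Z+10 and Z+12 constructs to 41- and 47-membered rings.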
Figure 4
Chemical structure
(top), LC–MS extracted-ion chromatogram
(center left), and MS/MS fragmentation spectrum (bottom) of representative
macrocyclic peptides prepared from construct 12mer-Z6C (A), Strep1-Z5C
(B), and cStrep3(C)-Z3C (C). MS spectra of the precursor proteins
after affinity purification and thiol-induced splicing (12mer-Z6C)
and directly from cell lysate (Strep1-Z5C and cStrep3(C)-Z3C) are
also shown (center right). Data for the other constructs can be found
in Supplementary Figures S2–S19.
Importantly,
GyrA intein contains a Cys at its N-terminal end,
which is crucial for mediating protein splicing in
the context of our planned strategy for producing these peptide macrocycles
inside the cells (Figure 1B). Since this residue
is partially buried within the active site,[34] we did not expect it to readily react with the O2beY side-chain.
Notably, quantitative splicing of the GyrA moiety upon treatment of
all these constructs with benzyl mercaptan (Figure 4A and Supplementary Figure S17)
indicated that no reaction occurred between O2beY and the catalytic
Cys at the intein I+1 site. Furthermore, no adducts or dimers were
observed for any of the constructs described above, including those
undergoing only partial cyclization (i.e., entries 8 and 9, Figure 2A). Altogether, these results evidenced the high
chemo- and regioselectivity of the macrocyclization reaction.

In the interest of determining
whether the thioether bond-forming
reactivity is preserved if the order of Cys and O2beY is reversed,
the two constructs corresponding to entries 10 and 11 in Figure 2A were prepared. Here, the reactive Cys is located
upstream of the unnatural amino acid and specifically at positions
Z-6 and Z-8. Analysis of these constructs according to the procedure
described above (Supplementary Figures S11 and
S12) revealed the occurrence of the desired cyclic peptide
as the largely predominant product (>99%), i.e., with comparable
(Z-6)
or even higher (Z-8) efficiency than the corresponding Z+6 and Z+8
counterparts (Figure 2B). These data clearly
showed that either arrangement of the Cys/O2beY pair within the target
sequence is compatible with macrocyclization.

Having established
the versatility of the approach toward obtaining
structurally diverse macrocyclic peptides either linked to the N-terminus of a protein or in isolated form after intein
splicing in vitro, we next investigated whether this
strategy could be further evolved to permit the production of macrocyclic
peptides in vivo. In previous studies,[18] we established that certain amino acid substitutions
at the level of the I–1 site, and in particular Asp and Lys,
can strongly promote N-terminal splicing of GyrA
intein during recombinant expression. This effect is likely due to
the ability of these residues to favor hydrolysis of the intein-catalyzed
thioester linkage through their nucleophilic side-chain groups. While
undesirable in the context of our MOrPH synthesis methodologies,[18] we envisioned this reactivity could be leveraged
here for mediating the spontaneous release of the macrocyclic peptide
from the precursor protein as outlined in Figure 1B (path B). To test this idea, the constructs corresponding
to entries 12 and 13 in Figure 2A were generated.
In addition to an Asp residue at the I–1 site, these constructs
were designed to encompass the sequence of two streptavidin-binding
peptides previously isolated by phage display[35,36] as a way to facilitate the isolation of the target macrocyclic peptides
from the cells. Accordingly, after expression of these constructs
in E. coli, cells were lysed and the cell lysates
were passed over streptavidin-coated agarose beads. To our delight,
LC–MS analysis of the eluates revealed the occurrence of the
expected peptide macrocycles, as illustrated by the chromatograms
and MS/MS spectra in Figure 4B and Supplementary Figures S13 and S14. Since the
uncyclized peptide could also be captured through this procedure,
these analyses also showed that the desired macrocyclic product was
formed with high efficiency in each case (i.e., >95% for Strep1-Z5C;
70% for Strep2-Z7C). Furthermore, both precursor polypeptides were
found to have undergone complete splicing in vivo (Figure 4B and Supplementary
Figure S18A,B). Since O2beY-mediated alkylation of the intein
catalytic cysteine would prevent protein splicing, the latter results
further highlighted the high degree of regioselectivity of the macrocyclization
reaction. Collectively, the results obtained with the streptavidin-binding
sequences of constructs Strep1-Z5C and Strep2-Z7C provide a proof-of-principle
demonstration of the feasibility and efficiency of the strategy of
Figure 1B for directing the synthesis of cyclopeptides
in living bacterial cells. Interestingly, the cyclization yield observed
with these sequences correlated very well with the reactivity trend
measured across the previous constructs (Figure 2B), suggesting that this parameter is largely predictable on the sole
basis of the Cys/O2beY distance, despite the difference in the
composition of the target peptide sequence.

These positive results
prompted us to test whether our bioinspired
approach could be further extended to enable the ribosomal synthesis
of bicyclic peptides via the integration of O2beY/Cys-mediated macrocyclization
with split intein-catalyzed circular ligation.[19] If viable, this second strategy (Figure 1A(iii)) could provide the complementary capability of generating
macrocyclic peptides that are constrained by means of an N-to-C-end cyclic backbone and an intramolecular
inter-side-chain thioether linkage. Implementation of this design
presented the challenge that two cysteine residues are involved in
the trans splicing process leading to the head-to-tail
cyclopeptide (referred to as IntC+1 and IntN+1 cysteine; see Supplementary Figure S1), which could potentially cross-react with O2beY. However, on the
basis of the reactivity studies described in Figure 2B, we envisioned this challenge could be tackled by placing
the unnatural amino acid in i/i+1
arrangement with respect to the IntC+1 cysteine and by
placing the reactive cysteine at a closer distance to O2beY compared
to the IntN+1 cysteine, as schematically outlined in Figure 5.
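The two design principles stated above (an adjacent cysteine is essentially unreactive, and a nearer cysteine is favored over a more distant one) can be written as a simple selection rule. The sketch below is only a heuristic restatement of those principles, not a validated reactivity predictor, since real outcomes also depend on conformation and residue accessibility. The sequence is a hypothetical example, with 'Z' standing for O2beY, and the function name is invented for illustration.

```python
from typing import Optional


def predicted_partner(sequence: str, uaa: str = "Z") -> Optional[int]:
    """Index of the Cys expected to pair with the UAA, or None if there is none.

    Heuristic: ignore cysteines directly adjacent to the UAA (i/i+1), then
    pick the closest remaining cysteine; ties go to the lower index.
    """
    z = sequence.index(uaa)
    candidates = [
        (abs(i - z), i)
        for i, residue in enumerate(sequence)
        if residue == "C" and abs(i - z) > 1
    ]
    return min(candidates)[1] if candidates else None


# Hypothetical precursor segment: first Cys mimics the IntC+1 cysteine,
# last Cys mimics the IntN+1 cysteine, and the middle Cys sits at Z+3.
segment = "CZGSCHPQFANAC"
partner = predicted_partner(segment)
print(partner, f"Z{partner - segment.index('Z'):+d}")  # 4 Z+3
```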
Figure 5
Schematic representation of the strategy for in
vivo synthesis of bicyclic peptides via combination of O2beY/Cys
cyclization
and split intein-mediated circular ligation.
According to these design principles, the cStrep3(C)-Z3C
construct
was prepared (entry 14, Figure 2A). A related
construct where the IntC+1 cysteine is replaced with serine
(entry 15, Figure 2A) was also prepared, as
this substitution is compatible with split intein-catalyzed peptide
cyclization.[23] In each case, the target
peptide sequences were designed on the basis of a previously described
HPQ motif-containing cyclopeptide capable of binding streptavidin[37] in order to facilitate isolation of the peptide
products from the cells. Lysates from E. coli cells
expressing these constructs were processed as described above for
the other streptavidin-binding peptides. Notably, the desired bicyclic
peptide was isolated as the largely predominant product in both cases
(>95%), as determined by LC–MS. The bicyclic structure of
these
compounds was further evidenced by the corresponding MS/MS fragmentation
spectra (Figure 4C and Supplementary Figures S15 and S16). Treatment of the bicyclic
peptides with the thiol-alkylating iodoacetamide resulted in a 57
Da increase in molecular mass and shift of the peptide retention time
for the product of the cStrep3(C)-Z3C precursor protein but not for
that of cStrep3(S)-Z3C, which is consistent with the presence of a
free thiol (from the IntC+1 cysteine) in the former but
not in the latter. To allow measurement of the extent of post-translational
self-processing of these precursor polypeptides in vivo, a chitin-binding domain was included at the C-terminus
of the IntN domain (Figure 2A).
LC–MS analysis of the protein fraction eluted from chitin beads
showed that the split intein-mediated cyclization had occurred quantitatively
with both cStrep3(C)-Z3C (Figure 4C) and its
serine-containing counterpart (Supplementary Figure
S18D). Altogether, these results demonstrate the possibility
of integrating our peptide cyclization strategy with split intein-mediated
protein circularization to enable the formation of bicyclic peptides
in E. coli.

In conclusion, we have developed
two complementary, versatile methodologies
to direct the production of ‘natural product-like’ macrocyclic
peptides constrained by an intramolecular thioether bridge in bacterial
cells. A key feature of these methods is that the structural elements
and reactivity driving the peptide macrocyclization process are built
into the genetically encoded polypeptide precursor. Using this approach,
peptide macrocycles of various ring size and composition could be
prepared efficiently and with predictable regioselectivity. Importantly,
the possibility to generate these macrocyclic peptides in protein-fused
or isolated form makes the present methodologies directly amenable
to integration with well-established display platforms (e.g., phage[38] or mRNA display[39]) or intracellular selection systems,[20,21] respectively,
for library screening. We also anticipate that the strategies outlined
in Figures 1B and 5 can
be of rather general value, that is, applicable to other cysteine-reactive
unnatural amino acids. Work in this direction is currently ongoing
in our group, which includes exploring the possibility of extending
this approach to the synthesis of conformationally constrained peptides
in eukaryotic cells.[40]