Thibault Annaval1, Jeffrey D Rudolf1, Chin-Yuan Chang1, Jeremy R Lohman1, Youngchang Kim2, Lance Bigelow2, Robert Jedrzejczak2, Gyorgy Babnigg2, Andrzej Joachimiak2,3, George N Phillips4, Ben Shen1. 1. Department of Chemistry, Department of Molecular Medicine, and Natural Products Library Initiative at The Scripps Research Institute, The Scripps Research Institute, 130 Scripps Way, Jupiter, Florida 33458, United States. 2. Midwest Center for Structural Genomics and Structural Biology Center, Biosciences Division, Argonne National Laboratory, 9700 South Cass Avenue, Argonne, Illinois 60439, United States. 3. Department of Biochemistry and Molecular Biology, University of Chicago, 929 E. 57th Street, W225, Chicago, Illinois 60637, United States. 4. BioSciences at Rice and Department of Chemistry, Rice University, 6100 Main Street, Houston, Texas 77251, United States.
Abstract
Enediynes are potent natural product anticancer antibiotics, and are classified as 9- or 10-membered according to the size of their enediyne core carbon skeleton. Both 9- and 10-membered enediyne cores are biosynthesized by the enediyne polyketide synthase (PKSE), thioesterase (TE), and PKSE-associated enzymes. Although the divergence between 9- and 10-membered enediyne core biosynthesis remains unclear, it has been observed that nascent polyketide intermediates, tethered to the acyl carrier protein (ACP) domain of PKSE, could be released by TE in the absence of the PKSE-associated enzymes. In this study, we determined the crystal structure of SgcE10, the TE that participates in the biosynthesis of the 9-membered enediyne C-1027. Structural comparison of SgcE10 with CalE7 and DynE7, two TEs that participate in the biosynthesis of the 10-membered enediynes calicheamicin and dynemicin, respectively, revealed that they share a common α/β hot-dog fold. The amino acids involved in both substrate binding and catalysis are conserved among SgcE10, CalE7, and DynE7. The volume and the shape of the substrate-binding channel and active site in SgcE10, CalE7, and DynE7 confirm that TEs from both 9- and 10-membered enediyne biosynthetic machineries bind the linear form of similar ACP-tethered polyene intermediates. Taken together, these findings further support the proposal that the divergence between 9- and 10-membered enediyne core biosynthesis occurs beyond PKSE and TE catalysis.
Enediyne natural products
are some of the most potent anticancer
antibiotics known to date.[1] Enediynes are
classified as 9- or 10-membered according to the size of their enediyne
cores, which are carbocyclic rings consisting of two acetylenic groups
conjugated to a double bond or incipient double bond.[2] The enediyne cores are biosynthesized by the enediyne polyketide
synthase (PKSE), thioesterase (TE), and PKSE-associated enzymes.[3] The iteratively acting PKSE, responsible for
initiating enediyne core biosynthesis, is composed of six domains:
ketosynthase (KS), acyltransferase (AT), acyl carrier protein (ACP),
ketoreductase (KR), dehydratase (DH), and phosphopantetheinyl transferase
(PPTase).[4] These domains catalyze the decarboxylative
condensation of one acetyl-CoA and (minimally) seven malonyl-CoA,
as well as associated reductive modifications of the resulting β-ketone
groups, to form the ACP-tethered polyketide intermediates for enediyne
core biosynthesis. Convergent attachment of the various peripheral
moieties to the enediyne core completes enediyne biosynthesis (Figure 1). Although it is
not known if the PKSE-associated enzymes act on the ACP-tethered or
discrete forms of the polyketide intermediates to yield the 9- or
10-membered enediyne core, the ACP-tethered intermediates can be released
by the TEs as discrete polyene products in the absence of the PKSE-associated
enzymes.[5] Although the biosynthesis of
the peripheral moieties has been extensively studied and is well established,[6,7] it remains unclear how the enediyne cores are constructed and what
controls the divergence between 9- and 10-membered enediyne core biosynthesis.
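The carbon bookkeeping behind the chain assembly described above can be sketched in a few lines. This is a minimal illustration, assuming that the acetyl starter contributes two carbons, that each decarboxylative condensation with malonyl-CoA adds two more, and that a later decarboxylation removes one; the function name `polyketide_carbons` is hypothetical and not taken from the original work. Under these assumptions, one acetyl-CoA plus seven malonyl-CoA gives a 16-carbon chain, and loss of one carbon is consistent with the 15-carbon heptaene 4 discussed in the text.

```python
def polyketide_carbons(n_malonyl: int, decarboxylated: bool = False) -> int:
    """Count carbons in a linear chain built from one acetyl starter
    and n_malonyl malonyl extender units (illustrative bookkeeping only)."""
    if n_malonyl < 0:
        raise ValueError("n_malonyl must be non-negative")
    starter_carbons = 2        # acetyl-CoA contributes two carbons
    carbons_per_extension = 2  # malonyl-CoA adds two carbons after losing CO2
    total = starter_carbons + carbons_per_extension * n_malonyl
    # Optional terminal decarboxylation removes one carbon from the chain.
    return total - 1 if decarboxylated else total


print(polyketide_carbons(7))                       # 16
print(polyketide_carbons(7, decarboxylated=True))  # 15
```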
Figure 1
Proposed
biosynthesis of enediyne natural products, featuring the
PKSE, TE, and PKSE-associated enzymes for both 9- and 10-membered
enediyne cores and convergent attachment of the peripheral moieties
to the enediyne cores. (A) Structures of selected 9-membered enediyne
C-1027 (1) and 10-membered enediynes calicheamicin (2) and dynemicin (3). (B) PKSE–TE pair
as a model system for enediyne core biosynthesis. In vivo expression
of cognate or mismatched PKSE–TE pairs from selected 9- and
10-membered enediyne biosynthetic machineries all produced 4 as the dominant product (path a). In vitro reconstitution of PKSE–TE
pairs from 9- and 10-membered enediyne biosynthetic machineries yielded
varying products, as exemplified by SgcE–SgcE10 for the 9-membered
enediyne 1 producing 4, 5,
and 6, CalE8–CalE7 for the 10-membered enediyne 2 producing 4, 7, and 8, and DynE8–DynE7 for the 10-membered enediyne 3 producing 4 and 7, respectively (path
b).
The exact inventory of the PKSE-associated
enzymes and their functions
in morphing the nascent polyketide intermediates into the 9- or 10-membered
enediyne cores remain elusive.[8] Efforts
to understand enediyne core biosynthesis are currently limited to
using the PKSE–TE pair as a model system (Figure 1B).[9−13] Thus, coexpression of cognate pairs of PKSE–TE
from biosynthetic machineries of either 9- or 10-membered enediynes,
including the 9-membered enediyne C-1027 (1) and 10-membered
enediynes calicheamicin (2) and dynemicin (3), in both heterologous hosts and known enediyne producers all produced
the same polyketide, 1,3,5,7,9,11,13-pentadecaheptaene (4), as the dominant product (Figure 1B, path a).[9,10] PKSEs and TEs from
different 9- and 10-membered enediyne biosynthetic machineries were
found to be freely interchangeable, affording functional PKSE–TE
pairs without any apparent effect on 4 production.[10] These results led to the conclusion that PKSE
and TE catalysis does not direct biosynthetic divergence between 9-
and 10-membered enediyne cores. In contrast, in vitro reconstitution
of PKSE–TE pairs from 9- or 10-membered enediyne biosynthetic
machineries yielded varying products. Although 4 was
the common metabolite produced by PKSE–TE pairs from both 9-
and 10-membered enediyne biosynthetic machineries, additional products
were also characterized, the exact profile of which seemed to be assay-condition
dependent.[11−13] For example, the PKSE–TE pair of SgcE–SgcE10
from the biosynthetic machinery of the 9-membered enediyne 1 produced a nonaketide lactone (5) as the main product
when SgcE was assayed in the absence or in a low molar ratio of SgcE10.
When SgcE was assayed in an excess of SgcE10, 5 and a
nonaketide β-hydroxy-ketone (6) were observed as
the major products, with concomitant accumulation of an increasing
amount of 4 (Figure 1B, path b).[11] When the PKSE–TE
pair of CalE8–CalE7 from the biosynthetic machinery of the
10-membered enediyne 2 was similarly assayed, 4 and the related methyl ketone (7) were consistently
generated as the major products, with concomitant accumulation of
the corresponding hexaketide β-hydroxy-ketoacid (8) as a minor product (Figure 1B, path b).[12] The PKSE–TE
pair of DynE8–DynE7 from the biosynthetic machinery of the
10-membered enediyne 3 resulted in the production of 4 and several minor products including 7 and
other compounds whose structures were not determined (Figure 1B, path b).[13] These findings argued that divergence between 9- and 10-membered
enediyne core biosynthesis may begin at the stage of PKSE and TE catalysis.[11]
The crystal structures of CalE7 and DynE7,
two TEs from the biosynthetic
machineries of the 10-membered enediynes 2 and 3, have been reported.[14,15] TEs for enediyne core
biosynthesis are inherently different from the TEs that are known
to catalyze the hydrolytic release of ACP-tethered polyketide intermediates
(i.e., type I TEs) or aberrant products (i.e., type II TEs) for reduced
polyketide biosynthesis.[14,15] Type I TEs are domains
of type I polyketide synthases, whereas type II TEs are discrete proteins.[16,17] Type I and type II TEs share moderate sequence homologies, are monomeric
or dimeric, and are members of the well-known α/β hydrolase
structural family.[17−19] In contrast, the TEs for both 9- and 10-membered
enediyne core biosynthesis are discrete proteins. They have low sequence
homologies with both type I and type II TEs.[15] Both CalE7 and DynE7 are tetrameric hot-dog fold enzymes.[14,15] On the basis of their sequence similarity and shared hot-dog folds,
the TEs for enediyne core biosynthesis are reminiscent of the acyl-CoA
TEs, but with distinct catalytic residues.[14,15] Whereas the acyl-CoA TEs typically have a catalytic Asp or Glu,[16,17] the TEs from 9- or 10-membered enediyne biosynthetic machineries
possess a conserved Arg, which is implicated in catalysis;[14,15] type I and type II TEs for reduced polyketide biosynthesis share
a conserved Ser/His/Asp catalytic triad.[18]
No structures of TEs from 9-membered enediyne biosynthetic
machineries
have been reported. Structural comparison between TEs from 9- and
10-membered enediyne biosynthetic machineries could provide a molecular
basis to reconcile the differences in product formation observed from
in vivo and in vitro studies of selected PKSE–TE pairs for
9- and 10-membered enediyne core biosynthesis. Here, we report the
crystal structure of SgcE10, the TE from the 9-membered enediyne biosynthetic
machinery of 1 from Streptomyces globisporus, revealing that the sequences, structures, oligomerization state,
substrate-binding tunnel, active site, and catalytic residues are
highly conserved among TEs for 9- (i.e., SgcE10) and 10-membered (i.e.,
DynE7 and CalE7) enediyne core biosynthesis. These results further
support the proposal that the divergence between 9- and 10-membered
enediyne core biosynthesis occurs beyond PKSE and TE catalysis.
Results
Phylogenetic Analysis Revealing Local, but not Global, Differentiation of TEs from 9- and 10-Membered Enediyne Biosynthetic Machineries
Previous bioinformatics analysis of five PKSE–TE pairs (i.e.,
MdpE–E10, NcsE–E10, and SgcE–E10 for 9-membered
enediyne cores and CalE8–E7 and DynE8–E7 for 10-membered
enediyne cores) revealed that PKSEs and TEs clade together according
to the ring size of their enediyne core.[10] We recently identified 87 enediyne gene clusters, including the
gene clusters that encoded the biosynthesis of 10 known enediyne natural
products, through a virtual screen of publicly available genome databases.[20] We subsequently completed a genome survey of
3400 actinomycetes from The Scripps Research Institute Actinomycetales
Collection, discovering 81 additional enediyne producers.[21] Genome sequencing of 31 representatives of the
81 new enediyne producers confirmed that each harbored a distinct
enediyne biosynthetic gene cluster. An enediyne genome neighborhood
network was constructed that allowed prediction of 9- or 10-membered
enediyne biosynthesis based on the exclusivity of E2 or R3 genes within the enediyne biosynthetic gene
clusters. Enediyne discovery was demonstrated by the isolation and
structural characterization of tiancimycin, a new 10-membered enediyne
natural product.[21] With the significantly
increased number of PKSE–TE pairs now available, we first sought
to determine whether the classification of TEs according to their
participation in 9- or 10-membered enediyne core biosynthesis still
held true.
We constructed a phylogenetic tree using 101 TEs.
This included the 11 TEs from the seven 9- and four 10-membered known
enediyne biosynthetic machineries and the 60 and 30 TEs from the putative
enediyne gene clusters that were predicted to encode the biosynthesis
of 9- or 10-membered enediynes, respectively.[20,21] The TEs range in sequence identity from 38 to 100%. SgcE10 shares
45 and 67% sequence identities with DynE7 and CalE7, respectively,
the two TEs from the 10-membered enediyne core biosynthetic machineries
whose crystal structures have been determined. DynE7 and CalE7 only
share 49% sequence identity, less than that between SgcE10 and CalE7.
As with the previous phylogenetic analysis of five TEs,[10] the 11 TEs from the biosynthetic machineries
of known enediynes separate into different clades (Figure 2; see also Figure S1 for a similar analysis of PKSEs). In
general, the TEs from the 9- or 10-membered enediyne biosynthetic
machineries tend to clade together, although most clades are not exclusive.
However, there is no clear overall trend where TEs from the 9- or
10-membered enediyne biosynthetic machineries segregate into universal
clades (Figure 2).
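As a rough illustration of how pairwise identities such as those quoted above are obtained from an alignment, the sketch below counts identical residues over aligned, ungapped columns. It is not the procedure used in this study: the function name `percent_identity`, the choice to exclude gapped columns from the denominator, and the toy sequences are all assumptions made for the example, and other tools may normalize differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two rows of an alignment.

    Columns where either sequence has a gap ('-') are skipped, so the
    denominator is the number of aligned residue pairs (one convention
    among several in common use).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # ignore gapped columns
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


# Toy, hypothetical alignment rows (not real TE sequences).
print(percent_identity("ACDE-F", "ACDQGF"))  # 80.0
```

Because the denominator convention changes the result, identities computed this way are only comparable when the same convention is used for every pair.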
Figure 2
Phylogenetic
tree of the 101 TEs from confirmed and predicted 9-
and 10-membered enediyne biosynthetic machineries. The TEs from 9-membered
enediyne (red) and 10-membered enediyne (blue) biosynthetic machineries
show local, but not global, differentiation. Except for the 11 known
enediynes (denoted by asterisks), the core size of each enediyne is
predicted.[20] The three TEs, SgcE10, DynE7,
and CalE7 (denoted by black boxes), whose crystal structures have
been determined are marked with an asterisk. The sequence alignment
was made with ClustalW, and the phylogenetic tree was constructed
using the Maximum Likelihood method in MEGA 7.0.18.[38]
Despite the differences seen in
the phylogenetic analysis, alignment
of the TE sequences from the 11 biosynthetic machineries of structurally
characterized enediynes revealed sequence conservation of key residues
involved in the binding and catalysis of the ACP-tethered polyene
intermediates for TEs from both 9- and 10-membered enediyne biosynthetic
machineries (Figure 3). On the basis of the structures of CalE7 and DynE7, Phe42, Leu43,
Leu60, and Leu137 (numbered based on SgcE10) form the hydrophobic
pocket that binds the linear polyene intermediate 4,
and Tyr27, Arg39, and Glu40 form a catalytic triad-like motif; these
residues are found in all of the 101 TEs.[14,15] However, no specific residue, conserved only for TEs from the 9-
or 10-membered enediyne biosynthetic machineries, is apparent to allow
for simple sequence-based classification.
Figure 3
Sequence alignment of
the 11 TEs from the seven 9-membered (CyaE10,
CynE10, KedE10, MdpE10, NcsE10, SgcE10, and SpoE10) and four 10-membered
(CalE7, DynE7, UcmE10, and TnmE10) enediyne biosynthetic machineries.
Residues are colored on the basis of the level of conservation (red
box with white character shows strict identity, red character similarity,
and blue frame similarity across groups). The corresponding secondary
structure of SgcE10 is depicted above the sequence alignment. Key
residues, involved in substrate binding (denoted by carets) and catalysis
(denoted by asterisks) as deduced on the basis of the CalE7 and DynE7
structures,[14,15] are conserved for all of the
TEs.
Overall Structure of SgcE10 Establishing a Common Fold of TEs for both 9- and 10-Membered Enediyne Core Biosynthesis
To
determine whether the structures of the TEs from the 9- and 10-membered
enediyne biosynthetic machineries are different, we next solved the
crystal structure of SgcE10, the TE from the 9-membered enediyne biosynthetic
machinery of 1 from S. globisporus. The best crystals were obtained from the chymotrypsin-digested
SgcE10 at 16 °C in 0.2 M ammonium tartrate dibasic containing
20% PEG3350. The crystals grew within 1 week and reached sizes of
approximately 0.10 mm × 0.02 mm × 0.01 mm. Diffraction analysis
revealed that the crystals belong to the P2₁ space group with cell dimensions of a = 57.42 Å, b = 184.53 Å, c = 59.40 Å, and
β = 95.6°. The structure was solved by Se-SAD with phasing
from three selenomethionines to a resolution of 2.80 Å. The refinement
was performed until the structure converged to an Rwork of 0.211, an Rfree of
0.270, and a root-mean-square deviation (RMSD) for bond distances
and angles of 0.002 Å and 0.604°, respectively. The final
model included the protein residues 6–145 of the chains A–E,
6–147 of chain F, 5–146 of chain G, and 5–145
of chain H. There are three tartrates, two glycerols, one ethylene
glycol, and 95 ordered water molecules found in the structure. Several
residues in both the C- and N-termini were disordered and not modeled.[22] Data collection and refinement statistics are
summarized in Table 1.
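The CC* value listed in Table 1 follows directly from the reported CC1/2 through the relation given in the table footnote, CC* = (2CC1/2/(1 + CC1/2))^(1/2). The short sketch below simply evaluates that relation; the function name `cc_star` is a hypothetical helper introduced for illustration.

```python
import math


def cc_star(cc_half: float) -> float:
    """Evaluate CC* = sqrt(2 * CC1/2 / (1 + CC1/2))."""
    if not 0.0 <= cc_half <= 1.0:
        raise ValueError("CC1/2 is expected to lie between 0 and 1 here")
    return math.sqrt(2.0 * cc_half / (1.0 + cc_half))


# CC1/2 for the highest resolution shell, as listed in Table 1.
print(round(cc_star(0.659), 3))  # 0.891
```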
Table 1
Data Collection and Refinement Statistics

data set                                   SgcE10
space group                                P2₁
unit cell (Å)                              a = 57.42, b = 184.53, c = 59.40, β = 95.60°
wavelength (Å)                             0.97931
highest resolution bin (Å)                 2.85–2.78
number of observed reflections             30 094 (1549)ᵃ
Rmerge (%)ᵇ                                12.0 (72.7)ᵃ
completeness (%)                           99.5 (100)ᵃ
I/σ                                        10.4 (1.9)ᵃ
CC1/2 (highest resolution shell)           0.659
CC*ᶜ                                       0.891

phasing
phasing method                             SAD
phasing resolution range (Å)               49.8–2.78
number of SeMet                            24
overall figure of merit                    0.72

refinement                                 Phenix.refine/refmac5.5
refinement resolution range (Å)            39.16–2.78
Rwork (%)                                  21.1
Rfree (%)                                  27.0
number of protein residues (model/total)   1125/1272
solvent molecules                          95
bond lengths (Å)                           0.002
bond angles (deg)                          0.604
average B-factors (Å²)                     36.0
Wilson B-factor (Å²)                       69.07

Ramachandran plot (%)
preferred                                  95.5
disallowed                                 0.3

PDB ID                                     4I4J

ᵃ Numbers in parentheses are from the highest resolution bin.
ᵇ $R_{\mathrm{meas}} = \sum_{hkl} \sqrt{N/(N-1)} \sum_{i} |I_i(hkl) - \bar{I}(hkl)| \,/\, \sum_{hkl} \sum_{i} I_i(hkl)$.
ᶜ $CC^{*} = \sqrt{2\,CC_{1/2}/(1 + CC_{1/2})}$.

SgcE10 forms
a homotetramer with 222 symmetry (Figure 4A). This agrees well with its
tetrameric state in solution, as determined by size-exclusion chromatography
(Figure S2). Each monomer has an α/β
hot-dog fold that is composed of one β-sheet consisting of five
antiparallel strands surrounded by three α-helices (Figure 4B). The four monomers
constituting the SgcE10 homotetramer are nearly identical. Structural
alignments of monomer A with monomers B–D exhibit RMSD values
of 0.3, 0.5, and 0.3 Å, respectively (Figure 4C). Only the N-terminal and C-terminal extremities
and the loop at the entrance of the active site are not superimposed
for the four monomers, and these three regions have correspondingly
high B-factor values (Figure 4D). The SgcE10 homotetramer is a dimer of dimers, with each
dimer composed of 10 antiparallel strands surrounding the two central
α1 helices (Figure 4A). Each dimer is formed by hydrogen bonds between the two
β2 strands, the two α1 helices, and the α1 helix
and the β5 strand, hydrophobic interactions between the β2
strand and the α1 helix, and ionic interactions between the
two β2 strands, and between the β5−α3 loop
and the β1−α1 loop. This dimerization of two SgcE10
monomers creates two channels for substrate binding and catalysis
(Figure 4E). The SgcE10
tetramer, created by the association of the two dimers, is maintained
by hydrophobic interactions and hydrogen bonding between the β1
and α1 loops, hydrophobic interactions between the α2
helix and the β1−α1 and β2−β3
loops, and ionic interactions between the α1 helix and the β1
and β3 strands. The association of the two dimers forms a channel
with a diameter of ∼6 Å along the two-fold axis of symmetry
(Figure 4A). The tetrameric
SgcE10 structure shows RMSD values of 1.3 and 1.0 Å with CalE7
and DynE7, respectively (Figure 4F); structural alignment of DynE7 and CalE7 gives an
RMSD of 1.0 Å. Taken together, the structure of SgcE10, in comparison
with those of CalE7 and DynE7, establishes a common three-dimensional
structural fold for TEs from both 9- and 10-membered enediyne biosynthetic
machineries.
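The RMSD values quoted above summarize how far equivalent atoms lie from one another after two structures have been superimposed. As a minimal sketch of the final step only, the function below computes the RMSD over paired coordinates and assumes the superposition (the fitting step) has already been done by a structural alignment program; the name `rmsd` and the toy coordinates are hypothetical and are not taken from the deposited structures.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between paired, pre-superimposed atoms."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("at least one atom pair is required")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: every atom displaced by 1 Å along z, so the RMSD is 1.0 Å.
chain_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
chain_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(chain_a, chain_b))  # 1.0
```

In practice the value depends on which atoms are paired (for example, Cα atoms only) and on how the superposition was optimized, so RMSDs are comparable only when computed in the same way.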
Figure 4
SgcE10 is a homotetramer that shares the same structural
fold as
that of CalE7 and DynE7. (A) The overall structure of SgcE10 is a
homotetramer consisting of a dimer of dimers. (B) The α/β
hot-dog monomer fold composed of a β-sheet comprising five antiparallel
strands (β1−β5) wrapped by three α-helices
(α1−α3). (C) Superposition of the chains A (green),
B (blue), C (pink), and D (yellow) comprising the homotetramer of
SgcE10 showing that the monomers are identically related to each other.
(D) Some weak variations observed in the N- and C-terminal extremities
and the loop positioned at the entrance of the active site as depicted
with B-factor presentation of the chain A of SgcE10. The warm colors
in the extremity and loop regions indicate a high degree of mobility.
(E) Dimerization of two monomers affording the dimer with two active
sites (red arrows). (F) Superposition of the dimers of SgcE10 (blue),
DynE7 (orange), and CalE7 (yellow) establishing a common structural
fold for TEs from both 9- and 10-membered enediyne biosynthetic machineries.
Active Site of SgcE10 Suggesting Common ACP-Tethered Polyene Intermediates and Their Release Mechanism for Both 9- and 10-Membered Enediyne Core Biosynthesis
The structure of SgcE10 contains
four active sites, two in each interfacial dimer (Figure 4A,E). The channel formed by
dimerization is hydrophobic and is composed of Phe42, Leu43, Val50,
Val54, Leu60, Met83, Leu85, Leu88, Leu93, Phe95, Ile118, Leu137, and
Leu141. However, the entrance of the channel is open to solvent and
is understandably hydrophilic due to the presence of Asn21, Tyr27,
Tyr28, Arg39, Glu40, and Thr62 (Figure 5A). We superimposed the A and B chains of the SgcE10
structure with the A and C chains of DynE7 in its apo form (PDB entry 2XFL). The DynE7 residues
involved in catalysis (i.e., Tyr23, Arg35, Glu36), as well as those
found in the substrate-binding cavity, are present in the SgcE10 structure
and similarly positioned (Figure 5B). Only a few minor changes were seen, although the
physicochemical properties of these residues were retained (Leu50
to Val54, Ile89 to Leu93, Met91 to Phe95, and Val113 to Ile118 for
SgcE10 in comparison with DynE7) (Figure 5B).
Figure 5
SgcE10 sharing the same substrate-binding tunnel
and active site
as those of CalE7 and DynE7. (A) The active site of SgcE10 is composed
of a hydrophobic pocket (pink) with hydrophilic residues (yellow)
with the catalytic Arg (red) at the entrance of the pocket. The hydrophobic
pocket is involved in binding of the linear polyene intermediate,
while the hydrophilic residues likely accept the phosphopantetheine
arm of ACP. The substrate-binding channel is formed by the dimerization
of two monomers (light and dark blue); Arg39, Glu40, and Thr62 are
from one monomer and Asn21, Tyr27, and Tyr28 are from the other. (B)
Superposition of the SgcE10 (blue) and DynE7 (orange) active sites
revealing the same substrate-binding and catalytic residues of SgcE10
(pink) as those of DynE7 (yellow). (C) Superposition of 7 from the ligand-bound DynE7 structure (PDB entry 2XEM) with one of the
four active sites of SgcE10, supporting a similar mechanism for the
TE-catalyzed hydrolysis of the ACP-tethered polyene intermediates
in 9- and 10-membered enediyne core biosynthesis. A catalytic water
molecule (red sphere) is positioned by Glu40 in front of the ketone
and could be involved in the release of the heptaene. The negative
charge of the tetrahedral intermediate, resulting from nucleophilic
attack by the water molecule, could be stabilized by Arg39, which
is oriented by Tyr27.
Although the CalE7 structure was determined in its apo form,[14] the structures of both the apo and ligand-bound
DynE7 forms were obtained, revealing significant conformational changes
throughout the substrate-binding tunnel and active site upon complexation
with 7.[15] The SgcE10 structure
was solved in its apo form (Figures 4 and 5). All attempts to dock 7, 8, or their phosphopantetheinyl thioesters
as mimics for the corresponding ACP-tethered polyene intermediates
(Figure 1B) into the
substrate-binding channel and active site of apo SgcE10 unfortunately
yielded no reasonable in silico complexes. Considering the conformational
changes observed in the structure of DynE7 in complex with 7,[15] similar conformational changes in
the apo SgcE10 structure are most likely needed to accommodate the
ACP-tethered polyene intermediates. Thus, superposition of the SgcE10
structure and the ligand-bound DynE7 structure, in complex with 7, revealed that the side chain of Phe95 in SgcE10 (Met91
in DynE7) is positioned in the middle of the substrate-binding channel
and active site cavity. For proper substrate binding, the side chain
of Phe95 must flip to form part of the hydrophobic wall. With no reasonable
in silico complexes, and in spite of Phe95 blocking part of the substrate-binding
channel, overlaying 7 from ligand-bound DynE7 with SgcE10
showed a possible substrate-binding orientation (Figure 5C). The substrate mimic 7 is situated in the SgcE10 substrate-binding channel and
active site cavity, interacting with all of the constituent residues
via hydrophobic interactions. A water molecule from the SgcE10 structure
is positioned beside the ketone group of 7. The oxygen
of the latter is pointing toward the side chain of Arg39 at a reasonable
distance of 3.6 Å. The ketone oxygen of 7 is equivalent
to the β-hydroxyl oxygen of the native substrate of SgcE10 linked
to the ACP domain. However, the proximity of the conserved Arg and
the water molecule suggests that the position of the ketone oxygen
is similar to the thioester oxygen; without an ACP-tethered substrate,
it is unclear where the thioester and β-hydroxyl oxygens will
be positioned. Arg39 could act as an oxyanion hole, stabilizing the
negative charge generated by the attack of the activated water molecule
on the thioester bond of the ACP-tethered polyene intermediates. Indeed,
Arg39 is positioned via hydrogen bonds with the side chains of Tyr27
and Glu40, which may also be involved in the positioning of the water
molecule (3.3 Å) (Figure 5C). The ACP-tethered polyene intermediates are hydrolytically
released as either a β-hydroxy- or β-ketoacid, subsequent
decarboxylative dehydration or decarboxylation of which finally affords 4 or 7, respectively; it is not known, however,
if the latter steps are TE-catalyzed or nonenzymatic (Figure B). Generally, TEs and ACPs
form protein complexes that facilitate substrate release. However,
attempts to dock SgcE10 with the ACP domain of SgcE failed to afford
any reasonable complex model, although it has been suggested previously
that the TEs bind to the ACP domain-tethered intermediate primarily
by recognizing the hydrophobic linear polyene.[15]
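Contact distances such as the 3.6 Å separation between the ketone oxygen of 7 and the Arg39 side chain are plain Euclidean distances between atomic coordinates in the superposed models. The following is a minimal sketch of that measurement. The coordinates are invented placeholders, not values taken from the SgcE10 or DynE7 structures, and the function names and the 4.0 Å cutoff are assumptions made for illustration only.

```python
import math


def distance(atom_a, atom_b):
    """Euclidean distance between two atoms given as (x, y, z) in Å."""
    return math.dist(atom_a, atom_b)


def within_contact(atom_a, atom_b, cutoff=4.0):
    """True if the two atoms are no farther apart than `cutoff` (Å).

    The default cutoff is an illustrative assumption, not a value
    used in the paper.
    """
    return distance(atom_a, atom_b) <= cutoff


# Placeholder coordinates (Å); NOT taken from any deposited structure.
ketone_oxygen = (10.0, 4.0, 2.0)
arg_side_chain_n = (12.0, 6.0, 4.2)

d = distance(ketone_oxygen, arg_side_chain_n)
print(f"distance = {d:.2f} Å, contact = {within_contact(ketone_oxygen, arg_side_chain_n)}")
```

With real data, the two tuples would be replaced by the coordinates of the relevant atoms read from the superposed coordinate files.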
Discussion
The enediynes are some
of the most potent antitumor antibiotics,
and their fascinating molecular architectures have inspired numerous
biosynthetic studies. Since the cloning of the first 9-membered (1)[23] and 10-membered (2)[24] enediyne biosynthetic gene clusters
in 2002, a total of 12 enediyne biosynthetic gene clusters, encoding
the biosynthesis of seven 9-membered enediynes and five 10-membered
enediynes, have now been cloned, sequenced, and partially characterized.
Over 100 additional enediyne biosynthetic gene clusters have been
discovered through genome mining; although most of these encoded enediyne
natural products are yet to be structurally characterized, these discoveries
underline the remarkable potential of nature in biosynthesizing the
enediyne family of natural products.[20,21] Tremendous
progress has been made in understanding the biosynthesis of the peripheral
moieties and uncovering the convergent strategy of their attachment
to the enediyne cores to complete enediyne biosynthesis;[6,7] however, how the enediyne cores are constructed and what controls
the divergence between 9- and 10-membered enediyne core biosynthesis
remain elusive.
Studies of enediyne core biosynthesis are currently limited
to using the PKSE–TE
pair as a model system, and all attempts to produce and structurally
characterize biosynthetic intermediates, leading to either 9- or 10-membered
enediyne cores, have met with little success.[9−13] Although in vivo coexpression of cognate or mismatched
PKSE–TE pairs from both 9- and 10-membered enediyne biosynthetic
machineries all resulted in the exclusive production of the same polyene
metabolite 4 (Figure B, path a), in vitro reconstitution of PKSE–TE
pairs from selected 9- or 10-membered enediyne biosynthetic machineries
yielded varying metabolites, for example, 4, 5, and 6 by SgcE–SgcE10 (Figure B, path b), 4, 7, and 8 by CalE8–CalE7 (Figure B, path c), and 4 and 7 by DynE8–DynE7. The metabolite profile discrepancy
between the in vivo and in vitro studies of the PKSE–TE pairs
raised the question of whether the varying metabolites observed in
vitro are relevant to enediyne core biosynthesis and whether TEs play
a gate-keeping role, selectively channeling the nascent ACP-tethered
polyene intermediates to 9- or 10-membered enediyne core biosynthesis.
Because the structures of CalE7 and DynE7, two TEs from the biosynthetic
machineries of the 10-membered enediynes 2 and 3, are known, we reasoned that structural comparison of CalE7
and DynE7 with TEs from 9-membered enediyne biosynthetic machineries
may provide insight into the nature of the ACP-tethered polyene intermediates
and the roles of the TEs in enediyne core biosynthesis.
Taking
advantage of the rapidly growing family of enediyne biosynthetic
gene clusters,[20,21] we first carried out extensive
bioinformatics analysis of the PKSE–TE pairs from all 9- and
10-membered enediyne biosynthetic machineries, and this revealed that
TEs are generally categorized into clades based on their involvement
in 9- or 10-membered enediyne biosynthesis, but that this trend is
not strictly followed (Figure ). However, amino acid sequence alignments showed that all
TEs are fairly conserved (38–100% identities) and, most importantly,
the residues involved in substrate binding and catalysis are highly
conserved (Figure ), suggesting similar ACP-tethered polyene intermediates and mechanisms
of catalysis for TEs in both 9- and 10-membered enediyne core biosynthesis.
We next solved the crystal structure of SgcE10 from the 9-membered
biosynthetic machinery of 1 and compared it with those
of DynE7 and CalE7, the two TEs from the 10-membered enediyne biosynthetic
machineries of 2 and 3. The overall structure
(Figure ) and the
key residues lining the substrate-binding tunnel and active sites
are all highly conserved among SgcE10, DynE7, and CalE7 (Figure ), suggesting that
the catalytic processes of these enzymes are very similar. Considering
that the PKSE–TE pairs from both 9- and 10-membered enediyne
biosynthetic machineries produced the same polyene metabolite 4,[10] which was also detected from
the fermentations of all known enediyne producers examined,[25] and combined with the new findings that SgcE10
features a nearly identical volume and shape of substrate-binding
tunnel and active site cavity to CalE7 and DynE7, it is likely that
the ACP-tethered polyene substrates for the TEs and their catalytic
release for both 9- and 10-membered enediyne core biosynthesis are
identical. Therefore, our current sequence and structural analysis
of SgcE10 and other TEs, along with previous in vivo and in vitro
characterizations of PKSE–TE pairs, further support the proposal
that the divergence between 9- and 10-membered enediyne core biosynthesis
occurs beyond PKSE and TE catalysis. Biosynthetic investigations are
currently aimed at characterizing the role of the PKSE-associated
enzymes in 9- or 10-membered enediyne core biosynthesis (Figure B, path a).
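The pairwise identities quoted above for the TE alignments (38–100%) depend on how identity is counted over an alignment. The sketch below shows one common convention, assumed here and not necessarily the one used by the authors: identical residues divided by the number of alignment columns in which neither sequence has a gap. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two rows of a pairwise alignment.

    Assumed convention: columns containing a gap in either sequence
    are ignored, and identity is matches / remaining columns * 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy aligned fragments, invented for illustration only.
print(percent_identity("MTA-TNPD", "MTAQSNPD"))
```

Other denominators (full alignment length, or the length of the shorter sequence) give different numbers for the same alignment, so the convention should be stated whenever identities are compared across studies.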
Materials and Methods
Gene Cloning
Gene cloning was performed according to
previously published procedures.[26] Full-length
sgcE10 (gi: 24575121; locus version: AAL06692.1)
was amplified from the genomic DNA of S. globisporus with KOD Hot Start DNA polymerase using 5′-TACTTCCAATCCAATGCCATGACCGCGACGAATCCTGACTA-3′
and 5′-TTATCCACTTCCAATGTTAGGCGGCGCGTCCCGC-3′ as forward
and reverse primers, respectively. The amplification buffer was supplemented
with betaine to a final concentration of 2.5 M. The PCR product was
purified and treated with T4 DNA polymerase in the presence of dCTP
according to the specifications of the manufacturer (New England Biolabs),
cloned into pMCSG57[27] according to ligation-independent
procedures,[28] and transformed into E. coli BL21(DE3)-Gold (Stratagene).
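The amplification buffer was supplemented with betaine, a common additive for GC-rich templates (the text does not state the reason for its use here). As a minimal sketch, the GC content of the two primers given above can be checked as follows. The helper names are hypothetical; the primer sequences are copied from the text.

```python
# Primer sequences as given in the Gene Cloning section (5' to 3').
FORWARD_PRIMER = "TACTTCCAATCCAATGCCATGACCGCGACGAATCCTGACTA"
REVERSE_PRIMER = "TTATCCACTTCCAATGTTAGGCGGCGCGTCCCGC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


for name, primer in (("forward", FORWARD_PRIMER), ("reverse", REVERSE_PRIMER)):
    print(f"{name}: {len(primer)} nt, GC = {gc_fraction(primer):.0%}")

# The reverse primer anneals to the coding strand, so its reverse
# complement shows the sequence as it reads along the gene.
print(reverse_complement(REVERSE_PRIMER))
```

Note that each primer carries a vector-specific overhang for ligation-independent cloning in addition to its gene-specific portion, so the whole-primer GC content is only a rough indicator of the template's GC content.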
Gene Expression and Protein Purification
To produce
the SgcE10 protein, a bacterial culture was grown at 37 °C, 190
rpm in 2 L of enriched M9 medium[29] until
it reached OD600 = 1.0. After air-cooling it down to 4
°C for 60 min, inhibitory amino acids (25 mg/L each of L-valine, L-isoleucine, L-leucine, L-lysine, L-threonine, and L-phenylalanine) and 90 mg/L selenomethionine
(SeMet) were added. Gene expression was induced with 0.5 mM isopropyl-β-D-thiogalactoside (IPTG). The cells were incubated overnight
at 18 °C, harvested by centrifugation, and resuspended in lysis
buffer [50 mM HEPES, pH 8.0, containing 5% (v/v) glycerol, 500 mM
NaCl, 20 mM imidazole, and 10 mM β-mercaptoethanol]. Cells were
disrupted by lysozyme treatment (1 mg/mL) and sonication, and the
insoluble cellular material was removed by centrifugation. The SeMet-labeled
SgcE10 protein was purified using Ni-NTA affinity chromatography and
the ÄKTAxpress system (GE Healthcare Life Sciences) with the
addition of 10 mM β-mercaptoethanol in all buffers. The N-terminal
His6-tag of SgcE10 was cleaved using recombinant His6-tagged TEV protease. An additional step of Ni-NTA affinity
chromatography was performed to remove the protease, uncut His6-tagged SgcE10 protein, and the His6-tag. The oligomeric
state of SgcE10 was determined by size-exclusion chromatography
(Superdex 200 16/600 column, GE Healthcare Life Sciences). Pure SgcE10
was concentrated using Amicon Ultra-15 concentrators (Millipore) in
20 mM HEPES buffer, pH 8.0, containing 250 mM NaCl and 2 mM dithiothreitol
(DTT). Protein concentrations were determined from the absorbance
at 280 nm using a molar absorption coefficient (ε280 = 15 930 M⁻¹ cm⁻¹).[30] The concentration of SgcE10 samples used for
crystallization was 42 mg/mL. A second batch of SeMet-labeled SgcE10
was purified for optimization with reductive alkylation and partial
proteolysis.[26] The concentration of SgcE10
samples used for crystallization was 5 mg/mL for reductively methylated,
29 mg/mL for reductively ethylated, 26 mg/mL for reductively isopropylated, and 48 mg/mL for partially proteolyzed (with
chymotrypsin, trypsin, or thermolysin), respectively. Individual aliquots
of the purified SgcE10 were stored at −80 °C until needed.
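The concentration determination described above follows the Beer-Lambert law, A = ε·c·l. The sketch below converts an absorbance reading at 280 nm into molar and mg/mL concentrations using the ε280 stated in the text. The molecular weight, the absorbance reading, the dilution factor, and the function names are illustrative assumptions; the text does not report them.

```python
EPSILON_280 = 15930.0  # M^-1 cm^-1, molar absorption coefficient from the text


def molar_concentration(a280, epsilon=EPSILON_280, path_cm=1.0, dilution=1.0):
    """Concentration in mol/L from A280 via Beer-Lambert (c = A / (eps * l)).

    `dilution` is the fold-dilution applied before measuring, so the
    result refers to the undiluted sample.
    """
    return a280 * dilution / (epsilon * path_cm)


def mg_per_ml(a280, mw_da, epsilon=EPSILON_280, path_cm=1.0, dilution=1.0):
    """Concentration in mg/mL; mol/L times g/mol gives g/L, i.e. mg/mL."""
    return molar_concentration(a280, epsilon, path_cm, dilution) * mw_da


# Placeholder values: the molecular weight and the reading are assumptions,
# not figures from the paper.
ASSUMED_MW_DA = 17000.0
reading = 0.40
print(f"{mg_per_ml(reading, ASSUMED_MW_DA, dilution=100.0):.1f} mg/mL")
```

Concentrated samples in the tens of mg/mL lie far outside the linear range of a typical spectrophotometer, which is why a dilution factor is included in the sketch.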
SgcE10 Crystallization
SgcE10 was screened for crystallization
conditions with the help of a Mosquito liquid dispenser (TTP Labtech)
using the sitting-drop vapor-diffusion technique in 96-well CrystalQuick
plates (Greiner Bio-One). For each condition, 0.4 μL of SgcE10
and 0.4 μL of crystallization formulation were mixed, and the
mixture was equilibrated against 140 μL of the reservoir in
the well. Several commercially available crystallization screens were
used, including MCSG-1–3 (Anatrace) at 24 and 4 °C for
the unmodified protein, MCSG-1–4 at 24 °C for the reductively
alkylated proteins, and PEG/Ion HT (Hampton Research Corp.) at 16
°C for the partially proteolyzed proteins.
Data Collection and Structure Determination and Refinement
Diffraction data
were collected at 100 K at the 19-ID beamline
of the Structural Biology Center at the Advanced Photon Source, Argonne
National Laboratory.[31] Single-wavelength
data were collected at 0.97931 Å to 2.8 Å resolution from a single protein
crystal of SgcE10. Integration and scaling were performed with the
HKL-3000 program package.[32] The structure
was determined by single-wavelength anomalous diffraction (SAD) phasing
utilizing the anomalous signal from Se atoms with PHENIX (AutoSol
and AutoBuild)[33] using the peak data to
2.8 Å. The model was extensively rebuilt manually in Coot,[34] with subsequent refinement using phenix.refine[33] and Refmac5.5 from the CCP4 suite.[35] After multiple rounds of model building with Coot
and refinement with phenix.refine, the structure converged. The stereochemistry
of the structure was checked with PROCHECK[36] and a Ramachandran plot.[37] The atomic
coordinates and experimental structure factors of SgcE10 were deposited
in the Protein Data Bank (PDB) under the code 4I4J.
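The data-collection wavelength and resolution limit quoted above can be related to photon energy and scattering angle with two standard formulas: E = hc/λ and Bragg's law, λ = 2d·sin θ. The sketch below is an illustration only; the function names are hypothetical, and the constant is the standard value of hc in keV·Å.

```python
import math

HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, keV·Å


def photon_energy_kev(wavelength_angstrom):
    """Photon energy in keV for an X-ray wavelength given in Å."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


def bragg_angle_deg(wavelength_angstrom, d_spacing_angstrom):
    """Bragg angle theta in degrees for first-order diffraction.

    Solves lambda = 2 * d * sin(theta); raises ValueError if the
    spacing is too small to diffract at this wavelength.
    """
    s = wavelength_angstrom / (2.0 * d_spacing_angstrom)
    if s > 1.0:
        raise ValueError("d-spacing not reachable at this wavelength")
    return math.degrees(math.asin(s))


wavelength = 0.97931  # Å, from the text
resolution = 2.8      # Å, from the text

print(f"energy = {photon_energy_kev(wavelength):.3f} keV")
print(f"2-theta at {resolution} Å = {2 * bragg_angle_deg(wavelength, resolution):.1f} degrees")
```

The computed energy, about 12.66 keV, lies at the selenium K absorption edge, consistent with the use of the Se anomalous signal for SAD phasing described above.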