Haoqian Liang1, Isaiah J Lopez1, Marina Sánchez-Hidalgo2, Olga Genilloud2, Wilfred A van der Donk1,3. 1. Department of Biochemistry, University of Illinois at Urbana─Champaign, 600 S. Mathews Avenue, Urbana, Illinois 61801, United States. 2. Fundación MEDINA Centro de Excelencia en Investigación de Medicamentos Innovadores en Andalucía, Avenida del Conocimiento, 34 Parque Tecnológico de Ciencias de la Salud, Armilla, 18016 Granada, Spain. 3. Department of Chemistry and Howard Hughes Medical Institute, University of Illinois at Urbana─Champaign, 600 S. Mathews Avenue, Urbana, Illinois 61801, United States.
Abstract
Lanthipeptides are ribosomally synthesized and post-translationally modified peptides characterized by lanthionine (Lan) and/or methyllanthionine (MeLan) residues. Four classes of enzymes have been identified that install these structures in a substrate peptide. Recently, a novel class of lanthipeptides was discovered whose biosynthetic gene clusters (BGCs) lack genes for known class I-IV lanthionine synthases. In this study, the dehydration of Ser/Thr during the biosynthesis of the class V lanthipeptide cacaoidin was reconstituted in vitro. The aminoglycoside phosphotransferase-like enzyme CaoK iteratively phosphorylates Ser/Thr residues on the precursor peptide CaoA, followed by phosphate elimination catalyzed by the HopA1 effector-like protein CaoY to achieve eight successive dehydrations. CaoY shows sequence similarity to the OspF family proteins and the lyase domains of class III/IV lanthionine synthetases, and mutagenesis studies identified residues that are critical for catalysis. An AlphaFold prediction of the structure of the dehydration enzyme complex engaged with its substrate suggests the importance of hydrophobic interactions between the CaoA leader peptide and CaoK in enzyme-substrate recognition. This model is supported by site-directed mutagenesis studies.
Ribosomally synthesized and post-translationally
modified peptides
(RiPPs) are a rapidly expanding class of natural products.[1] The largest class of known RiPPs is the lanthipeptides,
which are characterized by the β-thioether cross-linked bis
amino acids lanthionine (Lan) and methyllanthionine (MeLan). Installation
of (Me)Lan is achieved through dehydration of Ser/Thr residues to
form dehydroalanine (Dha)/dehydrobutyrine (Dhb), followed by intramolecular
Michael-type addition of cysteine thiols to the resulting dehydroamino
acids.[2] Four classes of enzymes have been
characterized that differ in their domain architecture and mechanisms
of (Me)Lan synthesis.[2] Biosynthesis of
class I lanthipeptides involves dedicated dehydratase (LanB) and cyclase
(LanC) enzymes. LanB enzymes use glutamyl-tRNA to glutamylate the
side chains of Ser/Thr residues, followed by glutamate elimination
to generate Dha/Dhb. LanC enzymes use a zinc ion to activate the Cys
thiol for addition to Dha/Dhb. In contrast, class II–IV lanthipeptides
are formed by the multifunctional enzymes LanM (class II), LanKC (class
III), and LanL (class IV) that catalyze both dehydration and cyclization.
These three enzyme classes use phosphorylation by kinase domains to
activate the side chain hydroxy groups of Ser/Thr for elimination.
They differ in how they catalyze the elimination, which occurs within
the kinase active site for LanM and in dedicated phosphoSer/phosphoThr
lyase domains for LanKC and LanL. LanKC and LanL differ in their cyclase
domains: that of LanL contains zinc in a LanC fold, whereas cyclization by LanKC takes place
in a domain that lacks the zinc-binding site. Recently,
a novel group of (Me)Lan-containing RiPPs was discovered for which
the biosynthetic gene cluster (BGC) lacked genes encoding well-defined
class I–IV (Me)Lan synthase homologues, suggesting the existence
of an unknown synthase.[3−6] This new group of lanthipeptides was termed class V.[5]
Cacaoidin was the first reported class V lanthipeptide
and is produced
by Streptomyces cacaoi CA-170360 (Figure ).[3,6] Cacaoidin
displays potent antimicrobial activity against Gram-positive bacteria,
including methicillin-resistant Staphylococcus aureus (MRSA), and carries multiple unusual structural features, including
an N,N-dimethyl lanthionine, d-amino acids,
an O-glycosylated tyrosine, and a C-terminal aminovinyl-methyl-cysteine
(AviMeCys) (Figure A). The cao BGC contains roughly 27 open reading
frames (ORFs) (Figure B), with basic local alignment search tool (BLAST) analysis used
to tentatively assign functions based on the modifications found in
cacaoidin. Like for other lanthipeptides, the precursor peptide for
cacaoidin is made up of an N-terminal leader peptide and a C-terminal
core peptide (Figure A). The latter is converted into mature cacaoidin by post-translational
modifications.[3,6] A subset of proteins encoded in
the cao BGC exhibit similarity with those in the
BGCs of other recently reported class V lanthipeptides, the lxm BGC involved in lexapeptide biosynthesis[4] and the spr BGC involved in pristinin
A3 biosynthesis.[5] These include a flavin-dependent
cysteine decarboxylase (CaoD, formerly Cao13) for AviMeCys formation,
an F420H2-dependent dehydrogenase (CaoJC, formerly Cao12) involved in d-amino acid formation,[4] a HopA1-like effector protein (CaoY, formerly
Cao7), and two aminoglycoside phosphotransferase (APH) family proteins
(CaoK and CaoX, formerly Cao9 and Cao14). The HopA1 and APH proteins
are believed to be involved in dehydration and cyclization in class
V lanthipeptides,[3−7] but the in vitro activity has not yet been reported.
Figure 1
(A) Sequence
of CaoA and structure of cacaoidin. The sequence of
the CaoA core and leader peptides is depicted, as well as the residues
that undergo post-translational modification. The cleavage sequence
at the end of the leader peptide is shown in underlined font. In cacaoidin,
the N-terminal Lan is depicted in pink, with di-methylation in orange.
The C-terminal AviMeCys motif is depicted in brown. Dehydroalanine
is shown in green, dehydroamino acids that are further reduced to d-Ala and d-Abu are shown in red, and glycosylated
Tyr is shown in blue. (B) Biosynthetic gene cluster of cacaoidin.
In this study, we demonstrate the dehydration of
CaoA by the combined
action of the APH protein CaoK and the HopA1 homologue CaoY. We also
provide insight into the residues required for the activity of the
lyase CaoY through predicted secondary structure analysis as well
as mutagenesis. Following the nomenclature used for lexapeptide,[4] we propose the names LanK for the class V Ser/Thr
kinases and LanY for the phosphoSer/phosphoThr lyases that together
achieve dehydration, in keeping with the common naming conventions of lanthipeptide
biosynthesis.[8] We also provide a model
for the substrate recognition mechanism by the dehydratase complex
formed by CaoK and CaoY using AlphaFold-Multimer structure prediction-guided
site-directed mutagenesis.[9−11]
Results and Discussion
Dehydration of CaoA
Previously, the study of lexapeptide
biosynthesis demonstrated the ability of LxmKYXD to collectively introduce
dehydroamino acids, Lan, and AviMeCys into the precursor peptide LxmA
through heterologous co-expression in Escherichia coli.[4] In vivo investigation
of the biosynthesis of the RiPP thioviridamide from Streptomyces sp. NRRL S-87 also demonstrated that TvaCS-87 and
TvaDS-87, which are homologues of the APH protein
LxmK and the HopA1 homologue LxmY, respectively, are capable of dehydration
of Ser/Thr residues in the precursor peptide TvaAS-87.[12] Thioviridamide is a member of the
thioamitides that contain thioamide residues as the unifying post-translational
modification.[1] Furthermore, TvaCS-87 was also identified as sharing similarity to the kinase domain of
the class III lanthipeptide synthetase MicKC,[13] and mutation of the conserved catalytic residues abolished the production
of dehydrated product.[12] Similar results
were reported from in vitro reconstitution of the
biosynthesis of another thioamitide termed thioholgamide.[14] In the current study, adopting the nomenclature
from the lxm BGC,[4] we
first replaced the original numerical order-based designations of the genes
in the cao BGC with names that correspond to their lxm homologs (Figure B). To verify the kinase activity of CaoK, we co-expressed CaoK with
the N-terminally His6-tagged precursor peptide CaoA (His6–CaoA) in E. coli. The
peptide was purified by immobilized metal affinity chromatography
(IMAC). Analysis by matrix-assisted laser desorption/ionization time-of-flight
mass spectrometry (MALDI-TOF MS) revealed multiple peaks along with
the ion of the precursor peptide, with mass shifts corresponding to
peptides that had been phosphorylated up to four times (Figure A). We then introduced the
putative lyase CaoY into the co-expression system, which resulted in predominantly
eight dehydrations of CaoA, consistent with the structure of mature
cacaoidin (Figure A).[3] Mass spectral analysis also revealed
glutathione (GSH) adducts, likely due to the large
number of dehydroalanines in the modified core peptide, which complicated
the in vivo analysis of CaoK and CaoY. We therefore
focused on reconstituting the activity of CaoK and CaoY in
vitro.
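The mass shifts quoted for the phosphorylated and dehydrated peptides follow from simple arithmetic, which can be sketched as below. This is an illustrative calculation only: the function name and the use of average (rather than monoisotopic) masses are assumptions, and the starting value of 7025 is the expected [M + H]+ of His6–CaoA taken from the Figure 2 legend.

```python
# Average mass shifts (Da) for the two modifications discussed in the text.
HPO3 = 79.980   # each phosphorylation adds HPO3
H2O = 18.015    # each dehydration removes H2O

# Expected [M + H]+ of unmodified His6-CaoA, as given in the Figure 2 legend.
HIS6_CAOA_MH = 7025.0


def expected_mz(unmodified_mh, n_phospho=0, n_dehydration=0):
    """Expected singly charged m/z after the given modifications (hypothetical helper)."""
    return unmodified_mh + n_phospho * HPO3 - n_dehydration * H2O


# Zero to four phosphorylations, as observed after co-expression with CaoK alone.
phospho_series = [round(expected_mz(HIS6_CAOA_MH, n_phospho=n)) for n in range(5)]

# Eight dehydrations, as observed after co-expression with CaoK and CaoY.
eightfold_dehydrated = round(expected_mz(HIS6_CAOA_MH, n_dehydration=8))

print(phospho_series)
print(eightfold_dehydrated)
```

The rounded values reproduce the expected masses listed in the Figure 2 legend for the phosphorylated series and for the 8-fold dehydrated peptide.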
Figure 2
(A) MALDI-TOF mass spectra of (i) unmodified His6–CaoA.
[M + H]+ exp. m/z = 7025;
obsd m/z = 7022. (ii) His6–CaoA modified by CaoK in E. coli. [M + H]+ exp. = 7025, obsd = 7028; (M + H + HPO3)+ exp. = 7105, obsd = 7109; (M + H + 2HPO3)+ exp. = 7185, obsd = 7188; (M + H + 3HPO3)+ exp. = 7265, obsd = 7271; (M + H + 4HPO3)+ exp. = 7345, obsd = 7349. (iii) His6–CaoA modified by CaoK and CaoY in E. coli. (M + H – 8H2O)+ exp. = 6881, obsd
= 6883; (M + H – 8H2O + GSH)+ exp. =
7186, obsd = 7190. (iv) His6–CaoA modified by His6–CaoK–CaoY dehydratase complex in vitro with adenosine triphosphate (ATP) and MgCl2. (M + H –
8H2O)+ exp. = 6881, obsd = 6878. (B, C) Liquid
chromatography coupled with electrospray ionization-quadrupole-time
of flight (LC-ESI-QTOF) mass spectrum of (B) Glu-C digested His6–CaoA core peptide modified by CaoK–CaoY during
co-expression in E. coli ([M + 2H]2+ exp. = 1038.4588; obsd = 1038.4594), and (C) MS–MS
fragmentation pattern of Glu-C digested His6–CaoA
core peptide modified by CaoK–CaoY in E. coli. For fragment masses, see Table S3.
Initially, N-terminally His6-tagged
CaoK and CaoY were
expressed individually in E. coli,
but His6–CaoK was found mostly in the insoluble
portion after cell lysis. Its solubility increased when it was co-expressed
with untagged CaoY. His6–CaoK and CaoY co-eluted
as an enzyme complex during IMAC purification (Figures S1 and S2). Similar results were very recently reported
for the dehydratase composed of ThoCD involved in thioholgamide biosynthesis.[14] Size exclusion chromatography (SEC) purification
indicated the formation of a heterodimer composed of His6–CaoK and CaoY. The precursor peptide CaoA was then reacted
with the purified CaoK–CaoY complex in the presence of adenosine
triphosphate (ATP) and MgCl2, resulting in 8-fold dehydrated
CaoA as the main product (Figure A). The instrument on which the MALDI-TOF mass spectra
were recorded has a relatively large mass error in the mass range of these
peptides, which do not ionize well. Therefore, the dehydrated CaoA was
treated with endoproteinase Glu-C to release the core peptide (Figure S3), followed by analysis by high-performance
liquid chromatography (HPLC) coupled with electrospray ionization-quadrupole-time-of-flight
tandem mass spectrometry (LC-ESI-QTOF-MS/MS). The +2 charge state
of 8-fold dehydrated CaoA core peptide was observed (Figure B), and the dehydration sites
were established by MS/MS fragmentation (Figure C and Table S2). The data agree with the final structure of cacaoidin.
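The difference in mass accuracy between the two measurements can be made concrete with a short calculation. The sketch below is illustrative only: the helper names are assumptions, and the input values are the expected and observed m/z values quoted in the Figure 2 legend.

```python
PROTON = 1.007276  # mass of a proton in Da


def ppm_error(expected, observed):
    """Relative mass error in parts per million."""
    return (observed - expected) / expected * 1e6


def neutral_mass(mz, charge):
    """Neutral mass implied by an m/z value for a positively charged [M + zH]z+ ion."""
    return charge * (mz - PROTON)


# LC-ESI-QTOF: [M + 2H]2+ of the 8-fold dehydrated, Glu-C released core peptide.
qtof_ppm = ppm_error(1038.4588, 1038.4594)

# MALDI-TOF: [M + H]+ of unmodified His6-CaoA.
maldi_ppm = ppm_error(7025, 7022)

# Neutral mass of the core peptide fragment implied by the doubly charged ion.
core_mass = neutral_mass(1038.4588, 2)

print(f"QTOF error: {qtof_ppm:.2f} ppm")
print(f"MALDI error: {maldi_ppm:.0f} ppm")
print(f"Neutral mass: {core_mass:.4f} Da")
```

Under these inputs the QTOF measurement is accurate to well under 1 ppm, whereas the MALDI-TOF measurement deviates by several hundred ppm, which is why the dehydration count was confirmed on the digested core peptide.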
Characterization of CaoY
Previous bioinformatics studies
demonstrated sequence similarity between CaoY and HopA1 (PF17914),[3,7] a Pseudomonas syringae effector protein
that suppresses the effector-triggered immunity response in plants
by targeting the positive immune regulator enhanced disease susceptibility1
(EDS1).[15,16] An in vivo kinase-effector
interaction screen also identified a number of other kinases as putative
HopA1 targets, many of which are shared with other P. syringae effectors, such as HopAI1,[17] implying a similar protein–protein interaction
mechanism adopted by these two effectors. HopAI1 suppresses pathogen-associated
molecular patterns-induced immunity of the plant host by inactivating
the mitogen-activated protein kinase (MAPK) cell signaling pathway.[18] HopAI1 inactivates MAPKs through elimination
of the phosphate group from phosphothreonine (pThr) on their kinase
activation loop. This activity resembles that of the OspF family of
proteins during infection of mammals (e.g., OspF from Shigella and SpvC from Salmonella).[19,20] Sequence alignment of the OspF family proteins with the lyase domain
of the class III and IV lanthionine synthetases LanKC and LanL illustrated
the high conservation of essential catalytic residues, suggesting
that OspF family proteins and LanKC/LanL may have evolved from a common
primitive pSer/pThr lyase.[19,21−23] However, previous sequence alignment of other HopA1-like proteins
in RiPP biosynthesis with OspF family proteins and LanKC/LanL did
not show significant sequence similarity.[12] In this study, CaoY and other LanY homologues were aligned with
HopA1, HopAI1, OspF family proteins, and LanKC/LanL based on predicted
secondary structure.[24] CaoY and its homologues
showed similarity to the lyase catalytic domain of OspF family proteins
and LanKC/LanL based on the secondary structure in their C-terminal
region (Figure A).
This predicted structural similarity was also reported recently for
ThoD involved in thioholgamide biosynthesis.[14] Lys136 (Lys219 in CaoY) and Tyr158 (Tyr236 in CaoY) of SpvC that
are involved in deprotonation of the α-proton of pSer/pThr residues
as revealed by structural studies[20,25] are highly
conserved in all three protein families, whereas Lys104 and His106
are missing in CaoY and its homologues. His106 acts as a catalytic
acid to facilitate Cβ-OP cleavage in OspF/SpvC and
LanKC/L (Figure B).
Interestingly, Lys104, which activates the α-proton by stabilizing
the oxygen of the enolate formed during phosphate elimination in both
OspF/SpvC and LanKC/L, is substituted by Arg187 in CaoY (Figure A), even though a Lys-to-Arg
mutation at this position in SpvC was previously shown to almost completely abolish its
lyase activity.[20] With the exception of Arg148 (Arg229 in CaoY), the residues that form
the arginine-rich pocket in SpvC and interact with the phosphate group
of the substrate do not align with residues of CaoY in the predicted secondary
structure-based alignment (Figure A,B), suggesting
that CaoY may use a different set of residues to stabilize the leaving
group.
Figure 3
(A) Predicted secondary structure-based sequence alignment analysis
of the lyase-like domains of OspF, LanKC/L, and LanY. Predicted α-helix
and β-sheet regions are framed in red and blue, respectively.
Conserved residues among the three protein families are highlighted
in red font. Arg187 and Lys219 studied by mutagenesis are highlighted
in orange. Asterisks denote the residues present in the active site
in the crystal structure of OspF and in LanKC/L proteins. Residue
numbering for SpvC and CaoY is shown at the top and bottom, respectively.
(B) Proposed catalytic mechanism for β-elimination of the phosphate
of pSer/Thr catalyzed by OspF. The phosphorylated residue is colored
in blue, and active site residues are colored in red. Corresponding
sites in CaoY (when conserved) are indicated. (C) MALDI-TOF mass spectrum
of (i) His6–CaoA modified by CaoK–CaoY-R187A
in the presence of ATP and MgCl2. [M + H]+ exp.
= 7025, obsd = 7025; (M + H + HPO3)+ exp. =
7105, obsd = 7104; (M + H + 2HPO3)+ exp. = 7185,
obsd = 7185; (M + H + 3HPO3)+ exp. = 7265, obsd
= 7265; (M + H + 2HPO3 – H2O)+ exp. = 7087, obsd = 7087; (M + H + 3HPO3 – H2O)+ exp. = 7167, obsd = 7167. (ii) His6–CaoA modified by CaoK–CaoY-K219A in the presence of
ATP and MgCl2. [M + H]+ exp. = 7025, obsd =
7023; (M + H + HPO3)+ exp. = 7105, obsd = 7102;
(M + H + 2HPO3)+ exp. = 7185, obsd = 7183. (iii)
His6–CaoA modified by CaoK–CaoY-R229Q in
the presence of ATP and MgCl2. (M + H – 8H2O)+ exp. = 6881, obsd = 6880; (M + H – 7H2O)+ exp. = 6899, obsd = 6898; (M + H + HPO3 – 3H2O)+ exp. = 7051, obsd = 7048;
(M + H + 2HPO3 – 3H2O)+ exp.
= 7131, obsd = 7127.
Site-directed mutagenesis was conducted on CaoY
to confirm the
similarity between CaoY and OspF/SpvC implied in the predicted secondary
structure. We first replaced Lys219 of CaoY with Ala. The enzyme variant
was co-expressed and copurified with wild-type His6–CaoK,
and subsequently, CaoA was supplied to the enzyme complex for in vitro dehydration (Figure S4). Instead of full dehydration, only phosphorylated CaoA was observed
after 12 h at room temperature (Figure C), indicating the loss of lyase activity of CaoY-K219A.
This result agrees well with a previous SpvC study in which the SpvC-K136A
variant showed no activity.[20] We then investigated
the importance of Arg187. The dehydratase complex formed by CaoY-R187A
and CaoK was reacted with CaoA in vitro, and phosphate
elimination activity was nearly completely abolished (Figure C), with mostly phosphorylation
observed. To test the role of Arg229 in elimination, CaoA was reacted
with CaoY-R229Q-CaoK, and inefficient dehydration was observed (Figure C). Collectively,
these results demonstrate the catalytic similarity of LanY, the OspF
family proteins, and the LanKC/LanL lyase domain.
The heterodimer
dehydratase complex formed by CaoK and CaoY also
hints at their potential functional similarity to LanKC and LanL (Figures S2 and S4). Instead of a bifunctional
enzyme, the kinase and lyase in class V interact to generate a bifunctional
enzyme complex. The co-occurrence of LanK and LanY may be used as
a bioinformatic handle for class V lanthipeptide discovery. When CaoY
was used as query protein for the construction of a genome neighborhood
network (GNN),[26,27] with 10 neighboring genes upstream
and downstream collected, high co-occurrence of CaoY homologues (PF17914)
and APH family proteins (PF01636) was observed in diverse genome contexts
(Figure S5). Similar findings were also
reported in the studies on the spr BGC.[5]
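The co-occurrence criterion described above can be sketched as a simple window search. This is not the genome neighborhood network tool used in the study; the input format (an ordered list of Pfam accession sets, one per gene) and the function name are hypothetical, and only the two Pfam accessions and the window of 10 genes come from the text.

```python
LANY_PFAM = "PF17914"  # HopA1-like family (CaoY homologues), as cited in the text
APH_PFAM = "PF01636"   # APH family (CaoK homologues), as cited in the text


def cooccurring_neighbors(genes, query_pfam=LANY_PFAM, partner_pfam=APH_PFAM, window=10):
    """Return (query_index, partner_index) pairs lying within `window` genes.

    `genes` is a hypothetical input: a list of sets of Pfam accessions,
    one set per gene, in chromosomal order.
    """
    hits = []
    for i, pfams in enumerate(genes):
        if query_pfam not in pfams:
            continue
        lo = max(0, i - window)
        hi = min(len(genes), i + window + 1)
        for j in range(lo, hi):
            if j != i and partner_pfam in genes[j]:
                hits.append((i, j))
    return hits


# Toy contig of 30 genes: one LanY-like gene and two APH-like genes.
toy_contig = [set() for _ in range(30)]
toy_contig[12] = {LANY_PFAM}
toy_contig[14] = {APH_PFAM}   # 2 genes downstream: inside the window
toy_contig[27] = {APH_PFAM}   # 15 genes downstream: outside the window

hits = cooccurring_neighbors(toy_contig)
print(hits)
```

Only the APH-like gene within 10 genes of the query is reported, mirroring the neighborhood size used for the GNN.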
Bioinformatic Study of Class V Lanthipeptide Sequence
We then investigated how the heterodimer dehydratase complex might
interact with the substrate. Precursor peptides of lanthipeptides
typically contain an N-terminal leader peptide (LP) that is critical
for the modifying enzymes to recognize the substrate and a C-terminal
core peptide, where the modifications take place.[2] Although the catalytic domains of CaoK and CaoY show sequence
similarity to LanKC/LanL, the substrate binding domain of the class
IV lanthionine synthetase SgbL[28] is absent
in CaoK. CaoA also lacks the conserved sequence found in class III
and class IV lanthipeptide LPs that is believed to be involved in
enzyme–substrate recognition.[23] Therefore,
the mechanism by which LanK–LanY recognize their precursor
peptide remains poorly understood.
To target the substrate residues
that are potentially involved in leader peptide recognition by the
class V dehydratase complex, we generated a multiple sequence alignment
(MSA) of putative class V LanA sequences using the CaoY GNN described
above.[26,27] From the BGCs with co-occurring LanK and
LanY, ORFs shorter than 110 residues[8] containing
Cys and Ser/Thr in their C-terminal region were collected and aligned.
The MSA revealed two types of peptides that show remarkable overall sequence similarity
within each group (type-A and type-B) (Figure S6), as well as a more heterogeneous group, which we termed type-C, that
was poorly aligned with the other sequences.
Interestingly, while type-B potential LanA ORFs are all from cyanobacteria,
type-A LanA ORFs were all found in genomes of actinobacteria and especially Streptomyces, which encode a large number of lanthipeptides.[29] All three characterized class V lanthipeptides
thus far, including cacaoidin, fall into type-A.[3−5] In type-A peptides,
Tyr–17 in the CaoA leader peptide is fully conserved (Figure S6), implying potential importance in
substrate recognition (RiPP leader peptides have negative residue
numbers counting back from the core peptide, whereas the core peptide
has positive numbers, as shown in Figure A). Ala–3 and Pro–2 are also
highly conserved through the entire group. In contrast, CaoA Met–24,
Phe–20, and Glu–1 are replaced by Leu, Tyr, and Ala
in most peptides. The negatively charged Glu in position −12
is also highly conserved, with Asp substituted in some sequences.
C-terminal to this Glu/Asp residue is usually a hydrophobic residue
(Leu, Val, Phe, Ile).
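The selection of candidate LanA ORFs described above (shorter than 110 residues, with Cys and Ser/Thr in the C-terminal region) can be sketched as a simple filter. The text does not define how large the "C-terminal region" is, so the choice of the C-terminal half below is an assumption, as are the function name and the toy sequences.

```python
def is_candidate_lana(orf, max_len=110, cterm_fraction=0.5):
    """Return True if an ORF passes the length and C-terminal composition filter.

    `cterm_fraction` is an assumed definition of the C-terminal region
    (here the last half of the sequence); the source does not specify it.
    """
    if len(orf) >= max_len:
        return False
    start = int(len(orf) * (1 - cterm_fraction))
    cterm = orf[start:]
    has_cys = "C" in cterm
    has_ser_thr = "S" in cterm or "T" in cterm
    return has_cys and has_ser_thr


# Toy sequences (invented for illustration, not real precursor peptides).
toy_orfs = {
    "short_with_cterm_CST": "MKELAAGLDDSACTSGC",
    "short_nterm_only": "MKCSTELAAGLDDAAGLLEE",
    "too_long": "MSC" * 40,
}

candidates = [name for name, seq in toy_orfs.items() if is_candidate_lana(seq)]
print(candidates)
```

ORFs passing such a filter would then be aligned to look for conserved leader peptide positions.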
AlphaFold Model for Substrate Recognition by the Dehydratase
Complex
To further investigate the enzyme–substrate
interaction, AlphaFold-Multimer was used to generate a predicted structure
of the dehydratase complex CaoK–CaoY with the substrate CaoA.[9−11] We recently showed for the enzyme TglHI, which performs post-translational
modifications on the peptide TglA, that the predicted substrate engagement
mechanism was close to that reported crystallographically for a closely
related enzyme after the AlphaFold model was made.[30,31] The resulting predicted model of CaoK–CaoY had an average
predicted local distance difference test (pLDDT) ranging from 84 to
85.6,[32] suggesting high accuracy of the
overall prediction (Figure S7A,B and Table S5). In the predicted CaoY structure, the previously discussed lyase
active site residues Arg187 and Lys219 are in close proximity, lending
support to the predicted structure (Figure S8C). Three additional positively charged residues His57, Arg217, and
Arg229 are also located around the active site, indicating their potential
involvement in catalyzing phosphate elimination (Figure S8D). Alignment of the top-five models demonstrates
a consistent prediction of how CaoK, CaoY, and the LP of CaoA interact
(Figure S7C,D). In the model, the LP makes
contact with the kinase CaoK but not the lyase CaoY. The C-terminal region
of CaoA (Ile–8 to Cys23) adopts different orientations in the
five models, with pLDDT values lower than 50 for each residue (Figures A, S7, and Table S5), consistent with the movement of the core
peptide between the kinase and lyase domains. In the highest-ranking
model (model rank_1), the CaoA LP forms two α-helices: one from
Gly–29 to Thr–18 and the other from Val–14 to
Glu–12 (Figure A,B). Both helices have hydrophilic residues exposed to solvent and
hydrophobic side chains oriented toward CaoK (Figure B).
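Because RiPP numbering has no position 0 (leader residues count back from −1 and core residues count up from 1), the lengths of the segments mentioned above are easy to miscount. The following sketch shows the bookkeeping; the helper names are assumptions, and the leader length of 29 used in the example is a hypothetical value chosen only for illustration.

```python
def ripp_position(index, leader_length):
    """Convert a 0-based index in the full precursor peptide to RiPP numbering.

    Leader residues receive negative numbers ending at -1; core residues
    start at 1. There is no position 0.
    """
    if index < leader_length:
        return index - leader_length
    return index - leader_length + 1


def span_length(first, last):
    """Number of residues from `first` to `last` (inclusive) in RiPP numbering."""
    if first == 0 or last == 0 or first > last:
        raise ValueError("positions must be non-zero and ordered")
    length = last - first + 1
    if first < 0 < last:
        length -= 1  # the span crosses the leader/core junction; skip the absent 0
    return length


helix_1 = span_length(-29, -18)    # Gly-29 to Thr-18
helix_2 = span_length(-14, -12)    # Val-14 to Glu-12
low_confidence = span_length(-8, 23)  # Ile-8 to Cys23

print(helix_1, helix_2, low_confidence)
print(ripp_position(28, leader_length=29), ripp_position(29, leader_length=29))
```

With these conventions, the first predicted helix spans 12 residues, the second spans 3, and the low-confidence region from Ile–8 to Cys23 spans 31.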
Figure 4
(A) Overall structure of the rank_1 model from
AlphaFold-Multimer
prediction of the CaoAKY complex. CaoA is colored by pLDDT values
(Table S5). CaoK and CaoY are colored in
magenta and cyan, respectively. (B) Potential interaction region between
CaoA LP and CaoK hydrophobic groove in the predicted structure, with
hydrophobic surface in yellow and hydrophilic surface in blue. CaoA
is colored in green. The two predicted α-helices, as well as Met–24
and Tyr–17 in the CaoA LP, are depicted. (C, D) Formation of
hydrophobic pockets of CaoK that are predicted to capture CaoA LP
residues Met–24 (C) and Tyr–17 (D), with CaoK surface
in magenta and LP residues of CaoA in gray.
The hydrophobic face of the first α-helix
(Gly–29
to Thr–18) interacts in the model with a hydrophobic groove
shaped by helices α13−α16 of CaoK, with Met–24
inserted into a hydrophobic pocket of CaoK made up of Leu264, Ile298,
Ile359, Ala363, Phe370, Ser373, and Leu374 (Figures C and S8A). The
aromatic ring of the side chain of CaoA Phe–20 is predicted
to interact with a pocket composed of Val287, Val290, and Leu291 on
CaoK (Figure S9A). The side chain of Tyr–17
in the LP of CaoA occupies another pocket formed by Leu197, Leu198,
Asp201, Leu204, Ile336, Met339, Leu349, Ala357, and Gly358 of CaoK
(Figure D). The hydroxyl
group of Tyr–17 is predicted to engage in a hydrogen bond with
Asp201 of CaoK. Tyr–17 is situated at the end of the helix,
and its interaction with the pocket on CaoK pivots the LP such that
it makes a 90° turn back into the CaoK hydrophobic groove. An
electrostatic interaction between Glu–12, the last residue
of the second helix of CaoA that is highly conserved (Figure S6), and Arg354 of CaoK may be functionally
important (Figure S9B). After Leu–11,
the predictions have low confidence values, and the model is not able
to provide reliable information on core peptide binding (Figure S9C).
The interactions between CaoK
and CaoY in the complex are mostly
mediated by two anti-parallel β-sheets with winged helix motifs
(Pro102 to Leu141) on CaoY that interact with three α-helices
of CaoK (Leu173 to Gln199) (Figure S8B).
Site-Directed Mutagenesis of CaoA and CaoY
The functional
importance of the residues in the LP of CaoA was probed by site-directed
mutagenesis to test the AlphaFold-Multimer prediction (Table ). Given the highly insoluble
nature of the precursor peptide that precluded direct binding studies
and kinetic investigations, we chose modification in E. coli as a proxy for substrate recognition. His6–CaoA variants were co-expressed with wild-type CaoK
and CaoY, and the resultant peptides were analyzed by MALDI-TOF MS
after purification (Table and Figure S10). In the first
helix region of CaoA (Gly–29 to Thr–18), replacement
of Met–24 with Gln, a residue with a polar uncharged side chain
of comparable size to that of Met, resulted in incomplete dehydration
with a range of products observed that had undergone different extents
of dehydration and a single phosphorylation (Figure S10). These data suggest that phosphorylation and elimination
were affected by the substitution. Replacement of Phe–20 with
Gln also resulted in incomplete dehydration, again, with the observation
of partially dehydrated and singly phosphorylated peptides. No dehydration
was observed when the fully conserved Tyr–17 was substituted
by Ala or Gln, with a small amount of phosphorylated peptide observed
for the Y–17A variant (Figure S10). Thus, it appears that the engagement of the aromatic ring of Tyr–17
with the CaoK pocket in the model plays an essential role in substrate
recognition. In the second helix (Val–14 to Glu–12),
replacement of Glu–12 by a positively charged Lys resulted
in CaoA that was partially dehydrated, and a similar result was obtained
for the variant CaoA-L–11N (Figure S10), which supports the functional importance of the electrostatic
interaction between Glu–12 of CaoA and Arg354 of CaoK and the
interaction between substrate and kinase mediated by the hydrophobic
residue that always follows Glu–12.
Table 1
Modification of Variants of CaoA after
Co-Expression with CaoK and CaoY (a)
(a) Mutations of residues that decrease
dehydration efficiency are highlighted in orange, those that abolish dehydration
in red, and those that hinder N-terminal phosphate elimination in blue.

Replacement of residues that showed low confidence
values in the
predicted model or that are not conserved generally had no influence
on enzymatic modification, providing 7–8 dehydrations even
for nonconservative substitutions (Table 1 and Figure S10; in some cases the dehydrated residues reacted in the cell with glutathione,
as previously observed).[33] These observations
are consistent with the model, as these residues mostly face away
from the interaction interface between CaoA and CaoK. Thus, Met–24,
Phe–20, Tyr–17, Glu/Asp–12, and Leu–11
appear to be the critical residues, consistent with the sequence alignment
and/or the AlphaFold model. For the LP variants at these positions,
the phenotype was either a complete loss of dehydration (Tyr–17) or production of
a range of dehydration states with one phosphorylation.

One mutation that provided a different phenotype was substitution
of Glu–1 with Lys, which resulted in fully dehydrated CaoA
along with 7-fold dehydrated phosphorylated CaoA. This observation
suggests that the final phosphate elimination is impeded by this substitution.
We treated the product with endoproteinase Glu-C to release the core
peptide for LC-MS analysis (Figures S11 and S12). Although MS/MS analysis did not allow determination of the specific
phosphorylated site, the fragmentation pattern clearly demonstrated
that the site was in the N-terminal region of the CaoA core peptide (Ser1
to Thr6). We postulate that the phosphorylated residue in the CaoA-E–1K
variant was Ser1 and that the replacement of Glu–1 with Lys
in CaoA at the junction between the LP and CP resulted in a repulsive
interaction between Lys–1 and the two positively charged amino
acids in the lyase active site (Arg187 and Lys219). This putative
deleterious interaction seems to hinder only the elimination
of a phosphorylated residue at the N-terminus of the CP, as the main
product was successfully dehydrated seven times (Figures S10 and S11). This hypothesis also implies C- to N-terminal
directionality of the dehydration events catalyzed by CaoK and CaoY.

The AlphaFold model was also leveraged to identify additional CaoY
active site residues. As noted above, His57 and Arg229 are close to
each other, while Arg217 sits on the opposite side with its side chain
oriented away from the proposed active site (Figure S8D). Mutagenesis revealed that whereas CaoY-R217Q retained
full activity, mutation of His57 to Asn resulted in only partial dehydration
(Figure S13). This compromised function
suggests that His57 might work along with Arg229 in stabilizing the
phosphate group during elimination. Mutagenesis of Asp201 of CaoK
(D201K, D201M, D201W, and D201F) to investigate its importance in engaging
the phenolic group of Tyr–17 did not lead to the isolation
of a CaoKY complex; only insoluble CaoY was observed. Similarly, attempts
to block the binding site for Met–24 by engineering a salt
bridge (I359E/F370K and I359E/L374K) did
not result in the isolation of a CaoKY complex.
Discussion
In this study, we provide detailed insight
into dehydration of
class V lanthipeptides through reconstitution of the activities of
CaoK and CaoY from the cacaoidin biosynthetic pathway. The co-expression
and in vitro experiments clearly demonstrated kinase
and lyase activity of CaoK and CaoY, respectively. A heterodimeric
dehydratase complex is formed by CaoK and CaoY, which dehydrated CaoA
eight times at the positions consistent with mature cacaoidin. The
cooperative behavior of the kinase and lyase illustrates the functional
similarity between the class V dehydratase and the class III/IV lanthionine
synthetases LanKC/LanL. Indeed, an evolutionary connection of LanY
to the OspF family proteins and the lyase domains of LanKC/LanL was
clearly revealed through mutagenesis studies based on a predicted
secondary structure alignment. Like OspF/LanL enzymes, LanY likely
utilizes a Lys for deprotonation of the α-proton of the pSer/pThr
residues, but distinct from the other two families, it uses an Arg
instead of Lys for activation of the carbonyl group of the pSer/pThr.
The residues that facilitate phosphate elimination through stabilization
of the leaving group in OspF/LanL are mostly absent in LanY with only
Arg229 conserved. It is likely that at least one additional residue
(His57) assists Arg229 in leaving group stabilization.

An AlphaFold-Multimer
prediction model of the substrate recognition
mechanism adopted by class V lanthipeptide dehydratases was supported
by mutagenesis experiments. MSA of putative class V lanthipeptide
precursor peptides suggests that this mechanism of substrate recognition
is conserved among type-A class V lanthipeptides. Cacaoidin and lexapeptide,
two class V lanthipeptides characterized to date, both display strong
antimicrobial activity,[3,4,6] indicating
the potential of class V lanthipeptides as a source of novel antibiotics.
Our study on the dehydratase of cacaoidin biosynthesis paves the way
for further investigation of class V lanthipeptide biosynthesis as
well as the biosynthesis of thioamitides that use a similar complex
for dehydration.[12,14]