
High divergence of the precursor peptides in combinatorial lanthipeptide biosynthesis.

Qi Zhang, Xiao Yang, Huan Wang, Wilfred A van der Donk.

Abstract

Lanthionine-containing peptides (lanthipeptides) are a rapidly growing family of polycyclic peptide natural products belonging to the large class of ribosomally synthesized and post-translationally modified peptides (RiPPs). These compounds are widely distributed in taxonomically distant species, and their biosynthetic systems and biological activities are diverse. A unique example of lanthipeptide biosynthesis is the prochlorosin synthetase ProcM from the marine cyanobacterium Prochlorococcus MIT9313, which transforms up to 29 different precursor peptides (ProcAs) into a library of lanthipeptides called prochlorosins (Pcns) with highly diverse sequences and ring topologies. Here, we show that many ProcM-like enzymes from a variety of bacteria have the capacity to carry out post-translational modifications on highly diverse precursor peptides, providing new examples of natural combinatorial biosynthesis. We also demonstrate that the leader peptides come from different evolutionary origins, suggesting that the combinatorial biosynthesis is tied to the enzyme and not a specific type of leader peptide. For some precursor peptides encoded in the genomes, the leader peptides apparently have been truncated at the N-termini, and we show that these N-terminally truncated peptides are still substrates of the enzymes. Consistent with this hypothesis, we demonstrate that about two-thirds of the ProcA N-terminal sequence is not essential for ProcM activity. Our results also highlight the potential of exploring this class of natural products by genome mining and bioengineering.

Year: 2014 PMID: 25244001 PMCID: PMC4245175 DOI: 10.1021/cb500622c

Source DB: PubMed Journal: ACS Chem Biol ISSN: 1554-8929 Impact factor: 5.100

Ribosomally synthesized and post-translationally modified peptides (RiPPs) are a major class of natural products, as revealed by the genome sequencing efforts of the past decade.[1] These compounds are produced in all three domains of life and possess vast structural diversity. Among the best-studied RiPPs are lanthipeptides, a class of compounds that are distinguished by the presence of thioether cross-linked amino acids named lanthionines and methyllanthionines.[2−7] Many lanthipeptides, such as the commercially used food preservative nisin, have potent antimicrobial activity and are termed lantibiotics.[8,9] Lanthipeptides are widely distributed among taxonomically distant species[10] and are currently grouped into four distinct classes according to their biosynthetic machineries.[3,10] Like all RiPPs, lanthipeptides are generated from a linear precursor peptide, which is generically termed LanA. This precursor peptide consists of a C-terminal core peptide, where all post-translational modifications take place, and an N-terminal leader peptide that is important for post-translational modification and that is subsequently removed by proteolysis (Figure 1).[1,11] The installation of the (methyl)lanthionine thioether bridges is achieved by the initial dehydration of Ser and Thr residues in the precursor peptides, followed by stereoselective intramolecular Michael-type addition of Cys thiols to the newly formed dehydroamino acids (Figure 1).

Figure 1

Schematic representation of the biosynthetic pathway of lanthipeptides exemplified by prochlorosin 2.8. A shorthand notation for lanthionine structures is shown in the box. Leader and core peptides are not shown in proportion to their actual lengths.

An intriguing example of a lanthipeptide synthetase is ProcM, a class II enzyme (generically termed LanM) from the planktonic marine cyanobacterium Prochlorococcus MIT9313.[12] ProcM acts on up to 29 different precursor peptide substrates (ProcAs) and produces a library of lanthipeptides termed prochlorosins (Pcns) that possess highly diverse sequences and ring topologies,[12,13] representing a remarkable example of natural combinatorial biosynthesis. The biological role of Pcns is currently elusive, but they are believed to be functional, as they were found to be produced in the host strain, and their biosynthetic genes were transcribed in response to changes in environmental conditions.[12] The intriguing combinatorial biosynthesis of Pcns provides an interesting model to investigate the evolution of natural product diversity and the molecular origins for the remarkable substrate tolerance displayed by the enzyme. The ProcA substrates have an unusually long leader peptide compared to that of other lanthipeptide substrates, raising the question of whether this longer leader peptide might be correlated with the large diversity of substrates that ProcM processes. The ProcA leader peptides are also unique in that they have sequence homology with the Nif11 proteins.[14] The exact function of the Nif11 proteins is not known, but they are thought to play a role in nitrogen fixation, as their genes cluster with other nitrogen fixation genes.[15] An alternative model that has been proposed is that it is the cyclization active site of ProcM that is unique and that confers the ability to cyclize a wide variety of substrates.
Here, we present bioinformatic and biochemical investigations on lanthipeptide biosynthetic systems employing ProcM-like enzymes. We show that the precursor peptides for lanthipeptide biosynthesis are highly divergent among different biosynthetic systems and that many ProcM-like lanthipeptide synthetases can engage in combinatorial biosynthesis by tolerating precursor peptides with highly diverse core sequences.

Results and Discussion

Genome Mining of LanAs Associated with ProcM Analogues

ProcM catalyzes both dehydration and cyclization reactions to transform the linear ProcA peptides into a panel of Pcns. The enzyme contains a conserved CCG motif, suggesting that it likely uses three Cys residues for binding of an active site zinc ion,[10] unlike other lanthipeptide cyclases that utilize two Cys and one His as zinc ligands. This zinc site has been shown to be important for the cyclization reaction.[16,17] The presence of three Cys ligands may account in part for the high substrate tolerance of ProcM, as model studies have demonstrated that the reactivity of thiolate nucleophiles ligated to Zn²⁺ is enhanced with an increased number of thiolate ligands.[18] We previously have shown that LanMs containing the CCG motif cluster together to form a distinct subclade in the LanM phylogenetic tree, suggesting that these ProcM-like enzymes evolved independently.[10] To investigate whether other ProcM-like enzymes also have multiple structurally diverse LanA substrates, we performed a genome-wide examination of the LanAs associated with ProcM analogues. This investigation showed that similar lanthipeptide synthetases can have very diverse precursor peptides, which vary significantly in both the number of putative substrates and their amino acid sequence (Figure 2A and Supporting Information Table 1 for a detailed list). Unlike ProcM, many ProcM analogues have only a single LanA substrate, as found for most lanthipeptides. On the other hand, many other ProcM-like enzymes have several substrates, harboring either very similar or highly diverse core peptide sequences (Figure 2A and Supporting Information Table 1). A notable example is a ProcM analogue from Prochlorococcus MIT9303, which has almost the same sequence as ProcM (95% identity, 97% similarity). This organism encodes 15 putative substrate peptides (Figure 2A), and none of these substrates have a core peptide similar in sequence to the core peptides of the 29 ProcAs. This observation raises the possibility that many, or all, Pcns may lack a biological function and that they may represent intermediaries during evolution. Conversely, these two lanthipeptide biosynthetic systems may represent a remarkable example of convergent evolution of functional lanthipeptides, if the structurally diverse peptides fulfill similar roles in the closely related species.

Figure 2

Genome mining of precursor peptide genes associated with ProcM-like enzymes. (A) Bayesian MCMC phylogram of ProcM-like enzymes (protein sequence) and a summary of the number of their putative LanA substrates. The lacticin 481 synthetase LctM and nukacin synthetase NukM were used as an outgroup for Bayesian MCMC analysis, which is shown as an orange triangle. The detailed Bayesian MCMC tree is shown in Supporting Information Figure 29. The putative lanA genes were categorized into two groups based on whether they are spatially close to their associated lanM genes. For LanAs that lack Cys residues, the substrates are shown as total number of lanAs/the number of lanAs that do not code for Cys. If an enzyme had multiple LanA substrates, then the core peptide sequences were aligned to examine whether these precursor peptides are similar (S, for which the Ser/Thr and Cys residues are aligned well) or diverse (D, for which Ser/Thr and Cys residues did not align well). ProcM from Prochlorococcus MIT9313, NpnM from Nostoc punctiforme PCC 73102, and four LanMs (CyanM1–4) from Cyanothece sp. PCC 7425 are highlighted in red, blue, and green, respectively. CyanM1 is highlighted by an asterisk. Three groups of substrates shown in blue contain leader peptides that share very weak similarities with the N11P family (1 × 10⁻⁴ < e-value < 0.1). NA indicates that the precursors do not belong to any known protein family. (B) Sequence alignment of CyanA1.1–1.3, showing that CyanA1.1 and CyanA1.3 may have been truncated at their N-termini. Alternatively, the open reading frame (ORF) annotation of CyanA1.2 could be incorrect, and its translation start codon may instead be at the light brown arrow, like CyanA1.1 and 1.3. Completely conserved and highly conserved residues in the leader peptides are shown in black and gray boxes, respectively. Ser/Thr and Cys residues in the core peptides are shown in blue and red boxes, respectively. The proteolytic cleavage site is indicated by a green arrow. For detailed information on precursor peptide sequence and the procedures for bioinformatics analysis, see Supporting Information Table 1 and Supporting Information Methods.

The vast majority of ProcM-like enzymes with the CCG motif were found in cyanobacteria, a rich source of RiPP natural products,[19] but they are also present in other phyla (Figure 2A). The precursor peptide genes associated with these ProcM-like synthetases appear to have different evolutionary origins. Many of the LanA leader peptides are members of the N11P family, but several other leader peptides belong to the nitrile hydratase leader peptide (NHLP) family (Figure 2A).[14] NHLP is highly similar to the α subunit of nitrile hydratase (NHase) but lacks about 30 amino acids in the middle of the NHase sequence that are important for binding of a catalytic metal ion.[14,20] We note that many cyanobacteria (e.g., Prochlorococcus strains) do not have Nif11 and/or NHase genes and therefore the evolutionary relationship between these enzymes and the leader peptides is not clear. In addition to the N11P and NHLP leader peptides, the LanAs from the planktonic cyanobacterium Synechocystis sp. PCC 7509 have leader peptides that belong to the TIGR03898 family (Figure 2A).[21] This family mostly consists of the leader peptides from LanAs in firmicutes (e.g., MrsA in mersacidin biosynthesis). The five LanAs in Synechocystis sp. PCC 7509 may thus be a result of horizontal gene transfer, possibly from a firmicute. Furthermore, many of the leader peptides of LanAs associated with ProcM-like enzymes do not belong to any known protein family (Figure 2A), illustrating the diverse origins of lanthipeptide precursor genes that cluster with the lanM genes in the ProcM clade. Another interesting observation is that some LanAs may have been truncated at their N-termini. For example, three putative precursor genes (cyanA1.1–1.3) were found adjacent to a ProcM-like LanM gene (cyanM1) in the genome of the cyanobacterium Cyanothece sp. PCC 7425. Compared with CyanA1.2, CyanA1.1 and CyanA1.3 appear to be much shorter in length (Figure 2B). Analysis of the 5′-untranslated regions (UTRs) of CyanA1.1 and CyanA1.3 showed that their sequences are very similar to the coding region of the N-terminus of CyanA1.2 and that the putative truncation in CyanA1.1 might be caused by a point mutation resulting in a stop codon in the 5′-UTR (Supporting Information Figure 1). An alternative interpretation is that the longer CyanA1.2 is based on an incorrectly assigned start codon; translation initiation at a later start codon would result in a peptide that has the same length as that of CyanA1.1 and CyanA1.3 (Figure 2B). We did observe what appears to be a true N-terminally truncated procA gene (here named procAt.1) in the genome of Prochlorococcus MIT9313, which escaped our previous annotation for procA genes.[12] The truncation of ProcAt.1 seems to be caused by the transformation of the original start codon to an ochre stop codon, resulting in translation initiation at a different downstream start codon (Supporting Information Figure 2).

Much of the N-Termini of ProcAs Are Dispensable for ProcM Activity

Unlike the newly identified ProcAt.1, whose leader peptide has 45 amino acids, all of the other ProcA leader peptides have more than 60 amino acids (Supporting Information Figure 3). As mentioned above, ProcA leader peptides belong to the N11P family[14] and are much longer than those of other class II lantibiotics (e.g., as a typical example, the leader peptide of LctA for lacticin 481 biosynthesis has only 23 residues[22,23]). It has been shown that the lacticin 481 synthetase LctM recognizes both the LctA leader and core peptides[24] and thus one possible role of the much longer leader peptides of ProcAs is that they provide additional recognition elements for ProcM, allowing for the remarkably high substrate tolerance. In this scenario, N-terminally truncated LanAs are likely the vestiges of precursor peptide divergence during evolution and might not necessarily be real substrates. To test whether ProcAt.1 is a ProcM substrate, we coexpressed ProcAt.1 with ProcM in Escherichia coli, and the resulting peptide was digested by endoprotease Glu-C. High-resolution matrix-assisted laser desorption/ionization time-of-flight (MALDI-ToF) mass spectrometry (MS) analysis showed that ProcAt.1 had been dehydrated up to three times (Figure 3 and Supporting Information Figure 4). N-Ethylmaleimide (NEM) derivatization, which was employed to derivatize any free cysteine residues, showed that although lanthionine formation for the 2-fold dehydration product is incomplete, the 3-fold dehydration product is fully cyclized (Figure 3), strongly suggesting that ProcAt.1 is a true substrate of ProcM.
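The mass-spectrometric reasoning used here can be made concrete with a small calculation. The following is a minimal sketch, not the authors' analysis procedure: it assumes that each dehydration removes one water (monoisotopic 18.010565 Da), that NEM adds C6H7NO2 (monoisotopic 125.047679 Da) to each free Cys thiol, and that each thioether ring consumes one Cys. The function names and the 3000 Da example peptide are hypothetical.

```python
# Monoisotopic masses in Da (standard values, not taken from the paper).
WATER = 18.010565   # H2O lost per Ser/Thr dehydration
NEM = 125.047679    # C6H7NO2 added per free Cys thiol


def free_cys_after_cyclization(total_cys, rings):
    """Each (methyl)lanthionine ring consumes one Cys thiol."""
    if rings < 0 or rings > total_cys:
        raise ValueError("rings must be between 0 and the number of Cys")
    return total_cys - rings


def expected_mass(unmodified_mass, dehydrations, free_cys=0):
    """Expected monoisotopic mass after dehydration and NEM labeling."""
    return unmodified_mass - dehydrations * WATER + free_cys * NEM


# Hypothetical peptide: 3000 Da unmodified, three dehydrations, three Cys.
base = 3000.0
for rings in range(4):
    free = free_cys_after_cyclization(3, rings)
    print(rings, "rings:", round(expected_mass(base, 3, free), 3), "Da")
```

A fully cyclized species shows no NEM adduct, so its mass after NEM treatment equals the dehydrated mass; every remaining free Cys shifts the signal upward by one NEM unit.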

Figure 3

Modification of ProcAt.1 by ProcM. (A) MALDI-ToF-MS analysis of ProcAt.1 that was obtained by coexpression with ProcM and treated with endoproteinase Glu-C (trace i) and subsequently derivatized by NEM (trace ii). (B) Sequence of ProcAt.1 modified by ProcM and treated with Glu-C. The ESI-MS/MS fragmentation pattern for the 3-fold dehydrated species is shown (the MS/MS data is presented in Supporting Information Figure 4).

To further interrogate the importance of the full-length ProcA leader peptide, we made a series of ProcA2.8 mutants lacking 30, 40, and 50 N-terminal amino acids (Figure 4A) and coexpressed these mutants with ProcM in E. coli. MALDI-ToF-MS analysis clearly showed that mutants lacking the 30 and 40 N-terminal residues were fully dehydrated by ProcM (Figure 4B,C). The peptides were subsequently digested by endoprotease Asp-N and subjected to NEM derivatization. Compared with the control peptides that were expressed in the absence of ProcM and that were fully derivatized by NEM on their two free Cys residues, the ProcM-modified peptides did not react with NEM (Figure 4D,E), indicating that these peptides were fully cyclized. MS/MS analysis confirmed that the correct lanthionine rings of ProcA2.8 were produced in both truncated substrates (Supporting Information Figure 5). Hence, ProcM is able to perform catalysis not only on substrates with dramatically varied core sequences but also on mutants that are significantly truncated in the leader peptide. Although the dispensability of part of the N-terminal leader peptides has previously been shown in the biosynthesis of the lantibiotic lacticin 481,[23] the lasso peptide MccJ25,[25] and the biosynthetic enzymes of cyanobactin biosynthesis,[26−28] the fact that about two-thirds of the leader peptide is dispensable for ProcM activity is surprising considering the high substrate tolerance of the enzyme. The possibility that the long ProcA leader peptides are correlated with ProcM’s tolerance of highly varied core sequences is thus not supported by our studies. Whether ProcM could modify the peptide lacking the 50 N-terminal residues could not be determined because this mutant peptide could not be obtained, regardless of whether the peptide was coexpressed with ProcM or not, suggesting that a certain minimum length of ProcA2.8 might be necessary for peptide stability, at least in E. coli.
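The residue numbering of the truncation variants follows directly from the number of N-terminal residues removed. The sketch below illustrates that bookkeeping only; it assumes the 82-residue numbering implied by the label ProcA2.8-(31–82), and it uses a placeholder string because the actual ProcA2.8 sequence is not given in the text. The function name is hypothetical.

```python
def truncate_n_terminus(sequence, n_removed):
    """Drop the first n_removed residues.

    Returns a (label, remaining_sequence) pair, where the label uses
    1-based inclusive residue numbering of the full-length peptide.
    """
    if not 0 <= n_removed < len(sequence):
        raise ValueError("n_removed must leave at least one residue")
    first_kept = n_removed + 1  # 1-based index of the first retained residue
    label = f"({first_kept}-{len(sequence)})"
    return label, sequence[n_removed:]


# Placeholder 82-residue string; NOT the real ProcA2.8 sequence.
placeholder = "M" + "A" * 81

for removed in (30, 40, 50):
    label, remaining = truncate_n_terminus(placeholder, removed)
    print(f"lacking {removed} residues -> {label}, {len(remaining)} residues left")
```

Under this numbering, removing 30, 40, and 50 residues gives variants starting at residues 31, 41, and 51, matching the labels used for the two variants that could be expressed.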

Figure 4

ProcM modification of truncated ProcA2.8 derivatives. (A) Sequence of ProcA2.8 and schematic representation of the truncation variants discussed in this study. The purple arrow shows the physiological proteolytic cleavage site for leader peptide removal. The blue arrow shows the endoprotease Asp-N site that was used in this study to shorten the peptide and allow better analysis of the post-translational modifications in the core peptide. (B) MALDI-ToF MS analysis of ProcA2.8-(31–82) that was obtained either by expressing the peptide alone (trace i) or by coexpression with ProcM (trace ii). (C) MALDI-ToF MS analysis of ProcA2.8-(41–82), presented in the same manner as for ProcA2.8-(31–82) in panel B. (D) ProcA2.8-(31–82) peptides (unmodified and modified) were digested by Asp-N and subsequently treated with NEM. Trace i shows the unmodified peptide before NEM derivatization, and trace ii demonstrates complete NEM derivatization of this peptide. Traces iii and iv show the ProcM-modified peptide before and after NEM treatment, respectively. No derivatization of the modified peptides is observed, strongly suggesting formation of lanthionine rings in the ProcM-modified peptide. (E) MALDI-ToF MS analysis of Asp-N-digested ProcA2.8-(41–82). The data are shown as in panel D. In all of the MS spectral data shown, the signals corresponding to the unmodified and the ProcM-modified peptides are highlighted in yellow and green, respectively, whereas the NEM-derivatized peptides are highlighted in light blue. Some of the nonhighlighted peaks are derived from proteolysis products of the leader peptide and Asp-N.

Lanthipeptides from Cyanothece sp. PCC 7425

Since, at present, only the prototypical ProcM has been shown to actually carry out combinatorial biosynthesis, we investigated whether a subset of ProcM-like enzymes are active with their associated LanA peptides, which carry diverse leader peptides. We first investigated a particularly interesting lanthipeptide biosynthetic system in the marine cyanobacterium Cyanothece sp. PCC 7425, a strain that also produces several cyanobactins.[29] The genome of this organism encodes four LanMs (here termed CyanM1–4) that constitute a separate phylogenetic subclade (Figures 2A and 5A). Two sets of three lanA genes (cyanA1.1–1.3 and cyanA4.1–4.3) were found adjacent to cyanM1 and cyanM4, respectively, whereas cyanM2 has only a single lanA gene (cyanA2.0) nearby (Figure 5A). Multiple lanA genes (cyanA3.1–3.7) were found adjacent to cyanM3. In addition, another locus approximately 2.8 Mbp away from cyanM3 encodes five more putative LanAs (cyanA3.8–3.12) with leader peptides that are very similar to those of CyanA3.1–3.7 but with lower sequence homology to the leader peptides of the other CyanAs (Figure 5A and Supporting Information Table 1). To correlate these precursor peptides with their corresponding enzymes, a sequence similarity network was constructed based on their leader sequences. The results show that CyanAs can be divided into four groups (Figure 5B), suggesting that they are substrates of four different enzymes.

Figure 5

Lanthipeptide biosynthesis in Cyanothece sp. PCC 7425. (A) Four lanthipeptide biosynthetic systems in Cyanothece sp. PCC 7425, showing the gene clusters of each system and their locations in the genome. (B) Sequence similarity network based on the leader peptide sequence of CyanAs. Each node represents a leader peptide, and each edge (line) indicates a pair of nodes (leader peptides) that have a BlastP e-value more stringent than the cutoff value used (1 × 10⁻⁷). Different biosynthetic systems are depicted by different colors. (C) High level of conservation in the N-terminal leader sequence and hypervariability of the C-terminal core peptide of CyanA3.1–3.12. The GG/GA protease cleavage site for leader peptide removal is marked by a green arrow. For the sequences of CyanA1, CyanA2, and CyanA4, see Supporting Information Table 1.

CyanA1.1, CyanA1.2, and CyanA1.3 were first tested as potential substrates for CyanM1 by coexpression of each cyanA1.1–1.3 individually with cyanM1 in E. coli. MALDI-ToF MS analysis showed that all three peptides were modified by the enzyme (Supporting Information Figures 6–8). The 12 peptides CyanA3.1–3.12 have highly conserved leader regions but very diverse core sequences (Figure 5C) and thus the CyanM3–CyanA3.x system could be similar to the combinatorial biosynthetic system of the prochlorosins. To test this hypothesis, the 12 genes, cyanA3.1–3.12, were each coexpressed individually with cyanM3, and MALDI-ToF MS analysis showed that all peptides were modified by the enzyme (Supporting Information Figures 9–20), demonstrating very high substrate tolerance for CyanM3, similar to the observations with ProcM.[12] Some precursor peptides have sequences that allowed cleavage by commercially available proteases to remove most of the leader peptides without proteolysis in the core peptide. MALDI-ToF MS analysis of the proteolytic products clearly demonstrated that the dehydrations take place in the core peptide (Supporting Information Figures 16–18). Given that CyanM1–4 form a distinct phylogenetic subclade, we reasoned that the three other enzymes might also have high substrate tolerance. In line with this proposal, we showed that CyanM4 can modify not only its putative substrate CyanA4.1 (Supporting Information Figure 21) but also a chimeric peptide consisting of the CyanA4.1 leader and CyanA1.2 core peptide (Supporting Information Figure 22); the latter is very different from the CyanA4.1 core peptide. However, CyanM4 did not modify CyanA1.2 (Supporting Information Figure 23), supporting the model that it is the cognate leader peptide (i.e., the leader peptide of CyanA4.1) that plays a requisite role for enzyme catalysis,[11] although probably not for substrate promiscuity.

LanAs That Do Not Have a Cysteine Residue

Cysteine is a required residue for formation of the thioether bridges of lanthipeptides. However, we noted that several putative LanAs associated with ProcM-like enzymes only have Ser/Thr residues and lack any Cys residues (Figure 2A and Supporting Information Table 1). These peptides include ProcA4.1, which is the only Cys-free peptide among 30 ProcAs (including the newly characterized ProcAt.1). To investigate whether ProcA4.1 is a ProcM substrate, the peptide was coexpressed with ProcM, and MALDI-ToF-MS analysis showed that the resulting peptide was dehydrated despite the lack of Cys residues (Figure 6A and Supporting Information Figure 24). Since the 29 other ProcA peptides all contain at least one Cys (Supporting Information Figure 3), the absence of a Cys in ProcA4.1 likely resulted from mutations in the core sequence of an ancestor LanA.

Figure 6

Coexpression studies of Cys-lacking peptides with LanMs. (A) MALDI-ToF MS analysis of ProcA4.1 that was obtained by coexpression with ProcM. Also shown is the sequence of the ProcA4.1 core (obtained by TEV cleavage of a ProcA4.1 mutant containing an engineered TEV cleavage site just before the predicted core sequence) and the MS/MS fragmentation pattern for the 3-fold dehydrated species. (B) MALDI-ToF MS analysis of NpnA3 that was obtained by coexpression with NpnM in E. coli. Also shown is the sequence of endoproteinase Glu-C cleaved NpnA3 and the MS/MS fragmentation pattern for the 4-fold dehydrated species. (C) MALDI-ToF MS analysis of NpnA6 obtained similarly to that for NpnA3 in panel B. Also presented is the sequence and MS/MS fragmentation pattern for 3-fold dehydrated NpnA6. The MS/MS data for 3-fold dehydrated ProcA4.1, 4-fold dehydrated NpnA3, and 3-fold dehydrated NpnA6 are shown in Supporting Information Figures 24–26, respectively.

A more unusual example is found in N. punctiforme PCC 73102, which encodes putative RiPP precursor peptides that have very conserved putative leader peptides but highly diverse core peptides. Among the six peptides identified, none has a Cys residue (Supporting Information Table 1), but the genes encoding these peptides do cluster with a ProcM-like gene. Other genes encoding known RiPP biosynthetic enzymes (e.g., cyclodehydratases for thiazole/oxazole formation[14,30]) were not found in the genome of the organism, suggesting the peptides are putative LanM substrates. Notably, unlike ProcAs and CyanA3.x, whose leader peptides belong to the N11P protein family, the leader peptides of these putative RiPP precursors (here termed NpnAs) belong to the NHLP family (Figure 2A). To test whether these NHLP-containing and Cys-lacking peptides are LanM substrates, two NpnAs with very different core peptides (NpnA3 and NpnA6) were coexpressed with NpnM. MALDI-ToF MS analysis of the resulting peptides showed that both NpnAs were dehydrated (Figure 6B,C and Supporting Information Figures 25–26). Attempts to form a lanthionine by appending a five-amino acid sequence containing a Cys to NpnA3 were unsuccessful (Supporting Information Figure 27). Detailed sequence analysis showed that NpnM may have an impaired cyclase active site because the enzyme has a QG motif instead of a HG motif found in almost all LanC and LanM proteins (Supporting Information Figure 28). The His in this motif is essential for cyclase activity of the class I lanthipeptide cyclase NisC,[31] and its mutation in NpnM may be a consequence of its substrates no longer requiring cyclization. These observations provide support for the model in which lanthipeptide synthetases coevolved with their substrates during the evolutionary process.[10] The function(s) of the Cys-lacking NpnAs is also an interesting question, since the mature products cannot be lanthipeptides. As previously noted,[32,33] a gene encoding a member of the zinc-dependent dehydrogenase enzymes (NpnJ) is encoded near the LanM gene. For the lantibiotics lacticin 3147 and carnolysin, the enzymes LtnJ and CrnJ convert Dha into d-Ala residues.[33,34] Hence, it is possible that the products of the gene cluster in N. punctiforme PCC 73102 are d-amino acid-containing peptides, if NpnJ hydrogenates the dehydroamino acids formed by NpnM. Unfortunately, we did not obtain soluble NpnJ by heterologous expression in E. coli. Similarly, previous attempts to complement a ΔltnJ mutant of Lactococcus lactis with npnJ were unsuccessful, and insoluble expression could not be ruled out.[32]

Conclusions

The tremendous structural and functional diversity of lanthipeptides raises many questions regarding their biological roles, origins, and evolutionary mechanisms.[35−38] The Pcn biosynthetic system is arguably one of the most remarkable examples of natural product combinatorial biosynthesis found in nature. By genome mining for putative lanthipeptides whose biosyntheses involve ProcM-like enzymes, we show that the LanA precursor peptides are highly diverse among different systems and that phylogenetically closely related lanthipeptide synthetases can be associated with very different sets of substrates. In addition, we demonstrated that much of the N-terminal ProcA leader peptide is not required for enzyme activity. This work also extends the combinatorial biosynthesis paradigm to many other organisms by experimental demonstration that ProcM-like enzymes that have multiple nearby precursor peptides have the capacity to process a widely diverse set of core peptides. Given the sequence diversity of their leader peptides and the demonstration that much of the leader peptide is dispensable, this substrate tolerance is not imparted by the leader peptide and hence it is likely that it is a property of the ProcM enzyme clade. The findings herein thus highlight the potential of exploring lanthipeptides by genome mining and for biosynthetic engineering efforts to produce novel lanthipeptides with desired biological activities by mimicking the natural evolutionary process.

Methods

Materials

All oligonucleotides were purchased from Integrated DNA Technologies (IDT). Restriction endonucleases, DNA polymerases, T4 DNA ligase, and endoproteinase Asp-N and Glu-C were purchased from New England Biolabs. Media components for bacterial cultures were purchased from Difco Laboratories. Chemicals were purchased from Fisher Scientific or from Aldrich unless noted otherwise. E. coli DH5α was used as host for cloning and plasmid propagation, and E. coli BL21 (DE3) was used as a host for coexpression. Cyanothece sp. PCC7425 was purchased from ATCC. A synthetic npnM gene was ordered from GeneArt, and synthetic npnA genes were ordered from IDT.

General Methods

All polymerase chain reactions (PCR) were carried out on a C1000 thermal cycler (Bio-Rad). DNA sequencing was performed by the Biotechnology Center at the University of Illinois at Urbana–Champaign, using appropriate primers. MALDI-ToF MS was carried out on a Bruker Daltonics Ultraflex MALDI ToF/ToF mass spectrometer. ESI MS/MS analyses were performed on a Synapt ESI quadrupole ToF mass spectrometry system (Waters). Deisotoping and deconvolution of ESI MS/MS spectra were performed using the MaxEnt3 program (Waters). Detailed procedures for cloning are described in the Supporting Information. Primer sequences are included in Supporting Information Table 2.

Genome Mining of ProcM-like Enzymes and Their Associated LanA Substrates

To identify ProcM-like enzymes, BlastP searches were performed using the ProcM protein sequence as the query. Hits were selected with identity >30% and gap <8%. To identify putative lanAs, the annotated open reading frames (ORFs) around lanMs were inspected manually, and BlastP searches were performed within this genome using either a ProcA leader peptide sequence, the leader sequence from other LanAs identified in the same genome, or the leader sequences from known LanAs as the queries. Detailed genome mining procedures are described in the Supporting Information. LanA sequences are shown with their corresponding LanM enzymes in Supporting Information Table 1.
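The hit-selection thresholds described above (identity >30% and gap <8%) can be expressed as a simple filter. This is a minimal sketch under stated assumptions, not the authors' pipeline: it assumes that both percentages are computed relative to the alignment length, and the `Hit` record, its field names, and the example hits are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    subject_id: str
    identical: int   # identical positions in the alignment
    gaps: int        # gap positions in the alignment
    align_len: int   # total alignment length


def passes(hit, min_identity=30.0, max_gap=8.0):
    """Apply the identity >30% and gap <8% thresholds to one hit.

    Assumption: both percentages are taken over the alignment length.
    """
    identity_pct = 100.0 * hit.identical / hit.align_len
    gap_pct = 100.0 * hit.gaps / hit.align_len
    return identity_pct > min_identity and gap_pct < max_gap


# Hypothetical hits for illustration only.
hits = [
    Hit("hypothetical_LanM_1", identical=450, gaps=20, align_len=1000),
    Hit("hypothetical_LanM_2", identical=280, gaps=10, align_len=1000),
    Hit("hypothetical_LanM_3", identical=400, gaps=100, align_len=1000),
]

selected = [h.subject_id for h in hits if passes(h)]
print(selected)
```

In this example only the first hit survives: the second fails the identity threshold and the third fails the gap threshold.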

Phylogenetic and Network Analysis

Bayesian MCMC inference analyses were performed using the program MrBayes (version 3.2).[39] Final analyses consisted of two sets of eight chains each (one cold and seven heated), run for about 2 million generations with trees saved and parameters sampled every 100 generations. A mixed amino acid model was utilized, and analyses were run to reach a convergence with standard deviation of split frequencies <0.01. Posterior probabilities were averaged over the final 75% of trees (25% burn in). Network analysis was performed by BLAST searches comparing the CyanA leader peptides against one another. A Matlab script was written to remove all duplicate comparisons, and the result was imported into the Cytoscape software package.[40] The nodes were arranged using the yFiles organic layout provided with Cytoscape version 2.8.3.
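The duplicate-removal and grouping steps of the network analysis can be sketched in a few lines. The original work used a Matlab script and Cytoscape; the Python below is only an illustrative stand-in. It assumes that reciprocal comparisons (A vs B and B vs A) count as duplicates, that the lower e-value of a duplicated pair is kept, that an edge requires an e-value below the 1 × 10⁻⁷ cutoff given in the Figure 5 legend, and that groups correspond to connected components. All peptide names and e-values in the example are hypothetical.

```python
def deduplicate(comparisons, cutoff=1e-7):
    """Collapse reciprocal and self comparisons into unique edges.

    comparisons: iterable of (query, subject, evalue) tuples.
    Returns {(a, b): evalue} with a < b, keeping the lowest e-value
    per pair and only pairs more stringent than the cutoff.
    """
    best = {}
    for query, subject, evalue in comparisons:
        if query == subject:
            continue  # self hits carry no network information
        pair = tuple(sorted((query, subject)))
        if pair not in best or evalue < best[pair]:
            best[pair] = evalue
    return {pair: e for pair, e in best.items() if e < cutoff}


def groups(nodes, edges):
    """Return connected components as sorted lists (union-find)."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)
    clusters = {}
    for n in nodes:
        clusters.setdefault(find(n), set()).add(n)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical pairwise results for illustration only.
raw = [
    ("A1.1", "A1.2", 1e-20), ("A1.2", "A1.1", 3e-20),
    ("A3.1", "A3.2", 1e-15), ("A1.1", "A3.1", 0.5),
    ("A1.1", "A1.1", 1e-40),
]
edges = deduplicate(raw)
print(groups(["A1.1", "A1.2", "A2.0", "A3.1", "A3.2"], edges))
```

A leader peptide with no edge above the stringency threshold, such as the single hypothetical node `A2.0` here, forms its own group.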

General Procedure for LanA in Vivo Modification

Electrocompetent E. coli BL21 (DE3) cells were transformed with the coexpression constructs. Cultures were inoculated from single-colony transformants and grown overnight at 37 °C in 20 mL of LB broth supplemented with 50 mg L–1 kanamycin. The overnight culture was used to inoculate 1 L of LB broth, and cells were grown at 37 °C to OD600 ≈ 0.6–0.8. Expression was induced by the addition of 0.2 mM IPTG, and the culture was incubated at 18 °C for 18 h. After harvesting, the pellet was resuspended in 35 mL of denaturing LanA buffer 1 (6 M guanidine hydrochloride, 20 mM NaH2PO4, 500 mM NaCl, 0.5 mM imidazole, pH 7.5). The cell paste was subjected to sonication, and the cell debris was removed by centrifugation. The supernatant was purified by immobilized metal affinity chromatography (IMAC) using His60 Ni Superflow Resin (Clontech), and the elution fractions were desalted and purified by reversed-phase HPLC using a Waters Delta-pak C4 column. The fractions were either analyzed directly by MALDI-ToF-MS or treated with the appropriate endoprotease (Glu-C, Asp-N, or TEV protease) to remove the leader peptide before MALDI-ToF-MS and/or ESI-MS/MS analysis.
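As a small worked example of the additions in this procedure, the sketch below applies the standard dilution relation C1·V1 = C2·V2 to the stated final concentrations of IPTG and kanamycin. The stock concentrations (1 M IPTG, 50 mg/mL kanamycin) are assumptions chosen for illustration and are not given in the text; the helper name `stock_volume_ml` is hypothetical.

```python
def stock_volume_ml(final_conc, final_volume_ml, stock_conc):
    """C1*V1 = C2*V2: volume of stock (mL) to add; both concentrations in the same unit."""
    return final_conc * final_volume_ml / stock_conc


# 0.2 mM IPTG in a 1 L culture, from an ASSUMED 1 M (1000 mM) stock.
iptg_ml = stock_volume_ml(0.2, 1000.0, 1000.0)

# 50 mg/L kanamycin in a 20 mL overnight culture, from an ASSUMED 50 mg/mL (50000 mg/L) stock.
kan_ml = stock_volume_ml(50.0, 20.0, 50000.0)

print(f"IPTG stock: {iptg_ml:.2f} mL; kanamycin stock: {kan_ml * 1000:.0f} uL")
```

The calculation ignores the small volume change caused by adding the stock, which is negligible at these dilutions.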
