Nonribosomal peptide synthetases and their biotechnological potential in Penicillium rubens.

Riccardo Iacovelli¹, Roel A L Bovenberg²,³, Arnold J M Driessen¹.

Abstract

Nonribosomal peptide synthetases (NRPS) are large multimodular enzymes that synthesize a diverse variety of peptides. Many of these are currently used as pharmaceuticals, thanks to their activity as antimicrobials (penicillin, vancomycin, daptomycin, echinocandin), immunosuppressant (cyclosporin) and anticancer compounds (bleomycin). Because of their biotechnological potential, NRPSs have been extensively studied in the past decades. In this review, we provide an overview of the main structural and functional features of these enzymes, and we consider the challenges and prospects of engineering NRPSs for the synthesis of novel compounds. Furthermore, we discuss secondary metabolism and NRP synthesis in the filamentous fungus Penicillium rubens and examine its potential for the production of novel and modified β-lactam antibiotics.

Keywords: Antibiotics; Natural products; Nonribosomal peptide synthetases

Year: 2021; PMID: 34279620; PMCID: PMC8788816; DOI: 10.1093/jimb/kuab045

Source DB: PubMed; Journal: J Ind Microbiol Biotechnol; ISSN: 1367-5435; Impact factor: 4.258

Introduction

Nonribosomal peptides (NRP) were first discovered during the 1950s, when several studies on the biosynthesis of tyrocidine and gramicidin—mixtures of cyclic decapeptides with antibiotic activity produced by Brevibacillus brevis—found evidence that the biosynthesis of these compounds was independent of the mRNA-ribosome mechanism (Berg et al., 1965; Fujikawa et al., 1966; Mach et al., 1963; Spaeren et al., 1967; Tomino et al., 1967; Yukioka et al., 1965). Since their discovery, NRPs have been of great interest for research and industry, given their numerous clinical applications and biological functions: antibiotics and precursors (1–6), toxins (7–9), anticancer (10–12), siderophores (13–15), immunosuppressant (16), antifungal (17), and pigments (18) (Schwarzer et al., 2003; Süssmuth & Mainz, 2017). These small peptides can range in size between 2 and 50 amino acids, and are characterized by a wide structural diversity (Fig. 1).

Fig. 1.

Examples of nonribosomal peptides.

NRPs are synthesized by large multimodular enzymes called nonribosomal peptide synthetases (NRPS), found in bacteria and filamentous fungi. Their size can range from about 100 kDa for one-module enzymes (Luo et al., 2001) up to the 1.8 MDa of the kolossin A synthase, consisting of 15 modules and 45 domains (Bode et al., 2015). Bacterial NRP synthesis is generally performed by several enzymes encoded by genes organized in an operon, while in fungi the synthesis is often carried out by a single NRPS (Schwarzer et al., 2003). NRPSs can activate and incorporate a broad variety of substrates including standard and non-proteinogenic amino acids (d- and l-), fatty acids, α-hydroxy acids, α-keto acids, heterocycles, and others, further contributing to the chemical and structural diversity of NRPs (Caboche et al., 2008; Fischbach & Walsh, 2006; Kudo et al., 2019; McErlean et al., 2019). Each NRPS module specifically recognizes, activates and incorporates a single substrate into the growing peptide chain. Because synthesis proceeds in a linear fashion, the order and the specificity of the modules determine the primary sequence of the product (Stanišić & Kries, 2019). A minimal NRPS module consists of three domains: adenylation (A), condensation (C), and peptidyl-carrier protein (PCP), also called thiolation (T) (Fig. 2). Initiation modules usually lack C domains, while termination modules often possess an extra domain, the thioesterase (Te) (Strieker et al., 2010). A domains recognize a specific substrate and activate it with ATP, yielding an acyl-AMP intermediate and PPi (I). The activated substrate is then transferred to the phosphopantetheine arm (ppant) of the adjacent PCP domain via a transesterification reaction, with AMP being released (II). The ppant is a CoA-derived cofactor attached post-translationally by phosphopantetheinyl transferases, and it is crucial for the activity of NRPSs (Shen et al., 2004).
The substrates/intermediates are subsequently transported by two adjacent PCP domains to the catalytic site of the C domain. Here, the formation of the peptide bond is catalyzed, with the α-amino group of the downstream (acceptor) substrate attacking the activated α-carboxy group of the upstream (donor) peptide (Linne & Marahiel, 2004) (III). The upstream PCP is unloaded and ready for another cycle. Synthesis proceeds until the last substrate is incorporated by the termination module. In the final stage, the thioesterase domain will cleave the peptide from the terminal PCP domain and catalyze its release via hydrolysis (IV-a) or macrocyclization (IV-b) (Payne et al., 2017) (Fig. 2).
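The four steps described above (I to IV) can be sketched as a toy model of a linear assembly line. This is only an illustrative sketch: the `Module` class, the `assemble` function and the placeholder substrates `Xaa1` to `Xaa3` are assumptions introduced here, not part of any published tool, and the model ignores all chemistry beyond the order of the modules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """One NRPS module: the substrate its A domain selects plus its domain order."""
    substrate: str
    domains: tuple


def assemble(modules, release="hydrolysis"):
    """Walk the modules in order and return the product as a string."""
    if not modules:
        raise ValueError("at least one module is required")
    chain = []
    for index, module in enumerate(modules):
        # Steps I and II: activation by the A domain, loading onto the PCP ppant arm.
        if "A" not in module.domains or "PCP" not in module.domains:
            raise ValueError(f"module {index + 1} lacks an A or PCP domain")
        # Step III: every module after the first needs a C domain to form the bond.
        if index > 0 and "C" not in module.domains:
            raise ValueError(f"module {index + 1} lacks a C domain")
        chain.append(module.substrate)
    # Step IV: the Te domain of the last module releases the product.
    if "Te" not in modules[-1].domains:
        raise ValueError("termination module lacks a Te domain")
    linear = "-".join(chain)
    if release == "hydrolysis":
        return linear
    if release == "macrocyclization":
        return f"cyclo({linear})"
    raise ValueError(f"unknown release route: {release}")


# Hypothetical trimodular enzyme with placeholder substrates.
trimodular = [
    Module("Xaa1", ("A", "PCP")),
    Module("Xaa2", ("C", "A", "PCP")),
    Module("Xaa3", ("C", "A", "PCP", "Te")),
]
print(assemble(trimodular))
print(assemble(trimodular, release="macrocyclization"))
```

The product string simply mirrors the module order, which is the colinearity principle stated in the text; swapping two modules in the list swaps the corresponding residues in the output.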

Fig. 2.

Schematic representation of a trimodular nonribosomal peptide synthetase (adapted from Stanišić & Kries, 2019). NRP synthesis starts with the adenylation domains activating specific substrates using ATP, generating aminoacyl-AMP (I); the substrate is then loaded onto the ppant of the T domain via a thioesterification reaction, with the release of AMP (II); the two activated substrates are transported to the condensation domain, where the peptide bond formation is catalyzed (III); when the fully grown peptide chain reaches the thioesterase domain (IV) it can be released either via hydrolysis (IVa) or intramolecular cyclization (IVb).

Because of the modular organization of these enzymes and the appealing properties of NRP variants for clinical applications, there have been several attempts at engineering NRPSs to modify substrate specificity with the aim of producing novel compounds (Baltz, 2014; Butz et al., 2008; Cai et al., 2019; Calcott et al., 2014; Fischbach et al., 2007; Han et al., 2012; Kaljunen et al., 2015; Kries et al., 2014, 2015; Nguyen et al., 2006). In most cases, the hybrid/modified enzymes showed low production yields relative to their wild-type counterparts. This is likely due to our limited understanding of the multimodular architecture of NRPSs, as well as of the dynamics of NRP synthesis, that is, how domains and modules interact with each other during the process. It is now clear that linker regions are crucial for these interactions, and hence for the functionality of the NRPS machinery (Beer et al., 2014; Doekel et al., 2008; Yu et al., 2013). Furthermore, some studies reported that even C domains show specificity toward upstream activated substrates (Belshaw et al., 1999; Bloudoff et al., 2016), exerting an extra gate-keeping function and adding a further challenge to successful engineering attempts.
Despite the many challenges encountered, NRPS engineering is a promising approach for the discovery and synthesis of new bioactive compounds by fermentation, both urgently needed to address the growing antibiotic resistance problem and the need to make new antibiotics in a sustainable manner. In the following sections, the fundamental aspects of NRPS structure and function and the most recent advances in NRPS engineering will be discussed in depth. Further, we will discuss secondary metabolism in the industrially relevant filamentous fungus Penicillium rubens, and the relevance of NRPS engineering for β-lactam antibiotics production. This review is written in honor of Prof. Arnold Demain, who made major contributions to this field with his pioneering work on the elucidation of the role of the ACV tripeptide-forming NRPS in the first and common step of the biosynthetic pathways of penicillin and cephalosporin. His efforts paved the way for the extensive global research on β-lactam biosynthesis and, ultimately, the development of the β-lactam industry as we know it today.

The Domains of NRPS: Structure and Function

In this section, the individual domains of an NRPS will be described, along with the reactions they catalyze and the most recent mechanistic insights.

Adenylation Domains—Substrate Activation

Adenylation domains belong to the ANL superfamily of adenylating enzymes, which also comprises acyl- and aryl-CoA synthetases and firefly luciferase. These enzymes are structurally homologous and share the conserved mechanism of the adenylation partial reaction, although they catalyze different overall reactions (Gulick, 2009). NRPS A domains determine the specificity of the entire module and are referred to as “gatekeeper” domains, since adenylation of the substrate is essential for subsequent thioesterification and incorporation of the building block into the growing peptide (Payne et al., 2017; Sun et al., 2014). They generally comprise 500–550 amino acids, divided into two distinct subdomains: the larger N-terminal Acore subdomain (∼400 amino acids) and the smaller C-terminal Asub subdomain (∼100 amino acids), separated by a wide cleft. The binding pocket of the substrate is located in the Acore, while the catalytic residue (Lys) is positioned on a loop in the Asub (Conti et al., 1996, 1997) (Fig. 3A).

Fig. 3.

Core domains of NRPSs. (A) Adenylation domain. Structure of the A domain of GrsA (Conti et al., 1997) (PDB: 1AMU) with its active site architecture and substrate-binding pocket bond network: in magenta H-bonds, in gray hydrophobic interactions and in green pi–pi stacking interactions. (B) T/PCP domain. Mechanism of PCP priming: a PPTase attaches the ppant cofactor, derived from CoA, to a conserved serine residue. In the bottom left corner, structure of model PCP domain BlmI (PDB: 4NEO). (C) Condensation domain. Structure of the model C domain VibH from Vibrio cholerae (PDB: 1L5A) showing the classical V-shape architecture. On the right side, mechanism of peptide bond formation in the active site of the condensation domain of CDA-C1 (adapted from Bloudoff & Schmeing, 2017).

When the first crystal structure of an A domain (PheA from the gramicidin S synthetase A, GrsA) was solved (Conti et al., 1997), several core motifs were identified, along with ten key residues crucial for the interaction with the substrate. Two highly conserved residues were shown to form critical hydrogen bonds with the α-amino group and α-carboxyl group of the substrate: respectively, Asp235 and Lys517, the latter being the catalytic residue. The other eight residues interact with the side chain of the substrate, contributing to its recognition and correct positioning (Fig. 3A). By matching these residues with corresponding motifs in other A domains, the nonribosomal specificity code (also known as “Stachelhaus code”) was determined, providing a set of general rules for predicting the substrate specificity of A domains simply from the primary sequence of the enzyme (Challis et al., 2000; Rausch et al., 2005; Stachelhaus et al., 1999). Though this model works well for most amino acid-activating bacterial NRPSs, it is less successful with the eukaryotic enzymes (Stack et al., 2007; von Döhren, 2009) and A domains that activate other types of substrates.
These show different substrate-binding pocket interactions altogether (Alonzo et al., 2020; Lee et al., 2010). In all cases the reaction catalyzed is the same: the charged catalytic lysine interacts with both ATP and the acid substrate, bringing them in close proximity and ultimately driving the attack of the carboxylate to the α-phosphate of the ATP. This results in the formation of the acyl-AMP intermediate (activated substrate) and the release of inorganic pyrophosphate (Conti et al., 1997; Stanišić & Kries, 2019; Strieker et al., 2010). To allow proper positioning and binding of the substrates, the Asub subdomain is initially oriented away from the active site, and the A domain adopts an open conformation. Once the substrates are bound, Asub moves toward the active site, adopting a closed conformation. This movement brings the catalytic residue in proximity of the substrates and allows the adenylation reaction (Reimer et al., 2016). Subsequently, Asub rotates by ∼140°, allowing the PCP domain to partially penetrate the active site and load the substrate, before a new adenylation cycle can begin (Drake et al., 2016; Miller & Gulick, 2016). Such conformational change is made possible by a hinge residue in the linker region that joins Acore and Asub (R. Wu et al., 2009). This rotational mechanism is well conserved across all members of the ANL superfamily (Gulick, 2009; Miller & Gulick, 2016; Stanišić & Kries, 2019) and is a crucial step for the alternation of the two states of the A domain: the catalytic, adenylate-forming state and the thioester-forming state.
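The idea behind the specificity code can be illustrated with a minimal sketch that compares a ten-residue signature extracted from a query A domain with reference signatures and reports the closest match. Everything here is an assumption made for illustration: the function names, the 0.8 identity threshold, and the reference table. The `Phe` entry is the signature commonly quoted for the GrsA PheA domain and should be checked against the primary literature before use; the second entry and the query are invented placeholders. Real predictors use curated signature collections and more elaborate scoring.

```python
def signature_identity(query, reference):
    """Fraction of identical positions between two equal-length signatures."""
    if len(query) != len(reference):
        raise ValueError("signatures must have the same length")
    matches = sum(q == r for q, r in zip(query, reference))
    return matches / len(query)


def predict_substrate(query, references, threshold=0.8):
    """Return (substrate, identity) for the best reference, or (None, identity)
    when the best identity is below the (assumed) threshold."""
    best_substrate, best_score = None, 0.0
    for substrate, reference in references.items():
        score = signature_identity(query, reference)
        if score > best_score:
            best_substrate, best_score = substrate, score
    if best_score < threshold:
        return None, best_score
    return best_substrate, best_score


# Illustrative reference table (see the caveats in the text above).
REFERENCES = {
    "Phe": "DAWTIAAICK",          # commonly quoted for GrsA PheA; verify before use
    "placeholder": "DLYNMSLIWK",  # invented signature, for demonstration only
}

# Hypothetical query signature differing from the Phe entry at one position.
print(predict_substrate("DAWTVAAICK", REFERENCES))
```

A simple identity score of this kind reflects why the approach degrades for fungal enzymes and for A domains that activate non-amino acid substrates: when no close reference signature exists, the best match falls below any sensible threshold.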

PCP Domains—Loading Stage

PCP (peptidyl-carrier-protein) domains, also referred to as T (thiolation) domains, are the transporter units of NRPSs. They are very similar to other carrier proteins, like ACPs (acyl-carrier-proteins) from fatty acid synthases and polyketide synthases, with which they share structural features and function (Mercer & Burkart, 2007). PCPs are the smallest domains of NRPS enzymes, generally ranging in size between 70 and 90 amino acids, structurally organized in a four-helix bundle. Helices 1, 2, and 4 are longer and approximately parallel to each other, while helix 3 is shorter and perpendicular to the other three (Fig. 3B) (Lohman et al., 2014; Weber et al., 2000). To become functional, apo-PCPs need to be post-translationally modified by specific enzymes called 4′-phosphopantetheinyl transferases (PPTases). This modification involves the attachment of the 4′-phosphopantetheine cofactor, derived from CoA, to a conserved serine residue contained in the structural motif GGXS (Fig. 3B) (Beld et al., 2014). This residue is located at the start of helix 2, after a connecting loop, and it protrudes outward to allow the cofactor attachment (Lohman et al., 2014; Tufar et al., 2014). Several studies have investigated the interaction of PCP domains with the other catalytic domains of NRPSs, revealing the importance of hydrophobic residues on helix 2 and helix 3 (Drake et al., 2016; Lai et al., 2006; Liu et al., 2011; Mitchell et al., 2012). In particular, helix 2 is involved in a patch of hydrophobic interactions with A and C domains, which allows the correct positioning of the PCP domain (Mitchell et al., 2012). These observations are consistent with the fact that the ppant arm is attached to helix 2, an ideal position to transport activated substrates and intermediates to the respective active sites of the catalytic domains.
Furthermore, PCPs are connected to A domains through a flexible linker that is crucial for A-PCP interaction and therefore for the functionality of the enzyme. Recently, a conserved proline-rich motif was identified at the start of this region, with the consensus sequence LPxP (Miller et al., 2014). This motif interacts with the adjacent A10 motif of the A domain, stabilizing the catalytic lysine residue and likely shortening the linker length. This allows the PCP movement and the rotation of the Asub subdomain to happen in a coordinated manner, and the consequent partial penetration of the PCP into the active site of the A domain. When this happens, the –SH group of the ppant cofactor attacks the activated carboxyl group of the substrate via thioesterification. AMP is released in the process and the substrate is now ready for transport to the adjacent domains (Mercer & Burkart, 2007; Stanišić & Kries, 2019).

Condensation and Cyclization Domains—Peptide Elongation

Condensation domains, on average 450 amino acids in size, are the elongation units of NRPSs, as they catalyze the formation of the peptide bond between adjacent substrates/intermediates. They share the same structural fold as other acyl transferases, namely chloramphenicol acetyltransferases (CAT) (Leslie, 1990) and dihydrolipoamide acetyltransferase (E2p component of the pyruvate dehydrogenase complex) (Mattevi et al., 1992), as well as the same active site motif, HHxxxDG (De Crecy-Lagard et al., 1995). Even though the primary gatekeepers in nonribosomal peptide synthesis are the A domains, C domains also appear to show specificity to their substrates, generally in a stricter manner toward acceptor substrates (Belshaw et al., 1999; Bloudoff et al., 2016; Ehmann et al., 2000). The first crystal structure of a C domain, the enzyme VibH from Vibrio cholerae, was solved in 2002 by Keating and colleagues (Keating et al., 2002). VibH belongs to the biosynthetic cluster of vibriobactin synthetase and it is a stand-alone C domain, offering an ideal target for structural studies. The C domain is organized in a pseudo-dimeric fashion, with two lobes, an N-terminal lobe and a C-terminal lobe, facing each other and creating a V-shaped canyon-like structure (Bloudoff et al., 2013; Samel et al., 2007; Tanovic et al., 2008). The two lobes are similar in structure, showing a large central β-sheet flanked by several α-helices, and are separated by a central tunnel (Fig. 3C). The two putative PCP binding sites are at the opposite sides of the tunnel, while the active site residue, the second histidine of the HHxxxDG motif, is located on a connecting loop in the N-lobe and protrudes toward the center of the tunnel. Initially, it was thought that the second histidine of the active site motif had a catalytic role.
Essentially it would act as a base to deprotonate the α-amino group of the acceptor substrate, therefore allowing its nucleophilic attack on the carbonyl group of the donor substrate (Bergendahl et al., 2002; Stachelhaus et al., 1998). However, several other studies on different C domains showed that this residue is not always essential and can be mutated without significant loss of activity, in marked contrast with its proposed function (Bergendahl et al., 2002; Keating et al., 2002; Marshall et al., 2002). An alternative explanation was provided by a recent structural study, which made use of chemical probes to investigate the C domain-substrate interaction mechanism (Bloudoff et al., 2016). The probes consist of structural analogues of the substrate that are covalently tethered to a residue of the C domain that is positioned along the tunnel, so that they can be presented at high concentration to the active site, in order to achieve proper electron density and hence obtain well resolved crystal structures. The mechanism that was revealed is shown in Fig. 3C. The ε nitrogen of the histidine makes a hydrogen bond with the α-amino group of the acceptor substrate, which also interacts with the backbone carbonyl of another residue (in this specific case serine). These two interactions favor the correct positioning of the substrate, promoting the nucleophilic attack on the donor carbonyl. It seems likely that the main role of the conserved residue of histidine is therefore that of positioning the substrate (Bloudoff & Schmeing, 2017). Cyclization domains are fairly common in NRPS, where they can replace the C domain in some elongation modules (Rausch et al., 2007). Essentially, Cy domains catalyze the elongation of the peptide through a two-step mechanism (Chen et al., 2001; Duerfahrt et al., 2004; Gehring et al., 1998; Marshall et al., 2001). 
In the first reaction, the peptide bond is formed between acceptor and donor substrate, analogously to what happens in C domains. The next step is the nucleophilic attack of the side chain of the acceptor substrate, which is always a thiol- (cysteine) or hydroxyl- (threonine or serine) group, to the carbonyl of the newly-formed peptide bond, which generates the heterocycle. Where present, the heterocyclic rings are crucial for the biological function of the NRPs (Roy et al., 1999). Cy domains belong to the same superfamily of C domains and therefore share the same overall V-shape structure, with the acceptor and donor PCP binding sites at the opposite sides of the central tunnel (Bloudoff et al., 2017; Dowling et al., 2016). In contrast with the canonical C domains, the conserved motif on the homologous connecting loop here is DxxxxD, which completely lacks the histidine residues, here often replaced by hydrophobic residues. The leading hypothesis is that the residues of this motif have a structural function, rather than that of interacting with the substrates. The catalytic residues involved in heterocyclization have been later identified in two distinct conserved motifs, PVVFTS (Cy6) and SQTPQVxLD (Cy7) (Konz et al., 1997), thanks to mutational studies by Bloudoff and colleagues on the cyclization domain from the bacillamide synthetase unit BmdB (Bloudoff et al., 2017). It appears that cyclization is completely abolished when both the threonine residue from motif Cy6 and the aspartic acid from Cy7 are mutated, whereas robust condensation can still be observed. The suggested mechanism is that these residues deprotonate the side chain group of the acceptor substrate, thereby priming the cyclization.

Thioesterase and Reductase Domains—Product Release

Thioesterase domains (Te) are the terminal domains of NRPSs and therefore are only present at the C-terminal region of the termination module (Schneider & Marahiel, 1998). NRPS Te domains show strong similarities with type II fatty acid thioesterases and polyketide synthase thioesterase domains, with which they share structural features and the conserved active-site motif GxSxG. They all show a typical α/β-hydrolase fold, with an alternated α/β/α motif in the central region and two helices forming a so-called lid region, which surrounds the substrate channel (Bruner et al., 2002; Horsman et al., 2016). Like other hydrolases, Te domains possess a catalytic triad of serine (in the conserved motif), histidine and aspartic acid, harbored in a central cavity. The mechanism with which Te domains perform their task is a two-step mechanism already known for other serine hydrolases, which involves a loading phase and a release phase. The histidine and aspartic acid generate a network of charges that ultimately deprotonates the side chain of the serine, thereby priming its nucleophilicity. When the PCP moves toward the active site of the Te, the lid region moves away from it, allowing the presentation of the thioester to the catalytic triad (Frueh et al., 2008; Tsai et al., 2001). At this point the serine attacks the C-terminal group of the NRP, releasing the PCP-ppant thiolate. 
The second step is the release step, and it can occur via three distinct routes: (a) hydrolysis following the attack of a water molecule, leading to the release of a linear product (Tahlan et al., 2017); (b) attack of a nucleophilic group within the NRP itself (N-terminal α-amino group or a nucleophilic side chain), leading to the release of a cyclic peptide (Bruner et al., 2002; Kohli & Walsh, 2003; Tsai et al., 2001); (c) attack of a nucleophilic group belonging to a newly-synthesized peptidyl-PCP, common in the case of iterative NRPSs, leading to the release of multimeric NRPs (Shaw-Reid et al., 1999). Te domains can show a certain degree of selectivity toward their substrates, and most importantly they seem to be specific for certain types of release mechanisms only, which are often controlled by structural properties intrinsic to the substrates themselves (Gaudelli et al., 2015; Horsman et al., 2016; Trauger et al., 2000). An alternative release mechanism is offered by NAD(P)H-dependent reductase domains (R). In contrast with other accessory domains, these domains actually replace the Te domains in the NRPS machineries that possess them. Several R domains have been studied and characterized in the past decade (Barajas et al., 2015; Chhabra et al., 2012; Wyatt et al., 2012), providing crucial insights into their structural organization and mechanism. R domains are organized in a larger N-terminal subdomain that is responsible for NAD(P)H binding, and a smaller and flexible C-terminal subdomain that most likely recognizes the substrate and promotes its correct positioning. The active site is constituted by a catalytic triad of S/T-Y-K in the NAD(P)H binding pocket. The thioester is positioned by the peptidyl-PCP domain in the same site, where the terminal carbonyl of the peptide is reduced, resulting in the release of the product as an aldehyde and the regeneration of the ppant cofactor (Chhabra et al., 2012).
Once released, the product can go through another round of reduction, resulting in the corresponding primary alcohol, or undergo an intramolecular cyclization via the N-terminal α-amino group. The advantage of using these domains instead of “standard” Te domains, is that the C-terminal end of the peptide will be free of negative charges, therefore offering more possibilities in terms of further modification (e.g., glycosylation), or promoting cyclization (Süssmuth & Mainz, 2017). Furthermore, the C-terminal aldehyde moiety can render the product biologically active, as is the case for the lipotripeptide fellutamide B, a potent proteasome inhibitor (Hines et al., 2008; Yeh et al., 2016).
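The conserved motifs mentioned in the preceding sections (GGXS, LPxP, HHxxxDG, DxxxxD and GxSxG) can be located in a protein sequence with a simple pattern search. The sketch below is a hedged illustration only: the `MOTIFS` table, the `scan_motifs` function and the toy sequence are assumptions introduced here, and motifs this short will also match by chance, so hits should be treated as candidates rather than as evidence of a functional domain.

```python
import re

# Motifs as written in the text, with "x"/"X" translated to the regex wildcard ".".
MOTIFS = {
    "PCP ppant site (GGXS)": "GG.S",
    "A-PCP linker (LPxP)": "LP.P",
    "C/E active site (HHxxxDG)": "HH...DG",
    "Cy loop (DxxxxD)": "D....D",
    "Te active site (GxSxG)": "G.S.G",
}


def scan_motifs(sequence):
    """Return {motif name: [1-based start positions]} for every motif found."""
    sequence = sequence.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead is used so that overlapping occurrences are all reported.
        positions = [m.start() + 1
                     for m in re.finditer(f"(?=({pattern}))", sequence)]
        if positions:
            hits[name] = positions
    return hits


# Invented toy sequence, not a real protein.
toy_sequence = "AALPEPAAAGGHSAAAHHAAADGAAAGYSAGAA"
for motif, positions in scan_motifs(toy_sequence).items():
    print(motif, positions)
```

In practice such a scan would only be a first filter; domain boundaries are normally assigned with profile-based methods, and the motif positions are then interpreted within the annotated domains.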

Accessory Domains—in cis Product Modification

The broad chemical and structural diversity of NRPs is the crucial trait that makes this class of secondary metabolites so successful in nature, and at the same time incredibly interesting for their potential clinical applications. Many of the structural modifications that further contribute to this diversity are introduced by accessory modifying domains. Amongst optional domains, epimerization domains (E) are some of the most common and best characterized. They catalyze the conversion of l-amino acid residues to d-amino acids, which can be an important feature to allow NRPs to adopt specific conformations that are critical to their biological function (e.g., antibiotics) (Kawai et al., 2004), or gain resistance toward cellular proteases (Bessalle et al., 1990). Given their tailoring function, E domains are present as a fourth domain in an elongation module (C-A-PCP-E). Like Cy domains, E domains also belong to the C domain superfamily. Unlike the former though, they also share the same active site motif, HHxxxDG (De Crecy-Lagard et al., 1995). Unsurprisingly, the structure of the E domains is fairly similar to that of C domains (Chen et al., 2016; Samel et al., 2014), with the biggest difference occurring at the acceptor PCP binding site, where an extra stretch of residues blocks the access to the active site. This forces the adjacent PCP domain, loaded with the substrate, to bind exclusively at the donor site (Samel et al., 2014). As in C domains, the second histidine of the active site motif appears to have a critical function for catalysis. Next to it, another crucial residue has been identified, a conserved glutamic acid that lies at the opposite side of the tunnel. Mutation of either residue greatly impairs epimerization activity (Stachelhaus & Walsh, 2000). The proposed mechanism is that the histidine acts as a general base to deprotonate the α-carbon of the substrate, generating an enolate intermediate. 
The glutamic acid then acts as a general acid, protonating the α-carbon and thereby converting the enolate to the d-form of the amino acid. Despite the limited amount of structural information, several studies revealed that E domains exhibit a certain degree of specificity toward their substrates (Luo et al., 2001), and in general tend to prefer peptidyl-PCP substrates rather than l-aminoacyl-substrates (Stein et al., 2005). This is a strong indication that the epimerization occurs predominantly after the condensation reaction has already taken place. Other very common modification domains in NRPS systems are methyltransferase domains (M). These are compact domains, with a size of about 45 kDa, that modify the substrates by introducing a methyl group. The majority of these domains are integrated in the A domain itself, between the two core motifs A9 and A10 (Labby et al., 2015; Mori et al., 2018), but they can also be found upstream of the A domain (Müller et al., 2014) or as stand-alone domains (Shi et al., 2009). The most abundant type of modification is the N-methylation of the backbone, with cyclosporin A as one of the best known examples (Lawen & Zocher, 1990). Other types of methylation can also occur: O-, S- and even C-, though they are rather rare (Süssmuth & Mainz, 2017). Less common is N-methylation of the side chain of a substrate (Müller et al., 2014), and in this case the M domain is actually upstream of the A domain, rather than embedded into it. Though structural information on methyltransferase domains is scarce, several studies identified conserved motifs for the binding of SAM (S-adenosyl-methionine), which suggests that this cofactor is most likely used as a donor of methyl groups for the methylation reaction (Ansari et al., 2008; Mori et al., 2018; Velkov & Lawen, 2003).
A well characterized optional domain is the formylation domain (F), with the prominent example of the F domain of LgrA, the initiation module of linear gramicidin synthase (Reimer et al., 2016; Schoenafinger et al., 2006). The F domain is located directly upstream of the A domain, at the N-terminus of an initiation module. When the substrate is activated and loaded onto the PCP domain, the latter transports it to the catalytic center of F domain. Utilizing formyltetrahydrofolate (fTHF) as cofactor, the F domain attaches the formyl group to the α-amino group of the substrate (N-formylation). Thereafter, the whole module undergoes extensive conformational changes that allow the PCP domain to transport the formylated substrate to the downstream C domain, and a new synthetic cycle can begin (Reimer et al., 2016). Similarly to the case of reductase domains, the advantage of N-formylation is that the final product will not be positively charged at the N-terminus, allowing the peptide to gain the necessary chemical properties for its biological function (e.g., antibacterial activity). Flavin mononucleotide (FMN)-dependent oxidase domains (Ox) are another type of accessory domains found in NRPSs. They are relatively small domains (about 30 kDa) and, like the M domains, they are often embedded in the C-terminal subdomain of the A domain itself, between the core motifs A8 and A9 (Labby et al., 2015; Perlova et al., 2006; Schneider et al., 2003). Ox domains show two main conserved signature motifs, Ox1 and Ox2, which suggests their homology with other FMN-dependent oxidoreductases. They are responsible for the oxidation of thiazoline or oxazoline species generated by cyclization domains to the corresponding thiazole or oxazole. Like other domains, their activity is strictly dependent on substrate supply and positioning by the PCP domain. 
Other less represented tailoring domains include ketoacyl reductase (KR) domains (Fujimori et al., 2007; Magarvey, Ehling-Schulz, et al., 2006; Xu et al., 2009), monooxygenase (MOx) domains (Perlova et al., 2006; Weinig et al., 2003), and β-lactam forming C domains (Gaudelli et al., 2015; Gunsior et al., 2004). The latter are particularly interesting in that they show a novel function for C domains, namely β-lactam ring formation. The mechanism that has been proposed for this function involves an extra histidine residue immediately preceding the active site motif, in this case (H)HHxxxDG, which catalyzes the dehydration of a serine (donor) substrate. Thereafter, the α-amino group of the acceptor substrate first attacks the dehydroalanine side chain (amine addition), generating a secondary amine. This subsequently attacks the carbonyl group of the thioester (nucleophilic attack), generating the thioester-bound β-lactam intermediate (Gaudelli et al., 2015).
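The way core and accessory domains combine into modules can be made concrete with a small parser for domain architecture strings such as C-A-PCP-E. This is a simplified sketch under stated assumptions: the dash-separated notation, the `split_modules` and `describe` helpers, and the example architecture are all hypothetical, and the rule that every C or Cy domain opens a new module ignores embedded domains (for example M or Ox domains inserted within an A domain) and stand-alone proteins.

```python
# Domains treated as core or release domains; everything else counts as accessory.
CORE = {"A", "PCP", "C", "Cy", "Te", "R"}


def split_modules(architecture):
    """Split a dash-separated domain string into a list of modules."""
    modules, current = [], []
    for domain in architecture.split("-"):
        # Assumption: a C or Cy domain opens a new elongation/termination module.
        if domain in {"C", "Cy"} and current:
            modules.append(current)
            current = []
        current.append(domain)
    if current:
        modules.append(current)
    return modules


def describe(module):
    """Return (module type, list of accessory domains) for one module."""
    domains = set(module)
    if not {"A", "PCP"} <= domains:
        kind = "incomplete"
    elif domains & {"Te", "R"}:
        kind = "termination"
    elif domains & {"C", "Cy"}:
        kind = "elongation"
    else:
        kind = "initiation"
    accessory = [d for d in module if d not in CORE]
    return kind, accessory


# Hypothetical architecture: formylating initiation module, epimerizing
# elongation module, and a Cy-containing termination module.
example = "F-A-PCP-C-A-PCP-E-Cy-A-PCP-Te"
for module in split_modules(example):
    print("-".join(module), describe(module))
```

Separating the module type from the list of accessory domains mirrors the distinction drawn in the text between the core machinery and the optional in cis tailoring functions.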

Types of NRPS

Linear, Iterative, and Nonlinear NRPSs

Linear NRPSs, or type A NRPSs, synthesize their products in a colinear fashion, where each module specifically recognizes and incorporates only one substrate into the growing peptide. The biosynthetic cycle of these enzymes is the simplest amongst NRPSs (Fig. 2), with the sequence of the product reflecting exactly the number and order of the modules. Linear NRPS systems can either consist of a single protein harboring all modules and domains required to perform the complete synthesis process (Bode et al., 2015; Iacovelli et al., 2020; Viggiano et al., 2018), or, more often, several proteins each providing activated substrates or intermediates for the stepwise assembly of the final peptide product (Hoertz et al., 2012; Kessler et al., 2004; Mootz & Marahiel, 1997; Scholz-Schroeder et al., 2003). Iterative NRPSs, also called type B NRPSs, can reutilize specific modules during one biosynthetic cycle, resulting in certain modules incorporating the same substrate multiple times (Gehring et al., 1997; Hoyer et al., 2007; Juguet et al., 2009). Often this mechanism leads to the formation of symmetrical compounds, like the siderophore enterobactin, the antibiotic gramicidin S, and the depsipeptides bassianolide, enniatin, and beauvericin. Because of the nature of their biosynthetic cycle, iterative NRPSs require a “storage position” for the intermediates that are being assembled. These intermediates can either be stored on the PCP domain (Al-Mestarihi et al., 2015; Glinski et al., 2002) or on the terminal thioesterase domain (Hoyer et al., 2007; Shaw-Reid et al., 1999). The iterative activation and incorporation of one substrate can go on for as many as 5 elongation cycles in certain NRPSs, until the growing peptide reaches a critical length and triggers the final unloading/release step. Though it is unclear what determines the critical length, the structural features of the peptide and the NRPSs themselves might play a role (Süssmuth & Mainz, 2017).
Nonlinear, or type C, NRPSs assemble their product using a strategy similar to that of iterative NRPSs. In this case, though, it is not a specific module that is reused multiple times, but an individual domain. Most commonly this is an A domain that provides aminoacyl-AMP to domains other than its cognate PCP domain, even to different NRPSs (Du et al., 2000; Felnagle et al., 2007; Magarvey, Haltli, et al., 2006; Schneider et al., 2003).
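The difference between the colinear and the iterative logic can be illustrated with a toy model. The sketch below is only a bookkeeping aid and not a biochemical simulation: the function names are hypothetical, each module is reduced to the name of the residue it incorporates, and the gramicidin S pentapeptide unit is used as an illustrative input.

```python
from typing import List


def linear_assembly(modules: List[str]) -> List[str]:
    """Colinear (type A) rule: one residue per module, in module order."""
    return list(modules)


def iterative_assembly(modules: List[str], cycles: int) -> List[str]:
    """Iterative (type B) rule: the same set of modules is reused `cycles` times."""
    if cycles < 1:
        raise ValueError("cycles must be at least 1")
    return list(modules) * cycles


# Illustrative input: the pentapeptide unit of gramicidin S, assembled twice.
grs_unit = ["D-Phe", "Pro", "Val", "Orn", "Leu"]

if __name__ == "__main__":
    print("-".join(linear_assembly(grs_unit)))
    print("-".join(iterative_assembly(grs_unit, 2)))
```

In this simplified picture a linear system needs ten modules to make a decapeptide, whereas an iterative system reaches the same length with five modules used twice, which is why iterative products are often symmetrical. Cyclization, release, and the storage of intermediates are deliberately left out.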

Stand-alone NRPSs

Despite the different biosynthetic cycles that have been described, some of the core features of NRPSs are their modularity and the large size that often is required to house all the domains necessary for the synthesis of the product. Surprisingly, a recent genome-mining study (Wang et al., 2014) found that about 10% of the bacterial gene clusters for these enzymes lacked the canonical modular organization. In these cases, individual modules (a well-known example being the gramicidin S synthetase GrsA) or individual domains and didomains are encoded on separate proteins that work in concert to perform all the biosynthetic steps required for NRP synthesis. Commonly these isolated proteins activate and provide unusual substrates (Bibb et al., 2014; Chen et al., 2002; Maruyama et al., 2012; Vaillancourt et al., 2005), involving in cis (Chen et al., 2002) or in trans modifications performed by external enzymes that are recruited by the NRPSs themselves (Bibb et al., 2014). But stand-alone domains can also carry out canonical functions, with the notable example of the C domain of VibH (Keating et al., 2002). In general, the existence of such machineries proves that the core domains of NRPSs have the ability to operate autonomously and in a non-modular context.

PKS–NRPS Hybrids

The same genome mining study discussed above (Wang et al., 2014) identified approximately 3400 gene clusters involved in the synthesis of NRPs and PKs (polyketides), another class of natural compounds synthesized by mega-synthases known as PKSs (polyketide synthases) (Payne et al., 2017). A large portion of these clusters, about 34%, contained genes that encode for hybrid synthases, that is, enzymes that bear functional core domains of both PKSs and NRPSs. Sharing a common thiotemplate mechanism for loading and transporting substrates—NRPSs have PCP domains, PKSs have ACP (acyl-carrier-protein) domains—these enzymes are able to assemble extremely complex chemical structures. Notable examples of PK–NRP hybrid products are the anticancer agents bleomycin (Du et al., 2000) and epothilone (Chen et al., 2001), the antibacterial and antifungal paenilamicins (Müller et al., 2014) and the antibiotic zwittermicin A (Stohl et al., 1999). PKS-NRPS hybrid machineries can either be organized in the same polypeptide chain (tethered type), or in separate subunits where often stand-alone enzymes of one kind are coupled with modular systems of the other (non-tethered type) (Miyanaga et al., 2018). The individual subunits need to communicate efficiently to coordinate the transport of substrates and intermediates across the hybrid system. Crucial players in these interactions are specific linker regions in the tethered type-hybrids, and special docking domains for the non-tethered type (Liu et al., 2004; Miyanaga et al., 2018; Richter et al., 2008). Given their intrinsic communication capabilities there is great interest in engineering hybrid PKS–NRPS systems for the production of novel compounds (Connor et al., 2003). In this respect, future structural studies will provide new insights into the synthesis mechanism and the necessary protein-protein interactions in these hybrid systems, pushing the engineering efforts one step further.

Higher Order Architecture of NRPSs

The extensive research on NRPS enzymes in the past decades helped unravel many of the biochemical features of these enzymes, as well as the structural features of their domains. Although many structures of single domains (Bloudoff et al., 2013; Bruner et al., 2002; Conti et al., 1997; Goodrich et al., 2015; Keating et al., 2002; Lohman et al., 2014; Yonus et al., 2008) and di-domains (Liu et al., 2011; Mitchell et al., 2012; Sundlov et al., 2012; Tan et al., 2015) have been available for some time, solving the structure of an entire module, let alone the multimodular structure of an NRPS, has posed a formidable challenge. To date, there is limited information about the modular structure of NRPSs (Drake et al., 2016; Reimer et al., 2016; Tanovic et al., 2008). What these structures show, essentially, is a rigid organization of the main catalytic domains, namely the A and C domains (a formylation domain in one case (Reimer et al., 2016)). More specifically, it is the N-terminal Acore subdomain that is involved in this type of organization. These C(F)-Acore pairs form a solid catalytic platform of rectangular shape, with the active sites aligned on the same side. The Asub subdomain is connected to the Acore by a flexible linker that allows it to move relative to the catalytic platform, thereby permitting the different half-reactions to take place. The 4-helix bundle of the PCP domain is itself connected to the Asub via a flexible linker, and it is capable of moving along the platform and contacting all of the catalytic domains to present the thioester to the active sites. Overall, these structures show that the positions of Asub and PCP relative to the other domains and to each other are influenced by the catalytic state of the module (Fig. 4A–D).

Fig. 4.

Higher order architecture of nonribosomal peptide synthetases: (A) Termination module of SrfA-C, architecture C-A-PCP-Te, inactive (PDB: 2VSQ) (Tanovic et al., 2008). (B) Elongation module of EntF, architecture C-A-PCP, thiolation state (PDB: 5T3D) (Drake et al., 2016). (C) Termination module of AB3403, architecture C-A-PCP-Te, condensation state (PDB: 4ZXH) (Drake et al., 2016). (D) Initiation module of LgrA, architecture F-A-PCP, formylation state (PDB: 5ES9) (Reimer et al., 2016). (E) Initiation and first elongation module of LgrA, architecture F1-A1-PCP1-C2-A2-PCP2, both modules in condensation state (PDB: 6MFZ) (Reimer et al., 2019). These studies show conformational flexibility of individual NRPS domains, at the same time highlighting a certain rigidity of the main components of the assembly line.

Based on the superposition of the structure of SrfA-C (C-A-PCP-Te) and a di-domain (PCP-C) structure from tyrocidine synthase (Samel et al., 2007), a structural model for a multimodular NRPS was later proposed (Marahiel, 2016). In this model, the modules are organized in a helical fashion along a central axis, each module being rotated 120° relative to the adjacent ones. The PCP domains would be located within the helix, protected from the solvent. Overall, the model proposed a very rigid organization of the modules, which seems unlikely given the frequent conformational changes that happen during NRP synthesis. Indeed, recent structural studies revealed how flexible NRPSs actually are and that they can adopt many different conformations (Reimer et al., 2019; Tarry et al., 2017). In one of these studies, the first ever dimodular structure of an NRPS (LgrA) was solved (Reimer et al., 2019), revealing how the major structural features of the single domains are conserved, as well as the domain organization in each catalytic state (Fig. 4E). 
Other structures were generated during the same work, which show different orientations of module 2 for the same catalytic state of module 1, suggesting that the overall conformation is independent of the catalytic state of the individual modules (Reimer et al., 2019). Small angle X-ray scattering studies on the constructs’ behavior in solution, and subsequent modeling of the results, confirmed the high flexibility of LgrA. The only event that requires the strict coordination of two adjacent modules is the condensation reaction, where the two PCP domains have to bind the C domain at their respective binding site to allow peptide bond formation. This entire process is mediated by the interaction between donor PCP and C domain during the reaction itself (Reimer et al., 2019).

The Biosynthetic Cycle of NRPSs

The recent advances in structural biology discussed above provided numerous insights into the catalytic mechanisms underlying NRP synthesis, as well as a better understanding of the movements that the core domains undergo during this process. The transitions between the main catalytic states of a canonical module (C-A-PCP) require large conformational changes that mainly involve the small domains Asub and PCP. A full synthetic cycle of one elongation module requires four stages, defined by the catalytic state and the position of Asub and PCP relative to the catalytic platform formed by the C domain and Acore (Drake et al., 2016; Reimer et al., 2016). In the first stage, the A domain is in an open state that allows the diffusion of ATP and the substrate into the active site, where the interaction with the residues of the binding pocket provides the correct positioning of the functional groups. In the second stage, the Asub rotates by about 30°, closing in toward the Acore. This movement brings the catalytic lysine, housed on a flexible loop, inside the active site, thereby priming the adenylation reaction (Reimer et al., 2016). The third stage involves further movement of the Asub, which rotates by approximately 140° on the horizontal plane, now presenting the opposite face to the Acore, and drags the PCP domain on top of the active site (Drake et al., 2016; Gulick, 2009; Reger et al., 2008; Reimer et al., 2016). The ppant cofactor can now penetrate the binding pocket and attach the substrate, releasing AMP in the process. In the last stage, Asub rotates again by about 180° and moves away from Acore, in concert with a rotation of the PCP domain that allows the latter to travel the necessary distance to reach the C domain. In this stage the substrate is provided for the condensation reaction. At the same time, the A domain returns to the initial open state, ready to begin another cycle (Drake et al., 2016; Reimer et al., 2016). 
While the A domain stays in open conformation, the elongated peptide can then be transported to the downstream C (or Te) domain for further processing, with a simple rotational movement of the PCP domain (Drake et al., 2016; Reimer et al., 2019). When two PCP domains of adjacent modules are in peptide donation conformation—that is, bound to their respective sites on the C domain—both of the A domains can start new synthetic cycles simultaneously. This intrinsic ability of NRPSs improves the catalytic efficiency and production rate of NRP synthesis.
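The four stages described above can be summarized as a small cyclic state machine. The sketch below is a bookkeeping aid under stated assumptions: the stage names and the dictionary of Asub rotations are labels introduced here for illustration, and the angles are simply the approximate values quoted in the text, not structural measurements.

```python
from enum import Enum
from typing import List


class Stage(Enum):
    SUBSTRATE_BINDING = 0  # A domain open; ATP and substrate enter the active site
    ADENYLATION = 1        # Asub closes on Acore; catalytic lysine enters the site
    THIOLATION = 2         # PCP sits over the A domain; ppant takes up the substrate
    CONDENSATION = 3       # PCP reaches the C domain; A domain reopens


# Approximate Asub rotation (degrees) on entering each stage, as quoted in the text.
ASUB_ROTATION_DEG = {
    Stage.ADENYLATION: 30,
    Stage.THIOLATION: 140,
    Stage.CONDENSATION: 180,
}


def next_stage(stage: Stage) -> Stage:
    """Advance to the following stage, wrapping around after condensation."""
    return Stage((stage.value + 1) % len(Stage))


def run_cycle(start: Stage = Stage.SUBSTRATE_BINDING) -> List[Stage]:
    """Return the stages visited in one full cycle, beginning at `start`."""
    trace = [start]
    stage = next_stage(start)
    while stage is not start:
        trace.append(stage)
        stage = next_stage(stage)
    return trace


if __name__ == "__main__":
    for stage in run_cycle():
        rotation = ASUB_ROTATION_DEG.get(stage)
        note = f"Asub rotates by about {rotation} degrees" if rotation else "open state"
        print(f"{stage.name}: {note}")
```

The model treats a single module in isolation. It does not capture the point made above that adjacent modules can begin new cycles simultaneously once both PCP domains are bound to the C domain.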

Interactions with Helper Proteins and Other Associated Enzymes

As discussed in the previous sections, NRP synthesis is a significantly complex process, in which a variety of structural protein domains cooperate and carry out specific functions that ultimately lead to the assembly of the final product. NRPSs are not the only players involved and they often require interactions with other proteins to fulfill their function. In this section, the main interaction partners of NRPSs will be discussed.

Phosphopantetheinyl Transferases

Phosphopantetheinyl transferases (PPTases) belong to a large superfamily of enzymes crucial for all domains of life (Beld et al., 2014; Lambalot et al., 1996). They are responsible for a post-translational modification of modular synthases such as NRPSs and PKSs, as well as fatty acid synthases (FASs). All of these enzymes share a common thiotemplate-based mechanism, involving a carrier protein (CP) domain, which is responsible for the timely transport of the substrates and intermediates across the enzymatic system. As discussed previously for PCPs, these carrier proteins require a cofactor to be fully functional, the 4′-phosphopantetheine (ppant) moiety. The ppant works as a sort of “swinging arm” onto which the intermediates are covalently loaded for transport. The cofactor attachment is mediated by PPTases, which use coenzyme A as substrate and tether the ppant to conserved serine residues via a phosphoester bond (Mofid et al., 2004). There are three types of PPTases. The holo-acyl carrier protein synthase (AcpS)-type PPTases (I) are primarily involved in the activation of FASs (primary metabolism) and are therefore the most common type of PPTase. Sfp-type PPTases (II) are able to modify CPs from all classes of mega-synthases. The name derives from the gene sfp, which encodes a PPTase involved in the activation of the surfactin synthase in Bacillus subtilis. This enzyme is well expressed in Escherichia coli, where it can be integrated in the genome (Gruenewald et al., 2004), and it exhibits broad promiscuity toward both CoA and CP substrates. Therefore, it is widely used for the heterologous expression of NRPS or PKS genes. Integrated PPTases (III) have been reported in yeast and other fungi, where they are fused at the C-terminal end of certain FASs. They are the least represented family of PPTases. 
Some type I and type II PPTases can be encoded in NRPS, PKS or NRPS-PKS biosynthetic gene clusters, and can exhibit preferential activity toward their “cognate” CP (Huang et al., 2006), while others are more promiscuous. Several examples have been described in actinomycetes, in particular heterologous expression hosts of the genus Streptomyces (Baltz, 2016). These encode two or more PPTases in their genome, making them attractive hosts for the production of secondary metabolites. PPTases work by deprotonating the hydroxyl group of the conserved residue of serine of CPs, priming the attachment of the ppant cofactor. The mechanism requires the presence of a Mg2+ ion, which is coordinated by conserved residues of glutamic acid and aspartic acid (Mofid et al., 2004; Tufar et al., 2014). The mechanism of interaction between Sfp and PCP was unraveled in a recent work (Tufar et al., 2014). The crystal structure of the complex Sfp/PCP revealed that the main interaction occurring is a hydrophobic contact between one helix (α2) of the PCP domain and the C-terminal domain of Sfp. Other contacts, including a hydrogen bond, were observed, but mutational analysis proved them to be non-essential. Comparisons with the crystal structure of a human PPTase/ACP complex (Bunkoczi et al., 2007) showed strong similarities, suggesting a conserved interaction mechanism across all domains of life.

MbtH-Like Proteins

MbtH-like proteins (MLPs) are small proteins, about 70 amino acids in length, which are often associated with NRPS biosynthetic gene clusters (BGCs) (Baltz, 2016). The name derives from the first identified member of this family, MbtH, encoded in the BGC of the siderophore mycobactin in Mycobacterium tuberculosis (Quadri et al., 1998). They are widely present in bacteria, especially in actinomycetes where more than one MLP can be encoded in the same BGC, while they appear to be completely absent in fungal NRPS systems (Baltz, 2011). MLPs have been studied extensively in recent years, yet their exact function remains unknown. Growing evidence suggests that they can increase the activity of A domains as well as enhance the soluble expression of their partner NRPSs (Boll et al., 2011; Felnagle et al., 2010; Zwahlen et al., 2019), although in some cases the deletion of the MLP gene from a cluster does not have any effect on the biosynthesis of the NRP (Stegmann et al., 2006). An interesting feature of MLPs is that they can activate non-cognate NRPSs as well, both in vivo and in vitro (Boll et al., 2011; Lautru et al., 2007; Wolpert et al., 2007; Zhang et al., 2010; Zwahlen et al., 2019). In the most recent of these works, it has been shown that the heterologous expression of bacterial MLPs can boost the production of NRPS-related secondary metabolites in the filamentous fungus Penicillium chrysogenum (Zwahlen et al., 2019). The structures of several MLPs have been solved, both as isolated proteins (Buchko et al., 2010; Drake et al., 2007) and in complex with A domains (Herbst et al., 2013; Miller et al., 2016; Mori et al., 2018; Tarry et al., 2017). Generally, these proteins display a core region containing a three-stranded antiparallel β-sheet and a C-terminal helix. The structures of the MLP-A domain complexes show a conserved interaction mechanism, involving three conserved tryptophan residues. 
Two of them form a pocket that binds an alanine residue on the Acore, while a third one positions itself in a pocket formed by hydrophobic residues on the A domain. Many functional aspects of the association MLPs-NRPSs remain yet to be elucidated, but it appears already evident that these small partner proteins possess a great biotechnological value. Potential applications span from the overexpression of hybrid BGCs and the activation of silent ones to the improvement of industrial producer strains.

Trans-Acting Tailoring Enzymes

The chemical diversity of NRPs can be further expanded by the action of trans-acting enzymes (Walsh et al., 2001). Some of the modifications introduced at this stage can be important for bioactivity. The most prominent example of tailored NRPs is the case of glycopeptide antibiotics (GPAs), such as vancomycin or teicoplanin (Bischoff, Pelzer, Bister, et al., 2001; Hadatsch et al., 2007; Pelzer et al., 1999). These compounds are characterized by a heptapeptide scaffold, which can be decorated with a myriad of modifications. Glycosyltransferases act specifically on the final product of GPAs-related NRPS, using the appropriate UDP-sugars to glycosylate specific residues of the peptide in a regiospecific manner. Other modifications include sulfation, acylation and methylation (Yim et al., 2014). In each case specific stand-alone enzymes are involved, which most often act post-assembly line. Some of these can be critical for bioactivity—acylation of GPAs seems to be essential for activity against certain bacteria—while others generally improve solubility or stability of the compounds. Other types of tailoring enzymes act on substrates or synthesis intermediates, and therefore are required for the correct processing and release of the peptides. This is the case of P450 monooxygenases (P450 MO), a widespread superfamily of enzymes that generally incorporate hydroxyl groups in the metabolism of various compounds (Danielson, 2002). For instance, the inactivation of three P450 MO involved in the synthesis of GPAs completely stalls the respective NRPS machinery (Bischoff, Pelzer, Höltzel, et al., 2001; Bischoff, Pelzer, Bister, et al., 2001; Hadatsch et al., 2007; Pelzer et al., 1999). These enzymes catalyze the oxidative cyclization of the linear precursor peptides via cross-linking of the aryl side chains, resulting in the rigid aglycone scaffold that is subsequently glycosylated. 
P450 MO need to be efficiently recruited by NRPSs in order to perform their specific catalytic steps in a timely and coordinated manner. Different domains can play a major role in this process. In the case of GPAs, recent studies revealed the involvement of a novel domain, the X domain (Haslinger et al., 2015; Ulrich et al., 2016). This domain is conserved in the termination module of all GPA-related NRPS machineries, and it is structurally related to C domains. Mutations in the active site motif (HRxxxDD) render the X domain catalytically inactive. Its function, instead, is that of recruitment of a P450 MO via specific hydrogen bonds and salt bridges. These interactions occur between the catalytic motif of the monooxygenase (PRDD) and two residues of arginine on the X domain. It has been shown that mutating these residues can abolish the oxidative cyclization of the precursor peptides (Haslinger et al., 2015; Ulrich et al., 2016). In other NRPS systems, it is the PCP domain that interacts with the cognate P450 MO via a network of hydrophobic contacts. For instance, this interaction is crucial for the β-hydroxylation of the amino acid precursors during the synthesis of the anticancer compound skyllamycin (Haslinger et al., 2014; Pohle et al., 2011). Halogenation is also a common type of modification (generally chlorination), once again well exemplified by the case of GPAs (Yim et al., 2014). The incorporation of chlorine atoms is critical for the antibiotic activity of these compounds, as their absence significantly reduces their binding affinity to the lipid II (Pinchman & Boger, 2013a, 2013b). As for P450 monooxygenases, halogenases act during peptide assembly on the aminoacyl-S-PCP intermediates, with both the PCP domain and the bound substrate playing an important role in the recruitment of the enzyme (Kittilä et al., 2017).

Engineering of Nonribosomal Peptide Synthetases

As previously discussed, many NRPs are valuable compounds from an industrial and pharmaceutical point of view. Although they exhibit interesting biological activities, NRPs might not always have optimal pharmacokinetic properties or the desired target. Hence, there is great interest in engineering NRPSs for the production of modified or novel compounds. These efforts are further driven by the ever-rising phenomenon of antibiotic resistance. Traditionally, the different approaches that have been used to engineer NRP production can be divided into two main types: (i) indirect, where the focus has been either on the precursor supply chain, therefore modifying the building blocks themselves or altering their availability in the host, or on the use of engineered or exogenous trans-acting tailoring enzymes; (ii) direct, involving direct manipulation of the genes encoding NRPS enzymes (Fig. 5). The following section will focus solely on the latter, for which the most prominent and recent examples will be described.

Fig. 5.

Schematic overview of common direct NRPS engineering strategies. Site-directed mutagenesis and directed evolution of the residues of the binding pocket of the adenylation domain (A); domain, module, subdomain and custom exchange unit swapping approaches (B); and reprogramming of NRPS assembly lines via fusion with COM (communication-mediating) domains (C).

Active Site Modification: Mutagenesis and Directed Evolution

When the specificity code of A domains was deciphered (Stachelhaus et al., 1999), it became possible to identify sets of residues responsible for activating specific substrates. Theoretically, by introducing individual or combined point mutations within the binding site, changing the specificity code in essence, one could achieve the activation and incorporation of alternative substrates. This strategy has been successfully used in several instances to achieve incorporation of non-native substrates (Eppelmann et al., 2002; Kaljunen et al., 2015; Kries et al., 2014; Thirlway et al., 2012). The introduction of a single or double mutation was sufficient to change the specificity of the domain, with little or no loss of activity at all. However, it is important to mention that in all of these cases the newly activated substrates are either structural analogues of the native ones (e.g., Glu/Gln, Asp/Asn), or functionalized versions thereof. Thus, the chemical diversity achieved was limited. Another interesting application was to redirect naturally promiscuous A domains toward certain substrates (Bian et al., 2015; Han et al., 2012). Overall, this approach might hold higher chances of success, given that the native enzymes already possess the intrinsic ability to activate these substrates. An alternative active site modification approach is that of directed evolution. The rationale is based on the knowledge that the multitude of NRPS machineries in nature has evolved via gene duplication, deletion, insertion and point mutation events (Cane et al., 1998). “Recreating” and redirecting this evolution process could therefore be a viable strategy to achieve activation of an alternative substrate and improve the activity toward certain natural substrates (Evans et al., 2011; Villiers & Hollfelder, 2011; Zhang et al., 2013), or greatly improve the activity of hybrid NRPSs (Fischbach et al., 2007). 
These strategies usually involve several rounds of random mutagenesis (e.g., via error-prone PCR) or saturation mutagenesis of specific residues that interact directly with the substrate (crystal structures are particularly valuable in this case). What they all have in common is that they require (medium-) high-throughput screening methods, given the large libraries of genes that are generated, that are often costly, time-consuming and limited by the type of compounds that the target NRPS produces. In general, the advantage of active site modification as a targeted approach is that the structural changes that are introduced are usually minor, therefore less likely to lead to unfolding/degradation issues as well as to introduce disruptions in key linker regions of the enzyme. Disadvantages include time-consuming and costly laboratory procedures, as NRPS enzymes are encoded by large genes, and the lack of a universal screening method. Also, as discussed in the previous sections, the other domains of a NRPS exhibit some degree of specificity toward their native substrates as well, further limiting the chances of success of these approaches.
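A short calculation shows why saturation mutagenesis quickly demands high-throughput screening. The sketch below counts the theoretical number of variants when several binding-pocket positions are randomized at once. It assumes that each position is randomized with the degenerate NNK codon (32 codons covering all 20 amino acids); this choice is an assumption made for illustration and is not taken from the studies cited above, and the function name is hypothetical.

```python
from typing import Tuple

AMINO_ACIDS = 20
# NNK codon: N = A/C/G/T at the first two positions, K = G/T at the third.
NNK_CODONS = 4 * 4 * 2


def library_sizes(n_positions: int) -> Tuple[int, int]:
    """Return (distinct protein variants, distinct NNK DNA sequences)."""
    if n_positions < 0:
        raise ValueError("n_positions must be non-negative")
    return AMINO_ACIDS ** n_positions, NNK_CODONS ** n_positions


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        proteins, dna = library_sizes(n)
        print(f"{n} position(s): {proteins} protein variants, {dna} DNA sequences")
```

Under these assumptions, three randomized positions already give 8000 protein variants encoded by 32768 DNA sequences, and the number of clones that must be screened to sample the library well is larger still. This is one reason such experiments tend to restrict randomization to a few residues chosen from structural data.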

Domain, Module, and Unit Exchanges

The primary sequence of an NRP is determined by the sequence and order of modules in an NRPS. Thus, it seems logical that altering the modular structure by inserting, deleting or replacing individual domains, modules or other types of exchange units (XUs) could potentially lead to the production of altered or novel compounds. The simplest approaches involved the complete deletion (Mootz et al., 2002) or insertion of a module (Butz et al., 2008), leading respectively to a shorter or a longer NRP product. Early examples of swapping strategies date back more than twenty years, with the work on the enzyme surfactin synthase (Geysen et al., 1995; Schneider et al., 1997). Both works showed that replacing either individual A domains (Geysen et al., 1995) or a whole module (Schneider et al., 1997) led to functional hybrid NRPSs that produced the expected peptides. Several other examples of such successful strategies exist, where entire domains (Calcott et al., 2014; Linne et al., 2001; Zobel et al., 2016) or modules (Butz et al., 2008; Nguyen et al., 2010; Yakimov et al., 2000; Yu et al., 2013) have been swapped. One of the most extensive efforts in this respect was carried out on the daptomycin biosynthetic pathway. A variety of approaches, including domain, module or entire subunit (stand-alone NRPSs) exchanges, as well as module fusions and engineering of tailoring enzymes, were used to produce variants of daptomycin, a clinically important lipopeptide antibiotic used to treat infections caused by Gram-positive pathogens. Some of the variants that were generated possessed improved pharmacokinetic properties and were as active as the native compound (Baltz, 2014; Doekel et al., 2008; Miao et al., 2006; Nguyen et al., 2006, 2010). Although these strategies may all seem relatively successful, in many instances such major manipulations led to expression, folding or degradation issues, because of the large perturbations introduced in the structural organization of the enzymes. 
Even when successfully expressed, many chimeric NRPSs have significantly lower yields compared to their wild-type counterpart, or do not exhibit any detectable activity. The most likely reason is that some of the key inter-domain interactions are disrupted, leading to inactive enzymes. In recent years new engineering strategies have attempted to tackle this issue by replacing noncanonical exchange units, and therefore trying to preserve those key interactions, as exemplified by the sub-domain swap strategy used by Kries and coworkers in 2015 (Kries et al., 2015). In this work, the researchers targeted a specific region with a flavodoxin-like fold within the N-terminal subdomain of the A domain. This so-called “subdomain” encompasses the binding pocket (including the nine residues that confer specificity) and is characterized by a compact fold, making it ideal for excisions and insertions while keeping other functionally relevant interfaces intact. Indeed, all the hybrids that were generated were successfully overexpressed, with four out of nine producing the expected peptides. This approach has much higher chances of success when the crystal structure of the target NRPS is available, given the need for a precise determination of the subdomain boundaries. More recent works targeted bigger exchange units (XUs), encompassing multi-domains or cross-module regions, while maintaining a particular attention to the junction points and linker regions (Bozhüyük et al., 2018, 2019; Steiniger et al., 2019). In particular, in one of these studies the swapping approach was taken a step further: chimeric NRPSs were designed and assembled de novo using XUs in a combinatorial manner (Bozhüyük et al., 2018). Several hybrids were built with an unprecedented success rate. A key factor was the identification of a new inter-domain fusion point, located in the C-A linker region. 
Compared to other linkers (A-T, T-C), this linker is considerably longer, about 30 amino acids in length, and its sequence is more conserved. The 20 N-terminal amino acids are involved in weak hydrophobic (and possibly nonspecific) interactions between the two domains, and they always include an α-helix. The remaining 10 at the C-terminus have no secondary structure and only interact with the A domain. The researchers used the junction between these two regions as the fusion point, assuming that the ability of C and A domains to interact would be preserved in its entirety. This proved to be a highly efficient strategy, with the only limiting factor being the specificity of the downstream C domains. Indeed, in another study where a combinatorial approach was used to build chimeric NRPSs, it was confirmed that the success of the engineering experiments strongly relied on respecting the original specificities of the C domain, with the acceptor site being particularly strict (Steiniger et al., 2019). To overcome this limitation, a new potential fusion point was identified within the C domain itself (Bozhüyük et al., 2019). The rationale behind this approach is that C domains are pseudo-dimers constituted by two lobes, therefore the linker region that connects them is an ideal target for the fusion. Results were encouraging and demonstrated that this assumption was correct: a hybrid bacterial NRPS containing an exogenous ATC unit was completely inactive, while the one containing the ATCCTerm unit showed even higher activity than wild-type (Bozhüyük et al., 2019). A similar strategy was attempted with fungal NRPSs, in this case leading to no success (Steiniger et al., 2019), probably due to intrinsic structural differences between bacterial and fungal domains. In general, each approach should be tailored to the specific engineering experiment, depending upon factors such as the compatibility of domain specificities and the organisms of origin.
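The constraint imposed by the downstream C domain can be expressed as a simple compatibility check on a chain of exchange units. The sketch below is a deliberately simplified reading of that constraint and not the design procedure of the cited studies. It assumes that each A-T-C unit is described by just two labels, the substrate activated by its A domain and the substrate its C domain accepts at the acceptor site, and that two units can be joined only when these labels match. The class, field, and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ExchangeUnit:
    """Hypothetical A-T-C exchange unit reduced to two specificity labels."""
    activates: str   # substrate selected by the unit's A domain
    c_acceptor: str  # substrate the unit's C domain accepts at its acceptor site


def is_compatible(chain: List[ExchangeUnit]) -> bool:
    """True if every C domain is followed by a unit activating its accepted substrate."""
    return all(
        upstream.c_acceptor == downstream.activates
        for upstream, downstream in zip(chain, chain[1:])
    )


if __name__ == "__main__":
    # Illustrative units only; the substrates are arbitrary examples.
    val_then_leu = ExchangeUnit(activates="Val", c_acceptor="Leu")
    leu_then_thr = ExchangeUnit(activates="Leu", c_acceptor="Thr")
    phe_then_thr = ExchangeUnit(activates="Phe", c_acceptor="Thr")

    print(is_compatible([val_then_leu, leu_then_thr]))  # labels match
    print(is_compatible([val_then_leu, phe_then_thr]))  # labels do not match
```

In practice specificity is not all-or-nothing, so a real design workflow would need a graded measure of tolerance rather than exact label matching.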

Reprogramming of Assembly Lines via COM Domains

NRPS assembly lines are very often constituted of several individual proteins, each of them assembling specific fragments of the final product (Hoertz et al., 2012; Kessler et al., 2004; Mootz & Marahiel, 1997; Scholz-Schroeder et al., 2003). The activity of each needs to be efficiently coordinated so that the intermediates are presented only at the right protein-protein interface. This interaction is mediated by small regions found at the termini of the proteins, the so-called communication-mediating (COM) domains or docking domains (DDs) (Hahn & Stachelhaus, 2004; Hacker et al., 2018). Analogously to a lock and key system, the COM domains, COMD (donor) and COMA (acceptor), bind to each other in a complementary manner. This provides a platform that allows a transient, productive and specific interaction between two partner NRPSs, preventing undesired coupling events that would lead to a shortened product or a complete halt. Certain COM domains exhibit strict specificity toward their partner, while others appear more relaxed. The key is in specific conserved motifs within the COM domains that interact directly with each other (Hahn & Stachelhaus, 2006). Because of the small size of COM domains (roughly 20–30 amino acids), they have great potential for engineering purposes, as one could easily introduce partner COM domains at the termini of two NRPSs with the goal of generating a hybrid compound. Indeed, several studies have shown that COM domains can be used to reprogram NRPS assembly lines (Cai et al., 2019; Chiocchini et al., 2006; Hacker et al., 2018; Hahn & Stachelhaus, 2006; Liu et al., 2016). Entire biosynthetic machineries could be reprogrammed simply by swapping partner COM domains, therefore forcing alternative assembly of the final products. In one case the hybrid assembly line was derived from three different biosynthetic systems, while the production yield was always comparable with that of the native ones. 
In one of these works (Liu et al., 2016), targeted mutations within the COM domains were enough to alter the selectivity of the NRPS subunits, leading to the production of novel lipopeptides with antifungal and antimicrobial activity. The main advantages of a COM-based NRPS engineering approach are the relatively easy design and construction of fusion proteins, the small structural changes to the overall architecture of the enzymes, and, in principle, its broad applicability. The main limiting factor for such an approach is that the individual specificities of the adjacent modules and domains that are brought together must be at least partially compatible for the synthesis to proceed.
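The lock-and-key logic of COM-domain swapping can be illustrated with a small toy model. The sketch below is only an illustration under strong assumptions: COM domains are reduced to text labels that either match exactly or not (relaxed specificities are ignored), and the subunit names (NRPS1 to NRPS3) and labels (com1, com2) are hypothetical and not taken from any of the cited studies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Subunit:
    """One NRPS protein with hypothetical COM labels at its termini."""
    name: str
    acceptor: Optional[str]  # acceptor COM at the N-terminus (None = initiating subunit)
    donor: Optional[str]     # donor COM at the C-terminus (None = terminating subunit)


def assembly_order(subunits):
    """Chain subunits by pairing each donor label with the identical acceptor label."""
    by_acceptor = {s.acceptor: s for s in subunits if s.acceptor is not None}
    starts = [s for s in subunits if s.acceptor is None]
    if len(starts) != 1:
        raise ValueError("expected exactly one initiating subunit")
    order = [starts[0]]
    while order[-1].donor is not None:
        partner = by_acceptor.get(order[-1].donor)
        if partner is None or partner in order:
            raise ValueError(f"no compatible partner for {order[-1].name}")
        order.append(partner)
    return [s.name for s in order]


# Native arrangement: NRPS1 hands over to NRPS2, which hands over to NRPS3.
native = [
    Subunit("NRPS1", None, "com1"),
    Subunit("NRPS2", "com1", "com2"),
    Subunit("NRPS3", "com2", None),
]

# Swapped COM domains: the same catalytic subunits now dock in a different order.
rewired = [
    Subunit("NRPS1", None, "com2"),
    Subunit("NRPS2", "com1", None),
    Subunit("NRPS3", "com2", "com1"),
]

print(assembly_order(native))
print(assembly_order(rewired))
```

In this simplified picture, only the terminal labels change between the two arrangements, which mirrors why COM swapping leaves the overall enzyme architecture largely untouched. The model says nothing about whether the downstream modules accept the incoming intermediate, which is the limiting factor noted above.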

Penicillium rubens and Natural Products

The discovery and exploitation of nonribosomal peptides—and in general natural products—have had a tremendous impact on the fields of pharmaceutical, food, agricultural, and environmental sciences, and their many applications. In this context, the study and development of filamentous fungi as industrial workhorses to produce such compounds was at least of equal importance. Even though early examples of commercial products manufactured via fungal fermentation date back to the first decades of the 20th century (Meyer et al., 2020), one event in particular is arguably the most widely recognized as the catalyst for the boom of fungal biotechnology. The serendipitous discovery of penicillin by Alexander Fleming set a significant milestone for microbiology and medicine, effectively catapulting our society into the modern antibiotic era (Demain & Elander, 1999; Gaynes, 2017). Penicillin was produced by a strain of “mould” which Fleming initially classified as Penicillium rubrum, but later studies corrected this several times, until a recent comparative phylogenetic analysis identified it as P. rubens (Houbraken et al., 2011). Members of the genus Penicillium are ubiquitous soil fungi, commonly associated with spoiled food and poorly ventilated indoor environments. In general, wherever organic material is present, Penicillium species will thrive, carrying out their crucial role as decomposers (Visagie et al., 2014). To date, more than 350 species have been identified as members of this genus (Nielsen et al., 2017), but the most relevant strains in terms of penicillin production derive from a single isolate, the so-called Wisconsin strain (NRRL 1951). Like Fleming's mold, the NRRL 1951 strain and the Wisconsin 54–1255 strain (the first genome sequenced Penicillium strain (Van Den Berg et al., 2008)), originally classified as P. chrysogenum, were also identified as P. rubens. 
The NRRL 1951 strain was isolated from a moldy cantaloupe purchased at a market in Illinois, USA, in the context of a broad screening program for natural penicillin overproducers, and subsequently improved to generate the industrial strains that are used today. These strains are the results of decades of selection processes, collectively known as classical strain improvement (CSI) (Guzmán-Chávez et al., 2018; Salo, 2016). The CSI program involved several rounds of random mutagenesis (e.g., UV, X-ray or nitrogen mustard gas exposure), followed by selection toward desirable properties, such as loss of pigments, improved growth and enhanced levels of penicillin production. Given the random nature of the methods used during the CSI program, the genome of current industrial strains carries a plethora of other mutations that affected secondary metabolism in general, resulting in lower expression levels of penicillin-unrelated secondary metabolism genes, and in some cases even in nonfunctional proteins (Jami et al., 2010; Salo et al., 2015, 2016). Despite these effects, the production of several secondary metabolites has been characterized in P. rubens, e.g., the NRP products roquefortines (Ali et al., 2013), siderophores (Samol, 2015), fungisporin (Ali et al., 2014), and chrysogine (Viggiano et al., 2018), and the polyketides sorbicillinoids (Guzmán-Chávez et al., 2017; Salo et al., 2016) and macrophorins (Mózsik et al., 2021). In many other cases, the exact function of biosynthetic gene clusters (BGCs) still needs to be elucidated. In this respect, modern bioinformatic tools such as SMURF (Guilhamon & Lupien, 2018), CASSIS (Wolf et al., 2016), and antiSMASH (Blin et al., 2019) have revolutionized the field, as they allow the identification of BGCs from the genomic sequence of an organism and can reveal homologies with gene clusters from other fungi whose function has already been elucidated.
Analysis of the P. rubens genome has identified 33 core biosynthetic genes (encoding synthases or synthetases), for most of which no product is known. These genes encode 10 NRPSs, 20 PKSs, 2 hybrid NRPS–PKSs, and 1 dimethyl-allyl-tryptophan synthase (DMATS) (Guzmán-Chávez, 2018; Salo, 2016; Samol, 2015; Van Den Berg et al., 2008). In the next paragraphs, we will briefly discuss the main findings on the 10 NRPSs identified in P. rubens, before focusing on the biosynthetic pathway of penicillin and its biotechnological potential.

Nonribosomal Peptide Synthetases in P. rubens

With only 10 ORFs identified in its genome, P. rubens possesses a relatively low number of NRPSs compared to other common filamentous fungi that are employed as cell factories (e.g., Aspergillus nidulans carries 27 genes that encode NRPSs) (Kjærbølling et al., 2020; von Döhren, 2009). Extensive characterization studies based on gene deletion and/or overexpression (Ali et al., 2013, 2014; Samol, 2015; Viggiano et al., 2018) led to the identification of the associated products and biosynthetic pathways for most of these NRPSs (Table 1 and Fig. 6). Intriguingly, each of these pathways produces a large variety of related compounds, which is possibly due to the promiscuity of the A domains of the NRPS enzymes involved, or to the employment of highly branched tailoring pathways where individual enzymes are often capable of catalyzing more than one type of modification.

Table 1.

Nonribosomal Peptide Synthetases in Penicillium rubens and Associated Biosynthetic Pathways, Modified from Guzmán-Chávez et al. (2018)

Gene ID	Gene	Protein	Domain organization	Product	Pathway
Pc13g05250	pssC	Type IV ferrichrome synthetase	A₁TCA₂TCTCA₃TCTCT	Ferrichrome	Ferrichrome
Pc13g14330	—	Tetrapeptide synthetase	CA₁TECA₂TCA₃TCA₃TCA₄TC	—	—
Pc16g03850	pssA	Penicillium siderophore synthetase A	ATCTC	Coprogen B	Coprogens
Pc16g04690	hcpA	Hydrophobic cyclic tetrapeptide synthetase	A₁TECA₂A₃TCA₄TECTCT	Cyclic tetrapeptides	Fungisporin
Pc21g01710	nrpsA	Brevianamide synthetase	A₁TCA₂T	Brevianamide F	Brevianamides
Pc21g10790	—	Hexapeptide synthetase	A₁TCA₂TCA₃TECA₄TCA₅TCA₆TC	—	—
Pc21g12630	chyA	Dipeptide synthetase	A₁TCA₂TC	2-(2-Aminopropanamido) benzoic acid	Chrysogine
Pc21g15480	roqA	Dipeptide synthetase	A₁TCA₂TC	Histidyl-tryptophanyl-diketopiperazine (HTD)	Roquefortine C/Meleagrin
Pc21g21390	pcbAB	l‐δ‐(α-Aminoadipoyl)-l‐cysteinyl-d‐valine (ACV) synthetase	C*A₁TCA₂TCA₃TETe	ACV tripeptide	Penicillin
Pc22g20400	pssB	Penicillium siderophore synthetase B	ATCTC	Fusarinine C	Fusarinines
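The domain organization strings in Table 1 can be read as a compact notation in which each symbol stands for one domain. As a minimal sketch, the following Python snippet splits such a string into its domain symbols and counts the adenylation (A) domains. The symbol set (A, T, C, C*, E, Te) is taken from the table itself; the function names and the dictionary holding a few rows of the table are illustrative assumptions.

```python
import re

# Domain symbols as they appear in Table 1. "C*" and "Te" are listed before
# "C" and "T" so that the longer symbols are matched first. Adenylation
# domains may carry a Unicode subscript index (U+2080 to U+2089).
DOMAIN_PATTERN = re.compile(r"C\*|Te|A[\u2080-\u2089]*|T|C|E")


def parse_domains(organization: str) -> list[str]:
    """Split a domain organization string into its individual domain symbols."""
    tokens = DOMAIN_PATTERN.findall(organization)
    if "".join(tokens) != organization:
        raise ValueError(f"unrecognized symbol in {organization!r}")
    return tokens


def count_adenylation_domains(organization: str) -> int:
    """Count A domains, i.e. the domains that select and activate substrates."""
    return sum(1 for domain in parse_domains(organization) if domain.startswith("A"))


# A few rows copied from Table 1 (gene name -> domain organization).
TABLE_1_SUBSET = {
    "pssA": "ATCTC",
    "hcpA": "A₁TECA₂A₃TCA₄TECTCT",
    "roqA": "A₁TCA₂TC",
    "pcbAB": "C*A₁TCA₂TCA₃TETe",
}

for gene, organization in TABLE_1_SUBSET.items():
    domains = parse_domains(organization)
    print(gene, len(domains), count_adenylation_domains(organization))
```

The sketch only tokenizes the notation; it does not attempt to group domains into modules, since module boundaries are not encoded unambiguously in the strings.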

Fig. 6.

Structures of all known major NRPS products from Penicillium rubens.

Of the NRPSs identified, three (PssA, PssB, and PssC; Pss stands for Penicillium siderophore synthetase) are involved in the synthesis of three major classes of siderophores: coprogens, fusarinines, and ferrichromes, respectively. PssA and PssB are essential for iron acquisition in P. rubens, with deletion strains showing major growth defects under iron starvation conditions (Samol, 2015). Interestingly, PssA also seems to be involved in the synthesis of fusarinines, suggesting some type of cross-talk between the two individual NRPS systems (Samol, 2015), a process that seems to occur more frequently in fungal secondary metabolism and siderophore biosynthesis (Huang et al., 2020; Lazos et al., 2010; Sheridan et al., 2015). The enzymes RoqA and NrpsA belong, respectively, to the biosynthetic pathways of roquefortines (and meleagrin) and brevianamides, indole alkaloids that are found in several species of Penicillium and Aspergillus and are common fungal contaminants in the food industry (Borthwick, 2012; Kokkonen et al., 2005). RoqA and NrpsA participate in the first step of their respective pathways, providing the 2,5-diketopiperazine precursors for the synthesis of the bioactive compounds. Despite having shown interesting antifungal and insecticidal properties in some studies (Nishanth Kumar et al., 2014; Paterson et al., 1990; Tang et al., 2015), these compounds are often neurotoxic or hepatotoxic, and they can pose a great danger for humans when ingested at high doses (Borthwick, 2012; Borthwick & Da Costa, 2017; Rand et al., 2005). HcpA is a large tetrapeptide synthetase that utilizes aromatic and aliphatic substrates to synthesize different types of hydrophobic cyclic peptides (Ali et al., 2014).
This ability derives from the intrinsic promiscuity of the adenylation domains of HcpA. Each A domain can activate two different substrates (M1: Phe and Tyr; M2: Trp and Phe; M3,4: Val and Ile), resulting in 10 different combinations, with the most abundant being the metabolite fungisporin (Studer, 1969). Although the function of these metabolites remains unknown, a ΔhcpA strain of P. rubens formed colonies with a smooth surface, as opposed to the classic wrinkled surface (Ali et al., 2014). This could indicate a potential involvement of the cyclic tetrapeptides in determining the correct surface hydrophobicity of a colony, ultimately influencing its ability to exchange nutrients and gases with the environment. ChyA is a dimodular NRPS that catalyzes the condensation of the substrates alanine and anthranilic acid to the dipeptide 2-(2-aminopropanamido) benzoic acid (Salo, 2016; Viggiano et al., 2018). This molecule serves as precursor for the synthesis of chrysogine and related compounds, which involves up to 5 more enzymes (ChyC-D-E-H-M) (Viggiano et al., 2018). Chrysogine is a yellow alkaloid pigment that seems to be common among filamentous fungi. Indeed, it has been found in several Penicillium, Aspergillus and Fusarium species (Nicoletti & Trincone, 2016; Pildain et al., 2008; Wollenberg et al., 2017). Its bioactivity has only been briefly investigated, without revealing any toxicity toward cancer cell lines or microorganisms, and it was therefore not further considered for potential pharmaceutical applications (Nicoletti & Trincone, 2016; Pildain et al., 2008). The most studied and best characterized NRPS is l‐δ‐(α‐aminoadipoyl)‐l‐cysteinyl‐d‐valine synthetase (ACVS), the first enzyme of the penicillin biosynthetic pathway (Baldwin & Abraham, 1988).
Given the historical importance of penicillin, we will discuss ACVS in detail in the next section, with a particular emphasis on the latest insights and biotechnological potential.

ACV Synthetase and the Biosynthetic Pathway of Penicillin

Since its discovery, the biosynthetic pathway of penicillin has been arguably one of the most studied secondary metabolite routes in microbiology. In this respect, one of the first milestones was the identification of the tripeptide l‐δ‐(α‐aminoadipoyl)‐l‐cysteinyl‐d‐valine (ACV), which was extracted for the first time from the mycelium of P. chrysogenum (as discussed above, most likely P. rubens in this case as well) in 1959 (Arnstein & Morris, 1960). Despite it being the first precursor of all β-lactam antibiotics, the synthesis of ACV was one of the last steps of the pathway to be elucidated (Tahlan et al., 2017). Given the similarity with glutathione (γ-l-glutamyl-l-cysteinylglycine), it was initially thought that ACV was synthesized in a similar manner, in two enzymatic steps: synthesis of a dipeptide precursor followed by attachment of the third moiety (Banko et al., 1986; Lu, 2013). A few years later, Arnold Demain and coworkers identified the ACV synthetase (ACVS) as a single multifunctional enzyme belonging to the same family as the gramicidin S and tyrocidine synthetases (Banko et al., 1987). His work was not only instrumental for the discovery of the enzyme and the elucidation of its reaction mechanism, it also enabled the development of purification, characterization and assay procedures that became a hallmark of ACVS and NRPS biochemistry (Aharonowitz et al., 1993; Zhang & Demain, 1992). ACVS is a linear trimodular NRPS, where each module is responsible for the activation and incorporation of a specific substrate: l-α-aminoadipic acid (l-α-Aaa), l-cysteine and l-valine, respectively (Fig. 7). The first substrate is an unusual amino acid that is an intermediate in the biosynthetic pathway of lysine (Neshich et al., 2013; Zabriskie & Jackson, 2000).
ACVS enzymes from different organisms have been extensively studied and characterized in the past decades, with particular attention to substrate specificity and enzymatic activity (Baldwin et al., 1990, 1994; Byford et al., 1997; Coque et al., 1991, 1996; Etchegaray et al., 1997; Iacovelli et al., 2020; Siewers et al., 2009; Theilgaard et al., 1997; Wu et al., 2012). Generally, they appear to be rather specific enzymes that are not capable of activating a broad array of substrates. In particular, the first module of ACVS can only activate a few structural analogues of l-α-Aaa, and only with low yields. Thus, this module possesses a very strict specificity toward its native substrate (Baldwin et al., 1994; Iacovelli et al., 2020). Interestingly, ACVS is the only NRPS known to date that can recognize and activate l-α-Aaa (Flissi et al., 2020). Furthermore, the adenylation reaction occurs on the side chain (δ) carboxyl group, resulting in a noncanonical peptide bond between l-α-Aaa and the second substrate l-cysteine (Iacovelli et al., 2020; Tahlan et al., 2017). In a recent study, a novel conserved domain was identified at the N-terminus of the first module of ACVS, with a partial homology to condensation domains (Iacovelli et al., 2021). Though its function remains to be elucidated, it appears essential for the adenylation of l-α-Aaa and therefore for the functionality of the entire enzyme. This unusual domain might be involved in the proper positioning of l-α-Aaa in the binding pocket of the adenylation domain, or it might have a critical role in maintaining the functional fold of module 1. Taken together, these observations suggest that ACVS is a rather unique and interesting NRPS.

Fig. 7.

ACVS domain organization and biosynthetic pathway of penicillin in P. rubens. ACVS consists of a total of 11 domains organized in three modules (C*AT1-CAT2-CATETe3), each responsible for the activation and incorporation of a specific substrate: respectively, l-α-aminoadipic acid (l-α-Aaa), l-cysteine and l-valine. The three amino acids are incorporated into the tripeptide δ-(l-α-aminoadipoyl)-l-cysteinyl-d-valine ((l,l,d)-ACV). In Penicillium, ACV is converted into isopenicillin N (IPN) by isopenicillin N synthase (IPNS), which catalyzes the formation of the β-lactam ring. Subsequently, the enzyme acyl-coenzyme A: isopenicillin N acyltransferase (IAT) catalyzes the exchange of the aminoadipoyl moiety with a phenylacetyl group, generating penicillin G.

ACVS is responsible for carrying out the first reaction in the penicillin biosynthetic pathway. The following step is performed by the enzyme isopenicillin N synthase (IPNS), which catalyzes the formation of the β-lactam ring at the l-cysteinyl-d-valine moiety (Fig. 7) (Schenk, 2000). These first two steps occur in the cytosol (Van De Kamp et al., 1999; Weber et al., 2012) and are shared among all β-lactam-producing organisms. Following independent biocatalytic routes observed in different organisms, IPN is then utilized for the synthesis of distinct classes of antibiotics: penicillins, cephalosporins, cephamycins, and clavams (Ozcengiz & Demain, 2013). P. rubens possesses one of the simplest of these pathways, where IPN is processed only by a third enzyme, acyl-coenzyme A: isopenicillin N acyltransferase (IAT). IAT utilizes an acyl-CoA donor to exchange the aminoadipate moiety of IPN, generating the final product.
Depending on the presence and abundance (or external feeding) of different carboxylic acids, P. rubens can produce a range of penicillins: penicillin G (benzylpenicillin) when phenylacetyl-CoA is utilized by IAT; penicillin V (phenoxymethylpenicillin) from phenoxyacetyl-CoA; penicillin K (octanoylpenicillin) from octanoyl-CoA; and, to a lesser extent, other aliphatic and aryl aliphatic penicillins (Ball et al., 1978; Ferrero et al., 1990; Luengo et al., 1986). These acyl moieties need to be activated as acyl-CoA by an independent phenylacetate-CoA ligase (PCL), which has been shown to be able to activate several types of acyl substrates (Koetsier et al., 2009). The enzymatic reactions carried out by PCL and IAT take place in microbodies, organelles that are related to peroxisomes and that maintain a slightly alkaline pH in their lumen (Kiel et al., 2009; Müller et al., 1992; Van De Kamp et al., 1999).
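The relationship between the acyl-CoA donor used by IAT and the penicillin that is formed can be summarized as a simple lookup. The Python sketch below is only a restatement of the three-enzyme route described above in data form; the donor-to-product pairs are those named in the text, while the dictionary and function names are hypothetical.

```python
# Acyl-CoA donor accepted by IAT -> penicillin formed (pairs named in the text).
PENICILLIN_BY_ACYL_COA = {
    "phenylacetyl-CoA": "penicillin G (benzylpenicillin)",
    "phenoxyacetyl-CoA": "penicillin V (phenoxymethylpenicillin)",
    "octanoyl-CoA": "penicillin K (octanoylpenicillin)",
}


def penicillin_route(acyl_coa: str) -> list[str]:
    """List the three enzymatic steps from amino acids to the final penicillin.

    Raises KeyError for donors that are not among the examples listed above.
    """
    product = PENICILLIN_BY_ACYL_COA[acyl_coa]
    return [
        "ACVS: L-alpha-Aaa + L-Cys + L-Val -> ACV",
        "IPNS: ACV -> isopenicillin N (IPN)",
        f"IAT: IPN + {acyl_coa} -> {product}",
    ]


for step in penicillin_route("phenoxyacetyl-CoA"):
    print(step)
```

Only the last step depends on the side-chain precursor, which reflects why feeding different carboxylic acids changes the product without altering the first two enzymes.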

Improved Penicillin Production and Semi-Synthetic β-Lactam Antibiotics

Given its relative simplicity and the importance of β-lactam antibiotics, the biosynthetic pathway of penicillin has been the target of several strategies aimed at improving penicillin yields or producing novel bioactive compounds since its very discovery. As briefly discussed above, the original P. rubens strain (NRRL 1951) was subjected over the years to several mutational approaches that generated a wide array of high-yielding strains for the production of penicillin (Peñalva et al., 1998; Peterson & Tornqvist, 1956; Ziemons et al., 2017). Furthermore, penicillin production can also be improved by manipulating the metabolic pathways that degrade its precursors, utilizing appropriate carbon sources and medium pH, or controlling the transcriptional levels of the biosynthetic genes (Peñalva et al., 1998; Weber, Polli, et al., 2012). Another important event was the discovery that the IAT enzyme could use different acyl-CoA precursors to generate alternative penicillins (Behrens & Corse, 1948), a process that can be steered by feeding the desired carboxylic acid (Havn Eriksen et al., 1994). The most prominent example is the production of penicillin G, which is the major product of industrial fermentations, based on feeding phenylacetic acid as substrate for PCL and IAT. Another common β-lactam is penicillin V, where phenoxyacetic acid or phenoxyethanol (later fermented to the respective acid) are fed to the fermentation broth (Ball et al., 1978). However, the feeding approach is clearly limited by the native specificities of PCL (and related CoA ligases) and of IAT.
This limitation was partially overcome with the development of semi-synthetic penicillin antibiotics (SSPAs). These compounds derive from “natural” penicillins (such as penicillin G), which are first hydrolyzed to remove the acyl moiety and generate 6-aminopenicillanic acid (6-APA). This process is achieved by employing penicillin acylases, enzymes that are naturally present in P. rubens and many bacteria but that, for industrial applications, are often produced in large amounts in recombinant strains of E. coli and further improved by protein engineering (Alkema et al., 2000; Cole, 1966; Erickson & Bennett, 1965; Sio & Quax, 2004; Tishkov et al., 2010). In the following step (either enzymatic or chemical), 6-APA is acylated using “non-natural” donor substrates to produce the antibiotic with the desired moiety. This has led to the production of common broad-spectrum antibiotics such as ampicillin and amoxicillin (Blum et al., 2010; Deng et al., 2016; Moody & Boesten, 2006; Wu et al., 2010). Over the years, many more SSPAs have been developed, each presenting its own advantages and disadvantages in terms of antibiotic activity spectrum and pharmacokinetics. Despite continuous improvements, the production of SSPAs remains a relatively costly process, since generally 6-APA needs to be isolated and purified from the cultivation medium, and the acyl precursors need to be prepared with high yields and degree of purity (Deng et al., 2016). Furthermore, the use of organic solvents, catalysts and other compounds that can produce by-products is often required. To render the process less costly and more environmentally friendly, in vitro one-pot biocatalytic cascades that use penicillin acylases (and mutated variants) have been developed (Blum et al., 2010; Deng et al., 2016; Gabor & Janssen, 2004; Jager et al., 2008; Wu et al., 2010; Zhang et al., 2010).
A major engineering achievement for β-lactam production in P. rubens was the reprogramming of strains for the synthesis of cephalosporins, which are naturally absent in this organism (Crawford et al., 1995). The main difference between cephalosporins and penicillins is the presence of a dihydrothiazine ring in place of the thiazolidine ring, which is introduced by the activity of a penicillin N expandase (Cooper, 1993). In that work, the cefE gene of Streptomyces clavuligerus was introduced in P. rubens, and adipate was added to the fermentation medium, resulting in the IAT-catalyzed formation of the precursor adipoyl-6-aminopenicillanic acid. The latter is an excellent substrate for the expandase, which catalyzes its conversion to adipoyl-7-aminodeacetoxycephalosporanic acid (adipoyl-7-ADCA), which in turn can be used as a synthon for the semisynthetic production of cephalosporins (Robin et al., 2001). This process has been implemented at an industrial scale. Further examples are the production of cephalosporin C, achieved with the combined expression of isopenicillin N epimerization, ring expansion, and acetylation genes (Ullán et al., 2007), and the formation of a carbamoylated cephem antibiotic precursor that involved the expression of combined expandase/hydroxylase, carbamoyltransferase, and transporter genes (Harris et al., 2009; Nijland et al., 2008). These accomplishments further demonstrate the potential of metabolic engineering in P. rubens and its versatility as a β-lactam-producing platform.

Toward a Two-Step Fermentative Synthesis of β-Lactams

Recent advances in protein engineering, and in particular NRPS engineering, offer a great advantage for the development of engineered enzymatic routes for the production of novel compounds. In this respect, if one could successfully engineer the first two enzymes of the pathway (ACVS and IPNS), the synthesis of (novel) β-lactams of interest could be achieved in two simple steps (Fig. 8), bypassing the need for tailored PCL and IAT activities. Furthermore, recent achievements in strain engineering, such as the development of a CRISPR-Cas9-based tool for genomic editing (Pohl et al., 2016) and the development of a secondary metabolite-deficient P. rubens strain for natural product production (Pohl et al., 2020), would allow for the required engineering to be performed in vivo, with the ultimate goal of generating high-yielding strains. With these, novel (or semi-synthetic) β-lactam compounds could be produced in a completely fermentative manner, considerably reducing the costs and environmental impact of the synthesis process.

Fig. 8.

Proposed engineered pathway for the synthesis of novel β-lactams. An engineered variant of ACVS synthesizes the tripeptide precursor that already includes the desired moiety (X). An engineered IPNS then catalyzes the formation of the β-lactam ring, generating the novel compound in only two enzymatic steps. Potentially, such a strategy can also be applied to produce (semi-) synthetic β-lactam compounds in a fully fermentative manner.

Several structural and mutagenesis studies of IPNS, often complexed with substrates or analogues (Ge et al., 2009; Kreisberg-Zakarin et al., 1999; Loke & Sim, 1999; Long et al., 2003, 2005; Roach et al., 1997), provided crucial insights on the binding of ACV and the reaction mechanism. While the β-lactam ring formation clearly involves solely the cysteinyl-valine moiety, the aminoadipate moiety is important for the binding of the substrate (Loke & Sim, 1999). Targeting specific residues with mutagenesis experiments may lead to IPNS mutants capable of recognizing alternative tripeptides with a different moiety at the first position. Since the CV moiety would stay the same, the ability to form the β-lactam ring should remain intact. Importantly, such engineering attempts are aided by the availability of high-quality crystal structures of IPNS (Clifton et al., 2013; Daruzzaman et al., 2013; Roach et al., 1995, 1997). However, for ACVS the picture is rather different. To date, no structural studies of either the full enzyme or isolated modules have been reported, leading to a general lack of information on the substrate binding and reaction mechanism. To develop targeted strategies such as those proposed above for IPNS, obtaining structures of the catalytic domains of ACVS at near-atomic resolution (ideally bound to substrates) is paramount. In general, NRPS enzymes are difficult to crystallize given their large size and dynamic architecture.
In this respect, recent advances in electron cryo-microscopy (cryo-EM) could offer a valuable tool to overcome these limits (Nakane et al., 2020). Recently, the first adenylation domain of a bacterial ACVS has been engineered to change the substrate specificity using a subdomain swap strategy (Kries et al., 2015), but activity was reported for only one of the hybrid NRPSs constructed. In that case, the bacterial subdomain was replaced with a fungal homologue (same specificity), resulting in a functional, yet impaired, hybrid capable of synthesizing ACV (Iacovelli et al., 2020). Although the reason for the inactivity of the other hybrid NRPSs still needs to be resolved, it seems likely that the alternative substrates could either not be activated (faulty adenylation domain) or not be incorporated into the peptide chain (hampered condensation reaction). Because of the large size of ACVS and the complex assays required to measure its activity (peptide formation), these enzymes are less amenable to high-throughput mutagenesis approaches or evolutionary engineering. Only through a major breakthrough in the structural biology of ACVS will it be possible to refine and tailor the engineering strategies that have been developed in recent years. This may finally allow the desired modification of the substrate specificity of the first module of ACVS, ultimately leading to β-lactam diversification. In particular, great potential lies within the exchange unit-based approaches that take into account the natural specificity of condensation domains (Bozhüyük et al., 2018, 2019).

Concluding Remarks

Nonribosomal peptide synthetases are complex, modular machines that synthesize a myriad of bioactive natural products. Great efforts have been made in the past decades to elucidate the structural and functional features of these enzymes, with the ultimate aim of harnessing their biotechnological potential. Additionally, an extensive network of accessory proteins involved in NRP synthesis was unraveled. The complex interactions between all these players greatly expand the chemical diversity of NRPs, granting them some of the features needed for bioactivity. This new information led to a greater understanding of how the NRPS machinery operates, setting the basis for the development of increasingly advanced protein engineering approaches. In the last decade especially, significant progress has been made, although a universal “plug and play” approach has not been developed yet. However, a wide variety of viable strategies are now available, each with its own advantages and disadvantages. Combined with the increasing availability of NRPS structures, this will allow researchers to select and tailor specific approaches based on their goal, target NRPS and organism of origin, opening a new era for the combinatorial biosynthesis of bioactive peptides. Penicillium rubens is one of the first filamentous fungi to have been recognized for its potential and harnessed for the production of a natural product, penicillin. Following the pioneering work of Prof. Arnold Demain on the biosynthetic pathways of penicillin and cephalosporin, natural product synthesis in this organism has been extensively studied and manipulated to address the rising need for bioactive molecules, and in particular antibiotics. In this respect, establishing a simple two-step biosynthetic pathway (and the fermentative synthesis) for the production of novel β-lactam antibiotics certainly merits further attention.
In a society that needs new antibiotics to combat resistant bacterial infections and a solid transition toward a sustainable, bio-based economy, such a development would bring an exceptional added value for fungal biotechnology.

Funding

None declared.

297 in total

Review 1. Improved beta-lactam acylases and their use as industrial biocatalysts.

Authors: Charles F Sio; Wim J Quax
Journal: Curr Opin Biotechnol Date: 2004-08 Impact factor: 9.740

2. Nonprocessive [2 + 2]e- off-loading reductase domains from mycobacterial nonribosomal peptide synthetases.

Authors: Arush Chhabra; Asfarul S Haque; Ravi Kant Pal; Aneesh Goyal; Rajkishore Rai; Seema Joshi; Santosh Panjikar; Santosh Pasha; Rajan Sankaranarayanan; Rajesh S Gokhale
Journal: Proc Natl Acad Sci U S A Date: 2012-03-26 Impact factor: 11.205

Review 3. Genetic manipulation of secondary metabolite biosynthesis for improved production in Streptomyces and other actinomycetes.

Authors: Richard H Baltz
Journal: J Ind Microbiol Biotechnol Date: 2015-09-12 Impact factor: 3.346

4. Involvement of microbodies in penicillin biosynthesis.

Authors: W H Müller; R A Bovenberg; M H Groothuis; F Kattevilder; E B Smaal; L H Van der Voort; A J Verkleij
Journal: Biochim Biophys Acta Date: 1992-04-22

5. Crystal structure of the termination module of a nonribosomal peptide synthetase.

Authors: Alan Tanovic; Stefan A Samel; Lars-Oliver Essen; Mohamed A Marahiel
Journal: Science Date: 2008-06-26 Impact factor: 47.728

6. Module extension of a non-ribosomal peptide synthetase of the glycopeptide antibiotic balhimycin produced by Amycolatopsis balhimycina.

Authors: Diane Butz; Timo Schmiederer; Bianka Hadatsch; Wolfgang Wohlleben; Tilmann Weber; Roderich D Süssmuth
Journal: Chembiochem Date: 2008-05-23 Impact factor: 3.164

7. Rational and efficient site-directed mutagenesis of adenylation domain alters relative yields of luminmide derivatives in vivo.

Authors: Xiaoying Bian; Alberto Plaza; Fu Yan; Youming Zhang; Rolf Müller
Journal: Biotechnol Bioeng Date: 2015-03-02 Impact factor: 4.530

Review 8. The beta-lactam antibiotics: past, present, and future.

Authors: A L Demain; R P Elander
Journal: Antonie Van Leeuwenhoek Date: 1999 Jan-Feb Impact factor: 2.271

9. Biosynthesis of novel Pyoverdines by domain substitution in a nonribosomal peptide synthetase of Pseudomonas aeruginosa.

Authors: Mark J Calcott; Jeremy G Owen; Iain L Lamont; David F Ackerley
Journal: Appl Environ Microbiol Date: 2014-07-11 Impact factor: 4.792