Non-ribosomal peptide synthetases (NRPSs) are attractive targets for biosynthetic pathway engineering due to their modular architecture and the therapeutic relevance of their products. With catalysis mediated by specific protein-protein interactions formed between the peptidyl carrier protein (PCP) and its partner enzymes, NRPS enzymology and control remain fertile ground for discovery. This review focuses on recent efforts within structural biology by compiling high-resolution structural data that shed light on the various protein-protein interfaces formed between the PCP and its partner enzymes, including the phosphopantetheinyl transferase (PPTase), adenylation (A) domain, condensation (C) domain, thioesterase (TE) domain, and other tailoring enzymes within the synthetase. Integrating our understanding of how the PCP recognizes partner proteins with the potential to use directed evolution and combinatorial biosynthetic methods will enhance future efforts in the discovery and production of new bioactive compounds.
Non-ribosomal peptides (NRPs) are small peptide secondary metabolites that microbes assemble independently of ribosomal translation. Given their wide structural diversity, NRPs can act as metal chelators, pigments, and toxins [1]. NRPs also exhibit a variety of therapeutically relevant properties, such as antibiotic, antitumor, and immunosuppressant bioactivities [1]. Commonly used NRP therapeutics include vancomycin, cyclosporin A, bleomycin A2, and polymyxin B [[2], [3], [4], [5]]. The bioactivities of these compounds can be attributed to their complex molecular scaffolds installed by the non-ribosomal peptide synthetase (NRPS).
The NRPS is a modular collection of enzymes that catalyzes the biosynthesis and modification of short peptide products. Central to NRP biosynthesis is the peptidyl carrier protein (PCP) (commonly referred to as a thiolation domain), which is a small ∼80 residue protein that forms a conserved 4-helix bundle (Fig. 2) [6]. Separating helices 1 and 2 is the loop 1 region, an ordered 17–22 residue loop that immediately precedes a conserved serine at the beginning of helix 2. The proper assembly of NRPs requires a series of reactions catalyzed by different NRPS domains. First, the inactive apo-PCP requires the post-translational attachment of a 4′-phosphopantetheine (PPant) arm to the conserved serine residue of the PCP via a phosphopantetheinyl transferase (PPTase) to form holo-PCP (Fig. 1A) [7]. Next, the adenylation (A) domain is responsible for the activation and covalent attachment of a specific amino acid onto the holo-PCP through the adenylation and thiolation reactions (Fig. 1B), which encompass the activation of the amino acid substrate with adenosine triphosphate (ATP) [6], followed by thioester linkage formation to form the peptidyl-PCP. 
Once loaded, the peptidyl-PCPs from upstream and downstream NRPS modules bind to their respective donor or acceptor sites of the condensation (C) domain, which catalyzes peptide bond formation between the PCP-bound substrates (Fig. 1C) [6]. Upon reaching the termination module, the peptidyl-PCP transfers the elongated peptide chain to the thioesterase (TE), which catalyzes the hydrolysis or cyclization of the peptide product for product release (Fig. 1D). In addition to the canonical A, C, and TE domains, tailoring domains that are fused to the enzymatic assembly line (in cis) or act as standalone domains (in trans) may also be included to install unique structural modifications to the peptide product [8]. These chemical modifications include halogenation, dehydrogenation (DH), hydroxylation, formylation (F), methylation, epimerization (E), or acylation. These functionalizations demonstrate the wide variety of partner proteins available for the PCP to generate productive protein-protein interactions to enable transformation of the nascent natural product.
Fig. 2
High-resolution structures of NRPS PCP-partner protein complexes. In the center is the PCP with its conserved secondary structure, where helix 1 (blue) is connected to helix 2 (yellow) by the 17–22 residue loop 1 (green), which is followed by a short helix 3 (orange) and helix 4 (red). At the beginning of helix 2 is the conserved serine, which is modified with a PPant arm (pink). Around the PCP are examples of the different PCP-partner protein complex structures that reveal the protein-protein interactions at the interface. These include the PCP-PPTase (PDB 4MRT), PCP-A domain (PDB 6O6E), PCP-C domain (PDB 6MFZ), PCP-F domain (PDB 5ES9), PCP-DH domain (PDB 6CXT), and PCP-TE domain (PDB 2ROQ). (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
Fig. 1
General reactions catalyzed by canonical NRPS enzymes including the A) phosphopantetheinylation, B) aminoacylation, C) condensation, and D) thioesterification reactions.
The NRPS can be divided into type I and type II proteins. Type I NRPS proteins consist of the canonical enzymatic domains linked together in a single polypeptide chain, analogous to the type I fatty acid synthase (FAS) and the type I polyketide synthase (PKS) [[9], [10], [11]]. Type II NRPS pathways exist as stand-alone proteins or di-domains that are expressed independently from multi-modular NRP biosynthetic enzymes. Type II NRPS proteins commonly exist as part of linear pathways, unlike the iterative nature of the type II FAS and the type II PKS. Although both type I and II NRPS proteins contain tailoring domains, type II NRPS proteins commonly install diverse chemical groups, including dehydrogenated prolines, substituted aromatics, cyclopropanes, and halogenated aliphatics [10]. 
The type II NRPS proteins are commonly found at the initiation step of an NRPS pathway, which includes the A domain and the PCP, and may also include subsequent tailoring domains, such as halogenation, dehydrogenation, hydroxylation, or cyclopropanation domains, that may be associated with downstream type I NRPS systems.
Due to the pharmaceutical relevance of NRPs and the modular architecture of the NRPS, NRPS biosynthesis has been a target for engineering in order to create new natural products with enhanced bioactivities. Early attempts at engineering NRPS systems met with limited success: published combinatorial biosynthetic attempts reported low yields or no product formation [12]. These efforts included swapping a cognate A domain with a non-cognate A domain to change the identity of the incorporated amino acid, a process termed combinatorial biosynthesis [13]. The lack of identified product formation has been attributed to many challenges, among them the hypothesis that engineered systems may lack the proper protein-protein interactions found in wild-type pathways [14]. Domain substitution with non-cognate partner proteins runs the risk of losing the specific protein-protein interactions made at the PCP-partner protein interface, potentially abrogating enzyme turnover. Recent efforts towards the re-engineering of NRPSs have included obtaining high-resolution structural data on the NRPS domains [15], specifically within a PCP-partner protein complex to identify the exact interactions that govern protein recognition (Fig. 2) [16]. This review focuses on recent advancements in such efforts to discover the modes of PCP-partner protein recognition. Understanding the various modes of interaction will prove critical to achieving high-turnover, rationally designed NRPS biosynthesis.
Phosphopantetheinyl transferase and peptidyl carrier protein interface analysis
PPTases are essential enzymes due to their critical roles in both primary and secondary metabolism across all domains of life [17]. PPTases are responsible for the post-translational modification of FAS and PKS acyl carrier proteins (ACPs) in addition to NRPS PCPs. Due to their essential role in fatty acid synthesis, PPTases have served as a promising target for antibiotic drug development [7]. PPTases convert the inactive apo-carrier protein to the active holo-carrier protein through the covalent attachment of a 4′-phosphopantetheine moiety from coenzyme A (CoA) onto a serine conserved in all carrier proteins (Fig. 1A), which is found at the beginning of helix 2 (Fig. 2). In each carrier protein-dependent system, the PPant arm on the holo-carrier protein allows for the covalent tethering of the carboxylic acid substrate in the form of a thioester linkage, and the tethered substrate may then be shuttled by the carrier protein to various enzymatic domains for subsequent modifications and incorporation into the final natural product.
Due to their ability to attach CoA substrates directly onto the carrier protein, PPTases have been applied in biotechnology for the attachment of fluorophores, chemical crosslinkers, and solid supports onto carrier proteins [7]. The PPTase from Bacillus subtilis, Sfp, demonstrates wide substrate scope with respect to both protein and CoA substrates [7]. Sfp has therefore been a heavily utilized tool to append carrier proteins with unnatural cargo and has been crucial for loading unnatural chemical probes onto the carrier protein to aid in stabilizing the carrier protein-partner protein complex for structural analysis. To understand the unique promiscuity observed in Sfp, the X-ray crystal structure of Sfp in complex with CoA and the PCP from tyrocidine NRPS, TycC3, was solved to a resolution of 2.0 Å [18]. 
As a means to promote complex formation, the conserved serine on the PCP was mutated to an alanine to prevent transfer of the PPant arm. Analysis of the PCP-Sfp protein-protein interface revealed a dependence on hydrophobic interactions and the presence of an intermolecular hydrogen bond, similar to the recently solved structure of the ACP-PPTase complex from the Mycobacterium abscessus PKS PpsC [19]. Helix 2 of the PCP is mainly responsible for the hydrophobic interactions at the interface, where Leu46 and Met49 occupy a hydrophobic patch located in the C-terminal portion of Sfp. The single hydrogen bonding interaction is formed by Gln40, located in the loop 1 region of the PCP. Mutagenesis studies further support the importance of the residues that comprise this hydrophobic patch for sustained catalytic activity: mutations that disrupt the hydrophobic interface residues abolish activity. Mutation of the residues responsible for the hydrogen bonding interaction retained enzymatic activity, which suggested that the hydrophobic patch is important for PCP recognition and implicated the hydrogen bonding interaction in Sfp promiscuity for non-cognate carrier proteins [18].
Studies involving Sfp as a tool to tag and modify short peptide sequences further support the role of the hydrophobic patch for carrier protein substrate recognition in Sfp [20]. Experiments in which short segments from the PCP flanking the conserved serine were incubated with Sfp led to the identification of minimal peptide sequences that can be recognized and modified by Sfp. Consistent with the Sfp-TycC3 PCP complex X-ray crystal structure, peptides that formed an α-helix analogous to the carrier protein helix 2 and reproduced its hydrophobic interaction with Sfp could be loaded with a fluorescently modified PPant [20,21]. 
These results have been further combined with computational approaches such as machine learning to develop the utility of Sfp and short peptide sequences as a tag to modify and functionalize proteins [22].
Adenylation domain and peptidyl carrier protein interface analysis
The A domain is a critical player in NRP biosynthesis due to its role in the activation and attachment of a specific substrate onto the PCP prior to substrate incorporation into the natural product. The N-terminal core of the A domain, Acore, (residues ∼1–400) houses the substrate binding pockets for ATP, a magnesium ion, and an amino acid [23]. While the ATP and magnesium ion binding are conserved across A domains, the substrate binding pocket for the amino acid varies, and has been demonstrated to distinguish the binding of the various acid substrates across A domain homologs. Upon binding of these three substrates, the A domain exists in the adenylation state and will catalyze the adenylation reaction through a conserved catalytic lysine located in the A10 motif of the Asub domain (residues ∼400–500). This forms an amino acid-adenylate intermediate in the active site upon loss of a pyrophosphate (Fig. 1B). The Asub domain undergoes a domain alternation to form the thiolation state, which is the rotation of the Asub by ∼140° along a hinge region in the A8 loop to form a new catalytic active site for thiolation, as well as a protein-protein interface that can bind its cognate PCP. This Asub domain rotation has been uncovered in multiple A domain crystal structures bound to different substrates [[23], [24], [25], [26], [27], [28], [29], [30]]. Upon holo-PCP binding to the A domain in the thiolation state, the PPant arm extends into the active site of the A domain, where the thiolation reaction is catalyzed to form a new thioester bond with the amino acid substrate with adenosine monophosphate (AMP) as a leaving group (Fig. 3) [23]. The AMP and substrate-loaded PCP dissociate and the A domain is ready to catalyze the next set of adenylation and thiolation reactions.
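The domain alternation described above can be pictured as a rigid-body rotation of the Asub about a hinge. The following is a minimal Python sketch under stated assumptions, not a method from the cited studies: the hinge position, the rotation axis, and the pseudo-atom coordinate are hypothetical placeholders, and only the ∼140° angle comes from the text. It uses Rodrigues' rotation formula to show how a point on the Asub would be displaced.

```python
import math


def rotate_about_axis(point, axis_point, axis_dir, angle_deg):
    """Rotate `point` by `angle_deg` about the line through `axis_point`
    with direction `axis_dir` (Rodrigues' rotation formula)."""
    theta = math.radians(angle_deg)
    norm = math.sqrt(sum(c * c for c in axis_dir))
    k = tuple(c / norm for c in axis_dir)                 # unit axis vector
    v = tuple(p - a for p, a in zip(point, axis_point))   # point relative to hinge
    k_dot_v = sum(ki * vi for ki, vi in zip(k, v))
    k_cross_v = (
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    )
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = tuple(
        vi * cos_t + ci * sin_t + ki * k_dot_v * (1.0 - cos_t)
        for vi, ci, ki in zip(v, k_cross_v, k)
    )
    return tuple(r + a for r, a in zip(rotated, axis_point))


if __name__ == "__main__":
    # Hypothetical values: hinge at the origin, axis along z,
    # and an Asub pseudo-atom 10 Å away from the hinge axis.
    hinge = (0.0, 0.0, 0.0)
    axis = (0.0, 0.0, 1.0)
    asub_atom = (10.0, 0.0, 0.0)
    moved = rotate_about_axis(asub_atom, hinge, axis, 140.0)
    print("position after rotation:", moved)
    print("displacement (Å):", math.dist(asub_atom, moved))
```

In a real analysis the axis and hinge would be derived by superposing the adenylation-state and thiolation-state structures on the Acore; here they are arbitrary and serve only to illustrate the size of the motion.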
Fig. 3
X-ray crystal structures of PCP-A domain complexes. The overall crystal structures of A) PltL-PltF complex (PDB 6O6E) and B) HitD-HitB (PDB 6M01), where the A domain is colored according to the Acore (white) and the Asub (gray) at the C-terminal end. The structures of chemical probes C) utilized in PCP-A domain structural analysis: 1 is a salicylate-AVS inhibitor; 2 is a valyl-AVS inhibitor; 3 is a seryl-AVS inhibitor; 4 is a prolyl-AVS inhibitor (electrophilic trap in red); 5 is a valine pantetheineamide substrate mimic; and 6 is a bromoacetamide pantetheine crosslinker. The interface residues from the D) PltL-PltF complex and E) HitD-HitB complex are shown in ball and stick. Hydrogen bonding and electrostatic interactions are shown with black dashed lines. The PCPs from both structures are colored as previously described, where helix 1 is blue, loop 1 is green, helix 2 is yellow, helix 3 is orange, and helix 4 is red, and the PPant/ligands are pink. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
Because of the A domain's role as a gatekeeper in controlling substrate incorporation into the natural product, it has been the main target in NRPS engineering through A domain substitution or active site engineering [12,[31], [32], [33], [34], [35]]. Many of the early efforts in A domain substitution, however, were met with limited success, which was suggested to be due to the lack of proper protein-protein interactions [36,37]. Thus, large efforts have been made in determining the molecular basis of the PCP-A domain interaction; many of these efforts were spearheaded through the use of chemical biology tools in combination with structural biology to unveil the specific protein-protein interactions responsible for binding and therefore substrate loading. One of the main chemical probes utilized was the adenosine vinylsulfonamide (AVS) inhibitor (Fig. 3C), which was initially designed as a substrate mimic to the aminoacyl-adenylate intermediate [26,27,[38], [39], [40], [41], [42], [43]]. This inhibitor incorporates an electrophilic trap in the form of a Michael acceptor that would be attacked by the PPant thiol, thus covalently linking the PCP to the probe while it is non-covalently bound in the A domain in the thiolation state.
Initial structural analysis of type II PCP-A domain complexes
The first two X-ray crystal structures of the PCP-A domain complex were solved from the type II PCP and A domain from enterobactin biosynthesis, EntB-EntE, and the type II PCP-A di-domain from an unknown biosynthetic pathway, PA1221 [26,27]. In enterobactin biosynthesis, the initiation step involves the A domain, EntE, which activates 2,3-dihydroxybenzoic acid and transfers it to the PCP, EntB [27]. In PA1221, the A domain activates valine and loads it onto the PCP [26]. These studies utilized the AVS inhibitor modified with the appropriate acyl substrate, AVS probes 1 and 2 (Fig. 3C), respectively, which allowed the trapping, crystallization, and structure determination of the otherwise transient PCP-A domain complex in the thiolation state [26,27].
Generally, both X-ray structures revealed that the PCP helix 2 and loop 1 regions formed specific protein-protein interactions with a composite interface formed by the Acore and Asub domains, respectively. The EntB-EntE X-ray crystal structure uncovered a large dependence on hydrophobic interactions located on loop 1 and helix 2 in addition to three salt bridge interactions that exist on loop 1 and helix 2 [27]. The PA1221 PCP-A domain X-ray crystal structure identified three hydrophobic interactions at the PCP helix 2 and multiple hydrogen bonding interactions found at helix 1, loop 1, and helix 2 [26]. While the interface locations were consistent, the types of interactions at the interfaces varied between the two structures. The high-resolution information of the EntB-EntE interface allowed the rational design of a homologous A domain from acinetobactin biosynthesis, BasE, to improve activity with the non-cognate PCP, EntB [27]. The successful BasE mutations involved swapping potential BasE interface residues with those of EntE based on sequence alignments, which replaced BasE hydrophobic interactions with the electrostatic interactions observed in the EntB-EntE structure. 
These mutations improved BasE initial velocity rates with EntB by 15-fold.
Application of chemical probes to type I PCP-A domain systems
The next set of significant PCP-A domain studies involved applying chemical probes to study the larger type I NRPS systems from the first NRPS module of enterobactin biosynthesis, EntF, and the first NRPS module from linear gramicidin biosynthesis, LgrA. The EntF crystal structure was solved in the thiolation state using the AVS probe 3, which revealed a PCP-A domain interface dependent mainly on hydrophobic interactions located at the EntF PCP loop 1, helix 2, and helix 3 regions [28]. Additionally, hydrogen bonds were found at EntF PCP helix 1, loop 1, and helix 2. The LgrA crystal structure in the thiolation state was solved using probe 5 (Fig. 3C), a valyl-pantetheinamide probe, and its PCP-A domain interface was found to be similar to that of EntF in that it utilized the PCP loop 1, helix 2, and helix 3 regions [30]. While the LgrA PCP-A domain interface had fewer hydrophobic interactions and more hydrogen bonding interactions than EntF, LgrA contains a single salt bridge interaction at the start of helix 2 to aid in PCP binding. The trapped thiolation states of EntF and LgrA were compared to other module states to further examine the modular architecture and movements during the NRPS biosynthetic cycles.
Recent interface analysis of PCP-A domain complexes
Aside from conventional crystallographic studies on crosslinked carrier protein complexes, nuclear magnetic resonance (NMR) titration studies were performed to probe the residues involved in protein-protein recognition of the type II PCP-A domain interaction from pyoluteorin biosynthesis [44]. The A domain, PltF, activates and loads L-proline onto the PCP, PltL. An NMR titration was carried out utilizing 15N-labeled PltL loaded with an S-methyl PPant probe that allowed the PPant probe to access the A domain active site, but inhibited formation of the thioester and thus transfer of the aminoacyl moiety [44]. While this method identified PCP residues involved at the PCP-A domain interface, there was still uncertainty in how the A domain specifically interacts with the PCP. A follow-up study of the PltL-PltF interaction utilized the AVS probe 4 (Fig. 3C) to trap, crystallize, and solve the PltL-PltF complex structure through X-ray crystallography (Fig. 3A) [45].
The crystal structure of the PltL-PltF complex revealed a mode of binding similar to that of the previously solved PCP-A domain structures; the main difference was the minimal role of PltL helix 2 in creating specific protein-protein interactions at the interface. Between PltL and PltF, the structure shows a single hydrophobic interaction between PltL Met43 and PltF Met257 at the beginning of helix 2 immediately following the conserved serine residue (Fig. 3D). The remainder of the interface was located along the loop 1 region of PltL, which employed precise hydrophobic and hydrogen bonding interactions. PltF Ile454 is seen sitting inside a hydrophobic pocket formed by PltL loop 1 residues Ile19 and Trp37. Adjacent to this hydrophobic interaction is a hydrogen bond between PltF Lys457 and the backbone carbonyl of PltL Gly38. PltL helix 3 was also observed forming hydrophobic interactions and a single hydrogen bonding interaction between PltL Ser62 and PltF Ser232. 
Alanine scanning of the A domain interface residues and comparison to the previous NMR titration experiments confirmed the importance of each specific interface interaction.
The covalent crosslinking of a PCP-A domain complex has recently been explored with success towards solving a PCP-A domain X-ray crystal structure. The ACP and partner protein interactions with the ketosynthase, acyltransferase, and TE from the FAS and PKS systems have been crosslinked and structurally analyzed using a variety of PPant probes such as the chloroacrylamide and bromoacetamide probes [9,46]. These probes take advantage of the nucleophilic active site cysteine or serine that attacks the PPant probe with a halide as a leaving group, which covalently crosslinks the carrier protein and partner protein. Since A domains do not have a nucleophilic residue as part of their catalytic mechanism, a cysteine mutation must be introduced in the A domain active site to enable crosslinking with these probes [47].
The probes and respective mutations have been applied to the type II A domain and the PCP of hitachimycin biosynthesis [47]. The A domain, HitB, activates and loads (S)-β-phenylalanine onto the PCP, HitD. A conserved aspartate in the active site of HitB, which is involved in binding the amino group of the amino acid substrate, was mutated to a cysteine. This mutation enabled crosslinking of the HitD-HitB complex with probe 6 (Fig. 3C), which afforded crystallization and determination of the HitD-HitB X-ray crystal structure in the thiolation state (Fig. 3B) [47]. The complex crystal structure revealed an interface formed mainly by the HitD loop 1 and helix 2 regions, which is consistent with the previously solved PCP-A domain structures discussed above (Fig. 3E). Although the regions are consistent, the specific interactions differ. The HitD loop 1 uses Phe16 to fit into a HitB hydrophobic pocket formed by Asub domain residues Arg590, Pro503, and Ile506. 
Adjacent to the hydrophobic interaction are two hydrogen bond interactions formed between HitB Arg590 and the main chain carbonyls of HitD Arg30 and Asp31. On HitD helix 2, there are multiple salt bridge interactions with the HitB Acore. These salt bridges form between HitD Glu41 and HitB Arg275/His276 and between HitD Glu47 and HitB Arg249. Additionally, HitB Trp247 sits inside a HitD hydrophobic pocket formed between helix 2 and helix 3 consisting of Thr39, Leu43, Leu59, and Phe64.
While the general PCP and A domain regions in the HitD-HitB crystal structure are consistent with the previously solved PCP-A domain complex structures, the HitD-HitB interface utilizes a combination of hydrogen bonding, hydrophobic, and electrostatic interactions [47]. EntF, PA1221, and PltF-PltL interfaces depend on hydrogen bonding and hydrophobic interactions [26,28,45], whereas EntE-EntB is dependent on electrostatic interactions and hydrophobic interactions [27]. HitD-HitB is most similar to LgrA in that it involves one electrostatic interaction, two hydrophobic interactions, and multiple hydrogen bonding interactions [30].
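The comparison above can be restated as a small lookup table. The following Python sketch is only an illustrative summary: the interaction classes are transcribed from the closing comparison in this section (not from the underlying structures), and the names `INTERFACES`, `group_by_profile`, and `shared_classes` are hypothetical. It groups the PCP-A domain complexes by which broad classes of interaction they are described as relying on.

```python
from collections import defaultdict

# Broad interaction classes attributed to each PCP-A domain interface in the
# comparison above. This is a coarse summary, not a residue-level analysis.
INTERFACES = {
    "EntB-EntE": {"hydrophobic", "electrostatic"},
    "PA1221": {"hydrophobic", "hydrogen_bond"},
    "EntF": {"hydrophobic", "hydrogen_bond"},
    "PltL-PltF": {"hydrophobic", "hydrogen_bond"},
    "LgrA": {"hydrophobic", "hydrogen_bond", "electrostatic"},
    "HitD-HitB": {"hydrophobic", "hydrogen_bond", "electrostatic"},
}


def group_by_profile(interfaces):
    """Map each distinct set of interaction classes to the complexes showing it."""
    groups = defaultdict(list)
    for name, kinds in interfaces.items():
        groups[frozenset(kinds)].append(name)
    return {profile: sorted(names) for profile, names in groups.items()}


def shared_classes(interfaces):
    """Return the interaction classes common to every listed complex."""
    return set.intersection(*interfaces.values())


if __name__ == "__main__":
    for profile, names in group_by_profile(INTERFACES).items():
        print(", ".join(sorted(profile)), "->", ", ".join(names))
    print("common to all:", shared_classes(INTERFACES))
```

Grouping by profile makes the point of the paragraph explicit: the interface regions are shared, but the mix of interaction types is not, which is why residue-level information is needed before swapping domains.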
Outlook on PCP-A domain interfaces
Although this extensive collection of PCP-A domain structures has surfaced recently, there remain questions regarding the mechanism of PCP-A domain binding. The multitude of PCP-A domain structures reveals the conformation formed during thiolation; however, it is important to note that the initial recognition and binding events are just as important. NMR structures of PCPs have shown the positioning of the substrate-loaded PPant in a retracted state [48], which may be the conformation that the A domain must recognize. Conversely, during thiolation, the PCP-A domain structures have the PPant in the extended state. A full understanding will require a dynamic picture of the PCP-A domain binding mechanism, which is under ongoing investigation [49,50]. Furthermore, analysis of the PCP-A domain linker and its role in catalytic activity, in addition to its contributions to forming the PCP-A domain interface, will also aid efforts in A domain substitutions [31,51].
Condensation domain and evolutionarily related enzyme interface analysis with the peptidyl carrier protein
The C domain catalyzes peptide bond formation between adjacent PCP-linked substrates, and is responsible for the downstream transfer of the elongating peptide intermediate throughout the synthetase. Multiple C domain crystal structures revealed that the C domain is split into two halves, which are referred to as the N-terminal lobe and C-terminal lobe [[28], [29], [30],[52], [53], [54], [55], [56], [57], [58], [59], [60], [61]]. The two lobes are held together as a pseudo-dimer through conserved latch and floor loop motifs. The previously solved crystal structures have revealed two 15 Å tunnels leading from the donor and acceptor PCP binding sites to the active site, through which the PPant can enter and present its substrate to the catalytic residues [15]. The conserved motif postulated to be responsible for the catalysis of condensation, HHxxxDG, is found on the N-terminal lobe at the interface of both lobes. While the exact mechanism of catalysis is still being discerned, it is postulated that the second conserved histidine deprotonates the amine of the acceptor substrate, which allows nucleophilic attack by the amine at the thioester carbon of the donor substrate [62].
C domains have been debated as a secondary checkpoint, where the C domain must bind the correct substrates in the active site before catalyzing peptide bond formation in addition to creating specific protein-protein interactions with the appropriate donor and acceptor PCPs. While assessment of the C domain active site substrate selectivity is still underway [[63], [64], [65], [66], [67]], information on the protein-protein recognition of the C domain with both acceptor and donor PCPs is critical when engineering C domains with non-cognate NRPS systems.
The first crystal structure of the PCP-C domain complex
The earliest studies that structurally analyzed the protein-protein interactions of C domains involved the type I NRPS SrfA-C and the type I NRPS AB3403 [28,54]. Both SrfA-C and AB3403 are termination modules consisting of the domains C-A-PCP-TE from surfactin biosynthesis and an unknown biosynthetic pathway, respectively. The X-ray crystal structures of both NRPS modules have been solved with the PCP bound at the acceptor site of the C domain [28,54]. The SrfA-C acceptor PCP-C domain structure initially revealed a protein-protein interface formed by the C domain N-terminal lobe and C-terminal lobe with the PCP helix 2 and helix 3, respectively [54]. The specific PCP-C domain interactions consisted of nearly all hydrophobic interactions, with only one potential hydrogen bonding interaction. In this study, the conserved serine on the PCP was mutated to an alanine to prevent addition of a PPant arm, so specific interactions with the cofactor remain unresolved.
The AB3403 acceptor PCP-C domain structure revealed a similar protein-protein interface to SrfA-C, where the C domain N-terminal lobe and C-terminal lobe are observed interacting with the PCP helix 2 and 3, respectively [28]. Similarly, the AB3403 PCP-C domain interface is very dependent on hydrophobic residues. Unlike the SrfA-C structure, the AB3403 PPant arm was observed extended into the active site with the C domain Arg344 forming an electrostatic interaction with the PPant phosphate. Despite the similar acceptor PCP-C domain interfaces, superposition of the C domains revealed that the bound PCPs differ by a 30° rotation. While this may be due to the lack of a PPant arm in the SrfA-C structure, the protein-protein interface created is feasible because the location of the conserved serine is still at the entrance to the C domain tunnel.
Recent success in the interface analysis of type I PCP-C domain complexes
Recent work in obtaining structural snapshots of a di-modular NRPS has revealed multiple PCP-C domain bound structures, including the donor PCP-C domain complex as well as the first structure of a C domain with both the acceptor and donor PCPs bound simultaneously (Fig. 4A) [29]. This was performed on the type I NRPS from linear gramicidin synthesis, LgrA, which consists of F1-A1-PCP1-C2-A2-PCP2-E2, where the subscript represents the module [29]. Of the multiple structures and conformations solved on this system, two structures utilized probe 7 (Fig. 4D) to help crystallize and gain high-resolution insights into the protein-protein interactions of the donor PCP-C domain (PCP1–C2) complex. The crystal structures of the donor PCP-C domain complex revealed a protein-protein interface mainly dependent on hydrophobic interactions (Fig. 4B). These interactions are located at the PCP loop 1, helix 2, and helix 3 regions and the C domain C-terminal lobe. The PCP loop 1 contributes an electrostatic interaction between its His721 and C domain Asp1011. Adjacent is a network of hydrophobic interactions involving PCP loop 1 Leu723 and helix 3 Phe752 and Tyr748 with C domain Thr1013, Met1016, Leu1085, and Leu1088.
Fig. 4
X-ray crystal structures of PCP-C domain complexes. A) Crystal structure of the LgrA C domain bound with PCPs at the donor and acceptor sites (PDB 6MFZ). B) Interface view of the donor PCP-C domain X-ray crystal structure from LgrA (PDB 6MFW). C) Interface view of the X-ray crystal structure of the donor PCP bound at the C domain acceptor position of FscG (PDB 7KVW). D) Chemical probes utilized to obtain PCP-C domain crystal structures: 7 is a formyl-valine pantetheineamide substrate mimic and 8 is a glycyl-ether pantetheine substrate mimic. The PCPs from both structures are colored as previously described, where helix 1 is blue, loop 1 is green, helix 2 is yellow, helix 3 is orange, and helix 4 is red, and the PPant is pink. The C domain is colored according to the N-terminal lobe (white) and C-terminal lobe (gray). (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
Interestingly, unlike the previously solved acceptor PCP-C domain structures that show the composite C domain interface formed by both N-terminal and C-terminal lobes, the donor PCP in LgrA is observed only interacting with the C-terminal lobe, where the donor PCP helix 2 contacts the floor loop region of the C-terminal lobe [29]. The crystal structure shows a lack of specific interactions commonly encountered between the PCP helix 2 and the C domain floor loop. Instead, the PCP-C domain interface reveals a dependence on shape complementarity between the two helices. Using these high-resolution structures, the 6 Å resolution crystal structure of the LgrA C domain bound to both acceptor and donor PCPs (PCP1-C2-PCP2) was resolved (Fig. 4A) [29]. The crystal structure revealed the first instance of both acceptor and donor PCPs occupying their respective sites on the C domain.
The donor PCP maintains an identical binding interface to the other donor PCP-C domain complexes, while the placement of the acceptor PCP is supported through comparison to the AB3403 PCP-C domain interaction. Although the structure is of low resolution, the acceptor PCP-C domain interface seems to be formed by the PCP helix 2 and loop 1 regions and the C domain N-terminal lobe and C-terminal lobe, respectively. Direct coupling analysis and mutagenesis of the protein-protein interactions between the LgrA acceptor PCP-C domain revealed significant decreases in C domain activity, thus supporting the interface interactions inferred from the model.
Recently, the crystal structure of the PCP2–C3 didomain from the fuscachelin type I NRPS, FscG, was solved utilizing probe 8 (Fig. 4D) to aid in crystallization and visualization of active site residues [66]. The probe mimicked a glycyl-PPant, where the thioester linkage was replaced with a more stable thioether. The glycyl-PPant moiety bound in the active site provided insight into the lack of a substrate binding pocket to control C domain specificity, which is consistent with recent C domain substrate analyses [63–65,67]. Surprisingly, the crystal structure revealed that the PCP2 was bound at the acceptor site of the C domain instead of the expected donor site [66]. Comparison of the PCP2 and PCP3 revealed a sequence identity of 65% and a structural alignment with a root mean squared deviation of 2 Å, which supports the continued analysis of the PCP2 bound at the opposite side of the C domain. This donor PCP-C domain binding interaction revealed an interface that was mainly hydrophobic, located at the PCP helix 2 and helix 3 regions (Fig. 4C). Specific hydrophobic interactions included the PCP helix 2 Leu2518 and Leu2515 with C domain N-terminal lobe Leu2580 and Trp2579, and the PCP helix 3 Phe2538 with Val2908. The C domain also utilizes Arg2906 to create a salt bridge interaction with the PPant phosphate moiety.
Interestingly, the buried surface area at the interface is ∼550 Å², which is small compared to the previous PCP-A domain and PCP-C domain interface areas. The donor PCP2–C3 domain complex structure was also solved with only a PPant arm attached to the PCP [66]. While the protein-protein interface remained generally the same, the C domain Arg2577 was observed blocking access to the C domain tunnel at the interface, and the PPant was unable to extend into the tunnel. On the other hand, the glycyl-PPant loaded PCP was observed inside the tunnel and active site of the C domain. The PCP2–C3 crystal structure with the Arg2577Gly mutation revealed an unloaded PPant extended into the C domain tunnel. This structure along with enzyme assays of the mutant supported the hypothesis that Arg2577 acts as a gating residue that may only be moved if the appropriate substrate is loaded onto the PCP.
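Buried surface areas like the ∼550 Å² value quoted above are commonly derived from solvent-accessible surface areas (SASA) of the two partners computed separately and in complex. The sketch below assumes the convention in which the total buried area is the sum of the isolated SASAs minus the SASA of the complex; some reports instead quote half of this value as the interface area per partner, and the source does not state which convention applies. All numerical inputs are hypothetical placeholders, not values measured from PDB 7KVW.

```python
def buried_surface_area(sasa_a, sasa_b, sasa_complex):
    """Total buried surface area (Å²): SASA(A) + SASA(B) - SASA(AB)."""
    if min(sasa_a, sasa_b, sasa_complex) < 0:
        raise ValueError("surface areas must be non-negative")
    return sasa_a + sasa_b - sasa_complex

def interface_area_per_partner(sasa_a, sasa_b, sasa_complex):
    """Alternative convention: half of the total buried area."""
    return buried_surface_area(sasa_a, sasa_b, sasa_complex) / 2.0

# Hypothetical SASA values in Å², chosen only to illustrate the arithmetic.
sasa_pcp = 4800.0
sasa_c_domain = 19000.0
sasa_pcp_c_complex = 23250.0

total_buried = buried_surface_area(sasa_pcp, sasa_c_domain, sasa_pcp_c_complex)
per_partner = interface_area_per_partner(sasa_pcp, sasa_c_domain, sasa_pcp_c_complex)
```

Because the two conventions differ by a factor of two, interface areas from different studies are only comparable once the convention is known.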
Evolutionarily divergent epimerization and termination domains
Tailoring and termination domains, such as the E domain and the CT domains, have evolved from C domains [68]. The E domain is chiefly responsible for conversion of thiotemplated L-amino acids to D-amino acids, which contributes to the structural diversity of NRPs [69]. The CT domain instead terminates NRP production through cyclization and release of a cyclic peptide product. Structurally, both the E and CT domains conserve the canonical V-shaped fold seen in C domain structures [56,70]. The differences arise in subtle active site changes that confer different activities.
Although these enzymes catalyze different reactions, the mode of binding the donor PCP remains similar to previously solved donor PCP-C domain interfaces. In the gramicidin type I NRPS, module 1 consists of A-PCP-E, where the E domain epimerizes the L-phenylalanyl-PCP to D-phenylalanyl-PCP [70]. The crystal structure of the PCP-E di-domain was solved and revealed the PCP bound to the donor binding site of the E domain using hydrophobic, hydrogen bonding, and electrostatic interactions [70]. The interactions mainly involved the E domain N-terminal lobe with the PCP loop 1, helix 2, and helix 3. The E domain C-terminal lobe also contacted the PCP helix 2 as part of the protein-protein interface. Additionally, a crystallographically ordered 20-residue linker region between the E domain and PCP was identified as crucial for the formation of a protein-protein interface, as mutation of specific electrostatic residues decreased product formation [70].
In fumiquinazoline F biosynthesis, the CT domain, TqaA, is responsible for the cyclization of a ten-membered ring from a tripeptide [56]. The X-ray crystal structure of TqaA as a holo-PCP-CT domain complex was solved and revealed an overall structure and complex with the PCP bound at the donor site of the CT domain similar to the PCP-E domain structure from GrsA [56].
This interface was formed almost exclusively by the CT domain C-terminal lobe with the PCP helix 2 and helix 3, mainly through hydrophobic interactions in addition to a hydrogen bonding interaction with the PPant phosphate.
Outlook on the PCP-C domain interface
Obtaining structural information on the C domain interactions with its donor and acceptor PCPs to inform NRPS engineering has remained challenging due to multiple factors. PCP interactions with partner proteins are transient in nature and thus are difficult to crystallize in order to study the specific interfaces that enable peptide bond formation. Furthermore, two substrate-loaded PCPs are required to be bound at the donor and acceptor sites in order to evaluate the active site interactions that affect substrate selectivity. Additionally, C domain dynamics at the interface may play a role in PCP binding and substrate access [61]. Promising chemical biology tools are currently being developed to help stabilize the transient PCP-C domain complexes for structural analysis of the protein-protein interface and active site substrate binding [46,57,71,72]. Despite these challenges, X-ray crystal structures of C domains have guided successful re-engineering of type I NRPSs through identification of new areas susceptible to combinatorial biosynthesis [73,74]. Emerging techniques in structural biology such as cryo-electron microscopy (EM) could potentially aid in capturing the C domain with a combination of donor and acceptor PCPs, which can further shed light on the effect of PCP-bound substrates in forming a protein-protein interface. Starter condensation domains, which condense a donor acyl chain with an acceptor amino acid in lipopeptide NRPSs, have also seen recent success in active site analysis and engineering [75,76].
Peptidyl carrier protein and tailoring domain interface analysis
The tailoring domains encompass the groups of proteins that are not considered to be core NRPS domains (PCP, A domain, and C domain), yet have the capacity to chemically modify the growing peptide bound to the carrier protein [6]. The chemical modifications catalyzed by tailoring domains add diversity and functionality to the structure of NRPs, which may add protection against degradation by proteases, enhance binding affinity to specific targets, and increase NRP half-life upon release from the PCP [77]. Given the vast number of tailoring domains characterized to date, we limit our discussion to those that have been shown to form interfaces with carrier proteins in order to further illustrate the importance of protein-protein interactions to gain access to substrate functionalization. The chemical modifications on the growing peptide include, but are not limited to, N-formylation, β-hydroxylation, dehydrogenation, and the aforementioned epimerization [10]. Understanding how these reactions are coordinated, together with the governing protein-protein interactions, provides a higher degree of spatiotemporal control in engineered NRPSs. In this section, we explore the protein-protein interfaces observed in the structures of different tailoring domains bound to the PCP, with an emphasis on the types of interactions that promote transient complex formation and guide reactivity in the biosynthesis of NRPs.
Formylation domain
After biosynthesis, the dimerization of gramicidin is stabilized by the N-formyl valine moiety, which enables the formation of pores that disrupt ion gradients in the membranes of gram-positive bacteria, thus highlighting the importance of formyl modifications and F domains [78]. Recently published crystal structures of LgrA from the linear gramicidin NRPS show different conformational states that illustrate the different stages of adenylation, thiolation and formylation involved in a type I NRPS module [30]. In the formylation state, the Asub domain in LgrA positions the valyl-PCP at the active site of the F domain, where a single salt bridge between Arg758 in helix 2 of the PCP and Asp652 in a nearby loop in the Asub domain was observed. The F domain Met178 and Leu127 and the PCP Tyr748 form a hydrophobic patch that provides further stability to the complex, although interestingly the interface area is approximately 500 Å², which is small relative to other PCP-partner protein structures [30].
Oxidation domain
Another important group of tailoring domains is the P450 oxygenases, sometimes referred to as “Nature's blowtorch”, which are oxygen-dependent metalloproteins widely known for their capacity to install hydroxyl groups on certain substrates [16]. One example of a heavily hydroxylated NRP is skyllamycin, which is a cyclic depsipeptide with multiple β-hydroxylated amino acids, as well as hydroxylated aromatic rings [79]. In the skyllamycin biosynthetic pathway, P450s are selective towards the cognate PCPs from different modules within the synthetase [79]. P450s must recognize the competent binding interfaces that emerge from the PCPs loaded with different peptides. The structure of the cytochrome P450 tailoring domain, P450sky, bound to the PCP, PCP7, reveals the protein-protein interface of a monooxygenase domain that binds to the PCP [79]. Electrostatic and hydrogen bonding interactions at the interface were observed in the PCP helices 2 and 3, where residues Arg63, Thr46, and Lys47 form interactions with the P450 Asp191, Asn197 and Glu235, respectively. Trp193 and Leu194 of the P450 form hydrophobic patches with residues of helix 2 and 3 in the PCP that also assist in accommodating the geminal dimethyl group of the PPant attached to the conserved Ser42. It is important to note that although the P450 is a standalone domain that binds to the PCP, it does so selectively and does not necessarily interact with all amino acids in the module, given that not all residues in the final product show hydroxylation at the β-carbon. Comparing the crystal structure to a computational model of other PCPs reveals slight conformational differences between the relative orientations of the helices in the PCP, which could potentially account for the P450 selectivity for certain PCPs [79].
Dehydrogenation domain
Another important tailoring domain is the DH domain involved in the biosynthesis of pyrrole containing NRPs. DH domains, which may also be re-classified as an oxidase [80], are flavin-dependent proteins that use oxygen as the final electron acceptor in the process of dehydrogenating proline for the production of thiotemplated pyrroles [10,16]. In pentabromopseudilin biosynthesis, a type II NRPS DH domain oxidizes a PCP-bound proline to a pyrrole group [80]. To understand the mechanism of dehydrogenation and how the DH domain binds the PCP, the tetrameric X-ray crystal structure of a flavin dependent DH domain, Bmp3, bound to FAD in complex with the PCP, Bmp1, was solved with either holo-Bmp1 or pyrrolyl-Bmp1 (Fig. 5A) [80].
Fig. 5
X-ray crystal structure of the PCP-DH domain complex. A) Overview of the tetrameric Bmp1-Bmp3 crystal structure (PDB 6CXT). The monomers of the DH domain, Bmp3, are colored alternately in white and gray. B) Close up of the Bmp1-Bmp3 interface. The PCP, Bmp1, is colored as previously described, where helix 1 is blue, loop 1 is green, helix 2 is yellow, helix 3 is orange, and helix 4 is red, and the PPant/ligands are pink. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
When comparing the DH domain active site between both structures, there were no differences in terms of the active site and cofactor spatial organization [80]. Hydrophobic residues in Bmp3 aid in aligning the PPant moiety with its active site, placing the proline in close proximity to FAD. The Bmp1 helix I interactions have proven to be important for the DH domain activity, as demonstrated by mutations to Arg277 in Bmp3 that disrupt hydrogen bonds with the Leu13 main chain carbonyl and significantly decrease production of the pyrrole (Fig. 5B) [80]. Glu15 in the Bmp1 helix I also forms hydrogen bonds with the Thr274 main chain nitrogen, demonstrating how disruptions to one of the recognition helices in the PCP can impact activity. The most disruptive mutations are those shown to interfere with helix II and III of the PCP, where a hydrophobic patch consisting of Leu28, Met38, Ile58, Pro60, and Phe63 binds the side chains Tyr178 and Leu179 in Bmp3. Double mutations of these residues in Bmp3 result in complete elimination of product formation. The overall contribution of these interactions led to the conclusion that hydrophobic interactions govern the formation of interfaces with electrostatic and salt bridge interactions playing a minor role.
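Interface residues such as those discussed for Bmp1 and Bmp3 are typically identified by applying a distance criterion between atoms of the two chains. The following is a minimal sketch of that idea, assuming a 4.0 Å cutoff and one representative atom per residue. The coordinates are hypothetical toy values and are not taken from PDB 6CXT; residue labels are borrowed from the text purely as placeholders.

```python
import math
from itertools import product

def interface_contacts(atoms_a, atoms_b, cutoff=4.0):
    """Return sorted residue pairs whose atoms lie within `cutoff` Å.

    Each atom is a (residue_label, (x, y, z)) tuple.
    """
    pairs = set()
    for (res_a, xyz_a), (res_b, xyz_b) in product(atoms_a, atoms_b):
        if math.dist(xyz_a, xyz_b) <= cutoff:
            pairs.add((res_a, res_b))
    return sorted(pairs)

# Toy coordinates (Å); hypothetical, for illustration only.
pcp_atoms = [("Leu28", (0.0, 0.0, 0.0)),
             ("Met38", (10.0, 0.0, 0.0))]
dh_atoms = [("Tyr178", (3.5, 0.0, 0.0)),
            ("Leu179", (10.0, 3.0, 0.0)),
            ("Arg277", (30.0, 0.0, 0.0))]

contacts = interface_contacts(pcp_atoms, dh_atoms)
```

A real analysis would iterate over all heavy atoms parsed from the coordinate file and would usually apply different cutoffs for hydrophobic contacts, hydrogen bonds, and salt bridges.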
Peptidyl carrier protein and termination domain interface analysis
Termination domains are commonly found at the end of linear NRPS modules and are responsible for the release of the mature peptide from the PCP [16]. The mechanism of action varies between termination domains from different NRPSs by taking advantage of a diverse array of nucleophiles that may catalyze the intra- or intermolecular release from the PCP, yielding linear or cyclized products with different functional groups. For instance, reductase (R) domains catalyze a reduction at the thioester linkage that can lead to the release of alcohols or aldehyde groups at the C-terminus of the peptide product. TEs cleave the thioester bond in PPant and can use different substrates as nucleophiles [81]. Additionally, the CT domain can also catalyze the release and macrocyclization of NRPs as discussed previously [56].
Thioesterase domains
The structures of PCPs in complex with TE domains give insight into the protein-protein interactions that lead to the timed release of substrate from the PCP. In general, NRPS TEs belong to the α/β hydrolase family of enzymes, with an average size of 240–290 residues. Apart from possessing a Ser-His-Asp catalytic triad, they also have a 40-residue lid region that lines the substrate and alternates between open and closed states [81]. TEs are further classified into type I TEs, which hydrolyze a mature peptide from the PCP using diverse catalytic strategies, and type II TEs, which recognize and hydrolyze PCPs with incorrectly loaded cargo that can stall the biosynthetic machinery [82]. While not directly involved with the core NRPS machinery, type II (repair) TEs can act in trans on PCPs with similar structure and catalytic domains as their type I counterparts [83].
The TE domain in the termination module EntF is an example of a type I TE responsible for the peptide-releasing step from the PCP to produce enterobactin in Escherichia coli. The cyclization reaction that ultimately forms a tri-lactone is a product of coordinated reactions between the PCP and TE, which are situated as the final two domains of the synthetase. To identify the protein-protein interactions and TE conformational changes upon binding, the EntF TE domain was structurally analyzed in complex with the PCP [84]. The X-ray crystal structure of the EntF PCP-TE complex shows extensive interactions between the TE lid region and active site residues with the PCP helices II and III, encompassing over 1000 Å² of total buried surface area excluding the PPant arm. In terms of the PPant arm, the majority of contacts involve a loop region in the TE, while mutagenesis studies revealed the importance of specific residues at the interface essential for enterobactin production and release.
For instance, mutation of the TE Trp1079 resulted in a disruption of the hydrophobic interactions with the PCP at the PPant cavity and inhibition of product formation.
Like type I TEs, the type II TE SrfTEII from surfactin biosynthesis has an α/β hydrolase fold [83]. However, the promiscuity observed in SrfTEII is due to the partial covering of its catalytic triad as well as other structural modifications that allow increased accessibility. Comparison with another type II TE structure, ColQ from the colibactin synthase, further highlights the preference for smaller substrates, as evidenced by a smaller active site cavity when compared to type I TEs [84,85]. Substrate specificity studies demonstrate that type II TEs favor hydrolysis of acetate, indicative of a role in proofreading PCPs that have been post-translationally modified with acetyl-CoA or malonyl-CoA by PPTases [17].
Reductase domain
The biosynthesis of aureusimine from Methanobrevibacter ruminantium involves the NRPS Mru_0351, which utilizes an archaeal R domain to release the peptide product [86]. The recent structure of the Mru_0351 PCP-R domain shows the first archaeal R domain bound to the PCP (Fig. 6A) [86]. The R domain was compared to a carboxylic acid reductase module, CAR-PCP-R, which is the only other R domain structure complex reported to date [87]. The principal interactions between the R domain and the PCP include a novel helix-turn-helix (HTH) motif and a gating loop in the R domain that interact with the PCP helix residues [86]. A series of hydrogen bonds and hydrophobic interactions decorate the PCP and R domain interface. The hydrophobic interactions mainly consist of the PCP Phe3736, Leu3743, Ile3749, Ile3750, Leu3753, Tyr3761 and Phe3765 and the R domain Tyr4118, Met4122, Ile4126, Ile4130 and Tyr3920 from the HTH motif and gating loop (Fig. 6B). Hydrogen bonding networks occur between the novel HTH motif and PCP helices II and III that extend to water molecules found at the interface.
Fig. 6
X-ray crystal structure of the PCP-R domain complex. A) Overview of the Mru_0351 PCP-R domain complex (PDB 6VTJ). B) Close up of the PCP-R domain interface. The R domain is colored according to the N-terminal (white) and the C-terminal (gray) regions. The PCP is colored as previously described, where helix 1 is blue, loop 1 is green, helix 2 is yellow, helix 3 is orange, and helix 4 is red, and the PPant is pink. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
Although it is not fully modeled in the crystal structure, the PPant participates in various interactions with the R domain as it extends into its active site. The geminal dimethyl group of PPant is stabilized by a hydrophobic pocket composed of Tyr3920, Leu3743 and Tyr3761. Hydrogen bonding interactions also exist between PPant and main chain atoms in His3919, Thr4032, and Ala4304. The structure of the PCP-R domain also sheds light on the role of the gating loop residues in stabilizing the PCP and positioning the PPant group close to the NAD(P)H binding site.
Outlook
The increase in high-resolution information on the transient PCP-partner protein complexes over the past decade is extremely insightful towards establishing guidelines to inform future efforts in engineering NRPS pathways. Compared to the ACP-partner protein interactions in the type II FAS, the interface regions on the carrier protein are similar; however, studies of ACP-partner protein interactions are revealing protein-protein interfaces that are much more dependent on small, electrostatic interfaces [9,88]. The complex structures reviewed here provide static details on the mode of binding; however, to further understand the PCP-partner protein binding event, more dynamic information will be required through techniques such as NMR titrations, solution NMR structures, molecular dynamics (MD) simulations, and cryo-EM structures.
Nevertheless, the protein-protein interactions found at the PCP-partner protein interface can already be leveraged to understand and design new interactions with non-cognate partner proteins. During interface design, the wild-type PCP-partner protein structures can also be integrated with computational techniques, such as protein-protein docking and MD simulations, to create a model of a new non-cognate PCP-partner protein complex [89]. This model can then be used in rational design, semi-rational design, or directed evolution to improve the binding interactions between the non-cognate proteins. It may be worthwhile to focus design on the interface of the partner protein, as mutations to the PCP will likely affect interactions with other partner proteins necessary in a pathway. Interface design can be performed in conjunction with current NRPS design methodologies, such as A domain substitution, A-PCP-C domain substitutions, or insertion of tailoring domains in type I and type II NRPS systems (Fig. 7).
While this review only covers PCP-partner protein interfaces, interface design can also be applied to the variety of interdomain interactions created throughout the NRPS biosynthetic cycle, such as between the A-C domain interfaces. Overall, designing protein-protein interfaces as part of combinatorial biosynthesis strategies is a promising way to enhance the success in NRPS engineering. A productive interface, and thus product formation, or improved pathway productivity, may only lie a few mutations away!
Fig. 7
Design methodologies of the NRPS integrated with protein-protein interface design between non-cognate PCP and partner proteins. A) A general type I NRPS (blue) is shown with non-cognate (green) substitutions or insertions. The following panels highlight the new non-cognate protein-protein interfaces that have been introduced in B) A domain substitution, C) A-PCP-C domain substitution, and D) tailoring domain insertion. Movement of the PCP in C) is shown with arrows, where the initial PCP position is shown as more transparent. The new non-cognate interface can be optimized via mutations, which are depicted as complementary shapes similar to puzzle pieces. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
Author contribution
J.C.C. and J.O.S. contributed to writing the manuscript. J.C.C., J.O.S., and M.D.B. contributed to review and editing.
Declaration of competing interest
The authors indicate they have no conflict of interest.