
Pass-back chain extension expands multimodular assembly line biosynthesis.

Jia Jia Zhang, Xiaoyu Tang, Tao Huan, Avena C Ross, Bradley S Moore.

Abstract

Modular nonribosomal peptide synthetase (NRPS) and polyketide synthase (PKS) enzymatic assembly lines are large and dynamic protein machines that generally effect a linear sequence of catalytic cycles. Here, we report the heterologous reconstitution and comprehensive characterization of two hybrid NRPS-PKS assembly lines that defy many standard rules of assembly line biosynthesis to generate a large combinatorial library of cyclic lipodepsipeptide protease inhibitors called thalassospiramides. We generate a series of precise domain-inactivating mutations in thalassospiramide assembly lines, and present evidence for an unprecedented biosynthetic model that invokes intermodule substrate activation and tailoring, module skipping and pass-back chain extension, whereby the ability to pass the growing chain back to a preceding module is flexible and substrate driven. Expanding bidirectional intermodule domain interactions could represent a viable mechanism for generating chemical diversity without increasing the size of biosynthetic assembly lines and challenges our understanding of the potential elasticity of multimodular megaenzymes.

Year: 2019; PMID: 31636431; PMCID: PMC6917876; DOI: 10.1038/s41589-019-0385-4

Source DB: PubMed; Journal: Nat Chem Biol; ISSN: 1552-4450; Impact factor: 15.040


Modular nonribosomal peptide synthetase (NRPS) and polyketide synthase (PKS) enzymes are molecular-scale assembly lines that construct complex polymeric products, many of which are useful to humans as life-saving drugs. The first characterized assembly lines exhibited an elegant co-linear biosynthetic logic, whereby the linear arrangement of functional units, called modules, along an NRPS or PKS polypeptide directly correlates with the chemical structure of the product[1]. The PKS giving rise to the antibiotic erythromycin[2] and the NRPS producing the antibiotic daptomycin[3] are two such examples. The core components of an NRPS or PKS assembly line elongation module include, respectively, condensation (C) or ketosynthase (KS) domains catalyzing chain extension, adenylation (A) or acyltransferase (AT) domains for substrate selection, and thiolation (T) domains for covalent substrate tethering. Optional tailoring domains such as methyltransferase (MT), ketoreductase (KR), or dehydratase (DH) domains, if present, chemically modify building blocks or chain-extension intermediates. This one-to-one correlation between product moieties and assembly line modules with the requisite catalytic domains is one feature that makes NRPS/PKS enzymes among the largest proteins found in nature. However, it is now clear that many assembly lines do not strictly abide by the rules of co-linearity. Trans-AT PKSs are a phylogenetically distinct class of modular PKSs that do not encode AT domains within their modules but instead rely on stand-alone AT enzymes that act in trans[4]. A separate type of NRPS, referred to as a nonlinear NRPS, deviates from the standard core domain arrangement of C-A-T and reuses a single domain more than once[5]. Several nonlinear NRPSs possess modules missing A domains and are presumably loaded by A domains from upstream modules[6-10].
Leveraging domain activities in trans or from different modules reduces the size of biosynthetic assembly lines and thus may represent a mechanism for minimizing modular assembly lines without sacrificing product complexity. A particularly intriguing set of hybrid NRPS-PKS assembly lines that are both nonlinear and trans-AT are those responsible for the biosynthesis of the thalassospiramides, a large group of immunosuppressive cyclic lipodepsipeptides[11-13]. Thalassospiramide NRPS-PKS genes have been identified in several marine Rhodospirillaceae bacteria and exhibit distinct architectures that range in domain and module “completeness”[13]. While it is still unclear whether all configurations are functional, thalassospiramide assembly lines representing the most “complete” and “incomplete” architectures are both capable of producing a large combination of lipodepsipeptides that vary in fatty acid and amino acid composition, order, and length[12]. Previous work identifying these NRPS-PKS genes and structurally characterizing their numerous and diverse chemical products[11-13] led us to hypothesize that these assembly lines must operate with an unprecedented degree of nonlinearity. Furthermore, we posited that in order to generate their chemical products, thalassospiramide assembly lines must catalyze one or two rounds of pass-back chain extension, where the chain-extension intermediate is passed from a downstream module back to an upstream module within the same polypeptide. Here, we report a comprehensive characterization of the thalassospiramide biosynthetic machinery from α-proteobacteria Thalassospira sp. CNJ-328 and Tistrella mobilis KA081020-065, which represent the most “complete” and “incomplete” assembly line architectures, respectively[13]. 
We present an experimentally supported and mechanistically novel biosynthetic model that invokes inter-module substrate activation and tailoring, module skipping, and pass-back chain extension, whereby the ability to pass the growing chain forward or backward is flexible and influenced by the identity and chain length of the chemical intermediate. These newly described features accentuate the potential bidirectionality and flexibility of multi-modular megaenzymes and reveal new engineering opportunities and structural considerations.

Results

Cloning and heterologous expression of ttc and ttm.

Close inspection of the thalassospiramide ttc and ttm gene clusters revealed that the NRPS/PKS genes from Thalassospira and Tistrella, respectively, occupy completely different genomic contexts (Fig. 1a,b, Supplementary Table 1). Ttc also includes a 4’-phosphopantetheinyl transferase (PPTase), TtcD, for which there is no homolog in ttm. Notably absent from the genomic vicinity of both pathways are any genes encoding stand-alone AT or A domains (Supplementary Table 1), although both assembly lines possess a trans-AT PKS module and Ttm contains two NRPS modules without A domains. We chose to clone a broader range for ttc and a more limited range for ttm (Fig. 1a,b).
Fig. 1 | Heterologous reconstitution of thalassospiramide biosynthetic gene clusters in a P. putida host.

a,b, Annotated genomic loci encompassing the thalassospiramide assembly line genes from Thalassospira sp. CNJ-328 (a) and Tistrella mobilis KA081020-065 (b) targeted for cloning and heterologous expression. (c) LC-MS analysis of extracts from an empty P. putida EM383 host and hosts with genomically integrated ttc and ttm pathways compared against an authentic thalassospiramide A (1) standard. Extracted ion chromatograms (EICs) are shown for m/z 958.5496. See Supplementary Fig. 4 for associated MS/MS spectra.
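
The EIC value quoted in the caption can be reproduced arithmetically. The sketch below is illustrative only: it assumes the molecular formula C48H75N7O13 for thalassospiramide A (1) and a protonated [M+H]+ ion, neither of which is stated in this text, and the 5 ppm extraction window is a hypothetical tolerance rather than the setting used by the authors.

```python
# Monoisotopic masses (u) of the most abundant isotopes, plus the proton mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646

# ASSUMPTION: molecular formula of thalassospiramide A; not given in the text.
ASSUMED_FORMULA = {"C": 48, "H": 75, "N": 7, "O": 13}


def mz_protonated(formula):
    """Return the m/z of the singly protonated ion [M+H]+ for an element-count dict."""
    neutral = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    return neutral + PROTON


def eic_window(mz, ppm):
    """Return the (low, high) m/z bounds of an extraction window of +/- ppm."""
    delta = mz * ppm * 1e-6
    return mz - delta, mz + delta


mz = mz_protonated(ASSUMED_FORMULA)
low, high = eic_window(mz, ppm=5.0)  # 5 ppm is a hypothetical tolerance
print(f"[M+H]+ = {mz:.4f}; EIC window = {low:.4f} to {high:.4f}")
```

Under the assumed formula, the computed [M+H]+ value agrees with the m/z 958.5496 used for the EICs to four decimal places.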

To directly clone these gene clusters, a new bacterial artificial chromosome (BAC)-based transformation-associated recombination (TAR) cloning vector, pCAP-BAC (pCB), was designed and constructed to enable stable maintenance of large constructs in Escherichia coli (Supplementary Fig. 1). pCB lacks host-specific integration elements, which can be introduced after cloning, to make it easier to retrofit cloned pathways with integration elements for different hosts. Following successful cloning of ttc, heterologous expression was attempted but never achieved in E. coli, despite efforts to perform promoter refactoring, stabilize protein expression, and co-express the pathway with various promiscuous PPTases (Supplementary Fig. 2). Thus, we constructed a Pseudomonas integration cassette containing the Int-B13 site-specific recombinase[14] and introduced it into the vector backbone to generate pCB-ttc-int (Supplementary Fig. 1). This construct was successfully integrated into the genome of Pseudomonas putida EM383[15] (Supplementary Fig. 3). The same procedure was used for ttm, and expression of both gene clusters in P. putida was successful as evidenced by detection of the representative product thalassospiramide A (1) (Fig. 1c, Supplementary Fig. 4). This validated that the Ttc and Ttm assembly lines do not require additional pathway-specific enzymatic components, beyond what was transferred to the host and supplied through primary metabolism. To our knowledge, this is the first report of successful heterologous expression of a trans-AT pathway without co-transfer of a cognate AT. Our current hypothesis as to why expression was successful in P. putida but not in E. coli is that the P. putida primary metabolic AT is capable of interfacing with the trans-AT PKS modules, while the AT from E. coli is not.

Reconstitution of thalassospiramide structural diversity.

We next explored whether the heterologously expressed Ttc and Ttm clusters could reproduce the full suite of thalassospiramide chemical diversity. Thalassospiramide cyclic lipodepsipeptides can be grouped into four categories based on their chemical structures and biosynthetic origin[11-13] (Fig. 2). Products of Ttc can incorporate serine, phenylalanine, or tyrosine as the first amino acid residue (colored blue in Fig. 2), which can be extended by a single ketide unit to generate “B-like” as opposed to “A1-like” thalassospiramides. Alternatively, Ttm incorporates serine or valine in the first position, then includes or omits a valine residue (colored red in Fig. 2) to generate “A4-like” or “E-like” thalassospiramides, respectively. Both assembly lines produce “A-like” thalassospiramides at greater relative abundance than their “B-like” or “E-like” counterparts.
Fig. 2 | Ttc and Ttm assembly lines and structures of associated cyclic lipodepsipeptide products.

(a) Ttc assembly line and structures of a representative set of associated chemical products. (b) Ttm assembly line and representative associated chemical products. Analogs not previously reported are underlined; see Supplementary Table 2 for HR-MS data. Analogs detected by LC-MS from the native producer but not the heterologous host are marked with an asterisk. C, condensation; A, adenylation; T, thiolation; KS, ketosynthase; AT, acyltransferase; DH, dehydratase; KR, ketoreductase; MT, methyltransferase; TE, thioesterase.

Additional elements that contribute to structural diversity include the N-terminal fatty acid, which is predominantly an atypical C10:1(Δ3) fatty acid, and the pattern of N-methylation, which is predominantly limited to the final tyrosine residue of the cyclic peptide core but can also extend to the adjacent valine residue for products of Ttc and further to valine within the linear peptide for Ttm (Fig. 2). Finally, the number of linear Ser-C2-Val units (4-amino-3,5-dihydroxy-N-pentanyl-valine, colored green in Fig. 2) can be 0, 1, or 2 for both assembly lines, which presumably arises from passage of chain-extension intermediates from module 4 back to module 2 or from module 5 back to module 3. Based on the different possible combinations of these variables, each assembly line can theoretically generate well over 100 compound analogs; several dozen compounds are routinely detected by mass spectrometry from small (50 mL) cultures of producing organisms. LC-MS/MS analysis revealed that nearly all analogs detected from the native producers are also produced by the heterologous host, with some subtle differences in relative production levels (Supplementary Fig. 5). Overall titers of most analogs are comparable between the native strains and the host, and in some cases are greater in the host (Supplementary Fig. 5).
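
The combinatorial estimate above can be sketched by enumerating the variables named in the text for Ttc. This is only an illustration: the first-residue, ketide-extension, and Ser-C2-Val options come from the text, but the two methylation patterns are a simplification, and the four fatty acid labels are hypothetical placeholders because the number of accepted fatty acids is not given here.

```python
from itertools import product

# Options taken from the description of Ttc products.
first_residue = ["Ser", "Phe", "Tyr"]
ketide_extension = [False, True]      # False = "A1-like", True = "B-like"
linear_ser_c2_val_units = [0, 1, 2]

# SIMPLIFICATION: only the two N-methylation patterns described for Ttc.
methylation = ["Tyr only", "Tyr + adjacent Val"]

# ASSUMPTION: placeholder labels; the true number of fatty acids is not stated.
fatty_acids = ["FA-1", "FA-2", "FA-3", "FA-4"]

# Each tuple is one theoretical analog: one choice per variable.
analogs = list(product(first_residue, ketide_extension,
                       linear_ser_c2_val_units, methylation, fatty_acids))

print(len(analogs))  # 3 * 2 * 3 * 2 * 4 combinations
```

With these placeholder inputs the enumeration yields 144 combinations; the real count depends on how many fatty acids and methylation patterns each assembly line actually supports.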

Characterization of non-assembly line genes.

To determine whether co-transferred genes beyond the core NRPS/PKS genes affect thalassospiramide titer or product distribution, we performed targeted deletions of all non-assembly line genes within ttc. Previous work using the broad-host-range vector, pCAP05 (ref. [16]), demonstrated that no upstream genes (−7 through −1) are essential, although heterologous expression using this vector produced low yields (Supplementary Fig. 6a) and proved to be genetically unstable over time. Targeted deletion of ttc −1, +1, +2, +3, and +4 in the stable pCB-ttc-int expression construct had no impact on thalassospiramide production; however, deletion of the gene encoding the putative PPTase TtcD resulted in an approximately five-fold reduction in thalassospiramide A (1), which was restored upon genetic complementation of ttcD (Supplementary Figs. 6 and 7). This observation suggests that the single native P. putida PPTase is capable of activating Ttc carrier proteins, albeit not as effectively as TtcD. The P. putida PPTase is clearly capable of activating Ttm carrier proteins, as no cognate PPTase is encoded within ttm, and thalassospiramides are still produced in the heterologous host. However, co-expression of ttcD with ttm resulted in an approximately two-fold increase in levels of thalassospiramide A (1) (Supplementary Figs. 6 and 7). Quantitative analysis suggests that the PPTase TtcD favorably biases production of analogs that incorporate one or more linear Ser-C2-Val units (n≥1), particularly for Ttm, as ttcD co-expression actually decreases production of C2 (6) and E4 (25) (n=0) (Supplementary Fig. 7). This result suggests that TtcD-catalyzed phosphopantetheinylation of TtmA carrier proteins predisposes TtmA to perform pass-back chain extension at the expense of linear assembly through an unknown mechanism.

Inactivation and testing of assembly line domains.

Taken together, the results of the heterologous expression and gene deletion experiments strongly suggest that thalassospiramide structural diversity is generated directly from the multi-modular assembly line itself and does not involve accessory enzymes beyond what is provided from primary metabolism. Although certain modules appear to be missing domains based on retro-biosynthetic analysis of thalassospiramide chemical structures, all necessary core and tailoring domains are present somewhere along the assembly line, including a DH domain in module 4 of both systems that was not previously annotated (Fig. 2, Supplementary Fig. 8). Thus, we set out to investigate the predicted inter-module activity of assembly line domains. Our initial approach leveraged gene deletion and complementation tools, focusing first on the smaller ttcC that encodes the last four domains of terminal module 6. Although TtcC harbors the lone assembly line MT domain, some thalassospiramides, such as A8 (5), unusually contain a “misplaced” penultimate N-methylated valine residue in addition to the conserved terminal N-methylated tyrosine. Although MTs are usually positioned at the C-terminus of interrupted A domains[17], MT6 is positioned at the N-terminus of A6 (Supplementary Fig. 8d). To explore whether MT6 can methylate amino acids activated by both the A5 and A6 adenylation domains, we deleted ttcC, resulting in complete loss of thalassospiramide production. Complementation with wild-type ttcC restored thalassospiramide production, while complementation with a mutant encoding TtcC-G234D, in which MT6 has been selectively inactivated, resulted in dramatic reduction and complete loss of thalassospiramides A (1) and A8 (5), respectively, and concomitant formation of a new product with HR-MS and MS/MS spectra consistent with desmethyl thalassospiramide A, or thalassospiramide A15 (26) (Supplementary Table 2, Supplementary Fig. 9).
The same result was observed for all analogs, resulting in the formation of a series of desmethyl cyclic lipodepsipeptides. This confirms our hypothesis that MT6 can act within the upstream module 5 of TtcB. Furthermore, it suggests that pass-back chain extension occurs between modules 4 and 2 as opposed to 5 and 3, as promiscuous methylation is confined to the cyclic valine residue for Ttc. If linear valine residues were installed by module 5, promiscuous methylation should extend to these positions; however, we never observe this for products of Ttc. As module 5 of Ttm does not possess an A domain and presumably borrows the activity of A2, this could explain why promiscuous methylation can extend to upstream valine residues for Ttm but not Ttc. We attempted to use the same approach to characterize other thalassospiramide domains; however, efforts to perform PCR mutagenesis in ttcA, which is over 6.5 kb, and ttcB and ttmA, which are both over 15.5 kb, were ultimately unsuccessful. Therefore, we established new methodology combining oligo recombineering with CRISPR-Cas9 counter-selection for facile introduction of point mutations to large DNA constructs cloned into the pCB vector backbone (Supplementary Fig. 10). Using this method, we selectively inactivated a series of assembly line domains (Supplementary Fig. 11) to directly interrogate their role in thalassospiramide biosynthesis. Consistent with our annotation of C1a of TtcA as a starter C domain responsible for fatty acylation of the first amino acid residue[18], C1a inactivation completely abolished production of all thalassospiramides (Fig. 3a). We did not detect any masses corresponding to core peptides lacking an N-terminal fatty acid, suggesting that the assembly line can only generate lipopeptide products.
In contrast, inactivation of A1a using two different point mutations, TtcA-G631D and TtcA-K972A, resulted in essentially complete loss of all thalassospiramides incorporating phenylalanine or tyrosine as the first residue but maintained or enhanced production of analogs incorporating serine in the first position (Fig. 3a). Inactivation of A3 using the analogous lysine to alanine mutation (TtcB-K2045A) resulted in complete loss of thalassospiramide production, as did selective inactivation of T1a (Fig. 3a). These results suggest that the serine residue adenylated by A3 is directly loaded onto T1a during biosynthesis of analogs that incorporate serine as the first amino acid residue.
Fig. 3 | Selective inactivation of assembly line enzymatic domains alters product formation.

a,b, Changes in production level of thalassospiramide analogs from Ttc (a) and Ttm (b) assembly lines upon selective inactivation of specific enzymatic domains. Domain and precise amino acid mutations are listed in the first two columns. Fold-change in MS ion intensity is indicated by number and color intensity, while white boxes indicate analogs were not detected (n.d.) from the mutant. Statistical significance was calculated using a two-tailed Student’s t-test; n=3 biologically independent samples, *p<0.05, **p<0.005. (c) Structures and EICs of new thalassospiramide analogs produced upon TtmA C2 inactivation; see Supplementary Table 2 for HR-MS data.
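
The fold-change and significance annotations described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions: the ion intensities are hypothetical, each mutant is assumed to be compared against wild type, and the test is taken to be the equal-variance form of Student's t-test. The closed-form distribution function used here is valid only for 4 degrees of freedom, which corresponds to two groups of n=3.

```python
import math
from statistics import mean, variance


def t_cdf_df4(t):
    """CDF of Student's t distribution with exactly 4 degrees of freedom."""
    u = t / math.sqrt(1.0 + t * t / 4.0)
    return 0.5 + 0.375 * u * (1.0 - u * u / 12.0)


def compare_to_wild_type(wild_type, mutant):
    """Return (fold_change, two_tailed_p, stars) for two groups of three replicates."""
    if len(wild_type) != 3 or len(mutant) != 3:
        raise ValueError("this sketch only handles n=3 per group (df=4)")
    fold_change = mean(mutant) / mean(wild_type)
    # With equal group sizes the pooled variance is the mean of the two variances.
    pooled_var = (variance(wild_type) + variance(mutant)) / 2.0
    if pooled_var == 0:
        raise ValueError("cannot compute a t statistic without any variance")
    t_stat = (mean(mutant) - mean(wild_type)) / math.sqrt(pooled_var * 2.0 / 3.0)
    p_value = 2.0 * (1.0 - t_cdf_df4(abs(t_stat)))
    # Thresholds follow the caption: * p<0.05, ** p<0.005.
    stars = "**" if p_value < 0.005 else "*" if p_value < 0.05 else ""
    return fold_change, p_value, stars


# HYPOTHETICAL MS ion intensities (arbitrary units), three biological replicates each.
wild_type = [100.0, 110.0, 90.0]
mutant = [20.0, 25.0, 15.0]
print(compare_to_wild_type(wild_type, mutant))
```

For other replicate numbers, a general-purpose statistics library would be the more sensible choice; the closed form is used here only to keep the sketch free of external dependencies.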

We hypothesized that “B-like” thalassospiramides from Ttc arise through PKS module 1b and, correspondingly, that “A1-like” thalassospiramides arise through module 1b skipping. Consistent with that hypothesis, mutation of the active site cysteine of KS1b to alanine resulted in complete loss of “B-like” analogs but maintained production of “A1-like” analogs, although some yields were slightly reduced (Fig. 3a). Surprisingly, although AT1b inactivation (TtcA-S1728A) reduced production of “B-like” analogs, almost all could still be detected by LC-MS at ~8-24% of wild-type production levels (Fig. 3a). As the AT1b mutation is expected to abolish enzymatic activity and the assembly line has no other AT domains, we propose that the module 4-interacting trans-AT can partially complement AT1b. We previously hypothesized that the tandem T domains in module 4 might be important for determining whether chain-extension intermediates are passed forward or backward[12], although literature precedents indicated that multiple T domains do not change product identity but instead increase flux or yield[19]. We attempted to use the same oligo recombineering/CRISPR-Cas9 method to selectively inactivate T4a and T4b. However, CRISPR-Cas9 targeting did not result in oligo incorporation but instead recombination across the two very similar T domain sequences to generate an in-frame deletion of residues 3511 to 3596 in TtcB. The resultant mutant protein contained only a single chimeric T domain composed of 46% of the N-terminus of T4a and 54% of the C-terminus of T4b (Supplementary Fig. 11). Transfer of this construct to the heterologous host revealed that all thalassospiramide analogs could still be produced, including those incorporating linear Ser-C2-Val units arising from pass-back chain extension, albeit in decreased yields (Fig. 3a). This result proves that tandem T domains are not essential for bi-directional chain extension. 
Moreover, production of all analogs was affected equally, supporting the flux hypothesis. Finally, we predicted that “E-like” thalassospiramides produced by Ttm arise from module 2 skipping, analogous to module 1b skipping in Ttc. Thus, if pass-back chain extension occurs through modules 5 and 3, C2 inactivation in Ttm would preserve production of all “E-like” analogs, as C2 is completely skipped in this model. While C1 inactivation abolished production of all thalassospiramides, C2 inactivation dramatically reduced and completely abolished production of thalassospiramides E (23) and E1 (24), respectively (Fig. 3b). This result confirms our hypothesis that most pass-back events occur between modules 4 and 2, as levels of “E-like” analogs were not preserved. Furthermore, we observed a greater than 100-fold increase in levels of thalassospiramide E4 (25), suggesting that the inability to pass growing chains back via C2 forces the assembly line to pass intermediates forward, resulting in enhanced production of the “premature” termination product E4 (25) (Fig. 3b). However, we also observed the formation of two new compounds not previously detected with HR-MS and MS/MS spectra consistent with the structures of thalassospiramides E5 (27) and E6 (28), indicating that pass-back between modules 5 and 3 can occur but correlates with additional substrate dehydration by module 4 (Fig. 3c, Supplementary Table 2).

Models for thalassospiramide biosynthesis.

The results of Ttm C2 inactivation provide a clear mechanistic insight into the flexibility and control of pass-back chain extension during thalassospiramide biosynthesis, as illustrated in Figure 4. Selective inactivation of C2 does not affect early stages of “E-like” thalassospiramide biosynthesis, during which A2 loads T1 with valine and then module 2 is skipped following appendage of the N-terminal fatty acid. Assembly then proceeds through modules 3 and 4, but the chain-extension intermediate does not undergo immediate dehydration and may not be directly accessible to DH4. Now, the assembly line would normally pass the chain-extension intermediate from module 4 back to module 2 via C2, which is favored based on product distribution, as thalassospiramide E4 (25) is normally produced at very low abundance. We propose that the assembly line C domains play an important role in “measuring” intermediate chain length, promoting donation of the module 4 intermediate backward to module 2 instead of forward to module 5. However, C2 inactivation forces the chain-extension intermediate forward, making it accessible to DH4 before it enters module 5. Once within module 5, the intermediate would normally proceed directly to module 6, resulting in formation of thalassospiramide E4 (25). Consistent with this proposal, C2 inactivation drives a substantial increase in levels of E4 (25) compared to wild-type. However, chain length can also be “measured” at the donor site of C6 (as C6 usually accepts donor substrates of longer chain length), prompting the assembly line to catalyze pass-back of a subset of intermediates that have already undergone dehydration from module 5 back to module 3, resulting in eventual formation of new products E5 (27) and E6 (28). 
This result provides additional evidence that the substrate becomes accessible to DH4 just as it is passed forward to module 5 and dehydration is a passive result of forward chain extension, since a model that invokes DH4 gating of module 5 entry would not be consistent with the observed increase in levels of E4 (25) or formation of E5 (27) and E6 (28). Furthermore, it suggests that biosynthesis is flexible and controlled by a mechanism that measures intermediate chain length. If we assume that dehydration of the hydroxyl group within the linear unit is a signature for module 5 progression, there is evidence that pass-back between modules 5 and 3 occurs at low frequency under normal conditions, perhaps as an additional checkpoint, as we can detect thalassospiramides E7 (29), E8 (30), and E9 (31) by LC-MS (Supplementary Table 2), which are all produced by wild-type Ttm and have undergone additional rounds of “premature” dehydration.
Fig. 4 | Model for thalassospiramide biosynthesis by Ttm C2 inactivation mutant.

Valine is loaded onto T1 by A2 and C1 catalyzes addition of an activated fatty acid (FA) bound to an acyl carrier protein (ACP) or coenzyme A. Module 2 is skipped, and the fatty valine is extended directly onto serine-loaded T3 via C3. Chain extension proceeds normally to module 4, where the substrate is not immediately dehydrated and would normally be passed back to module 2 via C2, perhaps based on chain length. However, C2 inactivation forces the intermediate forward, where it becomes transiently accessible to DH4 and is dehydrated before extension onto T5, which is loaded with valine by A2. Direct progression through module 6 results in formation of thalassospiramide E4 (25), which is substantially increased as a result of C2 inactivation. Alternatively, passage from module 5 back to module 3 results in generation of new thalassospiramide analogs E5 (27) and E6 (28), which undergo one or two additional rounds of chain extension, respectively, through modules 3-5.
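
The routing described in this model can be written down as an ordered list of module visits. The sketch below is a bookkeeping aid rather than a mechanistic simulation: the function name and the idea of counting pass-back rounds as a parameter are illustrative choices, and the mapping of rounds to products follows the caption (no pass-back gives E4, one or two rounds give E5 or E6).

```python
# Products assigned in the model to each number of module 5 -> 3 pass-back rounds.
PRODUCTS = {0: "E4 (25)", 1: "E5 (27)", 2: "E6 (28)"}


def ttm_c2_mutant_path(passback_rounds):
    """Return the order of module visits in the Ttm C2 inactivation mutant.

    Module 2 is skipped, and because C2 is inactive the module 4 -> 2
    pass-back is unavailable, so any pass-back runs from module 5 to module 3.
    """
    if passback_rounds not in PRODUCTS:
        raise ValueError("the model describes 0, 1, or 2 pass-back rounds")
    path = [1, 3, 4, 5]
    for _ in range(passback_rounds):
        path += [3, 4, 5]  # pass back to module 3, then extend forward again
    path.append(6)         # terminal module
    return path


for rounds, product_name in PRODUCTS.items():
    print(product_name, ttm_c2_mutant_path(rounds))
```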

We can also propose a full model for thalassospiramide A (1) biosynthesis via Ttc (Fig. 5). T1a is preferentially adenylated with serine through the downstream A3 domain. To our knowledge, this is the first report of an A domain activating a carrier protein within an upstream module that already possesses its own active A domain[12]. Subsequently, module 1b is skipped during formation of “A1-like” analogs. Normal chain extension proceeds from modules 2 to 4, at which point we hypothesize that the substrate is sequestered from DH4 activity and module 5 entry based on chain length. We propose that during the formation of “B-like” thalassospiramides, ketoreduction to generate the upstream statine-like amino acid residue occurs at this time, analogous to dual ketoreduction of two disparate positions catalyzed by KR3 of PksJ during bacillaene biosynthesis[20]. The chain-extension intermediate within module 4 is then passed back to module 2 via C2, where it undergoes another round of linear chain extension through modules 2 to 4 (Fig. 5). Now, the intermediate has reached “sufficient” chain length and becomes accessible to DH4 before transfer to module 5 via C5. MT6 promiscuously methylates valine residues installed by module 5 to generate analogs such as thalassospiramide A8 (5). Finally, normal progression through modules 5 and 6 results in the formation of thalassospiramide A (1), the most abundant product of both Ttc and Ttm.
Fig. 5 | Model for thalassospiramide A biosynthesis by Ttc.

Serine is loaded onto T1a by A3, and C1a catalyzes addition of an activated fatty acid (FA) bound to an ACP or coenzyme A. Module 1b is skipped, and the fatty serine is passed directly to valine-loaded T2 via C2. Chain extension proceeds normally to module 4, where the substrate is sequestered from DH4, perhaps within an enzyme binding pocket. The intermediate is passed from module 4 back to module 2 via C2, where it undergoes another round of chain extension through modules 3 and 4. Upon return to module 4, the intermediate is now acted upon by DH4, perhaps due to a conformational change associated with longer chain length, and extended forward through modules 5 and 6 to generate thalassospiramide A (1).
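
The Ttc route can be tabulated in the same spirit, with two switches taken from the text: whether PKS module 1b is used ("B-like") or skipped ("A1-like"), and how many times the intermediate is passed from module 4 back to module 2. The function and parameter names are hypothetical, and equating the number of pass-backs with the number of linear Ser-C2-Val units is a reading of the model rather than something the sketch establishes.

```python
def ttc_path(use_module_1b, passbacks_4_to_2):
    """Return the order of module visits on the Ttc assembly line.

    use_module_1b: True for "B-like" analogs, False when module 1b is skipped.
    passbacks_4_to_2: number of times the chain returns from module 4 to module 2.
    """
    if passbacks_4_to_2 not in (0, 1, 2):
        raise ValueError("the text describes 0, 1, or 2 pass-back rounds")
    path = ["1a"]
    if use_module_1b:
        path.append("1b")
    # One forward pass through modules 2-4, plus one more per pass-back.
    for _ in range(passbacks_4_to_2 + 1):
        path += ["2", "3", "4"]
    path += ["5", "6"]
    return path


# Route proposed for thalassospiramide A (1): module 1b skipped, one pass-back.
print(" -> ".join(ttc_path(use_module_1b=False, passbacks_4_to_2=1)))
```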

Discussion

Modular assembly lines are large and dynamic enzymes that undergo dramatic domain conformational rearrangements during a single catalytic cycle[21-24]. However, whether multi-modular enzymes adopt a rigid, supermodular architecture[25] or a more flexible configuration[26] remains in debate. In this work, we demonstrate the ability of multi-modular assembly lines to catalyze bidirectional and nonlinear passage of chain-extension intermediates, favoring a more flexible arrangement. Several new features of multi-modular NRPS/PKS biosynthesis are described in this work. Thalassospiramide assembly lines catalyze several instances of inter-module substrate activation and tailoring. While A domain supplementation has been previously reported[6-10], prior examples have been limited to upstream domains supplementing downstream modules, often encoded on separate proteins, and only within modules that lack their own A domains. For Ttc, A3 adenylates T1a with serine more frequently than A1a does with phenylalanine or tyrosine, although A1a is active and participates in biosynthesis of numerous thalassospiramide analogs. For Ttm, both A2 and A3 deliver substrates to T1, but only A2 adenylates T5. Furthermore, only A2 loads T1 if module 2 is skipped, as all “E-like” analogs incorporate valine as the first amino acid, perhaps due to the specificity of C3[27]. MT6 promiscuously methylates valine residues activated by A5 of Ttc and A2 of Ttm. Finally, the trans-AT interacting with module 4 can partially supplement lost AT1b activity in Ttc. Complementation of lost cis-AT activity with cis-ATs from other modules or non-cognate trans-ATs has been previously observed in the 6-deoxyerythronolide B synthase[28]. Thalassospiramide assembly lines also catalyze programmed module skipping. Forward module skipping has been previously observed in PKS[29-31] and NRPS[32-34] systems, both naturally and as a byproduct of engineering.
Ttc and Ttm catalyze analogous skipping of PKS module 1b and NRPS module 2, respectively, although NRPS module 2 does not sit at an enzyme junction but is embedded within a very large polypeptide. Perhaps as a result, skipping is favored in Ttc but disfavored in Ttm, based on product distribution. Finally, thalassospiramide assembly lines catalyze pass-back chain extension. To our knowledge, this mechanism has not been previously reported in modular assembly line systems, although it is analogous to “iteration” observed in the fungal beauvericin and bassianolide synthetases[35], where intermediates are passed back and forth between adjacent modules. Consistent with previous findings, tandem T domains in thalassospiramide assembly lines increase flux[19] but are not mechanistic determinants of pass-back chain extension. This result is also consistent with the observation that the thalassospiramide assembly line from Oceanibaculum pacificum contains only a single T domain in module 4 and can still produce thalassospiramide A (1)[13]. There is no evidence that additional T domains promote “stalling” to allow for additional tailoring reactions, as having a single T domain in module 4 does not change product distribution but decreased levels of all analogs equally. Under normal conditions, thalassospiramide intermediates are passed from module 4 back to module 2 via C2. Upon C2 inactivation, we observe pass-back chain extension from module 5 back to module 3 via C3. This suggests that the assembly line is flexible and can respond to perturbation, trying to “correct” for aberrant intermediate chain length caused by C2 inactivation by passing back through C3, which, like C6, is specific for accepting intermediates with C-terminal valine residues. Our results are consistent with recent findings that multi-modular NRPSs can be “mixed and matched” if the specificity and relative position of downstream C domains are maintained[27]. 
It also suggests that assembly lines must possess some symmetry for pass-back chain extension to occur. While all NRPS modules possess mechanisms to control the timing of chain extension to prevent misinitiation[36,37], thalassospiramide modules have the added ability to accept longer intermediates from nonsequential modules. Perhaps more importantly, how physical proximity between upstream C domains and downstream T domains is achieved during pass-back chain extension remains unclear. We cannot predict the plasticity of the hybrid thalassospiramide assembly line proteins based on their primary amino acid sequence alone (Supplementary Fig. 8). Although we can assume that TtcA, TtcB, and TtmA are homodimeric due to the dimeric nature of KS and many linker domains, we do not know whether oligomerization impacts thalassospiramide assembly, although it is tempting to suggest that higher-order architecture helps facilitate nonlinear transfer. It is curious that although the early stages of thalassospiramide biosynthesis are flexible, resulting in production of lipopeptides with a high degree of N-terminal structural diversity, the final stages are rather fixed. The C-terminal cyclic depsipeptide core of all thalassospiramide analogs is highly conserved, particularly the 12-membered ring and the α,β-unsaturated carbonyl moiety that together form the pharmacophore responsible for calpain protease inhibition[38]. Thus, the assembly line constructs a series of chemical products in which the structural elements that confer activity are maintained, while accessory elements such as the fatty acid and linear chain composition and length, which may confer target specificity, are variable. It has been speculated that biosynthetic promiscuity resulting in chemical diversity may be evolutionarily advantageous[39,40]. 
Furthermore, the type of combinatorial biosynthesis observed in this study expands the portfolio of small molecules produced without introducing new assembly line modules or domains, or even additional tailoring enzymes, thus representing a means of expanding chemical diversity while minimizing genomic space. Our work was facilitated by new methodology combining oligo recombineering with CRISPR-Cas9 counter selection for facile editing of large DNA constructs. Although our method was used solely for domain inactivation in this study, it can be easily applied to perform other forms of multi-modular assembly line engineering, for example to alter A domain specificity[41,42]. Future efforts to understand the specific elements that enable and control assembly line flexibility will hopefully enhance efforts to engineer NRPS/PKS proteins and possibly expand their biosynthetic repertoire.

Methods

General methods.

A complete list of the primers, plasmids, and strains used in this study can be found in Supplementary Table 3. DNA fragments larger than 3 kb were amplified with PrimeSTAR Max (Clontech Laboratories, Inc.); all other PCR products were amplified with PrimeSTAR HS DNA polymerase (Clontech Laboratories, Inc.). DNA isolations and manipulations were carried out using standard protocols. Thalassospira sp. CNJ-328 and T. mobilis KA081020-065 were grown in GYP media (glucose 10 g/L, yeast extract 4 g/L, peptone 2 g/L, sea salt 25 g/L). S. cerevisiae VL6-48N was grown in YPDA media (yeast extract 10 g/L, peptone 20 g/L, dextrose 20 g/L, adenine 100 mg/L) or selective histidine drop-out media containing 5-FOA (yeast nitrogen base without amino acids and ammonium sulfate 1.7 g/L, yeast synthetic dropout medium without histidine 1.9 g/L, sorbitol 182 g/L, dextrose 20 g/L, ammonium sulfate 5 g/L, adenine 100 mg/L, 5-FOA 1 g/L). E. coli and P. putida strains were grown in LB. E. coli TOP10 and DH5α λpir were used for standard cloning procedures. E. coli BW25113/pIJ790 was used for λ Red PCR targeting, and E. coli HME68 was used for oligo recombineering and CRISPR-Cas9 counter selection. P. putida EM383 was used for heterologous expression. All strains were grown at 30 °C except TOP10 and DH5α λpir, which were grown at 37 °C. Liquid cultures were grown shaking at 220 r.p.m. When necessary, E. coli (and P. putida) cultures were supplemented with the following antibiotics: 50 μg/mL kanamycin (150 μg/mL for P. putida), 10 μg/mL gentamicin (30 μg/mL for P. putida), 50 μg/mL apramycin, 100 μg/mL ampicillin, 25 μg/mL chloramphenicol.

The overall workflow for genetic manipulation and heterologous expression of the ttc and ttm pathways is outlined in Supplementary Fig. 1. Biosynthetic gene clusters were cloned from genomic DNA using a TAR cloning protocol described previously[43]. Cluster-specific capture vectors were generated through a one-step PCR amplification of pCAP-BAC (pCB) using primers pCB-ttcCV_F/R for ttc and pCB-ttmCV_F/R for ttm. Yeast clones were screened, and PCR-positive constructs were purified and transferred to E. coli TOP10 for verification by restriction digestion. pCB was miniprepped from at least 25 mL of VL6-48N and at least 10 mL of TOP10. Expression was attempted but never achieved in E. coli strains, including BL21(DE3) and BAP1[44] (Supplementary Fig. 2). pJZ001, containing the intB13 cassette, was assembled by Gibson Assembly using four PCR fragments (amplified using primers CEN6/ARS4_608F/R, intB13_2330F/R, aacC1_1257F/R, and pADH_597F/R) and the pACYCDuet-1 vector backbone, linearized using HindIII and XhoI. Fully assembled pJZ001 was digested using HindIII and XhoI, and the 5155 bp fragment was gel purified and used for λ Red recombination[45] to knock in the intB13 cassette into the pCB vector backbone, replacing several yeast genes no longer necessary. Retrofitted constructs were then transferred to P. putida by electroporation as described previously[16] and selected for using kanamycin and gentamicin. IntB13-mediated integration of ttc into the genome of P. putida was characterized by PCR and chemical analysis (Supplementary Fig. 3). Edited constructs were similarly introduced to P. putida and then tested for heterologous production of lipopeptide products.
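
The media compositions given above are stated per litre, so preparing a smaller batch is a simple proportional scaling. The following is a minimal sketch of that arithmetic, not part of the original protocol: the function name `batch_amounts` and the dictionary layout are hypothetical, and only the GYP and YPDA compositions are taken from the text.

```python
# Recipes in g/L, copied from the General methods text.
GYP = {"glucose": 10.0, "yeast extract": 4.0, "peptone": 2.0, "sea salt": 25.0}
YPDA = {"yeast extract": 10.0, "peptone": 20.0, "dextrose": 20.0, "adenine": 0.1}


def batch_amounts(recipe_g_per_l, volume_ml):
    """Return grams of each component needed for `volume_ml` of medium.

    Hypothetical helper: scales a g/L recipe linearly with volume.
    """
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    return {name: g_per_l * volume_ml / 1000.0
            for name, g_per_l in recipe_g_per_l.items()}


if __name__ == "__main__":
    # Example: one 50 mL culture of GYP, the culture volume used for expression.
    for name, grams in batch_amounts(GYP, 50).items():
        print(f"{name}: {grams:.3f} g")
```

The same helper would apply to the drop-out medium if its components were entered in the same g/L form.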

Extraction and LC-MS analysis.

Precultures were inoculated with colonies picked from plates and grown overnight before being inoculated into full 50 mL cultures in 250 mL Erlenmeyer flasks (in triplicate). Full cultures were grown for 5 hours before addition of 1.5 g of autoclaved XAD7HP resin per 50 mL of culture. After 24 hours of additional incubation, culture ODs were measured and recorded at 600 nm and supernatant and cells were decanted. Resin was washed three times with Milli-Q water before extraction with 20 mL of ethyl acetate. Extracts were dried under nitrogen, resuspended in 200 μL of methanol, and filtered through a 0.22 μm filter prior to LC-MS/MS analysis. An Agilent 1100 series HPLC system (Palo Alto, CA, U.S.A.) was coupled to a Bruker Impact II Q-TOF mass spectrometer (Billerica, MA, U.S.A.) for LC-MS analysis. An Agilent ZORBAX 300SB-C18 LC column (300 Å, 5 μm, 150 × 0.5 mm) was used for LC separation. Mobile phase A was 0.1% formic acid (FA) in H2O and mobile phase B was 0.1% FA in acetonitrile (ACN). The LC gradient was: t=0.00 min, 70%A; t=3.00 min, 70%A; t=5.00 min, 57%A; t=35.00 min, 57%A; t=49.00 min, 20%A; t=51.00 min, 0%A; t=52.00 min, 0%A; t=57.00 min, 70%A; t=60.00 min, 70%A. A post time of 10 min was set to re-equilibrate the column. For shorter runs, the LC gradient was: t=0.00 min, 70%A; t=3.00 min, 70%A; t=23.00 min, 20%A; t=24.00 min, 0%A; t=27.00 min, 70%A; t=30.00 min, 70%A. A post time of 3 min was set to re-equilibrate the column. Flow rate was 20 μL/min. Sample injection volume was 2 μL. MS conditions for MS/MS spectra generation were set as follows: capillary voltage, 4500 V; nebulizer gas flow, 0.8 Bar; dry gas, 5.0 L/min at 180 °C; funnel 1 RF, 150 Vpp; funnel 2 RF, 300 Vpp; isCID energy, 0 eV; hexapole RF, 50 Vpp; quadrupole ion energy, 4 eV; low mass, 50 m/z; collision cell energy, 20-50 eV; pre pulse storage, 5.0 μs; collision RF, ramp from 350 to 800 Vpp; transfer time, ramp from 50 to 100 μs; detection mass range, 25 to 1000 m/z; MS/MS spectra collection rate, 2.0 Hz.
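
The two LC gradients above are given as lists of time/composition breakpoints. As an illustration of how such a program can be encoded and queried, the sketch below stores the breakpoints from the text and interpolates the mobile phase composition at any time point. It assumes linear ramps between breakpoints and a binary system in which %B = 100 - %A; neither assumption is stated explicitly in the text, and the names `LONG_GRADIENT`, `SHORT_GRADIENT`, `percent_a`, and `percent_b` are hypothetical.

```python
from bisect import bisect_right

# (time in min, % mobile phase A) breakpoints, as listed in the text.
LONG_GRADIENT = [
    (0.0, 70.0), (3.0, 70.0), (5.0, 57.0), (35.0, 57.0), (49.0, 20.0),
    (51.0, 0.0), (52.0, 0.0), (57.0, 70.0), (60.0, 70.0),
]
SHORT_GRADIENT = [
    (0.0, 70.0), (3.0, 70.0), (23.0, 20.0),
    (24.0, 0.0), (27.0, 70.0), (30.0, 70.0),
]


def percent_a(gradient, t_min):
    """%A at time t_min, assuming linear ramps between breakpoints."""
    times = [t for t, _ in gradient]
    if t_min <= times[0]:
        return gradient[0][1]
    if t_min >= times[-1]:
        return gradient[-1][1]
    i = bisect_right(times, t_min)       # first breakpoint strictly after t_min
    (t0, a0), (t1, a1) = gradient[i - 1], gradient[i]
    return a0 + (a1 - a0) * (t_min - t0) / (t1 - t0)


def percent_b(gradient, t_min):
    """%B at time t_min, assuming a binary A/B system."""
    return 100.0 - percent_a(gradient, t_min)


if __name__ == "__main__":
    for t in (0, 4, 20, 42, 51.5, 58):
        print(f"t={t:>5} min  %A={percent_a(LONG_GRADIENT, t):5.1f}  "
              f"%B={percent_b(LONG_GRADIENT, t):5.1f}")
```

Encoding the gradient as data rather than prose makes it straightforward to compare the long and short programs or to check the composition at a given retention time.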
All samples analyzed by comparison were run at the same time and under the same conditions. Values were normalized by culture ODs and compared only for peaks with identical MS spectra and retention time. HR-MS data for thalassospiramide analogs analyzed in this study are provided in Supplementary Table 2 and Supplementary Fig. 12-24.
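
The OD normalization described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify the intensity measure or how triplicates were combined, so the sketch assumes integrated peak areas, averaging of OD-normalized replicates, and expression of a mutant relative to a control strain. The function names and all numeric values are hypothetical.

```python
from statistics import mean


def normalize_by_od(peak_area, od600):
    """Divide an integrated peak area by the culture OD600."""
    if od600 <= 0:
        raise ValueError("od600 must be positive")
    return peak_area / od600


def relative_production(mutant_reps, control_reps):
    """Ratio of mean OD-normalized peak area, mutant over control.

    Each argument is a list of (peak_area, od600) tuples, one per replicate.
    Assumes the peak was matched by MS spectrum and retention time upstream.
    """
    mutant = mean([normalize_by_od(a, od) for a, od in mutant_reps])
    control = mean([normalize_by_od(a, od) for a, od in control_reps])
    return mutant / control


if __name__ == "__main__":
    # Hypothetical peak areas and OD600 readings, for illustration only.
    control = [(1000.0, 2.0), (1200.0, 2.0)]
    mutant = [(300.0, 1.5), (250.0, 1.0)]
    print(f"relative production: {relative_production(mutant, control):.3f}")
```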

Gene deletion and complementation experiments.

Gene deletions were made using λ Red PCR targeting as described previously[46]. Primers used to amplify the aac(3)IV cassette and confirm gene deletions are listed in Supplementary Table 3. Deletions using this cassette were made after addition of the intB13 cassette, which contains a gentamicin resistance gene, as the apramycin resistance gene aac(3)IV confers resistance to gentamicin in E. coli. For complementation, ttcD and ttcC were amplified using primers Tn7-ttcD_F/R and Tn7-ttcC_F/R, respectively, and cloned into the mini-Tn7 vector pUC18R6K-mini-Tn7T-Gm[47]. Cloned constructs were introduced to P. putida by electroporation[16] along with the helper plasmid pTNS1[47] and selected for using gentamicin. Complemented P. putida clones were then made electrocompetent and pCB constructs were transferred by electroporation and selected for using kanamycin and gentamicin. For complementation of TtcC-G234D (MT6 inactivation), the mutation was generated by amplification of pTn7::ttcC using primers Tn7-ttcC-g702a_F/R and confirmed by sequencing. Although a ttcA deletion construct was generated and a mini-Tn7 vector containing ttcA was prepared, the latter ultimately could not be transferred to P. putida for complementation, as no clones were obtained even after multiple attempts, likely due to the large size of the gene (>6.5 kb).

Inactivation and testing of assembly line enzymatic domains.

Assembly line domain active sites are shown in Supplementary Fig. 8. Motifs and active sites were identified by sequence alignment against annotated NRPS/PKS domains[22-24,48-55]. pJZ002, an ampicillin-resistant version of the pCas9[56] vector, was constructed as follows. First, the BsaI restriction site was removed from the ampicillin resistance gene bla via PCR amplification of pKD20 using primers pKD20-g848a_F/R. The resulting construct was PCR amplified using primers ts-repA101_F/R and combined with a fragment amplified from pCas9 using primers pCas9_5058F/R by Gibson assembly. Spacer sequences were cloned into pJZ002 as described previously[56]. Spacer sequences and targeting oligos used to target specific domains are listed in Supplementary Table 3. Oligo recombination and CRISPR-Cas9 counter selection were performed as described previously[56,57], with several modifications. The general workflow is shown in Supplementary Fig. 10. pCB constructs were first transferred to E. coli HME68 by electroporation and selected for using kanamycin. A transformant was picked and grown at 30 °C to OD600 0.4-0.5 and heat shocked for 15 minutes at 42 °C in a shaking water bath before being chilled on ice for 10 minutes. The cells were pelleted and washed with ice-cold water before being resuspended in a small volume of ice-cold water. 100 ng of pJZ002 containing the appropriate spacer was mixed with 100 ng of targeting oligo and the DNA mixture was introduced to the cells prior to electroporation at 2.5 kV in a 2 mm gap electroporation cuvette. Cells were recovered for 2 hours at 30 °C with shaking and plated on LB with kanamycin and ampicillin. Four clones of each mutant were picked, miniprepped, and screened by sequencing. If no correct mutant was identified, a new spacer sequence was designed and cloned into pJZ002 and the method was retried.
Very subtle mutations could be recovered efficiently using an effective spacer sequence, although the effectiveness of the spacer could only be determined empirically. In total, four of 12 spacer sequences were redesigned to achieve successful editing (Supplementary Table 3). Correctly edited constructs were transferred to TOP10 and confirmed by restriction digestion (Supplementary Fig. 11) before being transferred to P. putida for heterologous expression.

Data availability

The ttc and ttm biosynthetic gene cluster sequences are available in the MIBiG database (accession BGC0001050 and BGC0001876). Plasmids pCAP-BAC (#120229), pJZ001 (#120230), and pJZ002 (#120231) are available at Addgene.
  56 in total

Review 1.  Programming of erythromycin biosynthesis by a modular polyketide synthase.

Authors:  David E Cane
Journal:  J Biol Chem       Date:  2010-06-03       Impact factor: 5.157

Review 2.  The structural biology of biosynthetic megaenzymes.

Authors:  Kira J Weissman
Journal:  Nat Chem Biol       Date:  2015-09       Impact factor: 15.040

Review 3.  Biosynthesis of polyketides by trans-AT polyketide synthases.

Authors:  Eric J N Helfrich; Jörn Piel
Journal:  Nat Prod Rep       Date:  2016-02       Impact factor: 13.423

4.  Biosynthetic pathway for mannopeptimycins, lipoglycopeptide antibiotics active against drug-resistant gram-positive pathogens.

Authors:  Nathan A Magarvey; Brad Haltli; Min He; Michael Greenstein; John A Hucul
Journal:  Antimicrob Agents Chemother       Date:  2006-06       Impact factor: 5.191

Review 5.  Nonribosomal Peptide Synthesis-Principles and Prospects.

Authors:  Roderich D Süssmuth; Andi Mainz
Journal:  Angew Chem Int Ed Engl       Date:  2017-03-21       Impact factor: 15.336

Review 6.  Daptomycin, a bacterial lipopeptide synthesized by a nonribosomal machinery.

Authors:  Lars Robbel; Mohamed A Marahiel
Journal:  J Biol Chem       Date:  2010-06-03       Impact factor: 5.157

7.  The biosynthetic gene cluster for the antitumor drug bleomycin from Streptomyces verticillus ATCC15003 supporting functional interactions between nonribosomal peptide synthetases and a polyketide synthase.

Authors:  L Du; C Sánchez; M Chen; D J Edwards; B Shen
Journal:  Chem Biol       Date:  2000-08

8.  Identification of the biosynthetic gene cluster and an additional gene for resistance to the antituberculosis drug capreomycin.

Authors:  Elizabeth A Felnagle; Michelle R Rondon; Andrew D Berti; Heidi A Crosby; Michael G Thomas
Journal:  Appl Environ Microbiol       Date:  2007-05-11       Impact factor: 4.792

9.  Deciphering tuberactinomycin biosynthesis: isolation, sequencing, and annotation of the viomycin biosynthetic gene cluster.

Authors:  Michael G Thomas; Yolande A Chan; Sarah G Ozanick
Journal:  Antimicrob Agents Chemother       Date:  2003-09       Impact factor: 5.191

10.  Iron acquisition in plague: modular logic in enzymatic biogenesis of yersiniabactin by Yersinia pestis.

Authors:  A M Gehring; E DeMoll; J D Fetherston; I Mori; G F Mayhew; F R Blattner; C T Walsh; R D Perry
Journal:  Chem Biol       Date:  1998-10