
Evolution and Diversity of Assembly-Line Polyketide Synthases.

Aleksandra Nivina, Kai P Yuet, Jake Hsu, Chaitan Khosla.

Abstract

Assembly-line polyketide synthases (PKSs) are among the most complex protein machineries known in nature, responsible for the biosynthesis of numerous compounds used in the clinic. Their present-day diversity is the result of an evolutionary path that has involved the emergence of a multimodular architecture and further diversification of assembly-line PKSs. In this review, we provide an overview of previous studies that investigated PKS evolution and propose a model that challenges the currently prevailing view that gene duplication has played a major role in the emergence of multimodularity. We also analyze the ensemble of orphan PKS clusters sequenced so far to evaluate how large the entire diversity of assembly-line PKS clusters and their chemical products could be. Finally, we examine the existing techniques to access the natural PKS diversity in natural and heterologous hosts and describe approaches to further expand this diversity through engineering.

Year: 2019 PMID: 31838842 PMCID: PMC6935866 DOI: 10.1021/acs.chemrev.9b00525

Source DB: PubMed Journal: Chem Rev ISSN: 0009-2665 Impact factor: 60.622

Introduction

Polyketide Synthases (PKSs)

Polyketide synthases (PKSs) are multifunctional enzymes responsible for the biosynthesis of numerous natural products, many of which are currently used as antibiotics (e.g., erythromycin), antiparasitic drugs (e.g., avermectin), cholesterol-lowering agents (e.g., lovastatin), immunosuppressants (e.g., FK506), and cancer chemotherapeutics (e.g., epothilone). PKSs are classified into three types: type I PKSs are large multifunctional proteins composed of several functional domains and are found in both bacteria and fungi; type II PKSs are formed by discrete catalytic domains and are typically found in bacteria; and type III PKSs are simpler chalcone synthase-type enzymes that catalyze the formation of the product within a single active site and occur mainly in plants and bacteria. Type I PKSs are subdivided into iterative PKSs (reviewed in ref (1)) and assembly-line PKSs, also called modular PKSs (reviewed in ref (2)). Whereas an iterative PKS catalyzes multiple chain elongation cycles using the same set of enzymatic domains, the nascent polyketide chain is channeled from one module to another within an assembly-line PKS such that each module typically catalyzes only one elongation cycle. Iterative type I PKSs are primarily found in fungi, while the assembly-line architecture predominates in bacteria, although several eukaryotic assembly-line PKSs have also been identified. This review focuses on assembly-line PKSs, which are among the most complex biosynthetic protein machineries known in nature. The structure and mechanism of assembly-line PKSs have been a subject of numerous studies (reviewed in refs (3−6)). The catalytic chemistry of a prototypical assembly-line PKS is schematically outlined in Figure 1. Within each module of the assembly line, polyketide acyl chain elongation is catalyzed collaboratively by a ketosynthase (KS), an acyltransferase (AT), and an acyl carrier protein (ACP) domain. 
The ACP domain is post-translationally modified with a phosphopantetheinyl (P-pant) “swinging arm” by a P-pant transferase (PPTase). The KS receives the growing polyketide chain from the ACP of the previous module, while the AT trans-esterifies an α-carboxyacyl extender unit from an appropriate acyl-CoA metabolite onto the ACP. The KS then catalyzes a decarboxylative Claisen-like condensation between the polyketide intermediate and the extender unit. Before being translocated onto the KS of the next module, the newly synthesized, ACP-bound β-ketothioester intermediate can be modified by additional domains, such as a ketoreductase (KR), dehydratase (DH), enoylreductase (ER), methyltransferase (MT), or others. KR, DH, and ER successively and stereospecifically reduce the extended product into a β-hydroxyl, alkene, and methylene functionality, respectively; KR domains establish the stereoconfiguration of both the α- and β-carbon atoms of their products. These domains are usually encoded within the module but can also be present as free-standing proteins in trans.[7] Ultimately, the full-length polyketide is released from the PKS by hydrolysis or macrocyclization catalyzed by a thioesterase (TE) domain or reductive cleavage.
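The successive action of the KR, DH, and ER domains described above can be pictured with a small toy model. The sketch below is only an illustration: the function name and the state labels are hypothetical, it ignores stereochemistry and domains acting in trans, and it assumes that each reduction step requires the product of the previous one.

```python
# Toy model of the reductive loop (illustrative names, not from the review).
# After KS-catalyzed condensation the beta-carbon is a ketone; KR, DH, and ER
# then act in that order, each on the product of the previous step.

REDUCTION_STEPS = [
    ("KR", "beta-hydroxyl"),  # ketoreductase: ketone -> alcohol
    ("DH", "alkene"),         # dehydratase: alcohol -> alkene
    ("ER", "methylene"),      # enoylreductase: alkene -> methylene
]


def beta_carbon_state(domains):
    """Return the beta-carbon functionality left by a module's domains."""
    state = "beta-keto"
    for domain, product in REDUCTION_STEPS:
        if domain not in domains:
            break  # assumption: a later step cannot act without the earlier one
        state = product
    return state


if __name__ == "__main__":
    for module in (["KS", "AT", "ACP"],
                   ["KS", "AT", "KR", "ACP"],
                   ["KS", "AT", "DH", "KR", "ACP"],
                   ["KS", "AT", "DH", "ER", "KR", "ACP"]):
        print("-".join(module), "->", beta_carbon_state(module))
```

Under these assumptions, the degree of reduction at the β-carbon follows directly from which reductive domains a module carries.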

Figure 1

(A) The 6-deoxyerythronolide B synthase (DEBS), a prototypical assembly-line PKS, synthesizes 6-deoxyerythronolide B, the precursor of erythromycin A. (B–E) Reactions catalyzed by module 2 (M2) of DEBS. (B,C) Transacylation of the electrophilic and nucleophilic substrates of M2 from the ACP of module 1 (M1) and (2S)-methylmalonyl-CoA, respectively. (D,E) Polyketide chain elongation and ketoreduction. KS, ketosynthase; AT, acyltransferase; ACP, acyl carrier protein; KR, ketoreductase; KR0, redox-inactive KR with epimerase activity; DH, dehydratase; ER, enoylreductase; TE, thioesterase.

Individual modules of assembly-line PKSs are classified as cis-AT and trans-AT modules. cis-AT modules contain all three essential domains (KS, AT, and ACP) comprising a PKS, whereas in trans-AT modules the extender unit is transacylated onto the ACP domain by a free-standing AT that is often shared across multiple modules (reviewed in ref (8)). Modules are connected either by intermodular linkers[9] or, if a PKS spans several polypeptides, by docking domains that establish specific noncovalent interactions between successive modules.[10] The architectures of cis-AT PKSs are often colinear to their genetic encoding (i.e., the order in which modules are encoded on the DNA level corresponds to the order in which they operate), whereas the modules of trans-AT assembly lines are often not colinear.[11] A number of tailoring enzymes can further modify the backbone, either while the intermediates are still bound to the assembly line or after they are released.[12−14] Typically, all genes involved in the biosynthesis of the final product are colocalized within the bacterial genome, forming a biosynthetic gene cluster (BGC).
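The cis-AT/trans-AT distinction and the notion of colinearity can be stated compactly in code. The following is a minimal sketch under simplifying assumptions: modules are represented as plain lists of domain abbreviations, the function names are hypothetical, and the "incomplete" label for domain sets lacking a KS or ACP (e.g., loading modules) is an assumption introduced here rather than terminology from the literature.

```python
def classify_module(domains):
    """Classify a module as 'cis-AT' or 'trans-AT' from its domain list."""
    present = set(domains)
    if not {"KS", "ACP"} <= present:
        # Assumption: anything without both KS and ACP is not an elongation module.
        return "incomplete"
    # cis-AT modules carry their own AT; trans-AT modules rely on a free-standing AT.
    return "cis-AT" if "AT" in present else "trans-AT"


def is_colinear(genetic_order, operating_order):
    """True when modules act in the same order in which they are encoded."""
    return list(genetic_order) == list(operating_order)


if __name__ == "__main__":
    print(classify_module(["KS", "AT", "KR", "ACP"]))  # has an embedded AT
    print(classify_module(["KS", "KR", "ACP"]))        # AT supplied in trans
    print(is_colinear(["M1", "M2", "M3"], ["M1", "M2", "M3"]))
    print(is_colinear(["M1", "M2", "M3"], ["M2", "M1", "M3"]))
```

In practice, module annotation depends on sequence-based domain detection, so this sketch only captures the definitions and not how domains are identified.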

Why Study PKS Evolution?

Assembly-line PKSs can contain up to 30 modules, distributed over several polypeptide chains. Together with nonribosomal peptide synthetases (NRPSs), they comprise two related classes of megasynthases attaining up to several MDa in size and responsible for the biosynthesis of numerous secondary metabolites. In addition to their remarkable catalytic mechanisms, their multimodular architecture also provides a unique opportunity to study the evolution of genes that encode multiple homologous but functionally distinct units. From a fundamental standpoint, there is a compelling correlation between genotypic and phenotypic diversity within this family of enzymes.[15] From a practical perspective, the study of assembly-line PKS evolution and diversity could help us expand our therapeutic arsenal. On one hand, exploration of natural PKS diversity holds the potential of discovering assembly lines that synthesize new bioactive polyketides. On the other hand, a better understanding of mechanisms used by nature for polyketide diversification could open new avenues for PKS engineering. Evolutionarily inspired approaches have already started to find use in guiding the assembly of chimeric PKSs that produce novel biomolecules. In the next sections, we will summarize the current models of assembly-line PKS evolution and their impact on enzyme engineering, evaluate the diversity of natural PKSs, provide an overview of methods that allow accessing this diversity through the activation of BGCs, and discuss the broader implications of this evolutionary analysis for the field of natural products research.

Evolution of Assembly-Line PKSs

There is compelling evidence to suggest that all PKSs are evolutionarily related: despite the differences in their architectures and mechanisms, their domains belong to the same protein families and catalyze similar reactions. However, the precise evolutionary relationships between different PKSs are unclear and therefore present an outstanding challenge. The multimodular architecture of assembly-line PKSs is uncommon among proteins, meaning that the selective pressures and molecular mechanisms involved in their evolution could be distinct from those operating in most protein families. Nonuniform distribution of different PKS types among bacterial and eukaryotic phyla further complicates the challenge. If one assumes that iterative PKSs predated assembly lines, then evolution of the present-day diversity of assembly-line PKSs likely involved genetic processes such as mutation, gene fusion to establish module architecture, gene duplication to yield multimodular PKSs, and further diversification of assembly lines via mutation, recombination, and interspecies horizontal gene transfer (HGT) (Figure 2A).

Figure 2

General model for PKS evolution. The multidomain architecture of type I PKS modules evidently arose through the fusion of genes encoding single-domain proteins of type II systems. The processes that led to the emergence of the multimodular architecture are less well understood. For instance, it is unclear whether assembly-line systems evolved from iterative PKSs that lost their ability to perform several consecutive condensation reactions on the same polyketide chain or from a separate subset of type II proteins. Once a set of assembly-line PKSs emerged, other processes allowed further diversification of these modular enzymes and their products.

The most profound difference between iterative and assembly-line PKSs lies in the chemistry of the chain translocation reaction involving a KS and an ACP domain. Specifically, whereas KSs of iterative PKSs operate multiple times on the same polyketide chain,[16] the KS domains of assembly-line PKSs must release their β-ketoacyl-ACP product before the newly vacated active site Cys residue can attack the reactive thioester linkage in this product (Figure C). While the precise mechanism by which such back-transfer of the growing polyketide chain is precluded remains unclear, chain elongation by assembly-line PKS modules is energetically coupled to intermodular chain translocation via a “turnstile” mechanism: a module is precluded from accepting a new chain until the product of previous chain elongation cycle has been passed down to the downstream module.[17,18] This avoids KS reacylation by the downstream ACP and consequent iterative chain elongation. The existence of the “turnstile” mechanism suggests that chain translocation between different modules is an evolutionarily acquired feature, i.e., a gain-of-function mutation as opposed to a loss-of-function trait. In this section, we will first place the evolution of assembly-line PKSs in the context of related enzymes such as iterative PKSs and fatty acid synthases. 
This will be followed by a phylogenetic analysis of the domains of assembly-line PKSs as well as a brief review of genetic processes believed to play important roles in assembly-line PKS evolution. Both of these concepts are critical to understanding current models for assembly-line PKS evolution (and limitations thereof), which is the principal focus of this section.

Evolutionary Origins of Assembly-Line PKSs

Assembly-line PKSs are evolutionarily related to a number of other multifunctional enzyme families. Even though models for evolutionary relationships have been previously proposed based on phylogenetic studies (Supporting Information, Figure S1), the origins of diverse PKS types and subtypes are not yet fully understood. Relatively close homologues include type I iterative PKSs, such as those found in fungi,[19−21] certain lipid biosynthetic pathways of mycobacteria,[22,23] enediyne synthases in actinobacteria,[24] polyunsaturated fatty acid (PUFA) synthases,[25] and heterocyst glycolipid synthases in nitrogen-fixing cyanobacteria.[26] Assembly-line PKSs are also evolutionarily related to fatty acid synthases (FASs). Their modular architectures differ significantly from bacterial and fungal type I FASs[27] and more closely resemble vertebrate FASs instead, although it is unclear whether these relationships are products of divergent or convergent evolution.[20,21] The evolutionary relatedness to type II PKS and FAS systems is even more distant. Even though most enzymatic domains that form PKSs and NRPSs belong to different protein families, the two assembly-line systems use very similar biosynthetic strategies and often form hybrid assemblies: about one-third of biosynthetic gene clusters encode both types of enzymes.[28] These assembly lines appear to have evolved to facilitate translocation of hybrid products between individual PKS and NRPS modules, and their carrier proteins are serviced by the same PPTases and TEs with broad substrate specificity (reviewed in refs (29) and (30)). 
Surprisingly, hybrid assemblies have a wide array of architectures including nonmodular, iterative, assembly line, or mixed type.[28] Given their prevalence and the presence of specialized domains and interfaces to ensure intermodular interactions, it is tempting to speculate that these hybrid assemblies appeared early in the evolutionary history of polyketide and nonribosomal peptide natural products. However, this subject is beyond the scope of the current review.

Phylogeny of Catalytic Domains from Assembly-Line PKSs

Most evolutionary relationships of PKSs to related enzymes were deduced from the overall biosynthetic enzyme architecture and the alignment of KS domains, which show the highest degree of amino acid sequence conservation. However, when exploring the emergence of multimodularity within PKSs, an analysis of KS domains alone is insufficient; it does not reflect the entire evolutionary history of assembly-line PKSs, as shown by phylogenetic studies of other domains.

Ketosynthase (KS) Domains

Within assembly-line PKSs, KS domains fall into two clades corresponding to cis-AT and trans-AT enzymes.[31] The phylogenetic tree of KS domains of cis-AT PKSs typically follows the phylogeny of the host organisms, with higher sequence identities within a single BGC and, to a lesser extent, different assembly lines within the species.[32,33] The two exceptions are KS domains from mixed NRPS/PKS systems, which ligate a peptide intermediate from the upstream NRPS module to a polyketide extender unit, and the decarboxylative KSQ (KS0) domains, whose active site Cys residue is replaced by Gln. These two groups form separate branches that are very close to corresponding domains of trans-AT PKSs.[33] In contrast to the KS domains of cis-AT PKSs, KS domains of trans-AT PKSs are not phylogenetically grouped with other KSs from the same BGC. Instead, the closest KS relatives almost always elongate structurally similar polyketide intermediates.[34,35] It has been noted that KS domains from trans-AT PKSs are less promiscuous than cis-AT KSs[36] and form evolutionarily conserved units with ACPs from the upstream modules rather than ACPs from the same module.[37] A similar phylogenetic pattern has been observed for a group of aminopolyol synthases,[38] suggesting that it also may apply to some subsets of cis-AT PKSs.[39]

Acyltransferase (AT) Domains

In cis-AT PKSs, AT domains comprise two clades based on their substrate specificity. Apart from a small number of exceptions, one clade contains AT domains utilizing malonyl-CoA, while the other corresponds to AT domains utilizing methylmalonyl-CoA and rarer substrates.[20,40] In trans-AT PKSs, free-standing AT proteins comprise a distinct clade from their counterparts in cis-AT PKSs.[41] These ATs also distribute across two subclades: one that includes catalytically relevant acyltransferases (nearly all of which utilize malonyl extender units) and another that includes enzymes that have acyl hydrolase activity and are therefore capable of hydrolyzing acetyl groups that are erroneously trans-acylated from acetyl-CoA onto an ACP.[42,43]

Other Domains

Like AT domains, KRs also cluster based on their catalytic properties. In cis-AT PKSs, KRs comprise two clades that segregate based on alcohol stereochemistry.[44] In trans-AT PKSs, KRs are distributed across four clades that are distinguished by not just alcohol stereochemistry but also the presence of other enzymes within the module, including methyltransferases (MTs) and dehydratases.[37] MT domains are relatively rare in cis-AT PKSs; they usually present an alternative mechanism for introducing an α-C substituent into the polyketide backbone by modules with malonyl-specific AT domains. The phylogeny of MT domains reflects the identity of the methyl acceptor, i.e., C- versus O-methyltransferases.[45] Notably, the N-MTs from NRPSs are also closely related to their homologues from cis-AT PKSs, albeit in a clade of their own. In contrast, MT domains in trans-AT PKSs cluster more variably, likely based on module composition as well as substrate specificity.[34] ACP domains are relatively short and variable, which complicated phylogenetic analysis until recently, when more sequences became available. In trans-AT PKSs, ACP clades track with those of their downstream KSs.[37] The ACPs from cis-AT PKSs comprise their own clade, although no clear clustering principle can be gleaned from this clade. Nonetheless, the ACPs of giant aminopolyol synthases appear to evolutionarily comigrate together with their downstream KSs, similar to the trans-AT ACPs.[38,39]

Processes Involved in Assembly-Line PKS Diversification

Not only does phylogenetic analysis of individual domains from assembly-line PKSs provide insight into evolutionary relationships within this PKS family, but it also highlights the genetic processes that may have led to their diversification. In this section, we review the genetic processes that are thought to have played important roles in the evolution of assembly-line PKSs.

Gene Duplication

Early in the study of assembly-line PKSs, it was noted that some modules within the same PKS share exceptionally high levels of sequence similarity.[46] The clustering of KS domains derived from the same assembly line has been observed for many cis-AT PKSs and has led to speculation that their multimodularity arose mainly through repeated gene duplication, followed by further diversification through mutation.[32,47] The fact that in most cis-AT PKSs, modules operate in the same order in which they are encoded on the DNA level, known as the principle of colinearity, also supported the role of gene duplication and deletion in their evolution.[48] As evidence for other processes, such as horizontal gene transfer, recombination, domain loss and acquisition, and gene conversion, accumulated, the gene duplication model was modified to include these processes.[15,20,34,48,49]

Horizontal Gene Transfer (HGT)

In bacteria, genetic diversity is often acquired through horizontal gene transfer (HGT), a process during which genetic information is transmitted laterally to other neighboring bacteria rather than vertically to their descendants.[49] There is ample evidence of its role in the evolution of assembly-line PKS clusters; in fact, it appears to have played a particularly strong role in PKS evolution in proteobacteria.[20,21] This inference is based on the observation of phylogenetic incongruencies between PKS genes and host species, anomalous distribution of genes among bacterial groups and atypical nucleotide compositions, and is especially notable in gene clusters encoding the biosynthesis of streptomycin,[50,51] epothilone,[32] and lagriamide,[52] among others, as well as PKS clusters of bacterial origin found in sponges,[53] filamentous fungi,[54] and other taxa. In one instance, HGT has even been observed experimentally.[55] The high frequency of HGT of PKS genes could be due to multiple factors. Some PKSs are encoded on plasmids[56−58] or located within pathogenicity islands,[59] which facilitates gene transfer through conjugation, transposition, or transduction. Additionally, transposon-like sequences are often observed proximal to KS domains, highlighting the potential for transfer of these PKS genes through transposition,[32] although no direct evidence of such events has been found. It has also been suggested that the high rate of HGT in actinomycetales could be due to the linearity or instability of their chromosomes.[60]

Gene Conversion

Gene conversion is a process by which two homologous sequences are homogenized, where one sequence becomes a copy of another through unidirectional sequence replacement. It is widespread and well-described in eukaryotes.[61] Examples of gene conversion have also been described in prokaryotes, where it is responsible for antigenic variation or the evolution of multigene families, but the extent of its importance in bacterial genomes is not well understood.[62] Gene conversion is thought to play a role in the evolution of cis-AT PKSs. For example, it may explain the almost identical sequences of modules comprising the mycolactone synthase[63] and also rationalize changes in the structures of some macrolide antibiotics.[64]
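The unidirectional sequence replacement that defines gene conversion can be illustrated on toy sequences. The sketch below is an assumption-laden illustration: the two short DNA strings are invented, the function names are hypothetical, and real conversion tracts are identified from alignments of full-length genes rather than fixed coordinates.

```python
def percent_identity(a, b):
    """Percentage of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def gene_conversion(donor, acceptor, start, end):
    """Copy donor[start:end] over the same region of acceptor (unidirectional)."""
    return acceptor[:start] + donor[start:end] + acceptor[end:]


if __name__ == "__main__":
    # Invented sequences standing in for two homologous (paralogous) regions.
    donor = "ATGGCCTTAGGC"
    acceptor = "ATGACCTTGGGA"
    converted = gene_conversion(donor, acceptor, 0, 9)
    print("before:", percent_identity(donor, acceptor))
    print("after: ", percent_identity(donor, converted))
    print("donor unchanged:", donor)
```

The donor sequence is left untouched while the acceptor becomes more similar to it, which is the homogenizing effect described above.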

Recombination

Recombination undoubtedly plays a major role in the evolution of assembly-line PKSs; indeed, gene duplication, transposition, and gene conversion all rely on recombination processes. However, recombination by itself is an important mechanism of PKS evolution and diversification, especially in the cases of trans-AT PKSs.[34,65] In cis-AT PKSs, the lack of sequence conservation in docking domain pairs that flank adjacent modules suggests that modules comprising this class of assembly-line PKSs underwent recombinational shuffling.[66] The rate of homologous recombination differs between bacterial taxa and is particularly high in Streptomyces, which harbor a significant fraction of known PKSs. These bacteria undergo extensive HGT and recombination between species; these processes are recognized as being more important in sequence divergence than point mutation.[67] Homologous recombination within the same species is even higher,[68] and its importance for the diversification of PKS clusters has been demonstrated in the case of the avermectin producer, Streptomyces avermitilis.[65]

Models for Evolution of Assembly-Line PKSs

Current Model

In large part to account for the differences in the phylogenetic clustering of KS domains between cis-AT and trans-AT PKSs (see the Ketosynthase (KS) Domains section), the prevailing view states that assembly-line PKSs have evolved via two independent and fundamentally different mechanisms. For cis-AT PKSs, gene duplication within the same PKS gene cluster is thought to be the driver of their evolutionary diversification, whereas for trans-AT PKSs, recombination is the dominant process (Figure 3A).[21,32,34] However, the necessity to invoke these distinct mechanisms leads to several discordances.

Figure 3

Models of cis-AT and trans-AT PKS evolution. (A) It has been hypothesized that evolution of cis- versus trans-AT PKSs took distinct paths.[21,32,34] However, this dichotomy has some discordances. It does not explain the absence of iterative trans-AT PKSs, the convergence toward strikingly similar architectures despite different evolutionary paths, the presence of AT domain vestiges in trans-AT modules,[34] or (B) the inconsistency of the phylogenetic tree of cis-AT KS domains with this hypothesis.[64] The last inconsistency is exemplified by KS domains from four homologous 16-membered macrolide synthases (left; TYLS, tylactone synthase; CHMS, chalcomycin synthase; SRMS, spiramycin synthase; NIDS, niddamycin synthase). Under the current model, their KS domains would be expected to form groups of orthologous domains (center). In fact, most KS domains are grouped with paralogues from the same PKS (right). Protein sequence alignment was performed with ClustalOmega,[84] and the dendrogram was constructed using UPGMA hierarchical clustering. (C) The discordance in KS sequence alignment is a result of concerted evolution and can be explained by gene conversion events between KS domains.[64,82] Gene conversion leads to high sequence similarity between paralogous domains, causing them to cluster closer to each other than to their orthologues (e.g., teal square). Because gene conversion need not affect all domains within a PKS (e.g., red square), some of them maintain a phylogenetic pattern reflecting ancestral events that had led to the separation of homologous assembly-line PKSs. (D) An alternative model for assembly-line PKS evolution builds on the hypothesis that trans-AT PKSs evolved from cis-AT PKSs through loss of AT domains. In this model, the high sequence identity of KS domains in cis-AT PKSs would be explained by subsequent gene conversion events rather than ancestral gene duplications.
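The dendrogram in panel B was built by UPGMA hierarchical clustering of aligned KS sequences. As an aid to reading such trees, the sketch below implements UPGMA (size-weighted average linkage) in plain Python on an invented distance matrix. The labels A_KS1, A_KS2, B_KS1, and B_KS2 and all distances are hypothetical, and the code is not the authors' pipeline; it only shows how the same algorithm returns paralogue or orthologue groupings depending on which pairs are closest.

```python
def upgma(labels, dist):
    """Cluster items by UPGMA and return the tree as nested tuples.

    labels: list of names; dist: dict mapping frozenset({a, b}) -> distance.
    """
    clusters = {i: (name, 1) for i, name in enumerate(labels)}  # id -> (tree, size)
    d = {}
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            d[(i, j)] = dist[frozenset((labels[i], labels[j]))]
    next_id = len(labels)
    while len(clusters) > 1:
        a, b = min(d, key=d.get)  # closest pair of clusters
        (tree_a, n_a), (tree_b, n_b) = clusters.pop(a), clusters.pop(b)
        new_d = {}
        for c in clusters:
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            # Size-weighted mean distance from the merged cluster to cluster c.
            new_d[(c, next_id)] = (n_a * d_ac + n_b * d_bc) / (n_a + n_b)
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        clusters[next_id] = ((tree_a, tree_b), n_a + n_b)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree


def toy_distances(close_pairs, far=0.5):
    """Build an invented distance table: listed pairs are close, all others far."""
    names = ["A_KS1", "A_KS2", "B_KS1", "B_KS2"]
    table = {frozenset((x, y)): far
             for i, x in enumerate(names) for y in names[i + 1:]}
    table.update({frozenset(pair): value for pair, value in close_pairs.items()})
    return names, table


if __name__ == "__main__":
    # Paralogues (same PKS) closest, as observed for many cis-AT PKSs.
    names, table = toy_distances({("A_KS1", "A_KS2"): 0.1, ("B_KS1", "B_KS2"): 0.2})
    print(upgma(names, table))
    # Orthologues (same position, different PKS) closest, as the duplication model predicts.
    names, table = toy_distances({("A_KS1", "B_KS1"): 0.1, ("A_KS2", "B_KS2"): 0.2})
    print(upgma(names, table))
```

UPGMA assumes a constant rate of change along all branches, so a tree of this kind summarizes pairwise similarity and should not be read as a full phylogenetic reconstruction.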

Discordances in the Current Model

The above two-model hypothesis implies that multimodularity of assembly-line PKSs evolved independently at least twice and converged to an almost identical architecture. While not inconceivable, a single origin of multimodularity in cis-AT and trans-AT PKSs would be more parsimonious. Indeed, recent studies suggest that the two classes of assembly-line PKSs are more closely related than previously thought. Even though trans-AT PKS modules lack an AT domain, they usually contain a region called ATd, a subdomain nested between the KS and downstream domains.[69] It is structurally similar to the rigid KS-AT linker of cis-AT PKSs, a region that plays an important role in ACP docking during chain elongation and translocation.[70] The ATd subdomains of trans-AT PKSs often contain two additional helices, which have been proposed to facilitate lateral interactions between PKSs.[71] However, in some cases ATd subdomains also include a large fragment of the AT domain or even entire KS-AT didomains.[72] These KS-AT regions of various lengths may represent evolutionary intermediates between bacterial cis-AT and trans-AT PKSs. The evolution of trans-AT PKSs through AT domain loss would explain the absence of iterative trans-AT PKSs, which would be expected to exist had the two PKS groups evolved independently from two respective groups of iterative PKSs. Intriguingly, iterative cis-AT PKSs exist not only as stand-alone enzymes but are sometimes present as “stuttering” modules within an assembly-line PKS (reviewed in refs (1,73)). For example, the stigmatellin,[74] borrelidin,[75] aureothin,[76] and neoaureothin[48] synthases each harbor a module that performs more than one round of programmed chain elongation.[77,78] In other cases, module iterations are stochastic, leading to minor byproducts. 
For example, certain modules of DEBS and the epothilone synthase have been shown to iterate at measurable frequencies.[79,80] Although mechanisms have evolved to preclude back-transfer of polyketides in assembly-line PKSs (such as the “ratchet”[81]), these remnants of iterative functions could reflect the evolutionary origins of assembly-line PKSs. If cis-AT PKSs originated through module duplication, then it is also unclear why only the phylogeny of KS domains supports this model. One would expect other domains of duplicated modules to also be closely related. However, the non-KS domains are phylogenetically grouped by catalytic properties such as substrate specificity or stereospecificity rather than by the assembly line of origin (see Phylogeny of Catalytic Domains from Assembly-Line PKSs). While additional recombination events could explain this incongruity, in some cases (e.g., the avermectin synthase), the constituent modules would have had to undergo large-scale recombination in order for these assembly lines to have evolved by module duplication followed by recombination.[65] Finally, under the hypothesis that multimodularity of cis-AT PKSs evolved through module duplication, the phylogenetic tree of KS domains itself is discordant.[64] This is exemplified by a set of homologous PKSs producing 16-membered macrolides (Figure 3B). If module duplication preceded the diversification of the resulting assembly-line PKS into different homologous clusters, one would expect KS domains to be more distant from paralogous KS domains within the same PKS than from their orthologues. The phylogenetic tree shows a different pattern: for many PKSs, their paralogous KS domains have the highest sequence similarity. 
This discordance can be explained by extensive gene conversion between paralogous KSs: the rate of gene conversion has been estimated at 27%, and it has been shown to result in concerted evolution of PKS modules (Figure 3C).[64,82] If that is indeed the case, then the high sequence similarity between paralogous KS domains is the result of recent gene conversion events, rather than ancestral gene duplication that occurred during the emergence of assembly-line architecture.

Alternative Model

To resolve these discordances, we propose an alternative model for assembly-line PKS evolution that applies to both cis-AT and trans-AT PKSs (Figure 3D). Our model is based on the premise of extensive gene conversion between paralogous KS domains within the same cis-AT PKS, leading to repetitive regions of abnormally high sequence similarity within the same assembly line.[64,82] This would allow for an evolutionary process that is entirely analogous to the mosaic-like assembly proposed for trans-AT PKSs without the need to invoke extensive gene duplications.[34] In addition to presenting a simpler logic for trans-AT PKS evolution from cis-AT PKSs via the loss of AT domains, this model would also explain the absence of iterative trans-AT PKSs, the presence of AT domain remnants in many trans-AT PKSs, and the existence of assembly-line PKSs (e.g., the NOCAP synthase discussed below) that contain modules of both classes. The hypothesis of trans-AT PKS evolution through displacement of cis-AT PKS domains is also supported by phylogenetic evidence in algae.[83] Of course, further support for such a model would require clearer evidence for the role of gene conversion mechanisms in the evolution of assembly-line PKSs.

Model for Evolutionary Unit of an Assembly-Line PKS

Historically, the functional unit of a PKS was called a module: a polypeptide containing KS-AT-(DH-KR-ER)-ACP domains and able to perform one round of polyketide chain elongation and elaboration.[85,86] It is also an architectural, and hence genetic, unit: this domain order is conserved across vertebrate FASs, iterative PKSs, and cis-AT PKSs. However, it is unclear whether this genetic unit also corresponds to an evolutionary unit that has been preserved in multimodular PKSs. Each KS domain of an assembly-line PKS must interact with the ACP domain of its upstream module during chain translocation as well as the ACP domain of its own module during chain elongation; both reactions require specific protein–protein interactions (Figure A).[70,87] Genetic recombination between homologous modules can be expected to scramble one of these interfaces while preserving the other. The KS domains of trans-AT PKSs appear to have coevolved with their ACP partners from upstream modules.[37] Their evolutionary relationships also appear to be correlated to structural similarities between their substrates, as defined by the enzymatic domains observed in the reductive loops of upstream modules.[34,37] This suggests that the canonical evolutionary unit of trans-AT PKSs is the (DH-KR-ER)-ACP-KS domain sequence, which would preserve the chain translocation interface. In contrast, the evolutionary history of KS domains of cis-AT PKSs is obscured by two factors. First, they show lower specificity toward their substrates.[88] Second, gene conversion events discussed above mask some of the evolutionary history of cis-AT PKSs. 
Nonetheless, a recent analysis of aminopolyol PKSs has revealed coevolutionary relationships between KS domains and processing enzymes from upstream modules, suggesting that a typical evolutionary unit is either (DH-KR-ER)-ACP-KS-AT or AT-(DH-KR-ER)-ACP-KS.[38] While it remains unclear whether the evolutionary comigration of KS domains and ACP domains of upstream modules generalizes to all cis-AT PKSs, this hypothesis is supported by the observation that the post-AT linker may be a functionally effective splice point for natural recombination as well as evolutionarily inspired PKS engineering[89,90] (discussed in section ). These observations have led to a proposed redefinition of module boundaries from the “classical” KS-AT-(DH-KR-ER)-ACP toward “alternative” AT-(DH-KR-ER)-ACP-KS.[37,39] While these boundaries most likely correspond to the evolutionary unit of assembly-line PKSs, they are different from the functional, architectural, and genetic unit defined by the “classical” module boundaries. More research is warranted before this new definition can be universally accepted.
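The difference between the “classical” and “alternative” module boundaries amounts to where a linear string of domains is cut. The sketch below makes this concrete for an invented domain string; the function name, the boundary labels passed as arguments, and the toy assembly line are all hypothetical, and the reductive domains are listed in an arbitrary order.

```python
def split_modules(domains, boundary="classical"):
    """Split a linear domain list into modules.

    classical:   a new module starts at every KS (KS-AT-...-ACP).
    alternative: a module ends after every KS (AT-...-ACP-KS).
    """
    if boundary not in ("classical", "alternative"):
        raise ValueError("boundary must be 'classical' or 'alternative'")
    modules, current = [], []
    for domain in domains:
        if boundary == "classical" and domain == "KS" and current:
            modules.append(current)  # close the previous module before the KS
            current = []
        current.append(domain)
        if boundary == "alternative" and domain == "KS":
            modules.append(current)  # close the module right after the KS
            current = []
    if current:
        modules.append(current)
    return modules


if __name__ == "__main__":
    # Invented assembly line: loading AT-ACP, two extension modules, and a TE.
    line = ["AT", "ACP", "KS", "AT", "KR", "ACP", "KS", "AT", "ACP", "TE"]
    for mode in ("classical", "alternative"):
        print(mode, ["-".join(m) for m in split_modules(line, mode)])
```

With the alternative boundaries, each KS stays attached to the ACP of the upstream module, which keeps the chain translocation interface within a single unit.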

Factors Influencing the Evolution of Assembly-Line PKS Diversity

While the emergence of the earliest functional assembly-line PKSs undoubtedly set the stage for their subsequent diversification through mutation, HGT, gene conversion, and recombination, a general understanding of these molecular processes cannot explain the tremendous phenotypic diversification that subsequently emerged within this PKS family. To do so more satisfactorily, these processes have to be put into the context of environmental and genetic factors and considered from the perspective of evolutionary advantages that they provide.

Environmental Factors

Many microorganisms produce a vast array of secondary metabolites whose biological roles in nature are not yet understood.[91−96] For example, polyketide natural products are produced by organisms dwelling in diverse environments ranging from soil to marine and fresh water, from free-living to symbiotic or parasitic systems.[97−100] These environmental factors presumably contributed to shaping the structural diversity and biological activity of polyketide natural products; however, our understanding of the connections between microbial ecology and natural product biosynthesis is still emerging and will therefore not be discussed here.

Genetic Factors

The genetic factors influencing the evolution of assembly-line PKSs are also not well understood. In prokaryotes, assembly-line PKSs are mainly confined to actinobacteria, proteobacteria, firmicutes, and cyanobacteria, with an uneven distribution among bacterial groups within each phylum.[28,101,102] The distribution of the two types of PKS assembly lines is also nonhomogeneous: cis-AT PKSs are most common in actinobacteria, cyanobacteria, and proteobacteria, whereas trans-AT PKSs are more widespread in proteobacteria and firmicutes.[34] The evolutionary rationale for this uneven distribution is also unclear. Actinobacteria and especially Streptomyces are by far the most prolific producers and often harbor multiple PKS clusters in their genomes. The study of their genomes revealed several key points that have likely contributed to the diversity of their natural products. First, Streptomyces contain numerous plasmids, integrative and conjugative elements, and genomic islands that carry biosynthetic clusters and can increase the rate of their horizontal gene transfer.[103−105] Identification of gene clusters on these mobile genetic elements highlights their biological relevance in horizontal gene transfer.[106] Second, their genomes favor the formation and recombination of multiple biosynthetic gene clusters: Streptomyces chromosomes are large (6–12 Mb), linear, and unstable. 
PKS clusters can span several hundreds of kilobases, and genome size scales almost linearly with the number of PKS clusters, suggesting that larger genomes are more likely to contain multiple clusters.[107] The linear structure and the instability of Streptomyces chromosomes contribute to the overall genomic plasticity that involves frequent HGT, recombination, gene duplication, and deletion.[108,109] Third, the GC content of DNA is highly correlated with recombination frequency in different organisms, even though the causality of these effects is not entirely clear.[110] Streptomyces are no exception to this rule, and their high GC content (>70%) is matched by a high recombination rate. PKS diversification in cyanobacteria has also been attributed to HGT, recombination, gene duplication, and deletion, but no specific genetic trait can explain the observed diversity of secondary metabolites in this phylum.[111,112] Even less is known about the genetic factors that contribute to PKS diversification in other bacteria, and more research would be needed to elucidate the underlying molecular mechanisms.

Evolutionary Advantages

Two conceptually different perspectives exist on the role and diversification of natural products.[21] According to a more traditional viewpoint, the evolutionary advantage conferred by the function of the molecule constitutes the trait under selection.[113] Here, every molecule produced by a biosynthetic cluster must have an advantageous biological activity to justify the metabolic cost of its production and to be selected for. The alternative model, also referred to as the “screening hypothesis”,[114,115] presumes that the selected trait is the adaptability itself, i.e., the capacity to generate and maintain the chemical diversity of secondary metabolites that can be screened for advantageous properties when needed. This model does not require all molecules to have a beneficial function, so long as a few molecules provide enough advantage to maintain the entire system. The practical implications of the two models for natural product chemistry are quite distinct. The first model implies that screening for bioactive molecules holds great promise for the discovery of novel molecules of therapeutic interest. On the other hand, the second model anticipates that most natural products do not have measurable bioactivity and that a large library would be needed to screen for new therapeutics. While the available body of knowledge is insufficient to provide conclusive evidence, the difficulty in finding compounds with measurable bioactivity suggests that the screening hypothesis may be more realistic. However, this hypothesis also suggests that the mechanisms that create diversity are remarkable and that their success rate is sufficient for their presence to be selected for in bacterial genomes. This would imply that these same mechanisms can be leveraged to generate diversity in the laboratory and open new avenues for assembly-line PKS engineering (discussed in section ). 
Determining the precise order of events leading to the appearance of contemporary assembly-line PKSs would be an extremely challenging task. However, insights gained from this research can inform us about the best strategies to pursue in future evolutionarily inspired engineering approaches.

Diversity of Orphan PKSs

As discussed above, the mechanisms and selective pressures involved in the evolution of assembly-line PKSs have led to astounding polyketide diversity. In this section, we will present computational approaches for estimating this natural diversity, including an updated catalogue of assembly-line PKSs found in the NCBI database. The number of novel clusters sequenced every year reflects the vastness of the PKS sequence space and the extent to which polyketide structural diversity is underexplored.

Catalogues of Assembly-Line PKSs

Biosynthetically characterized PKSs have been catalogued in a variety of databases. For example, CSDB,[116] ClusterMine360,[117] SBSPKS v2,[118] and DoBISCUIT[119] include manually curated lists of 150–300 known microbial PKSs and NRPSs, including many assembly-line PKSs. The more recent MIBiG repository is the result of a community effort to facilitate the standardized deposition and retrieval of BGCs responsible for making known natural products. As of August 2018, MIBiG includes over 250 assembly-line PKSs, including PKS-NRPS hybrids.[120] However, BGCs that make known polyketides only offer a narrow glimpse into the diversity of assembly-line PKSs. A powerful approach to evaluate the actual diversity of PKS clusters and their products is through computational analysis of sequence databases. Algorithms such as antiSMASH,[121] ClusterFinder,[122] PRISM,[123] and others (reviewed in refs (124,125)) allow users to mine sequenced data for microbial BGCs and predict the biosynthesized product. Additional algorithms can improve predictions for certain types of clusters: for example, NaPDoS uses domain phylogeny to predict PKS and NRPS products,[126] while the TransATor allows a more accurate prediction of trans-AT PKS products based on the substrate specificity of their KS domains.[127] Despite significant advances of in silico prediction algorithms, determining the structure of a polyketide from its BGC sequence alone remains an elusive goal. Nonetheless, computational analysis of BGCs can be used to estimate the diversity of assembly-line PKSs and their products. AntiSMASH is a particularly powerful and widely used tool for identifying and annotating bacterial BGCs,[121] with many additional functionalities becoming available in each new release. 
(antiSMASH 5.0 is the most recent one.[128]) Several other databases contain data from large-scale genome mining, such as IMG-ABC (∼150 PKSs sequenced at the Joint Genome Institute)[129] and antiSMASH database 2.0 (over 3000 PKSs from publicly available microbial genomes).[130] In 2013, we catalogued all nonredundant assembly-line PKSs available in the NCBI databases[102] and identified 885 nonredundant PKSs, most of which produced unknown compounds. (These uncharacterized PKSs were referred to as “orphans”.) Given the rapidly increasing number of genomes deposited into sequence archives, we have updated this catalogue to obtain a snapshot of assembly-line PKSs sequenced to date.

Updated Catalogue of Orphan Assembly-Line PKSs

The general strategy for compiling and phylogenetically analyzing all assembly-line PKSs has been described previously.[102] Briefly, a consensus ketosynthase (KS) sequence was aligned using BLAST against nine NCBI DNA databases as well as the archive for whole-genome shotgun sequences available as of May 2018. To select for multimodular PKSs, BLAST hits were refined by requiring a minimum of 3 KS domains located within 20 kb of each other, and the PKS gene clusters that met this criterion were further analyzed by antiSMASH 4.0.[131] Identical PKSs were eliminated based on either an identical sequence or an identical domain architecture in the same species. From the remaining PKSs, the sequences of individual PKS and NRPS proteins were extracted and subjected to comparative pairwise analysis using BLAST, with similarity scores calculated as described in ref (102). PKSs that scored more than 90% in amino acid similarity were considered redundant, yielding the final catalogue of distinct assembly-line PKSs (Figure ). As before, cluster similarity scores were visualized in the form of a dendrogram.
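The two filtering steps in this workflow can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the input format (one `(contig_id, start_bp)` tuple per KS BLAST hit), the chaining of neighbouring hits, and the greedy order of the redundancy filter are all assumptions made for illustration. Only the thresholds (3 KS domains, 20 kb, 90% similarity) come from the text.

```python
from collections import defaultdict

MAX_GAP_BP = 20_000   # "within 20 kb of each other" (threshold from the text)
MIN_KS_HITS = 3       # minimum number of KS domains (threshold from the text)


def candidate_clusters(hits, max_gap=MAX_GAP_BP, min_hits=MIN_KS_HITS):
    """Group KS hits into candidate multimodular PKS loci.

    hits: iterable of (contig_id, start_bp) tuples, one per KS BLAST hit.
    Assumption: consecutive hits on a contig are chained into one group
    when their start coordinates are at most max_gap apart.
    """
    by_contig = defaultdict(list)
    for contig, start in hits:
        by_contig[contig].append(start)

    clusters = []
    for contig, starts in by_contig.items():
        starts.sort()
        group = [starts[0]]
        for pos in starts[1:]:
            if pos - group[-1] <= max_gap:
                group.append(pos)
            else:
                if len(group) >= min_hits:
                    clusters.append((contig, group))
                group = [pos]
        if len(group) >= min_hits:
            clusters.append((contig, group))
    return clusters


def remove_redundant(names, similarity, cutoff=90.0):
    """Greedy redundancy filter (hypothetical ordering).

    A PKS is kept only if it scores at most `cutoff` percent against
    every PKS that has already been kept.
    """
    kept = []
    for name in names:
        if all(similarity(name, k) <= cutoff for k in kept):
            kept.append(name)
    return kept
```

In this sketch `similarity` is any callable returning a pairwise percent score; the published scores were derived from BLAST comparisons as described in ref (102), which the sketch does not reproduce.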

Figure 4

Summary of the workflow to generate the catalogue of distinct assembly-line PKSs. In the final clustering schematic, the red line represents a PKS sequence that scored higher than 90% in amino acid similarity to another sequence and was thus removed from the catalogue of distinct clusters.

A total of 3551 distinct clusters from 1662 species were catalogued, representing a 4-fold increase over the data set from five years prior. Among these, 1692 clusters were annotated as cis-AT PKS clusters, 975 as cis-AT PKS/NRPS hybrids, 293 as trans-AT PKSs, 343 as trans-AT PKS/NRPS hybrids, and 248 as other hybrids. The full list of nonredundant assembly-line PKS clusters and the dendrogram visualizing their distances are available online at http://web.stanford.edu/group/orphan_pks/. It should be noted that, although our number of PKS clusters closely matches the number listed in the antiSMASH 2.0 database (3302 type I PKSs and 623 trans-AT PKSs),[130] the two catalogues are complementary, not identical, because the analyses differed in terms of NCBI databases, PKS cluster types, and sequence similarity cutoffs. Nonetheless, both of them reflect the vast numbers of assembly-line PKS clusters present in nature.

Evaluating the Product Diversity of Orphan Assembly-Line PKSs

On the basis of the date when each PKS sequence in our catalogue was deposited in the NCBI database, it appears that the number of distinct assembly-line PKSs continues to grow exponentially, doubling every 2.5 years (Figure A, blue bars). This rate of discovery is consistent with the overall growth of NCBI sequencing data in GenBank. The vast majority of these clusters are orphan; by using the MIBiG database and NCBI annotations, we estimate that only around 10% of assembly-line PKSs in our catalogue have been linked to the production of a known molecule (Figure A, red bars).
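As a back-of-the-envelope consistency check, the doubling time can be compared with the two catalogue sizes quoted in this review (885 PKSs in the 2013 catalogue and 3551 in the May 2018 update). The sketch below assumes pure exponential growth between these two time points, five years apart; it is not the fit underlying the figure, and the function name is hypothetical.

```python
import math


def doubling_time(n_start, n_end, years):
    """Doubling time in years, assuming exponential growth between two counts."""
    return years * math.log(2) / math.log(n_end / n_start)


# Catalogue sizes quoted in the text (2013 catalogue vs. May 2018 update).
fold = 3551 / 885
t_double = doubling_time(885, 3551, years=5)
print(f"fold increase: {fold:.2f}; doubling time: {t_double:.2f} years")
```

A roughly 4-fold increase over five years corresponds to two doublings, which agrees with the stated doubling time of about 2.5 years.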

Figure 5

(A) The discovery rate of distinct clusters is shown (blue; having less than 90% amino acid sequence similarity score to any other cluster). Also shown (in red) is the number of clusters with known products, determined using MIBiG database and NCBI annotations. For years 1994–2017, numbers reflect sequences deposited by December of that year. For 2018, only sequences deposited by May were taken into account. (B) Rediscovery rate among nucleotide sequences deposited to NCBI, determined as the percentage of redundant clusters (having more than 90% amino acid sequence similarity score to a previously sequenced cluster). (C) Distribution of sequence similarity scores between an orphan assembly-line PKS and its closest neighbor whose product has been characterized. PKSs with pairwise similarity scores above 50% probably make structurally similar polyketides, while orphan PKSs whose sequences show greater differences from those of any known PKS most likely produce novel chemotypes. (D) The red line plots the percentage of all distinct assembly-line PKSs that are chemically decoded. The blue line plots the percentage of orphan PKSs that are more than 50% similar to a chemically decoded assembly-line PKS.

A major challenge in traditional natural product discovery is the high rate of rediscovery of a given molecule. Even among sequences deposited into NCBI databases, the number of redundant assembly-line PKS clusters has been increasing, reaching 51% by mid-2018 (Figure B). As one continues exploring PKS diversity, this will likely become even more problematic. The number of distinct assembly-line PKSs is astonishing in itself. However, it is even more interesting to consider the diversity of these clusters and their products. By eliminating redundant clusters (above 90% similarity score) from the catalogue, we sought to estimate the number of clusters producing different molecules. 
On the basis of similarities between nine 16-membered macrolide PKSs (46–89% similar, with a mean of 56%), we assume that assembly lines with higher than 50% similarity could make identical or very similar molecules. (As a point of reference, the tylactone and rosamicin synthases are 72% similar and produce the same polyketide backbone.) It should be noted that the tailoring enzymes associated with a given biosynthetic cluster can differ even for PKSs with high sequence similarity and give rise to distinct natural products. Nonetheless, on the basis of the above arguments, we assume that PKSs that are less than 50% similar most likely produce polyketide products that could be regarded as distinct chemotypes. By evaluating the maximum sequence similarity of orphan assembly-line PKSs to any previously characterized PKS, it is possible to estimate how different their products are from known polyketide natural products. Remarkably, more than one-half of all orphan assembly lines show less than 50% sequence similarity to any known PKS (Figure C). Although the rate of chemically decoding orphan assembly-line PKSs cannot possibly keep up with their discovery (Figure D, red line), it appears that the fraction of orphan PKSs making polyketides whose structures are related to known natural products is increasing (Figure D, blue line). This fraction, however, is likely an overestimate because only ca. 20% of the emerging orphan PKSs preserve the same architecture over the entire assembly line. (Most of the orphan PKSs that comprise the blue line statistics in Figure D share their assembly-line architecture with a substantial portion, but not all, of a characterized PKS.) Nonetheless, the overall upward trend suggests that, while modern genomics-driven natural products discovery may be steadily sampling the actual diversity of assembly-line PKSs in nature, the major part of this diversity has not yet been explored.
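The closest-neighbor comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, `similarity` stands in for the pairwise score of ref (102), and assigning a score of exactly 50% to the "novel" group is an arbitrary choice that the text does not specify.

```python
NOVELTY_CUTOFF = 50.0  # percent similarity threshold used in the text


def closest_known(orphan, known, similarity):
    """Return (name, score) of the characterized PKS most similar to one orphan."""
    return max(((k, similarity(orphan, k)) for k in known), key=lambda pair: pair[1])


def split_by_novelty(orphans, known, similarity, cutoff=NOVELTY_CUTOFF):
    """Partition orphans by their best score against any characterized PKS.

    related: best score above the cutoff (likely a similar polyketide)
    novel:   best score at or below the cutoff (candidate new chemotype)
    """
    related, novel = [], []
    for orphan in orphans:
        _, score = closest_known(orphan, known, similarity)
        (related if score > cutoff else novel).append(orphan)
    return related, novel
```

The fraction `len(related) / len(orphans)` in this sketch corresponds to the quantity plotted as the blue line in panel D, subject to the caveat about partial architectural matches noted above.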

Figure 6

Network of 3551 distinct assembly-line PKS clusters, visualized by Cytoscape 3.7.2.[133] Nodes correspond to known (larger circles) and orphan (smaller circles) PKSs and are color-coded according to antiSMASH predictions (legend). Edges represent >50% sequence similarity between two clusters, calculated as described in ref (102).

Similarity Network of Assembly-Line PKSs

The diversity of assembly-line PKSs can be visualized as a network (Figure ). Sequence similarity networks are a useful tool for analyzing relationships within a protein family.[132] Individual PKS sequences are represented as nodes (circles), while pairs of PKSs with sequence similarity above a certain threshold are shown as edges (lines) where an edge’s length correlates with the relative dissimilarity between the PKS pair. (The relative position of disconnected groups has no meaning.) Unlike dendrograms that only show optimal connections, networks allow visualization of all relationships above a threshold. In our analysis, the threshold of pairwise cluster similarity was 50%, and networks of distinct assembly-line PKSs were visualized using Cytoscape 3.7.2.[133] Orphan PKS nodes (smaller circles) not connected to any node corresponding to a characterized PKS (larger circle) highlight the unexplored diversity of assembly-line PKSs. From these data, it is apparent that cis-AT PKSs (red) and their PKS/NRPS hybrids (orange) separate from trans-AT PKSs (dark blue) and their PKS/NRPS hybrids (light blue). This may be due to their nonuniform distribution among bacterial phyla. Indeed, the main group in the top left corner almost exclusively comprises actinobacterial PKSs (regardless of subclass), whereas the two large groups to its right are composed of cyanobacterial and firmicute PKSs, respectively (Supporting Information, Figure S2). 
A few examples of PKSs with >50% similarity are found in species belonging to different phyla, supporting the theory that even though HGT has played an important role in assembly-line PKS evolution, it does not occur frequently between phyla.[20] Overall, these networks reveal promising opportunities for the exploration of polyketide diversity in nature. On one hand, orphan PKSs belonging to a large, tightly connected network that includes at least one known PKS may warrant investigation, as they could yield natural products with related properties. On the other hand, by exploring a disconnected group of orphan PKSs, one could discover truly novel polyketide structures and bioactivities. Such disconnected groups include, for example, PKSs that biosynthesize the DNA chelator colibactin,[134] the antimitotic agent rhizoxin,[135] and the pre-mRNA splicing inhibitor FR901464.[136]
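The network construction and the search for disconnected orphan groups can be sketched in plain Python. The published network was built and drawn in Cytoscape; the code below only illustrates the underlying graph logic, and the function names and the input format (a dictionary of pairwise percent scores) are assumptions.

```python
from collections import deque


def similarity_network(scores, threshold=50.0):
    """Build an adjacency map with an edge for each pair scoring above threshold.

    scores: dict mapping (name_a, name_b) -> percent similarity.
    Every PKS named in `scores` becomes a node, even if it has no edges.
    """
    graph = {}
    for (a, b), score in scores.items():
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if score > threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_groups(graph):
    """Return the connected components of the network as a list of sets."""
    seen, groups = set(), []
    for start in graph:
        if start in seen:
            continue
        group, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node] - group:
                group.add(neighbor)
                queue.append(neighbor)
        seen |= group
        groups.append(group)
    return groups


def unexplored_groups(graph, known):
    """Groups that contain no characterized PKS (orphans only)."""
    return [g for g in connected_groups(graph) if not g & set(known)]
```

Groups returned by `unexplored_groups` correspond to the disconnected orphan groups discussed above, whereas orphans that share a group with a characterized PKS are candidates for producing related compounds.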

Eukaryotic PKS Clusters

While a vast majority of chemically decoded assembly-line PKSs are from bacterial sources, it has become clear within the past decade that eukaryotic genomes also encode a number of these megasynthases (Figure A). When evolutionary relationships were visualized on a dendrogram or in a network, 59 distinct eukaryotic assembly-line PKSs clustered across several groups (Supporting Information, Figure S2 and the dendrogram available online).

Figure 7

(A) Distribution of assembly-line PKSs among the different phyla. (B) The nemamide PKS from C. elegans, described in ref (137). KS, ketosynthase; AT, acyltransferase; KR, ketoreductase; DH, dehydratase; C, condensation domain; A, adenylation domain; ACP, acyl carrier protein; PCP, peptidyl carrier protein; TE, thioesterase.

One such group includes assembly-line PKSs from nematodes, originally identified as orphan PKSs.[102] More recently, the hybrid PKS-NRPS from Caenorhabditis elegans has been decoded as a producer of the nemamide family of natural products (Figure B).[137] These remarkable molecules are regulators of starvation-induced larval arrest. Another cluster of eukaryotic assembly-line PKSs is found in soil-dwelling social amoebae from the Dictyostelium genus. Genome sequencing has revealed more than 40 PKSs in Dictyostelium discoideum.[138] So far, only iterative PKSs have been characterized from these species,[139−141] and the chemistry and biology of polyketide products of Dictyostelium assembly-line PKSs remain unknown. Assembly-line PKSs are also found in various eukaryotic protists. Apicomplexan parasites such as Cryptosporidium, Toxoplasma, and Eimeria contain assembly-line PKSs that appear to produce fatty acid components of the rigid wall of their oocysts, thereby ensuring transmission of the pathogen between hosts.[142−144] These protists also encode assembly-line PKSs that appear to produce more oxygenated metabolites of unknown structure.[145] Related PKSs are found in the dinoflagellate Gambierdiscus. Dinoflagellates possess some of the largest genomes among eukaryotes, and very few whole-genome sequences are available in the NCBI database. 
However, transcriptomic analyses have revealed numerous PKSs in dinoflagellates and suggest that these marine protists are a large reservoir of these enzymes.[146,147] Their expression in Gambierdiscus has been linked to polyether toxins released during algal blooms.[148] Other eukaryotic species harboring assembly-line PKSs include phytopathogenic fungi, fish, arthropods, and mollusks. So far, the evolutionary history of eukaryotic assembly-line PKSs remains cryptic. It is possible that their patchy occurrence reflects a loss of this PKS family from most eukaryotic lineages. Alternatively, eukaryotic PKSs could have been acquired from prokaryotes during secondary endosymbiosis or resulted from more recent interkingdom HGT events.[83] Regardless, the diversity of molecules synthesized by eukaryotic assembly-line PKSs and their relevance to host development or pathogenicity suggest that they represent an underexplored source of bioactive natural products.

Prioritizing PKS Clusters for Further Study

In general, bioinformatic decoding of orphan assembly-line PKS chemistry is outside the realm of feasibility today, making experimental analysis a necessity. Given the abundance and the diversity of orphan PKSs, methods for prioritizing clusters for deorphanization are crucial.[149] The classic method, consisting of screening for biological activity in the native host, remains laborious and technically challenging[150] but has nonetheless benefitted enormously from state-of-the-art untargeted metabolomics approaches. A vivid example can be found in the work leading to the discovery of nemamides (Figure B).[137] Alternatively, culture-independent methods are also being used for compound prioritization,[151] and computational tools are becoming increasingly helpful in such pursuits.[152] Ultimately, given the enormous gap between the pace of discovering orphan PKS assembly lines and their deorphanization, selecting a target orphan PKS for further analysis is a subjective exercise. To help genome mining approaches navigate this diversity in search of the most interesting and novel clusters, several computational approaches are being developed. On one hand, cluster prioritization can be based on the novelty of the product’s chemical structure, often reflected by the orphan cluster’s evolutionary relationships with known BGCs. At the protein level, EvoMining reconstructs evolutionary histories of biosynthetic enzymes in an attempt to find clusters that produce molecules with novel chemical structures, the so-called “chemical dark matter”.[153,154] At the cluster level, the combination of the BiG-SCAPE and CORASON tools makes it possible to analyze BGC similarity networks and cluster group phylogenies, directing genome mining either toward the discovery of molecules with novel structures or toward the exploration of analogues of known compounds.[155] On the other hand, cluster prioritization can be based directly on the nature of enzymes present in the cluster. 
For instance, the ARTS tool predicts clusters that are more likely to produce a molecule with an antibiotic activity based on the presence of self-resistance genes: BGC-encoded genes that are homologous to the antibiotic target gene yet harbor mutations that confer resistance.[156] However, in silico prioritization of clusters is only the first step toward deorphanizing a BGC through experimental approaches. We recently identified a distinct clade of NOCAP (nocardiosis-associated polyketide) synthases only observed in 12 clinical strains of Nocardia isolated from nocardiosis-affected patients. Using both direct in vitro reconstitution from purified proteins and Escherichia coli as a heterologous host for polyketide biosynthesis, we characterized an unprecedented set of polyketides (ref (157), and Yuet et al., in preparation). While this example validates the utility of a combined in vitro and in vivo approach to deorphanize assembly-line PKS clusters identified through in silico analysis, it also highlights the importance of a careful choice of targets.

Accessing PKS Diversity

In the previous section, we discussed the diversity of orphan assembly-line PKSs and introduced potential strategies for prioritizing these promising sources of new natural products for further analysis. However, the process of producing a novel polyketide and determining its structure heavily relies on wet laboratory techniques. Genetic manipulation and chemical analyses are at the core of the efforts to explore the natural diversity of polyketide compounds. In this section, we delve deeper into state-of-the-art methods for connecting orphan assembly-line PKSs to their natural products. If an organism encoding an orphan PKS can be cultured and genetically manipulated, then promoter mutagenesis followed by metabolic profiling can enable natural product discovery. Alternatively, heterologous expression of an entire biosynthetic pathway in a well-established host, as in the case of the NOCAP synthase, can achieve the same goal. Here, we briefly review these two approaches.

Expressing Assembly-Line PKSs in Heterologous Hosts

Heterologous hosts such as E. coli have significant genetic and growth advantages over native hosts, thus allowing expression of BGCs from unculturable organisms. The most widespread approach to transferring BGCs into a heterologous host is direct cloning (Figure ). Because of the difficulty of handling large DNA fragments, researchers have developed tools based on homologous recombination to precisely capture the BGC of interest.

Figure 8

A general workflow for expressing assembly-line PKSs in heterologous hosts.

Phage Recombination-Assisted Cloning

Direct cloning of PKS gene clusters can be accomplished in E. coli with the assistance of phage recombination systems such as phage lambda-derived Red[158] and phage Rac-derived RecET.[159] For example, Photorhabdus luminescens TT01 harbors 10 unexplored secondary metabolic pathways. By using the full-length RecET in E. coli, these BGCs (10–52 kb) were recombined onto pSC101-based expression vectors. Seven gene sets were cloned successfully, two of which were expressed in E. coli Nissle 1917, leading to the identification of the new metabolites luminmycin A and luminmide A/B.[160] The use of RecET also enabled heterologous production of disorazol in Myxococcus xanthus[161] and salinomycin in Streptomyces coelicolor.[162] A cryptic hybrid PKS-NRPS from Paenibacillus larvae was cloned and activated in E. coli, leading to the production of a novel compound, sevadicin.[163]

Transformation-Associated Recombination Cloning

Transformation-associated recombination (TAR) techniques exploit homologous recombination in Saccharomyces cerevisiae to rapidly “capture” large gene clusters directly from genomic DNA.[164−166] For example, researchers studying marinopyrrole biosynthesis in Streptomyces sp. CNQ418 found that TAR could enable direct cloning within days while phage-mediated homologous recombination methods such as λ Red/ET recombineering have turnaround times of months.[167,168] A shuttle vector pTARa was developed containing three components for shuttling among three organisms: yeast, E. coli, and Streptomyces. CEN6 (centromere in chromosome VI) and ARS4 (autonomously replicating sequence 4) sequences as well as a URA3 selection marker allow for gene cluster assembly and propagation in S. cerevisiae. Bacterial artificial chromosome elements and a chloramphenicol resistance cassette allow for maintenance and verification in E. coli. An apramycin resistance cassette and the phage ϕC31 integration system enable site-specific chromosomal integration of the cluster in a number of different Streptomyces strains, including Streptomyces toyocaensis, Streptomyces lividans, and Streptomyces albus.[169] Using pTARa, these investigators directly cloned the 56 kb colibactin biosynthetic gene cluster from Citrobacter koseri, a gut bacterium. Multiple biosynthetic gene clusters, including an 89 kb orphan NRPS gene cluster, were also directly cloned or reassembled from cosmid DNA libraries.

pCAP-Based Transformation-Associated Recombination Cloning

More recently, the TAR cloning strategy has been adapted onto pCAP01, a shuttle vector.[170] Unlike pTARa, pCAP01 can be maintained at a higher copy number in E. coli, even with large (>50 kb) inserts. In addition, the φC31 integration elements in pCAP01 allow its site-specific integration into chromosomes of a broader range of heterologous actinobacteria.[171] Using λ-Red recombination-based methods, the 30 kb marinopyrrole and the transcriptionally silent 67 kb orphan tar BGC were cloned from Streptomyces sp. CNQ418 and expressed in Streptomyces coelicolor M152, leading to heterologous production of marinopyrrole and taromycin A, respectively.[168] Despite its relatively rapid workflow, this method requires a considerable amount of colony screening due to high levels of unproductive pCAP01 recircularization by nonhomologous end joining, resulting in capture rates below 2%. To minimize plasmid recircularization by nonhomologous end joining, the URA3 gene encoding the S. cerevisiae orotidine 5′-phosphate decarboxylase was introduced into pCAP01 as a counter-selectable marker, yielding pCAP03.[172] This vector was used to capture and express in Streptomyces coelicolor M1152 thiotetronic acid-producing 22 kb PKS/NRPS biosynthetic gene clusters from Salinispora pacifica CNS-863 after screening only 12 transformants (eight of which were positive; 75% capture rate) and Streptomyces afghaniensis after screening 10 transformants (two of which were positive; 20% capture rate). The method has been extended to capture clusters associated with the production of amicoumacin[173] and colibactin.[174] Primary limitations to the use of pCAP01 and pCAP03 include a requirement for restriction enzymes that cut at sites flanking, but not within, a biosynthetic gene cluster of interest, and the need for high-quality high-molecular weight genomic DNA to capture clusters larger than 50 kb. 
Another limitation inherent in all TAR-based methods involves the relatively slow growth rates of yeast.

Cas9-Assisted Targeting of Chromosome (CATCH) Cloning

To address the above challenges, a Cas9-assisted targeting of chromosome (CATCH) cloning strategy was developed.[175,176] In this method, bacteria are embedded in low-melting-temperature agarose gel, treated with lysozyme and proteinase K, and washed to yield high-quality high-molecular weight genomic DNA stabilized by agarose. The genomic DNA is then cleaved with CRISPR-Cas9 endonuclease directed by guide RNAs to digest specific sequences flanking the cluster of interest, bypassing the need for restriction enzymes. Avoiding homologous recombination in yeast altogether, the genomic DNA fragments are recovered by digestion with agarase, purified, and ligated into vectors with homologous 30 bp arms by Gibson assembly.[177] The reaction mixture is electrotransformed into E. coli. CATCH is rapid, taking ca. 8 h effort over several days, and yields positive clones varying from 20% (for 100 kb test inserts) to 60% (for 50 kb test inserts). The 78 kb bacillaene assembly-line PKS from Bacillus subtilis was successfully cloned after screening 102 transformants (12 of which were positive; 12% capture rate).

Site-Specific Recombination Cloning

Another direct cloning strategy, based on the site-specific recombinase system Cre/loxP, has been developed for assembly-line PKSs.[178] First, loxP sites are integrated flanking the gene cluster of interest, together with elements needed for plasmid replication. Then the Cre recombinase is expressed, and the whole region containing the gene cluster flanked by loxP is circularized as a plasmid. The resulting plasmid is isolated via transformation into E. coli. A 78 kb DNA fragment containing a siderophore biosynthetic gene cluster from Agrobacterium tumefaciens C58 was cloned with this strategy. An analogous method based on ΦBT1 integrase-mediated recombination, which involves one fewer electroporation or conjugation step, was used to clone the entire 55 kb erythromycin BGC.[179] Seven clones (out of a total of 20 E. coli colonies) selected for restriction enzyme verification harbored this BGC.

Activating Assembly-Line PKSs in Native Hosts

Some microorganisms harbor dozens of BGCs, many of which encode orphan assembly-line PKSs. For example, certain strains of Streptomyces are capable of producing as many as 50 distinct natural products.[180] However, many of these BGCs are tightly regulated.[181] For organisms that are culturable and amenable to genetic manipulation, researchers rely on either overexpressing positive transcription regulators or deleting negative regulators to activate these normally silent BGCs. For example, Bibb and co-workers identified a cryptic 29.5 kb gene cluster containing both modular type I and type III PKSs from Streptomyces venezuelae that was predicted to encode a biaryl metabolite, venemycin.[182] However, both the native host and a heterologous Streptomyces coelicolor host harboring this cluster yielded insufficient venemycin for structural analysis. To overcome this challenge, they overexpressed vemR, a transcriptional activator from the ATP-binding LuxR-like (LAL) family, with the constitutive promoter ermE* in both strains, resulting in the production of adequate venemycin for structural characterization, confirming its unusual biaryl structure. Similarly, an orphan ansamycin PKS cluster was activated in Streptomyces sp. XZQH13 by constitutive expression of another LAL family regulator gene astG1, leading to the isolation of two known ansatrienins: hydroxymycotrienin A and thiazinotrienomycin G.[183] Another orphan ansamycin PKS cluster was activated in Streptomyces sp. LZ35 by constitutive overexpression of a LuxR family transcriptional regulatory gene, leading to the discovery of three new naphthalenic ansamycins, neoansamycins A–C.[184] This approach can be further developed for high-throughput activation of silent BGCs. 
As a step in this direction, CRISPR/Cas9 methods have been used to delete genes[185] or knock in promoters in Streptomyces.[186] Notably, a promoter knock-in strategy was used to activate BGCs of different classes (type I, II, and III PKSs, NRPS, hybrid PKS-NRPS, and phosphonate) in multiple Streptomyces species.[186] Along similar lines, CRISPRi, which utilizes a catalytically dead Cas9 to interfere with gene expression in a sequence-specific manner, has been used to repress transcription of negative regulatory genes.[187] The efficacy of these strategies is pathway-specific. For example, if a BGC contains multiple operons, then overexpression of one activator or knock-in of a single promoter may not be sufficient for activation. As an alternative to genetic manipulation, varying culture conditions such as media composition, aeration, culture vessel, and addition of enzyme inhibitors is sometimes sufficient to activate multiple biosynthetic gene clusters from a single strain. This “one strain–many compounds” (OSMAC) approach has been used to isolate more than 100 compounds (belonging to more than 25 different structural classes) from only six different microorganisms: Aspergillus ochraceus DSM 7428, Sphaeropsidales sp. F-24′707, Streptomyces sp. Gö 40/14, Streptomyces parvulus Tü 64, Streptomyces sp. A1, and Streptomyces Tü 3634.[188] Ribosome engineering is another effective approach to globally activate BGCs.[189−191] Strains of interest can also be cocultured with other organisms, resulting in interspecies crosstalk that acts to activate silent biosynthetic gene clusters.[192] While the above methods are applicable to many hosts, the resulting physiological disturbances are global, making comparative metabolic profiling challenging.

Expanding PKS Diversity through Engineering

Since the discovery of assembly-line PKSs in the 1990s, numerous attempts to reprogram them have been explored, prompted by their modular architecture. PKS engineering promises to expand polyketide diversity beyond the chemical landscape of natural compounds, for instance, by introducing small structural changes that improve a molecule’s bioactivity or bioavailability, much-sought results in medicinal chemistry. However, the task of PKS engineering is challenging. In this section, we will discuss how an understanding of the architecture and enzymatic reactions of assembly-line PKSs has gradually shifted engineering approaches from rational design toward evolution-inspired strategies (Figure 9).

Figure 9

Over time, assembly-line PKS engineering has shifted from rational design (e.g., by domain swapping, module swapping, and module insertion) and combinatorial engineering (e.g., through in vitro combinatorial assembly) toward evolution-inspired approaches (e.g., use of natural splicing points, or inter- and intra-PKS recombination).

Combinatorial and Rational Engineering of Assembly-Line PKSs

Early attempts at combinatorial assembly explored the possibility of generating libraries of “unnatural” natural products.[193−195] However, it soon became clear that such approaches are not straightforward: reordering domains or modules derived from naturally occurring PKSs often results in catalytically compromised assemblies.[196] Subsequent approaches in combinatorial engineering of PKSs using module swapping confirmed that most hybrid assemblies turned over poorly,[197,198] which halted further efforts in this direction. More recently, a computational platform ClusterCAD was developed for streamlining the design of chimeric PKSs, potentially providing new opportunities for combinatorial polyketide biosynthesis.[199] In parallel to high-throughput combinatorial approaches, rational design strategies for accessing novel compounds were also explored. These included deletion, insertion, or replacement of intact domains and modules, engineering of substrate specificity, and metabolic supply of alternative precursors (reviewed in refs (200−202)). However, these approaches also encounter difficulties in generating fully active PKSs. For nearly two decades, it has been clear that most of these difficulties stem from our inability to engineer essential protein–protein interactions involved in intramodule and intermodule chain processing and from insufficient knowledge about the specificity of different domains toward alternative substrates.[203] Despite extensive structural and biochemical analysis,[3] the mechanistic basis for the underlying dynamic protein–protein interactions remains poorly understood. Overcoming this challenge would be critical for further PKS engineering, both using rational and combinatorial approaches.

Engineering Inspired by Evolution

Given the difficulty of engineering PKSs in the laboratory, the catalytic diversity of natural assembly-line PKSs is all the more astonishing. The toolkit of molecular mechanisms and evolutionary strategies employed by nature appears to be much better suited for the challenge than the strategies typically used in the lab. The idea of taking inspiration from natural approaches for PKS engineering was proposed more than a decade ago[65] and was developed in two directions. One uses natural evolution to guide the choice of splice points for further engineering by traditional cloning techniques, while the other uses natural recombination mechanisms for generating novel PKSs (Figure ).

Using Natural Splice Points

Because of the modular and colinear architecture of assembly-line PKSs, it is particularly appealing to engineer them by modifying single domains or modules to introduce small and predictable changes in the structure of the biosynthetic product. While point mutations can increase the promiscuity of domains or inactivate them, they rarely lead to altered domain function without compromising specificity or PKS turnover (reviewed in ref (204)). It appears that PKS evolution did not rely on point mutations to change domain specificity either: domains with the same specificity are phylogenetically close (see section ), suggesting that they originated from the same common ancestor rather than independently through point mutations. Instead, changes in domain specificity probably arise through domain swaps by recombination, which has prompted a search for natural splice points that can be exploited for engineering (Figure 10A, left). Because AT domains are responsible for selecting starter and extender units in polyketide biosynthesis, swapping them is an appealing strategy for product modification. AT domain swaps were shown to yield functional chimeric PKSs as early as 1996.[194] Later studies revealed the presence of conserved regions in the KS-AT interdomain linker (also called KAL) and the post-AT linker, which most likely correspond to natural splice points and can be used for AT domain swapping.[41,205] These linkers may be responsible for maintaining structural integrity of the module upon recombination, thus enhancing the evolutionary degrees of freedom of assembly-line PKSs.[41] Conserved regions flanking AT domains were used as splice points to emulate natural recombination and exchange between modules: either from the same PKS as in the case of aureothin synthase, or from a homologous PKS as in the case of antimycin and antimycin-like synthases, leading to fully functional chimeric assembly-line PKSs.[206,207]

Figure 10

Alternative splice points for PKS engineering. (A) Two domain swapping strategies can lead to predictable changes in the structure of the biosynthesized molecule: AT domain swaps affect the choice of starter or extender unit (malonyl-CoA, methylmalonyl-CoA, or other), whereas reductive loop swaps alter the configuration and oxidative state of the newly added extender unit. For both types of domain swaps, conserved regions were identified that can be used as splice points.[205,210] (B) “Classical” module boundaries match the boundaries of unimodular proteins and correspond to the functional unit of chain elongation (KS and downstream ACP) and subsequent modification (reductive loop). “Alternative” module boundaries break the functional unit of chain elongation but preserve the chain translocation unit (KS and upstream ACP), along with the reductive loop that determines the oxidative state of the translocated substrate.[39] Both “classical” and “alternative” module boundaries have been successfully used for module deletion (shown here), as well as module swapping and insertion.[78,90,206] KS, ketosynthase; AT, acyltransferase; ACP, acyl carrier protein; KR, ketoreductase; DH, dehydratase.

The termini of reductive domains and multidomains have also been identified to harbor recombinational hotspots.[65] Although previous domain swaps at these interfaces did not always result in active PKSs, a few approaches were successful.[208,209] On the basis of the phylogenetic analysis of these regions of PKS modules, a polylinker approach was developed that allowed testing of various splice sites and reductive domain donors while using presumed regions of natural recombination (Figure 10A, right).[210] This allows changing of the configuration and the oxidation state of the resulting polyketide. Another strategy commonly used in PKS engineering involves deleting, inserting, or swapping entire modules, leading to changes in the polyketide chain length. 
Under the current evolutionary model that considers gene duplication as a major step leading to the multimodular architectures of cis-AT PKSs, it is reasonable to assume that the unit of duplication corresponds to the KS-AT-(DH-ER-KR)-ACP module, which is the functional unit of chain elongation and matches the boundaries of unimodular proteins. This assumption led to the use of intermodular ACP-KS regions (either linkers or docking domains) in module rearrangement efforts in the early 2000s (Figure 10B, left).[197,198] However, early on, it became apparent that modules can be deleted, swapped, or inserted at other splice points as well (Figure 10B, right). In 2004, the KS-AT linker region was used to delete two modules of amphotericin synthase, resulting in a functional PKS, producing high yields of the shortened polyene.[211] Later, analysis of several PKS systems suggested that the KS-AT interface was a natural splice site for protein engineering via homologous recombination.[41,65] Engineering of the aureothin and neoaureothin synthases showed that splitting modules along the KS-AT interface rather than the “classical” ACP-KS interface was more productive for module deletions and insertions.[78,90,206] This is particularly interesting in the light of recently proposed “alternative” module boundaries at the KS-AT interface, which are based on close evolutionary relationships between KS domains and the upstream processing domains (discussed in section ). Such evolutionarily inspired strategies that alter homologous clusters through domain or module exchanges represent a powerful approach because they often result in chimeric PKSs that produce higher yields of new molecules. A similar approach of using natural recombination points has been explored for engineering NRPSs. 
In one study, adenylation domains of hormaomycin synthase were successfully swapped at splicing points that show high sequence similarity.[212] In another study, a more general strategy was proposed by introducing exchange units with a splice point located between the condensation and adenylation domains, which allows the assembly of chimeric NRPSs producing various new compounds.[213]
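The difference between “classical” and “alternative” module boundaries discussed above can be illustrated with a toy model. The Python sketch below represents an assembly line as a flat list of domain labels and deletes one unit using either boundary definition. The three-module assembly line, the numbered labels, and the function names are hypothetical and chosen only for illustration; the model captures domain order and nothing about protein structure or activity.

```python
def split_units(domains, first):
    """Split a domain list into units that each start with a `first` domain.

    first="KS" gives "classical" modules (KS ... ACP);
    first="AT" gives "alternative" units (AT ... ACP-KS), with any leading
    KS left as its own fragment.
    """
    units = []
    for d in domains:
        if d.rstrip("0123456789") == first or not units:
            units.append([])
        units[-1].append(d)
    return units


def delete_unit(domains, first, index):
    """Remove the unit at `index` and return the rejoined domain list."""
    units = split_units(domains, first)
    del units[index]
    return [d for unit in units for d in unit]


# Hypothetical three-module assembly line.
line = ["KS1", "AT1", "KR1", "ACP1",
        "KS2", "AT2", "DH2", "KR2", "ACP2",
        "KS3", "AT3", "ACP3"]

# Classical deletion of module 2: ACP1 is now followed by KS3.
print(delete_unit(line, "KS", 1))
# Alternative deletion of the unit starting at AT2: ACP1-KS2 stays intact,
# and KS2 is now followed by AT3.
print(delete_unit(line, "AT", 2))
```

In this toy picture, the classical deletion creates a new ACP-KS junction (ACP1 to KS3), whereas the alternative deletion keeps the native ACP1-KS2 pair and creates a new KS-AT junction instead, mirroring the distinction between the chain elongation and chain translocation units.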

Using Natural Recombination Mechanisms

In approaches where natural recombinational hotspots were used for generating chimeric PKSs, standard laboratory cloning techniques were used to perform the assembly. Another approach inspired by natural PKS evolution uses homologous recombination for the assembly and relies on naturally occurring regions of sequence similarity. The possibility of using homologous recombination was first assessed computationally; these analyses suggested numerous regions of sequence similarity that could potentially lead to chimeric assemblies and new molecules.[214,215] The feasibility of this approach was later shown experimentally in two different studies. First, homologous DNA recombination between DEBS and pikromycin (PIKS) clusters was shown to produce numerous functional chimeric assembly lines with splicing points located at various locations within modules, though preferentially in KS and AT domain-encoding regions.[216] This straightforward and versatile method relies on homologous recombination in yeast and has great potential for generating large libraries of PKS chimeras. A more recent study has demonstrated the possibility of generating chimeras by recombination within a single PKS cluster by harnessing the homologous recombination mechanisms of the host Streptomyces strain.[217] Using this method, which accelerates the plausible mechanism of PKS evolution, 17 rapamycin synthase and nine tylactone synthase chimeras were generated, with splicing points mostly located in regions encoding KS, AT, and ACP domains. More strikingly, many of these chimeric PKSs were highly active, with titers comparable to those of the wild-type strain. The described studies demonstrate the power of evolution-inspired engineering for producing active assembly-line PKSs. Homologous recombination-based techniques rely on splicing at arbitrary locations of sequence similarity and generate many chimeras that have to be tested for activity, which requires screening methods. 
However, the high success rate of producing active PKSs represents a major advance compared to previous engineering approaches and opens many possibilities for future developments.
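The idea that regions of sequence similarity act as candidate crossover points can be sketched in a few lines of Python. The example below is a deliberately simplified stand-in, not the method of the cited computational studies: it looks only for exact shared k-mers between two sequences and builds the chimera that a single crossover at such a site would produce. The sequences, the value of k, and the function names are hypothetical.

```python
def shared_kmers(a, b, k):
    """Return (i, j) pairs where a[i:i+k] == b[j:j+k] (exact matches only)."""
    index = {}
    for i in range(len(a) - k + 1):
        index.setdefault(a[i:i + k], []).append(i)
    hits = []
    for j in range(len(b) - k + 1):
        for i in index.get(b[j:j + k], []):
            hits.append((i, j))
    return hits


def chimera(a, b, i, j, k):
    """Single crossover: `a` up to the end of the shared k-mer, then `b`."""
    return a[:i + k] + b[j + k:]


# Toy sequences sharing one 6-nt region (hypothetical, not real PKS genes).
seq_a = "ATGAAACCGGTTCCCTAA"
seq_b = "ATGTTTCCGGTTGGGTAA"
for i, j in shared_kmers(seq_a, seq_b, 6):
    print(i, j, chimera(seq_a, seq_b, i, j, 6))
```

Because every shared region yields a possible chimera, the number of candidates grows quickly with sequence length and similarity, which is consistent with the need for activity screening noted above.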

Future Directions

Understanding Assembly-Line PKS Evolution

Even though natural products are widely used in medicine, the role of small molecules in natural environments is poorly understood.[218] At subinhibitory concentrations, many antibiotics trigger specific responses and may act as signaling molecules.[219] More broadly, secondary metabolites seem to play various roles in the development of their producing strains and their interactions with the environment. Regardless of the ecological role that antibiotics play in the natural setting, antibiotic resistance genes have concomitantly coevolved in bacterial populations, forming the resistome.[220,221] While modern day antibiotic treatment is responsible for the increased degree of resistance gene mobility, the growth of environmental reservoirs of antibiotic resistance and the major healthcare problem that we are currently facing, the antibiotic resistome itself is ancient.[222,223] With the increased awareness of the gravity of the antibiotic resistance crisis, the question of its evolution is now being investigated with computational, epidemiological, and molecular tools.[224] The evolution of biosynthetic machines responsible for antibiotic production and diversification is the opposite side of the coin and has undoubtedly played a crucial role in shaping the resistome. However, our understanding of this process is less advanced; even though we have identified the global molecular processes involved (reviewed in section ), we remain unaware of the molecular mechanisms, the evolutionary paths leading to chemical diversification, and the overall dynamics of these events. The evolution of PKSs is not an exception, and these questions need to be addressed. One of the future challenges for the field is to establish the evolutionary relationships between antibiotic producing and antibiotic resistance systems. 
Apart from the obvious scientific value of understanding the coevolution of these two systems, it could potentially inform us of interesting avenues for the development of novel antibiotics, such as using nature’s toolkit for biosynthetic cluster diversification or exploring the chemical diversity not accessible through natural processes and thus less likely to have corresponding resistance mechanisms evolved and readily available.

Expanding Access to Natural Diversity

The exponential increase in the number of sequenced assembly-line PKSs and the high percentage of orphan clusters highlight the abundance of polyketide diversity that remains to be explored. Given the effort required to characterize the product of a single PKS, criteria for prioritizing orphan PKSs are essential. Development of computational and wet lab tools for prioritization is a promising direction for the field. For example, sequence similarity to known clusters[225] or the presence of a potential antibiotic self-resistance gene within the cluster[156] can be promising criteria for selecting PKSs for deorphanization. A single method is unlikely to solve this problem; instead, multiple approaches tailored to address different challenges must be exploited. One of the challenges in predicting product structure from the sequence of a PKS is the fact that not all assembly lines follow the colinearity rule: the order in which proteins are encoded can differ from the order in which they operate. Recently, a solution to this problem was proposed. The KS domains of trans-AT PKSs typically coevolve with the upstream ACPs and modifying domains, enabling more precise predictions of biosynthesized molecule structures.[127] In contrast, the coevolution signal between KS domains of cis-AT PKSs and the ACP domains of their upstream modules is not strong enough; here, module interactions are more predictable based on docking domain coevolution.[226] Evolutionary insights can also facilitate prioritization of experimental analysis of orphan assembly-line PKSs. A striking example is the discovery of several polyketides that share structural elements with the actin inhibitor misakinolide, most likely due to recombination between their BGCs.[227] If combinatorial exchange of BGC regions is a common strategy for trans-AT PKS diversification, this approach could readily lead to the discovery of assembly-line PKSs that produce chimeric molecules. 
Finally, it should be recognized that, although a tremendous diversity of assembly-line PKSs has been revealed through DNA sequencing, it is possible that biases in genomic and metagenomic sequencing have created a corresponding bias in our insights into PKS assembly lines. One way to assess the existence of such bias is by targeting underexplored bacterial phyla and environmental niches. For example, marine sponges have been found to harbor a large number of symbiotic bacteria from the genus Entotheonella, which produce natural products with a chemical richness that might be comparable to that of soil actinomycetes.[97,228] More generally, unculturable bacteria are a promising source of new BGCs, but their DNA can be difficult to retrieve. Several pipelines have been developed that allow culture-independent sequencing, extraction, prioritization, and characterization of DNA encoding PKSs and NRPSs.[151,229] As these pipelines become more versatile, the discovery of novel polyketides from metagenomic libraries may also become more powerful.

Overcoming Technical Challenges in Deorphanizing PKS Clusters

The techniques described in section enable researchers to rapidly travel from gene to polyketide discovery. However, the path from polyketide discovery to polyketide deorphanization remains slow and painstaking, as there are a multitude of factors that govern a polyketide’s production and stability. This challenge is the most significant barrier to quickly uncovering the chemical diversity of orphan polyketides. Without a complete structure, a polyketide, or its novel analogues, cannot be prepared by chemical synthesis, a route that can produce compounds at scales possibly unobtainable with either native or heterologous hosts. Notably, the absence of this information precludes detailed bioactivity experiments such as molecular-level structure–function analysis. To overcome this challenge, researchers must develop tools and strategies to analyze low abundance products. Arguably, the most notorious example is colibactin. Colibactin is a genotoxic hybrid polyketide–nonribosomal peptide produced in some gut commensal E. coli strains and is intriguingly associated with colorectal cancer.[134] Recently, the Crawford and Herzon groups used an interdisciplinary approach that involved genetics, isotope labeling, tandem mass spectrometry, and chemical synthesis to finally elucidate the full structure of colibactin.[230] For over a decade, no work, despite considerable efforts from several laboratories, had described the identification, isolation, and structural elucidation of the unstable final colibactin. In one herculean effort, 2000 L of culture were required to manufacture just 50 μg of a biosynthetic intermediate of colibactin for structural analysis using 1D and 2D NMR as well as tandem mass spectrometry.[231] Intriguing approaches like the one used for colibactin could be combined with yield optimization techniques in native or heterologous hosts to facilitate the deorphanization of polyketides in a timely manner.

Harnessing the Knowledge of PKS Evolution

Evolutionary analyses of assembly-line PKSs have proven their value not only in advancing the fundamental understanding of these enzymes but also in enabling practical applications. As described in section , an understanding of evolutionary processes can also be used for assembly-line PKS engineering. However, many questions about assembly-line PKS evolution remain to be answered. Perhaps foremost on this list is understanding how iterative PKSs gave rise to their assembly line counterparts. Other evolutionary questions are equally relevant. For example, how did trans-AT PKSs emerge from their cis-AT predecessors through the loss of AT domain? And what can we learn about PKS evolution from their nonuniform distribution among bacteria?
