
Generate a bioactive natural product library by mining bacterial cytochrome P450 patterns.

Xiangyang Liu

Abstract

The increasing number of annotated bacterial genomes provides a vast resource for genome mining. Several bacterial natural products with epoxide groups have been identified as pre-mRNA spliceosome inhibitors and antitumor compounds through genome mining. These epoxide-containing natural products share a common biosynthetic characteristic: cytochrome P450s (CYPs) and their patterns, such as epoxidases, are employed in the tailoring reactions. The tailoring enzyme patterns are essential to both the biological activities and the structural diversity of natural products, and can be used for enzyme pattern-based genome mining. Recent developments in direct cloning, heterologous expression, manipulation of biosynthetic pathways and the CRISPR-Cas9 system have provided molecular biology tools to turn on or pull out nascent biosynthetic gene clusters to generate a microbial natural product library. This review focuses on a library of epoxide-containing natural products and their associated CYPs, with the intention of providing strategies for diversifying the structures of CYP-catalyzed bioactive natural products. It is conceivable that a library of diversified bioactive natural products will be created by pattern-based genome mining, direct cloning and heterologous expression, as well as genomic manipulation.


Keywords:  Genome mining; Microbial P450; Natural product library; Synthetic biology

Year:  2016        PMID: 29062932      PMCID: PMC5640691          DOI: 10.1016/j.synbio.2016.01.007

Source DB:  PubMed          Journal:  Synth Syst Biotechnol        ISSN: 2405-805X


“Discovery consists of seeing what everybody else has seen, and thinking what nobody else has thought.”

Introduction

The concept of the natural product library originated from the advocacy for the Biological Resource Center (BRC) by the Organisation for Economic Co-operation and Development (OECD) in 1999. During the past 15 years, the production of microbial natural product libraries has been driven mainly by the development of high-throughput screening methods and the increasing number of reported marine natural products.2, 3, 4 Broadly speaking, research on natural product libraries has included the creation of a genomic DNA prioritized strain library, a metagenome library,6, 7 a combinatorial library, a crude extract library,9, 10, 11 a fraction library12, 13, 14 and a pure compound library. Additionally, bioactive compound libraries are commercially available from Selleckchem, OTAVA and Cromadex. Strategies for developing a high-quality microbial natural product library include diversification and de-replication of strains and natural product structures, and high-throughput screening methods.2, 19, 20, 21, 22, 23 Even so, the outcomes from crude extract libraries and fraction libraries have fallen far short of expectations in drug discovery, which has diminished the interest of big pharma in natural product libraries. Nature is always full of diversity and natural product biosynthesis is no exception. 
The “co-linearity” rule, and the diversity and variations in nonribosomal peptide synthetases (NRPSs), polyketide synthases (PKSs), hybrid NRPS/PKS systems and beyond, have been extensively explored.25, 26, 27, 28, 29, 30, 31 During the last two decades, hundreds of natural product biosynthetic pathways and thousands of natural scaffolds such as nonribosomal peptides, polyketides, terpenoids and oligosaccharides have been characterized.32, 33 Besides the enzymes that catalyze the formation of chemical scaffolds, there is a group of tailoring enzymes that decorate the scaffold and increase its functional group density, acting sequentially to convert the building blocks into complex molecular structures. The tailoring enzymes can be classified into nonoxidative and oxidative enzymes. Nonoxidative enzymes include acyltransferases, methyltransferases and glycosyltransferases, while oxidative enzymes encompass those that introduce water-solubilizing oxygen functionalities during maturation and are probably the most consequential for introducing both scaffold and functional group complexity. When considering how to diversify nature's small molecule inventory, the oxygenases have proven to be the most valuable enzymes to characterize, with the twin goals of predicting new natural product scaffolds and combinatorial engineering of intermediates.33, 34 The most prominent of the oxidative enzymes are the heme iron cytochrome P450 (CYP) monooxygenases found in microbes such as Streptomyces species, Bacillus species and Cyanobacteria. 
Traditionally, natural product genome mining means “use analysis of DNA sequence data to predict structural elements of new natural products and then use this information to design strategies for rapidly identifying, purifying, and structurally characterizing the compounds”.39, 40 However, the peculiarities of biosynthetic pathways indicate that textbook “co-linearity” rules cannot be applied to deduce structures from all DNA data.41, 42 Thanks to the development of gene cluster prediction software such as antiSMASH, genome mining has become a quick and inexpensive way to analyze the biosynthetic potential of sequenced microbes. The current genome mining approaches include sequence-based genome mining,44, 45 bioactivity-guided genome mining, enzyme-based genome mining, pattern-based genome mining48, 49 and genome neighborhood network analysis.50, 51 Sequence-based genome mining is designed to detect and extract condensation (C) and ketosynthase (KS) domains from DNA or amino acid sequence data. High degrees of sequence similarity (E-value <10e−40 for eSNAPD or 85–90% protein sequence identity for NaPDoS) suggest that the identified biosynthetic gene clusters are responsible for the biosynthesis of natural products similar to those of the reference gene clusters.6, 7, 44, 52 For bioactivity-guided genome mining, the conserved structural motif of bioactive natural products can be used as a reference to perform genome mining. For example, the enediyne PKS (PKSE) is proposed to be involved in the formation of the highly reactive chromophore ring structure (or “warhead”) found in all enediynes. By PKSE-based genome mining, enediyne biosynthetic gene clusters have been identified in sequenced actinomycete genomes. 
FK228 is an antitumor drug whose tumor cytotoxicity results from the functional disulfide group, the formation of which is catalyzed by an FAD-dependent disulfide oxidoreductase.54, 55 BLAST analysis of the FAD-dependent disulfide oxidoreductase led to the discovery of a homologous gene cluster for thailandepsins in Burkholderia thailandensis E264. Enzyme-based genome mining searches conserved synthase domains against the NCBI database of sequenced bacterial genomes in order to obtain the presumptive enzyme sequences. “Pattern-based genome mining” refers to connecting MS/MS fragmentation patterns to biosynthetic pathways for genome mining and de-replication of certain bacterial species. For example, the MS/MS fragmentation pattern of the ion at 827.492 associated with arenicolide A production was used to identify the uncharacterized gene cluster in S. pacifica strains CNQ-748 and CNT-138. “Genome neighborhood network (GNN) analysis” is a bioinformatics strategy to predict enzymatic functions on a large scale based on their genomic context. In one case, bioinformatics of PepM and phosphonate GNNs was applied to 278 sequenced bacterial genomes and led to the discovery of 19 new phosphonate natural products. In another, enediyne GNNs were generated for the virtual screening of sequenced bacterial genomes, which resulted in 87 potential enediyne gene clusters from 78 different bacterial strains. Pattern-based genome mining and genome neighborhood analysis together provide a comprehensive method to identify the biosynthetic gene clusters of sequenced bacterial strains. With the Nobel Prize in Physiology or Medicine awarded for the natural products avermectin and artemisinin, the discovery of natural products has entered a golden age. Our long-term goal is to create a high-quality, diversified natural product library. 
In the postgenomic era, the challenge in generating a microbial natural product library has shifted from traditional de-replication strategies to the question of how to translate annotated biosynthetic gene clusters of interest into a bioactive natural product library. The current review focuses on genome mining of CYPs that are involved in the biosynthetic gene clusters of bacterial secondary metabolites, especially those with epoxide functional groups, with the intention of sharing considerations for building a diversified natural product library through CYP pattern-based genome mining, direct cloning and heterologous expression, and genome manipulation.

Genome mining of CYP-catalyzed bacterial natural products

This section will cover the introduction of CYP, two genome mining methods, and application of genome mining in two genera.

The importance of microbial CYPs

Microbial natural products whose biosynthesis involves CYPs have diverse biological activities, including antitumor, antibacterial, antifungal, anti-HIV, anti-parasitic and anti-cholesterol activities. Natural products such as pladienolides/FD-895,59, 60 GEX1/herboxidiene, and FR901464 (FR)/spliceostatins/thailanstatins (Fig. 1A) are known to have antitumor activities by targeting the pre-mRNA spliceosome. Specifically, pladienolide B and spliceostatin A have been reported to exert their antiproliferative activities against tumor cells in the nM range by targeting the splicing factor 3B subunit 1 (SF3B1) with the help of three-membered epoxide groups. Furthermore, CYP-catalyzed natural products with antitumor activities may act through different mechanisms. For instance, epothilone A is a tubulin binder, griseorhodin A is a telomerase inhibitor, tirandamycin is an RNA polymerase inhibitor and epoxomicin is a proteasome inhibitor (Fig. 1B).
Fig. 1

P450-related microbial natural compounds. Except for staurosporine, all the selected compounds are epoxide-containing compounds. Other epoxide compounds without known biosynthetic pathways, such as the antitumor compound trapoxin, are not listed. Panel A: Natural products with known spliceosome inhibitory activities; Panel B: Natural products with other activities, such as the tubulin-binding compound epothilone A, the telomerase inhibitor griseorhodin A, the RNA polymerase inhibitor tirandamycin and the proteasome inhibitor epoxomicin.

Microbial CYPs are one of the most widely distributed groups of tailoring enzymes that catalyze the formation of the final bioactive natural products.31, 35, 37, 67, 68, 69 CYPs catalyze hydroxylation and/or epoxidation reactions in the late stages of biosynthesis after macrolide formation by PKSs.70, 71, 72 Reactions catalyzed by CYP monooxygenases include hydroxylation of saturated C–H bonds, epoxidation of C=C double bonds, and oxidative decarboxylation (Table 1). For example, the CYP PldB hydroxylates 6-deoxy pladienolide B to pladienolide B in the pladienolide biosynthetic pathway. The CYP HerG catalyzes the stereospecific hydroxylation at C-18 of herboxidiene. In FR biosynthesis, the CYP Fr9R is not only involved in the hydroxylation at C-4 of FR901464, but is also related to the formation of the hemiketal group at C-1. CYPs in the biosynthetic pathways of natural products often work together with their patterns, such as transferases and epoxidases, to catalyze the production of epoxide groups. Moreover, dual-function CYPs have been reported to catalyze sequential epoxidation and hydroxylation of the same substrate.65, 88, 107
Table 1

CYPs involved in the biosynthesis of microbial natural products.

P450 enzyme | No. of amino acids (AA) | Accession no. | Match in the database (a): CYP name as annotated | Match: protein identifier | Match: identity (%) | Match: AA overlap | Reference
Hydroxylation of saturated C–H bonds
AmphL | 396 | AAK73504 | CYP107E | AAK73504 | 100% | 396 | 73
AziB1 | 401 | B4XY99.1 | CYP107-like | WP_018771011.1 | 34% | 388 | 74
ChmPI | 407 | AAS79447 | CYP107B | AAS79447 | 100% | 407 | 75
EpnK | 401 | AHB38512 | CYP162-like | WP_030872520.1 | 40% | 399 | 66
EryF | 404 | Q00441.2 | CYP107A | WP_009950397.1 | 100% | 404 | 76
EryK | 397 | CYP113A1 | CYP113A-like | WP_009950895.1 | 100% | 397 | 77
HerG | 422 | AEZ64507 | CYP107B-like | WP_016577953.1 | 48% | 404 | 61
MeiE | 459 | AAM97314 | CYP171A | AAM97314 | 100% | 459 | 78
MycCI | 383 | BAC57023 | CYP105U | BAC57023 | 100% | 383 | 79
NcsB3 | 410 | AAM77997 | CYP154J | AAM77997 | 100% | 410 | 80
NysL | 394 | AAF71769 | CYP107E | AAF71769 | 100% | 394 | 81
OxyD | 396 | CCD33151 | CYP146A | CCD33151 | 100% | 396 | 82
PikC | 416 | O87605 | CYP107L | AAC64105 | 100% | 416 | 83
PldB | 399 | BAH02272 | CYP107B-like | BAH02272 | 100% | 399 | 59
PteC | 399 | BAC68123 | CYP105P1 | WP_010981850.1 | 100% | 399 | 84
TylHI | 436 | AAD41818 | CYP105U | AAD41818 | 100% | 436 | 85
TylI | 417 | AAA21341 | CYP107-like | AAA21341 | 100% | 417 | 86
ZbmVIIc | 416 | ACG60779 | CYP185-like | WP_030613068.1 | 51% | 430 | 78
TxtC | 395 | AAL36838.1 | CYP105A | WP_009073396.1 | 41% | 406 | 87
Dual function, hydroxylation and epoxidation
GfsF | 414 | BAJ16472 | CYP105A | BAJ16472 | 100% | 414 | 88
MycG | 397 | BAA03672 | CYP107B | BAA03672 | 100% | 397 | 89
Fma-P450 | 400 | AHL19974 | (b) |  |  |  | 90
Fr9R | 482 | AIC32704 | CYP136-like | WP_022984814.1 | 32% | 455 | 62
PenD | 298 | ADO85592 |  |  |  |  | 91
PntD | 299 | ADO85576 |  |  |  |  | 92
TamI | 413 | ADC79647.1 | CYP107B-like | ADC79647.1 | 100% | 413 | 65
TstR | 482 | AGN11891 | CYP136-like | BAP15297 | 33% | 455 | 93
Catalyze epoxidation
Asm30 | 1005 | AAM54108 | CYP102F | AAM54108 | 100% | 1005 | 94
ChmPII | 401 | AAS79446 | CYP107B | AAS79446 | 100% | 401 | 75
EpnI | 431 | AHB38510 | CYP107B-like | WP_030776170.1 | 48% | 412 | 66
EpoK | 419 | Q9KIZ4 | CYP167A | Q9KIZ4 | 100% | 419 | 63
EpxC | 425 | AHB38496 | CYP107B-like | WP_020576451.1 | 42% | 406 | 66
GrhO3 | 416 | AAM33670 | CYP105-like | AAM33670 | 100% | 416 | 64
HedR | 409 | AAP85338 | CYP105-like | AAP85338 | 100% | 409 | 95
OleP | 407 | AAA92553 | CYP107B | AAA92553 | 100% | 407 | 96
PimD | 397 | CAC20932 | CYP107E | CAC20932 | 100% | 397 | 97
PimG | 398 | CAC20928 | CYP105H | CAC20928 | 100% | 398 | 97
TamI | 413 | ADC79647 | CYP107B-like | ADC79647 | 100% | 413 | 98
Oxidation of methyl group
AmphN | 399 | AAK73509 | CYP105H | AAK73509 | 100% | 399 | 99
NysN | 398 | AAF71771 | CYP105H | AAF71771 | 100% | 398 | 81
C-C coupling
DynOrf19 | 403 | ACB47070 | CYP107M | WP_015621195.1 | 51% | 398 | 100
HmtS | 397 | CBZ42153 | CYP113A | WP_030360446.1 | 52% | 398 | 101
OxyB | 398 | AAL90878 | CYP165B | Q8RN04 | 100% | 398 | 102
OxyC | 406 | AAL90879 | CYP165B | Q8RN03 | 100% | 406 | 103
C-N-C coupling
DynE10 | 400 | ACB47071 | CYP107B | WP_029899036.1 | 53% | 400 | 100
SpcN | 390 | AGL96571 | CYP244A | AGL96571 | 100% | 390 | 104
Catalyzes the nitration using NO and O2
TxtE | 406 | ELP66108.1 | CYPP450TXTE | ELP66108.1 | 100% | 406 | 87
Oxidative decarboxylation
HmtN | 419 | 4E2P_A | CYP113A | 4E2P_A | 100% | 419 | 101
HmtT | 418 | 4GGV_A | CYP113A | 4GGV_A | 100% | 418 | 101
Mei-Orf4 | 398 | ADC45514 | CYP107-like | ADC45514 | 100% | 398 | 78
StaP | 426 | ABI94389 | CYP245A | ABI94389 | 100% | 426 | 105
SpcP | 427 | AGL96575 | CYP245A | AGL96575 | 100% | 427 | 104

(a) CYP names as annotated at website: https://cyped.biocatnet.de/.

(b) Not found in database.


Bioinformatics of microbial CYPs

Information on three-dimensional structures is essential to understand the molecular basis for substrate recognition and specificity of CYPs. It is generally thought that three elements are essential in all CYP sequences: (a) the conserved cysteine, which is the fifth ligand to the heme Fe atom and can be represented as FXXGXXXCXG, (b) the EXXR motif forming a charge pair in the K helix (possibly involved in heme binding) and (c) the overall CYP fold topology, including stability. However, there are some unusual examples among bacterial CYPs, such as variations in conserved motifs, heme incorporation and topology, substrate binding and functionalities. For example, the amino acid sequence of CYP157C1 contains EQSLW in place of the conserved EXXR. The conserved threonine is not present in CYP107A1, and a hydroxyl group of the substrate 6-deoxyerythronolide B can directly donate a hydrogen bond to the Fe-linked dioxygen for proton transfer. Unlike CYP107A1, CYP158A2 binds two molecules of flaviolin in its active site; the 2-OH group of flaviolin is responsible for anchoring the substrate in the active site, while the 5-OH and 7-OH groups stabilize water molecules that are important for catalysis. For the discovery of bacterial CYPs in sequenced genomes, genes encoding the CYP heme binding domain (FXXGXXXCXG) can be screened for the presence of a highly conserved threonine in the putative I-helix, which is proposed to be involved in oxygen activation in most CYPs, and for the conserved EXXR motif located in the K-helix. The sequences of polypeptides containing all three motifs can then be used as queries for BLAST searches of the GenBank non-redundant protein database (www.ncbi.nlm.nih.gov/BLAST/) to identify their closest homologues in other organisms and tentatively assign the CYP proteins to subfamilies. Briefly, >40% amino acid sequence identity places a CYP in the same family and >55% places it in the same subfamily. 
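The motif screen and the identity thresholds described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the review: the regular expressions simply transcribe FXXGXXXCXG and EXXR with X as any residue, the function names are hypothetical, and the toy sequence is invented for demonstration. The conserved I-helix threonine is not checked because the text gives no sequence pattern for it.

```python
import re

# Motifs transcribed from the text; "." stands for X (any residue).
HEME_BINDING = "F..G...C.G"   # region around the conserved cysteine ligand
EXXR = "E..R"                 # K-helix charge-pair motif


def find_motif(pattern, sequence):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    lookahead = re.compile("(?=" + pattern + ")")
    return [m.start() for m in lookahead.finditer(sequence.upper())]


def scan_cyp_motifs(sequence):
    """Locate both sequence motifs in one protein sequence."""
    return {
        "heme_binding": find_motif(HEME_BINDING, sequence),
        "exxr": find_motif(EXXR, sequence),
    }


def classify_by_identity(percent_identity):
    """Apply the >40% (family) and >55% (subfamily) rule of thumb."""
    if percent_identity > 55:
        return "same subfamily"
    if percent_identity > 40:
        return "same family"
    return "different family"


# Invented toy sequence (not a real CYP), used only to exercise the functions.
TOY_SEQUENCE = "MKEAMRLAFGHGIHFCLGAAL"

if __name__ == "__main__":
    print(scan_cyp_motifs(TOY_SEQUENCE))
    print(classify_by_identity(50))
```

A sequence that lacks either motif is not necessarily a non-CYP, as the CYP157C1 example shows, so a screen of this kind is best treated as a first-pass filter before BLAST comparison.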
Besides GenBank, there is the CYPED database, which scours GenBank by BLAST searching and retrieves CYP sequences for inclusion (https://cyped.biocatnet.de/).112, 113, 114, 115 This database contains most of the bacterial CYPs (Table 1) that are involved in the biosynthesis of bioactive natural products (Fig. 1). There is also another online CYP database (http://drnelson.uthsc.edu/CytochromeP450.html), which contains 1042 bacterial CYP genes.

The logic for CYP pattern-based genome mining

There is evidence that CYPs can be used for genome mining for pre-mRNA spliceosome inhibitors. First, all current pre-mRNA spliceosome inhibitors are epoxide-containing natural products (Fig. 1A) whose biosynthetic gene clusters contain genes encoding putative CYP oxygenases or epoxidases. It has been reported that the presence of the epoxide group is important in conferring activity to FR analogues.116, 117 Second, although CYPs may not be directly involved in the formation of some epoxide groups,70, 118 CYPs are present in the relevant biosynthetic gene clusters (Fig. 2). For example, in a recent metagenome mining report, all six epoxyketone gene clusters contain CYPs, indicating that CYP-containing gene clusters may be a rich source for the biosynthesis of epoxide compounds. Third, CYP-related compounds whose antitumor activities involve other mechanisms can also inhibit splicing. Small molecule screenings identified oxaspiro compounds (analogues of the farnesyltransferase inhibitor manumycin A) as pre-mRNA splicing inhibitors. In addition, a number of compounds, such as the CYP-related non-epoxide compound staurosporine (a kinase inhibitor), are novel inhibitors of spliceosome assembly. Fourth, homologous CYPs can be identified through genome mining at 50–60% similarity, in contrast to the 80–90% rule used for PKS and nonribosomal peptide synthetase (NRPS) domain searching. For example, BLAST analysis demonstrated a 50% identity between the CYPs PldB and HerG from the pladienolide and herboxidiene pathways, respectively (both products are pre-mRNA spliceosome inhibitors). In the biosynthesis of FD-891, the CYP GfsF showed 57% identity to the CYP monooxygenase Mflv2418. Finally, the CYP gene can be used as a signature gene for gene cluster cloning and gene identification. However, the CYP pattern should be considered, because CYPs and other tailoring enzymes such as transferases and epoxidases act together on the products released from the PKS or NRPS megasynthases68, 122, 123 (Fig. 3).
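The class-specific cutoffs mentioned in the fourth point can be illustrated with a small filter over BLAST tabular output. The sketch below assumes tab-separated hit lines whose first three fields are query id, subject id and percent identity (the default column order of BLAST+ tabular output); the function name, the dictionary keys and the example hit table are hypothetical, and taking the lower bound of each range as the cutoff is an assumption made here for illustration.

```python
# Minimum percent identity per enzyme class, taken as the lower bounds of the
# ranges quoted in the text (50-60% for CYPs, 80-90% for PKS/NRPS domains).
MIN_IDENTITY = {"CYP": 50.0, "PKS_NRPS_domain": 80.0}


def filter_hits(tabular_text, enzyme_class):
    """Keep hits at or above the class-specific identity cutoff.

    Returns (query, subject, identity) tuples sorted by decreasing identity.
    """
    cutoff = MIN_IDENTITY[enzyme_class]
    hits = []
    for line in tabular_text.splitlines():
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, identity = fields[0], fields[1], float(fields[2])
        if identity >= cutoff:
            hits.append((query, subject, identity))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Hypothetical hit table: ids and values are invented for illustration.
EXAMPLE_HITS = "\n".join([
    "query_cyp\thit_A\t57.0\t400",
    "query_cyp\thit_B\t32.0\t455",
    "query_cyp\thit_C\t50.0\t404",
])

if __name__ == "__main__":
    for hit in filter_hits(EXAMPLE_HITS, "CYP"):
        print(hit)
```

With the looser CYP cutoff two of the three invented hits are retained, whereas none would pass the stricter cutoff used for PKS/NRPS domains, which is why a CYP query can surface more distant gene clusters than a domain-based search.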
Fig. 2

Neighbor-joining tree of selected P450 enzymes including 13 putative P450 enzymes from Kutzneria species. CYPs catalyze epoxide formation (●), hydroxylations (■), oxidation of methyl groups (♦), decarboxylations (▲) and C–C or C–N–C bond formations (▼). The tree was generated with MEGA6 using the neighbor-joining method. Significant bootstrap values are indicated at the nodes. The scale bar represents 0.1 mutational events per site. The gene clusters associated with PldB, Fr9R, GfsF and StaP produce the pre-mRNA spliceosome inhibitors pladienolide, FR901464, FD-891 and staurosporine, respectively.

Fig. 3

Pattern-based genome mining of homologous gene clusters of spliceosome inhibitors which contain CYPs. The biosynthetic gene clusters of thailanstatins (Accession no. KJ461964.1) and FR901464 (Accession no. JX307851.1), pladienolide (Accession no. AB435553.1) and herboxidiene (Accession no. JN671974.1) as well as staurosporine (Accession no. AB088119.1) were used for the search of homologous gene clusters by using antiSMASH 3.0.


Discovery of thailanstatins from Burkholderia sp.

Recently, the genus Burkholderia has attracted the attention of several research groups, which have employed different genome mining strategies to study its bioactive natural products. Among those products, FR is a general spliceosome inhibitor discovered in 1992 from Pseudomonas sp. No. 2663,125, 126, 127 now identified as Burkholderia sp. FERM BP3421. There are three oxygenase activities encoded in the FR gene cluster: (a) the flavin-dependent monooxygenase (FMO) domain in the last module of fr9GH, (b) the CYP encoded by fr9R, and (c) the Fe(II)/α-ketoglutarate-dependent dioxygenase encoded by fr9P. The hemiketal FR is mainly biosynthesized through epoxidation at C-3 by the FMO domain of Fr9GH, hydroxylation at C-4 by Fr9R, and hydroxylation at C-1 by Fr9P followed by decarboxylation. First, BLAST analysis shows that Fr9R has a high identity (82%) to TstR in the sequenced genome of B. thailandensis MSMB43 (currently being described as “Burkholderia hymptydooensis”).93, 128, 129 Second, Fr9K is a key 3-hydroxy-3-methylglutaryl-CoA synthase (HCS) homologue that catalyzes the transfer of –CH2COO– from acyl-S-acyl carrier protein (ACP) to a β-ketothioester polyketide intermediate. Fr9K has a high identity (87%) to one of the enzymes (TstK) predicted from the genome of B. thailandensis MSMB43. Further, regulatory gene expression studies to optimize the production medium, together with targeted isolation and purification of diene compounds (UV 235 nm), led to the discovery of three compounds called thailanstatins and a group of similar or identical compounds called spliceostatins130, 131 (Fig. 4). In particular, thailanstatin A contains a carboxyl group, which not only makes it more stable than FR in PBS solution, but also leads to increased activity compared to FR when the carboxyl group is esterified (with enhanced membrane permeability).
Fig. 4

Genome mining protocol of Burkholderia thailandensis MSMB43 for thailanstatins/spliceostatins. Starting from the two key biosynthetic enzymes, the CYP Fr9R and the 3-hydroxy-3-methylglutaryl-CoA synthase (HCS) Fr9K, which catalyze the formation of the hemiketal hydroxyl group and the epoxide group of FR901464, the homologous enzymes and the biosynthetic gene cluster of thailanstatin were predicted. Reverse transcription (RT)-PCR helped to identify the optimum medium for the expression of the regulatory gene tstA. The novel compounds were finally obtained through microbial fermentation, natural product isolation and structural elucidation by tracking the characteristic UV absorption of the diene motif at 235 nm.


Kutzneria species are potential resources of CYP-catalyzed compounds

In the Actinobacteria phylum, the genus Kutzneria is a minor branch of the Pseudonocardiaceae family, currently containing eight species (Fig. 5). Only aculeximycin and kutznerides are known to be produced by Kutzneria albida DSM 43870T and Kutzneria sp. 744, respectively.132, 133 Genome mining of Kutzneria species suggests that they could produce compounds similar to the previously reported pre-mRNA spliceosome inhibitors and a group of interesting CYP-catalyzed compounds. For example, the pladienolide biosynthetic gene cluster has only four genes (including the pldB CYP gene and the putative epoxidase-encoding gene pldD) besides the PKS genes. Inspired by the successful discovery of thailanstatins, a bioinformatics analysis based on the pladienolide B CYP monooxygenase was conducted. When the CYP PldB sequence was used as the query for a homology search of GenBank, the Kutzneria sp. 744 genome was found to contain the most homologous (78%) enzyme, KUTG_06291 (EWM16617) (Fig. 6). Therefore, the genus Kutzneria was postulated to produce compounds similar to pladienolide. However, many gaps remain in the sequenced genome of Kutzneria sp. 744.
Fig. 5

Phylogenetic tree for taxa of the genus Kutzneria and other antitumor compound-producing species (Streptomyces and Burkholderia). The tree was calculated from complete 16S rRNA gene sequences using the neighbor-joining method, illustrating the position of the genus Kutzneria relative to other selected species. All species are type strains except for the strains Kutzneria sp. 744, Burkholderia sp. MSMB43 and MSMB121. The 16S rRNA sequence of the pladienolide producer Streptomyces platensis Mer-11107 is not available, but DNA hybridization suggests that Streptomyces platensis Mer-11107 has 87% similarity to the type strain CGMCC4.1975. Percentages at nodes represent levels of bootstrap support from 1000 resampled datasets; scale bar, 0.02 nucleotide substitutions per site.

Fig. 6

Strategy for the discovery of novel compounds from Kutzneria species. Starting with the structure of pladienolide B and moving through phylogenetic analysis of the streptomycete-related cytochrome P450s, Kutzneria species were selected for further analysis of P450-related polyketides. The genome sequencing of the closest strain, Kutzneria sp. 744, was finished in February 2014 without detailed annotation, and many gaps remain to be closed.

Kutzneria albida DSM 43870T is the only Kutzneria strain with a completely sequenced genome. So far, the K. albida genome (9.87 Mb) is among the largest actinobacterial genomes sequenced. Thus, Kutzneria albida DSM 43870T was selected for further phylogenetic analysis of CYPs in parallel with a library of 44 CYPs originating from the biosynthetic gene clusters of 32 bioactive natural products.37, 61, 62, 63, 64, 73, 74, 76, 78, 80, 81, 82, 83, 84, 85, 86, 88, 89, 90, 91, 92, 94, 95, 96, 97, 98, 99, 100, 102, 103, 104, 105 AntiSMASH analysis revealed that the K. albida DSM 43870T genome encodes 47 biosynthetic gene clusters, which contain at least 13 different CYPs distributed in eight gene clusters (GC4, GC10, GC11, GC18, GC19, GC33, GC34 and GC40) (Fig. 5, Table S1). Among those gene clusters, only GC40 is known to be responsible for the biosynthesis of aculeximycin. The CYP KALB_6568 is predicted to catalyze the formation of the C14 hydroxyl group. However, the compounds biosynthesized by the other CYP-related gene clusters are not yet known. BLAST analysis (Fig. 2) suggests that in the K. albida DSM 43870T genome, GC19 encodes compounds with CYP-installed functional groups (KALB_3944, possessing 99% identity to Mei-Orf4) similar to those of meilingmycin and tirandamycin B (66% identity to TamI). As for GC11, the 99% identity between CYP KALB_3411 and StaP (in the staurosporine biosynthesis gene cluster) suggests that GC11 may be responsible for the biosynthesis of compounds with functional groups formed by similar oxidative decarboxylations. Additionally, antiSMASH analysis indicates that GC11 has a 20% similarity to the biosynthetic gene cluster of the pre-mRNA spliceosome inhibitor staurosporine. Similarly, the 99% identity between CYP KALB_3295 (in GC10) and Asm30 (the CYP in the ansamitocin biosynthesis gene cluster) implies that GC10 may produce epoxide compounds similar to ansamitocin. Similar analysis suggests that tirandamycin-like compounds may be produced through GC19, based on the 66% identity between KALB_3945 and TamI.98, 135 Unfortunately, the 27% identity between KALB_5792 (in GC33) and AziB1 (the CYP in the azinomycin B biosynthetic gene cluster), and the 19% identity between KALB_5804 (in GC33) and EpoK (the CYP in the epothilone biosynthesis gene cluster), make it impossible to predict the natural products biosynthesized by GC33. Besides Kutzneria albida DSM 43870T, five other top strains, Actinoplanes sp. N902-109 (NC_021191.1), Micromonospora aurantiaca ATCC 27029 (CP002162.1), Streptomyces violaceusniger Tu 4113 (CP002994.1), Streptomyces bingchenggensis BCW-1 (CP002047.1) and Streptomyces sp. PVA 94-07 (CM002273.1), were prioritized through CYP pattern-based genome mining of the pladienolide biosynthetic pathway. These strains have potential CYP pattern pathways to produce unknown compounds (Tables S2–S6).
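The reasoning applied to the K. albida gene clusters can be summarized programmatically. The sketch below tabulates the CYP identities quoted in the paragraph above and flags, for each gene cluster, whether its best CYP match exceeds a cutoff. The 40% cutoff is an assumption borrowed from the family-level rule of thumb quoted earlier in this review, not a threshold stated for this analysis, and the function and variable names are hypothetical.

```python
# (gene cluster, K. albida CYP, reference CYP, percent identity) as quoted in the text.
REPORTED_IDENTITIES = [
    ("GC19", "KALB_3944", "Mei-Orf4", 99),
    ("GC11", "KALB_3411", "StaP", 99),
    ("GC10", "KALB_3295", "Asm30", 99),
    ("GC19", "KALB_3945", "TamI", 66),
    ("GC33", "KALB_5792", "AziB1", 27),
    ("GC33", "KALB_5804", "EpoK", 19),
]


def summarise_clusters(rows, cutoff=40):
    """Report the best CYP match per gene cluster.

    A cluster is flagged as informative when its best identity exceeds
    `cutoff` (an assumed, family-level threshold).
    """
    best = {}
    for cluster, cyp, reference, identity in rows:
        if cluster not in best or identity > best[cluster]["identity"]:
            best[cluster] = {
                "cyp": cyp,
                "reference": reference,
                "identity": identity,
            }
    for entry in best.values():
        entry["informative"] = entry["identity"] > cutoff
    return best


if __name__ == "__main__":
    for cluster, entry in sorted(summarise_clusters(REPORTED_IDENTITIES).items()):
        print(cluster, entry)
```

Under this assumed cutoff, GC10, GC11 and GC19 are flagged as informative and GC33 is not, which mirrors the conclusion in the text that the products of GC33 cannot be predicted from its CYPs.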

Conceivable methods to translate biosynthetic pathways to natural products

This section covers the methods (Fig. 7) or tools that have been used or have the potential to be applied to translate biosynthetic pathways to natural products.
Fig. 7

Schematic representation of methods to translate biosynthetic pathways to novel natural products.


Direct cloning and heterologous expression of biosynthetic gene clusters

Heterologous expression was primarily used to circumvent the lack of genetic tools and the regulatory complexity in native hosts by using strains that are more amenable to engineering. Direct cloning, genetic engineering and heterologous expression of microbial natural product biosynthesis pathways were reviewed in 2013. Heterologous expression methods can also be borrowed from those used for the discovery of small molecules from metagenomic cosmid libraries.7, 137, 138 The approaches to cloning targeted gene clusters directly from genomic DNA include RecE-mediated homologous recombination, oriT-directed capture, transformation-associated recombination (TAR)141, 142, 143 to facilitate functional expression experiments, and phage φBT1 integrase-mediated site-specific recombination. Among those, TAR cloning in Saccharomyces cerevisiae has been extensively applied to capture and express large biosynthetic gene clusters from environmental DNA samples.145, 146, 147 Recently, the TAR direct cloning approach was used to clone whole targeted gene clusters from rare actinomycetes and Gram-negative bacteria. Using TAR, the targeted gene cluster was cloned into plasmid pCAP01 by homologous recombination in Saccharomyces cerevisiae strain VL6-48 to obtain the captured vector, which was then transformed into the corresponding host: Streptomyces coelicolor for rare actinomycetes and E. coli for Gram-negative bacteria. Examples are the heterologous production of the lipopeptide taromycin A (from the marine actinomycete Saccharomonospora sp. CNQ-490) in Streptomyces coelicolor M512 and the heterologous production of alterochromide lipopeptides (from Pseudoalteromonas piscicida JCM 20779) in E. coli BL21 (DE3) utilizing native and E. coli-based T7 promoter sequences.
Since the TAR-based genetic platform allows for heterologous production of lipopeptides in different hosts, efforts have focused on developing a high-throughput TAR capture method to express the pathway. Successful production of the desired products often requires an optimal relationship of timing and flux between primary and secondary cellular metabolism. Besides the use of genetic engineering for the direct cloning of a whole gene cluster, there has been considerable interest in the development of engineered bacterial strains for efficient heterologous production of secondary metabolites.148, 149, 150, 151 For example, the deletion of a 1.4 Mb segment from the left subtelomeric region of the 9.02 Mb Streptomyces avermitilis genome resulted in the generation of two large-deletion mutants. Using these large-deletion mutants, i.e., S. avermitilis SUKA17 or 22, twenty entire biosynthetic gene clusters for secondary metabolites, including aminoglycosides, nonribosomal peptides, polyketides and terpenes, were successfully expressed. The biosynthetic gene cluster of the polyketide pladienolide has been expressed in a deletion mutant of Streptomyces avermitilis with an extra copy of the regulatory gene pldR under control of an alternative promoter. It is worth noting that the engineered hosts are not only useful for the production of exogenous secondary metabolites, but also facilitate scale-up production and preparation of promising natural products owing to their “clean” background.

Manipulation of the biosynthetic pathways

Recent approaches to the activation and up-regulation of microbial biosynthetic pathways for the discovery of natural products have been reviewed.153, 154 Beyond these methods, four types of research reports illustrate how biosynthetic gene clusters can be translated into novel natural products by using synthetic biology methods. First, optimized bioactive compounds can be produced by engineering the biosynthetic operon and reconstituting the biosynthetic pathway. The group led by Müller and Brönstrup successfully deleted the gene in the hydroxymalonyl-CoA biosynthetic operon of the bengamide biosynthetic gene cluster and inserted a promoter on the expression construct pBen32. Heterologous expression of the modified biosynthetic gene cluster led to the discovery of more potent compounds. Second, new compounds with increased activities can be produced by inactivation of CYP pattern genes. Bills' team worked on the manipulation of the pneumocandin biosynthetic pathway. They inactivated three genes (GLP450-1, GLP450-2, and GLOXY1) and generated 13 different pneumocandin analogues that lack one, two, three, or four hydroxyl groups on the 4R, 5R-dihydroxy-ornithine and 3S, 4S-dihydroxy-homotyrosine of the parent hexapeptide. Seven of these analogues were previously unreported. Third, natural product analogues can be produced by creating promoter-driven tailoring enzyme constructs. Brady's group created an ermE promoter cassette, which was introduced upstream of the first ORF of the biosynthetic gene cluster of interest. Further introduction of the cosmid containing the constitutively expressed tailoring operon into S. toyocaensis:ΔStaL resulted in the production of three new glycopeptides. Fourth, stable compounds with increased titer can be produced by engineering the CYP pathway. Researchers at Pfizer engineered a pAE-PF29 vector that enabled the overexpression of Fr9R, encoded by the CYP gene fr9R, in Burkholderia sp. FERM BP-3421, which led to the enhanced production of stable thailanstatin A from spliceostatin C.

Synthetic biology tool kits

Beyond the four examples of genetic manipulation above, synthetic biology tools to edit bacterial genomes have been reviewed by other researchers. The Clustered Regularly Interspaced Short Palindromic Repeats and Cas proteins (CRISPR-Cas) systems provide a powerful and broadly applicable set of tools to manipulate Streptomyces genomes. Three research groups have worked on gene deletion in Streptomyces species using different CRISPR-Cas toolkits. The group led by Zhao developed a temperature-sensitive pCRISPomyces system, which is applicable to the genome editing of Streptomyces lividans 66, Streptomyces viridochromogenes DSM 40736 and Streptomyces albus J1074. Lee and Weber's group focused on the efficiency of gene deletion in the actinorhodin biosynthetic gene cluster of Streptomyces coelicolor A3(2) using an engineered CRISPR-Cas system with improved efficiency. Sun's team combined a CRISPR-Cas system with the counterselection system CodA(sm), the D314A mutant of cytosine deaminase, to delete the actinorhodin biosynthetic gene cluster in the Streptomyces coelicolor M145 genome.

The advantages and challenges of the above mentioned strategies

The advantages and challenges of generating a bioactive natural product library by CYP pattern-based genome mining, as compared with traditional strategies such as microbial fermentation with different media and chromatographic fractionation across different polarities, can be summarized as follows.

Advantages

CYPs are among the most important tailoring enzymes that can be explored and exploited for the structural diversification of bioactive natural products. Genome mining is an efficient method. The combination of PCR screening and genome mining will prioritize the cryptic gene clusters of interest for annotation and heterologous expression. This is more efficient than the traditional concept of one strain many active compounds (OSMAC). Genome mining can be used as a large-scale, high-throughput method to accelerate drug discovery, as exemplified by a four-year case study on genome mining of 10,000 actinomycetes that yielded 19 novel phosphonic acid natural products. Both heterologous expression and activation of cryptic pathways can generate novel compounds that have unique biological activities but are difficult to access using synthetic chemistry. CRISPR-Cas9-enabled genome editing tools are very efficient. First, compared to the 5% (2/38) efficiency of the traditional in-frame gene deletion method, the efficiency of CRISPR-Cas9 systems can reach 100% without unwanted side or off-target effects. Second, CRISPR-Cas9 systems remove the need for a counter-selectable marker for the selection of double crossover mutants. Third, CRISPR-Cas9 systems can avoid the scar sequence left at the target site and enable multiple gene deletions. Fourth, the CRISPR-Cas system provides new modularity: the site of interest can be targeted by inserting a short spacer into a CRISPR array/sgRNA construct. The insertion can be achieved with high throughput using modern DNA assembly techniques. CRISPR-Cas9 genome editing can shorten the two-month period required by the previous gene deletion method to a 1–2 week time course.

Main challenges

As the volume of DNA sequence data increases, so will the questions of how to prioritize the characterization of orphan biosynthetic gene clusters and how to rapidly connect genes to biosynthesized small molecules. Pathway manipulations in native hosts are subject to the genetic complexity of the bacteria. For example, researchers have developed a procedure that takes about 10 days for chromosomal knockout in the Gram-negative bacterium B. pseudomallei, using the pheS-gat cassette assisted by plasmid pKaKa2. However, this method is not applicable to gene knockout in another strain, B. gladioli, which requires a different, pCR4Blunt-TOPO-based vector for the recombination, assisted by plasmid pKD46. To date, Cas9 targeting is still limited by protospacer adjacent motif (PAM) sequences. Targeting efficiency is also site-dependent. Although the development of genetic manipulation tools could greatly enhance the chances of discovering and producing natural products, a major challenge in microbial genome mining is to produce compounds in high titers. Bacterial genomes are publicly available, but access to strains of interest is limited by international or domestic property regulations. Decreasing DNA synthesis costs and advances in DNA assembly could help to solve the issue of limited material access. Gaining access to a larger fraction of the strains isolated in research labs worldwide will remain a future challenge.
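The PAM limitation mentioned above can be made concrete with a small Python sketch that lists candidate target sites in a DNA sequence. It assumes the NGG motif of Streptococcus pyogenes Cas9 and a 20-nt spacer. Neither value is stated in this review, and other Cas9 variants recognize other motifs. The function names and the toy sequence are hypothetical and are used for illustration only.

```python
import re

# Assumption: NGG PAM (S. pyogenes Cas9) and 20-nt spacers; not from the review.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_protospacers(seq, spacer_len=20):
    """Return (start, spacer, pam) for every spacer followed by NGG on seq."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


def scan_both_strands(seq, spacer_len=20):
    """Scan the given strand ('+') and its reverse complement ('-')."""
    return {
        "+": find_protospacers(seq, spacer_len),
        "-": find_protospacers(reverse_complement(seq), spacer_len),
    }


if __name__ == "__main__":
    toy = "ATGCCGTTACGATCGGATTACGCTAGGCTTAAGGCATCGATTACGGA"  # made-up sequence
    for strand, sites in scan_both_strands(toy).items():
        for start, spacer, pam in sites:
            print(strand, start, spacer, pam)
```

A region without any reported site on either strand cannot be targeted directly under these assumptions, which is the practical meaning of the PAM constraint. Positions on the "-" strand are given relative to the reverse-complemented sequence.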

Conclusion

This review provides strategies for the discovery of a group of epoxide-containing natural products (Fig. 8) and highlights (1) CYP and its pattern enzyme-based genome mining as guidance for the generation of a diversified natural product library, and (2) direct cloning and heterologous expression of biosynthetic pathways, together with genomic manipulation methods and tools, as means to translate the selected biosynthetic gene clusters into a bioactive natural product library. It is worth noting that tailoring reactions play important roles in the diversification of scaffolds such as polyketide, peptide and hybrid polyketide-peptide backbones, and that tailoring enzyme patterns such as CYPs, ligases, cyclases, ketoreductases, transferases, and oxygenases can often be used for genome mining. This advanced genome mining will avoid the de-replication of biosynthetic gene clusters and quickly identify the annotated biosynthetic gene clusters from the vast pool of sequenced bacterial genomes. It is predicted that we will be sequencing genomes for pennies as early as 2020 (https://youtu.be/j88APStUcp4). Synthetic biology tools pioneered by different researchers will continue to be developed toward high-throughput use, driven by the increasing number of sequenced bacterial genomes. This review also points out the potential mechanisms and diversity of CYPs in the microbial biosynthesis of natural product antitumor agents. The Kutzneria strains, as rare actinobacterial species, can be explored and exploited for CYP-catalyzed compounds, and can be used as a rich resource for the diversification of microbial CYPs.
Fig. 8

The proposed protocol for the development of a natural product library in postgenomic stage.
