Anthony W Goering, Ryan A McClure, James R Doroghazi, Jessica C Albright, Nicole A Haverland, Yongbo Zhang, Kou-San Ju, Regan J Thomson, William W Metcalf, Neil L Kelleher.
Abstract
For more than half a century the pharmaceutical industry has sifted through natural products produced by microbes, uncovering new scaffolds and fashioning them into a broad range of vital drugs. We sought a strategy to reinvigorate the discovery of natural products with distinctive structures using bacterial genome sequencing combined with metabolomics. By correlating genetic content from 178 actinomycete genomes with mass spectrometry-enabled analyses of their exported metabolomes, we paired new secondary metabolites with their biosynthetic gene clusters. We report the use of this new approach to isolate and characterize tambromycin, a new chlorinated natural product, composed of several nonstandard amino acid monomeric units, including a unique pyrrolidine-containing amino acid we name tambroline. Tambromycin shows antiproliferative activity against cancerous human B- and T-cell lines. The discovery of tambromycin via large-scale correlation of gene clusters with metabolites (a.k.a. metabologenomics) illuminates a path for structure-based discovery of natural products at a sharply increased rate.
Year: 2016 PMID: 27163034 PMCID: PMC4827660 DOI: 10.1021/acscentsci.5b00331
Source DB: PubMed Journal: ACS Cent Sci ISSN: 2374-7943 Impact factor: 14.553
Figure 1. Work and data flows for a “metabologenomics” approach to natural products discovery from bacteria. Using information obtained from interpreting 178 sequenced genomes into gene clusters and gene cluster families (top center panel, described in ref (8)) and from MS-based metabolomics of the same 178 strains with accurate mass (bottom center panel), pairwise correlation yields scores that associate metabolites with their gene cluster families (panel at right).
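The pairwise correlation step in Figure 1 can be illustrated with a minimal sketch. The scoring function here is an assumption chosen for simplicity (a Jaccard index over the sets of strains), not the scoring scheme used in the study, and all strain, metabolite, and GCF names are hypothetical.

```python
from itertools import product


def cooccurrence_scores(metabolite_hits, gcf_members):
    """Score every (metabolite, GCF) pair by strain overlap.

    metabolite_hits: dict mapping metabolite -> set of strains where it was detected
    gcf_members:     dict mapping GCF -> set of strains whose genome carries a member
    The score is a Jaccard index (illustrative assumption, not the paper's metric).
    """
    scores = {}
    for (met, met_strains), (gcf, gcf_strains) in product(
        metabolite_hits.items(), gcf_members.items()
    ):
        union = met_strains | gcf_strains
        shared = met_strains & gcf_strains
        scores[(met, gcf)] = len(shared) / len(union) if union else 0.0
    return scores


# Hypothetical toy data: one metabolite, two gene cluster families.
metabolite_hits = {"metabolite_X": {"s1", "s2", "s3"}}
gcf_members = {
    "GCF_A": {"s1", "s2", "s3", "s4"},
    "GCF_B": {"s3", "s5"},
}

scores = cooccurrence_scores(metabolite_hits, gcf_members)
best_pair = max(scores, key=scores.get)  # highest-scoring metabolite/GCF pair
```

Ranking all pairs by such a score is one way to shortlist which gene cluster family most plausibly encodes a given metabolite before targeted follow-up.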
Figure 2. Structures of JBIR-34 and JBIR-35 (A), tambromycin (B), and key NMR correlations (C). The structure of tambromycin is shown with the new amino acid residue tambroline highlighted in panel (B). The related structures JBIR-34 and JBIR-35 are shown in panel (A). Panel (C) highlights key correlations derived from 1H-1H COSY, 1H-1H NOESY, and 1H-13C HMBC experiments. 1H-13C HMBC correlations from the indole 2-position proton and 2-methyl-serine methylene protons to the same carbonyl carbon were used to determine the connectivity of these substructures as a methyl-oxazoline. Another important 1H-13C HMBC correlation, from the alpha proton of tambroline to the carbonyl carbon of 2-methyl-serine, linked these two substructures. These correlations and the overall sequence of the peptide were confirmed by tandem mass spectrometry (Figure S5). 1H-1H COSY correlations across the continuous spin system present in the pyrrolidine substructure are shown as widened bonds.
Data Enabling the Metabologenomic Identification of Tambromycin and Its Biosynthetic Gene Cluster
This table shows the co-occurrence of tambromycin with members of 10 gene cluster families (GCFs) for 11 Streptomycete strains probed in this study. Strains are listed at the far left, tambromycin detection by MS is shown in columns 3 and 4, and columns 5–14 show all NRPS GCFs that are present in two or more of this set of 11 strains. Data for the strains listed in the top nine rows were obtained in the first-pass application of LC-MS and automated data reduction. The strains listed in the bottom two rows were interrogated in a targeted fashion because their genomes harbored the tambromycin biosynthetic gene cluster. The highest level of co-occurrence is between tambromycin and NRPS Gene Cluster Family #519 in the first-pass metabologenomics data set (i.e., top nine strains). The background color in column 4 indicates whether the observed MS intensity was above (green) or below (red) the threshold intensity of 5 × 10⁶ set for automated metabolite detection; the selected ion intensity derived from the LC-MS data reflects the number of ion counts (i.e., the NL value). Numbers designating each gene cluster family appear as archived on the website at www.igb.illinois.edu/labs/metcalf/gcf.
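The thresholding described in this note can be sketched as follows. Only the threshold value (5 × 10⁶ ion counts) comes from the text; the strain names and intensities are hypothetical, and treating a value exactly at the threshold as "not detected" is an assumption, since the note only distinguishes above from below.

```python
# Threshold for automated metabolite detection, as stated in the table note.
DETECTION_THRESHOLD = 5e6


def cooccurrence_count(intensities, gcf_strains, threshold=DETECTION_THRESHOLD):
    """Count strains where a metabolite is detected and the GCF is present.

    intensities: dict mapping strain -> selected ion intensity (ion counts)
    gcf_strains: set of strains whose genome carries a member of the GCF
    Returns (strains with both, strains with the metabolite detected).
    """
    detected = {s for s, value in intensities.items() if value > threshold}
    return len(detected & gcf_strains), len(detected)


# Hypothetical intensities for three strains (not data from the study).
intensities = {"strain_1": 2.1e7, "strain_2": 8.0e5, "strain_3": 6.3e6}
gcf_strains = {"strain_1", "strain_3", "strain_4"}

both, detected_total = cooccurrence_count(intensities, gcf_strains)
```

A strain such as the hypothetical `strain_4`, which carries the cluster but shows no signal above threshold, is the kind of case that a targeted second-pass analysis would revisit.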
Figure 3. Distribution of the tambromycin gene cluster across diverse Streptomycetes. (A) The biosynthetic gene cluster of tambromycin is, with one exception, distributed throughout the Streptomyces virginiae clade (www.igb.illinois.edu/labs/metcalf/gcf/gcfDisplay.php?gcf=NRPS_GCF.519). (B) Eighteen strains within the virginiae clade were identified as having a member of this GCF. The plurality of strains containing a member of this gene cluster family made it possible to draw precise gene cluster boundaries from the ORFs that are conserved across all members of this group.
Functional Annotation of Genes in the Biosynthetic Gene Cluster of Tambromycin (from Strain F-4474)
| ORF | no. of amino acids | proposed function | homologue function | homologue accession (UniProt) | fractional amino acid identity (%) |
|---|---|---|---|---|---|
|  | 207 | flavin reductase | flavin reductase RbmH | Q8KI76 | 75/155 (48) |
|  | 534 | tryptophan 6-halogenase | flavin-dependent tryptophan halogenase RebH | Q8KHZ8 | 354/527 (67) |
|  | 333 | SARP family regulator | SARP family transcriptional regulator | R1FZM2 | 126/284 (44) |
|  | 528 | aldehyde dehydrogenase | betaine-aldehyde dehydrogenase | R4SX69 | 290/484 (60) |
|  | 611 | NRPS (A-T) (Trp) | uncharacterized protein | A4FJG4 | 301/547 (55) |
|  | 249 | NRPS (TE) | thioesterase | A0A077JC92 | 109/237 (46) |
|  | 60 | MbtH | putative MbtH-like protein | W7IRY5 | 33/54 (61) |
|  | 405 | cytochrome P450 (tryptophan oxygenase) | cytochrome P450 NovI | Q9L9F9 | 140/400 (35) |
|  | 1313 | NRPS (A-T-Cyc-Mt) (Cl-indole-acid) | nonribosomal peptide synthetase | A0A077JBM3 | 736/1319 (56) |
|  | 215 | phosphopantetheinyl transferase | 4′-phosphopantetheinyl transferase | A0A0C5FYL8 | 92/198 (46) |
|  | 344 | regulatory element | uncharacterized protein | A0A0C1VA91 | 179/353 (51) |
|  | 426 | transport | major facilitator superfamily protein | R4TAR4 | 191/396 (48) |
|  | 1364 | NRPS (C-A-T-Te) (2-me-ser) | uncharacterized protein | N0CW84 | 567/1402 (40) |
|  | 1119 | NRPS (C-A-T) (tambroline) | nonribosomal peptide synthetase | K0K8E4 | 474/1126 (42) |
|  | 469 | NRPS (C) | long-chain-fatty-acid-CoA ligase | J2JV27 | 171/463 (37) |
|  | 397 | acyl-CoA dehydrogenase | acyl-CoA dehydrogenase domain protein | C7QK66 | 162/341 (48) |
|  | 382 | acyl-CoA dehydrogenase | uncharacterized protein | V6K9A0 | 105/320 (33) |
|  | 194 | GCN5 family acetyltransferase |  | A0A094M2Y4 | 94/191 (49) |
|  | 357 | tryptophan aldolase | threonine aldolase | C7QGM4 | 178/331 (54) |
|  | 230 | hypothetical protein | uncharacterized protein fmoF | A0A077JC94 | 111/222 (50) |
|  | 467 | alanine hydroxymethyltransferase | serine hydroxymethyltransferase fmoH | A0A077JCX7 | 272/402 (68) |
|  | 1097 | NRPS (Cyc-A-T) (2-me-ser) | nonribosomal peptide synthetase fmoA3 | A0A077JG85 | 605/1091 (55) |
|  | 625 | regulatory element | regulatory protein | D9W2C9 | 331/607 (55) |
|  | 69 | hypothetical protein | uncharacterized protein | A0A0F0GHR6 | 27/73 (37) |
|  | 1015 | regulatory element | putative AfsR-like transcriptional regulator | B1VL80 | 522/989 (53) |
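The identity column reports a fraction followed by a rounded percentage, for example 75/155 (48). As a quick consistency check, the percentage can be recomputed from the fraction. The sketch below assumes the cell format "matches/aligned (percent)" seen in the table and assumes the percentage is the fraction rounded to the nearest integer; the function name is hypothetical.

```python
import re

# Matches cells such as "354/527 (67)": identical residues / aligned length (percent).
IDENTITY_CELL = re.compile(r"(\d+)/(\d+) \((\d+)\)")


def check_identity(cell):
    """Return (recomputed percent, reported percent) for one identity cell."""
    match = IDENTITY_CELL.fullmatch(cell.strip())
    if match is None:
        raise ValueError(f"unexpected identity cell format: {cell!r}")
    identical, aligned, reported = (int(group) for group in match.groups())
    recomputed = round(100 * identical / aligned)
    return recomputed, reported


# A few cells copied from the table above.
cells = ["75/155 (48)", "354/527 (67)", "140/400 (35)", "272/402 (68)"]
results = [check_identity(cell) for cell in cells]
all_consistent = all(recomputed == reported for recomputed, reported in results)
```

Note that the denominator is the aligned length, which can differ from the ORF length in the second column because alignments need not span the whole protein.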
Figure 4. Overview of the tambromycin biosynthetic gene cluster. Organization of the tambromycin biosynthetic gene cluster is shown. ORFs and protein representations are color coordinated with the structure of tambromycin to indicate their proposed association with biosynthesis of a particular amino acid substructure. Proteins colored gray are not proposed to be involved directly in biosynthesis, but rather in cluster regulation, transport, or other supporting roles (e.g., phosphopantetheinylation of NRPS proteins).
Figure 5. Summary of stable isotope feeding experiments and biosynthetic insights. (A) Stable isotope experiments were carried out by feeding labeled lysine, tryptophan, and alanine. When strain F-4474 was cultured with tryptophan-d3, tambromycin was observed to incorporate three deuterons from the indole portion of the tryptophan label into its structure, consistent with two substitutions occurring on the indole ring. Tambromycin also appeared to incorporate two 13C alanine monomers, which are modified via a hydroxymethyltransferase to form 2-methyl-serine. (B) Summary of heavy isotopes observed to be incorporated into the structure of tambromycin. (C) Lysine carrying no heavy isotopes (panel at far left) or labeled with deuterium, 13C, and 15N (at the indicated positions in each panel) was used to illuminate the biosynthetic path leading to the formation of tambroline by comparing the change in mass of a tambroline-derived internal fragment ion observed in the MS2 spectrum of tambromycin. (D) Proposed biosynthesis of tambroline from lysine through an acyl-CoA dehydrogenase-catalyzed mechanism consistent with the labeling results shown in (C).
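Label incorporation in experiments like these is read out as a mass shift of the intact molecule or of a fragment ion. The sketch below computes the expected shift for a given number of incorporated heavy atoms from standard monoisotopic masses. The function name and the example label counts are illustrative assumptions; the caption does not state how many labeled atoms each fed precursor carries beyond the three deuterons retained from tryptophan.

```python
# Mass difference (Da) between each heavy isotope and its light counterpart,
# from standard monoisotopic masses rounded to six decimals.
ISOTOPE_SHIFT_DA = {
    "2H": 2.014102 - 1.007825,    # deuterium vs. protium
    "13C": 13.003355 - 12.000000,  # carbon-13 vs. carbon-12
    "15N": 15.000109 - 14.003074,  # nitrogen-15 vs. nitrogen-14
}


def label_mass_shift(label_counts, charge=1):
    """Expected m/z shift for an ion that incorporated the given heavy atoms.

    label_counts: dict mapping isotope ("2H", "13C", "15N") -> number incorporated
    charge:       charge state of the observed ion (shift in m/z scales as 1/charge)
    """
    total_da = sum(ISOTOPE_SHIFT_DA[iso] * n for iso, n in label_counts.items())
    return total_da / charge


# Three retained deuterons, as reported for the tryptophan feeding experiment.
three_deuterons = label_mass_shift({"2H": 3})  # about 3.0188 for a singly charged ion

# Hypothetical fragment carrying one 13C and one 15N, observed at charge 2.
mixed_label = label_mass_shift({"13C": 1, "15N": 1}, charge=2)
```

Comparing such expected shifts against the observed change in a fragment ion's m/z is what allows labeled atoms to be assigned to a specific substructure, as done for the tambroline-derived fragment in panel (C).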