Jessie James Limlingan Malit, Hiu Yu Cherie Leung, Pei-Yuan Qian.
Abstract
Large-scale genome-mining analyses have identified an enormous number of cryptic biosynthetic gene clusters (BGCs) as a rich source of novel bioactive natural products (NPs). Given the sheer number of NP candidates, effective strategies and computational methods are key to choosing appropriate BGCs for further NP characterization and production. This review discusses genomics-based approaches for prioritizing candidate BGCs extracted from large-scale genomic data, highlighting studies that have successfully produced compounds with high chemical novelty, novel biosynthetic pathways, and potent bioactivities. We group these studies by their BGC-prioritization logic: detecting the presence of resistance genes, using phylogenomic analysis as a guide, and targeting specific chemical structures. We also briefly comment on the bioinformatics tools used in the field and examine practical considerations for conducting a large-scale genome-mining study.
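The first of these prioritization logics, detecting resistance genes, can be illustrated with a minimal sketch. It assumes that each candidate BGC has already been annotated upstream and reduced to a list of gene product names. The `BGC` record, the `RESISTANCE_MARKERS` list, and the `prioritize` function are hypothetical names introduced here for illustration, not part of any tool covered by the review. Simple substring matching stands in for the homology searches a real study would use.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical marker list (an assumption for this sketch): product names of
# genes whose presence inside a BGC may hint at self-resistance. The entries
# are taken from the resistance-gene-guided examples listed in this record.
RESISTANCE_MARKERS = (
    "pentapeptide repeat protein",
    "dihydroxyacid dehydratase",
    "fatty acid synthase",
)


@dataclass
class BGC:
    """Hypothetical record: a cluster name plus its annotated gene products."""
    name: str
    gene_products: List[str] = field(default_factory=list)


def resistance_hits(bgc: BGC) -> List[str]:
    """Return the markers found among the cluster's gene products."""
    products = [p.lower() for p in bgc.gene_products]
    return [m for m in RESISTANCE_MARKERS if any(m in p for p in products)]


def prioritize(bgcs: List[BGC]) -> List[Tuple[BGC, List[str]]]:
    """Keep clusters with at least one marker, most markers first."""
    scored = [(bgc, resistance_hits(bgc)) for bgc in bgcs]
    with_hits = [(bgc, hits) for bgc, hits in scored if hits]
    return sorted(with_hits, key=lambda pair: len(pair[1]), reverse=True)


if __name__ == "__main__":
    candidates = [
        BGC("bgc_a", ["polyketide synthase", "Dihydroxyacid dehydratase"]),
        BGC("bgc_b", ["terpene synthase"]),
    ]
    for bgc, hits in prioritize(candidates):
        print(bgc.name, hits)
```

A real pipeline would likely also need to confirm that a flagged gene is an extra copy of a housekeeping gene located within the cluster, which this sketch does not attempt.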
Keywords: antibiotics; bioactive compounds; genome mining; genomics; natural products; secondary metabolites
Year: 2022 PMID: 35736201 PMCID: PMC9231227 DOI: 10.3390/md20060398
Source DB: PubMed Journal: Mar Drugs ISSN: 1660-3397 Impact factor: 6.085
Figure 1. General workflow and examples of bioinformatic tools for natural product discovery guided by large-scale genome mining.
Genetic elements and NP features targeted by resistance, phylogenomic, structure, and RiPP-guided genome-mining strategies and the natural products they identified.
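| Strategy | Targeted Gene(s) or NP Feature | Natural Product | Source Organism | Reference |
|---|---|---|---|---|
| Resistance-gene-guided | pentapeptide repeat protein (PRP) sequences | alkylpyrone-407 | | [ |
| Resistance-gene-guided | pentapeptide repeat protein (PRP) sequences | pyxidicycline A | | [ |
| Resistance-gene-guided | dihydroxyacid dehydratase | aspterric acid | | [ |
| Resistance-gene-guided | tripartite efflux system PleABC | prosekin | Pseudomonas prosekii LMG 26867 | [ |
| Resistance-gene-guided | lanosterol 14α-demethylase | lanomycin | | [ |
| Resistance-gene-guided | fatty acid synthase | thiotetroamide | Streptomyces afghaniensis NRRL 5621 | [ |
| Resistance-gene-guided | D-stereospecific peptidase | bogorol | | [ |
| Phylogenomics-guided | Whole BGCs of different families | aryl polyenes | Escherichia coli | [ |
| Phylogenomics-guided | “Expanded-then-recruited” enzyme families; 3-carboxyvinyl-phosphoshikimate transferase | arseno-organic metabolites | Streptomyces lividans | [ |
| Phylogenomics-guided | Each shared gene found in glycopeptide antibiotic-producing BGCs | corbomycin | Streptomyces sp. WAC01529 | [ |
| Phylogenomics-guided | ATP-grasp ligase | MdnA7 | Cyanothece sp. PCC 7822 | [ |
| Phylogenomics-guided | LuxR | cepacin A | Burkholderia ambifaria | [ |
| Phylogenomics-guided | Whole BGCs containing | detoxin S1 | Streptomyces sp. NRRL S-325 | [ |
| Phylogenomics-guided | terpene synthase | hydropyrene | Streptomyces clavuligerus ATCC 27064 | [ |
| Phylogenomics-guided | chain length factor (CLF) protein | oryzanaphthopyran A | Streptacidiphilus oryzae CGMCC 4.2012 | [ |
| Structure-guided | cationic amino acid residues | brevicidine | | [ |
| Structure-guided | chemical transformations catalyzed by cytochrome P450 on cyclodipeptides | cyctetryptomycin B | | [ |
| Structure-guided | prenyl groups on cyclodipeptides | griseocazine D1 | | [ |
| Structure-guided | thioether bonds | freyrasin | | [ |
| Structure-guided | chemical transformations catalyzed by the DUF–SH didomain | guangnanmycin | | [ |
| Structure-guided | phosphonic acid | argolaphos A | | [ |
| Structure-guided combined with precursor peptide sequence search (RiPPs) | lanthipeptide | birimositide | Streptomyces rimosus subsp. rimosus WC3908 | [ |
| Structure-guided combined with precursor peptide sequence search (RiPPs) | cyanobactin | tolypamide | Tolypothrix sp. PCC 7601 | [ |
| Structure-guided combined with precursor peptide sequence search (RiPPs) | polyoxazole-thiazole-based cyclopeptide | aurantizolicin | Streptomyces aurantiacus JA 4570 | [ |
| Structure-guided combined with precursor peptide sequence search (RiPPs) | thiopeptide | saalfelduracin | Amycolatopsis saalfeldensis NRRL B-24474 | [ |
| Structure-guided combined with precursor peptide sequence search (RiPPs) | thioamitides | thiovarsolin A | Streptomyces varsoviensis | [ |
| Structure-guided combined with precursor peptide sequence search (RiPPs) | sactipeptide | streptosactin | Streptococcus thermophilus JIM 8232 | [ |
| Structure-guided combined with precursor peptide sequence search (RiPPs) | lanthipeptide | flavucin, agalacticin, etc. | Corynebacterium lipophiloflavum DSM 44291 (flavucin) | [ |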
Figure 2. Compounds identified through resistance-gene-guided genome mining and their resistance determinants (in brackets): alkylpyrone-407 and pyxidicycline A (pentapeptide repeat protein (PRP) sequences), aspterric acid (dihydroxyacid dehydratase), prosekin (tripartite efflux system PleABC), lanomycin (lanosterol 14α-demethylase), thiotetroamide (fatty acid synthase), and bogorol (D-stereospecific peptidase).
Figure 3. Compounds with novel chemical scaffolds identified through phylogenomics-guided genome mining: aryl polyenes from Escherichia coli, arseno-organic metabolites from Streptomyces lividans, corbomycin from Streptomyces sp. WAC01529, MdnA7 from Cyanothece sp. PCC 7822, cepacin A from Burkholderia ambifaria, detoxin S1 from Streptomyces sp. NRRL S-325, hydropyrene from Streptomyces clavuligerus ATCC 27064, and oryzanaphthopyran A from Streptacidiphilus oryzae CGMCC 4.2012.
Figure 4. Compounds identified through structure-guided genome mining and the specific chemical moieties targeted (in red): brevicidine (cationic amino acid residues), argolaphos A (phosphonic acid), cyctetryptomycin B (chemical transformations catalyzed by cytochrome P450), griseocazine D1 (prenyl groups), freyrasin (thioether bonds), and guangnanmycin (chemical transformations catalyzed by the DUF–SH didomain).
Figure 5. Novel RiPPs with unusual biosynthesis and chemical moieties identified through large-scale genome mining: birimositide from Streptomyces rimosus subsp. rimosus WC3908, tolypamide from Tolypothrix sp. PCC 7601, aurantizolicin from Streptomyces aurantiacus JA 4570, saalfelduracin from Amycolatopsis saalfeldensis NRRL B-24474, streptosactin from Streptococcus thermophilus JIM 8232, thiovarsolin A from Streptomyces varsoviensis, and flavucin from Corynebacterium lipophiloflavum DSM 44291.