Hengqian Ren, Chengyou Shi, Huimin Zhao.
Abstract
Natural products (NPs), also known as secondary metabolites, are produced by bacteria, fungi, and plants. NPs represent a rich source of antibacterial, antifungal, and anticancer agents. Recent advances in DNA sequencing technologies and bioinformatics have unveiled nature's great potential for synthesizing numerous NPs that may have unprecedented structural and biological features. However, discovering novel bioactive NPs by genome mining remains a challenge. Moreover, even when an NP shows interesting bioactivity, low productivity often significantly limits its practical application. Here we discuss the progress in developing bioinformatics tools for efficient discovery of bioactive NPs. In addition, we highlight computational methods for optimizing the productivity of NPs of pharmaceutical importance.
Keywords: Bioengineering; Bioinformatics; Biological Sciences; Metabolic Engineering
Year: 2019 PMID: 31926431 PMCID: PMC6957853 DOI: 10.1016/j.isci.2019.100795
Source DB: PubMed Journal: iScience ISSN: 2589-0042
Figure 1. Summary of Computational Strategies for Natural Product Biosynthetic Pathway Prediction and Optimization Described in This Review
Summary of Computational Tools for Pathway Prediction Highlighted in This Review
| Computational Tools | Target Organism | BGC Prediction | Input | Chemical Structure Prediction | Key Features | URL | Reference |
|---|---|---|---|---|---|---|---|
| antiSMASH | Bacteria and fungi | Unrestricted | DNA sequences | Yes | Integrate multiple BGC prediction tools/algorithms: ClusterFinder, NaPDoS, RODEO | ( | |
| plantiSMASH | Plant | Unrestricted | DNA sequences | No | Plant-adapted pHMMs and cluster detection rules and support for co-expression analysis | ( | |
| NP.searcher | Bacteria | NRPS, PKS, NRPS/PKS | DNA sequences | Yes | Predict 2D and 3D structure of NRPS/PKS | ( | |
| SMURF | Fungi | NRPS, PKS, NRPS/PKS, DMATS | Protein sequences and chromosomal coordinates of genes | No | Use gene coordinates as well as protein sequences as input | ( | |
| ClustScan | Bacteria | NRPS, PKS | DNA sequences | Yes | First employ pHMMs of signature genes for BGC prediction | Obtain by request at | ( |
| eSNaPD | Bacteria | Unrestricted | Metagenomic DNA | No | Uncover biosynthetic diversity from metagenomic data | ( | |
| ClusterFinder | Bacteria | Unrestricted | DNA sequences | No | Prediction is based on Pfam domain frequencies | ( | |
| EvoMining | Bacteria | Unrestricted | DNA sequences | No | Genome mining based on evolutionary principles | ( | |
| NRPS-PKS/SBPKS | Bacteria | NRPS, PKS | Protein sequences | Yes | Model 3D structures of individual PKS catalytic domains | ( | |
| NaPDoS | Metagenomic sample | NRPS, PKS | Protein or DNA sequences | No | Phylogenetic approach for domain analysis, various query types including genome contigs | ( | |
| BAGEL | Bacteria | Bacteriocin, RiPP | DNA sequences | No | Single-input whole-genome analysis for bacteriocin and RiPP BGC detection | ( | |
| RODEO | Bacteria | RiPP | Protein accession number | Yes | Combine hidden Markov model-based analysis, heuristic scoring and machine learning | ( | |
Figure 2. General Workflow of antiSMASH
By default, antiSMASH identifies signature biosynthetic genes (detected as hits against their profile hidden Markov models, pHMMs) that encode the enzymes responsible for generating a class-specific scaffold, and then locates a cluster based on a set of manually curated cluster rules. Alternatively, it uses the ClusterFinder algorithm for BGC prediction. For fungal BGCs, Cluster Assignment by Islands of Sites (CASSIS) is integrated for better prediction of gene cluster boundaries. In addition, several downstream analyses can be performed: NRPS/PKS domain analysis and annotation, prediction of the core chemical structure of PKS and NRPS products, substrate specificity prediction for NRPS ‘A’ domains (via NRPSpredictor2) (Röttig et al., 2011), ClusterBlast comparative gene cluster analysis, and smCOG secondary metabolism protein family analysis.
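The rule-based detection step in this workflow can be pictured with a small sketch. It is not antiSMASH code: the `Gene` record, the `SIGNATURE_DOMAINS` set, the 20 kb merge distance, and the example coordinates are all hypothetical, and pHMM hits are assumed to have been computed beforehand. It only illustrates how signature-gene hits lying close together on a contig might be merged into candidate cluster regions.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    start: int
    end: int
    domains: frozenset  # names of pHMM hits (assumed precomputed elsewhere)


# Hypothetical signature domains and merge distance, for illustration only.
SIGNATURE_DOMAINS = {"PKS_KS", "Condensation"}
MAX_GAP = 20_000


def find_clusters(genes, max_gap=MAX_GAP):
    """Merge signature genes separated by at most max_gap bases into regions."""
    core = sorted(
        (g for g in genes if g.domains & SIGNATURE_DOMAINS),
        key=lambda g: g.start,
    )
    clusters = []  # each entry: [region_start, region_end, [gene names]]
    for g in core:
        if clusters and g.start - clusters[-1][1] <= max_gap:
            # Close enough to the previous region: extend it.
            clusters[-1][1] = max(clusters[-1][1], g.end)
            clusters[-1][2].append(g.name)
        else:
            # Too far away (or first hit): open a new candidate region.
            clusters.append([g.start, g.end, [g.name]])
    return [(start, end, names) for start, end, names in clusters]


genes = [
    Gene("g1", 1_000, 4_000, frozenset({"PKS_KS"})),
    Gene("g2", 5_000, 6_000, frozenset({"ABC_tran"})),
    Gene("g3", 9_000, 15_000, frozenset({"Condensation"})),
    Gene("g4", 90_000, 95_000, frozenset({"PKS_KS"})),
]
clusters = find_clusters(genes)
```

With these made-up coordinates, `g1` and `g3` fall into one candidate region and `g4` forms a second one. A real rule set would also extend each region by flanking distances and apply class-specific logic.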
Summary of Computational Tools and Databases for Pathway Optimization Highlighted in This Review
| Name | Category | Target | Function | Key Features | URL | Reference |
|---|---|---|---|---|---|---|
| RetroPath 2.0 | Computational tool | Biosynthetic genes | Artificial pathway design | Automated open source workflow for retrosynthesis based on generalized reaction rules | ( | |
| Kyoto Encyclopedia of Genes and Genomes (KEGG) | Database | Biosynthetic genes | | Database for systematic analysis of gene functions, linking genomic information with higher order functional information | ( | |
| 1000 Plants (1KP) Project | Database | Biosynthetic genes | | Transcriptome data from over 1,000 plant species | ( | |
| CoStar | Computational tool | Biosynthetic genes | Codon optimization | Optimization of gene sequences by avoiding hairpins, GC content variation, and repeats | ( | |
| Presyncodon | Computational tool | Biosynthetic genes | Codon optimization | Design of gene sequences for optimized heterologous expression by machine learning | ( | |
| BacPP | Computational tool | Promoter | Promoter prediction | Identification of promoters from genome sequences by a machine learning algorithm | ( | |
| WebGeSter DB | Database | Terminator | | Database containing a million terminators identified in 1,060 bacterial genome sequences and 798 plasmids | ( | |
| RBS Calculator | Computational tool | RBS | RBS design | Control of expression level by designing RBSs with various strengths | ( | |
| ClusterCAD | Computational tool | Biosynthetic genes | Protein engineering | Design of chimeric PKSs | ( |
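Two of the sequence features that the table lists for codon optimization, GC content variation and repeats, can be screened with very little code. The sketch below is not taken from CoStar or Presyncodon: the function names, window sizes, and k-mer length are hypothetical, and hairpin detection is left out because it would need an RNA/DNA folding model.

```python
def gc_windows(seq, window=30, step=10):
    """GC fraction of each sliding window (one window if seq is shorter)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    fractions = []
    for i in range(0, max(len(seq) - window, 0) + 1, step):
        w = seq[i:i + window]
        fractions.append((w.count("G") + w.count("C")) / len(w))
    return fractions


def gc_spread(seq, window=30, step=10):
    """Difference between the most and least GC-rich windows."""
    fractions = gc_windows(seq, window, step)
    return max(fractions) - min(fractions)


def has_direct_repeat(seq, k=8):
    """True if any k-mer occurs more than once (overlapping copies count)."""
    seq = seq.upper()
    seen = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in seen:
            return True
        seen.add(kmer)
    return False


# Toy sequence with an abrupt GC-rich to AT-rich switch.
spread = gc_spread("GGGGCCCCAAAATTTT", window=8, step=8)
repeat_found = has_direct_repeat("ATGCATGCAA", k=4)
```

A candidate gene design could be rejected or re-sampled when `gc_spread` exceeds a chosen threshold or `has_direct_repeat` returns `True`. The thresholds themselves are a design choice that this sketch does not fix.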
Figure 3. Overview of the RBS Calculator
(A) Description of the thermodynamic model of bacterial translation initiation used in the RBS Calculator.
(B) Application of the RBS Calculator to the rational design of an RBS library and the optimization of violacein biosynthesis (Jeschek et al., 2016).
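As a rough illustration of the kind of model named in panel (A), the sketch below assumes the commonly described form in which the translation initiation rate scales as exp(-β·ΔG_tot), with ΔG_tot = ΔG_mRNA:rRNA + ΔG_start + ΔG_spacing - ΔG_standby - ΔG_mRNA. The value β = 0.45 mol/kcal is an approximate figure taken as an assumption here, and all free energies in the example are made up. The actual tool derives these terms from RNA folding calculations, which this sketch does not attempt.

```python
import math

# Assumed apparent Boltzmann factor in mol/kcal (approximate, illustrative).
BETA = 0.45


def total_dg(dg_mrna_rrna, dg_start, dg_spacing, dg_standby, dg_mrna):
    """Total free-energy change (kcal/mol) under the assumed model form."""
    return dg_mrna_rrna + dg_start + dg_spacing - dg_standby - dg_mrna


def relative_rate(dg_tot, beta=BETA):
    """Translation initiation rate on an arbitrary proportional scale."""
    return math.exp(-beta * dg_tot)


def fold_change(dg_from, dg_to, beta=BETA):
    """Predicted rate ratio when moving from one RBS to another."""
    return relative_rate(dg_to, beta) / relative_rate(dg_from, beta)


# Made-up free energies (kcal/mol) for a weak and a strong RBS variant.
weak = total_dg(-5.0, -1.2, 0.5, -0.5, -8.0)
strong = total_dg(-12.0, -1.2, 0.0, 0.0, -6.0)
ratio = fold_change(weak, strong)
```

Under these assumptions a more negative ΔG_tot gives a higher predicted initiation rate. Designing an RBS library then amounts to choosing sequences whose predicted ΔG_tot values span the desired range of expression levels.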