Tilmann Weber, Hyun Uk Kim.
Abstract
Natural products are among the most important sources of lead molecules for drug discovery. With the development of affordable whole-genome sequencing technologies and other 'omics tools, the field of natural products research is currently undergoing a paradigm shift. For decades, access to this group of compounds came mainly through analytical and chemical methods; genomics-based methods now offer complementary approaches to find, identify and characterize such molecules. This paradigm shift has also created a high demand for computational tools to assist researchers in their daily work. In this context, this review summarizes the tools and databases currently available to mine, identify and characterize natural product biosynthesis pathways and their producers based on 'omics data. A web portal, the Secondary Metabolite Bioinformatics Portal (SMBP, http://www.secondarymetabolites.org), is introduced to provide a one-stop catalog of, and links to, these bioinformatics resources. In addition, an outlook is presented on how existing tools and those yet to be developed will influence synthetic biology approaches in the natural products field.
Keywords: A, adenylation domain; Antibiotics; BGC, biosynthetic gene cluster; Bioinformatics; Biosynthesis; C, condensation domain; GPR, gene-protein-reaction; HMM, hidden Markov model; LC, liquid chromatography; MS, mass spectrometry; NMR, nuclear magnetic resonance; NRP, non-ribosomally synthesized peptide; NRPS; NRPS, non-ribosomal peptide synthetase; Natural product; PCP, peptidyl carrier protein; PK, polyketide; PKS; PKS, polyketide synthase; RiPP, ribosomally and post-translationally modified peptide; SVM, support vector machine
Year: 2016 PMID: 29062930 PMCID: PMC5640684 DOI: 10.1016/j.synbio.2015.12.002
Source DB: PubMed Journal: Synth Syst Biotechnol ISSN: 2405-805X
Fig. 1 Overview of the most commonly used and freely accessible tools specialized for the analysis of secondary metabolites and their pathways.
Comprehensive collection of freely accessible software programs and databases dedicated to natural product research. Only software programs and databases properly functioning as of December 2015 are listed in this table. A more comprehensive list can be found at the SMBP (http://www.secondarymetabolites.org).
| Software program or database | URL | Reference | Last publication or documented update | Main content and/or function |
|---|---|---|---|---|
| 2metDB<sup>R</sup> | | | 2013 | Standalone (Mac) tool to mine PKS/NRPS gene clusters |
| antiSMASH<sup>R/N</sup> | | | 2015 | Web application and standalone tool (LINUX, MacOS and MS Windows) to mine and analyze BGCs; includes comparative genomics tools and a homology-based metabolic modeling pipeline |
| BAGEL<sup>R</sup> | | | 2013 | Web application to mine and analyze RiPPs |
| CLUSEAN<sup>R</sup> | | | 2013 | Standalone (LINUX and MacOS) tool to mine and analyze BGCs, mainly PKS/NRPS |
| ClusterFinder<sup>N</sup> | | | 2014 | Standalone tool (LINUX and MacOS) to identify BGCs with a non-rule-based approach |
| eSNaPD<sup>R</sup> | | | 2014 | Web application to mine metagenomic datasets for BGCs |
| EvoMining<sup>N</sup> | | | 2015 | Web application for a phylogenomic approach to cluster identification |
| GNP/Genome Search<sup>R</sup> | | | 2015 | Web application to mine and analyze BGCs, mainly PKS/NRPS |
| GNP/PRISM<sup>R</sup> | | | 2015 | Web application to mine and analyze BGCs, mainly PKS/NRPS, including glycosylations and structure prediction |
| MIDDAS-M<sup>N</sup> | | | 2013 | Web application to use transcriptome data to identify BGC coordinates in fungal genomes |
| MIPS-CG<sup>N</sup> | | | 2015 | Web application to identify BGC coordinates in fungal genomes without transcriptome data |
| NaPDoS<sup>R</sup> | | | 2012 | Web application offering phylogenomic analysis of PKS-KS and NRPS-C domains |
| SMURF<sup>R</sup> | | | 2010 | Web application to mine PKS/NRPS/terpenoid gene clusters in fungal genomes |
| ClustScan Professional | | | 2008 | Java-based standalone tool to mine for PKS/NRPS BGCs |
| NP.searcher | | | 2009 | Web application/standalone tool (LINUX) to mine for PKS/NRPS BGCs |
| NRPS-PKS/SBSPKS | | | 2010 | Web application to mine for PKS BGCs |
| SEARCHPKS | | | 2003 | Web application to mine for PKS BGCs |
| LSI-based A-domain function predictor | | | 2014 | Web application to predict A-domain specificities |
| NRPS/PKS substrate predictor | | | 2013 | Web application to predict A-domain/AT-domain specificities |
| NRPSpredictor/NRPSpredictor2 | | | 2011 | Web application/standalone tool (LINUX, MS Windows, MacOS) to predict A-domain specificities |
| NRPSsp | | | 2012 | Web application to predict A-domain specificities |
| PKS/NRPS Web Server/Predictive Blast Server | | | 2009 | Web application to determine domain organization and A-domain specificities |
| SEARCHGTr | | | 2005 | Web application to predict glycosyltransferase specificities |
| SEQL-NRPS | | | 2015 | Web application to predict A-domain specificities |
| Bactibase | | | 2011 | Web accessible database of bacteriocins |
| ClusterMine360 | | | 2013 | Web accessible database of BGCs |
| ClustScan Database | | | 2013 | Web accessible database of PKS/NRPS BGCs |
| DoBISCUIT | | | 2015 | Web accessible database of PKS/NRPS BGCs |
| IMG-ABC | | | 2015 | Web accessible database of BGCs, tightly integrated into JGI's IMG platform |
| MIBiG | | | 2015 | Web accessible repository of BGCs |
| Recombinant ClustScan Database | | | 2013 | Database of |
| Antibioticome | | Unpublished | 2015 | Web accessible database on compounds, compound families and modes of action |
| ChEBI | | | 2015 | Web accessible database and ontology on compounds focused on small molecules |
| ChEMBL | | | 2015 | Web accessible database on bioactive compounds with drug-like properties |
| ChemSpider | | | 2015 | Web accessible database on structures and properties of over 35 million structures |
| KNApSAcK database | | | 2015 | Web accessible database on compounds; standalone version of KNApSAcK metabolite database available |
| NORINE | | | 2015 | Web accessible database on NRPs |
| Novel Antibiotics Database | | Unpublished | 2008 | Web accessible database on compounds |
| PubChem | | | 2015 | Web accessible database on compounds and bioactivities; source data available for download |
| StreptomeDB | | | 2015 | Web accessible database on compounds produced by streptomycetes; download of compounds and metadata in SD format |
| Cycloquest | | | 2011 | Web application to correlate tandem MS data of cyclopeptides with gene clusters |
| GNPS | | Unpublished | 2015 | Generic metabolomics portal to analyze MS/MS data (dereplication and molecular networking) |
| GNP/iSNAP | | | 2015 | Web application to automatically identify metabolites in MS/MS data based on genomic data |
| NRPquest | | | 2014 | Web application to correlate NRP tandem MS data with gene clusters |
| Pep2Path | | | 2014 | Standalone application to correlate peptide sequence tags with NRP and RiPP BGCs |
| RiPPquest | | | 2014 | Web application to correlate RiPP tandem MS data with gene clusters |
High-throughput metabolic modeling tools that can facilitate engineering of actinomycetes for secondary metabolite production. Tools are shown in the order of the year they appeared.
| Software program | URL | Reference | Year of publication | Main content and/or function |
|---|---|---|---|---|
| Model SEED | | | 2010 | First online high-throughput metabolic modeling tool |
| MEMOSys | | | 2011 | Allows management, storage, and development of metabolic models |
| SuBliMinaL Toolbox | | | 2011 | Has strengths in managing chemical information for metabolites in a metabolic model |
| FAME | | | 2012 | Allows streamlined analysis of a newly built metabolic model using various simulation methods |
| GEMSiRV | | | 2012 | Allows metabolic model reconstruction, simulation and visualization |
| MetaFlux in Pathway Tools | | | 2012 | Provides strong support for predicting, modeling, curating and visualizing metabolic pathways |
| MicrobesFlux | | | 2012 | Allows both flux balance analysis (FBA) and dynamic FBA of a newly generated metabolic model |
| RAVEN Toolbox | | | 2013 | Allows metabolic model reconstruction, simulation and visualization in the MATLAB environment |
| CoReCo | | | 2014 | Useful for modeling the metabolism of multiple related species |
| merlin | | | 2015 | Most recently released metabolic modeling program, with comprehensive genome annotation functionalities necessary for model generation |
| antiSMASH | | | 2015 | Provides a comprehensive genome mining platform for BGCs; currently the only platform offering automated modeling including secondary metabolite specific reactions |
Fig. 2 A screenshot of the antiSMASH page in the Secondary Metabolite Bioinformatics Portal at http://www.secondarymetabolites.org.