Tilmann Weber, Kai Blin, Srikanth Duddela, Daniel Krug, Hyun Uk Kim, Robert Bruccoleri, Sang Yup Lee, Michael A Fischbach, Rolf Müller, Wolfgang Wohlleben, Rainer Breitling, Eriko Takano, Marnix H Medema.
Abstract
Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. Full integration of the recently published ClusterFinder algorithm now allows this probabilistic algorithm to be used to detect putative gene clusters of unknown types. In addition, a new dereplication variant of the ClusterBlast module now identifies similarities between detected clusters and any of 1172 clusters with known end products. At the enzyme level, active sites of key biosynthetic enzymes are now pinpointed through a curated pattern-matching procedure, and Enzyme Commission numbers are assigned to functionally classify all enzyme-coding genes. Additionally, chemical structure prediction has been improved by incorporating polyketide reduction states. Finally, so that users can organize and analyze multiple antiSMASH outputs in a private setting, a new XML output module allows offline editing of antiSMASH annotations within the Geneious software.
Year: 2015 PMID: 25948579 PMCID: PMC4489286 DOI: 10.1093/nar/gkv437
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Example of KnownClusterBlast output, using the balhimycin gene cluster (GenBank Y16952.3). The significance thresholds are the same as for the ClusterBlast module (8). Following the balhimycin gene cluster itself, several other BGCs involved in the biosynthesis of similar glycopeptides are shown as the next best hits. The percentage of genes in the query cluster that are present in the hit cluster is included as extra information. Hyperlinks to the MIBiG repository are also provided, where users can find additional information on each gene cluster.
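The percentage shown next to each hit can be read as a simple overlap measure. The following is a minimal sketch, assuming each query gene has already been assigned its significant hits; the gene identifiers and the `hits_by_query_gene` mapping are hypothetical, and the actual antiSMASH implementation may count hits differently.

```python
def percent_query_genes_in_hit(query_genes, hits_by_query_gene, hit_cluster_genes):
    """Percentage of query-cluster genes with at least one significant hit
    located in the given hit cluster."""
    hit_cluster_genes = set(hit_cluster_genes)
    if not query_genes:
        return 0.0
    matched = sum(
        1
        for gene in query_genes
        if hit_cluster_genes & set(hits_by_query_gene.get(gene, ()))
    )
    return 100.0 * matched / len(query_genes)


# Hypothetical identifiers, for illustration only.
query = ["q1", "q2", "q3", "q4"]
hits = {"q1": ["h1"], "q2": ["h7"], "q3": ["h2", "h9"]}
hit_cluster = ["h1", "h2", "h3"]

percentage = percent_query_genes_in_hit(query, hits, hit_cluster)
print(f"{percentage:.0f}% of query genes are present in the hit cluster")
```

Here two of the four query genes (`q1` and `q3`) have a hit inside the hit cluster, so the value reported is 50%.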
Figure 2. BiosynML output and Geneious plugin. The schematic shows how BiosynML functionality supports typical tasks during BGC analysis, including antiSMASH annotation, manual BGC refinement, deposition to in-house databases and submission to the public MIBiG repository.
Overview of analyses integrated into antiSMASH

| Detected BGC classes (a, b) | Detected BGC classes (continued) |
|---|---|
| Aminocoumarins | Melanins |
| Aminoglycosides/aminocyclitols | Microcins |
| Aryl polyenes | Microviridins |
| Bacteriocins | Non-ribosomal peptides |
| Beta-lactams | Nucleosides |
| Bottromycins | Oligosaccharide |
| Butyrolactones | Others |
| ClusterFinder fatty acids | Phenazines |
| ClusterFinder saccharides | Phosphoglycolipids |
| Cyanobactins | Phosphonates |
| (Dialkyl)resorcinols | Polyunsaturated fatty acids |
| Ectoines | Trans-AT type I PKS |
| Furans | Type I PKS |
| Glycocins | Type II PKS |
| Head-to-tail cyclized peptides | Type III PKS |
| Heterocyst glycolipid PKS-like | Proteusins |
| Homoserine lactones | Sactipeptides |
| Indoles | Siderophores |
| Ladderane lipids | Terpenes |
| Lantipeptides | Thiopeptides |
| Linear azol(in)e-containing peptides (LAPs) | |
| Lasso peptide | |
| Linaridins | |

| Detection of BGCs of unknown type |
|---|
| ClusterFinder |

| BGC-specific analyses |
|---|
| Domain structure of PKSs and NRPSs (c) |
| NRPS: A-domain specificity prediction |
| PKS: AT specificity prediction |
| Identification of conserved active site motifs; stereochemistry-determining motifs |
| Prediction of core chemical structure (NRPS, PKS, lanthipeptides) |
| smCOG secondary metabolism-related gene family prediction |

| Genome-wide analyses |
|---|
| Protein family detection (PFAM) search |
| EC number prediction |
| Homology-based metabolic modeling (with template models |

| Comparative analyses |
|---|
| ClusterBlast (identification of similar clusters in sequenced genomes) |
| SubClusterBlast (identification of conserved operons and multigene modules with known function) |
| KnownClusterBlast (identification of similar experimentally characterized gene clusters) |

| External tools |
|---|
| NCBI BLAST+ |
| NaPDoS |
| Norine |

| Output formats |
|---|
| GenBank |
| EMBL |
| SBML (for metabolic model files) |
| BiosynML |
| XLS (Microsoft Excel) |
| Tab-delimited text files |

| Input formats |
|---|
| FASTA (nucleotide or protein) |
| GenBank/GenPept |
| EMBL |
| Direct download via NCBI accession number |

(a) For a list of profile Hidden Markov Models (pHMMs) used to detect the different classes, please see Supplementary Table S1.
(b) For a list of rules, please see Supplementary Table S2.
(c) For a list of detectable domains, please see Supplementary Table S3.
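
For readers who want to obtain the same kind of input record outside the web server, the direct download by NCBI accession number listed among the input options can be approximated with the public NCBI E-utilities `efetch` endpoint. This is a sketch under that assumption only: the source does not state how antiSMASH itself retrieves records, and the helper name `genbank_efetch_url` is hypothetical. The accession used is the balhimycin cluster from Figure 1.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities endpoint (assumption: antiSMASH may use a different route).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def genbank_efetch_url(accession):
    """Build an efetch URL that returns a full GenBank record as plain text."""
    params = {
        "db": "nuccore",            # nucleotide database
        "id": accession,            # accession, optionally with version
        "rettype": "gbwithparts",   # GenBank flat file including sequence
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


url = genbank_efetch_url("Y16952.3")
print(url)
```

The URL can then be fetched with any HTTP client and the resulting GenBank file supplied as input; the network call is left out here so that the example runs offline.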