The antiSMASH database version 2: a comprehensive resource on secondary metabolite biosynthetic gene clusters.

Kai Blin1, Victòria Pascal Andreu2, Emmanuel L C de Los Santos3, Francesco Del Carratore4, Sang Yup Lee1,5, Marnix H Medema2, Tilmann Weber1.   

Abstract

Natural products originating from microorganisms are frequently used in antimicrobial and anticancer drugs, pesticides, herbicides or fungicides. In recent years, the increasing availability of microbial genome data has made it possible to access the wealth of biosynthetic clusters responsible for the production of these compounds by genome mining. antiSMASH is one of the most popular tools in this field. The antiSMASH database provides pre-computed antiSMASH results for many publicly available microbial genomes and allows for advanced cross-genome searches. The current version 2 of the antiSMASH database contains annotations for 6,200 full bacterial genomes and 18,576 bacterial draft genomes and is available at https://antismash-db.secondarymetabolites.org/.

Year:  2019        PMID: 30395294      PMCID: PMC6324005          DOI: 10.1093/nar/gky1060

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

A majority of antibacterial and antifungal drugs, as well as drugs for many other indications, are derived from microbial natural products (1). Traditionally, bioactive natural compounds were identified via classical isolation and analysis approaches. The increasing availability of genomic data in the last two decades allows us to complement these approaches with genome mining to identify and characterize biosynthetic pathways for natural products in genome and metagenome data (2). Specialized software to support researchers in their search for natural products has been available for some years (for a comprehensive overview of such tools, please see (3–5)). Since its initial release in 2011, antiSMASH (6–9) has established itself as a standard tool for secondary metabolite genome mining and is currently the most widely used software pipeline for this task. antiSMASH uses a rule-based cluster detection approach to identify 45 different types of secondary metabolite biosynthetic pathways via their core biosynthetic enzymes. For nonribosomal peptide synthetases, type I polyketides, terpenes, lanthipeptides, thiopeptides, sactipeptides and lasso peptides, antiSMASH can also provide more detailed predictions of the compounds produced by the respective biosynthetic gene clusters (BGCs). Identified clusters are compared to a database of clusters previously predicted by antiSMASH using the built-in ClusterBlast algorithm. A similar algorithm, KnownClusterBlast, is used to compare the identified clusters against the manually curated set of known BGCs from the MIBiG (10) database. Secondary metabolite clusters of orthologous groups (smCoG) classification is used to assign functions to gene products in the predicted BGCs.
As antiSMASH is a genome mining pipeline designed to analyze individual genomes, we developed the antiSMASH database (11) to provide interconnections and cross-genome search functionality based on antiSMASH results for many publicly available microbial genomes. Moreover, it provides users with instant access to full antiSMASH results of publicly available genome sequences. Here we present version 2 of the antiSMASH database. The database content of version 1, which was generated with version 3 of antiSMASH, was updated with annotations from the current antiSMASH 4.2.1 release. As a result, the antiSMASH database now includes updated detection rules, updated ClusterBlast database links, TTA codon prediction, NRPS A-domain predictions by the up-to-date SANDPUMA software (12), classification of terpenes and improved links to MIBiG (10) (for details, please see (9)). Furthermore, new sequences that became available after the release of version 1 were included. Version 2 of the antiSMASH database now contains genome mining results for 6,200 full bacterial genomes and 18,576 draft genomes from the NCBI RefSeq database (13). The increased dataset is accompanied by improvements in the search functionality, data export options and the user interface of the antiSMASH database.

MATERIALS AND METHODS

Selection of included genomes

Microbial genome resources are growing rapidly and, despite taxonomically novel genomes being released frequently, there is a lot of sequence redundancy in the NCBI genome databases, e.g. thousands of sequences of mostly pathogenic bacteria such as Pseudomonas aeruginosa or Escherichia coli. Therefore, with the objective of creating a representative, non-redundant set of genomes, we designed an approach to effectively update the antiSMASH database, maintaining its high quality and adequately representing natural diversity without significantly decreasing the overall pipeline performance in terms of speed. Genomes categorized as ‘draft genomes’ are fragmented into multiple contigs. As many secondary metabolite biosynthetic gene clusters contain repetitive sequences, many BGCs end up being split across multiple contigs without any linkage information, leading to low-quality BGC data. To minimize this issue, we prioritized the inclusion of NCBI RefSeq genomes that were annotated with the assembly level ‘complete genome’ or ‘chromosome’ and were present in the database in April 2018 (10 863 genomes in total). We then estimated the distance between selected assemblies using fastANI (https://github.com/ParBLiSS/FastANI), which uses a hash-based algorithm to estimate the average nucleotide identity (ANI) between pairs of genomic assemblies. A network was generated with each genome as a node, and with weighted edges between nodes corresponding to the fastANI estimate between genomes. We used a fastANI similarity score of 99.6 as a cutoff for having an edge between nodes. Nodes were then assigned to communities using the multilevel community structure algorithm (https://arxiv.org/abs/0803.0476) in the igraph Python package (Csardi G, Nepusz T: The igraph software package for complex network research, InterJournal, Complex Systems 1695. 2006. http://igraph.org).
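The network step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: the genome names and ANI values are made up, an edge is assumed to exist when the estimate is at or above the cutoff, and connected components (via a small union-find) are used as a simplified stand-in for the multilevel community algorithm from igraph, which would additionally take the edge weights into account. Only the 99.6 cutoff comes from the text.

```python
from collections import defaultdict

ANI_CUTOFF = 99.6  # cutoff quoted in the text; everything else is illustrative


def group_genomes(genomes, ani_pairs, cutoff=ANI_CUTOFF):
    """Group genomes that are linked by ANI estimates >= cutoff.

    Connected components are a simplified stand-in for the multilevel
    community detection described in the text.
    """
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the root, shortening the path on the way.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b, ani in ani_pairs:
        if ani >= cutoff:  # assumption: the cutoff itself still counts as an edge
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for g in genomes:
        groups[find(g)].add(g)
    # Sort for a stable, reproducible order of groups.
    return sorted(groups.values(), key=lambda members: sorted(members))


# Hypothetical genomes and ANI estimates (not real fastANI output).
genomes = ["genome_A", "genome_B", "genome_C", "genome_D"]
ani_pairs = [
    ("genome_A", "genome_B", 99.9),
    ("genome_B", "genome_C", 99.7),
    ("genome_C", "genome_D", 98.2),
]
groups = group_genomes(genomes, ani_pairs)
print(groups)
```

In this toy input, genome_A, genome_B and genome_C end up in one group because they are chained by edges above the cutoff, while genome_D stays on its own.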
Finally, a representative genome from each community was chosen by prioritizing assemblies with the highest contig N50 and lowest contig L50. This resulted in a total of 6,200 complete genomes for the antiSMASH database. To supplement the set of complete and chromosomal assemblies, we added a set of draft genomes to the antiSMASH database. To select draft genomes for addition to the database, we started with a previously published set of precomputed fastANI similarity scores of ninety thousand prokaryotic genomes (https://doi.org/10.1101/225342). We pre-filtered this set to remove poor-quality genomes (N50 < 20 kb or assembly anomalies). We then performed the same procedure as with the complete and chromosomal assemblies to group the draft genomes into communities. A representative genome from each community was chosen by prioritizing assemblies based on assembly level (scaffold > contig), and then selecting assemblies with the highest contig N50 and lowest contig L50. To maintain consistency with the complete and chromosomal set, only draft genomes that had corresponding RefSeq assemblies were included in the database. This resulted in an additional 18,576 draft genome entries that were added to the database.
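The selection of a representative draft assembly can be illustrated as a sort key. This is a sketch under assumptions: the text does not say how N50 and L50 are combined, so the example ranks lexicographically (assembly level first, then highest contig N50, then lowest contig L50); the `Assembly` class, the accession names and all numbers are hypothetical, and the "assembly anomalies" filter is omitted.

```python
from dataclasses import dataclass

MIN_CONTIG_N50 = 20_000  # 20 kb quality threshold from the text


@dataclass(frozen=True)
class Assembly:
    accession: str   # hypothetical identifier, not a real RefSeq accession
    level: str       # "scaffold" or "contig"
    contig_n50: int  # in bp
    contig_l50: int


# Scaffold-level assemblies are preferred over contig-level ones.
LEVEL_RANK = {"scaffold": 0, "contig": 1}


def pick_representative(community):
    """Best assembly level first, then highest N50, then lowest L50."""
    return min(
        community,
        key=lambda a: (LEVEL_RANK[a.level], -a.contig_n50, a.contig_l50),
    )


# Made-up community of four draft assemblies.
community = [
    Assembly("asm_1", "contig", 150_000, 12),
    Assembly("asm_2", "scaffold", 90_000, 20),
    Assembly("asm_3", "scaffold", 120_000, 15),
    Assembly("asm_4", "scaffold", 15_000, 80),
]
# Drop poor-quality assemblies before choosing a representative.
filtered = [a for a in community if a.contig_n50 >= MIN_CONTIG_N50]
best = pick_representative(filtered)
print(best.accession)
```

Note that asm_1 has the highest N50 but is passed over because the assembly level is ranked before contiguity.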

antiSMASH annotations and data import

Based on the selection criteria mentioned above, the assemblies were downloaded from the NCBI servers in GenBank format using the ncbi-genome-download tool (https://github.com/kblin/ncbi-genome-download/). GNU parallel (14) was used to run multiple Docker containers of antiSMASH 4.2.1 simultaneously. Different analysis parameters were used for the full and draft genome sets. For full genomes, ClusterBlast, KnownClusterBlast, SubClusterBlast, ActiveSiteFinder, TTA codon detection in automatic mode, secondary metabolite clusters of orthologous groups prediction, and cluster-specific detailed annotations were run (command line flags: --clusterblast --knownclusterblast --subclusterblast --asf --tta-auto --smcogs-notree). For draft genomes, antiSMASH was run in fast mode, skipping the detailed annotations. Additionally, KnownClusterBlast, TTA codon detection in automatic mode, and secondary metabolite clusters of orthologous groups prediction were run (command line flags: --minimal --knownclusterblast --tta-auto --smcogs-notree). The SQL schema of the antiSMASH database (https://github.com/antismash/db-schema/) was updated to accommodate the annotation changes and additional features and predictions that were introduced by antiSMASH version 4. The antiSMASH results in GenBank format were loaded into the SQL schema using the import script available at https://github.com/antismash/db-import/.
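The two parameter sets can be kept side by side in a small helper, sketched below. The flags are quoted from the text; the executable name `antismash`, the input file names and the position of the input file after the flags are placeholders chosen for illustration, and the Docker and GNU parallel invocation is not reproduced here.

```python
import shlex

# Flag sets quoted from the text.
FULL_GENOME_FLAGS = [
    "--clusterblast",
    "--knownclusterblast",
    "--subclusterblast",
    "--asf",
    "--tta-auto",
    "--smcogs-notree",
]
DRAFT_GENOME_FLAGS = [
    "--minimal",
    "--knownclusterblast",
    "--tta-auto",
    "--smcogs-notree",
]


def build_command(genbank_path, is_draft, executable="antismash"):
    """Assemble an argument list; `executable` is a placeholder name."""
    flags = DRAFT_GENOME_FLAGS if is_draft else FULL_GENOME_FLAGS
    return [executable, *flags, genbank_path]


# Hypothetical input files.
full_cmd = build_command("complete_genome.gbk", is_draft=False)
draft_cmd = build_command("draft_genome.gbk", is_draft=True)
print(shlex.join(full_cmd))
print(shlex.join(draft_cmd))
```

Building the command as a list rather than a single string avoids quoting problems if a file name contains spaces.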

RESULTS AND DISCUSSION

With an update to the PGAP annotation pipeline used by the NCBI, the annotation issues that caused us to use records from GenBank instead of RefSeq for version 1 of the antiSMASH database have largely been resolved. Hence, with version 2 of the database, we have switched to using RefSeq genomes to obtain more unified gene annotations. The antiSMASH database 2 contains BGCs identified in 6,200 full genomes (an increase of 58%) and adds 18,576 draft genomes. Annotations in the database are generated by antiSMASH version 4.2.1, the most recent release of antiSMASH (9). New in the antiSMASH 4.2.1 release are detection rules for N-acyl amino acids, polybrominated diphenyl ethers, and PPY-like pyrones. Detailed cluster product predictions have been added for lasso peptides, thiopeptides, sactipeptides (based on RODEO (15)), nonribosomal peptide synthetases (based on SANDPUMA (12)) and terpenes. The ClusterBlast and KnownClusterBlast databases have been updated. The search builder has been extended to cover these new features. A new search field in the taxonomy browser makes it easier to navigate to species of interest in the much larger dataset. The gene cluster data obtained in the queries can be downloaded. Depending on the type of search, different file formats are available. For gene cluster searches, the result table can be downloaded in tabular (CSV) format; alternatively, it is possible to retrieve the DNA sequences of all matching clusters in FASTA format. Gene and protein domain searches offer a download of the protein and nucleotide sequences of all matching genes or protein domains, respectively, or a tabular representation of the results. New options are provided to download specific chunks of the result data (for example only the first 1000 sequences) and to choose between standard FASTA headers including the IDs and descriptive headers that also include the query the hits were obtained with.
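Once a FASTA file has been downloaded, it can be post-processed locally. The sketch below is a minimal, generic FASTA reader that keeps only the first N records, mirroring the idea of chunked downloads on the client side; it is not the database's own implementation, and the record names and sequences are made-up placeholders rather than real antiSMASH database output.

```python
from itertools import islice


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)  # emit the final record


# Placeholder records standing in for a downloaded file.
example = """>record_1 placeholder description
ATGC
GGTA
>record_2 placeholder description
TTGACA
>record_3 placeholder description
CCC
""".splitlines()

# Keep only the first two records without reading the rest.
first_two = list(islice(read_fasta(example), 2))
print(first_two)
```

Because `read_fasta` is a generator, `islice` stops parsing as soon as the requested number of records has been produced, which matters for large result files.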
The selection of genomes available from NCBI still skews the perspective on the available diversity of biosynthetic gene clusters. While the antiSMASH database contains sequences from 33 different phyla, sequences from e.g. Proteobacteria are vastly overrepresented due to their significance as pathogens. The database now contains 32 548 biosynthetic gene clusters from the full genome dataset, an increase of 46% from version 1 (Table 1). Statistics from the 18 576 draft genomes certainly overpredict the number of identified clusters, because clusters that are split over several contigs are counted multiple times. Nevertheless, the fast-mode results still provide a good first estimate of the available biosynthetic diversity of the draft genomes. Of the 119 558 BGCs predicted on the draft genomes, over a third (41 482) are in contact with at least one contig edge and thus likely incomplete. In comparison, only ∼1% of the clusters from the full genome dataset (390 in 32 548) are located on a contig edge. As the abundant fragmentation of clusters in draft genomes skews the numbers, the following statistics only count the results from the full genomes. See Table 2 for detailed cluster counts by BGC type and a comparison with the cluster counts from version 1.
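The contig-edge proportions quoted above can be checked with a short calculation. All four counts are taken from the text; only the variable and function names are introduced here for illustration.

```python
# Counts quoted from the text.
draft_bgcs, draft_on_edge = 119_558, 41_482
full_bgcs, full_on_edge = 32_548, 390


def edge_share(on_edge, total):
    """Percentage of clusters touching at least one contig edge."""
    return 100 * on_edge / total


draft_pct = edge_share(draft_on_edge, draft_bgcs)
full_pct = edge_share(full_on_edge, full_bgcs)
print(f"draft: {draft_pct:.1f}%  full: {full_pct:.1f}%")
```

The draft-genome share comes out just above one third and the full-genome share at roughly 1%, in line with the statements in the paragraph.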
Table 1.

Overview on BGC numbers in version 1 and version 2 of the antiSMASH database

Overall database statistics | Version 1 counts | Version 2 counts | % change
Full (high quality) genomes | 3907 | 6200 | 58
Number of BGCs in full genomes | 22 292 | 32 548 | 46
Draft genomes | 0 | 18 576 | New
Number of BGCs in draft genomes | 0 | 119 558 | New
BGCs in total | 22 292 | 152 106 | 682
Table 2.

Changes in cluster counts of the different BGC types between version 1 and version 2 of the antiSMASH database (excluding data from draft genomes)

Gene cluster types (high quality genomes) | Version 1 counts | Version 2 counts | % change
NRPS
Nonribosomal peptide | 5878 | 7893 | 34
Terpenes
Terpene | 3362 | 5018 | 49
Polyketides
Type I polyketide | 2608 | 3302 | 27
Type III polyketide | 742 | 1141 | 54
hglE-type polyketide | 590 | 768 | 30
Trans-AT polyketide | 512 | 623 | 22
Type II polyketide | 173 | 307 | 77
PPY-like pyrone | 0 | 13 | New
RiPPs
Bacteriocin/RiPP | 3323 | 5198 | 56
Lanthipeptide | 857 | 1121 | 31
Thiopeptide | 122 | 1097 | 799
Lasso peptide | 351 | 562 | 60
Sactipeptide | 59 | 318 | 439
Microviridin | 18 | 70 | 289
Head-to-tail cyclised (subtilosin-like) | 22 | 52 | 136
Proteusin | 13 | 39 | 200
Microcin | 5 | 3 | -40
Bottromycin-like | 1 | 2 | 100
Other
Other | 1887 | 2322 | 23
Siderophore | 1399 | 1745 | 25
Homoserine lactone | 1084 | 1608 | 48
Aryl polyene | 988 | 1595 | 61
Ectoine | 424 | 794 | 87
Butyrolactone | 189 | 392 | 107
Phosphonate | 248 | 342 | 38
Resorcinol | 184 | 261 | 42
Ladderane | 113 | 217 | 92
Phenazine | 152 | 210 | 38
Melanin | 45 | 113 | 151
N-acyl amino acid cluster | 0 | 110 | New
Indole | 48 | 104 | 117
Cyanobactin | 30 | 77 | 157
Polyunsaturated fatty acid | 45 | 61 | 36
Oligosaccharide | 40 | 54 | 35
Aminoglycoside/aminocyclitol | 26 | 51 | 96
Nucleoside | 23 | 49 | 113
Linaridin | 17 | 35 | 106
beta-lactam | 13 | 30 | 131
Aminocoumarin | 3 | 10 | 233
Pheganomycin-like ligase | 5 | 7 | 40
Phosphoglycolipid | 1 | 4 | 300
Furan | 2 | 3 | 50
Glycocin | 14 | 3 | -79
Polybrominated diphenyl ether | 0 | 1 | New
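The "% change" column of Table 2 is consistent with the relative change from version 1 to version 2, rounded to a whole percent. The sketch below recomputes it for a few rows copied from the table; the function name is introduced here for illustration, and labelling a zero baseline as "New" simply follows the table's own convention.

```python
def percent_change(v1, v2):
    """Relative change from version 1 to version 2, as a whole percent."""
    if v1 == 0:
        return "New"  # no version 1 baseline, labelled as in the table
    return round(100 * (v2 - v1) / v1)


# Rows copied from Table 2: (BGC type, version 1 count, version 2 count).
rows = [
    ("Type II polyketide", 173, 307),
    ("Thiopeptide", 122, 1097),
    ("Glycocin", 14, 3),
    ("PPY-like pyrone", 0, 13),
]
changes = {name: percent_change(v1, v2) for name, v1, v2 in rows}
print(changes)
```

A negative value, as for glycocins, indicates that fewer clusters of that type were detected in version 2 than in version 1.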
In order to get an accurate taxonomic overview, the identified BGCs were mapped to a phylogenetic tree displaying approximately half of the genomes (12 219 complete and draft) that are included in the database (Figure 1A). The topology of the tree shows the microbial diversity chosen, ranging from well-characterized phyla to unclassified bacteria found in diverse ecosystems. Proteobacteria, Actinobacteria, Bacteroidetes, Firmicutes, Spirochaetes, Tenericutes, Cyanobacteria and Deinococcus-Thermus are the eight most abundant bacterial divisions in our database, accounting for 97.6% of genomes, and they all vary in the number of harbored BGCs (Figure 1B). High BGC numbers are characteristic of some groups of bacteria, such as Actinobacteria (containing 13 clusters on average in full genomes), while others, like Tenericutes, rarely possess one. These bacteria exhibit different distributions in terms of encoded secondary metabolite types as defined by antiSMASH (Figure 1C). For these statistics, the 45 BGC classes in antiSMASH have been condensed into five major groups: Non-Ribosomal Peptide Synthetase (NRPS), Polyketide, Ribosomally synthesized and post-translationally modified peptides (RiPP), Terpenes and Others (clusters that do not belong to any of the aforementioned types). Terpenes, bacteriocins (a type of RiPP) and NRPS are the most common BGC types, all with a higher number of representatives in the phylum Proteobacteria.
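Condensing BGC types into the five major groups amounts to a lookup followed by a sum. The sketch below shows the idea for a handful of types; the group assignments follow the group headings of Table 2 and the counts are the version 2 values from that table, but the mapping is deliberately incomplete and the function and variable names are introduced here only for illustration.

```python
from collections import Counter

# Type-to-group assignments following the group headings of Table 2.
# Only a few of the 45 types are listed; with this partial mapping any
# unlisted type would fall into "Other", so a real mapping must be complete.
GROUP_OF = {
    "Nonribosomal peptide": "NRPS",
    "Terpene": "Terpene",
    "Type I polyketide": "Polyketide",
    "Type III polyketide": "Polyketide",
    "Lanthipeptide": "RiPP",
    "Lasso peptide": "RiPP",
    "Siderophore": "Other",
}


def condense(type_counts):
    """Sum per-type cluster counts into the major groups."""
    totals = Counter()
    for bgc_type, count in type_counts.items():
        totals[GROUP_OF.get(bgc_type, "Other")] += count
    return dict(totals)


# Version 2 counts for the listed types, taken from Table 2.
v2_counts = {
    "Nonribosomal peptide": 7893,
    "Terpene": 5018,
    "Type I polyketide": 3302,
    "Type III polyketide": 1141,
    "Lanthipeptide": 1121,
    "Lasso peptide": 562,
    "Siderophore": 1745,
}
grouped = condense(v2_counts)
print(grouped)
```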
Figure 1.

Statistical summary of the antiSMASH database version 2. (A) A phylogenetic tree constructed from the revised version of the tree of life based on 120 conserved protein markers (16). The original tree was pruned by genome assembly ID using the ETE Toolkit (17) to keep only leaves that belong to genomes of the antiSMASH database version 2. The visualization and customization of the tree were performed with iTOL (18). As a result, 12 219 leaves from the total of 24 776 bacterial genomes are shown in this phylogeny. The colored ring represents the eight most abundant phyla (97.6% of the genomes), and the bar plots in the outer ring show the number of BGCs per genome. (B) Boxplots of the BGC counts per phylum, with the values on top showing the total number of complete genomes per phylum. (C) Pie charts of the five major BGC classes per phylum, showing the diversity of natural products produced by each group of bacteria.


CONCLUSIONS

Genome mining is a valuable method to assess the biosynthetic potential of microorganisms. Since 2011, antiSMASH has assisted researchers with their secondary metabolite genome mining projects. The public web service has processed ∼400 000 jobs, and the standalone tool has been downloaded over 10 000 times. The antiSMASH database gives instant access to antiSMASH results for many publicly available genomes, removing the need to wait several hours for a de novo antiSMASH run, and allows advanced cross-genome searches for BGCs with specific features of interest. In comparison to version 1, the updated version 2 of the antiSMASH database provides antiSMASH 4.2.1 annotations for 6,200 full genomes, an increase of 58%, and newly introduces data for 18,576 draft genomes. The graphical query builder allows researchers to interactively formulate searches to answer cross-genome research questions, while the results are presented in the familiar antiSMASH output format.

DATA AVAILABILITY

The antiSMASH database is available at https://antismash-db.secondarymetabolites.org/. There are no access restrictions for academic or commercial use of the web server. The source code components and SQL schema for the antiSMASH database are available on GitHub (https://github.com/antismash) under an OSI-approved Open Source license.
REFERENCES

Review 1.  Natural Products as Sources of New Drugs from 1981 to 2014.

Authors:  David J Newman; Gordon M Cragg
Journal:  J Nat Prod       Date:  2016-02-07       Impact factor: 4.050

Review 2.  In silico tools for the analysis of antibiotic biosynthetic pathways.

Authors:  Tilmann Weber
Journal:  Int J Med Microbiol       Date:  2014-02-19       Impact factor: 3.473

3.  Minimum Information about a Biosynthetic Gene cluster.

Authors:  Marnix H Medema; Renzo Kottmann; Pelin Yilmaz; Matthew Cummings; John B Biggins; Kai Blin; Irene de Bruijn; Yit Heng Chooi; Jan Claesen; R Cameron Coates; Pablo Cruz-Morales; Srikanth Duddela; Stephanie Düsterhus; Daniel J Edwards; David P Fewer; Neha Garg; Christoph Geiger; Juan Pablo Gomez-Escribano; Anja Greule; Michalis Hadjithomas; Anthony S Haines; Eric J N Helfrich; Matthew L Hillwig; Keishi Ishida; Adam C Jones; Carla S Jones; Katrin Jungmann; Carsten Kegler; Hyun Uk Kim; Peter Kötter; Daniel Krug; Joleen Masschelein; Alexey V Melnik; Simone M Mantovani; Emily A Monroe; Marcus Moore; Nathan Moss; Hans-Wilhelm Nützmann; Guohui Pan; Amrita Pati; Daniel Petras; F Jerry Reen; Federico Rosconi; Zhe Rui; Zhenhua Tian; Nicholas J Tobias; Yuta Tsunematsu; Philipp Wiemann; Elizabeth Wyckoff; Xiaohui Yan; Grace Yim; Fengan Yu; Yunchang Xie; Bertrand Aigle; Alexander K Apel; Carl J Balibar; Emily P Balskus; Francisco Barona-Gómez; Andreas Bechthold; Helge B Bode; Rainer Borriss; Sean F Brady; Axel A Brakhage; Patrick Caffrey; Yi-Qiang Cheng; Jon Clardy; Russell J Cox; René De Mot; Stefano Donadio; Mohamed S Donia; Wilfred A van der Donk; Pieter C Dorrestein; Sean Doyle; Arnold J M Driessen; Monika Ehling-Schulz; Karl-Dieter Entian; Michael A Fischbach; Lena Gerwick; William H Gerwick; Harald Gross; Bertolt Gust; Christian Hertweck; Monica Höfte; Susan E Jensen; Jianhua Ju; Leonard Katz; Leonard Kaysser; Jonathan L Klassen; Nancy P Keller; Jan Kormanec; Oscar P Kuipers; Tomohisa Kuzuyama; Nikos C Kyrpides; Hyung-Jin Kwon; Sylvie Lautru; Rob Lavigne; Chia Y Lee; Bai Linquan; Xinyu Liu; Wen Liu; Andriy Luzhetskyy; Taifo Mahmud; Yvonne Mast; Carmen Méndez; Mikko Metsä-Ketelä; Jason Micklefield; Douglas A Mitchell; Bradley S Moore; Leonilde M Moreira; Rolf Müller; Brett A Neilan; Markus Nett; Jens Nielsen; Fergal O'Gara; Hideaki Oikawa; Anne Osbourn; Marcia S Osburne; Bohdan Ostash; Shelley M Payne; Jean-Luc Pernodet; Miroslav Petricek; Jörn Piel; Olivier Ploux; 
Jos M Raaijmakers; José A Salas; Esther K Schmitt; Barry Scott; Ryan F Seipke; Ben Shen; David H Sherman; Kaarina Sivonen; Michael J Smanski; Margherita Sosio; Evi Stegmann; Roderich D Süssmuth; Kapil Tahlan; Christopher M Thomas; Yi Tang; Andrew W Truman; Muriel Viaud; Jonathan D Walton; Christopher T Walsh; Tilmann Weber; Gilles P van Wezel; Barrie Wilkinson; Joanne M Willey; Wolfgang Wohlleben; Gerard D Wright; Nadine Ziemert; Changsheng Zhang; Sergey B Zotchev; Rainer Breitling; Eriko Takano; Frank Oliver Glöckner
Journal:  Nat Chem Biol       Date:  2015-09       Impact factor: 15.040

4.  Computational approaches to natural product discovery.

Authors:  Marnix H Medema; Michael A Fischbach
Journal:  Nat Chem Biol       Date:  2015-09       Impact factor: 15.040

5.  ETE: a python Environment for Tree Exploration.

Authors:  Jaime Huerta-Cepas; Joaquín Dopazo; Toni Gabaldón
Journal:  BMC Bioinformatics       Date:  2010-01-13       Impact factor: 3.169

6.  antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences.

Authors:  Marnix H Medema; Kai Blin; Peter Cimermancic; Victor de Jager; Piotr Zakrzewski; Michael A Fischbach; Tilmann Weber; Eriko Takano; Rainer Breitling
Journal:  Nucleic Acids Res       Date:  2011-06-14       Impact factor: 16.971

7.  antiSMASH 3.0-a comprehensive resource for the genome mining of biosynthetic gene clusters.

Authors:  Tilmann Weber; Kai Blin; Srikanth Duddela; Daniel Krug; Hyun Uk Kim; Robert Bruccoleri; Sang Yup Lee; Michael A Fischbach; Rolf Müller; Wolfgang Wohlleben; Rainer Breitling; Eriko Takano; Marnix H Medema
Journal:  Nucleic Acids Res       Date:  2015-05-06       Impact factor: 16.971

8.  Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees.

Authors:  Ivica Letunic; Peer Bork
Journal:  Nucleic Acids Res       Date:  2016-04-19       Impact factor: 16.971

9.  antiSMASH 2.0--a versatile platform for genome mining of secondary metabolite producers.

Authors:  Kai Blin; Marnix H Medema; Daniyal Kazempour; Michael A Fischbach; Rainer Breitling; Eriko Takano; Tilmann Weber
Journal:  Nucleic Acids Res       Date:  2013-06-03       Impact factor: 16.971

10.  Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation.

Authors:  Nuala A O'Leary; Mathew W Wright; J Rodney Brister; Stacy Ciufo; Diana Haddad; Rich McVeigh; Bhanu Rajput; Barbara Robbertse; Brian Smith-White; Danso Ako-Adjei; Alexander Astashyn; Azat Badretdin; Yiming Bao; Olga Blinkova; Vyacheslav Brover; Vyacheslav Chetvernin; Jinna Choi; Eric Cox; Olga Ermolaeva; Catherine M Farrell; Tamara Goldfarb; Tripti Gupta; Daniel Haft; Eneida Hatcher; Wratko Hlavina; Vinita S Joardar; Vamsi K Kodali; Wenjun Li; Donna Maglott; Patrick Masterson; Kelly M McGarvey; Michael R Murphy; Kathleen O'Neill; Shashikant Pujar; Sanjida H Rangwala; Daniel Rausch; Lillian D Riddick; Conrad Schoch; Andrei Shkeda; Susan S Storz; Hanzhen Sun; Francoise Thibaud-Nissen; Igor Tolstoy; Raymond E Tully; Anjana R Vatsan; Craig Wallin; David Webb; Wendy Wu; Melissa J Landrum; Avi Kimchi; Tatiana Tatusova; Michael DiCuccio; Paul Kitts; Terence D Murphy; Kim D Pruitt
Journal:  Nucleic Acids Res       Date:  2015-11-08       Impact factor: 16.971

