ClusterMine360: a database of microbial PKS/NRPS biosynthesis.

Abstract

ClusterMine360 (http://www.clustermine360.ca/) is a database of microbial polyketide and non-ribosomal peptide gene clusters. It takes advantage of crowd-sourcing by allowing members of the community to make contributions while automation is used to help achieve high data consistency and quality. The database currently has >200 gene clusters from >185 compound families. It also features a unique sequence repository containing >10 000 polyketide synthase/non-ribosomal peptide synthetase domains. The sequences are filterable and downloadable as individual or multiple sequence FASTA files. We are confident that this database will be a useful resource for members of the polyketide synthases/non-ribosomal peptide synthetases research community, enabling them to keep up with the growing number of sequenced gene clusters and rapidly mine these clusters for functional information.

Year: 2012 PMID： 23104377 PMCID： PMC3531105 DOI： 10.1093/nar/gks993

Source DB: PubMed Journal: Nucleic Acids Res ISSN： 0305-1048 Impact factor: 16.971

INTRODUCTION

The amount of information on microbial secondary metabolite biosynthesis has been growing explosively. Gene clusters responsible for the biosynthesis of polyketides and non-ribosomal peptides, identified by the presence of polyketide synthases (PKS) or non-ribosomal peptide synthetases (NRPS) encoding genes, have received significant attention, resulting in the sequencing of hundreds of gene clusters. With the power, speed and low cost of next-generation sequencing methods, this number is expected to rapidly increase by at least an order of magnitude in the next few years. To take advantage of this wealth of data, it needs to be easily accessible and discoverable. Although the sequences themselves are available in National Center for Biotechnology Information (NCBI) databases (1,2), they are frequently difficult to locate, partially because of the large amounts of information that these databases host. There is no standardized annotation for these biosynthetic gene clusters. For example, some are tagged with PKS and/or NRPS, such as the cycloheximide (accession number JX014302; Shen,B. and Yin,M., unpublished data) and streptothricin (accession number AB684619; Maruyama,C., Toyoda,J., Kato,Y., Izumikawa,M., Takagi,M., Shinya,K., Katano,H., Utagawa,T. and Hamano,Y., unpublished data) gene clusters, whereas others are tagged with the term polyketide synthase or non-ribosomal peptide synthetase, such as laidlomycin (accession number JQ793783; Hwang,J.Y., Kim,H.S., Sedai,B. and Nam,D.H., unpublished data) and collismycin A (accession number HE575208) (3). With the rapid growth in bacterial genome sequencing, many new clusters are located within much larger genome sequence files and are occasionally unannotated, such as the antibiotic TA/myxovirescin biosynthetic gene cluster in the Myxococcus xanthus genome (accession number CP000113.1) (4). 
These problems are compounded by the fact that gene cluster discovery is being undertaken by researchers from diverse fields of expertise, including chemistry, biochemistry, microbiology, biotechnology and drug discovery, all with differing standards for gene cluster annotation. Thus, it is no surprise that, given these issues, it can be extremely challenging, time consuming and often frustrating to find the appropriate gene clusters in the NCBI databases. To accelerate research and leverage existing data in PKS/NRPS biosynthesis, a focused and comprehensive database that gathers this gene cluster information together is required (5). Although there are some existing databases that provide important resources on PKS/NRPS gene clusters and/or their products (6–9), none has the features necessary to enable the community to maximize the benefit from sequence data. In particular, we have identified two key features that are required for the community. The first is a comprehensive, up-to-date database. Because of the rapid emergence of new gene clusters across a broad range of disciplines, a resource that can be easily updated by any and all community members is required to ensure that the database is comprehensive and current. The second is the ability to generate multiple sequence FASTA files for individual catalytic domains found in PKS and NRPS biosynthesis, because the difficulty in accessing multiple diverse gene clusters has limited the ability of researchers to carry out comprehensive phylogenetic and functional analysis. In evaluating the existing PKS/NRPS databases, we found that some of them, such as NRPS-PKS (6) and MAPSI (7), have not been updated in recent years. Others, such as NORINE (8), focus on the products of the cluster and do not contain information on the gene cluster itself. 
DoBISCUIT (http://www.bio.nite.go.jp/pks/) is a new and promising database, but currently has limited amounts of data, whereas PKMiner (9) is limited to type II PKS clusters. Curated databases, such as those mentioned previously, can offer high levels of data quality, but they are not always actively updated, as few institutions or research groups have the resources to maintain ongoing manual curation. Additionally, there can be long lag times between the discovery of a new gene cluster and its inclusion in a traditionally curated database. Newly discovered clusters are often excluded from these databases, as they do not meet curation criteria. For example, they may lack a characterized product, as is seen for a large number of cryptic or silent gene clusters from whole genome sequencing efforts (10). The result of this is a bias towards a limited number of well-known archetypical clusters, such as the erythromycin (accession number AY623658) (11) and tyrocidine (accession number AF004835) (12) gene clusters. This is a particularly important concern for researchers attempting to assign function to new gene clusters and those involved in bioprospecting, as they need access to the breadth and diversity of sequenced clusters and not simply the well-known prototypical textbook clusters. The best way to address these issues, which are limiting the research ability of the community, is to build a dynamic resource that allows users to make contributions and minimizes the amount of time-consuming manual curation by database administrators, but maintains the high standard of curated data quality. New data, especially from bacterial genome sequencing, are being generated at an extraordinarily rapid rate (5). To keep up with this influx of data, while at the same time minimizing the amount of inefficient data entry, we chose to develop a server-based workflow engine to assist in curation of gene cluster data. 
Additionally, we have adopted a community-based approach for the collection of data for this database. Researchers can sign up for a free account, allowing them to add to or update the database. This crowd-sourcing allows participation by those who are most interested in using the data, ensuring broad coverage of the data across diverse fields, while decreasing the need for a dedicated full-time curator. Community-based curation has some unique challenges. In particular, it can be difficult to ensure high levels of data quality (13,14). To address this issue, we have limited the input from the users, such that only a few key details need be provided with the bulk of the data collection and analysis being performed in an automated fashion using known databases, such as the NCBI databases, and analysis tools, including antiSMASH (antibiotics and secondary metabolite analysis shell) (15). The use of automation means the database can ‘auto-curate’ itself, reducing the amount of administrative burden and enabling the database to grow dynamically through community contributions.

DATABASE ORGANIZATION

The microbial PKS/NRPS database, ClusterMine360 (http://www.clustermine360.ca/), is organized around two key elements, the compound family and the gene cluster (see Figure 1). A compound family is a grouping of compounds that have the same core structure. This term is used, as most gene clusters produce more than one compound, although they tend to be highly related. For example, the epothilone biosynthetic pathway produces four highly related polyketides, epothilones A–D, which differ by the presence or absence of a methyl group and an epoxide moiety (17,18). Thus, by organizing by compound family, we are able to capture the chemical diversity generated by a single biosynthetic gene cluster without duplicating data in the database. The ‘Compound Families’ page of the website has a listing of all of the families along with an image of the structure of a representative member of the family (if available).

Figure 1.

Organization of ClusterMine360. The compound family and cluster represent the two major organization units of the database. Additional data fields connect to either the compound family or cluster. The organization of the fredericamycin gene cluster is shown in the cluster pane (16).

As many natural products are known by more than one name, synonyms for each compound family can be added. This is essential to limit duplicate entries. For example, the polyketide pimaricin is also widely known as natamycin. Before adding a new compound family, the list of existing names is checked to ensure it has not already been added. If the compound family has already been added under another name, the user is notified and is given the primary name for that family in the database. Additionally, the database queries ChemSpider to identify synonyms for each compound family and adds these to the compound family’s details page, ensuring a comprehensive set of synonyms for each compound family. Because many compounds can be highly related, yet clearly not from the same compound family, each compound family can be linked to related families. For example, erythromycin, megalomycin and oleandomycin all share the same polyketide core, but differ in the sugar residues attached to the core. These are clearly highly related compounds; thus, they are linked together as related families. Identification of related families is highly subjective. Although it is possible to evaluate similarity between structures using mathematical coefficients, such as the Tanimoto similarity or Euclidean distance (19), no weighting scheme was available that captured the subjective relatedness of, for example, erythromycin, megalomycin and oleandomycin, without also including, for example, methymycin, narbomycin, pikromycin or lankamycin. Compound families can also be related by similarities in the clusters that produce them. 
As part of its analysis, antiSMASH searches for similar clusters, and the results are then used to automatically link the compound families. Links to related compound families are shown on the compound family’s details page, enabling users to easily access data for related compounds. To capture some of the broader relatedness between compound families, each family is associated with one or more overall biosynthetic pathway types, such as PKS type I, type II, type III or NRPS. Clusters with PKS and NRPS domains are identified as hybrid pathways. This enables the compound families to be rapidly sorted by broad structural relatedness. The second major organization unit of the database is the gene cluster. Multiple clusters can be associated with a given compound family. For example, epothilone biosynthetic gene clusters have been sequenced from two strains of Sorangium cellulosum (20,21), and erythromycin gene clusters have been sequenced from Saccharopolyspora erythraea (22) and Aeromicrobium erythreum (11). Each cluster is associated with an NCBI nucleotide record. The NCBI record is used as the source for the lineage of the producing organism, including the phylum, genus and species. Links to primary literature references for the sequencing data are also retrieved from the NCBI record and displayed on the cluster’s details page. Linked to each gene cluster is the annotation data for each gene in the cluster and each domain found in the PKS and NRPS encoding genes. These data are generated through antiSMASH analysis of each gene cluster (15). The domain sequences, extracted from the antiSMASH results, are also available from the gene cluster’s details page.
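The duplicate check on compound family names and the Tanimoto coefficient mentioned above can be illustrated with a minimal sketch. It is not the database's actual implementation: the `families` layout, the function names and the representation of a structure as a set of fingerprint bits are all assumptions made for illustration.

```python
def normalize(name: str) -> str:
    """Case- and whitespace-insensitive key for a compound name."""
    return " ".join(name.lower().split())


def find_family(name: str, families: dict):
    """Return the primary name of the family that already lists `name`, else None.

    `families` maps a primary name to a list of synonyms (hypothetical layout).
    """
    key = normalize(name)
    for primary, synonyms in families.items():
        known = {normalize(primary)} | {normalize(s) for s in synonyms}
        if key in known:
            return primary
    return None


def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bits."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty fingerprints are identical
    return len(a & b) / len(a | b)


# Pimaricin/natamycin is the synonym pair given in the text.
families = {"pimaricin": ["natamycin"]}
print(find_family("Natamycin", families))     # matches the existing family
print(find_family("erythromycin", families))  # no match, so it can be added
print(tanimoto({1, 2, 3, 4}, {3, 4, 5, 6}))   # 2 shared bits out of 6 distinct
```

A single cut-off on such a coefficient would have to separate erythromycin-like families from other macrolides, which is the weighting problem the text describes; the sketch only shows how the coefficient itself is computed.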

AUTOMATION

Ensuring high data quality is time consuming, and it makes database upkeep difficult. One of the most important requirements for the database was to integrate automation to make curation as easy as possible. As most of the data are populated automatically, external users are able to contribute without much risk to data quality. This semi-automatic curation also means that large amounts of data can be added to the database in a relatively short amount of time. The following steps occur once a cluster is added (see Figure 2). First, the NCBI nucleotide database is queried to retrieve important information about the sequence, such as its description, the name and lineage of the organism it was isolated from and any sequencing references that are associated with the record. Once this information has been retrieved, the cluster is submitted to antiSMASH for analysis. The database automatically tracks the progress of the antiSMASH submission and proceeds to download the results when completed. The results are then parsed to retrieve information, such as the pathway types for that cluster, which is used to ensure that the pathway types of the linked compound family are correct. Finally, if antiSMASH has identified any PKS/NRPS domains, the amino acid sequence of those domains will be stored in the database’s sequence repository along with key information, such as domain substrate specificity, stereochemistry and activity of the domain, as applicable. In addition, when a compound family is added, it is searched against the PubChem (23,24) database to retrieve Medical Subject Heading (MeSH) pharmacological identifiers that classify the compound’s bioactivity. Simplified molecular-input line-entry system (SMILES) strings are also retrieved enabling users to search the database by substructure. The typical time to complete these processes ranges from a few minutes to a few hours depending on server load.
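The order of the automated steps above can be summarized in a short sketch. It is only an outline under stated assumptions: `ClusterRecord`, `curate`, the status strings and the two injected callables are hypothetical names, and the real NCBI query and antiSMASH submission are replaced by stand-in functions rather than actual service calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ClusterRecord:
    accession: str
    status: str = "added"
    metadata: Dict[str, str] = field(default_factory=dict)
    pathway_types: List[str] = field(default_factory=list)
    domains: List[dict] = field(default_factory=list)


def curate(record: ClusterRecord,
           fetch_ncbi: Callable[[str], Dict[str, str]],
           run_antismash: Callable[[str], dict]) -> ClusterRecord:
    """Run the automated steps in order, recording progress in `status`."""
    # Step 1: description, organism lineage and references from the nucleotide record.
    record.metadata = fetch_ncbi(record.accession)
    record.status = "ncbi_retrieved"

    # Step 2: submit the cluster for analysis; waiting for completion is
    # hidden inside the injected callable in this sketch.
    results = run_antismash(record.accession)
    record.status = "analysed"

    # Step 3: parse the pathway types used to check the linked compound family.
    record.pathway_types = sorted(results.get("pathway_types", []))

    # Step 4: keep any detected PKS/NRPS domains for the sequence repository.
    record.domains = list(results.get("domains", []))
    record.status = "complete"
    return record


# Stand-in functions returning placeholder values, not real service output.
def fake_ncbi(accession: str) -> Dict[str, str]:
    return {"description": "placeholder description", "organism": "placeholder organism"}


def fake_antismash(accession: str) -> dict:
    return {"pathway_types": ["PKS type I"],
            "domains": [{"type": "KR", "sequence": "ACDEFG"}]}


done = curate(ClusterRecord("PLACEHOLDER_ACCESSION"), fake_ncbi, fake_antismash)
print(done.status, done.pathway_types, len(done.domains))
```

Passing the external services in as callables keeps the ordering logic testable without network access, which is one reasonable way to structure such a workflow.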

Figure 2.

ClusterMine360 has automated many of the steps required for curating the database. Automated curation is essential to enable crowd-sourcing without sacrificing data quality.

In addition to the automated processes above, we also incorporated some other features that make it particularly easy for users to add data. When a compound family is added to the database, a wizard guides the user through the process of entering information on pathway types, synonyms and related families, and helps the user generate an image of the structure of the compound. To make it easy to associate an image, the ChemSpider database (http://www.chemspider.com) is queried to retrieve images that match the compound family name. Alternatively, an image can be generated from a user-supplied SMILES string. Similarly, when adding synonyms, potential synonyms are returned from ChemSpider and the user can easily select those that are applicable.

antiSMASH

antiSMASH is the bioinformatics tool we use to analyse clusters. antiSMASH can scan a cluster’s sequence and determine the most likely pathway type for that cluster. For type I PKS clusters, it also attempts to predict whether the cluster is modular, iterative or has trans-acting acyltransferases (ATs). It is also able to make predictions for individual domains. It endeavours to determine the substrate specificity for AT and adenylation domains. For ketoreductase (KR) domains, it assesses whether each domain is active or inactive, and the probable stereochemistry of the product. More details can be found in (15). To ensure the standardization of the large amounts of data in the database and to minimize manual curation, the results retrieved from antiSMASH by ClusterMine360 cannot be edited by individual users to include new biochemistry. However, as newly characterized PKS/NRPS domains are added to antiSMASH, the clusters in the database can be easily re-analysed to take advantage of the improved analytics.

USER CONTRIBUTIONS

User contributions to the database are encouraged and acknowledged. To contribute to the database, users must register for a free account using a simple registration form. The name of the contributor, the name of their research group and a link to their webpage are displayed on records that they have added to the database.

PRESENT CONTENT

Currently, the database has >185 unique compound families, >200 clusters with known products and >300 clusters with no known products (silent or cryptic gene clusters). The sequence repository has 10 000+ PKS/NRPS domains from >500 clusters available for download, including 1300+ acyl carrier proteins (ACPs), 1000+ ATs, 1000+ KRs, 1300+ ketosynthases (KSs), 250+ thioesterases (TEs), along with sequences from less common domains, such as heterocyclization and epimerization domains.

SEQUENCE REPOSITORY

One of the most distinctive aspects of this database is its sequence repository. The repository contains a large number of diverse PKS/NRPS domains extracted from the antiSMASH analysis of the clusters contained in the database. We have also included the ability to scan any NCBI nucleotide record and have the detected PKS/NRPS domains included in this repository. We believe that this repository will become an invaluable tool to those involved in identifying sequence homologies and bioprospecting. The sequences can be downloaded individually in FASTA format. Alternatively, all of the domains in a given cluster can be downloaded at once in a zip file. We have also included the ability to filter the domains based on a variety of criteria, following which they can be downloaded in a multi-sequence FASTA file. Importantly, each sequence’s header carries detailed information, such as accession number, producing organism, gene identifier, pathway type, domain type and any predicted properties of that domain. We have also included an option to output shortened headers for use with bioinformatics tools that have restrictions on the number of characters in the header.
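The filter-then-download behaviour described above can be sketched as follows. This is an illustration only: the `Domain` fields mirror the header contents listed in the text, but the `|`-separated header layout, the shortened header form and all sample values are assumptions, not the format the database actually emits.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Domain:
    accession: str
    organism: str
    gene: str
    pathway_type: str
    domain_type: str
    prediction: str
    sequence: str


def header(d: Domain, short: bool = False, index: int = 0) -> str:
    """Build a FASTA header; the short form suits tools with length limits."""
    if short:
        return f">{d.domain_type}_{index}"
    fields = [d.accession, d.organism, d.gene,
              d.pathway_type, d.domain_type, d.prediction]
    return ">" + "|".join(fields)


def to_fasta(domains: Iterable[Domain], short: bool = False, width: int = 60) -> str:
    """Write the domains as one multi-sequence FASTA string."""
    lines: List[str] = []
    for i, d in enumerate(domains, start=1):
        lines.append(header(d, short, i))
        # Wrap the sequence at a fixed width, as is conventional for FASTA.
        lines.extend(d.sequence[j:j + width] for j in range(0, len(d.sequence), width))
    return "\n".join(lines) + "\n" if lines else ""


# Placeholder records, not real repository entries.
repository = [
    Domain("ACC1", "Organism one", "geneA", "PKS type I", "KR", "active", "ACDEFG"),
    Domain("ACC2", "Organism two", "geneB", "NRPS", "A", "Cys", "HIKLMN"),
]
kr_only = [d for d in repository if d.domain_type == "KR"]
print(to_fasta(kr_only))
print(to_fasta(kr_only, short=True))
```

Keeping the filter separate from the writer means any combination of criteria (pathway type, domain type, predicted property) can feed the same export step.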

ClusterMine360: A POWERFUL TOOL FOR PHYLOGENETIC ANALYSIS

To demonstrate the utility of the ClusterMine360 database, NRPS heterocyclization domains were selected and used for cluster analysis. Heterocyclization domains play a key role in NRPS biosynthesis, coupling acyl and peptidyl groups onto Cys, Ser and Thr residues followed by cyclization of the associated side-chain to generate thiazole and oxazole rings (25–27). This occurs during the biosynthesis of non-ribosomal peptides, such as the antibiotic bacitracin, and mixed non-ribosomal peptide/polyketides, such as the antimitotic agents epothilone and rhizoxin. A FASTA file of 106 heterocyclization domains was downloaded and aligned using Multiple Sequence Comparison by Log-Expectation (MUSCLE) (28). A phylogenetic tree was generated from the resulting alignment using the PhyML maximum likelihood method with the Whelan and Goldman (WAG) model of amino acid substitution and nearest neighbour interchange for the tree topology search (29). The tree shows that heterocyclization domains clustered by function, based on whether the domain used enzyme-bound Cys, Ser or Thr as its substrate (Figure 3). To evaluate which residue each heterocyclization domain used, the ‘detail of cluster’ function in the sequence repository was examined to identify the specificity of the adenylation domain associated with the heterocyclization domain. Based on this analysis, the Cys-, Ser- and Thr-specific heterocyclization domains form separate groups in the tree. This analysis shows that with ClusterMine360, it is possible to rapidly develop phylogenetic tools to predict the function of an individual domain.
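For readers who want to attempt a similar analysis from the command line, the sketch below assembles one possible pair of commands. It is not a record of how the analysis above was run: the file names are hypothetical, the MUSCLE options shown are those of the 3.x series (later releases use different options), and the PhyML options should be checked against the installed version before use.

```python
import shlex
from typing import List


def muscle_command(fasta_in: str, alignment_out: str) -> List[str]:
    """MUSCLE 3.x style call writing an interleaved PHYLIP alignment."""
    return ["muscle", "-in", fasta_in, "-out", alignment_out, "-phyi"]


def phyml_command(alignment: str) -> List[str]:
    """PhyML call: amino acid data, WAG substitution model, NNI topology search."""
    return ["phyml", "-i", alignment, "-d", "aa", "-m", "WAG", "-s", "NNI"]


# Hypothetical file names for a downloaded multi-sequence FASTA file.
steps = [
    muscle_command("heterocyclization_domains.fasta", "heterocyclization_domains.phy"),
    phyml_command("heterocyclization_domains.phy"),
]
for step in steps:
    # Printed for inspection; pass each list to subprocess.run(step, check=True)
    # once the tools are installed and the options have been verified.
    print(shlex.join(step))
```

PHYLIP-style alignments may truncate long sequence names, which is one situation where the shortened header option of the sequence repository is useful.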

Figure 3.

A rooted phylogenetic tree of heterocyclization domains from NRPS gene clusters shows that heterocyclization domains cluster by function. ClusterMine360 provides a rapid and powerful tool for generating and analysing phylogenetic trees of PKS and NRPS domains.

CONCLUSION

ClusterMine360 (http://www.clustermine360.ca/) is a unique database of microbial PKS/NRPS clusters. It contains >200 clusters from >185 compound families, and it features a unique sequence repository containing >10 000 PKS/NRPS domains. By leveraging automation and crowd-sourcing, we believe that this database will grow dynamically through contributions from interested parties as new clusters are discovered and sequenced. We are confident that this database will be a useful resource for members of the PKS/NRPS research community, enabling them to keep up with the growing number of sequenced gene clusters, and rapidly mine these clusters for functional information.

FUNDING

The Natural Sciences and Engineering Research Council of Canada (NSERC); Ontario Ministry of Research and Innovation; University of Ottawa. Funding for open access charge: University of Ottawa and NSERC.

Conflict of interest statement. None declared.

27 in total

1. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0.

Authors: Stéphane Guindon; Jean-François Dufayard; Vincent Lefort; Maria Anisimova; Wim Hordijk; Olivier Gascuel
Journal: Syst Biol Date: 2010-03-29 Impact factor: 15.683

2. Elucidating the biosynthetic pathway for the polyketide-nonribosomal peptide collismycin A: mechanism for formation of the 2,2'-bipyridyl ring.

Authors: Ignacio Garcia; Natalia M Vior; Alfredo F Braña; Javier González-Sabin; Jürgen Rohr; Francisco Moris; Carmen Méndez; José A Salas
Journal: Chem Biol Date: 2012-03-23

3. New natural epothilones from Sorangium cellulosum, strains So ce90/B2 and So ce90/D13: isolation, structure elucidation, and SAR studies.

Authors: I H Hardt; H Steinmetz; K Gerth; F Sasse; H Reichenbach; G Höfle
Journal: J Nat Prod Date: 2001-07 Impact factor: 4.050

4. Epothilone biosynthesis: assembly of the methylthiazolylcarboxy starter unit on the EpoB subunit.

Authors: H Chen; S O'Connor; D E Cane; C T Walsh
Journal: Chem Biol Date: 2001-09

5. MapsiDB: an integrated web database for type I polyketide synthases.

Authors: Hongseok Tae; Jae Kyung Sohng; Kiejung Park
Journal: Bioprocess Biosyst Eng Date: 2009-02-11 Impact factor: 3.210

6. PKMiner: a database for exploring type II polyketide synthases.

Authors: Jinki Kim; Gwan-Su Yi
Journal: BMC Microbiol Date: 2012-08-08 Impact factor: 3.605

7. antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences.

Authors: Marnix H Medema; Kai Blin; Peter Cimermancic; Victor de Jager; Piotr Zakrzewski; Michael A Fischbach; Tilmann Weber; Eriko Takano; Rainer Breitling
Journal: Nucleic Acids Res Date: 2011-06-14 Impact factor: 16.971

8. NCBI Reference Sequences (RefSeq): current status, new features and genome annotation policy.

Authors: Kim D Pruitt; Tatiana Tatusova; Garth R Brown; Donna R Maglott
Journal: Nucleic Acids Res Date: 2011-11-24 Impact factor: 16.971

9. GenBank.

Authors: Dennis A Benson; Ilene Karsch-Mizrachi; Karen Clark; David J Lipman; James Ostell; Eric W Sayers
Journal: Nucleic Acids Res Date: 2011-12-05 Impact factor: 16.971

Review 10. Industrial methodology for process verification in research (IMPROVER): toward systems biology verification.

Authors: Pablo Meyer; Julia Hoeng; J Jeremy Rice; Raquel Norel; Jörg Sprengel; Katrin Stolle; Thomas Bonk; Stephanie Corthesy; Ajay Royyuru; Manuel C Peitsch; Gustavo Stolovitzky
Journal: Bioinformatics Date: 2012-03-14 Impact factor: 6.937
