Jeffrey A van Santen, Grégoire Jacob, Amrit Leen Singh, Victor Aniebok, Marcy J Balunas, Derek Bunsko, Fausto Carnevale Neto, Laia Castaño-Espriu, Chen Chang, Trevor N Clark, Jessica L Cleary Little, David A Delgadillo, Pieter C Dorrestein, Katherine R Duncan, Joseph M Egan, Melissa M Galey, F P Jake Haeckl, Alex Hua, Alison H Hughes, Dasha Iskakova, Aswad Khadilkar, Jung-Ho Lee, Sanghoon Lee, Nicole LeGrow, Dennis Y Liu, Jocelyn M Macho, Catherine S McCaughey, Marnix H Medema, Ram P Neupane, Timothy J O'Donnell, Jasmine S Paula, Laura M Sanchez, Anam F Shaikh, Sylvia Soldatou, Barbara R Terlouw, Tuan Anh Tran, Mercia Valentine, Justin J J van der Hooft, Duy A Vo, Mingxun Wang, Darryl Wilson, Katherine E Zink, Roger G Linington.
Abstract
Despite rapid evolution in the area of microbial natural products chemistry, there is currently no open access database containing all microbially produced natural product structures. Lack of availability of these data is preventing the implementation of new technologies in natural products science. Specifically, development of new computational strategies for compound characterization and identification is being hampered by the lack of a comprehensive database of known compounds against which to compare experimental data. The creation of an open access, community-maintained database of microbial natural product structures would enable the development of new technologies in natural products discovery and improve the interoperability of existing natural products data resources. However, these data are spread unevenly throughout the historical scientific literature, including both journal articles and international patents. These documents have no standard format, are often not digitized as machine-readable text, and are not publicly available. Further, none of these documents have associated structure files (e.g., MOL, InChI, or SMILES), instead containing images of structures. This makes extraction and formatting of relevant natural products data a formidable challenge. Using a combination of manual curation and automated data mining approaches, we have created a database of microbial natural products (The Natural Products Atlas, www.npatlas.org) that includes 24 594 compounds and contains referenced data for structure, compound names, source organisms, isolation references, total syntheses, and instances of structural reassignment. This database is accompanied by an interactive web portal that permits searching by structure, substructure, and physical properties. The Web site also provides mechanisms for visualizing natural products chemical space and dashboards for displaying author and discovery timeline data.
These interactive tools offer a powerful knowledge base for natural products discovery with a central interface for structure and property-based searching and present new viewpoints on structural diversity in natural products. The Natural Products Atlas has been developed under FAIR principles (Findable, Accessible, Interoperable, and Reusable) and is integrated with other emerging natural product databases, including the Minimum Information About a Biosynthetic Gene Cluster (MIBiG) repository and the Global Natural Products Social Molecular Networking (GNPS) platform. It is designed as a community-supported resource to provide a central repository for known natural product structures from microorganisms and is the first comprehensive, open access resource of this type. It is expected that the Natural Products Atlas will enable the development of new natural products discovery modalities and accelerate the process of structural characterization for complex natural products libraries.
Year: 2019 PMID: 31807684 PMCID: PMC6891855 DOI: 10.1021/acscentsci.9b00806
Source DB: PubMed Journal: ACS Cent Sci ISSN: 2374-7943 Impact factor: 14.553
Examples of Existing Natural Product Structure Databases
| database | description | URL | access | number of compounds | last revision | downloadable? |
|---|---|---|---|---|---|---|
| Dictionary of Natural Products | plant, microbial and marine natural products | 1 | commercial | 226 000 | 2019 | no |
| MarinLit | marine-derived natural products, including invertebrates, algae and microorganisms | 2 | commercial | ∼30 000 | July 2019 | no |
| Antibase | microbial natural products | 3 | commercial | 43 700 | May 2017 | no |
| StreptomeDB | natural products derived from bacteria of the genus Streptomyces | 4 | open access | 4040 | Aug. 2015 | yes |
| Supernatural II | chemical structures of primary and secondary metabolites and natural macromolecules | 5 | open access | 325 508 | April 2018 | no |
| NPEdia | natural products from plants and microorganisms, focused on structural features | 6 | open access | 49 610 | Jan. 2014 | no |
| AfroDB | natural products from African medicinal plants | 7 | open access | 1000 | Nov. 2013 | yes |
| NuBBEDB | natural products isolated from Brazil | 8 | open access | 2200 | Aug. 2017 | yes |
1. http://dnp.chemnetbase.com
2. http://pubs.rsc.org/marinlit/
3. https://www.wiley.com/en-us/AntiBase%3A+The+Natural+Compound+Identifier-p-9783527343591
4. http://132.230.56.4/streptomedb2/ (previously: http://www.pharmaceutical-bioinformatics.de/streptomedb2/)
5. http://bioinf-applied.charite.de/supernatural_new/index.php
6. http://www.cbrg.riken.jp/npedia/?LANG=en
7. Downloadable at https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0078085 (ZINC searchable: http://zinc.docking.org/catalogs/afronp/)
8. https://nubbe.iq.unesp.br/portal/nubbe-search.html
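The comparison in the table can be made concrete with a short script. The following is a minimal sketch, not part of the original article: it transcribes the table rows into a hypothetical `Database` record (the class and function names are illustrative assumptions) and selects the resources that are both open access and downloadable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Database:
    """One row of the table (hypothetical record type for illustration)."""
    name: str
    access: str         # "commercial" or "open access", as listed in the table
    compounds: int      # approximate counts (e.g. ~30 000) stored as plain integers
    downloadable: bool


# Rows transcribed from the table of existing natural product databases.
DATABASES = [
    Database("Dictionary of Natural Products", "commercial", 226_000, False),
    Database("MarinLit", "commercial", 30_000, False),
    Database("Antibase", "commercial", 43_700, False),
    Database("StreptomeDB", "open access", 4_040, True),
    Database("Supernatural II", "open access", 325_508, False),
    Database("NPEdia", "open access", 49_610, False),
    Database("AfroDB", "open access", 1_000, True),
    Database("NuBBEDB", "open access", 2_200, True),
]


def open_and_downloadable(databases):
    """Return the databases that are open access and offer a download."""
    return [db for db in databases
            if db.access == "open access" and db.downloadable]


reusable = open_and_downloadable(DATABASES)
reusable_names = [db.name for db in reusable]
reusable_total = sum(db.compounds for db in reusable)
print(reusable_names, reusable_total)
```

Under these assumptions, only StreptomeDB, AfroDB, and NuBBEDB pass the filter, with 7240 listed compounds between them. These resources differ in scope (AfroDB, for example, covers plant metabolites), so the sum is only a rough indication of how little openly downloadable structure data the table lists.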
Figure 1. Workflow for creation and curation of the Natural Products Atlas.
Figure 2. (A) Search interface (basic search). (B) Explore view (compound). (C) Discover view (author).
Figure 3. Four views in the Explore section. (A) Compound view, providing data for individual compounds. (B) Cluster view, illustrating compounds with close structural similarity. (C) Node view, illustrating clusters of compounds that are more distantly related. (D) Global view, presenting distribution of all chemical space in the Natural Products Atlas.
Figure 4. Global view, illustrating positions of all natural products containing a pyridine functional group as a substructure motif.
Figure 5. (A) Cluster view for guanacastepene E (NPAID 2040, red circle) and related compounds. (B) Node view for guanacastepene E, illustrating distribution and connectivities of related clusters (purple hexagons) and example structures from each cluster.
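The cluster view described in the Figure 3 caption groups compounds with close structural similarity. The text here does not state which fingerprint, similarity metric, or threshold the Natural Products Atlas uses, so the following is only a generic sketch of the idea under stated assumptions: each compound is represented by a hypothetical set of integer feature identifiers, similarity is the Tanimoto coefficient, and compounds are grouped by single linkage at an arbitrary threshold.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster(fingerprints: dict, threshold: float) -> list:
    """Single-linkage grouping: link any pair whose similarity >= threshold."""
    parent = {name: name for name in fingerprints}

    def find(x):
        # Follow parent pointers to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(fingerprints, 2):
        if tanimoto(fingerprints[a], fingerprints[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in fingerprints:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical feature sets; real fingerprints would come from a cheminformatics toolkit.
fingerprints = {
    "compound_a": frozenset({1, 2, 3, 4}),
    "compound_b": frozenset({1, 2, 3, 5}),
    "compound_c": frozenset({2, 3, 5, 6}),
    "compound_d": frozenset({7, 8, 9}),
}

clusters = cluster(fingerprints, threshold=0.5)
print(clusters)
```

With these made-up feature sets, `compound_a` and `compound_c` are not directly similar enough to link, but both link to `compound_b`, so single linkage places all three in one cluster and leaves `compound_d` on its own. A looser threshold applied to whole clusters would give a coarser grouping, comparable in spirit to the node view.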