Maria Sorokina, Christoph Steinbeck.
Abstract
Natural products (NPs), often also referred to as secondary metabolites, are small molecules synthesised by living organisms. They are of interest because of their bioactivity and, in this context, as starting points for the development of drugs and other bioactive synthetic products. To select compounds from virtual libraries, Ertl et al. developed a natural product likeness score, which was later published as an open data, open source implementation. Here we present NaPLeS, an easily portable, containerised, open source web application based on open data to compute natural product likeness scores for chemical libraries.
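To make the idea of a natural product likeness score concrete, the sketch below follows the general scheme described by Ertl et al.: each fragment contributes the logarithm of how over-represented it is among natural products relative to synthetic molecules, and the contributions are summed and normalised by molecule size. This is an illustration under stated assumptions, not the NaPLeS implementation: the fragment keys, the counts, the add-one smoothing, and the function names are all hypothetical.

```python
import math

def fragment_contribution(np_count, sm_count, np_total, sm_total, pseudo=1.0):
    """Log-ratio of a fragment's relative frequency in NPs vs. synthetic molecules.

    The add-one smoothing (pseudo) is an assumption made here so that
    fragments seen in only one set do not cause a division by zero.
    """
    ratio = ((np_count + pseudo) / (sm_count + pseudo)) * (sm_total / np_total)
    return math.log(ratio)

def np_likeness(fragments, counts, np_total, sm_total, n_atoms):
    """Sum fragment contributions and normalise by the number of atoms."""
    total = 0.0
    for frag in fragments:
        np_count, sm_count = counts.get(frag, (0, 0))
        total += fragment_contribution(np_count, sm_count, np_total, sm_total)
    return total / n_atoms

# Hypothetical fragment counts: (occurrences in NPs, occurrences in synthetics)
COUNTS = {"fragA": (99, 9), "fragB": (4, 49)}

np_like = np_likeness(["fragA"], COUNTS, 1000, 1000, n_atoms=1)
synthetic_like = np_likeness(["fragB"], COUNTS, 1000, 1000, n_atoms=1)
```

With these toy counts, a molecule built from `fragA` receives a positive score and one built from `fragB` a negative score, which matches the sign convention implied by the figure captions (natural products with an unexpectedly negative score).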
Keywords: Database; Docker container; Natural products; Web application
Year: 2019 PMID: 31399811 PMCID: PMC6688286 DOI: 10.1186/s13321-019-0378-z
Source DB: PubMed Journal: J Cheminform ISSN: 1758-2946 Impact factor: 5.514
Fig. 1 NaPLeS workflow schema. (a) NaPLeS training workflow schema. (b) NaPLeS query through the web application
Size of individual data sets prior to and after processing
| Database | Number of parsed molecules | Number of unique molecules | Origin of molecules | Link/references |
|---|---|---|---|---|
| UEFS | 503 | 478 | Generalist | |
| HIT | 530 | 477 | Plants | |
| SANCDB | 623 | 592 | Plants | |
| AfroDB | 944 | 874 | Plants | |
| Selleck Chem NP | 1590 | 1411 | Generalist | |
| NPACT | 1572 | 1452 | Plants | |
| ChEMBL NP | 1899 | 1328 | Generalist | |
| NuBBE | 2215 | 2022 | Plants, Insects | |
| StreptomeDB | 2443 | 2320 | Bacteria | |
| PubChem NP | 2938 | 2813 | Generalist | |
| NANPDB | 6840 | 3912 | Generalist | |
| ChEBI NP | 16223 | 15074 | Generalist | |
| NPAtlas | 20036 | 18909 | Bacteria, Fungi | |
| TCMDB | 58388 | 50910 | Plants | |
| InterBioScreen NP | 67910 | 66789 | Generalist | |
| Manually curated dataset | 77651 | 74368 | Generalist | |
| ZINC NP | 85201 | 67320 | Generalist | |
| UNPD (via ISDB) | 213206 | 157089 | Generalist | |
| Super Natural II (not in the training set) | 84554 | 59121 | Generalist | bioinf-applied.charite.de/supernatural_new |
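
The gap between parsed and unique molecules differs considerably between data sets. As a minimal sketch, the snippet below computes, for a few rows copied from the table, how many molecules were dropped during processing and what fraction was retained. The function and variable names are illustrative only and are not part of NaPLeS.

```python
# (database, parsed molecules, unique molecules): a subset of rows from the table
DATASETS = [
    ("UEFS", 503, 478),
    ("ChEMBL NP", 1899, 1328),
    ("NANPDB", 6840, 3912),
    ("NPAtlas", 20036, 18909),
    ("ZINC NP", 85201, 67320),
    ("UNPD (via ISDB)", 213206, 157089),
]

def retention(parsed, unique):
    """Fraction of parsed molecules that remain as unique molecules."""
    return unique / parsed

def summarize(rows):
    """Return (name, molecules dropped, retained fraction), lowest retention first."""
    summary = [(name, parsed - unique, retention(parsed, unique))
               for name, parsed, unique in rows]
    return sorted(summary, key=lambda row: row[2])

for name, dropped, fraction in summarize(DATASETS):
    print(f"{name:<18} dropped {dropped:>6}  retained {fraction:.1%}")
```

Among these rows, NANPDB retains the smallest share of its parsed molecules (roughly 57%), while UEFS retains the largest (about 95%).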
Fig. 2 Distribution of the NP-likeness score for natural products and synthetic molecules
Fig. 3 Top 5 molecules tagged as natural products in public databases with negative NP-likeness score. From left to right: 2,4-dichlorobenzohydrazide, 1,2,4-trichlorobenzene, malonohydrazide, picric acid and pyridine-2,3-dihydrazide
Fig. 4 Top 10 fragments with highest frequency among synthetic molecules
Fig. 5 Top 10 fragments with highest frequency among natural products
Fig. 6 Examples of oxygen-centered fragments that are highly repeated in natural products