Ye Hu, Jürgen Bajorath.
Abstract
In 2012, we reported 30 compound data sets and/or programs developed in our laboratory in a data article and made them freely available to the scientific community to support chemoinformatics and computational medicinal chemistry applications. These data sets and computational tools were provided for download from our website. Since publication of this data article, we have generated 13 new data sets with which we further extend our collection of publicly available data and tools. Due to changes in web servers and website architectures, data accessibility has recently been limited at times. Therefore, we have also transferred our data sets and tools to a public repository to ensure full and stable accessibility. To aid in data selection, we have classified the data sets according to scientific subject areas. Herein, we describe new data sets, introduce the data organization scheme, summarize the database content and provide detailed access information in ZENODO (doi: 10.5281/zenodo.8451 and doi: 10.5281/zenodo.8455).
Year: 2014 PMID: 25520777 PMCID: PMC4264635 DOI: 10.12688/f1000research.3713.1
Source DB: PubMed Journal: F1000Res ISSN: 2046-1402
Data sets and programs.
| Entry | Year | Subject area | Description |
|---|---|---|---|
| 1 | 2007 | VS_ML_1 | 9 activity classes (AC) with increasing structural diversity |
| 2 | 2007 | VS_ML_2 | ~1.44 million ZINC compounds used for various virtual screening trials |
| 3 | 2007 | PROG_1 | Molecular similarity histogram filtering |
| 4 | 2007 | SSR_1 | 4 SD files with 26 selectivity sets; compounds are annotated with selectivity values for different targets |
| 5 | 2008 | SSR_2 | 7 compound selectivity sets containing 267 biogenic amine GPCR antagonists |
| 6 | 2008 | SSR_3 | 18 selectivity sets for targets from 4 families |
| 7 | 2008 | VS_ML_3 | 25 sets of compounds of increasing complexity and size |
| 8 | 2009 | VS_ML_4 | 242 hERG inhibitors |
| 9 | 2009 | SSR_4 | 243 ionotropic glutamate ion channel antagonists |
| 10 | 2009 | PROG_2 | Combinatorial analog graph (CAG) program with a sample set consisting of 51 thrombin inhibitors |
| 11 | 2009 | VS_ML_5 | 20 AC from the literature and 15 AC from the Molecular Drug Data Report |
| 12 | 2010 | VS_ML_6 | 8 AC |
| 13 | 2010 | PROG_3 | Program to generate target selectivity patterns of scaffolds |
| 14 | 2010 | PROG_4 | Multi-target CAGs (see also entry 10) with a sample set containing 33 kinase inhibitors |
| 15 | 2010 | PROG_5 | SARANEA |
| 16 | 2010 | PROG_6 | 3D activity landscape program with a sample set containing 248 cathepsin S inhibitors |
| 17 | 2010 | SAR_1 | 2 sets of MMPs from BindingDB and ChEMBL |
| 18 | 2010 | PROG_7 | Similarity-potency tree (SPT) program with a sample set containing 874 factor Xa inhibitors |
| 19 | 2010 | VS_ML_7 | 17 target-directed compound sets; each set contains a minimum of 10 distinct scaffolds and each |
| 20 | 2011 | SAR_VZ | 10,489 malaria screening hits |
| 21 | 2011 | SAR_2 | 458 target-based sets with scaffolds and scaffold hierarchies |
| 22 | 2011 | SAR_VZ | 4 sets of compounds active against 3 or 4 targets |
| 23 | 2011 | SAR_VZ | 881 factor Xa inhibitors |
| 24 | 2011 | VS_ML_8 | 50 AC prioritized for similarity searching |
| 25 | 2011 | VS_ML_9 | 25 data sets from successful ligand-based virtual screening applications |
| 26 | 2011 | SAR_3 | 26 conserved scaffolds in activity profile sequences of length 4 |
| 27 | 2011 | PROG_8 | Scaffold distance function |
| 28 | 2011 | SAR_4 | 2 sets of compounds with multiple Ki or IC50 measurements against the same targets that differed within |
| 29 | 2012 | SAR_VZ | 4 AC |
| 30 | 2012 | SAR_5 | 5 sets of different types of activity cliffs |
| 31 | 2012 | VS_ML_10 | 50 AC for scaffold hopping analysis |
| 32 | 2012 | SAR_6 | 61 AC consisting of SAR transfer series with regular potency progression |
| 33 | 2013 | SAR_7 | 4 activity measurement type-dependent sets of scaffolds |
| 34 | 2013 | VS_ML_11 | 2 multi-target compound sets |
| 35 | 2013 | VS_ML_12 | 4 multi-target compound sets and 3 multi-mechanism sets |
| 36 | 2013 | SAR_8 | 2337 compound series matrices |
| 37 | 2013 | SAR_9 | 128 AC containing ≥100 compounds with Ki values |
| 38 | 2014 | SAR_10 | 30,452 and 45,607 target-based MMS with Ki and IC50 values, respectively |
| 39 | 2014 | SAR_11 | 221 drug-unique scaffolds |
| 40 | 2014 | SAR_12 | 92,734 MMPs based upon retrosynthetic rules for 435 AC |
| 41 | 2014 | SAR_13 | 20,073 and 25,297 MMP-based activity cliffs with Ki and IC50 values, respectively |
| 42 | 2014 | SAR_14 | 4 activity measurement type-dependent sets of SAR transfer series with approximate or regular |
| 43 | 2014 | SAR_15 | 169,889 and 240,322 transformation size-restricted MMPs based upon retrosynthetic rules with Ki and |
Data entries are organized according to scientific subject areas: structure-activity relationship (SAR) and structure-selectivity relationship (SSR) analysis, SAR visualization (SAR_VZ), virtual screening via similarity searching or machine learning (VS_ML), and programs (PROG). References in the Entry column provide the original publication introducing the program and/or data set. Program entries are described in more detail in Table 2 of our original data article [1]. The new compound data sets 31–43 are discussed in the text. Programs and data sets reported herein have been separately deposited in ZENODO for access and download.
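The subject-area scheme described in this note can be illustrated with a minimal Python sketch that maps an identifier from the "Subject area" column (for example `VS_ML_3` or `SAR_VZ`) to its area and groups table entries accordingly. The function names `subject_area` and `group_by_area`, the `SAMPLE_ENTRIES` list, and the prefix-matching rule are assumptions made for illustration only; they are not part of the deposited data sets or programs.

```python
from collections import defaultdict

# Subject areas as named in the table note. The keys are the identifier
# prefixes used in the "Subject area" column of the table.
SUBJECT_AREAS = {
    "SAR": "structure-activity relationship analysis",
    "SSR": "structure-selectivity relationship analysis",
    "SAR_VZ": "SAR visualization",
    "VS_ML": "virtual screening via similarity searching or machine learning",
    "PROG": "programs",
}


def subject_area(identifier):
    """Return the subject-area prefix of an identifier such as 'VS_ML_3'.

    Hypothetical helper: longer prefixes are tried first so that 'SAR_VZ'
    is not mistaken for a numbered 'SAR' entry.
    """
    for prefix in sorted(SUBJECT_AREAS, key=len, reverse=True):
        if identifier == prefix or identifier.startswith(prefix + "_"):
            return prefix
    raise ValueError(f"unknown subject area: {identifier!r}")


def group_by_area(entries):
    """Group (entry_number, identifier) pairs by subject-area prefix."""
    groups = defaultdict(list)
    for number, identifier in entries:
        groups[subject_area(identifier)].append(number)
    return dict(groups)


# A few rows copied from the table above (entry number, subject-area identifier).
SAMPLE_ENTRIES = [
    (1, "VS_ML_1"),
    (3, "PROG_1"),
    (4, "SSR_1"),
    (17, "SAR_1"),
    (20, "SAR_VZ"),
    (22, "SAR_VZ"),
    (43, "SAR_15"),
]

if __name__ == "__main__":
    for area, numbers in group_by_area(SAMPLE_ENTRIES).items():
        print(f"{area} ({SUBJECT_AREAS[area]}): entries {numbers}")
```

Matching the longest prefix first is a design choice of this sketch; it keeps the unnumbered `SAR_VZ` identifiers separate from the numbered `SAR_n` entries.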