| Literature DB >> 24593807 |
Disha Gupta-Ostermann1, Veerabahu Shanmugasundaram, Jürgen Bajorath.
Abstract
The SAR matrix data structure organizes compound data sets according to structurally analogous matching molecular series in a format reminiscent of conventional R-group tables. An intrinsic feature of SAR matrices is that they contain many virtual compounds that represent unexplored combinations of core structures and substituents extracted from compound data sets on the basis of the matched molecular pair formalism. These virtual compounds are candidates for further exploration but are difficult, if not impossible to prioritize on the basis of visual inspection of multiple SAR matrices. Therefore, we introduce herein a compound neighborhood concept as an extension of the SAR matrix data structure that makes it possible to identify preferred virtual compounds for further analysis. On the basis of well-defined compound neighborhoods, the potency of virtual compounds can be predicted by considering individual contributions of core structures and substituents from neighbors. In extensive benchmark studies, virtual compounds have been prioritized in different data sets on the basis of multiple neighborhoods yielding accurate potency predictions.Entities:
Mesh:
Substances:
Year: 2014 PMID: 24593807 DOI: 10.1021/ci5000483
Source DB: PubMed Journal: J Chem Inf Model ISSN: 1549-9596 Impact factor: 4.956