N Kireeva1,2, I I Baskin1,3, H A Gaspar1, D Horvath1, G Marcou1, A Varnek4.
Abstract
Here, the utility of Generative Topographic Maps (GTM) for data visualization, structure-activity modeling and database comparison is evaluated using subsets of the Database of Useful Decoys (DUD). Unlike other popular dimensionality reduction approaches such as Principal Component Analysis, Sammon Mapping or Self-Organizing Maps, GTM provides data probability distribution functions (PDFs), both in the high-dimensional space defined by molecular descriptors and in the 2D latent space. PDFs for the molecules of different activity classes were successfully used to build classification models in the framework of the Bayesian approach. Because the PDFs are represented by a mixture of Gaussian functions, the Bhattacharyya kernel has been proposed as a measure of the overlap of datasets, which leads to an elegant method of global comparison of chemical libraries.
Keywords: Bhattacharyya kernel; Comparison of databases; Data visualization; Dimensionality reduction; Generative topographic maps; Manifold learning; Predicting activity profiles
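The two uses of the PDFs described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each dataset has already been summarized as a probability vector over the same grid of latent-space nodes (a discrete simplification of the Gaussian-mixture form), and the function names `bhattacharyya_coefficient` and `bayes_class_posteriors` are hypothetical.

```python
import numpy as np


def bhattacharyya_coefficient(p, q):
    """Overlap of two discrete distributions defined on the same nodes.

    Assumption: p and q are non-negative weights over identical latent
    nodes; they are normalized here so that each sums to 1.
    Returns a value in [0, 1]: 1 for identical distributions,
    0 for distributions with no shared support.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(np.sqrt(p * q)))


def bayes_class_posteriors(node_pdfs, priors):
    """Bayes' rule per node: P(class | node) from P(node | class) and P(class).

    node_pdfs: array of shape (n_classes, n_nodes), one distribution per class.
    priors:    array of shape (n_classes,), class prior probabilities.
    Assumption: every node has non-zero density for at least one class.
    """
    node_pdfs = np.asarray(node_pdfs, dtype=float)
    priors = np.asarray(priors, dtype=float)
    joint = node_pdfs * priors[:, None]          # P(node, class)
    return joint / joint.sum(axis=0, keepdims=True)


# Toy data (invented for illustration only): two libraries on a 4-node map.
library_a = [0.4, 0.3, 0.2, 0.1]
library_b = [0.1, 0.2, 0.3, 0.4]
overlap = bhattacharyya_coefficient(library_a, library_b)

# Two activity classes with equal priors on a 2-node map.
posteriors = bayes_class_posteriors([[0.8, 0.2], [0.2, 0.8]], [0.5, 0.5])
```

An overlap close to 1 would indicate that the two libraries occupy nearly the same regions of the map, while the posterior matrix assigns each node the class probabilities that a molecule projected there would receive.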
Year: 2012 PMID: 27477099 DOI: 10.1002/minf.201100163
Source DB: PubMed Journal: Mol Inform ISSN: 1868-1743 Impact factor: 3.353