| Literature DB >> 30259699 |
Abstract
This paper introduces two generative topographic mapping (GTM) methods that can be used for data visualization, regression analysis, inverse analysis, and the determination of applicability domains (ADs). In GTM-multiple linear regression (GTM-MLR), the prior probability distribution of the descriptors or explanatory variables (X) is calculated with GTM, and the posterior probability distribution of the property/activity or objective variable (y) given X is calculated with MLR; inverse analysis is then performed using the product rule and Bayes' theorem. In GTM-regression (GTMR), X and y are combined and GTM is performed to obtain the joint probability distribution of X and y; this leads to the posterior probability distributions of y given X and of X given y, which are used for regression and inverse analysis, respectively. Simulations using linear and nonlinear datasets and quantitative structure-activity relationship (QSAR) and quantitative structure-property relationship (QSPR) datasets confirm that GTM-MLR and GTMR enable data visualization, regression analysis, and inverse analysis considering appropriate ADs. Python and MATLAB codes for the proposed algorithms are available at https://github.com/hkaneko1985/gtm-generativetopographicmapping.Keywords: Applicability Domains; Data Visualization; Generative Topographic Mapping; Inverse Analysis; Regression
Mesh:
Substances:
Year: 2018 PMID: 30259699 DOI: 10.1002/minf.201800088
Source DB: PubMed Journal: Mol Inform ISSN: 1868-1743 Impact factor: 3.353