Nonlinear multivariate regression outperforms several concisely designed neural networks on three QSPR data sets

Abstract

Neural networks (NNs) are accepted as the most powerful nonlinear technique in QSAR and QSPR modeling. However, the NN models are often very robust, containing a large number of parameters optimized during the training procedure. We have recently found (J. Chem. Inf. Comput. Sci. 1999, 39, 121-132) that the simpler nonlinear multiregression (MR) models are significantly better than the robust NNs, according to the same statistical parameters. In the present paper we investigated whether the nonlinear MR models are also better than the concisely designed NN models. Nonlinear MR models were generated in the following way. First, nonlinear terms, the 2-fold and 3-fold cross-products of initial descriptors, were calculated and added to the initial descriptors. Then, a combination of two powerful techniques for descriptor selection (CROMRsel for "the best" selection and CROMRiisel for approximate, "i by i" stepwise selection) was used to detect the most important descriptors in MR models. For boiling points (BPs) of 150 alkanes the 20-descriptor MR model produced a cross-validated (CV) standard error of 2.88 K, whereas the best NN model (with 70-80 adjusted weights) gave 3.60 K. Prediction of BPs of 50 compounds using the 17-descriptor MR model (obtained on 100 compounds) gave a standard error of 3.58 K. In the modeling of 243 chemical shifts, the CV standard errors were (in ppm) 0.89 and 1.19 with 15- and 9-descriptor MR models, respectively. The best NN models adjusted 60-90 weights and achieved 1.42 ppm. The standard error in predicting the 83 chemical shifts using the 10-descriptor MR model obtained on 160 samples was 1.25 ppm. It is also shown on this data set that the model quality depends on the scaling procedure used for transformation of the initial descriptors. In modeling the sublimation enthalpy the CV correlation coefficient was 0.97 using the best 4-descriptor MR model versus 0.93 obtained using an NN with approximately 50 adjusted weights. The CV correlation coefficient in predicting the sublimation enthalpies for 21 compounds using the 4-descriptor MR model was 0.98. This is, to our knowledge, the first unambiguous result that shows a way of obtaining nonlinear MR models with better fitted, cross-validated, and predictive performance than the corresponding NN models. Moreover, the nonlinear MR models are significantly simpler than the NN models, which allows one to establish the functional relationships between the modeled property/activity and descriptors.
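
The workflow described in the abstract (expand the descriptors with 2-fold and 3-fold cross-products, select a small subset, and judge the model by its cross-validated standard error) can be illustrated with a minimal Python sketch. It is not the authors' CROMRsel or CROMRiisel code: the selection step here is a plain greedy forward search used as a stand-in, the function names are hypothetical, and several details are assumptions (squares and cubes of a single descriptor are counted as cross-products, cross-validation is leave-one-out, and the CV standard error is taken as the root mean square of the left-out residuals).

```python
from itertools import combinations_with_replacement

import numpy as np


def expand_with_cross_products(X, max_order=3):
    """Return the initial descriptors plus all 2-fold .. max_order-fold products.

    Assumption: products of a descriptor with itself (x0*x0, ...) are included.
    Each label is a tuple of the column indices multiplied together.
    """
    n_features = X.shape[1]
    columns = [X[:, j] for j in range(n_features)]
    labels = [(j,) for j in range(n_features)]
    for order in range(2, max_order + 1):
        for combo in combinations_with_replacement(range(n_features), order):
            columns.append(np.prod(X[:, list(combo)], axis=1))
            labels.append(combo)
    return np.column_stack(columns), labels


def loo_cv_standard_error(X, y):
    """Leave-one-out RMS prediction error of a least-squares fit with intercept."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    residuals = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        residuals[i] = y[i] - A[i] @ coef
    return float(np.sqrt(np.mean(residuals ** 2)))


def forward_select(X, y, n_select):
    """Greedy stepwise selection: add the column that most lowers the CV error.

    Stand-in for the selection programs named in the abstract, not a
    reimplementation of them.
    """
    selected, remaining = [], list(range(X.shape[1]))
    best_score = float("inf")
    for _ in range(n_select):
        scores = [(loo_cv_standard_error(X[:, selected + [j]], y), j)
                  for j in remaining]
        best_score, best_j = min(scores)
        selected.append(best_j)
        remaining.remove(best_j)
    return selected, best_score


# Synthetic demonstration data (not from the paper): the property depends on
# one cross-product and one initial descriptor.
rng = np.random.default_rng(0)
X0 = rng.normal(size=(60, 3))
y = 2.0 * X0[:, 0] * X0[:, 1] - X0[:, 2] + 1.0 + rng.normal(scale=0.01, size=60)

X_expanded, labels = expand_with_cross_products(X0, max_order=3)
chosen, cv_error = forward_select(X_expanded, y, n_select=2)
print("selected terms:", [labels[j] for j in chosen])
print("LOO CV standard error:", cv_error)
```

A greedy search is cheap but can miss the best subset, which is presumably why the abstract pairs an exhaustive "best" selection with a stepwise one. The sketch also leaves out descriptor scaling, which the abstract reports as affecting model quality.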

Year:  2000        PMID: 10761147     DOI: 10.1021/ci990061k

Source DB:  PubMed          Journal:  J Chem Inf Comput Sci        ISSN: 0095-2338


  6 in total

1.  On use of the variable Zagreb vM2 index in QSPR: boiling points of benzenoid hydrocarbons.

Authors:  Sonja Nikolić; Ante Milicević; Nenad Trinajstić; Albin Jurić
Journal:  Molecules       Date:  2004-12-31       Impact factor: 4.411

Review 2.  Variable connectivity index as a tool for modeling structure-property relationships.

Authors:  Milan Randić; Matevz Pompe; Denise Mills; Subhash C Basak
Journal:  Molecules       Date:  2004-12-31       Impact factor: 4.411

3.  Self-organizing neural networks for modeling robust 3D and 4D QSAR: application to dihydrofolate reductase inhibitors.

Authors:  Jaroslaw Polanski; Andrzej Bak; Rafal Gieleciak; Tomasz Magdziarz
Journal:  Molecules       Date:  2004-12-31       Impact factor: 4.411

Review 4.  Modeling kinetics of subcellular disposition of chemicals.

Authors:  Stefan Balaz
Journal:  Chem Rev       Date:  2009-05       Impact factor: 60.622

5.  Toxicity of aliphatic ethers: a comparative study.

Authors:  Ante Milicević; Sonja Nikolić; Nenad Trinajstić
Journal:  Mol Divers       Date:  2006-05-19       Impact factor: 2.943

6.  A Study on the Prediction of Compressive Strength of Self-Compacting Recycled Aggregate Concrete Utilizing Novel Computational Approaches.

Authors:  Jesús de-Prado-Gil; Covadonga Palencia; P Jagadesh; Rebeca Martínez-García
Journal:  Materials (Basel)       Date:  2022-07-28       Impact factor: 3.748
