Support vector machine regression (LS-SVM)--an alternative to artificial neural networks (ANNs) for the analysis of quantum chemistry data?

Roman M Balabin, Ekaterina I Lomakina.

Abstract

A multilayer feed-forward artificial neural network (MLP-ANN) with a single hidden layer that contains a finite number of neurons can be regarded as a universal non-linear approximator. Today, the ANN method and the multiple linear regression (MLR) model are widely used in quantum chemistry (QC) data analysis (e.g., thermochemistry) to improve the accuracy of theoretical methods (e.g., Gaussian G2-G4, B3LYP/B3-LYP, X1, or W1). In this study, an alternative approach based on support vector machines (SVMs) is used: least squares support vector machine (LS-SVM) regression. It has been applied to ab initio (first-principles) and density functional theory (DFT) quantum chemistry data. The QC + SVM methodology is thus an alternative to the QC + ANN one. The task of the study was to estimate the Møller-Plesset (MPn) or DFT (B3LYP, BLYP, BMK) energies calculated with large basis sets (e.g., 6-311G(3df,3pd)) from calculations with smaller ones (6-311G, 6-311G*, 6-311G**) plus molecular descriptors. A molecular set (BRM-208) containing a total of 208 organic molecules was constructed and used for the LS-SVM training, cross-validation, and testing. MP2, MP3, MP4(DQ), MP4(SDQ), and MP4/MP4(SDTQ) ab initio methods were tested. Hartree-Fock (HF/SCF) results were also reported for comparison. Furthermore, constitutional (CD: total number of atoms and mole fractions of different atoms) and quantum-chemical (QD: HOMO-LUMO gap, dipole moment, average polarizability, and quadrupole moment) molecular descriptors were used to build the LS-SVM calibration model. Prediction accuracies (MADs) of 1.62 ± 0.51 and 0.85 ± 0.24 kcal mol⁻¹ (1 kcal mol⁻¹ = 4.184 kJ mol⁻¹) were reached for SVM-based approximations of ab initio and DFT energies, respectively. The LS-SVM model was more accurate than the MLR model. A comparison with the artificial neural network approach shows that the accuracy of the LS-SVM method is similar to the accuracy of ANN. The extrapolation and interpolation results show that LS-SVM is superior to the ANN method by almost an order of magnitude in terms of the stability, generality, and robustness of the final model. The LS-SVM model needs a much smaller number of samples (a much smaller sample set) to make accurate predictions. Potential energy surface (PES) approximations for molecular dynamics (MD) studies are discussed as a promising application for the LS-SVM calibration approach. This journal is © the Owner Societies 2011
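The LS-SVM regression step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the standard LS-SVM formulation with an RBF kernel, in which the model is obtained by solving a single linear system, and it runs on synthetic data rather than the BRM-208 set. The function names, the hyperparameter values (`gamma`, `sigma`), and the reading of MAD as mean absolute deviation are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def lssvm_fit(X, y, gamma, sigma):
    """Solve the LS-SVM linear system for the bias b and dual weights alpha.

        [ 0   1^T         ] [ b     ]   [ 0 ]
        [ 1   K + I/gamma ] [ alpha ] = [ y ]
    """
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]

def lssvm_predict(X_train, b, alpha, X_new, sigma):
    """Evaluate f(x) = sum_i alpha_i * k(x, x_i) + b for each row of X_new."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

def mean_absolute_deviation(y_true, y_pred):
    """MAD, taken here to mean the mean absolute deviation."""
    return float(np.mean(np.abs(y_true - y_pred)))

def kcal_to_kj(value_kcal_per_mol):
    """Unit conversion quoted in the abstract: 1 kcal/mol = 4.184 kJ/mol."""
    return value_kcal_per_mol * 4.184

# Synthetic stand-in data (hypothetical): three made-up "descriptor" columns
# and a smooth target. None of this is real quantum chemistry data.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(60, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 - X[:, 2]

X_train, y_train = X[:40], y[:40]
X_test, y_test = X[40:], y[40:]

gamma, sigma = 100.0, 1.0  # illustrative hyperparameters, not tuned values
b, alpha = lssvm_fit(X_train, y_train, gamma, sigma)

pred_train = lssvm_predict(X_train, b, alpha, X_train, sigma)
pred_test = lssvm_predict(X_train, b, alpha, X_test, sigma)
mad_test = mean_absolute_deviation(y_test, pred_test)
print(f"test MAD on synthetic data: {mad_test:.4f}")
```

In a setting like the one the abstract describes, the rows of `X` would hold the small-basis-set energy together with the constitutional and quantum-chemical descriptors, and `y` would hold the large-basis-set energy. The hyperparameters would normally be chosen by cross-validation, which this sketch omits.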

Year:  2011        PMID: 21594265     DOI: 10.1039/c1cp00051a

Source DB:  PubMed          Journal:  Phys Chem Chem Phys        ISSN: 1463-9076            Impact factor:   3.676


  11 in total

1.  Solvation Free Energy Calculations with Quantum Mechanics/Molecular Mechanics and Machine Learning Models.

Authors:  Pan Zhang; Lin Shen; Weitao Yang
Journal:  J Phys Chem B       Date:  2019-01-15       Impact factor: 2.991

2.  Application of Machine Learning in Developing Quantitative Structure-Property Relationship for Electronic Properties of Polyaromatic Compounds.

Authors:  Tuan H Nguyen; Lam H Nguyen; Thanh N Truong
Journal:  ACS Omega       Date:  2022-06-17

3.  Machine learning and semi-empirical calculations: a synergistic approach to rapid, accurate, and mechanism-based reaction barrier prediction.

Authors:  Elliot H E Farrar; Matthew N Grayson
Journal:  Chem Sci       Date:  2022-06-14       Impact factor: 9.969

4.  Polymeric Nanocarriers: A Transformation in Doxorubicin Therapies. (Review)

Authors:  Kamila Butowska; Anna Woziwodzka; Agnieszka Borowik; Jacek Piosik
Journal:  Materials (Basel)       Date:  2021-04-22       Impact factor: 3.623

5.  Estimation of diffusion coefficients from voltammetric signals by support vector and gaussian process regression.

Authors:  Martin Bogdan; Dominik Brugger; Wolfgang Rosenstiel; Bernd Speiser
Journal:  J Cheminform       Date:  2014-05-28       Impact factor: 5.514

6.  Inline Measurement of Particle Concentrations in Multicomponent Suspensions using Ultrasonic Sensor and Least Squares Support Vector Machines.

Authors:  Xiaobin Zhan; Shulan Jiang; Yili Yang; Jian Liang; Tielin Shi; Xiwen Li
Journal:  Sensors (Basel)       Date:  2015-09-18       Impact factor: 3.576

7.  FLOating-Window Projective Separator (FloWPS): A Data Trimming Tool for Support Vector Machines (SVM) to Improve Robustness of the Classifier.

Authors:  Victor Tkachev; Maxim Sorokin; Artem Mescheryakov; Alexander Simonov; Andrew Garazha; Anton Buzdin; Ilya Muchnik; Nicolas Borisov
Journal:  Front Genet       Date:  2019-01-15       Impact factor: 4.599

8.  A fourth-generation high-dimensional neural network potential with accurate electrostatics including non-local charge transfer.

Authors:  Tsz Wai Ko; Jonas A Finkler; Stefan Goedecker; Jörg Behler
Journal:  Nat Commun       Date:  2021-01-15       Impact factor: 14.919

9.  Local Kernel Regression and Neural Network Approaches to the Conformational Landscapes of Oligopeptides.

Authors:  Raimon Fabregat; Alberto Fabrizio; Edgar A Engel; Benjamin Meyer; Veronika Juraskova; Michele Ceriotti; Clemence Corminboeuf
Journal:  J Chem Theory Comput       Date:  2022-02-18       Impact factor: 6.006

10.  Flexible Data Trimming Improves Performance of Global Machine Learning Methods in Omics-Based Personalized Oncology.

Authors:  Victor Tkachev; Maxim Sorokin; Constantin Borisov; Andrew Garazha; Anton Buzdin; Nicolas Borisov
Journal:  Int J Mol Sci       Date:  2020-01-22       Impact factor: 5.923
