Ioana Oprisiu, Sergii Novotarskyi, Igor V Tetko.
Abstract
The Online Chemical Modeling Environment (OCHEM, http://ochem.eu) is a web-based platform that provides tools for automating the typical steps necessary to create a predictive QSAR/QSPR model. The platform consists of two major subsystems: a database of experimental measurements and a modeling framework. So far, OCHEM has been limited to the processing of individual compounds. In this work, we extended OCHEM with a new ability to store and model properties of binary non-additive mixtures. The developed system is publicly accessible, meaning that any user on the Web can store new data for binary mixtures and develop models to predict their non-additive properties. The database already contains almost 10,000 data points for the density, bubble point, and azeotropic behavior of binary mixtures. For these data, we developed models for both qualitative (azeotrope/zeotrope) and quantitative (density and bubble point) endpoints using different learning methods and specially developed descriptors for mixtures. The prediction performance of the models was similar to or better than that reported in previous studies. Thus, we have developed and made publicly available a powerful system for modeling mixtures of chemical compounds on the Web.
Year: 2013 PMID: 23321019 PMCID: PMC3568005 DOI: 10.1186/1758-2946-5-4
Source DB: PubMed Journal: J Cheminform ISSN: 1758-2946 Impact factor: 5.514
Figure 1. Methodology used to calculate descriptors for mixtures.
Figure 2. Protocols for validation of mixture property models.
Figure 3. Screenshot of the OCHEM features implemented to analyze mixtures.
Statistical parameters of Online Chemical Modeling Environment (OCHEM, http://ochem.eu) models with the lowest RMSE according to the “” cross-validation protocol for the prediction of binary mixture densities
| | ||||||
|---|---|---|---|---|---|---|
| Method/descriptors | R2 | RMSE | R2 | RMSE | R2 | RMSE |
| LibSVM/Dragon | 0.69 ± 0.05* | 0.014 ± 0.001 | 0.81 ± 0.09 | 0.011 ± 0.002 | 0.88 ± 0.04 | 0.0089 ± 0.001 |
| ASNN/Inductive descriptors | 0.68 ± 0.04 | 0.014 ± 0.0008 | 0.72 ± 0.04 | 0.0131 ± 0.0009 | 0.81 ± 0.06 | 0.011 ± 0.001 |
| LibSVM/Inductive descriptors | 0.71 ± 0.05 | 0.014 ± 0.001 | 0.81 ± 0.03 | 0.0109 ± 0.0005 | 0.88 ± 0.04 | 0.0084 ± 0.001 |
| ASNN/Dragon | 0.56 ± 0.06 | 0.016 ± 0.001 | 0.69 ± 0.04 | 0.014 ± 0.001 | 0.85 ± 0.04 | 0.0099 ± 0.001 |
| ASNN/ChemAxon | 0.55 ± 0.06 | 0.017 ± 0.001 | 0.69 ± 0.04 | 0.0137 ± 0.0009 | 0.88 ± 0.03 | 0.0088 ± 0.001 |
* 95% confidence intervals were calculated using a bootstrap procedure based on 1,000 replicas (implemented for all models in OCHEM). RMSE – Root Mean Squared Error; R2 – square of the Pearson correlation coefficient.
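The footnote above can be illustrated with a minimal sketch of a bootstrap confidence interval for RMSE and R2 (squared Pearson correlation). This is not OCHEM's implementation: the source only states that 1,000 bootstrap replicas were used, so the percentile method, the resampling of (observed, predicted) pairs, the function names, and the example values below are all assumptions made for illustration.

```python
import math
import random


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r2_pearson(observed, predicted):
    """Square of the Pearson correlation coefficient."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        return float("nan")  # correlation is undefined for a constant sample
    return cov ** 2 / (var_o * var_p)


def bootstrap_ci(observed, predicted, metric, n_replicas=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (assumed method) for a metric.

    Returns (point_estimate, lower, upper).
    """
    rng = random.Random(seed)
    n = len(observed)
    values = []
    for _ in range(n_replicas):
        # Resample (observed, predicted) pairs with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        values.append(metric([observed[i] for i in idx],
                             [predicted[i] for i in idx]))
    values = sorted(v for v in values if not math.isnan(v))
    lower = values[int((alpha / 2) * (len(values) - 1))]
    upper = values[int((1 - alpha / 2) * (len(values) - 1))]
    return metric(observed, predicted), lower, upper


# Hypothetical density values, invented for this example only.
observed = [0.79, 0.81, 0.85, 0.88, 0.90, 0.93, 0.97, 1.00, 1.03, 1.10]
predicted = [0.80, 0.80, 0.86, 0.87, 0.91, 0.95, 0.96, 1.01, 1.02, 1.08]

for name, metric in (("RMSE", rmse), ("R2", r2_pearson)):
    point, lower, upper = bootstrap_ci(observed, predicted, metric)
    print(f"{name}: {point:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

The table reports intervals in a symmetric "value ± half-width" form; a percentile interval such as the one sketched here is generally asymmetric, so reproducing the ± notation would require a further convention (for example, reporting half of the interval width), which the source does not specify.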
Comparison of performances of OCHEM models with the consensus model of Oprisiu et al. [3] for the prediction of bubble temperatures of mixtures
| | |||||
|---|---|---|---|---|---|
| Training set | Q2 | 0.93 ± 0.01 | 0.95 | 0.92 ± 0.03 | 0.9 |
| | RMSE | 6.2 ± 0.6 | 5.2 | 6.5 ± 0.5 | 7.0 |
| Test set | Q2 | 0.92 ± 0.01 | 0.88 | 0.56 ± 0.06 | 0.4 |
| | RMSE | 5.7 ± 0.3 | 5.9 | 19 ± 1 | 21.4 |
Q2 – coefficient of determination.
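For readers comparing Q2 with the R2 used in the density table, the following sketch computes the coefficient of determination in its common form, 1 minus the ratio of the residual sum of squares to the total sum of squares. The source does not say which mean is used in the denominator (for example, the training-set or test-set mean); the sketch assumes the mean of the observed values passed in. The function name and the numbers are hypothetical.

```python
def q2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumption: SS_tot is taken around the mean of `observed`.
    """
    mean_o = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Invented values: exact predictions versus predictions shifted by a constant.
observed = [1.0, 2.0, 3.0]
print(q2(observed, [1.0, 2.0, 3.0]))  # exact predictions
print(q2(observed, [2.0, 3.0, 4.0]))  # every prediction is off by +1
```

Unlike the squared Pearson correlation, which equals 1 for any perfectly linear relationship between predictions and observations, this quantity penalizes systematic bias and can become negative, as the shifted example shows.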
Comparison of classification results for the azeotropic behavior of mixtures calculated using OCHEM and Oprisiu [4] models
| | | | | | |
|---|---|---|---|---|---|
| Balanced Accuracy | 0.80 ± 0.04 | 0.78 ± 0.04 | 0.82 | 0.85 ± 0.07 | 0.82 |
| Recall of zeotropes | 0.77 | 0.77 | 0.78 | 0.74 | 0.73 |
| Recall of azeotropes | 0.83 | 0.80 | 0.85 | 0.95 | 0.91 |
*OCHEM models were developed using 465 mixtures, which also included pure compounds.
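As a consistency check on the classification table, balanced accuracy for a two-class problem is, by its standard definition, the mean of the two per-class recalls. The sketch below recomputes it from the recall values in the table; the function and variable names are illustrative, and the source does not state how its own values were computed.

```python
def balanced_accuracy(recall_zeotropes, recall_azeotropes):
    """Two-class balanced accuracy: the mean of the per-class recalls."""
    return (recall_zeotropes + recall_azeotropes) / 2


# (recall of zeotropes, recall of azeotropes, reported balanced accuracy),
# one tuple per column of the table above, in table order.
columns = [
    (0.77, 0.83, 0.80),
    (0.77, 0.80, 0.78),
    (0.78, 0.85, 0.82),
    (0.74, 0.95, 0.85),
    (0.73, 0.91, 0.82),
]

for recall_zeo, recall_azeo, reported in columns:
    recomputed = balanced_accuracy(recall_zeo, recall_azeo)
    print(f"recomputed {recomputed:.3f}, reported {reported:.2f}")
```

The recomputed values agree with the reported ones to within the rounding of the two-decimal recalls.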