One- to four-dimensional kernels for virtual screening and the prediction of physical, chemical, and biological properties.

Chloé-Agathe Azencott, Alexandre Ksikes, S Joshua Swamidass, Jonathan H Chen, Liva Ralaivola, Pierre Baldi.

Abstract

Many chemoinformatics applications, including high-throughput virtual screening, benefit from being able to rapidly predict the physical, chemical, and biological properties of small molecules to screen large repositories and identify suitable candidates. When training sets are available, machine learning methods provide an effective alternative to ab initio methods for these predictions. Here, we leverage rich molecular representations including 1D SMILES strings, 2D graphs of bonds, and 3D coordinates to derive efficient machine learning kernels to address regression problems. We further expand the library of available spectral kernels for small molecules developed for classification problems to include 2.5D surface and 3D kernels using Delaunay tetrahedrization and other techniques from computational geometry, 3D pharmacophore kernels, and 3.5D or 4D kernels capable of taking into account multiple molecular configurations, such as conformers. The kernels are comprehensively tested using cross-validation and redundancy-reduction methods on regression problems using several available data sets to predict boiling points, melting points, aqueous solubility, octanol/water partition coefficients, and biological activity with state-of-the-art results. When sufficient training data are available, 2D spectral kernels tend to yield the best and most robust results, better than the state of the art. On data sets containing thousands of molecules, the kernels achieve a squared correlation coefficient of 0.91 for aqueous solubility prediction and 0.94 for octanol/water partition coefficient prediction. Averaging over conformations improves the performance of kernels based on the three-dimensional structure of molecules, especially on challenging data sets. Kernel predictors for aqueous solubility (kSOL), LogP (kLOGP), and melting point (kMELT) are available over the Web through: http://cdb.ics.uci.edu.
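
To make the idea of a 1D kernel on SMILES strings concrete, here is a minimal sketch. It is not the authors' implementation. It assumes that a molecule's "spectrum" is the count of every contiguous substring of length k in its SMILES string, and that two spectra are compared with a MinMax (generalized Tanimoto) similarity. The function names, the default k = 3, and the choice of MinMax are illustrative assumptions that the abstract does not state.

```python
from collections import Counter


def smiles_kmers(smiles: str, k: int = 3) -> Counter:
    """Count all contiguous substrings of length k in a SMILES string.

    Hypothetical helper: the substring length k is an assumption.
    """
    return Counter(smiles[i:i + k] for i in range(len(smiles) - k + 1))


def minmax_kernel(a: Counter, b: Counter) -> float:
    """MinMax (generalized Tanimoto) similarity between two count vectors."""
    keys = set(a) | set(b)
    numerator = sum(min(a[key], b[key]) for key in keys)
    denominator = sum(max(a[key], b[key]) for key in keys)
    # Two empty spectra share no features; define their similarity as 0.
    return numerator / denominator if denominator else 0.0


def kernel_1d(smiles_a: str, smiles_b: str, k: int = 3) -> float:
    """Assumed 1D kernel: MinMax similarity of the two k-mer spectra."""
    return minmax_kernel(smiles_kmers(smiles_a, k), smiles_kmers(smiles_b, k))


if __name__ == "__main__":
    # Ethanol vs. 1-propanol, written as simple SMILES strings.
    print(kernel_1d("CCO", "CCCO", k=2))
```

A Gram matrix built from such pairwise values could then be passed to any kernel regression method that accepts a precomputed kernel. The raw-string comparison used here depends on how each SMILES string was written, so in practice the strings would need to be canonicalized first.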

Entities: Chemical

Year: 2007 PMID: 17338509 DOI: 10.1021/ci600397p

Source DB: PubMed Journal: J Chem Inf Model ISSN: 1549-9596 Impact factor: 4.956

Cited

14 in total

1. A CROC stronger than ROC: measuring, visualizing and optimizing early retrieval.

Authors: S Joshua Swamidass; Chloé-Agathe Azencott; Kenny Daily; Pierre Baldi
Journal: Bioinformatics Date: 2010-04-07 Impact factor: 6.937

2. Lossless compression of chemical fingerprints using integer entropy codes improves storage and retrieval.

Authors: Pierre Baldi; Ryan W Benz; Daniel S Hirschberg; S Joshua Swamidass
Journal: J Chem Inf Model Date: 2007-10-30 Impact factor: 4.956

3. Learning to predict chemical reactions.

Authors: Matthew A Kayala; Chloé-Agathe Azencott; Jonathan H Chen; Pierre Baldi
Journal: J Chem Inf Model Date: 2011-09-02 Impact factor: 4.956

4. Analysis and use of fragment-occurrence data in similarity-based virtual screening.

Authors: Shereena M Arif; John D Holliday; Peter Willett
Journal: J Comput Aided Mol Des Date: 2009-06-18 Impact factor: 3.686

5. Deep architectures and deep learning in chemoinformatics: the prediction of aqueous solubility for drug-like molecules.

Authors: Alessandro Lusci; Gianluca Pollastri; Pierre Baldi
Journal: J Chem Inf Model Date: 2013-07-02 Impact factor: 4.956

6. Protein-ligand interaction prediction: an improved chemogenomics approach.

7. Machine learning for in silico virtual screening and chemical genomics: new strategies. (Review)

8. A constructive approach for discovering new drug leads: Using a kernel methodology for the inverse-QSAR problem.

9. Estimation of the applicability domain of kernel-based machine learning models for virtual screening.

10. Influence relevance voting: an accurate and interpretable virtual high throughput screening method.