
Benchmarking study of parameter variation when using signature fingerprints together with support vector machines.

Jonathan Alvarsson, Martin Eklund, Claes Andersson, Lars Carlsson, Ola Spjuth, Jarl E S Wikberg.

Abstract

QSAR modeling using molecular signatures and support vector machines with a radial basis function kernel is increasingly used for virtual screening in the drug discovery field. This method has three free parameters: C, γ, and signature height. C is a penalty parameter that limits overfitting, γ controls the width of the radial basis function kernel, and the signature height determines how much of the molecule is described by each atom signature. Determining optimal values for these parameters is time-consuming, so good default values could save considerable computational cost. The goal of this project was to investigate whether such default values could be found, using seven public QSAR data sets spanning a wide range of end points and both a bit version and a count version of the molecular signatures. On the basis of the experiments performed, we recommend a parameter set of heights 0 to 2 for the count version of the signature fingerprints and heights 0 to 3 for the bit version, in combination with a support vector machine using C in the range of 1 to 100 and γ in the range of 0.001 to 0.1. When data sets are small or longer run times are not a problem, there is reason to consider adding height 3 to the count fingerprint and using a wider grid search. However, marked improvements should not be expected.
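
The recommended defaults can be written down as a small search grid. The following is a minimal sketch, not the authors' code: it assumes powers-of-ten spacing inside the stated C and γ ranges, the common kernel form exp(-γ‖x − y‖²), and fingerprints stored as dictionaries from signature string to occurrence count. The function names and the example signature strings are hypothetical.

```python
import math
from itertools import product

# Ranges taken from the abstract; the powers-of-ten spacing is an assumption.
C_VALUES = [1, 10, 100]
GAMMA_VALUES = [0.001, 0.01, 0.1]
# Inclusive signature-height ranges recommended per fingerprint type.
HEIGHTS = {"count": (0, 2), "bit": (0, 3)}


def parameter_grid(fingerprint):
    """Return every (heights, C, gamma) combination for one fingerprint type."""
    heights = HEIGHTS[fingerprint]
    return [(heights, c, g) for c, g in product(C_VALUES, GAMMA_VALUES)]


def to_bit_version(counts):
    """Collapse a count fingerprint (signature -> occurrences) to presence only."""
    return {sig: 1 for sig, n in counts.items() if n > 0}


def rbf_kernel(x, y, gamma):
    """RBF kernel exp(-gamma * ||x - y||^2) on sparse dict fingerprints."""
    keys = set(x) | set(y)
    sq_dist = sum((x.get(k, 0) - y.get(k, 0)) ** 2 for k in keys)
    return math.exp(-gamma * sq_dist)


# Hypothetical count fingerprints for two molecules.
mol_a = {"[C]": 3, "[O]": 1}
mol_b = {"[C]": 1, "[O]": 1}

grid = parameter_grid("count")
similarity = rbf_kernel(mol_a, mol_b, gamma=0.1)
bit_similarity = rbf_kernel(to_bit_version(mol_a), to_bit_version(mol_b), gamma=0.1)
```

In this form a larger γ makes the kernel narrower, so similarity falls off faster as fingerprints diverge. The two example molecules differ only in how often one signature occurs, so the bit version treats them as identical while the count version does not.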


Year: 2014    PMID: 25318024    DOI: 10.1021/ci500344v

Source DB: PubMed    Journal: J Chem Inf Model    ISSN: 1549-9596    Impact factor: 4.956


  10 in total

1.  Advances in Predictions of Oral Bioavailability of Candidate Drugs in Man with New Machine Learning Methodology.

Authors:  Urban Fagerholm; Sven Hellberg; Ola Spjuth
Journal:  Molecules       Date:  2021-04-28       Impact factor: 4.411

2.  Large-scale ligand-based predictive modelling using support vector machines.

Authors:  Jonathan Alvarsson; Samuel Lampa; Wesley Schaal; Claes Andersson; Jarl E S Wikberg; Ola Spjuth
Journal:  J Cheminform       Date:  2016-08-10       Impact factor: 5.514

3.  The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching.

Authors:  Egon L Willighagen; John W Mayfield; Jonathan Alvarsson; Arvid Berg; Lars Carlsson; Nina Jeliazkova; Stefan Kuhn; Tomáš Pluskal; Miquel Rojas-Chertó; Ola Spjuth; Gilleain Torrance; Chris T Evelo; Rajarshi Guha; Christoph Steinbeck
Journal:  J Cheminform       Date:  2017-06-06       Impact factor: 5.514

4.  Deep-learning: investigating deep neural networks hyper-parameters and comparison of performance to shallow methods for modeling bioactivity data.

Authors:  Alexios Koutsoukas; Keith J Monaghan; Xiaoli Li; Jun Huan
Journal:  J Cheminform       Date:  2017-06-28       Impact factor: 5.514

5.  Efficient iterative virtual screening with Apache Spark and conformal prediction.

Authors:  Laeeq Ahmed; Valentin Georgiev; Marco Capuccini; Salman Toor; Wesley Schaal; Erwin Laure; Ola Spjuth
Journal:  J Cheminform       Date:  2018-03-01       Impact factor: 5.514

6.  Pharmaceutical Machine Learning: Virtual High-Throughput Screens Identifying Promising and Economical Small Molecule Inhibitors of Complement Factor C1s.

Authors:  Jonathan J Chen; Lyndsey N Schmucker; Donald P Visco
Journal:  Biomolecules       Date:  2018-05-07

7.  Assessing the calibration in toxicological in vitro models with conformal prediction.

Authors:  Ola Spjuth; Andrea Volkamer; Andrea Morger; Fredrik Svensson; Staffan Arvidsson McShane; Niharika Gauraha; Ulf Norinder
Journal:  J Cheminform       Date:  2021-04-29       Impact factor: 5.514

8.  Towards agile large-scale predictive modelling in drug discovery with flow-based programming design principles.

Authors:  Samuel Lampa; Jonathan Alvarsson; Ola Spjuth
Journal:  J Cheminform       Date:  2016-11-24       Impact factor: 5.514

9.  A confidence predictor for logD using conformal regression and a support-vector machine.

Authors:  Maris Lapins; Staffan Arvidsson; Samuel Lampa; Arvid Berg; Wesley Schaal; Jonathan Alvarsson; Ola Spjuth
Journal:  J Cheminform       Date:  2018-04-03       Impact factor: 5.514

10.  Machine Learning Strategies When Transitioning between Biological Assays.

Authors:  Staffan Arvidsson McShane; Ernst Ahlberg; Tobias Noeske; Ola Spjuth
Journal:  J Chem Inf Model       Date:  2021-06-21       Impact factor: 4.956
