Comparison of Deep Learning With Multiple Machine Learning Methods and Metrics Using Diverse Drug Discovery Data Sets.

Alexandru Korotcov, Valery Tkachenko, Daniel P Russo, Sean Ekins.

Abstract

Machine learning methods have been applied to many data sets in pharmaceutical research for several decades. The relative ease and availability of fingerprint-type molecular descriptors paired with Bayesian methods resulted in the widespread use of this approach for a diverse array of end points relevant to drug discovery. Deep learning is the latest machine learning algorithm attracting attention for many pharmaceutical applications, from docking to virtual screening. Deep learning is based on an artificial neural network with multiple hidden layers and has found considerable traction for many artificial intelligence applications. We have previously suggested the need for a comparison of different machine learning methods with deep learning across an array of varying data sets that is applicable to pharmaceutical research. End points relevant to pharmaceutical research include absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) properties, as well as activity against pathogens and drug discovery data sets. In this study, we have used data sets for solubility, probe-likeness, hERG, KCNQ1, bubonic plague, Chagas, tuberculosis, and malaria to compare different machine learning methods using FCFP6 fingerprints. These data sets represent whole cell screens, individual proteins, and physicochemical properties, as well as a data set with a complex end point. Our aim was to assess whether deep learning offered any improvement in testing when assessed using an array of metrics including AUC, F1 score, Cohen's kappa, Matthews correlation coefficient, and others. Based on ranked normalized scores for the metrics or data sets, deep neural networks (DNN) ranked higher than support vector machines (SVM), which in turn ranked higher than all the other machine learning methods. Visualizing these properties for training and test sets using radar-type plots indicates when models are inferior or perhaps overtrained. These results also suggest the need for assessing deep learning further using multiple metrics with much larger scale comparisons, prospective testing, as well as assessment of different fingerprints and DNN architectures beyond those used.
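Three of the metrics named in the abstract (F1 score, Cohen's kappa, and the Matthews correlation coefficient) can all be derived from a binary confusion matrix. The sketch below uses their standard textbook definitions in plain Python; it is an illustration only, not the authors' evaluation code, and the function names and the toy labels are assumptions made for this example.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = active)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def f1_score(tp, tn, fp, fn):
    """Harmonic mean of precision and recall; ignores true negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def cohen_kappa(tp, tn, fp, fn):
    """Observed agreement corrected for the agreement expected by chance."""
    n = tp + tn + fp + fn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0

def matthews_corrcoef(tp, tn, fp, fn):
    """Correlation between predicted and true labels, in the range [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical toy labels (not data from the study): 4 actives, 6 inactives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

counts = confusion_counts(y_true, y_pred)
scores = {
    "F1": f1_score(*counts),
    "Cohen's kappa": cohen_kappa(*counts),
    "MCC": matthews_corrcoef(*counts),
}
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Because F1 ignores true negatives while kappa and MCC use all four cells of the confusion matrix, the metrics can disagree on imbalanced data sets, which is one reason to report several of them side by side.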

Keywords:  deep learning; drug discovery; machine learning; pharmaceutics; support vector machine

Year:  2017        PMID: 29096442      PMCID: PMC5741413          DOI: 10.1021/acs.molpharmaceut.7b00578

Source DB:  PubMed          Journal:  Mol Pharm        ISSN: 1543-8384            Impact factor:   4.939


