Comparison of Deep Learning With Multiple Machine Learning Methods and Metrics Using Diverse Drug Discovery Data Sets.

Alexandru Korotcov¹, Valery Tkachenko¹, Daniel P Russo²,³, Sean Ekins².

Abstract

Machine learning methods have been applied to many data sets in pharmaceutical research for several decades. The relative ease and availability of fingerprint-type molecular descriptors paired with Bayesian methods resulted in the widespread use of this approach for a diverse array of end points relevant to drug discovery. Deep learning is the latest machine learning algorithm attracting attention for many pharmaceutical applications, from docking to virtual screening. Deep learning is based on an artificial neural network with multiple hidden layers and has found considerable traction for many artificial intelligence applications. We have previously suggested the need for a comparison of different machine learning methods with deep learning across an array of varying data sets that is applicable to pharmaceutical research. End points relevant to pharmaceutical research include absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) properties, as well as activity against pathogens and drug discovery data sets. In this study, we have used data sets for solubility, probe-likeness, hERG, KCNQ1, bubonic plague, Chagas, tuberculosis, and malaria to compare different machine learning methods using FCFP6 fingerprints. These data sets represent whole cell screens, individual proteins, and physicochemical properties, as well as a data set with a complex end point. Our aim was to assess whether deep learning offered any improvement in testing when assessed using an array of metrics including AUC, F1 score, Cohen's kappa, Matthews correlation coefficient, and others. Based on ranked normalized scores for the metrics or data sets, Deep Neural Networks (DNN) ranked higher than support vector machines (SVM), which in turn ranked higher than all the other machine learning methods. Visualizing these properties for training and test sets using radar-type plots indicates when models are inferior or perhaps overtrained. These results also suggest the need for assessing deep learning further using multiple metrics with much larger scale comparisons, prospective testing, and assessment of different fingerprints and DNN architectures beyond those used.
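The four metrics named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions for a binary classifier and is not taken from the authors' code; the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def f1_score(tp, fp, tn, fn):
    """Harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def cohen_kappa(tp, fp, tn, fn):
    """Agreement between prediction and truth, corrected for chance."""
    n = tp + fp + tn + fn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0


def matthews_corrcoef(tp, fp, tn, fn):
    """Correlation between predicted and true binary labels."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(y_true, scores):
    """Probability that a random active outscores a random inactive.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): four actives followed by four inactives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

counts = confusion_counts(y_true, y_pred)
results = {
    "AUC": roc_auc(y_true, scores),
    "F1": f1_score(*counts),
    "kappa": cohen_kappa(*counts),
    "MCC": matthews_corrcoef(*counts),
}
for name, value in results.items():
    print(f"{name}: {value:.4f}")
```

AUC is computed from the continuous scores, while F1, Cohen's kappa, and the Matthews correlation coefficient depend on the chosen threshold, which is one reason a comparison across several metrics can rank models differently than any single metric would.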

Keywords: deep learning; drug discovery; machine learning; pharmaceutics; support vector machine

Year: 2017 PMID: 29096442 PMCID: PMC5741413 DOI: 10.1021/acs.molpharmaceut.7b00578

Source DB: PubMed Journal: Mol Pharm ISSN: 1543-8384 Impact factor: 4.939

107 in total

Review 1. Towards a gold standard: regarding quality in public domain chemistry databases and approaches to improving the situation.

Authors: Antony J Williams; Sean Ekins; Valery Tkachenko
Journal: Drug Discov Today Date: 2012-03-08 Impact factor: 7.851

2. Using open source computational tools for predicting human metabolic stability and additional absorption, distribution, metabolism, excretion, and toxicity properties.

Authors: Rishi R Gupta; Eric M Gifford; Ted Liston; Chris L Waller; Moses Hohman; Barry A Bunin; Sean Ekins
Journal: Drug Metab Dispos Date: 2010-08-06 Impact factor: 3.922

3. Chembench: a cheminformatics workbench.

Authors: Theo Walker; Christopher M Grulke; Diane Pozefsky; Alexander Tropsha
Journal: Bioinformatics Date: 2010-09-30 Impact factor: 6.937

4. A support vector machine approach to classify human cytochrome P450 3A4 inhibitors.

Authors: Jan M Kriegl; Thomas Arnhold; Bernd Beck; Thomas Fox
Journal: J Comput Aided Mol Des Date: 2005-03 Impact factor: 3.686

5. Algorithms for network analysis in systems-ADME/Tox using the MetaCore and MetaDrug platforms.

Authors: S Ekins; A Bugrim; L Brovold; E Kirillov; Y Nikolsky; E Rakhmatulin; S Sorokina; A Ryabov; T Serebryiskaya; A Melnikov; J Metz; T Nikolskaya
Journal: Xenobiotica Date: 2006 Oct-Nov Impact factor: 1.908

6. ADMET evaluation in drug discovery. 13. Development of in silico prediction models for P-glycoprotein substrates.

Authors: Dan Li; Lei Chen; Youyong Li; Sheng Tian; Huiyong Sun; Tingjun Hou
Journal: Mol Pharm Date: 2014-02-18 Impact factor: 4.939

7. Online chemical modeling environment (OCHEM): web platform for data storage, model development and publishing of chemical information.

Authors: Iurii Sushko; Sergii Novotarskyi; Robert Körner; Anil Kumar Pandey; Matthias Rupp; Wolfram Teetz; Stefan Brandmaier; Ahmed Abdelaziz; Volodymyr V Prokopenko; Vsevolod Y Tanchuk; Roberto Todeschini; Alexandre Varnek; Gilles Marcou; Peter Ertl; Vladimir Potemkin; Maria Grishina; Johann Gasteiger; Christof Schwab; Igor I Baskin; Vladimir A Palyulin; Eugene V Radchenko; William J Welsh; Vladyslav Kholodovych; Dmitriy Chekmarev; Artem Cherkasov; Joao Aires-de-Sousa; Qing-You Zhang; Andreas Bender; Florian Nigsch; Luc Patiny; Antony Williams; Valery Tkachenko; Igor V Tetko
Journal: J Comput Aided Mol Des Date: 2011-06-10 Impact factor: 3.686

8. Machine learning methods in chemoinformatics.

Authors: John B O Mitchell
Journal: Wiley Interdiscip Rev Comput Mol Sci Date: 2014-09-01

9. Machine learning models identify molecules active against the Ebola virus in vitro.

Authors: Sean Ekins; Joel S Freundlich; Alex M Clark; Manu Anantpadma; Robert A Davey; Peter Madrid
Journal: F1000Res Date: 2015-10-20

10. PubChem Substance and Compound databases.

Authors: Sunghwan Kim; Paul A Thiessen; Evan E Bolton; Jie Chen; Gang Fu; Asta Gindulyte; Lianyi Han; Jane He; Siqian He; Benjamin A Shoemaker; Jiyao Wang; Bo Yu; Jian Zhang; Stephen H Bryant
Journal: Nucleic Acids Res Date: 2015-09-22 Impact factor: 16.971

57 in total

1. Machine Learning Platform to Discover Novel Growth Inhibitors of Neisseria gonorrhoeae.

Authors: Janaina Cruz Pereira; Samer S Daher; Kimberley M Zorn; Matthew Sherwood; Riccardo Russo; Alexander L Perryman; Xin Wang; Madeleine J Freundlich; Sean Ekins; Joel S Freundlich
Journal: Pharm Res Date: 2020-07-13 Impact factor: 4.200

Review 2. Advancing computer-aided drug discovery (CADD) by big data and data-driven machine learning modeling.

Authors: Linlin Zhao; Heather L Ciallella; Lauren M Aleksunes; Hao Zhu
Journal: Drug Discov Today Date: 2020-07-11 Impact factor: 7.851

Review 3. Generative chemistry: drug discovery with deep learning generative models.

Authors: Yuemin Bian; Xiang-Qun Xie
Journal: J Mol Model Date: 2021-02-04 Impact factor: 1.810

4. Comparing Machine Learning Algorithms for Predicting Drug-Induced Liver Injury (DILI).

Authors: Eni Minerali; Daniel H Foil; Kimberley M Zorn; Thomas R Lane; Sean Ekins
Journal: Mol Pharm Date: 2020-06-08 Impact factor: 4.939

5. A Machine Learning Strategy for Drug Discovery Identifies Anti-Schistosomal Small Molecules.

Authors: Kimberley M Zorn; Shengxi Sun; Cecelia L McConnon; Kelley Ma; Eric K Chen; Daniel H Foil; Thomas R Lane; Lawrence J Liu; Nelly El-Sakkary; Danielle E Skinner; Sean Ekins; Conor R Caffrey
Journal: ACS Infect Dis Date: 2021-01-12 Impact factor: 5.084