
In silico target predictions: defining a benchmarking data set and comparison of performance of the multiclass Naïve Bayes and Parzen-Rosenblatt window.

Alexios Koutsoukas, Robert Lowe, Yasaman Kalantarmotamedi, Hamse Y Mussa, Werner Klaffke, John B O Mitchell, Robert C Glen, Andreas Bender.

Abstract

In this study, two probabilistic machine-learning algorithms were compared for in silico target prediction of bioactive molecules, namely the well-established Laplacian-modified Naïve Bayes classifier (NB) and the Parzen-Rosenblatt Window, which was introduced to cheminformatics more recently. Both classifiers were trained in conjunction with circular fingerprints on a large data set of bioactive compounds extracted from ChEMBL, covering 894 human protein targets with more than 155,000 ligand-protein pairs. This data set is also provided as a benchmark data set for future target prediction methods due to its size as well as the number of bioactivity classes it contains. In addition to evaluating the methods, different performance measures were explored. This is not as straightforward as in binary classification settings, due to the number of classes, the possibility of multiple class memberships, and the need to translate model scores into "yes/no" predictions for assessing model performance. Both algorithms achieved a recall of correct targets that exceeds 80% in the top 1% of predictions. Performance depends significantly on the underlying diversity and size of a given class of bioactive compounds, with small classes and low structural similarity affecting both algorithms to different degrees. On an external test set extracted from WOMBAT, covering more than 500 targets and constructed by excluding all compounds with a Tanimoto similarity above 0.8 to compounds in the ChEMBL data set, the methods achieved recalls of 63.3% and 66.6% among the top 1% of predictions for Naïve Bayes and the Parzen-Rosenblatt Window, respectively. While those numbers seem to indicate lower performance, they are also more realistic for settings where protein targets need to be established for novel chemical substances.
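The two evaluation ideas mentioned in the abstract, the Tanimoto-based exclusion of near-duplicate test compounds and recall among the top 1% of predictions, can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes that fingerprints are represented as sets of on-bit indices, that "top 1%" means the top 1% of the ranked target list for each compound (the abstract does not spell out the exact definition), and all function names are hypothetical.

```python
from math import ceil


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def filter_external(test_fps, train_fps, threshold=0.8):
    """Keep test fingerprints whose similarity to every training
    fingerprint is at most `threshold` (0.8 in the abstract)."""
    return [
        fp for fp in test_fps
        if all(tanimoto(fp, t) <= threshold for t in train_fps)
    ]


def recall_in_top_fraction(scores, true_targets, fraction=0.01):
    """Fraction of a compound's true targets found among the top-ranked targets.

    scores: dict mapping target name -> model score for one compound.
    true_targets: non-empty set of targets the compound is known to hit.
    Assumption: the cut-off is the top `fraction` of the ranked target
    list, with at least one target always retained.
    """
    k = max(1, ceil(fraction * len(scores)))
    top = set(sorted(scores, key=scores.get, reverse=True)[:k])
    return len(true_targets & top) / len(true_targets)
```

With 894 targets, as in the ChEMBL data set described above, a 1% cut-off under this assumption corresponds to the nine highest-scoring targets per compound. Averaging `recall_in_top_fraction` over all test compounds would give one possible data-set-level recall figure.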

Year:  2013        PMID: 23829430     DOI: 10.1021/ci300435j

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


