All-Assay-Max2 pQSAR: Activity Predictions as Accurate as Four-Concentration IC50s for 8558 Novartis Assays.

Eric J Martin, Valery R Polyakov, Xiang-Wei Zhu, Li Tian, Prasenjit Mukherjee, Xin Liu.

Abstract

Profile-quantitative structure-activity relationship (pQSAR) is a massively multitask, two-step machine learning method with unprecedented scope, accuracy, and applicability domain. In step one, a "profile" of conventional single-assay random forest regression models is trained on a very large number of biochemical and cellular pIC50 assays using Morgan 2 substructural fingerprints as compound descriptors. In step two, a panel of partial least squares (PLS) models is built using the profile of pIC50 predictions from those random forest regression models as compound descriptors (hence the name). The method was previously described for a panel of 728 biochemical and cellular kinase assays; we have now built an enormous pQSAR from 11 805 diverse Novartis (NVS) IC50 and EC50 assays. This large number of assays, and hence of compound descriptors for PLS, dictated reducing the profile by including only random forest regression models whose predictions correlate with the assay being modeled. The random forest regression and pQSAR models were evaluated with our "realistically novel" held-out test set, whose median average similarity to the nearest training set member across the 11 805 assays was only 0.34, comparable to the novelty of compounds actually selected from virtual screens. For the 11 805 single-assay random forest regression models, the median correlation of prediction with experiment was only r_ext² = 0.05, virtually random, and only 8% of the models achieved our standard success threshold of r_ext² = 0.30. For pQSAR, the median correlation was r_ext² = 0.53, comparable to four-concentration experimental IC50s, and 72% of the models met our r_ext² > 0.30 standard, totaling 8558 successful models. The successful models included assays from all of the 51 annotated target subclasses, as well as 4196 phenotypic assays, indicating that pQSAR can be applied to virtually any disease area.
Every month, all models are updated to include new measurements, and predictions are made for 5.5 million NVS compounds, totaling 50 billion predictions. Common uses have included virtual screening, selectivity design, toxicity and promiscuity prediction, mechanism-of-action prediction, and others. Several such actual applications are described.
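
The profile-reduction step and the success criterion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy pIC50 values, and the use of squared Pearson correlation with a fixed cutoff as the selection rule are all assumptions made for illustration. The abstract states only that random forest models whose predictions correlate with the modeled assay are retained, and that a model counts as successful when the squared correlation between predicted and experimental values on the held-out set exceeds 0.30. Training the random forest and PLS models themselves is outside the scope of this sketch.

```python
def pearson_r2(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant vector: correlation undefined, treated as 0 here
    return sxy * sxy / (sxx * syy)


def reduce_profile(profile, target_pic50, min_r2):
    """Keep only profile columns that correlate with the target assay.

    profile      -- dict: assay name -> predicted pIC50s (hypothetical RF outputs)
    target_pic50 -- measured pIC50s for the same compounds in the modeled assay
    min_r2       -- hypothetical cutoff; the abstract does not state a value
    """
    return {
        assay: preds
        for assay, preds in profile.items()
        if pearson_r2(preds, target_pic50) >= min_r2
    }


def fraction_successful(r2_by_assay, threshold=0.30):
    """Fraction of assay models whose held-out r^2 exceeds the threshold."""
    hits = sum(1 for r2 in r2_by_assay.values() if r2 > threshold)
    return hits / len(r2_by_assay)


# Toy data (invented): predictions from three single-assay models for four
# compounds, plus measured values for the assay being modeled.
profile = {
    "assay_A": [5.0, 6.0, 7.0, 8.0],  # tracks the target closely
    "assay_B": [6.0, 6.0, 6.0, 6.0],  # constant, carries no information
    "assay_C": [8.0, 5.0, 7.0, 6.0],  # weakly related to the target
}
target = [5.1, 6.2, 6.9, 8.1]

reduced = reduce_profile(profile, target, min_r2=0.3)
print(sorted(reduced))  # only assay_A passes the cutoff on this toy data
```

In a full pipeline the retained columns would serve as the descriptor matrix for the step-two PLS model of the target assay, so the cutoff trades descriptor count against the risk of discarding weakly informative assays.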

Year:  2019        PMID: 31518124     DOI: 10.1021/acs.jcim.9b00375

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


  8 in total

1.  Analysis of the benefits of imputation models over traditional QSAR models for toxicity prediction.

Authors:  Moritz Walter; Luke N Allen; Antonio de la Vega de León; Samuel J Webb; Valerie J Gillet
Journal:  J Cheminform       Date:  2022-06-07       Impact factor: 8.489

2.  QSAR-derived affinity fingerprints (part 2): modeling performance for potency prediction.

Authors:  Isidro Cortés-Ciriano; Ctibor Škuta; Andreas Bender; Daniel Svozil
Journal:  J Cheminform       Date:  2020-06-05       Impact factor: 5.514

3.  Recommender Systems in Antiviral Drug Discovery.

Authors:  Ekaterina A Sosnina; Sergey Sosnin; Anastasia A Nikitina; Ivan Nazarov; Dmitry I Osolodkin; Maxim V Fedorov
Journal:  ACS Omega       Date:  2020-06-21

4.  Applications of Deep-Learning in Exploiting Large-Scale and Heterogeneous Compound Data in Industrial Pharmaceutical Research. (Review)

Authors:  Laurianne David; Josep Arús-Pous; Johan Karlsson; Ola Engkvist; Esben Jannik Bjerrum; Thierry Kogej; Jan M Kriegl; Bernd Beck; Hongming Chen
Journal:  Front Pharmacol       Date:  2019-11-05       Impact factor: 5.810

5.  SMMPPI: a machine learning-based approach for prediction of modulators of protein-protein interactions and its application for identification of novel inhibitors for RBD:hACE2 interactions in SARS-CoV-2.

Authors:  Priya Gupta; Debasisa Mohanty
Journal:  Brief Bioinform       Date:  2021-04-12       Impact factor: 11.622

6.  Towards the sustainable discovery and development of new antibiotics. (Review)

Authors:  Marcus Miethke; Marco Pieroni; Tilmann Weber; Mark Brönstrup; Peter Hammann; Ludovic Halby; Paola B Arimondo; Philippe Glaser; Bertrand Aigle; Helge B Bode; Rui Moreira; Yanyan Li; Andriy Luzhetskyy; Marnix H Medema; Jean-Luc Pernodet; Marc Stadler; José Rubén Tormo; Olga Genilloud; Andrew W Truman; Kira J Weissman; Eriko Takano; Stefano Sabatini; Evi Stegmann; Heike Brötz-Oesterhelt; Wolfgang Wohlleben; Myriam Seemann; Martin Empting; Anna K H Hirsch; Brigitta Loretz; Claus-Michael Lehr; Alexander Titz; Jennifer Herrmann; Timo Jaeger; Silke Alt; Thomas Hesterkamp; Mathias Winterhalter; Andrea Schiefer; Kenneth Pfarr; Achim Hoerauf; Heather Graz; Michael Graz; Mika Lindvall; Savithri Ramurthy; Anders Karlén; Maarten van Dongen; Hrvoje Petkovic; Andreas Keller; Frédéric Peyrane; Stefano Donadio; Laurent Fraisse; Laura J V Piddock; Ian H Gilbert; Heinz E Moser; Rolf Müller
Journal:  Nat Rev Chem       Date:  2021-08-19       Impact factor: 34.571

7.  Simplified, interpretable graph convolutional neural networks for small molecule activity prediction.

Authors:  Jeffrey K Weber; Joseph A Morrone; Sugato Bagchi; Jan D Estrada Pabon; Seung-Gu Kang; Leili Zhang; Wendy D Cornell
Journal:  J Comput Aided Mol Des       Date:  2021-11-24       Impact factor: 4.179

8.  Predicting Total Drug Clearance and Volumes of Distribution Using the Machine Learning-Mediated Multimodal Method through the Imputation of Various Nonclinical Data.

Authors:  Hiroaki Iwata; Tatsuru Matsuo; Hideaki Mamada; Takahisa Motomura; Mayumi Matsushita; Takeshi Fujiwara; Kazuya Maeda; Koichi Handa
Journal:  J Chem Inf Model       Date:  2022-08-22       Impact factor: 6.162
