Experimental Error, Kurtosis, Activity Cliffs, and Methodology: What Limits the Predictivity of Quantitative Structure-Activity Relationship Models?

Robert P Sheridan, Prabha Karnachi, Matthew Tudor, Yuting Xu, Andy Liaw, Falgun Shah, Alan C Cheng, Elizabeth Joshi, Meir Glick, Juan Alvarez.

Abstract

Given a particular descriptor/method combination, some quantitative structure-activity relationship (QSAR) datasets are very predictive by random-split cross-validation while others are not. Recent literature in modelability suggests that the limiting issue for predictivity is in the data, not the QSAR methodology, and the limits are due to activity cliffs. Here, we investigate, on in-house data, the relative usefulness of experimental error, distribution of the activities, and activity cliff metrics in determining how predictive a dataset is likely to be. We include unmodified in-house datasets, datasets that should be perfectly predictive based only on the chemical structure, datasets where the distribution of activities is manipulated, and datasets that include a known amount of added noise. We find that activity cliff metrics determine predictivity better than the other metrics we investigated, whatever the type of dataset, consistent with the modelability literature. However, because of large uncertainties in the activities, such metrics cannot distinguish real activity cliffs from apparent ones. We also show that a number of modern QSAR methods, and some alternative descriptors, are equally bad at predicting the activities of compounds on activity cliffs, consistent with the assumptions behind "modelability." Finally, we relate time-split predictivity with random-split predictivity and show that different coverages of chemical space are at least as important as uncertainty in activity and/or activity cliffs in limiting predictivity.
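
The role of experimental error and added noise can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes independent Gaussian noise added to normally distributed "true" activities, and the names `r_squared`, `noise_ceiling`, and `simulate` are hypothetical. Under those assumptions, the expected R^2 between noise-free and measured activities is var_true / (var_true + var_noise), which is roughly the best a model that reproduces the true activities exactly could score against the measured values.

```python
import random
import statistics


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def noise_ceiling(true_sd, noise_sd):
    """Expected R^2 between true and noisy activities.

    Assumes the noise is independent of the true activity (illustrative
    assumption, not a claim about any specific assay).
    """
    return true_sd ** 2 / (true_sd ** 2 + noise_sd ** 2)


def simulate(n=20000, true_sd=1.0, noise_sd=0.5, seed=0):
    """Draw synthetic activities, add known noise, and return the observed R^2."""
    rng = random.Random(seed)
    # Synthetic "true" activities centred on an arbitrary value of 6.0.
    true_act = [rng.gauss(6.0, true_sd) for _ in range(n)]
    # Measured activities = true activities plus a known amount of noise.
    measured = [a + rng.gauss(0.0, noise_sd) for a in true_act]
    return r_squared(true_act, measured)


if __name__ == "__main__":
    print("analytic ceiling:", noise_ceiling(1.0, 0.5))
    print("simulated R^2:   ", simulate())
```

With a true spread of 1.0 and a noise standard deviation of 0.5, the analytic ceiling is 0.8, and the simulated value should fall close to it for a large sample.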

Year:  2020        PMID: 32207612     DOI: 10.1021/acs.jcim.9b01067

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


  5 in total

1.  Predicting target-ligand interactions with graph convolutional networks for interpretable pharmaceutical discovery.

Authors:  Paola Ruiz Puentes; Laura Rueda-Gensini; Natalia Valderrama; Isabela Hernández; Cristina González; Laura Daza; Carolina Muñoz-Camargo; Juan C Cruz; Pablo Arbeláez
Journal:  Sci Rep       Date:  2022-05-19       Impact factor: 4.996

2.  Advances in exploring activity cliffs.

Authors:  Dagmar Stumpfe; Huabin Hu; Jürgen Bajorath
Journal:  J Comput Aided Mol Des       Date:  2020-05-05       Impact factor: 3.686

3.  Uncertainty quantification: Can we trust artificial intelligence in drug discovery? (Review)

Authors:  Jie Yu; Dingyan Wang; Mingyue Zheng
Journal:  iScience       Date:  2022-07-21

4.  Implications of Additivity and Nonadditivity for Machine Learning and Deep Learning Models in Drug Design.

Authors:  Karolina Kwapien; Eva Nittinger; Jiazhen He; Christian Margreitter; Alexey Voronov; Christian Tyrchan
Journal:  ACS Omega       Date:  2022-07-19

5.  Nonadditivity in public and inhouse data: implications for drug design.

Authors:  D Gogishvili; E Nittinger; C Margreitter; C Tyrchan
Journal:  J Cheminform       Date:  2021-07-02       Impact factor: 5.514
