Calculation of exact Shapley values for support vector machines with Tanimoto kernel enables model interpretation.

Christian Feldmann, Jürgen Bajorath.

Abstract

The support vector machine (SVM) algorithm is popular in chemistry and drug discovery. SVM models have black box character. Their predictions can be interpreted through feature weighting or the model-agnostic Shapley additive explanations (SHAP) formalism that locally approximates Shapley values (SVs) originating from game theory. We introduce an algorithm termed SV-expressed Tanimoto similarity (SVETA) for the exact calculation of SVs to explain SVM models employing the Tanimoto kernel, the gold standard for the assessment of molecular similarity. For a model system, the exact calculation of SVs is demonstrated. In an SVM-based compound classification task from drug discovery, only a limited correlation between exact SV and SHAP values is observed, prohibiting the use of approximate values for rationalizing predictions. For exemplary test compounds, atom-based mapping of prioritized features delineates coherent substructures that closely resemble those obtained by analyzing independently derived random forest models, thus providing consistent explanations.
© 2022 The Author(s).
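
To illustrate the two quantities the abstract refers to, the following is a minimal sketch of Tanimoto similarity on binary fingerprints and of exact Shapley values obtained by brute-force enumeration of all feature coalitions. It is not the SVETA algorithm, which is not reproduced here; the function names, the toy fingerprints, and the choice of value function (features outside a coalition are treated as absent from both fingerprints) are assumptions made purely for illustration.

```python
from itertools import combinations
from math import factorial


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    # Assumed convention for this sketch: two empty fingerprints give 0.
    return len(a & b) / union if union else 0.0


def exact_shapley(value, n_features):
    """Exact Shapley values by enumerating every coalition (cost grows as 2^n)."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(n_features):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy fingerprints over four features (bit indices 0..3).
x = frozenset({0, 1, 2})
y = frozenset({1, 2, 3})


def masked_similarity(coalition):
    """Assumed value function: similarity using only the bits in the coalition."""
    return tanimoto(x & coalition, y & coalition)


phi = exact_shapley(masked_similarity, 4)
for idx, contribution in enumerate(phi):
    print(f"feature {idx}: {contribution:+.4f}")
print("sum of contributions:", sum(phi), "| full similarity:", tanimoto(x, y))
```

In this toy setting the contributions sum to the full Tanimoto similarity (the efficiency property of Shapley values), features shared by both fingerprints receive positive values, and features present in only one fingerprint receive negative values. Enumeration is feasible only for a handful of features, which is why exact methods tailored to a specific kernel or local approximations such as SHAP are used in practice.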

Keywords:  Computational chemistry; cheminformatics; computer modeling

Year:  2022        PMID: 36105596      PMCID: PMC9464958          DOI: 10.1016/j.isci.2022.105023

Source DB:  PubMed          Journal:  iScience        ISSN: 2589-0042


