Explaining multivariate molecular diagnostic tests via Shapley values.

Joanna Roder, Laura Maguire, Robert Georgantas, Heinrich Roder.

Abstract

BACKGROUND: Machine learning (ML) can be an effective tool to extract information from attribute-rich molecular datasets for the generation of molecular diagnostic tests. However, the way in which the resulting scores or classifications are produced from the input data may not be transparent. Algorithmic explainability or interpretability has become a focus of ML research. Shapley values, first introduced in game theory, can provide explanations of the result generated from a specific set of input data by a complex ML algorithm.
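For reference, the standard game-theoretic definition of the Shapley value can be written as follows. The notation is an assumption introduced here for illustration and does not come from the abstract: $N$ is the set of $n$ input attributes, $v(S)$ is the value assigned to a subset (coalition) $S$ of attributes, and $\phi_i$ is the contribution attributed to attribute $i$.

```latex
\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,\bigl(n - |S| - 1\bigr)!}{n!}\,
  \bigl( v(S \cup \{i\}) - v(S) \bigr),
\qquad
\sum_{i \in N} \phi_i(v) \;=\; v(N) - v(\emptyset).
```

The second identity (often called efficiency) says that the attributions sum to the difference between the result with all attributes present and the result with none. How $v(S)$ is defined for a given test, that is, how "absent" attributes are handled, is a modelling choice that the definition itself leaves open.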
METHODS: For a multivariate molecular diagnostic test in clinical use (the VeriStrat® test), we calculate and discuss the interpretation of exact Shapley values. We also employ some standard approximation techniques for Shapley value computation (local interpretable model-agnostic explanation (LIME) and Shapley Additive Explanations (SHAP) based methods) and compare the results with exact Shapley values.
RESULTS: Exact Shapley values calculated for data collected from a cohort of 256 patients showed that the relative importance of attributes for test classification varied by sample. While all eight features used in the VeriStrat® test contributed equally to classification for some samples, other samples showed more complex patterns of attribute importance. Exact Shapley values and Shapley-based interaction metrics provided interpretable classification explanations at the sample or patient level, and patient subgroups could be defined by comparing Shapley value profiles between patients. LIME and SHAP approximation approaches, even those seeking to include correlations between attributes, produced results that differed quantitatively and, in some cases, qualitatively from the exact Shapley values.
CONCLUSIONS: Shapley values can be used to determine the relative importance of input attributes to the result generated by a multivariate molecular diagnostic test for an individual sample or patient. Patient subgroups defined by Shapley value profiles may motivate translational research. However, correlations inherent in molecular data and the typically small ML training sets available for molecular diagnostic test development may cause some approximation methods to produce approximate Shapley values that differ both qualitatively and quantitatively from exact Shapley values. Hence, caution is advised when using approximate methods to evaluate Shapley explanations of the results of molecular diagnostic tests.
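As an illustration of what "exact" Shapley values involve for a test with a small number of features, the following is a minimal sketch that enumerates every coalition of attributes. Everything in it is an assumption made for illustration: `toy_model` is a hypothetical scoring function (not the VeriStrat® algorithm), and the value of a coalition is defined by replacing absent features with a fixed `baseline`, which is only one of several possible conventions and not necessarily the one used by the authors.

```python
from itertools import combinations
from math import factorial


def exact_shapley(value, n_features):
    """Exact Shapley values by enumerating all coalitions (feasible for small n)."""
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(x):
    """Hypothetical score on 8 features, with an interaction between x[0] and x[1]."""
    return x[0] * x[1] + 0.5 * x[2] + 0.1 * sum(x[3:])


def make_value(model, sample, baseline):
    """Coalition value: model output with absent features set to the baseline (assumed)."""
    def value(coalition):
        mixed = [sample[j] if j in coalition else baseline[j]
                 for j in range(len(sample))]
        return model(mixed)
    return value


sample = [1.0, 2.0, 3.0, 1.0, 1.0, 1.0, 1.0, 1.0]
baseline = [0.0] * 8
value = make_value(toy_model, sample, baseline)
phi = exact_shapley(value, 8)
print([round(p, 6) for p in phi])
```

With eight features, each attribution requires 2^7 = 128 coalitions, so exact enumeration is cheap; the cost doubles with every added feature, which is why approximation methods are commonly used for larger models. In this toy example the two interacting features share their joint contribution equally, and the attributions sum to the difference between the model output for the sample and for the baseline.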

Keywords:  Artificial intelligence; Explainability; Interpretability; Machine learning; Molecular diagnostic test; Shapley values

Year:  2021        PMID: 34238309      PMCID: PMC8265031          DOI: 10.1186/s12911-021-01569-9

Source DB:  PubMed          Journal:  BMC Med Inform Decis Mak        ISSN: 1472-6947            Impact factor:   2.796

