Current cancer driver variant predictors learn to recognize driver genes instead of functional variants.

Daniele Raimondi, Antoine Passemiers, Piero Fariselli, Yves Moreau.

Abstract

BACKGROUND: Identifying variants that drive tumor progression (driver variants) and distinguishing these from variants that are a byproduct of the uncontrolled cell growth in cancer (passenger variants) is a crucial step for understanding tumorigenesis and precision oncology. Various bioinformatics methods have attempted to solve this complex task.
RESULTS: In this study, we investigate the assumptions on which these methods are based, showing that the different definitions of driver and passenger variants influence the difficulty of the prediction task. More importantly, we prove that the data sets have a construction bias that prevents machine learning (ML) methods from actually learning variant-level functional effects, despite their excellent performance. This effect arises because, in these data sets, the driver variants map to a few driver genes while the passenger variants spread across thousands of genes, so simply learning to recognize driver genes provides almost perfect predictions.
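The construction bias described above can be illustrated with a minimal sketch. It is not the authors' experiment: the gene names, the toy variants, and the helpers `fit_gene_baseline` and `predict` are all hypothetical, and the "predictor" looks only at gene identity and never at the variant itself.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (gene, label) pairs, 1 = driver, 0 = passenger.
# Drivers cluster in two genes; passengers are spread over many genes.
variants = (
    [("GENE_A", 1)] * 5
    + [("GENE_B", 1)] * 5
    + [(f"PASSENGER_GENE_{i}", 0) for i in range(20)]
)

def fit_gene_baseline(train):
    """Map each gene to the majority label of its training variants."""
    counts = defaultdict(Counter)
    for gene, label in train:
        counts[gene][label] += 1
    return {gene: c.most_common(1)[0][0] for gene, c in counts.items()}

def predict(model, gene, default=0):
    """Predict from gene identity alone; unseen genes default to passenger."""
    return model.get(gene, default)

# Simple alternating split into training and held-out variants.
train, test = variants[::2], variants[1::2]
model = fit_gene_baseline(train)

correct = sum(predict(model, gene) == label for gene, label in test)
accuracy = correct / len(test)
print(f"gene-only accuracy: {accuracy:.2f}")
```

On data built this way, the gene-only baseline classifies every held-out variant correctly without modeling any variant-level effect, which is the kind of shortcut the abstract warns about.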
CONCLUSIONS: To mitigate this issue, we propose a novel data set that minimizes this bias by ensuring that all genes covered by the data contain both driver and passenger variants. As a result, we show that the tested predictors experience a significant drop in performance, which should not be considered as poorer modeling, but rather as correcting unwarranted optimism. Finally, we propose a weighting procedure to completely eliminate the gene effects on such predictions, thus precisely evaluating the ability of predictors to model the functional effects of single variants, and we show that indeed this task is still open.
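As a rough illustration of the two ideas in the conclusions, the sketch below first keeps only genes that contain both driver and passenger variants, then applies one simple per-gene weighting under which a gene-only predictor scores no better than chance. The abstract does not specify the authors' weighting procedure, so the scheme here (equal total weight for each class within each gene) is an assumption, as are the toy data and all function names.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (gene, label) pairs, 1 = driver, 0 = passenger.
variants = [
    ("GENE_A", 1), ("GENE_A", 1), ("GENE_A", 1), ("GENE_A", 0),
    ("GENE_B", 1), ("GENE_B", 0), ("GENE_B", 0),
    ("GENE_C", 0), ("GENE_C", 0),  # passenger-only gene
    ("GENE_D", 1),                 # driver-only gene
]

def mixed_genes_only(data):
    """Keep variants from genes that contain both drivers and passengers."""
    labels = defaultdict(set)
    for gene, label in data:
        labels[gene].add(label)
    return [(g, y) for g, y in data if labels[g] == {0, 1}]

def per_gene_balanced_weights(data):
    """Assumed scheme: each class gets total weight 0.5 within each gene."""
    counts = Counter(data)  # number of variants per (gene, label) pair
    return [0.5 / counts[(g, y)] for g, y in data]

def weighted_accuracy(y_true, y_pred, weights):
    """Fraction of the total weight carried by correct predictions."""
    correct = sum(w for t, p, w in zip(y_true, y_pred, weights) if t == p)
    return correct / sum(weights)

balanced = mixed_genes_only(variants)
weights = per_gene_balanced_weights(balanced)
y_true = [y for _, y in balanced]

# Gene-only predictor: majority label of each gene.
by_gene = defaultdict(Counter)
for g, y in balanced:
    by_gene[g][y] += 1
gene_pred = [by_gene[g].most_common(1)[0][0] for g, _ in balanced]

plain = weighted_accuracy(y_true, gene_pred, [1.0] * len(balanced))
reweighted = weighted_accuracy(y_true, gene_pred, weights)
print(f"unweighted: {plain:.3f}, per-gene weighted: {reweighted:.3f}")
```

With these weights, any predictor that outputs the same label for every variant in a gene is pinned at a weighted accuracy of 0.5, so a score above that has to come from variant-level information.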


Keywords: Bias in machine learning; Cancer driver variant prediction; Clever Hans effect

Year: 2021 PMID: 33441128 PMCID: PMC7807764 DOI: 10.1186/s12915-020-00930-0

Source DB: PubMed Journal: BMC Biol ISSN: 1741-7007 Impact factor: 7.431
