Runjun D Kumar (1), Adam C Searleman (2), S Joshua Swamidass (3), Obi L Griffith (4), Ron Bose (2)
1. Division of Oncology, Department of Medicine, Washington University School of Medicine; Computational and Systems Biology Program, Washington University in St Louis.
2. Division of Oncology, Department of Medicine, Washington University School of Medicine.
3. Computational and Systems Biology Program, Washington University in St Louis; Department of Pathology and Immunology, Washington University School of Medicine.
4. Division of Oncology, Department of Medicine, Washington University School of Medicine.
Abstract
MOTIVATION: Several tools exist to identify cancer driver genes based on somatic mutation data. However, these tools do not account for the two subclasses of cancer genes: oncogenes, which undergo gain-of-function events, and tumor suppressor genes (TSGs), which undergo loss-of-function events. A method that accounts for these subclasses could improve performance while also suggesting a mechanism of action for new putative cancer genes.

RESULTS: We develop a panel of five complementary statistical tests and assess their performance against a curated set of 99 HiConf cancer genes using a pan-cancer dataset of 1.7 million mutations. We identify patient bias as a novel signal for cancer gene discovery and use it to significantly improve detection of oncogenes over existing methods (AUROC = 0.894). Additionally, our test of truncation event rate separates oncogenes and TSGs from one another (AUROC = 0.922). Finally, a random forest integrating the five tests further improves performance and identifies new cancer genes, including CACNG3, HDAC2, HIST1H1E, NXF1, GPS2 and HLA-DRB1.

AVAILABILITY AND IMPLEMENTATION: All mutation data, instructions, and functions for computing and integrating the statistics, as well as the HiConf gene panel, are available at www.github.com/Bose-Lab/Improved-Detection-of-Cancer-Genes.

CONTACT: rbose@dom.wustl.edu

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
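To make the truncation-rate idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a gene's truncation event rate can be approximated as the fraction of its mutations that are nonsense, frameshift or splice-site changes, and it measures how well that score separates two gene classes with a rank-based AUROC. The gene names, mutation lists and the `TRUNCATING` set are all hypothetical placeholders; the abstract does not give the exact definition of the test.

```python
from collections import Counter

# Mutation classes treated as truncating here; this set is an assumption.
TRUNCATING = {"nonsense", "frameshift", "splice_site"}


def truncation_rate(mutations):
    """Fraction of a gene's mutations that fall in a truncating class."""
    if not mutations:
        return 0.0
    counts = Counter(mutations)
    truncating = sum(counts[c] for c in TRUNCATING)
    return truncating / len(mutations)


def auroc(positive_scores, negative_scores):
    """AUROC as the probability that a random positive outscores a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Toy, made-up mutation lists keyed by placeholder gene names.
toy_tsgs = {
    "TSG_A": ["nonsense", "frameshift", "missense", "nonsense"],
    "TSG_B": ["frameshift", "missense", "splice_site", "missense"],
}
toy_oncogenes = {
    "ONC_A": ["missense", "missense", "missense", "missense"],
    "ONC_B": ["missense", "nonsense", "missense", "missense"],
}

tsg_rates = [truncation_rate(m) for m in toy_tsgs.values()]
onc_rates = [truncation_rate(m) for m in toy_oncogenes.values()]
separation = auroc(tsg_rates, onc_rates)
print(separation)
```

An AUROC of 0.5 would mean the score carries no information about gene class, while 1.0 would mean every TSG scores higher than every oncogene. The pairwise loop is quadratic and is used only for readability on toy inputs.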