Hai Yang (1,2), Qiang Wei (1,2), Xue Zhong (2,3), Hushan Yang (4), Bingshan Li (1,2)
1. Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, TN, USA.
2. Vanderbilt Genetics Institute, Nashville, TN, USA.
3. Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA.
4. Department of Medical Oncology, Sidney Kimmel Cancer Center, Thomas Jefferson University, Philadelphia, PA, USA.
Abstract
Motivation: A comprehensive catalogue of the genes that drive tumor initiation and progression in cancer is key to advancing diagnostics, therapeutics and treatment. Given the complexity of cancer, this catalogue is still far from complete. Increasing evidence shows that driver genes exhibit consistent aberration patterns across multiple omics data types in tumors. In this study, we aim to leverage the complementary information encoded in each omics data type to identify novel driver genes through an integrative framework. Specifically, we integrated mutations, gene expression, DNA copy numbers, DNA methylation and protein abundance, all available in The Cancer Genome Atlas (TCGA), and developed iDriver, a non-parametric Bayesian framework based on multivariate statistical modeling that identifies driver genes in an unsupervised fashion. iDriver captures the inherent clusters of gene aberrations and constructs a background distribution that is used to assess and calibrate the confidence of driver genes identified from multi-dimensional genomic data.
Results: We applied the method to 4 cancer types in TCGA and identified candidate driver genes that are highly enriched for known drivers (e.g. P < 3.40 × 10⁻³⁶ for breast cancer). We were particularly interested in novel genes and observed multiple lines of supporting evidence for them. Using systematic evaluation from multiple independent aspects, we identified 45 candidate driver genes across these 4 cancer types that were not previously known. These findings imply that integrating additional genomic data with multivariate statistics can help identify cancer drivers and guide the next stage of cancer genomics research.
Availability and Implementation: The C++ source code is freely available at https://medschool.vanderbilt.edu/cgg/.
Contacts: hai.yang@vanderbilt.edu or bingshan.li@Vanderbilt.Edu.
Supplementary information: Supplementary data are available at Bioinformatics online.
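The abstract reports an enrichment P value for known drivers among the candidate genes but does not say which test produced it. As a minimal sketch, assuming a one-sided hypergeometric test (a common choice for gene-set overlap, not confirmed as the authors' method), the calculation could look like the following. The function name and all gene counts are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_enrichment_p(n_genes, n_known, n_called, n_overlap):
    """One-sided hypergeometric tail probability P(X >= n_overlap).

    Assumed setup (not taken from the paper):
      n_genes   - total number of genes tested (the background)
      n_known   - number of background genes that are known drivers
      n_called  - number of genes called as candidate drivers
      n_overlap - number of called genes that are known drivers
    """
    upper = min(n_known, n_called)
    # Number of ways to pick n_called genes from the background.
    total = comb(n_genes, n_called)
    # Count the selections that contain at least n_overlap known drivers.
    tail = sum(
        comb(n_known, k) * comb(n_genes - n_known, n_called - k)
        for k in range(n_overlap, upper + 1)
    )
    # Exact integer arithmetic up to the final division.
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    p = hypergeom_enrichment_p(n_genes=20000, n_known=500,
                               n_called=100, n_overlap=30)
    print(f"enrichment P = {p:.3e}")
```

The counts are kept as exact integers until the last step, so very small tail probabilities are not lost to floating-point rounding in the intermediate terms.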