Anupama Jha (1), Mathieu Quesnel-Vallières (2,3), David Wang (4), Andrei Thomas-Tikhonenko (5,6,7), Kristen W Lynch (8), Yoseph Barash (9,10).
1. Department of Computer and Information Science, School of Engineering and Applied Science, Philadelphia, USA. anupamaj@seas.upenn.edu.
2. Department of Genetics, Philadelphia, USA. mathieu.quesnel-vallieres@pennmedicine.upenn.edu.
3. Department of Biochemistry and Biophysics, Philadelphia, USA. mathieu.quesnel-vallieres@pennmedicine.upenn.edu.
4. Department of Genetics, Philadelphia, USA.
5. Department of Pathology and Laboratory Medicine, Philadelphia, USA.
6. Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA.
7. Division of Cancer Pathobiology, Children's Hospital of Philadelphia, Philadelphia, USA.
8. Department of Biochemistry and Biophysics, Philadelphia, USA.
9. Department of Computer and Information Science, School of Engineering and Applied Science, Philadelphia, USA. yosephb@pennmedicine.upenn.edu.
10. Department of Genetics, Philadelphia, USA. yosephb@pennmedicine.upenn.edu.
Abstract
BACKGROUND: Cancer is a set of diseases characterized by unchecked cell proliferation and invasion of surrounding tissues. The many genes that have been genetically associated with cancer or shown to directly contribute to oncogenesis vary widely between tumor types, but common gene signatures that relate to core cancer pathways have also been identified. It is not clear, however, whether there are additional sets of genes or transcriptomic features that are less well known in cancer biology but are also commonly deregulated across several cancer types.

RESULTS: Here, we agnostically identify transcriptomic features that are shared between cancer types. We use 13,461 RNA-seq samples from 19 normal tissue types and 18 solid tumor types to train three feed-forward neural networks, based on protein-coding gene expression, lncRNA expression, or splice junction use, to distinguish between normal and tumor samples. All three models recognize transcriptome signatures that are consistent across tumors. Analysis of attribution values extracted from our models reveals that genes that are commonly altered in cancer by expression or splicing variations are under strong evolutionary and selective constraints. Importantly, we find that the genes composing our cancer transcriptome signatures are not frequently affected by mutations or genomic alterations and that their functions differ widely from those of the genes genetically associated with cancer.

CONCLUSIONS: Our results highlight that deregulation of RNA-processing genes and aberrant splicing are pervasive features on which core cancer pathways might converge across a large array of solid tumor types.
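As an illustration only, the general approach described in RESULTS (a feed-forward network trained to separate normal from tumor samples, followed by per-feature attribution) can be sketched as below. Everything in the sketch is an assumption made for brevity: the data are synthetic rather than RNA-seq, the network has a single small tanh hidden layer, training is plain stochastic gradient descent, and attribution is a simple gradient-times-input score. The abstract does not state the architecture or the attribution method actually used, and names such as `make_synthetic` and `FeedForward` are hypothetical.

```python
import math
import random


def make_synthetic(n, d, rng):
    """Toy stand-in for expression data: label 1 ("tumor") shifts feature 0 upward."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(d)]
        row[0] += 3.0 * label  # assumption: only feature 0 carries signal
        X.append(row)
        y.append(label)
    return X, y


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


class FeedForward:
    """One tanh hidden layer and a sigmoid output (hypothetical toy model)."""

    def __init__(self, d, h, rng):
        self.W1 = [[rng.gauss(0.0, 0.1) for _ in range(d)] for _ in range(h)]
        self.b1 = [0.0] * h
        self.w2 = [rng.gauss(0.0, 0.1) for _ in range(h)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.W1, self.b1)]
        p = sigmoid(sum(w * a for w, a in zip(self.w2, hidden)) + self.b2)
        return hidden, p

    def sgd_step(self, x, label, lr):
        hidden, p = self.forward(x)
        dz = p - label  # gradient of cross-entropy loss w.r.t. the output logit
        for j, a in enumerate(hidden):
            da = dz * self.w2[j] * (1.0 - a * a)  # backprop through tanh
            self.w2[j] -= lr * dz * a
            self.b1[j] -= lr * da
            for i, xi in enumerate(x):
                self.W1[j][i] -= lr * da * xi
        self.b2 -= lr * dz

    def input_gradient(self, x):
        """Gradient of the output logit with respect to each input feature."""
        hidden, _ = self.forward(x)
        return [sum(self.w2[j] * (1.0 - hidden[j] ** 2) * self.W1[j][i]
                    for j in range(len(hidden)))
                for i in range(len(x))]


rng = random.Random(0)
n_features = 6
X, y = make_synthetic(400, n_features, rng)
train_X, train_y = X[:300], y[:300]
test_X, test_y = X[300:], y[300:]

model = FeedForward(d=n_features, h=4, rng=rng)
for _ in range(30):  # epochs
    for x, label in zip(train_X, train_y):
        model.sgd_step(x, label, lr=0.05)

# Held-out accuracy at a 0.5 probability threshold.
accuracy = sum((model.forward(x)[1] > 0.5) == bool(t)
               for x, t in zip(test_X, test_y)) / len(test_X)

# Mean |gradient x input| per feature over held-out "tumor" samples.
tumor = [x for x, t in zip(test_X, test_y) if t == 1]
attribution = [0.0] * n_features
for x in tumor:
    grad = model.input_gradient(x)
    for i in range(n_features):
        attribution[i] += abs(grad[i] * x[i]) / len(tumor)
top_feature = max(range(n_features), key=lambda i: attribution[i])

print(f"held-out accuracy: {accuracy:.2f}")
print("features ranked by attribution:",
      sorted(range(n_features), key=lambda i: -attribution[i]))
```

In this toy setting the ranking is only a sanity check that the attribution score recovers the one feature that was made informative by construction; it says nothing about the real signatures reported in the study.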