Yiqun Zhang (1), Fengju Chen (1), Chad J Creighton (2, 3, 4, 5; creighto@bcm.edu).
1. Dan L. Duncan Comprehensive Cancer Center Division of Biostatistics, Baylor College of Medicine, Houston, TX, 77030, USA.
2. Dan L. Duncan Comprehensive Cancer Center Division of Biostatistics, Baylor College of Medicine, Houston, TX, 77030, USA.
3. Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA.
4. Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA.
5. Department of Medicine, Baylor College of Medicine, Houston, TX, 77030, USA.
Abstract
BACKGROUND: Combined whole-genome sequencing (WGS) and RNA sequencing of cancers offers the opportunity to identify genes with altered expression due to genomic rearrangements. Somatic structural variants (SVs), as identified by WGS, can involve altered gene cis-regulation, gene fusions, copy number alterations, or gene disruption. The absence of computational tools to streamline integrative analysis steps may represent a barrier to identifying genes recurrently altered by genomic rearrangement.
RESULTS: Here, we introduce SVExpress, a set of tools for carrying out integrative analysis of SV and gene expression data. SVExpress enables systematic cataloging of genes that consistently show increased or decreased expression in conjunction with the presence of nearby SV breakpoints. SVExpress can evaluate breakpoints in proximity to genes for potential enhancer translocation events or disruption of topologically associated domains (TADs), two mechanisms by which SVs may deregulate genes. The output from any commonly used SV calling algorithm may be easily adapted for use with SVExpress. SVExpress can readily analyze genomic datasets involving hundreds of cancer sample profiles. We used SVExpress to analyze the 327 cancer cell lines in the Cancer Cell Line Encyclopedia (CCLE) that have both SV and expression data. In the CCLE dataset, hundreds of genes showed altered expression in relation to nearby SV breakpoints. Altered genes involved TAD disruption, enhancer hijacking, and gene fusions. When the top set of SV-altered genes from cancer cell lines was compared with the top SV-altered genes previously reported for human tumors from The Cancer Genome Atlas and the Pan-Cancer Analysis of Whole Genomes datasets, a significant number of genes overlapped in the same direction for both cell lines and tumors, while some genes were significant for cell lines but not for human tumors and vice versa.
CONCLUSION: Our SVExpress tools allow computational biologists with a working knowledge of R to integrate gene expression with SV breakpoint data to identify recurrently altered genes. SVExpress is freely available for academic or commercial use at https://github.com/chadcreighton/SVExpress . SVExpress is implemented as a set of Excel macros and R code. All source code (R and Visual Basic for Applications) is available.
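The core idea described above, relating a gene's expression to the presence of nearby SV breakpoints, can be illustrated with a minimal Python sketch. This is not SVExpress code (SVExpress itself is implemented in R and Excel macros) and does not reproduce its interface. The function names, the 100 kb window, the toy data, and the use of a simple difference in mean expression in place of any statistical model are all assumptions made for illustration.

```python
from statistics import mean

def samples_with_nearby_breakpoint(gene, breakpoints, window=100_000):
    """Return the samples that have a breakpoint within `window` bp of the gene.

    `window` is a hypothetical choice, not a value taken from the paper.
    """
    hits = set()
    for sample, chrom, pos in breakpoints:
        same_chrom = chrom == gene["chrom"]
        in_range = gene["start"] - window <= pos <= gene["end"] + window
        if same_chrom and in_range:
            hits.add(sample)
    return hits

def expression_difference(expression, hit_samples):
    """Mean expression in samples with a nearby breakpoint minus the rest.

    A plain mean difference stands in for a proper statistical test here.
    Returns None when either group is empty.
    """
    with_sv = [v for s, v in expression.items() if s in hit_samples]
    without_sv = [v for s, v in expression.items() if s not in hit_samples]
    if not with_sv or not without_sv:
        return None
    return mean(with_sv) - mean(without_sv)

# Toy inputs (invented for illustration only).
gene = {"name": "GENE_A", "chrom": "chr1", "start": 1_000_000, "end": 1_050_000}
breakpoints = [
    ("S1", "chr1", 950_000),    # 50 kb upstream of the gene
    ("S2", "chr1", 1_120_000),  # 70 kb downstream of the gene
    ("S3", "chr2", 1_010_000),  # different chromosome
    ("S4", "chr1", 5_000_000),  # same chromosome, far away
]
expression = {"S1": 8.0, "S2": 7.0, "S3": 3.0, "S4": 2.0}

hits = samples_with_nearby_breakpoint(gene, breakpoints)
diff = expression_difference(expression, hits)
print(sorted(hits), diff)
```

In this toy data, samples S1 and S2 have breakpoints near the gene and show higher expression than S3 and S4. A real analysis would repeat this per gene across the cohort and would likely need to account for confounders such as copy number and cancer type.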