Bahman Afsari (1), Theresa Guo (2), Michael Considine (1), Liliana Florea (3), Luciane T Kagohara (1), Genevieve L Stein-O'Brien (1), Dylan Kelley (2), Emily Flam (2), Kristina D Zambo (2), Patrick K Ha (4), Donald Geman (5), Michael F Ochs (6), Joseph A Califano (7), Daria A Gaykalova (2), Alexander V Favorov (1,8), Elana J Fertig (1).

1. Division of Biostatistics and Bioinformatics, Department of Oncology, Sidney Kimmel Comprehensive Cancer Center.
2. Department of Otolaryngology-Head and Neck Surgery.
3. McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, MD 21205, USA.
4. Department of Otolaryngology-Head and Neck Surgery, University of California, San Francisco, CA 94158, USA.
5. Department of Applied Mathematics & Statistics, Johns Hopkins University, Baltimore, MD 21218, USA.
6. Department of Mathematics & Statistics, The College of New Jersey, Ewing, NJ 08628, USA.
7. Division of Otolaryngology, Department of Surgery, University of California, San Diego, CA 92093, USA.
8. Laboratory of Systems Biology and Computational Genetics, Vavilov Institute of General Genetics, RAS, Moscow 119333, Russia.
Abstract
Motivation: Current bioinformatics methods for detecting changes in gene isoform usage between phenotypes compare the expected relative isoform usage in each phenotype. These statistics model differences in isoform usage in normal tissues, which have stable regulation of gene splicing. Pathological conditions, such as cancer, can have broken regulation of splicing that increases the heterogeneity of the expression of splice variants. Inferring events with such differential heterogeneity in gene isoform usage requires new statistical approaches.

Results: We introduce Splice Expression Variability Analysis (SEVA) to model increased heterogeneity of splice variant usage between conditions (e.g. tumor and normal samples). SEVA uses a rank-based multivariate statistic that compares the variability of junction expression profiles within one condition to the variability within another. Simulated data show that SEVA is unique in modeling heterogeneity of gene isoform usage; we use them to benchmark SEVA's performance against EBSeq, DiffSplice and rMATS, which model differential isoform usage rather than heterogeneity. We confirm the accuracy of SEVA in identifying known splice variants in head and neck cancer and perform cross-study validation of novel splice variants. A novel comparison of splice variant heterogeneity between subtypes of head and neck cancer shows an unanticipated similarity in the heterogeneity of gene isoform usage between HPV-positive and HPV-negative subtypes. It also shows the anticipated increase in heterogeneity among HPV-negative samples with mutations in genes that regulate the splice variant machinery. These results show that SEVA accurately models differential heterogeneity of gene isoform usage from RNA-seq data.

Availability and implementation: SEVA is implemented in the R/Bioconductor package GSReg.

Contact: bahman@jhu.edu or favorov@sensi.org or ejfertig@jhmi.edu.

Supplementary information: Supplementary data are available at Bioinformatics online.
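To make the idea of comparing within-condition variability with a rank-based statistic concrete, the following is a minimal Python sketch. It is not the SEVA statistic or the GSReg implementation: the choice of Kendall-tau distance between junction profiles, the permutation test, the function names and the toy read counts are all assumptions made for illustration only.

```python
import itertools
import random


def kendall_tau_distance(x, y):
    """Fraction of junction pairs whose relative order differs between two profiles.

    Tied pairs are counted as concordant (a simplifying assumption).
    """
    pairs = list(itertools.combinations(range(len(x)), 2))
    discordant = sum(
        1 for i, j in pairs if (x[i] - x[j]) * (y[i] - y[j]) < 0
    )
    return discordant / len(pairs)


def mean_pairwise_distance(profiles):
    """Average Kendall-tau distance over all pairs of samples in one condition."""
    pairs = list(itertools.combinations(profiles, 2))
    return sum(kendall_tau_distance(a, b) for a, b in pairs) / len(pairs)


def variability_difference_test(group_a, group_b, n_perm=2000, seed=0):
    """Permutation test for a difference in within-condition variability.

    Returns (observed difference, two-sided permutation p-value).
    """
    rng = random.Random(seed)
    observed = mean_pairwise_distance(group_b) - mean_pairwise_distance(group_a)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign condition labels at random
        stat = (mean_pairwise_distance(pooled[n_a:])
                - mean_pairwise_distance(pooled[:n_a]))
        if abs(stat) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical junction read counts for one gene (rows: samples, columns: junctions).
# Normal samples keep the same junction ordering; tumor samples do not.
normal = [[10, 8, 3, 1], [12, 9, 4, 2], [11, 7, 3, 1], [9, 8, 2, 1], [13, 10, 5, 2]]
tumor = [[10, 8, 3, 1], [1, 9, 4, 12], [3, 11, 7, 1], [8, 2, 9, 1], [2, 5, 10, 13]]

observed, p_value = variability_difference_test(normal, tumor)
print(f"difference in mean pairwise distance: {observed:.3f}, p = {p_value:.4f}")
```

Because the distance depends only on the relative order of junction counts within a sample, this sketch is unaffected by any monotone rescaling of a sample's counts. That illustrates the appeal of a rank-based statistic, though the published method may handle normalization, ties and significance testing differently.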