John Vivian (1), Jordan M Eizenga (1), Holly C Beale (2), Olena M Vaske (2), Benedict Paten (1)
1. Computational Genomics Laboratory, University of California, Santa Cruz, Santa Cruz, CA.
2. Molecular, Cell, and Developmental Biology, University of California, Santa Cruz, Santa Cruz, CA.
Abstract
PURPOSE: Many antineoplastics are designed to target upregulated genes, but quantifying upregulation in a single patient sample requires an appropriate set of samples for comparison. In cancer, the most natural comparison set is unaffected samples from the matching tissue, but there are often too few available unaffected samples to overcome high intersample variance. Moreover, some cancer samples have misidentified tissues of origin or even composite-tissue phenotypes. Even if an appropriate comparison set can be identified, most differential expression tools are not designed to accommodate comparisons to a single patient sample.

METHODS: We propose a Bayesian statistical framework for gene expression outlier detection in single samples. Our method uses all available data to produce a consensus background distribution for each gene of interest without requiring the researcher to manually select a comparison set. The consensus distribution can then be used to quantify over- and underexpression.

RESULTS: We demonstrate this method on both simulated and real gene expression data. We show that it can robustly quantify overexpression, even when the set of comparison samples lacks ideally matched tissue samples. Furthermore, our results show that the method can identify appropriate comparison sets from samples of mixed lineage and rediscover numerous known gene-cancer expression patterns.

CONCLUSION: This exploratory method is suitable for identifying expression outliers from comparative RNA sequencing (RNA-seq) analysis for individual samples, and Treehouse, a pediatric precision medicine group that leverages RNA-seq to identify potential therapeutic leads for patients, plans to explore this method for processing its pediatric cohort.
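To make the idea of a consensus background distribution concrete, the following is a minimal sketch and not the authors' model. It assumes that expression values are on a log scale, that each background group can be approximated by a normal distribution, and that the per-group weights are already known; in the method described above such weights would come out of the Bayesian model, whereas here they are supplied by hand. The function names, group names, and numbers are all hypothetical.

```python
import math
from statistics import mean, stdev


def normal_sf(x, mu, sigma):
    """Upper-tail probability P(X > x) for a normal distribution."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2)))


def consensus_tail_probability(sample_value, backgrounds, weights):
    """P(background > sample_value) under a weighted mixture of normals.

    backgrounds: dict mapping group name -> list of expression values
    weights:     dict mapping group name -> non-negative weight
                 (hand-picked here; an assumption of this sketch)
    """
    total = sum(weights.values())
    p = 0.0
    for name, values in backgrounds.items():
        w = weights[name] / total  # normalise so the weights sum to 1
        p += w * normal_sf(sample_value, mean(values), stdev(values))
    return p


# Hypothetical log-scale expression of one gene in two background groups.
backgrounds = {
    "thyroid": [2.0, 2.5, 3.0, 2.2, 2.8],
    "kidney": [5.0, 5.5, 6.0, 5.2, 5.8],
}

# The same sample value looks more extreme when the consensus leans
# towards the low-expressing group.
p_thyroid_like = consensus_tail_probability(
    6.5, backgrounds, {"thyroid": 0.9, "kidney": 0.1}
)
p_kidney_like = consensus_tail_probability(
    6.5, backgrounds, {"thyroid": 0.1, "kidney": 0.9}
)
print(p_thyroid_like, p_kidney_like)
```

A small tail probability suggests overexpression relative to the weighted background. The sketch shows only how the choice of weights changes that conclusion, which is why selecting the comparison set matters.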