Davide Risso (1), Stefano Maria Pagnotta (2)
(1) Dept. of Statistical Sciences, Università degli Studi di Padova, Padova, Italy.
(2) Dept. of Science and Technology, Università degli Studi del Sannio, Benevento, Italy.
Abstract
MOTIVATION: Data transformations are an important step in the analysis of RNA-seq data. Nonetheless, the impact of transformation on the outcome of unsupervised clustering procedures is still unclear.
RESULTS: Here, we present an Asymmetric Winsorization per Sample Transformation (AWST), which is robust to data perturbations and removes the need for selecting the most informative genes prior to sample clustering. Our procedure leads to robust and biologically meaningful clusters both in bulk and in single-cell applications.
AVAILABILITY: The AWST method is available at https://github.com/drisso/awst. The code to reproduce the analyses is available at https://github.com/drisso/awst_analysis.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.