Alexander Lachmann (1), Federico M Giorgi (2), Mariano J Alvarez (3), Andrea Califano (4)
1. Department of Biomedical Informatics (DBMI); Department of Systems Biology; Center for Computational Biology and Bioinformatics (C2B2), Columbia University, New York, NY, USA.
2. Department of Systems Biology; Center for Computational Biology and Bioinformatics (C2B2), Columbia University, New York, NY, USA; Scuola Superiore Sant'Anna, Pisa, Italy.
3. Department of Systems Biology.
4. Department of Biomedical Informatics (DBMI); Department of Systems Biology; Center for Computational Biology and Bioinformatics (C2B2), Columbia University, New York, NY, USA; Department of Biochemistry and Molecular Biophysics; Institute for Cancer Genetics; Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY, USA.
Abstract
MOTIVATION: Multiplex readout assays are increasingly performed using microfluidic automation in multiwell format. For instance, the Library of Integrated Network-based Cellular Signatures (LINCS) has produced gene expression measurements for tens of thousands of distinct cell perturbations using a 384-well plate format. This dataset is by far the largest 384-well gene expression measurement assay ever performed. We investigated the gene expression profiles of a million samples from the LINCS dataset and found that the vast majority (96%) of the tested plates were affected by a significant 2D spatial bias.

RESULTS: Using a novel algorithm that combines spatial autocorrelation detection with principal component analysis, we removed most of the spatial bias from the LINCS dataset and, in parallel, observed a dramatic improvement in the similarity between biological replicates assayed in different plates. The proposed methodology is fully general and can be applied to any highly multiplexed assay performed in multiwell format.

CONTACT: ac2248@columbia.edu

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
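To make the idea in RESULTS concrete, the following is a minimal sketch of how spatial autocorrelation detection and principal component analysis could be combined on a single plate. It is not the authors' implementation: the abstract does not name the autocorrelation statistic, the thresholds, or the reconstruction step. The sketch assumes Moran's I with edge-sharing (rook) neighbours, a wells-by-genes matrix in row-major well order, and hypothetical names and defaults (`morans_i`, `remove_spatial_components`, `threshold=0.3`, `max_components=10`) chosen only for illustration.

```python
import numpy as np


def morans_i(values, n_rows, n_cols):
    """Moran's I for one value per well, using rook (edge-sharing) neighbours."""
    grid = np.asarray(values, dtype=float).reshape(n_rows, n_cols)
    z = grid - grid.mean()
    denom = (z ** 2).sum()
    if denom == 0:
        return 0.0
    # Sum of z_i * z_j over horizontally and vertically adjacent well pairs.
    pair_sum = (z[:, :-1] * z[:, 1:]).sum() + (z[:-1, :] * z[1:, :]).sum()
    n_pairs = n_rows * (n_cols - 1) + (n_rows - 1) * n_cols
    # With symmetric binary weights each pair is counted twice in both the
    # numerator and the weight total, so the factor of 2 cancels.
    return (grid.size / n_pairs) * (pair_sum / denom)


def remove_spatial_components(X, n_rows, n_cols, threshold=0.3, max_components=10):
    """Zero out leading principal components whose well scores are spatially
    autocorrelated (assumed criterion: Moran's I above `threshold`).

    X has shape (n_rows * n_cols, n_genes), wells in row-major plate order.
    Returns the corrected matrix and the indices of the removed components.
    """
    X = np.asarray(X, dtype=float)
    gene_means = X.mean(axis=0, keepdims=True)
    U, s, Vt = np.linalg.svd(X - gene_means, full_matrices=False)
    flagged = np.zeros(len(s), dtype=bool)
    for i in range(min(max_components, len(s))):
        flagged[i] = morans_i(U[:, i], n_rows, n_cols) > threshold
    s_clean = np.where(flagged, 0.0, s)
    X_clean = (U * s_clean) @ Vt + gene_means
    return X_clean, np.flatnonzero(flagged)


# Synthetic 384-well plate (16 x 24): random noise plus a top-to-bottom gradient.
rng = np.random.default_rng(0)
n_rows, n_cols, n_genes = 16, 24, 50
bias = np.repeat(np.linspace(-1.0, 1.0, n_rows), n_cols)  # one value per well
loadings = rng.normal(size=n_genes)                       # per-gene sensitivity
X = rng.normal(size=(n_rows * n_cols, n_genes)) + 5.0 * np.outer(bias, loadings)

X_clean, removed = remove_spatial_components(X, n_rows, n_cols)
corr_before = np.corrcoef(X @ loadings, bias)[0, 1]
corr_after = np.corrcoef(X_clean @ loadings, bias)[0, 1]
print("removed components:", removed)
print("correlation with gradient before/after:", corr_before, corr_after)
```

In this toy setting the gradient dominates the first principal component, so testing component scores for spatial structure is enough to isolate it. Real plates may carry bias spread across several components or mixed with genuine biological signal, in which case the threshold and the number of components examined would need to be validated, for example against biological replicates on different plates as the abstract describes.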