Mengqi Zhang (1), F Richard Guo (2). (1) Department of Surgery, Perelman Medical School, University of Pennsylvania, Philadelphia, PA 19104, USA. (2) Statistical Laboratory, University of Cambridge, Cambridge CB3 0WB, UK.
Abstract
MOTIVATION: Single-cell sequencing offers unprecedented resolution for finding differentially expressed genes (DEGs) by disentangling highly heterogeneous tissues. Yet such analysis has so far mostly focused on comparing different cell types from the same individual. As single-cell sequencing becomes cheaper and easier to use, an increasing number of datasets from case-control studies are becoming available, calling for new methods that identify differential expression between case and control individuals.

RESULTS: To bridge this gap, we propose barycenter single-cell differential expression (BSDE), a nonparametric method for finding DEGs in case-control studies. By using optimal transportation to aggregate distributions and compute their distances, our method overcomes the restrictive parametric assumptions imposed by standard mixed-effect-modeling approaches. Through simulations, we show that BSDE can accurately detect a variety of differential expression patterns while maintaining the type-I error at a prescribed level. Further, BSDE identifies 1345 and 1568 cell type-specific DEGs from datasets on pulmonary fibrosis and multiple sclerosis, respectively, among which the top findings are supported by previous results from the literature.

AVAILABILITY AND IMPLEMENTATION: The R package BSDE is freely available from doi.org/10.5281/zenodo.6332254. For real data analysis with the R package, see doi.org/10.5281/zenodo.6332566. Both can also be accessed through GitHub at github.com/mqzhanglab/BSDE and github.com/mqzhanglab/BSDE_pipeline. The two single-cell sequencing datasets can be downloaded with the UCSC Cell Browser from cells.ucsc.edu/?ds=ms and cells.ucsc.edu/?ds=lung-pf-control.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
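The idea of aggregating per-individual expression distributions with optimal transport and then comparing the case and control groups can be illustrated for a single gene with a small sketch. This is not the BSDE package's code or API. It assumes one-dimensional expression values, where the Wasserstein barycenter reduces to averaging quantile functions, and it assumes a label-permutation test for the p-value, which the abstract does not specify. All function names and parameters below are hypothetical.

```python
import random


def quantiles(sample, grid):
    """Empirical quantile function of `sample` at each level in `grid` (linear interpolation)."""
    xs = sorted(sample)
    n = len(xs)
    out = []
    for u in grid:
        pos = u * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(xs[lo] * (1 - frac) + xs[hi] * frac)
    return out


def barycenter(quantile_rows):
    """1D Wasserstein barycenter: pointwise average of the quantile functions."""
    k = len(quantile_rows)
    return [sum(col) / k for col in zip(*quantile_rows)]


def w2(q_a, q_b):
    """Approximate 2-Wasserstein distance between two distributions on a shared quantile grid."""
    return (sum((a - b) ** 2 for a, b in zip(q_a, q_b)) / len(q_a)) ** 0.5


def permutation_test(individuals, labels, n_perm=500, n_grid=50, seed=0):
    """Distance between case and control barycenters, with a permutation p-value.

    individuals: list of per-individual lists of expression values for one gene
    labels:      list of 1 (case) / 0 (control), one per individual
    The permutation scheme is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    grid = [(i + 0.5) / n_grid for i in range(n_grid)]
    rows = [quantiles(cells, grid) for cells in individuals]

    def stat(lab):
        case = [r for r, g in zip(rows, lab) if g == 1]
        ctrl = [r for r, g in zip(rows, lab) if g == 0]
        return w2(barycenter(case), barycenter(ctrl))

    observed = stat(labels)
    perm = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(perm)  # reassign case/control labels across individuals
        if stat(perm) >= observed:
            exceed += 1
    p_value = (1 + exceed) / (1 + n_perm)
    return observed, p_value
```

Calling `permutation_test(individuals, labels)` returns the observed barycenter distance and a p-value. Permuting labels at the level of individuals, rather than cells, keeps the individual as the unit of replication, which is the point of a case-control comparison.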