Chixiang Chen (1,2), Yuk Yee Leung (3,4), Matei Ionita (3,4), Li-San Wang (3,4), Mingyao Li (4,5).
1. Division of Biostatistics and Bioinformatics, University of Maryland School of Medicine, Baltimore, MD, 21201, USA.
2. Department of Neurosurgery, University of Maryland School of Medicine, Baltimore, MD, 21201, USA.
3. Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.
4. Penn Neurodegeneration Genomics Center, Philadelphia, PA, 19104, USA.
5. Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, 19104, USA.
Abstract
MOTIVATION: Cell-type deconvolution of bulk tissue RNA sequencing (RNA-seq) data is an important step towards understanding how cell-type composition varies among disease conditions. Recent advances in single-cell RNA sequencing (scRNA-seq) and the availability of large amounts of bulk RNA-seq data from disease-relevant tissues have led to the development of various deconvolution methods. However, the performance of existing methods relies heavily on the quality of the information provided by external data sources, such as the scRNA-seq dataset selected as a reference and any prior biological information.

RESULTS: We present the Integrated and Robust Deconvolution (InteRD) algorithm to infer cell-type proportions from target bulk RNA-seq data. By using penalized regression with a new evaluation criterion for deconvolution, InteRD has three primary advantages. First, it effectively integrates deconvolution results from multiple scRNA-seq datasets. Second, it calibrates estimates from reference-based deconvolution by taking extra biological information into account as priors. Third, it is robust to inaccurate external information supplied to the deconvolution system. Extensive numerical evaluations and real data applications demonstrate that InteRD yields more accurate and robust cell-type proportion estimates that agree well with known biology.

AVAILABILITY AND IMPLEMENTATION: The proposed InteRD framework is implemented in R, and the package is available at https://cran.r-project.org/web/packages/InteRD/index.html.

SUPPLEMENTARY INFORMATION: Supplementary materials, including algorithm pseudocode, additional simulation results, and further discussion, are available at Bioinformatics online.
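To make the underlying problem concrete, the following is a minimal sketch of generic reference-based deconvolution, not the InteRD algorithm itself and not the API of the InteRD R package. It assumes a non-negative signature matrix (genes by cell types, as might be derived from an scRNA-seq reference) and models a bulk profile as the signature matrix times a proportion vector. The function name `deconvolve_nnls`, the toy matrix, and the multiplicative-update solver are all illustrative assumptions.

```python
import numpy as np


def deconvolve_nnls(signature, bulk, n_iter=500, eps=1e-12):
    """Estimate non-negative cell-type proportions p that minimize
    ||signature @ p - bulk||^2, then rescale them to sum to 1.

    Uses multiplicative updates, which keep p non-negative as long as
    `signature` and `bulk` are non-negative. Hypothetical helper for
    illustration only; InteRD uses its own penalized criterion.
    """
    n_types = signature.shape[1]
    p = np.full(n_types, 1.0 / n_types)   # start from equal proportions
    numer = signature.T @ bulk            # fixed part of the update
    gram = signature.T @ signature
    for _ in range(n_iter):
        p = p * numer / (gram @ p + eps)  # eps guards against division by zero
    return p / p.sum()


# Toy reference: 5 genes (rows) x 3 cell types (columns), values made up.
signature = np.array([
    [10.0, 1.0, 1.0],
    [1.0, 8.0, 1.0],
    [1.0, 1.0, 12.0],
    [5.0, 5.0, 1.0],
    [2.0, 1.0, 6.0],
])
true_props = np.array([0.6, 0.3, 0.1])
bulk = signature @ true_props             # noise-free simulated bulk sample

estimate = deconvolve_nnls(signature, bulk)
print(np.round(estimate, 3))
```

In this noise-free toy setting the estimate should recover the simulated proportions. With real data the result depends on how well the chosen reference matches the bulk tissue, which is the sensitivity to external information that the abstract describes.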