Brian B Nadel [1], David Lopez [1], Dennis J Montoya [1], Feiyang Ma [1], Hannah Waddel [2], Misha M Khan [3], Serghei Mangul [4], Matteo Pellegrini [1,5].
[1] Bioinformatics Interdepartmental Degree Program, Molecular Biology Institute, Department of Molecular Cellular and Developmental Biology, and Institute for Genomics and Proteomics, University of California Los Angeles, 610 Charles E Young Dr S, Los Angeles, CA 90095, USA.
[2] Department of Mathematics, University of Utah, 155 1400 E, Salt Lake City, UT 84112, USA.
[3] Departments of Biology and Computer Science, Swarthmore College, 500 College Ave, Swarthmore, PA 19081, USA.
[4] Department of Clinical Pharmacy, USC School of Pharmacy, 1450 Alcazar Street, Los Angeles, CA 90089, USA.
[5] Department of Dermatology, David Geffen School of Medicine, University of California Los Angeles, 10833 Le Conte Ave, Los Angeles, CA 90095, USA.
Abstract
BACKGROUND: The cell type composition of heterogeneous tissue samples can be a critical variable in both clinical and laboratory settings. However, current experimental methods of cell type quantification (e.g., flow cytometry) are costly, time-consuming, and have the potential to introduce bias. Computational approaches that use expression data to infer cell type abundance offer an alternative. Although these methods have gained popularity, most fail to produce accurate predictions across the full range of platforms currently used by researchers or across the wide variety of tissue types often studied.
RESULTS: We present the Gene Expression Deconvolution Interactive Tool (GEDIT), a flexible tool that uses gene expression data to accurately predict cell type abundances. Using both simulated and experimental data, we extensively evaluate the performance of GEDIT and demonstrate that it returns robust results under a wide variety of conditions. These conditions include multiple platforms (microarray and RNA-seq), tissue types (blood and stromal), and species (human and mouse). Finally, we provide reference data from 8 sources spanning a broad range of stromal and hematopoietic types in both human and mouse. GEDIT also accepts user-submitted reference data, allowing the estimation of any cell type or subtype for which reference data are available.
CONCLUSIONS: GEDIT is a powerful method for evaluating the cell type composition of tissue samples and provides excellent accuracy and versatility compared to similar tools. The reference database provided here also allows users to obtain estimates for a wide variety of tissue samples without having to provide their own data.
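To make the idea of reference-based deconvolution concrete, the sketch below shows one generic way to infer cell type fractions from a mixed expression profile: non-negative least squares solved by projected gradient descent. It is an illustration only and is not taken from GEDIT; the abstract does not describe GEDIT's algorithm, and the function name `deconvolve`, the toy reference matrix, and the mixture values are all hypothetical.

```python
# Illustrative sketch only: generic reference-based deconvolution via
# non-negative least squares. This is NOT GEDIT's implementation; the
# function name and all numbers below are made up for the example.

def deconvolve(reference, mixture, n_iter=2000):
    """Estimate cell type fractions for one mixed sample.

    reference: list of rows, one per gene; each row holds the expression
               of that gene in each pure cell type.
    mixture:   expression of each gene in the mixed sample.
    Returns non-negative weights normalized to sum to 1.
    """
    n_genes = len(reference)
    n_types = len(reference[0])
    # A conservative step size: 1 / (sum of squared entries) never exceeds
    # 1 / (largest eigenvalue of A^T A), so gradient descent cannot diverge.
    step = 1.0 / sum(v * v for row in reference for v in row)
    weights = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        # Residual of the current fit: A w - m
        residual = [
            sum(r * w for r, w in zip(row, weights)) - m
            for row, m in zip(reference, mixture)
        ]
        # Gradient of 0.5 * ||A w - m||^2 is A^T (A w - m)
        grad = [
            sum(reference[g][t] * residual[g] for g in range(n_genes))
            for t in range(n_types)
        ]
        # Gradient step, then projection onto the non-negative orthant
        weights = [max(0.0, w - step * d) for w, d in zip(weights, grad)]
    total = sum(weights)
    return [w / total for w in weights] if total > 0 else weights


# Toy reference: 4 genes (rows) x 3 hypothetical cell types (columns).
reference = [
    [10.0, 1.0, 1.0],
    [1.0, 10.0, 1.0],
    [1.0, 1.0, 10.0],
    [5.0, 5.0, 5.0],
]
# Mixture built from true fractions 0.6, 0.3, 0.1 of the three cell types.
mixture = [6.4, 3.7, 1.9, 5.0]

fractions = deconvolve(reference, mixture)
print([round(f, 3) for f in fractions])
```

On this noise-free toy input the estimate recovers the fractions used to build the mixture (0.6, 0.3, 0.1). Real expression data are noisy and platform-dependent, which is why choices such as signature gene selection and scaling matter in practice.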