BloodSpot: a database of healthy and malignant haematopoiesis updated with purified and single cell mRNA sequencing profiles.

Frederik Otzen Bagger, Savvas Kinalis, Nicolas Rapin.

Abstract

BloodSpot is a gene-centric database of mRNA expression of haematopoietic cells. The web-based interface to the database includes three concomitant levels of visualization for a gene query: first, the expression across haematopoietic cell types; second, an analysis of survival of Acute Myeloid Leukaemia patients based on gene expression; and third, the expression visualized in an interactive developmental tree. With the introduction of single cell data we have now also included an unbiased dimensionality reduction method to show gene expression over the continuum of haematopoiesis. The webserver includes a few select analysis functionalities, such as Student's t-test, identification of correlating genes and lookup of whole genetic signatures, with the aim of making generation and testing of hypotheses quick and intuitive. The visualizations have been updated to accommodate new data types and the database has been greatly expanded with RNA-sequencing datasets, both purified in bulk and at single cell resolution, increasing the number of single samples more than 10-fold while keeping simplicity in presentation. The database should be of interest to any researcher within leukaemia, haematopoiesis, cellular development, or stem cells. The database is freely available at www.bloodspot.eu.

Year:  2019        PMID: 30395307      PMCID: PMC6323996          DOI: 10.1093/nar/gky1076

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

BloodSpot (1) is a database of haematopoietic cells in health and disease. The database and interface have been built with the aim of providing quick access for hypothesis testing and generation, via gene-centric lookup of mRNA expression throughout the course of haematopoiesis as well as in expanded leukaemic blasts. Importantly, the interface provides one-click, no-scroll access to relevant information (on the majority of screens). Uniquely among collected databases, BloodSpot provides detailed information on the definition and inclusion criteria for each cell type, allowing researchers to draw conclusions without searching through supplementary material from the original papers. In the initial versions (2,3) of BloodSpot, microarray was the standard high-throughput technique to assess gene expression in haematopoietic cell types, and large and comprehensive studies delineated the full constitution of the haematopoietic system (4), as well as large cohorts of patients with aberrant and leukaemic blasts (5,6), with intricate fluorescence-activated cell sorting (FACS) schemes. Microarrays have now almost entirely been replaced by short-read RNA-sequencing. Recently, it has also become possible to investigate haematopoiesis at single cell resolution (7), either in combination with FACS (8) or as an unbiased outline of the full constitution of the bone marrow (9,10). This has allowed a glimpse into the full continuum of haematopoiesis, independently of the surface-exposed marker proteins used for FACS. Quality assessment and filtering are important steps when processing single cell RNA-sequencing data and several methods have been developed for this purpose, e.g. (11,12). A number of other haematopoietic expression databases, each filling a niche, have existed alongside BloodSpot, as reviewed in (13).
Most notable are stem cell specific databases like Stemformatics (14) and SyStemCell (15), both of which also include cells from the haematopoietic stem cell compartment; their interfaces are built for creating analysis workflows rather than accessing processed data. Haematopoiesis-specific databases include ErythronDB (16) (specifically erythropoiesis) and Haemosphere (17), both providing multi-click access to analysis and data, with a focus on in-house data. The latter is particularly useful for its multidimensional scaling plots, which outline problematic quality and cell types in the included data. The ambitious Leukemia Gene Atlas (18) and Gene Expression Commons (19) are no longer updated (last data addition from 2013) and the dedicated mouse database BloodExpress (20) has been retired. With this update of BloodSpot we embrace the newest available techniques and data, both from bulk sequencing of highly purified FACS sorted cells and from single cell RNA-seq, to quickly visualize expression of genes or signatures across haematopoietic cells in the most informative way, and to assist researchers and clinicians within the fields of leukaemia, stem cells, and development in testing and generating hypotheses.

MATERIALS AND METHODS

In-house single cell data was processed as described in (21) and external single cell data was obtained either as deposited on GitHub (Setty, M., Kiseliovas, V., Levine, J., Gayoso, A., Mazutis, L. and Pe’er, D. (2018) Palantir characterizes cell fate continuities in human hematopoiesis. bioRxiv, https://doi.org/10.1101/385328), as Unique Molecular Identifiers (UMIs) acquired and processed through a standard workflow utilizing 10x Genomics Cell Ranger (10), or as normalized and filtered read counts (8, 22). Blueprint data was downloaded at the processing level ‘gene_quantification.rsem_grape2_crg.GRCh38’ (23). Purified FACS sorted early human progenitor data from Notta et al. (24) was trimmed for Nextera adaptors using trim_galore (version 0.4.0, with additional parameters: -q 15 --stringency 3 --length 36) and aligned and quantified using STAR 2.5.2b. Single cell RNA sequencing data visualization and dimensionality reduction were performed using a recent manifold learning technique, Uniform Manifold Approximation and Projection (UMAP) (McInnes, L., Healy, J. (2018) UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction, arXiv, https://arxiv.org/abs/1802.03426). In essence, UMAP optimizes towards retaining the local structure of the data while preserving the global structure. It was applied both for visualization (reducing the dimensionality to two) and as a pre-processing step to the clustering algorithm (reducing the dimensionality to 10). Furthermore, k-means was used for clustering the single cell datasets. The elbow method was used to determine the final number of clusters, k. Briefly, plotting the inertia (within-cluster sum of squares) for varying values of k allows a sensible k to be set, i.e. one large enough that adding a further cluster would not substantially reduce the inertia (Supplementary Figure S1).
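The elbow procedure described above can be illustrated with a minimal, self-contained sketch. It is not the code used for BloodSpot: the toy two-dimensional points stand in for cells in the reduced UMAP space, the deterministic farthest-point seeding and the `min_gain` threshold in the hypothetical helper `pick_elbow` are assumptions made for illustration, and in practice the elbow is usually judged by eye from the inertia plot.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start at the
    first point, then repeatedly add the point farthest from all centres."""
    centres = [points[0]]
    while len(centres) < k:
        centres.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centres))
        )
    return centres


def kmeans_inertia(points, k, n_iter=50):
    """Run Lloyd's k-means and return the within-cluster sum of squares."""
    centres = farthest_point_init(points, k)
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centres[j]))
            groups[nearest].append(p)
        # Recompute each centre as the mean of its group; keep the old
        # centre if a group happens to be empty.
        centres = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centres)
        ]
    return sum(min(squared_distance(p, c) for c in centres) for p in points)


def pick_elbow(inertias, min_gain=0.05):
    """Hypothetical heuristic: smallest k for which one more cluster lowers
    the inertia by less than min_gain times the inertia at the smallest k."""
    ks = sorted(inertias)
    baseline = inertias[ks[0]]
    for k, k_next in zip(ks, ks[1:]):
        if (inertias[k] - inertias[k_next]) / baseline < min_gain:
            return k
    return ks[-1]


# Toy data: three well-separated groups of 30 points each.
rng = random.Random(0)
blob_centres = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [
    (rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
    for cx, cy in blob_centres
    for _ in range(30)
]

inertias = {k: kmeans_inertia(points, k) for k in range(1, 7)}
best_k = pick_elbow(inertias)
print(inertias)
print("chosen k:", best_k)
```

Plotting `inertias` against k gives the curve from which the elbow is read; the threshold rule only mimics that visual judgement.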
By choosing a clustering algorithm and dimensionality so that clusters in the 2D plot apparently become split into separate clusters, it is possible not only to appreciate the continuum of haematopoietic development, and assess expression at different stages, but also to include relevant information from dimensions which do not appear on the two-dimensional plot. In the single cell data the abundant zero-count values were excluded from the main expression SinaPlot (26), as they greatly slowed the loading of the page without adding information, but they have been retained for calculations and visualizations on the UMAPs. Signatures from DMAP (4) were calculated from the processed and normalized expression matrix. Samples included were common myeloid progenitor, megakaryocyte and pre-B-cell. Differential testing was performed with Limma (27), creating contrasts for each cell type against all others (weighted) and requiring genes to have P < 0.05 and a log2 fold change above 1 to be included in the signature. The intensity of the expression levels of cells was used to colour samples in the UMAP. The intensity is computed as the mean of an expression score function across all genes of the signature. The function is given by the expression multiplied by the logarithm of the expression (x log x).
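As a hedged illustration of the signature filter and the x log x intensity described above, the following sketch uses plain Python. The gene names, the input layout and the function names are hypothetical; the use of the natural logarithm and the convention that x log x equals 0 at x = 0 are assumptions, since the text does not specify them.

```python
import math


def select_signature(results, p_max=0.05, min_log2fc=1.0):
    """Keep genes with P below p_max and log2 fold change above min_log2fc.

    `results` maps gene name -> (p_value, log2_fold_change); this layout is
    an assumption of the sketch, not the output format of Limma.
    """
    return sorted(
        gene
        for gene, (p_value, log2fc) in results.items()
        if p_value < p_max and log2fc > min_log2fc
    )


def x_log_x(x):
    # Assumption: natural logarithm, and the score is 0 at x = 0 (the limit
    # of x log x as x approaches 0). Values between 0 and 1 score below 0.
    return x * math.log(x) if x > 0 else 0.0


def signature_intensity(cell_expression, signature):
    """Mean of x log x over the signature genes for a single cell.

    Genes absent from `cell_expression` are treated as zero expression.
    """
    return sum(x_log_x(cell_expression.get(g, 0.0)) for g in signature) / len(signature)


# Hypothetical differential-expression results for one cell type.
results = {
    "GENE_A": (0.001, 2.5),  # passes both thresholds
    "GENE_B": (0.010, 1.2),  # passes both thresholds
    "GENE_C": (0.200, 3.0),  # fails the P threshold
    "GENE_D": (0.001, 0.4),  # fails the fold-change threshold
}
signature = select_signature(results)

# Hypothetical normalized expression values for one cell.
cell = {"GENE_A": 1.0, "GENE_B": math.e, "GENE_C": 5.0}
intensity = signature_intensity(cell, signature)
print(signature, intensity)
```

Because x log x grows faster than x, highly expressed signature genes dominate the mean, which is consistent with the stated aim of separating regions with different expression levels.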

RESULTS AND DISCUSSION

Single cell RNA-sequencing of haematopoietic stem and progenitor cells

The development of new and sensitive library preparation protocols has made single cell resolution expression profiling possible. In the haematopoietic stem cell compartment in particular, these advances provide an unprecedented opportunity to investigate early blood development in an unbiased manner. We have included several recent unique datasets for the study of haematopoietic progenitors at the single cell level in mouse (21,22) and human (8, 10, Setty et al.), and devised a new interface window for investigating their gene expression. Every single cell is visualized as one dot in a dimensionality-reduced UMAP plot, such that the full continuum of differentiating cells can be assessed and addressed in an antibody-independent manner. In effect, this means that the UMAP plots are a result of the expression of all genes in all cells, arranged such that similar cells are close together and dissimilar cells are further apart. As in a principal component analysis (PCA), genes that are more informative (higher variance, and in this case also higher correlation, over cells) are weighted higher in the assessment of similarity. Importantly for single cell sequencing of haematopoietic cells, UMAP offers meaningful organization of cell clusters and also preserves cellular continuums, unlike the popular t-SNE plot (Becht, E., Dutertre, C.-A., Kwok, I.W.H., Ng, L.G., Ginhoux, F. and Newell, E.W. (2018) Evaluation of UMAP as an alternative to t-SNE for single-cell data. bioRxiv, https://doi.org/10.1101/298430); this advantage comes, at times, at the cost of increased white space and overlapping dots in the plots.

Clustering of single cells in a UMAP space

Using k-means clustering in a 10D UMAP space, we clustered unlabelled single cells and colour coded the clusters for ease of interpretation in the SinaPlot (26). The expression of a query gene appears as the intensity of colours on the UMAP, and is independent of the clustering. The clustering serves to evaluate the expression quantitatively over the continuum, and also helps to discover cellular connections that are not apparent in a 2D plot.
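A minimal sketch of how a query gene's expression might be summarized per cluster follows. It is an illustration rather than the BloodSpot implementation: the function name, the input layout (one expression value and one cluster label per cell) and the returned fields are hypothetical. It mirrors the handling described in the Methods, where zero counts are retained for calculations but left out of the plotted distribution.

```python
from collections import defaultdict
from statistics import mean


def summarise_by_cluster(expression, cluster_labels):
    """Group one gene's per-cell expression values by cluster label."""
    grouped = defaultdict(list)
    for value, label in zip(expression, cluster_labels):
        grouped[label].append(value)

    summary = {}
    for label, values in sorted(grouped.items()):
        summary[label] = {
            "n_cells": len(values),
            # Zero counts are kept when computing the cluster mean.
            "mean_all": mean(values),
            # Zero counts are dropped from the values shown in the plot.
            "plotted": [v for v in values if v > 0],
        }
    return summary


# Hypothetical expression of one query gene in six cells from two clusters.
expression = [0.0, 2.0, 4.0, 0.0, 0.0, 3.0]
cluster_labels = [0, 0, 0, 1, 1, 1]

summary = summarise_by_cluster(expression, cluster_labels)
for label, stats in summary.items():
    print(label, stats)
```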

Validation of UMAP visualization

Expression of haematopoietic signatures created from DMAP (4) was used to assess the validity of the visualization and clustering. Figure 1 shows single cell data from Paul et al. (22) coloured by mean expression of DMAP gene signatures. Figures for the remaining cell types and single cell datasets can be found in Supplementary Figures S2–S5. Whereas distinct separation of each cell type is not to be expected, it is clear that the UMAP clusters and map regions are dominated by, and in some cases only contain, a single classically defined cell type or its progenitor state.
Figure 1.

UMAP embeddings of the expression levels of the cells from the Paul et al. study, visualized in two dimensions. (A) All cells are visualized; colour corresponds to cell type, as shown in the legend. (B–D) The intensity of the expression levels of cells is computed as the mean of an expression score function across all genes of the signatures Common Myeloid Progenitor (B), Megakaryocyte (C) and Pre-B-cell (D). As shown in the colour bar, more intense colour corresponds to higher expression levels. Colour intensities are the expression multiplied by the logarithm of the expression (x log x); this function was chosen for visualization to help differentiate between regions with different expression levels.

Inclusion criteria

We have included large studies of FACS sorted cells which broadly cover haematopoietic compartments, as well as single cell datasets, which represent haematopoietic cells in an unbiased way, independent of surface markers. We included newly published data which analysed >1000 cells and in which we could recover priming of cells that have known precursors in the HSC compartment (as shown in Figure 1 and Supplementary Figures S2–S5).

RNA-sequencing of FACS purified cells

BloodSpot is now expanded with high quality RNA-seq data from bulk sequencing of FACS purified cells (23,24,28). Noteworthy is the data from the BLUEPRINT epigenetics consortium: in addition to the epigenetic assays, the consortium provided a conspectus of expression profiles from sorted populations of the human haematopoietic system. This task was first performed on microarrays by the DMAP (4) project, with a sorting resolution and a completeness of cell types that remain to be exceeded.

The BloodSpot database update

The BloodSpot webserver is updated with curated high quality RNA-sequencing data from both single cells and FACS sorted purified cells. It now includes >25 000 samples, which are presented in an easy-to-navigate manner, and requires only a gene name as input for results. The database interface continues to be a one-click service, although modifications to data inclusion and statistical tests can be performed if required for publication purposes. On a gene query, a plot of expression is shown along with survival data, or a UMAP for single cell data, and a hierarchical display based on haematopoietic development or sample correlation. A dropdown can display correlating genes or pathways, which can be useful for hypothesis generation. The database has a steadily growing user base and fills a niche among existing databases. With this update we ensure that BloodSpot remains a resource at the forefront of the haematopoietic field. New data will continuously be curated and added to the database. Furthermore, biannual meetings with a user group and developers will systematically review new data releases since the last update, to ensure that the data is up to date. The database should be relevant for all researchers and clinicians within haematopoiesis, cellular development and stem cells.

DATA AVAILABILITY

UMAP is available in the GitHub repository https://github.com/lmcinnes/umap. The following data were acquired from Gene Expression Omnibus (GEO): GSE75478 (human single cell HSC), GSE60101 (mouse purified bulk), GSE108155 and GSE72857 (mouse single cell HSC), and GSE76234 (human purified bulk). Blueprint data was acquired from http://dcc.blueprint-epigenome.eu and CD34+ data (13) can be found at http://support.10xgenomics.com/single-cell/datasets. DMAP data was downloaded from http://www.broadinstitute.org/dmap/home. Human HSC 10x Genomics data was acquired from https://github.com/dpeerlab/Palantir/.
  25 in total

1.  Clinical utility of microarray-based gene expression profiling in the diagnosis and subclassification of leukemia: report from the International Microarray Innovations in Leukemia Study Group.

Authors:  Torsten Haferlach; Alexander Kohlmann; Lothar Wieczorek; Giuseppe Basso; Geertruy Te Kronnie; Marie-Christine Béné; John De Vos; Jesus M Hernández; Wolf-Karsten Hofmann; Ken I Mills; Amanda Gilkes; Sabina Chiaretti; Sheila A Shurtleff; Thomas J Kipps; Laura Z Rassenti; Allen E Yeoh; Peter R Papenhausen; Wei-Min Liu; P Mickey Williams; Robin Foà
Journal:  J Clin Oncol       Date:  2010-04-20       Impact factor: 44.544

2.  Densely interconnected transcriptional circuits control cell states in human hematopoiesis.

Authors:  Noa Novershtern; Aravind Subramanian; Lee N Lawton; Raymond H Mak; W Nicholas Haining; Marie E McConkey; Naomi Habib; Nir Yosef; Cindy Y Chang; Tal Shay; Garrett M Frampton; Adam C B Drake; Ilya Leskov; Bjorn Nilsson; Fred Preffer; David Dombkowski; John W Evans; Ted Liefeld; John S Smutko; Jianzhu Chen; Nir Friedman; Richard A Young; Todd R Golub; Aviv Regev; Benjamin L Ebert
Journal:  Cell       Date:  2011-01-21       Impact factor: 41.582

3.  limma powers differential expression analyses for RNA-sequencing and microarray studies.

Authors:  Matthew E Ritchie; Belinda Phipson; Di Wu; Yifang Hu; Charity W Law; Wei Shi; Gordon K Smyth
Journal:  Nucleic Acids Res       Date:  2015-01-20       Impact factor: 16.971

4.  From haematopoietic stem cells to complex differentiation landscapes.

Authors:  Elisa Laurenti; Berthold Göttgens
Journal:  Nature       Date:  2018-01-24       Impact factor: 49.962

5.  Leukemia gene atlas--a public platform for integrative exploration of genome-wide molecular data.

Authors:  Katja Hebestreit; Sören Gröttrup; Daniel Emden; Jannis Veerkamp; Christian Ruckert; Hans-Ulrich Klein; Carsten Müller-Tidow; Martin Dugas
Journal:  PLoS One       Date:  2012-06-14       Impact factor: 3.240

6.  Haemopedia: An Expression Atlas of Murine Hematopoietic Cells.

Authors:  Carolyn A de Graaf; Jarny Choi; Tracey M Baldwin; Jessica E Bolden; Kirsten A Fairfax; Aaron J Robinson; Christine Biben; Clare Morgan; Kerry Ramsay; Ashley P Ng; Maria Kauppi; Elizabeth A Kruse; Tobias J Sargeant; Nick Seidenman; Angela D'Amico; Marthe C D'Ombrain; Erin C Lucas; Sandra Koernig; Adriana Baz Morelli; Michael J Wilson; Steven K Dower; Brenda Williams; Shen Y Heazlewood; Yifang Hu; Susan K Nilsson; Li Wu; Gordon K Smyth; Warren S Alexander; Douglas J Hilton
Journal:  Stem Cell Reports       Date:  2016-08-04       Impact factor: 7.765

7.  Genetic Drivers of Epigenetic and Transcriptional Variation in Human Immune Cells.

Authors:  Lu Chen; Bing Ge; Francesco Paolo Casale; Louella Vasquez; Tony Kwan; Diego Garrido-Martín; Stephen Watt; Ying Yan; Kousik Kundu; Simone Ecker; Avik Datta; David Richardson; Frances Burden; Daniel Mead; Alice L Mann; Jose Maria Fernandez; Sophia Rowlston; Steven P Wilder; Samantha Farrow; Xiaojian Shao; John J Lambourne; Adriana Redensek; Cornelis A Albers; Vyacheslav Amstislavskiy; Sofie Ashford; Kim Berentsen; Lorenzo Bomba; Guillaume Bourque; David Bujold; Stephan Busche; Maxime Caron; Shu-Huang Chen; Warren Cheung; Oliver Delaneau; Emmanouil T Dermitzakis; Heather Elding; Irina Colgiu; Frederik O Bagger; Paul Flicek; Ehsan Habibi; Valentina Iotchkova; Eva Janssen-Megens; Bowon Kim; Hans Lehrach; Ernesto Lowy; Amit Mandoli; Filomena Matarese; Matthew T Maurano; John A Morris; Vera Pancaldi; Farzin Pourfarzad; Karola Rehnstrom; Augusto Rendon; Thomas Risch; Nilofar Sharifi; Marie-Michelle Simon; Marc Sultan; Alfonso Valencia; Klaudia Walter; Shuang-Yin Wang; Mattia Frontini; Stylianos E Antonarakis; Laura Clarke; Marie-Laure Yaspo; Stephan Beck; Roderic Guigo; Daniel Rico; Joost H A Martens; Willem H Ouwehand; Taco W Kuijpers; Dirk S Paul; Hendrik G Stunnenberg; Oliver Stegle; Kate Downes; Tomi Pastinen; Nicole Soranzo
Journal:  Cell       Date:  2016-11-17       Impact factor: 41.582

8.  Massively parallel digital transcriptional profiling of single cells.

Authors:  Grace X Y Zheng; Jessica M Terry; Phillip Belgrader; Paul Ryvkin; Zachary W Bent; Ryan Wilson; Solongo B Ziraldo; Tobias D Wheeler; Geoff P McDermott; Junjie Zhu; Mark T Gregory; Joe Shuga; Luz Montesclaros; Jason G Underwood; Donald A Masquelier; Stefanie Y Nishimura; Michael Schnall-Levin; Paul W Wyatt; Christopher M Hindson; Rajiv Bharadwaj; Alexander Wong; Kevin D Ness; Lan W Beppu; H Joachim Deeg; Christopher McFarland; Keith R Loeb; William J Valente; Nolan G Ericson; Emily A Stevens; Jerald P Radich; Tarjei S Mikkelsen; Benjamin J Hindson; Jason H Bielas
Journal:  Nat Commun       Date:  2017-01-16       Impact factor: 14.919

9.  BloodExpress: a database of gene expression in mouse haematopoiesis.

Authors:  Diego Miranda-Saavedra; Subhajyoti De; Matthew W Trotter; Sarah A Teichmann; Berthold Göttgens
Journal:  Nucleic Acids Res       Date:  2008-11-04       Impact factor: 16.971

10.  BloodSpot: a database of gene expression profiles and transcriptional programs for healthy and malignant haematopoiesis.

Authors:  Frederik Otzen Bagger; Damir Sasivarevic; Sina Hadi Sohi; Linea Gøricke Laursen; Sachin Pundhir; Casper Kaae Sønderby; Ole Winther; Nicolas Rapin; Bo T Porse
Journal:  Nucleic Acids Res       Date:  2015-10-26       Impact factor: 16.971
