Davendra Sohal, Andrew Yeatts, Kenny Ye, Andrea Pellagatti, Li Zhou, Perry Pahanish, Yongkai Mo, Tushar Bhagat, John Mariadason, Jacqueline Boultwood, Ari Melnick, John Greally, Amit Verma.
Abstract
Microarray-based studies of global gene expression (GE) have resulted in a large amount of data that can be mined for further insights into disease and physiology. Meta-analysis of these data is hampered by technical limitations due to the many different platforms, gene annotations and probes used in different studies. We tested the feasibility of conducting a meta-analysis of GE studies to determine a transcriptional signature of hematopoietic progenitor and stem cells. Data from studies that used normal bone marrow-derived hematopoietic progenitors were integrated using both RefSeq and UniGene identifiers. We observed that, in spite of variability introduced by experimental conditions and different microarray platforms, our meta-analytical approach can distinguish biologically distinct normal tissues by clustering them based on their cell of origin. When studied in terms of disease states, GE studies of leukemias and myelodysplasia progenitors tend to cluster with normal progenitors and remain distinct from other normal tissues, further validating the discriminatory power of this meta-analysis. Furthermore, analysis of 57 normal hematopoietic stem and progenitor cell GE samples was used to determine a gene expression signature characteristic of these cells. Genes that were most uniformly expressed in progenitors and at the same time differentially expressed when compared to other normal tissues were found to be involved in important biological processes such as cell cycle regulation and hematopoiesis. Validation studies using a different microarray platform demonstrated the enrichment of several genes such as SMARCE, Septin 6 and others not previously implicated in hematopoiesis. Most interestingly, alpha-integrin, the only common stemness gene discovered in a recent comparative murine analysis (Science 302(5644):393), was also enriched in our dataset, demonstrating the usefulness of this analytical approach.
Year: 2008 PMID: 18698424 PMCID: PMC2495035 DOI: 10.1371/journal.pone.0002965
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Schema of data collection and analysis.
Sources of data for the meta-analysis*
| Author | Source of cells | No. of datasets | Platform |
| Sternberg A, et al | CD34, MDS | 22 | U133 A/B |
| Oswald J, et al | CD34 | 3 | U133 A/B |
| Su AI, et al | CD34, various normal NHTs | 19 | U133 A/B |
| Eckfeldt CE, et al | CD34 | 18 | U133 A/B |
| Bhatia M, et al (GEO) | CD34 | 15 | U133 A/B |
| Pellagatti A, et al | CD34, MDS | 22 | U133 Plus 2.0 |
| Breit S, et al | Bone marrow | 9 | U95 |
| Ge X, et al | Various normal NHTs | 19 | U133 A/B |
| Gutierrez NC, et al | Bone marrow (AML) | 9 | U133 A/B |
| Roth RB, et al (GEO) | Various normal NHTs | 7 | U133 Plus 2.0 |
| Cheok MH, et al | Bone marrow (ALL) | 6 | U95 |
NHTs: Non-hematopoietic tissues, GEO: Gene Expression Omnibus database set,
MDS: Myelodysplastic syndrome, AML: Acute myeloid leukemia, ALL: Acute lymphoblastic leukemia
Numbers in brackets are reference numbers.
Platform and tissue type for various datasets.
| | U95 | U133 A/B | U133 Plus 2.0 | Total |
| Normal hematopoietic stem cells | 9 | 46 | 11 | 66 |
| Normal tissues, non-hematopoietic | 0 | 36 | 7 | 43 |
| Diseased hematopoietic stem cells | 6 | 23 | 11 | 40 |
| Total | 15 | 105 | 29 | 149 |
Figure 2. Normal bone marrow HSC clustering.
Experimental conditions, microarray platforms and sources of cells influence the gene expression patterns of normal bone marrow-derived HSCs. Dendrogram of normal bone marrow-derived hematopoietic cells based on unsupervised hierarchical clustering, using (1 - Pearson correlation coefficient) as the distance measure. Same color in each horizontal row indicates same group.
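The clustering described in this caption can be illustrated with a minimal sketch. The record states only the distance measure (1 - Pearson correlation coefficient); the linkage rule is not given, so average linkage is an assumption here, and the sample names and expression values are invented toy data, not values from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # undefined for a constant profile (zero variance)

def distance(x, y):
    """Distance measure named in the caption: 1 - Pearson correlation."""
    return 1.0 - pearson(x, y)

def average_linkage(profiles):
    """Agglomerative clustering with average linkage (assumed, not from the source).

    Returns the merges in order as (members_a, members_b, distance) tuples.
    """
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                d = sum(distance(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical toy profiles (four genes per sample).
profiles = {
    "cd34_a": [1.0, 2.0, 3.0, 4.0],
    "cd34_b": [1.1, 2.1, 2.9, 4.2],
    "liver_a": [4.0, 3.0, 2.0, 1.0],
    "liver_b": [4.2, 2.8, 2.1, 0.9],
}
merges = average_linkage(profiles)
for left, right, d in merges:
    print(left, right, round(d, 3))
```

With this distance, perfectly correlated profiles sit at 0 and perfectly anti-correlated profiles at 2, so samples are grouped by the shape of their expression pattern rather than by absolute intensity.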
Pairwise absolute correlation coefficients for normal hematopoietic cell samples
| | Mean (Range) | Median |
| Same study | 0.87 (0.26–1.00) | 0.95 |
| Different study | 0.35 (0.00–0.93) | 0.04 |
| Same platform | 0.83 (0.26–1.00) | 0.82 |
| Different platform | 0.02 (0.00–0.06) | 0.01 |
| Same cells (CD34 or BM) | 0.58 (0.01–1.00) | 0.79 |
| Different cells | 0.01 (0.00–0.01) | 0.01 |
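A summary of this kind can be sketched as follows: compute the absolute Pearson correlation for every pair of samples, then split the pairs by whether they share an annotation such as study, platform or cell type. The field names (`study`, `profile`) and all values below are hypothetical and chosen only to show the mechanics; they are not the study's data.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, median

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def summarize(samples, key):
    """Group pairwise |r| values by whether two samples share the annotation `key`."""
    groups = {"same": [], "different": []}
    for a, b in combinations(samples, 2):
        r = abs(pearson(a["profile"], b["profile"]))
        groups["same" if a[key] == b[key] else "different"].append(r)
    return {
        label: {
            "n": len(values),
            "mean": mean(values),
            "range": (min(values), max(values)),
            "median": median(values),
        }
        for label, values in groups.items()
        if values
    }

# Hypothetical toy samples; annotations and values are invented.
samples = [
    {"study": "A", "profile": [1.0, 2.0, 3.0, 4.0]},
    {"study": "A", "profile": [1.2, 1.9, 3.1, 3.9]},
    {"study": "B", "profile": [2.0, 1.0, 2.0, 1.0]},
    {"study": "B", "profile": [2.1, 0.9, 2.2, 1.0]},
]
summary = summarize(samples, "study")
for label, stats in summary.items():
    print(label, stats)
```

Calling `summarize` again with a different key (for example a hypothetical `platform` or `cell_type` field) would give the other row pairs of the table.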
Figure 3. Distinguishing normal non-hematopoietic tissues.
Despite differing platforms and experimental conditions, GE profiles can separate out normal tissues based on cell/tissue of origin. Dendrogram based on unsupervised hierarchical clustering, using (1 - Pearson correlation coefficient) as the distance measure. Rectangles indicate samples from the U133 A/B platform and ovals from the U133 Plus 2.0 platform. Triplicate sets of samples from human liver, heart, testis, kidney, etc. are from different studies, and their grouping together is a strong indicator of comparability across studies and platforms.
Pairwise absolute correlation coefficients for normal non-hematopoietic cell samples.
| | Mean (Range) | Median |
| Same study | 0.58 (0.27–0.91) | 0.59 |
| Different study | 0.55 (0.23–0.95) | 0.55 |
| Same platform | 0.58 (0.24–0.95) | 0.59 |
| Different platform | 0.51 (0.23–0.88) | 0.49 |
| Same tissue | 0.77 (0.68–0.95) | 0.80 |
| Different tissue | 0.57 (0.23–0.88) | 0.59 |
Figure 4A: Biological relationships identified.
Dendrogram of GE profiles from normal hematopoietic, diseased hematopoietic and non-hematopoietic tissues reveals the biological relationships between them. MDS sets intersperse with normal hematopoietic tissues, whereas AML samples form a separate group, consistent with their biological dissimilarity. Dendrogram based on unsupervised hierarchical clustering, using (1 - Pearson correlation coefficient) as the distance measure. Same color in each horizontal row indicates same group. UniGene IDs were used for integrating data. B: Clustering using RefSeq IDs. Same clustering as in 4A, showing the poorer performance of RefSeq IDs, compared to UniGene IDs, in uncovering biological relationships. Dendrogram based on unsupervised hierarchical clustering, using (1 - Pearson correlation coefficient) as the distance measure. Same color in each horizontal row indicates same group.
Figure 5A: “Stemness genes”.
349 UniGene IDs were identified as being consistently expressed amongst the normal hematopoietic cells and differentially expressed between hematopoietic and non-hematopoietic cells. Genes enriched in hematopoietic progenitor and stem cell datasets were involved in important functional pathways in the cell, including drug metabolism, hematological system development, cell signaling, cancer and cell death, as shown in the bar graph alongside. One such network is shown, which includes the GATA2, Cyclin E and SMARCE1 genes. B: Heatmap of “stemness” genes. 349 UniGene IDs were identified as being consistently expressed amongst the normal hematopoietic cells and differentially expressed between hematopoietic and non-hematopoietic cells. Of these, 176 genes were enriched in HSC datasets when compared to other tissue types.
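The two selection criteria in this caption (consistent expression within hematopoietic samples, differential expression against non-hematopoietic samples) can be sketched as a simple filter. This record does not give the statistics or cutoffs that were actually used, so the coefficient-of-variation threshold `max_cv`, the mean-difference threshold `min_diff`, the gene and sample names, and all values below are assumptions for illustration only.

```python
from statistics import mean, pstdev

def select_signature(expr, hsc, other, max_cv=0.2, min_diff=1.0):
    """Return genes that are uniform within `hsc` and shifted relative to `other`.

    expr: {gene: {sample: expression value}}
    max_cv, min_diff: hypothetical cutoffs, not taken from the source.
    """
    selected = {}
    for gene, values in expr.items():
        h = [values[s] for s in hsc]
        o = [values[s] for s in other]
        mh = mean(h)
        if mh == 0:
            continue  # coefficient of variation is undefined
        cv = pstdev(h) / abs(mh)   # criterion 1: consistency within HSC samples
        diff = mh - mean(o)        # criterion 2: difference vs. other tissues
        if cv <= max_cv and abs(diff) >= min_diff:
            selected[gene] = "enriched" if diff > 0 else "depleted"
    return selected

# Invented toy matrix with placeholder gene names.
expr = {
    "gene_up":    {"hsc1": 8.0, "hsc2": 8.2, "hsc3": 7.9, "liver": 3.0, "heart": 3.5},
    "gene_down":  {"hsc1": 2.0, "hsc2": 2.1, "hsc3": 1.9, "liver": 7.0, "heart": 6.5},
    "gene_noisy": {"hsc1": 2.0, "hsc2": 9.0, "hsc3": 5.0, "liver": 1.0, "heart": 1.5},
    "gene_flat":  {"hsc1": 5.0, "hsc2": 5.1, "hsc3": 4.9, "liver": 5.0, "heart": 5.2},
}
signature = select_signature(expr, ["hsc1", "hsc2", "hsc3"], ["liver", "heart"])
print(signature)
```

In this toy case `gene_noisy` fails the consistency criterion and `gene_flat` fails the differential criterion, which mirrors how a larger consistently expressed set can be narrowed to the subset enriched in HSC datasets.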
‘Stemness genes’*
| Major functions | Well-annotated genes |
| Gene Expression, Cell Cycle, Cellular Development | ABCC1, CASP8, CSNK1G2, E2F3, GATA2, JARID2, RALBP1, SMARCA4, SMARCE1, STK10, SUMO1, TAL1, TCF12, TFDP2, USP4, USP7 |
| Cell Morphology, Cellular Assembly and Organization, Cell Signaling | C1ORF2, GLIPR1, HSPA9, ING2, LPIN1, MAP3K4, MAP4K1, NCK1, NFATC1, PAK2, PPM1F, PPP3CA, TP53, UBE3A, ZNF84, BRPF1, EWSR1, HSPA4, LYN, MAPKAPK5, PHF21A, PTEN, TIMM17A, TROVE2 |
| Cancer, Cellular Growth and Proliferation, Tumor Morphology | ATP6V0A2, CD47, HNRPUL1, MLLT10, MPHOSPH9, MTR, PDS5A, SEC63, SH3BGRL |
| Others | TIPRL, TSR1, TXNDC9, SFRS17A, CENTB2, THOC2, KIAA0368, PAX3, TFIP11, TUFT, FMR1, NUFIP1 |
Some important genes differentially over-expressed in hematopoietic progenitors, as compared to non-hematopoietic tissues
Genes quiescent in HSC progenitor cells*
| Major functions | Well-annotated genes |
| Skeletal and Muscular System Development, Function and Disorders, Genetic Disorders | ADD1, APBB3, ARHGAP1, ATP2A2, BCL2L2, BGN, CALCOCO1, CALD1, CALM1, COL18A1, COL6A1, DDR1, ESRRA, FMOD, MYH9, NCOA1, PFN2, PXN, RHOC, SQSTM1, TPM1 |
| Cellular Assembly and Organization, Cellular Function and Maintenance, Cell Signaling | APP, CADM1, CD59, CLSTN1, ERBB2, F8, FLOT1, GDI1, IKBKG, MAPK13, MYO1C, NDRG2, NFE2L1, PTRF, RAB5B, RAB5C, SFRP1, SHC1, SPTAN1, WFS1 |
| Protein Degradation, Cellular Movement, Cell Morphology | ARF3, ARFIP2, CES2, COL1A1, CTNND1, EIF4G1, GSK3A, GSTA1, GSTM2, KIF5C, MFN2, MMP14, PAPSS2, PCDHGC3, PTPRF, SDC1, TIMP3, TSPAN3 |
| Others | CDC42EP4, CHST10, DEFB1, FKBP1A, HDLBP, LPP, S100A13, TEGT, AKAP1, CLOCK, JAM3, PCTK1, TLE2, TMPRSS6, TNFAIP1, TRIP10, USP13, SPOCK2 |
Some important genes differentially under-expressed in hematopoietic progenitors, as compared to non-hematopoietic tissues