Yajun Liu, De Cheng, Zhenzhen Li, Xing Gao, Huayan Wang
College of Veterinary Medicine, Shaanxi Center for Stem Cell Engineering and Technology, Northwest A&F University, Yangling, Shaanxi, China.
Abstract
Induced pluripotent stem cells (iPSCs) obtained by the ectopic expression of defined transcription factors have tremendous promise and therapeutic potential for regenerative medicine. Many studies have highlighted important differences between iPSCs and embryonic stem cells (ESCs). In this work, we used meta-analysis to compare the global transcriptional profiles of human iPSCs from various cellular origins and induced by different methods. The induction strategy affected the quality of iPSCs in terms of transcriptional signatures. The iPSCs generated by non-integrating methods were closer to ESCs in terms of transcriptional distance than iPSCs generated by integrating methods. Several pathways that could be potentially useful for studying the molecular mechanisms underlying transcription factor-mediated reprogramming leading to pluripotency were also identified. These pathways were mostly associated with the maintenance of ESC pluripotency and cancer regulation. Numerous genes that are up-regulated during the induction of reprogramming also have an important role in the success of human preimplantation embryonic development. Our results indicate that hiPSCs maintain their pluripotency through mechanisms similar to those of hESCs.
Induced pluripotent stem cells (iPSCs) are derived from somatic cells by introducing two pluripotency-associated transcription factors, Oct4 (O) and Sox2 (S), and two proto-oncogenes, c-Myc (M) and Klf4 (K). These four transcription factors globally reset the epigenetic and transcriptional state of fibroblasts into that of pluripotent cells (Takahashi ). This technology provides alternative pluripotent cells that closely resemble blastocyst-derived embryonic stem cells (ESCs), which are considered the gold standard for stem cells (Takahashi ; Kang ). The replacement of ESCs with iPSCs in the field of regenerative medicine is based on the assumption that iPSCs are as potent as ESCs in their ability to differentiate and as safe for clinical applications (Boue ). Mouse iPSCs have the same functional characteristics as mouse ESCs, as shown by their capacity to generate mice in tetraploid complementation experiments (Boland ; Kang ; Zhao ). In contrast, this convincing pluripotency test cannot readily be applied to human iPSCs (hiPSCs). Genome-wide profiling of gene expression (Ghosh ), DNA methylation patterns (Doi ) and differentiation properties has detected incomplete reprogramming in hiPSCs. These findings suggest that there are substantial differences between hESCs and hiPSCs.
The advantages and disadvantages of the delivery method for each factor have been discussed elsewhere (Achiwa ; Gonzalez ). Since the first report on iPSCs produced by retroviral delivery of four factors (OSKM), a substantial number of alternative approaches have been developed to induce pluripotency. In this report, we describe a meta-analysis of gene expression information from multiple independent but related studies (summarized in Table 1). For this, we compared the transcriptional signatures of hiPSCs generated by different methods and transcription factors, with hESCs serving as the gold standard. We also examined the molecular events involved in human cell reprogramming by comparing the transcriptomes of hiPSCs and fibroblasts.
Table 1
Data for 20 hiPSC lines derived from cells of different origins and different methods of induction. The dataset for each iPSC line includes the donor cells, method of induction and reprogramming factors. The differentially expressed genes were identified by comparing iPSCs and ESCs of the same sex and from the same laboratory. All of the microarray data can be retrieved through the corresponding GEO number. iPSCs-ESCs indicates the number of differentially expressed genes between iPSCs and ESCs, and iPSCs-fibroblasts indicates the number of differentially expressed genes between iPSCs and fibroblasts. K: Klf4, L: Lin28, M: c-Myc, N: Nanog, O: Oct4, S: Sox2.
| Donor cells | iPSCs | Method of induction | Reprogramming factors | ESCs | Sex | GEO number | iPSCs-ESCs | iPSCs-fibroblasts | Reference |
|---|---|---|---|---|---|---|---|---|---|
| NPC | iPS 462813 | Episome | OS | HUES6 | F | GSE18618 | 9507 | 7688 | Marchetto et al. (2009) |
| MSC fibroblasts | MSC-derived iPSC line | Retroviral | OSKM | BG01 ESCs | M | | 1424 | 15106 | Marchetto et al. (2009) |
| Foreskin | iPS cells | Episome | OSNL | H13B ESC | M | GSE15148 | 8823 | 9689 | Yu et al. (2009) |
| Foreskin | JT-iPSC | Episome | OSNL | H13 | M | GSE20014 | 1843 | 10673 | Jia et al. (2010) |
| Foreskin | iPSC | Minicircle DNA | OSNL | H13 | M | | 7668 | 12092 | Jia et al. (2010) |
| Fibroblast | hiPSC | Retroviral | OSKMN | Hsf1 p51 | F | GSE22392 | 10048 | 9598 | Chin et al. (2010) |
| Fibroblast | hiPS | Inducible system | OSK | hES_BG01 | M | GSE22499 | 6209 | 15684 | Guenther et al. (2010) |
| Fibroblast | hiPSC | Proteins | OSKMN | H9 hESCs | F | GSE16093 | 1610 | 9104 | Kim et al. (2009) |
| dH1f fibroblast | dH1f-iPS3 | Retroviral | OSKM | H1-OGN hES cells | M | GSE9832 | 8658 | 8614 | Park et al. (2008) |
| dH1cf16 | dH1cf16-iPS5 | Retroviral | OSKM | H1-OGN hES cells | M | | 11159 | 7793 | Park et al. (2008) |
| BJ fibroblast | BJ_iPS | Retroviral | OSKM | H1_ES-1 | M | | 14523 | 15424 | Park et al. (2008) |
| dH1F fibroblast | dH1F_iPS3 | Retroviral | OSKM | H1_ES-1 | M | GSE23583 | 8658 | 15243 | Warren et al. (2010) |
| dH1F fibroblast | dH1F_RiPS | mRNA | OSKM | H1_ES-1 | M | | 9125 | 11490 | Warren et al. (2010) |
| BJ1 fibroblast | BJ1-iPSC 2 | Retroviral | OSKM | BG01 ESCs | M | | 2188 | 12499 | Warren et al. (2010) |
| BJ fibroblast | BJ_RiPS | mRNA | OSKM | H1_ES-1 | M | | 10173 | 11072 | Warren et al. (2010) |
| MRC5 fibroblast | MRC5_RiPS | mRNA | OSKM | H1_ES-1 | M | | 10445 | 12092 | Warren et al. (2010) |
| BJ1 fibroblast | BJ1-iPS1 | Retroviral | OSKM | H1-OGN hES cells | M | GSE24182 | 2920 | 15424 | Loewer et al. (2010) |
| CB_CD133+ | CBiPS_4F | Retroviral | OSKM | ES2_1 | F | GSE16694 | 15414 | 16810 | Giorgetti et al. (2009) |
| CB_CD133+ | CBiPS_2F | Retroviral | OS | ES2_1 | F | | 14523 | 8614 | Park et al. (2008) |
| BJ sample | BJ hIPS | Inducible system | OSKMN | HUES 8 p | M | GSE12390 | 6159 | 15684 | Maherali et al. (2008) |
Materials and Methods
Source of gene expression data
All of the microarray information and individual cell intensity (CEL) files in the HG-U133Plus2 microarray platform (Affymetrix) were obtained online at the Gene Expression Omnibus (GEO), a public repository for a wide range of high-throughput experimental data. The donor cells and different hiPSC lines are summarized in Table 1.
Microarray analysis
We imported datasets from GEO into GeneSpring GX 11.0 using a guided workflow step to identify potential targets that were both statistically and biologically meaningful. Probe sets with gene-level normalized intensities greater than log (base 2) of 5.0 in at least one sample were excluded from ANOVA. The data were then filtered based on their flag values (P – present and A – absent) to remove probe sets for which the signal intensities for all the treatment groups were in the lowest 20th percentile of all intensity values. ANOVA in conjunction with the Benjamini-Hochberg FDR multiple test correction was used to identify genes that were differentially expressed between different groups. The level of significance was set at p < 0.05.
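The Benjamini-Hochberg correction mentioned above can be illustrated with a minimal sketch. This is not the GeneSpring implementation; the function name `benjamini_hochberg` and the p-values are hypothetical and serve only to show how adjusted values are obtained and compared with the 0.05 threshold.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-probe ANOVA p-values (illustrative only).
raw = [0.001, 0.008, 0.039, 0.041, 0.6]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
print(adj, significant)
```

In this toy input, two probes with raw p-values just below 0.05 no longer pass after correction, which is the intended effect of controlling the false discovery rate.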
Gene ontology (GO) annotation and pathway analysis
The functions of up- or down-regulated genes in iPSCs vs. somatic cells were investigated by using the Database for Annotation, Visualization and Integrated Discovery (DAVID) v 6.7 (Huang ) based on gene ontology (GO) (Ashburner ) annotations. In addition, groups of genes associated with specific pathways (based on the Kyoto Encyclopedia of Genes and Genomes – KEGG) were analyzed together to assess pathway regulation during reprogramming.
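As a rough illustration of what an over-representation test measures, the sketch below computes a plain hypergeometric upper-tail probability. DAVID reports a modified Fisher exact statistic (the EASE score), so this is a simplified stand-in rather than DAVID's calculation; the function name and the counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes in a list of n,
    when K of the N background genes carry the annotation."""
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Hypothetical counts: 20 background genes, 5 annotated, 4 in the gene list,
# 2 of which carry the annotation.
p = hypergeom_enrichment_p(k=2, n=4, K=5, N=20)
print(round(p, 4))
```

A term or pathway would be called over-represented when this probability falls below the chosen cut-off.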
Network analysis
We investigated the possible functional associations between the top 484 significantly up-regulated genes in iPSCs compared with fibroblasts using the STRING database (STRING score of at least 0.5) (von Mering ). Gene networks in which the interacting partners were supported with high confidence were visualized using MEDUSA (Hooper and Bork, 2005).
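The score threshold described above amounts to a simple edge filter. The following sketch assumes interactions are available as (gene, gene, score) tuples with scores on a 0 to 1 scale; the edge list, the scores and the placeholder name `GENE_X` are hypothetical, and no STRING API is called.

```python
def high_confidence_network(edges, min_score=0.5):
    """Keep edges at or above min_score and return them with their node set."""
    kept = [(a, b, s) for a, b, s in edges if s >= min_score]
    nodes = {gene for a, b, _ in kept for gene in (a, b)}
    return kept, nodes


# Hypothetical interaction scores (illustrative only).
edges = [
    ("Pou5f1", "Nanog", 0.99),
    ("Nanog", "Lin28", 0.72),
    ("Dppa4", "GENE_X", 0.31),  # falls below the threshold and is dropped
]
kept, nodes = high_confidence_network(edges)
print(len(kept), sorted(nodes))
```

The fraction of input genes that remain in `nodes` corresponds to the share of genes reported as interacting with each other.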
Results
Comparative global transcriptomic analysis of iPSCs and ESCs
Figure 1 provides a flowchart of the global transcriptomic analysis. Reprogramming methods can be divided into two classes, i.e., those that are integration-free (including synthetic modified mRNA, episomes, proteins and minicircles) and those involving the integration of exogenous transcription factors (lentiviral and retroviral methods and inducible reprogramming systems). Most (75%) of the iPSCs analyzed in this study were derived from fibroblasts. ANOVA was used to determine the degree of reprogramming within hiPSCs derived using different methods of induction and transcription factors, and to examine the “distance”, i.e., the number of differentially expressed genes (based on cut-off criteria of p < 0.05 and a fold-change ≥ 2), among hESCs, hiPSCs and their corresponding donor cells (Figure 1). To eliminate the influence of micro-environmental factors associated with different laboratories and the genetic background of donor cells, the differentially expressed genes were identified by comparing iPSCs and ESCs derived from the same laboratory and from donors of the same sex (Table 1). Table S1 (Supplementary material) provides a detailed list of the genes that were differentially expressed between iPSCs and ESCs.
Figure 1
A schematic overview of the approach used in this study. The microarray data for ESCs, iPSCs and their donor cells were obtained from the GEO database. Comparison of the gene expression signature between ESCs and iPSCs showed that the characteristics of the reprogramming varied according to the strategy used. Likewise, comparison of the gene expression signature between iPSCs and original donor cells provided insights into the process of induced reprogramming.
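The “distance” between two cell types, defined here as the number of differentially expressed genes passing both the p-value and the fold-change cut-offs, can be sketched as a simple count. The input layout (gene name, two group means on a linear intensity scale, and a p-value) and all values are assumptions for illustration; the actual analysis was run in GeneSpring.

```python
from math import log2


def transcriptional_distance(genes, p_cut=0.05, fold_cut=2.0):
    """Count genes with p < p_cut and at least fold_cut change in either direction."""
    threshold = log2(fold_cut)
    count = 0
    for _name, mean_a, mean_b, p_value in genes:
        if p_value < p_cut and abs(log2(mean_a / mean_b)) >= threshold:
            count += 1
    return count


# Hypothetical probe-level summaries: (gene, mean in iPSCs, mean in ESCs, p-value).
genes = [
    ("G1", 400.0, 100.0, 0.001),  # 4-fold up, significant: counted
    ("G2", 150.0, 100.0, 0.001),  # only 1.5-fold: not counted
    ("G3", 100.0, 400.0, 0.200),  # 4-fold down but not significant: not counted
    ("G4", 50.0, 200.0, 0.010),   # 4-fold down, significant: counted
]
distance = transcriptional_distance(genes)
print(distance)
```

A smaller count for an iPSC line against its matched ESC line is read as a transcriptome closer to the ESC state.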
We also analyzed the relationship between the “distance” of iPSCs vs. ESCs and the method used to deliver the transcription factor(s). iPSCs generated by integrating viral vectors (Moloney-based retrovirus and HIV-based lentivirus) were not as close to ESC lines as iPSCs generated by non-integrating methods (episomes, synthetic modified mRNA, proteins and minicircle DNA) (Figure 2A). The type of transcription factor used had little impact on the gene expression signature of iPSCs (Figure 2B). No overlapping genes were differentially expressed between hESCs and hiPSCs derived from various reprogramming experiments, i.e., there were no consistent differences in the global gene expression between human ESCs and iPSCs. These findings supported the idea that reprogramming progressed through a series of stochastic events to produce pluripotency.
Figure 2
The transcriptional signature of iPSCs from different laboratories using different methods of induction and different transcription factors. (A) iPSCs generated by integrating viral vectors were less closely related to ESCs than iPSCs generated by a non-integrating method. (B) The choice of transcription factor (OS, OSK, OSNL, OSKMN and OSKM) did not significantly affect the transcriptional profile of iPSCs. K: Klf4, L: Lin28, M: c-Myc, N: Nanog, O: Oct4, S: Sox2. *p < 0.05 and **p < 0.001 compared to the retroviral method.
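The comparison between delivery classes can be retraced from the iPSCs-ESCs column of Table 1. The sketch below only groups those counts by class and reports the group means; it is not the statistical test behind Figure 2, and the grouping follows the classification given in the text (episomes, minicircle DNA, proteins and mRNA as non-integrating; retroviral and inducible systems as integrating).

```python
from statistics import mean

NON_INTEGRATING = {"Episome", "Minicircle DNA", "Proteins", "mRNA"}

# (method of induction, iPSCs-ESCs count), in the row order of Table 1.
rows = [
    ("Episome", 9507), ("Retroviral", 1424), ("Episome", 8823),
    ("Episome", 1843), ("Minicircle DNA", 7668), ("Retroviral", 10048),
    ("Inducible system", 6209), ("Proteins", 1610), ("Retroviral", 8658),
    ("Retroviral", 11159), ("Retroviral", 14523), ("Retroviral", 8658),
    ("mRNA", 9125), ("Retroviral", 2188), ("mRNA", 10173),
    ("mRNA", 10445), ("Retroviral", 2920), ("Retroviral", 15414),
    ("Retroviral", 14523), ("Inducible system", 6159),
]

non_integrating = [n for method, n in rows if method in NON_INTEGRATING]
integrating = [n for method, n in rows if method not in NON_INTEGRATING]

mean_non_integrating = mean(non_integrating)
mean_integrating = mean(integrating)
print(len(non_integrating), mean_non_integrating)
print(len(integrating), mean_integrating)
```

The spread within each class is wide, so group means alone should be read with caution.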
Functional analysis of significantly altered genes between iPSCs and donor cells
The detailed molecular events involved in reprogramming to produce iPSCs remain largely unknown. To address this issue, we undertook an in-depth analysis of the biological functions of differentially expressed genes in all 20 iPSC lines vs. donor fibroblasts; the selection criteria were again p < 0.05 (Student’s t-test) and at least a two-fold difference in gene expression. Table 1 summarizes the number of differentially expressed genes between the iPSC lines and the original cell lines. Of these, 312 genes were up-regulated in every iPSC line compared with fibroblasts (Table S2). We defined these 312 up-regulated probes as essential for maintaining the pluripotency of hiPSCs (EMP genes). The STRING database was used to visualize all known functional interactions between EMP genes in iPSC lines using the default cutoff suggested by STRING. One hundred and fifty-nine genes in this set (32%) interacted with each other (Figure 3). The functional network of genes with higher expression levels in iPSCs showed a central, highly interconnected area in which common pluripotency regulators such as Pou5f1, Nanog, Lin28, Dnmt3 and Dppa4 were identified. This finding indicated that hiPSCs and hESCs shared a similar core network to maintain pluripotency. The absence of Sox2 in this analysis reflects the fact that Marchetto et al. (2009) used mouse neural stem cells (NSCs), which have a high endogenous expression of Sox2, as the donor cell lines to induce reprogramming. Hence, Sox2 was not included in the 312 genes up-regulated in iPSCs. This protein interaction network for pluripotency provides a model for exploring new factors that may enhance the induction of reprogramming.
Figure 3
Predicted stem-cell-specific protein-protein interaction network of genes with higher expression levels in iPSCs compared to somatic cells.
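Requiring a gene to be up-regulated in every iPSC line is a set intersection across lines. The sketch below uses hypothetical line names and gene lists; it also shows how a gene that is already highly expressed in one donor cell type, as described for Sox2, drops out of the shared set.

```python
def shared_upregulated(up_by_line):
    """Return the genes up-regulated in every line of the mapping."""
    gene_sets = [set(genes) for genes in up_by_line.values()]
    return set.intersection(*gene_sets) if gene_sets else set()


# Hypothetical up-regulated gene lists per iPSC line (illustrative only).
up_by_line = {
    "line_A": ["Pou5f1", "Nanog", "Sox2", "Lin28"],
    "line_B": ["Pou5f1", "Nanog", "Sox2"],
    "line_C": ["Pou5f1", "Nanog", "Lin28"],  # Sox2 not up-regulated vs. its donor cells
}
emp_like = shared_upregulated(up_by_line)
print(sorted(emp_like))
```

Because the intersection shrinks with each added line, the resulting list is conservative and depends on which donor cell types are included.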
We took advantage of a recently published microarray dataset (Xie ) to study the dynamic changes in EMP genes during mammalian preimplantation embryonic development (Table S3). One hundred and twenty EMP genes, including Pou5f1, Dppa4 and Lin28, were up-regulated during the transitional phase from the four-cell stage to the eight-cell stage of human early embryonic development, known as the human zygotic genome activation period (Hoffert ) (Figure 4). This pluripotent network, which is essential for maintaining the self-renewal of iPSCs, also plays a pivotal role in establishing embryos in vivo. The 101 EMP genes that were down-regulated during the process could contribute to the differentiation of stem cells in vivo and in vitro.
Figure 4
The expression trends of 312 transcripts (EMP genes) at different stages of human preimplantation embryonic development. One hundred and twenty transcripts were up-regulated and 101 transcripts were down-regulated, the latter mainly from the four-cell stage to the eight-cell stage. Red represents up-regulated genes and green represents down-regulated genes.
The functions associated with genes that were significantly altered in reprogramming were examined by analyzing the over-represented annotations and pathways using DAVID, with a cut-off criterion of p < 0.01. The over-represented GO terms focused on “regulation of transcription” and “regulation of cell proliferation” (Table S4). The results of this analysis supported the idea that an increase in proliferation rate was necessary for full cellular reprogramming (Smith ).
We also tested whether particular pathways were enriched among the significantly altered genes. The results showed that hiPSCs were responsive to the TGF-β signaling pathway that regulates the maintenance of pluripotency, self-renewal and proliferation of hESCs (Table S4). These results indicated that hiPSCs reprogrammed from somatic or embryonic cells relied on similar signaling pathways to control their pluripotency.
Discussion
The results described herein show that the overall transcriptional profiles of different human iPSC lines shared a common “signature” with hESCs, although there were certain differences. Notably, the transcriptomes of hiPSCs produced by a delivery method that avoided genomic integration shared more of their gene expression signature with hESCs than did those of iPSCs produced by a virus-based method. Gene-delivery methods can affect the quality of the resulting iPSCs by influencing the amount, balance, continuity and silencing of transgene expression. Potent oncogenes such as c-Myc apparently have little effect on the transcriptional signature of iPSCs. Our findings provide a basis for selecting the most suitable method for clinical or basic applications and a better understanding of the reprogramming process.
This study also improves our understanding of the mechanisms of cellular reprogramming. The transcriptional network that maintains the self-renewal and pluripotency of iPSCs is established primarily during preimplantation development, at the stage of zygotic genome activation. Detailed analysis showed that increased proliferation and the up-regulation of genes that drive the cell cycle are necessary events for fibroblast reprogramming. Recent reports have shown that hiPSCs are more tumorigenic than hESCs based on a comparison of protein-coding point mutations (Gore ), copy number variations (Hussein ) and DNA methylation (Lister ). Together, these results stress the link between pluripotency and tumorigenicity. Given that self-renewal is a hallmark of ESCs and cancer cells, the ability to induce tumors during cellular reprogramming implies that there are potential risks involved in the use of iPSCs for regenerative therapy.
In addition, non-coding RNA, including microRNA (miRNA) and large intergenic non-coding RNA (lincRNA), which may represent a distinct layer that fine-tunes the transcriptional network of stem cells, has a role in modulating the induction of reprogramming (Judson ; Loewer ). Significantly, recent work has shown that a single miRNA cluster rapidly reprogrammed mouse and human fibroblasts into iPSCs and totally avoided the use of transcription factors (Anokye-Danso F ). The mechanism underlying reprogramming by miRNA differs from that of transcription factor-induced reprogramming in that there is no requirement for protein translation; the former method also targets hundreds of ESC-related mRNAs directly.
In conclusion, we have examined the gene expression profiles of iPSCs obtained by different methods and from donor cells of different origins. iPSCs produced by non-integrative methods more closely resembled the fully reprogrammed pluripotent state than did iPSCs obtained by using integrative delivery systems, although the efficiency and kinetics were lower. Some of the results described here may reflect the markedly different circumstances in which the cell lines were generated, e.g., the culture conditions, the passage number at which the cells were used and the age of the donor cells. Another limitation in our analysis was that only the initial state (donor cell) and end state (pluripotent cell) of reprogramming were examined.
Further research on each aspect of reprogramming, e.g., the initial transcriptional response to the induction of reprogramming, the epigenetic roadblocks, the partially pluripotent state and the late events leading to pluripotency, is required in order to understand how reprogramming leads to pluripotency. A comprehensive understanding of the events involved in reprogramming a set of iPSCs can only be reached by examining the changes in the corresponding transcriptome (protein coding RNA, microRNA and lincRNA expression), epigenome (genomic imprinting, X chromosome activation, histone modifications and DNA methylation), metabolome and proteome.
Authors: M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock Journal: Nat Genet Date: 2000-05 Impact factor: 38.330
Authors: Samer M Hussein; Nizar N Batada; Sanna Vuoristo; Reagan W Ching; Reija Autio; Elisa Närvä; Siemon Ng; Michel Sourour; Riikka Hämäläinen; Cia Olsson; Karolina Lundin; Milla Mikkola; Ras Trokovic; Michael Peitz; Oliver Brüstle; David P Bazett-Jones; Kari Alitalo; Riitta Lahesmaa; Andras Nagy; Timo Otonkoski Journal: Nature Date: 2011-03-03 Impact factor: 49.962
Authors: Sabine Loewer; Moran N Cabili; Mitchell Guttman; Yuin-Han Loh; Kelly Thomas; In Hyun Park; Manuel Garber; Matthew Curran; Tamer Onder; Suneet Agarwal; Philip D Manos; Sumon Datta; Eric S Lander; Thorsten M Schlaeger; George Q Daley; John L Rinn Journal: Nat Genet Date: 2010-11-07 Impact factor: 38.330
Authors: Luigi Warren; Philip D Manos; Tim Ahfeldt; Yuin-Han Loh; Hu Li; Frank Lau; Wataru Ebina; Pankaj K Mandal; Zachary D Smith; Alexander Meissner; George Q Daley; Andrew S Brack; James J Collins; Chad Cowan; Thorsten M Schlaeger; Derrick J Rossi Journal: Cell Stem Cell Date: 2010-09-30 Impact factor: 24.633
Authors: Athurva Gore; Zhe Li; Ho-Lim Fung; Jessica E Young; Suneet Agarwal; Jessica Antosiewicz-Bourget; Isabel Canto; Alessandra Giorgetti; Mason A Israel; Evangelos Kiskinis; Je-Hyuk Lee; Yuin-Han Loh; Philip D Manos; Nuria Montserrat; Athanasia D Panopoulos; Sergio Ruiz; Melissa L Wilbert; Junying Yu; Ewen F Kirkness; Juan Carlos Izpisua Belmonte; Derrick J Rossi; James A Thomson; Kevin Eggan; George Q Daley; Lawrence S B Goldstein; Kun Zhang Journal: Nature Date: 2011-03-03 Impact factor: 49.962
Authors: Vivek Shukla; Mahadev Rao; Hongen Zhang; Jeanette Beers; Darawalee Wangsa; Danny Wangsa; Floryne O Buishand; Yonghong Wang; Zhiya Yu; Holly S Stevenson; Emily S Reardon; Kaitlin C McLoughlin; Andrew S Kaufman; Eden C Payabyab; Julie A Hong; Mary Zhang; Sean Davis; Daniel Edelman; Guokai Chen; Markku M Miettinen; Nicholas P Restifo; Thomas Ried; Paul A Meltzer; David S Schrump Journal: Cancer Res Date: 2017-09-21 Impact factor: 12.701