| Literature DB >> 31560727 |
Michael Shelton1, Morten Ritso2,3, Jun Liu1, Daniel O'Neil1,4, Avetik Kocharyan1, Michael A Rudnicki2,3,5, William L Stanford1,2,3,4, Ilona S Skerjanc1, Alexandre Blais1,4.
Abstract
Human embryonic stem cell (hESC)-derived skeletal muscle progenitors (SMP)-defined as PAX7-expressing cells with myogenic potential-can provide an abundant source of donor material for muscle stem cell therapy. As in vitro myogenesis is decoupled from in vivo timing and 3D-embryo structure, it is important to characterize what stage or type of muscle is modeled in culture. Here, gene expression profiling is analyzed in hESCs over a 50 day skeletal myogenesis protocol and compared to datasets of other hESC-derived skeletal muscle and adult murine satellite cells. Furthermore, day 2 cultures differentiated with high or lower concentrations of CHIR99021, a GSK3A/GSK3B inhibitor, were contrasted. Expression profiling of the 50 day time course identified successively expressed gene subsets involved in mesoderm/paraxial mesoderm induction, somitogenesis, and skeletal muscle commitment/formation which could be regulated by a putative cascade of transcription factors. Initiating differentiation with higher CHIR99021 concentrations significantly increased expression of MSGN1 and TGFB-superfamily genes, notably NODAL, resulting in enhanced paraxial mesoderm and reduced ectoderm/neuronal gene expression. Comparison to adult satellite cells revealed that genes expressed in 50-day cultures correlated better with those expressed by quiescent or early activated satellite cells, which have the greatest therapeutic potential. Day 50 cultures were similar to other hESC-derived skeletal muscle and both expressed known and novel SMP surface proteins. Overall, a putative cascade of transcription factors has been identified which regulates four stages of myogenesis. Subsets of these factors were upregulated by high CHIR99021 or their binding sites were significantly over-represented during SMP activation, ranging from quiescent to late-activated stages. This analysis serves as a resource to further study the progression of in vitro skeletal myogenesis and could be mined to identify novel markers of pluripotent-derived SMPs or regulatory transcription/growth factors. Finally, 50-day hESC-derived SMPs appear similar to quiescent/early activated satellite cells, suggesting they possess therapeutic potential.Entities:
Year: 2019 PMID: 31560727 PMCID: PMC6764674 DOI: 10.1371/journal.pone.0222946
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1Gene clustering identifies up- and down-regulated groups that control development, pluripotency, and cell division.
Pluripotent H9 hESCs were differentiated for 50 days as previously described [12], beginning with high (7.5 μM or 10 μM) CHIR99021 concentrations. mRNA samples from the indicated time points were subjected to gene expression profiling. Gene clustering based on expression profiles generated 13 clusters of similarly expressed probes across the differentiation time course. Similarly patterned clusters were manually grouped for further analysis, and clusters were divided into (A) genes whose expression increased with differentiation, and (B) genes whose expression decreased. A full list of gene identities within each cluster can be found in S1 Table.
Gene Ontology characterization of gene clusters shows progressive development through the expected stages of skeletal myogenesis, and progressive drop of pluripotency and cell cycling genes expected throughout differentiation.
| Cluster | # of Genes | GO: Biological Process | q-value | Genes / Families | Soleimani | Lilja |
|---|---|---|---|---|---|---|
| GO:0009952—Anterior/posterior specification | 5.0 E-11 | FGF8, MSGN1, WNT8A | 4 (1.36%) | 24 (1.84%) | ||
| 1 | 275 | GO:0061053—Somite development | 3.6 E-05 | DLL3, FOXC1, LFNG | ||
| GO:0007498—Mesoderm Development | 1.8 E-02 | MIXL1, T, TBX6 | ||||
| GO:0048598—Embryonic morphogenesis | 2.5 E-23 | HOXs, PDGFRA, PAX3 | 13 (4.42%) | 54 (4.15%) | ||
| 2 | 395 | GO:0030509—BMP signaling pathway | 8.6 E-06 | BMP1/4/6/7, MSX1/2 | ||
| GO:0016055—WNT signaling pathway | 4.2 E-04 | DKK1, RSPO3, WNT3A/5A/5B | ||||
| GO:0030198—ECM organization | 2.1 E-23 | COLs, LAMs, MMPs | 81 (27.6%) | 237 (18.2%) | ||
| 3 | 2067 | GO:0061061—Muscle structure development | 1.7 E-12 | DMD, MEOX1/2, NCAM1 | ||
| GO:0006811—Ion transport | 9.2 E-03 | CACNs, KCNs, SLCs | ||||
| GO:0006936—Muscle contraction | 9.7 E-27 | CHRNs, MYH/Ls, TNNC/Ts | 40 (13.6%) | 68 (5.22%) | ||
| 4 | 736 | GO:0014706—Striated muscle development | 7.6 E-24 | MSTN, MRFs, PAX7 | ||
| GO:0014902—Myotube differentiation | 6.8 E-07 | MEF2C, MYOD1, MYOG | ||||
| GO:0007155—Cell adhesion | 1.2 E-04 | CDH7, ITGA1/B2, PECAM1 | 7 (2.38%) | 22 (1.69%) | ||
| 5 + 6 | 267 | GO:0006955—Immune response | 2.6 E-04 | C1Qs, FCGRs, IL6ST/7 | ||
| GO:0006811—Ion transport | 6.1 E-02 | CCLs, CYBB, SLCs | ||||
| Ratio Stem Cell vs. Induced-Mesoderm: | 6 (2.04%) | 21 (1.61%) | ||||
| 7 | 331 | Top 500 Ranked by Relative-expression | 8.9 E-32 | FGF2, NANOG, SOX2 | ||
| PCBC_ratio_SC_vs_MESO-5_500 | ||||||
| Ratio Stem Cell vs. Induced-Mesoderm: | 4 (1.36%) | 13 (1.00%) | ||||
| 8 | 310 | Top 500 Ranked by Relative-expression | 7.2 E-08 | ENHO, FGF19, POU5F1 | ||
| PCBC_ratio_SC_vs_MESO-5_500 | ||||||
| GO:0006928—Cell / subcellular movement | 2.2 E-06 | KIFs, LMNA, PODXL/2 | 19 (6.46%) | 54 (4.15%) | ||
| 9 + 10 | 406 | GO:0048729—Tissue morphogenesis | 1.9 E-05 | GLI2, KDR, PDGFA/B | ||
| GO:0048870 –Cell motility | 2.1 E-04 | ARC, EPCAM, MMPs | ||||
| GO:0007049—Cell cycle | 4.9 E-40 | CDCs, CDKs, KIFs | 11 (3.74%) | 70 (5.36%) | ||
| 11 + 12 | 1394 | GO:0006260—DNA replication | 5.9 E-25 | BRAC1/2, MCMs, PRIM1/2 | ||
| GO:0006396—RNA processing | 9.6 E-15 | PRMTs, RRPs, SRSFs | ||||
| GO:0050000—Chromosome localization | 4.4 E-07 | CENPs, CEP55, KIFs | 0 (0%) | 5 (0.38%) | ||
| 13 | 97 | GO:0051276—Chromosome organization | 3.1 E-06 | NUF2, PRMT6, REC8 | ||
| GO:0071103—DNA conformation change | 4.1 E-02 | ERN2, HIST1Hs, HIST2Hs |
The gene clusters generated in Fig 1A and 1B were passed through ToppFun functional enrichment analysis (p < 0.05, FDR < 0.05). Representative categories of GO:Biological Process were selected for each cluster, as well as representative genes or gene families within each GO category. Where clusters 7 and 8 contained little or no significant GO categories, these clusters were significantly enriched for genes more highly expressed in pluripotent stem cells relative to induced mesoderm, according to the Progenitor Cell Biology Consortium (PCBC) expression atlas [32]. Genes from each cluster were compared against a list of genes known to be differentially expressed in murine myoblasts with Pax3- or Pax7-over expression [33], and genes whose promoters are directly bound by Pax7 [34]. The percentage of either lists’ genes that were represented within each cluster was determined, and the hypergeometric probability of each result was calculated (p(X) ≥ n). Clusters 2–4, and 9 + 10 were notably enriched for a significant number of Pax7 regulated genes.
Transcription factor binding site analysis of gene clusters identified several known developmental myogenesis factors, suggesting co-identified and previously unstudied factors may also play important roles.
| Cluster | # of Genes | Transcription Factor Binding Sites (Z-score) |
|---|---|---|
| 1 | 275 | LEF1 (21.2), TCF7L2 (19.9), |
| 2 | 395 | |
| 3 | 2067 | |
| 4 | 736 | |
| 5 + 6 | 267 | ARID3A (23.3), PRRX2 (18.6), HOXA5 (18.4), FOXD3 (14.8), SOX5 (14.5), NR2F1 (11.3), |
| 7 | 331 | KLF4 (14.2), SOX2 (11.2), POU5F1 (10.7), ZEB1 (5.1) |
| 8 | 310 | NHLH1 (10.3), INSM1 (7.3), PPARG (7.1), EBF1 (5.6) |
| 9 + 10 | 406 | PRRX2 (11.4), POU5F1 (11.2), HOXA5 (10.7), ARID3A (8.3), |
| 11 + 12 | 1394 | PPARG (15.6), |
| 13 | 97 | NR3C1 (8.1), NR2F1 (6.3) |
Gene clusters from Fig 1A and 1B were also analyzed to identify significant TFBSs. Significantly enriched motifs were listed for each cluster group (Z-score > 5). Only motifs whose parent transcription factor were represented in the clustering analysis were listed. Motifs whose parent transcription factor was contained within the same cluster are underlined. For clusters 1–4 only: motifs whose parent transcription factor fell within the previous cluster are italicized. The significantly enriched TAL1::TCF3 motifs in the two earliest clusters have been manually attributed to MSGN1, owing to the two sites’ similarity [39]. The complete list of TFBS results can be found in S2A–S2L Table.
Fig 2The highest tolerable CHIR99021 concentrations induce significantly more paraxial mesoderm gene expression than 3 μM.
Pluripotent H9 hESCs were differentiated with either 7.5 μM (High) or 3 μM (Low) CHIR99021 for 2 days, and with DMSO as a vehicle control. (A) At day 2, cells were fixed and stained with antibodies against T (Red) and Hoechst dye (Blue) to visualize cell nuclei (Scale bar = 20 μm). (B) Volocity 6.0 software was used to count the average number of cells and the proportion of T+ cells with high or low CHIR99021 treatment from 10 randomly chosen fields. High CHIR99021-treated cultures contained significantly fewer total cells than low CHIR99021-treated (n = 3, **p < 0.05) according to one-way ANOVA with post-hoc Tukey test (n = 3, p < 0.05). (C) Day 2 mRNA samples from high and low CHIR99021 treated cultures were subjected to gene expression profiling. Expander 7.0 t-test identified 835 probes that were significantly upregulated at day 2 in both the high and low CHIR99021 conditions, relative to day 0 (n = 3, p < 0.05, FDR < 0.05). High and low CHIR99021 conditions showed 167 and 181 uniquely expressed probes, respectively. (D) Select genes identified in (C) were validated by qPCR. The qPCR results were expressed as a fold-change on the primary Y-axis, and the microarray results on the secondary Y-axis, both relative to day 0 control (n = 3, *p < 0.05). Several genes—including MSGN1 and NODAL—were significantly more expressed in high CHIR99021-treated cultures according to Expander 7.0 t-test of the microarray, and one-way ANOVA with post-hoc Tukey test of the qPCR (n = 3, **p < 0.05). Expander 7.0 t-test was also used to identify 67 and 88 additional probes that showed significantly differential expression between a direct comparison of high and low CHIR99021 conditions (n = 3, p < 0.05). The resulting genes—along with the 167 & 181 uniquely expressed probes from (C)—were classified using PANTHER to identify growth factors, signaling molecules, receptors, and transcriptional regulators. Heat maps were generated for the top 20 differentially expressed genes with (E) more expression in high CHIR99021-treatment, and (F) more expression in low CHIR99021-treatment. Individual replicates are shown for each condition. Replicate expression values were standardized to the average of the day 0 triplicates.
TGFB signaling Gene Ontology categories are significantly enriched among genes more highly expressed is high CHIR99021 conditions.
| GO Category | q-value (E-02) | Genes in Query | Example Genes |
|---|---|---|---|
| TGFB-receptor binding | 0.88 | 5 | BMP6, BMP7, GDNF, TGFB2, TGFB3 |
| Epithelial cell differentiation | 0.80 | 16 | ABCB1, BMP6, BMP7, CES1, DMBT1, FSHR, GDNF, HOXB13, HPS1, LHX1, NKX6-1, NTRK3, OCA2, SIX1, TGFB2, WNT4 |
| Embryonic skeletal system morphogenesis | 0.97 | 7 | BMP7, HYAL1, LHX1, RUNX2, SIX1, TGFB2, TGFB3 |
| Cellular response to endogenous stimulus | 2.08 | 22 | BMP6, BMP7, FSHR, GDNF, IGFBP5, LHX1, NKX6-1, RUNX1, RUNX2, SSTR2, SSTR5, TGFB2, TGFB3, WNT4 |
| Response to endogenous stimulus | 3.37 | 26 | ATP2A1, BMP6, BMP7, FSHR, GDNF, HOXB13, IGFBP5, KALRN, LHX1, |
| Embryonic cranial skeletal morphogenesis | 4.26 | 5 | LHX1, RUNX2, SIX1, TGFB2, TGFB3 |
Probes from Fig 2C were passed through ToppFun functional enrichment analysis (p < 0.05, FDR < 0.05). A complete list of significant GO:Molecular Function and GO:Biological Process categories and select representative genes were tabulated.
Neuronal differentiation Gene Ontology categories are significantly enriched among genes more highly expressed is low CHIR99021 conditions.
| GO Category | q-value (E-02) | Genes in Query | Example Genes |
|---|---|---|---|
| Neuron differentiation | 2.10 | 26 | CDH4, COL25A1, DGKG, EPB41L3, FZD1, ISPD, NIN, NRXN3, POU3F2, PTCH1, RAPH1, RNF165, RUNX3, SARM1, SATB2, SLITRK5, TCF4 |
| Positive regulation of developmental process | 2.46 | 25 | ACVRL1, CCR7, CDH4, FGF1, GPER1, NIN, NRXN3, PLCB1, POU3F2, PTCH1, PTGER4, RAG1, SLITRK5, SMAD9, TCF4, TMEM119 |
| Generation of neurons | 3.55 | 27 | CDH4, COL25A1, DGKG, EPB41L3, FZD1, GPER1, ISPD, NIN, NRXN3, POU3F2, PTCH1, RAPH1, RNF165, RUNX3, SARM1, SATB2, SLITRK5, TCF4 |
| Positive regulation of lipid metabolic process | 4.41 | 8 | ACSL6, CCR7, DAB2, FGF1, IRS1, PPARA, PRKCE, SIRT4 |
Probes from Fig 2C were passed through ToppFun functional enrichment analysis (p < 0.05, FDR < 0.05). A complete list of significant GO:Molecular Function and GO:Biological Process categories and select representative genes were tabulated.
Fig 3Day 50 cultures share elevated FOS/JUN, NOTCH, and TGFB-signaling with early activated satellite cells.
The mean day 50 expression of individual genes were standardized to the mean of day 0 samples, while the mean expression of genes from (A) in vitro-cultured [70] or (B) in vivo-derived [8] early activated satellite cells were standardized to the mean of their respective late activated satellite cell samples (n = 3). Gene lists were generated from the four plot quadrants of (A) and (B) given an absolute 4-fold change cut-off. Genes from (C) quadrant I, and (D) quadrant II were classified using PANTHER to identify growth factors, signaling molecules, receptors, and transcriptional regulators. Day 50 cultures shared more genes in common with early- than late-activated satellite cells, including FOS/JUN, NOTCH, and TGFB-signaling genes, along with other notable genes BMP4, CEBPB, DMD, IGF1, and PAX7. Day 50 cultures and late activated satellite cells shared more structural and ion channel genes, such as LAMB1/3, MYH3, TTN, and the Cholinergic Receptor Nicotinic (CHRN) gene gamily. Quadrant gene lists were also analyzed to identify significantly enriched TFBS. Only motifs whose parent transcription factor was contained within the same quadrant were considered, given a 2-fold change cut-off. Significant motifs from in vitro-cultured satellite cells are shown for (E) quadrant I, (F) quadrant II, and (G) quadrant III; quadrant IV yielded no significant results. Motifs that are highlighted red are enriched in both in vitro-cultured (A) and in vivo-derived (B) satellite cell quadrants.
Fig 4Day 50 cultures shared expression of previously considered quiescence markers, namely EGR and FOS/JUN genes.
(A) A list of genes expressed in satellite cells from Fig 3D of Machado et al. 2017 was chosen to compare with the van Velthoven et al. 2017 study [72], and with day 50 hESC-derived myogenic cultures. Known quiescence genes SPRY1 and CALCR were also included. Genes (rows) were clustered based on their expression pattern across all studies. Analysis showed that day 50 cultures cluster more closely with quiescent cells (T0, Fixed) regarding NOTCH and HOX gene expression, but closer with early activated cells (T3, T5, Non-fixed) when comparing EGR and FOS/JUN gene families. The genes from Fig 3D of Machado et al. 2017 were also analyzed to identify significantly enriched TFBSs in genes that were more highly expressed in (B) quiescent, and (C) early activated satellite cells (Z > 5). The PLAG1 motif (red text) was significantly enriched in both quiescent and early activated gene sets.
Fig 5hESC-derived cultures show expression profiles resembling quiescent/early activated satellite cells, and novel/known SMP surface markers.
(A) The mean expression of individual genes from [12] and [24] were standardized to the mean of their respective day 0 samples, and gene lists were generated from the four plot quadrants given an absolute 4-fold change cut-off. (B) Genes were filtered to select only those with at least 4-fold change in expression on either array relative to their respective day 0 samples, and a heat map of individual replicates showed fairly congruent expression profiles. The specific identities and expression values for each gene can be found in S4 Table. (C) A list of key sarcomeric structural genes was adapted from (85). Day 50 cultures exhibited more fetal-like expression levels of MYH7, MYL3, MYLK and several Z-disc genes, while day 30′ cells appeared to express the tropomyosins TPM2/TPM3 at levels more similar to fetal muscle. (D) Genes from quadrant I were separated using PANTHER to identify growth factors, signaling molecules, receptors, and transcriptional regulators. A heat map of individual replicates for select genes were shown for each condition. Day 50 and day 30′ cultures share overlapping gene expression similar to that of quiescent satellite cells, including IGF1, PAX7, FOS/JUN and TGFB-signaling genes. (E) Quadrant I and (F) quadrant III gene lists were also analyzed to identify significantly enriched TFBSs (Z > 5). Only motifs whose parent transcription factor was contained within the same quadrant were considered, given 2-fold change cut-off.
Most surface proteins shared between day 50 and day 30ʹ cultures were also expressed in human fetal muscle.
| Category | Total Genes in Category | Genes Shared with Fetal | Genes |
|---|---|---|---|
| Day 50 Only | 272 | 49 | A1BG, ACVRL1, ADRA1B, AGRN, |
| Quadrant I | 99 | 77 | ACVR2A, |
| Day 30′ Only | 18 | 11 |
PANTHER was used to identify receptors, channels, and cell adhesion molecules within genes shared between day 50 and day 30ʹ cultures, or genes unique to either dataset. Several cluster of differentiation (CD), integrin, and G-protein coupled receptors not recognized by PANTHER were manually added. Genes that were also expressed in the Choi et al. fetal muscle sample were bolded. Several surface markers known to identify embryonic skeletal muscle progenitors or satellite cells—CXCR4, ERBB3, ICAM1, NCAM1, VCAM1—were shared between both hESC-derived cultures while some surface markers, however, were only elevated in Day 50 cultures: CD82 and NGFR.
Fig 6Putative cascade of transcription factors that regulated four stages of in vitro skeletal muscle development.
Key transcription factors were taken from the TFBS analysis in Table 2; only motifs whose parent transcription factor was contained within the same or in the previous cluster were included. The colored arrows indicate whether a transcription factor may be regulating genes within its own cluster (red arrow), within the subsequent cluster (blue arrow), or both (green arrow) according to TFBS enrichment analysis. The significantly enriched TAL1::TCF3 motifs in the two earliest clusters have been manually attributed to MSGN1, owing to the two sites’ similarity [39]. Factors more notably expressed in cultures differentiated with high versus low CHIR99021 concentrations—T and MSNG1—are outlined (yellow box). Significantly enriched TFBSs in the promoters of genes shared between hESC-derived cultures and various satellite cell gene sets are also outlined: early activated (red box), late activated (blue box), and quiescent (green box). Hypergeometric probability analysis determined significant enrichment of PAX3/7 Soleimani et al. and Lilja et al. target genes in all 3 satellite cell datasets.