Q R Xing1,2, Chadi A El Farran1,3, Pradeep Gautam1,3, Yu Song Chuah1, Tushar Warrier1,3, Cheng-Xu Delon Toh1, Nam-Young Kang4,5, Shigeki Sugii6,7, Young-Tae Chang4,8,9,10, Jian Xu3,11, James J Collins12,13,14, George Q Daley15,16,17,18, Hu Li19, Li-Feng Zhang20, Yuin-Han Loh21,3,22,23.
Abstract
Cellular reprogramming suffers from low efficiency, especially in human cells. To deconstruct the heterogeneity and unravel the mechanisms of successful reprogramming, we used single-cell RNA sequencing (scRNA-Seq) and single-cell assay for transposase-accessible chromatin (scATAC-Seq) to profile reprogramming cells across various time points. Our analysis revealed that reprogramming cells proceed along an asynchronous trajectory and diversify into heterogeneous subpopulations. We identified fluorescent probes and surface markers to enrich for early reprogrammed human cells. Furthermore, combinatorial use of the surface markers enabled fine segregation of early-intermediate cells with diverse reprogramming propensities. scATAC-Seq analysis further uncovered the genomic partitions and transcription factors responsible for the regulatory phasing of the reprogramming process. A binary choice between a FOSL1-centric and a TEAD4-centric regulatory network determines the outcome of successful reprogramming. Together, our study illuminates the multitude of diverse routes traversed by individual reprogramming cells and presents an integrative roadmap for identifying the mechanistic parts list of the reprogramming machinery.
Year: 2020 PMID: 32917699 PMCID: PMC7486102 DOI: 10.1126/sciadv.aba1190
Source DB: PubMed Journal: Sci Adv ISSN: 2375-2548 Impact factor: 14.136
Fig. 1. Single-cell systems used for deconvoluting the heterogeneity in human cellular reprogramming.
(A) Overview of the prepared single-cell NGS libraries across various time points of human cellular reprogramming. The microfluidic platform was used to prepare 439 scRNA-Seq and 891 scATAC-Seq libraries (duplicates) of good quality. The 10X Genomics platform was used to prepare 32,138 scRNA-Seq libraries of good quality. (B) QC of microfluidic capture-based scRNA-Seq libraries. Dotplot shows the exon mapping percentage (x axis) of each scRNA-Seq library, along with its corresponding detected gene rate (y axis). Blue dots represent libraries passing the QC filters. (C) Average enrichment of capture-based scRNA-Seq libraries over gene bodies. (D) UMAP plot for the prepared 10X scRNA-Seq libraries. (E and F) Superimposition of the expression levels for MET genes (E) and fibroblast and pluripotent genes (F). (G) QC of scATAC-Seq libraries. Dotplot shows the library size (x axis) of each scATAC-Seq library, along with its contribution to the respective time point’s HARs (y axis). Red dots represent the libraries passing the QC filters. (H) Average enrichment profile of a D16+ scATAC-Seq library around transcription start sites (TSS) of the genome with a window of −3000 to +3000 bp. (I) Histogram of the insert size metric of a D16+ scATAC-Seq library revealing a nucleosomal pattern.
Fig. 2. Identification of diverse reprogramming subgroups and construction of reprogramming trajectories.
(A) PCAs showing subgroups present in D2 (left), D8 (middle), and D16+ (right) cells determined by RCA, based on their correlation to cells of various lineages in the RCA panel. Each color represents a subgroup. Gray color indicates the minority outlier cells, which do not belong to the indicated subgroups. (B) Boxplots showing single-cell expression of the differentially expressed genes CDK1 (D2), GDF3 (D8), MMP2 (D16+), and LIN28A (D16+) across the time points and their respective subgroups. Lines represent the median expression. (C) Left: Trajectory of reprogramming cells constructed from the 10X scRNA-Seq libraries based on DDRTree dimension reduction. Colors represent time points. Right: Pseudotime calculated by Monocle. Color indicates pseudotime. (D) Stacked columns indicating the distribution of reprogramming time points across the pseudotemporal states. Colors represent time points. (E) Superimposition of D8 subgroups (left) and D16+ subgroups (right) on the trajectory of reprogramming. Colors represent subgroups. (F) Stacked columns revealing the distribution of D8 subgroups across the pseudotemporal states. Colors represent D8 subgroups, and gray color indicates cells of the other time points. (G) Superimposition of expression of the D8 subgroup-specific genes (RFC3 and GDF3) and the D16+ subgroup–specific genes (NANOG, MMP2, and LIN28A) on the reprogramming trajectories.
Fig. 3. Identification of surface markers for the early-intermediate reprogramming cells.
(A) Dotplots indicating the expression of ANPEP and CD44 along the pseudotime. Smooth lines are composed of multiple dots representing the mean expression level at each pseudotime, regardless of the state. (B) Stacked histograms (top) showing the fluorescence intensities (x axis) of the surface markers in the cells indicated on the left. Red dotted boxes highlight the positively stained populations. Quantifications are shown below. (C) Overlaid histograms showing the staining signals of the surface markers in the top 10% and bottom 10% BDD2-C8–stained cells. Red dotted boxes highlight the positively stained populations. The numbers on top indicate the percentages of positively stained cells. (D) Quantitative reverse transcription polymerase chain reaction (qRT-PCR) measuring the relative expression levels in the D8-sorted cells. n = 2; error bar indicates SD. * indicates P < 0.05; ** indicates P < 0.005; *** indicates P < 0.0005. (E) Quantification of TRA-1-60+ colonies yielded from the D8-sorted cells. Representative images are shown below. n = 2; error bar indicates SD. (F) Bar charts showing the distribution of costaining signals of the surface markers across the cells of various reprogramming time points. (G) qRT-PCR exhibiting the relative expression of the collagen/mesenchymal genes (top) and pluripotent genes (bottom) in the D8 CD13-sorted cells induced from MSC using Sendai virus. n = 2; error bar indicates SD. (H) Quantification of TRA-1-60+ colonies yielded from D8 CD13-sorted cells induced from MSC using Sendai virus (top). Representative images are shown below. n = 2; error bar indicates SD.
Fig. 4. Refined classification and enrichment of early-intermediate reprogramming cells.
(A and B) t-SNE plots indicating the CD13 antigen profiles (A) and Seurat clusters (B) of the D8 CD13-sorted 10X libraries. (C) RCA clustering of the D8 CD13-sorted 10X libraries. (D) MAGIC plot showing the correlation between CD13 and GDF3. Colors represent the expression levels of NANOG. (E) Heatmap showing the DEGs of CD13 clusters. Genes highlighted in orange are expressed highly in D8 G2 and G3. (F) Violin plot demonstrating the expression of GDF3 across the clusters. (G) Left: Trajectory constructed by the 10X scRNA-Seq libraries of various time points and D8 CD13-sorted cells. Right: Superimposition of the CD13 clusters. (H) qRT-PCR showing the relative expression in the D8 CD13 & CD201-sorted cells. n = 2; error bar indicates SD. (I) Left: Heatmap showing the DEGs of D8 CD13 & CD201-sorted cells determined from their bulk RNA-Seq libraries. GO terms and the associated genes are indicated on the right. (J) Normalized TRA-1-60+ colonies upon knockdown of genes highly expressed in CD13+CD201+ cells at D5 of reprogramming. Representative images are shown above. n = 3; error bar indicates SD. (K) Quantification of TRA-1-60+ colonies yielded from the D8 CD13 & CD201-sorted cells. Representative images are shown above. n = 2; error bar indicates SD. GAPDH, glyceraldehyde-3-phosphate dehydrogenase; MHC, major histocompatibility complex.
Fig. 5. Stage-specific TF regulatory networks of reprogramming.
(A) Heatmap showing TF expression across the pseudotime states. Color code on top represents the pseudotime states. Representative TFs of each category are listed on the right. (B) Correlation between scATAC-Seq libraries based on the calculated JASPAR motif deviations in the HARs. Side color bar indicates time points of the scATAC-Seq libraries. (C) Plot indicating the motifs with significantly variable accessibility in the scATAC-Seq libraries. The y axis represents the variability score assigned to each JASPAR motif, whereas the x axis represents the motif rank. (D) scATAC-Seq heatmap based on the deviation scores of the significantly variable JASPAR motifs. Color code on top represents time points. Motifs were classified into three major types according to the dynamics of accessibility across the time points. (E) t-SNE plot of scATAC-Seq libraries based on the deviation scores of JASPAR motifs. (F to I) Superimposition of motif enrichment scores on the t-SNE plot for the OC motifs FOSL1 and CEBPA (F), the transient motif GATA1:TAL1 (G), and the CO motifs type I TEAD4 (H) and type II FOXL1 (I). Colors indicate the motif accessibility levels.
Fig. 6. TFs contributing to the heterogeneity in chromatin accessibility of the intermediate reprogramming cells.
(A) Plot indicating the motifs with significantly variable accessibility in D8 cells. (B) Clustering of D8 scATAC-Seq libraries based on the accessibility of variable motifs. (C) Expression of FOSL1 and TEAD4 in the D8 CD13-sorted cells. (D) t-SNE plot based on the regulon activity matrix (left) and superimposition of regulon activities for FOSL1 and TEAD4 (right). (E) The number of normalized TRA-1-60+ colonies upon knockdown (KD) of FOSL1 at the indicated reprogramming time points. Representative images are shown below. n = 3; error bar indicates SD. * indicates P < 0.05; *** indicates P < 0.0005. (F and G) Quantification of TRA-1-60+ colonies upon overexpression (OE) of FOSL1 in D5 cells (F) and D8 CD13− cells (G). Representative images are shown below. n = 3; error bar indicates SD. (H) The number of normalized TRA-1-60+ colonies upon knockdown of TEAD4 at the indicated reprogramming time points. Representative images are shown below. n = 2; error bar indicates SD. * indicates P < 0.05; ** indicates P < 0.005; N.S. indicates not significant. (I) Quantification of TRA-1-60+ colonies upon overexpression of TEAD4. Representative images are shown below. n = 3; error bar indicates SD. (J) Clustering of D8 scATAC-Seq libraries based on the accessibility of FOSL1- and TEAD4-bound sites. (K and L) Heatmaps showing the expression of functional FOSL1 targets (K) and TEAD4 targets (L) in the D8 CD13 & CD201-sorted cells. (M) Proposed model of the study.