Camden Jansen, Kitt D Paraiso, Jeff J Zhou, Ira L Blitz, Margaret B Fish, Rebekah M Charney, Jin Sun Cho, Yuuri Yasuoka, Norihiro Sudou, Ann Rose Bright, Marcin Wlizla, Gert Jan C Veenstra, Masanori Taira, Aaron M Zorn, Ali Mortazavi, Ken W Y Cho.
Abstract
Mesendodermal specification is one of the earliest events in embryogenesis, where cells first acquire distinct identities. Cell differentiation is a highly regulated process that involves the function of numerous transcription factors (TFs) and signaling molecules, which can be described with gene regulatory networks (GRNs). Cell differentiation GRNs are difficult to build because existing mechanistic methods are low throughput, and high-throughput methods tend to be non-mechanistic. Additionally, integrating highly dimensional data composed of more than two data types is challenging. Here, we use linked self-organizing maps to combine chromatin immunoprecipitation sequencing (ChIP-seq)/ATAC-seq with temporal, spatial, and perturbation RNA sequencing (RNA-seq) data from Xenopus tropicalis mesendoderm development to build a high-resolution genome scale mechanistic GRN. We recover both known and previously unsuspected TF-DNA/TF-TF interactions validated through reporter assays. Our analysis provides insights into transcriptional regulation of early cell fate decisions and provides a general approach to building GRNs using highly dimensional multi-omic datasets.
Keywords: ATAC-seq; ChIP-seq; RNA-seq; Xenopus; cis-regulatory modules; endoderm; gene regulatory networks; linked self-organizing maps; mesoderm; multi-omic
Year: 2022 PMID: 35172134 PMCID: PMC8917868 DOI: 10.1016/j.celrep.2022.110364
Source DB: PubMed Journal: Cell Rep Impact factor: 9.423
Figure 1. Using self-organizing maps (SOMs) to discover the ME GRN
(A) Genome browser view of TF binding during X. tropicalis development. Shown are maternally expressed (Foxh1, Otx1, Sox7, Vegt, Ctnnb1, Smad1, and Smad2/3) and zygotically expressed (Foxa4, Gsc, Eomes, Tbxt, and Vegt) TF binding in the gsc gene locus. Shaded are the well-characterized proximal, distal, and upstream CRMs, associated with TF binding. Further upstream are binding sites in possibly unexplored CRMs.
(B) Datasets used in this analysis, targeting several wild-type and MO-injected embryos at developmental stages important for ME development.
(C) The X. tropicalis genome is partitioned (grey shadings in bottom track) using ChIP-seq and ATAC-seq peak locations. Each partition is assigned ChIP-seq and ATAC-seq signal quantified as reads per kilobase per million (RPKMs) for all chromatin datasets.
(D) The RNA-seq and ChIP-seq/ATAC-seq datasets were each converted into training matrices and clustered by SOM metaclustering in SOMatic. These clusters were then linked using the SOM Linking tool within SOMatic. The pairwise linked metaclusters (LMs) and spatial SOM data were mined for regulatory connections and built into networks.
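Panel (C) quantifies the signal in each genome partition as RPKM. As a rough illustration only (this is not the authors' pipeline), the sketch below computes RPKM from a read count, a partition length, and a library size. The function name and the example numbers are hypothetical.

```python
def rpkm(read_count, region_length_bp, total_mapped_reads):
    """Reads per kilobase of region per million mapped reads."""
    if region_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("region length and library size must be positive")
    kilobases = region_length_bp / 1_000
    millions = total_mapped_reads / 1_000_000
    return read_count / (kilobases * millions)


# Hypothetical partition: 500 reads in a 2,000 bp partition,
# from a library with 20 million mapped reads.
print(rpkm(500, 2_000, 20_000_000))
```

Dividing by both region length and library size makes partitions of different widths, and datasets of different sequencing depths, comparable within one training matrix.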
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Antibodies | | |
| | Covance; This paper | N/A |
| | | N/A |
| | Santa Cruz Biotechnology | Cat#sc-6031x |
| | | N/A |
| | | N/A |
| | | N/A |
| Chemicals, Peptides, and Recombinant Proteins | | |
| Dynabeads Protein G | Life Technologies | Cat#10003D |
| Critical Commercial Assays | | |
| NEXTflex ChIP-seq kit | Bioo Scientific | Cat#NOVA-5143-01 |
| Superscript II | Life Technologies | Cat#18064014 |
| KAPA HiFi HotStart ReadyMix (2x) | Kapa Biosystems | Cat#KK2601 |
| Agencourt AMPure XP beads | Beckman Coulter | Cat#A63881 |
| Nextera DNA Library Prep Kit | Illumina | Cat#FC-121-1030 |
| Deposited Data | | |
| | | RRID: SCR_003280; URL: |
| | | GEO: GSE48560 |
| | | GEO: GSE48560 |
| | | GEO: GSE48560 |
| | | GEO: GSE53654 |
| | | GEO: GSE53654 |
| | | DRA: DRA000576 |
| | | DRA: DRA000509 |
| | | DRA: DRA000508 |
| | | DRA: DRA000505 |
| | | DRA: DRA000506 |
| | | DRA: DRA000573 |
| | | DRA: DRA000574 |
| | | GEO: GSE67974 |
| | | GEO: GSE67974 |
| | | GEO: GSE67974 |
| | | GEO: GSE67974 |
| | | GEO: GSE67974 |
| | | GEO: GSE67974 |
| | | GEO: GSE67974 |
| | | GEO: GSE67974 |
| | | GEO: GSE67974 |
| | | GEO: GSE67974 |
| | | GEO: GSE67974 |
| | | GEO: GSE72657 |
| | | GEO: GSE85273 |
| | | GEO: GSE85273 |
| | | GEO: GSE85273 |
| | | GEO: GSE118024 |
| | | GEO: GSE118024 |
| | | GEO: GSE148726 |
| | This Paper | GEO: GSE118024 |
| | This Paper | GEO: GSE118024 |
| | This Paper | GEO: GSE118024 |
| | This Paper | GEO: GSE118024 |
| | This Paper | GEO: GSE118024 |
| | This Paper | GEO: GSE118024 |
| | | GEO: GSE145619 |
| | | GEO: GSE65785 |
| | | GEO: GSE81458 |
| | | ArrayExpress: E-MTAB-8555 |
| | | ArrayExpress: E-MTAB-8555 |
| | | GEO: GSE148726 |
| | | GEO: GSE148726 |
| | This Paper | GEO: GSE118024 |
| Experimental Models: Organisms/Strains | | |
| | University of Virginia, NASCO | URL: |
| Oligonucleotides | | |
| Template switching oligo | | N/A |
| ISPCR primers | | N/A |
| Indexing primers | | N/A |
| Foxh1 MO 5′-TCATCCTGAGGCTCCGCCCTCTCTA-3′ | GeneTools; | N/A |
| Tcf7l1 MO 5′-CGCCGCTGTTTAGTTGAGGCATGA-3′ | GeneTools; | N/A |
| Sox17a MO 5′-AGCCACCATCAGGGCTGCTCATGGT-3′ | GeneTools; | N/A |
| wt zic2 F: ctgtgagtatttacattttacccttgc | IDT | N/A |
| wt foxa2 F: cagatttcacacagaaaaattaggatc | IDT | N/A |
| wt eomes F: tacatctctataagtatgtgtgca | IDT | N/A |
| wt gata6 F: aacactcatagtttccctttg | IDT | N/A |
| wt sox17b F: ggttagccagcaggtaactg | IDT | N/A |
| wt osr2 F: gtccctgtacaagtaggacatt | IDT | N/A |
| wt bmp4 F: ggtggtatttccagggttcccttta | IDT | N/A |
| wt gata4 F: agcatggacatgtttaatggact | IDT | N/A |
| wt wnt8 F: aatgggcagaatatgagaagagt | IDT | N/A |
| wt mixer F: gggcaaagtcatgagattggt | IDT | N/A |
| wt tbxt F: gcgttcattttgccaccaa | IDT | N/A |
| wt nodal F: acactttaaaaggattaatgggatttatct | IDT | N/A |
| wt admp F: atatatatatatatactaacagtatatcttgcccaaag | IDT | N/A |
| wt map7d3 F: agttttccttccaccaaagaaaa | IDT | N/A |
| wt pcdh8.2.1 F: aaatctctttcatattcagccgg | IDT | N/A |
| wt pcdh8.2.2 F: acctaaagtcacatcccatcag | IDT | N/A |
| wt pcdh8.2.3 F: ggtgcagtgaatggcttattc | IDT | N/A |
| wt Pdk4 F: agactaaaactgttataagaatttctaatttttaataaatatttg | IDT | N/A |
| wt serpinf2 F: agaaatggtgcaccactg | IDT | N/A |
| wt sfrp2 F: aatgagaaaagtgtggtataaga | IDT | N/A |
| wt slc12a3.2 F: gaacatatatgtactatgcacttctaacc | IDT | N/A |
| wt zic2 F: ctgtgagtatttacattttacccttgc | IDT | N/A |
| mutant foxa2 F: cagatttcacacagaaaaattaggatc | IDT | N/A |
| mutant eomes F: tacatctctataagtatgtgtgca | IDT | N/A |
| mutant gata6 F: aacactcatagtttccctttg | IDT | N/A |
| mutant sox17b F: ggttagccagcaggtaactg | IDT | N/A |
| mutant osr2 F: gtccctgtacaagtaggacatt | IDT | N/A |
| mutant bmp4 F: ggtggtatttccagggttcccttta | IDT | N/A |
| mutant gata4 F: agcatggacatgtttaatggact | IDT | N/A |
| mutant wnt8 F: aatgggcagaatatgagaagagt | IDT | N/A |
| mutant mixer F: gggcaaagtcatgagattggt | IDT | N/A |
| mutant tbxt F: gcgttcattttgccaccaa | IDT | N/A |
| mutant nodal F: acactttaaaaggattaatgggatttatct | IDT | N/A |
| Recombinant DNA | | |
| – 104 | | N/A |
| pRL-SV40 | Promega | Cat#E2231 |
| zic2 Luc reporter | This Paper | N/A |
| zic2 mutant Luc reporter | This Paper | N/A |
| foxa2 Luc reporter | This Paper | N/A |
| foxa2 mutant Luc reporter | This Paper | N/A |
| eomes Luc reporter | This Paper | N/A |
| eomes mutant Luc reporter | This Paper | N/A |
| gata6 Luc reporter | This Paper | N/A |
| gata6 mutant Luc reporter | This Paper | N/A |
| sox17b Luc reporter | This Paper | N/A |
| sox17b mutant Luc reporter | This Paper | N/A |
| osr2 Luc reporter | This Paper | N/A |
| osr2 mutant Luc reporter | This Paper | N/A |
| gata4 Luc reporter | This Paper | N/A |
| gata4 mutant Luc reporter | This Paper | N/A |
| wnt8 Luc reporter | This Paper | N/A |
| wnt8 mutant Luc reporter | This Paper | N/A |
| mixer Luc reporter | This Paper | N/A |
| mixer mutant Luc reporter | This Paper | N/A |
| tbxt Luc reporter | This Paper | N/A |
| tbxt mutant Luc reporter | This Paper | N/A |
| nodal Luc reporter | This Paper | N/A |
| nodal mutant Luc reporter | This Paper | N/A |
| Software and Algorithms | | |
| RSEM v.1.2.12 | | RRID: SCR_013027; URL: |
| Bowtie 2 v2.2.7 | | RRID: SCR_016368; URL: |
| MACS2 v2.0.10 | | RRID: SCR_013291; URL: |
| DESeq2 v3.11 | | RRID: SCR_015687; URL: |
| SOMatic | | URL: |
| FIMO v4.12.0 | | RRID: SCR_001783; URL: |
| IGV v2.3.20 | | RRID: SCR_011793; URL: |
| Xenmine/Gene Ontology | | N/A |
Figure 2. RNA-seq SOM metaclustering reveals developmental gene modules that contain similarly regulated genes
(A) SOM slices for the wild-type gene expression signal at stage 10.5 and for the fold change between Foxh1 MO and control experiments at stage 10.5. Creation of the SOM visualization is described in STAR Methods. Metaclusters containing genes from the core ME network show unique temporal dynamics during development. nodal, nodal2, and sia are grouped left, and gsc, nodal1, lhx, and osr2 are grouped right (top). Overlaid metacluster boundaries show the genes that are up- and down-regulated upon Foxh1 MO KD (bottom).
(B) Each metacluster is filled with genes with a similar expression profile (labeled “Eigen-Profile”); for example, a heatmap of the genes in metacluster 11 is shown.
(C) Heatmap of average temporal expression profiles of genes belonging to 13 RNA metaclusters. Parentheses after RNA metaclusters indicate number of genes in each RNA metacluster.
(D) Two-tailed Wilcoxon test applied to gene metaclusters. Each metacluster responded differently to each MO experiment at different time points.
(E) GO term enrichments for genes within three example RNA SOM metaclusters. Each metacluster had unique functional enrichments supporting the coherence of these clusters.
See Figure S1.
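The two-tailed test in panel (D) can be illustrated with a small rank-based sketch. The legend does not say which Wilcoxon variant was used or what was compared, so this example assumes a rank-sum test between the fold changes of genes inside a metacluster and those outside it, and it uses the plain normal approximation without tie or continuity correction. All names and numbers are hypothetical.

```python
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(sample_a, sample_b):
    """Two-tailed Wilcoxon rank-sum test (normal approximation).

    Returns (z, p_value). Assumes both samples are non-empty.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = (n_a * n_b * (n_a + n_b + 1) / 12) ** 0.5
    z = (u_a - mean_u) / sd_u
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical log2 fold changes (MO vs. control).
inside_metacluster = [-1.2, -0.9, -1.5, -0.7, -1.1]
outside_metacluster = [0.1, -0.2, 0.3, 0.0, 0.2, -0.1]
z, p = rank_sum_test(inside_metacluster, outside_metacluster)
print(f"z = {z:.2f}, p = {p:.4f}")
```

For samples this small an exact test would be preferable; the normal approximation is used here only to keep the sketch short.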
Figure 3. SOM-based clustering shows Foxh1 co-binding and functional gene modules during gastrulation
(A) Heatmap of Foxh1 ChIP-enriched metaclusters that visualizes the different patterns of co-regulation present in Foxh1-bound CRMs. The heatmap is initially expressed as TPMs and then maximum normalized. Blue and red represent regions with low and high signals, respectively.
(B) Experiment hierarchy of ATAC/ChIP-seq data after metacluster correction. The developmental stages of each experiment are indicated by the same color coding as (A).
(C) GO term enrichments for genes near genomic regions within three example ATAC/ChIP SOM metaclusters.
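The maximum normalization mentioned in panel (A) can be sketched as follows. The legend does not state whether values are scaled per row or per column, so this example assumes each row (for instance, one metacluster across experiments) is divided by its own maximum. The function name and values are hypothetical.

```python
def max_normalize(rows):
    """Scale each non-empty row so that its largest value becomes 1.0."""
    normalized = []
    for row in rows:
        peak = max(row)
        # Rows with no positive signal are returned as zeros
        # to avoid dividing by zero.
        normalized.append([v / peak if peak > 0 else 0.0 for v in row])
    return normalized


# Hypothetical signal values for two rows across three experiments.
signal = [[2.0, 4.0, 8.0], [0.0, 0.0, 0.0]]
print(max_normalize(signal))
```

After this step every row spans at most the range 0 to 1, which lets a single low-to-high color scale be shared across rows with very different absolute signal.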
Figure 4. RNA metaclusters can be further segregated by spatial RNA SOM
(A) SOM slices from the spatial RNA SOM analysis corresponding to RNAs from the animal, dorsal, and vegetal explants with overlaid spatial RNA metacluster (sR) boundaries. Some important sR locations are noted.
(B) Heatmap of the fold change of genes within sRs over whole-embryo signal, indicating enrichment and reduction of genes in particular RNA metaclusters.
(C) Heatmap of statistical difference between gene expression in each tissue and the whole embryo. Six sRs showed statistically significant differences in ectoderm/mesoderm or in endoderm.
(D) Joint membership of genes in sRs and RNA metaclusters from the full RNA dataset. Rows and columns are hierarchically clustered.
(E) Temporal (from wild type) and spatial gene expression profiles for genes in sR9, sR6, sR15, and sR1 and R38.
(F) Average temporal and spatial gene expression profiles for genes in R23, R16, R11, R10, or R1, based on sRs.
Figure 5. sR assists in identifying candidate TFs for Xenopus ME differentiation
(A and B) Temporal and spatial gene expression profiles of TFs with motifs found near endodermally (A) or ectodermally (B) enriched genes. Asterisks indicate TFs that show distinct spatial expression.
(C) Temporal and spatial gene expression profiles for spatially differential TFs (bold) matched with the average gene expression profile of their predicted targets. Correlations were calculated by comparing their spatial gene expression profiles.
(D) The temporal and spatial gene expression profiles of genes important in Xenopus ME development, separated by RNA metacluster.
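Panel (C) reports correlations between a TF's spatial expression profile and the average profile of its predicted targets. The legend does not name the correlation measure, so the sketch below assumes Pearson correlation. The profiles and tissue ordering are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical expression in animal, dorsal, and vegetal explants.
tf_profile = [1.0, 2.0, 9.0]
mean_target_profile = [2.0, 3.0, 12.0]
print(round(pearson(tf_profile, mean_target_profile), 3))
```

A value near 1 would indicate that the TF and its predicted targets are enriched in the same tissues; a value near -1 would suggest the opposite pattern.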
Figure 6. GRN centered on the activity of Tcf7l1, Sox17, Vegt, Smad2/3, and Foxh1
(A) Our predicted developmental GRN. The active CRMs were identified based on the enrichment of their respective TFs, enrichment of Ep300 signal, and the presence of DNA binding motifs. Shown are literature-identified targets (“prior direct targets”) and potential new connections (“new potential targets”). Note that only a subset of targets is shown, and the network is focused only on TF and signaling molecule targets.
(B) Fold change of relative luciferase units in log scale of putative CRMs comparing Foxh1 binding site mutations over wild type. Each of these shows that enhancer activity depends on Foxh1 binding sites. Two biologically independent experiments were performed.
(C) Fold change of relative luciferase units of putative CRMs comparing Sox17 binding site mutations over wild type. Each shows that enhancer activity depends on Sox17 binding sites. Two biologically independent experiments were performed.
See Figure S6.
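The fold changes in panels (B) and (C) compare mutant and wild-type reporters. A minimal sketch of that calculation is given below, under two assumptions not stated in the legend: that the reporter signal is first normalized to a Renilla control (pRL-SV40 is listed among the recombinant DNA resources), and that the log scale is base 2. Function names and readings are hypothetical.

```python
import math


def relative_luciferase(reporter_signal, control_signal):
    """Normalize the reporter reading to the control (e.g., Renilla) reading."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return reporter_signal / control_signal


def log2_mutant_over_wild_type(mut_reporter, mut_control, wt_reporter, wt_control):
    """log2 of (mutant relative units / wild-type relative units)."""
    rlu_mutant = relative_luciferase(mut_reporter, mut_control)
    rlu_wild_type = relative_luciferase(wt_reporter, wt_control)
    if rlu_mutant <= 0 or rlu_wild_type <= 0:
        raise ValueError("relative luciferase units must be positive")
    return math.log2(rlu_mutant / rlu_wild_type)


# Hypothetical raw readings for one CRM reporter pair.
change = log2_mutant_over_wild_type(
    mut_reporter=250.0, mut_control=1000.0,
    wt_reporter=2000.0, wt_control=1000.0,
)
print(change)
```

A negative value means the binding-site mutation reduced reporter activity, which is the pattern expected when the enhancer depends on that site.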
Figure 7. New and known core ME TF targets
List of targets in the core ME network for the TFs: Foxh1, Sox17, Tcf7l1, Vegt, and Smad2/3. Bolded entries are new to this analysis. Underlined entries were successfully validated.