Xin Huang, Xiaohan Tang, Xuemei Bai, Honglei Li, Huan Tao, Junting Wang, Yaru Li, Yu Sun, Yang Zheng, Xiang Xu, Longteng Wang, Yang Ding, Meisong Lu, Pingkun Zhou, Xiaochen Bo, Hao Li, Hebing Chen.
Abstract
During early mammalian embryo development, different epigenetic marks undergo reprogramming and play crucial roles in the mediation of gene expression. Currently, several databases provide multi-omics information on early embryos. However, how interconnected epigenetic markers function together to coordinate the expression of the genetic code in a spatiotemporal manner remains difficult to analyze, markedly limiting scientific and clinical research. Here, we present dbEmbryo, an integrated and interactive multi-omics database for human and mouse early embryos. dbEmbryo integrates data on gene expression, DNA methylation, histone modifications, chromatin accessibility, and higher-order chromatin structure profiles for human and mouse early embryos. It incorporates customized analysis tools, such as "multi-omics visualization," "Gene&Peak annotation," "ZGA gene cluster," "cis-regulation," "synergistic regulation," "promoter signal enrichment," and "3D genome." Users can retrieve gene expression and epigenetic profile patterns to analyze synergistic changes across different early embryo developmental stages. We showed the uniqueness of dbEmbryo among extant databases containing data on early embryo development and provided an overview. Using dbEmbryo, we obtained a phase-separated model of transcriptional control during early embryo development. dbEmbryo offers web-based analytical tools and a comprehensive resource for biologists and clinicians to decipher molecular regulatory mechanisms of human and mouse early embryo development.
Year: 2022 PMID: 35977841 PMCID: PMC9435744 DOI: 10.1101/gr.276744.122
Source DB: PubMed Journal: Genome Res ISSN: 1088-9051 Impact factor: 9.438
Figure 1. The main functions of dbEmbryo. (A) “Multi-omics visualization” searches and browses the tracks of epigenetic signals by gene names or chromosomes. (B) “Gene&Peak analysis” contains two functions. “Gene&Peak annotation” performs GO and KEGG functional annotation of specific gene sets or peaks for epigenetic signals. “ZGA gene cluster” defines different gene clusters for ZGA genes to represent different synergistic epigenetic mechanisms. (C) “Synergistic regulation” examines the correlation of different epigenetic signals with gene expression at various developmental stages. (D) “Promoter signal enrichment” compares the enrichment of different epigenetic signals at gene promoters across developmental stages. (E) “Cis-regulation” identifies transcription factor (TF) binding sites (TFBSs) based on ATAC-seq or TFs. (F) “3D genome” visualizes 3D genome information (e.g., compartment, TAD, loop) across early embryo development stages at different resolutions.
Figure 2. Accessible chromatin initiates transcriptional activity in the two-cell stage. (A) Correlations between gene expression (NCBI Gene Expression Omnibus [GEO; https://www.ncbi.nlm.nih.gov/geo/] GSE66582) and chromatin accessibility (GEO; GSE66581) in promoters (2 kb upstream of TSSs) of ZGA-only genes. Correlation coefficients (R^2 values) were calculated by Pearson's linear correlation: (*) 1 × 10^−10 < P < 1 × 10^−5, (**) 1 × 10^−20 < P < 1 × 10^−10, (***) P < 1 × 10^−20. (B) UCSC plots show chromatin accessibility (GSE66581) in promoters of the genes Arntl and Nfya. (C) Correlations between gene expression (GSE66582) and TFBS density in promoter regions (2 kb upstream of TSSs) of ZGA-only genes. Correlation coefficients (R^2 values) were calculated by Pearson's linear correlation: (*) 1 × 10^−10 < P < 1 × 10^−5, (**) 1 × 10^−20 < P < 1 × 10^−10, (***) P < 1 × 10^−20. (D) The green line shows the average expression level for 395 TFs, determined as described in the Methods; the orange line shows correlation coefficients between gene expression (GSE66582) and TFBS density. (E) UCSC plot shows chromatin accessibility (GSE66581) of Rpl38 in the early two-cell and two-cell stages. DHSs (GSE76642) and ATAC peaks (GSE66581) from other studies are also shown. (F) TFBSs scanned in the corresponding ATAC peak. P-values were calculated by FIMO. Motif figures were collected from the HOCOMOCO (v10) database. The bar plot shows expression levels of TFs. (G) Transcriptional regulatory model in early embryo development.
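The correlation and star annotation used in panels A and C can be illustrated with a minimal sketch. It assumes the star tiers are P-value thresholds at 10^-5, 10^-10, and 10^-20; the function names `pearson_r` and `significance_stars` are hypothetical and are not part of dbEmbryo. The caption uses strict inequalities, so the handling of a P-value that falls exactly on a threshold is a choice made here.

```python
import math

def pearson_r(x, y):
    """Pearson's linear correlation coefficient for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)

def significance_stars(p_value):
    """Map a P-value to the star tiers described in the caption.

    Assumption: a value exactly on a threshold falls into the weaker tier.
    """
    if p_value < 1e-20:
        return "***"
    if p_value < 1e-10:
        return "**"
    if p_value < 1e-5:
        return "*"
    return ""

# Synthetic values for illustration only (not data from the paper).
accessibility = [0.5, 1.0, 1.5, 2.0]
expression = [1.1, 2.0, 2.9, 4.2]
r = pearson_r(accessibility, expression)
print(round(r ** 2, 3), significance_stars(3e-12))
```

The P-value itself would come from a statistics package; it is passed in directly here so that the sketch stays dependency-free.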
Figure 3. H3K4me3 facilitates gene up-regulation in the four-cell stage. (A) Correlations between gene expression (GSE66582) and H3K4me3 (GSE71434) signal levels in promoters (2 kb upstream of TSSs) of ZGA-only genes. Correlation coefficients (R^2 values) were calculated by Pearson's linear correlation: (*) 1 × 10^−10 < P < 1 × 10^−5. (B) UCSC plots show chromatin accessibility (GSE66581) and H3K4me3 (GSE71434) signal surrounding gene Alppl2. Bar plot shows the expression level of gene Alppl2. (C) UCSC plots show chromatin accessibility (GSE66581) and H3K4me3 (GSE71434) signal surrounding gene Cyp26a1. Bar plot shows the expression level of gene Cyp26a1. (D) Pie charts show the percentage of up-regulated genes with and without H3K4me3 (GSE71434) peaks in the promoter (left) and chromatin accessibility of up-regulated genes that are targeted by H3K4me3 (GSE71434) peaks in the promoter (right).
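The promoter definition (2 kb upstream of the TSS) and the "with versus without a peak in the promoter" split in panel D can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: coordinates are taken to be 0-based and half-open, the TSS base itself is excluded from the window, any overlap of at least 1 bp counts as a hit, and the names `promoter_window`, `overlaps`, and `fraction_with_peak` are hypothetical.

```python
def promoter_window(tss, strand, upstream=2000):
    """Return (start, end) of the region `upstream` bp upstream of the TSS.

    Assumption: 0-based, half-open coordinates; `tss` is the TSS base itself.
    """
    if strand == "+":
        return max(0, tss - upstream), tss
    if strand == "-":
        # On the minus strand, upstream lies at higher coordinates.
        return tss + 1, tss + 1 + upstream
    raise ValueError("strand must be '+' or '-'")

def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end

def fraction_with_peak(genes, peaks):
    """Fraction of genes whose promoter window overlaps any peak.

    genes: iterable of (chrom, tss, strand); peaks: iterable of (chrom, start, end).
    """
    genes = list(genes)
    peaks = list(peaks)
    if not genes:
        return 0.0
    hits = 0
    for chrom, tss, strand in genes:
        start, end = promoter_window(tss, strand)
        if any(c == chrom and overlaps(start, end, s, e) for c, s, e in peaks):
            hits += 1
    return hits / len(genes)

# Synthetic coordinates for illustration only.
genes = [("chr1", 5000, "+"), ("chr1", 20000, "-"), ("chr2", 5000, "+")]
peaks = [("chr1", 4500, 4800), ("chr1", 19000, 19900)]
print(fraction_with_peak(genes, peaks))
```

The linear scan over peaks is adequate for a small example; genome-scale inputs would normally use an interval index.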
Figure 4. 3D chromatin structures enhance transcriptional activity in the eight-cell stage. (A) The bar plot shows the percentage of TAD boundaries (GSE82185) that are new in the current stage compared with the previous stages. (B) The average expression level of genes located in the TAD boundary and outside the TAD boundary (GSE82185). P-values were calculated by t-test. (C) Pie charts show genes with accessible chromatin (GSE66581) in the promoter or the chromatin interaction loop region (GSE82185). (D) The expression level of Sidt2. (E) Hi-C heatmaps of Sidt2 and the associated loop region at 5-kb resolution for late two- and eight-cell embryos. (F) UCSC plots show chromatin accessibility (GSE66581) of Sidt2 and the associated significant interaction region. (G) The model shows that accessible chromatin in the eight-cell stage is condensed by widespread interaction loops, thus enhancing transcriptional activity.
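One way to compute the quantity plotted in panel A, the share of TAD boundaries that are new relative to an earlier stage, is sketched below. The caption does not state how a boundary is matched between stages, so this sketch assumes boundaries are given as (chromosome, bin index) pairs at a common resolution and that a boundary counts as already present if an earlier boundary lies within `tolerance_bins` bins. The function name and the tolerance parameter are hypothetical.

```python
def percent_new_boundaries(current, previous, tolerance_bins=0):
    """Percentage of `current` boundaries with no match in `previous`.

    Boundaries are (chrom, bin_index) pairs; a match is any previous boundary
    on the same chromosome within `tolerance_bins` bins (an assumption).
    """
    current = list(current)
    previous_set = set(previous)
    if not current:
        return 0.0

    def seen_before(chrom, bin_index):
        return any(
            (chrom, bin_index + offset) in previous_set
            for offset in range(-tolerance_bins, tolerance_bins + 1)
        )

    new_count = sum(1 for chrom, b in current if not seen_before(chrom, b))
    return 100.0 * new_count / len(current)

# Synthetic boundaries for illustration only.
earlier_stage = [("chr1", 10), ("chr1", 50)]
later_stage = [("chr1", 10), ("chr1", 51), ("chr1", 80), ("chr2", 10)]
print(percent_new_boundaries(later_stage, earlier_stage))
print(percent_new_boundaries(later_stage, earlier_stage, tolerance_bins=1))
```

Allowing a small tolerance avoids counting a boundary as new when it merely shifts by one bin between stages.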
Figure 5. Multifaceted regulation of epigenetic marks at the ICM stage. (A) Scatter plots show the correlations between gene expression (GSE66582) and epigenetic signals (ATAC-seq: GSE66581; H3K4me3: GSE71434; H3K9me3: GSE97778; H3K27me3: GSE76687; mCG/CG: GSE56697) in promoters (2 kb upstream of TSSs). Correlation coefficients were calculated by Pearson's linear correlation. (B) Heatmaps show the enrichment of epigenetic marks (ATAC-seq: GSE66581; H3K4me3: GSE71434; H3K9me3: GSE97778; H3K27me3: GSE76687; mCG/CG: GSE56697; normalized RPKM) in the ICM stage; a bar chart shows the expression level of corresponding genes. (C,D) UCSC plots show epigenetic signals (ATAC-seq: GSE66581; H3K4me3: GSE71434; H3K9me3: GSE97778; H3K27me3: GSE76687; mCG/CG: GSE56697) surrounding Zfp106, Snap23, Zfp639, Rhox13, Defb7, Usp29, Top3a, Gnas, and Vnn1. The bar plot shows gene expression level.
Figure 6. Different layers of epigenetic marks shape transcriptional programs. (A) Linear regression quantitatively describes the effects of different layers of epigenetic marks (ATAC-seq: GSE66581; H3K4me3: GSE71434; Hi-C: GSE82185; DNA methylation: GSE56697) on transcriptional activity in preimplantation embryos. The analysis was conducted on ZGA-only genes. (B–G) Gene Obox6 illustrates the relationship between different layers of epigenetic information (ATAC-seq: GSE66581; H3K4me3: GSE71434; H3K9me3: GSE97778; H3K27me3: GSE76687; mCG/CG: GSE56697) and transcriptional activity (GSE66582). (B) TFBSs scanned in accessible chromatin of Obox6. Expression levels of these TFs were high (FPKM ≥ 10). (C–F) UCSC plots show the epigenetic signal surrounding gene Obox6 for two-, four-, and eight-cell embryos and ICM. (G) Colormap shows the expression level of Obox6 during preimplantation.
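The kind of linear regression described in panel A, expression modeled as a weighted sum of several epigenetic signals, can be sketched with ordinary least squares. The caption does not give the model specification, so this is only an illustration: it assumes one intercept plus one coefficient per signal, solves the normal equations by Gauss-Jordan elimination, and uses synthetic numbers. The function name `fit_linear` and the two feature columns are hypothetical.

```python
def fit_linear(features, target):
    """Ordinary least squares with an intercept, via the normal equations.

    features: list of per-gene tuples (one value per epigenetic signal).
    target:   list of per-gene expression values.
    Returns [intercept, coef_1, coef_2, ...].
    """
    rows = [[1.0] + [float(v) for v in f] for f in features]
    k = len(rows[0])
    # Build the augmented system [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, target))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; no unique fit")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

# Synthetic data: columns stand in for (accessibility, H3K4me3) signal.
signals = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
expression = [1 + 2 * a + 3 * h for a, h in signals]
print([round(c, 6) for c in fit_linear(signals, expression)])
```

In practice the signals would be put on comparable scales before fitting so that the coefficients can be compared across layers.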