Taotao Sheng, Shamaine Wei Ting Ho, Wen Fong Ooi, Chang Xu, Manjie Xing, Nisha Padmanabhan, Kie Kyon Huang, Lijia Ma, Mohana Ray, Yu Amanda Guo, Ngak Leng Sim, Chukwuemeka George Anene-Nzelu, Mei Mei Chang, Milad Razavi-Mohseni, Michael A Beer, Roger Sik Yin Foo, Raghav Sundar, Yiong Huak Chan, Angie Lay Keng Tan, Xuewen Ong, Anders Jacobsen Skanderup, Kevin P White, Sudhakar Jha, Patrick Tan.
Abstract
BACKGROUND: Enhancers are distal cis-regulatory elements required for cell-specific gene expression and cell fate determination. In cancer, enhancer variation has been proposed as a major cause of inter-patient heterogeneity; however, most predicted enhancer regions remain to be functionally tested.
Keywords: CapSTARR-seq; Enhancer heterogeneity; Enhancer landscape; Enhancer-promoter interactions; Gastric cancer
Year: 2021 PMID: 34635154 PMCID: PMC8504099 DOI: 10.1186/s13073-021-00970-3
Source DB: PubMed Journal: Genome Med ISSN: 1756-994X Impact factor: 11.117
Fig. 1 Workflow of analyses conducted in this study. The diagram describes the general flow of analyses performed in this study and the datasets used. Briefly, distal enhancer landscapes of GC were profiled using H3K27ac ChIP-seq, supported by additional H3K4me3 ChIP-seq, H3K4me1 ChIP-seq, and 5mC MeDIP-seq datasets. CapSTARR-seq profiling was then performed to identify functional enhancers, and integrated with ATAC-seq measuring chromatin accessibility. Enhancer-gene regulation was investigated using the ABC (“Activity-by-contact”) model, where RNA-seq was used as a readout of target gene expression levels. Finally, we examined potential cis- and trans-regulation mechanisms of differential enhancers. For cis-regulation, we derived SNP and CNA (copy number alteration) information using WGS datasets
Fig. 2 Distal enhancer landscapes of GC cell lines. a Histone profiles of OCUM1 and SNU16 cells show enrichment of H3K27ac and H3K4me3 around the UTP15 TSS. A predicted distal enhancer enriched for H3K27ac and > 2.5 kb distant from the UTP15 TSS is observed. b Comparison of H3K27ac signals over common predicted enhancers between two KATOIII replicates. c Genome-wide average profile of chromatin marks (H3K27ac, H3K4me1, and H3K4me3) and DNA methylation (5mC) at all predicted enhancers and active promoters. Active promoters are those annotated promoters overlapping H3K27ac peaks. H3K27ac, H3K4me1, and H3K4me3 profiles were generated by ChIP-seq. 5mC profiles were generated by MeDIP-seq. RPM: Reads per million mapped reads. The “summit” of a predicted enhancer region refers to the midpoint of the bimodal peak. The windows shown (1 kb, 6 kb) were chosen to highlight the bimodal pattern of histone marks. d Recurrence rates of predicted enhancers. Recurrent predicted enhancers were identified as those enhancers occurring in at least two GC cell lines. Data presented are the mean percentage ± standard deviation of commonly predicted enhancers found in two or more gastric cancer cell lines, as a function of the number of cell lines. e Distribution of predicted super-enhancers and typical enhancers present in GCs across the human genome. f t-distributed stochastic neighbor embedding (t-SNE) analysis using predicted enhancers present in GCs reveals separation between GCs and matched normal tissues (n = 36). g Difference in somatic point mutation rates among predicted super-enhancers, typical enhancers, and randomly selected genomic regions. P value: two-sided Student’s t test. Relative somatic point mutation rate was calculated as the log2 fold change of the somatic point mutation rate over the background point mutation rate
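The relative somatic point mutation rate in panel g is defined as a log2 fold change over background. A minimal sketch of that arithmetic follows, assuming rates are expressed as mutations per base pair; the function name, its parameters, and the per-base-pair unit are illustrative assumptions and are not taken from the paper.

```python
from math import log2


def relative_mutation_rate(n_mutations, region_bp, background_rate):
    """Log2 fold change of a region's mutation rate over a background rate.

    All names and the mutations-per-bp unit are hypothetical; the caption
    only states that the value is a log2 fold change over background.
    """
    if region_bp <= 0 or background_rate <= 0 or n_mutations <= 0:
        raise ValueError("counts, region length, and background must be positive")
    region_rate = n_mutations / region_bp  # mutations per bp in the region
    return log2(region_rate / background_rate)
```

Under this definition, a value of 0 means the region mutates at the background rate, positive values mean enrichment, and negative values mean depletion.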
Fig. 3 CapSTARR-seq Functional Enhancer Profiling. a CapSTARR-seq experimental workflow. b H3K27ac (red) and CapSTARR-seq (blue) profiles at the ABHD11, CLDN3, and CLDN4 loci in OCUM1 cells. The top blue track depicts CapSTARR-seq signals merged from three CapSTARR-seq replicates. Black boxes denote CapSTARR-seq probes. c Circos visualization of CapSTARR-seq signals across the human genome in SNU16 (the inner circle) and OCUM1 cells (the outer circle). HDAC1, MYC, and CD44 are highlighted as genes associated with CapSTARR-seq high enhancers. d Chromatin state discovery and characterization. The leftmost panel displays a ChromHMM heatmap of emission parameters where each row corresponds to a different state and each column a different histone mark. Shown to its left are candidate state descriptions for each state followed by a state abbreviation. The heatmap in the middle displays the overlap enrichment for various external genomic annotations. The rightmost heatmap shows fold enrichment for each state near TSSs. e Differences in CapSTARR-seq signals (log2 RPKM) in six enhancer-associated chromatin states called from ChromHMM in OCUM1 (left) and SNU16 (right). f Distribution of H3K27ac active and inactive elements in CapSTARR-seq enriched regions (CaPERs) in OCUM1 and SNU16 cells. g Differences in H3K4me1 ChIP-seq and ATAC-seq peak enrichment at H3K27ac active and inactive CaPERs. Error bars indicate two independent biological replicates. h Distribution of OCUM1/SNU16 H3K27ac inactive CaPERs in the “open” state across 17 cell lines. Red regions denote OCUM1/SNU16 H3K27ac inactive CaPERs located in open chromatin in any of 17 cell lines. Gray regions denote OCUM1/SNU16 H3K27ac inactive CaPERs located in closed chromatin across all the cell lines. i Differences in CapSTARR-seq signals (log2 RPKM) in enhancer categories in OCUM1 cells. P values: Mann–Whitney U test. j Differences in TF binding sites over enhancer categories. The number of TF binding sites over an enhancer was calculated using the ReMap database. P values: Mann–Whitney U test
Fig. 4 Activity-by-contact model of enhancer-gene regulation. a ABC model schema. e1 and e2 denote two arbitrary enhancers (solid red circles) for a gene (black arrow). “G” denotes a gene. Both “E” and “e” denote an enhancer. ABC_{E,G} denotes the ABC score (predicted effect) of an E–G pair. Activity (“A”) estimates the enhancer strength while Contact (“C”) estimates the frequency of the enhancer-gene connection. The ABC score of an enhancer for a gene is that enhancer’s effect (Activity × Contact) divided by the total effect of all enhancers for the gene. b ABC model for explaining gene expression differences between OCUM1 and SNU16 cells. Activity is estimated as the level of H3K27ac-enrichment at an enhancer while Contact is quantified as a function of the genomic distance between the enhancer and the TSS of the gene (Contact = Distance^{-1}). c Comparison of ABC score differences and observed gene expression differences between OCUM1 and SNU16 cells. Each dot represents a differentially expressed gene between OCUM1 and SNU16 cells. Activity of an enhancer is estimated as the H3K27ac signal. R: Pearson’s correlation coefficient. P value: Pearson’s correlation test. d GRO-cap data confirm a significantly higher percentage of transcribed enhancers involved in enhancer-promoter interactions compared to H3K27ac-defined enhancers in GM12878 lymphoblastoid cells. Transcribed enhancers are inferred from GRO-cap-based annotations of TSSs. P values were calculated empirically by random shuffling of sequences within H3K27ac-enriched regions. e Comparison of ABC score differences and observed gene expression differences between OCUM1 and SNU16 cells. Activity of an enhancer is estimated as the geometric mean of H3K27ac-enrichment and CapSTARR-seq signal. f Comparison of the HSPB1 expression difference and the ABC score difference between OCUM1 and SNU16 cells in the HSPB1 locus. Predicted E–P connections (dotted red arcs) are based on ABC maps in OCUM1 and SNU16 cells. Observed E–P connections (solid red arcs) are derived from the pcHi-C database
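The ABC score described in this caption can be illustrated with a short sketch. It assumes each enhancer is summarized by a single genomic position and a single activity value, takes Contact as the reciprocal of the enhancer-to-TSS distance as stated in panel b, and normalizes over the enhancers supplied for one gene. The function names, the tuple layout, and the example numbers are hypothetical and do not reproduce the authors' pipeline.

```python
from math import sqrt


def combined_activity(h3k27ac, capstarr):
    """Geometric mean of two activity signals (the panel e variant)."""
    return sqrt(h3k27ac * capstarr)


def abc_scores(enhancers, tss):
    """Return {enhancer_name: ABC score} for one gene.

    enhancers: list of (name, position_bp, activity) tuples (hypothetical layout).
    tss: position of the gene's transcription start site in bp.
    """
    effects = {}
    for name, position, activity in enhancers:
        distance = abs(position - tss)
        if distance == 0:
            raise ValueError(f"enhancer {name} coincides with the TSS")
        contact = 1.0 / distance            # Contact = Distance^-1
        effects[name] = activity * contact  # effect = Activity x Contact
    total = sum(effects.values())
    # Each score is the enhancer's share of the gene's total enhancer effect.
    return {name: effect / total for name, effect in effects.items()}


# Two illustrative enhancers with equal activity, 10 kb and 50 kb from the TSS.
example = abc_scores([("e1", 110_000, 10.0), ("e2", 150_000, 10.0)], tss=100_000)
```

Because the scores are normalized per gene, they sum to 1; with equal activity, the nearer enhancer receives the larger share.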
Fig. 5 Cis-analysis of differential enhancers. a Distribution of enhancers associated with copy number. CN-abnormal enhancers represent differential enhancers with altered DNA copy numbers in OCUM1 or SNU16. CN-associated enhancers represent differential enhancers showing concordant changes in H3K27ac signals and DNA copy number. MYC, WDR11, FGFR2, CD44, PDHX, and ING1-associated enhancers are highlighted as they show a statistically significant correlation between enhancer copy numbers and target gene expression levels. b Correlation of H3K27ac signals and ING1 gene expression with the DNA copy numbers for the ING1 enhancer among multiple GC lines. c H3K27ac ChIP-seq and CapSTARR-seq tracks at the enhancer region harboring the ARL4C-associated SNP rs1464264 in OCUM1 and SNU16 cells. d H3K27ac ChIP-seq tracks in GC lines and pcHi-C associations in gastric tissues. The SNP rs1464264 genotype is annotated above the H3K27ac ChIP-seq track in each cell line. Significant chromatin interactions are shown below the axis (green loop). e Differences in expression of ARL4C in three groups of cell lines with different SNP rs1464264 genotypes (GG, AG, and AA). *: P < 0.05, ns: not significant, Mann–Whitney U test. f Difference in expression of ARL4C in three groups of GC patients with different SNP rs1464264 genotypes (GG, AG, and AA) in the Singapore cohort (n = 161). *: P < 0.05, ns: not significant, Student’s t test. g Survival analysis comparing patient groups with samples exhibiting low (green) and high (red) expression of ARL4C in the ACRG cohort (n = 300). P value is calculated using the Log-rank test. Survival data are indicated for every 25 months
Fig. 6 Trans-analysis of differential enhancers. a Top 5 transcription factor binding enrichments at SNU16-specific enhancers determined by HOMER de novo motif analysis. The last column shows the percentage of target sequences with the corresponding motif. b Expression of HNF4α in normal gastric (n = 89) and GC samples (n = 185) from the Singapore cohort. Expression of HNF4α in normal gastric (n = 35) and GC samples (n = 415) from the TCGA cohort. P value: Mann–Whitney U test. c Integration of H3K27ac ChIP-seq data, HNF4α ChIP-seq binding profiles (SNU16, YCC3, IM95, KATOIII, IST1, NUGC4, and OCUM1) and RNA-seq data in control, HNF4α-overexpressing (HFE145), and HNF4α-knockdown (YCC3) cells at the GRHL2 gene locus. The red box indicates an enhancer associated with GRHL2 (chr8: 102,449,130-102,450,795). d GRHL2 gene expression levels and H3K27ac signals over the enhancer for GRHL2 are linearly correlated across 28 cell lines. R: Pearson’s correlation coefficient. P value: Pearson’s correlation test. e Singapore cohort analysis reveals GRHL2 and HNF4α transcriptomic correlations using microarray data (Pearson’s correlation test). f TCGA cohort analysis confirms GRHL2 and HNF4α transcriptomic correlations using RNA-seq data (Pearson’s correlation test)