Xiaohan Tang1, Junting Wang1,2, Huan Tao2, Lin Yuan3, Guifang Du2, Yang Ding2, Kang Xu2, Xuemei Bai1, Yaru Li2, Yu Sun2, Xin Huang2, Xiushuang Zheng1, Qianqian Li1, Bowen Gong1, Yang Zheng2, Jingxuan Xu2, Xiang Xu2, Zhe Wang2, Xiaochen Bo2, Meisong Lu1, Hao Li2, Hebing Chen2.
Abstract
Endometrial cancer (EC) is one of the three fatal tumors of the female reproductive system. Epigenetic alterations, particularly changes in chromatin accessibility and differences in transcription factor binding, have been reported to be important in tumorigenesis. However, the regulatory mechanism underlying epigenetic alterations in EC development remains unclear. Here, we identified and characterized transcription factor binding site clustered regions (TFCRs) by integrating chromatin accessibility and transcription factor binding information. In total, we identified 78,820 TFCRs and explored their relationship with regulatory elements, gene expression, and mutation. Finally, we constructed a bioinformatic framework to identify candidate oncogenes and screened 13 candidate key genes, which may serve as potential diagnostic markers or therapeutic targets for EC.
Keywords: ATAC-seq; Diagnostic biomarkers; Endometrial cancer; TFCR; Transcriptional regulation
Year: 2022 PMID: 35222842 PMCID: PMC8844752 DOI: 10.1016/j.csbj.2022.01.014
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Fig. 1. Schematic diagram for the identification of TFCRs.
Fig. 2. Relationship between TFCRs and ATAC-seq peaks. (A) Pie chart showing the proportion of ATAC-seq peaks that overlap TFCRs. (B) Barplot showing the normalized TFBS distribution for ATAC-seq peaks that do or do not overlap TFCRs. (C) Gene ontology analysis of genes whose ATAC-seq peaks do not overlap TFCRs. (D) Barplot showing the mutation rate (ρ) of the whole genome, TFCRs, ATAC-seq peaks and TFCR-specific peaks. ρ is the number of TCGA mutation sites per thousand bases within a given set of regions.
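The mutation rate ρ in panel (D) is defined as mutation sites per thousand bases, which can be illustrated with a minimal sketch. This is not the authors' code: the function name `mutation_rate_per_kb`, the half-open interval convention, and the assumptions that regions do not overlap and that each mutation entry is a distinct site are all illustrative choices.

```python
from bisect import bisect_left


def mutation_rate_per_kb(regions, mutations):
    """Return mutation sites per 1,000 bases across a set of regions.

    regions:   list of (chrom, start, end), half-open [start, end),
               assumed non-overlapping (hypothetical input format).
    mutations: list of (chrom, pos), each assumed to be a distinct site.
    """
    # Index mutation positions by chromosome and sort them for binary search.
    by_chrom = {}
    for chrom, pos in mutations:
        by_chrom.setdefault(chrom, []).append(pos)
    for positions in by_chrom.values():
        positions.sort()

    total_bases = 0
    total_sites = 0
    for chrom, start, end in regions:
        total_bases += end - start
        positions = by_chrom.get(chrom, [])
        # Count positions p with start <= p < end.
        total_sites += bisect_left(positions, end) - bisect_left(positions, start)

    if total_bases == 0:
        raise ValueError("regions cover zero bases")
    return 1000.0 * total_sites / total_bases


regions = [("chr1", 0, 1000), ("chr1", 2000, 3000)]
mutations = [("chr1", 10), ("chr1", 999), ("chr1", 1000),
             ("chr1", 2500), ("chr2", 5)]
rho = mutation_rate_per_kb(regions, mutations)
```

In this toy input, three sites fall inside 2,000 covered bases. Running the same function over the whole genome, TFCRs, ATAC-seq peaks and TFCR-specific peaks would give the four values compared in the barplot.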
Fig. 3. Genomic characterization of TFCRs and ATAC-seq peaks. (A) Barplot shows the proportion of TFCRs or ATAC-seq peaks located in gene promoters. (B) Stacked barplot shows the proportion of TFCRs or ATAC-seq peaks located in super-enhancers (red) and typical enhancers (gray). (C) Barplot shows the proportion of TFCRs or ATAC-seq peaks with CpG islands. (D) Barplot shows the proportion of TFCRs or ATAC-seq peaks on repeats (microsatellite sites). * P < 0.05.
Fig. 4. Properties of the characteristic parameters TC and CAS in TFCRs. (A) Correlation between TF complexity (TC) and chromatin accessibility score (CAS) in TFCRs (R² = 0.05). (B) Boxplot showing the distribution of the score across groups obtained by binning the TC of TFCRs at intervals of 10. (C-D) Gene expression levels across groups of ATAC-seq peaks classified by CAS and of TFCRs classified by TC. The lines extending above and below the boxes are the whiskers. (E) Barplot showing the mutation distribution of TFCRs with and without a promoter. (F) Line graph showing the mutation rate of TFCRs in gene promoters, classified by TC or CAS (CAS: dark blue; TC: dark red). (G-H) Barplots showing the proportion of reported cancer driver genes in the different TC and CAS classes.
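The grouping in panel (B), where TFCRs are binned by TC at intervals of 10, might look like the following sketch. The record does not state the bin edges, so the half-open bins [0, 10), [10, 20), ... are an assumption, as are the function name `group_by_tc` and the (TC, score) input tuples.

```python
from collections import defaultdict


def group_by_tc(records, width=10):
    """Group scores by TC bin.

    records: iterable of (tc, score) pairs, one per TFCR (hypothetical format).
    width:   bin width; bins are assumed half-open: [0, 10), [10, 20), ...
    Returns a dict mapping (bin_start, bin_end) to the list of scores.
    """
    groups = defaultdict(list)
    for tc, score in records:
        bin_start = (tc // width) * width
        groups[(bin_start, bin_start + width)].append(score)
    # Return bins in ascending TC order, ready for a per-bin boxplot.
    return dict(sorted(groups.items()))


records = [(3, 1.0), (9, 3.0), (10, 5.0), (25, 7.0)]
groups = group_by_tc(records)
```

Each list of scores can then be passed to a boxplot routine, one box per TC bin.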
Fig. 5. Identification of key cancer genes in endometrial cancer. (A) The pipeline used to identify key cancer genes in endometrial cancer based on TFCRs and ATAC-seq peaks. (B) Genome browser view showing the position of the example gene PRSS8 relative to TFCRs, ATAC-seq peaks, TCGA mutation sites and transcription factor binding sites. Each track represents a different component and is shown in a different color. The lower panel is an enlarged view of the upper panel.
Fig. 6. Effect of HOXA9 knockdown on apoptosis, migration and invasion of HEC-1-A. (A) Effect of HOXA9 knockdown on (B) apoptosis, (C) invasion and migration potential of HEC-1-A, detected by flow cytometry and transwell assay. Data are expressed as the mean ± standard deviation of three independent experiments. * P < 0.05; ** P < 0.01; *** P < 0.001. NC, negative control. (D) TFCR regulation model.