Dabin Jeong1, Sangsoo Lim2, Sangseon Lee3, Minsik Oh4, Changyun Cho1, Hyeju Seong5, Woosuk Jung5, Sun Kim1,2,6.
Abstract
A gene expression profile, or transcriptome, can represent cellular states; thus, understanding gene regulation mechanisms can help explain how cells respond to external stress. Interaction between a transcription factor (TF) and a target gene (TG) is one of the representative regulatory mechanisms in cells. In this paper, we present a novel computational method to construct condition-specific transcriptional networks from transcriptome data. Regulatory interaction between TFs and TGs is very complex, involving multiple-to-multiple relations. Experimental data from TF chromatin immunoprecipitation sequencing (ChIP-seq) are useful but yield only one-to-multiple relations between a TF and its TGs. On the other hand, co-expression networks of genes can be useful for constructing condition-specific transcriptional networks, but they contain many false positive relations. We therefore propose a method to construct a condition-specific and combinatorial transcriptional network by applying kernel canonical correlation analysis (kernel CCA) to identify multiple-to-multiple TF-TG relations in a given biological condition. Kernel CCA is a well-established statistical method for computing the correlation between one group of features and another group of features. We employed kernel CCA to embed TFs and TGs into a new space in which the correlation between TFs and TGs is reflected. To demonstrate the usefulness of our network construction method, we applied it to human blood transcriptome data to investigate the response to a high-fat diet and to an Arabidopsis data set to investigate the response to cold and heat stress. Our method detected not only important regulatory interactions reported in previous studies but also novel TF-TG relations in which a module of TFs regulates a module of TGs under a specific stress.
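To make the kernel CCA idea concrete, the following is a minimal sketch of regularized kernel CCA between two views of the same samples. It is not the authors' implementation: the RBF kernel, the regularization value `reg`, the function names, the layout (rows are samples, columns are TF or TG expression values), and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix between the rows of X (assumed kernel choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)

def center_kernel(K):
    """Center a kernel matrix in feature space."""
    n = K.shape[0]
    H = np.eye(n) - np.full((n, n), 1.0 / n)
    return H @ K @ H

def shrink(K, reg, power):
    """Return (K (K + reg*I)^-1)^power via an eigendecomposition of K."""
    w, U = np.linalg.eigh(K)
    w = np.clip(w, 0.0, None)  # drop tiny negative eigenvalues from round-off
    return (U * (w / (w + reg)) ** power) @ U.T

def kernel_cca(Kx, Ky, reg=0.1, n_components=2):
    """Regularized kernel CCA, solved as a symmetric eigenproblem.

    Returns the regularized canonical correlations and the per-sample
    scores of each view (proportional to Kx @ alpha and Ky @ beta).
    """
    Kx, Ky = center_kernel(Kx), center_kernel(Ky)
    Bh = shrink(Kx, reg, 0.5)   # (Kx (Kx + reg*I)^-1)^(1/2)
    A = shrink(Ky, reg, 1.0)    # Ky (Ky + reg*I)^-1
    S = Bh @ A @ Bh             # symmetric, eigenvalues in [0, 1)
    vals, W = np.linalg.eigh((S + S.T) / 2.0)
    top = np.argsort(vals)[::-1][:n_components]
    rho = np.sqrt(np.clip(vals[top], 0.0, 1.0))
    x_scores = Bh @ W[:, top]
    y_scores = A @ x_scores
    return rho, x_scores, y_scores

# Synthetic stand-in data: 40 samples, 5 "TF" and 8 "TG" features that
# share one latent signal z (hypothetical, not data from the paper).
rng = np.random.default_rng(0)
z = rng.normal(size=(40, 1))
X = z @ rng.normal(size=(1, 5)) + 0.1 * rng.normal(size=(40, 5))
Y = np.tanh(z) @ rng.normal(size=(1, 8)) + 0.1 * rng.normal(size=(40, 8))
rho, x_scores, y_scores = kernel_cca(rbf_kernel(X, 1.0 / 5), rbf_kernel(Y, 1.0 / 8))
```

The regularization term matters in practice: without it, kernel CCA on a small number of samples tends to report near-perfect correlations regardless of the data, so `reg` would need to be tuned for any real analysis.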
Keywords: TF cooperation; condition specific network; gene regulatory network; kernel canonical correlation analysis; network dynamics; transcription factor
Year: 2021 PMID: 34093651 PMCID: PMC8172963 DOI: 10.3389/fgene.2021.652623
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Figure 1. Workflow. STEP 1: To detect interacting transcription factor (TF) modules and target gene (TG) modules, a prior protein–protein interaction (PPI) network was instantiated with gene expression data, and a community detection algorithm was used to detect condition-specific TF and TG modules. STEP 2: To obtain putative TF–TG relations, we projected each TF module onto each TG module through a public gene regulatory network (GRN). This projection is performed for every possible TF–TG module pair. STEP 3: Using kernel canonical correlation analysis (CCA), we constructed a condition-specific GRN that captures multiple-to-multiple regulatory relationships between TFs and TGs.
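STEP 1 and STEP 2 of the workflow can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' pipeline: the prior PPI network is "instantiated" here by keeping only edges whose endpoints are strongly co-expressed (the threshold `min_abs_corr` is an assumption), connected components are used as a simple stand-in for a real community detection algorithm, and all gene names and expression values are hypothetical.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def instantiate_ppi(ppi_edges, expr, min_abs_corr=0.8):
    """Keep prior PPI edges whose endpoints are co-expressed in this condition."""
    return [(u, v) for u, v in ppi_edges
            if abs(pearson(expr[u], expr[v])) >= min_abs_corr]

def modules(nodes, edges):
    """Connected components: a stand-in for community detection."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, found = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in comp:
                comp.add(node)
                stack.extend(adj[node] - comp)
        seen |= comp
        found.append(comp)
    return found

def project(tf_module, tg_module, grn_edges):
    """Putative TF-TG relations: public GRN edges from the TF module into the TG module."""
    return sorted((tf, tg) for tf, tg in grn_edges
                  if tf in tf_module and tg in tg_module)

# Hypothetical toy inputs.
expr = {
    "TF_A": [1, 2, 3, 4, 5], "TF_B": [2, 4, 6, 8, 11], "TF_C": [5, 1, 4, 2, 3],
    "TG_1": [1, 3, 2, 5, 4], "TG_2": [2, 4, 3, 6, 5], "TG_3": [3, 3, 1, 2, 5],
}
tfs, tgs = ["TF_A", "TF_B", "TF_C"], ["TG_1", "TG_2", "TG_3"]
ppi = [("TF_A", "TF_B"), ("TF_B", "TF_C"), ("TG_1", "TG_2"), ("TG_2", "TG_3")]
public_grn = [("TF_A", "TG_1"), ("TF_B", "TG_2"), ("TF_C", "TG_1"), ("TF_A", "TG_3")]

kept = instantiate_ppi(ppi, expr)
tf_modules = modules(tfs, [e for e in kept if e[0] in tfs and e[1] in tfs])
tg_modules = modules(tgs, [e for e in kept if e[0] in tgs and e[1] in tgs])

# STEP 2 is repeated for every TF module / TG module pair.
putative = {(i, j): project(tf_m, tg_m, public_grn)
            for i, tf_m in enumerate(tf_modules)
            for j, tg_m in enumerate(tg_modules)}
```

Each non-empty entry of `putative` is a candidate module pair whose TF and TG expression could then be passed to a kernel CCA step.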
Comparison of our method to ARACNe-AP and GENIE3 in terms of specificity, precision, and recall with respect to the ground truth network from a literature search tool, BEST (Lee et al., 2016).
| +3 h | Node comparison | Specificity | 0.841 | 0.270 | 0.692 | 0.961 |
| +3 h | Node comparison | Recall | 0.230 | 0.829 | 0.533 | 0.483 |
| +3 h | Edge comparison | Precision | 0 | 8.05 × 10⁻⁶ | 9.01 × 10⁻³ | 3.12 × 10⁻² |
| +3 h | Edge comparison | Recall | 0 | 9.04 × 10⁻³ | 0.413 | 0.591 |
| +6 h | Node comparison | Specificity | 0.869 | 0.277 | 0.741 | 0.957 |
| +6 h | Node comparison | Recall | 0.197 | 0.830 | 0.451 | 0.389 |
| +6 h | Edge comparison | Precision | 6.30 × 10⁻⁶ | 8.20 × 10⁻⁶ | 6.51 × 10⁻⁴ | 2.89 × 10⁻² |
| +6 h | Edge comparison | Recall | 8.84 × 10⁻⁴ | 9.04 × 10⁻³ | 0.188 | 0.340 |
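The metrics in the table follow the standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN), and specificity = TN / (TN + FP). The sketch below shows how node-level and edge-level comparisons against a ground truth network could be computed. How the paper defines the candidate universe for true negatives is not stated here, so the choice below (all TF–TG pairs for edges, all genes for nodes) is an assumption, and the gene names are hypothetical.

```python
from itertools import product

def compare(predicted, truth, universe):
    """Precision, recall, and specificity of a predicted set against a truth set."""
    predicted, truth, universe = set(predicted), set(truth), set(universe)
    tp = len(predicted & truth)
    fp = len(predicted - truth)
    fn = len(truth - predicted)
    tn = len(universe - predicted - truth)
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }

def nodes_of(edges):
    """All genes that appear in at least one edge."""
    return {gene for edge in edges for gene in edge}

# Hypothetical toy networks.
tfs = ["TF_A", "TF_B"]
tgs = ["TG_1", "TG_2", "TG_3", "TG_4"]
truth_edges = {("TF_A", "TG_1"), ("TF_B", "TG_2")}
predicted_edges = {("TF_A", "TG_1"), ("TF_A", "TG_3")}

# Edge comparison: the universe is every possible TF-TG pair (assumption).
edge_scores = compare(predicted_edges, truth_edges, product(tfs, tgs))

# Node comparison: the universe is every gene considered (assumption).
node_scores = compare(nodes_of(predicted_edges), nodes_of(truth_edges), tfs + tgs)
```

Because the number of possible edges grows with the product of TF and TG counts, edge-level precision is typically orders of magnitude lower than node-level scores, which matches the pattern in the table.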
Figure 2. Network dynamics of a gene regulatory network (GRN) sub-network over time after a high-fat meal (HFM). The two circular diagrams in the upper panel show the change in gene–gene relationships, in particular TF–TG regulation. The two tables in the middle summarize the 10 most enriched biological pathways with false discovery rate (FDR)-corrected p-value < 0.05. The two networks show the dynamics of the TF–TG sub-network with the highest dynamics score. In this sub-network, two transcription factors (TFs), FOXO3 and FOXO4, regulate different sets of target genes (TGs) over time. Diamond-shaped nodes denote TFs, square nodes denote TGs, and circular nodes denote genes connected to TFs and TGs. Pink nodes denote genes that constitute the GRN at each time point. Red nodes denote TFs shared among the concatenated sub-networks, and green nodes denote differentially expressed genes (DEGs) detected by DESeq with FDR < 0.05.
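The FDR threshold used for the pathway tables and DEG calls can be illustrated with the Benjamini-Hochberg procedure. The source only says "FDR", so the choice of Benjamini-Hochberg is an assumption, and the pathway names and p-values below are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical enrichment p-values for four pathways.
pathways = ["pathway_1", "pathway_2", "pathway_3", "pathway_4"]
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
significant = [name for name, q in zip(pathways, adj_p) if q < 0.05]
```

In this toy case three raw p-values are below 0.05, but only one pathway remains significant after correction.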
Figure 3. Examination of transcription factor (TF) cooperation. High scores of B and C indicate a high degree of cooperativity among co-working TFs in the pathways. The heatmap in the left panel shows cooperation in terms of pathway enrichment over time after a high-fat meal (HFM). Pathway enrichment in the G was compared to the simulations given each TF (G) and measured using Equation (5). The heatmap in the right panel shows the cooperative potential using enriched pathway genes. Betweenness centrality was compared between G and G using Equation (6).
Figure 4. Network dynamics of a gene regulatory network (GRN) sub-network over time after heat stress. The network in the left panel shows the change in gene–gene relationships, in particular TF–TG regulation. Differentially expressed genes (DEGs) are shown in pink. The four tables in the middle summarize the five most enriched biological pathways with false discovery rate (FDR)-corrected p-value < 0.05. Heat stress-related Gene Ontology (GO) terms are enriched in GO enrichment tests with DEGs. The networks in the right panel show the dynamics of a TF–TG sub-network that is enriched in DEGs. Green nodes denote transcription factors (TFs), blue nodes denote target genes (TGs), and gray nodes denote genes connected to TFs and TGs. Square nodes denote DEGs detected by limma with FDR < 0.05.
Figure 5. Network dynamics of a gene regulatory network (GRN) sub-network over time after cold stress. The networks in the left panel show the change in gene–gene relationships, in particular TF–TG regulation. Differentially expressed genes (DEGs) are shown in pink. The four tables in the middle summarize the five most enriched biological pathways with false discovery rate (FDR)-corrected p-value < 0.05. Cold stress-related Gene Ontology (GO) terms are enriched in GO enrichment tests with DEGs. The networks in the right panel show the dynamics of a TF–TG sub-network that is enriched in DEGs. Green nodes denote transcription factors (TFs), blue nodes denote target genes (TGs), and gray nodes denote genes connected to TFs and TGs. Square nodes denote DEGs detected by limma with FDR < 0.05. We show GRNs only from tp3 to tp6 because the GRNs constructed at tp1 and tp2 are too small: only 41 and 23 DEGs were detected at tp1 and tp2, respectively.