Junhee Seok, Wenzhong Xiao, Lyle L Moldawer, Ronald W Davis, Markus W Covert.
Abstract
BACKGROUND: Understanding the transcriptional regulatory networks that map out the coordinated dynamic responses of signaling proteins, transcription factors and target genes over time would represent a significant advance in the application of genome-wide expression analysis. The primary challenge is monitoring transcription factor activities over time, which is not yet possible at large scale. Instead, several methods have been developed to estimate these activities computationally. For example, Network Component Analysis (NCA) is an approach that can predict transcription factor activities over time as well as the relative regulatory influence of factors on each target gene.
Year: 2009 PMID: 19638230 PMCID: PMC2729748 DOI: 10.1186/1752-0509-3-78
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Figure 1. Schematic of the approach. (A) Flowchart describing the steps to reconstruct our initial transcriptional regulatory network. (B) A set of gene expression profiles (matrix E) and a proposed structure for the underlying transcriptional regulatory network (matrix S(0)) are used as inputs for Network Component Analysis (NCA). NCA uses an algorithm that first calculates the expected transcription factor activities (matrix A), and then recalculates S based on the new values of A, repeating until both matrices converge. The outputs of this procedure are A* and S*, the final values of A and S, which provide information about transcription factor activity and regulatory structure, respectively.
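The alternation described in panel B can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the linear model E ≈ S·A (genes × samples ≈ genes × factors times factors × samples), uses plain least squares for both steps, and omits the normalization and identifiability checks that a full NCA implementation would need. The function names `solve_least_squares`, `residual` and `nca_sketch` are hypothetical.

```python
def solve_least_squares(X, y):
    """Least-squares solution of X b ~ y via the normal equations."""
    k = len(X[0])
    # Augmented normal-equation system [X^T X | X^T y].
    M = [[sum(row[a] * row[b] for row in X) for b in range(k)]
         + [sum(row[a] * yi for row, yi in zip(X, y))] for a in range(k)]
    for col in range(k):
        # Gauss-Jordan elimination with partial pivoting.
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: structure is not identifiable")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(k):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][k] / M[i][i] for i in range(k)]


def residual(E, S, A):
    """Sum of squared errors between E and the product S.A."""
    total = 0.0
    for i, row in enumerate(E):
        for j, e in enumerate(row):
            total += (e - sum(S[i][t] * A[t][j] for t in range(len(A)))) ** 2
    return total


def nca_sketch(E, S0, n_iter=50):
    """Alternate between updating A (activities) and S (strengths).

    Zero entries of S0 mark absent connections and are never changed.
    """
    n_genes, n_samples, n_tfs = len(E), len(E[0]), len(S0[0])
    S = [row[:] for row in S0]
    history = []
    for _ in range(n_iter):
        # Step 1: with S fixed, fit the activities for each sample (column of A).
        cols = [solve_least_squares(S, [E[i][j] for i in range(n_genes)])
                for j in range(n_samples)]
        A = [[cols[j][t] for j in range(n_samples)] for t in range(n_tfs)]
        # Step 2: with A fixed, refit each gene's strengths on its allowed factors.
        for i in range(n_genes):
            idx = [t for t in range(n_tfs) if S0[i][t] != 0]
            if not idx:
                continue
            X = [[A[t][j] for t in idx] for j in range(n_samples)]
            for t, c in zip(idx, solve_least_squares(X, E[i])):
                S[i][t] = c
        history.append(residual(E, S, A))
    return S, A, history
```

Because each step is an exact least-squares fit with the other matrix held fixed, the recorded residual cannot increase from one iteration to the next, which gives a simple convergence check.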
Figure 2. Transcription factor activities calculated using NCA. (A) Predicted activities of the ten transcription factors used in this study. For each transcription factor, rows represent progression in time and columns correspond to the four human subjects. Activities in each row are normalized to the zero time point. (B) Transcription factor activities (blue) compared to gene expression (green), with Pearson correlation coefficients noted. Both activity and expression at each time point are averages normalized to the time = 0 values, and the activity is further scaled for direct comparison with the expression values. (C) Correlation matrix between transcription factor activities. Red represents positive correlation, and blue represents negative correlation. (D) Inferred combinatorial regulation pairs of transcription factors. A blue solid line indicates that the pair was supported both by protein-protein interaction data from BIND and by high correlation of their activities (>0.75). A black solid line indicates that the pair was supported only by high correlation, and a blue dotted line indicates that the pair was supported only by the interaction database.
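The three line styles in panel D amount to a simple classification of factor pairs by evidence. The sketch below illustrates that rule under stated assumptions: the 0.75 threshold is applied to the signed correlation as written in the caption, and the function name `classify_tf_pairs`, the input layout, and the label strings are hypothetical.

```python
def classify_tf_pairs(activity_corr, interaction_pairs, threshold=0.75):
    """Label each transcription factor pair by its supporting evidence.

    activity_corr: dict mapping frozenset({tf_a, tf_b}) -> activity correlation.
    interaction_pairs: set of frozensets with a known protein-protein interaction.
    Pairs with neither kind of support are left out of the result.
    """
    labels = {}
    for pair in set(activity_corr) | set(interaction_pairs):
        correlated = activity_corr.get(pair, 0.0) > threshold
        interacting = pair in interaction_pairs
        if correlated and interacting:
            labels[pair] = "both"              # blue solid line
        elif correlated:
            labels[pair] = "correlation only"  # black solid line
        elif interacting:
            labels[pair] = "interaction only"  # blue dotted line
    return labels
```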
NCA simulation with random noisy connections
| | 1% | 5% | 10% | 15% | 20% |
| Mean(Corr) | 0.9892 | 0.9548 | 0.9330 | 0.9107 | 0.8880 |
| SD(Corr) | 0.0678 | 0.1258 | 0.1378 | 0.1579 | 0.1662 |
NCA simulation was performed based on the original network model with 1% to 20% random noisy connections. A pair consisting of a transcription factor and a target gene was randomly selected; a random connection was added if the original connection did not exist, or removed otherwise. For each percentage of random connections, the simulation was repeated 100 times. The mean and standard deviation of the activity correlations with the original noise-free network model were calculated.
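The perturbation step can be sketched as flipping randomly chosen entries of a binary connectivity matrix. Two details are assumptions rather than statements from the text: the percentage is taken relative to the number of connections in the original network, and pairs are drawn without replacement so that no entry is flipped twice. The name `perturb_connectivity` is hypothetical.

```python
import random


def perturb_connectivity(conn, fraction, rng):
    """Return a copy of the 0/1 matrix `conn` (genes x factors) with noise.

    A selected pair gains a connection if none existed and loses it otherwise.
    """
    pairs = [(g, t) for g in range(len(conn)) for t in range(len(conn[0]))]
    n_original = sum(sum(row) for row in conn)
    # Assumption: the noise level is a fraction of the original connection count.
    n_flips = max(1, round(fraction * n_original))
    noisy = [row[:] for row in conn]
    for g, t in rng.sample(pairs, n_flips):
        noisy[g][t] = 1 - noisy[g][t]
    return noisy
```

Repeating this 100 times per noise level, rerunning NCA on each perturbed matrix, and correlating the resulting activities with the noise-free ones would yield the kind of mean and standard deviation reported in the table.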
Figure 3. Hierarchical clustering in the context of a defined regulatory network. (A) The adjusted strength matrix was used for clustering, after which the gene expression matrix was appended. Seven major clusters with more than five associated genes are highlighted. In the adjusted strength matrix heatmap, green indicates that there is no prior regulatory connection in our model, while white indicates a weak regulatory influence. (B) Clustering with gene expression only. Genes in Cluster F (regulated by STAT6) are marked with green dots, and genes in Cluster G (regulated by MYC) are marked with orange dots. (C) Clustering with the binary regulatory relations (initial connectivity matrix), assuming all regulatory strengths are equal.
Major clusters formed from the adjusted strength matrix.
| Cluster | Num. of genes | Dominant transcription factors | Avg. of pair-wise correlations | P-value |
| A | 18 | NFKB1, RELA | 0.5742 | <0.001* |
| B | 11 | STAT1 | 0.7815 | <0.001* |
| C | 5 | STAT3 | 0.5138 | 0.025 |
| D | 17 | JUN, FOS | 0.4054 | 0.001 |
| E | 6 | MYC | 0.4350 | 0.031 |
| F | 5 | STAT6 | 0.7546 | 0.004 |
| G | 21 | MYC | 0.6442 | <0.001* |
For N genes in a cluster, N(N-1)/2 pair-wise Pearson correlations of expression were measured and their average was calculated. Next, N genes were randomly selected from a set of ~18,000 human genes and the average pair-wise correlation was calculated for the random gene set. This random sampling was repeated 1,000 times to generate a null distribution of the average pair-wise correlations. A p-value for each cluster was estimated by counting the number of random gene sets for which the average correlation was larger than the cluster correlation. *No random gene set exceeded the cluster correlation in 1,000 repeats.
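The resampling procedure in this footnote can be sketched as follows. This is an illustration under assumptions: expression profiles are plain lists of numbers, the p-value is the fraction of random sets that exceed the observed average, and the names `pearson`, `avg_pairwise_corr` and `cluster_p_value` are hypothetical.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if one is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def avg_pairwise_corr(profiles):
    """Average of the N(N-1)/2 pair-wise correlations among N profiles."""
    pairs = list(combinations(profiles, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def cluster_p_value(cluster_profiles, background_profiles, n_repeats=1000, rng=None):
    """Estimate how often a random gene set is more coherent than the cluster."""
    rng = rng or random.Random()
    observed = avg_pairwise_corr(cluster_profiles)
    n = len(cluster_profiles)
    exceed = sum(
        avg_pairwise_corr(rng.sample(background_profiles, n)) > observed
        for _ in range(n_repeats)
    )
    return observed, exceed / n_repeats
```

With 1,000 repeats the smallest non-zero estimate is 0.001, so a count of zero can only be reported as an upper bound, as in the starred table entries.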
Figure 4. Identification of new target genes for major clusters. (A) The average expression profiles of the four clusters with > 10 members. (B) Expression of extended regulatory genes sorted by correlation coefficient (c) with the average expression profile of a cluster. Each extended gene group was divided into highly correlated (c > 0.5), un-correlated (t0.5
Predicted genes for Cluster A and B from the extended gene sets.
| Cluster | Gene Symbol | Gene Name |
| A | CFLAR | CASP8 and FADD-like apoptosis regulator |
| | CXCL1 | chemokine (C-X-C motif) ligand 1 |
| | EFNA1 | ephrin-A1 |
| | G0S2 | G0/G1 switch 2 |
| | IL1R1 | interleukin 1 receptor, type I |
| | IL1RN | interleukin 1 receptor antagonist |
| | OLR1 | oxidized low density lipoprotein (lectin-like) receptor 1 |
| | PLK3 | polo-like kinase 3 |
| | PTX3 | pentraxin-related gene, rapidly induced by IL-1 beta |
| | RUNX1 | runt-related transcription factor 1 |
| | TNFAIP6 | tumor necrosis factor, alpha-induced protein 6 |
| | TNIP1 | TNFAIP3 interacting protein 1 |
| B | CASP4 | caspase 4, apoptosis-related cysteine peptidase |
| | CD14 | CD14 molecule |
| | CISH | cytokine inducible SH2-containing protein |
| | GBP1 | guanylate binding protein 1, interferon-inducible, 67 kDa |
| | GBP2 | guanylate binding protein 2, interferon-inducible |
| | GCH1 | GTP cyclohydrolase 1 |
| | HSPA1A | heat shock 70 kDa protein 1A |
| | IFI16 | interferon, gamma-inducible protein 16 |
| | IFIT3 | interferon-induced protein with tetratricopeptide repeats 3 |
| | IFITM1 | interferon induced transmembrane protein 1 |
| | IGF1R | insulin-like growth factor 1 receptor |
| | IL10RB | interleukin 10 receptor, beta |
| | IRF2 | interferon regulatory factor 2 |
| | ISG15 | ISG15 ubiquitin-like modifier |
| | ISG20 | interferon stimulated exonuclease gene 20 kDa |
| | JAK2 | Janus kinase 2 |
| | MX1 | myxovirus (influenza virus) resistance 1, interferon-inducible protein p78 |
| | PLSCR1 | phospholipid scramblase 1 |
| | RIPK1 | receptor (TNFRSF)-interacting serine-threonine kinase 1 |
| | SOCS1 | suppressor of cytokine signaling 1 |
| | STAT2 | signal transducer and activator of transcription 2 |
| | TIMP1 | TIMP metallopeptidase inhibitor 1 |
| | USP18 | ubiquitin specific peptidase 18 |
| | WARS | tryptophanyl-tRNA synthetase |
Genes that showed high correlation to the average expression profile of a cluster (c > 0.5) were accepted as part of the cluster.
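The acceptance rule can be sketched as a correlation filter against the cluster's average profile. This is a minimal illustration assuming profiles are lists of expression values over the same time points; the function names and the dictionary layout for candidate genes are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if one is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def extend_cluster(cluster_profiles, candidates, threshold=0.5):
    """Return candidate gene names whose profile correlates with the cluster mean.

    candidates: dict mapping gene name -> expression profile.
    """
    # Average expression profile of the cluster, one value per time point.
    average = [sum(col) / len(col) for col in zip(*cluster_profiles)]
    return [name for name, profile in candidates.items()
            if pearson(profile, average) > threshold]
```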
Predicted genes for Cluster A and B from the "no evidence" group.
| Cluster | Category | Gene Symbol | Gene Name |
| A | Top 10 | ETS2 | v-ets erythroblastosis virus E26 oncogene homolog 2 |
| | | MTF1 | metal-regulatory transcription factor 1 |
| | | SAMSN1 | SAM domain, SH3 domain and nuclear localization signals 1 |
| | | IVNS1ABP | influenza virus NS1A binding protein |
| | | IFNGR2 | interferon gamma receptor 2 |
| | | PLAUR | plasminogen activator, urokinase receptor |
| | | IL1R2 | interleukin 1 receptor, type II |
| | | AZIN1 | antizyme inhibitor 1 |
| | | EHD1 | EH-domain containing 1 |
| | | PCNX | pecanex homolog |
| | Related to innate immunity | IFNGR2 | interferon gamma receptor 2 |
| | | IL1R2 | interleukin 1 receptor, type II |
| | | MAP4K4 | mitogen-activated protein kinase kinase kinase kinase 4 |
| | | NCF1 | neutrophil cytosolic factor 1 |
| | | TXN | thioredoxin |
| | | RIPK2 | receptor-interacting serine-threonine kinase 2 |
| | Supported by other evidence | IFNGR2* | interferon gamma receptor 2 |
| | | GPR84** | G protein-coupled receptor 84 |
| | | FCAR** | Fc fragment of IgA, receptor for |
| | | GADD45B** | growth arrest and DNA-damage-inducible, beta |
| B | Top 10 | TRIM5 | tripartite motif-containing 5 |
| | | GK | glycerol kinase |
| | | SP110 | SP110 nuclear body protein |
| | | TMEM140 | transmembrane protein 140 |
| | | CHMP5 | chromatin modifying protein 5 |
| | | RHBDF2 | rhomboid 5 homolog 2 |
| | | SAMD9 | sterile alpha motif domain containing 9 |
| | | DDX58 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 58 |
| | | TLE3 | transducin-like enhancer of split 3 |
| | | TRIM21 | tripartite motif-containing 21 |
| | Related to innate immunity | CSF2RB | colony stimulating factor 2 receptor, beta, low-affinity |
| | | HCK | hemopoietic cell kinase |
| | | IL13RA1 | interleukin 13 receptor, alpha 1 |
| | | TLR1 | toll-like receptor 1 |
| | | TNFRSF1A | tumor necrosis factor receptor superfamily, member 1A |
| | Supported by other evidence | DDX5*** | DEAD (Asp-Glu-Ala-Asp) box polypeptide 5 |
| | | HERC5*** | hect domain and RLD 5 |
| | | IFIT2*** | interferon-induced protein with tetratricopeptide repeats 2 |
| | | OASL*** | 2'-5'-oligoadenylate synthetase-like |
| | | SNX10*** | sorting nexin 10 |
| | | ACSL4*** | acyl-CoA synthetase long-chain family member 4 |
The ten most correlated target genes, as well as genes related to innate immunity (from the fifty most correlated genes) and genes which were supported by other evidence, are shown. *Predicted as a target gene from the motif sequence analysis. **DNA-protein binding detected in LPS stimulation. ***Up-regulated by a mutant of STAT1 which can activate expression in the absence of tyrosine phosphorylation.
Figure 5. A dynamic network of transcription. At time zero, LPS is injected, giving rise to transcription factor activation, which then leads to induction or repression of gene expression, production and secretion of cytokines, and initiation of secondary signals. Target genes which correspond to secreted proteins (e.g., IL10, IL1A and IL1B) are noted with green circles, and transcription factors that are regulated by other factors, such as STAT1 and MYC, are noted with cyan circles. The seven major clusters marked in Figure 3A are grouped with orange boxes. Black lines denote activation of a transcription factor by an extracellular signal, red and blue lines show the influence of a transcription factor on a target gene, and green dotted lines indicate secretion of a gene product.