Jeff Nie, Ron Stewart, Hang Zhang, James A Thomson, Fang Ruan, Xiaoqi Cui, Hairong Wei.
Abstract
BACKGROUND: Identifying the key transcription factors (TFs) controlling a biological process is the first step toward a better understanding of the underlying regulatory mechanisms. However, because gene regulatory networks involve a large number of genes and complex interactions, identifying the TFs involved in a biological process remains particularly difficult. The challenges include: (1) Most eukaryotic genomes encode thousands of TFs, which are organized in gene families of various sizes and in many cases with poor sequence conservation, making it difficult to recognize TFs for a biological process; (2) Transcription usually involves several hundred genes that generate a combination of intrinsic noise from upstream signaling networks and lead to fluctuations in transcription; (3) A TF can function in different cell types or developmental stages. Currently, methods for identifying TFs involved in biological processes are still scarce, and novel, more powerful methods are badly needed.
Year: 2011 PMID: 21496241 PMCID: PMC3101171 DOI: 10.1186/1752-0509-5-53
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
TF cluster identified with pluripotency of human embryonic stem cells
| Genes | Symbol | Description | Evidence |
|---|---|---|---|
| NM_024865 | NANOG | Nanog homeobox | [ |
| BC099704 | NANOGP8 | Nanog homeobox pseudogene 8 | Pseudogene with similarity to Nanog. |
| NM_003106 | SOX2 | SRY box 2 | [ |
| NM_002701 | POU5F1 | POU class 5 homeobox 1 | [ |
| NM_006892 | DNMT3B | DNA methyltransferase 3 beta | [ |
| NM_004078 | CSRP1 | cysteine-rich protein | |
| NM_080618 | CTCFL | CCCTC-binding factor (zinc finger protein)-like | [ |
| NM_016089 | ZNF589 | Zinc finger 589 | [ |
| NM_004426 | PHC1 | Polyhomeotic homolog 1 | [ |
| NM_005407 | SALL2 | Sal-like 2 | [ |
| NM_018645 | HES6 | Hairy and enhancer of split 6 | |
| NM_173547 | TRIM65 | Tripartite motif containing 65 | |
| NM_004427 | PHC2 | Polyhomeotic homolog 2 | [ |
| NM_032805 | ZFP206 | Zinc finger protein 206 (ZSCAN10) | [ |
| NM_001421 | ELF4 | ETS domain TF | |
| NM_003325 | HIRA | HIR Histone Cell Cycle regulator | [ |
| NM_033204 | ZNF101 | Zinc finger protein 101 | |
| BC098403 | ETV1 | ETS variant 1 | [ |
| NM_006079 | CITED2 | Cbp/p300-interacting transactivator | [ |
| NM_021728 | OTX2 | Orthodenticle homeobox 2 | |
| NM_024015 | HOXB4 | Homeobox B4 | |
| NM_006074 | TRIM22 | Tripartite motif-containing 22 | [ |
| XM_929986 | LOC653441 | Similar to polyhomeotic 1-like | Gene with sequence similarity to PHC1 |
| NM_004497 | FOXA3 | Forkhead box A3 | |
| BC008687 | NEUROG1 | Neurogenin 1 | [ |
| NM_006161 | NEUROG1 | Neurogenin 1 | [ |
| NM_001965 | EGR4 | Early growth response | |
| NM_033178 | DUX4 | Double homeobox 4 | [ |
| NM_006732 | FOSB | FBJ oncogene homolog B | [ |
| NM_003317 | TITF1 | NK2 homeobox 1 | [ |
| NM_002478 | MYOD1 | myogenic differentiation 1 | [ |
| NM_006192 | PAX1 | Paired box 1 | [ |
| NM_002700 | POU4F3 | POU class 4 homeobox 3 | [ |
| BC10493 | POU4F3 | POU class 4 homeobox 3 | [ |
| NM_001002295 | GATA3 | GATA binding protein 3 | Trophectoderm [ |
| NM_012258 | HEY1 | Hairy/enhancer-of-split related with YRPW motif 1 | Trophectoderm [ |
| NM_001804 | CDX1 | Caudal type homeobox 1 | |
| NM_001430 | EPAS1 | Endothelial PAS domain protein 1 | |
| NM_032638 | GATA2 | GATA binding protein 2 | Trophectoderm [ |
| NM_030379 | GLI2 | GLI family zinc finger 2 | Mesoderm [ |
| NM_017410 | HOXC13 | Homeobox C13 | Ectoderm[ |
| NM_002202 | ISL1 | ISL LIM homeobox 1 | Mesoderm [ |
| NM_033343 | LHX4 | LIM homeobox 4 | |
| NM_002315 | LMO1 | LIM domain only 1 (rhombotin 1) | |
| NM_005461 | MAFB | v-maf musculoaponeurotic fibrosarcoma oncogene Homolog B (avian) | Neural [ |
| NM_002448 | MSX1 | Msh homeobox 1 | |
| NM_002449 | MSX2 | Msh homeobox 2 | Mesoderm[ |
| NM_175747 | OLIG3 | Oligodendrocyte transcription factor 3 | Neural [ |
| NM_006099 | PIAS3 | Protein inhibitor of activated STAT, 3 | Neural [ |
| NM_019854 | PRMT8 | Protein arginine methyltransferase 8 | Neural [ |
| NM_030567 | PRR7 | Proline rich 7 (synaptic) | [ |
| BC071571 | RFX2 | Regulatory factor X, 2 (influences HLA class II expression) | |
| NM_003068 | SNAI2 | Snail homolog 2 (Drosophila) | Neural Crest[ |
| NM_031439 | SOX7 | SRY (sex determining region Y)-box 7 | Endoderm (Parietal) [ |
| NM_003150 | STAT3 | Signal transducer and activator of transcription 3 (acute-phase response factor) | |
| NM_003221 | TFAP2B | Transcription factor AP-2 beta (activating enhancer binding protein 2 beta) | |
| NM_016267 | VGLL1 | Vestigial like 1 (Drosophila) | |
| NM_007129 | ZIC2 | Zic family member 2 (odd-paired homolog, Drosophila) | Neural [ |
| NM_152320 | ZNF641 | zinc finger protein 641 | |
Clusters 1, 2, 5, 7, and 19 identified from salt stress data of Arabidopsis roots, containing TFs involved in root growth and development
| Gene | Symbol | Description | Evidence |
|---|---|---|---|
| AT5G58010 | LRL3 | Roothairless1 | [ |
| AT5G19790 | RAP2.11 | Ethylene response factor controlling root growth | [ |
| AT1G27740 | RSL4 | Postmitotic cell growth in root-hair cells | [ |
| AT1G66470 | RHD6 | Early root hair formation | [ |
| AT5G25810 | TINY | ERF/AP2 TF control cell expansion in root | [ |
| AT2G28160 | FRU | Regulates iron uptake responses in outer cells of root | [ |
| AT1G33280 | BRN1 | BRN1, SMB control root cap maturation | [ |
| AT4G10350 | BRN2 | BRN2, SMB control root cap maturation | [ |
| AT1G79580 | SMB | FEZ and SMB control root stem cells | [ |
| AT5G39820 | ANAC094 | Apical meristem protein, function unknown | [ |
| AT1G26870 | FEZ | FEZ and SMB control root stem cells in cap | [ |
| AT1G74500 | TOM7 | Embryonic root initiation | [ |
| AT3G27010 | TCP20 | Postembryonic cell division in root | [ |
| AT2G30340 | LBD13 | Expressed in cells at the adaxial base of lateral roots | [ |
| AT2G40470 | LBD15 | Expressed in cells at the adaxial base of lateral roots | [ |
| AT1G51190 | PLT2 | Control root stem cell activity near cap | [ |
| AT1G66350 | RGL1 | Root epidermal differentiation | [ |
| AT2G37260 | TTG2 | Differentiation of trichomes and root hairless cells | [ |
| AT5G57420 | IAA33 | IAA is involved in root development | [ |
| AT2G29060 | | Scarecrow transcription factor family protein | |
| AT5G07580 | | DNA binding/transcription factor | |
| AT1G21340 | | Dof-type zinc finger DNA-binding protein | |
| AT1G75710 | | C2H2-like zinc finger protein | |
| AT1G77200 | | DREB subfamily A-4 of ERF/AP2 transcription factor | |
| AT1G71930 | VND7 | Regulates xylem vessel formation | [ |
| AT5G12870 | MYB46 | Target of SND1, controls secondary wall biosynthesis | [ |
| AT1G01780 | LIM | LIM domain-containing protein | |
| AT1G12260 | VND4 | Switches for protoxylem and metaxylem vessel formation | [ |
| AT1G17950 | MYB52 | Secondary wall growth | [ |
| AT1G63910 | MYB103 | Secondary wall growth | [ |
| AT1G66230 | MYB20 | Secondary wall growth | [ |
| AT1G68810 | bHLH | Root vascular initial | [ |
| AT1G73410 | MYB54 | Secondary wall growth | [ |
| AT2G39830 | DAR2 | DA-1 related, control organ size | [ |
| AT2G45420 | LBD18 | Lateral root and tracheary element formation | [ |
| AT3G21270 | ADOF2 | Early stages of vascular development | [ |
| AT4G00220 | JLO | A central regulator of auxin distribution and signaling in root | [ |
| AT4G28500 | SND2 | Vascular cell differentiation | [ |
| AT5G66610 | DAR7 | DA-1 related, control organ size | [ |
| AT5G24330 | AtXR6 | Cell cycle regulation of late G1 to S phase | [ |
| AT3G01330 | DEL3 | Cyclin D/retinoblastoma/E2F pathway | [ |
| AT2G22840 | AtGRF1 | Growth factor expressed in root | [ |
| AT2G36400 | AtGRF3 | Growth factor expressed in root | [ |
| AT4G37740 | AtGRF2 | Growth factor expressed in root | [ |
| AT3G50870 | MNP | GATA transcription factor | [ |
| AT1G34355 | PS1 | Parallel spindle 1 involved in meiosis | [ |
| AT4G23800 | HMG1/HMG2 | High mobility group 1, 2 | [ |
| AT5G25475 | | Transcription factor B3 family | |
| AT2G46270 | GBF3 | induced by ABA under water deprivation | [ |
| AT3G19290 | ABF4 | Regulate ABRE-dependent ABA signaling involved in drought stress | [ |
| AT1G21000 | Zinc | zinc-binding family protein | |
| AT1G51140 | bHLH | Drought stress | [ |
| AT1G52890 | ANAC019 | Bind to drought-responsive cis-element in response to ABA | [ |
| AT1G73730 | EIL3 | Ethylene signaling | [ |
| AT2G18550 | HB-2 | DNA binding/transcription factor | |
| AT2G46680 | ATHB7 | Growth regulator in response to ABA | [ |
| AT3G12980 | HAC5 | H3/H4 histone acetyltransferase/histone acetyltransferase | |
| AT3G61890 | ATHB12 | Growth regulator in response to ABA | [ |
| AT4G21440 | MYB102 | ABA-induced protein | [ |
| AT4G25480 | DREB1A | Drought stress genes responsive to ABA | |
| AT4G27410 | RD26 | Transcriptional activator in ABA-mediated dehydration response | [ |
| AT4G34000 | ABF3 | Regulate ABRE-dependent ABA-mediated dehydration response | [ |
| AT4G37180 | MYB | myb family transcription factor | |
| AT5G04760 | MYB | myb family transcription factor | |
| AT5G47640 | NF-YB2 | NF-YB2 (NUCLEAR FACTOR Y, SUBUNIT B2); transcription factor | |
Comparison of Triple-Link with MCL and Affinity Propagation
| AGI | Cluster ID (TL) | Cluster ID (MCL) | Cluster ID (AP) |
|---|---|---|---|
Figure 1. In the clusters identified by Triple-Link, the nodes within each cluster are well connected to one another. 1A: Cluster 1 (yellow nodes) contains the TFs predicted to be involved in pluripotency in human embryonic stem cells. The three genes in the center are the master TFs NANOG, POU5F1, and SOX2, which are crucial for pluripotency. A few nodes located immediately outside the inner ring may not always be captured, depending on the parameters used. 1B: Cluster 7 (yellow nodes) contains the TFs controlling root growth in Arabidopsis under salt stress.
Figure 2. Normal Q-Q plot and Q-Q plots of a few genes: one for which Spearman and regression make no difference (SOX3), and two for which they differ (RGC32 and HAND2).
Intersection of the genes coexpressed with NANOG, SOX2, and POU5F1 when Spearman and regression are used
| Common Genes | Unique Genes |
|---|---|
| Regression/Pearson | |
| Spearman | |
Figure 3. Shared coexpression connectivity represents the coordination of two TFs in the context of other genes, whereas the correlation coefficient or regression p-value between two TFs reflects only the distance between those two TFs with regard to coexpression.
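The contrast drawn in the Figure 3 caption can be illustrated with a minimal sketch. It assumes that shared coexpression connectivity between two TFs is counted as the number of genes appearing in both TFs' top-`top_k` Spearman-correlated partner lists; that counting rule, the function names, and the toy expression values are illustrative assumptions, not the definition or data used in the paper.

```python
from itertools import combinations

def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            r[order[k]] = mean_rank
        i = j + 1
    return r

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def coexpressed_partners(tf, targets, expr, top_k):
    """The top_k target genes most positively rank-correlated with tf."""
    scored = sorted(((spearman(expr[tf], expr[g]), g) for g in targets), reverse=True)
    return {g for _, g in scored[:top_k]}

def shared_connectivity(tfs, targets, expr, top_k):
    """Assumed rule: count the partners two TFs have in common."""
    partners = {tf: coexpressed_partners(tf, targets, expr, top_k) for tf in tfs}
    return {(a, b): len(partners[a] & partners[b]) for a, b in combinations(tfs, 2)}

# Toy expression profiles (hypothetical genes, made-up values).
expr = {
    "TF_A": [1, 2, 3, 4, 5],
    "TF_B": [2, 3, 5, 7, 11],
    "TF_C": [5, 1, 4, 2, 3],
    "g1": [1, 3, 4, 6, 9],
    "g2": [0, 1, 2, 5, 6],
    "g3": [9, 2, 7, 3, 5],
    "g4": [6, 0, 5, 1, 2],
}
shared = shared_connectivity(["TF_A", "TF_B", "TF_C"], ["g1", "g2", "g3", "g4"], expr, top_k=2)
```

In this toy data, TF_A and TF_B share both of their top partners while TF_C shares none with either, so the count captures coordination through other genes rather than only the pairwise correlation between two TFs.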
Figure 4. The workflow of TF-Cluster, an automated package that recognizes the transcription regulators controlling a biological process from gene expression data (microarray or RNA-seq). TF recognition comprises two phases: construction of a TF coexpression network and decomposition of the network into coordinated TF clusters. The package was developed in Perl and can therefore be used on various platforms.
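The second phase named in the Figure 4 caption, decomposing the network into TF clusters, can be sketched as follows. This is not the Triple-Link algorithm, whose rules are not given in this record. As a plainly simpler stand-in, it thresholds a precomputed table of pairwise shared-connectivity counts and returns the connected components. The function name `decompose`, the `min_shared` threshold, and the toy counts are all hypothetical.

```python
def decompose(shared, min_shared):
    """Group TFs into connected components of the thresholded network.

    shared: dict mapping (tf_a, tf_b) -> shared-connectivity count.
    min_shared: minimum count for an edge to be kept (assumed parameter).
    Stand-in for network decomposition; NOT the Triple-Link procedure.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), count in shared.items():
        root_a, root_b = find(a), find(b)  # registers both TFs as nodes
        if count >= min_shared:
            parent[root_a] = root_b        # merge the two components

    clusters = {}
    for tf in list(parent):
        clusters.setdefault(find(tf), set()).add(tf)
    # Largest clusters first; ties broken alphabetically.
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: (-len(c), c))

# Made-up counts for hypothetical TFs.
toy_shared = {
    ("TF_A", "TF_B"): 12,
    ("TF_A", "TF_C"): 9,
    ("TF_B", "TF_C"): 10,
    ("TF_C", "TF_D"): 1,
    ("TF_D", "TF_E"): 8,
}
clusters = decompose(toy_shared, min_shared=5)
```

Connected components merge any TFs joined by a chain of strong links, so they are a looser criterion than the caption of Figure 1 suggests for Triple-Link clusters, in which nodes are well connected to one another.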