Mike J Mason, Guoping Fan, Kathrin Plath, Qing Zhou, Steve Horvath.
Abstract
BACKGROUND: Recent work has revealed that a core group of transcription factors (TFs) regulates the key characteristics of embryonic stem (ES) cells: pluripotency and self-renewal. Current efforts focus on identifying genes that play important roles in maintaining pluripotency and self-renewal in ES cells and aim to understand the interactions among these genes. To that end, we investigated the use of unsigned and signed network analysis to identify pluripotency and differentiation related genes.
Year: 2009 PMID: 19619308 PMCID: PMC2727539 DOI: 10.1186/1471-2164-10-327
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Network Connection Strength Versus Expression Correlation. Network adjacency (y-axis) versus correlation (x-axis) for an unweighted network (black step function with τ = 0.8) and weighted networks (dashed lines corresponding to different powers, β) in an unsigned network (a) and a signed network (b). Note that cor(x_i, x_j) = -1 leads to adjacency = 0 in the signed network. The weighted network preserves the continuous nature of the co-expression information, whereas an unweighted network dichotomizes the correlation.
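The caption does not spell out the adjacency formulas. The following is a minimal sketch, assuming the commonly used weighted co-expression definitions (unsigned: |cor|^β; signed: ((1 + cor) / 2)^β) and a hard threshold at τ for the unweighted case. The function names and the β values in the example are illustrative and are not taken from the page.

```python
def unsigned_adjacency(cor, beta):
    """Weighted unsigned adjacency: strong negative and positive correlations both connect."""
    return abs(cor) ** beta


def signed_adjacency(cor, beta):
    """Weighted signed adjacency (assumed form): cor = -1 maps to 0, cor = 1 maps to 1."""
    return ((1 + cor) / 2) ** beta


def hard_threshold_adjacency(similarity, tau=0.8):
    """Unweighted adjacency: 1 if the similarity reaches tau, otherwise 0."""
    return 1.0 if similarity >= tau else 0.0


if __name__ == "__main__":
    # Illustrative powers only; the caption mentions several values of beta.
    for cor in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(
            cor,
            unsigned_adjacency(cor, beta=6),
            signed_adjacency(cor, beta=12),
            hard_threshold_adjacency(abs(cor)),
        )
```

Under these assumptions a perfectly anti-correlated gene pair is fully connected in the unsigned network and unconnected in the signed one, which matches the note in the caption.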
Figure 2. Unsigned and Signed Mouse ES Cell Networks in Ivanova et al. a, left, Dendrogram of the unsigned network of the Ivanova et al (2006) data set with color bands below indicating module membership for the unsigned network (U) and the signed network (S). a, right, heat map for visualizing standardized gene expression (rows) across samples (columns) for genes in the turquoise module in the unsigned network. b, left, Dendrogram of the signed network of the Ivanova et al (2006) data set with color bands below indicating module membership for the signed network (S) and the unsigned network (U). b, right, heat map of expression profiles across samples for genes in the turquoise, black, and blue modules in the signed network. Note, modules are not scaled to reflect the number of genes in each module. c, scatter plot of module membership, k, (x-axis) plotted against gene significance, GS, (y-axis) for the black and blue modules in the signed network with known ES cell regulators and differentiation genes labelled.
Transcription Factor Binding in Ivanova et al Networks
| Module | | Oct4 TF group | cMyc TF group | Suz12 |
| | 899 | 0.863, 0.232 | 1.09, 0.0398 | 0.956, 0.366 |
| | 1241 | 0.862, 0.179 | 0.757, 3.09e-09 | 1.92, 2.5e-21 |
| | 1556 | 0.825, 0.0832 | 0.811, 1.78e-07 | 1.13, 0.0496 |
| | 7407 | 1.16, 2.37e-05 | 1.18, 4.62e-45 | 0.866, 2.17e-07 |
| | 1673 | 1.1, 0.169 | 1.68, 6.75e-74 | 0.281, 7.61e-28 |
| | 5721 | 0.898, 0.0268 | 0.801, 4.3e-36 | 1.2, 6.96e-10 |
| | 1053 | 1.14, 0.15 | 1.46, 5.74e-22 | 0.458, 7.39e-10 |
| | 941 | 2.9, 7.25e-24 | 1.83, 4.8e-58 | 0.288, 1.95e-15 |
| | 220 | 0.608, 0.165 | 0.488, 1.59e-07 | 1.93, 4.42e-05 |
| | 1090 | 0.736, 0.0402 | 0.754, 2.04e-08 | 1.99, 3.27e-21 |
| | 581 | 0.783, 0.174 | 0.931, 0.15 | 1.78, 2.58e-08 |
| | 1538 | 0.731, 0.0138 | 0.778, 1.14e-09 | 1.12, 0.0534 |
| | 4428 | 0.949, 0.224 | 0.853, 2.38e-14 | 1.13, 0.000604 |
| | 514 | 1.09, 0.286 | 1.11, 0.0548 | 0.824, 0.122 |
| | 649 | 1.61, 0.00131 | 1.89, 2.87e-45 | 0.181, 6.99e-15 |
| | 2242 | 1.4, 4.41e-05 | 1.85, 8.63e-160 | 0.341, 3.32e-31 |
| | 1270 | 1.01, 0.429 | 1.56, 1.19e-37 | 0.371, 8.07e-16 |
| | 650 | 1.03, 0.388 | 1.19, 0.00083 | 0.67, 0.00378 |
| | 445 | 2.16, 4.62e-06 | 1.71, 4.65e-21 | 0.37, 3.14e-06 |
| | 5205 | 0.679, 2.41e-09 | 0.749, 3.51e-49 | 1.24, 2.71e-11 |
| | 1042 | 1.05, 0.326 | 1.35, 1.9e-13 | 0.633, 4.69e-05 |
Enrichment for binding by the Oct4 and cMyc TF Groups and Suz12.
Each cell gives the enrichment score (before the comma) followed by its corresponding p-value (after the comma), calculated from the hypergeometric distribution. P-values are uncorrected for multiple comparisons.
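The page does not define the enrichment score or say which tail of the hypergeometric distribution was used. The following is a minimal sketch, assuming the score is the observed overlap divided by the overlap expected by chance and the p-value is the upper-tail probability; for depleted modules (score below 1) a lower-tail probability would be the natural counterpart. The function names and the toy numbers are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) when `draws` genes are sampled without replacement from `population`."""
    total = comb(population, draws)
    top = min(successes, draws)
    hits = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, top + 1)
    )
    return hits / total


def enrichment_score(k, population, successes, draws):
    """Observed overlap divided by the overlap expected by chance (assumed definition)."""
    return k / (draws * successes / population)


# Hypothetical toy numbers: 10 genes in total, 4 bound by a TF,
# a module of 3 genes of which 2 are bound.
p_value = hypergeom_upper_tail(2, 10, 4, 3)
score = enrichment_score(2, 10, 4, 3)
print(score, p_value)
```

Exact integer binomial coefficients avoid floating-point underflow for the very small p-values that appear in the table.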
Figure 3. Relating Module Membership to Epigenetic Regulation. The top 1000 genes with highest module membership in the black module (top row) and blue module (bottom row) are related to 3 epigenetic variables (corresponding to the 3 columns). The y-axis reports the proportion of the top 1000 genes that are known to belong to the group of genes defined on the x-axis. Histone H3K4 trimethylation (H3K4me3) status is abbreviated K4, and H3K27 trimethylation (H3K27me3) status is abbreviated K27. Promoters that are both H3K4 and H3K27 trimethylated in ES cells (denoted K4&K27) are thought to poise key developmental genes for activation upon differentiation [50,51]. Note that genes with promoter CpG methylation are significantly (p = 2.0 × 10^-14) under-enriched among the top 1000 black module genes.
Module Membership Versus Epigenetic Variables
| Source | Degrees of Freedom | Sums of Sq | Prop. of Total Var | p-value (F-test) | Sums of Sq | Prop. of Total Var | p-value (F-test) |
| | 3 | 170.91 | 0.067 | < 2.2E-16 | 30.49 | 0.034 | < 2.2E-16 |
| | 1 | 39.21 | 0.015 | < 2.2E-16 | 1.48 | 0.002 | 2.6E-04 |
| | 1 | 8.62 | 0.003 | 8.0E-08 | 0.79 | 0.001 | 7.5E-03 |
| | 2 | 4.56 | 0.002 | 4.9E-04 | 4.71 | 0.005 | 6.0E-10 |
| | 1 | 0.88 | 0.000 | 8.7E-02 | 0.19 | 0.000 | 1.9E-01 |
| | 1 | 0.04 | 0.000 | 7.1E-01 | 0.58 | 0.001 | 2.2E-02 |
| | 7846 | 2353.8 | 0.917 | | 868.36 | 0.958 | |
| | 7855 | 2567.11 | | | 906.6 | | |
This analysis of variance table reports which epigenetic variables and TF binding data have a significant effect on module membership, with one module membership measure in the left-hand columns and the other in the right-hand columns. For each variable (source of variation), the columns report the degrees of freedom, the sum of squares, the proportion of total variance explained by the variable, and the corresponding F-test p-value. Note that histone trimethylation status is the most significant source of variation for both measures.
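As a worked check of how the "Prop. Of Total Var" column relates to the sums of squares, here is a minimal sketch using values from the first and last rows of the table. It assumes that the row with 7846 degrees of freedom is the residual row and the row with 7855 is the total (the listed degrees of freedom add up to 7855), and that each F statistic is the source mean square divided by the residual mean square. The function names are illustrative.

```python
def proportion_of_variance(sum_sq, total_sum_sq):
    """Share of the total sum of squares attributed to one source of variation."""
    return sum_sq / total_sum_sq


def f_statistic(sum_sq, df, resid_sum_sq, resid_df):
    """Source mean square divided by the residual mean square (assumed test form)."""
    return (sum_sq / df) / (resid_sum_sq / resid_df)


# First row of the table (3 degrees of freedom), left-hand and right-hand columns.
left_prop = proportion_of_variance(170.91, 2567.11)
right_prop = proportion_of_variance(30.49, 906.6)

# F statistic for the same row, left-hand columns, against the assumed residual row.
left_f = f_statistic(170.91, 3, 2353.8, 7846)

print(round(left_prop, 3), round(right_prop, 3), round(left_f, 1))
```

The rounded proportions reproduce the 0.067 and 0.034 entries reported in the table.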
Figure 4. Unsigned and Signed Networks of the Zhou et al ES Expression Data. a, left, Dendrogram of the unsigned network of the Zhou et al (2007) data with color bands below indicating module membership for the unsigned network (U) and the signed network (S). a, right, heat map of microarray expression profiles across samples for genes in the blue and black modules in the unsigned network. b, left, Dendrogram of the signed network of the Zhou et al (2007) data with color bands below indicating module membership for the signed network (S) and the unsigned network (U). b, right, heat map of expression profiles (rows) across samples (columns) for genes in the blue and black modules in the signed network.
Figure 5. Expression Changes Versus Module Membership in the Black and Blue Modules (Zhou et al). Module membership, k, is plotted against log2 expression fold change (FC) for the black (a) and blue (b) modules in the unsigned network of the Zhou et al (2007) data. FC is the ratio between the average expression in Oct4 positive samples and Oct4 negative microarray samples. Known ES cell regulators are labeled. Genes are colored by module membership in the signed network. c and d are analogous to a and b but module membership is with regard to the signed black and blue modules.
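The fold change described in the caption can be sketched as follows. This is a minimal illustration that assumes expression values are positive and on a linear (not log-transformed) scale; the function name and the sample values are hypothetical.

```python
from math import log2
from statistics import mean


def log2_fold_change(oct4_positive, oct4_negative):
    """log2 of the ratio of mean expression in Oct4 positive vs Oct4 negative samples."""
    return log2(mean(oct4_positive) / mean(oct4_negative))


# Hypothetical expression values for one gene.
fc = log2_fold_change([8.0, 8.0], [2.0, 2.0])
print(fc)
```

A positive value indicates higher average expression in the Oct4 positive samples, and a negative value indicates higher expression in the Oct4 negative samples.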
Transcription Factor Binding in Zhou et al Networks
| Module | | Oct4 TF group | cMyc TF group | Suz12 |
| black | 2659 | 1.55, 2.08e-09 | 1.4, 4.18e-46 | 0.666, 6.21e-10 |
| blue | 1484 | 0.872, 0.172 | 1.09, 0.0125 | 1.01, 0.447 |
| brown | 992 | 1.2, 0.093 | 1.02, 0.315 | 1.11, 0.127 |
| green | 1493 | 0.758, 0.028 | 0.863, 0.000186 | 1.36, 4.79e-06 |
| grey | 7318 | 0.766, 1.87e-08 | 0.803, 1.69e-51 | 1.2, 7.19e-14 |
| magenta | 1583 | 0.919, 0.28 | 1.04, 0.147 | 1.1, 0.0982 |
| red | 2593 | 1.48, 2.3e-07 | 1.64, 4.02e-107 | 0.437, 2.49e-26 |
| turquoise | 838 | 0.997, 0.459 | 1.3, 1.74e-08 | 0.829, 0.0631 |
| yellow | 2002 | 1.08, 0.213 | 1.1, 0.00171 | 0.844, 0.0106 |
| black | 1859 | 1.94, 2.94e-15 | 1.4, 4.22e-30 | 0.523, 2.19e-13 |
| blue | 1972 | 0.656, 0.000465 | 1.03, 0.18 | 1.13, 0.03 |
| brown | 1267 | 1.45, 0.000755 | 1.72, 1.22e-59 | 0.411, 5.78e-14 |
| green | 1548 | 0.818, 0.0763 | 0.933, 0.0414 | 1.27, 0.000329 |
| grey | 7175 | 0.729, 1.64e-10 | 0.784, 2.34e-59 | 1.24, 2.7e-18 |
| pink | 659 | 0.941, 0.432 | 1.27, 5e-06 | 0.474, 2.48e-06 |
| red | 2184 | 1.48, 2.18e-06 | 1.63, 1.7e-85 | 0.493, 9.37e-18 |
| turquoise | 2317 | 0.896, 0.156 | 1.05, 0.0498 | 1.09, 0.0716 |
| yellow | 1870 | 1.17, 0.0587 | 1.05, 0.0545 | 0.885, 0.054 |
Module enrichments for binding by the Oct4 and cMyc TF Groups and Suz12. Each enrichment score (before the comma) is followed by its corresponding p-value calculated from the hypergeometric distribution (uncorrected for multiple comparisons).
Functional Pathways in Highly Connected Pluripotency and Differentiation Related Genes in the Zhou et al Network
| p-value | Enriched functional terms | Highly connected genes (k) |
| 9.72E-17 | ER-Golgi transport; protein localization; protein transport; vesicle-mediated transport; secretion by cell; cellular localization; secretory pathway; intracellular transport | Myl6 (0.994), Sh3glb1 (0.993), Tm9sf3 (0.993), Tram1 (0.992), Derl1 (0.991), Serinc1 (0.991), Lman1 (0.991), Lrp10 (0.991), Mcfd2 (0.99), Mcfd2 (0.99), Tmed10 (0.99), Tpcn1 (0.989), Arl1 (0.989), Tinagl (0.987), Rab2 (0.987), Txndc1 (0.987), Col4a1 (0.987) |
| 5.65E-08 | Glycan structures – biosynthesis 1; signal-anchor; transferase activity, glycosyltransferase | Glt8d1 (0.993), Creb3 (0.991), Fut8 (0.99), Fkrp (0.99), Extl2 (0.989), Glt8d3 (0.987), Itm2c (0.986), Hs3st1 (0.986), Pofut2 (0.986), Dpagt1 (0.985), Mgat2 (0.983), Abhd6 (0.982), Ddost (0.982), Ndst2 (0.981), B4galnt1 (0.981), St3gal6 (0.98) |
| 2.98E-06 | membrane; transmembrane; transmembrane region; topological domain:Cytoplasmic | H13 (0.996), Pdgfra (0.994), Cd59a (0.994), Glt8d1 (0.993), Sh3glb1 (0.993), Tm9sf3 (0.993), Tram1 (0.992), Gdpd5 (0.991) |
| 0.00185 | organ development; system development; anatomical structure morphogenesis; cell differentiation; organ morphogenesis | Pdgfra (0.994), Myl6 (0.994), Sh3glb1 (0.993), Lmo4 (0.992), Rgnef (0.989), Syvn1 (0.988), Kit (0.988), Fndc3b (0.988), Txndc1 (0.987), Lama1 (0.987), Barx1 (0.986), Col4a2 (0.986), Ctgf (0.985), Fgf3 (0.985), Crim1 (0.983), Pthr1 (0.983) |
| 1.98E-08 | response to DNA damage stimulus; DNA damage; DNA repair | Msh6 (0.993), Rif1 (0.983), Mre11a (0.982), Setx (0.974), Xrcc5 (0.971), Chek1 (0.968), Xab2 (0.967), Xrn2 (0.967), Trp53 (0.959), Npm1 (0.958), Tdp1 (0.955), Bccip (0.954) |
| 3.75E-08 | Mitochondrion; transit peptide; Mitochondrion | Mrpl15 (0.992), Ppif (0.991), Mrps5 (0.987), Hspa9 (0.984), Coq3 (0.984), Tst (0.981), Mrpl45 (0.98), Akap1 (0.979), L2hgdh (0.978), Mrps31 (0.978), Chchd4 (0.976), Abce1 (0.975), Dci (0.975), Fpgs (0.974), Mrpl39 (0.973), Bdh1 (0.971) |
| 5.83E-08 | nucleus; biopolymer metabolic process; DNA binding; cellular metabolic process; Transcription regulation | Msh6 (0.993), Pes1 (0.991), Zic3 (0.991), Uchl1 (0.99), Rnf138 (0.99), Rnf138 (0.99), Wdr36 (0.989), Pou5f1 (0.989), Rbpj (0.987), Glo1 (0.987), Tdgf1 (0.987), OTTMUSG00000010173 (0.986), Aarsd1 (0.986), Nup133 (0.985), Xpo1 (0.985), Xpo1 (0.985), Dnajc6 (0.985), Klhl13 (0.984), Dppa4 (0.984) |
| 5.26E-04 | cell cycle phase; cell cycle process; cell cycle; mitotic cell cycle; mitosis; cell division | Pes1 (0.991), Rif1 (0.983), Mre11a (0.982), Gtpbp4 (0.972), Chek1 (0.968), Mnat1 (0.966), Rcc2 (0.964), Gadd45gip1 (0.963), Rpa1 (0.961), Hells (0.96), Trp53 (0.959), Terf1 (0.959) |
Enriched GO terms of genes with the highest 5% of black and blue k values. Highly connected genes in each functional group are given. All k values are highly significant (Bonferroni corrected p-value < 0.00015).
Figure 6. Comparison of Genes Ranked by Network Connectivity and Differential Expression in the Ivanova et al Data Set. Ingenuity Pathway Analysis of functional enrichments in the sets of genes that rank within the top 1000 by Student's t-test or by k but do not overlap with each other. Venn diagrams show the amount of gene overlap between the top 1000 black (pluripotency) module genes and the top 1000 genes most significantly down-regulated upon Oct4 RNAi (left), and between the top 1000 blue (differentiation) module genes and the 1000 genes most significantly up-regulated with Oct4 RNAi (right). Significance of differential expression was determined using Student's t-statistic. p-values have been corrected for multiple hypothesis tests (Benjamini-Hochberg). Only significantly enriched functional groups are shown.
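The Benjamini-Hochberg correction named in the caption can be sketched as follows. This is a minimal standard-library implementation of the usual step-up adjustment, not necessarily the exact procedure or software used for the figure; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four enrichment tests.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(adjusted)
```

Walking from the largest p-value downward keeps a running minimum, which guarantees that a smaller raw p-value never receives a larger adjusted value than a larger one.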
Figure 7. Transcriptional Regulators Related to Pluripotency and Differentiation in the Zhou Network. TF and Suz12 binding in the promoter regions of highly connected genes related to ES cell pluripotency and self-renewal with GO terms of transcriptional regulation or chromatin structure. Genes are listed by black k (positive, left and negative, right) along with their corresponding significance level (log10 of the Bonferroni corrected p-value generated by a correlation test). Binding data from Chen et al (2008), Boyer et al (2006), and Loh et al (2006) are marked in blue (bound) and beige (unbound). For Oct4, Sox2, and Suz12, where binding is given by two studies, binding is shown in blue if it is found in both studies and in light blue if it is found in only one.
Figure 8. Non-Transcriptional Regulators Related to Pluripotency and Differentiation. TF and Suz12 binding of highly connected genes related to ES cell pluripotency and self-renewal lacking GO terms for transcriptional regulation or chromatin structure. Genes are tabulated in the same format as Figure 7.
Figure 9. Pluripotency Transcriptional Regulators that are not Bound by the Core TF Machinery in ES Cells. Genes related to ES cell pluripotency and self-renewal with GO terms of transcriptional regulation or chromatin structure and little pluripotency TF binding. Genes are listed by black k along with their corresponding significance level (log10 of the Bonferroni corrected p-value generated by a correlation test). Binding data from Chen et al, Boyer et al, and Loh et al are marked in blue (bound) and beige (unbound). For Suz12, light blue indicates that a gene is called bound in Chen et al or Boyer et al but not in both.
Motif Enrichment in Genes bound by Oct4 or cMyc TF Groups
| Motif | (TF) | Oct4 group: count | Oct4 group: enrichment | Oct4 group: p-value | cMyc group: count | cMyc group: enrichment | cMyc group: p-value |
| SoxOct | (Oct4, Sox2) | 66 | 3.41 | 2.0E-21 | 103 | 0.45 | 6.4E-24 |
| Oct4 | (Oct4) | 37 | 2.33 | 9.9E-08 | 151 | 0.81 | 2.5E-03 |
| Sox2 | (Sox2) | 51 | 3.06 | 3.8E-15 | 178 | 0.86 | 1.4E-02 |
| Nanog | (Nanog) | 18 | 2.20 | 5.0E-04 | 122 | 1.35 | 3.9E-04 |
| Stat1 | (Stat3) | 34 | 2.83 | 7.3E-09 | 145 | 1.32 | 1.0E-04 |
| Ebox | (cMyc) | 6 | 0.59 | 9.5E-01 | 373 | 4.02 | 2.6E-207 |
| E2f1 | (E2f1) | 1 | 1.44 | 9.4E-02 | 284 | 2.25 | 1.2E-40 |
| Klf4 | (Klf4) | 38 | 1.02 | 4.0E-01 | 235 | 0.61 | 1.4E-19 |
| Lrh1 | (Lrh1) | 28 | 2.22 | 1.4E-05 | 158 | 1.12 | 6.9E-02 |
| Elk1 | (Elk1) | 10 | 0.19 | 3.1E-02 | 170 | 1.64 | 8.6E-11 |
Sequences co-bound by all TFs of the Oct4 group (Oct4, Sox2, Nanog, Smad1, and Stat3) or all TFs of the cMyc group (cMyc, nMyc, E2f1, and Zfx) in the Chen et al data set were scanned for relevant motifs, plus the Lrh1 (Nr5a2) and Elk1 motifs. There were 122 genes bound by all TFs in the Oct4 group and 1173 genes bound by all TFs in the cMyc group. Enrichments, computed by comparing against motif scans in control sequences, and p-values are shown in the table.