
Comparative analysis of regulatory information and circuits across distant species.

Alan P Boyle, Carlos L Araya, Cathleen Brdlik, Philip Cayting, Chao Cheng, Yong Cheng, Kathryn Gardner, LaDeana W Hillier, Judith Janette, Lixia Jiang, Dionna Kasper, Trupti Kawli, Pouya Kheradpour, Anshul Kundaje, Jingyi Jessica Li, Lijia Ma, Wei Niu, E Jay Rehm, Joel Rozowsky, Matthew Slattery, Rebecca Spokony, Robert Terrell, Dionne Vafeados, Daifeng Wang, Peter Weisdepp, Yi-Chieh Wu, Dan Xie, Koon-Kiu Yan, Elise A Feingold, Peter J Good, Michael J Pazin, Haiyan Huang, Peter J Bickel, Steven E Brenner, Valerie Reinke, Robert H Waterston, Mark Gerstein, Kevin P White, Manolis Kellis, Michael Snyder.

Abstract

Despite the large evolutionary distances between metazoan species, they can show remarkable commonalities in their biology, and this has helped to establish fly and worm as model organisms for human biology. Although studies of individual elements and factors have explored similarities in gene regulation, a large-scale comparative analysis of basic principles of transcriptional regulatory features is lacking. Here we map the genome-wide binding locations of 165 human, 93 worm and 52 fly transcription regulatory factors, generating a total of 1,019 data sets from diverse cell types, developmental stages, or conditions in the three species, of which 498 (48.9%) are presented here for the first time. We find that structural properties of regulatory networks are remarkably conserved and that orthologous regulatory factor families recognize similar binding motifs in vivo and show some similar co-associations. Our results suggest that gene-regulatory properties previously observed for individual factors are general principles of metazoan regulation that are remarkably well-preserved despite extensive functional divergence of individual network connections. The comparative maps of regulatory circuitry provided here will drive an improved understanding of the regulatory underpinnings of model organism biology and how these relate to human biology, development and disease.

Year:  2014        PMID: 25164757      PMCID: PMC4336544          DOI: 10.1038/nature13668

Source DB:  PubMed          Journal:  Nature        ISSN: 0028-0836            Impact factor:   49.962


Transcription-regulatory factors (RFs) guide the development and cellular activities of all organisms through highly cooperative and dynamic control of gene expression programs. RF-coding genes are often conserved across deep phylogenies, their DNA-binding protein domains are preferentially conserved at the amino-acid level, and their in vitro binding specificities are also frequently conserved across large distances[3,4]. However, the specific DNA targets and binding partners of regulators can evolve much more rapidly than DNA-binding domains, making it unclear whether the in vivo binding properties of RFs are conserved across large evolutionary distances. Comparisons of the locations of regulatory binding across species have been controversial, with some studies suggesting extensive conservation[1,2,5-10] while others suggest extensive turnover[11-14]. While it is generally assumed that regulatory circuitry has largely diverged across very large evolutionary distances, there exist highly conserved sub-networks[15-18]. Thus, the level of regulatory turnover between related species remains unclear, possibly because of the small number of factors studied. Moreover, despite recent observations of the architecture of metazoan regulatory networks, a direct comparison of their topology and structure (such as clustered binding and regulatory network motifs) has not been possible owing to large differences in the procedures employed to assay RF binding in distinct species. Here we present a systematic and uniform comparison of regulation using many factors across distantly related species to help address these questions on a scale not previously possible.

To compare regulatory architecture and binding across diverse organisms, the modENCODE and ENCODE consortia mapped the binding locations of 93 C. elegans RFs, 52 D. melanogaster RFs, and 165 human RFs as a community resource (Fig. 1, Supplementary Table 1). These RF binding datasets represent a substantial increase over those previously published for worm (194 new datasets for a total of 219) and human (211 new, 707 total) and a substantial improvement in data quality in fly with a move from ChIP-chip to ChIP-seq (93 new, 93 total)[2,8,19,20]. The majority of RFs are site-specific transcription factors (TFs) (83 in worm, 41 in fly, and 119 in human), although general regulatory factors such as RNA Pol II were also assayed.
Figure 1

Datasets overview

Data generated by the modENCODE and ENCODE consortia and used in these analyses. The inner circle represents the fraction of datasets being presented for the first time in this paper. Each major context (cell lines in human and developmental stages in worm and fly) in each organism is colored a different hue in the outer two circles surrounding each organism and labeled on the edges of the diagrams. Datasets not in one of the main contexts are marked with asterisks. Each ChIP’d factor is depicted in the middle ring and the count is shown in parentheses on the edges of the diagram (a given factor can be represented in multiple contexts). Every dataset is depicted in the outer ring, scaled by the number of peaks, and shaded to represent polymerase (red), transcription factor (lighter shade) and other (darker shade). In total 165, 93, and 52 unique factors were ChIP’d across all conditions and cell lines in human, worm, and fly respectively.

All RFs were analyzed by ChIP-seq according to modENCODE/ENCODE standards: antibodies were extensively characterized, and at least two independent biological replicates were analyzed[21]. Worm RFs were assayed in embryo (EX) and stage 1–4 larvae (L1–L4 larvae), fly RFs in early embryo (EE), late embryo (LE) and post embryo (PP), and human RFs in myelocytic leukemia K562 cells, lymphoblastoid GM12878 cells, H1 embryonic stem cells, cervical cancer HeLa cells, and liver epithelium HepG2 cells. Binding sites were scored using a uniform pipeline that identifies reproducible targets using IDR analysis (Extended Data Figure 1)[22] and quality-filtered experiments (see Supplementary Information). These rigorous quality metrics ensure that the data sets used here are robust. All data presented are available at www.ENCODEProject.org/comparative/regulation/.
Extended Data Figure 1

Outline of data processing pipeline

All data sets were processed using a uniform processing pipeline with identical alignment and filtering criteria and standardized IDR peak calling using SPP (Human + Worm) and MACS2 (Fly).

To explore motif conservation, we examined the 31 cases in which members of orthologous TF families were profiled in at least two species (Extended Data Figure 2a; Supplementary data) and asked whether regulatory features were conserved across species. Sequence-enriched motifs were found for 18 of the 31 families, and for 12 orthologous families (41 RFs) the same motif is enriched in both species (Extended Data Figure 2b–c). For 18 of 31 families (64 of 93 RFs), the motif from one species is enriched in the bound regions of another species (one-sided hypergeometric test, p-value = 3.3 × 10^-4). These findings indicate that many factors retain highly similar in vivo sequence specificity within orthologous families, a feature noted previously for only a limited number of factors.
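As an illustration of the one-sided hypergeometric test quoted above, the following Python sketch computes an upper-tail p-value. The function name, the argument names and the toy counts are hypothetical; the exact construction of the test used in this study is described only in the Supplementary Information, so this should be read as a generic example rather than the authors' procedure.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """Return P(X >= observed) for a hypergeometric variable X.

    X counts the 'successes' obtained when `draws` items are taken without
    replacement from `population` items, `successes` of which are successes.
    """
    if not (0 <= successes <= population and 0 <= draws <= population):
        raise ValueError("need 0 <= successes <= population and 0 <= draws <= population")
    total = comb(population, draws)
    upper = min(successes, draws)
    # math.comb returns 0 when k > n, so impossible terms vanish on their own.
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(max(observed, 0), upper + 1)
    )
    return tail / total


# Toy numbers only (not taken from the paper): 10 items, 5 successes, 5 draws.
p_all_five = hypergeom_upper_tail(10, 5, 5, 5)
```

A small p-value indicates that the observed overlap is larger than expected if motif matches were distributed independently of binding.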
Extended Data Figure 2

Motifs

(a) 32 TF gene families with a binding dataset for at least two species (names abbreviated). Cross enrichment indicates the enrichment of motifs from one species in the datasets of another. For 13 families, we observed no cross enrichment (red). For 7 families (blue) we observed cross enrichment and for an additional 12 (green) we also had matching motifs. For two cases marked by an asterisk a known fly motif matches the human motif but no worm motif matches. (b) PRDM1/Blimp-1/blmp-1 gene family. We discovered a motif in worm datasets that matches literature-derived known motifs from human and fly. (c) All three motifs are highly similar and enriched in human PRDM1 and worm blmp-1 datasets. Cell type and treatment are indicated for each dataset in parentheses. The enrichment in each box is the fraction of motif instances that are inside the bound regions divided by the corresponding fraction of shuffled motif instances. Additional motifs known and discovered for these and other datasets are included in Supplementary Information.

Next, we used RNA-seq data[3] to determine whether targets of orthologous RFs are specifically expressed at similar developmental stages between fly and worm. As a class, orthologous RFs (both assayed here and not) are significantly expressed at similar stages (Extended Data Figure 3a–c). However, expression of orthologous targets of orthologous RFs in worm and fly shows little significant target overlap (Extended Data Figure 3d) and the large majority of orthologous RFs did not show conserved target functions (Extended Data Figure 4a–c), suggesting extensive re-wiring of regulatory control across metazoans. Nevertheless, human and worm orthologous RFs were more likely to show conserved target gene functions than non-orthologous RFs (Extended Data Figure 4d, Wilcoxon test p-value < 3.9 × 10^-6), highlighting RFs with conserved target functions.
Extended Data Figure 3

Orthologous expression in worm/fly

(a) Fly-worm stage alignment of expression using all fly-worm orthologs. (b) Fly-worm stage alignment using all TF orthologs. (c) Fly-worm stage alignment using ChIP’d TF orthologs. (d) Fly-worm stage alignment using genes proximal to ChIP’d TF binding sites. The stage-mapped data exhibit two sets of collinear patterns between the two species (distinct diagonals). In the bottom diagonal, expression from worm embryos and larvae is matched with fly embryos and larvae, respectively; worm adults are matched with fly early embryos and fly female adults, possibly due to the orthologous gene expression in eggs of both species; worm dauers are matched with fly late embryo to L1 and L3 stages, which is similar to the position of dauer stages in the worm lifecycle (between worm L1 and L4 stages). In the upper diagonal, worm middle embryos are matched with fly L1 stage; worm late embryos are matched with fly prepupae and pupae stages; worm L4 male larvae are matched with fly male adults. This collinear pattern may be attributable to fly genes with two-mode expression profiles and many-to-one fly-worm orthologous gene pairs. For more details, please refer to the companion paper[31].

Extended Data Figure 4

Comparison of GO enrichment of orthologous TF pairs

A comparison of GO enrichment of orthologous TF pairs for all contexts in (a) Human vs Worm, (b) Human vs. Fly, and (c) Worm vs. Fly is shown. Red boxes indicate level of similar GO enrichment. ‘Plus’ signs mark orthologous TF pairs with white ‘pluses’ indicating the most significant enrichment for an ortholog pair. (d) Orthologous factors are more enriched for matching GO terms than non-orthologous factors.

RF binding is not randomly distributed throughout the genome; rather, in all three species, approximately 50% of binding events are found in highly occupied clusters, termed HOT regions[1,2,5,8,10]. HOT regions show enhancer function in integrated transcriptional reporters[11] and are stabilized by cohesin[15,17]. Although the possibility that HOT regions are artifacts has been raised, they show no significant enrichment with non-specific antibodies (Extended Data Figure 5), in contrast to recent work using raw signal[19] rather than IDR peaks.
Extended Data Figure 5

Human HOT enrichments are not overly enriched for control DNA

HOT regions do not represent assembly or ChIP-ability artifacts. (a) Scatter plot of IgG IP/Input vs TF occupancy. The scatter plot is shaded by density of points. The red dashed line represents the HOT threshold and the black dashed line represents an enrichment of 1x. The black line represents the best-fitting line to the scatter plot (R^2 = 0.0045). (b) A scatter plot of density (number of TF peaks per kb) rather than total number of peaks in a region shows a similar trend. (c) Bar plot of the fraction of regions with high IgG enrichment for HOT and non-HOT (RGB) regions using the same threshold (1.5x) as Teytelman et al. Figure 7 reveals little similarity between HOT regions and artifact ChIP regions. (d) The fraction of HOT (red) and non-HOT (blue) regions with high IgG enrichment is plotted as a function of threshold. The black line represents no enrichment (IgG/Input = 1x) and the grey dashed line represents the enrichment cutoff (1.5x) used in (b) and in Teytelman et al. Figure 7. (e) Comparison of IgG (IgG/Input) and RNA Pol II enrichment (RNA PolII/Input) shows a different trend from Teytelman et al. Fig 3a. (f) Nearly all (99.967%) of our uniformly processed RNA PolII binding sites have IP/Input ratios >2x, with a median enrichment of ~20x.

By comparing HOT regions across different developmental times and cell types, we find that 5–10% of HOT regions are constitutive and the remainder are context-specific, indicating that HOT regions are dynamically established rather than an intrinsic property of specific regions. In humans we find that ~90% of constitutive HOT regions fall within promoter chromatin states compared to only ~10–20% of context-specific HOT regions (Fig. 2a, Extended Data Figure 6). Instead, ~80–90% of context-specific HOT regions fall within enhancer states. Moreover, these context-specific HOT regions are specifically enriched for enhancers in matching cell types or developmental stages. For example, 80% of GM12878-called HOT regions fall within GM12878-specific enhancers but only ~10% of GM12878-called HOT regions fall within enhancers called in other cell types (Fig. 2b). These patterns remain similar for all cell types (Extended Data Figure 7), suggesting the two types of HOT regions are established concordantly and dynamically between cell types, though these patterns are weaker in the worm and fly data.
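The distinction between constitutive and context-specific HOT regions can be sketched in a few lines of Python. The sketch assumes that "constitutive" means a region is called HOT in every assayed context, which is an interpretation rather than a definition given in the text; the function names and the toy region identifiers are hypothetical.

```python
def classify_hot_regions(hot_by_context):
    """Label each HOT region as 'constitutive' or 'context-specific'.

    hot_by_context maps a context name (cell line or developmental stage)
    to the set of region identifiers called HOT in that context.
    Assumption: a region is constitutive if it is HOT in every context.
    """
    if not hot_by_context:
        return {}
    region_sets = [set(regions) for regions in hot_by_context.values()]
    all_regions = set().union(*region_sets)
    shared_by_all = set.intersection(*region_sets)
    return {
        region: "constitutive" if region in shared_by_all else "context-specific"
        for region in all_regions
    }


def constitutive_fraction(labels):
    """Fraction of labelled regions that are constitutive."""
    if not labels:
        return 0.0
    return sum(1 for label in labels.values() if label == "constitutive") / len(labels)


# Toy input: region identifiers r1..r4 are made up for illustration.
toy = {
    "GM12878": {"r1", "r2", "r3"},
    "K562": {"r1", "r4"},
    "HepG2": {"r1", "r2"},
}
labels = classify_hot_regions(toy)
fraction = constitutive_fraction(labels)  # 0.25 for this toy input
```

In practice HOT regions from different contexts would first have to be merged by genomic overlap before identifiers can be compared; that step is omitted here.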
Figure 2

HOT regions

HOT regions contain binding sites for a large number of factors. (a) A total of 2,283, 2,948, and 46,348 HOT regions exist of which 29.1%, 13.7%, and 9.7% are constitutive in worm, fly, and human respectively. A large fraction of HOT regions are shared across multiple contexts but the majority of HOT regions are specific to a single context. (b) Constitutive human HOT (cHOT) regions show strong enrichment for promoters while cell-type specific [GM12878 (GM), H1hesc (H1), HepG2 (HG), HelaS3 (HL), K562 (K5)] HOT regions show more enhancer enrichment (see also Extended Data Figure 3).

Extended Data Figure 6

HOT regions were identified in all organisms

(a) To identify HOT regions for each context, we first analyzed the number and size distribution of target binding regions (in which factor binding sites are concentrated). For each target case simulation, we randomly selected an equivalent number of random binding regions with a matched size distribution. Next, for each factor assayed (in the target case), we evaluated the number and size of observed binding sites, and simulated an equivalent number and size distribution of target binding sites, restricting their placement to the simulated binding regions. We collapsed simulated binding sites from all factors into binding regions, verifying that these cluster into a similar number of simulated binding regions as the target binding regions. We identified regions at a 5% (HOT) and 1% (XOT) occupancy threshold based on these simulated data. (b) Binding of regulatory factors covers different fractions of the genomes of fly, human, and worm. Coverage is shown for constitutively HOT regions (cHOT, red), HOT regions (yellow), and non-HOT regions (RGB, green). Coverage for XOT regions is given in parentheses.
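The simulation-based thresholding described in panel (a) can be illustrated with a deliberately simplified Python sketch. It ignores region and site sizes, places each factor's sites uniformly at random among a fixed number of regions, and interprets the 5% (HOT) and 1% (XOT) thresholds as the occupancy exceeded by at most that fraction of simulated regions. All function names, parameters and this interpretation of the threshold are assumptions made for illustration, not the published procedure.

```python
import random


def simulate_occupancy(n_regions, sites_per_factor, rng):
    """Simulated number of distinct factors bound in each region.

    Each factor's sites are assigned to distinct regions chosen uniformly
    at random (a simplification: sizes are not modelled).
    """
    occupancy = [0] * n_regions
    for n_sites in sites_per_factor:
        for region in rng.sample(range(n_regions), min(n_sites, n_regions)):
            occupancy[region] += 1
    return occupancy


def occupancy_threshold(simulated, top_fraction):
    """Occupancy value exceeded by at most `top_fraction` of simulated regions."""
    ranked = sorted(simulated, reverse=True)
    k = min(int(len(ranked) * top_fraction), len(ranked) - 1)
    return ranked[k]


def call_regions(observed, simulated, top_fraction=0.05):
    """Indices of observed regions whose occupancy exceeds the simulated threshold."""
    threshold = occupancy_threshold(simulated, top_fraction)
    return [i for i, occ in enumerate(observed) if occ > threshold]


# Toy usage with made-up counts: 50 regions, three factors with 10, 20 and 5 sites.
rng = random.Random(1)
null_occupancy = simulate_occupancy(50, [10, 20, 5], rng)
hot = call_regions([0, 1, 3, 2], null_occupancy, top_fraction=0.05)   # HOT
xot = call_regions([0, 1, 3, 2], null_occupancy, top_fraction=0.01)   # XOT
```

Because the threshold is derived from the simulated data, regions are called only when their occupancy is higher than would be expected from the same number of sites placed at random.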

Extended Data Figure 7

HOT enrichments with context-specific enhancer enrichments

(a) Histone marks for HOT regions (represented by points and smoothed to show density) at proximal and (b) distal sites show similar trends of histone mark enrichment in their flanking regions. Enhancer calls for a specific developmental stage (c, e) or cell type (d) (labeled over each set of bar graphs) match HOT regions from that cell type and not HOT regions from another cell type. Each set of six bar graphs represents the same set of HOT regions called constitutively HOT or specific to each of the five cell types. Constitutive HOT (cHOT) regions are significantly enriched at promoters with the remaining regions overlapping enhancer regions.

We next constructed regulatory networks in each species by predicting gene targets of each RF using TIP[23] and used simulated annealing to reveal the organization of RFs in three layers of master regulators, intermediate regulators, and low-level regulators (Fig. 3a–b). The algorithm found only 7% of RFs at the top layer of the network in fly and 13% in worm, compared to 33% in human. We also found that more edges are upward flowing in human (30%) than in worm and fly (22% and 7%, respectively). This suggests differences in the global network organization with more extensive feedback and a higher number of master regulators in human.
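To make the layering idea concrete, here is a minimal Python sketch of simulated annealing that assigns regulators to three layers and then reports the fraction of upward-flowing edges. The cost function (the number of edges that do not point strictly downward), the cooling schedule and all names are assumptions chosen for illustration; the objective actually optimized in this study is not specified in the text and may differ.

```python
import math
import random


def non_downward_edges(edges, layer):
    """Number of edges that do not go from a higher layer to a lower one.

    Layer 0 is the top (master regulators); larger indices are lower layers.
    """
    return sum(1 for src, tgt in edges if layer[src] >= layer[tgt])


def upward_fraction(edges, layer):
    """Fraction of edges whose target sits in a higher layer than its regulator."""
    upward = sum(1 for src, tgt in edges if layer[src] > layer[tgt])
    return upward / len(edges)


def anneal_layers(edges, n_layers=3, steps=5000, start_temp=2.0, seed=0):
    """Search for a layer assignment with few non-downward edges."""
    rng = random.Random(seed)
    nodes = sorted({node for edge in edges for node in edge})
    layer = {node: rng.randrange(n_layers) for node in nodes}
    cost = non_downward_edges(edges, layer)
    best, best_cost = dict(layer), cost
    for step in range(steps):
        temp = start_temp * (1.0 - step / steps) + 1e-9  # linear cooling
        node = rng.choice(nodes)
        old = layer[node]
        layer[node] = rng.randrange(n_layers)
        new_cost = non_downward_edges(edges, layer)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = dict(layer), cost
        else:
            layer[node] = old
    return best, best_cost


# Toy network (hypothetical regulators): a cascade A -> B -> C with feedback C -> A.
toy_edges = [("A", "B"), ("B", "C"), ("C", "A")]
toy_layers, toy_cost = anneal_layers(toy_edges)
```

In this toy network the feedback edge cannot be made to point downward, so at least one edge always remains non-downward; edges of this kind are what the text refers to as upward flowing.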
Figure 3

Networks

(a) Statistics of the transcription regulatory networks in human, worm, fly and their hierarchical organization. (b) An example of the hierarchical network for worm. (c) Network motif enrichment. The human, worm and fly networks are mostly consistent in terms of motif enrichment. The motif feed-forward loop is the most enriched motif in all three networks. (d) Different transcription factors have different tendencies to appear as top, middle and bottom regulators in a FFL. The lists of human, worm, fly TFs with corresponding tendencies are displayed.

We next assessed the local structure of regulatory networks by searching for enriched sub-graphs known as network motifs (Fig. 3c). We found that the same network motifs were most and least enriched in the three species. In each case, the most abundant was the feed-forward loop (FFL), while the least abundant were cascade motifs, and both divergent and convergent regulation. Moreover, specific RFs were enriched as origin, target, or intermediate regulators in these FFLs in each species (Fig. 3d). Surprisingly, the number of FFLs varied by developmental stage in both worm and fly, with L1 stage in worm and late-embryo stage in fly showing the highest number of FFLs (Extended Data Figure 8), suggesting increased filtering of fluctuations and acceleration of responses in these stages[24].
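For readers unfamiliar with the feed-forward loop, the following Python sketch counts FFLs in a directed network, where an FFL consists of a top regulator X that targets both an intermediate regulator Y and a bottom target Z, with Y also targeting Z. The function name and the toy edges are hypothetical. Assessing enrichment additionally requires comparing this count with counts in randomized networks, which is not shown.

```python
def count_feed_forward_loops(edges):
    """Count ordered triples (X, Y, Z) with edges X->Y, Y->Z and X->Z.

    `edges` is an iterable of (regulator, target) pairs. Self-loops are
    ignored and duplicate edges are counted once.
    """
    targets = {}
    for src, tgt in set(edges):
        if src != tgt:
            targets.setdefault(src, set()).add(tgt)
    count = 0
    for top, top_targets in targets.items():
        for mid in top_targets:
            for bottom in targets.get(mid, ()):
                # bottom != mid is guaranteed because self-loops were dropped.
                if bottom != top and bottom in top_targets:
                    count += 1
    return count


# Toy network: one FFL (A->B, B->C, A->C) plus an extra cascade edge C->D.
toy_network = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
n_ffl = count_feed_forward_loops(toy_network)  # 1 for this toy input
```

The roles in each counted triple (top, middle, bottom) correspond to the origin, intermediate and target positions mentioned in the text.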
Extended Data Figure 8

The number of feed forward loops in different stage-specific networks

The number of FFLs in a stage is normalized by the number of TFs in the corresponding stage-specific network. Though the sets of TFs may differ, the number of TFs in each stage stays roughly the same.

We next determined whether the three species showed conserved RF co-associations. We first focused on global co-associations where two factors co-associate frequently regardless of context, either by intermolecular interactions or independent recruitment (Extended Data Figure 9). With the exception of a small number of conserved global RF co-associations (e.g. SIN3A with HDAC1, HDAC2, and NR2C2 in fly and human[25-27] and MXI1 with E2F1, E2F4, and E2F6 in worm and human), the majority of global co-associations were not conserved in the contexts and species pairs analyzed.
Extended Data Figure 9

Co-associations

Evolutionary retention and change in TF co-associations. The pairwise co-association strengths between orthologous TFs are shown for human-worm orthologs (a, b) and human-fly orthologs (c, d). For each pair of species-specific orthologs across multiple samples, the co-association strength, measured as the fraction of significant co-binding events between experiments, is shown (IntervalStats[32]). (a) Human co-association matrix for human-worm orthologs. (b) Worm co-association matrix for human-worm orthologs. (c) Human co-association matrix for human-fly orthologs. (d) Fly co-association matrix for human-fly orthologs. (e) Comparison of human-worm TF ortholog co-associations. The co-association strength of human-worm orthologs in human (x-axis) is plotted against the co-association strength in worm (y-axis). Lines depict 1 (solid) and 1.5 (dashed) standard deviations from the mean score. Factors in blue represent enrichments due to paralogous TFs in human that tend to be highly co-associated. (f) Comparison of human-fly TF ortholog co-associations. Co-association strength in human (x-axis) is plotted against co-association strength in fly (y-axis). For TF orthologs assayed in multiple developmental stages/cell-lines, the maximal co-association between contexts was selected for the comparative analyses (e, f).

Because RF co-association at distinct binding regions is local and contextual (i.e. different combinations of factors co-associate at different genomic locations), we next used an approach to detect co-association at distinct regions of the genome based on conserved patterns of RF binding. This method uses Self Organizing Maps (SOMs) to analyze co-association patterns at specific loci by better exploring the full combinatorial space of RF binding than traditional co-association approaches (Fig. 4a–c)[28]. We demonstrate that co-associations at distinct genomic regions reveal a more complex view of regulatory structure and bring forth categorical enrichments that are lost in a larger, genomic context.
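The idea of grouping genomic regions by their pattern of bound factors with a Self Organizing Map can be sketched as follows. This is a minimal, generic SOM in pure Python on a small rectangular grid, whereas the maps in Fig. 4 use hexagonal cells; the grid size, learning-rate schedule, neighbourhood function and toy binding patterns are all hypothetical and are not the settings used in this study.

```python
import math
import random


def best_matching_unit(weights, pattern):
    """Grid node whose weight vector is closest to the binding pattern."""
    return min(weights, key=lambda node: math.dist(weights[node], pattern))


def train_som(patterns, rows=3, cols=3, epochs=50, seed=0):
    """Train a small SOM on binary binding patterns (one 0/1 entry per factor)."""
    rng = random.Random(seed)
    dim = len(patterns[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for epoch in range(epochs):
        progress = epoch / epochs
        rate = 0.5 * (1.0 - progress)                              # decaying learning rate
        radius = max(rows, cols) / 2.0 * (1.0 - progress) + 0.5    # shrinking neighbourhood
        for pattern in rng.sample(patterns, len(patterns)):
            bmu = best_matching_unit(weights, pattern)
            for node, vector in weights.items():
                grid_dist = math.dist(node, bmu)
                influence = math.exp(-(grid_dist ** 2) / (2.0 * radius ** 2))
                for i in range(dim):
                    vector[i] += rate * influence * (pattern[i] - vector[i])
    return weights


# Toy binding patterns over four hypothetical factors (1 = bound, 0 = not bound).
toy_patterns = [
    [1, 1, 0, 0], [1, 1, 0, 0],
    [0, 0, 1, 1], [0, 0, 1, 1],
    [1, 1, 1, 1],
]
som = train_som(toy_patterns)
assignments = [best_matching_unit(som, p) for p in toy_patterns]
```

Regions that share a binding pattern are assigned to the same map node, so each node can then be examined for the frequency of its pattern in each organism or for enrichment of functional annotations.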
Figure 4

TF co-association

Many instances of TF co-association are under very specific contexts and are likely not observed in a simple genome-wide co-association study. (a) We combined the patterns of orthologous factors and genomic regions from two organisms to train a SOM where each ‘hexagon’ contains genomic regions from either organism with the same binding pattern of orthologous factors for worm (b) and fly (g). Each hexagon is shaded by the frequency of the pattern in the pairs of organisms. We show an example of binding patterns of 4 hexagons from the human-fly (c–d) and the human-worm (e–f). Names above the heatmaps are human factor names while those below are their ortholog names. Dark shaded boxes indicate binding of that factor. (c) A binding pattern shared at equal frequency between human and fly with only CTCF and SETDB1 (CTCF and SuVar3-9 in fly) binding. (d) A binding pattern that occurs more frequently in human shows ELF1, RNA Pol II, STAT, and TBP binding. (e) A binding pattern at similar frequencies in human and worm that is an example of a HOT region. (f) A pattern more frequent in humans than worms shows RNA Pol II, E2F, FOS, MYBL2, HDAC1, MXI1, FOXA, and TBP binding. (h) Co-localization patterns that occur more frequently near promoters (<500bp) in humans are highly likely to also occur at promoters in worm (80%) and fly (100%).

We examined whether specific contextual co-associations are conserved for orthologous RFs by using binding data from each organismal pair, i.e. human-worm and human-fly (Fig. 4b,g). Specific RF co-associations were observed; most are conserved to varying degrees across each organism, with very few that are entirely organism-specific (Fig. 4b,g). These co-associations result in expected sets of factors such as the previously noted SIN3A+HDAC co-association. In addition, we find new co-associations such as the pattern in Fig. 4f for human-worm, which in worm is highly enriched for GO terms associated with sex determination. We further examined which co-associations are conserved at distinct gene locations (i.e. proximal and distal). We found distinct combinations of conserved co-associations in relation to TSS regions. Interestingly, virtually all TSS-proximal co-associations in human remain TSS-proximal in worm (~80%) and fly (~100%), indicating that co-associations that occur at promoters are often highly conserved (Fig. 4h). On the other hand, co-associations at distal regions are much less conserved.

Using a large resource of regulatory binding information, our results suggest that there is little conservation of individual regulatory targets and binding patterns for these highly divergent metazoans. However, we do find strong conservation of overall regulatory architecture, both in network motif usage and in concentrated regulatory binding at dynamically established HOT regions. We observe an increased conservation of in vivo sequence preferences and some target gene functions, with context-specific RF partners still being observed at specific loci in these distal comparisons. These findings are consistent with previous results indicating that the gene targets of regulation are typically quite divergent and likely account for many of the phenotypic differences among species[12-14,16,29,30], despite conserved sequence preferences. We significantly extend these observations, both in the number of regulators studied and in the range of regulatory properties studied, and provide specific examples of conserved and diverged regulatory functions. Lastly, beyond their potential for comparative studies of gene regulation, the primary datasets provide invaluable new genome-wide TF binding information both in human and in two of the most important metazoan models of human biology, development, and disease.

Methods

Detailed methods are in the supplement. Data sets described here can be obtained from the ENCODE project website at www.ENCODEProject.org/comparative/regulation/.

Outline of data processing pipeline

All data sets were processed using a uniform processing pipeline with identical alignment and filtering criteria and standardized IDR peak calling using SPP (Human + Worm) and MACS2 (Fly).

Motifs

(a) 32 TF gene families with a binding dataset for at least two species (names abbreviated). Cross enrichment indicates the enrichment of motifs from one species in the datasets of another. For 13 families, we observed no cross enrichment (red). For 7 families (blue) we observed cross enrichment and for an additional 12 (green) we also had matching motifs. For two cases marked by an asterisk a known fly motif matches the human motif but no worm motif matches. (b) PRDM1/Blimp-1/blmp-1 gene family. We discovered a motif in worm datasets that match literature derived known motifs from human and fly. (c) All three motifs are highly similar and enriched in human PRDM1 and worm blmp-1 datasets. Cell-type and treatment are indicated for each dataset in parenthesis. Enrichments in each box are the fraction of motif instances that are inside the bound regions and dividing that by the fraction of shuffled motif instances. Additional motifs known and discovered for these and other datasets are included in Supplementary Information.

Orthologous expression in worm/fly

(a) Fly-worm stage alignment of expression using all fly-worm orthologs. (b) Fly-worm stage alignment by using all TF orthologs. (c) Fly-worm stage alignment by using ChIP’d TF ortholog. (d) Fly-worm stage alignment by using proximal genes to ChIP’d TF binding sites. The stage-mapped data exhibit two sets of collinear patterns between the two species (distinct diagonals). In the bottom diagonal, expression from worm embryos and larvae are matched with fly embryos and larvae, respectively; worm adults are matched with fly early embryos and fly female adults, possibly due to the orthologous gene expression in eggs of both species; worm dauers are matched with fly late embryo to L1 and L3 stages, which is similar to the position of dauer stages in the worm lifecycle (between worm L1 and L4 stages). In the upper diagonal, worm middle embryos are matched with fly L1 stage; worm late embryos are matched with fly prepupae and pupae stages; worm L4 male larvae are matched with fly male adults. This collinear pattern may be attributable to fly genes with two-mode expression profiles and many-to-one fly-worm orthologous gene pairs. For more details, please refer to the companion paper31.

Comparison of GO enrichment of orthologous TF pairs

A comparison of GO enrichment of orthologous TF pairs for all contexts in (a) Human vs Worm, (b) Human vs. Fly, and (c) Worm vs. Fly is shown. Red boxes indicate level of similar GO enrichment. ‘Plus’ signs mark orthologous TF pairs with white ‘pluses’ indicating the most significant enrichment for an ortholog pair. (d) Orthologous factors are more enriched for matching GO terms than non-orthologous factors.

Human HOT enrichments are not overly enriched for control DNA

HOT regions do not represent assembly or ChIP-ability artifacts. (a) Scatter plot of IgG IP/Input vs TF Occupancy. Scatterplot is shaded by density of points. Red dash line represents HOT threshold and black dashed line represent an enrichment of 1x. Black line represents best fitting line to the scatter plot (R2 = 0.0045) (b) A scatterplot of density (number of TF peaks per kb) rather than total number of peaks in a region shows a similar trend. (c) Barplot of fraction of regions with high IgG enrichment for HOT and non-HOT (RGB) regions using the same threshold (1.5x) as Teytelman et al. Figure 7 reveals little similarity between HOT regions and artifact ChIP regions. (d) The fraction of HOT (red) and non- HOT (blue) regions with high IgG enrichment is plotted as a function of threshold. Black line represents no enrichment (IgG/Input = 1x) and grey dashed line represents the enrichment cutoff (1.5x) used in (b) and in Teytelman et al. Figure 7. (e) Comparison of IgG (IgG/Input) and RNA Pol II enrichment (RNA PolII/Input) shows a different trend from Teytelman et al. Fig 3a. (e) Nearly all (99.967%) of our uniformly processed RNA PolII binding sites have IP/Input rations >2x, with a median enrichment of ~20x.

HOT regions were identified in all organisms

(a) To identify HOT region for each context, we first analyzed the number and size distribution of target binding regions (in which factor binding sites are concentrated). For each target case simulation, we randomly select an equivalent number of random binding regions with a matched size distribution. Next, for each factor assayed (in the target case), we evaluated the number and size of observed binding sites, and simulated an equivalent number and size distribution of target binding sites, restricting their placement to the simulated binding regions. We collapsed simulated binding sites from all factors into binding regions, verifying that these cluster into a similar number of simulated binding regions as the target binding regions. We identify regions at a 5% (HOT) and 1% (XOT) occupancy threshold based on this simulated data. (b) Binding of regulatory factors covers different fractions of the genomes of fly, human, and worm. Coverage is shown for constitutively HOT regions (cHOT – red), HOT regions (yellow), and non-HOT regions (RGB –green). Coverage for XOT regions is given in parenthesis.

HOT enrichments with context-specific enhancer enrichments

(a) Histone marks for HOT regions (represented by points and smoothed to show density) at proximal and (b) distal sites show similar trends of histone mark enrichment in their flanking regions. Enhancer calls for a specific developmental stage (c, e) or cell type (d) (labeled over each set of bar graphs) match HOT regions from that cell type and not HOT regions from another cell type. Each set of six bar graphs represents the same set of HOT regions called constitutively HOT or specific to each of the five cell types. Constitutive HOT (cHOT) regions are significantly enriched at promoters with the remaining regions overlapping enhancer regions.

The number of feed forward loops in different stage-specific networks

The number of FFLs in a stage is normalized by the number of TFs in the corresponding stage-specific network. Though the sets of TFs may differ, the number of TFs in each stage stays roughly the same.
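As a rough illustration of this normalization, the sketch below counts feed-forward loops (A regulates B, B regulates C, and A also regulates C) in a directed edge list and divides by the number of TFs. The edge-list representation and the function names are assumptions made for the example, not the authors' implementation.

```python
def count_feed_forward_loops(edges):
    """Count ordered triples (a, b, c) with edges a->b, b->c and a->c."""
    targets = {}
    for src, dst in edges:
        if src != dst:  # ignore self-regulation
            targets.setdefault(src, set()).add(dst)
    count = 0
    for a, a_targets in targets.items():
        for b in a_targets:
            for c in targets.get(b, ()):
                # c must be a third node that a regulates directly
                if c != a and c in a_targets:
                    count += 1
    return count


def normalized_ffl_count(edges, tfs):
    """FFL count divided by the number of TFs in the stage-specific network."""
    return count_feed_forward_loops(edges) / len(tfs)
```

For a hypothetical network with edges A→B, B→C and A→C and two TFs (A and B), the count is one loop and the normalized value is 0.5.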

Co-associations

Evolutionary retention and change in TF co-associations. The pairwise co-association strengths between orthologous TFs are shown for human-worm orthologs (a, b) and human-fly orthologs (c, d). For each pair of species-specific orthologs across multiple samples, the co-association strength, measured as the fraction of significant co-binding events between experiments, is shown (IntervalStats; ref. 32). (a) Human co-association matrix for human-worm orthologs. (b) Worm co-association matrix for human-worm orthologs. (c) Human co-association matrix for human-fly orthologs. (d) Fly co-association matrix for human-fly orthologs. (e) Comparison of human-worm TF ortholog co-associations. The co-association strength of human-worm orthologs in human (x-axis) is plotted against the co-association strength in worm (y-axis). Lines depict 1 (solid) and 1.5 (dashed) standard deviations from the mean score. Factors in blue represent enrichments due to paralogous TFs in human that tend to be highly co-associated. (f) Comparison of human-fly TF ortholog co-associations. Co-association strength in human (x-axis) is plotted against co-association strength in fly (y-axis). For TF orthologs assayed in multiple developmental stages/cell lines, the maximal co-association between contexts was selected for the comparative analyses (e, f).
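The two summary steps in this legend (a fraction of significant co-binding events, then the maximum across contexts) can be sketched as follows. The inputs are hypothetical per-event p-values keyed by context, and the 0.05 significance cutoff is an assumption; the sketch does not reproduce the IntervalStats computation itself.

```python
def co_association_strength(p_values, alpha=0.05):
    """Fraction of co-binding events with p-value below `alpha` (assumed cutoff)."""
    if not p_values:
        return 0.0
    return sum(p < alpha for p in p_values) / len(p_values)


def max_strength_across_contexts(p_values_by_context, alpha=0.05):
    """Maximal co-association strength over all assayed stages or cell lines."""
    return max(
        co_association_strength(p_values, alpha)
        for p_values in p_values_by_context.values()
    )
```

With hypothetical p-values `{"L1": [0.01, 0.2, 0.03, 0.5], "L3": [0.01, 0.02, 0.03, 0.5]}`, the per-context strengths are 0.5 and 0.75, so 0.75 would be carried into the cross-species comparison.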