Rui R. Catarino and Alexander Stark. Research Institute of Molecular Pathology (IMP), Vienna Biocenter (VBC), 1030 Vienna, Austria.
Abstract
Enhancers are important genomic regulatory elements directing cell type-specific transcription. They assume a key role during development and disease, and their identification and functional characterization have long been the focus of scientific interest. The advent of next-generation sequencing and clustered regularly interspaced short palindromic repeat (CRISPR)/Cas9-based genome editing has revolutionized the means by which we study enhancer biology. In this review, we cover recent developments in the prediction of enhancers based on chromatin characteristics and their identification by functional reporter assays and endogenous DNA perturbations. We discuss how the latter two approaches provide different and complementary insights, especially in assessing enhancer sufficiency and necessity for transcription activation. Furthermore, we discuss recent insights into mechanistic aspects of enhancer function, including findings about cofactor requirements and the role of post-translational histone modifications such as monomethylation of histone H3 Lys4 (H3K4me1). Finally, we survey how these approaches advance our understanding of transcription regulation with respect to promoter specificity and transcriptional bursting and provide an outlook covering open questions and promising developments.
Throughout development, a single genome gives rise to different cell types with unique morphologies and functions. Each cell type differentially expresses a subset of genes, which defines its identity. Transcription begins with the recruitment of RNA polymerase II (Pol II) and auxiliary factors to core promoters, short DNA sequences around transcription start sites (TSSs). While core promoters are sufficient to recruit Pol II and drive basal levels of transcription (Orphanides et al. 1996; Roeder 1996; Blackwood and Kadonaga 1998), they require cis-regulatory elements (CREs) or enhancers for full activity (Banerji et al. 1981; Shlyueva et al. 2014). Enhancers bind transcription factors (TFs) and cofactors to recruit and activate Pol II at target gene promoters from both proximal and distal positions. The genomic positions of enhancers typically show characteristic chromatin properties, which are often used to predict enhancers (Fig. 1; Heintzman et al. 2009), but how enhancers function is still poorly understood, and their reliable identification in large genomes has been a major obstacle. In this review, we discuss recent technological advances in the field of regulatory genomics and novel insights into enhancer function and biology.
Figure 1.
Overview: enhancers and (core) promoters. (A) Gene (black bars) transcription starts at the TSSs (straight arrow) within core promoter elements (light brown). Enhancers (blue boxes) are cis-regulatory DNA sequences that activate expression of their target genes and are often found in introns or distal intergenic regions both upstream and downstream. (B) Nucleosomes (black circles) bind to DNA and decrease accessibility to other proteins, such as TFs (colored rods). Enhancers contain TF-binding motifs (colored boxes), sequences specifically recognized by TFs and to which TFs bind in competition with nucleosomes. Enhancer-bound TFs recruit transcriptional cofactors (colored polygons) and activate gene expression from a distal gene. Cofactors often have catalytic activity and post-translationally modify TFs, histones, and other proteins in the vicinity of enhancers and promoters (small colored circles indicate such modifications).
Next-generation sequencing (NGS) enables the genome-wide profiling of gene expression and chromatin properties, allowing correlative enhancer predictions
Over the past decade, NGS has enabled studies of enhancers and their properties across entire genomes and, recently, with single-cell resolution (Adli and Bernstein 2011; Buenrostro et al. 2013; Nagano et al. 2013; Deng et al. 2014; Macosko et al. 2015). Chromatin immunoprecipitation (ChIP) coupled to NGS (Johnson et al. 2007; Robertson et al. 2007) is now used routinely to map TF binding and histone modifications across entire genomes, including histone H3 Lys4 monomethylation (H3K4me1) found at enhancers, H3K4 trimethylation (H3K4me3) at promoters, and H3K27 acetylation (H3K27ac) at active enhancers and promoters (for reviews, see Buecker and Wysocka 2012; Zentner and Scacheri 2012; Calo and Wysocka 2013). Together with other correlative traits such as DNA accessibility and (bidirectional) transcription, they allow genome-wide enhancer predictions (for reviews, see Buecker and Wysocka 2012; Maston et al. 2012; Shlyueva et al. 2014). Recent advances have explored the relationship between these correlative traits and enhancers, including their putative contributions to enhancer activity.
DNA accessibility is required for enhancer activity and is a good predictor of enhancers
Histone octamers compact genomic DNA into units termed nucleosomes, which restrict the access of other proteins such as TFs to DNA (Svaren et al. 1994; Walter et al. 1995; Liu et al. 2006), creating a barrier for enhancer activation. Interestingly, nucleosomes have different affinities for different DNA sequences (Thåström et al. 1999; Bernstein et al. 2004; Segal et al. 2006; Kaplan et al. 2009; Brogaard et al. 2012), and enhancers appear to have a particularly high sequence-encoded nucleosome positioning preference. Inactive enhancers are therefore typically occupied by nucleosomes, which block TF-binding sites (TFBSs), establishing a “default off” state (Lidor Nili et al. 2010; Charoensawan et al. 2012; Barozzi et al. 2014). Enhancer activation is thought to rely on specialized pioneer TFs that bind and displace nucleosomes (Perlmann and Wrange 1988; Imbalzano et al. 1994; Cirillo et al. 2002; Zaret and Carroll 2011; Soufi et al. 2015), presumably via ATP-dependent chromatin remodelers (for reviews, see Clapier and Cairns 2009; Hargreaves and Crabtree 2011), although alternative mechanisms are possible (for reviews, see Deplancke et al. 2016; Reiter et al. 2017).
Intriguingly, recent work has shown that nucleosomal DNA can be accessible to micrococcal nuclease (MNase) (Iwafuchi-Doi et al. 2016; Mieczkowski et al. 2016; Mueller et al. 2017), particularly at enhancers and promoters that have less stably bound nucleosomes, as determined by salt extraction fractionation (Henikoff et al. 2009). Decreased nucleosome stability and increased dynamics can be caused by TFs that alter nucleosome positioning (Iwafuchi-Doi et al. 2016), post-translational modifications of histones (for review, see Tessarz and Kouzarides 2014), or the incorporation of histone variants (Dion et al. 2007; Henikoff et al. 2009; Mieczkowski et al. 2016; for review, see Talbert and Henikoff 2017). The H3.3 variant, for example, decreases nucleosome stability (Jin and Felsenfeld 2007), and its incorporation into nucleosomes at enhancers and promoters is mediated by the histone chaperone complex HIRA (Goldberg et al. 2010), recruited by the DNA-binding RPA complex (Zhang et al. 2017). Consistently, depletion of either RPA or HIRA impairs H3.3 incorporation and affects transcription, emphasizing the importance of destabilized nucleosomes for gene expression.
Given its importance for TF binding and enhancer activity, DNA accessibility has been used to predict enhancers and is one of the most predictive chromatin features (Boyle et al. 2008; Kwasnieski et al. 2014). However, other genomic regions such as insulators or promoters are also accessible (Boyle et al. 2008), and DNA accessibility of enhancers does not quantitatively reflect their activity; in fact, inactive enhancers can also be accessible (Arnold et al. 2013; Andersson et al. 2014b). Therefore, enhancer prediction approaches typically consider more features to increase the specificity toward active enhancers.
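The way several correlative features can be combined into a prediction may be illustrated with a deliberately simplified sketch. It assumes that each candidate region has already been reduced to presence/absence calls for DNA accessibility, H3K4me1, H3K4me3, and H3K27ac; the `Region` type, the `classify` function, and the category labels are hypothetical names chosen for illustration and do not correspond to any published tool. Real prediction methods use quantitative signals and statistical models, and, as discussed in this review, none of these marks is strictly required for or fully specific to enhancers.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical per-region summary of binary chromatin feature calls."""
    name: str
    accessible: bool  # e.g., a peak call from an accessibility assay
    h3k4me1: bool     # mark frequently found at enhancers
    h3k4me3: bool     # mark frequently found at promoters
    h3k27ac: bool     # mark frequently found at active enhancers and promoters


def classify(region: Region) -> str:
    """Toy rule-based prediction; the rules are illustrative assumptions."""
    if not region.accessible:
        return "closed"
    if region.h3k4me3:
        return "promoter-like"
    if region.h3k4me1 and region.h3k27ac:
        return "putative active enhancer"
    if region.h3k4me1:
        return "putative primed or inactive enhancer"
    # Accessible without enhancer- or promoter-associated marks,
    # as can be the case for insulators.
    return "other accessible region"


if __name__ == "__main__":
    candidates = [
        Region("r1", accessible=True, h3k4me1=True, h3k4me3=False, h3k27ac=True),
        Region("r2", accessible=True, h3k4me1=False, h3k4me3=True, h3k27ac=True),
        Region("r3", accessible=True, h3k4me1=False, h3k4me3=False, h3k27ac=False),
    ]
    for region in candidates:
        print(region.name, classify(region))
```

Adding features in this way narrows the set of accessible regions to those more likely to be active enhancers, at the cost of missing active enhancers that lack the chosen marks.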
Certain histone modifications are predictive of enhancers yet are not required for enhancer activity
Some post-translational modifications (PTMs) of histones are frequently found at enhancers and are catalyzed by enzymes such as P300/CBP or Mll3/4 that function as transcriptional activators. While PTMs of histone residues that contact DNA can affect nucleosome stability (Neumann et al. 2009; Tropberger et al. 2013; Pradeepa et al. 2016; for review, see Tessarz and Kouzarides 2014), PTMs in the unstructured histone tails might recruit so-called reader proteins, but their requirement for enhancer function is less clear. The modifications of histone tail residues that are most often found at enhancers are H3K27ac and H3K4me1 (Heintzman et al. 2009; Creyghton et al. 2010; Rada-Iglesias et al. 2011; Zentner et al. 2011; Bonn et al. 2012; for reviews, see Buecker and Wysocka 2012; Shlyueva et al. 2014).
H3K4me1 occurs at most or all enhancers in both their active and inactive or primed states prior to activation (Creyghton et al. 2010; Rada-Iglesias et al. 2011; Zentner et al. 2011; Bonn et al. 2012; Arnold et al. 2013). Despite this extensive co-occurrence, two recent studies demonstrated that H3K4me1 is not required for enhancer activity: Even though rendering the mammalian H3K4 methyltransferases Mll3/Mll4 or their fly ortholog, Trithorax-related (Trr), catalytically inactive efficiently depleted H3K4me1, it had negligible effects on gene expression and resulted in viable and fertile flies (Dorighi et al. 2017; Rickels et al. 2017). The same mild impact on global gene expression was observed with a hyperactive mutant of Trr that increased H3K4me1 above physiological levels (Rickels et al. 2017). Even though H3K4me1 loss might affect individual genes, potentially relating to distal enhancer contacts and H3K4me1-bound proteins (Local et al. 2018; Yan et al. 2018), these observations suggest that H3K4me1 is not generally required for enhancer activity and does not seem to be able to cause ectopic gene expression. This is in stark contrast to the consequences of depleting the respective methyltransferases, which strongly perturbs gene expression and is lethal in flies and mice (Sedkov et al. 1999; Lee et al. 2013; Dorighi et al. 2017; Rickels et al. 2017; for reviews, see Shilatifard 2012; Herz et al. 2013). A similar finding applies to UTX, a H3K27 demethylase that recruits P300 and Mll4 and is required for enhancer activity, while its catalytic activity is not (Wang et al. 2017).
While the catalytic activity of Mll3/4/Trr methyltransferases is dispensable for transcription activation, the acetyltransferase activity of P300/CBP is required (Hilton et al. 2015; Boija et al. 2017). P300/CBP acetylates H3K27, a residue that can also be trimethylated to form H3K27me3 during Polycomb-repressive complex 2 (PRC2)-mediated silencing (Boyer et al. 2006). Indeed, H3K27 mutations can dominantly affect PRC2 target genes, as was shown experimentally by injecting mRNA of H3K27R (where R is arginine, a positively charged amino acid that cannot be acetylated or methylated) into mouse zygotes, which impaired heterochromatin establishment and transcriptional silencing (Santenard et al. 2010). More importantly, dominant PRC2 loss of function is also thought to underlie the cancer-causing effect of the H3K27M mutant that is observed in >70% of pediatric gliomas (Schwartzentruber et al. 2012; Wu et al. 2012; Bender et al. 2013; Lewis et al. 2013; Funato et al. 2014; Herz et al. 2014).
To assess the regulatory relevance of H3K27 in Drosophila, Pengelly et al. (2013) analyzed cells in which most of the canonical H3 was replaced with H3K27R. H3K27R mutant cells failed to silence PRC2 target genes and formed tissues that displayed developmental defects similar to PRC2 mutants. While enhancer activity was not directly assessed and noncanonical H3 variants were not mutated, the absence of a general loss of transcription suggests that H3K27ac might not be required for enhancer activity (Pengelly et al. 2013). This is consistent with independent observations that active enhancers in both flies (Bonn et al. 2012) and mice (Pradeepa et al. 2016) are not necessarily marked by H3K27ac and suggests that P300/CBP might target histone residues other than H3K27 (Pradeepa et al. 2016) or nonhistone proteins such as TFs (Edmunds and Mahadevan 2004; Ashwell 2006; Kim et al. 2006; Roe et al. 2015) or preinitiation complex (PIC) components, including Pol II (Schröder et al. 2013). However, even if not directly involved in transcription activation, H3K27ac could modulate other aspects of enhancer activity; for example, destabilizing nucleosomes or recruiting H3K27ac-binding proteins.
Histone 3 Ser10 and Ser28 phosphorylation at mitogen- and stress-activated protein kinase 1/2 (MSK1/2)-induced enhancers might also have an auxiliary role (Sawicka et al. 2014; Josefowicz et al. 2016). While the MSK1/2 kinases downstream from p38 and ERK signaling are mainly known to activate TFs by phosphorylation (Wiggin et al. 2002), nucleosomes that flank the TF-bound enhancers also become phosphorylated at H3S10 and H3S28 during kinase signaling (Sawicka et al. 2014). Mutation of H3S28, but not H3S10, to alanine reduced the recruitment of p300 to these enhancers, and in vitro transcription assays show that transcription output is reduced (Josefowicz et al. 2016). While the effects were small compared with the activation mediated by the MSK1/2 target TFs, this example shows that PTMs at secondary targets near enhancers can gain modulatory roles with potential benefits for the robustness or efficiency of enhancer function.
Altogether, recent data suggest that enhancer-associated histone modifications are not necessarily required for enhancer activity (Fig. 2); i.e., even a strong correlation does not imply causation (Pollex and Furlong 2017). We argue that the recent finding that H3K4me1 is entirely dispensable for transcription regulation warrants careful reconsideration of a putative role of other histone modifications such as H3K27ac. Among the possible functions of histone PTMs that could be more difficult to assess are, for example, indirect ones; i.e., PTMs that function by preventing others, such as H3K27ac, which may function to prevent H3K27 methylation and Polycomb-mediated silencing (Pengelly et al. 2013). Alternatively, it is possible that several PTMs contribute to enhancer function through parallel mechanisms such that the loss of one PTM may be obscured by the presence of others. However, it is also possible that some histone modifications in the vicinity of enhancers could be functionally neutral by-products: Perfect specificity does not exist in biological systems, and increasing specificity is typically costly, implies a trade-off in sensitivity or enzyme kinetics, and can evolve only if it confers selective advantage. If that were the case, evolution may still have made use of such by-products to modulate enhancer function or increase robustness to conditions not typically assessed in laboratory studies.
Figure 2.
Contributions of cofactors and histone tail modifications to enhancer activity. (A) The methyltransferases Mll3/4/Trr are required for enhancer activity and transcription, but their methyltransferase activity is not (Dorighi et al. 2017; Rickels et al. 2017). (Top) An active enhancer with methyltransferase and H3K4me1-marked flanking histones activates gene expression. (Middle) Mutation of the catalytic domain results in the loss of H3K4me1 but maintains enhancer activity and gene expression. (Bottom) Knockout of the methyltransferase leads to the loss of gene expression. (B) Enzymatic targets of acetyltransferases P300/CBP, which have been reported to acetylate many proteins, including TFs, cofactors, histones, and members of the PIC, including Pol II (for references, see the text). (C) H3K27ac may have an indirect role in preventing PRC2-mediated silencing. (Top) PRC2 (brown) catalyzes H3K27me3, which is blocked by H3K27ac, preventing PRC2-mediated silencing. (Bottom) Mutations of H3K27 to methionine (M; as observed frequently in pediatric gliomas) or arginine (R) prevent both acetylation and methylation. Both mutations induce changes in gene expression that mimic PRC2 loss-of-function H3K27M in a dominant fashion, as indicated by the dashed cross (for references, see the text).
Enhancer transcripts as predictors of enhancer activity
Enhancer transcription is also correlated with enhancer activity, as has been observed for individual genes (Tuan et al. 1992) and by genome-wide analysis of transcription and Pol II binding in mammals (Kim et al. 2010; Djebali et al. 2012), flies (De Santa et al. 2010; Kharchenko et al. 2011; Bonn et al. 2012), and nematodes (Chen et al. 2013; for review, see Li et al. 2016).
Transcription from enhancers has been reported to often be bidirectional (Kim et al. 2010; Andersson et al. 2014b), although it can also be unidirectional (Koch et al. 2011). The resulting enhancer RNAs (eRNAs) may be polyadenylated and stable (Koch et al. 2011; Andersson et al. 2014b), but, more typically, eRNAs are unstable and rapidly degraded by the exosome (Andersson et al. 2014b; Lubas et al. 2015). This instability hinders eRNA detection with RNA sequencing approaches that measure steady-state RNA levels (Rabani et al. 2014; Schwalb et al. 2016). Indeed, eRNA detection is improved by exosome inhibition (Andersson et al. 2014a) or methods that measure nascent RNA, such as global run-on (GRO) sequencing (GRO-seq) (Core et al. 2008), precision nuclear run-on (PRO) sequencing (PRO-seq) (Kwak et al. 2013), START-seq (Scruggs et al. 2015), native elongating transcript (NET) sequencing (NET-seq) (Churchman and Weissman 2011), and transient transcriptome sequencing (TT-seq) (Schwalb et al. 2016; Michel et al. 2017).
The correlation of eRNA transcription and enhancer activity has led to the proposal that enhancers and promoters might be more similar than traditionally assumed (Core et al. 2014; Andersson 2015; Kim and Shiekhattar 2015). It has also been used for enhancer prediction across different cell types and tissues (Melgar et al. 2011; Andersson et al. 2014a) and in inducible systems such as neuronal activation (Kim et al. 2010; Schaukowitch et al. 2014), immune response (De Santa et al. 2010; Kaikkonen et al. 2013; Michel et al. 2017), hormone signaling (Hah et al. 2011, 2013; Wang et al. 2011; Li et al. 2013; Lai et al. 2015), or the modulation of TF activity (Melo et al. 2013). Interestingly, the timing between enhancer and gene transcription seems to be locus-specific (or may depend on the approaches used), with eRNA transcription preceding mRNA production for some loci (De Santa et al. 2010; Arner et al. 2015), while, for others, both RNA species were transcribed synchronously (Kaikkonen et al. 2013; Michel et al. 2017).
Even though eRNAs seem to generally be good predictors of active enhancers (Melgar et al. 2011; Wang et al. 2011; Andersson et al. 2014a; Rennie et al. 2017; Henriques et al. 2018; Mikhaylichenko et al. 2018), not all predicted candidates function as enhancers. For example, while 70% of CAGE (cap analysis of gene expression)-defined enhancers were validated in reporter assays (Andersson et al. 2014a), the remaining 30% may have other regulatory functions, and, indeed, bidirectional transcription has been reported for insulators (Melgar et al. 2011) and accessible DNA in general (Young et al. 2017). Similarly, active enhancers might show little or no eRNA transcription, outcomes that depend on the sensitivity of eRNA detection. For example, 20%–33% of the tested regions without detectable eRNA initiation yet with enhancer-associated histone marks showed enhancer activity in reporter assays in mammalian cells (Andersson et al. 2014a), and some active developmental enhancers in Drosophila did not initiate eRNAs to detectable levels (Mikhaylichenko et al. 2018). Enhancers that do or do not strongly initiate eRNA transcription seem to differ at the sequence level (i.e., the occurrence of core promoter elements) (Andersson et al. 2014b; Core et al. 2014; Arnold et al. 2017; Mikhaylichenko et al. 2018), and it will be interesting to see whether cell types or tissues (or their respective proliferation status) also influence eRNA abundance. Overall, the results obtained so far suggest that eRNAs associate with enhancer activity across different species but that they are not perfectly predictive and therefore may be neither required nor sufficient for enhancer activity. This further suggests that enhancer and promoter elements can co-occur in the genome but that enhancer and promoter functionalities are distinct and not interdependent (Arnold et al. 2017; Catarino et al. 2017; Mikhaylichenko et al. 2018).
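The distinction between bidirectional and unidirectional transcription at a candidate region can be made concrete with a small sketch. It assumes that nascent-RNA or CAGE reads initiating within a region have already been counted separately for the forward and reverse strands; the score D = (F − R)/(F + R), the function names, and the cutoff of 0.8 are illustrative assumptions rather than the procedure of any particular study cited here.

```python
def directionality(forward_reads: int, reverse_reads: int) -> float:
    """Return (F - R) / (F + R): 0 is balanced, +1 or -1 is fully one-sided."""
    total = forward_reads + reverse_reads
    if total == 0:
        raise ValueError("region has no reads on either strand")
    return (forward_reads - reverse_reads) / total


def is_bidirectional(forward_reads: int, reverse_reads: int,
                     max_abs_score: float = 0.8) -> bool:
    """Call a region bidirectional if neither strand dominates too strongly.

    The default cutoff is an arbitrary illustrative value.
    """
    return abs(directionality(forward_reads, reverse_reads)) <= max_abs_score


if __name__ == "__main__":
    for name, fwd, rev in [("balanced", 60, 40), ("one-sided", 99, 1)]:
        print(name, round(directionality(fwd, rev), 2), is_bidirectional(fwd, rev))
```

Because such a score depends on read depth and on the sensitivity of eRNA detection, regions with few reads would in practice need to be filtered or treated with caution.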
Proposed eRNA functions are diverse and context-dependent
Establishing a causal and potentially mechanistic relationship between eRNAs and enhancer activity has been difficult, presumably for at least three reasons: (1) Enhancer activity might be influenced by the act of eRNA transcription or by the nascent or mature RNAs. (2) Genetic manipulations of eRNA sequences necessarily alter the enhancer sequence and thus, potentially, its activity. (3) Enhancer activity likely impacts eRNA transcription, creating a circularity that obscures directionality and causality.
Pol II binding and eRNA transcription have been reported to displace nucleosomes and establish DNA accessibility (Gilchrist et al. 2010; Mousavi et al. 2013) such that the act of eRNA transcription might have a role in enhancer activity. Alternatively, nascent eRNAs could be important, as they have been reported to perform diverse functions, including the stabilization of TF binding (Sigova et al. 2015), the recruitment and activation of cofactors (Kaikkonen et al. 2013; Gardini et al. 2014; Lai et al. 2015; Bose et al. 2017), the release of NELF from promoters (Schaukowitch et al. 2014), or the promotion of cohesin-mediated enhancer–promoter contacts (Li et al. 2013; Hsieh et al. 2015; Isoda et al. 2017). However, each of these functions seems to apply only to individual eRNAs rather than globally or may depend on the respective experimental models or approaches. A recent proposal might reconcile the diversity of eRNA functions, particularly regarding protein recruitment: If eRNAs mediated the formation of specialized membraneless compartments at active enhancers or promoters via phase transition (Muerdter and Stark 2016; Hnisz et al. 2017), these compartments could feature high local concentrations of diverse activators. RNA-mediated phase transition has indeed been reported for RNAs with tandem repeats and for nucleolar rRNAs (Berry et al. 2015; Jain and Vale 2017). However, it remains to be tested whether short unstable eRNAs with highly diverse sequences could drive phase transition and whether this could contribute to enhancer activity. It will also be interesting to learn how compartments around active gene loci can remain separated from compartments with repressive properties (Banani et al. 2017), such as those formed during heterochromatin protein 1 (HP1)-mediated phase transition (Larson et al. 2017; Strom et al. 2017).
The genetic manipulation of eRNAs is also challenging, as it necessarily affects the DNA sequence of the enhancers and thus, potentially, the enhancers’ activities. This has been highlighted by recent work that assessed the function of long noncoding RNAs (lncRNAs) by polyA site insertion next to the lncRNA promoter, enforcing early transcription termination (Anderson et al. 2016; Engreitz et al. 2016; Paralkar et al. 2016). In many cases, the lncRNAs seemed to be dispensable, and transcription was activated by enhancers located proximally to the lncRNAs’ promoters (for discussion, see Bassett et al. 2014; Espinosa 2016). The lncRNA upperhand is one of the exceptions, as it seems to control Hand2 expression and heart development (Anderson et al. 2016). Similarly, the direct depletion of eRNAs by either RNAi or RNase H has been reported to abrogate the expression of individual genes (Hah et al. 2013; Lam et al. 2013; Li et al. 2013; Schaukowitch et al. 2014), yet this approach has not been used very frequently, presumably due to the inefficiency of targeting nascent RNAs by these approaches or the difficulty in further increasing the turnover of already short-lived eRNAs (De Santa et al. 2010; Andersson et al. 2014b; Lubas et al. 2015).
A final complication is that enhancer activity likely influences eRNA transcription: Active enhancers are characterized by accessible DNA and TFs that recruit activating cofactors such as P300/CBP, Mll3/4, or Mediator, which can bind and/or activate Pol II at target promoters. This established mechanism implies the existence of strongly activating cues at enhancers, making it likely that enhancers are transcribed even if this transcription were not functional; i.e., entirely neutral. This is because evolving a sequence specificity for transcription initiation that makes initiation perfectly specific for gene starts (or an active mechanism that prevents initiation in any other region) is energetically costly and can evolve only if there is a strong selective advantage. In addition, TSSs in enhancers contain sequences that weakly match core promoter elements, short sequences that can occur by chance even in random sequences (Andersson et al. 2014a; Mikhaylichenko et al. 2018). While a sensible null hypothesis should therefore be that eRNAs are neutral by-products of accessible DNA in the vicinity of strong transcriptional activators (Young et al. 2017; for discussion, see Natoli and Andrau 2012), it is possible that evolution has taken advantage of Pol II binding, transcription, or eRNAs at enhancers and evolved means to modulate enhancer function.
Toward a functional definition of regulatory elements
Genomic traits used to predict enhancers can also be found in other genomic regions, making enhancer predictions imperfect. Thus, enhancer identification should include direct functional tests of enhancer activity, following the original definition of enhancers as DNA sequences that increase transcription from distal promoters (Banerji et al. 1981; for review, see Shlyueva et al. 2014). Following this paradigm, two approaches are being applied that measure either the ability of candidate sequences to drive transcription in standardized reporter assays or the requirement of candidate regions for endogenous gene expression; i.e., tests of sufficiency and necessity, respectively.
Enhancer DNA is sufficient for enhancer activity
One of the most fascinating properties of enhancers is their functional autonomy; i.e., their ability to retain their transcription-activating function outside their endogenous contexts even in combination with heterologous promoters and reporter genes (Banerji et al. 1981). Ectopic assays exploit this property to test DNA sequences separated from their endogenous sequence and chromatin environments. This removes any regulatory cues that could confound the results, providing a fair comparison between the enhancer activities of different DNA sequences. Such activities can be quantified by measuring the abundance of reporter mRNAs or proteins; e.g., via the proteins’ enzymatic activities. In fact, the first enhancer was identified using such reporter assays, which also provided the functional definition of enhancers as DNA sequence elements that activate transcription irrespective of distance, orientation, and position (Banerji et al. 1981; for review, see Shlyueva et al. 2014). Classical reporter assays (e.g., those based on luciferase) enable systematic tests and have been widely used but suffer from low throughput, as candidates need to be tested one by one.
Assessing the enhancer potential of candidate DNA sequences genome-wide
Several methods have taken advantage of NGS to vastly increase the throughput of ectopic enhancer activity assays (Fig. 3). Massively parallel reporter assays (MPRAs) uniquely associate candidate enhancers with barcodes—short unique DNA sequences that are used instead of reporter genes (Kwasnieski et al. 2012; Melnikov et al. 2012; Patwardhan et al. 2012). Enhancers drive expression of their associated barcodes, and the abundance of each barcode among all reporter mRNAs reflects the activity of the associated enhancer, allowing the parallel testing of many candidates (for reviews, see Shlyueva et al. 2014; White 2015; Santiago-Algarra et al. 2017). Self-transcribing active regulatory region (STARR) sequencing (STARR-seq) (Arnold et al. 2013; Muerdter et al. 2018; for review, see Muerdter et al. 2015) tests enhancer candidates downstream from the TSS such that active enhancers drive their own transcription. The direct coupling of enhancer sequences and activities in cis (i.e., the use of each enhancer as its own barcode) simplifies library construction and allows millions of candidates to be tested at once, enabling genome-wide screens in fly and mammalian cells (Arnold et al. 2013; Muerdter et al. 2018).
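How barcode abundance translates into an activity estimate can be sketched as follows. The example assumes that barcode read counts are available both from reporter mRNA and from the input plasmid DNA and that activity is expressed as the log2 ratio of each candidate's share of RNA reads to its share of DNA reads. Normalizing to DNA input and adding a pseudocount are common choices that are assumed here and are not described in the text above; the function name `enhancer_activity` and all data are hypothetical.

```python
import math
from collections import defaultdict


def enhancer_activity(barcode_to_enhancer, rna_counts, dna_counts, pseudocount=1.0):
    """Return log2(RNA fraction / DNA fraction) per candidate enhancer.

    barcode_to_enhancer: dict mapping barcode -> candidate enhancer name
    rna_counts, dna_counts: dicts mapping barcode -> read count
    pseudocount: assumed small constant that avoids division by zero
    """
    rna_total = sum(rna_counts.values())
    dna_total = sum(dna_counts.values())

    # Pool the counts of all barcodes assigned to the same candidate.
    rna = defaultdict(float)
    dna = defaultdict(float)
    for barcode, enhancer in barcode_to_enhancer.items():
        rna[enhancer] += rna_counts.get(barcode, 0)
        dna[enhancer] += dna_counts.get(barcode, 0)

    activity = {}
    for enhancer in dna:
        rna_fraction = (rna[enhancer] + pseudocount) / (rna_total + pseudocount)
        dna_fraction = (dna[enhancer] + pseudocount) / (dna_total + pseudocount)
        activity[enhancer] = math.log2(rna_fraction / dna_fraction)
    return activity


if __name__ == "__main__":
    mapping = {"AAA": "candidate_A", "CCC": "candidate_A", "GGG": "candidate_B"}
    rna = {"AAA": 30, "CCC": 30, "GGG": 4}
    dna = {"AAA": 10, "CCC": 10, "GGG": 20}
    for name, score in enhancer_activity(mapping, rna, dna).items():
        print(name, round(score, 2))
```

For a STARR-seq-like design, in which each candidate serves as its own barcode, the same calculation would apply with a mapping that assigns every candidate sequence to itself.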
Figure 3.
Ectopic reporter assays measure enhancer activities quantitatively. (A) Ectopic assays remove candidate sequences from their endogenous loci and test them in a heterologous setup using reporter genes (green) such as β-galactosidase (lacZ) or luciferase. Such reporter constructs can be used with nonintegrating (episomal) plasmids or can be integrated into the genome. (B) High-throughput reporter assays vastly increase the number of candidates per experiment by replacing the reporter gene with either a barcode or the enhancer itself. MPRAs uniquely assign each candidate to a barcode and define enhancer activities by quantifying the barcode-containing transcripts. STARR-seq uses each candidate as its own barcode, which simplifies library cloning and increases throughput. Both types of assays typically include ORFs (green) to stabilize the reporter mRNA.
The high-throughput testing of enhancer candidates and variants in multiplexed ectopic assays enables their application to many questions, including the importance of TF-binding motifs (Melnikov et al. 2012; Patwardhan et al. 2012; Kheradpour et al. 2013; Yanez-Cuna et al. 2014), their arrangements (Smith et al. 2013; Erceg et al. 2014; Fiore and Cohen 2016; White et al. 2016; Vierbuchen et al. 2017), and other sequence elements (White et al. 2013; Vockley et al. 2016; Grossman et al. 2017; Chaudhari and Cohen 2018) for enhancer activity or the functional impact of single-nucleotide polymorphisms (SNPs) (Kwasnieski et al. 2012; Reddy et al. 2012; Vockley et al. 2015; Tewhey et al. 2016; Ulirsch et al. 2016; for reviews, see Maston et al. 2012; Spitz and Furlong 2012; Yáñez-Cuna et al. 2013; Levine et al. 2014; Shlyueva et al. 2014; Zabidi and Stark 2016). It will be exciting to see such approaches advance our understanding of how enhancer activities are encoded in enhancer sequences.
Integrated reporters can recapitulate developmental enhancer activities but with differences from the endogenous enhancer loci
The study of cell type-specific enhancer activities across entire embryos and throughout development requires the reporter constructs to be integrated into the genome, which is typically achieved by microinjections (Kothary et al. 1989; Visel et al. 2008) or by integrating retroviruses and lentiviruses (Murtha et al. 2014). However, this leads to random integrations likely biased toward accessible DNA near active enhancer or promoter regions (Bushman 2003; Myers et al. 2005), which may strongly influence reporter gene transcription (Akhtar et al. 2013). To allow more controlled comparisons, enhancer candidates have been tested using reporters integrated into identical genomic positions (e.g., Dickel et al. 2014; Kvon et al. 2014) in a trade-off with throughput.
Genomically integrated reporter assays have been used to define enhancer activities throughout development in both flies (Kvon et al. 2014) and mice (Spitz et al. 2003; Visel et al. 2007; Marinić et al. 2013; Osterwalder et al. 2018), recapitulating dynamic and cell type-specific enhancer activities that often (82% for developmental enhancers in Drosophila embryos) (Kvon et al. 2014) match the endogenous activities as judged by the expression pattern of neighboring genes (Sagai et al. 2005; Kvon et al. 2014; Kvon 2015). Interestingly, however, the activity patterns of some enhancers in such ectopic assays were broader than the enhancers’ endogenous activities (Spitz et al. 2003; Kvon et al. 2014). These discrepancies might stem from additional regulatory elements that regulate the enhancers’ target genes or from locus-specific transcriptional silencing of the enhancers, mediated, for example, by flanking sequences and chromatin features that differ between the endogenous loci and the ectopic site.
Episomal and integrated reporters in defined cell types
In contrast to the assays above, enhancer activities in individual defined cell types in culture are often tested using episomal plasmid-based reporters, particularly when many candidates are tested in highly parallelized assays (for review, see White 2015). Studies of simian virus 40 (SV40) DNA using electron microscopy suggest that episomal plasmid DNA is chromatinized and acquires nucleosomes similar to genomic DNA (Cremisi et al. 1975), although the chromatin likely differs from the enhancer candidates’ endogenous loci. Indeed, a substantial fraction of the sequences identified as strong enhancers in such assays is likely silenced in their endogenous contexts as judged by DNA accessibility (∼30% of enhancers active in ectopic assays in fly cells and 60% in human cells are closed) (Arnold et al. 2013; Muerdter et al. 2018). The prevalence of H3K27me3 and H3K9me3 marks at the endogenous loci of such closed enhancers suggests that they are silenced at the chromatin level by the Polycomb- and HP1-dependent pathways (Arnold et al. 2013; Muerdter et al. 2018). In human cells, many closed enhancers are retrotransposons, and, consistently, an LTR-derived mouse mammary tumor virus (MMTV) promoter was silenced only when stably integrated into the cellular chromatin but not on an episomal plasmid (Archer et al. 1992).
Genomic integration therefore has been assumed to provide a chromosomal environment similar to the enhancers’ natural chromatin state. However, this assumption is not generally true, as the insertion sites of the reporter constructs are typically different from the endogenous sites of the candidate enhancers (particularly when retroviruses or lentiviruses are used) (see above). Importantly, Drosophila S2 cell enhancers that were accessible in their endogenous contexts (i.e., open) and those that were silenced at the chromatin level (i.e., closed) were active in both episomal and genomically integrated assays (Arnold et al. 2013). Consistently, enhancer activities measured by reporter assays using integrating and nonintegrating lentiviruses were, overall, highly similar (Pearson's correlation coefficient = 0.85) (Inoue et al. 2017). These outcomes suggest that differences in enhancer activities may not result from episomal versus chromosomally integrated assays but from chromatin-mediated developmental silencing that is established during cell type differentiation and can be maintained in differentiated cells but not established de novo. In a given cell type, episomal and genomically integrated assays therefore should yield similar results, which, however, can differ from the candidates’ endogenous activities (Arnold et al. 2013; Muerdter et al. 2018).
Assessing the impact of enhancer candidates on gene expression by genetic enhancer perturbation
A key question not addressed by ectopic enhancer activity assays is whether—and how—a genomic enhancer affects cellular gene expression. It is known that mutations or deletions of genomic enhancers can cause gene misregulation and lead to different diseases, including developmental defects (e.g., Sagai et al. 2005) or cancer (e.g., Pomerantz et al. 2009; Wasserman et al. 2010; Sur et al. 2012; Mansour et al. 2014). In fact, many disease-associated genetic variants identified through genome-wide association studies (GWASs) map to noncoding regulatory sequences (Degner et al. 2012; Maurano et al. 2012; Schaub et al. 2012; Karczewski et al. 2013). The question of how different enhancers contribute to cellular gene expression motivated a second type of enhancer activity assay based on the genetic perturbation of genomic enhancers (Fig. 4).
Figure 4.
Clustered regularly interspaced short palindromic repeat (CRISPR)–Cas9-based approaches to assess endogenous enhancer activities. (A) Endogenous enhancer activities are typically assessed by genetic perturbation and assays that detect loss of enhancer function. Transcription of an endogenous target gene is driven by the enhancer (blue) and is lost upon enhancer mutation or deletion by Cas9 and guide RNAs (gRNAs). Deletions can be repaired through homologous recombination to insert exogenous sequences (purple), allowing essentially arbitrary manipulations such as the exchange of enhancers with homologous sequences from other species (e.g., Kvon et al. 2016). (B) Endogenous high-throughput screens rely on cell selection to enrich for gRNAs that perturb enhancer activities. In a typical screen, a pool of gRNAs is transfected into cells, which introduces mutations or deletions in candidate regions. gRNAs that target active enhancers (blue) disrupt target gene expression and can be enriched by selecting for a cellular phenotype (e.g., increased proliferation [left] or reporter gene expression [right]). The gRNAs enriched in the selected cells can identify the enhancers targeted (for references, see the text).
Versatile testing of endogenous enhancer activities by clustered regularly interspaced short palindromic repeat (CRISPR)–Cas9
Early experimental approaches disrupted endogenous enhancer activities as part of genetic screens via random insertional mutagenesis with transposons (e.g., P element or sleeping beauty) that randomly integrate in the genome (for reviews, see Kawakami et al. 2017; Kebriaei et al. 2017). This approach led, for example, to the identification of the extensive regulatory region of the Drosophila gene decapentaplegic (dpp) (St Johnston et al. 1990), which was further defined using ectopic enhancer activity assays (Blackman et al. 1991). Targeted genome engineering allows the direct testing of specific candidate sequences and was used, for example, to demonstrate the importance of the sonic hedgehog (shh) limb enhancer for shh expression and limb development (Sagai et al. 2005). Targeted editing of specific genomic regions has been revolutionized recently by CRISPR–Cas9, which introduces double-strand breaks in target DNA sequences defined by sequence-complementary guide RNAs (gRNAs) (Jinek et al. 2012).
The simultaneous use of two gRNAs to delete defined genomic regions (Fig. 4A) has been used, for example, to measure the regulatory contribution of an enhancer cluster (also called “superenhancer” [SE]) (Hnisz et al. 2013; Lovén et al. 2013; Whyte et al. 2013) in the Sox2 locus in embryonic stem cells (Li et al. 2014; Zhou et al. 2014) and in the Myc locus in blood cells (Bahr et al. 2018). In addition, the targeted deletion of individual constituent enhancers within SEs revealed that enhancer activity is mostly dependent on a few constituents that activate transcription predominantly additively (Hnisz et al. 2015; Hay et al. 2016; Moorthy et al. 2017; Xie et al. 2017; for discussion, see Dukler et al. 2016). In Drosophila, enhancer deletion has been used to uncouple the tissue-specific contribution of different enhancers to the overall expression pattern of the rhomboid gene (Rogers et al. 2017).
When a DNA template is provided to repair the double-strand breaks via homology-directed DNA repair, essentially arbitrary manipulations are possible, including the insertion of defined sequences. For example, to study the evolution of shh expression in vertebrates, the endogenous shh limb enhancer sequence was replaced by orthologous sequences from different species, including snakes (Kvon et al. 2016). Substitution of the mouse enhancer with the snake enhancer led to development of limbless (or “serpentized”) mice due to a 17-base-pair deletion in the snake enhancer that removed binding sites for the TF ETS1. Overall, CRISPR–Cas9 provides a powerful and flexible approach to assess the impact of endogenous enhancers on gene expression in vivo, including the functional impact of SNPs that are significantly associated with phenotypic traits according to GWASs (Smemo et al. 2014; Yao et al. 2014; Claussnitzer et al. 2015; Singh and Schimenti 2015; Cohen et al. 2017).
Multiplexed testing of endogenous enhancer functions by CRISPR–Cas9
CRISPR/Cas9-mediated loss-of-function studies of enhancers have been multiplexed to screen several regions via multiple gRNAs simultaneously (for review, see Lopes et al. 2016). While the approaches discussed above used CRISPR/Cas9 to introduce defined genetic manipulations and evaluated their impact on transcription, multiplexed screens introduce many DNA alterations via the use of complex gRNA pools, select cells based on a particular phenotype (e.g., increased proliferation), and identify gRNAs that are significantly enriched or depleted in the selected cells (Fig. 4B). Such screens have been used to identify functional enhancers among p53- and ERα-binding sites (Korkmaz et al. 2016) or identify important DNA motifs within regulatory regions (Canver et al. 2015; Sanjana et al. 2016). Since not all genes are associated with a selectable cellular phenotype, target genes can be tagged with GFP, allowing selection by FACS (Rajagopal et al. 2016; Diao et al. 2017). FACS-based CRISPR/Cas9 screens for genomic regulatory elements revealed regions required for the expression of the respective GFP-tagged genes, including “closed” regions without chromatin properties typically associated with enhancers and promoter regions that functioned as enhancers of distal genes (Rajagopal et al. 2016; Diao et al. 2017; for discussion, see Catarino et al. 2017).
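The enrichment step of such a pooled screen can be sketched as follows. This is a simplified illustration rather than the analysis used in the cited studies: it assumes that gRNA read counts are available for a selected and an unselected cell population, scores each gRNA by a depth-normalized log2 ratio, and omits replicates and statistical testing. The function name, gRNA names, and counts are hypothetical.

```python
import math


def grna_log2_enrichment(selected, unselected, pseudocount=1.0):
    """Return log2(selected / unselected) per gRNA after depth normalization.

    Positive scores indicate gRNAs enriched in the selected cells,
    negative scores indicate depleted gRNAs.
    """
    sel_total = sum(selected.values())
    unsel_total = sum(unselected.values())

    scores = {}
    for grna, unsel_count in unselected.items():
        # Scale to reads per million so that sequencing depth cancels out.
        sel_rpm = (selected.get(grna, 0) + pseudocount) / sel_total * 1e6
        unsel_rpm = (unsel_count + pseudocount) / unsel_total * 1e6
        scores[grna] = math.log2(sel_rpm / unsel_rpm)
    return scores


# Hypothetical counts: cells sorted for loss of a GFP-tagged target gene.
unselected = {"g_enh1": 500, "g_enh2": 500, "g_neutral": 500, "g_ctrl": 500}
selected = {"g_enh1": 1500, "g_enh2": 300, "g_neutral": 100, "g_ctrl": 100}

scores = grna_log2_enrichment(selected, unselected)
top_hit = max(scores, key=scores.get)
print(top_hit, round(scores[top_hit], 2))
```

In practice, several gRNAs per region and replicate screens would be combined before a region is called as required for target gene expression.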
Targeted recruitment of transcriptional repressors can identify enhancers
Catalytically dead Cas9 (dCas9) retains its DNA targeting ability, making it a flexible in vivo recruitment device. While DNA binding of dCas9 alone can inhibit gene expression through steric hindrance (Gilbert et al. 2013; Qi et al. 2013), dCas9-mediated recruitment of transcriptional repressors is more potent (Gilbert et al. 2013). For example, recruitment of KRAB (Thakore et al. 2015; Adamson et al. 2016; Dixit et al. 2016; Klann et al. 2017) or LSD1 (Kearns et al. 2015) has been used to repress enhancers, which can also be exploited for enhancer identification. For example, a pool of gRNAs targeting candidate regions in the GATA1 and MYC loci (essential genes for K562 survival and proliferation) was used to recruit KRAB to these candidates and identify enhancers (Fulco et al. 2016). Overall, such high-throughput endogenous assays are able to test thousands of genomic regions in a single screen, which is, however, typically centered on a particular target gene with a selectable phenotype. Recently, dCas9-KRAB recruitment has been coupled with single-cell RNA sequencing methods, enabling the combined targeting of multiple enhancers while assessing global effects on gene expression for many genes in parallel (Xie et al. 2017).
Endogenous and ectopic enhancer activity provides complementary insights into enhancer sufficiency and necessity
Ectopic and endogenous assays address characteristically different questions and provide complementary insights into gene regulation (Fig. 5): Ectopic assays assess the sufficiency of DNA sequences to activate transcription, while endogenous perturbations assess the necessity of genomic regions for the expression of a specific gene. The implications of these differences are interesting and important: As ectopic reporter assays measure the ability of DNA sequences to drive reporter gene transcription in a neutral context, active candidates can be inactive in their endogenous contexts, silenced at the chromatin level (Arnold et al. 2013; Muerdter et al. 2018).
Figure 5.
Ectopic enhancer activity assays and genetic perturbations of endogenous enhancers are complementary, and the outcomes need to be interpreted with care. Each row represents a different scenario in which the candidate (blue) is an active cellular enhancer or not (ground truth; left columns). The right columns indicate the respective outcomes of ectopic enhancer activity assays and genetic perturbations. (Green checkmark) Enhancer activity detected; (red cross) no enhancer activity detected; (yellow checkmark) outcome depends on degree of redundancy. See the text for details.
The results of endogenous enhancer perturbations also need to be interpreted with care: Regions required for the expression of certain genes are not necessarily enhancers, as perturbations might influence transcription by other means; e.g., when insulators, locus control elements, or promoters are affected. Conversely, regions that do not appear necessary for the expression of a particular gene can still be active: They might regulate a different gene or act redundantly with other enhancers. Key developmental genes are commonly regulated by multiple enhancers, which can act redundantly to assure robust gene regulation (Frankel et al. 2010; Perry et al. 2010; for review, see Barolo 2012). While full redundancy is thought to be rare, as the negative selection preserving such redundant sequences and functions would be low (Nowak et al. 1997), many enhancers seem to act partially redundantly, assuring transcription above a certain threshold (Bothma et al. 2015; Lam et al. 2015; Cannavò et al. 2016; Chatterjee et al. 2016; Osterwalder et al. 2018) or diverging in spatial activity patterns over development (Kvon et al. 2014; Bahr et al. 2018).
These considerations suggest that results from ectopic enhancer activity assays should be interpreted in combination with methods that assess DNA accessibility (Arnold et al. 2013; Muerdter et al. 2018) and that results from endogenous enhancer perturbations need to be tested for enhancer functionality by enhancer activity assays.
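The reasoning in this section can be summarized as a small decision sketch that combines three observations per candidate: activity in an ectopic assay, DNA accessibility at the endogenous locus, and the outcome of an endogenous perturbation. The class, field names, and wording of the interpretations are hypothetical and only paraphrase the scenarios discussed in the text; real data would be quantitative rather than yes or no.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    name: str
    ectopic_active: bool                 # drives reporter transcription
    accessible: bool                     # open chromatin at the endogenous locus
    required_for_target: Optional[bool]  # perturbation lowers target expression; None if untested


def interpret(c: Candidate) -> str:
    """Return a cautious interpretation of the combined assay outcomes."""
    if c.ectopic_active:
        if c.required_for_target:
            return "enhancer that is necessary for the tested gene"
        if not c.accessible:
            return "sufficient ectopically but likely silenced at the endogenous locus"
        if c.required_for_target is None:
            return "likely active enhancer; necessity not tested"
        return "likely active enhancer; may be redundant or regulate another gene"
    if c.required_for_target:
        return "required region without enhancer activity; possibly insulator, promoter, or other element"
    return "no evidence for enhancer function"


examples = [
    Candidate("cand1", True, True, True),
    Candidate("cand2", True, False, None),
    Candidate("cand3", True, True, False),
    Candidate("cand4", False, True, True),
]
for cand in examples:
    print(cand.name, "->", interpret(cand))
```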
Functional enhancer activity assays define rules of promoter targeting within topologically associating domains (TADs)
Precise developmental gene regulation requires enhancers to specifically regulate their cognate promoters, and enhancer–promoter targeting is regulated at different levels, including the three-dimensional genome structure, DNA accessibility, and biochemical compatibilities (for review, see Zabidi and Stark 2016).
Distal enhancers come into close proximity with their target promoters (Wijgerde et al. 1995; Dillon et al. 1997), and some of these contacts are stable across developmental stages and appear to be independent of enhancer activity or gene transcription, while others seem to be dynamic and occur only during or after enhancer activation (Ghavi-Helm et al. 2014; Dixon et al. 2015; Fraser et al. 2015; Williamson et al. 2016; Dao et al. 2017; Rubin et al. 2017). Enhancer–promoter contacts occur typically within large chromosome domains, termed TADs (Dixon et al. 2012; Nora et al. 2012; Sexton et al. 2012), which are limited by boundaries that prevent interdomain contacts (Handoko et al. 2011; Schwartz et al. 2012; Narendra et al. 2015; for reviews, see Pombo and Dillon 2015; Merkenschlager and Nora 2016; Schmitt et al. 2016; Zabidi and Stark 2016). Within TADs, enhancer–promoter contacts appear to not be constrained (Symmons et al. 2014, 2016) yet may be facilitated or stabilized by factors such as YY1 (Weintraub et al. 2017). TADs may form by the extrusion of DNA loops (Nasmyth 2001; Alipour and Marko 2012; Sanborn et al. 2015; Fudenberg et al. 2016), with CTCF and cohesin being key factors in the formation of boundaries (Parelho et al. 2008; Wendt et al. 2008). Indeed, depletion of CTCF or cohesin disrupts TADs and seems to instead favor contacts within active or repressive compartments (Hou et al. 2012; Seitan et al. 2013; Zuin et al. 2014; Ing-Simmons et al. 2015; Ulianov et al. 2016; Haarhuis et al. 2017; Nora et al. 2017; Rao et al. 2017; Schwarzer et al. 2017; Wutz et al. 2017). Mutations of TAD boundaries highlight their importance in restricting enhancer–promoter contacts and gene activation, as they lead to the blurring of chromatin regions, improper gene expression, and developmental defects (Nora et al. 2012; Guo et al. 2015; Lupiáñez et al. 2015; Franke et al. 2016; Narendra et al. 2016; Hanssen et al. 2017).
Biochemical compatibilities between enhancers and promoters: different keys for different locks
Besides physical constraints imposed by the three-dimensional genome architecture, enhancers cannot regulate all promoters indiscriminately: Different promoters inserted at identical genomic positions were activated differentially (Butler and Kadonaga 2001), and housekeeping and developmental enhancers displayed a strong specificity toward housekeeping and developmental core promoters, respectively (Zabidi et al. 2015). This specificity appears to be encoded in the enhancer sequences and depends on specific TF motifs, suggesting that different enhancers recruit different TFs and cofactors to activate different promoters (Zabidi et al. 2015; Zabidi and Stark 2016). Indeed, directed recruitment of cofactors to minimal core promoters with the DNA-binding domain of Gal4 revealed that several cofactors were sufficient to activate transcription and showed preferences toward housekeeping versus developmental promoters or vice versa (Stampfel et al. 2015). These results suggest that enhancers and promoters need to be biochemically compatible (van Arensbergen et al. 2014; Zabidi et al. 2015; Zabidi and Stark 2016) and that different enhancers might use different cofactors and may respond differentially to cofactor inhibition or depletion.
Indeed, the inhibition or depletion of Brd4 in mammalian cells resulted, for example, in the selective down-regulation of certain genes, including Myc (Zuber et al. 2011), even though Brd4 appears to bind rather indiscriminately to most if not all active enhancers and promoters (Zhang et al. 2012; Kanno et al. 2014). Furthermore, in certain leukemic cells, some enhancers become activated upon Brd4 inhibition, indicating that Brd4-independent enhancers exist (Rathert et al. 2015). Likewise, inhibition of the fly ortholog of P300/CBP results in the deregulation of many genes that are both up-regulated and down-regulated (Boija et al. 2017). The inhibition of the cyclin-dependent kinases (CDKs) CDK7 (Chipumuro et al. 2014; Kwiatkowski et al. 2014; Ebmeier et al. 2017) and CDK8 (Lenstra et al. 2011; Kemmeren et al. 2014; Pelish et al. 2015; Jeronimo et al. 2016) also leads to differential gene expression effects, although this could result from differences at the transcription or post-transcriptional level; i.e., via differential RNA stabilities. In contrast, CDK9 seems to be required at virtually all promoters (Ni et al. 2008; Henriques et al. 2013; Jonkers et al. 2014; Gressel et al. 2017), presumably reflecting a universal requirement to release Pol II from a paused state after initiation into productive elongation (Adelman et al. 2005; Kwak et al. 2013; Henriques et al. 2018). Such promoter- and enhancer-specific requirements of different cofactors are consistent with the differential requirements for subunits of the large cofactor complex Mediator (Allen and Taatjes 2015), which has been assessed recently by rapid subunit depletion (Anandhakumar et al. 2016; Petrenko et al. 2017).
Our ability to rapidly inhibit or deplete cellular proteins (for review, see Housden et al. 2017) with technologies such as anchor away (Haruki et al. 2008) or degron-related methods (Dohmen et al. 1994; Nishimura et al. 2009) provides unprecedented opportunities to study the cofactor dependencies of gene transcription (Anandhakumar et al. 2016; Petrenko et al. 2017; Warfield et al. 2017; Winter et al. 2017; Xue et al. 2017). These developments promise exciting new insights into how enhancers activate transcription from target promoters and which cofactors, or the PTMs that they catalyze (see above), might be mechanistically involved in this process.
Enhancers regulate transcriptional bursting frequency
New insights into enhancer-mediated activation also come from measuring transcription initiation kinetics at promoters; i.e., the pattern of transcription initiation events (for review, see Lenstra et al. 2016). Transcription is not a constant process but occurs in waves with bursts of transcription initiation that are separated by inactive intervals (Golding et al. 2005; Chubb et al. 2006; Raj et al. 2006; Dar et al. 2012). This bursting phenomenon means that the overall transcriptional output can be regulated by modulating the frequency of transcription bursts or the burst size; i.e., the number of mRNAs made per burst.
Fluorescent in situ hybridization (FISH) and the use of viral MS2 and PP7 RNA structures recognized by fluorescently tagged MCP and PCP proteins allow the imaging of RNA with single-molecule resolution (Bertrand et al. 1998; Janicki et al. 2004; Larson et al. 2011a; for reviews, see Larson et al. 2011b; Larson 2011; Chen and Larson 2016). Single-molecule imaging of reporter genes in Drosophila embryos (Fukaya et al. 2016) or the β-globin and γ-globin genes in mouse and human cells (Bartman et al. 2016) showed that enhancers activate transcription predominantly via increasing burst frequency rather than burst size. Consistently, different enhancer strengths or the enhancer-blocking functions of insulators were reflected in the burst frequencies. At high transcription rates, individual bursts might merge (Fukaya et al. 2016), or a transition to a different mode might occur in which transcription is regulated only by increases of burst size (Skupsky et al. 2010; Dar et al. 2012).
Interestingly, burst size and the lengths of permissive and nonpermissive periods appear to be determined by the core promoter sequence. The presence of TATA-box motifs in promoters, particularly of inducible genes, is associated with large burst sizes (Basehoar et al. 2004; Hornung et al. 2012; Carey et al. 2013; Tantale et al. 2016), which are decreased upon mutating the TATA box (Hornung et al. 2012; Tantale et al. 2016). The dependency of burst size on the core promoter sequence and TATA boxes is reminiscent of “enhancer responsiveness”; i.e., the efficiency of core promoters to convert activating enhancer input into productive transcription events (Arnold et al. 2017). It is conceivable that highly enhancer-responsive core promoters show large burst sizes; i.e., produce a high number of mRNAs per burst. We are excited to see how new approaches such as single-molecule imaging shed new light on transcriptional regulation.
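The relationship between burst frequency, burst size, and overall output described in this section can be illustrated with a minimal simulation. The model is an assumption made for illustration only and is not taken from the cited studies: bursts are assumed to start at random with a constant rate (exponential waiting times), and the number of mRNAs per burst is assumed to follow a geometric distribution with the given mean. All function names and parameter values are hypothetical.

```python
import math
import random


def expected_output(burst_frequency, mean_burst_size, duration):
    """Expected number of mRNAs: bursts per unit time x mRNAs per burst x time."""
    return burst_frequency * mean_burst_size * duration


def simulate_total_mrna(burst_frequency, mean_burst_size, duration, seed=0):
    """Simulate bursts over `duration` and return (number of bursts, total mRNAs)."""
    rng = random.Random(seed)
    p = 1.0 / mean_burst_size  # success probability of the geometric distribution
    t = rng.expovariate(burst_frequency)  # time of the first burst
    bursts = 0
    total = 0
    while t < duration:
        if p >= 1.0:
            size = 1
        else:
            # Inverse-transform sampling of a geometric variable on 1, 2, 3, ...
            size = 1 + int(math.log(1.0 - rng.random()) / math.log(1.0 - p))
        bursts += 1
        total += size
        t += rng.expovariate(burst_frequency)  # waiting time until the next burst
    return bursts, total


# Hypothetical scenario: an enhancer doubles burst frequency at constant burst size.
duration = 10_000
_, weak = simulate_total_mrna(0.5, 10, duration, seed=1)
_, strong = simulate_total_mrna(1.0, 10, duration, seed=2)
print("weak enhancer:  ", weak, "expected", expected_output(0.5, 10, duration))
print("strong enhancer:", strong, "expected", expected_output(1.0, 10, duration))
```

Under these assumptions, doubling the burst frequency roughly doubles the output without changing the number of mRNAs per burst, which is the mode of regulation attributed to enhancers above; a change in the core promoter would instead be modeled by changing `mean_burst_size`.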
Discussion
In this review, we discussed recent progress in identifying and functionally characterizing enhancer elements in animal genomes, focusing on predictions via correlative features and functional assays that assess sufficiency or necessity. Using such assays, international consortia such as ENCODE and individual groups have compiled large compendia of enhancers and annotated genomic regions (e.g., Ernst and Kellis 2012; Hoffman et al. 2012), which allow the rough estimation of how many enhancers our genomes might contain. Work over the past years reported ∼10,000 enhancers for individual mammalian cells (e.g., ENCODE Project Consortium 2012; Muerdter et al. 2018) compared with ∼15,000 expressed genes (Ramsköld et al. 2009), between 200,000 and 300,000 across ∼20 mouse tissues (Shen et al. 2012; Yue et al. 2014), and ∼400,000 across a set of 127 cell lines (ENCODE Project Consortium 2012). For the much smaller Drosophila genome, estimates range from at least 50,000 to 100,000 enhancers (Kvon et al. 2014), which altogether suggests that the human genome might contain up to several million enhancers.
We anticipate that assays that assess the impact of endogenous enhancers on gene expression by genetic manipulations will be further improved by modifications and optimization of Cas9 (Kleinstiver et al. 2015, 2016; Chen et al. 2017) and improved rules of gRNA design (Fu et al. 2014; Sander and Joung 2014). By recruiting the transcription-activating or -repressing functions through extended gRNAs, such assays can be multiplexed, and different transcriptional regulators can be recruited simultaneously to different gene loci (Tak et al. 2017). Expanding CRISPR applications, Cas13a allows targeting of RNA molecules (Abudayyeh et al. 2017; Cox et al. 2017), potentially providing an alternative to RNase H or RNAi depletion, particularly for the study of ncRNAs in the nucleus.
In addition to these developments, over the past years, we have witnessed the characterization of the DNA-binding preferences for an increasingly complete set of TFs (e.g., Noyes et al. 2008; Badis et al. 2009; Jolma et al. 2013; Franco-Zorrilla et al. 2014; Weirauch et al. 2014; Hume et al. 2015; Narasimhan et al. 2015; Mathelier et al. 2016; Kulakovskiy et al. 2018; for reviews, see Deplancke et al. 2016; Inukai et al. 2017; Morgunova and Taipale 2017), their sensitivity to DNA methylation (Domcke et al. 2015; Yin et al. 2017), and how TF dimerization impacts DNA-binding affinities (Jolma et al. 2015; Isakova et al. 2016, 2017). This vocabulary and insights into the importance of binding site arrangement (Senger et al. 2004; Smith et al. 2013; Erceg et al. 2014; Farley et al. 2016; Fiore and Cohen 2016), the use of nonoptimal binding sites (Crocker et al. 2015; Farley et al. 2015), and how different types of TFs cooperate to control enhancer activity (Keung et al. 2014; Stampfel et al. 2015; for review, see Reiter et al. 2017) advance our understanding of how enhancer sequences encode and determine cell type-specific transcription.
Our increased understanding of TF–DNA interactions is complemented by the functional testing of enhancer candidates and mutant variants in ectopic and endogenous assays. Together with new technologies to rapidly deplete proteins and single-molecule live imaging of TFs (Chen et al. 2014) or nascent RNAs (Little et al. 2013; Levine et al. 2014), the upcoming years will provide not only novel insights into enhancer–promoter communication but also the means to assess whether, and how, the different correlative traits are involved in enhancer function. We look forward to seeing how these developments allow an increasingly complete understanding of enhancer sequences and function and how our genomes encode gene expression.