Literature DB >> 24015356

DNA methylation presents distinct binding sites for human transcription factors.

Shaohui Hu1, Jun Wan, Yijing Su, Qifeng Song, Yaxue Zeng, Ha Nam Nguyen, Jaehoon Shin, Eric Cox, Hee Sool Rho, Crystal Woodard, Shuli Xia, Shuang Liu, Huibin Lyu, Guo-Li Ming, Herschel Wade, Hongjun Song, Jiang Qian, Heng Zhu.   

Abstract

DNA methylation, especially CpG methylation at promoter regions, has been generally considered as a potent epigenetic modification that prohibits transcription factor (TF) recruitment, resulting in transcription suppression. Here, we used a protein microarray-based approach to systematically survey the entire human TF family and found numerous purified TFs with methylated CpG (mCpG)-dependent DNA-binding activities. Interestingly, some TFs exhibit specific binding activity to methylated and unmethylated DNA motifs of distinct sequences. To elucidate the underlying mechanism, we focused on Kruppel-like factor 4 (KLF4), and decoupled its mCpG- and CpG-binding activities via site-directed mutagenesis. Furthermore, KLF4 binds specific methylated or unmethylated motifs in human embryonic stem cells in vivo. Our study suggests that mCpG-dependent TF binding activity is a widespread phenomenon and provides a new framework to understand the role and mechanism of TFs in epigenetic regulation of gene transcription. DOI:http://dx.doi.org/10.7554/eLife.00726.001.

Entities:  

Keywords:  DNA methylation; Human; epigenetics; protein microarray; protein-DNA interactions; transcription factors; transcription regulation

Mesh:

Substances:

Year:  2013        PMID: 24015356      PMCID: PMC3762332          DOI: 10.7554/eLife.00726

Source DB:  PubMed          Journal:  Elife        ISSN: 2050-084X            Impact factor:   8.140


Introduction

DNA methylation is an ancient and major epigenetic modification that plays an important role in key biological processes, including genomic imprinting, X-chromosome inactivation, suppression of transposable elements, and carcinogenesis (Jaenisch and Bird, 2003; Egger et al., 2004; Robertson, 2005; Feinberg, 2007; Reik, 2007). In higher eukaryotes, methylation of CpG sites, especially at promoter regions, is generally considered as the hallmark of gene silencing (Baylin, 2005). The molecular consequence of CpG methylation is generally believed to disrupt TF–DNA interactions either directly (Nan et al., 1998), or indirectly by recruiting sequence-independent methylated DNA-binding proteins that occupy the methylated promoters and compete for the TF binding sites (Boyes and Bird, 1991). So far, only MeCP2, MBD1, MBD2, and a few zinc finger proteins have been identified as bona fide methylated DNA-binding proteins (Lewis et al., 1992; Meehan et al., 1989; Filion et al., 2006; Bartke et al., 2010; Bartels et al., 2011; Quenneville et al., 2011; Spruijt et al., 2013). It is unclear whether the methylated DNA binding activity is widespread among different TF subfamilies. Furthermore, the transcriptional regulatory activity of these methylation-dependent TF–DNA interactions has not been explored. Finally, the structural basis of these methylation-dependent TF–DNA interactions remains elusive. Here we employed a protein microarray-based approach to characterize the entire human TF repertoire for their direct binding capacity to a large set of methylated DNA motifs. In this work, we simultaneously tested the methylated DNA-binding activity for the majority of human TFs. Since we examined each methylated DNA motif individually on the protein microarray, we were able to determine the specific sequences surrounding the methylated CpG that is recognized by TFs. Furthermore, we demonstrated that the methylated- and unmethylated-DNA binding activities of KLF4 could be decoupled by protein mutagenesis.

Results

We selected a total of 154 DNA motifs (Supplementary file 1A) with the following characteristics: (1) predicted to be potential TF-binding sites in promoter regions of the human genome (Xie et al., 2005; Yu et al., 2006; Elemento et al., 2007); (2) representative of a subset of the 460 DNA motifs that have been examined for protein-binding activity in our previous study (Hu et al., 2009; Xie et al., 2010); and (3) carrying at least one CpG site. Because our goal was to identify DNA methylation-dependent binding activity, we developed a competition assay on a protein microarray to identify TFs that prefer DNA motifs carrying mCpGs (Figure 1A). Each synthesized double-stranded DNA motif was end-labeled with Cy5 (Hu et al., 2009) and converted to the double-stranded mCpG form using a bacterial CpG DNA methylase SssI. Each methylated motif was incubated with the protein microarray in the presence of its unlabeled (cold), unmethylated counterpart in 10-fold excess. The competition assay was first tested on a pilot array, comprised of MBD1, MBD2, MeCP2, and several negative control proteins. As illustrated in Figure 1B, strong binding signals were observed in a protein concentration-dependent manner, while control proteins, such as GST and BSA, did not show any detectable signals, confirming the reliability of the competition assay.
Figure 1.

Protein microarray-based approach identified mCpG-dependent DNA-binding activity among human TFs and cofactors.

(A) A competition assay was used to identify proteins that preferentially bind to methylated DNA motifs. SCAPER (S-phase cyclin A-associated protein in the ER) and E2F3 (E2F transcription factor 3) were shown here as two examples of methylated DNA-binding proteins. (B) A proof-of-principle assay was conducted using known methylated DNA-binding proteins on a pilot protein microarray. (C) Binding profiles of 41 TFs and 6 co-factors against 150 of the 154 tested methylated DNA motifs are summarized in the interaction map. TFs are color-coded based on the subfamilies. (D) EMSA assays validated DNA-binding activity for four selected TF candidates. Representative images from three independent experiments with similar results are shown. (E) Competition EMSA assays confirmed mCpG-dependent DNA-binding activities. As expected, 10-fold unlabeled, methylated DNA motif readily abolished the protein–DNA complex formation of the tested TFs with the biotinylated and methylated DNA motifs (Lane 1 in each image). However, 10-fold cold unmethylated DNA counterparts could not compete off methylated DNA binding, consistent with the protein microarray results. (F) HOXA5 and DIDO1 showed mCpG-dependent activation of luciferase activity in GT1-7 cells. Values represent mean ± SD (n = 3; **: p<0.01; t-test).

DOI: http://dx.doi.org/10.7554/eLife.00726.003

(A) Workflow of data normalization. (B) Local normalization (window size 9 × 9). (C) Extrapolation of background noise distribution. Noise distribution of N is mirrored from distribution of N. Standard deviation (SD) was calculated based on distribution N (Lower panel). (D) Distribution of Z scores of all proteins on a microarray. Z = 3 was selected as the cutoff in our study to identify the positives. The Z scores of some methylated DNA-binding proteins identified previously (Spruijt et al., 2013) are below our cutoff, while KLF4's Z score for one DNA motif is >6.

DOI: http://dx.doi.org/10.7554/eLife.00726.004

Left panel: signal comparison between a duplicated binding-assay with motif M303 shows a high correlation, confirming the reproducibility of the assay. Right panel: comparison between two random binding assays exhibited non-correlation between motifs M303 and M259.

DOI: http://dx.doi.org/10.7554/eLife.00726.005

The median value of TFs/cofactors binding to one methylated CpG-containing motif is 8.

DOI: http://dx.doi.org/10.7554/eLife.00726.006

Most TFs/cofactors bind to very few methylated DNA motif(s); whereas 7 TFs bind to more than 77 of the 154 motifs tested in this study.

DOI: http://dx.doi.org/10.7554/eLife.00726.007

(A) Distribution of TF subfamily members that showed mCpG-binding activity. (B) Distribution of all annotated TF subfamily members presented on the TF protein microarrays. Statistic analysis showed that none of the TF subfamilies was significantly enriched (p<0.01) for methylated motif-binding activities.

DOI: http://dx.doi.org/10.7554/eLife.00726.008

The results confirmed specificity of mCpG-dependent DNA-binding activities.

DOI: http://dx.doi.org/10.7554/eLife.00726.009

Eight units of KLF4 (TCCCGCCCA) and HOXA5 (AAACGCTGCC) binding motifs were separately cloned into the promoter region of a CpG-free luciferase reporter vector, and methylated with SssI before transfected into GT1-7 cells. Bisulfite sequencing confirmed that the CpG methylation levels of both motifs reached ∼100% after SssI treatment.

DOI: http://dx.doi.org/10.7554/eLife.00726.010

The curve is far from saturation, suggesting that more such TFs/co-factors remain to be discovered.

DOI: http://dx.doi.org/10.7554/eLife.00726.011

Protein microarray-based approach identified mCpG-dependent DNA-binding activity among human TFs and cofactors.

(A) A competition assay was used to identify proteins that preferentially bind to methylated DNA motifs. SCAPER (S-phase cyclin A-associated protein in the ER) and E2F3 (E2F transcription factor 3) were shown here as two examples of methylated DNA-binding proteins. (B) A proof-of-principle assay was conducted using known methylated DNA-binding proteins on a pilot protein microarray. (C) Binding profiles of 41 TFs and 6 co-factors against 150 of the 154 tested methylated DNA motifs are summarized in the interaction map. TFs are color-coded based on the subfamilies. (D) EMSA assays validated DNA-binding activity for four selected TF candidates. Representative images from three independent experiments with similar results are shown. (E) Competition EMSA assays confirmed mCpG-dependent DNA-binding activities. As expected, 10-fold unlabeled, methylated DNA motif readily abolished the protein–DNA complex formation of the tested TFs with the biotinylated and methylated DNA motifs (Lane 1 in each image). However, 10-fold cold unmethylated DNA counterparts could not compete off methylated DNA binding, consistent with the protein microarray results. (F) HOXA5 and DIDO1 showed mCpG-dependent activation of luciferase activity in GT1-7 cells. Values represent mean ± SD (n = 3; **: p<0.01; t-test). DOI: http://dx.doi.org/10.7554/eLife.00726.003

Data analysis of the protein microarray assays.

(A) Workflow of data normalization. (B) Local normalization (window size 9 × 9). (C) Extrapolation of background noise distribution. Noise distribution of N is mirrored from distribution of N. Standard deviation (SD) was calculated based on distribution N (Lower panel). (D) Distribution of Z scores of all proteins on a microarray. Z = 3 was selected as the cutoff in our study to identify the positives. The Z scores of some methylated DNA-binding proteins identified previously (Spruijt et al., 2013) are below our cutoff, while KLF4's Z score for one DNA motif is >6. DOI: http://dx.doi.org/10.7554/eLife.00726.004

Reproducibility of protein microarray data.

Left panel: signal comparison between a duplicated binding-assay with motif M303 shows a high correlation, confirming the reproducibility of the assay. Right panel: comparison between two random binding assays exhibited non-correlation between motifs M303 and M259. DOI: http://dx.doi.org/10.7554/eLife.00726.005

Distribution of number of mCpG-binding TFs/co-factors in a given motif-bind assay.

The median value of TFs/cofactors binding to one methylated CpG-containing motif is 8. DOI: http://dx.doi.org/10.7554/eLife.00726.006

Distribution of number of methylated motifs recognized by a given TF/co-factor.

Most TFs/cofactors bind to very few methylated DNA motif(s); whereas 7 TFs bind to more than 77 of the 154 motifs tested in this study. DOI: http://dx.doi.org/10.7554/eLife.00726.007

Distribution of TF subfamily members.

(A) Distribution of TF subfamily members that showed mCpG-binding activity. (B) Distribution of all annotated TF subfamily members presented on the TF protein microarrays. Statistic analysis showed that none of the TF subfamilies was significantly enriched (p<0.01) for methylated motif-binding activities. DOI: http://dx.doi.org/10.7554/eLife.00726.008

Four additional EMSA assays (A) and competition EMSA assays (B).

The results confirmed specificity of mCpG-dependent DNA-binding activities. DOI: http://dx.doi.org/10.7554/eLife.00726.009

Methylation level of the KLF4 and HOXA5 luciferase reporter constructs.

Eight units of KLF4 (TCCCGCCCA) and HOXA5 (AAACGCTGCC) binding motifs were separately cloned into the promoter region of a CpG-free luciferase reporter vector, and methylated with SssI before transfected into GT1-7 cells. Bisulfite sequencing confirmed that the CpG methylation levels of both motifs reached ∼100% after SssI treatment. DOI: http://dx.doi.org/10.7554/eLife.00726.010

Number of unique mCpG-binding TFs/co-factors in function of number of tested methylated DNA motifs.

The curve is far from saturation, suggesting that more such TFs/co-factors remain to be discovered. DOI: http://dx.doi.org/10.7554/eLife.00726.011 We then probed for direct interactions between 154 methylated motifs and the human TF repertoire using the TF protein microarray, which includes 1321 TF and 210 co-factor proteins (Supplementary file 1B). These full-length human proteins were overexpressed and individually purified in yeast as N-terminal GST fusion proteins as previously reported (Hu et al., 2009). The quantity and purity of the purified proteins were examined with Coomassie and silver stains against a random set of purified proteins (data not shown). Each of the labeled 154 methylated motifs was separately incubated on the human TF protein microarray in the presence of the unlabeled counterparts in 10-fold excess. Each reaction was performed in duplicate to ensure reproducibility. After a washing step, binding signals were acquired and processed using the GenePix software (Figure 1—figure supplement 1). After the binding signals were further normalized by correcting local background signals with a 9 × 9 sub-grid, Z-scores of each protein on the array were determined as previously described (Hu et al., 2009). A Z-score of 3 was used as the cutoff to identify the positives. Correlation analysis of the binding signals showed high reproducibility between each duplicated assay (Figure 1—figure supplement 2). Of the 154 methylated motifs examined, 150 (97%) showed significant binding signals to at least one protein on the microarrays (median = 8), suggesting binding to methylated DNA is prevalent, at least in this in vitro competition assay (see ‘Materials and methods’, Figure 1—figure supplement 3). In addition, 41 TFs and 6 TF co-factors (3% of all factors examined) showed mCpG-dependent binding activity (Figure 1C). Among these, 15% showed a broad binding activity to over 50% of all methylated motifs tested, suggesting that these proteins are not sensitive to the sequence context surrounding the mCpG site. For example, two zinc finger TFs, ZNF114 and ZNF416, could recognize almost all of the motifs tested in the competition assay. However, the majority of the 47 mCpG-binding proteins showed mCpG- and sequence-dependent binding activity, with 22 TFs exhibiting binding signaling to fewer than three motifs (Figure 1—figure supplement 4). Notably, the mCpG binding activity is widespread among various TF subfamilies, such as the zf-C2H2, Homeobox, bHLH, Forkhead, bZIP, and HMG box subfamilies (Figure 1C); no TF subfamily was found significantly enriched in the hit list (Figure 1—figure supplement 5). On the other hand, consistent with the important role of DNA methylation in the development of cancer (Egger et al., 2004; Reik, 2007), there was a significant enrichment for known oncogenes and tumor suppressors (10/47; p<0.01; hypergeometric model) and factors involved in tissue development (25/47; p<0.015; hypergeometric model; Figure 1C). Together, these results suggest that methylation-dependent direct TF–DNA interaction is a widespread phenomenon among various TF subfamilies in humans.
Figure 1—figure supplement 1.

Data analysis of the protein microarray assays.

(A) Workflow of data normalization. (B) Local normalization (window size 9 × 9). (C) Extrapolation of background noise distribution. Noise distribution of N is mirrored from distribution of N. Standard deviation (SD) was calculated based on distribution N (Lower panel). (D) Distribution of Z scores of all proteins on a microarray. Z = 3 was selected as the cutoff in our study to identify the positives. The Z scores of some methylated DNA-binding proteins identified previously (Spruijt et al., 2013) are below our cutoff, while KLF4's Z score for one DNA motif is >6.

DOI: http://dx.doi.org/10.7554/eLife.00726.004

Figure 1—figure supplement 2.

Reproducibility of protein microarray data.

Left panel: signal comparison between a duplicated binding-assay with motif M303 shows a high correlation, confirming the reproducibility of the assay. Right panel: comparison between two random binding assays exhibited non-correlation between motifs M303 and M259.

DOI: http://dx.doi.org/10.7554/eLife.00726.005

Figure 1—figure supplement 3.

Distribution of number of mCpG-binding TFs/co-factors in a given motif-bind assay.

The median value of TFs/cofactors binding to one methylated CpG-containing motif is 8.

DOI: http://dx.doi.org/10.7554/eLife.00726.006

Figure 1—figure supplement 4.

Distribution of number of methylated motifs recognized by a given TF/co-factor.

Most TFs/cofactors bind to very few methylated DNA motif(s); whereas 7 TFs bind to more than 77 of the 154 motifs tested in this study.

DOI: http://dx.doi.org/10.7554/eLife.00726.007

Figure 1—figure supplement 5.

Distribution of TF subfamily members.

(A) Distribution of TF subfamily members that showed mCpG-binding activity. (B) Distribution of all annotated TF subfamily members presented on the TF protein microarrays. Statistic analysis showed that none of the TF subfamilies was significantly enriched (p<0.01) for methylated motif-binding activities.

DOI: http://dx.doi.org/10.7554/eLife.00726.008

To validate the protein microarray results, we selected 11 TFs with different mCpG-dependent DNA-binding behaviors (i.e., sequence-dependent or -independent). Among them, six (e.g., ARNT2) bound to fewer than three methylated motifs, while the others (e.g., HOXA5) bound to at least 66 motifs (Supplementary file 1C). Using an electrophoretic mobility shift assay (EMSA), we confirmed that 8 of the 11 tested TFs could readily form a protein–DNA complex with their corresponding methylated motifs, but not with the unmethylated form (Figure 1D; Figure 1—figure supplement 6). This result indicates either a 27% of false positive rate of the protein microarray assay or a higher sensitivity of protein microarray than EMSA. While the binding activities to the mCpG-carrying motifs could be readily competed off with the unlabeled mCpG-containing analogs in 10-fold excess, the unmethylated counterparts showed no obvious impact, indicating that the complex formation between these TFs and specific DNA motifs is not non-specific but requires CpG methylation (Figure 1E).
Figure 1—figure supplement 6.

Four additional EMSA assays (A) and competition EMSA assays (B).

The results confirmed specificity of mCpG-dependent DNA-binding activities.

DOI: http://dx.doi.org/10.7554/eLife.00726.009

To determine whether mCpG-dependent TF–DNA interactions could convey transcriptional activity in vivo, we selected HOXA5 and DIDO1 to perform cell-based luciferase assays. Motifs M305 and M24 recognized by HOXA5 and DIDO1, respectively, were separately cloned into the promoter region of a CpG-free luciferase construct (pCpGL) (Klug and Rehli, 2006), which was then methylated with SssI and transfected with the HOXA5 and DIDO1 expression constructs at different concentrations into a mammalian cell line. The complete methylation of the motifs before transfection was confirmed using Sanger-bisulfite sequencing (Figure 1—figure supplement 7). Unmethylated luciferase construct was used as a negative control. Comparing with the luciferase activity of the negative controls, we found that both HOXA5 and DIDO1 demonstrated dose- and methylation-dependent enhancement of the luciferase activity (Figure 1F), indicating that the two TFs could specifically bind to the methylated promoters and enhance the downstream gene transcription.
Figure 1—figure supplement 7.

Methylation level of the KLF4 and HOXA5 luciferase reporter constructs.

Eight units of KLF4 (TCCCGCCCA) and HOXA5 (AAACGCTGCC) binding motifs were separately cloned into the promoter region of a CpG-free luciferase reporter vector, and methylated with SssI before transfected into GT1-7 cells. Bisulfite sequencing confirmed that the CpG methylation levels of both motifs reached ∼100% after SssI treatment.

DOI: http://dx.doi.org/10.7554/eLife.00726.010

We then asked whether these mCpG-binding TFs also bound to the same consensus motifs in an unmethylated form. Since our previous genome-wide analysis of human TF–DNA interactions included these 150 unmodified CpG-containing motifs (Hu et al., 2009), we integrated both protein microarray datasets. We discovered that 17 TFs recognize both methylated and unmethylated motifs (Figure 2A). Of the 435 protein–DNA interactions (PDIs) examined, only a small fraction (4%) exhibited motif-binding activity irrespective of the CpG methylation status (Figure 2A; yellow bars). Even in these cases, TF binding to methylated forms is much stronger than to the unmethylated forms based on the inability of non-methylated analogs to compete effectively even when present in 10-fold excess. Interestingly, 321 (74%) and 97 (22%) PDIs exhibit specificity to methylated (Figure 2A; red bars) and unmethylated motifs (Figure 2A; blue bars), respectively. To evaluate sequence specificity for the newly discovered methylation-dependent PDIs, we determined the consensus methylated motif sequences for the 17 TFs. The comparison with their known consensus sequences indicated that, in most cases, the sequences of methylated and unmethylated motifs recognized by the same TFs are distinct (Figure 2A; right panel; ‘Materials and methods’). For example, nuclear factor I/C (NFIC) binds to a methylated motif M209, GTmCpGCC, while its known consensus sequence is TTGGC, suggesting that many TFs are of dual-specificity, recognizing methylated and unmethylated DNA motifs of distinct sequences.
Figure 2.

A group of 17 TFs can bind to both methylated and unmethylated motifs of distinct sequences.

(A) Our previous PDI dataset was compiled with the dataset in this study to generate binding preference of the 17 TFs. Methylated consensus motifs of the 17 TFs identified based on the protein microarray results are compared with their known consensus motifs. (B) EMSA assays confirmed that four TFs could specifically interact with both methylated and unmethylated motifs of distinct sequences. Representative images from three independent experiments with similar results are shown. (C) and (D) Two possible scenarios are proposed to distinguish the mode of interactions between these TFs and their corresponding motifs. (E) and (F) Competition EMSA assays showed that both scenarios are possible. Representative images from two independent experiments with similar results are shown in each panel. (G) OIRD sensorgrams for ZMYM3 and KLF4 binding to methylated motifs M203 and M197, and their unmethylated counterparts, respectively. The OIRD measurements were performed at two concentrations of each protein. Solid lines represent the OIRD signals. Dashed lines are fitted On- and Off-curves. Red arrows indicate the starting point when a TF protein was introduced to the OIRD reaction chamber. Blue arrows indicate the time points when wash buffer was added. (H) Summary of average KD values measured at two concentrations of each protein. ‘NB’ indicates no observed binding signals.

DOI: http://dx.doi.org/10.7554/eLife.00726.012

As expected, unlabeled and methylated motif M319 showed dose-dependent competition against the labeled, methylated motif M319; whereas unlabeled and unmethylated motif M47 could readily compete off the binding signals. Same results were observed for ZMYM3.

DOI: http://dx.doi.org/10.7554/eLife.00726.013

Complex formation between KLF4 and methylated mM197 and between KLF4 and unmethylated umM412 is not affected by either umM412 or mM917, respectively. However, when both methylated and non-methylated competitor DNA was added, the complex formation was abolished.

DOI: http://dx.doi.org/10.7554/eLife.00726.014

Competition EMSA assays confirm KLF4's binding specificity to methylated motif M197 (mM197) and unmethylated motif M412 (umM412).

DOI: http://dx.doi.org/10.7554/eLife.00726.015

(A) MBD2b with a reported KD value of 330 nM was used as a benchmark in the OIRD system, showing the sensorgrams of MBD2b binding to methylated M203, M213 and M197. (B)–(D) OIRD sensorgrams for ZMYM3, TFAP2A and KLF4 binding to methylated motifs M203, M213 and M197, and their unmethylated counterparts, respectively. The OIRD measurements were performed at two concentrations of each protein. Solid lines represent OIRD signals. Dashed lines are fitted On- and Off-curves. Red arrows indicate the starting point when a TF protein was introduced to the OIRD reaction chamber. Blue arrows indicate the time points when wash buffer was added.

DOI: http://dx.doi.org/10.7554/eLife.00726.016

A group of 17 TFs can bind to both methylated and unmethylated motifs of distinct sequences.

(A) Our previous PDI dataset was compiled with the dataset in this study to generate binding preference of the 17 TFs. Methylated consensus motifs of the 17 TFs identified based on the protein microarray results are compared with their known consensus motifs. (B) EMSA assays confirmed that four TFs could specifically interact with both methylated and unmethylated motifs of distinct sequences. Representative images from three independent experiments with similar results are shown. (C) and (D) Two possible scenarios are proposed to distinguish the mode of interactions between these TFs and their corresponding motifs. (E) and (F) Competition EMSA assays showed that both scenarios are possible. Representative images from two independent experiments with similar results are shown in each panel. (G) OIRD sensorgrams for ZMYM3 and KLF4 binding to methylated motifs M203 and M197, and their unmethylated counterparts, respectively. The OIRD measurements were performed at two concentrations of each protein. Solid lines represent the OIRD signals. Dashed lines are fitted On- and Off-curves. Red arrows indicate the starting point when a TF protein was introduced to the OIRD reaction chamber. Blue arrows indicate the time points when wash buffer was added. (H) Summary of average KD values measured at two concentrations of each protein. ‘NB’ indicates no observed binding signals. DOI: http://dx.doi.org/10.7554/eLife.00726.012

Competition EMSA assays for ARID3B and ZMYM3.

As expected, unlabeled and methylated motif M319 showed dose-dependent competition against the labeled, methylated motif M319; whereas unlabeled and unmethylated motif M47 could readily compete off the binding signals. Same results were observed for ZMYM3. DOI: http://dx.doi.org/10.7554/eLife.00726.013

Competition EMSA assays for KLF4 and TFAP2A.

Complex formation between KLF4 and methylated mM197 and between KLF4 and unmethylated umM412 is not affected by either umM412 or mM917, respectively. However, when both methylated and non-methylated competitor DNA was added, the complex formation was abolished. DOI: http://dx.doi.org/10.7554/eLife.00726.014

Summary of KLF4’s dual-specificity.

Competition EMSA assays confirm KLF4's binding specificity to methylated motif M197 (mM197) and unmethylated motif M412 (umM412). DOI: http://dx.doi.org/10.7554/eLife.00726.015

OIRD sensorgrams for three TFs and MBD2b binding to three methylated DNA motifs.

(A) MBD2b with a reported KD value of 330 nM was used as a benchmark in the OIRD system, showing the sensorgrams of MBD2b binding to methylated M203, M213 and M197. (B)–(D) OIRD sensorgrams for ZMYM3, TFAP2A and KLF4 binding to methylated motifs M203, M213 and M197, and their unmethylated counterparts, respectively. The OIRD measurements were performed at two concentrations of each protein. Solid lines represent OIRD signals. Dashed lines are fitted On- and Off-curves. Red arrows indicate the starting point when a TF protein was introduced to the OIRD reaction chamber. Blue arrows indicate the time points when wash buffer was added. DOI: http://dx.doi.org/10.7554/eLife.00726.016 To experimentally confirm this apparent dual-specificity, we performed EMSA assays with two motifs, recognized by the same TF, in both methylated and unmethylated forms (Figure 2B). For instance, KLF4 could form a protein–DNA complex with the methylated motif M197 (TCCmCpGCCC), but not with its unmethylated form (Figure 2B). In contrast, methylation on motif M412 (GCTTTTACG) disrupted its interaction with KLF4 (Figure 2B). The same phenomenon was confirmed for three additional TFs, namely TFAP2A, ARID3B, and ZMYM3 (Figure 2B). These results raised an interesting question: does binding by a dual-specificity TF to one motif affect binding to the other of different methylation status? By performing a competition EMSA assay we observed two scenarios for different TFs (Figure 2C–F). In the first scenario, binding to one motif interferes the binding activity to the other motif. For example, the formation of protein–DNA complex between ARID3B and a methylated motif M319 (AAAmCpGCTTCC) could be readily competed off with the unmethylated motif M47 (GTGGGCGAAA) in 10-fold excess, and vice versa (Figure 2E), suggesting that ARID3B either uses the same DNA-binding domain to interact with both motifs, or binding to one motif inhibits binding to the other, presumably via conformational changes or steric hindrance (Figure 2C). The specificity of these assays was further confirmed with a competition EMSA assay with ARID3B and ZMYM3 (Figure 2—figure supplement 1). In the second scenario, binding activities of a dual-specificity TF are independent of each other (Figure 2D). For instance, KLF4–DNA complex formation with the methylated motif M197 was not affected by adding unmethylated motif M412 in 10-fold excess in the competition assay, suggesting that it may use two different domains to distinguish methylated from unmethylated motifs (Figure 2F; left panel). As expected, addition of 10-fold excess of methylated M197 could readily compete off the complex formation of KLF4 with labeled M197 (Figure 2F; left panel). The same results were observed for TFAP2A (Figure 2F; right panel). The binding specificities of KLF4 and TFAP2A were further confirmed by adding both methylated and non-methylated competitor DNA, resulting in disruption of the complex formation (Figure 2—figure supplement 2). To facilitate direct comparison, we reproduced all combinations of the competition EMSA assays for KLF4 in parallel on the same gel. The results confirmed its dual-specificity in a non-competitive fashion (Figure 2—figure supplement 3). The fact that KLF4 could form protein–DNA complexes with both motifs equally well, regardless of the presence of the competitors, also suggested that it is highly unlikely that these two motif sequences could bind to the same domain at different affinities.
Figure 2—figure supplement 1.

Competition EMSA assays for ARID3B and ZMYM3.

As expected, unlabeled and methylated motif M319 showed dose-dependent competition against the labeled, methylated motif M319; whereas unlabeled and unmethylated motif M47 could readily compete off the binding signals. Same results were observed for ZMYM3.

DOI: http://dx.doi.org/10.7554/eLife.00726.013

Figure 2—figure supplement 2.

Competition EMSA assays for KLF4 and TFAP2A.

Complex formation between KLF4 and methylated mM197 and between KLF4 and unmethylated umM412 is not affected by either umM412 or mM917, respectively. However, when both methylated and non-methylated competitor DNA was added, the complex formation was abolished.

DOI: http://dx.doi.org/10.7554/eLife.00726.014

We next employed an oblique incidence reflectivity difference (OIRD) system (Landry et al., 2004; Zhu et al., 2007; Fei et al., 2011) to determine binding affinity (i.e., KD values) of ZMYM3, TFAP2A, and KLF4 with their corresponding methylated DNA motifs M203, 213, and M197, respectively. As a comparison, the well known methylated DNA-binding protein, MBD2b, was also included. Because the OIRD system can monitor binding events in a real-time, label-free fashion, we therefore obtained the Kon and Koff values, and determined the KD values of ZMYM3, TFAP2A, and KLF4 as 460 nM, 399 nM, and 479 nM, respectively (Figure 2G; Figure 2—figure supplement 4A–D). As expected, none of them showed any significant binding activity to the unmethylated DNA motifs in the OIRD measurements, confirming our previous observations. On the other hand, the KD values of MBD2b measured against the same motifs are in close range to the three TFs tested (Figure 2—figure supplement 4A), suggesting that these TFs could bind to methylated DNA motifs almost as well as MBD2b (Figure 2H).
Figure 2—figure supplement 4.

OIRD sensorgrams for three TFs and MBD2b binding to three methylated DNA motifs.

(A) MBD2b with a reported KD value of 330 nM was used as a benchmark in the OIRD system, showing the sensorgrams of MBD2b binding to methylated M203, M213 and M197. (B)–(D) OIRD sensorgrams for ZMYM3, TFAP2A and KLF4 binding to methylated motifs M203, M213 and M197, and their unmethylated counterparts, respectively. The OIRD measurements were performed at two concentrations of each protein. Solid lines represent OIRD signals. Dashed lines are fitted On- and Off-curves. Red arrows indicate the starting point when a TF protein was introduced to the OIRD reaction chamber. Blue arrows indicate the time points when wash buffer was added.

DOI: http://dx.doi.org/10.7554/eLife.00726.016

To further dissect the molecular mechanism of this phenomenon, we focused on KLF4 to identify the key residue(s) responsible for interacting with the methylated cytosine, in order to decouple its dual-specificity. Intriguingly, KLF4 encodes a classic zf-C2H2 domain, a truncated and a full zf-H2C2 domain at its very C-terminus (Figure 3A and Figure 3—figure supplement 1). Because the X-ray structure of KLF4 bound to methylated CpG sequences is not available, we used a modeling approach. These efforts were facilitated by crystal structures of other, unrelated 5mC-binding proteins (e.g., MeCP2 and ZFP57) (Ho et al., 2008; Liu et al., 2012), as well as the adduct composed of mouse KLF4 and unmethylated DNA, which was employed as a modeling template. Using the existing KLF4 structure (Schuetz et al., 2011), we identified Arg458 and Asp460 in the last H2C2 zinc finger as a candidate 5mCpG recognition motif. In the KLF4 model, DNA binding is stabilized by a 5mC-Arg-G via improved van der Waals interaction and an Arg458-Asp460 salt-bridge (Figure 3A). Binding is further stabilized by a CH•••O (H2O-5mC) H-bond contact of the methylated cytosine on the complementary strand (Figure 3B). Interestingly, the local atomic environment of the KLF4 model is rather similar to those of MeCP2 and ZFP57 in their mCpG-bound forms (Figure 3—figure supplement 2). Therefore, Arg458 and Asp460 were predicted to be crucial for KLF4 and mCpG-dependent interaction (Figure 3—figure supplement 1).
Figure 3.

KLF4’s mCpG-dependent binding activity is decoupled from its binding activity to unmethylated motifs.

(A) Simulation of KLF4–DNA interactions predicted that two residues, Arg458 and Asp460, are involved in the interactions with methylated cytosine. Double arrow indicates van der Waals interactions between Arg458 and methyl group on the cytosine in one strand (5mCA). Red balls represent water molecules. (B) Asp460 further stabilizes binding to 5 mC on the other strand (5mCB) via a CH•••O (H2O—5mC) H-bond contact. (C) EMSA assays using KLF4 mutated proteins demonstrated that both R458 and D460 are crucial for mCpG-dependent binding activity. Representative images from three independent experiments with similar results are shown. (D) In cell-based luciferase assays for M197, WT KLF4 showed mCpG-dependent activation of downstream gene expression (red bars in the upper panel), while both R458A and D460A mutations abolished this activity (red bars in the middle and lower panels). (E) In cell-based luciferase assays with M412 (blue bars), both WT and mutants can activate the expression of unmethylated M412 (blue bars), but have no effect on methylated M412 (red bars). In (D) and (E), values represent mean ± SD (n = 3; **: p<0.01; t-test)

DOI: http://dx.doi.org/10.7554/eLife.00726.017

KLF4 encodes two and half zinc finger DNA-binding domains at its C-terminus. Residues R458 and D460, which were predicted to interact with the 5-methyl group in the cytosine, are located in the zf-H2C2 domain. D432 indicates where a truncated KLF4 construct ends.

DOI: http://dx.doi.org/10.7554/eLife.00726.018

The pink and blue double arrows represent van der Waals force between the arginine and methyl groups. Red balls are water molecules.

DOI: http://dx.doi.org/10.7554/eLife.00726.019

These results clearly demonstrated that both the single- and double-mutations, as well as the truncation, abolished KLF4's ability to form a complex with methylated motif M197, while neither showed detectable impact on complex formation with unmethylated motif M412.

DOI: http://dx.doi.org/10.7554/eLife.00726.020

Using GAPDH as a control, these results demonstrated equal transfection efficiency of the constructs.

DOI: http://dx.doi.org/10.7554/eLife.00726.021

Figure 3—figure supplement 1.

Architecture of KLF4 DNA-binding domain.

KLF4 encodes two and half zinc finger DNA-binding domains at its C-terminus. Residues R458 and D460, which were predicted to interact with the 5-methyl group in the cytosine, are located in the zf-H2C2 domain. D432 indicates where a truncated KLF4 construct ends.

DOI: http://dx.doi.org/10.7554/eLife.00726.018

Figure 3—figure supplement 2.

Known crystal structures of MeCP2 and ZFP57 in complex with methylated DNA.

The pink and blue double arrows represent van der Waals force between the arginine and methyl groups. Red balls are water molecules.

DOI: http://dx.doi.org/10.7554/eLife.00726.019

KLF4’s mCpG-dependent binding activity is decoupled from its binding activity to unmethylated motifs.

(A) Simulation of KLF4–DNA interactions predicted that two residues, Arg458 and Asp460, are involved in the interactions with methylated cytosine. Double arrow indicates van der Waals interactions between Arg458 and methyl group on the cytosine in one strand (5mCA). Red balls represent water molecules. (B) Asp460 further stabilizes binding to 5 mC on the other strand (5mCB) via a CH•••O (H2O5mC) H-bond contact. (C) EMSA assays using KLF4 mutated proteins demonstrated that both R458 and D460 are crucial for mCpG-dependent binding activity. Representative images from three independent experiments with similar results are shown. (D) In cell-based luciferase assays for M197, WT KLF4 showed mCpG-dependent activation of downstream gene expression (red bars in the upper panel), while both R458A and D460A mutations abolished this activity (red bars in the middle and lower panels). (E) In cell-based luciferase assays with M412 (blue bars), both WT and mutants can activate the expression of unmethylated M412 (blue bars), but have no effect on methylated M412 (red bars). In (D) and (E), values represent mean ± SD (n = 3; **: p<0.01; t-test) DOI: http://dx.doi.org/10.7554/eLife.00726.017

Architecture of KLF4 DNA-binding domain.

KLF4 encodes two and half zinc finger DNA-binding domains at its C-terminus. Residues R458 and D460, which were predicted to interact with the 5-methyl group in the cytosine, are located in the zf-H2C2 domain. D432 indicates where a truncated KLF4 construct ends. DOI: http://dx.doi.org/10.7554/eLife.00726.018

Known crystal structures of MeCP2 and ZFP57 in complex with methylated DNA.

The pink and blue double arrows represent van der Waals force between the arginine and methyl groups. Red balls are water molecules. DOI: http://dx.doi.org/10.7554/eLife.00726.019

EMSA assays to evaluate impacts of KLF4 R458K, R458A::D460A mutations, and Δ432 truncation on its binding activity to motifs M412 and M197.

These results clearly demonstrated that both the single- and double-mutations, as well as the truncation, abolished KLF4's ability to form a complex with methylated motif M197, while neither showed detectable impact on complex formation with unmethylated motif M412. DOI: http://dx.doi.org/10.7554/eLife.00726.020

Western blot analysis of overexpression of KLF4WT, KLF4R458A and KLF4D460A proteins in GT1-7 cells.

Using GAPDH as a control, these results demonstrated equal transfection efficiency of the constructs. DOI: http://dx.doi.org/10.7554/eLife.00726.021 To test our prediction, we generated a series of KLF4 mutants, including three single (i.e., R458A, R458K, D460A), one double (R458A::D460A) mutants, and one truncation (Δ432). EMSA assay with purified KLF4 mutant proteins showed that all point mutations at either or both residues completely abolished KLF4’s binding activity to methylated motif M197 (CCmCpGCC), while none of them had any detectable impact on the binding activity to the unmethylated motif M412 (TACpGCC; Figure 3C and Figure 3—figure supplement 3). The R458A and R458K mutations showed similar impact on mCpG-dependent binding activity, suggesting that charge on the residue is not critical. Similarly, KLF4 truncation showed no binding activity to methylated motif M197 (CCmCpGCC), but still maintained its binding activity to the unmethylated motif M412 (TACpGCC; Figure 3—figure supplement 3), indicating that the C2H2 zinc finger in KLF4 mediates KLF4’s interaction with unmethylated motif M412.
Figure 3—figure supplement 3.

EMSA assays to evaluate impacts of KLF4 R458K, R458A::D460A mutations, and Δ432 truncation on its binding activity to motifs M412 and M197.

These results clearly demonstrated that both the single- and double-mutations, as well as the truncation, abolished KLF4's ability to form a complex with methylated motif M197, while neither showed detectable impact on complex formation with unmethylated motif M412.

DOI: http://dx.doi.org/10.7554/eLife.00726.020

To examine whether KLF4’s binding activity to M197 (CCmCpGCC) is important for transcription regulation, we employed cell-based luciferase assays using both wild-type (WT) and mutated KLF4 expression constructs. Indeed, when co-transfected with the pCpGL construct carrying methylated motif M197 in the promoter, WT KLF4 showed dose-dependent increase of the luciferase activity (Figure 3D). This enhancement of transcription is also methylation-dependent, because WT KLF4 could not increase luciferase activity when motif M197 was not methylated in the construct (Figure 3D). More importantly, both KLF4 and KLF4 constructs completely lost their ability to modulate transcription from the pCpGL construct carrying methylated motif M197; whereas they both showed dose-dependent enhancement of gene transcription in the luciferase assay, like WT KLF4, when co-transfected with the pCpGL construct carrying unmethylated motif M412 (Figure 3E). Consistent with the EMSA results, neither WT nor mutant KLF4 constructs showed any significant increase of transcription when motif M412 was methylated in the construct (Figure 3E). Western blot analysis demonstrated an equal transfection efficiency of WT KLF4, KLF4 and KLF4 constructs (Figure 3—figure supplement 4). Altogether, these results suggested that KLF4 could bind to methylated and unmethylated CpG sites in different sequence contexts, resulting in activation of downstream gene expression in heterologous cells in vivo, and that these dual specificities are independent and achieved via two functionally separable domains encoded by KLF4.
Figure 3—figure supplement 4.

Western blot analysis of overexpression of KLF4WT, KLF4R458A and KLF4D460A proteins in GT1-7 cells.

Using GAPDH as a control, these results demonstrated equal transfection efficiency of the constructs.

DOI: http://dx.doi.org/10.7554/eLife.00726.021

Finally, we asked whether KLF4 binding preferences to both methylated and unmethylated motifs that we identified in vitro reflect sequences preferred in vivo. Since KLF4 is well known to play an important role in embryonic stem cell (ESC) maintenance, we therefore examined whether mCpG-dependent KLF4–DNA interactions occur in human ESCs (H1). First, we performed bioinformatics analysis of in vivo KLF4-binding sites in H1 cells based on their methylation status (Figure 4—figure supplement 1). We superimposed the published in vivo KLF4 binding sites, as determined by the ChIP-seq approach (Lister et al., 2009), with the published methylome dataset of single-base resolution obtained with whole-genome bisulfite sequencing in H1 hESCs (Lister et al., 2009). We identified numerous cases in which KLF4 tends to bind to those genomic regions containing a methylated CCCpGCC sequence, consistent with our protein microarray results (Figure 4—figure supplement 1; lower panel). Among the KLF4-binding sites containing at least one CpG, we found that the majority of these sites could be categorized into two groups across the whole-genome: 48% have high methylation levels of over 80% and 38% have methylation levels lower than 20% (Figure 4A). We identified statistically significant consensus motifs in methylated group (‘Materials and methods’; Supplementary file 1D). The top consensus motif, CmCpGC, discovered in the high methylation group is embedded in motif M197 (CCmCpGCC) that was recognized by KLF4 in the protein microarray assays (Figure 4A). Taken together, these integration analyses demonstrate that KLF4’s preference to methylated consensus sequence identified in vivo is highly similar to that determined in vitro.
Figure 4—figure supplement 1.

Integration of KLF4 ChIP-seq and methylome data in H1 cell.

KLF4 ChIP-Seq and methylome data in H1 were compiled to assign the methylation levels in KLF4 ChIP'ed segments (upper panel). Lower panel was schematic plot for KLF4 binding summits. The pink ovals indicate KLF4 binding summits as determined in the KLF4 ChIP-seq experiments. The short vertical lines in red and blue indicate the CpG sites in the contexts of motif M197 and M421, respectively. Other CpG sites are annotated with gray lines. The thin green lines underneath represent the observed methylation level.

DOI: http://dx.doi.org/10.7554/eLife.00726.023

Figure 4.

Endogenous KLF4 binds to methylated loci in human embryonic stem cells (H1) in vivo.

(A) Bioinformatics analysis to derive methylated DNA motif logo binding to KLF4 by integrating of KLF4 ChIP-Seq and methylome data in H1 cells. Based on the distribution of methylation level at the KLF4 binding sites, a top methylated consensus motif boxed in red was discovered in the highly methylated sites. As a comparison, M197 sequence recognized by KLF4 in the protein microarray assays is shown below. (B) Experimental procedure of KLF4 ChIP-bisulfite sequencing to confirm that KLF4 preferentially interacts with hyper-methylated motifs in H1 cells. (C) The gel images of KLF4 ChIP’ed loci (L1: chr1: 559311-559516; L2: chr5: 44424678-44424792; L3: chr16: 4681299-4681481; L4: chr2: 132747088-132747377; L5: chr12: 81828301-81828506) demonstrate specific and direct binding of KLF4 to its target regions. Negative controls were performed in the absence of the anti-KLF4 monoclonal antibodies. (D) Analysis of KLF4-ChIP against the five loci using the quantitative real-time PCR (qPCR) method. Fold change at each locus was obtained by taking the ratio of KLF4-ChIP qPCR signals over the negative control signals. Statistics analysis was based on three technical replicates. (E) Sanger bisulfite sequencing reads of input and KLF4-ChIP’ed DNA. Filled and blank circles indicate methylated and unmethylated CpG sites, respectively. Blue and red arrows indicate CpGs in the context of motifs M412 and M917, respectively. (F) For relatively lower methylation input, KLF4 methylated binding sites tend to have a higher methylation level after KLF4 ChIP. The lower panel in (F) shows the methylation differences at each CpG site between the input and KLF4 ChIP’ed DNA. p values were determined by binominal probability density function.

DOI: http://dx.doi.org/10.7554/eLife.00726.022

KLF4 ChIP-Seq and methylome data in H1 were compiled to assign the methylation levels in KLF4 ChIP'ed segments (upper panel). Lower panel was schematic plot for KLF4 binding summits. The pink ovals indicate KLF4 binding summits as determined in the KLF4 ChIP-seq experiments. The short vertical lines in red and blue indicate the CpG sites in the contexts of motif M197 and M421, respectively. Other CpG sites are annotated with gray lines. The thin green lines underneath represent the observed methylation level.

DOI: http://dx.doi.org/10.7554/eLife.00726.023

The chromosome positions and KLF4 ChIP-seq peaks (GSM447584) are shown.

DOI: http://dx.doi.org/10.7554/eLife.00726.024

The sequencing results confirmed that KLF4 bound to hyper-methylated loci in the sequence context of CCmCGCC (arrows) in H1 cell. Upper and lower panels represent bisulfite sequencing results of the input and KLF4 ChIP'ed loci.

DOI: http://dx.doi.org/10.7554/eLife.00726.025

Endogenous KLF4 binds to methylated loci in human embryonic stem cells (H1) in vivo.

(A) Bioinformatics analysis to derive methylated DNA motif logo binding to KLF4 by integrating of KLF4 ChIP-Seq and methylome data in H1 cells. Based on the distribution of methylation level at the KLF4 binding sites, a top methylated consensus motif boxed in red was discovered in the highly methylated sites. As a comparison, M197 sequence recognized by KLF4 in the protein microarray assays is shown below. (B) Experimental procedure of KLF4 ChIP-bisulfite sequencing to confirm that KLF4 preferentially interacts with hyper-methylated motifs in H1 cells. (C) The gel images of KLF4 ChIP’ed loci (L1: chr1: 559311-559516; L2: chr5: 44424678-44424792; L3: chr16: 4681299-4681481; L4: chr2: 132747088-132747377; L5: chr12: 81828301-81828506) demonstrate specific and direct binding of KLF4 to its target regions. Negative controls were performed in the absence of the anti-KLF4 monoclonal antibodies. (D) Analysis of KLF4-ChIP against the five loci using the quantitative real-time PCR (qPCR) method. Fold change at each locus was obtained by taking the ratio of KLF4-ChIP qPCR signals over the negative control signals. Statistics analysis was based on three technical replicates. (E) Sanger bisulfite sequencing reads of input and KLF4-ChIP’ed DNA. Filled and blank circles indicate methylated and unmethylated CpG sites, respectively. Blue and red arrows indicate CpGs in the context of motifs M412 and M917, respectively. (F) For relatively lower methylation input, KLF4 methylated binding sites tend to have a higher methylation level after KLF4 ChIP. The lower panel in (F) shows the methylation differences at each CpG site between the input and KLF4 ChIP’ed DNA. p values were determined by binominal probability density function. DOI: http://dx.doi.org/10.7554/eLife.00726.022

Integration of KLF4 ChIP-seq and methylome data in H1 cell.

KLF4 ChIP-Seq and methylome data in H1 were compiled to assign the methylation levels in KLF4 ChIP'ed segments (upper panel). Lower panel was schematic plot for KLF4 binding summits. The pink ovals indicate KLF4 binding summits as determined in the KLF4 ChIP-seq experiments. The short vertical lines in red and blue indicate the CpG sites in the contexts of motif M197 and M421, respectively. Other CpG sites are annotated with gray lines. The thin green lines underneath represent the observed methylation level. DOI: http://dx.doi.org/10.7554/eLife.00726.023

Five selected KLF4-binding loci for further analyses.

The chromosome positions and KLF4 ChIP-seq peaks (GSM447584) are shown. DOI: http://dx.doi.org/10.7554/eLife.00726.024

An example of KLF4 ChIP-bisulfite sequencing assay.

The sequencing results confirmed that KLF4 bound to hyper-methylated loci in the sequence context of CCmCGCC (arrows) in H1 cell. Upper and lower panels represent bisulfite sequencing results of the input and KLF4 ChIP'ed loci. DOI: http://dx.doi.org/10.7554/eLife.00726.025 To provide direct experimental evidence that KLF4 binds to endogenous CCmCpGCC in H1 hESCs in vivo, we performed KLF4 ChIP coupled with bisulfite sequencing (Figure 4B), using primer pairs designed on the basis of the above bioinformatics analysis. First, chromatin IP (ChIP) demonstrates specific and direct binding of KLF4 to five selected regulatory loci based on KLF4 ChIP-seq signals (Figure 4C; Figure 4—figure supplement 2). Next, quantitative real-time PCR (q-PCR) also showed the high enrichment (≥9-fold) on the specific loci after KLF4 ChIP (Figure 4D). KLF4 ChIP-bisulfite sequencing analyses showed that KLF4 could readily ChIP those loci (i.e., L1 and L2 in Figure 4E) with TACpGCC at two completely unmethylated fragments, and with CCCpGCC at ∼100% methylation level in other two loci tested (L3 in Figure 4E, and Figure 4—figure supplement 3), indicating that KLF4 binds to both unmethylated and highly methylated CpGs in different sequence contexts in vivo. To rule out the possibility that other sites within the ChIP’ed fragments might be responsible for KLF4 binding, we searched the published dataset (Lister et al., 2009) and did not find any other KLF4 binding peaks within the surrounding 500 bp, supporting that KLF4 indeed binds to these unmethylated or methylated sites. We also examined KLF4 methylated target sites that exhibit medium methylation levels (e.g., ∼50%) based on the published methylome dataset. Importantly, KLF4 ChIP-bisulfite sequencing analyses showed that the methylation level of CCCpGCC consensus was significantly increased from 30% and 55% of the genomics inputs to 65% (p=0.001) and 73% (p=0.04) in the ChIP’ed samples for two examined regions, respectively (L4 and L5 in Figure 4F; upper panel). As predicted, those nearby CpG sites not embedded in the KLF4 consensus did not show any significant increase in methylation (Figure 4F; lower panel). Together, the above analyses provided direct evidence that KLF4 preferentially binds to CCmCpGCC consensus sequence in vivo.
Figure 4—figure supplement 2.

Five selected KLF4-binding loci for further analyses.

The chromosome positions and KLF4 ChIP-seq peaks (GSM447584) are shown.

DOI: http://dx.doi.org/10.7554/eLife.00726.024

Figure 4—figure supplement 3.

An example of KLF4 ChIP-bisulfite sequencing assay.

The sequencing results confirmed that KLF4 bound to hyper-methylated loci in the sequence context of CCmCGCC (arrows) in H1 cell. Upper and lower panels represent bisulfite sequencing results of the input and KLF4 ChIP'ed loci.

DOI: http://dx.doi.org/10.7554/eLife.00726.025

Discussion

In this study we have identified numerous human TFs across various subfamilies that showed mCpG- and sequence-dependent binding activity, a much more prevalent phenomenon than previously appreciated (Karlsson et al., 2008; Rishi et al., 2010; Quenneville et al., 2011; Liu et al., 2012). It is likely that more proteins with the similar activity are yet to be discovered (Figure 1—figure supplement 8), because (1) we only surveyed a tiny fraction of methylated CpG space; (2) we did not screen for TFs that can recognize methylated CHG or CHH sequences in this study, which are present in several cell types including ESCs, induced pluirpotent stem cells, germline and neural cells (Ramsahoye et al., 2000; Ziller et al., 2011); (3) we did not include in the screen any hemimethylated DNA motifs, which occurs in vivo on newly synthesized DNAs during replication; and (4) some of the unconventional DNA-binding proteins we discovered in an earlier study might also encode this property (Hu et al., 2009).
Figure 1—figure supplement 8.

Number of unique mCpG-binding TFs/co-factors in function of number of tested methylated DNA motifs.

The curve is far from saturation, suggesting that more such TFs/co-factors remain to be discovered.

DOI: http://dx.doi.org/10.7554/eLife.00726.011

We estimated false positive and false negative rate for our protein microarray assays. Since 8 of the 11 methylated DNA-binding TFs were confirmed by EMSAs, we estimated that this protein microarray strategy might produce ∼27% of false positives. Our approach also showed some false negatives because some known methylated DNA binding proteins were not identified in this study (Mann et al., 2013; Spruijt et al., 2013). There are several possible reasons to explain this discrepancy. First, since we employed a competition assay on the human TF protein microarrays, only those that preferentially bind to methylated DNA motifs will be identified. In other words, if a protein binds to a DNA sequence irrespective of its methylation status, it will not show strong signals in our screen. Second, the number of DNA sequences we tested is rather small. Based on our simulation analysis (Figure 1—figure supplement 8), we expect that there exist more such human TFs yet to be discovered with more DNA motifs tested. Third, because we do not have any TF heterodimers on our arrays, it may also generate false negatives. For example, a recent study (Spruijt et al., 2013) employed a generic methylated DNA probe to pull down potential methylated DNA-binding proteins in mouse cell lysates, resulted in the identification of 19 proteins. Among them, 11 have human orthologs and are presented on our human protein microarrays. Five of them, namely MeCP2, NRF1, MBD1/4, and KLF4, showed methylated-specific binding activities in our study. The other six proteins showed weak binding signals in our competition assay, presumably due to the reasons discussed above (Figure 1—figure supplement 1E). Regardless of all the above limitations, we still discovered many novel methylation-dependent DNA-binding activities. For example, Znf114 and Znf416 were identified as new generic methylated-DNA binding proteins, with binding activities to almost all methylated DNA motifs tested in this study. The reason that they were not discovered in previous studies might be due to the fact that most of the previous studies were not performed in a systematic way. In our systematic survey, we identified a total of 47 novel methylated DNA-binding proteins, which significantly expanded the methylation-dependent protein–DNA interaction landscape. Our findings provide a new framework to better understand mechanisms of DNA methylation-dependent regulation of gene expression, especially in cancer, stem cell and development biology. For example, recent DNA methylome studies in cancer have identified dynamic epigenetic changes, resulting in global reprogramming in gene expression (Jones and Baylin, 2007; Figueroa et al., 2010; Noushmehr et al., 2010). However, CpG methylation does not always correlate with transcription repression in cancers (Everhard et al., 2009). Our unbiased screen of the entire human TF family identified many sequence-specific mCpG ‘readers’, which can presumably interpret the DNA methylation changes and in turn, regulate gene expression in a dynamic environment. Therefore, by presenting distinct binding sites for TFs, methylated cytosines may serve as the fifth alphabet that changes the landscape of TF–DNA interactions and increases the complexity and diversity of gene regulation.

Materials and methods

Protein annotation

The protein annotation of transcription factors and cofactors was obtained from our previous study (Hu et al., 2009) with some minor manual corrections. The information about TF binding domains was obtained from the Pfam (Bateman et al., 2004). The oncogenes and tumor repressor genes were downloaded from UniProtKB (http://www.uniprot.org/). A gene was categorized as a development gene if it has a GO (gene ontology) function which includes a keyword of ‘development’. Supplementary file 1B lists all transcription factors and co-factors on our protein microarray.

CpG Methylation of DNA probes and luciferase reporter constructs

Double-stranded DNA probes were generated according to a protocol described previously (Hu et al., 2011). Methylated CpG probes or luciferase reporter constructs were prepared as previously described (Guo et al., 2011). Briefly, 1 μg of DNA were incubated with 1 μl of M.SssI CpG methyltransferase (4 U/μl, NEB, Ipswich, MA), 1 μl of S-adenosylmethionine (SAM, 32 mM) and 2 μl of 10×NEBuffer 2 in a 20 μl reaction volume at 37°C overnight, followed by incubation with freshly added 1 μl of M.SssI (4 U/μl), 1 μl of SAM (32 mM), 0.5 μl of 10xNEBuffer 2 and 2.5 μl of water at 37°C for 4 hr, then at 65°C for 20 min. The methylation status was confirmed by bisulfite sequencing (Figure 1—figure supplement 7).

Protein microarrays

Human proteins were purified from yeast as GST fusion and arrayed on FAST slides (Whatman, Maidstone, UK) in duplicate as described previously (Hu et al., 2009). The protein microarrays were probed with Cy5-labeled methylated DNA motifs in the presence of 10-fold unlabeled (cold), unmethylated competitors using a similar protocol described previously (Hu et al., 2009). We selected 154 DNA motifs (Supplementary file 1A) and performed tests in duplicate protein microarrays. The slides were then washed and scanned with a GenePix 4000B scanner (Molecular Devices, Sunnyvale, CA) and the binding signals were acquired using the GenePix 6.0 software.

Protein microarray analysis

We used GenePix 6.0 to align the spot-calling grid and record the foreground and background intensities for every protein spot. Figure 1—figure supplement 1 shows detailed workflow for protein microarray analysis. The raw binding intensity (R) of each probe was defined as F/B, where F and B are the median values of foreground and background signals of the probe at site (i,j) on the microarray, respectively. We first normalized the raw signal of each probe based on median value of raw signals of its neighboring probes determined by the window size (window size = 9 × 9 in our study, Figure 1—figure supplement 1B). Most probes showed no binding signal (R′ = 1) and had variations around R′ = 1 due to background noise distribution as seen in Figure 1—figure supplement 1C. In order to evaluate the background noise of each microarray, we selected the shadow part (N: R′, in Figure 1—figure supplement 1D) and artificially combined with its symmetrical part around 1 (N: = 2 − N) to obtain the standard deviation (SD) of the noise distribution (N = N + N, Figure 1—figure supplement 1D). Then the Z-score of each probe was calculated bywhere R′ is the locally normalized intensity of probe (i, j) on the microarray, and std(N) are mean value and standard deviation, respectively, of noise distribution on the microarray. Since each protein is printed in duplicate on a microarray and each motif binding assay was performed in duplicate, a protein was identified as a positive hit only when all of its four spots produced a Z-score ≥ 3. Supplementary file 1C lists all transcription factor and cofactor hits showing binding to at least one methylated DNA motif.

EMSA

Each binding reaction was carried out with 100 fmol of biotinylated dsDNA probe and 1 pmol of purified protein in 20 µl of binding buffer as described previously (Hu et al., 2009). 10-fold (1 pmol) of unlabeled (cold) DNA motifs were added in the competition assays. All the expression clones for proteins used in EMSA were verified by DNA sequencing.

KD measurement of transcription factors binding 5mCpG motifs

Binding affinity between a transcriptional factor and a methylated DNA motif was determined by the oblique incidence reflection difference (OIRD) method (Landry et al., 2004; Zhu et al., 2007; Fei et al., 2011). Synthesized DNA motifs were methylated using SssI enzyme as described above (‘Materials and methods’), then printed together with their unmethylated counterparts onto Superamine 2 slides (ArrayIt, Sunnyvale, CA) at a concentration of 2.5 μM in 50% DMSO, exposed to 600 μJ UV for 3 min, and dried in a 37°C incubator for 1 hr. We selected ZMYM3, TFAP2A and KLF4 to measure their affinity. We also used MBD2b (Millipore, Billerica, MA) as a bench marker. For each OIRD measurement, a DNA motif slide was first washed with EMSA binding buffer (Hu et al., 2009) at 3 ml/min for 5 min, and then flooded with the corresponding TF protein solution at various concentrations as indicated in Figure 2—figure supplement 3. The observed on curve was then measured in real time until the OIRD signals reached saturation. To determine the off curve, the EMSA buffer was then pumped through the reaction chamber at a speed of 200 μl/min until the OIRD signals stabilized. Binding and dissociation signals were recorded every 10 s. Data analysis was carried out using Origin 9.0 following Majka J and Speck C’s method (Majka and Speck, 2007).
Figure 2—figure supplement 3.

Summary of KLF4’s dual-specificity.

Competition EMSA assays confirm KLF4's binding specificity to methylated motif M197 (mM197) and unmethylated motif M412 (umM412).

DOI: http://dx.doi.org/10.7554/eLife.00726.015

Luciferase assay

The CpG-free luciferase reporter vector used in our study, pCpGL, was a gift from Dr Michael Rehli (Klug and Rehli, 2006). Eight units of a DNA motif were subcloned into pCpGL promoter region, and the resulting plasmids (pCpGL-8X-motif) were grown in Escherichia coli PIR 1 strain using Zeocin as a selection marker. The CpG sites in the motifs were methylated using M.SssI as described previously. GT1-7 cells were co-transfected with three constructs: pCpGL-8X-motif, pCAGIG expressing the corresponding TF constructs at various concentrations, and pTK-RL (Promega, Madison, WI) using the FugeneHD reagent (Roche, Basel, Switzerland). Cells were harvested 48 hr post-transfection for luciferase reporter assay using the Dual-Luciferase reporter assay system (Promega). All assays were performed in triplicate.

Methylated DNA binding consensus

We reason that the methylated CpG is the key and conserved position for binding because the unmethylated counterpart shows no binding. Therefore, we first aligned the motif sequences around methylated CpG. For each TF binding to multiple methylated motifs, we then hierarchically clustered all 6-mer binding sequences centered on mCpG. With a selected cutoff for sequence similarity, all the binding motifs were separated into several groups. Each group of sequences was combined to derive the consensus sequences. The consensus sequence for the largest sequence group is shown in Figure 2A. If a TF binds to only one motif, the consensus sequence is the same as the binding motif.

Known DNA binding consensus

We collected the known consensus sequences for the TFs from the literature and databases (TRANSFAC and JASPAR). Some TFs have multiple known consensus sequences. In such cases, we prefer consensus sequences derived from in vitro interaction assays to those derived from in vivo experiments such as ChIP-seq, because the binding sequences derived from ChIP-seq could include methylated sequences. For example, one of the KLF4 consensus sequences in TRANSFAC was based on a ChIP-seq experiment (Chen et al., 2008), while another KLF4 binding motif was based on an in vitro oligonucleotide library (Shields and Yang, 1998). The known consensus for KLF4 shown in Figure 2A was created based on the dataset from in vitro oligonucleotide library using MDscan (Liu et al., 2002).

Site-directed mutagenesis

Site-directed mutagenesis was carried out using the QuikChange Multi Site-Directed Mutagenesis Kit (Agilent, Santa Clara, CA).

KLF4 methylated binding motifs in vivo

We integrated human methylome data and KLF4 ChIP-seq data to obtain the KLF4 binding landscape in H1 human embryonic stem cells (Lister et al., 2009). First, we adopted one popular software MACS (Model-based Analysis for ChIP-Seq) to determine KLF4 binding peaks and summits. The lengths of peaks determined by MACS were from 152 bp to 6062 bp. We selected 95% of these peaks which were shorter than 371 bp for further analysis. To remove the effect of sequence composition difference in genomic regions, we focused on the binding peaks but extended the peaks to ±50 bp to achieve enough ‘background’ sequences. The ‘foreground’ was sequences within ±60 bp around KLF4 binding summits. Since we are interested in motifs that could be affected by methylation, we obtained the occurrences of all possible 6-mers having CpG at the center position, e.g. AACGCT. The 6-mers significantly enriched in highly methylated summits (M ≥ 0.8) than in the background were predicted as KLF4 methylated binding motifs. The p-value was calculated based on hypergeometric cumulative distribution function. At the next step, we clustered significant 6-mer sequences (p<0.01 after Bonferroni multiple-test correction, Supplementary file 1D) from methylated group to obtain KLF4 methylated DNA binding consensus motif. One consensus sequence is shown in Figure 4A.

H1 cell culture

For feeder-free culture of H1, cells were cultured on Matrigel (BD, 1:80 dilution)-coated plates with MEF-condition media (D-MEM/F12 supplemented with 20% KSR, 2 mM L-glutamine, 100 mM MEM NEAA, and 100 mM beta-mercaptoethanol) supplemented with 4 ng/ml bFGF as previously described (Chiang et al., 2011). Media were changed daily.

Chromatin immunoprecipitation

Chromatin immunoprecipitation (ChIP) was carried out in H1 cells using a rabbit anti-KLF4 antibody (H180; Santa Cruz, Dallas, TX) according to a protocol described previously (Nelson et al., 2006), except that the protein A-Sepharose was replaced with Dynabeads Protein A (Life Technologies, Grand Island, NY). Normal rabbit IgG was used for mock IP as a negative control. Primers shown in Supplementary file 1E were used in ChIP-PCR and qPCR to detect the binding and enrichment of KlF4 regulatory regions after ChIP.

Assessment of CpG methylation status by bisulfite sequencing

Sanger bisulfite sequencing was performed as previously described (Guo et al., 2011; Ma et al., 2009). Purified genomic DNA or ChIP DNA from H1 cells were treated by EZ DNA Methylation-Direct Kit (Zymo Research, Irvine, CA). After bisulfite conversion, regions of interest were PCR-amplified. The primers used in Figure 4D,E were listed in Supplementary file 1E. PCR products were gel-purified and cloned into a TA vector (Life Technologies). Individual clones were sequenced and aligned with the reference sequence. eLife posts the editorial decision letter and author response on a selection of the published articles (subject to the approval of the authors). An edited version of the letter sent to the authors after peer review is shown, indicating the substantive concerns or comments; minor concerns are not usually shown. Reviewers have the opportunity to discuss the decision before the letter is sent (see review process). Similarly, the author response typically shows only responses to the major concerns raised by the reviewers. Thank you for sending your work entitled “DNA methylation presents distinct binding sites for human transcription factors” for consideration at eLife. Your article has been evaluated by a Senior editor, Detlef Weigel, and 3 reviewers, one of whom is a member of our Board of Reviewing Editors. The Reviewing editor and the other referees discussed their critiques and request that you respond in writing to the major concerns below. This manuscript presents an observation with important implications for the field of chromatin regulation and transcription but it was not successful in convincing us of the physiological relevance of the findings; specifically, the ChIP experiments must be improved. The authors should better demonstrate that the dual preference of KLF4 in vivo before this work could be published. The specific points to address are: 1) Figure 4D: because of the poor quality (or presentation) of the ChIP-seq data in Figure 4A, ChIP-qPCR data for this experiment should be presented here to convince the reader that KLF4 is actually bound to these sites in vivo. 2) The authors should discuss the protein microarray approach in more detail. Do they have any estimations about false positive or false negative hits in their screening? 3) The authors generate a number of KLF4 point mutants that are defective for meCpG binding (R458A, D460A) but still interact with un-methylated DNA (Figure 3C–E). It would be important to dissect which domain or which KLF4 zinc finger is specifically mediating the interaction with un-methylated DNA. 4) Figure 2F: this figure lacks negative controls. What happens when adding both methylated and un-methylated competitor DNA? This should abolish KLF4 and TFAP2A binding. 5) For a couple of transcription factor-meCpG interactions, it would be important to determine the KD of the interaction using conventional biochemistry and compare these KD s to other published meCpG-protein affinities. 6) The authors should discuss their results in the context of previously published literature and discuss any overlap, discrepancies, and novelties. For example: why were ZNF114 and ZNF416 not picked up in previous interactions screenings? Why did the authors not identify meCpG dependent interactions for Rfx proteins? 1) : because of the poor quality (or presentation) of the ChIP-seq data in 4A, ChIP-qPCR data for this experiment should be presented here to convince the reader that KLF4 is actually bound to these sites in vivo. Figure 4A is a schematic plot to demonstrate how we identified the methylated binding logos for KLF4 using the published KLF4 ChIP-seq data in H1 cell (Lister et al., Nature 2009). For the sake of simplicity, Figure 4A has been moved to Figure 4–figure supplement 1. In the revised Figure 4C, we provide KLF4 ChIP-qPCR data in the form of gel images and quantitative analysis. In the revised Figure 4D, we provide evidence that KLF4 lands on un-methylated loci in the sequence context of Motif M412. 2) The authors should discuss the protein microarray approach in more detail. Do they have any estimations about false positive or false negative hits in their screening? We have added more details both to the main text and the Materials and methods section about the protein microarray analysis. We also estimated the false positive rate of the protein microarray assays based on the EMSA results. However, it is hard for us to evaluate the false negative rate by comparing with other results published so far. The reasons are as below: 1) Some proteins identified by others may not present on our protein microarrays, or they came from different species, e.g., mice in the study by Spruijt et al. (“Dynamic Readers for 5-(Hydroxy)Methylcytosine and Its Oxidized Derivatives”, Cell. 2013 Feb 28;152(5):1146-59). 2) Our competition assays on the protein microarrays were designed to identify those TFs that prefer methylated DNA motifs, which is very different from other works already published. 3) We only surveyed a limited space of methylated DNA sequences. (So far, we had only tested 154 methylated DNA motifs.) 4) The cutoff of Z-scores used to identify specific methylated DNA-binding proteins was set pretty high. In Figure 1–figure supplement 1D, we illustrated the Z-scores for some mouse proteins identified by Spruijt et al., which are just below our cutoff. We have added discussion of false positive and false negative hits in the Discussion section. 3) The authors generate a number of KLF4 point mutants that are defective for meCpG binding (R458A, D460A) but still interact with un-methylated DNA (). It would be important to dissect which domain or which KLF4 zinc finger is specifically mediating the interaction with un-methylated DNA. We have generated a truncation mutant and identified that it is the typical zf-C2H2 domain in KLF4 that mediates the interaction with the un-methylated DNA motif M412. We have described this new result in the main text and added it in as Figure 3–figure supplement 3. 4) : this figure lacks negative controls. What happens when adding both methylated and un-methylated competitor DNA? This should abolish KLF4 and TFAP2A binding. We have performed the negative control experiments by adding both methylated and un-methylated DNA motifs as competitor DNA as recommended. The results are exactly the same as the reviewers predicted (Figure 2–figure supplement 2). 5) For a couple of transcription factor-meCpG interactions, it would be important to determine the K of the interaction using conventional biochemistry and compare these K s to other published meCpG-protein affinities. We employed an oblique incidence reflectivity difference (OIRD) system (Landry et al., 2004; Zhu et al., 2007; Fei et al., 2011) to determine binding affinity (i.e., KD values) of ZMYM3, TFAP2A, and KLF4 with their corresponding methylated DNA motifs M203, 213, and M197, respectively. As a comparison, a well known methylated DNA-binding protein, MBD2b, was also included. Because the OIRD system, similar to the SPR method, can monitor binding events in a real-time, label-free fashion, we therefore obtained the Kon and Koff values, and determined the KD values of ZMYM3, TFAP2A, and KLF4 as 460 nM, 399 nM, and 479 nM, respectively, using their newly identified methylated DNA motifs (Figure 2–figure supplement 3B–D). As expected, none of them showed any significant binding activity to the un-methylated DNA motifs in the OIRD measurements, confirming our previous observations. On the other hand, the KD values of MBD2b measured against the same motifs are in close range to the three TFs tested, suggesting that these TFs could bind to methylated DNA motifs almost as equally well as MBD2b (Figure 2–figure supplement 3E). Finally, the KD values of MBD2b measured with the OIRD system is almost the same as reported (i.e., 330 nM; Yinni Yu et al., Direct DNA methylation profiling using methyl binding domain proteins. Anal. Chem. 2010, 82, 5012–5019). These new results and a description of the OIRD experiments can now be found in the main text, Figure 2–figure supplement 4, and the Materials and methods section. 6) The authors should discuss their results in the context of previously published literature and discuss any overlap, discrepancies and novelties. For example: why were ZNF114 and ZNF416 not picked up in previous interactions screenings? Why did the authors not identify meCpG dependent interactions for Rfx proteins? ZNF114 and ZNF416 were identified as generic methylated DNA-binding proteins in our study. The reason that they were not discovered in previous studies might be due to the fact that most of the previous studies were not performed in a systematic way. In our systematic survey, we identified a total of 47 novel methylated DNA-binding proteins, which significantly expanded the methylation-dependent protein-DNA interaction landscape. Secondly, a recent study by Spruijt et al. (Spruijt et al. Cell 2013) used a mass spectrometry approach to systematically identify methylated DNA-binding proteins. However, their approach was applied to investigate the mouse proteome, which does not encode any ZNF114 and ZNF416 orthologs. On the other hand, we did not identify Rfx proteins as methylated DNA-binding proteins. As we discussed in point 2, there are several reasons for missing previous known methylated DNA-binding proteins in our assay. For example, in our study we used a competition assay, which would not identify those TFs that binds to both methylated and un-methylated DNA motifs equally well. Only the proteins that are preferentially binding to methylated DNA motifs would be identified as hits in our study. Please note that four of the six Rfx proteins on the TF protein microarrays have a Z-score above 2, indicating that they have weak preference to the methylated DNA motifs tested in this study (Figure 1–figure supplement 1D). We have added discussion on these issues to the Discussion section.
  47 in total

Review 1.  Epigenetic regulation of gene expression: how the genome integrates intrinsic and environmental signals.

Authors:  Rudolf Jaenisch; Adrian Bird
Journal:  Nat Genet       Date:  2003-03       Impact factor: 38.330

2.  Label-free detection of microarrays of biomolecules by oblique-incidence reflectivity difference microscopy.

Authors:  J P Landry; X D Zhu; J P Gregg
Journal:  Opt Lett       Date:  2004-03-15       Impact factor: 3.776

3.  DNA methylation inhibits transcription indirectly via a methyl-CpG binding protein.

Authors:  J Boyes; A Bird
Journal:  Cell       Date:  1991-03-22       Impact factor: 41.582

4.  Purification, sequence, and cellular localization of a novel chromosomal protein that binds to methylated DNA.

Authors:  J D Lewis; R R Meehan; W J Henzel; I Maurer-Fogy; P Jeppesen; F Klein; A Bird
Journal:  Cell       Date:  1992-06-12       Impact factor: 41.582

5.  Non-CpG methylation is prevalent in embryonic stem cells and may be mediated by DNA methyltransferase 3a.

Authors:  B H Ramsahoye; D Biniszkiewicz; F Lyko; V Clark; A P Bird; R Jaenisch
Journal:  Proc Natl Acad Sci U S A       Date:  2000-05-09       Impact factor: 11.205

6.  Identification of a mammalian protein that binds specifically to DNA containing methylated CpGs.

Authors:  R R Meehan; J D Lewis; S McKay; E L Kleiner; A P Bird
Journal:  Cell       Date:  1989-08-11       Impact factor: 41.582

7.  Identification of the DNA sequence that interacts with the gut-enriched Krüppel-like factor.

Authors:  J M Shields; V W Yang
Journal:  Nucleic Acids Res       Date:  1998-02-01       Impact factor: 16.971

8.  An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments.

Authors:  X Shirley Liu; Douglas L Brutlag; Jun S Liu
Journal:  Nat Biotechnol       Date:  2002-07-08       Impact factor: 54.908

Review 9.  Epigenetics in human disease and prospects for epigenetic therapy.

Authors:  Gerda Egger; Gangning Liang; Ana Aparicio; Peter A Jones
Journal:  Nature       Date:  2004-05-27       Impact factor: 49.962

10.  The Pfam protein families database.

Authors:  Alex Bateman; Lachlan Coin; Richard Durbin; Robert D Finn; Volker Hollich; Sam Griffiths-Jones; Ajay Khanna; Mhairi Marshall; Simon Moxon; Erik L L Sonnhammer; David J Studholme; Corin Yeats; Sean R Eddy
Journal:  Nucleic Acids Res       Date:  2004-01-01       Impact factor: 16.971

View more
  149 in total

1.  Competition between DNA methylation and transcription factors determines binding of NRF1.

Authors:  Silvia Domcke; Anaïs Flore Bardet; Paul Adrian Ginno; Dominik Hartl; Lukas Burger; Dirk Schübeler
Journal:  Nature       Date:  2015-12-16       Impact factor: 49.962

2.  Ovine HSP90AA1 gene promoter: functional study and epigenetic modifications.

Authors:  Judit Salces-Ortiz; Carmen González; Alfonso Bolado-Carrancio; Jose Carlos Rodríguez-Rey; Jorge H Calvo; Rubén Muñoz; M Magdalena Serrano
Journal:  Cell Stress Chaperones       Date:  2015-08-08       Impact factor: 3.667

3.  Base-resolution methylation patterns accurately predict transcription factor bindings in vivo.

Authors:  Tianlei Xu; Ben Li; Meng Zhao; Keith E Szulwach; R Craig Street; Li Lin; Bing Yao; Feiran Zhang; Peng Jin; Hao Wu; Zhaohui S Qin
Journal:  Nucleic Acids Res       Date:  2015-02-26       Impact factor: 16.971

4.  A Conformational Switch in the Zinc Finger Protein Kaiso Mediates Differential Readout of Specific and Methylated DNA Sequences.

Authors:  Evgenia N Nikolova; Robyn L Stanfield; H Jane Dyson; Peter E Wright
Journal:  Biochemistry       Date:  2020-05-12       Impact factor: 3.162

5.  Parkinson-Associated SNCA Enhancer Variants Revealed by Open Chromatin in Mouse Dopamine Neurons.

Authors:  Sarah A McClymont; Paul W Hook; Alexandra I Soto; Xylena Reed; William D Law; Samuel J Kerans; Eric L Waite; Nicole J Briceno; Joey F Thole; Michael G Heckman; Nancy N Diehl; Zbigniew K Wszolek; Cedric D Moore; Heng Zhu; Jennifer A Akiyama; Diane E Dickel; Axel Visel; Len A Pennacchio; Owen A Ross; Michael A Beer; Andrew S McCallion
Journal:  Am J Hum Genet       Date:  2018-11-29       Impact factor: 11.025

6.  Aryl hydrocarbon receptor gene transitions (c.-742C>T; c.1661G>A) and idiopathic male infertility: a case-control study with in silico and meta-analysis.

Authors:  Younes Aftabi; Abasalt Hosseinzadeh Colagar; Faramarz Mehrnejad; Ensiyeh Seyedrezazadeh; Emadoddin Moudi
Journal:  Environ Sci Pollut Res Int       Date:  2017-07-15       Impact factor: 4.223

Review 7.  Cross-talk between site-specific transcription factors and DNA methylation states.

Authors:  Adam Blattler; Peggy J Farnham
Journal:  J Biol Chem       Date:  2013-10-22       Impact factor: 5.157

8.  Cell-specific Kaiso (ZBTB33) Regulation of Cell Cycle through Cyclin D1 and Cyclin E1.

Authors:  Amir Pozner; Tommy W Terooatea; Bethany A Buck-Koehntop
Journal:  J Biol Chem       Date:  2016-09-30       Impact factor: 5.157

9.  Vanin-1 expression and methylation discriminate pediatric asthma corticosteroid treatment response.

Authors:  Chang Xiao; Jocelyn M Biagini Myers; Hong Ji; Kelly Metz; Lisa J Martin; Mark Lindsey; Hua He; Racheal Powers; Ashley Ulm; Brandy Ruff; Mark B Ericksen; Hari K Somineni; Jeffrey Simmons; Richard T Strait; Carolyn M Kercsmar; Gurjit K Khurana Hershey
Journal:  J Allergy Clin Immunol       Date:  2015-04-21       Impact factor: 10.793

10.  Structural insights into methylated DNA recognition by the C-terminal zinc fingers of the DNA reader protein ZBTB38.

Authors:  Nicholas O Hudson; Frank G Whitby; Bethany A Buck-Koehntop
Journal:  J Biol Chem       Date:  2018-10-24       Impact factor: 5.157

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.