Sreejana Ray1, Desiree Tillo1, Stewart R Durell2, Syed Khund-Sayeed1, Charles Vinson1. 1. Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, 37 Convent Drive, Building 37, Room 5000, Bethesda, Maryland 20892, United States. 2. Laboratory of Cell Biology, National Cancer Institute, National Institutes of Health, 37 Convent Drive, Building 37, Room 5000, Bethesda, Maryland 20892, United States.
Abstract
NFATc2 is a DNA binding protein in the Rel family of transcription factors that binds a CGGAA motif better when both cytosines in the CG dinucleotide are methylated. Using protein binding microarrays (PBMs), we examined the DNA binding of NFATc2 to three additional types of DNA: single-stranded DNA (ssDNA) and double-stranded DNA (dsDNA) with either 5-methylcytosine (5mC, M) or 5-hydroxymethylcytosine (5hmC, H) in one strand and a cytosine in the second strand. ATTTCCAC, which contains the complement of the core GGAA motif, is bound better as ssDNA than as dsDNA. dsDNA containing the 5-mer CGGAA with either 5mC or 5hmC in one DNA strand is bound more strongly than unmodified CGGAA. In contrast, the reverse complement TTCCG is bound more weakly when it contains 5mC. Analysis of the available NFATc2:dsDNA complexes rationalizes these PBM data.
Nuclear Factor of Activated T-cells cytoplasmic 2,
or NFATc2[1] (also known as
NFAT1, NFAT1a, or NFATP)[2] is a member of
the calcium-responsive NFAT family of transcription factors (TFs).
NFATc2 is expressed in many somatic tissues including immune and endothelial
cells[2−5] and is involved in the regulation of cellular processes, including
cell cycle regulation, T-cell differentiation and activation, and
development.[2,6−8] Because of its
importance in development, dysregulation of NFAT results in malignancies
and other pathologies.[3,9,10]
NFAT contains two separate functional domains, the NFAT-homology
region (NHR) that is involved in calcium binding and subcellular localization
and the REL-homology region (RHR) that is a sequence-specific DNA
binding domain.[11] The RHR is composed of
two immunoglobulin folds, RHR-N involved in DNA binding and RHR-C
that mediates homo- or heterodimerization.[12] All NFAT TFs bind DNA as monomers to the core NFAT DNA binding motif GGAAAA.[11] NFAT TFs
can bind cooperatively with other TFs including AP-1,[13] GATA4,[14] IRF4,[15,16] FOXP3,[17] and MEF2.[18] For clarity, we present DNA sequences in bold font.
Several structures of monomeric NFATc2 bound to double-stranded
DNA (dsDNA) have been solved[12,19,20] and amino acids that interact with both strands in the major groove
of dsDNA have been highlighted. Two arginines in the loop regions
of RHR-N (R421 and R430) form bidentate hydrogen bond interactions
in the major groove with G and G of the NFAT DNA motif.[21] The amide side chain of Q571 forms hydrogen
bonds with A of the core motif.[22] Several amino acids N-terminal of R421,
R430, and Q571 interact with the thymines in the complementary strand TTTTCC (nucleotide bases in the complementary strand are denoted
with a “′”). Y424 interacts with T and T, and R572 interacts with T and T via van
der Waals contacts, and both R572 and Y424 form hydrogen bonds with
the DNA backbone.[11]
It was shown
that NFATc2 can bind strongly to sequences containing
5-mer CGGAA when a cytosine in both strands of the CG dinucleotide is methylated (e.g., MGGAA in one strand and TTCMG in the second strand),[23] expanding
the DNA sequences bound by NFATc2. To further explore the range of
DNA binding of NFATc2, we performed protein binding microarray (PBM)
experiments[24] with microarrays containing
three different types of DNA, single-stranded DNA (ssDNA), dsDNA with
5mC in one strand and a cytosine in the second strand (dsDNA(5mC|C)),
and dsDNA with 5hmC in one strand and a cytosine in the second strand
(dsDNA(5hmC|C)),[25] and compared these data
to previous data of NFATc2 binding to dsDNA with cytosine in both
strands (dsDNA(C|C)) and where both cytosines in all CG dinucleotides
contain 5mC (dsDNA(5mCG)).[23] Previously,
we have examined several bZIP family TFs binding with these modified
dsDNAs and identified the critical roles of conserved amino acids
in their sequence-specific DNA binding ability.[26,27] Here, the contribution of 5mC and 5hmC in one strand of dsDNA to
NFATc2 binding was examined.
Results
Protein Binding Microarrays with Four Types of DNA
Previously, we examined NFATc2 binding to two types
of dsDNA using PBMs, dsDNA where both strands contain a cytosine (dsDNA(C|C))
and dsDNA where both cytosines in all CG dinucleotides are methylated
(dsDNA(5mCG)).[23] NFATc2 preferentially
bound many 8-mers containing a methylated CG dinucleotide
in 5-mer CGGAA.[23] To learn
more about NFATc2 binding to different types of DNA, we examined NFATc2
binding to three additional forms of DNA using PBMs. The 16 grids
of the Agilent HK design 40k DNA microarray[28,29] were divided into four chambers containing four grids each using
a gasket slide from Agilent (Figure S1).
One chamber was left as ssDNA and the three other chambers were double-stranded
using either a cytosine producing dsDNA(C|C), or 5mC producing dsDNA(5mC|C),
or 5hmC producing dsDNA(5hmC|C).[30−33] NFATc2 was bound to two grids
in each of the four sectors. Examination of binding to all four types
of DNA on a single microarray slide allows for comparisons between
DNA types. These new data for NFATc2 binding dsDNA(C|C)[23] agree with our previously published data (R = 0.8, Figures S2 and S3E).
Rel Domain of NFATc2 Binding to Four Types of DNA
We evaluated NFATc2 binding to four types of DNA by
examining the median and average binding to the 40,000 DNA features
on each sector of the Agilent microarray [Figure 1 (replicate data 1), Figure S4 (replicate data 2), and Table S1]. The difference between the average and the median reflects sequence-specific DNA binding: a few strongly bound features raise the average but barely move the median.
The median binding of NFATc2 to ssDNA (2256, Figure 1) is higher than that to the three types of dsDNA,
indicating that NFATc2 preferentially binds ssDNA. However, the average
binding intensities for dsDNA(C|C) (2830) and dsDNA(5mC|C) (3404)
are higher than that for ssDNA (2351), indicating more sequence-specific
binding. This is observed in the longer right tail of all three
dsDNA distributions (Figure 1 and Table S1), indicating that
NFATc2 binds specifically to some dsDNA. Among the dsDNA types, both average and median
binding are highest for dsDNA with 5mC, followed by cytosine, and
finally 5hmC. A search for the most enriched DNA motifs in the best-bound
1% of the 40,000 PBM features for all four DNA types finds the
core motif 5-mer TTTCC as the best-bound sequence in
both ssDNA and dsDNA(C|C). The preference for binding ssDNA TTTCC suggests that these interactions are stronger than the
interactions with the opposite strand GGAAA. The best-bound
features for dsDNA(5mC|C) and dsDNA(5hmC|C) contain the 5-mers MGGAA and HGGAA, respectively (Table S2).
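The reasoning about median versus average can be illustrated with a minimal sketch. The intensities below are synthetic values invented for illustration, not PBM measurements, and the function name `tail_summary` is hypothetical.

```python
from statistics import mean, median

def tail_summary(intensities):
    """Return (median, mean, mean - median) for a list of probe intensities."""
    med = median(intensities)
    avg = mean(intensities)
    return med, avg, avg - med

# Synthetic values (assumption): a flat background versus a lower background
# with one strongly bound feature forming a long right tail.
flat_background = [2200, 2250, 2300, 2350, 2400]
long_right_tail = [1500, 1600, 1700, 1800, 12000]

print(tail_summary(flat_background))
print(tail_summary(long_right_tail))
```

In this toy case the long-tailed set has the lower median but the higher average, the same qualitative pattern described above for dsDNA relative to ssDNA.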
Figure 1
NFATc2 binding four types of DNA. Histogram of fluorescence intensities
from PBM experiments representing NFATc2 binding to 40,000 DNA features
containing (A) ssDNA, (B) dsDNA, and (C) dsDNA with 5mC in one strand
and cytosine in the other strand (5mC|C) and (D) dsDNA with 5hmC in
one strand and cytosine in another strand (5hmC|C). Values above 60,000
intensity units are not shown. The median and average binding intensities
of NFATc2 with four types of DNA are indicated on the plots.
NFATc2 Binding to ssDNA and dsDNA(C|C)
We next calculated the relative binding affinities (Z-scores)[29] for NFATc2 binding to all 8-mers. While other
8-mer measures have been developed such as the rank-based E-score,[34,35] we prefer to use Z-scores as they are directly associated with binding
affinity and give an idea of the specificity of the protein under
investigation. Figure 2 (Figure S5A for replicate data) presents
a scatter plot comparing NFATc2 Z-scores to all DNA 8-mers for dsDNA(C|C)
(x-axis) or ssDNA (y-axis). The
8-mers are divided into two groups, those that contain 5-mer TTTCC, which is enriched in the top bound array features for
both ssDNA and dsDNA(C|C), and all other 8-mers. For dsDNA(C|C), reverse
complements are equivalent (i.e., TTTCC is equivalent
to GGAAA), but with ssDNA, complements are not equivalent.
We obtain lower Pearson correlation values for the 8-mer Z-scores
for ssDNA replicate experiments (Figures S2B and S3) compared to those obtained for the raw intensity values
(Figure S2A), highlighting the lower sequence
specificity for ssDNA. Nevertheless, among the 8-mers best bound
by NFATc2 as ssDNA are those containing TTTCC and GTTCC (Figures 2 and S5A-C, Table S3).
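The strand equivalence described above (reverse complements are the same site in dsDNA(C|C) but distinct sequences in ssDNA) can be sketched as follows. The helper names `reverse_complement` and `dsdna_key` are hypothetical and are not taken from the analysis pipeline used in this work.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq):
    """Return the reverse complement of an unmodified DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def dsdna_key(kmer):
    """Strand-independent key: a k-mer and its reverse complement share it."""
    return min(kmer, reverse_complement(kmer))

print(reverse_complement("TTTCC"))               # GGAAA
print(dsdna_key("TTTCC") == dsdna_key("GGAAA"))  # True: one site in dsDNA(C|C)
print("TTTCC" == "GGAAA")                        # False: distinct as ssDNA
```

For ssDNA, or for dsDNA with a modified cytosine in only one strand, the two orientations would be kept as separate entries instead of being collapsed with `dsdna_key`.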
Figure 2
NFATc2 binding to single-stranded
DNA 8-mers. Z-scores for NFATc2
binding to double-stranded DNA 8-mers (x-axis) or
single-stranded DNA 8-mers (y-axis). 8-mers are divided
into two groups: 8-mers containing TTTCC (green) and
all other 8-mers (black). A selection of 8-mers is indicated with
their sequences provided.
NFATc2 Binding to dsDNA(5mC|C)
Recently,
it was shown that the Rel domain of NFATc2 binds 8-mers
containing the 5-mer CGGAA more strongly when both cytosines
in the CG dinucleotide are 5mC.[23] We examined
binding when only one strand contains 5mC. Figure 3A shows the scatter plot comparing 8-mers
bound by NFATc2 to dsDNA(C|C) (x-axis) and dsDNA(5mC|C)
(y-axis). For this comparison, we divided 8-mers
into four groups: (1) those that contain the NFATc2 canonical 5-mer CGGAA, (2) those that contain the reverse complementary 5-mer TTCCG, (3) all other 8-mers with a cytosine, and (4) 8-mers
with no cytosine. NFATc2 binding to 8-mers without a cytosine falls along
the diagonal and acts as an internal control. The best-bound 8-mer
in both dsDNA(5mC|C) and dsDNA(C|C) in the non-cytosine 8-mer group
is TGGAAAAT; thymine and 5mC both contain a methyl group
at the same position of the pyrimidine ring. In general, 8-mers with CGGAA are better bound with dsDNA(5mC|C), while 8-mers containing
the reverse complement TTCCG are better bound with dsDNA(C|C).
Examination of NFATc2 binding to 8-mers containing a single cytosine
or 5mC (Figure 3B)
shows a pattern similar to that in Figure 3A. Here, all 8-mers containing the MGGAA 5-mer are well bound, suggesting that methylation of a single cytosine
at position −1 of the MGGAA 5-mer is sufficient for the stronger binding of NFATc2
(Figure 3B).
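The four-way grouping of 8-mers used for this comparison can be sketched as a simple classifier. The function name, the group labels, and the order in which the groups are tested are assumptions made for illustration; an 8-mer such as TTCCGGAA contains both 5-mers, and the text does not state how such cases were assigned.

```python
def classify_8mer(kmer):
    """Assign an 8-mer (written with unmodified bases) to one of four groups."""
    if "CGGAA" in kmer:
        return "CGGAA"          # canonical 5-mer (checked first: an assumption)
    if "TTCCG" in kmer:
        return "TTCCG"          # reverse complementary 5-mer
    if "C" in kmer:
        return "other_with_C"   # any other cytosine-containing 8-mer
    return "no_C"               # internal control group

for kmer in ["ACGGAAAT", "ATTCCGAT", "ACTGGAAA", "TGGAAAAT"]:
    print(kmer, classify_8mer(kmer))
```

8-mers in the `no_C` group are identical in dsDNA(C|C) and dsDNA(5mC|C), which is why they serve as the internal control along the diagonal.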
Figure 3
NFATc2 binding
to double-stranded DNA 8-mers with 5mC in one strand
and cytosine in another strand. (A) 8-mer Z-score comparison of NFATc2
binding to dsDNA(C|C) (x-axis) and dsDNA(5mC|C) (y-axis). The 8-mers are divided into four groups: those
containing CGGAA (red), containing TTCCG (green), all other cytosine containing 8-mers (black), and those
without a cytosine (gray). The sequences of several 8-mers are indicated.
A line of best fit (gray) is shown for the 8-mers without a cytosine.
(B) Same as in (A) but for 8-mers containing one cytosine. (C) Z-scores
for NFATc2 binding to dsDNA(5mC|C) 8-mers (x-axis) or
8-mers in which both cytosines in all CG dinucleotides are 5mC (dsDNA(5mCG))
(y-axis).[23] Only data
for 8-mers containing a single cytosine that is in a CG dinucleotide are shown.
8-mers containing the core NFAT motif CGGAA are shown
in red and all other 8-mers are shown in black.
Contribution of Each 5mC in a Methylated CG Dinucleotide to Preferential NFATc2 Binding
We compared
the dsDNA(5mC|C) data to previous data where both cytosines in all
CG dinucleotides are 5mC (dsDNA(5mCG)).[23] We examined the 5096 8-mers that contain one cytosine that is in
a CG dinucleotide (Figure 3C). 8-mers containing the CGGAA 5-mer fall along
the diagonal, indicating that methylation of the single cytosine in CGGAA is energetically similar to methylation of both cytosines
in the CG dinucleotide. It also suggests that methylation
of the cytosine on the other strand (e.g., TTCMG) does
not change NFATc2 binding. While our experiments do not allow for
different combinations of unmethylated and methylated cytosines on
the same strand (i.e., we cannot measure binding of TTCMG), the reduction of
NFATc2 binding to the sequence TTMMG observed in Figure 3A suggests that 5mC
at position 2′ is responsible for the reduced binding
to the complement motif.
NFATc2 Binding to dsDNA(5hmC|C)
5hmC is an oxidation product of
5mC that is associated with active
demethylation of DNA in vivo.[36] We compared NFATc2 binding to dsDNA(C|C) and dsDNA(5hmC|C)
(Figure 4A). The effect
of 5hmC on NFATc2 binding is not as dramatic as that observed with 5mC.
Similar to dsDNA(5mC|C), all 8-mers containing the reverse complement
(TTCCG) are preferentially bound as dsDNA(C|C) compared
to dsDNA(5hmC|C). A few 8-mers with CGGAA are moderately
better bound with 5hmC (e.g., TGHGGAAA). Examination
of 8-mers with only one cytosine (Figure 4B) highlights a single CGGAA-containing 8-mer (CGGAAAAA) as the best-bound 8-mer with
either 5hmC or C. 8-mers containing 5hmC outside of the core motif
(e.g., AHTGGAAA) are also more strongly bound by NFATc2,
indicating that 5hmC at positions outside of the core motif can influence
NFATc2 binding.
Figure 4
NFATc2 binding to dsDNA 8-mers with 5hmC in one strand.
(A) 8-mer
Z-score comparison of NFATc2 binding to dsDNA(C|C) (x-axis) and dsDNA(5hmC|C) (y-axis). (B) Same as in
(A) but for 8-mers containing one cytosine. (C) 8-mer Z-score comparison
of NFATc2 binding to dsDNA(5mC|C) (x-axis) and dsDNA(5hmC|C)
(y-axis). (D) Same as
in (C) but for 8-mers containing one cytosine. For all panels, 8-mers
are colored as shown in Figure 3, with the sequences of several 8-mers indicated.
Figure 4C,D compares
NFATc2 binding to 8-mers for dsDNA(5mC|C) (x-axis)
and dsDNA(5hmC|C) (y-axis). Most 8-mers containing MGGAA are better bound than those containing HGGAA (e.g., TMGGAAAA and AMGGAAA), the exception being
the 8-mer CGGAAAAA, which is bound similarly strongly by NFATc2
when its cytosine is either 5mC or 5hmC. NFATc2 modestly prefers 8-mers
with TTMMG to those with TTHHG. Some 8-mers containing
cytosines outside the core CGGAA 5-mer are better bound
when they contain 5hmC compared to 5mC or cytosine (e.g., the aforementioned AHTGGAAA), highlighting that 5hmC at positions outside of the
NFAT core motif can promote NFATc2 binding to dsDNA.
Structural Analysis of NFATc2 Binding Methylated dsDNA
We investigated the available NFAT/dsDNA structural
complexes[20,21] in order to understand the physical causes
of the effects of cytosine modifications on NFATc2 dsDNA binding. Figure 5A examines the N-terminal
segment of the dsDNA binding domain of NFAT1 (RHR-N) binding the beginning
of the TGGAA consensus element. In particular, the methyl group
of the thymine at the −1 position occupies a partially hydrophobic pocket formed
by the side chain of R430 and the protein backbone. This is consistent
with thymine being a preferred nucleotide at this position for complex
formation and explains why 5mC, whose added methyl group is at the
same location as in thymine, is preferred over cytosine. On the complementary
strand, the added methyl group on the cytosine at
the 1′ position is
distant from the protein, consistent with our suggestion that methylation
at this position has no effect on the dsDNA binding affinity of the
complex. Finally, for the second cytosine on this
strand (position 2′), the added
methyl group butts against the side chain of E427, a contact that is not seen
for the unmethylated version of the consensus (Figure 5B). This suggestion of a potential steric
clash is consistent with the finding that methylation of this cytosine
weakens binding. This is further supported by an NMR structure involving
only the RHR-N domain of NFAT2 (PDB 1A66;[21]). In this
case, the cytosine at position 2′ and E427 are
in close proximity, and methylation causes an obvious steric clash
(Figure S6).
Figure 5
Models of NFATc2 binding
methylated and unmethylated dsDNA. (A)
Model of NFATc2 binding the TGGAA motif and the methylated complement TTMMA. The model was developed from
segments H, V, and Z of the PDB: 1pzu crystal structure (ref (20)) by simply adding methyl
groups to the C5 atoms of the cytosine bases. Only T of the top DNA strand and M and M of the bottom strand are shown for clarity. Pertinent residues
of the protein are shown as CPK spheres, and the rest of the protein
is represented as a gray surface. The bases and sugars of the DNA
are shown as sticks, and the backbones are represented by ribbons.
The atom color code is nitrogen—blue, oxygen—red, hydrogen—white,
protein carbon—gray, and DNA carbon and backbone ribbons—magenta
and purple, respectively, to distinguish the two strands. The methyl
groups of T, M, and M are represented by transparent spheres. (B) Model of NFATc2
binding the TGGAA motif and the unmethylated complement TTCCA.
Discussion
Previously, we showed that NFATc2 binds more strongly to the 5-mer CGGAA when
both cytosines in the CG dinucleotide are methylated (i.e., MGGAA in
one strand and TTCMG in the second strand).[23] Here, we examine how NFATc2 binds to three additional
kinds of DNA: ssDNA, dsDNA(5mC|C), and dsDNA(5hmC|C). ssDNA 8-mers
containing TTTCC or GTTCC are among the
8-mers best bound by NFATc2, suggesting that interactions with the strand
containing the TTTCC consensus may dominate over interactions
with the opposite strand. We evaluated how 5mC in either strand of dsDNA,
one in CGGAA and two in the complementary strand TTCCG, affects NFATc2 binding. CGGAA with 5mC in one strand and CGGAA with 5mC in both
cytosines of the CG dinucleotide are similarly bound by NFATc2, suggesting that 5mC in the opposite
strand (TTCCG) at position 1′ does not change binding. However,
5mC at position 2′ in the complement TTCCG inhibits DNA binding, which we attribute to a steric clash
between the methyl group of 5mC at position 2′ of
the core motif and Glu427 of the RHR-N domain of NFATc2. 5hmC is similar
to cytosine for binding CGGAA and weakens binding to TTCCG more than 5mC does.
The strong binding to certain ssDNA
8-mers may be biologically
important. It is becoming increasingly clear that non-B-form DNA structures
are bound in a sequence-specific manner and take part in gene regulation
by selective binding of different TFs and small molecules.[37−39] Sequence-specific binding to ssDNA has regulatory roles in eukaryotic
transcription.[40] The stronger binding to
ssDNA may reflect the conformational flexibility of ssDNA[40,41] and indicate that sequence-specific binding is primarily on one
strand of DNA.
Our PBM data and structural analysis suggest
a similarity between
5mC and thymine in NFATc2 binding (Figures 3 and 5). However,
some 8-mers with MGGAA are better bound than those containing TGGAA (Figure S7A,B), suggesting
that the two bases are not equivalent. Many of these 8-mers differ from each
other at locations outside of the core NFAT motif (e.g., AATMGGAA, p-val = 0.006, Figure S7B). In addition, several thymine to 5mC substitutions also reduce
binding of NFATc2, including those outside of the core motif (e.g., TGGAAAAT is better bound than TGGAAAAM, p-val = 0.04, Figure S7B). We
currently cannot provide a physical explanation for NFATc2 binding
these 8-mers. Our data highlight the effects of cytosine and modified
cytosines inside and outside of the core motif on NFATc2 dsDNA binding,
providing a richer description of the dsDNA binding specificity of
NFATc2. The DNA binding domain of all NFAT members is highly conserved
(64–72% sequence identity),[42] suggesting
that other NFAT family members may have similar changes in dsDNA binding
with 5mC and 5hmC.
Materials and Methods
Cloning and Expression of Human NFATc2 DNA Binding Domain
The construct containing human NFATc2 is an
N-terminal GST construct cloned into a Gateway system pDEST15 vector.[23] The protein was expressed using the PURExpress In vitro protein synthesis kit (NEB) as per the manufacturer’s
protocol[33] in a 12.5 μL reaction
volume containing 90 ng of plasmid. The amino acid sequence of NFATc2
(https://www.ncbi.nlm.nih.gov/protein/Q13469.2) with the Rel homology DNA binding domain (RHR-N) is shown below
in bold:
LVPPTWPKPLVPAIPICSIPVTASLPPLEWPLSSQSGSYELRIEVQPKPHHRAHYETEGSRGAVKAPTGGHPVVQLHGYMENKPLGLQIFIGTADERILKPHAFYQVHRITGKTVTTTSYEKIVGNTKVLEIPLEPKNNMRATIDCAGILKLRNADIELRKGETDIGRKNTRVRLVFRVHIPESSGRIVSLQTASNPIECSQRSAHELPMVERQDTDSCLVYGGQQMILTGQNFTSESKVVFTEKTTDGQQIWEMEATVDKDKSQPNMLFVEIPEYRNKHIRTPVKVNFYVINGKRKRSQPQHFTYHPVPAIKTEPTDEYDPTL.
Design of 40k Feature PBM and Double Stranding of the Microarray
The 40,000 array feature PBM design contains
16 sectors, each containing a DNA grid with 40,000 features. Each
feature consists of a single-stranded DNA 60-mer probe with a 35 bp
variable sequence and a 25 bp invariant sequence.[24,43] The variable sequence is designed in such a way that all 10 bp sequences
(10-mers) are represented once on the array, and all 8-mers (including
complements) are represented 32 times. During double stranding, T7 DNA polymerase was used
to incorporate cytosine (NEB), 5mC (NEB), or 5hmC (Zymo Research)
into the probes,
creating unmethylated, hemi-methylated, or hemi-hydroxymethylated
dsDNA, respectively.[30,33,44] To monitor
the double-stranding efficiency, the double-stranding reaction mixture
was spiked with Cy3-dCTP (4%).[30,33,44]
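The numbers in this design are mutually consistent, as the following back-of-the-envelope sketch suggests. It assumes a de Bruijn-style layout in which each 10-mer occurs exactly once and every 35 bp variable region contributes 26 overlapping 10-mer windows; this layout is an assumption made for illustration, and the cited design papers should be consulted for the actual construction.

```python
K_FULL = 10          # length of the k-mers covered exactly once
K_SHORT = 8          # length of the k-mers scored in the analysis
VARIABLE_LEN = 35    # variable part of each 60-mer probe

n_10mers = 4 ** K_FULL                           # 1,048,576 distinct 10-mers
windows_per_probe = VARIABLE_LEN - K_FULL + 1    # 26 10-mer windows per probe
min_probes = -(-n_10mers // windows_per_probe)   # ceiling division -> 40330

# Each 8-mer is the prefix of 4**2 = 16 different 10-mers, so it occurs 16 times;
# counting its reverse complement as well gives 32 (for non-palindromic 8-mers).
occurrences_one_strand = 4 ** (K_FULL - K_SHORT)
occurrences_both_strands = 2 * occurrences_one_strand

print(min_probes, occurrences_one_strand, occurrences_both_strands)
```

Under these assumptions the probe count comes out close to the stated 40,000 features, and the 32 occurrences per 8-mer follow from counting both orientations.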
DNA-Protein Binding Reaction and Data Analysis
DNA protein binding reactions were performed using these four types
of DNA.[45] The image generated from the
Agilent Surescan microarray scanner was quantified using ImaGene 9
software (BioDiscovery Inc.), and the extracted probe intensity values
were used for the calculation of Z-scores.[44] In previous studies, complementary 8-mers were combined,[33] but because of the asymmetric nature of the
double-stranding protocol for 5mC and 5hmC on PBMs, complementary
8-mers are different. Therefore, only the Z-scores of the reverse-complement
8-mers, extracted from the array probe design, were considered,
as these represent the sequences containing the modified cytosine
in dsDNA. The Z-score is the number of standard deviations by which each 8-mer
intensity differs from the global array median intensity.[34] All proteins were assayed at least twice with good agreement
(R > 0.85) between replicates (Figures S2 and S3). For all visualizations in the main figures,
we used data from representative experiments (indicated in Figure S2). Data (raw probe intensities and 8-mer
Z-scores) have been deposited at the NCBI Gene Expression Omnibus
(GEO) database under accession GSE10463.
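One possible reading of the Z-score definition above is sketched below. It is an illustration only: summarizing each 8-mer by the median intensity of the probes that contain it and using the population standard deviation of all probe intensities as the scale are assumptions, and the published procedure (refs 34 and 44) may differ. The function name `kmer_z_scores` and the toy probes are hypothetical.

```python
from statistics import median, pstdev

def kmer_z_scores(probes, k=8):
    """probes: list of (sequence, intensity) pairs. Returns {k-mer: Z-score}.

    Assumes the intensities are not all identical (non-zero spread).
    """
    by_kmer = {}
    for seq, intensity in probes:
        # Count each k-mer at most once per probe.
        for kmer in {seq[i:i + k] for i in range(len(seq) - k + 1)}:
            by_kmer.setdefault(kmer, []).append(intensity)
    intensities = [intensity for _, intensity in probes]
    center = median(intensities)   # global array median
    spread = pstdev(intensities)   # assumed scale estimate
    return {kmer: (median(vals) - center) / spread
            for kmer, vals in by_kmer.items()}

# Toy example with k=3 so that the short synthetic probes contain several k-mers.
toy_probes = [("ACGT", 100.0), ("CGTA", 300.0), ("TTTT", 200.0)]
z = kmer_z_scores(toy_probes, k=3)
print(sorted(z.items()))
```

For the hemi-modified arrays, only the orientation that carries the modified cytosine would be scored, as described in the text, rather than pooling a k-mer with its reverse complement.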
Motif Enrichment
Motif enrichment
analysis in the top 1% (403) of array probes ranked by intensity was
performed using the MEME tool of the MEME software suite version 4.11.2.[46] Motifs were searched on the forward strands
for the ssDNA PBMs and the reverse strand for the dsDNA(5mC|C) and
dsDNA(5hmC|C) experiments. Enriched motifs were searched for on both
strands of the dsDNA(C|C) experiments.
Structural Modeling
Models of the
NFATc2 protein binding the sequence TGGAAA and the methylated
complement TTTMMA were developed from the crystal structure
of the unmethylated complex (ref (20); PDB: 1pzu) with the UCSF Chimera software package.[47] Chimera is developed by the Resource for Biocomputing,
Visualization, and Informatics at the University of California, San
Francisco (supported by NIGMS P41-GM103311).