The human muscleblind-like proteins (MBNL) regulate tissue-specific splicing by targeting cardiac troponin T and other pre-mRNAs; aberrant targeting of CUG and CCUG repeat expansions frequently accompanies the neuromuscular disease myotonic dystrophy. We show, using biolayer interferometry (Octet) and NMR spectroscopy, that the zinc finger domains of MBNL isoform 1 (MBNL1) are necessary and sufficient for binding CGCU sequences within the pre-mRNA of human cardiac troponin T. Protein constructs containing zinc fingers 1 and 2 (zf12) and zinc fingers 3 and 4 (zf34) of MBNL1 each fold into a compact globular tandem zinc finger structure that participates in RNA binding. NMR spectra show that the stoichiometry of the interaction between zf12 or zf34 and the CGCU sequence is 1:1, and that the RNA is single-stranded in the complex. The individual zinc fingers within zf12 or zf34 are nonequivalent: the primary RNA binding surface is formed in each pair by the second zinc finger (zf2 or zf4), which interacts with the CGCU RNA sequence. The NMR structure of the complex between zf12 and a 15-base RNA of sequence 95GUCUCGCUUUUCCCC109, containing a single CGCU element, shows the single-stranded RNA wrapped around zf2 and extending to bind to the C-terminal helix. Bases C101, U102, and U103 make well-defined and highly ordered contacts with the protein, whereas neighboring bases are less well-ordered in the complex. Binding of the MBNL zinc fingers to cardiac troponin T pre-mRNA is specific and relatively simple, unlike the complex multiple dimer-trimer stoichiometries postulated in some previous studies.
The muscleblind-like
(MBNL) proteins are a subfamily of CCCH zinc
finger proteins that function as developmentally programmed regulators
of alternative pre-mRNA splicing. Muscleblind was originally identified
in Drosophila, where it regulates the terminal differentiation
of photoreceptors and muscles.[1,2] Muscleblind proteins
are exclusive to metazoans and are found in all genomes from Caenorhabditis elegans to Homo sapiens.[3,4] Vertebrate genomes contain three muscleblind homologues: MBNL1 (also
named MBNL/EXP), MBNL2 (MBLL/MLP1), and MBNL3 (MBXL/CHCR). Each of
these MBNL homologues contains two distantly spaced pairs of CCCH zinc
fingers (tandem zinc fingers, TZF) that are absolutely required for
RNA binding.[3,5] Fingers 1 and 3 have zinc ligand
spacing CX7CX6CX3H, whereas fingers
2 and 4 are CX7CX4CX3H motifs; fingers
1 and 2 are separated by 14 residues, and fingers 3 and 4 are separated
by 16 residues.[6−8] In vertebrate muscleblind, the two TZF domains, zf12
and zf34, are separated by an ∼80 residue linker (Figure 1A) that has a biased
amino acid composition and low sequence complexity (24% alanine, 18%
proline) and is therefore likely disordered. The muscleblind tandem
zinc finger sequences and their linkers are strongly conserved from C. elegans to humans (Figure 1B).
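The ligand spacings quoted above translate directly into sequence patterns. The following is a minimal sketch, assuming one-letter protein sequences and Python's standard `re` module; the function name `find_ccch_fingers` and the toy sequence are illustrative assumptions and are not taken from MBNL1.

```python
import re

# Ligand spacings quoted in the text:
#   C-X7-C-X6-C-X3-H (fingers 1 and 3) and C-X7-C-X4-C-X3-H (fingers 2 and 4).
# The lookahead form lets overlapping candidates be reported as well.
CCCH_PATTERNS = {
    "CX7CX6CX3H": re.compile(r"(?=(C.{7}C.{6}C.{3}H))"),
    "CX7CX4CX3H": re.compile(r"(?=(C.{7}C.{4}C.{3}H))"),
}

def find_ccch_fingers(sequence):
    """Return sorted (start, motif_name, matched_text) tuples; start is 1-based."""
    hits = []
    for name, pattern in CCCH_PATTERNS.items():
        for match in pattern.finditer(sequence):
            hits.append((match.start() + 1, name, match.group(1)))
    return sorted(hits)

# Hypothetical toy sequence (NOT MBNL1): one motif of each type,
# joined by a 14-residue spacer as described for fingers 1 and 2.
finger_a = "C" + "A" * 7 + "C" + "A" * 6 + "C" + "A" * 3 + "H"
finger_b = "C" + "A" * 7 + "C" + "A" * 4 + "C" + "A" * 3 + "H"
toy = "MA" + finger_a + "G" * 14 + finger_b

for start, name, text in find_ccch_fingers(toy):
    print(start, name, text)
```

A pattern match only flags candidate ligand spacings; as the mutagenesis results later in the paper illustrate for conserved cysteines and histidines, sequence position alone does not establish zinc ligation.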
Figure 1
(A) Domain organization of human muscleblind (MBNL) zinc
finger
domains showing domain boundaries for the zf12 and zf34 constructs
used in the present work. (B) Sequence alignment of the two tandem
zinc fingers of the zinc finger pairs zf12 and zf34 for the three
human (h) isoforms of MBNL (MBNL1, MBNL2, MBNL3) with MBNL proteins
from chicken (c), bovine (b), mouse (m), and the MBL protein of Drosophila (drome) (zf12 only) and with prototype tandem
CCCH zinc finger proteins human tristetraprolin (TTP) and TIS11d.
The zinc ligands are outlined in yellow (cysteine) and red (histidine).
Conserved aromatic residues are outlined in green, and arginines conserved
in MBNL and replaced by tyrosine in TTP and TIS11d are outlined in
blue. The characteristic “RYKTEL/KYKTEL” sequences of
the latter two proteins are outlined in purple. Pink, brown, and orange
outlines show conserved cysteine and histidine residues that we show
by mutagenesis are not involved in zinc ligation. Blue circles indicate
residues from zf2 or zf4 that show large downfield shifts in 1H–15N HSQC spectra of MBNL1 (this work)
or TIS11d[11] upon addition of RNA; red circles
indicate the corresponding residues from zf1 and zf3 that disappear
upon the addition of RNA. Green triangles indicate the start and end
of the shorter constructs used in the published crystal structures.[25]
Human MBNL is a member of a large family of proteins containing
tandem CCCH zinc finger domains that plays an important role in RNA
metabolism. The best characterized CCCH zinc finger proteins are those
of the TIS11 family, which control gene expression by modulation of
mRNA stability. The TIS11 proteins bind via their zinc finger domains
to a specific regulatory element, the AU-rich element (ARE) in the
3′-untranslated region (UTR) of short-lived cytokine, growth
factor, and proto-oncogene mRNAs.[9,10] The structure
of the tandem zinc finger domain of TIS11d has been determined in
complex with its RNA recognition element.[11]
In concert with CUG-BP, Elav-like family (CELF) proteins,
muscleblind
functions as a key developmental switch that controls splicing of
several genes including human cardiac troponin T (hcTnT), insulin
receptor (IR), and muscle-specific chloride ion channel (ClC-1).[12−15] MBNL proteins bind directly to the hcTnT pre-mRNA at a site immediately
adjacent to the alternative exon 5;[14] binding
appears to involve recognition of the consensus RNA motif YGCY, where
Y = pyrimidine.[16−19] Berglund and co-workers have suggested that MBNL1 regulates the
exclusion of exon 5 in the hcTnT mRNA by binding at the 3′-end
of intron 4 in hcTnT pre-mRNA. If MBNL1 is absent, the CELF proteins
and splicing machinery recognize this site, and the exon is included.[20] In Drosophila, muscleblind
plays a direct role in regulating alternate splicing of the α-actinin
pre-mRNA.[21]
In addition to binding
pre-mRNA, muscleblind binds tightly to expanded
pathogenic CUG and CCUG repeats that accumulate in nuclear foci in
myotonic dystrophy.[5−7,22] It has been proposed
that, under pathogenic conditions, muscleblind proteins are aberrantly
sequestered in the nuclear foci, thereby interfering with MBNL regulated
pre-mRNA splicing activity, disrupting normal cellular developmental
programs and resulting in myotonic dystrophy.[14] The muscleblind zinc finger domains are necessary and sufficient
for high affinity binding to double-stranded CUG and CCUG repeat RNA.
Deletion of the entire C-terminal region of human muscleblind has
no effect on binding,[5] whereas deletion
of zinc fingers from natural muscleblind isoforms results in abrogation
of binding to RNA.[6,22] Isoforms lacking the long linker
between fingers 2 and 3 are also active in RNA binding.[6] In contrast to the human protein, Drosophila muscleblind contains only one zinc finger pair, which is highly
homologous to human fingers 1 and 2; however, as with human MBNL,
CUG repeat transcripts sequester Drosophila muscleblind
and interfere with its splicing activity.[21] Thus, a single pair of zinc fingers appears to be sufficient to
impart the ability to bind both to cognate single-stranded RNA and
CUG repeat RNA. Mutagenesis studies have shown that the zinc finger
pairs of MBNL1 are not equivalent, with zf12 exhibiting higher RNA
binding affinity and splicing activity than zf34.[23,24]
In this study, we report the determination of the three-dimensional
NMR structures of the two sets of MBNL1 TZFs, zf12 and zf34, and of
the complex of MBNL1-zf12 with an RNA sequence derived from the pre-mRNA
of cardiac troponin T. Both zf12 and zf34 adopt a structure in which
the two zinc fingers are cooperatively folded to form a single globular
unit with a stable hydrophobic core. We have characterized the RNA
binding activity of MBNL1 by biolayer interferometry (Octet) and NMR-based
titration experiments. In addition, we show through NMR titration
and self-diffusion experiments that one pair of zinc fingers binds
to one YCGCU(U/G)Y motif in the respective hcTnT pre-mRNA.
We demonstrate that the hydrogen-bonded secondary structure, likely
a stem-loop structure in a 32-nucleotide sequence at the 3′
end of intron-4 in hcTnT, unfolds to single-stranded RNA as a consequence
of MBNL1 zinc finger binding. We have calculated the NMR solution
structure of the complex of this single-stranded RNA with zf12. During
the course of this work, three-dimensional structures of free MBNL1[25] and MBNL2[26] zinc
finger domains were reported. A crystal structure analysis of MBNL1
zf34 bound to RNA[25] used MBNL zinc finger
fragments and RNA molecules that are significantly shorter than those
reported here, which have been shown to compromise RNA binding.[27] Together with the structural analysis of the
free two-finger domains, we present both a comprehensive mapping of
the RNA binding surfaces of the MBNL1 zinc finger domains and a solution
structure of zf12 in complex with RNA. These results show that the
preferred stoichiometry of the interaction between a zinc finger pair
(zf12 or zf34) and a YGCY RNA motif is 1:1.
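To make the YGCY consensus concrete, the sketch below scans an RNA string for the motif and reports residue numbers. It is an illustrative assumption rather than part of the authors' analysis: the function name `find_ygcy` is hypothetical, and the numbering offset of 95 is taken from the 15-base sequence quoted in the abstract.

```python
import re

# YGCY, where Y = pyrimidine (C or U). The lookahead reports overlapping sites too.
YGCY = re.compile(r"(?=([CU]GC[CU]))")

def find_ygcy(rna, first_residue=1):
    """Return (residue_number, tetranucleotide) for every YGCY site in an RNA string."""
    return [(m.start() + first_residue, m.group(1)) for m in YGCY.finditer(rna.upper())]

# 15-base RNA quoted in the abstract, numbered 95-109.
hcTnT15 = "GUCUCGCUUUUCCCC"
sites = find_ygcy(hcTnT15, first_residue=95)
print(sites)
```

With this numbering the single site begins at C99 (C99-G100-C101-U102), which is consistent with the abstract's statement that the 15-mer contains one CGCU element and that C101 and U102 lie in the contacted region.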
Experimental Procedures
Preparation of MBNL1 Protein Constructs
The DNA sequences
encoding human MBNL1-zf12 (residues 1–92), MBNL1-zf34 (residues
173–255), and MBNL1-zf1234 (residues 1–255) were isolated
by PCR amplification from a full-length human liver cDNA library (Mobitec)
and subsequently cloned into pET21a. Protein overexpression was performed
at 37 °C in transformed E. coli BL21 (DE3) (DNAY)
in minimal medium (48 mM Na2HPO4, 22 mM KH2PO4, 8.6 mM NaCl, 1 mM MgCl2, 0.1 mM
CaCl2, BME vitamin mix (Sigma), and trace metals) in the
presence of 100 mg/mL of carbenicillin and 30 mg/mL of kanamycin.
Stable isotope labeling was performed by adding 1 g/L of (15NH4)2SO4 as nitrogen source and
either 4 g/L of d-glucose (for 15N-labeled protein)
or 2 g/L of 13C d-glucose (for 13C, 15N labeled protein) as carbon source. Cells were grown to
A600 ≈ 0.8 at 37 °C supplemented with 0.150
mM ZnSO4 and induced with IPTG to a final concentration
of 0.5 mM. For expression of MBNL1-zf12 and MBNL1-zf1234, cells were
allowed to grow for 4 h after induction at 37 °C; whereas for
MBNL1-zf34, cells were grown overnight at 18 °C after induction.
Cells were harvested by centrifugation and frozen at −20 °C.
Frozen cell pellets were suspended in 20 mM Tris/HCl, 10 mM DTT pH
8.0 and lysed by sonication (Branson Sonifier 450) at 4 °C. MBNL1
protein constructs were mainly expressed in the soluble fraction and
purified under native conditions. Cell debris was removed by centrifugation,
and a streptomycin cut was performed to remove nucleic acids: 10 mL
of a freshly prepared 10% solution of streptomycin sulfate was added
to 100 mL of lysis supernatant, and the solution was stirred at room
temperature for 30 min. The suspension was clarified by centrifugation,
and the supernatant was dialyzed against 4 L 20 mM Tris pH 8, 0.5
mM DTT overnight at 4 °C. The supernatant was passed through
a 5 mL HiTrap Q column to separate the protein of interest from remaining
nucleic acids and other impurities and applied onto a 75 mL Sepharose
SP FF column pre-equilibrated with 20 mM Tris/HCl, 50 mM NaCl, 2 mM
DTT pH 8.0. Finally, fractions containing MBNL1 protein were pooled,
concentrated, and purified on a Superdex 75 XK16 gel filtration column
(20 mM Tris/HCl, 100 mM NaCl, 2 mM DTT pH 7.0). SDS-PAGE, analytical
HPLC, and MALDI mass spectrometry were used to verify identity, purity,
and isotope enrichment. Mutant proteins were produced following the
QuikChange mutagenesis protocol. All protein variants were prepared
and purified using the same protocol.
For all three protein
constructs, MBNL1-zf12, MBNL1-zf34, and MBNL1-zf1234, samples were
prepared for NMR by exchange into NMR buffer (20 mM Tris, pH 7.0,
50 mM NaCl, 2 mM DTT). All NMR experiments were performed at protein
concentrations of 0.1 mM (for HSQC titration experiments and NMR diffusion
experiments) and 0.4 mM (for assignments and structure determination).
Preparation of MBNL1 Protein–RNA Complexes
RNAs
were purchased as PAGE-purified oligonucleotides from IDT DNA (San
Diego). Samples were dissolved in water, desalted, and exchanged into
NMR buffer (20 mM Tris, pH 7.0, 50 mM NaCl, 2 mM DTT) with a NAP column.
The concentration was determined by UV spectroscopy using absorbance
at 260 nm. The product was freeze-dried until use. Before preparing
the protein–RNA complex, RNA was dissolved in NMR buffer to
a final concentration of ∼2 mM, heated to 90 °C for 3
min, and then snap-cooled in ice for 10–15 min. The concentrated
RNA was then added to fresh protein to prepare the complex.
Biolayer Interferometry
BLI experiments were carried
out on an OCTET RED 96 instrument with streptavidin sensors. 5′
Biotinylated RNAs (purchased from IDT) were suspended in 20 mM Tris
pH 7.2, 150 mM NaCl, and then RNA and protein stocks were further
buffer exchanged on a NAP desalting column in the same buffer. Protein
and RNA concentrations were determined by UV, and stocks were kept
frozen at −80 °C. BLI experiments were carried out at
28 °C. Sensors were loaded at 0.5 μM RNA concentration,
and kinetic experiments were performed at 40–200 nM protein
concentration in a final buffer consisting of 20 mM Tris pH 7.2, 300
mM NaCl, 1 mM DTT, 0.2 mg/mL of BSA. Binding curves were fit to a
1:1 binding model. Several repeat runs were performed with each sensor,
and experiments were performed at least in triplicate with different
sets of sensors. Average Kd values are
reported.
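The text states only that binding curves were fit to a 1:1 binding model. As a hedged illustration, the sketch below writes out the standard 1:1 (Langmuir) kinetic expressions that such a fit typically assumes; the function names and the rate constants are hypothetical and are not values from this work.

```python
import math

def kd_from_rates(k_on, k_off):
    """Equilibrium dissociation constant (M) for a 1:1 interaction: Kd = k_off / k_on."""
    return k_off / k_on

def association_response(t, conc, k_on, k_off, r_max):
    """Sensor response at time t (s) during association at analyte concentration conc (M)."""
    k_obs = k_on * conc + k_off                       # observed rate constant
    r_eq = r_max * conc / (conc + kd_from_rates(k_on, k_off))  # plateau response
    return r_eq * (1.0 - math.exp(-k_obs * t))

def dissociation_response(t, r_start, k_off):
    """Sensor response at time t (s) after transfer to buffer, starting from r_start."""
    return r_start * math.exp(-k_off * t)

# Hypothetical rate constants, chosen only so that Kd falls in the
# 40-200 nM protein concentration range used in the experiments.
K_ON = 1.0e5    # 1/(M s), assumed
K_OFF = 1.0e-2  # 1/s, assumed

print(kd_from_rates(K_ON, K_OFF))                          # Kd in M
print(association_response(1000.0, 100e-9, K_ON, K_OFF, 1.0))  # near half of r_max
```

In this model the plateau response reaches half of its maximum when the protein concentration equals Kd, which is why a concentration series bracketing Kd constrains the fit well.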
NMR Spectroscopy
NMR spectra were recorded at 25 °C
on Bruker Avance 500, 600, 800, and 900 MHz spectrometers. 1D spectra
were processed using Bruker XWINNMR or TopSpin software, whereas multidimensional
spectra were processed with NMRPipe/NMRDraw software.[28] The processed NMR spectra were analyzed using Sparky[29] or NMRView.[30] Backbone
resonances were assigned using 3D HNCA, HNCACB, CBCA(CO)NH, and HNCO
experiments[31] and correlated back to 15N HSQC and 13C CT-HSQC spectra. Dihedral angle
restraints ϕ and ψ were calculated with TALOS[32,33] using backbone chemical shifts (13CO, 13Cα, 1Hα, 1HN, 15NH). In addition, ϕ
was experimentally determined using coupling constants derived from
an HNHA spectrum.[34] Aliphatic side chain
assignments were compiled from (H)C(CO)NH-TOCSY, H(C)(CO)NH-TOCSY,
and HCCH–COSY spectra,[31] and aromatic
side chains were assigned by (HB)CB(CGCD)HD and (HB)CB(CGCDCE)HE experiments.[35] Distance restraints were obtained from a 15N-edited NOESY-HSQC spectrum collected in water at 600 MHz
with 175 ms mixing time and a 13C-edited HMQC-NOESY spectrum
in D2O at 900 MHz with 150 ms mixing time. Coupling constants
from HNHB,[36] HACAHB-COSY,[37] 13C-[13CO], and 13C-[15N] spin–echo difference CT-HSQC spectra[38,39] were determined to provide stereospecific assignments for Hβ
protons and to generate χ1 torsion angle restraints
for structure calculations. Zinc-coordinating His residues and their
tautomeric state were identified by 1H–15N HMQC experiments.[40]
The single-stranded
RNA hcTnT15 was not isotopically labeled. The resonances of the portion
of the RNA that was bound to the protein were assigned by 1H 2D methods employing the filtered NOESY spectrum[41] of the complex of hcTnT15 with uniformly 15N,13C-labeled MBNL1-zf12, in which only NOEs between 12C-attached 1H resonances were detected. Because the protein
was uniformly labeled, only the NOEs between the RNA protons were
detected in this experiment. The majority of the RNA is unstructured
such that the resonances significantly overlapped. Only the resonances
of the bound nucleotides are shifted outside this envelope sufficiently
for sequential assignments to be made. Standard NOE connectivities
were used to identify nearest-neighbor nucleotides, and the assignments
were verified by the correlation of the intermolecular protein–RNA
NOEs (listed in Table S1) obtained from
an edit-filtered spectrum.[42]
1D diffusion
experiments were recorded using the PG-SLED gradient
echo pulse program[43] at 298 K on a Bruker
900 MHz spectrometer with 4096 complex data points. For each sample,
a set of 12 experiments was recorded with varying gradient strengths;
the last 5 experiments were repeated for consistency and error calculations.
The length of the pulsed field gradient (PFG) was kept the same, and
the amplitude was stepped up from 15 to 70% of the maximum gradient
strength in steps of 5%. As the strength of the PFG pulse was increased,
the signal amplitude was attenuated due to translational diffusion.
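The attenuation described above can be analyzed with a simple log-linear fit. The sketch below assumes the commonly used Gaussian form I(g) = I0·exp(−d·g²), in which the decay rate d is proportional to the translational diffusion coefficient when the gradient length and diffusion delay are fixed. The function names and the synthetic intensities are illustrative assumptions, not data from this work.

```python
import math

def gradient_steps(start=15, stop=70, step=5):
    """Gradient amplitudes (% of maximum) as described: 15 to 70% in steps of 5%."""
    return list(range(start, stop + 1, step))

def fit_decay_rate(gradients, intensities):
    """Least-squares line through ln(I) versus g^2.

    Returns (d, i0) for the assumed model I(g) = i0 * exp(-d * g^2).
    """
    x = [g * g for g in gradients]
    y = [math.log(i) for i in intensities]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)

# Synthetic, noise-free intensities generated from an assumed decay rate.
grads = gradient_steps()
TRUE_D = 4.0e-4
signal = [100.0 * math.exp(-TRUE_D * g * g) for g in grads]

d_fit, i0_fit = fit_decay_rate(grads, signal)
print(len(grads), d_fit, i0_fit)
```

Because d scales with the diffusion coefficient, the ratio of decay rates measured for two samples under identical gradient settings gives the ratio of their diffusion coefficients, which is the quantity of interest when comparing free and bound species.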
Structure Calculation
Protein structures were calculated
in a semiautomated iterative manner by using CYANA version 2.1[44] with distance restraints derived from 15N- and 13C-edited NOESY spectra.[42] The NOESY peak list used as input for CYANA analysis was generated
automatically using Sparky[29] based on the
chemical shift list generated in the assignment process. Additional
input for initial structure calculations included stereospecifically
assigned Hβ and prochiral methyl protons derived from coupling
constants as well as zinc ligand restraints. Backbone dihedral angle
restraints were obtained from chemical shifts using TALOS and from
a 3J-HNHA experiment. After the first few rounds of calculations,
the spectra were analyzed again to identify additional cross-peaks
consistent with the structural model and to correct mis-assigned NOE
restraints.
Slowly exchanging amides were identified by lyophilizing
the protein from H2O and then dissolving it in 100% D2O; hydrogen bond donors were identified by the presence of
an amide peak in the HSQC recorded after 30 min. The corresponding
acceptors were identified from CYANA structures calculated without
any hydrogen bonding constraint to identify carbonyl groups that were
at a distance of approximately 2.0 Å from slowly exchanging amides.
Hydrogen bonding constraints were then added at this stage. When the
CYANA automated structure calculation gave an average target function
of 0.7 Å, the distance and angle constraints were used to generate
100 starting structures by following the REDAC strategy of CYANA2.1.
Refinement of these starting structures was accomplished by molecular
dynamics calculations using AMBER 8 with reduced charges in vacuo.
Zinc coordination was restrained to be tetrahedral, and zinc ions
were covalently attached to the ligand residues. Iterative structure
refinement was performed by modification of the distance restraint
list by identification of ambiguous or incorrectly assigned distance
restraints using the original NOESY data. The latter was applied for
final refinement in AMBER 8[45] incorporating
a generalized Born solvation model.[46] The
20 best structures were selected to represent the ensemble of converged
MBNL1-zf12 and MBNL1-zf34 structures. Experimental data and structural
statistics are summarized in Table 1. The quality of the structure was evaluated using
PROCHECK-NMR.[47]
Table 1
Experimental Restraints and Structure Statistics for the 20 Lowest Energy Structures of MBNL1 Proteins and RNA Complex
| | MBNL1-zf12 | MBNL1-zf34 | MBNL-zf12/hcTnT15 (protein / RNA) |
|---|---|---|---|
| (A) Distance Constraints | | | |
| total unambiguous NOE restraints | 1779 | 1095 | 1003 / 6 |
| short-range (\|i – j\| ≤ 1) | 894 | 544 | 377 |
| medium-range (1 < \|i – j\| < 5) | 333 | 225 | 254 |
| long-range (\|i – j\| ≥ 5) | 552 | 326 | 372 |
| intermolecular | | | 71 |
| total dihedral angle restraints (a) | 74 | 102 | 128 / 79 |
| Φ | 47 | 59 | 68 |
| Ψ | 27 | 43 | 60 |
| hydrogen bond constraints | 18 | 18 | 56 / 6 |
| (B) Structure Statistics | | | |
| violations analysis | | | |
| maximum distance violation (Å) | 0.12 | 0.18 | 0.30 |
| maximum angle violation (deg.) | 0.00 | 0.84 | 0.82 |
| energies | | | |
| mean AMBER energy (kcal/mol) | –4825 | –4097 | –5887 |
| mean restraint violation energy (kcal/mol) | 3.25 | 9.14 | 6.65 |
| average pairwise rmsd (b) | | | |
| backbone (Å) | 0.55 | 0.60 | 0.37 |
| heavy (Å) | 1.32 | 1.38 | 0.98 |
| (C) Ramachandran Statistics (c) | | | |
| allowed regions | 81.1% | 81.1% | 83.1% |
| additionally allowed regions | 18.6% | 18.8% | 16.3% |
| generously allowed regions | 0.2% | 0.1% | 0.6% |
| disallowed regions | 0.1% | 0.0% | 0.0% |
(a) Obtained from Cα, Cβ, and C chemical shifts using TALOS.
(b) Pairwise rmsd was calculated among 20 refined structures for structured region residues 10–85 for MBNL1-zf12 free and in the RNA complex and residues 179–251 for MBNL1-zf34.
(c) Obtained from PROCHECK.[47]
Structure
calculation for the complex of MBNL-zf12 with hcTnT15
RNA used a similar method with the RNA modeled as single-stranded.
Two-hundred starting structures of MBNL1-zf12 were generated by using
CYANA as described above and then were docked to RNA during the refinements
using AMBER 11.[48] An RNA model was generated
by using XLEAP, incorporating loose restraints for the RNA backbone
torsion angles (−120° to 120° for α, 140°
to 220° for β, 20° to 100° for γ, 55°
to 190° for δ, 170° to 300° for ε, and
−120° to 120° for ζ). Because of the paucity
of experimental RNA restraints, these torsion angle restraints were
added to penalize the formation of energetically unfavorable RNA backbone
conformations.[49,50] The modeled RNA was added to
each starting protein structure at a random position ∼30 Å
from the protein structure. The docking was achieved through seven
cycles of simulated annealing in AMBER 11 using a generalized Born
solvent model. Intermolecular restraints were not used for the first
cycle while the system was heated to 1000 K for 6 ps and then annealed
by cooling to 0 K for 14 ps. For the remaining cycles, intermolecular
distance restraints were gradually decreased from 50 to 15 to 10 Å
to final values obtained from experiments while the system was heated
to 400 K for 6 ps and then annealed by cooling to 0 K for 28 ps. During
these cycles, the restraint energy for intermolecular distances was
increased from 0 to 1 to 10 to 20 to 30 kcal/(mol·Å2). The majority of intermolecular distances involved protein pseudoatoms
and long upper distance bounds. As a consequence, it was difficult
to define the protein–RNA interface, and many cycles of restrained
molecular dynamics were required to generate a converged family of
structures that satisfied all of the experimental restraints. The
F54 amide cross peak is strongly downfield shifted in the RNA complex,
as observed for the equivalent residue in TIS11d, which is shifted
by hydrogen bond formation with an adenine base.[11] The experimental NOE restraints were more uniformly satisfied
and the structures converged better when an artificial distance restraint
was imposed between the C101 N3 and the F54 NH. In the resulting subset
of structures that best satisfied the restraints in this region, the
N4 and O2 atoms of C101 were consistently within hydrogen-bonding
distance to the A52 CO and D55 NH atoms. These hydrogen bonds were
then introduced as intermolecular restraints in the final cycles of
simulated annealing. After the final cycle of restrained simulated
annealing with all restraints active, the 20 structures with the lowest
AMBER energy were selected to represent the ensemble of the complex
of MBNL1-zf12 with hcTnT15. The structural statistics are summarized
in Table 1.
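The loose RNA backbone torsion windows listed above can be checked programmatically, provided that angle periodicity is handled (for example, an ε of −90° is equivalent to 270° and therefore lies inside 170° to 300°). The following is a minimal sketch under that assumption; the dictionary keys, function names, and example angles are hypothetical and do not reproduce the AMBER restraint format.

```python
# Torsion windows (degrees) as quoted in the text for the single-stranded RNA model.
RNA_TORSION_WINDOWS = {
    "alpha": (-120.0, 120.0),
    "beta": (140.0, 220.0),
    "gamma": (20.0, 100.0),
    "delta": (55.0, 190.0),
    "epsilon": (170.0, 300.0),
    "zeta": (-120.0, 120.0),
}

def in_window(angle, low, high):
    """True if angle lies in [low, high], treating angles as periodic in 360 degrees."""
    return (angle - low) % 360.0 <= (high - low)

def violated_torsions(torsions):
    """Return the sorted names of torsions that fall outside their window."""
    return sorted(
        name
        for name, angle in torsions.items()
        if not in_window(angle, *RNA_TORSION_WINDOWS[name])
    )

# Hypothetical nucleotide conformation used only to exercise the check.
example = {"alpha": -60.0, "beta": 180.0, "gamma": 60.0,
           "delta": 80.0, "epsilon": -150.0, "zeta": 180.0}
print(violated_torsions(example))
```

In this example only ζ = 180° falls outside its window; ε = −150° is accepted because it is equivalent to 210°.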
Results
Domain Structure and Zinc Binding
Human muscleblind
proteins contain four CCCH zinc finger motifs arranged as two tandem
zinc fingers (TZFs) separated by an 80 residue linker (Figure 1A). Initial constructs of MBNL1-zf12
and MBNL1-zf34 containing residues 1–84 and 173–249, respectively, were
designed on the basis of sequence alignment (Figure 1B) to coincide with the boundaries of the
tandem zinc finger domain of the previously characterized prototype
CCCH zinc finger protein TIS11d.[11] Constructs
of a similar size (residues 9–90 for zf12 and 178–246 for zf34) were also
used in the published X-ray crystallographic study.[25] NMR spectra of the 1–84 and 173–249 constructs were of
poor quality, and analysis of the chemical shift index[51] and [1H]-15N heteronuclear
NOE showed the presence of helical structure that extended to the
C-terminal residues, 84 in zf2 and 249 in zf4. Extended constructs
for MBNL1-zf12 (containing residues 1–92, 1–104, and 1–129) showed better solution
behavior. Comparison of the CD and HSQC spectra of these constructs
showed that residues beyond 92 were unstructured. For MBNL1-zf34, the
C-terminus was extended to the end of the zinc finger domain (residues 173–255), which gave satisfactory
NMR spectra. These experiments establish that secondary structure
in both MBNL1 TZF pairs extends beyond the boundaries of the previously
characterized CCCH TZF protein, TIS11d, and that the constructs used
in the MBNL1 X-ray crystal structures[25] were likely too short. The MBNL1 constructs containing residues
1–92 (MBNL1-zf12), and residues 173–255 (MBNL1-zf34)
were used in all subsequent experiments.
A construct containing
all four zinc fingers of MBNL1 (MBNL1-zf1234, residues 1–255) was also prepared. Comparison
of the 1H–15N HSQC spectra of the individual
two-finger constructs MBNL1-zf12 (Figure 2A) and MBNL1-zf34 (Figure 2B) with that of the four-finger construct
(Figure 2C) revealed
an almost perfect overlap of common cross peaks (Figure 2D), indicating that the zf12
and zf34 motifs fold autonomously and that there are no long-range
interactions between them. Cross peaks of residues in the long linker
between the two tandem zinc finger domains (black cross peaks visible
in Figure 2D) are sharp
and appear in the random coil chemical shift region, indicating that
they belong to an unstructured and flexible linker. Addition of EDTA
induces unfolding of MBNL1, resulting in loss of cross peak dispersion
in the 1H–15N HSQC spectrum, demonstrating
that zinc is essential for stabilizing the native fold of the MBNL1
CCCH zinc fingers.
Figure 2
1H–15N HSQC spectra of MBNL
zinc finger
constructs (A) MBNL1-zf12 (residues 1–92, red), (B) MBNL1-zf34
(residues 173–255, green), (C)
MBNL1-zf1234
(residues 1–255, black), and
(D) overlay
of the three spectra in (A–C), showing that the cross peaks
corresponding to the individual zinc finger pairs exactly superimpose
on the corresponding cross peaks for the full-length construct. Cross
peaks assigned to the zinc ligand residues are labeled in parts (A)
and (B).
The sequence of the human muscleblind
proteins is highly conserved
with little variation in the zinc finger region and identical spacing
between zinc ligands in zinc fingers 1 and 3 and zinc fingers 2 and
4, respectively (Figure 1B). However, an extra cysteine residue is conserved at position 43
between zf1 and zf2 (Figure 1B, pink). To rule out the possibility that this residue plays
a role in zinc binding, the mutants C43A, C43S, and C43V were expressed.
The 1H–15N HSQC spectra of these variants
showed only minor differences from the wild-type spectrum, confirming
that Cys43 is not a zinc ligand.
The zinc-coordinating histidine
residues and their tautomeric state
were identified from 2JHN 1H–15N HMQC spectra using 15N chemical shift analysis
and relative intensities of cross peaks arising from two- and three-bond
couplings between nonexchangeable histidine ring protons Hδ2 and Hε1.[40] These spectra
show that the zinc-binding histidines are His38, His70, His204, and
His238, which bind zinc through Nε2 in zinc fingers
1–4, respectively, whereas His74 (end of finger 2, brown in Figure 1B) and His242 (end
of finger 4, orange in Figure 1B) are not zinc ligands, consistent with the sequence alignment.
Structure of MBNL1-zf12 and MBNL1-zf34 Domains
The
structures of the free MBNL1-zf12 and MBNL1-zf34 domains were determined
by multidimensional NMR spectroscopy. Backbone assignments were made
using triple resonance experiments, and side chain assignments were
completed using 3D heteronuclear-edited TOCSY and COSY experiments.[52] Chemical shift analysis of MBNL1 tandem zinc
fingers showed that, unlike the TIS11d TZF,[11] the region between the zinc fingers in each pair contains secondary
structure in the free state.
Automated NOE assignment and structure
calculation were performed in several rounds of CYANA version 2.1[44] using 2895 NOEs (1779 unambiguous) in the case
of MBNL1-zf12 and 1849 NOEs (1095 unambiguous) in the case of MBNL1-zf34
as well as dihedral angle restraints and hydrogen bond restraints.
Once structural convergence was achieved and the CYANA target function
was less than 1.0, 200 CYANA structures were generated and subsequently
refined by restrained molecular dynamics-simulated annealing using
the AMBER 8 software package.[45] The 20
lowest-energy NMR structures are shown in Figure , and experimental restraints and structural
statistics are summarized in Table . The high quality of the structures was reflected
in the favorable Ramachandran statistics as well as the tight convergence
of both structures with average backbone RMSD of 0.55 and 0.60 Å
in the case of MBNL1-zf12 and MBNL1-zf34, respectively.
Figure 3
NMR solution
structures of free MBNL1-zf12 (A, B) and MBNL1-zf34
(C, D). (A) Family of 20 lowest-energy structures of free MBNL1-zf12
superimposed on the well-structured region of the protein from residues
Thr11 to Asn85. The disordered N- and C-termini have been omitted
for clarity. (B) Ribbon representation of the lowest-energy structure
of the ensemble shown in (A), showing the ligation of the two zinc
centers. (C) Family of 20 lowest-energy structures of free MBNL1-zf34
superimposed on the well-structured region of the protein from residues
Thr179 to Gln251. The disordered N- and C-termini have been omitted
for clarity. (D) Ribbon representation of the lowest-energy structure
of the ensemble shown in (C), showing the ligation of the two zinc
centers (gray). In each panel, the backbone is colored from blue at
the N-terminus to red at the C-terminus. In parts (B) and (D), yellow
spheres represent the sulfur atoms of the ligating cysteines, and
the blue sphere represents the Nε of the ligating histidine.
MBNL1-zf12 adopts a folded structure
from Thr11 to Asn86, and MBNL1-zf34
is well-structured between residues Thr179 and Gln251; residues outside
these regions are disordered. The ligation pattern for zinc fingers
1 and 3 is C(X)7C(X)6C(X)3H, extending
from residue Cys19 to His38 and residues Cys185 to His204, respectively,
and the pattern for zinc fingers 2 and 4 is C(X)7C(X)4C(X)3H, extending from residue Cys53 to His70 and
residues Cys221 to His238, respectively. Each of the four zinc fingers
is stabilized by binding of zinc through the Sγ atoms
of three cysteine residues and the Nε2 atom of one
histidine. The zinc-coordinating residues are Cys19, Cys27, Cys34,
and His38 (zf1); Cys53, Cys61, Cys66, and His70 (zf2); Cys185, Cys193,
Cys200, and His204 (zf3); and Cys221, Cys229, Cys234, and His238 (zf4).

Each of the four individual zinc fingers of MBNL1 is similar in
structure with a short helix between the first and second cysteine
of each finger. The structured portions of the individual zinc fingers
superimpose well with the corresponding portions of the published
X-ray crystal structures (Figure A, B), although the C-terminal helix of zf34 is truncated
in the X-ray structure (Figure B). The individual zinc fingers of MBNL1 also superimpose
well with the individual TIS11d zinc fingers[11] (Figure C), but
whereas each pair (zf1-zf2; zf3-zf4) of MBNL1 zinc fingers forms a
compact domain, the TIS11d zinc fingers are independent and separated
by a long and flexible linker (Figure D). The linker between the MBNL1 zf1-zf2 zinc finger
pairs forms a short β-hairpin (Cys43 to Ala52 in zf12), which
together with a third β-strand (residues Thr15-Glu17 in zf12)
participates in a three-stranded β-sheet stabilized by packing
against the C-terminal α-helix (Pro73 to Arg84 in zf12). Similar
interactions are seen for the linker between MBNL1 zf3-zf4 (β-hairpin
Ile211 to Thr219; third β-strand Arg181-Glu183; helix packing
Ala241-Ala250). In contrast, the linker between fingers 1 and 2 of
TIS11d is fully extended to allow binding of the individual fingers
to their cognate RNA recognition motifs, and the TIS11d zinc fingers
lack the C-terminal helix found in MBNL1.[11] The zf12 and zf34 domains of MBNL1 thus adopt an unusual fold in
which the two CCCH zinc fingers are packed together to form a compact
globular structure with an extensive hydrophobic core formed by the
residues of the C-terminal α-helix and of the β-sheet
between the two zinc fingers.
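The ligation patterns quoted above for the four zinc fingers can be checked directly against the listed zinc-coordinating residues. The following is a minimal sketch, not part of the original analysis: the residue numbers are those given in the text, while the dictionary and function names are hypothetical and chosen only for illustration.

```python
# Zinc-coordinating residue numbers as listed in the text
# (three cysteines followed by one histidine for each finger).
ZINC_LIGANDS = {
    "zf1": (19, 27, 34, 38),
    "zf2": (53, 61, 66, 70),
    "zf3": (185, 193, 200, 204),
    "zf4": (221, 229, 234, 238),
}


def ligation_pattern(ligands):
    """Build a C(X)nC(X)mC(X)kH string from four ligand residue numbers."""
    # The number of intervening residues between two consecutive ligands
    # is the difference in residue number minus one.
    gaps = [b - a - 1 for a, b in zip(ligands, ligands[1:])]
    return "C(X){}C(X){}C(X){}H".format(*gaps)


patterns = {name: ligation_pattern(lig) for name, lig in ZINC_LIGANDS.items()}
for name, pattern in patterns.items():
    print(name, pattern)
```

Under this bookkeeping, fingers 1 and 3 share one spacing and fingers 2 and 4 share another, which agrees with the patterns stated in the text.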
Figure 4
Comparison of the structures of MBNL1-zf12 and
MBNL1-zf34 with
similar published structures. (A) Overlay of the lowest-energy structure
of MBNL1-zf12 (structured region, residues 11–85 only; blue) with the
published crystal structure of free zf12 (residues 11–86; pink) (PDB ID: 3D2N).[25] (B) Overlay of the lowest-energy structure of MBNL1-zf34
(structured region, residues 180–252; green) with the published
crystal structure of free zf34 (residues 180–245; gold) (PDB ID: 3D2Q).[25] (C) Overlay of MBNL1-zf1 (residues 18–38; blue) with zf1 of TIS11d
(residues 158–178; coral).[11] (D) Ribbon representation of the backbone of
TIS11d in complex with an RNA sequence representing the 3′-UTR
of a target mRNA,[11] showing the separation
of the two zinc finger domains.
Identification of a Minimal RNA Binding Site
Cross-linking
experiments identified a common motif YGCU(U/G)Y for MBNL1 binding
in the human cardiac troponin T pre-mRNA, where Y is a pyrimidine.[14] A schematic diagram of the intron 4-intron 5
region of hcTnT pre-mRNA, together with the full nucleotide sequence
of the region that interacts with muscleblind protein, is shown in Figure S1. In hcTnT RNA, the pyrimidine 5′
to the GCU consensus is a cytidine, making the RNA binding sequence
CGCU(U/G)Y. There are two CGCU binding sites located in the
human cTnT RNA in the 3′ region of intron 4.[20] The binding of MBNL1-zf12 and zf34 constructs to RNA constructs
of different lengths containing the CGCU(U/G)Y motif was tested
by biolayer interferometry (Octet) and NMR.

A qualitative assessment
of the strength of binding of various RNAs can be made using HSQC
titrations of 15N-labeled MBNL1-zf12 and zf34. Formation
of a high affinity complex results in slow-exchange shifts of numerous
cross peaks and relatively uniform peak intensities, and weaker binding
leads to exchange broadening and smaller changes in the chemical shift
for residues in the RNA binding sites. For MBNL1-zf1234, the addition
of RNA caused extensive resonance broadening in the 15N
HSQC spectra even at low RNA:protein ratios. This result indicates
that the affinity of zf1234 for the RNA is high, but that the presence
of two similar binding sites on both the protein and RNA likely results
in intermediate exchange between alternative combinations of the protein
and RNA sites. The 1:1 stoichiometry
of the interaction
between MBNL1-zf1234 and hcTnT32 is seen in the 1D spectra shown in Figure . Representative
data for slow-exchange binding are shown in Figure (complete HSQC spectra are shown in Figure S2), and the results are summarized in Table .
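The consensus motif YGCU(U/G)Y described at the start of this section can be located in an RNA sequence with a simple pattern search. The sketch below is illustrative only and assumes that Y means C or U. The wild-type sequence is hcTnT15 as given in the text, and the mutant sequence is derived here from the Table 2 footnote (U103 replaced with G). The function name and the numbering offset of 95 for the first nucleotide are assumptions made for this example.

```python
import re

# Consensus YGCU(U/G)Y with Y = pyrimidine (C or U in RNA).
# A lookahead is used so that overlapping matches would also be reported.
MOTIF = re.compile(r"(?=([CU]GCU[UG][CU]))")


def find_motifs(rna, first_number=1):
    """Return (nucleotide number, matched hexamer) for each consensus hit."""
    return [(m.start() + first_number, m.group(1)) for m in MOTIF.finditer(rna)]


# hcTnT15 as quoted in the text; numbering assumed to start at G95.
hits_wt = find_motifs("GUCUCGCUUUUCCCC", first_number=95)
# Mutant with U103 replaced by G, derived from the Table 2 footnote.
hits_mut = find_motifs("GUCUCGCUGUUCCCC", first_number=95)

print(hits_wt)
print(hits_mut)
```

Both sequences contain a single consensus hit beginning at C99; they differ only at the (U/G) position of the motif.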
Figure 7
One-dimensional spectrum of the region 10–15 ppm
containing
the imino resonances of hydrogen-bonded RNA. Free hcTnT32 contains
several imino peaks (top spectrum). No imino resonances are present
in the spectrum of the free protein (bottom spectrum) or in the presence
of less than 1:1 mol ratio of hcTnT32 RNA. Above 1:1, the imino resonances
of the RNA are present in the spectrum, indicating that the excess
RNA contains hydrogen-bonded secondary structure.
Figure 5
Binding of
the 15 nucleotide RNA hcTnT15 (5′-GUCUCGCUUUUCCCC-3′) to MBNL
two-finger constructs, shown by superposition of a portion of the 1H–15N HSQC spectra of (A) MBNL1-zf12 (black)
with that following the addition of a 1:1 mol ratio of hcTnT15 (red) and
(B) MBNL1-zf34 (black) with that following the addition of a 1:1 mol ratio of hcTnT15 (red).
Table 2
RNA Constructs and Binding Affinity (a)
(a) The nucleotide sequence is from
the intron 4 sequence of human cardiac troponin T (hcTnT) (Figure S1). The shaded box denotes the sequence
used in the solution structure determination of the RNA complex of
zf12. The numbers correspond to the nucleotide labels in Figure . Affinities estimated
by NMR are denoted +++ for binding in slow exchange, ++ for binding
in intermediate exchange, and + for binding in fast exchange. Additional
+ signs in brackets indicate that the spectrum shows a mixture of
cross peaks with different exchange behaviors. nd: not determined.
hcTnT15m is identical to hcTnT15 except that U103 is replaced with
G. hcTnT14m has the same sequence as hcTnT10 with the addition of
four extra nucleotides.
Figure 9
Protein–RNA contacts in the complex between MBNL1 zf12 and
hcTnT15. (A) Hydrogen-bonding interactions between C101 and the backbone
amide protons of Phe54 and Asp55, and the CO of Ala52, and hydrophobic
interactions between U102 and the side chains of Leu14 and Ile51,
and between C101 and the side chain of Tyr68. (B) Hydrogen-bonding
interaction between U102 and the backbone amide proton of Asp10 observed
in many of the NMR structures. (C) Hydrogen-bonding interaction between
U103 and one of the Asp86 side chain amide protons. The U103 base
is stacked between the hydrophobic side chains of Trp13 and Leu87.
The NMR experiments clearly show
that the tandem zinc finger domains
bind to the CGCU motifs and that high affinity binding requires a
minimum of 9–10 nucleotides.
Shorter RNA
constructs, such as CGCUG, bind only very weakly, showing that nucleotides
outside the CGCU motif also interact with the protein. Substitution
of the G in the CGCU motif by C severely impairs zf12 binding, showing
that this nucleotide makes critical interactions with the protein.
The NMR spectra in Figure show large changes in chemical shift in the presence of RNA
for resonances that belong to zf2 and zf4. Initial experiments with
hcTnT10 and hcTnT15, representing the 3′ and 5′ CGCU
sequences of the hcTnT RNA, showed additional broadening and disappearance
of resonances belonging to zf1 and zf3, suggesting the presence of
a secondary interaction site in this region. This effect is especially
notable for the amide resonances of Arg20 and Arg160 in zf1 and zf3,
respectively. However, in experiments performed using zf12, the Arg20
resonance reappeared close to its position in the free protein spectrum
when the temperature was increased from 25 to 37 °C. Furthermore,
no broadening of the Arg20 cross peak was observed upon binding of
zf12 to a truncated RNA (hcTnT13), lacking the two 5′ nucleotides
of hcTnT15. These results imply that the interaction that causes the
broadening and disappearance of the zf1 resonances (and probably those
of zf3 in the zf34 complexes) is a secondary interaction involving
nucleotides on the 5′-side of the CGCU motif.

NMR titration
experiments show that a single tandem zinc finger
domain (zf12 or zf34) is sufficient for RNA binding activity. Titration
of MBNL1-zf12 with hcTnT10 and hcTnT15, each of which contains a single
CGCU sequence, shows 1:1 binding stoichiometry,
with no
further chemical shift changes in the 1H–15N HSQC spectra at protein:RNA ratios higher than 1:1. On the other hand, hcTnT27 and
hcTnT32, which contain two CGCU motifs, bind two mole equivalents
of zf12. Titration of MBNL1-zf1234 protein with hcTnT32 RNA also shows
a 1:1 binding stoichiometry, confirming
that each pair of zinc fingers binds to one CGCU unit. Previous studies
have shown that the RNA binding affinity is enhanced in constructs
containing both zf12 and zf34.[23,27,53]

Biolayer interferometry was used to obtain estimates of binding
affinity for MBNL1-zf12 and showed that bases flanking the CGCU motif
influence the specificity of the interaction. The Octet measurements
(traces shown in Figure S3) gave Kd values of 130 nM for binding of MBNL1-zf12
to the 5′ CGCU motif in hcTnT15 (5′-GUCUCGCUUUUCCCC-3′)
and 500 nM for binding to a mutant RNA sequence where the U following
the CGCU sequence is changed to G (5′-GUCUCGCUGUUCCCC-3′).
These results demonstrate that, in this system, the consensus RNA
sequence YGCU(U/G)Y[14] shows a preference
for CGCUuY over CGCUgY.
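To put the two Kd values in energetic terms, the fold difference in affinity can be converted to a difference in binding free energy using ΔG = RT ln Kd. The sketch below is a worked calculation only. The temperature of 298.15 K is an assumption (the temperature of the Octet measurements is not stated in this section), and the function and variable names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant.

    Uses dG = RT ln(Kd) with a 1 M standard state; the default temperature
    is an assumption, not a value taken from the text.
    """
    return R_KCAL * temperature_k * math.log(kd_molar)


kd_u = 130e-9  # Kd for hcTnT15 (U after CGCU), from the text
kd_g = 500e-9  # Kd for the mutant (G after CGCU), from the text

fold = kd_g / kd_u
ddg = binding_free_energy(kd_g) - binding_free_energy(kd_u)

print(f"fold weaker binding: {fold:.2f}")
print(f"ddG (mutant - wild type): {ddg:.2f} kcal/mol")
```

Under these assumptions, the roughly fourfold difference in Kd corresponds to a free energy difference of somewhat less than 1 kcal/mol, consistent with a preference rather than an absolute requirement for U at this position.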
Confirmation of the Binding Stoichiometry
by NMR Diffusion Measurements
During the course of this work,
a structural study of the MBNL
zinc fingers[25] reported apparent trimer
or dimer structures for free MBNL1-zf12 and MBNL1-zf34 (shorter constructs
than ours). The structure of a complex between MBNL1-zf34 and the
RNA sequence 5′-CGCUGU-3′ based on hcTnT mRNA
showed MBNL1-zf34 protein as a dimer with a complex interaction between
three RNA strands and three zinc fingers and with a fourth zinc finger
involved in protein–protein contacts.[25] Because our studies showed the free zf12 and zf34 proteins to be
monomeric and our titration experiments indicated that one CGCU motif
binds to one pair of zinc fingers, it became essential for us to establish
the oligomerization state of the free MBNL1 protein constructs and
that of the protein–RNA complexes. Further information on the
state of the proteins in solution and the stoichiometry of RNA binding
was provided by pulsed field gradient (PFG) NMR diffusion experiments[43] carried out under the same buffer conditions,
temperature, and sample concentrations as the structure determination
and RNA titrations. The results of the NMR diffusion experiments are
shown in Table . The
molecular weights obtained for the free MBNL1-zf12 and zf34 proteins
are very close to the theoretical values expected for monomers. Similarly,
the molecular weights calculated for complexes of MBNL1-zf12 + hcTnT15
and MBNL1-zf34 + hcTnT15 confirm the 1:1 stoichiometric ratio inferred
from the NMR titration and Octet results. Also consistent with this
finding, the molecular weight of the MBNL1-zf34 + hcTnT32 RNA complex
was calculated from diffusion measurements as 25.3
± 0.9 kDa, which is in good
agreement with the theoretical molecular weight of two MBNL1-zf34
proteins binding to one hcTnT32 RNA molecule.
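The stoichiometry argument can be made explicit by comparing a measured mass with the masses expected for different numbers of bound proteins. The following is a rough sketch under stated assumptions: the protein masses and the theoretical mass of the 1:1 zf12 + hcTnT15 complex are taken from Table 3, whereas the per-nucleotide mass and the resulting hcTnT32 mass are estimated here from those numbers and are not values reported in the text. All names are hypothetical.

```python
# Theoretical masses (kDa) from Table 3.
ZF12_KDA = 10.55
ZF34_KDA = 9.58
ZF12_HCTNT15_KDA = 15.16  # theoretical 1:1 complex

# Assumption: average mass per nucleotide estimated from the 15-nt RNA,
# then scaled to the 32-nt RNA. These are estimates, not reported values.
PER_NT_KDA = (ZF12_HCTNT15_KDA - ZF12_KDA) / 15
HCTNT15_KDA = 15 * PER_NT_KDA
HCTNT32_KDA = 32 * PER_NT_KDA


def expected_mass(n_proteins, protein_kda, rna_kda):
    """Mass of a complex of n proteins bound to one RNA molecule."""
    return n_proteins * protein_kda + rna_kda


def closest_stoichiometry(measured_kda, protein_kda, rna_kda, max_proteins=4):
    """Number of proteins per RNA whose expected mass is nearest the measurement."""
    return min(
        range(1, max_proteins + 1),
        key=lambda n: abs(expected_mass(n, protein_kda, rna_kda) - measured_kda),
    )


best_hctnt15 = closest_stoichiometry(13.5, ZF34_KDA, HCTNT15_KDA)
best_hctnt32 = closest_stoichiometry(25.3, ZF34_KDA, HCTNT32_KDA)
print(best_hctnt15, best_hctnt32)
```

With these inputs, the measured masses are closest to one zf34 per hcTnT15 and two zf34 per hcTnT32, in line with the interpretation given in the text.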
Table 3
Diffusion Coefficients and Molecular Weight Calculated by NMR PFG Diffusion Experiments
sample | diffusion coefficient (× 10⁻⁷ cm²/s) | molecular weight (kDa) | theoretical molecular weight (kDa)
MBNL1-zf12 | 11.7 ± 0.5 | 9.2 ± 0.4 | 10.55
MBNL1-zf34 | 11.6 ± 0.4 | 9.5 ± 0.3 | 9.58
MBNL1-zf12 + hcTnT15 | 10.3 ± 0.5 | 13.3 ± 0.7 | 15.16
MBNL1-zf34 + hcTnT15 | 10.3 ± 0.3 | 13.5 ± 0.4 | 14.18
MBNL1-zf34 + hcTnT32 | 8.4 ± 0.3 | 25.3 ± 0.9 | 29.14
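As a consistency check on the table, one common approximation relates diffusion coefficient and mass for compact, roughly spherical particles of similar density: D scales as M^(-1/3), so that M_sample ≈ M_ref (D_ref / D_sample)^3. The sketch below applies this scaling using free MBNL1-zf34 as the reference. This is an assumption made for illustration; the calibration actually used to obtain the tabulated molecular weights is not described in this section, and the function name is hypothetical.

```python
def mass_from_diffusion(d_sample, d_reference, mass_reference_kda):
    """Estimate a mass from relative diffusion coefficients.

    Assumes Stokes-Einstein behaviour for compact spheres of equal density,
    for which D is proportional to M**(-1/3). Units of D cancel in the ratio.
    """
    return mass_reference_kda * (d_reference / d_sample) ** 3


# Reference: free MBNL1-zf34 (D = 11.6, measured mass 9.5 kDa in Table 3).
zf34_hctnt15 = mass_from_diffusion(10.3, 11.6, 9.5)
zf34_hctnt32 = mass_from_diffusion(8.4, 11.6, 9.5)

print(f"zf34 + hcTnT15: {zf34_hctnt15:.1f} kDa")
print(f"zf34 + hcTnT32: {zf34_hctnt32:.1f} kDa")
```

The values obtained this way fall within the stated uncertainties of the tabulated masses for the two zf34 complexes, which suggests the table is internally consistent with a simple cube-law scaling.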
Mapping the Interaction of MBNL1-zf12 with
RNA
The
binding sites of the RNA constructs hcTnT15 (containing the 5′
CGCU motif) and hcTnT10 (containing the 3′ CGCU motif) were
mapped onto the structure of MBNL1-zf12 using chemical shift differences
between 1H–15N HSQC spectra of free and
RNA-bound protein. Small differences in the RNA sequence outside the
central CGCU motif in each oligonucleotide appear to have little effect
on the contact site, but the length of the RNA construct and the number
of nucleotides on the immediate 3′ and 5′ sides of the
CGCU motif are crucial. Both hcTnT10 and hcTnT15 contain four nucleotides
on the 5′ side of the CGCU motif, and our NMR experiments using
hcTnT13 indicate that this sequence should ideally be truncated by
two nucleotides to avoid resonance broadening due to secondary (probably
“fuzzy”[54]) interactions between
the RNA and zf1. hcTnT10 has two nucleotides 3′ of the CGCU,
and hcTnT15 has seven. The length of this segment, though likely not
its base composition (compare hcTnT10 and hcTnT14m in Table ), provides increased affinity
of the RNA for the TZFs. This is illustrated in Figure A, which shows increased RNA contact with
the zf12 C-terminal helix and the nearby N-terminal helical turn for
the complex of hcTnT15 with MBNL1-zf12 compared to the hcTnT10-zf12
complex, probably due to the increased length of the RNA, which enhances
the affinity for MBNL1-zf12 (Table ). The largest chemical shift changes are observed
for zf2 residues Phe54–Leu69 together with residues in the
region between zf1 and zf2. Significant shifts are also observed for
the cross peaks of Trp13 and Leu14 and several residues in the β-strands
(Leu16, Glu17, Val45, Ile51, Ala52) and in the C-terminal α-helix
(Glu80, Ile81, Gly83, Asn85). The changes in the HSQC spectrum of
MBNL1-zf12 upon binding of hcTnT10 and hcTnT15 are mapped onto the
structure in Figure B.
Figure 6
Mapping of the RNA binding site of MBNL1-zf12. (A) Chemical shift
perturbation ($\langle\Delta\delta\rangle_{H,N} = \tfrac{1}{2}\sqrt{(\Delta\delta_{HN})^{2} + 0.2\,(\Delta\delta_{N})^{2}}$) in the 1H–15N HSQC spectrum of MBNL1-zf12
upon addition of hcTnT10 (filled black bars) and hcTnT15 (filled red
bars). Small green bars indicate missing cross peaks (including prolines)
in both the free and complex spectra. (B) Chemical shift changes $\langle\Delta\delta\rangle_{H,N}$ upon addition of (left) hcTnT10 or (right) hcTnT15 mapped
onto the lowest-energy structure of MBNL1-zf12. The backbone ribbon
and the backbone N atom of residues where $\langle\Delta\delta\rangle_{H,N}$ exceeds the mean value by more than 2× the standard deviation (SD)
are colored red, those between mean + 1× SD and mean + 2× SD
are shown in orange, and those between the mean and mean + 1× SD
are shown in yellow. Zinc atoms are shown in gray, and the side
chains of zinc ligands are shown as sticks, yellow for cysteine, and
blue for histidine.
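The perturbation measure and color classes described in this caption can be sketched in a few lines. The example below takes the formula literally as one half of the square root of the weighted sum of squares, and reads the color thresholds as the mean, mean + 1 SD, and mean + 2 SD. The residue labels and shift differences are hypothetical placeholder values, not data from this study, and the use of the population standard deviation is also an assumption.

```python
import math
import statistics


def csp(delta_h, delta_n):
    """Combined 1H/15N chemical shift perturbation in ppm.

    Literal reading of the caption: 0.5 * sqrt(dH^2 + 0.2 * dN^2).
    """
    return 0.5 * math.sqrt(delta_h ** 2 + 0.2 * delta_n ** 2)


def color_class(value, mean, sd):
    """Assign a color using thresholds at mean, mean + 1 SD, and mean + 2 SD."""
    if value > mean + 2 * sd:
        return "red"
    if value > mean + sd:
        return "orange"
    if value > mean:
        return "yellow"
    return "uncolored"


# Hypothetical (dH, dN) shift differences in ppm, for illustration only.
shifts = {
    "res_a": (0.01, 0.05),
    "res_b": (0.02, 0.10),
    "res_c": (0.03, 0.20),
    "res_d": (0.05, 0.30),
    "res_e": (0.40, 2.00),
}

values = {name: csp(dh, dn) for name, (dh, dn) in shifts.items()}
mean = statistics.mean(list(values.values()))
sd = statistics.pstdev(list(values.values()))
classes = {name: color_class(v, mean, sd) for name, v in values.items()}

for name in shifts:
    print(name, round(values[name], 3), classes[name])
```

Because the thresholds are defined relative to the mean and SD of the whole data set, the classification depends on which residues are included; omitting missing cross peaks, as the caption does, changes both statistics.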
Several of the aromatic
amino acids that interact with RNA in the
TIS11d complex[11] are conserved in the MBNL
zinc fingers, including Phe22, Phe36, and Tyr68 (Figure B). To assess the role of these
MBNL1 aromatic residues in binding of the hcTnT RNA, each was mutated
to alanine, and the interaction with hcTnT10 was examined using HSQC
titrations. Mutation of Phe36 to Ala makes little difference for binding
to hcTnT10 but appears to hinder the nonspecific interaction that
occurs for hcTnT10 and hcTnT15 at zf1, consonant with the position
of this Phe between two of the zf1 zinc ligands. Substitution of Tyr68
with alanine seriously impairs hcTnT10 binding, such that the cross
peaks that characterize high affinity RNA binding to finger 2 are
broadened and are not observed in the HSQC spectra. Thus, it is clear
that, as for TIS11d, exposed aromatic residues in MBNL1-zf12 play
an important role in RNA binding.
Role of RNA Secondary Structure
It has been reported
that the MBNL zinc fingers bind specifically to RNA hairpins.[20,55] To assess this possibility, we performed NMR titrations with longer
RNA constructs (hcTnT32 and hcTnT27 containing 32 and 27 nucleotides,
respectively), which have the potential to form hairpins between the
CGCU motifs located in the stem.[20] A 1D
WATERGATE NMR spectrum of free hcTnT32 reveals distinct imino proton
resonances, confirming the presence of RNA secondary structure (Figure ). As hcTnT32 RNA was titrated into an MBNL1-zf1234 protein sample,
no peaks appeared in the imino proton region until the protein:RNA
ratio reached 1:1. The absence of imino peaks at substoichiometric
RNA concentrations suggests that hcTnT32 RNA is single-stranded when
it is bound to the MBNL zinc fingers. At RNA:protein ratios greater
than 1:1, imino resonances start to appear at the same chemical shifts
as for the free RNA, suggesting that the excess RNA does not participate
in binding to protein and exhibits the same secondary structure as
in the free RNA hairpin. Thus, although the free hcTnT32 RNA adopts
secondary structure, potentially similar to that predicted by Warf
and Berglund,[20] MBNL1-zf1234 binding disrupts
the secondary structure. That zf1234 binds preferentially to single-stranded
RNA rather than a hairpin stem is confirmed by the similar binding
constants and nearly identical chemical shifts observed for key residues
in the RNA binding site in HSQC spectra of zf1234 complexes with hcTnT32
and with shorter RNA constructs that are unable to form a hairpin
(Table ). Higher affinity
of MBNL zinc fingers for single-stranded RNA, compared to double-stranded
forms, has also been demonstrated for other RNA sequences.[53]

We repeated our experiments by adding MBNL1-zf12 or MBNL1-zf34
to hcTnT32. The results were similar to those obtained for binding
to MBNL1-zf1234. Consistent with our NMR titration experiments, which
showed each pair of zinc fingers binding to one CGCU recognition site
in the hcTnT RNA, no imino peaks were observed until a protein:hcTnT32
ratio of 2:1 was reached, indicating once again that the hcTnT32 RNA
is single stranded when bound to MBNL. Beyond the 2:1 ratio, the excess
RNA adopts hydrogen-bonded secondary structure as evidenced by the
presence of imino proton peaks between 10 and 15 ppm. Lower affinities
were observed for the binding of zf12 and zf34 to similar RNA constructs
compared with the four-finger zf1234.[53] The lower affinity of the two-finger constructs was ascribed to
the presence of C-G base pairing between RNA strands, which could
be successfully disrupted to give a single-stranded RNA by zf1234,
but which could not be disrupted by the two-finger constructs.[53]
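The logic of the imino-proton titrations can be illustrated with a simple bookkeeping model. The sketch below assumes stoichiometric (very tight) binding, so that RNA remains fully bound until the protein binding capacity is used up and only the excess RNA is free to show imino resonances. This is an idealization for illustration, not a fit to the data, and the function name and parameters are hypothetical.

```python
def free_rna_fraction(rna_per_protein, proteins_per_rna):
    """Fraction of RNA left unbound under idealized stoichiometric binding.

    rna_per_protein:  molar ratio of total RNA to total protein
    proteins_per_rna: number of protein molecules bound per RNA
                      (1 for zf1234, 2 for zf12 or zf34 with hcTnT32)
    """
    if rna_per_protein <= 0:
        return 0.0
    # RNA equivalents that one protein equivalent can saturate.
    capacity = 1.0 / proteins_per_rna
    bound = min(rna_per_protein, capacity)
    return (rna_per_protein - bound) / rna_per_protein


# Four-finger construct: one protein per hcTnT32.
print(free_rna_fraction(0.5, 1), free_rna_fraction(1.0, 1), free_rna_fraction(2.0, 1))
# Two-finger construct: two proteins per hcTnT32.
print(free_rna_fraction(0.25, 2), free_rna_fraction(0.5, 2), free_rna_fraction(1.0, 2))
```

In this model, free (and therefore potentially base-paired) RNA appears only once the RNA exceeds one equivalent per zf1234, or one equivalent per two molecules of a two-finger construct, which mirrors the 1:1 and 2:1 break points described in the text.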
Structure of the Complex of MBNL1-zf12 with
hcTnT15 RNA
Having established that each pair of MBNL1 TZF
domains (zf12 and
zf34) interacts with single-stranded RNA containing a single CGCU
sequence, we proceeded to determine the solution structure of MBNL1-zf12
(residues 1–92) bound to the minimal cognate RNA oligonucleotide
hcTnT15 with sequence 5′-GUCUCGCUUUUCCCC-3′. Resonance assignments
for 15N, 13C-labeled zf12 in complex with unlabeled
RNA were made using a standard suite of triple resonance spectra,
as described above for the free protein. Resonance assignments for
the unlabeled RNA in the complex were more problematic, as resonance
overlap was severe, particularly in the regions of the RNA that are
not intimately associated with the protein. Partial 1H
assignments were obtained from resolved cross peaks in a two-dimensional 13C-filtered spectrum.[56] NOEs between
zf12 and hcTnT15 were obtained from a 3D 13C filter-edit
spectrum[42] yielding 71 intermolecular NOEs.

NOE assignment and structure calculation were performed using CYANA
version 2.1,[44] and the structures were
refined using AMBER 11[48] using a similar
method to that described above for the free zf12 and zf34 proteins.
The 20 lowest-energy NMR structures of the complex are shown in Figure A, and experimental
restraints and structural statistics are included in Table . The structure is well-defined
between Asp10 and Asn85 with a pairwise rms deviation of 0.37 Å
for the polypeptide backbone. Residues 1–9 and 89–92
are disordered, and the last turn of the C-terminal helix (Arg84-Ile88)
is frayed in the structural ensemble. The RNA structure and MBNL1
interactions are well-defined only for nucleotides C101–U104
(Figure B). Although
some intermolecular NOEs are observed to the first two nucleotides
in the CGCU motif, C99 and G100, they involve long protein side chains
and long upper distance bounds; as a consequence, the RNA structure
is poorly defined in this region. The orientation of the bases and
protein interactions are well-defined only for C101, U102, and U103;
the U104 base adopts two orientations in the ensemble of NMR structures
that cannot be distinguished on the basis of the observed NOEs. The
structure of MBNL1-zf12 shows very little change between free and
bound states. The RNA-binding surface of MBNL1-zf12 indicated by the
changes in 1H and 15N chemical shifts (Figure ) corresponds well
with the binding site observed in the calculated solution structures
(Figure C).
Figure 8
Solution structures
of the complex of MBNL1 zf12 and hcTnT15 single-stranded
RNA. (A) Stereo ribbon representation of the family of structures.
The backbones of zf1 and zf2 are shown in blue and pink, respectively.
Zinc atoms are shown as white spheres. The non-hydrogen atoms of the
RNA (nucleotides G100–U104) are colored yellow for carbon,
blue for nitrogen, red for oxygen, and orange for phosphorus. (B)
The family of RNA structures from G100-C101-U102-U103-U104 are colored
as in (A). The structures have been rotated ∼180° from
their positions in (A). (C) Cartoon ribbon representation of the lowest-energy
structure of MBNL zf12 in complex with hcTnT15 RNA. Nucleotides G100-C101-U102-U103-U104
are shown as sticks and colored as in (A) with RNA hydrogen atoms
shown in white. (D) Vacuum electrostatic surface representation of
the protein portion of the structure shown in (C) calculated within
the program PyMOL (The PyMOL Molecular Graphics System, version 1.8,
Schrödinger, LLC). Areas of negative and positive charge are
shown in red and blue, respectively. The structure has been rotated
∼90° both in the plane of the paper and around the vertical
axis compared to the cartoon in (C). The non-hydrogen atoms of the
RNA (G100-U104) are colored as in (A).
Discussion
Single-Stranded RNA Is the Preferred Binding
Partner for MBNL
The affinity of muscleblind zinc fingers
for RNA has received a
great deal of attention. Although early studies appeared to show that
MBNL zinc fingers bound to RNA hairpins,[20,55] it is clear from our studies and from literature reports that have
appeared during the course of this work[18,53,57] that single-stranded RNA binds preferentially to
the MBNL zinc fingers. The presence of stable base-paired double-stranded
RNA lowers the affinity of GST-bound single zinc finger pairs (zf12
or zf34), but the GST-bound tandem construct zf1234 was able to destabilize
this structure and bind to the resulting single-stranded RNA.[53] Despite this evidence that the full-length MBNL
has additional RNA affinity over the two-finger constructs (which
may well arise from the entropic gain from intramolecular bivalency),
our NMR data suggest that the 80 residue linker between the two pairs
of tandem zinc fingers (TZF) is largely disordered in the free protein.
The 1H–15N HSQC spectra of the four-finger
construct (MBNL1 1–255) superimpose exactly with those of the
individual TZF domains (zf12, residues 1–92, and zf34, residues
173–255) (Figure ), leaving only a region in the center of the spectrum containing
the random-coil cross peaks of the disordered linker. These observations
validate our experimental strategy of characterizing the two TZF domains
independently.
Comparison of the Free Two-Finger Constructs
Consistent
with the amino acid sequence homology between zf12 and zf34 (Figure B), the solution
structures calculated for zf12 and zf34 are very similar (Figure ). Within each TZF,
the two zinc centers are folded together to form a single structure.
Residues between the individual CCCH zinc fingers fold to form a three
strand β-sheet, which together with the C-terminal α-helix
forms a compact globular structure with an extensive hydrophobic core.
During the course of this work, several structures were reported for
MBNL molecules similar to those reported here. Our structures are
very similar in both backbone and side chain orientation to
the solution structures of MBNL isoform 2 calculated by the RIKEN
structural genomics initiative (PDB ID: 2RPP corresponding to our zf12 and 2E5S corresponding to
our zf34), consistent with the high degree of amino acid sequence
homology (Figure ).
On the other hand, there are several differences between our tandem
zinc finger domain structures and the X-ray structures reported by
Teplova and Patel (PDB ID: 3D2N for zf12 and 3D2Q for zf34).[25] The most
obvious difference is that, in the crystal structures, free MBNL1-zf12
is a trimer whereas free MBNL1-zf34 is a dimer. We note that both
of the protein constructs used for the X-ray structures are shorter
than ours (residues 9–90 for zf12 with coordinates reported
for residues 11–86, and residues 178–246 for zf34, with
coordinates reported for residues 180–245). By contrast, our
NMR structures for the corresponding TZFs were calculated using data
for residues 1–92 (zf12) and 173–255 (zf34). The NMR
structures show that only residues 11–85 (zf12) and 179–251
(zf34) are well-structured in the free state (Figure ). The differences between X-ray and NMR
in the extent of the structured core of each TZF are rather minor
for zf12, but the difference at the C-terminus of zf34 is significant
with the C-terminal helix being truncated by two helical turns in
the crystal structure.[25] In solution, we
observe only monomeric structures for the two TZF constructs, as confirmed
by the PFG NMR results (Table ). We suggest that the oligomers observed in the X-ray structures
might have arisen either due to truncation of the C-terminal helix
in the case of zf34 or due to crystal contacts.
Comparison of MBNL1 and TIS11d Zinc Finger Structures
The structure
of a CCCH tandem zinc finger (TZF) domain showed the
zinc fingers of TIS11d in complex with an AU-rich RNA from the 3′-untranslated
region of TNF-α mRNA (5′-UUAUUUAUU-3′).[11] The two CCCH zinc centers of TIS11d form two
independently folded zinc fingers connected by a flexible amino acid
chain of 18 residues.[11] An alignment of
structures of the individual zinc fingers of TIS11d with that of MBNL1
(Figure C) shows that
their structures are very similar with the presence of a short α-helix
between the first and second zinc-binding cysteines. However, the
overall folds of these proteins are quite different: the C-terminal
α-helix and three-strand β-sheet observed in the MBNL1
TZFs are absent, and the two zinc fingers of TIS11d behave like independent
beads on a flexible string instead of forming a single globular structure
as seen for MBNL1.
An interesting feature of the zinc fingers
of TIS11d is the presence of a highly conserved (R/K)YKTEL motif immediately
preceding the first cysteine in each zinc finger (Figure ). The structure of the TIS11d-RNA
complex shows that this motif plays a critical role in RNA recognition
by forming two walls of a deep pocket that accommodates the uracil
bases of the recognition element. Sequence alignment of TIS11d with
MBNL1 (Figure ) shows
that the RYKTEL motif of TIS11d finger 1 is replaced by WLTLEV in MBNL1 zf1 and TDRLEV in MBNL1 zf3, whereas the KYKTEL
motif is replaced by NGRVIA in MBNL1 zf2 and DNTVTV in
MBNL1 zf4. An extensive sequence comparison between muscleblind family
members[3] showed that the LEV box and to
a lesser extent the NGR box are highly conserved between widely divergent
species (Figure ).
These sequences participate in the formation of the three-stranded
β-sheet, which forms part of the unique fold of the MBNL TZFs.
RNA Recognition by MBNL
The two TZF constructs MBNL1-zf12
and MBNL-zf34 bind to RNA sequences from cardiac troponin T with comparable Kd values in the 100–200 nM range; the
affinity of full length MBNL1-zf1234 is higher[20] (Table ). The NMR titrations establish that the hcTnT RNAs bind preferentially
in the vicinity of the second zinc site of each TZF with the largest
chemical shift changes upon addition of RNA occurring for residues
54–60 in zf2 of MBNL1-zf12 and for the corresponding residues
222–228 of zf4 in MBNL-zf34. Resonance broadening for Arg20
and Arg186, at equivalent sites in zf1 and zf3 (circled in red in Figure B), in the presence
of RNA constructs containing four nucleotides on the 5′ end
of the CGCU interaction motif is attributed to secondary interactions.
These interactions are abrogated by 5′-truncation of the RNA
(in hcTnT13) or by raising the temperature from 25 to 37 °C.
We conclude that the primary RNA binding site is zf2 in MBNL-zf12
and zf4 in MBNL-zf34 and that the secondary interactions with zf1
and zf3 are weak. Although zf1 does present an RNA binding surface,
the potential interactions are limited compared to those of zf2, which
functions in conjunction with side chains in the N-terminal region
and C-terminal helix to form an extensive binding site and provide
a network of base-specific RNA contacts.
It is interesting to
note the unusually large downfield chemical shift of the amide cross
peaks of Phe54 (zf2; Δδ = 3.7 ppm in 15N and
Δδ = 2.48 ppm in 1H) and Met222 (zf4; Δδ
= 3.25 ppm in 15N and Δδ = 2.52 ppm in 1H) induced by RNA binding (Figure ; residues circled in Figure B). The resonances of the equivalent residues
in TIS11d, Arg160 in zf1, and Arg198 in zf2 (circled in Figure B) also undergo substantial
downfield shifts upon RNA binding.[11] The
large downfield shift of the Arg160 and Arg198 amide resonances of
TIS11d is due to formation of hydrogen bonds between these amide protons
and the N1 atom of adenines A3 and A7. The large downfield shifts
for the Phe54 and Met222 resonances of MBNL1 suggest that their amide
protons are similarly involved in hydrogen-bonding interactions with
hcTnT RNA.
Our NMR structure of MBNL1-zf12 bound to the 15-nucleotide
hcTnT RNA is complementary to the published crystal structure of zf34 bound
to CGCUGU.[25] Together, the NMR and X-ray
structures provide detailed new insights into the mechanism of RNA
recognition by the muscleblind zinc fingers. In the solution structure,
the RNA binds in a positively charged groove formed between residues
Arg9–Thr15, zf2, and the C-terminal helix (Figure D). The first six nucleotides
at the 5′ end of the hcTnT15 RNA, including the first two bases
of the CGCU binding motif, are structurally disordered. Although G100
clearly binds to zf2, the details of the interaction cannot be discerned
in the NMR structures because of this disorder. Fortunately, the guanosine
interactions are clear in the X-ray structure of the zf34 complex,[25] where it is packed between the conserved Arg231,
Lys235, and Tyr236 side chains (equivalent to Arg63, Lys67, and Tyr68
in zf12) and forms a hydrogen bond to the Lys amide. It is likely
that this binding mode is conserved for zf12. Although the paucity
of NOE restraints prevented resolution of the interactions between
G100 and zf2, the amide 1H and 15N resonances
of K67 are shifted strongly downfield (Figure ) upon binding to RNA, consistent with hydrogen
bond formation to the guanosine base. Only one weak intermolecular
NOE is observed for C99, suggesting that its interactions with zf12
are transient and nonspecific. This conclusion is supported by the
X-ray structure, where the equivalent cytosine from the CGCU motif
makes no contact with zf4 and is only observed in the electron density
for a subset of RNA molecules in which it is bound to zf3 of a neighboring
protein molecule in the crystal lattice.[25] This interaction may be a crystallization artifact because the 5′ cytosine
base of the CGCU motif cannot reach the zf3 binding site observed
in the X-ray structure, which is more than 30 Å from the guanosine
binding site on zf4.
The conformation of C101 is well-defined
by the NMR restraints.
The cytosine ring projects into a deep pocket formed by the side chains
of Ile51, Phe54, and Tyr68 on zf2 and Leu14 in the N-terminal region
of the zf12 construct, where it can hydrogen bond with the polypeptide
backbone (N4–A52 CO, N3–F54 NH, O2–D55 NH) (Figure A). The same pattern of hydrogen bonding is observed between
the corresponding cytosine and backbone amides of V220, M222, and
D223 in the X-ray structure of the zf34 complex.[25]
Protein–RNA contacts in the complex between MBNL1-zf12 and
hcTnT15. (A) Hydrogen-bonding interactions between C101 and the backbone
amide protons of Phe54 and Asp55, and the CO of Ala52, and hydrophobic
interactions between U102 and the side chains of Leu14 and Ile51,
and between C101 and the side chain of Tyr68. (B) Hydrogen-bonding
interaction between U102 and the backbone amide proton of Asp10 observed
in many of the NMR structures. (C) Hydrogen-bonding interaction between
U103 and one of the Asn86 side chain amide protons. The U103 base
is stacked between the hydrophobic side chains of Trp13 and Leu87.
The protein interactions made
by bases on the 3′ side of
the GC element differ substantially in the NMR and X-ray structures,
most likely due to the truncation of the C-terminal helix of zf4 in
the construct used for crystallization[25] as well as to the presence of a G in place of U103 (see below).
The uracil moiety of U102 packs against the edge of the C101 base
and the hydrophobic side chains of T11, W13, and L14 in the N-terminal
region of zf12, and the O2 atom is within hydrogen bonding distance
of the Asp10 backbone NH in many of the NMR structures (Figure B). The uracil at this position
could be replaced by cytosine without perturbing the hydrogen bonding
or hydrophobic interactions with zf12; indeed, substitution of uridine
by cytidine has no effect on binding affinity but replacement by a
purine strongly impairs binding.[18] The
conformation of U103 is also well-defined by the NMR restraints and
is positioned with its uracil ring packed in a groove between the
face of the Trp13 indole ring and the side chain of Leu87 on the C-terminal
helix of zf2 (Figure C). Interestingly, the equivalent nucleotide in the X-ray crystal
structure[25] is a G, which we show by biolayer
interferometry corresponds to a lower-affinity complex. The uracil
O4 atom forms a hydrogen bond to the Nδ of Asn86 in the majority
of the NMR structures, and there is potentially an additional hydrogen
bond between the uracil N3 and the backbone CO of Gly83 (depending
on the torsion angle about the glycosidic bond). The base at this
position appears to contribute weakly to specificity, and substitution
by guanine decreases the binding affinity approximately 4-fold (Table ). The uracil ring
of U104 is conformationally disordered, adopting two approximately
equally populated conformations in the ensemble of NMR structures,
both of which pack against the C-terminal helix.
In contrast
to the NMR structures of the zf12 complex, the nucleotides
3′ to the GC undergo very different contacts with zf34 in the
crystal. The first U forms similar interactions with the protein in
the zf12 (NMR) and zf34 (X-ray) structures. However, subsequent nucleotides
(G and U) bind to the zf3 finger in a neighboring molecule in the
crystal lattice,[25] whereas the corresponding
bases in the NMR structure (U104 and U105) interact with the C-terminal
helix of zf2. These differences could be a consequence of crystal
packing or, and perhaps more likely, due to truncation of the zf34
construct used for crystallization, leading to loss of part of the
C-terminal helix and critical contact residues required for binding
of these 3′ nucleotides.
The NMR and X-ray structures
in combination substantially advance
our understanding of RNA recognition by the MBNL1 tandem zinc finger
domains. Previous studies have shown that MBNL1 binds preferentially
to the sequence motifs YGCU(U/G)Y or YGCY (where Y is a pyrimidine).[14,16] The central GC dinucleotide forms highly specific interactions with
the zf12 and zf34 domains, and substitution of these nucleotides drastically
decreases the binding affinity.[16] Although
the molecular basis for recognition of the guanosine is not clear
from the NMR data, the X-ray structure shows hydrogen bonding interactions
to one of the cysteine ligands and a backbone NH in zf4 that are specific
to a guanosine base.[25] The solution and
crystal structures show that specific recognition of the cytosine
by zf2 and zf4 is mediated by a pattern of hydrogen bonding to backbone
amides that is unique to the cytosine base. In addition, the cytosine
projects into a confined hydrophobic pocket that could not accommodate
a purine ring. The NMR structures provide new insights into the interactions
involving nucleotides on the 3′ side of the GC motif, showing
the molecular interactions that favor a pyrimidine immediately following
the cytidine, and the weak preference for uridine at the following
position.[18] Neither the X-ray nor NMR structures
provide insights into the preference for a pyrimidine immediately
5′ of the GC dinucleotide.
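The motif preferences discussed above can be illustrated with a short sequence scan. The following is a minimal sketch only, not part of the original analysis: it assumes that Y denotes C or U, transcribes the reported YGCY and YGCU(U/G)Y motifs as regular expressions, and applies them to the hcTnT15 sequence numbered from base 95. The function and variable names are hypothetical.

```python
import re

# hcTnT15 RNA as given in the paper; the first base is numbered 95.
HCTNT15 = "GUCUCGCUUUUCCCC"
FIRST_BASE_NUMBER = 95

# Y = pyrimidine (C or U). Patterns are an illustrative transcription of the
# reported motifs; lookahead is used so overlapping matches are also reported.
MOTIFS = {
    "YGCY": re.compile(r"(?=([CU]GC[CU]))"),
    "YGCU(U/G)Y": re.compile(r"(?=([CU]GCU[UG][CU]))"),
}


def find_motifs(rna, first_number=1):
    """Return {motif_name: [(start_number, matched_bases), ...]} for an RNA string."""
    rna = rna.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        hits[name] = [
            (m.start() + first_number, m.group(1))
            for m in pattern.finditer(rna)
        ]
    return hits


if __name__ == "__main__":
    for name, found in find_motifs(HCTNT15, FIRST_BASE_NUMBER).items():
        print(name, found)
```

In this sketch both patterns match once in hcTnT15, starting at C99, so the matched bases correspond to C99, G100, C101, and U102 (and U103 and U104 for the longer motif) in the numbering used in the text.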
Effect of the Length of the RNA Sequence on Binding
MBNL TZF domains require a minimum
of approximately 10 nucleotides
to bind RNA in a stable complex. The NMR titrations of MBNL-zf12 with
RNAs of various lengths derived from the troponin sequence showed
that zf2 binds to the CGCU recognition site, whereas nucleotides four
or more bases 5′ to that interact with zf1 but more weakly.
Similarly, for MBNL-zf34, the CGCU sequence binds strongly to zf4,
whereas 5′ nucleotides interact weakly with zf3. The human cTnT pre-mRNA contains two CGCU sites separated by a 9 nucleotide
spacer in the 3′ end region of intron 4 where MBNL is predicted
to bind.[20] Our results suggest that each
of these CGCU units is recognized by a single tandem zinc finger domain,
i.e., zf12 binds to one CGCU motif whereas zf34 binds to the other.
This mode of binding is consistent with the results of our diffusion
experiments, which showed that the hcTnT10 and hcTnT15 RNAs, each
containing a single CGCU unit, bind to MBNL1-zf12 and MBNL1-zf34 in
a 1:1 ratio, whereas hcTnT32 RNA, which contains two CGCU subunits,
binds two molecules of MBNL1-zf34. Such a binding mode is contrary
to the model proposed by Teplova and Patel[25] in which the hcTnT RNA forms a hairpin and the CGCU motifs interact
with zf3 and zf4 on the same tandem zinc finger domain.
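The one-TZF-per-CGCU binding model described above can be written out as a simple counting exercise. This is a sketch under stated assumptions rather than a predictive tool: it assumes that each CGCU element recruits exactly one tandem zinc finger domain and ignores RNA length requirements and secondary structure. The two-site sequence below is a hypothetical placeholder built for illustration; it is not the hcTnT32 sequence, which is not given in this section.

```python
def cgcu_sites(rna):
    """Return 0-based start positions of every CGCU element in an RNA string."""
    rna = rna.upper()
    return [i for i in range(len(rna) - 3) if rna[i:i + 4] == "CGCU"]


def expected_tzf_per_rna(rna):
    """Assume one TZF domain per CGCU site (the 1:1 model described in the text)."""
    return len(cgcu_sites(rna))


def spacers(rna):
    """Number of nucleotides between consecutive CGCU elements."""
    starts = cgcu_sites(rna)
    return [b - (a + 4) for a, b in zip(starts, starts[1:])]


# hcTnT15 as given in the paper: a single CGCU element.
HCTNT15 = "GUCUCGCUUUUCCCC"

# Hypothetical two-site RNA with a 9-nucleotide spacer, for illustration only;
# NOT the actual hcTnT32 sequence.
TWO_SITE_EXAMPLE = "UU" + "CGCU" + "U" * 9 + "CGCU" + "UU"

if __name__ == "__main__":
    for label, seq in [("hcTnT15", HCTNT15), ("two-site example", TWO_SITE_EXAMPLE)]:
        print(label, expected_tzf_per_rna(seq), spacers(seq))
```

Under these assumptions a single-site RNA such as hcTnT15 is expected to bind one TZF domain, and an RNA with two CGCU elements is expected to bind two, which mirrors the 1:1 and 2:1 stoichiometries reported from the diffusion experiments.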
Protein Data Bank Accession Code
The coordinates for
MBNL1-zf12 and zf34 and the zf12-RNA structure have been deposited
with the Protein Data Bank with accession codes 5U6H, 5U6L, and 5U9B, respectively. The
NMR resonance assignments have been deposited in the Biological Magnetic
Resonance Data Bank with accession numbers 30207, 30208, and 30210,
respectively.