Fragile X syndrome (FXS) is a multi-organ disease that leads to mental retardation, macro-orchidism in males and premature ovarian insufficiency in female carriers. FXS is also a prominent monogenic disease associated with autism spectrum disorders (ASDs). FXS is typically caused by the loss of fragile X mental retardation 1 (FMR1) expression, which codes for the RNA-binding protein FMRP. Here we report the discovery of distinct RNA-recognition elements that correspond to the two independent RNA-binding domains of FMRP, in addition to the binding sites within the messenger RNA targets for wild-type and I304N mutant FMRP isoforms and the FMRP paralogues FXR1P and FXR2P (also known as FXR1 and FXR2). RNA-recognition-element frequency, ratio and distribution determine target mRNA association with FMRP. Among highly enriched targets, we identify many genes involved in ASD and show that FMRP affects their protein levels in human cell culture, mouse ovaries and human brain. Notably, we discovered that these targets are also dysregulated in Fmr1(-/-) mouse ovaries showing signs of premature follicular overdevelopment. These results indicate that FMRP targets share signalling pathways across different cellular contexts. As the importance of signalling pathways in both FXS and ASD is becoming increasingly apparent, our results provide a ranked list of genes as a basis for the pursuit of new therapeutic targets for these neurological disorders.
Most clinical cases of FXS result from hyper-expansion and methylation of CGG repeats within the promoter of FMR1, leading to a loss of its expression[1-3]. The FMR1 RNA-binding protein (RBP) family has three members, FMR1, FXR1, and FXR2, which possess two centrally located KH domains and a C-terminal arginine-glycine (RG)-rich region implicated in mRNA binding[4-7]. FMR1 encodes multiple protein isoforms, but is predominantly expressed as a 69 kDa protein (isoform 7)[8,9]. Isoform (iso) 1 and six other alternative splice variants include exon 12, with iso1 encoding the full-length protein (71 kDa). Exon 12 insertion lengthens the second KH (KH2) RNA-binding domain (RBD), possibly influencing FMR1 RNA-binding specificity or affinity. The I304N mutation, described in an FXS patient, is also located in the KH2 domain and is reported to attenuate association with RNA and polysomes[10-12]. FMR1 proteins are implicated in various RNA processes including RNA subcellular localization by facilitating nucleo-cytoplasmic shuttling[13] and association with motor proteins[14-16]. FMR1 proteins were also suggested to mediate translational regulation[12,17]. Given the critical role of FMR1 in human cognition and premature ovarian insufficiency[18,19], intensive efforts have been made to identify its RNA targets, in the hope that their discovery would shed light on the array of related disorders and provide options for molecular therapy[18-26]. No precise RNA-recognition element (RRE) has been defined and very few bona fide mRNA targets are confirmed[27].
RESULTS AND DISCUSSION
To identify the binding sites of FMR1 family proteins (Fig. 1a and Supplementary Fig. 1), we first compared photocrosslinking methods[28-30] using stable FLAG-HA FMR1 iso7 expressing HEK293 cells (Fig. 1b), as these cells and human brain share 90% of expressed genes based on a comparison of existing RNAseq datasets[31-33] (Supplementary Fig. 2). The difference in FMR1 levels between the experimental system and brain is 1.3-fold, as calculated using RNAseq data and the quantitated expression of FMR1 in our stable cells. We found that 4SU PAR-CLIP provided the highest yield of crosslinked RNAs, and used this approach for all FMR1 family proteins (Fig. 1c). cDNA libraries were generated after PAR-CLIP and Illumina-sequenced (Supplementary Table 1). Genome-aligned reads were grouped by PARalyzer[34] to identify segments of RNA that represented peaks of T-to-C conversion, termed binding sites. PARalyzer separated closely spaced binding sites connected by overlapping reads and yielded a median RNA segment length of 33 nt (Supplementary Fig. 3). FMRP iso1 and 7 bound to approx. 80,000-100,000 sites, of which >85% mapped to ~6,000 mRNAs (Supplementary Tables 1, 2, and https://fmrp.rockefeller.edu). FXR1 and FXR2 protein binding sites are contained within FMRP binding sites, with an overlap of 95% (Supplementary Table 3).
Figure 1
PAR-CLIP of FMR1 family proteins
a, FMR1 family proteins comprise two type-I KH domains (cyan). FMRP isoform 1 and 7 (iso1 and 7) vary by the presence of exon 12 (black) within KH2. The I304N mutation is located within the KH2 domain (red asterisk). The RG-rich region (orange bars) is also implicated in RNA-binding. The lengths of proteins in amino acids are indicated. We established stable inducible cell lines expressing FLAG-HA epitope-tagged wt and I304N mutants of FMR1 (iso1 and 7), and its homologs FXR1 and 2[47]. b, RNA-FMRP crosslinking comparing CLIP (254 nm) to 4SU- or 6SG-PAR-CLIP (365 nm). RNA-radiolabeled FLAG immunoprecipitates (IPs) of lysates from crosslinked FLAG-HA-FMRP-iso7-expressing HEK293 cells were separated by SDS-PAGE. The migrations of protein mass standards are indicated. Enrichment of radiolabeled RNA covalently bound to FLAG-HA-FMRP (arrow) was determined after normalizing by Western blot analysis (not shown). c, 4SU-PAR-CLIP of FMR1 family proteins analogous to (b).
Nearly all mRNA binding sites were located in exons (>90%) (Fig. 2a) and distributed between CDS and 3'UTR (>95%, total) with slightly more CDS sites, similar to distributions seen for other cytoplasmic RBPs[28]. The computational sequence analysis method cERMIT[35] revealed two major RREs, ACUK and WGGA (K=G/U, and W=A/U) (Fig. 2b, Supplementary Fig. 4). Together, ACUK and WGGA RREs were found in ≥50% of mRNA binding sites in iso1 and iso7; they occurred either alone or together within the same binding site (Fig. 2c). The remaining binding sites typically contained close derivatives of either RRE.
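The two degenerate motifs can be made concrete with a small sketch. The Python below expands the IUPAC codes given in the text (K = G/U, W = A/U) into regular expressions and labels a sequence as ACUK, WGGA or Mixed. It is only an illustration of the motif definitions, not the cERMIT analysis; the function names and the example sequences are hypothetical.

```python
import re

# IUPAC codes as defined in the text: K = G or U, W = A or U.
# Lookahead patterns are used so that overlapping occurrences are counted.
RRE_PATTERNS = {
    "ACUK": re.compile(r"(?=ACU[GU])"),
    "WGGA": re.compile(r"(?=[AU]GGA)"),
}


def count_rres(rna: str) -> dict:
    """Count ACUK and WGGA occurrences in an RNA (or DNA-alphabet) string."""
    seq = rna.upper().replace("T", "U")  # tolerate DNA-alphabet input
    return {name: len(pat.findall(seq)) for name, pat in RRE_PATTERNS.items()}


def classify_site(rna: str) -> str:
    """Label a site as 'ACUK', 'WGGA', 'Mixed' or 'none' by motif presence."""
    counts = count_rres(rna)
    has_acuk = counts["ACUK"] > 0
    has_wgga = counts["WGGA"] > 0
    if has_acuk and has_wgga:
        return "Mixed"
    if has_acuk:
        return "ACUK"
    if has_wgga:
        return "WGGA"
    return "none"


if __name__ == "__main__":
    # Hypothetical sequences, chosen only to exercise each label.
    for seq in ("ACUGAGGA", "UGGAGGA", "ACUUACUG", "CCCC"):
        print(seq, count_rres(seq), classify_site(seq))
```

Counting per site rather than only testing for presence matters here, because the number of RREs within a binding site is one of the features later related to affinity and enrichment.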
Figure 2
Analysis of FMR1 family protein mRNA binding sites
a, Distribution of binding sites within mRNA targets of the FMR1 protein family. b, Two major RREs were inferred from FMRP iso1 and iso7 binding sites. c, Distribution of FMRP binding sites, color-coded based on cERMIT-inferred RREs, across representative targets. Open boxes and thick lines indicate CDS and UTRs, respectively; numbers indicate nucleotide number.
We performed electrophoretic mobility shift assays (EMSAs) to test RNAs representing FMRP binding sites using recombinant FMRP purified from Sf9 cells (Supplementary Figs 5 and 6). FMRP target sites were selected based on whether they contained ACUK, WGGA or Mixed (ACUK/WGGA) RREs (Supplementary Table 4). Testing RNAs of various lengths, we found that oligonucleotide lengths of ≥45 nt were required to observe gel shifts and reach dissociation constants below 0.5 μM, suggesting that RNA backbone contacts outside the RREs contribute to the association in vitro. WGGA-containing RNAs exhibited the widest range of affinities and the strongest binding, generally correlating with the number of RREs within a PARalyzer-defined binding site. An RNA segment containing nine WGGAs bound almost two orders of magnitude tighter than those containing one WGGA, whereas binding of ACUK-containing RNAs varied only 5-fold. EMSAs using RNAs representing target sites within PPP2CA and UBE3A are shown (Fig. 3a).
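To show what a difference of two orders of magnitude in dissociation constant means for a gel shift, the following is a minimal Python sketch of a single-site binding isotherm. It assumes simple 1:1 binding with protein in excess over RNA, which is not necessarily the model used to fit the binding curves reported here; the function names and the example Kd values are hypothetical.

```python
def fraction_bound(protein_um: float, kd_um: float) -> float:
    """Fraction of RNA bound under a 1:1 model: [P] / (Kd + [P])."""
    if protein_um < 0 or kd_um <= 0:
        raise ValueError("protein concentration must be >= 0 and Kd must be > 0")
    return protein_um / (kd_um + protein_um)


def kd_by_interpolation(concs_um, fractions):
    """Estimate Kd as the protein concentration giving half-maximal binding.

    Assumes concs_um is sorted in ascending order and that the fractions
    rise with concentration. Linear interpolation between titration points
    is a rough approximation of the hyperbolic curve.
    """
    points = list(zip(concs_um, fractions))
    for (c0, f0), (c1, f1) in zip(points, points[1:]):
        if f0 < 0.5 <= f1:
            return c0 + (0.5 - f0) / (f1 - f0) * (c1 - c0)
    raise ValueError("titration does not cross 50% bound")


if __name__ == "__main__":
    # Hypothetical Kd values that differ 100-fold, for illustration only.
    weak_kd, tight_kd = 0.5, 0.005  # in micromolar
    for protein in (0.005, 0.05, 0.5):
        print(protein,
              round(fraction_bound(protein, weak_kd), 3),
              round(fraction_bound(protein, tight_kd), 3))
```

Under these assumptions, at 0.05 μM protein the weaker site is about 9% bound while the 100-fold tighter site is about 91% bound, which is why such differences are readily resolved in a titration.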
Figure 3
RNA binding assays using natural FMRP target sites containing ACUK and WGGA RREs, and the effect of a KH2 mutation on its target RNA spectrum
a, EMSAs of RNAs representing UBE3A or PPP2CA binding sites containing various RREs. Binding curves and constants are shown. The sequences of the RNAs are provided with WGGA (yellow) and ACUK RREs (cyan) highlighted. b, Impact of KH2 mutation in FMRP on target sites containing ACUK versus WGGA RREs. The RNA affinities of wt and I304N FMRP iso1 were compared using binding sites in NF1 (ACUK) and FMR1 (WGGA). c, Binding curves of wt and I304N FMRP for an RNA segment representing a mixed RRE binding site in NF1, and several mutant sequence versions (ACUK (-), WGGA (-), and ACUK, WGGA (-)). d, Comparison of FMRP iso1 affinity for RRE type in EMSAs and FMRP iso1 and 7 wt and I304N PAR-CLIP libraries. Error bars in the EMSA summary represent s.d., n = 9 (ACUK) or 8 (WGGA). The ratio of sequence reads aligned to each RRE binding site was calculated between wt and I304N FMRP PAR-CLIP libraries. The average sequence-depth ratios of wt over I304N binding sites, per RRE type, are shown. Error bars in the read-depth analyses represent the average minimum and maximum values across all subsampled mutant libraries (n = 14 and 26 for iso1 and 7, respectively).
We also tested sites in APP and FMR1 mRNA (Supplementary Fig. 7), two previously identified mRNA targets. APP was originally discovered as an FMRP target based on a predicted G-quartet region[7], although the actual site was subsequently identified in vitro at an upstream segment[36], which was identified here (APP Site 1) as containing WGGA. FMRP targets its own mRNA, though this association had only been observed in vitro[5,6,37] and in immunoprecipitates[38].
Recombinant I304N iso1 FMRP showed a ~10-fold average decrease in its affinity towards ACUK-containing RNAs compared to wt FMRP iso1 (Fig. 3b and Supplementary Table 5). We characterized binding to wt and mutant RNA sequences derived from an NF1 mixed RRE site (Fig. 3c). I304N FMRP bound wt and NF1 ACUK(-) RNAs similarly, whereas wt FMRP showed a two-fold reduction in affinity for NF1 ACUK(-) RNA. Additionally, the affinity of wt FMRP for NF1 ACUK, WGGA(-) RNA was similar to that of I304N FMRP for NF1 WGGA(-) RNA. Our results indicate that the KH2 domain associates with the ACUK RRE. Since FMRP associates with mRNAs at more than one binding site, its association at multiple sites will impact the final regulatory effect. We compared the distribution of RREs in I304N FMRP vs. wt FMRP PAR-CLIPs and found a transcriptome-wide depletion in the recovery of sequence reads for ACUK binding sites for both I304N isoforms, consistent with the biochemistry (Fig. 3d). Interestingly, the RRE fractional distribution of I304N FMRP iso1 was similar to that of wt FMRP iso7, suggesting that alteration of the KH2 domain by mutation or exon insertion affects binding-site selectivity. Taken together, the biochemical and PAR-CLIP results with I304N FMRP indicate that we identified the natural target sites of the FMRP KH2 domain.
To rank FMRP targets, we measured their enrichment in RIP-chip, which would indicate stable interactions. A total of 3,593 PAR-CLIP-identified targets showed enrichment by RIP-chip, of which 939 genes were two- to six-fold enriched; 646 transcripts were two-fold enriched but not identified as PAR-CLIP targets. We used binding-site information obtained by PAR-CLIP to infer the salient features for stable association in RIP-chip (Fig. 4, Supplementary Fig. 8, and Supplementary Table 6). Increasing frequency of WGGA- and ACUK-containing elements led to greater RIP-chip enrichment, in agreement with in vitro affinity measurements. On average, top targets contained more RRE binding sites (18 per transcript) than the least-enriched targets (13 per transcript). Top-ranked targets had a CDS:3'UTR binding site ratio of 3.7:1, whereas bottom-ranked targets had a ratio of 1:2. Transcripts with ACUK:WGGA RRE ratios <1 were the most significantly enriched population. Importantly, we identified 93 genes independently implicated in ASD among the highly enriched FMRP targets (Fig. 4d and Supplementary Table 7), which is striking given the relationship of FXS with ASD[20,21,39].
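The comparison of enrichment between groups of transcripts can be sketched as an empirical cumulative distribution over log fold enrichment (LFE) values, with transcripts binned by their ACUK:WGGA ratio. The Python below is a minimal illustration under that assumption and is not the analysis pipeline used for Fig. 4; the function names, the record layout `(lfe, n_acuk, n_wgga)` and the example values are hypothetical.

```python
from bisect import bisect_right


def ecdf(values):
    """Return F, where F(x) is the fraction of values less than or equal to x."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("ecdf needs at least one value")

    def cumulative_fraction(x):
        return bisect_right(ordered, x) / n

    return cumulative_fraction


def split_by_rre_ratio(records):
    """Split (lfe, n_acuk, n_wgga) records into ACUK:WGGA ratio <1 and >=1 bins.

    Comparing counts directly avoids dividing by zero when a transcript
    has no WGGA sites; such transcripts fall into the '>=1' bin.
    """
    bins = {"<1": [], ">=1": []}
    for lfe, n_acuk, n_wgga in records:
        key = "<1" if n_acuk < n_wgga else ">=1"
        bins[key].append(lfe)
    return bins


if __name__ == "__main__":
    # Hypothetical records: (log fold enrichment, ACUK sites, WGGA sites).
    records = [(1.5, 2, 5), (0.2, 4, 2), (2.1, 1, 3), (0.1, 3, 3)]
    for label, lfes in split_by_rre_ratio(records).items():
        print(label, sorted(lfes), ecdf(lfes)(1.0))
```

A bin whose curve is shifted to the right, that is, with a lower cumulative fraction at a given LFE, contains the more strongly enriched transcripts.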
Figure 4
RRE-dependent enrichment criteria for FMRP association with mRNAs
a, RIP-chip experiments were performed using FLAG-HA-FMRP iso1. a-d, Cumulative distribution fraction plots of FMRP targets based on indicated criteria. Transcripts were grouped and color-coded based on indicated bins. Non-targets are mRNA transcripts with zero PAR-CLIP binding sites, although detectable in the array; total is the sum of non-targets and PAR-CLIP identified targets detectable by RIP-chip. d, Enrichment of ninety-three PAR-CLIP identified ASD-related target genes. e, Immunoblot densitometry analysis of top-ranking FMRP targets from RIP-chip and PAR-CLIP analyses in HEK293 and human brain. In cell culture, target protein expression differences of indicated proteins were determined upon induction of FMR1 iso1 or 7 expression. Similarly, relative protein expression was measured using lysates prepared from indicated brain regions of four FXS patients, compared to age/sex/anatomic-matched controls. Error bars represent s.e.m., with n = 2-11 (depending on protein measured and whether the sample was HEK293 or brain lysate). PABPC1 protein level served as loading and ratio control as it was a gene with PAR-CLIP binding sites but showed no RIP-chip enrichment (-0.10 LFE).
Enrichment of RNAs in RIP-chip depends on the saturation of target sites with FMRP. To estimate the degree of saturation, protein copy number, target mRNA copies, and the number of binding sites within those transcripts have to be accounted for. We determined endogenous and co-expressed FLAG-HA-tagged FMR1 protein copy numbers to be approx. 60,000 and 70,000 molecules per cell, respectively, using quantitative Western blotting and reference recombinant protein. Each cell contains 20 pg of total RNA, of which 4% corresponds to the approx. 1 million mRNA molecules per cell. Considering their relative abundance based on HEK293 RNAseq, we estimate that an equal distribution of FMRP would occupy 6% of binding sites per cell among its 6,000 target transcripts, or up to 20% if FXR1 and 2 protein estimates are included. Since PAR-CLIP-identified targets had varying enrichment, the association of FMRP with them is not uniform. The ~3,500 transcripts enriched in RIP-chip are estimated to have at least 18% FMRP binding site occupancy, whereas the top-ranked 900 genes (two-fold enriched) potentially exhibit 78%. The presence of multiple binding sites within targets suggests that multiple FMR1 family proteins bind each transcript to influence its regulation.
To assess the impact of FMR1 binding sites on mRNA stability, siRNA knockdown of FMR1 or the FMR1 family was performed and mRNA expression profiles were analysed by microarray. We found no evidence for FMR1 affecting target mRNA abundance (data not shown).
A panel of FMR1 targets was selected based on enrichment in RIP-chip, low-to-intermediate expression in RNAseq, similar abundance in human brain (using published microarray[31] and RNAseq datasets[32,33]) and documented neurological and human disease relevance, and then analysed by quantitative Western blot to determine their protein levels as a function of FMRP expression (Fig. 4e). FXR2, HUWE1, KDM5C, and MTOR protein levels, among others, showed up to a 30% reduction upon expression of FMR1 in HEK293. We analysed lysates prepared from human postmortem brains. Four FXS brains (Supplementary Fig. 9) were available with age/sex/anatomic-matched controls from prefrontal cortical, hippocampal, and cerebellar regions. While only four of eight antibodies yielded quantifiable bands in brain lysates, we observed a general trend of elevated target protein expression levels in FXS brains. This is the inverse of FMR1 overexpression effects in HEK293, and consistent with FMRP affecting the protein levels of its mRNA targets.
The mRNA targets identified here are from a human transcriptome where the vast majority of genes are comparably expressed in human brain (Supplementary Fig. 2). We discovered ASD-related and numerous other genes implicated in neuronal disorders associated with FXS and validated representatives by EMSA, RIP-chip, and immunoblot. We found genes involved in Angelman (AS), Prader-Willi, Rett, and Cornelia de Lange Syndromes. Interestingly, the ASD and AS-associated gene UBE3A ubiquitinates ARC and SACS1[40]; ARC is a well-known target and here we identify SACS1 as a targeted transcript. These findings potentially provide the molecular link to tie together elements of clinically overlapping disorders, setting a molecular target framework for characterizing the connections between FXS and its associated phenotypes.
Although FX-related diseases are primarily considered CNS disorders, at least two other target organs are affected, the testes and ovaries. We reasoned that changes in FMR1 expression lead to dysregulation of largely overlapping sets of targets shared across all affected organs. Thus dysregulated genes and pathways in brain might also contribute to phenotypes in testes and ovary. We therefore examined the ovaries of Fmr1-KO mice[41] since CNS and macroorchidism phenotypes were reported, yet ovary development had largely been under-investigated. Ovaries from Fmr1-KO mice were markedly larger by 3 weeks post-birth compared to wt controls (Fig. 5a-b). At 12 and 18 wks post-birth, KO ovaries were 22% and 72% larger by mass compared to age-matched controls, respectively. Importantly, we found increased protein levels of Tsc2, Sash1 and Mtor (Fig. 5c). As it is independently known that the Mtor pathway can regulate ovarian development, it is tempting to conclude that increased Mtor pathway activity, in the absence of Fmr1 expression, contributes to enlarged ovaries histologically consistent with precocious follicular development. These observations suggest that Fmr1-KO mice have the potential to model FXS-related premature ovarian insufficiency, in which it remains unclear whether elevated FMR1 mRNA or the observed reduction of FMRP protein itself is causative in female carriers affected by this disease.
Figure 5
Ovarian phenotype in Fmr1-KO mice
Ovaries from wt and Fmr1-KO female mice were harvested at 3, 9, 12, and 18 wks and processed for histological (a), morphological (b), and quantitative Western analyses (c). a, By 3 wks of age, histological staining (hematoxylin) of sectioned KO ovaries shows a greater than expected number of follicles compared to wt. b, Ovaries from 18-wk-old Fmr1-KO mice are larger than wt and exhibit prominent cysts consistent with corpus luteal development. c, Lysates were prepared from 9, 12, and 18 wk ovaries from two different wt and KO mice each, and analysed by quantitative Western blot using Mtor, Sash1, and Tsc2 antibodies. As in human samples, Pabpc1 was used for normalization.
Elevated signal transduction activity of the Mtor pathway[42,43] was reported in Fmr1-KO mice, and was attributed to Fmrp targeting of Pik3ca mRNA. Indeed, PAR-CLIP identified several FMRP binding sites within the PIK3CA transcript. However, we find that it is a less-enriched RIP-chip target compared to MTOR and TSC2, whose protein levels appear to be regulated in an FMRP-dependent manner. Interestingly, recent evidence demonstrated that Tsc mutant mice[44-46], which have increased mTOR activity, had impaired mGluR-LTD and protein synthesis compared to Fmr1-/- mice; crossing Tsc2+/- with Fmr1 mice corrected the phenotypes[44]. Given our results we suspect that in Tsc2 mice, Tsc2 and Mtor protein levels (among others) were elevated, correcting the balance of protein expression and leading to the reversal of phenotypes observed. The reported model[44] by which separate pools of mRNA are differentially regulated by partially convergent pathways in FMRP (in)dependent ways remains unclear. This is in part because FMRP associates with transcripts of ERK pathway components as well. Therapeutic targeting of the MTOR pathway has become an important goal, but it must be further guided by additional functional analysis, particularly of FMRP targets upstream and downstream of MTOR and interconnected signaling pathways (Supplementary Fig. 10). Combined, our validation work in Fmr1-KO mouse ovaries and in human brain demonstrates that the effect of FMRP binding to specific target genes identified in cell culture is extensible to physiologically relevant contexts.
METHODS SUMMARY
Methods are described in greater detail within Supplementary Information. FVB129P2 Fragile-X mice were a kind gift from Dr. Suzanne Zukin (Albert Einstein College of Medicine). Gateway plasmids (Invitrogen) generated in this study will be deposited in addgene.org. FlpIn T-Rex HEK293 (Invitrogen) inducible-stable cell lines were generated per the manufacturer's instructions. The titers, source and use of antibodies used in this study are listed in Supplementary Information. PAR-CLIP was performed essentially as described, except that the second RNase T1 digestion was omitted following the IP. Recombinant wt and mutant FMRP iso1 proteins were purified using a baculoviral expression system (Invitrogen). Electrophoretic mobility shift assays and Western blots were quantified using ImageGauge (Fuji). Parameters of computational analyses are described in Supplementary Information and within the relevant sections of https://fmrp.rockefeller.edu/. Relevant datasets, including raw data, are available at https://fmrp.rockefeller.edu/ and GEO (GSE39686).