
Identification and functional modelling of plausibly causative cis-regulatory variants in a highly-selected cohort with X-linked intellectual disability.

Hemant Bengani, Detelina Grozeva, Lambert Moyon, Shipra Bhatia, Susana R Louros, Jilly Hope, Adam Jackson, James G Prendergast, Liusaidh J Owen, Magali Naville, Jacqueline Rainger, Graeme Grimes, Mihail Halachev, Laura C Murphy, Olivera Spasic-Boskovic, Veronica van Heyningen, Peter Kind, Catherine M Abbott, Emily Osterweil, F Lucy Raymond, Hugues Roest Crollius, David R FitzPatrick.

Abstract

Identifying causative variants in cis-regulatory elements (CRE) in neurodevelopmental disorders has proven challenging. We have used in vivo functional analyses to categorize rigorously filtered CRE variants in a clinical cohort that is plausibly enriched for causative CRE mutations: 48 unrelated males with a family history consistent with X-linked intellectual disability (XLID) in whom no detectable cause could be identified in the coding regions of the X chromosome (chrX). Targeted sequencing of all chrX CRE identified six rare variants in five affected individuals that altered conserved bases in CRE targeting known XLID genes and segregated appropriately in families. Two of these variants, FMR1CRE and TENM1CRE, showed consistent site- and stage-specific differences of enhancer function in the developing zebrafish brain using a dual-color fluorescent reporter assay. Mouse models were created for both variants. In male mice, Fmr1CRE induced alterations in neurodevelopmental Fmr1 expression, olfactory behavior and neurophysiological indicators of FMRP function. The absence of another likely causative variant on whole genome sequencing further supported FMR1CRE as the likely basis of the XLID in this family. Tenm1CRE mice showed no phenotypic anomalies. Following the release of gnomAD 2.1, reanalysis showed that TENM1CRE exceeded the maximum plausible population frequency of an XLID causative allele. Assigning causative status to any ultra-rare CRE variant remains problematic and requires disease-relevant in vivo functional data from multiple sources. The sequential and bespoke nature of such analyses renders them time-consuming and challenging to scale for routine clinical use.

Year:  2021        PMID: 34388204      PMCID: PMC8362966          DOI: 10.1371/journal.pone.0256181

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Cis-regulatory elements (CRE; encompassing enhancers and repressors) are genomic sequences that control transcriptional activity of one or more genes on the same chromosome via sequence-specific interaction of the DNA with proteins and/or RNA. CRE can be predicted using comparative genomics [1], transcriptional characteristics [2], patterns of histone modifications and protein association [3], patterns of accessible chromatin [4] and direct interactions with promoters [5]. Although estimates of the number of CRE in the human genome vary with each prediction method, functional ENCODE data has been interpreted as identifying at least 400,000 putative human enhancers [6]. Disrupted CRE function as a cause of Mendelian disease was first recognized via the loss or gain of regulatory function resulting from structural chromosome anomalies such as deletion or translocation [7-10]. However, the identification of disease-associated single nucleotide variants within individual CRE has been complicated by several factors. CRE can function over large genomic intervals and the targeted gene may not be the closest gene. CRE mostly exist in the non-coding parts of the human genome where our current understanding of mutation consequence is very incomplete compared to the coding region. Developmental disorders (DD) are a diverse group of conditions caused by perturbations of embryogenesis or early brain development. The combination of massively parallel sequencing technologies and family-based analyses has proven very effective in identifying the genes and mechanisms causing severe developmental disorders in humans. DD are primarily genetically determined with a high proportion of causative coding region variants arising as de novo mutations (DNM) [11]. The genomic intervals encompassing known DD causative genes are commonly enriched in highly conserved CRE [12]. 
DNM enrichment is also evident in evolutionarily conserved, brain-active CRE in severe DD at a cohort level [13], but the confident assignment of variants as causative in affected individuals is not yet possible [14]. We have previously identified all likely CRE on the human X chromosome and assigned these to their target genes [15]. Here we have sequenced all of these CRE in 48 individuals with intellectual disability (ID) and a family history indicating that the ID is X-linked (XLID). Each affected individual had previously had a negative screen for likely causative mutations in all coding exons on the X chromosome [16]. Following strict filtering, six rare variants in CRE predicted to control known XLID genes were tested in vivo using zebrafish and mouse models to classify their diagnostic potential. After these studies and reanalysis of the population allele frequencies following the release of gnomAD 2.1 data, only one CRE variant, causing a complex dysregulation of the gene FMR1 (FMRP translational regulator 1), could be considered as likely causative in a single family.

Materials and methods

Cohort selection

Genomic DNA samples from 48 individuals (probands) with moderate-to-severe intellectual disability (ID) were used in this study. Research ethics review and approval was granted by the UK Multicentre Research Ethics Committee in Cambridge with approval number 03/0/014. Written consent was obtained from the parents or guardians of each affected individual included in the study. Each individual is assumed to have an X-linked recessive form of ID on the basis of a positive family history: three or more cases of ID in males only, predominant sparing of carrier females and no evidence of male-to-male transmission of the disease. A clinical geneticist had assessed the individuals and the cause of the ID was unknown. The severity of the disease was categorized using DSM–IV or ICD-10 classifications (profound mental retardation was classified as severe). The affected individuals had previously tested negative by routine diagnostic approaches (i.e., CGH microarray analysis at 500 kb resolution, fragile X [MIM 300624], methylation status of Prader Willi [MIM 176270]/Angelman syndrome [MIM 105830]). In addition, all 48 individuals had been screened in a previous study [16] for coding variants on the X chromosome likely to lead to disease, and no such variants had been found. Whole genome sequencing of individual S3 was performed and analyzed within the UK National Health Service as part of a large-scale clinical implementation study led by one of the authors (FLR) [17].

Targeted capture design and sequencing

A comprehensive list of coordinates of all exonic and conserved regulatory elements from chrX used to design a customized capture library (Roche, NimbleGen) is provided in . Library preparation and pre- and post-capture multiplexing were performed using the SeqCap EZ Choice XL kit (Roche NimbleGen), and TruSeq index barcodes (Illumina) were used according to the manufacturer’s instructions. Four different DNA samples were pooled for pre-capture multiplexing and four post-capture libraries were combined. Paired-end sequencing was performed on a single lane of a HiSeq-2000 instrument (Illumina). In total, 16 different DNA samples were sequenced in a single lane of a HiSeq-2000 and 4 lanes were used to sequence all 48 DNA samples.

Read mapping, variant analysis and enhancer selection

Following quality control with FastQC, reads were mapped to the GRCh37 version of the human reference genome using BWA [18]. Variants were called using GATK [19] according to its recommended best practice pipeline. 40,699 variants remained after filtering out variants that failed GATK’s variant quality score recalibration. These variants were subsequently compared to dbSNP v137 to filter out common variants. Any variant with one of the following handles in dbSNP (1000GENOMES, CSHL-HAPMAP, EGP_SNPS, NHLBI-ESP, PGA-UW-FHCRC) was excluded where the variant’s reported minor allele frequency was greater than 0.01 or the minor allele was observed in at least two samples. The remaining 9,577 chrX variants were then annotated with SnpEff [20] to determine their predicted effects on genes. To determine the best candidates for experimental validation, the variants were ranked based on extreme evolutionary conservation. Using multiple sequence alignments of 45 vertebrate species against the human genome (UCSC genome browser), mutations were retained if the reference human allele was conserved in at least 90% of the species, and then sorted by decreasing conservation depth. Top variants were then manually evaluated using biochemical signals from the ENCODE project (H3K4me1, H3K4me3, H3K27ac, DNase1 sensitivity), and based on the association to target genes known to be responsible for XLID or functionally related to brain development, leading to a final selection of 31 candidate variants (S2 Table in ). Target genes for each of the CRE harboring the variants were assigned as described previously [15]. Motif searches on CRE elements were performed on a 40 bp window around the mutated base for both human and mouse sequences using the FIMO software from the MEME suite [21]. The motif databases used for the search were Jaspar Core 2018 for vertebrates and Uniprobe mouse motifs as downloaded from the MEME website. Motifs with a p-value of 0.001 or lower that were present uniquely in either the WT or the mutant sequences are reported.
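
The common-variant and conservation filters described above can be sketched in a few lines of Python. This is an illustrative sketch and not the authors' pipeline: the `Variant` record, its field names and the representation of the alignment column as a string of bases are assumptions, and "conservation depth" is taken here to mean the number of aligned species that share the human reference allele.

```python
from dataclasses import dataclass
from typing import Optional

# dbSNP submitter handles named in the text.
POPULATION_HANDLES = {
    "1000GENOMES", "CSHL-HAPMAP", "EGP_SNPS", "NHLBI-ESP", "PGA-UW-FHCRC",
}


@dataclass(frozen=True)
class Variant:  # hypothetical record; field names are illustrative
    position: int
    ref: str
    handles: frozenset          # dbSNP handles reporting the variant
    maf: Optional[float]        # reported minor allele frequency, if any
    minor_allele_samples: int   # samples in which the minor allele was seen
    aligned_bases: str          # one base per aligned species ('-' = gap)


def is_common(v: Variant) -> bool:
    """Common if a listed handle reports MAF > 0.01 or >= 2 carrier samples."""
    if not (v.handles & POPULATION_HANDLES):
        return False
    return (v.maf is not None and v.maf > 0.01) or v.minor_allele_samples >= 2


def conservation(v: Variant) -> tuple:
    """Return (species matching the human reference allele, matching fraction)."""
    if not v.aligned_bases:
        return 0, 0.0
    matches = sum(1 for base in v.aligned_bases.upper() if base == v.ref.upper())
    return matches, matches / len(v.aligned_bases)


def rank_candidates(variants, min_fraction=0.9):
    """Drop common variants, keep conserved ones, sort most conserved first."""
    kept = [v for v in variants
            if not is_common(v) and conservation(v)[1] >= min_fraction]
    return sorted(kept, key=lambda v: conservation(v)[0], reverse=True)
```

In this sketch a variant at a position where 41 of 45 species carry the reference allele passes the 90% threshold, whereas one with 40 of 45 does not.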

Animal study licenses

All mouse and zebrafish experiments were approved by The University of Edinburgh ethical committee and performed under UK Home Office license number PIL 60/12763, 70/25905, I655D57B6, PA3527EC3 and 1724D1B2C; PPL 60/4418, 60/4424, IFC719EAD and 60/4290.

Transgenic zebrafish, In Situ Hybridization (ISH) and morphant generation

The wild-type and mutant versions of the six variants documented in and were analyzed for their regulatory activities in dual color enhancer-reporter transgenic assays in zebrafish embryos [22]. The sequences of the primers used in generating the constructs utilized in the assay are listed in S3 Table in . The number of independent lines analyzed for each enhancer and their expression sites is summarized in . The transgenic F1 embryos were processed for imaging as described [22]. The images were taken on a Nikon A1R confocal microscope and processed using A1R analysis software.

Project summary and XLID-associated regulatory variants and their predicted target genes.

(A) A diagrammatic summary of the experimental pipeline followed in this paper. (B) Schematic showing the genomic regions of the six variants in the five probands (S19, S24, S43, S3 and S31), indicating the location of the XLID-associated CRE variants along with their predicted target genes (indicated in red); genomic coordinates are from the hg19/GRCh37 genome build. The variants highlighted by a grey box were used to make mouse models. *Popmax Filtering AF (95% confidence).

A zebrafish six3 antisense morpholino oligonucleotide (Six3AMO) was obtained from Gene Tools, LLC, with the following sequence: 5´ GCTCTAAAGGAGACCTGAAAACCAT 3´. This morpholino is complementary to the highly conserved sequences around the translation initiation codon of both six3a and six3b, and hence inhibits the function of both zebrafish six3 genes [23]. As a control we used the Gene Tools LLC standard negative control morpholino: 5´ CCTCTTACCTCAGTTACAATTTATA 3´. The morpholinos were injected into at least 100 embryos at the 1- to 2-cell stage to deliver approximately 2.5 ng per embryo. RNA in situ hybridization on fish embryos was performed as described previously [24]. The sequences of primers used for synthesis of specific probes are listed in S3 Table in .

Generation of transgenic mice and embryo ISH

CRISPR/Cas9 gene targeting technology was used to generate mouse lines with orthologous mutations: Fmr1CRE and Tenm1CRE (Teneurin transmembrane protein 1). A double-stranded DNA oligomer that provides a template for the guide RNA sequence was cloned into px461. The details of the guide RNA and repair template sequences are provided in S1 Note in . The full gRNA template sequence was amplified from the resulting px461 clone using a universal reverse primer and T7-tagged forward primers. The guide RNA was generated from this PCR template using T7 RNA polymerase (NEB) and purified with RNeasy mini kit (Qiagen) purification columns. The zygotic injection mix contained Cas9 mRNA (Tebu Bioscience, 50 ng/μl), guide RNA (25 ng/μl) and single-stranded DNA repair template (IDT, 150 ng/μl). Injected embryos were transferred into the oviducts of pseudo-pregnant females to litter down. Genotyping of the resulting mice was performed by Sanger sequencing using tail tip DNA. F0 mice with the desired variant were crossed with C57BL/6 to generate a stable mouse line. In situ hybridization on mouse embryos was performed with DIG-labelled gene-specific antisense probes as previously described [25]. The sequences of primers used for synthesis of specific probes are listed in S3 Table in .

Olfaction test

Male wild-type, Fmr1CRE and Tenm1CRE littermates at P25 were subjected to the buried food test assay. For three consecutive days before testing, ¼ chocolate button (Cadbury) was placed in the home cage for 15 minutes to habituate the mice to the food reward. 12 hours before the test, all food was removed from the home cage to motivate the mouse to find the food reward during the test. After 12 hours, the mouse was placed in a clean cage with fresh bedding in which ¼ chocolate button had been buried 1 cm beneath the bedding. The time taken to find the buried food was scored and the test was stopped if the mouse had not found the food after 15 minutes. The bedding was replaced and the cage cleaned with 1% Conficlean between mice. All mice were scored blind to genotype. Unpaired t-tests were used to determine statistical significance.

Seizure propensity testing of Fmr1CRE

Male wild-type and Fmr1CRE littermates at P25 were tested for audiogenic seizures as described previously [26]. Briefly, animals were transferred to a transparent plastic test chamber and, after 1 minute of habituation, exposed to a 2 min sampling of a modified personal alarm held at > 130dB. Seizures were scored for incidence (seizure/no seizure) and severity, with an increasing scale of 1 = wild running, 2 = clonic seizure, and 3 = tonic seizure. All mice were tested and scored blind to genotype. Statistical significance for incidence was determined using two-tailed Fisher’s exact test.
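
As a worked illustration of the incidence comparison, the two-tailed Fisher's exact test can be computed directly from the hypergeometric distribution. The sketch below uses only the Python standard library; the function name is illustrative, and the counts are those given in the Fig 6C legend (seizures in 1 of 21 Fmr1CRE mice and 3 of 9 wild-type littermates). It is not the software the authors used.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Probability of k events in row 1 with all margins held fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= p_obs * (1 + 1e-9))


# Rows: Fmr1CRE, wild-type. Columns: seizure, no seizure.
p_value = fisher_exact_two_tailed(1, 20, 3, 6)
```

Under this calculation the p-value is 1890/27405, about 0.069, which is above the 0.05 threshold and agrees with the reported absence of a significant difference in seizure incidence.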

Basal protein synthesis and FMRP western blotting

Protein synthesis levels were measured following the protocol outlined by Osterweil [27]. The detailed protocol is described in S2 Note in . For western blots, hippocampal slices from P25 male wild-type and Fmr1CRE knock-in mutant littermates were dissected and homogenized in lysis buffer (20 mM HEPES pH 7.4, 0.5% Triton X-100, 150 mM NaCl, 10% glycerol, 5 mM EDTA with protease inhibitor cocktail (Roche)), incubated at 4°C for 30 min and then centrifuged at 14000 rpm for 30 min to collect the supernatant. These samples were used directly for SDS-PAGE and transferred onto nitrocellulose membranes for immunoblot analysis with FMR1 antibody (MAB2160, Millipore). Densitometry was performed on scanned blot film using Image Studio Lite software. Each signal was normalized to total protein in the same blot. Values are shown as a percentage of average WT for graphical purposes.
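
The two normalization steps described for the densitometry (band signal divided by total protein in the same lane, then expressed as a percentage of the wild-type average) amount to the following calculation. This is a minimal sketch with a hypothetical function name and input layout; it does not reproduce Image Studio Lite output.

```python
from statistics import mean


def percent_of_wt(band, total_protein, is_wt):
    """Normalize each band to total protein, then scale so the WT mean is 100%.

    band, total_protein: one densitometry value per lane (same blot).
    is_wt: one boolean per lane, True for wild-type samples.
    """
    if not (len(band) == len(total_protein) == len(is_wt)):
        raise ValueError("inputs must have one entry per lane")
    normalized = [b / t for b, t in zip(band, total_protein)]
    wt_mean = mean(n for n, wt in zip(normalized, is_wt) if wt)
    return [100.0 * n / wt_mean for n in normalized]
```

Expressing values relative to the wild-type mean of the same blot means that only within-blot comparisons are meaningful, which is why both genotypes need to be present on each membrane.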

Hippocampal slice electrophysiology

Electrophysiology experiments were performed as described [28]. The detailed protocol is described in S3 Note in .

RNAseq and RNAscope analysis

In situ RNA hybridization was performed using the RNAscope assay (Advanced Cell Diagnostics, ACD, Hayward, CA, USA) according to the manufacturer’s recommendations. The detailed protocols are described in S4 Note in . The images of sections were acquired on the multimodal Imaging Platform Dragonfly (Andor Technologies, Belfast, UK) using an air 40x Plan Fluor 0.75 DIC N2 objective. Data were collected in Spinning Disk 25 μm pinhole mode on the high sensitivity iXon888 EMCCD camera. According to Advanced Cell Diagnostics, each mRNA molecule hybridized to a probe appears as a separate small punctum. Data visualization and spot counting were done using IMARIS 8.4 (Bitplane). The details of the RNAseq analysis are given in S5 Note in .

Statistical analysis

Statistical analysis was performed using two-tailed Student’s t-test (Prism 4, GraphPad Software, La Jolla, CA, USA) except for Fig 6C, where significance was determined using two-tailed Fisher’s exact test (appropriate for analyzing nominal data sets). A p-value of < 0.05 was considered statistically significant. Data are shown as mean ± SE for the number of replicates (n) used in the experiments.
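
For readers who want to reproduce the summary statistics outside Prism, the mean ± SE and the unpaired Student's t statistic can be computed as follows. This sketch assumes the classical equal-variance form of the test; the function names are illustrative. Converting the statistic to a two-tailed p-value requires the t distribution, which is not in the Python standard library and is therefore left out.

```python
from math import sqrt
from statistics import mean, stdev, variance


def mean_se(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))


def student_t(a, b):
    """Unpaired, equal-variance t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df
```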
Fig 6

Functional analysis of FMR1CRE mice.

(A) Comparison of mGluR-dependent long-term depression (LTD) in CA3-CA1 components of the hippocampus of eight Fmr1CRE male mice and eight wild-type male littermates indicates a significant Fmr1CRE-associated decrease in LTD. (B) All quantitative data are presented as mean ± SE and a p value of 0.05 or less is considered statistically significant. (C) No significant difference was observed in audiogenic seizure incidence in the hemizygous mice with the variant Fmr1CRE (1/21) compared to wild-type littermates (3/9). Statistical significance is determined using two-tailed Fisher’s exact test and a p value of 0.05 or less is considered statistically significant. (D) Significant increase in bulk protein synthesis levels in slices from dorsal hippocampus of Fmr1CRE knock-in mutant male mice as compared to wild-type male littermates. Quantitative data are derived from the number of biological replicates used (n = 6) in the experiments. Levels of significance were determined by 2-tailed Student’s t-test, with p values lower than 0.05 considered statistically significant. (* means difference is statistically significant).

Results

Identifying a cohort likely to be enriched in disease-associated CRE

In a previous study we found that 155/208 families with apparent XLID had no detectable disease-associated variants in the coding sequence on the X chromosome [16]. We chose affected male probands from 48 of these undiagnosed families for inclusion in the present study. Each unrelated proband had 3 or more similarly affected relatives, with the inheritance pattern being strongly suggestive of XLID. We reasoned that these families are likely to be enriched for highly penetrant causative regulatory mutations. In addition, the high prior probability that any causative variant in these families would be located on the X chromosome significantly reduced the genomic space for interrogation. A summary of the overall study design is presented in .

Identifying rare variants in CRE on the X chromosome

We performed targeted sequencing in each proband using a custom 15.9 Mb oligonucleotide pull-down consisting of 227,323 baits. These baits were designed to capture two non-overlapping sets of target sequences from the human X chromosome (chrX); all chrX coding exons and all chrX CRE. The set of chrX CRE, accounting for 4.4% of chrX genomic sequence, had been defined in a previous study [15]. This study also showed that the maintenance of linkage between a CRE and neighbouring genes throughout evolution was an accurate way to identify the target gene(s). Target genes assigned using this conserved synteny approach allowed a maximum CRE to gene distance of 1.5 Mb [15]. Approximately a third of chrX CRE could be assigned to a single gene with the remainder having >1 equally plausible target. 389/812 protein coding genes on the X chromosome could be assigned to at least one CRE. Following sequencing and alignment, a total of 40,699 variant calls passed basic quality controls in these individuals (S1 Fig in ). As expected from our previous work [16], no likely causative variants were identified in the coding exons. 628 hemizygous variants were identified in high confidence putative CRE and were not present in the population-based whole genome sequence data that was available at the time (S1 Fig in ). To further increase the likelihood of identifying clinically-interpretable variants we focused on the 31/628 altered highly conserved bases in CRE that were predicted to control known XLID genes. 30/31 were confirmed by Sanger sequence analysis in the probands. 6 of these variants were shown to segregate appropriately in the XLID families using samples from additional affected, unaffected males and obligate female carriers (). Details of segregation in available family members are shown in S2 Table and S3-S33 Figs in . 4/48 probands carried one of these six variants and 1/48 carried two.
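
The successive reductions in variant number can be tabulated to make the stringency of each step explicit. The counts below are taken from the Methods and Results; the ordering of the steps into a single funnel is inferred from the text, and the labels are paraphrases rather than the authors' terminology.

```python
# Variant counts reported in the text, in the inferred filtering order.
FUNNEL = [
    ("passed variant quality filtering", 40_699),
    ("rare after dbSNP filtering", 9_577),
    ("hemizygous, in high-confidence CRE, absent from population data", 628),
    ("conserved base in CRE assigned to a known XLID gene", 31),
    ("confirmed by Sanger sequencing", 30),
    ("segregating appropriately in the family", 6),
]


def retention(funnel):
    """Fraction of variants kept at each step relative to the previous step."""
    return [(label, n, n / prev)
            for (_, prev), (label, n) in zip(funnel, funnel[1:])]


# Four probands carry one variant each and one proband carries two.
variants_in_probands = 4 * 1 + 1 * 2
probands_with_variant = 4 + 1

for label, n, kept in retention(FUNNEL):
    print(f"{n:>6}  {kept:6.1%}  {label}")
```

The final tally is consistent with the abstract: six variants distributed across five of the 48 probands.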

FMR1 and TENM1 CRE variants alter enhancer function in zebrafish transgenics

The reference and alternative base versions of all six CRE variants were then tested for CRE function using a dual-color fluorescent transgenic assay in zebrafish [22]. Multiple stable transgenic lines were created in which the wild-type and mutant human CRE drives expression of different fluorescent proteins in the same fish (Figs and ). Reporter expression domains were scored in living embryos between 24 hours and 96 hours post-fertilization (hpf). Only consistent differences between the reference and alternative alleles in at least 3 independent lines were taken as evidence of a functional effect of the mutation. The specific criteria for a variant to be included in future in vivo functional studies were: 1. Strong evidence of variant-associated disruption of CRE activity in the developing brain. 2. A significant overlap between the wild-type CRE activity and that of the endogenous neural expression of the orthologous zebrafish gene (Figs and ).
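
The selection rule stated above can be written as a simple predicate. This is a deliberately reduced sketch: the record type and field names are hypothetical, and the two criteria, which in practice rest on expert scoring of expression domains, are collapsed to booleans.

```python
from dataclasses import dataclass

MIN_CONSISTENT_LINES = 3  # threshold stated in the text


@dataclass
class EnhancerAssay:  # hypothetical summary record for one CRE variant
    lines_with_consistent_difference: int  # independent stable lines that agree
    disrupted_in_developing_brain: bool    # criterion 1
    overlaps_endogenous_expression: bool   # criterion 2


def has_functional_effect(assay: EnhancerAssay) -> bool:
    """Evidence of an effect requires agreement across >= 3 independent lines."""
    return assay.lines_with_consistent_difference >= MIN_CONSISTENT_LINES


def selected_for_followup(assay: EnhancerAssay) -> bool:
    """A variant proceeds to further in vivo work only if all conditions hold."""
    return (has_functional_effect(assay)
            and assay.disrupted_in_developing_brain
            and assay.overlaps_endogenous_expression)
```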

TENM1CRE alters enhancer function in the zebrafish brain by creating a repressive SIX3 binding site.

(A) A diagrammatic summary of the dual color fluorescence assay used in this study. The size of the human TENM1 element is provided in the left hand panel in base pairs (bp). (B) Human and mouse (TENM1CRE/Tenm1CRE) sequences are shown with the variant base marked in blue, resulting in gain of SIX3/SIX6 and HDX binding sites in TENM1CRE and Six6 and Hdx binding sites in Tenm1CRE. (C) mRNA in situ hybridization showing expression of tenm1 in midbrain, hindbrain and neural tube during embryonic development in wild-type zebrafish. (D-E) Dual color fluorescent transgenic assay in zebrafish with wild-type (Wt) and mutant TENM1CRE driving eGFP and mCherry expression respectively. Loss of enhancer activity is observed in midbrain and hindbrain with the mutant TENM1CRE allele. Further examples of embryos for different stable lines are shown in S34 Fig in . (E-F) six3 knockdown rescues the effect of the mutant variant on the activity of TENM1CRE. Control morpholino-injected embryos show loss of reporter activity in midbrain and hindbrain with the mutant allele, where the mutation creates a Six3 binding site (E). Knockdown of Six3 rescues the activity of the mutant allele in the midbrain and hindbrain (F). MB: Midbrain; HB: Hindbrain; NT: Neural tube; hpf: Hours post fertilization.

FMR1CRE alters enhancer function in the zebrafish brain.

(A) A diagrammatic summary of the dual color fluorescence assay plasmid constructs used in this study. The size of the human FMR1 element is provided in base pairs (bp). (B) Human and mouse (FMR1CRE/Fmr1CRE) sequences are shown with the variant base marked in blue, resulting in predicted loss of a RFX2/Rfx2 binding site in FMR1CRE/Fmr1CRE. (C) mRNA in situ hybridization showing expression of fmr1 in forebrain and midbrain during embryonic development in wild-type zebrafish. (D-E) Dual color fluorescent transgenic assay in zebrafish with wild-type (Wt) and mutant FMR1CRE driving eGFP and mCherry expression respectively. Loss of enhancer activity is observed in forebrain with the mutant FMR1CRE allele. Further examples of embryos for different stable lines are shown in S35 Fig in . FB: Forebrain; MB: Midbrain; TG: Trigeminal ganglia; NP: Nasal placode; hpf: Hours post fertilization.

Only two CRE variants in two different probands fulfilled these criteria (): TENM1CRE (proband S24) and FMR1CRE (proband S3). TENM1CRE showed a loss of reporter expression in the mid- and hind-brain (). FMR1CRE resulted in the loss of expression in the forebrain but normal expression in the trigeminal ganglia ().

Transcription factor binding site analysis of CRE variants

We next looked at the effect of these CRE variants on putative transcription factor binding sites. We restricted the analysis to sites that were gained or lost in both the human and mouse versions of the CRE. The TENM1CRE variant created a novel site predicted to bind SIX3 (SIX homeobox 3) or SIX6 (SIX homeobox 6) in both the human and orthologous mouse CRE (). SIX3 is essential for early brain development and has pathway-specific activator and repressor activity [29]. To determine whether SIX3-mediated repression may be responsible for the altered enhancer activity in the variant TENM1CRE, we used morpholino-induced knock-down of endogenous six3 in TENM1WT/TENM1CRE transgenic embryos. The phenotypic effect of the morpholinos targeting zebrafish six3 was assessed by 1-cell embryo injections. The amount of morpholino was titrated to the point where no morphological anomaly was seen at 24 hours. When this concentration of morpholino was injected into TENM1WT/TENM1CRE transgenic embryos, there was rescue of the activity of the mutant CRE in the midbrain and hindbrain with no effect on the wild-type reporter (), and knockdown of Six3 protein was confirmed by western blotting (S40 Fig and S6 Note in ). This supports the hypothesis that the CRE variant had created a repressive SIX3 binding site as the mechanism for the transcriptional effect in zebrafish embryos. The loss of a RFX2 binding site in both human and mouse FMR1/Fmr1 CRE was predicted with relatively low confidence (). RFX2 (Regulatory factor X2) is a transcription factor that is required for spermatogenesis in mice and has a wider role in the control of ciliogenesis [30-32]. Given the low confidence of this prediction we did not attempt any functional validation.
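
The restriction to sites gained or lost in both species reduces to set operations on the motif hits for each sequence. The sketch below assumes the motif scan results have already been parsed into dictionaries mapping motif name to best p-value (a hypothetical input format, not FIMO's native output), and it matches human and mouse motifs by upper-casing their names, which is a simplification.

```python
P_THRESHOLD = 1e-3  # p-value cut-off stated in the Methods


def significant(hits):
    """hits maps motif name -> best p-value; keep names at or below threshold."""
    return {name.upper() for name, p in hits.items() if p <= P_THRESHOLD}


def gained_and_lost(wt_hits, mut_hits):
    """Motifs present only in the mutant (gained) or only in the WT (lost)."""
    wt, mut = significant(wt_hits), significant(mut_hits)
    return mut - wt, wt - mut


def shared_changes(human_wt, human_mut, mouse_wt, mouse_mut):
    """Motifs gained, and motifs lost, in both the human and the mouse CRE."""
    h_gain, h_loss = gained_and_lost(human_wt, human_mut)
    m_gain, m_loss = gained_and_lost(mouse_wt, mouse_mut)
    return h_gain & m_gain, h_loss & m_loss
```

Requiring agreement between species discards predictions that depend on sequence context unique to one genome, at the cost of missing changes whose flanking bases have diverged.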

Fmr1CRE and Tenm1CRE mouse models

CRISPR/Cas9-induced homologous recombination in mouse zygotes allowed us to individually “knock in” the same nucleotide changes identified in human FMR1CRE and TENM1CRE at the orthologous positions in the mouse genome (). We established multiple independent mouse lines for each CRE variant on a C57BL/6 background. All lines produced viable hemizygous mutant animals at the expected ratio; these animals were healthy and fertile with no obvious morphological abnormalities. Whole-mount in situ hybridization (WISH) with riboprobes targeting either Fmr1 or Tenm1 was used to compare developmental expression patterns between wild-type and mutant male 13.5 gestational day (GD) embryos. Fmr1CRE caused a significant reduction in Fmr1 expression in the olfactory placodes and the forebrain (). Fmr1 WISH on four other wild-type and Fmr1CRE embryos is shown in S41 Fig in . Tenm1CRE did not show a consistent effect on Tenm1 expression in male embryos at 13.5GD.

Expression levels of Fmr1 and FMRP in Fmr1CRE.

Frontal (A) and sagittal (B) views of 13.5GD embryonic mouse heads following whole-mount in situ hybridization for Fmr1. In each panel the wild-type male embryo is shown on the left and the Fmr1CRE embryo on the right. There is loss of expression of Fmr1 in the nasal placode and midbrain of Fmr1CRE mutant embryos as compared to wild-type embryos. The Fmr1CRE embryos had been deliberately over-developed in the chromogenic substrate compared to the wild-type embryos to emphasize the signal difference. Sagittal H&E stained section of whole brain (C) with detailed view (white dashed box) of the hippocampus (D), with the hippocampal regions analysed in (F) numbered 1–8, starting from the dentate gyrus. (E) Reference image of RNAscope processed section with Fmr1 transcript (red), Pax6 transcript (green) and nucleus (blue/DAPI). Each transcript is represented by a spot following the quantitative image processing. (F) Graphical representation of Fmr1 transcripts normalised to Pax6 transcripts (used as control) in Fmr1CRE mice (purple) compared to wild-type littermates (orange); data represent the average of four replicates (n = 4) ± SE. Levels of significance were determined by 2-tailed Student’s t-test, with p values lower than 0.05 considered statistically significant. No significant difference was observed in the Fmr1 transcript levels. (G) Western blot of hippocampal tissue from four Fmr1CRE, four wild-type and two Fmr1-null mice at P25 using an antibody that detects FMRP. (H) Quantitation of the FMRP bands in (G) indicating an apparent increase in FMRP in Fmr1CRE hippocampal slices. All quantitative data are presented as mean ± SE and a p value of 0.05 or less is considered statistically significant. (* means difference is statistically significant). FB: Forebrain; MB: Midbrain; NP: Nasal placode; DG: Dentate gyrus.

To determine if there were measurable phenotypic effects segregating with either CRE variant we first tested olfaction. This sense was selected for two reasons. First, complete loss of Fmr1 expression in the olfactory placode in Fmr1CRE embryos was observed. Second, mutations in TENM1/Tenm1 have recently been identified in humans and mice in association with congenital generalized anosmia [33]. Using a buried chocolate button test, Fmr1CRE mice showed a significant increase in time to discovery compared to wild-type male littermates (). Tenm1CRE mice had olfactory function similar to wild-type male littermates ().

Olfaction testing of Fmr1CRE and Tenm1CRE mouse lines.

(A, B) The mice hemizygous for the variant in Fmr1CRE showed a significant increase in time to discovery compared to wild-type male controls in a buried food test. (C, D) No significant difference in the levels of latency to find food was observed in mice hemizygous for the variant in Tenm1 compared to wild-type littermates. The numbers of animals tested (n) are given in (A) and (C). All quantitative data are presented as mean ±SE and p value of 0.05 or less is considered statistically significant. (* means difference is statistically significant).

Fmr1/FMRP-focused analyses

Loss of FMR1 expression is responsible for Fragile X syndrome, the most common form of XLID [34]. Although we detected clear differences in Fmr1 expression in embryonic midbrain and nasal placodes (), we did not find a significant difference in Fmr1 levels in the post-natal brains of male animals by quantitative RT-PCR at postnatal day 7 (P7) or P14 (S42 Fig in ). Similarly, at P25 we found no difference in Fmr1 levels in forebrain, midbrain or hindbrain using mRNA sequencing (S43 Fig in ) or in the ratio of Fmr1:Pax6 transcripts in different regions of the hippocampus using in situ hybridization with dual-color RNAScope probe sets (). Given the gene expression results, we were surprised to find an apparent increase in FMRP (fragile X mental retardation protein) abundance in the hippocampus of Fmr1CRE male mice compared to wild-type littermates using western blotting (). We found a decrease in mGluR-dependent long-term depression (LTD) in the CA3-CA1 hippocampus of Fmr1CRE males (). We considered the decrease in LTD to be consistent with the increased levels of FMRP, given that an exaggerated LTD is a consistent finding in Fmr1-null animals [35]. A predisposition to audiogenic seizures is also a consistent phenotype in Fmr1-null mice, but Fmr1CRE male mice showed no increase in such seizures (). The finding of a significant increase in bulk protein translation levels in the hippocampus of Fmr1CRE male mice () was unexpected, as this too is considered to be a marker of loss of Fmr1 function [27].

Functional analysis of FMR1CRE mice.

(A) Comparison of mGluR-dependent long-term depression (LTD) in CA3-CA1 components of the hippocampus of eight Fmr1CRE male mice and eight wild-type male littermates indicates a significant Fmr1CRE-associated decrease in LTD. (B) All quantitative data are presented as mean ± SE, and a p value of 0.05 or less is considered statistically significant. (C) No significant difference was observed in audiogenic seizure incidence in hemizygous mice with the Fmr1CRE variant (1/21) compared to wild-type littermates (3/9). Statistical significance was determined using a two-tailed Fisher’s exact test, and a p value of 0.05 or less is considered statistically significant. (D) Significant increase in bulk protein synthesis levels in slices from dorsal hippocampus of Fmr1CRE knock-in mutant male mice compared to wild-type male littermates. Quantitative data are derived from the biological replicates used in the experiments (n = 6). Levels of significance were determined by two-tailed Student’s t-test, with p values lower than 0.05 considered statistically significant (* indicates a statistically significant difference).

Re-evaluation of affected individuals within the family in which FMR1CRE is segregating revealed no clinical features suggestive of a Fragile X (FRAX) syndrome diagnosis ([OMIM #300624]; FMR1 silencing) other than macrocephaly and intellectual disability. Importantly, none of the individuals carrying FMR1CRE showed signs of FRAX Tremor and Ataxia Syndrome (FRAXTAS [OMIM #300623]; FMR1 over-expression) [36]. There were no obvious olfactory anomalies in the affected individuals from this family and no seizure predisposition. Clinical whole genome sequencing [17] of individual S3 (FMR1CRE proband) did not identify any other plausible cause of his intellectual disability.
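The two-tailed Fisher’s exact test cited for the audiogenic seizure counts in panel (C) of the legend above can be sketched with the Python standard library. This is an illustration only, not the authors' analysis code: the function name is hypothetical, and it assumes that "1/21" and "3/9" mean seizing animals over animals tested.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table (with the observed
    margins) that is no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Probability of k events in row 1 given fixed row/column totals
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # Small relative tolerance so that exact ties are counted as "as extreme"
    return sum(
        p for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Assumed reading of the legend: 1 of 21 Fmr1CRE mice and 3 of 9 wild-type
# littermates showed audiogenic seizures.
p = fisher_exact_two_tailed(1, 20, 3, 6)
print(f"two-tailed p = {p:.3f}")
```

Under that assumed reading of the counts, the p value comes out at roughly 0.07, above the 0.05 threshold stated in the legend.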

Family 347 pedigree.

Pedigree of Family 347, of which individual S3 is a member, showing segregation of the mutation affecting FMR1 expression.

Taken together, the data above strongly suggest that Fmr1CRE/FMR1CRE does not result in simple loss or gain of FMR1 function but rather in a complex site- and stage-specific misregulation of gene product levels and cellular function.

The impact of gnomAD 2.1 on the interpretation of CRE variants

The release of gnomAD 2.1 in late 2018 [37] represented a very significant change in our knowledge of the population allele frequencies in the non-coding part of the human genome. By this point we had already performed our zebrafish dual-colour transgenic screen and created the mouse models for Tenm1CRE and Fmr1CRE. The gnomAD-derived variant allele frequencies (AF) of the six variants which survived our original filtering showed that three of the variants remained unique: FMR1CRE, POLA1/PCYT1BCRE (DNA polymerase alpha 1, catalytic subunit) and KDM6ACRE (Lysine demethylase 6A). However, TENM1CRE, ARHGEF6CRE (Rac/Cdc42 guanine nucleotide exchange factor 6) and AFF2CRE (AF4/FMR2 family member 2) were observed in the gnomAD population, and the latter two variants were also seen in the hemizygous state, suggesting that they were very unlikely to be a cause of XLID. Although the allele frequency of TENM1CRE was below 1 in 10,000, a frequency commonly used for clinical filtering of ultra-rare variants, we wished to know if this should change its “plausibly causative” status. Using the approach of Whiffin et al [38] we calculated the maximum plausible AF for any XLID causal variant. We chose conservative parameters: 0.01 for genetic heterogeneity (i.e. 1% of all XLID without a known diagnosis is caused by variation in a particular CRE), 0.2 for allelic heterogeneity (i.e. only 5 different causative variants can exist per CRE) and 0.5 for penetrance (this is complicated by X-linked inheritance but likely to be ~1 in males and ≥0.1 in females). These parameters gave a maximum permitted 95% confidence AF of 4e-06, suggesting that TENM1CRE is not a plausible cause of XLID.
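The arithmetic behind a maximum plausible allele frequency can be sketched as follows. This is a simplified illustration of the multiplicative form used in this kind of calculation, not the authors' code. The function names are hypothetical. The disease prevalence and the handling of X-linked inheritance are not stated in this section, so the prevalence, the `alleles_per_individual` divisor and the reference-panel allele number below are placeholders.

```python
from math import exp


def max_credible_af(prevalence, genetic_het, allelic_het, penetrance,
                    alleles_per_individual=1):
    """Upper bound on the population AF of a single causative variant.

    alleles_per_individual=1 is an assumption made here for hemizygous
    males; it is not taken from the text.
    """
    return (prevalence * genetic_het * allelic_het
            / (penetrance * alleles_per_individual))


def max_tolerated_allele_count(max_af, allele_number, confidence=0.95):
    """Smallest k with P(X <= k) >= confidence for X ~ Poisson(max_af * AN)."""
    lam = max_af * allele_number
    if lam > 700:
        raise ValueError("expected count too large for this simple sketch")
    k = 0
    term = exp(-lam)  # P(X = 0)
    cdf = term
    while cdf < confidence:
        k += 1
        term *= lam / k  # P(X = k) computed from P(X = k - 1)
        cdf += term
    return k


# Parameters quoted in the text
GENETIC_HET = 0.01
ALLELIC_HET = 0.2
PENETRANCE = 0.5
# Hypothetical placeholders: neither value is given in this section
ASSUMED_PREVALENCE = 1e-3
ASSUMED_ALLELE_NUMBER = 100_000

af_limit = max_credible_af(ASSUMED_PREVALENCE, GENETIC_HET, ALLELIC_HET, PENETRANCE)
ac_limit = max_tolerated_allele_count(af_limit, ASSUMED_ALLELE_NUMBER)
print(f"max credible AF = {af_limit:.1e}; max tolerated allele count = {ac_limit}")
```

A variant observed more often in the reference panel than the tolerated allele count would be treated as too common to be causative under the chosen parameters. Lower penetrance or higher heterogeneity values relax the bound.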

Discussion

The motivation for initiating this study was the difficulty in assigning pathogenic or likely pathogenic status to any de novo or segregating variant in an intergenic regulatory sequence. Such ultra-rare variants would almost universally be considered of uncertain significance using current best practice guidelines for diagnostic interpretation of genomic sequence variants [39,40]. However, functional assays demonstrating an abnormality of gene function associated with a CRE variant (coded as PS3 in the ACMG guidelines) have the potential to change many variants of uncertain significance (VUS) to likely pathogenic status [41]. The question then becomes: how should we use data from functional assays in clinical interpretation of regulatory variants? Given the rapid switch from targeted whole exome sequencing to whole genome sequencing, it is likely that there will be an increasing need to develop a rational approach to the interpretation of ultra-rare regulatory variants.

Here we took an integrated clinical, genetic, developmental, behavioural and neurophysiological approach to the analysis of CRE variants identified in a cohort of affected individuals that should be enriched for causative cis-regulatory mutations. XLID accounts for ~16% of ID in males [42]. Mutations in the coding region of at least 81 different genes [16,43] have been identified as causing XLID. Given the significant contribution of XLID to ID and the observed regulatory variant enrichment in a large cohort of individuals with neurodevelopmental disorders [13], we reasoned that we could increase the prior probability of identifying likely causative mutations by restricting the genomic search space to the X chromosome. We also chose to limit our investigations to variants in enhancers targeting known XLID genes, since most of the known disease-associated regulatory mutations were identified because they partially [44] or fully [45] phenocopy loss-of-function mutations in the target gene.
If this were true for our cohort, then matching the pattern of clinical features of individuals carrying a specific regulatory mutation to those of the syndrome associated with intragenic mutations would have diagnostic value. Our experimental design can be justified on pragmatic grounds. However, we do recognize some significant problems with this filtering strategy. First, the CNE variant could induce expression in cell types where the target gene is normally silenced, which is likely to result in a phenotype completely distinct from that associated with intragenic mutations. Secondly, if intragenic mutations in the target gene result in early embryonic lethality, this gene would not be identified as a cause of XLID; mutations in a CRE controlling only neural expression of such an essential gene could nevertheless cause XLID, but would be missed in our analysis, which is focussed on known XLID genes.

Recently, somatic mutations in the brain have been implicated in the causation of neurodevelopmental disorders [46]. The fact that we have selected individuals with a positive family history would significantly reduce the prior probability of this disease mechanism in our cohort. For this reason, we have designed our variant filtering strategy to identify constitutional mutations and exclude variants that are likely to be mosaic.

In vivo analysis of enhancer activity using the dual-color reporter transgenic zebrafish embryos proved to be a useful screen. Four CRE (POLA1-PCYT1B, ARHGEF6, KDM6A, AFF2) showed inconsistent and/or non-specific reporter activity with no difference between the wild-type and mutant alleles. However, this analysis also provided evidence for abnormal developmental gene regulation for two CRE, controlling FMR1 and TENM1 respectively. For TENM1CRE we could identify the mechanism of the altered reporter gene expression in zebrafish: de novo formation of a repressive six3 binding site.
It is remarkable that, in the absence of obvious homology with human CRE sequences (S44 Fig), the developing zebrafish brain appears to use the same transcription factors to control site- and stage-specific gene activation. This argues for a subtler shape-based interaction of DNA with transcription factors that we are, as yet, unable to understand. Defining the grammar of this structural effect will significantly aid our interpretation of variants in the non-coding genome.

The unique CRE variant FMR1CRE is the most plausible disease-associated allele of those identified in this study. This variant produced abnormal embryonic expression of endogenous Fmr1 in a mouse model, consistent with the tissue-specific loss of function during early brain development in the transgenic zebrafish embryos. In contrast, we were unable to show evidence of altered transcriptional regulation in the post-natal brain of Fmr1CRE male mice. This was particularly interesting given the apparent increased level of FMRP protein in hippocampal slices derived from P25 Fmr1CRE mice. This increase may explain why the electrophysiological change in LTD we observed was the opposite of that seen in Fmr1 KO mice. The increase in bulk protein synthesis was surprising, as this effect is seen in Fmr1 KO mice. These apparently paradoxical results are likely to reflect a complex developmental mis-programming of the cells in the hippocampus.

The results outlined above provide a clear explanation for why proband S3 and his affected male relatives carrying FMR1CRE do not show a clinical pattern typical of either Fragile X syndrome [OMIM 300624] or FRAXTAS [OMIM 300623]. The family presented with a non-specific intellectual disability associated with mild macrocephaly. We consider it likely that many causative CRE variants will be associated with clinical features that differ significantly from those seen in association with intragenic mutations of the target gene.
This means that we have relatively limited ability to predict the phenotypes associated with regulatory mutations, even when the clinical impact of intragenic mutations of the target gene is well characterised.

While it remains challenging to recognise causative CRE variants, the availability of population-based allele frequencies from whole genome sequencing data has certainly improved our ability to identify those of likely neutral or low-penetrance effect. The gnomAD 2.1 data allowed us to show that TENM1CRE was implausible as an XLID causative variant despite it being in an evolutionarily conserved, non-redundant CRE with a strong repressive effect in the zebrafish transgenic analyses. Filtering for extreme rarity of individual alleles will aid the identification of causative variants in CRE that are under high levels of selective constraint within human populations [47]. However, human genetic filtering will have to be supported by strong, multi-source, disease-relevant functional data before a likely causative status can be assigned to any CRE variant.

List of coordinates used to design customized capture library. (TXT)

The file has details for S1-S44 Figs, S1-S3 Tables and S1-S6 Notes. (DOCX)

Transfer Alert

This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present.

14 Apr 2021

Submitted filename: Point by Point Reviewer Response.docx

12 May 2021

PONE-D-21-12396

Functional predictors of causation for cis-regulatory mutations

PLOS ONE

Dear Dr. FitzPatrick,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Jun 26 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.
If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Chaeyoung Lee

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. To comply with PLOS ONE submissions requirements, in your Methods section, please provide additional information on the animal research and ensure you have included details on (1) methods of sacrifice, (2) methods of anesthesia and/or analgesia, and (3) efforts to alleviate suffering.

3. We note that you are reporting an analysis of a microarray, next-generation sequencing, or deep sequencing data set. PLOS requires that authors comply with field-specific standards for preparation, recording, and deposition of data in repositories appropriate to their field. Please upload these data to a stable, public repository (such as ArrayExpress, Gene Expression Omnibus (GEO), DNA Data Bank of Japan (DDBJ), NCBI GenBank, NCBI Sequence Read Archive, or EMBL Nucleotide Sequence Database (ENA)).
In your revised cover letter, please provide the relevant accession numbers that may be used to access these data. For a full list of recommended repositories, see http://journals.plos.org/plosone/s/data-availability#loc-omics or http://journals.plos.org/plosone/s/data-availability#loc-sequencing.

4. PLOS ONE now requires that authors provide the original uncropped and unadjusted images underlying all blot or gel results reported in a submission’s figures or Supporting Information files. This policy and the journal’s other requirements for blot/gel reporting and figure preparation are described in detail at https://journals.plos.org/plosone/s/figures#loc-blot-and-gel-reporting-requirements and https://journals.plos.org/plosone/s/figures#loc-preparing-figures-from-image-files. When you submit your revised manuscript, please ensure that your figures adhere fully to these guidelines and provide the original underlying images for all blot or gel data reported in your submission. See the following link for instructions on providing the original image data: https://journals.plos.org/plosone/s/figures#loc-original-images-for-blots-and-gels.

In your cover letter, please note whether your blot/gel image data are in Supporting Information or posted at a public data repository, provide the repository URL if relevant, and provide specific details as to which raw blot/gel images, if any, are not available. Email us at plosone@plos.org if you have any questions.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: N/A

Reviewer #3: Yes

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: This manuscript by Bengani and FitzPatrick used an updated method to determine causality of SNPs found in genomes of patients with genetically inherited intellectual disability (ID).
Those are resourceful information to the other peers studying X-linked recessive forms of ID and represent diagnostic values for this particular disorder. The data in this work use a combination of zebra fish and mouse genetics models and functionally validated the clinically identified rare variants on the X chromosomes. Overall, I agree that the study is conclusive and thus recommend for acceptance. I list two minor issues and I hope the authors can address them quickly before publication.

1, An increasing body of evidences have implied brains are highly associated with somatic mutations, which often cause defects either derived from neural degeneration or development. Dr. Christopher Walsh from Harvard University is an expert in this field. The point is brain genomes are not identical to genomes of other tissue types as the method presented in this study used nonbrain tissues for sequencing. Would the authors comments on this point with respect to performance of gnomAD2.1 in the discussion section?

2, The quality/resolution of some figures are low, e.g. fig. 1, 3, 4 and 5. Is there any way to increase the readability of these figures?

Reviewer #2: Bengani et al. performed a study for the identification of functional cis-regulatory mutations involved in the development of X-linked intellectual disability. The task is definitely very interesting, but also very challenging and time consuming. Indeed, it took several years and two different in vivo models to find data supporting a causative role for one out of 31 highly conserved CRE initially identified, the FMR1-CRE variant. I think that this manuscript is worth publishing because it has a quite strong rationale behind the selection of the non-coding variants to be tested and it shows that it is possible to make functional sense of mutations in regulatory elements. However, the phenotype for which the authors are trying to find a genetic evidence in CRE is very complex and very hard to dissect.
Indeed, it is very difficult to find a straightforward link with a single variant in CRE. The phenotypes observed in the mouse model and the link to the human disease appears quite weak. I think this should be clearly underlined in the manuscript and better explained. Moreover, I think that the title, being very general, overstates the results of the manuscript and it should be changed to make it more related to the findings. The description of the statistical analysis applied by the authors is lacking. Moreover, I would suggest to check the statistical methods used for the analysis, in general, and, in particular, for Fmr1 transcription by qPCR (Fig. S41): some differences between wt and mutants would seem significant (i.e. in Hind brain P7), especially compared to other data reported as significant in the manuscript (i.e. Fig. 4H). Based on these reevaluations, some results and some statement in the paper might need to be revised.

Reviewer #3: Manuscript Summary: Identifying causative variants in cis-regulatory elements (CREs) in neurodevelopmental disorders has proven challenging. Bengani, Grozeva, Moyon, and Bhatia et al. used a translational bridge between humans-zebrafish-mice, with the aid of model systems, and circling back to humans to identify and evaluate the pathological implications of chromosome X-resident CRE mutations that cause X-linked intellectual disability (XLID). Overall, this manuscript is perfectly suited for the scope of PLOS ONE. The translational approach that the authors have taken in this study is albeit challenging but particularly well appreciated. Studies such as these would certainly enhance the field's understanding of pathogenic variants implicated in neurodevelopmental disorders. In addition, the methodology employed in this study is thorough and the authors present the data/findings objectively.
Major Comments and Questions:

1) 1a) Figures 2F and 2G: The authors use Six3-targeting morpholinos to demonstrate that six3-knockdown rescues the activity of mutant TENM1-CRE, the latter of which is quite convincing. However, there is no validation data provided to demonstrate that the concentration of the six3-morpholino used post-optimizations indeed led to a sufficient knockdown of six3.

1b) Additional note 1 (not required for the acceptance of this manuscript for publication): While I can understand the following might be outside the scope of this manuscript, an IP-seq experiment to pull down six3-bound to either TENM1-WT or TENM1-CRE would provide irrefutable evidence to substantiate the author's hypothesis that TENM1-CRE introduces a novel binding site of the repressor, six3.

1c) Additional note 2: The following could be addressed in the discussion, as per the authors' discretion. While TENM1-CRE led to a consistent loss of tenm1-expression in the mid-brain and hind-brain (table 1), it did not lead to a loss of tenm1-expression in the neural tube. Is six3 and six6 not expressed in the neural tubes and what additional six3/six6-independent mechanisms could be at play here?

2) In the figure legend for Fig. 6C: The authors state that there was "No significant difference was observed in audiogenic seizure incidence in the hemizygous mice with the variant Fmr1CRE compared to wild-type littermates (lines 447 - 449)." However, as per Fig. 6C, the Fmr1-CRE did display a statistically-significant decrease in the predisposition to audiogenic seizures (AGS), starkly contrasting the anticipated results of an increase in AGS derived from Fmr1-null mice. The data and results appears to contradict the figure legend and the authors' interpretation.

3) It is nice that the authors took a bench-to-bedside approach to re-evaluate the individuals within the FMR1-CRE-segregating family.
Did these individuals not have any olfactory disorders, which may recapitulate the phenotype observed in Fig. 5A,B? And, following up from the precedence derived from Fig. 6C, did the affected individuals also display a resistance to the development of AGS?

4) The six3 piece of data from the zebrafish study was intriguing (Fig. 2). Was there no correlative evidence of a similar interplay between Six3 and Tenm1 in mice and/or humans? Could murine and human SIX3 exhibit a non-overlapping pattern of expression with TENM1, which might explain the lack of a phenotype in Tenm1-CRE mice? What are the authors thoughts on the Six3/Six6-dependent and -independent mechanisms of tenm1-CRE in zebrafish and how these mechanisms translate to mammalian species (e.g. mice and humans)?

Minor Comments and Questions:

1) Formatting: It would be nice to keep the text format used in all figures, either in the main text or in the supplement, consistent in size and style.

2) It would be nice to spell out the full names of the protein abbreviations used, such as FMR1, FMRP, TENM1.

All additional major and minor comments from me were either fully or partially overlapping with those raised by the reviewers during the peer-review for the PLOS Genetics submission and have been addressed by the authors.

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

21 Jul 2021

Reviewer #1: This manuscript by Bengani and FitzPatrick used an updated method to determine causality of SNPs found in genomes of patients with genetically inherited intellectual disability (ID). Those are resourceful information to the other peers studying X-linked recessive forms of ID and represent diagnostic values for this particular disorder. The data in this work use a combination of zebra fish and mouse genetics models and functionally validated the clinically identified rare variants on the X chromosomes. Overall, I agree that the study is conclusive and thus recommend for acceptance. I list two minor issues and I hope the authors can address them quickly before publication.

1, An increasing body of evidences have implied brains are highly associated with somatic mutations, which often cause defects either derived from neural degeneration or development. Dr. Christopher Walsh from Harvard University is an expert in this field. The point is brain genomes are not identical to genomes of other tissue types as the method presented in this study used nonbrain tissues for sequencing. Would the authors comments on this point with respect to performance of gnomAD2.1 in the discussion section?
This is an interesting point. Our experimental design relies on the variants being inherited and co-segregating with the disease in an individual family. gnomAD, likewise, is very focused on cataloguing constitutional variation, and its data are filtered with the specific purpose of excluding somatic mosaic variants. To clarify this we have added the following text to the discussion: “Recently somatic mutations in the brain have been implicated in the causation of neuro developmental disorders (Rodin et al., 2021). The fact that we have selected individuals with a positive family history would significantly reduce the prior probability of this disease mechanism in our cohort. For this reason, we have designed our variant filtering strategy to identify constitutional mutations and exclude variants that are likely to be mosaic.” (Line: 460-464)

2, The quality/resolution of some figures are low, e.g. fig. 1, 3, 4 and 5. Is there any way to increase the readability of these figures?

We apologise for this and we have increased the resolution of these figures. We also have Adobe Illustrator files available if required.

Reviewer #2: Bengani et al. performed a study for the identification of functional cis-regulatory mutations involved in the development of X-linked intellectual disability. The task is definitely very interesting, but also very challenging and time consuming. Indeed, it took several years and two different in vivo models to find data supporting a causative role for one out of 31 highly conserved CRE initially identified, the FMR1-CRE variant. I think that this manuscript is worth publishing because it has a quite strong rationale behind the selection of the non-coding variants to be tested and it shows that it is possible to make functional sense of mutations in regulatory elements. However, the phenotype for which the authors are trying to find a genetic evidence in CRE is very complex and very hard to dissect.
Indeed, it is very difficult to find a straightforward link with a single variant in CRE. The phenotypes observed in the mouse model and the link to the human disease appears quite weak. I think this should be clearly underlined in the manuscript and better explained.

We have tried not to over-state the link between the mouse model and the human disease. However, for the FMR1CRE/Fmr1CRE variant there is clear evidence that this variant results in reduced expression of endogenous Fmr1 in the embryonic brain and abnormal results in neurophysiological tests that are considered to be the gold-standard markers of FMRP function in the hippocampus. We agree that the behavioural phenotype is not Fmr1-specific, but there are no known phenotypes that are specific to this genotype. We have changed the penultimate sentence in the abstract to address the reviewer’s concerns. It now reads: “Assigning causative status to any ultra-rare CRE variant remains problematic and requires disease-relevant in vivo functional data from multiple sources.” (Line: 53-55).

Moreover, I think that the title, being very general, overstates the results of the manuscript and it should be changed to make it more related to the findings.

We have changed the title to be more descriptive: “Identification and functional modelling of plausibly causative cis-regulatory variants in a highly-selected cohort with X-linked intellectual disability.”

The description of the statistical analysis applied by the authors is lacking. Moreover, I would suggest to check the statistical methods used for the analysis, in general, and, in particular, for Fmr1 transcription by qPCR (Fig. S41): some differences between wt and mutants would seem significant (i.e. in Hind brain P7), especially compared to other data reported as significant in the manuscript (i.e. Fig. 4H). Based on these reevaluations, some results and some statement in the paper might need to be revised.
We have added a section named Statistical analysis (Line 255-260) under Methods in which we have given details of the statistical methods used, and we have also added details of the statistics used to the figure legends. There was a mistake in calculating the standard error in figure S41. We have redone the analysis and updated figure S41 (now S42 Fig), and the results confirm that the differences between wt and mutants are not significant. We apologise for this error.

Reviewer #3: Manuscript Summary: Identifying causative variants in cis-regulatory elements (CREs) in neurodevelopmental disorders has proven challenging. Bengani, Grozeva, Moyon, and Bhatia et al. used a translational bridge between humans-zebrafish-mice, with the aid of model systems, and circling back to humans to identify and evaluate the pathological implications of chromosome X-resident CRE mutations that cause X-linked intellectual disability (XLID). Overall, this manuscript is perfectly suited for the scope of PLOS ONE. The translational approach that the authors have taken in this study is challenging but particularly well appreciated. Studies such as these would certainly enhance the field's understanding of pathogenic variants implicated in neurodevelopmental disorders. In addition, the methodology employed in this study is thorough and the authors present the data/findings objectively.

Major Comments and Questions:

1) 1a) Figures 2F and 2G: The authors use Six3-targeting morpholinos to demonstrate that six3-knockdown rescues the activity of mutant TENM1-CRE, the latter of which is quite convincing. However, there is no validation data provided to demonstrate that the concentration of the six3-morpholino used post-optimizations indeed led to a sufficient knockdown of six3.

We have validated the knockdown of Six3 by western blot.
We have added a figure in the supplementary data named S40 Fig and also documented it in the Results section. (Line 333-334)

1b) Additional note 1 (not required for the acceptance of this manuscript for publication): While I can understand the following might be outside the scope of this manuscript, an IP-seq experiment to pull down six3-bound to either TENM1-WT or TENM1-CRE would provide irrefutable evidence to substantiate the authors' hypothesis that TENM1-CRE introduces a novel binding site of the repressor, six3.

We thank the reviewer for this suggestion, and we would endeavour to do those experiments in future follow-up studies.

1c) Additional note 2: The following could be addressed in the discussion, as per the authors' discretion. While TENM1-CRE led to a consistent loss of tenm1-expression in the mid-brain and hind-brain (table 1), it did not lead to a loss of tenm1-expression in the neural tube. Are six3 and six6 not expressed in the neural tubes and what additional six3/six6-independent mechanisms could be at play here?

We believe there might be distinct regions of the TENM1-CRE driving expression in the mid-brain/hind-brain and the neural tube. The single-nucleotide change affecting the six3 binding site possibly localises to the region of the CRE driving brain expression. Similar scenarios have been observed in other CREs (Bhatia et al AJHG 2013).

2) In the figure legend for Fig. 6C: The authors state that there was "No significant difference was observed in audiogenic seizure incidence in the hemizygous mice with the variant Fmr1CRE compared to wild-type littermates (lines 447 - 449)." However, as per Fig. 6C, the Fmr1-CRE did display a statistically-significant decrease in the predisposition to audiogenic seizures (AGS), starkly contrasting the anticipated results of an increase in AGS derived from Fmr1-null mice. The data and results appear to contradict the figure legend and the authors' interpretation.
The apparent decrease in audiogenic seizures in the Fmr1-CRE mice is intriguing but is not statistically significant. Significance was determined using a two-tailed Fisher’s exact test (appropriate for analysing nominal data sets). A p-value of < 0.05 was considered statistically significant. The calculated p-value is p = 0.063, which is greater than the cut-off used.

3) It is nice that the authors took a bench-to-bedside approach to re-evaluate the individuals within the FMR1-CRE-segregating family. Did these individuals not have any olfactory disorders, which may recapitulate the phenotype observed in Fig. 5A,B? And, following up from the precedence derived from Fig. 6C, did the affected individuals also display a resistance to the development of AGS?

There were no obvious olfactory anomalies in the affected individuals from this family and no seizure predisposition (Line 393-394). We have added this as a comment to the second last paragraph of the Results section “Fmr1/FMRP-Focused Analyses”.

4) The six3 piece of data from the zebrafish study was intriguing (Fig. 2). Was there no correlative evidence of a similar interplay between Six3 and Tenm1 in mice and/or humans? Could murine and human SIX3 exhibit a non-overlapping pattern of expression with TENM1, which might explain the lack of a phenotype in Tenm1-CRE mice? What are the authors' thoughts on the Six3/Six6-dependent and -independent mechanisms of tenm1-CRE in zebrafish and how these mechanisms translate to mammalian species (e.g. mice and humans)?

We agree that it would be interesting to explore the relationship between SIX3-binding and TENM1 expression in the mouse mutants but this is obviously a much more challenging experimental task than it is in zebrafish embryos.
Because the zebrafish reporter-transgenic assay used an isolated fragment of human DNA, the dramatic effect of Six3 binding on CRE function may be exaggerated compared to the endogenous loci in mice and humans if other enhancers exist with partially overlapping function.

Minor Comments and Questions:

1) Formatting: It would be nice to keep the text format used in all figures, either in the main text or in the supplement, consistent in size and style.

We have reformatted all the text in the figures and legends in the main text as well as in the supplement with a consistent size and style. We were not able to change the format of figures S3-S33 (pedigree analysis) as we received the files from the clinical lab a long time ago and are unable to access the originals. We apologise for this.

2) It would be nice to spell out the full names of the protein abbreviations used, such as FMR1, FMRP, TENM1.

We have added the details of the full gene name/protein abbreviation in the main text:
FMR1: FMRP translational regulator 1. (Line 96)
Tenm1: Teneurin transmembrane protein 1. (Line 184)
SIX3: SIX homeobox 3. (Line 323)
SIX6: SIX homeobox 6. (Line 324)
RFX2: Regulatory factor X2. (Line 338)
FMRP: fragile X mental retardation protein. (Line 375)
POLA1/PCYT1B: DNA polymerase alpha 1, catalytic subunit. (Line 408)
KDM6A: lysine demethylase 6A. (Line 409)
ARHGEF6: Rac/Cdc42 guanine nucleotide exchange factor 6. (Line 410)
AFF2: AF4/FMR2 family member 2. (Line 410)

All additional major and minor comments from me were either fully or partially overlapping with those raised by the reviewers during the peer-review for the PLOS Genetics submission and have been addressed by the authors.

Submitted filename: Response to Reviewers Comments.docx

2 Aug 2021

Identification and functional modelling of plausibly causative cis-regulatory variants in a highly-selected cohort with X-linked intellectual disability

PONE-D-21-12396R1

Dear Dr.
FitzPatrick,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice for payment will follow shortly after the formal acceptance.

To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible, no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
Chaeyoung Lee
Academic Editor
PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.
Reviewer #1: All comments have been addressed
Reviewer #2: All comments have been addressed
Reviewer #3: All comments have been addressed

2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: Yes
Reviewer #2: (No Response)
Reviewer #3: Yes

3. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #2: (No Response)
Reviewer #3: Yes

4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.
Reviewer #1: Yes
Reviewer #2: (No Response)
Reviewer #3: Yes

5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #1: Yes
Reviewer #2: (No Response)
Reviewer #3: Yes

6. Review Comments to the Author. Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)
Reviewer #1: All of my minor concerns have been addressed in this revised manuscript. I have no further questions.
Reviewer #2: (No Response)
Reviewer #3: The authors have sufficiently and more than satisfactorily addressed all my comments and questions. Overall, this manuscript is perfectly suited for the scope of PLOS ONE. The translational approach that the authors have taken in this study is challenging but particularly well appreciated. Studies such as these would certainly enhance the field's understanding of pathogenic variants implicated in neurodevelopmental disorders. In addition, the methodology employed in this study is thorough and the authors present the data/findings objectively.

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No
Reviewer #2: No
Reviewer #3: No

5 Aug 2021

PONE-D-21-12396R1

Identification and functional modelling of plausibly causative cis-regulatory variants in a highly-selected cohort with X-linked intellectual disability

Dear Dr. FitzPatrick:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department. If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact.
If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org. If we can help with anything else, please email us at plosone@plos.org. Thank you for submitting your work to PLOS ONE and supporting open access. Kind regards, PLOS ONE Editorial Office Staff on behalf of Prof. Chaeyoung Lee Academic Editor PLOS ONE
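
The two-tailed Fisher's exact test that the authors describe in their response to Reviewer #3 (comment 2, audiogenic seizure incidence) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' analysis code. The function name and the 2x2 counts in the usage example are hypothetical; the per-genotype seizure counts are not given in the correspondence above, so the example does not attempt to reproduce the reported p = 0.063.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows could be genotype (e.g. wild-type, mutant) and columns the outcome
    (e.g. seizure, no seizure). Returns the probability, under fixed margins,
    of any table at least as unlikely as the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table whose probability does not exceed the observed one;
    # the small relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(p_value, 1.0)


if __name__ == "__main__":
    ALPHA = 0.05  # significance cut-off stated in the authors' response
    # HYPOTHETICAL counts for illustration only (not taken from the paper):
    # wild-type: 8 with seizure, 12 without; mutant: 2 with seizure, 18 without.
    p = fisher_exact_two_tailed(8, 12, 2, 18)
    print(f"p = {p:.4f}; significant at {ALPHA}: {p < ALPHA}")
```

With a cut-off of 0.05, any p-value above that threshold (such as the reported 0.063) is classed as not significant, which is the reasoning the authors give for the Fig. 6C legend.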
Table 1

Allele frequencies of the six variants tested in the dual-color reporter transgenic assays.

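| Proband | Variant | Target Gene | Alleles | Total Alleles | Hemizygotes | Allele Frequency* |
|---|---|---|---|---|---|---|
| S3 | X-146875009-C-T | FMR1 | 0 | NA | 0 | NA |
| S19 | X-25260740-A-G | POLA1 PCYT1B | 0 | 21967 | 0 | NA |
| S24 | X-124269322-G-A | TENM1 | 1 | 21962 | 0 | 0.00004553 |
| S24 | X-136183176-G-A | ARHGEF6 | 7 | 21979 | 1 | 0.0003185 |
| S31 | X-147866225-C-A | AFF2 | 9 | 22020 | 3 | 0.0004319 |
| S43 | X-45375111-C-G | KDM6A | 0 | NA | 0 | NA |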

*Popmax Filtering AF (95% confidence).
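
As a rough illustration of how the counts in Table 1 relate to an allele frequency, the sketch below divides the observed allele count by the total allele number and compares the result with a frequency ceiling. It is an assumption-laden example, not the authors' pipeline: the function names are invented here, the counts are as read from Table 1, and `HYPOTHETICAL_MAX_AF` is a placeholder because the paper's maximum plausible frequency for an XLID allele is not stated in this section. The raw ratio also need not equal the popmax filtering AF reported in the table, which is a population-specific, confidence-bounded estimate.

```python
# Counts as read from Table 1: (proband, variant, target gene, alleles, total alleles).
# None stands for "NA" in the table.
TABLE_1 = [
    ("S3", "X-146875009-C-T", "FMR1", 0, None),
    ("S19", "X-25260740-A-G", "POLA1/PCYT1B", 0, 21967),
    ("S24", "X-124269322-G-A", "TENM1", 1, 21962),
    ("S24", "X-136183176-G-A", "ARHGEF6", 7, 21979),
    ("S31", "X-147866225-C-A", "AFF2", 9, 22020),
    ("S43", "X-45375111-C-G", "KDM6A", 0, None),
]

# HYPOTHETICAL ceiling for illustration only; not a value from the paper.
HYPOTHETICAL_MAX_AF = 1e-5


def allele_frequency(allele_count, total_alleles):
    """Raw allele frequency, or None when the total is unavailable."""
    if not total_alleles:
        return None
    return allele_count / total_alleles


def exceeds_threshold(af, max_credible_af):
    """True only when a frequency is known and lies above the ceiling."""
    return af is not None and af > max_credible_af


if __name__ == "__main__":
    for proband, variant, gene, alleles, total in TABLE_1:
        af = allele_frequency(alleles, total)
        shown = "NA" if af is None else f"{af:.3e}"
        flag = exceeds_threshold(af, HYPOTHETICAL_MAX_AF)
        print(f"{proband}\t{gene}\t{variant}\tAF={shown}\tabove ceiling: {flag}")
```

A variant that is absent from the reference population stays compatible with an ultra-rare causative allele, while one whose frequency lies above the chosen ceiling becomes implausible as a cause, which is the logic behind the reanalysis of TENM1CRE described in the abstract.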

Table 2

Analysis of stable transgenic lines of dual-color reporters in zebrafish embryos.

Target Gene | No. stable lines analysed | Mutation Status | Reporter | Olfactory placode | Forebrain | Trigeminal ganglia | Midbrain | Hindbrain | Neural Tube | Lateral spinal cord neurons | Eye | Otic vesicle | Heart | Pectoral fin | Effect of CRE Variant
TENM1 4 Wild-type eGFP    4 4 4    12 Loss of midbrain and hindbrain activity
Mutant mCherry 1  00 4   1
FMR1 3 Wild-type eGFP 3 3 3 1 Loss of forebrain activity
Mutant mCherry 0 3 1 3
POLA1-PCYT1B 3 Wild-type eGFP 1  11     Uninterpretable
Mutant mCherry 1  11 11
ARHGEF6 3 Wild-type eGFP 1111 Uninterpretable
Mutant mCherry 11121
KDM6A 3 Wild-type mCherry   112    1 Uninterpretable
Mutant eGFP   112 1  2
AFF2 2 Wild-type mCherry 111 Uninterpretable
Mutant eGFP 1 11
  47 in total

1.  A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3.

Authors:  Pablo Cingolani; Adrian Platts; Le Lily Wang; Melissa Coon; Tung Nguyen; Luan Wang; Susan J Land; Xiangyi Lu; Douglas M Ruden
Journal:  Fly (Austin)       Date:  2012 Apr-Jun       Impact factor: 2.160

Review 2.  Structural variation in the 3D genome.

Authors:  Malte Spielmann; Darío G Lupiáñez; Stefan Mundlos
Journal:  Nat Rev Genet       Date:  2018-07       Impact factor: 53.242

3.  Hypersensitivity to mGluR5 and ERK1/2 leads to excessive protein synthesis in the hippocampus of a mouse model of fragile X syndrome.

Authors:  Emily K Osterweil; Dilja D Krueger; Kimberly Reinhold; Mark F Bear
Journal:  J Neurosci       Date:  2010-11-17       Impact factor: 6.167

4.  Human-specific gain of function in a developmental enhancer.

Authors:  Shyam Prabhakar; Axel Visel; Jennifer A Akiyama; Malak Shoukry; Keith D Lewis; Amy Holt; Ingrid Plajzer-Frick; Harris Morrison; David R Fitzpatrick; Veena Afzal; Len A Pennacchio; Edward M Rubin; James P Noonan
Journal:  Science       Date:  2008-09-05       Impact factor: 47.728

5.  De novo mutations in regulatory elements in neurodevelopmental disorders.

Authors:  Patrick J Short; Jeremy F McRae; Giuseppe Gallone; Alejandro Sifrim; Hyejung Won; Daniel H Geschwind; Caroline F Wright; Helen V Firth; David R FitzPatrick; Jeffrey C Barrett; Matthew E Hurles
Journal:  Nature       Date:  2018-03-21       Impact factor: 49.962

6.  The landscape of somatic mutation in cerebral cortex of autistic and neurotypical individuals revealed by ultra-deep whole-genome sequencing.

Authors:  Rachel E Rodin; Yanmei Dou; Minseok Kwon; Maxwell A Sherman; Alissa M D'Gama; Ryan N Doan; Lariza M Rento; Kelly M Girskis; Craig L Bohrson; Sonia N Kim; Ajay Nadig; Lovelace J Luquette; Doga C Gulhan; Peter J Park; Christopher A Walsh
Journal:  Nat Neurosci       Date:  2021-01-11       Impact factor: 24.884

7.  Long-range evolutionary constraints reveal cis-regulatory interactions on the human X chromosome.

Authors:  Magali Naville; Minaka Ishibashi; Marco Ferg; Hemant Bengani; Silke Rinkwitz; Monika Krecsmarik; Thomas A Hawkins; Stephen W Wilson; Elizabeth Manning; Chandra S R Chilamakuri; David I Wilson; Alexandra Louis; F Lucy Raymond; Sepand Rastegar; Uwe Strähle; Boris Lenhard; Laure Bally-Cuif; Veronica van Heyningen; David R FitzPatrick; Thomas S Becker; Hugues Roest Crollius
Journal:  Nat Commun       Date:  2015-04-24       Impact factor: 14.919

8.  Histone H3 globular domain acetylation identifies a new class of enhancers.

Authors:  Madapura M Pradeepa; Graeme R Grimes; Yatendra Kumar; Gabrielle Olley; Gillian C A Taylor; Robert Schneider; Wendy A Bickmore
Journal:  Nat Genet       Date:  2016-04-18       Impact factor: 38.330

9.  A systematic, large-scale resequencing screen of X-chromosome coding exons in mental retardation.

Authors:  Patrick S Tarpey; Raffaella Smith; Erin Pleasance; Annabel Whibley; Sarah Edkins; Claire Hardy; Sarah O'Meara; Calli Latimer; Ed Dicks; Andrew Menzies; Phil Stephens; Matt Blow; Chris Greenman; Yali Xue; Chris Tyler-Smith; Deborah Thompson; Kristian Gray; Jenny Andrews; Syd Barthorpe; Gemma Buck; Jennifer Cole; Rebecca Dunmore; David Jones; Mark Maddison; Tatiana Mironenko; Rachel Turner; Kelly Turrell; Jennifer Varian; Sofie West; Sara Widaa; Paul Wray; Jon Teague; Adam Butler; Andrew Jenkinson; Mingming Jia; David Richardson; Rebecca Shepherd; Richard Wooster; M Isabel Tejada; Francisco Martinez; Gemma Carvill; Rene Goliath; Arjan P M de Brouwer; Hans van Bokhoven; Hilde Van Esch; Jamel Chelly; Martine Raynaud; Hans-Hilger Ropers; Fatima E Abidi; Anand K Srivastava; James Cox; Ying Luo; Uma Mallya; Jenny Moon; Josef Parnau; Shehla Mohammed; John L Tolmie; Cheryl Shoubridge; Mark Corbett; Alison Gardner; Eric Haan; Sinitdhorn Rujirabanjerd; Marie Shaw; Lucianne Vandeleur; Tod Fullston; Douglas F Easton; Jackie Boyle; Michael Partington; Anna Hackett; Michael Field; Cindy Skinner; Roger E Stevenson; Martin Bobrow; Gillian Turner; Charles E Schwartz; Jozef Gecz; F Lucy Raymond; P Andrew Futreal; Michael R Stratton
Journal:  Nat Genet       Date:  2009-04-19       Impact factor: 38.330

10.  Using high-resolution variant frequencies to empower clinical genome interpretation.

Authors:  Nicola Whiffin; Eric Minikel; Roddy Walsh; Anne H O'Donnell-Luria; Konrad Karczewski; Alexander Y Ing; Paul J R Barton; Birgit Funke; Stuart A Cook; Daniel MacArthur; James S Ware
Journal:  Genet Med       Date:  2017-05-18       Impact factor: 8.822
