Autism Spectrum Disorder (ASD) is a complex neuropsychiatric syndrome whose etiology includes genetic and environmental components. Since epigenetic marks are sensitive to environmental insult, they may be involved in the development of ASD. Initial brain studies have suggested a dysregulation of epigenetic marks in ASD. However, due to cellular heterogeneity in the brain, these studies have not determined if there is a true change in the neuronal epigenetic signature. Here, we report a genome-wide methylation study on fluorescence-activated cell sorting-sorted neuronal nuclei from the frontal cortex of 16 male ASD and 15 male control subjects. Using the 450 K BeadArray, we identified 58 differentially methylated regions (DMRs) that included loci associated to GABAergic system genes, particularly ABAT and GABBR1, and brain-specific MicroRNAs. Selected DMRs were validated by targeted Next Generation Bisulfite Sequencing. Weighted gene correlation network analysis detected 3 co-methylation modules which are significantly correlated to ASD that were enriched for genomic regions underlying neuronal, GABAergic, and immune system genes. Finally, we determined an overlap of the 58 ASD-related DMRs with neurodevelopment associated DMRs. This investigation identifies alterations in the DNA methylation pattern in ASD cortical neurons, providing further evidence that epigenetic alterations in disorder-relevant tissues may be involved in the biology of ASD.
Autism Spectrum Disorder (ASD) is a complex neuropsychiatric syndrome whose etiology includes genetic and environmental components. Since epigenetic marks are sensitive to environmental insult, they may be involved in the development of ASD. Initial brain studies have suggested a dysregulation of epigenetic marks in ASD. However, due to cellular heterogeneity in the brain, these studies have not determined if there is a true change in the neuronal epigenetic signature. Here, we report a genome-wide methylation study on fluorescence-activated cell sorting-sorted neuronal nuclei from the frontal cortex of 16 male ASD and 15 male control subjects. Using the 450 K BeadArray, we identified 58 differentially methylated regions (DMRs) that included loci associated to GABAergic system genes, particularly ABAT and GABBR1, and brain-specific MicroRNAs. Selected DMRs were validated by targeted Next Generation Bisulfite Sequencing. Weighted gene correlation network analysis detected 3 co-methylation modules which are significantly correlated to ASD that were enriched for genomic regions underlying neuronal, GABAergic, and immune system genes. Finally, we determined an overlap of the 58 ASD-related DMRs with neurodevelopment associated DMRs. This investigation identifies alterations in the DNA methylation pattern in ASD cortical neurons, providing further evidence that epigenetic alterations in disorder-relevant tissues may be involved in the biology of ASD.
Autism spectrum disorder (ASD) is a group of neurodevelopmental conditions characterized by dysfunction in social communication and stereotypic behavior (American Psychiatric Association 2013). Genetic and environmental factors have been implicated in the development of ASD, but the molecular mechanisms underlying their interaction are still not fully understood (Gaugler et al. 2014; Tordjman et al. 2014). Epigenetic modifications have been ascribed as possible mediator between environmental cues and the genome to produce adaptive or maladaptive behaviors (Petronis 2010). In addition, genetic evidence suggests epigenetics as potential cofactor in the aetiopathogenesis of ASD. As a matter of fact, whole-exome sequencing (WES) studies on ASD cases identified de novo loss-of-function mutations to be enriched for genes encoding proteins belonging to histone-modifying enzymes and chromatin remodeler families (De Rubeis et al. 2014; Iossifov et al. 2014). Thus far, epigenome-wide association studies (EWAS) in ASD have been performed exclusively on a restricted group of modifications: that is, DNA methylation (Ladd-Acosta et al. 2013; Nardone et al. 2014), trimethylated histone H3 at lysine 4 (H3K4me3) (Shulha et al. 2012), acetylated histone H3 at lysine 27 (H3K27ac) (Sun et al. 2016), long non-coding (lnRNA) (Ziats and Rennert 2013; Parikshak et al. 2016), and micro-RNA (miRNA) (Mor et al. 2015; Wu et al. 2016). In addition, due to the scarce availability of ASD brain specimens, many of these studies were carried out on peripheral tissues, even though it is likely that the main molecular drivers of ASD development are found in the brain. Therefore, even if these studies could be useful to identify new epigenetic biomarkers, they are far from being able to elucidate the epigenetic mechanisms underlying disease’s molecular basis (Petronis 2010). Only rigorous tissue and cell-specific-based epigenetic studies can aim at reaching this goal. Recently, we and others have profiled the DNA methylation signature of different Brodmann Areas (BA) from ASD post-mortem brain specimens (Ladd-Acosta et al. 2013; Nardone et al. 2014). Despite the unavoidable heterogeneity among the studies, common findings were detected: particularly, a very large number of CpGs upstream the transcription start site (Tss) of Tetraspanin 32 (TSPAN32) was found hypomethylated in autistic versus control cohorts by 2 independent studies (Ladd-Acosta et al. 2013; Nardone et al. 2014). Overall, these investigations constituted a preliminary evidence of epigenetic dysregulation in ASD. In contrast to well-established Genome Wide Association Study (GWAS), the implementation of EWAS is still in its infancy, and many potential confounding factors can hinder its progress (Rakyan et al. 2011). While most of the sources of variability can be controlled by improving experimental design and bioinformatics approaches, the cellular mosaicism still constitutes a major drawback in epigenetic studies, especially in highly heterogenic tissues, such as the brain. Noticeably, a true signal originating from a small cell population can often go undetected, whereas bias in the results can stem from tissue-specific variation in the ratio of cell types, due to physiological changes or as consequence of medical condition (Rakyan et al. 2011). A method that promises to significantly improve epigenetic interrogation of highly heterogenic tissues, partly overcoming issues related to mosaicism, is the upstream implementation of fluorescence-activated cell sorting (FACS). To date, FACS have been successfully applied to answer many scientific questions that require to focus on 1 cell type at time, that is, in immunology, cancer or stem cell studies, however its use in post-mortem brain studies is relatively new.The investigation of DNA methylation signature in FACS-sorted neuronal nuclei from ASD brain specimens, can provide vital information regarding distortions in methylation patterning during development, that otherwise would probably be lost in the bulk of the signal. Recently, Lister et al. and Spiers et al., in mouse prefrontal cortex and in human fetal brain studies, respectively, have demonstrated that extensive methylome reconfiguration occurs during development from fetal to young adult, particularly during synaptogenesis, and it is considered a critical process in defining the neuron molecular identity (Lister et al. 2013; Spiers et al. 2015). Previous ASD post-mortem brain studies have demonstrated a dysregulation in the regional patterns of gene expression that differentiate between cortical brain regions (Voineagu et al. 2011; Ziats and Rennert 2013; Parikshak et al. 2016). Since DNA methylation plays a substantial role in neuron differentiation (Xie et al. 2013; Mo et al. 2015), it is plausible that disturbances in its patterning may be involved in the lack of regional transcriptome identity. In addition to the above, mounting experimental and epidemiologic evidence support the hypothesis that ASD could possibly emerge during the same time window when methylation patterning in the brain is highly dynamic and liable to insults (Ciernia and LaSalle 2016), making this investigation worthwhile.
Materials and Methods
Samples Information
Frozen brain samples from 15 ASD cases and 16 controls were acquired through the Autism Tissue Program (http://www.autismbrainnet.org), with all the brain specimens deriving from Harvard Brain Bank except for 2 ASD brain samples received from the UK Brain Bank for Autism. For nearly all the individuals, brain tissue was obtained from the anterior prefrontal cortex (PFC), BA10, with the exception of 1 and 2 specimens obtained from the frontal cortex BA9 and BA8, respectively. All samples were from males due to both low number of available female samples and high epigenetic variability between males and females. Individuals with known Copy Number Variations or delirious mutations were excluded from this study. When choosing samples, we matched for age, PMI, and brain mass. In addition, we needed to choose samples which were never thawed, in order to ensure preserved nuclear integrity, which is necessary for the FACS analysis. In short, after extraction of 5 ml of cerebral spinal fluid, full brains were removed from the skull, and cut in half by cutting in the midsagittal plane through the corpus callosum. Half of the brain was then immediately frozen on dry ice, followed by long-term storage in a −70 °C freezer (the other half was fixed with formalin). Our studies were performed only on frozen samples. Full protocol is available from the Autism BrainNet. Ethical approval for this research was granted by Bar Ilan University Institutional Review Board, and written informed consent for every individual tested was given to the Autism Tissue Program by the patient himself or his next-of-kin. A summary of clinical features of each subject is available in Supplementary Table 1.
FACS Analysis
Neuronal nuclei were isolated from 500 mg of liquid nitrogen pulverized brain tissue as previously described by Matevossian et al. (Matevossian and Akbarian 2008) with few minor changes to the original protocol. In summary, nuclei concentration, antibody titration and staining time were carefully calibrated to determine the best experimental conditions: 2.8 ml of nuclei resuspension solution (0.02% bovineserum albumin (BSA), 0.4% goat serum in 1× phosphate-buffered saline (PBS)) were stained with 5 ul of Anti-NeuN-PE antibody clone A60 (Millipore, MD, USA) while rotating in the dark for 1 h at 4 °C. Unstained nuclei were used to set the gates and as control to detect “auto-fluorescence” or “innate” background staining, while Mouse IgG1 K Isotype Control PE (eBioscience, CA, USA) was used to determine the background signal caused by nonspecific antibody binding. FACS was performed by MoFlo® Astrios™ EQ cell sorter (Beckman Coulter Inc., CA, USA). Before the sort, samples were filtered through a 40 μm filter tube to prevent clogging the machine. To optimize the gating parameters, each sample underwent a pre-sort analysis on unstained, IgG1 K Isotype Control PE, and Anti-NeuN-PE stained nuclei aliquots. Post-sort analysis of NeuN+ and NeuN− collected events was carried out to assess sample purity: NeuN+ and NeuN− sorted nuclei displayed a purity >95% and >99%, respectively (Fig. 1B,C). FloJo v10.1 was employed to analyze and plot FACS raw data. In order to confirm the high efficiency of the sorting by an independent technique, we stained unsorted, NeuN+, and NeuN− FACS-sorted nuclei from the sample AN07176 with 4′,6-diamidino-2-phenylindole (DAPI) and Anti-NeuN-PE Ab, and subsequently imaged them with Axio Imager 2 with ApoTome system microscope (Zeiss, Germany, EU) (Fig. 1D,E).
Figure 1.
Isolation of neuronal nuclei by FACS. (A) Flow chart of the methods employed in the study. (B) Side scatter vs. forward scatter plot of sorted nuclei that had been stained with anti-NeuN-PE. A subset of the population I (identified in the figure as Ia) composed of large neural nuclei was gated for our analysis. The choice to gate only the large NeuN+ population was dictated by technical limitations of the FACS instrument to perform efficiently if the gate would have included all NeuN+ events. (C,D) After sorting nuclei to NeuN+ and NeuN− populations, we performed post-sort analysis on each of the sorted nuclei populations. In C (left), image of post-sort analysis of NeuN+ nuclei. In D (right), post-sort analysis of NeuN− nuclei. We confirmed a purity >95% and >99%, respectively. (E) Unsorted cortical nuclei stained with DAPI (blue) and Anti-NeuN-PE antibody (orange). (F) Above, images of FACS-sorted NeuN+ nuclei stained with DAPI and Anti-NeuN-PE antibody show a positive staining for both the markers. Below, images of FACS-sorted NeuN− nuclei stained with DAPI and Anti-NeuN-PE antibody show positive staining for DAPI but negative staining for Anti-NeuN-PE antibody.
Isolation of neuronal nuclei by FACS. (A) Flow chart of the methods employed in the study. (B) Side scatter vs. forward scatter plot of sorted nuclei that had been stained with anti-NeuN-PE. A subset of the population I (identified in the figure as Ia) composed of large neural nuclei was gated for our analysis. The choice to gate only the large NeuN+ population was dictated by technical limitations of the FACS instrument to perform efficiently if the gate would have included all NeuN+ events. (C,D) After sorting nuclei to NeuN+ and NeuN− populations, we performed post-sort analysis on each of the sorted nuclei populations. In C (left), image of post-sort analysis of NeuN+ nuclei. In D (right), post-sort analysis of NeuN− nuclei. We confirmed a purity >95% and >99%, respectively. (E) Unsorted cortical nuclei stained with DAPI (blue) and Anti-NeuN-PE antibody (orange). (F) Above, images of FACS-sorted NeuN+ nuclei stained with DAPI and Anti-NeuN-PE antibody show a positive staining for both the markers. Below, images of FACS-sorted NeuN− nuclei stained with DAPI and Anti-NeuN-PE antibody show positive staining for DAPI but negative staining for Anti-NeuN-PE antibody.
DNA Extraction and 450 K BeadArray Data Analysis
Genomic DNA was extracted from approximately 2.5 × 106 FACS-sorted neuronal nuclei using the AllPrep DNA/RNA/miRNA Universal Kit (Qiagen, Germany, EU), and its concentration was assessed by Qubit 2.0 Fluorometer using the Qubit dsDNA BR Assay Kit (Life. Technologies, MA, USA). DNA samples were then submitted to the Genomics Core Facility of “The Rappaport Family Institute for Research in the Medical Sciences” (Haifa, IL), and processed as follows: for each individual, 500 ng of genomic DNA were converted twice with sodium bisulfite using the EZ-96 DNA Methylation™ Kit (Zymo Research, CA, USA) in order to minimize potential bias introduced by variable conversion efficiency. Genome-wide DNA methylation was assessed by Infinium HumanMethylation450K BeadChip (Illumina, CA, USA) array technology. Chip Analysis Methylation Pipeline (CHaMP) package (Morris et al. 2014) was employed to normalize and process the output generated by iScan System (Illumina, CA, USA). Briefly, iScan Control Software was used to extract signal intensities for each probe and generate iDAT files that were imported into R statistical environment 3.1.3 (https://www.r-project.org/) using Illuminaio tool implemented in Minfi package (Aryee et al. 2014). After running basic quality control metrics, raw data were filtered out for probes with a detection P-value > 0.01 in at least 1 sample or with bead count <3 in at least 5% of samples per probe (n = 12 989). Probes targeting CpGs on sex chromosomes (n = 11 650) or probes with polymorphic CpGs/SNPs at single base extension (SBE) or within ≤10 bp from SBE site at allele frequency (AF) ≥0.01 (n = 31 368) (Chen et al. 2013) were discarded, leaving 430 544 probes for downstream analysis (Supplementary Table 2). Probes in sex chromosomes were removed due to the fact that sex chromosomes displayed a very different distribution of beta values, which would introduce a bias in normalization procedures. Conversely, cross-reactive probes (n = 30 969) recently identified by Chen et al. (2013) were not removed from the dataset during our analysis. In our follow-up analysis, we found that no cross-reactive probes were detected within significant differentially methylated regions (DMRs), and no enrichment for them in any of the 8 weighted gene correlation network analysis (WGCNA) modules related to ASD/control state resulted after hypergeometric test (HyperP≈1) (Supplementary Table 2). Following probe filtering, data underwent different methods integrated in CHaMP to identify and correct for technical or biological variables, such as type II bias (BMIQ), batch effects (SVA + ComBat) and copy number variations (CNVs). The SVA tool was used to determine if any known covariates showed significant correlation to beta values, in which case CHAMP can normalize the data for covariates that show correlations. None of the covariates in our study showed significant correlation, according to CHAMP. As a follow-up, we performed a separate Principal Component Analysis (PCA) on our data, which revealed no significant correlation for covariates, including age, brain mass, NeuN+ purity, and PMI on principal components in the data (Supplementary Fig. 1). The detection of differentially methylated positions (DMPs) was performed using Limma package (Ritchie et al. 2015) while DMRs were determined using Probe Lasso (Butcher and Beck 2015), a flexible window-based approach, recently implemented in CHaMP, that gathers neighboring significant CpG-signals to define clear DMR boundaries. Statistical significance and other features related to DMRs are available in Supplementary Table 3 as well as basic quality control statistics for 450 K BeadArray (Supplementary Figs. 2 and 3).
Targeted Next-Gen Bisulfite Sequencing and Statistical Methods
Targeted Next-Gen Bisulfite Sequencing (NGBS) was used to validate differences in DNA methylation detected by Illumina 450 K BeadArray. Overall, we tested 4 significant DMRs encompassing 25 CpG sites by targeted NGBS. DNA samples were submitted to the Genome Center at “Barts and The London School of Medicine and Dentistry” (London, UK). Briefly, for each sample a total amount of 500 ng of DNA was equally divided into 3 aliquots, that underwent separate bisulfite conversions (EZ-96 DNA Methylation™ Kit) followed by independent PCR reactions using primers designed ad hoc to amplify a specific region of interest with a specific barcode for each PCR. The separate barcoded PCR products were then pooled and sequenced on MiSeq (Illumina, CA, USA) to generate in excess of 2000 sequence reads per amplicon that were down sampled to 2000 for downstream analysis. Sequences were mapped to the target gene and methylation status was called with Bismarck (Krueger and Andrews 2011). Only 100% mapped reads were considered for analysis. For each sample, the percentage of DNA methylation at single CpG resolution was quantified as the average of 3 technical replicates (3 separate bisulfite conversions and PCR reactions). For each region tested by targeted NGBS, the genomic coordinates, amplicon sequence, PCR primers and CpG sites, along with their methylation values are reported in Supplementary Tables 4 and 5. Statistical analyses were performed by SPSS software package (version 22.0; SPSS, Chicago, IL, USA). Levene’s test was employed to assess the homogeneity of variance in the data distribution across the groups, and unequal variance was assumed if the test was significant (P < 0.05). We employed 2-tailed Independent t-test and Mann–Whitney U-test to compare groups with equal and unequal variance, respectively (Supplementary Table 6). Spearman’s rank correlation coefficient (rs) was used to determine the correlation between 450 K BeadArray and targeted NGBS methylation values.
Weighted Gene Correlation Network Analysis
The WGCNA R software package was applied to the entire methylation dataset mainly with the aim of identifying co-methylation modules (Langfelder and Horvath 2008). Briefly, pairwise Pearson’s correlations between methylation values were used to build a signed network. A soft threshold of 14 was used, as defined by the scale-free topology criterion. Subsequently, average linkage hierarchical clustering coupled with a topological overlap matrix (TOM)-based dissimilarity measure, were employed to construct a dendrogram of the network whose branches, defined by the “dynamic tree cut” function, corresponded to single modules. The threshold to merge closely related modules was set at minimum height of 0.1. Each module was assigned a color, and a Module Eigengene (ME) corresponding to its first principal component, was calculated. The ME can be correlated to any sample trait (e.g., age, PMI, diagnosis of ASD, etc.) to assess the significance of module-trait association (eigengene significance). We used Kendall’s tau correlation tool in R package to correlate between trait (ASD) and eigengenes, due to the binary nature of the ASD diagnosis. For each probe, WGCNA defines the module membership (MM), that is the correlation between the single β values and the ME, and its corresponding P-value. In addition, to incorporate external information into the co-methylation network, WGCNA computes the probe significance (PS), that is the absolute value of correlation between each probe and a given trait. In our case, the probe significance is the correlation between the methylation of each probe and ASD diagnosis.
Gene Ontology, PPI Network Analysis, and GWAS Enrichment Analysis
Enrichment analyses for Biological Processes and InterPro GO categories were performed on co-methylation modules using GREAT online software (http://great.stanford.edu/great/public/html/) (McLean et al. 2010). GRCh37 (UCSC hg19, Feb/2009) was chosen as human genome reference assembly and only probes with MM > 0.7 were included in the analysis. The background was set to the 450 K BeadArray, instead of the whole genome, to avoid any potential bias due to the array-specific design, for example, the imbalance in the number of probes associated to each gene or located in different genomic regions. GO terms were considered significant when showing a Region-based Fold Enrichment >2 and Bonferroni-corrected Hypergeometric P-value <0.01 (BonfHyperP < 0.01). The parameters and criteria employed by GREAT to: 1) assign univocally a CpG to the distal, proximal, or intragenic region of a transcript (if present in the fixed range), 2) infer statistical significance from enriched genomic regions, and 3) associate genomic regions to GO annotations, along with further data output, are reported in Supplementary Table 7.Protein-Protein Interaction (PPI) Network Analysis was performed by CluePedia (Bindea et al. 2013), a ClueGO plugin for pathway insights that uses integrated experimental and in silico data. Only genes associated to >3 probes with MM > 0.7 were considered for the downstream analysis. For each module investigated, a list composed exclusively by genes associated to a significant number of probes (HyperP < 0.01), was inputted into CluePedia. The output, comprised of nodes and edges datasets, was imported in Cytoscape 3.3.0 that is designed for network data integration, analysis and visualization (Shannon et al. 2003). For each module, node and edge attributes along with network topological parameters, are listed in Supplementary Tables 8–11.The specificity of the modules associated to ASD/control state was evaluated by assessing their enrichment for ASD-related genes, and for GWAS related to other psychiatric and non-psychiatric disorders. The hypergeometric test was done at the level of the probes, not the genes, do the variable number of probes in each gene. Therefore, we first calculated the number of probes the array, in each disorder, and in each module, as well as the number of probes from each disorder that are found in each module. The probe numbers can be found in the embedded text in Supplementary Table 12. Only probes with MM > 0.7 were included in the test. The statistical significance for enrichment of disorder-related probes in each module was computed by hypergeometric test (HyperP < 0.01) in R (Supplementary Tables 12 and 13). Gene lists were retrieved from different sources: (https://gene.sfari.org/autdb/HG_Home.do) for ASD, (http://jjwanglab.org/gwasdb) (Li et al. 2012) for Alzheimer, Atherosclerosis, Diabetes type2, Systemic Lupus Erythematosus (SLE) and Psoriasis, and from a recent publication authored by Ripke et al., for Schizophrenia (Ripke et al. 2014). Permutation testing to determine 1000 permutated datasets of module probes-disorder probes overlaps was performed in R package.
DMRs Overlaps
To test the overlap between 58 DMRs detected by the present study against 4 792 DMRs by Spiers et al. (2015) and 6 480 DMRs by Jaffe et al. (2016) we used the intersect module in Bedtools (v2.25.0) (Quinlan and Hall 2010) and, for confirmation, the findOverlaps function in GenomicRanges R library (Lawrence et al. 2013). Statistical significance for number and extent of overlapping intervals, was computed at global and local (Chr level) levels by implementing the Fisher’s exact test (FET) module in Bedtools (v2.25.0) (Quinlan and Hall 2010). The background was set to the regions covered by the 450 K BeadArray (omitting X,Y Chr), instead of the whole genome. Statistical value (localP) and overlapping intervals at each chromosome are in Supplementary Tables 14 and 15. The overlap between our 58 DMRs and DMRs identified by Lister et al. (2013) in NeuN+ versus NeuN− nuclei from Dorsolateral PFC (DLPFC), was performed manually using Genome Browser online tool (http://genome.ucsc.edu/). The DNA methylation directions (Hypomethylation; Hypermethylation) of NeuN+ versus NeuN− nuclei at each DMR are reported in Supplementary Table 14.
Results
FACS of Neuronal Nuclei from Post-mortem Brains
To investigate the neuron-specific DNA methylation signature in ASD, we employed FACS-mediated enrichment of neuronal nuclei followed by hybridization of bisulfite-converted DNA on Illumina 450 K BeadArray. Brain samples were dissected from the prefrontal cortex (BA8-10) of 15 individuals with an ADI-R-confirmed diagnosis of ASD and 16 matched controls. Clinical data for these individuals are provided in Supplementary Table 1, and more in-depth case reports are available online (http://research.autismbrainnet.org). All the samples were obtained from male subjects and there were no significant differences for Age, (Post-mortem interval) PMI, Brain Mass and Global DNA Methylation levels between ASD and control groups (Supplementary Fig. 4). To efficiently separate neuronal from non-neuronal nuclei, we applied FACS to the purified brain nuclei after Anti-NeuN-PE Ab staining. The parameters, defined for each step of the entire procedure, were calibrated to obtain the best performance (Fig. 1A; Supplementary Fig. 5). We note that to ensure the high purity of NeuN+ cells, was necessary to gate for a fraction of larger nuclei, which is more enriched for neuronal nuclei. The purity of NeuN+ and NeuN− nuclei fractions was determined by post-sort analysis to be approximately >95% and >99%, respectively, for all the FACS-sorted samples (Fig. 1B–D). The high efficiency and reproducibility of the FACS were confirmed for some of the samples randomly chosen by microscopy (Fig. 1E,F) and by performing a FACS technical replicate of the control sample AN07176 (Supplementary Fig. 6). In order to rule out the possibility that differences in DNA methylation could stem from biases introduced by FACS, we determined the percentage of nuclei, singlets and NeuN+ sorted events at level of the Ist, IInd and IIIrd gate, respectively, along with the post-sort NeuN+ purity of ASD versus control groups. No statistical difference between the percentage of events was detected at any of the 3 gates and at the level of post-sort analysis of NeuN+ between ASD and control groups (Supplementary Fig. 7).For each sample, genomic DNA was extracted from FACS-sorted NeuN+ nuclei, bisulfite-converted and probed with Illumina 450 K BeadArray. To validate the reproducibility of 450 K BeadArray we implemented 2 strategies: in 1 case, 2 FACS technical replicates of the same brain sample were bisulfite-converted and probed on 2 different microarrays from the same batch; in the other case, 2 aliquots of the same FACS-sorted sample were independently bisulfite-converted and probed on 2 different batches of microarrays. In both cases, the β values between the samples were highly correlated (Pearson’s r > 0.99) (Supplementary Fig. 8).
Differentially Methylated Regions in Neurons of ASD Subjects
After pre-processing, stringent quality controls and normalization procedures, we tested ASD and control datasets for differential methylation both at the level of single CpG, referred to as DMPs, and at level of genomic region, referred to as Differentially Methylated Regions (DMRs), according to the CHaMP package (see Materials and Methods). To this end, CHaMP implements 2 functions Limma and Probe Lasso: the first treats all CpGs as independent entities, while the second considers the combinatorial effect of neighboring CpG sites. By using Limma method, we detected 11 886 DMPs at P-value <0.01, albeit none survived multiple testing correction (FDR < 0.05), whereas with Probe Lasso we identified 58 DMRs at FDR < 0.05, the majority of whom were closely associated or overlapped a known transcript (Table 1; Supplementary Table 3). It is of interest that 2 of the top 10 DMRs are related to the genes GABBR1 and ABAT, which are part of GABAergic system. Moreover, 2 other DMRs, located on different arms of the chromosome 8, are closely associated to the genes Mir124-1 and Mir124-2, 2 highly conserved microRNAs that play a crucial role in fine-tuning gene expression during brain development (Sun et al. 2015) and, at adulthood, in regulating social behavior (Yang et al. 2012; Gascon et al. 2014; Bahi 2016). Subsequently, we validated 37 CpGs spanning 4 DMRs by targeted Next Generation Bisulfite Sequencing (NGBS) (Fig. 2A–C; Supplementary Fig. 9a) and demonstrated a high correlation between the methylation values detected by 450 K BeadArray and targeted NGBS (Fig. 2D–F; Supplementary Fig. 9b). About 25 out of the 37 CpGs we detected in the NGBS were represented by probes on the original 450 K microarray, while the additional 12 CpGs are neighboring methylation sites within the DMR, which were not represented on the microarray, but which were also found to have significant changes in the NGBS analysis. This further verifies that these genetic regions are DMRs. Two of the DMRs we sequenced, associated with Gamma-Aminobutyric Acid Type B Receptor Subunit 1 (GABBR1) and Mir124-2, displayed hypomethylation in ASD versus control group while the other 2 DMRs, related to Family With Sequence Similarity 124 Member B (FAM124B) and long non-coding RNA Nuclear Enriched Abundant Transcript 1 (lnNEAT1), showed the opposite tendency. All the CpGs investigated were statistically significant at P-value <0.01 except for 3 CpGs, from lnNEAT1, at P-value <0.05 (Supplementary Table 6).
Table 1
Top 25 DMRs detected in cortical neuronal nuclei between ASD and control groups. Table of main DMRs’ attributes: genomic coordinates, size, probe number, Δβ value and statistical significance, names of gene/s related to each DMR, functional (genomic location) and topological (neighborhood location) features
Chr
dmr.start
dmr.end
dmr.size
n.Probes
deltaBeta (ASD-ctrl)
FDR
Gene
Genomic location
Neighborhood location
13
113 697 522
113 700 412
2891
15
−0.03
8.5E−05
MCF2L
Body
Shelf, shore
17
75 314 823
75 316 738
1916
14
−0.05
3.2E−04
SEPT9
5′UTR
Open sea
6
30 649 851
30 653 916
4066
33
0.04
5.9E−04
KIAA1949
Body, 5′UTR
Open sea, shelf, shore
18
74 728 721
74 729 664
944
14
0.06
1.4E−03
MBP
TSS1500, TSS200, 5′UTR,firstExon
Open sea
2
225 266 222
225 266 914
693
10
0.09
4.6E−03
FAM124B
TSS200, 5′UTR, firstExon
Open sea
11
62 474 608
62 475 095
488
8
−0.05
4.6E−03
BSCL2
5′UTR
Shore
16
8 806 101
8 807 301
1201
13
−0.03
4.6E−03
ABAT
5′UTR
Open sea
6
29 599 012
29 599 538
527
10
−0.04
5.2E−03
GABBR1
Body
Shore
8
65 291 378
65 296 546
5169
16
−0.06
7.2E−03
MIR124-2
TSS200, Body, IGR (+284 +3092)
Shore, shelf
10
88 423 626
88 428 687
5062
13
0.07
7.2E−03
OPN4-LDB3
3′UTR, TSS1500, TSS200, 5′UTR
Open sea
12
58 013 239
58 013 654
416
15
0.04
7.2E−03
SLC26A10
TSS1500, TSS200
Island
12
104 443 484
104 444 249
766
9
0.07
8.6E−03
GLT8D2
TSS1500, TSS200, 5′UTR
Open sea
7
150 147 440
150 148 331
892
9
0.06
1.0E−02
GIMAP8
TSS1500, TSS200, 5′UTR
Open sea
17
1 810 379
1 811 517
1139
9
−0.05
1.5E−02
RTN4RL1
IGR (−27 512 −26 497)
Shore, island
11
123 986 008
123 986 248
241
9
0.07
1.9E−02
VWA5A
TSS200, 1stExon
Open sea
12
114 841 523
114 842 097
575
8
-0.03
1.9E−02
TBX5
5′UTR, Body
Shore
6
164 505 292
164 508 210
2919
10
0.07
2.5E−02
QKI
IGR (+670522 +671630)
Open sea
2
236 715 276
236 716 843
1568
6
0.07
2.9E−02
AGAP1
Body
Open sea
4
1 166 607
1 167 081
475
7
0.05
2.9E−02
LOC100130872-SPON2
TSS200, Body
Shore
12
106 978 914
106 979 561
648
7
−0.03
2.9E−02
RFX4
Body
Shore, island
22
25 615 204
25 615 564
361
7
0.05
2.9E−02
CRYBB2
TSS1500, TSS200
Open sea
19
1 466 929
1 468 058
1130
7
0.06
3.0E−02
APC2
Body
Island
7
101 555 285
101 558 745
3461
10
−0.05
3.0E−02
CUX1
Body
Shelf, shore
8
9 765 052
9 768 548
3497
6
−0.06
3.0E−02
MIR124-1
IGR (+4685 +5902)
Shore, shelf
11
65 193 185
65 198 928
5744
18
0.07
3.0E−02
NEAT1
Body, IGR (+4664 +7754)
Shelf, open sea
Figure 2.
Validation of 3 independent DMRs by targeted NGBS technology. (A, B, C) Dot plots displaying methylation values at single CpG resolution in ASD and control cohorts for 3 independent DMRs tested by targeted NGBS. The black bar represents the median of the methylation values distribution at each CpG site. Above each plot, a custom track displays the CpGs tested by targeted NGBS (red) in relation to the significant CpGs detected by 450 K BeadArray (blue), the array background (black) and the closest UCSC gene. Each individual CpG within DMRs had a P-value <0.01. CpGs that were represented on the 450 K BeadArray are labeled by their illumina annotation numbers. CpGs that were not represented in the original microarray are labeled by their placement within the examined region. A detailed enumeration of methylation for each CpG investigated by targeted NGBS is reported in Supplementary Table 5. (D, E, F) Scatter plots displaying the correlation (Spearman’s rank r correlation coefficient) and its statistical significance (P-value) between targeted NGBS and 450 K BeadArray methylation values. For each DMR, Spearman’s rank r was calculated by plotting simultaneously the targeted NGBS versus the 450 K BeadArray methylation values of each single probe. A color-coded legend univocally identifies each probe. GABBR1 (GABA B Receptor 1), Mir124-2 (MicroRNA 124-2), FAM124B (Family with Sequence Similarity 124 Member B).
Top 25 DMRs detected in cortical neuronal nuclei between ASD and control groups. Table of main DMRs’ attributes: genomic coordinates, size, probe number, Δβ value and statistical significance, names of gene/s related to each DMR, functional (genomic location) and topological (neighborhood location) featuresValidation of 3 independent DMRs by targeted NGBS technology. (A, B, C) Dot plots displaying methylation values at single CpG resolution in ASD and control cohorts for 3 independent DMRs tested by targeted NGBS. The black bar represents the median of the methylation values distribution at each CpG site. Above each plot, a custom track displays the CpGs tested by targeted NGBS (red) in relation to the significant CpGs detected by 450 K BeadArray (blue), the array background (black) and the closest UCSC gene. Each individual CpG within DMRs had a P-value <0.01. CpGs that were represented on the 450 K BeadArray are labeled by their illumina annotation numbers. CpGs that were not represented in the original microarray are labeled by their placement within the examined region. A detailed enumeration of methylation for each CpG investigated by targeted NGBS is reported in Supplementary Table 5. (D, E, F) Scatter plots displaying the correlation (Spearman’s rank r correlation coefficient) and its statistical significance (P-value) between targeted NGBS and 450 K BeadArray methylation values. For each DMR, Spearman’s rank r was calculated by plotting simultaneously the targeted NGBS versus the 450 K BeadArray methylation values of each single probe. A color-coded legend univocally identifies each probe. GABBR1 (GABA B Receptor 1), Mir124-2 (MicroRNA 124-2), FAM124B (Family with Sequence Similarity 124 Member B).
Perturbation of Neuronal Co-methylation Modules in ASD
The investigation of DMRs is a powerful tool for detecting with high confidence changes in methylation levels on large scale, nevertheless it fails to consider that multiple dispersed loci may change methylation patterns concordantly. To characterize DNA methylation changes associated with ASD at system level, we applied a clustering analytical approach called weighted gene correlation network analysis (WGCNA), that is well-suited to identify correlation patterns among β values across the entire dataset. WGCNA identified 34 discrete methylation modules, and their first principal component (module eigengene) was used to assess the correlation of each module to a specific trait (Fig. 3A). In total, 8 modules were found to be significantly correlated with ASD/control state. Particularly interesting were the greenyellow (n. probes=4 614, r = −0.53, P = 5e−04), green (n. probes=14 558, r = −0.37, p = 0.01) and brown modules (n. probes=49 145, r = 0.44, P = 4e−03) (Fig. 3A) that included 429 out of 537 CpGs belonging to the significant DMRs (brown=278, greenyellow=80, green=71). As shown on the left side of Fig. 3B–D, the greenyellow and green modules displayed an inverse correlation with the ASD/control state, while the brown module was directly correlated to it. Of particular importance, Module Membership (MM) strongly correlated to Probe Significance (PS) for ASD for each of these 3 modules (greenyellow m: r = 0.66, P < 1e−200; green m: r = 0.53, P < 1e−200; brown m: r = 0.61, P < 1e−200) (Fig. 3B–D right side). This supports the assertion that core CpGs of each module are also strongly associated with ASD. In contrast, the correlations between MM and PS in 3 of the other 5 modules that displayed significant correlation to ASD (purple, saddlebrown, cyan) were not significant (at cutoff of P < 0.01) (Supplementary Fig. 10). In addition, these same 3 modules were the least significant of the 8 significant modules in the WGCNA analysis, therefore providing further evidence that the correlation of these 3 modules to ASD may be due to stochastic mechanisms, and not true biological changes. The other 2 modules, orange and skyblue, which did show significant correlation to ASD, contained very few probes (less than 200 each). Therefore, we did not include those modules in our further analysis. Next, we examined the distribution of CpG sites for each of the 3 modules from a topological and functional perspective. Analysis of neighborhood location revealed a probe enrichment in low-density CpG areas (open sea >shelves >shores) (greenyellow m: P < 3.1e−86; green m: P < 1e−200; brown m: P < 1e−200) and a depletion in high-density CpG areas (islands) (greenyellow m: P < 4.2e−86; green and brown m: P < 1e−200) (Fig. 3E). Analysis of genomic location revealed a probe enrichment in gene bodies (greenyellow m: P < 2.6e−08; green m: P < 1.4e−171; brown m: P < 3.5e−70) and a depletion in Tss200 (greenyellow m: P < 1.8e−17; green m P < 3.2e−195; brown m: P < 1e−200) and 1stExon (greenyellow m: P < 1.1e−07; green m P < 1.8e−144; brown m: P < 4.5e−99) that were common to all 3 modules, while other genomic features showed opposite behavior between green, brown versus greenyellow module, with intergenic and 3′UTR regions enriched in the first 2 (Intergenic green m: P < 3.2e−15; brown m: P < 1e−200. 3′UTR green m: P < 1.4e−33; brown m: P < 5.1e−14) and depleted in the greenyellow module (3′UTR P < 0.2; Intergenic P < 7.3e−05) whereas Tss1500 and 5′UTR regions depleted in the first 2 (Tss1500 green m: p < 1e−07; brown m: P < 9.6e−126. 5′UTR green m: P < 8.9e−28; brown m: P < 1.5e−29) and enriched in the greenyellow module (Tss1500 P < 1.4e−08; 5′UTR P < 001) (Fig. 3F).
Figure 3.
WGCNA applied to methylation data. (A) Table of Module-Trait relationship. Each cell reports the Kendall’s τ correlation coefficient and its P-value between the eigengene value of each module (rows) and a specific trait (columns). (B; C; D) On the left side, heat map plotting the β values of each CpG in the module (rows) across the samples (columns), and bar chart of the corresponding module eigengene values (y-axis) across the samples (x-axis), for the greenyellow, green and brown modules. ASD and control samples are color-coded in red and green, respectively. On the right side, scatter plots indicate the correlation (Pearson’s r correlation coefficient) and its significance (P-value) between CpG site Significance (GS) (y-axis) and MM (x-axis) for the corresponding modules. (E, F) Bar plots displaying the topological (neighborhood location) and functional (genomic location) CpG distribution in the 450 K BeadArray, greenyellow, green, and brown modules. The bar plots representing Neighborhood and Genomic locations are color-coded according to the respective legend (right) that univocally identifies each subgroup. The number of CpGs associated to each subgroup is expressed as percentage of the total.
WGCNA applied to methylation data. (A) Table of Module-Trait relationship. Each cell reports the Kendall’s τ correlation coefficient and its P-value between the eigengene value of each module (rows) and a specific trait (columns). (B; C; D) On the left side, heat map plotting the β values of each CpG in the module (rows) across the samples (columns), and bar chart of the corresponding module eigengene values (y-axis) across the samples (x-axis), for the greenyellow, green and brown modules. ASD and control samples are color-coded in red and green, respectively. On the right side, scatter plots indicate the correlation (Pearson’s r correlation coefficient) and its significance (P-value) between CpG site Significance (GS) (y-axis) and MM (x-axis) for the corresponding modules. (E, F) Bar plots displaying the topological (neighborhood location) and functional (genomic location) CpG distribution in the 450 K BeadArray, greenyellow, green, and brown modules. The bar plots representing Neighborhood and Genomic locations are color-coded according to the respective legend (right) that univocally identifies each subgroup. The number of CpGs associated to each subgroup is expressed as percentage of the total.
ASD-related Co-methylation Modules are Enriched for Synaptic, Neuronal, GABAergic and Immune Processes
To gain insight into the biological meaning of each module, we performed GO analysis for Biological Process and InterPro categories on CpGs within greenyellow, green, and brown modules. To determine the enrichment for CpG sites in respect to a gene function or a genomic feature we used GREAT, a bioinformatics tool designed for the analysis of cis-regulating factors, such as epigenetic modifications (McLean et al. 2010). The analysis was performed setting the 450 K BeadArray as background and inputting only probes with MM > 0.7. Greenyellow and green modules were both enriched for genes related to synaptic and neuronal processes, that is, synaptic transmission (greenyellow BonfHyperP < 1.3e−21; green BonfHyperP < 2.3e−23), regulation of synaptic plasticity (greenyellow BonfHyperP < 1.7e−11; green BonfHyperP < 3.2e−28), regulation of neuron projection development (greenyellow BonfHyperP < 8.0e−09; green BonfHyperP < 2.3e−21), GABA metabolic process (greenyellow BonfHyperP < 8.0e−09) and GABA signaling pathway (green BonfHyperP < 4.9e−06). Conversely, brown module revealed an enrichment for genes related to immune response processes, that is, chemokine-mediated signaling pathway (BonfHyperP < 1.9e−14), or specifically for protein domains or families, such as TNF-like domain (BonfHyperP < 2.3e−19) and complement C1q protein (BonfHyperP < 2.0e−11). We considered significant only GO terms showing a Region-based Fold Enrichment >2 and BonfHyperP < 0.01 (Fig. 4A,B; Supplementary Table 7). Since in biological systems genes do not stand on their own, but interact reciprocally to form complex networks underpinning their biological activity, in order to unravel this complexity, a bottom-up analysis toward a systems biology approach, such as in network analysis, is required. Considering that greenyellow and green modules had a substantial overlap of genes and biological processes, we deemed worthwhile to merge them into a single PPI network. According to the parameters set out for the analysis (see Materials and Methods), we identified 12 hub proteins that can be grossly divided into 2 groups: a first group composed of protein kinases at enzymatic or regulatory activity with neuronal (CAMK2A, BRSK2, STK38L) and non-neuronal specificity (PRKCA, PRKCB, PRKAG2, PRKAR1B); a second group consisted of proteins that are subunits of NMDA (GRIN1, GRIN2A, GRIN2B) and Kainate ionotropic receptors (GRIK5) (Fig. 4C). Notably, of these neuron-specific serine/threonine kinases, CAMK2A regulates NMDAR-dependent long-term potentiation and neurotransmitter release, BRSK2 plays a key role in polarization of neurons and axonogenesis, and STK38L has been hypothesized, by sequence similarity, to be involved in the regulation of structural processes in differentiating and mature neuronal cells. On the other side, subunits of NMDA and Kainate ionotropic receptors play a pivotal role in long-potentiation and synaptic plasticity by regulating the glutamatergic neurotransmission. In the brown module, the top ranked of 12 hub proteins was OAS2, a member of the 2–5 A synthetase family, that are essential proteins involved in the innate immune response to viral infection as part of the IFN-γ signaling (Fig. 4D).
Figure 4.
Gene Ontology and PPI Network analysis. GO analysis for Biological Process and InterPro categories on CpGs associated to greenyellow, green and brown modules. (A) Dot plot depicting the top 5 enriched terms for each of the 2 GO categories. The analysis was performed on CpGs with MM > 0.7 using GREAT web-based tool, and were considered significant only GO terms with a Region-based Fold Enrichment >2 and BonfHyperP < 0.01. (B) A table chart reporting the ID and a brief description of each GO term. RTK, receptor tyrosine kinase. TLR, toll-like receptor. GABA, γ-Aminobutyric acid. BAIAP2, brain-specific angiogenesis inhibitor (BAI1)-binding protein. GPCR, G-protein-coupled receptor. ABAT, 4-aminobutyrate aminotransferase. (C, D) Circle plots displaying PPI network analysis performed on gene sets resulting from the merging of greenyellow and green modules, and from brown module, respectively. The top twelve most interconnected genes (hubs) are displayed in the middle of the circle plot. The edge width (k value, 0.1 < k < 1) reflects the strength of PPI derived from experimental evidence. The node size, defined by −log10 HyperP-value, reflects the per-gene enrichment for a number of probes greater than expected by chance, considering the total number of probes in 450 K BeadArray, the probe number assigned by Illumina to each gene, and the module size. Node transparency refers to the degree of connectivity. On the left, a legend illustrates the numerical scale for all the parameters displayed in the circle plot.
Gene Ontology and PPI Network analysis. GO analysis for Biological Process and InterPro categories on CpGs associated to greenyellow, green and brown modules. (A) Dot plot depicting the top 5 enriched terms for each of the 2 GO categories. The analysis was performed on CpGs with MM > 0.7 using GREAT web-based tool, and were considered significant only GO terms with a Region-based Fold Enrichment >2 and BonfHyperP < 0.01. (B) A table chart reporting the ID and a brief description of each GO term. RTK, receptor tyrosine kinase. TLR, toll-like receptor. GABA, γ-Aminobutyric acid. BAIAP2, brain-specific angiogenesis inhibitor (BAI1)-binding protein. GPCR, G-protein-coupled receptor. ABAT, 4-aminobutyrate aminotransferase. (C, D) Circle plots displaying PPI network analysis performed on gene sets resulting from the merging of greenyellow and green modules, and from brown module, respectively. The top twelve most interconnected genes (hubs) are displayed in the middle of the circle plot. The edge width (k value, 0.1 < k < 1) reflects the strength of PPI derived from experimental evidence. The node size, defined by −log10 HyperP-value, reflects the per-gene enrichment for a number of probes greater than expected by chance, considering the total number of probes in 450 K BeadArray, the probe number assigned by Illumina to each gene, and the module size. Node transparency refers to the degree of connectivity. On the left, a legend illustrates the numerical scale for all the parameters displayed in the circle plot.
Module CpGs are Significantly Associated to ASD Candidate Genes
We tested the specificity of these 3 modules to ASD by assessing their enrichment for GWAS databases for other psychiatric (Alzheimer and Schizophrenia) and non-psychiatric disorders (SLE, Psoriasis, Diabetes type2 and Atherosclerosis), as well as to the SFARI database of ASD associated genes. Due to the lack of significant GWAS data for ASD, we used the genes in the SFARI database that rank at strong confidence (gene score ≤4). Remarkably, the greenyellow (P < 2e−12) and green (P < 1e−26) modules were enriched exclusively for genes associated to ASD while the brown module was enriched for immune genes primarily linked to autoimmune disorders, that is, SLE (P < 2e−07) and Psoriasis (P < 0.0001), and sometimes present in complex diseases as Diabetes type2 (P < 0.004) having an immune component among the risk factors (Fig. 5). We further tested these positive findings with permutation testing with 1000 permuted databases. All of these enrichments were also significantly correlated in the permutation analysis (Supplementary Table 16).
Figure 5.
Enrichment analysis for disease-specific gene associations. Bar plots showing the enrichment analysis of ASD/control trait-related modules for GWAS of psychiatric and non-psychiatric disorders. In the case of ASD, GWAS data were not used, but rather SFARI genes (gene score ≤4). Each color identifies univocally a specific disorder as described in the color legend. The length of the bars reflects the statistical significance of the enrichment analysis. Only HyperP-value <0.01 were considered statistically significant.
Enrichment analysis for disease-specific gene associations. Bar plots showing the enrichment analysis of ASD/control trait-related modules for GWAS of psychiatric and non-psychiatric disorders. In the case of ASD, GWAS data were not used, but rather SFARI genes (gene score ≤4). Each color identifies univocally a specific disorder as described in the color legend. The length of the bars reflects the statistical significance of the enrichment analysis. Only HyperP-value <0.01 were considered statistically significant.
ASD-related DMRs Overlap with Neurodevelopment-specific DMRs
Next, we asked if the methylation patterns we detected both at DMR and module system levels were preserved to a certain extent in the methylation signature specific to early stages of neurodevelopment. To this end, we referred to 2 recent manuscripts: the first, authored by Spiers et al. (2015), profiling the whole human brain methylome from embryonic (post-conceptional week (pcw) 4) to late fetal stage (pcw 26), whereas the second, from Jaffe et al. (2016), investigating the DNA methylation signature in the dorsolateral prefrontal cortex (DLPFC) from embryonic/fetal (Age mean: 14–20 pcw) to early/late childhood stages (Age mean: 6.6 years). First, we investigated the overlap with the datasets at whole-genome resolution by comparing 58 DMRs (FDR < 0.05), found by the present study, versus 4 792 DMRs (FDR < 0.05) detected by Spiers et al. (2015) and 6 480 DMRs (FWER < 0.05) from Jaffe et al. (2016). Remarkably, 27 and 52 out of 58 DMRs overlapped with Spiers et al. (2015) and Jaffe et al. (2016) datasets, respectively, at very high significance level (FET, GlobalP < 1.4e−51; GlobalP < 8.5e−118) (Supplementary Table 15). We also tested for the same overlap at single chromosome resolution, demonstrating that all the 27 and 52 regions, for a total of 14 and 18 chromosomes represented, respectively, were statistically significant (FET, LocalP < 0.05). Furthermore, 18 out of 58 DMRs were commonly overlapping between the 2 datasets (Supplementary Tables 14 and 15). These findings were particularly striking considering that methylation data from Spiers et al. (2015) and Jaffe et al. (2016) were neither cell type nor brain region specific. However, the remarkable overlap we detected against the 2 datasets, regardless of many limiting factors, suggests that regions that are more likely to display dysregulated methylation in ASD brain, are the same that undergo drastic reconfiguration of methylation levels from embryonic to late fetal neurodevelopmental stage. Considering that brain development includes differentiation of immature precursors into neuronal and non-neuronal cell types, we referred to the Lister et al. (2013) study, that profiled the global methylome reconfiguration occurring in NeuN+ and NeuN− cell populations, of mouse and human frontal cortex, at indicative neurodevelopmental time points. We compared all 58 DMRs from the present study against NeuN+ and NeuN− methylation signatures detected at identical genomic coordinates by Lister et al. (2013). Remarkably, all of the 58 DMRs were located in genomic regions characterized by opposite methylation between NeuN+ and NeuN− populations (FET, GlobalP < 1e−200), as summarized in Supplementary Table 14. In Supplementary Fig. 11, we show 6 representative examples of DMRs found by the current study at cell type-specific genomic loci.
Discussion
The aim of this study was to profile the methylome landscape of cortical neurons in individuals with a diagnosis of ASD compared to controls. To our knowledge, this is the first genome-wide study to test neuron-specific methylation patterns in ASD. Differential methylation analyses were performed at 2 levels: at CpG resolution, where none of the 11 886 DMPs (P < 0.01) survived multiple testing correction (FDR < 0.05), and at region-based level, where 58 significant DMRs were identified (FDR < 0.05). Importantly, analyzing differential methylation at the regional level, compared to single site level, is considered more appropriate both because it takes into account the functional significance of multiple CpGs acting concertedly, and because it better protects against technical artefacts, generally affecting individual probes (Ladd-Acosta et al. 2013). In addition, FDR-based correction at the single CpG level is based on the assumption that neighboring methylation sites are independent features, which is not correct from a biological standpoint. Therefore, the lack of significant DMPs is not unexpected. Pipelines, such as bumphunting, which was integrated into the CHAMP tool, have been created to find DMRs, therefore overcoming these disadvantages. Remarkably, all DMRs were closely associated or overlapped known transcripts, suggesting they might play a critical role in regulating gene expression (Lister et al. 2013). Of great interest, both for their statistical significance and for their biological function, were 4 hypomethylated regions: the first 2 overlapping ABAT in 5′UTR and GABBR1 in exon 2/3 (depending on the isoform), while the other 2 overlapping a Tss200 region upstream of Mir124-2 and an intergenic region (IGR) (+4685 +5902) downstream of Mir124-1. The 58 identified DMRs had differential methylation ranging from a 3% change to a 9% change. While these are modest changes, they are comparable to the 3 DMRs found using a similar bumphunting protocol in a previous study of the ASD brain (Nardone et al. 2014). The fact that we found more DMRs is likely due to the issue of tissue heterogeinity, since we examined FACS-sorted nuclei, compared to whole tissue. Despite the modest changes in methylation, it is possible that they have biological significance. Some of these changes may be occurring in specific neuronal subtypes, where their expression is particularly important. For example, the GABA-related genes may be dysregulated specifically in those neuronal subtypes where they are expressed. Further analysis on specific neuronal subtypes would be necessary to understand this issue.Mir124 is a conserved class of MicroRNA that includes 3 members, Mir124-1, Mir124-2 and Mir124-3, located at different genomic positions. Of high importance, Mir124 is the most abundant microRNA in human brain and a master regulator of neuronal identity able to suppress hundreds of non-neuronal genes in a finely tuned spatial and temporal fashion (Sun et al. 2015). Mir124 plays a key role both in neurogenesis, by preventing the translation of many essential transcripts, for example, Sox9, SCP1, Lhx2, JAG1, EfnB1, DLX2 (Sun et al. 2015; Cheng et al. 2009) and in neuronal differentiation, where ensures the transition from neuronal progenitors to mature neurons by directly targeting specific transcripts, for example, Syngr2, Baf53a, Baf45a, PTBP1, STAT3, LAMC1, ITGB1, NeuroD1, RhoG (Yoo et al. 2009; Sun et al. 2015). In adulthood, Mir124 regulates synaptic plasticity by repressing BDNF and CREB (Sun et al. 2015). Of particular relevance to the current study, several publications have found an association between Mir124 and social behavior. In fact, dysregulated levels of Mir124 contributed to the onset of cognition and social dysfunctions in a mouse model of frontotemporal dementia (FTD) and in mice bearing an EPAC null mutation (Yang et al. 2012; Gascon et al. 2014). Similarly, lentiviral-mediated overexpression of Mir124 in the hippocampus-induced anxiety and ASD-like behaviors in rats (Bahi 2016). To add another layer of complexity, Kerek et al. demonstrated that lack of dietary methyl donors in utero in rats was associated with growth retardation at E20 and, postnatally, with long-term brain defects due to an increment of Mir124 expression (Kerek et al. 2013). Despite the lack of mechanistic evidence regarding epigenetic regulation of Mir124 in the brain, the role of DNA methylation in regulating the expression of Mir124-1, -2, and -3 has been shown in humanpancreatic cancer (Wang et al. 2014). Moreover, a recent study profiling the human brain methylome, from embryonic to late fetal stage, reported age-related DMRs that widely overlapped with those we found associated to Mir124-1 and Mir124-2 (Spiers et al. 2015). Notably, the region associated to Mir124-2 was ranked as the second top significant age-related DMR out of 4 792 (Spiers et al. 2015). Overall, considering the abovementioned evidence, it is conceivable that any distortion of methylation trajectory at developmentally regulated regions controlling Mir124-1 and Mir124-2 might have detrimental effects on diverse aspects of brain functioning and behavior.ABAT and GABBR1 genes are part of the GABAergic system and encode for 4-aminobutyrate aminotransferase, an enzyme responsible for GABA catabolism, and for the subunit 1 of GABA typeB receptor, respectively. The importance of GABAergic system in the etiology of ASD was postulated on the basis of the imbalance between excitatory/inhibitory (E/I) neurotransmission (Hussman 2001), justified in part by the high prevalence of epilepsy among ASD subjects (Canitano 2007), and supported by post-mortem brain studies (Fatemi et al. 2009; Oblak et al. 2011). These studies demonstrated in distinct brain regions a decrease in RNA and protein levels of various GABA receptors’ subunits. However, only recent transcriptomic (Voineagu et al. 2011) and Magnetic Resonance Imaging (MRI) studies (Robertson et al. 2016) determined the direct implication of GABAergic system in the imbalance of (E/I) neurotransmission. Nevertheless, the molecular mechanisms underlying the dysregulation of GABAergic system in ASD have not yet been fully elucidated. Despite the fact that genetic linkage and GWAS studies have identified various GABAergic genes as potential ASD candidates, only GABRB3 (Delahanty et al. 2011), SLC6A1 (De Rubeis et al. 2014), and SLC12A5 (Kahle et al. 2014) genes are supported by strong/suggestive evidences. Therefore, genetics alone does not appear to be sufficient to explain GABAergic dysregulation in ASD.For the last decade, mounting body of evidence from studies on human, primate and mouse models, have pinpointed multiple environmental agents as potential risk factors for ASD. Many of them seem to act through epigenetic mechanisms, such as DNA methylation, recently revised by 2 comprehensive reviews (Ciernia and LaSalle 2016; Gesundheit et al. 2016). A unique environmental factor, Maternal Immune Activation (MIA), has been associated with ASD through both epidemiological (Lee et al. 2014) and mouse model studies (Richetto et al. 2016). Of particular relevance to the current investigation, recent studies on rodent models have linked MIA to dysregulated DNA methylation of GABAergic genes (Labouesse et al. 2015; Richetto et al. 2016). Specifically, Labouesse et al. demonstrated an association between hypermethylation at GAD1 and GAD2 promoters, downregulation of both transcripts, presynaptic GABAergic impairments, and behavioral abnormalities in offspring PFC of the MIA model (Labouesse et al. 2015). Notably, similar dynamics were determined for GAD1 gene in the cerebellum of ASD subjects (Zhubi et al. 2014). Moreover, a genome-wide methylation study, performed on mouse model of MIA, identified methylation and expression changes at several loci underlying GABAergic genes that are entailed in neuronal differentiation and signaling, for example, DLX1, LHX5, LHX8 (Richetto et al. 2016). Further corroborating evidence for epigenetic control upon GABAergic system also stems from genetic models. A study on a mouse model of conditional MeCP2 KO, specific to forebrain GABAergic neurons, demonstrated a correlation between loss of MeCP2 occupancy 1 kb upstream of GAD1 and GAD2 Tss, decrease of their mRNA levels, and consequent reduced inhibitory quantal size from GABAergic neurons. Conversely, no decrease in GAD1 and GAD2 mRNA levels was detected in MeCP2 KO excitatory forebrain neurons (Chao et al. 2010). Taken together, these data provide a strong evidence for a role of DNA methylation as interface between environmental insults at early developmental stages, dysregulation of GABAergic system, and ASD onset.WGCNA identified 3 discrete co-methylation modules showing an inverse (greenyellow and green) and direct (brown) correlation with ASD. GO analysis on greenyellow and green modules highlighted regions encompassing neuronal and synaptic genes, whereas GO on brown module highlighted regions associated to immune genes. Interestingly, Voineagu et al., (2011) in a transcriptomic study on BA9, detected similar GO annotations for the top 2 ASD-related modules. The neuronal and immune modules showed inverse and direct correlation to ASD trait, respectively (Voineagu et al. 2011). Furthermore, the neuronal module presented a significant overlap with PVALB+ interneuron module and a synaptic module, previously identified as part of the human brain transcriptional network (Oldham et al. 2009; Voineagu et al. 2011). Interestingly, GO analysis on greenyellow and green modules revealed enrichment for GABAergic system and synaptic categories. Besides this, CpGs within DMRs related to ABAT and GABBR1 were also part of these 2 modules, further corroborating the hypothesis of epigenetic control upon GABAergic system in ASD. GO analysis on brown module highlighted a significant enrichment for immune response categories, including TNF-like domain and complement C1q protein that we found differentially methylated in a recent methylation study on ASD (Nardone et al. 2014). Thus, our system level analysis of neuronal methylome in ASD unveils the existence of a widespread epigenetic dysregulation converging on common biological pathways. Moreover, the functional overlap between co-methylation and co-expression modules could suggest a possible epigenetic control over gene expression.Topological analysis for CpGs within ASD-related modules indicated an enrichment of low-density CpG regions. It has been demonstrated that CpGs located outside CpG islands undergo dynamic methylation changes both spatially, for example, in cell and tissue-specific contexts, and temporally during development and ageing, whereas CpG islands near Tss are usually stably unmethylated (Ciernia and LaSalle 2016). Functional analysis revealed an enrichment for CpGs at gene bodies and a depletion at Tss200 and 1stExon. Wu et al., proved, for the first time, that Dnmt3a-dependent DNA methylation at gene body is positively correlated with gene expression in postnatal neural stem cells (NSCs) and is required for neurogenesis (Wu et al. 2010). This may suggest a partial explanation behind the positive correlation between ASD-related co-methylation modules from the present study and co-expression modules from Voineagu et al. (2011). For example, at the system level Voineagu et al. described increased gene expression of immune system genes, whereas the present study suggests increased methylation of immune genes in ASD brains. The fact that the module of immune genes is enriched for gene body CpGs may justify the positive correlation between expression and methylation modules.In our previous methylation study performed on unsorted cells from BA10, we determined a highly significant enrichment of differentially methylated CpGs in immune genes, in contrast to the relatively modest changes in neuronal genes (Nardone et al. 2014). Those findings do not coincide with our current results that highlight both at system level, by WGCNA, and at single region resolution, by DMR analysis, a more central role of DNA methylation upon neuronal genes. Thus, it is conceivable that our previous study was unable to detect neuron-specific difference in methylation that, albeit present, did not emerge from background noise. And if so, the hypomethylation detected at multiple immune-related CpGs, may have originated only from the glia fraction. However, it should be mentioned that several biological and technical factors can influence the resolving power of an epigenetic study (Rakyan et al. 2011). While many of these factors can be controlled through the implementation of experimental, statistical and bioinformatics approaches, many others, above all brain mosaicism (Tasic et al. 2016), small-size changes in methylation, small size of the cohorts, and low neuron/glia ratio (Herculano-Houzel 2009) in PFC, are still the most challenging to be taken into account. Furthermore, despite the existence of statistical methodologies for deconvolving heterogeneous epigenetic data from post-mortem brain into their neuronal and non-neuronal components (Montaño et al. 2013; Jaffe and Irizarry 2014), the level of resolution and accuracy they reached so far, is still lower than what is obtainable by FACS methodology(Shulha et al. 2012). Therefore, these issues are likely to explain part of the discrepancies between different epigenetic studies. However, while changes in immune genes were not as pronounced as in our previous study, one of the ASD-related modules (brown) was highly enriched for genomic regions linked to immune genes, particularly for genes related to complement response, whose dysregulation in methylation have been previously implicated in ASD. Neuronal expression of complement factors, including C1qA and C3, plays a crucial role in synaptic pruning throughout neurodevelopment and also in adulthood (Kettenmann et al. 2013). Abnormalities in immune system at genetic, epigenetic or transcription level, have already been identified in several psychiatric disorders including Schizophrenia (Ripke et al. 2014; Sekar et al. 2016), Alzheimer (Hong et al. 2016) and, recently, also in ASD by transcriptomic and epigenomic post-mortem brain studies (Voineagu et al. 2011; Gupta et al. 2014; Nardone et al. 2014). Therefore, it is conceivable that dysregulation of neuron-specific methylation patterns at regions controlling immune genes might be involved in the biology of ASD.Several lines of evidence emerging from the present study suggest that CpGs, undergoing highly dynamic DNA methylation during neurodevelopment, are particularly susceptible to dysregulation in neurons of ASD individuals. First, there is a significant overlap between ASD-specific DMRs, from the current study, and age-specific DMRs reported by Spiers et al. and Jaffe et al. in their recent publications (Spiers et al. 2015; Jaffe et al. 2016). Second, all 58 ASD-specific DMRs belong to a class of developmentally regulated regions characterized by opposite methylation in NeuN+ versus NeuN− nuclei from DLPFC (Lister et al. 2013). Nevertheless, we still cannot infer a direct causal link between environmental cues, methylation changes and ASD diagnosis: in fact, changes in methylation might arise as result of genetic factors (Iossifov et al. 2014) or, alternatively, they could represent the outcome of compensatory mechanisms or, merely a late-appearing epiphenomenon (Petronis 2010). Additionally, environmental factors could act through other mechanisms rather than epigenetics.It should be mentioned that we might not have been able to identify additional DMRs due to both technical and biological limiting factors. First, the 450 K BeadArray does not include all the CpGs in the human genome, and also includes very little of the non-CpG methylation sites. Therefore, it is possible that there are additional DMRs in the samples that we were not able to detect. Second, the 450 K BeadArray focuses mainly, although not exclusively, on high-density CpG regions, such as CpG islands. However, we detected a high enrichment of CpGs in our WGCNA modules for low-density CpG regions. Therefore, we may have missed additional CpGs in low-density regions that are not well represented on the microarray. On the other hand, one of the major strengths of the 450 K BeadArray, compared to sequencing methods, is the high quality and resolution of data at the level of each CpG. Due to high costs, most whole-genome sequencing studies are not able to reach such a high resolution of methylation analysis, therefore decreasing the likelihood of finding significant DMRs. An additional limitation of our study is that we separated mostly the larger nuclear fraction for analysis, due to technical necessity. Therefore, it is possible that our DMRs are found in a specific neuronal subtype with larger nuclei, and that we may have missed DMRs in neurons that have smaller nuclei. Additionally, there is the possibility that methylation changes occurs only in a specific neuronal subtype (Kozlenkov et al. 2015; Mo et al. 2015), such as PVB + GABAergic interneurons, that account only for 20% of NeuN+ cells in human PFC (Kozlenkov et al. 2015). As such, those DMRs may not be detectable when analyzing the entire neuronal population.There are a few other limitations of this study which must be taken into account. First, we do not have information about ethnicity for most of the samples, particularly the controls. While the same is true for all previous studies utilizing this brain bank, we cannot rule out the possibility that there are some factors of ethnicity which may partially influence our data. There is also limited information about medicines used by patients for most of the samples. In addition, we point out that all of the samples in this study came from the frontal cortex, and all of the samples, except for 3 came from brodmann’s area 10 (1 in brodmann’s area 9; and 2 in brodmann’s area 8). Although these are all areas of the frontal cortex, we cannot rule out slight changes in cortical neuron methylation between these neighboring regions.One other limitation of our study is sample size. The FACS procedure demands intact neurons, which limits the sample size to only those samples that have never been previously thawed after initial freezing. One previous epigenetic study on FACS-separated neurons in a psychiatric condition used similar number of samples to that used on our study (they used 16 controls and 16 ASD cases) (Shulha et al. 2012). In their chip-seq study on H3K4 methyation, they found spreading of H3K4 methylation from the promoter into gene bodies in a subset of ASD cases, but did not find any significantly different methylation peaks between ASD and control cases. There are many technical differences between our studies that may explain the fact that the previous study was not able to decipher significant differences between groups, including that chip-seq is much less sensitive to small changes between samples compared to the 450 K BeadArray, in addition to the fact that chip-seq only detects relative, and not absolute quantities. In addition, it is possible that DNA methylation, and not histone methylation may play a larger factor in ASD. A previous study of DNA methylation in non-sorted ASD brain samples used 21 control and 19 ASD samples (Nardone et al. 2014). In that study, they used the 450 K BeadArray and a similar bioinformatic pipeline to our study, therefore allowing for a stronger comparison between studies. They identified 4 intital DMRs at FDR < 0.1, 3 of which they verified in a seperate brain region, while our study identified 58 DMRs at FDR < 0.05. Therefore, it is plausable that the neuron-specific approach used in the present study allowed for the identification of more DMRs due to the reduction in cell-type epigenetic heterogeinity.Despite the fact that observational studies cannot provide direct mechanistic insight into the role of DNA methylation in the aetiopathogenesis of ASD, dysregulation of DNA methylation detected by human (Ladd-Acosta et al. 2013; Nardone et al. 2014) and rodent studies (Richetto et al. 2016) both at regional and network-based level, strongly supports its critical function in the ASD aetiopathogenesis. Considering our system level analysis of neuronal methylome in ASD within the framework of cell type and developmental stage-specific methylation signatures, we can speculate as to whether environmental insults could alter the normal brain development by interfering with epigenetic machinery and ultimately leading to long-lasting functional and behavioral changes. However, additional research is warranted to ascertain this hypothesis.
Authors’ Contributions
S.N. and E.E. conceived and designed the experiments. S.N., with the contribution of D.S.S, performed all the wet-lab experiments and most of the statistical and bioinformatics data analysis. E.R. and A.Z. performed Principal Component and DMRs enrichment analyses. E.E. provided guidance and oversight on all experiments and analyses. S.N. and E.E. wrote the manuscript in consultation with all authors.Click here for additional data file.
Authors: A Vogel Ciernia; B I Laufer; H Hwang; K W Dunaway; C E Mordaunt; R L Coulson; D H Yasui; J M LaSalle Journal: Cereb Cortex Date: 2020-03-21 Impact factor: 5.357
Authors: Antonietta Messina; Vincenzo Monda; Francesco Sessa; Anna Valenzano; Monica Salerno; Ilaria Bitetti; Francesco Precenzano; Rosa Marotta; Francesco Lavano; Serena M Lavano; Margherita Salerno; Agata Maltese; Michele Roccella; Lucia Parisi; Roberta I Ferrentino; Gabriele Tripi; Beatrice Gallai; Giuseppe Cibelli; Marcellino Monda; Giovanni Messina; Marco Carotenuto Journal: Front Physiol Date: 2018-03-22 Impact factor: 4.566