
Synaptic, transcriptional and chromatin genes disrupted in autism.

Silvia De Rubeis, Xin He, Arthur P Goldberg, Christopher S Poultney, Kaitlin Samocha, A Ercument Cicek, Yan Kou, Li Liu, Menachem Fromer, Susan Walker, Tarinder Singh, Lambertus Klei, Jack Kosmicki, Fu Shih-Chen, Branko Aleksic, Monica Biscaldi, Patrick F Bolton, Jessica M Brownfeld, Jinlu Cai, Nicholas G Campbell, Angel Carracedo, Maria H Chahrour, Andreas G Chiocchetti, Hilary Coon, Emily L Crawford, Sarah R Curran, Geraldine Dawson, Eftichia Duketis, Bridget A Fernandez, Louise Gallagher, Evan Geller, Stephen J Guter, R Sean Hill, Juliana Ionita-Laza, Patricia Jimenz Gonzalez, Helena Kilpinen, Sabine M Klauck, Alexander Kolevzon, Irene Lee, Irene Lei, Jing Lei, Terho Lehtimäki, Chiao-Feng Lin, Avi Ma'ayan, Christian R Marshall, Alison L McInnes, Benjamin Neale, Michael J Owen, Norio Ozaki, Mara Parellada, Jeremy R Parr, Shaun Purcell, Kaija Puura, Deepthi Rajagopalan, Karola Rehnström, Abraham Reichenberg, Aniko Sabo, Michael Sachse, Stephan J Sanders, Chad Schafer, Martin Schulte-Rüther, David Skuse, Christine Stevens, Peter Szatmari, Kristiina Tammimies, Otto Valladares, Annette Voran, Wang Li-San, Lauren A Weiss, A Jeremy Willsey, Timothy W Yu, Ryan K C Yuen, Edwin H Cook, Christine M Freitag, Michael Gill, Christina M Hultman, Thomas Lehner, Aarno Palotie, Gerard D Schellenberg, Pamela Sklar, Matthew W State, James S Sutcliffe, Christopher A Walsh, Stephen W Scherer, Michael E Zwick, Jeffrey C Barrett, David J Cutler, Kathryn Roeder, Bernie Devlin, Mark J Daly, Joseph D Buxbaum.

Abstract

The genetic architecture of autism spectrum disorder involves the interplay of common and rare variants and their impact on hundreds of genes. Using exome sequencing, here we show that analysis of rare coding variation in 3,871 autism cases and 9,937 ancestry-matched or parental controls implicates 22 autosomal genes at a false discovery rate (FDR) < 0.05, plus a set of 107 autosomal genes strongly enriched for those likely to affect risk (FDR < 0.30). These 107 genes, which show unusual evolutionary constraint against mutations, incur de novo loss-of-function mutations in over 5% of autistic subjects. Many of the genes implicated encode proteins for synaptic formation, transcriptional regulation and chromatin-remodelling pathways. These include voltage-gated ion channels regulating the propagation of action potentials, pacemaking and excitability-transcription coupling, as well as histone-modifying enzymes and chromatin remodellers, most prominently those that mediate post-translational lysine methylation/demethylation modifications of histones.

Year: 2014 PMID: 25363760 PMCID: PMC4402723 DOI: 10.1038/nature13772

Source DB: PubMed Journal: Nature ISSN: 0028-0836 Impact factor: 49.962

Features of subjects with autism spectrum disorder (ASD) include compromised social communication and interaction. Because the bulk of risk arises from de novo and inherited genetic variation[1-10], characterizing which genes are involved informs on ASD neurobiology and on what makes us social beings. Whole-exome sequencing (WES) studies have proved fruitful in uncovering risk-conferring variation, especially by enumerating de novo variation, which is sufficiently rare that recurrent mutations in a gene provide strong causal evidence. De novo loss-of-function (LoF) single-nucleotide variants (SNVs) or insertion/deletion (indel) variants[11-15] are found in 6.7% more ASD subjects than in matched controls and implicate nine genes from the first 1,000 ASD subjects[11-16]. Moreover, because there are hundreds of genes involved in ASD risk, ongoing WES studies should identify additional ASD genes as an almost linear function of increasing sample size[11]. Here, we conduct the largest ASD WES study to date, analyzing 16 sample sets comprising 15,480 DNA samples (Supplementary Table 1; Extended Data Fig. 1). Unlike earlier WES studies, we do not rely solely on counting de novo LoF variants; rather, we use novel statistical methods to assess association for autosomal genes by integrating de novo, inherited and case-control LoF counts, as well as de novo missense variants predicted to be damaging. For many samples, original data from sequencing performed on Illumina HiSeq 2000 systems were used to call SNVs and indels in a single large batch using GATK (v2.6). De novo mutations were called using enhancements of earlier methods[14] (Supplementary Information), with calls validating at extremely high rates.

Extended Data Figure 1

Workflow of the study

The workflow began with 16 sample sets, as listed in Supplementary Table 1. DNA was obtained, and exomes were captured and sequenced. After variant calling, quality control (QC) was performed: duplicate subjects and incomplete families were removed, as were subjects with extreme genotyping, de novo, or variant rates. Following cleaning, 3,871 subjects with ASD remained. Analysis proceeded separately for SNVs and indels, and for CNVs. De novo and transmission/non-transmission variants were obtained for trio data (published de novo variants from 825 trios[11,13-15] were incorporated). This path led to the TADA analysis, which found 33 ASD risk genes with q < 0.1 and 107 with q < 0.3. CNVs were called in 2,305 ASD subjects.

After evaluation of data quality, high-quality alternate alleles with a frequency of < 0.1% were identified, restricting to LoF (frameshifts, stop gains, donor/acceptor splice-site mutations) or probably damaging missense (Mis3) variants (defined by PolyPhen-2[17]). Variants were classified by type (de novo, case, control, transmitted, non-transmitted) and severity (LoF, Mis3), and counts were tallied for each gene. Some 13.8% of the 2,270 autism trios (two parents and one affected child) carried a de novo LoF mutation, significantly in excess of expectation[18] (8.6%, P < 10^−14) or of what is observed in 510 control trios (7.1%, P = 1.6×10^−5) collected here and previously published[15]. Eighteen genes (Table 1) were hit by two or more de novo LoF mutations. These genes are all known or strong candidate ASD genes, but given the number of trios sequenced, we expect approximately two such genes by chance given gene mutability[14,18]. While we expect only 2 de novo Mis3 events in these 18 genes, we observe 16 (P = 9.2×10^−11, Poisson test). Because much of our data exist in cases and controls, and because we observed an additional excess of transmitted LoF events in the 18 genes, it is evident that the optimal analysis framework must integrate de novo mutation with variants observed in cases and controls and variants transmitted or untransmitted from carrier parents. Going beyond de novo LoFs is also critical given that many ASD risk genes and loci have mutations that are not completely penetrant.
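The Poisson comparison of observed and expected de novo Mis3 counts can be sketched as below. This is a minimal illustration, not the authors' code: the function name is hypothetical, and the expectation of 2 is the rounded figure quoted in the text. With that rounded input the tail probability comes out on the order of 10^−10, the same order as the reported value but not identical to it, presumably because the published test used an unrounded expectation.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected).

    The tail is summed term by term rather than computed as 1 - CDF,
    because 1 - CDF loses all precision for very small P-values.
    """
    if observed <= 0:
        return 1.0
    # First term of the tail: P(X = observed).
    term = math.exp(-expected) * expected ** observed / math.factorial(observed)
    total = 0.0
    k = observed
    # Keep adding terms until they no longer change the running sum.
    while term > 0.0 and term > total * 1e-17:
        total += term
        k += 1
        term *= expected / k  # P(X = k) from P(X = k - 1)
    return total


# Rounded inputs from the text: 16 de novo Mis3 events observed, about 2 expected.
p_value = poisson_upper_tail(16, 2.0)
print(f"P(X >= 16 | mean 2) = {p_value:.2e}")
```

Summing the tail directly is a deliberate choice here: for an excess this large, subtracting the cumulative probability from 1 would round to zero in double precision.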

Table 1

ASD risk genes[1].

dnLoF[2] count	q≤0.01	0.01<q≤0.05	0.05<q≤0.1
≥2	ADNP, ANK2, ARID1B, CHD8, CUL3, DYRK1A, GRIN2B, KATNAL2, POGZ, SCN2A, SUV420H1, SYNGAP1, TBR1	ASXL3, BCL11A, CACNA2D3, MLL3	ASH1L
1		CTTNBP2, GABRB3, PTEN, RELN	APH1A, CDC42BPB, ETFB, NAA15, MYO9B, MYT1L, NR3C2, SETD5, TRIO
0		MIB1	VIL1

[1] TADA analysis of loss-of-function (LoF) and damaging missense variants found to be de novo in ASD subjects, inherited by ASD subjects, or in ASD subjects (versus control subjects).

[2] De novo LoF events.

Transmission and De novo Association

We adopted TADA (for ‘Transmission and De novo Association’), a weighted, statistical model integrating de novo, transmitted and case-control variation[19]. TADA uses a Bayesian gene-based likelihood model including per gene mutation rates, allele frequencies, and relative risks of particular classes of sequence changes. We modeled both LoF and Mis3 sequence variants. Because no aggregate association signal was detected for inherited Mis3 variants, they were not included in the analysis. For each gene, variants of each class were assigned the same effect on relative risk. Using a prior probability distribution of relative risk across genes for each class of variants, the model effectively weighted different classes of variants in this order: de novo LoF > de novo Mis3 > transmitted LoF, and allowed for a distribution of relative risks across genes for each class. The strength of association was assimilated across classes to produce a gene-level Bayes Factor (BF) with a corresponding False Discovery Rate or FDR q-value. This framework increases the power compared to use of de novo LoF alone (Extended Data Fig. 2).
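The last step described above, turning gene-level Bayes factors into FDR q-values, can be sketched with the usual Bayesian FDR recipe: convert each Bayes factor to a posterior probability that the gene is not a risk gene, rank genes by that probability, and report the running mean as the q-value. This is an illustrative sketch only; the function name, the toy Bayes factors and the prior fraction of risk genes (`pi_risk`) are assumptions, and the exact procedure used in the study is the one described in ref. 19 and the Supplementary Information.

```python
from typing import List


def bayes_factors_to_q(bayes_factors: List[float], pi_risk: float) -> List[float]:
    """Convert per-gene Bayes factors to Bayesian FDR q-values.

    pi_risk is the assumed prior fraction of genes that are risk genes
    (a hypothetical input here, not a value taken from the paper).
    """
    prior_odds = pi_risk / (1.0 - pi_risk)
    # Posterior probability that each gene is NOT a risk gene.
    post_null = [1.0 / (1.0 + bf * prior_odds) for bf in bayes_factors]
    # Rank genes from strongest to weakest evidence.
    order = sorted(range(len(post_null)), key=lambda i: post_null[i])
    q_values = [0.0] * len(post_null)
    running_sum = 0.0
    for rank, gene_index in enumerate(order, start=1):
        running_sum += post_null[gene_index]
        # q-value: expected fraction of false discoveries among the top `rank` genes.
        q_values[gene_index] = running_sum / rank
    return q_values


# Toy example with three hypothetical genes.
toy_q = bayes_factors_to_q([1000.0, 10.0, 1.0], pi_risk=0.05)
print([round(q, 3) for q in toy_q])
```

Under this scheme a gene list cut at q < 0.1 is expected to contain about 10% false discoveries, which is how the thresholds of 0.1 and 0.3 used in the text should be read.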

Extended Data Figure 2

Expected number of ASD genes discovered as a function of sample size

The Multiple LoF test (red) is a restricted version of TADA that uses only the de novo LoF data. TADA (blue) models de novo LoF, de novo Mis3, LoF variants transmitted/not transmitted and LoF variants observed in case/control samples. The sample size (N) indicates either (i) N trios, for which we record de novo and transmitted variation, or (ii) N trios, for which we record only de novo events, plus N cases and N controls.

TADA identified 33 autosomal genes with an FDR < 0.1 (Table 1) and 107 genes with an FDR < 0.3 (Supplementary Tables 2 and 3 and Extended Data Fig. 3). Of the 33 genes, 15 (45.5%) are known ASD risk genes[9]; 11 have been reported previously with mutations in ASD patients but were not classed as true risk genes owing to insufficient evidence (SUV420H1[11,15], ADNP[12], BCL11A[15], CACNA2D3[15,20], CTTNBP2[15], GABRB3[20], CDC42BPB, APH1A[14], NR3C2[15], SETD5[14,21] and TRIO); and 7 are completely novel (ASH1L, MLL3, ETFB, NAA15, MYO9B, MIB1, VIL1). ADNP mutations have recently been identified in 10 patients with ASD and other shared clinical features[22]. Two of the newly discovered genes, ASH1L and MLL3, converge on chromatin remodeling. MYO9B plays a key role in dendritic arborization[23]. MIB1 encodes an E3 ubiquitin ligase critical for neurogenesis[24] and is regulated by miR-137[25], a microRNA that regulates neuronal maturation and is implicated in risk for schizophrenia[26].

Extended Data Figure 3

Heat map of the numbers of variants used in TADA analysis from each dataset in genes with q < 0.3

Left panel, variants in affected subjects; right panel, unaffected subjects. For the counts, we only focus on de novo LoF and Mis3 variants, transmitted/un-transmitted and case/control LoF variants. These variant counts are normalized by the length of coding regions of each gene and sample size of each dataset (|trio|+|case| for left panel, |trio|+|control| for the right panel).

When the WES data from genes with FDR < 0.3 were evaluated for the presence of deletion copy number variants (CNVs), which are functionally equivalent to LoF mutations, 34 CNVs meeting quality and frequency constraints (Supplementary Information) were detected in 5,781 samples (Extended Data Fig. 1). Of the 33 genes with FDR < 0.1, three contained deletion CNVs mapping to three ASD subjects and one parent. Of the 74 genes meeting the criterion 0.1 ≤ FDR < 0.3, about a third could be false positives. Deletion CNVs were found in 14 of these genes, and the data supported risk status for 10 of them (Extended Data Table 1, Extended Data Fig. 4). Two of the 10, NRXN1 and SHANK3, were previously implicated in ASD[2,3,10]. The risk from deletion CNVs, as measured by the odds ratio, is comparable to that from LoF SNVs in cases versus controls or from transmission of LoF variants from parents to offspring.

Extended Data Table 1

CNVs hitting TADA genes.

Gene	ASD subject		Unaffected parent[2]			Unaffected	Odds ratio[4]
	Unknown inheritance	Inherited	Tr-ASD[3]	NT[3]	Tr-not-ASD[3]		
q-value < 0.1
ANK2	1						∞
ASXL3	1						∞
VIL1		1	1				1.49
0.1 ≤ q-value < 0.3: Evidence for role in ASD
UTP6	1						∞
DNAH10		1	1				1.49
ATP1B1	1						∞
GGNBP2	1						∞
NRXN1		2	1				2.99
WHSC1	1						∞
HDLBP [5]	1	2	1		1	1	2.24
CERS4		1	1				1.49
SHANK3	4						∞
IQGAP2	1						∞
0.1 ≤ q-value < 0.3: Evidence against role in ASD
EP400						1	0
SLCO1B1 [5,6]	1	1	1	1		1	0.996
SLCO1B3 [6]		1	1	2		1	0.37
KDM6B						1	0

Count of deletion copy number variants, inferred from sequence, for ASD subjects and those unaffected by ASD. Number of subjects and family status: 849 ASD without family information; 1467 ASD subjects in families; 2766 unaffected parents; 319 unaffected siblings of ASD subjects; 373 unaffected subjects without family information.

[2] No parents in this count were affected; 7 parents in the study were affected, none carried a CNV reported in the table and these subjects did not enter the calculation.

[3] Tr-ASD = transmitted to ASD subject from carrier parent; NT = parent a carrier but CNV not transmitted to affected child; Tr-not-ASD = parent transmits a CNV to an unaffected child.

[4] To compute the odds ratio we count the number ‘a’ of affected carriers, ‘b’ unaffected carriers (including parents), ‘c’ affected subjects who do not have the CNV, and ‘d’ unaffected non-carriers. The odds ratio = (ad)/(bc).

[5] One parent transmits the CNV to an affected and unaffected offspring; to obtain the total count of controls with a CNV, subtract one.

[6] Genes are adjacent in the genome (see Extended Data Fig. 4). For 3 subjects both genes are hit by the same CNV (1 ASD and 2 unaffected subjects).
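The odds-ratio rule in the table notes, (ad)/(bc), can be checked with a short sketch. It assumes the denominators are the subject totals obtained by summing the categories listed under the table (849 + 1467 affected; 2766 + 319 + 373 unaffected); the function and constant names are hypothetical. Under that assumption the tabulated values for VIL1, NRXN1, HDLBP and SLCO1B3 are recovered after rounding.

```python
import math

# Totals assumed from the subject counts listed under the table.
N_AFFECTED = 849 + 1467          # ASD subjects with and without family information
N_UNAFFECTED = 2766 + 319 + 373  # unaffected parents, siblings and other subjects


def cnv_odds_ratio(affected_carriers: int, unaffected_carriers: int,
                   n_affected: int = N_AFFECTED,
                   n_unaffected: int = N_UNAFFECTED) -> float:
    """Odds ratio (a*d)/(b*c) for carrying a deletion CNV in a given gene."""
    a = affected_carriers
    b = unaffected_carriers
    c = n_affected - a      # affected non-carriers
    d = n_unaffected - b    # unaffected non-carriers
    if b == 0 or c == 0:
        # No unaffected carriers (or no affected non-carriers): ratio is unbounded.
        return math.inf if a * d > 0 else math.nan
    return (a * d) / (b * c)


print(round(cnv_odds_ratio(1, 1), 2))  # VIL1: one ASD carrier, one carrier parent
print(round(cnv_odds_ratio(2, 1), 2))  # NRXN1: two ASD carriers, one carrier parent
print(cnv_odds_ratio(1, 0))            # ANK2: carrier seen only in an ASD subject
```

Genes whose only carriers are ASD subjects have b = 0, which is why the table reports an infinite odds ratio for them; with single-digit counts these estimates are necessarily imprecise.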

Extended Data Figure 4

Genome browser view of the CNV deletions identified in ASD affected subjects

The deletions are displayed in red if of unknown inheritance, in grey if inherited, and in black if found in unaffected subjects. Deletions in parents are not shown. For deletions within a single gene, all splicing isoforms are shown.

Estimated odds ratios of top genes

Inherent in our conception of the biology of ASD is the notion that there is variation between genes in their impact on risk: for a given class of variants (e.g., LoF), some genes have large impact, others smaller, and still others have no effect at all. Yet mis-annotation of variants, among other confounds, can produce false variant calls in subjects (Supplementary Information). These confounds can often be overcome by examining the data in a manner orthogonal to gene discovery. For example, females have greatly reduced rates of ASD relative to males (a so-called ‘female protective effect’). Consequently, and regardless of whether this is diagnostic bias or biological protection, females have a higher liability threshold, requiring a larger genetic burden before being diagnosed[21,27,28]. A corollary is that if a variant has the same effect on autism liability in males as it does in females, that variant will be at higher frequency in female ASD cases compared to males. Importantly, the magnitude of the difference is proportional to risk as measured by the odds ratio (OR); hence, the effect on risk for a class of variants can be estimated from the difference in frequency between males and females. Genes with FDR < 0.1 show profound female enrichment for de novo events (P=0.005 for LoF, P=0.004 for Mis3), consistent with de novo events having large impact on liability (OR ≥ 20; Extended Data Fig. 5). Genes with FDR between 0.1 and 0.3, however, show substantially less enrichment for female events, consistent with a modest impact for LoF variants (OR range 2-4, whether transmitted or de novo) and little to no effect from Mis3 variants. The results are consistent with inheritance patterns: LoF mutations in FDR < 0.1 genes are rarely inherited from unaffected parents, while those in the 0.1 < FDR < 0.3 group are far more often inherited than de novo.
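The liability-threshold argument can be made concrete with a toy calculation. The sketch below assumes a standard-normal liability, a rare variant that shifts liability by a fixed amount in both sexes, and illustrative prevalences of 2% in males and 0.5% in females; all of these numbers and names are hypothetical and are not taken from the study. It shows that the female-to-male ratio of carrier frequency among cases is 1 for a variant with no effect and grows as the effect grows.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # liability is modelled as standard normal


def case_enrichment(prevalence: float, shift: float) -> float:
    """Carrier frequency in cases relative to the general population.

    Assumes a rare variant that adds `shift` standard deviations of
    liability; rare enough that it does not move the threshold itself.
    """
    threshold = STD_NORMAL.inv_cdf(1.0 - prevalence)
    penetrance = 1.0 - STD_NORMAL.cdf(threshold - shift)
    return penetrance / prevalence


def female_to_male_ratio(shift: float,
                         male_prevalence: float = 0.02,
                         female_prevalence: float = 0.005) -> float:
    """Ratio of carrier frequency in female cases to that in male cases."""
    return (case_enrichment(female_prevalence, shift)
            / case_enrichment(male_prevalence, shift))


for shift in (0.0, 1.0, 2.0):
    print(shift, round(female_to_male_ratio(shift), 2))
```

Because the ratio rises monotonically with the liability shift, an observed sex difference in variant frequency carries information about effect size, which is the logic the paragraph relies on.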

Extended Data Figure 5

Frequency of variants by gender

Frequency of de novo (DN) and transmitted (TR) variants per sample in males (black) and females (white) for genes with q < 0.1 (upper panel), q < 0.3 (central panel), or all TADA genes (lower panel). The P values were determined by a one-tailed permutation test (*P < 0.05; **P < 0.01; ***P < 0.001).

By analyzing the distribution of relative risk over inferred ASD genes[19], the number of ASD risk genes can be estimated. The estimate relies on the balance of genes with multiple de novo LoF mutations versus those with only one: the larger the number of ASD genes, the greater proportion that will show only one de novo LoF. This approach yields an estimate of 1,150 ASD genes (Supplementary Information). While there are many more genes to be discovered, many will have a modest impact on risk compared to the genes in Table 1.

Enrichment analyses

FDR < 0.3 gene sets are strongly enriched for genes under evolutionary constraint[18] (P = 3.0×10^−11, Fig. 1a, Supplementary Table 4), consistent with the hypothesis that heterozygous LoF mutations in these genes are ASD risk factors. Indeed, over 5% of ASD subjects carry de novo LoF mutations in our FDR < 0.3 list. We also observed that genes in the FDR < 0.3 list had a significant excess of de novo LoF events detected by the largest schizophrenia WES study to date[29] (P=0.0085, Fig. 1a), providing further evidence for overlapping risk loci between these disorders and independent confirmation of the signal in the gene sets presented here.

Figure 1

ASD genes in synaptic network

a. Enrichment of 107 TADA genes in: FMRP targets from two independent datasets and their overlap; RBFOX targets; RBFOX targets with predicted alterations in splicing; RBFOX and H3K4me3 overlapping targets; genes with de novo mutations in schizophrenia; human orthologues of Genes2Cognition mouse synaptosome or PSD genes; constrained genes; and, genes encoding mitochondrial proteins (as a control). Red bars indicate empirical P-values. b. Synaptic proteins encoded by TADA genes. c. De novo Mis3 variants in Nav1.2 (SCN2A). The four repeats (I-IV) with P-loops, the EF-hand, and the IQ domain are shown, as are the four amino acids (DEKA) forming the inner ring of the ion selectivity filter. d. Relevant variants in Cav1.3 (CACNA1D). Part of the channel is shown, including helices one and six (S1 and S6) for the I-IV domains, NSCaTE motif, EF-hand domain, pre-IQ, IQ, PCRD, DCRD, proline-rich region, and PDZ-binding motif.

We found significant enrichment for genes encoding mRNAs targeted by two neuronal RNA-binding proteins: FMRP[30] (also known as FMR1), mutated or absent in fragile X syndrome (P = 1.20×10^−17, 34 targets[30], of which 11 are corroborated by an independent data set[31]), and RBFOX (RBFOX1/2/3) (P=0.0024, 20 targets, of which 12 overlap with FMRP), with RBFOX1 shown to be a splicing factor dysregulated in ASD[32,33] (Fig. 1a). These two pathways expand the complexity of ASD neurobiology to post-transcriptional events, including splicing and translation, both of which would sculpt the neural proteome. We found nominal enrichment for human orthologs of mouse genes encoding synaptic (P=0.031) and postsynaptic density (PSD) proteins[34] (P=0.046, Fig. 1a, 1b, Supplementary Tables 4, 5 and 6). Enrichment analyses for InterPro, SMART, or Pfam domains (FDR < 0.05 and a minimum of 5 genes per category) reveal an overrepresentation of DNA/histone-related domains: eight genes encoding proteins with InterPro zinc finger (Znf) FYVE PHD domains (142 such annotated genes in the genome; FDR = 7.6×10^−4), and five with Pfam Su(var)3-9, Enhancer-of-zeste (SET) domains (39 annotated in the genome; FDR = 8.2×10^−4).

Integrating complementary data

To implicate additional genes in risk for ASD, we use a model called DAWN[35]. DAWN uses a hidden Markov random field framework to identify clusters of genes that show strong association signal and highly correlated co-expression in a key tissue and developmental context. Previous research suggests that human mid-fetal prefrontal and motor-somatosensory neocortex is such a critical nexus for risk[16]; we therefore evaluated gene co-expression data from that tissue together with TADA scores for genes with FDR < 0.3. Because this list is enriched for genes under evolutionary constraint, we generalized DAWN to incorporate constraint scores (Supplementary Information). When (a) TADA results, (b) gene co-expression in mid-fetal neocortex, and (c) constraint scores are jointly modeled, DAWN identifies 160 genes that plausibly affect risk (Fig. 2), 91 of which are not in the top 107 TADA genes. Moreover, the model parameter describing evolutionary constraint is an important predictor of clusters of putative risk genes (P=0.018).

Figure 2

ASD genes in neuronal networks

Protein-protein interaction network created by seeding TADA and DAWN predicted genes. Only intermediate genes that are known to interact with at least two TADA and/or DAWN genes are included. Four natural clusters (C1-C4) are demarcated with black ellipses. All nodes are sized based on degree of connectivity.

A subnetwork obtained by seeding the 160 DAWN genes within a high-confidence protein-protein interactome[14] confirmed that the putative genes are enriched for neuronal functions. We kept the largest connected component, containing 95 seed DAWN genes, 50 of which were in the FDR < 0.3 gene set. The DAWN gene products form four natural clusters based on network connectivity (Fig. 2). We visualized the enriched pathways and biological functions for each of these clusters on canvases[36] (Extended Data Fig. 6). Many of the previously known ASD risk genes fall in cluster C3, including genes involved in synaptic transmission and cell-cell communication. Cluster C4 is enriched for genes related to transcriptional and chromatin regulation. Many TADA and DAWN genes in this cluster interact tightly with other transcription factors, histone modifying enzymes and DNA binding proteins. Five TADA genes in the cluster C2 are bridged to the rest of the network through MAPT, inferred by DAWN. The enrichment results for C2 indicate that genes implicated in neurodegenerative disorders could also play a role in neurodevelopmental disorders.

Extended Data Figure 6

Enrichment terms for the four clusters identified by protein-protein interaction network

P-values using Mouse-Genome-Informatics/Mammalian-Phenotype (MGI-MP, blue), Kyoto Encyclopedia of Genes and Genomes pathways (KEGG, red), and Gene Ontology biological processes (GO, yellow) are indicated.

Emergent results

Amongst critical synaptic components found mutated in our study are voltage-gated ion channels involved in fundamental processes including propagation of action potentials (e.g., Nav1.2 channel), neuronal pacemaking, and excitability-transcription coupling (e.g., Cav1.3 channel) (Fig. 1b). We identified 4 LoF and 5 Mis3 variants in SCN2A (Nav1.2), 3 Mis3 in CACNA1D (Cav1.3), and 2 LoF in CACNA2D3 (α2δ auxiliary subunit of L-type voltage-gated Ca2+ channels, including Cav1.3). Remarkably, three de novo Mis3 variants in SCN2A hit residues mutated in homologous genes in patients with other syndromes, including Brugada syndrome (SCN5A) or epilepsy disorders (SCN1A) (p.R379H and p.R937H). These arginines, as well as the threonine mutated in p.T1420M, cluster to the P-loops forming the ion selectivity filter, in proximity of the inner ring (DEKA motif) (Fig. 1c). Because homologous channels mutated in these arginines do not conduct inward Na+ currents[37,38], p.R379H and p.R937H might have a similar effect. Two de novo CACNA1D variants (p.G407R and p.A749G) hit positions proximal to residues mutated in patients with primary aldosteronism and neurological deficits (Fig. 1d). The reported mutations interfere with channel activation and inactivation[39]. Amongst variants found in cases, p.A59V maps to the NSCaTE domain, also important for Ca2+-dependent inactivation, while p.S1977L and p.R2021H co-cluster in the C-terminal proline-rich domain, the site of interaction with SHANK3, a key PSD scaffolding protein. Mutations in RIMS1 and RIMBP2, which can associate with Cav1.3, were found in our cohort (but with an FDR > 0.3).

Chromatin remodeling involves histone-modifying enzymes (encoded by histone modifier genes, HMGs) and chromatin remodelers (‘readers’) that recognize specific histone post-translational modifications (PTMs) and orchestrate their effects on chromatin. Our gene set is enriched in HMGs (9 HMGs out of 152 annotated in HIstome[40], Fisher's exact test, P = 2.2×10^−7). Enrichment in the GO term ‘histone-lysine N-methyltransferase activity’ (5 genes out of 41 so annotated; FDR = 2.2×10^−2) highlights this as a prominent pathway. Lysines on histones 3 and 4 can be mono-, di-, or tri-methylated, providing a versatile mechanism for either activation or repression of transcription. Of 107 TADA genes, five are SET lysine methyltransferases, four are Jumonji (JmjC) lysine demethylases, and two are readers (Fig. 3a). RBFOX1 co-isolates with H3K4me3[41], and our dataset is enriched in targets shared by RBFOX1 and H3K4me3 (P=0.0166, Fig. 1a, Supplementary Table 4). Some de novo missense variants targeting these genes map to functional domains (Extended Data Fig. 7).
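The kind of over-representation test quoted above (9 HMGs among 107 genes, with 152 HMGs annotated) can be sketched as a one-sided hypergeometric tail, which is equivalent to a one-sided Fisher's exact test on the corresponding 2×2 table. The background of 18,000 genes used below is an assumption for illustration only, since the text does not state the gene universe, and the function name is hypothetical; the printed value will therefore be of the same order as, but not equal to, the published P value.

```python
from math import comb


def enrichment_p(hits: int, list_size: int, annotated: int, background: int) -> float:
    """One-sided hypergeometric P(X >= hits).

    X is the number of annotated genes found when `list_size` genes are
    drawn without replacement from `background` genes, of which
    `annotated` carry the annotation.
    """
    upper = min(list_size, annotated)
    # Exact integer arithmetic; only the final ratio is converted to float.
    tail = sum(comb(annotated, k) * comb(background - annotated, list_size - k)
               for k in range(hits, upper + 1))
    return tail / comb(background, list_size)


# 9 of 107 TADA genes are HMGs; 152 HMGs annotated; background size is assumed.
ASSUMED_BACKGROUND = 18_000
print(f"{enrichment_p(9, 107, 152, ASSUMED_BACKGROUND):.1e}")
```

The result is sensitive to the choice of background, so any attempt to reproduce a published enrichment value needs the gene universe the authors actually used.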

Figure 3

ASD genes in chromatin remodeling

a. TADA genes cluster to chromatin remodeling complexes. Amino terminals of histones H3, H4 and part of H2A, are shown. Lysine methyltransferases add methyl groups, while lysine demethylases remove them. b. De novo Mis3 and LoF variants in CHD8. The box shows the outcome of RT-PCR and Sanger sequencing in lymphoblastoid cells for two newly identified de novo splice-site variants. The first mutation hits an acceptor splice site (red arrow), causing the activation of a cryptic splice site (red box), a four-nucleotide deletion, frame shift and a premature stop. The second mutation hits a donor splice site (red arrow), causing exon skipping, frame shift and a premature stop.

Extended Data Figure 7

De novo variants in SET lysine methyltransferases and JmjC lysine demethylases

Mis3 are in black, LoF in red, and variants identified in other disorders in grey (Fig. 5). JmjC, Jumonji C domain; JmjN, Jumonji N domain; PHD, plant homeodomain; ARID, AT-rich interacting domain; SET, Su(var)3-9, Enhancer-of-zeste, Trithorax domain; FYR N, FY-rich N-terminal domain; FYR C, FY-rich C-terminal domain; PWWP, Pro-Trp-Trp-Pro domain; HMG, high mobility group box; AWS, associated with SET domain; Bromo, bromodomain; BAH, bromo adjacent homology.

For the H3K4me2 reader CHD8, we extended our analyses in search of additional de novo variation in the cases of the case-control sample. By sequencing complete parent-child trios for many CHD8 variants, five variants were found to be de novo, two of which affect essential splice sites and cause loss of function by exon skipping or activation of cryptic splice sites in lymphoblastoid cells (Fig. 3b). Given the role of HMGs in transcription, we reasoned that TADA genes might be interconnected through transcription “routes”. We searched for a connected network (seeded by 9 TADA HMGs) in a transcription factor interaction network (ChEA)[42]. We found that 46 TADA genes are directly interconnected in a 55-gene cluster (Extended Data Fig. 8) (P=0.002; 1,000 random draws), for a total of 69 when including all known HMGs (Fig. 4) (P=0.001; 1,000 random draws).
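An empirical P value from random draws, as quoted for the network analysis, can be sketched as follows. The toy network, gene names and helper functions are all hypothetical, and the add-one correction in `empirical_p` is a common convention chosen here rather than something stated in the text; the study's own procedure may differ in how connectivity is scored and how draws are made.

```python
import random
from typing import Iterable, List, Sequence, Tuple


def connected_count(gene_set: Iterable[str], seeds: Iterable[str],
                    edges: Sequence[Tuple[str, str]]) -> int:
    """Number of genes in gene_set with a direct edge to or from any seed."""
    seed_set = set(seeds)
    linked = set()
    for source, target in edges:
        if source in seed_set:
            linked.add(target)
        if target in seed_set:
            linked.add(source)
    return len(set(gene_set) & linked)


def empirical_p(observed: float, null_values: Sequence[float]) -> float:
    """Fraction of random draws at least as extreme as observed (add-one corrected)."""
    at_least = sum(1 for value in null_values if value >= observed)
    return (1 + at_least) / (1 + len(null_values))


def random_draw_test(gene_set: List[str], seeds: List[str],
                     edges: Sequence[Tuple[str, str]], universe: List[str],
                     n_draws: int = 1000, seed: int = 0) -> Tuple[int, float]:
    """Compare observed connectivity with same-sized random gene sets."""
    rng = random.Random(seed)
    observed = connected_count(gene_set, seeds, edges)
    null = [connected_count(rng.sample(universe, len(gene_set)), seeds, edges)
            for _ in range(n_draws)]
    return observed, empirical_p(observed, null)


# Toy directed network: (regulator, target) pairs with made-up gene names.
toy_edges = [("S1", "A"), ("S1", "B"), ("S2", "C"), ("X", "Y")]
toy_universe = sorted({"A", "B", "C", "D", "E", "F", "X", "Y"})
print(random_draw_test(["A", "B", "C"], ["S1", "S2"], toy_edges, toy_universe))
```

With 1,000 draws the smallest attainable value under this convention is 1/1001, so a reported P of about 0.001 means that essentially no random gene set matched the observed connectivity.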

Extended Data Figure 8

Transcription regulation network of TADA genes only

Edges indicate transcription regulator (source node) and its gene targets (target node) based on ChEA network.

Figure 4

Transcription regulation network of TADA genes

Edges indicate transcription regulator (source node) and its gene targets (target node) based on ChEA network; interactions among only HMGs are ignored.

Examining the Human Gene Mutation Database we found that the 107 TADA genes included 21 candidate genes for intellectual disability, 3 for epilepsy, 17 for schizophrenia, 9 for congenital heart disease and 6 for metabolic disorders (Fig. 5).

Figure 5

Involvement in disease of ASD genes

Venn diagram to visualize the overlap in disease involvement for the TADA genes.

Conclusions

Complementing earlier reports, ASD subjects show a clear excess of de novo LoF mutations over expectation, with a pile-up of such events in a handful of genes. While this handful has a large effect on risk, most ASD genes have much smaller impact. This gradient emerges most strikingly from the contrast of risk variation in male and female ASD subjects. Unlike some earlier studies, but consistent with expectation, the data also show clear evidence for an effect of de novo missense SNVs on risk; for risk generated by LoF variants transmitted from unaffected parents; and for the value of case-control design in gene discovery. Indeed, by integrating data on de novo, inherited and case-control variation, the yield of ASD gene discoveries was doubled over what would be obtained from a count of de novo LoF alone. Almost uniformly, ASD genes show large constraint against variation, a feature we exploit to implicate other genes in risk. Three critical pathways for typical development are damaged by risk variation: (1) chromatin remodeling, (2) transcription and splicing, and (3) synaptic function. Chromatin remodeling controls events underlying the formation of neural connections, including neurogenesis and neural differentiation[43], and relies on epigenetic marks such as histone PTMs. Here we provide extensive evidence for HMGs and readers in sporadic ASD, implicating specifically lysine methylation and extending the mutational landscape of the emergent ASD gene CHD8 to missense variants. Splicing is implicated by the enrichment of RBFOX targets in the top ASD candidates. Risk variation also hits multiple classes and components of synaptic networks, from receptors and ion channels to scaffolding proteins. Because a wide set of synaptic genes is disrupted in idiopathic ASD, it seems reasonable to conjecture that altered chromatin dynamics and transcription, induced by disruption of relevant genes, lead to impaired synaptic function as well.
De novo mutations in ASD[11-15], intellectual disability[44] and schizophrenia[29] cluster to synaptic genes, and synaptic defects have been reported in models of these disorders[45]. Integrity of synaptic function is essential for neural physiology, and its perturbation could represent the intersection between diverse neuropsychiatric disorders[46].

46 in total

1. ChEA: transcription factor regulation inferred from integrating genome-wide ChIP-X experiments.

Authors: Alexander Lachmann; Huilei Xu; Jayanth Krishnan; Seth I Berger; Amin R Mazloom; Avi Ma'ayan
Journal: Bioinformatics Date: 2010-08-13 Impact factor: 6.937

2. Genetic and biophysical basis of sudden unexplained nocturnal death syndrome (SUNDS), a disease allelic to Brugada syndrome.

Authors: Matteo Vatta; Robert Dumaine; George Varghese; Todd A Richard; Wataru Shimizu; Naohiko Aihara; Koonlawee Nademanee; Ramon Brugada; Josep Brugada; Gumpanart Veerakul; Hua Li; Neil E Bowles; Pedro Brugada; Charles Antzelevitch; Jeffrey A Towbin
Journal: Hum Mol Genet Date: 2002-02-01 Impact factor: 6.150

3. A higher mutational burden in females supports a "female protective model" in neurodevelopmental disorders.

Authors: Sébastien Jacquemont; Bradley P Coe; Micha Hersch; Michael H Duyzend; Niklas Krumm; Sven Bergmann; Jacques S Beckmann; Jill A Rosenfeld; Evan E Eichler
Journal: Am J Hum Genet Date: 2014-02-27 Impact factor: 11.025

Review 4. Dendritic spine pathology in neuropsychiatric disorders.

Authors: Peter Penzes; Michael E Cahill; Kelly A Jones; Jon-Eric VanLeeuwen; Kevin M Woolfrey
Journal: Nat Neurosci Date: 2011-03 Impact factor: 24.884

5. Functional impact of global rare copy number variation in autism spectrum disorders.

Authors: Dalila Pinto; Alistair T Pagnamenta; Lambertus Klei; Richard Anney; Daniele Merico; Regina Regan; Judith Conroy; Tiago R Magalhaes; Catarina Correia; Brett S Abrahams; Joana Almeida; Elena Bacchelli; Gary D Bader; Anthony J Bailey; Gillian Baird; Agatino Battaglia; Tom Berney; Nadia Bolshakova; Sven Bölte; Patrick F Bolton; Thomas Bourgeron; Sean Brennan; Jessica Brian; Susan E Bryson; Andrew R Carson; Guillermo Casallo; Jillian Casey; Brian H Y Chung; Lynne Cochrane; Christina Corsello; Emily L Crawford; Andrew Crossett; Cheryl Cytrynbaum; Geraldine Dawson; Maretha de Jonge; Richard Delorme; Irene Drmic; Eftichia Duketis; Frederico Duque; Annette Estes; Penny Farrar; Bridget A Fernandez; Susan E Folstein; Eric Fombonne; Christine M Freitag; John Gilbert; Christopher Gillberg; Joseph T Glessner; Jeremy Goldberg; Andrew Green; Jonathan Green; Stephen J Guter; Hakon Hakonarson; Elizabeth A Heron; Matthew Hill; Richard Holt; Jennifer L Howe; Gillian Hughes; Vanessa Hus; Roberta Igliozzi; Cecilia Kim; Sabine M Klauck; Alexander Kolevzon; Olena Korvatska; Vlad Kustanovich; Clara M Lajonchere; Janine A Lamb; Magdalena Laskawiec; Marion Leboyer; Ann Le Couteur; Bennett L Leventhal; Anath C Lionel; Xiao-Qing Liu; Catherine Lord; Linda Lotspeich; Sabata C Lund; Elena Maestrini; William Mahoney; Carine Mantoulan; Christian R Marshall; Helen McConachie; Christopher J McDougle; Jane McGrath; William M McMahon; Alison Merikangas; Ohsuke Migita; Nancy J Minshew; Ghazala K Mirza; Jeff Munson; Stanley F Nelson; Carolyn Noakes; Abdul Noor; Gudrun Nygren; Guiomar Oliveira; Katerina Papanikolaou; Jeremy R Parr; Barbara Parrini; Tara Paton; Andrew Pickles; Marion Pilorge; Joseph Piven; Chris P Ponting; David J Posey; Annemarie Poustka; Fritz Poustka; Aparna Prasad; Jiannis Ragoussis; Katy Renshaw; Jessica Rickaby; Wendy Roberts; Kathryn Roeder; Bernadette Roge; Michael L Rutter; Laura J Bierut; John P Rice; Jeff Salt; Katherine Sansom; Daisuke Sato; Ricardo Segurado; Ana F Sequeira; Lili Senman; Naisha Shah; Val C Sheffield; Latha Soorya; Inês Sousa; Olaf Stein; Nuala Sykes; Vera Stoppioni; Christina Strawbridge; Raffaella Tancredi; Katherine Tansey; Bhooma Thiruvahindrapduram; Ann P Thompson; Susanne Thomson; Ana Tryfon; John Tsiantis; Herman Van Engeland; John B Vincent; Fred Volkmar; Simon Wallace; Kai Wang; Zhouzhi Wang; Thomas H Wassink; Caleb Webber; Rosanna Weksberg; Kirsty Wing; Kerstin Wittemeyer; Shawn Wood; Jing Wu; Brian L Yaspan; Danielle Zurawiecki; Lonnie Zwaigenbaum; Joseph D Buxbaum; Rita M Cantor; Edwin H Cook; Hilary Coon; Michael L Cuccaro; Bernie Devlin; Sean Ennis; Louise Gallagher; Daniel H Geschwind; Michael Gill; Jonathan L Haines; Joachim Hallmayer; Judith Miller; Anthony P Monaco; John I Nurnberger; Andrew D Paterson; Margaret A Pericak-Vance; Gerard D Schellenberg; Peter Szatmari; Astrid M Vicente; Veronica J Vieland; Ellen M Wijsman; Stephen W Scherer; James S Sutcliffe; Catalina Betancur
Journal: Nature Date: 2010-06-09 Impact factor: 49.962

6. FMRP targets distinct mRNA sequence elements to regulate protein expression.

Authors: Manuel Ascano; Neelanjan Mukherjee; Pradeep Bandaru; Jason B Miller; Jeffrey D Nusbaum; David L Corcoran; Christine Langlois; Mathias Munschauer; Scott Dewell; Markus Hafner; Zev Williams; Uwe Ohler; Thomas Tuschl
Journal: Nature Date: 2012-12-12 Impact factor: 49.962

7. Patterns and rates of exonic de novo mutations in autism spectrum disorders.

Authors: Benjamin M Neale; Yan Kou; Li Liu; Avi Ma'ayan; Kaitlin E Samocha; Aniko Sabo; Chiao-Feng Lin; Christine Stevens; Li-San Wang; Vladimir Makarov; Paz Polak; Seungtai Yoon; Jared Maguire; Emily L Crawford; Nicholas G Campbell; Evan T Geller; Otto Valladares; Chad Schafer; Han Liu; Tuo Zhao; Guiqing Cai; Jayon Lihm; Ruth Dannenfelser; Omar Jabado; Zuleyma Peralta; Uma Nagaswamy; Donna Muzny; Jeffrey G Reid; Irene Newsham; Yuanqing Wu; Lora Lewis; Yi Han; Benjamin F Voight; Elaine Lim; Elizabeth Rossin; Andrew Kirby; Jason Flannick; Menachem Fromer; Khalid Shakir; Tim Fennell; Kiran Garimella; Eric Banks; Ryan Poplin; Stacey Gabriel; Mark DePristo; Jack R Wimbish; Braden E Boone; Shawn E Levy; Catalina Betancur; Shamil Sunyaev; Eric Boerwinkle; Joseph D Buxbaum; Edwin H Cook; Bernie Devlin; Richard A Gibbs; Kathryn Roeder; Gerard D Schellenberg; James S Sutcliffe; Mark J Daly
Journal: Nature Date: 2012-04-04 Impact factor: 49.962

8. Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations.

Authors: Brian J O'Roak; Laura Vives; Santhosh Girirajan; Emre Karakoc; Niklas Krumm; Bradley P Coe; Roie Levy; Arthur Ko; Choli Lee; Joshua D Smith; Emily H Turner; Ian B Stanaway; Benjamin Vernot; Maika Malig; Carl Baker; Beau Reilly; Joshua M Akey; Elhanan Borenstein; Mark J Rieder; Deborah A Nickerson; Raphael Bernier; Jay Shendure; Evan E Eichler
Journal: Nature Date: 2012-04-04 Impact factor: 49.962

9. Most genetic risk for autism resides with common variation.

Authors: Trent Gaugler; Lambertus Klei; Stephan J Sanders; Corneliu A Bodea; Arthur P Goldberg; Ann B Lee; Milind Mahajan; Dina Manaa; Yudi Pawitan; Jennifer Reichert; Stephan Ripke; Sven Sandin; Pamela Sklar; Oscar Svantesson; Abraham Reichenberg; Christina M Hultman; Bernie Devlin; Kathryn Roeder; Joseph D Buxbaum
Journal: Nat Genet Date: 2014-07-20 Impact factor: 38.330

10. Integrated model of de novo and inherited genetic variants yields greater power to identify risk genes.

Authors: Xin He; Stephan J Sanders; Li Liu; Silvia De Rubeis; Elaine T Lim; James S Sutcliffe; Gerard D Schellenberg; Richard A Gibbs; Mark J Daly; Joseph D Buxbaum; Matthew W State; Bernie Devlin; Kathryn Roeder
Journal: PLoS Genet Date: 2013-08-15 Impact factor: 5.917

1060 in total

1. De Novo Mutations in YWHAG Cause Early-Onset Epilepsy.

Authors: Ilaria Guella; Marna B McKenzie; Daniel M Evans; Sarah E Buerki; Eric B Toyota; Margot I Van Allen; Mohnish Suri; Frances Elmslie; Marleen E H Simon; Koen L I van Gassen; Delphine Héron; Boris Keren; Caroline Nava; Mary B Connolly; Michelle Demos; Matthew J Farrer
Journal: Am J Hum Genet Date: 2017-08-03 Impact factor: 11.025

2. Haploinsufficiency of ZNF462 is associated with craniofacial anomalies, corpus callosum dysgenesis, ptosis, and developmental delay.

Authors: Karin Weiss; Kristen Wigby; Madeleine Fannemel; Lindsay B Henderson; Natalie Beck; Neeti Ghali; D D D Study; Britt-Marie Anderlid; Johanna Lundin; Ada Hamosh; Marilyn C Jones; Sondhya Ghedia; Maximilian Muenke; Paul Kruszka
Journal: Eur J Hum Genet Date: 2017-05-17 Impact factor: 4.246

Review 3. Drug development in the era of precision medicine.

Authors: Sarah A Dugger; Adam Platt; David B Goldstein
Journal: Nat Rev Drug Discov Date: 2017-12-08 Impact factor: 84.694

4. Missense variants in the chromatin remodeler CHD1 are associated with neurodevelopmental disability.

Authors: Genay O Pilarowski; Hilary J Vernon; Carolyn D Applegate; Leandros Boukas; Megan T Cho; Christina A Gurnett; Paul J Benke; Erin Beaver; Jennifer M Heeley; Livija Medne; Ian D Krantz; Meron Azage; Dmitriy Niyazov; Lindsay B Henderson; Ingrid M Wentzensen; Berivan Baskin; Maria J Guillen Sacoto; Gregory D Bowman; Hans T Bjornsson
Journal: J Med Genet Date: 2017-09-02 Impact factor: 6.318

5. Exonic Mosaic Mutations Contribute Risk for Autism Spectrum Disorder.

Authors: Deidre R Krupp; Rebecca A Barnard; Yannis Duffourd; Sara A Evans; Ryan M Mulqueen; Raphael Bernier; Jean-Baptiste Rivière; Eric Fombonne; Brian J O'Roak
Journal: Am J Hum Genet Date: 2017-08-31 Impact factor: 11.025

6. A Statistical Framework for Mapping Risk Genes from De Novo Mutations in Whole-Genome-Sequencing Studies.

Authors: Yuwen Liu; Yanyu Liang; A Ercument Cicek; Zhongshan Li; Jinchen Li; Rebecca A Muhle; Martina Krenzer; Yue Mei; Yan Wang; Nicholas Knoblauch; Jean Morrison; Siming Zhao; Yi Jiang; Evan Geller; Iuliana Ionita-Laza; Jinyu Wu; Kun Xia; James P Noonan; Zhong Sheng Sun; Xin He
Journal: Am J Hum Genet Date: 2018-05-10 Impact factor: 11.025

7. Replication of a rare risk haplotype on 1p36.33 for autism spectrum disorder.

Authors: N H Chapman; R A Bernier; S J Webb; J Munson; E M Blue; D-H Chen; E Heigham; W H Raskind; Ellen M Wijsman
Journal: Hum Genet Date: 2018-10-01 Impact factor: 4.132

8. De Novo Truncating Variants in ASXL2 Are Associated with a Unique and Recognizable Clinical Phenotype.

Authors: Vandana Shashi; Loren D M Pena; Katherine Kim; Barbara Burton; Maja Hempel; Kelly Schoch; Magdalena Walkiewicz; Heather M McLaughlin; Megan Cho; Nicholas Stong; Scott E Hickey; Christine M Shuss; Michael S Freemark; Jane S Bellet; Martha Ann Keels; Melanie J Bonner; Maysantoine El-Dairi; Megan Butler; Peter G Kranz; Constance T R M Stumpel; Sylvia Klinkenberg; Karin Oberndorff; Malik Alawi; Rene Santer; Slavé Petrovski; Outi Kuismin; Satu Korpi-Heikkilä; Olli Pietilainen; Palotie Aarno; Mitja I Kurki; Alexander Hoischen; Anna C Need; David B Goldstein; Fanny Kortüm
Journal: Am J Hum Genet Date: 2016-09-29 Impact factor: 11.025

9. Heterozygous Mutations in SMARCA2 Reprogram the Enhancer Landscape by Global Retargeting of SMARCA4.

Authors: Fangjian Gao; Nicholas J Elliott; Josephine Ho; Alexzander Sharp; Maxim N Shokhirev; Diana C Hargreaves
Journal: Mol Cell Date: 2019-07-30 Impact factor: 17.970

10. Gene expression elucidates functional impact of polygenic risk for schizophrenia.

Authors: Menachem Fromer; Panos Roussos; Solveig K Sieberts; Jessica S Johnson; David H Kavanagh; Thanneer M Perumal; Douglas M Ruderfer; Edwin C Oh; Aaron Topol; Hardik R Shah; Lambertus L Klei; Robin Kramer; Dalila Pinto; Zeynep H Gümüş; A Ercument Cicek; Kristen K Dang; Andrew Browne; Cong Lu; Lu Xie; Ben Readhead; Eli A Stahl; Jianqiu Xiao; Mahsa Parvizi; Tymor Hamamsy; John F Fullard; Ying-Chih Wang; Milind C Mahajan; Jonathan M J Derry; Joel T Dudley; Scott E Hemby; Benjamin A Logsdon; Konrad Talbot; Towfique Raj; David A Bennett; Philip L De Jager; Jun Zhu; Bin Zhang; Patrick F Sullivan; Andrew Chess; Shaun M Purcell; Leslie A Shinobu; Lara M Mangravite; Hiroyoshi Toyoshiba; Raquel E Gur; Chang-Gyu Hahn; David A Lewis; Vahram Haroutunian; Mette A Peters; Barbara K Lipska; Joseph D Buxbaum; Eric E Schadt; Keisuke Hirai; Kathryn Roeder; Kristen J Brennand; Nicholas Katsanis; Enrico Domenici; Bernie Devlin; Pamela Sklar
Journal: Nat Neurosci Date: 2016-09-26 Impact factor: 24.884