Kuokuo Li, Zhenghuan Fang, Guihu Zhao, Bin Li, Chao Chen, Lu Xia, Lin Wang, Tengfei Luo, Xiaomeng Wang, Zheng Wang, Yi Zhang, Yi Jiang, Qian Pan, Zhengmao Hu, Hui Guo, Beisha Tang, Chunyu Liu, Zhongsheng Sun, Kun Xia, Jinchen Li.
Abstract
The clinical similarity among different neuropsychiatric disorders (NPDs) suggests a shared genetic basis. We catalogued 23,109 coding de novo mutations (DNMs) from 6511 patients with autism spectrum disorder (ASD), 4,293 with undiagnosed developmental disorder (UDD), 933 with epileptic encephalopathy (EE), 1022 with intellectual disability (ID), 1094 with schizophrenia (SCZ), and 3391 controls. We estimated that putative functional DNMs contribute to 38.11%, 34.40%, 33.31%, 10.98% and 6.91% of patients with ID, EE, UDD, ASD and SCZ, respectively. Consistent with the phenotypic similarity and heterogeneity among NPDs, the disorders show different degrees of genetic association. Cross-disorder analysis of DNMs prioritized 321 candidate genes (FDR < 0.05) and showed that genes shared among more disorders were more likely to exhibit specific expression patterns, functional pathways, genetic convergence, and genetic intolerance.
Keywords: Candidate gene; De novo mutation; Expression pattern; Functional network; Neuropsychiatric disorder
Year: 2021 PMID: 33970367 PMCID: PMC8854168 DOI: 10.1007/s10803-021-05031-7
Source DB: PubMed Journal: J Autism Dev Disord ISSN: 0162-3257
Burdens and contributions of different classes of DNMs in five disorders
| Disorders (N) | Category | LoF | Dmis | Pfun | Tmis | Synonymous |
|---|---|---|---|---|---|---|
| ASD (6511) | DNMs | 1228 | 1633 | 2861 | 3406 | 1864 |
| | p | 6.80E−09 | 6.00E−04 | 9.81E−08 | 0.20 | |
| | padj | | | | 0.20 | |
| | OR | 1.49 | 1.23 | 1.33 | 1.07 | |
| | 95% CI | 1.30–1.72 | 1.09–1.39 | 1.20–1.48 | 0.97–1.18 | |
| | Clinical implicated DNMs (%) | 33.06 | 18.92 | 24.99 | | |
| | Contribute to patients (%) | 6.24 | 4.75 | 10.98 | | |
| UDD (4293) | DNMs | 1382 | 1898 | 3280 | 2676 | 1607 |
| | p | 2.11E−22 | 6.66E−17 | 6.15E−26 | 0.60 | |
| | padj | | | | 0.60 | |
| | OR | 1.95 | 1.66 | 1.77 | 0.97 | |
| | 95% CI | 1.70–2.24 | 1.47–1.88 | 1.59–1.97 | 0.88–1.08 | |
| | Clinical implicated DNMs (%) | 48.72 | 39.86 | 43.59 | | |
| | Contribute to patients (%) | 15.68 | 17.62 | 33.31 | | |
| EE (933) | DNMs | 192 | 350 | 542 | 414 | 192 |
| | p | 5.03E−12 | 1.66E−20 | 1.90E−22 | 0.018 | |
| | padj | | | | | |
| | OR | 2.27 | 2.57 | 2.45 | 1.26 | |
| | 95% CI | 1.79–2.88 | 2.09–3.16 | 2.03–2.97 | 1.04–1.53 | |
| | Clinical implicated DNMs (%) | 55.90 | 61.03 | 59.22 | | |
| | Contribute to patients (%) | 11.50 | 22.90 | 34.40 | | |
| ID (1022) | DNMs | 309 | 366 | 675 | 447 | 248 |
| | p | 7.94E−24 | 2.91E−14 | 2.23E−24 | 0.59 | |
| | padj | | | | 0.59 | |
| | OR | 2.82 | 2.08 | 2.36 | 1.05 | |
| | 95% CI | 2.2–3.48 | 1.71–2.52 | 1.99–2.81 | 0.88–1.26 | |
| | Clinical implicated DNMs (%) | 64.61 | 51.87 | 57.70 | | |
| | Contribute to patients (%) | 19.53 | 18.58 | 38.11 | | |
| SCZ (1094) | DNMs | 136 | 217 | 353 | 450 | 241 |
| | p | 0.045 | 0.028 | 0.011 | 0.35 | |
| | padj | 0.060 | 0.056 | | 0.35 | |
| | OR | 1.28 | 1.27 | 1.27 | 1.09 | |
| | 95% CI | 1.00–1.64 | 1.02–1.57 | 1.05–1.54 | 0.91–1.31 | |
| | Clinical implicated DNMs (%) | 21.85 | 21.11 | 22.40 | | |
| | Contribute to patients (%) | 2.72 | 4.19 | 6.91 | | |
| NPDs (13,853) | DNMs | 3247 | 4464 | 7711 | 7393 | 4152 |
| | p | 3.53E−20 | 4.68E−14 | 1.85E−22 | 0.39 | |
| | padj | | | | 0.39 | |
| | OR | 1.77 | 1.51 | 1.61 | 1.04 | |
| | 95% CI | 1.56–2.01 | 1.36–1.69 | 1.46–1.78 | 0.95–1.14 | |
| | Clinical implicated DNMs (%) | 43.61 | 33.93 | 38.01 | | |
| | Contribute to patients (%) | 10.22 | 10.94 | 21.16 | | |
| Control (3391) | DNMs | 411 | 662 | 1073 | 1595 | 932 |
“Clinical implicated DNMs” is the estimated proportion of DNMs in each disorder that are involved in the etiology of that disorder. “Contribute to patients” is the proportion of patients whose disorder can be explained by DNMs, based on the clinical implicated DNMs. We performed a Fisher exact test, normalizing by the number of de novo synonymous mutations in each condition to adjust for batch effects across studies. The Benjamini and Hochberg false discovery rate (FDR) procedure was used to adjust for multiple testing. padj values below 0.05 were highlighted in bold.
ASD autism spectrum disorder, UDD undiagnosed developmental disorder, EE epileptic encephalopathy, ID intellectual disability, SCZ schizophrenia, NPDs neuropsychiatric disorders, integration of these five disorders, DNMs de novo mutations, OR odds ratio, CI confidence interval, Dmis deleterious missense variants as predicted by ReVe, Tmis tolerant missense variants, LoF loss-of-function variants including frameshift, stoploss and stopgain, splicing variants, Pfun putative functional variant including Dmis and LoF variants
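The rows of the table above appear to be linked by simple arithmetic. The following is a minimal sketch, not the authors' code, under three assumptions inferred from the numbers rather than stated in the text: the odds ratio compares each variant class with synonymous DNMs in cases versus controls, the clinically implicated fraction equals 1 − 1/OR, and the padj values come from a Benjamini and Hochberg adjustment across the four variant classes within a disorder. All function names are hypothetical.

```python
def odds_ratio(case_k, case_syn, ctrl_k, ctrl_syn):
    """Odds ratio of a variant class versus synonymous DNMs, cases vs controls."""
    return (case_k / case_syn) / (ctrl_k / ctrl_syn)


def implicated_fraction(or_value):
    """Assumed excess fraction of DNMs above the control-based expectation."""
    return 1.0 - 1.0 / or_value


def patients_explained(case_k, fraction, n_patients):
    """Assumed share of patients carrying a clinically implicated DNM."""
    return case_k * fraction / n_patients


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# ASD LoF row: 1228 LoF and 1864 synonymous DNMs in 6511 patients,
# against 411 LoF and 932 synonymous DNMs in controls.
asd_or = odds_ratio(1228, 1864, 411, 932)
asd_implicated_pct = 100 * implicated_fraction(asd_or)
asd_contrib_pct = 100 * patients_explained(1228, implicated_fraction(asd_or), 6511)

# SCZ p-values in column order LoF, Dmis, Pfun, Tmis.
scz_padj = benjamini_hochberg([0.045, 0.028, 0.011, 0.35])

print(round(asd_or, 2), round(asd_implicated_pct, 2), round(asd_contrib_pct, 2))
print([round(p, 3) for p in scz_padj])
```

Under these assumptions the sketch reproduces the ASD LoF entries (OR 1.49, 33.06%, 6.24%) and the SCZ padj values listed for LoF, Dmis and Tmis. The confidence intervals and the Fisher exact p-values are not recomputed here.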
Fig. 1 Overlap of genes across five NPDs based on de novo mutations. Gene overlap among disorders was assessed for three classes of variants: LoF, Dmis and Pfun. O/E ratio of observed to expected numbers of shared genes, Dmis deleterious missense variants, Tmis tolerant missense variants, LoF loss-of-function variants, including frameshift, stoploss, stopgain and splicing variants, Pfun putative functional variants, including Dmis and LoF variants. The Benjamini and Hochberg false discovery rate (FDR) procedure was used to adjust for multiple testing
Prioritized candidate genes with FDR < 0.05 in this study
| Rank | Unique genes (26.48%, | Shared genes in two disorders (32.71%, | Shared genes in three disorders (27.10%, | Shared genes in at least four disorders (13.71%, |
|---|---|---|---|---|
| FDR ≤ 0.0001 (39.25%, | HDAC8u, KANSL1u, ZBTB18u, CNKSR2u, BTF3u, MSL3u, PDHA1u | SATB2, GATAD2B, KAT6B, MEF2C, SMC1A, CDKL5, SMARCA2, KDM5B, NSD1, EHMT1, HNRNPU, PTPN11, CTCF, CNOT3, TBR1, NFIX, PPP1CB, KIF1A, GNAI1, CHAMP1, KCNH1, NAA15*, UPF3B, PIK3CA**, MAP4K4, UNC80*, KCNT1, KDM6A, ZC4H2, SMAD4, WDR26**, SOX11 | SLC6A1, FOXP1, CTNNB1, TCF4, SETD5, PPP2R5D, ASXL3, MED13L, SCN1A, DYRK1A, ADNP, EP300, PURA, WDR45, CDK13, TBL1XR1, IRF2BPL, PTEN, KAT6A, DNM1, PACS1, SHANK3, TRIP12*, PPM1D, NAA10, TCF20, CLTC, SET*, BRAF, CACNA1A, BCL11A, CHD3, EFTUD2, SMARCA4*, SOX5, HECW2, KMT5B**, FBXO11**, USP9X, DLG4, PBX1***, MYT1L**, TCF7L2**, NR2F1**, ATP1A3***, SLC35A2**, NALCN, NSD2**, ANK2 | CHD8, CHD2, POGZ, STXBP1, KCNQ2, KMT2A, ANKRD11, DDX3X, ARID1B, SCN8A, GRIN2B, CSNK2A1, WAC, FOXG1, CASK, IQSEC2, AHDC1, EEF1A2, TLK2*, DNMT3A*, GABRB2, CREBBP, COL4A3BP, PUF60, CACNA1E*, GABRB3**, KIAA2022***, CUL3**, SMC3*, PHIP, CHD4, ITPR1, KCNQ3*, AGO1***, MECP2, SYNGAP1, SCN2A, GNAO1 |
| 0.0001 < FDR ≤ 0.001 (10.59%, | GFOD2u, MYO1Ea, TAOK1u, PRKAR1Au, AKT3u, SLC12A2u, HNRNPKu, AGO2i | QRICH1*, EBF3, PPP2R1A, CAMK2A*, RAB11A, RAC1, SNAP25*, USP7, TNPO2, SYT1, SIN3A | FOXP2*, HIST1H1E*, GABBR2, MBD5**, SETBP1, ZMYND11*, SLC22A23*, KCNB1, CSNK1E*, SYNCRIP*, RFX3**, LOC400927-CSNK1E*, DHDDS* | GRIN2A, NACC1 |
| 0.001 < FDR ≤ 0.01 (16.82%, | TA | CYP27C1, ARHGEF9, HK1, POU3F3, FGF12, CBL, PPP2CA, CSNK2B, SF1, TNPO3, ASXL1, PRKD1*, ERI1, HIST1H4E, CAMK2B, AUTS2, PHF7, NTRK2, FANCE, RNF146, LAMB1*, FAM200A, SSBP3, PRPF8, FIGN, | YWHAG*, GABRA1, SRCAP*, TUBA1A, BRPF1, GABRG2, CPSF7, SPAST, NRXN1*, DSCAM | DYNC1H1 |
| 0.01 < FDR ≤ 0.05 (33.33%, | EBF2e, PDX1u, PLEKHB2u, HIST1H2ACu, TAF13s, FAM104Au, SMPD4e, MPPED2u, GOLPH3u, RAD51u, FOSL2u, GALNT18a, PI4K2Bu, PRSS48u, PRKAR1Ba, SIAH1u, LARP7u, EIF4A2u, COL23A1u, ATP8A1i, RAB11Bu, GIGYF1a, MSI1u, SMPD2u, PISDu, ZBTB10u, DCAF7u, NONOu, RPUSD1u, GRIA2a, TMEM26u, FXYD5a, IL1RAPL2u, TGFB2u, NFE2L2u, ACHEa, FAM84Au, PAPOLGa, PNKDu, ERBB4u, CLDN5u, ATP1B1a, SGCEu, NUDT17a, SNRPB2i, ACTC1u, GABRPu, GNB1u, ILF2a, LRRC3Ci, NR6A1u, STK33a | ANO3, G3BP1, CLCN4, DCX, MAP2K1, VEZF1, ASH1L, SMARCD1, KCNQ5, ENO1, STXBP3, LZTR1, AGO3, RPL26, GRIK1, CELF2, EYA1, PDK2, TFAP2C, COL4A1, LMTK3, PRR14L, TCTE3, NUDT4, GLRA2, ARIH1, GNAS, MEIS2, BIRC5, DOCK1, TIFA, DPF2, H2AFV, ABI2, MECOM, CFAP45, TFAP4 | ARHGAP15, SMAD6, DEAF1, NR4A2, TANC2, GNB2, SMARCC2, TAF1, KIF5C, RPL4, PSD3, MARK2, PHF21A, UGT1A3, SPRED2 | SON, GRIN1, RBM12 |
Candidate genes are split into four groups based on the number of disorders in which the gene carries putative functional DNMs. Unique genes are marked with a superscript letter (a, u, e, i or s), indicating that the gene carries putative functional DNMs only in ASD, UDD, EE, ID or SCZ, respectively. We ranked all candidate genes into four tiers based on FDR. The number of asterisks on a gene indicates the increase in its rank in the cross-disorder analysis compared with the individual-disorder analysis
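The four FDR tiers used in the table can be expressed as a small lookup. This is an illustrative sketch only: the tier boundaries are taken from the row labels of the table, while the function name and the example FDR values are hypothetical and do not correspond to any gene in the study.

```python
from bisect import bisect_left

# Upper bounds of the four tiers, as given in the table's row labels.
TIER_UPPER_BOUNDS = [0.0001, 0.001, 0.01, 0.05]
TIER_LABELS = [
    "FDR ≤ 0.0001",
    "0.0001 < FDR ≤ 0.001",
    "0.001 < FDR ≤ 0.01",
    "0.01 < FDR ≤ 0.05",
]


def fdr_tier(fdr):
    """Return the tier label for an FDR value, or None if it exceeds 0.05."""
    index = bisect_left(TIER_UPPER_BOUNDS, fdr)
    if index == len(TIER_UPPER_BOUNDS):
        return None  # not a prioritized candidate gene
    return TIER_LABELS[index]


# Hypothetical FDR values, for illustration only.
example_tiers = [fdr_tier(x) for x in (0.00005, 0.0005, 0.001, 0.03, 0.2)]
print(example_tiers)
```

Using `bisect_left` keeps each upper bound inside its own tier, which matches the "≤" in the row labels.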
Fig. 2 Expression characteristics of candidate genes in the human brain. a Spatiotemporal expression pattern of candidate genes based on RNA-seq data from BrainSpan. 250 of 321 candidate genes could be classified into two co-expression modules: M1 (n = 171) and M2 (n = 79). MFC medial prefrontal cortex, OFC orbital frontal cortex, DFC dorsolateral prefrontal cortex, VFC ventrolateral prefrontal cortex, M1C primary motor cortex, S1C primary somatosensory cortex, IPC inferior parietal cortex, A1C primary auditory cortex, STC superior temporal cortex, ITC inferior temporal cortex, V1C primary visual cortex, HIP hippocampus, AMY amygdala, STR striatum, MD mediodorsal nucleus of thalamus, CBC cerebellar cortex. b Distribution of candidate genes under different conditions in the spatiotemporal co-expression modules. Candidate genes per disorder refers to genes with FDR < 0.05 based on putative functional DNMs in each individual disorder (ASD, UDD, EE, ID). Number of shared disorders refers to candidate genes in modules carrying Pfun in one, two, three or more than four disorders (1, 2, 3, > 4). We used the number of all candidate genes with expression in the BrainSpan RNA-seq data to set the background (All). c Neocortical expression pattern of candidate genes based on microarray data from micro-dissected human prenatal neocortex. A total of 210 candidate genes could be classified into three co-expression modules: Ma (n = 121), Mb (n = 50) and Mc (n = 39). SG subpial granular zone, MZ marginal zone, CPo outer cortical plate, CPi inner cortical plate, SP subplate zone, IZ intermediate zone, SZo outer subventricular zone, SZi inner subventricular zone, VZ ventricular zone. d Distribution of candidate genes under different conditions in the modules. Candidate genes per disorder refers to genes with FDR < 0.05 based on putative functional DNMs in individual disorders (ASD, UDD, EE, ID). Number of shared disorders refers to candidate genes carrying Pfun in one, two, three or more than four disorders (1, 2, 3, > 4). We used the number of all candidate genes with expression in the microarray data to set the background (All). M0 candidate genes not included in the co-expression modules. *p < 0.05; **p < 0.01; ***p < 0.001
Fig. 3 Functional network of candidate genes. a A network representation showing connectivity between candidate genes based on co-expression and protein–protein interactions (PPI). Dotted lines and full lines between nodes represent co-expression and PPI, respectively. The node size and the color of the node boundary represent the number of putative functional variants and the number of shared disorders for a specific gene, respectively. Colors within nodes indicate the distribution of putative functional variants in each disorder. b Box plots of the relationship between the number of shared disorders and both the number of candidate genes connected with others and the gene intolerance score. Top, co-expression and protein–protein interactions; bottom, residual variation intolerance score (RVIS) and substitution intolerance scores from Aggarwala et al., Nature Genetics 2016. c Top 20 clusters of functional enrichment for candidate genes (gene ontology terms with a similarity > 0.3 were merged into one cluster). d Distribution of genes under different conditions in the clusters related to chromatin and synaptic function. Candidate genes per disorder refers to genes with FDR < 0.05 based on putative functional DNMs for individual disorders (ASD, UDD, EE, ID). Number of shared disorders refers to all candidate genes carrying Pfun in one, two, three or more than four disorders (1, 2, 3, > 4). We used the number of all candidate genes to set the background (All). *p < 0.05; **p < 0.01