Tianyun Wang1, Kendra Hoekzema1, Davide Vecchio2,3, Huidan Wu4, Arvis Sulovari1, Bradley P Coe1, Madelyn A Gillentine1, Amy B Wilfert1, Luis A Perez-Jurado5,6,7, Malin Kvarnung8,9, Yoeri Sleyp10, Rachel K Earl11, Jill A Rosenfeld12,13, Madeleine R Geisheker1, Lin Han4, Bing Du4, Chris Barnett5,14, Elizabeth Thompson5, Marie Shaw14, Renee Carroll14, Kathryn Friend15, Rachael Catford15, Elizabeth E Palmer16,17, Xiaobing Zou18, Jianjun Ou19, Honghui Li20, Hui Guo4, Jennifer Gerdts11, Emanuela Avola21, Giuseppe Calabrese21, Maurizio Elia21, Donatella Greco21, Anna Lindstrand8,9, Ann Nordgren8,9, Britt-Marie Anderlid8,9, Geert Vandeweyer22, Anke Van Dijck22, Nathalie Van der Aa22, Brooke McKenna23, Miroslava Hancarova24, Sarka Bendova24, Marketa Havlovicova24, Giovanni Malerba25, Bernardo Dalla Bernardina26, Pierandrea Muglia27, Arie van Haeringen28, Mariette J V Hoffer28, Barbara Franke29,30, Gerarda Cappuccio31,32, Martin Delatycki33, Paul J Lockhart33,34, Melanie A Manning35,36, Pengfei Liu12,13, Ingrid E Scheffer33,37,38,39, Nicola Brunetti-Pierri31,32, Nanda Rommelse30,40, David G Amaral41, Gijs W E Santen28, Elisabetta Trabetti25, Zdeněk Sedláček24, Jacob J Michaelson42, Karen Pierce43, Eric Courchesne43, R Frank Kooy22, Magnus Nordenskjöld8,9, Corrado Romano21, Hilde Peeters10, Raphael A Bernier11, Jozef Gecz6,14,15, Kun Xia4,44, Evan E Eichler45,46.
Abstract
Most genes associated with neurodevelopmental disorders (NDDs) were identified with an excess of de novo mutations (DNMs) but the significance in case-control mutation burden analysis is unestablished. Here, we sequence 63 genes in 16,294 NDD cases and an additional 62 genes in 6,211 NDD cases. By combining these with published data, we assess a total of 125 genes in over 16,000 NDD cases and compare the mutation burden to nonpsychiatric controls from ExAC. We identify 48 genes (25 newly reported) showing significant burden of ultra-rare (MAF < 0.01%) gene-disruptive mutations (FDR 5%), six of which reach family-wise error rate (FWER) significance (p < 1.25E-06). Among these 125 targeted genes, we also reevaluate DNM excess in 17,426 NDD trios with 6,499 new autism trios. We identify 90 genes enriched for DNMs (FDR 5%; e.g., GABRG2 and UIMC1); of which, 61 reach FWER significance (p < 3.64E-07; e.g., CASZ1). In addition to doubling the number of patients for many NDD risk genes, we present phenotype-genotype correlations for seven risk genes (CTCF, HNRNPU, KCNQ3, ZBTB18, TCF12, SPEN, and LEO1) based on this large-scale targeted sequencing effort.
Year: 2020 PMID: 33004838 PMCID: PMC7530681 DOI: 10.1038/s41467-020-18723-y
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Overview of study design.
Targeted sequencing was performed in probands for two gene panels: NDD1 (63 genes) and hcNDD (62 genes). Gene and variant counts are after QC. The same categories of variants were retrieved from three previously published smMIP studies for the 62 hcNDD genes. All smMIP variants were combined, redundant samples were eliminated, and the result was compared to the same categories of variants from ExAC non-psych controls. Variant counts are given after the exclusion of false-positive variants and variants with insufficient coverage in ExAC. Mutation burden analysis identified 48 FDR-significant genes (qmutBurden < 0.05, Benjamini–Hochberg correction for 125 genes), of which six reached FWER significance (pmutBurden < 1.25E−06, Bonferroni correction for 20,000 genes and two tests). DNMs in the 125 genes used in this study were identified from exome sequencing of 10,927 published NDD trios and 6,499 new ASD trios, combined into 17,426 NDD parent–child trios. A separate de novo enrichment analysis, using two statistical methods (CH model and denovolyzeR), identified 90 FDR-significant genes (qdnEnrich < 0.05, Benjamini–Hochberg correction for 18,946 genes in the CH model and 19,618 genes in denovolyzeR), of which 61 reached FWER significance (pdnEnrich < 3.64E−07, Bonferroni correction for 19,618 genes and seven tests) for excess DNM. The significant genes from the two approaches overlap significantly (40 genes). We then performed genotype–phenotype correlation analysis for seven NDD risk genes (CTCF, HNRNPU, KCNQ3, ZBTB18, TCF12, SPEN, and LEO1) to present a clearer clinical picture of each gene.
Genes with a significant burden for ultra-rare severe variants.
| Gene | Samples | smMIP LGD (AC ≤ 3), combined (this study/published) | smMIP MIS30 (AC ≤ 3), combined (this study/published) | ExAC non-psych LGD (AC ≤ 9) | ExAC non-psych MIS30 (AC ≤ 9) | Mutation burden test, LGD | Mutation burden test, MIS30 | FDR significance | FWER significance |
|---|---|---|---|---|---|---|---|---|---|
|  | 19,847 | 33 (12/23) | 25 (4/23) | 1 | 11 | 2.09E−16 | 1.63E−06 | LGD MIS30 | LGD |
|  | 19,847 | 14 (8/9) | 14 (5/10) | 0 | 6 | 5.82E−08 | 3.08E−04 | LGD MIS30 | LGD |
|  | 19,847 | 28 (13/18) | 3 (1/2) | 1 | 3 | 6.89E−14 | 2.64E−01 | LGD | LGD |
|  | 19,847 | 25 (8/17) | 21 (10/11) | 5 | 32 | 3.03E−09 | 9.77E−02 | LGD | LGD |
|  | 19,847 | 16 (7/9) | 13 (3/11) | 2 | 10 | 4.20E−07 | 8.24E−03 | LGD | LGD |
|  | 19,847 | 16 (4/12) | 8 (3/5) | 2 | 9 | 4.20E−07 | 1.12E−01 | LGD | LGD |
|  | 19,847 | 15 (3/12) | 19 (8/15) | 3 | 12 | 5.28E−06 | 3.68E−04 | LGD MIS30 |  |
|  | 19,847 | 10 (4/7) | 7 (4/3) | 1 | 0 | 5.41E−05 | 2.41E−04 | LGD MIS30 |  |
|  | 19,538 | 17 (4/14) | 61 (29/46) | 11 | 86 | 7.61E−04 | 2.16E−03 | LGD MIS30 |  |
|  | 19,538 | 7 (1/6) | 8 (3/7) | 1 | 2 | 1.32E−03 | 1.63E−03 | LGD MIS30 |  |
|  | 19,847 | 10 (1/9) | 5 (0/5) | 0 | 11 | 6.80E−06 | 5.65E−01 | LGD |  |
|  | 16,321 | 8 (8/-) | 3 (3/-) | 0 | 2 | 2.37E−05 | 1.19E−01 | LGD |  |
|  | 19,077 | 13 (5/9) | 29 (9/22) | 3 | 42 | 2.86E−05 | 2.82E−02 | LGD |  |
|  | 19,077 | 11 (1/10) | 3 (0/3) | 2 | 7 | 6.32E−05 | 6.06E−01 | LGD |  |
|  | 19,538 | 8 (2/6) | 12 (3/10) | 0 | 20 | 6.73E−05 | 2.32E−01 | LGD |  |
|  | 19,538 | 13 (1/12) | 6 (3/3) | 4 | 10 | 1.07E−04 | 3.43E−01 | LGD |  |
|  | 16,321 | 8 (8/-) | 0 (0/-) | 2 | 0 | 6.26E−04 | 1 | LGD |  |
|  | 19,847 | 11 (4/8) | 43 (21/34) | 3 | 64 | 2.83E−04 | 2.01E−02 | LGD |  |
|  | 19,847 | 11 (6/5) | 22 (8/15) | 3 | 35 | 2.83E−04 | 1.17E−01 | LGD |  |
|  | 19,847 | 9 (2/7) | 12 (1/12) | 2 | 14 | 6.49E−04 | 6.65E−02 | LGD |  |
|  | 19,847 | 11 (3/8) | 45 (23/37) | 4 | 78 | 7.68E−04 | 8.43E−02 | LGD |  |
|  | 19,077 | 41 (17/29) | 9 (5/6) | 50 | 12 | 1.28E−03 | 1.38E−01 | LGD |  |
|  | 19,077 | 11 (7/7) | 4 (2/2) | 5 | 5 | 1.38E−03 | 2.61E−01 | LGD |  |
|  | 19,847 | 7 (0/7) | 4 (0/4) | 1 | 7 | 1.42E−03 | 4.43E−01 | LGD |  |
|  | 16,321 | 8 (8/-) | 7 (7/-) | 3 | 24 | 1.76E−03 | 7.49E−01 | LGD |  |
|  | 16,321 | 6 (6/-) | 1 (1/-) | 1 | 1 | 1.84E−03 | 4.59E−01 | LGD |  |
|  | 19,847 | 10 (5/6) | 39 (17/28) | 4 | 65 | 1.88E−03 | 7.38E−02 | LGD |  |
|  | 19,847 | 5 (2/4) | 3 (1/2) | 0 | 2 | 2.61E−03 | 1.69E−01 | LGD |  |
|  | 19,077 | 7 (0/7) | 10 (4/6) | 2 | 17 | 3.94E−03 | 2.57E−01 | LGD |  |
|  | 19,538 | 7 (2/5) | 7 (2/5) | 2 | 11 | 4.38E−03 | 2.81E−01 | LGD |  |
|  | 19,847 | 8 (5/4) | 17 (3/16) | 3 | 18 | 4.73E−03 | 1.84E−02 | LGD |  |
|  | 16,321 | 6 (6/-) | 3 (3/-) | 2 | 3 | 5.71E−03 | 1.93E−01 | LGD |  |
|  | 16,321 | 5 (5/-) | 13 (13/-) | 2 | 27 | 1.65E−02 | 2.40E−01 | LGD |  |
|  | 16,321 | 6 (6/-) | 8 (8/-) | 2 | 17 | 5.71E−03 | 3.32E−01 | LGD |  |
|  | 16,321 | 5 (5/-) | 4 (4/-) | 1 | 7 | 6.02E−03 | 3.26E−01 | LGD |  |
|  | 16,321 | 9 (9/-) | 31 (31/-) | 6 | 57 | 6.25E−03 | 4.26E−02 | LGD |  |
|  | 19,538 | 26 (11/17) | 11 (5/9) | 30 | 10 | 7.26E−03 | 2.70E−02 | LGD |  |
|  | 19,077 | 9 (3/6) | 0 (0/0) | 5 | 6 | 7.52E−03 | 1 | LGD |  |
|  | 19,847 | 10 (3/7) | 18 (3/15) | 6 | 24 | 7.96E−03 | 5.96E−02 | LGD |  |
|  | 19,847 | 10 (2/8) | 12 (2/10) | 6 | 16 | 7.96E−03 | 1.12E−01 | LGD |  |
|  | 19,538 | 9 (5/5) | 10 (3/7) | 5 | 15 | 8.48E−03 | 1.92E−01 | LGD |  |
|  | 19,847 | 7 (1/6) | 16 (8/10) | 3 | 25 | 1.15E−02 | 1.52E−01 | LGD |  |
|  | 19,847 | 5 (2/3) | 3 (2/1) | 1 | 6 | 1.17E−02 | 5.49E−01 | LGD |  |
|  | 19,847 | 6 (2/4) | 9 (7/5) | 2 | 14 | 1.22E−02 | 2.43E−01 | LGD |  |
|  | 19,538 | 9 (4/5) | 3 (3/0) | 6 | 6 | 1.56E−02 | 5.40E−01 | LGD |  |
|  | 16,321 | 9 (9/-) | 12 (12/-) | 8 | 18 | 1.79E−02 | 7.34E−02 | LGD |  |
|  | 19,847 | 1 (0/1) | 18 (8/12) | 0 | 9 | 3.04E−01 | 1.11E−04 | MIS30 |  |
|  | 16,321 | 1 (1/-) | 11 (11/-) | 2 | 7 | 6.02E−01 | 2.03E−03 | MIS30 |  |
A one-sided Fisher’s exact test comparing LGD and MIS30 variants from smMIP sequencing with the ExAC (r0.3) non-psych subset identified 48 genes significant at the FDR level, of which six reach FWER significance. The FDR significance threshold qmutBurden < 0.05 was corrected by the Benjamini–Hochberg method for the 125 genes in this study; the FWER significance threshold pmutBurden < 1.25E−06 was corrected by the Bonferroni method for 20,000 genes in the human genome and the two tests performed (LGD and MIS30 variants). *Indicates the 25 genes showing new mutational burden significance in case-control analysis of ultra-rare LGD and MIS30 variants in this study. See Supplementary Data 10 for underlying data.
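The burden test and the FWER threshold described in this footnote can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, variant counts are treated as carrier counts, and the control sample size (45,376) is an assumption that does not appear in this section. The case-side numbers (33 LGD variants in 19,847 samples versus 1 in controls) are taken from the first row of the table.

```python
from math import comb


def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """One-sided Fisher's exact test: P(X >= case_carriers) under the
    hypergeometric null that carriers are spread evenly across groups."""
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    denom = comb(total, carriers)
    upper = min(carriers, n_cases)
    # Sum the upper tail using exact integer arithmetic, then divide once.
    tail = sum(
        comb(n_cases, k) * comb(n_controls, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / denom


def bonferroni_threshold(alpha, n_genes, n_tests):
    """Per-test p-value threshold after Bonferroni correction."""
    return alpha / (n_genes * n_tests)


# 20,000 genes and two tests (LGD and MIS30), as stated in the footnote.
fwer_threshold = bonferroni_threshold(0.05, 20_000, 2)  # 1.25e-06

# ASSUMPTION: the control sample size is not given in this section.
N_CONTROLS = 45_376

# First table row: 33 LGD variants in 19,847 cases, 1 in controls.
p_lgd = fisher_one_sided(33, 19_847, 1, N_CONTROLS)

print(f"FWER threshold: {fwer_threshold:.2e}")
print(f"LGD burden p-value: {p_lgd:.2e}, FWER significant: {p_lgd < fwer_threshold}")
```

The exact integer arithmetic avoids floating-point underflow in the tail sum, which matters when p-values are very small. The published analysis may also adjust for coverage, so this sketch should not be expected to reproduce the table values exactly.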
Fig. 2 Significant genes identified from mutation burden and de novo enrichment analyses.
a Mutation burden analysis identified 48 genes significant for LGD and/or MIS30 variants in smMIP sequencing compared with the ExAC (r0.3) non-psych subset controls; each dot indicates a gene and the color indicates the category of variant showing significance for the gene (red for LGD, blue for MIS30, and black for both LGD and MIS30). b The CH model and denovolyzeR show high concordance for genes with significant excess of DNM at both FDR and FWER levels. c A union set of 90 genes showing excess DNM (FDR 5%) in de novo enrichment analysis. The region in the gray dashed box in the top panel is shown enlarged in the bottom panel. See Supplementary Data 10 for underlying data.
Genes reaching new de novo enrichment significance.
| Gene | DNM dnLGD, all (denovo-db/SPARK-27K) | DNM dnMIS, all (denovo-db/SPARK-27K) | DNM dnMIS30, all (denovo-db/SPARK-27K) | CH model dnLGD | CH model dnMIS | CH model dnMIS30 | CH model dnALT | denovolyzeR dnLGD | denovolyzeR dnMIS | denovolyzeR dnALT | Significance (union of two models), FDR | Significance (union of two models), FWER | Reported significance, Coe253 | Reported significance, ASC102 | Reported significance, DDD299 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  | 0 (0/0) | 6 (4/2) | 0 (0/0) | 1 | 1.67E−02 | 1 | 2.94E−02 | 1 | 5.00E−05 | 2.01E−04 | dnMIS dnALT | – | No | No | No |
|  | 4 (2/2) | 2 (2/0) | 0 (0/0) | 6.53E−05 | 4.44E−01 | 1 | 8.20E−03 | 5.17E−05 | 3.76E−01 | 4.49E−03 | dnLGD | – | No | Yes | No |
|  | 1 (1/0) | 7 (5/2) | 1 (1/0) | 2.13E−01 | 4.70E−03 | 2.06E−01 | 2.30E−03 | 1.57E−01 | 1.54E−03 | 4.87E−04 | dnALT | – | No | Yes | No |
|  | 2 (2/0) | 2 (1/1) | 2 (1/1) | 1.16E−03 | 3.89E−02 | 2.30E−04 | 5.00E−04 | 6.27E−03 | 9.96E−02 | 4.38E−03 | dnALT | – | No | No | Yes |
|  | 1 (0/1) | 4 (3/1) | 3 (3/0) | 2.41E−01 | 1.52E−01 | 1.29E−03 | 8.69E−02 | 1.22E−01 | 1.47E−03 | 4.16E−04 | dnALT | – | No | No | No |
|  | 9 (6/3) | 5 (5/0) | 1 (1/0) | 1.13E−05 | 9.97E−01 | 6.88E−01 | 6.08E−01 | 6.09E−09 | 3.03E−01 | 1.22E−04 | dnLGD dnALT | dnLGD | Yes | Yes | No |
|  | 4 (3/1) | 3 (2/1) | 1 (0/1) | 1.93E−07 | 9.30E−03 | 2.60E−02 | 6.81E−07 | 2.01E−06 | 6.88E−02 | 9.41E−05 | dnLGD dnALT | dnLGD | Yes | Yes | No |
|  | 4 (3/1) | 1 (1/0) | 0 (0/0) | 1.06E−07 | 3.02E−01 | 1 | 6.08E−05 | 4.00E−06 | 6.83E−01 | 9.15E−03 | dnLGD dnALT | dnLGD | Yes | Yes | No |
|  | 4 (3/1) | 5 (3/2) | 3 (3/0) | 7.36E−06 | 2.30E−03 | 3.70E−04 | 1.26E−06 | 1.56E−05 | 9.05E−04 | 3.47E−07 | dnLGD dnALT | dnALT | Yes | Yes | Yes |
|  | 6 (5/1) | 2 (2/0) | 1 (1/0) | 4.39E−06 | 8.43E−01 | 2.56E−01 | 3.59E−02 | 3.30E−07 | 6.63E−01 | 4.55E−03 | dnLGD | dnLGD | Yes | No | Yes |
|  | 6 (4/2) | 2 (0/2) | 1 (0/1) | 8.43E−07 | 8.17E−01 | 2.17E−01 | 2.35E−02 | 1.16E−09 | 7.09E−01 | 5.10E−03 | dnLGD | dnLGD | Yes | No | No |
|  | 5 (4/1) | 1 (1/0) | 1 (1/0) | 4.57E−07 | 7.20E−01 | 1.35E−01 | 3.40E−03 | 1.19E−07 | 7.82E−01 | 6.60E−03 | dnLGD | dnLGD | Yes | No | Yes |
Compared to Coe et al.[25], five genes newly reached FDR significance and seven genes newly reached FWER significance in the de novo enrichment analysis, which used the same methods (CH model and denovolyzeR) with DNMs in 17,426 NDD trios combined from denovo-db (v1.5) and SPARK-27K. The FDR significance threshold qdnEnrich < 0.05 was corrected by the Benjamini–Hochberg method for the genes in each method (18,946 genes in the CH model and 19,618 genes in denovolyzeR); the FWER significance threshold pdnEnrich < 3.64E−07 was corrected by the Bonferroni method for 19,618 genes and seven tests (dnLGD, dnMIS, dnMIS30, and dnALT variants in the CH model, and dnLGD, dnMIS, and dnALT variants in denovolyzeR). Coe253 indicates whether the gene is among the 253 genes reported as significant (FDR 5%) in Coe et al.[25]; ASC102 indicates whether the gene is among the 102 genes reported as significant (FDR 10%) in Satterstrom et al.[8]; and DDD299 indicates whether the gene is among the 299 genes reported as significant in Kaplanis et al.[31]. Note that different methods and significance thresholds were applied in those three studies. See Supplementary Data 10 for underlying data.
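The two corrections named in this footnote can be illustrated with a short Python sketch. It is a generic textbook implementation of the Benjamini–Hochberg step-up adjustment, not the code used by the CH model or denovolyzeR. The function name and the toy p-values are hypothetical; only the gene and test counts come from the footnote.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Toy p-values (hypothetical, for illustration only).
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_q = benjamini_hochberg(toy_p)
significant = [q < 0.05 for q in toy_q]

# Bonferroni threshold for 19,618 genes and seven tests, as in the footnote.
dn_fwer_threshold = 0.05 / (19_618 * 7)

print([round(q, 4) for q in toy_q])
print(f"FWER threshold: {dn_fwer_threshold:.2e}")  # 3.64e-07
```

In a real analysis the adjustment would be run over the p-values of all tested genes (18,946 or 19,618 here), since each q-value depends on a gene's rank among every test performed.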
Fig. 3 Severe variants and the genotype–phenotype correlations in CTCF.
a LGD (red) and MIS30 (blue) variants are depicted against a protein model for CTCF. Variants new to this study are shown above the protein, while published DNMs from denovo-db (v1.5) are shown below. De novo variants are flagged with a yellow lightning bolt. Annotated protein domains are shown (colored blocks) for the largest protein isoform. b Heatmap of the common clinical features of patients carrying severe CTCF variants, using specific HPO annotations (rows) retrieved from published studies and our cohort (columns). Darker color indicates a more frequently recurring feature. Items with no data available are labeled “-” and were excluded from the frequency analysis.
Fig. 4 Distribution of severe patient variants in six genes.
Protein diagrams are shown for HNRNPU (a), KCNQ3 (b), ZBTB18 (c), TCF12 (d), SPEN (e), and LEO1 (f) with the same display conventions as in Fig. 3. Validated LGD (red) and MIS30 (blue) variants are plotted. Variants listed above the protein model are new to this study, while those below were published previously. Paternal (black arrow) and maternal (green arrow) inheritance are shown if determined. A yellow lightning bolt denotes a de novo mutation.
Clinical recontact and detailed genotype–phenotype correlations.
| Gene | CTCF | HNRNPU | KCNQ3 | ZBTB18 | TCF12 | SPEN | LEO1 |
|---|---|---|---|---|---|---|---|
| OMIM gene | *604167 | *602869 | *602232 | *608433 | *600480 | *613484 | *610507 |
| OMIM phenotype | #615502 | #617391 | #121201 | #612337 | #615314 | NR | NR |
| Inheritance pattern | AD | AD | AD | AD | AD | NR | NR |
| # Patients | ~70 | ~35 | ~46 | ~31 | ~124 | ~10 | ~8 |
| Clinical synopsis (most frequent features) | Microcephaly, thin vermilion border, abnormality of the dentition; hypermetropia, strabismus, delayed dentition. Feeding difficulties. Congenital cardiopathies. Cryptorchidism. Hypotonia, global developmental delay, intellectual disability. Growth delay and short stature. | Microcephaly. Generalized hypotonia. Delayed myelination, EEG abnormality, epileptic encephalopathy, global developmental delay, intellectual disability, ventriculomegaly. | Benign familial neonatal epilepsy and benign familial infantile epilepsy, seizure disorders that occur in children who typically have normal psychomotor development. Developmental disability with or without seizures and/or cortical visual impairment. | Moderate to severe intellectual disability, limited or no speech, and variable but characteristic facial features including a round face, prominent forehead, flat nasal bridge, hypertelorism, epicanthal folds, and low-set ears. Hypotonia, poor growth, microcephaly, agenesis of the corpus callosum, and seizures. | Variable craniosynostosis that may involve, individually or in combination, the coronal and/or the sagittal skull sutures. Other congenital anomalies, dysmorphisms (brachydactyly, ptosis, strabismus) and/or neurodevelopmental impairment may be present. | Mild facial dysmorphisms, muscular hypotonia, tall stature, poor motor coordination, and ocular abnormalities | Intellectual disability and autistic behavior |
Data were retrieved by analyzing available clinical reports for the genes of interest, and the clinical synopsis is presented according to MedGen. Individual patient details for the seven genes can be found in Supplementary Data 12–18, respectively.
AD autosomal dominant, NR not reported.