Madelyn A Gillentine1, Tianyun Wang1, Kendra Hoekzema1, Jill Rosenfeld2,3, Pengfei Liu2, Hui Guo1,4, Chang N Kim5,6,7,8, Bert B A De Vries9, Lisenka E L M Vissers9, Magnus Nordenskjold10,11, Malin Kvarnung10,11, Anna Lindstrand10,11, Ann Nordgren10,11, Jozef Gecz12,13,14, Maria Iascone15, Anna Cereda16, Agnese Scatigno16, Silvia Maitz17, Ginevra Zanni18, Enrico Bertini18, Christiane Zweier19, Sarah Schuhmann19, Antje Wiesener19, Micah Pepper20,21, Heena Panjwani20,21, Erin Torti22, Farida Abid23,24, Irina Anselm25, Siddharth Srivastava25, Paldeep Atwal26, Carlos A Bacino3, Gifty Bhat27, Katherine Cobian27, Lynne M Bird28,29, Jennifer Friedman28,30,31, Meredith S Wright28,30, Bert Callewaert32, Florence Petit33, Sophie Mathieu34, Alexandra Afenjar34, Celenie K Christensen35, Kerry M White36, Orly Elpeleg37, Itai Berger38,39, Edward J Espineli23,24, Christina Fagerberg40, Charlotte Brasch-Andersen40, Lars Kjærsgaard Hansen41, Timothy Feyma42, Susan Hughes43,44, Isabelle Thiffault44,45, Bonnie Sullivan43, Shuang Yan43, Kory Keller46, Boris Keren47, Cyril Mignot47, Frank Kooy48, Marije Meuwissen48, Alice Basinger49, Mary Kukolich49, Meredith Philips49, Lucia Ortega49, Margaret Drummond-Borg49, Mathilde Lauridsen40, Kristina Sorensen40, Anna Lehman50,51, Elena Lopez-Rangel50,52,53, Paul Levy54, Davor Lessel55, Timothy Lotze23, Suneeta Madan-Khetarpal56,57, Jessica Sebastian56, Jodie Vento56, Divya Vats58, L Manace Benman59, Shane Mckee60, Ghayda M Mirzaa61,62,63, Candace Muss64, John Pappas65, Hilde Peeters66, Corrado Romano67, Maurizio Elia67, Ornella Galesi67, Marleen E H Simon68, Koen L I van Gassen68, Kara Simpson69, Robert Stratton70, Sabeen Syed71, Julien Thevenon72, Irene Valenzuela Palafoll73, Antonio Vitobello74,75, Marie Bournez76,77, Laurence Faivre75,77, Kun Xia4, Rachel K Earl20,21,78, Tomasz Nowakowski5,6,7,8, Raphael A Bernier20,21,78, Evan E Eichler79,80.
Abstract
BACKGROUND: With the increasing number of genomic sequencing studies, hundreds of genes have been implicated in neurodevelopmental disorders (NDDs). The rate of gene discovery far outpaces our understanding of genotype-phenotype correlations, with clinical characterization remaining a bottleneck for understanding NDDs. Most disease-associated Mendelian genes are members of gene families, and we hypothesize that those with related molecular function share clinical presentations.
Keywords: Cortex development; Gene families; Neurodevelopmental disorders; hnRNPs
Year: 2021 PMID: 33874999 PMCID: PMC8056596 DOI: 10.1186/s13073-021-00870-6
Source DB: PubMed Journal: Genome Med ISSN: 1756-994X Impact factor: 11.117
Gene families involved in RNA processing reaching FDR significance in Coe et al. [1] and their role in disease
| Gene family | Genes reaching FDR significance | # of NDD candidate genes in gene family |
|---|---|---|
| Chromodomain DNA-binding proteins | 6/10 | |
| BAF complex | 16/24 | |
| Heterogeneous nuclear ribonucleoproteins | 7/33 | |
| Lysine acetyltransferases | 5/17 | |
Gene families were defined by the HUGO Gene Nomenclature Consortium. At least three members of the gene family must reach FDR significance to be included. NDD candidate genes were determined by OMIM and a literature search. Coe et al. [1] n = 11,722 exomes
Previous disease associations for genetic variation in HNRNP genes
| Gene | Disorder | Type of variation |
|---|---|---|
| | ALS/FTLD [; Multisystem proteinopathy [ | MIS; MIS |
| | ALS/FTLD [; Multisystem proteinopathy [ | MIS; MIS |
| | ALS/FTLD [; Multisystem proteinopathy [ | MIS; MIS |
| | 4q21 microdeletion/duplication/triplication syndrome [ | CNVs |
| | | MIS20/small CNVs |
| | Bain-type ID [ | MIS20 |
| | AKS/Okamoto syndrome [; Kabuki syndrome [ | LGD/MIS20/chromosomal deletions; MIS |
| | | LGD/MIS20 |
| | 6q proximal deletions [ | Chromosomal deletion |
| | ALS/FTLD [ | LGD/MIS |
| | 1q43q44 microdeletion syndrome [ | Chromosomal deletion; LGD/MIS20/chromosomal duplication |
MIS missense, MIS20 missense variants with CADD scores ≥ 20, placing them among the top 1% of variants ranked by predicted deleteriousness, CNVs copy number variants, LGD likely gene disrupting, ALS amyotrophic lateral sclerosis, FTLD frontotemporal lobar degeneration, ID intellectual disability, AKS Au-Kline syndrome
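The "top 1%" reading of a CADD score of 20 follows from CADD's PHRED-like scaling, in which a scaled score s corresponds to a rank fraction of 10^(-s/10). The sketch below illustrates that conversion and the MIS/MIS20 labeling used in the table; the function names are hypothetical helpers, not part of CADD or of the study's pipeline.

```python
def cadd_top_fraction(phred_score: float) -> float:
    """Fraction of ranked variants scoring at or above a PHRED-scaled CADD score."""
    return 10 ** (-phred_score / 10)


def missense_label(phred_score: float) -> str:
    """Hypothetical helper: apply the table's MIS / MIS20 labels to a missense variant."""
    return "MIS20" if phred_score >= 20 else "MIS"


for score in (10, 20, 30):
    print(f"CADD {score}: top {cadd_top_fraction(score):.1%}, label {missense_label(score)}")
```

Under this scaling, each 10-point increase in the scaled score narrows the fraction tenfold.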
Fig. 1 Study workflow. Candidate NDD HNRNPs were determined from the literature and publicly available information (such as amino acid sequences) and from identification of probands in our novel cohorts. The candidate NDD HNRNPs were finalized by considering only genes in which at least three probands were identified from published and/or novel sources. Functional impact analyses focused on these finalized NDD HNRNP candidates and included pathogenicity predictions (gnomAD and GEVIR), de novo enrichment analyses (using the Chimpanzee–Human [CH] model and denovolyzeR), missense analyses (using CLUMP and MetaDome), expression analyses of fetal cortex and adult tissues, and phenotypic analyses within HNRNPs, across HNRNPs, and in comparison to other similarly presenting disorders by HPO terms. CNV: copy number variant; pLI: loss-of-function intolerance; LGD: likely gene disrupting; NMD: nonsense-mediated decay; HPO: human phenotype ontology
Fig. 2 Protein similarity of hnRNPs. Correlation plot of hnRNPs by canonical amino acid sequence. Pearson correlation values are shown numerically in the bottom half of the plot and visually in the top half
Genomic disorders spanning NDD HNRNPs
| Genomic disorder | Gene previously considered candidate for CNVs? | Shared phenotypes | |
|---|---|---|---|
| 5q35 deletions | No | DD/ID, characteristic facial features, overgrowth, microcephaly | |
| 1q43q44 deletions | Yes | DD/ID, seizures, structural brain abnormalities, speech delay | |
| 9q21.32 deletions | Yes | DD/ID, motor delay, speech delay, structural brain abnormalities, hypotonia, skeletal abnormalities, hand/feet abnormalities, cardiac abnormalities, genitourinary issues, dysmorphic features | |
| 6q proximal deletions | Yes | DD/ID, ASD, structural brain abnormalities, behavioral issues | |
| 4q21 microdeletion syndrome | No | DD/ID, emotional/behavioral issues, speech delay | |
| 1p36 monosomy | No | DD/ID, skeletal abnormalities, genitourinary issues, seizures, structural brain abnormalities |
CNV copy number variant, DD/ID developmental delay/intellectual disability, ASD autism spectrum disorder
Fig. 3 Pathogenicity assessment of variation in hnRNPs. pLI and Z-scores were obtained from gnomAD. pLI scores are significantly higher among NDD hnRNPs (n = 12) than among non-NDD hnRNPs (n = 20), suggesting LGD variants are more likely to be damaging. Z-scores show a trend towards higher values for NDD hnRNPs, suggesting severe missense variants are likely to be damaging. T-test with Welch’s correction. *p < 0.05. pLI: loss-of-function intolerance
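Welch's correction compares two group means without assuming equal variances, which suits groups of unequal size such as n = 12 versus n = 20. A minimal sketch of the statistic and its Welch-Satterthwaite degrees of freedom follows; the helper `welch_t` and the scores are hypothetical and are not values from gnomAD or from the study, and the p value lookup is left out.

```python
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom) for two samples."""
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / (se2_a + se2_b) ** 0.5
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


# Hypothetical pLI-like scores for illustration only.
ndd_scores = [0.99, 0.95, 1.00, 0.88, 0.97]
non_ndd_scores = [0.10, 0.85, 0.40, 0.02, 0.60, 0.30]

t_stat, dof = welch_t(ndd_scores, non_ndd_scores)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

The resulting t and df would then be passed to a Student's t distribution to obtain the p value.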
Fig. 4 De novo enrichment and clustering of missense variation analyses of NDD hnRNPs. a De novo variation was assessed for NDD HNRNPs using two statistical models: the CH model and denovolyzeR. Genes to the right of or above the dotted line achieve exome-wide significance (q < 4.24 × 10⁻⁷), while genes to the right of or above the dashed line reach nominal significance (q < 0.05). HNRNPU reaches exome-wide significance for all protein-impacting variants (Protein) and LGD variants, with severe missense variants reaching significance by only the CH model. SYNCRIP reaches exome-wide significance for LGD variants and all protein-impacting variants by the CH model alone. HNRNPD reaches nominal significance by the CH model. P values are FDR corrected with the number of genes (n = 18,946 for CH model and n = 19,618 for denovolyzeR) with three tests per gene (LGD, missense, and all protein changes) and two tests (CH model and denovolyzeR) per mutation type. Only cohorts with known de novo status were included, as listed in Tables S1 and S7. Statistics can be seen in Table S6. b Analysis of clustered missense variants. Clustering of missense variants was analyzed using CLUMP; scores are shown in Table S6 (paired t-test). Compared to the non-neuropsychiatric subset of gnomAD (n = 114,704, 1958 missense variants), the CLUMP score for NDD hnRNPs (red) among probands is significantly lower than for controls in gnomAD (black), indicating more clustering of mutations (shown by arrow). Note that only genes with variants in the current cohort could undergo this analysis. hnRNPH2, hnRNPK, hnRNPR, and hnRNPUL1 each show independently significant clustering compared to gnomAD controls. c CLUMP scores for missense variants in probands with ASD (n = 60). hnRNPH2, hnRNPK, hnRNPR, and hnRNPUL1 reach significance independently. *p < 0.05, **p < 0.01, ***p < 0.001. LGD: likely gene disrupting; Missense: severe missense (CADD ≥ 20); Protein: all protein-affecting variants
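One reading of the exome-wide threshold is that 0.05 is divided by the number of genes, the three tests per gene, and the two models. The sketch below works through that arithmetic; the division itself is an assumption on our part (the caption only says the values are FDR corrected), and the function name is hypothetical. With the denovolyzeR gene count, the result is about 4.25 × 10⁻⁷, close to the threshold quoted in the caption.

```python
ALPHA = 0.05
TESTS_PER_GENE = 3  # LGD, missense, all protein changes (from the caption)
MODELS = 2          # CH model and denovolyzeR (from the caption)


def exome_wide_threshold(n_genes: int, alpha: float = ALPHA) -> float:
    """Assumed threshold: alpha divided by the total number of tests performed."""
    return alpha / (n_genes * TESTS_PER_GENE * MODELS)


# Gene counts as given in the caption.
for model, n_genes in (("CH model", 18_946), ("denovolyzeR", 19_618)):
    print(f"{model}: {exome_wide_threshold(n_genes):.3e}")
```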
Fig. 5 hnRNP proband variants. Protein structure, known binding motif, number of probands by mutation type, location of variants in each protein, and known associated disorders of NDD hnRNPs are shown. Novel cases are above the protein with published cases below. Red indicates LGD variants and blue represents severe missense variants. RRM: RNA recognition motif; qRRM: quasi-RNA recognition motif; KH: K-homology domain; RGG: arginine-glycine-rich (RGG) box; NLS: nuclear localization sequence. Further details of each variant are shown in Table S7. Adapted from Geuens et al. [83]. a Gene reaches exome-wide significance for all protein-impacting variants by CH model. b Gene reaches exome-wide significance for all protein-impacting variants and LGD variants by CH model. c Gene reaches exome-wide significance for all protein-impacting variants and LGD variants by CH model and denovolyzeR. d Gene reaches significance for missense variant clustering
Fig. 6 HNRNP expression in adult and developing fetal cortex tissues. a Heatmap showing transcript-level expression values for NDD hnRNPs for adult brain tissues in GTEx. All tissues are shown in Fig. S4, and p values among individual HNRNPs are shown in Table S3. b Heatmap showing fold change of expression of each NDD HNRNP among 48 different cell types in the developing fetal cortex. Blue indicates increase in fold change of expression and red indicates decreased expression, as determined by Z-scores. NDD HNRNPs have higher fold expression change in MGE progenitors, radial glia, and excitatory neurons, and are depleted in inhibitory neurons. All HNRNPs are shown in Fig. S3. Significance indicates enrichment in particular cell types by Wilcoxon rank-sum test with Bonferroni correction based on number of cell types. P values and fold change for scRNA data from developing human cortex can be seen in Tables S4 and S5. c Correlation plot of developing fetal cortex gene expression. Pearson correlation R values are shown numerically in the bottom half of the plot and visually in the top half. P values were corrected by number of genes (23) and number of cell types (48). HNRNPs in the same homology group tend to have more correlated expression. d Specific brain region enrichment as determined by SEA, showing enrichment of expression of the NDD HNRNPs in the early fetal striatum and early-mid fetal amygdala. *p < 0.05; **p < 0.01; ***p < 0.001, ****p < 0.0001
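The correlation plot in panel c rests on Pearson's R between pairs of genes, computed across cell types. A minimal sketch of that calculation is given below; the function is a plain textbook implementation, and the fold-change vectors are hypothetical values for illustration rather than data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    denom = sqrt(sum(d * d for d in dx) * sum(d * d for d in dy))
    if denom == 0:
        raise ValueError("correlation is undefined for constant input")
    return sum(a * b for a, b in zip(dx, dy)) / denom


# Hypothetical fold changes for two genes across five cell types.
gene_a = [1.8, 1.2, 0.4, -0.6, -1.1]
gene_b = [1.5, 1.0, 0.1, -0.2, -0.9]
print(f"R = {pearson_r(gene_a, gene_b):.3f}")
```

Repeating this for every gene pair fills the correlation matrix that the plot displays.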
Fig. 8 Relationship of HNRNP-related disorders to each other and to similarly presenting disorders. a Comparison of average number of HPO terms shared within HNRNPs versus with other similarly presenting disorders, as determined by PhenPath. The HNRNP-related disorders present more similarly to each other than to other NDDs. HPO terms are in Table S12. b Heatmap showing fold change of expression of each NDD HNRNP and similarly presenting NDD based on HPO terms among 48 different cell types. Blue indicates increase in fold change of expression and red indicates decreased expression, as determined by Z-scores. c Correlation plot of developing fetal cortex gene expression for HNRNPs and genes implicated in similarly presenting disorders. Pearson correlation R values are shown visually, with darker and larger circles indicating higher Pearson R values. HNRNPs are noted with a red line. P values were corrected by number of genes (28) and number of cell types (48). *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001. d Comparison of expression Pearson’s R values within HNRNPs and compared to similarly presenting disorders. The expression of the HNRNPs is more similar to each other than to other NDDs
Fig. 7 Phenotypic information of 189–221 hnRNP-variation probands. a Correlation matrix of phenotypes across hnRNP probands. Genes are in order of protein similarity as determined by Clustal Omega and canonical protein sequences as in Fig. 2. Phenotypes correlate across all HNRNPs, except HNRNPF due to sample size. Size and shade of circle represent correlation coefficients, which are shown on the bottom half of the matrix. Correlations for LGD and missense variants separately are in Fig. S4. P values, which are corrected by number of genes (23) and phenotypes (88, occurring in at least 20% of any HNRNP group), can be seen in Table S9. b Plot comparing protein and phenotype correlations that are over Pearson’s R = 0.5. Colors are the same as in Fig. 2 protein groups. Those with more similar protein sequences tend to be more phenotypically similar. c Plot of phenotypes of all probands by mutation type. Individual HNRNPs can be seen in Fig. S5. d Heatmap indicating percent of probands with phenotype. Sample sizes can be seen in Table S7 and range from n = 2 (HNRNPF) to n = 83 (HNRNPU). Lines indicate significant differences as determined by pairwise Fisher’s exact tests with Bonferroni correction based on 12 genes, 88 phenotypes, and three mutational categories. Red dashed lines indicate significance with only LGD variants. Raw p values can be seen in Table S8. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001. LGD: likely gene disrupting; MIS: severe missense (CADD ≥ 20)
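The pairwise comparisons in panel d combine Fisher's exact test with a Bonferroni correction. The sketch below shows one way such a comparison could be computed for a single phenotype between two genes. The proband counts are hypothetical, and taking the number of tests as 12 × 88 × 3 is only an assumed reading of the caption, which does not give the exact multiplier.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def bonferroni(p, n_tests):
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p * n_tests)


# Hypothetical counts, NOT from the study: probands with / without a phenotype
# for gene 1 (30 / 10) and gene 2 (5 / 20).
p_raw = fisher_exact_two_sided(30, 10, 5, 20)
N_TESTS = 12 * 88 * 3  # assumed reading of the caption's correction
p_adj = bonferroni(p_raw, N_TESTS)
print(f"raw p = {p_raw:.3g}, Bonferroni-adjusted p = {p_adj:.3g}")
```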
Inheritance of 225 SNVs and indels in NDD HNRNPs
| Variant type | De novo % of variant type ( | Inherited % of variant type ( | Unknown % of variant type ( | Total % of all variants ( |
|---|---|---|---|---|
| | 82 (114/139) | 0.7 (1/139) | 17.3 (24/139) | 61.8 (139/225) |
| | 58.1 (50/86) | 1.2 (1/86) | 40.7 (35/86) | 38.2 (86/225) |
| | 50 (4/8) | 0 (0/8) | 50 (4/8) | 3.6 (8/225) |
| | 57.5 (42/73) | 1.4 (1/73) | 41.1 (30/73) | 32.4 (73/225) |
| | 80 (4/5) | 0 (0/5) | 28.6 (2/5) | 2.7 (6/225) |
| | 73.2 (164/225) | 0.9 (2/225) | 25.3 (56/225) | 100 (225/225) |
LGD likely gene disrupting, MIS missense, MIS20 CADD score ≥ 20, MIS30 CADD score ≥ 30. NDD HNRNPs include HNRNPAB, HNRNPD, HNRNPF, HNRNPH1, HNRNPH2, HNRNPH3, HNRNPK, SYNCRIP, HNRNPR, HNRNPU, HNRNPUL1, and HNRNPUL2. Variants identified in non-NDD HNRNPs are in Table S7
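Each cell of the inheritance table pairs a percentage with the fraction it was derived from, so the two can be cross-checked by simple arithmetic. The sketch below recomputes each percentage from its fraction and lists the cells where the printed values disagree by more than rounding. The numbers are transcribed from the table as it appears here, rows are referred to by position because the variant-type labels are missing, and the check cannot say which of the two printed values is the intended one.

```python
# (printed percent, numerator, denominator) for each cell, row by row, in the
# column order: de novo, inherited, unknown, total.
rows = [
    [(82, 114, 139), (0.7, 1, 139), (17.3, 24, 139), (61.8, 139, 225)],
    [(58.1, 50, 86), (1.2, 1, 86), (40.7, 35, 86), (38.2, 86, 225)],
    [(50, 4, 8), (0, 0, 8), (50, 4, 8), (3.6, 8, 225)],
    [(57.5, 42, 73), (1.4, 1, 73), (41.1, 30, 73), (32.4, 73, 225)],
    [(80, 4, 5), (0, 0, 5), (28.6, 2, 5), (2.7, 6, 225)],
    [(73.2, 164, 225), (0.9, 2, 225), (25.3, 56, 225), (100, 225, 225)],
]


def mismatches(table, tol=0.05):
    """List (row, column, printed %, recomputed %) where the two disagree beyond rounding."""
    found = []
    for r, row in enumerate(table, start=1):
        for c, (printed, num, den) in enumerate(row, start=1):
            computed = 100 * num / den
            if abs(computed - printed) > tol + 1e-9:
                found.append((r, c, printed, round(computed, 1)))
    return found


for r, c, printed, computed in mismatches(rows):
    print(f"row {r}, column {c}: printed {printed}%, fraction gives {computed}%")
```

Any flagged cell would need to be checked against the published table before either value is relied on.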