Kuokuo Li, Zhengbao Ling, Tengfei Luo, Guihu Zhao, Qiao Zhou, Xiaomeng Wang, Kun Xia, Jinchen Li, Bin Li.
Abstract
De novo variants (DNVs) are critical to the treatment of neurodevelopmental disorders (NDDs). However, effectively identifying candidate genes in small cohorts is challenging in most NDDs because of high genetic heterogeneity. We hypothesised that integrating DNVs from multiple NDDs with genetic similarity can significantly increase the likelihood of prioritising candidate genes. To validate this hypothesis, we catalogued 66,186 coding DNVs in 50,028 individuals with nine types of NDDs, in cohorts ranging in size from 118 to 31,260, from the Gene4Denovo database. We found that integrating DNVs effectively increases the number of prioritised candidate genes for each disorder. We identified 654 candidate genes, including 481 shared candidate genes carrying putative functional variants in at least two disorders. Notably, 13.51% (65/481) of the shared candidate genes were prioritised only via integrated analysis, and 44.62% (29/65) of these were validated in recent large cohort studies. Moreover, we estimated that more novel candidate genes will be prioritised as cohort size increases, particularly for disorders with a high number of putative functional DNVs per individual. In conclusion, integrating DNVs may increase the power to prioritise candidate genes, which is important for NDDs with small cohort sizes.
Keywords: candidate gene; de novo variant; neurodevelopmental disorder
Year: 2021 PMID: 33809095 PMCID: PMC8001830 DOI: 10.3390/life11030233
Source DB: PubMed Journal: Life (Basel) ISSN: 2075-1729
Summary of collected DNVs in neurodevelopmental disorders.
| Phenotypes | Study | Trios | DNVs | Coding DNVs | PTVs | Dmis | Pfun | Pfun per Individual |
|---|---|---|---|---|---|---|---|---|
| ASD | 14 | 10,318 | 287,444 | 12,141 | 1580 | 2507 | 4087 | 0.40 |
| SCZa | 11 | 3402 | 3422 | 3357 | 358 | 716 | 1074 | 0.32 |
| EE | 9 | 973 | 1248 | 1191 | 170 | 364 | 534 | 0.55 |
| DD/ID | 6 | 31,260 | 45,541 | 44,825 | 7078 | 11,683 | 18,761 | 0.60 |
| CHD | 1 | 2645 | 2990 | 2981 | 369 | 654 | 1023 | 0.39 |
| TD | 3 | 909 | 842 | 818 | 85 | 199 | 284 | 0.31 |
| BPa | 3 | 219 | 6995 | 199 | 34 | 21 | 55 | 0.25 |
| OCD | 1 | 118 | 134 | 128 | 48 | 20 | 68 | 0.58 |
| CMS | 1 | 184 | 205 | 198 | 27 | 12 | 39 | 0.21 |
ASD, autism spectrum disorder; SCZ, schizophrenia; EE, epileptic encephalopathy; DD/ID, developmental disorders/intellectual disability; CHD, congenital heart disease; TD, Tourette disorder; BP, bipolar disorder; OCD, obsessive-compulsive disorder; CMS, complex motor stereotypies; DNVs, de novo variants; PTVs, protein-truncating variants; Dmis, deleterious missense variants; Pfun, putative functional variants, combining PTVs and Dmis. a, several patients with SCZ/BP come from one study.
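The two derived columns of the summary table follow directly from the footnote: Pfun is the sum of PTVs and Dmis, and Pfun per individual is Pfun divided by the number of trios. The short Python sketch below recomputes both from the table values; the function and variable names are illustrative only and do not come from the study.

```python
# Rows copied from the summary table: (phenotype, trios, PTVs, Dmis).
ROWS = [
    ("ASD", 10318, 1580, 2507),
    ("SCZ", 3402, 358, 716),
    ("EE", 973, 170, 364),
    ("DD/ID", 31260, 7078, 11683),
    ("CHD", 2645, 369, 654),
    ("TD", 909, 85, 199),
    ("BP", 219, 34, 21),
    ("OCD", 118, 48, 20),
    ("CMS", 184, 27, 12),
]


def pfun_summary(rows):
    """Return {phenotype: (Pfun, Pfun per individual rounded to 2 places)}."""
    out = {}
    for name, trios, ptv, dmis in rows:
        pfun = ptv + dmis  # Pfun combines PTVs and Dmis
        out[name] = (pfun, round(pfun / trios, 2))
    return out


summary = pfun_summary(ROWS)
total_trios = sum(row[1] for row in ROWS)  # total individuals across cohorts

for name, (pfun, rate) in summary.items():
    print(f"{name}: Pfun={pfun}, Pfun per individual={rate:.2f}")
print("Total trios:", total_trios)
```

The trio total reproduces the 50,028 individuals quoted in the abstract.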
Figure 1. Genetic similarity between different neurodevelopmental disorders. Genetic similarity among disorders was assessed for three classes of variants: LoF, Dmis and Pfun. OE, ratio of observed to expected numbers of shared genes. Solid coloured circles indicate OE greater than 1 and p value less than 0.05. Solid circles with no colour indicate OE greater than 1 but p value greater than 0.05. Hollow circles indicate OE less than 1. Dmis, deleterious missense variants; LoF, loss of function. LoF variants include frameshift, stoploss, stopgain and splicing variants. Pfun, putative functional variants, including Dmis and LoF variants. The p value was calculated using DNENRICH software (v1.0). ASD, autism spectrum disorder; SCZ, schizophrenia; EE, epileptic encephalopathy; DD/ID, developmental disorders/intellectual disability; CHD, congenital heart disease; TD, Tourette disorder; BP, bipolar disorder; OCD, obsessive-compulsive disorder; CMS, complex motor stereotypies.
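To make the OE ratio and its p value concrete, the following Python sketch computes an observed-to-expected ratio of shared genes together with an empirical permutation p value. It is a simplified stand-in, not DNENRICH: it assumes genes are drawn uniformly from a gene universe, whereas DNENRICH also accounts for gene size and mutability. The function name, parameters and toy gene labels are all hypothetical.

```python
import random


def overlap_oe(genes_a, genes_b, universe, n_perm=10_000, seed=0):
    """Observed/expected shared genes and an empirical p value.

    ASSUMPTION: the null model draws genes uniformly from `universe`.
    DNENRICH weights genes by size and mutability, which is ignored here.
    """
    rng = random.Random(seed)
    universe = sorted(universe)
    set_a, set_b = set(genes_a), set(genes_b)
    observed = len(set_a & set_b)
    null_counts = []
    for _ in range(n_perm):
        # Replace disorder A's gene set with a random set of the same size.
        draw = rng.sample(universe, len(set_a))
        null_counts.append(sum(g in set_b for g in draw))
    expected = sum(null_counts) / n_perm
    # Add-one correction keeps the empirical p value above zero.
    p = (1 + sum(c >= observed for c in null_counts)) / (n_perm + 1)
    oe = observed / expected if expected > 0 else float("inf")
    return oe, p


# Toy data: 1000 hypothetical genes, two sets of 50 sharing 25 genes.
toy_universe = [f"G{i}" for i in range(1000)]
toy_a = toy_universe[:50]
toy_b = toy_universe[25:75]
oe, p = overlap_oe(toy_a, toy_b, toy_universe, n_perm=2000)
print(f"OE={oe:.2f}, p={p:.4g}")
```

In this scheme an OE above 1 with a small p value corresponds to a solid coloured circle in the figure. Note that the smallest p value such a test can report is bounded by the number of permutations.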
Comparison of prioritised candidate gene number by integrated analysis based on mutation type.
| Disorders (N) | Genetic Similarity (p value) | Genetic Similarity (OE) | Category | Type | FDR < 0.0001 | 0.0001 < FDR < 0.001 | 0.001 < FDR < 0.01 | 0.01 < FDR < 0.05 |
|---|---|---|---|---|---|---|---|---|
| ASD (10,318) | 1.00 × 10⁻⁴ | 3.72 | Before | | 24 | 7 | 23 | 50 |
| | | | After | Pfun | 229 | 31 | 47 | 55 |
| | | | After | LoF | 141 | 16 | 24 | 30 |
| | | | After | Dmis | 175 | 21 | 31 | 33 |
| SCZ (3402) | 1.00 × 10⁻⁴ | 1.67 | Before | | 0 | 0 | 3 | 5 |
| | | | After | Pfun | 68 | 9 | 17 | 18 |
| | | | After | LoF | 29 | 1 | 8 | 13 |
| | | | After | Dmis | 47 | 9 | 11 | 6 |
| EE (973) | 1.00 × 10⁻⁴ | 6.54 | Before | | 7 | 4 | 5 | 8 |
| | | | After | Pfun | 87 | 6 | 10 | 6 |
| | | | After | LoF | 38 | 1 | 9 | 2 |
| | | | After | Dmis | 58 | 5 | 1 | 5 |
| DD/ID (31,260) | 1.00 × 10⁻⁴ | 6.80 | Before | | 278 | 53 | 81 | 115 |
| | | | After | Pfun | 287 | 56 | 79 | 96 |
| | | | After | LoF | 237 | 46 | 64 | 65 |
| | | | After | Dmis | 267 | 50 | 70 | 73 |
| CHD (2645) | 1.00 × 10⁻⁴ | 2.84 | Before | | 3 | 3 | 4 | 12 |
| | | | After | Pfun | 78 | 14 | 16 | 20 |
| | | | After | LoF | 45 | 6 | 8 | 14 |
| | | | After | Dmis | 46 | 8 | 11 | 10 |
| TD (909) | 1.00 × 10⁻⁴ | 2.02 | Before | | 0 | 0 | 0 | 0 |
| | | | After | Pfun | 21 | 1 | 6 | 0 |
| | | | After | LoF | 7 | 0 | 2 | 0 |
| | | | After | Dmis | 14 | 1 | 4 | 0 |
| BP (219) | 2.89 × 10⁻² | 1.90 | Before | | 0 | 0 | 0 | 0 |
| | | | After | Pfun | 3 | 1 | 1 | 0 |
| | | | After | LoF | 2 | 0 | 0 | 0 |
| | | | After | Dmis | 2 | 1 | 1 | 0 |
| OCD (118) | 2.00 × 10⁻⁴ | 3.20 | Before | | 0 | 0 | 0 | 0 |
| | | | After | Pfun | 10 | 1 | 3 | 0 |
| | | | After | LoF | 2 | 0 | 0 | 0 |
| | | | After | Dmis | 9 | 1 | 3 | 0 |
| CMS (184) | 1.00 × 10⁻² | 2.49 | Before | | 0 | 0 | 0 | 1 |
| | | | After | Pfun | 4 | 1 | 1 | 3 |
| | | | After | LoF | 3 | 0 | 0 | 0 |
| | | | After | Dmis | 1 | 1 | 1 | 3 |
ASD, autism spectrum disorder; SCZ, schizophrenia; EE, epileptic encephalopathy; DD/ID, developmental disorders/intellectual disability; CHD, congenital heart disease; TD, Tourette disorder; BP, bipolar disorder; OCD, obsessive-compulsive disorder; CMS, complex motor stereotypies; Pfun, putative functional variant; LoF, loss of function variant; Dmis, deleterious missense variant; Before, candidate genes prioritised based on putative functional DNVs of the specific disorder with FDR < 0.05; After, candidate genes prioritised based on the integration of DNVs in all disorders. A gene carrying Pfun, LoF or Dmis variants in a specific disorder and passing the given FDR threshold in the integrated analysis was defined as a candidate gene of that disorder. OE, ratio of observed to expected numbers of shared genes with putative functional de novo variants.
Candidate genes carrying putative functional variants in different numbers of disorders (FDR < 0.05).
| Rank (FDR) | Unique Disorders | Two Disorders | Three Disorders | Four Disorders | Five Disorders | Six Disorders |
|---|---|---|---|---|---|---|
| [0, 0.0001) (48.32%) | 42 | 113 | 98 | 50 | 10 | 3 |
| [0.0001, 0.001) (9.17%) | 14 | 26 | 18 | 2 | 0 | 0 |
| [0.001, 0.01) (15.44%) | 31 | 41 | 21 | 6 | 2 | 0 |
| [0.01, 0.05) (27.06%) | 86 | 59 | 30 | 1 | 1 | 0 |
Candidate genes are split into six groups based on the number of disorders in which the gene carries putative functional DNVs. Unique genes are genes that carry putative functional DNVs in only one disorder. We ranked all candidate genes into four tiers based on the strength of the FDR.
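The tier percentages in the table, and the gene totals quoted in the abstract, can be rechecked from the counts alone. The Python sketch below does this and also shows one way to map an FDR value to its half-open tier; the names `TIERS` and `fdr_tier` are illustrative, not taken from the study.

```python
# Counts copied from the table: rows are FDR tiers, columns are the number
# of disorders (1 to 6) in which a gene carries putative functional DNVs.
TIERS = {
    "[0, 0.0001)": [42, 113, 98, 50, 10, 3],
    "[0.0001, 0.001)": [14, 26, 18, 2, 0, 0],
    "[0.001, 0.01)": [31, 41, 21, 6, 2, 0],
    "[0.01, 0.05)": [86, 59, 30, 1, 1, 0],
}
UPPER_BOUNDS = (0.0001, 0.001, 0.01, 0.05)


def fdr_tier(fdr):
    """Map an FDR value to its half-open tier label, or None if FDR >= 0.05."""
    for upper, label in zip(UPPER_BOUNDS, TIERS):
        if 0 <= fdr < upper:
            return label
    return None


total = sum(sum(row) for row in TIERS.values())        # all candidate genes
shared = sum(sum(row[1:]) for row in TIERS.values())   # genes in >= 2 disorders
tier_share = {k: round(100 * sum(v) / total, 2) for k, v in TIERS.items()}

print("Total candidate genes:", total)
print("Shared candidate genes:", shared)
for label, pct in tier_share.items():
    print(f"{label}: {pct:.2f}%")
```

The totals agree with the 654 candidate genes and 481 shared candidate genes reported in the abstract.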
Figure 2. Projected gene discovery in larger cohorts. We assumed sample sizes of 500, 1000, 2000, 4000, 8000, 16,000, and 32,000 for each disorder and sampled de novo variants from the existing data according to the putative functional de novo variant rate per individual. We then estimated the number of candidate genes for each disorder with FDR < 0.05 by transmitted and de novo association (TADA) analysis. ASD, autism spectrum disorder; SCZ, schizophrenia; EE, epileptic encephalopathy; DD/ID, developmental disorders/intellectual disability; CHD, congenital heart disease; TD, Tourette disorder; BP, bipolar disorder; OCD, obsessive-compulsive disorder; CMS, complex motor stereotypies.
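The sampling step in the caption can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the number of variants in a hypothetical cohort is fixed at the per-individual rate times the cohort size, and genes are drawn with replacement in proportion to how often they were observed. The function name and the toy gene labels are hypothetical, and the TADA step itself is not implemented.

```python
import random

# Hypothetical cohort sizes listed in the figure caption.
COHORT_SIZES = [500, 1000, 2000, 4000, 8000, 16000, 32000]


def resample_pfun(observed_genes, rate_per_individual, n_individuals, seed=0):
    """Draw the putative functional DNVs of a hypothetical cohort.

    ASSUMPTION: the variant count is rate * N (rounded), and each variant's
    gene is drawn with replacement from the observed per-variant gene labels.
    """
    rng = random.Random(seed)
    n_variants = round(rate_per_individual * n_individuals)
    return rng.choices(observed_genes, k=n_variants)


# Toy input: one hypothetical gene label per observed putative functional DNV.
toy_observed = ["GENE_A"] * 5 + ["GENE_B"] * 3 + ["GENE_C"] * 2

# A rate of 0.40 per individual is used here purely as an example value.
projected = {n: resample_pfun(toy_observed, 0.40, n) for n in COHORT_SIZES}
for n, variants in projected.items():
    print(n, len(variants))
```

Each resampled variant list would then be passed to a gene-level association test to count the genes reaching FDR < 0.05 at that cohort size.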