Guihu Zhao1,2, Kuokuo Li3, Bin Li1,2, Zheng Wang1, Zhenghuan Fang3, Xiaomeng Wang3, Yi Zhang1, Tengfei Luo3, Qiao Zhou1, Lin Wang3, Yali Xie1, Yijing Wang3, Qian Chen1, Lu Xia3, Yu Tang1, Beisha Tang1,2, Kun Xia3, Jinchen Li1,2,3.
Abstract
De novo mutations (DNMs) contribute significantly to sporadic diseases, particularly neuropsychiatric disorders. Whole-exome sequencing (WES) and whole-genome sequencing (WGS) provide effective methods for detecting DNMs and prioritizing candidate genes. However, it remains a challenge for scientists, clinicians, and biologists to conveniently access and analyse data on DNMs and candidate genes from scattered publications. To fill this unmet need, we integrated 580 799 DNMs, including 30 060 coding DNMs, detected by WES/WGS from 23 951 individuals across 24 phenotypes, and prioritized a list of candidate genes with different degrees of statistical evidence, including 346 genes with false discovery rates <0.05. We then developed a database called Gene4Denovo (http://www.genemed.tech/gene4denovo/), which allows these genetic data to be conveniently catalogued, searched, browsed, and analysed. In addition, Gene4Denovo integrates data from >60 genomic sources to provide comprehensive variant-level and gene-level annotation and information on the DNMs and candidate genes. Furthermore, Gene4Denovo enables end-users with limited bioinformatics skills to analyse their own genetic data, perform comprehensive annotation, and prioritize candidate genes using custom parameters. In conclusion, Gene4Denovo facilitates and accelerates the interpretation of DNM pathogenicity and of the clinical implications of DNMs in humans.
Year: 2020 PMID: 31642496 PMCID: PMC7145562 DOI: 10.1093/nar/gkz923
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Summary of collected DNMs in Gene4Denovo database
| Phenotypes | Abbreviation | Study | Trios | DNMs | Coding DNMs |
|---|---|---|---|---|---|
| Autism spectrum disorder | ASD | 11 | 6511 | 280 782 | 8175 |
| Undiagnosed developmental disorder | UDD | 1 | 4293 | 8361 | 7696 |
| Congenital heart disorder | CHD | 1 | 2645 | 2990 | 2972 |
| Intellectual disability | ID | 7 | 1331 | 1493 | 1478 |
| Epileptic encephalopathy | EE | 7 | 933 | 1213 | 1165 |
| Schizophrenia | SCZ | 7 | 1094 | 1064 | 1052 |
| Tourette disorder | TD | 2 | 812 | 805 | 781 |
| Congenital diaphragmatic hernia | CDH | 1 | 362 | 470 | 470 |
| Craniosynostosis | CRAN | 1 | 291 | 322 | 319 |
| Periventricular nodular heterotopia | PNH | 1 | 202 | 219 | 219 |
| Amyotrophic lateral sclerosis | ALS | 3 | 173 | 111 | 109 |
| Bipolar disorder | BP | 1 | 79 | 71 | 68 |
| Early onset Parkinson disease | EOPD | 2 | 49 | 60 | 60 |
| Cerebral palsy | CP | 1 | 98 | 61 | 59 |
| Neural tube defects | NTD | 1 | 43 | 40 | 40 |
| Early-onset high myopia | EOHM | 1 | 18 | 20 | 19 |
| Early onset Alzheimer disease | EOAD | 1 | 12 | 15 | 15 |
| Smith-Magenis syndrome | SMS | 1 | 13 | 13 | 13 |
| Cantu syndrome | CS | 1 | 14 | 6 | 6 |
| Sporadic infantile spasm syndrome | SISS | 1 | 10 | 5 | 5 |
| Acromelic frontonasal dysostosis | AFD | 1 | 4 | 4 | 4 |
| Anophthalmia/Microphthalmia | AM | 1 | 25 | 4 | 4 |
| Control | Control | 9 | 3391 | 174 836 | 3629 |
| Mix phenotype | Mix | 1 | 1548 | 107 834 | 1702 |
| Total | | 59 | 23 951 | 580 799 | 30 060 |
All DNMs reported in the primary publications were integrated into the Gene4Denovo database and annotated with ANNOVAR. Variants with the following functional effects were defined as coding DNMs: frameshift indels, stop-gain, stop-loss, missense, synonymous, non-frameshift indels, and splice-site variants (≤2 bp). DNMs in AFD, with a sample size <10 (n = 4), from the denovo-db database were also integrated in the present study.
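The coding-DNM definition in the footnote above can be sketched as a simple filter. This is a minimal illustration, not the authors' pipeline: the function name `is_coding_dnm` is hypothetical, and the label strings are assumed to follow ANNOVAR's usual gene-based annotation output (a region label such as `exonic` or `splicing`, plus an exonic function such as `nonsynonymous SNV`). The source names only the effect classes, not the exact strings.

```python
# Assumed ANNOVAR-style exonic function labels; the source lists effect
# classes only, so these exact strings are an assumption.
CODING_EXONIC_FUNCS = {
    "frameshift insertion",
    "frameshift deletion",
    "nonframeshift insertion",
    "nonframeshift deletion",
    "stopgain",
    "stoploss",
    "nonsynonymous SNV",  # missense
    "synonymous SNV",
}


def is_coding_dnm(func: str, exonic_func: str) -> bool:
    """Return True if a variant falls under the coding-DNM definition.

    func        -- region label, possibly compound (e.g. "exonic;splicing")
    exonic_func -- exonic function label, or "." when not applicable
    """
    regions = set(func.split(";"))
    # Splice-site variants count as coding regardless of exonic function.
    if "splicing" in regions:
        return True
    return "exonic" in regions and exonic_func in CODING_EXONIC_FUNCS


# Hypothetical annotated variants: (region, exonic function)
examples = [
    ("exonic", "nonsynonymous SNV"),
    ("splicing", "."),
    ("intronic", "."),
]
coding = [v for v in examples if is_coding_dnm(*v)]
```

Here the intronic variant is excluded, while the missense and splice-site variants are kept. The ≤2 bp splice-site window is assumed to have been applied upstream by the annotation tool rather than in this filter.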
Summary of prioritized candidate genes in Gene4Denovo database
| Disease (trios) | FDR ≤ 0.0001 | 0.0001 < FDR ≤ 0.001 | 0.001 < FDR ≤ 0.01 | 0.01 < FDR ≤ 0.05 | 0.05 < FDR ≤ 0.1 | 0.1 < FDR < 0.2 |
|---|---|---|---|---|---|---|
| ASD (6511) | 13 | 9 | 10 | 29 | 26 | 53 |
| UDD (4293) | 85 | 21 | 43 | 50 | 40 | 69 |
| CHD (2645) | 3 | 3 | 4 | 12 | 13 | 25 |
| ID (1331) | 26 | 13 | 18 | 16 | 14 | 34 |
| EE (933) | 14 | 3 | 14 | 12 | 8 | 29 |
| SCZ (1094) | 0 | 0 | 0 | 0 | 1 | 9 |
| TD (812) | 0 | 0 | 0 | 2 | 3 | 6 |
| CDH (362) | 0 | 1 | 0 | 0 | 1 | 0 |
| CRAN (291) | 0 | 1 | 0 | 0 | 0 | 0 |
| PNH (202) | 1 | 0 | 0 | 0 | 0 | 0 |
| ALS (173) | 0 | 0 | 0 | 0 | 0 | 1 |
| BP (79) | 0 | 0 | 0 | 0 | 0 | 1 |
| EOPD (49) | 0 | 0 | 0 | 0 | 0 | 1 |
| CP (98) | 0 | 0 | 0 | 1 | 0 | 0 |
| NTD (43) | 0 | 0 | 0 | 1 | 0 | 0 |
| SMS (13) | 1 | 0 | 0 | 0 | 0 | 0 |
| CS (14) | 1 | 0 | 0 | 0 | 0 | 0 |
| AFD (4) | 0 | 1 | 0 | 0 | 0 | 0 |
| CD (19 012) | 117 | 27 | 46 | 60 | 47 | 88 |
| Total | 132 | 36 | 62 | 116 | 99 | 230 |
ASD, autism spectrum disorder; UDD, undiagnosed developmental disorder; CHD, congenital heart disorder; ID, intellectual disability; EE, epileptic encephalopathy; SCZ, schizophrenia; TD, Tourette disorder; CDH, congenital diaphragmatic hernia; CRAN, craniosynostosis; PNH, periventricular nodular heterotopia; ALS, amyotrophic lateral sclerosis; BP, bipolar disorder; EOPD, early onset Parkinson disease; CP, cerebral palsy; NTD, neural tube defects; SMS, Smith-Magenis syndrome; CS, Cantu syndrome; AFD, acromelic frontonasal dysostosis; CD, all samples with different disorders combined. The table shows the number of genes with FDR < 0.2 in each disorder and in the cross-disorder analysis. All candidate genes were ranked into six tiers according to their false discovery rate (FDR). The total number of candidate genes was counted after removing redundancy.
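The six-tier ranking can be illustrated with a short sketch that maps an FDR value to its tier using the column boundaries of the table above. The function name `fdr_tier` and the tier numbering (1 for the strongest evidence) are assumptions made for illustration; the source defines only the FDR intervals. The last lines add up the non-redundant totals of the first four tiers, which gives the 346 genes with FDR below 0.05 quoted in the abstract.

```python
from typing import Optional

# Inclusive upper bounds of tiers 1-5, taken from the table header.
# Tier 6 covers 0.1 < FDR < 0.2 (open at 0.2).
TIER_UPPER_BOUNDS = [0.0001, 0.001, 0.01, 0.05, 0.1]


def fdr_tier(fdr: float) -> Optional[int]:
    """Map an FDR value to a tier (1 = strongest evidence), or None if FDR >= 0.2."""
    if not 0.0 <= fdr < 0.2:
        return None
    for tier, upper in enumerate(TIER_UPPER_BOUNDS, start=1):
        if fdr <= upper:
            return tier
    return 6


# Non-redundant gene counts per tier, copied from the "Total" row.
TOTAL_PER_TIER = [132, 36, 62, 116, 99, 230]

# Genes in the first four tiers (FDR up to 0.05).
genes_up_to_005 = sum(TOTAL_PER_TIER[:4])
print(genes_up_to_005)  # 346
```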
Figure 1. Snapshot of variant-level implications in Gene4Denovo. Two approaches are available to access variant-level implications, the ‘Quick search’ and ‘Advanced search’. The results of a quick search for the KCNQ2 gene are shown as an example, including the functional effects at the transcript and protein levels, homology, predicted damaging severity of missense variants, allele frequencies in different populations, and information in disease-related databases.
Figure 2. Snapshot of gene-level implications in Gene4Denovo. The typical gene-level implications of the KCNQ2 gene are illustrated as an example, including basic information, gene functions, associated phenotypes and diseases, gene expression, variants in different populations, and drug–gene interactions.
Figure 3. Snapshot of analysis panel in Gene4Denovo. There are four steps in the analysis process: inputting an email address, choosing the Trio or Non-trio option, uploading the data files, and inputting the trio or genotype information. To increase flexibility, users are able to specify annotation datasets, such as functional effects, allele frequencies, and predicted damaging scores from any of the 24 in silico algorithms.