Kyle W Davis, Colleen G Bilancia, Megan Martin, Rena Vanzo, Megan Rimmasch, Yolanda Hom, Mohammed Uddin, Moises A Serrano.
Abstract
To identify candidate disease genes of central nervous system (CNS) phenotypes, we created the Neurogenetic Systematic Correlation of Omics-Related Evidence (NeuroSCORE). We identified five genome-wide metrics highly associated with CNS phenotypes to score 19,601 protein-coding genes. Genes scored one point per metric (range: 0-5), identifying 8298 scored genes (scores ≥ 1) and 1601 "high scoring" genes (scores ≥ 3). Using logistic regression, we determined the odds ratio that genes with a NeuroSCORE from 1 to 5 would be associated with known CNS-related phenotypes compared to genes that scored zero. We tested NeuroSCORE using microarray copy number variants (CNVs) in case-control cohorts and aggregate mouse model data. High scoring genes are associated with CNS phenotypes (OR = 5.5, p < 2E-16), enriched in case CNVs, and mouse ortholog genes that cause behavioral and nervous system abnormalities. We identified 1058 high scoring genes with no disease association in OMIM. Transforming the logistic regression results indicates high scoring genes have an 84-92% chance of being associated with a CNS phenotype. Top scoring genes include GRIA1, MAP4K4, SF1, TNPO2, and ZSWIM8. Finally, we interrogated CNVs in the Clinical Genome Resource, finding the majority of clinically significant CNVs contain high scoring genes. These findings can direct future research and improve molecular diagnostics.
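The scoring rule described in the abstract (one point per metric, range 0-5, "high scoring" at ≥ 3) can be sketched as below. This is a minimal illustration only: the metric names `metric_a` to `metric_e`, the gene labels `GENE1` to `GENE4`, and the helper functions are hypothetical placeholders, not the authors' data or code.

```python
from typing import Dict, Set

# Cutoff taken from the abstract: "high scoring" genes have scores >= 3.
HIGH_SCORE_CUTOFF = 3


def neuroscore(gene: str, metric_sets: Dict[str, Set[str]]) -> int:
    """Return one point for each metric gene set that contains the gene."""
    return sum(gene in members for members in metric_sets.values())


def is_high_scoring(score: int) -> bool:
    """Apply the >= 3 cutoff for 'high scoring' genes."""
    return score >= HIGH_SCORE_CUTOFF


# Hypothetical membership data: five invented metrics and invented gene labels.
metric_sets = {
    "metric_a": {"GENE1", "GENE2", "GENE3"},
    "metric_b": {"GENE1", "GENE2"},
    "metric_c": {"GENE1", "GENE2"},
    "metric_d": {"GENE1"},
    "metric_e": {"GENE1"},
}

scores = {g: neuroscore(g, metric_sets) for g in ["GENE1", "GENE2", "GENE3", "GENE4"]}
high_scoring = [g for g, s in scores.items() if is_high_scoring(s)]
```

With this toy input, a gene present in every set scores 5 and a gene absent from all sets scores 0, matching the stated 0-5 range.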
Year: 2022 PMID: 35361823 PMCID: PMC8971396 DOI: 10.1038/s41598-022-08938-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Schematic of final NeuroSCORE model. Key: De novo: De Novo Database; gnomAD: Genome Aggregation Database; Critical Constraint: Critically constrained coding regions database; GTEx: Gene-Tissue Expression database; Index: Database of genes based on Uddin et al.[66]; OMIM: Online Mendelian Inheritance in Man database (see “Methods”).
NeuroSCORE gene metric association of genes with CNS-related clinical features.
| Gene metric | Pearson’s χ²: χ² | Pearson’s χ²: p | Logistic regression: OR | Logistic regression: 95% CI | Logistic regression: p | Wald test: χ² | Wald test: p | Total genes |
|---|---|---|---|---|---|---|---|---|
| CCR 95 | 19.8 | < 1 | 0.9 | 0.8–1.1 | 0.4 | NA | NA | 7049 |
| gnomAD MIS | 20.2 | < 1 | 1.8 | 0.8–4.5 | 0.2 | NA | NA | 144 |
| Coe 1 | 4.2 | .04 | 1.2 | 0.9–1.4 | .10 | NA | NA | 3116 |
| Coe 2 | 2.7 | .10 | NA | NA | NA | NA | NA | 3732 |
Bold text indicates variables used in the final model; OR: odds ratio; 95% CI: 95% confidence interval; CCR 95 or 99: genes with ≥ 1 critically constrained coding region at the 95th or 99th percentiles; GTEx: gene-tissue expression database; gnomAD LOF and MIS: gnomAD genes with upper bound of loss-of-function or missense observed/expected metric < 0.35; NA: not applicable.
NeuroSCORE model shows increasing odds ratios with increasing point totals.
| NeuroSCORE | Logistic regression: OR | Logistic regression: 95% CI | Logistic regression: p | OMIM genes: total genes | OMIM genes: CNS phenotype | OMIM genes: no CNS phenotype | OMIM genes: no phenotype | Absent from OMIM |
|---|---|---|---|---|---|---|---|---|
| 5 of 5 | 32.2 | 11.8–132.7 | 5.7 | 75 | 58 | 3 | 14 | 2 |
| 4 of 5 | 6.6 | 4.4–10.3 | 2.0 | 447 | 121 | 29 | 297 | 11 |
| 3 of 5 | 4.3 | 3.3–5.6 | 2.0 | 1079 | 241 | 91 | 747 | 135 |
| 2 of 5 | 3.6 | 3.0–4.4 | 2.0 | 3407 | 603 | 284 | 2520 | 644 |
| 1 of 5 | 2.0 | 1.7–2.4 | 1.1 | 3290 | 489 | 409 | 2392 | 453 |
| 0 of 5 | NA | NA | NA | 10,461 | 715 | 1389 | 8357 | 2734 |
| ≥ 3 of 5 | 5.5 | 4.4–7.0 | 2.0 | 1601 | 420 | 123 | 1058 | 148 |
OR: odds ratio; 95% CI: 95% confidence interval; NA: not available; high scoring genes are genes identified by ≥ 3 gene sets; OMIM data current as of July 31st, 2021.
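As a rough check on the trend in the table above, the gene counts can be turned into the fraction of genes with a CNS phenotype at each score, plus a crude (unadjusted) odds ratio against the zero-score group. This is a sketch under stated assumptions: the function names are hypothetical, and the crude 2×2 odds ratio treats "no CNS phenotype" and "no phenotype" together as the comparison group, so it is not expected to reproduce the fitted logistic regression values.

```python
# Counts copied from the table: (CNS phenotype, no CNS phenotype, no phenotype).
counts = {
    5: (58, 3, 14),
    4: (121, 29, 297),
    3: (241, 91, 747),
    2: (603, 284, 2520),
    1: (489, 409, 2392),
    0: (715, 1389, 8357),
}


def cns_fraction(score: int) -> float:
    """Fraction of genes at this score that have a CNS phenotype."""
    cns, no_cns, no_pheno = counts[score]
    return cns / (cns + no_cns + no_pheno)


def crude_odds_ratio(score: int, reference: int = 0) -> float:
    """Unadjusted odds ratio of CNS phenotype vs. all other genes,
    relative to the reference score group (assumption: not the paper's model)."""
    cns, no_cns, no_pheno = counts[score]
    ref_cns, ref_no_cns, ref_no_pheno = counts[reference]
    odds = cns / (no_cns + no_pheno)
    ref_odds = ref_cns / (ref_no_cns + ref_no_pheno)
    return odds / ref_odds


for score in sorted(counts, reverse=True):
    print(score, round(cns_fraction(score), 3))
```

The fraction rises with every additional point, which is the same direction as the odds ratios reported in the table.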
Figure 2. Odds ratio of genes associated with CNS-related phenotypes by NeuroSCORE.
Figure 3. NeuroSCORE applied to human case–control cohorts and mouse phenotype experiments.
Distribution of the highest scored gene within case and control CNVs.
| NeuroSCORE | Genes in loss CNVs: cases, n (%) | Genes in loss CNVs: controls, n (%) | Genes in loss CNVs: p | Genes in gain CNVs: cases, n (%) | Genes in gain CNVs: controls, n (%) | Genes in gain CNVs: p |
|---|---|---|---|---|---|---|
| 5 | 5 (0.6) | 0 (0) | | 18 (1.3) | 1 (0.1) | |
| 4 | 52 (6.2) | 9 (0.4) | 2 | 102 (7.5) | 25 (1.3) | 2 |
| 3 | 78 (9.3) | 7 (0.3) | 2 | 176 (13.0) | 52 (2.8) | 2 |
| 2 | 115 (13.8) | 365 (14.3) | .73 | 288 (21.2) | 354 (19.0) | .13 |
| 1 | 149 (17.8) | 307 (12.1) | 2 | 261 (19.2) | 300 (16.1) | .68 |
| 0 | 435 (52.1) | 1501 (58.9) | | 509 (37.5) | 746 (40.1) | |
| NA | 1 (0.1) | 358 (14.1) | 2 | 3 (0.2) | 384 (20.6) | 2 |
NA: genes that could not be scored (e.g., pseudogenes). Total case CNVs: N_losses = 835, N_gains = 1357; total control CNVs: N_losses = 2547, N_gains = 1862. For statistical testing, Fisher’s exact test was used for analyses when genes in CNVs were ≤ 5, while chi-squared tests were used for analyses when genes in CNVs were > 5; significance was set at p ≤ 4e-3 after Bonferroni correction for 14 tests.
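The testing rule in the footnote above can be sketched with the Python standard library alone. Several points here are assumptions rather than the authors' stated procedure: the "≤ 5" rule is read as applying when either the case or the control count is ≤ 5, the chi-squared test uses a Yates continuity correction, the Fisher test is two-sided, and all function names are hypothetical.

```python
import math

ALPHA = 0.05
N_TESTS = 14  # number of tests stated in the footnote


def bonferroni_threshold(alpha: float = ALPHA, n_tests: int = N_TESTS) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def chi2_yates_p(a: int, b: int, c: int, d: int) -> float:
    """p-value of a 2x2 chi-squared test (1 df) with Yates correction.
    Table layout: [[a, b], [c, d]]. The correction is an assumption."""
    n = a + b + c + d
    numerator = n * max(abs(a * d - b * c) - n / 2, 0.0) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    stat = numerator / denominator
    # Survival function of chi-squared with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2))


def fisher_exact_p(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value: sum of hypergeometric probabilities
    of all tables (fixed margins) no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    total = math.comb(row1 + row2, col1)

    def prob(k: int) -> float:
        return math.comb(row1, k) * math.comb(row2, col1 - k) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def cnv_row_p(case_hits: int, case_total: int,
              control_hits: int, control_total: int) -> float:
    """Choose the test by count size (assumed reading of the footnote)."""
    a, b = case_hits, case_total - case_hits
    c, d = control_hits, control_total - control_hits
    if min(case_hits, control_hits) <= 5:
        return fisher_exact_p(a, b, c, d)
    return chi2_yates_p(a, b, c, d)


threshold = bonferroni_threshold()          # 0.05 / 14
p_loss_2 = cnv_row_p(115, 835, 365, 2547)   # score 2, loss CNVs
p_gain_2 = cnv_row_p(288, 1357, 354, 1862)  # score 2, gain CNVs
```

Here 0.05 / 14 rounds to the stated 4e-3 threshold, and the two score-2 rows give p-values close to the tabulated .73 and .13. That agreement is consistent with a continuity-corrected chi-squared test but does not confirm which software settings were used.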
Figure 4. Common neurodevelopmental microdeletion/duplication syndromes with gene level NeuroSCORE.