Sourena Soheili-Nezhad1, Robert J van der Linden2, Marcel Olde Rikkert3,4, Emma Sprooten1, Geert Poelmans2.
Abstract
Aging, the greatest risk factor for Alzheimer's disease (AD), may lead to the accumulation of somatic mutations in neurons. We investigated whether somatic mutations, specifically in longer genes, are implicated in AD etiology. First, we modeled the theoretical likelihood of genes being affected by aging-induced somatic mutations, dependent on their length. We then tested this model and found that long genes are indeed more affected by somatic mutations and that their expression is more frequently reduced in AD brains. Furthermore, using gene-set enrichment analysis, we investigated the potential consequences of such long gene disruption. We found that long genes are involved in synaptic adhesion and other synaptic pathways that are predicted to be inhibited in the brains of AD patients. Taken together, our findings indicate that long gene-dependent synaptic impairment may contribute to AD pathogenesis.
Keywords: Alzheimer's disease; DNA damage; long genes; somatic mutations; synaptic adhesion
Year: 2020 PMID: 33075204 PMCID: PMC8048495 DOI: 10.1002/alz.12211
Source DB: PubMed Journal: Alzheimers Dement ISSN: 1552-5260 Impact factor: 21.566
FIGURE 1 Longer genes have an increased likelihood to be affected by sSNVs. (A) The length distribution of human genes has a long tail that extends toward a group of extremely long genes of 1‐2 mega base pairs. The 272 very long genes are indicated with open circles (see below), and gene length is given in base pairs (bp). Gene length information was retrieved from Ensembl Biomart (GENCODE v19, GRCh37.p13). (B) Gene length follows a log‐normal distribution with parameters μ = 4.35 (22.5 kb) and σ = 0.68 (dashed line). The outlier bin near 1 kb represents the large family of olfactory receptors that have gone through extreme evolutionary expansion. The 272 genes that are indicated by the open circles in 1B and in the shaded gray area under the curve in 1C show the subgroup of very long genes (genes with gene length > μ+2σ) that were used for the enrichment analyses in this study. (C) Binomial probability model for gene conservation over time, in which somatic mutations (sSNVs) take place at a fixed and uniform rate across the genome; age is given in years (y). An average‐sized gene mostly survives the mutational burden of aging, with only ≈1% of its copies being affected by somatic mutations in a 65‐year‐old subject. For longer genes, however, ≈60% of copies are expected to have been affected by at least one sSNV between the sixth and seventh decades of life. (D) sSNVs occur more often in longer genes (Kolmogorov‐Smirnov test: P < 1.0 × 10⁻⁴). Gene length distributions for genes having potential pathogenic sSNVs from the studies by Park et al. (red, 208 genes), Ivashko‐Pachima et al. (blue, 499 genes), Lodato et al. (green, 175 genes), and all human protein‐coding genes (black, 20535 genes) are shown. Circles following the same color code, plotted below the density graph, represent individual gene lengths. Box plots visualize the median with flanking lower and upper hinges (corresponding to the 25th and 75th percentiles), and the whiskers represent the 95% confidence interval.
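The length cutoff in panel (B) and the model in panel (C) can be illustrated with a short calculation. The sketch below assumes that μ and σ are on a log10 scale (10^4.35 is about 22 kb, which matches the caption) and treats every base pair in every year as an independent trial with a fixed mutation probability. The rate constant `HYPOTHETICAL_RATE` is an illustrative value chosen only so that an average‐sized gene lands near the ≈1% quoted for age 65; it is not taken from the paper, and the function names are likewise hypothetical.

```python
import math

MU_LOG10 = 4.35     # from the caption: mean of log10(gene length in bp)
SIGMA_LOG10 = 0.68  # from the caption: s.d. of log10(gene length in bp)

# ASSUMPTION: per-bp, per-year sSNV probability, picked for illustration only.
HYPOTHETICAL_RATE = 7e-9


def very_long_threshold_bp(mu=MU_LOG10, sigma=SIGMA_LOG10):
    """Length cutoff (bp) for 'very long' genes, i.e. mu + 2*sigma in log10 space."""
    return 10 ** (mu + 2 * sigma)


def prob_affected(length_bp, age_years, rate=HYPOTHETICAL_RATE):
    """Binomial model: P(at least one sSNV) = 1 - (1 - rate)^(length * age)."""
    n_trials = length_bp * age_years
    # log1p/expm1 keep precision when rate is tiny
    return -math.expm1(n_trials * math.log1p(-rate))


if __name__ == "__main__":
    print(f"very long cutoff: {very_long_threshold_bp():.0f} bp")
    for length in (10 ** MU_LOG10, very_long_threshold_bp(), 2_000_000):
        print(f"{length:>10.0f} bp at 65 y: {prob_affected(length, 65):.3f}")
```

Under these assumptions the μ+2σ cutoff falls at roughly 513 kb, and the affected fraction rises steeply with gene length, approaching the caption's ≈60% only for genes near 2 Mb.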
Data resource information for data used in this article
| Brain region | Details | Original paper |
|---|---|---|
| Dentate gyrus/prefrontal cortex | NIH NeuroBioBank; WGS of single isolated neuronal nuclei | Lodato et al. |
| Hippocampus | Netherlands Brain Bank and Human Brain and Spinal Fluid Resource Center; WES of laser capture microdissected hippocampal formations | Park et al. |
| Hippocampus | Banner Sun Health Research Institute; RNA‐seq based mutation analysis (from GSE67333) | Ivashko‐Pachima et al. |
| Cerebellum | Mayo Clinic (AMP‐AD); bulk tissue | Allen et al. |
| Temporal cortex | Mayo Clinic (AMP‐AD); bulk tissue | Allen et al. |
| Frontal pole (BA10) | Mount Sinai/JJ Peters VA Medical Center Brain Bank (AMP‐AD); bulk tissue | Wang et al. |
| Superior temporal gyrus (BA22) | Mount Sinai/JJ Peters VA Medical Center Brain Bank (AMP‐AD); bulk tissue | Wang et al. |
| Parahippocampal gyrus (BA36) | Mount Sinai/JJ Peters VA Medical Center Brain Bank (AMP‐AD); bulk tissue | Wang et al. |
| Inferior frontal gyrus (BA44) | Mount Sinai/JJ Peters VA Medical Center Brain Bank (AMP‐AD); bulk tissue | Wang et al. |
| Dorsolateral prefrontal cortex | Mount Sinai/JJ Peters VA Medical Center Brain Bank (AMP‐AD); bulk tissue | Mostafavi et al. |
| Hippocampus | Netherlands Brain Bank; bulk tissue | Van Rooij et al. |
| Entorhinal cortex | Victorian Brain Bank; single‐nucleus RNA sequencing | Grubman et al. |

NOTE. Abbreviations: AMP‐AD, Accelerating Medicines Partnership Alzheimer's Disease Project; BA, Brodmann area; RNA‐seq, RNA sequencing; sSNV, somatic single nucleotide variant; WES, whole exome sequencing; WGS, whole genome sequencing.
FIGURE 2 Long genes are significantly downregulated in AD‐relevant brain regions. Plots show differentially expressed genes, that is, genes that show significantly increased or decreased expression when comparing AD patients to non‐demented controls, from previously published RNA sequencing studies (Table 4). Protein‐coding genes are binned in 50 consecutive groups (gray bars), based on transcribed gene length. We compared the number of genes showing either increased or decreased expression in each bin (height of gray bar) with that of the total gene pool using hypergeometric tests (red circles; the Bonferroni threshold for significance is indicated with a dashed blue line).
FIGURE 3 Long genes are significantly downregulated in inhibitory neurons of the entorhinal cortex. Plots show differentially expressed genes, that is, genes that show significantly increased or decreased expression when comparing AD patients to non‐demented controls, in single inhibitory or excitatory neurons from the entorhinal cortex (Table 4). Protein‐coding genes are binned in 50 consecutive groups (gray bars), based on transcribed gene length. We compared the number of genes showing either increased or decreased expression in each bin (height of gray bar) with that of the total gene pool using hypergeometric tests (red circles; the Bonferroni threshold for significance is indicated with a dashed blue line).
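The binning procedure described in the captions for Figures 2 and 3 can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function names, the input layout (a list of `(length_bp, is_differentially_expressed)` pairs), the one‐sided upper‐tail test, and the family‐wise α of 0.05 used for the Bonferroni threshold are all assumptions. The hypergeometric tail is computed exactly with the standard library.

```python
from math import comb


def hypergeom_upper_tail(N, K, n, k):
    """P(X >= k) when drawing n genes from N, of which K are differentially expressed."""
    hi = min(K, n)
    num = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, hi + 1))
    return num / comb(N, n)


def binned_enrichment(genes, n_bins=50, alpha=0.05):
    """Sort genes by length, split into n_bins consecutive groups, test each bin.

    genes: list of (length_bp, is_de) pairs (hypothetical input layout).
    Returns one (bin_index, bin_size, n_de_in_bin, p_value, significant) per bin.
    """
    ordered = sorted(genes, key=lambda g: g[0])
    N = len(ordered)
    K = sum(1 for _, is_de in ordered if is_de)
    threshold = alpha / n_bins  # Bonferroni correction over the bins (assumed alpha)
    results = []
    for b in range(n_bins):
        chunk = ordered[b * N // n_bins:(b + 1) * N // n_bins]
        n = len(chunk)
        k = sum(1 for _, is_de in chunk if is_de)
        p = hypergeom_upper_tail(N, K, n, k)
        results.append((b, n, k, p, p < threshold))
    return results


if __name__ == "__main__":
    # Synthetic toy data: 1000 genes, only the 100 longest are "differentially expressed".
    toy = [(length, length > 900) for length in range(1, 1001)]
    for row in binned_enrichment(toy)[-3:]:
        print(row)
```

In the toy data only the bins that hold the longest genes pass the corrected threshold, which is the pattern the figures report for downregulated genes.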
Over‐ and underrepresentation of very long genes in genes differentially expressed in the brain of AD patients
| Brain region | Number of genes detected (very long) | Number of differentially expressed genes (very long) | Over‐/underrepresentation |
|---|---|---|---|
| Cerebellum | 14291 (258) | 5128 (63) | ‐1.47 ( |
| Temporal cortex | 14292 (258) | 6129 (143) | 1.29 ( |
| Frontal pole (BA10) | 13788 (263) | 334 (5) | ‐1.27 ( |
| Superior temporal gyrus (BA22) | 13789 (263) | 688 (20) | 1.52 ( |
| Parahippocampal gyrus (BA36) | 13789 (263) | 4814 (134) | 1.46 ( |
| Inferior frontal gyrus (BA44) | 13789 (263) | 151 (3) | 1.04 ( |
| Dorsolateral prefrontal cortex | 13512 (250) | 1647 (22) | ‐1.39 ( |
| Hippocampus | 14533 (250) | 7411 (156) | 1.22 ( |
NOTE. A hypergeometric test was performed to generate the P‐values of over‐ and underrepresentation.
Abbreviation: AD, Alzheimer's disease.
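The over‐/underrepresentation values in this table are consistent with an observed‐to‐expected ratio of very long genes among the differentially expressed genes, written as a negative reciprocal when the ratio is below 1. That reading is inferred from the counts in the table and is not stated in the source. A minimal sketch of the calculation, together with an exact hypergeometric tail probability, using hypothetical function names and only the Python standard library:

```python
from math import comb


def signed_fold(N, K, n, k):
    """Observed/expected very long genes among DE genes; negative reciprocal if < 1.

    N: genes detected, K: very long genes detected,
    n: differentially expressed (DE) genes, k: very long DE genes.
    """
    expected = K * n / N
    if k == 0:
        return float("-inf")  # no very long DE genes observed
    ratio = k / expected
    return ratio if ratio >= 1 else -1 / ratio


def hypergeom_tail(N, K, n, k, upper=True):
    """Exact P(X >= k) if upper, else P(X <= k), for a hypergeometric X."""
    lo, hi = max(0, n - (N - K)), min(K, n)
    support = range(k, hi + 1) if upper else range(lo, k + 1)
    num = sum(comb(K, i) * comb(N - K, n - i) for i in support)
    return num / comb(N, n)


if __name__ == "__main__":
    # Temporal cortex row: 14292 detected (258 very long), 6129 DE (143 very long)
    print(round(signed_fold(14292, 258, 6129, 143), 2))  # 1.29
    # Cerebellum row: 14291 detected (258 very long), 5128 DE (63 very long)
    print(round(signed_fold(14291, 258, 5128, 63), 2))   # -1.47
    print(hypergeom_tail(14292, 258, 6129, 143, upper=True))
    print(hypergeom_tail(14291, 258, 5128, 63, upper=False))
```

The two fold values reproduce the corresponding table entries. Whether the original P‐values were one‐ or two‐sided is not stated, so the tails computed here are only one possible convention.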
Genes that are affected by sSNVs in the hippocampus and differentially expressed in the AD hippocampus
| sSNV study | Decreased mRNA expression in the AD hippocampus | Increased mRNA expression in the AD hippocampus |
|---|---|---|
| Park et al. |
|
|
| Ivashko‐Pachima et al. |
|
|
Abbreviations: AD, Alzheimer's disease; sSNV, somatic single nucleotide variant.
Canonical pathway enrichment analysis for differentially expressed genes in brain regions of AD patients
|
| Brain region | |||||||
|---|---|---|---|---|---|---|---|---|
| Cerebellum | Temporal cortex | Frontal pole (BA10) | Superior temporal gyrus (BA22) | Parahippocampal gyrus (BA36) | Inferior frontal gyrus (BA44) | DLPFC | Hippocampus | |
|
|
Z‐score = 0.816 |
|
|
Z‐score = ‐1.706 |
|
|
Z‐score = 0.426 |
|
|
|
Z‐score = ND |
Z‐score = ‐0.707 |
|
|
Z‐score = ‐1.265 |
|
Z‐score = 0.258 |
|
|
|
Z‐score = ND |
Z‐score = ‐1.633 |
|
|
Z‐score = ‐1.890 |
|
Z‐score = 0 |
|
|
|
Z‐score = ND |
Z‐score = ‐1.342 |
|
|
Z‐score = ‐1.633 |
|
Z‐score = ‐0.277 |
|
|
|
Z‐score = ND |
Z‐score = ND |
|
|
Z‐score = ND |
|
Z‐score = ND |
|
NOTE. Results are shown for the five most significantly enriched synaptic pathways among the 272 longest genes in the genome (Supplementary Table 1). Significant P‐values (P < 0.05) and Z‐scores (Z ≤ ‐2 or Z ≥ 2) are indicated in bold.
Abbreviations: AD, Alzheimer's disease; BA, Brodmann area; DLPFC, dorsolateral prefrontal cortex; ND, Not determined.