Young Won Kim, Ismael Al-Ramahi, Amanda Koire, Stephen J Wilson, Daniel M Konecki, Samantha Mota, Shirin Soleimani, Juan Botas, Olivier Lichtarge.
Abstract
The strongest genetic risk factor for idiopathic late-onset Alzheimer's disease (LOAD) is apolipoprotein E (APOE) ɛ4, while the APOE ɛ2 allele is protective. However, there are paradoxical APOE ɛ4 carriers who remain disease-free and APOE ɛ2 carriers with LOAD. We compared exomes of healthy APOE ɛ4 carriers and APOE ɛ2 Alzheimer's disease (AD) patients, prioritizing coding variants based on their predicted functional impact, and identified 216 genes with differential mutational load between these two populations. These candidate genes were significantly dysregulated in LOAD brains, and many modulated tau- or β42-induced neurodegeneration in Drosophila. Variants in these genes were associated with AD risk, even in APOE ɛ3 homozygotes, showing robust predictive power for risk stratification. Network analyses revealed involvement of candidate genes in brain cell type-specific pathways including synaptic biology, dendritic spine pruning and inflammation. These potential modifiers of LOAD may constitute novel biomarkers, provide potential therapeutic intervention avenues, and support applying this approach as larger whole exome sequencing cohorts become available.
Keywords: Drosophila models; apolipoprotein E; late-onset Alzheimer's disease; paradoxical phenotypes
Year: 2020 PMID: 33576571 PMCID: PMC8247413 DOI: 10.1002/alz.12240
Source DB: PubMed Journal: Alzheimers Dement ISSN: 1552-5260 Impact factor: 21.566
FIGURE 1 Identification and validation of genes with differential functional mutational load in the paradoxical AD‐ɛ2 versus HC‐ɛ4 population. (A) For each gene, the mutational burden, defined as the sum of all evolutionary action (EA) scores, was plotted against the number of all coding variants observed in that gene, and a regression line was fitted to establish the expected mutational burden given the background mutation rate across the two paradoxical patient groups. Then, in each patient group (AD‐ɛ2 vs HC‐ɛ4), the observed mutational burden and the background mutation rate for each gene were plotted, and the distance (d) from the regression line was measured and compared. To control for noise from passive mutations, we assessed the significance of each gene's signal by calculating a z‐score from randomizing the labels of AD‐ɛ2 and HC‐ɛ4 patients 100 times to build a background distribution of imputed deviation in evolutionary action load (iDEAL) scores; 216 genes had an absolute z‐score above 2.5 (>99th percentile). For more detail, refer to Detailed Methods. (B) Enrichment of iDEAL genes for differentially expressed genes (DEG) in Alzheimer's disease (AD) brains from the Accelerating Medicines Partnership‐Alzheimer's Disease sequence repository. DEG were defined as genes significantly up‐ or downregulated using an adjusted P‐value cutoff of .01 in at least one brain region. A hypergeometric test was used to assess enrichment. (C) Enrichment of iDEAL genes for first‐degree neighbors of genome‐wide association study AD genes. STRING v11 was used to construct the protein‐protein interaction (PPI) network. Interaction sources are Textmining, Experiments, and Databases, with interaction scores above 0.400. For calculation of the z‐score, see Figure S4A in supporting information.
(D) The average worsening or amelioration (%) of neurodegeneration, measured as the loss in climbing speed of Drosophila expressing either secreted β42 (left) or human 2N4R tau (right) together with the indicated allele in the Drosophila homolog of the gene shown. β42/scramble or tau/scramble animals are used as the reference (error bars indicate standard deviation). Seven genes that showed conflicting evidence are not included.
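The label-permutation z‐score described in panel (A) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the input format (a list of hypothetical `(patient, gene, ea_score)` tuples), the function names, and the use of the vertical residual as the "distance (d)" from the regression line are all assumptions made for the example.

```python
import random
import statistics

def fit_line(xs, ys):
    """Least-squares fit y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def gene_points(variants, members):
    """Per gene: (variant count, summed EA score) over the given patients."""
    pts = {}
    for patient, gene, ea in variants:
        if patient in members:
            n, s = pts.get(gene, (0, 0.0))
            pts[gene] = (n + 1, s + ea)
    return pts

def deviation_difference(variants, group_a, group_b, line):
    """For each gene, d(A) - d(B), with d the vertical residual (assumption)."""
    a, b = line
    pa, pb = gene_points(variants, group_a), gene_points(variants, group_b)
    diffs = {}
    for gene in set(pa) | set(pb):
        na, sa = pa.get(gene, (0, 0.0))
        nb, sb = pb.get(gene, (0, 0.0))
        diffs[gene] = (sa - (a + b * na)) - (sb - (a + b * nb))
    return diffs

def permutation_z(variants, group_a, group_b, n_perm=100, seed=0):
    """z-score of each gene's observed difference against shuffled group labels."""
    rng = random.Random(seed)
    everyone = sorted(group_a | group_b)
    # Regression line fitted on both groups pooled (expected burden per variant count).
    pooled = gene_points(variants, set(everyone))
    genes = sorted(pooled)
    line = fit_line([pooled[g][0] for g in genes], [pooled[g][1] for g in genes])
    observed = deviation_difference(variants, group_a, group_b, line)
    null = {g: [] for g in genes}
    for _ in range(n_perm):
        rng.shuffle(everyone)  # randomize labels, keeping group sizes fixed
        fake_a = set(everyone[:len(group_a)])
        fake_b = set(everyone[len(group_a):])
        shuffled = deviation_difference(variants, fake_a, fake_b, line)
        for g in genes:
            null[g].append(shuffled[g])
    z = {}
    for g in genes:
        sd = statistics.pstdev(null[g])
        z[g] = (observed[g] - statistics.fmean(null[g])) / sd if sd > 0 else 0.0
    return z
```

Under these assumptions, genes whose absolute z‐score exceeds a chosen cutoff (the caption uses 2.5) would be retained as candidates. The fit needs at least two genes with different variant counts.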
FIGURE 2 Receiver operating characteristic (ROC) curves for imputed deviation in evolutionary action load (iDEAL) genes as diagnostic markers. An AdaBoost‐SVM algorithm was trained to classify individuals with (A) AD‐ɛ2 versus HC‐ɛ4, (C) AD‐ɛ2 versus HC‐ɛ2, and (D) HC‐ɛ4 versus AD‐ɛ4, using aggregate evolutionary action (EA) burden in the 216 iDEAL genes as features in a five‐fold cross‐validation. The blue line represents the mean ROC curve, and the gray area represents ±1 standard deviation (std. dev.). The red dotted line represents the ROC curve for a random classifier (area under the curve [AUC] = 0.50). (B) Permutation feature importance returned 94 genes that positively contributed to risk prediction. Data shown as mean ± standard error of the mean of five folds.
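The evaluation scheme in this caption (five‐fold cross‐validated AUC plus permutation feature importance) can be illustrated with a small standard-library sketch. Note the substitutions: the AdaBoost‐SVM classifier is replaced here by a deliberately simple stand-in scorer (`centroid_scorer`, a hypothetical name), and importance is computed on the full data set rather than per fold, so this shows only the shape of the procedure, not the authors' model.

```python
import random
import statistics

def auc(labels, scores):
    """Rank-based AUC: chance that a case outscores a control (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with roughly equal class ratios."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def centroid_scorer(train_X, train_y):
    """Stand-in model (NOT AdaBoost-SVM): project onto case mean minus control mean."""
    def mean(cls, f):
        return statistics.fmean(x[f] for x, y in zip(train_X, train_y) if y == cls)
    w = [mean(1, f) - mean(0, f) for f in range(len(train_X[0]))]
    return lambda x: sum(wi * xi for wi, xi in zip(w, x))

def cross_validated_auc(X, y, fit=centroid_scorer, k=5, seed=0):
    """Mean and standard deviation of the held-out AUC across k folds."""
    aucs = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        aucs.append(auc([y[i] for i in test_idx], [model(X[i]) for i in test_idx]))
    return statistics.fmean(aucs), statistics.pstdev(aucs)

def permutation_importance(X, y, feature, fit=centroid_scorer, n_repeats=10, seed=0):
    """Mean drop in AUC when one feature column is shuffled across samples."""
    rng = random.Random(seed)
    model = fit(X, y)
    base = auc(y, [model(x) for x in X])
    drops = []
    for _ in range(n_repeats):
        col = [x[feature] for x in X]
        rng.shuffle(col)
        shuffled = [x[:feature] + [v] + x[feature + 1:] for x, v in zip(X, col)]
        drops.append(base - auc(y, [model(x) for x in shuffled]))
    return statistics.fmean(drops)
```

Here each row of `X` would hold one individual's aggregate EA burden per gene (as a list of floats) and `y` the case/control label. A feature with a positive importance is one whose shuffling lowers the AUC, which is the sense in which a gene "positively contributes" in this sketch. Each fold must contain both classes, so at least `k` samples per class are needed.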
FIGURE 3 Network‐based functional enrichment of imputed deviation in evolutionary action load (iDEAL) genes. (A) Network built with STRING from those of the 216 candidate genes that interacted with each other (stringency 0.4). Gene modules were established by applying the Markov cluster algorithm with an inflation of 1.9. Functional enrichment analysis was then performed for each cluster. Fifteen of 26 clusters showed functional enrichment (indicated in blue font) with FDR q‐value < 0.05 (Table S5 in supporting information). Information on drug availability (rhomboid‐shaped nodes), ability to ameliorate (green outline) or worsen (red outline) neurodegeneration in vivo, and whether the genes were among the most significant in patient stratification (orange nodes) is superimposed on the network. (B) Examples of coexpression communities and their functional enrichment in specific brain‐cell types. Green nodes indicate iDEAL genes, gray nodes indicate their first‐degree coexpressed neighbors, light blue nodes indicate first neighbors of iDEAL genes that are also central players in AD biology, and purple nodes are genome‐wide association study candidates that are first coexpression neighbors of iDEAL genes. Edges represent weighted coexpression. Based on networks built by McKenzie et al. (Cell images modified from Servier Medical Art in accordance with the Creative Commons license.)
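For readers unfamiliar with the Markov cluster algorithm mentioned in panel (A), the following is a minimal dense-matrix sketch of its expansion and inflation steps. It is a didactic toy, not the software used for the figure: the function names, the self-loop weight of 1.0, the convergence tolerance, and the cluster-extraction threshold are assumptions, and only the inflation value 1.9 comes from the caption.

```python
def normalize_columns(m):
    """Scale each column to sum to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]

def markov_cluster(nodes, edges, inflation=1.9, max_iter=100, tol=1e-9):
    """Cluster an undirected weighted graph given as (u, v, weight) edges."""
    index = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    # Start from the adjacency matrix with self-loops (weight 1.0 is an assumption).
    m = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        m[index[u]][index[v]] = m[index[v]][index[u]] = w
    m = normalize_columns(m)
    for _ in range(max_iter):
        # Expansion: square the matrix, i.e. take two random-walk steps.
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
                    for i in range(n)]
        # Inflation: raise entries to a power and renormalize, favoring strong flow.
        inflated = normalize_columns([[x ** inflation for x in row] for row in expanded])
        change = max(abs(inflated[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = inflated
        if change < tol:
            break
    # Rows that retain flow are attractors; their nonzero columns form a cluster.
    clusters = set()
    for row in m:
        members = frozenset(nodes[j] for j, x in enumerate(row) if x > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)
```

In this sketch `edges` would carry the STRING interaction scores as weights. Higher inflation values generally yield more, smaller clusters, which is why the parameter is reported alongside the result.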