Wouter van Rheenen1, Rick A A van der Spek2, Mark K Bakker2, Joke J F A van Vugt2, Paul J Hop2, Ramona A J Zwamborn2, Niek de Klein3, Harm-Jan Westra3, Olivier B Bakker3, Patrick Deelen3,4, Gemma Shireby5, Eilis Hannon5, Matthieu Moisse6,7,8, Denis Baird9,10, Restuadi Restuadi11, Egor Dolzhenko12, Annelot M Dekker2, Klara Gawor2, Henk-Jan Westeneng2, Gijs H P Tazelaar2, Kristel R van Eijk2, Maarten Kooyman2, Ross P Byrne13, Mark Doherty13, Mark Heverin14, Ahmad Al Khleifat15, Alfredo Iacoangeli15,16,17, Aleksey Shatunov15, Nicola Ticozzi18,19, Johnathan Cooper-Knock20, Bradley N Smith15, Marta Gromicho21, Siddharthan Chandran22,23, Suvankar Pal22,23, Karen E Morrison24, Pamela J Shaw20, John Hardy25, Richard W Orrell26, Michael Sendtner27, Thomas Meyer28, Nazli Başak29, Anneke J van der Kooi30, Antonia Ratti18,31, Isabella Fogh15, Cinzia Gellera32, Giuseppe Lauria33,34, Stefania Corti19,35, Cristina Cereda36, Daisy Sproviero36, Sandra D'Alfonso37, Gianni Sorarù38, Gabriele Siciliano39, Massimiliano Filosto40, Alessandro Padovani40, Adriano Chiò41,42, Andrea Calvo41,42, Cristina Moglia41,42, Maura Brunetti41, Antonio Canosa41,42, Maurizio Grassano41, Ettore Beghi43, Elisabetta Pupillo43, Giancarlo Logroscino44, Beatrice Nefussy45, Alma Osmanovic46,47, Angelica Nordin48, Yossef Lerner49,50, Michal Zabari49,50, Marc Gotkine49,50, Robert H Baloh51,52, Shaughn Bell51,52, Patrick Vourc'h53,54, Philippe Corcia54,55, Philippe Couratier56,57, Stéphanie Millecamps58, Vincent Meininger59, François Salachas58,60, Jesus S Mora Pardina61, Abdelilah Assialioui62, Ricardo Rojas-García63, Patrick A Dion64,65, Jay P Ross64,66, Albert C Ludolph67, Jochen H Weishaupt68, David Brenner68, Axel Freischmidt67,69, Gilbert Bensimon70,71,72,73, Alexis Brice74, Alexandra Durr74, Christine A M Payan70, Safa Saker-Delye75, Nicholas W Wood76, Simon Topp15, Rosa Rademakers77, Lukas Tittmann78, Wolfgang Lieb78, Andre Franke79, Stephan Ripke80,81,82, Alice Braun82, Julia Kraft82, David C 
Whiteman83, Catherine M Olsen83, Andre G Uitterlinden84,85, Albert Hofman85, Marcella Rietschel86,87, Sven Cichon88,89,90,91, Markus M Nöthen88,89, Philippe Amouyel92, Bryan J Traynor93,94, Andrew B Singleton95, Miguel Mitne Neto96, Ruben J Cauchi97, Roel A Ophoff98,99,100, Martina Wiedau-Pazos101, Catherine Lomen-Hoerth102, Vivianna M van Deerlin103, Julian Grosskreutz104,105, Annekathrin Roediger104, Nayana Gaur104, Alexander Jörk104, Tabea Barthel104, Erik Theele104, Benjamin Ilse104, Beatrice Stubendorff104, Otto W Witte104, Robert Steinbach104, Christian A Hübner106, Caroline Graff107, Lev Brylev108,109,110, Vera Fominykh108,110, Vera Demeshonok111, Anastasia Ataulina108, Boris Rogelj112,113,114, Blaž Koritnik115, Janez Zidar115, Metka Ravnik-Glavač116, Damjan Glavač117, Zorica Stević118, Vivian Drory45,119, Monica Povedano62, Ian P Blair120, Matthew C Kiernan121, Beben Benyamin11,122, Robert D Henderson123,124, Sarah Furlong120, Susan Mathers125, Pamela A McCombe124,126, Merrilee Needham127,128,129, Shyuan T Ngo123,124,126, Garth A Nicholson120,130,131, Roger Pamphlett132, Dominic B Rowe120, Frederik J Steyn124,133, Kelly L Williams120, Karen A Mather134,135, Perminder S Sachdev134,136, Anjali K Henders11, Leanne Wallace11, Mamede de Carvalho21, Susana Pinto21, Susanne Petri46, Markus Weber137, Guy A Rouleau64,65,66, Vincenzo Silani18,19, Charles J Curtis138,139, Gerome Breen138,139, Jonathan D Glass140, Robert H Brown141, John E Landers141, Christopher E Shaw15, Peter M Andersen48, Ewout J N Groen2, Michael A van Es2, R Jeroen Pasterkamp142, Dongsheng Fan143, Fleur C Garton11, Allan F McRae11, George Davey Smith10,144, Tom R Gaunt10,144, Michael A Eberle12, Jonathan Mill5, Russell L McLaughlin13, Orla Hardiman14, Kevin P Kenna2,142, Naomi R Wray11,126, Ellen Tsai9, Heiko Runz9, Lude Franke3, Ammar Al-Chalabi15,145, Philip Van Damme6,7,8, Leonard H van den Berg2, Jan H Veldink146.
Abstract
Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease with a lifetime risk of one in 350 people and an unmet need for disease-modifying therapies. We conducted a cross-ancestry genome-wide association study (GWAS) including 29,612 patients with ALS and 122,656 controls, which identified 15 risk loci. When combined with 8,953 individuals with whole-genome sequencing (6,538 patients, 2,415 controls) and a large cortex-derived expression quantitative trait locus (eQTL) dataset (MetaBrain), analyses revealed locus-specific genetic architectures in which we prioritized genes either through rare variants, short tandem repeats or regulatory effects. ALS-associated risk loci were shared with multiple traits within the neurodegenerative spectrum but with distinct enrichment patterns across brain regions and cell types. Of the environmental and lifestyle risk factors obtained from the literature, Mendelian randomization analyses indicated a causal role for high cholesterol levels. The combination of all ALS-associated signals reveals a role for perturbations in vesicle-mediated transport and autophagy and provides evidence for cell-autonomous disease initiation in glutamatergic neurons.
Year: 2021 PMID: 34873335 PMCID: PMC8648564 DOI: 10.1038/s41588-021-00973-1
Source DB: PubMed Journal: Nat Genet ISSN: 1061-4036 Impact factor: 38.330
Extended Data Fig. 1: Manhattan plot in European ancestries GWAS.
Genome-wide association statistics obtained by inverse-variance weighted meta-analysis of the stratified SAIGE logistic mixed model regression in European ancestry cohorts. The y axis corresponds to the two-tailed −log10(P-value); the x axis corresponds to the genomic coordinates (GRCh37). Loci containing a genome-wide significant SNP are highlighted in red. SNP IDs are the top associated SNPs in each locus. The dotted horizontal line reflects the threshold for genome-wide significance (P = 5 × 10⁻⁸).
Source data
Genome-wide significant loci
| Chr | Position (bp) | ID | Prioritized gene | A1 | A2 | Freq | European ancestries: effect (s.e.) | European ancestries: P | Asian ancestries: effect (s.e.) | Asian ancestries: P | Cross-ancestry: effect (s.e.) | Cross-ancestry: P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 9 | 27,563,868 | | | A | G | 0.248 | 0.174 (0.013) | 1.0 × 10⁻⁴³ | 0.017 (0.066) | 0.80 | 0.168 (0.012) | 1.5 × 10⁻⁴¹ |
| 19 | 17,752,689 | | | C | A | 0.347 | 0.125 (0.012) | 8.8 × 10⁻²⁵ | 0.074 (0.038) | 0.053 | 0.120 (0.012) | 3.0 × 10⁻²⁵ |
| 21 | 33,039,603 | | | C | A | 0.006 | 1.078 (0.124) | 3.5 × 10⁻¹⁸ | – | – | – | – |
| 14 | 31,045,596 | | | A | G | 0.337 | 0.091 (0.012) | 9.2 × 10⁻¹⁵ | – | – | – | – |
| 14 | 31,045,181 | | | A | G | 0.337 | 0.091 (0.012) | 9.2 × 10⁻¹⁵ | 0.002 (0.036) | 0.97 | 0.083 (0.011) | 1.5 × 10⁻¹³ |
| 3 | 39,508,968 | | | G | A | 0.291 | 0.079 (0.012) | 5.2 × 10⁻¹¹ | 0.084 (0.036) | 0.020 | 0.080 (0.011) | 3.3 × 10⁻¹² |
| 6 | 32,672,641 | | | C | A | 0.096 | −0.143 (0.021) | 5.5 × 10⁻¹² | −0.110 (0.111) | 0.32 | −0.142 (0.02) | 3.5 × 10⁻¹² |
| 12 | 57,975,700 | | | T | A | 0.016 | 0.332 (0.049) | 1.4 × 10⁻¹¹ | – | – | – | – |
| 21 | 45,753,117 | | | A | C | 0.012 | 0.418 (0.063) | 2.7 × 10⁻¹¹ | – | – | – | – |
| 5 | 150,410,835 | | | C | T | 0.253 | 0.079 (0.013) | 3.5 × 10⁻¹⁰ | 0.042 (0.036) | 0.24 | 0.075 (0.012) | 2.7 × 10⁻¹⁰ |
| 20 | 48,438,761 | | | A | T | 0.353 | 0.074 (0.012) | 3.5 × 10⁻¹⁰ | 0.045 (0.076) | 0.55 | 0.073 (0.012) | 3.2 × 10⁻¹⁰ |
| 12 | 64,877,053 | | | A | T | 0.112 | −0.098 (0.018) | 1.7 × 10⁻⁸ | −0.216 (0.090) | 0.017 | −0.103 (0.017) | 2.1 × 10⁻⁹ |
| 5 | 172,354,731 | | | C | T | 0.397 | −0.065 (0.011) | 8.5 × 10⁻⁹ | −0.067 (0.074) | 0.37 | −0.065 (0.011) | 5.6 × 10⁻⁹ |
| 4 | 170,583,157 | | | A | G | 0.335 | 0.063 (0.012) | 7.0 × 10⁻⁸ | 0.203 (0.070) | 3.8 × 10⁻³ | 0.067 (0.012) | 6.9 × 10⁻⁹ |
| 13 | 46,113,984 | | | C | T | 0.259 | 0.066 (0.013) | 1.9 × 10⁻⁷ | 0.100 (0.041) | 0.014 | 0.069 (0.012) | 1.2 × 10⁻⁸ |
| 7 | 157,481,780 | | | G | C | 0.124 | 0.076 (0.017) | 5.8 × 10⁻⁶ | 0.132 (0.037) | 2.9 × 10⁻⁴ | 0.086 (0.015) | 1.8 × 10⁻⁸ |
Details of two-sided SAIGE logistic mixed model regression for the top associated SNPs within each genome-wide significant locus (P < 5 × 10⁻⁸). (a) For the strongest associated SNP in the SCFD1 locus, rs229195 (MAF = 0.337), details of the LD proxy rs229194 are described (MAF = 0.337, r² = 0.996 in Asian ancestries), as only the LD proxy was present in the Asian ancestry GWAS. The low-frequency SNPs rs80265967, rs113247976 and rs75087725 were not present in the Asian ancestry GWAS, and no LD proxies (r² > 0.8) were found. Chr, chromosome; Position, base-pair position in the reference genome GRCh37; A1, effect allele; A2, non-effect allele; Freq, frequency of the effect allele in the European ancestry GWAS; s.e., standard error of the effect estimate.
Fig. 1: Manhattan plot of cross-ancestry meta-analysis.
Genome-wide association statistics obtained by IVW meta-analysis of the stratified SAIGE logistic mixed model regression. The y axis corresponds to two-tailed −log10 (P values); the x axis corresponds to genomic coordinates (GRCh37). The horizontal dashed line reflects the threshold for calling genome-wide significant SNPs (P = 5 × 10⁻⁸). Color coding and gene labels reflect those prioritized by the gene-prioritization analysis. Labels in bold indicate genes with known highly pathogenic mutations for ALS. SAIGE = Scalable and Accurate Implementation of Generalized mixed model software package.
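The inverse-variance weighted (IVW) pooling named in this caption can be sketched in a few lines. This is a minimal fixed-effect version, assuming that each cohort contributes one effect estimate with its standard error; the function name `ivw_meta` is illustrative, and the study's actual pipeline (stratified SAIGE output, per-stratum quality control) is not reproduced here. The inputs are the European and Asian ancestry estimates for the chromosome 3 locus in the genome-wide significant loci table.

```python
import math

def ivw_meta(estimates):
    """Fixed-effect inverse-variance weighted meta-analysis.

    estimates: list of (beta, se) pairs, one per cohort.
    Returns (pooled beta, pooled s.e., two-sided P value).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided tail probability of a standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p

# Chromosome 3 locus: European (0.079, s.e. 0.012) and Asian (0.084, s.e. 0.036)
beta, se, p = ivw_meta([(0.079, 0.012), (0.084, 0.036)])
print(f"pooled effect = {beta:.4f} (s.e. {se:.4f}), P = {p:.1e}")
```

Because the inputs are rounded to three decimals, the pooled values only approximate the tabulated cross-ancestry estimate of 0.080 (0.011).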
Extended Data Fig. 2: Annotation-specific heritability enrichment.
Enrichment of SNP-based heritability was calculated with LD-score regression. The grey dashed line represents no enrichment (enrichment = 1). Error bars denote the standard error of the enrichment estimate. Nominally statistically significant enrichment estimates (two-sided P < 0.05) are marked with an asterisk (Conserved_LindbladToh P = 6.5 × 10⁻⁵, SuperEnhancer_Hnisz P = 0.014, TFBS_ENCODE P = 0.017, H3K4me1_peaks_Trynka P = 0.018, Coding_UCSC P = 0.028, H3K9ac_Trynka P = 0.037). The category Conserved_LindbladToh was significant after Bonferroni correction for multiple testing across all categories (N = 28). Due to the regression framework in LDSC, enrichment estimates < 0 are possible (with large standard errors).
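The Bonferroni statement in this caption can be checked directly from the quoted numbers. The sketch below assumes a family-wise alpha of 0.05, which the caption does not state explicitly, and uses only the six nominal P values listed above.

```python
# Nominal P values quoted in the caption
p_values = {
    "Conserved_LindbladToh": 6.5e-5,
    "SuperEnhancer_Hnisz": 0.014,
    "TFBS_ENCODE": 0.017,
    "H3K4me1_peaks_Trynka": 0.018,
    "Coding_UCSC": 0.028,
    "H3K9ac_Trynka": 0.037,
}

n_tests = 28   # number of annotation categories tested
alpha = 0.05   # assumed family-wise error rate

# Bonferroni: each test must pass alpha divided by the number of tests
threshold = alpha / n_tests
significant = [name for name, p in p_values.items() if p < threshold]

print(f"threshold = {threshold:.5f}")
print("significant after correction:", significant)
```

Under these assumptions, only Conserved_LindbladToh falls below the corrected threshold, in line with the caption.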
Extended Data Fig. 3: PRS stratified by rare variant carrier status.
Distribution of PRS in controls and ALS patients with or without one or more rare variants in ALS risk genes. There was no statistically significant difference in PRS between ALS patients with and without rare variants in ALS risk genes (labeled as gene_mut or gene_wt, respectively). In total, 5,112 ALS patients and 2,132 controls from stratum 6 with whole-genome sequencing data available were included. For SOD1, TARDBP, FUS, NEK1, TBK1, and CFAP410, rare variants were included according to the model that yielded the strongest association in the rare variant burden association analyses. For C9orf72, patients with the pathogenic hexanucleotide repeat expansion were compared to those without the expansion. The ‘any ALS gene’ category groups together all patients with a rare variant in any of the ALS risk genes. P-values for difference in PRS were derived by two-tailed logistic regression. The number of ALS patients carrying a rare variant per gene is denoted in the corresponding panel. Intervals for boxplots: center = median, box = lower and upper quartile, hinges = median ± 2 * IQR, IQR = interquartile range.
Extended Data Fig. 4: NEK1 repeat distribution.
The frequencies of STR alleles in ALS cases and controls are shown. A repeat length of 11 and longer was used as the optimal threshold for disease-associated genotype. The P-value was calculated by Firth logistic regression and FDR correction over all possible thresholds. The y axis shows the allele frequency of repeat lengths. The repeat position on GRCh37 and the repeat motif are shown.
Fig. 2: Genetic modifier analyses.
a, Cox proportional HRs for genome-wide significant SNPs (brown, n = 15), PRSs (red, n = 2) and rare variant burden in ALS-risk genes (pink, n = 7) on survival (months) tested in 6,095 patients with ALS. Estimated HRs are displayed with error bars corresponding to 95% CIs. Higher HRs correspond to shorter survival times. b, Effect estimates from a linear regression model of age at onset (years) in 6,095 patients with ALS. Lower effect estimates correspond to a younger age at onset. Effect estimates from linear regression are displayed with error bars corresponding to 95% CIs. The risk-increasing allele for ALS corresponds to the effect allele for both survival and age-at-onset analyses.
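The hazard ratios and 95% CIs in panel a follow from the Cox model coefficients in the usual way: the coefficient is a log hazard ratio, so both the estimate and its interval bounds are exponentiated. A minimal sketch, assuming a normal approximation with a multiplier of 1.96; the function name and the input values are hypothetical and are not taken from the study.

```python
import math

def hazard_ratio_ci(log_hr, se, z=1.96):
    """Convert a Cox log hazard ratio and its s.e. into an HR with a 95% CI."""
    hr = math.exp(log_hr)
    lower = math.exp(log_hr - z * se)
    upper = math.exp(log_hr + z * se)
    return hr, lower, upper

# Hypothetical coefficient, for illustration only (not a value from the study)
hr, lower, upper = hazard_ratio_ci(log_hr=0.10, se=0.04)
print(f"HR = {hr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that excludes 1 corresponds to a nominally significant effect on survival; for the linear age-at-onset model in panel b the same interval is used without exponentiation.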
Extended Data Fig. 5: Genetic correlations between brain diseases.
Correlation matrix for genetic correlation estimates obtained from bivariate LD score regression. Colors correspond to genetic correlation estimates. Strongest clusters appear between neurodegenerative diseases and within the psychiatric traits. ALS = amyotrophic lateral sclerosis, FTD = frontotemporal dementia, PSP = progressive supranuclear palsy, PD = Parkinson’s disease, CBD = corticobasal degeneration, AD = (clinically diagnosed) Alzheimer’s disease, MS = multiple sclerosis, IS = ischemic stroke (any), ICH = intracerebral hemorrhage, IA = intracranial aneurysm (any), AN = anorexia nervosa, OCD = obsessive compulsive disorder, Anxiety = anxiety disorder (score), PTSD = post-traumatic stress disorder, MDD = major depressive disorder, BIP = bipolar disorder, SCZ = schizophrenia, TS = Tourette’s syndrome, ASD = autism spectrum disorder, ADHD = attention-deficit hyperactivity disorder.
Fig. 3: Shared genetic risk between ALS and neurodegenerative diseases.
a, Genetic correlation analysis. Genetic correlation was estimated with LDSC between each pair of neurodegenerative diseases (ALS, AD, CBD, PD, PSP and FTD). Correlations marked with an asterisk reached nominal statistical significance (ALS and AD, P = 0.01; ALS and PD, P = 0.01; ALS and PSP, P = 0.0001; PSP and PD, P = 0.002). b, SNP associations of ALS lead SNPs or LD proxies in neurodegenerative diseases. The association with ALS is shown at the top. Effective sample size is shown on the left. Posterior probabilities of the same causal SNP affecting two diseases were estimated through colocalization analysis and are highlighted as connections.
Extended Data Fig. 6: Colocalization signals.
Loci were selected for colocalization analysis when the top associated SNP was associated with any neurodegenerative disease at P < 5 × 10⁻⁵. For ALS, the European ancestries meta-analysis was used. Bayesian posterior probabilities for a shared variant driving risk of both traits (PPH4) are reported below locus names. Colors reflect LD between the variant and top associated SNP.
Extended Data Fig. 7: Colocalization analysis with FTD subtypes.
Top associated SNPs in the ALS GWAS were selected for colocalization analysis between ALS and FTD subtypes using COLOC. In the top panel, point height is the two-sided -log10(P-value) of the top-associated SNP in the ALS GWAS. In the bottom panel, association P-values of these SNPs with FTD subtypes are shown by color. The Bayesian posterior probability for a shared causal variant between traits (PPH4) is depicted by a connection between points.
Fig. 4: Tissue and cell type enrichment analysis.
a, Enrichment of tissues and brain regions included in GTEx version 8 illustrates a brain-specific enrichment pattern in ALS, similar to that in PD but contrasting with that in AD. Tissues and brain regions displayed are those significantly enriched in ALS or PD, tissues previously reported in AD and tissues of specific interest for ALS (spinal cord, tibial nerve and muscle). Color represents the enrichment coefficient, and size indicates two-sided −log10 (P-values) of enrichment obtained by the linear regression model in the MAGMA gene property analysis. b, Cell type enrichment analyses indicate neuron-specific enrichment for glutamatergic neurons. In ALS, no enrichment was found for microglia or other non-neuronal cell types, contrasting with the pattern observed in AD. Color represents the enrichment coefficient, and size indicates two-sided −log10 (P-values) of enrichment obtained by the linear regression model in the MAGMA gene property analysis. Statistically significant enrichments after correction for multiple testing over all tissues (n = 54), cell types (n = 7) and neurons (n = 3) with FDR < 0.05 are marked with an asterisk. Cx, cortex; GABA, γ-aminobutyric acid; OPCs, oligodendrocyte progenitor cells.
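The FDR < 0.05 marking in this caption can be illustrated with the Benjamini-Hochberg step-up procedure. The caption does not name the FDR method, so the choice of Benjamini-Hochberg is an assumption, and the seven P values below are made up to mirror the size of the cell type panel (n = 7); they are not results from the study.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return booleans marking tests that pass Benjamini-Hochberg at the given FDR."""
    m = len(p_values)
    # Indices of the tests ordered from smallest to largest P value
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is declared significant
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant

# Hypothetical P values for seven cell types, for illustration only
toy_p = [0.001, 0.20, 0.03, 0.008, 0.65, 0.04, 0.90]
flags = benjamini_hochberg(toy_p)
print(flags)
```

In this toy input the two smallest P values pass, while 0.03 and 0.04 do not, even though both are below the uncorrected 0.05 level.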
Extended Data Fig. 8: Tissue and cell-type enrichment analyses for all brain diseases.
Tissue (a) and cell-type (b) enrichment for all included brain diseases obtained from two-sided MAGMA linear regression. Only brain diseases with exome-wide significant gene-based MAGMA associations (P < 2.7 × 10⁻⁶) were suitable for tissue and cell-type enrichment analyses. The color represents enrichment coefficient and size indicates two-sided -log10(P-value) of enrichment obtained by the linear regression model in the MAGMA gene-property analysis. Due to the large number of significant genes in the gene-based MAGMA analyses for schizophrenia, bipolar disorder and multiple sclerosis, the enrichment P-values were truncated at P < 1.0 × 10⁻⁵. ALS = amyotrophic lateral sclerosis, PD = Parkinson’s disease, AD = Alzheimer’s disease, ADHD = attention-deficit hyperactivity disorder, ASD = autism spectrum disorder, TS = Tourette’s syndrome, SCZ = schizophrenia, BIP = bipolar disorder, MDD = major depressive disorder, PTSD = post-traumatic stress disorder, Anxiety = anxiety disorder (score), AN = anorexia nervosa, IA = intracranial aneurysm (any), IS = ischemic stroke, MS = multiple sclerosis, Cx = cortex, OPC = oligodendrocyte progenitor cells.
Extended Data Fig. 9: Cell-type enrichment analysis in mice.
Cell-type enrichment analysis using the DropViz single-cell RNA sequencing dataset obtained from mice. Similar to the cell-type enrichment analyses in Fig. 4b, there is neuron-specific enrichment in ALS and Parkinson’s disease, whereas in Alzheimer’s disease microglia are the most enriched cell type. The color represents enrichment coefficient and size indicates two-sided -log10(P-value) of enrichment obtained by the linear regression model in the MAGMA gene-property analysis. Statistically significant enrichments after correction for multiple testing with a false discovery rate (FDR) < 0.05 are marked with an asterisk. ALS = amyotrophic lateral sclerosis, PD = Parkinson’s disease, AD = Alzheimer’s disease, Cx = cortex.
Extended Data Fig. 10: Human phenotype ontology term enrichment.
Downstreamer enrichment analyses were performed using the multi-tissue and brain-specific co-expression matrix to identify co-regulated ALS-genes. The distribution of enrichment statistics (Z-scores) for all Human phenotype ontology (HPO) terms are plotted per HPO parent branch. The multi-tissue analysis indicates enrichment for the neurology parent branch ‘abnormality of the nervous system’ (dark-red), although no term passes the Bonferroni threshold for multiple testing. The brain-specific analysis illustrates stronger enrichment for the neurology parent branch. In total, 58 HPO terms pass the threshold for multiple testing of which 42 are defined within the ‘abnormality of the nervous system’ branch.
Fig. 5: Causal inference of total cholesterol levels and years of schooling in ALS.
a, MR results for ALS and total cholesterol levels. Results for the five different MR methods for two different P-value cutoffs for SNP instrument selection are presented. In total, 83 and 178 SNPs were used as instruments at cutoffs of P < 5 × 10⁻⁸ and P < 5 × 10⁻⁵, respectively. All methods show a consistent positive effect for an increased risk of ALS with higher total cholesterol levels. There is no evidence for reverse causality. Point estimates for MR are presented with error bars reflecting 95% CIs. b, MR results for ALS and years of schooling. In total, 306 and 681 SNPs were used as instruments at cutoffs of P < 5 × 10⁻⁸ and P < 5 × 10⁻⁵, respectively. Point estimates for MR are presented, with error bars reflecting 95% CIs. Statistically significant effects with a two-sided P-value passing Bonferroni correction for multiple testing over all tested traits (n = 22), instrument P-value cutoffs (n = 2) and MR methods (n = 5) are marked with an asterisk (total cholesterol, weighted median P = 0.0003 and P = 0.0007 for cutoffs at P < 5 × 10⁻⁸ and P < 5 × 10⁻⁵, respectively; years of schooling, IVW P = 0.0002 at the cutoff of P < 5 × 10⁻⁵). Here, SNP outliers were not removed for instrument selection. Z, genetic instrument; b, estimated causal effect for an increase of 1 s.d. in genetically predicted exposure.
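Of the MR methods referred to in this caption, the IVW estimator is the simplest to write down: it is a weighted regression through the origin of the SNP-outcome effects on the SNP-exposure effects. The sketch below is a fixed-effect version with first-order weights, assuming independent instruments and negligible error in the exposure effects; the function name `mr_ivw` and the three instruments are hypothetical and are not data from the study.

```python
import math

def mr_ivw(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW Mendelian randomization estimate.

    Each argument is a list with one entry per SNP instrument.
    Returns (causal effect estimate, standard error).
    """
    num = sum(bx * by / s ** 2
              for bx, by, s in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / s ** 2
              for bx, s in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)

# Hypothetical instruments: every outcome effect is half the exposure effect
bx = [0.10, 0.05, 0.20]    # SNP effects on the exposure (e.g. cholesterol, in s.d.)
by = [0.05, 0.025, 0.10]   # SNP effects on the outcome (log odds of disease)
se = [0.01, 0.02, 0.01]    # standard errors of the outcome effects

estimate, std_err = mr_ivw(bx, by, se)
print(f"b = {estimate:.3f} (s.e. {std_err:.3f})")
```

The other methods in the figure, such as the weighted median, combine the same per-SNP ratios differently to be more robust to invalid instruments; they are not shown here.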