Lishun Xiao1, Zhongshang Yuan2, Siyi Jin1, Ting Wang1, Shuiping Huang1,3, Ping Zeng1,3.
Abstract
Genome-wide association studies (GWAS) have identified multiple causal genes associated with amyotrophic lateral sclerosis (ALS); however, the genetic architecture of ALS remains largely unknown and a large number of causal genes have yet to be discovered. To fill this gap in part, we implemented an integrative transcriptome-wide association study (TWAS) for ALS to prioritize causal genes, using summary statistics from 80,610 European individuals and 13 GTEx brain tissues as reference transcriptome panels. A summary-level TWAS analysis was first undertaken with each single brain tissue; a flexible p-value combination strategy, called summary data-based Cauchy Aggregation TWAS (SCAT), was then proposed to pool association signals from the single-tissue TWAS analyses while protecting against high positive correlation among the tests. Extensive simulations demonstrated that SCAT produces well-calibrated p-values for the control of type I error and is often much more powerful than single-tissue TWAS analysis in identifying association signals across various scenarios. Using SCAT, we replicated three ALS-associated genes (i.e., ATXN3, SCFD1, and C9orf72) identified in previous GWASs and discovered five additional genes (i.e., SLC9A8, FAM66D, TRIP11, JUP, and RP11-529H20.6) that had not been reported before. Furthermore, we found that these five associations were largely driven by the genes themselves, which might therefore be new genes related to the risk of ALS. However, further investigations are warranted to verify these results and to untangle the pathophysiological function of the genes in the development of ALS.
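To make the pooling step concrete, the following is a minimal sketch of a Cauchy combination of single-tissue p-values, the general technique that the name SCAT refers to. The abstract does not state the exact statistic or weighting, so the standard form (a weighted sum of tan((0.5 - p)π) terms referred to a standard Cauchy distribution) and the equal weights used by default are assumptions here; the function name `cauchy_combine` is hypothetical and is not taken from the authors' software.

```python
import math


def cauchy_combine(pvalues, weights=None):
    """Combine p-values with the standard Cauchy combination rule.

    Assumption: equal weights unless given; this is an illustrative
    sketch, not the authors' SCAT implementation.
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    k = len(pvalues)
    if weights is None:
        weights = [1.0] * k
    if len(weights) != k:
        raise ValueError("weights and pvalues must have the same length")
    total = float(sum(weights))
    weights = [w / total for w in weights]  # normalise so weights sum to 1

    stat = 0.0
    for p, w in zip(pvalues, weights):
        if not 0.0 < p < 1.0:
            raise ValueError("p-values must lie strictly between 0 and 1")
        if p < 1e-15:
            # tan((0.5 - p) * pi) is approximately 1 / (p * pi) for tiny p
            stat += w / (p * math.pi)
        else:
            stat += w * math.tan((0.5 - p) * math.pi)

    if stat > 1e15:
        # upper-tail approximation of the standard Cauchy distribution
        return 1.0 / (stat * math.pi)
    # survival function of the standard Cauchy distribution
    return 0.5 - math.atan(stat) / math.pi


if __name__ == "__main__":
    # Illustrative p-values only; these are not taken from the study.
    print(cauchy_combine([1e-10, 0.5, 0.5]))
```

With a single p-value the rule returns that p-value unchanged, and one very small p-value among several unremarkable ones still yields a small combined p-value, which is the behaviour that makes this kind of aggregation attractive for pooling tissues.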
Keywords: amyotrophic lateral sclerosis (ALS); brain tissue; genome-wide association studies (GWAS); transcriptome-wide association study (TWAS); type I error control
Year: 2020 PMID: 33329728 PMCID: PMC7714931 DOI: 10.3389/fgene.2020.587243
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Previous association studies for ALS in terms of the GWAS catalog.
| Year | Pop | Cases/controls (discovery + replication) | References | |
| 2007 | EUR | 276/271 | 3 | |
| 2007 | EUR | 461/450 + 876/906 | 1 | |
| 2007 | EUR | 221/211 + 737/721 | 1 | |
| 2008 | EUR | 737/721 + 1,030/1,195 | 3 | |
| 2009 | EUR | 958/932 + 309/404 | 1 | |
| 2009 | EUR | 1,821/2,258 + 538/556 | 14 | |
| 2009 | EUR | 2,323/9,013 + 2,532/5,940 | 3 | |
| 2010 | EUR | 405/497 | 4 | |
| 2010 | EUR | 4,857/8,987 | 0 | |
| 2010 | EUR | 639/6,257 + 183/961 | 2 | |
| 2013 | EUR | 4,243/5,112 | 19 | |
| 2013 | EUR | 6,100/7,125 + 2,074/2,556 | 3 | |
| 2014 | EUR | 4,377 + 435/14,431 + 4,056/3,958 | 10 | |
| 2015 | EUR | 25/1,179 | 1 | |
| 2016 | EUR | 12,577/23,475 + 2,579/2,767 | 4 | |
| 2018 | EUR | 20,806/59,804 + 4,159/18,650 | 10 | |
| 2019 | EUR | 4,244/3,106 | 1 | |
| 2013 | CHI | 506/1,859 + 706/1,777 | 4 | |
| 2013 | CHI | 4,243 (age at ALS onset) | 15 | |
| 2013 | CHI | 250/250 | 174 | |
| 2016 | CHI | 94/376 | 1 | |
| 2017 | CHI | 1,234/2,850 + 576/683 | 7 |
FIGURE 1. Schematic framework of TWAS with FUSION and SCAT, based only on summary-level datasets and a reference panel for the linkage disequilibrium (LD) structure of SNPs. TWAS can be viewed as a relatively independent two-stage inference procedure: the first stage estimates weights for cis-SNPs with the GTEx brain transcriptome reference panel (top panel); the second stage examines the causal association between genes and ALS using the weights obtained from the first stage (bottom panel).
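The second stage described in this caption can be sketched as follows. The example assumes the usual summary-level TWAS statistic, Z = wᵀz / sqrt(wᵀΣw), where w are the cis-SNP weights from the first stage, z are the GWAS z-scores, and Σ is the LD correlation matrix from the reference panel. This formula is not spelled out in the text above, and the names `twas_z` and `two_sided_p` are hypothetical.

```python
import math


def twas_z(weights, gwas_z, ld):
    """Summary-level TWAS z-score for one gene in one tissue.

    Assumed form: Z = w'z / sqrt(w' LD w); an illustrative sketch only.
    weights : cis-SNP expression weights (stage 1 output)
    gwas_z  : GWAS z-scores for the same SNPs, in the same order
    ld      : SNP-by-SNP LD correlation matrix (list of lists)
    """
    m = len(weights)
    if len(gwas_z) != m or len(ld) != m or any(len(row) != m for row in ld):
        raise ValueError("weights, gwas_z and ld must have matching sizes")
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(m) for j in range(m)
    )
    if variance <= 0.0:
        raise ValueError("w' LD w must be positive")
    return numerator / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided normal p-value for a z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Illustrative inputs only; these are not taken from the study.
    w = [1.0, 1.0]
    z = [1.0, 1.0]
    ld = [[1.0, 0.5], [0.5, 1.0]]
    score = twas_z(w, z, ld)
    print(score, two_sided_p(score))
```

Repeating this for each of the 13 tissues gives one p-value per tissue for a gene, and those per-tissue p-values are what a combination method such as SCAT then pools.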
ALS-associated genes identified by SCAT or FUSION with 13 GTEx brain tissues.
| Tissue | |||||||||||
| Amygdala | 81 | 1,799 | 0 (0.00) | 2.85E-1 | |||||||
| Anterior cingulate cortex BA24 | 102 | 2,653 | 4 (0.15) | 1.20E-1 | 4.11E-3 | 6.90E-4 | |||||
| Caudate basal ganglia | 126 | 3,586 | 1 (0.03) | 2.71E-8 | 3.06E-1 | ||||||
| Cerebellar hemisphere | 113 | 4,327 | 6 (0.14) | 3.36E-1 | 3.93E-10 | 2.01E-1 | 2.37E-1 | 3.45E-1 | 7.25E-4 | 4.76E-2 | |
| Cerebellum | 137 | 5,752 | 4 (0.07) | 4.97E-4 | 5.86E-3 | 1.02E-2 | 1.15E-3 | 3.73E-1 | |||
| Cortex | 119 | 3,943 | 3 (0.08) | 7.79E-3 | 6.41E-3 | 2.00E-1 | 1.22E-1 | 2.00E-1 | |||
| Frontal cortex BA9 | 104 | 3,080 | 1 (0.03) | 5.87E-1 | 3.84E-16 | 1.88E-1 | |||||
| Hippocampus | 99 | 2,245 | 1 (0.04) | 3.66E-1 | 1.12E-4 | 8.61E-2 | 8.61E-2 | ||||
| Hypothalamus | 98 | 2,257 | 3 (0.13) | 4.94E-1 | 3.65E-1 | 3.65E-1 | 1.82E-2 | 1.55E-4 | 6.40E-3 | ||
| Nucleus accumbens basal ganglia | 114 | 3,172 | 2 (0.06) | 5.53E-1 | 3.32E-24 | 4.91E-3 | |||||
| Putamen basal ganglia | 98 | 2,766 | 1 (0.04) | 6.04E-7 | 2.07E-1 | ||||||
| Spinal cord cervical c-1 | 76 | 1,974 | 2 (0.10) | 4.97E-1 | 1.26E-7 | ||||||
| Substantia nigra | 70 | 1,568 | 2 (0.13) | ||||||||
| SCAT | 11469 | 8 (0.07) | 4.22E-2 | 1.08E-22 | 3.49E-2 | 4.10E-2 | 3.68E-2 | 4.22E-2 | 1.20E-3 | 4.22E-2 |
FIGURE 2. Type I error control (A–D) and estimated statistical power (E,F) in the simulation studies. In (A,B), the tests were independent; in (C,D), the correlation matrix was the one shown in Supplementary Figure S2; in (E), the clustered lines in various colors represent the 13 single-tissue FUSION analyses and cannot be clearly separated; in (F), the number attached to SCAT indicates the number of tissues included, and oracle denotes the oracle TWAS approach with the matrix shown in Supplementary Figure S2. Because including all 13 tissues in the oracle TWAS would result in 100% power, only three randomly selected tissues are considered in the oracle TWAS here.
FIGURE 3. Summary results for ALS-associated SNPs and mapped genes identified in previous GWASs. (A) Distribution of associated SNPs across all 22 chromosomes; (B) circular Manhattan plot of the p-values of associated SNPs; (C) distribution of genes reported with high frequency.
FIGURE 4. Results of FUSION and SCAT for the TWAS analysis of ALS with multiple brain tissues. (A) QQ plot for SCAT; (B) QQ plot for FUSION with each of the GTEx brain tissues as the reference dataset; (C) distribution of analyzed genes across all 22 chromosomes; (D) circular Manhattan plot of the p-values of analyzed genes. Of note, the genomic inflation factor of the p-values obtained via SCAT is 1.04, indicating that the slight inflation observed in (A) might be due to the polygenicity of ALS rather than uncontrolled unknown confounders.
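For readers who want to check a genomic inflation factor such as the 1.04 quoted in this caption, the following is a minimal sketch of the conventional calculation: convert each p-value to a 1-degree-of-freedom chi-square statistic and divide the observed median by the theoretical median (about 0.455). The caption does not say exactly how the authors computed their value, so this conventional definition is an assumption, and the function name `genomic_inflation` is hypothetical.

```python
from statistics import NormalDist, median


def genomic_inflation(pvalues):
    """Genomic inflation factor (lambda) from a list of p-values.

    Assumed definition: median observed 1-df chi-square statistic divided
    by the median of the chi-square(1) distribution.
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    if any(not 0.0 < p <= 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    normal = NormalDist()
    # A two-sided p-value p corresponds to |z| = -inv_cdf(p / 2), chi2 = z**2.
    chi2 = [normal.inv_cdf(p / 2.0) ** 2 for p in pvalues]
    expected_median = normal.inv_cdf(0.75) ** 2  # about 0.4549
    return median(chi2) / expected_median


if __name__ == "__main__":
    # Evenly spaced p-values mimic a perfectly calibrated null distribution.
    n = 1001
    null_like = [(i - 0.5) / n for i in range(1, n + 1)]
    print(genomic_inflation(null_like))
```

A value close to 1 indicates that the bulk of the test statistics follows the null distribution; values modestly above 1 can arise either from confounding or, as the caption suggests for ALS, from polygenicity.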