Junyu Chen1,2, Lei Wang2,3, Philip L De Jager4, David A Bennett5, Aron S Buchman5, Jingjing Yang2.
Abstract
Existing methods for integrating functional annotations in genome-wide association studies (GWASs) to fine-map and prioritize potential causal variants are either limited to non-overlapping categorical annotations or constrained by the computational burden of modeling genome-wide variants. To overcome these limitations, we propose a scalable Bayesian functional GWAS method that accounts for multivariate quantitative functional annotations (BFGWAS_QUANT), accompanied by a scalable computation algorithm that enables joint modeling of genome-wide variants. Simulation studies validated that BFGWAS_QUANT accurately quantifies annotation enrichment and improves GWAS power. Applying BFGWAS_QUANT to five Alzheimer disease (AD)-related phenotypes using individual-level GWAS data (n = ∼1,000), we found that histone modification annotations have higher enrichment than expression quantitative trait locus (eQTL) annotations for all considered phenotypes, with the highest enrichment in H3K27me3 (polycomb repression). We also found that cis-eQTLs in microglia had higher enrichment than eQTLs of bulk brain frontal cortex tissue for all considered phenotypes. A similar enrichment pattern was identified using the International Genomics of Alzheimer's Project (IGAP) summary-level GWAS data of AD (n = ∼54,000). The strongest known risk allele, APOE E4, was identified for all five phenotypes, and the APOE locus was validated using the IGAP data. BFGWAS_QUANT fine-mapped 32 significant variants from 1,073 genome-wide significant variants in the IGAP data. We also demonstrated that polygenic risk scores (PRSs) based on effect size estimates from BFGWAS_QUANT had prediction accuracy similar to that of other methods assuming a sparse causal model. Overall, BFGWAS_QUANT is a useful GWAS tool for quantifying annotation enrichment and prioritizing potential causal variants.
Keywords: Alzheimer disease; Bayesian hierarchical variable selection regression; fine-mapping; genome-wide association study; molecular quantitative trait loci; polygenic risk score; quantitative functional annotation
Year: 2022 PMID: 36204489 PMCID: PMC9530673 DOI: 10.1016/j.xhgg.2022.100143
Source DB: PubMed Journal: HGG Adv ISSN: 2666-2477
Figure 1. Bayesian enrichment estimates, heritability estimates, and sensitivities of simulation studies.
(A–F) Simulations with (A–C) and simulations with (D–F). Bayesian estimates of annotation enrichment () of 100 simulations with true heritability are shown in the respective boxplots (A and D), where red dots denote true enrichment values. Comparable heritability estimates (B and E) and higher sensitivities (C and F) were obtained by BFGWAS_QUANT (red) versus BVSR (blue).
Figure 2. Bayesian estimates of functional annotation enrichment for Alzheimer dementia.
(A) Using ROS/MAP individual-level GWAS data; (B) using IGAP summary-level GWAS data. Histone modification H3K27me3 (polycomb repression) and microglia cis-eQTL annotations were found to be most enriched for association signals of AD.
Significant SNPs with Bayesian CPP >0.1068 by BFGWAS_QUANT for studying AD-related phenotypes using the ROS/MAP individual-level GWAS data
| CHR | rsID | Gene | Function | MAF | CPP | Beta | p Value | Phenotype |
|---|---|---|---|---|---|---|---|---|
| 1 | rs148348738ᵃ | SPATA6 | intron | 0.011 | 0.149 | −0.039 | 4.47E−07 | cognition decline rate |
| 2 | rs147749419 | CXCR1 | regulatory | 0.017 | 0.154 | −0.043 | 2.94E−08 | cognition decline rate |
| 8 | rs11787066ᵃ | LOC | intron | 0.148 | 0.276 | 0.015 | 6.93E−08 | β-amyloid |
| 19 | rs34134669ᵃ | ADAMTS10 | regulatory | 0.234 | 0.119 | −0.005 | 8.57E−07 | cognition decline rate |
| 19 | rs769449 | APOE | regulatory | 0.111 | 0.121 | 0.076 | 3.45E−11 | Alzheimer dementia |
| 19 | rs769449 | APOE | regulatory | 0.112 | 0.116 | 0.022 | 1.51E−16 | tangle density |
| 19 | rs769449 | APOE | regulatory | 0.109 | 0.475 | −0.025 | 2.09E−15 | cognition decline rate |
| 19 | rs429358 | APOE | missense | 0.138 | 0.144 | 0.037 | 7.72E−13 | Alzheimer dementia |
| 19 | rs429358 | APOE | missense | 0.138 | 0.631 | 0.037 | 1.17E−20 | tangle density |
| 19 | rs429358 | APOE | missense | 0.138 | 0.999 | 0.083 | 6.60E−27 | β-amyloid |
| 19 | rs429358 | APOE | missense | 0.139 | 0.999 | 0.089 | 1.19E−33 | global AD pathology |
| 19 | rs429358 | APOE | missense | 0.136 | 0.17 | −0.036 | 1.29E−17 | cognition decline rate |
| 19 | rs7412 | APOE | missense | 0.077 | 0.108 | −0.027 | 6.67E−13 | global AD pathology |
| 19 | rs1065853 |  | intergenic | 0.076 | 0.381 | −0.026 | 8.31E−13 | global AD pathology |
| 19 | rs10414043 |  | intergenic | 0.113 | 0.111 | 0.028 | 2.71E−12 | Alzheimer dementia |
| 19 | rs7256200 |  | regulatory | 0.113 | 0.315 | 0.028 | 2.71E−12 | Alzheimer dementia |
| 19 | rs7256200 |  | regulatory | 0.113 | 0.228 | 0.03 | 3.86E−17 | tangle density |
| 19 | rs7256200 |  | regulatory | 0.111 | 0.270 | −0.024 | 3.66E−15 | cognition decline rate |
| 20 | rs1131695 | APOC1 | stop gained | 0.435 | 0.119 | 0.039 | 1.06E−06 | tangle density |
ᵃ SNPs with a single-variant test p value > 5 × 10⁻⁸ that did not reach genome-wide significance by standard GWAS.
Figure 3. Manhattan plots of BFGWAS_QUANT results for studying Alzheimer dementia.
(A) Using ROS/MAP individual-level GWAS data; (B) using IGAP summary-level GWAS data. Single-variant test p values were plotted on the −log10 scale on the y axis. The dashed horizontal line denotes the genome-wide significance threshold (p = 5 × 10⁻⁸). SNPs with Bayesian CPP greater than 0.1068 were colored according to their Bayesian CPP values by BFGWAS_QUANT. SNPs with Bayesian CPP greater than 0.5 were plotted as solid triangles.
Significant SNPs with Bayesian CPP > 0.1068 by BFGWAS_QUANT for studying AD using the IGAP summary-level GWAS data
| CHR | rsID | Gene | Function | CPP | Beta | p Value |
|---|---|---|---|---|---|---|
| 1 | rs6656401 | CR1 | intron | 0.119 | −0.017 | 8.67E−15 |
| 1 | rs7515905 | CR1 | intron | 0.206 | −0.019 | 3.75E−15 |
| 1 | rs1752684 | CR1 | regulatory | 0.125 | −0.017 | 3.77E−15 |
| 1 | rs679515 | CR1 | intron | 0.220 | −0.018 | 3.60E−15 |
| 2 | rs4663105 | BIN1 | regulatory | 0.631 | 0.050 | 1.26E−26 |
| 2 | rs6733839 | BIN1 | regulatory | 0.796 | 0.053 | 1.24E−26 |
| 6 | rs9270999ᵃ | HLA-DRB1 | intron | 0.181 | 0.001 | 8.04E−08 |
| 6 | rs9273472ᵃ | HLA-DRB1 | intron | 0.110 | 0.074 | 1.63E−04 |
| 7 | rs10808026 | EPHA1 | intron | 0.123 | −0.020 | 1.36E−11 |
| 7 | rs11762262 | EPHA1 | intron | 0.117 | −0.011 | 2.21E−10 |
| 7 | rs11763230 | EPHA1 | intron | 0.325 | −0.020 | 1.86E−11 |
| 7 | rs11771145 | EPHA1 | intron | 0.173 | −0.021 | 8.69E−10 |
| 8 | rs28834970 | PTK2B | intron | 0.137 | 0.066 | 3.22E−09 |
| 8 | rs2279590 | CLU | intron | 0.166 | 0.021 | 4.47E−17 |
| 8 | rs4236673 | CLU | intron | 0.123 | 0.020 | 3.25E−17 |
| 8 | rs11787077 | CLU | intron | 0.247 | 0.022 | 2.94E−17 |
| 8 | rs9331896 | CLU | intron | 0.154 | 0.022 | 8.38E−17 |
| 8 | rs2070926 | CLU | intron | 0.278 | 0.023 | 2.69E−17 |
| 11 | rs11039390ᵃ | NUP160 | downstream | 0.145 | −0.004 | 2.31E−05 |
| 11 | rs4939338 | MS4A6E | upstream | 0.139 | 0.011 | 2.79E−12 |
| 11 | rs7110631 | PICALM | intergenic | 0.134 | 0.014 | 8.77E−15 |
| 11 | rs10792832 | RNU6-560P | regulatory | 0.633 | 0.027 | 7.89E−16 |
| 11 | rs11218343 | SORL1 | regulatory | 0.643 | −0.046 | 4.77E−11 |
| 14 | rs10498633ᵃ | SLC24A4 | intron | 0.371 | −0.059 | 1.55E−07 |
| 19 | rs3752246 | ABCA7 | missense | 0.361 | −0.027 | 4.27E−09 |
| 19 | rs4147929 | ABCA7 | regulatory | 0.111 | −0.030 | 1.77E−09 |
| 19 | rs41289512 | PVRL2 | regulatory | 1.000 | 0.132 | 1.81E−167 |
| 19 | rs6857 | PVRL2 | 3′ UTR | 1.000 | 0.359 | 0 |
| 19 | rs769449 | APOE/TOMM40 | regulatory | 1.000 | 0.292 | 0 |
| 19 | rs56131196 | APOC1 | regulatory | 1.000 | 0.251 | 0 |
| 19 | rs78959900 | APOC1 | downstream | 1.000 | −0.096 | 8.22E−85 |
| 19 | rs12459419ᵃ | CD33 | missense | 0.245 | −0.027 | 6.66E−08 |
ᵃ SNPs with a single-variant test p value > 5 × 10⁻⁸ that did not reach genome-wide significance by standard GWAS.
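The two thresholds used in the tables (Bayesian CPP > 0.1068 and single-variant p value ≤ 5 × 10⁻⁸) can be applied together to separate SNPs flagged by both criteria from those flagged by CPP alone. The following is a minimal Python sketch using a handful of rows copied from the IGAP table; the function name `classify` and the tuple layout are illustrative assumptions, not part of BFGWAS_QUANT.

```python
# Illustrative rows copied from the IGAP table: (rsID, gene, CPP, p value).
rows = [
    ("rs6733839", "BIN1", 0.796, 1.24e-26),
    ("rs9270999", "HLA-DRB1", 0.181, 8.04e-08),
    ("rs10498633", "SLC24A4", 0.371, 1.55e-07),
    ("rs3752246", "ABCA7", 0.361, 4.27e-09),
    ("rs12459419", "CD33", 0.245, 6.66e-08),
]

CPP_THRESHOLD = 0.1068   # Bayesian significance threshold used in the tables
GWAS_THRESHOLD = 5e-8    # genome-wide significance level for single-variant tests


def classify(rows, cpp_min=CPP_THRESHOLD, p_max=GWAS_THRESHOLD):
    """Split CPP-significant SNPs by whether the single-variant test also passes.

    Hypothetical helper for illustration only.
    """
    both, cpp_only = [], []
    for rsid, _gene, cpp, p in rows:
        if cpp <= cpp_min:
            continue  # not significant by the Bayesian criterion
        # p > p_max corresponds to the footnoted SNPs in the tables
        (cpp_only if p > p_max else both).append(rsid)
    return both, cpp_only


both, cpp_only = classify(rows)
print("Significant by CPP and by standard GWAS:", both)
print("Significant by CPP only:", cpp_only)
```

SNPs in the second list are the ones of most interest for fine-mapping, since a standard single-variant analysis would not have reported them.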
Estimates of total causal SNPs
| GWAS data | Phenotype | BFGWAS_QUANT | BVSR |
|---|---|---|---|
| ROS/MAP | Alzheimer dementia | 0.718 | 6.472 |
| ROS/MAP | tangle density | 3.179 | 6.127 |
| ROS/MAP | β-amyloid | 5.375 | 7.316 |
| ROS/MAP | global AD pathology | 5.375 | 6.174 |
| ROS/MAP | cognition decline rate | 6.219 | 7.136 |
| IGAP | Alzheimer dementia | 54.282 | – |
The sum of the Bayesian CPP estimates over SNPs with CPP > 0.01 estimates the total number of causal SNPs.
BVSR was not developed for use with summary-level GWAS data, so no estimate is available for IGAP (–).
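The estimator described in the table note is a thresholded sum of per-SNP causal posterior probabilities. A minimal Python sketch follows; the CPP values are hypothetical placeholders (the genome-wide CPP vectors are not reproduced here), and `expected_causal_count` is an illustrative name rather than a BFGWAS_QUANT function.

```python
import math


def expected_causal_count(cpps, floor=0.01):
    """Sum CPP estimates over SNPs whose CPP exceeds `floor`.

    Each CPP is the posterior probability that a SNP is causal, so the sum
    is the expected number of causal SNPs among those retained.
    """
    return math.fsum(c for c in cpps if c > floor)


# Hypothetical CPP values for six SNPs; the last two fall below the 0.01 floor.
example_cpps = [0.999, 0.631, 0.144, 0.02, 0.005, 0.001]
total = expected_causal_count(example_cpps)
print(f"Estimated number of causal SNPs: {total:.3f}")
```

Dropping SNPs at or below the floor keeps the many near-zero probabilities of a genome-wide analysis from accumulating into the estimate.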
Figure 4. ROC plots comparing prediction accuracy of Alzheimer dementia in the test data of MCDGC.
(A) PRSs derived using the ROS/MAP individual-level GWAS data; (B) PRSs derived using IGAP summary-level GWAS data. The PRS derived using Bayesian effect size estimates by BFGWAS_QUANT has prediction accuracy comparable to that of the PRSs derived by BVSR and LDpred2-auto, all of which assume a sparse causal model. PRSs derived by PRS-CS and LDpred2-inf, which assume an infinitesimal causal model, have the highest prediction accuracy when IGAP summary-level GWAS data are used as training data.
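For readers unfamiliar with how the comparison in Figure 4 is typically computed, the sketch below shows the generic recipe in Python: a PRS is the sum of allele dosages weighted by effect size estimates, and the area under the ROC curve equals the probability that a randomly chosen case scores higher than a randomly chosen control. All effect sizes, dosages, and labels here are hypothetical toy values, and the helper names are assumptions; this is not the authors' pipeline.

```python
def polygenic_risk_score(dosages, betas):
    """Weighted sum of allele dosages (0, 1, or 2) by per-SNP effect sizes."""
    if len(dosages) != len(betas):
        raise ValueError("dosages and betas must have the same length")
    return sum(g * b for g, b in zip(dosages, betas))


def roc_auc(scores, labels):
    """AUC via the rank (Mann-Whitney) formulation; ties count as half a win."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical effect size estimates for three SNPs.
betas = [0.083, -0.027, 0.015]

# Hypothetical test individuals: allele dosages and case (1) / control (0) status.
genotypes = [[2, 0, 1], [1, 0, 0], [0, 1, 1], [1, 2, 0], [0, 0, 2]]
labels = [1, 1, 0, 1, 0]

scores = [polygenic_risk_score(g, betas) for g in genotypes]
auc = roc_auc(scores, labels)
print(f"AUC on toy data: {auc:.3f}")
```

Methods assuming a sparse causal model set most `betas` to zero or near zero, whereas infinitesimal models give every SNP a small nonzero weight; the scoring and AUC steps are the same in both cases.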