Kim Lorenz, Christopher S Thom, Sanjana Adurty, Benjamin F Voight.
Abstract
BACKGROUND: The majority of genome-wide association study (GWAS) loci fall in the non-coding genome, making causal variants difficult to identify and study. We hypothesized that the regulatory features underlying causal variants are biologically specific and identifiable from data, and that the regulatory architecture influencing one trait is distinct from that of biologically unrelated traits.
Keywords: GWAS interpretation; Pleiotropy; Tissue specific; Variant prediction
Year: 2022 PMID: 35705896 PMCID: PMC9202130 DOI: 10.1186/s12864-022-08654-x
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 4.547
Fig. 1 Model AUCs. A Training vs Holdout AUCs. Dotted grey line shows equality. Selected models (holdout AUC ≥ 0.7) shown in green, unselected in purple. B Individual holdout AUC plots for selected models, compared to CADD (pink), GWAVA (cyan), DeepSEA (brown)
All Traits Results Summary
| Trait | Positive SNPs | Total Positive Loci | Positive Training Loci | Positive Holdout Loci | Training AUC | Model Holdout AUC | CADD Holdout AUC | GWAVA Holdout AUC | DeepSEA Holdout AUC | Number Selected Features |
|---|---|---|---|---|---|---|---|---|---|---|
| platelets | 955 | 699 | 464 | 231 | 0.80 | 0.80 | 0.52 | 0.57 | 0.56 | 17 |
| autoimmune | 724 | 484 | 318 | 162 | 0.79 | 0.77 | 0.51 | 0.57 | 0.51 | 37 |
| WBC | 2222 | 1762 | 1091 | 544 | 0.77 | 0.76 | 0.51 | 0.52 | 0.51 | 26 |
| lipids | 1439 | 810 | 526 | 275 | 0.73 | 0.74 | 0.53 | 0.52 | 0.50 | 15 |
| RBC | 2430 | 1810 | 1132 | 564 | 0.76 | 0.74 | 0.50 | 0.52 | 0.52 | 31 |
| ECG | 444 | 303 | 202 | 101 | 0.71 | 0.70 | 0.53 | 0.58 | 0.53 | 4 |
| S_BD | 624 | 422 | 273 | 144 | 0.98 | 0.67 | 0.48 | 0.49 | 0.45 | 519 |
| WHRadjBMI | 506 | 419 | 275 | 141 | 0.72 | 0.67 | 0.54 | 0.56 | 0.56 | 5 |
| birthweight | 346 | 294 | 195 | 97 | 0.83 | 0.66 | 0.48 | 0.59 | 0.53 | 75 |
| CAD | 740 | 535 | 346 | 183 | 0.67 | 0.66 | 0.52 | 0.54 | 0.53 | 7 |
| BMD | 3046 | 2140 | 1394 | 697 | 0.68 | 0.65 | 0.53 | 0.52 | 0.52 | 42 |
| BC | 419 | 328 | 218 | 109 | 0.78 | 0.64 | 0.53 | 0.47 | 0.54 | 53 |
| height | 3715 | 3392 | 1699 | 839 | 0.66 | 0.64 | 0.51 | 0.50 | 0.52 | 24 |
| BMI | 2220 | 1676 | 1032 | 516 | 0.70 | 0.63 | 0.50 | 0.51 | 0.51 | 53 |
| T2D | 849 | 587 | 388 | 194 | 0.68 | 0.62 | 0.51 | 0.53 | 0.54 | 18 |
| BP | 2595 | 1842 | 1215 | 616 | 0.63 | 0.60 | 0.50 | 0.51 | 0.51 | 10 |
| menarche | 817 | 593 | 393 | 197 | 0.75 | 0.57 | 0.52 | 0.44 | 0.49 | 59 |
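
The selection rule from the Fig. 1 caption (holdout AUC ≥ 0.7) can be applied directly to the summary table above. The following is a minimal sketch, not the authors' code: the AUC values are copied from the table, while the names `AUCS`, `select_models`, and `train_holdout_gap` are hypothetical and introduced only for illustration. It also reports the training-minus-holdout gap, which is one simple way to see which models generalize poorly.

```python
# (training AUC, holdout AUC) per trait, copied from the summary table above.
AUCS = {
    "platelets": (0.80, 0.80),
    "autoimmune": (0.79, 0.77),
    "WBC": (0.77, 0.76),
    "lipids": (0.73, 0.74),
    "RBC": (0.76, 0.74),
    "ECG": (0.71, 0.70),
    "S_BD": (0.98, 0.67),
    "WHRadjBMI": (0.72, 0.67),
    "birthweight": (0.83, 0.66),
    "CAD": (0.67, 0.66),
    "BMD": (0.68, 0.65),
    "BC": (0.78, 0.64),
    "height": (0.66, 0.64),
    "BMI": (0.70, 0.63),
    "T2D": (0.68, 0.62),
    "BP": (0.63, 0.60),
    "menarche": (0.75, 0.57),
}

# Selection cutoff stated in the Fig. 1 caption.
HOLDOUT_THRESHOLD = 0.7


def select_models(aucs, threshold=HOLDOUT_THRESHOLD):
    """Return traits whose holdout AUC meets the threshold, in table order."""
    return [trait for trait, (_, holdout) in aucs.items() if holdout >= threshold]


def train_holdout_gap(aucs):
    """Return training minus holdout AUC per trait, rounded to two decimals."""
    return {
        trait: round(train - holdout, 2)
        for trait, (train, holdout) in aucs.items()
    }


selected = select_models(AUCS)
gaps = train_holdout_gap(AUCS)
largest_gap_trait = max(gaps, key=gaps.get)

print("Selected models:", selected)
print("Largest training/holdout gap:", largest_gap_trait, gaps[largest_gap_trait])
```

Under this rule six traits pass the cutoff, which matches the "six selected models" referred to in the Fig. 2 caption. The largest gap belongs to S_BD, the model with 519 selected features.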
Fig. 2 Model AUCs on same holdout sets. Each panel shows the named holdout set: A Platelets and B ECG. AUCs for all six selected models are shown on all plots, with the left (All) plot having all loci in that holdout and the right (No Overlap) having only those loci not positive in multiple models
Fig. 3 Selected Model Feature Coefficients. A Platelets, B ECG, and C Lipids model coefficients, sorted by tissue type and value. GE – Gene Expression
Credible set improvements
| Trait | Maximum number of SNPs per locus | Average number of SNPs per locus | Number of loci with a credible set | Number of loci with a multi-SNP credible set | Fraction of loci with prioritized SNPs at FDR < 0.25 | Fraction of loci with prioritized SNPs at FDR < 0.10 |
|---|---|---|---|---|---|---|
| autoimmune | 108 | 18 | 237 | 218 | 0.48 | 0.29 |
| platelets | 86 | 11 | 951 | 692 | 0.54 | 0.24 |
| RBC | 209 | 12 | 789 | 615 | 0.52 | 0.27 |
| WBC | 197 | 13 | 762 | 583 | 0.50 | 0.31 |
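
As a worked reading of the credible set table, the share of credible sets that contain more than one SNP follows from dividing the fifth column by the fourth. This is a small sketch under that assumption only; the counts are copied from the table, and the names `CREDIBLE_SETS` and `multi_snp_share` are hypothetical. It does not attempt to reproduce the FDR columns, whose denominator the table does not state.

```python
# (loci with a credible set, loci with a multi-SNP credible set),
# copied from the credible set table above.
CREDIBLE_SETS = {
    "autoimmune": (237, 218),
    "platelets": (951, 692),
    "RBC": (789, 615),
    "WBC": (762, 583),
}


def multi_snp_share(table):
    """Return the fraction of credible sets per trait that hold more than one SNP."""
    return {trait: multi / total for trait, (total, multi) in table.items()}


shares = multi_snp_share(CREDIBLE_SETS)
for trait, share in shares.items():
    print(f"{trait}: {share:.2f}")
```

Only multi-SNP credible sets leave room for further prioritization, so this share indicates how many loci per trait could in principle benefit from model-based ranking.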