Zong Miao1,2, Kristina M Garske1, David Z Pan1,2, Amogha Koka1, Dorota Kaminska1,3,4, Ville Männistö5,6, Janet S Sinsheimer1,2,7, Jussi Pihlajamäki3,8, Päivi Pajukanta1,2,9.
Abstract
The prevalence of non-alcoholic fatty liver disease (NAFLD), now also known as metabolic dysfunction-associated fatty liver disease (MAFLD), is rapidly increasing worldwide due to the ongoing obesity epidemic. However, NAFLD diagnosis currently requires liver biopsy or imaging technologies that are not readily available, which has drastically limited the sample sizes of NAFLD studies and hampered the discovery of its genetic component. Here we used the large UK Biobank (UKB) cohort to accurately estimate NAFLD status based on common serum traits and anthropometric measures. Scoring all individuals in UKB for NAFLD risk resulted in 28,396 NAFLD cases and 108,652 healthy individuals at a >90% confidence level. Using this imputed NAFLD status to perform the largest NAFLD genome-wide association study (GWAS) to date, we identified 94 independent (R² < 0.2) NAFLD GWAS loci, of which 90 have not been identified before; built a polygenic risk score (PRS) model to predict the genetic risk of NAFLD; and used the GWAS variants of imputed NAFLD for a tissue-aware Mendelian randomization analysis that discovered a significant causal effect of NAFLD on coronary artery disease (CAD). In summary, we accurately estimated the NAFLD status in UKB using common serum traits and anthropometric measures, which empowered us to identify 90 novel NAFLD GWAS loci, build a NAFLD PRS, and discover a significant causal effect of NAFLD on CAD.
Keywords: GWAS; Mendelian randomization; NAFLD; UK Biobank; polygenic risk score
Year: 2021 PMID: 35047847 PMCID: PMC8756520 DOI: 10.1016/j.xhgg.2021.100056
Source DB: PubMed Journal: HGG Adv ISSN: 2666-2477
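The abstract defines cases and healthy individuals "at a >90% confidence level". As a minimal sketch of what such a cutoff could look like, the snippet below assumes each individual has a predicted NAFLD probability and that individuals between the two cutoffs are left unlabeled. The function name `label_by_confidence` and the symmetric 0.9/0.1 cutoffs are illustrative assumptions, not the authors' documented procedure.

```python
def label_by_confidence(probabilities, threshold=0.9):
    """Label individuals from predicted NAFLD probabilities.

    Assumption: 'case' if probability > threshold, 'control' if
    probability < 1 - threshold, otherwise None (excluded as uncertain).
    """
    labels = []
    for p in probabilities:
        if p > threshold:
            labels.append("case")
        elif p < 1 - threshold:
            labels.append("control")
        else:
            labels.append(None)
    return labels


# Hypothetical predicted probabilities for three individuals.
print(label_by_confidence([0.95, 0.50, 0.03]))
```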
Effect sizes (betas) estimated in the NAFLDS and NAFLDS_simple models
| Predictor | NAFLDS | NAFLDS_simple |
| --- | --- | --- |
| GGT | 0.0138 | 0.0144 |
| BMI | 0.0395 | 0.0479 |
| Waist | 0.0606 | 0.0714 |
| ALT | 0.0089 | 0.0125 |
| AST | 0.0373 | 0.0346 |
| HbA1c | 0.0360 | NA |
| AST/ALT | −0.1299 | −0.1794 |
| TG | 0.3499 | NA |
| Cholesterol | −0.2850 | NA |
| Albumin | −0.0035 | NA |
| Age | −0.1470 | −0.1722 |
| Age² | 0.0015 | 0.0018 |
| Sex | −1.0252 | −0.9153 |
| T2D | 0.4123 | NA |
The predictors are ranked by their importance in the random forest estimation model. NA indicates not applicable.
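To illustrate how betas like these combine into a single score, the sketch below computes a weighted sum of predictor values. It assumes that the second numeric column (the one with NA entries) is NAFLDS_simple, that the score is a plain linear predictor, and that no intercept applies, since none is reported in this table. Units, the coding of Sex, and any link function are not stated here, so the input values are purely hypothetical.

```python
# Betas copied from the second numeric column of the table above,
# assumed here to be the NAFLDS_simple model.
NAFLDS_SIMPLE_BETAS = {
    "GGT": 0.0144,
    "BMI": 0.0479,
    "Waist": 0.0714,
    "ALT": 0.0125,
    "AST": 0.0346,
    "AST/ALT": -0.1794,
    "Age": -0.1722,
    "Age2": 0.0018,
    "Sex": -0.9153,
}


def linear_predictor(values, betas, intercept=0.0):
    """Return intercept + sum(beta_i * x_i) over all predictors in `betas`.

    The intercept defaults to 0.0 because the table does not report one.
    """
    return intercept + sum(betas[name] * values[name] for name in betas)


# Hypothetical individual: every predictor zero except BMI = 30.
example = {name: 0.0 for name in NAFLDS_SIMPLE_BETAS}
example["BMI"] = 30.0
print(linear_predictor(example, NAFLDS_SIMPLE_BETAS))
```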
Figure 1. ROC and PRC plots show that NAFLDS outperformed the existing NAFLD predictors
(A) As demonstrated by an ROC curve, NAFLDS outperformed FLI and HSI by achieving higher AUCs.
(B) As demonstrated by a PRC plot, NAFLDS and NAFLDS_simple outperformed FLI and HSI and achieved higher AUCs.
(C) In the ROC plot, NAFLDS outperforms the key predictors, ALT, GGT, BMI, and waist circumference.
(D) In the PRC plot, NAFLDS outperforms the key predictors, ALT, GGT, BMI, and waist circumference.
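The ROC comparisons in this figure rest on the area under the curve (AUC). As a minimal sketch of that quantity, the function below computes the ROC AUC as the probability that a randomly chosen case scores higher than a randomly chosen control, counting ties as one half. The function name and the toy scores are illustrative assumptions; the source does not state which software was used for the curves.

```python
def roc_auc(scores, labels):
    """ROC AUC via pairwise comparison of case and control scores.

    labels: 1 for cases, 0 for controls. Ties count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example with two controls and two cases.
print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

The pairwise loop is quadratic in the sample size, so at biobank scale a rank-based implementation would be the practical choice.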
Seven of the previously identified NAFLD GWAS loci were observed in our imputed NAFLD status (n = 136,840) GWAS analyses at the genome-wide significant level (p < 5E−8) or subgenome-wide significant level (p < 5E−5)
| Chr | Gene/loci | Variant | Beta | p value | Significance level |
| --- | --- | --- | --- | --- | --- |
| 1 | MARC1 (MIM: | rs2642438 | −1.09E−3 | 0.28 | – |
| 2 | GCKR (MIM: | rs1260326 | 9.08E−3 | 2.00E−22 | genome |
| 2 | GCKR | rs780094 | 8.47E−3 | 1.70E−19 | genome |
| 4 | HSD17B13 (MIM: | rs9992651 | 5.23E−3 | 5.40E−07 | subgenome |
| 7 | – | rs343062 | −8.17E−4 | 3.80E−01 | – |
| 8 | PPP1R3B (MIM: | rs4240624 | 3.90E−3 | 1.30E−02 | nominal |
| 16 | ZFP90-CDH1 (MIM: | rs698718 | −1.70E−3 | 1.10E−01 | – |
| 19 | NCAN (MIM: | rs2228603 | −8.47E−3 | 8.60E−07 | subgenome |
| 19 | TM6SF2 (MIM: | rs58542926 | −9.55E−3 | 2.70E−08 | genome |
| 22 | SAMM50 (MIM: | rs3761472 | −8.97E−3 | 9.80E−13 | genome |
| 22 | SAMM50 | rs2143571 | −5.73E−3 | 1.40E−06 | subgenome |
| 22 | PNPLA3 (MIM: | rs738409 | −1.19E−2 | 3.60E−27 | genome |
| 22 | IL17RA (MIM: | rs5748926 | 1.13E−05 | 9.90E−01 | – |
| 22 | PARVB (MIM: | rs5764455 | −2.33E−04 | 8.00E−01 | – |
| 1 | LYPLAL1 (MIM: | rs12137855 | 1.85E−3 | 9.80E−02 | – |
| 2 | FABP1 (MIM: | rs72943235 | −7.26E−4 | 8.70E−01 | – |
| 8 | TRIB1 (MIM: | rs2980888 | 1.00E−2 | 7.30E−24 | genome |
| 8 | TRIB1 | rs2954021 | 1.06E−2 | 3.50E−31 | genome |
| 8 | FDFT1 (MIM: | rs2645424 | 1.53E−4 | 8.70E−01 | – |
| 19 | MBOAT7 (MIM: | rs641738 | 9.22E−4 | 2.6E−03 | nominal |
| 17 | GRB2 (MIM: | rs5015881 | −7.99E−3 | 1.60E−08 | genome |
Loci were previously identified in NAFLD GWASs.4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
The Gene/loci column shows the nearest gene of the identified NAFLD variant.
This variant was derived from a previous study that performed a meta-analysis of rs641738 instead of a full GWAS.
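The significance labels in the table follow from the p value thresholds in its caption. The sketch below reproduces that labeling, using the stated cutoffs for the genome-wide (p < 5E−8) and subgenome-wide (p < 5E−5) levels. The cutoff of p < 0.05 for "nominal" is an assumption inferred from the rows so labeled, and the function name is hypothetical.

```python
def significance_level(p_value):
    """Map a GWAS p value to the label used in the table above.

    Returns None where the table shows no level.
    """
    if p_value < 5e-8:
        return "genome"
    if p_value < 5e-5:
        return "subgenome"
    if p_value < 0.05:  # assumed cutoff for 'nominal'
        return "nominal"
    return None


# p values taken from the table: GCKR rs1260326, HSD17B13 rs9992651,
# PPP1R3B rs4240624, and MARC1 rs2642438.
for p in (2.00e-22, 5.40e-07, 1.30e-02, 0.28):
    print(p, significance_level(p))
```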
The number of GWAS variants shared by the imputed NAFLD status and predictors
| Shared SNPs | 3,366 | 1,669 | 336 | 3,360 | 3,760 | 4,890 |
| Percentage | 64.89% | 32.18% | 6.48% | 64.77% | 72.49% | 94.27% |
HbA1c, hemoglobin A1c; BMI, body mass index; ALT, alanine aminotransferase; GGT, gamma-glutamyl transpeptidase.
Figure 2. ORs of NAFLD for each PRS decile compared with individuals in the lowest 10% of the NAFLD PRS
The error bars show the 95% confidence interval of the estimated OR. The x axis shows the 10 deciles of the NAFLD PRS. The annotation box indicates the result of comparing the inverse normal transformed PRSs between the NAFLD cases and control subjects using a Student’s t test.
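As a worked illustration of an OR with a 95% confidence interval for one decile against the reference decile, the sketch below uses a 2×2 table and a Wald interval on the log-odds scale. This is one standard way to obtain such an interval and is an assumption here; the caption does not say how the ORs were estimated (a regression model is equally plausible). The counts are hypothetical.

```python
import math


def odds_ratio_ci(cases_dec, controls_dec, cases_ref, controls_ref, z=1.96):
    """Odds ratio of a decile vs. the reference decile, with a Wald CI.

    Returns (odds_ratio, lower, upper); z=1.96 gives a ~95% interval.
    """
    odds_ratio = (cases_dec * controls_ref) / (controls_dec * cases_ref)
    se_log_or = math.sqrt(
        1 / cases_dec + 1 / controls_dec + 1 / cases_ref + 1 / controls_ref
    )
    log_or = math.log(odds_ratio)
    return (
        odds_ratio,
        math.exp(log_or - z * se_log_or),
        math.exp(log_or + z * se_log_or),
    )


# Hypothetical counts: 40 cases / 60 controls in the top decile,
# 20 cases / 80 controls in the lowest (reference) decile.
print(odds_ratio_ci(40, 60, 20, 80))
```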
Figure 3. Workflow for combining liver/coronary artery cis-eQTLs and UKB GWAS variants in a tissue-aware, bi-directional MR analysis between imputed NAFLD and CAD
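For readers unfamiliar with the final MR step, the sketch below shows the inverse-variance weighted (IVW) estimator, a common way to combine per-variant effects into one causal estimate. The choice of IVW is an assumption made for illustration; the caption does not name the estimator, and the tissue-aware variant selection (restricting instruments to cis-eQTLs of the relevant tissue) is not modeled. All input numbers are hypothetical.

```python
def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the exposure-to-outcome effect.

    beta_exposure: per-variant effects on the exposure (e.g. imputed NAFLD)
    beta_outcome:  per-variant effects on the outcome (e.g. CAD)
    se_outcome:    standard errors of the outcome effects
    Returns (estimate, standard_error).
    """
    numerator = sum(
        bx * by / se**2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx**2 / se**2 for bx, se in zip(beta_exposure, se_outcome)
    )
    return numerator / denominator, (1.0 / denominator) ** 0.5


# Hypothetical summary statistics for two instruments.
estimate, se = ivw_estimate([0.1, 0.2], [0.05, 0.10], [0.01, 0.01])
print(estimate, se)
```

For the reverse direction of a bi-directional analysis, the same function would be applied with instruments selected for CAD and with the exposure and outcome roles swapped.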