Xianyong Yin, Lap Sum Chan, Debraj Bose, Anne U Jackson, Peter VandeHaar, Adam E Locke, Christian Fuchsberger, Heather M Stringham, Ryan Welch, Ketian Yu, Lilian Fernandes Silva, Susan K Service, Daiwei Zhang, Emily C Hector, Erica Young, Liron Ganel, Indraniel Das, Haley Abel, Michael R Erdos, Lori L Bonnycastle, Johanna Kuusisto, Nathan O Stitziel, Ira M Hall, Gregory R Wagner, Jian Kang, Jean Morrison, Charles F Burant, Francis S Collins, Samuli Ripatti, Aarno Palotie, Nelson B Freimer, Karen L Mohlke, Laura J Scott, Xiaoquan Wen, Eric B Fauman, Markku Laakso, Michael Boehnke.
Abstract
Few studies have explored the impact of rare variants (minor allele frequency < 1%) on highly heritable plasma metabolites identified in metabolomic screens. The Finnish population provides an ideal opportunity for such explorations, given the multiple bottlenecks and expansions that have shaped its history, and the enrichment for many otherwise rare alleles that has resulted. Here, we report genetic associations for 1391 plasma metabolites in 6136 men from the late-settlement region of Finland. We identify 303 novel association signals, more than one third at variants rare or enriched in Finns. Many of these signals identify genes not previously implicated in metabolite genome-wide association studies and suggest mechanisms for diseases and disease-related traits.
Year: 2022 PMID: 35347128 PMCID: PMC8960770 DOI: 10.1038/s41467-022-29143-5
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Flow chart of the METSIM metabolomics study.
MAC: minor allele count; SPIP and VPIP: signal and variant posterior inclusion probability in DAP-g Bayesian fine mapping; RCP: regional colocalization posterior probability in FastENLOC; PTV: protein-truncating variant; VMA: vanillylmandelate.
Fig. 2 Our METSIM Metabolomics PheWeb facilitates the characterization of genetic associations and gene activities.
a Manhattan plot for N-acetylkynurenine highlights the roles of the associated genes (https://pheweb.org/metsim-metab/pheno/C100006378). Chemical structure for N-acetylkynurenine (in bold face) and activities for the associated genes are added manually on top of the Manhattan plot. b Stacked PheWeb plots show significant associations between rs6705977 (NAT8, https://pheweb.org/metsim-metab/variant/2:73622043-C-G) and fifteen N-acetylated molecules, and the more restricted set of associations between rs948445 (ACY3, https://pheweb.org/metsim-metab/variant/11:67647021-C-T) and four N-acetylated aromatic amino acids. LI: lipid; XE: xenobiotics; AA: amino acid; CA: carbohydrate; NU: nucleotide; PE: peptide; CV: cofactor and vitamin; EN: energy; PC: partially characterized; UN: unnamed.
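The PheWeb links in this caption follow a visible pattern: a phenotype page keyed by a metabolite code, and a variant page keyed by `chromosome:position-ref-alt`. The sketch below builds such links. The pattern is inferred only from the three URLs quoted above, not from PheWeb documentation, and the helper names `pheno_url` and `variant_url` are hypothetical.

```python
# Base path taken from the URLs quoted in the Fig. 2 caption.
BASE = "https://pheweb.org/metsim-metab"

def pheno_url(pheno_code):
    """Link to a metabolite page, e.g. code 'C100006378' (N-acetylkynurenine)."""
    return f"{BASE}/pheno/{pheno_code}"

def variant_url(chrom, pos, ref, alt):
    """Link to a variant page; the 'chrom:pos-ref-alt' layout is an inferred pattern."""
    return f"{BASE}/variant/{chrom}:{pos}-{ref}-{alt}"

# Reproduce the two variant links given in the caption.
nat8_link = variant_url(2, 73622043, "C", "G")    # rs6705977 (NAT8)
acy3_link = variant_url(11, 67647021, "C", "T")   # rs948445 (ACY3)
print(nat8_link)
print(acy3_link)
```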
Summary of the 2030 genetic association signals by metabolite biochemical class.
| Biochemical class and abbreviation | Total metabolites | Significant metabolites | Total signals | Novel signals |
|---|---|---|---|---|
| Lipid (LI) | 548 | 357 | 903 | 74 |
| Amino acid (AA) | 215 | 154 | 441 | 73 |
| Xenobiotics (XE) | 163 | 52 | 91 | 16 |
| Nucleotide (NU) | 42 | 26 | 65 | 29 |
| Peptide (PE) | 42 | 18 | 28 | 7 |
| Cofactors and vitamins (CV) | 38 | 25 | 69 | 27 |
| Carbohydrate (CA) | 25 | 20 | 38 | 7 |
| Energy (EN) | 10 | 4 | 11 | 3 |
| Partially characterized (PC) | 16 | 8 | 20 | 1 |
| Unnamed (UN) | 292 | 139 | 364 | 66 |
| Total | 1,391 | 803 | 2,030 | 303 |
Significant metabolites: number of metabolites with at least one association signal at P < 7.2 × 10⁻¹¹.
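As a quick consistency check on the table above, the per-class rows can be summed and compared with the Total row, and the share of novel signals can be computed per class. This is only an illustrative tally of the numbers printed in the table; the names `CLASS_COUNTS`, `column_totals`, and `novel_fraction` are hypothetical.

```python
# Counts transcribed from the table, keyed by class abbreviation:
# (total metabolites, significant metabolites, total signals, novel signals)
CLASS_COUNTS = {
    "LI": (548, 357, 903, 74),
    "AA": (215, 154, 441, 73),
    "XE": (163, 52, 91, 16),
    "NU": (42, 26, 65, 29),
    "PE": (42, 18, 28, 7),
    "CV": (38, 25, 69, 27),
    "CA": (25, 20, 38, 7),
    "EN": (10, 4, 11, 3),
    "PC": (16, 8, 20, 1),
    "UN": (292, 139, 364, 66),
}

def column_totals(counts):
    """Sum each of the four columns across all biochemical classes."""
    return tuple(sum(column) for column in zip(*counts.values()))

def novel_fraction(counts):
    """Fraction of each class's signals that are novel (novel / total signals)."""
    return {name: row[3] / row[2] for name, row in counts.items()}

print(column_totals(CLASS_COUNTS))  # (1391, 803, 2030, 303), matching the Total row
fractions = novel_fraction(CLASS_COUNTS)
print(max(fractions, key=fractions.get))  # NU
```

By this tally, nucleotides carry the largest share of novel signals (29 of 65), although lipids and amino acids contribute the most novel signals in absolute terms.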
Fig. 3 Characterization of the 2030 significant metabolite genetic association signals.
Comparison of MAFs for the 1143 index variants between METSIM and non-Finnish Europeans in gnomAD v3.1; index variants are colored (a) purple if their MAF is > 10-fold greater in METSIM than in non-Finnish Europeans, or (b) blue if they represent novel association signals. The dashed line has slope one and passes through the origin. c Overlaid Manhattan plots of the 1391 metabolite GWAS. The red dashed line depicts the genome-wide significance threshold, P = 7.2 × 10⁻¹¹. The associations at 40 novel putative causal genes within novel regions (blue) and 18 novel putative causal genes within previously reported regions (maize) are highlighted. The seven novel putative causal genes implicated only by fine-mapping analysis are starred. HADHA/B represents the HADHA and HADHB genes and ARSD/L the ARSD and ARSL genes.
Fig. 4 The 1952 of the 2030 metabolite genetic association signals from stepwise conditional tests that have SPIP ≥ 0.95 in DAP-g Bayesian fine mapping.
a Numbers of variants in the 95% credible sets and distribution of variant posterior inclusion probabilities (VPIPs) for the most likely causal variants within the 95% credible sets. b Density plot of the largest VPIPs shows that variants with > 10-fold greater frequency in METSIM than in non-Finnish Europeans (gnomAD v3.1; blue) have larger VPIPs than all other variants (maize).
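A 95% credible set is commonly built by ranking the variants in a signal by posterior inclusion probability and adding them until the cumulative probability reaches 0.95. The sketch below shows that generic construction. It is not DAP-g's own implementation; the function name `credible_set` and the toy VPIP values are assumptions made for illustration.

```python
def credible_set(vpips, coverage=0.95):
    """Return (variant IDs, cumulative VPIP) for the smallest top-ranked set
    whose VPIPs sum to at least `coverage`.

    `vpips` maps variant ID -> VPIP within one signal. If the VPIPs never
    reach `coverage`, every variant is returned.
    """
    ranked = sorted(vpips.items(), key=lambda item: item[1], reverse=True)
    chosen, total = [], 0.0
    for variant, pip in ranked:
        chosen.append(variant)
        total += pip
        if total >= coverage:
            break
    return chosen, total

# Toy signal (made-up numbers): one dominant variant plus weaker neighbours.
toy_signal = {"var_a": 0.90, "var_b": 0.06, "var_c": 0.03, "var_d": 0.01}
members, mass = credible_set(toy_signal)
print(members)  # ['var_a', 'var_b']
```

The largest VPIP in each set (here 0.90) is the quantity whose distribution panel b compares between Finnish-enriched and other variants.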
Top metabolite genetic association signals at 47 novel variants with MAF > 10-fold greater in METSIM than in gnomAD v3.1 non-Finnish Europeans.
| Biochemical name | rsID | MAF(%) | MAFNFE(%) | RMAF | β | P | Gene |
|---|---|---|---|---|---|---|---|
| X – 17676 | rs200711248 | 0.321 | 0.009 | 34.5 | 1.18 | 5.40E−11 | |
| Uracil | rs1254152519 | 0.613 | 0.002 | 395.2 | 1.33 | 5.39E−28 | |
| palmitoyl dihydrosphingomyelin (d18:0/16:0) | rs752521494 | 0.190 | 0.000 | ∞ | 1.91 | 2.53E−14 | |
| X – 12127 | rs189344406 | 0.302 | 0.006 | 48.7 | −1.75 | 3.27E−14 | |
| Campesterol | rs1247627279 | 0.362 | 0.003 | 116.8 | 1.11 | 5.26E−11 | |
| Xanthurenate | rs199546957 | 0.151 | 0.000 | ∞ | 2.07 | 4.96E−18 | |
| hydantoin-5-propionate | rs144419430 | 0.928 | 0.015 | 59.9 | −0.76 | 2.16E−13 | Unknown |
| Glycerate | rs192756070 | 2.917 | 0.022 | 134.5 | −1.07 | 8.04E−85 | |
| X – 15666 | rs140758280 | 4.073 | 0.376 | 10.8 | −0.42 | 1.40E−19 | |
| X – 24475 | rs202158371 | 0.814 | 0.012 | 65.6 | −0.91 | 9.61E−17 | |
| methyl glucopyranoside (alpha + beta) | rs186284085 | 1.370 | 0.008 | 176.7 | 0.69 | 9.41E−20 | |
| X – 12844 | rs200280202 | 1.436 | 0.034 | 42.1 | −0.83 | 5.21E−26 | |
| N-acetylglucosaminylasparagine | rs561604250 | 0.668 | 0.008 | 86.3 | 0.97 | 1.16E−14 | |
| X – 24544 | rs141884785 | 1.320 | 0.073 | 18.1 | 0.76 | 1.74E−22 | |
| Choline | rs200164783 | 2.634 | 0.136 | 19.3 | 0.70 | 2.32E−33 | Unknown |
| Serine | rs1297328831 | 0.162 | 0.002 | 104.6 | −1.64 | 6.82E−13 | Unknown |
| N-acetylhistidine | rs146438324 | 0.946 | 0.057 | 16.5 | 1.51 | 1.71E−61 | |
| Sulfate | rs138989506 | 1.910 | 0.029 | 64.9 | −1.41 | 1.55E−99 | |
| X – 26054 | rs976212663 | 0.231 | 0.019 | 12.4 | −2.08 | 3.52E−15 | |
| 3-(3-amino-3-carboxypropyl)uridine | rs149926554 | 2.898 | 0.175 | 16.6 | 0.43 | 7.97E−14 | Unknown |
| 5-oxoproline | rs782359519 | 0.854 | 0.019 | 45.8 | 0.92 | 1.43E−24 | |
| alpha-ketoglutarate | rs191616586 | 3.113 | 0.195 | 15.9 | −0.45 | 7.65E−17 | Unknown |
| 5-oxoproline | rs558946866 | 0.442 | 0.034 | 13.0 | 0.99 | 4.54E−15 | |
| 3beta-hydroxy-5-cholestenoate | rs552968665 | 0.245 | 0.003 | 79.1 | 1.55 | 8.62E−16 | Unknown |
| 3-amino-2-piperidone | rs121965043 | 0.346 | 0.003 | 111.7 | 1.91 | 3.73E−35 | |
| Deoxycarnitine | rs1268699195 | 0.577 | 0.002 | 372.3 | 0.90 | 1.25E−12 | Unknown |
| beta-citrylglutamate | rs182295429 | 4.615 | 0.064 | 72.6 | 0.52 | 3.73E−28 | |
| Betaine | rs1358634021 | 0.093 | 0.005 | 20.0 | 2.43 | 1.56E−16 | |
| Aspartate | rs1209353188 | 0.578 | 0.000 | ∞ | 0.91 | 9.92E−14 | |
| Succinylcarnitine (C4-DC) | rs200127857 | 0.342 | 0.005 | 73.6 | 2.20 | 2.37E−45 | |
| Succinylcarnitine (C4-DC) | rs200480788 | 0.104 | 0.005 | 22.4 | 2.40 | 3.96E−19 | |
| 5-hydroxylysine | rs201135688 | 4.513 | 0.447 | 10.1 | 0.39 | 6.83E−22 | |
| X – 17676 | rs185603444 | 1.983 | 0.184 | 10.8 | −0.60 | 5.56E−16 | |
| Orotidine | rs201899452 | 0.143 | 0.002 | 92.3 | −1.80 | 2.41E−13 | |
| N-formylanthranilic acid | rs77585764 | 5.407 | 0.378 | 14.3 | 1.24 | 9.55E−218 | |
| Sphingomyelin (d18:1/18:1, d18:2/18:0) | rs527480139 | 0.330 | 0.000 | ∞ | −1.41 | 2.51E−19 | |
| Glycosyl-N-stearoyl-sphingosine (d18:1/18:0) | rs1013893365 | 2.956 | 0.046 | 64.2 | 1.25 | 2.17E−106 | |
| X – 11315 | rs201742362 | 1.134 | 0.033 | 34.9 | 1.43 | 1.62E−61 | |
| 2-O-methylascorbic acid | rs6267 | 5.778 | 0.142 | 40.6 | −0.82 | 9.49E−108 | |
| 2-O-methylascorbic acid | rs199637204 | 0.098 | 0.005 | 21.1 | −1.87 | 1.49E−12 | |
| Gamma-tocopherol/beta-tocopherol | rs182488695 | 1.548 | 0.008 | 199.9 | 0.89 | 7.37E−33 | |
| N6-succinyladenosine | rs8192461 | 1.175 | 0.073 | 16.1 | 1.12 | 9.38E−27 | |
| N6-succinyladenosine | rs773404017 | 0.414 | 0.006 | 66.8 | 2.55 | 2.30E−44 | |
| 5-methyluridine (ribothymidine) | rs548223694 | 0.448 | 0.039 | 11.6 | 1.36 | 9.47E−27 | |
| 5-methyluridine (ribothymidine) | rs756647111 | 0.119 | 0.000 | ∞ | 2.04 | 2.35E−14 | |
| 2’-deoxyuridine | rs556167510 | 0.601 | 0.042 | 14.4 | 1.07 | 2.45E−14 | |
| Tiglylcarnitine (C5:1-DC) | rs201378370 | 2.584 | 0.034 | 76.7 | 0.94 | 5.21E−122 | |
Biochemical name: biochemical name of the metabolite. rsID: dbSNP variant ID. MAF, MAFNFE, and RMAF: minor allele frequency in METSIM, in gnomAD v3.1 non-Finnish Europeans, and their ratio. When the index variant is monomorphic in gnomAD v3.1 non-Finnish Europeans (n = 34,029), the ratio is labeled as infinite, ∞. β: effect size estimate from the metabolite-specific stepwise conditional association test. P: p-value of metabolite-specific stepwise conditional association test. Gene: the putative causal gene(s) nominated in the knowledge-based approach. Gene symbols are italic. For unnamed metabolites, the putative causal gene results are represented by "–". If no putative causal gene is nominated, it is labeled as “unknown”. If multiple putative causal genes are nominated, they are separated by a vertical bar.
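The RMAF column defined in this footnote is a ratio with one special case: it is reported as infinite when the variant is monomorphic in the reference population. A minimal sketch of that rule follows, with hypothetical function names `rmaf` and `is_enriched`. Ratios recomputed from the rounded percentages in the table will differ slightly from the printed RMAF values, presumably because the table was computed from unrounded frequencies (an assumption, not stated in the source).

```python
import math

def rmaf(maf_metsim, maf_nfe):
    """Ratio of MAF in METSIM to MAF in gnomAD non-Finnish Europeans.

    Returns infinity when the variant is monomorphic (MAF = 0) in the
    reference population, following the table footnote.
    """
    if maf_nfe == 0:
        return math.inf
    return maf_metsim / maf_nfe

def is_enriched(maf_metsim, maf_nfe, fold=10.0):
    """True when the METSIM MAF is more than `fold` times the reference MAF."""
    return rmaf(maf_metsim, maf_nfe) > fold

# Rounded percentages from two table rows; both frequencies share the same
# unit (%), so the ratio is unit-free.
print(rmaf(0.190, 0.000))         # inf (palmitoyl dihydrosphingomyelin row)
print(is_enriched(2.917, 0.022))  # True (glycerate row)
```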
Fig. 5 Colocalization and causal relationship between campesterol and gallstones.
a Stacked regional association plots for campesterol and gallstones (cholelithiasis, K11_CHOLELITH in FinnGen release 4) in the ABCG5/ABCG8 region. The index variants identified in stepwise conditional analysis (campesterol) and approximate conditional analysis (gallstones) are labeled, and variants are colored by their linkage disequilibrium (LD) to the index variant with which they are in strongest LD in METSIM. The campesterol signal (index variant rs6544713) is colocalized with the gallstone signal (rs4299376, pairwise LD r² = 0.993, RCP = 0.65) shown in the gray box. In contrast, no colocalization was detected between the signals indexed by rs4614977 and rs11887534. No coding variants within 1 Mb have LD r² > 0.2 with rs6544713 in METSIM. b Comparison of effect sizes for the 15 genome-wide instrumental variables without significant heterogeneity (P > 0.05) used in Mendelian randomization analysis between campesterol and gallstones. rs6544713 is in blue. The slope of the blue dashed line depicts the estimated causal effect size of campesterol on gallstones. The Egger regression intercept is deemed not significant (P = 0.15). c Negative relationship between instrumental variable and risk of gallstones. OR: odds ratio.
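Panel b plots per-variant effects on the exposure (campesterol) against effects on the outcome (gallstones); the slope of a line fitted through such points is the causal effect estimate, and an Egger intercept near zero argues against directional pleiotropy. The sketch below shows two standard estimators, inverse-variance weighted (IVW) and MR-Egger, in plain Python. The caption does not state which estimator produced the dashed line, so IVW here is an assumption, and the input numbers are synthetic, not taken from the study.

```python
def orient(bx, by):
    """Flip allele coding so every exposure effect is non-negative (Egger convention)."""
    pairs = [(-x, -y) if x < 0 else (x, y) for x, y in zip(bx, by)]
    return [p[0] for p in pairs], [p[1] for p in pairs]

def ivw_estimate(bx, by, se_y):
    """Weighted regression of outcome on exposure effects through the origin."""
    w = [1.0 / s ** 2 for s in se_y]
    num = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    den = sum(wi * x * x for wi, x in zip(w, bx))
    return num / den

def egger_regression(bx, by, se_y):
    """Weighted regression with an intercept; returns (intercept, slope)."""
    bx, by = orient(bx, by)
    w = [1.0 / s ** 2 for s in se_y]
    sw = sum(w)
    sx = sum(wi * x for wi, x in zip(w, bx))
    sy = sum(wi * y for wi, y in zip(w, by))
    sxx = sum(wi * x * x for wi, x in zip(w, bx))
    sxy = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    slope = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    intercept = (sy - slope * sx) / sw
    return intercept, slope

# Synthetic instruments built so that outcome effect = -0.5 * exposure effect.
exposure_beta = [0.10, 0.20, -0.30, 0.40]
outcome_beta = [-0.05, -0.10, 0.15, -0.20]
outcome_se = [0.02, 0.03, 0.02, 0.04]

print(ivw_estimate(exposure_beta, outcome_beta, outcome_se))
print(egger_regression(exposure_beta, outcome_beta, outcome_se))
```

With these synthetic inputs both estimators recover a slope of about -0.5 and the Egger intercept is about zero, the pattern expected when the instruments act on the outcome only through the exposure.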