Apolline Gallois, Joel Mefford, Arthur Ko, Amaury Vaysse, Hanna Julienne, Mika Ala-Korpela, Markku Laakso, Noah Zaitlen, Päivi Pajukanta, Hugues Aschard.
Abstract
Genetic studies of metabolites have identified thousands of variants, many of which are associated with downstream metabolic and obesogenic disorders. However, these studies have relied on univariate analyses, reducing power and limiting context-specific understanding. Here we aim to provide an integrated perspective of the genetic basis of metabolites by leveraging the Finnish Metabolic Syndrome In Men (METSIM) cohort, a unique genetic resource which contains metabolic measurements, mostly lipids, across distinct time points as well as information on statin usage. We increase effective sample size by an average of two-fold by applying the Covariates for Multi-phenotype Studies (CMS) approach, identifying 588 significant SNP-metabolite associations, including 228 new associations. Our analysis pinpoints a small number of master metabolic regulator genes, balancing the relative proportion of dozens of metabolite levels. We further identify associations to changes in metabolic levels across time as well as genetic interactions with statin at both the master metabolic regulator and genome-wide level.
Year: 2019 PMID: 31636271 PMCID: PMC6803661 DOI: 10.1038/s41467-019-12703-7
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Region-metabolite associations. Distribution of the 588 significant associations (P < 1.28 × 10⁻⁹) identified in the 158 metabolite GWAS in the METSIM cohort. a Regions in dark green were significant for standard linear regression adjusted for confounding factors. Regions in red were significant for linear regression adjusted for confounding factors and covariates selected by CMS. Regions in light green were significant for both models. b Same plot including only the 228 new associations, not identified in previous metabolite GWAS
New gene–metabolite associations
| Chr | Geneᵃ | Position | SNPᵇ | A1 | A2 | Associated metabolites | Opposite association |
|---|---|---|---|---|---|---|---|
| 1 | PCSK9 | 55,505,647 | rs11591147 | G | T | IDL_CE, L_LDL_TG, M_LDL_FC, Remnant_C, S_LDL_FC, S_VLDL_CE, VLDL_C, XL_HDL_FC, XS_VLDL_C/CE/FC, XXL_VLDL_CE | M_HDL_C/CE/P/PL, S_HDL_PL |
| 1 | DOCK7 | 63,056,112 | rs1748197 | G | A | HDL_TG, MUFA, M_HDL_TG, PC, PUFA, TotCho, TotFA, XXL_VLDL_CE | |
| 1 | CELSR2 | 109,818,530 | rs646776 | T | C | M_LDL_FC, S_LDL_FC/PL | |
| 1 | PSRC1 | 109,822,166 | rs599839 | A | G | S_LDL_CE | |
| 1 | GALNT2 | 230,294,916 | rs2144300 | C | T | ApoB, L_VLDL_*, M_VLDL_*, S_VLDL_FC/L/P/PL/TG, TG_PG, VLDL_C/D/TG, XL_VLDL_P/TG | M_HDL_PL, S_HDL_PL |
| 2 | APOB | 21,225,281 | rs1042034 | T | C | TotFA | |
| 2 | GCKR | 27,730,940 | rs1260326 | T | C | Remnant_C, TG_PG, VLDL_C/TG, XL_VLDL_CE/FC | L_HDL_PL |
| 3 | PROK2 | 71,880,578 | rs7622817 | G | A | Serum_C | |
| 4 | CHIC2 | 54,714,868 | rs17083590 | G | A | XS_VLDL_CE | |
| 4 | UTP3 | 71,552,398 | rs16845383 | A | G | Alb | |
| 5 | MARCH3 | 126,267,351 | rs12655258 | C | T | HDL2_C | |
| 5 | MIR4634 | 174,223,234 | rs12660057 | G | A | M_HDL_L | |
| 6 | MICB | 31,236,410 | rs34131062 | T | C | S_VLDL_TG, VLDL_TG, XS_VLDL_TG | |
| 6 | MIR3925 | 36,613,812 | rs6457931 | G | T | XL_HDL_L | |
| 8 | LPL | 19,832,646 | rs17482753 | G | T | ApoB, HDL_TG, MUFA, SFA, TG_PG, TotFA, VLDL_C/TG | |
| 8 | TRIB1 | 126,485,531 | rs7846466 | T | C | L_VLDL_L, MUFA, Remnant_C, VLDL_C, XL_VLDL_C/CE/L, XXL_VLDL_C/CE/FC | |
| 10 | PCDH15 | 56,015,656 | rs11004183 | G | A | IDL_C/FC/L/P | |
| 10 | PKD2L1 | 102,075,479 | rs603424 | G | A | MUFA_FA | |
| 11 | CELF1 | 47,539,697 | rs4752845 | T | C | ApoA1 | XXL_VLDL_P |
| 11 | PTPMT1 | 47,583,121 | rs12798346 | C | T | HDL_D, L_HDL_P, XL_HDL_PL | |
| 11 | MTCH2 | 47,663,049 | rs10838738 | G | A | TG_PG | |
| 11 | MYRF | 61,551,356 | rs174535 | C | T | PUFA_FA | S_HDL_TG |
| 11 | TMEM258 | 61,557,803 | rs102275 | C | T | MUFA, MUFA_FA | HDL2_C |
| 11 | FADS1 | 61,569,830 | rs174546 | C | T | EstC, FAw3_FA, UnSat, XS_VLDL_L | M_VLDL_FC |
| 11 | FADS2 | 61,597,972 | rs1535 | G | A | DHA_FA, SM, XS_VLDL_FC | LA_FA, M_VLDL_P, XL_VLDL_TG |
| 11 | FADS3 | 61,639,573 | rs174448 | G | A | M_VLDL_PL | |
| 11 | CPT1A | 68,562,328 | rs17610395 | C | T | DHA, DHA_FA, FAw3, FAw3_FA | |
| 11 | APOA5 | 116,660,686 | rs2266788 | G | A | HDL_TG, Ile, M_HDL_TG, PUFA, Remnant_C, SFA, S_VLDL_CE, TG_PG, VLDL_C/TG, XS_VLDL_FC, XXL_VLDL_C/CE | |
| 12 | HNF1A | 121,420,260 | rs7979473 | G | A | M_LDL_P | |
| 13 | LINC02296 | 87,773,653 | rs17123289 | G | A | FreeC | |
| 15 | LOC283665 | 58,380,442 | rs12910902 | T | C | LDL_TG, L_HDL_L, L_LDL_TG | |
| 15 | LIPC | 58,683,366 | rs1532085 | A | G | HDL2_C, HDL3_C, HDL_TG, IDL_CE, LDL_TG, L_HDL_TG, L_LDL_L/TG, MUFA_FA, M_HDL_L/TG, M_LDL_L/TG, PUFA, Remnant_C, SFA, S_HDL_TG, S_LDL_TG, S_VLDL_C/CE/FC/L/P/PL, TotCho, VLDL_C, XS_VLDL_C/CE/FC | FAw6_FA, LA_FA, PUFA_FA |
| 15 | MYO1E | 59,453,384 | rs2306791 | T | C | S_LDL_P/PL | |
| 16 | ITGAM | 31,343,769 | rs4597342 | T | C | TG_PG | |
| 16 | CETP | 56,991,363 | rs183130 | C | T | ApoB, HDL_TG, IDL_L/P/PL, L_LDL_C/CE/L/PL, M_HDL_TG, Remnant_C, S_VLDL_CE, VLDL_C, XL_VLDL_CE, XS_VLDL_C/CE/FC, XXL_VLDL_CE | HDL2_C |
| 16 | DHX38 | 72,144,174 | rs9302635 | T | C | SFA, TotFA | |
| 16 | PMFBP1 | 72,230,112 | rs9923575 | T | C | UnSat | |
| 16 | C16orf47 | 73,177,225 | rs9673570 | A | G | Tyr | |
| 19 | LDLR | 11,202,306 | rs6511720 | G | T | IDL_CE, LDL_TG, L_LDL_TG, M_LDL_FC, Remnant_C, S_LDL_CE/FC, S_VLDL_CE, XS_VLDL_C/CE/FC | |
| 19 | PRKCSH | 11,560,347 | rs755000 | T | G | FreeC | |
| 19 | APOE | 45,408,836 | rs405509 | G | T | M_HDL_P/PL, PUFA | |
| 19 | APOC1 | 45,415,640 | rs445925 | G | A | IDL_CE, M_LDL_FC, S_HDL_CE, S_LDL_CE/FC/PL, TotCho, XS_VLDL_C/CE | |
| 19 | NECTIN2 | 45,373,565 | rs395908 | G | A | Remnant_C | |
| 19 | TOMM40 | 45,395,266 | rs157580 | A | G | VLDL_C, XS_VLDL_FC | |
| 20 | PLTP | 44,545,048 | rs4810479 | C | T | S_HDL_FC/PL | |
Chr: chromosome
ᵃ Nearest gene to the reported SNP
ᵇ SNP strongly associated with the majority of the phenotypes listed in the last two columns; the most significant SNP for each phenotype is listed in Supplementary Data 5
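The metabolite columns of the table use a compact shorthand in which a slash appears to list alternative suffixes for a shared prefix, so that `S_LDL_FC/PL` would stand for `S_LDL_FC` and `S_LDL_PL`. This reading is an assumption, since the source does not define the notation. The sketch below expands a cell under that assumption; the function name `expand_metabolites` is hypothetical, and wildcard entries such as `L_VLDL_*` are left untouched because the set of suffixes they cover is not stated here.

```python
def expand_metabolites(cell):
    """Expand a table cell such as 'M_LDL_FC, S_LDL_FC/PL' into single names.

    Assumption: 'PREFIX_A/B/C' abbreviates PREFIX_A, PREFIX_B, PREFIX_C.
    Wildcard entries ('L_VLDL_*') are returned unchanged.
    """
    names = []
    for item in cell.split(","):
        item = item.strip()
        if not item:
            continue
        parts = [p.strip() for p in item.split("/")]
        head = parts[0]
        names.append(head)
        if len(parts) > 1:
            # Everything before the last underscore is the shared prefix.
            prefix = head.rsplit("_", 1)[0]
            names.extend(f"{prefix}_{suffix}" for suffix in parts[1:])
    return names


# Cell taken from the CELSR2 row of the table.
celsr2 = expand_metabolites("M_LDL_FC, S_LDL_FC/PL")
print(celsr2)
```

Expanding the cells this way makes it possible to count associations per gene, although the counts depend on the shorthand being read correctly.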
Fig. 2 Overview of CMS results. Characterization of results from the CMS-adjusted analysis among the 588 identified region-metabolite associations. For all panels, we used the most associated SNP per region. a The regression coefficient for each SNP estimated using standard linear regression (βSTD) and after adjustment for the covariates selected by CMS (βCMS). Associations significant at 1.28 × 10⁻⁹ with the standard test are indicated in blue; those significant only with CMS are indicated in red. b The outcome variance explained by the SNPs as a function of the variance explained by covariates selected by CMS for the corresponding associations. The blue and red areas correspond to the detectable SNP effect size for simple regression given the available sample size, and after explaining the residual outcome variance, respectively. c The gain in power achieved by CMS across the species analyzed, expressed as equivalent increase in sample size. The dashed black line corresponds to the baseline sample size of 6623 individuals. The gradient of reds indicates the density
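One way to read "equivalent increase in sample size" is through a standard approximation for linear regression: if the added covariates are independent of the SNP and explain a fraction R² of the outcome variance, the residual variance shrinks by a factor (1 − R²), which has roughly the same effect on the test statistic as multiplying the sample size by 1/(1 − R²). The sketch below applies that approximation. It is not taken from the CMS method itself, and the function name `effective_sample_size` and the R² values are illustrative assumptions; only the baseline of 6623 individuals comes from the caption.

```python
def effective_sample_size(n, r2_covariates):
    """Approximate equivalent sample size after covariate adjustment.

    Assumes the covariates are independent of the SNP and explain a
    fraction r2_covariates of the outcome variance.
    """
    if not 0.0 <= r2_covariates < 1.0:
        raise ValueError("r2_covariates must be in [0, 1)")
    return n / (1.0 - r2_covariates)


N_BASELINE = 6623  # baseline sample size given in the Fig. 2 caption

# Illustrative R² values, not results from the study.
for r2 in (0.0, 0.25, 0.5, 0.75):
    n_eff = effective_sample_size(N_BASELINE, r2)
    print(f"R2 = {r2:.2f}  equivalent N = {n_eff:.0f}")
```

Under this approximation, the average two-fold gain reported in the abstract would correspond to covariates explaining about half of the outcome variance.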
Fig. 3 Network representation of the 588 region-metabolite associations identified in the 158 metabolite GWAS in METSIM. For each region we used the gene nearest to the most associated variant. Each node represents either a gene (blue diamonds, N = 70) or a metabolite (orange circles, N = 147). Each edge is an association between one gene and one metabolite. Node size is directly proportional to the number of other nodes associated with it. Red edges correspond to an opposite effect of a gene on a metabolite, compared with the other metabolites associated with the same gene. Metabolite colors (orange shades) represent the correlation strength between a given metabolite and all other metabolites. Gene colors (blue shades) represent the strength of correlation between a given gene and its associated metabolites, quantified as the average of r² across all corresponding metabolites
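The node-size rule in this caption amounts to computing the degree of each node in a bipartite gene-metabolite graph. The sketch below illustrates that with a handful of rows copied from the table of new associations. It is only a small subset chosen for illustration, not the full set of 588 associations, and the variable names are hypothetical.

```python
from collections import defaultdict

# Subset of gene -> metabolite associations copied from the table above.
associations = {
    "PSRC1": ["S_LDL_CE"],
    "APOB": ["TotFA"],
    "DHX38": ["SFA", "TotFA"],
    "PMFBP1": ["UnSat"],
}

# Degree = number of neighbours; the caption scales node size by this value.
gene_degree = {gene: len(mets) for gene, mets in associations.items()}

metabolite_degree = defaultdict(int)
for gene, mets in associations.items():
    for met in mets:
        metabolite_degree[met] += 1

for name, degree in sorted({**gene_degree, **metabolite_degree}.items()):
    print(f"{name}: {degree}")
```

In this subset `TotFA` is linked to two genes and `DHX38` to two metabolites, so both would be drawn larger than the remaining nodes.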
Fig. 4 Specificity of master regulators on lipoproteins. We performed a hierarchical clustering of the associations between the 13 master regulators and the lipoprotein types (a). Further panels show the total number of associations, the number of associations with lipoproteins, and the total number of top associated SNPs (b); the count of association hits by lipoprotein type (c), their size (d), and class (e). The background colors represent the relative proportion of associations within each gene-item stratum, highlighting heterogeneity in the distribution of signal
Fig. 5 Fine mapping of the LIPC region. The top panel indicates the posterior probability assessing the evidence that the SNP is causal for each of the 75 phenotypes and the local recombination rate. The middle panel contains genes from the UCSC hg19 annotation. The lower panel is an r²-based LD heatmap computed using PLINK 1.9 on the METSIM data. The gradient of red is proportional to the r². For clarity, we represented the LD only for SNPs with a posterior probability > 0.01 for at least 1 phenotype
Fig. 6 Change in genetic effect as a function of aging and exposure to statins. For the top variant of each of the 13 core regulator genes we identified, we derived a the interaction effect with statin, and b the effect on the difference in the metabolite measurement between the two time points. In all association tests, the allele associated with an increased level of metabolite in the marginal test at baseline measurement was defined as the coded allele. Non-significant tests are in pink, nominally significant tests are in red, and tests significant at 5 × 10⁻³ are in dark red. In agreement with the heritability analysis, most of the coefficients for this difference are negative, indicating an overall decrease of genetic effect. Two genes, APOC1 and TRIB1, show strong enrichment for interaction with statin
Fig. 7 Heritability of metabolites in baseline and follow-up data. Heritability of studied metabolites, computed on individuals present in both baseline and follow-up data. We used bivariate restricted maximum likelihood (REML) and included 10 genetic PCs, age, and age² as fixed effects. Light colors stand for heritability in baseline data and dark colors stand for follow-up data