Noha A Yousri, Khalid A Fakhro, Amal Robay, Juan L Rodriguez-Flores, Robert P Mohney, Hassina Zeriri, Tala Odeh, Sara Abdul Kader, Eman K Aldous, Gaurav Thareja, Manish Kumar, Alya Al-Shakaki, Omar M Chidiac, Yasmin A Mohamoud, Jason G Mezey, Joel A Malek, Ronald G Crystal, Karsten Suhre.
Abstract
Metabolomics-genome-wide association studies (mGWAS) have uncovered many metabolic quantitative trait loci (mQTLs) influencing human metabolic individuality, though predominantly in European cohorts. By combining whole-exome sequencing with a high-resolution metabolomics profiling for a highly consanguineous Middle Eastern population, we discover 21 common variant and 12 functional rare variant mQTLs, of which 45% are novel altogether. We fine-map 10 common variant mQTLs to new metabolite ratio associations, and 11 common variant mQTLs to putative protein-altering variants. This is the first work to report common and rare variant mQTLs linked to diseases and/or pharmacological targets in a consanguineous Arab cohort, with wide implications for precision medicine in the Middle East.
Year: 2018 PMID: 29362361 PMCID: PMC5780481 DOI: 10.1038/s41467-017-01972-9
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Demographics (sample characteristics)
| Demographic category | Attribute |
|---|---|
| Gender (% females) | 45% |
| Age (mean ± s.d.) | 50.1 ± 12.6 |
| T2D (% with diabetes) | 56% |
| BMI (mean ± s.d.) | 32 ± 6.6 |
| Population information: #subjects | Q1: 442, Q2: 339, Q3: 70, admixed: 54, not assigned: 91 |
| Genotyping source: #subjects | WES: 614, array genotyped: 382 |
WES: whole-exome sequencing
Q1, Q2, and Q3 refer to Bedouin, Persian, and African ancestries, respectively, which represent subpopulations of the people of Qatar [36]
Fig. 1 Schematic view of the study design for the common variant analysis. 'A' indicates the first method for ratio computation, where we computed the associations between SNPs (within 100 kb of the sentinel SNP from single-metabolite association analysis) and the ratio of the sentinel metabolite to each of the remaining 825 metabolites. 'B' indicates the second method for ratio computation: for every SNP with which two metabolites were nominally associated in the discovery phase (p ≤ 10⁻⁴) but in opposite directions (opposite beta signs), we computed the association of that SNP with the ratio of that pair of metabolites
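The two ratio-selection steps described in the Fig. 1 caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names, the metabolite labels, the panel size of 826, and the example betas and p-values are all hypothetical. Only the nominal threshold (p ≤ 10⁻⁴) and the opposite-sign rule come from the caption.

```python
from itertools import combinations

NOMINAL_P = 1e-4  # discovery-phase nominal threshold stated in the caption


def sentinel_ratio_pairs(sentinel, metabolites):
    """Method A: pair the sentinel metabolite with every other metabolite."""
    return [(sentinel, m) for m in metabolites if m != sentinel]


def opposite_direction_pairs(assoc):
    """Method B: for one SNP, return metabolite pairs that are both nominally
    associated (p <= NOMINAL_P) and have betas of opposite sign.

    assoc maps metabolite name -> (beta, p_value).
    """
    nominal = {m: beta for m, (beta, p) in assoc.items() if p <= NOMINAL_P}
    return [
        (m1, m2)
        for m1, m2 in combinations(sorted(nominal), 2)
        if nominal[m1] * nominal[m2] < 0
    ]


# Hypothetical panel: one sentinel plus 825 remaining metabolites
panel = [f"met_{i}" for i in range(826)]
ratio_pairs = sentinel_ratio_pairs("met_0", panel)

# Hypothetical single-SNP association results (beta, p-value)
example = {
    "met_a": (0.45, 3e-6),
    "met_b": (-0.38, 8e-5),
    "met_c": (0.30, 2e-2),  # fails the nominal threshold, so it is excluded
    "met_d": (0.52, 1e-7),
}
pairs = opposite_direction_pairs(example)
```

Each pair returned by either function is a candidate ratio to be tested against the SNP. The association test itself is left out because the caption does not specify it.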
21 unique locus-metabolite pairs, indicating 7 newly identified and novel loci, 10 loci fine-mapped with new metabolite ratio associations, and 11 protein-coding variants
| Locus | SNP | Metabolite/ratio | p-value (p-gain) | Beta | SNP annotation | r² | Reference | Replication p-value |
|---|---|---|---|---|---|---|---|---|
| | rs4646257 | | 2.23 × 10⁻⁵⁸ (6.83 × 10⁴⁶) | 0.970 | IG, | 31 | ([ | |
| | rs4921913 | 5-acetylamino-6-amino-3-methyluracil | 2.14 × 10⁻¹² | 0.499 | | 7.8 | ([ | 9.6 × 10⁻⁹ |
| | rs1799958 | Ethylmalonate | 2.18 × 10⁻⁵³ | 0.885 | | 28.5 | ([ | 2.1 × 10⁻²¹ |
| | rs13538 | | 4.4 × 10⁻⁴⁷ (8.6 × 10³⁵) | −0.781 | | 26.6 | ([ | |
| | rs13538 | N-acetylcitrulline | 5.5 × 10⁻³² | 0.780 | | 19.6 | ([ | 3.7 × 10⁻²¹ |
| | rs34109652 | X-11491 (deoxycholic acid glucuronide or isomer) | 3.28 × 10⁻³⁵ | −0.737 | INT | 21.4 | ([ | 3.8 × 10⁻²⁵ |
| | rs4149056 | glycochenodeoxycholate glucuronide (1) | 3.06 × 10⁻³¹ | 0.833 | | 18.5 | ([ | 5.5 × 10⁻²² |
| | rs2147896 | N-methylpipecolate | 9.13 × 10⁻²⁶ | −0.663 | | 18.3 | ([ | 7.4 × 10⁻¹⁹ |
| | rs3756669 | | 1.55 × 10⁻²⁵ (4.02 × 10¹²) | −0.915 | | 16.6 | ([ | |
| | rs3756669 | X-24348 | 6.25 × 10⁻¹³ | −0.712 | | 8.5 | ([ | 1.01 × 10⁻⁹ |
| | rs28456 | | 9.81 × 10⁻²⁵ (5.38 × 10¹⁶) | −0.641 | INT | 17.8 | ([ | |
| | rs174560 | X-24439 (PE(P-16:0/20:3)*) | 9.09 × 10⁻¹⁴ | 0.457 | INT | 8.9 | ([ | 1.6 × 10⁻³ |
| | rs37370 | 3-aminoisobutyrate | 8.45 × 10⁻²¹ | 0.810 | | 13 | ([ | 4.3 × 10⁻¹⁰ |
| | rs181856093 | X-22145 (2′-O-methyluridine) | 3.29 × 10⁻²⁰ | 0.531 | INT, | 12.7 | NL | 1.3 × 10⁻¹¹ |
| | rs6690449 | | 9.35 × 10⁻²⁰ (2.29 × 10⁹) | −0.547 | INT | 14.1 | ([ | |
| | rs2999534 | X-23293 | 1.15 × 10⁻¹¹ | 0.426 | IG | 7.2 | ([ | 2.05 × 10⁻⁴ |
| | rs78461713 | bilirubin (E, E)* | 5.1 × 10⁻¹⁷ | 0.484 | INT | 10.6 | ([ | 1.4 × 10⁻¹² |
| | rs62129970 | | 7.59 × 10⁻¹⁷ (7.33 × 10¹⁰) | −0.951 | IG | 11 | ([ | 1.4 × 10⁻¹⁷ |
| | rs78176967 | | 1.43 × 10⁻¹⁶ (6.75 × 10⁶) | 0.995 | IG | 10.8 | ([ | |
| | rs61285056 | X-22379 (androsterone glucuronide) | 9.11 × 10⁻¹¹ | 0.765 | INT | 6.9 | ([ | 5.2 × 10⁻⁸ |
| | rs2069258 | | 1.71 × 10⁻¹⁶ (3.69 × 10⁸) | 0.425 | IG | 10.6 | NL | 2.4 × 10⁻⁶ |
| | rs117135869 | | 4.264 × 10⁻¹⁶ (2.06 × 10⁵) | 0.623 | | 10.4 | ([ | |
| | rs117135869 | X-22162 | 8.8 × 10⁻¹¹ | 0.616 | | 6.7 | ([ | 1.7 × 10⁻⁴ |
| | rs274554 | Tryptophan betaine | 1.05 × 10⁻¹³ | −0.430 | INT | 8.6 | ([ | 3.3 × 10⁻¹⁰ |
| | rs7530513 | Imidazole lactate | 5.6 × 10⁻¹³ | 0.425 | INT | 8.1 | ([ | 3.2 × 10⁻² |
| | rs1165196 | | 1.46 × 10⁻¹² (1.30 × 10⁶) | −0.543 | | 8.8 | ([ | 8.5 × 10⁻⁸ |
| | rs776746 | X-12063 | 1.54 × 10⁻¹² | −0.620 | | 7.8 | ([ | 1.2 × 10⁻¹² |
| | c15p90683852 | Undecanedioate | 1.18 × 10⁻¹⁰ | 0.421 | IG | 7 | NL | 9.7 × 10⁻⁴ |
Biochemical Name* indicates compounds that have not been officially confirmed based on a standard, but Metabolon is confident in its identity
p.## (missense), bold font indicates functional variant
a: pgain was introduced in [19] as “an ad-hoc measure to determine whether a ratio between two metabolite concentrations carries more information than the two corresponding metabolite concentrations alone”, calculated as pgain = min(pval(m1), pval(m2)) / pval(m1/m2), given two metabolites m1 and m2
b: SNP annotation and nearest functional SNP (NFS), function, mutation; IG refers to intergenic, INT to intron, INS to intron near splice, SA to splice acceptor
c: r² is the percent of variance explained
d: Reference, Novel Locus (NL), or Novel Association (NAS)
e: Exact replication SNP is indicated in Supplementary Dataset 4
f: Newly identified/novel loci
g: Newly identified, nominally replicated
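The p-gain defined in the table notes is a simple quotient of p-values, and a short Python sketch makes the arithmetic concrete. The function names and the example p-values below are hypothetical and are not taken from the study. The log10 variant is an added convenience, assuming one wants to avoid floating-point underflow when p-values are extremely small.

```python
import math


def p_gain(p_m1, p_m2, p_ratio):
    """p-gain = min(pval(m1), pval(m2)) / pval(m1/m2)."""
    return min(p_m1, p_m2) / p_ratio


def log10_p_gain(p_m1, p_m2, p_ratio):
    """The same quantity on the log10 scale."""
    return math.log10(min(p_m1, p_m2)) - math.log10(p_ratio)


# Hypothetical p-values for two single metabolites and for their ratio
p_single_1 = 1e-12
p_single_2 = 1e-5
p_of_ratio = 1e-20

gain = p_gain(p_single_1, p_single_2, p_of_ratio)            # on the order of 1e8
log_gain = log10_p_gain(p_single_1, p_single_2, p_of_ratio)  # about 8
```

A p-gain well above 1 means the ratio is more strongly associated with the SNP than either metabolite alone. A value near or below 1 means the ratio adds little.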
Fig. 2 Manhattan plot for the discovered loci. The red line indicates the Bonferroni threshold (2.2 × 10⁻¹⁰) and the blue line indicates the genome-wide significance threshold (5 × 10⁻⁸). The newly identified/novel replicated loci are typed in red, and the non-replicated loci are in blue. Unnamed loci at the borderline (2.2 × 10⁻¹⁰) are associations with ratios with a p-gain below the threshold. * stands for nominally replicated loci
Fig. 3 Percent of variance explained in the 21 loci. The height of a column bar indicates the percent of variance explained for each locus, loci genes are indicated above the column bar, and the metabolite/ratio on the X-axis. Bars are colored according to Metabolon pathway specified for the metabolites associated with the locus. Biochemical Name* indicates compounds that have not been officially confirmed based on a standard, but Metabolon is confident in its identity
Fig. 4 Boxplots for the loci NAT2, FADS2, and UGT3A1. Boxplots showing metabolite/ratio levels and number of samples for each genotype group and comparing ratios to single metabolites for NAT2 (a, b), FADS2 (c, d), and UGT3A1 (e, f) loci, where the percent of variance explained by the ratio is 3.9-, 1.99-, and 1.94-fold greater, respectively, than that explained by the single metabolite
Fig. 5 Regional association plots for the loci UGT3A1 and ACADS. UGT3A1 (a) and ACADS (b) loci missense SNPs showing the strength of the association (−log₁₀(p-value)) for X-24348/pregn-steroid-monosulfate* and ethylmalonate, respectively, on the Y-axis and the genes on the X-axis. The colors correspond to different LD thresholds, where LD is computed between the sentinel SNP (lowest p-value, colored in blue) and all SNPs. Shapes of markers correspond to their functionality as described in the legend
Fig. 6 Boxplots for the rare variant loci AASDH, PRB1, ACAN, and OTOF, indicating the metabolite level and the number of samples for each genotype group. Boxplots of rare variants associations of AASDH with thyroxine (3 SNPs) (a–c), PRB1 with mannose (d), ACAN with X-12844 (e), and OTOF with retinol (vitamin A) (f)
12 novel functional rare variant mQTLs
| SNP name (c#chr p#position) | rsID | Gene | Metabolite | EA/OA | Gene-based burden test: C-MAF | Gene-based burden test: p-value |
|---|---|---|---|---|---|---|
| c4p57221348 | rs3796543 | | Thyroxine | C/T | | |
| c4p57248716 | rs34228795 | | Thyroxine | C/G | 0.036 | 4.1 × 10⁻⁹ |
| c4p57250285 | rs34543011 | | Thyroxine | C/T | | |
| c12p56075599 | rs199581976 | | Androsterone sulfate | T/C | | |
| c12p56075915 | rs75289684 | | Androsterone sulfate | A/C | 0.035 | 4.78 × 10⁻⁹ |
| c12p56077768 | rs115687886 | | Androsterone sulfate | T/C | | |
EA effective allele, OA observed allele, EAF effective allele frequency, CMAF cumulative minor allele frequency
All SNPs have a call rate of 100% except for the SNP marked (*), which has a call rate of 97%