Hua Zhong, John Beaulaurier, Pek Yee Lum, Cliona Molony, Xia Yang, Douglas J Macneil, Drew T Weingarth, Bin Zhang, Danielle Greenawalt, Radu Dobrin, Ke Hao, Sangsoon Woo, Christine Fabre-Suver, Su Qian, Michael R Tota, Mark P Keller, Christina M Kendziorski, Brian S Yandell, Victor Castro, Alan D Attie, Lee M Kaplan, Eric E Schadt.
Abstract
Genome-wide association studies (GWAS) have demonstrated the ability to identify the strongest causal common variants in complex human diseases. However, to date, the massive data generated from GWAS have not been maximally explored to identify true associations that fail to meet the stringent level of association required to achieve genome-wide significance. Genetics of gene expression (GGE) studies have shown promise towards identifying DNA variations associated with disease and providing a path to functionally characterize findings from GWAS. Here, we present the first empiric study to systematically characterize the set of single nucleotide polymorphisms associated with expression (eSNPs) in liver, subcutaneous fat, and omental fat tissues, demonstrating these eSNPs are significantly more enriched for SNPs that associate with type 2 diabetes (T2D) in three large-scale GWAS than a matched set of randomly selected SNPs. This enrichment for T2D association increases as we restrict to eSNPs that correspond to genes comprising gene networks constructed from adipose gene expression data isolated from a mouse population segregating a T2D phenotype. Finally, by restricting to eSNPs corresponding to genes comprising an adipose subnetwork strongly predicted as causal for T2D, we dramatically increased the enrichment for SNPs associated with T2D and were able to identify a functionally related set of diabetes susceptibility genes. We identified and validated malic enzyme 1 (Me1) as a key regulator of this T2D subnetwork in mouse and provided support for the association of this gene to T2D in humans. This integration of eSNPs and networks provides a novel approach to identify disease susceptibility networks rather than the single SNPs or genes traditionally identified through GWAS, thereby extracting additional value from the wealth of data currently being generated by GWAS.
Year: 2010 PMID: 20463879 PMCID: PMC2865508 DOI: 10.1371/journal.pgen.1000932
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Figure 1. Diagram depicting the process of filtering SNPs using eSNPs and disease associated networks.
eSNP Set Enrichment Summary for the DIAGRAM, DGI and WTCCC GWAS.
| eSNP set | Measure | DIAGRAM | DGI | WTCCC |
|---|---|---|---|---|
| All eSNPs | Number genotyped | 16,801 | 3,888 | 4,047 |
| | % PT2D<0.05 | 6.63 | 6.20 | 6.72 |
| | % PT2D<0.05 from random set (95% CI) | 5.70 (5.40, 6.00) | 5.19 (4.56, 5.82) | 5.94 (5.27, 6.60) |
| | Fold increase in % PT2D<0.05 over random SNP sets (p) | 1.16 (2.05×10⁻⁹) | 1.19 (8.00×10⁻⁴) | 1.13 (1.09×10⁻²) |
| All expanded eSNPs | Number genotyped | 144,660 | 24,220 | 25,591 |
| | % PT2D<0.05 | 7.34 | 6.26 | 7.50 |
| | % PT2D<0.05 from random set (95% CI) | 6.12 (5.72, 6.52) | 5.28 (4.90, 5.66) | 6.43 (5.99, 6.87) |
| | Fold increase in % PT2D<0.05 over random expanded SNP sets (p) | 1.20 (1.33×10⁻⁹) | 1.19 (2.19×10⁻⁷) | 1.17 (1.22×10⁻⁶) |
| Adipose network expanded eSNPs | Number genotyped | 19,853 | 3,342 | 3,539 |
| | % PT2D<0.05 | 9.21 | 9.07 | 8.20 |
| | % PT2D<0.05 from random expanded eSNP set (95% CI) | 7.30 (6.93, 7.65) | 6.13 (5.31, 6.95) | 7.40 (6.53, 8.26) |
| | Fold increase in % PT2D<0.05 over random expanded eSNP sets (p) | 1.26 (<10⁻¹⁶) | 1.48 (1.10×10⁻¹²) | 1.11 (3.49×10⁻¹) |
| Adipose purple subnetwork expanded eSNPs | Number genotyped | 628 | 101 | 111 |
| | % PT2D<0.05 | 13.22 | 36.63 | 10.81 |
| | % PT2D<0.05 from random adipose network expanded eSNP set (95% CI) | 9.21 (6.94, 11.50) | 9.04 (3.44, 14.63) | 8.20 (3.08, 13.29) |
| | Fold increase in % PT2D<0.05 over random adipose network expanded eSNP sets (p) | 1.44 (2.97×10⁻⁴) | 4.05 (<10⁻¹⁶) | 1.32 (1.57×10⁻¹) |
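
The fold-increase values in the table above appear to be the observed percentage of SNPs with PT2D<0.05 divided by the corresponding random-set percentage. The following is a minimal Python sketch of that arithmetic, using the first data column (DIAGRAM, following the order given in the table title). The function and variable names are illustrative and do not come from the study.

```python
def fold_increase(observed_pct: float, random_pct: float) -> float:
    """Ratio of the observed % of SNPs with PT2D<0.05 to the random-set %."""
    if random_pct <= 0:
        raise ValueError("random_pct must be positive")
    return observed_pct / random_pct


# (observed %, random-set %) pairs copied from the first data column of the table.
# Treating that column as DIAGRAM is an assumption based on the table title.
diagram = {
    "All eSNPs": (6.63, 5.70),
    "All expanded eSNPs": (7.34, 6.12),
    "Adipose network expanded eSNPs": (9.21, 7.30),
    "Adipose purple subnetwork expanded eSNPs": (13.22, 9.21),
}

# Round to two decimals, matching the precision reported in the table.
diagram_folds = {
    name: round(fold_increase(observed, random_mean), 2)
    for name, (observed, random_mean) in diagram.items()
}
```

Rounded to two decimals, these ratios match the tabulated fold increases for that column (1.16, 1.20, 1.26 and 1.44). The p values in the table are not reproduced here, because the test used to derive them is not described in this record.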
Figure 2. eSNP sets enriched for T2D associated SNPs in three GWAS.
The Y axis shows the proportion of SNPs with PT2D ≤ 0.05. The PT2D values are from DIAGRAM, WTCCC, and DGI, from left to right. In each GWAS cohort, from left to right, the 1st bar shows the observed proportion of all studied SNPs; the 2nd bar shows the proportion of all eSNPs; the 3rd bar shows the proportion of the expanded eSNPs; the 4th bar shows the proportion of the expanded adipose network eSNPs; and the 5th bar shows the proportion of the expanded T2D adipose causal subnetwork eSNPs. In the DIAGRAM study, the second bar is significantly higher than the first (p = 2.05×10⁻⁹), the third bar is higher than the second (p = 1.33×10⁻⁹), the fourth is higher than the third (p<10⁻¹⁶), and the fifth is higher than the fourth (p = 2.97×10⁻⁴). In WTCCC, the second bar is higher than the first (p = 1.09×10⁻²) and the third is higher than the second (p = 1.22×10⁻⁶), but the fourth and fifth bars are not significantly higher than the third (p = 0.35) or fourth (p = 0.15), respectively. In the DGI study, the second bar is higher than the first (p = 8.0×10⁻⁴), the third is higher than the second (p = 2.19×10⁻⁷), the fourth is higher than the third (p = 1.10×10⁻¹²), and the fifth is higher than the fourth (p<10⁻¹⁶).
Figure 3. Adipose T2D causal subnetwork and human supporting evidence.
A) The adipose coexpression network comprises 9,900 gene expression traits. The purple subnetwork, comprising 159 genes, is highlighted as the subnetwork most enriched for genes supported as causal for T2D. B) LOD score plots for plasma insulin (solid black), Me1 adipose expression (solid red), Anxa2 adipose expression (solid blue), Bcl2l10 adipose expression (solid cyan), 2310046806Rik adipose expression (solid black), Car12 adipose expression (solid green), Paqr9 adipose expression (solid magenta), C730029A08Rik adipose expression (dashed red), Poclce2 adipose expression (dashed blue), and adipose expression traits linking to this region in trans (grey), all measured in the B6×BTBR cross. C) The T2D adipose causal subnetwork is enriched for genes supported as having a causal relationship with plasma insulin levels in the B6×BTBR cross (blue nodes). The white nodes represent genes in the T2D adipose causal subnetwork not supported as causal for insulin traits in the B6×BTBR cross. D) OGTT curves for Me1−/− (Male n = 19; Female n = 14) and wild-type control (Male n = 25; Female n = 16) mice (p = 3.16×10⁻⁴ for male OGTT AUC and p = 1.84×10⁻³ for female OGTT AUC; overall sex adjusted difference p = 7.30×10⁻⁸). E) The T2D adipose causal subnetwork is enriched for genes in the Me1−/− adipose gene expression signature (orange nodes). The white nodes represent genes in the purple subnetwork not in the Me1−/− adipose gene signature.
Gene sets significantly over-represented in the mouse and human T2D adipose causal subnetwork.
| Gene set category | Gene set | Gene set count | Overlap (enrichment p value, fold enrichment) |
|---|---|---|---|
| Panther biological process | Lipid, fatty acid and steroid metabolism | 729 | 27 (4.95×10⁻¹⁰, 4.1) |
| Causal gene sets | Genes supported as causal for plasma insulin | 432 | 57 (5.26×10⁻⁶⁸, 29.3) |
| | Genes supported as causal for plasma glucose | 357 | 44 (1.40×10⁻⁵⁰, 27.4) |
| | Genes supported as causal for number of islets | 605 | 47 (1.55×10⁻⁴⁴, 17.3) |
| Single gene perturbation experiments | | 2958 | 32 (9.25×10⁻⁷, 2.4) |
The number of genes represented on the mouse array that mapped to orthologs in human.
The overlap count is computed by counting the number of transcripts in the intersection between the indicated gene set and the subnetwork. The fold enrichment is computed as the observed overlap count divided by the expected overlap count, estimated by multiplying the subnetwork gene count by the fraction ‘gene set count divided by total gene count’.
Nominal p values represent the significance of the Fisher Exact Test statistic under the null hypothesis that the frequency of the indicated gene set is the same between a reference set of all genes represented on the array and the set of genes comprising the subnetwork.
Total gene count is 17,413.
Total gene count is 35,345 transcripts on the array.
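
The fold enrichment and enrichment p value described in the notes above can be sketched in Python as follows. This is only an illustration under stated assumptions: the subnetwork size of 159 is taken from the Figure 3 caption, the pairing of each row with a total gene count (17,413 or 35,345) is a guess because the footnote markers did not survive, and a one-sided hypergeometric upper tail stands in for the Fisher Exact Test, whose sidedness is not stated. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def fold_enrichment(overlap: int, subnetwork_count: int,
                    gene_set_count: int, total_count: int) -> float:
    """Observed overlap divided by the overlap expected by chance."""
    expected = subnetwork_count * gene_set_count / total_count
    return overlap / expected


def hypergeom_upper_tail(overlap: int, subnetwork_count: int,
                         gene_set_count: int, total_count: int) -> float:
    """One-sided P(X >= overlap) when drawing the subnetwork at random.

    Assumption: this is used as a stand-in for a one-sided Fisher exact test.
    Exact integer arithmetic avoids underflow for very small tails.
    """
    upper = min(subnetwork_count, gene_set_count)
    numerator = sum(
        comb(gene_set_count, k)
        * comb(total_count - gene_set_count, subnetwork_count - k)
        for k in range(overlap, upper + 1)
    )
    return float(Fraction(numerator, comb(total_count, subnetwork_count)))


SUBNETWORK_COUNT = 159  # assumed, from the Figure 3 caption

# Assumed pairing of rows with total gene counts (footnote markers are missing).
fold_insulin = fold_enrichment(57, SUBNETWORK_COUNT, 432, 35345)
fold_panther = fold_enrichment(27, SUBNETWORK_COUNT, 729, 17413)
p_insulin = hypergeom_upper_tail(57, SUBNETWORK_COUNT, 432, 35345)
```

Under these assumptions the two fold values round to the tabulated 29.3 and 4.1, which suggests the pairing of totals is plausible. The tail probability is extremely small but is not expected to match the published p value exactly.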
Comparison of metabolic traits between Me1−/− mice and wild-type controls.
| Trait | Male: wild type | Male: Me1−/− | Male: percentage change (%) | Female: wild type (n = 16) | Female: Me1−/− | Female: percentage change (%) | Difference p value |
|---|---|---|---|---|---|---|---|
| OGTT AUC (min (mg·ml⁻¹)) | 424.5(117.4) | 299.4(36.9) | −29.5 | 350.2(116.3) | 239.8(23.8) | −31.5 | 7.30×10⁻⁸ |
| Glucose (mg·ml⁻¹) | 2.4(0.6) | 1.8(0.2) | −24.9 | 1.9(0.5) | 1.6(0.2) | −18.1 | 3.30×10⁻⁶ |
| Insulin (mg·ml⁻¹) | 10.9(6.1) | 2.0(1.6) | −81.5 | 2.9(2.2) | 0.6(0.3) | −78.9 | 1.23×10⁻⁹ |
| Leptin (mg·ml⁻¹) | 11.2(3.3) | 6.1(2.6) | −45.4 | 4.6(4.0) | 2.4(2.4) | −47.8 | 1.69×10⁻⁶ |
| Cholesterol (mg·ml⁻¹) | 2.2(0.4) | 1.9(0.5) | −13.8 | 1.3(0.3) | 1.0(0.1) | −22.7 | 2.19×10⁻³ |
| Triglycerides (mg·ml⁻¹) | 2.3(1.0) | 1.2(0.4) | −48.6 | 1.4(0.4) | 1.0(0.1) | −26.9 | 1.40×10⁻⁷ |
Presented as mean (SD).
All p values reported represent the significance of the t statistic under the null hypothesis that the difference in mean, sex-adjusted values between the Me1−/− and wild-type groups is equal to 0.
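
The percentage-change columns in the table above can be checked with a short Python sketch. It assumes that each group of three values lists the wild-type mean, then the Me1−/− mean, then the change relative to wild type. The function name is illustrative only.

```python
def percentage_change(wild_type_mean: float, knockout_mean: float) -> float:
    """Change of the knockout mean relative to the wild-type mean, in percent."""
    if wild_type_mean == 0:
        raise ValueError("wild_type_mean must be non-zero")
    return (knockout_mean - wild_type_mean) / wild_type_mean * 100.0


# OGTT AUC means copied from the first row of the table (both groups of columns).
ogtt_auc_first_group = round(percentage_change(424.5, 299.4), 1)
ogtt_auc_second_group = round(percentage_change(350.2, 239.8), 1)
```

For the OGTT AUC row this reproduces the tabulated −29.5 and −31.5. For other rows the result can differ in the last digit, presumably because the published changes were computed from unrounded means.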