Darren A Cusanovich, Minal Caliskan, Christine Billstrand, Katelyn Michelini, Claudia Chavarria, Sherryl De Leon, Amy Mitrano, Noah Lewellyn, Jack A Elias, Geoffrey L Chupp, Roberto M Lang, Sanjiv J Shah, Jeanne M Decara, Yoav Gilad, Carole Ober.
Abstract
Genome-wide association studies (GWASs) have become a standard tool for dissecting genetic contributions to disease risk. However, these studies typically require extraordinarily large sample sizes to be adequately powered. Strategies that incorporate functional information alongside genetic associations have proved successful in increasing GWAS power. Following this paradigm, we present the results of 20 different genetic association studies for quantitative traits related to complex diseases, conducted in the Hutterites of South Dakota. To boost the power of these association studies, we collected RNA-sequencing data from lymphoblastoid cell lines for 431 Hutterite individuals. We then used Sherlock, a tool that integrates GWAS and expression quantitative trait locus (eQTL) data, to identify weak GWAS signals that are also supported by eQTL data. Using this approach, we found novel associations with quantitative phenotypes related to cardiovascular disease, including carotid intima-media thickness, left atrial volume index, monocyte count and serum YKL-40 levels.
Year: 2016 PMID: 26931462 PMCID: PMC5062579 DOI: 10.1093/hmg/ddw061
Source DB: PubMed Journal: Hum Mol Genet ISSN: 0964-6906 Impact factor: 6.150
Figure 1. Study overview. (A) Outline of the study design. We collected genotype, quantitative trait and RNA-seq data on the Hutterites. GWAS and eQTL studies were conducted on independent subsets of the population and then the results of these studies were integrated. (B) Boxplot of the sample sizes for the 20 GWAS conducted. Among 984 individuals with genotype data, but no RNA-seq data, quantitative-trait data were collected for between 263 and 788 individuals for each of the 20 traits. (C) Sequencing depth for the RNA-seq data generated.
Summary of the GWAS data sets and results
| Quantitative trait | Relevant complex disease | GWAS-sample size | GWAS-number of SNPs | GWAS-significant SNPs | GWAS-significant locus |
|---|---|---|---|---|---|
| SBP | CVD | 406 | 396 968 | – | – |
| DBP | CVD | 406 | 396 968 | – | – |
| HDL | CVD | 516 | 387 558 | – | – |
| LDL | CVD | 499 | 387 345 | – | – |
| Triglycerides | CVD | 517 | 387 374 | rs6589677, rs500254, rs11217655, rs11217695 | 11q23.3 |
| Total cholesterol/HDL | CVD | 516 | 387 558 | – | – |
| Monocyte count | CVD | 653 | 390 542 | – | – |
| LAVI | CVD | 318 | 393 258 | – | – |
| LVMI | CVD | 320 | 393 024 | – | – |
| CIMT | CVD | 263 | 391 119 | – | – |
| FeNO | Asthma | 452 | 396 731 | – | – |
| FEV1 | Asthma | 697 | 391 522 | – | – |
| FEV1/FVC | Asthma | 700 | 392 576 | – | – |
| BRI | Asthma | 571 | 393 137 | – | – |
| Total serum IgE | Asthma | 788 | 390 188 | – | – |
| Lymphocyte count | Asthma | 654 | 390 392 | – | – |
| Eosinophil count | Asthma | 650 | 391 004 | – | – |
| Neutrophil count | Asthma | 653 | 390 560 | rs12634993 | 3p12.3 |
| YKL-40 | Asthma | 715 | 391 032 | rs1794867, rs495198, rs2819346, rs2819349, rs10800812, rs10920521, rs6672643, rs2153101, rs2494282, rs4950936, rs946258, rs3820145, rs1340237, rs10128007, rs12079530, rs35068223, rs4550119, rs79707006 | 1q32.1 |
| Chitinase 1 activity | Asthma | 715 | 391 032 | rs495198, rs2486070 | 1q32.1 |
SBP, systolic blood pressure; DBP, diastolic blood pressure; HDL, high-density lipoprotein cholesterol; LDL, low-density lipoprotein cholesterol; LAVI, left atrial volume index; LVMI, left ventricular mass index; CIMT, carotid intima-media thickness; FeNO, fraction of exhaled nitric oxide; FEV1, forced expiratory volume at 1 s; FVC, forced vital capacity; BRI, bronchial responsiveness index; IgE, immunoglobulin E.
Figure 2. Manhattan plots of the GWAS results for four phenotypes with genome-wide significant results. The genome-wide –log10 (P-value) of association is shown for each SNP included in our study, for the four traits with significant associations at a Bonferroni-corrected P-value threshold of 0.05: (A) triglyceride levels, (B) neutrophil count, (C) YKL-40 levels and (D) Chitinase 1 activity. Points are ordered on the x-axis by their relative position in the genome. The red line denotes the Bonferroni-corrected threshold. Note that the difference in P-value between the top SNP and nearby SNPs, evident for YKL-40 and Chitinase 1 activity, is the result of our pruning strategy, which filtered out highly correlated SNPs (see the ‘Materials and Methods’ section).
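As an illustration of where the red line would sit, the following minimal sketch computes a Bonferroni cutoff from the SNP counts listed in the GWAS summary table. It assumes the correction is applied per trait, using that trait's own number of tested SNPs; the caption does not state this explicitly. The function names `bonferroni_threshold` and `manhattan_height` are hypothetical helpers introduced here, not part of the study's pipeline.

```python
import math

# SNP counts copied from the GWAS summary table for the four traits
# with genome-wide significant results.
SNPS_TESTED = {
    "Triglycerides": 387_374,
    "Neutrophil count": 390_560,
    "YKL-40": 391_032,
    "Chitinase 1 activity": 391_032,
}

ALPHA = 0.05  # corrected significance level stated in the caption


def bonferroni_threshold(n_tests: int, alpha: float = ALPHA) -> float:
    """Per-SNP P-value cutoff that controls the family-wise error rate at alpha.

    Assumption: one correction per trait, over that trait's tested SNPs.
    """
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def manhattan_height(p_value: float) -> float:
    """Position of a P-value on the -log10(P) axis of a Manhattan plot."""
    return -math.log10(p_value)


for trait, n_snps in SNPS_TESTED.items():
    cutoff = bonferroni_threshold(n_snps)
    print(f"{trait}: P < {cutoff:.3g}, line at {manhattan_height(cutoff):.2f}")
```

Because the four traits were tested on nearly the same number of SNPs, the cutoffs computed this way differ only slightly between panels.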
Figure 3. Effects of cis-eQTL SNPs in the Hutterites. (A) QQ-plot of the P-values for cis-eQTLs. (B) Distribution of eQTLs with respect to the nearest TSS. SNPs that are significant eQTLs (FDR < 0.05) tend to be very close to the TSS, unlike SNPs that are not significantly associated with expression levels.
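The caption reports an FDR < 0.05 cutoff but does not say how the FDR was estimated. As a generic illustration only, the sketch below applies the Benjamini-Hochberg step-up procedure, which is one common way to control the FDR and is an assumption here rather than the study's documented method. The P-values in the example are made up for demonstration.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Flag tests that pass the Benjamini-Hochberg step-up procedure.

    Returns a list of booleans in the same order as p_values.
    """
    m = len(p_values)
    # Indices of the tests, sorted from smallest to largest P-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank

    # Every test ranked at or below max_rank is declared significant.
    significant = [False] * m
    for idx in order[:max_rank]:
        significant[idx] = True
    return significant


# Made-up P-values, not taken from the study.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(example_p, fdr=0.05)
print(sum(flags), "of", len(example_p), "tests pass at FDR 0.05")
```

The step-up rule accepts every test up to the largest qualifying rank, so a P-value can pass even if it exceeds its own rank-specific cutoff, provided a larger P-value further down the list meets its cutoff.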
Figure 4. Influence of Hutterite-specific genetic variation on gene expression. (A) Distribution of minor allele frequencies for SNPs found in dbSNP compared with SNPs that are specific to the Hutterites. (B) Barplot of the proportion of all SNPs tested that are Hutterite-specific compared with the proportion of significant cis-eQTL SNPs that were Hutterite-specific.
Genes associated with quantitative traits by joint analysis of gene expression and GWAS data
| Phenotype | Relevant disease | Genes at 20% FDR | Genes | Log10 (BF) | P-value |
|---|---|---|---|---|---|
| CIMT | CVD | 2 | | 11.06 | 2.15E−05 |
| | | | | 10.13 | 2.49E−05 |
| LAVI | CVD | 4 | | 12.86 | 1.49E−05 |
| | | | | 9.18 | 4.14E−05 |
| | | | | 8.92 | 4.14E−05 |
| | | | | 8.50 | 5.13E−05 |
| Monocyte count | CVD | 4 | | 20.33 | 4.97E−05 |
| | | | | 11.95 | 1.82E−05 |
| | | | | 9.62 | 3.48E−05 |
| | | | | 9.45 | 3.65E−05 |
| YKL-40 | Asthma | 5 | | 10.85 | 2.32E−05 |
| | | | | 10.24 | 3.15E−05 |
| | | | | 8.86 | 4.14E−05 |
| | | | | 8.74 | 4.31E−05 |
| | | | | 7.80 | 6.63E−05 |
BF, Bayes factor.
Figure 5. Example of signals identified by Sherlock. Manhattan plots showing the –log10 (P-value) for each SNP, arranged by genomic coordinates. The plot shows coordinated signals for the GWAS of CIMT (middle panel), the genome-wide eQTL signals for KCNK10 (bottom panel) and the genome-wide eQTL signals for TRIM14 (top panel). Gray bars highlight the regions of interest identified by Sherlock. The eQTL panels show all SNPs with P-value <0.001. The GWAS panel shows all SNPs with P-value <0.01.