Shuai Wang, Jing Hua Zhao, Ping An, Xiuqing Guo, Richard A Jensen, Jonathan Marten, Jennifer E Huffman, Karina Meidtner, Heiner Boeing, Archie Campbell, Kenneth M Rice, Robert A Scott, Jie Yao, Matthias B Schulze, Nicholas J Wareham, Ingrid B Borecki, Michael A Province, Jerome I Rotter, Caroline Hayward, Mark O Goodarzi, James B Meigs, Josée Dupuis.
Abstract
For complex traits, most associated single nucleotide variants (SNVs) discovered to date have a small effect, and detection of association is only possible with large sample sizes. Because of patient confidentiality concerns, it is often not possible to pool genetic data from multiple cohorts, and meta-analysis has emerged as the method of choice to combine results from multiple studies. Many meta-analysis methods are available for single SNV analyses. As new approaches allow the capture of low frequency and rare genetic variation, it is of interest to jointly consider multiple variants to improve power. However, for the analysis of haplotypes formed by multiple SNVs, meta-analysis remains a challenge, because different haplotypes may be observed across studies. We propose a two-stage meta-analysis approach to combine haplotype analysis results. In the first stage, each cohort estimates haplotype effect sizes in a regression framework, accounting for relatedness among observations if appropriate. For the second stage, we use a multivariate generalized least squares meta-analysis approach to combine haplotype effect estimates from multiple cohorts. Haplotype-specific association tests and a global test of independence between haplotypes and traits are obtained within our framework. We demonstrate through simulation studies that we control the type-I error rate, and that our approach is more powerful than inverse variance weighted meta-analysis of single SNV analysis when haplotype effects are present. We replicate a published haplotype association between a fasting glucose-associated locus (G6PC2) and fasting glucose in seven studies from the Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium, and we provide more precise haplotype effect estimates.
Keywords: family samples; haplotype association tests; linear mixed effects model; meta-analysis
Year: 2016 PMID: 27027517 PMCID: PMC4869684 DOI: 10.1002/gepi.21959
Source DB: PubMed Journal: Genet Epidemiol ISSN: 0741-0395 Impact factor: 2.135
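The second stage described in the abstract can be illustrated with a minimal sketch, assuming each cohort reports a vector of haplotype effect estimates (relative to a common reference haplotype), their covariance matrix, and the indices of the haplotypes it observed. The function name `gls_meta`, its arguments, and the toy numbers are hypothetical and are not taken from the paper; this is a generic multivariate generalized least squares combination, not the authors' implementation.

```python
import numpy as np

def gls_meta(betas, covs, observed, n_haplotypes):
    """Combine cohort-level haplotype effects by generalized least squares.

    betas[k]    : effect estimates reported by cohort k
    covs[k]     : covariance matrix of those estimates
    observed[k] : positions of those haplotypes in the global haplotype list
    Every haplotype must be observed in at least one cohort.
    """
    info = np.zeros((n_haplotypes, n_haplotypes))  # sum_k W_k' V_k^-1 W_k
    score = np.zeros(n_haplotypes)                 # sum_k W_k' V_k^-1 b_k
    for b, V, idx in zip(betas, covs, observed):
        b = np.asarray(b, dtype=float)
        V = np.asarray(V, dtype=float)
        # W_k maps the global effect vector onto the haplotypes seen in cohort k
        W = np.zeros((len(idx), n_haplotypes))
        W[np.arange(len(idx)), idx] = 1.0
        Vinv_W = np.linalg.solve(V, W)             # V_k^-1 W_k
        info += W.T @ Vinv_W
        score += Vinv_W.T @ b                      # valid because V_k is symmetric
    cov = np.linalg.inv(info)                      # covariance of the pooled estimate
    beta = cov @ score                             # pooled haplotype effects
    z = beta / np.sqrt(np.diag(cov))               # haplotype-specific statistics
    chi2 = float(beta @ info @ beta)               # global statistic, df = n_haplotypes
    return beta, cov, z, chi2

# Toy input (made up): cohort A observes haplotypes 0 and 1, cohort B only haplotype 0.
beta_hat, cov_hat, z, chi2 = gls_meta(
    betas=[[0.1, 0.3], [0.2]],
    covs=[np.diag([0.01, 0.04]), [[0.01]]],
    observed=[[0, 1], [0]],
    n_haplotypes=2,
)
```

Because each cohort contributes only to the haplotypes it observed, cohorts with different haplotype sets can be combined without discarding any estimate. With diagonal covariances, as in the toy input, the result reduces to inverse variance weighting per haplotype.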
CHARGE cohorts
| Cohort | N |
|---|---|
| Generation Scotland: Scottish Family Health Study | 7,678 |
| Framingham Heart Study | 6,561 |
| Cardiovascular Health Study (CHS) | 3,525 |
| Family Heart Study | 3,393 |
| Multi‐Ethnic Study of Atherosclerosis (MESA) | 2,507 |
| FENLAND (FLD) | 1,341 |
| European Prospective Investigation into Cancer and Nutrition, Potsdam (EPIC‐Potsdam) | 300 |
| Total | 25,305 |
Family‐based cohort.
Study designs for type‐I error rate evaluation
| Study design | Number of cohorts | Cohort sample sizes and structures | Type‐I error rate (G6PC2) | Type‐I error rate (JAZF1) |
|---|---|---|---|---|
| 1 | 5 | 250 NF2 (× 5) | 0.010 | 0.010 |
| 2 | 5 | 250 NFv (× 5) | 0.010 | 0.012 |
| 3 | 5 | 100 NF2, 175 NF2, 400 U, 700 U, 1000 U | 0.013 | 0.010 |
| 4 | 5 | 100 NFv, 175 NFv, 400 U, 700 U, 1000 U | 0.011 | 0.011 |
| 5 | 5 | 100 NFv, 175 NFv, 250 NFv, 325 NFv, 400 NFv | 0.011 | 0.012 |
| 6 | 10 | 250 NF2 (× 5); 1000 U (× 5) | 0.010 | 0.011 |
| 7 | 10 | 400 U, 700 U, 1000 U, 1300 U, 1600 U | 0.008 | 0.012 |
| 8 | 5 | 100 NF2, 175 NF2, 250 NF2, 325 NF2, 400 NF2 | 0.012 | 0.011 |
| 9 | 5 | 250 NF2, 125 NF2 (× 2), 375 NF2 (× 2); 1000 U, 500 U (× 2), 1500 U (× 2) | 0.011 | 0.011 |
| 10 | 10 | 250 NFv (× 7), 1000 U (× 3) | 0.012 | 0.011 |
NF2, nuclear family with 2 offspring; NFv, nuclear family with the number of offspring randomly selected to be between 1 and 4; U, unrelated subjects.
G6PC2 variants
| Name | Chr | MapInfo | dbSNPID | Minor | Major | FHS MAF |
|---|---|---|---|---|---|---|
| exm‐rs560887 | 2 | 169763148 | rs560887 | A | G | 0.293 |
| exm239664 | 2 | 169763262 | rs138726309 | T | C | 0.0036 |
| exm239667 | 2 | 169764141 | rs2232323 | C | A | 0.0078 |
| exm239672 | 2 | 169764176 | rs492594 | C | G | 0.4553 |
G6PC2 haplotype frequencies
| rs560887 | rs138726309 | rs2232323 | rs492594 | FHS frequency |
|---|---|---|---|---|
| C | C | A | C | 0.46 |
| T | C | A | G | 0.29 |
| C | C | A | G | 0.24 |
| T | C | C | G | 0.006 |
| C | T | A | C | <0.001 |
| T | C | A | C | <0.001 |
| C | T | A | G | <0.001 |
| C | C | C | G | <0.001 |
JAZF1 variants (chromosome 7)
| Name | Position | dbSNPID | Minor | Major | MAF |
|---|---|---|---|---|---|
| exm‐rs10486567 | 27976563 | rs10486567 | A | G | 0.2415 |
| exm2270592 | 28039797 | rs38523 | C | T | 0.3683 |
| exm‐rs864745 | 28180556 | rs864745 | G | A | 0.4965 |
| exm‐rs1635852 | 28189411 | rs1635852 | C | T | 0.4973 |
| exm‐rs849134 | 28196222 | rs849134 | G | A | 0.4917 |
JAZF1 haplotype frequencies
| Haplotype | rs10486567 | rs38523 | rs864745 | rs1635852 | rs849134 | Frequency |
|---|---|---|---|---|---|---|
| 1 | G | T | A | T | A | 0.2327 |
| 2 | G | T | G | C | G | 0.2295 |
| 3 | G | C | G | C | G | 0.1608 |
| 4 | G | C | A | T | A | 0.1295 |
| 5 | A | T | A | T | A | 0.0866 |
| 6 | A | T | G | C | G | 0.0793 |
| 7 | A | C | A | T | A | 0.0434 |
| 8 | A | C | G | C | G | 0.0259 |
| 9 | A | T | G | T | A | 0.0029 |
| 10 | A | T | A | C | A | 0.0029 |
| 11 | A | C | A | C | A | 0.0023 |
| 12 | G | T | A | C | A | 0.0019 |
| 13 | G | T | G | T | A | 0.0017 |
| 14 | G | C | G | T | A | 0.0005 |
Single haplotype association test using 4 SNVs on G6PC2 region
| rs560887 | rs138726309 | rs2232323 | rs492594 | β (SE) | P‐value | Frequency | β (SE), Mahajan et al. [2015] |
|---|---|---|---|---|---|---|---|
| C | C | A | C | | | 0.4394 | |
| T | C | A | G | −0.073 (0.0055) | | 0.2671 | −0.065 (0.011) |
| C | C | A | G | 0.039 (0.0056) | | 0.2645 | 0.034 (0.012) |
| T | C | C | G | −0.12 (0.029) | | 0.0065 | −0.205 (0.057) |
| C | T | A | C | −0.022 (0.056) | 0.70 | 0.0021 | −0.202 (0.077) |
| T | C | A | C | −0.031 (0.020) | 0.12 | 0.0195 | NA |
The haplotypes are observed in all cohorts except that the last one is observed only in FHS, CHS, GS, and FamHS.
The last column gives the estimates from the paper of Mahajan et al. [2015].
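For the two rarest haplotypes, the table lists 0.70 and 0.12 next to the effect estimates. These values are consistent with two-sided P-values computed from β and SE. The sketch below assumes a Wald test against a standard normal distribution, which the source does not state, and the helper name `wald_p` is hypothetical. Because it starts from the rounded β and SE printed in the table, it reproduces the listed values only approximately.

```python
import math

def wald_p(beta, se):
    """Two-sided P-value for H0: beta = 0, assuming beta/se ~ N(0, 1)."""
    z = beta / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi
    return math.erfc(abs(z) / math.sqrt(2))

p_ctac = wald_p(-0.022, 0.056)  # haplotype C-T-A-C, listed value 0.70
p_tcac = wald_p(-0.031, 0.020)  # haplotype T-C-A-C, listed value 0.12
```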
Figure 1. Power of the haplotype meta‐analysis approach compared to gene‐based methods and single SNV meta‐analysis (min P) adjusted for multiple testing in the G6PC2 region, evaluated at in four study designs. Description of the four study designs used in the simulation can be found in Table 2 (study design 1–4). The labels on the x axes denote that 1 (SNV) or 2 (2SNVs) SNVs are influencing the phenotypes, or 1 (1HAP) or 2 (2HAPs) haplotypes have an effect on the phenotypes.
Figure 2. Power of the haplotype meta‐analysis approach compared to gene‐based methods and single SNV meta‐analysis (min P) adjusted for multiple testing in the JAZF1 region, evaluated at in four study designs. Description of the four study designs used in the simulation can be found in Table 2 (study design 1–4). The labels on the x axes denote that 1 (SNV) or 2 (2SNVs) SNVs are influencing the phenotypes, or 1 (1HAP) or 2 (2HAPs) haplotypes have an effect on the phenotypes.