Daniel Shriner, Laura Kelly Vaughan.
Abstract
BACKGROUND: Common, complex diseases are hypothesized to result from a combination of common and rare genetic variants. We developed a unified framework for the joint association testing of both types of variants. Within the framework, we developed a union-intersection test suitable for genome-wide analysis of single nucleotide polymorphisms (SNPs), candidate gene data, as well as medical sequencing data. The union-intersection test is a composite test of association of genotype frequencies and differential correlation among markers.
Year: 2011 PMID: 21281506 PMCID: PMC3040731 DOI: 10.1186/1471-2164-12-89
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Validity and power analysis of union testing. Each simulated data set consisted of 938 cases and 863 controls. For each experiment, 1,000 independent replicates were simulated. Coded genotypes were simulated by randomly sampling from a binomial distribution with the given frequencies. Case-control status was determined by a logistic model. Gray lines indicate the per comparison significance threshold of 0.05. A) Per comparison error rate. Unions of one, two, three, four, and five SNPs are represented by black, red, orange, green, and blue lines, respectively. B) Power for an odds ratio of 1.5 for each SNP. C) Power for an odds ratio of 2 for each SNP. D) Power for 2-marker unions with opposing effects. The black line represents odds ratios of 2 and 0.5 for the two markers, the red line represents 1.5 and 0.67, and the blue line represents 1 and 1. E) Power for unions consisting of one predictor with an odds ratio of 2 (black line), 1.5 (red line), or 1 (blue line), and four predictors with odds ratios of 1. F) Power for 2-marker unions with correlated predictors. Solid lines represent independent predictors and dotted lines represent predictors correlated at r² = 0.8. Black lines represent odds ratios of 2, red lines represent 1.5, and blue lines represent 1. G) Power to detect epistasis for 2-marker unions. The black line represents odds ratios of 1 for both markers and 1 for the epistatic effect. The red line represents odds ratios of 1 for both markers and 2 for the epistatic effect. The blue line represents odds ratios of 1 for both markers and 0.5 for the epistatic effect. H) Power to detect differential correlation between 2-marker unions. Black lines represent r² = 0 in controls and r² = 0.8 in cases. Red lines represent r² = 0.8 in controls and r² = 0 in cases. Solid lines represent odds ratios of 2 and dotted lines represent 1.
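The simulation design in the Figure 1 caption (binomially sampled genotypes, case-control status from a logistic model, 938 cases and 863 controls per replicate) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function name `simulate_replicate`, the `baseline_odds` and `n_alleles` parameters, and the keep-sampling-until-both-groups-are-full scheme are all assumptions, since the caption does not say how the fixed group sizes were obtained or how many binomial trials were used per genotype.

```python
import math
import random


def simulate_replicate(freqs, odds_ratios, n_cases=938, n_controls=863,
                       baseline_odds=1.0, n_alleles=2, rng=None):
    """Simulate one case-control replicate (hypothetical helper).

    freqs        -- per-SNP sampling frequencies
    odds_ratios  -- per-SNP odds ratio for each unit of the coded genotype
    n_alleles    -- binomial trials per genotype (assumed 2; use 1 for 0/1 coding)
    """
    rng = rng or random.Random()
    betas = [math.log(o) for o in odds_ratios]
    intercept = math.log(baseline_odds)
    cases, controls = [], []
    # Assumption: draw individuals until both groups reach their target size.
    while len(cases) < n_cases or len(controls) < n_controls:
        # Coded genotype at each SNP: Binomial(n_alleles, f).
        genotype = [sum(rng.random() < f for _ in range(n_alleles))
                    for f in freqs]
        # Logistic model for disease status.
        logit = intercept + sum(b * g for b, g in zip(betas, genotype))
        p_case = 1.0 / (1.0 + math.exp(-logit))
        if rng.random() < p_case:
            if len(cases) < n_cases:
                cases.append(genotype)
        elif len(controls) < n_controls:
            controls.append(genotype)
    return cases, controls
```

Calling `simulate_replicate([0.2, 0.3], [1.5, 1.0], rng=random.Random(1))` would give one replicate for a 2-marker union in which only the first SNP has an effect; repeating the call 1,000 times corresponds to one experiment in the caption.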
Figure 2. Quality control. A) Sample processing for the discovery sample. B) SNP processing for the discovery sample. C) Sample processing for the replication sample. D) SNP processing for the replication sample.
Figure 3. Genome-wide scans for single marker analysis. The red horizontal line indicates the significance level of 3.42 × 10⁻⁸.
Summary of discovery and replication results
| Model | Size | Discovery P | Discovery OR (95% CI) | Replication P | Replication OR (95% CI) | rsid | Chr | Position (bp) | Minor/major allele | Gene |
|---|---|---|---|---|---|---|---|---|---|---|
| Dominant | 1 | 8.17 × 10⁻⁶ | 0.65 (0.54, 0.79) | 0.845 | 1.04 (0.69, 1.56) | rs1584586 | 3 | 151677041 | A/G | |
| Dominant | 1 | 6.70 × 10⁻⁶ | 1.71 (1.35, 2.18) | 0.037 | 1.69 (1.02, 2.81) | rs1564282 | 4 | 842313 | T/C | |
| Dominant | 1 | 4.03 × 10⁻⁶ | 1.73 (1.36, 2.21) | NA | NA | rs11248051 | 4 | 848332 | T/C | |
| Dominant | 1 | 8.25 × 10⁻⁶ | 1.67 (1.32, 2.11) | 0.033 | 1.66 (1.02, 2.71) | rs11248060 | 4 | 954359 | T/C | |
| Dominant | 1 | 1.89 × 10⁻⁶ | 12.95 (3.24, 112.76) | NA | NA | rs7848576 | 9 | 697463 | G/A | |
| Dominant | 1 | 3.37 × 10⁻⁶ | 0.64 (0.53, 0.78) | 0.560 | 0.89 (0.60, 1.33) | rs898528 | 17 | 74678398 | T/C | NA |
| Dominant | 1 | 6.84 × 10⁻⁶ | 0.65 (0.54, 0.79) | 0.283 | 1.25 (0.84, 1.87) | rs2830713 | 21 | 27416311 | T/C | NA |
| Dominant | 2 | 1.77 × 10⁻⁶ | 1.58 (1.30, 1.91) | 0.202 | 1.30 (0.87, 1.95) | rs1564282 | 4 | 842313 | T/C | |
| | | | | | | rs2061846 | 4 | 842484 | C/T | |
| Dominant | 2 | 2.24 × 10⁻⁶ | 1.57 (1.30, 1.90) | NA | NA | rs4690339 | 4 | 844712 | G/A | |
| | | | | | | rs11248051 | 4 | 848332 | T/C | |
| Dominant | 2 | 2.91 × 10⁻⁶ | 0.63 (0.52, 0.77) | 0.252 | 1.28 (0.84, 1.97) | rs194907 | 6 | 82485214 | G/A | |
| | | | | | | rs1276888 | 6 | 82489107 | T/C | |
| Dominant | 3 | 1.40 × 10⁻⁶ | 1.58 (1.31, 1.92) | NA | NA | rs1564282 | 4 | 842313 | T/C | |
| | | | | | | rs2061846 | 4 | 842484 | C/T | |
| | | | | | | rs4690339 | 4 | 844712 | G/A | |
| Dominant | 3 | 9.55 × 10⁻⁶ | 0.62 (0.50, 0.77) | 1.000 | 1.01 (0.65, 1.58) | rs1881747 | 10 | 54003581 | C/T | NA |
| | | | | | | rs1919764 | 10 | 54015996 | C/T | NA |
| | | | | | | rs1919738 | 10 | 54021111 | A/G | NA |
| Dominant | 4 | 5.06 × 10⁻⁶ | 0.61 (0.49, 0.76) | 0.914 | 0.98 (0.63, 1.54) | rs7085224 | 10 | 53971746 | G/A | NA |
| | | | | | | rs1881747 | 10 | 54003581 | C/T | NA |
| | | | | | | rs1919764 | 10 | 54015996 | C/T | NA |
| | | | | | | rs1919738 | 10 | 54021111 | A/G | NA |
| Recessive | 1 | 7.25 × 10⁻⁶ | 0.22 (0.09, 0.46) | 0.060 | 3.72 (0.88, 22.09) | rs9310784 | 3 | 25905208 | C/T | |
| Recessive | 1 | 2.26 × 10⁻⁶ | 1.65 (1.33, 2.04) | 0.182 | 0.73 (0.45, 1.16) | rs2382722 | 16 | 27300127 | G/A | |
| Recessive | 1 | 5.36 × 10⁻⁶ | 0.56 (0.43, 0.72) | 0.078 | 1.65 (0.94, 2.91) | rs1159220 | 22 | 31410753 | T/C | |
| Recessive | 1 | 6.14 × 10⁻⁶ | 0.56 (0.43, 0.73) | 0.078 | 1.65 (0.94, 2.91) | rs3788483 | 22 | 31414345 | C/T | |
| Recessive | 2 | 8.33 × 10⁻⁶ | 2.06 (1.48, 2.90) | 0.521 | 1.26 (0.64, 2.48) | rs2189387 | 17 | 36293632 | A/G | |
| | | | | | | rs7212483 | 17 | 36294578 | T/C | |
| Recessive | 2 | 7.75 × 10⁻⁶ | 0.57 (0.44, 0.74) | 0.164 | 1.45 (0.86, 2.45) | rs1159220 | 22 | 31410753 | T/C | |
| | | | | | | rs5998577 | 22 | 31412043 | A/G | |
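
As a reading aid for the OR (95% CI) columns, the sketch below computes an odds ratio and a Wald-type confidence interval from a 2×2 table of carrier counts. It is an illustration under stated assumptions only: the record does not say how the intervals in the table were computed, and they need not be Wald intervals. The function name `odds_ratio_wald` and the counts in the usage note are hypothetical.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.959964):
    """Odds ratio with a Wald confidence interval (hypothetical helper).

    a -- cases carrying the coded genotype
    b -- cases not carrying it
    c -- controls carrying it
    d -- controls not carrying it
    z -- standard normal quantile (default gives a 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

For example, with made-up counts `odds_ratio_wald(20, 80, 10, 90)` the point estimate is 20 × 90 / (80 × 10) = 2.25. The Wald approximation becomes unreliable when any cell count is small, so it is unlikely to match the much wider intervals reported for some SNPs.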
Haplotype analysis for the locus at chromosome 4p16
| Haplotype ᵃ | Discovery frequency | Discovery OR | Discovery P | Replication frequency | Replication OR | Replication P |
|---|---|---|---|---|---|---|
| CC | 0.173 | 1.17 | 0.075 | 0.155 | 0.92 | 0.634 |
| TT | 0.111 | 1.59 | 2.10 × 10⁻⁵ | 0.102 | 1.58 | 0.049 |
| CT | 0.716 | 0.72 | 9.86 × 10⁻⁶ | 0.743 | 0.87 | 0.356 |
ᵃ In order, the SNPs are rs1564282 and rs2061846.
Figure 4. Comparison of single marker and multi-locus methods. A) The observed joint genotype counts for rs1564282 and rs2061846. One control with the CT genotype at rs1564282 had a missing genotype at rs2061846. B) Single marker analysis for rs1564282 under additive coding. C) Single marker analysis for rs2061846 under additive coding. D) Single marker analysis of rs1564282 under dominant coding. E) Single marker analysis of rs2061846 under dominant coding. "-" indicates either allele. F) Haplotype analysis. G) Union analysis under dominant coding.
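The additive, dominant, and recessive codings named in the Figure 4 caption and in the Model column of the summary table can be illustrated with the conventional definitions below. This is a sketch assuming the standard textbook codings with respect to the minor allele; the record itself does not spell them out, and the helper name `code_genotype` is hypothetical.

```python
def code_genotype(minor_allele_count, model):
    """Code a SNP genotype (0, 1, or 2 minor alleles) under a genetic model.

    Assumed conventional definitions:
      additive  -- the minor allele count itself
      dominant  -- 1 if at least one minor allele is carried, else 0
      recessive -- 1 only if two minor alleles are carried, else 0
    """
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1, or 2")
    if model == "additive":
        return minor_allele_count
    if model == "dominant":
        return int(minor_allele_count >= 1)
    if model == "recessive":
        return int(minor_allele_count == 2)
    raise ValueError(f"unknown model: {model!r}")
```

Under these definitions a heterozygote (one minor allele) is coded 1 under the additive and dominant models and 0 under the recessive model, which is why the same SNP can give different results across panels B to E.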
Haplotype analysis for the locus at chromosome 3p24
| Haplotype ᵃ | Discovery frequency | Discovery OR | Discovery P | Replication frequency | Replication OR | Replication P |
|---|---|---|---|---|---|---|
| ACGCCAAT | 0.799 | 0.98 | 0.835 | 0.782 | 0.83 | 0.248 |
| GTATTGCC | 0.072 | 0.76 | 0.040 | 0.105 | 1.01 | 0.974 |
| GTATTAAT | 0.033 | 1.42 | 0.069 | 0.043 | 0.73 | 0.365 |
| GCACTACT | NA | NA | NA | 0.014 | 7.11 | 0.012 |
| GTATTACC | 0.033 | 0.99 | 0.969 | 0.041 | 2.07 | 0.058 |
| GCATCACC | 0.038 | 1.27 | 0.195 | NA | NA | NA |
ᵃ In order, the SNPs are rs4481118, rs4293672, rs6551000, rs6793031, rs1991332, rs7644516, rs2052760, and rs9310784.
Haplotype analysis for the locus at chromosome 22q12
| Haplotype ᵃ | Discovery frequency | Discovery OR | Discovery P | Replication frequency | Replication OR | Replication P |
|---|---|---|---|---|---|---|
| TA | 0.163 | 0.84 | 0.058 | 0.166 | 0.99 | 0.941 |
| CA | 0.036 | 0.93 | 0.717 | 0.049 | 0.49 | 0.084 |
| TG | 0.242 | 0.89 | 0.155 | 0.259 | 1.21 | 0.259 |
| CG | 0.559 | 1.20 | 0.007 | 0.526 | 0.97 | 0.820 |
ᵃ In order, the SNPs are rs1159220 and rs5998577.