Reedik Mägi, Jennifer L Asimit, Aaron G Day-Williams, Eleftheria Zeggini, Andrew P Morris.
Abstract
Genome-wide association studies have been successful in identifying loci contributing effects to a range of complex human traits. The majority of reproducible associations within these loci are with common variants, each of modest effect, which together explain only a small proportion of heritability. It has been suggested that much of the unexplained genetic component of complex traits can thus be attributed to rare variation. However, genome-wide association study genotyping chips have been designed primarily to capture common variation, and thus are underpowered to detect the effects of rare variants. Nevertheless, we demonstrate here, by simulation, that imputation from an existing scaffold of genome-wide genotype data up to high-density reference panels has the potential to identify rare variant associations with complex traits, without the need for costly re-sequencing experiments. By application of this approach to genome-wide association studies of seven common complex diseases, imputed up to publicly available reference panels, we identify genome-wide significant evidence of rare variant association in PRDM10 with coronary artery disease and multiple genes in the major histocompatibility complex (MHC) with type 1 diabetes. The results of our analyses highlight that genome-wide association studies have the potential to offer an exciting opportunity for gene discovery through association with rare variants, conceivably leading to substantial advancements in our understanding of the genetic architecture underlying complex human traits.
Keywords: genome-wide association study; imputation; rare variants
Year: 2012 PMID: 22951892 PMCID: PMC3569874 DOI: 10.1002/gepi.21675
Source DB: PubMed Journal: Genet Epidemiol ISSN: 0741-0395 Impact factor: 2.135
Fig. 1. Power, at a nominal significance level of P < 0.05, to detect association of an accumulation of minor alleles with a quantitative trait, for different strategies for assaying rare genetic variation in a 50 kb gene, as a function of the size of the reference panel. Multiple causal variants in the gene contribute jointly to 5% of the overall trait variation. The panels correspond to two specific trait association models: (A) the maximum MAF of any individual causal variant is 1%, and the total MAF of all causal variants is 5%; and (B) the maximum MAF of any individual causal variant is 0.5%, and the total MAF of all causal variants is 2%.
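The "accumulation of minor alleles" in Fig. 1 can be read as a burden-style test: the rare variants in a gene are collapsed into one score per individual, and the quantitative trait is regressed on that score. The sketch below illustrates that idea only and is not the authors' implementation. The function names, the default 1% MAF cut-off, and the toy data are all hypothetical.

```python
import math

def burden_scores(genotypes, mafs, maf_max=0.01):
    """Sum minor-allele counts over variants with MAF below maf_max.

    genotypes: one row per individual, one minor-allele count (0, 1, 2,
    or an imputed dosage) per variant. mafs: one MAF per variant.
    """
    rare = [j for j, maf in enumerate(mafs) if maf < maf_max]
    return [sum(row[j] for j in rare) for row in genotypes]

def regress_trait_on_burden(burden, trait):
    """Ordinary least squares of trait on burden; returns (slope, std_error)."""
    n = len(burden)
    mean_x = sum(burden) / n
    mean_y = sum(trait) / n
    sxx = sum((x - mean_x) ** 2 for x in burden)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(burden, trait))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(burden, trait))
    std_error = math.sqrt(rss / (n - 2) / sxx)
    return slope, std_error

# Hypothetical toy data: 4 individuals, 3 variants (the third is common
# and is therefore excluded from the burden score).
genotypes = [[0, 0, 1], [1, 0, 0], [0, 0, 0], [1, 1, 0]]
mafs = [0.004, 0.002, 0.20]
trait = [0.1, 1.0, -0.1, 2.1]

burden = burden_scores(genotypes, mafs)
slope, std_error = regress_trait_on_burden(burden, trait)
```

The ratio `slope / std_error` gives a test statistic for the gene. A real analysis would also adjust for covariates and would use far larger samples than this toy example.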
Fig. 2. Principal components representing axes of genetic variation demonstrating clear separation between 12 UK regions of residence. Each point represents the mean projection of samples from each UK region onto the first three axes of genetic variation.
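The points in Fig. 2 are per-region means of the sample projections. Assuming the principal component scores for each sample are already available, those means could be computed along the following lines. The function name, the region labels, and the scores are hypothetical and serve only as illustration.

```python
from collections import defaultdict

def region_centroids(pc_scores, regions, n_axes=3):
    """Mean projection of the samples in each region onto the first n_axes PCs."""
    sums = defaultdict(lambda: [0.0] * n_axes)
    counts = defaultdict(int)
    for scores, region in zip(pc_scores, regions):
        counts[region] += 1
        for k in range(n_axes):
            sums[region][k] += scores[k]
    return {r: [s / counts[r] for s in sums[r]] for r in sums}

# Hypothetical scores for three samples from two made-up regions.
pc_scores = [[0.0, 1.0, 2.0], [2.0, 3.0, 4.0], [1.0, 1.0, 1.0]]
regions = ["region_A", "region_A", "region_B"]

centroids = region_centroids(pc_scores, regions)
```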
Fig. 3. Manhattan plots summarising association of seven diseases from the WTCCC experiment with accumulations of well-imputed rare variants (MAF < 1% and info score of at least 0.4) within genes (as defined by the UCSC human genome database). Each point represents a gene, plotted according to the observed −log₁₀ P-value of association (y-axis) and the physical position of the midpoint of the transcript (x-axis), with those achieving genome-wide significance (P < 1.7 × 10⁻⁶) highlighted in red.
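Two numerical criteria appear in the caption of Fig. 3: the variant filter (MAF < 1% and info score of at least 0.4) and the gene-level significance threshold. The sketch below encodes the filter and shows that a Bonferroni correction of a 5% error rate for 30,000 genes rounds to 1.7 × 10⁻⁶. The gene count of 30,000 is an assumption made here for illustration, and the function names are hypothetical.

```python
def is_well_imputed_rare(maf, info, maf_max=0.01, info_min=0.4):
    """True if a variant passes the rare (MAF) and imputation-quality (info) filters."""
    return maf < maf_max and info >= info_min

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that controls the family-wise error rate at alpha."""
    return alpha / n_tests

# Assumption: roughly 30,000 genes tested genome-wide.
threshold = bonferroni_threshold(0.05, 30000)
threshold_text = f"{threshold:.1e}"
```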
Genes demonstrating genome-wide significant (P < 1.7 × 10⁻⁶) evidence of rare variant association with type 1 diabetes in the MHC, before and after adjustment for the lead common GWAS SNP (rs9268645) in the region.
| Gene symbol | Start (build 37, chromosome 6, bp) | Stop (build 37, chromosome 6, bp) | Number of rare variants | Mean MAF (%) | P-value, adjusted for three principal components only | OR (95% CI), adjusted for three principal components only | P-value, adjusted for three principal components and rs9268645 | OR (95% CI), adjusted for three principal components and rs9268645 |
|---|---|---|---|---|---|---|---|---|
| Genome-wide significant before adjustment for rs9268645 | | | | | | | | |
| | 32,407,646 | 32,412,821 | 23 | 0.32 | 2.0 × 10⁻¹³ | 0.556 (0.476–0.650) | 2.2 × 10⁻⁹ | 0.642 (0.555–0.742) |
| | 32,485,162 | 32,498,006 | 43 | 0.51 | 1.6 × 10⁻¹⁰ | 0.746 (0.682–0.817) | 1.6 × 10⁻¹⁰ | 0.738 (0.673–0.810) |
| | 31,837,321 | 31,846,823 | 27 | 0.22 | 1.7 × 10⁻¹⁰ | 0.556 (0.465–0.666) | 8.7 × 10⁻⁹ | 0.586 (0.489–0.703) |
| | 32,152,509 | 32,157,963 | 13 | 0.22 | 1.2 × 10⁻⁹ | 0.375 (0.273–0.514) | 9.5 × 10⁻¹⁴ | 0.290 (0.210–0.402) |
| | 31,976,196 | 31,981,050 | 7 | 0.41 | 2.6 × 10⁻⁹ | 2.346 (1.772–3.107) | 2.4 × 10⁻⁴ | 1.719 (1.287–2.295) |
| | 32,135,989 | 32,139,282 | 5 | 0.24 | 3.3 × 10⁻⁹ | 0.118 (0.058–0.239) | 7.2 × 10⁻⁷ | 0.169 (0.084–0.342) |
| | 31,847,536 | 31,853,019 | 16 | 0.23 | 4.1 × 10⁻⁹ | 0.437 (0.332–0.576) | 5.1 × 10⁻⁷ | 0.484 (0.365–0.643) |
| | 31,021,983 | 31,027,653 | 25 | 0.35 | 3.2 × 10⁻⁸ | 0.757 (0.686–0.836) | 9.1 × 10⁻⁷ | 0.777 (0.702–0.859) |
| | 32,256,302 | 32,261,812 | 22 | 0.41 | 7.1 × 10⁻⁸ | 0.748 (0.673–0.831) | 1.9 × 10⁻⁵ | 0.793 (0.713–0.882) |
| | 31,557,050 | 31,560,762 | 10 | 0.23 | 1.0 × 10⁻⁶ | 0.436 (0.312–0.608) | 1.6 × 10⁻⁴ | 0.518 (0.368–0.729) |
| Genome-wide significant after adjustment for rs9268645 | | | | | | | | |
| | 32,917,411 | 32,920,899 | 13 | 0.35 | 8.6 × 10⁻⁶ | 0.606 (0.389–0.942) | 1.1 × 10⁻⁷ | 0.540 (0.430–0.678) |
| | 31,926,580 | 31,937,532 | 33 | 0.20 | 4.0 × 10⁻⁵ | 0.706 (0.507–0.984) | 2.6 × 10⁻⁷ | 0.640 (0.540–0.759) |
| | 32,008,931 | 32,014,384 | 10 | 0.35 | 4.3 × 10⁻⁵ | 0.589 (0.355–0.978) | 4.1 × 10⁻⁷ | 0.513 (0.396–0.664) |
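An odds ratio and its 95% confidence interval, as reported in the table, can be converted to a standard error on the log-odds scale and then to an approximate Wald P-value. The sketch below applies this to the first row of the table, OR 0.556 (0.476–0.650). It assumes the interval is symmetric on the log scale and uses a normal approximation. The P-values in the table come from the authors' own test, so exact agreement should not be expected. The function name is hypothetical.

```python
import math

def wald_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover log(OR), its standard error, the Wald z and a two-sided P-value.

    Assumes the confidence interval is log(OR) +/- z_crit * SE.
    """
    log_or = math.log(odds_ratio)
    std_error = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / std_error
    # Two-sided tail probability of a standard normal beyond |z|.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return log_or, std_error, z, p_two_sided

# First row of the table, analysis adjusted for three principal components only.
log_or, std_error, z, p = wald_from_or_ci(0.556, 0.476, 0.650)
```

An odds ratio below 1 gives a negative `z`, meaning that carrying rare alleles in the gene is associated with lower odds of disease.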
Fig. 4. Regional plots summarising association of type 1 diabetes with accumulations of well-imputed rare variants (MAF < 1% and info score of at least 0.4) within MHC genes (as defined by the UCSC human genome database). Each point represents a gene, and those achieving genome-wide significance (P < 1.7 × 10⁻⁶) are highlighted in red. The panels correspond to analyses (A) before and (B) after adjustment for the lead common GWAS SNP (rs9268645) in the region.