Karen A Hunt, Vanisha Mistry, Nicholas A Bockett, Tariq Ahmad, Maria Ban, Jonathan N Barker, Jeffrey C Barrett, Hannah Blackburn, Oliver Brand, Oliver Burren, Francesca Capon, Alastair Compston, Stephen C L Gough, Luke Jostins, Yong Kong, James C Lee, Monkol Lek, Daniel G MacArthur, John C Mansfield, Christopher G Mathew, Charles A Mein, Muddassar Mirza, Sarah Nutland, Suna Onengut-Gumuscu, Efterpi Papouli, Miles Parkes, Stephen S Rich, Steven Sawcer, Jack Satsangi, Matthew J Simmonds, Richard C Trembath, Neil M Walker, Eva Wozniak, John A Todd, Michael A Simpson, Vincent Plagnol, David A van Heel.
Abstract
Genome-wide association studies (GWAS) have identified common variants of modest-effect size at hundreds of loci for common autoimmune diseases; however, a substantial fraction of heritability remains unexplained, to which rare variants may contribute. To discover rare variants and test them for association with a phenotype, most studies re-sequence a small initial sample size and then genotype the discovered variants in a larger sample set. This approach fails to analyse a large fraction of the rare variants present in the entire sample set. Here we perform simultaneous amplicon-sequencing-based variant discovery and genotyping for coding exons of 25 GWAS risk genes in 41,911 UK residents of white European origin, comprising 24,892 subjects with six autoimmune disease phenotypes and 17,019 controls, and show that rare coding-region variants at known loci have a negligible role in common autoimmune disease susceptibility. These results do not support the rare-variant synthetic genome-wide-association hypothesis (in which unobserved rare causal variants lead to association detected at common tag variants). Many known autoimmune disease risk loci contain multiple, independently associated, common and low-frequency variants, and so genes at these loci are a priori stronger candidates for harbouring rare coding-region variants than other genes. Our data indicate that the missing heritability for common autoimmune diseases may not be attributable to the rare coding-region variant portion of the allelic spectrum, but perhaps, as others have proposed, may be a result of many common-variant loci of weak effect.
Year: 2013 PMID: 23698362 PMCID: PMC3736321 DOI: 10.1038/nature12170
Source DB: PubMed Journal: Nature ISSN: 0028-0836 Impact factor: 49.962
Variant types in protein coding regions of 25 genes in 41,911 phenotyped individuals
Numbers shown are after quality control steps. Annotation performed with GENCODE V14 gene definitions. Triallelic (n=124) and quadrallelic (n=3) sites (combined single nucleotide variants and indels) are shown as multiple separate variants with the appropriate annotation for each non-reference allele.
| Variant type | All variants | Rare | Novel |
|---|---|---|---|
| Nonsynonymous SNV | 1,792 | 1,758 | 1,379 |
| Splicing SNV | 86 | 85 | 65 |
| Stopgain SNV | 47 | 47 | 42 |
| Synonymous SNV | 1,024 | 972 | 674 |
| Frameshift indels | 31 | 31 | 31 |
| Nonframeshift indels | 10 | 10 | 10 |
| Total variants | 2,990 | 2,903 | 2,201 |
| Singleton | 1,602 | 1,598 | 1,411 |
| Doubleton | 470 | 468 | 378 |
Rare: MAF < 0.5% in 17,019 sequenced controls.
Novel: not seen in dbSNP137, the 1000 Genomes Project (April 2012 release), or NHLBI (data release ESP6500SI, with 6,503 individuals).
Figure 1. Association analyses of discovered rare functional variants in autoimmune diseases
We define rare functional variants as MAF<0.5% in 17,019 controls and predicted non-synonymous, premature stop or splice site annotation. Quantile-quantile plots compare observed versus expected test-statistic distributions, with shading indicating 99% confidence intervals. Full results are available in Supplementary Data. Each of six individual diseases, and all autoimmune diseases combined, were tested as phenotypes.
a. Gene-based C-alpha test (25 genes by 7 phenotypes, n=41,911 subjects) allowing for both risk and protective effects for rare functional variants. Singleton variants are pooled into a single binomial count per phenotype.
b. Gene based burden tests (25 genes by 7 phenotypes, n=41,911 subjects) comparing summed allele counts for rare functional variants in cases versus controls with Fisher’s exact test.
c. Conditional gene-based burden test (25 genes by 6 phenotypes, n=32,806 subjects): rare functional variant allele counts are summed for each individual per gene and introduced in a logistic regression, including Immunochip covariates for multiple independent top (common) variant signals selected on the basis of a stepwise regression (down to P>10⁻⁴). The psoriasis phenotype was not tested as most samples do not have Immunochip data.
d. UNIQ-cases test (25 genes by 7 phenotypes, n=41,911 subjects), which compares the number of rare functional variants observed only in cases with the distribution of this value under random permutation (10,000 times) of the phenotypes.
e. UNIQ-controls test: same as d, but for rare functional variants uniquely observed in controls.
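The burden test in panel b and the UNIQ-cases test in panel d can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The 2x2 table layout (rare alleles versus all other alleles, cases versus controls), the add-one correction in the permutation P value, the function names and the toy data are all assumptions made for illustration. Fisher's exact test is written out with the standard library so that the block runs without extra dependencies.

```python
import random
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def burden_test(rare_alleles_cases, n_cases, rare_alleles_controls, n_controls):
    """Gene-level burden test on summed rare-allele counts.

    Assumed layout: rows are cases/controls, columns are rare alleles versus
    all remaining alleles (two alleles per diploid subject).
    """
    return fisher_exact_two_sided(
        rare_alleles_cases, 2 * n_cases - rare_alleles_cases,
        rare_alleles_controls, 2 * n_controls - rare_alleles_controls,
    )


def uniq_cases(carriers_by_variant, is_case):
    """Count variants whose carriers are all cases."""
    return sum(1 for carriers in carriers_by_variant
               if carriers and all(is_case[i] for i in carriers))


def uniq_permutation_p(carriers_by_variant, is_case, n_perm=10_000, seed=0):
    """Permutation P value for the UNIQ-cases statistic.

    Phenotype labels are shuffled n_perm times; the add-one correction is an
    assumption of this sketch, not a detail taken from the paper.
    """
    rng = random.Random(seed)
    observed = uniq_cases(carriers_by_variant, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if uniq_cases(carriers_by_variant, labels) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: 6 cases (indices 0-5) and 6 controls (indices 6-11).
toy_is_case = [True] * 6 + [False] * 6
# Each set holds the subject indices carrying one rare functional variant.
toy_carriers = [{0}, {1, 2}, {3, 7}, {8}, {9, 10}]

if __name__ == "__main__":
    print("burden P:", burden_test(4, 6, 4, 6))
    print("UNIQ-cases (observed, P):",
          uniq_permutation_p(toy_carriers, toy_is_case, n_perm=1000))
```

Swapping the case and control labels before calling `uniq_permutation_p` gives the corresponding UNIQ-controls statistic from panel e.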