Gosia Trynka, Karen A Hunt, Nicholas A Bockett, Jihane Romanos, Vanisha Mistry, Agata Szperl, Sjoerd F Bakker, Maria Teresa Bardella, Leena Bhaw-Rosun, Gemma Castillejo, Emilio G de la Concha, Rodrigo Coutinho de Almeida, Kerith-Rae M Dias, Cleo C van Diemen, Patrick C A Dubois, Richard H Duerr, Sarah Edkins, Lude Franke, Karin Fransen, Javier Gutierrez, Graham A R Heap, Barbara Hrdlickova, Sarah Hunt, Leticia Plaza Izurieta, Valentina Izzo, Leo A B Joosten, Cordelia Langford, Maria Cristina Mazzilli, Charles A Mein, Vandana Midah, Mitja Mitrovic, Barbara Mora, Marinita Morelli, Sarah Nutland, Concepción Núñez, Suna Onengut-Gumuscu, Kerra Pearce, Mathieu Platteel, Isabel Polanco, Simon Potter, Carmen Ribes-Koninckx, Isis Ricaño-Ponce, Stephen S Rich, Anna Rybak, José Luis Santiago, Sabyasachi Senapati, Ajit Sood, Hania Szajewska, Riccardo Troncone, Jezabel Varadé, Chris Wallace, Victorien M Wolters, Alexandra Zhernakova, B K Thelma, Bozena Cukrowska, Elena Urcelay, Jose Ramon Bilbao, M Luisa Mearin, Donatella Barisani, Jeffrey C Barrett, Vincent Plagnol, Panos Deloukas, Cisca Wijmenga, David A van Heel.
Abstract
Using variants from the 1000 Genomes Project pilot European CEU dataset and data from additional resequencing studies, we densely genotyped 183 non-HLA risk loci previously associated with immune-mediated diseases in 12,041 individuals with celiac disease (cases) and 12,228 controls. We identified 13 new celiac disease risk loci reaching genome-wide significance, bringing the number of known loci (including the HLA locus) to 40. We found multiple independent association signals at over one-third of these loci, a finding that is attributable to a combination of common, low-frequency and rare genetic variants. Compared to previously available data such as those from HapMap3, our dense genotyping in a large sample collection provided a higher resolution of the pattern of linkage disequilibrium and suggested localization of many signals to finer scale regions. In particular, 29 of the 54 fine-mapped signals seemed to be localized to single genes and, in some instances, to gene regulatory elements. Altogether, we define the complex genetic architecture of the risk regions of, and refine the risk signals for, celiac disease, providing the next step toward uncovering the causal mechanisms of the disease.
Year: 2011 PMID: 22057235 PMCID: PMC3242065 DOI: 10.1038/ng.998
Source DB: PubMed Journal: Nat Genet ISSN: 1061-4036 Impact factor: 38.330
Sample Collections
| Population sample | Celiac cases | Controls |
|---|---|---|
| UK | 7728 | 8274 |
| The Netherlands | 1123 | 1147 |
| Poland | 505 | 533 |
| Spain - CEGEC | 545 | 308 |
| Spain - Madrid | 537 | 320 |
| Italy - Rome, Milan, Naples | 1374 | 1255 |
| India - Punjab | 229 | 391 |
| Total | 12041 | 12228 |
The two Spanish population samples were considered separately due to genotyping in different laboratories.
The UK controls comprised 5430 UK 1958 Birth Cohort participants and 2844 UK Blood Services-Common Controls.
Each of the collections from the UK, the Netherlands, Poland, Spain (Madrid) and Italy contained essentially the same sample set as our 2010 celiac disease GWAS[5], now with substantial additional samples from the UK and the Netherlands and with amplified DNA samples excluded from the Spanish collections. The Indian collection has not previously been studied. Our 2010 GWAS contained several collections not studied here.
Figure 1. Manhattan plot of association statistics for known and novel celiac disease risk loci
Novel loci are indicated in blue; loci with multiple signals are indicated with grey highlight. The significance threshold is drawn at P=5×10⁻⁸.
Table 2. Risk variant signals at genome-wide significant celiac disease loci.
Non-HLA loci meeting genome-wide significance (P<5×10⁻⁸) in the current Immunochip dataset, or in the previous GWAS/replication dataset[5], are shown. Loci reported for the first time for celiac disease at genome-wide significance are shown in bold in the Top variant column.
| Top variant | Chr | HapMap3 CEU LD block | MAF | P | OR | Highly correlated variants (r²>0.9) | Localization: protein coding genes |
|---|---|---|---|---|---|---|---|
| rs4445406 | 1 | 2396747 - 2775531 | 0.344 | 5.4×10⁻¹² | 0.87 | 2510162 - 2710035 | |
| rs72657048 | 1 | 25111876 - 25180863 | 0.498 | 3.8×10⁻⁶ | 0.92 | 25162321 - 25177139 | 0 - 10kb 5′ & 1st exon |
| | 1 | 170917308 - 171207073 | 0.185 | 1.4×10⁻¹⁰ | 0.86 | 170940206 - 170948695 | 35 - 43kb 5′ FASLG |
| | 1 | ” | 0.180 | 8.3×10⁻⁹ | 0.87 | 171129607 – 171131275 | intergenic between |
| rs1359062 | 1 | 190728935 - 190814664 | 0.180 | 2.5×10⁻²⁵ | 0.77 | 190786488 - 190811722 | 0 - 24kb 5′ & 1st exon |
| signal 2 | 1 | ” | | 3.7×10⁻⁴ | 1.23 | 190779182 | 32kb 5′ |
| rs10800746 | 1 | 199119734 - 199308949 | 0.305 | 2.6×10⁻⁸ | 0.89 | 199148015 | 9th intron |
| rs13003464 | 2 | 60768233 - 61745913 | 0.388 | 4.3×10⁻¹⁶ | 1.17 | 61040333 - 61058360 | exons 5-11 |
| rs10167650 | 2 | 68389757 - 68535760 | 0.266 | 1.3×10⁻⁴ | 0.92 | 68493221 - 68499064 | intergenic between |
| rs990171 | 2 | 102221730 - 102573468 | 0.225 | 1.2×10⁻¹⁶ | 1.20 | 102338297 - 102459513 | |
| rs1018326 | 2 | 181502502 - 181972196 | 0.418 | 3.1×10⁻¹⁶ | 1.16 | 181708291 - 181803246 | intergenic between |
| | 2 | 191581798 - 191715979 | 0.058 | 8.4×10⁻⁹ | 0.79 | 191621279 - 191643278 | exons 6-14 |
| | 2 | ” | 0.296 | 1.3×10⁻⁶ | 1.10 | 191681808 | intron 3 |
| | 2 | ” | 0.119 | 2.6×10⁻⁴ | 0.90 | 191656882 | intron 3 |
| rs1980422 | 2 | 204154625 - 204524627 | 0.233 | 1.4×10⁻¹⁵ | 1.19 | 204318641 - 204320303 | intergenic between |
| signal 2 | 2 | ” | 0.217 | 1.6×10⁻⁵ | 0.91 | 204470572 – 204478299 | intergenic between |
| signal 3 | 2 | ” | | 1.3×10⁻⁴ | 1.20 | 204158521 - 204168206 | 111 – 121 kb 5′ |
| rs4678523 | 3 | 32895606 - 33063377 | 0.313 | 2.4×10⁻⁷ | 1.11 | 33012725 - 33012756 | intergenic between |
| rs2097282 | 3 | 45904804 - 46625997 | 0.314 | 1.1×10⁻²⁰ | 1.20 | 46321275 - 46377631 | intergenic between |
| signal 2 | 3 | ” | 0.361 | 8.6×10⁻⁹ | 1.12 | 46162711 – 46180690 | 38 – 55 kb 3′ CCR1 |
| signal 3 | 3 | ” | 0.070 | 4.8×10⁻⁵ | 1.16 | 46458634 – 46480319 | exons 2-13 |
| rs61579022 | 3 | 120587671 - 120783345 | 0.390 | 9.9×10⁻⁹ | 1.11 | 120601187 - 120605968 | intron 10 ARHGAP31 |
| [imm_3_161120372] | 3 | 161065075 - 161237201 | 0.111 | 2.6×10⁻²⁷ | 1.36 | 161112778 - 161147744 | intergenic between |
| signal 2 | 3 | ” | 0.288 | 9.8×10⁻⁹ | 0.88 | 161106253 (1) | intergenic between |
| signal 3 | 3 | ” | 0.455 | 8.1×10⁻⁸ | 1.12 | 161136316 – 161168494 | intergenic between |
| rs2030519 | 3 | 189552054 - 189622323 | 0.486 | 3.0×10⁻⁴⁹ | 0.76 | 189587750 - 189602595 | intron 2 |
| rs13132308 | 4 | 123192512 - 123784752 | 0.166 | 1.9×10⁻³⁸ | 0.71 | 123269042 - 123770564 | multiple genes |
| signal 2 | 4 | ” | 0.073 | 8.6×10⁻⁵ | 1.15 | 123257527 – 123722990 | multiple genes |
| | 6 | 315547 - 402748 | 0.488 | 1.8×10⁻⁹ | 0.89 | 353079 - 355417 | 3′ UTR |
| | 6 | ” | 0.183 | 2.6×10⁻⁴ | 0.91 | 341321 | intron 4 |
| rs7753008 | 6 | 90863556 - 91096529 | 0.380 | 2.7×10⁻⁷ | 1.10 | 90866360 - 90875874 | intron 2 |
| rs55743914 | 6 | 127993875 - 128382483 | 0.239 | 1.1×10⁻¹⁸ | 1.21 | 128332892 - 128335255 | |
| signal 2 | 6 | ” | 0.150 | 1.2×10⁻⁵ | 0.89 | 128307943 - 128339304 | |
| rs17264332 | 6 | 137924568 - 138316778 | 0.211 | 5.0×10⁻³⁰ | 1.29 | 138000928 - 138048197 | intergenic between |
| | 6 | ” | 0.190 | 2.1×10⁻⁷ | 0.88 | 138015797 – 138043754 | intergenic between |
| rs182429 | 6 | 159242314 - 159461818 | 0.427 | 8.5×10⁻¹⁶ | 1.16 | 159385965 - 159390046 | 4kb 5′ and 5′ UTR |
| | 6 | ” | 0.071 | 2.8×10⁻⁶ | 1.18 | 159418255 | 32kb 5′ |
| | 7 | 37330503 - 37406978 | 0.101 | 2.1×10⁻⁸ | 1.18 | 37366994 - 37404402 | intron 1 |
| rs10808568 | 8 | 129211716 - 129368419 | 0.256 | 2.2×10⁻⁵ | 0.91 | 129333242 - 129345888 | 151 - 163kb 3′ of |
| | 10 | 6428077 - 6585110 | 0.229 | 1.9×10⁻⁸ | 0.88 | 6430198 | intergenic between |
| rs1250552 | 10 | 80690408 - 80774414 | 0.470 | 8.0×10⁻¹⁷ | 0.86 | 80728033 | intron 14 |
| | 11 | 110682429 - 110815769 | 0.209 | 1.9×10⁻¹¹ | 1.16 | not high-density genotyped | [region: |
| | 11 | 117847131 - 118270810 | 0.237 | 1.7×10⁻¹¹ | 0.86 | 118080536 - 118085075 | intergenic between |
| rs61907765 | 11 | 127754640 - 127985723 | 0.213 | 3.4×10⁻¹³ | 1.18 | 127886184 - 127901948 | 5kb 5′ & 1st exon |
| rs3184504 | 12 | 110183529 - 111514870 | 0.488 | 5.4×10⁻²¹ | 1.19 | 110368991 - 110492139 | 5′ UTR & exons 1-3 |
| | 14 | 68238574 - 68387815 | 0.221 | 4.7×10⁻⁸ | 1.13 | 68329159 - 68341722 | 1kb 5′ & 1st exon |
| | 15 | 72397784 - 73270664 | 0.278 | 7.8×10⁻⁹ | 1.13 | not high-density genotyped | [region inc. |
| | 16 | 10834038 - 10903351 | 0.246 | 5.8×10⁻¹⁰ | 1.14 | not high-density genotyped | [region: |
| rs243323 | 16 | 11220552 - 11385420 | 0.300 | 2.5×10⁻⁵ | 0.92 | 11254549 - 11268703 | 11kb 5′, all of SOCS1, 1kb 3′ |
| signal 2 | 16 | ” | | 1.3×10⁻⁴ | 1.70 | 11281298 | intergenic between |
| signal 3 | 16 | ” | 0.169 | 2.0×10⁻⁴ | 1.10 | 11292457 | 10kb 5′ PRM1 |
| rs11875687 | 18 | 12728413 - 12914117 | 0.150 | 1.9×10⁻¹⁰ | 1.17 | 12811903 - 12870206 | exons 2-5 |
| signal 2 | 18 | ” | | 5.2×10⁻⁵ | 1.20 | 12847758 | intron 2 |
| | 21 | 42683153 - 42760214 | 0.282 | 3.0×10⁻⁹ | 0.88 | 42728136 | intron 9 |
| rs58911644 | 21 | 44414408 - 44528088 | 0.193 | 6.2×10⁻⁷ | 0.89 | 44446245 - 44453549 | 18 - 25kb 3′ |
| | 22 | 20042414 - 20352005 | 0.186 | 5.7×10⁻¹¹ | 1.16 | 20250903 - 20313260 | |
| | X | 152825373 - 153043675 | 0.133 | 2.7×10⁻⁸ | 1.18 | 152872114 - 152937386 | |
Only the most significantly associated risk variant from each region and independent signal is shown. Variant names shown are as in dbSNP130 where available. Otherwise, the Illumina Immunochip manifest name is shown in brackets (Supplementary Table 5 shows both names for variants).
Regions were first defined by linkage disequilibrium blocks, extending 0.1 cM to the left and right of the risk SNP as defined by the HapMap3 CEU recombination map. For loci with multiple different previously reported risk SNPs for different diseases, and overlapping blocks, the extended region is shown. Regions where additional case resequencing (as well as 1000 Genomes) has been performed are shown, with boundaries of the resequencing effort(s). All chromosomal positions are based on NCBI build-36 (hg18) coordinates.
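The region definition in the note above (0.1 cM on either side of the risk SNP) can be illustrated with a short sketch. It assumes a genetic map supplied as sorted `(bp position, cumulative cM)` pairs. The toy map values, the function names `cm_at` and `region_bounds`, and the use of linear interpolation are illustrative assumptions, not the authors' pipeline or the HapMap3 file format.

```python
from bisect import bisect_left

def cm_at(pos, genetic_map):
    """Linearly interpolate the cumulative genetic position (cM) at a bp position.

    genetic_map: list of (bp, cM) pairs sorted by bp, with unique bp values.
    Positions outside the map are clamped to the first or last map entry.
    """
    positions = [p for p, _ in genetic_map]
    i = bisect_left(positions, pos)
    if i == 0:
        return genetic_map[0][1]
    if i == len(genetic_map):
        return genetic_map[-1][1]
    (p0, c0), (p1, c1) = genetic_map[i - 1], genetic_map[i]
    return c0 + (c1 - c0) * (pos - p0) / (p1 - p0)

def region_bounds(snp_pos, genetic_map, flank_cm=0.1):
    """Return the outermost map positions lying within flank_cm of the SNP."""
    centre = cm_at(snp_pos, genetic_map)
    inside = [p for p, c in genetic_map if abs(c - centre) <= flank_cm]
    return (min(inside), max(inside)) if inside else (snp_pos, snp_pos)

# Hypothetical toy map: (bp, cumulative cM). Not real HapMap3 data.
toy_map = [(1000, 0.00), (2000, 0.05), (3000, 0.12), (4000, 0.30), (5000, 0.31)]
left, right = region_bounds(2500, toy_map)
print(left, right)
```

In this toy map the jump in genetic distance between 3000 and 4000 bp acts like a recombination hotspot, so the region stops short of it on the right-hand side.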
MAF shown for European controls. See Supplementary Table 4 for more detailed allele frequencies in cases and controls by collection. Low frequency and rare variants shown in bold.
Logistic regression association test. Tests for second (and third) independent signals are conditioned on the first (and second) reported variant(s). Per locus significance thresholds for second (and third) independent signals are shown in Supplementary Table 3.
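The conditional testing described in the note above can be sketched as follows: fit a logistic model containing the already-reported variant(s), refit with the candidate variant added, and compare the two fits. This is a minimal, dependency-free illustration. The use of a likelihood-ratio test is an assumption (the text does not state which test statistic was used), and the function names, the toy genotypes and the absence of any further covariates are all hypothetical.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (small systems)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def fit_logistic(X, y, iters=25):
    """Newton-Raphson logistic regression; returns (coefficients, log-likelihood).

    X: list of rows, each beginning with 1.0 for the intercept. y: 0/1 status.
    Assumes the columns of X are not collinear and the data are not separable.
    """
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for row, yi in zip(X, y):
            eta = sum(b * x for b, x in zip(beta, row))
            p = 1.0 / (1.0 + math.exp(-eta))
            for a in range(k):
                grad[a] += (yi - p) * row[a]
                for c in range(k):
                    info[a][c] += p * (1.0 - p) * row[a] * row[c]
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    loglik = 0.0
    for row, yi in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, row))
        loglik += yi * eta - math.log1p(math.exp(eta))
    return beta, loglik

def conditional_test(candidate, conditioned_on, status):
    """Likelihood-ratio test (1 df) of a candidate variant given reported variants.

    candidate: allele dosages; conditioned_on: list of dosage lists (may be empty).
    Returns (conditional odds ratio of the candidate, P value).
    """
    base = [[1.0] + [g[i] for g in conditioned_on] for i in range(len(status))]
    full = [row + [candidate[i]] for i, row in enumerate(base)]
    _, ll_base = fit_logistic(base, status)
    beta, ll_full = fit_logistic(full, status)
    lr = max(0.0, 2.0 * (ll_full - ll_base))
    # Upper tail of a chi-square with 1 degree of freedom.
    return math.exp(beta[-1]), math.erfc(math.sqrt(lr / 2.0))

# Hypothetical toy data: 80 individuals, first variant coded 0/1.
top = [0] * 40 + [1] * 40
status = [1] * 10 + [0] * 30 + [1] * 20 + [0] * 20
cand = [0, 1] * 40
print(conditional_test(top, [], status))      # first signal, unconditioned
print(conditional_test(cand, [top], status))  # candidate, conditioned on the first
```

The toy candidate is constructed to be balanced between cases and controls within each genotype class of the first variant, so its conditional odds ratio should come out close to 1. A real analysis would use dedicated software and include the covariates appropriate to the study design.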
Figure 2. Loci with multiple independent signals
Non-conditioned P values shown for loci with multiple independent signals (from Table 2). The most associated variant for a signal shown in bold colour, further variants in r²>0.90 (calculated from the 24,249 sample Immunochip dataset) shown in normal colour. First signal coloured blue, second coloured red, third coloured green. Squares indicate markers present in our previous celiac disease GWAS post quality control dataset (Illumina Hap550)[5].
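The grouping of variants by r² with the most associated variant can be sketched as below. The sketch assumes r² is taken as the squared Pearson correlation of 0/1/2 allele dosages, a common genotype-based approximation. How r² was actually computed for the figure is not stated here, and the variant names and genotypes are hypothetical.

```python
def genotype_r2(a, b):
    """Squared Pearson correlation between two equal-length allele dosage vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a monomorphic variant carries no correlation information
    return cov * cov / (var_a * var_b)

def highly_correlated(top, others, threshold=0.90):
    """Names of variants whose r² with the top variant exceeds the threshold."""
    return [name for name, g in others.items() if genotype_r2(top, g) > threshold]

# Hypothetical dosages for six individuals.
top_variant = [0, 1, 2, 1, 0, 2]
others = {
    "proxy": [0, 1, 2, 1, 0, 2],     # identical genotypes
    "flipped": [2, 1, 0, 1, 2, 0],   # same variant, opposite allele counted
    "unlinked": [1, 1, 0, 2, 1, 1],
}
print(highly_correlated(top_variant, others))
```

Because r² is a squared correlation, it does not depend on which allele is counted, which is why the allele-flipped variant is grouped with the top variant.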