Marijn C Visschedijk, Rudi Alberts, Soren Mucha, Patrick Deelen, Dirk J de Jong, Marieke Pierik, Lieke M Spekhorst, Floris Imhann, Andrea E van der Meulen-de Jong, C Janneke van der Woude, Adriaan A van Bodegraven, Bas Oldenburg, Mark Löwenberg, Gerard Dijkstra, David Ellinghaus, Stefan Schreiber, Cisca Wijmenga, Manuel A Rivas, Andre Franke, Cleo C van Diemen, Rinse K Weersma.
Abstract
Genome-wide association studies have revealed several common genetic risk variants for ulcerative colitis (UC). However, little is known about the contribution of rare, large-effect genetic variants to UC susceptibility. In this study, we performed deep targeted re-sequencing of 122 genes in Dutch UC patients to investigate the contribution of rare variants to the genetic susceptibility to UC. The gene selection consists of 111 established human UC susceptibility genes and 11 genes that lead to spontaneous colitis when knocked out in mice. In addition, we sequenced the promoter regions of 45 genes where known variants exert cis-eQTL effects. Targeted pooled re-sequencing was performed on DNA of 790 Dutch UC cases. The Genome of the Netherlands project provided sequence data of 500 healthy controls. After quality control and prioritization based on allele frequency and pathogenicity probability, follow-up genotyping of 171 rare variants was performed on 1021 Dutch UC cases and 1166 Dutch controls. Single-variant association and gene-based analyses identified an association of rare variants in the MUC2 gene with UC. The associated variants in the Dutch population could not be replicated in a German replication cohort (1026 UC cases, 3532 controls). In conclusion, this study has identified a putative role for MUC2 in UC susceptibility in the Dutch population and suggests a population-specific contribution of rare variants to UC.
Year: 2016 PMID: 27490946 PMCID: PMC4973970 DOI: 10.1371/journal.pone.0159609
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Overview of the screening and replication strategy for rare variants.
a) Phase I: targeted re-sequencing of 122 genes was performed in a pooled design of 790 Dutch UC cases. Five hundred healthy individuals sequenced by the Genome of the Netherlands Project were used as a control cohort. After quality control, 2562 high-confidence variants were further prioritized based on allele frequency and likely pathogenicity. In total, 188 SNVs were selected for replication phase 1 (Phase II), of which 171 passed the design of five Agena Bioscience iPLEX assays (http://agenabio.com). b) Phase II: genotyping of 171 variants was performed in 1021 Dutch UC cases and 1166 controls. c) Phase III: after association and gene-based analyses, genotyping of 19 variants was performed in 1026 German UC cases and 3532 healthy German controls.
Fig 2. Overview of quality control and prioritization in Phase I.
a) After pooled sequencing, a total of 7969 SNVs were detected with a coverage of >360x (12 individuals × 30x coverage). b) All variants called by two alignment strategies were included and filtered using a forward/reverse balance between 20% and 80%. c) Variants previously tested in a large IBD cohort with the Immunochip (n = 527) and silent mutations (n = 335) were excluded. d) We used different strategies to select non-synonymous (coding) SNVs, including splice sites (n = 418) (d1), and non-coding SNVs (n = 1282) (d2). d1) The coding variants were selected on the basis of allele frequency (AF): known SNVs with an AF > 0.05 were excluded. A different strategy was used for genes that are known to lead to spontaneous colitis when knocked out in mice. In this group of genes we took a more liberal approach in selecting variants for further follow-up and included common variants with predicted functional consequences for follow-up genotyping. Three hundred seventy-seven SNVs remained after this step. d2) To prioritize the non-coding SNVs in regulatory regions, we selected 48 SNVs in a transcription factor binding site (TFBS), based on ENCODE data in the UCSC browser. e) Further prioritization was based on a damaging effect predicted by PolyPhen (score between 0.8 and 1.0) and/or a damaging effect predicted by SIFT (n = 112). We included all nonsense variants (n = 6), the variants in splice sites (n = 4) and variants whose AF differed significantly from the AF in GoNL (n = 5). We also included unknown SNVs present in more than one pool (n = 13). f) In total, 140 coding and 48 non-coding rare variants remained after filtering.
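The quality-control and coding-variant prioritization steps in this caption can be sketched as a pair of predicates. This is a minimal illustration and not the authors' pipeline: the `Variant` record and all of its field names are hypothetical, the 20–80% strand balance is assumed to be inclusive, and the special cases from the caption (nonsense and splice-site variants, variants differing from GoNL, unknown SNVs seen in several pools, and the more liberal rule for the mouse knock-out genes) are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional

POOL_SIZE = 12           # individuals per pool (from the caption)
PER_SAMPLE_DEPTH = 30    # sequencing depth per individual (from the caption)
MIN_POOL_COVERAGE = POOL_SIZE * PER_SAMPLE_DEPTH  # 12 x 30 = 360x


@dataclass
class Variant:
    # All field names are hypothetical, chosen only for this illustration.
    coverage: int                # read depth at the site across the pool
    forward_fraction: float      # share of supporting reads on the forward strand
    on_immunochip: bool          # already tested on the Immunochip
    synonymous: bool             # silent mutation
    known_af: Optional[float]    # AF of a known SNV, None if the SNV is unknown
    polyphen: Optional[float]    # PolyPhen score, None if unavailable
    sift_damaging: bool          # SIFT predicts a damaging effect


def passes_qc(v: Variant) -> bool:
    """Steps a and b: pool coverage above 360x and strand balance within 20-80%."""
    return v.coverage > MIN_POOL_COVERAGE and 0.20 <= v.forward_fraction <= 0.80


def is_prioritized_coding(v: Variant) -> bool:
    """Steps c, d1 and e for coding SNVs (simplified, see assumptions above)."""
    if v.on_immunochip or v.synonymous:
        return False
    if v.known_af is not None and v.known_af > 0.05:
        return False  # known common variant
    polyphen_hit = v.polyphen is not None and 0.8 <= v.polyphen <= 1.0
    return polyphen_hit or v.sift_damaging
```

A variant would be carried forward in this sketch only if both `passes_qc` and `is_prioritized_coding` return `True`.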
Table 1. Overview of known rare IBD risk variants.
| SNV | Chr:Position (Hg19) | Gene | Amino Acid Change | cDNA Change | Rivas et al: AF cases (ICHIP) | Rivas et al: AF controls (ICHIP) | Rivas et al: P | Beaudoin et al: AF cases (ICHIP) | Beaudoin et al: AF controls (ICHIP) | Beaudoin et al: P | Prescott et al: AF cases | Prescott et al: AF controls | Prescott et al: P | Hong et al: AF cases | Hong et al: AF controls | Hong et al: P | This study: AF cases | This study: AF controls | This study: P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| rs41313262 | 1:67705900 | IL23R | p.Val362Ile | c.1084G>A | 0.0110 | 0.0152 | 1.18 x 10−5 | 0.0012 | 0.0015 | 1.2 x 10−3 | 0.0062 | 0.0139 | 0.1398 | NA | NA | NA | 0.0107 | 0.0210 | 0.0432 |
| rs76418789 | 1:67648596 | IL23R | p.Gly149Arg | c.445G>A | 0.0025 | 0.0043 | 3.20 x 10−4 | 0.0034 | 0.0044 | 0.0320 | 0.0016 | 0.0039 | 0.8800 | 0.036 | 0.068 | 1.1 x 10−8 | 0.0013 | 0.0041 | 0.0040 |
| rs11209026 | 1:67705958 | IL23R | p.Arg381Gln | c.1142G>A | NA | NA | NA | NA | NA | NA | 0.0190 | 0.0570 | 0.0006 | NA | NA | NA | 0.0468 | 0.0750 | 0.0031 |
| rs141992399 | 9:139259592 | CARD9 | NA | c.IVS11+1G>C | 0.0024 | 0.0071 | <1. x 10−16 | 0.0003 | 0.0007 | 1.5 x 10−11 | NA | NA | NA | NA | NA | NA | 0.0025 | 0.0070 | 0.1199 |
| rs200735402 | 9:139265120 | CARD9 | p.Glu221Lys | c.661G>A | NA | NA | NA | NA | NA | NA | NA | NA | NA | 0.001 | 0.011 | 0.0001 | NA | NA | NA |
| rs41316003 | 9:5126343 | JAK2 | p.Arg1063His | c.3188G>A | NA | NA | NA | 0.00034* | 0.00058* | 0.0150 | NA | NA | NA | NA | NA | NA | 0.0190 | 0.0120 | 0.2027 |
This table provides an overview of known rare IBD variants from the literature. Only genes included in our UC study design are displayed. The allele frequencies and p-values of the combined analyses of the variants in the different studies (Rivas et al(9), Beaudoin et al(7), Prescott et al(10), Hong et al(21)) and in our study (Discovery, Phase I) are shown.
a identified by Rivas et al(9).
b identified by Momozawa et al(8).
c identified by Hong et al(21), not replicated in the other populations.
d identified by Beaudoin et al(10) in the follow-up phase, but not tested for replication on the Immunochip.
SNV: single nucleotide variant; Chr: chromosome; AF: allele frequency; ICHIP: Immunochip; P: P-value; NA: not applicable.
Table 2. Predicted loss of function variants identified by pooled sequencing (Phase I), and genotyped in replication phase 1 (Phase II).
| SNV | Chr:Position (Hg19) | Gene | Amino Acid Change | cDNA Change | Exonic function | Phase I: AF cases | Phase I: AF controls (GoNL) | Phase I: P_FISHER | Phase II: AF cases | Phase II: AF controls | Phase II: P_CHISQ | Phase II: P_10,000perm |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| - | 2:25064537 | | NA | c.957-1G>T | SPLICE_SITE_ACCEPTOR | 0.0006 | NA | NA | fail QC | fail QC | fail QC | NA |
| rs150302537 | 2:28532947 | | NA | c.1089-2A>C | SPLICE_SITE_ACCEPTOR | 0.0038 | 0.0020 | 0.7186 | 0.0020 | 0.0043 | 0.1783 | 0.0880 |
| - | 1:67702486 | | NA | c.1045+1G>T | SPLICE_SITE_DONOR | 0.0006 | NA | NA | NA | 0.0004 | 0.3517 | 0.1869 |
| - | 20:62369002 | | NA | c.98+2T>C | SPLICE_SITE_DONOR | 0.0006 | NA | NA | NA | 0.0009 | 0.1745 | 0.0858 |
| rs142690032 | 3:49721812 | | p.Arg651 | c.1951C>T | STOP_GAINED | 0.0107 | 0.0080 | 0.5427 | 0.0182 | 0.0139 | 0.2502 | 0.1502 |
| rs147438510 | 7:36561695 | | p.Gly517 | c.1549G>T | STOP_GAINED | 0.0044 | 0.0040 | 1.0000 | 0.0025 | 0.0022 | 0.8398 | 0.4118 |
| - | 11:64111929 | | p.Trp639 | c.1916G>A | STOP_GAINED | 0.0006 | NA | NA | NA | NA | NA | NA |
| - | 12:12588642 | | p.Arg95 | c.283C>T | STOP_GAINED | 0.0006 | NA | NA | fail QC | fail QC | fail QC | NA |
| - | 20:62328835 | | p.Cys193 | c.579C>A | STOP_GAINED | 0.0006 | NA | NA | NA | NA | NA | NA |
| - | 22:30415593 | | p.Glu649 | c.1945G>T | STOP_GAINED | 0.0013 | NA | NA | NA | NA | NA | NA |
Pooled sequencing identified 10 predicted loss of function variants, shown in this table. The exonic function was predicted with SnpEff. Allele frequencies of the discovery (Phase I) and the replication phase 1 (Phase II) are provided.
* no carriers detected
SNV: single nucleotide variant; Chr: chromosome; GoNL: Genome of the Netherlands; P_CHISQ: p-value of the chi-squared test; P_FISHER: p-value of Fisher's exact test; P_10,000perm: p-value of 10,000 permutations; fail QC: variant failed quality control; NA: not applicable.
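For readers unfamiliar with the P_FISHER column, the following is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table of allele counts, using only the Python standard library. It assumes the test is applied to minor versus major allele counts in cases and controls; the source does not spell out how the authors set up the table or which software they used, and the function name and the counts in the usage lines are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed layout: rows are cases and controls, columns are minor and
    major allele counts. The p-value sums the probabilities of all tables
    with the same margins that are no more likely than the observed one.
    """
    row1 = a + b
    col1, col2 = a + c, b + d
    n = row1 + c + d
    lo, hi = max(0, row1 - col2), min(row1, col1)

    def weight(x: int) -> int:
        # Unnormalized hypergeometric probability, kept as an exact integer.
        return comb(col1, x) * comb(col2, row1 - x)

    observed = weight(a)
    extreme = sum(w for w in map(weight, range(lo, hi + 1)) if w <= observed)
    return extreme / comb(n, row1)


# Illustrative counts only: 17 minor alleles among 1580 case chromosomes
# versus 8 among 1000 control chromosomes (hypothetical, not from the paper).
p_value = fisher_exact_two_sided(17, 1580 - 17, 8, 1000 - 8)
print(f"two-sided Fisher p = {p_value:.4f}")
```

Working with exact integers until the final division avoids floating-point ties when deciding which tables count as at least as extreme as the observed one.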
Table 3. Significant SNVs in replication phase 1 (Phase II) and replication phase 2 (Phase III).
| SNV | Chr:Position (Hg19) | Gene | Amino Acid Change | cDNA Change | Phase I: AF cases | Phase I: AF controls | Phase I: P_FISHER | Phase II: AF cases | Phase II: AF controls | Phase II: P_CHISQ | Phase II: OR | Phase II: P_10,000perm | Phase III: AF cases | Phase III: AF controls | Phase III: P_CHISQ | Phase III: OR | Phase III: P_10,000perm | ExAC: Euro_freq |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| rs147664779 | 11:1083557 | | p.Arg743Trp | c.2227C>T | 0.0013 | 0.0000 | NA | 0.0070 | 0.0009 | 0.0009 | 8.1940 | 0.0003 | fail QC | fail QC | fail QC | NA | NA | 0.0006 |
| rs41376152 | 11:1094761 | | p.Thr1946Asn | c.5837C>A | 0.0316 | 0.0240 | 0.2783 | 0.0657 | 0.0361 | 1.15E-05 | 1.8790 | 0.0057 | 0.0274 | 0.0274 | 0.3853 | 0.8781 | 0.1887 | 0.0289 |
| rs4400498 | 9:139305007 | | NA | c.-158G>A | 0.0923 | 0.2390 | <0.0001 | 0.0093 | 0.0180 | 0.0166 | 0.5108 | 0.0065 | 0.3280 | 0.3126 | 0.1811 | 1.073 | 0.0863 | NA |
| rs2856111 | 11:1075747 | | p.Leu58Pro | c.173T>C | 0.1517 | 0.1320 | 0.1857 | 0.1445 | 0.1239 | 0.0487 | 1.1940 | 0.0321 | NA | NA | NA | NA | NA | 0.1346 |
| rs149995388 | 2:198482574 | | p.Ser334Arg | c.1000A>C | 0.0088 | 0.0070 | 0.8210 | 0.0079 | 0.0039 | 0.0790 | 2.0510 | 0.0339 | NA | NA | NA | NA | NA | 0.0047 |
| rs150660153 | 1:2535397 | | p.Glu323Gln | c.967G>C | 0.0019 | 0.0020 | 1.0000 | 0.0010 | 0.0035 | 0.0906 | 0.2850 | 0.0442 | NA | NA | NA | NA | NA | 0.0027 |
| rs41386154 | 11:1097749 | | p.Asn2277Thr | c.6830A>C | 0.0126 | 0.0070 | 0.0325 | 0.0066 | 0.0030 | 0.0842 | 2.2050 | 0.0485 | 0.0024 | 0.0024 | 0.2441 | 0.5726 | 0.13 | 0.0021 |
| rs144037797 | 11:64117106 | | p.Thr943Ile | c.2828C>T | 0.0278 | 0.0320 | 0.5513 | 0.0223 | 0.0314 | 0.0664 | 0.7041 | 0.0492 | NA | NA | NA | NA | NA | 0.0344 |
Table 3 shows all significantly associated SNVs in replication phase 1 (Phase II) and replication phase 2 (Phase III). Phase I: 790 UC cases, 500 GoNL controls; Phase II: 1021 UC cases, 1166 healthy controls; Phase III: 1026 German UC cases, 3532 healthy German controls. In addition, the allele frequencies from the ExAC database are shown. The MUC2 gene was selected because it leads to the development of spontaneous colitis in knock-out mice. For MUC2 we took a more liberal approach in selecting variants and included common variants with predicted functional consequences for follow-up genotyping.
* Follow-up genotyping of rs4400498 in the PMCA gene showed a 10-fold difference in AF between replication phase 1 (Phase II) and replication phase 2 (Phase III). This is probably due to an artefact in Phase II.
SNV: single nucleotide variant; Chr: chromosome; UC: ulcerative colitis; freq: allele frequency; GoNL: Genome of the Netherlands; P_CHISQ: p-value of the chi-squared test; OR: odds ratio; P_10,000perm: p-value of 10,000 permutations; NA: not applicable; Euro_freq: allele frequency in the European (non-Finnish) population in the ExAC database (http://exac.broadinstitute.org).
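As a reading aid for the OR column, an allelic odds ratio can be approximated from the two allele frequencies alone. The sketch below assumes the reported OR is an allelic odds ratio (minor versus major allele, cases versus controls); the function name is hypothetical. Because the table reports rounded frequencies while the authors presumably worked from raw counts, small deviations from the tabulated values are expected, especially for very rare variants.

```python
def odds_ratio_from_af(af_cases: float, af_controls: float) -> float:
    """Allelic odds ratio from minor allele frequencies in cases and controls."""
    if not (0.0 < af_cases < 1.0 and 0.0 < af_controls < 1.0):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    odds_cases = af_cases / (1.0 - af_cases)
    odds_controls = af_controls / (1.0 - af_controls)
    return odds_cases / odds_controls


# rs41376152 in Phase II: AF 0.0657 in cases and 0.0361 in controls.
# The result is about 1.88, close to the tabulated OR of 1.8790.
print(round(odds_ratio_from_af(0.0657, 0.0361), 2))
```

The same calculation for rs2856111 (0.1445 versus 0.1239) gives roughly 1.19, in line with the tabulated 1.1940.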