Julia Höglund, Nima Rafati, Mathias Rask-Andersen, Stefan Enroth, Torgny Karlsson, Weronica E Ek, Åsa Johansson.
Abstract
Genome-wide association studies (GWAS) have identified associations between thousands of common genetic variants and human traits. However, common variants usually explain a limited fraction of the heritability of a trait. A powerful resource for identifying trait-associated variants is whole genome sequencing (WGS) data in cohorts comprised of families or individuals from a limited geographical area. To evaluate the power of WGS compared to imputations, we performed GWAS on WGS data for 72 inflammatory biomarkers, in a kinship-structured cohort. When using WGS data, we identified 18 novel associations that were not detected when analyzing the same biomarkers with genotyped or imputed SNPs. Five of the novel top variants were low frequency variants with a minor allele frequency (MAF) of <5%. Our results suggest that, even when applying a GWAS approach, we gain power and precision using WGS data, presumably due to more accurate determination of genotypes. The lack of a comparable dataset for replication of our results is a limitation in our study. However, this further highlights that there is a need for more genetic epidemiological studies based on WGS data.
Year: 2019 PMID: 31727947 PMCID: PMC6856527 DOI: 10.1038/s41598-019-53111-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Results of GWAS analysis of the abundance of the 42 significant plasma proteins. Each dot represents a locus with a significant association. A non-filled dot represents an association in trans (on another chromosome than the gene encoding the biomarker) and a filled dot an association in cis. The dots are labelled with the name of the gene/locus in which the top variant is located (in italics) and the associated biomarker (in brackets). Two genes are shown if the variant is intergenic. Red depicts the centromere.
Location and annotation of significant top GWAS hits from WGS data.
| Biomarker | SNV | P-value | Effect, beta (SE) | Effect allele (ref) | MAF (effect allele) | chr:position† | Gene | Type | Location** |
|---|---|---|---|---|---|---|---|---|---|
| ADA | rs11555566 | 4.91 × 10−18 | 1.46 (0.17) | C (T) | 0.019 | 20:43255220 | | missense | |
| CASP-8 | rs116010659 | 3.623 × 10−09 | 0.46 (0.07) | T (C) | 0.165 | 2:202178477 | | intronic | |
| CCL11 | rs2228467 | 2.19 × 10−09 | 0.63 (0.11) | C (T) | 0.070 | 3:42906116 | | missense | |
| CCL19 | rs149941420 | 4.28 × 10−18 | 0.61 (0.07) | G (T) | 0.160 | 6:32556454 | | intronic | |
| CCL20 | rs17368659 | 1.40 × 10−09 | 0.42 (0.07) | G (T) | 0.160 | 11:102742761 | | intronic | |
| CCL23 | rs712048 | 1.28 × 10−12 | −0.64 (0.09) | A (C) | 0.087 | 17:34326215 | | ncRNA_intronic | |
| CCL25 | rs2032887 | 1.09 × 10−37 | 0.72 (0.06) | G (A) | 0.301 | 19:8121360 | | missense | |
| CCL4 | rs113010081 | 4.19 × 10−38 | 0.80 (0.06) | C (T) | 0.232 | 3:46457412 | | intergenic | |
| CCL4 | rs4141329* | 1.55 × 10−14 | −0.38 (0.05) | C (A) | 0.472 | 17:34490448 | | intergenic | |
| CD244 | rs71517284 | 1.16 × 10−13 | 0.41 (0.06) | C (T) | 0.378 | 1:160802681 | | intronic | |
| CD40 | rs4239702* | 1.01 × 10−49 | −0.84 (0.06) | T (C) | 0.273 | 20:44749251 | | intronic | |
| CD6 | rs11230563 | 5.23 × 10−31 | −0.79 (0.07) | T (C) | 0.168 | 11:60776209 | | missense | |
| CDCP1 | rs78521038 | 2.62 × 10−12 | −0.44 (0.06) | A (G) | 0.225 | 3:45176513 | | intronic | |
| CST-5 | rs4239743 | 8.61 × 10−21 | 0.60 (0.06) | C (A) | 0.499 | 20:23859017 | | intronic | |
| CX3CL1 | rs9921681 | 3.37 × 10−10 | 0.33 (0.05) | T (C) | 0.309 | 16:57374418 | | intergenic | |
| CXCL1 | rs3117604 | 2.46 × 10−19 | 0.50 (0.06) | C (T) | 0.331 | 4:74734668 | | upstream | |
| CXCL10 | rs11548618* | 5.07 × 10−47 | 2.11 (0.15) | A (G) | 0.035 | 4:76943947 | | missense | |
| CXCL11 | rs11548618* | 3.44 × 10−13 | −1.05 (0.14) | A (G) | 0.035 | 4:76943947 | | missense | |
| CXCL5 | rs425535* | 1.09 × 10−34 | 1.01 (0.08) | T (C) | 0.103 | 4:74863997 | | synonymous | |
| CXCL5 | rs10740118* | 6.09 × 10−27 | 0.40 (0.05) | C (G) | 0.443 | 10:65101207 | | intronic | |
| CXCL6 | rs111903579 | 6.71 × 10−58 | 0.81 (0.05) | T (C) | 0.445 | 4:74700432 | | intergenic | |
| CXCL9 | rs11548618* | 6.19 × 10−10 | −0.90 (0.15) | A (G) | 0.035 | 4:76943947 | | missense | |
| FGF-5 | rs16998073 | 1.50 × 10−11 | 0.44 (0.07) | T (A) | 0.335 | 4:81184341 | | intergenic | |
| Flt3L | rs111595024* | 1.01 × 10−16 | 1.76 (0.21) | A (G) | 0.015 | 13:28761592 | | intronic | |
| IL-10RB | rs8178528 | 5.46 × 10−35 | −0.64 (0.05) | A (G) | 0.425 | 21:34660980 | | intronic | |
| IL-12B | rs10043720 | 7.99 × 10−31 | −0.68 (0.06) | A (G) | 0.262 | 5:158767333 | | ncRNA_intronic | |
| IL-15RA | rs3136630 | 2.64 × 10−19 | −0.56 (0.06) | T (C) | 0.312 | 10:5997820 | | intronic | |
| IL-18R1 | rs10190555 | 2.37 × 10−72 | 1.08 (0.06) | A (G) | 0.233 | 2:102994056 | | intronic | |
| TGFB1 | rs1800472 | 1.35 × 10−12 | −0.88 (0.12) | A (G) | 0.040 | 19:41847860 | | missense | |
| MCP-1 | rs1800024* | 1.26 × 10−09 | 0.58 (0.10) | T (C) | 0.075 | 3:46412559 | | ncRNA_intronic | |
| MCP-2 | rs1133763 | 2.693 × 10−53 | −1.30 (0.08) | C (A) | 0.104 | 17:32647831 | | missense | |
| MCP-3 | rs11102571 | 1.34 × 10−09 | 0.50 (0.08) | C (G) | 0.112 | 1:109407135 | | intergenic | |
| MCP-4 | rs12075 | 1.25 × 10−45 | −0.72 (0.05) | G (A) | 0.474 | 1:159175354 | | missense | |
| MMP-1 | rs471994* | 5.02 × 10−19 | −0.47 (0.05) | A (G) | 0.390 | 11:102697731 | | ncRNA_intronic | |
| MMP-10 | rs17359286 | 1.17 × 10−08 | −0.51 (0.09) | T (G) | 0.081 | 11:102643718 | | synonymous | |
| SCF | rs6073958* | 1.20 × 10−09 | 0.037 (0.06) | C (T) | 0.199 | 20:44551855 | | intergenic | |
| ST1A1 | rs138534121 | 2.51 × 10−13 | 0.78 (0.11) | G (A) | 0.064 | 16:28595989 | | intronic | |
| STAMBP | 1:53206258§ | 3.32 × 10−09 | 0.92 (0.16) | T (G) | 0.026 | 1:53206258 | | intronic | |
| TNFB | rs2229092 | 2.70 × 10−29 | −1.77 (0.16) | C (A) | 0.027 | 6:31540757 | | missense | |
| TNFSF14 | rs344560 | 3.72 × 10−17 | −0.88 (0.10) | T (C) | 0.065 | 19:6665020 | | missense | |
| TRAIL | rs144242131* | 1.02 × 10−12 | 1.98 (0.28) | A (G) | 0.007 | 18:29769910 | | upstream | |
| uPA | rs346058 | 7.11 × 10−09 | −0.71 (0.12) | T (A) | 0.046 | 19:44202855 | | intergenic | |
| VEGF-A | rs6921438 | 1.63 × 10−12 | 0.35 (0.05) | G (A) | 0.434 | 6:43925607 | | intergenic | |
The raw p-values (not adjusted for multiple testing) are shown. If one biomarker had been measured twice (i.e. been measured on both INF and ONC_CVD), the SNV with the most significant p-value is presented. Novel variants are shown in bold. Additional information can be found in Supplementary Table S2.
§Does not have an rs-id.
†In hg19 coordinates.
*Variant is from ONC_CVD. Either the p-value was lower, or no significant association was found in INF.
**In cis: within 1 Mb of the gene encoding the biomarker; in trans: on a different chromosome from the gene encoding the biomarker.
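The cis/trans rule in the last footnote can be written as a small classifier. This is a minimal sketch, not the authors' code: the `Gene` record, the function name, and the gene coordinates in the usage lines are hypothetical, and the `unclassified` label for same-chromosome variants beyond 1 Mb is an assumption, since the footnote does not name that case.

```python
from dataclasses import dataclass

CIS_WINDOW_BP = 1_000_000  # "within 1 Mb" from the footnote


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; coordinates assumed to be hg19."""
    chrom: str
    start: int
    end: int


def classify_location(chrom: str, pos: int, gene: Gene,
                      window: int = CIS_WINDOW_BP) -> str:
    """Return 'cis', 'trans', or 'unclassified' for a variant relative to a gene."""
    if chrom != gene.chrom:
        return "trans"  # different chromosome from the encoding gene
    if gene.start - window <= pos <= gene.end + window:
        return "cis"  # inside the window around the gene
    # Same chromosome but outside the window: not covered by the footnote.
    return "unclassified"


# Illustrative coordinates only; not the real position of any gene.
example_gene = Gene(chrom="4", start=76_900_000, end=76_950_000)
print(classify_location("4", 76_943_947, example_gene))
print(classify_location("3", 46_457_412, example_gene))
```

Keeping the window as a parameter makes it easy to test how sensitive the cis/trans split is to the 1 Mb choice.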
Figure 2. Circular representation of the GWAS hits. The numbers in the outer circle correspond to the chromosomes. Each biomarker is labelled at the position of the gene coding it on the cytoband. The colored lines/arrows represent the significant hits. The breadth of the line represents the size of the region associated with the respective biomarker.
Location and annotation of top GWAS hits after having conditioned on the most significant hit.
| Biomarker | SNV | Conditional signal | P-value | P adj. | Effect, beta (SE) | Effect allele (ref) | MAF (effect allele) | chr:position† | Gene | Type | Location** |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CCL23 | rs72831705 | secondary | 9.52 × 10−11 | 1.07 × 10−05 | 0.44 (0.07) | T (C) | 0.153 | 17:34321277 | | ncRNA_intronic | |
| CCL23 | rs854671 | tertiary | 1.48 × 10−08 | 1.67 × 10−03 | 0.23 (0.05) | C (T) | 0.475 | 17:34361300 | | intergenic | |
| CCL4 | 3:51599851§* | secondary | 3.53 × 10−07 | 2.10 × 10−02 | 3.09 (0.61) | A (C) | 0.00098 | 3:51599851 | | intronic | |
| CCL4 | rs188700215* | secondary | 1.01 × 10−07 | 5.91 × 10−03 | −5.19 (0.97) | A (G) | 0.00098 | 17:30092085 | | intergenic | |
| CCL4 | rs201079256* | tertiary | 1.02 × 10−13 | 5.97 × 10−09 | −0.36 (0.05) | T (C) | 0.465 | 17:34522125 | | downstream | |
| CD40 | rs6063068* | secondary | 7.41 × 10−08 | 6.27 × 10−03 | 3.28 (0.61) | T (A) | 0.00098 | 20:45717496 | | intronic | |
| CD40 | rs182282247* | tertiary | 9.99 × 10−08 | 8.45 × 10−03 | 0.82 (0.15) | A (G) | 0.021 | 20:44730041 | | intergenic | |
| CST-5 | rs6138152 | secondary | 3.87 × 10−07 | 3.13 × 10−02 | 0.36 (0.07) | G (A) | 0.211 | 20:23850130 | | intergenic | |
| CST-5 | rs75823487 | tertiary | 4.56 × 10−07 | 3.69 × 10−02 | 4.25 (0.84) | T (C) | 0.0015 | 20:29478349 | | intergenic | |
| CXCL1 | rs10938101* | secondary | 7.53 × 10−07 | 6.54 × 10−03 | −0.24 (0.05) | T (G) | 0.461 | 4:74688772 | | intergenic | |
| CXCL6 | rs181216093* | secondary | 5.27 × 10−09 | 8.42 × 10−04 | −1.21 (0.21) | T (C) | 0.009 | 4:74661204 | | intergenic | |
| IL-15RA | rs144173272 | secondary | 1.49 × 10−11 | 1.86 × 10−06 | −1.70 (0.25) | T (C) | 0.013 | 10:6008255 | | missense | |
| IL-15RA | rs35095871 | tertiary | 3.06 × 10−07 | 3.81 × 10−02 | 0.41 (0.08) | G (A) | 0.102 | 10:5700416 | | intronic | |
| IL-18R1 | rs12999517 | secondary | 4.89 × 10−19 | 1.31 × 10−13 | −0.47 (0.05) | C (T) | 0.172 | 2:102959260 | | intronic | |
| MCP-2 | rs74832623 | secondary | 6.47 × 10−32 | 7.79 × 10−27 | −1.17 (0.10) | G (A) | 0.045 | 17:32535173 | | intergenic | |
| MCP-2 | rs12601658 | tertiary | 7.15 × 10−12 | 8.61 × 10−07 | −0.32 (0.05) | A (T) | 0.244 | 17:32533423 | | intergenic | |
| MMP-1 | rs470358* | secondary | 9.38 × 10−09 | 1.56 × 10−03 | 0.29 (0.05) | T (C) | 0.397 | 11:102668702 | | ncRNA_intronic | |
| SCF | rs6104417* | secondary | 2.66 × 10−07 | 2.53 × 10−02 | 0.23 (0.05) | C (T) | 0.4995 | 20:44632542 | | intergenic | |
| ST1A1 | rs4149383 | secondary | 5.38 × 10−07 | 4.56 × 10−02 | 0.54 (0.11) | A (G) | 0.061 | 16:28620320 | | UTR5 | |
| TNFB | rs746868 | secondary | 1.42 × 10−07 | 2.18 × 10−02 | 0.26 (0.05) | C (G) | 0.061 | 6:31540429 | | intronic | |
| TNFB | 6:27190519§ | tertiary | 4.06 × 10−08 | 6.22 × 10−03 | 1.77 (0.32) | T (G) | 0.005 | 6:27190519 | | intergenic | |
| TNFSF14 | rs2291668 | secondary | 4.71 × 10−07 | 7.36 × 10−03 | 0.27 (0.05) | A (G) | 0.281 | 19:6669934 | | synonymous | |
| TRAIL | 18:21026109§* | secondary | 1.01 × 10−07 | 1.28 × 10−02 | −5.19 (0.97) | A (G) | 0.00049 | 18:21026109 | | intergenic | |
The raw p-values (not adjusted for multiple testing) are shown. The adjusted p-values are based on the number of SNVs tested in each region, which means that an SNV does not need to reach genome-wide significance. If one biomarker had been measured twice (i.e. been measured on both INF and ONC_CVD), the SNV with the most significant p-value is presented. Additional information can be found in Supplementary Table S4.
§Does not have an rs-id.
†In hg19 coordinates.
*Variant is from ONC_CVD. Either the p-value was lower, or no significant association was found in INF.
**In cis: within 1 Mb of the gene encoding the biomarker; in trans: on a different chromosome from the gene encoding the biomarker.
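The table note says the adjusted p-values depend on the number of SNVs tested in each region. One common way to do that is a Bonferroni-style correction within the region; the sketch below assumes that form, which the record does not confirm. The function name and the region size in the usage line are hypothetical. For what it is worth, the ratio of P adj. to P is nearly the same for the two CCL23 rows (about 1.1 × 10^5), which is consistent with a per-region multiplier but does not prove it.

```python
def regional_adjusted_p(raw_p: float, n_snvs_in_region: int) -> float:
    """Bonferroni-style adjustment (assumed form): raw p times the
    number of SNVs tested in the region, capped at 1."""
    if not 0.0 <= raw_p <= 1.0:
        raise ValueError("raw_p must lie in [0, 1]")
    if n_snvs_in_region < 1:
        raise ValueError("n_snvs_in_region must be at least 1")
    return min(1.0, raw_p * n_snvs_in_region)


# Hypothetical region size, chosen only to show the arithmetic.
adjusted = regional_adjusted_p(1.0e-7, 50_000)
print(f"{adjusted:.2e}")
```

Under this form, a conditional hit is kept when the adjusted value falls below the chosen alpha, so the raw p-value can sit above the genome-wide threshold.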
Figure 3. Narrow-sense heritability estimates of the top variants. The total heritability estimate is shown in dark grey. The contribution of the top variant is shown in pink, the contribution of the first conditional top variant (secondary hit) in yellow and the second conditional (tertiary hit) in green. Light grey depicts biomarkers with no significant GWAS signal.
Top GWAS hits with WGS data in comparison to the significant genotyped/imputed associations identified by Ahsan et al.[25].
| Biomarker | WGS top variant | chr:pos† | MAF WGS (effect allele) | MAF imputed (effect allele) | Effect allele (ref) | Genotype quality (sd) WGS | Imputation quality for WGS variant | P present | P Ahsan | Ahsan top variant (R2§) | P top variant Ahsan |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CCL19 | rs149941420* | 6:32556454 | 0.160 | 0.125 | G (T) | 90.92 (13.34) | 0.846 | 4.269 × 10−18 | 5.429 × 10−12 | rs2395201 (0.277) | 5.951 × 10−17 |
| CCL4 | rs113010081 | 3:46457412 | 0.232 | 0.201 | C (T) | 92.44 (11.21) | 0.996 | 4.188 × 10−38 | 3.124 × 10−23 | rs113341849 (0.992) | 3.326 × 10−26 |
| CCL4 | rs4141329 | 17:34490448 | 0.472 | 0.483 | C (A) | 94.92 (10.96) | 0.712 | 1.550 × 10−14 | n.s.‡ | rs113877493 (0.095) | 9.181 × 10−10 |
| CD40 | rs4239702 | 20:44749251 | 0.273 | 0.261 | T (C) | 97.09 (6.08) | 0.996 | 1.014 × 10−49 | 3.288 × 10−18 | rs4810485 (0.911) | 4.697 × 10−19 |
| CXCL10 | rs11548618 | 4:76943947 | 0.035 | 0.035 | A (G) | 92.21 (10.60) | 1 | 5.072 × 10−47 | 2.132 × 10−37 | rs11548618 (1) | 2.132 × 10−37 |
| CXCL5 | rs425535 | 4:74863997 | 0.103 | 0.100 | T (C) | 92.63 (10.84) | 0.989 | 1.091 × 10−34 | 2.081 × 10−25 | rs425535 (1) | 2.081 × 10−25 |
| CXCL5 | rs7088799 | 10:65016174 | 0.443 | 0.446 | G (T) | 97.22 (6.00) | 0.999 | 7.357 × 10−16 | 4.598 × 10−11 | rs7896910 (0.735) | 2.932 × 10−11 |
| CXCL6 | rs111903579 | 4:74700432 | 0.446 | NA** | T (C) | 85.04 (18.31) | NA** | 6.708 × 10−58 | NA** | rs16850073 (1) | 1.976 × 10−32 |
| Flt3L | rs111595024 | 13:28768589 | 0.015 | NA** | G (A) | 50.08 (21.35) | NA** | 1.008 × 10−16 | NA** | rs145096717 (0.967) | 3.045 × 10−14 |
| MCP-1 | rs1800024 | 3:46412559 | 0.075 | 0.077 | T (C) | 93.57 (9.38) | 0.998 | 1.257 × 10−09 | n.s. ‡ | rs2888526 (0.979) | 2.399 × 10−09 |
| MMP-1 | rs471994 | 11:102697731 | 0.389 | 0.395 | A (G) | 94.33 (10.16) | 1 | 5.017 × 10−19 | 1.736 × 10−15 | rs471994 (1) | 1.736 × 10−15 |
| MMP-10 | rs17359286 | 11:102643718 | 0.081 | 0.058 | T (G) | 93.56 (9.98) | 0.999 | 1.171 × 10−08 | n.s. ‡ | rs486055 (0.583) | 9.246 × 10−10 |
| SCF | rs6073958 | 20:44551855 | 0.199 | 0.196 | C (T) | 96.37 (7.18) | 0.997 | 1.204 × 10−09 | 2.325 × 10−09 | rs6073958 (1) | 2.325 × 10−09 |
| TRAIL | rs144242131 | 18:29769910 | 0.007 | 0.007 | A (G) | 90.74 (12.23) | 0.999 | 1.020 × 10−12 | 1.387 × 10−16 | rs144242131 (1) | 1.387 × 10−16 |
Results from ONC_CVD are compared*. The p-values for the top variants from the present study are shown (P present) as well as the p-values for the same variant in the imputed data (P Ahsan for WGS top variant). The most significant SNV from the previous study and corresponding p-value is also shown, and its LD (R2) with the most significant SNV from the present study. The comparisons have been filtered the same way as the present study: only biallelic variants and variants not located in a spanning deletion are compared.
†In hg19 coordinates.
‡Not significant in Ahsan et al. (P > 4.2e-09).
§R2 with WGS top variant.
*All top variants and P-values are from the analyses of the ONC_CVD panel, except for CCL19 that is from INF.
**Did not pass imputation QC or were not present in the reference panel used for the imputations.
Top GWAS hits from WGS data in comparison to the significant genotyped/imputed associations identified by Enroth et al.[28].
| Biomarker | WGS top variant | chr:pos† | MAF WGS (effect allele) | MAF imputed (effect allele) | Effect allele (ref) | Genotype quality (sd) WGS | Imputation quality of WGS top variant | P present | P Enroth | Enroth top variant (R2§) | P top variant Enroth |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CCL19 | rs149941420 | 6:32556454 | 0.160 | 0.125 | G (T) | 90.92 (13.34) | 0.846 | 4.269 × 10−18 | n.s. ‡ | rs9968904 (0.979) | 5.744 × 10−13 |
| CCL25 | rs2032887 | 19:8121360 | 0.301 | 0.302 | G (A) | 92.07 (12.10) | 1 | 1.089 × 10−37 | 4.368 × 10−35 | rs2032887 (1) | 4.368 × 10−35 |
| CCL4 | rs113010081 | 3:46457412 | 0.232 | 0.201 | C (T) | 92.44 (11.21) | 0.996 | 4.188 × 10−38 | 7.834 × 10−24 | rs113341849 (0.992) | 7.834 × 10−24 |
| CD40 | rs1569723 | 20:44742064 | 0.256 | 0.257 | C (A) | 95.30 (8.69) | 1 | 5.242 × 10−43 | 6.608 × 10−21 | rs4810485 (0.997) | 4.960 × 10−21 |
| CD6 | rs11230563 | 11:60776209 | 0.168 | 0.164 | T (C) | 90.08 (13.40) | 1 | 5.235 × 10−31 | 1.115 × 10−18 | rs11230556 (0.729) | 9.259 × 10−21 |
| CXCL10 | rs11548618 | 4:76943947 | 0.035 | 0.035 | A (G) | 92.21 (10.60) | 1 | 5.072 × 10−47 | 5.396 × 10−37 | rs11548618 (1) | 5.396 × 10−37 |
| CXCL5 | rs352045 | 4:74864687 | 0.103 | 0.100 | T (G) | 89.77 (13.54) | 0.995 | 6.091 × 10−27 | 1.164 × 10−19 | rs2564594 (0.974) | 7.826 × 10−20 |
| CXCL5 | rs10740118 | 10:65101207 | 0.443 | 0.445 | C (G) | 96.52 (7.24) | 0.999 | 2.927 × 10−13 | 5.033 × 10−10 | rs12770839 (0.698) | 8.833 × 10−12 |
| CXCL6 | rs111903579* | 4:74700432 | 0.446 | NA** | T (C) | 85.04 (18.31) | NA** | 6.708 × 10−58 | NA** | rs6831029 (0.813) | 1.126 × 10−26 |
| Flt3L | rs145096717 | 13:28761592 | 0.015 | 0.002 | A (G) | 94.67 (8.47) | 0.668 | 2.086 × 10−16 | 3.519 × 10−14 | rs145096717 (1) | 3.519 × 10−14 |
| IL-10RB | rs8178528 | 21:34660980 | 0.425 | 0.423 | A (G) | 95.79 (8.07) | 0.968 | 5.461 × 10−35 | n.s. ‡ | rs2843697 (0.951) | 1.098 × 10−16 |
| IL-12B | rs10043720 | 5:158767333 | 0.272 | 0.269 | A (G) | 91.15 (12.75) | 0.998 | 7.988 × 10−31 | 1.424 × 10−17 | rs10076557 (1) | 4.247 × 10−18 |
| IL-15RA | rs3136630 | 10:5997820 | 0.312 | 0.312 | T (C) | 91.43 (12.75) | 1 | 2.644 × 10−19 | 1.659 × 10−11 | rs3136630 (1) | 1.659 × 10−11 |
| IL-18R1 | rs10190555 | 2:102994056 | 0.233 | 0.234 | A (G) | 94.92 (9.44) | 0.999 | 2.373 × 10−72 | 1.155 × 10−51 | rs2058660 (0.957) | 5.500 × 10−51 |
| MCP-2 | rs1133763 | 17:32647831 | 0.104 | 0.109 | C (A) | 91.07 (12.04) | 0.978 | 2.693 × 10−53 | n.s. ‡ | rs3138037 (1) | 2.113 × 10−48 |
| MCP-4 | rs12075 | 1:159175354 | 0.474 | 0.473 | G (A) | 94.55 (10.13) | 1 | 1.253 × 10−45 | 1.475 × 10−43 | rs12075 (1) | 1.475 × 10−43 |
| VEGF-A | rs6921438 | 6:43925607 | 0.434 | 0.389 | G (A) | 95.10 (10.09) | 0.770 | 8.294 × 10−40 | n.s. ‡ | rs7767396 (0.942) | 8.048 × 10−19 |
Results from INF are compared*. The p-values for the top variants from the present study are shown (P present) as well as the p-values for the same variant in the imputed data (P Enroth for WGS top variant). The most significant SNV from the previous study and corresponding p-value is also shown, and its LD (R2) with the most significant SNV from the present study. The comparisons have been filtered the same way as the present study. Only biallelic variants and variants not located in a spanning deletion are compared.
†In hg19 coordinates.
‡Not significant in Enroth et al. (P > 4.79e-9).
§R2 with WGS top variant.
*For CXCL6 the variant is from ONC_CVD since this p-value was lower.
**Did not pass imputation QC or were not present in the reference panel used for the imputations.
Overlapping top SNVs from our biomarker GWAS with WGS data and top SNVs from the eQTL analyses by Westra et al.[51].
| Gene name | Biomarker | Top SNV (biomarker GWAS) | Annotation | Top SNV (eQTL) | LD | P (biomarker GWAS) | P (eQTL) |
|---|---|---|---|---|---|---|---|
| | CD40 (INF) | rs1569723 | intergenic | rs1569723 | 1 | 5.24 × 10−43 | 1.06 × 10−28 |
| | CD40 (ONC_CVD) | rs4239702 | intronic | rs4239702 | 1 | 1.01 × 10−49 | 1.26 × 10−34 |
| | CXCL5 (INF) | rs352045 | upstream | rs352045 | 1 | 6.09 × 10−27 | 4.25 × 10−111 |
| | CXCL5 (ONC_CVD) | rs425535 | exonic | rs425535 | 1 | 1.09 × 10−34 | 4.50 × 10−111 |
| | IL-15RA | rs3136630 | intronic | rs3136630 | 1 | 2.64 × 10−19 | 5.21 × 10−6 |
| | CXCL5 (INF) | rs10740118 | intronic | rs10761779 | 0.856 | 2.93 × 10−13 | 1.82 × 10−7 |
| | CXCL5 (ONC_CVD) | rs7088799 | intronic | rs10761779 | 0.856 | 7.36 × 10−16 | 1.82 × 10−7 |
| | IL-18R1 | rs12999517† | intronic | rs12999517 | 1 | 4.89 × 10−19 | 1.13 × 10−39 |
| | TNFSF14 | rs2291668† | synonymous | rs1077667 | 0.899 | 4.71 × 10−07 | 4.36 × 10−47 |
| | CCL23 | rs854671‡ | intergenic | rs854671 | 1 | 1.48 × 10−08 | 3.21 × 10−27 |
Linkage disequilibrium (R2) is presented for biomarkers that had different top SNVs in ONC_CVD and INF, but are both in LD with a top eQTL.
†From the conditional analysis, adjusted for the top variant.
‡From the second conditional analysis, adjusted for the top primary and secondary variant.
*Top variant in cis, within 1 Mb of the gene encoding the biomarker.
**Top variant in trans, on another chromosome than the gene encoding the biomarker.
Disease-associations for the inflammatory biomarkers.
| Biomarker | Disease trait | Mapped trait | SNV (biomarker GWAS) | Annotation (biomarker GWAS) | Associated SNP (GWAS catalog) | LD§ |
|---|---|---|---|---|---|---|
| CCL19 | Asthma, Juvenile idiopathic arthritis, Rheumatoid arthritis | Asthma; systemic, polyarticular, rheumatoid factor negative, oligoarticular juvenile idiopathic arthritis; Rheumatoid arthritis | rs149941420 | intronic | rs7775228 | 0.849 |
| CCL4 | Inflammatory bowel disease, Juvenile arthritis, Ulcerative colitis | Inflammatory bowel disease; systemic, polyarticular, rheumatoid factor negative, oligoarticular juvenile idiopathic arthritis; Ulcerative colitis | rs113010081 | intergenic | rs113010081 | 1 |
| CD40 | Chronic hepatitis B infection, Chronic inflammatory diseases, Crohn’s disease, Inflammatory bowel disease, Kawasaki disease, Multiple sclerosis, Rheumatoid arthritis, Systemic lupus erythematosus | Chronic hepatitis B infection; Ankylosing spondylitis; Psoriasis; Ulcerative colitis; Sclerosing cholangitis; Crohn’s disease; Inflammatory bowel disease; Mucocutaneous lymph node syndrome; Multiple sclerosis; Rheumatoid arthritis; Systemic lupus erythematosus | | intergenic | rs1569723, rs4239702, rs1883832, rs1569723, rs6074022, rs2425752, rs4810485, rs6032662 | 1, 1, 0.914, 0.914, 0.914, 0.843, 0.906, 0.914 |
| CD6 | Chronic inflammatory diseases, Crohn’s disease, Inflammatory bowel disease, Ulcerative colitis | Ankylosing spondylitis; Psoriasis; Ulcerative colitis; Sclerosing cholangitis; Crohn’s disease; Inflammatory bowel disease | rs11230563 | missense | rs11230563 | 1 |
| CXCL5 | Ulcerative colitis | Ulcerative colitis | rs352045, rs425535 | upstream | rs2457996 | 0.944 |
| IL-12B | Ankylosing spondylitis, Chronic inflammatory diseases, Crohn’s disease | Ankylosing spondylitis; Psoriasis; Ulcerative colitis; Sclerosing cholangitis; Crohn’s disease | rs10043720 | ncRNA intronic (LOC285626) | rs6556416, rs6556411, rs10045431 | 0.993, 1, 0.884 |
| IL-18R1 | Celiac disease, Crohn’s disease, Inflammatory bowel disease, Pediatric autoimmune diseases | Celiac disease; Crohn’s disease; Inflammatory bowel disease; Autoimmune thyroid disease; Type I diabetes mellitus; Common variable immunodeficiency, Chronic childhood arthritis; Ankylosing spondylitis; Psoriasis; Ulcerative colitis; Autoimmune disease; Systemic lupus erythematosus | rs10190555 | intronic | rs13015714, rs917997, rs990171, rs2058660, rs6708413, rs2075184 | 0.991, 0.954, 0.954, 0.954, 0.954, 0.954 |
| TNFSF14 | Multiple sclerosis | Multiple sclerosis | rs2291668 (secondary) | synonymous | rs1077667 | 0.879 |
Disease-associations with inflammatory diseases in the GWAS catalog are presented. If the variant has been reported in the catalog before, it is marked in bold. The others have not been previously reported, but are in strong LD (R2 > 0.8) with variants that have. A more extensive table is provided in Supplementary Table S6.
§R2 between our SNV and the previously associated variants from the GWAS catalog.
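The inclusion rule in the note (R2 > 0.8 with a catalog variant) amounts to a simple filter. The sketch below is illustrative only: the function name is hypothetical, the first three pairs are the IL-12B values from the table, and the last pair is an invented entry added to show a variant being dropped.

```python
LD_THRESHOLD = 0.8  # "strong LD" cut-off stated in the table note


def in_strong_ld(pairs: list[tuple[str, float]],
                 threshold: float = LD_THRESHOLD) -> list[str]:
    """Keep catalog SNPs whose R2 with the study SNV exceeds the threshold."""
    return [snp for snp, r2 in pairs if r2 > threshold]


il12b_pairs = [
    ("rs6556416", 0.993),   # from the IL-12B row
    ("rs6556411", 1.0),     # from the IL-12B row
    ("rs10045431", 0.884),  # from the IL-12B row
    ("rs_hypothetical", 0.42),  # invented entry, below the cut-off
]
print(in_strong_ld(il12b_pairs))
```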
Location and annotation of novel top GWAS hits from the present WGS associations that were not reported in our previous studies with genotyped/imputed data.
| Biomarker | chr:position† | SNV | MAF WGS (effect allele) | MAF imputed (effect allele) | Genotype quality (sd) WGS | Imputation quality | P WGS | P imputed |
|---|---|---|---|---|---|---|---|---|
| ADA | 20:43255220 | rs11555566 | 0.019 | 0.019 | 86.50 (14.78) | 1 | 4.91 × 10−18 | 1.35 × 10−08 |
| CASP-8 | 2:202178477 | rs116010659 | 0.165 | 0.141 | 90.00 (12.94) | 0.968 | 3.63 × 10−09 | 7.55 × 10−03 |
| CCL11 | 3:42906116 | rs2228467 | 0.070 | 0.069 | 87.10 (14.11) | 1 | 2.19 × 10−09 | 2.93 × 10−08 |
| CCL20 | 11:102742761 | rs17368659 | 0.160 | 0.154 | 96.12 (7.33) | 0.999 | 1.40 × 10−09 | 9.42 × 10−01 |
| CCL23 | 17:34326215 | rs712048*** | 0.087 | 0.085 | 95.65 (7.46) | 0.993 | 1.28 × 10−12 | 7.90 × 10−11*** |
| CD244 | 1:160802681 | rs71517284 | 0.378 | NA** | 91.14 (13.84) | NA** | 1.16 × 10−13 | NA** |
| CDCP1 | 3:45176513 | rs78521038 | 0.225 | NA** | 94.40 (9.59) | NA** | 2.62 × 10−12 | NA** |
| CST-5 | 20:23859017 | rs4239743 | 0.499 | 0.494 | 94.61 (9.83) | 0.990 | 8.61 × 10−21 | Biomarker not analysed |
| CX3CL1 | 16:57374418 | rs9921681 | 0.309 | 0.366 | 82.93 (18.48) | 0.984 | 3.37 × 10−10 | 4.00 × 10−06 |
| CXCL1 | 4:74734668 | rs3117604 | 0.331 | 0.321 | 91.77 (12.80) | 0.989 | 2.46 × 10−19 | 3.80 × 10−01 |
| CXCL11* | 4:76943947 | rs11548618 | 0.035 | 0.034 | 92.21 (10.60) | 1 | 3.44 × 10−13 | 8.96 × 10−01 |
| CXCL9* | 4:76943947 | rs11548618 | 0.035 | 0.035 | 92.21 (10.60) | 1 | 6.19 × 10−10 | 4.33 × 10−01 |
| FGF-5 | 4:81184341 | rs16998073 | 0.335 | 0.334 | 93.58 (11.26) | 1 | 1.50 × 10−11 | 2.03 × 10−06 |
| MCP-3 | 1:109407135 | rs11102571 | 0.112 | 0.103 | 86.27 (21.81) | 0.985 | 1.34 × 10−09 | 3.029 × 10−06 |
| ST1A1 | 16:28595989 | rs138534121 | 0.064 | NA** | 83.89 (16.79) | NA** | 2.51 × 10−13 | NA** |
| STAMBP* | 1:53206258 | 1:53206258§ | 0.026 | NA** | 29.16 (20.59) | NA** | 3.32 × 10−09 | NA** |
| TGFB1 | 19:41847860 | rs1800472 | 0.040 | 0.041 | 88.09 (14.06) | 1 | 1.35 × 10−12 | 1.55 × 10−08 |
| TNFB | 6:31540757 | rs2229092*** | 0.027 | 0.027 | 76.95 (17.90) | 1 | 2.70 × 10−29 | 6.04 × 10−21*** |
| TNFSF14 | 19:6665020 | rs344560 | 0.065 | 0.066 | 91.63 (11.59) | 1 | 3.72 × 10−17 | 7.18 × 10−01 |
| uPA | 19:44202855 | rs346058 | 0.046 | 0.043 | 87.38 (14.18) | 0.955 | 7.12 × 10−09 | 2.89 × 10−03 |
MAF is shown for both the WGS and imputed data set as well as genotype quality for the WGS data and imputation quality for the imputed data. The lowest p-value is shown from the WGS study and the p-value from combined analyses*** of the INF biomarkers published by Enroth et al.[28]. The genome-wide significance threshold used in the WGS study was 1.62 × 10−8. The previous study using genotyped/imputed data used a more stringent threshold of 4.79 × 10−9, adjusting for the total number of markers analyzed rather than the total number of independent tests performed.
§Does not have an rs-id.
†In hg19 coordinates.
*Likely to be false-positive findings. STAMBP is flagged because of low genotype quality; CXCL9/CXCL11 are discussed in the Supplementary material (including Supplementary Figs S21–S25).
**Did not pass imputation QC or were not present in the reference panel used for the imputations.
***Two associations were not reported in our previous study due to the two-stage design (discovery and replication) even though the p-value was significant in the combined analyses.
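The two genome-wide thresholds quoted in the table note (1.62 × 10−8 for WGS, 4.79 × 10−9 for the genotyped/imputed study) can be compared with a short sketch. It assumes both are Bonferroni thresholds at a family-wise alpha of 0.05, which the record does not state; the function names are hypothetical. The ADA row shows why the choice matters: its imputed p-value of 1.35 × 10−8 misses the stricter threshold but would pass the WGS one.

```python
ALPHA = 0.05  # assumed family-wise error rate; not stated in the record

WGS_THRESHOLD = 1.62e-8      # from the table note
IMPUTED_THRESHOLD = 4.79e-9  # from the table note


def implied_number_of_tests(threshold: float, alpha: float = ALPHA) -> float:
    """Number of tests a Bonferroni threshold would correspond to (alpha / threshold)."""
    return alpha / threshold


def is_significant(p_value: float, threshold: float) -> bool:
    """True when the p-value is below the genome-wide threshold."""
    return p_value < threshold


ada_p_wgs, ada_p_imputed = 4.91e-18, 1.35e-8  # ADA row of the table

print(is_significant(ada_p_wgs, WGS_THRESHOLD))
print(is_significant(ada_p_imputed, IMPUTED_THRESHOLD))
print(is_significant(ada_p_imputed, WGS_THRESHOLD))
print(f"{implied_number_of_tests(WGS_THRESHOLD):.3g}")
print(f"{implied_number_of_tests(IMPUTED_THRESHOLD):.3g}")
```

Under the alpha = 0.05 assumption, the WGS threshold corresponds to roughly three million tests and the earlier threshold to roughly ten million, in line with the note's distinction between independent tests and all markers analyzed.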