Julia Höglund, Fatemeh Hadizadeh, Weronica E Ek, Torgny Karlsson, Åsa Johansson.
Abstract
Eosinophils play important roles in the release of cytokine mediators in response to inflammation. Many associations between common genetic variants and eosinophils have already been reported, using single nucleotide polymorphism (SNP) array data. Here, we have analyzed 200,000 whole-exome sequences (WES) from the UK Biobank cohort and performed gene-based analyses of eosinophil count. We defined five different variant weighting schemes to incorporate information on both deleteriousness and frequency. A total of 220 genes in 55 distinct (>10 Mb apart) genomic regions were found to be associated with eosinophil count, of which seven genes (ALOX15, CSF2RB, IL17RA, IL33, JAK2, S1PR4, and SH2B3) are driven by rare variants, independent of common variants identified in genome-wide association studies. Two additional genes, NPAT and RMI1, have not been associated with eosinophil count before and are considered novel eosinophil loci. These results increase our knowledge about the effect of rare variants on eosinophil count, which can be of great value for further identification of therapeutic targets.
Keywords: UK Biobank; association testing; eosinophils; exome sequencing; inflammation
Year: 2022 PMID: 35935937 PMCID: PMC9355086 DOI: 10.3389/fimmu.2022.862255
Source DB: PubMed Journal: Front Immunol ISSN: 1664-3224 Impact factor: 8.786
Baseline characteristics of the UKB participants with exome sequencing data available.
| Baseline characteristics | | | |
|---|---|---|---|
| Number of participants | 200,643 | | |
| Age, median (1st–3rd quartiles) | 58 (50–63) | | |
| Females / males, n (%) | 110,478 (55) / 90,154 (45) | | |
| Eosinophil count, median (1st–3rd quartiles) | 0.14 (0.10–0.21) | | |
| Asthma | 171,938 | 28,693 | 16.7 |
| Hay fever | 178,067 | 22,562 | 12.7 |
| Eczema | 191,674 | 8,955 | 4.7 |
| Type 1 diabetes | 198,780 | 1,849 | 0.9 |
| Psoriasis | 198,622 | 1,996 | 1.0 |
| Rheumatic arthritis | 196,021 | 4,608 | 2.4 |
| Crohn’s disease | 198,396 | 2,233 | 1.1 |
| Ulcerative colitis | 199,472 | 1,157 | 0.6 |
Summary of the inflammatory diseases available in the UKB for which the significant genes from our analysis were enriched among the genes associated with the respective disease.
Figure 1. Significant genes in the sequence kernel association test (SKAT) primary analyses; combined results from the five models of the SKAT gene-based analysis. Results from gene-based tests of all canonical transcripts, with chromosomal location on the x-axis and −log10(P) on the y-axis. The genome-wide significance threshold is set at 5.18 × 10⁻⁷ [−log10(P) = 6.29]. The 220 genes shown are those that had a significant association in at least one model.
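The relation between the stated threshold and its −log10 value can be checked with a few lines of Python. This is a minimal sketch: the family-wise alpha of 0.05 is taken from the Bonferroni corrections quoted elsewhere in this record, and the implied number of tests is only a back-calculation under that assumption, not a figure reported in the source.

```python
import math

# Genome-wide significance threshold quoted in the Figure 1 caption.
threshold = 5.18e-07

# Value plotted on the y-axis for that threshold.
neg_log10_threshold = -math.log10(threshold)

# Assumption: a Bonferroni correction at alpha = 0.05.
# Under that assumption the threshold implies this many tests.
alpha = 0.05
implied_tests = alpha / threshold

print(f"-log10(P) at threshold: {neg_log10_threshold:.2f}")
print(f"implied number of tests: {implied_tests:.0f}")
```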
Results for the 55 independent SKAT loci.
| Chr | Lead gene in the locus | Other genes in the locus | Lowest P-value | GWAS overlap | Significant gene(s) after GWAS adjustment | GWAS-adjusted P-value |
|---|---|---|---|---|---|---|
| 1 | | | 2.37E−08 | No | | 3.77E−07 |
| 1 | | | 1.12E−13 | Yes | | 4.13E−06, 1.36E−06 |
| 1 | | None | 1.29E−12 | No | | 9.00E−08 |
| 1 | | | 2.73E−13 | No | None in locus | 0.13 |
| 1 | | | 8.06E−10 | No | None in locus | 0.07 |
| 1 | | None | 3.66E−08 | No | | 6.31E−06 |
| 2 | | | 2.96E−177 | Yes | None | 0.02 |
| 2 | | | 4.78E−22 | No | None | 0.76 |
| 2 | | None | 4.55E−08 | Yes | None | 2.92E−04 |
| 2 | | | 1.77E−32 | Yes | None | 0.02 |
| 3 | | None | 4.36E−08 | Yes | None in locus | 2.38E−03 |
| 3 | | | 7.79E−21 | Yes | None in locus | 2.35E−04 |
| 3 | | | 1.01E−17 | Yes | None in locus | 0.016 |
| 3 | | None | 1.30E−55 | No | | 9.99E−27 |
| 3 | | None | 2.33E−07 | No | None in locus | 8.05E−05 |
| 4 | | None | 3.06E−08 | Yes | None | 0.08 |
| 5 | | None | 3.06E−07 | No | | 2.15E−06 |
| 5 | | None | 2.60E−18 | No | None in locus | 0.04 |
| 5 | | | 1.11E−131 | No | None in locus | 0.04 |
| 5 | | None | 8.85E−28 | No | None in locus | 0.26 |
| 6 | | See below | 1.55E−65 | No | None in locus | 0.02 |
| 7 | | None | 7.61E−08 | No | None in locus | 0.09 |
| 7 | | None | 8.61E−28 | No | None in locus | 0.06 |
| 7 | | None | 4.15E−11 | No | None in locus | 0.07 |
| 7 | | None | 2.17E−08 | No | | 3.54E−06 |
| 8 | | None | 4.68E−15 | No | None | 2.05E−05 |
| 8 | | None | 2.79E−21 | No | None | 0.05 |
| 9 | | | 2.40E−40 | No | | 2.14E−21, 6.21E−08 |
| 9 | | | 1.53E−13 | No | | 1.65E−07 |
| 9 | | | 5.51E−09 | No | None in locus | 1.37E−03 |
| 11 | | None | 6.03E−26 | Yes | None in locus | 0.06 |
| 11 | | | 3.30E−27 | No | | 1.09E−06 |
| 11 | | None | 2.16E−14 | No | | 1.62E−07 |
| 12 | | | 3.57E−247 | No | | 5.15E−12 |
| 12 | | None | 3.83E−25 | No | None in locus | 0.17 |
| 12 | | | 1.29E−12 | No | None in locus | 0.04 |
| 13 | | None | 3.53E−22 | No | None | 5.46E−03 |
| 14 | | None | 7.81E−09 | No | None | 1.02E−05 |
| 14 | | | 2.50E−31 | No | None | 0.08 |
| 14 | | None | 7.44E−29 | No | None | 0.06 |
| 14 | | | 4.46E−16 | Yes | None | 0.07 |
| 15 | | | 2.45E−08 | No | None in locus | 4.55E−05 |
| 15 | | | 2.33E−19 | Yes | | 1.94E−03 |
| 16 | | | 1.68E−09 | No | | 1.41E−07 |
| 16 | | | 6.44E−21 | No | None in locus | 0.58 |
| 16 | | None | 7.95E−09 | No | | 6.20E−06 |
| 16 | | | 3.66E−07 | No | | 3.16E−07 |
| 17 | | | 1.87E−71 | Yes | | 9.73E−07, 1.16E−06 |
| 17 | | | 1.57E−15 | Yes | | 1.97E−06 |
| 17 | | | 8.79E−09 | No | None in locus | 9.24E−06 |
| 18 | | | 4.22E−08 | No | | 7.60E−07 |
| 19 | | | 2.52E−32 | Yes | | 8.76E−12, 7.50E−07 |
| 19 | | None | 5.43E−09 | No | None in locus | 0.11 |
| 19 | | | 4.57E−84 | No | None in locus | 0.24 |
| 22 | | | 1.07E−26 | No | | 3.18E−10, 1.47E−06 |
Significant genes from any of the SKAT models were clustered into independent loci based on the genomic distance. This was done by an iterative procedure, where the gene with the lowest P-value for each chromosome was considered to be the lead gene for the first locus of each chromosome, and all significant genes within 10 Mb were considered to belong to the same locus. Then, a second lead gene for each chromosome was identified as the most significant of the genes not belonging to any locus, and this procedure was repeated until no additional genes remained.
Lowest P-value: the lowest P-value from the five different gene-based models for the lead gene at each locus.
None: no gene on that chromosome passed the significance threshold after adjusting for GWAS hits. None in locus: no other gene in that particular locus passed the significance threshold.
GWAS-adjusted P-value: the P-value for the most significant gene at the locus after adjustment. If no gene in the locus was significant after GWAS adjustment, the adjusted P-value for the lead gene is presented.
The HLA locus contains the following genes: TRIM31, TRIM40, TRIM15, HLA-E, ABCF1, PPP1R18, VARS2, SFTA2, MUCL3, MUC21, MUC22, C6orf15, PSORS1C1, PSORS1C2, CCHCR1, TCF19, HLA-C, MICA, MCCD1, LTA, PRRC2A, BAG6, C6orf47, GPANK1, LY6G5C, CLIC1, VWA7, VARS1, HSPA1L, SLC44A4, EHMT2, SKIV2L, TNXB, PPT2, PPT2-EGFL8, EGFL8, AGER, NOTCH4, TSBP1, BTNL2, HLA-DRA, HLA-DRB5, HLA-DRB1, HLA-DQA1, HLA-DQB1, HLA-DQA2, HLA-DQB2, PSMB8, HLA-DPA1, HLA-DPB1, TAPBP, ZBTB22, ITPR3, IP6K3, TCP11, H1-1, BTN3A2, BTN3A1, BTN2A1, BTN1A1, ZNF322, PRSS16, POM121L2, H2BC13, OR2B2, ZKSCAN4, NKAPL, ZSCAN26, PGBD1, ZSCAN31, ZKSCAN3, ZSCAN12, OR2J3, OR14J1, OR12D3, OR12D2, MOG, and HLA-G.
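The iterative locus-clustering procedure described in the table notes can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: each gene is reduced to a single base-pair position (the source does not say how gene-to-gene distance was measured), the function name `cluster_loci` is made up for this sketch, and the gene names, positions and P-values are placeholders.

```python
from collections import defaultdict

WINDOW_BP = 10_000_000  # 10 Mb, as stated in the table notes


def cluster_loci(genes, window=WINDOW_BP):
    """Greedy clustering of significant genes into loci.

    genes: iterable of (name, chromosome, position_bp, p_value).
    Assumption: one representative position per gene.
    """
    by_chrom = defaultdict(list)
    for gene in genes:
        by_chrom[gene[1]].append(gene)

    loci = []
    for chrom, members in by_chrom.items():
        # Most significant gene first.
        remaining = sorted(members, key=lambda g: g[3])
        while remaining:
            lead = remaining[0]  # lowest P-value among unassigned genes
            in_locus = [g for g in remaining if abs(g[2] - lead[2]) <= window]
            loci.append({
                "chrom": chrom,
                "lead": lead[0],
                "genes": [g[0] for g in in_locus],
            })
            # Repeat with the genes not yet assigned to any locus.
            remaining = [g for g in remaining if abs(g[2] - lead[2]) > window]
    return loci


# Placeholder data: names, positions and P-values are invented.
toy_genes = [
    ("GENE_A", "1", 1_000_000, 1e-10),
    ("GENE_B", "1", 5_000_000, 1e-08),
    ("GENE_C", "1", 50_000_000, 1e-09),
    ("GENE_D", "2", 3_000_000, 1e-07),
]
toy_loci = cluster_loci(toy_genes)
for locus in toy_loci:
    print(locus)
```

In this toy input, GENE_A leads the first locus on chromosome 1 and absorbs GENE_B, which lies within 10 Mb of it; GENE_C is too far away and becomes the lead gene of a second locus.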
Figure 2. Venn and UpSet diagrams of the overlap between the SKAT weighting schemes. The graph shows the overlap between all comparisons and the table shows the pairwise overlap between the scenarios. CADD, SKAT CADD-weighted; MAF, SKAT MAF-weighted; SKAT, SKAT unweighted; CR, CommonRare unweighted; CR.MAF, CommonRare weighted. (A) Venn diagram showing the overlap between weighting schemes, with one color per scheme. The outermost circles show the number of hits unique to each weighting scheme, and where the shapes of the schemes overlap, the number of overlapping significant genes is shown. (B) Numerical overlap in pairwise comparisons of weighting schemes. The table shows the total overlap, i.e., the summation of each pairwise overlap in (A). (C) Overlap visualized as an UpSet plot. The set size represents the total number of significant genes per weighting scheme. The intersection size shows the number of overlapping genes in each of the respective scheme combinations, as indicated by the filled dots underneath.
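The pairwise overlap counts in panel (B) amount to set intersections between the lists of significant genes per weighting scheme. A minimal sketch, assuming each scheme's hits are available as a Python set; the scheme labels come from the caption, while the gene identifiers and the helper name `pairwise_overlap` are placeholders.

```python
from itertools import combinations


def pairwise_overlap(hits):
    """Number of shared significant genes for every pair of schemes."""
    return {
        (a, b): len(hits[a] & hits[b])
        for a, b in combinations(sorted(hits), 2)
    }


# Placeholder gene identifiers; only three of the five schemes are shown.
toy_hits = {
    "CADD": {"g1", "g2", "g3"},
    "MAF": {"g2", "g3", "g4"},
    "SKAT": {"g3", "g5"},
}
toy_overlap = pairwise_overlap(toy_hits)

# Genes significant in at least one scheme (the union of all sets).
toy_any_scheme = set().union(*toy_hits.values())

print(toy_overlap)
print(len(toy_any_scheme))
```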
Figure 3. SKAT P-values from genes that are significant after adjusting for lead GWAS SNPs. Gene names are given on the y-axes and −log(P-value) on the x-axes. The different colors depict the different models used in the analyses. (A) The five weighting schemes for the 26 genes that were significant after adjusting for lead GWAS SNPs. (B) The results when only analyzing rare variants, with different rare variant cutoffs (from 0.01% to 5%) and adjusting for lead GWAS SNPs. Among all 220 genes that were identified in the primary SKAT analyses, only seven genes that were significant for any of the rare variant cutoffs are shown; the others can be found in . In all analyses, a P-value cutoff of 5.11 × 10⁻⁶ [−log10(P) = 5.29] was used, correcting for the 220 genes times 29 different models tested: seven rare variant cutoffs × two strata (full cohort and unrelated White British), five weighting schemes for the GWAS-adjusted analyses × two strata, and the five weighting schemes for the non-GWAS-adjusted (primary/discovery) analyses in the full cohort.
Meta-analysis results for the 26 genes that are still significantly associated in the primary analyses after adjusting for all lead GWAS hits (0.05/220 = 2.3 × 10⁻⁴).
| Chromosome | Gene | P (Europeans) | P (non-Europeans) | P (meta-analysis) | χ² | df | Adjusted P |
|---|---|---|---|---|---|---|---|
| 1 | | 1.98E−06 | 4.46E−01 | 1.32E−05 | 27.88 | 4 | 2.90E−03 |
| 1 | | 4.49E−07 | 6.19E−01 | 4.47E−06 | 30.19 | 4 | 9.83E−04 |
| 1 | | 3.59E−06 | 1.39E−03 | 1.00E−07 | 38.23 | 4 | 2.21E−05 |
| 1 | | 7.94E−06 | 6.19E−02 | 7.63E−06 | 29.05 | 4 | 1.68E−03 |
| 1 | | 1.02E−07 | 1.47E−01 | 2.84E−07 | 36.04 | 4 | 6.24E−05 |
| 3 | | 1.31E−26 | 3.64E−01 | 2.93E−25 | 121.22 | 4 | 6.45E−23 |
| 5 | | 1.79E−06 | 5.01E−01 | 1.34E−05 | 27.85 | 4 | 2.94E−03 |
| 7 | | 9.47E−06 | 1.12E−02 | 1.82E−06 | 32.11 | 4 | 4.00E−04 |
| 9 | | 1.45E−07 | 9.26E−03 | 2.89E−08 | 40.85 | 4 | 6.35E−06 |
| 9 | | 9.33E−22 | 2.14E−01 | 1.02E−20 | 99.93 | 4 | 2.24E−18 |
| 9 | | 6.92E−08 | 1.85E−02 | 2.74E−08 | 40.96 | 4 | 6.04E−06 |
| 11 | | 3.65E−07 | 9.16E−02 | 6.09E−07 | 34.43 | 4 | 1.34E−04 |
| 11 | | 1.65E−06 | 1.44E−01 | 3.85E−06 | 30.51 | 4 | 8.46E−04 |
| 12 | | 4.28E−12 | 1.02E−01 | 1.29E−11 | 56.91 | 4 | 2.84E−09 |
| 15 | | 2.52E−09 | 3.44E−01 | 1.90E−08 | 41.73 | 4 | 4.18E−06 |
| 16 | | 9.92E−06 | 1.86E−01 | 2.62E−05 | 26.41 | 4 | 5.76E−03 |
| 16 | | 2.63E−07 | 3.01E−02 | 1.55E−07 | 37.31 | 4 | 3.42E−05 |
| 16 | | 1.45E−07 | 3.57E−01 | 9.18E−07 | 33.56 | 4 | 2.02E−04 |
| 17 | | 1.10E−06 | 9.33E−02 | 1.75E−06 | 32.19 | 4 | 3.86E−04 |
| 17 | | 1.00E−06 | 6.74E−01 | 1.03E−05 | 28.42 | 4 | 2.26E−03 |
| 17 | | 3.26E−06 | 1.08E−01 | 5.57E−06 | 29.72 | 4 | 1.23E−03 |
| 18 | | 3.61E−07 | 1.46E−01 | 9.36E−07 | 33.52 | 4 | 2.06E−04 |
| 19 | | 5.18E−12 | 2.98E−01 | 4.36E−11 | 54.39 | 4 | 9.59E−09 |
| 19 | | 5.74E−07 | 1.82E−01 | 1.79E−06 | 32.14 | 4 | 3.94E−04 |
| 22 | | 3.09E−10 | 3.84E−01 | 2.83E−09 | 45.71 | 4 | 6.23E−07 |
| 22 | | 5.28E−06 | 3.06E−02 | 2.69E−06 | 31.28 | 4 | 5.92E−04 |
Adjusted P: Bonferroni-adjusted P-value, adjusting for the 220 significant genes tested.
P-values are shown for Europeans (N = 188,248) and non-Europeans (N = 11,372), as well as for the meta-analysis combining the two P-values.
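The table notes say only that the meta-analysis combines the P-values from the two ancestry groups. The tabulated statistics and their 4 degrees of freedom are consistent with Fisher's method, so the sketch below assumes that method; the function name `fisher_combine` is illustrative and not taken from the source. It uses the closed-form chi-square tail for even degrees of freedom, so only the standard library is needed.

```python
import math


def fisher_combine(p_values):
    """Combine independent P-values with Fisher's method.

    Returns (chi-square statistic, degrees of freedom, combined P).
    For df = 2k the chi-square survival function is
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    k = len(p_values)
    stat = -2.0 * sum(math.log(p) for p in p_values)
    half = stat / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return stat, 2 * k, math.exp(-half) * total


# First row of the table: Europeans and non-Europeans.
stat, df, p_meta = fisher_combine([1.98e-06, 4.46e-01])

# Bonferroni adjustment for the 220 significant genes tested.
p_adjusted = min(1.0, p_meta * 220)

print(f"statistic={stat:.2f}, df={df}, P={p_meta:.3g}, adjusted={p_adjusted:.3g}")
```

Applied to the first row, this reproduces the tabulated values (27.88, 4, 1.32E−05 and 2.90E−03) to within rounding, which supports the assumption but does not confirm that the authors used this exact procedure.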
Summary of the 18 genes that are significant analyzing rare variants only.
| Chromosome | Position | Gene | P-value | Frequency cutoff | Number of rare variants |
|---|---|---|---|---|---|
| 2 | 102,418,689–102,452,565 | | 2.31E−06 | 0.3% | 269 |
| 2 | 212,999,691–213,152,427 | | 1.45E−08 | 0.5% | 212 |
| 3 | 3,066,324–3,126,613 | | 7.19E−07 | 0.5% | 235 |
| 5 | 131,641,714–131,797,063 | | 4.93E−07 | 0.5% | 445 |
| 6 | 31,268,749–31,272,130 | | 4.84E−07 | 1% | 479 |
| 6 | 31,620,715–31,637,771 | | 4.14E−07 | 1% | 1,261 |
| 6 | 32,041,153–32,115,334 | | 6.99E−08 | 0.5% | 2,043 |
| 6 | 32,517,353–32,530,287 | | 9.02E−07 | 1% | 77 |
| 9 | 4,984,390–5,129,948 | | 1.77E−10 | 0.3% | 551 |
| 9 | 6,215,786–6,257,983 | | 1.17E−34 | 0.5% | 153 |
| 12 | 111,405,923–111,451,623 | | 3.35E−07 | 0.1% | 429 |
| 14 | 23,117,306–23,119,255 | | 7.81E−09 | 1% | 157 |
| 15 | 79,898,840–79,923,702 | | 1.93E−06 | 0.5% | 23 |
| 16 | 30,934,376–30,960,104 | | 3.36E−07 | 1% | 255 |
| 17 | 4,630,919–4,642,294 | | 4.22E−11 | 0.1% | 364 |
| 19 | 3,172,346–3,180,332 | | 5.12E−17 | 1% | 271 |
| 22 | 17,084,954–17,115,694 | | 6.89E−08 | 0.5% | 532 |
| 22 | 36,913,628–36,940,439 | | 6.60E−13 | 0.1% | 290 |
The P-value shown is the one obtained at the lowest frequency cutoff that yields a significant association. The allele frequency cutoff and the number of rare variants included in the test are also shown for each gene.
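The selection rule in the table note (report the P-value at the lowest frequency cutoff that is significant) can be sketched as below. The threshold of 5.11 × 10⁻⁶ is the cutoff quoted in the Figure 3 caption; the helper name, the cutoff values and the P-values in the example are placeholders rather than data from the study.

```python
def lowest_significant_cutoff(p_by_cutoff, threshold=5.11e-06):
    """Return (cutoff, P) for the lowest frequency cutoff with P < threshold.

    p_by_cutoff maps an allele-frequency cutoff (as a fraction) to the
    P-value of the rare-variant test at that cutoff.
    Returns None if no cutoff is significant.
    """
    for cutoff in sorted(p_by_cutoff):
        if p_by_cutoff[cutoff] < threshold:
            return cutoff, p_by_cutoff[cutoff]
    return None


# Placeholder results for one hypothetical gene.
toy_results = {0.001: 3e-04, 0.005: 2e-06, 0.01: 1e-08}
toy_choice = lowest_significant_cutoff(toy_results)

# A gene with no significant cutoff is not reported.
toy_none = lowest_significant_cutoff({0.001: 0.2, 0.01: 0.01})

print(toy_choice, toy_none)
```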
Figure 4. Venn diagram with circles representing four of the downstream analyses. In each section, the overlapping genes are named with their respective gene symbol. If there are no genes in an intersection, “none” is stated. GWAS adjusted: the set of genes that are still significant after adjusting for all lead GWAS hits. Passes RareOnly: the genes that are still significantly associated at any given allele frequency cutoff in the analyses including only the rare and low-frequency spectrum (<0.01% to <5%). Previously associated: the set of genes that have been previously associated with eosinophil count in the GWAS catalog. Non-GWAS-overlapping: genes located >5 Mb from a lead GWAS SNP.