Glen James, Sulev Reisberg, Kaido Lepik, Nicholas Galwey, Paul Avillach, Liis Kolberg, Reedik Mägi, Tõnu Esko, Myriam Alexander, Dawn Waterworth, A Katrina Loomis, Jaak Vilo.
Abstract
The Estonian Biobank (Biobank), governed by the Institute of Genomics at the University of Tartu, has stored genetic material/DNA and continuously collected data since 2002 on a total of 52,274 individuals, representing ~5% of the Estonian adult population, and this number is increasing. To explore the utility of data available in the Biobank, we conducted a phenome-wide association study (PheWAS) in two areas of interest to healthcare researchers: asthma and liver disease. We used 11 asthma-associated and 13 liver disease-associated single nucleotide polymorphisms (SNPs), identified from published genome-wide association studies, to test our ability to detect established associations. We confirmed 2 asthma-associated and 5 liver disease-associated variants at nominal significance, directionally consistent with published results. We found 2 associations opposite in direction to previously published results (rs4374383:AA increases risk of NASH/NAFLD, rs11597086 increases ALT level). Three SNP-diagnosis pairs passed the phenome-wide significance threshold: rs9273349 and E06 (thyroiditis, p = 5.50×10⁻⁸); rs9273349 and E10 (type-1 diabetes, p = 2.60×10⁻⁷); and rs2281135 and K76 (non-alcoholic liver diseases, including NAFLD, p = 4.10×10⁻⁷). We have validated our approach and confirmed the quality of the data for these conditions. Importantly, we demonstrate that the extensive amount of genetic and medical information from the Estonian Biobank can be successfully utilized for scientific research.
Year: 2019 PMID: 30978214 PMCID: PMC6461350 DOI: 10.1371/journal.pone.0215026
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Association between genetic variants and asthma (J45-J46) diagnosis/laboratory measurements in the Estonian Biobank.
| Gene | SNP / Proxy SNP and effect allele | Author | Study Size | Study Effect: OR (95% CI), p-value | Biobank Case/Control Size (total size for continuous variables) | Biobank Effect Size: OR (95% CI), p-value |
|---|---|---|---|---|---|---|
| | rs11071559:T | Moffat | 10,365/16,110 | OR = 0.85 (0.79–0.90), p = 7.9×10⁻⁷ | 3,424/21,020 | |
| | rs1837253:C | Moffat | 290/974 | Severe asthma OR = 0.56 in one dataset (p = 3×10⁻⁶); Asthma OR = 1.15 (1.08–1.22), p = 7.5×10⁻⁸ | 3,424/21,020 | |
| | | Astle | Total study size 173,480 (case/control size not reported) | Increased Eosinophil Count (p = 2.1×10⁻¹⁷); Decreased Neutrophil Count (p = 3.5×10⁻¹²) | 8,040 Eosinophil measurements; 8,174 Neutrophil measurements | Eosinophil/neutrophil counts increased/decreased, but non-significant |
| | rs1342326:C | Moffat | 10,365/16,110 | OR = 1.22 (1.14–1.30), p = 1.4×10⁻⁸ | 3,424/21,020 | OR = 1.08 (0.99–1.17), p = 0.07 |
| | rs7127583 / rs2846848:T | Roscioli | 401 | Reduced Eosinophil Count p = 0.002; Reduced Neutrophil Count p = 0.005 | 8,040 Eosinophil measurements; 8,174 Neutrophil measurements | |
| | rs3771166:A | Moffat | 10,365/16,110 | OR = 0.87 (0.83–0.91), p = 1.7×10⁻⁸ | 3,424/21,020 | OR = 0.97 (0.92–1.03), p = 0.34 |
| | rs9273349:C | Moffat | 10,365/16,110 | OR = 1.19 (1.13–1.25), p = 2.0×10⁻¹¹ | 3,424/21,020 | OR = 0.99 (0.94–1.04), p = 0.68 |
| | rs2305480:A | Moffat | 10,365/16,110 | OR = 0.82 (0.79–0.86), p = 3.3×10⁻¹⁶ | 3,424/21,020 | OR = 0.98 (0.93–1.03), p = 0.47 |
| | rs3894194:A | Moffat | 10,365/16,110 | OR = 1.18 (1.13–1.23), p = 2.0×10⁻¹³ | 3,424/21,020 | OR = 1.03 (0.98–1.09), p = 0.25 |
| | rs2284033:A | Moffat | 10,365/16,110 | OR = 0.90 (0.86–0.94), p = 4.8×10⁻⁶ | 3,424/21,020 | OR = 0.98 (0.93–1.03), p = 0.37 |
| | rs2073643:C | Moffat | 10,365/16,110 | OR = 0.90 (0.86–0.94), p = 6.2×10⁻⁶ | 3,424/21,020 | OR = 0.98 (0.93–1.03), p = 0.35 |
| | rs1295686:C | Moffat | 10,365/16,110 | OR = 0.87 (0.83–0.92), p = 7.9×10⁻⁷ | 3,424/21,020 | OR = 0.97 (0.92–1.03), p = 0.32 |
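
The "directionally consistent" criterion used in the abstract can be checked mechanically for the rows that report both a published and a Biobank odds ratio. The following is a minimal sketch, not the authors' code: the helper `same_direction` and the `rows` list are illustrative names, and the values are transcribed from the asthma table.

```python
# Published vs. Biobank odds ratios, transcribed from the asthma table
# (only rows where both values are printed).
rows = [
    ("rs1342326", 1.22, 1.08),
    ("rs3771166", 0.87, 0.97),
    ("rs9273349", 1.19, 0.99),
    ("rs2305480", 0.82, 0.98),
    ("rs3894194", 1.18, 1.03),
    ("rs2284033", 0.90, 0.98),
    ("rs2073643", 0.90, 0.98),
    ("rs1295686", 0.87, 0.97),
]


def same_direction(published_or, biobank_or):
    """Hypothetical helper: True when both ORs fall on the same side of 1."""
    return (published_or - 1.0) * (biobank_or - 1.0) > 0


# SNPs whose Biobank estimate points the opposite way to the published one.
discordant = [snp for snp, pub, bb in rows if not same_direction(pub, bb)]
print(discordant)
```

Direction alone says nothing about significance: none of the Biobank estimates in these rows reaches p < 0.05.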
Association between genetic variants and NASH/NAFLD diagnosis (K75.8 or K76.0)/laboratory measurements in the Estonian Biobank.
| Gene | SNP / Proxy SNP | Author | Study Size | Study Effect | Biobank Case/Control Size (total size for continuous variables) | Biobank Effect Size: OR (95% CI), p-value |
|---|---|---|---|---|---|---|
| APOC3 | rs2854117 / rs2849176:T | Petersen | 258, prevalence of NAFLD not reported | Increased prevalence of NAFLD (p<0.001) | 625/25,097 | OR = 1.00 (0.89–1.12), p = 0.99 |
| APOC3 | rs2854116:C | | | | | OR = 1.02 (0.91–1.14), p = 0.79 |
| GCKR | rs780094:T | Speliotes | 592/1,405 | Effect allele T: NAFLD OR = 1.45 (1.17–1.57), p = 2.6×10⁻⁸ | | |
| | | Yang | 436/467 | Effect allele T: NAFLD OR = 1.61 (1.14–2.27), p = 0.0072 | | |
| GCKR | rs1260326:T / rs780094:T | Petit | 201/107 | Steatosis, OR = 1.99 (1.14–3.47), p = 0.01 | | |
| MBOAT7 | rs641738:T | Mancina | 2,736, case group size not reported | NAFLD OR = 1.20 (1.05–1.37), p = 0.006 | | OR = 1.03 (0.92–1.16), p = 0.55 |
| MERTK | rs4374383:A | Patin | 57/239 | Advanced fibrosis OR = 0.18 (0.09–0.36), p = 1.1×10⁻⁹ (recessive model, AA required) | | |
| PNPLA3 | rs738409:G / rs2281135:A | Kitamoto | 540/1,012 | NAFLD OR = 2.20 (1.78–2.72), p = 4.1×10⁻¹³ | | |
| | | Speliotes | 592/1,405 | NAFLD OR = 3.26 (2.11–7.21), p = 3.6×10⁻⁴³ | | |
| PNPLA3 | rs2294918:G / rs8418:G | Donati | 142/100 | NAFLD, p = 0.0009, OR not given | | |
| PPP1R3B | rs4240624:A / rs4841132:G | Speliotes | 592/1,405 | Significant effect for computed tomography measured hepatic steatosis (p = 3.6×10⁻¹⁸); NAFLD OR = 0.93 (0.68–1.18), p = 0.285 | | OR = 0.86 (0.71–1.03), p = 0.10 |
| SOD2 | rs4880:G | Al-Serri | 179/323 | Advanced fibrosis OR = 1.56 (1.09–2.25), p = 0.014 | | OR = 1.00 (0.90–1.12), p = 0.96 |
| TM6SF2 | rs58542926:T | Bale | 256/247 | NAFLD OR = 2.7 (1.37–5.3), p = 0.0004 | | |
| | | Liu | 437/637 | Advanced fibrosis OR = 1.88 (1.41–2.5), p = 1.6×10⁻⁵ | | |
| TRIB1 | rs2954021:A / rs2980875:A | Kitamoto | 540/1,012 | NAFLD OR = 1.52 (1.23–1.88), p = 9.7×10⁻⁵ | | |
| HSD17B13 | rs6834314 / rs9992651:G | Chambers | 61,089 | Increase of ALT concentration in plasma per copy of effect allele rs6834314 A: OR = 2.6 (1.9–3.4), p = 3.1×10⁻⁹ | 9,107 ALT measurements | |
| ERLIN1 | rs2862954 / rs11597086:C | Yuan | 7,715 | rs11597086 C: Decreased ALT level (p = 1.8×10⁻⁸) | 9,107 ALT measurements | |
SNPs / Proxy SNPs of the genes of interest.
| Gene | Phenotype | SNP | Proxy | r² | Effect Allele (%) | Hardy-Weinberg Equilibrium p-value |
|---|---|---|---|---|---|---|
| | Asthma, increased eosinophil count, decreased neutrophil count | rs1837253 | NA | NA | C (71) | 0.190 |
| | Asthma, reduced eosinophil and neutrophil count | rs7127583 | rs2846848 | 0.677 | T (29) | 0.011 |
| | Asthma | rs3771166 | NA | NA | A (26) | 0.374 |
| | Asthma | rs9273349 | NA | NA | C (57) | 0.316 |
| | Asthma | rs1342326 | NA | NA | C (10) | 0.194 |
| | Asthma | rs2305480 | NA | NA | A (43) | 0.261 |
| | Asthma | rs3894194 | NA | NA | A (46) | 0.210 |
| | Asthma | rs2284033 | NA | NA | A (46) | 0.994 |
| | Asthma | rs2073643 | NA | NA | C (48) | 0.567 |
| | Asthma | rs1295686 | NA | NA | C (69) | 0.751 |
| | Asthma | rs11071559 | NA | NA | T (19) | 0.684 |
| | Hepatic Fat | rs2854117 | rs2849176 | 0.714 | T (42) | 0.948 |
| | Hepatic Fat | rs2854116 | NA | NA | C (44) | 0.939 |
| | NAFLD | rs780094 | NA | NA | T (39) | 0.306 |
| | NAFLD | rs1260326 | rs780094 | 0.933 | | |
| | NAFLD | rs641738 | NA | NA | T (42) | 0.239 |
| | HCV fibrosis progression, NAFLD fibrosisφ | rs4374383 | NA | NA | A (36) | 0.347 |
| | NAFLD / NASH | rs738409 | rs2281135 | 0.688 | A (19) | 0.362 |
| | Liver density, NAFLD | rs2294918 | rs8418 | 0.890 | G (60) | 0.388 |
| | Computed tomography measured hepatic steatosis | rs4240624 | rs4841132 | 1.000 | G (91) | 0.835 |
| | Fibrosis in NAFLD | rs4880 | NA | NA | G (55) | 0.021 |
| | NAFLD/NASH | rs58542926 | NA | NA | T (7) | 0.721 |
| | NAFLD, Lipids | rs2954021 | rs2980875 | 0.780 | A (47) | 0.959 |
| | NAFLD, increased ALT level | rs6834314 | rs9992651 | 0.823 | G (77) | 0.338 |
| | NAFLD, decreased ALT level | rs2862954 | rs11597086 | 0.669 | C (41) | 0.878 |
*Where primary SNPs of interest were not available on this array, the Broad Institute SNP Annotation and Proxy Search (SNAP) tool was used to identify appropriate proxy SNPs (SNPs that can represent the SNP of interest) with a minimum r² of 0.6. r² measures linkage disequilibrium, the non-random association between alleles at different loci, and gives an approximate indication of how reliably a proxy SNP represents the primary SNP.
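
For readers unfamiliar with the proxy criterion, the standard pairwise r² between two biallelic loci can be computed from allele and haplotype frequencies. This is a minimal sketch of the textbook formula, not a reimplementation of the SNAP tool; the function name `ld_r2` and the frequencies are hypothetical and are not Biobank data.

```python
def ld_r2(p_ab, p_a, p_b):
    """r^2 between two biallelic loci (illustrative helper).

    p_ab: frequency of the haplotype carrying allele A at locus 1 and allele B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical frequencies chosen for illustration only.
r2 = ld_r2(p_ab=0.28, p_a=0.30, p_b=0.35)
usable_as_proxy = r2 >= 0.6  # the minimum r^2 stated in the footnote
print(round(r2, 3), usable_as_proxy)
```

When two loci always travel together (`p_ab == p_a == p_b`), the formula returns 1, which corresponds to a perfect proxy.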
Biobank study attrition.
| Exclusion Criteria Applied | Number of Patients Remaining (%) | Number of Patients Removed |
|---|---|---|
| Total Database Population | 52,274 (100) | NA |
| Has Genotype Data | 32,831 (62.8) | 19,443 |
| Has EHR Linked Data | 26,808 (51.3) | 6,023 |
| Inside Study Period | 26,789 (51.2) | 19 |
| Age >18 | 26,766 (51.2) | 23 |
| Not Missing Gender | 26,766 (51.2) | 0 |
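
The percentages and removal counts in the attrition table follow directly from the remaining counts. The short sketch below reproduces them using only the numbers printed in the table; the variable names are illustrative.

```python
total = 52_274  # Total Database Population

# (criterion, number of patients remaining after applying it)
steps = [
    ("Has Genotype Data", 32_831),
    ("Has EHR Linked Data", 26_808),
    ("Inside Study Period", 26_789),
    ("Age >18", 26_766),
    ("Not Missing Gender", 26_766),
]

previous = total
derived = []
for label, remaining in steps:
    removed = previous - remaining            # dropped by this criterion
    pct = round(100 * remaining / total, 1)   # share of the full database
    derived.append((label, pct, removed))
    print(f"{label}: {remaining:,} ({pct}) removed {removed:,}")
    previous = remaining
```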
PheWAS study participant characteristics.
| Characteristic | N | % |
|---|---|---|
| Gender | ||
| Female | 19,224 | 71.8 |
| Male | 7,542 | 28.2 |
| Smoking Status | ||
| Current | 7,151 | 26.7 |
| Former | 3,721 | 13.9 |
| Never | 15,853 | 59.2 |
| Unknown | 41 | 0.2 |
| Body Mass Index | ||
| <18.5 | 467 | 1.7 |
| 18.5–25 | 11,127 | 41.6 |
| 25–30 | 8,721 | 32.6 |
| 30+ | 6,423 | 24.0 |
| Unknown | 28 | 0.1 |
| Nationality | ||
| Estonian | 20,320 | 75.9 |
| Russian | 5,307 | 19.8 |
| Other | 1,139 | 4.3 |
Results of the PheWAS using the PheWAS codes and passing the PheWAS significance threshold.
| Phenotype | SNP | Single diagnosis required, case/control size | Single diagnosis required, OR (95% CI), p-value |
|---|---|---|---|
| E06 (Thyroiditis) | rs9273349 | 2,458/20,382 | OR = 1.182 (1.113–1.255), p = 5.52×10⁻⁸ |
| E10 (Type 1 Diabetes) | rs9273349 | 719/23,268 | OR = 1.331 (1.194–1.484), p = 2.55×10⁻⁷ |
| K76 (non-alcoholic liver diseases, including NAFLD) | rs2281135 | 1,041/25,097 | OR = 1.309 (1.179–1.452), p = 4.05×10⁻⁷ |
Results of the PheWAS using exact ICD-10 diagnosis codes and passing the PheWAS significance threshold.
| Phenotype | SNP | Single diagnosis required, case/control size | Single diagnosis required, OR (95% CI), p-value |
|---|---|---|---|
| E10.7 (Insulin-dependent diabetes mellitus with multiple complications) | rs9273349 | 215/23,268 | OR = 2.021 (1.634–2.500), p = 8.6×10⁻¹¹ |
| E06.3 (Autoimmune thyroiditis) | rs9273349 | 1,986/20,382 | OR = 1.203 (1.126–1.286), p = 4.7×10⁻⁸ |
| E10.9 (Insulin-dependent diabetes mellitus without complications) | rs9273349 | 288/23,268 | OR = 1.630 (1.367–1.943), p = 5.3×10⁻⁸ |
| E10.3 (Insulin-dependent diabetes mellitus with ophthalmic complications) | rs9273349 | 106/23,268 | OR = 2.352 (1.720–3.216), p = 8.6×10⁻⁸ |
| K76.0 (Fatty (change of) liver, not elsewhere classified, including NAFLD) | rs2281135 | 605/25,097 | OR = 1.251 (1.251–1.630), p = 1.2×10⁻⁷ |
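
An odds ratio and its 95% confidence interval imply a test statistic, which gives a quick consistency check on rows like those above. The sketch below assumes a Wald interval that is symmetric on the log-odds scale; the source does not state how the intervals were computed, so this is an approximation, and the helper names `wald_from_ci` and `centred` are illustrative.

```python
import math


def wald_from_ci(or_est, lo, hi):
    """Return (z, two-sided p) implied by an OR and its 95% CI,
    assuming a Wald interval symmetric on the log-odds scale."""
    se = (math.log(hi) - math.log(lo)) / (2 * 1.959964)
    z = math.log(or_est) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


def centred(or_est, lo, hi, tol=0.01):
    """True when the OR sits at the geometric midpoint of its interval."""
    midpoint = 0.5 * (math.log(lo) + math.log(hi))
    return abs(math.log(or_est) - midpoint) < tol


z, p = wald_from_ci(1.182, 1.113, 1.255)   # E06 (Thyroiditis), rs9273349
print(f"z = {z:.2f}, p = {p:.1e}")
print(centred(1.182, 1.113, 1.255))        # E06 row
print(centred(1.251, 1.251, 1.630))        # K76.0 row as printed
```

Under this assumption the E06 row gives a p-value of the same order as the reported one. The K76.0 row as printed has an odds ratio equal to its lower confidence bound, so the check flags it; the intended value cannot be recovered from this text.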