Jyotishman Pathak, Richard C Kiefer, Suzette J Bielinski, Christopher G Chute.
Abstract
BACKGROUND: The ability to conduct genome-wide association studies (GWAS) has enabled new exploration of how genetic variations contribute to health and disease etiology. However, GWAS have historically been limited by inadequate sample sizes, owing to the cost of genotyping and phenotyping study subjects. This has prompted several academic medical centers to form "biobanks", in which biospecimens linked to personal health information, typically in electronic health records (EHRs), are collected and stored for a large number of subjects. Such biobanks provide tremendous opportunities to discover novel genotype-phenotype associations and to foster hypothesis generation.
Year: 2012 PMID: 23244446 PMCID: PMC3554594 DOI: 10.1186/2041-1480-3-10
Source DB: PubMed Journal: J Biomed Semantics
MayoGC Phase I studiesᵃ,ᵇ (used with permission from Bielinski et al [12])
| | Cases (n = 1612) | Controls (n = 1585) | Cases (n = 1233) | Controls (n = 1264) | Controls (n = 613) |
|---|---|---|---|---|---|
| Age (y), mean ± SD | 66.0 ± 10.7 | 61.0 ± 7.4 | 55.0 ± 16.2 | 56.0 ± 15.8 | 66.0 ± 10.0 |
| Female (%) | 36 | 40 | 50 | 52 | 45 |
| Medical record length (y) | | | | | |
| Mean ± SD | 23.4 ± 20.0 | 26.1 ± 20.3 | 13.7 ± 16.3 | 21.1 ± 15.4 | 30.2 ± 16.5 |
| Median (range) | 18.7 (1.0–78.6) | 23.0 (1.0–79.2) | 6.3 (1.0–71.8) | 17.8 (1.0–70.2) | 29.8 (1.0–75.0) |
| White (%) | 94 | 94 | 96 | 99 | 100 |
| Geographic location, No. (%)ᶜ | | | | | |
| Olmsted County | 328 (20) | 590 (37) | 7 (1) | 10 (1) | 64 (10) |
| Southeast Minnesota | 191 (12) | 62 (4) | 205 (17) | 378 (30) | 107 (17) |
| Greater Minnesota | 393 (24) | 343 (22) | 314 (25) | 371 (25) | 135 (22) |
| Iowa | 212 (13) | 97 (6) | 176 (14) | 191 (15) | 65 (11) |
| South and North Dakota | 50 (3) | 31 (2) | 79 (6) | 71 (6) | 19 (3) |
| Wisconsin | 128 (8) | 68 (4) | 121 (10) | 138 (11) | 32 (5) |
| Other states or international | 309 (19) | 394 (25) | 330 (27) | 159 (13) | 191 (31) |
ᵃ eMERGE = Electronic Medical Records and Genomics; GENEVA = Gene Environment Association Studies; MayoGC = Mayo Genome Consortia; PAD = peripheral arterial disease; PANC = Mayo Clinic Molecular Epidemiology of Pancreatic Cancer Study; VTE = venous thromboembolism.
ᵇ Percentages may not total 100% because of rounding.
ᶜ Southeast Minnesota includes 7 counties in the southeast corner of Minnesota: Dodge, Goodhue, Wabasha, Winona, Houston, Fillmore, and Mower. Olmsted County, Minnesota, is a mutually exclusive category.
Examples of gene loci associated with T2DM, Hypothyroidism and related traits
| Gene | Gene name | SNP | Associated trait(s) | OR (95% CI) | P value | Reference |
|---|---|---|---|---|---|---|
| PPARG | Peroxisome proliferator-activated receptor gamma | rs1801282 | T2DM | 1.14 (1.08-1.20) | 1.7 × 10⁻⁶ | Scott et al |
| KCNJ11 | Potassium inwardly rectifying channel, subfamily J, member 11 | rs5219 | T2DM | 1.14 (1.10-1.19) | 6.7 × 10⁻¹¹ | Scott et al |
| TCF7L2 | Transcription factor 7-like 2 | rs7903146 | T2DM, glucose, HbA1c | 1.37 (1.31-1.43) | 1.0 × 10⁻⁸ | Sladek et al |
| | | rs12255372 | | | | |
| SLC30A8 | Solute carrier family 30 (zinc transporter), member 8 | rs13266634 | T2DM, HbA1c | 1.12 (1.07-1.16) | 5.3 × 10⁻⁸ | Zeggini et al |
| FTO | Fat mass and obesity associated | rs8050136 | T2DM, BMI | 1.17 (1.12-1.22) | 1.3 × 10⁻¹² | Scott et al |
| FOXE1 | Forkhead box protein E1 | rs965513 | Thyroid cancer, TSH levels | 1.75 (1.49-2.01) | 1.7 × 10⁻²⁷ | Gudmundsson et al |
| FOXE1 | Forkhead box protein E1 | rs7850258 | Hypothyroidism | 0.74 (0.67-0.82) | 3.96 × 10⁻⁹ | Denny et al |
| PTPN22 | Protein tyrosine phosphatase, non-receptor type 22 | rs2476601 | Hashimoto’s thyroiditis | 1.77 (1.31-2.40) | 4.6 × 10⁻¹³ | Criswell et al |
| VAV3 | Guanine nucleotide exchange factor | rs4915077 | Hypothyroidism | 1.397 (1.27-1.54) | 8.3 × 10⁻¹¹ | Eriksson et al |
Figure 1. System architecture for representing patient electronic health records and MayoGC data using RDF.
Figure 2. Sample mapping between MCLSS and MayoGC database schemas and existing biomedical ontologies.
Figure 3. Sample Spyder relational database to RDF mapping file using R2RML.
Figure 4. Sample federated SPARQL query for MCLSS and MayoGC datasets.
Figure 5. SNP-disease associations for T2DM SNPs obtained via phenome mining: (a) SNP rs5219 within the gene KCNJ11; (b) SNP rs7903146 within the gene TCF7L2; (c) SNP rs12255372 within the gene TCF7L2; (d) SNP rs13266634 within the gene SLC30A8.
Figure 6. SNP-disease associations for hypothyroidism SNPs obtained via phenome mining: (a) SNP rs965513 within the gene FOXE1; (b) SNP rs7850258 within the gene FOXE1; (c) SNP rs2476601 within the gene PTPN22; (d) SNP rs2069561 within the gene TG.