Anurag Verma1,2, Anna O Basile2, Yuki Bradford1, Helena Kuivaniemi3,4, Gerard Tromp3,4, David Carey3, Glenn S Gerhard5, James E Crowe6, Marylyn D Ritchie1,2, Sarah A Pendergrass1.
Abstract
We performed a Phenome-Wide Association Study (PheWAS) to identify interrelationships between the immune system genetic architecture and a wide array of phenotypes from two de-identified electronic health record (EHR) biorepositories. We selected variants within genes encoding critical factors in the immune system and variants with known associations with autoimmunity. To define case/control status for EHR diagnoses, we used International Classification of Diseases, Ninth Revision (ICD-9) diagnosis codes from 3,024 Geisinger Clinic MyCode® subjects (470 diagnoses) and 2,899 Vanderbilt University Medical Center BioVU biorepository subjects (380 diagnoses). A pooled analysis was also carried out for the results that replicated across the two data sets. We identified new associations with potential biological relevance, including SNPs in tumor necrosis factor (TNF) and ankyrin-related genes associated with acute and chronic sinusitis and acute respiratory tract infection. The two most significant associations identified were for the C6orf10 SNP rs6910071 and "rheumatoid arthritis" (ICD-9 code category 714) (pMETAL = 2.58 × 10⁻⁹) and the ATN1 SNP rs2239167 and "diabetes mellitus, type 2" (ICD-9 code category 250) (pMETAL = 6.39 × 10⁻⁹). This study highlights the utility of using PheWAS in conjunction with EHRs to discover new genotypic-phenotypic associations for immune system-related genetic loci.
Year: 2016 PMID: 27508393 PMCID: PMC4980020 DOI: 10.1371/journal.pone.0160573
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Summary of Data Sets Used for the Study.
| EHR Site | Total Sample Size | % Male | Median Age (in decade) | Case Size Range | Genotyping Platform | Number of SNPs Pre-imputation | Number of SNPs Post-imputation | Number of SNPs after filtering | Number of Diagnosis Codes |
|---|---|---|---|---|---|---|---|---|---|
| Geisinger MyCode® | 3024 | 53.0 | 40 | Min = 11; Max = 1898; Median = 32 | Illumina Human OmniExpress | 729,078 | 38,054,243 | 95,448 | 477 |
| Vanderbilt BioVU | 2899 | 45.4 | 60 | Min = 11; Max = 1056; Median = 31 | Illumina 660 | 558,980 | 38,041,351 | 87,690 | 380 |
For additional information on the study design, see Figs 1 and S1.
Fig 1. Overview of PheWAS with Immune Variants.
This flow chart provides an overview of the steps taken to perform PheWAS between immune variants and ICD-9 diagnosis codes. The final testing dataset (purple) was formed by selecting SNPs from our array data that also exist on Immunochip and/or are within immune-related genes (yellow) and removing samples with missing genotypic or phenotypic data (green). Comprehensive associations were calculated between all final dataset SNPs and ICD-9 code based case/control status using logistic regression, with all models adjusted for age, sex, and the first five principal components. Replication was sought using both an exact ICD-9 code approach and an ICD-9 category code approach, following the specified criteria. Pooled analysis was performed for both approaches using METAL. See S1 Fig for the full workflow from imputation through quality control, association testing, and replication for this study.
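The pooled analysis was run with METAL. As a rough illustration of what such pooling involves, the sketch below combines per-site effect estimates with inverse-variance weights, a standard fixed-effect approach. It is not METAL itself, the source does not state which weighting scheme was used, and the function name and the input values are hypothetical.

```python
import math


def inverse_variance_meta(estimates):
    """Pool (beta, standard_error) pairs using inverse-variance weights.

    `estimates` is a list of per-site tuples. This is a generic fixed-effect
    combination, shown only as an assumption about how pooling could work.
    """
    weights = [1.0 / (se * se) for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value


# Hypothetical log-odds ratios and standard errors for one SNP-diagnosis pair
# at two sites (illustrative numbers, not taken from the study).
site_results = [(0.30, 0.10), (0.25, 0.12)]
pooled_beta, pooled_se, z, p_value = inverse_variance_meta(site_results)
print(f"beta={pooled_beta:.3f} se={pooled_se:.3f} z={z:.2f} p={p_value:.2e}")
```

The site with the smaller standard error receives more weight, so the pooled estimate sits closer to it, and the pooled standard error is smaller than either site's own.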
Immune-Related ICD-9 categories selected for further analysis.
| ICD-9 General Classification | ICD-9 Code Category (Code: Category Description) | Nearest Genes |
|---|---|---|
| Endocrine, nutritional and metabolic diseases, and immunity disorders | 250: Diabetes mellitus, Type 1 | |
| | 273: Disorders of plasma protein metabolism | |
| Diseases of the nervous system | 331: Other cerebral degenerations | |
| | 340: Multiple sclerosis | |
| | 357: Inflammatory and toxic neuropathy | |
| Diseases of the sense organs | 373: Inflammation of eyelids | |
| Diseases of the respiratory system | 461: Acute sinusitis | |
| | 465: Acute upper respiratory infections of multiple or unspecified sites | |
| | 466: Acute bronchitis and bronchiolitis | |
| | 472: Chronic pharyngitis and nasopharyngitis | |
| | 473: Chronic sinusitis | |
| | 477: Allergic rhinitis | |
| | 482: Other bacterial pneumonia | |
| | 491: Chronic bronchitis | |
| | 492: Emphysema | |
| | 493: Asthma | |
| | 515: Postinflammatory pulmonary fibrosis | |
| Diseases of the digestive system | 556: Ulcerative colitis | |
| | 571: Chronic liver disease and cirrhosis | |
| | 577: Diseases of pancreas | |
| Diseases of the genitourinary system | 584: Acute renal failure | |
| | 585: Chronic kidney disease | |
| | 586: Renal failure | |
| | 595: Cystitis | |
| Diseases of the skin and subcutaneous tissue | 692: Contact dermatitis and other eczema | |
| | 695: Erythematous conditions | |
| Diseases of the musculoskeletal system and connective tissue | 714: Rheumatoid arthritis and other inflammatory polyarthropathies | |
| | 715: Osteoarthrosis and allied disorders | |
| | 716: Other and unspecified arthropathies | |
| | 719: Other and unspecified disorders of joint | |
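Each three-digit category in the table groups more specific ICD-9 codes, which is what separates the exact-code approach from the category approach. The following is a minimal sketch of that rollup and of a case/control assignment, assuming codes are stored in dotted form (for example `714.0`). The function names, the patient records, and the `min_instances` threshold are hypothetical; the source does not give the study's actual case definition.

```python
from collections import Counter


def icd9_category(code):
    """Return the category part of a dotted ICD-9 code, e.g. '714.0' -> '714'."""
    return code.strip().split(".")[0]


def case_status(records, target, use_category=False, min_instances=1):
    """Map each patient ID to True (case) or False (control) for `target`.

    `records` maps patient IDs to lists of ICD-9 codes. With use_category=True,
    codes are rolled up to their category before matching. `min_instances` is
    a hypothetical threshold on how many matching codes make a patient a case.
    """
    status = {}
    for patient_id, codes in records.items():
        keys = [icd9_category(c) if use_category else c.strip() for c in codes]
        status[patient_id] = Counter(keys)[target] >= min_instances
    return status


# Hypothetical patient records (not study data).
records = {
    "P1": ["714.0", "250.00"],
    "P2": ["714.30"],
    "P3": ["493.90"],
}
exact = case_status(records, "714.0")
category = case_status(records, "714", use_category=True)
print(exact)
print(category)
```

In this toy data, patient P2 is a control under the exact code `714.0` but a case under category `714`, which shows how the two approaches can produce different case counts for the same diagnosis.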
Fig 2. Synthesis view plot showing PheWAS results replicating across MyCode® and BioVU that have previously reported associations.
The first track is the chromosomal location for each SNP. The next column lists the SNP identifier, the phenotype associated in our study, and the reported GWAS trait (p < 10⁻⁵). Results representing exact matches with the NHGRI-EBI GWAS catalog and GRASP are annotated with a single asterisk, and closely related traits are annotated with a double asterisk. Blue symbols represent results from MyCode®, red symbols represent results from BioVU, and green symbols represent the pooled analysis results obtained using the program METAL.
Fig 3. PheWAS View Plot of Meta-analysis Results with p < 0.01 Replicating for the Same ICD-9 Category, Meeting Autoimmune and Immune-Related Diagnosis Criteria.
The left track specifies the phenotype and the ICD-9 category code with which the SNP was associated. The next track indicates the −log10(p-value) from the meta-analysis performed on all replicating SNPs with p < 0.01. The last track indicates the SNP with the most significant p-value and the direction of effect of the association (+, positive; −, negative). The total number of associations between the SNPs and diagnoses was 409.
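The quantities shown in the plot tracks can be derived from a table of meta-analysis results. The sketch below assumes a hypothetical list of result rows with `phenotype`, `snp`, `p`, and `beta` fields. It picks the most significant SNP per phenotype, converts its p-value to −log10(p), and reports the sign of the effect. The field names and values are placeholders, not study results.

```python
import math


def top_snp_per_phenotype(results):
    """Return, per phenotype, the SNP with the smallest p-value.

    Each summary entry holds the SNP label, -log10(p), and the direction of
    effect ('+' for beta > 0, '-' otherwise).
    """
    best = {}
    for row in results:
        current = best.get(row["phenotype"])
        if current is None or row["p"] < current["p"]:
            best[row["phenotype"]] = row
    summary = {}
    for phenotype, row in best.items():
        summary[phenotype] = {
            "snp": row["snp"],
            "neg_log10_p": -math.log10(row["p"]),
            "direction": "+" if row["beta"] > 0 else "-",
        }
    return summary


# Hypothetical meta-analysis rows; SNP labels and numbers are placeholders.
results = [
    {"phenotype": "714", "snp": "snp_a", "p": 1e-4, "beta": 0.20},
    {"phenotype": "714", "snp": "snp_b", "p": 1e-6, "beta": -0.35},
    {"phenotype": "493", "snp": "snp_c", "p": 5e-3, "beta": 0.10},
]
summary = top_snp_per_phenotype(results)
for phenotype, entry in summary.items():
    print(phenotype, entry["snp"], round(entry["neg_log10_p"], 2), entry["direction"])
```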
Fig 4. Pleiotropy: SNPs Associated with more than One Phenotype and Replicating across more than One Study for the Same ICD-9 Category.
This chromosomal ideogram has lines indicating the location of the SNP, with filled colored circles indicating different ICD-9 code diagnoses associated with that particular SNP. When there are multiple pairs of the same phenotypes in the same region, this indicates regions where several SNPs in close proximity were associated with the same pairs of phenotypes.
Fig 5. Cytoscape Network Showing the Connections between Phenotypes, the Genes with SNPs, and Pathways.
In this network, green squares represent phenotypes, red triangles represent genes, and blue circles represent KEGG pathways. The colored lines highlight the links between phenotypes and pathways. For example, the gene HLA-DRA, which has SNPs associated with “714: rheumatoid arthritis” and “250: type 1 diabetes”, is present in the KEGG pathways “rheumatoid arthritis” (red line) and “type 1 diabetes” (green line), respectively. Also, the blue edge shows the connection between “714: rheumatoid arthritis”, “716: other specified arthropathies”, and the KEGG “JAK-STAT signaling pathway” through two interleukin genes, IL23R and IL6.