Haiquan Li, Jungwei Fan, Francesca Vitali, Joanne Berghout, Dillon Aberasturi, Jianrong Li, Liam Wilson, Wesley Chiu, Minsu Pumarejo, Jiali Han, Colleen Kenost, Pradeep C Koripella, Nima Pouladi, Dean Billheimer, Edward J Bedrick, Yves A Lussier.
Abstract
BACKGROUND: Forty-two percent of patients experience disease comorbidity, contributing substantially to mortality rates and increased healthcare costs. Yet the possibility of shared underlying mechanisms between diseases is not well established, and few studies have confirmed their molecular predictions with clinical datasets.
Keywords: Common diseases; Complex diseases; Disease comorbidities; Diseases; GWAS studies; Genetic network; Intergenic; Non-coding variants; RNA; SNP; eQTL
Year: 2018 PMID: 30598089 PMCID: PMC6311938 DOI: 10.1186/s12920-018-0428-9
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Fig. 1 Overview of the study. a Preprocessing. We created a controlled disease terminology across the molecular data from the GWAS Catalog and the HCUP clinical datasets (Methods- Data preprocessing to define disease bundles). We mapped GWAS diseases into disease bundles, i.e., groups of diseases, using the EMBL-EBI EFO, UMLS-CUI, and SNOMED-CT nomenclatures integrated with expert curation (Methods- Creation of the SNOMED-coded disease-bundles from GWAS terms). Similarly, we mapped HCUP diseases coded with ICD-9-CM terminology into disease bundles by using SNOMED-CT, UMLS, and expert curation (Methods- Mapping HCUP diseases to disease-bundles). b eQTL RNA overlap model. Convergence between downstream eQTL signals associated with coding and intergenic disease-associated polymorphisms is calculated for each pair of diseases (Methods- Statistical overlap). We selected significant disease pairs sharing convergent mechanisms by applying Fisher's Exact Test (FET) according to the contingency table shown in the panel. We considered disease pairs significant at FDReRNA < 0.05. c Disease comorbidity model. We computed the disease comorbidity for each disease pair by applying logistic regression (Methods- Calculation of disease comorbidity) to the clinical datasets. The effect size and significance of disease co-occurrence in clinical datasets (comorbidities) were controlled for age, gender, and race. Significant comorbid disease pairs were selected according to their FDR values (FDRcomorbidity < 0.05). d Comparative study. Finally, congruence between molecularly prioritized disease pairs and clinically prioritized comorbidities is measured by applying FET-based enrichment studies (FETfinal) (Methods- Comparative studies between eQTLs and HCUP). e Network visualization. We further investigated in detail the molecular networks of comorbid disease pairs sharing convergent genetics (eQTL RNAs) (Methods- Network visualization of the comorbidities sharing intergenic genetic risks). f Curation. For additional validation, we conducted a systematic curation of the literature using PubMed and Google Scholar for the comorbidities discovered from HCUP datasets (FDR < 0.05, OR > 3) having convergent eQTL RNAs (Methods- Curation of prioritized comorbidities).
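The overlap test in panel b can be illustrated with a minimal Python sketch. It assumes, hypothetically, that each disease is represented by the set of RNAs reached through the eQTLs of its SNPs and that the 2×2 table counts shared, disease-specific, and background RNAs; the exact table layout used in the study is defined in its Methods and may differ. The function names (`fisher_right_tail`, `overlap_pvalue`, `benjamini_hochberg`) are illustrative, and Benjamini-Hochberg is used here as one common FDR procedure, not necessarily the one the authors applied.

```python
from math import comb


def fisher_right_tail(a, b, c, d):
    """One-sided Fisher's exact test: P(X >= a) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1 = a + b          # size of the first RNA set
    col1 = a + c          # size of the second RNA set
    denom = comb(n, col1)
    upper = min(row1, col1)
    # Sum hypergeometric probabilities for overlaps at least as large as observed.
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / denom


def overlap_pvalue(rnas_a, rnas_b, background):
    """Assumed table layout: shared, A-only, B-only, and neither (within background)."""
    a = len(rnas_a & rnas_b)
    b = len(rnas_a - rnas_b)
    c = len(rnas_b - rnas_a)
    d = len(background - rnas_a - rnas_b)
    return fisher_right_tail(a, b, c, d)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In this sketch, a disease pair would be called significant when its adjusted value falls below 0.05, mirroring the FDReRNA threshold in the caption.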
Data sources
| Dataset name | Version | Downloaded | Source (URL) | Data type derived |
|---|---|---|---|---|
| NHGRI-EBI GWAS Catalog | 2016 | 07/10/2016 | | Disease-to-SNP associations derived from GWAS |
| EMBL-EBI EFO | 2016 | 07/10/2016 | | Disease branches |
| GTEx | V6 | 05/04/2017 | | SNP-to-eQTL_RNA relations derived from eQTL studies associating SNPs to the regulated targets (RNAs) |
| dbSNP | 142 | 08/25/2016 | | SNP host gene or intergenic SNPs |
| HCUP | 2013 | 09/01/2016 | | Disease-patient relations |
| SNOMED | Sep. 2015 | 11/2015 | | Disease SNOMED-CT IDs |
| UMLS | 2015AA | 07/09/2015 | | Disease UMLS IDs |
| HapMap LD | 2009 | 10/11/2010 | | Linkage disequilibrium data |
| 1000 Genomes | 2014 | 11/14/2014 | | Linkage disequilibrium r^2 |
Fig. 2 Convergent downstream genetic mechanisms predicted from shared eQTL RNA between disease-pairs are enriched among comorbidities observed in clinical datasets. Vertical axis = odds ratio of overrepresentation of shared molecular mechanisms among clinical comorbidities (Results- Convergent genetic mechanisms between disease-pairs are enriched among comorbid disease). Left bottom axis = FDR cutoffs of comorbidities found in the HCUP clinical datasets (OR > 3; Results- Prioritized comorbidities); right bottom axis = FDR cutoffs of shared molecular mechanisms discovered between two diseases.
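The odds ratio on the vertical axis can be illustrated with a small sketch. It assumes, as a simplification, that disease pairs are cross-tabulated by whether they are molecularly prioritized and whether they are clinically comorbid, over a common universe of candidate pairs. The function name `enrichment_odds_ratio` and the toy diseases A to E are hypothetical and carry no data from the study.

```python
from itertools import combinations


def enrichment_odds_ratio(all_pairs, molecular_pairs, comorbid_pairs):
    """Cross-tabulate disease pairs and return (table, odds ratio).

    table = (both, molecular only, comorbid only, neither).
    """
    both = len(molecular_pairs & comorbid_pairs)
    mol_only = len(molecular_pairs - comorbid_pairs)
    com_only = len(comorbid_pairs - molecular_pairs)
    neither = len(all_pairs - molecular_pairs - comorbid_pairs)
    table = (both, mol_only, com_only, neither)
    numerator = both * neither
    denominator = mol_only * com_only
    if denominator == 0:
        # Undefined when both products are zero, unbounded otherwise.
        return table, float("nan") if numerator == 0 else float("inf")
    return table, numerator / denominator


# Hypothetical toy universe: every unordered pair among five diseases.
diseases = ["A", "B", "C", "D", "E"]
all_pairs = {frozenset(p) for p in combinations(diseases, 2)}
molecular = {frozenset(p) for p in [("A", "B"), ("A", "C"), ("A", "D")]}
comorbid = {frozenset(p) for p in [("A", "B"), ("A", "C"), ("B", "E")]}

table, odds_ratio = enrichment_odds_ratio(all_pairs, molecular, comorbid)
```

Repeating this computation over a grid of FDR cutoffs for both pair sets would give one odds ratio per cutoff combination, which is the kind of surface the figure describes.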
Count of prioritized disease pairs by eQTL RNA overlap for each tissue
| Tissue of eQTL associations | Distinct eQTL SNPs | Distinct eQTL RNAs | Distinct SNP-RNA associations | Distinct diseases | Prioritized disease pairs (FDR < 5%) |
|---|---|---|---|---|---|
| Adipose subcutaneous | 620 | 489 | 2400 | 127 | 1581 |
| Artery aorta | 337 | 200 | 1264 | 96 | 1198 |
| Artery tibial | 492 | 365 | 1898 | 125 | 1331 |
| Blood | 270 | 117 | 1388 | 77 | 1191 |
| Brain | 188 | 50 | 554 | 67 | 448 |
| Breast mammary tissue | 238 | 83 | 987 | 80 | 970 |
| Cells transformed fibroblasts | 541 | 417 | 1485 | 123 | 799 |
| Colon transverse | 231 | 95 | 810 | 80 | 748 |
| Esophagus mucosa | 518 | 398 | 1927 | 126 | 1462 |
| Esophagus muscularis | 560 | 421 | 1856 | 135 | 1366 |
| Heart atrial appendage | 218 | 55 | 867 | 71 | 674 |
| Heart left ventricle | 378 | 199 | 1283 | 104 | 1201 |
| Lung | 154 | 138 | 416 | 68 | 45 |
| Muscle skeletal | 551 | 408 | 2240 | 130 | 1617 |
| Nerve tibial | 748 | 627 | 2497 | 134 | 1536 |
| Pancreas | 279 | 126 | 843 | 85 | 753 |
| Skin | 806 | 667 | 2669 | 155 | 1775 |
| Stomach | 202 | 64 | 779 | 68 | 676 |
| Thyroid | 857 | 793 | 2759 | 145 | 1484 |
| Total (union of sets) | 1721 | 2644 | 8033 | 188 | 2043 |
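The "Total (union of sets)" row is smaller than the sum of each column because the same SNP, RNA, or disease pair can appear in several tissues. A minimal sketch of that counting, using hypothetical toy identifiers rather than study data:

```python
# Hypothetical toy data: tissue -> set of eQTL SNP identifiers (not from the study).
snps_by_tissue = {
    "tissue_1": {"snp_a", "snp_b", "snp_c"},
    "tissue_2": {"snp_b", "snp_c", "snp_d"},
    "tissue_3": {"snp_c"},
}


def union_total(sets_by_tissue):
    """Count distinct members across all tissues (each counted once)."""
    total = set()
    for members in sets_by_tissue.values():
        total |= members
    return len(total)


per_tissue_counts = {tissue: len(s) for tissue, s in snps_by_tissue.items()}
sum_of_counts = sum(per_tissue_counts.values())  # counts shared SNPs repeatedly
distinct_total = union_total(snps_by_tissue)     # counts each SNP once
```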
Fig. 3 Network of disease-pairs prioritized as comorbid and sharing convergent genetic mechanisms through cis- and trans-eQTL associations of their coding and intergenic polymorphisms. Convergent molecular mechanisms were confirmed at FDR < 0.05 (Methods- Calculation of disease comorbidity based on HCUP). Disease comorbidities were confirmed in either of the clinical datasets NIS13 or NEDS13 at FDR < 0.05 (Panel a with OR > 3; panel b with OR > 1.5; Methods- Statistical overlap of eQTL-associated RNAs between distinct disease-associated SNPs). Known clinical syndromes with common genetic risks are recapitulated (e.g., metabolic syndrome), as well as lesser-known monogenic diseases modulated by SNPs unrelated to their monogenic cause (e.g., SNPs worsening cystic fibrosis associated by eQTL studies to those of the metabolic syndrome, for which the comorbidity is known but not the underpinning biological mechanisms). Many eQTL mechanisms relate known co-classified diseases (e.g., cancers, immune-mediated diseases); however, many cross-class pairs provide intriguing novel comorbidities linked by genetics that had eluded discovery by both clinicians and geneticists (e.g., Parkinson's disease and allergic dermatitis). In most cases, though, the comorbidity was known and explained to clinicians by non-genetic pathophysiology (e.g., duodenal cancer and pancreatic cancer), and yet this study implies that previously undiscovered genetic mechanisms further amplify these comorbid conditions in predisposed individuals. Legend. Edge widths are proportional to the number of tissues that yielded eQTL RNA associations with SNPs by eQTL analyses (19 tissues; eQTL RNA and SNPs not shown; details in Fig. 5 for two examples). Disease classifications are color-coded (e.g., autoimmune disorders in blue).
Fig. 5 Examples of comorbidities that share downstream intergenic eQTL mechanisms via their associated SNPs. Panel a: Polycystic ovary syndrome (POS) and psoriasis are observed to be comorbid in HCUP (odds ratio (OR) = 2.3) and were previously described as co-occurring [45]; however, the common genetic risk remains unreported. Here, we provide evidence that intergenic SNPs of psoriasis share eQTL associations with the intragenic SNPs of POS (LD r² = 0.05 in the CEU population). Panel b: Parkinson's disease and schizophrenia are also known to be comorbid, and here we show a novel shared mechanism among their numerous intergenic SNPs (protein-coding IGSF9B, microRNA MIR1307, and ncRNA CYP17A10-AS1). SNPs on chromosome 11 are 57 kb apart (LD r² = 0.04), and SNP rs17115100 is 20.9 kb from rs11191419 (LD r² = 0.15) and 314.8 kb from an LD SNP, rs1191580.
Fig. 4 Curation results for prioritized comorbidities with an eQTL downstream convergence. Prioritized comorbidities (green) were enriched in curation categories as compared to blind controls (grey) consisting of random disease pairs among non-prioritized ones (p = 0.001; inter-rater agreement p < 1.4 × 10⁻³). Legend: evidence for positive correlation (levels 1–2); no evidence for association or evidence for non-coexistence of diseases (levels 5 or 6); see Methods- Curation of prioritized comorbidities.