Younhee Ko¹, Minah Cho², Jin-Sung Lee¹, Jaebum Kim².
Abstract
Although multiple diseases co-occur, their common underlying molecular mechanisms remain elusive. Identifying comorbid diseases by considering the interactions between molecular components is key to understanding the underlying disease mechanisms. Here, we developed a novel approach that uses both common disease-causing genes and underlying molecular pathways to identify comorbid diseases. Our approach enables the analysis of common pathologies shared by comorbid diseases through molecular interaction networks. We found that integrating direct genetic sharing with indirect, high-level molecular associations showed significantly strong consistency with known comorbid diseases. In addition, neoplasm-related diseases showed high comorbidity patterns among themselves as well as with other diseases, indicating severe complications. This study demonstrated that molecular pathway information can be used to discover disease comorbidity and hidden biological mechanisms, to understand pathogenesis, and to provide new insight into disease pathology.
Year: 2016 PMID: 27991583 PMCID: PMC5172201 DOI: 10.1038/srep39433
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Statistics of four integrated disease databases (i.e., OMIM, DO, HPO, and GAD) and the overall schema of three representative quantities to identify disease comorbidity.
(A) Disease overlap among four disease databases. The number in parentheses represents the total number of genes in each database. (B) Disease gene coverage of the integrated disease database in comparison with the STRING network. The x-axis represents the proportion of a disease's associated genes that are also present in the STRING network. The y-axis indicates the fraction of diseases. More than 95% of diseases have over 80% of their associated genes covered by STRING. (C) Two different strategies to represent the degree of comorbidity between diseases A and B. “Direct gene overlap” considers the overlap between the associated genes of the two diseases, whereas “Function network structure” considers the number of direct as well as indirect interactions between the associated genes of the two diseases in a function network. The “Function network structure” strategy uses the disease-associated genes as well as the neighboring genes connected to them to explain disease comorbidity. In our study, the STRING interaction database was used to identify the functional interactions.
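The quantities described in panels (B) and (C) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the placeholder gene identifiers (`G1`, `G2`, ...), and the toy edge list are hypothetical, and counting only direct links is a simplification that does not reproduce the XD score, whose definition is not given in this record.

```python
def shared_gene_count(genes_a, genes_b):
    """Direct gene overlap: number of genes associated with both diseases."""
    return len(set(genes_a) & set(genes_b))


def network_coverage(disease_genes, network_genes):
    """Proportion of a disease's genes that are present in the network."""
    genes = set(disease_genes)
    if not genes:
        return 0.0
    return len(genes & set(network_genes)) / len(genes)


def cross_disease_links(genes_a, genes_b, edges):
    """Count network edges joining a gene of disease A to a gene of disease B."""
    a, b = set(genes_a), set(genes_b)
    return sum(
        1 for u, v in edges
        if (u in a and v in b) or (u in b and v in a)
    )


# Toy data with placeholder gene names (not real disease associations).
disease_a = {"G1", "G2", "G3"}
disease_b = {"G2", "G4"}
edges = [("G1", "G4"), ("G3", "G4"), ("G2", "G5")]
network_genes = {gene for edge in edges for gene in edge}

print(shared_gene_count(disease_a, disease_b))
print(network_coverage({"G1", "G9"}, network_genes))
print(cross_disease_links(disease_a, disease_b, edges))
```

In this toy example the two diseases share one gene and are joined by two direct links, so a network-based view can register an association that gene overlap alone would understate.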
Figure 2. Statistics of comorbidity measures for disease pairs in the US Medicare data.
(A) The distribution of the log-scaled XD scores. (B) The distribution of the positive XD scores. (C) The numbers of disease pairs chosen by different quantities (+NG: disease pairs having at least one common gene; +XD: disease pairs having positive XD scores). (D) The averages and standard errors of RR scores of disease pairs chosen by different quantities. (E) The averages and standard errors of PHI scores of disease pairs chosen by different quantities. (+XD and +NG: disease pairs having both positive XD scores and at least one common gene; +NG not +XD: disease pairs having at least one common gene but without positive XD scores; +XD not +NG: disease pairs having positive XD scores but without sharing genes; not +XD not +NG: disease pairs having negative or zero XD scores without sharing any gene.)
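The four mutually exclusive groups used in panels (D) and (E) follow directly from the sign of the XD score and the number of shared genes. The sketch below shows one way to assign a pair to a group and to summarize a score per group; the function names and the toy values are hypothetical, and the standard error is assumed to be the sample standard deviation divided by the square root of the group size.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def pair_category(xd_score, n_shared_genes):
    """Assign a disease pair to one of the four groups in the caption."""
    positive_xd = xd_score > 0
    shares_gene = n_shared_genes >= 1
    if positive_xd and shares_gene:
        return "+XD and +NG"
    if shares_gene:
        return "+NG not +XD"
    if positive_xd:
        return "+XD not +NG"
    return "not +XD not +NG"


def mean_and_sem(values):
    """Mean and standard error of the mean (NaN for a single value)."""
    n = len(values)
    sem = stdev(values) / sqrt(n) if n > 1 else float("nan")
    return mean(values), sem


# Toy disease pairs: (XD score, number of shared genes, RR score).
pairs = [
    (0.8, 2, 4.0),
    (0.3, 1, 2.0),
    (-0.1, 1, 1.2),
    (0.5, 0, 1.5),
    (0.0, 0, 0.9),
]

rr_by_group = defaultdict(list)
for xd, ng, rr in pairs:
    rr_by_group[pair_category(xd, ng)].append(rr)

for group, values in rr_by_group.items():
    print(group, mean_and_sem(values))
```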
Correlation between different measures of disease comorbidity.
| Criteria for disease pairs | XD vs. RR | XD vs. PHI | NG vs. RR | NG vs. PHI |
|---|---|---|---|---|
| +XD and +NG | 0.2640 (6.59 × 10⁻⁴) | 0.1377 (2.687 × 10⁻³) | −0.0065 (0.3789) | 0.0251 (0.1329) |
| +XD | 0.2592 (5.20 × 10⁻⁴) | 0.1432 (2.29 × 10⁻³) | −0.0047 (0.2466) | 0.0344 (0.0474) |
| +NG | 0.2407 (3.26 × 10⁻⁴) | 0.1267 (1.184 × 10⁻³) | −0.0023 (0.1753) | 0.0360 (0.0338) |
| +NG not +XD | 0.0059 (0.0764) | −0.0209 (0.9999) | 0.0096 (0.0892) | 0.0288 (0.0357) |
| +XD not +NG | 0.0886 (0.0132) | 0.0723 (0.0305) | NA | NA |
| not +XD not +NG | −0.0029 (0.8218) | 0.0040 (0.1130) | NA | NA |
| ALL | 0.1557 (6.67 × 10⁻⁶) | 0.0759 (0) | 0.0013 (0.0521) | 0.0254 (0.0016) |
Numbers represent Pearson’s correlation coefficients for XD scores against RR/PHI scores, or for NG values against RR/PHI scores, calculated from the sets of disease pairs defined by the categories in the first column; p-values in parentheses are from permutation tests.
+XD: disease pairs having positive XD scores.
+NG: disease pairs having at least one common gene.
+XD and +NG: disease pairs having both a positive XD score and at least one common gene.
+NG not +XD: disease pairs having at least one common gene but without positive XD scores.
+XD not +NG: disease pairs having positive XD scores but without common disease genes.
not +XD not +NG: disease pairs having negative or zero XD scores and without common disease genes.
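A Pearson correlation with a permutation-test p-value, as reported in the table, can be sketched as follows. The record does not state the number of permutations or whether the test is one- or two-sided, so the one-sided upper-tail test, the default of 10,000 shuffles, the fixed seed, and all function names below are assumptions made for illustration.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def permutation_p_value(x, y, n_perm=10000, seed=0):
    """Observed r and the fraction of shuffles of y whose r is >= observed."""
    rng = random.Random(seed)
    observed = pearson_r(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the pairing between x and y
        if pearson_r(x, shuffled) >= observed:
            hits += 1
    return observed, hits / n_perm


# Toy scores for five disease pairs (hypothetical values).
xd_scores = [0.1, 0.4, 0.5, 0.9, 1.3]
rr_scores = [1.0, 1.4, 1.3, 2.2, 3.0]
r, p = permutation_p_value(xd_scores, rr_scores, n_perm=2000)
print(r, p)
```

With this estimator the p-value is exactly 0 whenever no shuffle reaches the observed coefficient; adding 1 to both the hit count and the number of permutations is a common alternative that avoids reporting zero.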
Figure 3. Construction of the predicted disease network based on XD score.
(A) The top 10% of disease pairs having the highest XD score. The color of nodes indicates a disease category based on ICD-9 classification (Supplementary Table 3). (B) Average degree (number of links with other diseases) of diseases in the disease category. The average degree of all diseases in the disease network is 12.543 (marked as a black bar). Musculoskeletal system had the highest average degree, indicating that musculoskeletal system related diseases often accompany other diseases as complications or are frequently accompanied by other diseases as complications. (C) Top-ranked disease category pairs based on the normalized number of links between disease categories (including links between different disease category pairs as well as links within one disease category). Note that diseases in the neoplasm category have the highest intra-comorbidity patterns. (D) Illustration of distinct comorbidity patterns around the neoplasm category. The edge thickness represents the relative degree of the XD score between two disease categories.
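The network construction in panels (A) and (B) can be sketched as selecting the highest-scoring pairs, treating each as an edge, and averaging node degree per category. The function names, disease labels, scores, and category assignments below are hypothetical, and the sketch assumes that diseases with no retained edge are left out of the averages, which the caption does not specify.

```python
from collections import defaultdict


def top_fraction(pair_scores, fraction=0.10):
    """Return the disease pairs with the highest scores (at least one pair)."""
    ranked = sorted(pair_scores.items(), key=lambda kv: kv[1], reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return [pair for pair, _ in ranked[:keep]]


def degrees(edges):
    """Number of links per disease in an undirected edge list."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)


def average_degree_by_category(degree, category_of):
    """Mean degree of the diseases in each category."""
    grouped = defaultdict(list)
    for disease, d in degree.items():
        grouped[category_of[disease]].append(d)
    return {cat: sum(ds) / len(ds) for cat, ds in grouped.items()}


# Toy scores for four disease pairs and toy category labels.
pair_scores = {
    ("A", "B"): 0.9,
    ("A", "C"): 0.8,
    ("B", "C"): 0.1,
    ("C", "D"): -0.2,
}
category_of = {"A": "neoplasm", "B": "neoplasm", "C": "circulatory", "D": "digestive"}

edges = top_fraction(pair_scores, fraction=0.5)
degree = degrees(edges)
print(average_degree_by_category(degree, category_of))
```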