Rong Xu, Li Li, Quanqiu Wang.
Abstract
BACKGROUND: Discerning the genetic contributions to complex human diseases is a challenging mandate that demands new types of data and calls for new avenues for advancing the state of the art in computational approaches to uncovering disease etiology. Systems approaches to studying observable phenotypic relationships among diseases are emerging as an active area of research for both novel disease gene discovery and drug repositioning. Currently, systematic study of disease relationships on a phenome-wide scale is limited by the lack of large-scale, machine-understandable knowledge bases of disease phenotype relationships. Our study introduces a semi-supervised iterative pattern-learning approach and uses it to build a precise, large-scale disease-disease risk relationship (D1 → D2) knowledge base (dRiskKB) from a vast corpus of free-text published biomedical literature.
Year: 2014 PMID: 24725842 PMCID: PMC3998061 DOI: 10.1186/1471-2105-15-105
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. The semi-supervised pattern-learning approach for extracting disease-disease risk pairs from MEDLINE.
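The iterative pattern-learning idea named in the abstract and in Figure 1 can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy sentences and lexicon stand in for MEDLINE and a disease vocabulary, the function names `extract_triples` and `bootstrap` are hypothetical, and the scoring rule (count of already-known pairs a pattern matches) and the automatic acceptance of top-ranked patterns are simplifications, not the authors' ranking or selection procedure.

```python
import re
from collections import Counter

# Toy stand-ins for MEDLINE sentences and a disease lexicon (illustrative only).
SENTENCES = [
    "anemia due to chronic kidney disease",
    "anemia caused by chronic kidney disease",
    "blindness caused by glaucoma",
    "anemia and glaucoma",
]
LEXICON = {"anemia", "chronic kidney disease", "blindness", "glaucoma"}


def extract_triples(sentences, lexicon):
    """Return (pattern, d1, d2) for each pair of adjacent disease mentions."""
    # Longest names first so multi-word names win over shorter alternatives.
    names = sorted(lexicon, key=len, reverse=True)
    mention_re = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    triples = []
    for sentence in sentences:
        text = sentence.lower()
        mentions = list(mention_re.finditer(text))
        for left, right in zip(mentions, mentions[1:]):
            # The pattern is the text between two mentions, with placeholders.
            pattern = "D1" + text[left.end():right.start()] + "D2"
            triples.append((pattern, left.group(1), right.group(1)))
    return triples


def bootstrap(triples, seed_pattern, rounds=2, top_k=1):
    """Alternate between collecting pairs and accepting new patterns."""
    patterns = {seed_pattern}
    pairs = set()
    distinct = set(triples)
    for _ in range(rounds):
        # Step 1: collect (d1, d2) pairs matched by the accepted patterns.
        pairs |= {(d1, d2) for p, d1, d2 in distinct if p in patterns}
        # Step 2: score unseen patterns by the number of known pairs they match.
        scores = Counter()
        for p, d1, d2 in distinct:
            if p not in patterns and (d1, d2) in pairs:
                scores[p] += 1
        # Step 3: accept the top-scoring patterns (a simplification; a real
        # pipeline would likely review ranked patterns before accepting them).
        patterns |= {p for p, _ in scores.most_common(top_k)}
    # Final pass so pairs reflect every accepted pattern.
    pairs |= {(d1, d2) for p, d1, d2 in distinct if p in patterns}
    return patterns, pairs


if __name__ == "__main__":
    learned_patterns, learned_pairs = bootstrap(
        extract_triples(SENTENCES, LEXICON), "D1 due to D2"
    )
    print(sorted(learned_patterns))
    print(sorted(learned_pairs))
```

In this toy run the seed pattern yields one pair, that pair surfaces a second pattern, and the second pattern yields a new pair. The non-specific pattern “D1 and D2” is never accepted because it matches no known pair.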
Top 10 ranked patterns and numbers of associated D1 ← D2 pairs
| “ | 14,183 | “ | 14,183 | “ | 14,183 |
| “D1 was due to D2” | 198 | “D1 and D2” | 205,942 | “ | 8,297 |
| “D1 owing to D2” | 175 | “D1 in D2” | 50,902 | “ | 6,499 |
| “D1 attributable to D2” | 279 | “ | 8,297 | “D1 from D2” | 7,993 |
| “D1 was caused by D2” | 181 | “D1 associated with D2” | 27,477 | “D1 associated with D2” | 27,477 |
| “D1 due to chronic D2” | 146 | “ | 6,499 | “D1 in patients with D2” | 20,221 |
| “D1 due to severe D2” | 187 | “D1 in patients with D2” | 20,221 | “D1 in D2” | 50,902 |
| “D1 as a result of D2” | 516 | “D1 with D2” | 35,203 | “D1 of D2” | 11,919 |
| “ | 1,281 | “D1 from D2” | 7,993 | “ | 1,281 |
| “D1 attributed to D2” | 184 | “D1, D2” | 99,881 | “D1 related to D2” | 1,616 |
Risk-specific patterns associated with ≥1000 pairs are highlighted.
Top 10 ranked patterns and numbers of associated D1 → D2 pairs
| “ | 14,183 | “ | 14,183 | “ | 14,183 |
| “D1 is a leading cause of D2” | 132 | “D1 and D2” | 205,942 | “D1 with D2” | 35,203 |
| “D1 is the most common cause of D2” | 188 | “D1 with D2” | 35,203 | “D1 D2” | 12,887 |
| “D1 is a major cause of D2” | 281 | “D1, D2” | 99,881 | “ | 2,260 |
| “D1 is the main cause of D2” | 104 | “D1 D2” | 12,887 | “D1 patients with D2” | 2,578 |
| “D1 is a frequent cause of D2” | 104 | “D1 or D2” | 38,841 | “ | 1,463 |
| “D1 is an important cause of D2” | 262 | “D1 in D2” | 50,902 | “D1 without D2” | 3,703 |
| “D1 is a common cause of D2” | 351 | “D1 associated with D2” | 27,477 | “ | 3,422 |
| “D1 as cause of D2” | 117 | “D1, and D2” | 28,942 | “D1 or D2” | 38,841 |
| “D1-induced D2” | 558 | “ | 2,260 | “D1 and D2” | 205,942 |
Risk-specific patterns associated with ≥1000 pairs are highlighted.
Figure 2. Pattern precisions.
Number of disease-risk-specific patterns among top-ranked patterns for five different seeds: seed1 (“D1 due to D2”), seed2 (“D1 caused by D2”), seed3 (“D1 secondary to D2”), seed4 (“D1 attributable to D2”), and seed5 (“D1 and D2”)
| Top-ranked patterns | seed1 | seed2 | seed3 | seed4 | seed5 |
|---|---|---|---|---|---|
| 10 | 4 | 4 | 4 | 5 | 1 |
| 20 | 6 | 7 | 7 | 11 | 3 |
| 30 | 8 | 11 | 11 | 11 | 3 |
| 40 | 12 | 15 | 15 | 12 | 4 |
| 50 | 16 | 20 | 15 | 15 | 4 |
| 60 | 21 | 21 | 17 | 15 | 4 |
| 70 | 23 | 22 | 21 | 16 | 4 |
| 80 | 25 | 23 | 22 | 16 | 5 |
| 90 | 26 | 24 | 23 | 16 | 5 |
| 100 | 26 | 24 | 23 | 16 | 5 |
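One natural way to read the table above is as precision at N: the number of risk-specific patterns among the top N ranked patterns, divided by N. The sketch below computes that ratio from the tabulated counts. It assumes that the five value columns follow the seed order given in the caption (seed1 through seed5) and that this ratio matches what Figure 2 plots; neither is stated explicitly in this record.

```python
# Counts of risk-specific patterns among the top-N ranked patterns,
# copied from the table. The column-to-seed mapping is an assumption
# based on the order in which the caption lists the seeds.
TOP_N = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
COUNTS = {
    "seed1": [4, 6, 8, 12, 16, 21, 23, 25, 26, 26],
    "seed2": [4, 7, 11, 15, 20, 21, 22, 23, 24, 24],
    "seed3": [4, 7, 11, 15, 15, 17, 21, 22, 23, 23],
    "seed4": [5, 11, 11, 12, 15, 15, 16, 16, 16, 16],
    "seed5": [1, 3, 3, 4, 4, 4, 4, 5, 5, 5],
}


def precision_at_n(counts, top_n):
    """Fraction of the top-N patterns that are risk-specific, for each N."""
    return [count / n for count, n in zip(counts, top_n)]


PRECISION = {seed: precision_at_n(counts, TOP_N) for seed, counts in COUNTS.items()}

if __name__ == "__main__":
    for seed, values in PRECISION.items():
        print(seed, " ".join(f"{v:.2f}" for v in values))
```

Under this reading, the risk-specific seeds reach roughly 0.16 to 0.26 at N = 100, while the non-specific seed5 (“D1 and D2”) stays at 0.05.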
Percentages of disease-disease risk pairs (D1 → D2) that share any genes or drugs (Column 2), the average numbers of shared genes or drugs for D1 → D2 pairs (Column 3), and the average numbers of shared genes or drugs for disease-disease combinations (D1-D2) (Column 4)
| Data source | D1 → D2 pairs sharing genes or drugs | Avg. shared genes or drugs per D1 → D2 pair | Avg. shared genes or drugs per D1-D2 combination |
|---|---|---|---|
| Disease-gene (OMIM) | 3.79% | 0.016 | |
| Disease-gene (GWAS) | 13.64% | 0.134 | |
| Disease-drug | 42.12% | 0.222 | |
Figure 3. Correlations between disease-disease pairs with shared risk or effect diseases and their associated genes (OMIM).
Figure 4. Correlations between disease-disease pairs with shared risk or effect diseases and their associated genes (GWAS).
Figure 5. Correlations between disease-disease pairs with shared risk or effect diseases and their associated drugs.
Figure 6. Weighted risk graph directly related to obesity.
Figure 7. Weighted risk graph directly related to type 2 diabetes (T2D).