Vivek Sriram, Manu Shivakumar, Sang-Hyuk Jung, Yonghyun Nam, Lisa Bang, Anurag Verma, Seunggeun Lee, Eun Kyung Choe, Dokyoon Kim.
Abstract
BACKGROUND: Disease complications, the onset of secondary phenotypes given a primary condition, can exacerbate the long-term severity of outcomes. However, the exact cause of many of these cross-phenotype associations is still unknown. One potential reason is shared genetic etiology: common genetic drivers may lead to the onset of multiple phenotypes. Disease-disease networks (DDNs), where nodes represent diseases and edges represent associations between diseases, can provide an intuitive way of understanding the relationships between phenotypes. Using summary statistics from a phenome-wide association study (PheWAS), we can generate a corresponding DDN where edges represent shared genetic variants between diseases. Such a network can help us analyze genetic associations across the diseasome, the landscape of all human diseases, and identify potential genetic influences for disease complications.
Keywords: PheWAS; comorbidity; disease complication; disease-disease network; network medicine
Year: 2022 PMID: 35166337 PMCID: PMC8848314 DOI: 10.1093/gigascience/giac002
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 7.658
Figure 1: A depiction of the process for creating a SNP-based DDN. A PheWAS can be run on data from an EHR-linked biobank to calculate p-values of associations between a variety of single-nucleotide polymorphisms (SNPs) and phenotypes. The summary statistics from this PheWAS lend themselves to a DDN, where nodes represent diseases and edges represent common associated SNPs between diseases. Figure created with BioRender.com.
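The construction in Figure 1 can be illustrated with a minimal Python sketch. It is not the NETMAGE implementation: the function name `build_ddn`, the record layout, the toy rsIDs, and the significance threshold of 5e-8 are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_ddn(records, p_threshold=5e-8):
    """Build a toy SNP-based DDN.

    records: iterable of (snp_id, phenotype, p_value) tuples, a
    hypothetical stand-in for PheWAS summary statistics.
    p_threshold: assumed significance cutoff (not taken from the source).
    Returns {(phenotype_a, phenotype_b): set_of_shared_snps}.
    """
    # Collect the significantly associated SNPs for each phenotype.
    snps_by_pheno = defaultdict(set)
    for snp, pheno, p in records:
        if p < p_threshold:
            snps_by_pheno[pheno].add(snp)

    # Link two phenotypes when they share at least one significant SNP.
    edges = {}
    for a, b in combinations(sorted(snps_by_pheno), 2):
        shared = snps_by_pheno[a] & snps_by_pheno[b]
        if shared:
            edges[(a, b)] = shared
    return edges


# Toy input; SNP identifiers and p-values are made up.
toy = [
    ("rs1", "type 2 diabetes", 1e-9),
    ("rs1", "hypertension", 3e-10),
    ("rs2", "type 2 diabetes", 2e-12),
    ("rs2", "hypertension", 0.04),
    ("rs3", "psoriasis", 1e-8),
]
ddn = build_ddn(toy)
print(ddn)
```

In this toy input only `rs1` passes the cutoff for both type 2 diabetes and hypertension, so the network contains a single edge between those two phenotypes.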
A comparison of NETMAGE to other toolkits that currently exist for the visualization of PheWAS summary statistics
| Software Name | Allows users to upload desired PheWAS results for analysis | Allows for interactive investigation of cross-phenotype associations | Generates a network visualization of genetic associations between phenotypes | Allows users to search and create subsets of any produced networks by disease, by genetic variant, or by other network statistics |
|---|---|---|---|---|
| PleioNet | x | x | x | |
| ShinyGPA | x | x | x | |
| PheGWAS | x | x | N/A | |
| PheWAS-Me | x | x | x | |
| PheWeb | x | x | N/A | |
| NETMAGE | x | x | x | x |
N/A: not applicable.
Figure 2: A depiction of the NETMAGE visualization tool. (A) The sidebar of the visualization gives a description of the map. It also includes a search dropdown and a group selector dropdown menu. (B) Variables are automatically read from the input data and included as options for search. (C) Clicking on a node reduces the displayed map to only the chosen node and its direct connections. Additionally, associated variants, connected phenotypes, and network statistics are presented to the right of the window when a node is selected. This graph corresponds to the subnetwork for type 2 diabetes. (D) All nodes within a single disease category can be visualized at once using the Group Selector. Here, we display all neoplasm phenotypes.
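The two filtering behaviors described in panels (C) and (D) can be sketched in Python as follows. This is an illustration only, not NETMAGE's code: the helper names, the edge dictionary layout, and the edge weights are hypothetical, and the sketch assumes that node selection keeps only the edges touching the chosen node.

```python
def ego_subnetwork(edges, node):
    """Keep only the edges incident to `node` (the node and its direct connections)."""
    return {pair: w for pair, w in edges.items() if node in pair}


def nodes_in_group(categories, group):
    """Return all phenotypes assigned to one disease category, sorted by name."""
    return sorted(name for name, cat in categories.items() if cat == group)


# Hypothetical edge weights (numbers of shared SNPs).
edges = {
    ("Diabetes mellitus", "Hypertension"): 3,
    ("Diabetes mellitus", "Rheumatoid arthritis"): 1,
    ("Hypertension", "Skin cancer"): 2,
}
categories = {
    "Diabetes mellitus": "Endocrine/metabolic",
    "Hypertension": "Circulatory system",
    "Rheumatoid arthritis": "Musculoskeletal",
    "Skin cancer": "Neoplasm",
}

selected = ego_subnetwork(edges, "Diabetes mellitus")
neoplasms = nodes_in_group(categories, "Neoplasm")
print(selected)
print(neoplasms)
```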
Figure 3: A histogram of the degree distribution for the UKBB DDN. This distribution follows a power law, suggesting a scale-free property for the network. We also see that disease categories show no specific trend with respect to the degree of the disease.
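A degree histogram like the one in Figure 3 can be computed from an edge list, and a rough look at the power-law claim is the slope of a least-squares line on log-log axes. The sketch below is a crude diagnostic under that assumption, not a rigorous power-law test and not the analysis used by the authors; the function names and the toy graph are hypothetical.

```python
import math
from collections import Counter


def degree_histogram(edges):
    """Map each degree value to the number of nodes with that degree."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return Counter(degree.values())


def loglog_slope(hist):
    """Least-squares slope of log(count) against log(degree)."""
    xs = [math.log(k) for k in hist]
    ys = [math.log(hist[k]) for k in hist]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy graph: one hub "A" and several low-degree nodes.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("B", "C")]
hist = degree_histogram(toy_edges)
slope = loglog_slope(hist)
print(dict(hist), slope)
```

A negative slope means that high-degree nodes are rarer than low-degree nodes, which is the qualitative pattern expected of a scale-free network.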
Hub phenotypes in the UKBB DDN
| Phenotype | PheCode | Disease category |
|---|---|---|
| Skin cancer | 172 | Neoplasm |
| Diabetes mellitus | 250 | Endocrine/metabolic |
| | 244 | Endocrine/metabolic |
| | 244.4 | Endocrine/metabolic |
| | 250.1 | Endocrine/metabolic |
| | 250.2 | Endocrine/metabolic |
| | 272 | Endocrine/metabolic |
| | 272.1 | Endocrine/metabolic |
| Other retinal disorders | 362 | Sense organs |
| Hypertension | 401 | Circulatory system |
| Essential hypertension | 401.1 | Circulatory system |
| Coronary atherosclerosis | 411.4 | Circulatory system |
| Non-celiac intestinal malabsorption | 557 | Digestive |
| | 557.1 | Digestive |
| | 696 | Dermatologic |
| Psoriasis NOS | 696.4 | Dermatologic |
| Other inflammatory polyarthropathies | 714 | Musculoskeletal |
| Rheumatoid arthritis | 714.1 | Musculoskeletal |
| Disorders of muscle, ligament, and fascia | 728 | Musculoskeletal |
| Fasciitis | 728.7 | Musculoskeletal |
Centrality measures used to identify these phenotypes included degree, weighted degree, closeness centrality, betweenness centrality, and eigenvector centrality. Diseases marked in boldface appear multiple times as the most central nodes based upon our different network measures. Supplementary Table S1 provides the exact centrality measures that identified each phenotype to be a hub. NOS: not otherwise specified.
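Three of the centrality measures named above (degree, weighted degree, and closeness) can be sketched in plain Python. The sketch assumes an undirected, connected network, measures closeness on unweighted hop counts, and uses hypothetical edge weights; betweenness and eigenvector centrality are omitted for brevity. It is not the procedure used to produce the table.

```python
from collections import defaultdict, deque


def centralities(weighted_edges):
    """Compute degree, weighted degree, and closeness for every node.

    weighted_edges: {(node_a, node_b): weight}, treated as undirected.
    """
    adjacency = defaultdict(dict)
    for (a, b), weight in weighted_edges.items():
        adjacency[a][b] = weight
        adjacency[b][a] = weight

    result = {}
    for node in adjacency:
        # Breadth-first search gives hop distances to all reachable nodes.
        dist = {node: 0}
        queue = deque([node])
        while queue:
            current = queue.popleft()
            for neighbor in adjacency[current]:
                if neighbor not in dist:
                    dist[neighbor] = dist[current] + 1
                    queue.append(neighbor)
        total = sum(dist.values())
        result[node] = {
            "degree": len(adjacency[node]),
            "weighted_degree": sum(adjacency[node].values()),
            # Closeness: reachable nodes divided by the sum of distances to them.
            "closeness": (len(dist) - 1) / total if total else 0.0,
        }
    return result


# Hypothetical weights (numbers of shared SNPs) on a four-node path.
toy_edges = {
    ("Hypertension", "Diabetes mellitus"): 3,
    ("Hypertension", "Coronary atherosclerosis"): 2,
    ("Diabetes mellitus", "Rheumatoid arthritis"): 1,
}
scores = centralities(toy_edges)
for name, values in scores.items():
    print(name, values)
```

In this toy path, the two interior nodes score highest on every measure, which is how a hub would stand out across several centrality rankings.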
Runtimes for DDN generation given input datasets with different numbers of phenotypes
Server runtime to generate network after receiving HTTP request (sec). FR: Fruchterman-Reingold layout; FA2: Force Atlas 2 layout.
| Phenotype count | FR run 1 | FR run 2 | FR run 3 | FR run 4 | FR run 5 | FR mean (SD) | FA2 run 1 | FA2 run 2 | FA2 run 3 | FA2 run 4 | FA2 run 5 | FA2 mean (SD) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 50 | 3.07 | 2.34 | 2.86 | 2.31 | 2.76 | 2.67 (0.33) | 2.46 | 2.48 | 2.93 | 2.43 | 3.00 | 2.66 (0.28) |
| 100 | 3.26 | 3.49 | 4.29 | 3.61 | 3.52 | 3.63 (0.39) | 3.43 | 4.14 | 4.37 | 4.62 | 3.58 | 4.03 (0.51) |
| 250 | 6.60 | 5.20 | 6.77 | 6.62 | 5.56 | 6.15 (0.72) | 6.74 | 5.31 | 6.36 | 6.92 | 5.90 | 6.25 (0.65) |
| 500 | 11.21 | 11.85 | 12.53 | 10.94 | 9.91 | 11.29 (0.99) | 11.68 | 12.04 | 12.49 | 11.21 | 9.33 | 11.35 (1.22) |
| 1,000 | 28.27 | 28.77 | 30.19 | 27.01 | 29.52 | 28.75 (1.22) | 29.37 | 28.35 | 29.84 | 27.23 | 30.23 | 29.00 (1.22) |
| UKBB DDN | 48.60 | | | | | N/A | 39.43 | | | | | N/A |
These times measure how long it takes for the server to generate the network after the “submit” button has been clicked; in all instances, files have already been uploaded to the server. Upload speeds for files will vary depending on user bandwidth. Five different datasets were constructed for each count of phenotypes to evaluate runtime, and the mean and standard deviation of the runtime across the 5 runs are also provided for each row. Finally, the runtime for the full input UKBB case study is included in the last row of the table. N/A: not applicable.
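As a worked check of the summary column, the five Fruchterman-Reingold runtimes for 50 phenotypes can be fed to Python's `statistics` module. The reported value is reproduced when the sample standard deviation (divisor n - 1) is used; that the authors used this estimator is an inference from the numbers, not something the source states.

```python
from statistics import mean, stdev

# Fruchterman-Reingold runtimes (sec) for the 50-phenotype row of the table.
runs = [3.07, 2.34, 2.86, 2.31, 2.76]

# stdev() is the sample standard deviation (divisor n - 1).
summary = f"{mean(runs):.2f} ({stdev(runs):.2f})"
print(summary)  # 2.67 (0.33)
```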