Jialiang Yang, Jing Qiu, Kejing Wang, Lijuan Zhu, Jingjing Fan, Deyin Zheng, Xiaodi Meng, Jiasheng Yang, Lihong Peng, Yu Fu, Dahan Zhang, Shouneng Peng, Haiyun Huang, Yi Zhang.
Abstract
Obesity is a primary risk factor for many diseases, including certain cancers. In this study, we developed three algorithms, a random-walk-based method (OBNet), a shortest-path-based method (OBsp) and a direct-overlap method (OBoverlap), to reveal obesity-disease connections in protein-interaction subnetworks corresponding to thousands of biological functions and pathways. Through literature mining, we also curated a list of obesity-associated diseases, against which we compared the methods. OBNet outperforms the other two methods. OBNet can predict whether a disease is obesity-related based on its associated genes. It also identifies extensive connections between obesity genes and genes associated with a few diseases in various functional modules and pathways. Using breast cancer and Type 2 diabetes as two examples, OBNet identifies meaningful genes that may play key roles in connecting obesity and the two diseases. For example, TGFB1 and VEGFA are inferred to be the top two key genes mediating the obesity-breast cancer connection in modules associated with brain development. Finally, the top modules identified by OBNet in breast cancer significantly overlap with modules identified from a TCGA breast cancer gene expression study, demonstrating the power of OBNet in identifying biological processes involved in the disease.
Keywords: bioinformatics; gene expression; human obesity; obesity-related diseases; protein interaction network
Year: 2017 PMID: 29156709 PMCID: PMC5689599 DOI: 10.18632/oncotarget.19490
Source DB: PubMed Journal: Oncotarget ISSN: 1949-2553
Figure 1. An overview of (A) OBNet and (B) OBsp. OBNet: A list of obesity and disease genes, a reference network, GO biological processes and KEGG pathways are first collected. The genes in each GO term or KEGG pathway are mapped onto the reference network to define a modularized network, which can be further expanded by a random walk with restart (RWR) procedure to construct an expanded modularized network. The obesity genes and disease genes are then mapped to each (expanded) modularized network. The mutual reachability between obesity genes and disease genes is estimated using RWR and gene set enrichment analysis (GSEA). The significance of the mutual reachability is evaluated by a permutation analysis in which the obesity genes are randomly permuted, and the resulting p-value is adjusted for multiple testing. Finally, the diseases are ranked by their minimum adjusted p-values across all (expanded) modularized networks, and those with low adjusted p-values are considered obesity-related. OBsp: The mutual reachability of obesity and disease genes is estimated by the average shortest path between the two sets.
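The two reachability measures named in this caption can be illustrated with a minimal sketch. It is not the authors' implementation: the toy network, the gene labels `g1` to `g4`, the restart probability of 0.5, and the choice to average shortest paths over all reachable gene pairs are all assumptions made for illustration. The GSEA and permutation steps are omitted.

```python
from collections import deque


def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at the seeds.

    adjacency: dict mapping each node to the set of its neighbours
               (undirected, unweighted network).
    seeds:     iterable of seed nodes (e.g. obesity genes); seeds that are
               absent from the network are ignored.
    restart:   restart probability (0.5 is an illustrative assumption).
    """
    nodes = sorted(adjacency)
    seed_set = set(seeds) & set(adjacency)
    if not seed_set:
        raise ValueError("no seed node is present in the network")
    # The walker starts, and restarts, uniformly over the seed nodes.
    p0 = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            degree = len(adjacency[u])
            if degree == 0:
                # An isolated node keeps its non-restart mass.
                nxt[u] += (1.0 - restart) * p[u]
                continue
            share = (1.0 - restart) * p[u] / degree
            for v in adjacency[u]:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


def mean_shortest_path(adjacency, set_a, set_b):
    """Mean breadth-first-search distance over all reachable (a, b) pairs."""
    targets = set(set_b) & set(adjacency)
    total, count = 0, 0
    for a in set(set_a) & set(adjacency):
        dist = {a: 0}
        queue = deque([a])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for b in targets:
            if b in dist:
                total += dist[b]
                count += 1
    return total / count if count else float("inf")


# Hypothetical four-gene path network: g1 - g2 - g3 - g4.
toy_network = {
    "g1": {"g2"},
    "g2": {"g1", "g3"},
    "g3": {"g2", "g4"},
    "g4": {"g3"},
}
scores = random_walk_with_restart(toy_network, seeds=["g1"])
avg_distance = mean_shortest_path(toy_network, {"g1"}, {"g3", "g4"})
```

In this toy example the RWR score decays with distance from the seed `g1`, which is the intuition behind using RWR scores as a reachability measure between two gene sets.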
Figure 2. Comparison of OBNet, OBsp and OBoverlap
"OBNet-Expanded modularized network" represents OBNet using the expanded modularized network; "OBNet-modularized network" represents OBNet using the modularized network; other methods are defined similarly.
Top 40 predicted obesity-associated diseases
| Disease | FDR | Disease | FDR |
|---|---|---|---|
| Body mass index | 3.15E-32 | Crohn’s disease | 7.96E-07 |
| Autism spectrum disorder-bipolar disorder-schizophrenia | 7.60E-11 | Asthma | 8.48E-07 |
| Coronary artery disease | 3.29E-09 | Red blood cell traits | 1.05E-06 |
| Type 2 diabetes | 3.47E-09 | Prostate cancer | 1.16E-06 |
| Metabolite levels | 7.60E-09 | Fasting plasma glucose | 1.68E-06 |
| Atrial fibrillation | 8.76E-09 | Celiac disease | 1.70E-06 |
| Height | 9.42E-09 | Blood pressure | 2.25E-06 |
| Obesity-related traits | 5.26E-08 | Rheumatoid arthritis | 2.50E-06 |
| Chronic lymphocytic leukemia | 5.51E-08 | Inflammatory biomarkers | 4.07E-06 |
| Type 1 diabetes | 6.71E-08 | HDL Cholesterol - Triglycerides (HDLC-TG) | 4.73E-06 |
| Bone mineral density | 9.23E-08 | Lipid traits | 1.25E-05 |
| Blood trace element (Cu levels) | 1.34E-07 | Platelet counts | 1.85E-05 |
| Neuroblastoma | 1.34E-07 | Allergic sensitization | 1.90E-05 |
| Warfarin maintenance dose | 1.34E-07 | Pulmonary function | 1.94E-05 |
| C-reactive protein | 1.36E-07 | Breast cancer | 2.03E-05 |
| Metabolic syndrome | 1.47E-07 | Electrocardiographic traits | 2.38E-05 |
| HDL cholesterol | 1.77E-07 | Inflammatory bowel disease | 2.87E-05 |
| Alzheimer’s disease | 2.21E-07 | Schizophrenia or bipolar disorder | 2.88E-05 |
| Thyroid function | 2.24E-07 | Immune response to smallpox vaccine (IL-6) | 3.34E-05 |
| Triglycerides | 6.69E-07 | Fibrinogen | 3.46E-05 |
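The FDR values in this table come from multiple-testing adjustment followed by ranking on the minimum adjusted p-value per disease. The sketch below shows one way such a ranking could be computed. The source does not name the adjustment procedure, so the Benjamini-Hochberg step, the choice to adjust within each disease across subnetworks, and all disease labels and p-values are assumptions for illustration only.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def rank_by_min_adjusted_p(adjusted_by_disease):
    """Sort diseases by their smallest adjusted p-value across subnetworks."""
    best = {d: min(ps) for d, ps in adjusted_by_disease.items() if ps}
    return sorted(best.items(), key=lambda item: item[1])


# Hypothetical raw permutation p-values, one per subnetwork.
raw_p = {
    "disease_A": [0.01, 0.04, 0.03, 0.005],
    "disease_B": [0.2, 0.5, 0.04, 0.9],
}
adjusted_p = {d: benjamini_hochberg(ps) for d, ps in raw_p.items()}
ranking = rank_by_min_adjusted_p(adjusted_p)
```

Here `ranking` lists `disease_A` first because its smallest adjusted p-value is lower than that of `disease_B`.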
Figure 3. The connections between obesity and ORDs
(A) Top 37 most frequent obesity-related diseases (ORDs) for OBNet based on the expanded modularized network. Here, frequency means the number of subnetworks in which obesity and the disease are significantly connected. (B) Top 40 most frequent significant modules for ORDs. Here, frequency means the number of diseases significantly connected with obesity in the module.
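The two frequency definitions in this caption amount to counting significant (disease, module) pairs along each axis. A minimal sketch follows, assuming a significance cutoff of 0.05 on the adjusted p-value. The cutoff, the labels and the values are hypothetical and do not come from the source.

```python
from collections import Counter


def connection_frequencies(adjusted_p, alpha=0.05):
    """Count significant obesity connections per disease and per module.

    adjusted_p: dict mapping (disease, module) to an adjusted p-value.
    alpha:      significance cutoff (0.05 is an illustrative assumption).
    """
    disease_freq = Counter()  # panel (A): number of modules per disease
    module_freq = Counter()   # panel (B): number of diseases per module
    for (disease, module), p in adjusted_p.items():
        if p < alpha:
            disease_freq[disease] += 1
            module_freq[module] += 1
    return disease_freq, module_freq


# Hypothetical adjusted p-values for two diseases in two modules.
toy_adjusted_p = {
    ("disease_A", "module_1"): 0.001,
    ("disease_A", "module_2"): 0.20,
    ("disease_B", "module_1"): 0.01,
    ("disease_B", "module_2"): 0.03,
}
disease_freq, module_freq = connection_frequencies(toy_adjusted_p)
```

Sorting either counter with `most_common()` would give the ordering used for a "top N most frequent" panel.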
Figure 4. Network topology and key genes connecting (A) obesity and Type 2 diabetes in regulation of behaviour and (B) obesity and breast cancer in hindbrain development. We use node shape to denote key connectors: (1) square represents the top 5 key connectors; (2) circle represents expanded obesity and disease genes. We use fill colour to denote new (expanded) obesity and disease information: (1) red represents obesity gene; (2) blue represents disease gene.
Comparison of the top 10 modules associated with breast cancer identified by WGCNA and by our study
| module | blue | darkgreen | darkorange | darkgrey | royalblue | lightgreen | orange | white | saddlebrown | skyblue |
|---|---|---|---|---|---|---|---|---|---|---|
| GO:0030902_hindbrain development | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| GO:0033135_regulation of peptidyl-serine phosphorylation | 9.78E-09 | 3.90E-02 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| GO:0021695_cerebellar cortex development | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| GO:0030003_cellular cation homeostasis | 1.61E-11 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| GO:0022037_metencephalon development | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| GO:0006521_regulation of cellular amino acid metabolic process | 1 | 3.88E-02 | 8.78E-02 | 4.04E-01 | 9.78E-09 | 1 | 1 | 1 | 1 | 1 |
| GO:0048871_multicellular organismal homeostasis | 7.17E-08 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| GO:0055080_cation homeostasis | 3.96E-09 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| GO:0055082_cellular chemical homeostasis | 2.00E-10 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| GO:0031145_anaphase-promoting complex-dependent proteasomal ubiquitin-dependent protein catabolic process | 1 | 1.30E-01 | 2.52E-01 | 2.89E-01 | 8.53E-01 | 1 | 1 | 1 | 1 | 1 |
The co-expression modules are named by color, and the genes in each module are listed in Supplementary Dataset 5.