Nuosi Wu, Jiang Huang, Xiao-Fei Zhang, Le Ou-Yang, Shan He, Zexuan Zhu, Weixin Xie.
Abstract
Gene regulatory networks (GRNs) are often inferred using Gaussian graphical models, which identify conditional dependence among genes by estimating the corresponding precision matrix. Classical Gaussian graphical models are usually designed for single-network estimation and ignore existing knowledge such as pathway information. As a result, they can neither exploit the common information shared by multiple networks nor use prior information to guide the estimation. In this paper, we propose a new weighted fused pathway graphical lasso (WFPGL) to jointly estimate multiple networks by incorporating prior knowledge derived from known pathways and gene interactions. Based on the assumption that two genes are less likely to be connected if they do not participate together in any pathway, our model includes a pathway-based constraint. Moreover, we introduce a weighted fused lasso penalty to take into account prior gene interaction data and the common information shared by multiple networks. Our model is optimized with the alternating direction method of multipliers (ADMM). Experiments on synthetic data demonstrate that our method outperforms five other state-of-the-art graphical models. We then apply our model to two real datasets. Hub genes in the identified state-specific networks show both shared and specific patterns, which indicates the effectiveness of our model in revealing the underlying mechanisms of complex diseases.
Keywords: Gaussian graphical model; fused lasso penalty; gene network analysis; precision matrix; prior information
Year: 2019 PMID: 31396259 PMCID: PMC6662592 DOI: 10.3389/fgene.2019.00623
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
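The pathway-based constraint described in the abstract (two genes are unlikely to be connected unless they share a pathway) can be illustrated with a small mask over candidate edges. The following is a minimal sketch, not the authors' implementation: the function name `pathway_mask`, the gene labels, and the toy pathway memberships are all hypothetical and chosen only for illustration.

```python
import numpy as np

def pathway_mask(genes, pathways):
    """Boolean matrix M with M[i, j] True when genes i and j share a pathway.

    Hypothetical helper for illustration; the diagonal is always True.
    """
    index = {g: i for i, g in enumerate(genes)}
    membership = np.zeros((len(genes), len(pathways)), dtype=int)
    for k, members in enumerate(pathways):
        for g in members:
            if g in index:                 # ignore genes outside the panel
                membership[index[g], k] = 1
    # Entry (i, j) of membership @ membership.T counts the shared pathways.
    mask = (membership @ membership.T) > 0
    np.fill_diagonal(mask, True)
    return mask

# Toy, made-up example: two pathways that do not overlap.
genes = ["g1", "g2", "g3", "g4"]
pathways = [{"g1", "g2"}, {"g3", "g4"}]
mask = pathway_mask(genes, pathways)
print(mask.astype(int))
```

In a constrained estimator, entries of the precision matrix where the mask is False would be the ones forced (or strongly encouraged) toward zero; how WFPGL does this exactly is specified in the full paper, not in this record.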
Figure 1. Illustration of the proposed method. (A) Graphical lasso uses only gene expression data to estimate each state-specific network separately, leading to incorrect estimation results. (B) The proposed weighted fused pathway graphical lasso jointly estimates multiple state-specific networks by considering prior knowledge of gene interaction networks and pathways, which can eliminate spurious links between different pathways and yields more accurate estimates of the networks across multiple states.
Figure 2. Experimental results of various methods on two groups of samples, with η set to (A) η = 0, (B) η = 0.4, and (C) η = 0.8. The performance of the methods on individual network estimation [with respect to true-positive rate (TPR) and false-positive rate (FPR)] is shown on the left, and their performance on differential network estimation [with respect to true-positive differential rate (TPDR) and false-positive differential rate (FPDR)] is shown on the right. For weighted fused pathway graphical lasso (WFPGL) and fused graphical lasso (FGL), different line styles correspond to different choices of λ2: solid line for λ2 = 0.0001, dashed line for λ2 = 0.001, and dotted line for λ2 = 0.01.
Figure 3. Experimental results of various methods on multiple groups of samples: (A) Dataset 1 for three states. (B) Dataset 2 for four states. The performance of the methods on individual network estimation (with respect to TPR and FPR) is shown on the left, and their performance on differential network estimation (with respect to TPDR and FPDR) is shown on the right. For WFPGL and FGL, different line styles correspond to different choices of λ2: solid line for λ2 = 0.0001, dashed line for λ2 = 0.001, and dotted line for λ2 = 0.01.
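The TPR and FPR used in the figures above compare the edges of an estimated network with those of the true network. The sketch below assumes the usual definitions (TPR = recovered true edges / all true edges, FPR = spurious edges / all true non-edges) computed over off-diagonal entries of the precision matrix; the function name `edge_rates` and the threshold `tol` are assumptions, and the paper's exact definitions may differ.

```python
import numpy as np

def edge_rates(theta_hat, theta_true, tol=1e-8):
    """Return (TPR, FPR) of edge recovery from two precision matrices.

    An edge is an off-diagonal entry whose magnitude exceeds `tol`.
    Hypothetical helper using the standard definitions.
    """
    iu = np.triu_indices(theta_true.shape[0], k=1)  # each pair counted once
    est = np.abs(theta_hat[iu]) > tol
    true = np.abs(theta_true[iu]) > tol
    n_pos, n_neg = true.sum(), (~true).sum()
    tpr = (est & true).sum() / n_pos if n_pos else float("nan")
    fpr = (est & ~true).sum() / n_neg if n_neg else float("nan")
    return float(tpr), float(fpr)

# Toy example: the true edges are (0,1) and (1,2); the estimate has (0,1) and (0,2).
theta_true = np.array([[1.0, 0.5, 0.0],
                       [0.5, 1.0, 0.5],
                       [0.0, 0.5, 1.0]])
theta_hat = np.array([[1.0, 0.4, 0.3],
                      [0.4, 1.0, 0.0],
                      [0.3, 0.0, 1.0]])
tpr, fpr = edge_rates(theta_hat, theta_true)
print(tpr, fpr)
```

Sweeping a sparsity parameter and plotting the resulting (FPR, TPR) pairs would give curves of the kind the captions describe.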
Top 10 nodes with the highest degree in the predicted differential network between insulin-resistant (IR) and insulin-sensitive (IS) patients.
| Rank | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Name | MYC | TP53 | RELA | EGFR | NFKB1 | RAC1 | VEGFA | CREB3L1 | ITGA1 | PRKAA2 |
Top 30 nodes with the highest degree in four breast cancer subtypes.
| Rank | Basal like | HER2 enriched | Luminal A | Luminal B |
|---|---|---|---|---|
| 1 | IGF1 | BMP4 | PRKACB | BMP4 |
| 2 | TP53 | IGF1 | IGF1 | IGF1 |
| 3 | TNF | ID4 | BMP4 | IFNG |
| 4 | BCL2 | IFNG | BMP2 | FAS |
| 5 | FAS | BCL2 | TNF | RPS6KB2 |
| 6 | BMP2 | BMP2 | RPS6KB2 | TP53 |
| 7 | THBS1 | TP53 | THBS1 | PRKACB |
| 8 | PIK3CG | BIRC3 | FAS | BAMBI |
| 9 | BMP4 | RPS6KB1 | TP53 | THBS1 |
| 10 | IFNG | FAS | TGFB2 | BMP2 |
| 11 | AKT3 | BMPR1B | BIRC3 | TNF |
| 12 | BAMBI | MYC | BAMBI | PRKAR2B |
| 13 | ID4 | PIK3CG | BMPR1B | TNFRSF10B |
| 14 | CDKN2B | RPS6KB2 | RPS6KB1 | BMPR1B |
| 15 | TGFB2 | TNF | AKT3 | BMP7 |
| 16 | INHBB | BMP7 | BCL2 | ACVR1C |
| 17 | RPS6KB2 | THBS1 | ACVR1C | IL1R1 |
| 18 | BIRC3 | TGFB2 | APAF1 | BCL2 |
| 19 | PITX2 | INHBB | AKT1 | PIK3R1 |
| 20 | BMP7 | DCN | ID4 | ID4 |
| 21 | BMP5 | AKT1 | LEFTY1 | APAF1 |
| 22 | LEFTY1 | PRKACB | FST | BIRC3 |
| 23 | INHBA | BAMBI | PIK3R1 | PITX2 |
| 24 | MYC | ACVR1C | IFNG | MYC |
| 25 | CCNB3 | LEFTY1 | INHBB | PIK3CD |
| 26 | PRKACB | CDKN2A | PRKAR2B | PMAIP1 |
| 27 | LEFTY2 | SMAD9 | PIK3CG | DCN |
| 28 | BMP6 | GADD45A | PIK3R5 | AKT1 |
| 29 | PIK3CD | INHBA | INHBA | INHBB |
| 30 | DCN | SERPINB5 | PIK3R3 | FASLG |
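Rankings like the ones in the tables above can be obtained by counting, for each gene, the number of nonzero off-diagonal entries in its row of an estimated (or differential) network. This is a minimal sketch under that assumption; the function name `top_degree_nodes`, the threshold `tol`, and the toy matrix are hypothetical and do not reproduce the paper's data.

```python
import numpy as np

def top_degree_nodes(theta, names, k=10, tol=1e-8):
    """Return the k nodes with the highest degree as (name, degree) pairs.

    Degree is the number of off-diagonal entries with |value| > tol.
    Ties keep the original node order (stable sort).
    """
    adj = np.abs(theta) > tol
    np.fill_diagonal(adj, False)          # self-loops do not count
    degree = adj.sum(axis=1)
    order = np.argsort(-degree, kind="stable")[:k]
    return [(names[i], int(degree[i])) for i in order]

# Toy network with edges a-b, a-c, a-d, b-c (values are made up).
names = ["a", "b", "c", "d"]
theta = np.array([[1.0, 0.2, 0.3, 0.1],
                  [0.2, 1.0, 0.4, 0.0],
                  [0.3, 0.4, 1.0, 0.0],
                  [0.1, 0.0, 0.0, 1.0]])
hubs = top_degree_nodes(theta, names, k=2)
print(hubs)
```

For a differential network between two states, `theta` would be replaced by the difference of the two estimated precision matrices, which is one plausible reading of how the insulin-resistant versus insulin-sensitive ranking was derived.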
Framework of alternating direction method of multipliers (ADMM).