Wenting Liu, Jagath C Rajapakse.
Abstract
BACKGROUND: Systematic fusion of multiple data sources for gene regulatory network (GRN) inference remains a key challenge in systems biology. We incorporate information from protein-protein interaction networks (PPIN) into GRN inference from gene expression (GE) data. Existing PPIN remain sparse, however, and transitive protein interactions can help predict the missing ones. We therefore propose a systematic probabilistic framework for fusing GE data and transitive protein interaction data to build GRN coherently.
Keywords: Gaussian mixture model (GMM); Gene expressions; Gene regulatory network (GRN); Protein-protein interaction networks; Transitive protein-protein interactions
Year: 2019 PMID: 30953534 PMCID: PMC6449891 DOI: 10.1186/s12918-019-0695-x
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Fig. 1 Overall model. Step 1: A Gaussian mixture model (GMM) is used to soft-cluster gene expression (GE) data. Step 2: A heuristic is proposed to quantitatively extend the sparse protein-protein interactions by using transitive linkages. A novel way is then proposed to score protein interactions by combining topological properties of the extended protein-protein interaction network (PPIN) and GE correlations. Step 3: A Gaussian hidden Markov model (GHMM) is used to identify gene regulatory pathways and refine interaction scores, both of which are then used as structural priors to constrain the model of GRN. Step 4: Lastly, the GRN from GE is refined using a Bayesian Gaussian mixture (BGM) model by including the structural priors derived from Step 3.
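The transitive extension in Step 2 can be illustrated with a minimal sketch. The scoring rule here is an assumption made for illustration only: an indirect link through a shared partner is scored as the product of the two direct confidences, keeping the best intermediate. The function name `extend_ppin` and the toy network are hypothetical and do not come from the paper, whose own heuristic is not specified in this record.

```python
from collections import defaultdict


def extend_ppin(edges):
    """Add two-hop (transitive) links to a sparse, undirected PPI network.

    `edges` maps a protein pair (a, b) to a confidence in [0, 1].
    ASSUMPTION: an indirect link a-c through b is scored as the product
    of the two direct confidences, keeping the best intermediate b.
    This rule is illustrative, not the paper's heuristic.
    """
    neighbours = defaultdict(dict)
    for (a, b), conf in edges.items():
        neighbours[a][b] = conf
        neighbours[b][a] = conf

    # Direct interactions keep their original confidence.
    direct = {frozenset(pair): conf for pair, conf in edges.items()}
    extended = dict(direct)

    # Every pair of partners of the same protein is a transitive candidate.
    for linked in neighbours.values():
        for a, conf_a in linked.items():
            for c, conf_c in linked.items():
                pair = frozenset((a, c))
                if a == c or pair in direct:
                    continue
                score = conf_a * conf_c
                extended[pair] = max(extended.get(pair, 0.0), score)
    return extended


# Toy chain A-B-C-D: two-hop links A-C and B-D are added, A-D is not.
raw = {("A", "B"): 0.9, ("B", "C"): 0.8, ("C", "D"): 0.5}
extended = extend_ppin(raw)
for pair, conf in sorted(extended.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(pair), round(conf, 2))
```

Only two-hop links are added in this sketch, so proteins three or more steps apart stay unconnected. A real pipeline would also need a cut-off on the transitive score to keep the extended network from becoming too dense.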
Fig. 2 A sample Gaussian hidden Markov model (GHMM). The gene expression observation of gene i is denoted by X, the circles denote hidden variables γ, and PPI confidence scores between genes i and j are denoted by C.
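To make the decoding in a Gaussian HMM concrete, the sketch below runs the standard Viterbi algorithm with univariate Gaussian emissions. It is a generic textbook instance, not the paper's model. The two states, their means and standard deviations, and the fixed transition matrix are all assumptions chosen for the example. In particular, how PPI confidence scores would enter the transition probabilities is not stated here and is left out.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a univariate normal distribution."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def viterbi(observations, means, stds, log_start, log_trans):
    """Most likely hidden-state path of a Gaussian HMM (log domain)."""
    n_states = len(means)
    # score[s]: best log-probability of any path ending in state s.
    score = [
        log_start[s] + gaussian_logpdf(observations[0], means[s], stds[s])
        for s in range(n_states)
    ]
    back = []  # back[t][s]: best predecessor of state s at step t + 1
    for x in observations[1:]:
        prev, score, pointers = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            pointers.append(best)
            score.append(
                prev[best] + log_trans[best][s]
                + gaussian_logpdf(x, means[s], stds[s])
            )
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical two-state example: state 0 = low, state 1 = high expression.
means, stds = [-1.0, 1.0], [1.0, 1.0]
log_start = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
path = viterbi([-1.2, -0.8, 1.1, 0.9], means, stds, log_start, log_trans)
print(path)
```

Working in the log domain avoids numerical underflow on long observation sequences, which is why the sketch adds log-probabilities rather than multiplying probabilities.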
Performance of GRN prediction with different PPINs on the 30-gene yeast ground-truth network
| Method | TP | FP | FN | Precision (%) | Recall (%) | F-measure (%) | | |
|---|---|---|---|---|---|---|---|---|
| Raw PPIN | 165 | 95 | 152 | 63.46 | 52.05 | 57.19 | 0.698 | 0.591 |
| Extended PPIN (subnet) | 229 | 220 | 88 | 51.00 | | 59.79 | | 0.533 |
| Extended PPIN (global) | 240 | 260 | 77 | 48.00 | | 58.75 | 0.696 | 0.500 |
Best performance measures that are significantly different are shown in bold
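Reading the first three numeric columns as true positives (TP), false positives (FP), and false negatives (FN), the next three columns follow from the standard definitions of precision, recall, and F-measure. The short sketch below reproduces the Raw PPIN row from its counts. The function name `prf` is hypothetical.

```python
def prf(tp, fp, fn):
    """Precision, recall, and F-measure in percent, rounded to 2 decimals."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return tuple(round(100 * value, 2) for value in (precision, recall, f_measure))


# Raw PPIN row: TP = 165, FP = 95, FN = 152.
raw_ppin = prf(165, 95, 152)
print(raw_ppin)  # (63.46, 52.05, 57.19), matching the table row
```

TP + FN is 317 in every row of the table, which is consistent with a fixed number of true edges in the ground-truth network.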
Fig. 3 Performance comparison of C and R scores from the subnet or the global information at various cut-off thresholds on the 30-gene yeast ground-truth network. a C scores (global). b C scores (subnet). c R scores (subnet)
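A cut-off comparison of this kind can be sketched as follows: edges whose score reaches the threshold are predicted, and precision and recall are computed against the ground-truth edge set. The edge scores, gene names, and cut-offs below are made-up toy values, and `sweep_cutoffs` is a hypothetical helper, not code from the paper.

```python
def sweep_cutoffs(scores, truth, cutoffs):
    """Precision and recall of thresholded edge scores at each cut-off.

    scores: dict mapping an edge (regulator, target) to a score.
    truth:  set of ground-truth edges.
    """
    rows = []
    for cut in cutoffs:
        predicted = {edge for edge, score in scores.items() if score >= cut}
        tp = len(predicted & truth)
        # Guard against empty prediction or truth sets.
        precision = tp / len(predicted) if predicted else 0.0
        recall = tp / len(truth) if truth else 0.0
        rows.append((cut, precision, recall))
    return rows


# Toy scores and ground truth (illustrative values only).
scores = {("g1", "g2"): 0.9, ("g1", "g3"): 0.6,
          ("g2", "g3"): 0.4, ("g3", "g4"): 0.2}
truth = {("g1", "g2"), ("g2", "g3")}
rows = sweep_cutoffs(scores, truth, [0.3, 0.5, 0.8])
for cut, precision, recall in rows:
    print(f"cut-off {cut:.1f}: precision {precision:.2f}, recall {recall:.2f}")
```

Raising the cut-off typically trades recall for precision, which is the trade-off a threshold plot makes visible.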
Performance of GRN prediction by various methods on the 30-gene yeast ground-truth network
| Method | TP | FP | FN | Precision (%) | Recall (%) | F-measure (%) | | |
|---|---|---|---|---|---|---|---|---|
| CLR | 190 | 312 | 127 | 37.85 | 59.94 | 46.40 | 0.555 | 0.388 |
| GENIE3 | 128 | 202 | 189 | 38.79 | 40.38 | 39.57 | 0.546 | 0.395 |
| TIGRESS | 140 | 207 | 177 | 40.35 | 44.16 | 42.17 | 0.546 | 0.392 |
| GMM | 172 | 266 | 145 | 39.27 | 54.26 | 45.56 | 0.583 | 0.412 |
| GHMM | 258 | 329 | 59 | 43.95 | | 57.08 | 0.664 | 0.461 |
| R scores (GHMM) | 250 | 262 | 67 | | 78.86 | | | |
| BGM (R scores) | 202 | 237 | 115 | 46.01 | 63.72 | 53.44 | 0.627 | 0.446 |
Best performance measures that are significantly different are shown in bold
Performance of GRNs generated from the BGM with different priors on the 30-gene yeast ground-truth network
| Method | TP | FP | FN | Precision (%) | Recall (%) | F-measure (%) | | | |
|---|---|---|---|---|---|---|---|---|---|
| BGM | 139 | 201 | 178 | 40.88 | 43.85 | 42.31 | 0.553 | 0.390 | 636 |
| BGM (GMM) | 185 | 293 | 132 | 38.70 | 58.36 | 46.54 | 0.555 | 0.383 | 606 |
| BGM (C scores) | 170 | 150 | 147 | | 53.63 | | | | |
| Nariai et al. (GHMM) | 176 | 273 | 141 | 39.20 | 55.52 | 45.95 | 0.570 | 0.408 | 1174 |
| Imoto et al. (GHMM) | 184 | 293 | 133 | 38.57 | 58.04 | 46.35 | 0.565 | 0.401 | 1150 |
| BGM (R scores) | 202 | 237 | 115 | 46.01 | 63.72 | 53.44 | 0.627 | 0.446 | 608 |
Best performance measures that are significantly different are shown in bold