Fei Liu, Shao-Wu Zhang, Wei-Feng Guo, Ze-Gang Wei, Luonan Chen.
Abstract
The inference of gene regulatory networks (GRNs) from expression data can uncover direct regulations among genes and provide deep insights into biological processes at the network level. Over the past decades, numerous computational approaches have been introduced for inferring GRNs. However, many of them still suffer from various problems: for example, Bayesian network (BN) methods cannot handle large-scale networks because of their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positive/negative problems. To overcome these limitations, we present a novel algorithm, the local Bayesian network (LBN), which infers GRNs from gene expression data using a network decomposition strategy and a false-positive edge elimination scheme. Specifically, the LBN algorithm first uses conditional mutual information (CMI) to construct an initial network, which is decomposed into a number of local networks. Then, a BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space of all possible GRN structures. These local BNs are integrated and CMI is applied to form a tentative network, which reduces redundant regulations and thus alleviates the false positive problem. The final network is obtained by iteratively performing CMI and local BN on the tentative network; during this iterative process, false or redundant regulations are gradually removed. When tested on benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in E. coli, our results suggest that LBN significantly outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI), with more accurate and robust performance. In particular, the decomposition strategy with local Bayesian networks not only effectively reduces the computational cost of BN, owing to the much smaller sizes of the local GRNs, but also identifies the directions of the regulations.
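The role of CMI in removing redundant regulations can be illustrated with a small sketch. It is not the authors' implementation: it assumes expression values have already been discretized, uses a simple plug-in (count-based) estimator, and the function names and toy data are hypothetical. In the toy data, gene Z drives both X and Y, so X and Y look related on their own, but the relationship disappears once Z is conditioned on.

```python
from collections import Counter
from math import log


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X;Y|Z) in nats from equal-length discrete samples."""
    n = len(x)
    n_xyz = Counter(zip(x, y, z))
    n_xz = Counter(zip(x, z))
    n_yz = Counter(zip(y, z))
    n_z = Counter(z)
    total = 0.0
    for (xi, yi, zi), count in n_xyz.items():
        # p(x,y,z) * log( p(x,y,z) p(z) / (p(x,z) p(y,z)) ), written with counts
        ratio = count * n_z[zi] / (n_xz[(xi, zi)] * n_yz[(yi, zi)])
        total += (count / n) * log(ratio)
    return total


def mutual_information(x, y):
    """I(X;Y) is the special case of conditioning on a constant variable."""
    return conditional_mutual_information(x, y, [0] * len(x))


# Hypothetical toy data: X and Y are independent given Z, but both follow Z.
pairs_z0 = [(0, 0)] * 9 + [(0, 1)] * 3 + [(1, 0)] * 3 + [(1, 1)] * 1
pairs_z1 = [(1, 1)] * 9 + [(1, 0)] * 3 + [(0, 1)] * 3 + [(0, 0)] * 1
x = [p[0] for p in pairs_z0 + pairs_z1]
y = [p[1] for p in pairs_z0 + pairs_z1]
z = [0] * len(pairs_z0) + [1] * len(pairs_z1)

mi_value = mutual_information(x, y)
cmi_value = conditional_mutual_information(x, y, z)
print(f"MI(X;Y) = {mi_value:.4f}, CMI(X;Y|Z) = {cmi_value:.4f}")
```

Here the mutual information between X and Y is positive while the conditional mutual information given Z is zero, so an edge X–Y kept by an MI threshold would be dropped by a CMI threshold.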
Year: 2016 PMID: 27479082 PMCID: PMC4968793 DOI: 10.1371/journal.pcbi.1005024
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Fig 1. Schematic diagram of the LBN method.
(1) Process the data; (2) construct the initial network (a large-scale network) by CMI or MI; (3) decompose the network into local networks (a number of small-scale networks) by kNN with k = 1; (4) perform BN to obtain local BNs (a number of small-scale networks); (5) integrate the local BNs into a candidate network (a large-scale network); (6) perform CMI to obtain the tentative network (a large-scale network). BN and CMI are then performed iteratively with kNN (k = 2) until the topological structure of the network G stabilizes, which yields the final network or GRN. The solid lines denote true regulations and the dashed lines denote redundant correlations between two genes.
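The decomposition in step (3) can be sketched as follows. This is an illustration under a stated assumption, not the published code: "kNN with k" is read here as "all genes within k hops of a gene in the current network", and the gene names, adjacency structure, and function names are hypothetical.

```python
from collections import deque


def k_hop_neighbors(adjacency, gene, k):
    """Return the genes reachable from `gene` in at most k hops, excluding `gene`."""
    distance = {gene: 0}
    queue = deque([gene])
    while queue:
        current = queue.popleft()
        if distance[current] == k:
            continue  # do not expand beyond k hops
        for neighbor in adjacency[current]:
            if neighbor not in distance:
                distance[neighbor] = distance[current] + 1
                queue.append(neighbor)
    return sorted(g for g in distance if g != gene)


def decompose(adjacency, k):
    """Map each gene to its candidate regulators (its k-hop neighborhood)."""
    return {gene: k_hop_neighbors(adjacency, gene, k) for gene in adjacency}


# Hypothetical undirected initial network: G1-G2, G2-G3, G3-G4, G2-G5.
initial_network = {
    "G1": ["G2"],
    "G2": ["G1", "G3", "G5"],
    "G3": ["G2", "G4"],
    "G4": ["G3"],
    "G5": ["G2"],
}

local_k1 = decompose(initial_network, k=1)
local_k2 = decompose(initial_network, k=2)
print(local_k1["G1"], local_k2["G1"])
```

Each local network contains only a gene and its candidates, so the structure search a BN method has to perform is restricted to a few genes instead of the whole network.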
Comparison of different methods on dataset10, dataset50 and dataset100.
| Dataset | Method | TPR | FPR | FDR | PPV | ACC | MCC | F | AUC |
|---|---|---|---|---|---|---|---|---|---|
| dataset10 | GENIE3 | 0.700 | 0.112 | 0.563 | 0.437 | 0.867 | 0.483 | 0.538 | 0.919 |
| dataset10 | ARACNE | | 0.112 | 0.500 | 0.500 | 0.888 | 0.618 | 0.643 | 0.930 |
| dataset10 | NARROMI | 0.700 | 0.050 | 0.364 | 0.636 | 0.922 | 0.623 | 0.666 | 0.938 |
| dataset10 | LBN | | | | | | | | |
| dataset50 | GENIE3 | 0.481 | 0.078 | 0.833 | 0.167 | 0.908 | 0.245 | 0.248 | 0.843 |
| dataset50 | ARACNE | | 0.082 | 0.809 | 0.192 | 0.908 | 0.303 | 0.291 | 0.832 |
| dataset50 | NARROMI | 0.532 | 0.062 | 0.783 | 0.217 | 0.925 | 0.307 | 0.308 | 0.839 |
| dataset50 | LBN | 0.403 | | | | | | | |
| dataset100 | GENIE3 | 0.265 | 0.015 | 0.768 | 0.232 | 0.972 | 0.234 | 0.247 | 0.809 |
| dataset100 | ARACNE | | 0.042 | 0.854 | 0.146 | 0.949 | 0.227 | 0.217 | |
| dataset100 | NARROMI | 0.277 | 0.010 | 0.676 | 0.324 | 0.978 | 0.289 | 0.299 | 0.849 |
| dataset100 | LBN | 0.283 | | | | | | | 0.852 |
Comparison of Grow-Shrink, IAMB and LBN methods on dataset10.
| Method | TPR | FPR | FDR | PPV | ACC | MCC | F | Runtime(s) |
|---|---|---|---|---|---|---|---|---|
| Grow-Shrink | 0.700 | 0.100 | 0.533 | 0.467 | 0.878 | 0.506 | 0.560 | 128.815 |
| IAMB | 0.800 | 0.075 | 0.429 | 0.571 | 0.911 | 0.629 | 0.667 | 70.524 |
| LBN | 0.900 | 0.050 | 0.308 | 0.692 | 0.944 | 0.759 | 0.782 | 10.462 |
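The columns in these tables are standard confusion-matrix metrics, and a short sketch shows how they relate. The counts below are not reported in the source: they are inferred for illustration from the LBN row on dataset10 (true network with 10 genes and 10 edges), assuming 90 possible directed edges without self-loops. The function name is hypothetical.

```python
from math import sqrt


def edge_metrics(tp, fp, fn, tn):
    """Standard classification metrics over predicted versus true edges."""
    return {
        "TPR": tp / (tp + fn),            # true positive rate (recall)
        "FPR": fp / (fp + tn),            # false positive rate
        "FDR": fp / (tp + fp),            # false discovery rate
        "PPV": tp / (tp + fp),            # positive predictive value (precision)
        "ACC": (tp + tn) / (tp + fp + fn + tn),
        "MCC": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        "F": 2 * tp / (2 * tp + fp + fn),  # F-score (harmonic mean of PPV and TPR)
    }


# Inferred, illustrative counts: 10 genes give 10 * 9 = 90 candidate directed edges,
# 10 of which are true. TPR = 0.900 implies 9 true edges recovered; PPV = 0.692
# implies 13 predicted edges in total, hence 4 false positives.
n_genes = 10
n_possible = n_genes * (n_genes - 1)
tp, fn, fp = 9, 1, 4
tn = n_possible - tp - fn - fp

metrics = edge_metrics(tp, fp, fn, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts, TPR, FPR, FDR, PPV and ACC reproduce the tabulated LBN values after rounding, and MCC and F agree to within about 0.001.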
Fig 2. SOS DNA repair network.
(a) True network. (b) Inferred network with LBN (α = β = 0.01). The solid lines are correctly inferred regulatory relationships, and the dotted lines are false regulatory links.
Comparison of different methods on SOS DNA repair network.
| Method | TPR | FPR | FDR | PPV | ACC | MCC | F | AUC |
|---|---|---|---|---|---|---|---|---|
| GENIE3 | 0.500 | | 0.455 | 0.546 | 0.694 | 0.299 | 0.522 | 0.684 |
| ARACNE | | 0.625 | 0.638 | 0.362 | 0.486 | 0.083 | 0.479 | 0.739 |
| NARROMI | 0.667 | 0.458 | 0.579 | 0.421 | 0.583 | 0.197 | 0.516 | 0.791 |
| Grow-Shrink | 0.458 | 0.271 | 0.542 | 0.458 | 0.639 | 0.188 | 0.458 | 0.758 |
| IAMB | 0.583 | 0.229 | 0.440 | 0.560 | 0.708 | 0.351 | 0.571 | 0.809 |
| LBN | 0.625 | | | | | | | |
Comparison of different methods on the large-scale gene regulatory network.
| | GENIE3 | ARACNE | NARROMI | Grow-Shrink | IAMB | LBN |
|---|---|---|---|---|---|---|
| AveAUC_TF | 0.684 | 0.749 | 0.754 | 0.724 | 0.751 | 0.761 |
| | 78(0.486) | 86(0.538) | 93(0.581) | 84(0.525) | 89(0.556) | 96(0.600) |
| | 60(0.375) | 68(0.425) | 71(0.444) | 62(0.389) | 68(0.425) | 72(0.450) |
| AveAUC_TG | 0.723 | 0.733 | 0.735 | 673(0.535) | 690(0.548) | 0.747 |
| | 484(0.385) | 691(0.549) | 694(0.552) | 472(0.375) | 479(0.381) | 702(0.558) |
| | 428(0.340) | 484(0.385) | 485(0.386) | 602.776 | 472.598 | 488(0.388) |
Notes: AUC is the area under the ROC curve; AveAUC_TF is the average AUC over transcription factors (TFs); AveAUC_TG is the average AUC over target genes (TGs); #**(rate) is the number and proportion of TFs/TGs predicted correctly under the condition **.
Results of different method combinations on dataset10.
| Method | TPR | FPR | FDR | PPV | ACC | MCC | F | Time (s) |
|---|---|---|---|---|---|---|---|---|
| BN | 0.800 | 0.050 | 0.333 | 0.667 | 0.933 | 0.693 | 0.727 | 0.8247 |
| MI+BN | 0.900 | 0.088 | 0.438 | 0.563 | 0.911 | 0.668 | 0.692 | 0.0395 |
| MI+BN+CMI | 0.900 | 0.063 | 0.357 | 0.643 | 0.933 | 0.726 | 0.750 | 0.0544 |
| MI+BN+CMI+kNN+BN | 0.900 | 0.050 | 0.308 | 0.692 | 0.944 | 0.759 | 0.782 | 0.2677 |
Fig 3. Gene regulatory networks composed of 10 genes.
(a) The true network with 10 genes and 10 edges. (b) The network inferred by BN method. (c) The network inferred by MI+BN. (d) The network inferred by MI+BN+CMI. (e) The network inferred by MI+BN+CMI+kNN+BN. The solid lines are correctly inferred regulatory relationships, and the dotted lines are false regulatory links.
Fig 4. Effect of parameters α and β for LBN on dataset10.