Jie Fang, Hongjia Ouyang, Liangzhong Shen, Edward R Dougherty, Wenbin Liu.
Abstract
The inference of gene regulatory networks is a core problem in systems biology. Many inference algorithms have been proposed and all suffer from false positives. In this paper, we use the minimum description length (MDL) principle to reduce the rate of false positives for best-fit algorithms. The performance of these algorithms is evaluated via two metrics: the normalized-edge Hamming distance and the steady-state distribution distance. Results for synthetic networks and a well-studied budding-yeast cell cycle network show that MDL-based filtering is more effective than filtering based on conditional mutual information (CMI). In addition, MDL-based filtering provides better inference than the MDL algorithm itself.
Keywords: Best-fit; Boolean network; Conditional mutual information; Minimum description length principle
Year: 2014 PMID: 28194163 PMCID: PMC5270450 DOI: 10.1186/s13637-014-0013-2
Source DB: PubMed Journal: EURASIP J Bioinform Syst Biol ISSN: 1687-4145
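The CMI-based filtering that the abstract uses as a baseline rests on conditional mutual information, I(X; Y | Z). The following is a minimal sketch of a plug-in (frequency-count) estimate for discrete sequences; the function name and the toy inputs are illustrative assumptions, and the record above does not specify the estimator or threshold the authors actually used.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X; Y | Z) in bits from equal-length discrete sequences."""
    n = len(x)
    if n == 0 or not (n == len(y) == len(z)):
        raise ValueError("x, y and z must be non-empty and of equal length")

    # Joint and marginal counts; dividing by n would give empirical probabilities.
    n_xyz = Counter(zip(x, y, z))
    n_xz = Counter(zip(x, z))
    n_yz = Counter(zip(y, z))
    n_z = Counter(z)

    cmi = 0.0
    for (xi, yi, zi), c in n_xyz.items():
        # p(x,y,z) * log2( p(x,y,z) p(z) / (p(x,z) p(y,z)) ); the factors of n cancel.
        cmi += (c / n) * log2(c * n_z[zi] / (n_xz[(xi, zi)] * n_yz[(yi, zi)]))
    return cmi


# Toy check (assumed data): Y copies X, so Z explains nothing about the link.
x = [0, 0, 1, 1]
z = [0, 1, 0, 1]
print(conditional_mutual_information(x, x, z))  # 1.0
# Here Y copies Z, so X adds no information about Y once Z is known.
print(conditional_mutual_information(x, z, z))  # 0.0
```

A filter built on this quantity would typically drop a candidate regulator whose CMI with the target, given the other regulators, falls below some threshold; that threshold is a tuning choice not given in this record.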
Table 1. Average number of true-positive (TP) and false-positive (FP) connections for MDL, best-fit-I, and best-fit-II filtered by CMI and MDL
| Noise (%) | Algorithm | TP | FP | TP | FP | TP | FP | TP | FP | TP | FP |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | MDL | 10.9 | 3.0 | 15.4 | 1.1 | 17.0 | 0.5 | 17.5 | 0.3 | 17.7 | 0.1 |
| 0 | BF-I | 11.4 | 3.8 | 15.8 | 1.6 | 17.1 | 0.7 | 17.4 | 0.4 | 17.5 | 0.3 |
| 0 | BF-I-CMI | 10.4 | 3.2 | 14.8 | 1.3 | 15.9 | 0.6 | 16.2 | 0.4 | 16.3 | 0.3 |
| 0 | BF-I-MDL | 11.0 | 2.6 | 15.4 | 1.2 | 16.9 | 0.6 | 17.3 | 0.4 | 17.5 | 0.2 |
| 0 | BF-II | 11.7 | 2.8 | 16.1 | 1.5 | 17.3 | 0.7 | 17.6 | 0.6 | 17.7 | 0.3 |
| 0 | BF-II-CMI | 10.9 | 2.3 | 15.2 | 1.3 | 16.1 | 0.6 | 16.4 | 0.4 | 16.4 | 0.2 |
| 0 | BF-II-MDL | 10.8 | 1.9 | 15.3 | 0.9 | 16.9 | 0.4 | 17.5 | 0.3 | 17.6 | 0.2 |
| 5 | MDL | 9.5 | 5.8 | 14.1 | 5.8 | 16.2 | 5.5 | 17.0 | 5.9 | 17.4 | 6.4 |
| 5 | BF-I | 10.0 | 9.1 | 14.5 | 8.9 | 16.4 | 6.5 | 17.0 | 4.3 | 17.3 | 2.7 |
| 5 | BF-I-CMI | 9.1 | 6.7 | 13.5 | 7.1 | 15.2 | 5.2 | 15.7 | 3.8 | 15.9 | 2.5 |
| 5 | BF-I-MDL | 9.4 | 6.8 | 14.2 | 6.0 | 16.3 | 5.0 | 16.9 | 3.1 | 17.3 | 2.0 |
| 5 | BF-II | 10.4 | 7.3 | 14.9 | 8.5 | 16.6 | 6.8 | 17.3 | 4.6 | 17.5 | 3.0 |
| 5 | BF-II-CMI | 9.7 | 5.9 | 14.0 | 7.1 | 15.4 | 5.3 | 16.0 | 3.5 | 16.0 | 2.4 |
| 5 | BF-II-MDL | 9.3 | 4.9 | 14.0 | 5.3 | 16.2 | 4.7 | 17.0 | 3.4 | 17.3 | 2.2 |
| 10 | MDL | 8.3 | 8.1 | 12.8 | 10.4 | 15.1 | 10.6 | 16.2 | 10.7 | 16.9 | 11.0 |
| 10 | BF-I | 8.8 | 12.9 | 13.0 | 13.7 | 15.1 | 11.1 | 16.3 | 8.6 | 16.8 | 6.4 |
| 10 | BF-I-CMI | 7.9 | 9.4 | 12.1 | 11.0 | 13.9 | 9.7 | 14.9 | 7.7 | 15.3 | 4.5 |
| 10 | BF-I-MDL | 8.1 | 9.6 | 12.6 | 10.7 | 15.0 | 8.4 | 16.2 | 6.3 | 16.8 | 5.8 |
| 10 | BF-II | 9.2 | 10.9 | 13.5 | 13.1 | 15.6 | 11.4 | 16.6 | 9.2 | 17.1 | 7.0 |
| 10 | BF-II-CMI | 8.4 | 8.5 | 12.6 | 10.8 | 14.4 | 8.9 | 15.1 | 7.2 | 15.5 | 5.0 |
| 10 | BF-II-MDL | 8.1 | 7.5 | 12.6 | 9.0 | 15.1 | 8.5 | 16.3 | 6.9 | 16.9 | 5.6 |
BF, best-fit.
Figure 1. Comparison of normalized-edge Hamming distance and steady-state distribution distance μssd with 0%, 5%, and 10% noise for MDL, best-fit-I, and best-fit-II filtered by CMI and MDL.
Figure 2. Simplified cell-cycle network of budding yeast and the networks inferred from the time series in Table 2. (A) Simplified cell-cycle network of budding yeast. Arrows denote positive regulation, “T” lines denote negative regulation, and “T” loops denote self-degradation. (B) Network inferred by MDL. (C) Network inferred by best-fit-I. (D) Network inferred by best-fit-II. (E) Network inferred by best-fit-I filtered by CMI. (F) Network inferred by best-fit-II filtered by CMI. (G) Network inferred by best-fit-I filtered by MDL. (H) Network inferred by best-fit-II filtered by MDL. In panels (B) to (H), bold solid lines mark correctly inferred regulatory relations and light dashed lines mark incorrectly inferred ones.
Table 2. Temporal evolution of state for the cell cycle
| Time | Cln3 | MBF | SBF | Cln1 | Cdh1 | Swi5 | Cdc20 | Clb5 | Sic1 | Clb1 | Mcm1 | Phase |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | Start |
| 2 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | G1 |
| 3 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | G1 |
| 4 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | G1 |
| 5 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | S |
| 6 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | G2 |
| 7 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | M |
| 8 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | M |
| 9 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | M |
| 10 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | M |
| 11 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | M |
| 12 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | M |
| 13 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | G1 |
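Best-fit inference, as commonly described, scores a candidate predictor set by the number of observed transitions that no Boolean function of those predictors can reproduce. The sketch below counts that error on the time series above, assuming synchronous updates and unweighted errors; `best_fit_error` is an illustrative name, not the authors' code, and the best-fit-I and best-fit-II variants compared in this record may differ in details not given here.

```python
from collections import defaultdict

GENES = ["Cln3", "MBF", "SBF", "Cln1", "Cdh1", "Swi5",
         "Cdc20", "Clb5", "Sic1", "Clb1", "Mcm1"]

# Rows are time steps 1..13, copied from the cell-cycle table above.
SERIES = [
    [1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
]


def best_fit_error(target, predictors, series=SERIES, genes=GENES):
    """Minimum number of transitions mispredicted by any Boolean function
    of `predictors` at time t for `target` at time t + 1."""
    t_idx = genes.index(target)
    p_idx = [genes.index(g) for g in predictors]

    # For each observed input pattern, count how often the target is 0 or 1 next.
    counts = defaultdict(lambda: [0, 0])
    for now, nxt in zip(series, series[1:]):
        pattern = tuple(now[i] for i in p_idx)
        counts[pattern][nxt[t_idx]] += 1

    # The best function outputs the majority value per pattern; the minority is the error.
    return sum(min(c) for c in counts.values())


print(best_fit_error("Cln1", ["SBF"]))   # 0
print(best_fit_error("Cln1", ["Cln3"]))  # 5
```

On this noiseless trajectory, SBF alone predicts the next state of Cln1 without error, whereas Cln3 alone leaves five transitions unexplained. With only 12 transitions, many predictor sets reach zero error, which is one way false positives can arise.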
Table 3. Comparison of MDL, best-fit-I, and best-fit-II with CMI- and MDL-based filtering for yeast-pathway data
| Algorithm | TP (0%) | FP (0%) | Hamming (0%) | μssd (0%) | TP (5%) | FP (5%) | Hamming (5%) | μssd (5%) | TP (10%) | FP (10%) | Hamming (10%) | μssd (10%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MDL | 14 | 2 | 0.65 | 1.31 | 11.5 | 9 | 0.93 | 1.42 | 8.9 | 12.5 | 1.11 | 1.45 |
| BF-I | 15 | 5 | 0.71 | 1.25 | 12.2 | 11.9 | 0.99 | 1.44 | 9.8 | 18.4 | 1.25 | 1.49 |
| BF-I-CMI | 11 | 1 | 0.71 | 1.43 | 10.4 | 9 | 0.96 | 1.47 | 8.3 | 14 | 1.17 | 1.51 |
| BF-I-MDL | 14 | 2 | 0.65 | 1.17 | 10.8 | 8.5 | 0.93 | 1.43 | 8.6 | 13.1 | 1.13 | 1.48 |
| BF-II | 15 | 3 | 0.65 | 1.41 | 12.4 | 10.4 | 0.94 | 1.45 | 10.6 | 16.5 | 1.17 | 1.48 |
| BF-II-CMI | 12 | 2 | 0.71 | 1.46 | 11 | 8.7 | 0.93 | 1.47 | 8.3 | 12.4 | 1.12 | 1.50 |
| BF-II-MDL | 13 | 1 | 0.65 | 1.36 | 11.1 | 7.7 | 0.9 | 1.42 | 9.2 | 11.9 | 1.08 | 1.44 |
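The third value in each noise group of the table above can be reproduced from the TP and FP counts if the normalized-edge Hamming distance is taken to be (FN + FP) divided by the number of edges in the reference network. The sketch below checks this under two assumptions that this record does not state explicitly: that definition, and a reference network of 34 edges (a value chosen because it matches the tabulated numbers).

```python
def normalized_edge_hamming(tp, fp, n_true_edges):
    """(false negatives + false positives) / number of edges in the true network."""
    fn = n_true_edges - tp
    return (fn + fp) / n_true_edges


N_TRUE_EDGES = 34  # assumption: inferred from the table, not stated in this record

# (TP, FP, reported distance) taken from the yeast-pathway comparison table.
ROWS = {
    "MDL, 0% noise": (14, 2, 0.65),
    "BF-I, 0% noise": (15, 5, 0.71),
    "BF-II-MDL, 0% noise": (13, 1, 0.65),
    "MDL, 5% noise": (11.5, 9, 0.93),
    "BF-II-MDL, 10% noise": (9.2, 11.9, 1.08),
}

for name, (tp, fp, reported) in ROWS.items():
    value = normalized_edge_hamming(tp, fp, N_TRUE_EDGES)
    print(f"{name}: computed {value:.2f}, reported {reported:.2f}")
```

Because false positives are counted in the numerator but not the denominator, the distance can exceed 1, as it does in the 10% noise columns.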