Natalie Berestovsky, Luay Nakhleh.
Abstract
Regulatory networks play a central role in cellular behavior and decision making. Learning these regulatory networks is a major task in biology, and devising computational methods and mathematical models for this task is a major endeavor in bioinformatics. Boolean networks have been used extensively for modeling regulatory networks. In this model, the state of each gene is either 'on' or 'off', and the next state of each gene is updated, synchronously or asynchronously, according to a Boolean rule applied to the current state of the entire system. Inferring a Boolean network from a set of experimental data entails two main steps: first, the experimental time-series data are discretized into Boolean trajectories; then, a Boolean network is learned from these trajectories. In this paper, we consider three methods for data discretization, including a new one we propose, and three methods for learning Boolean networks, and we study the performance of all nine possible combinations on four regulatory systems of varying dynamical complexity. We find that employing the right combination of methods for data discretization and network learning results in Boolean networks that capture the dynamics well and provide predictive power. Our findings contrast with a recent survey that placed Boolean networks at the low end of the "faithfulness to biological reality" and "ability to model dynamics" spectra. Further, contrary to the common argument in favor of Boolean networks, we find that a relatively large number of time points in the time-series data is required to learn good Boolean networks for certain data sets. Finally, although methods for inferring Boolean networks have been proposed, publicly available implementations of them are still missing. Here, we make our implementation of the methods publicly available as open source at http://bioinfo.cs.rice.edu/.
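To make the two update schemes mentioned in the abstract concrete, here is a minimal Python sketch. The three-gene network, its rules, and the function names `sync_step` and `async_step` are hypothetical illustrations, not taken from the paper or its released code. The asynchronous variant shown updates one randomly chosen gene per step, which is one common convention among several.

```python
import random

# Hypothetical 3-gene network (not from the paper): each rule maps the
# full current state to the next value of one gene.
RULES = {
    "A": lambda s: not s["C"],
    "B": lambda s: s["A"],
    "C": lambda s: s["A"] and s["B"],
}

def sync_step(state):
    """Synchronous update: every gene is recomputed from the same current state."""
    return {gene: bool(rule(state)) for gene, rule in RULES.items()}

def async_step(state, rng):
    """Asynchronous update: one randomly chosen gene is recomputed, the rest keep their values."""
    gene = rng.choice(sorted(RULES))
    nxt = dict(state)
    nxt[gene] = bool(RULES[gene](state))
    return nxt

start = {"A": True, "B": False, "C": False}
print(sync_step(start))                     # all genes updated at once
print(async_step(start, random.Random(0)))  # at most one gene changes
```

Under the synchronous scheme a trajectory is fully determined by the start state, whereas asynchronous trajectories depend on the order in which genes are picked, so asynchronous dynamics are usually summarized over many simulated runs.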
Year: 2013 PMID: 23805196 PMCID: PMC3689729 DOI: 10.1371/journal.pone.0066031
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Algorithm 1. From Time-series to Boolean Networks.
Input:
• Time-series
• Binarization method
• Learning method
• Error scoring metric
• Number of iterations
Output: All Boolean networks that are optimal under the error metric.
Steps:
1. […]
2. Remove redundancy in […]
3. […]
4. […]
   (a) […]
Return all Boolean networks […]
Figure 1. Iterative k-means clustering with […] (direct binarization) vs. […].
More refined binarization is achieved with higher values of […].
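As an illustration of the direct binarization named in the Figure 1 caption, the following Python sketch clusters one gene's measurements into two groups with one-dimensional k-means (k = 2) and maps the lower cluster to 0 and the upper cluster to 1. It assumes that direct binarization amounts to a single two-cluster pass; the function name `binarize_two_means` and the sample values are hypothetical, and the iterative refinement from the caption is not reproduced here.

```python
def binarize_two_means(values, max_iter=100):
    """Binarize a 1-D series with two-cluster k-means (Lloyd's algorithm).

    Centers start at the minimum and maximum, so neither cluster can be empty.
    Returns a list of 0/1 labels, where 1 marks the higher-valued cluster.
    """
    lo, hi = min(values), max(values)
    if lo == hi:                      # constant series: nothing to separate
        return [0] * len(values)
    labels = []
    for _ in range(max_iter):
        # Assign each value to the nearer center (ties go to the low cluster).
        labels = [1 if abs(v - hi) < abs(v - lo) else 0 for v in values]
        low = [v for v, b in zip(values, labels) if b == 0]
        high = [v for v, b in zip(values, labels) if b == 1]
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):  # centers stopped moving
            break
        lo, hi = new_lo, new_hi
    return labels

series = [0.1, 0.2, 0.15, 0.9, 1.1, 0.95, 0.2]  # made-up expression levels
print(binarize_two_means(series))
```

Applying the function to each gene separately turns a time series of measurements into a Boolean trajectory that a network learner can consume.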
Figure 2. True dynamics (left column) and the dynamics based on asynchronous simulation of the best-scoring Boolean networks learned from the data (right column) of the four systems: toy network (a–b), Jak-Stat (c–d), Smad (e–f), and budding yeast cell cycle (g–h).
The Boolean network simulated for each system is one with minimum error obtained by the KM3:REVEAL method (see Table 2).
Evaluation results for different combinations of binarization and learning methods on the four networks.
| Learning method | Toy network, KM-1 | Toy network, KM-3 | Toy network, BASC A | Jak-Stat, KM-1 | Jak-Stat, KM-3 | Jak-Stat, BASC A | Smad, KM-1 | Smad, KM-3 | Smad, BASC A | Cell cycle, KM-1 | Cell cycle, KM-3 | Cell cycle, BASC A |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| REVEAL | 0.43 | 0.007 | 0.025 | 0.28 | 0.0 | 0.26 | 0.48 | 0.0 | 0.73 | 0.52 | 0.012 | 0.52 |
| | 1 | 14 | 1 | 595 | 2237 | 2 | 1 | 12 | 1 | 6 | 559 | 1 |
| | 1 | 1 | 1 | 1 | 3 | 1 | 1 | 96 | 1 | 1 | 6 | 1 |
| | N | Y | N | N | Y | N | N | Y | N | N | Y | N |
| BESTFIT | 0.13 | 0.007 | 0.125 | 0.0 | 0.0 | 0.26 | 0.48 | 0.0 | 0.73 | 0.05 | 0.005 | 0.05 |
| | 1 | 17 | 1 | 85 | 1 | 1 | 1 | 1 | 1 | 6 | 8 | 1 |
| | 1 | 1 | 1 | 1 | 6 | 1 | 23 | 96 | 1 | 1 | 53 | 1 |
| | N | Y | N | Y | Y | N | N | Y | N | Y | Y | Y |
| FULLFIT | 0.43 | 0.15 | 0.7 | 0.0 | 0.0 | – | 0.48 | 0.0 | 0.73 | 0.4 | 0.08 | 0.4 |
| | 1 | 18 | 1 | 104 | 1 | – | 1 | 1 | 1 | 6 | 741 | 1 |
| | 1 | 1 | 1 | 1 | 6 | – | 23 | 96 | 1 | 1 | 6 | 1 |
| | N | N | N | Y | Y | – | N | Y | N | N | N | N |
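To give a sense of how a learner can turn a binarized trajectory into a Boolean rule with an associated error, here is a minimal best-fit style sketch in Python. It is not the authors' implementation of REVEAL, BESTFIT, or FULLFIT. It assumes a synchronous trajectory given as a list of 0/1 state tuples, and the function name `best_fit_inputs`, the `max_inputs` limit, and the sample data are all hypothetical. For each candidate input set it builds a truth table by majority vote and counts the transitions the table gets wrong.

```python
from collections import Counter, defaultdict
from itertools import combinations

def best_fit_inputs(trajectory, target, max_inputs=2):
    """Search input sets for one target gene and keep those with the fewest errors.

    trajectory: list of equal-length 0/1 tuples, one per time point.
    Returns (min_error, [(input_indices, truth_table), ...]).
    """
    n_genes = len(trajectory[0])
    transitions = list(zip(trajectory, trajectory[1:]))
    best_error, best = None, []
    for k in range(1, max_inputs + 1):
        for inputs in combinations(range(n_genes), k):
            # Tally the target's next value for each observed input pattern.
            votes = defaultdict(Counter)
            for cur, nxt in transitions:
                votes[tuple(cur[i] for i in inputs)][nxt[target]] += 1
            # Majority vote defines the rule; minority votes are the errors.
            error = sum(sum(c.values()) - max(c.values()) for c in votes.values())
            table = {key: c.most_common(1)[0][0] for key, c in votes.items()}
            if best_error is None or error < best_error:
                best_error, best = error, [(inputs, table)]
            elif error == best_error:
                best.append((inputs, table))
    return best_error, best

# Made-up trajectory over genes (A, B) in which B copies A with a one-step delay.
traj = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0), (1, 0)]
err, candidates = best_fit_inputs(traj, target=1)
print(err, candidates[0])
```

Several input sets can tie at the minimum error, which is why the sketch returns a list of candidates rather than a single rule.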
Figure 3. Dynamics of Boolean networks learned from 16 time points of the toy network.
(a) Time points correspond to 0 min, 5 min, 15 min, 30 min, 45 min, 1 hr, 2 hr, 3 hr, 6 hr, 8 hr, 10 hr, 12 hr, 15 hr, 18 hr, 21 hr, and 24 hr. (b) Time points are manually selected to capture the oscillatory patterns of the original system. Left panels show the time points selected, and right panels show the binary data obtained by applying KM3 to the measurements at the selected time points in the left panels. Binarized data are shifted vertically for readability. Blue, green, red, and cyan curves correspond to species A, B, C, and D, respectively.