Sandro Morganella, Pietro Zoppoli, Michele Ceccarelli.
Abstract
BACKGROUND: The ultimate aim of systems biology is to understand and describe how molecular components interact to manifest collective behaviour that is the sum of the single parts. Building a network of molecular interactions is the basic step in modelling a complex entity such as the cell. Although gene-gene interactions only partially describe real networks, because of post-transcriptional modifications and protein regulation, microarray technology makes it possible to combine measurements for thousands of genes into a single analysis step that provides a picture of the cell's gene expression. Several databases provide information about known molecular interactions, and various methods have been developed to infer gene networks from expression data. However, network topology alone is not enough to simulate and predict how a molecular system will respond to perturbations: rules for the interactions among the single parts are needed to define the network behaviour completely. Another open question is how to integrate the information carried by the network topology, which can be derived from the literature, with large-scale experimental data.
Year: 2009 PMID: 20030818 PMCID: PMC2813854 DOI: 10.1186/1471-2105-10-444
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Example of the discrete state matrix.
Figure 2. Recovery step example. (a) Data obtained by the discretisation rule defined in (1). (b) Data obtained after one iteration of the rules defined in (3) and (4). (c) Final data for which all uncertain values were recovered.
Figure 3. Modelling the gene regulatory network as a factor graph. (a) Gene regulatory network modelled as a directed graph. (b) Equivalent factor graph representation of the network in (a).
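The conversion described in the Figure 3 caption can be illustrated with a minimal sketch. It assumes that each regulated gene contributes one factor node connected to that gene and to all of its regulators; the gene names `g1` to `g3`, the `f_` prefix and the function name `to_factor_graph` are hypothetical and do not come from the article.

```python
def to_factor_graph(regulators):
    """Build a bipartite factor graph from a map {gene: [regulator genes]}.

    Assumption: one factor per regulated gene, linked to the gene itself
    and to each of its regulators.
    """
    variables = set(regulators)
    for regs in regulators.values():
        variables.update(regs)

    factors = {}
    for gene, regs in regulators.items():
        if regs:  # genes without regulators get no factor in this sketch
            factors[f"f_{gene}"] = sorted(regs) + [gene]
    return sorted(variables), factors


# Hypothetical directed network: g1 -> g2, g1 -> g3, g2 -> g3
network = {"g1": [], "g2": ["g1"], "g3": ["g1", "g2"]}
variables, factors = to_factor_graph(network)
print(variables)
print(factors)
```

Each entry of `factors` lists the variable nodes attached to one factor node, which is the bipartite structure a message-passing algorithm would operate on.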
Figure 4. Synthetic network for .
Figure 5. Synthetic network for .
Figure 6. Comparison of Kullback-Leibler divergence for (a) . The x-axis shows the biological noise levels used to generate the data sets. The y-axis shows the D values, each obtained as the mean over all tables of a network at the corresponding biological noise level. The figure reports the D values of EM-MAP obtained with different discretisation approaches, IRIS (crosses), equal frequency (diamonds), global width (squares) and equal width (circles), and the D values obtained using the IRIS algorithm in both the discretisation step and the regulation function learning process (asterisks).
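As a rough illustration of the quantity plotted in Figure 6, the sketch below computes the Kullback-Leibler divergence between a true and an inferred discrete distribution. The direction of the divergence (true against inferred), the natural logarithm, the `eps` guard against division by zero and the example probabilities are all assumptions made for this sketch; the article's exact definition is not reproduced here.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """D(p || q) = sum_x p(x) * log(p(x) / q(x)) for discrete distributions.

    Terms with p(x) == 0 contribute nothing; q(x) is floored at eps
    (an assumption of this sketch) to avoid division by zero.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical rows of a true and an inferred probability table
true_row = [0.9, 0.1]
inferred_row = [0.8, 0.2]

d_same = kl_divergence(true_row, true_row)      # identical rows give zero
d_diff = kl_divergence(true_row, inferred_row)  # differing rows give a positive value
print(d_same, d_diff)
```

Averaging such values over all tables of a network would give one point per noise level, in the spirit of the y-axis described in the caption.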
Percentage of correct entries in the inferred truth tables for synthetic networks for E. coli and S. cerevisiae.
| | | | True Table vs IRIS Inferred TT | | |
|---|---|---|---|---|---|
| Regulated Gene | Regulator Genes | TT Size | Correct | Incorrect | Undefined |
| E. coli | | | | | |
| | | 4 | 4 | 0 | 0 |
| | | 4 | 4 | 0 | 0 |
| | | 2 | 2 | 0 | 0 |
| | | 2 | 2 | 0 | 0 |
| | | 4 | 4 | 0 | 0 |
| Total | | 16 | 16 | 0 | 0 |
| Percentage | | | 100% | 0% | 0% |
| S. cerevisiae | | | | | |
| | | 2 | 2 | 0 | 0 |
| | | 2 | 2 | 0 | 0 |
| | | 2 | 2 | 0 | 0 |
| | | 4 | 4 | 0 | 0 |
| | | 2 | 2 | 0 | 0 |
| | | 2 | 1 | 1 | 0 |
| | | 2 | 2 | 0 | 0 |
| | | 2 | 2 | 0 | 0 |
| | | 2 | 2 | 0 | 0 |
| | | 4 | 4 | 0 | 0 |
| Total | | 24 | 23 | 1 | 0 |
| Percentage | | | 95.83% | 4.17% | 0% |
Truth tables (TTs) were computed using the data set with the maximum biological noise level (0.50). "TT Size" reports the number of possible state assignments for the regulator genes. Note that the TT size for the i-th regulated gene is $2^{|R_i|}$, where $|R_i|$ is the number of its regulators and each gene can be in two states. We count the correct, incorrect and undefined inferred states for each regulatory relation and report these counts as percentages of the total number of states.
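The arithmetic behind the "TT Size" and "Percentage" entries can be checked with a short sketch. With two states per gene, a gene with three regulators has 2^3 = 8 regulator state assignments. The function names `truth_table_rows` and `percentages` are hypothetical; the counts passed in the example are the second group's totals from the table above.

```python
from itertools import product


def truth_table_rows(n_regulators, n_states=2):
    """Enumerate every state assignment of the regulator genes."""
    return list(product(range(n_states), repeat=n_regulators))


def percentages(correct, incorrect, undefined):
    """Express the three counts as percentages of their total."""
    total = correct + incorrect + undefined
    return tuple(round(100 * x / total, 2) for x in (correct, incorrect, undefined))


print(len(truth_table_rows(1)))  # 2
print(len(truth_table_rows(3)))  # 8
print(percentages(23, 1, 0))     # (95.83, 4.17, 0.0)
```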
Execution time for IRIS and EM-MAP.
| Noise level | IRIS | EM-MAP | Iterations | IRIS | EM-MAP | Iterations |
|---|---|---|---|---|---|---|
| 0.10 | 0.929 s | 9.447 s | 6 | 1.833 s | 14.557 s | 5 |
| 0.15 | 0.897 s | 9.435 s | 6 | 1.892 s | 14.399 s | 5 |
| 0.20 | 0.910 s | 8.770 s | 5 | 1.853 s | 14.510 s | 5 |
| 0.25 | 0.905 s | 8.642 s | 5 | 1.874 s | 14.381 s | 5 |
| 0.30 | 0.968 s | 8.787 s | 5 | 1.876 s | 17.446 s | 6 |
| 0.35 | 0.965 s | 8.762 s | 5 | 1.790 s | 17.432 s | 6 |
| 0.40 | 1.041 s | 8.880 s | 5 | 1.825 s | 17.469 s | 6 |
| 0.45 | 0.953 s | 8.674 s | 5 | 1.839 s | 14.600 s | 5 |
| 0.50 | 1.005 s | 8.741 s | 5 | 1.860 s | 14.735 s | 5 |
Each value was obtained as the mean for 10 runs. For EM-MAP we also report the number of iterations to reach convergence.
Figure 7. Kullback-Leibler divergence for IRIS using randomised data sets. The S. cerevisiae synthetic network and the data set with biological noise of 0.5 were used in this test. The x-axis shows the percentage of columns swapped randomly. The D values reported are the means over 100 runs.
Figure 8. Network for the .
Percentage of correct entries in inferred truth tables for the S. cerevisiae mitotic cell-cycle network.
| | | | True Table vs IRIS Inferred TT | | |
|---|---|---|---|---|---|
| Regulated Gene | Regulator Genes | TT Size | Correct | Incorrect | Undefined |
| | | 2 | 2 | 0 | 0 |
| | | 2 | 2 | 0 | 0 |
| | | 2 | 1 | 0 | 1 |
| | | 2 | 2 | 0 | 0 |
| | | 4 | 4 | 0 | 0 |
| | | 8 | 7 | 0 | 1 |
| | | 4 | 2 | 2 | 0 |
| | | 8 | 6 | 1 | 1 |
| | | 8 | 6 | 1 | 1 |
| | | 4 | 1 | 2 | 1 |
| Total | | 44 | 33 | 6 | 5 |
| Percentage | | | 75% | 13.64% | 11.36% |
This table follows the same schema as Table 1.
Inference results. The "Biological Findings" column gives a short description of the features of interest and references.
| Biological Findings | Observed Genes | Hidden Genes | Inference Results |
|---|---|---|---|
| Strong relationship between cyclins | | | |
| Inhibitory activity of | | | |
| Inactivation of | | | |
| While | | | |
| Inactivation activity of | | | |
To infer the conditional probability reported in the "Inference Results" column, the sum-product algorithm was used, and the state of the "Observed Genes" is fixed to derive the potential behaviour of "Hidden Genes".
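To make the idea of fixing observed genes concrete, the sketch below computes a conditional probability on a tiny hypothetical network by brute-force enumeration of the joint distribution. This is not the sum-product algorithm itself; it is a reference calculation that should give the same marginals as sum-product on a tree-structured graph of this size. The genes `A`, `B`, `C` and every probability value are invented for illustration.

```python
from itertools import product

# Hypothetical two-state chain: A regulates B, B regulates C.
prior_a = {0: 0.5, 1: 0.5}
p_b_given_a = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8}  # key: (a, b)
p_c_given_b = {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.9}  # key: (b, c)
NAMES = ("A", "B", "C")


def joint(a, b, c):
    """Joint probability as the product of the local factors."""
    return prior_a[a] * p_b_given_a[(a, b)] * p_c_given_b[(b, c)]


def conditional(hidden, observed):
    """P(hidden gene = 1 | observed states), by summing the joint."""
    num = den = 0.0
    for states in product((0, 1), repeat=len(NAMES)):
        assign = dict(zip(NAMES, states))
        if any(assign[g] != s for g, s in observed.items()):
            continue  # skip assignments that contradict the observation
        p = joint(*states)
        den += p
        if assign[hidden] == 1:
            num += p
    return num / den


p_c_on = conditional("C", {"A": 1})  # hidden C given observed A = 1
p_a_on = conditional("A", {"C": 1})  # hidden A given observed C = 1
print(p_c_on, p_a_on)
```

Enumeration grows exponentially with the number of genes, which is why message passing on the factor graph is the practical choice for larger networks.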
Steady states for the yeast mitotic cell-cycle network obtained using IRIS
| 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 |
| 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 |
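A steady state is a state vector that the network's update rules map onto itself. The sketch below finds such states for a small deterministic Boolean network by exhaustive enumeration. It is an illustration only: the genes `x`, `y`, `z` and their rules are hypothetical, synchronous updating is assumed, and this is not presented as the procedure used with IRIS to obtain the table above.

```python
from itertools import product


def steady_states(genes, update_rules):
    """Return every state that the synchronous update maps onto itself."""
    fixed = []
    for values in product((0, 1), repeat=len(genes)):
        state = dict(zip(genes, values))
        next_state = {g: int(update_rules[g](state)) for g in genes}
        if next_state == state:
            fixed.append(values)
    return fixed


genes = ["x", "y", "z"]
rules = {
    "x": lambda s: s["x"],      # x keeps its own state
    "y": lambda s: s["x"],      # y is activated by x
    "z": lambda s: not s["y"],  # z is inhibited by y
}
print(steady_states(genes, rules))  # [(0, 0, 1), (1, 1, 0)]
```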
Figure 9. Results on the MYC subnetwork for different . Here m/n is the ratio of the number of samples to the number of genes. (a) Comparison of the Kullback-Leibler divergences of IRIS (red asterisk line) and EM-MAP (blue crossed line) for different values of m/n (x-axis). (b) Percentage of correct (blue bars), undefined (green bars) and incorrect (red bars) evaluations of IRIS for different values of m/n.
IRIS results for the MYC subnetwork including 55 genes directly connected to MYC.
| Gene | IRIS Inferred Regulation | MYC-DB Regulation |
|---|---|---|
| | Upregulation | Upregulation |
| | Upregulation | Upregulation |
| | Upregulation | Upregulation |
| | Upregulation | Upregulation |
| | Downregulation | Upregulation |
| | Undefined | Upregulation |
| | Upregulation | Upregulation |
| | Upregulation | Upregulation |
| | Undefined | Not Specified |
| | Upregulation | Upregulation |
| | Upregulation | Not Specified |
| | Upregulation | Upregulation |
| | Upregulation | Upregulation |
| | Undefined | Upregulation |
| | Upregulation | Upregulation |
| | Upregulation | Upregulation |
| | Upregulation | Upregulation |
| | Upregulation | Not Specified |
| | Undefined | Upregulation |
| | Upregulation | Upregulation |
| | Upregulation | Upregulation |
| | Upregulation | Not Specified |
| | Upregulation | Upregulation |
| | Undefined | Upregulation |
| | Upregulation | Not Present |
| | Downregulation | Not Present |
| | Upregulation | Not Present |
| | Upregulation | Not Present |
| | Upregulation | Not Present |
| | Downregulation | Not Present |
| | Downregulation | Not Present |
| | Undefined | Not Present |
| | Upregulation | Not Present |
| | Upregulation | Not Present |
| | Upregulation | Not Present |
| | Downregulation | Not Present |
| | Upregulation | Not Present |
| | Upregulation | Not Present |
| | Upregulation | Not Present |
| | Upregulation | Not Present |
| | Upregulation | Not Present |
| | Undefined | Not Present |
| | Upregulation | Not Present |
| | Upregulation | Not Present |
| | Undefined | Not Present |
| | Undefined | Not Present |
| | Upregulation | Not Present |
| | Undefined | Not Present |
| | Upregulation | Not Present |
| | Upregulation | Not Present |
| | Undefined | Not Present |
| | Downregulation | Not Present |
| | Upregulation | Not Present |
| | Undefined | Not Present |
| | Upregulation | Not Present |
Note: Not Specified indicates a gene for which an entry exists in MYC-DB but for which no information on regulation is available; Not Present indicates a gene for which no entry in MYC-DB exists; Undefined indicates a situation for which IRIS cannot distinguish between up- and down-regulation.
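The agreement between the two regulation columns can be tallied with a short sketch. The two strings below transcribe, in order, the first 24 rows of the table (the genes that have an MYC-DB entry), using the hypothetical codes `U` for Upregulation, `D` for Downregulation, `?` for Undefined and `N` for Not Specified. Only rows where MYC-DB reports a regulation are compared; this tally is a reading aid derived from the table, not a statistic quoted from the article.

```python
from collections import Counter

# IRIS inferred regulation and MYC-DB regulation, row by row (first 24 rows)
iris = "U U U U D ? U U ? U U U U ? U U U U ? U U U U ?".split()
mycdb = "U U U U U U U U N U N U U U U U U N U U U N U U".split()

# Keep only genes for which MYC-DB specifies a regulation (all are "U" here)
comparable = [i for i, m in zip(iris, mycdb) if m == "U"]
tally = Counter(comparable)
agreement = 100 * tally["U"] / len(comparable)

print(len(comparable), dict(tally), agreement)
```

Counting Undefined separately from Downregulation keeps cases where IRIS could not decide apart from cases where it disagrees with the database.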