Miguel Lopes, Gianluca Bontempi.
Abstract
Accurate inference of causal gene regulatory networks from gene expression data is an open bioinformatics challenge. Gene interactions are dynamical processes, so the effect of any regulatory action can be expected to occur after a certain temporal lag. However, such a lag is unknown a priori, and temporal aspects require specific inference algorithms. In this paper we assess the impact of taking temporal aspects into consideration on the final accuracy of the inference procedure. In particular, we compare the accuracy of static algorithms, where no dynamic aspect is considered, to that of fixed lag and adaptive lag algorithms in three inference tasks from microarray expression data. Experimental results show that network inference algorithms that take dynamics into account perform consistently better than static ones, once the considered lags are properly chosen. However, no individual algorithm stands out in all three inference tasks, and the challenging nature of network inference is evident, as a large number of the assessed algorithms do not perform better than random.
Keywords: causality inference; experimental assessment; gene network inference; static models; temporal models
Year: 2013 PMID: 24400020 PMCID: PMC3872039 DOI: 10.3389/fgene.2013.00303
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Assessed network inference models.
| Algorithm | Type | Lag | Model | Main characteristics |
| catnet | Static | – | Bayesian network | Categorization of data; stochastic search (simulated annealing) in the network space |
| | Static | – | Bayesian network | Progressive removal of edges (backwards selection); conditional dependence estimated with partial correlation |
| | Static | – | Graphical Gaussian Model | Full partial correlations estimated through shrinkage; edges are directed from the most to the least exogenous variable |
| | Dynamic | Fixed (first) | VAR | VAR(1) model subject to an L1 penalty term; regression coefficients estimated with least angle regression (lars) |
| | Dynamic | Fixed (first) | VAR | VAR(1) model subject to a variable penalty term (to favor the selection of transcription factors); regression coefficients estimated through optimization |
| | Dynamic | Fixed (first) | Dynamic Bayesian network | Estimation of a number of first order partial regression coefficients, for each possible interaction; predictors and target are lagged by 1 time point |
| | Dynamic | Estimated (one) | Information-theoretic | Mutual information used to infer dependencies (MI estimated with a copula-based approach); estimation of the lag between two genes; use of the DPI to break up fully connected triplets |
| | Dynamic | Estimated (one) | Information-theoretic | Mutual information used to infer dependencies (Gaussian assumption); estimation of the lag between two genes; mRMR feature selection |
| | Dynamic | Estimated (one) | Information-theoretic | Mutual information used to infer dependencies (Gaussian assumption); estimation of the lag between two genes; normalization of MI |
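The lag-estimation step listed for the information-theoretic methods can be illustrated with a minimal Python sketch. It assumes bivariate Gaussian data, so that mutual information reduces to -0.5 log(1 - r^2), where r is the Pearson correlation, and it picks the lag that maximizes this quantity. The function names, the inclusion of lag 0, and the simple argmax rule are illustrative assumptions, not the implementation of any of the assessed packages.

```python
import numpy as np


def gaussian_mi(x, y):
    """Mutual information (in nats) of two series under a bivariate Gaussian assumption."""
    r = np.corrcoef(x, y)[0, 1]
    # Clip r^2 just below 1 so that perfectly correlated series do not give log(0).
    r2 = min(r * r, 1.0 - 1e-12)
    return -0.5 * np.log(1.0 - r2)


def estimate_lag(source, target, max_lag):
    """Return (lag, mi) maximizing the MI between source[t] and target[t + lag].

    Lags 0..max_lag are scanned; this range is an assumption of the sketch.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(source)
    best_lag, best_mi = 0, -np.inf
    for lag in range(0, max_lag + 1):
        # Pair each source value with the target value observed `lag` points later.
        mi = gaussian_mi(source[: n - lag], target[lag:])
        if mi > best_mi:
            best_lag, best_mi = lag, mi
    return best_lag, best_mi
```

Calling `estimate_lag(regulator, target_gene, max_lag=6)` on two expression time series (hypothetical variable names) returns the candidate lag and its MI score; a full method would then apply a pruning or normalization step such as those listed in the table.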
Figure 1. Precision-recall curves.
Figure 2. Average AUPRC for the three datasets and different algorithms.
Figure 3. Existence (black) or not (white) of a significant difference between the algorithms' performance.
Figure 4. Distribution of lags for the three datasets; the maximum allowed lag is 6 time points.
Figure 5. Distribution of lags for the three datasets; the maximum allowed lag is 18 time points.
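As a hedged illustration of the AUPRC metric reported in Figure 2, the sketch below scores a ranked list of candidate edges against a gold-standard network. It uses the step-wise (average precision) estimate of the area under the precision-recall curve, which is an assumption of this sketch; the source does not state which interpolation was used, and the function name is hypothetical.

```python
import numpy as np


def average_precision(scores, labels):
    """Step-wise area under the precision-recall curve for scored candidate edges.

    scores: confidence assigned to each candidate edge (higher = more likely).
    labels: 1 if the edge is in the gold-standard network, else 0.
    Ties in scores are broken by input order, which is an arbitrary choice.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores, kind="stable")  # rank edges by decreasing score
    hits = labels[order]
    n_pos = hits.sum()
    if n_pos == 0:
        raise ValueError("the gold standard contains no true edges")
    true_positives = np.cumsum(hits)
    precision = true_positives / np.arange(1, len(hits) + 1)
    # Average the precision at the rank of each recovered true edge.
    return float((precision * hits).sum() / n_pos)
```

A random ranking yields a value close to the fraction of true edges among all candidates, which is the baseline against which "better than random" is judged.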