Patricia Menéndez, Yiannis A. I. Kourmpetis, Cajo J. F. ter Braak, Fred A. van Eeuwijk.
Abstract
A major challenge in the field of systems biology consists of predicting gene regulatory networks from different types of training data. Within the DREAM4 initiative, we took part in the multifactorial sub-challenge, which aimed to predict gene regulatory networks of size 100 from training data consisting of steady-state levels obtained after applying multifactorial perturbations to the original in silico network. Because of the static character of the challenge data, we tackled the problem via a sparse Gaussian Markov Random Field, which relates network topology to the inverse covariance matrix of the gene measurements. For the computations, we used the graphical lasso algorithm, which provided a large range of candidate network topologies. The main task was to select the optimal network topology, and to that end different model selection criteria were explored. The selected networks were compared with the gold standards and the results were ranked using the scoring metrics applied in the challenge, giving better insight into our submission and how to improve it. Our approach provides a simple statistical and computational framework to infer gene regulatory networks that is suitable for large networks, even when the number of observations (perturbations) is smaller than the number of variables (genes).
Year: 2010 PMID: 21188141 PMCID: PMC3004794 DOI: 10.1371/journal.pone.0014147
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. The experimental data.
Visualization of the gene levels for all the perturbations ordered according to the first principal component.
Figure 2. Influence of the graphical lasso penalty on network complexity and the Bayesian Information Criterion (BIC).
A: Number of edges versus penalty for data set 3 in the multifactorial challenge, with down arrows indicating the chosen penalty values associated with (from left to right) AIC, MAX_AUROC, MAX_AUPR, and BIC. The horizontal line connects the smallest and largest penalties of the 50 best BIC networks chosen for the ensemble network. B: BIC versus penalty for the five data sets.
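The penalty sweep shown in panel B pairs naturally with BIC-based selection. The sketch below is a toy illustration, not the authors' code: it fits scikit-learn's GraphicalLasso over an assumed penalty grid and picks the penalty minimizing a standard Gaussian BIC (the BIC formula, grid, and data are all assumptions here).

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 8))        # 100 perturbations x 8 genes (toy data)
n, p = X.shape
S = np.cov(X, rowvar=False)          # empirical covariance

def bic(theta):
    # Gaussian log-likelihood up to an additive constant:
    # (n/2) * (log det(Theta) - trace(S @ Theta))
    ll = (n / 2) * (np.linalg.slogdet(theta)[1] - np.trace(S @ theta))
    # free parameters: one per nonzero upper-triangular entry, plus the diagonal
    k = (np.count_nonzero(np.abs(theta) > 1e-8) - p) / 2 + p
    return -2 * ll + k * np.log(n)

alphas = np.logspace(-2, 0, 10)      # candidate penalties (illustrative grid)
scores = [bic(GraphicalLasso(alpha=a, max_iter=200).fit(X).precision_)
          for a in alphas]
best_alpha = alphas[int(np.argmin(scores))]
print("BIC-selected penalty:", best_alpha)
```

Larger penalties give sparser precision matrices and hence fewer edges, which is the trade-off panel A visualizes.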
Graphical Lasso algorithm.
[Algorithm box: the pseudocode of the graphical lasso algorithm was lost in extraction.]
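The graphical lasso step itself can be sketched with scikit-learn's GraphicalLasso estimator. This is an illustrative assumption — the paper does not state which implementation was used — and the data and penalty value below are toy values.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))       # 100 perturbations x 10 genes (toy data)

model = GraphicalLasso(alpha=0.1).fit(X)
precision = model.precision_         # estimated inverse covariance (Theta)

# In a Gaussian Markov Random Field, a zero off-diagonal entry of Theta means
# the two genes are conditionally independent, i.e. there is no edge between them.
adjacency = (np.abs(precision) > 1e-8) & ~np.eye(10, dtype=bool)
print(adjacency.sum() // 2, "undirected edges")
```

The L1 penalty `alpha` controls sparsity, so varying it over a grid yields the range of candidate network topologies described in the abstract.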
Figure 3. Performance of the five network reconstruction methods.
The ROC and PR curves for the five methods (Ensemble, AIC, BIC, MAX_AUPR and MAX_AUROC) are vertical averages of the per-data-set curves across the five data sets.
Average performance measures for different network reconstructions across data sets with standard deviations in parentheses.
| Measures | Ensemble | AIC | BIC | MAX-AUPR | MAX-AUROC | MAX |
| AUPR | 0.23 (0.04) | 0.05 (0.01) | 0.26 (0.06) | 0.28 (0.06) | 0.26 (0.08) | 0.23 (0.03) |
| AUROC | 0.67 (0.05) | 0.58 (0.02) | 0.65 (0.04) | 0.68 (0.04) | 0.69 (0.04) | 0.68 (0.04) |
| Pr1Rec | 0.84 (0.26) | 0.18 (0.27) | 1.00 (0.00) | 0.82 (0.25) | 0.79 (0.32) | 0.81 (0.27) |
| Pr10Rec | 0.66 (0.13) | 0.06 (0.02) | 0.83 (0.14) | 0.83 (0.13) | 0.79 (0.32) | 0.65 (0.13) |
| Pr50Rec | 0.10 (0.04) | 0.05 (0.01) | 0.07 (0.02) | 0.10 (0.04) | 0.11 (0.04) | 0.11 (0.04) |
| Pr80Rec | 0.05 (0.00) | 0.04 (0.01) | 0.05 (0.01) | 0.05 (0.01) | 0.05 (0.01) | 0.05 (0.01) |
| AUPR score | 36.58 | 2.25 | 43.00 | 47.30 | 43.55 | 35.29 |
| AUROC score | 11.19 | 3.49 | 8.52 | 11.26 | 12.80 | 11.79 |
| Overall score | 23.89 | 2.87 | 25.76 | 29.28 | 28.17 | 23.54 |
| Best penalty | − | −6.00 (0.00) | −2.38 (0.08) | −2.60 (0.07) | −2.94 (0.34) | − |
Pr1Rec, Pr10Rec, Pr50Rec, Pr80Rec represent precision at 1%, 10%, 50%, and 80% recall. The last row shows the best penalty value.
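The measures in these tables (AUROC, AUPR, and precision at fixed recall) can be computed from edge confidence scores against a gold-standard edge list. The snippet below is a toy illustration using scikit-learn's metrics, not the official DREAM4 scoring scripts, and the precision-at-recall convention is an assumption.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, precision_recall_curve, auc

rng = np.random.default_rng(2)
truth = rng.integers(0, 2, size=200)            # gold-standard edge labels (toy)
scores = truth * 0.5 + rng.random(200) * 0.8    # noisy edge confidences (toy)

auroc = roc_auc_score(truth, scores)
prec, rec, _ = precision_recall_curve(truth, scores)
aupr = auc(rec, prec)                           # area under the PR curve

def precision_at_recall(target):
    # highest precision achieved at recall >= target (one common convention)
    return prec[rec >= target].max()

print(f"AUROC={auroc:.2f}  AUPR={aupr:.2f}  Pr50Rec={precision_at_recall(0.5):.2f}")
```

In the tables, the AUPR and AUROC scores are the challenge's p-value-based transforms of these quantities, and the overall score is their average (e.g. (36.58 + 11.19) / 2 = 23.89 in the Ensemble column).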
Average performance measures for different network reconstructions across data sets when only half of the perturbations were used, with standard deviations in parentheses.
| Measures | Ensemble | AIC | BIC | MAX-AUPR | MAX-AUROC | MAX |
| AUPR | 0.18 (0.04) | 0.19 (0.05) | 0.21 (0.05) | 0.22 (0.06) | 0.22 (0.06) | 0.18 (0.03) |
| AUROC | 0.64 (0.05) | 0.63 (0.04) | 0.64 (0.04) | 0.63 (0.04) | 0.64 (0.05) | 0.64 (0.05) |
| Pr1Rec | 0.83 (0.24) | 0.82 (0.26) | 0.82 (0.26) | 0.92 (0.18) | 0.89 (0.26) | 0.83 (0.24) |
| Pr10Rec | 0.53 (0.17) | 0.64 (0.23) | 0.69 (0.16) | 0.73 (0.14) | 0.70 (0.16) | 0.53 (0.16) |
| Pr50Rec | 0.07 (0.02) | 0.07 (0.02) | 0.07 (0.02) | 0.07 (0.02) | 0.07 (0.02) | 0.07 (0.02) |
| Pr80Rec | 0.05 (0.01) | 0.04 (0.01) | 0.04 (0.01) | 0.04 (0.01) | 0.05 (0.01) | 0.05 (0.01) |
| AUPR score | 26.83 | 28.07 | 32.51 | 35.55 | 34.32 | 26.58 |
| AUROC score | 7.99 | 7.29 | 7.69 | 7.60 | 8.39 | 8.02 |
| Overall score | 17.41 | 17.68 | 20.10 | 21.57 | 21.36 | 17.30 |
| Best penalty | − | −3.00 (0.00) | −2.72 (0.27) | −2.41 (0.15) | −2.66 (0.23) | − |
Pr1Rec, Pr10Rec, Pr50Rec, Pr80Rec represent precision at 1%, 10%, 50%, and 80% recall. The last row shows the best penalty value.
Average performance measures for networks reconstructed by thresholding correlations at 0 (REL), 0.4, and 0.8, with standard deviations in parentheses.
| Measures | 0 (REL), 100 pert. | 0.4, 100 pert. | 0.8, 100 pert. | 0 (REL), 50 pert. | 0.4, 50 pert. | 0.8, 50 pert. |
| AUPR | 0.26 (0.06) | 0.19 (0.07) | 0.05 (0.02) | 0.24 (0.06) | 0.20 (0.06) | 0.06 (0.01) |
| AUROC | 0.74 (0.02) | 0.61 (0.04) | 0.51 (0.01) | 0.70 (0.04) | 0.61 (0.03) | 0.51 (0.00) |
| Pr1Rec | 0.73 (0.30) | 0.73 (0.30) | 0.38 (0.43) | 0.84 (0.36) | 0.84 (0.36) | 0.84 (0.36) |
| Pr10Rec | 0.67 (0.18) | 0.67 (0.17) | 0.05 (0.01) | 0.73 (0.15) | 0.73 (0.15) | 0.05 (0.01) |
| Pr50Rec | 0.16 (0.04) | 0.06 (0.01) | 0.04 (0.01) | 0.11 (0.03) | 0.06 (0.01) | 0.04 (0.01) |
| Pr80Rec | 0.06 (0.00) | 0.04 (0.01) | 0.04 (0.01) | 0.05 (0.01) | 0.04 (0.01) | 0.04 (0.01) |
| AUPR score | 45.53 | 30.75 | 2.30 | 39.12 | 32.02 | 3.24 |
| AUROC score | 17.75 | 5.63 | 0.45 | 13.48 | 5.84 | 0.52 |
| Overall score | 31.64 | 18.19 | 1.38 | 26.30 | 18.93 | 1.88 |
Pr1Rec, Pr10Rec, Pr50Rec, Pr80Rec represent precision at 1%, 10%, 50%, and 80% recall when all (100) and only half of the perturbations (50) are considered.
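The correlation-threshold baselines in this table can be sketched as follows. This is a toy illustration on random data: the authors' preprocessing and the exact construction of the relevance network (REL, threshold 0) are assumptions here.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 10))            # perturbations x genes (toy data)

corr = np.corrcoef(X, rowvar=False)       # gene-gene Pearson correlations
np.fill_diagonal(corr, 0.0)               # ignore self-edges

# Keep an edge wherever |correlation| exceeds the threshold; threshold 0
# keeps every pair, i.e. the fully connected relevance network (REL).
for threshold in (0.0, 0.4, 0.8):
    adjacency = np.abs(corr) > threshold
    print(threshold, adjacency.sum() // 2, "edges")
```

Unlike the graphical lasso, which uses partial correlations (the inverse covariance), this baseline scores marginal correlations, which is why its performance drops sharply at high thresholds in the table.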