Andrea Rau, Florence Jaffrézic, Grégory Nuel.
Abstract
BACKGROUND: In recent years, there has been great interest in using transcriptomic data to infer gene regulatory networks. For the time being, methodological development in this area has primarily made use of graphical Gaussian models for observational wild-type data, resulting in undirected graphs that are not able to accurately highlight causal relationships among genes. In the present work, we seek to improve the estimation of causal effects among genes by jointly modeling observational transcriptomic data with arbitrarily complex intervention data obtained by performing partial, single, or multiple gene knock-outs or knock-downs.
Year: 2013 PMID: 24172639 PMCID: PMC3834107 DOI: 10.1186/1752-0509-7-111
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Figure 1. Graph structure used in simulation study. Graph structure taken from [9] used for the simulation study for a graph with ten nodes and 21 edges.
Comparison of methods for total causal effects for simulated data with moderate variability ( )
| Design | Metric | | | | |
|---|---|---|---|---|---|
| Observation only | AUROC | 0.749 (0.043) | — | 0.76 (0.062) | 0.643 (0.079) |
| | AUPRC | 0.638 (0.053) | — | 0.628 (0.078) | 0.527 (0.088) |
| | Spearman | 0.48 (0.091) | — | 0.491 (0.128) | 0.254 (0.177) |
| | MSE | 0.056 (0.007) | — | 0.182 (0.054) | 0.126 (0.034) |
| Mixed | AUROC | 0.948 (0.03) | 0.825 (0.048) | 0.733 (0.068) | 0.67 (0.073) |
| | AUPRC | 0.868 (0.042) | 0.737 (0.059) | 0.569 (0.087) | 0.53 (0.091) |
| | Spearman | 0.696 (0.053) | 0.553 (0.097) | 0.42 (0.14) | 0.318 (0.186) |
| | MSE | 0.026 (0.012) | 0.104 (0.011) | 0.334 (0.137) | 0.196 (0.067) |
| Partial KO | AUROC | 0.845 (0.059) | 0.795 (0.017) | 0.736 (0.056) | 0.646 (0.085) |
| | AUPRC | 0.734 (0.078) | 0.725 (0.038) | 0.588 (0.075) | 0.514 (0.092) |
| | Spearman | 0.587 (0.104) | 0.636 (0.034) | 0.449 (0.099) | 0.285 (0.187) |
| | MSE | 0.035 (0.015) | 0.081 (0.008) | 0.215 (0.066) | 0.146 (0.049) |
| Multiple KO | AUROC | 0.959 (0.016) | 0.83 (0.035) | 0.733 (0.068) | 0.67 (0.073) |
| | AUPRC | 0.886 (0.028) | 0.725 (0.039) | 0.569 (0.087) | 0.53 (0.091) |
| | Spearman | 0.712 (0.028) | 0.625 (0.058) | 0.42 (0.14) | 0.318 (0.186) |
| | MSE | 0.015 (0.006) | 0.107 (0.008) | 0.334 (0.137) | 0.196 (0.067) |
| Multiple KO (3 hidden genes) | AUROC | 0.932 (0.046) | 0.574 (0.165) | 0.58 (0.145) | 0.562 (0.121) |
| | AUPRC | 0.539 (0.078) | 0.36 (0.105) | 0.353 (0.086) | 0.35 (0.08) |
| | Spearman | 0.67 (0.109) | 0.037 (0.372) | 0.076 (0.316) | 0.076 (0.31) |
| | MSE | 0.044 (0.034) | 0.15 (0.041) | 0.45 (0.225) | 0.294 (0.124) |
Several intervention designs were simulated: 1) 20 observational (wild-type) replicates with no interventions, 2) mixed setting with 10 wild-types and one knock-out per gene, 3) partial knock-out design with 15 wild-types and one knock-out for five genes {N1, N4, N6, N7, N9}, 4) multiple knock-out design with 10 wild-types, one knock-out per gene, and five double knock-outs: {N1, N5}, {N1, N6}, {N4, N7}, {N6, N9}, and {N7, N10}, and 5) a multiple knock-out design as in the previous setting, with three hidden variables. Results were averaged over 100 simulations (standard deviations in parentheses): area under the ROC curve (AUROC), area under the precision-recall curve (AUPRC), Spearman correlation between true and estimated total causal effects, and mean squared error (MSE) of estimated total causal effects.
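The four criteria reported in this table can be computed along the following lines. This is a minimal sketch in plain Python, not the evaluation code used for the study. It assumes that a gene pair counts as a true edge when its true effect is nonzero, that pairs are ranked by the absolute value of the estimated effect, and that AUPRC is approximated by average precision. The effect values are made up for illustration.

```python
from statistics import mean


def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def auroc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney rank statistic."""
    r = ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(ri for ri, y in zip(r, labels) if y)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def auprc(labels, scores):
    """Average precision, one common estimate of the area under the PR curve."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for k, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            total += hits / k
    return total / sum(labels)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def mse(x, y):
    """Mean squared error between true and estimated effects."""
    return mean((a - b) ** 2 for a, b in zip(x, y))


# Hypothetical true and estimated effects for six ordered gene pairs.
true_effects = [0.8, 0.0, -0.5, 0.0, 0.3, 0.0]
estimated = [0.7, 0.1, -0.4, 0.05, 0.2, -0.02]

is_edge = [int(t != 0) for t in true_effects]  # assumption: nonzero effect = edge
score = [abs(e) for e in estimated]            # assumption: rank by |estimate|

print(auroc(is_edge, score), auprc(is_edge, score))
print(spearman(true_effects, estimated), mse(true_effects, estimated))
```

AUROC and AUPRC only depend on how the pairs are ranked, whereas the Spearman correlation and MSE also reflect how close the estimated effect sizes are to the true ones, which is why a method can score well on one pair of criteria and poorly on the other.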
Comparison of methods for direct causal effects for simulated data with moderate variability ( )
| Design | Metric | | | | |
|---|---|---|---|---|---|
| Observation only | AUROC | 0.79 (0.041) | — | 0.773 (0.064) | 0.651 (0.083) |
| | AUPRC | 0.633 (0.061) | — | 0.577 (0.085) | 0.472 (0.102) |
| | Spearman | 0.474 (0.094) | — | 0.484 (0.122) | 0.246 (0.17) |
| | MSE | 0.059 (0.006) | — | 0.193 (0.057) | 0.138 (0.035) |
| Mixed | AUROC | 0.951 (0.03) | 0.842 (0.051) | 0.746 (0.06) | 0.678 (0.073) |
| | AUPRC | 0.841 (0.051) | 0.688 (0.084) | 0.5 (0.081) | 0.465 (0.091) |
| | Spearman | 0.668 (0.055) | 0.534 (0.097) | 0.409 (0.132) | 0.306 (0.181) |
| | MSE | 0.048 (0.015) | 0.107 (0.01) | 0.35 (0.131) | 0.211 (0.067) |
| Partial KO | AUROC | 0.871 (0.064) | 0.784 (0.018) | 0.749 (0.068) | 0.655 (0.094) |
| | AUPRC | 0.721 (0.089) | 0.663 (0.054) | 0.532 (0.088) | 0.459 (0.115) |
| | Spearman | 0.574 (0.106) | 0.606 (0.04) | 0.437 (0.104) | 0.272 (0.185) |
| | MSE | 0.055 (0.015) | 0.088 (0.007) | 0.228 (0.068) | 0.161 (0.052) |
| Multiple KO | AUROC | 0.962 (0.017) | 0.839 (0.032) | 0.746 (0.06) | 0.678 (0.073) |
| | AUPRC | 0.864 (0.034) | 0.69 (0.046) | 0.5 (0.081) | 0.465 (0.091) |
| | Spearman | 0.683 (0.033) | 0.614 (0.051) | 0.409 (0.132) | 0.306 (0.181) |
| | MSE | 0.038 (0.009) | 0.108 (0.008) | 0.35 (0.131) | 0.211 (0.067) |
| Multiple KO (3 hidden genes) | AUROC | 0.94 (0.045) | 0.561 (0.189) | 0.576 (0.156) | 0.555 (0.133) |
| | AUPRC | 0.483 (0.085) | 0.288 (0.107) | 0.279 (0.078) | 0.276 (0.073) |
| | Spearman | 0.633 (0.106) | 0.048 (0.37) | 0.07 (0.311) | 0.064 (0.305) |
| | MSE | 0.069 (0.048) | 0.149 (0.032) | 0.454 (0.207) | 0.296 (0.109) |
Several intervention designs were simulated: 1) 20 observational (wild-type) replicates with no interventions, 2) mixed setting with 10 wild-types and one knock-out per gene, 3) partial knock-out design with 15 wild-types and one knock-out for five genes {N1, N4, N6, N7, N9}, 4) multiple knock-out design with 10 wild-types, one knock-out per gene, and five double knock-outs: {N1, N5}, {N1, N6}, {N4, N7}, {N6, N9}, and {N7, N10}, and 5) a multiple knock-out design as in the previous setting, with three hidden variables. Results were averaged over 100 simulations (standard deviations in parentheses): area under the ROC curve (AUROC), area under the precision-recall curve (AUPRC), Spearman correlation between true and estimated direct causal effects, and mean squared error (MSE) of estimated direct causal effects.
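The two sets of results distinguish direct from total causal effects. One way to see how they relate is the following minimal sketch, which assumes a linear structural equation model on a directed acyclic graph (an assumption introduced here for illustration; the model is not specified in this section). Under that assumption, if `direct[i][j]` is the direct effect of gene i on gene j, the total effect of i on j sums the products of direct effects along every directed path from i to j. The three-gene matrix is hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def total_effects(direct):
    """Sum B + B^2 + ... + B^(n-1), where B is the direct-effect matrix.

    The sum is finite because a directed path in a DAG with n nodes has
    at most n - 1 edges, so higher powers of B are zero.
    """
    n = len(direct)
    total = [row[:] for row in direct]
    power = [row[:] for row in direct]
    for _ in range(n - 2):
        power = matmul(power, direct)
        total = [[t + p for t, p in zip(tr, pr)]
                 for tr, pr in zip(total, power)]
    return total


# Hypothetical direct effects among three genes: N1 -> N2, N2 -> N3, N1 -> N3.
direct = [
    [0.0, 0.5, 0.1],
    [0.0, 0.0, 2.0],
    [0.0, 0.0, 0.0],
]
total = total_effects(direct)

# Total effect of N1 on N3: 0.1 (direct edge) + 0.5 * 2.0 (path through N2).
print(total[0][2])
```

When a gene pair is connected by a single edge and no longer path, the direct and total effects coincide, as for N1 on N2 and N2 on N3 here.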
Figure 2. Posterior distribution of node orders from the MCMC-Mallows approach, averaged over 100 simulations. Results from simulation setting with σ = 0.1: Observations only (top left), complete single knock-outs (top right), partial single knock-outs (bottom left), multiple knock-outs (bottom right). Node labels are included on the vertical axis, estimated positions within causal orderings along the horizontal axis, and the intensity of color of each square corresponds to the average proportion of iterations in which a given node was placed in a given position. As the causal node ordering is not unique for this DAG, true potential positions for each node are outlined in black.
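The two quantities shown in this figure can be sketched as follows: the proportion of sampled orderings in which each node occupies each position, and the set of positions a node may take in any ordering consistent with the graph. This is an illustrative sketch only; the function names, the three-node graph, and the sampled orderings are hypothetical, and the brute-force enumeration is practical only for small graphs.

```python
from itertools import permutations


def position_frequencies(orderings, nodes):
    """freq[node][pos] = proportion of orderings that place node at pos."""
    counts = {node: [0] * len(nodes) for node in nodes}
    for ordering in orderings:
        for pos, node in enumerate(ordering):
            counts[node][pos] += 1
    return {node: [c / len(orderings) for c in row]
            for node, row in counts.items()}


def possible_positions(nodes, edges):
    """Positions each node can take over all orderings that respect the DAG.

    An ordering respects the DAG when every parent precedes its child.
    """
    allowed = {node: set() for node in nodes}
    for perm in permutations(nodes):
        index = {node: i for i, node in enumerate(perm)}
        if all(index[parent] < index[child] for parent, child in edges):
            for node, i in index.items():
                allowed[node].add(i)
    return allowed


nodes = ["N1", "N2", "N3"]
edges = [("N1", "N3")]  # hypothetical DAG with the single edge N1 -> N3

# Hypothetical orderings, standing in for the iterations of a sampler.
samples = [
    ["N1", "N2", "N3"],
    ["N1", "N3", "N2"],
    ["N1", "N2", "N3"],
    ["N2", "N1", "N3"],
]

freq = position_frequencies(samples, nodes)
allowed = possible_positions(nodes, edges)
for node in nodes:
    print(node, freq[node], sorted(allowed[node]))
```

Each row of `freq` corresponds to one row of squares in a panel, and `allowed` corresponds to the outlined squares: a node with no parents or children, such as N2 here, can legitimately appear anywhere in the ordering.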
Figure 3. Comparison of methods on data with a complete design from the DREAM4 challenge. ROC curves (top) and precision-recall curves (bottom) for the five simulated 10-gene networks of the DREAM4 challenge [13] for the MCMC-Mallows, Pinna, and IDA (optimistic and pessimistic) methods.
Comparison of methods on complete DREAM4 data
| Metric | Network | Petri Nets [17] | MCMC-Mallows | Pinna et al. | IDA (optimistic) | IDA (pessimistic) |
|---|---|---|---|---|---|---|
| AUROC | 1 | 0.972 | 0.447 | 0.833 | 0.448 | 0.413 |
| | 2 | 0.841 | 0.647 | 0.584 | 0.610 | 0.641 |
| | 3 | 0.900 | 0.717 | 0.816 | 0.638 | 0.638 |
| | 4 | 0.954 | 0.867 | 0.899 | 0.554 | 0.483 |
| | 5 | 0.928 | 0.814 | 0.700 | 0.599 | 0.534 |
| AUPRC | 1 | 0.916 | 0.183 | 0.506 | 0.142 | 0.133 |
| | 2 | 0.547 | 0.289 | 0.331 | 0.243 | 0.284 |
| | 3 | 0.968 | 0.340 | 0.416 | 0.242 | 0.242 |
| | 4 | 0.852 | 0.633 | 0.664 | 0.162 | 0.158 |
| | 5 | 0.761 | 0.308 | 0.278 | 0.146 | 0.156 |
| DREAM score | overall | 7.127 | 2.579 | 3.563 | 0.735 | 0.723 |
Area under the ROC curve (AUROC), area under the precision-recall curve (AUPRC), and DREAM score for each of the five DREAM4 datasets for the Petri Nets [17], MCMC-Mallows, Pinna et al., and IDA (optimistic and pessimistic) methods. Results for the Petri Nets method [17] and evaluation scripts for the overall DREAM score were obtained from the DREAM4 evaluation page, located at http://wiki.c2b2.columbia.edu/dream/results/DREAM4.
Figure 4. Comparison of methods on data with a partial design from the DREAM4 challenge. ROC curves (top) and precision-recall curves (bottom) for the five simulated 10-gene networks of the DREAM4 challenge [13], where for each dataset five knock-outs were removed at random, for the MCMC-Mallows, Pinna, and IDA (optimistic and pessimistic) methods.
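The partial design described in this caption can be imitated with a few lines of Python. This is a sketch under stated assumptions, not the procedure used to build the datasets: the gene names `G1` to `G10`, the helper `drop_knockouts`, and the seed are all hypothetical.

```python
import random


def drop_knockouts(knockout_genes, n_removed, seed=None):
    """Return the knock-out experiments kept after removing n_removed at random.

    The original order of the remaining genes is preserved.
    """
    rng = random.Random(seed)
    removed = set(rng.sample(knockout_genes, n_removed))
    return [gene for gene in knockout_genes if gene not in removed]


# Hypothetical complete design: one single-gene knock-out for each of ten genes.
genes = [f"G{i}" for i in range(1, 11)]

# Partial design: five of the ten knock-outs are removed at random.
kept = drop_knockouts(genes, n_removed=5, seed=1)
print(kept)
```

Fixing the seed makes a given partial design reproducible, which matters when several methods are to be compared on exactly the same subset of knock-outs.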
Comparison of methods on partial DREAM4 data
| Metric | Network | MCMC-Mallows | Pinna et al. | IDA (optimistic) | IDA (pessimistic) |
|---|---|---|---|---|---|
| AUROC | 1 | 0.708 | 0.555 | 0.448 | 0.413 |
| | 2 | 0.525 | 0.637 | 0.610 | 0.641 |
| | 3 | 0.711 | 0.498 | 0.638 | 0.638 |
| | 4 | 0.748 | 0.682 | 0.554 | 0.483 |
| | 5 | 0.676 | 0.565 | 0.599 | 0.534 |
| AUPRC | 1 | 0.344 | 0.346 | 0.142 | 0.133 |
| | 2 | 0.240 | 0.306 | 0.243 | 0.284 |
| | 3 | 0.271 | 0.214 | 0.242 | 0.242 |
| | 4 | 0.322 | 0.494 | 0.162 | 0.158 |
| | 5 | 0.194 | 0.168 | 0.146 | 0.156 |
| DREAM score | overall | 1.844 | 1.450 | 0.735 | 0.723 |
Area under the ROC curve (AUROC), area under the precision-recall curve (AUPRC), and DREAM score for each of the five DREAM4 partial datasets, where only five of the single-gene knock-outs are included, for the MCMC-Mallows, Pinna et al., and IDA (optimistic and pessimistic) methods. Results for the Petri Nets method [17] are not provided as no software is publicly available to implement this approach. Evaluation scripts for the overall DREAM score were obtained from the DREAM4 evaluation page, located at http://wiki.c2b2.columbia.edu/dream/results/DREAM4.