Sisi Ma, Patrick Kemmeren, David Gresham, Alexander Statnikov.
Abstract
De-novo reverse-engineering of genome-scale regulatory networks is a fundamental problem of biological and translational research. One of the major obstacles in developing and evaluating approaches for de-novo gene network reconstruction is the absence of high-quality genome-scale gold-standard networks of direct regulatory interactions. To establish a foundation for assessing the accuracy of de-novo gene network reverse-engineering, we constructed high-quality genome-scale gold-standard networks of direct regulatory interactions in Saccharomyces cerevisiae that incorporate binding and gene knockout data. Then we used 7 performance metrics to assess the accuracy of 18 statistical association-based approaches for de-novo network reverse-engineering in 13 different datasets spanning 4 data types. We found that most reconstructed networks had statistically significant accuracies. We also determined which statistical approaches and datasets/data types lead to networks with better reconstruction accuracies. While we found that de-novo reverse-engineering of the entire network is a challenging problem, it is possible to reconstruct sub-networks around some transcription factors with good accuracy. These transcription factors can be identified by assessing their connectivity in the inferred networks. Overall, this study provides the gene network reverse-engineering community with a rigorous assessment of the accuracy of S. cerevisiae gene network reconstruction and of the variability in performance of various approaches for learning both the entire network and sub-networks around transcription factors.
Year: 2014 PMID: 25215507 PMCID: PMC4162580 DOI: 10.1371/journal.pone.0106479
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Construction of gene regulatory networks by integrating targeted perturbation data with binding data.
The relations in the constructed gene regulatory network correspond to direct regulatory interactions.
Assessment of currently available genome-scale gold-standard networks used by prior gene network reverse-engineering studies.
| Gold-standard | Description | Limitations | Used by |
| #1 | E. coli network from RegulonDB, a curated database of regulatory interactions obtained through literature search | • Unknown quality • Heterogeneous data sources and experimental methods | DREAM2 |
| #2 | S. cerevisiae network from binding data | • Binding relations can be non-functional • Higher-quality binding data exist | |
| #3 | S. cerevisiae network from binding data | • Binding relations can be non-functional | DREAM5 |
| #4 | S. cerevisiae network from YEASTRACT, a curated database of regulatory interactions obtained through literature search | • Unknown quality • Heterogeneous data sources and experimental methods | DREAM5 |
| #5 | S. cerevisiae network from deletion mutants | • Inferred transcription relations can be indirect | DREAM5 |
Overlapping identified binding with regulatory relations results in gold-standard networks with direct regulatory relations.
| Gold-standard network # | Binding Network | Regulatory network | Gold-standard network (integrating binding and regulatory networks) | |||
| 1 | 0.001 | 2 | 4,034 | 991,444 | 1,083 | <10⁻¹⁶ |
| 2 | 0.005 | 1 | 8,392 | | 1,785 | <10⁻¹⁶ |
| 3 | 0.005 | 0 | 13,050 | | 2,403 | <10⁻¹⁶ |
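The integration summarized in this table amounts to keeping only the edges that appear in both the binding network and the regulatory network. The sketch below illustrates that step on toy data. The gene names are hypothetical, and the hypergeometric tail used for the overlap p-value is an assumption made for illustration, since this record does not state which test the authors applied.

```python
from math import comb


def overlap_edges(binding, regulatory):
    """Return the (regulator, target) edges present in both networks."""
    return set(binding) & set(regulatory)


def hypergeom_tail(overlap, n_binding, n_regulatory, n_pairs):
    """P(X >= overlap) if n_binding edges were drawn at random from
    n_pairs candidate pairs, n_regulatory of which are regulatory.
    Assumption: a hypergeometric null; the source does not name its test."""
    upper = min(n_binding, n_regulatory)
    total = comb(n_pairs, n_binding)
    hits = sum(
        comb(n_regulatory, k) * comb(n_pairs - n_regulatory, n_binding - k)
        for k in range(overlap, upper + 1)
    )
    return hits / total


# Toy example with hypothetical gene names.
binding = {("TF_A", "g1"), ("TF_A", "g2"), ("TF_B", "g3")}
regulatory = {("TF_A", "g1"), ("TF_B", "g3"), ("TF_C", "g4")}
gold = overlap_edges(binding, regulatory)
p_value = hypergeom_tail(len(gold), len(binding), len(regulatory), n_pairs=10)
```

An edge kept this way is supported by both physical binding and a knockout effect, which is why the resulting relations are interpreted as direct.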
Figure 2. Gold-standard gene regulatory network #1.
Transcription factors are shown as large blue circles, and other genes are shown as small green circles. Edges in the network represent direct regulatory interactions. Inhibiting edges are shown in red, and excitatory edges are shown in black.
Figure 3. Direct regulatory interactions between transcription factors in gold-standard gene regulatory network #1.
Inhibiting edges are shown in red, and excitatory edges are shown in black.
Figure 4. Topological analysis of gold-standard gene regulatory network #1.
The analysis was performed in Cytoscape with NetworkAnalyzer.
Sensitivity and specificity.
| | | | Observational (Biological Replicates) | | Observational (Environment/Time) | | | | Semi-experimental (Compendium) | | Experimental | | | | |
| Statistics | Conditioning | Post-Processing | Holstege1 | Holstege2 | Gresham | Gasch | Smith | Yeung | M3D | GPL90 | Hughes1 | Hughes2 | Hu | Holstege3 | Holstege4 |
| Fisher's Z | None | FDR, “AND” rule | 0.63 / 0.37 | | | | | 0.75 / 0.19 | 0.74 / 0.26 | | | | 0.33 / 0.68 | 0.73 / 0.25 | |
| Fisher's Z | None | FDR, “OR” rule | 0.65 / 0.35 | | | | | 0.77 / 0.18 | 0.76 / 0.25 | | | | 0.36 / 0.64 | 0.74 / 0.24 | |
| Fisher's Z | None | None | 0.68 / 0.32 | | | | | 0.78 / 0.17 | 0.77 / 0.24 | | | | 0.43 / 0.57 | 0.77 / 0.22 | |
| Fisher's Z | 1 gene | “AND” rule | 0.06 / 0.94 | | 0.03 / 0.97 | | | | | | | | 0.07 / 0.94 | 0.12 / 0.88 | |
| Fisher's Z | 1 gene | “OR” rule | | | 0.04 / 0.97 | | | | | | | | 0.07 / 0.93 | 0.13 / 0.88 | |
| Fisher's Z | 2 genes | “AND” rule | 0.01 / 0.99 | 0.01 / 0.99 | | | | | | | | | 0.02 / 0.98 | | |
| Fisher's Z | 2 genes | “OR” rule | 0.02 / 0.98 | | | | | | | | | | | | |
| Fisher's Z | 3 genes | “AND” rule | 0.01 / 1.00 | 0.01 / 1.00 | 0.00 / 1.00 | | | | | | | | | | |
| Fisher's Z | 3 genes | “OR” rule | | 0.01 / 0.99 | 0.01 / 0.99 | | | | | | | | | | |
| G2 | None | FDR, “AND” rule | 0.39 / 0.59 | 0.62 / 0.37 | 0.49 / 0.50 | | | 0.80 / 0.14 | | | | | 0.21 / 0.80 | 0.69 / 0.30 | |
| G2 | None | FDR, “OR” rule | 0.44 / 0.54 | 0.66 / 0.33 | 0.56 / 0.44 | | | 0.82 / 0.12 | | | | | 0.26 / 0.74 | 0.72 / 0.27 | |
| G2 | None | None | 0.50 / 0.49 | 0.69 / 0.31 | 0.60 / 0.38 | | | 0.83 / 0.12 | | | | | 0.36 / 0.65 | 0.76 / 0.23 | |
| G2 | 1 gene | “AND” rule | 0.04 / 0.96 | 0.14 / 0.86 | 0.59 / 0.39 | | | | | | | | | 0.14 / 0.85 | |
| G2 | 1 gene | “OR” rule | 0.04 / 0.95 | 0.15 / 0.85 | 0.60 / 0.38 | | | | | | | | | 0.16 / 0.83 | |
| G2 | 2 genes | “AND” rule | 0.04 / 0.96 | 0.14 / 0.86 | 0.52 / 0.47 | | | | | | | | | 0.02 / 0.98 | |
| G2 | 2 genes | “OR” rule | 0.04 / 0.95 | 0.15 / 0.85 | 0.60 / 0.39 | | | | | | | | | 0.04 / 0.96 | |
| G2 | 3 genes | “AND” rule | 0.04 / 0.96 | 0.09 / 0.91 | 0.09 / 0.89 | | | | | | | | | 0.02 / 0.98 | |
| G2 | 3 genes | “OR” rule | 0.04 / 0.95 | 0.14 / 0.86 | 0.28 / 0.73 | | | | | | | | | 0.04 / 0.96 | |
Cells with bold font correspond to experiments with statistically significant reconstruction of regulatory networks. See for abbreviations of row labels. See part A for a colored version of this table.
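Each cell in the table above reports a sensitivity and a specificity for one inferred network compared against the gold standard. As a minimal sketch of how such a pair can be computed, the code below counts true and false positives and negatives over all gene pairs. It assumes undirected edges and that every unordered gene pair is a candidate edge; both are simplifications for illustration, and the gene names are hypothetical.

```python
from itertools import combinations


def confusion_counts(genes, gold_edges, inferred_edges):
    """Count TP, FP, FN, TN over all unordered gene pairs.
    Assumption: edges are undirected and every pair is a candidate."""
    gold = {frozenset(e) for e in gold_edges}
    inferred = {frozenset(e) for e in inferred_edges}
    tp = fp = fn = tn = 0
    for pair in combinations(genes, 2):
        key = frozenset(pair)
        if key in gold and key in inferred:
            tp += 1
        elif key in inferred:
            fp += 1
        elif key in gold:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def sensitivity_specificity(tp, fp, fn, tn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


genes = ["a", "b", "c", "d"]
gold_edges = [("a", "b"), ("a", "c")]
inferred_edges = [("a", "b"), ("c", "d"), ("b", "d")]
counts = confusion_counts(genes, gold_edges, inferred_edges)
sens, spec = sensitivity_specificity(*counts)
```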
Euclidean distance from the optimal algorithm with sensitivity = 1 and specificity = 1.
| | | | Observational (Biological Replicate) | | Observational (Environment/Time) | | | | Semi-Experimental (Compendium) | | Experimental | | | | | Method Average |
| Statistics | Conditioning | Post-Processing | Holstege1 | Holstege2 | Gresham | Gasch | Smith | Yeung | M3D | GPL90 | Hughes1 | Hughes2 | Hu | Holstege3 | Holstege4 | |
| Fisher's Z | None | FDR, “AND” rule | 0.73 | | | | | 0.85 | 0.78 | | | | 0.74 | 0.79 | | |
| Fisher's Z | None | FDR, “OR” rule | 0.74 | | | | | 0.85 | 0.79 | | | | 0.73 | 0.80 | | |
| Fisher's Z | None | None | 0.75 | | | | | 0.86 | 0.79 | | | | 0.71 | 0.81 | | |
| Fisher's Z | 1 gene | “AND” rule | 0.94 | | 0.97 | | | | | | | | 0.94 | 0.89 | | |
| Fisher's Z | 1 gene | “OR” rule | | | 0.96 | | | | | | | | 0.93 | 0.88 | | |
| Fisher's Z | 2 genes | “AND” rule | 0.99 | 0.99 | | | | | | | | | 0.98 | | | |
| Fisher's Z | 2 genes | “OR” rule | 0.98 | | | | | | | | | | | | | |
| Fisher's Z | 3 genes | “AND” rule | 0.99 | 0.99 | 1.00 | | | | | | | | | | | |
| Fisher's Z | 3 genes | “OR” rule | | 0.99 | 0.99 | | | | | | | | | | | |
| G2 | None | FDR, “AND” rule | 0.73 | 0.73 | 0.72 | | | 0.89 | | | | | 0.82 | 0.77 | | |
| G2 | None | FDR, “OR” rule | 0.72 | 0.75 | 0.72 | | | 0.89 | | | | | 0.78 | 0.78 | | |
| G2 | None | None | 0.72 | 0.76 | 0.74 | | | 0.90 | | | | | 0.74 | 0.80 | | |
| G2 | 1 gene | “AND” rule | 0.96 | 0.87 | 0.74 | | | | | | | | | 0.87 | | |
| G2 | 1 gene | “OR” rule | 0.96 | 0.86 | 0.74 | | | | | | | | | 0.86 | | |
| G2 | 2 genes | “AND” rule | 0.96 | 0.87 | 0.71 | | | | | | | | | 0.98 | | |
| G2 | 2 genes | “OR” rule | 0.96 | 0.86 | 0.73 | | | | | | | | | 0.96 | | |
| G2 | 3 genes | “AND” rule | 0.96 | 0.92 | 0.91 | | | | | | | | | 0.98 | | |
| G2 | 3 genes | “OR” rule | 0.96 | 0.87 | 0.77 | | | | | | | | | 0.96 | | |
Cells with bold font correspond to experiments with statistically significant reconstruction of regulatory networks. See for abbreviations of row labels. See part B for a colored version of this table.
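The combined metric in this table is the Euclidean distance between an observed (sensitivity, specificity) pair and the ideal point (1, 1), so smaller values are better. A minimal sketch follows; the function name is illustrative rather than taken from the paper. For example, a sensitivity of 0.63 and a specificity of 0.37 give a distance of about 0.73, which agrees with the first Holstege1 entry.

```python
from math import hypot


def distance_from_optimal(metric_a, metric_b):
    """Euclidean distance from the ideal point (1, 1) for a pair of
    accuracy metrics that each lie in [0, 1]; 0 is a perfect score."""
    return hypot(1.0 - metric_a, 1.0 - metric_b)


d = distance_from_optimal(0.63, 0.37)
```

The same function applies to the PPV/NPV and sensitivity/PPV pairs; in every case the worst possible value is the square root of 2, about 1.41.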
Statistical approaches used for gene regulatory network reverse-engineering.
| Approach # | Algorithm | Statistic | Conditioning | Post-Processing | Abbreviation |
| 1 | Bivariate analysis | Fisher's Z | None | FDR, AND rule | |
| 2 | Bivariate analysis | Fisher's Z | None | FDR, OR rule | |
| 3 | Bivariate analysis | Fisher's Z | None | Alpha | |
| 4 | Multivariate causal graph-based (GLL) | Fisher's Z | 1 | AND rule | |
| 5 | Multivariate causal graph-based (GLL) | Fisher's Z | 1 | OR rule | |
| 6 | Multivariate causal graph-based (GLL) | Fisher's Z | 2 | AND rule | |
| 7 | Multivariate causal graph-based (GLL) | Fisher's Z | 2 | OR rule | |
| 8 | Multivariate causal graph-based (GLL) | Fisher's Z | 3 | AND rule | |
| 9 | Multivariate causal graph-based (GLL) | Fisher's Z | 3 | OR rule | |
| 10 | Bivariate analysis | G2 | None | FDR, AND rule | |
| 11 | Bivariate analysis | G2 | None | FDR, OR rule | |
| 12 | Bivariate analysis | G2 | None | Alpha | |
| 13 | Multivariate causal graph-based (GLL) | G2 | 1 | AND rule | |
| 14 | Multivariate causal graph-based (GLL) | G2 | 1 | OR rule | |
| 15 | Multivariate causal graph-based (GLL) | G2 | 2 | AND rule | |
| 16 | Multivariate causal graph-based (GLL) | G2 | 2 | OR rule | |
| 17 | Multivariate causal graph-based (GLL) | G2 | 3 | AND rule | |
| 18 | Multivariate causal graph-based (GLL) | G2 | 3 | OR rule | |
“FDR” refers to thresholding associations at 5% FDR using the methodology of [23], [24]. “Alpha” refers to thresholding associations at 5% alpha. (i) The “AND” rule implies that if the algorithm run on X outputs Y and the algorithm run on Y outputs X, then X and Y have an edge in the resulting network, and (ii) the “OR” rule implies that if the algorithm run on X outputs Y or the algorithm run on Y outputs X, then X and Y have an edge in the resulting network.
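The “AND” and “OR” rules can be illustrated with a short sketch. It assumes the per-gene results are available as a dictionary mapping each gene to the set of genes the algorithm output for it; that data layout and the function name are hypothetical choices made for this example.

```python
def symmetrize(outputs, rule="AND"):
    """Combine per-gene outputs into undirected edges.
    outputs: dict mapping gene X to the set of genes returned when the
    algorithm is run on X (a hypothetical layout for illustration).
    AND: keep X-Y only if Y is in outputs[X] and X is in outputs[Y].
    OR:  keep X-Y if either direction was reported."""
    if rule not in ("AND", "OR"):
        raise ValueError("rule must be 'AND' or 'OR'")
    edges = set()
    for x, found in outputs.items():
        for y in found:
            if x == y:
                continue
            reciprocal = x in outputs.get(y, set())
            if rule == "OR" or reciprocal:
                edges.add(frozenset((x, y)))
    return edges


outputs = {"X": {"Y", "Z"}, "Y": {"X"}, "Z": set()}
and_edges = symmetrize(outputs, "AND")
or_edges = symmetrize(outputs, "OR")
```

Here the X-Y edge survives both rules because each gene reports the other, while X-Z appears only under the “OR” rule. The “AND” rule therefore yields a subset of the “OR” network.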
Figure 5. ROC curve of the Pareto frontier for sensitivity/specificity pairs obtained by application of 18 network reverse-engineering approaches to 13 datasets.
Figure 6. ROC curves of the Pareto frontier for sensitivity/specificity pairs obtained by application of 18 network reverse-engineering approaches to datasets of each type.
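A Pareto frontier of sensitivity/specificity pairs keeps only the pairs that no other pair matches or beats on both metrics at once. The sketch below shows one straightforward way to extract it; it is an illustration of the concept under that definition of dominance, not the authors' implementation.

```python
def pareto_frontier(points):
    """Return the (sensitivity, specificity) pairs that are not dominated.
    A pair p is dominated if another pair q is at least as good on both
    metrics (and differs from p)."""
    unique = set(points)
    frontier = [
        p for p in unique
        if not any(q != p and q[0] >= p[0] and q[1] >= p[1] for q in unique)
    ]
    return sorted(frontier)


pairs = [(0.63, 0.37), (0.06, 0.94), (0.01, 0.99), (0.50, 0.30), (0.06, 0.90)]
frontier = pareto_frontier(pairs)
```

Plotting the frontier as sensitivity against 1 minus specificity gives a ROC-style curve that summarizes the best trade-offs achieved across approaches and datasets.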
Positive predictive value (PPV) and negative predictive value (NPV).
| | | | Observational (Biological Replicates) | | Observational (Environment/Time) | | | | Semi-experimental (Compendium) | | Experimental | | | | |
| Statistics | Conditioning | Post-Processing | Holstege1 | Holstege2 | Gresham | Gasch | Smith | Yeung | M3D | GPL90 | Hughes1 | Hughes2 | Hu | Holstege3 | Holstege4 |
| Fisher's Z | None | FDR, “AND” rule | 0.02 / 0.98 | | | | | 0.02 / 0.98 | 0.02 / 0.98 | | | | 0.02 / 0.98 | 0.02 / 0.98 | |
| Fisher's Z | None | FDR, “OR” rule | 0.02 / 0.98 | | | | | 0.02 / 0.98 | 0.02 / 0.98 | | | | 0.02 / 0.98 | 0.02 / 0.98 | |
| Fisher's Z | None | None | 0.02 / 0.98 | | | | | 0.02 / 0.98 | 0.02 / 0.98 | | | | 0.02 / 0.98 | 0.02 / 0.98 | |
| Fisher's Z | 1 gene | “AND” rule | 0.02 / 0.98 | | 0.02 / 0.98 | | | | | | | | 0.02 / 0.98 | 0.02 / 0.98 | |
| Fisher's Z | 1 gene | “OR” rule | | | 0.02 / 0.98 | | | | | | | | 0.02 / 0.98 | 0.02 / 0.98 | |
| Fisher's Z | 2 genes | “AND” rule | 0.02 / 0.98 | 0.02 / 0.98 | | | | | | | | | 0.02 / 0.98 | | |
| Fisher's Z | 2 genes | “OR” rule | 0.02 / 0.98 | | | | | | | | | | | | |
| Fisher's Z | 3 genes | “AND” rule | 0.02 / 0.98 | 0.03 / 0.98 | 0.01 / 0.98 | | | | | | | | | | |
| Fisher's Z | 3 genes | “OR” rule | | 0.02 / 0.98 | 0.02 / 0.98 | | | | | | | | | | |
| G2 | None | FDR, “AND” rule | 0.02 / 0.98 | 0.02 / 0.98 | 0.02 / 0.98 | | | 0.02 / 0.97 | | | | | 0.02 / 0.98 | 0.02 / 0.98 | |
| G2 | None | FDR, “OR” rule | 0.02 / 0.98 | 0.02 / 0.98 | 0.02 / 0.98 | | | 0.02 / 0.98 | | | | | 0.02 / 0.98 | 0.02 / 0.98 | |
| G2 | None | None | 0.02 / 0.98 | 0.02 / 0.98 | 0.02 / 0.98 | | | 0.02 / 0.98 | | | | | 0.02 / 0.98 | 0.02 / 0.98 | |
| G2 | 1 gene | “AND” rule | 0.01 / 0.98 | 0.02 / 0.98 | 0.02 / 0.98 | | | | | | | | | 0.02 / 0.98 | |
| G2 | 1 gene | “OR” rule | 0.01 / 0.98 | 0.02 / 0.98 | 0.02 / 0.98 | | | | | | | | | 0.02 / 0.98 | |
| G2 | 2 genes | “AND” rule | 0.01 / 0.98 | 0.02 / 0.98 | 0.02 / 0.98 | | | | | | | | | 0.02 / 0.98 | |
| G2 | 2 genes | “OR” rule | 0.01 / 0.98 | 0.02 / 0.98 | 0.02 / 0.98 | | | | | | | | | 0.02 / 0.98 | |
| G2 | 3 genes | “AND” rule | 0.01 / 0.98 | 0.02 / 0.98 | 0.01 / 0.98 | | | | | | | | | 0.02 / 0.98 | |
| G2 | 3 genes | “OR” rule | 0.01 / 0.98 | 0.02 / 0.98 | 0.02 / 0.98 | | | | | | | | | 0.02 / 0.98 | |
Cells with bold font correspond to experiments with statistically significant reconstruction of regulatory networks. See for abbreviations of row labels. See part A for a colored version of this table.
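PPV and NPV depend not only on sensitivity and specificity but also on how rare true edges are among all candidate gene pairs. The sketch below makes that dependence explicit using the standard relationship between these quantities. The prevalence value in the example is hypothetical, chosen only to show how a sparse gold standard keeps PPV low while NPV stays high.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and the fraction
    of candidate pairs that are true edges (prevalence).
    Undefined when no pair is predicted positive or negative."""
    tp = sensitivity * prevalence
    fn = (1.0 - sensitivity) * prevalence
    tn = specificity * (1.0 - prevalence)
    fp = (1.0 - specificity) * (1.0 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical prevalence of 2% true edges among candidate pairs.
ppv, npv = predictive_values(sensitivity=0.63, specificity=0.37, prevalence=0.02)
```

When sensitivity equals 1 minus specificity, as in this example, the PPV equals the prevalence, which is the value expected from random guessing.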
Euclidean distance from the optimal algorithm with PPV = 1 and NPV = 1.
| | | | Observational (Biological Replicates) | | Observational (Environment/Time) | | | | Semi-Experimental (Compendium) | | Experimental | | | | | Method Average |
| Statistics | Conditioning | Post-Processing | Holstege1 | Holstege2 | Gresham | Gasch | Smith | Yeung | M3D | GPL90 | Hughes1 | Hughes2 | Hu | Holstege3 | Holstege4 | |
| Fisher's Z | None | FDR, “AND” rule | 0.98 | | | | | 0.98 | 0.98 | | | | 0.98 | 0.98 | | |
| Fisher's Z | None | FDR, “OR” rule | 0.98 | | | | | 0.98 | 0.98 | | | | 0.98 | 0.98 | | |
| Fisher's Z | None | None | 0.98 | | | | | 0.98 | 0.98 | | | | 0.98 | 0.98 | | |
| Fisher's Z | 1 gene | “AND” rule | 0.98 | | 0.98 | | | | | | | | 0.98 | 0.98 | | |
| Fisher's Z | 1 gene | “OR” rule | | | 0.98 | | | | | | | | 0.98 | 0.98 | | |
| Fisher's Z | 2 genes | “AND” rule | 0.98 | 0.98 | | | | | | | | | 0.98 | | | |
| Fisher's Z | 2 genes | “OR” rule | 0.98 | | | | | | | | | | | | | |
| Fisher's Z | 3 genes | “AND” rule | 0.98 | 0.97 | 0.99 | | | | | | | | | | | |
| Fisher's Z | 3 genes | “OR” rule | | 0.98 | 0.98 | | | | | | | | | | | |
| G2 | None | FDR, “AND” rule | 0.98 | 0.98 | 0.98 | | | 0.98 | | | | | 0.98 | 0.98 | | |
| G2 | None | FDR, “OR” rule | 0.98 | 0.98 | 0.98 | | | 0.98 | | | | | 0.98 | 0.98 | | |
| G2 | None | None | 0.98 | 0.98 | 0.98 | | | 0.98 | | | | | 0.98 | 0.98 | | |
| G2 | 1 gene | “AND” rule | 0.99 | 0.98 | 0.98 | | | | | | | | | 0.98 | | |
| G2 | 1 gene | “OR” rule | 0.99 | 0.98 | 0.98 | | | | | | | | | 0.98 | | |
| G2 | 2 genes | “AND” rule | 0.99 | 0.98 | 0.98 | | | | | | | | | 0.98 | | |
| G2 | 2 genes | “OR” rule | 0.99 | 0.98 | 0.98 | | | | | | | | | 0.98 | | |
| G2 | 3 genes | “AND” rule | 0.99 | 0.98 | 0.99 | | | | | | | | | 0.98 | | |
| G2 | 3 genes | “OR” rule | 0.99 | 0.98 | 0.98 | | | | | | | | | 0.98 | | |
Cells with bold font correspond to experiments with statistically significant reconstruction of regulatory networks. See for abbreviations of row labels. See part B for a colored version of this table.
Recall (sensitivity) and precision (PPV).
| | | | Observational (Biological Replicates) | | Observational (Environment/Time) | | | | Semi-experimental (Compendium) | | Experimental | | | | |
| Statistics | Conditioning | Post-Processing | Holstege1 | Holstege2 | Gresham | Gasch | Smith | Yeung | M3D | GPL90 | Hughes1 | Hughes2 | Hu | Holstege3 | Holstege4 |
| Fisher's Z | None | FDR, “AND” rule | 0.63 / 0.02 | | | | | 0.75 / 0.02 | 0.74 / 0.02 | | | | 0.33 / 0.02 | 0.73 / 0.02 | |
| Fisher's Z | None | FDR, “OR” rule | 0.65 / 0.02 | | | | | 0.77 / 0.02 | 0.76 / 0.02 | | | | 0.36 / 0.02 | 0.74 / 0.02 | |
| Fisher's Z | None | None | 0.68 / 0.02 | | | | | 0.78 / 0.02 | 0.77 / 0.02 | | | | 0.43 / 0.02 | 0.77 / 0.02 | |
| Fisher's Z | 1 gene | “AND” rule | 0.06 / 0.02 | | 0.03 / 0.02 | | | | | | | | 0.07 / 0.02 | 0.12 / 0.02 | |
| Fisher's Z | 1 gene | “OR” rule | | | 0.04 / 0.02 | | | | | | | | 0.07 / 0.02 | 0.13 / 0.02 | |
| Fisher's Z | 2 genes | “AND” rule | 0.01 / 0.02 | 0.01 / 0.02 | | | | | | | | | 0.02 / 0.02 | | |
| Fisher's Z | 2 genes | “OR” rule | 0.02 / 0.02 | | | | | | | | | | | | |
| Fisher's Z | 3 genes | “AND” rule | 0.01 / 0.02 | 0.01 / 0.03 | 0.00 / 0.01 | | | | | | | | | | |
| Fisher's Z | 3 genes | “OR” rule | | 0.01 / 0.02 | 0.01 / 0.02 | | | | | | | | | | |
| G2 | None | FDR, “AND” rule | 0.39 / 0.02 | 0.62 / 0.02 | 0.49 / 0.02 | | | 0.80 / 0.02 | | | | | 0.21 / 0.02 | 0.69 / 0.02 | |
| G2 | None | FDR, “OR” rule | 0.44 / 0.02 | 0.66 / 0.02 | 0.56 / 0.02 | | | 0.82 / 0.02 | | | | | 0.26 / 0.02 | 0.72 / 0.02 | |
| G2 | None | None | 0.50 / 0.02 | 0.69 / 0.02 | 0.60 / 0.02 | | | 0.83 / 0.02 | | | | | 0.36 / 0.02 | 0.76 / 0.02 | |
| G2 | 1 gene | “AND” rule | 0.04 / 0.01 | 0.14 / 0.02 | 0.59 / 0.02 | | | | | | | | | 0.14 / 0.02 | |
| G2 | 1 gene | “OR” rule | 0.04 / 0.01 | 0.15 / 0.02 | 0.60 / 0.02 | | | | | | | | | 0.16 / 0.02 | |
| G2 | 2 genes | “AND” rule | 0.04 / 0.01 | 0.14 / 0.02 | 0.52 / 0.02 | | | | | | | | | 0.02 / 0.02 | |
| G2 | 2 genes | “OR” rule | 0.04 / 0.01 | 0.15 / 0.02 | 0.60 / 0.02 | | | | | | | | | 0.04 / 0.02 | |
| G2 | 3 genes | “AND” rule | 0.04 / 0.01 | 0.09 / 0.02 | 0.09 / 0.01 | | | | | | | | | 0.02 / 0.02 | |
| G2 | 3 genes | “OR” rule | 0.04 / 0.01 | 0.14 / 0.02 | 0.28 / 0.02 | | | | | | | | | 0.04 / 0.02 | |
Cells with bold font correspond to experiments with statistically significant reconstruction of regulatory networks. See for abbreviations of row labels. See part A for a colored version of this table.
Euclidean distance from the optimal algorithm with Sensitivity = 1 and PPV = 1.
| | | | Observational (Biological Replicates) | | Observational (Environment/Time) | | | | Semi-Experimental (Compendium) | | Experimental | | | | | Method Average |
| Statistics | Conditioning | Post-Processing | Holstege1 | Holstege2 | Gresham | Gasch | Smith | Yeung | M3D | GPL90 | Hughes1 | Hughes2 | Hu | Holstege3 | Holstege4 | |
| Fisher's Z | None | FDR, “AND” rule | 1.05 | | | | | 1.02 | 1.02 | | | | 1.19 | 1.02 | | |
| Fisher's Z | None | FDR, “OR” rule | 1.04 | | | | | 1.01 | 1.01 | | | | 1.17 | 1.02 | | |
| Fisher's Z | None | None | 1.03 | | | | | 1.01 | 1.01 | | | | 1.13 | 1.01 | | |
| Fisher's Z | 1 gene | “AND” rule | 1.36 | | 1.38 | | | | | | | | 1.36 | 1.32 | | |
| Fisher's Z | 1 gene | “OR” rule | | | 1.37 | | | | | | | | 1.35 | 1.31 | | |
| Fisher's Z | 2 genes | “AND” rule | 1.39 | 1.39 | | | | | | | | | 1.39 | | | |
| Fisher's Z | 2 genes | “OR” rule | 1.38 | | | | | | | | | | | | | |
| Fisher's Z | 3 genes | “AND” rule | 1.39 | 1.39 | 1.41 | | | | | | | | | | | |
| Fisher's Z | 3 genes | “OR” rule | | 1.39 | 1.39 | | | | | | | | | | | |
| G2 | None | FDR, “AND” rule | 1.16 | 1.05 | 1.11 | | | 1.00 | | | | | 1.26 | 1.03 | | |
| G2 | None | FDR, “OR” rule | 1.13 | 1.04 | 1.08 | | | 1.00 | | | | | 1.23 | 1.02 | | |
| G2 | None | None | 1.10 | 1.03 | 1.06 | | | 1.00 | | | | | 1.17 | 1.01 | | |
| G2 | 1 gene | “AND” rule | 1.38 | 1.30 | 1.06 | | | | | | | | | 1.30 | | |
| G2 | 1 gene | “OR” rule | 1.37 | 1.30 | 1.06 | | | | | | | | | 1.29 | | |
| G2 | 2 genes | “AND” rule | 1.38 | 1.30 | 1.09 | | | | | | | | | 1.38 | | |
| G2 | 2 genes | “OR” rule | 1.37 | 1.30 | 1.06 | | | | | | | | | 1.37 | | |
| G2 | 3 genes | “AND” rule | 1.38 | 1.34 | 1.34 | | | | | | | | | 1.38 | | |
| G2 | 3 genes | “OR” rule | 1.37 | 1.31 | 1.22 | | | | | | | | | 1.37 | | |
Cells with bold font correspond to experiments with statistically significant reconstruction of regulatory networks. See for abbreviations of row labels. See part B for a colored version of this table.
Figure 7. Example scatter-plot of transcription factor connectivity versus the accuracy (combined PPV/NPV metric) of reconstructing their sub-networks.
The left panel shows the scatter-plot and the right panel shows the null distribution for establishing statistical significance of the observed correlation.
Number of networks that have significant correlations between transcription factor connectivity and accuracy of reconstructing their sub-networks.
| | Gold-Standard Network | | | Inferred Network | | |
| Approach | Sensitivity/Specificity | PPV/NPV | Sensitivity/PPV | Sensitivity/Specificity | PPV/NPV | Sensitivity/PPV |
| | 1 | 1 | 1 | 8 | 1 | 10 |
| | 1 | 0 | 1 | 9 | 1 | 11 |
| | 1 | 0 | 2 | 9 | 1 | 8 |
| | 0 | 3 | 0 | 6 | 0 | 2 |
| | 0 | 4 | 0 | 5 | 0 | 1 |
| | 0 | 5 | 3 | 13 | 5 | 1 |
| | 0 | 7 | 0 | 13 | 5 | 1 |
| | 0 | 0 | 2 | 13 | 2 | 0 |
| | 0 | 7 | 1 | 13 | 3 | 1 |
| | 0 | 0 | 1 | 5 | 3 | 11 |
| | 0 | 0 | 1 | 7 | 3 | 10 |
| | 0 | 1 | 1 | 7 | 3 | 9 |
| | 0 | 1 | 0 | 4 | 5 | 10 |
| | 0 | 2 | 0 | 4 | 1 | 8 |
| | 0 | 2 | 0 | 6 | 6 | 8 |
| | 0 | 3 | 0 | 5 | 1 | 5 |
| | 0 | 2 | 0 | 7 | 5 | 8 |
| | 0 | 3 | 0 | 5 | 1 | 5 |
The correlations were assessed for 13 different networks (derived from 13 gene expression microarray datasets) for each combination of network reverse-engineering approaches and combined accuracy metrics. Statistical significance is assessed at 5% alpha level adjusted globally for multiple comparisons (over all statistical tests performed for the table). The left portion of the table corresponds to transcription factor connectivity assessed in the gold-standard network, and the right portion corresponds to transcription factor connectivity assessed in the inferred network.
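The significance assessment described here compares an observed correlation with a null distribution. One common way to build such a null is to permute one of the two variables repeatedly, as sketched below. This is an illustration under stated assumptions: the use of the Pearson coefficient, the number of permutations, and the two-sided test are choices made for the example and are not confirmed by this record, and the adjustment for multiple comparisons is not shown.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def permutation_p_value(x, y, n_perm=1000, seed=0):
    """Two-sided permutation p-value for the correlation of x and y.
    Shuffling y breaks any real association and yields the null."""
    rng = random.Random(seed)
    observed = pearson(x, y)
    shuffled = list(y)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= abs(observed):
            extreme += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (extreme + 1) / (n_perm + 1)


# Toy data: connectivity of ten hypothetical transcription factors and a
# perfectly linear accuracy score.
connectivity = list(range(10))
accuracy = [2 * c + 1 for c in connectivity]
r, p = permutation_p_value(connectivity, accuracy)
```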
Datasets used for gene regulatory network reverse-engineering.
| Dataset type | Dataset name | Sample size | Number of genes | Description | Source | Reference |
| Observational (Biological Replicates) | Holstege1 | 200 | 6,170 | A collection of wild-type | ArrayExpress (E-TABM-773) | |
| Observational (Biological Replicates) | Holstege2 | 200 | 6,170 | A collection of wild-type | ArrayExpress (E-TABM-984) | |
| Observational (Environment/Time) | Gresham | 100 | 5,590 | Environmental change induced transcription response in | (to be submitted to GEO) | |
| Observational (Environment/Time) | Gasch | 173 | 6,152 | Environmental change induced transcription response in | | |
| Observational (Environment/Time) | Smith | 220 | 6,257 | Environmental change induced transcription response in | GEO (GSE9376) | |
| Observational (Environment/Time) | Yeung | 582 | 5,717 | Time-dependent response to rapamycin in | ArrayExpress (E-MTAB-412) | |
| Semi-experimental (Compendium) | M3D | 530 | 5,520 | Compendium dataset for | Many Microbe Microarrays Database (M3D) | |
| Semi-experimental (Compendium) | GPL90 | 1,470 | 6,740 | Compendium dataset for | GEO (GPL90) | Constructed for this study |
| Experimental | Hughes1 | 291 | 6,307 | Transcriptional response in | GEO (GSE1404) | |
| Experimental | Hughes2 | 291 | 6,307 | Transcriptional response in | GEO (GSE5499) | |
| Experimental | Hu | 269 | 6,429 | Transcriptional responses in | GEO (GSE4654) | |
| Experimental | Holstege3 | 464 | 6,170 | Transcriptional response in | ArrayExpress (E-TABM-907) | |
| Experimental | Holstege4 | 319 | 6,170 | Transcriptional response in | ArrayExpress (E-TABM-1074) | |