David M Budden, Edmund J Crampin.
Abstract
BACKGROUND: Characterising programs of gene regulation by studying individual protein-DNA and protein-protein interactions would require a large volume of high-resolution proteomics data, and such data are not yet available. Instead, many gene regulatory network (GRN) techniques have been developed, which leverage the wealth of transcriptomic data generated by recent consortia to study indirect, gene-level relationships between transcriptional regulators. Despite their popularity, existing methods of GRN inference exhibit limitations that we highlight and address through the lens of information theory.
Keywords: Gene expression; Gene regulatory network; Transcriptional regulation
Year: 2016 PMID: 27599566 PMCID: PMC5013667 DOI: 10.1186/s12918-016-0331-y
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Fig. 1 Transcriptional activity of each gene in a Century-series (100-node) scale-free Mendes network [42], simulated using multiplicative Hill kinetics as defined in (5). Each time series was simulated until convergence (dx/dt = 0) using Gepasi [45], from which gene-level correlation, MI or TE can be calculated for GRN approximation
Fig. 2 Examples of the Mendes synthetic GRNs used to benchmark the performance of the information theoretic measures proposed in this article [42], with blue and red edges representing activating and inhibiting interactions respectively. Erdős-Rényi [46] (random), Watts-Strogatz [47] (small-world) and Albert-Barabási [48] (scale-free) topologies were considered from both the (a) ‘Century’ (100-node) and (b) ‘Jumbo’ (1000-node) series. Of these topologies, there is growing evidence that scale-free networks most accurately represent the organisation of metabolic and transcriptomic regulatory systems [49–51]
Performance of MI and TE-based methods of GRN inference, presented as the mean AUC (and standard deviation) across a variety of random [46], small-world [47] and scale-free [48] networks from the Mendes ‘Century’ and ‘Jumbo’ collections [42]
| Collection | Networks | Nodes | Edges | Topology | MI AUC, kernel (ARACNE) | MI AUC, KSG | TE AUC, kernel | TE AUC, KSG |
|---|---|---|---|---|---|---|---|---|
| CenturyRND | 50 | 100 | 200 | Random | 0.514 (0.030) | 0.478 (0.028) | 0.589 (0.024) | 0.603 (0.027) |
| CenturySF | 50 | 100 | 200 | Scale-free | 0.475 (0.036) | 0.505 (0.033) | 0.526 (0.030) | 0.561 (0.030) |
| CenturySW | 50 | 100 | 200 | Small-world | 0.477 (0.035) | 0.471 (0.035) | 0.602 (0.028) | 0.598 (0.030) |
| JumboRND | 5 | 1000 | 1000 | Random | 0.473 (0.014) | 0.439 (0.013) | 0.540 (0.006) | 0.564 (0.009) |
| JumboSF | 5 | 1000 | 1000 | Scale-free | 0.526 (0.007) | 0.577 (0.010) | 0.606 (0.007) | 0.649 (0.012) |

Kernel-based methods apply the uniform kernel (see (2)) with bandwidth h = 0.1. For KSG-based methods, KSG algorithm 1 (better suited to small networks, see (3)) was applied to ‘Century’ data and algorithm 2 (see (4)) to ‘Jumbo’ data, both with K = 4 [33] and assuming length-1 Markovian processes. Gene expression time series were simulated until convergence (dx/dt = 0) using Gepasi with default parameters [45]
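
As an illustration of how AUC figures of this kind could be obtained, the following Python sketch estimates pairwise MI with a box (uniform) kernel of bandwidth h and scores the resulting ranking against a known network. It is not the authors' implementation. The estimator form (neighbour counts within h under the max-norm), the function names `kernel_mi`, `auc` and `score_pairs`, the treatment of edges as undirected, and the assumption that expression values are scaled so that h = 0.1 is a sensible bandwidth are all choices made here for illustration; equation (2) of the article should be consulted for the exact kernel definition.

```python
import math


def kernel_mi(x, y, h=0.1):
    """Box-kernel MI estimate (in nats) between two equal-length series.

    Assumed estimator: for each sample i, count the samples within h of it
    in x alone, in y alone, and in both at once. The kernel volumes cancel,
    leaving mean(log(n * c_xy / (c_x * c_y))).
    """
    n = len(x)
    total = 0.0
    for i in range(n):
        c_x = c_y = c_xy = 0
        for j in range(n):  # j == i is included, so no count is ever zero
            in_x = abs(x[j] - x[i]) <= h
            in_y = abs(y[j] - y[i]) <= h
            c_x += in_x
            c_y += in_y
            c_xy += in_x and in_y
        total += math.log(n * c_xy / (c_x * c_y))
    return total / n


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, is_edge in zip(scores, labels) if is_edge]
    neg = [s for s, is_edge in zip(scores, labels) if not is_edge]
    if not pos or not neg:
        raise ValueError("need at least one true edge and one non-edge")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def score_pairs(data, edges, h=0.1):
    """AUC of MI-ranked gene pairs against a set of true undirected edges.

    data  : list of per-gene expression series (hypothetical layout)
    edges : set of frozenset({i, j}) gene-index pairs that are truly linked
    """
    scores, labels = [], []
    for i in range(len(data)):
        for j in range(i + 1, len(data)):
            scores.append(kernel_mi(data[i], data[j], h))
            labels.append(frozenset((i, j)) in edges)
    return auc(scores, labels)
```

Because MI is symmetric, this sketch can only rank undirected gene pairs; a TE-based variant would score ordered pairs instead. Averaging `score_pairs` over every network in a collection would give a mean and standard deviation in the form reported in the table.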