Hossein Zare, Mostafa Kaveh, Arkady Khodursky.
Abstract
Transcriptional networks consist of multiple regulatory layers corresponding to the activity of global regulators, specialized repressors and activators, as well as proteins and enzymes shaping the DNA template. This intrinsic complexity makes uncovering connections difficult and calls for methodologies adapted to the available data. Here we present a new computational method that predicts interactions between transcription factors and target genes using compendia of microarray gene expression data and documented interactions between genes and transcription factors. The proposed method, called Kernel Embedding of Regulatory Networks (KEREN), is based on the concept of gene-regulon association and captures hidden geometric patterns of the network via manifold embedding. We applied KEREN to reconstruct transcription regulatory interactions on a genome-wide scale in the model bacterium Escherichia coli (E. coli). Application of the method not only yielded accurate predictions of verifiable interactions, outperforming comparable methodologies on certain metrics, but also demonstrated the utility of a geometric approach in the analysis of high-dimensional biological data. We also described possible applications of kernel embedding techniques to other function and network discovery algorithms.
Year: 2011 PMID: 21857910 PMCID: PMC3155518 DOI: 10.1371/journal.pone.0021969
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Performance of KEREN on two data sets of different size.
(A) Comparison of recall (blue line) and precision (red line) for the cDNA microarray data set versus K, the initial number of selected neighbors, for a fixed value of P = 10. (B) The same as (A) for the Affymetrix data set. (C & D) Effect of the proposed assignment procedure (parameter P) for the cDNA and Affymetrix data sets, respectively, with K = 5. Dashed lines correspond to the performance when operons are accounted for.
Performance Comparison.
| Method/Algorithm | cDNA data set: Recall | cDNA data set: Precision | Affymetrix data set: Recall | Affymetrix data set: Precision |
| A | 13.5 | 9.5 | 26.8 | 17.1 |
| B | 31 | 10 | 42.6 | 14.5 |
| C | 30 | 13.6 | 47.2 | 25 |
| D | 44.7 | 40 | 62.3 | 63.2 |
| E | 50.5 | 46.7 | 66 | 68 |
Recall and precision values are in % for two microarray data sets. Methods/Algorithms are: (A) Gene-TF, relevance network, (B) Gene-Regulon using a correlation matrix, (C) Gene-Regulon using a mutual information matrix, (D) KEREN, Gene-Regulon using an LLE kernel matrix derived from a mutual information matrix, (E) KEREN, Gene-Regulon using an LLE kernel matrix derived from a correlation matrix.
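The recall and precision percentages reported in the table can be computed from a set of predicted interactions and a set of documented interactions. The following is a minimal sketch using the standard definitions (recall = TP / documented, precision = TP / predicted); the function name `recall_precision` and the toy pairs are illustrative assumptions, not data or code from the paper.

```python
def recall_precision(predicted, known):
    """Return (recall, precision) in percent for two collections of (tf, gene) pairs."""
    predicted, known = set(predicted), set(known)
    true_pos = len(predicted & known)  # predicted pairs that are documented
    recall = 100.0 * true_pos / len(known) if known else 0.0
    precision = 100.0 * true_pos / len(predicted) if predicted else 0.0
    return recall, precision


# Toy example with placeholder names (hypothetical, not from the paper).
known = {("TF1", "g1"), ("TF1", "g2"), ("TF2", "g3"), ("TF3", "g4")}
predicted = {("TF1", "g1"), ("TF2", "g3"), ("TF2", "g1")}
recall, precision = recall_precision(predicted, known)
```

In this toy case two of the four documented pairs are recovered and two of the three predictions are correct.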
Comparison of KEREN with SIRENE.
| Method/Algorithm | Recall = 80% | Recall = 75% | Recall = 70% | Recall = 65% | Recall = 60% | Recall = 50% |
| KEREN | 30 | 38 | 44 | 47 | 50 | 54 |
| SIRENE | 16 | 18 | 23 | 29 | 35 | 50 |
| KEREN-bias | 65 | 75 | 82 | 86 | 88 | 91 |
| SIRENE-bias | 62 | 70 | 75 | 82 | 86 | 90 |
Precision (%) of KEREN and SIRENE at different levels of recall. Operon structure is accounted for in the first two rows and is not accounted for in the rows labeled ‘bias’. The values for SIRENE were taken from [25].
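Reading precision at a fixed recall level, as in the table above, requires a ranked list of scored predictions. The sketch below shows one common way to do this: walk down the ranking until the target recall is reached and report the precision at that cutoff. The function name, the toy scores and the tie handling (ties are broken by sort order) are assumptions for illustration; the paper's exact evaluation procedure may differ.

```python
def precision_at_recall(scores, labels, target_recall):
    """Precision (%) at the shortest ranked list whose recall reaches target_recall.

    scores        : prediction scores, higher means more confident
    labels        : 1 if the pair is a documented interaction, else 0
    target_recall : fraction in (0, 1]
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("no positive labels")
    # Rank candidate interactions from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_pos = 0
    for n_called, (_, label) in enumerate(ranked, start=1):
        true_pos += label
        if true_pos / total_pos >= target_recall:
            return 100.0 * true_pos / n_called
    return 0.0  # not reached when target_recall <= 1


# Toy ranking (hypothetical values, not from the paper).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 0, 1, 1, 0, 1]
p75 = precision_at_recall(scores, labels, 0.75)
p50 = precision_at_recall(scores, labels, 0.50)
```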
Figure 2. Performance of KEREN using Laplacian kernel.
(A) Comparison of recall (blue line) and precision (red line) for KEREN when a Laplacian kernel, instead of the LLE kernel, is derived from the correlation matrix of the Affymetrix data set. (B) Comparison of recall and precision for the Affymetrix data when the LLE kernel is constructed from the correlation matrix of randomized data.
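For readers unfamiliar with the LLE kernel mentioned in the captions above, the following sketch shows one standard construction: compute locally linear embedding reconstruction weights W from each gene's k nearest neighbors, form M = (I - W)^T (I - W), and take the kernel as lambda_max * I - M. This is not the authors' implementation. It assumes that neighbors are chosen by absolute correlation and that weights are fitted on the raw expression profiles; the function name, the regularization constant and the synthetic data are likewise assumptions. The parameter `k` plays the role of a neighbor count and may correspond to K in Figure 1.

```python
import numpy as np


def lle_kernel(X, corr, k=5, reg=1e-3):
    """Return (kernel, W) where kernel = lambda_max * I - (I - W)^T (I - W).

    X    : (n_genes, n_conditions) expression profiles
    corr : (n_genes, n_genes) correlation matrix, used here only to pick neighbors
    k    : number of neighbors per gene (assumed analogue of K in Figure 1)
    reg  : regularization of the local Gram matrix (assumed value)
    """
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        sim = np.abs(corr[i]).astype(float)
        sim[i] = -np.inf                    # a gene is not its own neighbor
        nbrs = np.argsort(sim)[-k:]         # k most strongly correlated genes
        Z = X[nbrs] - X[i]                  # neighbors centered on gene i
        G = Z @ Z.T                         # local Gram matrix, shape (k, k)
        G += reg * (np.trace(G) + 1e-12) * np.eye(k)  # keep G well conditioned
        w = np.linalg.solve(G, np.ones(k))
        W[i, nbrs] = w / w.sum()            # reconstruction weights sum to 1
    I = np.eye(n)
    M = (I - W).T @ (I - W)
    lam_max = np.linalg.eigvalsh(M)[-1]     # eigenvalues are returned in ascending order
    return lam_max * I - M, W


# Synthetic data for illustration only (not from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 12))
corr = np.corrcoef(X)
kernel, W = lle_kernel(X, corr, k=5)
```

The resulting matrix is symmetric and positive semidefinite by construction, so it can be used wherever a kernel (similarity) matrix between genes is expected.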