Abstract
MOTIVATION: An important problem in systems biology is reconstructing complete networks of interactions between biological objects by extrapolating from a few known interactions as examples. While there are many computational techniques proposed for this network reconstruction task, their accuracy is consistently limited by the small number of high-confidence examples, and the uneven distribution of these examples across the potential interaction space, with some objects having many known interactions and others few.
Year: 2008 PMID: 19015141 PMCID: PMC2639005 DOI: 10.1093/bioinformatics/btn602
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. The supervised network inference problem. (a) Adjacency matrix of known interactions (black boxes), known non-interactions (white boxes) and node pairs with an unknown interaction status (gray boxes with question marks). (b) Kernel matrix, with a darker color representing a larger inner product. (c) Partially complete adjacency matrix required by the supervised direct approach methods, with complete knowledge of a submatrix. In the basic local modeling approach, the dark gray portion cannot be predicted.
Fig. 2. Global and local modeling. (a) An interaction network with each green solid edge representing a known interaction, each red dotted edge representing a known non-interaction and the dashed edge representing a pair of objects with an unknown interaction status. (b) A global model based on a Pkernel. (c) A local model for object v3.
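The contrast between the global and local models in Fig. 2 can be sketched in a few lines. The excerpt does not define the Pkernel, so this sketch assumes the commonly used tensor-product pairwise kernel built from an object-level kernel matrix `K`. The function names, the toy matrix and the `known` dictionary are hypothetical and serve only as illustration.

```python
def pairwise_kernel(K, a, b, c, d):
    """Similarity between object pairs (a, b) and (c, d).

    Assumed form (tensor-product pairwise kernel); it is symmetric in the
    order of objects within each pair.
    """
    return K[a][c] * K[b][d] + K[a][d] * K[b][c]


def local_training_set(target, known, n_objects):
    """Collect the training examples for the local model of one object.

    `known` maps frozenset({i, j}) to 1 (known interaction) or
    0 (known non-interaction). Pairs with unknown status are skipped.
    Each example is (other_object, label).
    """
    examples = []
    for other in range(n_objects):
        if other == target:
            continue
        label = known.get(frozenset((target, other)))
        if label is not None:
            examples.append((other, label))
    return examples


# Hypothetical object-level kernel matrix for four objects v1..v4 (indices 0..3).
K = [
    [1.0, 0.5, 0.2, 0.1],
    [0.5, 1.0, 0.3, 0.4],
    [0.2, 0.3, 1.0, 0.6],
    [0.1, 0.4, 0.6, 1.0],
]

# Hypothetical gold standard: v1-v3 interact, v2-v3 do not, v3-v4 is unknown.
known = {frozenset((0, 2)): 1, frozenset((1, 2)): 0, frozenset((0, 1)): 1}

# Global model: every example is a pair, compared through the pairwise kernel.
pair_similarity = pairwise_kernel(K, 0, 1, 2, 3)

# Local model for v3 (index 2): examples are single objects labeled by
# whether they interact with v3.
v3_examples = local_training_set(2, known, 4)
```

In the global setting, one classifier is trained on all labeled pairs. In the local setting, a separate classifier is trained per object, so an object with few known interactions yields a very small training set.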
List of datasets used in the comparison study
| Code | Data type | Source | Kernel |
|---|---|---|---|
| phy | Phylogenetic profiles | COG v7 (Tatusov | RBF (σ = 3,8) |
| loc | Sub-cellular localization | (Huh | Linear |
| exp-gasch | Gene expression (environmental response) | (Gasch | RBF (σ = 3,8) |
| exp-spellman | Gene expression (cell-cycle) | (Spellman | RBF (σ = 3,8) |
| y2h-ito | Yeast two-hybrid | (Ito | Diffusion (β = 0.01) |
| y2h-uetz | Yeast two-hybrid | (Uetz | Diffusion (β = 0.01) |
| tap-gavin | Tandem affinity purification | (Gavin | Diffusion (β = 0.01) |
| tap-krogan | Tandem affinity purification | (Krogan | Diffusion (β = 0.01) |
| int | Integration | | Summation |
Each row corresponds to a dataset from a publication in the Source column, and is turned into a kernel using the function in the Kernel column, as in previous studies (Bleakley et al., 2007; Yamanishi et al., 2004).
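As a rough illustration of how the kernel functions named in the Kernel column could be computed, the sketch below builds linear, RBF and diffusion kernel matrices and sums them into an integrated kernel. The exact parameterizations are assumptions, since the excerpt does not give them: the RBF kernel is taken as exp(-||x - y||² / (2σ²)) and the diffusion kernel as exp(-βL) with L the graph Laplacian, approximated here by a truncated Taylor series. All function names and toy data are hypothetical.

```python
import math


def linear_kernel(x, y):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, sigma):
    """Assumed RBF form: exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def gram(rows, kernel):
    """Kernel matrix over all pairs of feature vectors."""
    return [[kernel(x, y) for y in rows] for x in rows]


def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def diffusion_kernel(adj, beta, terms=20):
    """Assumed diffusion kernel exp(-beta * L), with L = D - A.

    Uses a truncated Taylor series, which is adequate for small beta.
    """
    n = len(adj)
    # M = -beta * L
    M = [[-beta * ((sum(adj[i]) if i == j else 0.0) - adj[i][j])
          for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes M^k / k!
        term = [[v / k for v in row] for row in matmul(term, M)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def sum_kernels(*mats):
    """Integrate several kernel matrices by element-wise summation."""
    n = len(mats[0])
    return [[sum(m[i][j] for m in mats) for j in range(n)] for i in range(n)]


# Hypothetical toy data for three proteins.
profiles = [[1.0, 0.0, 1.0], [1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]

K_lin = gram(profiles, linear_kernel)
K_rbf = gram(profiles, lambda x, y: rbf_kernel(x, y, sigma=3.0))
K_diff = diffusion_kernel(adjacency, beta=0.01)
K_int = sum_kernels(K_lin, K_rbf, K_diff)
```

In practice the individual kernel matrices are often normalized before summation so that no single data type dominates; whether the study did so is not stated in this excerpt.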
Prediction accuracy (percentage of AUC) of the different approaches on the BioGRID-10 dataset
| | phy | loc | exp-gasch | exp-spellman | y2h-ito | y2h-uetz | tap-gavin | tap-krogan | int |
|---|---|---|---|---|---|---|---|---|---|
| Mode 1 | | | | | | | | | |
| Direct | 58.04 | 66.55 | 64.61 | 57.41 | 51.52 | 52.13 | 59.37 | 61.62 | 70.91 |
| kCCA | 65.80 | 63.86 | 68.98 | 65.10 | 50.89 | 50.48 | 57.56 | 51.85 | 80.98 |
| kML | 63.87 | 68.10 | 69.67 | 68.99 | 52.76 | 53.85 | 60.86 | 57.69 | 73.47 |
| em | 71.22 | 75.14 | 67.53 | 64.96 | 55.90 | 53.13 | 63.74 | 68.20 | 81.65 |
| Local | 71.67 | 71.41 | 72.66 | 70.63 | 67.27 | 67.27 | 64.60 | 67.48 | 75.65 |
| Local+PP | **73.89** | **75.25** | **77.43** | **75.35** | **71.60** | **71.51** | **74.62** | 71.39 | **83.63** |
| Local+KI | 71.68 | 71.42 | 75.89 | 70.96 | 69.40 | 69.05 | 70.53 | 72.03 | 81.74 |
| Local+PP+KI | 72.40 | 75.19 | 77.41 | 73.81 | 70.44 | 70.57 | 73.59 | **72.64** | 83.59 |
| Mode 2 | | | | | | | | | |
| Direct | 59.99 | 67.81 | 66.18 | 59.22 | 54.02 | 54.64 | 62.28 | 63.69 | 72.34 |
| Pkernel | 72.98 | 69.84 | 78.61 | 77.30 | 57.01 | 54.65 | 71.16 | 70.36 | 87.34 |
| Local | 76.89 | 78.73 | 79.72 | 77.32 | 72.93 | 72.89 | 68.81 | 73.15 | 82.82 |
| Local+PP | **77.71** | **80.71** | **82.56** | **80.62** | **74.74** | **74.41** | **76.36** | 75.12 | **88.78** |
| Local+KI | 76.76 | 78.73 | 80.62 | 76.44 | 73.39 | 72.76 | 72.42 | 76.22 | 86.12 |
| Local+PP+KI | 77.45 | 80.57 | 81.93 | 78.92 | 74.14 | 74.01 | 75.59 | **76.59** | 88.56 |
The best approach for each kernel and each mode of cross-validation is in bold face.
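For readers unfamiliar with the accuracy measure in the table, the following minimal sketch computes AUC as the fraction of (positive, negative) pairs in which the positive example receives the higher prediction score, with ties counted as half. It illustrates the metric only and is not the evaluation code of the study; the function name and toy scores are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    `labels` holds 1 for interactions and 0 for non-interactions.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical prediction scores for five held-out object pairs.
scores = [0.9, 0.8, 0.3, 0.6, 0.1]
labels = [1, 0, 1, 1, 0]

auc_percent = 100.0 * auc(scores, labels)
```

A value of 50 corresponds to random ranking and 100 to a perfect ranking, which is the scale on which the table entries should be read.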
Fig. 3. Prediction accuracy at different gold-standard set sizes. (a) Using int kernel. (b) Using exp-gasch kernel.
Fig. 4. Correlating the number of gold-standard examples and the rank difference between local+PP and the four methods.