Jochen Weile, Katherine James, Jennifer Hallinan, Simon J Cockell, Phillip Lord, Anil Wipat, Darren J Wilkinson.
Abstract
MOTIVATION: Biological experiments give insight into networks of processes inside a cell, but are subject to error and uncertainty. However, because the many experiments reported in public databases overlap, it is possible to assess the chance that an individual observation is correct. To do so, existing methods rely on high-quality 'gold standard' reference networks, but such reference networks are not always available.
Year: 2012 PMID: 22492647 PMCID: PMC3356839 DOI: 10.1093/bioinformatics/bts154
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Evaluation of the algorithm in comparison with the naive method as well as the GS-based methods by Lee et al. and Lycett given different numbers of experiments. Losses are averaged over 5000 replicates. Top: average loss over interacting node pairs (L(+)). Centre: average loss over non-interacting node pairs (L(−)). Bottom: average loss regarding error rate estimates (LER). Whiskers indicate one SD. The naive method does not estimate error rates and is thus excluded from this metric. The performance of the gold standard-based methods regarding error rate estimation cannot be expected to improve with the number of experiments, since their estimates are always based on the gold standard and not on the experiments.
Fig. 2. Histogram of existence probabilities for the interactions in the integrated dataset. Top: PPIs. Bottom: synthetic GIs.
Fig. 3. Probabilities assigned to PPIs by the naive integration method and the proposed method. As expected, the two differ strongly, because the proposed method takes into account the reliability of each experiment.