Chris J Needham, James R Bradford, Andrew J Bulpitt, David R Westhead.
Year: 2007 PMID: 17784779 PMCID: PMC1963499 DOI: 10.1371/journal.pcbi.0030129
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Figure 1. An Example: Gene Regulatory Networks
Gene regulatory networks provide a natural example for Bayesian network (BN) application. Genes correspond to nodes in the network, and regulatory relationships between genes are shown by directed edges. In the simple example shown in the figure, gene G1 regulates G2, G3, and G5, gene G2 regulates G4 and G5, and gene G3 regulates G5. The probability distribution for the expression levels of each gene is modelled by the BN parameters. Simplification results from the fact that the probability distribution for a gene depends only on its regulators (parents) in the network. For instance, the expression levels of G4 and G5 are related only because they share a common regulator G2. In mathematical terms, they are conditionally independent given G2. Such relationships lead to factorisation of the full joint probability distribution (JPD) into component conditional distributions, where each variable depends only on its parents in the network.
p(G1, G2, G3, G4, G5) = p(G1)p(G2|G1)p(G3|G1)p(G4|G2)p(G5|G1, G2, G3)
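The factorisation can be checked numerically. The following is a minimal sketch, assuming each gene takes a binary expression level (1 = high, 0 = low); the conditional probability values and the helper names `bern`, `joint`, and `marginal` are hypothetical and chosen only for illustration, not taken from the paper.

```python
from itertools import product

# Hypothetical CPT entries (illustrative values only, not from the paper).
P_G1 = 0.6                     # p(G1 = 1)
P_G2 = {0: 0.2, 1: 0.7}        # p(G2 = 1 | G1)
P_G3 = {0: 0.3, 1: 0.8}        # p(G3 = 1 | G1)
P_G4 = {0: 0.1, 1: 0.9}        # p(G4 = 1 | G2)
# p(G5 = 1 | G1, G2, G3), keyed by the tuple of parent states.
P_G5 = {parents: 0.1 + 0.25 * sum(parents)
        for parents in product((0, 1), repeat=3)}

NAMES = ("g1", "g2", "g3", "g4", "g5")


def bern(p_one, value):
    """Probability of a binary value given p(value = 1)."""
    return p_one if value == 1 else 1.0 - p_one


def joint(g1, g2, g3, g4, g5):
    """p(G1)p(G2|G1)p(G3|G1)p(G4|G2)p(G5|G1,G2,G3) for one full assignment."""
    return (bern(P_G1, g1)
            * bern(P_G2[g1], g2)
            * bern(P_G3[g1], g3)
            * bern(P_G4[g2], g4)
            * bern(P_G5[(g1, g2, g3)], g5))


def marginal(**fixed):
    """Sum the joint over every gene that is not pinned in `fixed`."""
    total = 0.0
    for state in product((0, 1), repeat=len(NAMES)):
        assignment = dict(zip(NAMES, state))
        if all(assignment[name] == value for name, value in fixed.items()):
            total += joint(**assignment)
    return total


if __name__ == "__main__":
    # The factorised terms define a proper distribution over all 32 states.
    print("sum over all states:", marginal())

    # Conditional independence of G4 and G5 given G2.
    p_g2 = marginal(g2=1)
    both = marginal(g2=1, g4=1, g5=1) / p_g2
    separate = (marginal(g2=1, g4=1) / p_g2) * (marginal(g2=1, g5=1) / p_g2)
    print("p(G4,G5|G2):", both, " p(G4|G2)p(G5|G2):", separate)
```

Under the binary assumption, the factorised model needs 1 + 2 + 2 + 2 + 8 = 15 free parameters, compared with 31 for an unrestricted joint distribution over five binary variables.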
Figure 7. Naïve Bayes Classifier with Model Parameters in the Form of CPTs
Figure 2. Illustration of Model Parameters for Two-Node Bayesian Network
Figure 3. Serial Connection
Figure 4. Diverging Connection
Figure 5. Converging Connection
Figure 6. Graphical Model Illustrating Bayesian Inference
Figure 8. The Effects of Different Strength Priors and Training Set Sizes
(A) In this case, the observed data are ten interaction sites, of which five have high conservation and five low. As expected, the likelihood peaks at p2 = 0.5. The prior is B(7,3), indicating prior knowledge that high conservation is found in interaction sites; it corresponds to adding seven pseudocounts to the C = high category, and three to C = low, and produces a prior peaked above p2 = 0.5. The posterior is also shown, along with the MAP estimate of p2. The influence of the prior information is clear in this case, where the observed counts are low.
(B) Learning from 100 training examples (75 high, 25 low). Here the weak B(7,3) prior has little influence over the posterior distribution, and with a large training set the ML and MAP estimates are similar (p2 ≈ 0.75). The posterior distribution for p2 is narrower: some of the uncertainty about its value has been removed given the evidence (training examples).
(C) Using a stronger prior B(70,30) still indicates that the most likely value for p2 is 0.7; however, note that the prior is narrower, so a lot of evidence would be needed to be convinced that p2 was less than 0.6, say. Small samples are more susceptible to noise than larger samples. For a training set with five high and five low conservation scores, the ML estimate (p2 = 0.5) is quite different from the MAP estimate of about 0.7, which takes into account the prior. Hopefully, this illustrates why priors are useful, but also cautions against choosing the wrong prior (or too strong/weak a prior)!
(D) This final example has a B(70,30) prior and shows ML and MAP estimates from training data with 75 high and 25 low conservation scores. This combination of a good prior and a larger training set is the example here with the least uncertainty about the value of p2.
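The four panels can be reproduced approximately with a few lines of arithmetic. This is a minimal sketch, assuming that B(a,b) denotes a Beta(a, b) prior on p2 (taken here to be the probability of C = high), that the counts follow a binomial likelihood, and that the MAP estimate is the mode of the resulting Beta posterior. The function names and the `panels` table are hypothetical.

```python
from math import sqrt


def ml_estimate(high, low):
    """Maximum likelihood estimate of p2: the observed fraction of 'high'."""
    return high / (high + low)


def map_estimate(high, low, a, b):
    """Mode of the Beta(a + high, b + low) posterior.

    The closed form holds when both posterior parameters exceed 1,
    which is true for every panel below.
    """
    return (a + high - 1) / (a + b + high + low - 2)


def posterior_sd(high, low, a, b):
    """Standard deviation of the Beta(a + high, b + low) posterior."""
    alpha, beta = a + high, b + low
    n = alpha + beta
    return sqrt(alpha * beta / (n * n * (n + 1)))


# (high count, low count, prior a, prior b) for panels A to D of the caption.
panels = {
    "A": (5, 5, 7, 3),
    "B": (75, 25, 7, 3),
    "C": (5, 5, 70, 30),
    "D": (75, 25, 70, 30),
}

if __name__ == "__main__":
    for label, (high, low, a, b) in panels.items():
        print(f"({label}) ML = {ml_estimate(high, low):.3f}  "
              f"MAP = {map_estimate(high, low, a, b):.3f}  "
              f"posterior sd = {posterior_sd(high, low, a, b):.3f}")
```

Under these assumptions the MAP estimate for panel (C) is 74/108, roughly 0.69, consistent with the "about 0.7" in the caption, and panel (D) has the smallest posterior standard deviation of the four.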