| Literature DB >> 28212422 |
Michael R. W. Dawson, Maya Gupta.
Abstract
Probability matching occurs when the behavior of an agent matches the likelihood of occurrence of events in the agent's environment. For instance, when artificial neural networks match probability, the activity in their output unit equals the past probability of reward in the presence of a stimulus. Our previous research demonstrated that simple artificial neural networks (perceptrons, which consist of a set of input units directly connected to a single output unit) learn to match probability when presented with different cues in isolation. The current paper extends this research by showing that perceptrons can match probabilities when presented with simultaneous cues, each cue signaling a different reward likelihood. In our first simulation, we presented up to four different cues simultaneously; the likelihood of reward signaled by the presence of one cue was independent of the likelihood of reward signaled by the other cues. Perceptrons learned to match reward probabilities by treating each cue as an independent source of information about the likelihood of reward. In a second simulation, we violated the independence between cues by making some reward probabilities depend upon cue interactions: reward probabilities were based on a logical combination (AND or XOR) of two of the four possible cues. We also varied the size of the reward associated with the logical combination. This latter manipulation proved a much better predictor of perceptron performance than the logical structure of the interaction between cues. This indicates that when perceptrons learn to match probabilities, they do so by assuming that each signal of a reward is independent of any other; the best predictor of perceptron performance is a quantitative measure of the independence of these input signals, not the logical structure of the problem being learned.
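To make the probability-matching claim concrete, the sketch below trains a single logistic output unit (a perceptron in the abstract's sense) with the gradient-descent delta rule on stochastically delivered rewards. All numerical values here (per-cue log-odds contributions, learning rate, epoch count) are illustrative assumptions rather than the paper's training parameters, and the target probabilities are generated so that additive cue contributions can match them.

```python
import numpy as np

rng = np.random.default_rng(1)

n_cues = 4  # cues A, B, X, Y

# All 16 binary patterns, ordered 0000 through 1111 (A, B, X, Y).
patterns = np.array([[(i >> k) & 1 for k in range(n_cues - 1, -1, -1)]
                     for i in range(2 ** n_cues)], dtype=float)

# Assumed per-cue log-odds contributions and bias: each cue independently
# raises the odds of reward, so a single logistic unit can match exactly.
cue_logits = np.array([0.5, 1.0, 1.5, 2.0])
reward_p = 1.0 / (1.0 + np.exp(-(patterns @ cue_logits - 2.0)))

w = np.zeros(n_cues)  # connection weights
bias = 0.0
lr = 0.1              # assumed learning rate

def respond(x):
    return 1.0 / (1.0 + np.exp(-(x @ w + bias)))  # logistic activation

for epoch in range(5000):
    for x, p in zip(patterns, reward_p):
        r = float(rng.random() < p)      # reward delivered stochastically
        y = respond(x)
        delta = (r - y) * y * (1.0 - y)  # delta rule for a logistic unit
        w += lr * delta * x
        bias += lr * delta

# The trained responses approximate each pattern's reward probability.
for x, p in zip(patterns, reward_p):
    print(x.astype(int), f"p(reward)={p:.2f}  response={respond(x):.2f}")
```

After training, the printed responses hover near each pattern's reward probability, which is the matching behavior the abstract describes.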
Year: 2017 PMID: 28212422 PMCID: PMC5315326 DOI: 10.1371/journal.pone.0172431
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. The responses of a perceptron to each of the 16 types of input patterns over the course of 1000 epochs of training.
Responses are recorded after 1, 5, 10, 15, 20, 25, 50, 75, and 100 epochs, and then every 100 epochs until 1000 epochs of training have been completed. The legend indicates which of the four cues are present in each of the 16 different stimulus patterns; the order of items in the legend matches the order of the lines as they are ‘stacked’ in the graph.
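The caption's recording schedule can be enumerated directly; the exact sample points below are inferred from the caption, not taken from any code released with the paper:

```python
# Recording epochs implied by the caption: dense early sampling, then every 100.
epochs = [1, 5, 10, 15, 20, 25, 50, 75, 100] + list(range(200, 1001, 100))
print(len(epochs), epochs)  # 18 sample points, from epoch 1 through 1000
```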
Relation between the actual probability associated with each type of input pattern in one training set and a typical network’s responses to the patterns.
| Conditional Probability | Input Pattern (A, B, X, Y) | Actual Probability | Network Response |
|---|---|---|---|
| P(R|~A~B~X~Y) | 0,0,0,0 | 0.00 | 0.12 |
| P(R|~A~B~XY) | 0,0,0,1 | 0.63 | 0.65 |
| P(R|~A~BX~Y) | 0,0,1,0 | 0.64 | 0.57 |
| P(R|~A~BXY) | 0,0,1,1 | 0.90 | 0.95 |
| P(R|~AB~X~Y) | 0,1,0,0 | 0.35 | 0.32 |
| P(R|~AB~XY) | 0,1,0,1 | 0.88 | 0.86 |
| P(R|~ABX~Y) | 0,1,1,0 | 0.81 | 0.81 |
| P(R|~ABXY) | 0,1,1,1 | 0.98 | 0.98 |
| P(R|A~B~X~Y) | 1,0,0,0 | 0.21 | 0.19 |
| P(R|A~B~XY) | 1,0,0,1 | 0.79 | 0.76 |
| P(R|A~BX~Y) | 1,0,1,0 | 0.70 | 0.69 |
| P(R|A~BXY) | 1,0,1,1 | 0.96 | 0.97 |
| P(R|AB~X~Y) | 1,1,0,0 | 0.47 | 0.44 |
| P(R|AB~XY) | 1,1,0,1 | 0.94 | 0.91 |
| P(R|ABX~Y) | 1,1,1,0 | 0.77 | 0.88 |
| P(R|ABXY) | 1,1,1,1 | 0.96 | 0.99 |
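As a quick check on how closely this typical network matches the training probabilities, the paper's R² measure can be recomputed from the two numeric columns above (reading R² as the squared Pearson correlation, one standard interpretation):

```python
import numpy as np

# Actual probabilities and network responses, copied from the table above
# (patterns ordered 0000 through 1111).
actual = np.array([0.00, 0.63, 0.64, 0.90, 0.35, 0.88, 0.81, 0.98,
                   0.21, 0.79, 0.70, 0.96, 0.47, 0.94, 0.77, 0.96])
response = np.array([0.12, 0.65, 0.57, 0.95, 0.32, 0.86, 0.81, 0.98,
                     0.19, 0.76, 0.69, 0.97, 0.44, 0.91, 0.88, 0.99])

r = np.corrcoef(actual, response)[0, 1]  # Pearson correlation
print(f"R^2 = {r ** 2:.3f}")             # close to 1 for this network
```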
The mean R² (with standard deviations) between network responses and actual probabilities for the 16 input patterns in each of the four types of training sets.
| Training Set | Reward Probability Equals 0.6 | Reward Probability Equals 0.1 |
|---|---|---|
| AND of Cue X and Cue Y | 0.76 (0.03) | 0.88 (0.03) |
| XOR of Cue X and Cue Y | 0.28 (0.05) | 0.87 (0.03) |
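The R² pattern above (XOR is only hard when its reward is large) follows from linear nonseparability: a perceptron computes a weighted sum of its inputs, and no weighted sum of cues X and Y can reproduce the low-high-high-low reward profile that XOR demands, so the mismatch grows with the size of the interaction reward. The sketch below shows one plausible way such a training set could be generated; only the AND/XOR rules and the 0.6/0.1 reward sizes come from the paper, while the base probability and trial layout are assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

def interaction(x, y, rule):
    # Logical combination of cues X and Y that gates the extra reward.
    return (x & y) if rule == "AND" else (x ^ y)

def make_trials(n_trials, rule="XOR", interaction_reward=0.6, base_reward=0.0):
    # base_reward is an assumed placeholder for whatever the non-interacting
    # cues contribute in the paper's actual training sets.
    trials = []
    for _ in range(n_trials):
        a, b, x, y = rng.integers(0, 2, size=4)
        p = interaction_reward if interaction(x, y, rule) else base_reward
        trials.append(([a, b, x, y], float(rng.random() < p)))
    return trials

# The hard condition from the table: XOR with a large (0.6) interaction reward.
trials = make_trials(1000, rule="XOR", interaction_reward=0.6)
```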