Abstract
The mutual information between the state of a neural network and the state of the external world represents the amount of information stored in the neural network that is associated with the external world. In contrast, the surprise of a sensory input indicates how unpredictable the current input is; in this sense, it measures the network's inference ability, and an upper bound on the surprise is known as the variational free energy. According to the free-energy principle (FEP), a neural network continuously minimizes the free energy to perceive the external world. For the survival of animals, inference ability is considered more important than the mere amount of memorized information. In this study, the free energy is shown to represent the gap between the amount of information stored in the neural network and the amount available for inference. This concept connects the FEP with the infomax principle and provides a useful measure for quantifying the amount of information available for inference.
Keywords: free-energy principle; independent component analysis; infomax principle; internal model hypothesis; principal component analysis; unconscious inference
Year: 2018 | PMID: 33265602 | PMCID: PMC7513032 | DOI: 10.3390/e20070512
Source DB: PubMed | Journal: Entropy (Basel) | ISSN: 1099-4300 | Impact factor: 2.524
Table 1. Glossary of expressions (symbols reconstructed following standard free-energy-principle notation; the original symbol column was not preserved in this record).
| Expression | Description |
|---|---|
| Generative process | A set of stochastic equations that generate the external-world dynamics |
| Recognition model | A model in the neural network that imitates the inverse of the generative process |
| Generative model | A model in the neural network that imitates the generative process |
| $s$ | Hidden sources |
| $x$ | Sensory inputs |
| $\theta$ | A set of parameters |
| $\lambda$ | A set of hyper-parameters |
| $\vartheta \equiv \{s, \theta, \lambda\}$ | A set of hidden states of the external world |
| $u$ | Neural outputs |
| $W$ | Synaptic strength matrices |
| $\gamma$ | State of neuromodulators |
| $\phi \equiv \{u, W, \gamma\}$ | A set of the internal states of the neural network |
| $\omega$ | Background noises |
| $\varepsilon$ | Reconstruction errors |
| $p(x)$ | The actual probability density of $x$ |
| $p(s \vert x), p(\theta \vert x), p(\lambda \vert x)$ | Actual probability densities (posterior densities) |
| $p(s), p(\theta), p(\lambda)$ | Prior densities |
| $p(x \vert \vartheta)$ | Likelihood function |
| $q(x), q(\vartheta), q(x, \vartheta)$ | Statistical models |
| $\sigma_x$ | Finite spatial resolution of $x$ |
| $\langle \cdot \rangle_p$ | Expectation of $\cdot$ over $p$ |
| $H[p]$ | Shannon entropy of $p$ |
| $H[p, q]$ | Cross entropy of $p$ and $q$ |
| $D_{\mathrm{KL}}[p \,\Vert\, q]$ | KLD between $p$ and $q$ |
| $I(x; \phi)$ | Mutual information between $x$ and $\phi$ |
| $-\ln q(x)$ | Surprise |
| $\langle -\ln q(x) \rangle_{p(x)}$ | Surprise expectation |
| $F(x)$ | Free energy |
| $\langle F(x) \rangle_{p(x)}$ | Free energy expectation |
| $I_u(x; \phi)$ | Utilizable information between $x$ and $\phi$ |
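With this notation in place, the variational free energy invoked in the abstract admits the standard FEP decomposition below. This is the generic identity, written in the glossary's (reconstructed) symbols rather than quoted from the paper; $q(\vartheta \vert \phi)$ denotes the recognition density encoded by the internal states, and $q(\vartheta \vert x) = q(x, \vartheta)/q(x)$ is the posterior under the statistical model.

$$
\begin{aligned}
F(x) &= \bigl\langle \ln q(\vartheta \vert \phi) - \ln q(x, \vartheta) \bigr\rangle_{q(\vartheta \vert \phi)} \\
&= \underbrace{-\ln q(x)}_{\text{surprise}} \;+\; \underbrace{D_{\mathrm{KL}}\bigl[\, q(\vartheta \vert \phi) \,\Vert\, q(\vartheta \vert x) \,\bigr]}_{\geq\, 0} \;\;\geq\; -\ln q(x).
\end{aligned}
$$

Because the KLD term is non-negative, $F(x)$ upper-bounds the surprise; minimizing $F$ therefore improves the recognition density while reducing the unpredictability of the inputs, which is the sense in which the FEP describes perception.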
Figure 1. Schematic images of the generative process of the environment (left) and the recognition and generative models of the neural network (right). Note that the neural network can access only the states on the right side of the dashed line, including x (see text in Section 2.2). Black arrows indicate causal relationships in the external world. Blue arrows indicate information flows in the neural network (i.e., actual causal relationships in the neural network), while red arrows indicate hypothesized causal relationships (intended to imitate the external world) under the generative model. See the main text and Table 1 for the meanings of variables and functions.
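As a toy, linear instance of the architecture in Figure 1, the sketch below wires a generative process, a recognition model, and a generative model together. It is a hypothetical illustration, not the paper's code: the dimensions, the Laplacian sources, the noise level, and the least-squares generative weights `W_gen` are all our choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Generative process (external world): x = A s + omega ----------------
n_src, n_obs, n_samp = 2, 4, 10_000
s = rng.laplace(size=(n_src, n_samp))            # hidden sources (non-Gaussian)
A = rng.normal(size=(n_obs, n_src))              # parameter (mixing) matrix
omega = 0.05 * rng.normal(size=(n_obs, n_samp))  # background noise
x = A @ s + omega                                # sensory inputs

# --- Recognition model (neural network): u = W x -------------------------
W = rng.normal(size=(n_src, n_obs))              # synaptic strength matrix
u = W @ x                                        # neural outputs

# --- Generative model: reconstruct x from u ------------------------------
# Least-squares generative weights W_gen (our stand-in for the network's
# generative model); eps holds the reconstruction errors.
W_gen = x @ u.T @ np.linalg.inv(u @ u.T)
eps = x - W_gen @ u
print("mean squared reconstruction error:", float(np.mean(eps**2)))
```

Here the recognition model (u = W x) carries the blue information flow of Figure 1, while `W_gen` plays the role of the red, hypothesized causal arrows mapping internal states back to inputs.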
Figure 2. Relationship between information measures. The mutual information between the inputs and the internal states of the neural network is less than or equal to the Shannon entropy of the inputs because of the information loss in the recognition model. The utilizable information is less than or equal to the mutual information, and the gap between them gives the expectation of the variational free energy, which quantifies the loss in the generative model. The sum of the principal component analysis (PCA) and independent component analysis (ICA) costs is equal to the gap between the Shannon entropy and the utilizable information, expressing the sum of the losses in the recognition and generative models.
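The caption's relationships can be written compactly in the glossary's (reconstructed) notation; this is our rendering of Figure 2, not an equation quoted from the paper:

$$
I_u \;\leq\; I(x; \phi) \;\leq\; H[p(x)], \qquad \langle F(x) \rangle_{p(x)} \;=\; I(x; \phi) - I_u,
$$

$$
\underbrace{H[p(x)] - I_u}_{\text{PCA + ICA costs}} \;=\; \underbrace{\bigl(H[p(x)] - I(x; \phi)\bigr)}_{\text{recognition-model loss}} \;+\; \underbrace{\bigl(I(x; \phi) - I_u\bigr)}_{\text{generative-model loss}}.
$$

Reading the chain left to right: the recognition model can lose information relative to the entropy of the inputs, and the generative model can fail to exploit what was kept; only the remainder, $I_u$, is available for inference.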
Figure 3. Difference between the infomax principle and the free-energy principle (FEP) when the sources follow a non-Gaussian distribution. Black, blue, and red circles indicate the results obtained when W is a random matrix, when W is optimized under the infomax principle (i.e., PCA), and when W is optimized under the FEP, respectively.
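The qualitative effect in Figure 3 can be reproduced with a small simulation. The sketch below is an illustrative stand-in rather than the paper's experiment: PCA realizes the infomax solution, FastICA stands in for the FEP-optimized W (which the paper relates to ICA), and all other settings are our assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(1)

# Non-Gaussian (Laplacian) sources, linearly mixed into sensory inputs.
n_src, n_samp = 3, 50_000
S = rng.laplace(size=(n_samp, n_src))   # true hidden sources
A = rng.normal(size=(n_src, n_src))     # mixing matrix
X = S @ A.T                             # sensory inputs (rows = samples)

def source_recovery(U, S):
    """Best absolute correlation of each recovered component with any source."""
    k = U.shape[1]
    C = np.corrcoef(U.T, S.T)[:k, k:]   # cross-correlation block
    return np.abs(C).max(axis=1)

U_rand = X @ rng.normal(size=(n_src, n_src))      # random W (black circles)
U_pca = PCA(n_components=n_src).fit_transform(X)  # infomax/PCA (blue circles)
U_ica = FastICA(n_components=n_src,               # FEP stand-in (red circles)
                random_state=0).fit_transform(X)

for name, U in [("random W", U_rand), ("PCA", U_pca), ("ICA", U_ica)]:
    print(f"{name:9s} |corr| with sources:", np.round(source_recovery(U, S), 2))
```

With super-Gaussian (Laplacian) sources, PCA merely decorrelates the mixtures, so its components remain mixtures of the sources, whereas the ICA stand-in recovers components that correlate strongly with the true sources, mirroring the advantage the figure attributes to the FEP.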