Dileep George, Miguel Lázaro-Gredilla, J Swaroop Guntupalli.
Abstract
Despite the recent progress in AI powered by deep learning in solving narrow tasks, we are not close to human intelligence in its flexibility, versatility, and efficiency. Efficient learning and effective generalization come from inductive biases, and building Artificial General Intelligence (AGI) is an exercise in finding the right set of inductive biases that make fast learning possible while being general enough to be widely applicable in tasks that humans excel at. To make progress in AGI, we argue that we can look at the human brain for such inductive biases and principles of generalization. To that effect, we propose a strategy to gain insights from the brain by simultaneously looking at the world it acts upon and the computational framework to support efficient learning and generalization. We present a neuroscience-inspired generative model of vision as a case study for such an approach and discuss some open problems about the path to AGI.
Keywords: AGI; Recursive Cortical Network; biologically guided inductive biases; generative model; neuroscience inspired AI
Year: 2020 PMID: 33192426 PMCID: PMC7645629 DOI: 10.3389/fncom.2020.554097
Source DB: PubMed Journal: Front Comput Neurosci ISSN: 1662-5188 Impact factor: 2.380
Figure 1. The triangulation strategy for extracting principles from the brain looks at three aspects at the same time.
Biological features and their computational counterparts that were simultaneously considered in the development of the RCN visual generative model.
| Biological feature | Computational counterpart | Realization in RCN |
| --- | --- | --- |
| Blobs and interblobs | Curvelet-like smoothness of natural signals, an example of which is contour-surface factorization | Structure of the contour-surface factor |
| Lateral connections between interblob columns | Higher-order contour-continuity in natural signals | Cloned structure of lateral connections for higher-order interactions |
| Object-based top-down attention | Compositionality, modularity | Only positive weights. Object-background factorization |
| Hierarchy | Efficient learning and inference | Hierarchically structured |
| Border-ownership coding | Required when objects are represented in a factorized and hierarchical manner | Two clones of each feature for border-ownership coding |
| Feedback connections | Inference requires explaining away when the representation is compositional | Message-passing algorithms automatically do explaining away |
| Different dynamics for contour and surface features | Convergence of message-passing depends on the schedule | Biologically inspired message-passing schedule works better |
Figure 5. (A) Object recognition and object segmentation are two entangled problems for which a joint solution is necessary. (B) Test data with types of noise that were never seen during training degrade the performance of a CNN significantly more than that of RCN. (C) Occlusion reasoning increases the performance of RCN when objects overlap and allows hidden edges of an object to be recovered. (D) An example of parsing by RCN on a real-world image. (E) RCN can render novel variations of characters after one-shot training. (F) A variational autoencoder and RCN try to reconstruct a digit in the presence of a type of noise that neither was trained on. (G) RCN is a single vision model that can perform all the required functions, as opposed to a collection of different models solving each of them.
Figure 2. Contour-surface factorization. (A) The primary visual cortex has columns that are divided into blobs and interblobs, and the segregation remains in how they project to V2 (image credit: Federer et al., 2009). (B) People can recognize objects with unusual appearances, even when they are exposed to them for the very first time. (C) RCN consists of a contour hierarchy and a surface model. The surface model is a CRF. The factors between different surface patches encode surface similarity in the neighborhood, and those factors are gated by the contour factors. (D) Different surface patterns can be generated by instantiating a particular set of contours, and then sampling from the surface model.
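The contour-gated surface model described in panels (C) and (D) can be illustrated with a toy sketch. This is not the RCN implementation: it assumes a one-dimensional row of surface patches, a Potts-style similarity factor between neighbors, and a hard gate that switches the factor off wherever a contour separates two patches. The function name `sample_surface_row` and the parameters `n_labels` and `coupling` are hypothetical and chosen only for illustration.

```python
import math
import random


def sample_surface_row(contours, n_labels=4, coupling=6.0, rng=None):
    """Sample surface labels for a 1-D row of len(contours) + 1 patches.

    contours[i] is True when a contour separates patch i from patch i + 1.
    Toy assumption: neighbors are tied by a Potts factor exp(coupling * [a == b])
    unless a contour gates that factor off.
    """
    rng = rng or random.Random()
    # Probability that two coupled neighbors share a label under the Potts factor.
    p_same = math.exp(coupling) / (math.exp(coupling) + n_labels - 1)
    row = [rng.randrange(n_labels)]
    for has_contour in contours:
        if has_contour:
            # Contour present: the similarity factor is gated off,
            # so the next patch is drawn independently of the previous one.
            row.append(rng.randrange(n_labels))
        elif rng.random() < p_same:
            # No contour: the neighbor most likely copies the current label.
            row.append(row[-1])
        else:
            others = [k for k in range(n_labels) if k != row[-1]]
            row.append(rng.choice(others))
    return row


# One contour between patches 2 and 3 splits the row into two surfaces.
contours = [False, False, True, False, False]
for seed in range(3):
    print(sample_surface_row(contours, rng=random.Random(seed)))
```

Keeping the contours fixed and resampling yields different surface fillings of the same shape, which is the behavior panel (D) describes. Sampling left to right is exact here only because the toy model is a chain with symmetric factors; a two-dimensional CRF would need a more general sampler.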
Figure 3. Lateral connections in the visual cortex and their computational significance. (A) Lateral connections project long distances and connect to columns that are of similar orientation (Bosking et al., 1997). (B) Analysing the co-occurrence statistics of contours in natural images shows that they have higher-order structure than just pair-wise (image credit: Lawlor and Zucker, 2013). (C) Visualization of third-order structure in natural contours shows that co-circularity and collinearity are represented (image credit: Lawlor and Zucker, 2013). (D) Lateral connections in RCN enforce contour continuity. (E) Samples with and without lateral connections. (F) The effect of flexibility of lateral connections at different levels in RCN.
Figure 4. The need for explaining away in parsing visual scenes. (A) Local evidence suggests that the character is “m.” Incorporating global context shows that “un” is a better explanation. Even if “m” is strong in a feed-forward pass of inference, its evidence needs to be explained away. (B) Feedforward-feedback iterations are used to explain away evidence to arrive at the globally best solution. Alternative partial explanations are hallucinated in the process of analyzing the scene.
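The explaining-away effect in panel (A) can be reproduced in a minimal probabilistic sketch. It assumes a two-cause noisy-OR model in which the same observed strokes can be produced by an "m" hypothesis or by a "un" hypothesis. The priors, the `leak` and `strength` parameters, and the function name `posterior_m` are illustrative assumptions, not values or methods taken from RCN.

```python
from itertools import product


def posterior_m(prior_m, prior_un, observe_un=None, leak=0.01, strength=0.9):
    """Return P(m = 1 | strokes seen), optionally also conditioning on "un".

    Toy two-cause noisy-OR model: each active cause independently produces
    the strokes with probability `strength`; `leak` covers all other causes.
    """
    num = den = 0.0
    for m, un in product((0, 1), repeat=2):
        if observe_un is not None and un != observe_un:
            continue  # inconsistent with what is known about "un"
        p_causes = (prior_m if m else 1 - prior_m) * (prior_un if un else 1 - prior_un)
        # Noisy-OR likelihood of the strokes given which causes are active.
        p_strokes = 1 - (1 - leak) * (1 - strength) ** (m + un)
        joint = p_causes * p_strokes
        den += joint
        num += joint * m
    return num / den


local_only = posterior_m(0.3, 0.3)                  # strokes alone
with_context = posterior_m(0.3, 0.3, observe_un=1)  # context confirms "un"
print(f"P(m | strokes)          = {local_only:.3f}")
print(f"P(m | strokes, un seen) = {with_context:.3f}")
```

Once the competing explanation is confirmed, the belief in "m" falls back toward its prior even though the local evidence is unchanged. In a model with many interacting hypotheses this brute-force enumeration is infeasible, which is where the iterative feedforward-feedback message passing of panel (B) comes in.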