Qiulei Dong, Bo Liu, Zhanyi Hu.
Abstract
Recently, the DCNN (Deep Convolutional Neural Network) has been advocated as a general and promising modeling approach for neural object representation in the primate inferotemporal cortex. In this work, we show that an inherent non-uniqueness problem exists in DCNN-based modeling of image object representations. This non-uniqueness phenomenon reveals, to some extent, a theoretical limitation of this general modeling approach, and calls for due attention in practice.
Keywords: deep convolutional neural network; image object representation; inferotemporal cortex; neural object representation; non-uniqueness
Year: 2020 PMID: 32477087 PMCID: PMC7235366 DOI: 10.3389/fncom.2020.00035
Source DB: PubMed Journal: Front Comput Neurosci ISSN: 1662-5188 Impact factor: 2.380
Figure 1. DCNN1 and DCNN2 give different object representations x and y for the same input image object I; however, their object categorization performances are exactly the same if y′ = f(x′), where f(·) is an element-wise, non-linear, monotonically increasing function.
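A minimal numerical sketch of the invariance in the caption (random data; the array shapes and the particular f are illustrative assumptions, not from the paper): an element-wise, monotonically increasing f preserves the ordering of responses within each row, so argmax-based categorization is unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical final-layer representations x' for 100 inputs across 10
# categories (shapes are illustrative, not from the paper).
x = rng.normal(size=(100, 10))

# An element-wise, non-linear, monotonically increasing f(.),
# e.g. f(t) = t^3 + t (strictly increasing since f'(t) = 3t^2 + 1 > 0).
def f(t):
    return t**3 + t

y = f(x)

# Categorization by argmax is identical for x' and y' = f(x'), because a
# monotonically increasing map preserves the ordering within each row.
assert np.array_equal(x.argmax(axis=1), y.argmax(axis=1))
```

The representations themselves differ (y is a non-linear distortion of x), yet no behavioral readout of this kind can tell them apart, which is the non-uniqueness the paper describes.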
Network configurations (one network per column, D1–D6; layers are listed top to bottom, and an empty cell means the network has no further layer at that position).

| D1 | D2 | D3 | D4 | D5 | D6 |
| --- | --- | --- | --- | --- | --- |
| Conv5-32 | Conv3-bn-32 | Conv3-bn-64 | Conv3-bn-128 | Conv3-bn-32 | Conv3-bn-64 |
| | Conv3-bn-32 | Conv3-bn-64 | Conv3-bn-128 | Conv3-bn-32 | |
| | Conv3-bn-32 | | | | |
| | Conv3-bn-32 | | | | |
| Conv5-32 | Conv3-bn-64 | Conv3-bn-128 | Conv3-bn-256 | Conv3-bn-64 | Conv3-bn-128 |
| | Conv3-bn-64 | Conv3-bn-128 | Conv3-bn-256 | Conv3-bn-64 | |
| | Conv3-bn-64 | | | | |
| | Conv3-bn-64 | | | | |
| Conv5-64 | Conv3-bn-128 | Conv3-bn-256 | Conv3-bn-512 | Conv3-bn-128 | Conv3-bn-256 |
| | Conv3-bn-128 | Conv3-bn-256 | Conv3-bn-512 | Conv3-bn-128 | Conv3-bn-256 |
| | Conv3-bn-128 | | | | |
| | Conv3-bn-128 | | | | |
| Fc-64 | Conv3-bn-256 | Conv3-bn-512 | Conv3-bn-1024 | Conv3-bn-256 | Conv3-bn-512 |
| | Conv3-bn-256 | Conv3-bn-512 | | | |
| | | Conv3-bn-512 | | | |
| | | Conv3-bn-512 | | | |
| Fc-10 | Fc-10 | Fc-10(100) | Fc-100 | Fc-10 | Fc-10(100) |

Convolutional layer parameters are denoted as “Conv〈receptive field size〉-bn-〈number of channels〉.” Fully connected layer parameters are denoted as “Fc-〈number of units〉.”
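The layer notation in the table can be parsed mechanically. The helper below is a sketch (`parse_layer` is our name, not from the paper); the reading of “Fc-10(100)” as 10 units on CIFAR-10 and 100 on CIFAR-100 is our inference from the caption of Figure 2.

```python
import re

def parse_layer(spec):
    """Parse the table's layer notation into a structured description.

    "Conv<k>-bn-<c>"  -> conv layer, k x k receptive field, batch norm, c channels
    "Conv<k>-<c>"     -> conv layer without batch norm (e.g. D1's "Conv5-32")
    "Fc-<n>"          -> fully connected layer with n units
    "Fc-<n>(<m>)"     -> n units on CIFAR-10, m on CIFAR-100 (our reading)
    """
    m = re.fullmatch(r"Conv(\d+)(-bn)?-(\d+)", spec)
    if m:
        return {"type": "conv", "kernel": int(m.group(1)),
                "batch_norm": m.group(2) is not None,
                "channels": int(m.group(3))}
    m = re.fullmatch(r"Fc-(\d+)(?:\((\d+)\))?", spec)
    if m:
        out = {"type": "fc", "units": int(m.group(1))}
        if m.group(2):
            out["alt_units"] = int(m.group(2))
        return out
    raise ValueError(f"unrecognized layer spec: {spec}")

# Column D1 of the table, read top to bottom:
d1 = [parse_layer(s) for s in ["Conv5-32", "Conv5-32", "Conv5-64", "Fc-64", "Fc-10"]]
```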
Figure 2. (A) Categorization accuracies of {D1, D2, D3, D5, D6} with two random initializations on CIFAR-10 (Net1 and Net2 denote the same network under two initializations; similarly hereinafter). (B) Mean EVs on CIFAR-10 for all inputs (blue bars) and for only the correctly categorized inputs (orange bars). (C) Categorization accuracies of {D3, D4, D6} with two initializations on CIFAR-100. (D) Mean EVs on CIFAR-100 for all inputs (blue bars) and for only the correctly categorized inputs (orange bars).
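The EV (explained variance) metric is not defined in this excerpt; one common choice in this literature, assumed here, is the fraction of variance in one network's representation explained by an affine least-squares fit from the other's. A minimal sketch on synthetic data (all names and shapes are ours):

```python
import numpy as np

def explained_variance(x, y):
    """Fraction of variance in y (n_samples, n_units) explained by an
    affine least-squares map from x (n_samples, n_features).
    One common definition; the paper's exact EV may differ."""
    x_aug = np.hstack([x, np.ones((len(x), 1))])  # add intercept column
    w, *_ = np.linalg.lstsq(x_aug, y, rcond=None)
    resid = y - x_aug @ w
    return 1.0 - resid.var() / y.var()

rng = np.random.default_rng(1)
x = rng.normal(size=(200, 8))        # synthetic "Net1" representation
y_lin = x @ rng.normal(size=(8, 8))  # exactly linearly related to x -> EV ~ 1
y_rand = rng.normal(size=(200, 8))   # unrelated to x -> EV near 0
```

Under this definition, a high mean EV between two networks indicates their representations are (close to) linearly related, while the non-uniqueness result concerns transforms that need not be linear.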
Figure 3. Categorization accuracies and mean EVs under different levels of noise: (A) Categorization accuracies of similarly performing pairs of DCNNs. (B) Mean EVs of similarly performing pairs of DCNNs.
Figure 4. Mean EVs with different image samples: (A) Samples randomly selected from the whole test image set. (B) Samples randomly selected from the set of correctly categorized images only.
Figure 5. Mean EVs with different percentages of selective neurons.