Bruno J. T. Fernandes, George D. C. Cavalcanti, Tsang I. Ren.
Abstract
Autoassociative artificial neural networks have been used in many different computer vision applications. However, it is difficult to define the most suitable neural network architecture, because this definition is based on previous knowledge and depends on the problem domain. To address this problem, we propose the Constructive Autoassociative Neural Network (CANet). CANet integrates the concepts of receptive fields and autoassociative memory in a dynamic architecture that changes the configuration of the receptive fields by adding new neurons in the hidden layer, while a pruning algorithm removes neurons from the output layer. Neurons in the CANet output layer have lateral inhibitory connections that improve the recognition rate. Experiments in face recognition and facial expression recognition show that CANet outperforms other methods presented in the literature.
Year: 2014 PMID: 25542018 PMCID: PMC4277427 DOI: 10.1371/journal.pone.0115967
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. CANet training model for a set of training images from a given class; the resulting class-specific CANet is the output of the model.
Notation and definitions used to describe the CANet.
| Symbol | Description |
| | Value of a pixel in the input image |
| | Receptive field size of a neuron of the CANet constructive layer |
| | Size of the inhibitory field in the reconstruction layer |
| | Strength of the lateral inhibition in the reconstruction layer |
| | Weights associated with a position of the input layer and with a neuron of the constructive layer, connecting to the reconstruction layer, respectively |
| | Receptive fields of a neuron of the constructive layer in the input and reconstruction layers, respectively |
| | Bias of a neuron of the constructive layer |
| | Outputs of a neuron of the constructive layer and of a neuron of the reconstruction layer, respectively |
| | Activation function |
| | Error sensitivities for a neuron of a given layer |
| | Weighted sum input for a neuron of the constructive layer and for a neuron of the reconstruction layer, for an image |
| | Adaptation rule of the RPROP algorithm |
| | Increase and decrease factors of the RPROP algorithm |
| | Maximum number of hidden neurons |
| | Number of neurons not removed by the pruning algorithm in the reconstruction layer |
| | Maximum and minimum mean error rates of the output-layer neurons contained in the receptive field of a neuron of the constructive layer, respectively |
| | Mean error rate of a neuron of the reconstruction layer |
| | Error gradient |
Figure 2. CANet architecture composed of 2-D layers in a bottleneck shape for image autoassociation:
(a) network layers in which neurons are connected to receptive fields with different sizes in the input and output layers; (b) connectivity model of a neuron in the constructive layer.
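The connectivity model in Figure 2(b), one constructive-layer neuron reading a square receptive field of the input, can be sketched as follows. The sigmoid activation, the variable names, and the toy values are assumptions made for illustration, not taken from the paper.

```python
import numpy as np

def neuron_output(image, top, left, size, weights, bias):
    """Output of one constructive-layer neuron: an activation function
    applied to the weighted sum of the pixels inside its square
    receptive field. A sigmoid stands in for the paper's activation."""
    field = image[top:top + size, left:left + size]
    z = np.sum(field * weights) + bias      # weighted sum input
    return 1.0 / (1.0 + np.exp(-z))         # sigmoid activation

# Toy usage: a 4x4 image and one neuron covering the top-left 2x2 field.
img = np.arange(16, dtype=float).reshape(4, 4) / 16.0
w = np.full((2, 2), 0.5)
out = neuron_output(img, 0, 0, 2, w, bias=-0.1)
```

Because each neuron sees only its receptive field, the hidden layer stays local; the constructive algorithm below decides how many such neurons exist and how large their fields are.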
Algorithm 1: Pseudocode of the constructive-pruning algorithm.
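Since the pseudocode itself did not survive in this record, a minimal sketch of the constructive phase is given below, assuming a quadtree split of one receptive field into four per round; splitting the largest field stands in for the paper's error-based selection criterion, and all names are illustrative.

```python
def grow_hidden_layer(initial_size, max_hidden):
    """Constructive phase sketch: start from a single receptive field
    covering the whole layer and repeatedly replace one field by its
    four quadrants, adding hidden neurons until the budget is reached."""
    fields = [(0, 0, initial_size)]              # (top, left, size)
    while len(fields) + 3 <= max_hidden:
        fields.sort(key=lambda f: -f[2])         # largest field first
        top, left, size = fields.pop(0)
        half = size // 2
        if half == 0:                            # 1-pixel fields cannot split
            fields.insert(0, (top, left, size))
            break
        fields += [(top + dt, left + dl, half)
                   for dt in (0, half) for dl in (0, half)]
    return fields

fields = grow_hidden_layer(8, 7)   # one 8x8 field split until 7 neurons exist
```

Each split replaces one field by four, so the fields always tile the layer exactly; the total covered area is invariant.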
Figure 3. Quadtree model of the receptive-field hierarchy.
Initially there is a single receptive field with the same height and width as the output layer. This field is divided into four receptive fields of half the height and width, and one of those is in turn divided into four receptive fields whose sides are halved again.
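The subdivision in Figure 3 halves a field's height and width at each level; a minimal sketch, with an assumed (top, left, height, width) coordinate convention:

```python
def split_field(top, left, height, width):
    """Quadtree split: one receptive field of size height x width
    becomes four fields of size (height/2) x (width/2)."""
    h, w = height // 2, width // 2
    return [(top, left, h, w), (top, left + w, h, w),
            (top + h, left, h, w), (top + h, left + w, h, w)]

# Two levels of splitting, starting from a field spanning a 16x16 layer.
level1 = split_field(0, 0, 16, 16)     # four 8x8 fields
level2 = split_field(*level1[0])       # the first 8x8 field -> four 4x4 fields
```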
Figure 4. CANet pruning algorithm.
The mean error rates are sorted and the neurons associated with the lowest rates are kept in the reconstruction layer.
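The pruning step just described, sort the mean error rates and keep the output neurons with the lowest ones, can be sketched as follows; the function name and toy values are illustrative.

```python
import numpy as np

def prune_output_neurons(mean_errors, n_keep):
    """Sort mean error rates and keep the n_keep reconstruction-layer
    neurons with the lowest rates; returns surviving neuron indices."""
    order = np.argsort(mean_errors)     # ascending: lowest error first
    return np.sort(order[:n_keep])

errs = np.array([0.30, 0.05, 0.20, 0.10, 0.40])
kept = prune_output_neurons(errs, 3)    # neurons 1, 2 and 3 survive
```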
Figure 5. Multi-class recognition system of the CANet.
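One CANet is trained per class (Figure 1), so the multi-class decision presumably assigns an input to the class whose network reconstructs it best. The rule below is the standard autoassociative classification scheme, with trivial stand-in reconstructors in place of trained CANets.

```python
import numpy as np

def classify(image, reconstructors):
    """Assign the image to the class whose autoassociative model
    reconstructs it with the smallest mean squared error."""
    errors = [np.mean((image - rec(image)) ** 2) for rec in reconstructors]
    return int(np.argmin(errors))

# Toy stand-ins for per-class models: class 0 "prefers" bright images,
# class 1 dark images.
recs = [lambda x: np.clip(x, 0.5, 1.0), lambda x: np.clip(x, 0.0, 0.5)]
bright_label = classify(np.full((4, 4), 0.9), recs)
dark_label = classify(np.full((4, 4), 0.1), recs)
```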
Figure 6. JAFFE images after pre-processing.
Figure 7. Facial expression recognition rate (%) for different values of the maximum number of hidden neurons.
Facial expression recognition rate (%) for different configurations of inhibitory field size and inhibition strength of the CANet on the JAFFE database.
| Inhibitory configuration | Recognition rate (std) |
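How the inhibitory field size and inhibition strength might interact can be illustrated with a generic lateral-inhibition pass over the 2-D reconstruction layer. The exact formula used by the CANet is not reproduced in this record, so the subtractive form below is an assumption.

```python
import numpy as np

def lateral_inhibition(activations, field_size, strength):
    """Generic subtractive lateral inhibition: each neuron is reduced in
    proportion (strength) to the mean activation of the neighbours inside
    a square inhibitory field of radius field_size around it."""
    h, w = activations.shape
    out = np.empty_like(activations)
    for i in range(h):
        for j in range(w):
            t, b = max(0, i - field_size), min(h, i + field_size + 1)
            l, r = max(0, j - field_size), min(w, j + field_size + 1)
            window = activations[t:b, l:r]
            # Mean of the neighbours, excluding the neuron itself.
            neigh = (window.sum() - activations[i, j]) / (window.size - 1)
            out[i, j] = activations[i, j] - strength * neigh
    return out

act = np.zeros((3, 3)); act[1, 1] = 1.0
inhibited = lateral_inhibition(act, field_size=1, strength=0.5)
```

A strongly active neuron with quiet neighbours is left untouched, while its neighbours are suppressed, which sharpens local maxima in the reconstruction.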
Comparison between the facial expression recognition rate (%) obtained by CANet and by different methods with feature extraction, using the first test approach, on the JAFFE database.
| Method | Recognition rate (std) |
| CANet | |
| AAPNet | |
| Gabor + LVQ | |
| GSNMF | |
| SNMF | |
| DNMF | |
| NMF | |
| Laplacianfaces | |
| Fisherfaces | |
| Eigenfaces | |
Confusion matrix of the CANet: probability that the expression in each row is classified as the expression in each column, with the first test approach, on the JAFFE database (SU: surprise, HA: happiness, AN: anger, DI: disgust, SA: sadness, NE: neutral, FE: fear).
| | SU | HA | AN | DI | SA | NE | FE |
| SU | | | | | | | |
| HA | | | | | | | |
| AN | | | | | | | |
| DI | | | | | | | |
| SA | | | | | | | |
| NE | | | | | | | |
| FE | | | | | | | |
Comparison between the facial expression recognition rate (%) obtained by CANet and by different methods without feature extraction, using the second test approach, on the JAFFE database.
| Method | Recognition rate (std) |
| CANet | |
| AAPNet | |
| Gaussian process | |
| 3-NN | |
Comparison between the error rate (%) for face recognition obtained by CANet and by different methods from the work of Zhu et al., on the ORL database.
| Method | Number of training images per class | | |
| | 3 | 4 | 5 |
| CANet | | | |
| IMSEC | | | |
| CMSE | | | |
| CRC | | | |
| SRC | | | |
| Eigenface | | | |
| Fisherface | | | |
| 1-NN | | | |
| 2DPCA | | | |
| 2DLDA | | | |
Comparison between the face recognition rate (%) obtained by CANet and by different methods from the work of Mi et al., on the AR database.
| Method | Recognition rate |
| CANet | |
| RLRC 1 | |
| RLRC 2 | |
| LRC | |
| SRC-KNS | |