Shan Pang, Xinyi Yang.
Abstract
In recent years, deep learning methods such as the convolutional neural network (CNN) and the deep belief network (DBN) have been developed and applied to image classification. However, they suffer from problems such as local minima, slow convergence, and intensive human intervention. In this paper, we propose a rapid learning method, the deep convolutional extreme learning machine (DC-ELM), which combines the feature abstraction power of CNN with the fast training of ELM. It uses multiple alternating convolution and pooling layers to effectively abstract high-level features from input images. The abstracted features are then fed to an ELM classifier, which yields better generalization performance with faster learning speed. DC-ELM also introduces stochastic pooling in the last hidden layer to greatly reduce the dimensionality of the features, saving much training time and computation. We systematically evaluated DC-ELM on two handwritten digit data sets, MNIST and USPS. Experimental results show that our method achieves better testing accuracy with significantly shorter training time than deep learning methods and other ELM methods.
Year: 2016 PMID: 27610128 PMCID: PMC5005768 DOI: 10.1155/2016/3049632
Source DB: PubMed Journal: Comput Intell Neurosci
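To make the pipeline concrete, below is a minimal NumPy sketch of the alternating convolution/pooling feature extraction described in the abstract. It is an illustration, not the authors' implementation: the helper names, the random Gaussian filters, the mean pooling, and the toy layer spec are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def convolve_valid(img, kernel):
    """2-D 'valid' convolution (cross-correlation; equivalent here
    because the filter weights are random)."""
    kh, kw = kernel.shape
    h, w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def mean_pool(fmap, size):
    """Non-overlapping mean pooling; edge rows/columns that do not
    fill a full region are truncated."""
    h, w = (fmap.shape[0] // size) * size, (fmap.shape[1] // size) * size
    return (fmap[:h, :w]
            .reshape(h // size, size, w // size, size)
            .mean(axis=(1, 3)))

def dc_elm_features(img, layer_specs):
    """Alternate random-filter convolution and pooling stages.
    layer_specs is a hypothetical list of
    (num_maps, lrf_size, pool_size) tuples, one per stage."""
    maps = [img]
    for num_maps, lrf, pool in layer_specs:
        new_maps = []
        for _ in range(num_maps):
            # Each output map sums random-filter responses over all inputs.
            acc = sum(convolve_valid(m, rng.standard_normal((lrf, lrf)))
                      for m in maps)
            new_maps.append(mean_pool(acc, pool))
        maps = new_maps
    # Flattened features, to be fed to the ELM classifier stage.
    return np.concatenate([m.ravel() for m in maps])

# Toy usage: a 28 x 28 input through two small stages.
img = rng.random((28, 28))
feats = dc_elm_features(img, [(5, 3, 2), (10, 3, 2)])
print(feats.shape)  # (250,) = 10 maps of 5 x 5
```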
Figure 1. An example of the DC-ELM network (with three convolution layers).
Figure 2. Illustration of stochastic pooling.
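A minimal sketch of stochastic pooling in the style of Figure 2, assuming non-overlapping pooling regions and non-negative activations: within each region, one activation is sampled with probability proportional to its value.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_pool(fmap, size):
    """Stochastic pooling over non-overlapping size x size regions:
    each activation in a region is selected with probability
    proportional to its (non-negative) value."""
    h = (fmap.shape[0] // size) * size
    w = (fmap.shape[1] // size) * size
    out = np.empty((h // size, w // size))
    for i in range(0, h, size):
        for j in range(0, w, size):
            region = fmap[i:i + size, j:j + size].ravel()
            total = region.sum()
            if total <= 0:
                out[i // size, j // size] = 0.0  # all-zero region
            else:
                out[i // size, j // size] = rng.choice(region, p=region / total)
    return out
```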
Parameter setting for LRF-ELM, DC-ELM, and CNN (MNIST experiments).
| Method | Feature maps | LRF size | Pooling size |
|---|---|---|---|
| DC-ELM | 5 | 3 × 3 | 5 × 5 |
| | 10 | 3 × 3 | 5 × 5 |
| | 15 | 3 × 3 | 2 × 2 |
| LRF-ELM | 30 | 3 × 3 | 5 × 5 |
| CNN | 5 | 5 × 5 | 2 × 2 |
| | 10 | 5 × 5 | 2 × 2 |
| | 15 | 3 × 3 | 2 × 2 |
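Read row-wise, each method's rows give one convolution/pooling stage. As a hypothetical encoding, matching the layer-spec format of the feature-extraction sketch above:

```python
# Hypothetical encoding of the rows above, one
# (feature maps, LRF size, pooling size) tuple per stage.
dc_elm_spec  = [(5, 3, 5), (10, 3, 5), (15, 3, 2)]
lrf_elm_spec = [(30, 3, 5)]
cnn_spec     = [(5, 5, 2), (10, 5, 2), (15, 3, 2)]
```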
Performance on MNIST (with only 10k training samples).
| Algorithms | Training time (s) | Training accuracy (mean) | Training accuracy (std) | Testing accuracy (mean) | Testing accuracy (std) |
|---|---|---|---|---|---|
| ELM | 102.6 | 0.9965 | 0.0013 | 0.9457 | 0.0017 |
| LRF-ELM | 179.8 | 0.9882 | 0.0058 | 0.9763 | 0.0034 |
| DC-ELM | | 0.9802 | 0.0027 | | 0.0015 |
| CNN | 2031.5 | 0.9703 | 0.0017 | 0.9724 | 0.0013 |
| DBN | 2968.4 | 0.9872 | 0.0022 | 0.9632 | 0.0021 |
Performance on MNIST (with only 15k training samples).
| Algorithms | Training time (s) | Training accuracy (mean) | Training accuracy (std) | Testing accuracy (mean) | Testing accuracy (std) |
|---|---|---|---|---|---|
| ELM | 169.3 | 0.9976 | 0.0012 | 0.9522 | 0.0009 |
| LRF-ELM | 264.9 | 0.9890 | 0.0039 | 0.9779 | 0.0021 |
| DC-ELM | | 0.9887 | 0.0020 | | 0.0012 |
| CNN | 2960.9 | 0.9873 | 0.0014 | 0.9781 | 0.0011 |
| DBN | 4791.3 | 0.9905 | 0.0009 | 0.9759 | 0.0015 |
Figure 3. Input image and corresponding feature maps in different convolution layers.
Parameter setting for LRF-ELM, DC-ELM, and CNN (USPS experiments).
| Method | Feature maps | LRF size | Pooling size |
|---|---|---|---|
| DC-ELM | 10 | 3 × 3 | 5 × 5 |
| | 20 | 3 × 3 | 2 × 2 |
| LRF-ELM | 30 | 3 × 3 | 5 × 5 |
| CNN | 10 | 3 × 3 | 2 × 2 |
| | 20 | 2 × 2 | 2 × 2 |
Performance on USPS (with 7000 training samples).
| Algorithms | Training time (s) | Training accuracy (mean) | Training accuracy (std) | Testing accuracy (mean) | Testing accuracy (std) |
|---|---|---|---|---|---|
| ELM | 90.1 | 0.9974 | 0.0016 | 0.9323 | 0.0021 |
| LRF-ELM | 37.5 | 0.9849 | 0.0045 | 0.9651 | 0.0027 |
| DC-ELM | | 0.9783 | 0.0034 | | 0.0028 |
| CNN | 2590.3 | 0.9750 | 0.0022 | 0.9614 | 0.0021 |
| DBN | 3861.4 | 0.9696 | 0.0033 | 0.9586 | 0.0026 |
Performance on USPS (with 10000 training samples).
| Algorithms | Training time (s) | Training accuracy (mean) | Training accuracy (std) | Testing accuracy (mean) | Testing accuracy (std) |
|---|---|---|---|---|---|
| ELM | 109.5 | 0.9992 | 0.0011 | 0.9650 | 0.0013 |
| LRF-ELM | 55.2 | 0.9948 | 0.0039 | 0.9803 | 0.0022 |
| DC-ELM | | 0.9889 | 0.0025 | | 0.0014 |
| CNN | 3731.5 | 0.9815 | 0.0019 | 0.9736 | 0.0020 |
| DBN | 5284.1 | 0.9877 | 0.0024 | 0.9704 | 0.0015 |
Figure 4. Confusion matrix obtained by DC-ELM on the USPS data set.
Figure 5. Effect of network depth on the mean test error.
Figure 6. Effect of the regularization parameter C on the mean test error.
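For context on Figure 6: in the standard regularized ELM solution that the classifier stage builds on, C trades training error against the norm of the output weights. A minimal sketch, assuming the usual closed form beta = (I/C + H^T H)^{-1} H^T T:

```python
import numpy as np

def elm_output_weights(H, T, C):
    """Regularized ELM output weights:
    beta = (I / C + H^T H)^{-1} H^T T,
    with H the hidden-layer output matrix (n_samples x n_hidden)
    and T the one-hot target matrix (n_samples x n_classes).
    Larger C emphasizes fitting the training data; smaller C
    regularizes more strongly."""
    n_hidden = H.shape[1]
    return np.linalg.solve(np.eye(n_hidden) / C + H.T @ H, H.T @ T)
```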
Figure 7. Comparison of stochastic pooling with square-root pooling.
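For context on Figure 7: square-root pooling replaces each pooling region with the square root of its summed squared activations, a deterministic alternative to the stochastic pooling sketched earlier. A minimal sketch under the same non-overlapping-region assumption:

```python
import numpy as np

def sqrt_pool(fmap, size):
    """Square-root pooling: sqrt of the sum of squares over each
    non-overlapping size x size region."""
    h = (fmap.shape[0] // size) * size
    w = (fmap.shape[1] // size) * size
    sq = (fmap[:h, :w] ** 2).reshape(h // size, size, w // size, size)
    return np.sqrt(sq.sum(axis=(1, 3)))
```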