Jingsi Zhang, Xiaosheng Yu, Xiaoliang Lei, Chengdong Wu.
Abstract
Image classification assigns an image to a category according to the information it contains, so extracting image feature information is an important research topic in image classification. Traditional image classification mainly uses machine learning methods to extract features. With the continuous development of deep learning, various deep learning algorithms have gradually been applied to image classification. However, traditional deep learning-based image classification methods suffer from low classification efficiency and long convergence times, and the trained networks are prone to over-fitting. In this paper, we present a novel CapsNet neural network based on the MobileNetV2 structure for robot image classification. To address the problem that lightweight networks sacrifice classification accuracy, MobileNetV2 is taken as the base network architecture. CapsNet is improved by optimizing the dynamic routing algorithm to generate the feature graph. An attention module is introduced to increase the weight of the saliency feature graph learned by the convolutional layer and thereby improve classification accuracy. The parallel input of spatial information and channel information reduces the computation and complexity of the network. Finally, experiments are carried out on the CIFAR-100 dataset. The results show that the proposed model is superior to other robot image classification models in terms of classification accuracy and robustness.
Keywords: CapsNet neural network; MobileNetV2; attention module; robot image classification; spatial and channel information
Year: 2022 PMID: 36247359 PMCID: PMC9563336 DOI: 10.3389/fnbot.2022.1007939
Source DB: PubMed Journal: Front Neurorobot ISSN: 1662-5218 Impact factor: 3.493
Figure 1. MobileNetV2 structure. (A) Step size = 1. (B) Step size = 2.
Relation between input and output.
| Input | Operator | Output |
|---|---|---|
| | 1 × 1 PW ReLU | |
| | 3 × 3 DW ReLU | |
| | 1 × 1 PW Linear | |
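The input and output dimensions of this table did not survive extraction. As a rough illustration only, the sketch below traces tensor shapes through one bottleneck block under the standard MobileNetV2 formulation (pointwise expansion, depthwise convolution with a stride, linear pointwise projection). The function name `bottleneck_shapes` and the parameter names `t` (expansion factor), `s` (step size) and `k_out` (output channels) are assumptions made for this example, as is "same" padding; none of them is taken from the record.

```python
import math


def bottleneck_shapes(h, w, k, t, s, k_out):
    """Trace (height, width, channels) through one bottleneck block.

    Assumes the standard MobileNetV2 layout with 'same' padding:
    1 x 1 pointwise expansion, 3 x 3 depthwise convolution with step
    size s, then a linear 1 x 1 pointwise projection.
    """
    expanded = (h, w, t * k)                    # after 1 x 1 PW + ReLU
    h_out, w_out = math.ceil(h / s), math.ceil(w / s)
    depthwise = (h_out, w_out, t * k)           # after 3 x 3 DW + ReLU
    projected = (h_out, w_out, k_out)           # after 1 x 1 PW, linear
    # A residual shortcut is only possible when input and output shapes match.
    shortcut = (s == 1 and k == k_out)
    return {
        "expanded": expanded,
        "depthwise": depthwise,
        "projected": projected,
        "shortcut": shortcut,
    }


# Hypothetical example: a 56 x 56 x 24 input, expansion 6, step size 2, 32 output channels.
EXAMPLE = bottleneck_shapes(56, 56, 24, t=6, s=2, k_out=32)
for stage, value in EXAMPLE.items():
    print(stage, value)
```

With a step size of 2 the spatial size halves and no shortcut can be added, which is consistent with Figure 1 distinguishing the two step sizes.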
Figure 2. The process of transmission between capsules.
Modified dynamic routing.
| Procedure |
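The steps of the modified routing procedure are not preserved in this record. For orientation only, the following is a minimal pure-Python sketch of the standard routing-by-agreement procedure from the original CapsNet formulation, not the modified version proposed by the authors. The names `squash`, `softmax`, `dynamic_routing` and `u_hat`, and the default of three iterations, are assumptions made for this example.

```python
import math


def squash(v):
    """Scale vector v to a length in [0, 1) while keeping its direction."""
    sq = sum(x * x for x in v)
    if sq == 0.0:
        return [0.0 for _ in v]
    scale = sq / (1.0 + sq) / math.sqrt(sq)
    return [scale * x for x in v]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, iterations=3):
    """Standard routing-by-agreement (not the paper's modified variant).

    u_hat[i][j] is the prediction vector that lower capsule i makes
    for upper capsule j. Returns the upper capsule outputs and the
    coupling coefficients used in the final iteration.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]    # routing logits
    v, c = [], []
    for _ in range(iterations):
        c = [softmax(row) for row in b]         # coupling coefficients per lower capsule
        v = []
        for j in range(n_out):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        for i in range(n_in):                   # raise logits where prediction and output agree
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v, c


# Hypothetical input: both lower capsules agree on upper capsule 0 and disagree on capsule 1.
U_HAT = [[[1.0, 0.0], [0.0, 0.1]],
         [[1.0, 0.0], [0.0, -0.1]]]
V, C = dynamic_routing(U_HAT)
print(V, C)
```

In this toy input the coupling coefficients shift toward upper capsule 0, because the two lower capsules make the same prediction for it.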
Figure 3. Proposed C-MobileNetV2 structure.
Parameters in C-MobileNetV2 structure.
| Layer | Input | Operator | t | c | n | s | |
|---|---|---|---|---|---|---|---|
| Input layer | 224 × 224 × 3 | conv2d | – | 64 | 1 | 2 | × |
| 1 | 112 × 112 × 34 | Bottleneck | 1 | 16 | 1 | 1 | × |
| 2 | 112 × 112 × 16 | Bottleneck | 6 | 24 | 2 | 2 | × |
| 3 | 56 × 56 × 24 | Bottleneck | 6 | 32 | 3 | 2 | √ |
| 4 | 28 × 28 × 32 | Bottleneck | 6 | 64 | 4 | 2 | × |
| 5 | 14 × 14 × 64 | Bottleneck | 6 | 96 | 3 | 1 | √ |
| 6 | 14 × 14 × 96 | Bottleneck | 6 | 160 | 3 | 2 | √ |
| 7 | 7 × 7 × 160 | Bottleneck | 6 | 320 | 1 | 1 | √ |
| 8 | 7 × 7 × 320 | conv2d 1 × 1 | – | 1,280 | 1 | 1 | – |
| 9 | 7 × 7 × 1,280 | Avgpool 7 × 7 | – | – | 1 | – | – |
| Output layer | 1 × 1 × 1,280 | conv2d 1 × 1 | – | 100 | – | – | – |
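One way to read this table is to propagate each row's output shape and compare it with the input listed on the next row. The sketch below does that under stated assumptions: the fifth column is taken to be the number of output channels and the seventh the step size (as in the standard MobileNetV2 layout), "same" padding is assumed, and the step size is applied once per stage. The names `ROWS`, `expected_output` and `find_mismatches` are hypothetical.

```python
import math

# (layer, input height, input width, input channels, operator, output channels, step size),
# copied from the table; None marks a cell with no value.
ROWS = [
    ("Input layer", 224, 224, 3, "conv2d", 64, 2),
    ("1", 112, 112, 34, "Bottleneck", 16, 1),
    ("2", 112, 112, 16, "Bottleneck", 24, 2),
    ("3", 56, 56, 24, "Bottleneck", 32, 2),
    ("4", 28, 28, 32, "Bottleneck", 64, 2),
    ("5", 14, 14, 64, "Bottleneck", 96, 1),
    ("6", 14, 14, 96, "Bottleneck", 160, 2),
    ("7", 7, 7, 160, "Bottleneck", 320, 1),
    ("8", 7, 7, 320, "conv2d 1 x 1", 1280, 1),
    ("9", 7, 7, 1280, "Avgpool 7 x 7", None, None),
    ("Output layer", 1, 1, 1280, "conv2d 1 x 1", 100, None),
]


def expected_output(row):
    """Shape a row should produce, assuming 'same' padding."""
    _, h, w, c_in, operator, c_out, step = row
    if operator.startswith("Avgpool"):
        return (1, 1, c_in)      # a 7 x 7 window over a 7 x 7 map leaves 1 x 1
    step = step if step is not None else 1
    return (math.ceil(h / step), math.ceil(w / step), c_out)


def find_mismatches(rows):
    """List (layer, expected input, listed input) where consecutive rows disagree."""
    mismatches = []
    for current, following in zip(rows, rows[1:]):
        expected = expected_output(current)
        listed = following[1:4]
        if expected != listed:
            mismatches.append((following[0], expected, listed))
    return mismatches


MISMATCHES = find_mismatches(ROWS)
for layer, expected, listed in MISMATCHES:
    print(f"layer {layer}: expected input {expected}, table lists {listed}")
```

Under these assumptions every spatial size in the table is consistent, and the only disagreement is the channel count entering layer 1: the input layer lists 64 output channels while layer 1 lists an input of 112 × 112 × 34. The record does not indicate which of the two values is intended.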
Figure 4. Channel attention.
Figure 5. SPP structure.
Figure 6. Spatial attention.
Comparison with different methods.
| Method | | | |
|---|---|---|---|
| MobileNetV2 | 71.69 | 1.31 | 0.55 |
| BAVCT | 79.84 | 0.41 | 0.88 |
| TRk-CNN | 86.71 | 0.23 | 0.72 |
| Feat-WCLTP | 90.37 | 0.19 | 0.61 |
| C-MobileNetV2 | 96.58 | 0.02 | 0.58 |
Comparison of classification accuracy of different modules/%.
| Model | | | | |
|---|---|---|---|---|
| VGG16 | 73.24 | 73.72 | 73.82 | 74.24 |
| ResNet18 | 76.03 | 76.14 | 76.30 | 76.46 |
| DenseNet | 75.59 | 75.90 | 75.93 | 76.31 |
| MobileNetV2 | 75.88 | 75.96 | 76.02 | 76.65 |
Figure 7. Accuracy change curve.
Classification error rates with different improved modules/%.
| Model | Error rate (%) |
|---|---|
| MobileNetV2 | 7.11 |
| Capsule | 5.54 |
| MobileNetV2+CAM+SAM | 5.52 |
| Capsule+CAM+SAM | 4.63 |
| MobileNetV2+Capsule (CAM+SAM) | 3.78 |
They denote the best values obtained by the proposed method.
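To make the ablation figures easier to compare, the sketch below expresses each error rate as a relative reduction against the plain MobileNetV2 row. The error rates are copied from the table; the choice of MobileNetV2 as the baseline and the names `ERROR_RATES`, `relative_reduction` and `REDUCTIONS` are assumptions made for this example, not part of the reported evaluation.

```python
# Error rates (%) copied from the table above.
ERROR_RATES = {
    "MobileNetV2": 7.11,
    "Capsule": 5.54,
    "MobileNetV2+CAM+SAM": 5.52,
    "Capsule+CAM+SAM": 4.63,
    "MobileNetV2+Capsule (CAM+SAM)": 3.78,
}


def relative_reduction(baseline, value):
    """Percentage by which value is lower than baseline."""
    return 100.0 * (baseline - value) / baseline


BASELINE = ERROR_RATES["MobileNetV2"]   # assumed reference row
REDUCTIONS = {
    name: round(relative_reduction(BASELINE, rate), 2)
    for name, rate in ERROR_RATES.items()
}

for name, reduction in REDUCTIONS.items():
    print(f"{name}: {reduction}% lower error than MobileNetV2")
```

On these figures the combined MobileNetV2+Capsule (CAM+SAM) configuration reduces the error rate from 7.11% to 3.78%, a relative reduction of roughly 47%.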
Experimental results of different models on MORPH Album2.
| Model | |
|---|---|
| MobileNetV2 | 3.52 |
| C-MobileNetV2 | |
They denote the best values obtained by the proposed method.
Experimental results of different models on Adience/%.
| Model | | |
|---|---|---|
| MobileNetV2 | 68.67 | 93.43 |
| C-MobileNetV2 | | |
They denote the best values obtained by the proposed method.