Yunchen Kong, Xue Ma, Chenglin Wen.
Abstract
This paper offers a preliminary solution to the problem of deep-learning image classification when a large number of image samples is available but only a small fraction carries knowledge annotations. First, a support vector machine (SVM) expert labeling system is constructed from the small set of labeled samples, using a bag-of-visual-words model to extract image features. This SVM expert labeler then automatically annotates the large pool of unlabeled image samples. Second, the small labeled set and the automatically labeled samples are combined into an augmented training set, on which a deep convolutional neural network model is trained. In this way, knowledge is transferred from an SVM trained on a few manually annotated image samples to a deep neural network classifier, mitigating the overfitting that arises when neural networks are trained on small samples. Finally, the public Caltech256 dataset is used for experimental verification and mechanism analysis of the new method's performance.
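The pseudo-labeling pipeline described in the abstract can be sketched as follows. This is a minimal illustration with scikit-learn on synthetic stand-in data; the array sizes, class count, and variable names are assumptions for illustration, not the paper's implementation (in the paper, the feature vectors would be bag-of-visual-words histograms of real images).

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical stand-ins: 24 hand-labeled BOVW histograms over an
# 800-word vocabulary, 6 classes, plus a large unlabeled pool.
n_words, n_classes = 800, 6
X_labeled = rng.random((24, n_words))
y_labeled = np.repeat(np.arange(1, n_classes + 1), 4)  # labels 1..6
X_unlabeled = rng.random((2500, n_words))

# Step 1: train the linear-kernel SVM "expert labeler" on the small labeled set.
svm = SVC(kernel="linear").fit(X_labeled, y_labeled)

# Step 2: automatically annotate the unlabeled pool with pseudo-labels.
pseudo_labels = svm.predict(X_unlabeled)

# Step 3: merge both sets into the augmented training set for the CNN.
X_aug = np.vstack([X_labeled, X_unlabeled])
y_aug = np.concatenate([y_labeled, pseudo_labels])
print(X_aug.shape, y_aug.shape)  # (2524, 800) (2524,)
```

In the paper's setting, `X_aug`/`y_aug` would then be used to train the VGG16 classifier, which is where the knowledge transfer from SVM to CNN takes place.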
Keywords: bag of visual words; convolutional neural network; knowledge transfer; support vector machine
Year: 2022 PMID: 35161644 PMCID: PMC8839952 DOI: 10.3390/s22030898
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Flow chart of the bag-of-visual-words model.
Figure 2. Schematic diagram of the support vector machine.
Figure 3. Flow chart of the pre-training model.
Figure 4. BOVW_SVM_VGG16 algorithm flow chart.
Figure 5. VGG16 network structure.
Types, quantities, and labels of images.
| Image Type | Quantity | Label |
|---|---|---|
| Airplane | 800 | 1 |
| Face | 435 | 2 |
| Horse | 270 | 3 |
| Ladder | 242 | 4 |
| Motorbike | 798 | 5 |
| T-shirt | 358 | 6 |
Figure 6. (a) SIFT feature-extraction histogram; (b) HOG feature-extraction histogram; (c) Canny feature-extraction histogram.
Figure 7. (a) Confusion matrix of SIFT_BOVW_SVM; (b) Confusion matrix of HOG_BOVW_SVM; (c) Confusion matrix of Canny_BOVW_SVM.
Classification accuracy on the SVM validation set under different features (the columns labeled 16 and 24 give the number of labeled training samples).
| Number of Clusters | Kernel Function | SIFT (16) | SIFT (24) | HOG (16) | HOG (24) | Canny (16) | Canny (24) |
|---|---|---|---|---|---|---|---|
| 100 | linear | 0.667 | 0.72 | 0.782 | 0.837 | 0.604 | 0.639 |
| 200 | linear | 0.708 | 0.764 | 0.823 | 0.865 | 0.604 | 0.681 |
| 300 | linear | 0.729 | 0.806 | 0.83 | 0.858 | 0.646 | 0.694 |
| 400 | linear | 0.729 | 0.806 | 0.865 | 0.90 | 0.646 | 0.681 |
| 500 | linear | 0.729 | 0.792 | 0.865 | 0.90 | 0.646 | 0.694 |
| 600 | linear | 0.75 | 0.806 | 0.876 | 0.886 | 0.646 | 0.708 |
| 700 | linear | 0.75 | 0.792 | 0.865 | 0.90 | 0.646 | 0.667 |
| 800 | linear | 0.771 | 0.833 | 0.886 | 0.90 | 0.691 | 0.708 |
| 900 | linear | 0.771 | 0.778 | 0.886 | 0.907 | 0.671 | 0.694 |
| 1000 | linear | 0.75 | 0.778 | 0.865 | 0.879 | 0.683 | 0.687 |
| 1100 | linear | 0.771 | 0.778 | 0.845 | 0.893 | 0.662 | 0.699 |
| 1200 | linear | 0.792 | 0.792 | 0.875 | 0.90 | 0.662 | 0.681 |
| 1300 | linear | 0.833 | 0.829 | 0.875 | 0.886 | 0.683 | 0.681 |
| 1400 | linear | 0.771 | 0.819 | 0.854 | 0.886 | 0.662 | 0.671 |
| 1500 | linear | 0.771 | 0.778 | 0.875 | 0.893 | 0.62 | 0.639 |
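The vocabulary-size sweep above can be reproduced in outline: local descriptors pooled from all training images are clustered with k-means, and each image is then encoded as a k-bin visual-word histogram. The sketch below uses synthetic descriptors; the descriptor dimensionality, counts, and function name are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)

# Synthetic stand-ins for local descriptors (e.g., 128-D SIFT vectors)
# pooled from the whole training set.
descriptors = rng.random((2000, 128))
k = 100  # vocabulary size: the "Number of Clusters" column in the table

# Build the visual vocabulary by clustering the pooled descriptors.
vocab = KMeans(n_clusters=k, n_init=1, random_state=0).fit(descriptors)

def bovw_histogram(image_descriptors, vocab, k):
    """Quantize one image's descriptors against the vocabulary and
    return a normalized k-bin visual-word frequency histogram."""
    words = vocab.predict(image_descriptors)
    hist = np.bincount(words, minlength=k).astype(float)
    return hist / hist.sum()

# Encode one image (300 synthetic descriptors) as a 100-bin histogram.
hist = bovw_histogram(rng.random((300, 128)), vocab, k)
print(hist.shape)  # (100,)
```

These fixed-length histograms are what the linear SVM is trained on, which is why the vocabulary size k directly affects the accuracies reported in the table.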
Figure 8. Comparison of classification accuracy under different features.
Classification accuracy on the VGG16 validation set under augmented training sets of different capacities.
| Number of Labeled Samples | Pseudo-Label Capacity | Val_acc |
|---|---|---|
| 16/24 | 0 | 72.9%/73.5% |
| 16/24 | 500 | 74.5%/74.6% |
| 16/24 | 1000 | 78.5%/79.3% |
| 16/24 | 1500 | 83.5%/83.7% |
| 16/24 | 2000 | 89.6%/88.1% |
| 16/24 | 2500 | 92.5%/93.4% |
Classification accuracy of different models under small labeled-sample conditions.
| Model | Accuracy (16 samples) | Accuracy (24 samples) |
|---|---|---|
| SIFT_BOVW_SVM | 83.3% | 83.3% |
| HOG_BOVW_SVM | 87.5% | 90.7% |
| Canny_BOVW_SVM | 69.1% | 70.8% |
| VGG16 | 72.9% | 73.5% |
| BOVW_SVM_VGG16 | 92.5% | 93.4% |
Figure 9. VGG16 network training process under small samples.