Abstract
Color texture classification is a significant computer vision task for identifying and categorizing the textures commonly observed in natural visual scenes. Without color and texture cues, identifying and recognizing objects in nature remains difficult. Deep architectures have proved effective at recognizing challenging patterns in texture images. This paper proposes DeepLumina, a method that combines features from deep architectures with luminance information and the RGB color space for efficient color texture classification. The technique extracts convolutional neural network features from a pretrained ResNet101 model, uses luminance information from the luminance (Y) channel of the YIQ color model, and performs classification with a support vector machine (SVM). The approach works in the RGB-luminance domain, exploring the effectiveness of combining luminance information with the RGB color space. Experimental investigation and analysis show that the proposed method, DeepLumina, achieves an accuracy of 90.15% on the Flickr Material Dataset (FMD) and 73.63% on the Describable Textures Dataset (DTD), which is highly promising. Comparative analyses with other color spaces and with pretrained CNN-FC models are also conducted, highlighting the significance of the work. The method is also computationally simple and obtains results in less computation time.
Year: 2022 PMID: 35463270 PMCID: PMC9023203 DOI: 10.1155/2022/9510987
Source DB: PubMed Journal: Comput Intell Neurosci
Figure 1. Deep neural network approaches for color texture classification.
Figure 2. Glimpse of the DTD dataset.
Figure 3. Glimpse of the FMD dataset.
Figure 4. DeepLumina texture classification: proposed framework.
Figure 5. Framework of DeepLumina with color models and pretrained models.
DeepLumina framework.
| DeepLumina: proposed method |
|---|
| Step 1. Resize the input RGB images and transform them to the YIQ color space |
| Step 2. Extract luminance images (Y-channel images) from the YIQ color model |
| Step 3. Feed the images from Steps 1 and 2 to the pretrained ResNet101 model |
| Step 4. Extract the deep (learned) feature maps from ResNet101 |
| Step 5. Classify using a support vector machine with a fast linear solver |
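As a rough illustration, the pipeline above can be sketched as follows. This is a minimal sketch, assuming torchvision's pretrained ResNet-101 and scikit-learn's LinearSVC; the luminance channel follows the standard YIQ definition Y = 0.299R + 0.587G + 0.114B. Details such as replicating Y to three channels and fusing the RGB and luminance features by concatenation are assumptions made here, not specifics from the paper.

```python
# Minimal sketch of the DeepLumina pipeline (assumptions: recent torchvision,
# scikit-learn, and feature fusion by simple concatenation).
import numpy as np
import torch
from torchvision import models, transforms
from sklearn.svm import LinearSVC

device = "cuda" if torch.cuda.is_available() else "cpu"

# Pretrained ResNet-101 used as a fixed feature extractor; its final
# fully connected layer ("fc1000") yields a 1000-dimensional feature vector.
resnet = models.resnet101(weights=models.ResNet101_Weights.IMAGENET1K_V1)
resnet.eval().to(device)

preprocess = transforms.Compose([
    transforms.ToTensor(),                        # HWC uint8 -> CHW float in [0, 1]
    transforms.Resize((224, 224), antialias=True),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],  # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])

def luminance(img_rgb: np.ndarray) -> np.ndarray:
    """Y channel of the YIQ model: Y = 0.299 R + 0.587 G + 0.114 B."""
    r, g, b = img_rgb[..., 0], img_rgb[..., 1], img_rgb[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b

@torch.no_grad()
def deep_features(img_rgb: np.ndarray) -> np.ndarray:
    """img_rgb: H x W x 3 uint8. Concatenates ResNet-101 features of the RGB
    image and of its luminance image (Y replicated to 3 channels -- an assumption)."""
    y = luminance(img_rgb.astype(np.float32))
    y3 = np.clip(np.stack([y, y, y], axis=-1), 0, 255).astype(np.uint8)
    feats = []
    for img in (img_rgb, y3):
        x = preprocess(img).unsqueeze(0).to(device)
        feats.append(resnet(x).squeeze(0).cpu().numpy())
    return np.concatenate(feats)

# Step 5: classification with a fast linear SVM solver.
def train_classifier(images, labels):
    X = np.stack([deep_features(im) for im in images])
    clf = LinearSVC()  # liblinear-based, fast linear solver
    return clf.fit(X, labels)
```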
CNN parameters used for feature map generation from the pretrained models.
| CNN parameters | ResNet50 | AlexNet | Inceptionv3 | DenseNet201 |
|---|---|---|---|---|
| Conv layer name | conv1 | conv1 | conv2d_1 | conv1\|conv |
| No. of layers | 177 | 25 | 316 | 709 |
| Input image size | [224,224,3] | [227,227,3] | [299,299,3] | [224,224,3] |
| Filter size | 7 × 7 | 11 × 11 | 3 × 3 | 7 × 7 |
| No. of filters | 64 | 96 | 32 | 64 |
| Stride | [2,2] | [4,4] | [2,2] | [2,2] |
| Dilation factor | [1,1] | [1,1] | [1,1] | [1,1] |
| Padding size | [3,3,3,3] | [0,0,0,0] | [0,0,0,0] | [3,3,3,3] |
| Weight initialization | Glorot | Glorot | Glorot | Glorot |
| Feature layer | fc1000 | fc8 | Predictions | fc1000 |
CNN parameters used for feature map generation from the pretrained models.
| CNN parameters | ResNet101 | MobileNet | VGG19 | InceptionResNetv2 |
|---|---|---|---|---|
| Conv layer name | conv1 | conv_1 | conv1_1 | conv2d_1 |
| No. of layers | 347 | 155 | 47 | 825 |
| Input image size | [224,224,3] | [224,224,3] | [224,224,3] | [299,299,3] |
| Filter size | 7 × 7 | 3 × 3 | 3 × 3 | 3 × 3 |
| No. of filters | 64 | 32 | 64 | 32 |
| Stride | [2,2] | [2,2] | [1,1] | [2,2] |
| Dilation factor | [1,1] | [1,1] | [1,1] | [1,1] |
| Padding size | [3,3,3,3] | [0,1,0,1] | [1,1,1,1] | [0,0,0,0] |
| Weight initialization | Glorot | Glorot | Glorot | Glorot |
| Feature layer | fc1000 | Logits | fc8 | Predictions |
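These first-layer hyperparameters can be read directly off the pretrained networks. A small sketch using torchvision's ResNet-101 (the layer and attribute names are torchvision's, which may differ from the naming conventions in the tables above):

```python
import torchvision.models as models

# Instantiate the architecture only (weights=None skips the download).
resnet = models.resnet101(weights=None)
conv1 = resnet.conv1
print(conv1.kernel_size, conv1.out_channels, conv1.stride, conv1.padding)
# (7, 7) 64 (2, 2) (3, 3)  -- i.e. 7 x 7 filters, 64 filters, stride [2,2],
# padding [3,3,3,3], matching the ResNet101 column above.
```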
Figure 6. Visualization of the deep texture feature maps obtained from the DeepLumina framework.
Accuracy (%) obtained for DeepLumina on the benchmark texture dataset DTD.
| Method | RGB | DeepLumina (proposed) | | |
|---|---|---|---|---|
| MobileNet + SVM | 61.37 | 66.64 | 67.31 | 66.14 |
| ResNet50 + SVM | 67.23 | 72.47 | 72.10 | 70.77 |
| ResNet101 + SVM | 66.92 | 72.50 | 72.31 | 72.43 |
| DenseNet201 + SVM | 65.12 | 70.66 | 70.64 | 68.35 |
| AlexNet + SVM | 46.35 | 49.75 | 49.42 | 48.71 |
| VGG19 + SVM | 55.78 | 58.60 | 58.86 | 58.73 |
| Inceptionv3 + SVM | 65.20 | 70.04 | 71.30 | 69.81 |
| InceptionResNetv2 + SVM | 65.94 | 70.78 | 71.29 | 70.85 |
Best values are shown in bold; they are obtained by the proposed method, DeepLumina, using luminance from the YIQ color model.
Accuracy (%) obtained for the DeepLumina method on the benchmark texture dataset FMD.
| Method | RGB | DeepLumina (proposed) | | |
|---|---|---|---|---|
| MobileNet + SVM | 74.60 | 87.10 | 85.10 | 85.80 |
| ResNet50 + SVM | 81.60 | 88.45 | 87.85 | 88.80 |
| ResNet101 + SVM | 81.40 | 89.46 | 89.65 | 89.20 |
| DenseNet201 + SVM | 80.75 | 88.83 | 87.38 | 87.46 |
| AlexNet + SVM | 64.30 | 70.02 | 70.05 | 69.50 |
| VGG19 + SVM | 78.10 | 80.40 | 81.65 | 79.35 |
| Inceptionv3 + SVM | 76.60 | 88.70 | 88.15 | 87.55 |
| InceptionResNetv2 + SVM | 82.20 | 88.75 | 90.01 | 90.03 |
The best values obtained for DeepLumina on the FMD dataset are indicated in bold.
Computation time (in minutes) for the proposed method, DeepLumina, on the DTD dataset.
| Method | RGB | DeepLumina (proposed) | | | |
|---|---|---|---|---|---|
| MobileNet + SVM | 1.76 | 2.98 | 2.96 | 3.06 | 3.04 |
| ResNet50 + SVM | 2.95 | 5.05 | 5.02 | 5.43 | 5.42 |
| ResNet101 + SVM | 4.91 | 8.8 | 8.78 | 8.71 | 9.29 |
| DenseNet201 + SVM | 6.28 | 11.64 | 11.39 | 11.66 | 11.51 |
| AlexNet + SVM | 1.89 | 2.18 | 2.33 | 2.20 | 2.41 |
| VGG19 + SVM | 5.17 | 9.16 | 10.01 | 10.19 | 9.51 |
| Inceptionv3 + SVM | 4.09 | 7.32 | 7.3 | 7.36 | 7.39 |
| InceptionResNetv2 + SVM | 7.95 | 15.69 | 15.33 | 15.59 | 15.53 |
Computation time (in minutes) for the proposed method, DeepLumina, on the FMD dataset.
| Method | RGB | DeepLumina (proposed) | | | |
|---|---|---|---|---|---|
| MobileNet + SVM | 0.43 | 0.64 | 0.65 | 0.65 | 0.65 |
| ResNet50 + SVM | 0.62 | 0.97 | 0.98 | 0.97 | 0.98 |
| ResNet101 + SVM | 0.92 | 1.57 | 1.58 | 1.57 | 1.55 |
| DenseNet201 + SVM | 1.29 | 2.18 | 2.20 | 2.19 | 2.19 |
| AlexNet + SVM | 0.95 | 1.37 | 2.56 | 1.12 | 1.1 |
| VGG19 + SVM | 1.03 | 2.18 | 2.20 | 2.19 | 2.19 |
| Inceptionv3 + SVM | 0.87 | 1.47 | 1.46 | 1.46 | 1.48 |
| InceptionResNetv2 + SVM | 1.81 | 3.04 | 3.02 | 3.06 | 3.06 |
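The absolute timings above are hardware-dependent. As a rough illustration, per-pipeline computation time could be measured as follows; `train_classifier` is the hypothetical helper sketched earlier, not a function from the paper.

```python
import time

def timed_minutes(fn, *args):
    """Run fn(*args) and return (result, elapsed wall-clock time in minutes)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, (time.perf_counter() - start) / 60.0

# e.g. clf, minutes = timed_minutes(train_classifier, images, labels)
```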
Comparison of CNN-FC models with and without luminance for the DTD and FMD datasets.
| Model | DTD, without luminance | DTD, with luminance | FMD, without luminance | FMD, with luminance |
|---|---|---|---|---|
| MobileNet | 59.49 | 65.50 | 68.50 | 85.00 |
| ResNet50 | 63.20 | | 75.21 | |
| ResNet101 | 64.10 | 70.50 | 75.56 | 87.50 |
| DenseNet201 | 60.50 | 68.50 | 70.50 | 80.25 |
| AlexNet | 61.50 | 67.11 | 65.20 | 81.25 |
| VGG19 | 64.20 | | 77.50 | 84.00 |
| Inceptionv3 | 62.50 | 69.50 | 78.50 | |
The best accuracy obtained for CNN-FC models with and without luminance for DTD and FMD datasets is indicated in bold.
Figure 7. Accuracy-loss curves for pretrained CNN-FC models with luminance information: comparison.
Comparative analysis for the DTD dataset.
| Authors | Method | Accuracy (%) |
|---|---|---|
| Cimpoi et al. | FC-CNN | 63.4 ± 0.9 |
| Cimpoi et al. | FV-CNN | 72.9 ± 0.8 |
| Simon et al. | Deep features + SVM | 66.49 |
| Cimpoi et al. | IFV + DeCAF | 66.7 |
| Dai et al. | FASON (conv5) | 72.3 ± 0.6 |
| Dai et al. | FASON (conv4 + conv5) | 72.9 ± 0.7 |
| Cerezo et al. | ResNet50-FC | 60.8 |
| Proposed method (DeepLumina) | Deep features (ResNet101) + luminance + SVM | 73.63 |
Comparative analysis for the FMD dataset.
| Authors | Method | Accuracy (%) |
|---|---|---|
| Song et al. | FC-CNN | 78.1 ± 1.6 |
| Song et al. | FV-CNN | 80.2 ± 1.8 |
| Song et al. | FC-CNN + FV-CNN | 83.2 ± 1.6 |
| Simon et al. | Deep features + SVM | 84.50 |
| Cimpoi et al. | IFV + DeCAF | 65.5 |
| Bell et al. | SIFT-IFV + fc7 | 69.6 ± 0.3 |
| Proposed method (DeepLumina) | Deep features (ResNet101) + luminance + SVM | 90.15 |
The best accuracy value is indicated in bold.