Xiaoqing Li, Jinwen Ma.
Abstract
With the growing popularity of wine culture and the development of artificial intelligence (AI) technology, wine label image retrieval is becoming increasingly important. Taking a wine label image as input, the goal of this task is to return the wine information that the user wants to know, such as the main-brand and sub-brand of the wine. The main challenge in the wine label image retrieval task is that there are a large number of wine brands with imbalanced sample images, which strongly affects the training of a deep-learning-based retrieval system. To solve this problem, this article adopts a distributed strategy and proposes two distributed retrieval frameworks. Experimental results on the large-scale wine label dataset and the Oxford flowers dataset demonstrate that both proposed distributed retrieval frameworks are effective and even greatly outperform the previous state-of-the-art retrieval models.
Keywords: Distributed strategy; Inter-class imbalance; Wine label image retrieval
Year: 2022 PMID: 36262126 PMCID: PMC9575874 DOI: 10.7717/peerj-cs.1116
Source DB: PubMed Journal: PeerJ Comput Sci ISSN: 2376-5992
Figure 1: Wine label image retrieval.
Figure 2: The overall flow of the distributed retrieval framework based on fusion CNN. The green section is the data parallel processing in the retrieval framework, and the orange section is the fusion and post-processing of the data parallel processing results.
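The data-parallel flow described in this caption can be sketched as follows. The cosine-similarity ranking, the score-based fusion rule, and the per-branch gallery layout are illustrative assumptions, not the paper's exact design.

```python
# Sketch of the Figure 2 flow: each parallel branch searches its own
# gallery subset (green section), then the branch results are fused and
# re-ranked (orange section). All details here are simplified stand-ins.

def branch_retrieve(query_feat, gallery, top_k=5):
    """One parallel branch: rank its own gallery subset by cosine similarity."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(y * y for y in b) ** 0.5
        return dot / (na * nb) if na and nb else 0.0
    scored = [(label, cosine(query_feat, feat)) for label, feat in gallery]
    scored.sort(key=lambda t: t[1], reverse=True)
    return scored[:top_k]

def fused_retrieve(query_feat, branch_galleries, top_k=5):
    """Fusion/post-processing: merge the per-branch candidate lists
    and re-rank them by score (a simple score-level fusion)."""
    candidates = []
    for gallery in branch_galleries:     # data-parallel per-subset search
        candidates.extend(branch_retrieve(query_feat, gallery, top_k))
    candidates.sort(key=lambda t: t[1], reverse=True)  # fuse and re-rank
    return candidates[:top_k]
```

In practice each branch would hold the features of one training subset, so the branches can run concurrently and only the small candidate lists need to be merged.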
Figure 3: The overall flow of the distributed retrieval framework based on fusion SURF matching. The green section is the data parallel processing in the retrieval framework, and the orange section is the fusion and post-processing of the data parallel processing results.
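The keypoint-matching flow in this caption can be sketched in the same spirit. Real SURF descriptors would come from a feature extractor (e.g., OpenCV's xfeatures2d module); the brute-force matcher, Lowe-style ratio test, and best-score fusion below are simplified stand-ins for illustration only.

```python
# Sketch of the Figure 3 flow: each branch matches the query descriptors
# against its own reference subset (green section); the branch results
# are then fused by keeping the best-matching reference (orange section).

def match_descriptors(query_desc, ref_desc, ratio=0.75):
    """Count query descriptors whose nearest reference descriptor passes
    a Lowe-style ratio test (nearest distance < ratio * second-nearest)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    good = 0
    for q in query_desc:
        ds = sorted(dist(q, r) for r in ref_desc)
        if len(ds) >= 2 and ds[0] < ratio * ds[1]:
            good += 1
    return good

def fused_match(query_desc, branch_refs):
    """Fusion: pick the reference image with the most good matches
    across all parallel branches."""
    best_label, best_score = None, -1
    for refs in branch_refs:                  # one branch per subset
        for label, ref_desc in refs:
            score = match_descriptors(query_desc, ref_desc)
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score
```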
Figure 4: Some image instances of the large-scale wine label dataset.
The numbers of samples for the main-brands and sub-brands in the dataset.
| Number of images | 11~20 | 21~30 | 31~50 | 51~100 | 101~1,371 |
|---|---|---|---|---|---|
| Number of main-brands | 8,974 | 3,359 | 2,811 | 1,473 | 711 |

| Number of images | 1 | 2 | 3 | 4~10 | 11~371 |
|---|---|---|---|---|---|
| Number of sub-brands | 129,932 | 71,257 | 24,993 | 32,802 | 1,595 |
The relevant information of four subsets.
| | Subset 1 | Subset 2 | Subset 3 | Subset 4 |
|---|---|---|---|---|
| NI | 11~14 | 15~20 | 21~42 | 43~1,371 |
| TNM | 5,136 | 4,838 | 4,324 | 4,030 |
| TNI | 51,468 | 79,873 | 157,576 | 259,940 |
Notes:
NI: the number of images under each main brand.
TNM: the total number of main-brands in each subset.
TNI: the total number of images in each subset.
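A minimal sketch of how main-brands might be partitioned into the four subsets above by their per-brand image counts. The NI thresholds follow the table; the helper name and the toy brand counts are hypothetical.

```python
# NI ranges from the four-subset table: each main-brand is routed to the
# subset whose range contains its number of sample images.
SUBSET_RANGES = [(11, 14), (15, 20), (21, 42), (43, 1371)]

def partition_brands(brand_image_counts):
    """Assign each main-brand to the subset whose NI range contains its
    image count; returns one list of brand names per subset."""
    subsets = [[] for _ in SUBSET_RANGES]
    for brand, n in brand_image_counts.items():
        for i, (lo, hi) in enumerate(SUBSET_RANGES):
            if lo <= n <= hi:
                subsets[i].append(brand)
                break
    return subsets
```

Partitioning by sample count keeps each subset roughly balanced internally, which is the point of the distributed strategy for handling inter-class imbalance.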
The experimental results on the large-scale wine label image dataset. The best performances are marked in bold.
| Methods | Backbone | MAP | SAP | Time |
|---|---|---|---|---|
| CSCSM | ResNeXt-50 | 91.07 | 78.40 | 9.5657 |
| CSCFM | DPN-92 | 92.23 | 82.32 | 2.8956 |
| OR | ResNeXt-50 | 90.67 | 76.98 | 2.2015 |
| DCSCFM1 | ResNeXt-50 | | | |
| DCSCFM2 | ResNeXt-50 | 93.16 | 83.36 | 2.2976 |
Notes:
MAP: the average retrieval accuracy for wine main-brands.
SAP: the average retrieval accuracy for wine sub-brands.
Time: the average retrieval time for each image.
The relevant information of two subsets.
| | Subset 1 | Subset 2 |
|---|---|---|
| NI | 11~20 | 21~1,371 |
| TNM | 8,974 | 8,354 |
| TNI | 130,341 | 417,516 |
Notes:
NI: the number of images under each main brand.
TNM: the total number of main-brands in each subset.
TNI: the total number of images in each subset.
The relevant information of eight subsets.
| | Subset 1 | Subset 2 | Subset 3 | Subset 4 |
|---|---|---|---|---|
| NI | 11 | 12 | 13~14 | 15~20 |
| TNM | 2,764 | 2,172 | 2,124 | 2,914 |
| TNI | 30,404 | 26,064 | 28,371 | 45,142 |

| | Subset 5 | Subset 6 | Subset 7 | Subset 8 |
|---|---|---|---|---|
| NI | 21~27 | 28~48 | 49~66 | 67~1,371 |
| TNM | 1,869 | 2,455 | 2,037 | 1,993 |
| TNI | 49,209 | 108,367 | 114,156 | 145,784 |
Notes:
NI: the number of images under each main brand.
TNM: the total number of main-brands in each subset.
TNI: the total number of images in each subset.
Ablations of the number of branches on the large-scale wine label image dataset. The best performances are marked in bold.
| Method | S | MAP | SAP | Time |
|---|---|---|---|---|
| DCSCFM1 | 2 | 92.81 | 82.54 | |
| | 4 | | | 1.3596 |
| | 8 | 93.08 | 83.32 | 2.3160 |
| | 16 | 92.05 | 82.41 | 3.5632 |
| DCSCFM2 | 2 | 92.33 | 81.81 | 1.9087 |
| | 4 | 93.16 | 83.36 | 2.2976 |
| | 8 | 93.23 | 83.21 | 2.4621 |
| | 16 | 92.49 | 82.56 | 2.7456 |
Notes:
S: the number of branches.
MAP: the average retrieval accuracy of the main-brands.
SAP: the average retrieval accuracy of the sub-brands.
Time: the average retrieval time for each image.
Ablations of the SVM model in DCSCFM1. The best performances are marked in bold.
| Method | S | MAP | SAP | Time |
|---|---|---|---|---|
| DCSCFM1(-SVM) | 2 | 92.01 | 81.32 | 1.6022 |
| | 4 | 93.16 | 83.42 | 2.0134 |
| | 8 | 92.73 | 82.97 | 5.9807 |
| | 16 | 91.67 | 81.96 | 11.6750 |
| DCSCFM1 | 2 | 92.81 | 82.54 | |
| | 4 | | | 1.3596 |
| | 8 | 93.08 | 83.32 | 2.3160 |
| | 16 | 92.05 | 82.41 | 3.5632 |
Notes:
S: the number of branches.
MAP: the average retrieval accuracy of the main-brands.
SAP: the average retrieval accuracy of the sub-brands.
Time: the average retrieval time for each image.
Figure 5: Typical images of the Oxford flowers dataset.
The experimental results on the Oxford flowers dataset. The best performances are marked in bold.
| Method | Top-1 mAP | Top-5 mAP |
|---|---|---|
| CroW | 73.67 | 76.16 |
| SPoC | 71.36 | 74.55 |
| R-MAC | 71.98 | 74.82 |
| SCDA | 75.13 | 77.70 |
| SGeM | 76.11 | 78.20 |
| DS | 76.09 | 78.15 |
| DCSCFM1 | | |
| DCSCFM2 | 76.88 | 78.49 |
Note:
The number of branches for both DCSCFM1 and DCSCFM2 is set to 4.