Junwen Deng, Yuhang Liu, Xinqing Xiao.
Abstract
The shiitake mushroom is the second-largest edible mushroom in the world, with high nutritional and medicinal value. The surface texture of shiitake mushrooms varies considerably with the growing environment, which leads to fluctuating market prices. To maximize the economic profit of the mushroom industry, harvested mushrooms need to be sorted according to their quality. This paper aimed to develop a deep-learning-based wireless visual sensor system for shiitake mushroom sorting, in which visual sensors collect the images and Wi-Fi modules handle their cooperative transmission. The model was trained using Vision Transformer, and three data-augmentation methods (Random Erasing, RandAugment, and Label Smoothing) were applied because only a small sample dataset was available. Training of the final model yielded an accuracy of 99.2%. In actual mushroom sorting, the developed system obtained an accuracy of 98.53%, with a processing time of 8.7 ms per image. The results showed that the system could efficiently complete the sorting of shiitake mushrooms with stable and high accuracy. In addition, the system could be extended to other sorting tasks based on visual features. It is also possible to combine binocular vision and multisensor technology with the current system to handle sorting work that requires higher accuracy and the identification of minor features.
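To illustrate the Label Smoothing step named in the abstract, here is a minimal sketch of one common formulation, in which the one-hot target is mixed with a uniform distribution over the classes. The smoothing factor `epsilon = 0.1` and the function name `smooth_labels` are assumptions for illustration; the abstract does not state the value or the exact variant the authors used.

```python
def smooth_labels(true_class: int, num_classes: int, epsilon: float = 0.1) -> list[float]:
    """Return a label-smoothed target distribution for a single sample.

    Common formulation: (1 - epsilon) * one_hot + epsilon / num_classes.
    epsilon = 0.1 is an illustrative assumption, not a value from the paper.
    """
    if not 0 <= true_class < num_classes:
        raise ValueError("true_class must be in [0, num_classes)")
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must be in [0, 1]")
    uniform_share = epsilon / num_classes
    return [
        (1.0 - epsilon) + uniform_share if k == true_class else uniform_share
        for k in range(num_classes)
    ]


# Three quality grades (Type A, B, C); the sample belongs to class 0.
target = smooth_labels(true_class=0, num_classes=3, epsilon=0.1)
print([round(p, 4) for p in target])
```

The smoothed target still sums to 1, but no class is assigned a probability of exactly 0 or 1, which discourages overconfident predictions on a small dataset.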
Keywords: deep learning; mushroom sorting; wireless sensor
Year: 2022 PMID: 35746391 PMCID: PMC9231019 DOI: 10.3390/s22124606
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
Figure 1. Sorting of shiitake mushrooms according to texture.
Figure 2. Block diagram of deep-learning-based wireless visual sensor system.
Figure 3. Physical implementation of deep-learning-based wireless visual sensor system.
Figure 4. Workflow of deep-learning-based wireless visual sensor system.
Figure 5. Shiitake mushroom images after Random Erasing.
Figure 6. Structure of Vision Transformer.
Figure 7. Training curves of ViT. (a) Loss curves of the training set and validation set; (b) training curves of the validation set.
Performance in ViT training.
| Method | ViT-Tiny/16 | ViT-Small/16 | ViT-Small/32 | ViT-Base/16 | ViT-Base/32 |
|---|---|---|---|---|---|
| Acc (%) | 98.3 | 98.9 | 97.2 | 99.2 | 96.2 |
| Train loss | 0.401 | 0.361 | 0.391 | 0.317 | 0.383 |
| Validation loss | 0.127 | 0.118 | 0.128 | 0.057 | 0.137 |
| Training time | 115 s | 177 s | 139 s | 472 s | 271 s |
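In the standard ViT naming, the suffix `/16` or `/32` is the side length of the square image patches in pixels, so the `/32` variants process far fewer tokens per image than the `/16` variants. The sketch below counts the patches for both sizes. The input resolution of 224 × 224 pixels is an assumption (it is the usual ViT default and is not stated in this record), and `num_patches` is a hypothetical helper name.

```python
def num_patches(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


IMAGE_SIZE = 224  # assumed input resolution; not given in this record

for patch_size in (16, 32):
    count = num_patches(IMAGE_SIZE, patch_size)
    print(f"patch size {patch_size}: {count} patches per image")
```

Under this assumption a `/16` model sees four times as many patches as a `/32` model, which is consistent with the `/16` variants in the table training more slowly but reaching higher accuracy.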
Figure 8Confusion matrix of ViT-Base/16.
Sorting results of ViT.
| Type | Overall Acc (%) | PPV | TPR | F1-Score |
|---|---|---|---|---|
| Type A | 98.53 | 1 | 0.9706 | 0.9851 |
| Type B | | 0.9714 | 1 | 0.9855 |
| Type C | | 0.9853 | 0.9853 | 0.9853 |
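The F1-Score column can be cross-checked against the other two columns, since F1 is the harmonic mean of precision (PPV) and recall (TPR). The short sketch below recomputes it from the PPV and TPR values reported for each type; the function name `f1_score` and the rounding tolerance are choices made for this example.

```python
def f1_score(ppv: float, tpr: float) -> float:
    """Harmonic mean of precision (PPV) and recall (TPR)."""
    if ppv + tpr == 0:
        return 0.0
    return 2 * ppv * tpr / (ppv + tpr)


# (PPV, TPR, reported F1) for each type, taken from the sorting results.
reported = {
    "Type A": (1.0, 0.9706, 0.9851),
    "Type B": (0.9714, 1.0, 0.9855),
    "Type C": (0.9853, 0.9853, 0.9853),
}

for name, (ppv, tpr, f1_reported) in reported.items():
    f1 = f1_score(ppv, tpr)
    # Allow for rounding of the published values to four decimals.
    matches = abs(f1 - f1_reported) < 5e-4
    print(f"{name}: computed F1 = {f1:.4f}, reported = {f1_reported}, match = {matches}")
```

All three recomputed values agree with the reported F1-Scores to four decimal places.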
Performance of ViT compared to CNN networks.
| Methods | Params | Inference Time/Image | Train Time/Epoch | Acc (%) | Validation Loss |
|---|---|---|---|---|---|
| ResNet-50 | 24 M | 13.2 ms | 4.3 s | 87.0 | 0.924 |
| Inception v3 | 22 M | 12.8 ms | 3.6 s | 96.4 | 0.166 |
| DenseNet-121 | 7 M | 18.2 ms | 3.6 s | 93.8 | 0.225 |
| SE-ResNeXt 50 | 26 M | 12.5 ms | 5.3 s | 96.7 | 0.181 |
| ViT-Tiny/16 | 6 M | 6.8 ms | 2.8 s | 98.3 | 0.127 |
| ViT-Base/16 | 86 M | 8.7 ms | 11.5 s | 99.2 | 0.057 |
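The per-image inference times in the comparison can be converted into an approximate throughput, which is often the more useful figure when sizing a sorting line. The sketch below does this conversion under the simplifying assumptions that images are processed one at a time and that image capture and Wi-Fi transmission add no delay; neither assumption comes from the source, and `images_per_second` is a hypothetical helper name.

```python
# Inference time per image in milliseconds, taken from the comparison table.
INFERENCE_MS = {
    "ResNet-50": 13.2,
    "Inception v3": 12.8,
    "DenseNet-121": 18.2,
    "SE-ResNeXt 50": 12.5,
    "ViT-Tiny/16": 6.8,
    "ViT-Base/16": 8.7,
}


def images_per_second(ms_per_image: float) -> float:
    """Upper-bound throughput for sequential, inference-only processing."""
    if ms_per_image <= 0:
        raise ValueError("ms_per_image must be positive")
    return 1000.0 / ms_per_image


for name, ms in INFERENCE_MS.items():
    print(f"{name:14s} {ms:5.1f} ms/image  ->  {images_per_second(ms):6.1f} images/s")
```

These figures are upper bounds on inference alone; the end-to-end rate of the physical system would also depend on image acquisition, transmission, and the mechanical sorting step.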