Chao Qi, Junfeng Gao, Kunjie Chen, Lei Shu, Simon Pearson.
Abstract
A high-resolution dataset is one of the prerequisites for tea chrysanthemum detection with deep learning algorithms, and is crucial for further developing a selective chrysanthemum harvesting robot. However, generating high-resolution datasets of the tea chrysanthemum in complex unstructured environments is a challenge. In this context, we propose a novel tea chrysanthemum generative adversarial network (TC-GAN) that attempts to deal with this challenge. First, we designed a non-linear mapping network for untangling the features of the underlying latent code. Then, a customized regularization method was used to provide fine-grained control over the image details. Finally, a gradient diversion design with multi-scale feature extraction capability was adopted to optimize the training process. The proposed TC-GAN was compared with 12 state-of-the-art generative adversarial networks, showing that an optimal average precision (AP) of 90.09% was achieved with the generated images (512 × 512) on the developed TC-YOLO object detection model under the NVIDIA Tesla P100 GPU environment. Moreover, the detection model was deployed on the embedded NVIDIA Jetson TX2 platform with a 0.1 s inference time, and this edge computing device could be further developed into a perception system for selective chrysanthemum picking robots in the future.
Keywords: NVIDIA Jetson TX2; deep learning; edge computing; generative adversarial network; tea chrysanthemum
Year: 2022 PMID: 35463441 PMCID: PMC9021924 DOI: 10.3389/fpls.2022.850606
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
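AP is the headline metric of this record (90.09% for TC-YOLO on the generated images). The paper does not state which AP variant was used, so as a reference point here is a minimal pure-Python sketch of the classic 11-point interpolated AP (PASCAL VOC style), under that assumption:

```python
def eleven_point_ap(recalls, precisions):
    """11-point interpolated AP: mean of the maximum precision among
    operating points with recall >= t, for t in {0.0, 0.1, ..., 1.0}."""
    ap = 0.0
    for i in range(11):
        t = i / 10
        # Precisions at operating points whose recall meets the threshold.
        candidates = [p for r, p in zip(recalls, precisions) if r >= t]
        ap += max(candidates) if candidates else 0.0
    return ap / 11

# Example: precision stays at 1.0 up to recall 0.5, then drops to 0.5.
ap = eleven_point_ap([0.0, 0.5, 1.0], [1.0, 1.0, 0.5])
```

A perfect detector (precision 1.0 at every recall level) scores AP = 1.0 under this scheme; the example above averages six thresholds at precision 1.0 and five at 0.5.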
FIGURE 1 The results of testing the tea chrysanthemum generative adversarial network (TC-GAN) on the NVIDIA Jetson TX2. First, we used an HDMI cable to connect the laptop to the Jetson TX2 and ensured that the laptop and the Jetson TX2 were on the same wireless network. Then, the TC-YOLO model and the tea chrysanthemum dataset were embedded in the flashed Jetson TX2 for testing.
Details of the twelve state-of-the-art generative adversarial networks used for comparison.
| Algorithm | Published year | Characteristic | Resolution |
| Progressive GAN | 2017 | Grows the generator and discriminator progressively | 64 × 64 |
| LSGAN | 2017 | Applies the least squares loss function | 112 × 112 |
| SN-GAN | 2018 | Applies spectral normalization | 32 × 32 |
| MGAN | 2018 | Applies multi-channel gait templates | 64 × 64 |
| Dist-GAN | 2018 | Applies a latent-data distance constraint | 64 × 64 |
| Rob-GAN | 2019 | Jointly optimizes the generator and discriminator | 128 × 128 |
| AutoGAN | 2019 | Applies a neural architecture search (NAS) algorithm | 64 × 64 |
| BigGAN | 2018 | Applies orthogonal regularization | 512 × 512 |
| Improved WGAN | 2020 | Injects instance noise | 128 × 128 |
| Improved WGAN-GP | 2021 | Wasserstein GAN with gradient penalty | 28 × 28 |
| Improved DCGAN | 2021 | Applies batch normalization | 64 × 64 |
| DAG | 2021 | Improves learning of the original distribution | 48 × 48 |
Published literature using GANs for image recognition in agriculture.
| Algorithm | Published year | Task | Accuracy (%) | Resolution | Test environment |
| DCGAN | 2018 | Plant disease detection | 88.6 | 64 × 64 | Illumination |
| C-DCGAN | 2019 | Tea leaf disease identification | 90 | 64 × 64 | Illumination |
| DCGAN | 2019 | Apple scab segmentation | 60 | 28 × 28 | Ideal |
| CycleGAN | 2019 | Detection of apple lesions in orchards | 95.57 | 64 × 64 | Ideal |
| DCGAN | 2019 | Tea clone identification | 76 | 64 × 64 | Ideal |
| Deep CORAL | 2020 | Potato defect classification | 90 | 64 × 64 | Ideal |
| CAAE | 2020 | Citrus plant disease recognition | 53.4 | 64 × 64 | Illumination |
| DCGAN | 2020 | Plant disease detection | 86.63 | 64 × 64 | Ideal |
| BEGAN | 2020 | Pine cone detection | 95.3 | 64 × 64 | Ideal |
| CGAN | 2020 | Kiwi geometry reconstruction | 75 | 28 × 28 | Ideal |
| DCGAN | 2020 | Plant disease classification | 95.88 | 64 × 64 | Ideal |
| DCGAN | 2020 | Recognition of diseased Pinus trees | 78.6 | 64 × 64 | Ideal |
| TasselGAN | 2020 | Plant trait detection | 94 | 128 × 128 | Illumination |
| CycleGAN | 2021 | Bale detection | 93 | 64 × 64 | Ideal |
| DCGAN | 2021 | Weed identification | 93.23 | 64 × 64 | Ideal |
| DoubleGAN | 2021 | Plant disease detection | 99.06 | 64 × 64 | Ideal |
| AR-GAN | 2020 | Plant disease recognition | 86.1 | 256 × 256 | Illumination |
FIGURE 2 Examples of the collected original images.
FIGURE 3 NVIDIA Jetson TX2 parameters.
FIGURE 4 Structure of the proposed TC-GAN network. The mapping network effectively captures the locations of latent codes with rich features, helping the generator network accurately extract complex unstructured features. A denotes the learned affine transformation; B denotes the learned per-channel scaling factor applied to the noise input. The discriminator network guides the training of the generator network: the two networks are trained alternately in an adversarial manner, ultimately enabling the generator network to better perform the generation task.
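The roles of A (learned affine transformation) and B (learned per-channel noise scaling) in Figure 4 resemble StyleGAN-style modulation. The sketch below is a simplified plain-Python illustration of that mechanism, not the paper's implementation; shapes and parameter names are assumptions, with channel feature maps represented as flat lists of floats:

```python
def style_modulate(features, scales, biases, noise, noise_strengths):
    """Inject per-channel scaled noise (B), then apply the affine style (A).

    features:        list of channels, each a list of activations
    scales, biases:  per-channel style parameters produced by the affine map A
    noise:           per-channel noise maps, same shape as features
    noise_strengths: per-channel learned noise scaling factors B
    """
    out = []
    for feat, s, b, n_map, ns in zip(features, scales, biases, noise, noise_strengths):
        # Each activation gets its noise sample scaled by B, then the
        # channel-wide style scale and bias from A.
        out.append([s * (x + ns * n) + b for x, n in zip(feat, n_map)])
    return out

# One channel with two activations, style scale 2.0, style bias 0.5.
y = style_modulate([[1.0, 2.0]], [2.0], [0.5], [[0.1, 0.2]], [1.0])
```

Adversarial training then alternates between updating the discriminator on real/generated batches and updating the generator (including the mapping network and these A/B parameters) against the discriminator's feedback.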
FIGURE 5 Impact of dataset size and epoch time on TC-GAN.
Performance comparison of different data augmentation methods.
| Flip | Shear | Crop | Rotation | Grayscale | Blur | Mixup | Cutout | Mosaic | TC-GAN | AP |
| √ |  |  |  |  |  |  |  |  |  | 86.33 |
|  | √ |  |  |  |  |  |  |  |  | 84.21 |
|  |  | √ |  |  |  |  |  |  |  | 83.99 |
|  |  |  | √ |  |  |  |  |  |  | 86.96 |
|  |  |  |  | √ |  |  |  |  |  | 82.09 |
|  |  |  |  |  | √ |  |  |  |  | 80.13 |
|  |  |  |  |  |  | √ |  |  |  | 80.33 |
|  |  |  |  |  |  |  | √ |  |  | 81.86 |
|  |  |  |  |  |  |  |  | √ |  | 84.31 |
| √ | √ |  |  |  |  |  |  |  |  | 87.39 |
|  |  |  |  |  |  |  |  |  | √ | 90.09 |
√ indicates that the corresponding data augmentation method was adopted.
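Mixup, one of the augmentation methods compared above, blends two training images and their labels with a mixing coefficient λ. A minimal plain-Python sketch (images as 2-D lists, labels as one-hot lists; in practice λ is usually drawn from a Beta distribution, which is omitted here):

```python
def mixup(img_a, img_b, label_a, label_b, lam):
    """Convexly combine two images and their one-hot labels with weight lam."""
    mixed_img = [
        [lam * a + (1 - lam) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(img_a, img_b)
    ]
    mixed_label = [lam * la + (1 - lam) * lb for la, lb in zip(label_a, label_b)]
    return mixed_img, mixed_label

# Blend two 2x2 "images" of different classes equally.
img, label = mixup([[0.0, 0.0], [0.0, 0.0]], [[2.0, 2.0], [2.0, 2.0]],
                   [1.0, 0.0], [0.0, 1.0], lam=0.5)
```

With λ = 0.5, every pixel is the midpoint of the two inputs and the label becomes an even mixture of the two classes, which is why mixup softens decision boundaries.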
Comparisons with state-of-the-art detection methods.
| Method | Backbone | Input size | FPS | mAP (%) |
| RetinaNet | ResNet101 | 800 × 800 | 4.54 | 82.62 |
| RetinaNet | ResNet50 | 800 × 800 | 5.31 | 80.59 |
| RetinaNet | ResNet101 | 500 × 500 | 7.23 | 79.13 |
| RetinaNet | ResNet50 | 500 × 500 | 7.87 | 83.68 |
| EfficientDetD6 | EfficientB6 | 1280 × 1280 | 5.29 | 81.23 |
| EfficientDetD5 | EfficientB5 | 1280 × 1280 | 6.21 | 83.51 |
| EfficientDetD4 | EfficientB4 | 1024 × 1024 | 7.93 | 83.19 |
| EfficientDetD3 | EfficientB3 | 896 × 896 | 9.28 | 84.83 |
| EfficientDetD2 | EfficientB2 | 768 × 768 | 11.66 | 84.22 |
| EfficientDetD1 | EfficientB1 | 640 × 640 | 15.26 | 82.93 |
| EfficientDetD0 | EfficientB0 | 512 × 512 | 37.61 | 82.81 |
| M2Det | VGG16 | 800 × 800 | 7.08 | 80.63 |
| M2Det | ResNet101 | 320 × 320 | 16.89 | 85.16 |
| M2Det | VGG16 | 512 × 512 | 21.22 | 80.88 |
| M2Det | VGG16 | 300 × 300 | 42.53 | 78.24 |
| YOLOv3 | DarkNet53 | 608 × 608 | 12.14 | 86.52 |
| YOLOv3 (SPP) | DarkNet53 | 608 × 608 | 15.66 | 83.89 |
| YOLOv3 | DarkNet53 | 416 × 416 | 43.25 | 84.13 |
| PFPNet (R) | VGG16 | 512 × 512 | 24.35 | 82.41 |
| RFBNetE | VGG16 | 512 × 512 | 21.54 | 77.37 |
| RFBNet | VGG16 | 512 × 512 | 45.46 | 85.53 |
| RefineDet | VGG16 | 512 × 512 | 31.33 | 81.12 |
| RefineDet | VGG16 | 448 × 448 | 43.31 | 79.66 |
| YOLOv4 | CSPDarknet53 | 608 × 608 | 19.22 | 85.11 |
| YOLOv4 | CSPDarknet53 | 512 × 512 | 24.63 | 84.34 |
| YOLOv5l | CSPDenseNet | 416 × 416 | 42.24 | 88.83 |
| YOLOv5m | CSPDenseNet | 416 × 416 | 36.91 | 86.68 |
| YOLOv5x | CSPDenseNet | 416 × 416 | 32.28 | 84.02 |
| YOLOv5s | CSPDenseNet | 416 × 416 | 47.88 | 88.29 |
| TC-YOLO | CSPDenseNet | 416 × 416 | 47.53 | 90.09 |
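The FPS column above reports per-image throughput on the Tesla P100, while the abstract gives a 0.1 s per-image inference time on the Jetson TX2; the two figures are simply reciprocals. A trivial conversion helper (illustrative, not from the paper):

```python
def latency_to_fps(latency_s):
    """Throughput (frames per second) at a given per-image latency."""
    return 1.0 / latency_s

def fps_to_latency_s(fps):
    """Per-image inference latency (seconds) at a given throughput."""
    return 1.0 / fps

# 0.1 s per image on the Jetson TX2 corresponds to 10 frames per second,
# versus roughly 0.021 s per image at TC-YOLO's 47.53 FPS on the P100.
tx2_fps = latency_to_fps(0.1)
```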
FIGURE 6 Qualitative results of our method. The red boxes indicate the recognised tea chrysanthemums.
FIGURE 7 Examples of nine unstructured scenarios.
Impact of different unstructured scenarios on TC-YOLO.
| Environment | Count | Correctly identified (amount) | Correctly identified (%) | Falsely identified (amount) | Falsely identified (%) | Missed (amount) | Missed (%) |
| Strong light | 6511 | 5021 | 77.12 | 686 | 10.54 | 804 | 12.35 |
| Weak light | 10162 | 8786 | 86.46 | 857 | 8.43 | 519 | 5.11 |
| Normal light | 18686 | 17458 | 93.43 | 988 | 5.29 | 240 | 1.28 |
| High overlap | 5249 | 4167 | 79.39 | 379 | 7.22 | 703 | 13.39 |
| Moderate overlap | 11892 | 10420 | 87.62 | 659 | 5.54 | 813 | 6.84 |
| Normal overlap | 17443 | 16499 | 94.59 | 419 | 2.4 | 525 | 3.01 |
| High occlusion | 7811 | 6284 | 80.45 | 729 | 9.33 | 798 | 10.22 |
| Moderate occlusion | 12162 | 10661 | 87.66 | 630 | 5.18 | 890 | 7.16 |
| Normal occlusion | 19299 | 18147 | 94.03 | 648 | 3.36 | 504 | 2.61 |
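The percentage columns in the table above generally correspond to each category's amount divided by the row's total count. A one-line helper reproduces them, using the weak-light row as an example:

```python
def rate_percent(amount, count):
    """Share of a category within a row total, as a percentage to 2 dp."""
    return round(100.0 * amount / count, 2)

# Weak light: 8786 of 10162 flowers correctly identified.
weak_light_correct = rate_percent(8786, 10162)
```

Note that within each row the three amounts sum to the count, so the three rates should sum to roughly 100% (up to rounding).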
Comparison between the tea chrysanthemum generative adversarial network (TC-GAN) and state-of-the-art GANs.
| Method | Resolution | Training time (min) | AP (%) |
| Improved SN-GAN | 32 × 32 | 1290 | 80.61 |
| BigGAN | 512 × 512 | 1610 | 86.45 |
| Dist-GAN | 64 × 64 | 1322 | 80.68 |
| Progressive GAN | 64 × 64 | 1256 | 81.11 |
| LSGAN | 112 × 112 | 1410 | 84.03 |
| Rob-GAN | 128 × 128 | 1293 | 85.28 |
| MGAN | 64 × 64 | 1151 | 82.39 |
| AutoGAN | 64 × 64 | 1340 | 83.25 |
| Improved DCGAN | 64 × 64 | 1280 | 84.38 |
| DAG | 48 × 48 | 1768 | 83.29 |
| Improved WGAN-GP | 28 × 28 | 1640 | 76.16 |
| Improved WGAN | 128 × 128 | 1501 | 87.16 |
| TC-GAN | 512 × 512 | 1460 | 90.09 |
Generation results of different GANs (sample images from Improved SN-GAN, Dist-GAN, Progressive GAN, MGAN, AutoGAN, DAG, LSGAN, Improved DCGAN, Rob-GAN, BigGAN, Improved WGAN, and TC-GAN; the images are not reproduced in this extract).
FIGURE 8 (A) Visualization results and (B,C) training process.