Linlu Zu1, Yanping Zhao1, Jiuqin Liu1, Fei Su1, Yan Zhang2, Pingzeng Liu2.
Abstract
Mature green tomatoes are similar in color to branches and leaves, and they are often shaded by foliage or overlapped by other tomatoes, so detecting and locating them accurately is difficult. This paper proposes using the Mask R-CNN algorithm to detect and segment mature green tomatoes. A mobile robot is designed to collect images around the clock and under different conditions throughout the greenhouse, so that the captured dataset is not limited to objects preselected by users. After the training process, ResNet50-FPN is selected as the backbone network. The feature map is then passed through the region proposal network to generate regions of interest (ROIs), and ROIAlign uses bilinear interpolation to compute the target region, so that the corresponding region of the feature map is pooled to a fixed size based on the position coordinates of the preselection box. Finally, detection and segmentation of mature green tomatoes are realized by three parallel branches: ROI target classification, bounding box regression, and mask prediction. The trained model performs best when the Intersection over Union threshold is 0.5. The experimental results show that the F1-Scores of both the bounding box and the mask region reach 92.0%. Although the image acquisition process is fully unobserved, with no user preselection, and the images form a highly heterogeneous mix, the selected Mask R-CNN algorithm can still detect mature green tomatoes accurately. The performance of the proposed model is also evaluated in a real greenhouse harvesting environment, which facilitates its direct application in a tomato harvesting robot.
Keywords: Mask R-CNN; detection and segmentation; mature green tomato; mobile robot
Year: 2021 PMID: 34883843 PMCID: PMC8659851 DOI: 10.3390/s21237842
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
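The abstract reports F1-Scores at an Intersection over Union (IoU) threshold of 0.5. The following is a minimal sketch of how such a score could be computed for bounding boxes. It is not the authors' evaluation code: the function names `box_iou` and `f1_at_iou`, the `(x1, y1, x2, y2)` box format, and the greedy one-to-one matching are all assumptions made for illustration.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_at_iou(preds, truths, thr=0.5):
    """Precision, recall and F1 with greedy one-to-one matching.

    Assumption: `preds` is already sorted by descending confidence.
    A prediction counts as a true positive if its best remaining
    ground-truth box overlaps it with IoU >= thr.
    """
    unmatched = list(truths)
    tp = 0
    for p in preds:
        best = max(unmatched, key=lambda t: box_iou(p, t), default=None)
        if best is not None and box_iou(p, best) >= thr:
            unmatched.remove(best)
            tp += 1
    fp = len(preds) - tp
    fn = len(unmatched)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical toy data: one good detection, one false alarm, one miss.
truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(1, 1, 10, 10), (50, 50, 60, 60)]
print(f1_at_iou(preds, truths, thr=0.5))
```

The same matching logic would apply to masks if `box_iou` were replaced by a pixel-wise mask IoU.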
Figure 1. The (a) internal components and (b) working condition diagram of the greenhouse mobile robot.
Figure 2. The (a) actual and (b) schematic diagram of the greenhouse path.
Figure 3. Images of mature green tomatoes from different angles under positive light and backlight conditions.
Figure 4. Visualization of manually annotated categories.
Figure 5. The structure of tomato detection and segmentation based on Mask R-CNN.
Figure 6. The process diagram of RPN.
Figure 7. Model performance evaluation experiment in greenhouse field environment.
Evaluation index values for mature green tomato dataset under different backbone networks.
| Backbone Network | FPS | F1-Score (bbox) | F1-Score (mask) | Index |
|---|---|---|---|---|
| ResNet50 | 5.77 | 0.9336 | 0.9241 | 0.8531 |
| ResNet50-FPN | 26.10 | 0.9290 | 0.9284 | 0.9430 |
| ResNet101-FPN | 19.53 | 0.9265 | 0.9300 | 0.9135 |
| ResNeXt101-vd-FPN | 9.34 | 0.9302 | 0.9323 | 0.8710 |
| SENet154-vd-FPN | 3.49 | 0.9318 | 0.9345 | 0.8465 |
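As a rough illustration of how the table above supports the choice of ResNet50-FPN, the sketch below ranks the backbones by the reported values. The formula behind the "Index" column is not given in this record, so the code only compares the published numbers; the dictionary layout and variable names are assumptions.

```python
# Values copied from the table above: (FPS, F1 bbox, F1 mask, Index).
backbones = {
    "ResNet50":          (5.77,  0.9336, 0.9241, 0.8531),
    "ResNet50-FPN":      (26.10, 0.9290, 0.9284, 0.9430),
    "ResNet101-FPN":     (19.53, 0.9265, 0.9300, 0.9135),
    "ResNeXt101-vd-FPN": (9.34,  0.9302, 0.9323, 0.8710),
    "SENet154-vd-FPN":   (3.49,  0.9318, 0.9345, 0.8465),
}

# Backbone with the highest reported Index and the highest FPS.
best_by_index = max(backbones, key=lambda name: backbones[name][3])
fastest = max(backbones, key=lambda name: backbones[name][0])

# Spread of the mask F1-Score across all backbones.
mask_f1 = [values[2] for values in backbones.values()]
mask_f1_spread = max(mask_f1) - min(mask_f1)

print(best_by_index, fastest, round(mask_f1_spread, 4))
```

Read this way, the mask F1-Scores differ by only about one percentage point across backbones, while the frame rate varies several-fold, which is consistent with ResNet50-FPN having the highest Index.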
Figure 8. The performance evaluation index of Mask R-CNN model with ResNet50-FPN as the backbone network.
Figure 9. The detection and segmentation results of mature green tomatoes in different states.
Recognition performance comparison between the trained Mask R-CNN model and manual detection.
| Sample | Manual: Unshaded/Lightly Shaded | Manual: Shaded More Than 50% | Manual: Total | Mask R-CNN: Unshaded/Lightly Shaded | Mask R-CNN: Shaded More Than 50% | Mask R-CNN: Total | Recognition Accuracy (%) |
|---|---|---|---|---|---|---|---|
| 1 | 6 | 1 | 7 | 6 | 1 | 7 | 100 |
| 2 | 3 | 0 | 3 | 3 | 0 | 3 | 100 |
| 3 | 7 | 2 | 9 | 7 | 1 | 8 | 88.9 |
| 4 | 7 | 1 | 8 | 7 | 1 | 8 | 100 |
| 5 | 9 | 3 | 12 | 9 | 2 | 11 | 91.7 |
| 6 | 4 | 0 | 4 | 4 | 0 | 4 | 100 |
| 7 | 5 | 1 | 6 | 5 | 0 | 5 | 83.3 |
| 8 | 3 | 0 | 3 | 3 | 0 | 3 | 100 |
| 9 | 8 | 2 | 10 | 8 | 1 | 9 | 90 |
| 10 | 6 | 2 | 8 | 6 | 2 | 8 | 100 |
| 11 | 10 | 3 | 13 | 9 | 3 | 12 | 92.3 |
| 12 | 9 | 2 | 11 | 9 | 1 | 10 | 90.9 |
| 13 | 7 | 1 | 8 | 6 | 1 | 7 | 87.5 |
| 14 | 6 | 0 | 6 | 6 | 0 | 6 | 100 |
| 15 | 11 | 3 | 14 | 10 | 2 | 12 | 85.7 |
| Total | 101 | 21 | 122 | 98 | 15 | 113 | 92.6 |
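The totals in the table above can be recomputed from the per-sample counts. The sketch below does this and also splits the recognition rate by shading category, a breakdown the table implies but does not state. The tuple layout and variable names are assumptions for illustration.

```python
# Per-sample counts copied from the table above:
# (manual unshaded, manual shaded >50%, detected unshaded, detected shaded >50%)
samples = [
    (6, 1, 6, 1), (3, 0, 3, 0), (7, 2, 7, 1), (7, 1, 7, 1), (9, 3, 9, 2),
    (4, 0, 4, 0), (5, 1, 5, 0), (3, 0, 3, 0), (8, 2, 8, 1), (6, 2, 6, 2),
    (10, 3, 9, 3), (9, 2, 9, 1), (7, 1, 6, 1), (6, 0, 6, 0), (11, 3, 10, 2),
]

# Column sums across all 15 samples.
manual_unshaded = sum(s[0] for s in samples)
manual_shaded = sum(s[1] for s in samples)
found_unshaded = sum(s[2] for s in samples)
found_shaded = sum(s[3] for s in samples)

manual_total = manual_unshaded + manual_shaded
found_total = found_unshaded + found_shaded

# Recognition rates in percent, rounded to one decimal place.
overall = round(100 * found_total / manual_total, 1)
unshaded_rate = round(100 * found_unshaded / manual_unshaded, 1)
shaded_rate = round(100 * found_shaded / manual_shaded, 1)

print(overall, unshaded_rate, shaded_rate)  # 92.6 97.0 71.4
```

The breakdown suggests that most missed tomatoes fall in the "shaded more than 50%" category.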
Figure 10. Results of the data imbalance test on mature green tomatoes.
Scientific studies in tomato detection and segmentation based on image analysis.
| Author | Method | Sensor | No. of Images | Reported Metrics |
|---|---|---|---|---|
| Huang et al., 2020 | Mask R-CNN with ResNet-101-FPN | RGB camera | 900 images with data augmentation | Detection accuracy of cherry tomato is 98% |
| Afonso et al., 2020 | Mask R-CNN with ResNeXt-101 | 4 RealSense cameras mounted on a pipe rail trolley | 123 images without data augmentation | F1-Score of red tomato is 0.93, and of green tomato is 0.94 |
| Tenorio et al., 2021 | MobileNetV1 CNN for detection and color space segmentation | RGB camera mounted on a pipe rail trolley | 254 images with data augmentation | Detection accuracy of cherry tomato cluster is 95.98% |
| Benavides et al., 2020 | Sobel operator for detection, color-based segmentation and size-based segmentation | RGB camera located perpendicular to the soil surface | 175 images | Detection accuracy of beef tomato is 90%, and of cluster tomato is 79.7% |
| Proposed | Mask R-CNN with ResNet-50-FPN | RGB camera mounted on a mobile greenhouse robot | 3180 images without data augmentation | F1-Score of mask for green tomato is 0.9284 |