| Literature DB >> 32397598 |
Saeed Khaki, Hieu Pham, Ye Han, Andy Kuhl, Wade Kent, Lizhi Wang.
Abstract
Precise in-season corn grain yield estimates enable farmers to make real-time, accurate harvest and grain marketing decisions, minimizing possible losses of profitability. A well-developed corn ear can have up to 800 kernels, but manually counting the kernels on an ear of corn is labor-intensive, time-consuming, and prone to human error. From an algorithmic perspective, detecting the kernels in a single corn ear image is challenging due to the large number of kernels at different angles and the very small distances between kernels. In this paper, we propose a kernel detection and counting method based on a sliding window approach. The proposed method detects and counts all corn kernels in a single corn ear image taken under uncontrolled lighting conditions. The sliding window approach uses a convolutional neural network (CNN) for kernel detection. Then, non-maximum suppression (NMS) is applied to remove overlapping detections. Finally, windows that are classified as kernels are passed to another CNN regression model that finds the (x, y) coordinates of the center of each kernel image patch. Our experiments indicate that the proposed method can successfully detect corn kernels with a low detection error and is also able to detect kernels on a batch of corn ears positioned at different angles.
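The abstract's pipeline classifies sliding windows with a CNN and then prunes overlapping detections with non-maximum suppression. A minimal sketch of the NMS step follows; the corner-coordinate box format and the IoU threshold are illustrative assumptions, not values from the paper.

```python
import numpy as np

def iou(box, boxes):
    """Intersection-over-union of one box against an array of boxes.
    Boxes are (x1, y1, x2, y2) corner coordinates (an assumed convention)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def non_max_suppression(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: repeatedly keep the highest-scoring remaining window
    and discard any window overlapping it above the IoU threshold."""
    order = np.argsort(scores)[::-1]   # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        order = rest[iou(boxes[i], boxes[rest]) < iou_thresh]
    return keep
```

Because neighboring sliding windows fire on the same kernel, NMS is what turns a dense response map into one detection per kernel.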
Keywords: convolutional neural networks; corn kernel counting; digital agriculture; object detection
Year: 2020 PMID: 32397598 PMCID: PMC7249160 DOI: 10.3390/s20092721
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Three genetically different corn ears. Images (a–c) have different backgrounds. We included different types of backgrounds such as soil, grass, and hands in the training data to make the proposed method robust against the image background.
Figure 2. Modeling structure of our proposed corn kernel detection method. A detailed description is given in Section 2.
The CNN architecture for kernel classification. The Conv, FC, and Avg pool stand for convolutional layer, fully connected layer, and average pooling layer, respectively.
| Type/Stride | Filter Size | Number of Filters | Output Size |
|---|---|---|---|
| Conv/s1 | | 32 | |
| Conv/s1 | | 32 | |
| Avg pool/s2 | | - | |
| Conv/s1 | | 64 | |
| Conv/s1 | | 64 | |
| Conv/s1 | | 64 | |
| Avg pool/s1 | | - | |
| FC-256 | | | |
| FC-128 | | | |
| Sigmoid | | | |
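To illustrate how the spatial output sizes compose through such a stack, the arithmetic below traces the classifier's layer order (two convolutions, average pooling at stride 2, three more convolutions, average pooling at stride 1, then the FC head). The 'same' padding, 2×2 pooling window, and 64×64 input are assumptions for illustration; the filter and output sizes did not survive in this extract.

```python
def conv_out(size, stride=1):
    # 'same'-padded convolution: spatial size shrinks only via the stride
    return (size + stride - 1) // stride

def pool_out(size, window=2, stride=1):
    # pooling with an assumed 2x2 window
    return (size - window) // stride + 1

def classifier_spatial_size(input_size=64):
    """Trace the spatial size through the classifier table:
    Conv/s1 x2 -> Avg pool/s2 -> Conv/s1 x3 -> Avg pool/s1."""
    s = input_size
    s = conv_out(s)                 # Conv/s1, 32 filters
    s = conv_out(s)                 # Conv/s1, 32 filters
    s = pool_out(s, stride=2)       # Avg pool/s2
    s = conv_out(s)                 # Conv/s1, 64 filters
    s = conv_out(s)                 # Conv/s1, 64 filters
    s = conv_out(s)                 # Conv/s1, 64 filters
    s = pool_out(s, stride=1)       # Avg pool/s1
    return s
```

The resulting feature map is flattened into the FC-256/FC-128 head, and the final sigmoid outputs a kernel/non-kernel probability.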
The CNN architecture for finding the coordinates of the center of a kernel image. The Conv and FC stand for convolutional layer and fully connected layer, respectively.
| Type/Stride | Filter Size | Number of Filters | Output Size |
|---|---|---|---|
| Conv/s1 | | 32 | |
| Conv/s1 | | 32 | |
| Max pool/s2 | | - | |
| Conv/s1 | | 64 | |
| Conv/s1 | | 64 | |
| Conv/s1 | | 64 | |
| Max pool/s2 | | - | |
| FC-100 | | | |
| FC-50 | | | |
| FC-10 | | | |
| FC-2 | | | |
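The final FC-2 layer outputs the (x, y) center of a kernel within its image patch. To place a detection on the full ear image, each patch-local prediction must then be offset by its sliding window's position; a minimal sketch, where the top-left-origin coordinate convention is an assumption not stated in this extract:

```python
def patch_center_to_image(pred_xy, window_origin):
    """Map an (x, y) kernel center predicted inside an image patch back to
    full-image coordinates, given the sliding window's top-left corner."""
    px, py = pred_xy          # patch-local regression output (FC-2)
    ox, oy = window_origin    # where the window was cut from the image
    return (ox + px, oy + py)
```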
Figure 3. A random subset of kernel images.
Figure 4. A random subset of non-kernel images.
Figure 5. A random subset of annotated kernel images. The blue dot indicates the center of the kernel.
Figure 6. Plot of the log loss of the CNN classifier during the training process.
Performance comparison of the CNN and HOG+SVM classifiers on the training and test datasets.
| Dataset | Classifier | FP | FN | Accuracy | F-Score |
|---|---|---|---|---|---|
| Training | HOG+SVM | 596 | 595 | 0.947 | 0.937 |
| | CNN | 0 | 0 | 1.0 | 1.0 |
| Test | HOG+SVM | 135 | 135 | 0.918 | 0.906 |
| | CNN | 19 | 22 | 0.987 | 0.985 |
Figure 7. Plot of the smooth loss of the CNN regression model during the training process. The unit of the loss is pixels.
Figure 8. The results of the proposed approach on five different test images.
The predicted and ground-truth numbers of kernels for the test images shown in Figure 8.
| Test Image | Predicted | Actual |
|---|---|---|
| 1 | 1012 | 1046 |
| 2 | 312 | 323 |
| 3 | 550 | 585 |
| 4 | 342 | 296 |
| 5 | 390 | 394 |
Performance of the competing methods on the kernel counting task for 20 different corn ears.
| Method | RMSE | MAE | Correlation Coefficient (%) |
|---|---|---|---|
| Proposed | 33.11 | 25.95 | 95.86 |
| Deep Crowd | 45.29 | 35.25 | 93.12 |
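The error measures in the table can be computed from paired predicted and ground-truth kernel counts. A minimal sketch, assuming the correlation coefficient is a Pearson correlation reported as a percentage:

```python
import numpy as np

def count_metrics(pred, actual):
    """RMSE, MAE, and Pearson correlation (as a percentage) between
    predicted and ground-truth kernel counts."""
    pred = np.asarray(pred, dtype=float)
    actual = np.asarray(actual, dtype=float)
    err = pred - actual
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mae = float(np.mean(np.abs(err)))
    corr = float(np.corrcoef(pred, actual)[0, 1]) * 100.0
    return rmse, mae, corr
```

Note that the paper's table values are aggregated over 20 ears, so applying this to only the five test-image counts listed above would not reproduce them.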
Figure 9. The left and right plots show the predicted number of kernels versus the ground-truth number of kernels for the Deep Crowd method and the proposed method, respectively.