Rice is one of the most important food crops in the world, and rice seed varieties are related to the yield and quality of rice. This study used near-infrared (NIR) hyperspectral technology with conventional machine learning methods (support vector machine (SVM), logistic regression (LR), and random forest (RF)) and deep learning methods (LeNet, GoogLeNet, and residual network (ResNet)) to establish variety identification models for five common types of rice seeds. Among the deep learning methods, the classification accuracies of most models were higher than 95%. This study further used the deep learning methods to establish variety identification models for 10 varieties of rice seeds without considering their types. Among them, the ResNet model had the best classification results. The classification accuracy on the test set was 86.08%. This study used the saliency map method to visualize each convolutional neural network (CNN) model to find the band region that contributed the most to the data. The results showed that the bands with the largest data contribution were mainly concentrated at approximately 1300-1400 nm and secondarily concentrated at approximately 1050-1250 nm. The overall results showed that NIR hyperspectral imaging technology combined with deep learning could effectively distinguish rice seeds of different varieties. This method provided an effective way to identify rice seed varieties in a quick and nondestructive manner.
Rice is one of the world's three major food crops, and China's rice output has ranked first in the world for many years. Because temperature, humidity, and soil conditions differ among regions, rice quality also varies.[1] The quality of rice seeds directly affects the yield and quality of grain. High-quality rice varieties have higher nutritional and economic value.[2,3] However, shoddy goods are often encountered in actual market transactions, which harms the interests of farmers and affects grain production.[4] With the development of hybridization and molecular breeding technologies, the number of rice varieties continues to increase, and existing detection technology can no longer meet the needs of rice seed variety detection.

Many types of rice seeds are available. According to their morphological and physiological characteristics, rice can be divided into japonica rice and indica rice. Based on whether it is obtained by hybridization or mutation, rice can be divided into conventional rice and hybrid rice. According to whether it is obtained by genetic modification technology, rice can be divided into transgenic rice and nontransgenic rice. Each of these types contains different varieties of rice seeds. At present, most studies on the classification of rice seed varieties classify the varieties directly, without considering the type of rice seed. Existing seed variety detection methods include morphological methods, fluorescence scanning-based identification methods, chemical identification methods, and electrophoresis identification methods, but these methods are time-consuming, imprecise, and cumbersome to operate.[5] Therefore, a rapid and nondestructive detection technology for rice seed varieties is needed. Near-infrared (NIR) hyperspectral imaging technology is a nondestructive testing technology that can obtain both spatial and spectral information from a sample and can be used to characterize its physicochemical properties and external morphology. In recent years, NIR technology has been widely used to test the quality and variety of various seeds,[6] such as corn,[7,8] wheat,[9,10] rice,[11,12] and soybeans.[13,14]

Each pixel in hyperspectral image data has a spectrum, and
each band is a grayscale image, so the data provide both spatial and spectral information about the sample. The spectra of different samples are very similar, and it is difficult to complete a quantitative analysis or perform accurate classification by observation alone. Therefore, machine learning methods are needed to extract the information that is useful for the target study from the input data, and hyperspectral image processing is usually combined with such methods. A suitable machine learning method can mine the information contained in a hyperspectral image effectively. Deep learning is a current research hotspot within machine learning, and it can learn features from the data automatically. Deep learning has been widely used for data processing and analysis in different fields. Recently, some scholars have combined hyperspectral imaging technology with deep learning methods to detect the quality and varieties of seeds. Nie et al.[15] combined NIR hyperspectral imaging technology with a deep convolutional neural network (DCNN) to identify hybrid okra seeds and hybrid loofah seeds. Weng et al.[16] combined hyperspectral imaging technology with a deep learning-based principal component analysis network to establish a rice variety identification model. Zhang et al.[17] used NIR hyperspectral technology with deep learning methods to identify coated corn seed varieties, establishing classification models with a CNN, a recurrent neural network (RNN), and long short-term memory (LSTM); the classification accuracy exceeded 90%. These studies show that the combination of hyperspectral imaging technology and deep learning can effectively identify seed varieties. However, although deep learning methods are effective, they lack interpretability. To address this, Zhang et al.[18] used the class activation map (CAM) method to visualize a CNN, and Yan et al.[19] used the saliency map method to visualize a CNN. Most current studies successfully use NIR hyperspectral imaging with deep learning methods to identify the seed varieties of various crops. Rice seed varieties have also been classified, but most of the related studies did not consider the effect of the rice seed type on variety classification. Moreover, deep learning models are difficult to interpret, and most studies have not investigated this aspect.

In this study, we studied
five common rice types: conventional japonica rice, conventional indica rice, hybrid japonica rice, hybrid indica rice, and indica–japonica hybrid rice. We studied two cases: classifying varieties with the seed type considered and classifying varieties without considering the seed type. In addition, the saliency map method was used to visualize the tested CNN models, both to increase their interpretability and to find the band region of the hyperspectral data that contributed the most to the classification of the rice seed varieties. The main purposes of this paper were to (1) study NIR hyperspectral imaging technology in combination with deep learning methods to identify the varieties of different types of rice seeds; (2) use NIR hyperspectral imaging technology with deep learning methods to identify 10 varieties of rice seeds regardless of their type; and (3) use saliency maps to visualize the CNN models and find the band region where the hyperspectral data contributed the most to the classification of the rice seed varieties.
Results and Discussion
Spectroscopic Analysis
Figure 1 shows the average preprocessed spectra of the different rice seed varieties within each type. The raw spectra and preprocessed spectra of different types of rice seeds are shown in Figure S1 (in the Supporting Information). Differences were observed between the average spectra of different varieties of rice seeds. Among the conventional japonica rice, Huaidao 5 and Nanjing 9108 had large differences in the wavelength range of 1000–1100 nm, but the differences at other wavelengths were small. Among the conventional indica rice, the differences between Zhongjiazao 17 and Zhongzao 39 were small in all bands. Among the hybrid japonica rice, Bayou 8 and Yongyou 10 were different at approximately 1200 nm, but the differences at other wavelengths were small. Among the hybrid indica rice, Chunyou 84 and Yongyou 12 were different in the wavelength range of 1000–1400 nm. Among the indica–japonica hybrid rice, Y liangyou 2 and Zhongzheyou 8 had relatively small differences in all bands. Figure 1(f) shows the average spectral curves of all varieties of rice seeds. Without considering the seed type, differences in the average spectra were observed between some varieties, but the differences between most varieties were not obvious. In general, the differences between the average spectra of different varieties of rice samples were not obvious, and further analysis was needed to classify the rice seed varieties.
Figure 1. Average spectra of different types of rice seeds: (a) conventional japonica rice: Huaidao 5 and Nanjing 9108, (b) conventional indica rice: Zhongjiazao 17 and Zhongzao 39, (c) hybrid japonica rice: Bayou 8 and Yongyou 10, (d) hybrid indica rice: Chunyou 84 and Yongyou 12, (e) indica–japonica hybrid rice: Y liangyou 2 and Zhongzheyou 8, and (f) all varieties.
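The comparison of average spectra described above can be illustrated with a short sketch. It is only a minimal example under stated assumptions: the wavelength grid and reflectance values are made up for illustration, and the helper names `mean_spectrum` and `largest_gap` are hypothetical rather than taken from the study.

```python
def mean_spectrum(spectra):
    """Average a list of equal-length spectra band by band."""
    n = len(spectra)
    return [sum(band) / n for band in zip(*spectra)]


def largest_gap(wavelengths, mean_a, mean_b):
    """Return the wavelength at which two mean spectra differ the most."""
    gaps = [abs(a - b) for a, b in zip(mean_a, mean_b)]
    return wavelengths[gaps.index(max(gaps))]


# Illustrative data only: two seeds per variety, five bands each.
wavelengths = [1000, 1100, 1200, 1300, 1400]   # nm, assumed grid
variety_a = [[0.50, 0.48, 0.40, 0.35, 0.30],
             [0.52, 0.50, 0.42, 0.35, 0.32]]   # made-up reflectance
variety_b = [[0.40, 0.46, 0.41, 0.34, 0.31],
             [0.42, 0.48, 0.41, 0.36, 0.31]]   # made-up reflectance

mean_a = mean_spectrum(variety_a)
mean_b = mean_spectrum(variety_b)
print(largest_gap(wavelengths, mean_a, mean_b))  # 1000
```

With real data, each inner list would be the preprocessed spectrum of one seed, and the two mean curves would correspond to one panel of Figure 1.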
Classification Results of Rice Seed Varieties Considering Seed Types
Classification Results
The average spectra could not effectively distinguish different rice seed varieties. To obtain better classification results, this study used three conventional machine learning methods (support vector machine (SVM), logistic regression (LR), and random forest (RF)) and CNNs with three network structures (LeNet, GoogLeNet, and residual network (ResNet)) to establish models for identifying rice seeds of the same type but different varieties. The number of rice seeds of each variety is shown in Table 1, with a total of 1900 grains. We took each type of rice seed (each type contained two varieties) as a whole and divided its samples into a training set, validation set, and test set in an approximate ratio of 6:1:1. Table S1 shows the data set division for each variety of samples.
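Table 1. Sample Number of Different Varieties of Rice Seeds

| type | variety | number |
|---|---|---|
| conventional japonica rice | Huaidao 5 | 200 |
| conventional japonica rice | Nanjing 9108 | 200 |
| conventional indica rice | Zhongzao 39 | 200 |
| conventional indica rice | Zhongjiazao 17 | 200 |
| hybrid japonica rice | Yongyou 10 | 100 |
| hybrid japonica rice | Bayou 8 | 100 |
| hybrid indica rice | Zhongzheyou 8 | 300 |
| hybrid indica rice | Yliangyou 2 | 300 |
| indica–japonica hybrid rice | Chunyou 84 | 100 |
| indica–japonica hybrid rice | Yongyou 12 | 200 |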
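The 6:1:1 division can be sketched as follows. This is a minimal sketch, not the authors' procedure: it assumes the split is made separately within each variety (the actual per-variety division is given in Table S1), and the function name `split_6_1_1` and the random seed are hypothetical.

```python
import random


def split_6_1_1(labels, seed=0):
    """Split sample indices into train/validation/test lists, class by class.

    Assumption: each class contributes about one eighth of its samples to the
    validation set, one eighth to the test set, and the rest to training.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        n_val = round(len(idxs) / 8)
        n_test = round(len(idxs) / 8)
        n_train = len(idxs) - n_val - n_test
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]
    return train, val, test


# Conventional japonica rice from Table 1: 200 seeds per variety.
labels = [0] * 200 + [1] * 200   # 0 = Huaidao 5, 1 = Nanjing 9108
train, val, test = split_6_1_1(labels)
print(len(train), len(val), len(test))  # 300 50 50
```

Splitting within each class keeps the two varieties balanced in all three sets, which matters when a type has only 100 seeds per variety.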
Table 2 shows the classification results of the different models for each type of rice seed. A model with 90% or greater accuracy on all three sets was defined as good, a model with 80% or greater accuracy on all three sets was defined as fair, and the rest were defined as poor. For the conventional japonica rice models, Huaidao 5 and Nanjing 9108 were assigned category values of 0 and 1, respectively. The LR model had the worst classification results, with a classification accuracy of only approximately 70%. The classification accuracies of the other models were 90% or higher. Overall, the GoogLeNet model had the best classification results. Its accuracy on the training set was 99.33%, and its accuracy on the validation set was 100%.
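Table 2. Classification Accuracy of Different Types of Rice Variety Detection Models

| type | model | training | validation | test | model performance |
|---|---|---|---|---|---|
| conventional japonica rice | RBF-SVM | 99.33% | 90.00% | 98.00% | good |
| conventional japonica rice | LR | 74.33% | 66.00% | 70.00% | poor |
| conventional japonica rice | RF | 100.00% | 94.00% | 96.00% | good |
| conventional japonica rice | LeNet | 98.67% | 100.00% | 96.00% | good |
| conventional japonica rice | GoogLeNet | 99.33% | 100.00% | 96.00% | good |
| conventional japonica rice | ResNet | 99.33% | 94.00% | 98.00% | good |
| conventional indica rice | RBF-SVM | 100.00% | 94.00% | 90.00% | good |
| conventional indica rice | LR | 75.33% | 88.00% | 76.00% | poor |
| conventional indica rice | RF | 100.00% | 90.00% | 74.00% | poor |
| conventional indica rice | LeNet | 99.00% | 100.00% | 96.00% | good |
| conventional indica rice | GoogLeNet | 98.33% | 100.00% | 92.00% | good |
| conventional indica rice | ResNet | 97.67% | 98.00% | 92.00% | good |
| hybrid japonica rice | RBF-SVM | 95.33% | 96.00% | 96.00% | good |
| hybrid japonica rice | LR | 70.00% | 80.00% | 72.00% | poor |
| hybrid japonica rice | RF | 100.00% | 84.00% | 88.00% | fair |
| hybrid japonica rice | LeNet | 99.33% | 92.00% | 100.00% | good |
| hybrid japonica rice | GoogLeNet | 98.67% | 100.00% | 100.00% | good |
| hybrid japonica rice | ResNet | 100.00% | 92.00% | 96.00% | good |
| hybrid indica rice | RBF-SVM | 88.22% | 85.33% | 90.67% | fair |
| hybrid indica rice | LR | 74.67% | 78.67% | 74.67% | poor |
| hybrid indica rice | RF | 100.00% | 77.33% | 77.33% | poor |
| hybrid indica rice | LeNet | 94.00% | 89.33% | 88.00% | fair |
| hybrid indica rice | GoogLeNet | 90.22% | 85.33% | 82.67% | fair |
| hybrid indica rice | ResNet | 92.89% | 84.00% | 90.67% | fair |
| indica–japonica hybrid rice | RBF-SVM | 100.00% | 100.00% | 100.00% | good |
| indica–japonica hybrid rice | LR | 97.33% | 100.00% | 97.30% | good |
| indica–japonica hybrid rice | RF | 100.00% | 100.00% | 97.30% | good |
| indica–japonica hybrid rice | LeNet | 100.00% | 100.00% | 100.00% | good |
| indica–japonica hybrid rice | GoogLeNet | 100.00% | 100.00% | 100.00% | good |
| indica–japonica hybrid rice | ResNet | 100.00% | 100.00% | 100.00% | good |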
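The good/fair/poor rule used in the "model performance" column of Table 2 can be written as a small function. The sketch below is only an illustration of that rule; the function name `grade_model` is hypothetical, and the accuracies are expressed in percent.

```python
def grade_model(train_acc, val_acc, test_acc):
    """Grade a model from its three accuracies (in percent).

    good: at least 90% on all three sets
    fair: at least 80% on all three sets
    poor: everything else
    """
    lowest = min(train_acc, val_acc, test_acc)
    if lowest >= 90.0:
        return "good"
    if lowest >= 80.0:
        return "fair"
    return "poor"


# Rows taken from Table 2 (hybrid japonica rice).
print(grade_model(95.33, 96.00, 96.00))   # RBF-SVM -> good
print(grade_model(100.00, 84.00, 88.00))  # RF      -> fair
print(grade_model(70.00, 80.00, 72.00))   # LR      -> poor
```

Because the rule looks at the weakest of the three sets, a model that fits the training set perfectly but generalizes badly, such as the RF model for conventional indica rice, is still graded poor.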
For the conventional indica rice models, the category values of Zhongjiazao 17 and Zhongzao 39 were 0 and 1, respectively. The results of the LR model were still unsatisfactory. The accuracies of the RF model on the training set and validation set were high, but its accuracy on the test set was only 74%, which indicates a certain degree of overfitting. The overall results of the deep learning models were better than those of the conventional machine learning models. The test set accuracy of the LeNet model was the highest among all models.

For the hybrid japonica rice models, the category values of Bayou 8 and Yongyou 10 were 0 and 1, respectively. The LR model performed the worst. All deep learning models produced good results. Among them, the GoogLeNet model had the best classification results; its accuracies on the validation set and test set were both 100%.

For the hybrid indica rice models, the category values of Chunyou 84 and Yongyou 12 were 0 and 1, respectively. The LR and RF models had poor classification results, and the RF model exhibited overfitting. The classification results of the three deep learning models were relatively close.

For the indica–japonica hybrid rice models, the category values of Yliangyou 2 and Zhongzheyou 8 were 0 and 1, respectively. All models had good classification results. The classification accuracies of the deep learning models and the SVM model all reached 100%.

We further analyzed the model results using the area under the curve (AUC) metric. The AUC values for each model are shown in Table 3. The closer the AUC value is to 1, the more effective the model is. As seen from the table, the AUC results were generally consistent with the accuracy results.
Table 3. AUC of Different Types of Rice Variety Detection Models (1, conventional japonica rice; 2, conventional indica rice; 3, hybrid japonica rice; 4, hybrid indica rice; 5, indica–japonica hybrid rice)

| type | SVM | LR | RF | LeNet | GoogLeNet | ResNet |
|---|---|---|---|---|---|---|
| 1 | 0.9839 | 0.7275 | 0.9414 | 0.9576 | 0.9474 | 0.9839 |
| 2 | 0.9074 | 0.7649 | 0.9074 | 0.9630 | 0.9195 | 0.9259 |
| 3 | 0.9375 | 0.7610 | 0.8787 | 1 | 1 | 0.9706 |
| 4 | 0.9109 | 0.7460 | 0.7687 | 0.8787 | 0.8189 | 0.9109 |
| 5 | 1 | 0.9828 | 0.9375 | 1 | 1 | 1 |
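For a two-class problem such as each rice type here, the AUC can be computed directly from its rank interpretation: the probability that a randomly chosen sample of class 1 receives a higher score than a randomly chosen sample of class 0, with ties counted as one half. The sketch below illustrates that definition on made-up scores; the function name `binary_auc` is hypothetical, and the study does not state how its AUC values were computed.

```python
def binary_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # correctly ranked pair
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))


# Made-up labels (0/1 category values) and predicted scores for class 1.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(binary_auc(y_true, scores))  # 0.75
```

An AUC of 1 means every class-1 sample is scored above every class-0 sample, which is why values close to 1 in Table 3 indicate a more effective model.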
The overall results showed that, for each type of rice seed, the classification results of the deep learning methods were better than those of the conventional machine learning methods. Among the conventional machine learning methods, the SVM had the best classification results, and LR had poor classification results. The LR model has a relatively simple form and is essentially a linear classifier, which makes it difficult to fit the real distribution of the input data. Hyperspectral data are relatively complex, and the LR model could not effectively handle the relationships between different features. RF models are prone to overfitting when the data contain noise, and hyperspectral data inevitably pick up noise during acquisition. Furthermore, the sample size for each variety in this study was not sufficiently large, which could have contributed to the overfitting of the RF model.

To compare the performances of the different kinds of models, we conducted a one-way analysis of variance (ANOVA) and multiple comparisons among the accuracies of the different models using the least significant difference (LSD) test. Significant differences (p < 0.05) were found between the LR model and the deep learning models and between the RF model and the deep learning models; in both cases, the deep learning models performed better. No significant differences (p > 0.05) were observed between the SVM models and the deep learning models. We further compared the average accuracies of the SVM models and the deep learning models, and all three deep learning models had higher average accuracies than the SVM models. No significant differences (p > 0.05) were detected among the three deep learning models; their performances were relatively close, and all of them had good classification results.
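The first step of such a comparison, the one-way ANOVA F statistic, can be sketched as follows. This is an illustration under assumptions rather than a reproduction of the reported analysis: the text does not state which accuracies entered the ANOVA, so the sketch groups the test-set accuracies from Table 2 by model, and the function name `one_way_anova_f` is hypothetical. The p value and the LSD comparisons are not computed here; they would normally be obtained from a statistics package (for example, SciPy provides `scipy.stats.f_oneway`).

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Assumed input: test-set accuracies (%) from Table 2, one list per model,
# ordered by rice type 1-5.
test_acc = {
    "SVM": [98.00, 90.00, 96.00, 90.67, 100.00],
    "LR": [70.00, 76.00, 72.00, 74.67, 97.30],
    "RF": [96.00, 74.00, 88.00, 77.33, 97.30],
    "LeNet": [96.00, 96.00, 100.00, 88.00, 100.00],
    "GoogLeNet": [96.00, 92.00, 100.00, 82.67, 100.00],
    "ResNet": [98.00, 92.00, 96.00, 90.67, 100.00],
}
f_stat, df_b, df_w = one_way_anova_f(list(test_acc.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value relative to the F distribution with these degrees of freedom indicates that at least one model differs from the others; pairwise tests such as LSD are then needed to say which ones.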
CNN Visualization
In this study, the saliency map method was used to visualize the CNN models. Figure 2 shows the visualization results of all LeNet models. For conventional japonica rice, the bands at approximately 1300–1400 nm provided the highest contribution. For conventional indica rice, the bands at approximately 1200–1250 nm and 1350–1400 nm contributed the most, followed by the bands at approximately 1050 nm and 1150–1200 nm. For hybrid japonica rice, the bands at approximately 1350–1400 nm contributed the most, followed by the bands at approximately 1150–1250 nm. For hybrid indica rice, the bands at approximately 1350–1400 nm contributed the most, followed by the bands at approximately 1200–1250 nm and 1050 nm. For indica–japonica hybrid rice, the bands at approximately 1300–1400 nm and 1100–1150 nm both provided high contributions.
Figure 2. Saliency maps of the LeNet model: (a) conventional japonica rice: Huaidao 5 and Nanjing 9108, (b) conventional indica rice: Zhongjiazao 17 and Zhongzao 39, (c) hybrid japonica rice: Bayou 8 and Yongyou 10, (d) hybrid indica rice: Y liangyou 2 and Zhongzheyou 8, and (e) indica–japonica hybrid rice: Chunyou 84 and Yongyou 12.
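The idea behind a saliency map is that the magnitude of the gradient of the class score with respect to each input band indicates how strongly that band influences the prediction. The sketch below illustrates this on a toy scoring function, not on the CNNs of this study: the weights, wavelength grid, and spectrum are made up, the names `saliency_map` and `class_score` are hypothetical, and the gradient is approximated by central finite differences, whereas a deep learning framework would obtain it by backpropagation.

```python
import math


def saliency_map(score_fn, spectrum, eps=1e-4):
    """Absolute central-difference gradient of score_fn at each band."""
    saliency = []
    for i in range(len(spectrum)):
        up = list(spectrum)
        down = list(spectrum)
        up[i] += eps
        down[i] -= eps
        grad = (score_fn(up) - score_fn(down)) / (2 * eps)
        saliency.append(abs(grad))
    return saliency


# Hypothetical stand-in for a trained classifier: a logistic score over bands.
wavelengths = [1000, 1100, 1200, 1300, 1400, 1500]  # nm, assumed grid
weights = [0.1, -0.2, 0.5, 1.5, -2.0, 0.3]          # made-up values


def class_score(spectrum):
    z = sum(w * x for w, x in zip(weights, spectrum))
    return 1.0 / (1.0 + math.exp(-z))


spectrum = [0.42, 0.45, 0.40, 0.38, 0.30, 0.33]     # made-up reflectance
sal = saliency_map(class_score, spectrum)
top_band = wavelengths[sal.index(max(sal))]
print(top_band)  # 1400
```

In this toy case the band with the largest absolute weight dominates the saliency; for a CNN the same gradient is computed per sample and then inspected across the wavelength axis, as in Figures 2 to 4.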
Figure 3 shows the visualization results of all GoogLeNet models. For conventional japonica rice, the bands with the largest contribution rate were mainly concentrated at approximately 1350–1400 nm, followed by the bands at approximately 1200–1250 nm. For conventional indica rice, the bands at approximately 1350–1450 nm provided the highest contributions, followed by the bands at approximately 1500–1550 nm. For hybrid japonica rice, the bands at approximately 1350–1400 nm contributed the most, followed by the bands at approximately 1200–1300 nm. For hybrid indica rice, the bands at approximately 1300–1350 nm and 1375–1400 nm contributed the most, followed by the bands at approximately 1100–1150 nm and 1250 nm. For indica–japonica hybrid rice, the bands at approximately 1225–1275 nm and 1150–1175 nm contributed the most, followed by the bands at approximately 1025–1075 nm.
Figure 3. Saliency maps of the GoogLeNet model: (a) conventional japonica rice: Huaidao 5 and Nanjing 9108, (b) conventional indica rice: Zhongjiazao 17 and Zhongzao 39, (c) hybrid japonica rice: Bayou 8 and Yongyou 10, (d) hybrid indica rice: Y liangyou 2 and Zhongzheyou 8, and (e) indica–japonica hybrid rice: Chunyou 84 and Yongyou 12.
Figure 4 shows the visualization results of all ResNet models. For conventional japonica rice, the bands at approximately 1350–1400 nm contributed the most, followed by the bands at approximately 1200–1250 nm. For conventional indica rice, the contribution of the bands near 1350–1550 nm was relatively high. For hybrid japonica rice, the bands at approximately 1350–1400 nm contributed the most, followed by the bands at approximately 1050–1250 nm. For hybrid indica rice, the bands at approximately 1350–1400 nm contributed the most, followed by the bands at approximately 1050–1250 nm. For indica–japonica hybrid rice, the bands at approximately 1150–1175 nm and 1375 nm contributed the most, followed by the bands at approximately 1125–1275 nm and 1325–1350 nm.
Figure 4. Saliency maps of the ResNet model: (a) conventional japonica rice: Huaidao 5 and Nanjing 9108, (b) conventional indica rice: Zhongjiazao 17 and Zhongzao 39, (c) hybrid japonica rice: Bayou 8 and Yongyou 10, (d) hybrid indica rice: Y liangyou 2 and Zhongzheyou 8, and (e) indica–japonica hybrid rice: Chunyou 84 and Yongyou 12.
Combining the results of the three models, for conventional japonica rice, the bands with the largest contribution rate were mainly concentrated at approximately 1300–1400 nm, followed by the bands at approximately 1200–1250 nm. For conventional indica rice, the bands at approximately 1350–1550 nm had the largest contribution rate, followed by the bands at approximately 1050–1200 nm. For hybrid japonica rice, the bands with the largest contribution rate were mainly concentrated at approximately 1350–1400 nm, followed by the bands at approximately 1150–1300 nm. For hybrid indica rice, the bands at approximately 1300–1400 nm provided the highest contribution, followed by the bands at approximately 1050–1250 nm. For indica–japonica hybrid rice, the bands with the largest contribution rate were mainly concentrated at approximately 1100–1275 nm and 1300–1400 nm. Overall, for all types of rice seeds, the bands with the largest contribution rate were mainly concentrated at approximately 1300–1400 nm, followed by the bands at approximately 1050–1250 nm. Table 4 shows the NIR absorption bands reported in the relevant studies. The 1050–1200 nm and 1300–1500 nm bands are the main characteristic spectral regions of the 20 amino acids that constitute proteins.[20] The 1050–1200 nm region mainly corresponds to the second overtone of C–H, and the 1300–1500 nm region mainly corresponds to the combination band of C–H, which can reflect the amino acid composition differences among different samples.[21]
Table 4. NIR Absorption Bands in the Relevant Studies

| wavelength | bond vibration | researchers |
|---|---|---|
| 1202 nm | C–H second overtone | Miao et al.[22] |
| 1207 nm | C–H second overtone | Wimonsiri et al.[23] |
| 1140–1350 nm | C–H second overtone | Amanah et al.[24] |
| 1090–1180 nm | C–H second overtone | Westad et al.[25] |
| 1100–1200 nm | C–H second overtone | |
| 1150–1260 nm | C–H second overtone | |
| 1350–1430 nm | C–H combination | |
| 1360–1420 nm | C–H combination | |
| 1228 nm | C–H second overtone | Xie et al.[26] |
| 1191 and 1209 nm | C–H second overtone | Kaewsorn et al.[27] |
| 1388 nm | C–H combination | |
| 1204 and 1301 nm | C–H second overtone | He et al.[1] |
| 1204 nm | C–H stretching and deformation | Shen et al.[28] |
Classification Results of Rice Seed Varieties without Considering the Seed Type
Classification Results
When the types of seeds were considered, the rice seed classification methods achieved good results. Therefore, this study further examined the classification of different varieties of rice seeds without considering the seed types. Among the classification models that considered seed types, the deep learning models were better overall than the conventional machine learning models, so LeNet, GoogLeNet, and ResNet were used to classify the ten varieties of rice seeds. We took all 10 varieties of rice seeds as a whole and divided them into a training set, validation set, and test set in an approximate ratio of 6:1:1. Table S1 shows the data set division for each variety of samples in this case. The classification results are shown in Table 5. The category assignments were Huaidao 5 (0), Nanjing 9108 (1), Zhongzao 39 (2), Zhongjiazao 17 (3), Yongyou 10 (4), Bayou 8 (5), Zhongzheyou 8 (6), Yliangyou 2 (7), Yongyou 12 (8), and Chunyou 84 (9). As the table shows, the LeNet model had the worst classification results, and the ResNet model had the best, with a test set accuracy of 86.08%. Compared with the case in which the seed type was considered, the classification results in this scenario were worse, which suggests that seeds of different types interfere with one another during variety classification.
Table 5. Classification Accuracy of Rice Seed Varieties of Different Models

| model | training | validation | test |
|---|---|---|---|
| LeNet | 76.63% | 73.11% | 73.84% |
| GoogLeNet | 80.14% | 80.67% | 80.17% |
| ResNet | 81.12% | 84.03% | 86.08% |
The confusion matrices of the classification results of the three models are shown in Figure 5. The horizontal axis of each confusion matrix denotes the true category of the sample, and the vertical axis represents the category predicted by the model. The confusion matrix shows the number of correct and incorrect predictions for each category. Figure 5(a) is the confusion matrix for the LeNet model. The LeNet model had the worst classification results for Bayou 8, which was most likely to be misjudged as Huaidao 5, Zhongjiazao 17, and Nanjing 9108.
Figure 5. (a) Confusion matrix of the LeNet model classification results. (b) Confusion matrix of the GoogLeNet model classification results. (c) Confusion matrix of the ResNet model classification results.
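A confusion matrix with the layout described in the text (true category along the horizontal axis, predicted category along the vertical axis) can be built and queried as in the following sketch. The labels are made up for illustration, and the function names `confusion_matrix` and `most_confused_with` are hypothetical.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Count predictions; rows are predicted classes, columns are true classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[p][t] += 1
    return matrix


def most_confused_with(matrix, true_class):
    """Return the wrong class most often predicted for true_class, or None."""
    column = [(matrix[p][true_class], p)
              for p in range(len(matrix)) if p != true_class]
    count, pred = max(column)
    return pred if count > 0 else None


# Made-up labels for three classes.
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 0, 1, 0, 1, 0, 0, 2]
matrix = confusion_matrix(y_true, y_pred, 3)
for row in matrix:
    print(row)
print(most_confused_with(matrix, 2))  # 0
```

Reading down a column gives the fate of all samples of one true variety, which is how statements such as "Yongyou 10 was most likely to be misjudged as Huaidao 5" are obtained from Figure 5.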
Figure 5(b) is the confusion matrix for the GoogLeNet model. The GoogLeNet model had the worst classification results for Yongyou 10, which was most likely to be misjudged as Huaidao 5.

Figure 5(c) is the confusion matrix for the ResNet model. The classification results of the ResNet model for Yongyou 10 and Chunyou 84 were relatively poor. Yongyou 10 was most likely to be misjudged as Huaidao 5 and Zhongzao 39, and Chunyou 84 was most likely to be misjudged as Huaidao 5.

In this study, the saliency map method was also used to visualize the three CNN models for the classification of the ten rice varieties. The visualization results are shown in Figure 6. For the LeNet model, the bands with the largest contribution rate were mainly concentrated at approximately 1350–1500 nm. For the GoogLeNet model, the bands with the largest contribution rate were mainly concentrated at approximately 1050–1250 nm and 1300–1400 nm. For the ResNet model, the bands with the largest contribution rate were mainly concentrated at approximately 1000–1100 nm and 1300–1400 nm. Overall, the bands with the largest contribution rate were mainly concentrated at approximately 1050–1250 nm and 1300–1400 nm. These results were close to the visualization results for the case in which the seed type was considered.
Figure 6. (a) LeNet model saliency maps. (b) GoogLeNet model saliency maps. (c) ResNet model saliency maps.
Discussion
Different varieties of rice differ in quality, and high-quality rice has higher economic and nutritional value. Therefore, it is important to identify different varieties of rice seeds in a quick and nondestructive manner. In addition to varieties, rice can be grouped into different types, such as indica and japonica rice or late and early rice, and different varieties may belong to the same type. Rice seeds of different varieties have small grains and similar colors and shapes, which makes them difficult to distinguish by visual observation, so it is important to choose a suitable classification strategy when classifying rice seeds.

In this study, we used NIR hyperspectral imaging
combined with deep learning to identify the varieties of different
types of rice seeds. We studied two cases considering the rice seed
type and not considering the rice seed type. Some studies have also
classified rice seed varieties, but they did not consider the type
of rice seeds. Wang et al. used hyperspectral imaging combined with
the back-propagation neural network (BPNN) to identify rice varieties
and quality, where the accuracy of the BPNN model based on data fusion
reached 94.45%. The authors studied only three rice varieties.[2] Samson et al. used RGB and hyperspectral imaging
to identify rice seed varieties. They studied 90 rice varieties, but
they did not consider the effect of the rice type.[11] Weng et al. used hyperspectral imaging in combination with
the principal component analysis network (PCANet) with multifeatured
fusion to identify rice seed varieties achieving a maximum accuracy
of 98.57%. They studied 10 rice varieties, but they also did not consider
the effect of the rice type.[16] In a study
by Qiu et al.,[3] who also identified rice
seed varieties, two different rice seed types were contained in their
study sample, but the authors did not consider the rice seed types
when modeling. From the results of this study, the types of rice seeds
influence the classification of rice varieties, so it is meaningful
to consider the types of rice seeds when identifying rice seed varieties.
In this study, the highest accuracy of 86.08% was obtained for the
identification of 10 different varieties of rice seeds without considering
the type of rice seeds, and our results were not poor compared to
those of other similar studies. In the case when rice seed types were
considered, the accuracies of most of our models exceeded 90%, and
the overall results of this study were good. In this study, we used
deep learning methods. Compared with other similar studies that used
deep learning methods, we used saliency maps to visualize the deep
learning models to improve the interpretability of the models.
Deep learning is widely used in NIR spectral analysis, and CNNs are
the most commonly used deep learning approach. In this study, we further
studied the application of CNNs
in NIR spectral analysis, and we used three CNN models with different
network structures to identify rice seed varieties. The performance
of the three deep learning methods was very close when considering
rice seed types. The ResNet model performed the best when the rice
seed types were not considered. Moreover, we compared conventional machine
learning methods with the deep learning methods. The results
of the SVM and deep learning methods were very close.
In this study, the overall performance of the deep
learning models was better than that of the SVM models when
rice seed types were considered, but the advantage was not large. We reviewed
the related literature and found that some studies also used CNN models
and SVM models for spectral data processing. Zhang et al.[17] used NIR hyperspectral imaging to identify the
coated maize seed varieties and used the CNN and SVM models in their
study. From the results, the CNN model performed slightly better than
the SVM model overall, but the difference was not significant. Nie
et al.[15] used NIR hyperspectral imaging
combined with a deep learning approach to classify hybrid seeds of
loofah and okra, while the study also used the SVM algorithm. The
overall performances of the SVM model and the deep CNN (DCNN) model were very
close, with the DCNN model performing slightly better. Xiao et al.[29] used visible/shortwave NIR and NIR hyperspectral
imaging to identify the origin of Astragalus. The CNN and SVM models
were also used in this study, and from the results, the two models
performed very similarly. Gao et al. used NIR hyperspectral imaging
to identify the geographic origin of narrow-leaf olive fruit, where
SVM models and CNN models were also used. From the results, it can
be seen that the performance of the SVM model was very close to the
overall performance of the CNN model, with the CNN model performing
slightly better than the SVM.[30] Zhu et
al. used NIR hyperspectral imaging in combination with a deep learning
approach to identify cotton seed varieties and used an SVM model in
comparison with a deep learning model, and from the results, it can
be seen that some of the deep learning models performed slightly better
than the SVM, but the overall results were close.[31] Similar results have been obtained in some other related
studies.[32−35] From the results of these studies, both SVM and deep learning models
are effective methods for spectral data processing, and in most cases,
the CNN performs slightly better than the SVM, but their overall performances
are very close. These findings are similar to the results of this
study.
SVM and deep learning have both proven effective for processing
spectral data. However, they work on different principles.
SVM is a widely used method that can deal with both linear and nonlinear
problems. In spectral data analysis, SVM with kernel functions is
the most commonly used. SVM maps the input data into higher dimensions
using a kernel function and then classifies samples. Deep learning
methods use multiple layers of nonlinear processing units for feature
extraction and transformation, and deep learning has been proven to
have a powerful feature learning capability and can effectively extract
information from the data.[36] SVMs are suitable
for small datasets.[37] The number of samples
in this study was small, as were the sample sizes in the abovementioned
studies, so the advantage of deep learning on large datasets was not
fully revealed. Nevertheless, in this study and in
the abovementioned studies, deep learning models obtained
equivalent or better results than the SVM models, exhibiting great
potential for rice seed variety identification. Moreover, this study
used one-dimensional spectral rice seed data, and the form of the
data was not complex, so the data did not give full play to the advantages
of deep learning in dealing with complex features. Therefore, in this
study, the results of the deep learning models and the SVM models
were very close.
In the future, we will study more varieties
and a larger number
of samples. We will also study 2D images and 3D hyperspectral data
to exploit the information in hyperspectral images more completely and
to reveal the full potential of deep learning methods.
Conclusions
In this study, we successfully used NIR
hyperspectral imaging technology
in combination with conventional machine learning methods and deep
learning methods to identify rice seeds of different varieties
within the same type. Two different varieties
of rice seeds were identified within each of the conventional japonica rice, conventional indica
rice, hybrid japonica rice, hybrid indica rice, and indica–japonica
hybrid rice types. The experimental results showed that among the
conventional machine learning methods, the SVM method could effectively
detect different varieties of rice seeds, and the effects of LR and
the RF were poor. Among the deep learning methods, the LeNet, GoogLeNet,
and ResNet models could effectively identify rice seed varieties,
the deep learning methods performed significantly better than the
conventional machine learning algorithms, and the classification accuracies
of most models exceeded 95%. This study further classified different
varieties of rice seeds without considering the seed types. LeNet,
GoogLeNet, and ResNet were used to classify ten different varieties
of rice seeds. Among them, ResNet had the best classification results,
and its classification accuracy was 86.08%. The results showed that
ResNet could effectively classify different varieties of rice seeds.
Visualizing each CNN model with the saliency map method showed that the
bands contributing most to classification were located at approximately 1300–1400 nm, followed by
those at approximately 1050–1250 nm. In summary, the employed
research method can be used to identify rice seed varieties. Combining
NIR hyperspectral imaging technology with deep learning methods
performed better than using conventional machine learning methods. This approach
provides a way to identify the varieties of rice seeds in a quick
and nondestructive manner.
Materials and Methods
Sample Preparation
In this study,
rice seed samples were provided by the College of Agriculture and
Food Science, Zhejiang Agriculture and Forestry University, Lin’an,
Zhejiang Province, China. All seeds were numbered when they were collected,
and the numbers corresponded to the relevant seed information, including
their variety information.
NIR Hyperspectral Image Acquisition and Correction
The NIR hyperspectral imaging
system in this experiment was mainly
composed of an FX17 NIR hyperspectral camera (Spectral Imaging Ltd.),
a light source, a mobile platform, a computer, and control software.
Its main structure is shown in Figure 7. The spectral range of the hyperspectral imaging system
was 900–1700 nm, and the spectral resolution was 8 nm.
Figure 7
Hyperspectral imaging system structure diagram.
To reduce the influence of dark current noise and external factors,
each image needed to be corrected. Before collecting each sample image,
a white reference image (W) was first collected. Then, the light source
was turned off and the camera lens was covered by a black opaque barrier
to collect a black reference image (D). The corrected image (R) was
obtained by the following formula (1):
$$R = \frac{I - D}{W - D} \tag{1}$$
In formula (1), I is the original image, R
is the corrected image, W is the white reference image, and D is the
black reference image.
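The correction can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: it reads formula (1) as the standard flat-field form R = (I − D)/(W − D), and the function name `correct_image`, the `eps` guard, and the toy pixel values are assumptions introduced here.

```python
import numpy as np


def correct_image(raw, white, dark, eps=1e-8):
    """Apply R = (I - D) / (W - D) element-wise to a hyperspectral cube."""
    raw = np.asarray(raw, dtype=np.float64)
    white = np.asarray(white, dtype=np.float64)
    dark = np.asarray(dark, dtype=np.float64)
    denom = white - dark
    # Assumption (not in the source): avoid division by zero where the
    # white and dark references coincide.
    denom = np.where(np.abs(denom) < eps, eps, denom)
    return (raw - dark) / denom


# Toy cube: 2 x 2 pixels, 3 bands, with made-up intensity counts.
dark = np.full((2, 2, 3), 100.0)
white = np.full((2, 2, 3), 1100.0)
raw = np.full((2, 2, 3), 600.0)
reflectance = correct_image(raw, white, dark)
```

In this toy case every pixel sits halfway between the dark and white references, so the corrected reflectance is 0.5 everywhere.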
Spectral Data Extraction and Preprocessing
Before obtaining the spectral data for
each sample, a series of
preprocessing steps were required for the original image. In this
study, we used threshold segmentation based on the intensity to remove
the background of the original image. Then, we used the connected
domain method to label each seed and treated each rice seed as a region
of interest (ROI). Then, the average reflectance among all pixels
in the ROI was calculated to obtain the spectral data of each rice
seed.
A certain amount of noise was contained in the obtained
spectral data. Therefore, the data needed to be preprocessed to reduce
the impact of noise. Before preprocessing, the bands at the front
and rear ends in the spectral data with severe noise were removed,
and wavelengths in the range of 1005–1634 nm were retained
and used for further analysis. This study used the moving average
(MA) and standard normal variate (SNV) methods to preprocess the spectral
data. The MA could filter out the high-frequency noise in the data
and retain useful low-frequency trends. SNV could reduce the spectral
error caused by particle scattering between the samples. When performing
MA-based processing, the segment size was set to 5.
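The two preprocessing steps can be illustrated with a short NumPy sketch. The study ran them in Unscrambler X, so this is only an approximation under stated assumptions: a centered window of 5 points, edge values repeated at both ends of the spectrum, MA applied before SNV, and the sample standard deviation (ddof=1) in SNV. The function names and the synthetic spectra are hypothetical.

```python
import numpy as np


def moving_average(spectra, segment=5):
    """Smooth each spectrum (row) with a centered moving average."""
    spectra = np.asarray(spectra, dtype=np.float64)
    half = segment // 2
    # Assumption: repeat the edge values so the output keeps the input length.
    padded = np.pad(spectra, ((0, 0), (half, half)), mode="edge")
    kernel = np.ones(segment) / segment
    return np.array([np.convolve(row, kernel, mode="valid") for row in padded])


def snv(spectra):
    """Standard normal variate: center and scale each spectrum individually."""
    spectra = np.asarray(spectra, dtype=np.float64)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


# Synthetic data: 4 "seeds" x 50 "bands" (real spectra would come from the ROIs).
rng = np.random.default_rng(0)
spectra = rng.random((4, 50)) + np.linspace(0.2, 0.6, 50)
processed = snv(moving_average(spectra, segment=5))
```

After SNV, each row has zero mean and unit standard deviation, which removes per-seed offset and scale differences while keeping the spectral shape.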
Data Analysis Methods
Conventional Machine Learning Methods
SVM is a machine learning algorithm that
assigns labels to objects
by learning from examples. SVM is based on statistical learning theory
and can effectively handle classification and regression problems.[38] In linear problems, the principle is to find
an optimal hyperplane that satisfies the classification requirements
while maximizing the margin. In nonlinear problems, low-dimensional
samples are mapped into a high-dimensional feature space through
a kernel function so that a problem that is hard to separate in the original space
becomes a linear separation problem in the high-dimensional space. This
study chose the kernel function of the model from “Linear”
and “RBF” (a radial basis function) and then used the
grid search method to determine the penalty coefficient (C) and the
kernel width parameter (γ). The optimization range of the parameter
C was [1, 10, 50, 100]. The optimization range of the parameter γ
was $[10^{-8}, 10^{8}]$.
LR is a linear
model commonly used for binary classification problems.[39] The model is simple, and its training speed
is fast, so it has a wide range of applications in various fields.
LR predicts the probability of an event by fitting a logistic function
and generally uses the sigmoid function as a predictive function.
In this study, L2 regularization was selected for LR and the liblinear
solver was selected for model optimization.
RF is an ensemble learning
algorithm proposed by Leo Breiman, which
was inspired by the early work of Amit and Geman.[40] It can be used for both classification and regression.
An RF has the advantages of a fast training speed and few tuning parameters,
and it can be directly applied to high-dimensional data. Its essence
is an algorithm that integrates multiple trees, and its basic unit
is a decision tree.
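The SVM grid search described above might look like the following scikit-learn sketch. The kernel choices, the C values, and the γ range come from the text; the γ grid points (powers of ten), the 5-fold cross-validation, and the synthetic two-class data are assumptions made only so the example runs.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Kernel choices and C values from the text; gamma applies only to the RBF kernel.
param_grid = [
    {"kernel": ["linear"], "C": [1, 10, 50, 100]},
    {
        "kernel": ["rbf"],
        "C": [1, 10, 50, 100],
        # Assumption: sample the stated range [1e-8, 1e8] at powers of ten.
        "gamma": np.logspace(-8, 8, 17),
    },
]

# Synthetic stand-in for two varieties: 40 samples each, 20 "bands".
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (40, 20)), rng.normal(1.5, 1.0, (40, 20))])
y = np.array([0] * 40 + [1] * 40)

# Assumption: 5-fold cross-validation scored by accuracy.
search = GridSearchCV(SVC(), param_grid, cv=5, scoring="accuracy")
search.fit(X, y)
best_params = search.best_params_
best_score = search.best_score_
```

Passing a list of two dictionaries keeps γ out of the linear-kernel search, where it has no effect.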
CNN
CNNs are
feedforward neural
networks with deep structures based on convolutional layers. The structure
of a CNN is usually composed of an input layer, convolutional
layers, pooling layers, fully connected layers, and an output
layer. With the continuous development of CNNs, an increasing number
of model structures have been proposed. LeNet is a classic CNN proposed
by LeCun et al. in 1998.[41] It is composed
of a convolutional layer, a pooling layer, and other modules. GoogLeNet
is a deep neural network model based on the Inception module proposed
by Szegedy et al. in 2014.[42] The Inception
module puts multiple convolutions or pooling operations together to
form a network module. The advantage of this approach is that it reduces
the number of parameters and effectively avoids problems such as overfitting,
gradient disappearance, or gradient explosion caused by the increase
in the number of network layers. ResNet is a CNN structure proposed
by He et al.[43] ResNet solves the problem
of degradation caused by the increase in network depth through residual
learning. Compared with ordinary networks, ResNet adds a shortcut
mechanism, and its residual modules are connected through shortcut (skip) connections,
alleviating the problem of gradient disappearance in deep neural networks.
In this study, LeNet, GoogLeNet, and ResNet were used to establish
rice seed variety identification models. The structure of the LeNet
model is shown in Figure 8(a). It contained 3 convolutional layers, the size of each
convolution kernel was 1 × 4, and the numbers of channels were
8, 16, and 32. A max pooling layer was used to extract features, the
kernel size was 1 × 4, and the padding was set to 1. Each layer
performed batch normalization on the data before applying an activation
function to speed up learning. The rectified linear unit (ReLU)
function was selected as the activation function to reduce the effect
of the gradient disappearance and increase the calculation speed.
The GoogLeNet model structure is shown in Figure 8(b); it contained two convolutional layers
and two Inception modules. Each Inception module had four branches.
The first branch consisted of three convolutional layers, and the
number of channels in each of the three convolutional layers was 16.
The convolution kernel sizes of the three convolutional layers were
1 × 1, 1 × 3, and 1 × 3. The second branch consisted
of two convolutional layers, the number of channels in each of the
two convolutional layers was 16, and the sizes of the convolution
kernels were 1 × 1 and 1 × 5. The third branch contained
only one convolutional layer, the number of channels in the convolutional
layer was 16, and the size of the convolution kernel was 1 ×
1. The fourth branch contained an average pooling layer and a convolution
layer. The kernel size of the average pooling layer was 3, the number
of channels of the convolution layer was 16, and the size of the convolution
kernel was 1 × 1. The Inception module removed the need to manually
choose the size of the convolution kernel in each convolutional
layer and to decide whether a convolutional or pooling layer was
needed. The size of the convolution kernels of the two stand-alone convolution
layers was 1 × 5, and the numbers of channels were 10 and 20.
The ReLU function was selected as the activation function.
The Inception module structure is shown in Figure 8(b). The ResNet model structure is shown
in Figure 8(c). ResNet
used two convolutional layers and two residual modules. The size of
the convolution kernel of the two convolution layers was 1 ×
5, and the numbers of channels were 16 and 32. When training the CNN
model, the learning rates of all LeNet models were set to 0.001, and
the learning rates of all GoogLeNet and ResNet models were set to
0.0001. The Adam optimizer was used for training to speed up the convergence
process.
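The LeNet-style network and the Inception module described above can be sketched in PyTorch with 1D convolutions. Kernel sizes, channel counts, the batch-norm-then-ReLU ordering, and the Adam learning rate follow the text. Everything else is an assumption for illustration: 180 input bands, a pooling layer after every convolution with the padding of 1 applied to the pooling, a single linear classifier, length-preserving padding inside the Inception branches, and batch normalization inside the Inception module. The class and function names are hypothetical.

```python
import torch
from torch import nn


def conv_block(in_ch, out_ch, kernel, padding=0):
    """Convolution -> batch normalization -> ReLU."""
    return nn.Sequential(
        nn.Conv1d(in_ch, out_ch, kernel_size=kernel, padding=padding),
        nn.BatchNorm1d(out_ch),
        nn.ReLU(),
    )


class LeNet1d(nn.Module):
    """Three 1 x 4 convolutions (8, 16, 32 channels), each followed by max pooling."""

    def __init__(self, n_bands=180, n_classes=2):
        super().__init__()
        layers, in_ch = [], 1
        for out_ch in (8, 16, 32):
            layers += [conv_block(in_ch, out_ch, 4), nn.MaxPool1d(4, padding=1)]
            in_ch = out_ch
        self.features = nn.Sequential(*layers)
        # Infer the flattened feature size with a dummy pass in eval mode.
        self.features.eval()
        with torch.no_grad():
            n_flat = self.features(torch.zeros(1, 1, n_bands)).numel()
        self.features.train()
        self.classifier = nn.Linear(n_flat, n_classes)

    def forward(self, x):  # x: (batch, 1, n_bands)
        return self.classifier(torch.flatten(self.features(x), 1))


class Inception1d(nn.Module):
    """Four parallel branches with 16 channels each, concatenated to 64 channels."""

    def __init__(self, in_ch, ch=16):
        super().__init__()
        self.branch1 = nn.Sequential(
            conv_block(in_ch, ch, 1), conv_block(ch, ch, 3, 1), conv_block(ch, ch, 3, 1)
        )
        self.branch2 = nn.Sequential(conv_block(in_ch, ch, 1), conv_block(ch, ch, 5, 2))
        self.branch3 = conv_block(in_ch, ch, 1)
        self.branch4 = nn.Sequential(
            nn.AvgPool1d(3, stride=1, padding=1), conv_block(in_ch, ch, 1)
        )

    def forward(self, x):
        branches = [self.branch1(x), self.branch2(x), self.branch3(x), self.branch4(x)]
        return torch.cat(branches, dim=1)


model = LeNet1d()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # LeNet rate from the text
logits = model(torch.randn(8, 1, 180))
inception_out = Inception1d(in_ch=10)(torch.randn(8, 10, 176))
```

Because every Inception branch preserves the sequence length, the branch outputs can be concatenated along the channel dimension without cropping.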
Figure 8
(a) LeNet network structure diagram. (b) Structure diagram of GoogLeNet
and its Inception module. (c) Structure diagram of ResNet and its
residual module.
Saliency Map
The saliency map is
a CNN visualization method. When the CNN correctly predicts the class
of a sample, each element in the data has a corresponding contribution
value.[44] The sizes of the contribution
values can reflect the importance levels of these elements. The saliency
map can visualize the contribution value of each element to intuitively
see which elements play important roles in the process of CNN-based
sample identification. For hyperspectral data, a saliency map can
reflect the importance of each band. After the hyperspectral data $C_0$ of a sample in the test set was classified
by the CNN model, we could obtain the score value S. If the category of this sample was correctly predicted,
we could use formula (2) to calculate the weight w:
$$w = \left| \frac{\partial S}{\partial C} \right|_{C = C_0} \tag{2}$$
where w, which has one element per band, is the absolute value
of the derivative of the score value S with respect
to the data C, evaluated at $C_0$.
In this study, the
contribution value of each band was calculated for all of the correctly
predicted samples in the test set, and the average value of the contribution
of each band was calculated for each type of sample.
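A gradient-based saliency computation of this kind can be sketched with PyTorch autograd. This is a minimal illustration rather than the authors' code: it assumes S is the pre-softmax score of the true class, and the function names and the tiny linear stand-in model are hypothetical.

```python
import torch
from torch import nn


def band_saliency(model, spectra, labels):
    """Return |dS/dC| per band for the correctly predicted samples only."""
    model.eval()
    x = spectra.detach().clone().requires_grad_(True)
    scores = model(x)  # (batch, n_classes)
    correct = scores.argmax(dim=1) == labels
    # Score S of the true class per sample. Summing over the batch gives
    # per-sample gradients in one backward pass, because samples do not
    # interact in eval mode.
    scores.gather(1, labels.unsqueeze(1)).sum().backward()
    w = x.grad.abs()
    return w[correct], labels[correct]


def mean_saliency_per_class(w, labels, n_classes):
    """Average the band contributions over the samples of each class."""
    # Note: a class with no correctly predicted samples yields NaN here.
    return torch.stack([w[labels == c].mean(dim=0) for c in range(n_classes)])


# Toy stand-in for a trained CNN: a linear classifier on 6 "bands".
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(6, 2))
x = torch.randn(5, 1, 6)
with torch.no_grad():
    y = toy_model(x).argmax(dim=1)  # use predictions as labels so all are "correct"

w, kept = band_saliency(toy_model, x, y)
```

For the linear stand-in, the gradient of a class score with respect to the input is that class's weight row, which gives a simple check that the function computes the intended derivative.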
Model Evaluation and Software
This
study used the classification accuracy and the area under the receiver operating characteristic curve (AUC) to evaluate
the performance of the models. The definition of classification accuracy
is the ratio of the number of correct predictions to the total number
of predictions. The SVM, LR, and RF models were implemented in Python
3.8 with scikit-learn (0.23.2). All CNN models were built using PyTorch
(1.7.0), a deep learning framework. The two data preprocessing methods,
MA and SNV, were implemented in Unscrambler X
10.1 (CAMO AS, Oslo, Norway). All data analyses were performed on
a computer configured with an Intel Core i5-8300H CPU at 2.30 GHz
and 16 GB of RAM.
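The two evaluation metrics can be computed with scikit-learn as in the sketch below. The labels and scores are made-up values for a two-variety (binary) case, and the 0.5 decision threshold is an assumption; the source does not state how AUC was computed for the ten-variety models.

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

# Hypothetical ground truth and predicted probabilities for class 1.
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])
y_pred = (y_score >= 0.5).astype(int)  # assumption: 0.5 decision threshold

# Accuracy: number of correct predictions divided by total predictions.
accuracy = accuracy_score(y_true, y_pred)
# AUC: probability that a positive sample is scored above a negative one.
auc = roc_auc_score(y_true, y_score)
```

Here three of the four predictions are correct and three of the four positive-negative pairs are ranked correctly, so both metrics equal 0.75.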