Yi Chen, Jun Bin, Congming Zou, Mengjiao Ding.
Abstract
Maturity affects the yield, quality, and economic value of tobacco leaves, and discriminating leaf maturity level is an important step in manual harvesting. However, judging the maturity of fresh tobacco leaves by growers' visual evaluation is subjective, which may lead to quality loss and low prices. Therefore, this study proposes an objective and reliable technique for discriminating tobacco leaf maturity level based on near-infrared (NIR) spectroscopy combined with a deep learning approach, convolutional neural networks (CNNs). To assess the performance of the proposed maturity discriminant model, four conventional multiclass classification approaches, K-nearest neighbor (KNN), backpropagation neural network (BPNN), support vector machine (SVM), and extreme learning machine (ELM), were employed for a comparative analysis on three categories (upper, middle, and lower position) of tobacco leaves. Experimental results showed that the CNN discriminant models precisely classified the maturity level of tobacco leaves in the three data sets, with accuracies of 96.18%, 95.2%, and 97.31%, respectively. Moreover, the CNN models, with their strong feature extraction and learning ability, were superior to the KNN, BPNN, SVM, and ELM models. Thus, NIR spectroscopy combined with CNN is a promising alternative to sensory assessment for tobacco leaf maturity level recognition, and the resulting maturity-discrimination model can provide an accurate, reliable, and scientific auxiliary means for tobacco leaf harvesting.
Year: 2021 PMID: 34211798 PMCID: PMC8205606 DOI: 10.1155/2021/9912589
Source DB: PubMed Journal: J Anal Methods Chem ISSN: 2090-8873 Impact factor: 2.193
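The paper's code is not part of this record, but the comparative setup it describes (several multiclass classifiers trained on the same spectral matrix and scored on a held-out test set) can be sketched with scikit-learn stand-ins on synthetic data. Everything here is an assumption for illustration: the data are random, and `MLPClassifier` plays the role of the BPNN.

```python
# Illustrative sketch (not the paper's code): comparing multiclass
# classifiers on synthetic "spectra", echoing the KNN/BPNN/SVM vs. CNN
# comparison the paper runs on NIR data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# 454 "wavelength" features, 5 maturity classes; sample count echoes the
# upper-leaf data set (1128 samples, ~70/30 split).
X, y = make_classification(n_samples=1128, n_features=454, n_informative=40,
                           n_classes=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "KNN": KNeighborsClassifier(n_neighbors=6),
    "BPNN": MLPClassifier(hidden_layer_sizes=(20,), max_iter=500, random_state=0),
    "SVM": SVC(C=32, gamma=0.0313),
}
accuracies = {name: m.fit(X_tr, y_tr).score(X_te, y_te)
              for name, m in models.items()}
print(accuracies)
```

On the real NIR matrices, each model would instead be fit to the preprocessed spectra described below.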
Characteristics of fresh tobacco leaves at five maturity levels.
| Maturity levels | Characteristics description of fresh tobacco leaf |
|---|---|
| Unripe | Leaf color is dark green without any yellow, the main vein and branches are all green, and the pubescence has not fallen off. |
| Mature | Leaf color is light green with a little yellow, about 2/3 of the main vein has turned white, and the branches are green with a small amount of pubescence shed. |
| Ripe | Leaf color is yellow-green, the main vein is all white, about 1/3 of the branches have turned white, the pubescence has partly fallen off, and the leaf tip droops slightly. |
| Mellow | Leaf color is yellow, the main vein is all white and bright, about 2/3 of the branches have turned white, the pubescence is mostly shed, the leaf surface is covered with maculae, the leaf tip and leaf edge have turned white and are slightly withered, and the leaf tip is scorched and hooked downward. |
| Overmature | The main vein and branches are all white and bright, and the leaf color is yellow-white. Most of the pubescence has fallen off, and the leaf ear is yellow with a withered tip and scorched edge. |
Details of the tobacco leaf data sets.
| Data sets | Total samples | Training set | Testing set | Unripe | Mature | Ripe | Mellow | Overmature |
|---|---|---|---|---|---|---|---|---|
| Upper leaves | 1128 | 790 | 338 | 219 | 225 | 226 | 229 | 229 |
| Middle leaves | 1085 | 760 | 325 | 216 | 222 | 218 | 219 | 210 |
| Lower leaves | 1141 | 799 | 342 | 232 | 227 | 235 | 228 | 219 |
Figure 1. The NIR spectra of five maturity levels of upper tobacco leaves: (a) raw spectra and (b) preprocessed spectra.
Discriminant accuracy (%) of different preprocessing methods.
| Preprocessing methods | KNN | BPNN | SVM | ELM | CNN |
|---|---|---|---|---|---|
| Raw | 55.33 | 77.05 ± 3.61 | 87.33 | 72.4 ± 3.25 | 92.35 ± 2.61 |
| First derivative | 85.33 | 88.32 ± 2.69 | 93.33 | 82.46 ± 4.44 | 95.84 ± 1.25 |
| Second derivative | 84.67 | 85.1 ± 2.54 | 92.67 | 80.24 ± 5.91 | 94.55 ± 1.65 |
| SNV | 74 | 86.67 ± 2.91 | 94 | 85.03 ± 3.43 | 94.36 ± 1.24 |
| MSC | 74 | 86.5 ± 2.52 | 93.33 | 84.49 ± 1.66 | 93.38 ± 1.42 |
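The preprocessing methods in the table above are standard chemometric transforms; a minimal sketch of first-derivative (Savitzky-Golay), SNV, and MSC preprocessing on a toy spectra matrix follows. The filter settings (window length, polynomial order) are assumptions for illustration, not the paper's values.

```python
# Toy implementations of the spectral preprocessing methods compared
# above: Savitzky-Golay first derivative, standard normal variate (SNV),
# and multiplicative scatter correction (MSC).
import numpy as np
from scipy.signal import savgol_filter

def snv(X):
    """SNV: center and scale each spectrum (row) to zero mean, unit SD."""
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

def msc(X):
    """MSC: regress each spectrum on the mean spectrum, then correct
    the multiplicative (slope) and additive (intercept) scatter terms."""
    ref = X.mean(axis=0)
    out = np.empty_like(X)
    for i, row in enumerate(X):
        slope, intercept = np.polyfit(ref, row, 1)  # row ~ slope*ref + intercept
        out[i] = (row - intercept) / slope
    return out

rng = np.random.default_rng(0)
spectra = rng.random((10, 454)) + 1.0        # 10 toy spectra, 454 bands

# First derivative via Savitzky-Golay smoothing (window/order assumed)
d1 = savgol_filter(spectra, window_length=13, polyorder=2, deriv=1, axis=1)
print(snv(spectra).shape, msc(spectra).shape, d1.shape)
```

Each transform preserves the matrix shape, so the downstream classifiers see the same 454-band layout regardless of the chosen preprocessing.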
Figure 2. PCA score plot of the variance in NIR spectra of five maturity levels for upper tobacco leaves.
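A score plot like Figure 2 is produced by projecting the spectra onto their first two principal components; a hypothetical illustration on random data (not the paper's NIR measurements):

```python
# Project toy spectra onto PC1/PC2, as in a PCA score plot.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
spectra = rng.normal(size=(50, 454))   # 50 toy spectra, 454 bands

pca = PCA(n_components=2)
scores = pca.fit_transform(spectra)    # one (PC1, PC2) score pair per sample
print(scores.shape)                    # (50, 2)
```

In a score plot, each point is one sample's `(PC1, PC2)` pair, colored by maturity level to show how well the classes separate in the reduced space.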
Figure 3. Schematic diagram of the convolutional neural network model.
Parameter settings of the convolutional neural networks for the upper, middle, and lower leaf data sets.
| Layers | Model parameters | Output shape |
|---|---|---|
| Input layer | NIRS data of 454 × 1 dimension | 454 × 1 |
| Conv1D | 128 convolutional kernels of size 13 × 1, ReLU activation and BN mechanism, stride = 1 | 450 × 128 |
| MaxPooling1D | Max pooling, pooling size = 2 × 1, stride = 1 | 225 × 128 |
| Conv1D | 64 convolutional kernels of size 13 × 1, ReLU activation and BN mechanism, stride = 1 | 221 × 64 |
| MaxPooling1D | Max pooling, pooling size = 1 × 1, stride = 1 | 221 × 64 |
| Flatten | Flatten the feature maps of the previous layer into a vector | 14144 × 1 |
| Dense | 100 output neurons fully connected to all neurons in the previous layer | 100 × 1 |
| Dense | 5 output neurons, consistent with the number of maturity levels | 5 × 1 |
| Output layer | The softmax function | 5 × 1 |
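The tabulated output shapes follow standard "valid" (no-padding) convolution and non-overlapping pooling arithmetic. Note they are consistent with a kernel length of 5 (454 → 450 and 225 → 221); the stated 13 × 1 kernels would instead give 442 and 213, so one of the two listings presumably reflects the tuned value from Figure 4(a). A quick check with the shape-consistent kernel length:

```python
# Shape check for the CNN table, assuming valid 1-D convolution
# (stride 1) and non-overlapping max pooling; kernel length 5 is the
# value consistent with the listed 450/221 output lengths.
def conv1d_len(n, kernel):
    return n - kernel + 1           # valid convolution, stride 1

def pool1d_len(n, pool):
    return n // pool                # non-overlapping pooling windows

n = conv1d_len(454, 5)              # first Conv1D  -> 450
n = pool1d_len(n, 2)                # first pool    -> 225
n = conv1d_len(n, 5)                # second Conv1D -> 221
n = pool1d_len(n, 1)                # second pool   -> 221 (identity)
print(n * 64)                       # flattened length: 14144
```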
Figure 4. Parameter tuning of the CNN model: (a) convolutional kernel size, (b) batch size, and (c) epoch size.
Figure 5. Discrimination accuracy across epochs of the CNN model: (a) upper leaf, (b) middle leaf, and (c) lower leaf.
Figure 6. Loss function across epochs of the CNN model: (a) upper leaf, (b) middle leaf, and (c) lower leaf.
The prediction results (%) of the convolutional neural networks over 10 runs.
| Data sets | Sample sets | Discriminant accuracy |
|---|---|---|
| Upper leaves | Training set | 99.75, 99.87, 99.75, 99.75, 100, 100, 100, 100, 100, 100 |
| Upper leaves | Testing set | 95.86, 95.86, 96.15, 96.45, 95.86, 96.15, 96.45, 96.15, 96.45, 96.45 |
| Middle leaves | Training set | 100, 99.87, 100, 99.74, 99.61, 99.08, 98.95, 99.34, 99.61, 99.34 |
| Middle leaves | Testing set | 95.38, 94.77, 94.46, 94.77, 95.69, 95.69, 95.38, 95.69, 95.08, 95.08 |
| Lower leaves | Training set | 99.12, 99.75, 99.75, 99.62, 100, 99.75, 99.75, 99, 99.5, 99.75 |
| Lower leaves | Testing set | 96.49, 97.08, 97.37, 98.54, 97.66, 97.95, 96.2, 96.49, 97.37, 97.95 |
The prediction results (%) of the convolutional neural networks and the other four methods.
| Data sets | Sample sets | KNN | BPNN | SVM | ELM | CNN |
|---|---|---|---|---|---|---|
| Upper leaves | Training set | 89.87 | 92.11 ± 0.46 | 96.2 | 96.11 ± 1.82 | 99.91 ± 0.12 |
| Upper leaves | Testing set | 84.02 | 66.39 ± 7.31 | 91.72 | 87.57 ± 2.79 | 96.18 ± 0.26 |
| Middle leaves | Training set | 90.79 | 93.66 ± 0.79 | 93.03 | 94.24 ± 2.1 | 99.55 ± 0.37 |
| Middle leaves | Testing set | 84.92 | 80.18 ± 2.93 | 89.23 | 86.71 ± 1.16 | 95.2 ± 0.44 |
| Lower leaves | Training set | 91.99 | 95.87 ± 0.43 | 94.87 | 95.71 ± 2.28 | 99.6 ± 0.31 |
| Lower leaves | Testing set | 89.77 | 86.81 ± 4.06 | 93.57 | 92.51 ± 2.12 | 97.31 ± 0.75 |
The optimal parameters for KNN, BPNN, SVM, and ELM.
| Data sets | KNN | BPNN | SVM (c) | SVM (g) | ELM |
|---|---|---|---|---|---|
| Upper leaves | 6 | 19.6 ± 8.15 | 32 | 0.0313 | 161.1 ± 28.38 |
| Middle leaves | 5 | 23.3 ± 6.25 | 8 | 0.0625 | 145.9 ± 30.34 |
| Lower leaves | 8 | 16.3 ± 10.27 | 16 | 0.0313 | 138 ± 43.89 |
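The SVM values in the table are powers of two (e.g., c = 32 = 2^5, g = 0.0313 ≈ 2^-5), the signature of an exponential grid search. A hedged sketch of how such parameters are typically selected, on synthetic data rather than the paper's spectra:

```python
# Grid search over exponentially spaced SVM parameters, the usual way
# values like C = 32, gamma = 2**-5 are obtained. Data are synthetic.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=50, n_informative=20,
                           n_classes=5, random_state=0)

grid = {"C": [2.0**k for k in range(-2, 8)],
        "gamma": [2.0**k for k in range(-8, 0)]}
search = GridSearchCV(SVC(), grid, cv=3).fit(X, y)
print(search.best_params_)
```

The same exponential-grid idea applies to the KNN neighbor count and the BPNN/ELM hidden-node counts, though those are usually swept over small integer ranges instead.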