Qinlin Xiao, Wentan Tang, Chu Zhang, Lei Zhou, Lei Feng, Jianxun Shen, Tianying Yan, Pan Gao, Yong He, Na Wu.
Abstract
Rapid determination of chlorophyll content is important for evaluating the nutritional and physiological status of cotton. Hyperspectral technology combined with multivariate analysis methods has been widely used for chlorophyll content detection. However, a model developed on one batch or variety often does not perform equally well on another because of variations in samples and measurement conditions. Because establishing a separate model for each batch or variety is costly, the feasibility of using spectral preprocessing combined with deep transfer learning for model transfer was explored. Seven spectral preprocessing methods were compared, and a self-designed convolutional neural network (CNN) was developed to build models and conduct transfer tasks by fine-tuning. The approach combining first derivative (FD) and standard normal variate (SNV) transformation, denoted FD + SNV, was chosen as the best pretreatment. On the target-domain dataset, the fine-tuned CNN based on spectra processed by FD + SNV outperformed conventional partial least squares (PLS) and support vector machine regression (SVR) models. Although the performance of the fine-tuned CNN with a smaller dataset was slightly lower, it was still better than the conventional models and achieved satisfactory results. Ensemble preprocessing combined with deep transfer learning could be an effective approach for estimating chlorophyll content across different cotton varieties, offering a new possibility for evaluating the nutritional status of cotton in the field.
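To make the FD + SNV pretreatment concrete, the following is a minimal sketch in plain Python. It assumes that FD is a simple first difference between adjacent bands and that the label "FD + SNV" means FD is applied before SNV; the record does not state either detail, and the study may have used a smoothed derivative instead. The function names and the reflectance values are hypothetical.

```python
import math

def first_derivative(spectrum):
    """First difference between adjacent bands (assumed FD variant)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]

def snv(spectrum):
    """Standard normal variate: center and scale one spectrum
    by its own mean and sample standard deviation."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]

def fd_then_snv(spectrum):
    """Assumed order for the 'FD + SNV' label: derivative first, then SNV."""
    return snv(first_derivative(spectrum))

# Made-up reflectance values for a single leaf spectrum (illustration only).
reflectance = [0.10, 0.12, 0.15, 0.21, 0.30, 0.42]
processed = fd_then_snv(reflectance)
```

After this transformation each spectrum has zero mean and unit standard deviation, and it is one band shorter than the input because of the differencing step.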
Year: 2022 PMID: 36158530 PMCID: PMC9489230 DOI: 10.34133/2022/9813841
Source DB: PubMed Journal: Plant Phenomics ISSN: 2643-6515
Figure 1. Cotton leaves with different Chl content.
Figure 2. The architectures of the CNN model and the flowchart of fine-tuning transfer.
Figure 3. The average spectra with standard deviation (SD) of leaves of two cotton varieties (a) LMY 24 and (b) XLZ 53 captured at five growing stages.
Statistical information of cotton leaves in the calibration and prediction sets.
| Sample set | Number | Range | Mean | Standard deviation |
|---|---|---|---|---|
| Cal | 1026 | 9.690-56.069 | 32.893 | 8.300 |
| Pre | 512 | 10.750-54.962 | 32.848 | 8.244 |
Prediction results for both varieties of cotton.
| Model | R2CV | RMSECV | R2P | RMSEP |
|---|---|---|---|---|
| PLS | 0.806 | 3.651 | 0.768 | 3.996 |
| SVR | 0.879 | 2.88 | 0.822 | 3.472 |
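The R2 and RMSE columns reported in these tables can be computed from measured and predicted chlorophyll values. The sketch below assumes that R2 is the coefficient of determination (one minus the ratio of residual to total sum of squares); the record does not give the formula, so this is an assumption, and the sample values are invented for illustration.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(y_true, y_pred):
    """Coefficient of determination (assumed definition of R2)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Invented measured and predicted Chl values (not from the study).
measured = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 29.0, 41.0, 48.0]
example_r2 = r_squared(measured, predicted)
example_rmse = rmse(measured, predicted)
```

The same two functions apply to the cross-validation (CV) and prediction (P) columns; only the set of samples passed in changes.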
Results of models built on different preprocessed spectra when LMY 24 was the source domain.
| Model | Pretreatment | R2CV (LMY 24) | RMSECV (LMY 24) | R2P (XLZ53) | RMSEP (XLZ53) |
|---|---|---|---|---|---|
| PLS | None | 0.801 | 3.646 | 0.734 | 4.459 |
| PLS | MSC | 0.789 | 3.759 | 0.726 | 4.593 |
| PLS | SNV | 0.793 | 3.713 | 0.681 | 5.428 |
| PLS | FD | 0.824 | 3.424 | 0.671 | 5.091 |
| PLS | MSC + FD | 0.810 | 3.561 | 0.772 | 4.021 |
| PLS | SNV + FD | 0.789 | 3.752 | 0.773 | 4.140 |
| PLS | FD + MSC | 0.797 | 3.680 | 0.794 | 3.839 |
| PLS | FD + SNV | | | | |
| SVR | None | 0.867 | 2.981 | 0.700 | 4.572 |
| SVR | MSC | 0.878 | 2.849 | 0.602 | 5.260 |
| SVR | SNV | 0.864 | 3.008 | 0.626 | 5.104 |
| SVR | FD | 0.874 | 2.897 | 0.681 | 4.713 |
| SVR | MSC + FD | 0.872 | 2.915 | 0.511 | 5.836 |
| SVR | SNV + FD | 0.882 | 2.808 | 0.705 | 4.530 |
| SVR | FD + MSC | | | | |
| SVR | FD + SNV | | | | |
The numbers are bolded to highlight models with relatively good results.
Results of models built on different preprocessed spectra when XLZ53 was the source domain.
| Model | Pretreatment | R2CV (XLZ53) | RMSECV (XLZ53) | R2P (LMY 24) | RMSEP (LMY 24) |
|---|---|---|---|---|---|
| PLS | None | 0.834 | 3.408 | 0.578 | 5.552 |
| PLS | MSC | 0.809 | 3.656 | 0.577 | 5.562 |
| PLS | SNV | 0.793 | 3.799 | 0.548 | 5.846 |
| PLS | FD | 0.829 | 3.455 | 0.573 | 5.659 |
| PLS | MSC + FD | 0.837 | 3.372 | 0.651 | 5.725 |
| PLS | SNV + FD | 0.835 | 3.392 | 0.639 | 5.135 |
| PLS | FD + MSC | 0.837 | 3.376 | 0.637 | 5.209 |
| PLS | FD + SNV | | | | |
| SVR | None | 0.896 | 2.687 | 0.618 | 5.047 |
| SVR | MSC | 0.902 | 2.611 | 0.635 | 4.934 |
| SVR | SNV | 0.893 | 2.725 | 0.597 | 5.182 |
| SVR | FD | 0.898 | 2.667 | 0.639 | 4.907 |
| SVR | MSC + FD | 0.896 | 2.686 | 0.638 | 4.913 |
| SVR | SNV + FD | 0.904 | 2.589 | 0.567 | 5.375 |
| SVR | FD + MSC | 0.897 | 2.679 | 0.636 | 4.928 |
| SVR | FD + SNV | | | | |
The numbers are bolded to highlight models with relatively good results.
Regression results using fine-tuned CNN and conventional models.
| Source/target domain | Pretreatment | Model | Calibration set (a) R2 | Calibration set (a) RMSE | Validation set (b) R2 | Validation set (b) RMSE | Prediction set (c) R2 | Prediction set (c) RMSE |
|---|---|---|---|---|---|---|---|---|
| LMY24/XLZ53 | None | PLS | 0.719 | 4.792 | 0.663 | 5.596 | 0.696 | 5.235 |
| LMY24/XLZ53 | None | SVR | 0.733 | 4.297 | 0.629 | 5.113 | 0.667 | 4.834 |
| LMY24/XLZ53 | None | Fine-tuned CNN | 0.850 | 3.225 | 0.796 | 3.786 | 0.842 | 3.327 |
| LMY24/XLZ53 | None | Fine-tuned CNN using a smaller set | 0.885 | 2.901 | 0.802 | 3.730 | 0.811 | 3.643 |
| LMY24/XLZ53 | FD + SNV | PLS | 0.821 | 3.600 | 0.754 | 4.359 | 0.777 | 4.098 |
| LMY24/XLZ53 | FD + SNV | SVR | 0.761 | 4.065 | 0.662 | 4.877 | 0.761 | 4.090 |
| LMY24/XLZ53 | FD + SNV | Fine-tuned CNN | | | | | | |
| LMY24/XLZ53 | FD + SNV | Fine-tuned CNN using a smaller set | | | | | | |
| XLZ53/LMY24 | None | PLS | 0.538 | 5.749 | 0.586 | 5.423 | 0.559 | 5.810 |
| XLZ53/LMY24 | None | SVR | 0.537 | 5.537 | 0.561 | 5.429 | 0.506 | 5.759 |
| XLZ53/LMY24 | None | Fine-tuned CNN | 0.850 | 3.156 | 0.746 | 4.129 | 0.757 | 4.036 |
| XLZ53/LMY24 | None | Fine-tuned CNN using a smaller set | 0.734 | 4.235 | 0.672 | 4.691 | 0.689 | 4.568 |
| XLZ53/LMY24 | FD + SNV | PLS | 0.637 | 5.140 | 0.630 | 5.086 | 0.636 | 5.300 |
| XLZ53/LMY24 | FD + SNV | SVR | 0.654 | 4.909 | 0.664 | 4.928 | 0.578 | 5.445 |
| XLZ53/LMY24 | FD + SNV | Fine-tuned CNN | | | | | | |
| XLZ53/LMY24 | FD + SNV | Fine-tuned CNN using a smaller set | | | | | | |
(a) Calibration set means the calibration set of the target domain; (b) validation set means the validation set of the target domain; (c) prediction set means the prediction set of the target domain. The numbers are bolded to highlight models with relatively good results.
Figure 4. Saliency maps after transfer learning for the two cotton varieties. (a)–(d) Key wavelengths for Chl content determination by different CNNs: (a) the fine-tuned CNN using raw spectra (LMY24→XLZ53); (b) the fine-tuned CNN using FD + SNV preprocessed spectra (LMY24→XLZ53); (c) the fine-tuned CNN using raw spectra (XLZ53→LMY24); and (d) the fine-tuned CNN using FD + SNV preprocessed spectra (XLZ53→LMY24).
The CNN architectures used in the experiment.
| Layer | No. | CNN1 | CNN2 | CNN3 | CNN4 | AlexNet | VGGNet-9 |
|---|---|---|---|---|---|---|---|
| Input | | 1 × 2071 | 1 × 2071 | 1 × 2071 | 1 × 2071 | 1 × 2071 | 1 × 2071 |
| Convolution | 1 | 32 kernels in size 1 × 3 with max pooling | 32 kernels in size 1 × 3 with max pooling | 32 kernels in size 1 × 3 with max pooling | 32 kernels in size 1 × 3 with max pooling | 96 kernels in size 1 × 11 with max pooling | 64 kernels in size 1 × 5 |
| Convolution | 2 | - | 32 kernels in size 1 × 3 with max pooling | 32 kernels in size 1 × 3 with max pooling | 32 kernels in size 1 × 3 with max pooling | 256 kernels in size 1 × 5 with max pooling | 64 kernels in size 1 × 3 with max pooling |
| Convolution | 3 | - | - | 32 kernels in size 1 × 3 with max pooling | 32 kernels in size 1 × 3 with max pooling | 384 kernels in size 1 × 3 with max pooling | 128 kernels in size 1 × 3 |
| Convolution | 4 | - | - | - | 32 kernels in size 1 × 3 with max pooling | 384 kernels in size 1 × 3 | 128 kernels in size 1 × 3 with max pooling |
| Convolution | 5 | - | - | - | - | 256 kernels in size 1 × 3 | 256 kernels in size 1 × 3 |
| Convolution | 6 | - | - | - | - | - | 256 kernels in size 1 × 3 |
| Convolution | 7 | - | - | - | - | - | 256 kernels in size 1 × 3 with max pooling |
| Fully connected | 1 | 512 nodes, ReLU | 512 nodes, ReLU | 512 nodes, ReLU | 512 nodes, ReLU | 4096 nodes, ReLU | 4096 nodes, ReLU |
| Fully connected | 2 | 32 nodes, ReLU | 32 nodes, ReLU | 32 nodes, ReLU | 32 nodes, ReLU | 4096 nodes, ReLU | 4096 nodes, ReLU |
| Output | | 1 node | 1 node | 1 node | 1 node | 1 node | 1 node |
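To get a feel for how the depth of CNN1 to CNN4 changes the size of the features passed to the first fully connected layer, the sketch below applies the standard output-length formula for 1D convolution and pooling to the 1 × 2071 input. The table does not report stride, padding, or pooling window, so stride 1, no padding, and a pooling window of 2 are assumptions made only for this illustration; with other settings the numbers differ. All function names are hypothetical.

```python
def conv_out_len(length, kernel, stride=1, padding=0):
    """Output length of a 1D convolution (standard formula)."""
    return (length + 2 * padding - kernel) // stride + 1

def pool_out_len(length, pool=2):
    """Output length of non-overlapping 1D max pooling (assumed window of 2)."""
    return length // pool

def flattened_size(input_len, n_blocks, kernels=32, kernel_size=3):
    """Feature count after n_blocks of (conv 1 x 3 -> max pooling)."""
    length = input_len
    for _ in range(n_blocks):
        length = pool_out_len(conv_out_len(length, kernel_size))
    return kernels * length

# CNN1..CNN4 stack one to four identical conv + pooling blocks.
sizes = {f"CNN{n}": flattened_size(2071, n) for n in range(1, 5)}
```

Under these assumptions the flattened feature vector shrinks by roughly half with each added block, which gives a rough picture of how the shallow and deep variants differ in the input to the 512-node layer.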
Results of fine-tuned models using different CNN architectures.
| Source/target domain | Model | Calibration set (a) R2 | Calibration set (a) RMSE | Validation set (b) R2 | Validation set (b) RMSE | Prediction set (c) R2 | Prediction set (c) RMSE |
|---|---|---|---|---|---|---|---|
| LMY24/XLZ53 | CNN1 | 0.921 | 2.336 | 0.853 | 3.217 | 0.880 | 2.903 |
| LMY24/XLZ53 | CNN2 | 0.909 | 2.505 | 0.850 | 3.248 | 0.870 | 3.020 |
| LMY24/XLZ53 | CNN3 | 0.910 | 2.501 | 0.821 | 3.550 | 0.855 | 3.184 |
| LMY24/XLZ53 | CNN4 | 0.892 | 2.739 | 0.840 | 3.356 | 0.855 | 3.186 |
| LMY24/XLZ53 | AlexNet | 0.877 | 2.923 | 0.848 | 3.269 | 0.852 | 3.222 |
| LMY24/XLZ53 | VGGNet-9 | 0.897 | 2.672 | 0.819 | 3.570 | 0.840 | 3.348 |
| XLZ53/LMY24 | CNN1 | 0.891 | 2.691 | 0.828 | 3.399 | 0.820 | 3.476 |
| XLZ53/LMY24 | CNN2 | 0.907 | 2.454 | 0.828 | 3.397 | 0.818 | 3.494 |
| XLZ53/LMY24 | CNN3 | 0.910 | 2.444 | 0.836 | 3.319 | 0.818 | 3.497 |
| XLZ53/LMY24 | CNN4 | 0.898 | 2.599 | 0.826 | 3.414 | 0.817 | 3.508 |
| XLZ53/LMY24 | AlexNet | 0.864 | 3.003 | 0.813 | 3.549 | 0.819 | 3.489 |
| XLZ53/LMY24 | VGGNet-9 | 0.891 | 2.693 | 0.813 | 3.550 | 0.816 | 3.509 |
(a) Calibration set means the calibration set of the target domain; (b) validation set means the validation set of the target domain; (c) prediction set means the prediction set of the target domain.
Results of fine-tuned CNN1 models using datasets of different sizes.
| Source/target domain | Dataset size (s) | Calibration set (a) R2 | Calibration set (a) RMSE | Validation set (b) R2 | Validation set (b) RMSE | Prediction set (c) R2 | Prediction set (c) RMSE |
|---|---|---|---|---|---|---|---|
| LMY24/XLZ53 | 10% | 0.829 | 3.718 | 0.706 | 4.550 | 0.656 | 4.807 |
| LMY24/XLZ53 | 20% | 0.868 | 3.012 | 0.789 | 3.851 | 0.810 | 3.651 |
| LMY24/XLZ53 | 30% | 0.873 | 2.804 | 0.802 | 3.729 | 0.832 | 3.433 |
| LMY24/XLZ53 | 40% | 0.859 | 3.028 | 0.817 | 3.586 | 0.838 | 3.372 |
| LMY24/XLZ53 | 50% | 0.852 | 3.237 | 0.832 | 3.441 | 0.842 | 3.330 |
| LMY24/XLZ53 | 60% | 0.866 | 3.058 | 0.831 | 3.454 | 0.841 | 3.333 |
| LMY24/XLZ53 | 70% | 0.871 | 2.961 | 0.838 | 3.380 | 0.844 | 3.305 |
| LMY24/XLZ53 | 80% | 0.856 | 3.153 | 0.837 | 3.383 | 0.846 | 3.286 |
| LMY24/XLZ53 | 90% | 0.865 | 3.086 | 0.851 | 3.244 | 0.852 | 3.224 |
| LMY24/XLZ53 | 100% | 0.879 | 2.895 | 0.841 | 3.346 | 0.864 | 3.083 |
| XLZ53/LMY24 | 10% | 0.864 | 3.484 | 0.712 | 4.397 | 0.701 | 4.481 |
| XLZ53/LMY24 | 20% | 0.872 | 3.238 | 0.753 | 4.076 | 0.729 | 4.263 |
| XLZ53/LMY24 | 30% | 0.830 | 3.579 | 0.773 | 3.901 | 0.753 | 4.073 |
| XLZ53/LMY24 | 40% | 0.832 | 3.405 | 0.817 | 3.505 | 0.792 | 3.737 |
| XLZ53/LMY24 | 50% | 0.837 | 3.387 | 0.817 | 3.507 | 0.801 | 3.652 |
| XLZ53/LMY24 | 60% | 0.868 | 3.004 | 0.795 | 3.707 | 0.805 | 3.621 |
| XLZ53/LMY24 | 70% | 0.870 | 3.023 | 0.821 | 3.466 | 0.804 | 3.626 |
| XLZ53/LMY24 | 80% | 0.836 | 3.335 | 0.821 | 3.471 | 0.814 | 3.533 |
| XLZ53/LMY24 | 90% | 0.840 | 3.283 | 0.813 | 3.546 | 0.815 | 3.528 |
| XLZ53/LMY24 | 100% | 0.852 | 3.131 | 0.822 | 3.459 | 0.817 | 3.507 |
(s) Dataset size is the size of the small dataset used for fine-tuning, expressed as a percentage of the size of the original calibration set. (a) Calibration set means the calibration set of the target domain; (b) validation set means the validation set of the target domain; (c) prediction set means the prediction set of the target domain.
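The dataset-size percentages in the table above can be illustrated by drawing a fixed fraction of the target-domain calibration samples before fine-tuning. The sketch below assumes simple random sampling without replacement and a fixed seed; the record does not say how the smaller sets were actually selected, and the function name and sample indices are hypothetical.

```python
import random

def subsample_calibration(samples, fraction, seed=0):
    """Return a random subset holding `fraction` of the calibration samples.

    Assumption: uniform sampling without replacement; the study's actual
    selection procedure is not described in this record.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    k = max(1, round(len(samples) * fraction))
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    return rng.sample(samples, k)

# Hypothetical indices standing in for target-domain calibration spectra.
calibration_ids = list(range(1000))
subsets = {pct: subsample_calibration(calibration_ids, pct / 100)
           for pct in range(10, 101, 10)}
```

Each entry of `subsets` would then be used to fine-tune the pretrained network, while the validation and prediction sets stay unchanged so that results across sizes remain comparable.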
Figure 5. The average reflectance and the average transformed spectra with standard deviation (SD) for two cotton varieties. (a) raw spectra; (b) MSC; (c) SNV; (d) FD; (e) MSC + FD; (f) SNV + FD; (g) FD + MSC; (h) FD + SNV.
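MSC, one of the transformations shown in Figure 5, is commonly understood as multiplicative scatter correction: each spectrum is regressed on a reference spectrum and then corrected by the fitted intercept and slope. The sketch below follows that common textbook form and uses the mean spectrum as the reference; whether the study used exactly this variant is an assumption, and the spectra are invented.

```python
def msc(spectra, reference=None):
    """Multiplicative scatter correction (common textbook form).

    Each spectrum s is fitted as s = intercept + slope * reference by
    least squares and corrected to (s - intercept) / slope.
    """
    n_bands = len(spectra[0])
    if reference is None:
        # Assumed reference: mean spectrum over all samples.
        reference = [sum(s[j] for s in spectra) / len(spectra)
                     for j in range(n_bands)]
    ref_mean = sum(reference) / n_bands
    ref_var = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / n_bands
        slope = sum((r - ref_mean) * (v - s_mean)
                    for r, v in zip(reference, s)) / ref_var
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected

# Two invented spectra that differ only by an offset and a scale factor.
spectra = [[0.10, 0.20, 0.40, 0.30],
           [0.25, 0.45, 0.85, 0.65]]
corrected = msc(spectra)
```

Because the second spectrum is an offset and scaled copy of the first, both collapse onto the same corrected curve, which is the additive and multiplicative scatter removal that MSC is designed to provide.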
The prediction results of PLS models using DS and TCA transformation.
| Source/target domain | Method | Calibration set (a) R2 | Calibration set (a) RMSE | Validation set (b) R2 | Validation set (b) RMSE | Prediction set (c) R2 | Prediction set (c) RMSE |
|---|---|---|---|---|---|---|---|
| LMY24/XLZ53 | DS 1 | 0.598 | 5.485 | 0.651 | 5.263 | 0.583 | 5.589 |
| LMY24/XLZ53 | DS 2 | 0.546 | 5.902 | 0.507 | 6.198 | 0.565 | 5.812 |
| LMY24/XLZ53 | DS 3 | 0.623 | 5.789 | 0.687 | 5.271 | 0.623 | 5.935 |
| LMY24/XLZ53 | TCA | 0.811 | 3.652 | 0.741 | 4.427 | 0.788 | 3.936 |
| XLZ53/LMY24 | DS 1 | 0.466 | 6.821 | 0.462 | 6.950 | 0.499 | 6.648 |
| XLZ53/LMY24 | DS 2 | 0.449 | 6.948 | 0.461 | 6.887 | 0.425 | 6.989 |
| XLZ53/LMY24 | DS 3 | 0.388 | 7.456 | 0.289 | 8.169 | 0.452 | 6.885 |
| XLZ53/LMY24 | TCA | 0.600 | 5.446 | 0.591 | 5.406 | 0.600 | 5.654 |
(a) Calibration set means the calibration set of the target domain; (b) validation set means the validation set of the target domain; (c) prediction set means the prediction set of the target domain.