Salvador Gutiérrez, Juan Fernández-Novales, Maria P Diago, Javier Tardaguila.
Abstract
Grapevine varietal classification is an important plant phenotyping task for grape growing and the wine industry. This task has so far been accomplished with destructive techniques such as classic ampelography and DNA analysis under laboratory conditions. This work presents a new approach for classifying a large number of grapevine (Vitis vinifera L.) varieties under field conditions using on-the-go hyperspectral imaging and different machine learning algorithms. On-the-go imaging was performed under natural illumination using a hyperspectral camera mounted on an all-terrain vehicle moving at 5 km/h. Spectra were acquired at two different leaf phenological stages on the canopy of 30 different varieties in a commercial vineyard located in La Rioja, Spain. A total of 1,200 spectral samples were generated. Support vector machines (SVM) and artificial neural networks (multilayer perceptrons, MLP) were used to develop a large number of models, testing different algorithm parameters and spectral pre-processing techniques. Both classifiers performed well and produced models with recall, F1 score and area under the receiver operating characteristic curve (AUC) values of up to 0.99 under 5-fold cross validation. Statistical analyses indicated that the best SVM kernel was linear and that the best activation function for MLP was the hyperbolic tangent. For MLP, the prediction performance for individual varieties ranged from 0.94 to 0.99, showing low variability. For SVM, the spread was slightly wider, ranging from 0.83 to 0.97 for individual varieties. These results support the possibility of deploying an on-the-go hyperspectral imaging system in the field that can successfully classify leaves from different grapevine varieties. This technology could thus be considered a new, useful non-destructive tool for plant phenotyping under field conditions.
Keywords: MLP; discrimination; non-invasive sensors; plant phenotyping; proximal sensing; remote sensing; sensors
Year: 2018 PMID: 30090110 PMCID: PMC6068396 DOI: 10.3389/fpls.2018.01102
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Figure 1. (A) On-the-go hyperspectral imaging from an all-terrain vehicle in a vertically shoot positioned vineyard located in Logroño, La Rioja (Spain). Spectral acquisition was performed on the sun-exposed canopy side at 5 km/h. (The authors declare that written and informed consent has been obtained from the depicted individual in this image, for the publication of this identifiable image). (B) Construction of a two-dimensional hyperspectral image by push broom scanning. The camera's scanline, which acquired spectral information from a vertical line over the vineyard canopy, was moved by the motion of the all-terrain vehicle. The image was thus composed by dragging this scanline along the canopy at constant speed.
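The push broom construction in panel (B) can be sketched as stacking successive scanlines into an image cube. This is only an illustration: the function names, the array layout (rows × columns × bands) and the 50 Hz line rate are assumptions, since the caption gives only the vehicle speed.

```python
import numpy as np


def scanline_spacing_m(speed_kmh, line_rate_hz):
    """Ground distance between consecutive scanlines at constant speed.

    line_rate_hz is a hypothetical camera line rate; the source does not state it.
    """
    return (speed_kmh / 3.6) / line_rate_hz


def assemble_push_broom(scanlines):
    """Stack scanlines of shape (m, bands) into a cube of shape (m, n, bands).

    Each scanline becomes one image column, in acquisition order.
    """
    return np.stack(list(scanlines), axis=1)


# Six synthetic scanlines, each with 4 vertical pixels and 3 bands.
lines = [np.full((4, 3), float(i)) for i in range(6)]
cube = assemble_push_broom(lines)
print(cube.shape)
print(scanline_spacing_m(5.0, 50.0))
```

Column `i` of the cube holds the `i`-th scanline, so horizontal position in the image corresponds to distance travelled along the row.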
Figure 2. Each m × n hyperspectral image was processed column by column. For each column i, each pixel (spectrum) was compared with a signature leaf spectrum. If a certain threshold of belonging was surpassed, the pixel was marked as a leaf pixel. Afterwards, all leaf pixels from column i were averaged.
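A minimal sketch of this column-wise leaf selection follows. The caption does not say how a pixel is compared with the signature spectrum, so cosine similarity and the 0.95 threshold are assumptions chosen for illustration, as is the function name.

```python
import numpy as np


def average_leaf_spectra(image, leaf_signature, threshold=0.95):
    """Average the leaf pixels of each column of an (m, n, bands) image.

    Assumption: "belonging" is measured by cosine similarity to the signature.
    Returns an (n, bands) array; columns without leaf pixels are filled with NaN.
    """
    m, n, bands = image.shape
    signature = leaf_signature / np.linalg.norm(leaf_signature)
    norms = np.linalg.norm(image, axis=2)
    norms = np.where(norms == 0, 1.0, norms)  # guard against all-zero pixels
    similarity = (image @ signature) / norms  # shape (m, n)
    is_leaf = similarity > threshold

    averaged = np.full((n, bands), np.nan)
    for i in range(n):
        leaf_pixels = image[is_leaf[:, i], i, :]
        if leaf_pixels.size:
            averaged[i] = leaf_pixels.mean(axis=0)
    return averaged


signature = np.array([1.0, 2.0, 3.0])
image = np.zeros((3, 2, 3))
image[:, 0, :] = [[1, 2, 3], [2, 4, 6], [3, 0, 0]]  # two leaf-like pixels, one not
image[:, 1, :] = [[1, 0, 0], [1, 0, 0], [1, 0, 0]]  # no leaf-like pixels
result = average_leaf_spectra(image, signature)
print(result)
```

Each column therefore yields at most one averaged spectrum, which is what makes a single image produce many spectral samples.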
Figure 3. Experimental modeling diagram summarizing the analyses performed. From the spectral dataset (input), different combinations of various pre-processing techniques were applied, modeled using two machine learning algorithms (with many parameters) and validated by several 5-fold cross validation replicates. Finally, three performance statistics were evaluated.
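The modeling loop in this diagram could look roughly like the sketch below. It uses scikit-learn as an assumption, since the source does not name the software, and it runs on synthetic data rather than the study's spectra. Macro averaging for recall and F1, one-vs-rest AUC, and the specific values `C=100` and `max_iter=1000` are also illustrative choices.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Synthetic stand-in for pre-processed spectra: 3 classes, 20 bands.
rng = np.random.default_rng(0)
n_classes, n_per_class, n_bands = 3, 40, 20
centers = rng.normal(size=(n_classes, n_bands))
X = np.vstack([c + 0.3 * rng.normal(size=(n_per_class, n_bands)) for c in centers])
y = np.repeat(np.arange(n_classes), n_per_class)

models = {
    # probability=True is needed so that the multiclass AUC scorer can be used.
    "SVM": SVC(kernel="linear", C=100, probability=True, random_state=0),
    "MLP": MLPClassifier(
        hidden_layer_sizes=(n_bands + n_classes,),
        activation="tanh",
        max_iter=1000,
        random_state=0,
    ),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scoring = {"recall": "recall_macro", "f1": "f1_macro", "auc": "roc_auc_ovr"}

results = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
    results[name] = {m: float(scores[f"test_{m}"].mean()) for m in scoring}
    print(name, results[name])
```

Repeating the loop with different `random_state` values for the splitter would give the cross validation replicates mentioned in the caption.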
Comparison of means of classification recall, F1 score and AUC for each Savitzky-Golay window size by algorithm and derivative order.
| Algorithm | Statistic | Derivative order | Window size | Window size | Window size | Significance |
|---|---|---|---|---|---|---|
| SVM | Recall | First | 0.8839 | 0.8648 | 0.8351 | |
| SVM | Recall | Second | 0.9024 | 0.8947 | 0.8842 | |
| SVM | F1 score | First | 0.8934 | 0.8747 | 0.8450 | |
| SVM | F1 score | Second | 0.9142 | 0.9058 | 0.8938 | |
| SVM | AUC | First | 0.9309 | 0.9305 | 0.9297 | |
| SVM | AUC | Second | 0.9339 | 0.9328 | 0.9265 | |
| MLP | Recall | First | 0.9796 | 0.9678 | 0.9404 | |
| MLP | Recall | Second | 0.9905 | 0.9842 | 0.9804 | |
| MLP | F1 score | First | 0.9796 | 0.9687 | 0.9404 | |
| MLP | F1 score | Second | 0.9905 | 0.9842 | 0.9804 | |
| MLP | AUC | First | 0.9998 | 0.9995 | 0.9986 | |
| MLP | AUC | Second | 0.9999 | 0.9998 | 0.9996 | |
The values represent the average recall.
Dissimilar lowercase letters within rows represent statistically different means among different window sizes, using Tukey's range test at a significance level p = 0.05.
AUC, area under the receiver operating characteristic curve; SVM, support vector machine; MLP, multilayer perceptron. n.s., not significant (p≥0.05);
p < 0.001.
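As a rough illustration of the pre-processing compared in this table, first and second Savitzky-Golay derivatives can be computed for several window sizes with SciPy. The window sizes (5, 9, 15), the polynomial order and the function name are placeholders, because the values actually tested are not given here.

```python
import numpy as np
from scipy.signal import savgol_filter


def sg_derivatives(spectra, window_sizes=(5, 9, 15), orders=(1, 2), polyorder=2):
    """Return {(derivative order, window size): filtered spectra}.

    spectra has shape (samples, bands); filtering runs along the band axis.
    The window sizes are hypothetical placeholders, not the study's settings.
    """
    filtered = {}
    for deriv in orders:
        for window in window_sizes:
            filtered[(deriv, window)] = savgol_filter(
                spectra, window_length=window, polyorder=polyorder,
                deriv=deriv, axis=-1,
            )
    return filtered


# Two synthetic "spectra" with known derivatives: x**2 and 3*x.
x = np.arange(30.0)
spectra = np.vstack([x**2, 3.0 * x])
derivatives = sg_derivatives(spectra)
print(sorted(derivatives))
```

Larger windows smooth more strongly, which is consistent with the scores in the table changing as the window size changes.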
Comparison of means of classification recall, F1 score and AUC for the different parameters tested for support vector machine (SVM).
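| Parameter | Value | Recall | F1 score | AUC |
|---|---|---|---|---|
| Penalty parameter (C) | 1,000 | 0.99 | 0.99 | 0.99 |
| Penalty parameter (C) | 100 | 0.99 | 0.99 | 0.99 |
| Penalty parameter (C) | 10 | 0.98 | 0.98 | 0.99 |
| Penalty parameter (C) | 1 | 0.92 | 0.94 | 0.99 |
| Penalty parameter (C) | 0.1 | 0.73 | 0.75 | 0.98 |
| Penalty parameter (C) | 0.01 | 0.65 | 0.68 | 0.60 |
| Penalty parameter (C) | Significance | | | |
| Kernel | linear | 0.99 | 0.99 | 0.99 |
| Kernel | | 0.90 | 0.90 | 0.95 |
| Kernel | | 0.74 | 0.77 | 0.84 |
| Kernel | Significance | | | |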
Dissimilar lowercase letters within the different parameter values represent statistically different means, using Tukey's range test at a significance level p = 0.05.
p < 0.001.
Comparison of means of classification recall, F1 score and AUC for the different parameters tested for multilayer perceptron (MLP).
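| Parameter | Value | Recall | F1 score | AUC |
|---|---|---|---|---|
| Hidden layer | t | 0.9746 | 0.9746 | 0.9995 |
| Hidden layer | i | 0.9757 | 0.9757 | 0.9996 |
| Hidden layer | a | 0.9717 | 0.9717 | 0.9994 |
| Hidden layer | Significance | | | |
| Activation function | tanh | 0.9855 | 0.9855 | 0.9998 |
| Activation function | relu | 0.9837 | 0.9837 | 0.9998 |
| Activation function | logistic | 0.9527 | 0.9527 | 0.9990 |
| Activation function | Significance | | | |
| Warm start | True | 0.9740 | 0.9739 | 0.9995 |
| Warm start | False | 0.9739 | 0.9739 | 0.9994 |
| Warm start | Significance | | | |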
Dissimilar lowercase letters within the different parameter values represent statistically different means, using Tukey's range test at a significance level p = 0.05.
n.s., not significant (p ≥ 0.05);
p < 0.01.
p < 0.001.
AUC, area under the receiver operating characteristic curve. t, the number of neurons is the sum of the number of attributes and classes. i, the number of neurons is the number of attributes. a, the number of neurons is half the amount of t. tanh, hyperbolic tangent function. relu, rectified linear unit function. logistic, logistic sigmoid function.
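The three hidden layer rules defined above reduce to simple arithmetic. The sketch below uses a hypothetical helper and a hypothetical count of 200 spectral attributes; the 30 classes correspond to the 30 varieties. Rounding down for rule a is an assumption, as the footnote does not say how odd totals are handled.

```python
def hidden_layer_neurons(rule, n_attributes, n_classes):
    """Number of hidden neurons for the rules 't', 'i' and 'a'."""
    total = n_attributes + n_classes
    if rule == "t":
        return total          # attributes + classes
    if rule == "i":
        return n_attributes   # attributes only
    if rule == "a":
        return total // 2     # half of t, rounded down (assumption)
    raise ValueError(f"unknown rule: {rule!r}")


for rule in ("t", "i", "a"):
    print(rule, hidden_layer_neurons(rule, n_attributes=200, n_classes=30))
```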
Figure 4. Average recall (A,B), F1 score (C,D) and area under the receiver operating characteristic curve, AUC, (E,F) per grapevine variety (n = 2160) for Multilayer Perceptron (A,C,E) and Support Vector Machine (B,D,F). BA, Baladí; BL, Blanca Cayetana; BR, Brancellao; CA, Catalán Blanco; CB, Chenin Blanc; CE, Centurion; CF, Cabernet Sauvignon; CG, Calagraño; CH, Chardonnay; CI, Cigüente; CN, Calop Negro; CO, Concord; CR, Carnelian; CS, Cabernet Franc; CU, Crujidera; PA, Palomino; PB, Pinot Blanc; PC, Picapoll Blanco; PD, Pardina; PE, Pedro Ximénez; PI, Pinot Noir; PL, Parellada; PR, Perruno Fino; RB, Rubired; RU, Rufete; SA, Sauvignon; SE, Semillón; SO, Sousón; SY, Syrah; TE, Tempranillo.