Hsiang Sing Naik1, Jiaoping Zhang2, Alec Lofquist1, Teshale Assefa2, Soumik Sarkar1, David Ackerman1, Arti Singh2, Asheesh K Singh2, Baskar Ganapathysubramanian1.
Abstract
BACKGROUND: Phenotyping is a critical component of plant research. Accurate and precise trait collection, when integrated with genetic tools, can greatly accelerate the rate of genetic gain in crop improvement. However, efficient and automatic phenotyping of traits across large populations is a challenge, which is further exacerbated by the necessity of sampling multiple environments and growing replicated trials. A promising approach is to leverage current advances in imaging technology, data analytics and machine learning to enable automated and fast phenotyping and subsequent decision support. In this context, the workflow for phenotyping (image capture → data storage and curation → trait extraction → machine learning/classification → models/apps for decision support) has to be carefully designed and efficiently executed to minimize resource usage and maximize utility. We illustrate such an end-to-end phenotyping workflow for the case of plant stress severity phenotyping in soybean, with a specific focus on the rapid and automatic assessment of iron deficiency chlorosis (IDC) severity on thousands of field plots. We showcase this analytics framework by extracting IDC features from a set of ~4500 unique canopies representing a diverse germplasm base with different levels of IDC, and subsequently training a variety of classification models to predict plant stress severity. The best classifier is then deployed as a smartphone app for rapid, real-time severity rating in the field.
Keywords: High-throughput phenotyping; Image analysis; Machine learning; Plant stress; Smartphone
Year: 2017 PMID: 28405214 PMCID: PMC5385078 DOI: 10.1186/s13007-017-0173-7
Source DB: PubMed Journal: Plant Methods ISSN: 1746-4811 Impact factor: 4.993
Fig. 1 Image preprocessing sequence, from the original canopy image to the fully pre-processed field soybean canopy
Fig. 2 Iron deficiency chlorosis severity description using a field visual rating scale of 1–5
Fig. 3 Feature extraction from plant canopies (top image) for iron deficiency chlorosis. The bottom left figure shows the regions of the canopy that are yellow, and the bottom right figure shows the regions that are brown. The percentage spreads of yellow and brown are then taken as the two features
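The two features described in the Fig. 3 caption can be sketched as follows. This is a minimal illustration and not the authors' implementation: the hue bands, the saturation and brightness thresholds, and the function names `classify_pixel` and `color_features` are all assumptions, since the source does not state how "yellow" and "brown" are defined. The input is assumed to be a list of RGB pixels that already belong to the segmented canopy.

```python
import colorsys

# Hypothetical hue bands in degrees; the source does not give the real values.
YELLOW_HUE = (40.0, 70.0)
BROWN_HUE = (10.0, 40.0)
BROWN_MAX_VALUE = 0.6   # assumption: brown pixels are darker than yellow ones
MIN_SATURATION = 0.2    # assumption: ignore washed-out, nearly grey pixels


def classify_pixel(rgb):
    """Return 'yellow', 'brown', or 'other' for one (R, G, B) pixel in 0-255."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    hue = h * 360.0
    if s < MIN_SATURATION:
        return "other"
    if YELLOW_HUE[0] <= hue < YELLOW_HUE[1]:
        return "yellow"
    if BROWN_HUE[0] <= hue < BROWN_HUE[1] and v <= BROWN_MAX_VALUE:
        return "brown"
    return "other"


def color_features(canopy_pixels):
    """Percentage of canopy pixels classified as yellow and as brown."""
    total = len(canopy_pixels)
    if total == 0:
        return 0.0, 0.0
    labels = [classify_pixel(pixel) for pixel in canopy_pixels]
    yellow_pct = 100.0 * labels.count("yellow") / total
    brown_pct = 100.0 * labels.count("brown") / total
    return yellow_pct, brown_pct


# Toy canopy: two green pixels, one yellow pixel, one brown pixel.
pixels = [(40, 160, 40), (40, 160, 40), (230, 220, 40), (120, 70, 20)]
yellow_pct, brown_pct = color_features(pixels)
```

Each canopy is thereby reduced to a two-element feature vector, which is the form of input the classifiers compared in the results tables would take.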
Confusion matrix
| | Predicted positive (class 1) | Predicted negative (class 2) |
|---|---|---|
| Actual positive (class 1) | True positive (TP) | False negative (FN) |
| Actual negative (class 2) | False positive (FP) | True negative (TN) |
Three measures of accuracy of the classifier are reported from the confusion matrix
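As an illustration of how such measures follow from a confusion matrix, the sketch below computes overall accuracy and mean per-class accuracy (MPCA), the two measures named in the results tables. It assumes the common definition in which per-class accuracy is the fraction of samples of each actual class (each row) that are predicted correctly; the source does not spell out its formulas, and the function names are hypothetical.

```python
def accuracy(confusion):
    """Overall accuracy: correctly classified samples over all samples."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def per_class_accuracy(confusion):
    """Fraction of each actual class (row) that is predicted correctly.

    Classes with no samples are skipped to avoid division by zero.
    """
    return [row[i] / sum(row) for i, row in enumerate(confusion) if sum(row) > 0]


def mean_per_class_accuracy(confusion):
    """Unweighted mean of the per-class accuracies (MPCA)."""
    per_class = per_class_accuracy(confusion)
    return sum(per_class) / len(per_class)


# Rows are actual classes, columns are predicted classes.
# Imbalanced toy example: 100 samples of class 1, 10 samples of class 2.
confusion = [
    [90, 10],
    [5, 5],
]
overall = accuracy(confusion)               # 95 / 110
mpca = mean_per_class_accuracy(confusion)   # mean of 0.9 and 0.5
```

With imbalanced classes, overall accuracy can stay high while MPCA drops, which is one way to read the gap between the Accuracy and MPCA columns for some algorithms in the results tables.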
Cost matrix, w_ij (rows: actual ratings; columns: predicted ratings)
| Actual \ Predicted | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 0 | 1 | 2 | 3 | 4 |
| 2 | 1 | 0 | 1 | 2 | 3 |
| 3 | 2 | 1 | 0 | 1 | 2 |
| 4 | 3 | 2 | 1 | 0 | 1 |
| 5 | 4 | 3 | 2 | 1 | 0 |
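The entries of the cost matrix above are consistent with w_ij = |i − j|, so a misclassification costs more the further the predicted rating is from the actual one. The sketch below builds that matrix and applies it to a confusion matrix. The source does not state how the reported cost metric is normalized, so dividing the total cost by the number of samples is an assumption, and both function names are hypothetical.

```python
def cost_matrix(n_classes):
    """Absolute-difference cost matrix: w[i][j] = |i - j|."""
    return [[abs(i - j) for j in range(n_classes)] for i in range(n_classes)]


def mean_misclassification_cost(confusion, weights):
    """Total weighted cost divided by the number of samples.

    Assumption: the cost is averaged over all samples; the source does not
    give the exact normalization.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    cost = sum(weights[i][j] * confusion[i][j] for i in range(n) for j in range(n))
    return cost / total


weights = cost_matrix(5)

# Toy 3-class example: rows are actual ratings, columns are predicted ratings.
confusion = [
    [8, 1, 1],
    [0, 9, 1],
    [0, 0, 10],
]
# Off-diagonal costs: 1*1 + 2*1 + 1*1 = 4, spread over 30 samples.
toy_cost = mean_misclassification_cost(confusion, cost_matrix(3))
```

Under this reading, a classifier whose errors are all off by a single rating has a mean cost equal to its error rate, while larger rating errors push the cost above the error rate.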
Fig. 4 Hierarchical classifier workflow
Results for machine learning algorithm model accuracies developed using a sub-set of iron deficiency chlorosis data on a diverse set of soybean accessions
| Algorithm | Accuracy (%) | MPCA^a (%) | Cross-validated MPCA (%) | Interpretability | Cost metric |
|---|---|---|---|---|---|
| CT | 100.0 | 100.0 | 96.0 | Medium | 0.0000 |
| KNN | 99.7 | 96.7 | 95.0 | Low | 0.0031 |
| RF | 99.7 | 96.0 | 85.0 | Low | 0.0031 |
| Hierarchy^b | 99.4 | 95.9 | 79.8 | High | 0.0062 |
| QDA | 99.4 | 92.0 | 98.9 | Medium | 0.0620 |
| Hierarchy^c | 98.5 | 86.6 | 70.8 | High | 0.0155 |
| GMMB | 99.1 | 82.0 | 87.0 | Medium | 0.0093 |
| NB | 99.1 | 82.0 | 93.8 | Medium | 0.0093 |
| LDA | 98.8 | 79.3 | 84.3 | High | 0.0124 |
| SVM | 93.8 | 39.8 | 50.0 | Low | 0.1084 |
^a Mean per-class accuracy
^b SVM and SVM
^c LDA and SVM
Results for machine learning algorithm model accuracies developed using the complete set of iron deficiency chlorosis data on a diverse set of soybean accessions
| Algorithm | Accuracy (%) | MPCA^a (%) | Cross-validated MPCA (%) | Interpretability | Cost metric |
|---|---|---|---|---|---|
| CT | 99.7 | 91.7 | 78.4 | Low | 0.0027 |
| Hierarchy^b | 99.2 | 90.7 | 79.2 | High | 0.0082 |
| Hierarchy^c | 98.3 | 84.0 | 79.0 | High | 0.0201 |
| QDA | 98.5 | 83.2 | 77.9 | Medium | 0.0201 |
| NB | 98.4 | 79.0 | 78.5 | Medium | 0.0284 |
| KNN | 99.5 | 75.8 | 84.3 | Low | 0.0073 |
| RF | 99.1 | 75.0 | 81.1 | Low | 0.0092 |
| GMMB | 99.4 | 74.2 | 82.7 | Low | 0.0064 |
| LDA | 98.5 | 71.7 | 76.9 | High | 0.0156 |
| SVM | 97.3 | 45.8 | 45.3 | Low | 0.0458 |
^a Mean per-class accuracy
^b SVM and SVM: SVM used for both classifiers
^c LDA and SVM
Fig. 5 Population canopy graph of predicted data using a testing set with images and visual rating for IDC in soybean
Fig. 6 Smartphone app flowchart demonstrating the integration of pre-processing, machine learning enabled classification and iron deficiency chlorosis visual rating in real time