Otkrist Gupta, Anshuman J Das, Joshua Hellerstein, Ramesh Raskar.
Abstract
The analysis and identification of different attributes of produce such as taxonomy, vendor, and organic nature is vital to verifying product authenticity in a distribution network. Though a variety of analysis techniques have been studied in the past, we present a novel data-centric approach to classifying produce attributes. We employed visible and near infrared (NIR) spectroscopy on over 75,000 samples across several fruit and vegetable varieties. This yielded 0.90-0.98 and 0.98-0.99 classification accuracies for taxonomy and farmer classes, respectively. The most significant factors in the visible spectrum were variations in the produce color due to chlorophyll and anthocyanins. In the infrared spectrum, we observed that the varying water and sugar content levels were critical to obtaining high classification accuracies. High quality spectral data along with an optimal tuning of hyperparameters in the support vector machine (SVM) was also key to achieving high classification accuracies. In addition to demonstrating exceptional accuracies on test data, we explored insights behind the classifications, and identified the highest performing approaches using cross validation. We presented data collection guidelines, experimental design parameters, and machine learning optimization parameters for the replication of studies involving large sample sizes.
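The abstract credits part of the accuracy to tuning SVM hyperparameters and to selecting approaches by cross validation. The following is a minimal sketch of that idea, not the authors' pipeline: it assumes a two-class problem, a linear SVM fitted with Pegasos-style stochastic subgradient descent, a small hypothetical grid for the regularization strength `lam`, and synthetic "spectra" generated by the hypothetical helper `make_synthetic_spectra`. None of these specifics come from the paper.

```python
import random

def make_synthetic_spectra(n_per_class=100, n_bands=20, noise=0.1, seed=0):
    """Hypothetical stand-in for reflectance spectra: each class has a raised
    response in a different block of bands, plus Gaussian noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label, bands in ((+1, range(5, 10)), (-1, range(10, 15))):
        for _ in range(n_per_class):
            x = [rng.gauss(0.0, noise) for _ in range(n_bands)]
            for j in bands:
                x[j] += 1.0
            X.append(x + [1.0])  # constant feature acts as the bias term
            y.append(label)
    return X, y

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def train_linear_svm(X, y, lam, epochs=20, seed=0):
    """Linear SVM (hinge loss + L2 penalty) fitted with Pegasos-style
    stochastic subgradient descent. Labels must be -1 or +1."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)                     # decaying step size
            violated = y[i] * dot(w, X[i]) < 1.0      # margin check before the update
            w = [(1.0 - eta * lam) * wj for wj in w]  # L2 shrinkage
            if violated:                              # hinge-loss subgradient step
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def accuracy(w, X, y):
    hits = sum(1 for xi, yi in zip(X, y) if (1 if dot(w, xi) >= 0 else -1) == yi)
    return hits / len(X)

def cross_validate(X, y, lam, k=5, seed=0):
    """Mean held-out accuracy over k folds for one value of lam."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        w = train_linear_svm([X[i] for i in train], [y[i] for i in train], lam)
        scores.append(accuracy(w, [X[i] for i in fold], [y[i] for i in fold]))
    return sum(scores) / k

X, y = make_synthetic_spectra()
cv_scores = {lam: cross_validate(X, y, lam) for lam in (0.001, 0.01, 0.1)}
best_lam = max(cv_scores, key=cv_scores.get)
print(cv_scores, best_lam)
```

Selecting `lam` by held-out accuracy rather than training accuracy is the point of the sketch. On real spectra, a maintained SVM library and a multi-class scheme would be the more practical choice.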
Year: 2018 PMID: 29588477 PMCID: PMC5869718 DOI: 10.1038/s41598-018-23394-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Schematic of the data collection setup. A broadband source was used to illuminate the sample and the reflected signal was collected using an optical fiber probe that served as an input to 5 UV-VIS-NIR spectrometers.
Figure 2. Hyperspectral responses for Fuji apples in the 400–700 nm (top left), 700–1100 nm (top right), 1100–2000 nm (bottom left), and combined spectrum ranges (bottom right). Measurements in the combined spectrum (400–2000 nm) include visible (blue background), NIR 1 (red background), and NIR 2 (yellow background).
Classification accuracies for fine grained taxonomy of fruits and vegetables.
| Fruit Type (Organic/Inorganic) | Number of Classes | Number of Samples | Visible (0–700 nm) | NIR 1 (700–1100 nm) | NIR 2 (1100–2000 nm) | Composite (0–2000 nm) |
|---|---|---|---|---|---|---|
| Apples | 8 | 13808 | 0.915 | 0.836 | 0.906 | |
| Strawberries | 2 | 980 | 0.828 | 0.84 | 0.917 | |
| Grapes | 2 | 947 | 0.906 | 0.867 | 0.921 | |
| Oranges | 4 | 2599 | 0.94 | 0.911 | 0.981 | |
| Mushrooms | 3 | 1217 | 0.941 | 0.943 | ||
| Onions | 2 | 2686 | 0.892 | 0.903 | ||
| Bell Peppers | 5 | 1483 | 0.959 | 0.954 | 0.945 | |
| Jalapeno Chilli | 3 | 3292 | 0.964 | 0.9 | 0.976 | |
| Potatoes | 3 | 5541 | 0.949 | 0.962 | 0.963 | |
| Tomatoes | 6 | 3718 | 0.906 | 0.876 | 0.902 | |
Farmer classification accuracies from various spectra using linear SVMs.
| Fruit Type (Organic/Inorganic) | Number of Classes | Number of Samples | Visible (0–700 nm) | NIR 1 (700–1100 nm) | NIR 2 (1100–2000 nm) | Composite (0–2000 nm) |
|---|---|---|---|---|---|---|
| Fuji Apples | 3 | 1683 | 0.962 | 0.981 | 0.982 | |
| Gala Apples | 3 | 753 | 0.987 | 0.991 | ||
| Halo Oranges | 2 | 725 | ||||
| Red Bell Peppers | 3 | 510 | 0.98 | 0.994 | ||
| Red Potatoes | 3 | 741 | 0.988 | 0.986 | ||
| Russet Potatoes | 2 | 1140 | 0.994 | |||
| Steak Tomatoes | 3 | 690 | 0.981 | 0.991 | 0.985 | |
Confusion matrices for farmer classification over Gala (left) and Fuji (right) apples. For Gala apples, our net accuracy is 99%, compared to 68% for random assignment. For Fuji apples, net accuracy is 99%, compared to 59% for random assignment.
| Gala | farmer 1 | farmer 2 | farmer 3 | Fuji | farmer 1 | farmer 2 | farmer 3 |
|---|---|---|---|---|---|---|---|
| farmer 1 | 0.99 | 0.00 | 0.01 | farmer 1 | 0.96 | 0.03 | 0.01 |
| farmer 2 | 0.03 | 0.97 | 0.00 | farmer 2 | 0.00 | 0.99 | 0.01 |
| farmer 3 | 0.00 | 0.00 | 1.00 | farmer 3 | 0.00 | 0.00 | 1.00 |
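Each row of the two matrices above sums to 1, which suggests they are row-normalized. As a hedged illustration, the sketch below checks that property and computes the mean of the diagonal, assuming rows are true farmers and columns are predicted farmers (the page does not state the orientation). This mean is a balanced accuracy, about 0.987 for Gala and 0.983 for Fuji; it is not necessarily the net accuracy quoted in the caption, which depends on per-farmer sample counts that are not listed here.

```python
# Values copied from the confusion-matrix table above.
# Assumption: rows are true farmers, columns are predicted farmers.
gala = [
    [0.99, 0.00, 0.01],
    [0.03, 0.97, 0.00],
    [0.00, 0.00, 1.00],
]
fuji = [
    [0.96, 0.03, 0.01],
    [0.00, 0.99, 0.01],
    [0.00, 0.00, 1.00],
]

def is_row_normalized(matrix, tol=1e-9):
    """True if every row sums to 1 within a small tolerance."""
    return all(abs(sum(row) - 1.0) <= tol for row in matrix)

def balanced_accuracy(matrix):
    """Mean of the diagonal of a row-normalized confusion matrix,
    i.e. the unweighted average of per-class recall."""
    return sum(matrix[i][i] for i in range(len(matrix))) / len(matrix)

for name, m in (("Gala", gala), ("Fuji", fuji)):
    assert is_row_normalized(m)
    print(f"{name}: balanced accuracy = {balanced_accuracy(m):.3f}")
```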
Classification accuracy when identifying organic vs non-organic fruit.
| Fruit Type (Organic/Inorganic) | Number of Classes | Number of Samples | Visible (0–700 nm) | NIR 1 (700–1100 nm) | NIR 2 (1100–2000 nm) | Composite (0–2000 nm) |
|---|---|---|---|---|---|---|
| Gala Apples | 2 | 3358 | 0.966 | 0.874 | 0.982 | |
| Red Delicious Apples | 2 | 1095 | 0.938 | 0.975 | 0.984 | |
| Navel Oranges | 2 | 423 | 0.97 | 0.956 | 0.968 | |
| Green Onions | 2 | 316 | 0.904 | 0.979 | 0.989 | |
| Green Bell Peppers | 2 | 119 | 0.919 | 0.969 | 0.906 | |