Halil Bisgin, Tanmay Bera, Leihong Wu, Hongjian Ding, Neslihan Bisgin, Zhichao Liu, Monica Pava-Ripoll, Amy Barnes, James F Campbell, Himansi Vyas, Cesare Furlanello, Weida Tong, Joshua Xu.
Abstract
Food samples are routinely screened for food-contaminating beetles (i.e., pantry beetles) due to their adverse impact on the economy, environment, public health, and safety. If found, their remains are subsequently analyzed to identify the species responsible for the contamination; each species poses a different level of risk, requiring different regulatory and management steps. At present, this identification is done through manual microscopic examination, since each species of beetle has a unique pattern on its elytra (hardened forewings). Our study sought to automate the pattern recognition process through machine learning. Such automation would enable more efficient identification of pantry beetle species and could potentially be scaled up and implemented across various analysis centers in a consistent manner. In our earlier studies, we demonstrated that automated species identification of pantry beetles is feasible through elytral pattern recognition. Due to poor image quality, however, we failed to achieve prediction accuracies above 80%. Subsequently, we modified the traditional imaging technique, allowing us to acquire high-quality elytral images. In this study, we explored whether high-quality elytral images can truly achieve near-perfect prediction accuracies for 27 different species of pantry beetles. To test this hypothesis, we developed a convolutional neural network (CNN) model and compared performance between two different image sets for various pantry beetles. Our study indicates that improved image quality indeed leads to better prediction accuracy; however, it was not the only requirement for achieving good accuracy. A large number of high-quality images is also required, especially for species whose elytral patterns vary widely. The current study provides a direction toward achieving our ultimate goal of automated species identification through elytral pattern recognition.
Keywords: convolutional neural networks; deep learning; food safety; food-contaminating beetle; image classification; machine learning; species identification
Year: 2022 PMID: 36034596 PMCID: PMC9412741 DOI: 10.3389/frai.2022.952424
Source DB: PubMed Journal: Front Artif Intell ISSN: 2624-8212
The complete list of pantry beetles used in this study, listed alphabetically by family, with common names and abbreviations.

| # | Family | Common Name | Abbreviation |
|---|---|---|---|
| 1 | Anthribidae | Coffee Bean Weevil | AAF |
| 2 | Anobiidae | Cigarette Beetle | ALS |
| 3 | Anobiidae | Drugstore Beetle | ASP |
| 4 | Bostrichidae | Lesser Grain Borer | BRD |
| 5 | Chrysomelidae | Cowpea Weevil | CCM |
| 6 | Curculionidae | Granary Weevil | CSG |
| 7 | Curculionidae | Rice Weevil | CSO |
| 8 | Curculionidae | Maize Weevil | CSZ |
| 9 | Dermestidae | Black Carpet Beetle | DAU |
| 10 | Dermestidae | Cabinet Beetle | DTI |
| 11 | Laemophloeidae | Rusty Grain Beetle | LCF |
| 12 | Laemophloeidae | Flat Grain Beetle | LCP |
| 13 | Laemophloeidae | Flour Mill Beetle | LCT |
| 14 | Silvanidae | Foreign Grain Beetle | SAA |
| 15 | Silvanidae | Fungus Beetle | SAS |
| 16 | Silvanidae | Squarenecked Grain Beetle | SCQ |
| 17 | Silvanidae | Merchant Grain Beetle | SOM |
| 18 | Silvanidae | Saw-toothed Grain Beetle | SOS |
| 19 | Tenebrionidae | Larger Black Flour Beetle | TCA |
| 20 | Tenebrionidae | Broad-horned Flour Beetle | TGC |
| 21 | Tenebrionidae | Longheaded Flour Beetle | TLO |
| 22 | Tenebrionidae | Siamese Grain Beetle | TLP |
| 23 | Tenebrionidae | Smalleyed Flour Beetle | TPR |
| 24 | Tenebrionidae | Red Flour Beetle | TTCa |
| 25 | Tenebrionidae | Confused Flour Beetle | TTCo |
| 26 | Tenebrionidae | Dark Flour Beetle | TTD |
| 27 | Tenebrionidae | Black Flour Beetle | TTM |
Figure 1. Overview of the CNN architecture.
List of augmentation options and parameter values used in our study.

| Option | Description | Value |
|---|---|---|
| rotation_range | Creates images with random rotations of up to N degrees. | 40 |
| width_shift_range | Handles off-center objects by creating horizontally shifted versions of the training data. | 0.2 |
| height_shift_range | Same as width_shift_range, but shifts vertically. | 0.2 |
| shear_range | Shear angle in the counterclockwise direction, in degrees. | 0.2 |
| zoom_range | Random zoom range. | 0.2 |
| horizontal_flip | Creates random horizontal flips of the image (assumes a mirrored image is still a valid example). | True |
| fill_mode | Strategy for filling values outside the boundaries of a shifted or rotated image. | nearest |
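The option names in this table match the image-augmentation parameters common in deep learning frameworks (e.g., Keras's `ImageDataGenerator`); the paper does not include its augmentation code. As a framework-agnostic sketch, the following pure-Python example shows what two of these operations do: a horizontal shift with "nearest" fill (vacated pixels repeat the nearest edge value) and a horizontal flip. The function names and the toy grayscale image are illustrative, not from the paper.

```python
import random

def shift_horizontal(img, frac):
    """Shift each row of a grayscale image (list of rows) by frac * width
    pixels; positive frac shifts right, negative shifts left. Vacated
    pixels repeat the nearest edge value, as with fill_mode='nearest'."""
    w = len(img[0])
    dx = int(round(frac * w))
    out = []
    for row in img:
        if dx > 0:                      # shift right; fill the left edge
            out.append([row[0]] * dx + row[:w - dx])
        elif dx < 0:                    # shift left; fill the right edge
            out.append(row[-dx:] + [row[-1]] * (-dx))
        else:
            out.append(row[:])
    return out

def horizontal_flip(img):
    """Mirror the image left-to-right."""
    return [row[::-1] for row in img]

def random_augment(img, width_shift_range=0.2, rng=random):
    """Apply a random shift drawn from [-width_shift_range,
    +width_shift_range] and, with probability 0.5, a horizontal flip."""
    out = shift_horizontal(img, rng.uniform(-width_shift_range,
                                            width_shift_range))
    if rng.random() < 0.5:
        out = horizontal_flip(out)
    return out

img = [[1, 2, 3, 4, 5],
       [6, 7, 8, 9, 10]]
print(shift_horizontal(img, 0.2))   # shift right by 1 pixel, left edge repeated
print(horizontal_flip(img))
```

In a real pipeline these transforms are applied on the fly to each training batch, so the model never sees exactly the same image twice; the 0.2 values in the table bound the shift, shear, and zoom to 20% of the image extent.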
Figure 2. Model optimization, showing the model achieving optimal performance after about 50 epochs.
Figure 3. Comparison of model performance on validation sets of traditionally and optimally acquired images.
Figure 4. Performance metrics for the 27-class model.
Figure 5. Confusion matrix for the 27-class task (computed on the test set), showing the level of agreement between true and predicted classes. Red tiles (the diagonal) represent correct classification of each species, with values between 67% and 100%. Yellow tiles represent non-zero misclassification ratios of up to 28%. Green tiles represent zero values, meaning the targeted species is never confused with the corresponding species.
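The diagonal percentages described in Figure 5 are per-class recall values: each row of the confusion matrix is normalized by the number of true examples of that class. As a minimal stdlib-Python sketch (the 3-class matrix below is hypothetical, not the paper's 27-class data), per-class recall, precision, and F1 can be derived from raw counts like this:

```python
def per_class_metrics(cm):
    """Given a confusion matrix cm[i][j] = count of true class i
    predicted as class j, return (recall, precision, f1) per class.

    recall[i] = cm[i][i] / row_sum(i) is the diagonal fraction shown
    in a row-normalized confusion matrix such as Figure 5."""
    n = len(cm)
    recall, precision, f1 = [], [], []
    for i in range(n):
        row = sum(cm[i])                          # true examples of class i
        col = sum(cm[j][i] for j in range(n))     # predictions of class i
        r = cm[i][i] / row if row else 0.0
        p = cm[i][i] / col if col else 0.0
        recall.append(r)
        precision.append(p)
        f1.append(2 * p * r / (p + r) if p + r else 0.0)
    return recall, precision, f1

# Hypothetical 3-class confusion matrix (illustration only):
cm = [[8, 2, 0],
      [1, 9, 0],
      [0, 0, 10]]
recall, precision, f1 = per_class_metrics(cm)
print([round(r, 2) for r in recall])   # [0.8, 0.9, 1.0]
```

Off-diagonal entries (the yellow tiles in Figure 5) reveal which species pairs the model confuses, which is especially informative for visually similar genera.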
Figure 6. Representative images of elytral variation. (A) Intraspecies pattern variation in CSO (possibly due to differences in maturity), (B) pattern variation due to background interference in LCP and LCF, and (C) regional variation in elytral patterns in AAF.