Alison Pereira Ribeiro¹, Nádia Felix Felipe da Silva¹, Fernanda Neiva Mesquita¹, Priscila de Cássia Souza Araújo², Thierson Couto Rosa¹, José Neiva Mesquita-Neto³.
Abstract
Bee-mediated pollination greatly increases the size and weight of tomato fruits. Therefore, distinguishing, within the local set of bees, those that are efficient pollinators is essential to improve the economic returns for farmers. To achieve this, it is important to know the identity of the visiting bees. However, the traditional taxonomic identification of bees is not an easy task, requiring the participation of experts and the use of specialized equipment. Because of these limitations, the development and implementation of new technologies for the automatic recognition of bees become relevant. Hence, we aim to verify the capacity of Machine Learning (ML) algorithms to recognize the taxonomic identity of bees visiting tomato flowers based on the characteristics of their buzzing sounds. We compared the performance of ML algorithms combined with Mel Frequency Cepstral Coefficients (MFCC) against classifications based solely on the fundamental frequency, giving a direct comparison between the two approaches. Some classifiers powered by MFCC, especially the SVM, achieved better performance than the randomized and sound frequency-based trials. Moreover, the buzzing sounds produced during sonication were more relevant for the taxonomic recognition of bee species than flight sounds alone. On the other hand, the ML classifiers performed better in recognizing bee genera based on flight sounds. Even so, the maximum accuracy obtained here (73.39% by SVM) is still low compared to ML standards. Further studies analyzing larger recording samples and applying unsupervised learning systems may yield better classification performance. Therefore, ML techniques could be used to automate the taxonomic recognition of flower-visiting bees of the cultivated tomato and other buzz-pollinated crops. This would be an interesting option for farmers and other professionals who have no experience in bee taxonomy but are interested in improving crop yields by increasing pollination.
Year: 2021 PMID: 34529654 PMCID: PMC8478199 DOI: 10.1371/journal.pcbi.1009426
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Taxonomic diversity of sonicating bees recorded visiting tomato flowers and the corresponding higher taxonomic group (according to [44]).
(N recordings) denotes the number of individuals with recorded buzzing sounds; (AF) the average frequency ± standard deviation; (Flight segments) the total number of flight segments per species; (Sonication segments) the total number of sonication segments per species.
| Subfamily | Tribe | Genus | Species | AF (±SD) [Hz] | N recordings | Flight segments | Sonication segments |
|---|---|---|---|---|---|---|---|
| Halictinae | Augochlorini | | | 252.1 (±44.8) | 2 | 1 | 3 |
| | | | | 203.3 (±17.0) | 5 | 9 | 11 |
| | | | | 190.1 (±6.3) | 2 | 2 | 12 |
| | | | | 210.5 (±10.5) | 3 | 2 | 22 |
| Apinae | Bombini | | | 218.3 (±18.9) | 8 | 9 | 19 |
| | | | | 233.9 (±21.7) | 5 | 11 | 15 |
| | Centridini | | | 330.8 (±7.3) | 1 | 6 | 2 |
| | | | | 319.9 (±16.8) | 2 | 8 | 3 |
| | Euglossini | | | 196.7 (±3.5) | 1 | 2 | 2 |
| | Exomalopsini | | | 167.6 (±21.8) | 10 | 11 | 63 |
| | | | | 151.9 (±12.8) | 4 | 4 | 19 |
| | Meliponini | | | 337.1 (±29.8) | 8 | 17 | 30 |
| | | | | 323.0 (±20.9) | 4 | 9 | 8 |
| | Xylocopini | | | 247.2 (±15.2) | 2 | 9 | 7 |
| | | | | 250.1 (±29.9) | 2 | 3 | 2 |
Fig 1. Overview of the approach adopted for the acoustic classification of bee buzzing sounds and the machine learning workflow.
The original audio files (.wav format) containing recordings of bee buzzing sounds during visits to tomato flowers were manually classified into sonication or flight segments. Then, the Mel Frequency Cepstral Coefficients (MFCC) method was used to extract the audio features. Next, the resulting data set was split into 50% for the training/development set (delimited by the red dashed line) and 50% for the testing data set. The GridSearchCV method was used to tune the hyperparameters on the training set (using 5-fold cross-validation). The test data set was used to evaluate the performance of the Machine-Learning classifiers in correctly assigning each buzzing sound to the respective bee taxon.
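The split-and-tune workflow in the Fig 1 caption can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' code: the feature matrix is synthetic (random stand-ins for MFCC vectors), and the stratified split, random seed, scaling step, and parameter grid are illustrative choices. Only the 50/50 split, `GridSearchCV`, and the 5-fold cross-validation come from the caption.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for MFCC feature vectors (assumption): 3 taxa, 40 segments each.
rng = np.random.default_rng(0)
n_per_class, n_features = 40, 13
X = np.vstack([
    rng.normal(loc=2.0 * k, scale=1.0, size=(n_per_class, n_features))
    for k in range(3)
])
y = np.repeat(np.arange(3), n_per_class)

# 50% training/development set, 50% held-out test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0
)

# Hypothetical grid, tuned with 5-fold cross-validation on the training set only.
param_grid = {
    "svc__C": [1, 10],
    "svc__gamma": [0.01, 0.1],
    "svc__kernel": ["rbf"],
}
search = GridSearchCV(
    make_pipeline(StandardScaler(), SVC()),
    param_grid,
    cv=5,
    scoring="f1_macro",
)
search.fit(X_train, y_train)

# The test set is touched once: score() applies the macro-F1 scorer
# to the best estimator refit on the whole training set.
test_macro_f1 = search.score(X_test, y_test)
print(search.best_params_, round(test_macro_f1, 3))
```

Keeping the scaler inside the pipeline means it is refit within each cross-validation fold, so no statistics from validation folds or the test set leak into tuning.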
Fig 2. Spectrograms of different types of buzzing (sonication and flight) for two bee species visiting tomato flowers (Melipona bicolor and Exomalopsis analis).
Note that the duration, amplitude, and frequency of the buzzing sounds vary between species and between types of buzzing.
Fig 3. Overview of the steps for audio feature extraction by the Mel Frequency Cepstral Coefficients (MFCC) method.
The steps are pre-emphasis, framing, windowing, Discrete Fourier Transform (DFT), and filter bank (applying the Discrete Cosine Transform, DCT).
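The MFCC steps named in the Fig 3 caption can be sketched in NumPy as follows. This is an illustrative sketch rather than the authors' implementation: the function name `mfcc`, the frame length and step (25 ms and 10 ms), the pre-emphasis coefficient (0.97), the number of filters (26), and the number of coefficients (13) are common defaults assumed here, and the log compression before the DCT is the usual MFCC convention, not something stated in the caption.

```python
import numpy as np

def mfcc(signal, sample_rate, frame_len=0.025, frame_step=0.010,
         n_filters=26, n_coeffs=13, pre_emphasis=0.97):
    """Return an (n_frames, n_coeffs) array of MFCCs for a 1-D signal."""
    signal = np.asarray(signal, dtype=float)
    flen = int(round(frame_len * sample_rate))
    fstep = int(round(frame_step * sample_rate))
    if len(signal) < flen:
        raise ValueError("signal is shorter than one frame")

    # 1. Pre-emphasis: first-order filter that boosts high frequencies.
    emphasized = np.append(signal[0], signal[1:] - pre_emphasis * signal[:-1])

    # 2. Framing: short overlapping frames (trailing partial frame dropped).
    n_frames = 1 + (len(emphasized) - flen) // fstep
    idx = np.arange(flen)[None, :] + fstep * np.arange(n_frames)[:, None]
    frames = emphasized[idx]

    # 3. Windowing: taper each frame with a Hamming window.
    frames = frames * np.hamming(flen)

    # 4. DFT: power spectrum, zero-padded to the next power of two.
    n_fft = 1 << (flen - 1).bit_length()
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2 / n_fft

    # 5. Filter bank: triangular filters equally spaced on the mel scale.
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):
            fbank[m - 1, k] = (right - k) / (right - center)
    log_energy = np.log(power @ fbank.T + 1e-10)  # small offset avoids log(0)

    # 6. DCT (type II, unnormalized) decorrelates the log filter-bank energies.
    basis = np.cos(np.pi * np.outer(np.arange(n_coeffs), np.arange(n_filters) + 0.5) / n_filters)
    return log_energy @ basis.T

# Usage on a synthetic 250 Hz tone (0.5 s at 8 kHz), a stand-in for a buzz segment.
sr = 8000
t = np.arange(int(0.5 * sr)) / sr
features = mfcc(np.sin(2 * np.pi * 250.0 * t), sr)
segment_vector = features.mean(axis=0)  # one fixed-length vector per segment (assumption)
```

Averaging the coefficients over frames is only one way to turn a variable-length segment into a fixed-length classifier input. The source does not say how the authors aggregated frames.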
Predictive performance of different Machine-Learning algorithms on acoustic recognition of bee species based on the type of buzzing sound (flight, sonication, and flight+sonication) during visits to tomato flowers.
The performance of the ML algorithms was measured by Accuracy (Acc), Macro-Precision (MacPrec), Macro-Recall (MacRec) and Macro-F1 (MacF1) and compared with three baseline scenarios: (1) Majority class: assigning all samples to the majority class; (2) Fundamental frequency: bee recognition based solely on the average frequency of the sonication, as performed by [43]; (3) Fundamental frequency (SVM): bee recognition based on the fundamental frequency using the SVM algorithm, the classifier with the best performance (based on the MacF1 score). Bold numbers represent the best results per evaluation metric within each buzzing-sound type; different superscript letters denote significant differences in the MacF1 scores among the algorithms for the same buzzing behavior (p ≤ 0.05, t-test); (**) denotes that the performance of the algorithm is higher than the baselines (based on the MacF1 measure; p ≤ 0.05, t-test).
| Algorithm | Acc | MacPrec | MacRec | MacF1 |
|---|---|---|---|---|
| Flight | | | | |
| LR** | 51.92 | 47.25 | 41.30 | 40.20 |
| SVM** | | | | |
| RF | 48.07 | 47.32 | 43.20 | 41.46 |
| DTree | 26.92 | 24.25 | 23.88 | 19.74 |
| Ensemble** | 50.00 | 45.80 | 38.01 | 36.02 |
| Sonication | | | | |
| LR** | 64.22 | 45.56 | 41.13 | 41.27 |
| SVM** | **73.39** | | | **59.06** |
| RF** | 58.71 | 47.73 | 34.72 | 37.67 |
| DTree | 43.11 | 31.31 | 35.35 | 29.07 |
| Ensemble** | 68.80 | 48.50 | 45.59 | 44.19 |
| Flight+sonication | | | | |
| LR** | 53.41 | 53.67 | | |
| SVM** | 56.52 | 46.66 | 46.59 | 45.16 |
| RF** | 50.31 | 44.56 | 38.13 | 36.21 |
| DTree | 32.91 | 33.02 | 28.19 | 25.91 |
| Ensemble** | | | 47.62 | 47.36 |
| Baselines | | | | |
| Majority class | 23.00 | 2.00 | 7.00 | 2.00 |
| Fundamental frequency | 51.00 | 25.00 | 40.00 | 28.00 |
| Fundamental frequency (SVM) | 35.00 | 27.00 | 24.00 | 24.00 |
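The four metrics in the table, and the majority-class baseline, can be illustrated with a short pure-Python sketch. The helper name `macro_scores` and the toy labels are hypothetical. Here the majority class is taken from the toy labels themselves for brevity, whereas in practice it would come from the training set. Classes that are never predicted are given a precision of zero, which is one common convention.

```python
from collections import Counter

def macro_scores(y_true, y_pred):
    """Return (accuracy, macro-precision, macro-recall, macro-F1) as fractions."""
    labels = sorted(set(y_true))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n

# Hypothetical toy labels: three taxa with unequal sample counts.
y_true = ["A"] * 5 + ["B"] * 3 + ["C"] * 2

# Majority-class baseline: every sample is assigned to the most frequent class.
majority = Counter(y_true).most_common(1)[0][0]
y_pred = [majority] * len(y_true)

acc, mac_prec, mac_rec, mac_f1 = macro_scores(y_true, y_pred)
print(f"Acc={acc:.2f} MacPrec={mac_prec:.2f} MacRec={mac_rec:.2f} MacF1={mac_f1:.2f}")
```

Because macro averages weight every class equally, a predictor that ignores the minority classes can keep a moderate accuracy while its macro-F1 collapses, which is the pattern the majority-class row shows.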
Predictive performance of different Machine-Learning algorithms on acoustic recognition of bee genera based on the type of buzzing sound (flight, sonication, and flight+sonication) during visits to tomato flowers.
The performance of the ML algorithms was measured by Accuracy (Acc), Macro-Precision (MacPrec), Macro-Recall (MacRec) and Macro-F1 (MacF1) and compared with three baseline scenarios: (1) Majority class: assigning all samples to the majority class; (2) Fundamental frequency: bee recognition based solely on the average frequency of the sonication, as performed by [43]; (3) Fundamental frequency (SVM): bee recognition based on the fundamental frequency using the SVM algorithm, the classifier with the best performance (based on the MacF1 score). Bold numbers represent the best results per evaluation metric within each buzzing-sound type; different superscript letters denote significant differences in the MacF1 scores among the algorithms for the same buzzing behavior (p ≤ 0.05, t-test); (**) denotes that the performance of the algorithm is higher than the baselines (based on the MacF1 measure; p ≤ 0.05, t-test).
| Algorithm | Acc | MacPrec | MacRec | MacF1 |
|---|---|---|---|---|
| Flight | | | | |
| LR** | 60.37 | | 56.63 | 57.02 |
| SVM** | **64.15** | 64.44 | | **60.20** |
| RF | 54.71 | 45.85 | 41.22 | 38.17 |
| DTree | 39.62 | 20.92 | 28.79 | 21.85 |
| Ensemble | 60.37 | 64.37 | 56.84 | 55.23 |
| Sonication | | | | |
| LR** | 60.90 | 56.72 | 49.91 | 51.55 |
| SVM** | 66.36 | | | |
| RF | 62.72 | 44.83 | 36.93 | 37.77 |
| DTree | 49.09 | 30.59 | 29.51 | 29.82 |
| Ensemble | | 47.97 | 41.45 | 42.60 |
| Flight+sonication | | | | |
| LR** | 62.34 | 51.49 | 55.76 | 52.38 |
| SVM** | 67.90 | 57.79 | | |
| RF** | 61.11 | 53.57 | 45.77 | 46.92 |
| DTree | 45.06 | 34.53 | 35.80 | 34.23 |
| Ensemble** | | | 56.99 | 58.09 |
| Baselines | | | | |
| Majority class | 30.00 | 4.00 | 12.00 | 6.00 |
| Fundamental frequency | 68.00 | 41.00 | 50.00 | 43.00 |
| Fundamental frequency (SVM) | 48.00 | 29.00 | 29.00 | 28.00 |
Confusion matrix with the best performance for bee buzzing-sounds classification at genus-level using MFCC features (flight with SVM classifier, MacF1 = 60.20% and Acc = 64.15%).
The numbers in the matrix correspond to correctly (diagonal elements, bold) and incorrectly (out-of-diagonal elements) recognized samples in the data set. The best parameters of this classification were C = 10, decision_function_shape = “ovo”, gamma = 0.01, kernel = “rbf”.
| Predict → | | | | | | | | | All |
|---|---|---|---|---|---|---|---|---|---|
| True ↓ | | | | | | | | | |
| | **3** | 0 | 0 | 0 | 3 | 1 | 1 | 0 | 8 |
| | 0 | **10** | 0 | 0 | 0 | 0 | 0 | 0 | 10 |
| | 0 | 0 | **5** | 0 | 0 | 2 | 0 | 0 | 7 |
| | 0 | 0 | 0 | **1** | 0 | 0 | 0 | 0 | 1 |
| | 0 | 1 | 0 | 0 | **4** | 1 | 0 | 2 | 8 |
| | 0 | 1 | 1 | 0 | 2 | **7** | 0 | 1 | 12 |
| | 0 | 0 | 0 | 0 | 1 | 0 | **0** | 0 | 1 |
| | 0 | 2 | 0 | 0 | 0 | 0 | 0 | **4** | 6 |
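The metrics in the caption can be checked against the genus-level confusion matrix above. The sketch below is a worked example under one assumption: each diagonal count is taken as the row total ("All") minus the off-diagonal counts in that row. Class labels are omitted because only the counts are needed, and the variable names are illustrative.

```python
# Off-diagonal counts (rows: true genus, columns: predicted genus).
# None marks the diagonal position of each row.
off_diagonal = [
    [None, 0, 0, 0, 3, 1, 1, 0],
    [0, None, 0, 0, 0, 0, 0, 0],
    [0, 0, None, 0, 0, 2, 0, 0],
    [0, 0, 0, None, 0, 0, 0, 0],
    [0, 1, 0, 0, None, 1, 0, 2],
    [0, 1, 1, 0, 2, None, 0, 1],
    [0, 0, 0, 0, 1, 0, None, 0],
    [0, 2, 0, 0, 0, 0, 0, None],
]
row_totals = [8, 10, 7, 1, 8, 12, 1, 6]  # the "All" column

# Diagonal entry = row total minus the misclassified samples in that row.
matrix = []
for i, (row, total) in enumerate(zip(off_diagonal, row_totals)):
    correct = total - sum(v for v in row if v is not None)
    matrix.append([correct if j == i else v for j, v in enumerate(row)])

n = len(matrix)
accuracy = sum(matrix[i][i] for i in range(n)) / sum(row_totals)

precisions, recalls, f1s = [], [], []
for i in range(n):
    tp = matrix[i][i]
    predicted = sum(matrix[r][i] for r in range(n))  # column sum
    actual = row_totals[i]                           # row sum
    p = tp / predicted if predicted else 0.0
    r = tp / actual if actual else 0.0
    precisions.append(p)
    recalls.append(r)
    f1s.append(2 * p * r / (p + r) if p + r else 0.0)

macro_precision = sum(precisions) / n
macro_recall = sum(recalls) / n
macro_f1 = sum(f1s) / n
print(f"Acc = {accuracy:.2%}, MacPrec = {macro_precision:.2%}, MacF1 = {macro_f1:.2%}")
```

With these counts, 34 of 53 test samples are classified correctly, which gives the 64.15% accuracy stated in the caption. The macro-F1 comes out at about 60.2%, in line with the reported value.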
Confusion matrix with the best performance for bee buzzing-sounds classification at species-level using MFCC features (sonication with SVM classifier, MacF1 = 59.06% and Acc = 73.39%).
The numbers in the matrix correspond to correctly (diagonal elements, bold) and incorrectly (out-of-diagonal elements) recognized samples in the data set. The best parameters of this classification were C = 10, decision_function_shape = “ovo”, gamma = 0.01, kernel = “rbf”.
| Predict → | | | | | | | | | | | | | | | | All |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| True ↓ | | | | | | | | | | | | | | | | |
| | **1** | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| | 0 | **2** | 0 | 0 | 1 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 6 |
| | 0 | 0 | **9** | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 9 |
| | 0 | 0 | 0 | **5** | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 8 |
| | 0 | 0 | 0 | 0 | **6** | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 8 |
| | 0 | 0 | 0 | 0 | 1 | **0** | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 2 |
| | 0 | 0 | 0 | 0 | 0 | 0 | **2** | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
| | 0 | 0 | 0 | 0 | 0 | 0 | 0 | **0** | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | **28** | 0 | 0 | 0 | 0 | 1 | 0 | 30 |
| | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 2 | **7** | 0 | 0 | 0 | 0 | 0 | 11 |
| | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | **9** | 1 | 0 | 3 | 0 | 15 |
| | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | **2** | 0 | 0 | 0 | 2 |
| | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | **8** | 0 | 0 | 11 |
| | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | **1** | 0 | 2 |
| | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | **0** | 1 |
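The best parameters reported in the confusion-matrix captions map directly onto scikit-learn's `SVC` constructor. The sketch below only shows how such a classifier is configured and queried. The data are synthetic stand-ins for MFCC vectors (an assumption), so the resulting scores say nothing about the bee recordings.

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): 4 taxa, 15 segments each, 13 features.
rng = np.random.default_rng(0)
n_classes, n_per_class, n_features = 4, 15, 13
X = np.vstack([
    rng.normal(loc=float(k), scale=0.5, size=(n_per_class, n_features))
    for k in range(n_classes)
])
y = np.repeat(np.arange(n_classes), n_per_class)

# Parameters as reported in the captions: C = 10, gamma = 0.01, RBF kernel,
# one-vs-one ("ovo") decision function.
clf = SVC(C=10, kernel="rbf", gamma=0.01, decision_function_shape="ovo")
clf.fit(X, y)

# With "ovo", decision_function returns one column per pair of classes,
# i.e. n_classes * (n_classes - 1) / 2 columns.
scores = clf.decision_function(X[:5])
predictions = clf.predict(X[:5])
print(scores.shape, predictions)
```

For 15 species the one-vs-one scheme yields 105 pairwise decision values per sample, and for 8 genera it yields 28.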