Artificial intelligence versus natural selection: Using computer vision techniques to classify bees and bee mimics.

Tanvir Bhuiyan¹, Ryan M Carney², Sriram Chellappan¹.

Abstract

Many groups of stingless insects have independently evolved mimicry of bees to fool would-be predators. To investigate this mimicry, we trained artificial intelligence (AI) algorithms (specifically, computer vision) to classify citizen scientist images of bees, bumble bees, and diverse bee mimics. For detecting bees and bumble bees, our models achieved accuracies of 91.71% and 88.86%, respectively. As a proxy for a natural predator, our models were poorest in detecting bee mimics that exhibit both aggressive and defensive mimicry. Using the explainable AI method of class activation maps, we validated that our models learn from appropriate components within the image, which in turn provided anatomical insights. Our t-SNE plot yielded perfect within-group clustering, as well as between-group clustering that grossly replicated the phylogeny. Ultimately, the transdisciplinary approaches herein can enhance global citizen science efforts as well as investigations of mimicry and morphology of bees and other insects.

Entities: Chemical

Keywords: Artificial intelligence; Bioinformatics; Computing methodology; Entomology; Zoology

Year: 2022 PMID: 36060073 PMCID: PMC9437854 DOI: 10.1016/j.isci.2022.104924

Source DB: PubMed Journal: iScience ISSN: 2589-0042

Introduction

Bees have been important pollinators ever since the Cretaceous (Genise et al., 2020). Widespread exploitation of honey bees as a source of honey dates back to at least the early Neolithic farmers (9 kya; Roffet-Salque et al., 2015), and the earliest known apiculture was practiced by the Ancient Egyptians (Crane 1999). Darwin’s experiments with “humble bees,” as bumble bees were once called, exemplify how these insects have furthered our understanding of natural selection with respect to co-evolution, ecosystem webs, and social behavior (Darwin 2003). Such a rich history makes it all the more tragic that today, despite their ecological and economic importance, bees are facing an unprecedented anthropogenic decline in diversity and abundance (Potts et al., 2010), making their identification and conservation urgent concerns.

The mimicry of bees also has a storied history, albeit inadvertently so. As recounted by poets from Virgil to Shakespeare, the ancient superstitious ritual of bugonia held that bees spontaneously generated from the decaying carcasses of animals such as oxen (Osten-Sacken, 1894). The primary culprit of this deception was likely the honey bee-mimicking Eristalis tenax (Syrphidae), a cosmopolitan hoverfly that occasionally lays its eggs on carcasses (Osten-Sacken, 1894; Scholl et al., 2019). Its common name, the common drone fly, stems from the resemblance of members of this genus to the drones of honey bees (Figure 1).

Figure 1

Citizen scientist photos of bees and bee mimic flies

(A) Bumble bee mimic (Asilidae: Laphria thoracica) preying on bumble bee (Hymenoptera: Bombus spp.), a putative example of aggressive mimicry. Honey bee mimic (B, Syrphidae: Eristalis tenax) and honey bee (C, Hymenoptera: Apis mellifera), an example of defensive mimicry. Red areas in the class activation maps (D and E) denote the importance of the wings and abdominal markings in these two correct classifications (mimic as a non-bee, honey bee as a bee) by our AI algorithms, elaborated in this article. Original images from iNaturalist.

Indeed, Hymenoptera (consisting of bees, wasps, and ants) is the most mimicked insect order (Poulton 1890), which is not surprising given these insects’ formidable sting. A resemblance to such well-equipped insect models protects harmless mimics by fooling would-be predators with a false harmful signal. This type of defensive mimicry is known as Batesian mimicry (Pasteur 1982). A less common and somewhat opposite (although not mutually exclusive) phenomenon is aggressive mimicry. This “wolf in sheep’s clothing” strategy evolved to fool prey into thinking that the mimic is harmless or even beneficial. A classic example of this is the anglerfish and its lure. However, the term aggressive mimicry was actually first ascribed to bumble bee mimics, also from the hover fly family Syrphidae (genus Volucella), by Poulton (Poulton 1890), following similar observations of this genus by Kirby and Spence (1817) (Kirby 1857) and Wallace (1871) (Wallace 1871). Poulton noted: “In some cases the Mimicry enables the aggressive form to lay eggs in the nest of that which it resembles, so that its larvæ live upon the food stored up by the latter or even upon the larvæ themselves. The boldness of these enemies sometimes depends upon the perfection of their disguise.” (Poulton 1890; see also discussion in Brower et al., 1960).

Poulton later published observations of three other researchers, who demonstrated that larvae of the robber fly family Asilidae (Hyperechia spp.) also prey upon larvae of the carpenter bees (Xylocopa spp.) that they mimic as adults (Brower et al., 1960). These findings were later corroborated by Tsacas et al. (1970), who similarly hypothesized that this mimicry enables egg-laying in or near the opening of the carpenter bee nests. Strikingly, Hyperechia and other asilids also prey upon the adult forms of their model species (and related aculeate Hymenoptera) (Figure 1) (Bromley 1930). Thus, these robber flies exhibit not only defensive (Batesian) mimicry, but possibly one or two types of aggressive mimicry: Kirbyan mimicry to prey on bee larvae and Batesian-Wallacian mimicry to prey on bee adults (Pasteur 1982).

Our motivation for this study is twofold. First, we aim to design artificial intelligence techniques to classify bees, especially bumble bees. Second, we aim to identify a variety of convergently evolved bee mimics; in other words, to assess whether the visual resemblance to bees gained through natural selection can fool artificial intelligence algorithms that were trained on bees.

For both problems, we use images uploaded to the iNaturalist platform (iNaturalist, n.d.), which is a crowning example of a global citizen science effort toward conservation. iNaturalist was launched in 2008, and as of July 2022, users have contributed more than 123 million observations of animals, plants, and other organisms worldwide. At the time of writing, the platform has observations of bees from observers. For each observation contributed, multiple identifications can be made by community members. Among these observations, the most reliable is “Research Grade,” for which two conditions must be met: the observation must contain a valid date, location, photo, or sound, and not be of a captive/cultivated organism; and at least two members must agree on the identification (iNaturalist, 2020). For bees, we found that of the observations are Research Grade, which means that close to half of the uploaded observations are not reliably identified. Leveraging Research Grade images, our specific contributions in this article are:

An artificial intelligence model for classifying bees from other insects

We designed a model based on VGG16 (Simonyan and Zisserman 2015) to classify bees from non-bees. This AI model was trained, validated, and tested on bee and non-bee insect images. Note that throughout the text, the term “model” refers to one of two disparate entities: an AI algorithm, or an insect species upon which mimicry is based.

An artificial intelligence model for classifying bumble bees from other bees

We designed a model based on ResNet-101 (He et al., 2016) to classify bumble bees from other bees (which we refer to as non-bumble bee bees, or simply non-bumble bees when the context is clear). This model was trained, validated, and tested on bumble bee and non-bumble bee images.

Evaluating both artificial intelligence models against independently evolved bee mimics

Our taxa comprise 19 bee mimic species across six diverse insect families (Asilidae, Bombyliidae, Scarabaeidae, Sphingidae, Syrphidae, Tachinidae) in three orders: Coleoptera (beetles), Diptera (flies), and Lepidoptera (moths). We also include related outgroups of nine species of wasp mimics and 13 species of non-mimics to serve as controls (see Supplemental Information). We hypothesize that 1) within each clade, bee mimics will exhibit lower model classification accuracy compared to their non-mimic counterparts, and 2) wasp mimics will exhibit intermediate accuracy; furthermore, 3) bee mimics that are believed to exploit aggressive mimicry toward bees will exhibit better mimicry (as defined by lower classification accuracy) compared to bee mimics that employ only defensive mimicry.

Finally, we visualize such classifications of artificial intelligence vis-a-vis natural selection through an integration of the t-Distributed Stochastic Neighbor Embedding (t-SNE) technique (phenotype) and evolutionary relationships (phylogeny). t-SNE is an AI technique for dimensionality reduction, particularly well-suited for the visualization of high-dimensional datasets (Hinton and Roweis 2002).

Artificial intelligence-driven insights on the fidelity of our techniques

To evaluate the fidelity and explainability of our AI models, we adopt the technique of class activation maps (CAMs) (Zhou et al., 2016) to pinpoint which pixels in an image are most used in a classification (Figures 1D and 1E). We also convert images to grayscale in order to compare the performance of each dataset and to make observations on the respective roles of color vs. pattern in the classifications (especially relevant in the context of aposematic mimicry).
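As a rough illustration of the grayscale conversion mentioned above, the sketch below maps an RGB pixel grid to grayscale and replicates the single channel three times, so that the image keeps the three-channel shape expected by networks such as VGG16 and ResNet-101. The source does not state which conversion was used; the ITU-R BT.601 luma weights (0.299, 0.587, 0.114), the channel replication, and the function name `to_grayscale_3ch` are all assumptions made for this example.

```python
# Hypothetical helper: the article does not specify its grayscale conversion.
# ASSUMPTION: ITU-R BT.601 luma weights, a common default in image libraries.

def to_grayscale_3ch(image):
    """Convert rows of (R, G, B) tuples (0-255) to gray, kept as 3 equal channels."""
    gray_image = []
    for row in image:
        gray_row = []
        for r, g, b in row:
            # Weighted sum of the channels, rounded to the nearest integer level.
            luma = int(round(0.299 * r + 0.587 * g + 0.114 * b))
            # ASSUMPTION: replicate the value so the image still has three channels.
            gray_row.append((luma, luma, luma))
        gray_image.append(gray_row)
    return gray_image


# Tiny 2x2 toy image (made-up pixel values, not data from the article).
tiny = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]]
gray_tiny = to_grayscale_3ch(tiny)
```

Because hue is discarded while luminance contrast is retained, a model fed such images can only rely on pattern and shape, which is what makes the color vs. grayscale comparison informative for aposematic signals.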

Related work

Artificial intelligence techniques to classify bees

In a 2021 study (Spiesman et al., 2021), multiple deep learning techniques were designed to classify species of bumble bees using images from citizen science platforms such as Bumble Bee Watch, BugGuide, and iNaturalist. The authors collected images from 36 species of bumble bees, including (as in our article) Bombus (B.) vosnesenskii, B. terricola, B. pensylvanicus, B. griseocollis, and B. affinis. The deep learning architectures used were Wide-ResNet-101, InceptionV3, ResNet-101, and MnasNet101, which achieved accuracies of , , , and , respectively, in classifying the species of bumble bees. In a 2022 study (De Nart et al., 2022), the authors classified species of honey bees. What is unique about this work is that the images used were exclusively wings. wing images spread across seven species were extracted from a large dataset of honey bee images archived at CREA-Research Center for Agriculture and Environment (CREA-AA, Italy). To extract the wings alone from the archived images, a RetinaNet model with a ResNet-50 backbone was designed (De Nart et al., 2022). Multiple deep neural network models were then developed, including ResNet-50, MobileNetV2, InceptionNetV3, and InceptionResNetV2, with the highest accuracy of achieved using the InceptionNetV3 model. Both works referred to above focus exclusively on identifying species of bumble bees and honey bees, and as such are complementary to the work in our article. Other related work on insect identification using image data and AI techniques include recent studies detecting invasive mosquito vectors (Carney et al., 2022; Minakshi et al., 2020), crop pests (Kasinathan et al., 2021), beetles (Venegas et al., 2021), and hornets (Jeong et al., 2020), as well as ants and their movements (Wu et al., 2020).

Artificial intelligence techniques to study bee mimicry

In the realm of investigating bee mimicry using deep neural networks, we are aware of one recent work from 2019, which looks at Müllerian mimicry among bumble bees across spatial scales (Ezray et al., 2019). Müllerian mimicry is a form of biological resemblance in which two or more unrelated harmful (e.g., stinging) organisms exhibit closely similar warning systems, such as the same pattern of bright colors. The authors generated their own dataset to represent the color patterns of 35 bumble bee species using a standardized template that removed the effects of body size while still maintaining the morphologically monotonous shape of bumble bees. This method focuses on only color pattern differences and avoids variation from other sources. Color diagrams for the 35 species were generated in this manner by applying particular colors to each segmental domain of their template to match the color patterns displayed in a guide (Williams et al., 2014). With the resulting dataset, the authors first calculated the perceptual distance between every pairwise set of bumble bee color pattern diagrams, using a method proposed by Wham et al. (2019). Then, they employed the t-SNE method (Van der Maaten and Hinton 2008) to visualize high-dimensionality distances in a two-dimensional (2D) plot. To derive data on the geographic locations of various bumble bee species in the contiguous United States, the authors extracted location information from bumble bee records stored at the Global Biodiversity Information Facility (GBIF). Using both the t-SNE plots and the geographic data, the authors found that bumble bees exhibit color mimicry patterns that are geographically clustered, but sometimes imperfect. The authors also concluded that mimicry patterns gradually transition spatially instead of exhibiting discrete boundaries. Furthermore, the authors identified that transition zones of three co-mimicking, polymorphic species (B. flavifrons, B. melanopygus, and B. bifarius), where active selection is driving phenotype frequencies, differ within a broad region of poor mimicry.

Our study is related to the above-referenced work (Ezray et al., 2019) in that we also employ the t-SNE technique (Van der Maaten and Hinton 2008). However, we do not investigate mimicry between various bumble bee species, but rather examine similarities and differences between various species of bee mimics, wasp mimics, and non-mimic insects. Furthermore, our work in this article classifies bee mimics vs. bees, which is not the case with the work of Ezray et al. (2019).

Results

We used color images (which were also converted to grayscale) to train, validate, and test our AI models. Image counts are presented in Tables 1, 2, 3, and 4. The metric we used to evaluate our models is classification accuracy, defined as the percentage of correct predictions out of the total number of predictions:

$$\text{Accuracy} = \frac{\text{number of correct predictions}}{\text{total number of predictions}} \times 100\%$$
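The accuracy definition above can be sketched in a few lines of Python. The function name `classification_accuracy` and the toy label lists are hypothetical and only illustrate the arithmetic; they are not the authors' evaluation code or data.

```python
def classification_accuracy(predicted, actual):
    """Percentage of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Toy mimic test set: every image is truly a non-bee,
# so each "bee" prediction counts as an error.
mimic_truth = ["non-bee"] * 30
mimic_preds = ["bee"] * 27 + ["non-bee"] * 3
mimic_accuracy = classification_accuracy(mimic_preds, mimic_truth)
```

With 30 images per mimic group, each correctly classified image contributes 3.33 percentage points, which is why the per-group accuracies in the tables fall on multiples of that step.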

Table 1

Insect orders and non-bee image counts

Order	Count
Hymenoptera	330
Blattodea	329
Coleoptera	337
Diptera	328
Lepidoptera	630
Odonata	329
Orthoptera	660

See Table S2 for details.

Table 2

Bumble bee species and image counts

Species	Count
affinis	247
griseocollis	250
impatiens	250
melanopygus	150
pascuorum	157
pensylvanicus	200
terrestris	250
bimaculatus	10
flavifrons	10
lucorum	10
terricola	10
vosnesenskii	10

Images in the last five rows are not used for training and validation but only used for testing. See Table S1 for details.

Table 3

Non-bumble bee genera and image counts

Genus	Count
Andrena	179
Anthidium	180
Apis	518
Centris	239
Megachile	184
Melissodes	174
Osmia	1

See Table S1 for details.

Table 4

Mimics and outgroups with image counts, categorized into three orders (from top to bottom: Coleoptera, Diptera, and Lepidoptera)

Family	Count
Scarabaeidae (bee mimics)	30
Scarabaeidae (non-mimics)	30
Asilidae (bee mimics)	30
Bombyliidae (bee mimics)	30
Syrphidae (bee mimics)	30
Syrphidae (wasp mimics)	30
Tachinidae (bee mimics)	30
Tachinidae (wasp mimics)	30
Tachinidae (non-mimics)	30
Sphingidae (bee mimics)	30
Sesiidae (wasp mimics)	30
various (non-mimics)	30
Total	360

Italics denote the non-bee mimics (i.e., wasp mimics and non-mimics). See Table S3 for species-level designations.

Classifying bees from other insects

For this problem, we chose images presented in Tables 1, 2, and 3 for training and validation. Broadly, of images in each row in the tables was used for training and validation, and the other was unseen and purely used for testing. Note that images of bumble bee species in the last five rows in Table 2 were used only for testing, and not for training or validation. We have included all the species and image counts that were used in our bee and non-bee datasets in Supplemental Tables 2 and 3. All the bee species were from the families Andrenidae, Apidae, and Megachilidae.

Results from our VGG-based AI algorithm (Simonyan and Zisserman 2015; details are elaborated in the STAR Methods section) for the classification of bees from other insects are presented in Table 5. We considered a classification as correct if the AI model identified a mimic as a non-bee. All testing results reported are for unseen images only. We trained three AI models separately on color images only, grayscale images only, and an equal combination of color and grayscale images. Each model was tested on an equal number of color and grayscale images, and the results are presented in the appropriate column in Table 5. Boldface percentages represent the top classification accuracy for each of the color and grayscale testing image sets among the three models.

We can observe in row 3 that the model trained on grayscale images yielded the best accuracies in classifying bees from non-bees (91.36% on color and 91.71% on grayscale test images). From the perspective of classifying mimics, the VGG models perform poorly: averaged over the six train/test combinations, accuracy on bee mimics ranges from 18.33% (Asilidae) to 54.45% (Lepidoptera). However, we observe that the model correctly detected non-mimics with better accuracy than mimics across color and grayscale images (as expected, owing to the morphological similarities between bee mimics and bees). For clarity, we wish to emphasize that no mimic images were used for training and validation in any of our AI models. As such, the mimic species were purely unseen by the AI. Details of this model architecture are shown in Table 6 and hyperparameters are shown in Table 7.

Table 5

Comparison of accuracy (in %) between three VGG16-based models that classify bees from non-bees, evaluated against testing images of bee mimics and non-bee mimics (italicized)

Training image type	Color		Gray		Color + Gray		Average
Testing image type	Color	Gray	Color	Gray	Color	Gray
Bees vs. non-bees	84.63	79.79	91.36	91.71	90.85	91.62	88.33

Testing accuracy

Scarabaeidae (bee mimics)	23.33	43.33	60.00	53.33	43.33	43.33	44.44
Scarabaeidae (non-mimics)	76.67	80.00	83.33	83.33	86.67	86.67	82.78
Asilidae (bee mimics)	10.00	36.67	16.67	13.33	20.00	13.33	18.33
Bombyliidae (bee mimics)	46.67	53.33	20.00	16.67	33.33	26.67	32.78
Syrphidae (bee mimics)	23.33	36.67	23.33	16.67	13.33	13.33	21.11
Syrphidae (wasp mimics)	36.67	30.00	46.67	36.67	50.00	33.33	38.89
Tachinidae (bee mimics)	23.33	30.00	30.00	20.00	16.67	13.33	22.22
Tachinidae (wasp mimics)	60.00	60.00	83.33	86.67	63.33	70.00	71.67
Tachinidae (non-mimics)	66.67	60.00	80.00	83.33	60.00	63.33	68.89
Lepidoptera (bee mimics)	30.00	50.00	56.67	60.00	70.00	60.00	54.45
Lepidoptera (wasp mimics)	66.67	83.33	73.33	70.00	70.00	70.00	72.22
Lepidoptera (non-mimics)	86.67	90.00	96.67	100.00	100.00	100.00	95.56
Average	45.83	54.44	55.83	53.33	52.22	49.44	51.95

A correct classification is when the model accurately identifies one of these insects as non-bee. Boldface percentages represent the top classification accuracy for color and grayscale images among the three models for each row. The third row presents the accuracy for classifying bees from non-bees. For this, 683 bee and 539 non-bee color images were used, which were evenly distributed across all the species. Another 683 bee and 539 non-bee grayscale versions of those same images were used for the evaluation of the models with grayscale images. The following rows denote accuracies for the same AI models in classifying mimics as in Table 4. For testing mimics, 30 color and 30 corresponding grayscale images for each of the 12 groups were used.
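To make the comparison between bee mimics and their non-mimic counterparts concrete, the following sketch recomputes the row averages for the three clades in Table 5 that have both groups, and checks whether the bee-mimic mean is the lower of the two. The numbers are transcribed from Table 5; the dictionary layout and function names are hypothetical, and this is only a descriptive comparison of reported means, not a statistical test.

```python
from statistics import mean

# Accuracy (%) for the six train/test combinations, transcribed from Table 5.
TABLE5 = {
    ("Scarabaeidae", "bee mimics"): [23.33, 43.33, 60.00, 53.33, 43.33, 43.33],
    ("Scarabaeidae", "non-mimics"): [76.67, 80.00, 83.33, 83.33, 86.67, 86.67],
    ("Tachinidae", "bee mimics"): [23.33, 30.00, 30.00, 20.00, 16.67, 13.33],
    ("Tachinidae", "non-mimics"): [66.67, 60.00, 80.00, 83.33, 60.00, 63.33],
    ("Lepidoptera", "bee mimics"): [30.00, 50.00, 56.67, 60.00, 70.00, 60.00],
    ("Lepidoptera", "non-mimics"): [86.67, 90.00, 96.67, 100.00, 100.00, 100.00],
}


def row_means(table):
    """Average each row over its six train/test combinations."""
    return {key: mean(values) for key, values in table.items()}


def mimics_below_controls(means):
    """For each clade, is the bee-mimic mean lower than the non-mimic mean?"""
    clades = sorted({clade for clade, _ in means})
    return {
        clade: means[(clade, "bee mimics")] < means[(clade, "non-mimics")]
        for clade in clades
    }


means = row_means(TABLE5)
comparison = mimics_below_controls(means)
for clade, lower in comparison.items():
    print(clade, "bee mimics lower than non-mimics:", lower)
```

In all three clades the bee-mimic mean falls below the non-mimic mean, in line with the first hypothesis stated in the Introduction.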

Table 6

VGG16-based model architecture details

Layer	Input size	Output size
VGG16 5 Conv blocks	224, 224, 3	7, 7, 512
Global_average_pooling2d	7, 7, 512	512
dense_1 (Dense)	512	256
dropout_1 (Dropout)	256	256
dense_2 (Dense)	256	128
dropout_2 (Dropout)	128	128
dense_3 (Dense)	128	64
dropout_3 (Dropout)	64	64
dense_4 (Dense)	64	2

Table 7

VGG16-based model hyperparameters

Hyperparameter	Value
Loss	Binary Cross entropy
Optimizer	Adam Optimizer
Momentum	0.9
Early training epochs	50
Early epochs learning rate	0.005
Whole model training epochs	200
Learning rate for all epochs	0.0005
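The layer stack in Table 6 and the optimizer settings in Table 7 could be assembled roughly as follows with the Keras API. This is a minimal sketch, not the authors' implementation: the dropout rate, the hidden activations, the softmax output, the weight initialization, and the reading of "Momentum 0.9" as Adam's `beta_1` are assumptions, since none of them is reported in this section. TensorFlow is imported inside the builder function so that the parameter-count helper runs without it.

```python
# Layer widths transcribed from Table 6; items marked ASSUMPTION are not in the source.
HEAD_UNITS = [256, 128, 64, 2]
DROPOUT_RATE = 0.5          # ASSUMPTION: dropout rates are not reported
HIDDEN_ACTIVATION = "relu"  # ASSUMPTION: activations are not reported


def dense_head_parameter_count(in_features=512, units=HEAD_UNITS):
    """Trainable parameters (weights + biases) in the stacked Dense layers."""
    total = 0
    for out_features in units:
        total += in_features * out_features + out_features
        in_features = out_features
    return total


def build_bee_classifier():
    """VGG16 conv blocks + global average pooling + the dense head of Table 6."""
    import tensorflow as tf  # lazy import: the helper above works without TensorFlow

    # ASSUMPTION: weights=None (random init) avoids a download in this sketch.
    base = tf.keras.applications.VGG16(
        include_top=False, weights=None, input_shape=(224, 224, 3)
    )
    layers = [base, tf.keras.layers.GlobalAveragePooling2D()]
    for width in HEAD_UNITS[:-1]:
        layers.append(tf.keras.layers.Dense(width, activation=HIDDEN_ACTIVATION))
        layers.append(tf.keras.layers.Dropout(DROPOUT_RATE))
    # ASSUMPTION: softmax over the two classes (bee, non-bee), one-hot labels.
    layers.append(tf.keras.layers.Dense(HEAD_UNITS[-1], activation="softmax"))
    model = tf.keras.Sequential(layers)
    model.compile(
        # Table 7: Adam and binary cross entropy; 0.005 is the early-epoch learning rate.
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.005, beta_1=0.9),
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model


print(dense_head_parameter_count())
```

Table 7 lists two training phases (50 early epochs at 0.005, then 200 whole-model epochs at 0.0005). Reproducing them would require recompiling with the lower learning rate before a second `fit` call; whether the early phase trained only the dense head is not stated here and would be a further assumption.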

Classifying bumble bees from non-bumble bees

For this problem, we employ a similar approach as above. The AI model used was ResNet-101 (He et al., 2016; details are elaborated in STAR Methods section). Here again, we selected of the images from the first seven species in Table 2 and the first seven genera in Table 3 for model training and validation. The remaining images were used for testing. Here also, we trained three AI models separately on color images only, grayscale images only, and an equal combination of color and grayscale images. Each model was tested on an equal number of color and grayscale images, and the results are presented in the appropriate column in Table 8. We considered a classification as correct if the AI model identified a mimic as a non-bumble bee. All testing results are for unseen images only.

Table 8

Comparison of accuracy (in %) between three ResNet-101-based models that classify bumble bees from non-bumble bee bees, evaluated against testing images of bee mimics and non-bee mimics (italicized)

Training image type	Color		Gray		Color + Gray		Average
Testing image type	Color	Gray	Color	Gray	Color	Gray
Species used in training, validation	73.48	77.82	82.33	81.99	85.78	86.87	81.38
Species not used in training, validation	86.73	86.73	94.90	93.88	92.86	92.86	91.33
Average for testing	77.90	80.79	86.52	85.95	88.14	88.86	86.36

Testing accuracy

Scarabaeidae (bee mimics)	93.33	93.33	76.67	60.00	93.33	90.00	84.44
Scarabaeidae (non-mimics)	90.00	86.67	80.00	86.67	80.00	83.33	84.45
Asilidae (bee mimics)	66.67	56.67	33.33	26.67	56.67	43.33	47.22
Bombyliidae (bee mimics)	96.67	66.67	56.67	63.33	90.00	83.33	76.11
Syrphidae (bee mimics)	100.00	96.67	96.67	96.67	100.00	100.00	98.34
Syrphidae (wasp mimics)	100.00	100.00	100.00	100.00	100.00	100.00	100.00
Tachinidae (bee mimics)	100.00	80.00	70.00	66.67	93.33	86.67	82.78
Tachinidae (wasp mimics)	100.00	100.00	96.67	100.00	100.00	100.00	99.45
Tachinidae (non-mimics)	100.00	96.67	93.33	86.67	100.00	100.00	96.11
Lepidoptera (bee mimics)	90.00	70.00	70.00	66.67	86.67	73.33	76.11
Lepidoptera (wasp mimics)	96.67	100.00	100.00	100.00	100.00	100.00	99.45
Lepidoptera (non-mimics)	66.67	56.67	90.00	66.67	86.67	70.00	72.78
Average	91.67	83.61	76.67	76.67	90.56	85.83	84.77

A correct classification is when the model accurately identifies these insects as non-bumble bee. Boldface percentages represent the top classification accuracy for color and gray images among the three models for each row. The third row presents the accuracy on a dataset comprising unseen images from those species of bumble bees and non-bumble bees used in training and validation. The fourth row presents the accuracy on a dataset comprising unseen images from species of bumble bees and non-bumble bees that were not used in training and validation. In other words, these species were also unseen by the AI model. The test set for seen species comprised 585 color and 585 corresponding grayscale images, distributed evenly among all species. The test set for unseen species comprised 98 color and 98 corresponding grayscale images, distributed evenly among all species. More details on the species used in testing are included in Tables S4 and S5. Later rows in this table denote accuracies for the same AI models in classifying mimics in Table 4. For testing mimics, 30 color and 30 corresponding grayscale images for each of the 12 groups were used.

First, we can see that the ResNet-101 architecture achieved good accuracy in classifying bumble bees from non-bumble bees, with the best testing accuracies exhibited by the model trained with color and grayscale images (88.14% on color and 88.86% on grayscale test images; average for testing, row 5). Second, we observe that the ResNet-101 models classified mimics with much higher accuracy compared to the previous VGG models (overall average of 84.77% in Table 8 vs. 51.95% in Table 5). Note that the grayscale image models performed better than color image models overall for the problem of classifying bumble bees from non-bumble bees. For the case of bee mimics, color models generally performed a little better in classification. Again, none of the mimic images were used for training and validation; thus, the mimic species are purely unseen by the AI. Hyperparameters for this model are shown in Table 9.

Table 9

ResNet-101 based model hyperparameters

Hyperparameter	Value
Loss	Binary Cross entropy
Optimizer	Adam Optimizer
Momentum	0.9
Epochs	100,000
Learning rate (up to 50,000 epochs)	0.0003
Learning rate (up to 80,000 epochs)	0.00003
Learning rate (up to 100,000 epochs)	0.000003
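The three learning-rate rows of Table 9 describe a piecewise-constant (step) schedule, which can be written as a small lookup function. The function name `resnet_learning_rate`, the choice to count epochs from 1, and treating each "up to" boundary as inclusive are assumptions made for this sketch.

```python
def resnet_learning_rate(epoch):
    """Piecewise-constant learning rate following Table 9 (epochs counted from 1)."""
    if epoch < 1 or epoch > 100_000:
        raise ValueError("Table 9 covers epochs 1 to 100,000")
    # ASSUMPTION: each "up to N epochs" boundary is inclusive.
    if epoch <= 50_000:
        return 0.0003
    if epoch <= 80_000:
        return 0.00003
    return 0.000003


for probe in (1, 50_000, 50_001, 80_001, 100_000):
    print(probe, resnet_learning_rate(probe))
```

Each step lowers the rate by a factor of ten, a common way to take large steps early in training and finer steps as the loss plateaus.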

Evaluating model fidelity

To evaluate the fidelity and explainability of our AI models, we employed the CAM technique (Zhou et al., 2016) to pinpoint which pixels in an image were most used to make a classification by the AI. The warmer (i.e., redder) a pixel is in the CAM, the higher the weight of that pixel used for classification. If the pixels highlighted in the CAM appear on anatomical components of the insect, that means that the model learned to classify correctly, while ignoring the background. From Figures 2, 3, 4, 5, and 6 for all AI models and image classes, we see that our models focused primarily on the anatomical components of the insect (for both color and grayscale images), and have learned well enough to ignore the background. Results are indeed generalizable to other images in our dataset and as such increase confidence in our AI models.
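For readers unfamiliar with class activation maps, the core computation in Zhou et al. (2016) is a weighted sum of the final convolutional feature maps, using the weights that connect each pooled channel to the class of interest. The sketch below shows that sum on a toy 2x2 example in plain Python. The feature values and weights are invented, and how the authors obtained per-channel class weights for a network with several dense layers after pooling is not described in this section, so this should be read as the generic technique rather than their pipeline.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum over channels: cam[y][x] = sum_k w[k] * feature_maps[k][y][x]."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for weight, fmap in zip(class_weights, feature_maps):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    return cam


def normalize(cam):
    """Rescale to [0, 1] so the map can be rendered as a heat map."""
    flat = [value for row in cam for value in row]
    low, high = min(flat), max(flat)
    if high == low:
        return [[0.0] * len(row) for row in cam]
    return [[(value - low) / (high - low) for value in row] for row in cam]


# Toy example (made-up numbers): two 2x2 feature maps.
# Channel 0 fires on the top-left cell, channel 1 on the bottom-right cell.
features = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 1.0]],
]
weights_for_bee = [2.0, 0.5]  # hypothetical class weights
heat = normalize(class_activation_map(features, weights_for_bee))
print(heat)
```

In practice the low-resolution map (7 x 7 for the VGG16 blocks in Table 6) is upsampled to the input size and overlaid on the photo, so that the warmest regions can be compared with the insect's anatomy.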

Figure 2

Non-bee insects

Citizen science photos (top) and class activation maps (bottom) of non-bee insects, using the bee vs. non-bee classifier (VGG16-based model, trained with color images). Species and classifications, from left: Ectobius vittiventris (correct), Hapithus agitator (correct), Uropetala carovei (correct).

Figure 3

Non-bumble bee bees

Citizen science photos (top) and class activation maps (bottom) of non-bumble bee bees using the bee vs. non-bee classifier (VGG16-based model, trained with color images). Species and classifications, from left: Andrena cineraria (correct), Apis dorsata (correct), Megachile mendica (correct).

Figure 4

Bumble bees

Citizen science photos (top) and class activation maps (bottom) of bumble bees using the bee vs. non-bee classifier (VGG16-based model, trained with color images). Species and classifications, from left: Bombus impatiens (correct), Bombus impatiens (correct), Bombus pensylvanicus (correct).

Figure 5

Bee mimics (Asilidae)

Citizen science photos (top) and class activation maps (bottom) of bee mimics from the family Asilidae using the bee vs. non-bee classifier (VGG16-based model, trained with color images). Species and classifications, from left: Mallophora leschenaulti (incorrect), Mallophora leschenaulti (incorrect), Mallophora leschenaulti (grayscale) (correct).

Figure 6

Bee mimics (Scarabaeidae)

Citizen science photos (top) and class activation maps (bottom) of bee mimics from the family Scarabaeidae using the bee vs. non-bee classifier (VGG16-based model, trained with color images). Species and classifications, from left: Trichius fasciatus (grayscale) (incorrect), Trichius gallicus (incorrect), Trichius sexualis (correct).

Visualizing high-dimensional image data in two dimensions using t-distributed Stochastic Neighbor Embedding

We plotted the t-SNE coordinates onto a 2D graph and color-coded them by phylogenetic and mimetic grouping (Figure 7). We observed that those data points yielded twelve individual clusters, with no outliers. Furthermore, multiple clusters within a given clade (e.g., Lepidoptera) were also grouped together.

Figure 7

t-Distributed Stochastic Neighbor Embedding

t-SNE plot of bee mimics (n = 180), wasp mimics (n = 90), and non-mimics (n = 90) from Table 2, with the phylogenetic tree overlaid. Note the perfect clustering within each group, as well as the clustering between groups that grossly corresponds to the phylogeny. Evolutionary relationships follow (Wiegmann et al., 2009), (Gunter et al., 2016), (Powell 2009), (Penney et al., 2012), and (Blaschke et al., 2018).


Training, hardware, and inference time

Training and validation were conducted on a GPU cluster of four Nvidia GeForce GTX TITAN X cards, each with CUDA cores and 12 GB of memory (Nvidia). Training and validating the VGG16-based bee vs. non-bee model took around 28 h, and the ResNet-101-based bumble bee vs. non-bumble bee model took 46 h. Inference time for a single image was less than a second for both models.

Discussion

The ability to distinguish bees from other insects, especially bee mimics, has important applications for conservation, education, and AI in biology. Overall, our accuracies reach for classifying bees from non-bees, as well as bumble bees from non-bumble bees, when grayscale images were included in the training. For the mimics studied, the bee vs. non-bee model (VGG16-based) yielded lower classification accuracies than the bumble bee vs. non-bumble bee model (ResNet-101-based) (Tables 5 and 8). This difference could have several causes. First, the neural network architectures differ: the ResNet-101 model is much heavier, with many more layers, and is better suited to learning finer-grained discriminators than the lighter-weight VGG models. Second, the difference could be owing to the phylogenetic specificity of both the bumble bee and non-bumble bee classes. In other words, the non-bumble bee class was trained using non-bumble bee bees, as opposed to the more general non-bee insects used to train the bee model. It should also be noted that the great majority of the bee mimics tested were bumble bee mimics, compared to others such as the several honey bee mimics (i.e., Eristalis spp.). One surprising finding from the bumble bee model was that unseen species had higher accuracies than seen species, for both true negatives and true positives (Table 8: rows 4 vs. 3; Supplemental Information). We hypothesize that this counterintuitive result may be due in part to selection bias. Specifically, the unseen non-bumble bee species used in testing may be more morphologically derived, presumably exhibiting traits that are further away from the bumble bee gestalt that the AI was trained to recognize. For example, the two unseen Apis species, which have accuracies of 100% (Supplementary Table 5), include the anatomically distinct black dwarf honey bee, Apis andreniformis.
Among the true positives, the unseen species of Bombus may have more prominent versions of diagnostic bumble bee traits recognized by the AI, such as hair. For example, the unseen species B. flavifrons has unusually long and uneven hair lengths (Koch 2012). Results supported our first hypothesis, as the bee mimics within all four relevant clades—Scarabaeidae, Syrphidae, Tachinidae, and Lepidoptera—had the lowest classification accuracy compared to their wasp mimic and non-bee mimic counterparts, in both bee and bumble bee models (Tables 5 and 8). The only exception to this was that the bumble bee model yielded a lower accuracy for the Lepidoptera non-mimics compared to bee mimics (72.78% vs. 76.11%; also, the difference was negligible for Scarabaeidae). Our second hypothesis was not supported, as the accuracy of the wasp mimics was intermediate between bee mimics and non-mimics in only one of the four three-way comparisons across both models. Indeed, for the bumble bee model, this included identical near-perfect classification (99.45%) for the wasp mimics within Tachinidae and Lepidoptera (not to mention 100% classification of the wasp mimics in Syrphidae). Note also that this essentially perfect classification of wasp mimics in all three clades (Syrphidae, Tachinidae, Lepidoptera) was across all three training image types (color, grayscale, color + grayscale) and both testing image types (color, grayscale). Thus, while the bee mimics were better at fooling both algorithms, the phenotypic divergence of the wasp mimics may have been more easily detected (i.e., less confused for bees) compared to the non-mimics. Phylogeny may play a non-mutually exclusive role within Tachinidae as well, as the wasp mimics and non-mimics are sister groups (Figure 7), and have more similar accuracies compared to those of the bee mimics (based on both models; Tables 5 and 8). 
All of the bee and wasp mimicry herein is considered to be defensive (Batesian), enabling an aposematic warning to would-be predators by falsely appearing to be a stinging taxon. In terms of fooling the AI model (i.e., convincing the algorithm that a mimic was actually a bee), the poorest performance was exhibited by images of bee hawkmoths (Hemaris, Sphingidae, Lepidoptera) in the bee model (54.45% of images were correctly classified as a mimic), and the bee beetles (Trichius, Scarabaeidae) in the bumble bee model (84.44% of images were correctly classified as a mimic) (Tables 5 and 8). Conversely, the best bee mimicry, as defined by the lowest AI classification accuracy, was exhibited by Asilidae in both models (18.33%, 47.22%). This robber fly family is represented herein by three species from the genus Mallophora, known as the bee killers (Brower et al., 1960), (Bromley 1930). These flies mimic the color, form, and hairy appearance of their bumble bee and carpenter bee prey (Figure 5), as well as exhibit bee-like flight behavior and buzz (Linsley 1960), (Bromley 1930). For one of our species, M. bomboides, Brower et al., (1960) experimentally demonstrated that this mimicry is defensive (Batesian). These authors proposed that this mimicry is also aggressive, in a non-mutually exclusive manner, and operates at two possible levels: during the known predation of the adult bees, and during the presumed oviposition in bee nests (the larval feeding habits are unknown). Assuming that such aggressive mimicry is operating on one or both levels in our Mallophora (in addition to defensive mimicry), the superior bee mimicry of these bee killers confirms our third hypothesis. Ultimately, such two- or three-way selective pressure may be responsible for driving the evolution of more accurate mimicry within this genus, and perhaps other asilids. 
It is also worth noting that in addition to being non-mutually exclusive, the co-existence of defensive and aggressive mimicry could also provide a positive evolutionary feedback loop. Theoretically, asilid predation upon bees would assure sympatry (Brower et al., 1960), which in turn would enable defensive mimicry (i.e., mimics and their models must co-occur), which in turn could be exapted for aggressive mimicry to enhance predation. We also advocate for a less restrictive definition of aggressive mimicry (per the original Poulton 1890: p.268) than what is sometimes followed in the literature. Namely, that mimicry should be considered aggressive if it fools any prey species (not just the model species)—just as mimicry would be considered defensive if it fools any predator species. For adult prey, Mallophora prefers aculeate Hymenoptera (Bromley 1930), and if the former’s mimicry fatally dupes a member of the latter, it should be considered no less aggressive if that prey is a honey bee rather than a bumble bee. Such an inclusive definition would apply to the putative mimicry-enabled egg-laying as well. Fundamentally, the phenomenon of aggressive mimicry in bee mimics is understudied and not well-understood, so further investigations would be welcomed. Findings from the three other groups—if they do, indeed, exhibit aggressive mimicry—would also be consistent with our third hypothesis. Bombyliidae is the sister group to Asilidae, and represents a large clade known as the bee flies (15 subfamilies, 5K described species; (Evenhuis and Greathead, 1999)). Members of Bombyliidae may theoretically include aggressive mimics, but only in the Kirbyan sense: while their adult forms do not feed upon adult bees like those of Mallophora do, Bombylius (Bombyliidae) larvae are ectoparasitoids in the nests of bees (Yeates and Greathead 1997). Larvae of some Syrphidae and the bee mimic genus Tachina also parasitize bees (Packard, 1868). 
Interestingly, these three groups exhibit accuracy values intermediate between the bee mimics in Asilidae and in Lepidoptera/Scarabaeidae, in the bee model. This pattern is not the case in the bumble bee model, given the high accuracy of the syrphids and tachinids, which may be owing to their mimicking of honey bees, rather than bumble bees as in the bombylids. Among the incorrect classifications—i.e., when the mimicry successfully fooled the AI model—within the Scarabaeidae (represented here by Trichius), the CAMs were often located on the bee-like hairy thorax, or the elytra (Figure 6, left). Elytra are the modified, hardened forewings of the beetle order Coleoptera, which is interesting in that this differentiates the order from both the bee mimics in Diptera and the bees in Hymenoptera. In Trichius, the elytra mimic the gold and black banding pattern of bee abdomens (Figure 6). We can quantify the role of elytra color (4%) vs. pattern (44%) based on the percentage of images in which the CAMs selected the elytra in the color incorrect images compared to the grayscale incorrect images: 48% (10/21) vs. 44% (7/16), respectively. Interestingly, the VGG16-based model trained only on color images performed much better on grayscale than color versions of the testing images (43.33 vs. 23.33%; Table 5). When comparing training with color images vs. grayscale images, we notice a trend of grayscale images giving a consistently similar performance. The exceptions to this include the Bombyliidae in Table 5, and, interestingly, the bee mimics in Table 8. The latter demonstrates that the algorithm was able to better detect these mimics as non-bumble bees on the basis of color. Conversely, for our entire testing set, the grayscale training yielded higher classification accuracies in all models, suggesting that color can possibly confound AI models. This also means that the algorithms learn from the lighter-weight grayscale images as well. 
This phenomenon can be explained by the fact that, when color is not relevant to a classification problem, AI models can work just as well with grayscale images (or sometimes better) (Čadík 2008) (Kanan and Cottrell 2012) (Xie and Richmond 2018) (Yohanandan et al., 2018). In the case of insect morphology, pattern and shape can play a critical role in classification; for example, the very unique -shaped area of hairs on the thorax of B. affinis. For detecting such markers, grayscale images may be sufficient. Also, AI models may be confused by the colors appearing in images, which is likely to happen in citizen-generated photos owing to diverse backgrounds, inconsistent lighting, varying camera capabilities, and more. Grayscale images may overcome these issues, which aids in building robust AI models for problems related to the analysis of insect morphology. Herein, it should be noted that converting one of the Mallophora images to grayscale turned an incorrect classification into a correct classification (Figures 5B and 5C). With respect to the t-SNE results (Figure 7), it is interesting that there was perfect clustering within each group, with no mismatched data points. Such clustering is surprising within the paraphyletic groups (Scarabaeidae non-mimics, Syrphidae wasp mimics) and especially the Lepidoptera non-mimics, which comprise five distinct families within four superfamilies. Furthermore, given the imperfect mimicry within Syrphidae (Penney et al., 2012), it is notable that there was such a clear separation of the bee mimic and wasp mimic clusters. This finding is inconsistent with the multi-model hypothesis (Edmunds 2000), which would predict overlap owing to the imperfect mimicry of multiple models. This separation also corroborates the findings of Penney et al. (2012), who found no syrphids intermediate in appearance between hymenopteran models.
It is also intriguing that the pattern of clustering between these bee mimic groups grossly corresponds to the evolutionary relationships, as denoted by the phylogenetic tree (Figure 7). Specifically, all within-clade clusters group together for Scarabaeidae, Lepidoptera, Syrphidae, Tachinidae, and even the Asilidae + Bombyliidae clade, which represents the superfamily Asiloidea.

Limitations of the study

We would like to note that distance within the t-SNE plot cannot be used as an exact proxy for phenotypic disparity, owing to the probabilistic nature of the approach (Wattenberg et al., 2016), (Van der Maaten and Hinton 2008). Phenotypic disparity is also an imperfect proxy for phylogenetic distance, especially in the case of convergent morphotypes. Lastly, one limitation of our bumble bee model is that all of the training images were of bees (bumble bees and non-bumble bee bees), with no training images of non-bees.

Conclusion

By applying machine learning techniques within an evolutionary framework—from our integration of t-SNE and phylogeny, to grayscale images and CAMs—we can take the first step from “explainable AI” to “explainable mimicry.” This includes decoupling the role of color from pattern and elucidating diagnostic anatomical regions. We need to acknowledge, however, that the features learned by our deep learning models operating on images may have little to no functional significance in the evolutionary history of these insects. Nonetheless, these methods provide new avenues for addressing the long-standing challenge of “quantifying the extent of mimetic fidelity between mimics and models” (Penney et al., 2012). Furthermore, real-time deployment of the approaches herein—such as providing immediate feedback and visualizations—could enhance the operational efficiency of citizen scientist-driven identification of bees and their mimics, as well as help pique interest and engagement among the general public. For example, alerting users of critically endangered species such as B. affinis could have a direct impact on sustaining efforts for conservation. Ultimately, with ever-expanding datasets of crowdsourced images, coupled with expected advances in computer vision techniques, foundational insights and applications are possible in the near future and beyond.

STAR★Methods

Key resources table

Resource availability

Lead contact

Requests for resources or information should be directed to the lead contact, Tanvir Bhuiyan (bhuiyan@usf.edu).

Material availability

No reagents were generated in the study.

Method details

Herein we designed two separate convolutional neural network (CNN) architectures for two problems: 1) classifying bees vs. other insects (including mimics) and 2) classifying bumble bees vs. non-bumble bees.

Data pre-processing

Our images were sourced exclusively from Research Grade observations on the citizen science platform iNaturalist. Images there are typically large, with a maximum of up to pixels in the longest dimension. Processing images of these sizes can be complex and time-consuming. To speed up learning without compromising accuracy, we reduced the size of each image to pixels in the longest dimension. To evaluate the effect of color on classification accuracy, we trained our models on color images, grayscale images, and a combination of color and grayscale images. The first dataset retained the original colors of the images from iNaturalist. For the second dataset, we converted all the images to grayscale using OpenCV (Culjak et al., 2012). In the third dataset, we combined the color and grayscale versions of all images, thereby doubling the training dataset. We did the same for the testing dataset. The procedure was executed for both problems. Note that we did not perform data augmentation on the images in our dataset. Typically, data augmentation helps where the difference between two classes is vivid and clear (for example, distinguishing a cat from a dog). In such cases, adding noise will not hurt the larger context of the images used for classification. In our problem, however, the images were of insects, which have very fine-grained and minute signatures for classification. As such, adding artificial noise could change the overall context of the images. Hence, we did not attempt data augmentation in this paper.
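The resizing and grayscale steps described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the study used OpenCV, whereas this sketch uses plain NumPy with the BT.601 luma weights (0.299, 0.587, 0.114) that OpenCV documents for its RGB-to-gray conversion. The helper names `target_size` and `rgb_to_gray` and the `longest_side` argument are hypothetical, and the actual pixel limit used in the study is not assumed here.

```python
import numpy as np


def target_size(width, height, longest_side):
    """Return (new_width, new_height) so the longest dimension is at most
    `longest_side`, preserving the aspect ratio. Images are never enlarged."""
    scale = min(1.0, longest_side / max(width, height))
    return max(1, round(width * scale)), max(1, round(height * scale))


def rgb_to_gray(image_rgb):
    """Convert an (H, W, 3) uint8 RGB image to an (H, W) uint8 grayscale image
    using the BT.601 luma weights."""
    weights = np.array([0.299, 0.587, 0.114])
    gray = image_rgb.astype(np.float64) @ weights
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)


# Hypothetical usage: a tiny synthetic image stands in for an iNaturalist photo.
color_image = np.zeros((2, 2, 3), dtype=np.uint8)
color_image[0, 0] = (255, 0, 0)        # pure red pixel
color_image[1, 1] = (255, 255, 255)    # white pixel
gray_image = rgb_to_gray(color_image)

# The "color + grayscale" dataset keeps both versions of every image,
# which doubles the number of samples.
combined_dataset = [color_image, gray_image]
```

Keeping the resize as a pure size calculation makes it easy to check the aspect ratio before handing the numbers to whichever image library performs the interpolation.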

VGG16-based CNN for classifying bees from other insects

Our VGG16-based model was pre-trained on the “ImageNet” dataset (Deng et al., 2009). The VGG16 architecture consists of five blocks of convolutional layers, followed by three fully connected layers (Simonyan and Zisserman 2015). It is simpler in size and complexity than most other standard CNN architectures. For our problem, after the five blocks of convolutional layers, we added one global pooling layer and four fully connected dense layers. Details of our architecture are shown in Table 6, where the first row represents the last layer of the base VGG16 architecture, up to which the architecture was fixed. Training proceeded in two phases. For the first 50 epochs, we froze the pre-existing weights in the base architecture and trained only the weights of the layers added after the fifth block, with a learning rate of 0.005. Then, the weights in the entire architecture were unfrozen and retrained for 200 epochs, with a learning rate of 0.0005. Table 7 presents the critical hyperparameters in our architecture during training and validation. The loss function is the binary cross entropy loss, given by $-\log(p_t)$, where $p_t$ is the model-estimated probability of the ground truth class; this loss is what we minimize during training and validation. We monitored, but did not explicitly record, the decrease in the loss function; we stopped training when the loss function (and accuracy) saturated.
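As a small worked illustration of the loss and the two-phase schedule described above, the sketch below computes the binary cross entropy for a single prediction as the negative log of the probability assigned to the ground truth class. It is not the authors' training code: the function name, the clipping constant `eps`, and the dictionary layout are assumptions made for illustration, and the "rate" values from the text are interpreted here as learning rates.

```python
import math


def binary_cross_entropy(y_true, p_pred, eps=1e-7):
    """Binary cross entropy for one example.

    y_true is 0 or 1; p_pred is the predicted probability of class 1.
    p_t is the probability the model assigns to the ground truth class,
    and the loss is -log(p_t). Clipping avoids log(0).
    """
    p = min(max(p_pred, eps), 1.0 - eps)
    p_t = p if y_true == 1 else 1.0 - p
    return -math.log(p_t)


# Two-phase fine-tuning schedule as stated in the text (epochs and rates
# are from the source; the key names are hypothetical).
TRAINING_PHASES = [
    {"trainable": "new layers only", "epochs": 50, "learning_rate": 0.005},
    {"trainable": "all layers", "epochs": 200, "learning_rate": 0.0005},
]

confident_correct = binary_cross_entropy(1, 0.9)   # small loss
confident_wrong = binary_cross_entropy(1, 0.1)     # large loss
```

A confident correct prediction gives a loss near zero, while a confident wrong one is penalized heavily, which is why the loss saturating is a reasonable stopping signal.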

ResNet-101-based CNN for classifying bumble bees from other bees

Classifying bumble bees from non-bumble bees is a more complex problem, since there are subtler differences between these two classes than between bees and other insects. For this problem, the CNN architecture that worked best in our study was the more complex ResNet-101. Here too, our ResNet-101 model was not trained from scratch: we started from the version pre-trained on the “ImageNet” dataset, with its base weights. ResNet-101 is a CNN with residual connections, wherein each layer, instead of feeding only into the next layer, also directly feeds into layers further down (He et al., 2016). This design specifically improves learning at later layers. It is thus a more complex architecture, with 101 layers. This architecture provided optimal results in its current form; therefore, we did not change the architecture, but changed only the weights via training and validation. Table 9 presents the critical hyperparameters in this architecture for training and validation. The loss function is once again the binary cross entropy loss. We monitored, but did not explicitly record, the decrease in the loss function; we stopped training when the loss function (and accuracy) saturated.
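The idea of a residual connection can be illustrated with a toy example. The sketch below uses small dense layers in NumPy rather than the convolutional bottleneck blocks of the real ResNet-101, so it is only a schematic of the skip path, with hypothetical names and shapes.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def residual_block(x, w1, w2):
    """Toy residual block: output = relu(F(x) + x).

    F(x) is two linear maps with a ReLU in between. The "+ x" term is the
    skip connection that lets the input bypass the block.
    """
    hidden = relu(x @ w1)
    return relu(hidden @ w2 + x)


x = np.array([[1.0, -2.0, 3.0]])

# With all-zero weights, F(x) is zero and only the skip path carries signal.
zero_weights = np.zeros((3, 3))
skip_only = residual_block(x, zero_weights, zero_weights)

# With random weights the block still preserves the input shape,
# which is what allows many such blocks to be stacked.
rng = np.random.default_rng(0)
stacked = residual_block(
    residual_block(x, rng.normal(size=(3, 3)), rng.normal(size=(3, 3))),
    rng.normal(size=(3, 3)),
    rng.normal(size=(3, 3)),
)
```

Because the input is added back to the block's output, information and gradients have a direct route through the network even when an individual block contributes little.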

CAM

To better understand how our models interpret pixels in an image for classification, we adopted the technique of class activation maps (CAM) (Zhou et al., 2016). The CAM technique gives each pixel a weight that indicates the significance of that pixel toward classification. To execute the technique, we compute the output features generated at the final convolutional layer of the CNN. Then, we traverse back in the architecture (from the end of the last convolutional layer) to determine the weight (probability) of each pixel in the image that was used for classification. A higher weight for a pixel is shown as a redder color in the CAM, meaning that the pixel was a more significant factor in classification. Pixels with lower weight appear comparatively bluer; these pixels were not dominant in classification. The CAM model was based on our VGG16 model that was trained on color images.
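The core computation of a class activation map can be sketched as a weighted sum of the final convolutional feature maps, following the formulation of Zhou et al. (2016). This sketch assumes a single linear layer after global average pooling; the network described in this paper has several dense layers after pooling, so the authors' exact procedure may differ. All names and array shapes are hypothetical.

```python
import numpy as np


def class_activation_map(features, class_weights):
    """Weighted sum of feature maps, rescaled to [0, 1] for display.

    features: (H, W, K) activations from the final convolutional layer.
    class_weights: (K,) weights linking each feature map to the target class.
    """
    cam = np.tensordot(features, class_weights, axes=([2], [0]))
    cam = cam - cam.min()
    peak = cam.max()
    return cam / peak if peak > 0 else cam


def upsample_nearest(cam, factor):
    """Nearest-neighbor upsampling so the map can be overlaid on the image."""
    return np.repeat(np.repeat(cam, factor, axis=0), factor, axis=1)


# Synthetic example: channel 0 fires at (1, 2) and supports the class,
# channel 1 fires at (3, 0) and counts against it.
features = np.zeros((4, 4, 2))
features[1, 2, 0] = 5.0
features[3, 0, 1] = 5.0
class_weights = np.array([1.0, -1.0])

cam = class_activation_map(features, class_weights)
hottest = np.unravel_index(cam.argmax(), cam.shape)
overlay = upsample_nearest(cam, 2)
```

Values near 1 would be rendered in red and values near 0 in blue when the map is drawn with a standard heat-map color scale.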

t-SNE

t-Distributed Stochastic Neighbor Embedding (t-SNE) (Van der Maaten and Hinton 2008) is an algorithm developed as an improvement over Stochastic Neighbor Embedding (Hinton and Roweis 2002). t-SNE is an unsupervised, non-linear technique for dimensionality reduction, primarily used for visualizing high-dimensional data (in our case, RGB values from image pixels). In other words, t-SNE gives an intuition of how high-dimensional data points are related by representing them in a low-dimensional space. Compared to many other non-parametric visualization techniques (e.g., Sammon mapping, principal components analysis, isomap, locally linear embedding), t-SNE has proved more robust and significantly more effective for high-dimensional data visualization (Chatzimparmpas et al., 2020). t-SNE can be used for data visualization in a wide range of applications including biomedical signal processing, genomics, computer security research, bioinformatics, cancer research, and music analysis (Chatzimparmpas et al., 2020). To generate the t-SNE plot, we started with a base neural network architecture, from which the t-SNE algorithm runs two sequential phases. In the first phase, t-SNE builds a probability distribution matrix for the data points, which consist of the RGB values from the image pixels. Each pair of distinct data points is considered, and a probability value is generated for each pair. If the two objects in a pair are highly similar, a large probability value is assigned; otherwise, the probability value is small. In the second phase, t-SNE places those data points in a lower-dimensional space and generates another probability distribution, following a procedure similar to that of the first phase. The algorithm then tries to minimize the loss, or difference, between the two probability distributions with respect to the locations on the map.
To accomplish that, the algorithm calculates the Kullback-Leibler divergence (KL divergence) (Perez-Cruz 2008) and minimizes it over several iterations. To visualize phenotype vis-a-vis phylogeny, we also plotted the data points from images of bee mimics and outgroups using the t-SNE algorithm on a 2D graph, and overlaid the phylogenetic tree that illustrates the evolutionary relationships. To build that graph, we first trained a twelve-class VGG16-based deep-learning classifier (similar architecture to that in Table 6) with 360 bee mimic images. We then extracted features from the final convolutional layer for all 360 images. These features are a matrix of size . We flattened the data to a 100,352-element array for each image. Then we ran the t-SNE algorithm (per the steps above) over the 360 flattened feature arrays, resulting in 2D coordinates for each image, which were then color-coded by phylogenetic and mimetic group.
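The two phases and the KL divergence objective can be illustrated with a deliberately simplified sketch. Real t-SNE chooses a per-point Gaussian bandwidth from a perplexity setting, symmetrizes conditional probabilities, and optimizes the embedding by gradient descent; here a single fixed `sigma` is used and two hand-made embeddings are merely compared, so this is an illustration of the objective rather than an implementation of the algorithm. The function names are hypothetical. The snippet also checks one piece of arithmetic: assuming a standard 224 × 224 input (an assumption, not stated in this section), the last VGG16 convolutional layer outputs 14 × 14 × 512 values, which matches the flattened size of 100,352 reported above.

```python
import numpy as np

# Assumed VGG16 final-conv output shape for a 224 x 224 input.
FLATTENED_SIZE = 14 * 14 * 512


def pairwise_sq_dists(points):
    sq_norms = np.sum(points ** 2, axis=1)
    dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * points @ points.T
    return np.maximum(dists, 0.0)


def high_dim_affinities(points, sigma=1.0):
    """Phase 1: Gaussian similarities between all pairs, normalized to sum to 1."""
    p = np.exp(-pairwise_sq_dists(points) / (2.0 * sigma ** 2))
    np.fill_diagonal(p, 0.0)
    return p / p.sum()


def low_dim_affinities(embedding):
    """Phase 2: Student-t (one degree of freedom) similarities in the map."""
    q = 1.0 / (1.0 + pairwise_sq_dists(embedding))
    np.fill_diagonal(q, 0.0)
    return q / q.sum()


def kl_divergence(p, q, eps=1e-12):
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / np.maximum(q[mask], eps))))


# Four points forming two tight pairs in 3D.
data = np.array([[0.0, 0, 0], [0.1, 0, 0], [5.0, 5, 5], [5.1, 5, 5]])
p = high_dim_affinities(data)

good_map = np.array([[0.0, 0], [0.1, 0], [5.0, 5], [5.1, 5]])  # pairs kept together
bad_map = np.array([[0.0, 0], [5.0, 5], [0.1, 0], [5.1, 5]])   # pairs split apart

kl_good = kl_divergence(p, low_dim_affinities(good_map))
kl_bad = kl_divergence(p, low_dim_affinities(bad_map))
```

The embedding that keeps neighbors together has the lower divergence, which is the quantity t-SNE reduces iteratively when it moves points around the map.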

Quantification and statistical analysis

The number of images used in our study from iNaturalist was color images. These were also then converted to grayscale. Together, these images encompassed our dataset for training, validation, and testing. We designed AI models for color only, grayscale only, and a combination of color and grayscale images. Training and validation together used 80% of the images, while the remaining 20% were used for testing. Note that images used for testing were completely unseen by the AI models during training and validation. The overall metric used to assess classification is denoted as Accuracy, which is computed based on True Positives, True Negatives, False Positives, and False Negatives, and is defined in Equation 1. In addition, model fidelity was evaluated using Class Activation Maps, which highlight pixels with different colors based on the priority of those pixels for classification. Here, the warmer (i.e., more reddish) a pixel appears in the CAM, the higher the priority of that pixel for classification. Ideally, we will see more red pixels in areas surrounding the insect anatomy if the model has high fidelity. We employed the t-Distributed Stochastic Neighbor Embedding (t-SNE) technique to study mimics. t-SNE is an unsupervised, non-linear technique for dimensionality reduction, primarily used for visualizing high-dimensional data. We employed this method in our study to understand the clustering of various mimic groups. Our study yielded perfect clustering within each group, as well as clustering between groups that grossly corresponds to the phylogeny.
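For reference, the standard definition of accuracy from the four confusion-matrix counts, which is what the description above corresponds to, is (TP + TN) / (TP + TN + FP + FN). A minimal sketch follows; the function names and the example labels are hypothetical and do not reproduce any result from the study.

```python
def confusion_counts(y_true, y_pred):
    """Count (TP, TN, FP, FN) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no predictions to score")
    return (tp + tn) / total


# Made-up labels: 1 = bee, 0 = non-bee.
labels = [1, 1, 0, 0]
predictions = [1, 0, 0, 1]
counts = confusion_counts(labels, predictions)
score = accuracy(*counts)
```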

REAGENT or RESOURCE	SOURCE	IDENTIFIER
Deposited data

Training, validation and testing data	(Bhuiyan, 2022)“Bee-Non-bee-Dataset”, Mendeley Data: https://doi.org/10.17632/jykt6862s4.1.	https://data.mendeley.com/datasets/jykt6862s4/1

Software and algorithms

Python	van Rossum, G., 1995. Python reference manual. Department of Computer Science [CS], (R 9525).	https://www.python.org/
Tensorflow	Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard, M. and Kudlur, M., 2016. TensorFlow: a system for large-scale machine learning. In 12th USENIX symposium on operating systems design and implementation (OSDI 16) (pp. 265–283).	https://www.tensorflow.org/
OpenCV	OpenCV team	https://opencv.org/
Matplotlib	J. Hunter, “Matplotlib: A 2D Graphics Environment” in Computing in Science & Engineering, vol. 9, no. 03, pp. 90–95, 2007. https://doi.org/10.1109/MCSE.2007.55	https://matplotlib.org/
NumPy	Harris, C.R., Millman, K.J., van der Walt, S.J. et al. Array programming with NumPy. Nature 585, 357–362 (2020). https://doi.org/10.1038/s41586-020-2649-2	https://numpy.org/
Code for this paper	(Bhuiyan et al., 2022) Bee Classifier, Zenodo: https://doi.org/10.5281/zenodo.6965250	https://doi.org/10.5281/zenodo.6965250

12 in total

1. A comparative analysis of the evolution of imperfect mimicry.

Authors: Heather D Penney; Christopher Hassall; Jeffrey H Skevington; Kevin R Abbott; Thomas N Sherratt
Journal: Nature Date: 2012-03-21 Impact factor: 49.962

Review 2. Global pollinator declines: trends, impacts and drivers.

Authors: Simon G Potts; Jacobus C Biesmeijer; Claire Kremen; Peter Neumann; Oliver Schweiger; William E Kunin
Journal: Trends Ecol Evol Date: 2010-02-24 Impact factor: 17.712

3. Unsupervised machine learning reveals mimicry complexes in bumblebees occur along a perceptual continuum.

Authors: Briana D Ezray; Drew C Wham; Carrie E Hill; Heather M Hines
Journal: Proc Biol Sci Date: 2019-09-11 Impact factor: 5.349

4. t-viSNE: Interactive Assessment and Interpretation of t-SNE Projections.

Authors: Angelos Chatzimparmpas; Rafael M Martins; Andreas Kerren
Journal: IEEE Trans Vis Comput Graph Date: 2020-04-13 Impact factor: 4.579

5. If Dung Beetles (Scarabaeidae: Scarabaeinae) Arose in Association with Dinosaurs, Did They Also Suffer a Mass Co-Extinction at the K-Pg Boundary?

Authors: Nicole L Gunter; Tom A Weir; Adam Slipinksi; Ladislav Bocak; Stephen L Cameron
Journal: PLoS One Date: 2016-05-04 Impact factor: 3.240

6. Widespread exploitation of the honeybee by early Neolithic farmers.

Authors: Mélanie Roffet-Salque; Martine Regert; Richard P Evershed; Alan K Outram; Lucy J E Cramp; Orestes Decavallas; Julie Dunne; Pascale Gerbault; Simona Mileto; Sigrid Mirabaud; Mirva Pääkkönen; Jessica Smyth; Lucija Šoberl; Helen L Whelton; Alfonso Alday-Ruiz; Henrik Asplund; Marta Bartkowiak; Eva Bayer-Niemeier; Lotfi Belhouchet; Federico Bernardini; Mihael Budja; Gabriel Cooney; Miriam Cubas; Ed M Danaher; Mariana Diniz; László Domboróczki; Cristina Fabbri; Jesus E González-Urquijo; Jean Guilaine; Slimane Hachi; Barrie N Hartwell; Daniela Hofmann; Isabel Hohle; Juan J Ibáñez; Necmi Karul; Farid Kherbouche; Jacinta Kiely; Kostas Kotsakis; Friedrich Lueth; James P Mallory; Claire Manen; Arkadiusz Marciniak; Brigitte Maurice-Chabard; Martin A Mc Gonigle; Simone Mulazzani; Mehmet Özdoğan; Olga S Perić; Slaviša R Perić; Jörg Petrasch; Anne-Marie Pétrequin; Pierre Pétrequin; Ulrike Poensgen; C Joshua Pollard; François Poplin; Giovanna Radi; Peter Stadler; Harald Stäuble; Nenad Tasić; Dushka Urem-Kotsou; Jasna B Vuković; Fintan Walsh; Alasdair Whittle; Sabine Wolfram; Lydia Zapata-Peña; Jamel Zoughlami
Journal: Nature Date: 2015-11-12 Impact factor: 49.962