Maggie M Hantak1, Robert P Guralnick1, Alina Zare2, Brian J Stucky1,3. 1. Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA. 2. Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611, USA. 3. Agricultural Research Service, U.S. Department of Agriculture, Beltsville, MD 20705, USA.
Abstract
Openly available community science digital vouchers provide a wealth of data to study phenotypic change across space and time. However, extracting phenotypic data from these resources requires significant human effort. Here, we demonstrate a workflow and computer vision model for automatically categorizing species color pattern from community science images. Our work is focused on documenting the striped/unstriped color polymorphism in the Eastern Red-backed Salamander (Plethodon cinereus). We used an ensemble convolutional neural network model to analyze this polymorphism in 20,318 iNaturalist images. Our model was highly accurate (∼98%) despite image heterogeneity. We used the resulting annotations to document extensive niche overlap between morphs, but wider niche breadth for striped morphs at the range-wide scale. Our work showcases key design principles for using machine learning with heterogeneous community science image data to address questions at an unprecedented scale.
Species color patterns represent model systems for understanding evolution because color is a quantifiable biological trait that provides pertinent information about the organism. For instance, color patterns are used as a signal in mate choice and predator-prey interactions, and can aid in thermoregulation (Endler and Mappes, 2017). Color polymorphic species, in which multiple phenotypes (i.e., color morphs) coexist within the same population (Ford, 1945), make particularly good models for studying evolutionary change, as color patterns are discrete, and color morph frequency often varies geographically (McLean and Stuart-Fox, 2014). Furthermore, morphs comprise correlated trait complexes, resulting in divergent selective pressures for a single species (Sinervo and Svensson, 2002; Mckinnon and Pierotti, 2010).

A wealth of information regarding species color patterns exists in web-based community science platforms, in which contributors can upload their own photographs of animals and plants, and seek help from other participants in identifying their observations. One of the largest and most successful platforms is iNaturalist (iNaturalist, 2021) (http://www.inaturalist.org/), which, as of January 2022, holds >88 million images of various species from across the world and roughly doubles in size each year. DiCecco et al. (2021) showcase the research value of iNaturalist, but one still nascent application is the broad-scale assembly of color pattern data (but see Lehtinen et al., 2020; Lattanzio and Buontempo, 2021). The key challenge is that the manual extraction of color pattern data is time and effort intensive. Automation is an obvious next step, but complex image backgrounds can confuse simplistic image analysis toolkits (Peña et al., 2014; Pollicelli et al., 2020). Therefore, developing best practices and tools for streamlining the extraction of information from variable quality images submitted by amateur naturalists is a critical need for processing the plethora of digital image data now being generated, enabling data-intensive research efforts in the areas of ecology and evolutionary biology (Weinstein, 2018; Lürig et al., 2021).

Artificial intelligence methods, and deep learning, in particular, offer the most promise for automating the collection of phenotypic data (Lürig et al., 2021), given their remarkable ability to make accurate predictions. Convolutional neural networks (CNNs) are the basis for current state-of-the-art accuracy in whole image classification (Deng et al., 2010; Zeiler and Fergus, 2014; Sermanet et al., 2014). A CNN is a deep learning algorithm that uses training data to learn how to extract features from input images and then use those features to interpret an image’s content (LeCun et al., 2015). Much recent work using CNNs for ecological studies has focused on species identification from complex images (e.g., camera-trap images; Wäldchen and Mäder, 2018; Tabak et al., 2019; Willi et al., 2019; Whytock et al., 2021). Less developed are deep learning approaches that score quantitative traits of interest on those images.

Here, we present a workflow and machine learning approach for classifying color patterns of animals from community science photographs. To illustrate the value of this computer vision model, we focus on a use-case of a striped/unstriped color pattern polymorphism in the geographically widespread and abundant Eastern Red-backed Salamander, Plethodon cinereus (Petranka, 1998). The “striped” color morph exhibits a stripe that varies in color from yellow to dark red, which is overlaid on a black dorsum, and the “unstriped” morph is completely dark in dorsal coloration (Figure 1). The ecological and evolutionary mechanisms influencing the geographic patterns of coloration in P. cinereus color morphs remain unclear, and little work has been conducted to examine range-wide patterns of the polymorphism (but see Gibbs and Karraker, 2006; Moore and Ouellet, 2015; Cosentino et al., 2017). Studies from single populations have suggested that the color morphs are correlated with distinct climatic niches; the striped morph is more associated with cooler, wetter niches, while the unstriped morph is more associated with warmer, drier conditions (Moreno, 1989; Anthony et al., 2008).
Figure 1
Color morphs of Plethodon cinereus
Representative iNaturalist images of the striped (left) and unstriped (right) color morphs of Plethodon cinereus. Photos and observations by iNaturalist users Jessica (iNaturalist user jessicapfund) and Myvanwy (iNaturalist user acuriousmagpie), respectively.
The goal of our study was to test range-wide color morph and climate associations by leveraging more than 20,000 community science photographs. We created a computer vision model for scoring striped and unstriped color morphs of P. cinereus via an experimental design capable of handling photographs that are highly heterogeneous and vary extensively in quality. With the classified data, we then used ecological niche modeling and a logistic modeling framework to examine whether the two color morphs partition available niche space, thereby contributing to the maintenance of this polymorphism. Our methodological approach not only provides new insight into the association between climate and color morph frequency in P. cinereus at the range-wide scale but also demonstrates a pipeline for rapidly classifying discrete color morphs in community science images. We also discuss the complications faced when developing the computer vision model, but highlight the utility of this approach with continuously growing community science image resources.
Results
Volunteer and model accuracy
Across seven volunteers that scored 4,000 training and validation images, we estimate that the mean volunteer annotation accuracy was 95.9%. Consensus was achieved for 3,871 (3,005 striped, 866 unstriped) images, while the remaining 129 images were either unidentifiable as striped or unstriped salamanders (n = 51) or unusable because both morphs were visible in the image (n = 78; Table 1). The majority of images were scored with a mean scoring time of 3 s. Some images took annotators considerably longer to analyze, although extremely long annotation times were likely owing to annotators leaving ImageAnt running while not actively scoring. The 3,871 images served as the basis for model training and validation.
Table 1
Total number of images and how the images were classified for different datasets

Image dataset            Total images   Striped   Unstriped   Incorrect/unidentifiable
Training & validating    4,000          3,005     866         129
Ensemble accuracy        500            374       115         11
Final output             20,318         15,413    4,905       NA

1) Volunteer scoring for the training and validation dataset. 2) Examination of a subset (500 images) of the final model ensemble to retrieve an estimate of model accuracy. 3) Final computer vision model color pattern scores.
Validation accuracy across the four cross-validation folds varied minimally (fold 1 = 98.6%; fold 2 = 97.3%; fold 3 = 96.2%; fold 4 = 97.4%). The mean cross-validation accuracy was 97.4% and the test accuracy of the final ensemble model was 97.8%, with a 95% confidence interval of 0.96-0.99 (Wilson, 1927; Table 1). Out of the 20,318 iNaturalist images analyzed by the ensemble model, 15,413 (75.9%) were labeled as striped and 4,905 (24.1%) as unstriped salamanders (Table 1).
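The reported interval can be checked with a short calculation. The following is a minimal Python sketch of the Wilson score interval, not the code used in the study. It assumes that 489 of the 500 test images were classified correctly, which is our reading of the "Ensemble accuracy" row of Table 1 (374 + 115 = 489, with 11 incorrect or unidentifiable) and is consistent with the reported 97.8%. The function name `wilson_interval` is illustrative.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for 95%)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Fold accuracies as reported in the text.
fold_accuracy = [98.6, 97.3, 96.2, 97.4]
mean_cv = sum(fold_accuracy) / len(fold_accuracy)

# Assumption: 489 correct out of 500 (inferred from Table 1, not stated explicitly).
low, high = wilson_interval(489, 500)

print(f"mean cross-validation accuracy = {mean_cv:.1f}%")
print(f"test accuracy = {489 / 500:.1%}, 95% CI = ({low:.2f}, {high:.2f})")
```

Under that assumption, the interval rounds to 0.96-0.99, matching the values given above.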
Niche modeling
Our filtering steps removed 60 data points, generating a final dataset of 20,258 total point presences (N = 15,363 striped morphs; N = 4,895 unstriped morphs; Figure 2). These were used along with the uncorrelated environmental predictors to generate a best-fit MAXENT model for striped and unstriped morphs. The best model for both striped and unstriped, based on AICc and ΔAICc, consisted of LQPH features with a regularization multiplier of two (striped model AICc = 28,877.63, ΔAICc = 4.86; unstriped model AICc = 16,072.40, ΔAICc = 6.09). AUCtrain (striped 0.78; unstriped 0.82) suggests relatively performant models; because P. cinereus is widespread and common across its range, separating higher and lower quality habitats is more challenging than for habitat specialists. AUCtest values (striped 0.75; unstriped 0.81) were close to the AUCtrain scores, suggesting that these models are not overfitting. The Schoener’s D metric indicates that the niches of the morphs overlap at 87%. Niche breadth of the striped morph is greater than that of the unstriped morph (Levins B2; striped = 0.64; unstriped = 0.55). The PCA of the reduced bioclimatic variables shows how the morphs partition niche space (Table S1. PCA results of reduced bioclimatic variables, Related to Figure 3). PC1 represents 30% of the variation and its loadings are primarily mean diurnal range (BIO2), maximum temperature of warmest month (BIO5), and precipitation of warmest quarter (BIO18; Figure 3A). PC2 represents 26% of the variation and mean temperature of driest quarter (BIO9), temperature annual range (BIO7), and precipitation seasonality (BIO15; Figure 3A) are the main loadings. Lastly, 17% of the variation is explained by PC3, with loadings primarily from elevation (ALT) and maximum temperature of warmest month (BIO5; Figure 3B).
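For readers unfamiliar with the overlap statistic, Schoener's D compares two suitability surfaces after each is normalized to sum to one, and ranges from 0 (no overlap) to 1 (identical). The sketch below illustrates the formula only; the suitability values are invented toy numbers, not MAXENT output from this study, and the function name `schoeners_d` is hypothetical.

```python
def schoeners_d(suitability_a, suitability_b):
    """Schoener's D for two suitability surfaces over the same grid cells.

    D = 1 - 0.5 * sum(|p_a,i - p_b,i|), where p are suitabilities
    normalized so that each surface sums to 1.
    """
    if len(suitability_a) != len(suitability_b):
        raise ValueError("both surfaces must cover the same grid cells")
    total_a, total_b = sum(suitability_a), sum(suitability_b)
    p_a = [v / total_a for v in suitability_a]
    p_b = [v / total_b for v in suitability_b]
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(p_a, p_b))


# Toy example (illustrative values only): four grid cells per morph.
striped = [0.9, 0.7, 0.4, 0.1]
unstriped = [0.8, 0.6, 0.5, 0.2]
print(f"D = {schoeners_d(striped, unstriped):.2f}")
```

A value of 0.87, as reported here, therefore means the two normalized suitability surfaces differ by 13% of their total mass.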
Figure 2
Color morph data generated from the computer vision model
Georeferenced iNaturalist observations (N = 20,258) of Plethodon cinereus. Record localities are colored by morph (red = striped, black = unstriped) based on the final computer vision model run. The known range of P. cinereus is shown in gray.
Figure 3
Climatic niche differences between color morphs of Plethodon cinereus
(A and B) PCA of reduced climatic variables: (A) PC1-PC2, (B) PC1-PC3. Predicted presence points from striped and unstriped morph ecological niche models were grouped into hexbins (red = striped; black = unstriped). PCA loadings are represented by yellow arrows.
Logistic modeling
The best model included elevation and all seven bioclimatic predictors (BIO2, BIO5, BIO7, BIO8, BIO9, BIO15, BIO18); however, BIO5 was subsequently dropped because it had a VIF greater than four (pseudo-R2 = 0.04). All model effects were significant. Striped morph frequency is positively correlated with elevation (β = 0.051, SE = 0.001, p < 0.001; Figure 4A). The odds of the striped morph decrease with mean diurnal range (BIO2; β = −0.063, SE = 0.001, p < 0.001; Figure 4B). The odds of the striped morph increase with temperature annual range (BIO7; β = 0.126, SE = 0.001, p < 0.001; Figure 4C), but decrease with mean temperature of the wettest quarter (BIO8; β = −0.040, SE = 0.001, p < 0.001; Figure 4D). The odds of the striped morph also increase with mean temperature of the driest quarter (BIO9; β = 0.112, SE = 0.001, p < 0.001; Figure 4E) and with both precipitation predictors: precipitation seasonality (BIO15; β = 0.315, SE = 0.001, p < 0.001; Figure 4F) and precipitation of the warmest quarter (BIO18; β = 0.178, SE = 0.001, p < 0.001; Figure 4G). Precipitation effect sizes were generally stronger than temperature effect sizes in separating morphs.
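Because logistic regression coefficients are on the log-odds scale, exponentiating them gives odds ratios that are easier to compare. The short Python sketch below does this for the coefficients reported above. It is an illustration only: this section does not state whether predictors were standardized, so interpreting each ratio "per unit increase" is an assumption.

```python
import math

# Coefficients (log-odds of the striped morph) as reported in the text.
coefficients = {
    "elevation": 0.051,
    "BIO2": -0.063,
    "BIO7": 0.126,
    "BIO8": -0.040,
    "BIO9": 0.112,
    "BIO15": 0.315,
    "BIO18": 0.178,
}

# Odds ratio = exp(beta); values above 1 raise the odds of the striped morph.
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}

# Rank predictors by absolute effect size on the log-odds scale.
ranked = sorted(coefficients, key=lambda name: abs(coefficients[name]), reverse=True)
for name in ranked:
    direction = "higher" if odds_ratios[name] > 1 else "lower"
    print(f"{name:>9}: OR = {odds_ratios[name]:.3f} ({direction} odds of striped)")
```

Ranked this way, the two precipitation predictors (BIO15 and BIO18) come first, in line with the statement that precipitation effects were generally the strongest.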
Figure 4
Climatic predictors of color morph frequency
Top model effect plots of color morph frequency variation in Plethodon cinereus.
(A–G) The proportion of color morphs is influenced by (A) elevation; (B) mean diurnal range (BIO2); (C) temperature annual range (BIO7); (D) mean temperature of wettest quarter (BIO8); (E) mean temperature of driest quarter (BIO9); (F) precipitation seasonality (BIO15); and (G) precipitation of warmest quarter (BIO18). 95% confidence intervals are included in each plot.
Discussion
Community science resources, especially images tied to community identifications available via iNaturalist, are rapidly expanding. These images contain a treasure trove of biologically relevant information about phenotypes and interactions (DiCecco et al., 2021), but unlocking this information remains a challenge. Thus far, computer vision models have largely focused on species identification from images (Gomez Villa et al., 2017; Norouzzadeh et al., 2018; Willi et al., 2019). To our knowledge, no previous studies have aimed to use machine learning approaches to extract trait information, but such approaches are needed given the deluge of records with digital vouchers being submitted. Here, we created a highly accurate (∼98% accurate based on test set evaluation) computer vision model for classifying a salamander’s color pattern from community science images. With the data produced from this model, we expanded our knowledge of why a common striped/unstriped color polymorphism persists in the abundant salamander, P. cinereus.
Scalability of community science images
A challenge of using CNNs for feature classification is the need for robust sample sizes for training. Community science platforms, such as iNaturalist, hold millions of images of various plants and animals that are spatially and temporally replicated. A well-established machine learning algorithm provides iNaturalist users with a suggested species identification (www.inaturalist.org/). A few studies have manually scored traits such as flower presence or absence in order to identify phenological patterns across geography (Barve et al., 2020; Li et al., 2021). Yet, manual scoring of more images would be necessary to expand on these studies. Our pipeline provides a streamlined example of how to obtain large-scale trait data from community science images. This computer vision model can now be used to rapidly score the trait of interest and can be used in perpetuity to gather data on more records as they become available on community science platforms. From August 5th, 2020 when we downloaded our core image dataset used for model training to January 13th, 2022, the number of research-grade P. cinereus records has nearly doubled (from 15,777 to 29,040). As well, many other Plethodon species have similar color polymorphisms and our model should be transferable to these other species.

Community science images are not perfect. With unstandardized images, expert decisions on feature classifications are key. For this work, we created a salamander color scoring guide (found in https://github.com/mhantak/Salamander_image_analysis) that was distributed to all volunteers who aided in creating the training dataset. Although the standardization of training data is important, some aspects of community science images remain out of our control and create unique challenges when designing machine learning experiments. For instance, during volunteer scoring, there were a few research-grade species misidentifications, which is unsurprising given that closely related species can look nearly identical to P. cinereus (Fisher-Reid and Wiens, 2015). These sorts of issues are inherent in working with community science data, but careful consideration is needed when making decisions about how to deal with these records. In our case, we scored misidentified species as the most similar looking morph of P. cinereus. For example, a Two-lined Salamander (Eurycea sp.) was categorized as a striped morph, while a Slimy Salamander (Plethodon glutinosus) was scored as an unstriped morph. Keeping these images of similar-looking species in the training dataset provides a more representative sample of what the model will encounter when analyzing new images. Furthermore, there were several images solely of the ventral side of the salamander. Although not a misidentification, the needed trait information is best obtained from a dorsal view, and ventral views would be better suited as additional images to augment iNaturalist records that also include a dorsal view. Due in part to these ventral images, there were 51 images out of 4,000 (1.3%) that were excluded from the training dataset because they could not be identified to morph. Other image problems included excessive blurriness, partial body part exposure (e.g., head only), or a salamander that was too distant in the photograph. Even if ∼1% of all input images are unidentifiable and the model was to incorrectly guess on all of them, we maintain that this is still an acceptable error rate when dealing with community science images. Finally, we removed one extraneous data point from the data after determining it was well outside of the geographic range of the species. One record out of >20,000 is a very low error rate.
Computer vision model intricacies
Our final computer vision model is based on binary classification, “striped” or “unstriped” color morph. This simplified binary classifier works for the majority of individual P. cinereus across the distribution of the species. However, there is a third, uncommon erythristic (orange-red) color morph, which we combined with the striped morph (similar to another study; Fisher-Reid and Wiens, 2015) because there were too few examples in our training image set (n = 20) to train a model to identify it. In addition, other abnormal color phenotypes of P. cinereus can sometimes be found (see Moore and Ouellet, 2014). When preparing our training dataset, we found 16 instances of a white (instead of orange or red) striped phenotype. As with the erythristic phenotype, these images were too sparse for model training and were lumped with striped morphs based on the existence of the dorsal stripe. Similar decisions were necessary for less frequent aberrant phenotypes.

Single images that contained multiple salamanders also posed an issue with creating our training set. We initially considered attempting to train a model to determine the number of salamanders in an image or identify images with multiple salamanders. However, a stepwise classifier would require more training images for the additional categories and ultimately create a more unbalanced dataset, as there were fewer images with multiple salamanders. We, thus, adopted the simple solution of combining images with multiple salamanders of the same phenotype with images of single salamanders (e.g., an image with three striped morphs was binned into the “striped” class). We removed images that contained both color morphs from the training set because either category (striped or unstriped) could be considered correct for these images. At inference time, images with both color morphs were considered to be correctly classified regardless of which color morph the model assigned them. Such images are quite rare and accounted for only 78 of the 4,000 images analyzed to generate the training and test sets.
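The lumping rules described in this section can be summarized as a small decision function. The Python sketch below is a paraphrase of the text, not code from the study; the phenotype strings (for example "white-striped") and the function name `training_label` are hypothetical labels chosen for illustration.

```python
from typing import Optional, Sequence

# Phenotypes lumped with the striped class, following the rules described above.
STRIPED_LIKE = {"striped", "erythristic", "white-striped"}


def training_label(phenotypes: Sequence[str], usable: bool = True) -> Optional[str]:
    """Map the phenotype(s) seen in one image to a binary training class.

    Returns "striped" or "unstriped", or None when the image is excluded
    (unusable, empty, or showing both color morphs).
    """
    if not usable or not phenotypes:
        return None
    morphs = {"striped" if p in STRIPED_LIKE else p for p in phenotypes}
    if morphs == {"striped"}:
        return "striped"
    if morphs == {"unstriped"}:
        return "unstriped"
    return None  # both morphs present, or an unrecognized phenotype


print(training_label(["striped", "striped", "striped"]))  # three striped salamanders
print(training_label(["erythristic"]))                    # lumped with striped
print(training_label(["striped", "unstriped"]))           # excluded from training
```

Keeping the exclusion cases explicit makes it easy to count how many images each rule removes, such as the 78 mixed-morph images noted above.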
Climate and color morph trends in the Eastern Red-backed salamander
The ecological niche models show that the morphs largely overlap (i.e., by 87%) in climatic niche space, but striped morphs have a wider niche breadth than unstriped morphs. The PCA highlights the variation between P. cinereus color morphs and in general shows that striped morphs can be found in areas with more variable climatic conditions. Logistic model findings are consistent with the PCA and demonstrate a positive association between striped morph frequency and elevation, metrics of precipitation, and two climate variables (BIO7 and BIO9). In contrast, the proportion of striped morphs decreases with mean diurnal range (BIO2) and mean temperature of wettest quarter (BIO8).

Our finding of a positive relationship between elevation and striped morph frequency is consistent with previous studies (Gibbs and Karraker, 2006; Moore and Ouellet, 2015; Hantak et al., 2021). Following the expectation that higher elevations are typically colder than lower elevations, we predicted the observed positive correlation. However, here and in other studies, striped morphs are not always associated with cooler temperatures. A recent study by Hantak et al. (2021) found that the proportion of striped morphs increases with increasing elevation and mean annual temperature and, based on these results, suggested that these predictors may be decoupled in relation to color morph frequency in P. cinereus. Although the reason for the greater proportion of striped morphs at higher elevations remains unclear, it may be possible that gene flow is reduced along altitudinal gradients in this species. Previous work has shown that elevation is a significant predictor of genetic differentiation in amphibians (Funk et al., 2005; Giordano et al., 2007; García-Rodríguez et al., 2021), including P. cinereus (Hantak et al., 2019), although moderate changes in elevation were not the most important driver of morph frequency variation in northern Ohio (Hantak et al., 2019).

Based on previous studies of climate associations in P. cinereus color morphs, we predicted that striped morph occurrences would be more tightly linked with cooler and wetter climatic niches, whereas unstriped morphs would be more correlated with warmer, drier niches (Lotter and Scott, 1977; Moreno, 1989; Anthony et al., 2008). Although we found the predicted trend for precipitation with striped morph frequency, our temperature-morph findings were more nuanced. The PCA and logistic model indicate that the striped morph is, in general, found in areas with more variability in temperature. However, the proportion of striped morphs decreases with mean diurnal range (BIO2), suggesting that striped morphs are negatively impacted by temperature fluctuations. In addition, the proportion of striped morphs decreased with mean temperature of wettest quarter (BIO8), indicating a possible humidity threshold for this morph.

Much work on the polymorphism in P. cinereus relies heavily on findings from studies that were conducted over relatively small spatial and temporal scales. In addition, some studies have found no climate-related morph trends or inconsistent patterns over time (Petruzzi et al., 2006; Muñoz et al., 2016; Evans et al., 2018). Fisher-Reid et al. (2013) demonstrated that striped morphs were found in warmer, wetter habitats on Long Island, New York, while Hantak et al. (2021) found striped morphs were more associated with warmer, drier habitats in localities across Maryland, New York, and Virginia. Range-wide, dense data can help examine overall trends and localize those at a finer scale in a unifying framework. Besides our current work, two other studies have attempted to examine climate-morph trends in P. cinereus across a greater proportion of the species range. But here again, these studies find conflicting results likely owing to differences in datasets, covariates, and statistical approaches (Gibbs and Karraker, 2006; Moore and Ouellet, 2015; Cosentino et al., 2017). It is possible that these variations in approaches lead to ambiguous color morph and climate relationships, or it may be that there are more complex contextual cues that are being missed when attempting to understand polymorphism rates in P. cinereus. With physiological differences between the morphs (Moreno, 1989; Davis and Milanovich, 2010; Smith et al., 2015), climate likely plays some role in morph distribution, but other, local, selection pressures may be more important in this system.
Next steps
The combination of community science and deep learning provides a powerful resource for future studies of phenotypic variation. With the rapid growth of data, including community science images, scalable resources such as computer vision models are necessary to keep pace with the rate of data accumulation (Hassoun et al., 2021), which potentially provides a means to track temporal changes, not simply spatial ones. A further step for our research is to use this model to score color morphs of other species within the salamander genus Plethodon. In total, there are 10 species within Plethodon that contain the exact same dorsal striped/unstriped color pattern (Petranka, 1998; Highton, 2004). Occurrence data points are available for all of these other species on the iNaturalist platform, ranging from ∼70 observations for the IUCN listed “vulnerable” mountaintop endemic, Plethodon sherando (Highton and Collins, 2006) to >2,000 observations of the more widespread Western Red-backed Salamander (Plethodon vehiculum). Much research has been conducted on the morphs of P. cinereus, but very little is known about the dynamics of the polymorphism in these other species, including whether the morphs diverge in climatic niche space. Although our computer vision model was developed to score salamander striped and unstriped color patterns, our entire workflow can also be transferred to any system that has discrete, easily identifiable, trait variation.
Limitations of the study
Although machine learning holds much promise for rapidly gathering phenotypic data from digital images, the main limitation to using fully supervised deep learning approaches is the number of labeled training images (and, primarily, the time and expertise needed to generate the labels). Depending on the complexity of the intended classification, several thousand vouchers for each category may be necessary for training and validating the model. Here, we present a relatively simple problem: are the salamanders striped or unstriped? Adding categories or addressing more complicated phenotypes will require more training images. In the deep learning literature, methods to reduce the labeling bottleneck (e.g., through one- or few-shot learning; O’Mahony et al., 2019; Wang et al., 2020) are being developed and future studies on the applicability and effectiveness of those methods to the application presented here are needed. The other main limitation to the type of work we presented in this article is the imperfect nature of community science images. Misidentifications do occur, even when reducing the dataset to vetted (e.g., research-grade) images, and images themselves vary in absolute quality and relative usability for a particular trait scoring outcome. Solutions to dealing with these issues will be on a case-to-case basis, but in our work, we found that labeling misidentified species to the closest phenotype and filtering some of the most problematic images worked well. Misidentifications and unusable images are inherent when working with community science data, but they are infrequent. With tens of thousands of correctly identified images of usable quality, a few misidentifications and image issues will not dramatically impact the biological conclusions of the study. Certainly, future work can also include leveraging weak-learning approaches that are more robust to the presence of label noise and inaccuracies.
STAR★Methods
Key resources table
Resource availability
Lead contact
Further information or requests for resources should be directed to the Lead Contact, Maggie M. Hantak (maggiehantak@gmail.com).
Materials availability
This study did not generate new unique reagents.
Method details
Community science image dataset
We downloaded 15,777 research-grade images (georeferenced observations with species ID verified by a minimum of two separate reviewers) of P. cinereus from iNaturalist (accessed August 5, 2020) via a command-line query tool (https://gitlab.com/stuckyb/cbg_phenology). Images were not modified in any manner. From this initial set, we randomly selected 4,000 images to be the basis of our training and validation dataset. Seven volunteers aided in scoring salamander color pattern (striped/unstriped). A color pattern scoring guide and training were provided by MMH to all participants prior to scoring to ensure consistent trait definitions. Images were divided into 10 sets of 400. All image sets were scored twice by separate volunteers (i.e., no volunteer scored the same image twice). If the two volunteers disagreed on a color pattern, a third, independent volunteer provided a consensus score.
To score salamander color patterns, we used the scriptable desktop software program ImageAnt (https://gitlab.com/stuckyb/imageant). We wrote a custom ImageAnt script to record: 1) the number of salamanders in an image; 2) salamander color pattern (striped, unstriped, other); and 3) whether the image was unusable (i.e., the color pattern was unidentifiable). For images with multiple salamanders, a second scoring rubric was presented with the options “striped”, “unstriped”, or “both color morphs”. In the final training set, images with multiple salamanders of the same color morph were lumped with images of a single salamander of that color morph. Images that contained both color morphs were not included in the training set. Although P. cinereus displays a discrete striped/unstriped dorsal color pattern polymorphism, aberrant phenotypes (e.g., leucistic or the orange-red “erythristic” phenotype) can be found (Moore and Ouellet, 2014). The few cases of erythristic (“other”) phenotypes were included within the “striped” class, while no leucistic examples were observed in our training set. Our final model was trained using the binary categories “striped” and “unstriped”.
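The scoring rules described above can be illustrated with a short Python sketch. This is not the ImageAnt script used in the study; the function name `consensus_label` and the label strings `"both"` and `"unusable"` are hypothetical and chosen only for illustration.

```python
from typing import Optional

# Labels that enter the binary training set. "other" (erythristic) is
# folded into "striped", as described in the text.
MERGE = {"striped": "striped", "unstriped": "unstriped", "other": "striped"}


def consensus_label(first: str, second: str,
                    third: Optional[str] = None) -> Optional[str]:
    """Return the training label for one image, or None if it is excluded.

    first and second are the two independent volunteer scores; third is the
    additional score that is needed only when the first two disagree.
    """
    if first == second:
        score = first
    elif third is not None:
        score = third
    else:
        raise ValueError("scores disagree and no third score was provided")
    # Hypothetical labels such as "both" (mixed-morph image) or "unusable"
    # are absent from MERGE, so .get() returns None and the image is dropped.
    return MERGE.get(score)
```

Under these assumptions, an image scored "striped" and "unstriped" by the first two volunteers takes whichever label the third volunteer assigns, and an image scored as containing both morphs is returned as `None` and left out of training.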
Deep learning
We trained a convolutional neural network (CNN) using the EfficientNet (efficientnet-b4; Tan and Le, 2019) architecture implemented in PyTorch, with PyTorch Lightning (Falcon, 2019) used for model training. We implemented transfer learning (Yosinski et al., 2014) with model weights that were pre-trained on the ImageNet dataset (Deng et al., 2009); all model weights were fine-tuned during training on our salamander image dataset. CNN training, validation, and inference on new images were performed on the University of Florida HiPerGator high-performance computing cluster, mainly with NVIDIA GeForce RTX 2080 Ti GPUs, with Quadro RTX 6000 GPUs used for some model development work.
To train the model, we used stochastic gradient descent with a momentum of 0.9 and a dynamic learning rate schedule that started at a learning rate of 0.001 and decayed by a factor of 0.1 based on validation loss. We chose the initial learning rate for the scheduler by testing powers of 10 (0.1, 0.01, 0.001) and using the learning rate that gave the best validation loss on a single train/validate split. An oversampling procedure was implemented to compensate for the unequal image representation of the striped and unstriped salamander phenotypes. Image preprocessing included resizing images to 596x447 pixels (the largest size possible given our GPU memory capacity and target batch size) and normalizing the color channels with the same transformation used to pretrain the model weights on the ImageNet dataset. A set of data augmentation techniques was applied to each batch during model training, including: 1) random horizontal flips, 2) random vertical flips, 3) random rotations, 4) color jittering, and 5) random affine transforms. Model training was performed with a batch size of 8 over 50 epochs. We used 4-fold cross-validation to evaluate model performance.
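The text does not specify how oversampling was carried out. One common approach, shown here as a minimal sketch and not necessarily the procedure used in the study, is to draw training images with replacement using weights inversely proportional to class frequency, so that both morphs are sampled about equally often. The function names are hypothetical; in PyTorch, weights of this kind are typically passed to `torch.utils.data.WeightedRandomSampler`.

```python
import random
from collections import Counter


def oversampling_weights(labels):
    """Weight each image by the inverse of its class frequency."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def draw_epoch(labels, seed=0):
    """Draw len(labels) image indices with replacement.

    Because every class has the same total weight, the rarer class is
    drawn about as often as the common one (i.e., it is oversampled).
    """
    rng = random.Random(seed)
    weights = oversampling_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=len(labels))


# Toy example: 6 striped images and 2 unstriped images.
labels = ["striped"] * 6 + ["unstriped"] * 2
weights = oversampling_weights(labels)
indices = draw_epoch(labels)
```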
For our final production model, we took the best model from each cross-validation fold (as defined by the lowest validation loss for that fold) and combined them into an ensemble model by averaging the predictions of all four models. Using ImageAnt, we manually scored 500 more images that were independent of those used for model training and validation to serve as a test set for evaluating the final ensemble model. We then used the production ensemble model to analyze all remaining P. cinereus images on iNaturalist. Due to the growth of P. cinereus research-grade images between model training and validation steps, we re-downloaded all research-grade images from iNaturalist (20,318 images; accessed March 24, 2021) and then analyzed all images not included in the training and test sets using the full model ensemble. Full modeling details and code can be found on our GitHub repository (https://github.com/mhantak/Salamander_image_analysis).
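As an illustration of the ensemble step, the sketch below averages the predicted probability of the striped class across the four fold models for a single image. The function name and the 0.5 decision threshold are assumptions made for this example; the text states only that the predictions of the four models were averaged.

```python
def ensemble_predict(fold_probs, threshold=0.5):
    """Average per-model P(striped) for one image and assign a label.

    fold_probs holds one probability per cross-validation fold model.
    The threshold is an assumed value, not one reported in the text.
    """
    if not fold_probs:
        raise ValueError("need at least one model prediction")
    mean_p = sum(fold_probs) / len(fold_probs)
    label = "striped" if mean_p >= threshold else "unstriped"
    return mean_p, label


# Four fold models that disagree slightly about one image.
mean_p, label = ensemble_predict([0.9, 0.8, 0.7, 0.6])
```

Averaging probabilities rather than taking a majority vote lets a confident model outweigh several uncertain ones, which is one reason this form of ensembling is widely used.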
Environmental data
To test climatic niche differences between the color morphs of P. cinereus, we first obtained bioclimatic (n = 19) and elevational data at 30 arc-second (∼1 km) resolution (WorldClim V1.4; Hijmans et al., 2005). We next determined the accessible area for P. cinereus by buffering the known geographic range by 100 km and then clipped the environmental data layers to that area. To avoid overparameterization and multicollinearity, we then reduced the environmental data layers to a set of uncorrelated variables (correlation threshold r = 0.80). The final dataset included eight variables: elevation, mean diurnal range (BIO2), maximum temperature of warmest month (BIO5), temperature annual range (BIO7), mean temperature of wettest quarter (BIO8), mean temperature of driest quarter (BIO9), precipitation seasonality (BIO15), and precipitation of warmest quarter (BIO18).
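A correlation-based reduction of this kind can be sketched as a greedy filter: a variable is kept only if its absolute Pearson correlation with every variable already kept is below the threshold. This is an illustrative sketch, not the procedure used in the study; the function names are hypothetical, and which member of a correlated pair survives depends on the order in which variables are supplied.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def drop_correlated(layers, threshold=0.80):
    """Greedily keep variables whose |r| with all kept variables is < threshold.

    layers maps a variable name to its values sampled at the same locations.
    """
    kept = []
    for name, values in layers.items():
        if all(abs(pearson(values, layers[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy values: "b" is almost a linear function of "a" and is dropped.
toy_layers = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10.5],
    "c": [5, 1, 4, 2, 3],
}
kept = drop_correlated(toy_layers)
```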
Niche modeling
We used ecological niche modeling (ENM) to determine niche characteristics of both morphs. Prior to running niche models, we first filtered the iNaturalist data records. Filtering included removing records with missing or incomplete latitude and longitude information, removing duplicate records, and manually removing records outside of the known range. To reduce the potential for spatial autocorrelation and bias from areas with particularly dense sampling, we thinned our data to include only records separated by a minimum of 25 km. ENMs were constructed separately for the striped and unstriped morphs using the maximum entropy algorithm implemented in MAXENT V3.4.1 (Phillips et al., 2006) via the R package ENMeval (Muscarella et al., 2014). Data were partitioned using the “block” method to account for spatial autocorrelation. Regularization multipliers ranged from 0.5 to 5 and possible feature combinations were: L, H, LP, LQ, LQH, LQP, and LQPH (L = linear, H = hinge, P = product, Q = quadratic). The best model was selected based on the lowest ΔAICc and evaluated with AUCtrain and AUCtest. After model calibration and validation, we converted the modeled output of predicted probabilities of presence within the accessible area to binary presence/absence maps using the equate entropy of thresholded and original distributions (cloglog) threshold, which typically performs well when attempting to balance omission error against the fraction of predicted presence. We next examined niche overlap between the two morphs with Schoener’s D metric using ENMeval. Niche breadth of both morphs was calculated using the raster.breadth function in the R package ENMTools (Warren et al., 2010). To visually examine color morph overlap in association with climatic predictors, we ran a principal component analysis (PCA) using the reduced set of bioclimatic variables and the predicted presence points from the striped and unstriped morph ENMs with the base R prcomp() function (R Core Team, 2019).
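For reference, Schoener’s D compares two suitability surfaces after each has been normalized to sum to 1 across grid cells: D = 1 − 0.5 Σ|p_striped,i − p_unstriped,i|, ranging from 0 (no overlap) to 1 (identical niches). The sketch below is a plain-Python illustration of that formula on toy values; it is not the ENMeval implementation, and the function name is hypothetical.

```python
def schoeners_d(suit_a, suit_b):
    """Schoener's D for two suitability vectors over the same grid cells."""
    if len(suit_a) != len(suit_b):
        raise ValueError("both surfaces must cover the same cells")
    total_a, total_b = sum(suit_a), sum(suit_b)
    # Normalize each surface so that its values sum to 1.
    p_a = [v / total_a for v in suit_a]
    p_b = [v / total_b for v in suit_b]
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(p_a, p_b))


# Toy surfaces over four cells that share half of their suitability.
d_half = schoeners_d([1, 1, 0, 0], [0, 1, 1, 0])
```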
Quantification and statistical analysis
We further quantified niche differences between the morphs by running a multiple logistic regression using the base R glm() function (R Core Team, 2019) with a binomial family and a logit link function. The predictors for this model were generated by extracting the underlying bioclimatic conditions (BIO2, BIO5, BIO7, BIO8, BIO9, BIO15, BIO18) and elevation at each pixel predicted as a presence in the above binarized maps, for both morphs. We opted to use the raw environmental data rather than principal components for ease of interpretation. Color morph, coded as 1 for striped morphs and 0 for unstriped morphs, was the response variable. All predictors were mean-centered and scaled. To select the best model, and given no a priori hypotheses about the best predictors, we used the ‘dredge’ function in the R package MuMIn (Barton, 2012) to rank candidate models and identify the best-fit model with AICc. If any predictors were not in the top model or if the variance inflation factor (VIF) of any predictor was greater than four, we dropped those variables and re-ran the logistic regression. To generate a pseudo-R2 value as a measure of goodness of fit for our best-fit model, we used the ‘r2_nagelkerke’ function in the R package performance (Lüdecke et al., 2021).
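To make the goodness-of-fit measure concrete, the sketch below computes Nagelkerke’s pseudo-R2 from the log-likelihoods of an intercept-only model and a fitted model, using the standard definition (the Cox and Snell R2 divided by its maximum attainable value). It is a plain-Python illustration with hypothetical function names, not the code of the performance package.

```python
import math


def null_loglik(y):
    """Log-likelihood of an intercept-only logistic model for 0/1 responses.

    Assumes both classes are present, so the fitted proportion is not 0 or 1.
    """
    n = len(y)
    ones = sum(y)
    p = ones / n
    return ones * math.log(p) + (n - ones) * math.log(1.0 - p)


def nagelkerke_r2(loglik_null, loglik_model, n):
    """Nagelkerke's pseudo-R2 from two log-likelihoods and the sample size."""
    cox_snell = 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * loglik_null / n)
    return cox_snell / max_cox_snell


# Toy response: 1 = striped, 0 = unstriped.
y = [1, 1, 0, 0]
ll0 = null_loglik(y)
r2_no_gain = nagelkerke_r2(ll0, ll0, len(y))   # model no better than the null
r2_perfect = nagelkerke_r2(ll0, 0.0, len(y))   # model log-likelihood of 0
```

The rescaling matters for binary responses because the Cox and Snell R2 cannot reach 1 even for a perfectly fitting model; dividing by its maximum restores the usual 0 to 1 range.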