
Spatial phylogenetics of butterflies in relation to environmental drivers and angiosperm diversity across North America.

Chandra Earl^1,2, Michael W Belitz^1,3,4, Shawn W Laffan^5, Vijay Barve^1, Narayani Barve^1, Douglas E Soltis^1,2,3,4, Julie M Allen^6, Pamela S Soltis^1,2,4, Brent D Mishler^7,8, Akito Y Kawahara^1,2,3, Robert Guralnick^1,2,3,4.

Abstract

Broad-scale, quantitative assessments of insect biodiversity and the factors shaping it remain particularly poorly explored. Here we undertook a spatial phylogenetic analysis of North American butterflies to test whether climate stability and temperature gradients have shaped their diversity and endemism. We also performed the first quantitative comparisons of spatial phylogenetic patterns between butterflies and flowering plants. We expected concordance between the two groups based on shared historical environmental drivers and presumed strong butterfly-host plant specializations. We instead found that biodiversity patterns in butterflies are strikingly different from those of flowering plants, especially in warm deserts. In particular, butterflies show different patterns of phylogenetic clustering compared with flowering plants, suggesting differences in habitat conservation between the two groups. These results suggest that shared biogeographic histories and trophic associations do not necessarily assure similar diversity outcomes. The work has applied value in conservation planning, documenting warm deserts as a North American butterfly biodiversity hotspot.

Keywords: Ecology; Entomology; Evolutionary Ecology; Evolutionary History; Global Change; Phylogenetics; Phylogeny; Plant Biogeography

Year: 2021 PMID: 33997666 PMCID: PMC8101049 DOI: 10.1016/j.isci.2021.102239

Source DB: PubMed Journal: iScience ISSN: 2589-0042

Introduction

Insect biodiversity patterns are poorly understood across broad spatial, temporal, and phylogenetic scales. These shortfalls stand in stark contrast to our knowledge of vertebrate and flowering plant biodiversity, especially in North America, where rapid efforts to close phylogenetic and spatial information gaps (Davies and Buckley, 2011; Thornhill et al., 2017; Allen et al., 2019) have provided novel insights into the shorter and longer term processes structuring biodiversity and how best to preserve natural heritage for the future (Kling et al., 2019). One approach to quickly expand the knowledge base of insect biodiversity and preserve it in the face of accelerating terrestrial declines (van Klink et al., 2020) lies in focusing on clades where existing data are already dense but not yet fully integrated. Such efforts also provide a unique basis for direct, empirical comparisons with other lineages, such as various lineages of green plants, that are known to have strong evolutionary and ecological associations with herbivorous insects (Futuyma and Agrawal, 2009). Butterflies (Papilionoidea) serve as an ideal study group for researchers and naturalists due to their diurnal activity, often vibrant and showy colors, and specialized larval host plant associations (Brock and Kaufman, 2003; Grimaldi and Engel, 2005). Not only are they the most collected and photographed insects (Scoble, 1995), but many species and clades of butterflies have become models for studying diverse ecological and evolutionary processes, such as Batesian and Müllerian mimicry (e.g., butterflies in the genus Heliconius (Brower, 1996; Kronforst and Papa, 2015; Lewis et al., 2019)), genetics and migration (e.g., butterflies in the genus Danaus), and adaptation to agricultural systems (e.g., the common and widespread cabbage white, Pieris rapae (Shen et al., 2016)). 
Butterflies also serve as pollinators and bioindicators of change and are one of the few insect groups where conservation agencies such as the IUCN have made at least initial assessments of species endangered status (Bonelli et al., 2018). Due to the interest of both professionals and amateurs, the natural history of North American butterflies is relatively well known with rich distributional and genetic data resources readily available. There are approximately 1900 species of butterflies in North America (Lotts and Naberhaus, 2017), and natural history and genetic data exist for nearly 1500 species of them. This abundance of data positions butterflies as one of the best insect groups for asking broad-scale questions about the structure and drivers of diversity. These strong data sources enable moving beyond simple taxic summaries of diversity, such as species richness, and toward a more comprehensive, process-oriented understanding of how diversity is evolutionarily structured at the continental scale. Despite such potential, a synthetic, broad-scale phylodiversity analysis of butterflies (or any other insect group) and the drivers of that diversity has yet to be conducted. Even North-America-wide summaries of butterfly taxic diversity have been limited (Ricketts et al., 1999; Kocher and Williams, 2000; Luis-Martinez et al., 2002). Butterflies are sensitive to changes in climate (Dennis, 1993), and a fundamental question is how current climate and historical changes in temperature and landscape across North America have shaped butterfly phylogenetic diversity and endemism. North America is characterized by a wide range of ecosystems, a dynamic geological history, and significant insect diversity (Danks, 1994; Godfray et al., 2000). 
Butterflies are distributed across 14 broad ecoregions, ranging from the eastern temperate forests to tundra and taiga in Northern Canada, tropical wet forests in southern Mexico, and the warm and cold deserts of the Southwest (Lotts and Naberhaus, 2017). Landscapes across the continent have dramatically changed during the Quaternary, especially in the west, due to long-term aridification and orogeny leading to formation of the Sierra Nevada mountains, and across northern portions of the continent through cyclic patterns of glaciations (Bintanja and van de Wal, 2008). Butterflies also rely heavily on flowering plants, as sources for both adult nectar and larval food (Bronstein et al., 2006). A key question is whether butterflies and angiosperms show concordant broad biogeographic patterns, given these strong ecological associations and shared historical landscape and climate drivers. Recent efforts to document North American plant phylodiversity (Mishler et al., 2020) provide a data basis for direct, quantitative comparisons of butterflies with angiosperms. That recent analysis is the most comprehensive yet attempted, covering more than 19,500 plant species (out of more than 44,000 total species), found across the continent. This study is the first to directly compare spatial patterns and drivers of phylogenetic diversity between any group of insects and flowering plants at a continental scale. Here we assembled and analyzed butterfly spatial phylogenetic diversity across North America and examined its connection to historical climate and flowering plant phylodiversity patterns. Phylogenetic approaches have two key advantages compared with traditional taxic approaches. First, phylodiversity metrics reduce reliance on species definitions; rather, branch lengths are used to calculate diversity metrics. 
Second, spatial phylogenetic approaches bring in evolutionary history and allow hypothesis testing, making it possible to assess, for example, whether communities are more distantly or closely related to each other than expected by chance. We applied a set of spatial phylogenetic methods and metrics, including phylogenetic diversity (PD) (Faith, 1992), phylogenetic endemism (PE) (Rosauer et al., 2009), and relative phylogenetic diversity and endemism (RPD and RPE) (Mishler et al., 2014). We also employed CANAPE, which can differentiate between types of endemism found in a region (Mishler et al., 2014), namely between recent radiations leading to neoendemism and relictual endemism leading to range-restricted groups that were once more widespread, i.e. paleoendemism. Although these metrics are now commonly applied, we provide a short summary of those used here in Table 1.

Table 1

Summary of phylodiversity metrics and tests used

Phylogenetic diversity (PD)	Measured as the sum of branch lengths connecting the terminal taxa present in each location (usually to the root of the tree)
Phylogenetic endemism (PE)	Like PD but measured on a tree where the branches are weighted by the inverse of their geographic range (a Range-Weighted Tree)
Relative phylogenetic diversity (RPD) and relative phylogenetic endemism (RPE)	Ratio of PD or PE measured on the original tree to PD or PE measured using a comparison tree with the same topology but where each branch is adjusted to be of equal length
Categorical analysis of neo- and paleo-endemism (CANAPE)	Geographic centers of endemism are identified, first as being significantly high in either the numerator or the denominator of RPE (or both) and then classified as paleo, neo, or mixed based on whether the RPE ratio is significantly high or low
Randomization tests	These metrics are tested for statistical significance using a spatially structured randomization that re-assigns terminal taxon occurrences on the map, subject to two constraints: the range size of each taxon and the richness of each location are held constant
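
As an illustration of the PD and PE definitions in Table 1, the following minimal sketch computes both on a toy tree. The tree, the tip names, and the grid-cell occurrences are hypothetical, and range size is simplified to a count of occupied cells; this is not the Biodiverse implementation used in the study.

```python
# Toy tree ((A:1,B:1):1,C:2). Each branch is (length, set of tips below it).
# Tree, tips, and occurrences are hypothetical, for illustration only.
BRANCHES = [
    (1.0, {"A"}),
    (1.0, {"B"}),
    (1.0, {"A", "B"}),  # internal branch subtending A and B
    (2.0, {"C"}),
]

# Grid cell -> tips present in that cell.
CELLS = {
    "cell1": {"A", "B"},
    "cell2": {"A", "C"},
    "cell3": {"C"},
}


def phylo_diversity(tips_present):
    """PD: sum of branch lengths connecting the tips present to the root."""
    return sum(length for length, tips in BRANCHES if tips & tips_present)


def branch_range(tips):
    """Range of a branch: number of cells where any descendant tip occurs."""
    return sum(1 for present in CELLS.values() if tips & present)


def phylo_endemism(tips_present):
    """PE: like PD, but each branch length is divided by the branch's range."""
    return sum(
        length / branch_range(tips)
        for length, tips in BRANCHES
        if tips & tips_present
    )


for cell, present in CELLS.items():
    print(cell, phylo_diversity(present), phylo_endemism(present))
```

In this toy case the PE values summed over all cells equal the total tree length, because each branch's length is shared out among the cells in which it occurs.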

We used these metrics of phylogenetic diversity and endemism to test hypotheses about a set of potential drivers and associations, including a unique, direct empirical comparison between butterflies and flowering plants. These same techniques also document centers of diversity and endemism that may differ from plant or vertebrate groups and inform conservation prioritization. Based on a recent analysis of North American plant phylodiversity, we made the following predictions:

(1) Regions that are warmer and have remained more stable over time will have higher phylogenetic diversity (PD) (Rohde, 1992; Mittelbach et al., 2007). Stable areas, whether warm or cold, should have significantly higher than random PD because they have had the most time to accumulate lineages (Cowling and Lombard, 2002), along with specializations that may structure communities to avoid competition (Fine, 2015).
(2) Relative phylogenetic diversity (RPD) will be higher than expected in areas that have been most stable, accumulating more long-surviving, older lineages (Fine, 2015). In North America this includes the eastern and southernmost portions of the continent, as seen in flowering plants. Areas with high topographic heterogeneity and that have been most climatically unstable, such as recently deglaciated areas in the north and portions of the west, will have significantly lower than expected RPD.
(3) Butterfly phylogenetic endemism in North America will align with hotspots of high angiosperm endemism. Hotspots of neoendemism are more likely in younger areas with higher topographic relief, whereas areas of paleoendemism will be more likely where climate and landscapes have been more stable.
(4) Continental-scale flowering plant and butterfly phylodiversity patterns will be congruent, due to co-evolutionary dynamics between the two groups and similarities in underlying landscape and climate drivers. Alternatively, the relative breadth of butterfly host preferences and the limited number of host plants compared with overall plant diversity may dilute and obscure congruence.

Materials and methods

Species name assembly

We consolidated a list of all North American butterfly species with their current valid names and known synonyms (see “SupDryad_specieslist.csv” in Earl (2021)). We defined North America as including Canada, Mexico, and the United States but excluded species that were endemic to islands near the North American landmass (e.g., the Caribbean), as these islands were not well included in field guides used for documenting ranges. Valid names were derived from a global checklist (Lamas, 2004) and were augmented via assembly of synonymies from the Lepidoptera and other life forms database (Funet (Savela, 2020)) and Wikipedia (Wikipedia, 2020) using the R package taxotools (Barve, 2020). The augmented master list was used to normalize names from resources that contained expert-assessed maps and assembled names used in field guide resources (Brock and Kaufman, 2003; Glassberg, 2018). Once names were normalized to a consistent, accepted name, we used those names and associated synonyms to (1) re-assign normalized names to those digitized species range maps (see below for range map assembly) where normalization was required and (2) search GenBank and other key resources for matching genetic data to construct a North America-specific butterfly phylogeny.
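
The name normalization step can be pictured as a lookup against a master list of accepted names and their synonyms. The sketch below is a plain-Python stand-in for that idea, not the taxotools workflow itself; the placeholder names and the `normalize_name` helper are hypothetical.

```python
# Hypothetical master list: accepted names and a synonym -> accepted mapping.
# The binomials are placeholders, not real taxa.
ACCEPTED = {"Aus bus", "Cus dus"}
SYNONYMS = {
    "Aus bius": "Aus bus",   # spelling variant
    "Eus bus": "Aus bus",    # older genus combination
}


def normalize_name(raw_name, accepted=ACCEPTED, synonyms=SYNONYMS):
    """Return the accepted name for raw_name, or None if it cannot be resolved."""
    # Collapse stray whitespace before matching.
    cleaned = " ".join(raw_name.split())
    if cleaned in accepted:
        return cleaned
    return synonyms.get(cleaned)


# Names as they might appear on digitized range maps (hypothetical).
for name in ["Aus  bus", "Eus bus", "Zus yus"]:
    print(repr(name), "->", normalize_name(name))
```

Names that resolve to `None` would be the ones needing manual review before they are matched to range maps or sequence records.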

Range maps and digitization

Range maps were digitized from field guides covering the USA and Canada (Brock and Kaufman, 2003) and Mexico (Glassberg, 2018) for each species included in our species list. Digitizing steps included generating high-resolution scans, georeferencing the resulting images, and then manually tracing polygons based on those scans in QGIS version 3.2 (QGIS Development Team, 2020). Fewer than 1% of the species in field guides did not have an associated range map; in those cases we used occurrence records and descriptions in field guides to estimate the ranges. Range maps were combined into a single shapefile consisting of many spatial polygons that were clipped to only terrestrial areas within North America (see “SupDryad_fishnet.csv” in Earl (2021)). We provide details about range map digitization and a rigorous approach to quality control of maps in Supplemental information. Many butterfly species have ranges that extend beyond the borders of North America. Calculations that involve range-weighting, such as phylogenetic endemism, should ideally rely on globally complete phylogenies and range estimates based on highly resolved maps. Here, we partially compensated for this currently unattainable goal by determining a coarse estimate of overall range extents using country-level range maps for every species in our list where needed. The ranges used in the endemism analyses (see below) were the sums of the areal extent of both the range maps from within North America and the total country level area extent outside North America. We generated these country-level estimates utilizing three separate resources: (1) country-level ranges from Funet (Savela, 2020); (2) GBIF data for all relevant species and extracting country-level data from these records; and (3) data from a trait database that was assembled from field guides and other published sources (ButterflyNet, 2020). 
Supplemental information describes more details on country list production and the extensive quality control that went into verifying a final country list. A 100 km by 100 km resolution grid at the global scale (in order to include range areas from outside the continent) was projected to a North America Albers equal area conic coordinate system. A species was considered present in the cell if the cell centroid was within the species' range map or if the distance from a cell's centroid to the nearest edge of the species' range map was less than 1 km.
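
The presence rule described above (cell centroid inside the range polygon, or within 1 km of its edge) can be sketched in plain Python as follows. Coordinates are assumed to be in kilometres in a projected system, the polygon is a hypothetical simple ring without holes, and the function names are illustrative; the study itself worked with GIS layers rather than this code.

```python
import math


def point_in_polygon(pt, poly):
    """Ray-casting test: True if pt lies inside the polygon (list of (x, y))."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Does this edge cross the horizontal line through pt?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def dist_to_segment(pt, a, b):
    """Shortest distance from pt to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = pt, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def species_present(centroid, range_polygon, tolerance_km=1.0):
    """Presence if the centroid is inside the range or within tolerance of its edge."""
    if point_in_polygon(centroid, range_polygon):
        return True
    n = len(range_polygon)
    nearest = min(
        dist_to_segment(centroid, range_polygon[i], range_polygon[(i + 1) % n])
        for i in range(n)
    )
    return nearest < tolerance_km


# Hypothetical 200 km x 200 km square range.
square = [(0, 0), (200, 0), (200, 200), (0, 200)]
print(species_present((50, 50), square))     # centroid inside the range
print(species_present((200.5, 50), square))  # just outside, within 1 km
print(species_present((250, 50), square))    # well outside
```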

Sequence data acquisition and dataset construction

We compiled sequence data for 13 common markers (1 mitochondrial and 12 nuclear genes) used in butterfly phylogenetics (see Table S1). These sequences were obtained from GenBank (Clark et al., 2016), Barcode of Life Data System (BOLD) (Ratnasingham and Hebert, 2007), and by extracting DNA from tissues and sequencing with Sanger and target capture sequencing for species that lacked genetic data (See “SupDryad_lociloc.csv” in Earl (2021)). We accumulated existing marker data from GenBank using a python toolkit (https://github.com/sunray1/GeneDumper) developed by the lead author. This toolkit automatically fetches sequences from GenBank and uses a rigorous, automated cleaning workflow to choose the best matching sequences for further processing. Specifically, this workflow uses NCBI's BLAST toolkit to query GenBank for loci of interest across a list of species. Accounting for taxonomic updates and errors in species names, sequences are subject to various thresholds and metrics to select the ideal sequence for downstream analyses. The thresholds and metrics include length and content filters, a self-BLAST of the chosen sequences to check for correct species labeling, and a clustering analysis to account for differences in SNPs. We also queried for relevant loci and sequences in BOLD (Ratnasingham and Hebert, 2007), as some of these data records are not reflected in GenBank. BOLD data were assembled using its API (http://www.boldsystems.org/index.php/resources/api). Target capture sequencing utilized Anchored Hybrid Enrichment (AHE) (Lemmon et al., 2012) on 224 butterflies with the Butterfly 1.0 (Espeland et al., 2018) and 2.0 (Kawahara et al., 2018) target capture sets. Both of these kits include the 13 markers of interest. 
We also generated new mitochondrial cytochrome c oxidase subunit I gene (COI) sequences by extracting DNA from dried pinned museum specimens in the McGuire Center for Lepidoptera and Biodiversity, Florida Museum of Natural History, University of Florida. These tissues were shipped to the Canadian Center for DNA Barcoding (CCDB; Guelph, Canada) for sequencing of the standard 658-bp region of COI. See Supplemental information for additional information on sequence acquisition. We created FASTA files for each locus, which were aligned using MAFFT v.7.294b (Katoh and Standley, 2014) and concatenated into a single alignment with FASconCAT-G v.1.02 (Kück and Longo, 2014). The concatenated alignment was 12,361 bp in length and had 69% missing data. Most species (99%) were represented by COI (see “SupDryad_alignment.fa” in Earl (2021)). Without COI, missing data increased to 76% across the remaining 12 loci.
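
To make the missing-data percentages concrete, the sketch below concatenates per-locus alignments, pads absent loci with gaps, and reports the fraction of missing characters. It is a simplified stand-in for what a concatenation tool does, not FASconCAT-G itself; the toy sequences and the choice to count `-`, `?`, and `N` as missing are assumptions.

```python
def concatenate(loci):
    """Concatenate per-locus alignments (dicts of species -> aligned sequence).

    Species absent from a locus are padded with gap characters.
    """
    species = sorted({sp for locus in loci for sp in locus})
    combined = {sp: "" for sp in species}
    for locus in loci:
        length = len(next(iter(locus.values())))
        for sp in species:
            combined[sp] += locus.get(sp, "-" * length)
    return combined


def missing_fraction(alignment, missing="-?Nn"):
    """Fraction of alignment cells that are gaps or ambiguous characters."""
    total = sum(len(seq) for seq in alignment.values())
    n_missing = sum(ch in missing for seq in alignment.values() for ch in seq)
    return n_missing / total


# Toy data: every species has the first locus, only one has the second.
locus_a = {"sp1": "ATGC", "sp2": "ATGA", "sp3": "AT-C"}
locus_b = {"sp1": "GGCC"}

alignment = concatenate([locus_a, locus_b])
print(alignment)
print(missing_fraction(alignment))
```

A matrix in which nearly all species have one marker but few have the others, as with COI here, will show high overall missing data for the same reason as this toy case.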

Phylogeny construction

All loci included in this study were protein coding, and therefore the alignment was partitioned by gene and codon position, resulting in 39 partitions. PartitionFinder v.2.1 (Lanfear et al., 2016) was used to choose the best partitioning scheme and nucleotide substitution models. A phylogenetic analysis of North American butterflies was conducted in RAxML v.8.2.10 (Stamatakis, 2014) using a family-level constraint based on Espeland et al. (2018) and with the concatenated alignment and best partitioning scheme. We conducted 100 ML tree-searches with different random seeds, and the tree with the best log likelihood score was chosen as the final tree. We determined branch support by running 200 parametric bootstrap replicates in RAxML under the GTR+Γ+I model and a gradual “transfer” distance method implemented in BOOSTER (Lemoine et al., 2018). A final check was made to ensure that tip names were consistent with the species names from range products. The interpretation of phylodiversity metrics depends on the units of the branch lengths in the phylogeny, e.g., the amount of “feature diversity” contained in a region when using a phylogram or the amount of “evolutionary history” in a region when using a chronogram (Allen et al., 2019). We focus here on chronograms, with their explicit focus on the age of lineages and communities, and therefore produced a time-calibrated tree. Divergence times were calculated using penalized likelihood in TreePL (Sanderson, 2002; Smith and O'Meara, 2012) following a congruification approach (Eastman et al., 2013). In particular, we obtained node calibrations from Espeland et al. (2018), extracting date ranges from each of the six family nodes that were concordant with our phylogeny and using those as estimates in TreePL (see Supplemental information for more phylogeny reconstruction details).
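
Partitioning 13 protein-coding loci by codon position gives 13 × 3 = 39 starting partitions. The sketch below generates such partition definitions in the style of a RAxML partition file. Only the 658-bp COI length comes from the text; the other locus names and lengths are hypothetical placeholders, and each locus is assumed to begin at codon position 1.

```python
def codon_partitions(loci):
    """Build (label, start, end) codon-position partitions for loci in order.

    loci: list of (name, length_in_bp) in the order they appear in the
    concatenated alignment. Positions are 1-based and inclusive.
    """
    partitions = []
    offset = 0
    for name, length in loci:
        for pos in (1, 2, 3):
            partitions.append((f"{name}_pos{pos}", offset + pos, offset + length))
        offset += length
    return partitions


def as_partition_lines(partitions):
    """Format partitions as 'DNA, label = start-end\\3' lines (every third site)."""
    return [f"DNA, {label} = {start}-{end}\\3" for label, start, end in partitions]


# COI length is from the text; the 12 nuclear loci are hypothetical 300-bp placeholders.
loci = [("COI", 658)] + [(f"nuclear{i:02d}", 300) for i in range(1, 13)]

lines = as_partition_lines(codon_partitions(loci))
print(len(lines))
print("\n".join(lines[:4]))
```

A scheme-selection tool such as PartitionFinder may then merge these starting blocks into fewer partitions that share a substitution model.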

Analysis of phylogenetic diversity and endemism

The spatial dataset and the phylogeny described above were imported into Biodiverse v.3.0 (Laffan et al., 2010). Tips on the tree were mapped to species in the spatial dataset to calculate species richness (SR), phylogenetic diversity (PD), and phylogenetic endemism (PE) metrics for equal-area square grid cells (100 × 100 km). The imported mapping extent was global (as described above) in order to calculate range size metrics, but our analysis region for phylodiversity metrics was constrained to North American grid cells using spatial constraints in Biodiverse. This approach ensured that PE metrics take into account overall range sizes of the terminals in the tree, including their extent outside of the continent. Relatives of the terminals that occur elsewhere in the world are not included, but this is the best approach possible to estimating PE until global analyses are feasible. We also calculated relative phylogenetic diversity (RPD) and relative phylogenetic endemism (RPE) (Mishler et al., 2014). These are ratios of PD and PE on the original tree compared with a phylogeny with the same topology but with equal branch lengths. These metrics can provide useful information about areas with concentrations of significantly longer or shorter than expected phylogenetic branches. For example, areas that have more recently radiated taxa are expected to have lower chronogram-derived RPD. All cell-based values for PD, PE, RPD, and RPE were exported from Biodiverse for mapping and further analysis.
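
The RPD ratio described above can be sketched on a toy tree as follows. The sketch assumes that PD on each tree is expressed as a proportion of that tree's total length before the ratio is taken, so that a value of 1 means branches are neither longer nor shorter than under equal branch lengths. That normalization, the tree, and the occurrences are assumptions for illustration and may not match the exact Biodiverse computation.

```python
# Toy tree ((A:1,B:1):1,C:2): (branch length, set of tips below the branch).
# Hypothetical data for illustration only.
BRANCHES = [
    (1.0, {"A"}),
    (1.0, {"B"}),
    (1.0, {"A", "B"}),
    (2.0, {"C"}),
]


def pd_proportion(tips_present, branches):
    """PD of the tips present, as a proportion of total tree length."""
    total = sum(length for length, _ in branches)
    observed = sum(length for length, tips in branches if tips & tips_present)
    return observed / total


def relative_pd(tips_present, branches=BRANCHES):
    """RPD: PD on the original tree over PD on an equal-branch-length tree."""
    # Comparison tree: same topology, every branch set to the same length.
    equal_tree = [(1.0, tips) for _, tips in branches]
    return pd_proportion(tips_present, branches) / pd_proportion(tips_present, equal_tree)


print(relative_pd({"A", "B"}))  # community of short branches: RPD below 1
print(relative_pd({"C"}))       # community on a long branch: RPD above 1
```

RPE follows the same pattern with PE in place of PD, using range-weighted branch lengths on both trees.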

Randomization tests

Phylogenetic diversity and endemism measurements are expected to be highly correlated with taxic diversity (species richness), because each taxon added to a community must also add to the overall PD. If the co-occurring taxa are randomly distributed on the tree, the correlation should be tight, so a key step forward is to move beyond simply reporting summary measures, and test whether phylodiversity values are higher or lower than expected compared with null models of randomized communities (Webb et al., 2002). This is achieved through a randomization approach, where species occurrences within North America are randomly reassigned to grid cells while holding constant the richness of each cell and the range size of each species. Values for PD, PE, RPD, and RPE were then calculated for each randomization iteration, creating a null distribution for each grid cell. A two-tailed test was then applied to the PD, PE, and RPD randomizations to determine whether the observed values were significantly high or low when compared with the null distributions. We utilized a 40-core Dell Xeon PowerEdge standalone server to parallelize creation of 500 random realizations per grid cell across all cores. Only observations within the North American study region were randomized, with the outside regions held constant. Output randomized results were merged, exported as GeoTIFF grids, and re-imported in R for downstream analysis. RPE randomizations provide a means to categorize different types of phylogenetic endemism. This method, called Categorical Analysis of Neo- And Paleo- Endemism (CANAPE (Mishler et al., 2014)), is a two-step approach that first selects grid cells that are significantly high (one-tailed test) in either the numerator or the denominator of RPE, then uses a two-tailed test of the RPE ratio to assign each cell one of four possible outcomes: higher than expected concentrations of range-restricted short branches (i.e. neoendemics); long branches (i.e. paleoendemics); a mixture of both types; or no significant endemism. Endemism measures, including randomizations, were calculated in Biodiverse, and the categorization method for CANAPE was run in R v.3.6.3 (R Core Team, 2019) to determine per-grid-cell phylogenetic endemism types and to plot those results spatially.
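
The two-step CANAPE logic described above can be sketched as a small classifier. The sketch takes, for one grid cell, the rank quantile of the observed value within its null distribution (the fraction of randomizations with a lower value). The 0.05 significance level, the handling of ties, and the function names are assumptions; the study ran the categorization in R on Biodiverse output.

```python
def rank_quantile(observed, null_values):
    """Fraction of null (randomized) values strictly below the observed value."""
    return sum(v < observed for v in null_values) / len(null_values)


def two_tailed(quantile, alpha=0.05):
    """Classify an observed value as significantly high, low, or neither."""
    if quantile > 1 - alpha / 2:
        return "high"
    if quantile < alpha / 2:
        return "low"
    return "not significant"


def canape(q_numerator, q_denominator, q_rpe, alpha=0.05):
    """Classify one cell from the rank quantiles of the RPE numerator
    (PE on the original tree), the RPE denominator (PE on the comparison
    tree), and the RPE ratio itself."""
    # Step 1: one-tailed test; either PE value must be significantly high.
    if q_numerator <= 1 - alpha and q_denominator <= 1 - alpha:
        return "not significant"
    # Step 2: two-tailed test on the RPE ratio.
    rpe_class = two_tailed(q_rpe, alpha)
    if rpe_class == "high":
        return "paleo"   # range-restricted long branches
    if rpe_class == "low":
        return "neo"     # range-restricted short branches
    return "mixed"


# Hypothetical quantiles for four cells.
print(canape(0.99, 0.50, 0.99))
print(canape(0.50, 0.97, 0.01))
print(canape(0.97, 0.97, 0.50))
print(canape(0.50, 0.50, 0.99))
```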

Drivers of phylodiversity

Assembly of explanatory variables for diversity patterns

We used seven variables to analyze the observed phylogenetic diversity patterns. These included four bioclimatic variables (annual mean temperature, annual precipitation, temperature seasonality [standard deviation ∗ 100], and precipitation seasonality [coefficient of variation] (Fick and Hijmans, 2017)), two climate stability variables (temperature stability and precipitation stability), and elevation. The climate stability variables represent the inverse of the mean standard deviation between equally spaced 1000-year time slices over the past 21,000 years and were provisioned from Owens and Guralnick (2019). These seven layers were chosen because they capture the geographic variation in climate stability over a significant transition from a full glacial to interglacial time-period, which likely is representative of similar transitions that occurred repeatedly in the Pleistocene (Waltari et al., 2007). Elevation values at this scale also provide a reasonable proxy of current topographic heterogeneity. All environmental variables were scaled to a mean of zero and SD of one and resampled to 100 km × 100 km.
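
Scaling a predictor to a mean of zero and SD of one amounts to a z-score transform. The sketch below uses the sample standard deviation (n − 1 denominator), which is an assumption about the scaling used, and hypothetical cell values.

```python
import statistics


def standardize(values):
    """Center values on zero and scale them to unit standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1); an assumption
    return [(v - mean) / sd for v in values]


# Hypothetical annual mean temperatures (degrees C) for four grid cells.
temperature = [5.0, 10.0, 15.0, 20.0]
scaled = standardize(temperature)
print(scaled)
```

Putting all seven predictors on this common scale makes their fitted coefficients directly comparable in magnitude.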

Testing importance of climate and topographic drivers

Four diversity metrics (PD, RPD, and randomization tests for both measures) were utilized to test the most predictive explanatory variables of diversity and diversity significance. Phylogenetic significance analyses were derived from the randomized metrics and divided into binomial datasets, where significantly high values were scored with the value one and significantly low values assigned zero. Cells with non-significant values were excluded from the binomial logistic regressions. We fit generalized linear models (GLMs) using the climatic and terrain variables described above as predictors and phylogenetic metrics as response variables. For the PD and RPD analyses, we used the Gaussian distribution; for models examining PD/RPD significance, we used the binomial distribution with the logit link function. We used the dredge function from the package MuMIn (Barton, 2009) in R version 3.6.2 to examine all possible models. We used an information-theoretic approach using Akaike's Information Criterion (AIC) to rank models (Burnham and Anderson, 2002). Models that were a subset of another model examined were not considered to be competitive if within delta AIC ≤2. We examined the collinearity of variables of the models by calculating variance-inflation factors (VIF) using the car package, and models with VIF ≥5 were also not considered as competitive models. We used delta AIC values and Akaike weights (wi) to rank competing models. Spatial autocorrelation can lead to an increase in type I errors when building simple linear models, because observations are effectively pseudoreplicated (Bini et al., 2009; Borcard et al., 2011), and therefore it is challenging to properly assess the effect of predictor variables. To test for spatial autocorrelation in our top GLMs, we calculated Moran's I. 
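
For reference, Moran's I for a set of values with spatial weights w_ij is I = (n / W) · Σ_i Σ_j w_ij (x_i − x̄)(x_j − x̄) / Σ_i (x_i − x̄)², where W is the sum of all weights. The sketch below applies this to model residuals along a hypothetical one-dimensional transect with binary neighbor weights; the weighting scheme actually used in the study is not specified here and is an assumption.

```python
def morans_i(values, weights):
    """Moran's I for values given a dict of {(i, j): weight} spatial weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(weights.values())
    cross = sum(w * dev[i] * dev[j] for (i, j), w in weights.items())
    return (n / total_weight) * cross / sum(d * d for d in dev)


def transect_weights(n):
    """Binary weights linking each cell to its immediate neighbors on a line."""
    weights = {}
    for i in range(n - 1):
        weights[(i, i + 1)] = 1.0
        weights[(i + 1, i)] = 1.0
    return weights


w = transect_weights(4)
print(morans_i([1.0, 2.0, 3.0, 4.0], w))    # smooth gradient: positive
print(morans_i([1.0, -1.0, 1.0, -1.0], w))  # alternating values: negative
```

Values well above zero indicate that neighboring cells have similar residuals, which is the situation that motivates spatially explicit models.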
Spatial autocorrelation was detected in all of our top GLMs, and we therefore fit spatial generalized linear mixed models (GLMMs) with spatially correlated random effects in R using the package spaMM (Rousset and Ferdy, 2014). We used Matern correlation models to generate the spatially correlated random effects included in the spatial model (Rousset and Ferdy, 2014). We present the results of the spatial GLMMs in the text below, but the results of our non-spatial GLMs can be found in the Supplemental information (Table S2).
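
The model ranking described above relies on delta AIC values and Akaike weights, where w_i = exp(−Δ_i / 2) / Σ_j exp(−Δ_j / 2). A minimal sketch with hypothetical AIC values follows; it illustrates the formula only and is not the MuMIn workflow used in the study.

```python
import math


def akaike_weights(aic_values):
    """Return (delta AIC list, Akaike weight list) for a set of candidate models."""
    best = min(aic_values)
    deltas = [aic - best for aic in aic_values]
    rel_likelihood = [math.exp(-0.5 * d) for d in deltas]
    total = sum(rel_likelihood)
    return deltas, [r / total for r in rel_likelihood]


# Hypothetical AIC values for three candidate models.
deltas, weights = akaike_weights([100.0, 102.0, 110.0])
for d, wt in zip(deltas, weights):
    print(f"delta AIC = {d:5.1f}  weight = {wt:.3f}")
```

The weights sum to one and can be read as the relative support for each model within the candidate set.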

Comparison of butterfly and plant phylodiversity

We re-ran a recently published analysis of seed plant phylodiversity (Mishler et al., 2020) but excluded gymnosperms in order to compare our results with spatial phylogenetic patterns for North American flowering plants. This re-analysis used the same methods as in Mishler et al. and resulted in inclusion of 19,173 angiosperm terminal taxa (and exclusion of 476 gymnosperms) across North America. We applied the same metrics for angiosperms as for butterflies and used nearly the same spatial extent, only excluding a small portion of the southern tip of Mexico, at 50-km resolution. We resampled gridded analysis products from Mishler et al. into the same 100-km resolution as the butterfly grids using bilinear interpolation for comparisons of associations between butterfly and plant phylodiversity metrics. We next generated univariate linear regression models, where plant PD, RPD, and PE values were the predictor variables and the corresponding butterfly PD, RPD, and PE values the response variables. The residuals of these models were then mapped spatially to display where butterfly diversity was higher or lower than predicted by the model. Although we did expect spatial autocorrelation in these simple linear models, we did not generate GLMMs with spatially autocorrelated random effects, because our main objective was to visualize the spatial pattern of the model residuals. Comparison of PD, RPD, and CANAPE significance for butterflies and flowering plants was done by visual inspection because a summary test was gauged to be superfluous given the striking regional differences between the two groups. We chose to compare butterfly phylodiversity with angiosperm phylodiversity instead of all seed plants, because far more butterfly host plants are angiosperms than gymnosperms (Narango et al., 2020). However, angiosperm phylodiversity was remarkably similar to seed plant phylodiversity (Figure S1), and similar comparisons were produced with both plant datasets.
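
The univariate regressions and residual maps described above can be sketched with ordinary least squares on paired per-cell values. The numbers below are hypothetical, and the sketch omits the resampling step that puts both datasets on the same grid.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(x, y):
    """Observed minus predicted response for each cell."""
    slope, intercept = fit_line(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


# Hypothetical per-cell values: plant PD (predictor), butterfly PD (response).
plant_pd = [1.0, 2.0, 3.0, 4.0]
butterfly_pd = [2.0, 4.0, 6.0, 9.0]

slope, intercept = fit_line(plant_pd, butterfly_pd)
res = residuals(plant_pd, butterfly_pd)
print(slope, intercept)
print(res)
```

Cells with positive residuals are those where butterfly diversity is higher than the plant-based model predicts, and cells with negative residuals are those where it is lower.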

Results

A phylogeny for North American butterflies

A total of 1,437 (74.6%) known butterfly species had sequence data already available or were sequenced de novo for COI (Figure 1). De novo sequencing led to the addition of 140 species that otherwise lacked sequence data in public repositories. Of these 140 species, 96 (68.6%) were distributed only in Mexico. Although the backbone of the butterfly tree was constrained at the family level, subfamily and tribe-level relationships generally agreed with those of prior studies (Supplemental information), and clade-based ages were largely congruent with recent butterfly-wide dating analyses (Table S3; See “SupDryad_treepl.tre” in Earl (2021)). We recovered a median bootstrap value across the entire tree of 86 using transfer bootstrapping (See “SupDryad_BOOST.tre” in Earl (2021)) (Lemoine et al., 2018).

Figure 1

A time-calibrated tree of 1,437 North American butterflies with bootstrap support shown for 39 of the deepest nodes (before the K-Pg boundary)

Observed patterns of diversity and endemism

Maps of observed species richness (Figures 2A and S2A) and phylogenetic diversity (Figures 2B and S2B) both documented highest diversity primarily in the tropical dry and wet forests in Mexico and the lowest values across the arctic of Canada. Patterns of richness and PD are complex in western North America, likely reflecting the heterogeneous landscape, with peaks in areas adjoining the Sierras and Rocky Mountains. Relative phylogenetic diversity (RPD) peaked in wet and dry tropical forests in Mexico and remained uniformly high across the Eastern Temperate Forest, Great Plains, and southern deserts (Figures 2C and S2C). By comparison, observed RPD was lower across much of the temperate Intermountain West, the Mediterranean regions of California, and into northern ecosystems such as boreal forests and taiga. Phylogenetic endemism (PE; Figures 2D and S2D) showed the same general latitudinal gradient as PD and RPD but included areas of higher phylogenetic endemism along the temperate Sierra Madre mountain ranges in Mexico and in the coast ranges in the Pacific. PE was overall higher in the temperate west than in the east and associated with transition zones in the Rockies and Sierra Nevada.

Figure 2

Diversity and endemism patterns

Observed values for North American butterflies for: (A) taxic richness, (B) phylogenetic diversity (PD), (C) relative phylogenetic diversity (RPD), and (D) phylogenetic endemism (PE). Maps without logarithmically scaled color palettes can be viewed in Figure S2.


Spatial randomization tests

We uncovered highly regionalized patterns of overdispersion and clustering based on PD randomizations (Figure 3B). All boreal, taiga, and tundra regions showed lower than expected PD, indicative of phylogenetic clustering. Most of the temperate regions in the west, including diverse ecoregions in cold deserts, west coast forests, and Mediterranean portions of California, also displayed clustering. In contrast, tropical wet and dry forests, the most phylodiverse areas in North America, showed higher than expected PD, or phylogenetic overdispersion, when compared with null models. Portions of the south-central semi-arid prairies also showed higher than expected phylodiversity. The southern, warm deserts and eastern temperate forests did not show significantly high or low PD.
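The logic of such a randomization test can be sketched in Python. This is a deliberately simplified null model and should not be read as the one used for the analyses reported here: it holds only the richness of the focal cell fixed and, because the hypothetical species pool is tiny, it enumerates every equally rich community instead of drawing random iterations. The tree, taxon names, and the two-tailed 0.05 threshold are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical species pool: three clades of four tips, long stems, short tips.
TREE = {}
for clade in ("x", "y", "z"):
    TREE[f"clade_{clade}"] = ("root", 5.0)
    for i in range(4):
        TREE[f"{clade}{i}"] = (f"clade_{clade}", 1.0)
INTERNAL = {parent for parent, _ in TREE.values()}
TIPS = sorted(node for node in TREE if node not in INTERNAL)


def faith_pd(tips):
    """Sum the lengths of all branches on the paths from tips to the root."""
    spanned = set()
    for tip in tips:
        node = tip
        while node in TREE and node not in spanned:
            spanned.add(node)
            node = TREE[node][0]
    return sum(TREE[b][1] for b in spanned)


def pd_significance(community, alpha=0.05):
    """Compare observed PD with the PD of every community of the same richness."""
    observed = faith_pd(community)
    null = [faith_pd(c) for c in combinations(TIPS, len(community))]
    frac_lower = sum(v < observed for v in null) / len(null)
    frac_higher = sum(v > observed for v in null) / len(null)
    if frac_higher > 1 - alpha / 2:
        return "significantly low PD (clustering)"
    if frac_lower > 1 - alpha / 2:
        return "significantly high PD (overdispersion)"
    return "not significant"


print(pd_significance(["x0", "x1", "x2", "x3"]))  # four close relatives
print(pd_significance(["x0", "x1", "y0", "z0"]))  # taxa spread across clades
```

Published analyses typically use many random iterations and null models that also preserve the range size of each taxon, so this sketch conveys only the comparison of an observed value against a null distribution.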

Figure 3

Statistical significance of PD

Statistical significance of phylogenetic diversity (PD) for (A) angiosperms and (B) butterflies across North America. Areas with significantly high values have taxa that are less closely related than expected by chance (blue), whereas areas with significantly low values have taxa that are more closely related than expected by chance (red).

The RPD randomization indicated that southern portions of North America have communities containing longer branches than expected under null models (Figure 4B). This included not only tropical regions but also semi-arid highlands and southern deserts into semi-arid plains and prairie. On the other hand, shorter than expected branch lengths were found across much of the Sierra Nevada, Rockies, and Intermountain West. We found no significant RPD in the Eastern Temperate Forest, northern Great Plains, and northernmost portions of North America.

Figure 4

Statistical significance of RPD

Statistical significance of relative phylogenetic diversity (RPD) for (A) angiosperms and (B) butterflies. Areas in blue have significantly longer branches than expected; areas in red have significantly shorter branches than expected.


CANAPE

Regions of significant neoendemism were located in the California Mediterranean region and western forests, including the Cascades, Coast Ranges, and Sierra Nevada (Figure 5B), and in transition zones across lower-elevation regions to the east. Mixed patterns of endemism with both paleo- and neo-endemics were found predominantly in warm deserts, the southeastern coastal plain, and southern, subtropical portions of Florida. Sites dominated by paleoendemism were rarer and were indicated only in some areas of tropical Mexico.
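The CANAPE categories follow a two-step decision rule that can be written compactly. The Python sketch below encodes the rule as it is commonly described for CANAPE, assuming each cell already has rank-based values from the randomizations for PE on the dated tree, PE on an equal-branch comparison tree, and their ratio (relative phylogenetic endemism, RPE). The function name, argument names, and thresholds are assumptions for illustration and may not match the exact settings used in this study.

```python
def canape_category(p_pe_orig, p_pe_alt, p_rpe, alpha=0.05):
    """Classify one grid cell.

    Each argument is the fraction (0 to 1) of null replicates that the observed
    value exceeds: PE on the dated tree, PE on the equal-branch comparison tree,
    and the ratio of the two (RPE).
    """
    # Step 1: one-tailed test for significantly high endemism on either tree.
    if max(p_pe_orig, p_pe_alt) <= 1 - alpha:
        return "not significant"
    # Step 2: two-tailed test on the ratio to separate the kinds of endemism.
    if p_rpe > 1 - alpha / 2:
        return "paleoendemism"  # rare branches are longer than expected
    if p_rpe < alpha / 2:
        return "neoendemism"    # rare branches are shorter than expected
    return "mixed endemism"     # significant endemism without a skew in branch length


print(canape_category(0.99, 0.50, 0.99))
print(canape_category(0.50, 0.98, 0.01))
print(canape_category(0.97, 0.96, 0.50))
```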

Figure 5

CANAPE results

CANAPE results showing statistically significant centers of phylogenetic endemism for (A) angiosperms and (B) butterflies. All cells that are colored have significantly high PE. Red cells have concentrations of rare short branches (neoendemism); blue cells have concentrations of rare long branches (paleoendemism), and purple cells have mixtures of neo- and paleoendemism.


Drivers of phylodiversity

Annual mean temperature was the most important environmental variable in predicting PD, followed by mean annual precipitation and precipitation seasonality (Table 2). Areas that were warmer and wetter generally had the highest PD. As well, higher elevation areas and those with more seasonal precipitation generally had higher PD, although these are weaker effects. PD significance differed from results for observed PD, with temperature being the only covariate that had a coefficient estimate larger than its standard error when accounting for spatial autocorrelation (Table 2). Warmer areas were more likely to have significantly high PD. Lower RPD in an area indicates relatively short branches, potentially indicative of more recent radiations. Results from analysis of climate and terrain drivers showed high RPD in areas that have higher temperature stability and higher precipitation. As well, areas with low elevation had higher RPD (Table 2). Spatial GLMMs showed that areas with significantly low RPD are in colder areas with less precipitation; however, the coefficient estimates of all covariates were less than the standard error.

Table 2

Summary of the top spatial GLMMs for PD, RPD, significant PD, and significant RPD

Model	Temp	Prec	Temp seas	Prec seas	Temp stab	Prec stab	Elev
PD	0.014 ± 0.005	0.010 ± 0.001		0.003 ± 0.001	−0.004 ± 0.005		0.003 ± 0.001
RPD		0.005 ± 0.002	0.003 ± 0.003	−0.002 ± 0.002	0.010 ± 0.005	−0.001 ± 0.003	−0.002 ± 0.001
PD Sig	53.1 ± 30.3	2.98 ± 18.82			−7.4 ± 21.64		1.55 ± 11.79
RPD Sig	57.1 ± 133	9.9 ± 28.8			−11.3 ± 72.4		7.6 ± 29.5

Numbers in the columns indicate changes in PD, RPD, and significant PD and RPD values ±standard error when variable values between locations increased by one standard deviation. Bolding denotes coefficients whose absolute values are greater than their standard error.
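Because the bolding referred to in the table note does not survive in plain text, the same criterion can be recomputed directly from the printed values. The Python sketch below transcribes Table 2 and lists, for each model, the covariates whose absolute coefficient exceeds its standard error; the variable and function names are illustrative.

```python
# (estimate, standard error) pairs transcribed from Table 2; empty cells are omitted.
TABLE_2 = {
    "PD": {
        "Temp": (0.014, 0.005), "Prec": (0.010, 0.001), "Prec seas": (0.003, 0.001),
        "Temp stab": (-0.004, 0.005), "Elev": (0.003, 0.001),
    },
    "RPD": {
        "Prec": (0.005, 0.002), "Temp seas": (0.003, 0.003), "Prec seas": (-0.002, 0.002),
        "Temp stab": (0.010, 0.005), "Prec stab": (-0.001, 0.003), "Elev": (-0.002, 0.001),
    },
    "PD Sig": {
        "Temp": (53.1, 30.3), "Prec": (2.98, 18.82),
        "Temp stab": (-7.4, 21.64), "Elev": (1.55, 11.79),
    },
    "RPD Sig": {
        "Temp": (57.1, 133.0), "Prec": (9.9, 28.8),
        "Temp stab": (-11.3, 72.4), "Elev": (7.6, 29.5),
    },
}


def exceeds_standard_error(model):
    """Return covariates whose |estimate| is strictly greater than the standard error."""
    return [name for name, (est, se) in TABLE_2[model].items() if abs(est) > se]


for model in TABLE_2:
    print(model, exceeds_standard_error(model))
```

Under this criterion temperature is the only covariate flagged for PD significance and no covariate is flagged for RPD significance, in line with the text above.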


Similarities and differences between butterfly and plant phylodiversity

Butterflies and plants of North America displayed a similar pattern of PD (= 0.34), and areas of discordance displayed moderate spatial structuring of residuals in our simple linear models with butterfly PD as a response variable to plant PD. Butterfly PD was higher in the tropics and lower along the west coast of North America than predicted by angiosperm PD (Figure 6). Surprisingly, butterflies and plants did not have similar patterns of RPD (= 0.01), and RPD showed strong spatial structuring of linear model residuals, with butterfly RPD being much higher in the west and lower in the south than predicted based on angiosperm RPD (Figure 6). Butterflies and plants both showed a pattern of having the highest PE values in southern Mexico, but overall, the similarity across North America was relatively weak ( = 0.10). The spatial residuals of the PE linear models mirrored PD in the southern portions of the continent but without spatially structured error in temperate regions of the continent (Figure 6).
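A simple linear model of this kind can be sketched in Python. The example fits an ordinary least squares line with a butterfly metric as the response and the matching angiosperm metric as the predictor, then returns per-cell residuals; positive residuals mark cells where the butterfly value is higher than predicted. The four cell values are hypothetical and the function name is an assumption.

```python
def ols_residuals(predictor, response):
    """Fit response = intercept + slope * predictor and return the fit and residuals."""
    n = len(predictor)
    mean_x = sum(predictor) / n
    mean_y = sum(response) / n
    sxx = sum((x - mean_x) ** 2 for x in predictor)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictor, response))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual = observed butterfly value minus the value predicted from plants.
    residuals = [y - (intercept + slope * x) for x, y in zip(predictor, response)]
    return slope, intercept, residuals


# Hypothetical per-cell values, not taken from the study.
plant_pd = [1.0, 2.0, 3.0, 4.0]
butterfly_pd = [2.0, 2.0, 5.0, 5.0]
slope, intercept, residuals = ols_residuals(plant_pd, butterfly_pd)
print(slope, intercept, residuals)
```

Mapping such residuals back onto grid cells is what reveals whether the mismatch between the two groups is spatially structured.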

Figure 6

Spatial residuals of univariate linear regressions for observed PD (A), RPD (B), and PE (C), where angiosperm metrics were used to predict butterfly metrics. High residual values (blue) represent areas where butterfly values are higher than predicted values.

Butterflies showed a strikingly different pattern of PD significance compared with angiosperms, with higher-than-expected values in the south and lower-than-expected values in the north (Figure 3). Angiosperms showed a strong pattern of having significantly lower-than-expected values of RPD in western North America and significantly high RPD in southern Mexico and eastern North America (Figure 4). Butterflies also had significantly high areas of RPD in tropical wet and dry forests in southern Mexico, but unlike flowering plants, they also exhibited high RPD in Baja California and the American Southwest. Also unlike angiosperms, butterflies did not show significantly higher RPD in eastern temperate forests. Both groups showed significantly low RPD in much of western North America (Figure 4).

Flowering plants and butterflies showed generally discordant patterns of endemism. Centers of mixed paleo- and neo-endemism for angiosperms were found in Mexico, including the Baja California peninsula, as well as in Florida and the adjoining southern coastal plain. Although CANAPE results for butterflies also showed mixed phylogenetic endemism in Florida, the results for the groups were otherwise quite different. Butterflies showed strong patterns of mixed endemism north of Mexico, in the warm deserts and portions of the colder deserts of the Southwest, along with predominantly neoendemism in coastal regions of the West and limited paleoendemism in southern Mexico (Figure 5).

Discussion

We present the first continental-scale phylodiversity analyses for butterflies, focusing on North America. This analysis is notable for being relatively complete, with coarse-scale distribution data for all species and a phylogeny with ~75% sampling of North American species. This level of completeness provides, for the first time, a well-resolved, continental-scale view of phylogenetic diversity for an entire insect superfamily. We also extended the range estimates beyond North America by gathering very coarse country-level range maps for the ~25% of butterfly species that have ranges outside of our defined North America boundaries. We argue this approach is better than simply truncating ranges of terminals for any analyses relying on a range-weighted metric, such as PE. Here, not including full ranges would have led to many neotropical butterflies having much smaller ranges ending at the border of Mexico rather than properly extending into Central and South America. However, even when one extends the range estimates of terminal taxa, this does not fully solve the "edge-effect" problem, because the range sizes of related taxa occurring outside the study area are still being left out, which means ranges of deeper branches may be poorly estimated, affecting PE. Thus, any study incorporating less than a globally complete assessment is still assessing primarily local endemism patterns (Daru et al., 2020). However, this effect will be relatively small for any branch with at least one wide-ranging terminal in the dataset, as the ranges of internal branches are calculated as the unions of the ranges of their terminal taxa, and such wide-ranging branches are strongly downweighted in the calculations.
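The effect of truncating ranges on a range-weighted metric can be shown with a small Python sketch. It assumes the usual formulation of PE, in which each branch contributes its length divided by the number of cells in its range, and an internal branch takes the union of its descendants' ranges. The tree, taxa, and cell identifiers are hypothetical; cells 3 and 4 stand in for areas outside the study region.

```python
# Hypothetical toy tree: child -> (parent, branch length). Not real data.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 2.0),
    "C": ("root", 3.0),
}


def branch_ranges(tip_ranges):
    """Give every branch the union of the cells occupied by its descendant tips."""
    ranges = {}
    for tip, cells in tip_ranges.items():
        node = tip
        while node in TREE:
            ranges.setdefault(node, set()).update(cells)
            node = TREE[node][0]
    return ranges


def phylogenetic_endemism(cell, tip_ranges):
    """Sum branch length / range size over all branches present in the cell."""
    ranges = branch_ranges(tip_ranges)
    return sum(TREE[b][1] / len(r) for b, r in ranges.items() if cell in r)


full_ranges = {"A": {1}, "B": {1, 2}, "C": {1, 2, 3, 4}}  # C extends beyond the region
truncated_ranges = {"A": {1}, "B": {1, 2}, "C": {1, 2}}   # C cut at the region edge

print(phylogenetic_endemism(1, full_ranges))
print(phylogenetic_endemism(1, truncated_ranges))
```

Truncating taxon C to the two cells inside the study region raises the PE of cell 1 from 3.25 to 4.0 in this toy case, which is the kind of artificial inflation that extending ranges is meant to avoid.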
Below we discuss how our results address key predictions regarding patterns of butterfly diversity across a continent with enormous habitat breadth, from the hot and dry deserts in the Southwest to the wet, tropical forests in the Yucatan, and the cold, arid environments of the taiga and tundra. In particular, we focus on processes that are likely to have shaped this diversity, based on both examination of climatic and topographic drivers, and via comparison with angiosperms. We explicitly predicted concordance of patterns and process because of strong associations between butterflies and their flowering plant hosts and because of the potential for similar responses to environmental changes given co-distribution across the same latitudinal and elevational gradients.

Patterns and drivers of phylogenetic diversity and endemism

Our results point especially to the importance of current climate drivers on phylogenetic diversity. We found that PD is highest in the warmest, wettest areas of the continent, along with regions of seasonal precipitation and along elevational gradients. RPD results point not to temperature, but rather temperature stability and precipitation as drivers. RPD was highest in regions exhibiting temperature stability over the past 21,000 years and high rainfall, perhaps representing conditions where highly divergent lineages could co-exist and where flowering plant diversity may also be unusually high. However, despite our predictions that temperature and precipitation stability would be key predictors of PD and RPD significance, they were never top predictors in any model. Rather, current annual temperature was the dominant driver of PD significance, and none of our predictor variables were particularly useful in determining drivers of RPD significance after including spatially correlated random effects. Below we summarize these results more thoroughly by major regions across North America, focusing on synthesis across geographic distance and environmental gradients.

Northern North America: we define Northern North America as regions that were mostly covered in ice during the Last Glacial Maximum 21,000 years ago. The northern portion of the continent showed low PD but not low RPD. The former suggests the importance of environmental filtering due to cold and seasonal conditions, and the latter suggests orbitally forced range expansions and contractions, but limited cladogenesis, across glacial cycles (Jansson and Dynesius, 2002). We argue that environmental filtering is more likely in a volant group with high reproductive rates (Pellissier et al., 2013), such as butterflies, in comparison to clades where dispersal can often lag behind changing conditions (Alexander et al., 2018). This is further supported by low PE and non-significant RPD and CANAPE results in this region (Figures 4 and 5), suggesting most species are wide-ranging and not recently radiating across the North.

Western United States: western areas south of past ice sheets and north of the warm deserts showed very strong patterns of significantly low PD and RPD. The former indicates potential environmental filtering, given steep elevational and climatic gradients that themselves were in flux during glacial-interglacial cycling during the Pleistocene (Thackray, 2008), whereas the latter might indicate that butterflies have undergone recent radiations in these areas. The potential for recent radiations would also suggest high levels of neoendemism in the region. This was not the case for the Intermountain West and Rocky Mountains; however, we did recover a strong signal of neoendemism in Mediterranean portions of California and western forests, extending into the western portions of the Great Basin.

Southern North America: the southern portion of North America showed particularly surprising results, especially in the warm deserts. PD and RPD were both significantly high in tropical regions of North America, consistent with the tropics as a museum for butterfly diversity (Farrera et al., 1999; Hostetler et al., 1999). In the Temperate Sierra and warm deserts, we found significantly high RPD but no indication of clustered or overdispersed PD. This novel result suggests phylogenetically old communities of butterflies in deserts, a climate that formed recently, during the mid-Pliocene (Axelrod, 1958). It has long been known that flowering plants found in this region are derived from related lineages in thornscrub and arid highlands that are phylogenetically much older (Axelrod, 1959). Butterflies in warm deserts may therefore also be connected to older lineages that persisted in subtropical, yet still seasonal, habitats in southern regions and less closely related to species in colder deserts in the Great Basin. Bioregionalizations and their associations derived from phylogenetic beta-diversity provide a next-step means to examine such questions (Mienna et al., 2020).

East of the Rockies: areas east of the Rockies, including the Great Plains and eastern temperate forests, are unremarkable in PD or RPD, which contrasts with spatial phylogenetic findings for flowering plants, as we discuss in detail below. We were particularly surprised that the Great Plains region, which became grassland-dominated during cooling in the Miocene and Pliocene, did not show accumulation of younger than expected lineages (i.e. significantly low RPD), as has been documented in other groups (Mishler et al., 2020). As well, tropical regions of Florida were not significantly higher in PD or RPD. However, tropical Florida and the nearby coastal plain do show significantly high levels of mixed PE, aligning with a known plant biodiversity hotspot (Myers et al., 2000).

Comparisons of spatial phylodiversity patterns between butterflies and flowering plants

Given strong ecological associations and shared abiotic drivers that have played out over long evolutionary timeframes, we predicted butterflies and plants to have similar patterns of phylodiversity (Kumar et al., 2009), while also acknowledging alternate hypotheses. This prediction is generally borne out in the western and northern portions of the continent, both shaped by recent perturbations including glaciation and aridification. However, our analyses also revealed striking differences. For example, butterflies and flowering plants did not show similar patterns of RPD, suggesting that diversification timing and rates between butterflies and angiosperms may not be associated at the scale and extent of this analysis. The reasons for these differences may be partially methodological. However, these results also suggest that shared historical forces and strong ecological associations can still lead to divergent historical and current biogeographic outcomes. We discuss more about methodological and biological rationales for similarities and differences below. Although we used consistent PD metrics across both studies, allowing for direct comparisons of outputs, sampling completeness varies dramatically between flowering plant and butterfly analyses. Sampling of flowering plants, which encompass more than an order of magnitude more named species than butterflies (Mishler et al., 2020), is less complete in terms of both phylogenetic (ca. 44% of taxa included) and spatial distribution information. Equally problematic, both phylogenetic and spatial sampling are known to be biased. Some regions, especially in the North, are still nearly unsampled in terms of digitally accessible flowering plant specimen records, based on results in Mishler et al. (2020). This contrasts sharply with other regions, such as coastal California, where species sampling is mostly complete at this scale.
These differences in completeness of sampling and spatial bias make strong assessments of patterns more challenging. Mishler et al. recovered a pattern of significantly low PD across the continent for seed plants, and our more phylogenetically restricted analysis of flowering plants shows the same result. This suggests that co-occurring species are always more closely related than expected by chance compared with the full pool of species. This result, not seen in butterflies, might be affected to some extent by incomplete sampling, but it is such a strong and uniform result that it likely points to some fundamental differences in evolutionary ecology between butterflies and plants. Significantly low PD (phylogenetic clustering) is most often taken to indicate habitat filtering due to phylogenetically conserved habitat preferences that result in close relatives tending to occur together in communities. It may well be that habitat preference has a higher level of conservation in seed plants than in butterflies, a possibility in need of future research. Although it has long been known that western North America has been dramatically reshaped by regional tectonism, orogeny, and climatic changes, the full magnitude of those impacts on flora and fauna besides vertebrates (Badgley, 2010) is just now starting to be understood (Pellissier et al., 2018). Plant spatial phylogenetic work (Mishler et al., 2020) has confirmed this in a spectacular fashion, with eastern temperate forests showing significantly older plant lineages than in the Great Plains and western portions of North America, both strongly shaped by cooling and aridification, showing more recent diversifications. We expected to find congruent results when examining RPD in butterflies. However, RPD between the two groups is not strongly correlated. Butterfly communities are comparatively younger in the West based on the spatial residual plots (Figure 6). 
As well, although some portions of the West show lower-than-expected RPD for both plants and butterflies, suggestive of more recent radiations there compared with other regions, we did not recover higher-than-expected butterfly RPD in the East as we did in plants. Much more of Mexico was significantly high in RPD in butterflies than in plants. These results suggest that less stable areas such as northern and western portions of North America may show moderate discordance when comparing across groups at different trophic levels but in a consistent manner. Butterflies likely have diversified in the shadow of a persistent, highly diverse angiosperm-dominated forest in eastern temperate North America. Given massive inequality in numbers of butterfly to plant species in the region, providing ample opportunity for evolving new host relationships, the overall effect is likely equilibrium between the two groups. In the West, more extensive perturbation caused by loss of continuous forests and continuing cooling and drying likely drove stronger disequilibrium, with butterflies following bursts of new plant lineages forming in that region. Spatial residual plots for relative phylogenetic diversity are supportive of this scenario (Figure 6), but further examination, in other herbivores, is warranted to see if such ordering effects may be more general. We hypothesize that areas with more active geologic histories will show evidence of lags in community ages between hosts and consumers/pollinators. Phylogenetic endemism patterns for plants and butterflies as seen in CANAPE were surprisingly discordant, with much stronger plant endemism found in southern and central Mexico compared with butterflies. Mishler et al. truncated ranges at southern edges of the region of interest, which might have led to more artificial range restrictions and higher endemism. 
This is particularly likely given that many widespread species with wet tropical affinities have range edges in the Yucatan of Mexico. Still, the discordance between plant and butterfly endemism is notable and may point to fundamental differences in ecological, evolutionary, and biogeographic processes between butterflies and plants worthy of further study. We call particular attention to the strong pattern of mixed butterfly phylogenetic endemism in all of the warm deserts of North America. Although more work is needed, it would be unsurprising if these warm deserts were generally areas of diversification and endemism for many clades, based on continuing floristic (Sosa et al., 2020) and faunistic work (Riddle and Hafner, 2006).

Importance for conservation

Butterflies are under threat, perhaps represented most iconically by monarchs and their decline (Thogmartin et al., 2017; Agrawal and Inamine, 2018), but the entire fauna may be imperiled (Wepprich et al., 2019). Results here provide a needed step for better prioritizing areas of highest conservation need. In particular, we document areas high in PE, PD, and RPD, which collectively harbor more accumulated evolutionary history (Redding et al., 2008) and which also contain the most range-restricted lineages. These areas are likely to be high priorities for conservation (Davies and Cadotte, 2011; Kling et al., 2019). Within North America, there are four well-documented biodiversity hotspots: the California Floristic Province, North American Coastal Plain, Madrean Pine-Oak Woodlands, and Mesoamerica (Reid, 1998; Noss et al., 2015). These areas were recognized due to their high plant diversity and endemism and threat of extinction (Myers et al., 2000). Although some hotspots have been long recognized, understanding where diversity is highest, rarest, and most threatened is still a work in progress; for example, the North American Coastal Plain region was only recently recognized as a hotspot (Noss et al., 2015), and further work across the Tree of Life is still needed to discover areas harboring the most unique and rare diversity. Butterfly PD and PE patterns are also higher than expected within all four of these hotspots, strengthening arguments about protecting habitat in these areas. However, our results also uncovered a new hotspot showing significant endemism, relatively old lineages (based on significantly high RPD), and high PD in the warm deserts of North America. Our results make a strong case for habitat conservation across warm deserts in particular, especially because they are not already documented biodiversity hotspots.

Limitations of the study

This work points the way to still broader and more detailed future projects on insect and plant spatial phylogenetics. Key next steps include further closing spatial and phylogenetic knowledge gaps regionally and globally. On the spatial side, this work relied on coarse-grain range maps, limiting finer localizations along shorter spatial and environmental gradients such as in topographically heterogeneous regions (Bini et al., 2009). On the phylogenetic side, although our species sampling is relatively strong, covering ~75% of North American species, many tips are represented by a single locus, i.e. COI. This leads to limited locus coverage (69% missing data), which although not uncommon in supermatrix approaches (Sanderson et al., 2010), is still an area where sampling improvement is warranted. We attempted to mitigate issues with coverage, in part, by using a maximum likelihood framework because Bayesian frameworks tend toward more biased topologies due to interactions between missing data and priors (Lemmon et al., 2009). We further opted to use a family-level backbone constraint from Espeland et al. (2018), although initial, unconstrained analyses resulted in the monophyly of all seven families. We also note issues with time calibration, and although congruification methods for dating trees are becoming commonplace, issues can arise when relationships at the analogous nodes between the target tree and the reference tree differ (Eastman et al., 2013). Because our custom-built phylogeny was constrained at the analogous nodes of the reference tree (Espeland et al., 2018), these issues are at least minimized here. Finally, we did not directly consider known butterfly host-plant associations in this work, which may be especially important, given recent work finding the majority of Lepidoptera are supported by a few important plant genera (Narango et al., 2020). 
That information, incorporated into a spatial phylogenetic framework, would deliver a stronger process-oriented understanding of spatial co-diversification that has shaped terrestrial ecosystems.

Resource availability

Lead contact

Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Chandra Earl (sunray1@ufl.edu).

Materials availability

This study did not generate new unique reagents.

Data and code availability

All sequence data custom generated from this project are available on the Barcode of Life Database (http://dx.doi.org/10.5883/DS-56788) and GenBank (Accession Numbers MW807620 - MW807751). Gridded summary products, DNA alignments, phylogenies and information about locus origin are available on Dryad (https://doi.org/10.5061/dryad.00000002j). Maps and country level list data for taxa with ranges outside of North America are available on Dryad (https://doi.org/10.5061/dryad.00000002j). The final phylogeny is available on Open Tree of Life (https://tree.opentreeoflife.org/curator/study/view/ot_2014).

Methods

All methods can be found in the accompanying transparent methods supplemental file.

39 in total

1. The evolution of plant-insect mutualisms.

Authors: Judith L Bronstein; Ruben Alarcón; Monica Geber
Journal: New Phytol Date: 2006 Impact factor: 10.151

2. PartitionFinder 2: New Methods for Selecting Partitioned Models of Evolution for Molecular and Morphological Phylogenetic Analyses.

Authors: Robert Lanfear; Paul B Frandsen; April M Wright; Tereza Senfeld; Brett Calcott
Journal: Mol Biol Evol Date: 2017-03-01 Impact factor: 16.240

3. Phylogenetics of moth-like butterflies (Papilionoidea: Hedylidae) based on a new 13-locus target capture probe set.

Authors: Akito Y Kawahara; Jesse W Breinholt; Marianne Espeland; Caroline Storer; David Plotkin; Kelly M Dexter; Emmanuel F A Toussaint; Ryan A St Laurent; Gunnar Brehm; Sergio Vargas; Dimitri Forero; Naomi E Pierce; David J Lohman
Journal: Mol Phylogenet Evol Date: 2018-06-11 Impact factor: 4.286

4. treePL: divergence time estimation using penalized likelihood for large phylogenies.

Authors: Stephen A Smith; Brian C O'Meara
Journal: Bioinformatics Date: 2012-08-20 Impact factor: 6.937

5. Phylogenomics with incomplete taxon coverage: the limits to inference.

Authors: Michael J Sanderson; Michelle M McMahon; Mike Steel
Journal: BMC Evol Biol Date: 2010-05-25 Impact factor: 3.260

6. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies.

Authors: Alexandros Stamatakis
Journal: Bioinformatics Date: 2014-01-21 Impact factor: 6.937

7. FASconCAT-G: extensive functions for multiple sequence alignment preparations concerning phylogenetic studies.

Authors: Patrick Kück; Gary C Longo
Journal: Front Zool Date: 2014-11-18 Impact factor: 3.172

8. Butterfly abundance declines over 20 years of systematic monitoring in Ohio, USA.

Authors: Tyson Wepprich; Jeffrey R Adrion; Leslie Ries; Jerome Wiedmann; Nick M Haddad
Journal: PLoS One Date: 2019-07-09 Impact factor: 3.240

9. Endemism patterns are scale dependent.

Authors: Barnabas H Daru; Harith Farooq; Alexandre Antonelli; Søren Faurby
Journal: Nat Commun Date: 2020-04-30 Impact factor: 14.919

10. GenBank.

Authors: Karen Clark; Ilene Karsch-Mizrachi; David J Lipman; James Ostell; Eric W Sayers
Journal: Nucleic Acids Res Date: 2015-11-20 Impact factor: 16.971
