Julie M Allen1, Charlotte C Germain-Aubrey2, Narayani Barve2, Kurt M Neubig3, Lucas C Majure4, Shawn W Laffan5, Brent D Mishler6, Hannah L Owens7, Stephen A Smith8, W Mark Whitten2, J Richard Abbott9, Douglas E Soltis10, Robert Guralnick11, Pamela S Soltis12. 1. Department of Biology, University of Nevada Reno, Reno, NV 89557, USA; Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA. Electronic address: jallen23@unr.edu. 2. Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA. 3. Department of Plant Biology, Southern Illinois University, Carbondale, IL 62901, USA. 4. Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA; Department of Research, Conservation and Collections, Desert Botanical Garden, Phoenix, AZ 85008, USA. 5. School of Biological, Earth and Environmental Sciences, The University of New South Wales, Sydney, Australia. 6. University and Jepson Herbaria, and Department of Integrative Biology, University of California, Berkeley, Berkeley, CA 94720-2465, USA. 7. Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA; Center for Macroecology, Evolution and Climate, Natural History Museum of Denmark, University of Copenhagen, 2100 Copenhagen O, Denmark. 8. Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48103, USA. 9. Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA; School of Mathematical and Natural Sciences, The University of Arkansas at Monticello, Monticello, AR 71655, USA. 10. Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA; Genetics Institute, University of Florida, Gainesville, FL 32608, USA; Department of Biology, University of Florida, Gainesville, FL 32611, USA; Biodiversity Institute, University of Florida, Gainesville, FL 32611, USA. 11. 
Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA; Biodiversity Institute, University of Florida, Gainesville, FL 32611, USA. 12. Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA; Genetics Institute, University of Florida, Gainesville, FL 32608, USA; Biodiversity Institute, University of Florida, Gainesville, FL 32611, USA.
Abstract
Recent availability of biodiversity data resources has enabled an unprecedented ability to estimate phylogenetically based biodiversity metrics over broad scales. Such approaches elucidate ecological and evolutionary processes yielding a biota and help guide conservation efforts. However, the choice of appropriate phylogenetic resources and underlying input data uncertainties may affect interpretation. Here, we address how differences among phylogenetic source trees and levels of phylogenetic uncertainty affect these metrics and test existing hypotheses regarding geographic biodiversity patterns across the diverse vascular plant flora of Florida, US. Ecological niche models for 1,490 Florida species were combined with a "purpose-built" phylogenetic tree (phylogram and chronogram), as well as with trees derived from community resources (Phylomatic and Open Tree of Life). There were only modest differences in phylodiversity metrics given the phylogenetic source tree and taking into account the level of phylogenetic uncertainty; we identify similar areas of conservation interest across Florida regardless of the method used.
Introduction
The recent explosion of biodiversity data (spatial and genetic) and environmental data (regarding climate, terrain, and vegetation), together with novel analytical methods and tools, has enabled an unprecedented capability to model species distributions and assemble those results into broad-scale diversity assessments (e.g., Tittensor et al., 2010, Ezard et al., 2011, Olalla-Tárraga et al., 2011, Nagalingum et al., 2015). Linking spatial ecological patterns to phylogenetic information is more powerful still (Mishler et al., 2014, Nagalingum et al., 2015) given that species assemblages encompassing deeper phylogenetic nodes and more evolutionary history are arguably more diverse than other areas with the same number of species connected via shallower nodes (Faith, 1992). Phylogenetic approaches extend diversity measurements from simplistic species counts to measures that also inform evolutionary pattern and process.
One of the key measures in spatial phylogenetics is phylogenetic diversity (PD; Faith, 1992). PD is calculated as the sum of branch lengths from a phylogenetic tree connecting the terminal taxa from a specific location, typically to the root of the tree. PD can be interpreted either as the amount of “feature diversity” contained within a region of interest when using a phylogram, i.e., the number of apomorphies present in an area, or as the amount of “evolutionary history” when using a time-calibrated chronogram (Davies and Buckley, 2012, Rosauer, 2010). Those regions with higher PD than others may be prioritized for conservation (i.e., as containing higher genetic diversity or a greater amount of evolutionary history), although there are obviously other potential criteria, such as threat status, that should be applied in conservation assessments (Jetz and Freckleton, 2015). PD is typically strongly correlated with species richness, because more terminal taxa in a sample means that a larger portion of the tree is expected to be sampled. Mishler et al. (2014) developed a compound spatial phylogenetic metric, Relative Phylogenetic Diversity (RPD), designed to examine whether unusually long or unusually short branches are present in a location. PD and RPD measures along with associated randomization tests can help elucidate the evolutionary processes that have generated biotas, which in turn support stronger assessments of conservation priorities.
The evolutionary trees used in spatial phylogenetic studies are often not built by the authors performing the study, and tree building is not necessarily afforded the same level of scrutiny as analysis of spatial ecological data, despite the critical importance of trees for rigorous inference. Instead, ecologists have often relied on (1) converting a taxonomic hierarchy directly into a tree (e.g., Davies et al., 2007), (2) shortcut trees constructed for focal species via Phylomatic software (e.g., Webb and Donoghue, 2005, Webb et al., 2008, Wright et al., 2007, Liu et al., 2013) or the Open Tree of Life (OTL; Hinchliff et al., 2015), (3) automated assembly of published sequences such as PhyloGenerator (Pearse and Purvis, 2013), (4) literature-based trees (e.g., Beaulieu et al., 2012), or (5) framework trees vetted by the phylogenetics community (e.g., The Angiosperm Phylogeny Group IV, 2016, Soltis et al., 2011).
Despite the relative ease of acquiring such trees, their quality and inherent uncertainties are rarely examined, and the impact of these factors on PD assessment has not been well studied (but see Qian et al., 2015, Swenson, 2009, Molina-Venegas and Roquet, 2013, Rangel et al., 2015, Thornhill et al., 2017). There remains a need to better document how factors involved in constructing phylogenetic trees (e.g., phylogenetic uncertainty and taxon sampling) influence these metrics. 
Both tree topology and branch lengths are determined by the sampling of taxa and the gene sequences employed, and these factors must be considered when computing and interpreting PD measures. For example, limited taxonomic sampling from a tree will produce longer individual branches than are truly present, whereas limited sampling of genetic data may result in unrepresentative branch lengths. Likewise, the use of phylograms versus chronograms yields branch length differences and therefore different values for PD measures. The phylogenetic depth over which trees are computed will also affect the magnitude of these metrics: older clades have longer branches in a chronogram and therefore contribute to higher estimates of PD than younger clades. Finally, failure to account for tree uncertainty might inflate the confidence in a given result. This last issue is particularly underexplored, but crucial for interpreting PD values.
Here we provide a comprehensive examination of how the choice of input phylogenetic trees and inclusion of phylogenetic uncertainty affect the assessment of PD measures, utilizing Florida vascular plants as a case study. To test the importance of input trees, we developed phylogenetic trees for the specific purpose of estimating biodiversity through integration with distribution models. A key rationale for doing so was to determine if a more comprehensive, well-developed, and purpose-built phylogenetic tree would yield different estimates of PD relative to those built from easily available and existing trees obtained, for example, using Phylomatic (Webb and Donoghue, 2005), or by pruning a subtree from a pre-assembled supertree (i.e., the OTL; Hinchliff et al., 2015).
We chose Florida as the focus for study because it is home to approximately 4,300 species of native or naturalized vascular plants and a broad range of terrestrial and aquatic habitats (Wunderlin et al., 2017). 
Furthermore, Florida is part of the North American Coastal Plain biodiversity hotspot (Noss et al., 2015). Florida's flora ranges from temperate, eastern deciduous forest taxa in the north to tropical elements in central and southern Florida (Myers and Ewel, 1990); these unique floristic elements mix at transition zones, leading to novel communities that might be expected to have unusual phylogenetic affinities. At the same time, past climatic changes caused inundation of much of the state, forming ancient shorelines, such as the Lake Wales Ridge (LWR), that still harbor an unusual, highly endemic scrub flora and fauna (Dobson et al., 1997). The southern portion of Florida has a subtropical climate and includes unique ecoregions such as the Everglades, Big Cypress and Miami Ridge, and Pine Rocklands, each with characteristic floristic elements (Long and Lakela, 1971). Florida also supports the third highest concentration of federally sensitive, threatened, and endangered species in the United States (Ihlo et al., 2014), after California and Hawaii (Dobson et al., 1997). Furthermore, one-third of the flora of Florida is now composed of exotic species (either naturalized or invasive), and habitat loss due to human development is mounting (Gordon, 1998). Still, despite the magnitude of ecological and conservation concerns in this region, little is known about the overall geographic patterns of plant diversity in Florida.
The present study had both empirical and methodological goals. Our empirical goal was to test hypotheses regarding patterns of Florida biodiversity derived from previous studies of forest types, vertebrates, and butterflies. In particular, previous work documented an overall decrease in diversity from north to south in Florida, although this pattern was only assessed qualitatively based on maps of richness from a variety of vertebrates and butterflies. 
These reversed patterns of diversity, when compared with general latitudinal diversity gradients (Wiens et al., 2009, Buckley et al., 2010), may relate more to the unique transitional zones from temperate to tropical floras in Florida and the underlying climate, soil, and terrain of the region than to temperature. Previous work has noted that transitional areas, such as the Southern Coastal Plain ecoregion and northern peninsular Florida with southern hardwood forests and temperate broad-leaved evergreen forests, harbor particularly high diversity (Greller, 1980). Observed PD patterns in plants should be strongly concordant with previous hypotheses of diversity, but some geographic areas may harbor unexpectedly high PD, such as the Miami Ridge ecoregion and its tropical hammock forest flora (Myers and Ewel, 1990). In such areas, where there may be mixing of floristic elements, we predicted concentrations of significantly overdispersed (i.e., even) lineages (based on PD) and concentrations of unusually long branches (based on RPD). We further predicted concentrations of significant phylogenetic clustering in areas wherein habitat may select for specific community members (also called “habitat filtering”), as well as significant concentrations of shorter-than-expected branches in areas where lineages have potentially diversified in situ, such as the LWR. Finally, we attempted to contextualize these findings from a conservation perspective, given ongoing rapid anthropogenic changes to native landscapes in Florida.
Our methodological goals were to explore the effect of the choice of phylogenetic tree on spatial phylogenetic metrics (PD and RPD) and to provide an approach to account more effectively for sources of uncertainty in phylogenetic trees. We generated PD and RPD using a variety of input phylogenetic trees and compared the results using multiple approaches to understand how to interpret differences and uncertainty in these assessments. 
We expected greater variation among branch lengths across the tree in chronograms than in phylograms, as branches can often be either greatly lengthened or shortened, reflecting constraints of evolutionary time. This difference was predicted to affect the distribution of observed PD, but the impact on significance tests is poorly characterized (but see Thornhill et al., 2017). We also examined how spatial phylogenetic metrics vary between trees pruned from existing supertrees and those inferred from curated analysis where stringent efforts have been made to close gaps in taxon sampling, using a strategic approach for gene sampling and branch length assessments. Finally, we used a Bayesian framework to generate a distribution of trees representing uncertainty in phylogenetic estimates to assess the impacts on PD.
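As a concrete illustration of the two metrics introduced above, the following minimal sketch computes PD as the summed length of the branches linking a set of taxa to the root. It also computes RPD as the ratio of proportional PD on the original tree to proportional PD on a comparison tree with the same topology and equal branch lengths. The RPD formula follows the usual description of Mishler et al. (2014) and is an assumption here, because it is not spelled out in this text. The toy tree, its branch lengths, and the function names are hypothetical.

```python
# Toy rooted tree stored as child -> (parent, branch_length).
# Names and lengths are invented for illustration only.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 2.0),
    "C": ("n2", 0.5),
    "D": ("n2", 0.5),
    "n2": ("root", 4.0),
}


def branches_to_root(tip, tree):
    """Return the branches (named by their child node) from a tip up to the root."""
    path = set()
    node = tip
    while node in tree:  # the root has no entry, so the walk stops there
        path.add(node)
        node = tree[node][0]
    return path


def faith_pd(taxa, tree):
    """PD: total length of the branches linking the taxa to the root."""
    used = set()
    for tip in taxa:
        used |= branches_to_root(tip, tree)
    return sum(tree[branch][1] for branch in used)


def rpd(taxa, tree):
    """Assumed RPD: proportional PD on the real tree / proportional PD on an equal-branch tree."""
    equal = {child: (parent, 1.0) for child, (parent, _) in tree.items()}
    total_real = sum(length for _, length in tree.values())
    total_equal = float(len(equal))
    pd_real = faith_pd(taxa, tree) / total_real
    pd_equal = faith_pd(taxa, equal) / total_equal
    return pd_real / pd_equal
```

In this toy case, a cell containing only C and D sits on one long internal branch and so has RPD above 1, whereas a cell containing only A and B has RPD below 1.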
Results
Ecological Niche Models for Generating Species Lists per Pixel
Validation metrics across all models were high, with training Area Under the Curve (AUC) scores of >0.8 and test AUC scores within 0.15 of the training scores in nearly all cases. A small proportion of models had significantly worse performance, wherein the difference between training and testing was >0.5. Such metrics were often due to low sample sizes; we removed species with outlier AUC scores >3 standard deviations away from the mean from our final analysis. Ultimately, we accepted 1,490 models (i.e., one per species), rejecting 12 species with poor model performance. Figure 1 shows species richness based on stacked models for Florida at a 4-km resolution, ranging from a low of 57 species to a high of 856. We do not focus on taxonomic measures of richness here, but instead utilize the lists of species per 4- × 4-km pixel to measure PD. For one cell in Figure 1 colored red (found in the central peninsula of Florida), we show all the phylogenetic branches linking taxa present in that cell. PD was highly correlated with species richness (Figure S2). Observed PD measures across the state are summarized in Figure 2 for both the phylogram and chronogram.
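The outlier rule described above (dropping species whose AUC lies more than 3 standard deviations from the mean) can be sketched as follows. This is a minimal illustration only: whether the training or test AUC was screened, and whether a sample or population standard deviation was used, are not stated here, so the population form and the function name are assumptions.

```python
from statistics import mean, pstdev


def filter_auc_outliers(auc_by_species, n_sd=3.0):
    """Keep species whose AUC is within n_sd standard deviations of the mean AUC."""
    scores = list(auc_by_species.values())
    mu = mean(scores)
    sd = pstdev(scores)  # population SD; an assumption, the text does not specify
    return {
        species: auc
        for species, auc in auc_by_species.items()
        if sd == 0 or abs(auc - mu) <= n_sd * sd
    }


# Hypothetical scores: 19 well-performing models and one poor one.
example = {f"species_{i}": 0.9 for i in range(19)}
example["species_poor"] = 0.3
kept = filter_auc_outliers(example)
```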
Figure 1
Phylogram, Chronogram and Species Richness of Vascular Plants in Florida
(A) Phylogeny of 1,490 vascular plants in Florida, shown as both chronogram (left) and phylogram (right). The black dots on the chronogram indicate the positions of the 17 calibration points.
(B) A map showing species richness, with PD from one grid cell highlighted in red on the phylogram and the chronogram.
Figure 2
Phylogenetic Diversity of Vascular Plants in Florida
Observed (top panel) and significant (bottom panel) phylogenetic diversity measured from phylogram (left) and chronogram (right) for vascular plants. On the top panel, the Environmental Protection Agency Level III ecoregions are mapped with the darker lines, Level IV are mapped with the lighter lines, and areas of interest, e.g., Lake Wales Ridge and Miami Ridge, are identified.
Florida Plant Phylogeny Is Consistent with Previous Literature
The relationships among vascular plants based on the two plastid genes for the species sampled from Florida agree closely with the results of previous phylogenetic analyses based on more genes and taxa. For example, we recovered the major subclades of angiosperms, and relationships within and among those are also in agreement with broader analyses (Figure 1; Moore et al., 2007, Soltis et al., 2011, Ruhfel et al., 2014, Wickett et al., 2014). Long branches are pronounced in the lycophytes, monilophytes, and gymnosperms, as well as in parasites, which tend to have increased substitution rates in plastid genes because of the loss of functionality, and in herbaceous lineages, wherein longer branches would be expected compared with woody relatives (Smith and Donoghue, 2008). Lineages known to have radiated rapidly (e.g., Asteraceae) also exhibit generally shorter branches as expected. Likewise, there are clades of very low phylogenetic resolution because of short branch lengths. This is particularly prevalent in Asteraceae and Poaceae, two of the most species-rich families of angiosperms in general and in Florida. The overall phylogenetic framework of the vascular plants of Florida is highly similar to the accepted framework based on broader geographic analyses, and relationships within subclades of vascular plants also reflect those found in other studies.
The few instances in which the topology differs from published analyses are all minor deviations from expectations and result from the limited dataset (i.e., matK, rbcL; only Florida plants) employed here. These deviations (e.g., a genus with a single sampled species in Florida placed in a related genus with multiple species rather than as sister to that clade, or species of two closely related genera interdigitated) also likely reflect sampling issues and are surprisingly minor, given that the vascular plants of Florida are a small subset of global diversity. 
Furthermore, most inconsistencies between the Florida tree and more broadly sampled trees occur in regions of the phylogeny where relationships continue to be difficult to resolve (e.g., Lamiales and Asteraceae). Finally, uncertainty in a few very short branches is not likely to confound PD analyses, because the better-supported long branches contribute the great majority of PD (González-Orozco et al., 2016). Calibrating this tree with the fossil constraints (Table S4) yielded a chronogram for comparison with the phylogram in downstream analyses (Figure 1); the chronogram has smoothed branch lengths relative to the phylogram (see Figure 1 for comparison).
Spatial Phylogenetic Patterns across Florida
Phylogenetic Diversity Patterns
Plotting PD across Florida relative to the Level III and Level IV Environmental Protection Agency ecoregions (Figure 2) and to latitude (Figure 3) revealed both ecological and geographic patterns. The highest PD occurred from the northern parts of peninsular Florida south to near Orlando and St. Petersburg. For both the phylogram and chronogram, PD was higher in central Florida than in the northern and southern parts of the state (Figure 2). South Florida showed a mix of patterns longitudinally; the Everglades and Big Cypress had relatively low PD, whereas Miami Ridge, at the same latitude, had relatively high PD (visible as the long tail toward positive PD values in Figure 3 at latitude 26.5°N). The average PD values for the ecoregions also showed higher PD in the Southern Coastal Plain across peninsular Florida than elsewhere (Table 1).
Figure 3
Phylogenetic Diversity by Latitude
The study region was binned by 0.5° into 13 latitudinal sections represented by the lines on the maps on the right. Bean plots on the left represent the phylogenetic diversity values for the pixels within each section for both the phylogram (top) and chronogram (bottom).
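The latitudinal binning described in this caption can be sketched as below, assuming each pixel is represented by a (latitude, PD) pair and that each band is identified by its southern edge. Both conventions and the example values are hypothetical.

```python
import math
from collections import defaultdict


def bin_by_latitude(pixels, width=0.5):
    """Group PD values into latitude bands of the given width, keyed by southern edge."""
    bands = defaultdict(list)
    for latitude, pd_value in pixels:
        south_edge = math.floor(latitude / width) * width
        bands[south_edge].append(pd_value)
    return dict(bands)


# Hypothetical pixels: two near 26.5-27.0°N and one near 30.0-30.5°N.
bands = bin_by_latitude([(26.6, 0.30), (26.9, 0.50), (30.2, 0.40)])
```

Each list of values in `bands` would then feed one bean plot.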
Table 1
PD Calculations and SD for the Cells Contained in Three Ecoregions for the Phylogram and Chronogram
Tree         Ecoregion                          Mean     SD
Phylogram    Southeastern Plains                0.3997   0.0917
Phylogram    Southern Coastal Plains            0.5071   0.0987
Phylogram    Southern Florida Coastal Plains    0.3554   0.0512
Chronogram   Southeastern Plains                0.3621   0.0868
Chronogram   Southern Coastal Plains            0.4214   0.0968
Chronogram   Southern Florida Coastal Plains    0.2553   0.0426
Patterns of Significant Clustering or Evenness
Areas showing significantly high or low PD differed when measured on the phylogram versus the chronogram (Figure 2). The chronogram-derived values showed a strong pattern of evenness in the northern and central areas of Florida, especially in the Southeastern Coastal Plain ecoregion. The chronogram-based results suggest that significantly more evolutionary history than expected is assembled in the north central part of the state and significantly less than expected is assembled in the southern part of the state. In contrast, PD significance based on the phylogram, which more directly represents feature diversity, showed phylogenetic clustering in several regions of the state, particularly along the northwestern coast of Florida, with very little evenness anywhere.
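To make the notion of "significantly high or low PD" concrete, the sketch below runs a simple randomization test on a toy tree. It is an illustration under stated assumptions, not the procedure used in this study: the null model (drawing the same number of taxa at random from all tips), the number of randomizations, the two-tailed 0.05 threshold, and all names are hypothetical.

```python
import random

# Toy tree as child -> (parent, branch_length): two close sister tips plus 18 long, independent tips.
TREE = {"t0": ("n", 0.125), "t1": ("n", 0.125), "n": ("root", 1.0)}
TREE.update({f"t{i}": ("root", 5.0) for i in range(2, 20)})
TIPS = [f"t{i}" for i in range(20)]


def faith_pd(taxa, tree):
    """Sum the lengths of all branches linking the taxa to the root."""
    used = set()
    for node in taxa:
        while node in tree:  # walk upward until the root is reached
            used.add(node)
            node = tree[node][0]
    return sum(tree[branch][1] for branch in used)


def pd_significance(taxa, tree, tips, n_rand=999, alpha=0.05, seed=0):
    """Compare observed PD with PD of random draws of equally many tips (assumed null model)."""
    rng = random.Random(seed)
    observed = faith_pd(taxa, tree)
    null = [faith_pd(rng.sample(tips, len(taxa)), tree) for _ in range(n_rand)]
    frac_below = sum(value < observed for value in null) / n_rand
    frac_above = sum(value > observed for value in null) / n_rand
    if frac_below >= 1 - alpha / 2:
        return "even"  # more PD than expected for this many taxa
    if frac_above >= 1 - alpha / 2:
        return "clustered"  # less PD than expected for this many taxa
    return "not significant"
```

In this toy tree, the two sister tips form a significantly clustered assemblage, whereas a pair of independent long-branched tips is indistinguishable from a random draw.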
Relative Phylogenetic Diversity Patterns
Patterns of RPD also differed substantially between the phylogram and chronogram. Geographic areas of major difference between the chronogram- and phylogram-derived values include (Figure 4): (1) northern Florida, where the chronogram yielded high observed RPD and a much more extensive concentration of significantly high RPD (i.e., longer branches than expected) than yielded by the phylogram; (2) central Florida, where the phylogram generally did not show significantly low RPD (i.e., shorter branches than expected), whereas the chronogram did; and (3) very southern Florida, including the Miami Ridge area and the Everglades, where the phylogram resulted in high observed RPD and a much larger concentration of significantly high RPD than the chronogram.
Figure 4
Relative Phylogenetic Diversity
Observed (top panel) and significant (bottom panel) relative phylogenetic diversity measured from (A and C) phylogram and (B and D) chronogram for vascular plants.
Alternative Source Trees Yield Both Similarities and Differences in Patterns of PD
Similar patterns of PD emerged among the alternative source trees (from Phylomatic and the OTL) and the purpose-built tree (Figure 5). Similarities are expected between the purpose-built trees and those from the OTL as the branch lengths were determined from the same alignment. Each source phylogram produced PD values with significant clustering in the Florida panhandle and along peninsular Florida, whereas each source chronogram produced PD values showing evenness along the northern edge of the panhandle. We found that the OTL trees and purpose-built trees were the most similar, with fewer than 5% of the cells showing a difference in significance for phylogram-derived values and fewer than 20% of the cells showing a difference in significance for those based on the chronograms. Measures using the Phylomatic trees, despite having fewer deleted taxa, showed more differences from the purpose-built trees; > 25% of the cells showed a different significance result (Figure 5; Table 2). Taxon sampling differences among methods modestly affected this spatial phylogenetic metric (e.g., with fewer taxa, only the area east of LWR was prominent as an area of significant clustering in the phylogram; Figure 5B).
Figure 5
Diversity Hypothesis Tests Comparing Chronograms and Phylograms for Vascular Plants Built Using Either Our Purpose-Built Tree, Phylomatic, or Open Tree
In the top panel are phylograms and chronograms for the purpose-built tree pruned to the (A) Phylomatic and (B) Open Tree taxon dataset. In the middle panel are the (A) Phylomatic and (B) Open Tree trees. In the lower panel are the differences between the two maps. Gray pixels are those that changed in significance level between the Phylomatic and purpose-built tree and the Open Tree and purpose-built tree.
Table 2
Number of Cells Showing Different Results between the Tree Resources
Comparison with Purpose Built Tree   OpenTree   Phylomatic
Phylogram                            373        2,126
Chronogram                           1,517      1,945
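The cell counts in Table 2 can be converted to percentages with the short calculation below. It assumes that the grid contains 8,163 cells in total, which is the sum of either column of Table 3; that the two tables share the same grid is an assumption, not something stated in the text.

```python
# Total cell count, taken as the column sum of Table 3 (assumed to be the full grid).
TOTAL_CELLS = 642 + 1013 + 6508

# Cells whose significance differed from the purpose-built tree (Table 2).
DIFFERING_CELLS = {
    ("OpenTree", "phylogram"): 373,
    ("OpenTree", "chronogram"): 1517,
    ("Phylomatic", "phylogram"): 2126,
    ("Phylomatic", "chronogram"): 1945,
}


def percent_differing(counts, total):
    """Express each count of differing cells as a percentage of all cells."""
    return {key: round(100 * n / total, 1) for key, n in counts.items()}


percentages = percent_differing(DIFFERING_CELLS, TOTAL_CELLS)
```

Under this assumption the OpenTree comparisons come to about 4.6% (phylogram) and 18.6% (chronogram) of cells, and the Phylomatic comparisons to about 26.0% (phylogram) and 23.8% (chronogram); a different denominator would shift these values.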
Tree Uncertainty Has Little Effect on Significance of Phylogenetic Diversity Scores
The standard deviations across the PD scores calculated from the 100 chronograms were larger in general than those calculated from the 100 phylograms, with the similarities between the two maps most prominent in the panhandle and far southern Florida (Figures 6A–6C). Significant clustering or evenness for the 100 Bayes trees is summarized in Figures 6D and 6E. As might be expected, no cells showed a change from significant clustering to significant evenness, or vice versa, across the 100 trees, yet some cells showed relatively high inconsistency in significance level in one direction or another. Key questions were whether phylogenetic uncertainty might lead to widespread errors in assessment of PD significance, particularly if changes due to uncertainty are spatially clustered, leading to geographic bias. Although there were some cells for which significance changed, the overall pattern does not suggest spatial structuring in uncertainty of PD significance. Approximately 80% of all pixels were consistent in significance for values derived from both the chronogram and the phylogram, whereas only 7% of the pixels showed the highest level of uncertainty and these were widely scattered (Table 3). Finally, general patterns of PD across latitude do not change when we include all the PD calculations across the 100 trees (Figure 7).
Figure 6
Uncertainty of Phylogenetic Diversity
Top: Standard deviation of observed Phylogenetic Diversity across all (A) 100 phylograms and (B) 100 chronograms selected from the post-burn-in distribution of trees in our Bayesian analysis. (C) Difference between the two on the right in blue. Bottom: Areas in light blue are those for which 91 (of 100) or more of the trees had the same level of significance. Areas in yellow are mostly consistent, with 71–90 (of 100) of the trees finding the same level of significance. Areas in red are the relatively inconsistent pixels, with only 50–70 of the trees having the same level of significance for (D) 100 phylograms and (E) 100 chronograms.
Table 3
Number of Cells in Each Class of Uncertainty
Class(a)   Phylogram   Chronogram
50-70      642         645
71-90      1,013       997
91-100     6,508       6,521
(a) Class indicates the number of trees (of 100) that agreed on the level of significance for a pixel. For example, the 50-70 class indicates that for those pixels 50-70 of the trees had the same level of significance and the remaining 30-50 did not. For the 71-90 class more of the trees agreed, and for the 91-100 class the great majority of the trees found the same level of significance, suggesting that those pixels are consistent when taking into account uncertainty in the phylogenetic estimates.
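The uncertainty classes in Table 3 can be illustrated with the sketch below, which assigns a cell to a class from the number of trees (out of 100) that share its most common significance label, and then expresses the phylogram column of Table 3 as percentages. The label strings and function names are hypothetical, and treating the most common label as the reference is an assumption about how agreement was counted.

```python
from collections import Counter


def consistency_class(labels):
    """Classify one cell by how many of its 100 per-tree labels match the most common label."""
    agree = Counter(labels).most_common(1)[0][1]
    if agree >= 91:
        return "91-100"
    if agree >= 71:
        return "71-90"
    if agree >= 50:
        return "50-70"
    return "<50"


# Phylogram column of Table 3, expressed as percentages of all cells.
PHYLOGRAM_CELLS = {"50-70": 642, "71-90": 1013, "91-100": 6508}
total = sum(PHYLOGRAM_CELLS.values())
shares = {cls: round(100 * n / total, 1) for cls, n in PHYLOGRAM_CELLS.items()}
```

For the phylogram this gives about 79.7% of cells in the most consistent class and about 7.9% in the least consistent class.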
Figure 7
Uncertainty of Phylogenetic Diversity by Latitude
The study region was binned by 0.5° into 13 latitudinal sections represented by the lines on the maps on the right. Bean plots represent the phylogenetic diversity values for all 100 trees for each of the pixels within each section for both the phylogram (left) and chronogram (right).
Discussion
An Improved Understanding of Florida Floristic Diversity
Here we examined all vascular plant diversity that shapes vegetation definitions in Florida, instead of limiting our study to only the dominant vegetation. Importantly, we found peaks of plant diversity in northern peninsular Florida rather than in the panhandle. One major reason for putting effort into phylogenetic measures is that they connect to evolutionary and ecological processes that shape diversity patterns. For example, although PD may be highest in northern peninsular Florida, rather than the panhandle, there is significantly more PD than expected in many panhandle areas, especially when considering chronograms rather than phylograms. Southeastern forests are composed of communities containing deep evolutionary branches, particularly in the time-calibrated phylogenies. These mixed forests are stable over long time periods, facilitating accumulation of a broad set of older lineages, as opposed to oscillations of more open oak savannah habitats and inundation during Pleistocene sea-level incursions in central and southern peninsular Florida.
In southern Florida, which was entirely submerged during the last interglacial (reviewed in Germain-Aubrey et al., 2014), we find an unusual pattern of phylodiversity, where only the phylogram shows a strong signal in RPD, i.e., significantly longer branches than expected given the null hypothesis. We argue that phylograms, often with relatively longer branches toward the tips, are likely to show stronger patterns in some cases than chronograms, which tend to redistribute branch length from terminal to deeper branches (demonstrated in Figure 1). In southern Florida, communities with taxa of Caribbean or Central/South American origin may be dominated by longer terminal branches. Further examination of both community composition and possible artifacts from methodological choices is warranted.
In the central peninsula of Florida, we find strong patterns of both phylogenetic clustering and shorter branches than expected using either the phylogram or the chronogram. Although central Florida is a floristically diverse area, it includes locations that were inundated during Pleistocene interglacial sea-level rise, as well as xeric scrub that was more persistent, but co-occurring taxa are likely filtered due to the evolutionarily conserved preferences of some lineages for the harsh environments of these areas (e.g., excessively drained soil and extreme heat). Alternatively, some of this pattern may be due to in situ differentiation, whereas some taxa may be more recent arrivals, as many are derived from western North American lineages that dispersed eastward during more xeric interglacial periods (reviewed in Germain-Aubrey et al., 2014).
Our results are also consistent with those found in distantly related animal lineages. A previous Florida gap analysis documented high species richness in vertebrates and butterflies, especially in the panhandle and extending into the core of central Florida. Although our methods differ from those used in that analysis, especially given its focus on just taxic measures, the results are broadly consistent, perhaps unsurprisingly because plant diversity may generally drive diversity in groups such as butterflies (Burkle et al., 2013).
Finally, two particular areas of interest, given known endemicity and unusual floras, are the Miami Ridge/Pine Rocklands and the LWR (location denoted in Figure 2, top left panel, and Figure S3). The Pine Rocklands exhibits a diverse flora of hammock species and those common across the Bahamas and Greater Antilles (Myers and Ewel, 1990). In the Miami Ridge area, as expected, we found increased PD and significant PD clustering for some pixels based on the chronogram and significantly high RPD in others for the phylogram. The LWR, in particular, is known to harbor high endemic species diversity (Myers and Ewel, 1990, Germain-Aubrey et al., 2014), which we hypothesize may show high neo-endemism when examined using phylogenetic endemism metrics in the future. It is beyond the scope of this study to investigate such patterns, especially given that we did not create full geographic range surfaces for some of the species examined, but we found that the LWR is not particularly high in PD, nor does it show significantly clustered or even lineages; however, areas immediately east of the LWR show strong clustering. This region is a mosaic of habitats, including pine flatwoods, dry prairies, and marshes (Myers and Ewel, 1990), and the significant clustering in this area may indicate strong filtering for these habitats. It also suggests that conservation priorities should not only be concentrated in areas such as the LWR but also include those areas directly adjacent to it along zones of highly varying diversity. Zones of conservation priority are also found in areas such as the Miami Ridge/Pine Rocklands, which are under direct threat from rapid, continuing human development and provide a further strong justification for conservation actions to support these unique evolutionary assemblages. To obtain a complete picture of conservation priorities, future studies of phylogenetic endemism and associated hypothesis tests (e.g., Cadotte and Davies, 2010, Rosauer et al., 2009, Tucker et al., 2012, Mishler et al., 2014) are needed to complement the PD studies reported here. Future analyses of PD and RPD can compare the ecoregions noted above, as well as native habitats and protected areas, thereby informing conservation priorities for human-managed habitats and regions as well as those areas experiencing rapid land conversion in Florida.
The Importance of Evaluating Input Trees for Phylogenetic Diversity
Quality of the Purpose-Built Tree and Community Tree Resources
Relationships within the major clades of ferns (monilophytes), gymnosperms, and angiosperms agree closely with broader phylogenetic analyses focused on those specific subclades (e.g., The Angiosperm Phylogeny Group IV, 2016, Schuettpelz and Pryer, 2007, Smith et al., 2011, Soltis et al., 2011, Stevens, 2001). Some of the more difficult areas to resolve on the purpose-built Florida tree were appropriately resolved in the topology produced from the OTL. This is likely due to the continuously updated nature of the OTL, where the tree topology integrates previously estimated trees into the framework to produce a “synthesis” tree. Our results suggest that, in the future, the OTL may be an important resource for spatial phylogenetic analyses. Provided there is adequate sampling of the terminal taxa in a region represented in the OTL, researchers will be able to save numerous hours in building their own region-specific trees. Of course, for less well-studied regions of the world, many new sequences may need to be added; even for this dataset, a large proportion of the species did not have existing sequence data, making it necessary to sequence many taxa to provide data for branch length estimation.
Although Phylomatic and the OTL may give relatively accurate topologies, calculating branch lengths for these trees remains problematic, particularly when using phylograms. To address this issue, we used our DNA sequence alignment to estimate branch lengths on the OTL topology, which likely explains why we found fewer differences between results based on our purpose-built tree and the OTL tree when compared with the Phylomatic tree (Figure 5). However, this method requires assembling an alignment for OTL phylogenies, which may defeat the purpose of using such resources. Efforts already underway to add branch length estimates to the OTL will further increase the utility of the OTL as a source for spatial phylogenetic analysis.
Taxon sampling may be another issue with using trees from repositories. The Phylomatic tree contained almost all of our terminal taxa of interest (99%), and more taxa than the OTL tree (80%). It is unclear how many taxa would be available if we were to attempt an analysis of all vascular plant species in Florida (∼4,300 species). In general, taxon sampling is a concern that is always difficult to overcome. In some studies, coarser-scale Operational Taxonomic Units (OTUs) such as genera are used to represent most of a flora when sequence data are limited at finer scales (e.g., Thornhill et al., 2016, Thornhill et al., 2017). In our case, we included only 35% of Florida's vascular plant species because the tree was built to match the species for which sufficient occurrence data were available for constructing distribution models. Although we do not know how these patterns might change if we were to add more taxa, pruning of species for comparison with community resources (e.g., OTL) provided a means to examine effects of reduced taxon sampling. We found that the number of cells with significant clustering decreased considerably in central Florida when ∼250 taxa were removed (Figure 5). This result suggests that some power may be lost with more limited taxonomic sampling. However, our current analyses likely capture general trends in PD in Florida reasonably well, providing a much-needed, initial snapshot of diversity.
Tree Uncertainty
Using a different way to examine uncertainty, by comparing multiple outcomes of RAxML searches, Thornhill et al. (2017) found virtually no effect of tree uncertainty on spatial phylogenetics results. Here we used a Bayesian approach to examine uncertainty, which has the potential to yield trees with more differences, yet we hypothesized that tree uncertainty would have a minimal impact on measures of PD significance given that, in most cases, shorter branches are affected, especially for phylograms. However, with larger trees and limited character sampling (nucleotides), uncertainty could have an impact on assessments of PD. Figure 6 shows that approximately 20% of pixels showed moderate to high differences in significance among the 100 trees, flipping between non-significant and either significantly high or significantly low (but never from significantly high to significantly low). This pattern is found for both the phylogram and chronogram, although standard deviations per pixel are much higher for the chronograms. Tree uncertainty in chronograms has more impact on branch lengths due to time scaling: if nodes are uncertain, then swapping of branches can result in more pronounced changes in branch lengths than seen at the same place in the corresponding phylogram, wherein the uncertain branches tend to be quite short (see demonstration of this in González-Orozco et al., 2016). Our example is an empirical one, and more work using simulated trees could further elucidate expectations of the impacts of tree uncertainty as it relates to measures of phylodiversity.
Two key messages come from our analyses. First, although uncertainty is likely to affect judgments of significance in PD and RPD for certain grid cells, there appeared to be no geographic structuring of such cells, and thus the modest amounts of uncertainty seen here do not broadly affect conclusions at the landscape scale. For example, our general assessment of significant clustering in central Florida still appears to hold despite some differences in results among the 100 trees. Second, whereas changes between non-significant and significant clustering or evenness in one direction were seen occasionally due to uncertainty, no changes were seen between significantly high and significantly low. This suggests that, although uncertainty may affect whether we judge a cell to be significant, it will not change our interpretation from significant clustering to significant evenness. Still, we argue that phylogenetic uncertainty should be considered in phylodiversity analyses, which is currently often not the case.
Phylograms vs. Chronograms
The divergent results seen in significance tests inferred from the phylogram versus the chronogram were the largest differences observed in this analysis, much larger than differences due to tree uncertainty or tree source. The tree topology is the same between the two analyses, whereas the branch lengths are different, and each approach has unique interpretations. An analogy would be travel directions for a route using either geographic distances or times; both indicators are informative in different ways. Generally, it is thought that evenness of lineages measured on the phylogram directly relates to unexpectedly high genetic disparity and thus may indirectly relate to high functional trait disparity, if the change in those traits is correlated with genetic change in the markers employed. By contrast, evenness of lineages measured on the chronogram directly relates to unexpectedly high temporal disparity and may likewise indirectly relate to high functional trait disparity, if the change in those traits is correlated with time. Both correlations are quite plausible; the interpretation of differences in significance patterns may come down to tempo and mode of evolution. If anagenesis in functional traits is correlated with heterogeneous rates of genetic change on different branches, for example, due to generation time effects as commonly seen when comparing woody plants with herbaceous relatives (Smith and Donoghue, 2008) or major adaptive effects such as commonly seen when comparing parasitic plants with autotrophic relatives, then the phylogram will illuminate those processes with significance of PD. On the other hand, if anagenesis in functional traits is relatively uniform and generally correlated with the amount of time elapsed along a branch, then the chronogram will likely indicate that process.
The same distinction is important when using PD-related results to help set conservation priorities. Areas with high PD measured on the phylogram by definition have high genetic diversity, and this may be the better measurement if the goal is preserving genetic diversity, whereas high-PD areas measured on the chronogram contain an unusually large amount of evolutionary time, and this may be the better measurement if the goal is preserving evolutionary diversity. Which form of branch lengths is preferred as a proxy for functional trait diversity depends on the value placed on these processes along with the conservation priorities (Thornhill et al., 2017).
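To make the phylogram versus chronogram contrast concrete, the following Python sketch computes PD and RPD for one community on a toy three-tip tree, ((A,B),C), with the same topology but two sets of branch lengths. It is an illustration under stated assumptions, not the analysis pipeline used here. PD is taken as the summed length of all branches subtending at least one community member (root path included). RPD is taken as the PD fraction on the observed tree divided by the PD fraction on an equal-branch-length tree of the same topology, which is our reading of the definition in Mishler et al. (2014). All branch lengths and function names are invented.

```python
def faith_pd(branches, community):
    """Sum the lengths of branches that subtend at least one community tip."""
    return sum(length for tips, length in branches if tips & community)

def rpd(branches, community):
    """PD fraction on the given tree relative to an equal-branch-length tree."""
    total = sum(length for _, length in branches)
    equal = [(tips, 1.0) for tips, _ in branches]  # same topology, unit branches
    observed = faith_pd(branches, community) / total
    comparison = faith_pd(equal, community) / len(equal)
    return observed / comparison

# Each branch is (set of descendant tips, branch length); values are invented.
A, B, AB, C = frozenset("A"), frozenset("B"), frozenset("AB"), frozenset("C")
phylogram = [(A, 0.4), (B, 0.1), (AB, 0.1), (C, 0.4)]   # substitutions per site
chronogram = [(A, 1.0), (B, 1.0), (AB, 2.0), (C, 3.0)]  # time units, ultrametric

community = {"A", "C"}
pd_phylo = faith_pd(phylogram, community)
pd_chrono = faith_pd(chronogram, community)
rpd_phylo = rpd(phylogram, community)
rpd_chrono = rpd(chronogram, community)
```

In this toy case the long terminal branch leading to A raises RPD on the phylogram (about 1.2) relative to the chronogram (about 1.14), where that length is redistributed to the deeper branch.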
Limitations of the Study
Although this study includes 1,490 taxa of Florida plants, more than 4,000 vascular plant species are known to occur in the state, and it is possible that full inclusion of all species could affect the results presented here. Further efforts to assemble more complete distribution data, especially for range-restricted species, are ongoing, and we hope those records will lead to more refined and accurate species distribution modeling. We note that although the phylogeny recovered using a small set of markers aligns with known relationships, further work to develop more robust phylogenetic hypotheses is a next step. Finally, further work correlating these patterns with areas of high population growth and encroaching sea-level rise will provide additional insights for conservation efforts and planning.
Methods
All methods can be found in the accompanying Transparent Methods supplemental file.