Kendra Spence Cheruvelil, Shuai Yuan, Katherine E Webster, Pang-Ning Tan, Jean-François Lapierre, Sarah M Collins, C Emi Fergus, Caren E Scott, Emily Norton Henry, Patricia A Soranno, Christopher T Filstrup, Tyler Wagner.
Abstract
Understanding broad-scale ecological patterns and processes often involves accounting for regional-scale heterogeneity. A common way to do so is to include ecological regions in sampling schemes and empirical models. However, most existing ecological regions were developed for specific purposes, using a limited set of geospatial features and irreproducible methods. Our study purpose was to: (1) describe a method that takes advantage of recent computational advances and increased availability of regional and global data sets to create customizable and reproducible ecological regions, (2) make this algorithm available for use and modification by others studying different ecosystems, variables of interest, study extents, and macroscale ecology research questions, and (3) demonstrate the power of this approach for the research question: How well do these regions capture regional-scale variation in lake water quality? To achieve our purpose, we: (1) used a spatially constrained spectral clustering algorithm that balances geospatial homogeneity and region contiguity to create ecological regions using multiple terrestrial, climatic, and freshwater geospatial data for 17 northeastern U.S. states (~1,800,000 km²); (2) identified which of the 52 geospatial features were most influential in creating the resulting 100 regions; and (3) tested the ability of these ecological regions to capture regional variation in water nutrients and clarity for ~6,000 lakes. We found that: (1) a combination of terrestrial, climatic, and freshwater geospatial features influenced region creation, suggesting that the oft-ignored freshwater landscape provides novel information on landscape variability not captured by traditionally used climate and terrestrial metrics; and (2) the delineated regions captured macroscale heterogeneity in ecosystem properties not included in region delineation: approximately 40% of the variation in total phosphorus and water clarity among lakes was at the regional scale. Our results demonstrate the usefulness of this method for creating customizable and reproducible regions for research and management applications.
Keywords: constrained spectral clustering; ecoregions; geospatial variables; lake; landscape; macroecology; macrosystems; regional spatial scale; regionalization; spatial heterogeneity
Year: 2017 PMID: 28480004 PMCID: PMC5415510 DOI: 10.1002/ece3.2884
Source DB: PubMed Journal: Ecol Evol ISSN: 2045-7758 Impact factor: 2.912
Figure 1. The study extent: (a) The U.S. with the 17 states shaded, (b) A close‐up of HU‐12s in ~2 states (Wisconsin and Michigan), and (c) A close‐up of a focal HU‐12 (red in (b)) with its neighboring HU‐12s shaded to demonstrate different levels of the contiguity constraint
Descriptive statistics for the hydrologic units (HU‐12s; Seaber et al., 1987) clustered to make regions and the lake characteristics used to test region ability to capture macroscale variation among ecosystems (using means). In‐lake data were from summer samples of lakes ≥4 ha in size during 2002–2011. Water clarity was measured as Secchi disk depth
| Variable | Unit | Median | Mean | 25th percentile | 75th percentile | Sample size |
|---|---|---|---|---|---|---|
| HU‐12 | ha | 7,868 | 8,446 | 5,710 | 10,580 | 20,257 |
| Water clarity | m | 2.6 | 2.9 | 1.4 | 3.9 | 6,044 |
| Total Phosphorus | μg/L | 16.2 | 35.2 | 10.0 | 32.2 | 3,896 |
Figure 2. Schematic illustrating the procedure for creating ecological regions from geospatial data using a spatially constrained spectral clustering method, evaluating the ecological regions, and applying them to ecosystem properties. The circle represents the Hadamard product. See text, Appendix S3 and Yuan et al., 2015 for details
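The role of the Hadamard product in the schematic can be illustrated with a minimal sketch. It assumes, for illustration only, that the contiguity constraint δ counts adjacency steps between spatial units and that feature similarity is a Gaussian (RBF) kernel; the function names, the toy chain of five units, and the `gamma` parameter are hypothetical and are not taken from the authors' code or from Yuan et al., 2015. The element-wise product zeroes out similarity between units that are farther apart than δ steps, so a subsequent spectral clustering step can only group units that are both similar and spatially close.

```python
from collections import deque
import math


def within_delta(adjacency, delta):
    """Return C with C[i][j] = 1 if unit j is within `delta` adjacency steps of unit i."""
    n = len(adjacency)
    C = [[0] * n for _ in range(n)]
    for start in range(n):
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            if dist[u] == delta:  # do not expand beyond delta steps
                continue
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for j in dist:
            C[start][j] = 1
    return C


def rbf_similarity(features, gamma=1.0):
    """Gaussian similarity between feature vectors (an assumed kernel choice)."""
    n = len(features)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
            S[i][j] = math.exp(-gamma * d2)
    return S


def hadamard(S, C):
    """Element-wise (Hadamard) product of two equally sized matrices."""
    return [[s * c for s, c in zip(row_s, row_c)] for row_s, row_c in zip(S, C)]


# Toy data (hypothetical): five units in a chain 0-1-2-3-4.
adjacency = [[1], [0, 2], [1, 3], [2, 4], [3]]
features = [[0.0], [0.1], [0.2], [0.1], [0.0]]

S = rbf_similarity(features)
W_strict = hadamard(S, within_delta(adjacency, 1))  # only direct neighbors keep similarity
W_loose = hadamard(S, within_delta(adjacency, 4))   # every pair keeps its similarity
```

In this toy example, units 0 and 4 have identical features, so their unconstrained similarity is 1; under the strict setting their constrained similarity is 0 because they are four steps apart.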
Figure 3. The range of optimal number of regions in our study extent, calculated using the ratio of slopes for the regions created using spatially constrained spectral clustering (SSC) to the slope for the regions created using a completely random clustering approach (no consideration of geospatial features or region contiguity). The inset is a blow‐up of the range of number of regions that included the optimal number (between 80 and 110) for our study extent and geospatial data
Figure 4. (a–f) Maps depicting the ecological regions created using spatially constrained spectral clustering and varying the level of region contiguity (i.e., the neighborhood constraint δ = 1, 4, 8 and 16; a–d), as well as k‐means clustering on the PCA factors with no contiguity constraint (e) and a combination plot contrasting the boundaries of the δ = 1, 4, and 8 regions (f). White lines indicate U.S. state borders
Metrics of clustering to create ecological regions. (A) Sum of squares error within regions (SSW), the spatial contiguity metric (PctML, measured in %), and the ratio of the sum of squares within and between regions (SSW:SSB) for ecological regions created using 52 geospatial variables characterizing the terrestrial, climatic, and freshwater landscapes. (B) Sum of squares error within regions (SSW) and the ratio of the sum of squares within and between regions (SSW:SSB) for two lake characteristics (mean values). Regions were made with (1) spatially constrained spectral clustering (SSC; Yuan et al., 2015) along a continuum of contiguity created by varying the contiguity constraint (δ = 1, 4, 8, 16), (2) spectral clustering (SC) along that same contiguity continuum while ignoring landscape homogeneity, and (3) k‐means clustering (K) of the PCA features directly with no contiguity constraint. “Considers” refers to whether regions were made while accounting for landscape homogeneity, region contiguity, or both. Bolded values point out the ecological regions created with SSC that resulted in the smallest SSW, largest PctML, or smallest ratio. Total sum of squares variation (SSW + SSB) for each response variable was: 836,439, 3,852, and 5,887 for geospatial variables, total phosphorus, and water clarity, respectively. Mod = moderate
(A) Geospatial features

| Clustering method | Considers: Region contiguity | Considers: Landscape homogeneity | Metric | None (no δ) | Loose (δ = 16) | Moderate (δ = 8) | Moderate (δ = 4) | Strict (δ = 1) |
|---|---|---|---|---|---|---|---|---|
| (1) SSC | Yes | Yes | SSW (PctML) | | | 380,608 (31) | 412,407 (60) | 447,160 |
| | | | SSW:SSB | | | | 0.97 | 1.15 |
| (2) SC | Yes | No | SSW (PctML) | | 639,402 (57) | 559,241 (92) | 499,723 (92) | |
| (3) K | No | Yes | SSW (PctML) | 152,220 | | | | |
| | | | SSW:SSB | 0.22 | | | | |
This method does not include the region contiguity constraint.
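The SSW and SSB metrics in the caption follow the standard sum-of-squares decomposition, which can be sketched as below. This is an illustrative implementation, not the authors' code: the function names and the toy data are hypothetical. Because the caption states that SSW + SSB equals the total sum of squares, the ratio can also be recovered from a reported SSW and the reported total (836,439 for the geospatial variables).

```python
def sum_of_squares(values, labels):
    """Return (SSW, SSB) for rows of `values` grouped by region `labels`."""
    n_features = len(values[0])
    grand_mean = [sum(row[k] for row in values) / len(values) for k in range(n_features)]

    # Collect the rows that belong to each region.
    groups = {}
    for row, label in zip(values, labels):
        groups.setdefault(label, []).append(row)

    ssw = 0.0  # squared deviations of units from their own region mean
    ssb = 0.0  # squared deviations of region means from the grand mean, weighted by size
    for rows in groups.values():
        mean = [sum(r[k] for r in rows) / len(rows) for k in range(n_features)]
        ssw += sum((r[k] - mean[k]) ** 2 for r in rows for k in range(n_features))
        ssb += len(rows) * sum((mean[k] - grand_mean[k]) ** 2 for k in range(n_features))
    return ssw, ssb


def ratio_from_total(ssw, total):
    """SSW:SSB when only SSW and the total (SSW + SSB) are known."""
    return ssw / (total - ssw)


# Toy data (hypothetical): one feature, six units, two regions.
values = [[1.0], [2.0], [3.0], [7.0], [8.0], [9.0]]
labels = [0, 0, 0, 1, 1, 1]
ssw, ssb = sum_of_squares(values, labels)  # ssw = 4.0, ssb = 54.0

# Check against values reported in the table, using the total of 836,439.
print(round(ratio_from_total(412407, 836439), 2))  # 0.97
print(round(ratio_from_total(447160, 836439), 2))  # 1.15
print(round(ratio_from_total(152220, 836439), 2))  # 0.22
```

A smaller ratio means that regions are internally homogeneous relative to the differences among them, which is why the caption highlights the smallest SSW and the smallest ratio.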
Figure 5. Random forest importance scores heat map for each of the 52 geospatial features and the nine sets of ecological regions created. Values are mean decreases in the Gini impurity criterion, with higher values (darker shading) indicating higher variable importance in the random forest. Regions were made with (1) k‐means clustering (K) of the PCA features directly with no contiguity constraint, (2) spatially constrained spectral clustering (SSC; Yuan et al., 2015) along a continuum of contiguity created by varying the contiguity constraint (δ = 1, 4, 8, 16), and (3) with spectral clustering (SC) along that same contiguity continuum while ignoring landscape homogeneity
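The quantity behind these importance scores can be illustrated for a single split. The sketch below computes the Gini impurity of a set of region labels and the decrease obtained by splitting on one feature at one threshold; a random forest aggregates such decreases over every split that uses a feature, across all trees. The feature names, values, and thresholds are hypothetical and chosen only to contrast an informative feature with an uninformative one.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(feature, labels, threshold):
    """Impurity of the parent node minus the size-weighted impurity of the two children."""
    left = [y for x, y in zip(feature, labels) if x <= threshold]
    right = [y for x, y in zip(feature, labels) if x > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


# Toy data (hypothetical): six spatial units assigned to two regions.
region = ["A", "A", "A", "B", "B", "B"]
elevation = [100, 120, 110, 400, 420, 390]  # separates the two regions cleanly
forest_pct = [30, 70, 50, 40, 60, 55]       # barely related to region membership

dec_elevation = gini_decrease(elevation, region, threshold=250)
dec_forest = gini_decrease(forest_pct, region, threshold=50)
```

Here the elevation split removes all impurity (a decrease of 0.5 from a parent impurity of 0.5), whereas the forest-cover split leaves the children nearly as mixed as the parent, so it would receive a much lower importance score.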