
Dramatic impact of metric choice on biogeographical regionalization.

Jian-Fei Ye1,2,3, Yun Liu1,2, Zhi-Duan Chen1.   

Abstract

For a quantitative biogeographical regionalization, the choice of an appropriate dissimilarity index to measure pairwise distances is crucial. Several different metrics have been used, but there is no specific study to test the impact of metric choice on biogeographical regionalization. We herein applied a hierarchical cluster analysis on the mean nearest taxon distance (MNTD) and the phylogenetic turnover component of the Sørensen dissimilarity index (pβsim) pairwise distances to generate two schemes of phylogenetic regionalization of the Chinese flora, and then evaluated the effect of metric choice. Floristic regionalization based on MNTD was influenced by richness differences, but regionalization based on pβsim can clearly reflect the evolutionary history of the Chinese flora. We provided a brief description of the five regions identified by pβsim, and the regionalization can help develop strategies to effectively conserve the taxa and floristic regions with different origins and evolutionary histories.
© 2020 Kunming Institute of Botany, Chinese Academy of Sciences. Publishing services by Elsevier B.V. on behalf of KeAi Communications Co., Ltd.


Keywords:  Angiosperms; Chinese flora; Dissimilarity index; Distance metrics; Spatial turnover

Year:  2020        PMID: 32373764      PMCID: PMC7195599          DOI: 10.1016/j.pld.2019.12.003

Source DB:  PubMed          Journal:  Plant Divers        ISSN: 2468-2659


Introduction

Biogeographical regionalization aims to classify global or regional biota into meaningful geographical units for analysis (Mackey et al., 2007). Thus, it can provide an indispensable background for research on biodiversity and conservation (Crisp et al., 2009). Takhtajan (1978) divided the global flora into six kingdoms, 34 regions and 147 provinces, which are widely used in historical and ecological biogeography (Linder et al., 2012, Meyer et al., 2012a, Meyer et al., 2012b). The increasing availability of species distribution data, combined with tree of life reconstructions, has opened new dimensions for studying biogeographical regionalization (Holt et al., 2013, Kreft and Jetz, 2010, Li and Kraft, 2015, Li et al., 2018, Li and Sun, 2017). Recently, important progress has been made in biogeographical regionalization using both qualitative approaches with endemic taxa (Wu et al., 2010) and quantitative approaches with distribution and evolutionary data (Zhang et al., 2016a, Zhang et al., 2016b; Ye et al., 2019). Compared with qualitative approaches, regionalization based on quantitative approaches is much more objective and reproducible. To arrive at a reasonable quantitative biogeographical regionalization scheme, it is critical to choose an appropriate dissimilarity index that measures pairwise distances between grid-cell assemblages. Several metrics have been used in biogeographical regionalization, including Euclidean distance (Zhang et al., 2016a), Jaccard (Linder et al., 2005), the mean nearest taxon distance (MNTD) (Webb, 2000), Sørensen (Xie et al., 2004) and the turnover component of the Sørensen dissimilarity index (βsim; Baselga, 2012) (see Table 1). Some of these metrics are independent of differences in species richness among grid cells (Baselga, 2012), while others are strongly affected by differences in species richness (Lennon et al., 2001). For instance, Slik et al. (2018) suggested that MNTD is easily affected by differences in taxonomic richness and cannot give a correct measure of taxon turnover between two assemblages. Therefore, in studies of biogeographical regionalization, researchers must use metrics (e.g., βsim and βjtu) that are least affected by variation in richness (Leprieur and Oikonomou, 2014), especially for biotas with a large richness gradient (Mouillot et al., 2013), such as the Chinese flora (Lu et al., 2018).
Table 1

Examples of bioregionalization studies employing different dissimilarity indices after Kreft and Jetz (2010).

| Reference | Taxon | Geographical extent | Dissimilarity index |
| --- | --- | --- | --- |
| Linder et al. (2012) | plants and vertebrates | sub-Saharan Africa | βsim |
| González-Orozco et al. (2013) | the genus Acacia | Australia | βsim |
| Holt et al. (2013) | amphibians, birds, and mammals | global | βsim |
| Mouillot et al. (2013) | fishes | Indo-Pacific | βsim and βjtu |
| Rueda et al. (2013) | birds, mammals and amphibians | global | Hellinger distance |
| González-Orozco et al. (2014a) | plants | Australia | βsim |
| González-Orozco et al. (2014b) | eucalypts | Australia and Malesia | βsim |
| Kubota et al. (2014) | woody plants | the Japanese archipelago | βsor |
| Hattab et al. (2015) | fishes | Mediterranean Sea | βjtu |
| Jønsson and Holt (2015) | passerine birds | global | βsor |
| Li et al. (2015) | plants | Yunnan, China | Bray–Curtis index, mean nearest phylogenetic distance |
| Daru et al. (2016) | woody plants | southern Africa | βsim and βjtu |
| Julien et al. (2016) | orchids | New Guinea | βjac |
| Zhang et al. (2016a) | plants | Qinghai-Tibetan Plateau | Euclidean distance |
| Zhang et al. (2016b) | plants | China | βsim |
| Daru et al. (2017) | marine plants | global | βsim |
| He et al. (2017) | terrestrial vertebrates | China | βsim |
| König et al. (2017) | plants | global | βsim |
| Clark et al. (2017) | archaea | western Europe, the Mediterranean and east Africa | βsim |
| Droissart et al. (2018) | plants | tropical Africa | Bray–Curtis index |
| Godinho and da Silva (2018) | anurans | Amazon | βsim |
| Slik et al. (2018) | plants | global | mean pairwise phylogenetic distance |
| Silva and Souza (2018) | plants | Caatinga | βsim |
| Bribiesca-Contreras et al. (2019) | brittle stars | global | βsim |
| Ye et al. (2019) | plants | China | βsim |

Because some previous studies used unsuitable metrics, in this study we first explored the effect of metric choice on biogeographical regionalization, and then compared the regionalizations of the Chinese flora resulting from different metrics, using species distribution records and a genus-level phylogenetic tree of the Chinese flora. We then analyzed the boundaries, characteristics, and origins of the chosen regions of the Chinese flora. We anticipate that this work will help identify reasonable metrics that can be used to propose a phylogenetic floristic regionalization of China.

Materials and methods

Distribution data and phylogenetic tree

We used the same data set as Ye et al. (2019).

Distance metrics and clustering analysis

To compare the effect of richness-dependent and richness-independent metrics on biogeographical regionalization, we used MNTD and the phylogenetic turnover component of the Sørensen dissimilarity index (pβsim), respectively. MNTD is affected by differences in species richness, whereas pβsim is independent of differences in species richness among grid cells (Slik et al., 2018). In this study, MNTD was obtained with the ‘comdistnt’ function of the ‘picante’ package (Kembel et al., 2010) in R (R Core Team, 2019), and pβsim was calculated according to Ye et al. (2019). To identify spatial clusters, we compared the output from eight hierarchical clustering algorithms based on the MNTD and pβsim pairwise distance matrices: (1) single linkage (SL), (2) complete linkage (CL), (3) unweighted pair-group method using arithmetic averages (UPGMA), (4) unweighted pair-group method using centroids (UPGMC), (5) weighted pair-group method using arithmetic averages (WPGMA), (6) weighted pair-group method using centroids (WPGMC), and (7) and (8), two variants of Ward's minimum variance (ward.D and ward.D2). We used the Kelley–Gardner–Sutcliffe penalty function (Kelley et al., 1996) to determine a reasonable number of clusters for the phytogeographical regions. The selected clustering algorithm and number of clusters were then used to delineate phytogeographical regions and to show the differences between the floristic regions obtained using MNTD and pβsim.
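The richness dependence of MNTD can be illustrated with a minimal Python sketch. It assumes the common definition of between-assemblage MNTD (for every taxon, the distance to its closest relative in the other assemblage, averaged over both assemblages) and does not reproduce the options of ‘comdistnt’. The function name `mntd_between`, the four-taxon tree and its distances are hypothetical and serve illustration only.

```python
def mntd_between(comm1, comm2, dist):
    """Mean nearest taxon distance between two assemblages.

    comm1, comm2: non-empty sets of taxon names.
    dist: nested dict of pairwise phylogenetic distances, dist[t][u].
    """
    if not comm1 or not comm2:
        raise ValueError("both assemblages must contain at least one taxon")
    # For every taxon, the distance to its closest relative in the other assemblage.
    nearest_1 = [min(dist[t][u] for u in comm2) for t in comm1]
    nearest_2 = [min(dist[u][t] for t in comm1) for u in comm2]
    values = nearest_1 + nearest_2
    return sum(values) / len(values)


# Hypothetical toy tree: A and B are sisters, C and D are sisters.
PAIRS = {("A", "B"): 2, ("C", "D"): 4,
         ("A", "C"): 6, ("A", "D"): 6, ("B", "C"): 6, ("B", "D"): 6}
TAXA = "ABCD"
DIST = {t: {u: 0 for u in TAXA} for t in TAXA}
for (t, u), d in PAIRS.items():
    DIST[t][u] = DIST[u][t] = d

poor = {"A"}                    # species-poor cell, nested within the rich cell
rich = {"A", "B", "C", "D"}     # species-rich cell

print(mntd_between(poor, rich, DIST))  # positive although no taxon is replaced
print(mntd_between(rich, rich, DIST))  # identical assemblages give zero
```

In this toy case the poor cell is a strict subset of the rich cell, so there is no turnover, yet the MNTD between them is greater than zero purely because of the richness difference.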

Characterized clade and indicator genus analyses

MNTD is sensitive to differences in taxon richness; therefore, we adopted the floristic regionalization based on pβsim. To identify clades that characterize a phytogeographical region (Ye et al., 2019) or contribute significantly to phylogenetic structure, we used the nodesig procedure in Phylocom 4.2 (Webb et al., 2008). The nodesig tests on the overall phylogenetic tree were used to compare the genus pool of each phytogeographical region and to examine whether a node in the phylogeny has significantly more (sig. more) or significantly fewer (sig. less) descendant taxa in a sample than a null model predicts. The null model is a random draw of n taxa from the phylogeny, where n is the number of taxa in the sample (Webb et al., 2008). We used the Dufrêne–Legendre indicator species analysis (Dufrêne and Legendre, 1997) to characterize the genus composition of each identified phytogeographical region. The indicator value (IndVal) was defined as the product of two quantities, specificity (A) and fidelity (B). A represents the probability that a surveyed site belongs to the target site group, given that the target genus has been found there (calculated as the relative frequency of occurrence of the genus in the target site group, divided by the sum of its relative frequencies over all groups). B is the probability of finding the target genus in a site that belongs to the site group (i.e., the relative frequency of the genus inside the target site group; De Cáceres and Legendre, 2009). The significance of the IndVal was assessed by a permutation test (9,999 permutations). A genus with a significant IndVal (P < 0.001) was ranked as a suitable indicator genus of its relevant phytogeographical region. The analysis was conducted using the ‘indicspecies’ package (De Cáceres and Legendre, 2009).
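The definitions of A, B and IndVal given above can be sketched in a few lines of Python for presence/absence data. This is an illustration of the stated formulas only, not the ‘indicspecies’ implementation, which may differ in details such as scaling. The function names, the toy data and the default of 999 permutations (instead of 9,999) are assumptions made to keep the example short.

```python
import random


def indval(presence, groups, target):
    """Return (A, B, IndVal) for one genus and one target site group.

    presence: list of 0/1 values, one per site.
    groups:   list of group labels, one per site.
    """
    rel_freq = {}
    for g in set(groups):
        in_group = [p for p, lab in zip(presence, groups) if lab == g]
        rel_freq[g] = sum(in_group) / len(in_group)   # relative frequency in group g
    total = sum(rel_freq.values())
    if total == 0:                                    # genus absent everywhere
        return 0.0, 0.0, 0.0
    a = rel_freq[target] / total                      # specificity
    b = rel_freq[target]                              # fidelity
    return a, b, a * b


def permutation_p(presence, groups, target, n_perm=999, seed=0):
    """P value from shuffling group labels among sites (illustrative)."""
    rng = random.Random(seed)
    observed = indval(presence, groups, target)[2]
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if indval(presence, shuffled, target)[2] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: 4 sites in group "P", 4 sites in group "H".
groups = ["P"] * 4 + ["H"] * 4
presence = [1, 1, 1, 0, 1, 0, 0, 0]

a, b, iv = indval(presence, groups, "P")
print(a, b, iv)
print(permutation_p(presence, groups, "P"))
```

Here the genus occurs in three of four "P" sites and one of four "H" sites, so A = 0.75 / (0.75 + 0.25) = 0.75, B = 0.75 and IndVal = 0.5625.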

Results

Floristic regionalization using MNTD and pβsim

The five clustering algorithms varied in performance for the MNTD pairwise distance analysis (Table 2). Ward's method was the best performing clustering algorithm (cophenetic coefficient = 0.99), followed by CL (complete linkage) (cophenetic coefficient = 0.97), while UPGMA (unweighted pair-group method using arithmetic averages) performed worst (cophenetic coefficient = 0.94). Therefore, we chose Ward's method. According to the Kelley–Gardner–Sutcliffe penalty function, eight floristic regions belonging to two groups were identified in China, with southern regions 6 and 8 forming one group (South Cluster), and the other six regions forming the other group (North Cluster) (Fig. 1a). The geographical dividing line between North and South Cluster in eastern China was roughly consistent with the Qinling Mountain-Huaihe River Line. The North Cluster further splits into two subclusters, with regions 1, 3 and 7 forming one subcluster and regions 2, 4 and 5 the other subcluster.
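The cophenetic correlation coefficient used to rank the algorithms is the Pearson correlation between the original pairwise distances and the heights at which pairs of objects are first joined in the dendrogram. The following pure-Python sketch shows the calculation for average linkage (UPGMA) only. The function names and both distance matrices are hypothetical, and the study's own analysis was run in R.

```python
from itertools import combinations
from math import sqrt


def upgma_cophenetic(d):
    """Cophenetic distance matrix from average-linkage clustering of matrix d."""
    n = len(d)
    clusters = {i: [i] for i in range(n)}       # cluster id -> member indices
    coph = [[0.0] * n for _ in range(n)]

    def cluster_dist(a, b):
        # UPGMA: mean of all pairwise distances between members of a and b.
        pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
        return sum(d[i][j] for i, j in pairs) / len(pairs)

    next_id = n
    while len(clusters) > 1:
        a, b = min(combinations(sorted(clusters), 2),
                   key=lambda pair: cluster_dist(*pair))
        height = cluster_dist(a, b)
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = height  # height of first merge
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return coph


def cophenetic_correlation(d):
    """Pearson correlation between original and cophenetic distances."""
    coph = upgma_cophenetic(d)
    idx = list(combinations(range(len(d)), 2))
    x = [d[i][j] for i, j in idx]
    y = [coph[i][j] for i, j in idx]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    var_x = sum((p - mx) ** 2 for p in x)
    var_y = sum((q - my) ** 2 for q in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical matrices: the first is ultrametric, the second is not.
tree_like = [[0, 2, 6, 6], [2, 0, 6, 6], [6, 6, 0, 4], [6, 6, 4, 0]]
distorted = [[0, 2, 5, 9], [2, 0, 4, 7], [5, 4, 0, 3], [9, 7, 3, 0]]

print(cophenetic_correlation(tree_like))   # dendrogram reproduces the matrix
print(cophenetic_correlation(distorted))   # some information is lost
```

A coefficient close to 1 means the dendrogram preserves the original distances well, which is the criterion applied when choosing among clustering algorithms.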
Table 2

Cophenetic correlation coefficients for five different clustering methods performed on the mean nearest taxon distance (MNTD) pairwise distances.

| Clustering algorithm | Cophenetic correlation coefficient |
| --- | --- |
| Unweighted pair-group method using arithmetic averages (UPGMA) | 0.94 |
| Weighted pair-group method using arithmetic averages (WPGMA) | 0.96 |
| Ward's method | 0.99 |
| Single linkage (SL) | 0.95 |
| Complete linkage (CL) | 0.97 |

Fig. 1

Map (a) and dendrogram (b) resulting from Ward hierarchical clustering, and scatter plot (c) from nonmetric multidimensional scaling (NMDS) two-dimensional ordination, for the Chinese floristic assemblages based on MNTD distance matrices. The eight distinct floristic regions are highlighted in the dendrogram with large colored rectangles and displayed on the map in the same colours.

For the floristic regionalization using pβsim, five floristic regions were identified (Fig. 2; Fig. 4 in Ye et al., 2019).
Fig. 2

Map of the five floristic regions identified by pβsim and major mountain ranges in China. For the list of mountain ranges, see Table S6. The data for the elevation of China (a) and China's major mountain ranges (b) (Fig. 1a) were obtained from the National Fundamental Geographical Information System of China (http://nfgis.nsdi.gov.cn/nfgis/chinese/c_xz.htm).


Characterized clades and indicator genera

The nodesig algorithm identified different characteristic clades for the different phytogeographical regions. The Tethyan region had the largest number of sig. more clades (301 clades), which included 99.12% of the genera (904 genera) in this region, but the mean number of genera contributed by each clade was the smallest (3.00 genera). In contrast, the East Asiatic region had relatively few sig. more clades (57 clades), which included only 42.88% of the genera (988 genera) in this region, but each clade contributed the largest number of genera on average (25.61 genera). Details of sig. more/sig. less nodes and their descendant taxa are shown in Appendix S1: Fig. S1. Among the 2,591 genera analyzed, 34.3% (889 genera) had a significant indicator value (Table 3). This substantial number of indicator genera indicates that the different phytogeographical regions have significantly different floras. The Paleotropic region had the largest proportion (40.6%, 361 genera) of the total indicator genera (889 genera), whereas the Holarctic region had the smallest proportion (5.8%, 52 genera). The indicator genera identified for each phytogeographical region are provided in Appendix S1: Tables S1–S5.
Table 3

Summary of indicator genera and nodes in the phylogeny with significantly more/less (sig. more/less) daughter genera in the phytogeographical regions.

| Phytogeographical region | Number of indicator genera (proportion) | Number of sig. more nodes | Number of sig. less nodes | Number of nodes | Number of genera descending from sig. more nodes | Number of genera | Mean number of genera descending from each sig. more node | Proportion of regional genera descending from sig. more nodes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Paleotropic region | 342 (38.5%) | 158 | 271 | 2,264 | 1,340 | 1,605 | 8.48 | 97.85% |
| Holarctic region | 52 (5.8%) | 152 | 184 | 2,125 | 1,227 | 1,254 | 8.07 | 99.12% |
| East Asiatic region | 230 (25.9%) | 57 | 170 | 2,539 | 1,460 | 2,304 | 25.61 | 42.88% |
| Tethyan region | 147 (16.5%) | 301 | 244 | 1,661 | 904 | 912 | 3.00 | 63.37% |
| QTP region | 118 (13.3%) | 48 | 109 | 2,472 | 813 | 1,897 | 16.94 | 83.49% |

Discussion

Choice of the right index for bioregionalization studies

Beta diversity is composed of two different components: spatial species turnover and nestedness of assemblages, which result from two antithetic processes, namely species replacement and species loss, respectively (Lennon et al., 2001, Koleff et al., 2003, Baselga et al., 2007, Tuomisto, 2010). Using European longhorn beetles as a case study, Baselga (2010) illustrated the relevance of disentangling spatial turnover and nestedness patterns. Baselga and Orme (2012) published the R package “betapart” to compute these dissimilarity measures. In studies of biogeographical regionalization (bioregionalization), the turnover component should be used to measure compositional dissimilarity between grid cells, to avoid the impact of diversity differences (Leprieur and Oikonomou, 2014). Kreft and Jetz (2010) recommended the beta-sim index (βsim), and it has been applied in most subsequent bioregionalization studies (Table 1; González-Orozco et al., 2013, Holt et al., 2013, Linder et al., 2012). However, some studies still used inappropriate indices (e.g., Bray–Curtis, mean nearest phylogenetic distance), which have been shown to be influenced by diversity differences (Slik et al., 2018). In this study, the floristic regions obtained from MNTD distance matrices were also influenced by diversity differences between grid cells, because the Chinese flora has a conspicuous east-west gradient in plant diversity (Lu et al., 2018).
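The partitioning of dissimilarity into turnover and nestedness can be made concrete with a short Python sketch of the taxonomic (not phylogenetic) indices for a single pair of assemblages. It uses the standard formulas βsor = (b + c) / (2a + b + c), βsim = min(b, c) / (a + min(b, c)) and βsne = βsor − βsim, where a is the number of shared taxa and b and c are the numbers of taxa unique to each site. The function name and the example assemblages are hypothetical; pβsim follows the same logic with branch lengths in place of taxon counts.

```python
def beta_components(site1, site2):
    """Return (beta_sor, beta_sim, beta_sne) for two non-empty sets of taxa."""
    s1, s2 = set(site1), set(site2)
    if not s1 or not s2:
        raise ValueError("both assemblages must contain at least one taxon")
    a = len(s1 & s2)   # taxa shared by both sites
    b = len(s1 - s2)   # taxa found only in site 1
    c = len(s2 - s1)   # taxa found only in site 2
    beta_sor = (b + c) / (2 * a + b + c)          # total Sørensen dissimilarity
    beta_sim = min(b, c) / (a + min(b, c))        # turnover component
    beta_sne = beta_sor - beta_sim                # nestedness-resultant component
    return beta_sor, beta_sim, beta_sne


# Case 1: a poor cell nested within a rich cell (richness difference only).
rich = set("ABCDEFGH")
poor = set("AB")
print(beta_components(rich, poor))

# Case 2: two cells of equal richness that replace half of their taxa.
east = set("ABCD")
west = set("CDEF")
print(beta_components(east, west))
```

In the nested case all of the dissimilarity is assigned to nestedness (βsim = 0, βsor = βsne = 0.6), whereas in the replacement case all of it is turnover (βsim = βsor = 0.5). A richness-sensitive index cannot separate these two situations, which is why the turnover component is preferred for regionalization.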

Comparison between two regionalizations

The cluster analysis based on MNTD distance matrices presented a picture of eight regions for the Chinese flora, whereas five regions were identified based on pβsim by Ye et al. (2019). The floristic regions obtained from MNTD distance matrices were similar to the patterns of generic richness in China (Extended Data Figure 8a in Lu et al., 2018). Southern regions 6 and 8 have the highest richness, regions 1, 3 and 7 have moderate richness, and regions 2, 4 and 5 have the lowest richness. This indicates that regionalization based on MNTD distance matrices was strongly influenced by the nestedness component of the metric (Baselga, 2012), as was also found in a study of Indo-Pacific coral reef fishes (Mouillot et al., 2013). In contrast, pβsim was shown to reflect the true phylogenetic turnover of the Chinese angiosperm flora (Fig. 2b in Ye et al., 2019), and the five floristic regions (the Paleotropic, Holarctic, East Asiatic, Tethyan, and QTP regions) clearly reflected the evolutionary history of the flora (Ye et al., 2019). In our study, the characteristic clades identified by the nodesig algorithm differed greatly among phytogeographical regions. For example, clades in the magnoliids (Myristicaceae, Annonaceae, Magnoliaceae, Calycanthaceae, Lauraceae, Hernandiaceae, Piperaceae, and Aristolochiaceae) were significantly associated with the Paleotropic region, while clades in Asteraceae occurred less often than expected in this region but were overabundant in the Holarctic region (Appendix S1: Fig. S2). We found that in the East Asiatic region, the sig. more nodes were deeper nodes with the most descendant genera on average among the five regions. This suggests that the East Asiatic region harbors older lineages in China, while the flora of the Tethyan region, with shallower sig. more nodes, is younger (Table 3).
We conclude the discussion with a description of the five regions (Paleotropic, Holarctic, East Asiatic, Tethyan, and QTP regions) identified by our analysis (Fig. 2).

The five identified regions based on the pβsim metric

The Paleotropic region consists of tropical evergreen broad-leaved forest vegetation on the southeastern coast of China and Hainan Island. Its northern boundary is roughly the Tropic of Cancer (Fig. 2). Of the five regions, this region has the smallest area and lowest elevation, the highest mean annual temperature (MAT) and mean annual precipitation (MAP), and the richest indicator clades and genera (Table 1 in Ye et al., 2019, Table S1 in Appendix S1). The Paleotropic region has the most evolutionarily distinct flora in China (Table 1 in Ye et al., 2019) and is characterized by the presence of thermophilic Malesian taxa (e.g., Bruguiera, Chrysophyllum, Endospermum, Stylidium, Tetracera), which immigrated from southern and southeastern Asia (Axelrod et al., 1998). These tropical and subtropical plant clades and genera have mostly not migrated to the subtropical and temperate areas in central and northern China or to the alpine areas in southwestern China, which is consistent with the Niche Conservatism Hypothesis (Wiens and Donoghue, 2004) and results in the main split of the Chinese flora between the southern Paleotropic and northern Chinese floras, as shown in our clustering dendrogram (Fig. 1b). The flora of the Paleotropic region is the oldest and represents a floristic museum (Lu et al., 2018).

The Holarctic region consists of Inner Mongolia steppe in the west and deciduous broad-leaved forest vegetation in Northeast and North China. Within North China, it is roughly bounded by the Helan Mountains in the west and the Taihang Mountains in the east (Fig. 2). The mean elevation is approximately 700 m, MAT is 4.1 °C, and MAP is nearly 500 mm (Table 1 in Ye et al., 2019). The flora of this region includes 6,878 species that belong to 1,254 genera and 170 families. It represents the Boreal-Tertiary flora in China, which preserved Northern Hemisphere temperate taxa with origins in the late Cretaceous and early Paleogene, such as Acer, Alnus, Corylus, Eleutherococcus, Rhododendron, and Tilia (Sun, 2002b). This region has the fewest indicator genera among the five regions, and these are mainly north temperate taxa (e.g., Filifolium, Hypochaeris, Saposhnikovia, Scabiosa, and Trientalis) (Table 1 in Ye et al., 2019, Table S2 in Appendix S1).

The East Asiatic region includes deciduous broad-leaved forest vegetation east of the Taihang Mountains (Fig. 2), subtropical evergreen broad-leaved forest areas in most of southern China, and seasonal rain forest vegetation in southern Yunnan. Its western border extends along the western margin of the Sichuan Basin to the eastern and southern margin of the Yunnan Plateau. The flora of the East Asiatic region is closely related to that of the Holarctic region. The mean elevation is approximately 600 m a.s.l.; MAT is 16 °C; MAP is nearly 1,200 mm (Table 1 in Ye et al., 2019). The East Asiatic region is characterized by the presence of genera that are endemic to East Asia (e.g., Akebia, Emmenopterys, Pinellia) and genera with an East Asia–North America disjunct distribution (e.g., Liriodendron, Shortia) (Table S3 in Appendix S1). The East Asiatic region hosts the highest plant species richness (20,317 species) among the five regions (Table 3). Several factors may account for this richness. First, as a stable and ancient area, the East Asiatic region preserved the flora from the early Cretaceous to early Paleogene (Wu et al., 2010). Second, with the uplift of the QTP and the intensification of both the Pacific and Indian Ocean monsoons in this area, new habitats appeared during the Neogene, which might have promoted the rapid radiation of some plant taxa, such as Begoniaceae, Gesneriaceae, Orchidaceae, and Rubiaceae (Favre et al., 2015). Third, the East Asiatic region preserved Boreotropical plant taxa that were once common in the Northern Hemisphere but went extinct in other areas during the Quaternary glaciations, such as Cyclocarya, Davidia, Emmenopterys, and Eucommia (Wolfe, 1975, Wen, 1999, López-Pujol and Ren, 2010). Therefore, the East Asiatic region harbors both early and recently diverged taxa in China and East Asia.

The Tethyan region mainly includes the desert areas in Northwest China (Fig. 2). The mean elevation is approximately 1,800 m; MAT is 5.6 °C; and MAP is 111.2 mm. The Tethyan region has the lowest MAP of the five regions. As a temperate desert flora, plant diversity in the Tethyan region is low, with only 5,412 species that belong to 912 genera and 128 families (Table 1 in Ye et al., 2019). This region has 147 indicator genera, which are mainly temperate desert xerophytic shrubs and herbs (e.g., Ammopiptanthus, Glaucium, Lagochilus, Potaninia, Tugarinovia, and Zygophyllum) (Table S4 in Appendix S1). However, the Tethyan region has a high evolutionary distinctiveness in comparison with other regions (Table 1 in Ye et al., 2019). After the Oligocene, the inland Tethys Sea retreated, accompanied by an increasingly dry climate, which resulted in diversification of the xerophytic Tethyan flora (Palamarev, 1989). Consequently, this region is the diversity center for Amaranthaceae (Chenopodioideae), Apiaceae, Brassicaceae, and Lamiaceae (Sun, 2013). Moreover, the Tethyan region was also the most important glacial refugium for Tertiary relics throughout the Quaternary glacial cycles (Qiu et al., 2011); for example, Tetraena mongolica is considered a relic of the Tethys Sea tropical phytogeographical region (Wu et al., 2010).

The QTP region is bordered to the south by the Himalayan Mountains, to the east by the Hengduan Mountains, to the north by the Kunlun Mountains and Qilian Mountains, and to the west by the Karakorum Mountains (Fig. 2). The QTP region is the largest region and has the highest average elevation (4,248 m a.s.l.) and lowest MAT of the five regions. The QTP region is characterized by the presence of alpine genera (e.g., Cyananthus, Diapensia, Lancea, Meconopsis, Morina, and Solms-laubachia) (Table S5 in Appendix S1). This region possesses one of the world's richest floras, with 16,220 species that belong to 1,897 genera and 205 families (Table 1 in Ye et al., 2019), and numerous Chinese and local endemic species (Huang et al., 2016, Zhang et al., 2016a). Because of the uplift of the QTP, this region is also an important origin and diversification center of north temperate and alpine floras, such as the genera Aconitum, Corydalis, Gentiana, Rhododendron, Pedicularis, and Primula (Sun, 2002a, Zhang et al., 2016a).

Conclusions

We demonstrated that the choice of metric can have a dramatic impact on biogeographical regionalization. Regionalizations based on metrics that are sensitive to taxon richness reflect turnover and nestedness together, and therefore cannot reflect the true pattern of taxonomic and phylogenetic turnover (Baselga, 2012). The regionalization based on the MNTD metric is greatly influenced by the richness differences in the Chinese flora. Our phylogeny-based regionalization using the beta-sim metric provides a foundation for future biogeographical and biodiversity studies, and can also be used to develop strategies to conserve taxa and phytogeographical regions with different origins and evolutionary histories more effectively in China.

Declaration of Competing Interest

The authors declare that there is no conflict of interest regarding this manuscript. All the authors agreed to submit this manuscript.