Monique A Ladds, Nokuthaba Sibanda, Richard Arnold, Matthew R Dunn.
Abstract
BACKGROUND: Functional groups serve two important functions in ecology: they allow for simplification of ecosystem models and can aid in understanding diversity. Despite these important applications, there is no universally accepted method for defining them. A common approach is to cluster species on a set of traits and to validate the resulting groups visually, based primarily on expert opinion. The goal of this research is to determine a suitable procedure for creating and evaluating functional groups that arise from clustering nominal traits.
Keywords: Clustering; Compactness; Connectedness; Fish; Missing data; Morphology; Separation; Stability; Teleost; Traits
Year: 2018 PMID: 30370185 PMCID: PMC6202955 DOI: 10.7717/peerj.5795
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Diet, habitat and morphology traits included in the analysis along with trait type, function, categories, percent missing and references.
| Variable | Function | Data type | Categories | Missing | Reference/s |
|---|---|---|---|---|---|
| Diet | Diet | Nominal | Omnivore, Invert feeder, Piscivore, Herbivore, Gelatinous inverts | 0% | |
| Trophic level | Diet | Continuous/ Discretized | Low (0–3); Medium (3–3.5); High (3.5–4); Very high (4+) | 0% | |
| Common maximum depth (m) | Habitat | Continuous/ Discretized | Reef (0–20.1); Shallow (20.2–54.6); Ocean (54.7–148.4); Deep (148.5+) | 0% | New |
| Maximum depth (m) | Habitat | Continuous/ Discretized | Reef (0–20.1); Shallow (20.2–54.6); Ocean (54.7–148.4); Deep (148.5–403.4); Bathy (403.4+) | 0% | New |
| Temperature preference | Habitat | Nominal | Deep, Temperate, Subtropical, Tropical | 0% | |
| Vertical habitat | Habitat | Nominal | Reef, Pelagic, Demersal, Benthopelagic, Bathypelagic, Bathydemersal | 0% | |
| Horizontal habitat | Habitat | Nominal | Coast, Neritic, Ocean | 0% | |
| Caudal fin shape | Morphology | Nominal | Forked, Rounded, Truncated, Emarginate, Heterocercal, Continuous, Lanceolate | 0% | |
| Swimming mode | Morphology | Nominal | Body caudal fin (BCF), Median paired fin (MPF) | 0% | |
| Body form | Morphology | Nominal | Compressed, Cylindrical, Eel, Flat, Fusiform | 0% | |
| Eye position | Morphology | Nominal | Mid, Side, Top | 0% | |
| Oral gape position | Morphology | Nominal | Subterminal, Terminal, Hyper-protrusible, Inferior, Snout projecting, Lower jaw projecting, Tubular | 0% | |
| Maximum length (cm) | Morphology | Continuous/ Discretized | Small (0–20.1); Medium (20.2–54.6); Large (54.7–148.4); Very large (148.5+) | 0% | |
| Reproductive strategy | Life history | Nominal | Oviparous, Ovoviviparous, Viviparous | 1.7% | |
| Sexual differentiation | Life history | Nominal | Gonochoristic, Hermaphrodite | 1.7% | |
| Migration | Life history | Nominal | Anadromous, Catadromous, Oceanic, None | 12.1% | |
| Parental care | Life history | Nominal | None, Paternal, Resource defence polygyny (RDP), Sheltered | 2.6% | |
| Egg attachment | Life history | Nominal | Pelagic, Benthic, Adhesive, None | 7.8% | |
| Reproduction location | Life history | Nominal | Bay, Ocean, River | 23.3% | |
| Gregariousness/ Schooling type | Life history | Nominal | Facultative, Obligatory, Solitary | 18.1% | |
| Population doubling | Life history | Nominal | High, Medium, Low, Very low | 12.1% | |
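
The discretized rows of the table can be illustrated with a minimal sketch that maps a continuous value onto the listed categories. The bin edges and labels are copied from the "Maximum depth (m)" row; the function name `discretize` and the treatment of boundary values (a value exactly on an edge falls in the lower bin, so 403.4 m counts as "Deep") are assumptions made for illustration, not the authors' code.

```python
from bisect import bisect_left

# Upper bin edges and labels taken from the "Maximum depth (m)" row of the table.
MAX_DEPTH_EDGES = [20.1, 54.6, 148.4, 403.4]
MAX_DEPTH_LABELS = ["Reef", "Shallow", "Ocean", "Deep", "Bathy"]


def discretize(value, upper_edges, labels):
    """Return the label of the first bin whose upper edge is >= value.

    Assumption: a value exactly equal to an edge belongs to the lower bin.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    if len(labels) != len(upper_edges) + 1:
        raise ValueError("need exactly one more label than edges")
    return labels[bisect_left(upper_edges, value)]


if __name__ == "__main__":
    for depth in (5, 30, 100, 250, 1200):
        print(depth, discretize(depth, MAX_DEPTH_EDGES, MAX_DEPTH_LABELS))
```

The same helper would work for the other discretized traits by swapping in the edges and labels from their rows.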
Figure 1. Proportion of values imputed correctly (accuracy) and 95% confidence interval for different imputation methods across varying amounts of missing data.
The four imputation methods displayed are MICE (orange), missForest (blue), missMDA (grey) and mode (black). Bars are jittered for clarity.
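
As a rough illustration of the accuracy measure in Figure 1, the sketch below hides a fraction of a fully observed nominal trait, fills the gaps with the simplest of the four methods (mode imputation), and reports the proportion of hidden values recovered. The function names, the masking scheme and the fixed seed are assumptions for illustration; the other three methods are R packages and are not reproduced here.

```python
import random
from collections import Counter


def mode_impute(values):
    """Replace None entries with the most frequent observed category."""
    observed = [v for v in values if v is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


def imputation_accuracy(complete, missing_fraction, seed=0):
    """Hide a fraction of known values, impute them, and score the result.

    Assumption: missing_fraction < 1, so at least one value stays observed.
    """
    rng = random.Random(seed)
    n_missing = max(1, round(missing_fraction * len(complete)))
    hidden = rng.sample(range(len(complete)), n_missing)
    masked = list(complete)
    for i in hidden:
        masked[i] = None
    imputed = mode_impute(masked)
    correct = sum(imputed[i] == complete[i] for i in hidden)
    return correct / n_missing


if __name__ == "__main__":
    # Hypothetical trait column, not data from the study.
    trait = ["Pelagic"] * 8 + ["Benthic"] * 2
    for fraction in (0.1, 0.3, 0.5):
        print(fraction, imputation_accuracy(trait, fraction))
```

Repeating the masking with different seeds would give the spread needed for a confidence interval.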
Figure 2. Evaluation of the optimal number of clusters using the pseudo F coefficient based on the entropy (PSFE).
All five distance matrices (coloured lines) tested are displayed across three clustering algorithms (facets).
Figure 3. Evaluation of the optimal number of clusters using the within-cluster entropy coefficient (WCE).
All five distance matrices (coloured lines) tested are displayed across three clustering algorithms (facets).
Figure 4. Evaluation of the optimal number of clusters using the difference between the k cluster and k of the WCE scores.
The red bar corresponds to the highest PSFE score for that combination of distance matrix and linkage method, indicating the optimal number of clusters. The black bar is the second highest score, and the colour gradient lightens as the PSFE score decreases (indicating a poor fit). (A–C): WCE difference results for Eskin distance matrix with average, complete and single linkage; (D–F): WCE difference results for Goodall distance matrix with average, complete and single linkage; (G–I): WCE difference results for IOF distance matrix with average, complete and single linkage and (J–L): WCE difference results for Lin distance matrix with average, complete and single linkage.
Figure 5. Clustering results using the average linkage method for four distance matrices (columns) for four different numbers of clusters (rows) displayed in two dimensions as the result of t-SNE.
Colours represent the different groups found with hierarchical clustering. (A–D): t-SNE clustering for Eskin distance matrix with three, five, seven and nine clusters; (E–H): t-SNE clustering for IOF distance matrix with three, five, seven and nine clusters; (I–L): t-SNE clustering for Goodall distance matrix with three, five, seven and nine clusters and (M–P): t-SNE clustering for Lin distance matrix with three, five, seven and nine clusters.
Figure 6. Mean and standard deviation of the bootstrapped (n = 100) Jaccard distance measure from PAM clustering for five nominal distance matrices across four cluster sizes.
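
The Jaccard-based stability summarised in Figure 6 can be sketched as follows. This is a minimal illustration that assumes each cluster is a set of species identifiers and that every original cluster is scored by its best-matching cluster in a resampled solution; the function names are hypothetical, and the resampling and PAM steps themselves are not shown. The code computes Jaccard similarity; the corresponding distance is one minus this value.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    # Two empty sets are treated as identical (an assumption for the edge case).
    return len(a & b) / len(union) if union else 1.0


def cluster_stability(original, resampled):
    """For each original cluster, return its best Jaccard match in `resampled`."""
    return [max(jaccard(c, r) for r in resampled) for c in original]


if __name__ == "__main__":
    # Hypothetical species IDs, not data from the study.
    original = [{1, 2, 3}, {4, 5}]
    resampled = [{1, 2}, {3, 4, 5}]
    print(cluster_stability(original, resampled))
```

Averaging these per-cluster scores over many bootstrap replicates gives a mean and standard deviation of the kind the caption describes.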