James T Morton, Liam Toran, Anna Edlund, Jessica L Metcalf, Christian Lauber, Rob Knight.
Abstract
The horseshoe effect is a phenomenon that has long intrigued ecologists. The effect was commonly thought to be an artifact of dimensionality reduction, and multiple techniques were developed to unravel this phenomenon and simplify interpretation. Here, we provide evidence that horseshoes arise as a consequence of distance metrics that saturate, a familiar concept in other fields but new to microbial ecology. A saturating metric loses information about community dissimilarity, simply because it cannot discriminate between samples that do not share any common features. The phenomenon illuminates niche differentiation in microbial communities and indicates species turnover along environmental gradients. Here we propose a rationale for the horseshoe effect observed when multiple dimensionality reduction techniques are applied to simulations, soil samples, and samples from postmortem mice. An in-depth understanding of this phenomenon allows targeting of niche differentiation patterns from high-level ordination plots, which can guide conventional statistical tools to pinpoint microbial niches along environmental gradients.
IMPORTANCE The horseshoe effect is often considered an artifact of dimensionality reduction. We show that this is not true in the case of microbiome data and that, in fact, horseshoes can help analysts discover microbial niches across environments.
Keywords: decomposition; horseshoe; microbial ecology; pH; soil
Year: 2017 PMID: 28251186 PMCID: PMC5320001 DOI: 10.1128/mSystems.00166-16
Source DB: PubMed Journal: mSystems ISSN: 2379-5077 Impact factor: 6.496
FIG 1 (a) A band table where the y axis data represent individual OTUs and the x axis data represent samples. Blocks that are colored black have a value of 1/10, while blocks that are colored white have a value of 0. (b) The first 2 components from a PCA of the band table, yielding the typical horseshoe shape. (c) The Euclidean distance from point 0 to all of the other points. (d) An illustration of the distance saturation property.
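The setup described in FIG 1 can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' simulation code: the table dimensions (40 samples, bands of 10 OTUs shifted by one OTU per sample) and the names `band_table`, `dist`, and `scores` are assumptions chosen for the example. Only the block value of 1/10 comes from the caption.

```python
import numpy as np

def band_table(n_samples=40, band_width=10):
    """Build a band table: rows are OTUs, columns are samples.

    Hypothetical layout: sample j contains OTUs j .. j+band_width-1,
    each with value 1/band_width, so every column sums to 1.
    """
    n_otus = n_samples + band_width - 1
    table = np.zeros((n_otus, n_samples))
    for j in range(n_samples):
        table[j:j + band_width, j] = 1.0 / band_width
    return table

table = band_table()

# Euclidean distance from sample 0 to every sample (cf. panel c).
diff = table - table[:, [0]]
dist = np.sqrt((diff ** 2).sum(axis=0))

# PCA via SVD of the column-centered sample-by-OTU matrix (cf. panel b).
X = table.T
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = U[:, :2] * S[:2]  # first two principal components per sample

# Once two samples share no OTUs (here, 10 or more positions apart),
# the distance stops growing and stays at sqrt(2 / band_width).
saturated = dist[10:]
```

In this sketch the distance grows as `sqrt(2 * j) / 10` for the first ten samples and is then constant at `sqrt(0.2)`, which is the saturation property the caption refers to. Plotting `scores[:, 0]` against `scores[:, 1]` is one way to inspect the resulting ordination.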
FIG 2 (a) Correspondence analysis of 88 soil samples. (b) Distance saturation of the chi-squared metric, plotting the chi-squared distance of the first sample versus all of the other samples. (c) Heat map of log-transformed OTU counts from the 88 soil samples with the samples sorted by pH and the OTUs sorted by mean pH. (d) PCoA of unweighted UniFrac distance. (e) UniFrac distance of a given sample from the last time point versus all of the samples. (f) Heat map of centered log ratio (clr)-transformed (equation 2) OTU counts sorted by harvest days.
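The centered log ratio transform mentioned in panel (f) can be illustrated as follows. Equation 2 is not reproduced in this record, so the sketch uses the standard definition, clr(x) = log(x) minus the mean of log(x) over the OTUs in a sample, and assumes it matches the paper's. The function name `clr`, the pseudocount of 1 used to avoid log(0), and the toy counts are all assumptions for the example.

```python
import numpy as np

def clr(counts, pseudocount=1.0):
    """Centered log ratio transform of a samples-by-OTUs count matrix.

    The pseudocount is an assumption made here so that zero counts
    can be log-transformed; the source does not state how zeros
    were handled.
    """
    x = np.asarray(counts, dtype=float) + pseudocount
    logx = np.log(x)
    # Subtract each sample's mean log abundance (log of geometric mean).
    return logx - logx.mean(axis=-1, keepdims=True)

# Toy data: 2 samples (rows) by 4 OTUs (columns).
toy_counts = np.array([[10, 0, 5, 85],
                       [0, 40, 40, 20]])
toy_clr = clr(toy_counts)
```

Each row of `toy_clr` sums to zero by construction, which is the defining property of the transform.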