James T Morton, Jon Sanders, Robert A Quinn, Daniel McDonald, Antonio Gonzalez, Yoshiki Vázquez-Baeza, Jose A Navas-Molina, Se Jin Song, Jessica L Metcalf, Embriette R Hyde, Manuel Lladser, Pieter C Dorrestein, Rob Knight.
Abstract
Advances in sequencing technologies have enabled novel insights into microbial niche differentiation, from analyzing environmental samples to understanding human diseases and informing dietary studies. However, identifying the microbial taxa that differentiate these samples can be challenging. These issues stem from the compositional nature of 16S rRNA gene data (or, more generally, taxon or functional gene data); the changes in the relative abundance of one taxon influence the apparent abundances of the others. Here we acknowledge that inferring properties of individual bacteria is a difficult problem and instead introduce the concept of balances to infer meaningful properties of subcommunities, rather than properties of individual species. We show that balances can yield insights about niche differentiation across multiple microbial environments, including soil environments and lung sputum. These techniques have the potential to reshape how we carry out future ecological analyses aimed at revealing differences in relative taxonomic abundances across different samples.

IMPORTANCE: By explicitly accounting for the compositional nature of 16S rRNA gene data through the concept of balances, balance trees yield novel biological insights into niche differentiation. The software to perform this analysis is available under an open-source license and can be obtained at https://github.com/biocore/gneiss.

Author Video: An author video summary of this article is available.
Keywords: Aitchison geometry; balance trees; compositionality; cystic fibrosis; niche; soil microbiology
Year: 2017 PMID: 28144630 PMCID: PMC5264246 DOI: 10.1128/mSystems.00162-16
Source DB: PubMed Journal: mSystems ISSN: 2379-5077 Impact factor: 6.496
FIG 1 (a, b) Hypothetical scenario in which the same 2 samples of 2 proportions may be explained by two different scenarios in the environment. (c) The balance between these 2 proportions is consistent for both scenarios. (d) Balance of Red and Blue species abundances. (e, f) Balances of Red and Blue individuals across an environmental variable. The comparison is of proportions and balances of two environments in the scenario where the Purple/Orange population (i.e., the rightmost bin) triples. The balances were calculated using the groupings specified by the tree.
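The behavior described for FIG 1 can be illustrated with a small numerical sketch. It assumes the standard balance definition from compositional data analysis (a scaled log ratio of the geometric means of two groups of taxa), which is not spelled out in this record; the counts, the grouping, and the function name `balance` are hypothetical and chosen only for illustration.

```python
import numpy as np


def balance(x, num, den):
    """Balance between two groups of taxa (assumed standard ilr form).

    x   : 1-D array of strictly positive abundances or proportions
    num : indices of the taxa in the numerator group
    den : indices of the taxa in the denominator group
    Zeros must be replaced (e.g., with a pseudocount) before calling this.
    """
    x = np.asarray(x, dtype=float)
    r, s = len(num), len(den)
    # Log of the geometric mean of each group.
    log_gmean_num = np.log(x[num]).mean()
    log_gmean_den = np.log(x[den]).mean()
    return np.sqrt(r * s / (r + s)) * (log_gmean_num - log_gmean_den)


# Hypothetical counts for four taxa; only the last taxon triples.
before = np.array([10.0, 20.0, 30.0, 40.0])
after = before.copy()
after[3] *= 3

# Closing the data to proportions changes every entry, even for the
# three taxa whose absolute abundances did not change.
prop_before = before / before.sum()
prop_after = after / after.sum()

# A balance that does not involve the changed taxon is unaffected.
inner_before = balance(prop_before, [0], [1, 2])
inner_after = balance(prop_after, [0], [1, 2])

# The balance that separates the changed taxon from the rest absorbs the shift.
outer_before = balance(prop_before, [0, 1, 2], [3])
outer_after = balance(prop_after, [0, 1, 2], [3])

print("proportions before:", prop_before)
print("proportions after: ", prop_after)
print("inner balance before/after:", inner_before, inner_after)
print("outer balance before/after:", outer_before, outer_after)
```

Because a balance is a log ratio, the normalizing constant cancels, so it gives the same value whether it is computed from counts or from proportions.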
FIG 2 (a) Hierarchical clustering of closed-reference OTUs based on mean pH; (b) balance of low-pH-associated organisms (3.8 < mean pH < 6.7) and high-pH-associated organisms (6.8 < mean pH < 8.2); (c) observed OTU counts sorted by pH; (d) predicted OTU proportions from ordinary least-squares linear regression on balances sorted by pH. The coefficient of determination was 35%, showing that 35% of the variation in the microbial community abundance data can be predicted by pH alone.
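The regression step mentioned in the FIG 2 caption can be sketched as an ordinary least-squares fit of each balance on pH, with a single coefficient of determination pooled over all balances. This is a minimal sketch on made-up numbers, not the gneiss implementation; the helper name `ols_r2`, the pooling of sums of squares across balances, and the data are assumptions.

```python
import numpy as np


def ols_r2(ph, balances):
    """Fit balance ~ intercept + pH for every balance and pool R^2.

    ph       : 1-D array, one pH value per sample
    balances : 2-D array, samples x balances
    Returns (coefficients, r2); coefficients has shape (2, n_balances),
    with row 0 the intercepts and row 1 the pH slopes.
    """
    ph = np.asarray(ph, dtype=float)
    y = np.asarray(balances, dtype=float)
    if y.ndim == 1:
        y = y[:, None]
    # Design matrix with an intercept column.
    design = np.column_stack([np.ones_like(ph), ph])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = (residuals ** 2).sum()
    ss_tot = ((y - y.mean(axis=0)) ** 2).sum()
    return coef, 1.0 - ss_res / ss_tot


# Hypothetical data: one balance tracks pH exactly, the other has no linear trend.
ph = np.array([4.0, 5.0, 6.0, 7.0, 8.0])
balances = np.column_stack([
    0.5 * ph - 3.0,
    np.array([1.0, -1.0, 1.0, -1.0, 1.0]),
])

coef, r2 = ols_r2(ph, balances)
print("pH slopes per balance:", coef[1])
print("pooled R^2:", r2)
```

Under this pooled definition, a value of 0.35 would mean that 35% of the total variance across all balances is captured by the linear pH term.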
FIG 3 (a) Bifurcating tree generated from hierarchical clustering of OTUs based on mean pH. The size of the internal nodes is inversely proportional to the P value of the linear mixed-effects model test on pH for that given balance. A heatmap of all of the OTU abundances sorted by patient is shown. OTUs were log transformed and centered across rows and columns. These abundances are aligned with the tips of the tree. (b) Progression of the top balance over the pH for all of the patients. (c) Progression of the second top balance over the pH for all of the patients.