Advantages of phylogenetic distance based constrained ordination analyses for the examination of microbial communities.

V Shankar, R Agans, O Paliy.

Abstract

Recently developed high throughput molecular techniques such as massively parallel sequencing and phylogenetic microarrays generate vast datasets providing insights into microbial community structure and function. Because of the high dimensionality of these datasets, multivariate ordination analyses are often employed to examine such data. Here, we show how the use of phylogenetic distance based redundancy analysis provides ecological interpretation of microbial community differences. We also extend the previously developed method of principal response curves to incorporate phylogenetic distance measure, and we demonstrate the improved ability of this approach to provide ecologically relevant insights into temporal alterations of microbial communities.

Year:  2017        PMID: 28743891      PMCID: PMC5526943          DOI: 10.1038/s41598-017-06693-z

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Introduction

Recent advances in high-throughput massively parallel sequencing and phylogenetic microarrays have led to a bloom of studies in the field of microbial ecology[1, 2]. Because these experimental platforms generate large datasets of measured values such as sequence read counts or microarray probe hybridization signals, multivariate statistical analyses are usually employed to interpret the acquired data[3, 4]. Unconstrained ordination analyses such as principal coordinates analysis and principal component analysis are frequently used to assess the variability in the datasets and distribute samples in lower dimensional space according to the matrix of measured values. While useful, these exploratory techniques do not provide a direct assessment of how different explanatory variables such as environmental gradients, sample groups, or patient metadata (age, weight, gender, etc.) contribute to the observed patterns in microbial community variability. To provide such assessment, the use of constrained methods is advocated[3]. The two widely used approaches, redundancy analysis (RDA) and canonical correspondence analysis (CCA), utilize Euclidean and χ² distances, respectively, to calculate the relationships among samples[5, 6]. Neither distance measure takes into consideration the phylogenetic makeup of microbial communities, and thus neither is able to take advantage of this ecologically important information. Phylogenetic trees closely resemble clusters obtained on the basis of shared gene content[7], and microbial phylogeny and function were shown to be linked for many microbial traits[8, 9] (note, however, that significant variability in gene content and functional capacity can often exist even among different strains and closely related species[10]). Thus, the composition and function of a microbial community depend to at least some degree on the phylogeny of its members[11, 12].
In this report we show that phylogenetic distance based constrained analyses provide ecological interpretation of microbial community datasets.

Results and Discussion

To illustrate the advantage of phylogenetic distance based constrained ordination in microbial ecology, we extended the use of a distance-based variant of redundancy analysis (dbRDA)[13, 14] to utilize the phylogenetic UniFrac (UF) distance-based matrix of (dis)similarities among samples in a dataset, as has been done in several previous studies[15-18]. UniFrac distance is computed by calculating the fraction of branch lengths of a combined phylogenetic tree that are not shared between two communities[19]. Thus, two communities that mostly have members of phylogenetically distinct clades would have a large UF distance, whereas communities that consist of different members of the same phylogenetic clade (e.g., same genus but different species) would have a small UF distance. By incorporating the abundance estimates of each taxon in different samples, a weighted UniFrac measure (wUF) can also be calculated. To compare the performance of wUF-dbRDA with several other commonly used constrained ordination analyses, including Euclidean and Bray-Curtis distance-based RDAs as well as CCA, we designed a small synthetic microbial community consisting of a dozen bacterial members known to reside in the human gastrointestinal tract. We chose to use a synthetic dataset at this stage over an actual example of a microbial community in order to reduce overall dataset variance and limit noise arising from the many low-abundance members typically present in most microbial communities[20]. Response variables were counts of each species’ abundance simulated manually with random noise (±10% and ±20% of each species’ target level for more and less abundant species, respectively) as shown in Fig. 1A. Two explanatory variables were defined, “group” and “gender”. The group variable had two levels: samples were drawn either from (i) “healthy” patients (group H) or (ii) patients with a “disease” (group D).
The groups differed by two-fold in the abundance of Fusobacterium nucleatum, a gut bacterial species that has been associated with human colorectal cancer[21]. Within each group, we introduced a further dichotomy between “genders” by varying which species of Clostridium and Bacteroides they harbored (Fig. 1A). The overall abundance of each of these two genera did not differ between genders, and thus phylogenetically there is little overall distinction between “male” and “female” samples. The full numerical dataset is provided in Supplementary File 1. The outputs (first two canonical axes) of traditional (Euclidean) RDA and wUF-dbRDA ordination analyses of this synthetic dataset are visualized in Fig. 1B,D, respectively, whereas the analysis of dataset variance is shown in Fig. 1C,E. Because RDA weighs the importance of all variables equally and only takes into consideration their numerical values, it separates equally well both H/D groups and genders in the constrained ordination space, and reveals a significant contribution of variation in Clostridium and Bacteroides members towards overall data variability. This is evident from canonical axis 1 separating genders rather than H/D groups (Fig. 1B), and from the attribution of two-thirds of the overall variance in the synthetic microbial dataset to the “gender” explanatory variable (Fig. 1C). Both canonical correspondence analysis and Bray-Curtis distance based dbRDA analysis applied to the same dataset produced outputs similar to those of traditional RDA (Supplementary Figure 1). In contrast, phylogenetic distance based wUF-dbRDA attributes much less variance to gender (11% vs 68%, see Fig. 1E) and instead separates samples first according to health/disease status (canonical axis 1 in Fig. 1D).
Thus, while wUF-dbRDA reveals Fusobacterium as the driver of microbial community differences between H and D cohorts as was designed in the structure of our synthetic dataset, that finding is less prominent in traditional RDA and CCA analyses which focus more on the phylogenetically minor variation within Bacteroides and Clostridium genera.
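The contrast between Euclidean and UniFrac distances described above can be illustrated with a minimal sketch. It is not the authors' code (the study used R and the vegan package); the toy tree, its branch lengths, the species labels, and the counts are all invented for illustration. The sketch uses the standard definitions: unweighted UniFrac as the fraction of observed branch length that is unique to one community, and raw (non-normalized) weighted UniFrac as the sum over branches of branch length times the absolute difference in relative abundance below that branch.

```python
# Toy rooted tree: child -> (parent, branch length).
# All names and lengths are hypothetical, chosen only for illustration.
TREE = {
    "Bacteroides": ("root", 1.0),
    "Clostridium": ("root", 1.0),
    "Fus_nuc": ("root", 1.1),
    "Bac_a": ("Bacteroides", 0.1),
    "Bac_b": ("Bacteroides", 0.1),
    "Clo_a": ("Clostridium", 0.1),
    "Clo_b": ("Clostridium", 0.1),
}


def branch_fractions(counts):
    """Fraction of a community's total abundance found below each branch."""
    total = sum(counts.values())
    frac = {node: 0.0 for node in TREE}
    for leaf, n in counts.items():
        node = leaf
        while node != "root":  # walk from the leaf up to the root
            frac[node] += n / total
            node = TREE[node][0]
    return frac


def unweighted_unifrac(a, b):
    """Unique branch length divided by all branch length observed in a or b."""
    fa, fb = branch_fractions(a), branch_fractions(b)
    unique = shared = 0.0
    for node, (_, length) in TREE.items():
        in_a, in_b = fa[node] > 0, fb[node] > 0
        if in_a and in_b:
            shared += length
        elif in_a or in_b:
            unique += length
    return unique / (unique + shared)


def weighted_unifrac(a, b):
    """Raw weighted UniFrac: sum of length * |difference in relative abundance|."""
    fa, fb = branch_fractions(a), branch_fractions(b)
    return sum(length * abs(fa[n] - fb[n]) for n, (_, length) in TREE.items())


def euclidean(a, b):
    """Euclidean distance on raw species counts (no phylogeny involved)."""
    species = sorted(set(a) | set(b))
    return sum((a.get(s, 0) - b.get(s, 0)) ** 2 for s in species) ** 0.5


# Hypothetical samples: "genders" swap species within a genus,
# "disease" doubles the Fusobacterium count.
healthy_m = {"Bac_a": 40, "Clo_a": 40, "Fus_nuc": 20}
healthy_f = {"Bac_b": 40, "Clo_b": 40, "Fus_nuc": 20}
disease_m = {"Bac_a": 40, "Clo_a": 40, "Fus_nuc": 40}

for name, dist in [("Euclidean", euclidean),
                   ("unweighted UniFrac", unweighted_unifrac),
                   ("weighted UniFrac", weighted_unifrac)]:
    print(f"{name}: gender contrast = {dist(healthy_m, healthy_f):.3f}, "
          f"group contrast = {dist(healthy_m, disease_m):.3f}")
```

In this toy setting the Euclidean distance ranks the gender contrast as the larger one, because it treats each species as an independent variable, whereas weighted UniFrac ranks the group contrast as the larger one, because swapping species within a genus only changes short terminal branches. The unweighted measure ignores abundance, so it registers no group contrast at all here. The outcome depends on the invented branch lengths and is meant only to mirror the qualitative behavior discussed in the text.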
Figure 1

Comparison of the outputs between RDA and weighted UniFrac distance based dbRDA. (A) Structure of synthetic community dataset used as input for RDA ordination analyses. Top panel shows the differences between groups in the abundances of community members; bottom panel depicts the phylogenetic relationship among species. (B) and (D) Triplots of the Euclidean distance-based RDA output (panel B) and the weighted UniFrac distance-based RDA output (panel D). First two canonical axes are visualized. Species scores are shown as arrows; species names are shown in three-letter code (please refer to phylogenetic tree in panel A for definitions). Explanatory variables are shown as squares; samples are shown as colored circles. Sample names designate group (“D” or “H”) and “gender” (“M” or “F”). (C) and (E) Venn diagrams present the analysis of variance of RDA (panel C) and weighted UniFrac distance-based RDA (panel E) models. Structure (panel F), UF-dbRDA ordination output (panel G), and analysis of variance of UF-dbRDA model (panel H) of the kIBS-kHLT dataset originally published by Rigsbee et al.[22]. In panel H, (*) indicates a statistically significant relationship between an explanatory variable and the response variable dataset at the α = 0.01 level.

We subsequently applied UF-dbRDA analysis to the microbiota abundance dataset available from the Rigsbee et al. study[22]. The dataset comprised phylogenetic microarray based abundance values for 775 phylotypes of human gut microbiota profiled in two cohorts of teenagers: healthy group (designated kHLT) and those diagnosed with diarrhea-predominant irritable bowel syndrome (designated kIBS). Group (healthy vs IBS), gender, and age served as explanatory variables in our UF-dbRDA analysis (see Fig. 1F). The UF-dbRDA ordination using the first two canonical axes is shown in Fig. 1G, and the analysis of variance is presented in Fig. 1H.
While there were many unknown gradients of variance influencing the microbiota composition (expected for the complex human gut microbiota dataset and indicated by a large fraction of residual unexplained variance), the group assignment was the most dominant predictor among the explanatory variables tested. It accounted for the largest explained variance (Fig. 1H), the samples were separated according to the group assignment along the first canonical axis (Fig. 1G), and it was the only statistically significant relationship between explanatory and response variables. Unconstrained principal coordinates analysis of the same dataset similarly indicated a visual separation of kIBS and kHLT samples in the ordination space, though it could not provide statistical evaluation of the relationship (see ref. 23). Phylogenetic distance-based redundancy analysis has also been successfully employed in other recent studies, where dbRDA was used to reveal the extent to which age, body mass index, and country of residence influenced gut microbiota composition in US and Egyptian teenagers[18], to uncover major genera driving skin microbiome differentiation among individuals[17], to test the separation of gut microbiota of IBS patients from that of healthy controls[15], to show the effects of plant host, amount of available nitrogen, and competitor removal on the root-associated bacterial community assembly[16], to test if cloacal microbiome of barn swallows differed between males and females and between breeding colonies[24], and to identify the major environmental factors controlling bacterial and fungal community composition in soils[25-27]. To further demonstrate the utility of phylogenetic distance based constrained ordination analyses, we also extended the method of principal response curves (PRC)[28] to use a phylogenetic distance measure in its calculations. 
PRC was originally developed to analyze time-series data and carries out partial RDA ordination to obtain estimates of community changes using time as a predictor variable. Here, we developed an extension of PRC by incorporating phylogenetic weighted UniFrac distance into its distance matrix calculations. We compared the performance of wUF-dbPRC and Euclidean distance-based PRC on a synthetic community dataset visualized in Fig. 2A. The dataset contained abundance values for the same set of 12 bacterial species shown in Fig. 1A, and incorporated a gradual reduction of Fusobacterium abundance over the observation period. Abundances of individual species of Bacteroides and Clostridium oscillated from one time point to another; however, the overall abundance of each of these genera remained the same (see Fig. 2A). The full numerical dataset is provided in Supplementary File 2. Standard PRC analysis showed a significant oscillating pattern in community structure, with no indication of consistent community alteration over time (Fig. 2B). Bray-Curtis distance based dbPRC analysis also showed an oscillating composition of the community (Supplementary Figure 2). In contrast, wUF-dbPRC output clearly revealed a community change starting from time point 1 and demonstrated that Fusobacterium is the main single driver of these changes (Fig. 2C).
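The difference between the two time-series outputs can be made concrete with a small sketch. This is not PRC and not the authors' code: it only tracks how far each time point lies from the first time point under a Euclidean distance and under a raw weighted UniFrac distance. The three-species tree, the branch lengths, and the counts are hypothetical, chosen so that two Bacteroides species oscillate while the genus total stays fixed and Fusobacterium declines steadily.

```python
# Hypothetical branch lengths and root paths for a three-species toy tree.
BRANCH_LENGTH = {"Bac_a": 0.1, "Bac_b": 0.1, "Bacteroides": 1.0, "Fus_nuc": 1.1}
PATH_TO_ROOT = {
    "Bac_a": ["Bac_a", "Bacteroides"],
    "Bac_b": ["Bac_b", "Bacteroides"],
    "Fus_nuc": ["Fus_nuc"],
}

# Invented counts per time point: the Bacteroides species swap back and forth
# (genus total stays at 80) while Fusobacterium falls by 8 at each step.
SERIES = [
    {"Bac_a": 60, "Bac_b": 20, "Fus_nuc": 40},
    {"Bac_a": 20, "Bac_b": 60, "Fus_nuc": 32},
    {"Bac_a": 60, "Bac_b": 20, "Fus_nuc": 24},
    {"Bac_a": 20, "Bac_b": 60, "Fus_nuc": 16},
    {"Bac_a": 60, "Bac_b": 20, "Fus_nuc": 8},
]


def branch_fractions(counts):
    """Relative abundance found below each branch of the toy tree."""
    total = sum(counts.values())
    frac = dict.fromkeys(BRANCH_LENGTH, 0.0)
    for species, n in counts.items():
        for branch in PATH_TO_ROOT[species]:
            frac[branch] += n / total
    return frac


def weighted_unifrac(a, b):
    """Raw weighted UniFrac: sum of length * |difference in relative abundance|."""
    fa, fb = branch_fractions(a), branch_fractions(b)
    return sum(BRANCH_LENGTH[br] * abs(fa[br] - fb[br]) for br in BRANCH_LENGTH)


def euclidean(a, b):
    """Euclidean distance on raw species counts."""
    return sum((a[s] - b[s]) ** 2 for s in PATH_TO_ROOT) ** 0.5


baseline = SERIES[0]
wuf_track = [weighted_unifrac(baseline, sample) for sample in SERIES]
euc_track = [euclidean(baseline, sample) for sample in SERIES]

for t, (w, e) in enumerate(zip(wuf_track, euc_track)):
    print(f"time {t}: weighted UniFrac = {w:.3f}, Euclidean = {e:.1f}")
```

In this toy series the Euclidean track alternates between larger and smaller values, following the species swaps within Bacteroides, while the weighted UniFrac track increases at every step, following the Fusobacterium decline. A real PRC analysis additionally fits a constrained model and reports taxon weights, which this sketch does not attempt.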
Figure 2

Comparison of the outputs between PRC and weighted UniFrac distance based dbPRC. (A) Structure of synthetic community dataset used as input for PRC ordination analyses. Please refer to phylogenetic tree in Fig. 1A for definitions of species codes. (B) and (C) Principal response curves plots for PRC (panel B) and weighted UniFrac distance-based dbPRC (panel C) analyses. Genus weights contributing to each statistical model are shown on the right side of each panel. (D) wUF-dbPRC analysis of genus level community structure in three patients with Clostridium difficile associated disease (CDAD) undergoing fecal microbiota transplantation (original dataset was published by Shankar et al.[29]). Each curve corresponds to a different individual as shown; genus weights are provided in the right side panel. The initial time point corresponds to community structure prior to FMT; all other time points represent days after the FMT procedure, which was carried out at time 0.

We then applied the wUF-dbPRC analysis to the time-series measurements of microbiota composition taken from the study by Shankar and co-workers[29]. The study described fecal microbiota changes in three patients with Clostridium difficile associated disease following fecal microbiota transplantation (FMT) from a healthy donor. The results of the wUF-dbPRC analysis are presented in Fig. 2D. The fecal microbiota in all three patients changed drastically within a few days following the FMT procedure, and the community remained stable over a three-month period. The analysis of variable weights from the wUF-dbRDA model identified the genera that contributed most to these changes (aerotolerant microbes decreased, and many well-known fiber degraders increased in abundance following fecal microbiota transfer). These results match those originally reported in the Shankar et al. study based on the K-means cluster analysis[29], and additionally provide the ability to quantitatively establish the main determinants of community alterations.
While our synthetic datasets were designed specifically to show the potential differences in outputs between traditional and weighted UniFrac distance-based RDA and PRC analyses, the above comparisons provide compelling evidence for the advantages of phylogenetic distance based constrained ordination analyses in the study of microbial communities.

Methods

The constrained ordination techniques were performed in R using the vegan package[30]. Specifically, distance-based redundancy analysis was performed using the vegan function capscale. Principal response curves analysis was performed using the prc function. Analysis of variance was performed using the anova.cca command. The R code to run both dbRDA and dbPRC analyses is provided in Supplementary File 3; identical output for dbRDA can also be obtained with built-in Phyloseq R package functions.
  24 in total

1.  Microbial community structure across the tree of life in the extreme Río Tinto.

Authors:  Linda A Amaral-Zettler; Erik R Zettler; Susanna M Theroux; Carmen Palacios; Angeles Aguilera; Ricardo Amils
Journal:  ISME J       Date:  2010-07-15       Impact factor: 10.302

Review 2.  Application of multivariate statistical techniques in microbial ecology.

Authors:  O Paliy; V Shankar
Journal:  Mol Ecol       Date:  2016-03       Impact factor: 6.185

Review 3.  Microbial ecology in the age of genomics and metagenomics: concepts, tools, and recent advances.

Authors:  Jianping Xu
Journal:  Mol Ecol       Date:  2006-06       Impact factor: 6.185

4.  Phylogenetic beta diversity: linking ecological and evolutionary processes across space in time.

Authors:  Catherine H Graham; Paul V A Fine
Journal:  Ecol Lett       Date:  2008-12       Impact factor: 9.492

5.  Distal gut microbiota of adolescent children is different from that of adults.

Authors:  Richard Agans; Laura Rigsbee; Harshavardhan Kenche; Sonia Michail; Harry J Khamis; Oleg Paliy
Journal:  FEMS Microbiol Ecol       Date:  2011-06-01       Impact factor: 4.194

6.  Quantitative profiling of gut microbiota of children with diarrhea-predominant irritable bowel syndrome.

Authors:  Laura Rigsbee; Richard Agans; Vijay Shankar; Harshavardhan Kenche; Harry J Khamis; Sonia Michail; Oleg Paliy
Journal:  Am J Gastroenterol       Date:  2012-11       Impact factor: 10.864

7.  Highly heterogeneous soil bacterial communities around Terra Nova Bay of Northern Victoria Land, Antarctica.

Authors:  Mincheol Kim; Ahnna Cho; Hyoun Soo Lim; Soon Gyu Hong; Ji Hee Kim; Joohan Lee; Taejin Choi; Tae Seok Ahn; Ok-Sun Kim
Journal:  PLoS One       Date:  2015-03-23       Impact factor: 3.240

8.  Insights into the pan-microbiome: skin microbial communities of Chinese individuals differ from other racial groups.

Authors:  Marcus H Y Leung; David Wilkins; Patrick K H Lee
Journal:  Sci Rep       Date:  2015-07-16       Impact factor: 4.379

9.  Species and genus level resolution analysis of gut microbiota in Clostridium difficile patients following fecal microbiota transplantation.

Authors:  Vijay Shankar; Matthew J Hamilton; Alexander Khoruts; Amanda Kilburn; Tatsuya Unno; Oleg Paliy; Michael J Sadowsky
Journal:  Microbiome       Date:  2014-04-21       Impact factor: 14.650

10.  Reduction of butyrate- and methane-producing microorganisms in patients with Irritable Bowel Syndrome.

Authors:  Marta Pozuelo; Suchita Panda; Alba Santiago; Sara Mendez; Anna Accarino; Javier Santos; Francisco Guarner; Fernando Azpiroz; Chaysavanh Manichanh
Journal:  Sci Rep       Date:  2015-08-04       Impact factor: 4.379
