Lucas Moitinho-Silva, Georg Steinert, Shaun Nielsen, Cristiane C P Hardoim, Yu-Chen Wu, Grace P McCormack, Susanna López-Legentil, Roman Marchant, Nicole Webster, Torsten Thomas, Ute Hentschel.
Abstract
The dichotomy between high microbial abundance (HMA) and low microbial abundance (LMA) sponges has been observed in sponge-microbe symbiosis, although the extent of this pattern remains poorly understood. We characterized the differences between the microbiomes of HMA (n = 19) and LMA (n = 17) sponges (575 specimens) present in the Sponge Microbiome Project. HMA sponges were associated with richer and more diverse microbiomes than LMA sponges, as indicated by the comparison of alpha diversity metrics. Microbial community structures differed between HMA and LMA sponges, both in terms of Operational Taxonomic Unit (OTU) abundances and across microbial taxonomic levels, from phylum to species. Host identity explained the largest proportion of microbiome variation. Several phyla, classes, and OTUs were differentially abundant in either group and were considered "HMA indicators" and "LMA indicators." Machine learning algorithms (classifiers) were trained to predict the HMA-LMA status of sponges. Among nine different classifiers, higher performance was achieved by Random Forest trained with phylum and class abundances. Random Forest with optimized parameters predicted the HMA-LMA status of an additional 135 sponge species (1,232 specimens) without a priori knowledge. These sponges were grouped in four clusters, of which the largest two were composed of species consistently predicted as HMA (n = 44) and LMA (n = 74). In summary, our analyses showed distinct features of the microbial communities associated with HMA and LMA sponges. The prediction of the HMA-LMA status based on the microbiome profiles of sponges demonstrates the application of machine learning to explore patterns of host-associated microbial communities.
Keywords: 16S rRNA gene; marine sponges; microbial diversity; microbiome; random forest; symbiosis
Year: 2017 PMID: 28533766 PMCID: PMC5421222 DOI: 10.3389/fmicb.2017.00752
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
The effect of HMA-LMA status, geographic region, and host identity on microbial communities based on OTU abundances.
| Source | df | MS | Pseudo-F | P | Estimates of components of variation |
|---|---|---|---|---|---|
| HMA-LMA status | 1 | 41,744 | 5.5289 | 0.002 | 34.238 |
| Geographic region | 8 | 15,266 | 1.4992 | 0.012 | 13.223 |
| HMA-LMA status × geographic region | 5 | 15,398 | 1.6644 | 0.003 | 17.741 |
| Host identity (HMA-LMA status × geographic region) | 37 | 20,925 | 12.249 | 0.001 | 41.229 |
| Residual | 523 | 1708.3 | | | 41.332 |
PERMANOVA analysis was performed with Bray–Curtis dissimilarities between samples obtained from square root transformed OTU abundances.
Term has one or more empty cells.
Estimates of components of variation are shown in squared units of Bray–Curtis dissimilarity.
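As a quick arithmetic check on the table above, the sketch below assumes that the second and third numeric columns are mean squares and pseudo-F values, and that the lowest-level term (host identity) is tested against the residual. Under those assumptions the ratio of the two mean squares reproduces the tabulated pseudo-F for host identity, and the square root of the residual mean square matches the final entry of the residual row. The variable names are illustrative only.

```python
import math

# Values copied from the table above (assumed to be mean squares).
ms_host_identity = 20925.0  # host identity nested in status x region
ms_residual = 1708.3

# Assumption: the lowest-level term is tested against the residual.
pseudo_f = ms_host_identity / ms_residual

# Square root of the residual mean square, for comparison with the
# last column of the residual row.
sqrt_residual = math.sqrt(ms_residual)

print(round(pseudo_f, 3))       # compare with 12.249 in the table
print(round(sqrt_residual, 3))  # compare with 41.332 in the table
```

The same ratio does not reproduce the pseudo-F values of the higher-level terms, which is consistent with those terms being tested against other denominators in this nested design.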
Figure 1. Classification of the HMA-LMA status of sponges based on transmission electron microscopy. Scale bars represent 5 μm, but vary in length. b, bacteria; sc, sponge cell.
Figure 2. Alpha diversity of HMA and LMA sponge samples. Richness (A–C) and diversity (D–F) metrics were calculated for each sample (n = 575) using rarefied OTU abundances. Estimated means and 95% confidence intervals were obtained from linear mixed models of alpha diversity metrics. The effect of HMA-LMA status was tested with likelihood ratio tests, in which two linear mixed-effects models, the full model and the null model, were compared. All metrics were significantly greater (ANOVA, P ≤ 0.001) in the HMA than in the LMA group.
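The caption does not name the individual richness and diversity metrics. As a minimal sketch of what such per-sample metrics look like, the functions below compute observed richness and the Shannon index from a vector of rarefied OTU counts. Both metrics are assumptions chosen as common examples, not necessarily those shown in the panels, and the function names are hypothetical.

```python
import math

def observed_richness(counts):
    """Number of OTUs with at least one read in the sample."""
    return sum(1 for c in counts if c > 0)

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over OTUs present."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Hypothetical rarefied counts for two samples of equal depth.
even_sample = [25, 25, 25, 25]
uneven_sample = [97, 1, 1, 1]

print(observed_richness(even_sample), shannon_index(even_sample))
print(observed_richness(uneven_sample), shannon_index(uneven_sample))
```

Both hypothetical samples have the same richness, but the evenly distributed one has the higher Shannon index, which is why richness and diversity are reported separately.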
Figure 3. Beta diversity of microbial communities associated with HMA and LMA sponge samples. NMDS was conducted from Bray–Curtis dissimilarities between samples based on OTU abundances. The three displayed plots represent the same analysis, where sample symbols and colors stand for (A) HMA-LMA status, (B) geographic region, and (C) host identity.
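For readers unfamiliar with the dissimilarity used here, the sketch below computes the Bray–Curtis dissimilarity between two samples, defined as the sum of absolute abundance differences divided by the sum of all abundances. The square-root transformation is the one stated in the PERMANOVA table notes; whether it was also applied before NMDS is not stated in the caption, so it is left as an optional flag. The function name and example counts are hypothetical.

```python
import math

def bray_curtis(sample_u, sample_v, sqrt_transform=True):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    if len(sample_u) != len(sample_v):
        raise ValueError("samples must cover the same OTUs")
    if sqrt_transform:
        # Square-root transformation down-weights dominant OTUs.
        sample_u = [math.sqrt(x) for x in sample_u]
        sample_v = [math.sqrt(x) for x in sample_v]
    numerator = sum(abs(a - b) for a, b in zip(sample_u, sample_v))
    denominator = sum(a + b for a, b in zip(sample_u, sample_v))
    if denominator == 0:
        return 0.0  # two empty samples are treated as identical
    return numerator / denominator

# Hypothetical OTU counts: no shared OTUs gives the maximum value of 1.
print(bray_curtis([4, 0], [0, 9]))
# Identical samples give 0.
print(bray_curtis([1, 4, 9], [1, 4, 9]))
```

A matrix of such pairwise values between all samples is the input that NMDS and PERMANOVA operate on.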
The effect of HMA-LMA status on microbial communities at different taxonomic ranks.
| Taxonomic rank | MS | Pseudo-F | P | Unclassified (%) |
|---|---|---|---|---|
| Phylum | 21,816 | 12.172 | 0.001 | 5.75 |
| Class | 25,402 | 10.755 | 0.001 | 20.27 |
| Order | 184,850 | 8.7579 | 0.001 | 54.64 |
| Family | 31,065 | 10.029 | 0.001 | 64.11 |
| Genus | 33,377 | 9.9077 | 0.001 | 89.34 |
| Species | 33,061 | 9.6671 | 0.001 | 98.24 |

df: 1, Res: 224, Total: 250
PERMANOVA analysis was performed with Bray–Curtis dissimilarities between samples obtained from square root transformed abundances. See Table .
Percentage of sequences that fell in “unclassified” taxon during the taxonomic grouping of OTU abundances.
Figure 4. Selection of differentially abundant bacterial and archaeal taxa in the microbiomes of HMA and LMA sponge species. Estimated means and 95% confidence intervals were obtained from negative binomial generalized linear models (HMA = 19, LMA = 17) and converted to percentages. (A) Phyla and (B) classes that differed by more than 0.25% in mean relative abundance between groups are displayed. (C) The cut-off for OTUs was a 0.5% difference. All taxa shown had P < 0.05. Classification of OTUs is shown down to their deepest taxonomic level.
Figure 5. Selection and standardization of classifiers. (A) Performance of classifiers trained on phylum, class, and OTU abundances. Percentages of correctly classified samples per species were averaged according to training tables. Weighted means were used because of the difference in the number of HMA (n = 19) and LMA (n = 17) sponges. Error bars represent weighted standard deviations. (B) Performance of Random Forest for phylum and class datasets according to the number of trees in the forest. Means of weighted averages are displayed at the top of the bars. (C) Performance of Random Forest (number of trees in the forest = 50) on classification of known HMA and LMA sponge species. RBF, Radial Basis Function kernel; SVM, Support Vector Machine.
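The caption does not say how the weights were chosen. The sketch below shows one plausible scheme, assumed here purely for illustration: each species is weighted by the inverse of its group size, so that the 19 HMA and 17 LMA species contribute equally as groups. The accuracy values are made up, and the function names are hypothetical.

```python
import math

def weighted_mean(values, weights):
    """Mean of values, each scaled by its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def weighted_std(values, weights):
    """Square root of the weighted variance around the weighted mean."""
    mean = weighted_mean(values, weights)
    variance = sum(w * (v - mean) ** 2 for v, w in zip(values, weights))
    return math.sqrt(variance / sum(weights))

n_hma, n_lma = 19, 17  # group sizes from the caption

# Made-up per-species accuracies (%), for illustration only.
accuracies = [100.0] * n_hma + [50.0] * n_lma

# Assumed weighting: inverse group size, so each group sums to weight 1.
weights = [1.0 / n_hma] * n_hma + [1.0 / n_lma] * n_lma

print(weighted_mean(accuracies, weights))
print(weighted_std(accuracies, weights))
```

With these weights the mean sits midway between the two group averages, whereas an unweighted mean would be pulled toward the larger HMA group.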
Figure 6. Random Forest predictions of the HMA-LMA status of previously uncharacterized sponge species. Prediction of samples was carried out by Random Forest (number of trees in the forest = 50) based on phylum and class abundances. Clustering of the classifier results (left numbered panel, A–D) was performed with affinity propagation. The color scheme of the right panels represents the percentage of samples predicted as either HMA or LMA.
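The per-species percentages described in this caption can be derived from sample-level predictions with a simple tally. The sketch below assumes the classifier output is available as (species, predicted label) pairs; that layout, the function name, and the species names are hypothetical and do not come from the source.

```python
from collections import defaultdict

def percent_predicted_hma(predictions):
    """Percentage of each species' samples that were predicted as HMA.

    predictions: iterable of (species, label) pairs, label 'HMA' or 'LMA'.
    """
    totals = defaultdict(int)
    hma_counts = defaultdict(int)
    for species, label in predictions:
        totals[species] += 1
        if label == "HMA":
            hma_counts[species] += 1
    return {s: 100.0 * hma_counts[s] / totals[s] for s in totals}

# Made-up sample-level predictions for two placeholder species.
example_predictions = [
    ("species_x", "HMA"), ("species_x", "HMA"),
    ("species_x", "HMA"), ("species_x", "LMA"),
    ("species_y", "LMA"), ("species_y", "LMA"),
]

percentages = percent_predicted_hma(example_predictions)
print(percentages)
```

Species whose samples are split between the two labels would fall between 0% and 100%, which is the kind of inconsistency a clustering of these results can separate from the consistently predicted species.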
Figure 7. Relationship between the structures of microbial communities (beta diversity) from classified and predicted sponges. NMDS plots were constructed from Bray–Curtis dissimilarities between samples obtained from (A) phylum, (B) class, and (C) OTU abundances. Points correspond to samples and are colored according to the HMA-LMA classification and to the clusters obtained from Random Forest prediction results (see Figure 6). Density of points along the NMDS dimensions (axes) was plotted in gray.