Frank O Aylward1, Garret Suen2, Peter H W Biedermann3, Aaron S Adams4, Jarrod J Scott, Stephanie A Malfatti5, Tijana Glavina del Rio5, Susannah G Tringe5, Michael Poulsen6, Kenneth F Raffa4, Kier D Klepzig7, Cameron R Currie1. 1. faylward@hawaii.edu currie@bact.wisc.edu. 2. Department of Bacteriology, University of Wisconsin-Madison, Madison, Wisconsin, USA. 3. Insect Symbiosis Research Group, Max Planck Institute for Chemical Ecology, Jena, Germany. 4. Department of Entomology, University of Wisconsin-Madison, Madison, Wisconsin, USA. 5. Department of Energy Joint Genome Institute, Walnut Creek, California, USA. 6. Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark. 7. USDA Forest Service, Southern Research Station, Asheville, North Carolina, USA.
Abstract
The ability to cultivate food is an innovation that has produced some of the most successful ecological strategies on the planet. Although best recognized in humans, where agriculture represents a defining feature of civilization, species of ants, beetles, and termites have also independently evolved symbioses with fungi that they cultivate for food. Despite occurring across divergent insect and fungal lineages, the fungivorous niches of these insects are remarkably similar, indicating convergent evolution toward this successful ecological strategy. Here, we characterize the microbiota of ants, beetles, and termites engaged in nutritional symbioses with fungi to define the bacterial groups associated with these prominent herbivores and forest pests. Using culture-independent techniques and the in silico reconstruction of 37 composite genomes of dominant community members, we demonstrate that different insect-fungal symbioses that collectively shape ecosystems worldwide have highly similar bacterial microbiotas composed primarily of the genera Enterobacter, Rahnella, and Pseudomonas. Although these symbioses span three orders of insects and two phyla of fungi, we show that they are associated with bacteria sharing high whole-genome nucleotide identity. Due to the fine-scale correspondence of the bacterial microbiotas of insects engaged in fungal symbioses, our findings indicate that this represents an example of convergence of entire host-microbe complexes.
IMPORTANCE: The cultivation of fungi for food is a behavior that has evolved independently in ants, beetles, and termites and has enabled many species of these insects to become ecologically important and widely distributed herbivores and forest pests. Although the primary fungal cultivars of these insects have been studied for decades, comparatively little is known of their bacterial microbiota. 
In this study, we show that diverse fungus-growing insects are associated with a common bacterial community composed of the same dominant members. Furthermore, by demonstrating that many of these bacteria have high whole-genome similarity across distantly related insect hosts that reside thousands of miles apart, we show that these bacteria are an important and underappreciated feature of diverse fungus-growing insects. Because of the similarities in the agricultural lifestyles of these insects, this is an example of convergence between both the life histories of the host insects and their symbiotic microbiota.
Symbioses between metazoans and microbial communities are ubiquitous in nature and have contributed to many of the watershed events in the history of life on Earth (1–3). These symbiotic microbiota, which range from simple consortia of relatively few species to highly complex and dynamic communities, have been shown to benefit their hosts through defense against pathogens (4–7), degradation of recalcitrant dietary material (8–10), and biosynthesis of essential nutrients (11–13). The vast physiological potential of microbes plays an important role in the acquisition of novel ecological strategies in metazoans, and microbial symbionts have been argued to play an important role in host adaptation and speciation (14). Despite the importance of symbiotic microbiota for the evolution and physiology of their host, the forces that shape the structure and dynamics of these communities are not well understood.
Metazoans collectively associate with a vast phylogenetic diversity of microbes encompassing all three domains of life (15). Although these host-associated communities are also physiologically diverse, there often exists substantial functional redundancy between distantly related community members (15, 16). This diversity and functional redundancy of host-associated microbiota have implications for community structure. Specifically, patterns of convergence observed in the functional potential of host-associated communities of distantly related hosts have been postulated to be the product of selection for microbial groups that possess particular traits but are not always closely related (17, 18). For example, studies of various metazoan hosts have indicated that host niche is a primary determinant of the overall physiological capacity of microbiota that can lead to functional convergence irrespective of the evolutionary history of the host (19–21). 
Moreover, other recent studies have generally found that niche-specific factors are important in structuring host-associated microbiota (22–24). However, host phylogeny has also been linked to microbial community composition in studies of primates and insects (24–31), indicating that a number of factors are important for structuring the phylogenetic composition and functional capacity of host-associated microbiotas.
In this study, we characterized bacterial microbiotas associated with fungus-growing insects, which comprise distantly related ant, beetle, and termite lineages that have independently established symbioses with fungi (32). The microbial symbioses of these hosts are central to their life histories and have enabled them to become dominant herbivores and prevalent tree pests in widespread tropical and temperate ecosystems (32–34). These insects are able to shape ecosystems around the globe largely because their symbiotic microbiota act as “ancillary guts” that degrade recalcitrant plant biomass and convert it into nutrients more accessible to their hosts (32, 35–39), thereby allowing these insects to exploit ecological niches that would otherwise be unavailable. In some fungus-growing insects, the external digestive systems have led to the evolution of species that build elaborate colonies composed of millions of insects divided into castes with distinct tasks, behaviors, and morphologies, as demonstrated by leaf-cutter ants of the genus Atta and fungus-growing termites of the genus Macrotermes (32, 33).
Here, we analyzed fungus-growing ants (Tribe: Attini), ambrosia beetles (Tribe: Xyleborini), and termites (subfamily: Macrotermitinae), which all engage in obligate fungal agriculture (32), as well as mountain and southern pine beetles (genus Dendroctonus), which, although lacking many of the true agricultural characteristics of the other insects, also associate with mutualistic fungi that they consume for food (34, 40). 
Our samples included three insect orders and spanned a considerable portion of the global distribution of these insect-fungal symbioses (Fig. 1A and B; see Table S1 in the supplemental material). Due to the independent origins of these nutritional fungal symbioses across distantly related host lineages collected from across the globe, our analyses of their microbiotas provide a unique opportunity to assess the extent to which the similarity of their ecological niches has influenced the structure of their associated microbial communities. To this end, we sought to provide a fine-scale comparison of the composition of bacterial communities associated with these insect-fungal symbioses through sequencing of 18 16S amplicon libraries comprising a total of 136,400 quality-filtered sequences (minimum length of 200 bp) and 18 community metagenomes comprising a total of 6.8 Gbp of raw sequence data from which we reconstructed 37 composite genomes of dominant community members (see Tables S2 and S3).
FIG 1
Distribution of insect-fungal symbioses and composition of their bacterial microbiota. (A) Map showing the global distribution of the insects analyzed here (colored regions on the map) and the locations from which samples were obtained in this study (circled). Pie charts show the phylogenetic composition of bacteria identified from 16S amplicon libraries sequenced from each sample, with colors corresponding to bacterial phylogenetic groups (in key). Metagenomes constructed from both the top and bottom strata of fungus gardens are shown for the leaf-cutter ant Atta colombica. Global insect distributions are based on previous estimates (see Materials and Methods). (B) Simplified phylogeny of select insect orders (based on that previously reported [79]). Orders that include insects with insect-fungal symbioses presented in this study are highlighted in blue.
RESULTS AND DISCUSSION
The 16S libraries of all insect samples were dominated by sequences classified as belonging to the class Gammaproteobacteria (ranging from 54 to 99% of sequences per sample), except in the adult ambrosia beetles, where the phylum Bacteroidetes dominated (46% of sequences) and Gammaproteobacteria were the next most abundant group (23% of sequences) (Fig. 1A). The genera Pseudomonas, Enterobacter, and Rahnella were the most abundant in our genus-level analysis, and rank-abundance curves demonstrated that 2 or 3 groups dominated the composition of our 16S libraries (see Fig. S1 and S2 in the supplemental material). All libraries contained between 31 and 753 operational taxonomic units (OTUs; 95% identity cutoff), and the rarefaction analyses demonstrated sufficient sampling (Fig. S3). Relative abundance quantification of the contigs in the community metagenomes corroborated the results of our 16S-based analysis in showing that the gammaproteobacterial families Enterobacteriaceae and Pseudomonadaceae dominated all of our samples (Fig. 2A). Comparison of the Clusters of Orthologous Groups (COG [41]) and Protein Families (Pfam [42]) profiles of these metagenomes to publicly available metagenomes revealed clustering into three groups: one containing all fungus-associated insect samples, one containing other host-associated communities, and one containing environmental (non-host-associated) communities (Fig. 3; see Table S4). In addition to the correspondence identified for the phylogenetic composition of the insect communities sampled here, this functional clustering demonstrates a general equivalence of the physiological potential of these microbiota that is distinct from other microbial communities for which metagenomes are available.
FIG 2
Phylogenetic binning comparisons of the 18 metagenomes analyzed in this study. (A) Family-level binning and coverage-weighted relative abundance comparison of the contigs in the metagenomes. The genera Rahnella and Enterobacter belong to the family Enterobacteriaceae, while the genus Pseudomonas belongs to the family Pseudomonadaceae. (B) Rank-abundance overview of the most abundant bacterial genera identified in the combined 18 metagenomes using coverage-weighted contig binning. Relative abundance estimates were obtained by multiplying the length of each contig by its coverage and summing the results for a given family- or genus-level bin. (C) Mbp of sequences binned to the genera Rahnella, Enterobacter, and Pseudomonas in the 18 metagenomes. Abbreviations: AB, Alberta; BC, British Columbia; Bot, Bottom.
FIG 3
Principal component analyses (PCA) comparing the functional profiles of the metagenomes of insect-fungal symbioses to those of 57 publicly available metagenomes generated from environmental or gut-associated samples. Annotations were performed using both the Clusters of Orthologous Groups (COG) and Protein Families (Pfam) databases. A full list of metagenomes used can be found in Table S4. A metagenome constructed from the gut of the honey bee was the only metagenome found to cluster near the insect-fungal symbiosis samples (labeled in both panels and in both cases closest to the fungus-growing termite adult sample). Squares indicate the category averages.
The genera Pseudomonas, Enterobacter, and Rahnella were particularly well represented in the community metagenomes, with the genus Pseudomonas abundant in most of the ant, beetle, and termite systems, the genus Enterobacter more common in the fungus-growing ant samples, and the genus Rahnella more abundant in the termite- and beetle-associated samples (Fig. 2B). Genus-level rank-abundance analysis also found these three genera to be overwhelmingly the most abundant in these communities (Fig. 2B and C). 
Mapping of all genes recovered from the Enterobacteriaceae and Pseudomonas bins onto a multilocus phylogeny of reference genomes (Fig. 4A) revealed highly similar phylogenetic profiles among the ambrosia beetle, termite, and pine beetle samples, with sequences most similar to Rahnella aquatilis and Pseudomonas fluorescens strains dominating the phylogenetic profiles of these metagenomes. The fungus-growing ant samples were also dominated by sequences mapping to Enterobacteriaceae and Pseudomonas genomes, but the genes in these metagenomes mapped primarily to the Enterobacter and Pseudomonas putida clades in the phylogeny (Fig. 4A). This pattern held for all fungus-growing ants, including Apterostigma dentigerum, which cultivates a pterulaceous fungus distantly related to the Lepiotaceae family of fungi grown by other attine ants (43).
FIG 4
Comparisons of dominant groups represented in the metagenomes of insect-fungal symbioses. (A) Bubble chart showing the relative abundance of the most prevalent phylogenetic groups identified in the metagenomes. Relative abundances of the genera Pseudomonas, Enterobacter, and Rahnella were calculated using abundance-weighted coverage estimates of binned contigs. For the phylogenetic mapping analysis, all genes predicted from contigs classified to the family Enterobacteriaceae and genus Pseudomonas were mapped onto a maximum-likelihood phylogeny of representative sequenced genomes constructed using concatenated amino acid sequences from 9 highly conserved proteins (see Materials and Methods). Bootstrap support values have been omitted for clarity (a full phylogeny with support values can be found in Fig. S5). (B) Heatmaps showing the ANI values obtained from pairwise BLASTN comparisons of the composite Enterobacter, Pseudomonas, and Rahnella genomes reconstructed in this study. Dendrograms were constructed using a neighbor-joining algorithm with distance matrices constructed from pairwise ANI comparisons (see Materials and Methods).
We reconstructed composite genomes for the dominant microbial community members to compare genomes directly across insect-fungal symbioses (see Materials and Methods). These included 12 Enterobacter genomes, 15 Pseudomonas genomes, and 10 Rahnella genomes, all estimated to be >40% complete (see Table S5 in the supplemental material). Calculation of the pairwise average nucleotide identities (ANI) of the reconstructed genomes revealed that the microbiota from different insect-fungal symbioses contained dominant bacterial constituents that were highly similar to each other at the whole-genome level (often >95% ANI) (Fig. 4B, and see Fig. 
S4), and multilocus phylogenetic analysis of conserved housekeeping genes confirmed that the majority of the reconstructed Enterobacter, Rahnella, and Pseudomonas genomes grouped together in well-supported clades (Fig. 5). The Rahnella and Pseudomonas fluorescens genomes reconstructed from the ambrosia beetle, fungus-growing termite, and pine beetle metagenomes were particularly similar across insect systems, with ANI values in some cases exceeding 98% (Fig. 4B). Composite genomes from the genus Enterobacter could be reconstructed from ant and termite samples and segregated into two distinct groups (ANI of 95.9 to 98.5% and 85.5 to 99.1% within the groups) (Fig. 4B; Fig. S4), while the reconstructed genomes from the Pseudomonas putida group were specific to fungus-growing ants (Fig. 4B and see Fig. S4).
FIG 5
Maximum-likelihood multilocus phylogeny of reference Enterobacter, Pseudomonas, and Rahnella genomes together with the composite genomes reconstructed in this study (color coded according to host insects, as shown in the key). The phylogeny is based on concatenated amino acid sequences of 9 conserved proteins, and local support values were computed using the Shimodaira-Hasegawa test (see Materials and Methods for details).
Our results demonstrate that the microbiotas of diverse insect-fungal symbioses that collectively shape terrestrial ecosystems worldwide contain prevalent and highly similar bacterial constituents. Our comparison of the coding potential of these microbiotas demonstrates broad functional congruence. Moreover, our composite-genome reconstructions reveal that the Enterobacter, Rahnella, and Pseudomonas groups that dominate these communities exhibit high degrees of whole-genome similarity, with the ANI of reconstructed genomes often exceeding 95% (Fig. 4B). In other metazoans, similarities in host niche have been shown to drive convergence in the functional potential of associated microbiota but to influence the phylogenetic composition at broad taxonomic levels (e.g., phylum level) only weakly (19, 21). This pattern, where different hosts possess physiologically congruent microbiotas with phylogenetically distinct composition, is likely due to selection for particular physiological traits that are not necessarily linked to specific phylogenetic groups of microbes. Although the broad congruence of functional profiles across the communities described here is consistent with other systems in which distantly related hosts occupy similar ecological niches (19, 21), our finding of fine-scale phylogenetic convergence across hosts is unexpected, given that the insects analyzed collectively reside on three continents in both tropical and temperate ecosystems in which they are exposed to a vast diversity of microbes. 
This coupling of both functional and phylogenetic convergence in the microbiotas of insect-fungal symbioses, together with the dominance of 2 or 3 bacterial groups in each insect, suggests that highly specific mechanisms for maintaining host-bacterial or fungal-bacterial interactions are responsible for shaping diversity in these systems.
Our findings indicate that bacteria are a common and perhaps even defining feature of widespread insect-fungal symbioses. This is supported by the high degrees of similarity in the dominant bacterial constituents of the ants, termites, and beetles, despite contrasts in the details of the insect-fungal symbioses. For example, these insects cultivate phylogenetically disparate fungi that collectively span two phyla (32), and it is even unclear to what extent mountain and southern pine beetles depend on their symbiotic fungi for nutrition (34). Due to the influence host-associated microbes have been shown to exert on metazoan behavior (44, 45), these findings raise the possibility that similar life history traits in distantly related insect lineages may in some ways be a consequence rather than the cause of the similar bacterial communities they harbor. Together with recent evidence that patterns of host speciation are recapitulated and potentially driven by symbiotic microbiota (14), our finding of convergent community assembly in the bacterial communities of insect-fungal symbioses underscores the fundamental importance of host-microbe interactions in shaping metazoan evolution.
MATERIALS AND METHODS
Insect distributions.
The distributions of insect-fungal symbioses as depicted in Fig. 1 were compiled from previously reported estimates for attine ants (46, 47), Macrotermes natalensis (48), and Xyleborinus saxesenii (49). The distribution of Dendroctonus frontalis is based on estimates from the United States Department of Agriculture (USDA; http://www.fs.usda.gov/Internet/FSE_DOCUMENTS/fsbdev2_042840.pdf). The Invasive Species Compendium (http://www.cabi.org/isc/) was also used for Xyleborinus saxesenii and Dendroctonus ponderosae.
Sample collection, processing, and sequencing.
Insect samples were collected between April and November of 2009. Details regarding the times and locations in which insect samples were acquired can be found in Table S1 in the supplemental material. Sampling of the leaf-cutter ant species Atta cephalotes and Atta colombica and the mountain pine beetle Dendroctonus ponderosae has been described previously (50, 51). All samples were placed on ice immediately after sampling and transported to the laboratory for processing. For all fungus garden, gallery, or whole-insect samples, the bacterial fraction was isolated by gently vortexing the samples in 1% phosphate-buffered saline (PBS) and 0.1% Tween before conducting a modified differential centrifugation procedure, as previously described (50). For the ambrosia beetles, fungus-growing termites, mountain pine beetles, and southern pine beetles, whole-insect or larval samples were also prepared for sequencing. For these samples, 90 to 300 whole insects were pooled for each sample. Once the bacterial fraction of each sample was isolated, DNA was extracted using the bacterial extraction protocol available in the Qiagen DNeasy plant maxikit (Qiagen Sciences, Germantown, MD), which has been shown to yield an accurate representation of community DNA (51). Community metagenomes were subsequently generated by using Roche 454 Titanium pyrosequencing (52), and assemblies were generated using Newbler version 2.1 with the default parameters. Details of the metagenomes and assembly statistics can be found in Table S2.
16S amplicon sequencing, processing, and analysis.
Amplicon libraries spanning the V6-to-V8 (V6-8) region of the 16S ribosomal gene were constructed from DNA extracted from the 18 insect, fungus garden, and gallery samples using methods described previously (53). The libraries were processed using mothur (54) with procedures based on those outlined by Schloss et al. (55) and described on the mothur website (http://www.mothur.org/wiki/454_SOP) as of 1 July 2013. Briefly, flow data associated with the .sff files were extracted using the sffinfo command, and PyroNoise (56) as implemented by the shhh.flows command in mothur was used to trim sequences. Bar code and primer sequences were then removed, and all sequences of <200 bp were discarded. Sequences were then aligned to the SILVA 16S reference dataset (57) using the align.seqs command, and any not overlapping with the V6-8 region were removed (screen.seqs and filter.seqs commands). The pre.cluster command and UCHIME (58), as implemented by the chimera.uchime command, were used to remove chimeras. Sequences were classified through comparison with the NCBI 16S Ribosomal RNA Sequence Library (downloaded 2 January 2013) using BLASTN with the parameter -e−50. Bacterial family- and phylum-level assessments were then made by comparing the top BLASTN hit with the UniProt taxonomy hierarchy (http://www.uniprot.org/taxonomy/) (downloaded 1 May 2013). Sequences not having a BLASTN hit were removed from subsequent analyses on the grounds that they comprised primarily misamplified fungal or insect 18S or chloroplast 16S sequences. Genus-level classifications were performed using only BLASTN hits with ≥97% nucleotide identity.
To generate operational taxonomic units (OTUs), quality-trimmed pyrotags were aligned using MAFFT version 5.662 E-INS-I (59), and distances were calculated using the dist.seqs command in mothur. OTUs were then generated using the cluster command in mothur. 
We focused our analyses on OTUs generated with a 95% identity cutoff, as the more traditional 97% cutoff has been shown to be more appropriate for full-length 16S sequences (60). To calculate the Shannon diversity index of the bacterial component of each 16S library, we first rarefied the data sets down to the number contained in the smallest library (248 bacterial sequences in the ambrosia beetle adult sample). The Shannon diversity index was then calculated for the 95% OTU cutoff for each of the 16S libraries using the summary.single command in mothur. Rarefaction curves for the confirmed bacterial sequences in each of the 16S libraries were also constructed using mothur (the rarefaction.single command). OTUs represented by a single sequence were not included in the rarefaction analysis on the grounds that these OTUs are more likely to be unidentified chimeras or misamplified sequences that may artificially inflate diversity estimates (61). Sequencing statistics for the 16S libraries can be found in Table S3 in the supplemental material, and rarefaction curves for the bacterial sequences are shown in Fig. S3.
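The rarefaction and diversity steps described above can be illustrated with a minimal sketch. This is not mothur's implementation: the function names `rarefy` and `shannon_index` and the toy OTU labels are hypothetical, and the use of the natural logarithm is an assumption. The sketch subsamples a library to a fixed depth without replacement and then computes the Shannon index H = −Σ p_i ln p_i over OTU proportions.

```python
import math
import random
from collections import Counter


def rarefy(otu_labels, depth, seed=0):
    """Subsample `depth` sequences without replacement (one OTU label per sequence)."""
    labels = list(otu_labels)
    if depth > len(labels):
        raise ValueError("depth exceeds the number of sequences in the library")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return rng.sample(labels, depth)


def shannon_index(otu_labels):
    """Shannon diversity H = -sum(p_i * ln p_i) over OTU proportions (natural log assumed)."""
    counts = Counter(otu_labels)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# Hypothetical toy library: one OTU label per quality-filtered sequence.
library = ["otu1"] * 6 + ["otu2"] * 3 + ["otu3"] * 1
subsample = rarefy(library, depth=5)
diversity = shannon_index(subsample)
```

In the study itself every library would be rarefied to the size of the smallest library (248 sequences) before the index is compared across samples.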
Phylogenetic binning and relative abundance estimation.
Phylogenetic bins from all metagenomes were generated using a combination of BLASTN (62) and PhymmBL (63). All contigs and singletons were first compared to a reference data set containing all completely sequenced bacterial and archaeal genomes available in the NCBI as of 1 January 2012. All contigs having BLASTN hits with E values of <1e−10 were classified according to their best hit, while all other contigs were subsequently classified using PhymmBL. Contigs with no BLASTN matches that were classified by PhymmBL with confidence scores of <50 were considered “unclassified.” A relative abundance value for each contig was calculated by multiplying the length of the contig by its percent coverage. The relative abundances of each phylogenetic group in the metagenomes were then calculated by summing the abundance value for each contig classified in a particular bacterial family or genus. To estimate the proportion of a particular phylogenetic group compared to the rest of the metagenome (as depicted for the genera Enterobacter, Pseudomonas, and Rahnella in Fig. 4A), the summed relative abundance of all contigs classified to that group was divided by the summed relative abundance of all contigs in that metagenome.
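The coverage-weighted abundance calculation described above can be sketched as follows. The function name `relative_abundance`, the tuple layout, and the example contigs are hypothetical and serve only to illustrate the arithmetic: each contig contributes length × coverage to its bin, and each bin is then divided by the sum over all contigs in the metagenome.

```python
from collections import defaultdict


def relative_abundance(contigs):
    """Return {taxon: fraction of total length*coverage} for one metagenome.

    `contigs` is an iterable of (taxon, length_bp, coverage) tuples. Contigs
    without a taxonomic assignment should be passed with a label such as
    "unclassified" so that they still count toward the denominator.
    """
    weights = defaultdict(float)
    for taxon, length_bp, coverage in contigs:
        weights[taxon] += length_bp * coverage
    total = sum(weights.values())
    if total == 0:
        return {}
    return {taxon: weight / total for taxon, weight in weights.items()}


# Hypothetical contigs, for illustration only.
example = [
    ("Pseudomonas", 1000, 4.0),
    ("Pseudomonas", 2000, 2.0),
    ("Rahnella", 1000, 2.0),
]
fractions = relative_abundance(example)
```

With these toy values the Pseudomonas bin sums to 8,000 and the Rahnella bin to 2,000, giving fractions of 0.8 and 0.2.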
Phylogenetic mapping of community metagenome data.
All contigs in the community metagenomes binned to the family Enterobacteriaceae or genus Pseudomonas were separated, and genes were predicted from these contigs by using Prodigal (64) with the metagenomic gene caller option. BLASTN was used to map genes and contigs onto the reference Enterobacteriaceae and Pseudomonas genomes listed in Fig. 4A (using nondefault parameters of -X 150, -q -1, -F F, and -e 1e−5), as these genomes represented bacterial families or genera found to be abundant in our phylogenetic binning analysis. Only best hits were retained, and the number of hits was catalogued for each genome. A phylogeny of the reference genomes was created (see below), and the number of hits to each reference was mapped onto it using the Interactive Tree of Life (iTOL [65]).
Composite genome reconstruction and completeness estimates.
Composite genomes were reconstructed from the assembled community metagenomes through manual analysis of BLASTN-based homology searches against reference genomes, PhymmBL binning of contigs, and coverage estimates of contigs. First, all contigs >800 bp were binned at the genus level based on their binning assignment, and plots of contig length versus coverage were then generated for each genus in each metagenome. Contigs having similar coverage were placed in the same bin, and the best BLASTN hit of each contig was again cross-referenced to ensure the contigs in each bin were mapping to the same genome (or genomes of closely related bacteria: for example, different strains of Pseudomonas putida). Estimates of the completeness of the composite genomes were generated by extrapolating from the number of single-copy housekeeping genes present in the assembled contig bins. We used a list of 182 core proteins previously used for this purpose (53), although 20 of these proteins were excluded on the grounds that they were not present in complete genomes of bacteria closely related to those identified in the metagenomes, namely, other sequenced Pseudomonas, Enterobacter, and Rahnella genomes. COG models for the remaining 162 core proteins were used for their identification, and the presence of these proteins in each composite genome was ascertained using reverse position specific (RPS)-BLAST (66) (E value, <1e−5) of the predicted proteins in each bin. Only bins with >2 Mb of total sequence that were predicted to be >40% complete (see below) were considered for subsequent analyses (a list of the final composite genomes used can be found in Table S5 in the supplemental material).
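The completeness estimate and the bin filter can be sketched as below. This is a simplified illustration that takes completeness as the percentage of the 162 core proteins detected in a bin; the exact extrapolation used is not detailed here, and the function names and COG identifiers are hypothetical.

```python
def percent_complete(core_found, core_total=162):
    """Percentage of the core protein set detected in a contig bin."""
    return 100.0 * len(set(core_found)) / core_total

def passes_filter(total_bp, completeness, min_bp=2_000_000, min_complete=40.0):
    """Keep bins with >2 Mb of sequence and >40% estimated completeness."""
    return total_bp > min_bp and completeness > min_complete

# Hypothetical bin: 81 distinct core COGs detected in 3.1 Mb of contigs.
found = [f"COG{i:04d}" for i in range(81)]
completeness = percent_complete(found)
print(completeness, passes_filter(3_100_000, completeness))
```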
Phylogenetic analysis of complete and composite genomes.
Phylogenetic trees of select Enterobacteriaceae and Pseudomonas genomes were constructed from a concatenated amino acid alignment of translations of the highly conserved single-copy housekeeping genes recA, fusA, recG, rpoB, rplB, lepA, ileS, pyrG, and leuS, which have previously been shown to be useful phylogenetic markers (67). Hidden Markov models (HMMs) were created from curated alignments of these proteins available from the Ribosomal Database Project (68), using the hmmbuild command in HMMER3 (69). Proteins encoded in the genomes of interest were predicted using Prodigal (64) and compared to the HMMs using the hmmsearch command in HMMER3, and only best matches were retained. Partial-length proteins were manually annotated before being included in the final protein set. Proteins from the same genome were concatenated using custom Perl scripts, and missing proteins were replaced with the character “X” so that a protein could be included in the final concatenated alignment even if it was not present in all genomes analyzed. Alignments were created using MAFFT version 5.662 E-INS-i (59) and trimmed using the program trimAl version 1.2 (70) with the parameter -automated1. Maximum-likelihood phylogenetic trees were constructed using FastTree (71), and support values for the nodes were calculated using the Shimodaira-Hasegawa (SH) test (72). Trees were visualized using iTOL.
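The concatenation step with placeholders for missing proteins might look like the following. This is a Python sketch rather than the custom scripts used in the study; the function name and the example sequences are hypothetical, and a single “X” per missing marker is assumed.

```python
# Marker order follows the gene list given in the text.
MARKERS = ["recA", "fusA", "recG", "rpoB", "rplB", "lepA", "ileS", "pyrG", "leuS"]

def concatenate_markers(proteins, markers=MARKERS, placeholder="X"):
    """Join marker proteins in a fixed order, using a placeholder
    character wherever a marker was not recovered from the genome."""
    return "".join(proteins.get(marker, placeholder) for marker in markers)

# Hypothetical composite genome in which only two markers were recovered.
composite = {"recA": "MAID", "rpoB": "MVYS"}
print(concatenate_markers(composite))
```

Using the same marker order for every genome keeps homologous proteins in the same region of the concatenated alignment.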
Composite genome comparisons and BLASTN mapping.
Composite genomes were compared to each other and to complete genomes in the same genera using both average nucleotide identity (ANI) and average amino acid identity (AAI) analyses. While ANI is typically used for comparing closely related bacteria, AAI is useful for more divergent comparisons. Thus, we used ANI to ascertain the degrees of similarity between our composite genomes belonging to the same genus, which were typically very similar, while AAI was used for comparisons of our composite genomes and representative genomes spanning a broader phylogenetic diversity (i.e., across the genus Pseudomonas). Overall, our AAI, ANI, and multilocus phylogenetic analyses gave equivalent results in terms of the overall similarity of the composite genomes analyzed (Fig. 4B, and see Fig. S4 and S5 in the supplemental material). ANI and AAI values were generated for pairs of complete or composite genomes by averaging the percent identities of reciprocal best BLASTN or BLASTP hits of each gene/protein using the parameters -X 150, -q -1, -F F, and -e 1e−5. These BLASTN parameters have previously been shown to provide accurate ANI and AAI values (73). Dendrograms representing the ANI- or AAI-based similarity between different genomes were generated by converting ANI or AAI percent identity values into distances and using the Neighbor algorithm in PHYLIP (74) to calculate corresponding Newick trees with the Mobyle web interface (75).
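A reciprocal-best-hit identity calculation of this kind can be sketched as below. Several details are assumptions made for illustration and are not stated in the text: best hits are chosen by bit score, the identities of the two directions are averaged for each reciprocal pair, and the distance is taken as 1 minus identity/100. All names and hit values are hypothetical.

```python
def best_hits(hits):
    """Keep the highest-scoring hit per query.

    hits: iterable of (query, subject, percent_identity, bit_score).
    Returns {query: (subject, percent_identity, bit_score)}.
    """
    best = {}
    for query, subject, identity, score in hits:
        if query not in best or score > best[query][2]:
            best[query] = (subject, identity, score)
    return best

def average_identity(a_vs_b, b_vs_a):
    """Mean percent identity over reciprocal best hits of two genomes."""
    forward, reverse = best_hits(a_vs_b), best_hits(b_vs_a)
    identities = []
    for query, (subject, identity, _) in forward.items():
        # A pair counts only if each gene is the other's best hit.
        if subject in reverse and reverse[subject][0] == query:
            identities.append((identity + reverse[subject][1]) / 2)
    if not identities:
        raise ValueError("no reciprocal best hits found")
    return sum(identities) / len(identities)

def identity_to_distance(percent_identity):
    """Assumed conversion of percent identity into a distance."""
    return 1.0 - percent_identity / 100.0

# Hypothetical tabular BLAST results for genomes A and B.
a_vs_b = [("a1", "b1", 98.0, 500), ("a1", "b2", 80.0, 200),
          ("a2", "b2", 96.0, 400), ("a3", "b1", 70.0, 100)]
b_vs_a = [("b1", "a1", 98.0, 500), ("b2", "a2", 94.0, 390),
          ("b3", "a3", 90.0, 300)]
ani = average_identity(a_vs_b, b_vs_a)
print(ani, identity_to_distance(ani))
```

A matrix of such pairwise distances is the kind of input that a neighbor-joining program takes to build a dendrogram.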
Comparison of the functional profiles of metagenomes.
The 18 metagenomes generated here were compared to 24 gut-associated metagenomes and 33 non-host-associated metagenomes that are publicly available in the Integrated Microbial Genomes/Metagenomes database (IMG/M) (76) or the Metagenome Rapid Annotation using Subsystem Technology (MG-RAST) database (77) (details are in Table S4 in the supplemental material). Predicted proteins obtained from either the IMG/M or MG-RAST databases were compared to the Clusters of Orthologous Groups (COG) (41) and Protein Families (Pfam) (42) databases using RPS-BLAST (66) (E value, <1e−5). Best RPS-BLAST hits were compiled in a matrix and normalized by the total number of COG or Pfam hits. COG or Pfam families representing <0.1% of total annotated proteins were excluded from subsequent analyses. The normalized matrices were then used for principal component analyses (PCA) using the package FactoMineR (http://factominer.free.fr/) in the R statistical programming environment (http://www.R-project.org/) (78).
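The normalization and filtering that precede the PCA can be sketched as follows. This is an illustration only: it assumes the 0.1% threshold is applied to hits pooled across all metagenomes, which the text does not specify, and the function name and counts are hypothetical. The PCA itself is not reproduced here.

```python
def normalize_and_filter(counts, min_fraction=0.001):
    """Build a normalized sample-by-family matrix.

    counts: {sample: {family: number_of_best_hits}}.
    Returns (kept_families, {sample: [fraction per kept family]}).
    """
    # Pool hits across samples to find rare families (assumed criterion).
    pooled = {}
    for family_counts in counts.values():
        for family, n in family_counts.items():
            pooled[family] = pooled.get(family, 0) + n
    grand_total = sum(pooled.values())
    kept = sorted(f for f, n in pooled.items() if n / grand_total >= min_fraction)

    # Normalize each sample by its total number of hits.
    matrix = {}
    for sample, family_counts in counts.items():
        total = sum(family_counts.values())
        matrix[sample] = [family_counts.get(f, 0) / total for f in kept]
    return kept, matrix

# Hypothetical COG hit counts for two metagenomes.
counts = {
    "s1": {"COG0001": 600, "COG0002": 399, "COG9999": 1},
    "s2": {"COG0001": 500, "COG0002": 1500},
}
kept, matrix = normalize_and_filter(counts)
print(kept, matrix)
```

Each row of the resulting matrix is one metagenome and each column one retained family, which is the layout a PCA routine expects.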
Accession numbers.
All metagenomic data generated in this study can be found in the Integrated Microbial Genomes/Metagenomes Database (IMG/M [76]) under accession numbers 2029527003 to 2029527007, 2030936005, 2032320008, 2032320009, 2035918000, 2035918003, 2043231000, 2044078006, 2044078007, 2065487013, 2065487014, 2084038008, 2084038018, and 2228664020; further details on the metagenomic data are presented in Table S2 in the supplemental material. All 16S pyrotag data sets have been deposited in the NCBI Sequence Read Archive (SRA) under accession numbers SRP006785 and SRA047411.
Figure S1. Heatmap representing the top 20 most abundant genera identified in the 18 16S pyrotag libraries constructed in this study. For genus-level classifications, only sequences matching with percent identity scores exceeding 97% were considered. Abbreviations: AB, Alberta; BC, British Columbia. EPS file, 1.3 MB.
Figure S2. Rank-abundance curves for the 10 most abundant genera identified in the 18 16S pyrotag libraries constructed in this study. Abbreviations: AB, Alberta; BC, British Columbia. EPS file, 0.7 MB.
Figure S3. Rarefaction curves calculated for the 18 16S amplicon libraries sequenced in this study. OTUs represented by single reads were not included in rarefaction analysis. EPS file, 1.3 MB.
Figure S4. Average amino acid identity (AAI) comparisons of Enterobacter, Rahnella, and Pseudomonas genomes. Complete reference genomes are colored black, while the composite genomes reconstructed in this study are colored according to their host insect. AAI values are based on reciprocal best-BLASTP calculations of the predicted proteins in the genomes. A neighbor-joining algorithm was used to construct the dendrogram from genome distances calculated from the AAI values. PDF file, 0.6 MB.
Figure S5. Full phylogeny as shown in Fig. 4A. Local support values were computed using the Shimodaira-Hasegawa test implemented by FastTree v. 2.1. EPS file, 1.2 MB.
Table S1. Details regarding the time and location of sample collection and species of insects analyzed in this study. DOCX file, 0.02 MB.
Table S2. Sequencing and assembly statistics for the 18 metagenomes of insect-fungal symbioses analyzed in this study. DOCX file, 0.01 MB.
Table S3. Sequencing statistics and Shannon diversity estimates for the 18 16S amplicon libraries analyzed in this study. DOCX file, 0.01 MB.
Table S4. Details regarding publicly available metagenomes used for the COG- and Pfam-based comparisons of coding potential presented in this study. DOCX file, 0.02 MB.
Table S5. Summary of the 37 composite genomes reconstructed in this study. DOCX file, 0.02 MB.