
Functional Comparison of Bacteria from the Human Gut and Closely Related Non-Gut Bacteria Reveals the Importance of Conjugation and a Paucity of Motility and Chemotaxis Functions in the Gut Environment.

Dragana Dobrijevic, Anne-Laure Abraham, Alexandre Jamet, Emmanuelle Maguin, Maarten van de Guchte.

Abstract

The human GI tract is a complex and still poorly understood environment, inhabited by one of the densest microbial communities on earth. The gut microbiota is shaped by millennia of evolution to co-exist with the host in commensal or symbiotic relationships. Members of the gut microbiota perform specific molecular functions important in the human gut environment. This can be illustrated by the presence of a highly expanded repertoire of proteins involved in carbohydrate metabolism, in phase with the large diversity of polysaccharides originating from the diet or from the host itself that can be encountered in this environment. In order to identify other bacterial functions that are important in the human gut environment, we investigated the distribution of functional groups of proteins in a group of human gut bacteria and their close non-gut relatives. Complementary to earlier global comparisons between different ecosystems, this approach should allow a closer focus on a group of functions directly related to the gut environment while avoiding functions related to taxonomically divergent microbiota composition, which may or may not be relevant for gut homeostasis. We identified several functions that are overrepresented in the human gut bacteria which had not been recognized in a global approach. The observed under-representation of certain other functions may be equally important for gut homeostasis. Together, these analyses provide us with new information about this environment so critical to our health and well-being.

Year:  2016        PMID: 27416027      PMCID: PMC4945068          DOI: 10.1371/journal.pone.0159030

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

The human gastrointestinal system, and especially the distal gut, is inhabited by one of the densest populations of microorganisms known. The importance of this community, the human gut microbiota, for human health and wellbeing is now well documented. While some of the cultivable bacteria living in the human gut have been studied for decades (for instance Escherichia coli), the development of new DNA sequencing technologies and the concept of metagenomics brought about a paradigm shift in the study of the human gut microbiota. The collective genetic information of the human gut microbiota, the human gut metagenome, is currently the focus of intense international sequencing and research efforts. A first catalogue of 3.3 million human gut microbial genes was established in 2010 [1], and more recently an extensive update of this catalogue was published, combining data from different sources and containing nearly 10 million genes [2]. Apart from metagenomics projects, data derive from sequencing efforts targeting the genomes of specific bacteria from the human gut. So far, sequences of 778 gut-associated bacterial genomes are available through the Human Microbiome Project [3], thus giving access to an independent complementary line of investigation of the human gut microbiota. More than 90% of gut bacteria are members of only two phyla, Bacteroidetes and Firmicutes, the relative proportions of which exhibit a continuous gradient within the human population [4, 5]. Within the borders of these phyla the microbiota composition is highly individual-specific, showing high variability at the species and strain levels. In contrast to this taxonomic diversity, however, functional profiles are far less variable across healthy individuals [1, 6, 7], confirming the existence of a well-balanced host-microbial symbiosis.
The results of functional analyses of the human gut metagenome showed that the proteome of the human gut microbiota is enriched in proteins involved in carbohydrate metabolism, energy metabolism and storage, generation of short-chain fatty acids, amino acid metabolism, biosynthesis of secondary metabolites and metabolism of cofactors and vitamins [1, 4, 6–8]. In particular, the human gut microbiota provides a broad and diverse array of carbohydrate-active enzymes, many of which are not present in the human glycobiome [9]. Similar observations were made during genome analysis of some of the human gut-associated bacteria, such as Bacteroides fragilis, Bacteroides thetaiotaomicron or Methanobrevibacter smithii [10, 11]. Additionally, many as yet uncharacterized or completely novel protein families were shown to be specific to the human gut, suggesting that many unknown and uncharacterized processes are yet to be discovered in this environment [12]. The aforementioned catalogue containing 3.3 million non-redundant microbial genes from the intestinal tract of 124 individuals [1] provided an opportunity to differentiate bacterial functions necessary for a bacterium to thrive in the gut environment, and therefore present in every gut bacterial species, from those involved in the homeostasis of the whole gut ecosystem, encoded across many bacterial species. Qin et al. referred to these functions as (bacterial) "minimal gut genome" and "minimal gut metagenome", respectively [1]. The minimal gut metagenome includes functions known to be important to the host-bacterial interaction, such as the capacities to metabolize complex polysaccharides and to synthesize short-chain fatty acids, indispensable amino acids and vitamins. It also includes a considerable fraction of functions (~45%) that were present in less than 10% of earlier sequenced, mainly non-gut, bacterial genomes. These "otherwise rare", gut-specific, functions mainly contained uncharacterized genes, underscoring our limited knowledge of gut functioning.
Here we present a different approach to the identification of functions that may inform us about the conditions of gut homeostasis, comparing the predicted proteomes of fully sequenced gut bacteria to those of closely related bacteria from other environments. The underlying hypothesis is that the comparison of metagenomic data from different ecosystems, for example gut and soil, may reveal functions that are characteristic of either ecosystem partly because of the different constraints of each system (i.e. adaptive functions) and partly because the two ecosystems are populated by different bacterial phyla with inherent differences between them (which may or may not be important in adaptation to the environment). Also, the global approach would not recognize the importance of certain functions in the gut environment if these functions also play a role in (other bacteria in) other environments. The comparison of closely related bacterial species from different environments allows focusing on differences that are directly related to the different environments.

Materials and Methods

Data preparation

Information on sequenced genomes was downloaded from the GOLD database (www.genomesonline.org) (version of 07 March 2013). Gene and protein sequences were downloaded from NCBI via iMOMi [13], except for those of Megamonas rupellensis DSM 19944 and Caloramator australicus KCTC 5601, which were downloaded from the Integrated Microbial Genomes data warehouse (www.img.jgi.doe.gov).

Phylogenetic analysis

The 16S rRNA gene sequences used in this study were retrieved in fasta format from GenBank (NCBI-GenBank Flat File Release 195.0). Phylogenetic analysis was conducted using the SeaView 4.4.1 platform [14]. MUSCLE [15] was used for multiple sequence alignment with default parameters, and blocks of evolutionarily conserved sites were selected by Gblocks [16]. The tree was computed using PhyML 3.0 [16] based on the Maximum Likelihood method and visualized using iTOL v3.0 [17]. The 16S rRNA gene sequence of Methanobrevibacter smithii ATCC 35061 was used as an outgroup for the analysis.

Protein functional annotation and localization prediction

Protein functional annotations were made by BLASTP search against the eggNOG v.3.0 bactNOG catalog [18] and the best hit (e-value <10e-5) was retained. To predict protein localization, protein sequences were analyzed using SurfG+ [19]. Functional comparisons were performed using R (https://www.r-project.org). Functional clustering of the bacterial species used in this study was performed with the heatmap function, taking into account bactNOGs present in ≥ 2 species and ≤ 45 species (i.e. bactNOGs present in only one species or in all 46 species studied, which are not informative for clustering, were ignored). The distribution of functions among gut and non-gut genomes was visualized with the function hist2d of the gplots library (version 2.17.0); bin contents were summarized with a log function.
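The filtering step used before clustering can be illustrated with a minimal Python sketch. This is not the authors' R code: the function name `informative_bactnogs`, the dictionary layout and the toy group IDs are assumptions made for this example. It keeps only orthologous groups found in at least two genomes but not in all of them.

```python
from collections import Counter


def informative_bactnogs(profiles, min_species=2, max_species=None):
    """Return group IDs present in between min_species and max_species genomes.

    profiles maps a species name to the set of group IDs found in its genome.
    If max_species is None it defaults to (number of genomes - 1), so groups
    present in every genome are dropped together with singletons.
    """
    if max_species is None:
        max_species = len(profiles) - 1
    counts = Counter(nog for nogs in profiles.values() for nog in nogs)
    return {nog for nog, n in counts.items() if min_species <= n <= max_species}


# Hypothetical toy data: three genomes and four made-up group IDs.
toy_profiles = {
    "species_A": {"NOG1", "NOG2", "NOG3"},
    "species_B": {"NOG1", "NOG2"},
    "species_C": {"NOG1", "NOG4"},
}
kept = informative_bactnogs(toy_profiles)
print(sorted(kept))  # NOG1 is universal; NOG3 and NOG4 are singletons
```

With 46 genomes the default bounds correspond to the ≥ 2 and ≤ 45 species criterion stated above.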

Results

Closely related gut and non-gut bacterial species constitute distinct functional groups

Bacterial diversity in the gut is largely restricted to two major phyla, Firmicutes and Bacteroidetes. In this study, we chose to focus on gut bacteria from the Firmicutes phylum and compare the functions encoded in their genomes with those encoded in closely related bacteria from other environments. Information on sequenced genomes was downloaded from the GOLD database (www.genomesonline.org) and filtered to select publicly available complete genomes from the Firmicutes phylum. Among these, "gut species" were identified by comparison with the gut metagenome data presented in [1], choosing species that in the latter study were marked "frequent" (i.e. highly abundant in the gut) and/or "common" (i.e. present in the gut of many individuals). "Non-gut" species, for which no association with the gut environment could be found in the literature, were chosen in taxonomic groups as close as possible to the gut species. To validate this choice, the gene repertoires of the selected species (one strain per species) were analyzed using BLASTn against the MetaHIT gut microbiota gene catalog [1] to identify genes present in this catalog (sequence identity ≥ 95% over ≥ 90% of the longest sequence length). A species was considered a “gut species” if at least 5% of its genes were present in the catalog, and a “non-gut” species if less than 0.5% of its genes were present in the catalog. This resulted in two genome sets with “gut” or “non-gut” attributions, respectively, that were coherent between data obtained from the literature and from metagenomic sequencing of fecal samples. The first set represents 23 bacterial species isolated from human stool samples. The second set represents 23 closely related species from other environments (Table 1, Fig 1). We refer to these sets of genomes as “gut” and “non-gut”, respectively. The majority of the selected species belong to the class Clostridia, order Clostridiales (19 gut species and 14 non-gut species).
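The validation thresholds described above can be written as a small decision rule. The following Python sketch is an illustration only: the function name `classify_species`, the "unassigned" label for species falling between the two thresholds, and the example gene counts are assumptions, not part of the study.

```python
def classify_species(genes_in_catalog, total_genes, gut_min=0.05, non_gut_max=0.005):
    """Classify a genome by the fraction of its genes found in the gut gene catalog.

    Returns "gut" if at least gut_min of the genes are in the catalog,
    "non-gut" if fewer than non_gut_max are, and "unassigned" otherwise
    (the label for the intermediate case is an assumption of this sketch).
    """
    if total_genes <= 0:
        raise ValueError("total_genes must be positive")
    fraction = genes_in_catalog / total_genes
    if fraction >= gut_min:
        return "gut"
    if fraction < non_gut_max:
        return "non-gut"
    return "unassigned"


# Hypothetical gene counts, chosen only to exercise each branch.
print(classify_species(300, 3000))  # 10% of genes in the catalog
print(classify_species(10, 4000))   # 0.25% of genes in the catalog
print(classify_species(60, 3000))   # 2% of genes in the catalog
```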
Table 1

Bacterial genomes used in this study.

Bacteria | Environment | Class | Order | Family
Bacillus subtilis 168 | non gut (soil) | Bacilli | Bacillales | Bacillaceae
Brevibacillus brevis NBRC 100599 | non gut (soil) | Bacilli | Bacillales | Paenibacillaceae
Solibacillus silvestris StLB046 | non gut (soil) | Bacilli | Bacillales | Planococcaceae
Lactobacillus buchneri ATCC 11577 | non gut (silage) | Bacilli | Lactobacillales | Lactobacillaceae
Lactobacillus delbrueckii bulgaricus ATCC 11842 | non gut (yoghurt) | Bacilli | Lactobacillales | Lactobacillaceae
Pediococcus pentosaceus ATCC 25745 | non gut (plants, cheese) | Bacilli | Lactobacillales | Lactobacillaceae
Oenococcus oeni ATCC BAA 1163 | non gut (wine) | Bacilli | Lactobacillales | Leuconostocaceae
Alkaliphilus metalliredigens QYMF | non gut (borax leachate ponds) | Clostridia | Clostridiales | Clostridiaceae
Alkaliphilus oremlandii OhILAs | non gut | Clostridia | Clostridiales | Clostridiaceae
Caloramator australicus KCTC 5601 | non gut | Clostridia | Clostridiales | Clostridiaceae
Clostridium acetobutylicum ATCC 824 | non gut | Clostridia | Clostridiales | Clostridiaceae
Clostridium beijerinckii NCIMB 8052 | non gut (soil) | Clostridia | Clostridiales | Clostridiaceae
Clostridium botulinum A str ATCC 3502 | non gut | Clostridia | Clostridiales | Clostridiaceae
Clostridium cellulovorans ATCC 35296 | non gut | Clostridia | Clostridiales | Clostridiaceae
Clostridium kluyveri NBRC 12016 | non gut | Clostridia | Clostridiales | Clostridiaceae
Sulfobacillus acidophilus DSM 10332 | non gut (geothermal environments, mines) | Clostridia | Clostridiales | Clostridiales Family XVII
Acetobacterium woodii DSM 1030 | non gut | Clostridia | Clostridiales | Eubacteriaceae
Eubacterium ventriosum ATCC 27560 | gut | Clostridia | Clostridiales | Eubacteriaceae
Blautia hansenii DSM 20583 | gut | Clostridia | Clostridiales | Lachnospiraceae
Butyrivibrio crossotus DSM 2876 | gut | Clostridia | Clostridiales | Lachnospiraceae
Clostridium bolteae ATCC BAA 613 | gut | Clostridia | Clostridiales | Lachnospiraceae
Clostridium lentocellum DSM 5427 | non gut | Clostridia | Clostridiales | Lachnospiraceae
Clostridium nexile DSM 1787 | gut | Clostridia | Clostridiales | Lachnospiraceae
Clostridium phytofermentans ISDg | non gut | Clostridia | Clostridiales | Lachnospiraceae
Coprococcus comes ATCC 27758 | gut | Clostridia | Clostridiales | Lachnospiraceae
Coprococcus eutactus ATCC 27759 | gut | Clostridia | Clostridiales | Lachnospiraceae
Dorea formicigenerans ATCC 27755 | gut | Clostridia | Clostridiales | Lachnospiraceae
Dorea longicatena DSM 13814 | gut | Clostridia | Clostridiales | Lachnospiraceae
Roseburia intestinalis M50/1 | gut | Clostridia | Clostridiales | Lachnospiraceae
Ruminococcus gnavus ATCC 29149 | gut | Clostridia | Clostridiales | Lachnospiraceae
Ruminococcus obeum A2-162 | gut | Clostridia | Clostridiales | Lachnospiraceae
Ruminococcus torques L2-14 | gut | Clostridia | Clostridiales | Lachnospiraceae
Clostridium bartlettii DSM 16795 | gut | Clostridia | Clostridiales | Peptostreptococcaceae
Acetivibrio cellulolyticus CD2 | non gut (sewage sludge) | Clostridia | Clostridiales | Ruminococcaceae
Anaerotruncus colihominis DSM 17241 | gut | Clostridia | Clostridiales | Ruminococcaceae
Clostridium leptum DSM 753 | gut | Clostridia | Clostridiales | Ruminococcaceae
Ethanoligenens harbinense DSM 18485 | non gut (anaerobic activated sludge of molasses wastewater) | Clostridia | Clostridiales | Ruminococcaceae
Eubacterium siraeum 70/3 | gut | Clostridia | Clostridiales | Ruminococcaceae
Faecalibacterium prausnitzii A2-165 | gut | Clostridia | Clostridiales | Ruminococcaceae
Subdoligranulum variabile DSM 15176 | gut | Clostridia | Clostridiales | Ruminococcaceae
Caldicellulosiruptor lactoaceticus DSM 9545 | non gut | Clostridia | Thermoanaerobacterales | Thermoanaerobacterales Family III.
Catenibacterium mitsuokai DSM 15897 | gut | Erysipelotrichia | Erysipelotrichales | Erysipelotrichaceae
Holdemania filiformis DSM 12042 | gut | Erysipelotrichia | Erysipelotrichales | Erysipelotrichaceae
Megamonas rupellensis DSM 19944 | gut | Negativicutes | Selenomonadales | Veillonellaceae
Mitsuokella multacida DSM 20544 | gut | Negativicutes | Selenomonadales | Veillonellaceae
Thermosinus carboxydivorans Nor1 | non gut (hot spring) | Negativicutes | Selenomonadales | Veillonellaceae

All bacteria belong to the Firmicutes phylum, the sequencing status of their respective genomes is "complete" and sequences are publicly available (GOLD database: http://www.genomesonline.org/cgi-bin/GOLD/index.cgi).

Fig 1

Phylogeny of bacterial species used in this study.

16S rRNA-based tree constructed by the Maximum Likelihood method. Green, gut species; Violet, non-gut species. Clusters A and B form part of one functional cluster (Fig 2).

Fig 2

Functional clustering of bacterial species used in this study.

Clustering of 46 species on the basis of functional profiles (presence or absence of bactNOGs). Green, gut species; Violet, non-gut species. A and B represent phylogenetically distinct clusters (Fig 1).

To evaluate the functional composition of the species studied, we attributed the predicted protein complement of each genome to orthologous groups (bactNOGs) using the eggNOG v.3.0 database [18]. Orthologous groups of proteins have proven useful for functional analyses, as orthologs tend to have equivalent functions [20]. The fraction of the proteins that could be attributed to bactNOGs varied from 48 to 81% depending on the species (Table 2), and was generally higher for non-gut species than for gut species (71% vs 60%, respectively, on average). For each genome this resulted in a functional profile based on the presence of bactNOGs, which was subsequently used as a basis for functional genome clustering (Fig 2). With only a few exceptions, the gut bacterial species studied appear to form a distinct cluster. Notably, the phylogenetically distinct gut bacteria clusters A and B (Fig 1) appear to be united in one functional cluster (Fig 2).
Table 2

Protein functional annotation and localization prediction per genome.

Bacteria | PSE | SEC | CYT | MEMB | Total | eggNOG | eggNOG %
Anaerotruncus colihominis DSM 17241 | 304 | 111 | 3496 | 516 | 4427 | 2214 | 50.0
Blautia hansenii DSM 20583 | 292 | 80 | 2326 | 473 | 3171 | 2029 | 64.0
Butyrivibrio crossotus DSM 2876 | 291 | 92 | 1803 | 343 | 2529 | 1669 | 66.0
Catenibacterium mitsuokai DSM 15897 | 211 | 69 | 2311 | 386 | 2977 | 1799 | 60.4
Clostridium bartlettii DSM 16795 | 245 | 88 | 2025 | 429 | 2787 | 2001 | 71.8
Clostridium bolteae ATCC BAA 613 | 644 | 188 | 5386 | 1066 | 7284 | 3473 | 47.7
Clostridium leptum DSM 753 | 293 | 91 | 3055 | 484 | 3923 | 1930 | 49.2
Clostridium nexile DSM 1787 | 423 | 85 | 3152 | 579 | 4239 | 2157 | 50.9
Coprococcus comes ATCC 27758 | 303 | 69 | 2954 | 587 | 3913 | 2055 | 52.5
Coprococcus eutactus ATCC 27759 | 320 | 84 | 2166 | 412 | 2982 | 1859 | 62.3
Dorea formicigenerans ATCC 27755 | 301 | 65 | 2438 | 473 | 3277 | 2072 | 63.2
Dorea longicatena DSM 13814 | 248 | 53 | 2234 | 435 | 2970 | 1930 | 65.0
Eubacterium siraeum 70/3 | 252 | 85 | 1702 | 308 | 2347 | 1486 | 63.3
Eubacterium ventriosum ATCC 27560 | 255 | 168 | 1953 | 426 | 2802 | 1721 | 61.4
Faecalibacterium prausnitzii A2-165 | 250 | 118 | 2644 | 463 | 3475 | 1849 | 53.2
Holdemania filiformis DSM 12042 | 416 | 110 | 3075 | 622 | 4223 | 2167 | 51.3
Megamonas rupellensis DSM 19944 | 137 | 130 | 1597 | 361 | 2225 | 1729 | 77.7
Mitsuokella multacida DSM 20544 | 162 | 145 | 1885 | 366 | 2558 | 1768 | 69.1
Roseburia intestinalis M50/1 | 327 | 114 | 2547 | 490 | 3478 | 2149 | 61.8
Ruminococcus gnavus ATCC 29149 | 339 | 73 | 2958 | 543 | 3913 | 2266 | 57.9
Ruminococcus obeum A2-162 | 279 | 177 | 2255 | 444 | 3155 | 2015 | 63.9
Ruminococcus torques L2-14 | 248 | 55 | 2108 | 387 | 2798 | 1902 | 68.0
Subdoligranulum variabile DSM 15176 | 319 | 90 | 2487 | 485 | 3381 | 2013 | 59.5
gut mean | 298 | 102 | 2546 | 482 | 3428 | 2011 | 60
gut SEM | 21 | 8 | 167 | 31 | 218 | 78 | 2
Acetivibrio cellulolyticus CD2 | 591 | 435 | 3313 | 609 | 4948 | 2882 | 58.2
Acetobacterium woodii DSM 1030 | 329 | 84 | 2611 | 449 | 3474 | 2424 | 69.8
Alkaliphilus metalliredigens QYMF | 506 | 113 | 3284 | 722 | 4625 | 3154 | 68.2
Alkaliphilus oremlandii OhILAs | 327 | 102 | 1972 | 435 | 2836 | 2250 | 79.3
Bacillus subtilis subtilis 168 | 324 | 210 | 2867 | 775 | 4177 | 3035 | 72.7
Brevibacillus brevis NBRC 100599 | 585 | 333 | 4139 | 890 | 5947 | 3932 | 66.1
Caldicellulosiruptor lactoaceticus DSM 9545 | 194 | 144 | 1692 | 288 | 2319 | 1807 | 77.9
Caloramator australicus KCTC 5601 | 205 | 87 | 2051 | 382 | 2725 | 1991 | 73.1
Clostridium acetobutylicum ATCC 824 | 399 | 194 | 2702 | 552 | 3847 | 2790 | 72.5
Clostridium beijerinckii NCIMB 8052 | 496 | 228 | 3594 | 702 | 5021 | 3416 | 68.0
Clostridium botulinum A str ATCC 3502 | 319 | 128 | 2561 | 582 | 3591 | 2622 | 73.0
Clostridium cellulovorans ATCC 35296 | 459 | 254 | 2995 | 546 | 4255 | 2882 | 67.7
Clostridium kluyveri NBRC 12016 | 358 | 94 | 2572 | 499 | 3523 | 2584 | 73.3
Clostridium lentocellum DSM 5427 | 473 | 204 | 2887 | 618 | 4183 | 2792 | 66.7
Clostridium phytofermentans ISDg | 538 | 141 | 2628 | 595 | 3903 | 2833 | 72.6
Ethanoligenens harbinense DSM 18485 | 243 | 94 | 1983 | 381 | 2702 | 1963 | 72.6
Lactobacillus buchneri ATCC 11577 | 183 | 172 | 2118 | 529 | 3002 | 1886 | 62.8
Lactobacillus delbrueckii bulgaricus ATCC 11842 | 116 | 70 | 1099 | 244 | 1530 | 1125 | 73.5
Oenococcus oeni ATCC BAA 1163 | 83 | 64 | 986 | 265 | 1675 | 1130 | 67.5
Pediococcus pentosaceus ATCC 25745 | 134 | 42 | 1282 | 297 | 1755 | 1418 | 80.8
Solibacillus silvestris StLB046 | 402 | 132 | 2621 | 668 | 3823 | 2705 | 70.8
Sulfobacillus acidophilus DSM 10332 | 258 | 108 | 2522 | 583 | 3471 | 2172 | 62.6
Thermosinus carboxydivorans Nor1 | 194 | 144 | 2006 | 406 | 2750 | 2087 | 75.9
non-gut mean | 335 | 156 | 2456 | 522 | 3482 | 2430 | 71
non-gut SEM | 32 | 19 | 162 | 36 | 234 | 146 | 1

PSE, SEC, CYT, MEMB, predicted numbers of potentially surface exposed, secreted, cytoplasmic and membrane proteins, respectively, per genome. Total, total number of proteins encoded per genome. eggNOG and eggNOG %, number and % of proteins, respectively, assigned to bactNOGs in the eggNOG v.3.0 database. Gut bacteria are grouped in the upper part of the table, non-gut bacteria in the lower part.
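As a worked check of how the eggNOG % column relates to the eggNOG and Total columns, the following Python sketch recomputes the percentage for the first three gut genomes of Table 2. The helper names are hypothetical, and reading "SEM" as the standard error of the mean (sample standard deviation divided by the square root of n) is an assumption, since the table footnote does not define it.

```python
import math
import statistics


def percent_assigned(n_assigned, n_total):
    """Percentage of a genome's proteins assigned to an orthologous group."""
    return 100.0 * n_assigned / n_total


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


# (eggNOG, Total) pairs copied from the first three gut rows of Table 2.
rows = {
    "Anaerotruncus colihominis DSM 17241": (2214, 4427),
    "Blautia hansenii DSM 20583": (2029, 3171),
    "Butyrivibrio crossotus DSM 2876": (1669, 2529),
}
percentages = {name: percent_assigned(a, t) for name, (a, t) in rows.items()}
for name, pct in percentages.items():
    print(f"{name}: {pct:.1f}%")

# SEM over these three rows only, for illustration (not the 23-genome value).
print(f"SEM: {standard_error(list(percentages.values())):.2f}")
```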

Functional adaptation to the human gut environment through gut-specific functions

In order to examine potential gut adaptation among Firmicutes, we compared the distribution of bactNOGs among gut and non-gut bacteria (Fig 3). The majority of the 20,426 detected bactNOGs were present in only one of the 46 bacterial species studied (7924 bactNOGs, not shown), or shared between two (3527 bactNOGs) or three species (2164 bactNOGs) (Fig 3, bottom left). For further analysis, a bactNOG was regarded as “overrepresented” in genomes of gut Firmicutes if the number of gut genomes in which it is represented exceeds the number of non-gut genomes in which it is represented by at least 12. This means that the representation of functions that are regarded as overrepresented varies from "present in at least 12 of the 23 gut genomes and none of the 23 non-gut genomes" to "present in all 23 gut genomes and 11 or fewer of the 23 non-gut genomes" (Fig 3).
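The overrepresentation criterion can be expressed as a short rule. The Python sketch below is illustrative: the function name `representation` is hypothetical, and the mirrored definition of "underrepresented" (non-gut count exceeding gut count by at least 12) is an assumption, since only the overrepresentation rule is spelled out in the text above.

```python
def representation(n_gut, n_non_gut, min_difference=12):
    """Classify an orthologous group from its genome counts in the two sets.

    n_gut and n_non_gut are the numbers of gut and non-gut genomes (0 to 23
    each in this study) in which the group is represented.
    """
    difference = n_gut - n_non_gut
    if difference >= min_difference:
        return "overrepresented"
    if -difference >= min_difference:
        # Assumed symmetric rule; not stated explicitly in the text above.
        return "underrepresented"
    return "intermediate"


# The two extremes quoted in the text, plus one case just short of the cutoff.
print(representation(12, 0))   # at least 12 gut genomes, no non-gut genomes
print(representation(23, 11))  # all gut genomes, 11 non-gut genomes
print(representation(23, 12))  # difference of 11 only
```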
Fig 3

Distribution of functions among gut and non-gut genomes.

Squares indicate bactNOGs as a function of the number of gut genomes (horizontal axis) and the number of non-gut genomes (vertical axis) in which they are encoded. The colours of the squares indicate the numbers of different bactNOGs at each position. BactNOGs encoded in only one of the 46 genomes are not indicated. Diagonal lines separate bactNOGs that are overrepresented in the gut genomes (bottom right), bactNOGs that are underrepresented in the gut genomes (top left), and bactNOGs with an intermediate position (see text for details).

The majority of the 153 overrepresented bactNOGs have no known function, or a function involved in energy production, DNA metabolism, transcription or translation (Fig 4, S1 Table). Individual bactNOG descriptions reveal a number of functions that have earlier been associated with the gut environment, such as the degradation of conjugated bile acids (bactNOG15678, Choloylglycine hydrolase), cobalamin (vitamin B12) biosynthesis (bactNOG85989, Cobyrinic acid a, c-diamide synthase) [21] or iron acquisition (bactNOG99581, bactNOG38121, Ferric uptake regulator protein) [22, 23], emphasizing the importance of these processes in the gut environment.
Fig 4

Functional composition of gut species bactNOG datasets.

BactNOGs represented in one or more of the 23 gut bacterial species in this study were attributed to one of three groups: overrepresented or underrepresented compared to non-gut species (see text for details), or neither over- nor underrepresented (indicated by *). Within each of these groups, the number of different bactNOGs attributed to a functional category is indicated as a percentage of the total number of bactNOGs in the group. Functional category descriptions are short forms of the full descriptions presented in S1 and S2 Tables.

No fewer than four of the overrepresented bactNOGs point to an important role of conjugation in the gut environment (bactNOGs 07070, 26309, 08200, 44258), a function that had not been recognized as such in the earlier global approach to the identification of bacterial functions that are important in the gut environment described in [1]. Other functions that had not been recognized in the global approach are "sulfuric ester hydrolase" (sulfatase, bactNOG20561), which may play a role in the foraging of abundant sulfated glycans in the intestine, such as mucins or glycosaminoglycans [24], and "Sortase B" (bactNOG70972), which plays a role in the anchoring of a specific category of bacterial surface proteins [25].
Surprisingly, only 78 bactNOGs were shared among all the genomes in the two genome sets (Fig 3). This is far fewer than expected when compared with the approximately 250 bacterial protein coding genes that are considered essential in the Gram-positive model bacterium B. subtilis [26, 27]. Also, certain bactNOGs that represent essential proteins (e.g. certain ribosomal proteins) appeared to be present in only a fraction of the genomes studied (results not shown). A closer look at the catalog of bactNOGs and their functional descriptions revealed that these observations could (in part) be explained by the fact that many functions are represented by several bactNOGs, i.e. several bactNOGs carry identical function descriptions. This is due to the way the non-supervised orthologous groups (NOGs) were constructed [18]: the authors opted for a high resolution, i.e. small, precisely defined NOGs containing very similar proteins, to improve accuracy, with the consequence that less similar proteins with the same function are attributed to different bactNOGs. We therefore decided to perform a second comparison of our sets of gut and non-gut genomes at the level of "groups of bactNOGs", where we considered bactNOGs with identical function descriptions as one group (in our dataset, a "group of bactNOGs" may contain from 1 to 606 bactNOGs, the latter representing "transcriptional regulator proteins"). By doing so, all the information on bactNOGs that had no description, accounting for 31.2% of the bactNOGs in our dataset, was lost. Other methods of protein clustering and protein family construction that are outside the scope of this study may be instructive in revealing the potential importance of these unclassified bactNOGs beyond the level of the single bactNOG comparisons described above [12, 28, 29].
In total, 166 functional “groups of bactNOGs” were shared among all the genomes in the two sets (data not shown). As expected, among these shared groups we find functions involved in DNA and RNA metabolism, transcription, translation, cell envelope, shape and division, as well as energy conversion and metabolism of nucleotides, coenzymes and carbohydrates. Forty-four groups of bactNOGs were considered as overrepresented in the genomes of the gut Firmicutes (Table 3). These include several functions that had already been identified in the single bactNOG comparisons described above, like "sulfuric ester hydrolase" (sulfatase, bactNOG20561), "sortase B" (bactNOG70972), "proteins involved in unidirectional conjugation" (6 bactNOGs, of which 3 had been identified in individual comparisons) and "proteins involved in conjugation with cellular fusion" (3 bactNOGs, of which 1 had been identified in individual comparisons). The latter two functions are represented in 87 and 74% of the 23 gut species, respectively, as opposed to 35 and 22% of the non-gut species (Table 3).
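The grouping of bactNOGs by identical function description, and the per-set abundance percentages reported in Table 3, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names, the data layout and the toy IDs are hypothetical, and the study itself used R.

```python
from collections import defaultdict


def group_presence(nog_descriptions, profiles):
    """Map each function description to the species carrying any matching group.

    nog_descriptions maps a group ID to its description (None or "" if absent);
    profiles maps a species name to the set of group IDs in its genome.
    Groups without a description are dropped, as described in the text.
    """
    groups = defaultdict(set)
    for species, nogs in profiles.items():
        for nog in nogs:
            description = nog_descriptions.get(nog)
            if description:
                groups[description].add(species)
    return dict(groups)


def abundance_percent(species_with_function, species_set):
    """Percentage of the genomes in species_set that carry the function."""
    return round(100 * len(species_with_function & species_set) / len(species_set), 1)


# Hypothetical toy data: two group IDs share one description, one has none.
descriptions = {"NOG_a": "conjugation", "NOG_b": "conjugation", "NOG_c": None}
profiles = {
    "gut_1": {"NOG_a"}, "gut_2": {"NOG_b"}, "gut_3": {"NOG_c"},
    "nongut_1": {"NOG_a"}, "nongut_2": set(),
}
gut = {"gut_1", "gut_2", "gut_3"}
non_gut = {"nongut_1", "nongut_2"}

groups = group_presence(descriptions, profiles)
print(abundance_percent(groups["conjugation"], gut))
print(abundance_percent(groups["conjugation"], non_gut))
```

With 23 genomes per set, a function present in 20 gut genomes and 8 non-gut genomes gives 87.0% and 34.8%, the values listed for unidirectional conjugation.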
Table 3

Functional groups overrepresented in gut bacteria.

eggNOG | Function | Functional category | Abundance, % (eggNOG v.3.0) | Abundance, % (GUT) | Abundance, % (NONGUT)
bactNOG57079 | 5-Aminoimidazole-4-Carboxamide ribonucleotide transformylase | [F] Nucleotide transport and metabolism | 1.4 | 100.0 | 26.1
bactNOG33416 | Transcriptional regulator protein-like protein | [K] Transcription | 0.4 | 73.9 | 0.0
bactNOG14419, bactNOG78827 | Site-Specific recombinase | NA | NA | 87.0 | 17.4
bactNOG20561 | Sulfuric ester hydrolase | [M] Cell wall/membrane/envelope biogenesis | 11.1 | 82.6 | 13.0
bactNOG61174 | Replication initiator protein | [S] Function unknown | 0.2 | 69.6 | 0.0
bactNOG43319, bactNOG30533, bactNOG35352 | Adenylate cyclase | NA | NA | 78.3 | 13.0
bactNOG14637 | Adenylosuccinate protein | [F] Nucleotide transport and metabolism | 2.6 | 69.6 | 4.3
bactNOG13499 | Selenate reductase subunit YgfM; with YgfK and YgfN forms a selenate reductase, which seems to catalyze the reduction of selenate to selenite; YgfM contains a FAD domain-containing protein | [C] Energy production and conversion | 3.4 | 69.6 | 4.3
bactNOG78875 | GB:X04470, GB:X04503, GB:X04502, SP:P03973, PID:28639, PID:338233, PID:36491, and PID:758101; identified by sequence similarity protein | [K] Transcription | 0.3 | 65.2 | 0.0
bactNOG03861 | Elongation factor G | [J] Translation, ribosomal structure and biogenesis | 18.7 | 82.6 | 17.4
bactNOG10082 | SAM dependent methyltransferase | [R] General function prediction only | 7.2 | 82.6 | 17.4
bactNOG29973 | Deoxycytidylate deaminase | [F] Nucleotide transport and metabolism | 4.6 | 73.9 | 8.7
bactNOG20957, bactNOG28451, bactNOG18161, bactNOG37597 | Specifically catalyzes the dephosphorylation of 2-phosphoglycolate. Is involved in the dissimilation of the intracellular 2-phosphoglycolate formed during the DNA repair of 3'-phosphoglycolate ends, a major class of DNA lesions induced by oxidative stress (By similarity) protein | NA | NA | 95.7 | 34.8
bactNOG04076 | Zinc phosphodiesterase, which displays some tRNA 3'-processing endonuclease activity. Involved in tRNA maturation, by removing a 3'-trailer from precursor tRNA (By similarity) | [R] General function prediction only | 5.1 | 65.2 | 4.3
bactNOG05123, bactNOG07417 | 2-Isopropylmalate synthase | NA | NA | 91.3 | 30.4
bactNOG51505, bactNOG44758 | Addiction module toxin, RelE/StbE family protein | NA | NA | 73.9 | 13.0
bactNOG45170 | Cdp-Diacylglycerol--Glycerol-3-Phosphate 3 protein | [I] Lipid transport and metabolism | 5.8 | 73.9 | 13.0
bactNOG30240 | Glyoxalase/Bleomycin resistance protein/Dioxygenase | [E] Amino acid transport and metabolism | 2.9 | 69.6 | 8.7
bactNOG00016 | Phosphoserine aminotransferase; catalyzes the formation of 3-phosphonooxypyruvate and glutamate from O-phospho-L-serine and 2-oxoglutarate; required both in major phosphorylated pathway of serine biosynthesis and in the biosynthesis of pyridoxine | [E] Amino acid transport and metabolism | 41.7 | 95.7 | 39.1
bactNOG40424 | Glycoside hydrolase, family 25 | [M] Cell wall/membrane/envelope biogenesis | 1.5 | 60.9 | 4.3
bactNOG02826 | 4-Alpha-Glucanotransferase | [G] Carbohydrate transport and metabolism | 44.2 | 91.3 | 34.8
bactNOG03506 | Aminopeptidase 2; catalyzes the removal of amino acids from the N termini of peptides | [E] Amino acid transport and metabolism | 10.9 | 87.0 | 30.4
bactNOG15648, bactNOG74792 | Aconitate hydratase | [C] Energy production and conversion | NA | 78.3 | 21.7
bactNOG65104 | Phosphoribosylpyrophosphate synthetase; Catalyzes the formation of PRPP from ATP and ribose 5-phosphate | [F] Nucleotide transport and metabolism | 1.8 | 78.3 | 21.7
bactNOG20523 | Sugar phosphatase; YidA; catalyzes the dephosphorylation of erythrose 4-phosphate (preferred substrate), mannose 1-phosphate and p-nitrophenyl phosphate; hydrolyzes the alpha-D-glucose-1-phosphate but not the beta form; member of the haloacid dehalogenase-like hydrolases superfamily and Cof family of proteins | [R] General function prediction only | 10.2 | 78.3 | 21.7
bactNOG30560, bactNOG37582 | Removes the formyl group from the N-terminal Met of newly synthesized proteins. Requires at least a dipeptide for an efficient rate of reaction. N-terminal L-methionine is a prerequisite for activity but the enzyme has broad specificity at other positions (By similarity) | NA | NA | 69.6 | 13.0
bactNOG83597 | Subunit C | [C] Energy production and conversion | 0.4 | 65.2 | 8.7
bactNOG31052, bactNOG35249, bactNOG35454, bactNOG05302 | Had-Superfamily hydrolase, subfamily IA, variant 3 | NA | NA | 87.0 | 34.8
bactNOG07070, bactNOG11507, bactNOG08200, bactNOG10025, bactNOG13178, bactNOG26309 | Protein involved in unidirectional conjugation | NA | NA | 87.0 | 34.8
bactNOG22665, bactNOG08175 | Pyridoxal kinase | [H] Coenzyme transport and metabolism | NA | 87.0 | 34.8
bactNOG74867, bactNOG16222 | Sugar Hydrogen symporter protein | NA | NA | 87.0 | 34.8
bactNOG82609 | Oxaloacetate decarboxylase | [C] Energy production and conversion | 21.0 | 78.3 | 26.1
bactNOG62080 | Ribosomal protein S3 | [J] Translation, ribosomal structure and biogenesis | 0.4 | 60.9 | 8.7
bactNOG17864, bactNOG34439 | RNA methyltransferase | [J] Translation, ribosomal structure and biogenesis | NA | 60.9 | 8.7
bactNOG14801 | Hydro-Lyase, Fe-S type, tartrate/fumarate subfamily, beta | [C] Energy production and conversion | 8.6 | 91.3 | 39.1
bactNOG02215 | Potassium transporter peripheral membrane component; involved in potassium uptake; found to be peripherally associated with the inner membrane in Escherichia coli; contains an NAD-binding domain protein | [P] Inorganic ion transport and metabolism | 32.2 | 91.3 | 39.1
bactNOG45092 | 50S ribosomal protein L30; L30 binds domain II of the 23S rRNA and the 5S rRNA | [J] Translation, ribosomal structure and biogenesis | 24.4 | 56.5 | 4.3
bactNOG70972 | Sortase B | [S] Function unknown | 1.7 | 56.5 | 4.3
bactNOG53104 | Histidine Phosphotransfer domain-containing protein | [T] Signal transduction mechanisms | 0.3 | 52.2 | 0.0
bactNOG99320 | Ribosomal protein L34 | [J] Translation, ribosomal structure and biogenesis | 0.3 | 52.2 | 0.0
bactNOG01580 | Decarboxylase, beta | [C] Energy production and conversion | 15.4 | 82.6 | 30.4
bactNOG24561, bactNOG09355 | L-Fucose isomerase | [G] Carbohydrate transport and metabolism | NA | 73.9 | 21.7
bactNOG69266, bactNOG44258Protein involved in conjugation with cellular fusionNANA73.921.7
bactNOG30123, bactNOG62262, bactNOG63699, bactNOG39777Transcriptional regulator, DeoR family proteinNANA73.921.7

bactNOGs with identical functional descriptions were grouped in our dataset. Groups of bactNOGs are presented for which the number of gut genomes where bactNOGs are represented exceeds the number of non-gut genomes where bactNOGs are represented by at least 12. Abundance: % of species in which the bactNOG is represented in the eggNOG v.3.0 database (shown only for functional groups containing one bactNOG), in 23 “gut” genomes (GUT), or in 23 “non-gut” genomes (NONGUT), respectively. NA, not assigned.
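The selection rule described in this legend can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the input layout (a mapping from bactNOG identifier to a description plus the sets of gut and non-gut genomes in which it occurs), the function name `overrepresented_groups`, the genome labels, and the control entry `bactNOG_X` are all hypothetical. The sketch also assumes that a group counts as represented in a genome when at least one of its member bactNOGs is present there, which the legend does not state explicitly.

```python
from collections import defaultdict

N_GENOMES = 23       # genomes per set, as stated in the legend
MIN_DIFFERENCE = 12  # minimum excess of gut over non-gut genomes


def overrepresented_groups(bactnogs, min_difference=MIN_DIFFERENCE):
    """Group bactNOGs by description and keep the gut-enriched groups.

    `bactnogs` maps a bactNOG id to (description, gut_genomes, nongut_genomes),
    where the last two items are sets of genome labels (hypothetical layout).
    """
    gut = defaultdict(set)
    nongut = defaultdict(set)
    members = defaultdict(list)
    for nog_id, (description, gut_genomes, nongut_genomes) in bactnogs.items():
        members[description].append(nog_id)
        # Assumption: a group is "represented" in a genome when at least
        # one of its member bactNOGs is found in that genome.
        gut[description] |= set(gut_genomes)
        nongut[description] |= set(nongut_genomes)

    selected = []
    for description, ids in members.items():
        n_gut, n_nongut = len(gut[description]), len(nongut[description])
        if n_gut - n_nongut >= min_difference:
            selected.append({
                "bactNOGs": sorted(ids),
                "function": description,
                "gut_pct": round(100 * n_gut / N_GENOMES, 1),
                "nongut_pct": round(100 * n_nongut / N_GENOMES, 1),
            })
    return selected


# Toy input. Genome labels are placeholders, and the split of genomes
# between the two bactNOGs is invented; only the totals (17 gut and
# 5 non-gut genomes) are chosen to match the percentages in the table.
toy = {
    "bactNOG24561": ("L-Fucose isomerase",
                     {f"g{i}" for i in range(1, 11)},
                     {f"n{i}" for i in range(1, 4)}),
    "bactNOG09355": ("L-Fucose isomerase",
                     {f"g{i}" for i in range(8, 18)},
                     {f"n{i}" for i in range(3, 6)}),
    "bactNOG_X": ("Hypothetical control", {"g1", "g2"}, {"n1", "n2"}),
}
result = overrepresented_groups(toy)
```

With these toy sets the L-fucose isomerase group covers 17 gut and 5 non-gut genomes, a difference of exactly 12, so it is retained; the control entry is dropped. Swapping the sign of the difference gives the corresponding rule for underrepresented groups.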

A new function that only becomes apparent with the grouped bactNOGs approach, and which again was not detected by the global approach in [1], is L-fucose isomerase (bactNOGs 24561 and 09355), the enzyme catalyzing the first step of fucose metabolism, found in 74% of the gut bacteria against 22% of the non-gut bacteria. L-Fucose is highly abundant in the intestine [30], where it is present in dietary polysaccharides such as pectin and in the mucin glycoproteins overlying the intestinal epithelium. It can be cleaved from host glycans by multiple fucosidases produced by gut commensals such as Bacteroides thetaiotaomicron, resulting in a high availability in the intestinal lumen, where it can be used as a carbon source by B. thetaiotaomicron itself [31] or by other resident bacteria such as Escherichia coli [32] or Roseburia intestinalis [33].

A certain number of functions that were detected as overrepresented in the single-bactNOG comparisons do not appear as such in the analysis by groups of bactNOGs. Some group functions even appear both in the list of overrepresented single bactNOGs (S1 Table) and in the list of underrepresented single bactNOGs (see below, S2 Table). This is, for example, the case for the function "Ferric uptake regulator protein" (represented by bactNOGs 38121 and 99581 in the list of overrepresented bactNOGs and by bactNOG31290 in the list of underrepresented bactNOGs). This type of result suggests that closely related gut and non-gut bacteria use distinct proteins for the same function or, alternatively, that the proteins represented by different bactNOGs, in spite of identical descriptions, exert different functions, some of which are more important in the gut environment.

Gut adaptation through functional paucity

The direct comparison of closely related bacteria from the gut and other environments also provides a unique opportunity to focus on underrepresented functions, which may be equally informative about the functioning of the gut ecosystem. In contrast to the overrepresented functions, underrepresented functions included far fewer bactNOGs with unknown functions (Fig 4, S2 Table). Remarkably, one third of the underrepresented bactNOGs are involved in motility and signal transduction, including several chemotaxis-related functions.

Another remarkable underrepresented function is the Pur operon repressor protein (bactNOG16918), identified in only one of the 23 gut bacteria as opposed to 21 of the 23 non-gut bacteria. This repressor controls the transcription of the pur operon for purine biosynthetic genes, and its absence would be expected to result in the constitutive transcription of the operon. The ability to synthesize nucleotides was shown to be a prerequisite for successful colonization of the mouse intestine by E. coli [34]. The constitutive expression of genes involved in purine biosynthesis may thus give a competitive advantage during unsteady nucleotide supply in the human gut environment.

The analysis of underrepresented functions by groups of bactNOGs led to the same conclusion as the single-bactNOG comparisons: of the 67 functional groups that are underrepresented in the gut bacteria (Table 4), an astonishing one third appear to be involved in motility and chemotaxis. These functions are, on average, represented in only 16% of the 23 gut bacteria studied, as opposed to 77% of the non-gut bacteria.
Table 4

Functional groups underrepresented in gut bacteria.

eggNOG | Function | Functional category | Abundance in eggNOG v.3.0, % | Abundance in GUT, % | Abundance in NONGUT, %
bactNOG16918 | Pur operon repressor protein | [K] Transcription | 15.2 | 4.3 | 91.3
bactNOG00229 | Phosphoribosylaminoimidazolecarboxamide formyltransferase/IMP cyclohydrolase | [F] Nucleotide transport and metabolism | 75.2 | 13.0 | 95.7
bactNOG23778, bactNOG36792, bactNOG29992, bactNOG28755, bactNOG23746, bactNOG26169, bactNOG04324, bactNOG40456, bactNOG12593 | Isochorismatase | NA | NA | 8.7 | 82.6
bactNOG01751 | Flagellar biosynthesis protein FlhB; membrane protein responsible for substrate specificity switching from rod/hook-type export to filament-type export | [N] Cell motility | 37.9 | 13.0 | 82.6
bactNOG00287 | Involved in the modulation of the chemotaxis system; catalyzes the demethylation of specific methylglutamate residues introduced into the chemoreceptors (methyl-accepting chemotaxis proteins) by cheR | [T] Signal transduction mechanisms | 39.0 | 13.0 | 82.6
bactNOG18316, bactNOG06650, bactNOG100225, bactNOG98591, bactNOG05199, bactNOG21958 | Methionine sulfoxide reductase | NA | NA | 13.0 | 82.6
bactNOG12472, bactNOG05849, bactNOG01096, bactNOG04323 | Alkaline phosphatase | NA | NA | 17.4 | 87.0
bactNOG30877, bactNOG44267 | Flagellar hook capping protein | [N] Cell motility | NA | 8.7 | 73.9
bactNOG00716, bactNOG85469, bactNOG30379 | Flagellin protein | NA | NA | 17.4 | 82.6
bactNOG43795 | Protein FliQ | [N] Cell motility | 30.4 | 17.4 | 82.6
bactNOG20038 | Cell envelope-related transcriptional attenuator; TIGRFAM: cell envelope-related function transcriptional attenuator, LytR/CpsA family; PFAM: cell envelope-related transcriptional attenuator protein | [K] Transcription | 12.7 | 21.7 | 87.0
bactNOG28010, bactNOG02636 | Ribose-Phosphate pyrophosphokinase | NA | NA | 30.4 | 95.7
bactNOG52752, bactNOG42812, bactNOG27823, bactNOG44224, bactNOG26630 | Chec, inhibitor of MCP methylation protein | NA | NA | 21.7 | 82.6
bactNOG30371, bactNOG95849, bactNOG30633 | Flagellar basal body rod protein | [N] Cell motility | NA | 21.7 | 82.6
bactNOG04544 | Flagellar basal body rod protein FlgG | [N] Cell motility | 41.4 | 21.7 | 82.6
bactNOG02669 | Flagellar biosynthesis protein FlhA | [N] Cell motility | 43.8 | 21.7 | 82.6
bactNOG02127 | Flagellar biosynthesis protein FliP; FliP, with proteins FliQ and FliR, forms the core of the central channel in the flagella export apparatus | [N] Cell motility | 42.3 | 21.7 | 82.6
bactNOG97172, bactNOG100574, bactNOG19021, bactNOG22405, bactNOG34382, bactNOG38432, bactNOG98529, bactNOG75984, bactNOG10394, bactNOG03185 | Flagellar hook-associated protein | NA | NA | 21.7 | 82.6
bactNOG50628, bactNOG37469, bactNOG36391, bactNOG46261 | Flagellar hook-basal body protein | NA | NA | 21.7 | 82.6
bactNOG10389, bactNOG44208, bactNOG02345 | Flagellar motor switch protein | NA | NA | 21.7 | 82.6
bactNOG38595, bactNOG34666, bactNOG43852 | Flagellar protein FliS | NA | NA | 21.7 | 82.6
bactNOG02069 | Flagellum-Specific ATP synthase | [C] Energy production and conversion | 43.2 | 21.7 | 82.6
bactNOG37514, bactNOG31243, bactNOG42066, bactNOG27248, bactNOG09558, bactNOG14057, bactNOG72426 | Protein involved in cellular iron ion homeostasis | NA | NA | 21.7 | 82.6
bactNOG27648, bactNOG36097, bactNOG01870, bactNOG31082, bactNOG74955, bactNOG16925, bactNOG31591, bactNOG06736, bactNOG00914, bactNOG25916, bactNOG01033, bactNOG15932, bactNOG37025, bactNOG40000, bactNOG32982, bactNOG20199 | Protein involved in chemotaxis | NA | NA | 21.7 | 82.6
bactNOG43731 | 50S ribosomal protein L34; in Escherichia coli transcription of this gene is enhanced by polyamines | [J] Translation, ribosomal structure and biogenesis | 63.9 | 34.8 | 95.7
bactNOG00965 | Phosphoribosylaminoimidazole carboxylase ATPase subunit | [F] Nucleotide transport and metabolism | 53.4 | 0.0 | 60.9
bactNOG05379, bactNOG02517, bactNOG19673, bactNOG01018 | Udp-N-Acetylglucosamine 2-epimerase | NA | NA | 34.8 | 95.7
bactNOG16773, bactNOG00811 | Atp-Dependent protease | NA | NA | 4.3 | 65.2
bactNOG00957 | Catalyzes the condensation of the acetyl group of acetyl-CoA with 3-methyl-2-oxobutanoate (2-oxoisovalerate) to form 3-carboxy-3-hydroxy-4-methylpentanoate protein | [E] Amino acid transport and metabolism | 49.1 | 4.3 | 65.2
bactNOG25456 | Glutamine amidotransferase, subunit PdxT; with PdxST is involved in the biosynthesis of pyridoxal 5'-phosphate; PdxT catalyzes the hydrolysis of glutamine to glutamate and ammonia; PdxS utilizes the ammonia to synthesize pyridoxal 5'-phosphate | [H] Coenzyme transport and metabolism | 17.4 | 4.3 | 65.2
bactNOG13897, bactNOG00750, bactNOG67445, bactNOG14611, bactNOG60045, bactNOG74366, bactNOG15012, bactNOG65815, bactNOG37933, bactNOG59733, bactNOG19072, bactNOG20172, bactNOG28001 | Methyl-Accepting chemotaxis sensory transducer protein | NA | NA | 4.3 | 65.2
bactNOG12090 | Methyltransferase, CheR | [N] Cell motility / [T] Signal transduction mechanisms | 5.0 | 4.3 | 65.2
bactNOG52841, bactNOG46445, bactNOG96629, bactNOG90266 | Flagellar export protein FliJ | NA | NA | 8.7 | 69.6
bactNOG48374, bactNOG99922, bactNOG82249, bactNOG13602, bactNOG80514 | Protein involved in flagellum assembly | NA | NA | 8.7 | 69.6
bactNOG01470, bactNOG04846 | Cell division protein FtsA | NA | NA | 26.1 | 82.6
bactNOG02318 | Flagellar motor switch protein FliM | [N] Cell motility | 41.9 | 21.7 | 78.3
bactNOG05792 | Cyanophycin synthetase | [M] Cell wall/membrane/envelope biogenesis | 18.5 | 0.0 | 56.5
bactNOG29500, bactNOG10264, bactNOG18792 | Protein involved in cytochrome complex assembly | NA | NA | 0.0 | 56.5
bactNOG92570, bactNOG97494, bactNOG46066 | Ribosomal protein L30 | NA | NA | 34.8 | 91.3
bactNOG11597, bactNOG13922 | Ribonuclease Z; member of metallo-beta-lactamase family; the purified enzyme from Escherichia coli forms dimeric zinc phosphodiesterase; in Bacillus subtilis this protein is a 3'-tRNA processing endoribonuclease and is essential while in Escherichia coli it is not; associates with two zinc ions | NA | NA | 4.3 | 60.9
bactNOG37372 | Stage II sporulation protein M | [S] Function unknown | 6.6 | 4.3 | 60.9
bactNOG46911 | Transcriptional regulator, CopG family protein | [K] Transcription | 6.3 | 4.3 | 60.9
bactNOG42002, bactNOG86985, bactNOG92617, bactNOG50625, bactNOG75642, bactNOG89894, bactNOG92834, bactNOG48059 | Type IV pilus assembly PilZ protein | NA | NA | 8.7 | 65.2
bactNOG31141, bactNOG26612, bactNOG36412, bactNOG37092, bactNOG71934, bactNOG52195 | CBS domain-containing protein | NA | NA | 13.0 | 69.6
bactNOG01343 | Plays an important role in the de novo pathway of purine nucleotide biosynthesis protein | [F] Nucleotide transport and metabolism | 76.7 | 30.4 | 87.0
bactNOG12565 | Atp:Guanido phosphotransferase | [E] Amino acid transport and metabolism | 12.7 | 21.7 | 73.9
bactNOG44690, bactNOG74103, bactNOG55341, bactNOG53379 | Carbon storage regulator protein | NA | NA | 26.1 | 78.3
bactNOG00578 | Drug resistance transporter, EmrB/QacA protein | [P] Inorganic ion transport and metabolism | 56.8 | 26.1 | 78.3
bactNOG67310, bactNOG67983, bactNOG64693, bactNOG67454, bactNOG33438 | Enzyme activator | NA | NA | 21.7 | 73.9
bactNOG01697, bactNOG24497, bactNOG05093 | Sigma factors are initiation factors that promote the attachment of RNA polymerase to specific initiation sites and are then released | NA | NA | 26.1 | 78.3
bactNOG28544, bactNOG33086 | 3-Methyladenine DNA glycosylase | NA | NA | 0.0 | 52.2
bactNOG35415, bactNOG66375, bactNOG32336, bactNOG28346 | Glycerol-3-Phosphate responsive antiterminator protein | NA | NA | 0.0 | 52.2
bactNOG19498, bactNOG43068, bactNOG11337 | Phenylalanine-Trna ligase | NA | NA | 0.0 | 52.2
bactNOG04772, bactNOG01433, bactNOG09984 | Peptidase M16 | NA | NA | 39.1 | 91.3
bactNOG16417 | Flagellar motor protein MotD; Homologous to MotB. These organism have both MotB and MotD. With MotC (a MotA homolog) forms the ion channels that couple flagellar rotation to proton/sodium motive force across the membrane and forms the stator elements of the rotary flagellar machine. Either MotAB or MotCD is sufficient for swimming, but both are necessary for swarming motility | [N] Cell motility | 13.8 | 4.3 | 56.5
bactNOG55225, bactNOG27515, bactNOG65818, bactNOG01779, bactNOG10628 | Iron-Sulfur cluster-binding protein | NA | NA | 4.3 | 56.5
bactNOG87479, bactNOG49851, bactNOG55890, bactNOG86277, bactNOG102115, bactNOG33628, bactNOG16540, bactNOG53787 | TPR repeat-containing protein | NA | NA | 4.3 | 56.5
bactNOG00626 | Arsenical-Resistance protein | [P] Inorganic ion transport and metabolism | 29.8 | 8.7 | 60.9
bactNOG50959, bactNOG48759, bactNOG89587, bactNOG39189, bactNOG43920, bactNOG36618, bactNOG38281, bactNOG52031, bactNOG51626, bactNOG95758 | Glutaredoxin protein | NA | NA | 8.7 | 60.9
bactNOG00525 | Utp-Glucose-1-Phosphate uridylyltransferase | [M] Cell wall/membrane/envelope biogenesis | 63.7 | 30.4 | 82.6
bactNOG18657 | Bifunctional pyrimidine regulatory protein PyrR uracil phosphoribosyltransferase; regulates pyrimidine biosynthesis by binding to the mRNA of the pyr genes, also has been shown to have uracil phosphoribosyltransferase activity | [F] Nucleotide transport and metabolism | 35.0 | 17.4 | 69.6
bactNOG01465 | Flagellar hook protein FlgE | [N] Cell motility | 41.7 | 13.0 | 65.2
bactNOG04205, bactNOG34242 | Ppx/Gppa phosphatase | NA | NA | 13.0 | 65.2
bactNOG44112, bactNOG36504 | RNA chaperone that binds small regulatory RNA (sRNAs) and mRNAs to facilitate mRNA translational regulation in response to envelope stress, environmental stress and changes in metabolite concentrations. Also binds with high specificity to tRNAs protein | NA | NA | 17.4 | 69.6
bactNOG01716 | Undecaprenyl-Phosphate alpha-N protein | [M] Cell wall/membrane/envelope biogenesis | 41.4 | 17.4 | 69.6
bactNOG14292, bactNOG05550, bactNOG03912, bactNOG60601, bactNOG03248, bactNOG63252, bactNOG10183, bactNOG00751, bactNOG00172, bactNOG58297 | Atp-Dependent helicase | NA | NA | 34.8 | 87.0
bactNOG23776, bactNOG28510, bactNOG86058, bactNOG60688, bactNOG38850, bactNOG11515, bactNOG02006, bactNOG58574, bactNOG85199, bactNOG08921, bactNOG35523, bactNOG34604, bactNOG10155, bactNOG18901, bactNOG09956, bactNOG74830, bactNOG34667, bactNOG21541, bactNOG32868, bactNOG02188, bactNOG27650, bactNOG43067, bactNOG33646, bactNOG30150, bactNOG09220, bactNOG74004, bactNOG87394, bactNOG08118, bactNOG87029, bactNOG29496, bactNOG50314, bactNOG42983, bactNOG42538, bactNOG24960, bactNOG12392, bactNOG09967, bactNOG05219, bactNOG10337, bactNOG02508 | Diguanylate cyclase | NA | NA | 34.8 | 87.0

bactNOGs with identical functional descriptions were grouped in our dataset. Groups of bactNOGs are presented for which the number of non-gut genomes where bactNOGs are represented exceeds the number of gut genomes where bactNOGs are represented by at least 12. Abundance: % of species in which the bactNOG is represented in the eggNOG v.3.0 database (shown only for functional groups containing one bactNOG), in 23 gut genomes (GUT), or in 23 non-gut genomes (NONGUT), respectively. NA, not assigned.

Underrepresentation of secreted proteins in gut bacteria

The predicted proteomes of the strains in this study were analyzed using SurfG+ [19] to predict protein localization (Table 2). This analysis revealed a difference in the proportion of secreted proteins: on average, 3.1% (SEM 0.3) of proteins were predicted to be secreted in our set of gut bacteria, as opposed to 4.3% (SEM 0.3) for the non-gut bacteria (not shown). Interestingly, this difference appears to be explained by the presence of relatively low and stable numbers of secreted proteins across differently sized gut bacterial genomes, while in the closely related non-gut bacteria the number of secreted proteins clearly correlates with the total number of proteins encoded in the genome (Fig 5). The numbers of predicted membrane proteins and surface exposed proteins are correlated with the total numbers of encoded proteins in both gut and non-gut bacteria (Fig 5).
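As an illustration of how a summary such as "3.1% (SEM 0.3)" can be derived from per-genome counts, the following Python sketch computes the percentage of secreted proteins per genome and then the mean and standard error of the mean. The source does not describe the exact calculation, so the usual definition (sample standard deviation divided by the square root of the number of genomes) is assumed here; the function names and all counts are placeholders, not data from the study.

```python
import math
import statistics


def secreted_percentage(n_secreted, n_total):
    """Percentage of the predicted proteome that is predicted to be secreted."""
    return 100.0 * n_secreted / n_total


def mean_and_sem(values):
    """Return the arithmetic mean and the standard error of the mean."""
    if len(values) < 2:
        raise ValueError("at least two values are needed to estimate the SEM")
    mean = statistics.fmean(values)
    # Assumed definition: SEM = sample standard deviation / sqrt(n)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Placeholder (secreted, total) protein counts for three imaginary genomes.
toy_counts = [(60, 2000), (100, 2500), (50, 2500)]
percentages = [secreted_percentage(s, t) for s, t in toy_counts]
mean, sem = mean_and_sem(percentages)
```

Applied to the 23 gut and 23 non-gut proteomes separately, the same two functions would yield the pair of summary values reported in the text.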
Fig 5

Predicted numbers of bacterial membrane, potentially surface exposed and secreted proteins as a function of the total number of proteins.

The predicted numbers of (A) membrane (mem), (B) potentially surface exposed (pse), and (C) secreted (sec) proteins in a bacterial species are correlated to the total number of encoded proteins (Spearman's rank correlation test, p < 0.01), with the exception of the sec proteins in gut bacteria where no significant correlation is observed (p > 0.2). Red, gut bacteria; blue, non-gut bacteria.
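For readers unfamiliar with the test named in this legend, the following Python sketch computes Spearman's rank correlation coefficient as the Pearson correlation of rank-transformed data, with tied values receiving the mean of their ranks. It is a minimal, dependency-free illustration and not the authors' code: the function names are hypothetical, the numbers are placeholders rather than data from the study, and no p-value is computed (a statistics package such as SciPy, via `scipy.stats.spearmanr`, would be the usual way to obtain one).

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


# Placeholder numbers, not data from the study.
total_proteins = [1800, 2100, 2600, 3000, 3900]
secreted_nongut = [70, 85, 110, 130, 170]  # rises with proteome size
secreted_gut = [80, 60, 90, 65, 75]        # no clear trend
rho_nongut = spearman_rho(total_proteins, secreted_nongut)
rho_gut = spearman_rho(total_proteins, secreted_gut)
```

In this toy example the non-gut series is strictly increasing and gives a coefficient of 1, while the gut series gives a coefficient close to zero, mirroring the qualitative pattern described for panel C.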

Discussion

The human gut microbiota is increasingly recognized as a major health-determining factor. As our knowledge of this microbial community, and notably its bacterial component, expands, it becomes clear that atypical microbiota compositions (dysbioses) are associated with a growing number of diseases, to the extent that microbiota composition can constitute a "signature" or biomarker of a specific disease (e.g. [35, 36]). At least for some diseases, experiments in animals convincingly show that an atypical microbiota can be a driving force in the development of disease. In line with these observations, promising results have been reported with the use of fecal microbiota transplantation (in this case the transfer of fecal material from a healthy donor to the intestine of a patient) to improve the symptoms of inflammatory bowel disease in humans [37]. Yet our knowledge of the bacterial properties that drive homeostasis, the equilibrium of the gut microbial ecosystem including bacteria-host interactions, is still limited. Seen from the bacterial side, which genes does a bacterium need to maintain itself in this ecosystem? Which bacterial genes play a role in the functioning of the system as a whole, including interactions with the host?

Several approaches, in silico and through experimental screening, have been and are being used to answer these questions. One of the main in silico approaches consists of a global comparison of functions encoded by gut bacteria and functions encoded by non-gut bacteria, looking for what seems to be gut-specific, as exemplified by the study presented in [1]. A possible drawback of this approach, however, is that the taxonomic composition of the gut microbiota and that of the non-gut reference data set may differ widely. As a consequence, observed differences may in part be due to inherent differences between bacterial taxa that are not necessarily relevant for understanding the gut ecosystem. On the other hand, this approach may fail to detect obvious gut adaptations in a specific taxon if similar functions exist in a different taxon among the non-gut bacteria. In the present study we therefore used a complementary approach and compared data from selected gut bacteria and closely related non-gut bacteria of the Firmicutes phylum, a procedure that should favor the detection of environment-specific adaptations.

We observed a tendency towards a relatively low and stable number of predicted secreted proteins across the gut bacteria that was not observed in the non-gut bacteria. It will be interesting to see if this tendency is confirmed when larger numbers of genomes are analyzed. If so, this could mean that gut bacteria limit the number of secreted proteins, perhaps in response to the intestinal flow, as this type of protein could easily become separated from, and thus be of limited advantage to, the secreting bacteria.

We identified several functions that may play an important role in the gut environment but had gone undetected by the global comparison approach described in [1]. For instance, our data strongly suggest that conjugation plays an important role in the gut environment. Conjugation is the most effective mechanism of horizontal gene transfer (HGT), in which the exchange of genetic material can occur even between highly divergent bacterial species [38], and our conclusion is in line with earlier evidence of HGT in the gut environment [39-41]. The ability to acquire fitness genes by conjugation may provide gut bacteria with competitive advantages to thrive in this complex environment. Of clinical importance, elevated bacterial conjugation activity in the densely populated gut ecosystem, an environment recognized as a significant reservoir of antibiotic resistances [42], may also play an important role in the spread of antibiotic resistance genes.

We further detected sulfatase and L-fucose isomerase as overrepresented functions in the gut bacteria. Sulfatases and their role in the foraging of abundant sulfated glycans in the gut have been described as critical for the fitness of Bacteroides thetaiotaomicron in the gut environment [24], but are far less studied in Firmicutes [24, 43]. Similarly, L-fucose isomerase is involved in the metabolism of L-fucose, a highly abundant sugar in the intestine [30], present in dietary polysaccharides such as pectin but also in mucin glycoproteins overlying the intestinal epithelium. Together, these examples clearly illustrate the potential of our targeted comparative analysis, focusing on closely related bacteria from different environments, to identify functions that are important in the gut environment.

This approach also makes it possible to identify functions that are underrepresented among gut bacteria, an analysis that proved equally informative. We thus observed that an astonishing one third of the underrepresented functions appear to be involved in motility and chemotaxis, representing to our knowledge the first observation of this kind. Bacterial chemotaxis is the phenomenon whereby bacteria direct their movements according to certain chemical stimulants in their environment, and our observation may be explained by the fact that the majority of the bacteria from the “non-gut” set were isolated from water or soil. It is easy to imagine that in these environments it is important for bacteria to move towards the highest concentration of food or other essential molecules, or to flee from poisons. In the gut, the opposite is true, as free molecules in transit pass by bacteria that are often adhering to the intestinal surfaces or food particles [44]. An alternative explanation may be that these commensal bacteria have been selected for the absence of one of the best known immunomodulatory bacterial cell surface proteins, flagellin, which interacts with TLR5 to induce an inflammatory response [45].

The present study can be regarded as a proof of principle demonstrating the potential of taxonomically targeted comparative analyses in the identification of functions that are important in a given ecosystem, in our case the human intestinal tract. The results of these analyses confirmed a number of earlier observations or intuitions about functions that are considered key functions in the gut environment. The analyses identified not only new functions but also a relative paucity of some other functions, both of which appear to be important in the human gut environment and which, even if experimental evidence is still incomplete, intuitively seem to make sense. These results suggest that the identified "unknown functions" that are found to be overrepresented in the gut bacteria in our analysis are important too and worth investigating further.

In this pilot experiment we limited ourselves to completely sequenced genomes. Without this self-imposed limitation, which is probably not necessary, a wealth of additional data becomes available for analysis. Many more bacterial genomes have been sequenced to near completion since we started this study and ever more are becoming available, including "metagenomic species" genomes that are directly assembled from metagenomic data [46]. The use of these data will allow more robust studies with higher numbers of bacteria. Parallel developments see the pairwise comparison of two human gut microbiota types, typically patients and healthy subjects, rather than comparison of the gut microbiota with bacteria from completely different ecosystems. While not answering exactly the same questions, the different approaches are complementary and should together eventually lead to the unraveling of the critical factors in gut homeostasis that rule our health. The acquired knowledge may subsequently guide our choice of health-beneficial probiotics, screening for desired properties to restore or consolidate homeostasis and avoiding properties that are incompatible with homeostasis.

S1 Table. BactNOGs overrepresented in gut bacteria. (DOCX)

S2 Table. BactNOGs underrepresented in gut bacteria. (DOCX)