Donghui Yan1, Liu Cao1, Muqing Zhou1, Hosein Mohimani1.
Abstract
The human microbiome is a complex community of microorganisms, their enzymes, and the molecules they produce or modify. Recent studies show that imbalances in human microbial ecosystems can cause disease. Our microbiome affects our health through the products of biochemical reactions catalyzed by microbial enzymes (microbial biotransformations). Despite their significance, there are currently no systematic strategies for identifying these chemical reactions, their substrates and molecular products, and their effects on health and disease. We present TransDiscovery, a computational algorithm that integrates molecular networks (connecting related molecules with similar mass spectra), association networks (connecting co-occurring molecules and microbes), and knowledge bases of microbial enzymes to discover microbial biotransformations, their substrates, and their products. After searching the metabolomics and metagenomics data from the American Gut Project and the Global Foodomic Project, TransDiscovery identified 17 potentially novel biotransformations from the human gut microbiome, along with the corresponding microbial species, substrates, and products.
Keywords: association network; biotransformation; enterobacteria; mass spectrometry; metagenomics; microbiome; molecular network; riboflavin; sutterella
Year: 2022 PMID: 35208194 PMCID: PMC8877437 DOI: 10.3390/metabo12020119
Source DB: PubMed Journal: Metabolites ISSN: 2218-1989
Figure 1. TransDiscovery framework for discovering novel biotransformations of human dietary ingredients by (a) the gut microbiome. Starting with (b) the mass spectral data of small gut molecules and (c) metagenomics data of gut microbes, the pipeline includes the following steps: extracting (d) molecular and (e) microbial features from raw data, (f) constructing an association network [26,27,28] of molecular and microbial features (edges shown in green), (g) constructing a molecular network [30] (edges shown in red), (h) integrating associations and the molecular network, (i) extracting candidate biotransformations as golden triangles, (j) identifying substrates of biotransformations with an in silico database search with Dereplicator+ [31], and (k) characterizing molecular products of known biotransformations using in silico predictions of BioTransformer [29]. Note that in steps (f,h–k), the nodes can represent either strains or enzymes. In steps (f,h), the plus and minus labels indicate that the substrate is negatively correlated with the microbial feature and the product is positively correlated with the microbial feature.
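Step (g) links molecules whose mass spectra are similar. As a rough illustration of what such a similarity score can look like, the sketch below computes a plain cosine score over binned peaks. This is a simplification chosen for brevity: the function names, the `bin_width` parameter, and the peak lists are hypothetical, and the molecular networking method cited as [30] may score spectra differently.

```python
from math import sqrt


def binned(spectrum, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins.

    spectrum: list of (mz, intensity) pairs (hypothetical input format).
    """
    bins = {}
    for mz, intensity in spectrum:
        key = int(mz // bin_width)
        bins[key] = bins.get(key, 0.0) + intensity
    return bins


def cosine_score(spec_a, spec_b, bin_width=1.0):
    """Cosine similarity between two binned spectra; 0.0 if either is empty."""
    a = binned(spec_a, bin_width)
    b = binned(spec_b, bin_width)
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Illustrative peak lists, not taken from the study.
spectrum_1 = [(100.1, 10.0), (150.2, 5.0)]
spectrum_2 = [(100.3, 8.0), (150.4, 6.0)]
spectrum_3 = [(200.3, 7.0)]
similar = cosine_score(spectrum_1, spectrum_2)    # shares both bins
unrelated = cosine_score(spectrum_1, spectrum_3)  # shares no bins
```

In a network built this way, an edge would be drawn between two molecules when their score exceeds a chosen cutoff; the cutoff itself is a design choice not specified here.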
Figure 2. Integrating molecular and association networks. As the size of the network is enormous (41,765 nodes and 45,479 edges), here, we focus on some of the networking families that have a known molecule identified by Dereplicator+ [34] as a polyphenol or a vitamin (25 molecules in total). The network for all 25 molecules is shown in Figure S5. As the association network is currently too dense to visualize, we only show the top two microbial features for each molecular feature (Fisher p-value of ). The known transformations were reported by BioTransformer [29]. Considering edges in the molecular network or the association network alone yields a large number of potential biotransformations, many of which are likely to be spurious. Focusing on the overlap between the two networks (golden triangles) yields a much smaller set of potential biotransformations, many of which can be validated by a literature search.
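The caption ranks association edges by a Fisher p-value without spelling out the test. Assuming this refers to Fisher's exact test on a 2×2 presence/absence table of a microbial feature and a molecular feature across samples, a one-sided version can be sketched from the hypergeometric distribution as follows. The function name, argument names, and counts are hypothetical.

```python
from math import comb


def fisher_exact_greater(both, microbe_only, molecule_only, neither):
    """One-sided Fisher's exact test for co-occurrence.

    Arguments are sample counts from a 2x2 presence/absence table.
    Returns the probability of seeing at least `both` co-occurrences
    under independence, with the table margins held fixed.
    """
    n = both + microbe_only + molecule_only + neither
    with_microbe = both + microbe_only
    with_molecule = both + molecule_only
    total = comb(n, with_molecule)
    k_max = min(with_microbe, with_molecule)
    # Sum hypergeometric probabilities for k = both .. k_max co-occurrences.
    tail = sum(
        comb(with_microbe, k) * comb(n - with_microbe, with_molecule - k)
        for k in range(both, k_max + 1)
    )
    return tail / total


# Illustrative counts: the two features always appear together in 6 samples.
p_perfect = fisher_exact_greater(3, 0, 0, 3)
# Illustrative counts: the two features never appear together.
p_never = fisher_exact_greater(0, 3, 3, 0)
```

A small p-value suggests the microbe and the molecule co-occur more often than chance would predict, which is the kind of evidence an association edge encodes.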
TransDiscovery identified 17 biotransformations. The ρ (substrate) and ρ (product) columns give Spearman's rank correlation coefficients between the microbial feature and the substrate or product, respectively. The top rows show a negative correlation with the substrate and a positive correlation with the product; the bottom rows do not.

| Substrate Name | Biotransformation Name | List of All Strains that Are Observed | ρ (substrate) | ρ (product) | Description |
|---|---|---|---|---|---|
| Dihydroferuloylglycine | Hydrolysis of carboxylic acid ester | Prevotella, etc. | −0.16 | 0.17 | Lachnospiraceae bacterium AM48-27BH |
| 5-(3 | Dehydroxylation | Pseudomonas; Enterobacteriaceae | −0.14 | 0.11 | Escherichia coli DEC12C |
| Isoferulic acid; Ferulic acid | Alpha, beta-ketoalkene double bond reductase | Oscillospira, etc. | −0.11 | 0.15 | Corynebacterium aurimucosum 911 CAUR |
| 5-(3 | Dehydroxylation | Methanobrevibacter | −0.11 | 0.10 | Methanobrevibacter woesei DSM 11979 |
| 3-Hydroxy-4-methoxyphenyllactic acid, etc. | Dehydroxylation | Dentocariosa | −0.11 | 0.09 | Rothia dentocariosa 694 RDEN |
| Dihydrocaffeic acid | Catechol O-methylation | Tissierellaceae; Finegoldia | −0.10 | 0.16 | Peptoniphilus senegalensis JC140 |
| Hydroxybenzoic acid; Protocatechuic aldehyde | Dehydroxylation | Blautia | −0.09 | 0.09 | Blautia wexlerae BIOML-A4 |
| Dihydrosinapic acid, etc. | Dehydroxylation | Ruminococcaceae; Lachnospiraceae | −0.09 | 0.09 | Lachnospiraceae bacterium MGYG-HGUT-00141 |
| Matairesinol | Dehydroxylation | Prausnitzii | −0.08 | 0.11 | Faecalibacterium prausnitzii MGYG-HGUT-00195 |
| 3-Phenylpropionic acid | Beta-Oxidation of carboxylic acid | Blautia | −0.08 | 0.09 | Blautia wexlerae BIOML-A4 |
| p-Coumaric acid; m-Coumaric acid | Dehydroxylase; Dehydroxylation | Faecalibacterium; Prausnitzii | 0.13 | 0.14 | Veillonella parvula BIOML-A2 |
| Dihydroferulic acid | Dehydroxylation | Clostridiales; Granulicatella | 0.11 | 0.14 | Ruminococcus bromii ATCC 27255 |
| 3-Hydroxyphenylvaleric acid | Dehydroxylation | Enterobacteriaceae | 0.11 | 0.11 | Escherichia coli DEC12C |
| p-Coumaric acid | Decarboxylation of phenolic acid/hydroxycinnamic acid | Bifidobacterium; Clostridiales; Lactobacillus | 0.10 | 0.17 | Lactobacillus casei NBRC 101979 |
| 5-(3 | Catechol O-methylation | Desulfovibrio; Enterobacteriaceae | 0.10 | −0.09 | Ruminococcus torques 2789STDY5608867 |
| Protocatechuic acid, etc. | Dehydroxylation; Aldehyde oxidation | Bacillales, etc. | −0.10 | −0.10 | Finegoldia magna DSM 20470 |
| 3-Hydroxyphenylpropionic acid; Paeonol, etc. | UDP-glucuronosyltransferase | Pseudomonas | −0.09 | −0.09 | Pseudomonas fragi F1786 |
If multiple strains are included in one row, the ρ value for the first strain is shown.
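For readers unfamiliar with the statistic in the table, Spearman's ρ is the Pearson correlation computed on rank-transformed values, with tied values sharing their average rank. The sketch below is a minimal standard-library version for illustration; the function names and the abundance values are hypothetical and not taken from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation; undefined (division by zero) if x or y is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Illustrative per-sample values for one microbial feature and two molecules.
microbe = [0.1, 0.4, 0.2, 0.9]
substrate = [8.0, 3.0, 5.0, 1.0]  # falls as the microbe rises
product = [1.0, 6.0, 2.0, 9.0]    # rises with the microbe
rho_substrate = spearman_rho(microbe, substrate)
rho_product = spearman_rho(microbe, product)
```

A negative value for the substrate together with a positive value for the product matches the pattern seen in the top rows of the table.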
Figure 3. One of the molecular features (orange circle) in the molecular network (same color scheme as in Figure 2) is specific to AGP, and Dereplicator+ [34] identified it as hydroxyethylflavine, a known product of the degradation of riboflavin by an unknown microbial enzyme in gut microbiota [37]. A microbial feature annotated as Sutterella wadsworthensis is positively correlated with this product and negatively correlated with riboflavin (highlighted green circle). The predicted riboflavin degradation gene cluster in Sutterella wadsworthensis is shown, along with two known riboflavin degradation gene clusters from Microbacterium maritypicum and Devosia riboflavina. Sutterella wadsworthensis is predicted to degrade riboflavin to hydroxyethylflavine, while the other two bacteria degrade it to lumichrome.
Figure 4. Candidate biotransformations were identified as triplets of microbial features, substrates, and products where there was (i) a positive correlation between the microbial feature and the product, (ii) a negative correlation between the microbial feature and the substrate, and (iii) an edge in the molecular network between the substrate and the product.
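The three conditions in this caption can be sketched as a small filter over the two networks. This is an illustrative reading of the criteria, not the authors' implementation: the data structures, the function name, and the correlation values are assumptions, and the example edge simply mirrors the riboflavin case described for Figure 3.

```python
def golden_triangles(molecular_edges, correlations):
    """Find (microbe, substrate, product) triplets meeting conditions (i)-(iii).

    molecular_edges: iterable of undirected (molecule_a, molecule_b) pairs.
    correlations: dict mapping (microbe, molecule) to a signed correlation.
    """
    by_microbe = {}
    for (microbe, molecule), rho in correlations.items():
        by_microbe.setdefault(microbe, {})[molecule] = rho

    triangles = []
    for mol_a, mol_b in molecular_edges:  # condition (iii): connected molecules
        for microbe, rhos in by_microbe.items():
            if mol_a not in rhos or mol_b not in rhos:
                continue
            # Try both orientations, since molecular edges are undirected.
            for substrate, product in ((mol_a, mol_b), (mol_b, mol_a)):
                # Conditions (ii) and (i): negative with substrate, positive with product.
                if rhos[substrate] < 0 and rhos[product] > 0:
                    triangles.append((microbe, substrate, product))
    return sorted(triangles)


# Illustrative inputs; the numeric correlations are made up.
edges = [("riboflavin", "hydroxyethylflavine"), ("riboflavin", "molecule X")]
corr = {
    ("Sutterella wadsworthensis", "riboflavin"): -0.10,
    ("Sutterella wadsworthensis", "hydroxyethylflavine"): 0.12,
    ("microbe Y", "riboflavin"): 0.05,
    ("microbe Y", "molecule X"): 0.07,
}
candidates = golden_triangles(edges, corr)
```

Here only the Sutterella wadsworthensis triplet survives, because the second microbe is positively correlated with both molecules and so fails condition (ii).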