Efrat Muller, Yadid M Algavi, Elhanan Borenstein.
Abstract
Integrative analysis of microbiome and metabolome data obtained from human fecal samples is a promising avenue for better understanding the interplay between bacteria and metabolites in the human gut, in both health and disease. However, acquiring, processing, and unifying such datasets from multiple sources is a daunting task. Here we present a publicly available, simple-to-use, curated dataset collection of paired fecal microbiome-metabolome data from multiple cohorts. This data resource allows researchers to easily obtain multiple fully processed and integrated microbiome-metabolome datasets, facilitating the discovery of universal microbe-metabolite links, the benchmarking of microbiome-metabolome integration tools, and the comparison of newly identified microbe-metabolite findings with other published datasets.
Year: 2022 PMID: 36243731 PMCID: PMC9569371 DOI: 10.1038/s41522-022-00345-5
Source DB: PubMed Journal: NPJ Biofilms Microbiomes ISSN: 2055-5008 Impact factor: 8.462
Datasets included in the Curated Gut Microbiome-Metabolome Data Resource.
| Dataset name | Ref | Cohort description | No. samples w/ paired data | Longitudinal Y/N | No. HMDB-annotated compounds | No. KEGG-annotated compounds |
|---|---|---|---|---|---|---|
| YACHIDA_CRC_2019 | [ | Patients with colonoscopy findings from normal to stage 4 CRC, and controls | 347 | No | 407 | 431 |
| FRANZOSA_IBD_2019 | [ | IBD patients and controls (PRISM cohort) | 220 | No | 199 | 174 |
| SINHA_CRC_2016 | [ | CRC patients and controls | 131 | No | 352 | 189 |
| HE_INFANTS_MFGM_2019 | [ | Infants on different diets during their 1st year of life | 277 | Yes | 118 | 111 |
| iHMP_IBDMDB_2019 | [ | HMP2 (iHMP) cohort: Longitudinal samples from IBD patients and controls | 389 | Yes | 455 | 276 |
| JACOBS_IBD_2016 | [ | IBD patients and their first degree (healthy) relatives | 90 | No | 36 | 27 |
| POYET_BIO_ML_2019 | [ | Longitudinal samples from healthy BIO-ML (stool bank) donors | 164 | Yes | 255 | 223 |
| ERAWIJANTARI_GC_2020 | [ | Patients with a history of gastrectomy for GC, and controls | 96 | No | 462 | 505 |
| KIM_ADENOMAS | [ | Patients with advanced colorectal adenomas, CRC, and controls | 240 | No | 358 | 262 |
| MARS_IBS_2020 | [ | Longitudinal samples from patients with IBS and controls | 455 | Yes | 40 | 36 |
| KANG_AUTISM_2018 | [ | Children with autism and neurotypical children | 44 | No | 58 | 57 |
| KOSTIC_INFANTS_T1D_2015 | [ | Longitudinal samples from children at risk for T1D (DIABIMMUNE cohort) | 103 | Yes | 138 | 130 |
| WANDRO_PRETERMS_2018 | [ | Preterm infants during their first 6 months of life. Some developed LOS/NEC | 75 | Yes | 198 | 199 |
| WANG_ESRD_2020 | [ | Adults with ESRD and controls | 287 | No | 148 | 87 |
CRC Colorectal cancer, IBD Inflammatory bowel disease, MFGM Milk fat globule membrane, BIO-ML Broad Institute-OpenBiome Microbiome Library, GC Gastric cancer, IBS Irritable bowel syndrome, T1D Type 1 diabetes, LOS Late-onset sepsis, NEC Necrotizing enterocolitis, ESRD End-stage renal disease.
Fig. 1 Data resource processing, organization, and statistics.
a A highlight of data resources and main processing steps of the “curated microbiome-metabolome data resource” (see Methods); b A database scheme of the final data products per dataset. Each box describes a specific table and its content and primary key (PK) field. The “species” table is only available for studies with shotgun metagenomic data; c Data resource summary statistics; d Genera prevalence across datasets. Each bar represents the number of unique genera that appear in at least the specified number of datasets; e Metabolite prevalence across datasets, interpretation equivalent to (d).
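The prevalence counts described for panels (d) and (e) can be illustrated with a minimal sketch. It assumes each dataset has already been reduced to a list of feature names (genera or metabolites); the function name `prevalence_curve` and the toy dataset names are hypothetical and are not part of the published resource.

```python
from collections import Counter


def prevalence_curve(dataset_to_features):
    """Return {k: number of unique features present in at least k datasets}.

    `dataset_to_features` maps a dataset name to an iterable of feature
    names (e.g. genera). This is an illustrative helper, not the authors' code.
    """
    counts = Counter()
    for features in dataset_to_features.values():
        # Count each feature at most once per dataset.
        counts.update(set(features))
    n_datasets = len(dataset_to_features)
    return {
        k: sum(1 for c in counts.values() if c >= k)
        for k in range(1, n_datasets + 1)
    }


# Toy input (hypothetical dataset names and contents).
toy = {
    "DATASET_A": ["Bacteroides", "Blautia", "Roseburia"],
    "DATASET_B": ["Bacteroides", "Blautia"],
    "DATASET_C": ["Bacteroides", "Veillonella"],
}
curve = prevalence_curve(toy)
print(curve)  # {1: 4, 2: 2, 3: 1}
```

Each value in the returned mapping corresponds to the height of one bar: the number of unique features seen in at least that many datasets.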
Fig. 2 A meta-analysis of genus-metabolite association reveals a dense network of consistent associations.
a Associations between genera and metabolites were tested using linear models, in each dataset independently and controlling for study groups. The dot plot illustrates association results for the top 70 associated metabolites and the top 40 associated genera. Each dot represents a genus-metabolite pair, dot size represents the number of datasets in which the pair was analyzed, and dot colors represent the percent of datasets in which a significant association (positive or negative) was found (see also Methods). A question mark indicates conflicting results between two or more datasets, i.e., at least one significant negative association and at least one significant positive association. Metabolites (grid columns) are grouped by their metabolite classes, abbreviated as follows: Ben. Benzenoids, OS Other steroids, Cbxm. Carboximidic acids, COOH Carboxylic acids and derivatives, AA Amino acids, OO Other organic acids, ONC Organonitrogen compounds, CHO Carbohydrates and carbohydrate conjugates, OHC Organoheterocyclic compounds, PPA Phenylpropanoic acids. Genera (grid rows) are grouped by their order taxonomic rank, abbreviated as follows: Actin. Actinomycetales (Actinobacteriota phylum), Bacte. Bacteroidales (Bacteroidota phylum), Lachn. Lachnospirales (Firmicutes_A phylum), Oscil. Oscillospirales (Firmicutes_A phylum), Chris. Christensenellales (Firmicutes_A phylum), Veill. Veillonellales (Firmicutes_C phylum), Enter. Enterobacterales (Proteobacteria phylum). b A bipartite network of consistent genus-metabolite associations, identified by a meta-analysis of 11 different microbiome-metabolome datasets from the “curated microbiome-metabolome data resource”. Green nodes represent genera, with node sizes proportional to genus’ average relative abundance, and orange nodes represent metabolites. Edges between genus nodes and metabolite nodes represent a consistent positive (blue) or negative (red) association.
Details about the network nodes and edges are available in Supplementary Table 4.
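The per-pair summary encoded in the Fig. 2a dot plot (number of datasets analyzed, percent with a significant association, and the question-mark conflict flag) can be sketched as follows. This is a minimal illustration under stated assumptions: the `PairResult` record, the `summarize_pair` function, and the idea that significance has already been decided upstream are all hypothetical; the actual models and thresholds are described in the paper's Methods.

```python
from dataclasses import dataclass


@dataclass
class PairResult:
    """One genus-metabolite test result from a single dataset (hypothetical record)."""
    dataset: str
    coefficient: float  # slope of the genus term in the per-dataset linear model
    significant: bool   # assumed to be decided upstream (threshold not specified here)


def summarize_pair(results):
    """Summarize one genus-metabolite pair across the datasets that tested it."""
    n = len(results)
    n_pos = sum(1 for r in results if r.significant and r.coefficient > 0)
    n_neg = sum(1 for r in results if r.significant and r.coefficient < 0)
    return {
        "n_datasets": n,  # would map to dot size
        "pct_significant": 100.0 * (n_pos + n_neg) / n if n else 0.0,  # dot color
        "conflicting": n_pos > 0 and n_neg > 0,  # question mark in the plot
    }


# Toy example with made-up values.
example = [
    PairResult("DATASET_A", 0.8, True),
    PairResult("DATASET_B", 0.5, True),
    PairResult("DATASET_C", -0.6, True),
    PairResult("DATASET_D", 0.1, False),
]
summary = summarize_pair(example)
print(summary)
```

A pair is flagged as conflicting only when at least one significant positive and at least one significant negative association are present, matching the definition given in the caption.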