Li Zhang, Karen R Jonscher, Zuyuan Zhang, Yi Xiong, Ryan S Mueller, Jacob E Friedman, Chongle Pan.
Abstract
The immune system of some genetically susceptible children can be triggered by certain environmental factors to produce islet autoantibodies (IA) against pancreatic β cells, which greatly increases their risk for Type-1 diabetes. An environmental factor under active investigation is the gut microbiome due to its important role in immune system education. Here, we study gut metagenomes that are de-novo-assembled in 887 at-risk children in the Environmental Determinants of Diabetes in the Young (TEDDY) project. Our results reveal a small set of core protein families, present in >50% of the subjects, which account for 64% of the sequencing reads. Time-series binning generates 21,536 high-quality metagenome-assembled genomes (MAGs) from 883 species, including 176 species that hitherto have no MAG representation in previous comprehensive human microbiome surveys. IA seroconversion is positively associated with 2373 MAGs and negatively with 1549 MAGs. Comparative genomics analysis identifies lipopolysaccharide biosynthesis in Bacteroides MAGs and sulfate reduction in Anaerostipes MAGs as functional signatures of MAGs with positive IA-association. The functional signatures in the MAGs with negative IA-association include carbohydrate degradation in lactic acid bacteria MAGs and nitrate reduction in Escherichia MAGs. Overall, our results show a distinct set of gut microorganisms associated with IA seroconversion and uncover the functional genomics signatures of these IA-associated microorganisms.
Year: 2022 PMID: 35729161 PMCID: PMC9213500 DOI: 10.1038/s41467-022-31227-1
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 17.694
Fig. 1 Phylogeny and taxonomy distribution of high-quality MAGs from TEDDY.
a Taxonomy tree of the 883 species represented by TEDDY MAGs. Branches are colored at the phylum level. The four rings mark the 567 species matched to Almeida, et al.[21] in blue, the 458 species matched to Nayfach, et al.[18] in green, the 626 species matched to Pasolli, et al.[22] in brown, and the 176 species only identified from TEDDY in red. b Order-level composition of the 883 TEDDY species (left) and the number of species per subject in each order (right). Only the five most common orders, Lachnospirales (n = 823 subjects), Oscillospirales (n = 648), Coriobacteriales (n = 321), Bacteroidales (n = 698), and Lactobacillales (n = 615), are shown individually, while the remaining orders are grouped as ‘other’ (n = 3752). Most subjects had fewer than 10 species from each of the orders except Lachnospirales. Boxplots show the median (center), the first and third quartiles (bounds of box), and 1.5× interquartile ranges (whiskers). Points beyond the ends of whiskers are outliers. c Species identified in the largest numbers of subjects in TEDDY, colored by order. Source data are provided as a Source Data file.
Fig. 2 Genome-resolved longitudinal abundance profiles of TEDDY microbiomes.
a Species clustered into seven groups by their average MAG abundances across eight developmental stages, shown in months on the x-axis. Clusters are ordered temporally by their peak abundances in the developmental stages from month 3 to month 35. Line colors indicate the membership probabilities of the species. Only species with a membership probability greater than 0.5 are shown. Abundances were standardized to a mean value of zero and a standard deviation of one. b Order-level composition of species identified in the seven clusters. Source data are provided as a Source Data file.
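The two preprocessing rules stated in this caption (standardizing abundances to mean zero and standard deviation one, and keeping only species with a membership probability above 0.5) can be illustrated with a minimal sketch. The function names, the toy inputs, and the use of the population standard deviation are assumptions made for illustration; the record does not state which clustering software or which standard deviation convention the authors used.

```python
from statistics import mean, pstdev


def standardize(profile):
    """Scale one abundance profile to mean 0 and standard deviation 1.

    Assumption: the population standard deviation is used; the caption
    does not say whether the population or sample form was applied.
    """
    mu = mean(profile)
    sd = pstdev(profile)
    if sd == 0:
        # A flat profile carries no temporal signal; return zeros.
        return [0.0 for _ in profile]
    return [(x - mu) / sd for x in profile]


def confident_members(memberships, threshold=0.5):
    """Map each species to its most probable cluster index.

    Species whose highest membership probability does not exceed
    `threshold` are dropped, mirroring the >0.5 rule in the caption.
    """
    kept = {}
    for species, probs in memberships.items():
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] > threshold:
            kept[species] = best
    return kept


# Hypothetical example: one profile across three stages, two species.
z = standardize([1.0, 2.0, 3.0])
members = confident_members({"sp_a": [0.7, 0.3], "sp_b": [0.5, 0.5]})
```

In this toy input `sp_b` is dropped because its best membership probability equals the threshold rather than exceeding it.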
Fig. 3 Construction of protein core families from human gut microbiota.
a Mapping rates of metagenomic reads onto protein-coding genes with functional annotations. Violin plots show the distributions of annotation mapping rates across samples (n = 12,854 metagenomic sequencing runs). Over half of the genetic potential in a sample has functional annotations from MetaCyc or KEGG. Violin plots indicate median (white dot), the first and third quartiles (black bar in the center), and the 1.5× interquartile ranges (black lines stretched from the bar). b Mapping rates of metagenomic reads onto protein families present in more than certain percentages of subjects in each metagenomic sequencing run (n = 12,854). All protein families (0% on the x-axis) accounted for 82.4% of the metagenomic reads. The core protein families were defined to be families found in >50% of subjects (50% on the x-axis), which accounted for 63.6% of the metagenomic reads. c Distribution of proteins across protein families. The core protein families in >50% of subjects represented 2.2% of all families, but included 62.7% of all proteins. The peripheral protein families in less than 10% of subjects represented 91.2% of all families, but included only 13% of all proteins. d Mapping rates of metagenomic reads onto the core protein families across the following developmental stages defined by the months of age: 3 to 10 (n = 4,645), 11 to 18 (n = 3,634), 19 to 26 (n = 2,252), 27 to 34 (n = 1,385), and ≥35 (n = 938). The mapping rates only decreased slightly as the subjects matured and their microbiomes diversified. e Principal component analysis (PCA) of the functional profiles of major orders over time. The functional profile of an order is the gene abundances of core families in this order in every KEGG category. f PCA of the taxonomic profiles of KEGG categories over time. The taxonomic profile of a KEGG category is the gene abundances of core families in this KEGG category in every order.
Boxplots show the median (center), the first and third quartile (box), and 1.5X interquartile ranges (whiskers). Source data are provided as a Source Data file.
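The core-family definition in panels b and c (families found in >50% of subjects) and the read mapping rate reported for them can be illustrated with a minimal sketch. The data layout, function names, and toy counts below are hypothetical; this is not the authors' pipeline, only the prevalence threshold applied to made-up inputs.

```python
def core_families(family_subjects, n_subjects, min_prevalence=0.5):
    """Return families present in strictly more than `min_prevalence`
    of subjects, following the >50% rule stated in the caption.

    `family_subjects` maps a family ID to the set of subjects in which
    it was detected (a hypothetical input layout).
    """
    return {
        family
        for family, subjects in family_subjects.items()
        if len(subjects) / n_subjects > min_prevalence
    }


def read_fraction(read_counts, families, total_reads):
    """Fraction of all reads in a run that map to the given families."""
    mapped = sum(read_counts.get(family, 0) for family in families)
    return mapped / total_reads


# Hypothetical toy data: four subjects, three protein families.
presence = {
    "famA": {"s1", "s2", "s3"},  # 75% of subjects
    "famB": {"s1", "s2"},        # exactly 50%, so not core
    "famC": {"s4"},              # 25%
}
reads = {"famA": 60, "famB": 20, "famC": 5}

core = core_families(presence, n_subjects=4)
core_rate = read_fraction(reads, core, total_reads=100)
```

With these toy inputs only `famA` passes the threshold, and it accounts for 60 of 100 reads. The strict inequality matters: a family detected in exactly half of the subjects is excluded.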
Fig. 4 Taxonomy distribution of MAGs positively or negatively associated with seroconversion.
For clarity, the phylogenetic tree comprises only species containing more than 10 high-quality MAGs. The inner ring shows the total numbers of MAGs in each species in blue bars and the outer ring shows the numbers of positively IA-associated MAGs in red bars and negatively IA-associated MAGs in green bars. Branches of the phylogenetic trees are colored in red for taxa containing positively IA-associated MAGs and in green for taxa containing negatively IA-associated MAGs. Taxa are highlighted in arcs of varying colors and are identified in the legend. Comparative genomics were performed between adjacent taxa with and without significant MAGs. Source data are provided as a Source Data file.
Comparative genomics of the taxa containing MAGs positively associated with IA seroconversion.
| Major species containing positive MAGs (Positive MAGs/Total MAGs) | Module ID | Function description | Enrichment analysis (b): effect size | Enrichment analysis (b): adjusted p-value | Phylogenetic regression analysis (c): estimate | Phylogenetic regression analysis (c): adjusted p-value |
|---|---|---|---|---|---|---|
| Comparison within | | | | | | |
| | M00064 | ADP-L-glycero-D-manno-heptose biosynthesis | 0.43 | 2.16E-32 | 1.95 | 6.74E-4 |
| Comparison within | | | | | | |
| | M00616 | Sulfate-sulfur assimilation | 1.00 | 3.05E-40 | 7.22 | 2.53E-26 |
| | M00307 | Pyruvate oxidation, pyruvate => acetyl-CoA | 0.97 | 4.52E-38 | 9.26 | 3.80E-13 |
| | M00620 | Incomplete reductive citrate cycle, acetyl-CoA => oxoglutarate | 0.95 | 8.18E-39 | 24.47 | 5.60E-05 |
| | M00173 | Reductive citrate cycle (Arnon-Buchanan cycle) | 0.79 | 3.69E-33 | 11.32 | 7.04E-15 |
| | M00596 | Dissimilatory sulfate reduction, sulfate => H2S | 0.78 | 6.96E-30 | 12.67 | 4.67E-18 |
| | M00176 | Assimilatory sulfate reduction, sulfate => H2S | 0.71 | 2.71E-23 | 17.31 | 2.24E-21 |
| | M00374 | Dicarboxylate-hydroxybutyrate cycle | 0.62 | 9.19E-22 | 11.02 | 4.57E-12 |
| | M00632 | Galactose degradation, Leloir pathway, galactose => alpha-D-glucose-1P | 0.58 | 8.72E-15 | 11.47 | 9.30E-16 |
| | M00125 | Riboflavin biosynthesis, GTP => riboflavin/FMN/FAD | 0.48 | 1.97E-14 | 7.84 | 1.03E-15 |
| | M00376 | 3-Hydroxypropionate bi-cycle | 0.47 | 4.68E-12 | 7.67 | 3.67E-17 |
| | M00565 | Trehalose biosynthesis, D-glucose 1P => trehalose | 0.45 | 3.87E-12 | 10.52 | 6.69E-18 |
| | M00017 | Methionine biosynthesis, aspartate => homoserine => methionine | 0.45 | 2.50E-14 | 8.07 | 1.40E-14 |
| | M00159 | V-type ATPase, prokaryotes | 0.43 | 2.37E-14 | 14.89 | 7.99E-23 |
| | M00082 | Fatty acid biosynthesis, initiation | 0.43 | 1.23E-11 | 7.08 | 9.69E-16 |
| | M00028 | Ornithine biosynthesis, glutamate => ornithine | 0.41 | 4.15E-10 | 12.05 | 2.11E-12 |
| Comparison within | | | | | | |
| | M00616 | Sulfate-sulfur assimilation | 1.11 | 2.52E-159 | 12.24 | 4.48E-11 |
| | M00596 | Dissimilatory sulfate reduction, sulfate => H2S | 0.85 | 3.90E-112 | 9.18 | 2.17E-12 |
| | M00176 | Assimilatory sulfate reduction, sulfate => H2S | 0.80 | 3.79E-112 | 8.84 | 6.1793E-13 |
| Comparison of | | | | | | |
| | M00019 | Valine/isoleucine biosynthesis, pyruvate => valine / 2-oxobutanoate => isoleucine | 0.65 | 2.31E-17 | 9.50 | 1.50E-06 |
| | M00570 | Isoleucine biosynthesis, threonine => 2-oxobutanoate => isoleucine | 0.62 | 1.20E-14 | 9.11 | 1.37E-06 |
| | M00432 | Leucine biosynthesis, 2-oxoisovalerate => 2-oxoisocaproate | 0.57 | 1.48E-14 | 9.97 | 2.93E-05 |
| | M00535 | Isoleucine biosynthesis, pyruvate => 2-oxobutanoate | 0.57 | 2.06E-12 | 28.20 | 1.09E-05 |
| | M00115 | NAD biosynthesis, aspartate => NAD | 0.56 | 1.55E-12 | 2.05 | 8.11E-05 |
| | M00346 | Formaldehyde assimilation, serine pathway | 0.52 | 3.52E-10 | 4.69 | 3.79E-04 |
| | M00017 | Methionine biosynthesis, aspartate => homoserine => methionine | 0.46 | 8.56E-08 | 30.33 | 6.47E-13 |
| | M00007 | Pentose phosphate pathway, non-oxidative phase, fructose 6P => ribose 5P | 0.45 | 1.12E-08 | 6.96 | 2.05E-07 |
| Comparison of | | | | | | |
| | M00651 | Vancomycin resistance, D-Ala-D-Lac type | 0.45 | 6.08E-12 | 3.49 | 1.69E-07 |
| | M00173 | Reductive citrate cycle (Arnon-Buchanan cycle) | 0.42 | 2.40E-11 | 7.69 | 1.36E-07 |
a The numbers correspond to the numbered taxa shown in the caption of Fig. 4.
b Wilcoxon test (two-sided), Benjamini-Hochberg adjusted.
c Phylogenetic linear modeling (two-sided), Benjamini-Hochberg adjusted.
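The Benjamini-Hochberg adjustment named in the footnotes can be illustrated with a minimal sketch. This is a plain implementation of the standard step-up procedure on a hypothetical list of raw p-values; the function name and inputs are assumptions, and the record does not say which software the authors used for the adjustment.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each raw p-value is scaled by m / rank (rank 1 = smallest), and a
    running minimum taken from the largest rank downward keeps the
    adjusted values monotone and capped at 1.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p-values for four modules.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
```

Adjusted values are never smaller than the raw ones, so a module that stays below the chosen significance level after adjustment also passed it before.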
Comparative genomics of the taxa containing MAGs negatively associated with IA seroconversion.
| Major species containing negative MAGs (Negative MAGs/Total MAGs) | Module ID | Function description | Enrichment analysis (b): effect size | Enrichment analysis (b): adjusted p-value | Phylogenetic regression analysis (c): estimate | Phylogenetic regression analysis (c): adjusted p-value |
|---|---|---|---|---|---|---|
| Comparison of | | | | | | |
| | M00550 | Ascorbate degradation, ascorbate => D-xylulose-5P | 1.07 | 3.78E-125 | 3.92 | 1.97E-10 |
| | M00061 | D-Glucuronate degradation, D-glucuronate => pyruvate + D-glyceraldehyde 3P | 0.79 | 4.94E-89 | 4.95 | 8.80E-18 |
| | M00631 | D-Galacturonate degradation (bacteria), D-galacturonate => pyruvate + D-glyceraldehyde 3P | 0.76 | 1.40E-83 | 6.43 | 2.15E-18 |
| | M00008 | Entner-Doudoroff pathway, glucose-6P => glyceraldehyde-3P + pyruvate | 0.73 | 1.48E-76 | 5.45 | 1.25E-12 |
| | M00006 | Pentose phosphate pathway, oxidative phase, glucose 6P => ribulose 5P | 0.68 | 1.85E-67 | 8.24 | 4.79E-28 |
| | M00003 | Gluconeogenesis, oxaloacetate => fructose-6P | 0.65 | 1.68E-75 | 6.53 | 1.12E-15 |
| | M00116 | Menaquinone biosynthesis, chorismate => menaquinol | 0.61 | 4.76E-50 | 10.07 | 2.55E-10 |
| | M00153 | Cytochrome bd ubiquinol oxidase | 0.60 | 3.42E-59 | 9.35 | 1.31E-05 |
| | M00308 | Semi-phosphorylative Entner-Doudoroff pathway, gluconate => glycerate-3P | 0.54 | 9.00E-48 | 6.59 | 3.99E-21 |
| | M00004 | Pentose phosphate pathway (Pentose phosphate cycle) | 0.53 | 2.15E-51 | 14.10 | 1.88E-28 |
| | M00165 | Reductive pentose phosphate cycle (Calvin cycle) | 0.52 | 1.99E-47 | 7.70 | 3.94E-18 |
| | M00011 | Citrate cycle, second carbon oxidation, 2-oxoglutarate => oxaloacetate | 0.47 | 7.77E-32 | 7.83 | 7.86E-28 |
| | M00532 | Photorespiration | 0.46 | 1.02E-36 | 6.28 | 6.63E-39 |
| | M00001 | Glycolysis (Embden-Meyerhof pathway), glucose => pyruvate | 0.46 | 1.63E-43 | 19.39 | 7.62E-26 |
| | M00167 | Reductive pentose phosphate cycle, glyceraldehyde-3P => ribulose-5P | 0.43 | 6.08E-28 | 13.87 | 1.93E-19 |
| | M00345 | Formaldehyde assimilation, ribulose monophosphate pathway | 0.43 | 5.47E-31 | 19.86 | 6.85E-10 |
| Comparison of | | | | | | |
| | M00529 | Denitrification, nitrate => nitrogen | 1.20 | 1.15E-137 | 20.59 | 9.55E-08 |
| | M00880 | Molybdenum cofactor biosynthesis, GTP => molybdenum cofactor | 1.08 | 3.82E-128 | 11.37 | 2.60E-17 |
| | M00550 | Ascorbate degradation, ascorbate => D-xylulose-5P | 0.96 | 3.08E-105 | 3.87 | 5.24E-09 |
| | M00804 | Complete nitrification, comammox, ammonia => nitrite => nitrate | 0.81 | 8.55E-83 | 8.19 | 4.38E-29 |
| | M00150 | Fumarate reductase, prokaryotes | 0.78 | 2.43E-88 | 15.61 | 9.55E-09 |
| | M00616 | Sulfate-sulfur assimilation | 0.69 | 1.12E-72 | 4.31 | 1.12E-22 |
| | M00095 | C5 isoprenoid biosynthesis, mevalonate pathway | 0.69 | 5.94E-71 | 5.76 | 2.92E-16 |
| | M00718 | Multidrug resistance, efflux pump MexAB-OprM | 0.67 | 5.63E-61 | 21.27 | 3.85E-43 |
| | M00546 | Purine degradation, xanthine => urea | 0.65 | 8.20E-58 | 4.92 | 1.44E-19 |
| | M00167 | Reductive pentose phosphate cycle, glyceraldehyde-3P => ribulose-5P | 0.64 | 2.99E-56 | 16.59 | 1.08E-44 |
| | M00879 | Arginine succinyltransferase pathway, arginine => glutamate | 0.63 | 3.97E-64 | 3.93 | 4.92E-21 |
| | M00087 | beta-Oxidation | 0.62 | 1.39E-61 | 3.42 | 9.28E-09 |
| | M00761 | Undecaprenylphosphate alpha-L-Ara4N biosynthesis, UDP-GlcA => undecaprenyl phosphate alpha-L-Ara4N | 0.56 | 5.70E-51 | 2.89 | 4.27E-08 |
| | M00417 | Cytochrome o ubiquinol oxidase | 0.55 | 3.85E-51 | 2.89 | 2.06E-08 |
| | M00170 | C4-dicarboxylic acid cycle, phosphoenolpyruvate carboxykinase type | 0.53 | 5.49E-42 | 15.96 | 9.14E-50 |
| | M00004 | Pentose phosphate pathway (Pentose phosphate cycle) | 0.52 | 2.09E-40 | 22.59 | 1.10E-17 |
| | M00088 | Ketone body biosynthesis, acetyl-CoA => acetoacetate/3-hydroxybutyrate/acetone | 0.51 | 6.75E-43 | 17.61 | 1.11E-11 |
| | M00006 | Pentose phosphate pathway, oxidative phase, glucose 6P => ribulose 5P | 0.50 | 7.30E-40 | 5.24 | 5.18E-16 |
| | M00615 | Nitrate assimilation | 0.49 | 4.46E-38 | 13.86 | 2.17E-37 |
| | M00008 | Entner-Doudoroff pathway, glucose-6P => glyceraldehyde-3P + pyruvate | 0.48 | 7.42E-39 | 9.68 | 7.37E-08 |
| | M00165 | Reductive pentose phosphate cycle (Calvin cycle) | 0.48 | 1.03E-38 | 18.87 | 7.92E-22 |
| | M00061 | D-Glucuronate degradation, D-glucuronate => pyruvate + D-glyceraldehyde 3P | 0.47 | 2.22E-40 | 2.89 | 8.20E-15 |
| | M00345 | Formaldehyde assimilation, ribulose monophosphate pathway | 0.45 | 3.02E-30 | 19.62 | 3.65E-03 |
| | M00034 | Methionine salvage pathway | 0.43 | 4.56E-27 | 18.65 | 3.89E-38 |
| | M00579 | Phosphate acetyltransferase-acetate kinase pathway, acetyl-CoA => acetate | 0.42 | 2.03E-30 | 4.47 | 1.24E-13 |
| | M00631 | D-Galacturonate degradation (bacteria), D-galacturonate => pyruvate + D-glyceraldehyde 3P | 0.41 | 6.69E-31 | 4.41 | 9.77E-26 |
a The numbers correspond to the numbered taxa shown in the caption of Fig. 4.
b Wilcoxon test (two-sided), Benjamini-Hochberg adjusted.
c Phylogenetic linear modeling (two-sided), Benjamini-Hochberg adjusted.