Yan Shao1, Samuel C Forster1,2,3, Evdokia Tsaliki4, Kevin Vervier1, Angela Strang4, Nandi Simpson4, Nitin Kumar1, Mark D Stares1, Alison Rodger4, Peter Brocklehurst5, Nigel Field6, Trevor D Lawley7.
1. Host-Microbiota Interactions Laboratory, Wellcome Sanger Institute, Hinxton, UK.
2. Centre for Innate Immunity and Infectious Diseases, Hudson Institute of Medical Research, Clayton, Victoria, Australia.
3. Department of Molecular and Translational Sciences, Monash University, Clayton, Victoria, Australia.
4. Institute for Global Health, University College London, London, UK.
5. Birmingham Clinical Trials Unit, University of Birmingham, Birmingham, UK.
6. Institute for Global Health, University College London, London, UK. nigel.field@ucl.ac.uk.
7. Host-Microbiota Interactions Laboratory, Wellcome Sanger Institute, Hinxton, UK. tl2@sanger.ac.uk.
Abstract
Immediately after birth, newborn babies experience rapid colonization by microorganisms from their mothers and the surrounding environment[1]. Diseases in childhood and later in life are potentially mediated by the perturbation of the colonization of the infant gut microbiota[2]. However, the effects of delivery via caesarean section on the earliest stages of the acquisition and development of the gut microbiota, during the neonatal period (≤1 month), remain controversial[3,4]. Here we report the disrupted transmission of maternal Bacteroides strains, and high-level colonization by opportunistic pathogens associated with the hospital environment (including Enterococcus, Enterobacter and Klebsiella species), in babies delivered by caesarean section. These effects were also seen, to a lesser extent, in vaginally delivered babies whose mothers underwent antibiotic prophylaxis and in babies who were not breastfed during the neonatal period. We applied longitudinal sampling and whole-genome shotgun metagenomic analysis to 1,679 gut microbiota samples (taken at several time points during the neonatal period, and in infancy) from 596 full-term babies born in UK hospitals; for a subset of these babies, we collected additional matched samples from mothers (175 mothers paired with 178 babies). This analysis demonstrates that the mode of delivery is a significant factor that affects the composition of the gut microbiota throughout the neonatal period, and into infancy. Matched large-scale culturing and whole-genome sequencing of over 800 bacterial strains from these babies identified virulence factors and clinically relevant antimicrobial resistance in opportunistic pathogens that may predispose individuals to opportunistic infections. 
Our findings highlight the critical role of the local environment in establishing the gut microbiota in very early life, and identify colonization with antimicrobial-resistance-containing opportunistic pathogens as a previously underappreciated risk factor in hospital births.
The acquisition and development of the early-life gut microbiota follow successive waves of microbial exposures and colonisation that shape the longer-term microbiota composition and function[5]. Early life events that could perturb the gut microbiota composition, including Caesarean section delivery[1,6], formula feeding[7,8] and antibiotic exposure[8,9], are associated with the development of childhood asthma and atopy[10-12]. While recent studies[8,9,13-15] have provided substantial insights into gut microbiota development during the first 3 years of life, many were limited by the taxonomic resolution provided by 16S rRNA gene profiling, small sample size or limited sampling during the first month of life (neonatal period). High-resolution metagenomic studies of large, longitudinal cohorts are required to establish the impact and risks of early life events on gut microbiota assembly, particularly during the neonatal period, when pioneering microbes could influence subsequent microbiota and immune system development[16,17].
To characterise the trajectory of gut microbiota acquisition and development during the neonatal period, we enrolled 596 healthy, term babies (39.5 ± 1.37 gestation weeks, 314 vaginal and 282 C-section births, Fig. 1a, Extended Data Table 1, Supplementary Table 1) through the Baby Biome Study (BBS). Faecal samples were collected from all babies at least once during their neonatal period (<1 month), with 302 babies re-sampled later in infancy (8.75 ± 1.98 months). Maternal faecal samples were also obtained from 175 mothers paired with 178 babies. Metagenomic analysis of 1,679 faecal samples from 771 babies and mothers revealed temporal dynamics of gut microbiota development (Fig. 1b) and increased diversity with age (Extended Data Fig. 1a). Strikingly, the gut microbiotas exhibited substantial heterogeneity (inter-individual) and instability (intra-individual) during the first weeks of life (Extended Data Fig. 1b). 
Inter-individual differences explained 57% of the microbial taxonomic variation (Permutational multivariate analysis of variance (PERMANOVA), P < 0.001, 1,000 permutations), followed by sampling age at 5.7% of the variance (P < 0.001). These results indicate that the gut microbiotas were highly dynamic and individualised during the neonatal period, even more than observed in infancy (Extended Data Fig. 1c).
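The variance partitioning reported above can be illustrated with a minimal one-factor PERMANOVA on Bray–Curtis dissimilarities. This is only a sketch under stated assumptions: the software used in the study is not named in this section, the toy abundance profiles and group labels are invented for illustration, and the function names (`bray_curtis`, `permanova`) and the default of 999 permutations are hypothetical choices rather than the authors' settings.

```python
import numpy as np

def bray_curtis(profiles):
    """Pairwise Bray-Curtis dissimilarity between rows of an abundance matrix."""
    x = np.asarray(profiles, dtype=float)
    n = x.shape[0]
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            denom = (x[i] + x[j]).sum()
            d[i, j] = d[j, i] = np.abs(x[i] - x[j]).sum() / denom if denom > 0 else 0.0
    return d

def permanova(dist, groups, permutations=999, seed=0):
    """One-factor PERMANOVA. Returns (R2, pseudo-F, permutation P value)."""
    groups = np.asarray(groups)
    n = len(groups)
    d2 = dist ** 2
    # Total sum of squares from all pairwise squared distances.
    ss_total = d2[np.triu_indices(n, k=1)].sum() / n

    def pseudo_f(labels):
        uniq = np.unique(labels)
        ss_within = 0.0
        for g in uniq:
            idx = np.where(labels == g)[0]
            sub = d2[np.ix_(idx, idx)]
            ss_within += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
        ss_among = ss_total - ss_within
        f = (ss_among / (len(uniq) - 1)) / (ss_within / (n - len(uniq)))
        return f, ss_among

    f_obs, ss_among = pseudo_f(groups)
    rng = np.random.default_rng(seed)
    # Count label shuffles whose pseudo-F is at least as large as the observed one.
    hits = sum(pseudo_f(rng.permutation(groups))[0] >= f_obs for _ in range(permutations))
    p_value = (hits + 1) / (permutations + 1)
    return ss_among / ss_total, f_obs, p_value

# Invented toy profiles (rows: samples, columns: species), NOT study data.
toy = np.array([
    [70, 20, 10, 0],
    [65, 25, 10, 0],
    [60, 30, 5, 5],
    [5, 10, 40, 45],
    [0, 15, 45, 40],
    [10, 5, 35, 50],
])
labels = ["group_a"] * 3 + ["group_b"] * 3
dist = bray_curtis(toy)
r2, f_stat, p_value = permanova(dist, labels)
print(f"R2={r2:.3f}, pseudo-F={f_stat:.1f}, P={p_value:.3f}")
```

R2 here is the share of total between-sample variation attributable to the grouping factor, which is the quantity quoted in the text as a percentage of explained variance.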
Fig. 1
Developmental dynamics of the neonatal gut microbiota.
a, Longitudinal metagenomic sampling of 1,679 early-life gut microbiotas of 771 individuals from three participating hospitals (A, B, C) of the Baby Biome Study. Each row corresponds to the time course of a subject, comprising 596 babies sampled during the neonatal period primarily on day 4 (n=310), 7 (n=532) and 21 (n=325), in infancy (8.75 ± 1.98 months of age, n = 302), and from matched mothers (n = 175). b, Non-metric multidimensional scaling (NMDS) ordination of Bray–Curtis dissimilarity (n = 917) between the species relative abundance profiles of the gut microbiota sampled from babies on day 4, day 7, day 21, in infancy and from mothers.
Extended Data Table 1
Baseline clinical characteristics of the Baby Biome Study cohort.
Complete clinical metadata of the study participants reported in Supplementary Table 1.
Extended Data Fig. 1
The neonatal gut microbiotas exhibited high volatility and individuality.
a, Microbiota diversity (alpha diversity) increased over developmental time. The violin plot outlines illustrate kernel probability density, with the width of the shaded area representing the proportion of the data shown. Centre lines show the medians; box limits indicate the 25th and 75th percentiles; whiskers extend 1.5 times the interquartile range from the 25th and 75th percentiles, outliers are represented by dots. Number of gut microbiotas on day 4 (n=310), 7 (n=532) and 21 (n=325), in infancy (n = 302), and from matched mothers (n = 175). b-c, Gut microbiota stability, stratified by inter-individual (day 4, n=310; day 7, n=532; day 21, n=325) and intra-individual comparisons in sliding time windows (day 4 to 7, n = 274; day 7 to 21, n=285) during the neonatal period (b), in the context of the overall infancy period (c) with the TEDDY study microbiota stability measurements (earliest measurements on day 90, and year 3) plotted in crosses. Solid lines show the median per time window. Shaded areas show the 99% confidence interval estimated using binomial distribution. Error bars indicate median absolute deviation. Statistical significance between groups was determined by two-sided Wilcoxon rank-sum test.
To determine the impact of clinical covariates on the composition of the gut microbial community, we performed cross-sectional PERMANOVA, stratified by age. Mode of delivery was the most significant factor driving gut microbiota variation during the neonatal period (Fig. 2a, Supplementary Table 2), while other clinical covariates associated with hospital birth (e.g. perinatal antibiotics, duration of hospital stay) and breastfeeding exhibited smaller effects (Supplementary Note 1). The largest effect of delivery mode was observed on day 4 (Extended Data Fig. 2, R2=7.64%, P<0.001), which dissipated with age but remained significant at the point of infancy sampling (R2=1.00%, P<0.01). No difference was observed in maternal gut microbiotas by delivery modes or neonatal gut microbiotas between elective and emergency C-section births (Supplementary Table 2).
Fig. 2
Perturbed neonatal gut microbiota composition and development associated with the mode of delivery
a, Bar plot illustrating the clinical covariates associated with the neonatal gut microbiota variations on day 4 (n=310), day 7 (n=532), day 21 (n=325) and in infancy (n=302). Only the statistically significant associations in cross-sectional tests are shown. Covariates are ranked by the number of statistically significant effects observed across sampling age groups. The proportion of explained variance (R2) and statistical significance were determined by PERMANOVA on between-sample Bray-Curtis distances. b, Longitudinal changes in the mean relative abundance (RA) of faecal bacteria at the genus level sampled on days 4, 7 and 21 of life and in infancy, for genera with > 1% RA across all neonatal period samples. Vaginal, n=744 from 310 babies; C-section, n=725 from 281 babies.
Extended Data Fig. 2
Microbiota variation associated with mode of delivery in the neonatal period and infancy.
Non-metric multidimensional scaling (NMDS) ordination of Bray–Curtis dissimilarity between the species relative abundance profiles of the gut microbiota sampled from babies on day 4 (vaginal, n=157; C-section, n=153), day 7 (vaginal, n=280; C-section, n=252), day 21 (vaginal, n=147; C-section, n=178), during infancy (vaginal, n=160; C-section, n = 142) and from mothers (vaginal, n=110; C-section, n=65). Microbial variation explained by mode of delivery is represented by the PERMANOVA R2 value (bottom left) and is statistically significant across four cross-sectional PERMANOVA tests (FDR-corrected p-values reported in Supplementary Table 2).
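Because the same covariate is tested at several sampling ages, the legend above refers to FDR-corrected p-values. The correction procedure is not named in this section; the sketch below assumes the widely used Benjamini-Hochberg step-up adjustment, and the input p-values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Invented raw p-values for four hypothetical cross-sectional tests.
raw = [0.001, 0.01, 0.03, 0.5]
adj = benjamini_hochberg(raw)
print(adj)
```

Each adjusted value is the raw p-value scaled by the number of tests divided by its rank, capped so that adjusted values never decrease as raw p-values increase.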
Given the significant effect of the mode of delivery during the neonatal period, we next sought to understand how the microbiota composition and developmental trajectory were altered. Samples from babies delivered vaginally were enriched with Bifidobacterium (e.g. B. longum, B. breve), Escherichia (E. coli) and Bacteroides/Parabacteroides species (e.g. B. vulgatus, P. distasonis), with these commensal genera comprising 68.3% (95% CI 65.7-71.0%) of the neonatal gut microbial communities (Fig. 2b, Supplementary Table 3), which validated recent observations in other cohorts[4,13]. In contrast, the gut microbiota of C-section delivered babies was depleted of these commensal genera and instead was dominated by Enterococcus (E. faecalis, E. faecium), Staphylococcus epidermidis, Streptococcus parasanguinis, Klebsiella (K. oxytoca, K. pneumoniae), Enterobacter cloacae and Clostridium perfringens, which are commonly associated with hospital environments[18] and hospitalised preterm babies[19-21]. On day 4, species belonging to these genera accounted for 68.25% (95% CI 62.74-73.75%) of the total microbiota composition in C-section delivered babies (Fig. 2b).
Previous studies reported that, compared to C-section delivered babies, the gut microbiotas of vaginally delivered babies were enriched in lactobacilli associated with the mother’s vaginal microbiota[1,22]. However, here we observed no statistical difference in the prevalence (vaginal 11.9% vs C-section 15.7% present at over 1% abundance) or abundance of Lactobacillus between vaginally (1.217%, 95% CI 0.81-1.621%) and C-section (2.21%, 95% CI 1.54-2.88%) delivered babies. Rather, commensal species from the Bacteroides genus were detected at high abundance in the gut microbiota of 49.0% (154/314) of vaginally delivered babies (mean relative abundance 8.13%, 95% CI 6.88-9.39%, Extended Data Fig. 3). 
In contrast, Bacteroides species were low or absent in 99.6% (281/282) of C-section delivered babies (mean relative abundance 0.43%, 95% CI 0.11-0.74%). In 60.6% (86/142) of the C-section babies, this low-Bacteroides profile (defined in Methods) persisted into infancy, when Bacteroides became the only differentially abundant species between vaginally and C-section delivered babies (Supplementary Table 3). Although we could not assess the independent effect of maternal antibiotic exposure during C-section delivery, as antibiotics were administered in all C-section deliveries, among vaginally delivered babies we observed a statistically significant association between the low-Bacteroides profile and maternal intrapartum antibiotic prophylaxis (IAP, OR=1.77, 95% CI: 1.17-2.71, P=0.0074), which also accounted for the greatest amount of gut microbiota variation in vaginally delivered babies (R2=5.88-13.6%, Supplementary Table 2). These results expand on previous findings[9,23] and further highlight a low-Bacteroides profile as the perturbation signature associated with C-section and maternal IAP in vaginal delivery.
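The association between IAP and the low-Bacteroides profile is summarised above as an odds ratio with a 95% confidence interval. As a rough illustration of how such a figure is derived from a 2x2 table, the sketch below uses the logit (Woolf) interval, which is an assumption because the interval method is not stated in this section. The counts are hypothetical and are not the study's data.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (logit) confidence interval for a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the logit approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts: IAP-exposed (30 low-Bacteroides, 20 normal),
# unexposed (40 low-Bacteroides, 60 normal).
or_value, ci_low, ci_high = odds_ratio_ci(30, 20, 40, 60)
print(f"OR={or_value:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

An interval that excludes 1, as in the reported result, indicates that the odds of the low-Bacteroides profile differ between exposed and unexposed babies at the chosen confidence level.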
Extended Data Fig. 3
Microbial succession in the vaginally-delivered neonatal gut microbiota over the first 21 days of life.
Bar plots show longitudinal changes in the mean relative abundance (RA) of faecal bacteria at the genus level on days 4, 7 and 21 of life, for genera with > 1% RA across all neonatal samples. Left panel: n = 316 from 160 vaginally delivered babies detected with Bacteroides; right panel: n = 290 from 154 vaginally delivered babies with low Bacteroides status (defined in Methods).
Maternal transmission of gastrointestinal bacteria to their babies is an underappreciated form of kinship[24]. To assess if the neonatal microbiota variation could be attributed to differential transmission of maternal microbiota, we profiled the bacterial strain transmission across 178 mother-baby dyads. We show that the majority of maternal strain transmissions during the neonatal period occurred in vaginally delivered babies (74.39%), at much higher frequency in comparison with those delivered by C-section (12.56%, Fisher’s exact test, P<0.0001, Fig 3a, Extended Data Fig. 4, Supplementary Table 4). Bacteroides spp., Parabacteroides spp., E. coli and Bifidobacterium spp. were most frequently transmitted from mothers to babies through vaginal birth, in agreement with previous observation in smaller cohorts[4,25-27]. For Bacteroides species such as B. vulgatus (Fig. 3b), the lack of transmission continued far beyond the neonatal period in C-section born babies[25] with the late transmission of B. vulgatus rarely detected later in infancy. This is in contrast to the transmission pattern of other common early colonisers such as B. longum (Fig. 3c) and E. coli, for which colonisations of maternal strains occurred more frequently later in infancy (Fisher’s exact tests, P=0.0479 and P=0.0226, respectively). This result highlights the neonatal period as a critical early window of maternal transmission with the disrupted transmission of pioneering Bacteroides species evident in C-section babies with long-term Bacteroides absence.
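The comparison of transmission frequencies between delivery modes relies on Fisher's exact test. A minimal two-sided implementation using only the standard library is sketched below; the function name is hypothetical and the example counts are invented, since the underlying event counts are not given in this section.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum probabilities of all tables at least as extreme as the observed one.
    p_value = sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)

# Invented counts: rows are delivery modes, columns are
# (maternal strain transmitted, not transmitted).
p_small = fisher_exact_two_sided(3, 1, 1, 3)
p_large = fisher_exact_two_sided(60, 20, 10, 70)
print(p_small, p_large)
```

The small relative tolerance in the comparison guards against floating-point ties between equally probable tables.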
Fig. 3
Disrupted maternal strain transmission in C-section-delivered babies.
a, Early and late transmission of the maternal strains in mother-baby pairs (vaginal: 35, C-section: 24) longitudinally sampled during the neonatal (early) and infancy (late) period. Only the frequently shared species detected with sufficient coverage for strain analysis in more than 10 pairs are shown. b, c, Transmission events of maternal B. vulgatus (b) and B. longum (c) strains in vaginally delivered and C-section delivered babies over time. In each row of mother-baby paired samples, each circle represents a detectable strain either identical to (filled) or distinct from (hollow) the maternal strain. Across the rows, identical strains are linked by a solid line representing early transmission and persistence to infancy, while the dashed line indicates late transmission.
Extended Data Fig. 4
Maternal strain transmission during the early neonatal period.
Maternal strain transmission across 178 mother-baby pairs (vaginal: 112, C-section: 66) sampled at least once during the early neonatal period. Only the frequently shared species detected with sufficient coverage for strain analysis in more than 10 pairs are shown. The neighbour-joining tree is constructed based on the pairwise mash distances of the respective reference genomes. Phylogenetically related species shared similar transmission timing pattern, for example the frequent transmission of Bacteroides/Parabacteroides spp. and Bifidobacterium spp. in vaginally delivered babies and the lack thereof in C-section born babies; and that most Streptococcus species were transmitted from other sources (non-maternal) in the environment.
While C-section babies were deprived of maternally transmitted commensal bacteria, they had a substantially higher relative abundance of opportunistic pathogens commonly associated with the healthcare environment. These enriched species included E. faecalis, E. faecium, E. cloacae, K. pneumoniae, K. oxytoca and C. perfringens (Fig. 4a, Supplementary Table 3), some of which are members of the ESKAPE pathogens responsible for the majority of nosocomial infections[28]. Indeed, their frequent gut microbiota colonisation in C-section newborns was under-reported in previous smaller cohorts[3,13] with insufficient statistical power (Supplementary Note 2). Among C-section born babies, 83.7% carried opportunistic pathogen species during the neonatal period (as defined in Methods), in comparison to 49.4% of the vaginally born babies (Fig. 4a). During the first 21 days of life, these healthcare-associated opportunistic pathogens accounted for 30.4% (95% CI 27.86-32.96%) of the species level abundance in the gut microbiota of C-section babies, compared to 9.8% (95% CI 8.19-11.4%) in the vaginal babies, with the greatest difference observed on day 4 (Extended Data Fig. 5a). Longitudinally, the difference in combined opportunistic pathogen abundance persisted in the C-section babies re-sampled later in infancy (C-section 2.8% versus vaginal 1.6%, P=0.0375, Welch’s t-test). Interestingly, frequent and abundant carriage of opportunistic pathogens was also observed in low-Bacteroides vaginally delivered babies (Extended Data Fig. 5b), while the absence of breastfeeding during the neonatal period was associated with a higher carriage of C. perfringens, K. oxytoca and E. faecalis (Supplementary Table 3).
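The two summary quantities used in this paragraph, mean relative abundance with a 95% confidence interval and carriage frequency, can be sketched as follows. The >1% relative abundance carriage cut-off follows the Fig. 4 legend; the normal-approximation interval, the function name and the example values are assumptions made for illustration, and the study's exact carriage definition is given in its Methods rather than here.

```python
import math
from statistics import mean, stdev

def summarise_abundance(ra_percent, threshold=1.0, z=1.96):
    """Mean relative abundance (%), normal-approximation CI and carriage frequency."""
    n = len(ra_percent)
    avg = mean(ra_percent)
    # Half-width of the interval from the standard error of the mean.
    half_width = z * stdev(ra_percent) / math.sqrt(n)
    # Carriage: fraction of samples above the relative abundance threshold.
    carriage = sum(value > threshold for value in ra_percent) / n
    return {"mean": avg, "ci": (avg - half_width, avg + half_width), "carriage": carriage}

# Invented per-sample relative abundances (%) of one species, NOT study data.
example = [0.0, 0.5, 2.0, 10.0, 30.0]
summary = summarise_abundance(example)
print(summary)
```

With strongly skewed abundance data such as these, a bootstrap interval may be a more robust alternative to the normal approximation.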
Fig. 4
Extensive and frequent colonisation of C-section delivered babies with diverse opportunistic pathogen species previously associated with healthcare infection.
a, The mean relative abundance (RA) and frequency (>1% RA) of six opportunistic pathogen species enriched in C-section born babies (n=596), compared to vaginal-born babies (n=606) during the first 21 days of life, in the context of the maternal level carriage (n=175). Error bars indicate the 95% CI of the mean RA. Statistical significance (P values indicated above) of the difference in RA and combined pathogen carriage frequency between vaginal and C-section babies was determined by two-sided Wilcoxon signed-rank test and Fisher’s exact tests, respectively. b, Phylogenetic representation of 836 bacterial strains cultured from raw faecal samples, including six opportunistic pathogens across five major genera: Enterococcus spp. (red, n=451); Clostridium spp. (yellow, n=24); Klebsiella spp. (blue, n=235); Enterobacter spp. (green, n=52) and Escherichia spp. (purple, n=41). c, Phylogeny of the BBS E. faecalis isolates (n=282) in the context of public isolates from the UK hospitals (n=168), human gut microbiotas (n=28) and environmental sources (n=27) with the high-risk UK epidemic lineage branches coloured in blue. Midpoint-rooted maximum likelihood tree is based on SNPs in 1,656 core genes. d, Diverse Enterobacter-Klebsiella complex strain populations among the BBS collection (n=202), in the context of the UK hospital (n=604), human gut microbiota (n=37) and environmental strains (n=120).
Extended Data Fig. 5
Frequency and abundance of opportunistic pathogens in the gut microbiotas.
a-b, C-section and low-Bacteroides vaginally delivered babies were more frequently carrying opportunistic pathogens (as defined in Methods) and at higher level of species relative abundance (RA), compared to vaginally delivered babies (a) and normal-Bacteroides vaginally delivered babies (b), respectively. Significant differential presence in neonatal samples within each major neonatal period sampling groups (day 4 (n = 310), 7 (n = 532) and 21 (n = 325)) in terms of mean (RA) and frequency of six known opportunistic pathogens associated with the healthcare environment, which are rarely carried by adults (mothers, n=175) (b). Number of individuals sampled in the neonatal period: vaginal, n=314; vaginal-normal (Bacteroides level), n=160; vaginal-low (Bacteroides level), n=154. Error bars indicate the 95% CI of the mean RA. Statistical significance in mean species RA and combined pathogen carriage (defined in Methods) frequency was obtained by applying two-sided Wilcoxon signed-rank test and Fisher’s exact test, respectively.
Given the prevalent carriage of opportunistic pathogens in the neonatal gut metagenomes, we sought to validate their presence and viability with culturing. We undertook targeted large-scale culturing of 836 opportunistic pathogen strains from the faecal samples of 177 babies (70 vaginal and 107 C-section babies, total 741 isolates) and 38 mothers (95 isolates) using selective media (Fig. 4b, Supplementary Table 5). Subsequent WGS and genomic characterisation of E. faecalis (n=356), E. cloacae (n=52), K. oxytoca (n=150) and K. pneumoniae (n=78) allowed us to perform high-resolution phylogenetic analysis and to delineate strain-specific carriage of AMR genes and virulence factors.
Focusing on the most prevalent opportunistic pathogen in C-section born babies, we analysed the genomes of a diverse population of BBS E. faecalis strains in the context of publicly available genomes of human and environmental strains (Fig. 4c). We found that 53.9% of the BBS strains were represented by five major lineages, each of which was distributed across vaginal and C-section babies and mothers in the three BBS hospitals (Extended Data Fig. 6a) and UK hospital patients, but did not include high-risk UK epidemic lineages enriched in multi-drug resistance (MDR) and virulence. In congruence with the phylogenetic placement of the BBS strains with the human gastrointestinal and environmental strains, these non-epidemic E. faecalis exhibited comparable levels of carriage of AMR genes (Extended Data Fig. 6b-e, Supplementary Note 3). Similar to E. faecalis, the BBS Enterobacter and Klebsiella strains also exhibited high-level population diversity with the phylogenetic under-representation of epidemic lineages (Fig. 4d, Extended Data Fig. 7), and levels of AMR and virulence gene carriage indicative of non-epidemic lineages circulating in hospital environments and healthy populations, rather than hypervirulent and ESBL-enriched epidemic lineages (Extended Data Fig. 8, Supplementary Note 3). 
Given the prior isolation of the major BBS lineages in hospitalised patients and their AMR and virulence capabilities, any level of opportunistic pathogen carriage represents a significant risk of future infections, especially for the C-section born babies with high prevalence (83.7%) of carriage.
Extended Data Fig. 6
Phylogeny and pathogenicity potential of the BBS E. faecalis strains.
a, Phylogenetic tree of the BBS E. faecalis strains (n=282, isolated from 269 faecal samples of 160 subjects). Midpoint-rooted maximum likelihood tree is based on SNPs in 1,827 core genes. The five major lineages (>10 BBS strain representatives; ST179, n=60; ST16, n=30; ST40, n=27; ST30, n=21; ST191, n=14) identified with the UK hospital collection were distributed across the three hospitals in this study, with no phylogroup limited to any single hospital. Solid lines indicate intra-subject persistence (n=92 in 67 babies). Dashed lines indicate phylogenetically distinct strains isolated from longitudinal samples (n=18) or mother-baby paired samples (yellow, n=10), with arrows indicating the direction of potential transmission (early-to-later or mother-to-baby). Where multiple identical strains (no SNP difference in species core-genome) were isolated from the same faecal sample, only one representative strain was included in the species phylogenetic tree (total number of strains, n=356). b-e, Prevalence of virulence genes (b-c) and AMR genes (grouped by antibiotic class) (d-e) detected in the BBS E. faecalis strains. Statistical significance results shown are coloured according to the group with higher frequency of detected genes by two-sided Fisher's exact test between the groups of gut microbiotas (n=28) versus BBS strains (n=356), and BBS versus the UK hospital epidemic strains (n=89, tree branches coloured blue in Fig. 4c). ****P<0.0001, ***P < 0.001, **P < 0.01, *P < 0.05. Virulence genes: asa1, EF0149, EF0485, prgB = Aggregation substance; esp = enterococcal surface protein; Exoenzymes: gelE = gelatinase; EF0818, EF3023 = hyaluronidase (spreading factor); sprE = serine protease; fsr = Quorum sensing system; Toxin: cyl = cytolysin. Genes detected across all isolates (dfrE, efrA, efrB, emeA, lsaA) are not shown. 
AMR genes: Am = aminoglycosides (aph3"-III, ant(6)-Ia, aph(2''), str); Chlor = chloramphenicol (catA); Linc = lincosamides (lnuB); MLSB = macrolide, lincosamide, streptogramin B (ermB or ermT); Tet = tetracycline (tetL, tetM, tetO, tetS); Trim = trimethoprim (dfrC, dfrD, dfrF or dfrG); Vanc = vancomycin.
Extended Data Fig. 7
Phylogenies of the BBS E. cloacae, K. oxytoca and K. pneumoniae strains.
a-f, Midpoint-rooted core-genome maximum likelihood trees of the E. cloacae complex, K. oxytoca and K. pneumoniae strains isolated in this study (a-c) and in the context of public genomes (d-f). a-c, Number of BBS strains of E. cloacae (a, n=37, isolated from 37 faecal samples of 30 subjects, 1,861 core genes), K. oxytoca (b, n=107, isolated from 90 faecal samples of 62 subjects, 2,910 core genes) and K. pneumoniae strains (c, n=53, isolated from 47 faecal samples of 35 subjects, 3,471 core genes). Solid lines indicate intra-subject strain persistence (E. cloacae, n=5; K. oxytoca, n=25 in 18 babies; K. pneumoniae, n=11 in 8 babies). Dashed lines indicate phylogenetically distinct strains isolated from longitudinal samples (E. cloacae, n=2; K. oxytoca, n=7 in 6 subjects; K. pneumoniae, n=1), with arrows indicating the direction of potential transmission (early-to-later samples). Where multiple identical strains (no SNP difference in species core-genome) were isolated from the same faecal sample, only one representative strain was included in the species phylogenetic tree (number of non-redundant BBS strains: E. cloacae, n=52; K. oxytoca, n=150; K. pneumoniae, n=78). For each species, the main phylogroups identified with the UK hospital collection (E. cloacae: III, VIII; K. oxytoca: KoI, KoII, KoV, KoVI; K. pneumoniae: KpI, KpII, KpIII) were distributed across the three hospitals in this study, with no phylogroup limited to any single hospital. d-f, Number of public genomes included in the phylogenetic analysis of E. cloacae (d, UK hospitals, n=314; gut microbiotas, n=8; environmental sources, n=43; 1,484 core genes), K. oxytoca (e, UK hospitals, n=40; gut microbiotas, n=9; environmental sources, n=8; 3,399 core genes), and K. pneumoniae strains (f, UK hospitals, n=250; gut microbiotas, n=17; environmental sources, n=66; 2,510 core genes).
Extended Data Fig. 8
Prevalence of AMR and virulence in the Klebsiella and Enterobacter strains.
a-d, Frequency and heatmaps of isolates for putative AMR genes (grouped by antibiotic class) (a-b) and virulence genes (c-d) most frequently detected in the UK hospital collection strains of E. cloacae (green), K. oxytoca (orange) and K. pneumoniae (blue). Statistical significance results shown are coloured according to the group with higher frequency of detected genes by two-sided Fisher's exact test between the groups of gut microbiota (E. cloacae, n=8; K. oxytoca, n=9; K. pneumoniae, n=17) versus BBS strains (E. cloacae, n=52; K. oxytoca, n=150; K. pneumoniae, n=78), and BBS versus the UK hospital strains (E. cloacae, n=314; K. oxytoca, n=40; K. pneumoniae, n=250). ****P<0.0001, ***P < 0.001, **P < 0.01, *P < 0.05. AMR genes: extended-spectrum beta-lactamases (ESBLs): blaSHV, blaCTX-M, blaTEM; other beta-lactamases: blaOXA, blaOXY, blaACT, blaLEN; Tet = tetracycline (tetA, tetR); Am = aminoglycosides (aac(3), aac(6’), aad, str). Virulence genes: iron acquisition: fyu, ybt = yersiniabactin, kfu = iron transporter permease, irp = iron regulatory proteins; all = allantoin metabolism; wzi = capsule; iutA = aerobactin siderophore receptor; mrk = fimbriae and biofilm formation; fli = flagella biosynthesis; iro = siderophore production; lpf = fimbrial chaperones. Genes detected across all isolates are not shown.
Whilst there is insufficient evidence from metagenomics and cultured isolate WGS that indicates an apparent maternal origin of the opportunistic pathogens (Supplementary Note 4), the absence of lineage-specific colonisation suggests hospital environmental exposure as the primary factor driving opportunistic pathogen colonisation of the BBS babies. Although our study was not designed for retrospective sampling of the hospital environmental sources, opportunistic pathogens are frequently found in hospital environments, where hospital-born babies have been shown to carry the same bacteria present in operating rooms[29] and neonatal intensive care units[30].
Undertaking the largest, longitudinal WGS characterisation of the human gut microbiota in the previously under-sampled neonatal period (≤1 month), we consolidate the recent findings that mode of delivery is a major factor shaping the gut microbiota in the first few weeks of life[4], with the diminished effect persisting into infancy[14,15]. The disrupted transmission of the maternal gastrointestinal bacteria, particularly the pioneering Bacteroides species in birth via C-section and maternal IAP, predisposed newborn babies to colonisation by clinically important opportunistic pathogens circulating in healthcare and hospital environments. However, the clinical consequences of the early life microbiota perturbations and carriage of immunogenic pathogens during this critical window of immune development remain to be determined. This highlights the need for large-scale, long-term cohort studies that also sample home births[31] to better understand the consequence of hospital birth and establish if neonatal microbiota perturbation negatively impacts health outcomes in childhood and later life.
Methods
Study population
The study was approved by the NHS London - City and East Research Ethics Committee (REC reference 12/LO/1492). Participants were recruited at the Barking, Havering and Redbridge University Hospitals NHS Trust (BHR), the University Hospitals Leicester NHS Trust (LEI), and the University College London Hospitals NHS Foundation Trust (UCLH), through the Baby Biome Study (previously Life Study enhancement pilot study) from May 2014 to December 2017. Mothers provided written, informed consent to participate and for their children to participate in the study. The study was performed in compliance with all relevant ethical regulations.
Sample collection
Faecal samples were collected from babies with at least one sample in the first 21 days of life, primarily on day 4, 7 or 21. For a subset of babies who provided neonatal samples, a follow-up faecal sample was collected between 4 and 12 months of age. Maternal faecal samples were collected in the maternity unit before or after delivery, or stool was collected during delivery by midwives. Baby samples were collected at home by mothers and returned to the processing laboratory by post at ambient temperature within 24 hours. On arrival at the lab, all faecal samples were immediately stored at 4°C for an average of 2.41 days (95% CI 2.06-2.76 days) before further processing. Samples were aliquoted into six vials, four of which were stored at -80°C for raw faeces biobanking while the other two vials were processed immediately for DNA extraction. Although this sample storage protocol (no preservation buffer for room temperature and 4°C storage) was shown to be robust to technical variation in microbiome profiles at the time of study design (Supplementary Note 5), state-of-the-art preservation methods should be utilised in future large-scale microbiome studies to minimise the potential effect of sample storage on the microbiota composition[32]. DNA was extracted from 30 mg of faecal samples as described in the BBS collection and processing protocol[33]. Negative controls using ultrapure water were included in parallel for each kit as well as each extraction batch, and DNA concentration was quantified to confirm that they were free of contamination. Total DNA was eluted in 60 μl DNase/Pyrogen-free water, and stored at -80°C until shipment to the Wellcome Sanger Institute for metagenomic sequencing.
Shotgun metagenomic sequencing and analysis
DNA samples, including negative controls, were quantified by PicoGreen dsDNA assay (Thermo Fisher), and samples with >100 ng DNA material proceeded to paired-end (2 x 125 bp) metagenomic sequencing on the HiSeq 2500 v4 platform. Low-quality bases were trimmed (SLIDINGWINDOW:4:20), and reads below 87 nucleotides (70% of original read length) were removed (MINLEN:87) using Trimmomatic[34]. To remove potential human contaminants, quality-trimmed reads were screened against the human genome (GRCh38) with Bowtie2 v2.3.0[35]. On average, 22.4 (95% CI 22.1-22.6) million raw reads were generated per sample, of which 19.3 (95% CI 19.1-19.6) million reads (87.3% of the raw reads) per sample passed decontamination and quality trimming steps for downstream analysis. Sequencing depth was accounted for as a potential technical confounding factor in analyses of microbiota species and strain measurements, and of significant species associations with clinical covariates (Supplementary Note 6). Taxonomic classification of metagenomic reads was performed using Kraken v1.0[36], a k-mer based sequence classification approach, against the Human Gastrointestinal Bacteria Genome Collection (HGG) genomes[37]. Bracken v1.0[38] was run on the Kraken classification output to estimate taxonomic abundance down to the species level. Metagenomic samples were compared at the genus and species levels by relative abundance. A cut-off of 100 Kraken-assigned paired-end reads (corresponding to 0.001% relative abundance given the sampling depth of ~10 million paired-end reads) was applied to determine metagenomic species detection. To assess whether the trade-off between the observed level of Bacteroides and opportunistic pathogens was an artefact of compositional effects, the proportion of abundances and reads corresponding to Bacteroides were removed separately, prior to relative abundance normalisation. In the normalised datasets, the statistical enrichment of opportunistic pathogen species in C-section babies was consistent with that observed in the original data. The R packages phyloseq[39] and microbiome[40] were used for metagenomic data analysis, and results were visualised using ggplot2[41] in RStudio.
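The detection cut-off and the Bacteroides-removal check described above can be illustrated with a minimal Python sketch. It is not the study's pipeline (which used Kraken, Bracken and R). The function names, the example read counts and the choice to treat exactly 100 reads as detected are assumptions made for illustration only.

```python
# Illustrative sketch only; names and example counts are hypothetical.
DETECTION_MIN_READS = 100  # read cut-off stated in the text


def detected_species(read_counts, min_reads=DETECTION_MIN_READS):
    """Return species whose assigned read count meets the cut-off.

    Assumption: a count of exactly `min_reads` is treated as detected.
    """
    return {sp for sp, n in read_counts.items() if n >= min_reads}


def relative_abundance(read_counts):
    """Convert read counts to fractions that sum to 1."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no reads to normalise")
    return {sp: n / total for sp, n in read_counts.items()}


def renormalise_without_genus(read_counts, genus="Bacteroides"):
    """Drop all species of one genus, then recompute relative abundance."""
    kept = {sp: n for sp, n in read_counts.items()
            if not sp.startswith(genus + " ")}
    return relative_abundance(kept)


counts = {
    "Bacteroides fragilis": 6000,
    "Enterococcus faecalis": 3000,
    "Klebsiella oxytoca": 950,
    "Rare species": 50,
}
present = detected_species(counts)
original = relative_abundance(counts)
without_bacteroides = renormalise_without_genus(counts)
```

In this toy sample, E. faecalis rises from 30% to 75% once Bacteroides reads are removed, which is the kind of compositional shift the check is meant to separate from a genuine enrichment.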
Classification of the low-Bacteroides babies
For each baby, the median relative abundance of the Bacteroides genus was calculated across the neonatal period samples. Based on the threshold described previously[9], babies with a median abundance of less than 0.1% were assigned low-Bacteroides status.
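As a minimal sketch of this rule, assuming relative abundances are expressed in percent and supplied as one value per neonatal sample (the function name and inputs are hypothetical):

```python
from statistics import median

LOW_BACTEROIDES_THRESHOLD = 0.1  # percent, threshold stated in the text


def is_low_bacteroides(bacteroides_ra_percent):
    """Classify one baby from its per-sample Bacteroides genus abundance (%)."""
    if not bacteroides_ra_percent:
        raise ValueError("at least one neonatal sample is required")
    return median(bacteroides_ra_percent) < LOW_BACTEROIDES_THRESHOLD


# Hypothetical babies with samples on days 4, 7 and 21.
baby_a = is_low_bacteroides([0.0, 0.05, 12.0])  # median 0.05
baby_b = is_low_bacteroides([0.0, 5.0, 12.0])   # median 5.0
```

Using the median rather than the mean means that a single late sample with high Bacteroides abundance does not by itself change a baby's status.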
Classification of the opportunistic pathogen carriage
Total opportunistic pathogen load was estimated by calculating the median relative abundance of the combined opportunistic pathogen species (C. perfringens, E. cloacae, E. faecalis, E. faecium, K. oxytoca, K. pneumoniae) per individual across their neonatal period samples, and independently for the infancy period and maternal samples. To prioritise relatively high-level opportunistic pathogen carriage feasible for downstream strain cultivation experiments, individuals with a median total opportunistic pathogen load of over 1% were defined as carriage-positive.
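The carriage definition can be sketched as follows. The sketch assumes that "combined" means summing the six species within each sample before taking the median across samples, and that abundances are in percent. The function names and example values are hypothetical.

```python
from statistics import median

OPPORTUNISTIC_PATHOGENS = (
    "Clostridium perfringens",
    "Enterobacter cloacae",
    "Enterococcus faecalis",
    "Enterococcus faecium",
    "Klebsiella oxytoca",
    "Klebsiella pneumoniae",
)
CARRIAGE_THRESHOLD = 1.0  # percent, threshold stated in the text


def pathogen_load(samples):
    """Median across samples of the summed pathogen abundance (%).

    `samples` is a list of dicts mapping species name to abundance (%).
    Assumption: species are summed per sample before taking the median.
    """
    per_sample = [
        sum(sample.get(sp, 0.0) for sp in OPPORTUNISTIC_PATHOGENS)
        for sample in samples
    ]
    return median(per_sample)


def is_carriage_positive(samples):
    """True if the median combined load exceeds the 1% threshold."""
    return pathogen_load(samples) > CARRIAGE_THRESHOLD


neonatal_samples = [
    {"Enterococcus faecalis": 0.5, "Klebsiella oxytoca": 0.25,
     "Bifidobacterium longum": 60.0},
    {"Enterococcus faecalis": 3.0},
    {"Klebsiella pneumoniae": 1.5, "Enterococcus faecium": 0.5},
]
load = pathogen_load(neonatal_samples)
positive = is_carriage_positive(neonatal_samples)
```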
Maternal strain transmission analysis
Strain transmissions in mother-baby paired samples were determined using a single-nucleotide variant calling method[42]. StrainPhlAn was run on pre-processed metagenomes to generate consensus species-specific marker genes for phylogenetic reconstruction of all detectable strains (one dominant strain per sample), using default parameters and with the options "--alignment_program mafft" and "--relaxed_parameters3" as previously described[26]. No statistically significant variation in sequencing depth that could affect coverage-dependent detection of microbiota species and strains was observed between vaginally delivered and C-section-born subjects across age groups (Supplementary Note 6). For each species with sufficient coverage for strain profiling, we generated a species-specific phylogenetic tree using RAxML[43]. As previously described[26], the strain distance for each pair of mother-baby sample strains was computed by calculating the pairwise normalised phylogenetic distance on the corresponding species tree. To define strain transmission events, a previously described[26] conservative threshold of 0.1 on the strain distance value was used. The detectable strains in a given pair of mother-baby samples were considered identical (strain distance less than 0.1, transmission) or distinct (strain distance greater than 0.1, no transmission). For all mother-baby pairs shown in Extended Data Fig. 4, an early transmission event was counted once per species per mother-baby pair, considering the detected transmission (or evidence for no transmission) at the earliest time point (primary transmission), irrespective of the subsequent transmission events in any later neonatal period samples. For a subset of mother-baby pairs with both the neonatal and infancy periods sampled (Fig. 3a), late transmission events were counted separately, including cases of no early transmission due to insufficient coverage (no detectable strains).
To highlight the transmission pattern shared by phylogenetically related species, a neighbour-joining[44] tree of the eligible species was constructed based on the mash distance matrix[45] of the respective reference genomes included in the StrainPhlAn database (Supplementary Table 4). The same approach and strain distance threshold (core-genome SNPs) were applied to the cultured strains to count the number of identical and distinct strains within mother-baby and longitudinal paired samples.
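The threshold rule and the counting of primary transmission events can be sketched in Python as below. This is an illustration, not the analysis code: the function names and the input layout (a list of day and distance pairs for one species in one mother-baby pair, with None where coverage was insufficient) are assumptions, and a distance of exactly 0.1, which the text does not assign to either class, is treated here as no transmission.

```python
STRAIN_DISTANCE_THRESHOLD = 0.1  # conservative threshold stated in the text


def classify_pair(strain_distance):
    """Label one mother-baby strain comparison.

    Assumption: a distance of exactly 0.1 counts as no transmission.
    """
    if strain_distance < STRAIN_DISTANCE_THRESHOLD:
        return "transmission"
    return "no transmission"


def primary_transmission(observations):
    """Return the call at the earliest time point with a detectable strain.

    `observations` is a list of (day, distance) tuples; distance is None
    when no strain could be profiled at that time point.
    Returns None if no time point had sufficient coverage.
    """
    usable = [(day, d) for day, d in observations if d is not None]
    if not usable:
        return None
    _, earliest_distance = min(usable, key=lambda item: item[0])
    return classify_pair(earliest_distance)


# Hypothetical pair: no strain profiled on day 4, identical strain on day 7.
call = primary_transmission([(7, 0.02), (4, None), (21, 0.5)])
```

Here the later, distinct strain on day 21 does not change the call, mirroring the rule that each species is counted once per pair at the earliest informative time point.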
Statistical analysis
To calculate the effect of clinical covariates on the gut microbiota composition, we stratified by age groups and then assessed the proportion of explained variance (R2 from PERMANOVA) in Bray-Curtis distance for each clinical covariate, using the adonis function from the R package vegan[46]. While PERMANOVA is mostly unaffected by group dispersion effects in balanced designs[47] (e.g. mode of delivery comparisons), unbalanced designs (e.g. breastfeeding comparisons) are more sensitive to group dispersion effects, so for these the group variance homogeneity condition was validated using the betadisper function. Group dispersions were not significantly different (betadisper P > 0.05) in all comparisons, which lent support to the statistically significant, albeit visibly weak, effects of breastfeeding as reported by PERMANOVA. Samples with missing metadata (NA) for the given clinical covariate were excluded prior to running each cross-sectional analysis. Effect sizes and statistical significance were determined by 1,000 permutations, and P-values were corrected for multiple testing using the Benjamini-Hochberg false discovery rate (FDR = 5%). Statistical tests of between-group taxonomic abundance comparisons (Welch’s t-test with FDR-corrected P-values) were performed in the Statistical Analysis of Metagenomics Profiles program v2.0[48]. MaAsLin[49] was used to determine the significance of species associated with a specific variable while adjusting for potentially confounding covariates, as previously described[14,15]. All the covariates tested in the PERMANOVA were included in the adjustment, along with the sequencing depth, as fixed effects. The default MaAsLin parameters were applied (maximum percentage of samples NA in metadata 10%, minimum percentage relative abundance 0.01%, P < 0.05, q < 0.25).
Bacterial isolation and whole-genome sequencing
Raw faecal samples from neonates stored in the biobank lab at -80°C were requested based on faecal carriage of targeted species at over 1% relative abundance in metagenomes. Selected frozen faecal aliquots, where available (> 100 ng), were couriered on dry ice to the Wellcome Sanger Institute within 6 hours of shipment from the biobank lab. Bacterial isolates were cultured using the following culture media: Enterococcus faecium ChromoSelect Agar Base (Sigma-Aldrich) for Enterococcus spp., CP ChromoSelect Agar (Sigma-Aldrich) for Clostridium spp., and Coliform ChromoSelect Agar (Sigma-Aldrich) and Klebsiella ChromoSelect Selective Agar (Sigma-Aldrich) for species of Enterobacteriaceae. Between 2 and 5 colonies per sample were picked for full-length 16S rRNA gene sequencing to confirm species identification, as described previously[50]. Bacterial isolates with species identification congruent with metagenomic identification were re-streaked and purified for genomic DNA extraction using the DNeasy 96 kit. DNA sequencing was performed on the Illumina HiSeq X, generating paired-end reads (2 x 151 bp). Multiple strains per species per faecal sample were also sequenced based on variation across the full-length 16S rRNA sequences. Bacterial genomes were assembled and annotated using the pipeline described previously[51]. Genome assemblies were subjected to quality checks and contaminant screening with CheckM[52] and Mash[53], respectively. Where applicable, the suspected contaminant (non-target organism) sequences were confirmed and filtered out via raw read mapping using Bowtie2 v2.3.0, prior to re-assembly.
Bacterial phylogenetic analysis
The phylogenetic analysis of the complete diverse species collection was conducted by extracting the amino acid sequences of 40 universal core marker genes[54,55] from the BBS bacterial culture collection using SpecI[56]. The protein sequences were concatenated and aligned with MAFFT v.7.2040, and maximum-likelihood trees were constructed using RAxML[43] with default settings. The four most prevalent BBS collection opportunistic pathogen species, E. faecalis, E. cloacae, K. oxytoca and K. pneumoniae, were further analysed in the context of the public genomes (Supplementary Table 5), including the UK hospital strain collections[57-60], the gut microbiota-cultured strains from the HGG and the Culturable Genome Reference (CGR)[61] collections, and the environmental strains in the Genome Taxonomy Database (GTDB, v86)[62]. To generate phylogenetic trees of individual species, the public genome assemblies were combined with the assemblies of the study isolates and annotated with Prokka[63], and a pangenome was estimated using Roary (Page et al. 2015). Where multiple identical strains (no SNP difference in species core-genome) were cultured from the same faecal sample, only one representative strain was included in the species phylogenetic trees. A 95% identity cut-off was used, and core genes were defined as those in 99% of isolates unless otherwise stated. A maximum likelihood tree of the SNPs in the core genes was created using RAxML (Stamatakis 2014) and 100 bootstraps. To illustrate the population structure of the closely related Enterobacter and Klebsiella strain isolates, FastANI (Jain et al. 2017) was used to estimate the pairwise average nucleotide identity distance between all public and BBS genome assemblies, which was then used as an input to generate a neighbour-joining tree with BIONJ (Gascuel 1997). All phylogenetic trees were visualised in iTOL (Letunic & Bork 2016). Sequence types were determined using MLSTcheck (Page et al. n.d.), which was used to compare the assembled genomes against the MLST database for the corresponding species.
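The core-gene definition used above (genes present in 99% of isolates) can be illustrated with a short Python sketch over a gene presence/absence table. This is not Roary's implementation. The function name, the input layout and the choice to read "in 99%" as "in at least 99%" are assumptions.

```python
def core_genes(presence, n_isolates, min_fraction=0.99):
    """Return genes found in at least `min_fraction` of isolates.

    `presence` maps gene name -> set of isolate IDs carrying the gene.
    Assumption: the 99% cut-off is inclusive.
    """
    if n_isolates <= 0:
        raise ValueError("n_isolates must be positive")
    return sorted(
        gene for gene, isolates in presence.items()
        if len(isolates) / n_isolates >= min_fraction
    )


# Hypothetical pangenome of 200 isolates.
isolates = [f"isolate_{i}" for i in range(200)]
presence = {
    "geneA": set(isolates),        # 200/200 isolates
    "geneB": set(isolates[:198]),  # 198/200 = 99%
    "geneC": set(isolates[:197]),  # 197/200 = 98.5%
}
core = core_genes(presence, n_isolates=len(isolates))
```

With 200 isolates, a gene missing from two isolates still counts as core while a gene missing from three does not, which shows how sensitive the core-gene count is to collection size.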
Detecting virulence and resistance genes
ABRicate (v0.8.13, https://github.com/tseemann/abricate) was used to screen bacterial genome assemblies for known, acquired resistance genes and virulence factors. For AMR genes, the assemblies were screened against a comprehensive BLAST database integrating 5,556 non-redundant sequences from the NCBI Bacterial Antimicrobial Resistance Reference Gene Database (PRJNA313047), CARD v2.0.3, ARG-ANNOT and ResFinder. For virulence factor screening, a BLAST database was built from 3,202 non-redundant, experimentally validated core virulence genes in VFDB (version 5 Oct 2018).
The neonatal gut microbiotas exhibited high volatility and individuality.
a, Microbiota diversity (alpha diversity) increased over developmental time. The violin plot outlines illustrate kernel probability density, with the width of the shaded area representing the proportion of the data shown. Centre lines show the medians; box limits indicate the 25th and 75th percentiles; whiskers extend 1.5 times the interquartile range from the 25th and 75th percentiles, outliers are represented by dots. Number of gut microbiotas on day 4 (n=310), 7 (n=532) and 21 (n=325), in infancy (n = 302), and from matched mothers (n = 175). b-c, Gut microbiota stability, stratified by inter-individual (day 4, n=310; day 7, n=532; day 21, n=325) and intra-individual comparisons in sliding time windows (day 4 to 7, n = 274; day 7 to 21, n=285) during the neonatal period (b), in the context of the overall infancy period (c) with the TEDDY study microbiota stability measurements (earliest measurements on day 90, and year 3) plotted in crosses. Solid lines show the median per time window. Shaded areas show the 99% confidence interval estimated using binomial distribution. Error bars indicate median absolute deviation. Statistical significance between groups was determined by two-sided Wilcoxon rank-sum test.
Microbiota variation associated with mode of delivery in the neonatal period and infancy.
Non-metric multidimensional scaling (NMDS) ordination of Bray–Curtis dissimilarity between the species relative abundance profiles of the gut microbiota sampled from babies on day 4 (vaginal, n=157; C-section, n=153), day 7 (vaginal, n=280; C-section, n=252), day 21 (vaginal, n=147; C-section, n=178), during infancy (vaginal, n=160; C-section, n = 142) and from mothers (vaginal, n=110; C-section, n=65). Microbial variation explained by factor mode of delivery is represented by the PERMANOVA R2 value (bottom left) and statistically significant across four cross-sectional PERMANOVA tests (FDR-corrected p-values reported in Supplementary Table 2).
Microbial succession in the vaginally-delivered neonatal gut microbiota over the first 21 days of life.
Bar plots show longitudinal changes in the mean relative abundance (RA) of faecal bacteria at the genus level on days 4, 7 and 21 of life, for genera with > 1% RA across all neonatal samples. Left panel: n = 316 from 160 vaginally delivered babies detected with Bacteroides; right panel: n = 290 from 154 vaginally delivered babies with low-Bacteroides status (defined in Methods).
Maternal strain transmission during the early neonatal period.
Maternal strain transmission across 178 mother-baby pairs (vaginal: 112, C-section: 66) sampled at least once during the early neonatal period. Only the frequently shared species detected with sufficient coverage for strain analysis in more than 10 pairs are shown. The neighbour-joining tree is constructed based on the pairwise Mash distances of the respective reference genomes. Phylogenetically related species shared a similar transmission timing pattern, for example the frequent transmission of Bacteroides/Parabacteroides spp. and Bifidobacterium spp. in vaginally delivered babies and the lack thereof in C-section-born babies, and the transmission of most Streptococcus species from other (non-maternal) sources in the environment.
Frequency and abundance of opportunistic pathogens in the gut microbiotas.
a-b, C-section and low-Bacteroides vaginally delivered babies were more frequently carrying opportunistic pathogens (as defined in Methods) and at higher level of species relative abundance (RA), compared to vaginally delivered babies (a) and normal-Bacteroides vaginally delivered babies (b), respectively. Significant differential presence in neonatal samples within each major neonatal period sampling groups (day 4 (n = 310), 7 (n = 532) and 21 (n = 325)) in terms of mean (RA) and frequency of six known opportunistic pathogens associated with the healthcare environment, which are rarely carried by adults (mothers, n=175) (b). Number of individuals sampled in the neonatal period: vaginal, n=314; vaginal-normal (Bacteroides level), n=160; vaginal-low (Bacteroides level), n=154. Error bars indicate the 95% CI of the mean RA. Statistical significance in mean species RA and combined pathogen carriage (defined in Methods) frequency was obtained by applying two-sided Wilcoxon signed-rank test and Fisher’s exact test, respectively.
Phylogeny and pathogenicity potential of the BBS E. faecalis strains.
a, Phylogenetic tree of the BBS E. faecalis strains (n=282, isolated from 269 faecal samples of 160 subjects). The midpoint-rooted maximum likelihood tree is based on SNPs in 1,827 core genes. The five major lineages (>10 BBS strain representatives; ST179, n=60; ST16, n=30; ST40, n=27; ST30, n=21; ST191, n=14) identified with the UK hospital collection were distributed across the three hospitals in this study, with no phylogroup limited to any single hospital. Solid connecting lines indicate intra-subject persistence (n=92 in 67 babies). Dashed lines indicate phylogenetically distinct strains isolated from longitudinal samples (n=18) or mother-baby paired samples (yellow, n=10), with arrows indicating the direction of potential transmission (early-to-later or mother-to-baby). Where multiple identical strains (no SNP difference in species core-genome) were isolated from the same faecal sample, only one representative strain was included in the species phylogenetic tree (total number of strains, n=356). b-e, Prevalence of virulence (b-c) and AMR genes (grouped by antibiotic class) (d-e) detected in the BBS E. faecalis strains. Statistical significance results shown are coloured according to the group with higher frequency of detected genes by two-sided Fisher's exact test between the groups of gut microbiotas (n=28) versus BBS strains (n=356), and BBS versus the UK hospital epidemic strains (n=89, tree branches coloured blue in Fig. 4c). ****P<0.0001, ***P < 0.001, **P < 0.01, *P < 0.05. Virulence genes: asa1, EF0149, EF0485, prgB = aggregation substance; esp = enterococcal surface protein; exoenzymes: gelE = gelatinase; EF0818, EF3023 = hyaluronidase (spreading factor); sprE = serine protease; fsr = quorum sensing system; toxin: cyl = cytolysin. Genes detected across all isolates (dfrE, efrA, efrB, emeA, lsaA) are not shown. AMR genes: Am = aminoglycosides (aph3"-III, ant(6)-Ia, aph(2''), str); Chlor = chloramphenicol (catA); Linc = lincosamides (lnuB); MLSB = macrolide, lincosamide, streptogramin B (ermB or ermT); Tet = tetracycline (tetL, tetM, tetO, tetS); Trim = trimethoprim (dfrC, dfrD, dfrF or dfrG); Vanc = vancomycin.
Phylogenies of the BBS E. cloacae, K. oxytoca and K. pneumoniae strains.
a-f, Midpoint-rooted core-genome maximum likelihood trees of the E. cloacae complex, K. oxytoca and K. pneumoniae strains isolated in this study (a-c) and in the context of public genomes (d-f). a-c, Number of BBS strains of E. cloacae (a, n=37, isolated from 37 faecal samples of 30 subjects, 1,861 core genes), K. oxytoca (b, n=107, isolated from 90 faecal samples of 62 subjects, 2,910 core genes) and K. pneumoniae (c, n=53, isolated from 47 faecal samples of 35 subjects, 3,471 core genes). Solid connecting lines indicate intra-subject strain persistence (E. cloacae, n=5; K. oxytoca, n=25 in 18 babies; K. pneumoniae, n=11 in 8 babies). Dashed lines indicate phylogenetically distinct strains isolated from longitudinal samples (E. cloacae, n=2; K. oxytoca, n=7 in 6 subjects; K. pneumoniae, n=1), with arrows indicating the direction of potential transmission (early-to-later samples). Where multiple identical strains (no SNP difference in species core-genome) were isolated from the same faecal sample, only one representative strain was included in the species phylogenetic tree (number of non-redundant BBS strains: E. cloacae, n=52; K. oxytoca, n=150; K. pneumoniae, n=78). For each species, the main phylogroups identified with the UK hospital collection (E. cloacae: III, VIII; K. oxytoca: KoI, KoII, KoV, KoVI; K. pneumoniae: KpI, KpII, KpIII) were distributed across the three hospitals in this study, with no phylogroup limited to any single hospital. d-f, Number of public genomes included in the phylogenetic analysis of E. cloacae (d, UK hospitals, n=314; gut microbiotas, n=8; environmental sources, n=43; 1,484 core genes), K. oxytoca (e, UK hospitals, n=40; gut microbiotas, n=9; environmental sources, n=8; 3,399 core genes), and K. pneumoniae (f, UK hospitals, n=250; gut microbiotas, n=17; environmental sources, n=66; 2,510 core genes).
Prevalence of AMR and virulence in the Klebsiella and Enterobacter strains.
a-d, Frequency and heatmaps of isolates for putative AMR genes (grouped by antibiotic class) (a-b) and virulence genes (c-d) most frequently detected in the UK hospital collection strains of E. cloacae (green), K. oxytoca (orange) and K. pneumoniae (blue). Statistical significance results shown are coloured according to the group with higher frequency of detected genes by two-sided Fisher's exact test between the groups of gut microbiota (E. cloacae, n=8; K. oxytoca, n=9; K. pneumoniae, n=17) versus BBS strains (E. cloacae, n=52; K. oxytoca, n=150; K. pneumoniae, n=78), and BBS versus the UK hospital strains (E. cloacae, n=314; K. oxytoca, n=40; K. pneumoniae, n=250). ****P<0.0001, ***P < 0.001, **P < 0.01, *P < 0.05. AMR genes: extended-spectrum beta-lactamases (ESBLs): blaSHV, blaCTX-M, blaTEM; other beta-lactamases: blaOXA, blaOXY, blaACT, blaLEN; Tet = tetracycline (tetA, tetR); Am = aminoglycosides (aac(3), aac(6’), aad, str). Virulence genes: iron acquisition: fyu, ybt = yersiniabactin, kfu = iron transporter permease, irp = iron regulatory proteins; all = allantoin metabolism; wzi = capsule; iutA = aerobactin siderophore receptor; mrk = fimbriae and biofilm formation; fli = flagella biosynthesis; iro = siderophore production; lpf = fimbrial chaperones. Genes detected across all isolates are not shown.
Baseline clinical characteristics of the Baby Biome Study cohort.
Complete clinical metadata of the study participants are reported in Supplementary Table 1.