Andrea Gori, Odile B Harrison, Ethwako Mlia, Yo Nishihara, Jia Mun Chan, Jacquline Msefula, Macpherson Mallewa, Queen Dube, Todd D Swarthout, Angela H Nobbs, Martin C J Maiden, Neil French, Robert S Heyderman.
Abstract
Streptococcus agalactiae (group B streptococcus; GBS) is a colonizer of the gastrointestinal and urogenital tracts, and an opportunistic pathogen of infants and adults. The worldwide population of GBS is characterized by clonal complexes (CCs) with different invasive potentials. CC17, for example, is a hypervirulent lineage commonly associated with neonatal sepsis and meningitis, while CC1 is less invasive in neonates and more commonly causes invasive disease in adults with comorbidities. The genetic basis of GBS virulence and the extent to which different CCs have adapted to different host environments remain uncertain. We have therefore applied a pan-genome-wide association study (GWAS) approach to 1,988 GBS strains isolated from different hosts and countries. Our analysis identified 279 CC-specific genes associated with virulence, disease, metabolism, and regulation of cellular mechanisms that may explain the differential virulence potential of particular CCs. In CC17 and CC23, for example, we have identified genes encoding pilus, quorum-sensing proteins, and proteins for the uptake of ions and micronutrients which are absent in less invasive lineages. Moreover, in CC17, carriage and disease strains were distinguished by the allelic variants of 21 of these CC-specific genes. Together, our data highlight the lineage-specific basis of GBS niche adaptation and virulence.
IMPORTANCE GBS is a leading cause of mortality in newborn babies in high- and low-income countries worldwide. Different strains of GBS are characterized by different degrees of virulence, where some are harmlessly carried by humans or animals and others are much more likely to cause disease. The genome sequences of almost 2,000 GBS samples isolated from both animals and humans in high- and low-income countries were analyzed using a pan-genome-wide association study approach. This allowed us to identify 279 genes which are associated with different lineages of GBS, each characterized by a different virulence and preferred host. Additionally, we propose that the GBS now carried in humans may have first evolved in animals before expanding clonally once adapted to the human host. These findings are essential to help understand what is causing GBS disease and how the bacteria have evolved and are transmitted.
Keywords: GWAS; Streptococcus agalactiae; bacterial genomics; bacterial phylogeny; pan-genome; population structure; virulence
Year: 2020 PMID: 32518186 PMCID: PMC7373188 DOI: 10.1128/mBio.00728-20
Source DB: PubMed Journal: mBio Impact factor: 7.867
Characteristics of GBS isolates
| Isolate type | Country | Source | Count | No. invasive | No. missing data |
|---|---|---|---|---|---|
| Human | Malawi | This work | 303 | 131 | 6 |
| | Kenya | Seale et al. | 1034 | 71 | 0 |
| | USA | Flores et al. | 99 | 99 | 0 |
| | Canada | Teatero et al. | 141 | 141 | 0 |
| | The Netherlands | | 300 | unknown | 300 |
| | Unknown | *** | 24 | unknown | 24 |
| Animal | Italy | *** | 3 | | |
| | Kenya | *** | 2 | | |
| | Germany | *** | 1 | | |
| | Brazil | *** | 1 | | |
| | Unknown | *** | 80 | | |
Isolated from Queen Elizabeth Central Hospital, Blantyre.
Jamrozy et al. (65).
***, Genomes retrieved from https://pubmlst.org/. Full metadata are reported in Table S1.
Animal isolates are reported to be isolated from cattle (n = 49), fish (n = 23), frogs (n = 3), or other animal sources (n = 12).
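The counts in the table can be cross-checked against the 1,988 strains stated in the abstract. The short Python sketch below does only that arithmetic; the dictionary names are illustrative and the numbers are transcribed from the table and its footnote, not taken from the authors' pipeline.

```python
# Counts transcribed from the "Characteristics of GBS isolates" table.
# Variable names are illustrative, not from the original study.
human = {
    "Malawi": 303,
    "Kenya": 1034,
    "USA": 99,
    "Canada": 141,
    "The Netherlands": 300,
    "Unknown": 24,
}
animal = {"Italy": 3, "Kenya": 2, "Germany": 1, "Brazil": 1, "Unknown": 80}

# Host breakdown of the animal isolates, from the table footnote.
animal_hosts = {"cattle": 49, "fish": 23, "frogs": 3, "other": 12}

human_total = sum(human.values())
animal_total = sum(animal.values())
total = human_total + animal_total

print(f"human={human_total}, animal={animal_total}, total={total}")
print("host breakdown matches:", sum(animal_hosts.values()) == animal_total)
```

Both the per-country animal counts and the per-host footnote sum to the same animal total, and the human and animal totals together give the 1,988 strains analyzed.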
FIG 1 Isolates used in this study, stratified per serotype, CC, and source. Clinical isolates are grouped as invasive (including strains isolated from children and adults affected by any GBS invasive disease), carrier (including healthy carrying mothers), and unknown, where metadata were not available.
FIG 2 Core genome-based population structure of GBS. The phylogenetic tree is annotated with 4 colored strips representing the clonal complex, the country of isolation, the origin, and the serotype of each strain. The three binary heatmaps represent the presence (blue) or absence (yellow) of the genes identified by the pan-GWAS pipeline. The tree is rooted at midpoint. The reference strain used in this analysis was COH1 (accession number HG939456). The red square in the CC10 heatmap highlights the cluster of CC10-associated genes found in CC19 clones. Trees built with different reference strains are shown in Fig. S1 in the supplemental material and show analogous topology.
Summary of the number of genes driving the pan-GWAS in each clonal complex and type of mutations encountered
| Clonal complex | Presence/absence | Nonsynonymous SNPs | Synonymous SNPs | Nonsense SNPs | Total | GWAS-gene islands |
|---|---|---|---|---|---|---|
| CC1 | 47 | 3 | 0 | 1 | 51 | 0 |
| CC10 | 32 | 3 | 2 | 4 | 41 | 0 |
| CC17 | 38 | 42 | 10 | 12 | 102 | 0 |
| CC19 | 34 | 5 | 0 | 0 | 39 | 1 |
| CC23 | 42 | 14 | 3 | 5 | 64 | 0 |
A GWAS-gene island is defined as a region of the genome of at least 200 kbp where >90% of the coding regions are GWAS-driving genes.
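The island definition above can be illustrated with a small Python sketch. It is not the authors' implementation: the function name `find_gwas_islands`, the gene tuple layout, and the greedy left-to-right scan are assumptions, and ">90% of the coding regions" is interpreted here as more than 90% of the genes (by count) in the window.

```python
from typing import List, Tuple

# Hypothetical gene record: (start bp, end bp, is a GWAS-driving gene).
Gene = Tuple[int, int, bool]


def find_gwas_islands(
    genes: List[Gene],
    min_len: int = 200_000,   # "at least 200 kbp"
    min_frac: float = 0.90,   # ">90% of the coding regions"
) -> List[Tuple[int, int]]:
    """Greedy scan: for each start gene, keep the longest window of
    consecutive genes that spans >= min_len bp and in which the fraction
    of GWAS-driving genes is strictly greater than min_frac."""
    genes = sorted(genes)  # order by start coordinate
    islands: List[Tuple[int, int]] = []
    n = len(genes)
    i = 0
    while i < n:
        best = None
        hits = 0
        for j in range(i, n):
            hits += genes[j][2]  # True counts as 1
            span = genes[j][1] - genes[i][0]
            if span >= min_len and hits / (j - i + 1) > min_frac:
                best = j
        if best is None:
            i += 1
        else:
            islands.append((genes[i][0], genes[best][1]))
            i = best + 1  # continue after the reported island
    return islands


# Toy chromosome: 250 contiguous genes of 1 kbp; genes 10..229 are GWAS hits.
toy = [(k * 1000, (k + 1) * 1000, 10 <= k < 230) for k in range(250)]
print(find_gwas_islands(toy))
```

Because the scan is greedy, a reported window may include a few flanking non-GWAS genes as long as the 90% threshold still holds; a stricter boundary rule would need an explicit trimming step.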
FIG 3 Location of genes identified by the pan-GWAS pipeline on a strain belonging to CC1 (A), CC10 (B), CC17 (C), CC19 (D), and CC23 (E). Gene location on each chromosome is represented by a red mark. Gene names correspond to the ones reported in Table S3.
Pathways and functional categories identified by KEGG annotation in the five groups of CC-associated genes
| Clonal complex and functional category | KEGG no. | Pathway |
|---|---|---|
| CC1 | | |
| Metabolism (09100) | 01130 | Biosynthesis of antibiotics |
| | 00052 | Galactose metabolism |
| | 00999 | Biosynthesis of secondary metabolites - unclassified |
| Environmental Information Processing (09130) | 02060 | Phosphotransferase system (PTS) |
| CC10 | | |
| Metabolism (09100) | 01100 | Metabolic pathways |
| | 01110 | Biosynthesis of secondary metabolites |
| | 01120 | Microbial metabolism in diverse environments |
| | 01130 | Biosynthesis of antibiotics |
| | 00010 | Glycolysis/gluconeogenesis |
| | 00040 | Pentose and glucuronate interconversions |
| | 00051 | Fructose and mannose metabolism |
| | 00052 | Galactose metabolism |
| | 00561 | Glycerolipid metabolism |
| | 00600 | Sphingolipid metabolism |
| | 00603 | Glycosphingolipid biosynthesis - globo and isoglobo series |
| Environmental Information Processing (09130) | 02060 | Phosphotransferase system (PTS) |
| CC19 | | |
| Metabolism (09100) | 01100 | Metabolic pathways |
| | 00270 | Cysteine and methionine metabolism |
| | 00760 | Nicotinate and nicotinamide metabolism |
| Environmental Information Processing (09130) | 03070 | Bacterial secretion system |
| CC17 | | |
| Metabolism (09100) | 00010 | Glycolysis/gluconeogenesis |
| | 00020 | Citrate cycle (TCA cycle) |
| | 00052 | Galactose metabolism |
| | 00500 | Starch and sucrose metabolism |
| | 00520 | Amino sugar and nucleotide sugar metabolism |
| | 00620 | Pyruvate metabolism |
| | 00630 | Glyoxylate and dicarboxylate metabolism |
| | 00640 | Propanoate metabolism |
| | 00680 | Methane metabolism |
| | 00910 | Nitrogen metabolism |
| | 00561 | Glycerolipid metabolism |
| | 00230 | Purine metabolism |
| | 00240 | Pyrimidine metabolism |
| | 00250 | Alanine, aspartate and glutamate metabolism |
| | 00260 | Glycine, serine, and threonine metabolism |
| | 00280 | Valine, leucine, and isoleucine degradation |
| | 00220 | Arginine biosynthesis |
| | 01007 | Amino acid related enzymes |
| | 00430 | Taurine and hypotaurine metabolism |
| | 01003 | Glycosyltransferases |
| | 01005 | Lipopolysaccharide biosynthesis proteins |
| | 01011 | Peptidoglycan biosynthesis and degradation proteins |
| | 00760 | Nicotinate and nicotinamide metabolism |
| | 00770 | Pantothenate and CoA biosynthesis |
| | 01001 | Protein kinases |
| | 01002 | Peptidases |
| | 03021 | Transcription machinery |
| | 03016 | Transfer RNA biogenesis |
| | 00970 | Aminoacyl-tRNA biosynthesis |
| | 03110 | Chaperones and folding catalysts |
| | 03060 | Protein export |
| | 03420 | Nucleotide excision repair |
| | 03400 | DNA repair and recombination proteins |
| Environmental Information Processing (09130) | 02000 | Transporters |
| | 02010 | ABC transporters |
| | 02060 | Phosphotransferase system (PTS) |
| | 03070 | Bacterial secretion system |
| | 02020 | Two-component system |
| | 02044 | Secretion system |
| | 02022 | Two-component system |
| Cellular Processes (09140) | 04147 | Exosome |
| | 02048 | Prokaryotic defense system |
| | 02024 | Quorum sensing |
| | 02026 | Biofilm formation- |
| Unclassified (09190) | 99982 | Energy metabolism |
| | 99984 | Nucleotide metabolism |
| | 99999 | Others |
| | 99977 | Transport |
| CC23 | | |
| Metabolism (09100) | 00630 | Glyoxylate and dicarboxylate metabolism |
| | 00061 | Fatty acid biosynthesis |
| | 01040 | Biosynthesis of unsaturated fatty acids |
| | 01004 | Lipid biosynthesis proteins |
| | 00260 | Glycine, serine, and threonine metabolism |
| | 00550 | Peptidoglycan biosynthesis |
| | 01011 | Peptidoglycan biosynthesis and degradation proteins |
| | 00780 | Biotin metabolism |
| | 00670 | One carbon pool by folate |
| | 01008 | Polyketide biosynthesis proteins |
| | 01053 | Biosynthesis of siderophore group nonribosomal peptides |
| | 00333 | Prodigiosin biosynthesis |
| | 01002 | Peptidases |
| Cellular Processes (09140) | 02000 | Transporters |
| | 02010 | ABC transporters |
| | 02020 | Two-component system |
| | 02042 | Bacterial toxins |
| Human Diseases (09160) | 02048 | Prokaryotic defense system |
| | 02024 | Quorum sensing |
| | 01502 | Vancomycin resistance |
| | 01504 | Antimicrobial resistance genes |
| Unclassified (09190) | 99988 | Biosynthesis and biodegradation of secondary metabolites |
| Genetic Information Processing (09120) | 03000 | Transcription factors |
For each clonal complex, the functional categories of the GWAS-associated genes are shown. The metabolic pathways affected by those genes and their KEGG reference numbers are reported.
CC17-associated genes showing at least one allele statistically associated with strains isolated from either invasive disease or from carriage
| Gene | Allele 1 P value | Allele 1 odds ratio | Allele 2 P value | Allele 2 odds ratio | No. mismatches (aa) | % difference (aa) |
|---|---|---|---|---|---|---|
| | 8.98E−18 | 84.47 | 1.67E−03 | 0.19 | 0 | 0 |
| | 5.48E−10 | 0.08 | 0.063 | 0.55 | 0 | 0 |
| | 1.19E−19 | 0.09 | 0.151 | 1.41 | 0 | 0 |
| | 6.27E−18 | 0.10 | 2.97E−14 | 0.10 | 1 | 0.4 |
| | 0.028 | 0.24 | | | 1 | 0.2 |
| | 0.018 | 0.19 | | | 1 | 0.3 |
| | 0.006 | 5.86 | | | 0 | 0 |
| | 5.72E−03 | 5.93 | | | 1 | 0.3 |
| | 1.11E−04 | 0.47 | | | 1 | 0.6 |
| | 7.42E−05 | 0.46 | | | 1 | 0.9 |
| | 2.41E−05 | 0.43 | | | 1 | 0.2 |
| | 1.48E−05 | 0.42 | | | 1 | 0.3 |
| | 3.80E−08 | 0.10 | | | 1 | 0.3 |
| | 3.27E−09 | 0.15 | | | 1 | 1.1 |
| | 5.64E−10 | 3.56 | | | 1 | 0.2 |
| | 5.48E−10 | 3.55 | | | 0 | 0 |
| | 1.04E−10 | 3.69 | | | 1 | 0.9 |
| | 2.42E−15 | 0.09 | | | 1 | 0.1 |
| | 1.71E−16 | 42.87 | | | 1 | 0.1 |
| | 1.78E−17 | 83.77 | | | 1 | 0.9 |
| | 1.78E−18 | 49.48 | | | 1 | 0.2 |
P values are Benjamini-Hochberg corrected.
gcc1730 showed only one mismatch in the protein alignment, which introduced a stop codon at position 122.
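The two statistics reported in the table, an odds ratio per allele and a Benjamini-Hochberg corrected P value, can be sketched in Python as follows. This is a generic illustration rather than the authors' code: the function names, the 2x2 table orientation (invasive versus carriage in rows, allele present versus absent in columns), and the example counts are all assumptions.

```python
from typing import List


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for a 2x2 table (orientation is an assumption):
        a = invasive strains with the allele
        b = invasive strains without the allele
        c = carriage strains with the allele
        d = carriage strains without the allele
    Raises ZeroDivisionError if b * c == 0 (no continuity correction)."""
    return (a * d) / (b * c)


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts and raw P values, for illustration only.
print(odds_ratio(20, 5, 10, 25))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Under this orientation, an odds ratio above 1 would indicate that the allele is enriched among invasive strains and a value below 1 that it is enriched among carriage strains; the table itself does not state which direction was used.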
Susceptibility of four S. agalactiae strains against five antimicrobial agents (MICs in μg/ml)
| Antimicrobial | COH1 (serotype III, CC17) | NEM316 (serotype III, CC23) | 2603V/R (serotype III, CC19) | H36B (serotype Ib, CC1) |
|---|---|---|---|---|
| Erythromycin | 0.25 | 0.125 | 0.25 | 0.25 |
| Chloramphenicol | 4 | 4 | 4 | 2 |
| Norfloxacin | >8 | >8 | 4 | 4 |
| Acriflavine | 8 | 8 | 8 | 8 |
| Berberine | 32 | 64 | 64 | 32 |
MICs were determined from 3 experiments.
A value of 8 μg/ml was the upper detection limit for norfloxacin due to acidification and precipitation of media components.