Hannah D Steinberg, Evan S Snitkin.
Abstract
Illness caused by the pathogen Clostridioides difficile is widespread and can range in severity from mild diarrhea to sepsis and death. Strains of C. difficile isolated from human infections exhibit great genetic diversity, leading to the hypothesis that the genetic background of the infecting strain at least partially determines a patient's clinical course. However, although certain strains of C. difficile have been suggested to be associated with increased severity, strain typing alone has proved insufficient to explain infection severity. The limited explanatory power of strain typing has been hypothesized to be due to genetic variation within strain types, as well as genetic elements shared between strain types. Homologous recombination is an evolutionary mechanism that can result in large genetic differences between two otherwise clonal isolates, and also lead to convergent genotypes in distantly related strains. More than 400 C. difficile genomes were analyzed here to assess the effect of homologous recombination within and between C. difficile clades. Almost three-quarters of single nucleotide variants in the C. difficile phylogeny are predicted to be due to homologous recombination events. Furthermore, recombination events were enriched in genes previously reported to be important to virulence and host-pathogen interactions, such as flagella, cell wall proteins, and sugar transport and metabolism. Thus, by exploring the landscape of homologous recombination in C. difficile, we identified genetic loci whose elevated rates of recombination mediated diversification, making them strong candidates for being mediators of host-pathogen interaction in diverse strains of C. difficile.
IMPORTANCE Infections with C. difficile result in up to half a million illnesses and tens of thousands of deaths annually in the United States. The severity of C. difficile illness is dependent on both host and bacterial factors. Studying the evolutionary history of C. difficile pathogens is important for understanding the variation in pathogenicity of these bacteria. This study examines the extent and targets of homologous recombination, a mechanism by which distant strains of bacteria can share genetic material, in hundreds of C. difficile strains and identifies hot spots of realized recombination events. The results of this analysis reveal the importance of homologous recombination in the diversification of genetic loci in C. difficile that are significant in its pathogenicity and host interactions, such as flagellar construction, cell wall proteins, and sugar transport and metabolism.
Keywords: Clostridioides difficile; S layer; flagella; homologous recombination; phosphotransferase system
Year: 2020 PMID: 33208516 PMCID: PMC7677006 DOI: 10.1128/mSphere.00799-20
Source DB: PubMed Journal: mSphere ISSN: 2379-5042 Impact factor: 4.389
FIG 1 Recombination in five C. difficile clades. (a) Unrooted phylogeny of genomes used in study, colored by previously defined clade designations: clade 1 (blue, n = 306), clade 2 (yellow, n = 76), clade 3 (green, n = 15), clade 4 (purple, n = 10), and clade 5 (red, n = 5). (b) Distribution of SNVs in recombination events per SNV outside recombination events (r/m) on each branch of each C. difficile clade. Boxplots represent the first to third quartiles of values, with a line representing the median r/m value for each clade, and whiskers are 1.5 times the interquartile range above the third quartile. Values for each branch are plotted on top of the boxplots for each clade. There is a statistically significant difference in r/m values between at least 2 of the clades (Kruskal-Wallis χ² = 36.8, df = 4; P < 0.001). (c) Distribution of lengths of recombination events in each C. difficile clade, colored as in panel a.
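The r/m statistic in panel b is a plain ratio, and it relates directly to the share of SNVs attributed to recombination. The sketch below is only an illustration of that arithmetic: the function names and the SNV counts in the usage note are hypothetical and are not taken from the study.

```python
def r_over_m(snvs_in_recombination, snvs_outside_recombination):
    """SNVs inside recombination events per SNV outside them, for one branch."""
    if snvs_outside_recombination == 0:
        raise ValueError("r/m is undefined when no SNVs fall outside recombination events")
    return snvs_in_recombination / snvs_outside_recombination


def recombinant_fraction(snvs_in_recombination, snvs_outside_recombination):
    """Share of all SNVs on a branch that lie inside recombination events."""
    total = snvs_in_recombination + snvs_outside_recombination
    if total == 0:
        raise ValueError("no SNVs on this branch")
    return snvs_in_recombination / total
```

For a hypothetical branch with 30 SNVs inside recombination events and 10 outside, `r_over_m(30, 10)` gives 3.0 and `recombinant_fraction(30, 10)` gives 0.75, so an r/m of 3 corresponds to three-quarters of SNVs lying in recombined regions.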
C. difficile genes with the greatest increase in recombination events in the clade 1 and 2 core genome
| Gene | Annotation | PR events | OR events | Fold change | P |
|---|---|---|---|---|---|
| | Flagellar motor switch protein FliG | 11 | 58 | 5.4 | <0.001 |
| | Flagellar assembly protein FliH | 11 | 58 | 5.3 | <0.001 |
| | Flagellar protein FliJ | 10 | 52 | 5.0 | <0.001 |
| | ATP synthase subunit beta FliI | 11 | 52 | 4.9 | <0.001 |
| | Flagellar MS-ring protein | 12 | 56 | 4.8 | <0.001 |
| | Flagellar hook-basal body protein FliE | 11 | 50 | 4.7 | <0.001 |
| | Basal-body rod modification protein FlgD | 10 | 49 | 4.7 | <0.001 |
| | Flagellar hook-length control protein FliK | 10 | 48 | 4.6 | <0.001 |
| | Flagellar hook protein FlgE | 11 | 51 | 4.6 | <0.001 |
| | Flagellar basal-body rod protein FlgB | 11 | 48 | 4.5 | <0.001 |
| | Flagellar basal-body rod protein FlgC | 11 | 47 | 4.3 | <0.001 |
| CD630_27960 | Cell surface protein | 12 | 52 | 4.2 | <0.001 |
| | Flagellar protein FlbD | 11 | 44 | 4.2 | <0.001 |
| | Flagellar protein FliS1 | 11 | 43 | 4.0 | <0.001 |
| | Flagellar motor rotation protein MotB | 11 | 44 | 3.9 | <0.001 |
| | Flagellar motor rotation protein MotA | 11 | 44 | 3.9 | <0.001 |
| | Carbon storage regulator CsrA | 10 | 41 | 3.9 | <0.001 |
| | Flagellar basal body-associated protein FliL | 11 | 41 | 3.9 | <0.001 |
| | Flagellar protein FliZ | 10 | 40 | 3.8 | <0.001 |
| | Flagellar protein FliS2 | 11 | 41 | 3.8 | <0.001 |
| | Flagellar biosynthetic protein FliQ | 10 | 40 | 3.8 | <0.001 |
| | Bifunctional flagellar biosynthesis protein FliR/FlhB | 12 | 45 | 3.8 | <0.001 |
| | Flagellar biosynthesis protein FliP | 11 | 41 | 3.7 | <0.001 |
| CD630_27970 | Calcium-binding adhesion protein | 17 | 62 | 3.7 | <0.001 |
| CD630_36440 | Hypothetical protein | 11 | 40 | 3.6 | <0.001 |
| | Flagellar biosynthesis protein FlhA | 12 | 42 | 3.5 | <0.001 |
| | Flagellin C | 11 | 39 | 3.4 | <0.001 |
| CD630_30830 | PTS operon transcription antiterminator | 11 | 38 | 3.4 | <0.001 |
| | Flagellar hook-associated protein FliD | 12 | 40 | 3.3 | <0.001 |
| CD630_02380 | Hypothetical protein | 11 | 35 | 3.3 | <0.001 |
| | Flagellar assembly factor FliW | 11 | 34 | 3.2 | <0.001 |
| | Flagellar biosynthesis regulator FlhF | 11 | 36 | 3.1 | <0.001 |
| CD630_02410 | Phosphoserine phosphatase | 10 | 32 | 3.1 | <0.001 |
| CD630_02420 | Hypothetical protein | 10 | 32 | 3.1 | <0.001 |
| CD630_02430 | Hypothetical protein | 10 | 32 | 3.1 | <0.001 |
| CD630_02440 | CDP-glycerol:poly(glycerophosphate) glycerophosphotransferase | 10 | 32 | 3.1 | <0.001 |
| CD630_22420 | Hypothetical protein | 11 | 32 | 3.0 | <0.001 |
| | Flagellar hook-associated protein FlgL | 11 | 34 | 3.0 | <0.001 |
| CD630_22430 | Membrane protein | 11 | 32 | 3.0 | <0.001 |
| | Flagellar operon RNA polymerase σ28 factor | 11 | 33 | 3.0 | <0.001 |
PR events, the predicted number of overlapping recombination events with each gene, calculated as the mean number of recombination events in each of 10,000 permutations randomly placing the identified recombination events.
OR events, the observed number of overlapping recombination events with each gene, as identified by Gubbins.
Ratio (fold change) of observed recombination events compared to predicted recombination events.
Probability (P) that the number of observed recombination events is observed by chance based on 10,000 permutations randomly placing the identified recombination events throughout the clades 1 and 2 core genome.
Gene in F3 regulon.
Gene in S layer.
Gene in F2 regulon.
Gene in F1 regulon.
PTS gene.
Membrane protein gene.
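The footnotes above describe a permutation test: identified recombination events are randomly re-placed on the core genome, the mean overlap count across permutations gives the predicted (PR) value, and P reflects how often chance placement matches the observed (OR) count. A minimal Python sketch of that idea follows. It is not the authors' implementation. It assumes linear coordinates with half-open intervals, uniform random placement that preserves each event's length, and a one-sided P value equal to the fraction of permutations with at least as many overlaps as observed. All names are hypothetical.

```python
import random


def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) share a position."""
    return a_start < b_end and b_start < a_end


def count_overlaps(events, gene):
    """Number of (start, end) events that overlap the (start, end) gene."""
    g_start, g_end = gene
    return sum(overlaps(s, e, g_start, g_end) for s, e in events)


def permutation_test(events, gene, genome_length, n_perm=10_000, seed=0):
    """Return (observed, predicted, fold_change, p_value) for one gene.

    Assumption: each event keeps its length and is placed at a uniformly
    random start so that it fits inside [0, genome_length).
    """
    rng = random.Random(seed)
    observed = count_overlaps(events, gene)
    lengths = [end - start for start, end in events]
    total = 0
    at_least_observed = 0
    for _ in range(n_perm):
        shuffled = []
        for length in lengths:
            start = rng.randint(0, genome_length - length)  # inclusive bounds
            shuffled.append((start, start + length))
        n = count_overlaps(shuffled, gene)
        total += n
        at_least_observed += n >= observed
    predicted = total / n_perm  # mean overlap count under random placement
    fold_change = observed / predicted if predicted else float("inf")
    p_value = at_least_observed / n_perm  # one-sided, assumed definition
    return observed, predicted, fold_change, p_value
```

With this definition, a gene for which no permutation reaches the observed count gets a P value of 0, which a table would report as below 1/n_perm rather than as exactly zero.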
Functional categories enriched for recombination in C. difficile clades 1 and 2
| Functional category | Fold enrichment | P value (unadjusted) | P value (Benjamini) |
|---|---|---|---|
| KEGG pathway | | | |
| Flagellar assembly | 7.8 | <0.001 | <0.001 |
| Phosphotransferase system (PTS) | 2.7 | <0.001 | <0.001 |
| Amino sugar and nucleotide sugar metabolism | 3.0 | <0.001 | <0.001 |
| Fructose and mannose metabolism | 2.5 | <0.001 | 0.0086 |
| Pentose and glucuronate interconversions | 3.1 | 0.032 | 0.28 |
| Pentose phosphate pathway | 2.4 | 0.039 | 0.28 |
| Bacterial chemotaxis | 2.6 | 0.064 | 0.38 |
| COG ontology | | | |
| Cell envelope biogenesis, outer membrane | 3.1 | <0.001 | <0.001 |
| Defense mechanisms | 1.6 | 0.078 | 0.66 |
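The table reports each P value both unadjusted and Benjamini-adjusted. The sketch below shows how such an adjustment is commonly computed, assuming the "Benjamini" column refers to the Benjamini-Hochberg false discovery rate procedure; the source does not state this explicitly. The function name is hypothetical, and the adjusted values in the table cannot be reproduced from the listed rows alone, because the correction depends on the full set of categories tested.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values, in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Because each raw value is scaled by the number of tests divided by its rank, a modest unadjusted P value can become much larger after adjustment when many categories are tested, which is the pattern seen in the lower rows of the table.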
FIG 2 Distribution of recombination events throughout the C. difficile genome. From innermost circle outward: positions in the core genome of clades 1 and 2 (purple) used as input for recombination detection; histogram of recombination events overlapping with each position in the C. difficile genome (dark gray represents more recombination than average, light gray represents less recombination than average); flagellar genes (turquoise); S-layer cassette genes (teal); phosphotransferase system (PTS) genes (dark red); genes involved in amino sugar and nucleotide sugar metabolism (salmon); genes involved in fructose and mannose metabolism (orange); and genes annotated as membrane proteins (navy). Areas highlighted in light blue represent genes that have more recombination than expected with a P value of <0.05 as determined by a permutation test. Numbers around the circular genome plot mark the nucleotide position in the reference genome C. difficile 630.
FIG 3 Recombination in flagellar genes in C. difficile clades 1 and 2. All variants in the flagellar region of the C. difficile clades 1 and 2 core genome are plotted on the x axis based on genomic position. For each genome on the y axis, the variant is shown as black (F1 gene variant present), blue (F3 gene variant present), or white (variant absent). (a and b) The phylogeny in panel a is a maximum-likelihood tree based on all vertically inherited core variants, and the phylogeny in panel b is a neighbor-joining tree based on only the variants in the flagellar region. For panels a and b, clade 1 genomes are shown in shades of green to blue and clade 2 genomes are shown in shades of yellow to red based on their position in the maximum-likelihood phylogeny presented in panel a.