Charles J Norsigian1, Heather A Danhof2,3, Colleen K Brand2,3, Numan Oezguen4, Firas S Midani2,3, Bernhard O Palsson1, Tor C Savidge4, Robert A Britton2,3, Jennifer K Spinler4, Jonathan M Monk5.
Abstract
Hospital-acquired Clostridioides (Clostridium) difficile infection is exacerbated by the continued evolution of C. difficile strains, a phenomenon studied by multiple laboratories using stock cultures specific to each laboratory. Intralaboratory evolution of strains contributes to interlaboratory variation in experimental results, adding to the challenges of scientific rigor and reproducibility. To explore how microevolution of C. difficile within laboratories influences the metabolic capacity of an organism, three different laboratory stock isolates of the C. difficile 630 reference strain were whole-genome sequenced and profiled in over 180 nutrient environments using phenotypic microarrays. The results identified differences in growth dynamics for 32 carbon sources including trehalose, fructose, and mannose. An updated genome-scale model for C. difficile 630 was constructed and used to contextualize the 28 unique mutations observed between the stock cultures. The integration of phenotypic screens with model predictions identified pathways enabling catabolism of ethanolamine, salicin, arbutin, and N-acetyl-galactosamine that differentiated individual C. difficile 630 laboratory isolates. The reconstruction was used as a framework to analyze the core-genome of 415 publicly available C. difficile genomes and identify areas of metabolism prone to evolution within the species. Genes encoding enzymes and transporters involved in starch metabolism and iron acquisition were more variable, while metabolic functions distinctive to C. difficile, such as Stickland fermentation, were more conserved. A substitution in the trehalose PTS system was identified with potential implications in strain virulence. Thus, pairing genome-scale models with large-scale physiological and genomic data enables a mechanistic framework for studying the evolution of pathogens within microenvironments and will lead to predictive modeling to combat pathogen emergence.
Year: 2020 PMID: 33082337 PMCID: PMC7576604 DOI: 10.1038/s41540-020-00151-9
Source DB: PubMed Journal: NPJ Syst Biol Appl ISSN: 2056-7189
Fig. 1 Experimental phenotyping of three different laboratory stock cultures of C. difficile 630.
The Savidge 630, Britton 630, and Britton 630Δerm strains are represented by red, green, and blue, respectively. (A) Heat map of the maximal OD620 of C. difficile strains in Biolog phenotype microarray plates, restricted to the substrates whose fold change had the greatest standard deviation across the strains. Selected carbon substrates supporting differential fold change are shown (n = 3 biological replicates per strain). (B) Venn diagram of the 190 carbon substrates tested. All three strains shared 158 growth phenotypes, while 21 phenotypes were shared between Savidge 630 and Britton 630, 9 between Britton 630 and Britton 630Δerm, and 2 between Britton 630Δerm and Savidge 630. (C) Venn diagram detailing the identified gene deletions of each strain versus the reference sequence. (D) Venn diagram detailing the mutations of each strain versus the reference sequence.
Comparison of SNVs detected across three C. difficile 630 laboratory stock strains.
| Gene | Description | Mutation | Annotation | Savidge 630 | Britton 630 | Britton 630Δerm |
|---|---|---|---|---|---|---|
| CD630_05730/thrS | Putative membrane protein/Threonyl-tRNA synthetase | C → T | Intergenic (+173/−843) | ✓ | ✓ | ✓ |
| CD630_05770/CD630_05780 | Conserved hypothetical protein/transporter, Major Facilitator superfamily (MFS) | A → T | Intergenic (−126/+35) | ✓ | ✓ | ✓ |
| CD630_24550/CD630_24560 | Putative CRISPR-associated protein/ABC-type transport system, sugar-family ATP-binding protein | G → T | Intergenic (−543/−193) | ✓ | ✓ | ✓ |
| rplC | 50S ribosomal protein L3 | G → T | G114G (GGG → GGT) | ✓ | ✓ | ✓ |
| CD630_11900 | Putative acetyltransferase | T → C | F133L (TTT → CTT) | ✓ | ✓ | ✓ |
| CD630_17670 | Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) | C → G | P33A (CCC → GCC) | ✓ | ✓ | ✓ |
| CD630_25320 | Aminotransferase, alanine-glyoxylate transaminase | C → T | E304E (GAG → GAA) | ✓ | ✓ | ✓ |
| CD630_13880 | Putative transcriptional regulator | (T) 6→7 | Coding (40/45 nt) | ✓ | ✓ | ✓ |
| CD630_31561 | Fragment of conserved hypothetical protein | +A | Coding (309/339 nt) | ✓ | – | ✓ |
| CD630_34170/CD630_34180 | ABC-type transport system, sugar-family ATP-binding protein/Precorrin-2 dehydrogenase | A → G | Intergenic (−3769/+1786) | ✓ | ✓ | – |
| CD630_34170/CD630_34180 | ABC-type transport system, sugar-family ATP-binding protein/Precorrin-2 dehydrogenase | +C | Intergenic (−3628/+1927) | ✓ | – | – |
| CD630_19000/CD630_19010 | Conserved hypothetical protein/Transcriptional regulator, Phage-type | A → T | Intergenic (−160/−294) | ✓ | – | – |
| CD630_02050 | Transcription antiterminator, PTS operon regulator | G → T | G165C (GGT → TGT) | ✓ | – | – |
| CD630_26850 | Putative sporulation stage II, protein E | Δ21 bp | Coding (339–359/1770 nt) | ✓ | – | – |
| CD630_32450 | Transcriptional regulator, sigma-54 dependent | C → T | E261K (GAA → AAA) | ✓ | – | – |
| CD630_26270 | Conserved hypothetical protein | C → A | G68C (GGT → TGT) | ✓ | – | – |
| CD630_30890 | PTS system, glucose-like IIBC component | T → G | E258D (GAA → GAC) | ✓ | – | – |
| CD630_26670 | PTS system, glucose specific IIBC component | C → T | V228I (GTT → ATT) | ✓ | – | – |
| CD630_26670 | PTS system, glucose specific IIBC component | A → C | *524E (TAA → GAA) | – | – | ✓ |
| CD630_26670 | PTS system, glucose specific IIBC component | (T) 8→7 | Coding (1558/1572 nt) | – | – | ✓ |
| CD630_20270 | N-carbamoyl-L-amino acid hydrolase | G → A | G373E (GGG → GAG) | – | – | ✓ |
| CD630_06430 | Two-component response regulator | T → C | I199I (ATT → ATC) | – | – | ✓ |
| CD630_07610 | Putative ATP-dependent RNA helicase | G → T | D136Y (GAC → TAC) | – | – | ✓ |
| CD630_12480 | Ribonuclease III | G → T | G59V (GGC → GTC) | – | – | ✓ |
| CD630_14040 | Putative oligopeptide transporter | A → G | E536G (GAA → GGA) | – | – | ✓ |
| CD630_12740 | DNA topoisomerase I | C → T | Q386* (CAA → TAA) | – | – | ✓ |
| CD630_22630 | Peptidyl-prolyl cis-trans isomerase, PpiC-type | G → T | S127* (TCA → TAA) | – | – | ✓ |
| CD630_22670 | Fragment of membrane protein, abortive infection-type protein | (A) 5→6 | Coding (280/321 nt) | – | – | ✓ |
| CD630_29430 | Putative phage replication protein | T → C | N210D (AAT → GAT) | – | – | ✓ |
| CD630_33790 | Putative conjugate transposon protein Tn916-like, CTn7-Orf15 | C → A | E63D (GAG → GAT) | – | – | ✓ |
| CD630_33980 | Putative hydrolase, NUDIX family | C → A | G9C (GGT → TGT) | – | – | ✓ |
| CD630_30360/CD630_30370 | Transporter, Major Facilitator Superfamily (MFS)/Transcriptional regulator, CarD family | G → T | Intergenic (−1521/+386) | – | – | ✓ |
| treR | Transcriptional regulator, GntR family | Δ6 bp | Coding (192–197/723 nt) | – | – | ✓ |
| CD630_12060 | Putative membrane protein | A → T | K120N (AAA → AAT) | – | ✓ | – |
| CD630_27920 | Protein translocase subunit secA 2 | T → A | P669P (CCA → CCT) | – | ✓ | – |
| CD630_31840/CD630_31850 | Diaminopropionate ammonia-lyase/Fragment of putative RNA-binding protein | Δ9 bp | Intergenic (−393/+222) | – | ✓ | – |
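
The amino-acid annotations in the table above can be re-derived from the listed codon changes using the standard genetic code. The following is a minimal sketch in Python, not part of the original analysis; the helper name `annotate` and the choice of rows are illustrative assumptions.

```python
from itertools import product

# Standard genetic code in TCAG order. The bacterial code (NCBI table 11)
# assigns the same amino acids to codons and differs only in start codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def annotate(position, ref_codon, alt_codon):
    """Return an annotation such as 'E258D' for a single-codon substitution."""
    return f"{CODON_TABLE[ref_codon]}{position}{CODON_TABLE[alt_codon]}"


# (residue position, reference codon, mutant codon), copied from table rows.
rows = [
    (258, "GAA", "GAC"),  # CD630_30890
    (304, "GAG", "GAA"),  # CD630_25320 (synonymous)
    (386, "CAA", "TAA"),  # CD630_12740 (gain of stop)
    (524, "TAA", "GAA"),  # CD630_26670 (loss of stop)
]
for pos, ref, alt in rows:
    print(annotate(pos, ref, alt))
```

Codons are written on the coding strand, so a gene on the reverse strand shows the complement of the base change given in the Mutation column.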
Comparison of deletions detected across three C. difficile 630 laboratory stock strains.
| Gene | Description | Savidge 630 | Britton 630 | Britton 630Δerm |
|---|---|---|---|---|
| CD630_10250 | ABC-type transport system, spermidine/putrescine permease | 1197421:1197442 | 1197421:1197445 | 1197421:1197443 |
| CD630_34170/CD630_34180 | ABC-type transport system, sugar-family ATP-binding protein/Precorrin-2 dehydrogenase | 4004552–4006944 :4007449–4007360 | 4004579–4006943 :4007462–4007423 | 4004577–4006952 :4007434–4007372 |
| CD630_02880 | PTS system, mannose/fructose/sorbose IIC component | 348613:348633 | – | 348613:348633 |
| CD630_01960 | Fragment of conserved hypothetical protein, DUF111 family | 255270:255292 | – | – |
| CD630_12100 | Conserved hypothetical protein | – | 1409337–1409360 | – |
| CD630_02440/CD630_02450 | Putative CDP-glycerol:poly(glycerophosphate) glycerophosphotransferase/Flagellar basal-body rod protein FlgB | – | – | 309199–309207 |
| CD630_31350 | Putative fructose-1,6-bisphosphate aldolase | – | – | 3654239–3654260 |
| CD630_09390-[CD630_09770] | 46 genes: putative phage protein | – | – | 1110626–1125088 :1141390–1125187 |
| [CD630_28900]-CD630_29521 | 44 genes: putative phage protein | – | – | 3381246–3397454 :3412003–3397553 |
| [CD630_20060]-[ermB1] | 8 genes | – | – | 2316723–2317718 :2320011–2319025 |
Fig. 2 Properties and validation metrics of iCN900.
(A) Model predictions for biomass flux on four different in silico media types: complex media, CADM, BDM, and minimal media. Importantly, the biomass objective flux reflects the increasing amount of nutrients in each media condition. The overall gene, reaction, and metabolite content of iCN900 is summarized within the inset box. (B) Comparison of model-predicted essential genes on complex media with experimental gene-knockout results from Dembek et al. (C) C. difficile optical density at 620 nm was measured over time in Biolog Phenotype Microarray plates. Representative growth curves for the Savidge 630 strain on 5 indicated carbon sources (of the 190 tested) and the negative control are shown. Experimental growth of C. difficile was compared to iCN900 metabolic flux predictions to determine the accuracy of the predictions, as summarized in the inset box. (D) Putative metabolic pathways for C. difficile utilization of salicin and arbutin were incorporated into iCN900 through targeted gap-filling enabled by comparison to experimental growth data.
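
The comparison of experimental growth with model predictions in panel (C) amounts to scoring binary growth calls. The sketch below shows one way such a comparison could be tallied in Python. The fold-over-control cutoff, the function names, and every numeric value are assumptions for illustration and are not taken from the study.

```python
def call_growth(max_od, control_od, fold=1.5):
    """Call growth when max OD exceeds the negative control by `fold` (assumed cutoff)."""
    return max_od >= fold * control_od


def confusion(observed, predicted):
    """Tally agreement between observed and predicted growth calls per substrate."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for substrate in observed.keys() & predicted.keys():
        obs, pred = observed[substrate], predicted[substrate]
        if obs and pred:
            counts["TP"] += 1
        elif not obs and not pred:
            counts["TN"] += 1
        elif pred:
            counts["FP"] += 1
        else:
            counts["FN"] += 1
    return counts


def accuracy(counts):
    """Fraction of substrates where prediction and observation agree."""
    total = sum(counts.values())
    return (counts["TP"] + counts["TN"]) / total if total else float("nan")


# Placeholder values for illustration only; these are NOT measurements from the study.
control_od = 0.10
max_od = {"trehalose": 0.45, "fructose": 0.60, "salicin": 0.30, "substrate_X": 0.11}
observed = {s: call_growth(od, control_od) for s, od in max_od.items()}
predicted = {"trehalose": True, "fructose": True, "salicin": False, "substrate_X": False}

counts = confusion(observed, predicted)
print(counts, accuracy(counts))
```

In this toy setup, a substrate with observed growth but no predicted growth is counted as a false negative, which is the kind of mismatch that targeted gap-filling, as in panel (D), is meant to resolve.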
Fig. 3 Characterization of phenotypic growth differences of lab-adapted isolates on trehalose.
(A) Predicted protein structure of the C. difficile PTS (CD630_30890) based upon the crystal structure of the MalT transporter. The EIIC domain is shown as a dimer, with the E285D substitution of the Savidge 630 isolate highlighted in red on the cytoplasmic interface. The model shading indicates amino acid hydrophobicity (gray residues are hydrophobic and blue residues are hydrophilic according to the Kyte-Doolittle scale). (B) Growth curves of C. difficile isolates Savidge 630 (red), Britton 630 (green), and Britton 630Δerm (blue) in defined minimal medium supplemented with trehalose. The gray line indicates the maximal optical density of the negative control wells. Optical density at 620 nm was measured at 10 min intervals. The plotted bar is the mean of three biological replicates assayed in duplicate wells, and the error bars represent the standard deviation of the mean. (C) Growth curves from the conditions in (B) were analyzed by Gaussian process curve fitting to calculate the total carrying capacity, doubling time, and total area under the curve (error bars represent the standard deviation of the mean; **P ≤ 0.001, ***P ≤ 0.0001).
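
The three metrics in panel (C) can be approximated directly from a time series of optical density reads. The sketch below is a deliberately simpler substitute for the Gaussian process fitting named in the caption: it uses the trapezoidal rule and the steepest log-slope between adjacent reads. Function names and the synthetic logistic curve are assumptions for illustration.

```python
import math


def area_under_curve(times, od):
    """Trapezoidal area under the OD curve."""
    return sum((t1 - t0) * (y0 + y1) / 2
               for t0, t1, y0, y1 in zip(times, times[1:], od, od[1:]))


def carrying_capacity(od):
    """Maximum OD reached above the first read."""
    return max(od) - od[0]


def doubling_time(times, od):
    """ln(2) divided by the steepest slope of ln(OD) between adjacent reads."""
    rates = [
        (math.log(y1) - math.log(y0)) / (t1 - t0)
        for t0, t1, y0, y1 in zip(times, times[1:], od, od[1:])
        if y0 > 0 and y1 > 0  # the log is undefined for non-positive reads
    ]
    mu = max(rates, default=0.0)
    return math.log(2) / mu if mu > 0 else math.inf


# Synthetic logistic curve sampled every 10 min (N0 = 0.01, K = 0.5, r = 0.02 per min).
times = [10 * i for i in range(73)]
od = [0.5 / (1 + 49 * math.exp(-0.02 * t)) for t in times]

print(area_under_curve(times, od), carrying_capacity(od), doubling_time(times, od))
```

Adjacent-read slopes are sensitive to measurement noise, which is one reason a smoothing model such as a Gaussian process is preferable on real plate-reader data.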
Fig. 4 Core-genome of C. difficile reveals metabolic subsystems with greater sequence variation.
(A) By comparing 415 publicly available C. difficile genomes, the core-genome was calculated and includes 765 metabolic genes. (B) Analyzing the sequence variation among the 765 core metabolic genes demonstrates that the average difference in amino acid sequence ranges from 0 to just over 20 for these shared genes. (C) The genome-scale reconstruction enables stratification of the genes by metabolic subsystem and comparison of the average amino acid differences of each gene within a subsystem. This reveals that nitrite and starch/sucrose metabolism have the highest degree of sequence variation, whereas Stickland reactions and leucine fermentation are the most conserved.
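
The core-genome and per-gene variation described in this caption can be sketched with plain set and string operations. The example below is a minimal illustration in Python. It assumes gene presence is already known per genome and that protein sequences are aligned to equal length; all identifiers and sequences are hypothetical.

```python
def core_genome(gene_sets):
    """Return the genes present in every genome (intersection of all gene sets)."""
    sets = [set(genes) for genes in gene_sets.values()]
    return set.intersection(*sets) if sets else set()


def aa_differences(ref, seq):
    """Count mismatched positions between two aligned, equal-length protein sequences."""
    if len(ref) != len(seq):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(ref, seq))


def mean_differences(ref, seqs):
    """Average number of amino acid differences from the reference across strains."""
    return sum(aa_differences(ref, s) for s in seqs) / len(seqs)


# Hypothetical toy input: three genomes and the genes annotated in each.
genomes = {
    "genome_1": {"geneA", "geneB", "geneC"},
    "genome_2": {"geneA", "geneB"},
    "genome_3": {"geneA", "geneB", "geneD"},
}
print(sorted(core_genome(genomes)))
print(mean_differences("MKTL", ["MKTL", "MNTL", "MNTV"]))
```

Averaging `mean_differences` over the genes assigned to each subsystem would give a per-subsystem comparison of the kind shown in panel (C).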
Fig. 5 Allele diversity for thiD as an example of sequence diversity.
(A) The 415 sequences for the thiD gene comprise 11 variant sequences (alleles) that are variably present within the population. Notably, the reference sequence allele is present within 79.7% of the population, whereas the next most frequent allele is present in 5.3% of the population. (B) The degree of similarity between each sequence is readily accessible. For example, the thiD 6 and thiD 7 sequences are similar to one another, sharing a K60N mutation. (C) Using GEM-PRO, the mutations of each variant can be visualized within the 3D space of crystal structures, where applicable.
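
Allele frequencies of the kind reported in panel (A), and shared substitutions such as the one in panel (B), can be derived by counting identical sequences and comparing each variant with the reference. The sketch below illustrates this in Python on hypothetical three-residue sequences; the function names are assumptions and the toy data are unrelated to thiD.

```python
from collections import Counter


def allele_frequencies(sequences):
    """Return (sequence, fraction of population) pairs, most frequent first."""
    counts = Counter(sequences)
    total = len(sequences)
    return [(seq, n / total) for seq, n in counts.most_common()]


def substitutions(ref, alt):
    """List substitutions such as 'K2N' between aligned, equal-length sequences."""
    if len(ref) != len(alt):
        raise ValueError("sequences must be aligned to the same length")
    return [f"{a}{i}{b}" for i, (a, b) in enumerate(zip(ref, alt), start=1) if a != b]


# Hypothetical population: 8 copies of one allele and 2 of a variant.
population = ["MKT"] * 8 + ["MNT"] * 2
freqs = allele_frequencies(population)
reference = freqs[0][0]  # treat the most frequent allele as the reference here
for seq, fraction in freqs:
    print(seq, fraction, substitutions(reference, seq))
```

Two alleles that list the same substitution in this output would be related in the way the caption describes for thiD 6 and thiD 7.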