Martina Aulitto, Laura Martinez-Alvarez, Gabriella Fiorentino, Danila Limauro, Xu Peng, Patrizia Contursi.
Abstract
The production of biochemicals requires microbial strains with efficient substrate conversion and excellent environmental robustness, such as those of the species Weizmannia coagulans. So far, the genomes of 47 strains have been sequenced. Herein, we report a comparative genomic analysis of nine strains covering the full repertoire of Carbohydrate-Active enZymes (CAZymes), secretion systems, and resistance mechanisms to environmental challenges. The Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) immune system, along with the CRISPR-associated (Cas) genes, was also analyzed. Overall, this study expands our understanding of the genomic diversity of W. coagulans strains, which is needed to fully exploit their potential in biotechnological applications.
Keywords: B. coagulans; CAZymes; CRISPR-Cas systems; bacteriocins
Year: 2022 PMID: 35328559 PMCID: PMC8954581 DOI: 10.3390/ijms23063135
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Phylogenetic tree based on the genome comparison of the nine strains of W. coagulans.
ANI value comparisons between the nine strains analyzed.
| Query | Reference | ANI Estimate (%) |
|---|---|---|
| MA-13 | CSIL1 | 94.68 |
| MA-13 | 36D1 | 94.85 |
| MA-13 | XZL9 | 95.04 |
| MA-13 | P38 | 95.15 |
| MA-13 | H-1 | 97.89 |
| MA-13 | 2–6 | 97.98 |
| MA-13 | XZL4 | 98.22 |
| MA-13 | DSM1 | 98.43 |
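
The ANI table can be read as a ranking of the eight reference strains by their similarity to MA-13. The sketch below is only an illustration of that reading: the values are copied from the table (strain names as used in the other tables), while the function names and the 97.0% cutoff in the usage note are hypothetical and not taken from the study.

```python
from typing import Dict, List, Tuple

# ANI estimates (%) of each reference strain against MA-13, copied from the table.
ANI_VS_MA13: Dict[str, float] = {
    "CSIL1": 94.68,
    "36D1": 94.85,
    "XZL9": 95.04,
    "P38": 95.15,
    "H-1": 97.89,
    "2–6": 97.98,
    "XZL4": 98.22,
    "DSM1": 98.43,
}


def rank_by_ani(ani: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (strain, ANI) pairs sorted from most to least similar."""
    return sorted(ani.items(), key=lambda item: item[1], reverse=True)


def strains_at_or_above(ani: Dict[str, float], cutoff: float) -> List[str]:
    """Return strains whose ANI is >= cutoff, most similar first.

    The cutoff is supplied by the caller; no particular value is implied
    by the source table.
    """
    return [strain for strain, value in rank_by_ani(ani) if value >= cutoff]


if __name__ == "__main__":
    for strain, value in rank_by_ani(ANI_VS_MA13):
        print(f"{strain}\t{value:.2f}")
```

For example, `strains_at_or_above(ANI_VS_MA13, 97.0)` separates the four strains with ANI near 98% (DSM1, XZL4, 2–6, H-1) from the four near 95%.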
Figure 2. Comparative genomic analysis of W. coagulans MA-13 and its eight closest relatives.
Figure 3. Heat map of singletons. The depth of color corresponds to the number of proteins. Dark green and light green/white represent the highest and lowest number of proteins, respectively.
CAZyme distribution in the genomes of W. coagulans strains. The total number of genes for each genome is taken from NCBI, and the number of CAZymes was obtained by searching against the dbCAN2 HMMs of CAZy families.
| Strain | Total No. of Genes | No. of CAZymes | % CAZymes |
|---|---|---|---|
| MA-13 | 2689 | 40 | 1.49 |
| DSM1 | 2766 | 45 | 1.62 |
| 36D1 | 3128 | 48 | 1.53 |
| XZL9 | 3112 | 46 | 1.47 |
| P38 | 3097 | 46 | 1.48 |
| 2–6 | 2743 | 39 | 1.42 |
| H-1 | 2536 | 36 | 1.42 |
| XZL4 | 2593 | 43 | 1.65 |
| CSIL1 | 3165 | 50 | 1.56 |
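
The "% CAZymes" column is the number of CAZymes divided by the total number of genes, expressed as a percentage. A minimal sketch of that arithmetic follows; the function name is hypothetical, and recomputed values may differ from the table in the last digit for some rows, depending on whether the published values were rounded or truncated.

```python
def cazyme_percent(n_cazymes: int, n_genes: int, ndigits: int = 2) -> float:
    """Percentage of genes annotated as CAZymes, rounded to `ndigits` decimals."""
    if n_genes <= 0:
        raise ValueError("n_genes must be positive")
    return round(100.0 * n_cazymes / n_genes, ndigits)


if __name__ == "__main__":
    # (total genes, CAZymes) for three rows copied from the table.
    rows = {"MA-13": (2689, 40), "2–6": (2743, 39), "H-1": (2536, 36)}
    for strain, (genes, cazymes) in rows.items():
        print(f"{strain}\t{cazyme_percent(cazymes, genes)}")
```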
Figure 4. Heat map showing the genome-wide distribution of CAZymes in the W. coagulans strains. A total of 36 families were identified by searching against the dbCAN2 HMMs of CAZy families. Blue and white represent the highest and lowest number of proteins, respectively.
Non-core genes of W. coagulans secretion systems.
| Accession Number | Strain | Function |
|---|---|---|
| WP_035181994.1 | 2–6 | 6 kDa early secretory antigenic target ESAT-6 (EsxA) |
| WP_041818879.1 | 2–6 | Putative secretion accessory protein EsaA/YueB @ Bacteriophage SPP1 receptor |
| WP_013858856.1 | 2–6 | FtsK/SpoIIIE family protein, putative EssC/YukB component of Type VII secretion system |
| WP_035181994.1 | 36D1 | 6 kDa early secretory antigenic target ESAT-6 (EsxA) |
| WP_014096028.1 | 36D1 | secretion protein HlyD |
| WP_029141776.1 | DSM1 | Putative secretion system component EssB/YukC |
| MBF8418457.1 | MA-13 | type VII secretion protein EssA |
| MBF8418459.1 | MA-13 | type VII secretion protein EssB |
| WP_026685044.1 | CSIL1 | secretion protein HlyD |
| WP_052123334.1 | P38 | FtsK/SpoIIIE family protein, putative EssC/YukB component of Type VII secretion system |
| WP_035190278.1 | P38 | FtsK/SpoIIIE family protein, putative EssC/YukB component of Type VII secretion system |
| WP_035190280.1 | P38 | Putative secretion system component EssB/YukC |
| WP_035190310.1 | P38 | Putative secretion accessory protein EsaA/YueB @ Bacteriophage SPP1 receptor |
| WP_035190312.1 | P38 | 6 kDa early secretory antigenic target ESAT-6 (EsxA) |
Figure 5. Genetic determinants of bacitracin resistance in MA-13.
Figure 6. Graphical representation of the circularin A gene cluster in diverse W. coagulans strains.
Figure 7. Heat map of the distribution of innate immunity systems in the W. coagulans strains. Green and white represent the highest and lowest number of proteins, respectively.
Figure 8. CRISPR-Cas cassettes present in W. coagulans genomes. The cas operons for each strain are depicted: (A) MA-13, (B) DSM1, (C) 2–6, (D) H-1, (E) P38, (F) XZL9, (G) 36D1, and (H) CSIL1. The headers above each operon give the type of cas operon, strain, contig, and location. The color of the cas genes corresponds to their associated stage of the CRISPR-Cas response: blue, adaptation; yellow, interference; red, crRNA maturation. CRISPR arrays are depicted as black vertical lines.
Figure 9. Heat map of the distribution of cas operon genes among W. coagulans strains. Green and white represent the highest and lowest number of genes, respectively.
CRISPR arrays in the W. coagulans strains.
| Strain | Contig | Prediction | Start | End | Consensus repeat | Orientation | Cas-associated | No. of repeats |
|---|---|---|---|---|---|---|---|---|
| 2–6 | 2–6_NC_015634.1 | I-B | 2,499,794 | 2,503,038 | GTTGAACTTTAACATTGGATGTATTTAAAT | R | yes | 50 |
| 2–6 | 2–6_NC_015634.1 | I-B | 2,497,017 | 2,497,373 | GTTTCAATTCCTCATAGGTAAAATACAAAC | R | yes | 6 |
| 36D1 | 36D1_NC_016023.1 | Unknown | 320,637 | 321,284 | TTTTGAAGCCGTCAAAAGGACAAAA | F | orphan | 13 |
| 36D1 | 36D1_NC_016023.1 | I-B | 1,096,065 | 1,097,943 | GTTAGTATTTTACCTATGAGGAATTGAAAC | R | orphan | 29 |
| 36D1 | 36D1_NC_016023.1 | I-B | 2,117,433 | 2,121,795 | GTTTGTATTTTACCTATGAGGAATTGAAAC | F | yes | 66 |
| 36D1 | 36D1_NC_016023.1 | I-B | 2,123,872 | 2,124,703 | GTTTGTATTTTACCTATGAGGAATTGAAAC | F | yes | 13 |
| 36D1 | 36D1_NC_016023.1 | I-B | 2,126,172 | 2,127,007 | GTTTGTATTTTACCTATGAGGAATTGAAAC | F | yes | 13 |
| CSIL1 | CSIL1_NZ_AXVW01000068.1_scaffold00063.63_C | I-C | 13,264 | 13,753 | GTCACACTCCTTGCGAGTGTGTGGATTGAAAT | F | yes | 8 |
| CSIL1 | CSIL1_NZ_AXVW01000098.1_scaffold00091.91_C | I-C | 7 | 236 | GTCGCTCCCTACATGGGGGCGTGGATTGAAATC | F | orphan | 4 |
| CSIL1 | CSIL1_NZ_KI519465.1_scaffold00026.26 | I-B | 19,196 | 19,356 | GTTAGTATTTTACCTATGAGGAATTGAAA | F | orphan | 3 |
| DSM1 | DSM1_NZ_CP009709.1 | I-C | 1,536,669 | 1,541,901 | GTCGCTCCCTACGTGGGGGCGTGGATTGAAAT | R | yes | 80 |
| DSM1 | DSM1_NZ_CP009709.1 | I-B | 2,584,186 | 2,585,866 | GTTAGTATTTTACCTATGAGGAATTGAAAC | F | orphan | 26 |
| DSM1 | DSM1_NZ_CP009709.1 | I-C | 2,785,593 | 2,786,680 | GTCACACTCCTCGTGAGTGTGTGGATTGAAAT | F | yes | 17 |
| H-1 | H-1_NZ_ANAQ01000049.1_000049 | I-C | 9436 | 9599 | GTCACACTCCTCGTGAGTGTGTGAATTGAAAT | F | yes | 3 |
| H-1 | H-1_NZ_ANAQ01000116.1_000116 | I-C | 43 | 204 | GTCACACTCCTCGTGAGTGTGTGAATTGAAAT | F | orphan | 3 |
| H-1 | H-1_NZ_ANAQ01000125.1_000125 | I-C | 3689 | 3983 | GTCACACTCCTCGTGAGTGTGTGAATTGAAAT | R | orphan | 5 |
| H-1 | H-1_NZ_ANAQ01000203.1_000203 | I-B | 74 | 2139 | GTTGAACTTTAACATTGGATGTATTTAAAT | F | orphan | 32 |
| H-1 | H-1_NZ_ANAQ01000268.1_000268 | I-B | 17,520 | 20,714 | GTTGAACTTTAACATTGGATGTATTTAAAT | F | yes | 49 |
| H-1 | H-1_NZ_ANAQ01000293.1_000293 | I-B | 112 | 339 | GTTGAACTTTAACATTGGATGTATTTAAATT | R | orphan | 4 |
| MA-13 | MA-13_NZ_SMSP02000017.1_69 | I-C | 32 | 1577 | GTCGCTCCCTACATGGGGGCGTGGATTGAAAT | F | orphan | 24 |
| MA-13 | MA-13_NZ_SMSP02000061.1_151 | I-C | 8671 | 10,021 | GTCGCTCCCTACATGGGGGCGTGGATTGAAAT | R | yes | 21 |
| MA-13 | MA-13_NZ_SMSP02000062.1_309 | I-C | 52,270 | 52,630 | GTCGCTCCCTACATGGGGGCGTGGATTGAAAT | F | yes | 6 |
| P38 | P38_NZ_JSVI01000006.1_6 | I-B | 99,580 | 101,051 | GTTGAACTTTAACATTGGATGTATTTAAAT | F | yes | 23 |
| P38 | P38_NZ_JSVI01000107.1_111 | I-B | 446 | 1784 | GTTGAACTTTAACATTGGATGTATTTAAAT | R | orphan | 21 |
| XZL9 | XZL9_NZ_ANAP01000038.1_000038 | I-B | 122 | 953 | GTTTGTATTTTACCTATGAGGAATTGAAAC | R | yes | 13 |
| XZL9 | XZL9_NZ_ANAP01000157.1_000160 | I-B | 97 | 1197 | GTTTGTATTTTACCTATGAGGAATTGAAAC | F | orphan | 17 |
| XZL9 | XZL9_NZ_ANAP01000220.1_000226 | I-B | 68 | 3018 | GTTTGTATTTTACCTATGAGGAATTGAAAC | R | orphan | 45 |
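
Some consensus repeats in the table are the same sequence read from opposite strands, and others differ by a single base. For instance, the second 2–6 repeat (GTTTCAATTC...) is the reverse complement of the repeat listed for the 36D1 and XZL9 type I-B arrays (GTTTGTATTT...). The sketch below shows one way to check such relationships; the helper names are hypothetical and this is not the authors' analysis pipeline.

```python
# Hypothetical helpers for comparing CRISPR consensus repeats.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def same_repeat_either_strand(a: str, b: str) -> bool:
    """True if b equals a or the reverse complement of a."""
    return b.upper() in (a.upper(), reverse_complement(a))


if __name__ == "__main__":
    repeat_36d1 = "GTTTGTATTTTACCTATGAGGAATTGAAAC"  # 36D1 / XZL9, type I-B
    repeat_2_6 = "GTTTCAATTCCTCATAGGTAAAATACAAAC"   # 2–6, second array
    repeat_dsm1 = "GTTAGTATTTTACCTATGAGGAATTGAAAC"  # DSM1 orphan array

    print(same_repeat_either_strand(repeat_36d1, repeat_2_6))
    print(hamming(repeat_36d1, repeat_dsm1))
```

Normalizing repeats to a single strand before comparison avoids treating the same repeat family as two different ones only because the arrays were reported in different orientations.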
Figure 10. Graphical representation of interaction networks between spacers and viruses. Nodes represent spacers and virus sequences, and lines represent a match between them. (A) Spacer nodes are colored by their strain of origin. (B) Virus nodes are colored by their ecosystem of isolation.
Figure 11. Circular visualization of predicted genomic islands. Blocks are colored according to the prediction method: IslandPath-DIMOB (blue), SIGI-HMM (orange), and the integrated results (dark red).
Figure 12. Linear genomic map of prophage phBC6A52, identified with PHASTER in the MA-13 genome.