Literature DB >> 36040103

In silico analysis and a comparative genomics approach to predict pathogenic trehalase genes in the complete genome of Antarctica Shigella sp. PAMC28760.

Prasansah Shrestha1, Jayram Karmacharya1, So-Ra Han1, Hyun Park2, Tae-Jin Oh1,3,4.   

Abstract

Although four Shigella species (S. flexneri, S. sonnei, S. dysenteriae, and S. boydii) have been reported, S. sp. PAMC 28760, an Antarctica isolate, is the only one with a complete genome deposited in NCBI database as an uncharacterized isolate. Because it is the world's driest, windiest, and coldest continent, Antarctica provides an unfavourable environment for microorganisms. Computational analysis of genomic sequences of four Shigella species and our uncategorized Antarctica isolates Shigella sp. PAMC28760 was performed using MP3 (offline version) program to predict trehalase encoding genes as a pathogenic or non-pathogenic form. Additionally, we employed RAST and Prokka (offline version) annotation programs to determine locations of periplasmic (treA) and cytoplasmic (treF) trehalase genes in studied genomes. Our results showed that only 56 out of 134 Shigella strains had two different trehalase genes (treF and treA). It was revealed that the treF gene tends to be prevalent in Shigella species. In addition, both treA and treF genes were present in our strain S. sp. PAMC28760. The main objective of this study was to predict the prevalence of two different trehalase genes (treF and treA) in the complete genome of Shigella sp. PAMC28760 and other complete genomes of Shigella species. Till date, it is the first study to show that two types of trehalase genes are involved in Shigella species, which could offer insight on how the bacteria use accessible carbohydrate like glucose produced from the trehalose degradation pathway, and importance of periplasmic trehalase involvement in bacterial virulence.

Entities:  

Keywords:  HMM; MP3; SVM; Shigella sp.; prokka; trehalase

Mesh:

Substances:

Year:  2022        PMID: 36040103      PMCID: PMC9450901          DOI: 10.1080/21505594.2022.2117679

Source DB:  PubMed          Journal:  Virulence        ISSN: 2150-5594            Impact factor:   5.428


Introduction

Shigella is a Gram-negative bacterium that is genetically related to Escherichia coli [1]. It is a facultative anaerobe and a non-spore former. It belongs to non-motile and rod-shaped bacteria. Shigella are among common causes of diarrhoea worldwide. Shigella infection is one of the top four infections among African and South Asian children [2]. Based on its serological features, Shigella genus can be differentiated into four species: S. dysenteriae (serogroup A), S. flexneri (serogroups B), S. boydii (serogroups C), and S. sonnei (serogroup D). Shigella species has a highly immunogenic O-antigen made of many oligosaccharides unit (O) repeats with a wide range of sugar components, number of repeats, arrangements, and linkages. Each Shigella species can be further differentiated into several serotypes based on O-antigen on its lipopolysaccharide layer: S. dysenteriae having 15 serotypes, S. flexneri having 6 serotypes with 15 subtypes, S. boydii having 18 serotypes, and S. sonnei having only 1 serotype [3-5]. Although serogroups A, B, and C are physiologically identical, due to its positive beta-D-galactosidase and ornithine decarboxylase activity, S. sonnei is distinguished as a single serogroup D [6]. A previous study has reported that 60% of all infections worldwide are caused by S. flexneri. Thus, S. flexneri has been intensively studied, which has enhanced our understanding of Shigella pathophysiology and the underlying “host-pathogen” communication [7]. S. sp. PAMC28760 is a lichen-associated polar bacteria isolated from Antarctica. It has been deposited in the NCBI (National Center for Biotechnology Information) database (https://www.ncbi.nlm.nih.gov/) as an uncharacterized organism. Antarctica is a geographical mass covered with up to 13000 feet of ice and bare rock, with small mosses and lichens being its primary vegetation [8]. Various microorganisms remain unknown in such a harsh environment since they have developed specific adaption abilities towards a wide range of extreme conditions to thrive in such habitat [9]. Generally, Shigella species can grow in a temperature range from (6–8) °C to (45–47) °C [10]. However, temperatures about 65 °C cause their rapid inactivation. Some Shigella species can survive for long durations when they are frozen at −20 °C or refrigerated at 4 °C [11,12]. Bacteria have developed a wide range of coping mechanisms to endure adverse environments such as food deprivation, biochemical and biological changes, and extreme temperatures. Temperature is one of the most crucial elements that can influence microbial protein expression. According to previous studies, expression levels of outer membrane proteins were analysed using proteome profiles of S. flexneri cells grown at 37, 38.5, and 40° C. Pathogens might use the overexpression of specific proteins (18.4, 25.6, and 57.0 kDa) to govern the expression of virulence-related proteins when cells were exposed to higher temperatures [13]. Moreover, cold-adapted enzymes from organisms living in polar regions, deep oceans, and high altitudes have several benefits, they have been increasingly analysed in recent years. Trehalose is also essential to organisms as a survival mechanism in a stress environment because of its unique physiochemical properties, which allow it to protect cell integrity against a different environmental damage and nutritional limitations [14]. Also, trehalose and its derivatives have also been found to possess crucial functions in the pathogenicity of a wide range of organisms, including bacteria (Gram-positive and Gram-negative) and plants [15] Also, trehalose metabolism could be employed as a target for novel pathogen-specific treatments. Trehalose is a disaccharide produced by various organisms. It can be degraded via several pathways. Among these pathways, the trehalose-6-phosphate pathway (TPP) is used by many bacteria to degrade trehalose. This pathway has been investigated under conditions of low osmolarity in both Gram-positive and Gram-negative bacteria [16,17]. It was reported in E. coli K-12 that under different osmolarity conditions, it may survive on trehalose as its sole carbon source and uses different pathways for its breakdown. Likewise, the external trehalose is hydrolysed by periplasmic trehalase (TreA) at high osmotic conditions. At that moment, the glucose PTS then transports the produced glucose molecules back into the cytoplasm [17,18]. During the transition between high and low osmolarity, a second trehalase, cytoplasmic trehalase (TreF), is active which removes the internal pool of trehalose as the cells alter their metabolism to low osmolarity. TreF’s low enzymatic activity is low enough not to interfere with trehalose biosynthesis during high osmolarity, but high enough to breakdown the accumulated trehalose during the return to normal conditions, when no more biosynthesis proceeds [19]. Several prokaryotes and eukaryotes can degrade trehalose to glucose through the enzyme trehalase [EC 3.2.1.28] [20,21]. It has been reported that E. coli has two trehalases, including cytoplasmic trehalase (TreF) and periplasmic trehalase (TreA). The periplasm is a small space between the outer and inner membranes of Gram-negative bacteria. Trehalases from E. coli, such as periplasmic TreA (Tre37A), have an extra C-terminal region, whereas TreF has an extended N-terminal region. Both enzymes are monomeric and have a 47% similarity [22]. Neutral trehalase (L72) is a protein found in Klebsiella oxytoca that has been linked to several functions, including energy sources and stress protection [23]. Experimental evidence of periplasmic treA gene in needed for optimal development of type 1 fimbriae for cell invasion and colonization in extraintestinal pathogenic E. coli (ExPEC) strain MT78 has been addressed in the previous study [24]. Similarly, in Burkholderia pseudomallei, a single trehalase-encoding gene, identical to E. coli TreA, which is involved in stress tolerance and virulence in mouse and insect infection models, plays a role in stress tolerance and virulence [25]. Despite its tiny size, the periplasm contains many important proteins required for a variety of physiological activities and bacterial survival under stress. Periplasmic proteins aid in the defence against different stresses, making it easier for bacteria such as S. Typhimurium to colonize the host [26]. However, there has been no complete analysis of the expression of many periplasmic proteins, especially periplasmic trehalase (TreA), in Shigella strains. The goal of this study was to determine the prevalence of two different trehalase genes (treF and treA) in 134 complete Shigella genomes, including lichen-associated S. sp. PAMC28760 isolated from the Antarctica region. Additionally, we would like to determine which trehalase genes (treF or treA) might contribute to virulence. It is thought that analysis of pathogenic and non-pathogenic trehalase might provide a new direction to understand bacterial pathogenic mechanism at the genetic level and to provide a new insight on drug development for the treatment of bacterial infections. The use of a bioinformatics tools such as MP3 can allow the study of virulence genes involved in respective strains without the need to perform hazardous laboratory experiments.

Materials and methods

Data sources

The complete genome and amino acid sequences of Shigella species were obtained from the NCBI database (https://www.ncbi.nlm.nih.gov/) [27]. A total of 134 Shigella strains deposited in NCBI by September 2021 were analysed, including our Antarctica isolate S. sp. PAMC28760, whose genome size was 4,558,287 bp [28].

Phylogenetic tree construction and average nucleotide identity (ANI) analysis

To compare 16S rRNA sequences of S. sp. PAMC28760 with those in other complete genomes of Shigella strains (133 strains), phylogenetic analysis was performed using the ClustalW alignment tool and the Molecular Evolutionary Genetic Analysis (MEGA X) (https://www.megasoftware.net/) tools [29]. MEGAX was used to create the phylogenetic tree, which was built on a neighbour-joining tree method [30] and 1,000 bootstrap replications [31]. The online software Interactive Tree life (iTOL) v6 (https://itol.embl.de/) was used to analyse phylogenetic trees [32]. Orthologous Average Nucleotide Identity Software Tool (OAT) [33] was used to determine the average nucleotide identity (ANI) of 16S rRNA from closely related species acquired from EziBio Cloud (www.ezibiocloud.net) [34]. To determine if the strain PAMC28760 belonged to Shigella or Escherichia, EziBio cloud 16S rRNA sequencing was used. Secondary data was used to identify the cytoplasmic trehalase or periplasmic trehalase from the characterized strains E. coli strain K-12 substrain MG1655 (NC 000913.3) as a reference for the construction of a phylogenetic tree for trehalase genes (treA and treF) in those studied strains who possess both trehalase genes. NCBI, RAST, and Prokka were used to find the cytoplasmic and periplasmic genes. MUSCLE [35,36] was used to align amino acid sequences, and maximum-likelihood and neighbour-joining methods were used to build a phylogenetic tree.

Comparative genomic analysis and, prediction of periplasmic trehalase and cytoplasmic trehalase

The prevalence of trehalase genes in the studied genome, as well as to predict pathogenic and non-pathogenic factors, were carried out using the MP3 (offline version) program (http://metagenomics.iiserb.ac.in/mp3/index.php) [37]. This program uses two modules including Support Vector Model (SVM) and Hidden Markov Model (HMM) to predict pathogenic and non-pathogenic proteins in the genome. Furthermore, Rapid Annotations utilizing Subsystems Technology (RAST, https://rast.nmpdr.org/rast.cgi) [38] and Prokka annotation (Prokka 1.14.6 offline version) [39] were used to locate predicted trehalase genes. CGView ServerBETA (www.cgview.ca) was used to better visualization of location predicted trehalase genes [40].

Results and discussion

Phylogenetic tree analysis of S. sp. PAMC28760

Phylogenomic analysis revealed that S. sp. PAMC28760 and S. dysenteriae ATCC12037 belonged to the same branch (Figure 1a). MEGA X program was used to construct phylogenetic tree to analyse their evolutionary history using the neighbour-joining method [41] with 1,000-replicate bootstrap. Furthermore, ANI value revealed that S. sp. PAMC28760 had a close relationship with strains S. flexneri ATCC29903(T) (99.80%), S. sonnei CECT4887(T) (99.70%), E. coli ATCC11775(T) and S. boydii GTC779(T) (99.19%), E. fergusonii ATCC35469(T) (99.70%), S. dysenteriae ATCC13313 (T) (98.99%), and E. albertii TW07627 (T) (98.89%) (Figure 1b). These results suggest that the S. sp. PAMC28760 strain is closely related to Escherichia strain as both belong to the same family Enterobacteriaceae.
Figure 1.

(a) Circular phylogenetic analysis of the complete genomes of Shigella: Phylogenetic tree showing the relationships of genomes of a total 134 Shigella strains including an Antarctica isolate Shigella sp. PAMC28760 (represented in red text), and their phylogenetic position. This analysis was prepared using MEGA X based on 16S rRNA sequences with neighbour-joining method with 1,000-replicate bootstrap. (b) Heatmap generated with OrthoANI values calculated using the OAT software to determine the close relationship of strain S. sp. PAMC28760 with S. flexneri ATCC29903(T), S. sonnei CECT4887(T), E. coli ATCC11775(T), S. boydii GTC779(T), E. fergusonii ATCC35469(T), S. dysenteriae ATCC13313(T), and E. albertii TW07627(T).

(a) Circular phylogenetic analysis of the complete genomes of Shigella: Phylogenetic tree showing the relationships of genomes of a total 134 Shigella strains including an Antarctica isolate Shigella sp. PAMC28760 (represented in red text), and their phylogenetic position. This analysis was prepared using MEGA X based on 16S rRNA sequences with neighbour-joining method with 1,000-replicate bootstrap. (b) Heatmap generated with OrthoANI values calculated using the OAT software to determine the close relationship of strain S. sp. PAMC28760 with S. flexneri ATCC29903(T), S. sonnei CECT4887(T), E. coli ATCC11775(T), S. boydii GTC779(T), E. fergusonii ATCC35469(T), S. dysenteriae ATCC13313(T), and E. albertii TW07627(T).

Trehalase gene and its phylogeny

When complete genomes of 134 Shigella strains including our strain PAMC28760 were studied, only 56 strains were found to have two types of trehalase (treF and treA) genes. Furthermore, we employed RAST annotation database and, Prokka annotation to differentiate cytoplasmic (treF) and periplasmic (treA) trehalase. In addition, the CGview online server (Figure 2) visualize the predicted trehalase genes in S. sp. PAMC28760. When we aligned them with characterized trehalase genes (treF and treA) of E. coli K-12 substrain MG655, S. sp. PAMC28760 was found to also encode the same genes involved in trehalose degradation (Figure 3). While 48, 47, and 47 of S. flexneri’s strains had treF, treA, and both treF and treA genes, respectively, 39, 2, and 2 of S. sonnei’s strains had treF, treA, and both treF and treA genes, respectively. In addition, of a total of 20 S. boydii strains, 18, 5, and 3 strains had treF, treA, and both treF and treA genes, respectively. For a total of 25 S. dysenteriae strains, 12,12, and 3 strains had treF, treA, and both treF and treA genes, respectively (Figure 4). Results showed that S. sp. PAMC28760 had both trehalase genes treF (cytoplasmic trehalase) and treA (periplasmic trehalase).
Figure 2.

Circular genome comparison using CGView ServerBETA (http://cgview.Ca/) tool for the representation of genome and features of the S. sp. PAMC28760. The contents of the featured rings (starting with the outermost ring to the centre) are as follows. Ring 1, combined ORFs in forward and reverse strands; Ring 2, trehalose degradative genes, combined forward and reverse strand, and CDS (including tRNA and rRNA) in forward and reverse strands; Ring 3, GC skew plot, values above average are depicted in green, and below average in purple; Ring 4, GC content plot; and Ring 5, Sequence ruler.

Figure 3.

Cytoplasmic trehalase (TreF) amino acid sequence alignment with a characterized trehalase (TreF). TreF (GH37) from E. coli K-12 substr. MG1655, trehalase from S. flexneri C32, trehalase from Shigella sp. PAMC28760, and trehalase from S. boydii ATCC49812. The signature motif 1 and signature motif 2 represent two highly conserved sequence segments that belong to the GH37 family. The “#” symbol denotes the catalytic sites of Asp312 and Glu496. the three black boxes represent conserved regions (CR3–CR5).

Figure 4.

Venn diagram categorizes trehalase genes involved in the complete genomes of four Shigella species along with uncategorized Shigella sp. PAMC28760. Green circle represents the cytoplasmic trehalase (treF), whereas red circle represents the periplasmic trehalase (treA). The number outside the circles represents the absence of both trehalase genes.

Circular genome comparison using CGView ServerBETA (http://cgview.Ca/) tool for the representation of genome and features of the S. sp. PAMC28760. The contents of the featured rings (starting with the outermost ring to the centre) are as follows. Ring 1, combined ORFs in forward and reverse strands; Ring 2, trehalose degradative genes, combined forward and reverse strand, and CDS (including tRNA and rRNA) in forward and reverse strands; Ring 3, GC skew plot, values above average are depicted in green, and below average in purple; Ring 4, GC content plot; and Ring 5, Sequence ruler. Cytoplasmic trehalase (TreF) amino acid sequence alignment with a characterized trehalase (TreF). TreF (GH37) from E. coli K-12 substr. MG1655, trehalase from S. flexneri C32, trehalase from Shigella sp. PAMC28760, and trehalase from S. boydii ATCC49812. The signature motif 1 and signature motif 2 represent two highly conserved sequence segments that belong to the GH37 family. The “#” symbol denotes the catalytic sites of Asp312 and Glu496. the three black boxes represent conserved regions (CR3–CR5). Venn diagram categorizes trehalase genes involved in the complete genomes of four Shigella species along with uncategorized Shigella sp. PAMC28760. Green circle represents the cytoplasmic trehalase (treF), whereas red circle represents the periplasmic trehalase (treA). The number outside the circles represents the absence of both trehalase genes. Phylogenetic tree analysis of trehalase genes (treF and treA) with a characterized E. coli K-12 substrain MG 1655 revealed that treA of S. sp. PAMC28760 and E. coli K-12 substrain MG1655 shared the same clade with 100% sequence identity, whereas S. sp. PAMC28760 did not share the same clade as E. coli K-12 substrain MG1655, although both shared 99.82% sequence identity (Figure 5). This shows that trehalase genes (treA and treF) of S. sp. PAMC28760 could be distinctly divided into two major clades. It was found that treA and treF genes from studied genome clustered together more closely with both genes of S. flexneri. The treA gene is clustered with S. flexneri FDAARGOS-74 and S. flexneri WW1 whereas treF is clustered with S. flexneri 2016AM–0877 and S. flexneri 74–1170.
Figure 5.

Circular phylogenetic tree based on trehalase genes (treF/treA) sequence in the complete genomes of Shigella strains with reference to the characterized trehalase of E. coli strain K-12 substrain MG165 using a neighbour-joining tree method with 1,000-replicate bootstrap. The pink highlighted boxes represent the characterized trehalase genes (treF and treA), whereas the red text indicates the strain (Shigella sp. PAMC28760) under study.

Circular phylogenetic tree based on trehalase genes (treF/treA) sequence in the complete genomes of Shigella strains with reference to the characterized trehalase of E. coli strain K-12 substrain MG165 using a neighbour-joining tree method with 1,000-replicate bootstrap. The pink highlighted boxes represent the characterized trehalase genes (treF and treA), whereas the red text indicates the strain (Shigella sp. PAMC28760) under study. These results suggest that S. sp. PAMC28760 might have a trehalose degradation pathway like that of E. coli. Also, it has been reported that TreA in E. coli is a trehalase found in the periplasmic area of cells that hydrolyzes trehalose glucose under high osmolarity, whereas TreF is a cytoplasmic isoform of TreA trehalase that plays important role in trehalose breakdown produced within bacterial cells under high osmolarity conditions [42,43]. Similarly, in the case of cytoplasmic trehalase (TreF), it becomes active during the transition between high and low osmolarity. TreF can deplete the internal trehalose pool as the cell metabolism shifts to a low osmolarity state. TreF has a low enzymatic activity that is low enough not to interfere with trehalose production under high osmolarity, but high enough to degrade the accumulated trehalose once the environment returns to normal [19].

Trehalose degradative pathway

Six routes of trehalose degradation pathways (trehalose degradation I, II, III, IV, V, and VI) have been found in organisms depending on their subcellular locations. These pathways have been reported in the MetaCyc pathway database [44]. They are summarized in (Figure 6). Depending on the organism, trehalose might enter cells via a permease where it remains unmodified, or it gets transformed to phosphorylated trehalose 6-phosphate forms via a phosphotransferase system (PTS). Trehalose that cannot be modified might get degraded by a hydrolysing trehalase (EC 3.2.1.28) or might be split by trehalose phosphorylase (EC 2.4.1.64, and EC 2.4.1.231) (Figure 7). It was revealed that our Antarctica isolate S. sp. PAMC28760 had the trehalase gene based on the prediction of trehalose degradative pathway. The result is summarized in Figures 2 and 6. Trehalose is broken down into two molecules of glucose and water by the trehalase enzyme that utilizes glucose as a carbon source. Trehalase is classified into glucoside hydrolase (GH) families such as GH37, GH65, and GH15 in the CAZy (Carbohydrate-Active Enzyme) database (http://www.cazy.org/) [45]. The GH37 family possesses only trehalase enzymes, whereas GH65 and GH15 families possess other enzymes along with trehalase enzymes. In 2007, it was reported that Mycobacterium smegmatis and Mycobacterium tuberculosis possessed trehalase that belonged to the GH15 family [46].
Figure 6.

Trehalose degradative pathways. Six different trehalose degradative pathways are found in organisms (bacteria, fungi, yeast, Arthropoda, and plants). Among them, only two degradation pathways (Trehalose degradation pathway II (cytosolic) and VI (periplasmic)) are found in Shigella species.

Figure 7.

Schematic diagram of the trehalose metabolism pathway in Gram-negative bacteria is formulated from Kosciow et al., 2014 and Purvis et al., 2005. The green boxes represent the trehalose synthesis genes (otsA, trehalose-6-phosphate phosphatase; otsB, trehalose-6-phosphate synthase; and treC, trehalose-6-phosphate hydrolase), whereas grey boxes represent the trehalose degrading genes (treA, periplasmic trehalase; and treF, cytoplasmic trehalase). At cytoplasm, trehalose is degraded by cytoplasmic trehalase gene (treF). The plasma membrane, stretch-activated proteins (SAP) facilitate the exit of trehalose under hypotonic conditions to the periplasm where it further degraded by periplasmic trehalase gene (treA).

Trehalose degradative pathways. Six different trehalose degradative pathways are found in organisms (bacteria, fungi, yeast, Arthropoda, and plants). Among them, only two degradation pathways (Trehalose degradation pathway II (cytosolic) and VI (periplasmic)) are found in Shigella species. Schematic diagram of the trehalose metabolism pathway in Gram-negative bacteria is formulated from Kosciow et al., 2014 and Purvis et al., 2005. The green boxes represent the trehalose synthesis genes (otsA, trehalose-6-phosphate phosphatase; otsB, trehalose-6-phosphate synthase; and treC, trehalose-6-phosphate hydrolase), whereas grey boxes represent the trehalose degrading genes (treA, periplasmic trehalase; and treF, cytoplasmic trehalase). At cytoplasm, trehalose is degraded by cytoplasmic trehalase gene (treF). The plasma membrane, stretch-activated proteins (SAP) facilitate the exit of trehalose under hypotonic conditions to the periplasm where it further degraded by periplasmic trehalase gene (treA). Trehalase belonging to the GH37 family can hydrolyse a molecule of ∝,∝-trehalose into two molecules of glucose by inverting the anomeric orientation. Trehalase belonging to the GH37 family have been found in different species, including bacteria, fungi, yeasts, plants, insects, and vertebrates [22]. GH family has been divided into “clans” in the CAZy database, where enzymes are regarded to have a common evolutionary origin. Clan GH-G was ascribed to GH37 enzymes, while clan GH-L was ascribed to GH65 and GH15 enzymes. Although clans GH-G and GH-L share only a low amount of sequence homology, such finding is significant. GH37 trehalase has two catalytic residues, Asp and Glu, in their CDs (catalytic domains). Asp and Glu residues tend to be involved in the function of GH65 and GH15 trehalases. These amino acid residues are most likely to be involved in a common inverting mechanism during catalysis [47]. Structures of these trehalases are comprised of conserved regions (CRs), which include catalytic residues. These CRs can form active sites that usually have loops. CDs of GH enzymes contain well-known trehalase signature motifs, motif 1 (PGGRFXEXY[G/Y] D[S/T] Y] and motif 2 (QWD[Y/F]PN/Y) [G/A] W[P/A] P), whereas GH65 and GH15 trehalases do not [48,49]. Our Antarctica isolate S. sp. PAMC28760 possesses GH37 trehalase with two signature motifs (motifs 1 and 2) as well as highly conserved regions (CR3-CR5), which have also been found in E. coli. Further study confirms that S. sp. PAMC28760 possesses trehalase enzyme, a member of the GH37 CAZyme family (Figure 3). The Gram-positive bacteria like Bacillus subtilis (non-pathogenic) and Clostridioidess difficile (pathogenic) share a pathway in which exogenous trehalose can be imported by a PTS to produce glucose and glucose-6-phosphate via the phosphotreahalose TerA (analogous to the PTS-TreC system in pathogenic E. coli). Due to the acquisition of an additional cluster of trehalose metabolism genes, namely a second PTS that mediates high-efficiency trehalose uptake from the environment, epidemic C. difficile strains can also grow on low trehalose. By increasing toxin levels, both modified trehalose utilization systems contributed to the growth and toxicity of these epidemic C. difficile strains [49]. There have been no previous papers on the function of the trehalose degradation pathway in virulence in Antarctic isolates till date. However, in Variovorax sp. PAMC28711 [50], the presence of trehalose metabolic pathway was mentioned.

Prediction of pathogenic and non-pathogenic proteins

MP3 (standalone program) can predict the presence of pathogenic and non-pathogenic proteins in a complete genome of a microbe based on two models, SVM and HMM, and their hybrids (integrated SVM and HMM models). To predict pathogenic and non-pathogenic trehalase, we retrieved complete genomes of 134 Shigella species (strains) from the NCBI database along with our S. sp. PAMC28760 isolates from Antarctica. Our strain S. sp. PAMC28760 showed pathogenic proteins of 1,136 (based on SVM model) out of 4329 total proteins (Table 1), with periplasmic trehalase as a pathogenic trehalase (data not shown). MP3 tool can be used to compare numbers of pathogenic proteins in healthy and infected samples by precisely identifying pathogenic protein fragments (based on amino acid composition and dipeptide composition) commonly found in metagenomic data without needing a time-consuming homology-based alignment [37]. In comparison with other publicly available bioinformatic tools, this program can predict pathogenic proteins with improved accuracy (95.06%), sensitivity (85.59%), and specificity (96.64%) as it employs both SVM and HMM models. Also, it is essential to analyse complete genome sequences of pathogenic and non-pathogenic bacteria of closely related species to determine if any significant genomic changes have occurred. It has been proposed that both pathogenic and non-pathogenic strains have virulence factors/genes. They can be distinguished based on gene content. When other genes suppress the virulence factors/genes, the bacterium becomes non-pathogenic. However, when suppressing genes are lost, a commensal can become pathogenic [51].
Table 1.

MP3 prediction of the total proteins, pathogenic protein, and non-pathogenic proteins in all the complete genomes of Shigella strains including Shigella sp. PAMC28760, which is indicated as a asterisk symbol. Hybrid: predictions from both HMM and SVM models.

StrainTotal proteinsHMMHybridSVMStrainTotal proteinsHMMHybridSVMStrainTotal proteinsHMMHybridSVM
Shigella flexneri
 
 
 
 
Shigell
 
 
 
 
Shigella boydii
 
 
 
 
S. flexneri C32474636711261259S. sonnei 2015C_3566429527910021125S. boydii 54_16213409181690803
S. flexneri 1a 2283973254843955S. sonnei 2015AM-109943182819921115S. boydii 59_2483958254855967
S. flexneri 1a 43940862628731001S. sonnei AR_042641202658871004S. boydii 83_5783725219777896
S.flexneri 1a 6704067276880997S. sonnei ATCC 29,93041402739291041S. boydii ATCC 87003436204725818
S. flexneri 2a 9814056257873993S. sonnei FC 14283930252879998S. boydii ATCC 92103807231801914
S. flexneri 2a 2457T3827236805923S. sonnei FDAARGOS 71541492749311061S. boydii ATTC 35,96440702488871004
S. flexneri 2a AUSMDU0001053540432698921019S. sonnei KCCM4128240412698921006S. boydii ATCC 49,81243472859711090
S.flexneri 2a str 3014313260835959S.sonnei 86640862749191046S. boydii ATCCBAA_12473723228783905
S. flexneri 4c 7023996250853964S. sonnei 53 G464831311191239S. boydii CDC 3083_943909252854970
S. flexneri 5a M90T3972260863984S. sonnei 75_02458331911061231S. boydii KCCM 41,6903650212749867
S. flexneri 64-55003981250870981S. sonnei FDAARGOS_52441148998991023S. boydii NCTC 97333611240793885
S. flexneri 74_117040992618871015S. sonnei Ss04640562829031026S. boydii NCTC 98503749224792909
S. flexneri 2016AM_08774062269875994S. sonnei FORC_011449930610871218S. boydii Sb 2273819227805924
S. flexneri 61_49823933240811931S. sonnei 2015C_379442182729871111S. boydii 59_27083753236780894
S. flexneri 2,002,0174045263879998S. sonnei CFSAN030807431628810161142S. boydii NCTC93533318177672778
S. flexneri 2,003,0363770235235907S. sonnei FC16533930256865986S. boydii 600,6573702240888777
S. flexneri AR_04244037262880996S. sonnei LC1477_1840482689081034S. boydii 600,0803784234928807
S. flexneri AR04233980251848960S. sonnei AUSMDU0000833341842729381059S. boydii 600,6904023267965807
S. flexneri FC9063882239822950S. sonnei AR_003043192779561080S. boydii 602,0683777245796903
S. flexneri G16633976261261971S. sonnei 2015C_38073857274840950S. boydii FDAARGOS_11393641221748855
S. flexneri shi06HN0063795237804916S. sonnei AUSMDU0001053441652809211045     
S. flexneri Y 93-306341002759111027S. sonnei FDAARGOS_9041491829311061     
S. flexneri Y PE5773807239802915S. sonnei 19.0821.34841962608831006     
S. flexneri FDAARGOS_743925262847967S. sonnei 19.1125.34934097260862983StrainTotal proteinsHMMHybridSVM
S. flexneri 1c Y3943922258834951S. sonnei 50645052959821099S.sp. PAMC 28,760*432930310061136
S. flexneri AR_04253937259848961S. sonnei 1205.313142012678871013     
S. flexneri 7b 94_300741172739001021S. sonnei 620742602699091021     
S. flexneri NCTC 97283886245817939S. sonnei 66074112262876993     
S. flexneri 98_31933665216765880S. sonnei 6904.274022260859974     
S. flexneri AUSMDU000083553905246830953S. sonnei 7111.694168262873999     
S. flexneri 89_1413880252835947S. sonnei 3,123,8853916251832947     
S. flexneri 4c 1205467629510221160S. sonnei 9,163,63341652618721003     
S. flexneri 04-31453785237784899S. sonnei 401,930,1054044257861977     
S. flexneri NCTC13769234784898S. sonnei L409441272668861005     
S. flexneri SFL15203833236809915S. sonnei SE6-142623259601078     
S. flexneri 5str 84013838244807919S. sonnei UKMCC-10154146268874986     
S.flexneri 2a ATCC 29,90341172538951014S. sonnei 401,952,0274141259867990     
S. flexneri 4c 160241692769241045S. sonnei LC1477/1841412599081034     
S. flexneri FDAARGOS_53540592708921012S. sonnei 893,9163864241810928     
S. flexneri AUSMDU0000833241162699081033          
S. flexneri 3a 888,0483611227746860          
S. flexneri 2013C_374940242618781001          
S. flexneri 5908_23777241802922          
S. flexneri FDAARGOS_6913730230783904          
S. flexneri M290140922618831006          
S. flexneri AUSMDU000021847553334812761442          
S. flexneri AUSMDU00022017549434312831434          
S. flexneri WW1441334810201140          
S. flexneri 83523033112251370          
S. dysenteriae ATCC97533944243854971          
S. dysenteriae ATCC97542959164601679          
S. dysenteriae ATCC120373831249830946          
S. dysenteriae ATCC120393942252820941          
S. dysenteriae ATCC493463689229786898          
S. dysenteriae ATCC493473868241823941          
S. dysenteriae BU53M13697229760869          
S. dysenteriae CFSAN0109542688150554625          
S. dysenteriae CFSAN01095640192748901014          
S. dysenteriae CFSAN0297863917262829946          
S. dysenteriae 07_33083274175614721          
S. dysenteriae 08_33803518213700812          
S. dysenteriae 53_39373310179628736          
S. dysenteriae 69_38183462207688790          
S. dysenteriae 16173140176625756          
S. dysenteriae 2017C_45223621216759866          
S. dysenteriae ATCC97523087178619713          
S. dysenteriae ATCC133133474208686792          
S. dysenteriae E670_7443642719641105          
S. dysenteriae NCTC97183356199651756          
S. dysenteriae Sd1974294222695804          
S. dysenteriae 80_5473462207688790          
S. dysenteriae ATCC97513062182642734          
S. dysenteriae 79_80063698228760874          
S. dysenteriae HNCMB200803296179610719          
MP3 prediction of the total proteins, pathogenic protein, and non-pathogenic proteins in all the complete genomes of Shigella strains including Shigella sp. PAMC28760, which is indicated as a asterisk symbol. Hybrid: predictions from both HMM and SVM models. In addition, the detection of transposon mutants in extraintestinal pathogenic E. coli (ExPEC) that are defective in binding to non-phagocytic cells is an unexpected finding on the probable role of periplasmic trehalase (treA) in virulence [24]. Furthermore, while trehalase enzymes are known to have a role in virulence of some fungal species, the occurrence of multiple enzymes can inhibit their potential as an antifungal drug target. Because the trehalose pathway and its enzymes are not found in mammals (including humans), fungi-specific inhibitors of the trehalose pathway and their enzymes should be generally non-toxic to mammals [52,53]. Likewise, a previous study has reported that inactivating trehalose biosynthesis pathways does not reduce resistance to oxidative stress in many bacteria, but a periplasmic trehalase gene (treA) mutant in Burkholderia pseudomallei shows increased sensitivity to oxidative stress despite elevated trehalose levels in the mutant, which is expected to protect against this stress [25]. Another study also reported that validmycin A was ineffective against Clostridioides difficile TreA, whereas trehalose derivatives such as epimers containing hydroxyl groups (2- and 4-positions), and thiotrehalose derivatives showed promise as TreA inhibitors with a larger spectrum. The efficacy of these drugs in treating specific bacterial infections is currently being studied [54]. It has also been reported that the PTS route for trehalose uptake (trehalose degradation I, low osmolarity) is inhibited when the osmolarity is high. Thus, trehalase (TreA) in the periplasm can allow cells to utilize trehalose at a high osmolarity by breaking it down into glucose molecules, which can be subsequently transported by phosphotransferase mediated system [55]. Genome of Shigella strains were analysed for pathogenic and non-pathogenic trehalase genes in this study for the first time. It is assumed that studying trehalase in one pathogenic bacterium like Shigella species could be important for further studies. Trehalase (TreA) from the pathogenic strain of extraintestinal E. coli known as MT78 has also been identified as a member of glycoside hydrolase 37 (GH37). Similarly, deletion of these genes in the meningoencephalitis-causing yeast Crytococcus neoformans resulted in severe defects in spore production, a decrease in spore germination, and an increase in the production of alternative development structures, which spores forms are plausible infectious particles [56]. Trehalose does not have to solely play a role in osmoregulation. According to Lee et al., it has stated that if glucose is present in the cytoplasm, molecules like trehalose are produced at levels approaching 400 mM in the cytoplasm [57]. Glycine betaine and L-proline often accumulate in the cytoplasm (around 700 and 400 mM, respectively) and can replace trehalose [58]. Many species utilize these osmolytes, which appear to be well-adapted to cellular functions. The electro-neutral solutes trehalose, glycine betaine, and L-proline, as well as potassium glutamate, have various chemical characteristics that may suit their functions in cell survival during osmotic shock.

Conclusions

Although there are many studies on trehalase, it was not studied in Shigella species based on two different trehalase genes (treF and treA) and pathogenicity. Most Shigella species (S. flexneri, S. boydii, S. dysenteriae, and S. sonnei), as well as our strain S. sp. PAMC28760, have cytoplasmic trehalase, and all periplasmic trehalase predicted in the studied strains showed up as pathogenic proteins using MP3, RAST, and Prokka tools. Notably, treF was detected in all strains of S. sonnei, but treA was identified in only two strains. This sort of research on pathogenic and non-pathogenic trehalase could help researchers to elucidate how and why Shigella species have certain traits. Furthermore, before performing any kinds of wet lab work, these bioinformatics tools are important in determining the nature of proteins present in a complete genome of bacteria.
  53 in total

1.  Osmo-regulation of bacterial transcription via poised RNA polymerase.

Authors:  Shun Jin Lee; Jay D Gralla
Journal:  Mol Cell       Date:  2004-04-23       Impact factor: 17.970

2.  Degradation-resistant trehalose analogues block utilization of trehalose by hypervirulent Clostridioides difficile.

Authors:  Noah D Danielson; James Collins; Alicyn I Stothard; Qing Qing Dong; Karishma Kalera; Peter J Woodruff; Brian J DeBosch; Robert A Britton; Benjamin M Swarts
Journal:  Chem Commun (Camb)       Date:  2019-04-23       Impact factor: 6.222

3.  Trehalase of Escherichia coli. Mapping and cloning of its structural gene and identification of the enzyme as a periplasmic protein induced under high osmolarity growth conditions.

Authors:  W Boos; U Ehmann; E Bremer; A Middendorf; P Postma
Journal:  J Biol Chem       Date:  1987-09-25       Impact factor: 5.157

Review 4.  Clinical trials of Shigella vaccines: two steps forward and one step back on a long, hard road.

Authors:  Myron M Levine; Karen L Kotloff; Eileen M Barry; Marcela F Pasetti; Marcelo B Sztein
Journal:  Nat Rev Microbiol       Date:  2007-07       Impact factor: 60.633

5.  Complete genome sequencing of Shigella sp. PAMC 28760: Identification of CAZyme genes and analysis of their potential role in glycogen metabolism for cold survival adaptation.

Authors:  So-Ra Han; Do Wan Kim; Byeollee Kim; Young Min Chi; Seunghyun Kang; Hyun Park; Sang-Hee Jung; Jun Hyuck Lee; Tae-Jin Oh
Journal:  Microb Pathog       Date:  2019-09-24       Impact factor: 3.738

6.  A novel trehalase from Mycobacterium smegmatis - purification, properties, requirements.

Authors:  J David Carroll; Irena Pastuszak; Vineetha K Edavana; Yuan T Pan; Alan D Elbein
Journal:  FEBS J       Date:  2007-02-23       Impact factor: 5.542

7.  Trehalase plays a role in macrophage colonization and virulence of Burkholderia pseudomallei in insect and mammalian hosts.

Authors:  Muthita Vanaporn; Mitali Sarkar-Tyson; Andrea Kovacs-Simon; Philip M Ireland; Pornpan Pumirat; Sunee Korbsrisate; Richard W Titball; Aaron Butt
Journal:  Virulence       Date:  2016-07-01       Impact factor: 5.882

8.  Osmoregulation in Escherichia coli by accumulation of organic osmolytes: betaines, glutamic acid, and trehalose.

Authors:  P I Larsen; L K Sydnes; B Landfald; A R Strøm
Journal:  Arch Microbiol       Date:  1987-02       Impact factor: 2.552

9.  Cleavage of trehalose-phosphate in Bacillus subtilis is catalysed by a phospho-alpha-(1-1)-glucosidase encoded by the treA gene.

Authors:  C Helfert; S Gotsche; M K Dahl
Journal:  Mol Microbiol       Date:  1995-04       Impact factor: 3.501

10.  The carbohydrate-active enzymes database (CAZy) in 2013.

Authors:  Vincent Lombard; Hemalatha Golaconda Ramulu; Elodie Drula; Pedro M Coutinho; Bernard Henrissat
Journal:  Nucleic Acids Res       Date:  2013-11-21       Impact factor: 16.971

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.