The complete genome sequence of Clostridium indolis DSM 755(T).

Amy S Biddle1, Susan Leschine, Marcel Huntemann2, James Han2, Amy Chen2, Nikos Kyrpides2, Victor Markowitz2, Krishna Palaniappan2, Natalia Ivanova2, Natalia Mikhailova2, Galina Ovchinnikova2, Andrew Schaumberg2, Amrita Pati2, Dimitrios Stamatis2, Tatiparthi Reddy2, Elizabeth Lobos2, Lynne Goodwin2, Henrik P Nordberg2, Michael N Cantor2, Susan X Hua2, Tanja Woyke2, Jeffrey L Blanchard3.   

Abstract

Clostridium indolis DSM 755(T) is a bacterium commonly found in soils and the feces of birds and mammals. Despite its prevalence, little is known about the ecology or physiology of this species. However, close relatives, C. saccharolyticum and C. hathewayi, have demonstrated interesting metabolic potentials related to plant degradation and human health. The genome of C. indolis DSM 755(T) reveals an abundance of genes in functional groups associated with the transport and utilization of carbohydrates, as well as citrate, lactate, and aromatics. Ecologically relevant gene clusters related to nitrogen fixation and a unique type of bacterial microcompartment, the CoAT BMC, are also detected. Our genome analysis suggests hypotheses to be tested in future culture-based work to better understand the physiology of this poorly described species.

Keywords:  Clostridium indolis; aromatic degradation; bacterial microcompartments; citrate; lactate; nitrogen fixation

Year:  2014        PMID: 25197485      PMCID: PMC4149025          DOI: 10.4056/sigs.5281010

Source DB:  PubMed          Journal:  Stand Genomic Sci        ISSN: 1944-3277


Introduction

The C. saccharolyticum species group is a poorly described and taxonomically confusing clade in the Lachnospiraceae, a family within the Clostridiales that includes members of clostridial cluster XIVa [1]. This group includes , , , , and , none of which are well studied (Figure 1). C. saccharolyticum has gained attention because its saccharolytic capacity was shown to be syntrophic with the cellulolytic activity of in co-culture, enabling the conversion of cellulose to ethanol in a single step [6,7]. Members of this group, such as are themselves cellulolytic [8], and others are known to degrade unusual substrates such as methylated aromatic compounds () [9], and the insecticide lindane () [10]. C. indolis was targeted for whole genome sequencing to provide insight into the genetic potential of this taxon that could then direct experimental efforts to understand its physiology and ecology.
Figure 1

Phylogenetic tree based on 16S rRNA gene sequences highlighting the position of C. indolis relative to other type strains (T) within the Lachnospiraceae. The strains and their corresponding NCBI accession numbers (and, when applicable, draft sequence coordinates) for 16S rRNA genes are: strain DSM 4024T, Y11568; ATCC 19403T, AB075772; DSM 5628T, X71848; C. indolis DSM 755T, pending release by JGI: 1620643-1622056; SR3, AF067965; WM1T, NC_014376:18567-20085; SPL73T, AF092549; DSM 13479T, ADLN00000000: 202-1639; L34420 T, L34420; ATCC 29149T, X94967; R. torques ATCC 27756T, L76604; L34627T; L1-82T, AJ312385; A2-183T, AJ270482; HY-35-12T, AY494606; HESP1T, AF116920; ISDgT, CP000885: 15754-17276. The tree uses sequences aligned by MUSCLE, and was inferred using the Neighbor-Joining method [2]. The optimal tree with the sum of branch lengths = 0.50791241 is shown. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (500 replicates) is shown next to the branches [3]. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Maximum Composite Likelihood method [4] and are in the units of the number of base substitutions per site. Evolutionary analyses were conducted in MEGA 5 [5]. ATCC 35414T, CP003992: 856992-858513 was used as an outgroup.

Classification and features

The general features of C. indolis DSM 755T are listed in Table 1. C. indolis DSM 755T was originally named for its ability to hydrolyze tryptophan to indole, pyruvate, and ammonia [23] in the classic indole test used to distinguish bacterial species. It has been isolated from soil [24], feces [25], and clinical samples from infections [27]. Despite its prevalence, C. indolis is not well characterized, and there are conflicting reports about its physiology. It is described as a sulfate reducer with the ability to ferment some simple sugars, pectin, pectate, mannitol, and galacturonate, and to convert pyruvate to acetate, formate, ethanol, and butyrate [28]. According to this source, neither lactate nor citrate is utilized; however, other studies demonstrate that fecal isolates closely related to C. indolis may utilize lactate [29], and that the type strain DSM 755T utilizes citrate [30]. It is unclear whether C. indolis is able to make use of a wider range of sugars or break down complex carbohydrates; however, growth is reported to be stimulated by fermentable carbohydrates [28].
Table 1

Classification and general features of C. indolis DSM 755T

MIGS ID     Property    Term    Evidence code
     Current classification    Domain Bacteria    TAS [11]
         Phylum Firmicutes    TAS [12-14]
         Class Clostridia    TAS [15,16]
         Order Clostridiales    TAS [17,18]
         Family Lachnospiraceae    TAS [15,19]
         Genus Clostridium    TAS [17,20,21]
         Species Clostridium indolis    TAS [17,22]
         Type strain DSM 755
     Gram stain    Negative    TAS [23,24]
     Cell shape    Rod    TAS [23,24]
     Motility    Motile    TAS [23,24]
     Sporulation    Terminal, spherical spores    TAS [23,24]
     Temperature range    Mesophilic    TAS [23,24]
     Optimum temperature    37°C    TAS [23,24]
     Carbon sources    Glucose, lactose, sucrose, mannitol, pectin, pyruvate, others    TAS [23,24]
     Terminal electron receptor    Sulfate    TAS [23,24]
     Indole test    Positive    TAS [23,24]
MIGS-6     Habitat    Isolated from soil, feces, wounds    TAS [24,25]
MIGS-6.3     Salinity    Inhibited by 6.5% NaCl    TAS [23,24]
MIGS-22     Oxygen    Anaerobic    TAS [23,24]
MIGS-15     Biotic relationship    Free living and host associated    TAS [24,25],9
MIGS-14     Pathogenicity    No    NAS
MIGS-4     Geographic location    Soil, feces    TAS [24,25],9

Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [26].

Genome sequencing information

Genome project history

The genome was selected based on the relatedness of C. indolis DSM 755T to C. saccharolyticum, an organism with interesting saccharolytic and syntrophic properties. The genome sequence was completed on May 2, 2013, and presented for public access on June 3, 2013. Quality assurance and annotation were done by the DOE Joint Genome Institute (JGI) as described below. Table 2 presents a summary of the project information and its association with MIGS version 2.0 compliance [31].
Table 2

Project information

MIGS ID    Property    Term
MIGS-31    Finishing quality    Improved Draft
MIGS-28    Libraries used    Shotgun and long insert mate pair (Illumina), SMRTbell™ (PacBio)
MIGS-29    Sequencing platforms    Illumina and PacBio
MIGS-31.2    Fold coverage    759.7× (Illumina), 51.6× (PacBio)
MIGS-30    Assemblers    Velvet, AllpathsLG
MIGS-32    Gene calling method    Prodigal, GenePRIMP
    Genome Database release    June 3, 2013 (IMG)
    Genbank ID    Pending release by JGI
    Genbank Date of Release    Pending release by JGI
    GOLD ID    Gi22434
    Project relevance    Anaerobic plant degradation

Growth conditions and DNA isolation

C. indolis DSM 755T was cultivated anaerobically on GS2 medium as described elsewhere [32]. DNA for sequencing was extracted using the DNA Isolation Bacterial Protocol available through the JGI (http://www.jgi.doe.gov). The quality of the extracted DNA was assessed by gel electrophoresis and NanoDrop (ThermoScientific, Wilmington, DE) according to the JGI recommendations, and the quantity was measured using the Quant-iT™ Picogreen assay kit (Invitrogen, Carlsbad, CA) as directed.

Genome sequencing and assembly

The draft genome of C. indolis was generated at the DOE Joint Genome Institute (JGI) using a hybrid of the Illumina and Pacific Biosciences (PacBio) technologies. An Illumina standard shotgun library and a long insert mate pair library were constructed and sequenced using the Illumina HiSeq 2000 platform [33]. 16,165,490 reads totaling 2,424.8 Mb were generated from the standard shotgun library and 26,787,478 reads totaling 2,437.7 Mb were generated from the long insert mate pair library. A PacBio SMRTbell™ library was constructed and sequenced on the PacBio RS platform. 99,448 raw PacBio reads yielded 118,743 adapter-trimmed and quality-filtered subreads totaling 330.2 Mb. All general aspects of library construction and sequencing performed at the JGI can be found at http://www.jgi.doe.gov. All raw Illumina sequence data were passed through DUK, a filtering program developed at JGI, which removes known Illumina sequencing and library preparation artifacts [34]. Filtered Illumina and PacBio reads were assembled using AllpathsLG (PrepareAllpathsInputs: PHRED_64=1 PLOIDY=1 FRAG_COVERAGE=50 JUMP_COVERAGE=25; RunAllpathsLG: THREADS=8 RUN=std_pairs TARGETS=standard VAPI_WARN_ONLY=True OVERWRITE=True) [35]. The final draft assembly contained 1 contig in 1 scaffold. The total size of the genome is 6.4 Mb. The final assembly is based on 2,424.6 Mb of Illumina Std PE, 2,437.6 Mb of Illumina CLIP PE and 330.2 Mb of PacBio post-filtered data, which provide an average 759.7× Illumina coverage and 51.6× PacBio coverage of the genome, respectively.
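
The reported fold coverage can be reproduced from the post-filter data volumes given above. The following is a minimal sketch in Python, assuming that coverage is simply the total sequenced bases divided by the genome size, and that the rounded 6.4 Mb genome size served as the denominator (an assumption; the text does not state the formula used). All variable and function names are illustrative.

```python
# Post-filter data volumes (Mb) as reported in the text.
ILLUMINA_STD_PE_MB = 2424.6
ILLUMINA_CLIP_PE_MB = 2437.6
PACBIO_MB = 330.2
GENOME_SIZE_MB = 6.4  # rounded genome size; assumed denominator


def fold_coverage(data_mb: float, genome_mb: float) -> float:
    """Average fold coverage = sequenced bases / genome size."""
    return data_mb / genome_mb


illumina_cov = fold_coverage(ILLUMINA_STD_PE_MB + ILLUMINA_CLIP_PE_MB, GENOME_SIZE_MB)
pacbio_cov = fold_coverage(PACBIO_MB, GENOME_SIZE_MB)

print(f"Illumina coverage: {illumina_cov:.1f}x")
print(f"PacBio coverage:   {pacbio_cov:.1f}x")
```

Under these assumptions the two values round to the 759.7× and 51.6× figures quoted in the text and in Table 2.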

Genome annotation

Genes were identified using Prodigal [36], followed by a round of manual curation using GenePRIMP [9] for finished genomes and draft genomes in fewer than 10 scaffolds. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGRFam, Pfam, KEGG, COG, and InterPro databases. The tRNAScanSE tool [37] was used to find tRNA genes, whereas ribosomal RNA genes were found by searches against models of the ribosomal RNA genes built from SILVA [38]. Other non-coding RNAs, such as the RNA components of the protein secretion complex and RNase P, were identified by searching the genome for the corresponding Rfam profiles using INFERNAL [39]. Additional gene prediction analysis and manual functional annotation were performed within the Integrated Microbial Genomes (IMG) platform [40] developed by the Joint Genome Institute, Walnut Creek, CA, USA [41]. Information in the tables below reflects the gene information in the JGI annotation on the IMG website [40].

Genome properties

The genome of C. indolis DSM 755 consists of a 6,383,701 bp circular chromosome with a GC content of 44.93% (Table 3). Of the 5,903 genes predicted, 5,802 were protein-coding genes and 101 were RNA genes; 170 pseudogenes were also identified. A putative function was assigned to 81.21% of the genes, with the remainder annotated as hypothetical proteins. The genome summary and distribution of genes into COG functional categories are listed in Tables 3 and 4.
Table 3

Nucleotide content and gene count levels of the genome of C. indolis DSM 755

Attribute     Value    % of total (a)
Genome size (bp)     6,383,701
DNA Coding region (bp)     5,688,007    89.10
DNA G+C content (bp)     2,868,247    44.93
Total genes (b)     5,903    100.00
RNA genes     101    1.71
Protein-coding genes     5,802    98.29
Protein-coding with function pred.     4,794    81.21
Genes in paralog clusters     4,527    76.69
Genes assigned to COGs     4,643    78.65
Genes with signal peptides     421    7.13
Genes with transmembrane helices     1,494    25.31
Paralogous groups     4,527    76.69

a) The total is based on either the size of the genome in base pairs or the total number of protein coding genes in the annotated genome.

b) Also includes 170 pseudogenes.
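
The percentages in Table 3 can be recomputed from the raw counts. The sketch below is illustrative only: it assumes the base-pair rows are divided by the genome size and the gene rows by the total gene count (5,903), which is the denominator that appears to reproduce the tabulated values. The dictionary layout and names are not from the original annotation pipeline.

```python
GENOME_SIZE_BP = 6_383_701
TOTAL_GENES = 5_903  # assumed denominator for the gene-level rows

# attribute -> (count, denominator, percentage reported in Table 3)
TABLE3_ROWS = {
    "DNA coding region (bp)": (5_688_007, GENOME_SIZE_BP, 89.10),
    "DNA G+C content (bp)": (2_868_247, GENOME_SIZE_BP, 44.93),
    "RNA genes": (101, TOTAL_GENES, 1.71),
    "Protein-coding genes": (5_802, TOTAL_GENES, 98.29),
    "Protein-coding with function pred.": (4_794, TOTAL_GENES, 81.21),
    "Genes in paralog clusters": (4_527, TOTAL_GENES, 76.69),
    "Genes assigned to COGs": (4_643, TOTAL_GENES, 78.65),
    "Genes with signal peptides": (421, TOTAL_GENES, 7.13),
    "Genes with transmembrane helices": (1_494, TOTAL_GENES, 25.31),
}


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Recompute every row and record the difference from the reported value.
deviations = {
    name: abs(percent(count, denom) - reported)
    for name, (count, denom, reported) in TABLE3_ROWS.items()
}

for name, (count, denom, reported) in TABLE3_ROWS.items():
    print(f"{name:36s} computed {percent(count, denom):6.2f}  reported {reported:6.2f}")
```

With these denominators every recomputed value agrees with the table to within rounding.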

Table 4

Number of genes in C. indolis DSM 755 associated with the 25 general COG functional categories

Code    Value    % age (a)     Description
J    184    3.57     Translation
A    0    0     RNA processing and modification
K    531    10.30     Transcription
L    191    3.71     Replication, recombination and repair
B    1    0.02     Chromatin structure and dynamics
D    28    0.54     Cell cycle control, mitosis and meiosis
Y    0    0     Nuclear structure
V    107    2.08     Defense mechanisms
T    335    6.50     Signal transduction mechanisms
M    235    4.56     Cell wall/membrane biogenesis
N    70    1.36     Cell motility
Z    0    0     Cytoskeleton
W    0    0     Extracellular structures
U    41    0.80     Intracellular trafficking and secretion
O    124    2.41     Posttranslational modification, protein turnover, chaperones
C    261    5.06     Energy production and conversion
G    910    17.65     Carbohydrate transport and metabolism
E    493    9.56     Amino acid transport and metabolism
F    110    2.13     Nucleotide transport and metabolism
H    153    2.97     Coenzyme transport and metabolism
I    77    1.49     Lipid transport and metabolism
P    325    6.30     Inorganic ion transport and metabolism
Q    70    1.36     Secondary metabolites biosynthesis, transport and catabolism
R    590    11.45     General function prediction only
S    319    6.19     Function unknown
-    1260    21.35     Not in COGs

a) The total is based on the total number of protein coding genes in the annotated genome.
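
As a consistency check on Table 4, the following Python sketch recomputes the category percentages. It is a hedged illustration, not the authors' procedure: the tabulated percentages for categories J through S appear to match a denominator equal to the sum of all COG category assignments, while the "Not in COGs" row appears to match the total gene count of 5,903. Both denominators are inferences from the numbers, and the names below are illustrative.

```python
# COG category -> number of genes, copied from Table 4.
COG_COUNTS = {
    "J": 184, "A": 0, "K": 531, "L": 191, "B": 1, "D": 28, "Y": 0,
    "V": 107, "T": 335, "M": 235, "N": 70, "Z": 0, "W": 0, "U": 41,
    "O": 124, "C": 261, "G": 910, "E": 493, "F": 110, "H": 153,
    "I": 77, "P": 325, "Q": 70, "R": 590, "S": 319,
}
NOT_IN_COGS = 1260
TOTAL_GENES = 5903

# Sum of category assignments (a gene may fall in more than one category).
total_assignments = sum(COG_COUNTS.values())

# Percentages relative to the sum of assignments (assumed denominator).
cog_percent = {
    code: 100.0 * count / total_assignments
    for code, count in COG_COUNTS.items()
}
not_in_cogs_percent = 100.0 * NOT_IN_COGS / TOTAL_GENES

print(f"Total COG assignments: {total_assignments}")
print(f"G (carbohydrate transport and metabolism): {cog_percent['G']:.2f}%")
print(f"Not in COGs: {not_in_cogs_percent:.2f}%")
```

Read this way, the 17.65% for category G is a share of COG assignments; relative to the 5,802 protein-coding genes the same 910 genes would be a somewhat smaller fraction.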

The genomes of C. indolis and its near relatives () have similar numbers of genes in each of the 25 broad COG categories (not shown); however, differences exist in the type and distribution of genes in specific functional groups (Table 5), particularly those related to COG categories (G) Carbohydrate transport and metabolism, (C) Energy production and conversion, and (Q) Secondary metabolites biosynthesis, transport and catabolism.
Table 5

Number of genes in each of the 25 general COG functional categories (a) found in C. indolis DSM 755T but not in closely related species

Code    Value     Description
J    4     Translation
A    0     RNA processing and modification
K    5     Transcription
L    9     Replication, recombination and repair
B    1     Chromatin structure and dynamics
D    0     Cell cycle control, mitosis and meiosis
Y    0     Nuclear structure
V    1     Defense mechanisms
T    2     Signal transduction mechanisms
M    8     Cell wall/membrane biogenesis
N    2     Cell motility
Z    0     Cytoskeleton
W    0     Extracellular structures
U    1     Intracellular trafficking and secretion
O    10     Posttranslational modification, protein turnover, chaperones
C    28     Energy production and conversion
G    6     Carbohydrate transport and metabolism
E    8     Amino acid transport and metabolism
F    1     Nucleotide transport and metabolism
H    11     Coenzyme transport and metabolism
I    2     Lipid transport and metabolism
P    11     Inorganic ion transport and metabolism
Q    10     Secondary metabolites biosynthesis, transport and catabolism
R    18     General function prediction only
S    21     Function unknown

a) Number of genes from a set of 158 genes not found in near relatives () associated with the 25 general COG functional categories.

Carbohydrate transport and metabolism

Plant biomass is a complex composite of fibrils and sheets of cellulose, hemicellulose, waxes, pectin, proteins, and lignin. Bacteria from soil and the gut generally possess a variety of genes to degrade and transport the diversity of substrates encountered in these plant-rich environments. The genome of C. indolis includes 910 genes in this COG group (17.65% in Table 4), including glycoside hydrolases with the potential to degrade complex carbohydrates such as starch, cellulose, and chitin (Table 6), as well as an abundance of carbohydrate transporters (Figure 2).
Table 6

Selected carbohydrate active genes in the C. indolis DSM 755T genome

Gene count    Product name (a)    Database ID (b)
19    Beta-glucosidase (GH-1)    EC:3.2.1.86
8    Beta-galactosidase/beta-glucuronidase (GH-2)    EC:3.2.1.23, EC:3.2.1.25, EC:3.2.1.31
7    Beta-glucosidase/related glucosidases (GH-3)    EC:3.2.1.21, EC:3.2.1.52
14    Alpha-galactosidases/6-phospho-beta-glucosidases (GH-4)    EC:3.2.1.86, EC:3.2.1.122, EC:3.2.1.22
2    Cellulase, endoglucanase (GH-5)    EC:3.2.1.4
14    Alpha-amylase    EC:3.2.1.10, EC:3.2.1.20, EC:2.4.1.7, EC:3.2.1.70
8    Beta-xylosidase (GH-39)    EC:3.2.1.37
2    Chitinase (GH-18)    EC:3.2.1.14

a) GH designations given from the CAZy database [42]. b) Enzyme Commission (EC) numbers assigned by the Integrated Microbial Genome (IMG) database [41].

Figure 2

Distribution of ABC and PTS transporters in the genomes of C. indolis and related genomes determined from Integrated Microbial Genome (IMG) annotation [40], viewed based on (a) total number of COGs, and (b) percentage of genes in the genome.

Almost 8% of the protein-coding genes in the genome of C. indolis were found to be associated with carbohydrate transport, represented by two main strategies. ABC (ATP-binding cassette) transporters tend to carry oligosaccharides and have less affinity for hexoses [43,44], while PTS (phosphotransferase system) transporters carry many different mono- and disaccharides, especially hexoses [45]. PTS systems provide a means of regulation via catabolite repression [46], and are thought to enable bacteria living in carbohydrate-limited environments to more efficiently utilize and compete for substrates [46]. Both C. indolis and its near relatives are more highly enriched in ABC than PTS transporters (Figure 2); however, nearly a third of and transporters are PTS genes, suggesting a preference for hexoses, as well as an adaptation to more marginal environments. C. indolis also possesses ten genes associated with all three components of the TRAP-type C4-dicarboxylate transport system, which transports C4-dicarboxylates such as formate, succinate, and malate [47], as well as six putative malate dehydrogenases and two putative succinate dehydrogenases, suggesting that C. indolis may have the potential to utilize both of these short chain fatty acids.

Energy production and conversion

The genome of C. indolis contains 261 genes in COG category (C) Energy production and conversion, 28 of which are not found in the near relatives analyzed, including genes for citrate utilization (Table 7) and nitrogen fixation (Table 8).
Table 7

Selection of C. indolis DSM 755 genes related to citrate utilization.

Locus Tag     Putative Gene Product (a)    Gene ID (a)
K401DRAFT_2892     holo-ACP synthase (CitX)    EC:2.7.7.61
K401DRAFT_2893     citrate lyase acyl carrier (CitD)    EC:4.1.3.6
K401DRAFT_2894     citrate lyase beta subunit (CitE)    EC:4.1.3.6    EC:2.8.3.10
K401DRAFT_2895     citrate lyase alpha subunit (CitF)    EC:4.1.3.6    EC:2.8.3.10
K401DRAFT_2896     triphosphoribosyl-dephospho-CoA synthase (CitG)    EC:2.7.8.25
K401DRAFT_2897     citrate (pro3S)-lyase ligase (CitC)    EC:6.2.1.22
K401DRAFT_2898     response regulator, CheY-like receiver domain, winged helix DNA binding domain    -
K401DRAFT_2899     signal transduction histidine kinase    -
K401DRAFT_2900     citrate transporter, CITMHS family    KO:K03303    TC.LCTP

Gene products and Enzyme Commission (EC) numbers assigned by the Integrated Microbial Genome (IMG) database [41].

Table 8

Selection of C. indolis DSM 755 genes related to nitrogen fixation.

Locus Tag    Putative Gene Product    Gene ID
K401DRAFT_0533    nitrogenase Mo-Fe protein, α and β chains    pfam00148
K401DRAFT_0534    nitrogenase Mo-Fe protein, α and β chains    pfam00148
K401DRAFT_0535    nitrogenase subunit (ATPase) (nifH)    pfam00142
K401DRAFT_0884    nitrogenase Mo-Fe protein, α and β chains    pfam00148
K401DRAFT_0885    nitrogenase Mo-Fe protein, α and β chains    pfam00148
K401DRAFT_0886    nitrogenase subunit (ATPase) (nifH)    pfam00142
K401DRAFT_3349    nitrogenase Mo-Fe protein, α and β chains    pfam00148
K401DRAFT_3350    nitrogenase Mo-Fe protein, α and β chains    pfam00148
K401DRAFT_3351    nitrogenase subunit (ATPase) (nifH)    pfam00142
K401DRAFT_3874    nitrogenase Mo-Fe protein, α and β chains (nifD)    pfam00148
K401DRAFT_3875    nitrogenase Mo-Fe protein, α and β chains (nifK)    pfam00148
K401DRAFT_3876    nitrogenase Fe protein    pfam00142
K401DRAFT_3878    nitrogenase Mo-Fe protein, α and β chains (nifD)    pfam00148
K401DRAFT_3879    nitrogenase Mo-Fe protein, α and β chains (nifK)    pfam00148
K401DRAFT_3880    dinitrogenase Fe-Mo cofactor, (nifH)    pfam02579
K401DRAFT_3895    nitrogenase Mo-Fe protein, α and β chains (nifD)    pfam00148
K401DRAFT_3896    nitrogenase Mo-Fe protein, α and β chains (nifK)    pfam00148
K401DRAFT_5519    nitrogenase Mo-Fe protein, α and β chains (nifB)    pfam04055
K401DRAFT_5520    nitrogenase Mo-Fe protein, α and β chains (nifE)    pfam00148
K401DRAFT_5521    nitrogenase Mo-Fe protein (nifK)    pfam00148
K401DRAFT_5522    nitrogenase component 1, alpha chain (nifN-like)    pfam00148
K401DRAFT_5525    nitrogenase subunit (ATPase) (nifH)    pfam00142

Nitrogenase genes have a common gene identifier (EC:1.18.6.1), therefore the pfam numbers are given to distinguish between subunits. Gene product names and pfam numbers assigned by the Integrated Microbial Genome (IMG) database [41].

Citrate utilization

Citrate is a metabolic intermediate found in all living cells. In aerobic bacteria, citrate is utilized as part of the tricarboxylic acid (TCA) cycle. In anaerobes, citrate is fermented to acetate, formate, and/or succinate. The first step is the conversion of citrate to acetate and oxaloacetate in a reaction catalyzed by citrate lyase (EC:4.1.3.6) [48]. , a close relative of C. indolis that does not yet have a sequenced genome, has been shown to utilize citrate [49], but there is conflicting evidence as to whether this phenotype is present in C. indolis [28,30]. The genome of C. indolis reveals a group of seven citrate genes organized in a cluster similar to operons found in other bacterial species [48,50] (Figure 3), including CitD, CitE, and CitF, the three subunits of citrate lyase [48]; CitG and CitX, which have been shown to be necessary for citrate lyase function [50]; CitMHS, a citrate transporter; and a putative two-component system similar to citrate regulatory mechanisms in other bacteria [51].
Figure 3

Citrate utilization genes are in a single gene cluster on K401DRAFT_scaffold0000.1.1, including the citrate transporter CitMHS, and a putative two-component system.
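
The single-cluster arrangement can be checked directly from the locus tags in Table 7. The sketch below assumes that consecutive K401DRAFT locus tag numbers correspond to adjacent genes on the scaffold (a common annotation convention, but an assumption here); the short labels and helper name are illustrative.

```python
# Locus tag -> short label, taken from Table 7.
CITRATE_LOCI = {
    "K401DRAFT_2892": "CitX",
    "K401DRAFT_2893": "CitD",
    "K401DRAFT_2894": "CitE",
    "K401DRAFT_2895": "CitF",
    "K401DRAFT_2896": "CitG",
    "K401DRAFT_2897": "CitC",
    "K401DRAFT_2898": "response regulator",
    "K401DRAFT_2899": "histidine kinase",
    "K401DRAFT_2900": "CitMHS",
}


def locus_number(tag: str) -> int:
    """Extract the numeric suffix of a locus tag, e.g. 2892."""
    return int(tag.rsplit("_", 1)[1])


numbers = sorted(locus_number(tag) for tag in CITRATE_LOCI)

# The cluster is contiguous if every tag number follows the previous one.
is_contiguous = all(b - a == 1 for a, b in zip(numbers, numbers[1:]))

# Genes with a Cit* label versus the two-component system genes.
cit_genes = [name for name in CITRATE_LOCI.values() if name.startswith("Cit")]
regulatory_genes = [name for name in CITRATE_LOCI.values() if not name.startswith("Cit")]

print(f"Contiguous cluster: {is_contiguous}")
print(f"Citrate genes: {len(cit_genes)}, two-component system genes: {len(regulatory_genes)}")
```

Under this assumption the nine loci form one uninterrupted run, with seven Cit-labelled genes followed by the two genes of the putative two-component system and the transporter at the end.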

Nitrogen Fixation

Nitrogen fixation has been observed in other clostridia [52,53] but has not been demonstrated in the C. saccharolyticum species group. It has been suggested that the capacity to fix nitrogen confers a selective advantage to cellulolytic microbes that live in nitrogen-limited environments such as many soils [52]. The functional summary suggests that C. indolis can fix nitrogen. The genome reveals 22 nitrogenase-related genes in four gene clusters (Table 8), none of which are found in the near relatives analyzed in this study. A minimum set of six genes encoding structural and biosynthetic components of a functional nitrogenase complex has been hypothesized [54]. Genes needed for the nitrogenase structural component proteins (nifH, nifD, and nifK) are present in C. indolis, but one of the three genes required to synthesize the nitrogenase iron-molybdenum cofactor (nifN) is not identified. Follow-up experiments are needed to determine whether C. indolis can fix nitrogen as predicted by the genome analysis.
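
The comparison against a minimal gene set amounts to a set difference. The sketch below is illustrative and rests on two assumptions: that the six-gene minimum consists of the structural genes nifH, nifD, nifK plus the cofactor biosynthesis genes nifE, nifN, nifB (the text names only nifH, nifD, nifK, and nifN; nifE and nifB are inferred from the annotations in Table 8), and that the "nifN-like" entry in Table 8 is not counted as a confident nifN, following the text.

```python
# Assumed minimal nif gene set (see lead-in for which members are inferred).
STRUCTURAL_GENES = {"nifH", "nifD", "nifK"}
COFACTOR_GENES = {"nifE", "nifN", "nifB"}
MINIMAL_NIF_SET = STRUCTURAL_GENES | COFACTOR_GENES

# Gene symbols annotated in Table 8; the "nifN-like" entry is deliberately
# excluded because the text treats nifN as not identified.
ANNOTATED_IN_TABLE8 = {"nifH", "nifD", "nifK", "nifB", "nifE"}

structural_complete = STRUCTURAL_GENES <= ANNOTATED_IN_TABLE8
missing_genes = MINIMAL_NIF_SET - ANNOTATED_IN_TABLE8

print(f"Structural genes complete: {structural_complete}")
print(f"Missing from minimal set: {sorted(missing_genes)}")
```

Whether the nifN-like gene (K401DRAFT_5522) can substitute for nifN is exactly the kind of question the follow-up experiments would need to settle.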

Lactate utilization

The genome of C. indolis includes both D- and L-lactate dehydrogenases, which convert lactate to pyruvate. Additionally, there is a lactate transporter, suggesting that C. indolis is able to utilize exogenous lactate (Table 9).
Table 9

Selection of C. indolis DSM 755 genes related to lactate utilization.

Locus Tag    Putative Gene Product    Gene ID
K401DRAFT_1877    L-lactate dehydrogenase    EC:1.1.1.27
K401DRAFT_5775    L-lactate dehydrogenase    EC:1.1.1.27
K401DRAFT_3431    L-lactate transporter, LctP family    TC.LCTP
K401DRAFT_3220    D-lactate dehydrogenase    EC:1.1.1.28

Annotations assigned by the Integrated Microbial Genome (IMG) database [41]

Bacterial microcompartments (BMC)

The genome contains genes associated with bacterial microcompartment shell proteins. Bacterial microcompartments (BMCs) are proteinaceous organelles involved in the metabolism of ethanolamine, 1,2-propanediol, and possibly other metabolites (reviewed in [55-57]). BMCs are often encoded by a single operon or contiguous stretch of DNA. The different metabolic types of BMCs can be distinguished by a key enzyme (e.g., ethanolamine lyase or propanediol dehydratase) related to their metabolic function. While the other associated genes in the operon can vary, they frequently include an alcohol dehydrogenase, an aldehyde dehydrogenase, an aldolase and an oxidoreductase. In C. indolis there are 2 separate genetic loci that code for BMCs (Tables 10 and 11 and Figure 4). One locus (Table 10) contains a gene (K401DRAFT_2189) with sequence similarity to a B12-independent propanediol dehydratase found in and [58,59] (both members of the ). This enzyme has been shown to be involved in the metabolism of fucose and rhamnose [58,59], and this type of BMC was subsequently categorized as the glycyl radical prosthetic group-based (grp) BMC [60]. The glycyl radical family of enzymes was recently expanded to include a choline trimethylamine lyase activity that is part of a microcompartment locus in [61]. The corresponding C. indolis enzymes (K401DRAFT_2189 and K401DRAFT_2190) are more similar to the protein, but there are differences in the gene content of the microcompartment loci. Further work is needed to determine the physiological role of this microcompartment.
Table 10

grp-BMC genes found in the C. indolis genome.

Locus Tag      Product Name    Gene ID/ Protein Information
K401DRAFT_2181      Predicted transcriptional regulator    COG0789
K401DRAFT_2182      Predicted membrane protein    COG2510
K401DRAFT_2183      Carbon dioxide concentrating mechanism/carboxysome shell protein    pfam00936
K401DRAFT_2184      Predicted membrane protein    pfam00936
K401DRAFT_2185      Hypothetical protein    -
K401DRAFT_2186      Carbon dioxide concentrating mechanism/carboxysome shell protein    pfam00936
K401DRAFT_2187      Carbon dioxide concentrating mechanism/carboxysome shell protein    pfam00936
K401DRAFT_2188      NAD-dependent aldehyde dehydrogenase    pfam00171
K401DRAFT_2189      Pyruvate formate lyase    pfam02901
K401DRAFT_2190      Pyruvate formate lyase activating enzyme    pfam04055
K401DRAFT_2191      Ethanolamine utilization protein    pfam00936
K401DRAFT_2192      Ethanolamine utilization protein    pfam10662
K401DRAFT_2193      Alcohol dehydrogenase, class IV    pfam00465
K401DRAFT_2194      Ethanolamine utilization cobalamin adenosyltransferase    COG4892
K401DRAFT_2195      Ethanolamine utilization protein, possible chaperonin    COG4820
K401DRAFT_2196      Carbon dioxide concentrating mechanism/carboxysome shell protein    pfam00936
K401DRAFT_2197      Carbon dioxide concentrating mechanism/carboxysome shell protein    pfam03319
K401DRAFT_2198      Ethanolamine utilization protein    pfam06249
K401DRAFT_2199      Carbon dioxide concentrating mechanism/carboxysome shell protein    pfam00936
K401DRAFT_2200      NAD-dependent aldehyde dehydrogenase    pfam00171
K401DRAFT_2201      Propanediol utilization protein    pfam06130
K401DRAFT_2202      Carbon dioxide concentrating mechanism/carboxysome shell protein    pfam00936

Annotations assigned by the Integrated Microbial Genome (IMG) database [41].

Table 11

CoAT BMC genes found in the C. indolis genome.

Locus Tag     Product Name    Gene ID/ Protein Information
K401DRAFT_4970     DeoRC transcriptional regulator    pfam00455
K401DRAFT_4969     fucA, L-fuculose-phosphate aldolase    EC:4.1.2.17
K401DRAFT_4968     pduP, propionaldehyde dehydrogenase    pfam00171
K401DRAFT_4967     eutM, ethanolamine utilization protein    pfam00936
K401DRAFT_4966     Carbon dioxide concentrating mechanism/carboxysome shell protein    pfam00936
K401DRAFT_4965     Carbon dioxide concentrating mechanism/carboxysome shell protein    pfam00936
K401DRAFT_4964     Carbon dioxide concentrating mechanism/carboxysome shell protein    pfam00936
K401DRAFT_4963     PduL, propanediol utilization protein    pfam06130
K401DRAFT_4962     eutN_CcmL    pfam03319
K401DRAFT_4961     SBP_bac_8, ABC-type sugar transporter    pfam13416
K401DRAFT_4960     Uncharacterized NAD(FAD)-dependent dehydrogenase    COG0446
K401DRAFT_4959     CoA-transferase    pfam01144
K401DRAFT_4958     CoA-transferase    pfam01144
K401DRAFT_4957     Fe-ADH, Alcohol dehydrogenase    pfam00465

Annotations assigned by the Integrated Microbial Genomes (IMG) database [41].

Figure 4

CoAT BMC operon found in and . Gene details are found in Table 11.

The second BMC locus (Table 11 and Figure 4) is even more enigmatic. This locus contains the shell proteins, alcohol dehydrogenase, aldehyde dehydrogenase, aldolase and oxidoreductase commonly found in microcompartments, but it lacks a known key enzyme. Homologs of this operon were found in four other bacterial species (Figure 4). They are all missing a known key enzyme and contain two genes annotated as CoA-transferase. We propose that the genomes of C. indolis and these other bacteria contain a novel type of microcompartment, designated the CoAT BMC. It is not clear that the two annotated CoA-transferase genes function as predicted, and further research is needed to demonstrate the physiological role of this BMC.
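The reasoning behind the CoAT BMC designation can be sketched as a simple annotation check. The following Python snippet is a minimal illustration, not the procedure used in this study: it takes the annotation IDs listed in Table 11, counts shell proteins (pfam00936 and pfam03319) and CoA-transferases (pfam01144), and looks for signature enzymes. The function name `summarize_locus` is hypothetical, and the list of signature-enzyme Pfam families is an assumption chosen for illustration that should be verified against Pfam before any real use.

```python
from collections import Counter

# Annotation IDs copied from Table 11 (locus tag -> Pfam, COG, or EC identifier).
TABLE_11 = {
    "K401DRAFT_4970": "pfam00455",
    "K401DRAFT_4969": "EC:4.1.2.17",
    "K401DRAFT_4968": "pfam00171",
    "K401DRAFT_4967": "pfam00936",
    "K401DRAFT_4966": "pfam00936",
    "K401DRAFT_4965": "pfam00936",
    "K401DRAFT_4964": "pfam00936",
    "K401DRAFT_4963": "pfam06130",
    "K401DRAFT_4962": "pfam03319",
    "K401DRAFT_4961": "pfam13416",
    "K401DRAFT_4960": "COG0446",
    "K401DRAFT_4959": "pfam01144",
    "K401DRAFT_4958": "pfam01144",
    "K401DRAFT_4957": "pfam00465",
}

# Shell protein families as annotated in Tables 10 and 11.
SHELL_PFAMS = {"pfam00936", "pfam03319"}
COA_TRANSFERASE_PFAM = "pfam01144"

# ASSUMPTION: an illustrative, incomplete set of signature-enzyme families
# for well-characterized BMC types. Not taken from the paper.
ASSUMED_SIGNATURE_PFAMS = {
    "pfam06751": "ethanolamine ammonia-lyase large subunit (EutB)",
    "pfam05985": "ethanolamine ammonia-lyase small subunit (EutC)",
    "pfam02286": "diol/glycerol dehydratase large subunit",
    "pfam02901": "pyruvate formate-lyase-like glycyl radical enzyme",
}


def summarize_locus(annotations):
    """Count shell proteins and CoA-transferases; list any signature enzymes."""
    counts = Counter(annotations.values())
    return {
        "shell_proteins": sum(counts[p] for p in SHELL_PFAMS),
        "coa_transferases": counts[COA_TRANSFERASE_PFAM],
        "signature_enzymes": sorted(
            p for p in ASSUMED_SIGNATURE_PFAMS if p in counts
        ),
    }


summary = summarize_locus(TABLE_11)
print(summary)
```

Under these assumptions the locus shows shell proteins and two CoA-transferase annotations but none of the listed signature enzymes, which matches the pattern described above. A check of this kind only flags candidate loci; it says nothing about whether the CoA-transferase genes function as annotated.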

Secondary metabolites biosynthesis, transport and catabolism

Protocatechuate and other aromatics are intermediates in the degradation of lignin in plant-rich environments [62]. The genome of C. indolis contains two protocatechuate dioxygenases and an aromatic ring hydroxylase, revealing the potential for utilizing aromatic compounds (Table 12).

Table 12

Selection of C. indolis DSM 755T genes related to degradation of aromatics.

Locus Tag      Putative Gene Product    Gene ID
K401DRAFT_3571      Protocatechuate 3,4-dioxygenase beta subunit    EC:1.13.11.3
K401DRAFT_3568      Protocatechuate 3,4-dioxygenase beta subunit    EC:1.13.11.3
K401DRAFT_3412      Aromatic ring hydroxylase    EC:5.3.3.3    EC:4.2.1.120

Annotations assigned by the Integrated Microbial Genomes (IMG) database [41].

Conclusion

The genomic sequence of C. indolis reported here reveals the metabolic potential of this organism to utilize a wide assortment of fermentable carbohydrates and intermediates including citrate, lactate, malate, succinate, and aromatics, and points to potential ecological roles in nitrogen fixation and ethanolamine utilization. Further culture-based characterization is necessary to confirm the metabolic activity suggested by this genomic analysis, and to expand the description of C. indolis.

References (45 in total)

Review 1.  Regulation of carbon catabolism in Bacillus species.

Authors:  J Stülke; W Hillen
Journal:  Annu Rev Microbiol       Date:  2000       Impact factor: 15.500

2.  Symbiotic Relationship of Bacteroides cellulosolvens and Clostridium saccharolyticum in Cellulose Fermentation.

Authors:  W D Murray
Journal:  Appl Environ Microbiol       Date:  1986-04       Impact factor: 4.792

3.  MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods.

Authors:  Koichiro Tamura; Daniel Peterson; Nicholas Peterson; Glen Stecher; Masatoshi Nei; Sudhir Kumar
Journal:  Mol Biol Evol       Date:  2011-05-04       Impact factor: 16.240

4.  tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence.

Authors:  T M Lowe; S R Eddy
Journal:  Nucleic Acids Res       Date:  1997-03-01       Impact factor: 16.971

5.  The neighbor-joining method: a new method for reconstructing phylogenetic trees.

Authors:  N Saitou; M Nei
Journal:  Mol Biol Evol       Date:  1987-07       Impact factor: 16.240

6.  Towards a phylogeny of the clostridia based on 16S rRNA sequences.

Authors:  P A Lawson; P Llop-Perez; R A Hutson; H Hippe; M D Collins
Journal:  FEMS Microbiol Lett       Date:  1993-10-01       Impact factor: 2.742

Review 7.  Ethanolamine utilization in bacterial pathogens: roles and regulation.

Authors:  Danielle A Garsin
Journal:  Nat Rev Microbiol       Date:  2010-04       Impact factor: 60.633

8.  Citrate, a specific substrate for the isolation of Clostridium sphenoides.

Authors:  R Walther; H Hippe; G Gottschalk
Journal:  Appl Environ Microbiol       Date:  1977-04       Impact factor: 4.792

9.  Whole-genome transcription profiling reveals genes up-regulated by growth on fucose in the human gut bacterium "Roseburia inulinivorans".

Authors:  Karen P Scott; Jennifer C Martin; Gillian Campbell; Claus-Dieter Mayer; Harry J Flint
Journal:  J Bacteriol       Date:  2006-06       Impact factor: 3.490

10.  The phylogeny of the genus Clostridium: proposal of five new genera and eleven new species combinations.

Authors:  M D Collins; P A Lawson; A Willems; J J Cordoba; J Fernandez-Garayzabal; P Garcia; J Cai; H Hippe; J A Farrow
Journal:  Int J Syst Bacteriol       Date:  1994-10