Amy S Biddle1, Susan Leschine, Marcel Huntemann2, James Han2, Amy Chen2, Nikos Kyrpides2, Victor Markowitz2, Krishna Palaniappan2, Natalia Ivanova2, Natalia Mikhailova2, Galina Ovchinnikova2, Andrew Schaumberg2, Amrita Pati2, Dimitrios Stamatis2, Tatiparthi Reddy2, Elizabeth Lobos2, Lynne Goodwin2, Henrik P Nordberg2, Michael N Cantor2, Susan X Hua2, Tanja Woyke2, Jeffrey L Blanchard3.
Abstract
Clostridium indolis DSM 755(T) is a bacterium commonly found in soils and the feces of birds and mammals. Despite its prevalence, little is known about the ecology or physiology of this species. However, close relatives, C. saccharolyticum and C. hathewayi, have demonstrated interesting metabolic potentials related to plant degradation and human health. The genome of C. indolis DSM 755(T) reveals an abundance of genes in functional groups associated with the transport and utilization of carbohydrates, as well as citrate, lactate, and aromatics. Ecologically relevant gene clusters related to nitrogen fixation and a unique type of bacterial microcompartment, the CoAT BMC, are also detected. Our genome analysis suggests hypotheses to be tested in future culture-based work to better understand the physiology of this poorly described species.
Keywords: Clostridium indolis; aromatic degradation; bacterial microcompartments; citrate; lactate; nitrogen fixation
Year: 2014 PMID: 25197485 PMCID: PMC4149025 DOI: 10.4056/sigs.5281010
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Figure 1. Phylogenetic tree based on 16S rRNA gene sequences highlighting the position of C. indolis relative to other type strains (T) within the . The strains and their corresponding NCBI accession numbers (and, when applicable, draft sequence coordinates) for 16S rRNA genes are: strain DSM 4024T, Y11568; ATCC 19403T, AB075772; DSM 5628T, X71848; DSM 755T, Pending release by JGI: 1620643-1622056; SR3, AF067965; WM1T, NC_014376:18567-20085; SPL73T, AF092549; DSM 13479T, ADLN00000000: 202-1639; L34420 T, L34420; ATCC 29149T, X94967; R. torques ATCC 27756T, L76604; L34627T; L1-82T, AJ312385; A2-183T, AJ270482; HY-35-12T, AY494606; HESP1T, AF116920; ISDgT, CP000885: 15754-17276. The tree uses sequences aligned by MUSCLE and was inferred using the Neighbor-Joining method [2]. The optimal tree, with a sum of branch lengths of 0.50791241, is shown. The percentages of replicate trees in which the associated taxa clustered together in the bootstrap test (500 replicates) are shown next to the branches [3]. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Maximum Composite Likelihood method [4] and are in units of the number of base substitutions per site. Evolutionary analyses were conducted in MEGA 5 [5]. ATCC 35414T, CP003992: 856992-858513 was used as an outgroup.
Classification and general features of DSM 755T
| MIGS ID | Property | Term | Evidence code |
|---|---|---|---|
| | Current classification | Domain | TAS |
| | | Phylum | TAS |
| | | Class | TAS |
| | | Order | TAS |
| | | Family | TAS |
| | | Genus | TAS |
| | | Species | TAS |
| | | Type strain DSM 755 | |
| | Gram stain | Negative | TAS |
| | Cell shape | Rod | TAS |
| | Motility | Motile | TAS |
| | Sporulation | Terminal, spherical spores | TAS |
| | Temperature range | Mesophilic | TAS |
| | Optimum temperature | 37 °C | TAS |
| | Carbon sources | Glucose, lactose, sucrose, mannitol, pectin, pyruvate, others | TAS |
| | Terminal electron acceptor | Sulfate | TAS |
| | Indole test | Positive | TAS |
| MIGS-6 | Habitat | Isolated from soil, feces, wounds | TAS |
| MIGS-6.3 | Salinity | Inhibited by 6.5% NaCl | TAS |
| MIGS-22 | Oxygen | Anaerobic | TAS |
| MIGS-15 | Biotic relationship | Free living and host associated | TAS |
| MIGS-14 | Pathogenicity | No | NAS |
| MIGS-4 | Geographic location | Soil, feces | TAS |
Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [26].
Project information
| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | Improved Draft |
| MIGS-28 | Libraries used | Shotgun and long insert mate pair (Illumina), SMRTbell™ (PacBio) |
| MIGS-29 | Sequencing platforms | Illumina and PacBio |
| MIGS-31.2 | Fold coverage | 759.7× (Illumina), 51.6× (PacBio) |
| MIGS-30 | Assemblers | Velvet, AllpathsLG |
| MIGS-32 | Gene calling method | Prodigal, GenePRIMP |
| | Genome Database release | June 3, 2013 (IMB) |
| | GenBank ID | Pending release by JGI |
| | GenBank Date of Release | Pending release by JGI |
| | GOLD ID | Gi22434 |
| | Project relevance | Anaerobic plant degradation |
Nucleotide content and gene count levels of the genome of DSM 755
| Attribute | Value | % of total (a) |
|---|---|---|
| Genome size (bp) | 6,383,701 | |
| DNA Coding region (bp) | 5,688,007 | 89.10 |
| DNA G+C content (bp) | 2,868,247 | 44.93 |
| Total genes (b) | 5,903 | 100.00 |
| RNA genes | 101 | 1.71 |
| Protein-coding genes | 5,802 | 98.29 |
| Protein-coding with function pred. | 4,794 | 81.21 |
| Genes in paralog clusters | 4,527 | 76.69 |
| Genes assigned to COGs | 4,643 | 78.65 |
| Genes with signal peptides | 421 | 7.13 |
| Genes with transmembrane helices | 1,494 | 25.31 |
| Paralogous groups | 4,527 | 76.69 |
a) The total is based on either the size of the genome in base pairs or the total number of protein coding genes in the annotated genome.
b) Also includes 170 pseudogenes.
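The percentages in the table above follow from the counts by simple division. As a quick arithmetic check, the minimal Python sketch below recomputes them, assuming that base-pair attributes are divided by the genome size and that gene counts are divided by the total gene count of 5,903 (the row reported as 100.00%). The variable and function names are illustrative and are not taken from any annotation pipeline.

```python
# Counts and reported percentages copied from the table above.
GENOME_SIZE_BP = 6_383_701
TOTAL_GENES = 5_903

# name -> (count, reported percentage)
bp_attributes = {
    "DNA coding region (bp)": (5_688_007, 89.10),
    "DNA G+C content (bp)": (2_868_247, 44.93),
}
gene_attributes = {
    "RNA genes": (101, 1.71),
    "Protein-coding genes": (5_802, 98.29),
    "Protein-coding with function pred.": (4_794, 81.21),
    "Genes in paralog clusters": (4_527, 76.69),
    "Genes assigned to COGs": (4_643, 78.65),
    "Genes with signal peptides": (421, 7.13),
    "Genes with transmembrane helices": (1_494, 25.31),
}


def percent(count, total):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


def recompute(attributes, total):
    """Map each attribute name to its recomputed percentage."""
    return {name: percent(count, total) for name, (count, _) in attributes.items()}


bp_pct = recompute(bp_attributes, GENOME_SIZE_BP)
gene_pct = recompute(gene_attributes, TOTAL_GENES)

for name, (_, reported) in {**bp_attributes, **gene_attributes}.items():
    computed = {**bp_pct, **gene_pct}[name]
    print(f"{name:36s} reported {reported:6.2f}  recomputed {computed:6.2f}")
```

Under these assumptions the recomputed values agree with the reported column to two decimal places.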
Number of genes in DSM 755 associated with the 25 general COG functional categories
| Code | Value | % of total (a) | Description |
|---|---|---|---|
| J | 184 | 3.57 | Translation |
| A | 0 | 0 | RNA processing and modification |
| K | 531 | 10.30 | Transcription |
| L | 191 | 3.71 | Replication, recombination and repair |
| B | 1 | 0.02 | Chromatin structure and dynamics |
| D | 28 | 0.54 | Cell cycle control, mitosis and meiosis |
| Y | 0 | 0 | Nuclear structure |
| V | 107 | 2.08 | Defense mechanisms |
| T | 335 | 6.50 | Signal transduction mechanisms |
| M | 235 | 4.56 | Cell wall/membrane biogenesis |
| N | 70 | 1.36 | Cell motility |
| Z | 0 | 0 | Cytoskeleton |
| W | 0 | 0 | Extracellular structures |
| U | 41 | 0.80 | Intracellular trafficking and secretion |
| O | 124 | 2.41 | Posttranslational modification, protein turnover, chaperones |
| C | 261 | 5.06 | Energy production and conversion |
| G | 910 | 17.65 | Carbohydrate transport and metabolism |
| E | 493 | 9.56 | Amino acid transport and metabolism |
| F | 110 | 2.13 | Nucleotide transport and metabolism |
| H | 153 | 2.97 | Coenzyme transport and metabolism |
| I | 77 | 1.49 | Lipid transport and metabolism |
| P | 325 | 6.30 | Inorganic ion transport and metabolism |
| Q | 70 | 1.36 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 590 | 11.45 | General function prediction only |
| S | 319 | 6.19 | Function unknown |
| - | 1260 | 21.35 | Not in COGs |
a) The total is based on the total number of protein coding genes in the annotated genome.
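Because the percentage column depends on which total is used as the denominator, it can be useful to recompute it. The Python sketch below is a hedged arithmetic check, not part of the original analysis: it tests two candidate denominators, the 5,802 protein-coding genes from the genome statistics table and the sum of the 25 category counts. With the figures as printed, the category percentages appear to reproduce against the sum of category assignments, while the "Not in COGs" row matches the total gene count of 5,903. All names are illustrative.

```python
# COG code -> (gene count, reported percentage), copied from the table above.
cog_counts = {
    "J": (184, 3.57), "A": (0, 0.0), "K": (531, 10.30), "L": (191, 3.71),
    "B": (1, 0.02), "D": (28, 0.54), "Y": (0, 0.0), "V": (107, 2.08),
    "T": (335, 6.50), "M": (235, 4.56), "N": (70, 1.36), "Z": (0, 0.0),
    "W": (0, 0.0), "U": (41, 0.80), "O": (124, 2.41), "C": (261, 5.06),
    "G": (910, 17.65), "E": (493, 9.56), "F": (110, 2.13), "H": (153, 2.97),
    "I": (77, 1.49), "P": (325, 6.30), "Q": (70, 1.36), "R": (590, 11.45),
    "S": (319, 6.19),
}
NOT_IN_COGS = (1260, 21.35)
PROTEIN_CODING_GENES = 5802  # from the genome statistics table
TOTAL_GENES = 5903           # from the genome statistics table


def percent(count, total):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


def reproduces_table(denominator):
    """True if every category percentage matches when divided by denominator."""
    return all(
        abs(percent(count, denominator) - reported) < 1e-9
        for count, reported in cog_counts.values()
    )


# Sum of all category assignments (a gene may carry more than one category).
assignment_total = sum(count for count, _ in cog_counts.values())

print("sum of category assignments:", assignment_total)
print("matches with protein-coding genes:", reproduces_table(PROTEIN_CODING_GENES))
print("matches with sum of assignments:", reproduces_table(assignment_total))
print("'Not in COGs' vs total genes:", percent(NOT_IN_COGS[0], TOTAL_GENES))
```

Checking both denominators side by side makes it explicit which total a given row was normalized by before comparing categories across genomes.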
Number of genes in each of the 25 general COG functional categories (a) found in DSM 755T but not in closely related species
| Code | Value | Description |
|---|---|---|
| J | 4 | Translation |
| A | 0 | RNA processing and modification |
| K | 5 | Transcription |
| L | 9 | Replication, recombination and repair |
| B | 1 | Chromatin structure and dynamics |
| D | 0 | Cell cycle control, mitosis and meiosis |
| Y | 0 | Nuclear structure |
| V | 1 | Defense mechanisms |
| T | 2 | Signal transduction mechanisms |
| M | 8 | Cell wall/membrane biogenesis |
| N | 2 | Cell motility |
| Z | 0 | Cytoskeleton |
| W | 0 | Extracellular structures |
| U | 1 | Intracellular trafficking and secretion |
| O | 10 | Posttranslational modification, protein turnover, chaperones |
| C | 28 | Energy production and conversion |
| G | 6 | Carbohydrate transport and metabolism |
| E | 8 | Amino acid transport and metabolism |
| F | 1 | Nucleotide transport and metabolism |
| H | 11 | Coenzyme transport and metabolism |
| I | 2 | Lipid transport and metabolism |
| P | 11 | Inorganic ion transport and metabolism |
| Q | 10 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 18 | General function prediction only |
| S | 21 | Function unknown |
a) Number of genes, from a set of 158 genes not found in near relatives, associated with the 25 general COG functional categories.
Selected carbohydrate active genes in the DSM 755T genome
| Number of genes | Gene product (a) | EC number (b) |
|---|---|---|
| 19 | Beta-glucosidase (GH-1) | EC:3.2.1.86 |
| 8 | Beta-galactosidase/ | EC:3.2.1.23 |
| 7 | Beta-glucosidase/ related glucosidases (GH-3) | EC:3.2.1.21 |
| 14 | Alpha-galactosidases/ | EC:3.2.1.86 |
| 2 | Cellulase, endoglucanase (GH-5) | EC:3.2.1.4 |
| 14 | Alpha-amylase | EC:3.2.1.10 |
| 8 | Beta-xylosidase (GH 39) | EC:3.2.1.37 |
| 2 | Chitinase (GH 18) | EC:3.2.1.14 |
a) GH designations given from the CAZy database [42].
b) Enzyme Commission (EC) numbers assigned by the Integrated Microbial Genome (IMG) database [41].
Figure 2. Distribution of ABC and PTS transporters in the genomes of C. indolis and related genomes, determined from Integrated Microbial Genome (IMG) annotation [40] and viewed based on (a) total number of COGs and (b) percentage of genes in the genome.
Selection of DSM 755 genes related to citrate utilization.
| Locus tag | Gene product | EC number / KO |
|---|---|---|
| K401DRAFT_2892 | holo-ACP synthase (CitX) | EC:2.7.7.61 |
| K401DRAFT_2893 | citrate lyase acyl carrier (CitD) | EC:4.1.3.6 |
| K401DRAFT_2894 | citrate lyase beta subunit (CitE) | EC:4.1.3.6 |
| K401DRAFT_2895 | citrate lyase alpha subunit (CitF) | EC:4.1.3.6 |
| K401DRAFT_2896 | triphosphoribosyl-dephospho-CoA synthase (CitG) | EC:2.7.8.25 |
| K401DRAFT_2897 | citrate (pro3S)-lyase ligase (CitC) | EC:6.2.1.22 |
| K401DRAFT_2898 | response regulator, CheY-like receiver domain, winged helix DNA binding domain | - |
| K401DRAFT_2899 | signal transduction histidine kinase | - |
| K401DRAFT_2900 | citrate transporter, CITMHS family | KO:K03303 |
Gene products and Enzyme Commission (EC) numbers assigned by the Integrated Microbial Genome (IMG) database [41].
Selection of DSM 755 genes related to nitrogen fixation.
| Locus tag | Gene product | Pfam |
|---|---|---|
| K401DRAFT_0533 | nitrogenase Mo-Fe protein, α and β chains | pfam00148 |
| K401DRAFT_0534 | nitrogenase Mo-Fe protein, α and β chains | pfam00148 |
| K401DRAFT_0535 | nitrogenase subunit (ATPase) (nifH) | pfam00142 |
| K401DRAFT_0884 | nitrogenase Mo-Fe protein, α and β chains | pfam00148 |
| K401DRAFT_0885 | nitrogenase Mo-Fe protein, α and β chains | pfam00148 |
| K401DRAFT_0886 | nitrogenase subunit (ATPase) (nifH) | pfam00142 |
| K401DRAFT_3349 | nitrogenase Mo-Fe protein, α and β chains | pfam00148 |
| K401DRAFT_3350 | nitrogenase Mo-Fe protein, α and β chains | pfam00148 |
| K401DRAFT_3351 | nitrogenase subunit (ATPase) (nifH) | pfam00142 |
| K401DRAFT_3874 | nitrogenase Mo-Fe protein, α and β chains (nifD) | pfam00148 |
| K401DRAFT_3875 | nitrogenase Mo-Fe protein, α and β chains (nifK) | pfam00148 |
| K401DRAFT_3876 | nitrogenase Fe protein | pfam00142 |
| K401DRAFT_3878 | nitrogenase Mo-Fe protein, α and β chains (nifD) | pfam00148 |
| K401DRAFT_3879 | nitrogenase Mo-Fe protein, α and β chains (nifK) | pfam00148 |
| K401DRAFT_3880 | dinitrogenase Fe-Mo cofactor, (nifH) | pfam02579 |
| K401DRAFT_3895 | nitrogenase Mo-Fe protein, α and β chains (nifD) | pfam00148 |
| K401DRAFT_3896 | nitrogenase Mo-Fe protein, α and β chains (nifK) | pfam00148 |
| K401DRAFT_5519 | nitrogenase Mo-Fe protein, α and β chains (nifB) | pfam04055 |
| K401DRAFT_5520 | nitrogenase Mo-Fe protein, α and β chains (nifE) | pfam00148 |
| K401DRAFT_5521 | nitrogenase Mo-Fe protein (nifK) | pfam00148 |
| K401DRAFT_5522 | nitrogenase component 1, alpha chain (nifN-like) | pfam00148 |
| K401DRAFT_5525 | nitrogenase subunit (ATPase) (nifH) | pfam00142 |
Nitrogenase genes share a common identifier (EC:1.18.6.1); therefore, the pfam numbers are given to distinguish between subunits. Gene product names and pfam numbers were assigned by the Integrated Microbial Genome (IMG) database [41].
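Since the subunits are distinguished only by pfam number, a tally per pfam gives a compact summary of the table. The following Python sketch is illustrative only: the locus tags and pfam identifiers are copied from the nitrogen fixation table above, while the variable names are assumptions made for this example.

```python
from collections import Counter

# (locus tag, pfam) pairs copied from the nitrogen fixation table above.
nif_rows = [
    ("K401DRAFT_0533", "pfam00148"), ("K401DRAFT_0534", "pfam00148"),
    ("K401DRAFT_0535", "pfam00142"), ("K401DRAFT_0884", "pfam00148"),
    ("K401DRAFT_0885", "pfam00148"), ("K401DRAFT_0886", "pfam00142"),
    ("K401DRAFT_3349", "pfam00148"), ("K401DRAFT_3350", "pfam00148"),
    ("K401DRAFT_3351", "pfam00142"), ("K401DRAFT_3874", "pfam00148"),
    ("K401DRAFT_3875", "pfam00148"), ("K401DRAFT_3876", "pfam00142"),
    ("K401DRAFT_3878", "pfam00148"), ("K401DRAFT_3879", "pfam00148"),
    ("K401DRAFT_3880", "pfam02579"), ("K401DRAFT_3895", "pfam00148"),
    ("K401DRAFT_3896", "pfam00148"), ("K401DRAFT_5519", "pfam04055"),
    ("K401DRAFT_5520", "pfam00148"), ("K401DRAFT_5521", "pfam00148"),
    ("K401DRAFT_5522", "pfam00148"), ("K401DRAFT_5525", "pfam00142"),
]

# Count how many listed genes fall into each pfam family.
pfam_counts = Counter(pfam for _, pfam in nif_rows)

for pfam, count in pfam_counts.most_common():
    print(f"{pfam}: {count} genes")
```

In the table, pfam00148 accompanies the Mo-Fe protein α and β chain annotations and pfam00142 accompanies the Fe protein (ATPase) annotations, so the tally separates the two components at a glance.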
Figure 3. Citrate utilization genes are in a single gene cluster on K401DRAFT_scaffold0000.1.1, including the citrate transporter CitMHS and a putative two-component system.
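One quick way to spot-check that a set of genes forms a single cluster is to test whether their locus tags are numbered consecutively. The Python sketch below does this for the citrate utilization genes listed in the citrate table. It assumes that consecutive K401DRAFT numbers reflect neighboring positions in the annotation, which is consistent with, but does not by itself prove, physical adjacency on the scaffold. The helper names are hypothetical.

```python
import re

# Locus tags copied from the citrate utilization table.
citrate_locus_tags = [
    "K401DRAFT_2892", "K401DRAFT_2893", "K401DRAFT_2894",
    "K401DRAFT_2895", "K401DRAFT_2896", "K401DRAFT_2897",
    "K401DRAFT_2898", "K401DRAFT_2899", "K401DRAFT_2900",
]


def locus_number(tag):
    """Extract the numeric part of a K401DRAFT locus tag."""
    match = re.fullmatch(r"K401DRAFT_(\d+)", tag)
    if match is None:
        raise ValueError(f"unexpected locus tag format: {tag!r}")
    return int(match.group(1))


def is_contiguous(tags):
    """True if the locus tag numbers form an unbroken run of consecutive integers."""
    numbers = sorted(locus_number(tag) for tag in tags)
    return all(later - earlier == 1 for earlier, later in zip(numbers, numbers[1:]))


print("citrate genes contiguous:", is_contiguous(citrate_locus_tags))
```

The same check applied to the two L-lactate dehydrogenase tags (K401DRAFT_1877 and K401DRAFT_5775) returns False, as expected for genes that are not numbered as neighbors.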
Selection of DSM 755 genes related to lactate utilization.
| Locus tag | Gene product | Annotation ID |
|---|---|---|
| K401DRAFT_1877 | L-lactate dehydrogenase | EC:1.1.1.27 |
| K401DRAFT_5775 | L-lactate dehydrogenase | EC:1.1.1.27 |
| K401DRAFT_3431 | L-lactate transporter, LctP family | TC.LCTP |
| K401DRAFT_3220 | D-lactate dehydrogenase | EC:1.1.1.28 |
Annotations assigned by the Integrated Microbial Genome (IMG) database [41].
grp-BMC genes found in the genome.
| Locus tag | Gene product | Pfam / COG |
|---|---|---|
| K401DRAFT_2181 | Predicted transcriptional regulator | COG0789 |
| K401DRAFT_2182 | Predicted membrane protein | COG2510 |
| K401DRAFT_2183 | Carbon dioxide concentrating mechanism/carboxysome shell protein | pfam00936 |
| K401DRAFT_2184 | Predicted membrane protein | pfam00936 |
| K401DRAFT_2185 | Hypothetical protein | - |
| K401DRAFT_2186 | Carbon dioxide concentrating mechanism/carboxysome shell protein | pfam00936 |
| K401DRAFT_2187 | Carbon dioxide concentrating mechanism/carboxysome shell protein | pfam00936 |
| K401DRAFT_2188 | NAD-dependent aldehyde dehydrogenase | pfam00171 |
| K401DRAFT_2189 | Pyruvate formate lyase | pfam02901 |
| K401DRAFT_2190 | Pyruvate formate lyase activating enzyme | pfam04055 |
| K401DRAFT_2191 | Ethanolamine utilization protein | pfam00936 |
| K401DRAFT_2192 | Ethanolamine utilization protein | pfam10662 |
| K401DRAFT_2193 | Alcohol dehydrogenase, class IV | pfam00465 |
| K401DRAFT_2194 | Ethanolamine utilization cobalamin adenosyltransferase | COG4892 |
| K401DRAFT_2195 | Ethanolamine utilization protein, possible chaperonin | COG4820 |
| K401DRAFT_2196 | Carbon dioxide concentrating mechanism/carboxysome shell protein | pfam00936 |
| K401DRAFT_2197 | Carbon dioxide concentrating mechanism/carboxysome shell protein | pfam03319 |
| K401DRAFT_2198 | Ethanolamine utilization protein | pfam06249 |
| K401DRAFT_2199 | Carbon dioxide concentrating mechanism/carboxysome shell protein | pfam00936 |
| K401DRAFT_2200 | NAD-dependent aldehyde dehydrogenase | pfam00171 |
| K401DRAFT_2201 | Propanediol utilization protein | pfam06130 |
| K401DRAFT_2202 | Carbon dioxide concentrating mechanism/carboxysome shell protein | pfam00936 |
Annotations assigned by the Integrated Microbial Genome (IMG) database [41].
CoAT BMC genes found in the genome.
| Locus tag | Gene product | Annotation ID |
|---|---|---|
| K401DRAFT_4970 | DeoRC transcriptional regulator | pfam00455 |
| K401DRAFT_4969 | fucA, L-fuculose-phosphate aldolase | EC:4.1.2.17 |
| K401DRAFT_4968 | pduP, propionaldehyde dehydrogenase | pfam00171 |
| K401DRAFT_4967 | eutM, ethanolamine utilization protein | pfam00936 |
| K401DRAFT_4966 | Carbon dioxide concentrating mechanism/carboxysome shell protein | pfam00936 |
| K401DRAFT_4965 | Carbon dioxide concentrating mechanism/carboxysome shell protein | pfam00936 |
| K401DRAFT_4964 | Carbon dioxide concentrating mechanism/carboxysome shell protein | pfam00936 |
| K401DRAFT_4963 | PduL, propanediol utilization protein | pfam06130 |
| K401DRAFT_4962 | eutN_CcmL | pfam03319 |
| K401DRAFT_4961 | SBP_bac_8, ABC-type sugar transporter | pfam13416 |
| K401DRAFT_4960 | Uncharacterized NAD(FAD)-dependent dehydrogenase | COG0446 |
| K401DRAFT_4959 | CoA-transferase | pfam01144 |
| K401DRAFT_4958 | CoA-transferase | pfam01144 |
| K401DRAFT_4957 | Fe-ADH, Alcohol dehydrogenase | pfam00465 |
Annotations assigned by the Integrated Microbial Genome (IMG) database [41].
Figure 4. CoAT BMC operon found in and . Gene details are found in Table 11.
Selection of DSM 755T genes related to degradation of aromatics.
| Locus tag | Gene product | EC number |
|---|---|---|
| K401DRAFT_3571 | Protocatechuate 3,4-dioxygenase beta subunit | EC:1.13.11.3 |
| K401DRAFT_3568 | Protocatechuate 3,4-dioxygenase beta subunit | EC:1.13.11.3 |
| K401DRAFT_3412 | Aromatic ring hydroxylase | EC:5.3.3.3 |
Annotations assigned by the Integrated Microbial Genome (IMG) database [41].