Steven D Brown1,2,3, Shilpa Nagaraju4, Sagar Utturkar3, Sashini De Tissera4, Simón Segovia4, Wayne Mitchell4, Miriam L Land1, Asela Dassanayake4, Michael Köpke4.
Abstract
BACKGROUND: Clostridium autoethanogenum strain JA1-1 (DSM 10061) is an acetogen capable of fermenting CO, CO2 and H2 (e.g. from syngas or waste gases) into biofuel ethanol and commodity chemicals such as 2,3-butanediol. A draft genome sequence consisting of 100 contigs has been published.
Year: 2014 PMID: 24655715 PMCID: PMC4022347 DOI: 10.1186/1754-6834-7-40
Source DB: PubMed Journal: Biotechnol Biofuels ISSN: 1754-6834 Impact factor: 6.040
Sequencing statistics
| Platform | Reads | Bases | Mean read length (bp) | Max read length (bp) | Coverage |
|---|---|---|---|---|---|
| 454-3 kb PE | 511,515 | 202,048,425 | 395 | 945 | 46× |
| Illumina PE | 3,689,644 | 553,446,600 | 151 | 151 | 127× |
| PacBio | 122,933 | 782,530,012 | 6,366 | 26,777 | 179× |
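The figures in this table can be cross-checked with simple arithmetic: mean read length is bases divided by reads, and fold coverage is bases divided by genome size. The sketch below is a minimal check, assuming the 4,352,267 bp PacBio assembly length as the denominator; the source does not state which genome size or rounding rule was used, so the function names and the choice of denominator are illustrative only.

```python
# Assumption: the PacBio assembly length is used as the genome size.
GENOME_SIZE_BP = 4_352_267

# (reads, bases) per library, copied from the sequencing statistics table.
LIBRARIES = {
    "454-3 kb PE": (511_515, 202_048_425),
    "Illumina PE": (3_689_644, 553_446_600),
    "PacBio": (122_933, 782_530_012),
}


def mean_read_length(reads: int, bases: int) -> float:
    """Average read length in bp."""
    return bases / reads


def fold_coverage(bases: int, genome_size: int = GENOME_SIZE_BP) -> float:
    """Sequenced bases per genome position."""
    return bases / genome_size


for name, (reads, bases) in LIBRARIES.items():
    print(
        f"{name}: mean read length {mean_read_length(reads, bases):.1f} bp, "
        f"coverage {fold_coverage(bases):.1f}x"
    )
```

Under this assumption the truncated coverage values agree with the table. The Illumina bases-to-reads ratio comes out at 150 bp rather than the listed 151, so the published columns were presumably not derived from one another in exactly this way.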
Assembly statistics for strain DSM 10061
| Data | Contigs | Largest contig (bp) | Contig N50 (bp) | Assembly size (Mb) | Scaffolds | Largest scaffold (bp) | Scaffold N50 (bp) | Assembler |
|---|---|---|---|---|---|---|---|---|
| 454/Ion Torrent * | 100 | 436,795 | 115,901 | 4.32 | | | | Newbler 2.6 |
| Illumina only | 57 | 460,940 | 255,482 | 4.3 | 53 | 769,812 | 328,660 | Velvet 1.2 |
| 454 only | 32 | 134,546 | 330,116 | 4.3 | 13 | 1,137,876 | 898,466 | Newbler 2.8 |
| Illumina/ 454 Hybrid | 22 | 1,137,625 | 687,076 | 4.3 | 13 | 1,137,625 | 899,926 | Newbler 2.8 |
| PacBio | 1 | 4,352,205 | 4,352,267 | 4.3 | 1 | 4,352,267 | 4,352,267 | SMRT 2.0 |
*Previously published as a 4.5-Mb draft genome [26], but the deposited sequence [GenBank: ASZX00000000.1] is 4,323,309 bp.
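Assembly comparisons of this kind are commonly summarized by N50, and the extracted table appears to report contig and scaffold N50 values, although its columns are unlabeled here. As a minimal sketch of the standard definition (not the code used by Newbler, Velvet or SMRT Analysis), N50 is the length of the contig at which the running total of lengths, sorted from longest to shortest, first reaches half of the assembly size. The toy contig lengths below are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the total is the N50.
        if running >= half_total:
            return length


# Hypothetical toy assembly: total 280 bp, half is 140 bp.
toy_contigs = [100, 80, 50, 30, 20]
print(n50(toy_contigs))

# A single closed contig has an N50 equal to its own length.
print(n50([4_352_267]))
```

For a one-contig assembly the N50, largest contig and assembly size coincide, which is why the PacBio row repeats essentially the same number across columns.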
Figure 1 Comparison of DSM 10061 genome assemblies. The orange ring represents the PacBio assembly. The next inner ring represents the genes encoded on the positive and negative strands, respectively, color-coded by Clusters of Orthologous Groups (COG) categories. The 454/Illumina hybrid assembly and the published draft assembly are represented as yellow and green circles, respectively. The next three rings represent the raw-read coverage from PacBio, 454 and Illumina technology, respectively. Gaps in the 454/Illumina hybrid assembly and the published draft assembly relative to the PacBio assembly are highlighted in red. Key genes in the gap regions are shown by black markers and intergenic regions by gray markers. The phage region and CRISPR repeats are highlighted on the PacBio assembly in blue and yellow, respectively. Details are provided in Table 3. CRISPR, clustered regularly interspaced short palindromic repeats.
Regions of low sequence-coverage
| Locus | Start | End | Annotation | PacBio (×) | 454 (×) | Illumina (×) | 454 contig coverage | Draft contig coverage |
|---|---|---|---|---|---|---|---|---|
| CAETHG_0145 | 156117 | 156914 | Methionine synthase | 87 | 26 | 62 | Complete | Partial |
| CAETHG_0152 | 161167 | 161292 | Hypothetical protein | 94 | 16 | 55 | Complete | Partial |
| CAETHG_0153 | 161313 | 161963 | Dihydropteroate synthase DHPS | 93 | 22 | 46 | Complete | Partial |
| CAETHG_0433 | 472649 | 474331 | Transcriptional regulator, PucR family | 110 | 25 | 57 | Complete | Partial |
| CAETHG_0601 | 661798 | 663339 | Citrate lyase, alpha subunit | 109 | 25 | 64 | Partial | Partial |
| CAETHG_0774 | 832108 | 833028 | SufBD protein | 109 | 23 | 65 | Complete | Partial |
| CAETHG_0814 | 873533 | 874333 | Hypothetical protein | 106 | 23 | 69 | Complete | None |
| CAETHG_0815 | 874375 | 874953 | Hypothetical protein | 102 | 23 | 55 | Complete | None |
| CAETHG_0871 | 940541 | 941353 | 3-dehydroquinate dehydratase | 109 | 27 | 59 | Complete | Partial |
| CAETHG_1053 | 1138010 | 1138912 | Citrate lyase, beta subunit | 106 | 29 | 75 | Complete | None |
| CAETHG_1054 | 1138912 | 1139208 | Citrate lyase acyl carrier protein | 109 | 37 | 70 | Complete | None |
| Intergenic | 1148600 | 1148780 | NA | 131 | 16 | 63 | Complete | None |
| CAETHG_1100 | 1186843 | 1187643 | Hypothetical protein | 118 | 23 | 68 | Complete | None |
| CAETHG_1101 | 1187685 | 1188263 | Hypothetical protein | 105 | 28 | 59 | Complete | None |
| CAETHG_1630 | 1752229 | 1753149 | SufBD protein | 118 | 26 | 79 | Complete | Partial |
| CAETHG_1634 | 1755642 | 1756505 | modD protein | 115 | 22 | 69 | Complete | Partial |
| CAETHG_1708 | 1841018 | 1841572 | Lumazine-binding | 132 | 23 | 66 | Complete | Complete |
| CAETHG_1816 | 1956238 | 1956534 | Microcompartments protein | 138 | 35 | 76 | Complete | Partial |
| CAETHG_1817 | 1956609 | 1956899 | Microcompartments protein | 139 | 19 | 81 | Complete | None |
| CAETHG_1818 | 1956948 | 1957598 | Propanediol utilization protein | 144 | 24 | 74 | Complete | None |
| CAETHG_1819 | 1957600 | 1959153 | Acetaldehyde dehydrogenase (acetylating) | 153 | 25 | 67 | Complete | None |
| CAETHG_1826 | 1963196 | 1964038 | Ethanolamine utilization protein EutJ family protein | 161 | 34 | 73 | Complete | Partial |
| CAETHG_1827 | 1964020 | 1964790 | Hypothetical protein | 162 | 22 | 68 | Complete | Partial |
| CAETHG_1949 | 2079078 | 2080271 | Hypothetical protein | 161 | 30 | 79 | Complete | Partial |
| CAETHG_1963 | 2095013 | 2096206 | Hypothetical protein | 128 | 36 | 97 | Complete | Partial |
| tRNA | 2113813 | 2113886 | tRNA_Met | 128 | 15 | 61 | None | Complete |
| tRNA | 2135117 | 2135189 | tRNA_Met | 132 | 22 | 64 | Complete | None |
| tRNA | 2135201 | 2135286 | tRNA_Leu | 133 | 16 | 59 | Complete | None |
| tRNA | 2135301 | 2135374 | tRNA_Met | 133 | 17 | 57 | Complete | None |
| tRNA | 2135394 | 2136466 | tRNA_Met | 139 | 35 | 74 | Complete | None |
| tRNA | 2135478 | 2135563 | tRNA_Leu | 140 | 30 | 62 | Complete | None |
| tRNA | 2276744 | 2276817 | tRNA_Met | 153 | 28 | 70 | None | Complete |
| tRNA | 2360340 | 2360412 | tRNA_Lys | 122 | 15 | 65 | Complete | Partial |
| CAETHG_2238 | 2397706 | 2397882 | Hypothetical protein | 138 | 23 | 57 | Partial | Complete |
| CAETHG_2268 | 2424703 | 2425503 | Integrase catalytic region | 115 | 26 | 61 | Complete | None |
| CAETHG_2269 | 2425545 | 2426123 | Hypothetical protein | 124 | 26 | 56 | Complete | None |
| Intergenic | 2666300 | 2666515 | NA | 145 | 25 | 69 | Complete | None |
| Intergenic | 2710650 | 2710840 | NA | 124 | 36 | 71 | Complete | None |
| CAETHG_2526 | 2714747 | 2715550 | Hypothetical protein | 133 | 28 | 74 | Complete | Partial |
| Intergenic | 2769840 | 2769880 | NA | 124 | 23 | 67 | Complete | None |
| CAETHG_2620 | 2822788 | 2823741 | Transposase IS66 | 124 | 30 | 59 | Partial | Complete |
| CAETHG_2843 | 3078642 | 3079445 | Dihydropteroate synthase DHPS | 152 | 30 | 66 | Complete | Partial |
| CAETHG_2844 | 3079499 | 3080131 | Hypothetical protein | 148 | 32 | 71 | Complete | Partial |
| CAETHG_2848 | 3085939 | 3086742 | Dihydropteroate synthase DHPS | 146 | 27 | 66 | Complete | Partial |
| CAETHG_2849 | 3086796 | 3087428 | Hypothetical protein | 139 | 31 | 75 | Complete | Partial |
| CAETHG_3037 | 3301321 | 3302088 | MCP methyltransferase, CheR-type | 149 | 23 | 65 | Complete | Partial |
| CAETHG_3075 | 3342748 | 3343524 | Transposase IS66 | 112 | 39 | 74 | Complete | Partial |
| CAETHG_3281 | 3537107 | 3537880 | Hypothetical protein | 109 | 27 | 55 | Complete | Partial |
| CAETHG_3282 | 3537862 | 3538704 | Ethanolamine utilization protein | 107 | 30 | 62 | Complete | None |
| CAETHG_3283 | 3538721 | 3539026 | Microcompartments protein | 103 | 20 | 65 | Complete | None |
| CAETHG_3284 | 3539020 | 3539286 | Ethanolamine utilization protein EutN/carboxysome structural protein CcmL | 106 | 25 | 55 | Complete | None |
| CAETHG_3285 | 3539304 | 3539975 | Ethanolamine utilization EutQ family protein | 110 | 29 | 63 | Complete | None |
| CAETHG_3286 | 3540008 | 3540784 | Microcompartments protein | 106 | 30 | 61 | Complete | None |
| CAETHG_3287 | 3540833 | 3542350 | Acetaldehyde dehydrogenase (acetylating) | 111 | 27 | 61 | Complete | Partial |
| Intergenic | 3848150 | 3848350 | NA | 126 | 34 | 39 | Complete | None |
| CAETHG_4028 | 4315106 | 4316413 | VanW family protein | 98 | 24 | 66 | Complete | Partial |
| CAETHG_4029 | 4316730 | 4319132 | Collagen triple helix repeat-containing protein | 94 | 13 | 38 | Complete | Partial |
| CAETHG_4035 | 4325792 | 4326292 | VanW family protein | 78 | 21 | 54 | Complete | Partial |
The genomic regions that were not fully assembled in the 454 or draft assemblies are listed above. The ‘×’ coverage is the raw-read coverage averaged over the given coordinates. For contig coverage, ‘Complete’ and ‘Partial’ indicate that the region was completely or partially assembled, while ‘None’ indicates that the region is missing from the respective assembly. Regions missing from either the 454 or the draft assembly are shown in bold.
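The two quantities described in this note, coverage averaged over a coordinate range and the Complete/Partial/None contig status, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: coordinates are taken to be 1-based and inclusive, `depth` is a hypothetical per-base read-depth list, and `aligned` is a hypothetical list of intervals where contigs from another assembly align to the PacBio reference.

```python
def mean_depth(depth, start, end):
    """Average per-base depth over a 1-based, inclusive coordinate range."""
    window = depth[start - 1:end]
    return sum(window) / len(window)


def contig_status(region, aligned):
    """Classify a region as 'Complete', 'Partial' or 'None'.

    region  -- (start, end), 1-based inclusive
    aligned -- iterable of (start, end) intervals covered by contigs
    """
    start, end = region
    covered = 0
    next_uncovered = start
    for a_start, a_end in sorted(aligned):
        # Clip each alignment to the region and skip bases already counted.
        lo = max(a_start, next_uncovered)
        hi = min(a_end, end)
        if lo <= hi:
            covered += hi - lo + 1
            next_uncovered = hi + 1
    if covered == 0:
        return "None"
    if covered == end - start + 1:
        return "Complete"
    return "Partial"


# Hypothetical depth track and alignments for illustration only.
print(mean_depth([10, 20, 30, 40], 2, 3))
print(contig_status((156117, 156914), [(150000, 156500)]))
```

The exact thresholds the authors used to call a region partial or missing are not given in the source, so the strict all-or-nothing rule above is an assumption.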
Figure 2 Inferred metabolism of Clostridium autoethanogenum. Capital letters in brown denote enzymes. ATP, adenosine triphosphate; ADP, adenosine diphosphate; BDO, 2,3-butanediol; CO, carbon monoxide; CO2, carbon dioxide; FAD, flavin adenine dinucleotide; FADH2, flavin adenine dinucleotide (reduced); FD_red, ferredoxin (reduced); FD_ox, ferredoxin (oxidized); G3P, 3-phosphoglycerate; GP, glycerone-phosphate; H3PO4, phosphate; NAD, nicotinamide adenine dinucleotide (oxidized); NADH, nicotinamide adenine dinucleotide (reduced); NADP, nicotinamide adenine dinucleotide phosphate (oxidized); NADPH, nicotinamide adenine dinucleotide phosphate (reduced); TCA, tricarboxylic acid cycle. Note that reaction directionality has not been rigorously determined; in general, directionality is as reported in KEGG reactions.
Acetyl-CoA (Wood-Ljungdahl) pathway – Reductive branch. W1 Bifunctional CO dehydrogenase/Acetyl-CoA synthase (CODH/ACS) CAETHG_1620-21, 1608-11, W2 Seleno formate dehydrogenase (Fdh) CAETHG_0084, 2789, W3 Non-seleno formate dehydrogenase (Fdh) CAETHG_2988, W4 Formyl-THF ligase (Fhs) CAETHG_1618, W5 Methenyl-THF cyclohydrolase (FchA) CAETHG_1617, W6 Methylene-THF dehydrogenase (FolD) CAETHG_1616, W7 Methylene-THF reductase (MetF) CAETHG_1614-15.
Acetyl-CoA (Wood-Ljungdahl) pathway – Oxidative branch. C Monofunctional CO dehydrogenase CAETHG_3899, 3005, H1 Electron-bifurcating [FeFe] Hydrogenase (HytCBDE1AE2) CAETHG_2798, H2 Other [FeFe] hydrogenases (Hyd) CAETHG_0110, 0120, 1576, 3569, 3841, H3 [NiFe] hydrogenase (Hyd) CAETHG_0862, H4 Hydrogenase maturation factor (HypEDCF) CAETHG_0368-0371.
Energy conservation. A F1FO ATPase (AtpIBEFHAGDC) CAETHG_2342-50, N Electron-bifurcating NADH-dependent Fd:NADP oxidoreductase (Nfn) CAETHG_1580, R Rnf complex (RnfCDGEAB) CAETHG_3227-32.
Acetate fermentation pathway. Ac1 Phosphotransacetylase (Pta) CAETHG_3358, Ac2 Acetate kinase (Ack) CAETHG_3359.
Ethanol fermentation pathway. E1 Bifunctional aldehyde/alcohol dehydrogenase (AdhE) CAETHG_3747, 3748, E2 Aldehyde:Fd oxidoreductase (AOR) CAETHG_0092, 0102, E3 Additional alcohol dehydrogenases (Adh) CAETHG_0555.
2,3-butanediol fermentation pathway. B1 Acetolactate synthase (AlsS) CAETHG_0124-25, 0406, 1740, B2 Acetolactate decarboxylase (BudA) CAETHG_2932, B3 2,3-butanediol dehydrogenase (Bdh) CAETHG_0385.
Lactate fermentation pathway. L Lactate dehydrogenase (Ldh) CAETHG_1147.
Central pyruvate metabolism. P1 Pyruvate:ferredoxin oxidoreductase (PFOR) CAETHG_0928, 3029, P2 Pyruvate, phosphate dikinase (PPDK) CAETHG_2055, 2909, P3 Pyruvate kinase (Pk) CAETHG_2440-41, P4 Pyruvate carboxylase (Pyc) CAETHG_1594, P5 PEP carboxykinase (PEPCK) CAETHG_2721, P5 Malic enzyme CAETHG_0605, 1055.
Incomplete TCA cycle. T1 Citrate synthase CAETHG_2751, T2 Citrate lyase CAETHG_1052-54, 1898–1901, 2480-83, T3 Aconitase (Aco) CAETHG_1051, 2752, T4 Isocitrate dehydrogenase (Idh) CAETHG_2753, T5 Malate dehydrogenase (Mdh) CAETHG_1702, 2478, 2689, T6 Fumarase CAETHG_1902-03, 2062, 2479, T7 Fumarate reductase CAETHG_0344, 1032, 2961.
Glycolysis/Gluconeogenesis. PTS Fructose phosphotransferase system (PTS) CAETHG_0142, 0676-73, G1 Fructokinase (Fk)/Fructose-6-phosphate isomerase CAETHG_0166, 0156, G2 1-phosphofructokinase (Pfk1) CAETHG_0143, G3 6-phosphofructokinase (Pfk6) CAETHG_648, 2439, G4 Fructose bisphosphate aldolase (Aldo) CAETHG_2382, G5 Triose-phosphate isomerase (Tpi) CAETHG_1758, G6 Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) CAETHG_1760, 3424, G7 Phosphoglycerate kinase (Pgk) CAETHG_1759, G8 Phosphoglycerate mutase (Pgm) CAETHG_712, 1757, G9 Enolase phosphopyruvate hydratase (Eno) CAETHG_1756.
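The caption writes locus tags in a compressed form, such as CAETHG_1620-21, 1608-11 or CAETHG_3227-32. The sketch below shows one way to expand that shorthand into full locus tags for lookup in an annotation file. It rests on assumptions not stated in the source: that a shortened range end replaces the trailing digits of the start, that ranges are inclusive, and that tags are zero-padded to four digits. The helper `expand_locus_tags` is hypothetical.

```python
import re


def expand_locus_tags(spec, prefix="CAETHG_", width=4):
    """Expand e.g. 'CAETHG_1620-21, 1608-11' into a list of full locus tags."""
    tags = []
    for token in spec.split(","):
        token = token.strip().removeprefix(prefix)  # requires Python 3.9+
        if not token:
            continue
        # Accept both a hyphen and an en dash as the range separator.
        parts = re.split(r"[-\u2013]", token)
        start = parts[0]
        end = parts[1] if len(parts) > 1 else start
        # Assumption: a short end such as '21' replaces the last digits of '1620'.
        if len(end) < len(start):
            end = start[: len(start) - len(end)] + end
        # Sorting handles ranges written in descending order, e.g. '0676-73'.
        low, high = sorted((int(start), int(end)))
        tags.extend(f"{prefix}{n:0{width}d}" for n in range(low, high + 1))
    return tags


print(expand_locus_tags("CAETHG_1620-21, 1608-11"))
print(expand_locus_tags("CAETHG_3227-32"))
```

Tags written without a leading zero in the caption (for example CAETHG_648) are padded to four digits by this helper, which is itself an assumption about the intended identifier.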
Overview of CRISPR systems, plasmids and prophages in fuel-producing species
| Solventogenic (ABE) Clostridia | 3.94 | complete | [ | 1 | 192 | 3 | 191 | - | - | |
| 3.94 | complete | [ | 1 | 192 | 3 | 191 | - | - | ||
| 3.94 | complete | [ | 2 | 201 | 3 | 191 | - | - | ||
| 6.00 | complete | - | - | - | 4 | 106 | - | - | ||
| 5.10 | complete | [ | - | - | 5 | 133 | 4 | 55 | ||
| 6.53 | complete | [ | 1 | 136 | 6 | 161 | 4 | 67 | ||
| Cellulolytic Clostridia | 4.07 | complete | - | - | - | 8 | 210 | 3 | 23 | |
| 5.26 | complete | [ | - | - | 5 | 179 | 3 | 44 | ||
| 3.84 | complete | [ | - | - | 5 | 222 | 5 | 442 | ||
| 3.56 | complete | [ | - | - | 1 | 26 | 5 | 189 | ||
| 4.85 | complete | - | - | - | 1 | 28 | - | - | ||
| Acetogenic Clostridia | 4.35 | complete | this study | - | - | 4 | 115 | 3 | 95 | |
| 4.63 | complete | [ | - | - | 6 | 248 | - | - | ||
| 5.59 | 251 contigs | - | 1 | 20 | 1 | 19 | - | - | ||
| 4.40 | 69 contigs | [ | 1 | 20 | - | - | - | - |
*Please refer to Additional file 10: Table S9 for details.