Kristen M Deangelis, Patrik D'Haeseleer, Dylan Chivian, Blake Simmons, Adam P Arkin, Konstantinos Mavromatis, Stephanie Malfatti, Susannah Tringe, Terry C Hazen.
Abstract
Tropical forest soils decompose litter rapidly with frequent episodes of anoxia, making it likely that bacteria using alternate terminal electron acceptors (TEAs) such as iron play a large role in supporting decomposition under these conditions. The prevalence of many types of metabolism in litter deconstruction makes these soils useful templates for improving biofuel production. To investigate how iron availability affects decomposition, we cultivated feedstock-adapted consortia (FACs) derived from iron-rich tropical forest soils accustomed to experiencing frequent episodes of anaerobic conditions and frequently fluctuating redox. One consortium was propagated under fermenting conditions, with switchgrass as the sole carbon source in minimal media (SG only FACs), and the other consortium was treated the same way but received poorly crystalline iron as an additional terminal electron acceptor (SG + Fe FACs). We sequenced the metagenomes of both consortia to a depth of about 150 Mb each, resulting in a coverage of 26× for the more diverse SG + Fe FACs, and 81× for the relatively less diverse SG only FACs. Both consortia were able to quickly grow on switchgrass, and the iron-amended consortium exhibited significantly higher microbial diversity than the unamended consortium. We found evidence of higher stress in the unamended FACs and increased sugar transport and utilization in the iron-amended FACs. This work provides metagenomic evidence that supplementation of alternative TEAs may improve feedstock deconstruction in biofuel production.
Keywords: Anaerobic decomposition; Panicum virgatum; archaea; bacteria; feedstock-adapted consortia; metagenomics; switchgrass; tropical forest soil
Year: 2013 PMID: 24019987 PMCID: PMC3764933 DOI: 10.4056/sigs.3377516
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Classification and general features of the four metagenome data sets according to the Minimum Information about Genomes and Metagenomes (MIMS) standards [13].
| MIGS ID | Property | Term | Evidence code |
|---|---|---|---|
| | Current classification | Metagenome ecological | TAS [ |
| | Carbon source | Switchgrass | IDA |
| | Energy source | Switchgrass | IDA |
| | Terminal electron receptor | Iron reduction or fermentation | TAS [ |
| MIGS-6 | Habitat | Consortia (mixed community) derived from wet tropical forest soils | TAS [ |
| MIGS-14 | Pathogenicity | none | NAS |
| MIGS-4 | Geographic location | Wet tropical forest, Puerto Rico, USA | |
| MIGS-5 | Sample collection time | April, 2009 | |
| MIGS-4.1 | Latitude | 18°18′N | |
| MIGS-4.2 | Longitude | 65°50′W | |
| MIGS-4.3 | Depth | 0-10 cm | TAS [ |
| MIGS-4.4 | Altitude | 250 masl | TAS [ |
a Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [14].
Project information
| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | Standard Draft |
| MIGS-28 | Libraries used | Illumina standard paired-end library (0.3 kb insert size) |
| MIGS-29 | Sequencing platforms | Illumina GAIIx |
| MIGS-31.2 | Fold coverage | 26.1466 × (PR soil-derived FAC SG + Fe) |
| MIGS-30 | Assemblers | SOAPdenovo v1.05, Newbler v2.5, minimus2 |
| MIGS-32 | Gene calling method | Glimmer |
| | GOLD ID | Gm00278 |
| | IMG Project ID | 18182 |
| | Project relevance | biotechnological |
Summary of metagenomes
| Metagenome | Size (Mb) | Scaffolds | GOLD ID | |
|---|---|---|---|---|
| SG only | 152.66 | 57,147 | Gm00278 | Gs0000888 |
| SG + Fe | 154.12 | 65,160 | Gm00278 | Gs0000889 |
Nucleotide content and gene count levels of the metagenomes
| Attribute | SG only | | SG + Fe | |
|---|---|---|---|---|
| | Number | % of Total | Number | % of Total |
| DNA, total number of bases | 152,660,070 | 100.00% | 154,120,208 | 100.00% |
| DNA coding number of bases | 130,438,005 | 85.44% | 136,080,382 | 88.29% |
| DNA G+C number of bases | 62,858,797 | 41.18%* | 70,930,796 | 46.02%* |
| DNA scaffolds | 57,147 | 100.00% | 65,160 | 100.00% |
| CRISPR Count | 51 | | 4 | |
| Genes total number | 197,271 | 100.00% | 193,491 | 100.00% |
| Protein coding genes | 195,006 | 98.85% | 192,751 | 99.62% |
| RNA genes | 2,265 | 1.15% | 740 | 0.38% |
| rRNA genes | 294 | 0.15% | 16 | 0.01% |
| 5S rRNA | 106 | 0.05% | 9 | 0.00% |
| 16S rRNA | 75 | 0.04% | 3 | 0.00% |
| 18S rRNA | 1 | 0.00% | 4 | 0.00% |
| tRNA genes | 1,971 | 1.00% | 724 | 0.37% |
| Protein coding genes with function prediction | 127,406 | 64.58% | 129,389 | 66.87% |
| without function prediction | 67,600 | 34.27% | 63,362 | 32.75% |
| not connected to SwissProt Protein Product | 195,006 | 98.85% | 192,751 | 99.62% |
| Protein coding genes with enzymes | 33,383 | 16.92% | 30,632 | 15.83% |
| w/o enzymes but with candidate KO based enzymes | 26,793 | 13.58% | 32,919 | 17.01% |
| Protein coding genes connected to KEGG pathways | 37,533 | 19.03% | 34,348 | 17.75% |
| not connected to KEGG pathways | 157,473 | 79.83% | 158,403 | 81.87% |
| Protein coding genes connected to KEGG Orthology (KO) | 63,949 | 32.42% | 57,111 | 29.52% |
| not connected to KEGG Orthology (KO) | 131,057 | 66.44% | 135,640 | 70.10% |
| Protein coding genes connected to MetaCyc pathways | 32,243 | 16.34% | 29,552 | 15.27% |
| not connected to MetaCyc pathways | 162,763 | 82.51% | 163,199 | 84.34% |
| Protein coding genes with COGs | 121,020 | 61.35% | 123,077 | 63.61% |
| with Pfam | 115,645 | 58.62% | 118,589 | 61.29% |
| with TIGRfam | 33,743 | 17.10% | 33,969 | 17.56% |
| in internal clusters | 77,655 | 39.36% | 73,856 | 38.17% |
| Protein coding genes coding signal peptides | 48,556 | 24.61% | 49,644 | 25.66% |
| Protein coding genes coding transmembrane proteins | 43,693 | 22.15% | 43,726 | 22.60% |
| COG clusters | 4,125 | 84.65% | 3,974 | 81.55% |
| KOG clusters | 0 | 0.00% | 0 | 0.00% |
| Pfam clusters | 4,447 | 37.33% | 4,293 | 36.04% |
| TIGRfam clusters | 2,580 | 64.13% | 2,489 | 61.87% |
* GC percentage shown as count of G's and C's divided by a total number of G's, C's, A's, and T's. This is not necessarily synonymous with the total number of bases.
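The GC convention described in the footnote can be sketched as follows. This is a minimal illustration, not the pipeline used to produce the table; the function name `gc_fraction` and the example sequence are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Return (G + C) / (A + C + G + T), ignoring ambiguous bases such as N."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / acgt


# Hypothetical 10-base sequence with two ambiguous positions (N):
# 6 G/C bases out of 8 unambiguous bases.
print(f"{gc_fraction('ACGTNNGGCC'):.4f}")  # prints 0.7500
```

Because ambiguous bases are left out of the denominator, this value can differ from G+C divided by the total scaffold length, which is the distinction the footnote draws.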
Number of genes associated with the 25 general COG functional categories.
| ID | Name | SG only | SG + Fe | R | P-value |
|---|---|---|---|---|---|
| J | Translation, ribosomal structure and biogenesis | 6,659 | 6,055 | 4.57 | *** 0.000 |
| A | RNA processing and modification | 21 | 22 | 0 | |
| K | RNA processing and modification | 21 | 22 | 0 | |
| L | Replication, recombination and repair | 6,248 | 6,103 | 1.36 | 0.086 |
| B | Chromatin structure and dynamics | 49 | 35 | 0.21 | |
| D | Cell cycle control, cell division, chromosome partitioning | 1,457 | 1,68 | 1.35 | 0.089 |
| Y | Nuclear structure | - | - | - | - |
| V | Defense mechanisms | 2,884 | 3,232 | -2.11 | * 0.018 |
| T | Signal transduction mechanisms | 10,430 | 10,271 | 2.01 | * 0.022 |
| M | Cell wall/membrane/envelope biogenesis | 8,396 | 8,753 | -1.67 | * 0.047 |
| N | Cell motility | 3,150 | 3,146 | -1.2 | |
| Z | Cytoskeleton | 39 | 43 | -0.29 | |
| W | Extracellular structures | 2 | 1 | 0 | |
| U | Intracellular trafficking, secretion, and vesicular transport | 2,438 | 2,525 | -0.56 | |
| O | Posttranslational modification, protein turnover, chaperones | 3,893 | 3,914 | 0.14 | |
| C | Energy production and conversion | 8,221 | 8,426 | 0.08 | |
| G | Carbohydrate transport and metabolism | 13,038 | 14,361 | -3.69 | *** 0.000 |
| E | Amino acid transport and metabolism | 9,571 | 10,682 | -5.33 | *** 0.000 |
| F | Nucleotide transport and metabolism | 2,808 | 3,022 | -2.21 | * 0.014 |
| H | Coenzyme transport and metabolism | 5,193 | 5,080 | 2.17 | * 0.015 |
| I | Lipid transport and metabolism | 3,034 | 3,375 | -1.9 | * 0.029 |
| P | Inorganic ion transport and metabolism | 5,914 | 6,171 | 0.12 | |
| Q | Secondary metabolites biosynthesis, transport and catabolism | 1,608 | 1,916 | -2.51 | ** 0.006 |
| R | General function prediction only | 15,442 | 15,796 | -0.87 | |
| S | Function unknown | 10,667 | 10,106 | 0.6 | |
a P-value symbols denote * P<0.05, ** P<0.01, *** P<0.001, and n.s. indicates not significant.
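The tabulated P-values appear consistent with a one-sided standard-normal tail probability of the R statistic. The sketch below works under that assumption only; the source does not state how R was computed, and `one_sided_p` is a hypothetical helper name.

```python
import math


def one_sided_p(r: float) -> float:
    """Upper-tail standard-normal probability for |r| (assumed model for R)."""
    return 0.5 * math.erfc(abs(r) / math.sqrt(2.0))


# R values copied from the COG table (categories L, V and Q).
for r in (1.36, -2.11, -2.51):
    print(r, round(one_sided_p(r), 3))
```

Under this assumption the computed values agree with the table to within the rounding of R, for example R = -2.51 for category Q gives approximately 0.006.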
Figure 1. Phyla that are at least 2-fold differentially represented in one metagenome compared to the other, and had greater than one representative detected. Phyla with gene counts over-represented in the iron-amended consortium (SG + Fe) are colored brown, while phyla with gene counts over-represented in the unamended consortium (SG only) are colored light green.
Overview of taxonomic diversity in metagenomes.
| Domain | Phylum | SG only Count | % | SG + Fe Count | % |
|---|---|---|---|---|---|
| | | 575 | 0.29 | 88 | 0.05 |
| | | 5,089 | 2.58 | 2,590 | 1.34 |
| | | 1 | 0 | 0 | 0 |
| | | 478 | 0.24 | 2,392 | 1.24 |
| | | 1,113 | 0.56 | 2,756 | 1.42 |
| | | 100 | 0.05 | 99 | 0.05 |
| | | 14,680 | 7.44 | 14,937 | 7.72 |
| | | 16 | 0.01 | 35 | 0.02 |
| | | 460 | 0.23 | 530 | 0.27 |
| | | 1,073 | 0.54 | 2,042 | 1.06 |
| | | 13 | 0.01 | 18 | 0.01 |
| | | 742 | 0.38 | 823 | 0.43 |
| | | 117 | 0.06 | 121 | 0.06 |
| | | 158 | 0.08 | 275 | 0.14 |
| | | 132 | 0.07 | 144 | 0.07 |
| | | 17 | 0.01 | 35 | 0.02 |
| | | 30 | 0.02 | 36 | 0.02 |
| | | 38,958 | 19.75 | 44,858 | 23.18 |
| | | 533 | 0.27 | 534 | 0.28 |
| | | 11 | 0.01 | 29 | 0.01 |
| | | 113 | 0.06 | 165 | 0.09 |
| | | 70 | 0.04 | 80 | 0.04 |
| | | 375 | 0.19 | 448 | 0.23 |
| | | 11,289 | 5.72 | 11,803 | 6.1 |
| | | 2,460 | 1.25 | 3,795 | 1.96 |
| | | 314 | 0.16 | 460 | 0.24 |
| | | 30 | 0.02 | 34 | 0.02 |
| | | 36 | 0.02 | 43 | 0.02 |
| | | 330 | 0.17 | 414 | 0.21 |
| | | 450 | 0.23 | 611 | 0.32 |
| | | 25 | 0.01 | 19 | 0.01 |
| | | 32 | 0.02 | 30 | 0.02 |
| | | 64 | 0.03 | 59 | 0.03 |
| | | 7 | 0 | 8 | 0 |
| | | 8 | 0 | 5 | 0 |
| | | 5 | 0 | 5 | 0 |
| | | 32 | 0.02 | 29 | 0.01 |
| | | 1 | 0 | 0 | 0 |
| | | 7 | 0 | 3 | 0 |
| | | 73 | 0.04 | 66 | 0.03 |
| Plasmid: | | | | | |
| | | 5 | 0 | 2 | 0 |
| | | 4 | 0 | 4 | 0 |
| Plasmid: | | | | | |
| | | 3 | 0 | 3 | 0 |
| | | 2 | 0 | 1 | 0 |
| | | 23 | 0.01 | 32 | 0.02 |
| | | 32 | 0.02 | 29 | 0.01 |
| Viruses | | | | | |
| | ds DNA viruses, no RNA stage | 113 | 0.06 | 97 | 0.05 |
| | ss DNA viruses | 1 | 0 | 0 | 0 |
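The selection rule stated in the Figure 1 caption (at least 2-fold difference, more than one representative) can be sketched as a simple filter over count pairs from this table. The function name and the reading of "greater than one representative" as a minimum of two genes in the larger count are assumptions. Raw counts are compared directly, which is only reasonable here because the two metagenomes have similar total gene counts.

```python
def differentially_represented(count_a: int, count_b: int,
                               fold: float = 2.0, min_count: int = 2) -> bool:
    """True if the larger count is >= `fold` times the smaller one and the
    larger count reaches `min_count` (assumed reading of the caption)."""
    hi, lo = max(count_a, count_b), min(count_a, count_b)
    return hi >= min_count and hi >= fold * lo


# (SG only, SG + Fe) gene-count pairs copied from rows of the table above.
rows = [(575, 88), (478, 2392), (14680, 14937), (1, 0)]
for sg_only, sg_fe in rows:
    print(sg_only, sg_fe, differentially_represented(sg_only, sg_fe))
```

With these inputs the first two pairs pass the filter, the third differs by far less than 2-fold, and the last is excluded because only a single gene was detected.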
Figure 2. Network analysis of feedstock-adapted consortia grown on switchgrass only (SG only), SG plus iron oxides (FeOx), SG plus nitrate (NO₃⁻), or SG plus sulfate (SO₄²⁻). Each point represents one taxon, and the size of the point corresponds to the number of connections (edges) associated with the taxon. Edges (grey lines) indicate a minimum correlation of Pearson r = 0.9 as well as statistical significance (P<0.01). On the left, taxa are colored by taxonomy according to their assigned phylum; on the right, taxa are colored based on whether they are generalists (present in all four treatments) or specialists (present in one treatment only and absent in the rest).
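The edge rule in the Figure 2 caption can be illustrated with a small correlation network. This is a sketch under stated assumptions: the taxon names and abundance values are hypothetical, only the r ≥ 0.9 threshold is applied (the P<0.01 significance filter is omitted), and it is not the software used to draw the figure.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile has no defined correlation
    return sxy / math.sqrt(sxx * syy)


def build_edges(abundance, r_min=0.9):
    """Connect every taxon pair whose abundance profiles reach r >= r_min."""
    return [(a, b) for a, b in combinations(sorted(abundance), 2)
            if pearson_r(abundance[a], abundance[b]) >= r_min]


def degrees(edges):
    """Number of edges per taxon; this is what sets point size in the figure."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg


# Hypothetical abundances of four taxa across four samples.
abundance = {
    "taxon_A": [1, 2, 3, 4],
    "taxon_B": [2, 4, 6, 8],
    "taxon_C": [4, 3, 2, 1],
    "taxon_D": [1, 3, 2, 5],
}
edges = build_edges(abundance)
print(edges)
print(degrees(edges))
```

In this toy data only taxon_A and taxon_B are linked: taxon_C is perfectly anti-correlated with them, and taxon_D correlates at about 0.83, below the threshold.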
Report of pfams that were significantly enriched†
| ID | SG only | R | P-value | SG + Fe | Description | |
|---|---|---|---|---|---|---|
| Enriched in SG+Fe FACs | ||||||
| pfam00005 | 2,409 | -7.38 | 8.10e-14 | 3,025 | ABC transporter | |
| pfam00528 | 1,817 | -7.36 | 9.02e-14 | 2,348 | Bacterial binding protein-dependent transport systems | |
| pfam02653 | 394 | -6.24 | 2.23e-10 | 605 | Bacterial binding protein-dependent transport systems | |
| pfam00106 | 420 | -4.88 | 5.38e-07 | 589 | short-chain dehydrogenases/reductases family | |
| pfam01979 | 163 | -5.55 | 1.44e-08 | 287 | large metal dependent hydrolase superfamily | |
| pfam08352 | 99 | -6.43 | 6.20e-11 | 218 | C-terminus of oligopeptide ABC transporter ATP binding proteins | |
| pfam02894 | 146 | -4.1 | 2.03e-05 | 231 | Oxidoreductase family, C-terminal alpha/beta domain | |
| pfam02782 | 113 | -4.33 | 7.56e-06 | 193 | FGGY carbohydrate kinase family | |
| pfam00395 | 130 | -3.98 | 3.39e-05 | 208 | S-layer homology domain | |
| pfam01266 | 93 | -3.77 | 8.17e-05 | 156 | FAD dependent oxidoreductase family | |
| pfam02801 | 47 | -4.13 | 1.79e-05 | 99 | Beta-ketoacyl-ACP synthase (fatty acid synthesis) | |
| pfam00404 | 2 | -6.02 | 8.86e-10 | 43 | Dockerin: protein domain in cellulosome cellular structure | |
| pfam01799 | 24 | -3.71 | 1.02e-04 | 59 | [2Fe-2S] binding domain | |
| pfam03632 | 19 | -3.8 | 7.32e-05 | 52 | glycoside hydrolase family 65 | |
| pfam08659 | 1 | -5.41 | 3.23e-08 | 33 | polyketide synthase domain, catalyses the first step in the reductive modification of the beta-carbonyl centers in the growing polyketide chain | |
| Enriched in SG only FACs | ||||||
| pfam00990 | 697 | 5.95 | 1.33e-09 | 508 | GGDEF domain, cyclic di-GMP synthesis involved in intracellular signaling | |
| pfam03466 | 499 | 6.29 | 1.60e-10 | 330 | LysR substrate binding domain, similar to periplasmic binding protein | |
| pfam00126 | 476 | 5.71 | 5.71e-09 | 326 | Helix-turn-helix DNA binding domain | |
| pfam00583 | 1,048 | 3.88 | 5.26e-05 | 905 | Acetyltransferase (or transacetylase) | |
| pfam00989 | 569 | 3.89 | 4.92e-05 | 459 | PAS domain, signal sensor | |
| pfam00563 | 262 | 5.43 | 2.89e-08 | 157 | EAL domain, possible diguanylate phosphodiesterase with metal-binding site | |
| pfam01473 | 135 | 8.26 | 1.11e-16 | 31 | Putative cell wall binding repeat | |
| pfam02311 | 405 | 3.78 | 7.82e-05 | 314 | Arabinose-binding and dimerization domain of the AraC regulatory protein | |
| pfam00665 | 201 | 4.53 | 2.94e-06 | 124 | Retroviral integrase | |
| pfam02378 | 145 | 4.25 | 1.08e-05 | 84 | Phosphotransferase system, EIIC, part of a sugar-specific permease system | |
| pfam00801 | 163 | 3.71 | 1.03e-04 | 106 | Polycystic-kidney disease domain, usually involved in mediating protein-protein interactions | |
| pfam01797 | 119 | 3.84 | 6.07e-05 | 69 | Transposase IS200, for transposition of insertion elements | |
| pfam01609 | 119 | 3.76 | 8.44e-05 | 70 | Transposase DDE domain, for transposition of insertion elements | |
| pfam01011 | 75 | 4.54 | 2.83e-06 | 30 | beta propeller, found in several enzymes which utilize pyrrolo-quinoline quinone as a prosthetic group | |
| pfam00367 | 80 | 4.35 | 6.80e-06 | 35 | phosphotransferase system, EIIB | |
| pfam02302 | 105 | 3.69 | 1.13e-04 | 60 | phosphoenolpyruvate: sugar phosphotransferase system (PTS) system, Lactose/Cellobiose specific IIB subunit | |
| pfam09681 | 47 | 5.93 | 1.53e-09 | 5 | N-terminal phage replisome organizer, origin of phage replication | |
| pfam01978 | 83 | 3.83 | 6.47e-05 | 42 | sugar-specific transcriptional regulator of the trehalose/maltose ABC transporter | |
| pfam03143 | 52 | 3.75 | 8.81e-05 | 21 | GTP-binding elongation factor family | |
| pfam09820 | 35 | 4.41 | 5.09e-06 | 7 | predicted AAA-ATPase domain | |
| pfam08350 | 33 | 4 | 3.22e-05 | 8 | domain of unknown function, so far found only at the C-terminus of archaeal proteins | |
| pfam08495 | 36 | 3.74 | 9.02e-05 | 11 | FIST N domain: novel sensory domain present in signal transduction proteins | |
| pfam09373 | 22 | 4.45 | 4.34e-06 | 1 | Pseudomurein-binding repeat, pseudomurein being a cell-wall structure | |
| pfam08004 | 18 | 3.96 | 3.71e-05 | 1 | domain of unknown function, so far found only among archaeal proteins | |
†in either the iron-amended consortia (SG + Fe FAC, upper half of table) or the unamended feedstock-adapted consortia (SG only FAC, lower half of table).
Figure 3. Illustration of pfams that were differentially represented in SG only compared to SG + Fe. On the left, pfams are listed for the consortium grown in switchgrass only with no iron (SG only), and on the right, pfams are listed for the consortium grown in switchgrass with iron (SG + Fe). This illustration is based on data from Table 7.
Count of genes in COGs that bear protein sequence homology to target lignocellulolytic genes of interest.
| COG ID | SG only | SG + Fe | Description |
|---|---|---|---|
| COG1472 | 11 | 23 | Beta-glucosidase-related glycosidases |
| COG3250 | 11 | 7 | Beta-galactosidase/beta-glucuronidase |
| COG5001 | 9 | 9 | Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain |
| COG1028 | 4 | 33 | Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) |
| COG0300 | 3 | 2 | Short-chain dehydrogenases of various substrate specificities |
| COG4221 | 3 | 2 | Short-chain alcohol dehydrogenase of unknown specificity |
| COG1012 | 2 | 11 | NAD-dependent aldehyde dehydrogenases |
| COG0677 | 2 | 3 | UDP-N-acetyl-D-mannosaminuronate dehydrogenase |
| COG3384 | 2 | 3 | Uncharacterized conserved protein |
| COG0280 | 1 | 3 | Phosphotransacetylase |
| COG1344 | 1 | 3 | Flagellin and related hook-associated proteins |
| COG3325 | 1 | 2 | Chitinase |
| COG0277 | 1 | 1 | FAD/FMN-containing dehydrogenases |
| COG0179 | 1 | | 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) |
| COG1874 | 1 | | Beta-galactosidase |
| COG2132 | 1 | | Putative multicopper oxidases |
| COG1129 | | 20 | ABC-type sugar transport system, ATPase component |
| COG2723 | | 11 | Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase |
| COG0411 | | 9 | ABC-type branched-chain amino acid transport systems, ATPase component |
| COG0673 | | 8 | Predicted dehydrogenases and related proteins |
| COG0036 | | 4 | Pentose-5-phosphate-3-epimerase |
| COG1455 | | 4 | Phosphotransferase system cellobiose-specific component IIC |
| COG1486 | | 4 | Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases |
| COG2200 | | 4 | FOG: EAL domain |
| COG0366 | | 3 | Glycosidases |
| COG1004 | | 3 | Predicted UDP-glucose 6-dehydrogenase |
| COG3842 | | 3 | ABC-type spermidine/putrescine transport systems, ATPase components |
| COG3845 | | 3 | ABC-type uncharacterized transport systems, ATPase components |
| COG0435 | | 2 | Predicted glutathione S-transferase |
| COG0583 | | 2 | Transcriptional regulator |
| COG0812 | | 2 | UDP-N-acetylmuramate dehydrogenase |
| COG3836 | | 2 | 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase |
| COG4213 | | 2 | ABC-type xylose transport system, periplasmic component |
| COG4214 | | 2 | ABC-type xylose transport system, permease component |
| COG0376 | | 1 | Catalase (peroxidase I) |
| COG0410 | | 1 | ABC-type branched-chain amino acid transport systems, ATPase component |
| COG1640 | | 1 | 4-alpha-glucanotransferase |
| COG1921 | | 1 | Selenocysteine synthase [seryl-tRNASer selenium transferase] |
| COG1960 | | 1 | Acyl-CoA dehydrogenases |
| COG2368 | | 1 | Aromatic ring hydroxylase |
| COG2373 | | 1 | Large extracellular alpha-helical protein |
| Total Result | 54 | | |