Beltran Rodriguez-Brito, Forest Rohwer, Robert A Edwards.
Abstract
BACKGROUND: Metagenomics, the sequence analysis of genomic DNA isolated directly from the environment, can be used to identify organisms and model the community dynamics of a particular ecosystem. Metagenomics can also reveal significant differences in metabolic potential between environments.
Year: 2006 PMID: 16549025 PMCID: PMC1473205 DOI: 10.1186/1471-2105-7-162
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Number of genomes and protein encoding genes in the SEED database at the time of analysis. The two environmental samples are the Sargasso Sea and Acid Mine Drainage metagenomes.
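| Kingdom or source | Genomes | Protein-encoding genes | % of protein-encoding genes |
| --- | --- | --- | --- |
| Archaea | 37 | 61,709 | 2 |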
| Bacteria | 550 | 1,187,180 | 44 |
| Eukarya | 556 | 482,760 | 18 |
| Environmental Samples | (2) | 968,149 | 36 |
Figure 1. Effect of sample size on identifying differences between phylosubsystems. The red lines represent the number of phylosubsystems overrepresented in the Sargasso Sea dataset. The blue lines represent the number of phylosubsystems overrepresented in the SEED dataset. Three different confidence levels (90, 95, and 99%) are plotted.
Phylosubsystems that are overrepresented in the AMD dataset versus the SEED dataset with 99% confidence at a sample size of 145,000 proteins.
| No. | Subsystem | Classification | Kingdom |
| --- | --- | --- | --- |
| 1 | Arginine degradation | Amino Acids and Derivatives | Archaea |
| 2 | Chorismate Synthesis | Amino Acids and Derivatives | Archaea |
| 3 | Histidine Degradation | Amino Acids and Derivatives | Archaea |
| 4 | Leucine Biosynthesis | Amino Acids and Derivatives | Archaea |
| 5 | Calvin-Benson cycle | Carbohydrates | Archaea |
| 6 | Embden-Meyerhof and Gluconeogenesis | Carbohydrates | Archaea |
| 7 | Methylcitrate cycle | Carbohydrates | Archaea |
| 8 | Riboflavin metabolism | Cofactors, Vitamins, Prosthetic Groups, Pigments | Archaea |
| 9 | Conserved tRNAs | Experimental Subsystems | Archaea |
| 10 | Fatty acid metabolism | Fatty Acids and Lipids | Archaea |
| 11 | Fatty acid oxidation pathway | Fatty Acids and Lipids | Archaea |
| 12 | | Nucleosides and Nucleotides | Archaea |
| 13 | | Nucleosides and Nucleotides | Archaea |
| 14 | Pyrimidine conversions | Nucleosides and Nucleotides | Archaea |
| 15 | Ribosome LSU (eukaryotic and archaeal) | Protein Metabolism | Archaea |
| 16 | Ribosome SSU (eukaryotic and archaeal) | Protein Metabolism | Archaea |
| 17 | Translation initiation factors (eukaryotic and archaeal) | Protein Metabolism | Archaea |
| 18 | tRNA aminoacylation | RNA metabolism | Bacteria |
| 19 | TTSS transporters | Virulence | Bacteria |
Figure 2. Significantly different subsystems between the Sargasso Sea and SEED datasets. (A) Each subsystem was bootstrapped with between 10 and 400 samples per bootstrap, and subsystems that are significantly different at 99% confidence with 2,000 bootstraps are highlighted. Subsystems that are significantly more prevalent in the SEED database are colored blue, and those that are significantly more prevalent in the Sargasso Sea dataset are colored red. (B) Magnified view of several subsystems. Subsystems from amino acid synthesis, carbohydrate utilization, cofactor synthesis, fatty acids, nucleotide synthesis, and photosynthesis are shown in more detail. Colors are as described for (A). E = eukaryotic subsystem, B = bacterial subsystem, A = archaeal subsystem.
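The caption gives the bootstrap settings (10 to 400 samples per bootstrap, 2,000 bootstraps, 99% confidence) but not the resampling procedure itself. The following is a minimal sketch of one way such a comparison could be set up, assuming a percentile interval on the difference in subsystem hits between two datasets. The function names, the frequencies, and the percentile-interval criterion are illustrative assumptions, not the authors' implementation.

```python
import random


def bootstrap_difference(freq_a, freq_b, sample_size,
                         n_boot=2000, confidence=0.99, seed=0):
    """Percentile interval for the difference in subsystem hits (A minus B).

    freq_a and freq_b are the fractions of proteins in each dataset that
    belong to the subsystem (hypothetical inputs). Drawing a protein with
    replacement and checking membership is a Bernoulli trial with that
    fraction as its success probability.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        hits_a = sum(rng.random() < freq_a for _ in range(sample_size))
        hits_b = sum(rng.random() < freq_b for _ in range(sample_size))
        diffs.append(hits_a - hits_b)
    diffs.sort()
    tail = (1.0 - confidence) / 2.0
    low = diffs[int(tail * n_boot)]
    high = diffs[int((1.0 - tail) * n_boot) - 1]
    return low, high


def is_significant(interval):
    """Assumed criterion: the interval excludes zero."""
    low, high = interval
    return low > 0 or high < 0


# Hypothetical subsystem frequencies, chosen only for illustration.
interval = bootstrap_difference(0.10, 0.20, sample_size=400)
print(interval, is_significant(interval))
```

Repeating the call over a range of `sample_size` values would show how the number of detectable differences grows with sample size, under the same assumptions.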
Presence of Glycine, Serine, and Threonine subsystems in the AMD, SEED, and Sargasso databases. The table is a subset of the data from the supplemental data [see Additional File 3].
| No. | Subsystem | K | Classification | AMD | SEED | SS | AMD (per million) | SEED (per million) | SS (per million) | Prev. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 206 | Glycine synthesis | A | Amino Alanine, serine, | 7 | 24 | 34 | 923 | 19 | 35 | Sargasso |
| 207 | Glycine synthesis | B | | 2 | 492 | 483 | 264 | 390 | 503 | Sargasso |
| 426 | Serine biosynthesis | A | | 13 | 166 | 244 | 1713 | 132 | 254 | Sargasso |
| 427 | Serine biosynthesis | B | | 7 | 1545 | 1257 | 923 | 1224 | 1309 | Sargasso |
| 475 | Threonine synthesis | B | K, T, M, and C | 8 | 1070 | 952 | 1054 | 848 | 991 | Sargasso |
Column 1: Subsystem number.
Column 2: Subsystem name as designated by the curator.
Column 3: K, kingdom (A: Archaea; B: Bacteria; E: Eukaryota).
Column 4: Classification of the subsystem. K, T, M, and C: lysine, threonine, methionine, and cysteine.
Column 5: Number of proteins present in the AMD sample.
Column 6: Number of proteins present in the SEED sample.
Column 7: Number of proteins present in the Sargasso Sea sample (SS).
Columns 8–10: Number of proteins present in the AMD, SEED, and Sargasso samples, normalized per million proteins in each sample.
Column 11: Prev., statistically significant prevalence.
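The per-million normalization described in the table notes is a simple rescaling of a raw count by the total number of proteins in the sample. The sketch below shows the arithmetic. The per-sample totals used for the table are not stated in this record, so the function name and the numbers in the example are hypothetical.

```python
def per_million(count, total_proteins):
    """Scale a raw protein count to a rate per million proteins."""
    if total_proteins <= 0:
        raise ValueError("total_proteins must be positive")
    return count * 1_000_000 / total_proteins


# Hypothetical example: 50 hits in a sample of 2,000,000 proteins.
print(per_million(50, 2_000_000))
```

Normalizing this way lets subsystem counts from samples of very different sizes be compared on the same scale.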
Figure 3. Fraction of amino acids in metagenomes. The fraction of each amino acid in all the predicted proteins in the three data samples was counted and compared.
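Counting the fraction of each amino acid across a set of predicted proteins can be sketched as follows. This is an illustrative sketch only: it assumes the proteins are available as plain one-letter-code strings, and the function name, the restriction to the 20 standard residues, and the toy sequences are assumptions rather than details from the paper.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_fractions(proteins):
    """Return the fraction of each standard amino acid over all sequences."""
    counts = Counter()
    for seq in proteins:
        # Ignore ambiguity codes, stop symbols, and other non-standard characters.
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Toy sequences for illustration; real input would be the predicted proteins.
fractions = amino_acid_fractions(["MKV", "MA"])
print(fractions["M"], fractions["K"])
```

Running the same function on each sample yields one composition profile per sample, which can then be compared residue by residue.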
Figure 4. Flow chart of methods used to identify statistical differences between phylosubsystems.