Hideto Takami, Takeaki Taniguchi, Yuki Moriya, Tomomi Kuwahara, Minoru Kanehisa, Susumu Goto.
Abstract
BACKGROUND: One of the main goals of genomic analysis is to elucidate the comprehensive functions (functionome) in individual organisms or a whole community in various environments. However, a standard evaluation method for discerning the functional potentials harbored within the genome or metagenome has not yet been established. We have developed a new evaluation method for the potential functionome, based on the completion ratio of Kyoto Encyclopedia of Genes and Genomes (KEGG) functional modules.
Year: 2012 PMID: 23234305 PMCID: PMC3541978 DOI: 10.1186/1471-2164-13-699
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Workflow for evaluating potential functionomes. Detailed workflow of the three annotation servers, KAAS, MEGAN4, and MG-RAST, using query sequences obtained after gene finding on the sequenced data. KAAS and MEGAN4 use BLASTP and BLASTX for amino acid and nucleotide query sequences, respectively, whereas MG-RAST uses only BLASTX. The servers use different databases (KEGG GENES for KAAS, NCBI-NR for MEGAN4, and M5nr [15] for MG-RAST; M5nr includes the SEED as a subset) and different default threshold values for the BLAST hits. Each server converts the hit entries to the corresponding orthology IDs for functional annotation and pathway/module/subsystem mapping. Red text for KAAS indicates its improvements in the current study (see “Assignment of the query sequences to KO identifiers” in the Methods section).
Figure 2. KEGG functional modules. A: A pathway module. The module M00009, comprising 8 reactions, is defined as the core module of the citrate cycle (TCA cycle) and is represented as a Boolean algebra-like equation of KO identifiers (K numbers) for computational applications. The relationship between this module and the corresponding KEGG pathway map is shown by marking the corresponding K number sets in the module and the EC numbers in the pathway map with the same index. In each K number set, vertically connected K numbers indicate a complex and therefore represent “And” or “+” in the Boolean algebra-like equation, whereas horizontally located K numbers indicate alternatives and represent “Or” or “,” in the equation. B: A structural complex module. The structural complex module M00163, comprising 12 (cyanobacteria) or 14 (plant) components, is defined for the type I photosystem. The Boolean algebra-like equation and the corresponding KEGG pathway map are also shown. The KEGG pathway map shows the Thermosynechococcus elongatus (cyanobacteria) photosystem. Green and purple boxes indicate plant and cyanobacteria components, respectively.
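The “And”/“Or” logic described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes a module is a list of steps, that each step is a list of alternative complexes, and that the completion ratio is the fraction of satisfied steps (the exact definition is not given in this excerpt). The K numbers in `toy_module` are placeholders and do not correspond to M00009 or any real module.

```python
from typing import FrozenSet, List, Set

# A complex is a set of K numbers that must all be present ("And", "+").
Complex = FrozenSet[str]
# A step is a list of alternative complexes; any one of them suffices ("Or", ",").
Step = List[Complex]


def step_completed(step: Step, found: Set[str]) -> bool:
    """A step is satisfied if every K number of at least one alternative was found."""
    return any(alternative <= found for alternative in step)


def completion_ratio(module: List[Step], found: Set[str]) -> float:
    """Fraction of steps satisfied (assumed definition of the completion ratio)."""
    if not module:
        raise ValueError("module must contain at least one step")
    completed = sum(step_completed(step, found) for step in module)
    return completed / len(module)


# Hypothetical 4-step module; the K numbers are placeholders, not real KO entries.
toy_module: List[Step] = [
    [frozenset({"K90001"})],                                  # single K number
    [frozenset({"K90002"}), frozenset({"K90003"})],           # K90002 or K90003
    [frozenset({"K90004", "K90005"})],                        # complex: K90004 + K90005
    [frozenset({"K90006"}), frozenset({"K90007", "K90008"})], # K90006 or (K90007 + K90008)
]

# K numbers assigned to a hypothetical genome.
genome_kos = {"K90001", "K90003", "K90004", "K90007", "K90008"}
ratio = completion_ratio(toy_module, genome_kos)
print(f"completion ratio: {ratio:.2f}")
```

In this toy case the third step fails because only one member of the complex is present, so three of the four steps are satisfied.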
Classification of the KEGG modules based on the module completion ratio of 768 prokaryotes
| | | | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| A | Universal | A-1 | 15 (7.4) | 0 (0) | 9 (3.4) | 0 (0) | 1 (25) | 0 (0) | 0 (0) | 0 (0) |
| | | A-2 | 8 (3.9) | 1 (1.9) | 0 (0) | 0 (0) | 1 (25) | 0 (0) | 0 (0) | 0 (0) |
| B | Restricted | - | 22 (10.8) | 17 (31.5) | 119 (45.2) | 95 (89.6) | 0 (0) | 0 (0) | 3 (100) | 3 (100) |
| C | Diversified | - | 79 (38.9) | 36 (66.7) | 54 (20.5) | 11 (10.4) | 1 (25) | 1 (100) | 0 (0) | 0 (0) |
| D | Non-prokaryotic | - | 79 (38.9) | - | 81 (30.8) | - | 1 (25) | - | 0 (0) | - |
[ ] shows the total number of KEGG modules containing branched modules. “Rare” indicates modules completed by less than 10% of the 768 prokaryotic species. Universal: modules completed by more than 70% of the 768 prokaryotic species. Restricted: modules completed by less than 30% of the 768 prokaryotic species. Diversified: modules that vary in module completion ratio among the 768 prokaryotic species. Non-prokaryotic: modules not completed by any prokaryotic species.
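The values in parentheses appear to be each count's share of its column total. A quick Python check under that assumption, using the first numeric column of the classification table (rows A-1, A-2, B, C, D):

```python
# Counts and reported percentages copied from the first numeric column of the table.
counts = {"A-1": 15, "A-2": 8, "B": 22, "C": 79, "D": 79}
reported = {"A-1": 7.4, "A-2": 3.9, "B": 10.8, "C": 38.9, "D": 38.9}

# Assumption: each percentage is count / column total, rounded to one decimal place.
total = sum(counts.values())
shares = {row: round(100 * n / total, 1) for row, n in counts.items()}

for row in counts:
    print(row, counts[row], shares[row], "matches" if shares[row] == reported[row] else "differs")
```

The recomputed shares agree with the reported values for this column, which supports reading the percentages as column-wise fractions.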
Figure 3. Typical completion patterns of the KEGG modules in 768 prokaryotic species. A: Universal modules. (A-1) Modules completed by more than 70% of the 768 prokaryotic species; M00018_1, threonine biosynthesis (aspartate → homoserine → threonine), is an example of pattern A-1. (A-2) Modules for which more than 70% of the 768 prokaryotic species show a module completion ratio of >80%; M00019_1, leucine biosynthesis (pyruvate → 2-oxoisovalerate → leucine), is an example of pattern A-2. B: Restricted modules, completed by less than 30% of the 768 prokaryotic species; M00038_1, tryptophan metabolism (tryptophan → kynurenine → 2-aminomuconate), is an example of pattern B. C: Diversified modules, which vary in module completion ratio among the 768 prokaryotic species; M00012_1, the glyoxylate cycle, is an example of pattern C. D: Non-prokaryotic modules, completed by no prokaryotic species; M00014_1, the glucuronate pathway (uronate pathway), is an example of pattern D. The taxonomic breakdown of the species that complete each KEGG module is summarized in Table 1 and shown in detail in Supplementary Table S2.
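The pattern definitions above can be read as a simple decision rule. The sketch below is one literal reading of the stated thresholds, not the authors' procedure: the order in which the rules are tested, and the treatment of everything left over as Diversified, are assumptions. The function name `classify_module` is hypothetical.

```python
from typing import Sequence


def classify_module(ratios: Sequence[float]) -> str:
    """Classify one module from its completion ratio (0.0 to 1.0) in each species.

    Thresholds follow the caption; the order of the tests is an assumption.
    """
    if not ratios:
        raise ValueError("need at least one species")
    n = len(ratios)
    frac_complete = sum(r >= 1.0 for r in ratios) / n  # species completing the module
    frac_nearly = sum(r > 0.8 for r in ratios) / n     # species with ratio > 80%

    if frac_complete > 0.7:
        return "A-1"  # universal: completed by more than 70% of species
    if frac_nearly > 0.7:
        return "A-2"  # universal: more than 70% of species exceed an 80% ratio
    if frac_complete == 0:
        return "D"    # non-prokaryotic: completed by no species
    if frac_complete < 0.3:
        return "B"    # restricted: completed by less than 30% of species
    return "C"        # diversified: everything else (assumed)


# Made-up completion profiles for ten species each.
print(classify_module([1.0] * 8 + [0.5] * 2))
print(classify_module([0.9] * 8 + [0.2] * 2))
print(classify_module([1.0] * 2 + [0.0] * 8))
print(classify_module([1.0] * 5 + [0.5] * 5))
print(classify_module([0.5] * 10))
```

Testing the universal patterns first means a module that few species fully complete can still be labeled A-2, which is consistent with the table listing a “Rare” module in the A-2 row.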
Breakdown of taxonomic patterns of the KEGG modules
| Major taxonomic pattern | Number (%) | Major taxonomic pattern | Number (%) |
|---|---|---|---|
| Non-prokaryote | 79 (38.9) | Non-prokaryote | 81 (30.8) |
| Prokaryote | 52 (25.6) | Bacteria-specific | 45 (17.1) |
| Bacteria-specific | 25 (12.3) | Prokaryote | 42 (16) |
| | 8 (3.9) | | 24 (9.1) |
| | 6 (3) | Archaea-specific | 10 (3.8) |
| | 4 (2) | | 10 (3.8) |
| | 4 (2) | | 10 (3.8) |
| | 3 (1.5) | | 8 (3) |
| | 3 (1.5) | | 4 (1.5) |
| Archaea-specific | 2 (1) | | 3 (1.1) |
| | 2 (1) | | 3 (1.1) |
| | 2 (1) | | 3 (1.1) |
| | 2 (1) | | 2 (0.8) |
| | 1 (0.5) | | 1 (0.4) |
| | 1 (0.5) | | 1 (0.4) |
| | 1 (0.5) | | 1 (0.4) |
| | 1 (0.5) | | 1 (0.4) |
| | 1 (0.5) | | 1 (0.4) |
| | 1 (0.5) | | 1 (0.4) |
| | 1 (0.5) | | 1 (0.4) |
| | 1 (0.5) | | 1 (0.4) |
| | 1 (0.5) | | 1 (0.4) |
| | 1 (0.5) | | 1 (0.4) |
| | 1 (0.5) | | 1 (0.4) |
| | | | 1 (0.4) |
| Major taxonomic pattern | Number (%) | | 1 (0.4) |
| Prokaryote | 3 (75) | | 1 (0.4) |
| Non-prokaryote | 1 (25) | | 1 (0.4) |
| | | | 1 (0.4) |
| Major taxonomic pattern | Number (%) | | 1 (0.4) |
| | 1 (33.3) | | 1 (0.4) |
| | 1 (33.3) | | |
| | 1 (33.3) | | |
[ ] shows total number of the KEGG modules containing branched modules.
Figure 4. Comparison of module completion patterns in 8 phenotypically different Bacillus-related species. A: Pathway modules showing remarkable differences among the 8 species. B: Structural complex modules showing remarkable differences among the 8 species. The upper histogram indicates common or specific modules in the species possessing each phenotype (from left to right: mesophilic neutrophile; mesophilic alkaliphile; mesophilic, extremely halotolerant alkaliphile; and thermophilic neutrophile). Green letters indicate rare modules completed by less than 10% of the 768 prokaryotic species described in Figure 3. Letters in parentheses indicate the completion pattern based on the module completion ratio, as shown in Table 1 and Figure 3. A: Universal module, B: Restricted module, C: Diversified module, D: Non-prokaryotic module. bsu, B. subtilis; bao, B. amyloliquefaciens; bli, B. licheniformis; bha, B. halodurans; bpf, B. pseudofirmus; oih, O. iheyensis; gka, G. kaustophilus; and gth, G. thermoglucosidasius.
Figure 5. Comparison of module completion patterns in humans and human gut microbiomes from 13 healthy individuals. A: Typical pathway modules showing remarkable differences in the module completion ratio among the human gut microbiomes from 13 healthy individuals. B: Typical pathway modules with complementary relationships between humans and human gut microbiomes in the module completion ratio. C: Typical pathway modules for which the completion ratio in the human gut microbiome is very low in contrast to that in humans. Green letters indicate rare modules completed by less than 10% of the 768 prokaryotic species described in Figure 3. Detailed information on the 13 individuals has been described previously [6]. Letters in parentheses indicate the completion pattern based on the module completion ratio, as shown in Table 1 and Figure 3. A: Universal module, B: Restricted module, C: Diversified module, D: Non-prokaryotic module.
Figure 6. Taxonomic variation in genes assigned to KOs associated with glycolysis in human gut microbiomes from 13 healthy individuals. The pathway module M00002_1, comprising 6 steps, represents glycolysis (core module involving 3-carbon compounds), and the K number in each box indicates the KO assigned to each individual reaction. Because K00134 and K00150 have an “Or” relationship, as explained in Figure 2, the reaction at the 2nd step can be executed if a gene from the human gut microbiomes is assigned to either K00134 or K00150. Pie charts show the taxonomic breakdown of the genes assigned to KOs in all 6 steps. Numbers in parentheses indicate the number of genes assigned to a KO in each reaction step.
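The per-step tally behind such pie charts can be sketched as follows. This is a hedged illustration only: the gene identifiers, taxon labels, and the K number `K90001` are made up, and only the “Or” pair K00134/K00150 comes from the caption. The function name `taxon_counts_per_step` is hypothetical.

```python
from collections import Counter
from typing import List, Set, Tuple

# (gene id, assigned K number, taxon); every record below is invented for illustration.
Assignment = Tuple[str, str, str]


def taxon_counts_per_step(
    assignments: List[Assignment], steps: List[Set[str]]
) -> List[Counter]:
    """For each step, count genes per taxon whose K number is one of the step's alternatives."""
    return [
        Counter(taxon for _gene, ko, taxon in assignments if ko in alternatives)
        for alternatives in steps
    ]


steps = [
    {"K90001"},            # placeholder K number standing in for a single-KO step
    {"K00134", "K00150"},  # the "Or" pair named in the caption
]
assignments = [
    ("gene1", "K90001", "Firmicutes"),
    ("gene2", "K00134", "Bacteroidetes"),
    ("gene3", "K00150", "Firmicutes"),
    ("gene4", "K00134", "Firmicutes"),
    ("gene5", "K99999", "Firmicutes"),  # K number outside the module; ignored
]

per_step = taxon_counts_per_step(assignments, steps)
# A step counts as executable if at least one gene hits any of its alternatives.
executable = [sum(counts.values()) > 0 for counts in per_step]

for index, counts in enumerate(per_step, start=1):
    print(f"step {index}: {sum(counts.values())} genes, {dict(counts)}")
```

Pooling the alternatives before counting reflects the “Or” relationship: genes assigned to K00134 and K00150 contribute to the same step total.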
Breakdown of small functional categories of the KEGG modules
| Small functional category | Number (%) | Small functional category | Number (%) |
|---|---|---|---|
| Cofactor & vitamin biosynthesis | 30 (14.6) | Saccharide and polyol transport system | 29 (11.0) |
| Carbon fixation | 14 (6.8) | ATP synthesis | 27 (10.3) |
| Central carbohydrate metabolism | 14 (6.8) | Phosphotransferase system (PTS) | 24 (9.1) |
| Lipid metabolism | 14 (6.8) | Mineral and organic ion transport system | 23 (8.7) |
| Glycan metabolism | 13 (6.3) | Phosphate and amino acid transport system | 19 (7.2) |
| Aromatic amino acid metabolism | 11 (5.4) | ABC-2 type and other transport systems | 16 (6.1) |
| Methane metabolism | 11 (5.4) | Bacterial secretion system | 14 (5.3) |
| Fatty acid metabolism | 10 (4.9) | RNA processing | 13 (4.9) |
| Sterol biosynthesis | 10 (4.9) | Ubiquitin | 13 (4.9) |
| Cysteine & methionine metabolism | 7 (3.4) | Metallic cation, iron-siderophore and vitamin B12 transport system | 12 (4.6) |
| Glycosaminoglycan metabolism | 7 (3.4) | Protein processing | 9 (3.4) |
| Other carbohydrate metabolism | 7 (3.4) | Spliceosome | 9 (3.4) |
| Polyamine biosynthesis | 6 (2.9) | Repair system | 8 (3.0) |
| Terpenoid backbone biosynthesis | 6 (2.9) | DNA polymerase | 7 (2.7) |
| Lysine metabolism | 5 (2.4) | Photosynthesis | 7 (2.7) |
| Pyrimidine metabolism | 5 (2.4) | RNA polymerase | 7 (2.7) |
| Alkaloid & other secondary metabolite | 4 (1.9) | Peptide and nickel transport system | 6 (2.3) |
| LPS metabolism | 4 (1.9) | Replication system | 6 (2.3) |
| Other terpenoid biosynthesis | 4 (1.9) | Carbohydrate metabolism | 5 (1.9) |
| Arginine & proline metabolism | 3 (1.5) | Proteasome | 5 (1.9) |
| BCAA metabolism | 3 (1.5) | Ribosome | 3 (1.1) |
| Other amino acid metabolism | 3 (1.5) | Glycan metabolism | 1 (0.4) |
| Phenylpropanoid & flavonoid biosynthesis | 3 (1.5) | | |
| Purine metabolism | 3 (1.5) | Small functional category | Number (%) |
| Histidine metabolism | 2 (1.0) | Aminoacyl-tRNA | 2 (50) |
| Metabolic capacity | 2 (1.0) | Nucleotide sugar | 2 (50) |
| Serine & threonine metabolism | 2 (1.0) | | |
| Nitrogen fixation | 1 (0.5) | Small functional category | Number (%) |
| Sulfur metabolism | 1 (0.5) | Genotypic signature | 3 (100) |
[ ] shows total number of the KEGG modules containing branched modules.