Francisco P Lobo, Maíra R Rodrigues, Gisele O L Rodrigues, Heron O Hilário, Raoni A Souza, Andreas Tauch, Anderson Miyoshi, Glaura C Franco, Vasco Azevedo, Glória R Franco.
Abstract
Enrichment analysis is a standard procedure to interpret 'omics' experiments that generate large gene lists as outputs, such as transcriptomics and proteomics. However, despite the huge success of enrichment analysis in these classes of experiments, there is a surprising lack of application of this methodology to survey other categories of large-scale biological data available. Here, we report Kegg Orthology enrichMent-Online DetectiOn (KOMODO), a web tool to systematically investigate groups of monophyletic genomes in order to detect significantly enriched groups of homologous genes in one taxon when compared with another. The results are displayed in their proper biochemical roles in a visual, explorative way, allowing users to easily formulate and investigate biological hypotheses regarding the taxonomical distribution of genomic elements. We validated KOMODO by analyzing portions of central carbon metabolism in two taxa extensively studied regarding their carbon metabolism profile (Enterobacteriaceae family and Lactobacillales order). Most enzymatic activities significantly biased were related to known key metabolic traits in these taxa, such as the distinct fates of pyruvate (the known tendency of lactate production in Lactobacillales and its complete oxidation in Enterobacteriaceae), demonstrating that KOMODO could detect biologically meaningful differences in the frequencies of shared genomic elements among taxa. KOMODO is freely available at http://komodotool.org.
Year: 2012 PMID: 22675073 PMCID: PMC3394310 DOI: 10.1093/nar/gks490
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Pipeline used for implementation of KOMODO. Orange boxes denote flat files from KEGG or generated by KOMODO. Purple boxes represent programs developed in this study. The green cylinder represents the relational database generated for this study. Blue boxes indicate the user-supplied information needed for KO enrichment analysis. The sequential steps of the analysis are highlighted by yellow dots. (A) Parsing of KEGG information to generate the KO count and genome number per taxon; data are stored in a local relational database. (B) Program ‘web.php’ coordinates the entire pipeline by getting user-defined parameters (step C) to search in the relational database (step D), perform the statistical testing (steps E, F and G), and generate a dynamic webpage and a flat file as final results (step G).
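The statistical part of this pipeline (steps D to G) can be illustrated with a minimal Python sketch. The caption does not say which statistical test or multiple-testing correction KOMODO applies, so the code below substitutes a two-sided Fisher exact test on presence/absence genome counts followed by a Benjamini-Hochberg correction. Both choices are assumptions, as are the function names and the input layout (for each KO, the number of test and background genomes that carry it); this is not KOMODO's actual implementation.

```python
from math import comb


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted values (q-values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


def ko_enrichment(ko_counts, n_test, n_background):
    """Assumed input: ko_counts[ko] = (test genomes with KO, background genomes with KO).

    Returns {ko: (q_value, more_in_test)}.
    """
    kos = sorted(ko_counts)
    p_values = [
        fisher_two_sided(t, n_test - t, b, n_background - b)
        for t, b in (ko_counts[ko] for ko in kos)
    ]
    q_values = benjamini_hochberg(p_values)
    results = {}
    for ko, q in zip(kos, q_values):
        t, b = ko_counts[ko]
        results[ko] = (q, t / n_test > b / n_background)
    return results
```

With made-up counts such as `ko_enrichment({"K_A": (20, 0), "K_B": (10, 10)}, 20, 20)`, each hypothetical KO identifier maps to a q-value and a flag saying whether the KO is more frequent in the test taxon than in the background.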
Figure 2. KEGG Orthology enrichment analysis of the glycolytic pathway in the ENT group. Yellow boxes denote KOs where no difference was observed between test and background taxa. Black boxes denote KOs not observed in test or background taxa. Green and red boxes denote KOs significantly more and less represented in the test lineage when compared with the background one, respectively. Darker and lighter tones of both green and red boxes denote q-values between 0.05 and 0.00001 and smaller than 0.00001, respectively. Blue, pink, orange and purple arrows indicate trunk glycolysis enzymes, L-lactate dehydrogenase, PEP carboxykinase and taxon-specific enzymatic activities, respectively.
Figure 3. KEGG Orthology enrichment analysis of the glycolytic pathway in the LAB group. The color scheme and q-value cut-offs are the same as those used in Figure 2.
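The color scheme shared by Figures 2 and 3 amounts to a small decision rule, sketched below in Python. The two cut-offs (0.05 and 0.00001) come from the Figure 2 caption; the function name, its arguments, and the treatment of values falling exactly on a cut-off are assumptions made for illustration, since the captions do not specify them.

```python
from typing import Optional

Q_SIGNIFICANT = 0.05   # upper q-value cut-off given in the Figure 2 caption
Q_STRONG = 0.00001     # lower q-value cut-off given in the Figure 2 caption


def ko_box_color(q_value: Optional[float], more_in_test: bool, observed: bool = True) -> str:
    """Map one KO's result to the box color described in the captions.

    Assumption: q-values exactly equal to a cut-off fall into the less
    significant class; the captions leave the boundaries open.
    """
    if not observed:
        # KO absent from both the test and the background taxon.
        return "black"
    if q_value is None or q_value >= Q_SIGNIFICANT:
        # No significant difference between test and background.
        return "yellow"
    hue = "green" if more_in_test else "red"
    # Lighter tones mark the smallest q-values, darker tones the intermediate ones.
    tone = "light" if q_value < Q_STRONG else "dark"
    return f"{tone} {hue}"
```

A KO over-represented in the test taxon with a q-value of 0.01 would fall in the darker green class under this rule, while the same direction with a q-value below 0.00001 would fall in the lighter green class.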