Aurélien Mazurie, Danail Bonchev, Benno Schwikowski, Gregory A Buck.
Abstract
BACKGROUND: Comparison of metabolic networks across species is a key to understanding how evolutionary pressures shape these networks. By selecting taxa representative of different lineages or lifestyles and using a comprehensive set of descriptors of the structure and complexity of their metabolic networks, one can highlight both qualitative and quantitative differences in the metabolic organization of species subject to distinct evolutionary paths or environmental constraints.
Year: 2010 PMID: 20459825 PMCID: PMC2876064 DOI: 10.1186/1752-0509-4-59
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Performance of the classification models.
| Comparison | Accuracy | Kappa statistic | Classification model |
|---|---|---|---|
| Archaea (56) vs. Bacteria (600) vs. Eukarya (87) | 93.54% (89.50%) | 0.81 (0.62) | Functions.Logistic |
| Prokarya (656) vs. Eukarya (87) | 98.25% (96.90%) | 0.91 (0.84) | Functions.MultilayerPerceptron |
| Unicellular (44) vs. Multicellular (43) Eukarya | 96.55% (96.55%) | 0.93 (0.93) | Rules.JRip |
| Free-living (525) vs. Host-associated (61) Bacteria | 91.98% (92.32%) | 0.45 (0.48) | Rules.OneR |
| Immotile (202) vs. Motile (322) Bacteria | 72.33% (72.14%) | 0.40 (0.40) | Lazy.IB1 |
| Anaerobe (253) vs. Facultative aerobe (170) vs. Aerobe (253) | 61.20% (57.92%) | 0.38 (0.33) | Trees.RandomForest |
| Halotolerant (4) vs. Halophile (15) Bacteria | 78.95% | 0.00 | Functions.LibSVM |
| Psychrophile (24) vs. Mesophile (508) vs. Thermophile (61) Bacteria | 86.00% | 0.04 | Functions.LibSVM |
Scores obtained by training classification models to discriminate groups of taxa based on quantitative descriptors of the structure and complexity of their Networks of Interacting Pathways (NIPs). The number of taxa in each group is given in parentheses after the group name. Accuracy and Kappa statistic are from 10-fold cross-validation of the best-performing classification model using all 52 NIP descriptors; values in parentheses are those obtained with the best subset of descriptors identified (see Figure 1).
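The table reports both accuracy and the Kappa statistic because accuracy alone can look high on unbalanced groups. The sketch below computes both from a confusion matrix using the standard Cohen's kappa formula. The function name and the example matrix are hypothetical: the matrix assumes a classifier that labels all 19 taxa of the Halotolerant (4) vs. Halophile (15) comparison as the majority class, which is one scenario consistent with the reported 78.95% accuracy and Kappa of 0.00, not something the source states.

```python
from typing import Sequence, Tuple


def accuracy_and_kappa(confusion: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    Rows are true classes, columns are predicted classes.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    diagonal = sum(confusion[i][i] for i in range(k))
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]

    # Chance agreement, kept as an integer (scaled by total**2) to avoid
    # rounding noise when observed and expected agreement coincide.
    chance = sum(r * c for r, c in zip(row_totals, col_totals))

    accuracy = diagonal / total
    denominator = total * total - chance
    if denominator == 0:
        # Kappa is undefined when chance agreement is already perfect.
        return accuracy, float("nan")
    kappa = (diagonal * total - chance) / denominator
    return accuracy, kappa


# Hypothetical confusion matrix: every taxon predicted as "Halophile".
# Rows: true Halotolerant, true Halophile; columns: predicted classes.
majority_only = [[0, 4], [0, 15]]
acc, kappa = accuracy_and_kappa(majority_only)
print(f"accuracy={acc:.2%}, kappa={kappa:.2f}")
```

Under that assumption, agreement with the true labels is no better than chance, so Kappa is zero even though most taxa are labelled correctly.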
Figure 1. Values of the NIP descriptors (abridged). For each group comparison, the descriptors of the structure and complexity of NIPs reported are those shown to best discriminate the different groups of taxa considered. Bars represent the average value and standard deviation of a given descriptor for each group of taxa, based on metabolic networks extracted from KEGG. The hypothesis that the descriptor value is the same over all groups was evaluated for each metabolic dataset either by a Kruskal-Wallis test (comparisons of three groups) or a Mann-Whitney U test (comparisons of two groups). Resulting p-values were corrected for multiple testing using Bonferroni correction. Values for all 52 descriptors are available in Additional file 1. A.I., N.I., T.I.: Average, Normalized and Total Information, respectively.
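As a reminder of what the Bonferroni correction mentioned in the caption does, here is a minimal sketch: each p-value is multiplied by the number of tests and capped at 1. The function name and the example p-values are hypothetical, and treating the number of tests as the length of the input list is an assumption; the source does not say how many tests entered the correction.

```python
from typing import List, Sequence


def bonferroni(p_values: Sequence[float]) -> List[float]:
    """Return Bonferroni-adjusted p-values: min(1, p * number_of_tests)."""
    m = len(p_values)  # assumption: every p-value passed in counts as one test
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for three descriptors.
raw = [0.0004, 0.02, 0.6]
for p, p_adj in zip(raw, bonferroni(raw)):
    print(f"raw={p:.4f}  adjusted={p_adj:.4f}")
```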
Figure 2. Effects of lineage and environment on pathway frequency, connectivity and centrality. The amplitude of variation of six scores of frequency, connectivity and centrality, and a p-value evaluating the significance of this variation, are reported for all metabolic pathways. P-values were calculated by either Fisher's exact test (frequency) or a Mann-Whitney U test (connectivity and centrality), and corrected for multiple testing using the Benjamini-Hochberg method [31]. The median variation and the p-value for pathways in the same functional category are pictured as either a triangle or a diamond. A triangle pointing left (◀) means the score increases from left to right (e.g., from Prokarya to Eukarya), while a triangle pointing right (▶) means the score decreases from left to right. A diamond means the score does not change. The position of the symbol is proportional to the median corrected p-value, i.e. False Discovery Rate, and its size is proportional to the amplitude of variation. All values are available in Additional file 2. B.C.: Betweenness Centrality.
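The Benjamini-Hochberg correction named in the caption can be sketched as follows. This is the textbook step-up procedure, not code from the study: each sorted p-value is scaled by m/rank, and monotonicity is then enforced from the largest p-value downwards. The function name and the example p-values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value never exceeds the adjusted value of a larger raw p-value.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four pathways.
raw = [0.01, 0.04, 0.03, 0.005]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"raw={p:.3f}  adjusted={q:.3f}")
```

Compared with Bonferroni, this procedure controls the False Discovery Rate rather than the family-wise error rate, which is generally less conservative when many pathways are tested at once.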