Summit Suen, Henry Horng-Shing Lu, Chen-Hsiang Yeang.
Abstract
Domain architectures and catalytic functions of enzymes constitute the centerpieces of a metabolic network. These types of information are formulated as a two-layered network consisting of domains, proteins, and reactions: a domain-protein-reaction (DPR) network. We propose an algorithm to reconstruct the evolutionary history of DPR networks across multiple species and categorize the mechanisms of metabolic systems evolution in terms of network changes. The reconstructed history reveals distinct patterns of evolutionary mechanisms between prokaryotic and eukaryotic networks. Although the evolutionary mechanisms in early ancestors of prokaryotes and eukaryotes are quite similar, more novel and duplicated domain compositions with identical catalytic functions arise along the eukaryotic lineage. In contrast, prokaryotic enzymes become more versatile by catalyzing multiple reactions with similar chemical operations. Moreover, different metabolic pathways are enriched with distinct network evolution mechanisms. For instance, although the pathways of steroid biosynthesis, protein kinases, and glycosaminoglycan biosynthesis all constitute prominent features of animal-specific physiology, the evolution of their domain architectures and catalytic functions follows distinct patterns. Steroid biosynthesis is enriched with reaction creations but retains a relatively conserved repertoire of domain compositions and proteins. Protein kinases retain conserved reactions but possess many novel domains and proteins. In contrast, glycosaminoglycan biosynthesis has high rates of reaction/protein creations and domain recruitments. Finally, we elicit and validate two general principles underlying the evolution of DPR networks: 1) duplicated enzyme proteins possess similar catalytic functions and 2) the majority of novel domains arise to catalyze novel reactions. These results shed new light on the evolution of metabolic systems.
Year: 2012 PMID: 22936075 PMCID: PMC3468959 DOI: 10.1093/gbe/evs072
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Figure. Left: A toy example of the evolutionary history of DPR networks. Species 1 has two child species: species 2 and 3. Triangles denote domains, circles denote proteins, and squares denote reactions. Solid, undirected edges denote domain-protein and protein-reaction edges. Dotted, directed edges denote phylogenetic relations of proteins. In species 1, a protein consists of one domain and catalyzes one reaction. From species 1 to species 2, a new domain is recruited into the protein inherited from that ancestral protein. From species 1 to species 3, the ancestral protein is duplicated into two copies, one of which catalyzes a new reaction. Right: Eleven types of changes in the DPR networks from a parent species to a child species. From left to right: (1) domain creation, (2) domain deletion, (3) protein duplication, (4) protein creation, (5) protein deletion, (6) reaction creation, (7) reaction deletion, (8) domain-protein edge addition, (9) domain-protein edge deletion, (10) protein-reaction edge addition, and (11) protein-reaction edge deletion.
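To make the eleven change types concrete, the following is a minimal sketch of how a DPR network and a parent-to-child comparison might be represented. It is not the authors' implementation: the `DPRNetwork` class, the `count_changes` function, and all identifiers (`d1`, `p1`, `p3`, `p4`, `r1`, `r2`) are hypothetical, and the sketch assumes that edge additions and deletions are counted only for proteins inherited from a parent protein. The toy data mimic the species 1 to species 3 step of the caption, under the additional assumption that both copies keep the original reaction.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class DPRNetwork:
    """One species' network; all identifiers are hypothetical labels."""
    domains: set = field(default_factory=set)
    proteins: set = field(default_factory=set)
    reactions: set = field(default_factory=set)
    domain_protein: set = field(default_factory=set)    # (domain, protein) pairs
    protein_reaction: set = field(default_factory=set)  # (protein, reaction) pairs

    def domains_of(self, protein):
        return {d for d, p in self.domain_protein if p == protein}

    def reactions_of(self, protein):
        return {r for p, r in self.protein_reaction if p == protein}


def count_changes(parent, child, lineage):
    """Count change events; lineage maps child protein -> parent protein or None."""
    counts = Counter()
    counts["domain creation"] = len(child.domains - parent.domains)
    counts["domain deletion"] = len(parent.domains - child.domains)
    counts["reaction creation"] = len(child.reactions - parent.reactions)
    counts["reaction deletion"] = len(parent.reactions - child.reactions)

    # Number of child proteins descending from each parent protein.
    descendants = Counter(
        lineage[c] for c in child.proteins if lineage.get(c) is not None
    )
    for p in parent.proteins:
        n = descendants.get(p, 0)
        if n == 0:
            counts["protein deletion"] += 1
        else:
            counts["protein duplication"] += n - 1  # each extra copy is one event

    for c in child.proteins:
        p = lineage.get(c)
        if p is None:
            counts["protein creation"] += 1
            continue  # assumption: edges of created proteins are not counted
        counts["domain-protein edge addition"] += len(child.domains_of(c) - parent.domains_of(p))
        counts["domain-protein edge deletion"] += len(parent.domains_of(p) - child.domains_of(c))
        counts["protein-reaction edge addition"] += len(child.reactions_of(c) - parent.reactions_of(p))
        counts["protein-reaction edge deletion"] += len(parent.reactions_of(p) - child.reactions_of(c))
    return counts


species1 = DPRNetwork({"d1"}, {"p1"}, {"r1"}, {("d1", "p1")}, {("p1", "r1")})
species3 = DPRNetwork(
    {"d1"}, {"p3", "p4"}, {"r1", "r2"},
    {("d1", "p3"), ("d1", "p4")},
    {("p3", "r1"), ("p4", "r1"), ("p4", "r2")},
)
changes = count_changes(species1, species3, {"p3": "p1", "p4": "p1"})
```

Keeping the protein lineage as an explicit mapping separates "who descends from whom" from the network content, which is the same separation the reconstruction relies on when it alternates between lineage variables and edge variables.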
Figure. Schematic of the DPR network reconstruction algorithm. A collection of protein trees is inferred from their domain compositions and sequences alone. These protein trees set the initial values of protein lineage variables. With observed DPR networks in the contemporary species as inputs, the max-product algorithm is iteratively applied to infer the values of one set of variables (e.g., domain-protein and protein-reaction edges) by fixing the values of the other set of variables (e.g., protein lineages). Iteration continues until all variable values converge. The converged variable configuration is postprocessed to generate the inferred evolutionary history.
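One half-step of this alternating scheme can be illustrated on a single binary variable. The sketch below runs max-product (in log space) on a rooted tree to find the most probable presence/absence states of one edge at ancestral nodes, given its observed states at the leaves and a fixed tree. The model is an assumption made for illustration, not the one used in the paper: states are binary, a state flips along any branch with the same hypothetical probability `p_change`, and the root prior is uniform. The tree and node names are also hypothetical.

```python
import math


def max_product_tree(children, leaf_state, root, p_change=0.1):
    """Most probable binary states for all nodes of a rooted tree.

    children:   dict mapping each internal node to a list of child nodes
    leaf_state: dict mapping each leaf to its observed state (0 or 1)
    """
    def log_trans(a, b):
        return math.log(1 - p_change) if a == b else math.log(p_change)

    score = {}   # score[node][s]: best log-probability of the subtree given state s
    choice = {}  # choice[(node, child, s)]: best child state given node state s

    def upward(node):
        if node in leaf_state:
            observed = leaf_state[node]
            score[node] = {s: (0.0 if s == observed else -math.inf) for s in (0, 1)}
            return
        score[node] = {0: 0.0, 1: 0.0}
        for c in children[node]:
            upward(c)
            for s in (0, 1):
                best = max((0, 1), key=lambda t: log_trans(s, t) + score[c][t])
                choice[(node, c, s)] = best
                score[node][s] += log_trans(s, best) + score[c][best]

    states = {}

    def downward(node, s):
        states[node] = s
        for c in children.get(node, ()):
            downward(c, choice[(node, c, s)])

    upward(root)
    downward(root, max((0, 1), key=lambda s: score[root][s]))
    return states


# Hypothetical tree: the edge is present in two sister leaves, absent elsewhere.
tree = {"root": ["anc", "l3", "l4"], "anc": ["l1", "l2"]}
observed = {"l1": 1, "l2": 1, "l3": 0, "l4": 0}
inferred = max_product_tree(tree, observed, "root")
```

In the full procedure described in the caption, a step of this kind would be repeated for the other set of variables with the current edge states held fixed, until no value changes between iterations.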
Figure. Phylogenetic tree of 13 selected species according to the National Center for Biotechnology Information (NCBI) taxonomy database. Branch lengths are not scaled to the evolutionary distances between genomes.
Figure. Validations of the reconstruction algorithm. Top: Distribution of error rates on simulated data, with varying perturbation rates on the initial protein trees. From top-left to bottom-right, the perturbation rate varies from 0 to 0.5. The error rate distributions of the max-product algorithm (solid blue) and dynamic programming (dashed red) are plotted. Bottom: Sensitivities, specificities, and overall error rates of cross-validation predictions on real data sets, with varying ratios of test set sizes to training set sizes. Crosses: sensitivities of domain-protein edges. Circles: specificities of domain-protein edges. Plus signs: sensitivities of protein-reaction edges. Asterisks: specificities of protein-reaction edges. Squares: overall error rates. Blue symbols: max-product prediction outcomes. Red symbols: dynamic programming prediction outcomes.
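For reference, the three quantities plotted in the bottom panel can be computed from sets of edges as in the sketch below. The function name `edge_metrics` and the toy edges are hypothetical, and the sketch assumes that the true and predicted edges are both drawn from a known set of candidate edges; how the paper defines the candidate set is not stated in this record.

```python
def edge_metrics(true_edges, predicted_edges, candidate_edges):
    """Sensitivity, specificity, and overall error rate of predicted edges.

    All three arguments are sets of hashable edges; true_edges and
    predicted_edges are assumed to be subsets of candidate_edges.
    """
    negatives = candidate_edges - true_edges
    tp = len(true_edges & predicted_edges)   # present and predicted
    fn = len(true_edges - predicted_edges)   # present but missed
    fp = len(negatives & predicted_edges)    # absent but predicted
    tn = len(negatives - predicted_edges)    # absent and not predicted

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    total = tp + fn + fp + tn
    error_rate = (fp + fn) / total if total else float("nan")
    return sensitivity, specificity, error_rate


# Hypothetical held-out domain-protein edges.
candidates = {("d1", "p1"), ("d1", "p2"), ("d2", "p1"), ("d2", "p2")}
truth = {("d1", "p1"), ("d2", "p2")}
predicted = {("d1", "p1"), ("d1", "p2")}
sens, spec, err = edge_metrics(truth, predicted, candidates)
```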
Figure. Summary of metabolic network evolution of 13 species. The topology of the phylogenetic tree (shown by blue lines) is extracted from the National Center for Biotechnology Information (NCBI) taxonomy. Each node represents a contemporary or ancestral species marked with its taxonomy name and ID. Vertical positions of nodes denote the total sizes of their DPR networks (node number + edge number). Horizontal distances between two adjacent nodes denote the total numbers of network change events between the adjacent species pairs. The compositions (contributions) of network change types between each pair of adjacent species nodes are visualized as pie charts and placed along their edges. Prominent network change mechanisms include protein duplications (medium blue), protein deletions (light blue), protein-reaction edge additions (red), domain-protein edge additions (orange), reaction creations (light green), and domain creations (dark blue).
Four Categories of Metabolic Network Evolution and Their Constituent Pathways
| Category | Pathway | Pathway |
|---|---|---|
| Protein duplication | Fatty acid elongation | Fatty acid metabolism |
| | Alanine, aspartate, and glutamate metabolism | Valine, leucine, and isoleucine biosynthesis |
| | Biosynthesis of alkaloids | tRNA metabolism |
| | Starch degradation | Folate transformation |
| | Pyrimidine metabolism | Methionine degradation |
| | Nicotine degradation | Lysine, threonine, and methionine biosynthesis |
| | Glycolysis | TCA cycle |
| | Entner–Doudoroff pathway | Gluconeogenesis |
| | Serine-isocitrate lyase pathway | Peptidoglycan biosynthesis |
| | Aspartate superpathway | Ribose and deoxyribose phosphate degradation |
| | Arginine, ornithine, and proline metabolism | UDP-sugar interconversion |
| | Formaldehyde assimilation | FormylTHF biosynthesis |
| | Folate transformation | Heterolactic fermentation |
| Reaction creation | Steroid biosynthesis | Bile acid metabolism |
| | DDT degradation | Chlorocyclohexane and chlorobenzene degradation |
| | Benzene metabolism | Polyketide metabolism |
| | Peptidoglycan biosynthesis | Nitrotoluene degradation |
| | Indole alkaloid biosynthesis | Monoterpenoid biosynthesis |
| | Insect hormone biosynthesis | Palmitate biosynthesis |
| | Noradrenaline and adrenaline degradation | Actinorhodin biosynthesis |
| | Tryptophan degradation | Myristate biosynthesis |
| Protein generation | Oxidative phosphorylation | Purine metabolism |
| | Alanine metabolism | C5-branched dibasic acid metabolism |
| | Carbon fixation in prokaryotes | Thiamine metabolism |
| | Kinases | Phosphatidylinositol signaling system |
| | mTOR signaling pathway | KDO2-lipid A biosynthesis |
| | Lipopolysaccharide biosynthesis | Arginine and polyamine biosynthesis |
| | Aromatic compound degradation | Formaldehyde assimilation |
| | Phospholipid biosynthesis | Ascorbate biosynthesis |
| | Acetyl-CoA assimilation | Bifidum pathway |
| | PIP metabolism | Atrazine degradation |
| Novel domain/protein and reaction creation | Glycan biosynthesis | Glycosaminoglycan biosynthesis |
| | Glycosaminoglycan degradation | Inositol phosphate metabolism |
| | Glycosylphosphatidylinositol-anchor biosynthesis | Sphingolipid metabolism |
| | Glycosphingolipid biosynthesis | Biotin metabolism |
| | Dolichyl-diphosphooligosaccharide biosynthesis | Heparan sulfate biosynthesis |
| | Zymosterol biosynthesis | Cholesterol biosynthesis |
| | Thyronamine and iodothyronamine metabolism | Thyroid hormone metabolism |
| | BMP signaling pathway | Adenosylcobalamin biosynthesis |
| | NAD biosynthesis | Ergosterol biosynthesis |
Figure. General rules relating domains and reactions in the DPR networks. Left: Distribution of the fraction of reactions in the dominant EC class among all protein families. Horizontal axis: fraction of reactions in the dominant EC class. Vertical axis: distribution of the reaction fraction among all protein families. Right: Scatter plot of the numbers of domain creations and reaction creations along each branch of the species tree.
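The quantity on the horizontal axis of the left panel can be sketched as follows for a single protein family. This is an illustration only: the function name `dominant_ec_fraction` and the EC numbers in the example are hypothetical, and the sketch assumes that "EC class" means the first field of the EC number (the `level` parameter allows a finer grouping if a different definition is intended).

```python
from collections import Counter


def dominant_ec_fraction(ec_numbers, level=1):
    """Fraction of a family's reactions that fall in its most common EC class.

    ec_numbers: list of EC number strings such as "2.7.1.1"
    level:      number of leading EC fields that define a class (assumed 1)
    """
    if not ec_numbers:
        raise ValueError("the family has no annotated reactions")
    classes = Counter(".".join(ec.split(".")[:level]) for ec in ec_numbers)
    return max(classes.values()) / len(ec_numbers)


# Hypothetical family catalyzing four reactions, three of them transferases.
family = ["2.7.1.1", "2.7.1.2", "1.1.1.1", "2.7.11.1"]
fraction_top_level = dominant_ec_fraction(family)
fraction_third_level = dominant_ec_fraction(family, level=3)
```

A value close to 1 for most families would be consistent with the first principle stated in the abstract, that duplicated enzyme proteins possess similar catalytic functions.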