Patricia Ortegon, Augusto C Poot-Hernández, Ernesto Perez-Rueda, Katya Rodriguez-Vazquez.
Abstract
To understand how cellular metabolism has taken its modern form, the conservation and variation among metabolic pathways were evaluated using a genetic algorithm (GA). The GA approach considered information on the complete metabolism of the bacterium Escherichia coli K-12, as deposited in the KEGG database, and the enzymes belonging to a particular pathway were transformed into enzymatic step sequences (ESS) by using the breadth-first search (BFS) algorithm. These sequences represent contiguous enzymes linked to each other, based on their catalytic activities as encoded in their Enzyme Commission (EC) numbers. In a subsequent step, these sequences were compared using a GA in an all-against-all (pairwise) approach. Individual reactions were chosen based on their measure of fitness to act as parents of the offspring that constitute the new generation. The pairwise comparisons were used to construct a similarity matrix of fitness values, which was then clustered using a k-medoids algorithm. A total of 34 clusters of conserved reactions were obtained, and the sequences in each cluster were finally aligned with a multiple-sequence alignment (MSA) GA optimized to align all the reaction sequences included in that cluster. These comparisons showed that maps associated with the metabolism of similar compounds also contained similar enzymatic step sequences, reinforcing the Patchwork Model for the evolution of metabolism in E. coli K-12, an observation that can be extended to other organisms for which metabolic information is available. Finally, our mapping of these reactions is discussed, with illustrations from a particular case.
Keywords: Comparative genomics; Genetic algorithms; KEGG database; Metabolism; k-medoids
Year: 2015 PMID: 25973143 PMCID: PMC4423528 DOI: 10.1016/j.csbj.2015.04.001
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
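The conversion of a metabolic map into enzymatic step sequences, as outlined in the abstract, can be illustrated with a minimal Python sketch. It assumes a map is given as a directed graph whose nodes are EC numbers and whose edges are product-substrate relationships. It also assumes that each root-to-leaf path of the BFS tree yields one sequence; the record above does not state the exact construction rule, so this is one plausible reading rather than the authors' procedure. The function names and the toy graph are hypothetical and are not taken from KEGG.

```python
from collections import deque


def bfs_tree(graph, root):
    """Return the BFS tree rooted at `root` as a child -> parent map."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in parent:  # first visit fixes the tree edge
                parent[nxt] = node
                queue.append(nxt)
    return parent


def enzymatic_step_sequences(graph, root):
    """Assumed rule: one sequence per root-to-leaf path of the BFS tree."""
    parent = bfs_tree(graph, root)
    internal = {p for p in parent.values() if p is not None}
    leaves = [n for n in parent if n not in internal]
    sequences = []
    for leaf in leaves:
        path = []
        node = leaf
        while node is not None:  # walk back up to the root
            path.append(node)
            node = parent[node]
        sequences.append(path[::-1])
    return sequences


# Hypothetical toy map: nodes are EC numbers, edges are product-substrate links.
toy_map = {
    "2.7.1.1": ["5.3.1.9", "1.1.1.49"],
    "5.3.1.9": ["2.7.1.11"],
    "2.7.1.11": ["4.1.2.13"],
}
ess = enzymatic_step_sequences(toy_map, "2.7.1.1")
for seq in ess:
    print(" -> ".join(seq))
```

In this sketch the choice of initialization node is left to the caller, mirroring the selection of root nodes described for Fig. 1.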
Fig. 1. General strategy for the comparative analysis of E. coli K-12 metabolism. The metabolic maps from KEGG were converted to enzymatic step sequences (ESS) by using the breadth-first search (BFS) algorithm. For each map, a graphical representation was created in which nodes represent enzymes and edges represent product-substrate relationships. Then, a set of initialization nodes was selected (green arrowhead) as roots for BFS trees. Those trees were used as a guide for ESS construction. Afterwards, all the ESS were compared against each other by genetic algorithm (GA) pairwise alignments. The similarities among ESS were used to conduct a clustering analysis based on the k-medoids algorithm. Finally, clusters of similar sequences were aligned using a multiple-sequence alignment (MSA) approach.
Fig. 2. Statistics for the ESS per metabolic map. Only the 45 metabolic maps that generate at least one sequence are shown. The left panel shows the number of ESS generated per metabolic map; the right panel shows the distribution of the lengths of those sequences.
Fig. 3. Crossover for MSA of metabolic pathways.
Fig. 4. Distribution of ESS among the 34 identified clusters. Bars represent the number of sequences per cluster.
Fig. 5. MSA of the ESS included in cluster 21.
Fig. 6. Structural domain assignment according to the Superfamily database for proteins aligned in cluster 21. In panel A, all of the enzymes in the alignment are mapped onto the corresponding metabolic map. Panel B shows the results of the Superfamily domain assignments. Only the similar domains are indicated.
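The clustering step described in the abstract (k-medoids applied to a matrix of pairwise similarities) can be illustrated with a small Python sketch. Several details are assumptions, since the record above does not give them: similarities are taken to lie in [0, 1] and are converted to dissimilarities as 1 - s, the search is a simple swap-based (PAM-style) variant, and the starting medoids are simply the first k items. The function names and the 4x4 toy matrix are hypothetical; the study itself reports 34 clusters.

```python
def total_cost(dist, medoids):
    """Sum over all items of the distance to their nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))


def k_medoids(dist, k):
    """Swap-based k-medoids on a precomputed distance matrix (list of lists)."""
    n = len(dist)
    medoids = list(range(k))  # assumed start: the first k items
    best = total_cost(dist, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                # Try replacing one medoid with a non-medoid item.
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(dist, trial)
                if cost < best - 1e-12:  # accept only strict improvements
                    medoids, best, improved = trial, cost, True
    # Final assignment: each item joins its nearest medoid.
    labels = [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]
    return medoids, labels


# Hypothetical similarity matrix for four sequences (higher = more similar).
similarity = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
# Assumed conversion from similarity to dissimilarity.
distance = [[1.0 - s for s in row] for row in similarity]
medoids, labels = k_medoids(distance, k=2)
print(medoids, labels)
```

Because k-medoids only needs pairwise dissimilarities and always picks actual items as cluster centers, it fits data such as alignment scores where no meaningful "average sequence" exists.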