The evolutionary dynamics of functional modules and the extraordinary plasticity of regulons: the Escherichia coli perspective.

Gabriel Moreno-Hagelsieb, Petar Jokic

Abstract

Using profiles of phylogenetic profiles (P-cubic) we compared the evolutionary dynamics of different kinds of functional associations. Ordered from most to least evolutionarily stable, these associations were genes in the same operons, genes whose products participate in the same biochemical pathway, genes coding for physically interacting proteins and genes in the same regulons. Regulons showed the most plastic functional interactions with evolutionary stabilities barely better than those of unrelated genes. Further regulon analyses showed that global regulators contain less evolutionarily stable associations than local regulators. Genes co-repressed by global regulators had a higher evolutionary conservation than genes co-activated by global regulators. However, the reverse was true for genes co-repressed and co-activated by local regulators. Of all the regulon-related associations, the relationship between regulators and their target genes showed the most evolutionary stability. Different negative data sets built to contrast against each of the analysed kinds of modules also differed in evolutionary conservation revealing further underlying genome organization. Applying P-cubic analyses to other genomes might help visualize genome organization, understand the evolutionary importance and plasticity of functional associations and compare the quality of data sets expected to reflect functional interactions, such as those coming from high-throughput experiments.

Year:  2012        PMID: 22618875      PMCID: PMC3424573          DOI: 10.1093/nar/gks443

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

The main idea behind phylogenetic profiles is that if the products of two genes have interdependent functions, both genes should be either present or absent within a given genome (1–3). Accordingly, previous work has shown that genes in the same operons, that is, adjacent genes transcribed into a single messenger ribonucleic acid (4,5), tend to have more similar profiles than adjacent genes in different transcription units (6,7). However, the co-occurrence (co-occurring pairs of genes divided by the sum of co-occurring pairs plus genes that have lost their partners) of genes in operons of Escherichia coli can be as low as 0.2 in Archaea [updated data following (6)]. The loss of a functionally related partner could be due to false negatives, where the tools available can no longer find an orthologous gene; to non-orthologous gene displacement (8); and/or to a particular function being unnecessary in a different environment. The loss of a gene partner might also reflect functional divergence, where the product of the remaining gene might be associated with a different cellular process, perhaps in conjunction with other gene products. In other words, functionally related genes in one organism might not be functionally related in another. As the co-occurrence analyses mentioned earlier suggest that not all the functional associations of known operons are stable across organisms, genomic context tools, such as phylogenetic profiles, can be used to study and compare the evolutionary stability of other functional associations. In line with this idea, Snel and Huynen (9) used phylogenetic profiles to explore whether functional modules are also evolutionary modules. They analysed particular groups of genes within different kinds of functional gene modules.
Herein, instead of studying such particular groups of genes (single operons and single regulons), we aimed to compare the evolutionary plasticity of different, experimentally determined types of functional associations (for instance, all genes known to be associated into operons compared against all genes in known regulons) using phylogenetic profiles. The objective was to compare the relative evolutionary stability of these kinds of associations. We have chosen functional modules of E. coli K12 because there are high-quality literature-derived databases of several types of experimentally determined modules in this model organism (Figure 1). The types of gene modules analysed were (i) operons (4,5), taken from RegulonDB (10,11); (ii) genes whose products participate in the same metabolic pathway, taken from EcoCyc (12,13); (iii) regulons, genes regulated by the same transcription factor (TF), also taken from RegulonDB and (iv) genes coding for physically interacting proteins (13–17).
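The co-occurrence ratio defined in the Introduction can be made concrete with a small sketch. It assumes that a gene pair counts as co-occurring when both members have orthologs in a genome, and as a lost partner when exactly one member does; the function name and the toy gene labels are hypothetical and are not taken from the original analysis.

```python
def co_occurrence(pairs, genome_genes):
    """Co-occurring pairs divided by (co-occurring pairs + pairs with a lost partner).

    pairs        : iterable of (gene_a, gene_b) tuples, e.g. same-operon pairs
    genome_genes : set of reference genes that have an ortholog in the target genome
    """
    both = 0          # both genes of the pair are present
    lost_partner = 0  # exactly one gene of the pair is present
    for a, b in pairs:
        has_a, has_b = a in genome_genes, b in genome_genes
        if has_a and has_b:
            both += 1
        elif has_a or has_b:
            lost_partner += 1
    total = both + lost_partner
    # Pairs absent altogether carry no information about partner loss.
    return both / total if total else float("nan")


# Toy example with hypothetical gene labels.
operon_pairs = [("a", "b"), ("c", "d"), ("e", "f")]
present = {"a", "b", "c"}
ratio = co_occurrence(operon_pairs, present)
```

In the toy example one pair co-occurs and one pair has lost a partner, so the ratio is 0.5; the third pair is absent from the genome and is ignored.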
Figure 1.

Venn diagram showing the overlaps between the four different kinds of functional modules from E. coli K12 substr. MG1655 analysed in this work. The few same-regulon genes overlapping with the same-operon data set are due to operons with internal promoters.

DATA AND METHODS

Phylogenetic profiles for each gene were represented as vectors where each item represented either the presence (number 1) or the absence (number 0) of an ortholog to the gene within a genome (18) (Figure 2). There are more elaborate vectors in which each presence is annotated with a number related to the score of the alignment between the gene and the corresponding ortholog (19). However, our preliminary results did not show a significant difference between such elaborate vectors and simpler binary vectors. Thus, we used the binary (1/0) vectors to calculate mutual information scores for the phylogenetic profiles of all possible pairs of genes within the genome of E. coli K12 MG1655, using the formula described previously (19–21).
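Assuming that the formula referred to is the standard definition of mutual information for two binary profiles, MI(X, Y) = Σ p(x, y) · log2[ p(x, y) / (p(x) · p(y)) ] summed over x, y in {0, 1}, the calculation can be sketched as follows. The base-2 logarithm is an assumption made here because the Results report values in bits; the function name is hypothetical, and the exact form used in (19–21) may differ.

```python
import math
from collections import Counter


def mutual_information(profile_x, profile_y):
    """Mutual information (in bits) between two equal-length binary profiles."""
    if len(profile_x) != len(profile_y):
        raise ValueError("profiles must cover the same set of genomes")
    n = len(profile_x)
    joint = Counter(zip(profile_x, profile_y))  # counts of (x, y) combinations
    count_x = Counter(profile_x)                # marginal counts for gene X
    count_y = Counter(profile_y)                # marginal counts for gene Y
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        p_x = count_x[x] / n
        p_y = count_y[y] / n
        # Only observed combinations are visited, so p_xy > 0 here.
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


# Two genes with identical presence/absence patterns across four genomes.
identical = mutual_information([1, 1, 0, 0], [1, 1, 0, 0])
# Two genes whose patterns are statistically independent.
independent = mutual_information([1, 1, 0, 0], [1, 0, 1, 0])
```

With these toy profiles the identical patterns give 1 bit and the independent patterns give 0 bits, which matches the intuition that mutual information measures coincidence beyond chance.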
Figure 2.

Overview of the P-cubic method. Mutual information can be thought of as a measure of how much two patterns coincide beyond what would be expected by chance. When the pattern for two proteins is almost the same, that is, the two proteins tend to co-occur across genomes, their mutual information is higher than when the patterns do not show co-occurrence. For example, although the number of ‘1’s is approximately the same for genes A and C, their mutual information is low because their co-occurrences are more likely random. Genes whose products interact are expected to co-occur. This is not always the case, but the tendency is measurable as a higher proportion of co-occurring pairs than there would be among genes whose products are independent from each other (do not work together). A caveat of mutual information, however, is that if genes are abundant (or the opposite), then even though they might tend to co-occur, the patterns of co-occurrence might not result in high mutual information (genes E and F). As all gene pair sets have mutual information of 0 or better, all P-cubic curves start at ‘0’ [ln(1) = 0]. As the mutual information threshold increases, the proportion of gene pairs with that mutual information or better should decrease, more so for gene pairs that do not work together (less co-occurrence) than for genes whose products functionally interact (more co-occurrence).

Our working definition of orthology consisted of BLASTP reciprocal best hits and fusions as described elsewhere (22). We ran BLAST+ (23) to compare all the proteins annotated within ∼1300 prokaryotic genomes available at RefSeq (24,25) (ftp://ftp.ncbi.nih.gov/genomes/Bacteria/) by May 2011. The E value cutoff was 1E−6, with a database size fixed at 5E8 (-dbsize 500 000 000), soft filtering of low information content sequences (-seg yes -soft_masking true) and a final Smith-Waterman alignment (-use_sw_tback). 
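The reciprocal-best-hit part of this orthology definition can be illustrated with a minimal sketch. It assumes that BLASTP results have already been parsed into tuples of (query, subject, E value, bit score, query coverage, subject coverage); that tuple layout and the function names are hypothetical. The sketch applies the 1E−6 E value cutoff and the 50% coverage requirement stated in the Methods, and it does not attempt to detect fusions, which the procedure in (22) also handles.

```python
def best_hits(hits, max_evalue=1e-6, min_coverage=0.5):
    """Return {query: subject}, keeping the highest-scoring acceptable hit per query."""
    best = {}
    for query, subject, evalue, bitscore, qcov, scov in hits:
        # Coverage is accepted if either sequence is covered by at least min_coverage.
        if evalue > max_evalue or max(qcov, scov) < min_coverage:
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(hits_a_vs_b, hits_b_vs_a, **filters):
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    forward = best_hits(hits_a_vs_b, **filters)
    backward = best_hits(hits_b_vs_a, **filters)
    return {(a, b) for a, b in forward.items() if backward.get(b) == a}


# Toy hits with hypothetical protein identifiers.
a_vs_b = [
    ("a1", "b1", 1e-50, 200.0, 0.9, 0.9),
    ("a1", "b2", 1e-10, 80.0, 0.8, 0.7),
    ("a2", "b2", 1e-3, 30.0, 0.9, 0.9),   # fails the E value cutoff
]
b_vs_a = [
    ("b1", "a1", 1e-50, 200.0, 0.9, 0.9),
    ("b2", "a1", 1e-10, 80.0, 0.7, 0.8),  # best hit of b2 is a1, but not reciprocal
]
orthologs = reciprocal_best_hits(a_vs_b, b_vs_a)
```

A presence (1) in the phylogenetic profile of a reference gene would then correspond to that gene appearing in the resulting set for the genome being compared.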
We also required coverage of at least 50% of any of the sequences in the alignment. The phylogenetic profiles were built using a non-redundant genome subset obtained as described elsewhere (26). Genomes smaller than 2.5 Mbp were not used for the analyses because obligate parasites and symbionts, which have severely reduced genomes, tend to lack TFs (27).
We compared data from four different kinds of modules of functionally related genes as follows: genes in the same operons, genes in the same biochemical pathways, genes in the same regulons and genes coding for physically interacting proteins (Figure 1).
To build data sets of pairs of genes within operons and of genes at transcription unit boundaries (TU borders), we used the current data set of transcription units of E. coli K12 substr. MG1655 (28) found in RegulonDB (10,11) as explained previously (29,30). The current data sets contain 736 same-operon pairs and 567 TU borders. The operon data set included pairs of genes in the same operon even if they were not immediately adjacent to each other. We complemented TU borders with adjacent genes in different strands (divergently transcribed TU borders and convergently transcribed TU borders). This way the total number of operon pairs increased to 2536 and the total of TU borders increased to 1765.
To derive pairs of genes whose protein products participate in the same pathway, we used the EcoCyc database (12,13). As a contrasting data set, we used pairs of genes in different pathways constructed using pathways with no single gene in common. If the comparison of two pathways resulted in a single pair of genes found in the same-pathway set, the whole data set of different-pathway genes derived from such comparison was eliminated. The procedures resulted in 9524 same-pathway and 359 392 different-pathway pairs.
Physically interacting pairs (PPI) were genes coding for protein–protein interactions in E. coli as found by high-throughput methods (15–17), as well as manually curated interactions from low-throughput experiments, curated out of the database of interacting proteins (14) and the EcoCyc database (13,17). Negatives consisted of genes whose products are found in different compartments (17).
Genes in the same regulon were also derived from data in RegulonDB. As we wanted to explore the stability of the functional associations implied by the co-regulation due to the TF, we included only genes in different transcription units regulated by the same TF as same-regulon pairs. This way the data set remained exclusively composed of same-regulon pairs, rather than a combination of same-regulon and operon pairs. The difference with the TU borders mentioned earlier is the co-regulation by a common TF and that the transcription units can be anywhere in the genome. As some transcription units can be regulated by several TFs, we took care not to use any pair more than once. Genes in different regulons were built by comparing two regulons at a time. The compared regulons could not contain a single gene in common. Also, if the comparison of two regulons produced a pair of genes present in the same-regulon data set, the whole set of different-regulon pairs resulting from the comparison of the two regulons was eliminated. The procedure resulted in 92 376 same-regulon pairs and 140 259 different-regulon pairs. We did not consider genes transcribed by the same sigma factors as regulons.
Note that the current data set of genomes contains several obligate symbionts and parasites. These organisms tend to display degraded genomes where the missing genes might have been lost because of the lack of selective pressure to keep them under this very particular lifestyle. A gene group particularly missing in obligate parasites is that of genes coding for TFs (27). 
Thus, the phylogenetic profiles for all the analyses presented, and the resulting P-cubic curves, did not include organisms with genomes smaller than 2.5 Mbp to avoid biases. Figures in colour were produced in a colour-blind-friendly palette as suggested at: http://jfly.iam.u-tokyo.ac.jp/color/. A simple PERL module that we use to establish the palette for use in GNUPLOT and in LaTeX is offered ‘as is’ at: http://microbiome.wlu.ca/palette.
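The rule used in the Methods to build the contrasting different-pathway and different-regulon sets can be sketched as follows. The sketch assumes that modules are given as a mapping from a module name to a set of gene names; the function names and the toy modules are hypothetical. For regulons, the positive set in the Methods additionally excludes genes in the same transcription unit, which this simplified version does not model.

```python
from itertools import combinations, product


def same_module_pairs(modules):
    """All unordered gene pairs that share at least one module."""
    pairs = set()
    for genes in modules.values():
        pairs.update(frozenset(p) for p in combinations(sorted(genes), 2))
    return pairs


def different_module_pairs(modules, positives):
    """Gene pairs drawn from two modules that have no gene in common.

    A whole module-versus-module comparison is discarded if it yields
    even one pair already present in the positive set.
    """
    negatives = set()
    for name_a, name_b in combinations(sorted(modules), 2):
        genes_a, genes_b = modules[name_a], modules[name_b]
        if genes_a & genes_b:
            continue  # the two modules share a gene
        candidate = {frozenset(p) for p in product(genes_a, genes_b)}
        if candidate & positives:
            continue  # comparison produced a known same-module pair
        negatives |= candidate
    return negatives


# Toy modules with hypothetical gene labels.
pathways = {
    "p1": {"a", "b"},
    "p2": {"b", "c"},
    "p3": {"d", "e"},
    "p4": {"a", "d"},
}
positives = same_module_pairs(pathways)
negatives = different_module_pairs(pathways, positives)
```

In the toy example only the comparison of p2 with p3 survives: p1 and p3 are disjoint but produce the pair (a, d), which is a same-pathway pair through p4, so that whole comparison is dropped.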

RESULTS AND DISCUSSION

Proof of concept: genes in the same operon are evolutionarily more stable than genes at TU borders

The contrast attainable in comparing profiles of phylogenetic profiles (P-cubic) is shown by using genes in the same operon, compared with genes at different transcription units (Figure 3a). Our measure of co-occurrence was mutual information (see ‘Data and Methods’ section). The phylogenetic profiles of functionally related genes should display higher mutual information than those of unrelated genes. Accordingly, the P-cubic analysis used to display and compare the evolutionary stability of different data sets consists of graphs showing the drop in the proportion of pairs of genes remaining as the mutual information threshold increases. Although a curve of direct proportion values could be used, taking the logarithm of these proportions helps to better compare the curves corresponding to each data set at higher mutual information thresholds.
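A P-cubic curve as described here can be sketched in a few lines. The sketch assumes that the mutual information of every gene pair in a data set has already been computed, and that the natural logarithm is used, as the ‘ln(1) = 0’ in the Figure 2 caption suggests; the function name and the toy scores are hypothetical.

```python
import math


def p_cubic_curve(mi_scores, thresholds):
    """Return (threshold, ln of the proportion of pairs with MI >= threshold)."""
    n = len(mi_scores)
    curve = []
    for t in thresholds:
        remaining = sum(1 for score in mi_scores if score >= t)
        if remaining == 0:
            break  # the logarithm is undefined once no pairs remain
        curve.append((t, math.log(remaining / n)))
    return curve


# Toy mutual information scores for four gene pairs.
scores = [0.0, 0.1, 0.3, 0.5]
curve = p_cubic_curve(scores, [0.0, 0.2, 0.4, 0.6])
```

Every curve starts at 0 because all pairs have mutual information of at least 0. Plotting the curves of two data sets over the same thresholds then shows which set retains a larger proportion of pairs at high mutual information.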
Figure 3.

Profiles of phylogenetic-profile (P-cubic) are useful to compare the evolutionary stability of different gene sets. Sets of gene-pairs with the most evolutionarily stable functional interactions would have a higher proportion of pairs with high mutual information, thus their curves should drop less than those of unrelated genes. Accordingly, genes in the same operon (WO pairs), which are functionally related, show a higher P-cubic curve than genes at TU borders, which are not necessarily functionally related (a). Also in (a), as reported previously (31), convergently transcribed genes (cTU borders) are the least related of all adjacent genes in different TUs, followed by divergently transcribed TU borders (dTU borders) and adjacent TU borders in the same strand (TU borders). (b) Genes in the same biochemical pathway have an evolutionarily stable relationship when compared with genes in different pathways. However, the relationship is less stable than that among genes in the same operon. (c) Genes producing proteins that physically interact have a less stable functional relationship than genes in operons. The higher mutual information of verified protein–protein interactions shows that P-cubic analyses are also useful to verify the quality of large experimental data sets. (d) Genes in the same regulon have higher mutual information than genes in different regulons. However, the relationship seems to be subtle and so plastic throughout evolution that the P-cubic of genes in regulons is close to that of functionally unrelated genes. Transcriptional regulation might evolve very fast and be a major source of functional diversity and adaptation.

The P-cubic of operons drops slowly compared with those of the different sets of TU borders as the mutual information increases (Figure 3a). This is expected because operons are mainly formed of functionally related genes, and functionally related genes should have a higher tendency to co-occur than non-related genes. 
Different sets of TU borders, namely co-directional TU borders, divergently transcribed genes and convergently transcribed genes, also display differences in their co-occurrence (Figure 3a). Same-strand TU borders show the highest mutual information, meaning that a higher proportion of these gene pairs might be functionally related than those in the other TU border categories. The least related were the convergent TU borders. These results are in agreement with previous work showing that divergent TU borders have stronger tendencies towards conservation of gene order than convergent TU borders (31) and with work showing that some co-directional TU borders also have functional associations (31–33). Thus, the P-cubic reflects both the proportions of functionally related pairs of genes and the evolutionary stability of such associations. As most genes in operons are known, or expected, to have functional interactions, the main component of the operon curve should correspond to the evolutionary stability of the functional association.

Genes in the same biochemical pathway are less evolutionarily stable than genes in the same operon

The next comparison consisted of genes in the same biochemical pathway against genes in different pathways. As expected again, genes in the same pathway show higher mutual information than those in different pathways (Figure 3b). However, same-pathway genes are not as evolutionarily stable as those in the same operon. Genes in the same operon are often thought to consist of genes whose products participate in the same biochemical pathways [see for instance (21,34)]. Accordingly, there is an overlap between the two data sets, operon pairs and same-pathway pairs, of 447 gene pairs. This number constitutes 17.6% of the operon pairs and 5.4% of the same-pathway pairs (Figure 1). Genes in different pathways present a P-cubic curve that drops faster than the curve of overall TU borders (Figure 3b). Different-pathway genes are a large data set (359 392 pairs of genes) and, thus, make a cleaner and smoother negative sample than overall TU borders (1765 pairs). Thus, we decided to use the different-pathway set as a negative contrasting data set for the following analyses.

Genes coding for physically interacting proteins are less evolutionarily stable than genes in the same operon

To test whether genes coding for physically interacting proteins (PPI for protein–protein interactions) form an evolutionarily stable gene module, we compared their P-cubic with that of genes in the same operon. Only a small proportion of the proteins involved in PPIs are encoded by adjacent genes. Accordingly, only 28 of the manually curated low-throughput PPIs, and 47 of the 6047 total PPIs from Butland et al. (15) (21 of them among the 716 verified PPI pairs), are also in the operon data set. The data set of genes coding for manually curated PPIs shows the highest mutual information of all PPI sets, only slightly lower than that of operon pairs. The P-cubic comparison of operon pairs and PPI pairs shows that the functional association of genes in the same operon might be more stable throughout evolution than the functional association of genes coding for physically interacting proteins (Figure 3c). Also noteworthy, the data set of verified PPIs contains a higher proportion of evolutionarily stable pairs of genes than the non-verified data set. This result shows that P-cubic comparisons can also be used to compare the quality of different high-throughput experimental data sets. The negative PPI set, genes whose proteins are found in different cellular compartments, shows an interesting curve. A few pairs of genes show such high mutual information that they twist the curve, so that it does not drop as much as that for genes in different pathways. This makes sense given that proteins in different cellular compartments will not interact physically, but they still might have a functional association. Thus, similar to TU borders, the curve seems to reflect the presence of a proportion of functionally associated gene pairs.

Genes in the same regulon have the most evolutionarily plastic functional associations

The data set of genes in the same regulons contained very few pairs in common with the operons data set because we were interested in the association arising from the co-regulation brought about by the TF. The comparison of co-regulated (same regulon) genes against genes in different regulons shows a higher stability for the genes expected to be functionally associated (Figure 3d). However, same-regulon pairs display much lower mutual information than genes in the same operons and noticeably close to that of genes in different pathways. In other words, it would seem that pairs of genes in different pathways are almost as strongly associated as genes regulated by the same TF. To better understand the evolutionary plasticity of the functional associations by co-regulation, we separated the regulon data into categories. We first separated the data into those involving global TFs and those involving local TFs as defined elsewhere (35). Each data set (overall, global and local) was separated into activation (positively co-regulated transcription units), repression (negatively co-regulated transcription units) and dual (dually co-regulated transcription units). The rationale being that if two transcription units are regulated in the same way, then it should be more probable for their gene products to have a stable functional relationship. Although the P-cubic of any of these data sets shows better evolutionary stability than that of genes in different regulons (Figure 4a), genes related by co-activation presented a higher P-cubic than genes related by co-repression. This makes sense because activation requires more information than repression, as for the latter it is enough for a protein to bind at an appropriate site impeding, for instance, the binding of a sigma factor. This has been the argument used to explain why repressors are the most abundant kind of TFs (36). 
However, co-activated genes still showed a lower P-cubic than any of the other functionally related groups analysed and very close to the P-cubic of TU borders (compare to Figure 3). In contrast to the overall results, if we analyse regulons involving global regulators (Figure 4b), we find co-activated genes to be the least conserved, showing worse conservation than genes in different regulons. This result contradicts the rationale given earlier that repression requires less information than activation. We suggest that the results are due to the dual nature of global regulators. As global TFs perform both activation and repression, they already have a way of interacting with sigma factors to provide activation and, thus, acting as repressors or as activators does not make much of a difference. Accordingly, local TFs, most of which act as either repressors or activators, show higher conservation of co-activated gene pairs than of co-repressed pairs (Figure 4c). The reversal between co-activated and co-repressed P-cubic pairs seen when comparing overall regulons with global TFs is explained by the fact that genes co-activated by local TFs, which tend to be better conserved, constitute close to 26% of the overall co-activated pairs. Dually co-regulated genes show a somewhat higher P-cubic than co-repressed and co-activated pairs; however, they contain no pairs with mutual information much higher than 0.4 bits in either the overall or the global TF analyses and no pairs with mutual information higher than 0.2 bits in the local TF analysis.
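The partition of regulon data into global or local TFs and into activation, repression and dual categories can be sketched as follows. The input layout, a list of (TF, transcription unit, effect) records with effects written as "+", "-" or "+-", is an assumption for illustration, as are the function name and the toy identifiers. Pairs of transcription units that the same TF regulates in different ways are simply skipped here; the source does not state how such pairs were handled.

```python
from collections import defaultdict
from itertools import combinations

EFFECT_LABELS = {"+": "activation", "-": "repression", "+-": "dual"}


def classify_coregulated_pairs(interactions, global_tfs):
    """Group pairs of transcription units co-regulated in the same way by one TF.

    Returns {(scope, category): set of (tu_a, tu_b)}, where scope is
    'global' or 'local' and category is 'activation', 'repression' or 'dual'.
    """
    targets_by_tf = defaultdict(dict)
    for tf, tu, effect in interactions:
        targets_by_tf[tf][tu] = effect

    groups = defaultdict(set)
    for tf, targets in targets_by_tf.items():
        scope = "global" if tf in global_tfs else "local"
        for tu_a, tu_b in combinations(sorted(targets), 2):
            if targets[tu_a] != targets[tu_b]:
                continue  # assumption: mixed-effect pairs are left unclassified
            groups[(scope, EFFECT_LABELS[targets[tu_a]])].add((tu_a, tu_b))
    return dict(groups)


# Toy regulatory interactions with hypothetical TF and TU identifiers.
interactions = [
    ("tfG", "tu1", "+"),
    ("tfG", "tu2", "+"),
    ("tfG", "tu3", "-"),
    ("tfL", "tu4", "-"),
    ("tfL", "tu5", "-"),
]
groups = classify_coregulated_pairs(interactions, global_tfs={"tfG"})
```

Each resulting group of transcription-unit pairs would then be expanded into gene pairs and given its own P-cubic curve, as in Figure 4.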
Figure 4.

P-cubic comparison of regulon subsets. Herein, we compared the P-cubic of positively co-regulated genes (activation), negatively co-regulated genes (repression) and the relationship between TFs and their TGs (TF–TG). (a) Overall, the most stable functional association corresponds to that between TFs and their TGs. Co-activated genes and dually regulated genes are next in stability with the lowest stability presented among co-repressed gene pairs. The relationships change when we analyse the subset of regulatory interactions by global regulators in (b). Although the TF–TG relationship shows the highest stability again, co-activated genes show a lower conservation than genes in different regulons, whereas dually co-regulated genes show the highest conservation. In the analyses involving local regulators (c), co-activated genes show more conservation than the TF–TG relationship. In agreement with previous results about the conservation of the TF–TG relationship among evolutionarily close Enterobacteria (38), we found that a higher proportion of repressor–TG relationships attain a higher mutual information than activator–TG relationships. However, even the most conserved interactions brought about by TFs remain close to those among overall TU borders shown in Figure 3b.

We also explored the relationship between TFs and their target genes (TGs). In agreement with a previous report showing that TFs and TGs seem to evolve independently (37), the P-cubic of the TF/TG interactions is close to that of overall TU pairs (compare the curves of TF–TG in Figure 4 with curves in Figure 3). However, except for regulons involving local TFs, it shows the highest P-cubic among the TF association groups analysed (Figure 4). A previously published analysis found that the TF/TG association for positively regulated genes was less conserved among Enterobacteria than that of negatively regulated genes (38). 
To test whether we had a similar result using P-cubic, we separated our TF/TG pairs into activated, repressed and dually regulated (Figure 4d). In agreement with those previous results, the TF/TG P-cubic does not show activation TF/TG pairs with higher mutual information than 0.4 bits. Thus, our results confirm the previous finding. However, we note that despite this difference in TF/TG conservation, neither set seems to be more conserved than TU borders. Given the results in this section, relationships arising from co-regulation are the most plastic of all the gene associations tested. Previous work has suggested that co-regulation is not well conserved in evolution [see for instance (37,39)], whereas other analyses have suggested high conservation [see for instance (40)]. Although more particular analyses are necessary, the results herein show that neither the association of genes by co-regulation nor the regulation of a gene by a particular TF is much more conserved than the relationship of genes with little evidence for a functional interaction. In other words, it would seem as if the evolution of operators, the DNA motif where a TF binds (5), is independent from the evolution of the genes they regulate. This suggestion is also in agreement with previous work suggesting that operators can evolve quickly (41).

CONCLUDING REMARKS

This work uses profiles of phylogenetic profiles (P-cubic) to compare the evolutionary stability of functional associations of different gene relationships, gaining insight into the structure and evolution of the genome. The results led us to conclude that conservation of interactions is not as ubiquitous as believed previously. Four aspects of gene relationships have been addressed in this work: genes in the same operon, genes in the same biochemical pathway, genes coding for physically interacting proteins and genes in the same regulons. One of the expected results is that the P-cubic of gene pairs within operons would be more stable than that of otherwise adjacent genes; in other words, they would show more enduring relationships than the genes at the borders of transcription units. This particular result showing the general functional relationship of genes within an operon has been suggested in previous work (7,26,31). Thus, this first result also served as a confirmation of the concept. Genes in different pathways show the lowest conservation of all negative sets, as those genes would rarely be expected to function as partners in the organism. Their lower co-occurrence compared with that of any TU borders set supports the idea of an underlying genome organization that keeps transcription units of related functions within a close network. Despite the fact that TU borders might be the proper contrast against operons, they might still contain some proportion of gene pairs with related functions. This should be expected given that previous works have shown that genes in the same biochemical pathways tend to be closer in the chromosome than would be expected by chance (42). The analysis shows that pairs of genes coding for physically interacting proteins are not as conserved as operon pairs; this can mainly be attributed to protein redundancy within the cell. 
Losing the physical interaction in a protein interactome does not mark a loss of functional interaction, but rather an evolutionary replacement of a protein with one that has a higher efficiency or, more commonly, a shuffling of amino acid composition through evolutionary time (43). The finding with the most wide-ranging implications is the counterintuitive result stemming from a comparison of genes within a regulon, as opposed to gene pairs outside regulons and gene pairs in the other three gene modules tested. When different-regulon pairs and same-regulon pairs are compared, the same-regulon pairs show greater conservation over increasing mutual information than genes in different regulons. This is to be expected, as genes within the same regulons are more likely to be part of the same functional module than genes regulated differently. However, same-regulon pairs are only slightly more conserved than pairs in different regulons, and both are overall less evolutionarily stable than any of the other gene modules. Although it has been previously postulated that regulons show plasticity (37,39), the relationship has not been compared with other gene modules. From this analysis (Figure 3d) it is clear that regulon conservation is only slightly higher than that of baseline results from gene pairs in different pathways. This suggests that regulons are the gene modules that evolve the most readily and that they offer more information on the evolution of prokaryotic organisms than previously thought (44). Comparing the plasticity of regulatory interactions with that of acquisition of genes by horizontal gene transfer, and gene loss, is not possible by this method.
It is noteworthy, however, that many genomic islands, such as pathogenicity islands, contain TFs (45) and that a high proportion of genes coding for regulatory proteins in E. coli K12 might come from horizontal gene transfer (46). This points to a possible relationship between regulatory plasticity and the plasticity of gene content, as the quick evolution of transcriptional regulation might allow these islands to quickly become part of the network of functional interactions of their hosts. Such a relationship could be worth exploring in future research. Individual regulatory changes have been implicated in large evolutionary changes of both eukaryotes (47,48) and prokaryotes (37,39), but it has not been clear how far this line of evidence extends, owing to the inherent difficulty of comparing many individual regulatory systems between species. This work presents a broad view over a large data set of gene pairs in regulons to show that their evolutionary plasticity is great enough to account for an important part of variation at the species level in prokaryotes. One can thus suggest that the regulatory networks of all life forms are governed by the same principle (49), which offers deep insights into evolutionary theory.

FUNDING

Discovery grant from Natural Sciences and Engineering Research Council of Canada (to G.M.-H.). Funding for open access charge: Natural Sciences and Engineering Research Council of Canada. Conflict of interest statement. None declared.