Literature DB >> 33856016

Universal Constraints on Protein Evolution in the Long-Term Evolution Experiment with Escherichia coli.

Rohan Maddamsetti1.   

Abstract

Although it is well known that abundant proteins evolve slowly across the tree of life, there is little consensus for why this is true. Here, I report that abundant proteins evolve slowly in the hypermutator populations of Lenski's long-term evolution experiment with Escherichia coli (LTEE). Specifically, the density of all observed mutations per gene, as measured in metagenomic time series covering 60,000 generations of the LTEE, significantly anticorrelates with mRNA abundance, protein abundance, and degree of protein-protein interaction. The same pattern holds for nonsynonymous mutation density. However, synonymous mutation density, measured across the LTEE hypermutator populations, positively correlates with protein abundance. These results show that universal constraints on protein evolution are visible in data spanning three decades of experimental evolution. Therefore, it should be possible to design experiments to answer why abundant proteins evolve slowly.
© The Author(s) 2021. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

Entities:  

Keywords:  experimental evolution; protein evolution; purifying selection

Mesh:

Substances:

Year:  2021        PMID: 33856016      PMCID: PMC8233687          DOI: 10.1093/gbe/evab070

Source DB:  PubMed          Journal:  Genome Biol Evol        ISSN: 1759-6653            Impact factor:   3.416


Significance

A universal evolutionary pattern is that highly abundant and highly interacting proteins evolve slowly. This pattern was discovered in analyses that cover millions of years’ worth of sequence variation, so it is not clear how long it takes (decades, centuries, millennia) for such patterns to emerge. Here, I report that this universal evolutionary pattern emerges in metagenomic data that cover just 30 years of experimental evolution.

Introduction

One consequence of the high complexity and intricate functional organization of organisms is that most mutations are deleterious. Natural selection resists the loss of function and fitness caused by mutation accumulation over time (Leiby and Marx 2014; LaBar and Adami 2017; Grant et al. 2021). This process, called purifying selection, maintains the complexity and functional integrity of evolved organisms. Despite its importance, purifying selection has been little studied in experimental systems (Alvarez-Ponce et al. 2016), in contrast to adaptive evolution (Barrick and Lenski 2013). In two recent papers, my colleagues and I reported evidence for purifying selection in metagenomic time series of Lenski’s long-term evolution experiment with Escherichia coli, often called the LTEE for short (Lenski et al. 1991; Good et al. 2017). We considered the molecular evolution of the six hypermutator LTEE populations, which have elevated mutation rates due to evolved defects in DNA repair (Tenaillon et al. 2016; Maddamsetti and Grant 2020a). These populations continue to increase in fitness due to adaptive evolution, even though genome evolution in these populations largely reflects the accumulation of nearly neutral mutations (Couce et al. 2017). In Grant et al. (2021), we reported evidence for purifying selection on aerobic- and anaerobic-specific genes in E. coli. In Maddamsetti and Grant (2020b), we then reported evidence for purifying selection on genes that were found to be essential in the ancestral LTEE strain, REL606, in a transposon mutagenesis screen (Couce et al. 2017). Here, I report evidence that purifying selection in the LTEE reflects a universal constraint on protein evolution found across the tree of life, namely that highly abundant and highly interacting proteins evolve slowly (Fraser et al. 2002; Hahn et al. 2004; Drummond et al. 2005; Hahn and Kern 2005; Drummond and Wilke 2008; Alvarez-Ponce et al. 2017). Despite the universality and simplicity of this pattern of purifying selection, its proximate causes continue to be debated (Plata et al. 2010; Plata and Vitkup 2018; Razban 2019; Usmanova et al. 2021). A number of compelling hypotheses have been proposed, but consensus has not been reached. The findings reported here will not settle this debate. Nonetheless, an important consequence of my findings is that it may be possible to resolve the causes of this universal pattern by experimental means.

Results

Rationale and Study Design

This study takes a novel approach to study the anticorrelation between protein abundance and evolutionary rates (Pál et al. 2001; Drummond et al. 2005, 2006; Drummond and Wilke 2008; Lobkovsky et al. 2010; Yang et al. 2010; Wylie and Shakhnovich 2011; Serohijos et al. 2012; Serohijos and Shakhnovich 2014). In this section, I present the logical structure of the hypotheses and predictions under consideration and explain the methods that I use (fig. 1).

Study design. (A) Many studies have reported that highly abundant proteins evolve slowly. If this fact is caused by purifying selection, then mutations in highly abundant proteins, should be more deleterious than mutations in less abundant proteins, on average. This logic leads to the prediction that highly abundant proteins should have fewer observed mutations than less abundant proteins across the hypermutator populations of the LTEE, taking gene length into account. (B) Previous studies inferred evolutionary rates using DNA and protein sequence comparisons across species. (C) This study sums all observed mutations per gene in metagenomic time series of the long-term evolution experiment with Escherichia coli (LTEE), considering nonmutator and hypermutator populations separately. This approach increases statistical power over a rate-based approach and is affected by neither clonal interference nor frequency-dependent selection. To give a concrete example, the top panel in (C) shows the number of observed mutations (stars) in the adhesin gene yeeJ in the nonmutator population Ara − 6 over 60,000 generations. The bottom panel in (C) shows the number of observed mutations (stars) in yeeJ in the hypermutator population Ara + 6 over the same period. For comparison across genes, the number of observed mutations is normalized by gene length.

Study design. (A) Many studies have reported that highly abundant proteins evolve slowly. If this fact is caused by purifying selection, then mutations in highly abundant proteins, should be more deleterious than mutations in less abundant proteins, on average. This logic leads to the prediction that highly abundant proteins should have fewer observed mutations than less abundant proteins across the hypermutator populations of the LTEE, taking gene length into account. (B) Previous studies inferred evolutionary rates using DNA and protein sequence comparisons across species. (C) This study sums all observed mutations per gene in metagenomic time series of the long-term evolution experiment with Escherichia coli (LTEE), considering nonmutator and hypermutator populations separately. This approach increases statistical power over a rate-based approach and is affected by neither clonal interference nor frequency-dependent selection. To give a concrete example, the top panel in (C) shows the number of observed mutations (stars) in the adhesin gene yeeJ in the nonmutator population Ara − 6 over 60,000 generations. The bottom panel in (C) shows the number of observed mutations (stars) in yeeJ in the hypermutator population Ara + 6 over the same period. For comparison across genes, the number of observed mutations is normalized by gene length. I assume that the mutation rates in the hypermutator LTEE populations are high enough that the vast majority of observed mutations are nearly neutral hitchhikers, whose dynamics are driven by a relatively small number of highly beneficial mutations (Barrick and Lenski 2009; Levy et al. 2015; Maddamsetti, Lenski, et al. 2015; Tenaillon et al. 2016; Couce et al. 2017; Good et al. 2017; Ba et al. 2019; Maddamsetti and Grant 2020a). This allows us to infer information about mutation rates and biases (Couce et al. 2017; Maddamsetti and Grant 2020a) even under environmental and population-genetic conditions that favor strong positive selection. It follows that the mutations observed across the nonmutator and hypermutator LTEE populations, to a large extent, reflect different parts of the distribution of mutation fitness effects (DFE) per gene. With this assumption in hand, I start from the hypothesis that purifying selection causes abundant proteins to evolve slowly. This means that the DFE for abundant proteins should contain more deleterious mutations than the DFE for less abundant proteins, all else being equal. It follows that highly abundant proteins should have fewer observed mutations in the hypermutator LTEE populations, because it is unlikely that highly deleterious mutations will reach observable allele frequencies in the LTEE, given the population-genetic conditions of the LTEE (Good et al. 2017). This is the logical basis for using the hypermutator LTEE populations to test for purifying selection on abundant proteins. The key technical trick is that we do not need to calculate evolutionary rates for the LTEE—in fact, we can completely ignore the phylogenetic structure of each population. Instead, we only need to count the number of observed mutations per gene across all hypermutator populations, and normalize by gene length (fig. 1). An additional benefit of this approach is that the effects of clonal interference and frequency-dependent selection (Maddamsetti, Lenski, et al. 2015; Good et al. 2017) can be ignored, because these phenomena do not affect the density of mutations that are ever observed in the LTEE. By contrast, clonal interference and frequency-dependent selection may have significant effects on evolutionary rates (Lang et al. 2013; Serohijos and Shakhnovich 2014; Maddamsetti, Lenski, et al. 2015; Good et al. 2017). The great advantage of the LTEE, and other evolution experiments with microbes, is the “fossil record” of frozen samples that can be revived for comparison with later samples. The vast majority of mutations in the LTEE lie off the line of descent, but are still accessible from sequencing those frozen population samples (Good et al. 2017). By contrast, analyses of natural sequence data are largely restricted to extant within-population polymorphism and between-species fixations. The use of mutations off the line of descent in the LTEE, along with its multidecade duration, provides sufficient (and ever increasing) statistical power to discern patterns of purifying selection, such as the one discussed in this work.

Correlations between mRNA and Protein Abundance and Mutation Density per Gene in LTEE Populations

I compared the density of observed mutations in the LTEE (Good et al. 2017) with mRNA and protein abundance data for the LTEE ancestral strain, REL606, grown in DM500 media (Caglar et al. 2017). These comparisons are shown in figure 2; note that throughout this section, all significant Spearman correlation coefficients and associated P values are labeled on the figures. In the hypermutator LTEE populations, mRNA abundance during exponential growth significantly anticorrelates with mutation density, whereas protein abundance, at all time points, significantly anticorrelates with mutation density. The same anticorrelation holds, for all time points, when only nonsynonymous (i.e., missense and nonsense) mutations are considered (fig. 3). The significance of these anticorrelations increases when genes with no observed mutations in the metagenomic data are excluded (supplementary fig. S1, Supplementary Material online, for all mutation types; supplementary fig. S2, Supplementary Material online, for nonsynonymous mutations). By contrast, the density of synonymous mutations across the hypermutator populations shows a significant positive correlation with mRNA and protein abundance for REL606 in DM500 media, across all phases of growth (fig. 4). When genes with no mutations in the metagenomic data are excluded, significant positive correlations remain between synonymous mutation density and mRNA and protein abundance, although to a lesser degree (supplementary fig. S3, Supplementary Material online). In the nonmutator LTEE populations, both mRNA and protein abundance for REL606 grown in DM500 show significant positive correlations with the density of observed mutations (supplementary fig. S4, Supplementary Material online).

The density of observed mutations per gene across all hypermutator LTEE populations anticorrelates with mRNA abundance in exponential growth phase, and anticorrelates with protein abundance at all time points. RNA and protein abundance were measured for the ancestral LTEE clone REL606, grown in DM500 media (Caglar et al. 2017). Each point represents a protein-coding gene in the genome of the ancestral LTEE clone, Escherichia coli B strain REL606. The abundance of mRNA or protein expressed per gene is shown on the x axis of each plot. The density of observed mutations per gene is shown on the y axis of each plot. Comparisons to mRNA abundance are shown in purple, whereas comparisons to protein abundance are shown in green. Statistically significant correlations are shown in blue, whereas nonsignificant correlations are shown in light gray. Spearman correlation coefficients (rho) and associated P values are shown on each panel.

The density of observed nonsynonymous mutations per gene across all hypermutator LTEE populations anticorrelates with mRNA abundance in exponential growth phase, and anticorrelates with protein abundance at all time points (see fig. 2 legend for further details).

The density of observed synonymous mutations per gene across all hypermutator LTEE populations positively correlates with mRNA and protein abundance at all time points (see fig. 2 legend for further details).

The density of observed mutations per gene across all hypermutator LTEE populations anticorrelates with mRNA abundance in exponential growth phase, and anticorrelates with protein abundance at all time points. RNA and protein abundance were measured for the ancestral LTEE clone REL606, grown in DM500 media (Caglar et al. 2017). Each point represents a protein-coding gene in the genome of the ancestral LTEE clone, Escherichia coli B strain REL606. The abundance of mRNA or protein expressed per gene is shown on the x axis of each plot. The density of observed mutations per gene is shown on the y axis of each plot. Comparisons to mRNA abundance are shown in purple, whereas comparisons to protein abundance are shown in green. Statistically significant correlations are shown in blue, whereas nonsignificant correlations are shown in light gray. Spearman correlation coefficients (rho) and associated P values are shown on each panel. The density of observed nonsynonymous mutations per gene across all hypermutator LTEE populations anticorrelates with mRNA abundance in exponential growth phase, and anticorrelates with protein abundance at all time points (see fig. 2 legend for further details). The density of observed synonymous mutations per gene across all hypermutator LTEE populations positively correlates with mRNA and protein abundance at all time points (see fig. 2 legend for further details). I also asked whether the strength of the Spearman correlations between protein abundance and mutation density in the hypermutator populations increased over the course of the LTEE (fig. 5). In analyses of natural sequence variation, it is understood that the strength of anticorrelation between protein evolutionary rates and protein abundance increases with divergence time among the taxa under consideration (Serohijos et al. 2012). Based on protein biophysics, Serohijos et al. (2012) additionally predicted that the strength of the anticorrelation between evolutionary rates and protein abundance would increase, but at declining rates over time. Even though the differences in measurements, units, and timescales make direct comparisons to those theoretical predictions impossible, it is striking that a similar functional form of the relationship between time and the strength of the rate-abundance anticorrelation occurs with the mutations observed across the LTEE hypermutator populations (fig. 5). By contrast, the positive Spearman correlation coefficient between synonymous mutation density and protein abundance remains steady at ∼0.075 for at least 40,000 generations, ranging from the 20,000-generation mark through 60,000 generations (fig. 5).

Correlations between protein abundance in REL606 and mutation density are consistent over time. Points represent Spearman correlation coefficients, calculated using the cumulative number of mutations observed by each 10,000-generation mark in the metagenomic time series for the LTEE hypermutator populations. Colors indicate the growth time at which protein abundance was sampled for REL606; the growth times correspond to the separate panels in figures 2–4. (A) Correlations between protein abundance in REL606 and mutation density across all hypermutator LTEE populations. (B) Correlations between protein abundance in REL606 and nonsynonymous mutation density across all hypermutator LTEE populations. (C) Correlations between protein abundance in REL606 and synonymous mutation density across all hypermutator LTEE populations.

Correlations between protein abundance in REL606 and mutation density are consistent over time. Points represent Spearman correlation coefficients, calculated using the cumulative number of mutations observed by each 10,000-generation mark in the metagenomic time series for the LTEE hypermutator populations. Colors indicate the growth time at which protein abundance was sampled for REL606; the growth times correspond to the separate panels in figures 2–4. (A) Correlations between protein abundance in REL606 and mutation density across all hypermutator LTEE populations. (B) Correlations between protein abundance in REL606 and nonsynonymous mutation density across all hypermutator LTEE populations. (C) Correlations between protein abundance in REL606 and synonymous mutation density across all hypermutator LTEE populations. A limitation of these analyses is that these RNA and protein abundance data come from the ancestral LTEE clone, REL606, and so these patterns may not hold for evolved strains. To address this limitation, I examined RNA abundance data for eleven 50,000 generation LTEE clones, grown to exponential phase in DM4000 media (Favate et al. 2021). In every single case, the density of observed mutations per gene, measured across all hypermutator populations, significantly anticorrelates with mRNA abundance (supplementary fig. S5, Supplementary Material online). In addition, a significant anticorrelation is seen with nonsynonymous mutations for all 11 clones (supplementary fig. S6, Supplementary Material online), whereas a positive correlation is seen with synonymous mutations, again for all 11 evolved clones (supplementary fig. S7, Supplementary Material online). The density of observed mutations per gene in the nonmutator populations significantly correlates with mRNA abundance in seven out of 11 clones (supplementary fig. S8, Supplementary Material online). As an additional check for the robustness of these correlations, I compared the density of observed mutations per gene in the LTEE with protein abundance data in the ProteomeVis database (Razban et al. 2018). Although these data only cover 664 out of 4,205 genes analyzed in the LTEE metagenomic data, they still reveal significant anticorrelations between mutation density per gene in the hypermutator populations and protein abundance, when all mutations and nonsynonymous mutations are analyzed (supplementary fig. S9, Supplementary Material online). Corresponding results for synonymous mutations in the hypermutator LTEE populations, and for all mutation types in the nonmutator LTEE populations, are not statistically significant.

Highly Interacting Proteins Evolve Slowly in Hypermutator Populations

Another universal pattern is that highly interacting proteins evolve more slowly than those with fewer interaction partners (Fraser et al. 2002; Hahn et al. 2004; Hahn and Kern 2005; Alvarez-Ponce et al. 2017). I hypothesized that highly interacting proteins would be under strong selection in the LTEE, based on those reports, as well as previous results showing that the E. coli core genome is under positive selection in the LTEE (Maddamsetti et al. 2017), and that global regulators of gene expression show evidence of strong positive selection in both nonmutator and hypermutator LTEE populations (Maddamsetti and Grant 2020b). In particular, I hypothesized that highly interacting proteins should evolve rapidly in the nonmutator LTEE populations due to positive selection, but should evolve slowly in the hypermutator populations during to purifying selection. I compared the number of protein–protein interactions (PPI) with the density of observed mutations across LTEE populations for every protein-coding gene in the E. coli genome, using three curated data sets of PPI in E. coli (Razban et al. 2018; Cong et al. 2019; Zitnik et al. 2019), which I refer to as the Cong data set, the Zitnik data set, and the Razban data set. These comparisons are shown in figure 6 and supplementary figure 10, Supplementary Material online. I find significant negative correlations between mutation density and PPI degree in the hypermutators (Spearman’s rho = −0.056, P = 0.00037 for Cong data set; Spearman’s rho = −0.11, P < 10−11 for Zitnik data set; Spearman’s rho = −0.068, P < 10−4 for Razban data set). However, the weak positive correlations between mutation density and PPI degree in the nonmutators are not significant (supplementary fig. 10, Supplementary Material online).

The density of observed mutations per gene across all hypermutator LTEE populations negatively correlates with PPI degree. Comparisons to the PPI data from Cong et al. (2019) are shown in light blue, comparisons to the PPI data from Zitnik et al. (2019) are shown in orange, and comparisons to the PPI data in the ProteomeVis database (Razban et al. 2018) are shown in red. Significant Spearman correlations are shown in blue. For improved visual dispersion, PPI degree is square-root transformed; the Spearman correlation is unaffected by this monotonic data transformation. (A) Proteins with more interactions in the Cong et al. (2019) data set tend to evolve more slowly than those with fewer interactions in the hypermutator LTEE populations. (B) Proteins with more interactions in the Zitnik et al. (2019) data set tend to evolve more slowly than those with fewer interactions in the hypermutator LTEE populations. (C) Proteins with more interactions in the ProteomeVis Escherichia coli PPI data set (Razban et al. 2018) tend to evolve more slowly than those with fewer interactions in the hypermutator LTEE populations.

The density of observed mutations per gene across all hypermutator LTEE populations negatively correlates with PPI degree. Comparisons to the PPI data from Cong et al. (2019) are shown in light blue, comparisons to the PPI data from Zitnik et al. (2019) are shown in orange, and comparisons to the PPI data in the ProteomeVis database (Razban et al. 2018) are shown in red. Significant Spearman correlations are shown in blue. For improved visual dispersion, PPI degree is square-root transformed; the Spearman correlation is unaffected by this monotonic data transformation. (A) Proteins with more interactions in the Cong et al. (2019) data set tend to evolve more slowly than those with fewer interactions in the hypermutator LTEE populations. (B) Proteins with more interactions in the Zitnik et al. (2019) data set tend to evolve more slowly than those with fewer interactions in the hypermutator LTEE populations. (C) Proteins with more interactions in the ProteomeVis Escherichia coli PPI data set (Razban et al. 2018) tend to evolve more slowly than those with fewer interactions in the hypermutator LTEE populations.

Discussion

I show that a number of well-known but poorly understood correlations between mRNA abundance, protein abundance, PPI degree, and evolutionary rates across the tree of life are also found in the hypermutator populations of the LTEE. In some cases, I find significant anticorrelation between mutation densities and mRNA abundance in exponential phase, but not during stationary phase. The simplest explanation for this finding is that mRNAs decay more rapidly than the proteins they encode. Protein abundance consistently shows a negative correlation with the density of all observed mutations (fig. 2) and with nonsynonymous mutation density across all time points (fig. 3). It is widely believed that these correlations are driven by purifying selection on universal aspects of protein evolution (Drummond et al. 2006; Drummond and Wilke 2009; Serohijos et al. 2012; Serohijos and Shakhnovich 2014), and indeed, this is the most parsimonious explanation for why similar patterns are seen in the LTEE. An intriguing difference, however, is the positive correlation that I find between synonymous mutation density across LTEE hypermutator populations and protein abundance (fig. 4)—which contrasts with the anticorrelation between the rate of synonymous mutations and gene expression seen in nature (Drummond and Wilke 2008). In part, this may be explained by the differences in the distribution of synonymous mutations observed in the LTEE, and the distribution of synonymous diversity per gene in nature (Maddamsetti, Hatcher, et al. 2015; Maddamsetti and Grant 2020a), although the causes for this difference between natural variation and experiment is still a matter for hypothesis generation (Maddamsetti 2016), data collection, hypothesis testing, and debate. An important limitation of these results is that the protein and mRNA abundance data for LTEE strains were collected in DM500 and DM4000 media (Caglar et al. 2017; Favate et al. 2021). These media contain much more than the 25 mg/l glucose in the DM25 media used in the LTEE. This represents a technical compromise due to the fact that researchers have not yet succeeded in isolating sufficient mRNA from exponential phase cultures in DM25 for RNA-seq (Jagdish T and Grant N, personal communication). With this caveat in mind, my findings support the conclusion that highly abundant proteins evolve slowly in the hypermutator LTEE populations. The causes for why highly abundant proteins evolve slowly may emerge from a number of different, and nonmutually exclusive phenomena, so many explanations have been proposed (Razban 2019). These include the protein misfolding avoidance hypothesis (Yang et al. 2010), the protein misinteraction avoidance hypothesis (Levy et al. 2012; Yang et al. 2012), the mRNA folding hypothesis (Park et al. 2013), purifying selection on protein function (Konaté et al. 2019), folding stability (Serohijos et al. 2012; Serohijos and Shakhnovich 2014), and others (Tartaglia et al. 2007; Plata et al. 2010; Kepp and Dasmeh 2014). Differentiating among these possibilities is difficult, because it is challenging to study the causes of patterns that span millions of years of protein evolution. I do not draw conclusions about the causes of these correlations. Rather, my results show that evolution experiments are reasonable model systems to study the causes of evolutionary rate variation in proteins. A concrete approach would be to recode the genomes of hypermutator strains to modulate the anticipated action of purifying selection per protein, based on the predictions of a particular explanation, and then ask whether those predictions are borne out during experimental evolution. Breakthroughs that allow for the inexpensive recoding of whole bacterial genomes may be needed, but it is plausible that such experiments will be feasible in the future. Many other experimental directions are possible. First, a better understanding of how chaperones and other molecular mechanisms of protein quality control affect evolutionary rates and fitness (Chen et al. 2017; Alvarez-Ponce et al. 2019; Samhita et al. 2020) is needed. We also need to better understand purifying selection on synonymous mutations (Walsh et al. 2020). Second, studies on how RNA transcription error rates (Li and Lynch 2020) and RNA folding errors affect evolutionary rates would be valuable. Indeed, mRNA accessibility seems to be an important predictor of protein abundance (Terai and Asai 2020)—and RNA chaperones buffer deleterious mutations in LTEE hypermutator strains (Rudan et al. 2015). Third, it would be interesting to experimentally test the hypothesis that protein and RNA chaperones evolve under more and more stringent purifying selection during long-term experimental evolution, which follows from the premise that hypermutator LTEE populations are affected by a mutation load that affects protein folding and stability. Studies on the existence and relevance of phenomena like evolutionary capacitance caused by the contributions that PPI make to folding stability (Dixit and Maslov 2013; Jarzab et al. 2020; Mateus et al. 2020), including cryptic genetic variation hidden by protein and RNA chaperones (Queitsch et al. 2002; Bergman and Siegal 2003; Masel 2005, 2006, 2013; Trotter et al. 2014; Geiler-Samerotte et al. 2016; Zheng et al. 2019) during experimental evolution, and the effects of such phenomena on rates of protein evolution may be especially valuable in this regard. Finally, it would be valuable to develop a better understanding of the temperature sensitivity of evolved LTEE populations (Mongold et al. 1996, 1999; Leiby and Marx 2014), and to collect data on protein evolutionary rates in long-term experiments conducted at elevated temperatures (Bennett et al. 1990; Tenaillon et al. 2012). Much remains to be explored, in regard to how evolution experiments can deepen our understanding of purifying selection on molecular and cellular organization and function.

Materials and Methods

Preprocessed LTEE metagenomic data were downloaded from: https://github.com/benjaminhgood/LTEE-metagenomic. Transcriptomic and proteomic data for REL606, grown in Davis minimal media with 500 mg/l glucose (DM500), were taken from the supplementary tables for Caglar et al. (2017). For robustness, I also analyzed the transcriptomic data for eleven 50,000 generation LTEE clones grown in DM4000 media (Favate et al. 2021) available at: https://github.com/shahlab/LTEE-gene-expression. I analyzed three different data sets of PPI in E. coli. First, I used the PPI network for E. coli K-12 MG1655 in the STRING database (Szklarczyk et al. 2021) as curated by Zitnik et al. (2019). Second, I used the data set of high confidence E. coli PPI interactions reported by Cong et al. (2019), which combines coevolutionary information in large protein multiple sequence alignments with gold-standard protein complexes in E. coli reported in the Ecocyc and Protein Databank (PDB) databases (Berman et al. 2000; Keseler et al. 2013). PPI network statistics were calculated using the SNAP toolkit (Leskovec and Sosič 2016; Zitnik et al. 2019). Third, additional data on E. coli PPI interactions and protein abundance were downloaded using the web interface to the ProteomeVis database (Razban et al. 2018), available at http://proteomevis.chem.harvard.edu/. Associated metadata for ProteomeVis were downloaded from: https://github.com/rrazban/proteomevis/blob/master/make_database/proteomevis_inspect.csv. All statistical analyses involve two-sided tests for Spearman correlation coefficients that are significantly different from zero, using the cor.test function in the R statistical programming language, version 4.0 (R Core Team 2020). Unless stated otherwise, all correlations include genes with no mutations (i.e., zeros are included).

Supplementary Material

Supplementary data are available at Genome Biology and Evolution online. Click here for additional data file.
  71 in total

1.  Evolutionary capacitance as a general feature of complex gene networks.

Authors:  Aviv Bergman; Mark L Siegal
Journal:  Nature       Date:  2003-07-31       Impact factor: 49.962

2.  Protein misinteraction avoidance causes highly expressed proteins to evolve slowly.

Authors:  Jian-Rong Yang; Ben-Yang Liao; Shi-Mei Zhuang; Jianzhi Zhang
Journal:  Proc Natl Acad Sci U S A       Date:  2012-03-13       Impact factor: 11.205

3.  Evolutionary capacitance may be favored by natural selection.

Authors:  Joanna Masel
Journal:  Genetics       Date:  2005-05-23       Impact factor: 4.562

4.  Maintenance of Metabolic Plasticity despite Relaxed Selection in a Long-Term Evolution Experiment with Escherichia coli.

Authors:  Nkrumah A Grant; Rohan Maddamsetti; Richard E Lenski
Journal:  Am Nat       Date:  2021-05-21       Impact factor: 3.926

5.  Improving the prediction accuracy of protein abundance in Escherichia coli using mRNA accessibility.

Authors:  Goro Terai; Kiyoshi Asai
Journal:  Nucleic Acids Res       Date:  2020-06-06       Impact factor: 16.971

6.  EcoCyc: fusing model organism databases with systems biology.

Authors:  Ingrid M Keseler; Amanda Mackie; Martin Peralta-Gil; Alberto Santos-Zavaleta; Socorro Gama-Castro; César Bonavides-Martínez; Carol Fulcher; Araceli M Huerta; Anamika Kothari; Markus Krummenacker; Mario Latendresse; Luis Muñiz-Rascado; Quang Ong; Suzanne Paley; Imke Schröder; Alexander G Shearer; Pallavi Subhraveti; Mike Travers; Deepika Weerasinghe; Verena Weiss; Julio Collado-Vides; Robert P Gunsalus; Ian Paulsen; Peter D Karp
Journal:  Nucleic Acids Res       Date:  2012-11-09       Impact factor: 16.971

7.  Core Genes Evolve Rapidly in the Long-Term Evolution Experiment with Escherichia coli.

Authors:  Rohan Maddamsetti; Philip J Hatcher; Anna G Green; Barry L Williams; Debora S Marks; Richard E Lenski
Journal:  Genome Biol Evol       Date:  2017-04-01       Impact factor: 3.416

8.  The functional proteome landscape of Escherichia coli.

Authors:  André Mateus; Johannes Hevler; Jacob Bobonis; Nils Kurzawa; Malay Shah; Karin Mitosch; Camille V Goemans; Dominic Helm; Frank Stein; Athanasios Typas; Mikhail M Savitski
Journal:  Nature       Date:  2020-12-09       Impact factor: 49.962

9.  Evolutionary capacitance and control of protein stability in protein-protein interaction networks.

Authors:  Purushottam D Dixit; Sergei Maslov
Journal:  PLoS Comput Biol       Date:  2013-04-04       Impact factor: 4.475

10.  Metabolic erosion primarily through mutation accumulation, and not tradeoffs, drives limited evolution of substrate specificity in Escherichia coli.

Authors:  Nicholas Leiby; Christopher J Marx
Journal:  PLoS Biol       Date:  2014-02-18       Impact factor: 8.029

View more
  2 in total

1.  Selection Maintains Protein Interactome Resilience in the Long-Term Evolution Experiment with Escherichia coli.

Authors:  Rohan Maddamsetti
Journal:  Genome Biol Evol       Date:  2021-06-08       Impact factor: 3.416

2.  Discovery of positive and purifying selection in metagenomic time series of hypermutator microbial populations.

Authors:  Rohan Maddamsetti; Nkrumah A Grant
Journal:  PLoS Genet       Date:  2022-08-18       Impact factor: 6.020

  2 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.