
Heterogeneity coordinates bacterial multi-gene expression in single cells.

Yichao Han1, Fuzhong Zhang1,2,3.   

Abstract

For a genetically identical microbial population, multi-gene expression in various environments requires effective allocation of limited resources and precise control of heterogeneity among individual cells. However, it is unclear how resource allocation and cell-to-cell variation jointly shape the overall performance. Here we demonstrate a Simpson's paradox during overexpression of multiple genes: two competing proteins in single cells correlated positively for every induction condition, but the overall correlation was negative. Yet this phenomenon was not observed between two competing mRNAs in single cells. Our analytical framework shows that the phenomenon arises from competition for translational resource, with the correlation modulated by both mRNA and ribosome variability. Thus, heterogeneity plays a key role in single-cell multi-gene expression and provides the population with an evolutionary advantage, as demonstrated in this study.

Year:  2020        PMID: 32004314      PMCID: PMC7015429          DOI: 10.1371/journal.pcbi.1007643

Source DB:  PubMed          Journal:  PLoS Comput Biol        ISSN: 1553-734X            Impact factor:   4.475


Introduction

Bacteria often simultaneously turn on the expression of multiple pathways or cellular machineries to perform multitasking in response to various conditions. Obtaining optimal outcomes of multitasking is critical for population survival, bacteria-host interaction, cell-to-cell communication, biofilm formation, and biosynthetic performance [1-5]. During multitasking, modules for different tasks often compete with each other for limited intracellular resources, which could affect the performance of the overall system [6-9]. At the most fundamental level, it has been widely observed that overexpression of a heterologous gene decreases the expression level of other genes, leading to a negative correlation between competing proteins at the ensemble level [10-12]. Meanwhile, the performance of a module also varies from cell to cell due to biological stochasticity, leading to phenotypic heterogeneity. Distinctive phenotypes within a genetically identical population are sometimes harnessed as a mechanism for division of labor, where distinct subpopulations perform different tasks, thus reducing resource competition within each single cell. However, it remains elusive to what degree phenotypic heterogeneity affects simultaneous operation of multiple functional modules within every single cell. Specifically, how do single cells deal with resource competition, and how does a population coordinate single cell performances during multitasking to maximize population efficiencies [2,13,14]?

Results

In bacteria, RNA polymerases (RNAPs) and ribosomes are believed to be the limiting factors of transcription and translation, respectively [15]. To examine single cell multitasking in the most fundamental form, we designed two competing gene overexpression modules with fluorescent proteins as outputs (Fig 1A). One of them contains a constitutively expressed green fluorescent protein (gfp) gene in the Escherichia coli chromosome mimicking a naturally-occurring module [11]. The other competing module contains a Mycobacterium marinum carboxylic acid reductase (car) gene fused with an mCherry gene in a medium-copy plasmid. In our test E. coli strain, the burdensome CAR-mCherry protein does not serve any additional cellular or metabolic function [16], except for consuming global resources for both transcription and translation during its expression. Isopropyl β-D-1-thiogalactopyranoside (IPTG) mimics an environmental signal to increase the output of this module. Single cell GFP and CAR-mCherry fluorescence in steady state conditions was measured using fluorescence microscopy (Fig 1B) to evaluate heterogeneity in cellular performance. Under different IPTG conditions, the population mean GFP fluorescence decreased as the population mean CAR-mCherry fluorescence increased (Fig 1C), suggesting the presence of resource competition between the two proteins, in good agreement with previous ensemble-level observations [11,12]. At the single-cell level, the joint distribution of GFP and CAR-mCherry proteins resembled a statistical phenomenon called Simpson’s paradox [17]: the correlations between GFP and CAR-mCherry in single cells were positive at each constant induction condition, whereas the overall correlation became negative when the data for all induction conditions were merged (Fig 1D and S1A Fig). 
The negative trend is not affected by sample size when the merged data are evenly sampled across induction conditions, and the standard deviation of the correlation decreases with larger sample size (S1B Fig). The merged condition exemplifies the heterogeneous and fluctuating environments in which a microbial community lives, while each induction condition exemplifies a constant environment to which a local microbial group adapts. Thus, the Simpson’s paradox phenomenon in bacterial gene expression may be present in multiple systems where local regions have relatively consistent module inputs while these inputs vary significantly among different regions in the system, such as biofilms [18] or large-scale fermenters [14]. The opposite correlation patterns suggest that a microbial community has the potential to explore a large area of protein expression space within the resource-limiting region and balance the outcome of multiple tasks (e.g., a certain ratio of correlated protein expression) according to the local environment.
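As a rough illustration of how positive within-condition correlations can coexist with a negative pooled correlation, the following Python sketch generates synthetic data. It is not the study's data or analysis: the group means, the shared per-cell factor, and the noise levels are all hypothetical values chosen for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical (mean red, mean green) fluorescence for five induction levels:
# the red mean rises while the green mean falls, mimicking competition.
conditions = [(100, 900), (300, 700), (500, 500), (700, 300), (900, 100)]

within_r = []
pooled_red, pooled_green = [], []
for mean_red, mean_green in conditions:
    red, green = [], []
    for _ in range(2000):
        # Per-cell factor shared by both proteins (assumed stand-in for
        # cell-to-cell variation in a common resource).
        shared = rng.gauss(1.0, 0.1)
        # Independent noise on each protein (assumed).
        red.append(mean_red * shared * rng.gauss(1.0, 0.05))
        green.append(mean_green * shared * rng.gauss(1.0, 0.05))
    within_r.append(pearson(red, green))
    pooled_red += red
    pooled_green += green

pooled_r = pearson(pooled_red, pooled_green)
print("within-condition r:", [round(r, 2) for r in within_r])
print("pooled r:", round(pooled_r, 2))
```

In this toy setting the shared factor drives the positive correlation inside each condition, while the opposite shifts of the condition means drive the negative correlation after pooling.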
Fig 1

Multi-gene expression in single-cells during translational competition.

(A) Translational competition of CAR-mCherry and GFP over limited shared ribosomes in single cells. The CAR-mCherry mRNAs are transcribed from an IPTG-inducible PlacUV5 promoter, while GFP mRNAs are constitutively transcribed. (B) Representative fluorescence images of combined green (GFP) and red (CAR-mCherry) channels at various induction levels. IPTG concentrations are labelled at the top of each image. Scale bars, 5 μm. (C) Population mean fluorescent intensity of GFP and CAR-mCherry at various IPTG induction levels. Error bars represent standard deviations of three replicates from different days. (D) Correlation between CAR-mCherry and GFP expression levels of single cells at various IPTG induction levels. The last plot contains all data points merged from the other seven plots. The dashed lines represent linear fittings to the data. a.u., arbitrary units.

To understand the observed Simpson’s paradox and to quantify the combined effects of both resource competition and cell-to-cell variation on multi-gene overexpression, we developed a generic analytic framework that can be applied to resource competition at different levels (e.g., transcription, translation, and metabolism). Compared to previous resource competition models [7,11,19-22], our model considers cell-to-cell variations in resource availability and focuses on heterologous expression systems that have strong competition with the endogenous expression system, thus uniquely illuminating resource competition in engineered cells at the single-cell level [23,24].
Our model has several important assumptions: i) to emphasize the effect of resource competition, the two competing modules do not share transcriptional or translational regulators, such as transcription factors and small RNAs; ii) the amounts of resources available for gene expression, such as RNA polymerases or ribosomes, vary among single cells; and iii) all macroscopic reaction rate constants are evaluated at steady state and do not vary among single cells. The model was first applied to study translational competition (Note 1 in S1 Text), where two module inputs, total heterologous mRNAs ($M_1$) and total endogenous mRNAs ($M_2$), compete for the limited amount of total ribosomes ($\mathrm{Rib}_T$), and produce heterologous proteins ($P_1$) and endogenous proteins ($P_2$), respectively (Fig 2A). When $\mathrm{Rib}_T$ inside an individual cell is fixed,

$$\mathrm{Rib}_T = \mathrm{Rib}_F + \frac{n_1 M_1 \mathrm{Rib}_F}{\mathrm{Rib}_F + \beta_1} + \frac{n_2 M_2 \mathrm{Rib}_F}{\mathrm{Rib}_F + \beta_2}$$

where $\mathrm{Rib}_F$ is the number of free ribosomes, $n_i$ is the average number of ribosomes bound to the corresponding mRNA (i = 1, 2), and $\beta_i$ represents the dissociation constant. On the right side, the second term is proportional to $P_1$, and the third term is proportional to $P_2$. The repression on $P_2$ caused by increasing $M_1$ ($\partial P_2/\partial M_1 < 0$) indicates the strength of resource competition. In each cell, lower $\mathrm{Rib}_T$ and higher $M_1$ create stronger competition due to fewer $\mathrm{Rib}_F$ (Fig 2B). The dissociation constants $\beta_1$ and $\beta_2$ largely determine $\partial \mathrm{Rib}_F/\partial M_1$ and $\partial P_2/\partial \mathrm{Rib}_F$, respectively (Note 1 in S1 Text). If $\beta_1$ is much larger than $\mathrm{Rib}_F$, the heterologous proteins $P_1$ are not burdensome enough to sequester a significant amount of free ribosomes (i.e. the absolute value of $\partial \mathrm{Rib}_F/\partial M_1$ is small). If $\beta_2$ is much smaller than $\mathrm{Rib}_F$, the expression of endogenous proteins $P_2$ is not affected by reduced $\mathrm{Rib}_F$ (i.e. the value of $\partial P_2/\partial \mathrm{Rib}_F$ is small). In both cases, the strength of resource competition is negligible (S2A and S2B Fig).
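The single-cell competition described above can be sketched numerically. The Python snippet below assumes a ribosome balance of the form Rib_T = Rib_F + n1·M1·Rib_F/(Rib_F + β1) + n2·M2·Rib_F/(Rib_F + β2), solves it for the free ribosomes by bisection, and uses the mRNA-bound ribosome counts as proxies for the protein outputs. All parameter values are hypothetical placeholders, not the values of Table A in S1 Text, and the function names are illustrative.

```python
def bound_ribosomes(rib_free, mrna, n, beta):
    """Ribosomes bound to one mRNA species (assumed saturating binding form)."""
    return n * mrna * rib_free / (rib_free + beta)

def free_ribosomes(rib_total, m1, m2, n1, n2, beta1, beta2):
    """Solve the ribosome balance for the free ribosome number by bisection.

    The left-hand side (free + bound) increases monotonically with the free
    ribosome number, so the root in [0, rib_total] is unique.
    """
    lo, hi = 0.0, rib_total
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        used = (mid
                + bound_ribosomes(mid, m1, n1, beta1)
                + bound_ribosomes(mid, m2, n2, beta2))
        if used > rib_total:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Hypothetical parameters chosen only for illustration.
PARAMS = dict(m2=1000.0, n1=10.0, n2=10.0, beta1=1000.0, beta2=1000.0)
rib_total = 10_000.0

rows = []
for m1 in (0.0, 100.0, 300.0, 1000.0):
    rib_free = free_ribosomes(rib_total, m1, **PARAMS)
    # Bound ribosomes serve as proxies proportional to protein output.
    p1 = bound_ribosomes(rib_free, m1, PARAMS["n1"], PARAMS["beta1"])
    p2 = bound_ribosomes(rib_free, PARAMS["m2"], PARAMS["n2"], PARAMS["beta2"])
    rows.append((m1, rib_free, p1, p2))
    print(f"M1={m1:7.1f}  Rib_F={rib_free:8.1f}  P1~{p1:8.1f}  P2~{p2:8.1f}")
```

With these placeholder values, raising the heterologous mRNA level lowers the free ribosome pool and therefore the endogenous output, which is the competition effect the text describes.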
Fig 2

Coarse-grained model of translational resource competition.

(A) The coarse-grained model considers ribosome allocation between heterologous (i = 1) and endogenous (i = 2) mRNAs. The input, the output, and the resource are total mRNA $M_i$, protein $P_i$, and total ribosome $\mathrm{Rib}_T$, respectively. $\mathrm{Rib}_T$ can either be free ribosome $\mathrm{Rib}_F$ or mRNA-bound ribosome. (B) Ribosome competition in a single cell. Top, decrement of the free ribosome fraction ($\mathrm{Rib}_F/\mathrm{Rib}_T$) caused by increasing $M_1$. Bottom, negative correlation between endogenous protein ($P_2$) and heterologous proteins ($P_1$). Calculations of $\mathrm{Rib}_F$, $P_1$, and $P_2$ are described in Note 1.2 in S1 Text, with parameters listed in Table A in S1 Text. (C-F) Correlation between $P_1$ and $P_2$ of single cells, $r(P_1, P_2)$. Calculation of $r(P_1, P_2)$ is described in Note 1.3 in S1 Text. $M_2$ variability is set as zero for simplicity. (C) Mean $\mathrm{Rib}_T$ (10,000) and $\mathrm{Rib}_T$ variability (0.1) are set as constants. (D) Mean $M_1$ (300) and $M_1$ variability (0.1) are set as constants. (E) Mean $M_1$ (300) and mean $\mathrm{Rib}_T$ (10,000) are set as constants. (F) $M_1$ variability and $\mathrm{Rib}_T$ variability (both 0.1) are set as constants.

To introduce cell-to-cell variations, $M_1$, $M_2$, and $\mathrm{Rib}_T$ are considered as random variables for individual cells, although they are assumed to be constants over time for each cell. At steady state, cell-to-cell variations of protein expression levels can be described by a linearized model:

$$P_i - \langle P_i \rangle = \frac{\partial P_i}{\partial \mathrm{Rib}_T}\left(\mathrm{Rib}_T - \langle \mathrm{Rib}_T \rangle\right) + \frac{\partial P_i}{\partial M_1}\left(M_1 - \langle M_1 \rangle\right) + \frac{\partial P_i}{\partial M_2}\left(M_2 - \langle M_2 \rangle\right), \quad i = 1, 2$$

where $\langle X \rangle$ denotes the mean value of X at steady state and the partial derivatives are evaluated at the mean values. The covariance between $P_1$ and $P_2$ at steady state is derived as

$$\mathrm{Cov}(P_1, P_2) = \sum_{X, Y} \frac{\partial P_1}{\partial X} \frac{\partial P_2}{\partial Y} \mathrm{Cov}(X, Y), \quad X, Y \in \{\mathrm{Rib}_T, M_1, M_2\}$$

Considering the cell-to-cell variations in $\mathrm{Rib}_T$ and $M_1$ as the two main sources of cellular heterogeneity in this system, the covariance between $P_1$ and $P_2$ at steady state can be further approximated as a linear combination of the variances in $\mathrm{Rib}_T$ and $M_1$:

$$\mathrm{Cov}(P_1, P_2) \approx \frac{\partial P_1}{\partial \mathrm{Rib}_T} \frac{\partial P_2}{\partial \mathrm{Rib}_T} \mathrm{Var}(\mathrm{Rib}_T) + \frac{\partial P_1}{\partial M_1} \frac{\partial P_2}{\partial M_1} \mathrm{Var}(M_1)$$

where the first term is positive, and the second term is negative due to the competition effect ($\partial P_2/\partial M_1 < 0$).
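As a rough numerical illustration of a two-term covariance of this kind (one term from variance in total ribosomes, one from variance in heterologous mRNA), the Python sketch below linearizes an assumed ribosome-balance model around its mean and reports the resulting Pearson correlation for several levels of heterologous mRNA variability, expressed as the squared coefficient of variation. The binding form, the shared constants `N_RIBO`, `BETA`, and `M2`, and all numbers are hypothetical assumptions for illustration, not the parameters of S1 Text.

```python
import math

# Hypothetical constants (assumed identical for both mRNA species).
N_RIBO = 10.0    # ribosomes per mRNA
BETA = 1000.0    # dissociation constant
M2 = 1000.0      # endogenous mRNA, held fixed (zero variability)

def outputs(rib_total, m1):
    """Return proxies for (P1, P2): ribosomes bound to each mRNA species."""
    lo, hi = 0.0, rib_total
    for _ in range(200):  # bisection on the free ribosome number
        mid = 0.5 * (lo + hi)
        occupancy = mid / (mid + BETA)
        if mid + N_RIBO * (m1 + M2) * occupancy > rib_total:
            hi = mid
        else:
            lo = mid
    occupancy = lo / (lo + BETA)
    return N_RIBO * m1 * occupancy, N_RIBO * M2 * occupancy

def linearized_correlation(mean_rib, mean_m1, cv2_rib, cv2_m1, rel_step=1e-4):
    """Pearson r(P1, P2) from first-order sensitivities at the mean."""
    h_r, h_m = mean_rib * rel_step, mean_m1 * rel_step
    # Central finite differences for the four partial derivatives.
    p1_rp, p2_rp = outputs(mean_rib + h_r, mean_m1)
    p1_rm, p2_rm = outputs(mean_rib - h_r, mean_m1)
    p1_mp, p2_mp = outputs(mean_rib, mean_m1 + h_m)
    p1_mm, p2_mm = outputs(mean_rib, mean_m1 - h_m)
    d1_r, d2_r = (p1_rp - p1_rm) / (2 * h_r), (p2_rp - p2_rm) / (2 * h_r)
    d1_m, d2_m = (p1_mp - p1_mm) / (2 * h_m), (p2_mp - p2_mm) / (2 * h_m)
    var_r = cv2_rib * mean_rib ** 2   # variance from squared CV
    var_m = cv2_m1 * mean_m1 ** 2
    cov = d1_r * d2_r * var_r + d1_m * d2_m * var_m
    var_p1 = d1_r ** 2 * var_r + d1_m ** 2 * var_m
    var_p2 = d2_r ** 2 * var_r + d2_m ** 2 * var_m
    return cov / math.sqrt(var_p1 * var_p2)

for cv2_m1 in (0.01, 0.1, 0.5, 2.0):
    r = linearized_correlation(10_000.0, 300.0, 0.1, cv2_m1)
    print(f"CV^2(M1)={cv2_m1:4.2f}  r(P1,P2)={r:+.2f}")

r_low = linearized_correlation(10_000.0, 300.0, 0.1, 0.01)
r_high = linearized_correlation(10_000.0, 300.0, 0.1, 2.0)
```

Under these assumptions the correlation is positive when the heterologous mRNA variability is small and becomes negative once that variability is large enough for the competition term to dominate.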
Critically, the opposite contributions from variances in $\mathrm{Rib}_T$ and $M_1$ reveal that variation in the shared resource strengthens the correlation of module outputs, whereas variation in the competing module inputs weakens and even reverses the correlation. To characterize these variables at different magnitudes, we calculated the Pearson correlation coefficient (r) and the squared coefficient of variation ($CV^2$) as measures of correlation and variability. We assumed that the $\mathrm{Rib}_T$ variability is a constant (approximately 0.1, the variability lower bound of the typical abundant proteins in E. coli [25]). Here lies the explanation for the observed Simpson’s paradox in multi-gene expression: the protein correlation is positive when $M_1$ variability is low (e.g., at each $P_1$ induction condition as a constant environment), where it is dominated by the resource variation effect, but the correlation can be reversed by the competition effect at high $M_1$ variability (e.g., combining different $P_1$ induction conditions as a fluctuating environment) (Fig 2C and 2E). The contributions from the two variation sources to the protein correlation (the coefficients of the $\mathrm{Rib}_T$ and $M_1$ variance terms) depend on the mean values of both $M_1$ and $\mathrm{Rib}_T$ of the population (Note 1 in S1 Text). Intuitively, enhanced overexpression of heterologous genes (higher mean $M_1$) or limited total ribosomes (lower $\mathrm{Rib}_T$) would cause fewer resources to be devoted to expressing native genes in single cells, causing reduced correlation between competing proteins. In reality, our model shows that, within certain ranges (e.g., $M_1$ > 100 and $\mathrm{Rib}_T$ < 10,000), a higher mean $M_1$ or a lower mean $\mathrm{Rib}_T$ increases the relative contribution from $\mathrm{Rib}_T$ variance compared with $M_1$ variance in Eq (1), leading to increased correlation between competing proteins (Fig 2C, 2D and 2F). These analyses are robust even when the full Eq (3) was used (S2C–S2J Fig). Next, we investigated whether the Simpson’s paradox also exists at the transcriptional level.
We applied our model to transcriptional competition and solved for correlations between competing mRNAs in single cells (Note 2 in S1 Text and S3A Fig). The major difference between transcriptional and translational competition is that mRNA production is believed to be mainly determined by promoter strength (treated equivalently as promoter copy number in our model), and to a lesser extent, by the amount of RNAPs [26-28], so the effects of both RNAP competition and cell-to-cell variation in RNAPs are attenuated. Our model, with feasible parameters in transcription (i.e. the total number of RNAPs ranges from 4000 to 12000; dissociation constants for RNAP binding range from 0.1 to 10), predicts three phenomena: i) within a large parameter range (1 to 100 copies of strong promoters per cell), introducing heterologous genes causes little repression on endogenous mRNA production (S3B Fig), ii) the correlations between competing mRNAs are determined by correlations between promoter strengths, and the promoter strength correlations can be weak or even negative in constant environments (S3C Fig), and iii) the correlations rarely change with promoter strength and its variability (S3D Fig). These features largely prevent the Simpson’s paradox from occurring at the transcriptional level (mathematical explanation in Note 2 in S1 Text). To validate model predictions, we experimentally quantified mRNA outputs of our testing modules in single cells, using two-color mRNA fluorescence in situ hybridization (FISH) (Fig 3A and 3B). The average GFP mRNA abundance was estimated to be approximately 2.02 ± 0.25 (mean ± s.d. across all conditions) copies per cell, ranking in the top 1% of all endogenous genes [25] and in agreement with RNA-seq measurements from the studied E. coli strain [29]. The GFP mRNAs at all induction levels followed similar Poisson distributions (S4 Fig), suggesting that endogenous mRNAs are not repressed by increasing heterologous mRNA levels (Fig 3C).
Thus, both our model predictions and experimental results showed that resource competition mostly occurs at the translational level rather than at the transcriptional level, shedding light on a previously debated issue about the cause of mRNA burden [7,29,30]. We further observed that the mRNA correlations in each induction condition were weak and positive, which also resulted in a weak and positive correlation when combining all conditions (Fig 3D). The result reveals that the strengths (or copy numbers) of these two promoters are weakly correlated, likely due to cell division [31], and that promoter strength variability with the RNAP competition effect alone is not sufficient to reverse the weak mRNA correlation in fluctuating environments.
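One simple way to check whether per-cell mRNA counts are compatible with a Poisson distribution, as mentioned above for the GFP mRNAs, is to compare the variance with the mean (the Fano factor, which equals 1 for a Poisson distribution). The Python sketch below does this on synthetic counts drawn with a mean of 2.02 copies per cell. The counts are simulated, not the study's FISH data, and the Fano factor check is an illustrative assumption rather than the authors' stated test.

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw one Poisson-distributed integer (Knuth's multiplication method)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def fano_factor(counts):
    """Variance-to-mean ratio of a list of counts (sample variance)."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean

rng = random.Random(1)  # fixed seed for reproducibility
# Synthetic per-cell copy numbers with the mean reported in the text.
counts = [sample_poisson(2.02, rng) for _ in range(20_000)]

mean_count = sum(counts) / len(counts)
fano = fano_factor(counts)
print(f"mean copies per cell: {mean_count:.2f}")
print(f"Fano factor: {fano:.2f}  (close to 1 suggests Poisson-like counts)")
```

A Fano factor well above 1 would instead point to extra cell-to-cell variability beyond Poisson noise.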
Fig 3

Multi-gene expression in single-cells during transcriptional competition.

(A) Transcriptional competition between car-mCherry and gfp genes for limited shared RNAPs in single cells. CAR-mCherry mRNA and GFP mRNA were hybridized by Quasar 670- (blue) and Quasar 570-labeled (red) probes, respectively. The fluorescence of the mCherry protein was deactivated via the M71G mutation to prevent spectral overlap. (B) Representative FISH images of single cells induced at 500 μM IPTG. (C) Population mean mRNA copy numbers of GFP and CAR-mCherry at various IPTG concentrations. mRNA copy numbers of CAR-mCherry and GFP were estimated from fluorescence intensity. Error bars represent the 95% confidence interval, determined by bootstrapping. (D) GFP and CAR-mCherry mRNA copy numbers of single cells at various IPTG induction levels.

Our data in Fig 1D showed that when expressing multiple genes under limited resources, the ratio of competing proteins in single cells varies even when they are growing in the same environments (e.g., induction levels). In some circumstances, such as expressing metabolic pathways or multi-protein complexes with precise stoichiometry, it is desirable to keep multiple genes expressed at a fixed ratio within single cells to achieve optimal overall performance and maximize the efficiency of resource utilization. Using polycistronic operons in combination with translational regulation is a common strategy for controlling the ratio of multiple proteins at the ensemble level [32,33]. However, the protein ratio in single cells may be affected by translational competition, resulting in disrupted stoichiometry. To examine the degree of competition effects on multi-gene expression from polycistronic operons in single cells, we constructed a library of polycistronic operons containing both mCherry and gfp genes driven by different promoters (Fig 4A). We found that the ratios of mCherry protein to GFP were consistent among single cells for each type of promoter, regardless of their promoter strength (Fig 4B and 4C).
The ratios differed between the inducible PlacUV5 promoter and the constitutive promoters, which could be explained by different mRNA secondary structures near the ribosome binding site of the mCherry gene. In addition, the correlation between mCherry and GFP in single cells remained high, regardless of their expression strength and variability (Fig 4D and 4E). Collectively, these results suggest that resource competition and cellular heterogeneity hardly affect proportional protein production from the polycistronic operon.
Fig 4

Polycistronic operon enables highly correlated protein expression.

(A) Various promoters are used to control the co-expression of mCherry and GFP from a polycistronic operon. (B) mCherry and GFP in individual cells under the control of the inducible promoter PlacUV5 at different IPTG induction levels. a.u., arbitrary units. (C) mCherry and GFP in individual cells under the control of constitutive promoters with different strengths. (D) Relationships among variability, mean, and correlation between mCherry and GFP in the inducible promoter construct. (E) Relationships among variability, mean, and correlation between mCherry and GFP in promoter library constructs. Variability and mean are quantified using GFP.

Finally, we sought to explore the evolutionary benefits of correlated protein outputs in single cells in the presence of resource competition. We considered a generic horizontal gene transfer process, where the acquired genes bring beneficial functions, while they also negatively affect the expression of native genes by competing for limited resources. An antibiotic resistance model was built, where a species can independently deactivate two antibiotics by producing two resistance proteins, respectively (Note 3 in S1 Text). Positively correlated resistance proteins allow a small subpopulation of cells to survive high concentrations of both antibiotics (Fig 5), presenting a strategy for a population to cope with extremely harsh environments. Because the resource competition effect is always accompanied by resource variation, our results suggest an evolutionary mechanism that bacteria can use to compensate for the negative resource competition effect during horizontal gene transfer.
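The survival argument above can be illustrated with a toy simulation. The Python sketch below is not the model of Note 3 in S1 Text: it assumes standardized, normally distributed levels of two resistance proteins with a chosen correlation, and a hypothetical survival rule in which a cell survives only if both levels exceed the same high threshold. The function name, threshold, and population size are illustrative assumptions.

```python
import random

def survival_fraction(r, threshold=1.5, n_cells=50_000, seed=0):
    """Fraction of simulated cells whose two resistance levels both exceed
    the threshold, for a bivariate normal population with correlation r."""
    rng = random.Random(seed)
    scale = (1.0 - r * r) ** 0.5
    survivors = 0
    for _ in range(n_cells):
        z1 = rng.gauss(0.0, 1.0)                     # standardized R1 level
        z2 = r * z1 + scale * rng.gauss(0.0, 1.0)    # R2 correlated with R1
        if z1 > threshold and z2 > threshold:        # assumed survival rule
            survivors += 1
    return survivors / n_cells

s_neg = survival_fraction(-0.8)
s_zero = survival_fraction(0.0)
s_pos = survival_fraction(0.8)
for label, value in (("r = -0.8", s_neg), ("r =  0.0", s_zero), ("r = +0.8", s_pos)):
    print(f"{label}: survival fraction = {value:.4f}")
```

Under this assumed rule, positive correlation concentrates cells in the corner of the distribution where both proteins are high, so more cells survive a challenge that requires both resistances at once.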
Fig 5

Correlated expression of resistance proteins in single cells facilitates population survival under multiple antibiotics.

(A) An antibiotic resistance model. Two hypothetical antibiotics, A1 and A2, are independently deactivated by two resistance proteins, R1 and R2, respectively. Population survival rates are simulated in the presence of both A1 and A2. (B) Simulated joint distribution of R1 and R2 at three different scenarios: negative correlation with r(R1,R2) = -0.8, uncorrelated with r(R1,R2) = 0, and positive correlation with r(R1,R2) = 0.8. (C) The dependence of population survival rate on the correlation between R1 and R2. Error bars represent standard deviations of 100 simulations. (D) Survival rate profiles at three simulated correlations as in B.


Discussion

Overall, our results reveal that heterogeneity in shared resources and in competing modules are two seemingly opposite driving forces that work together to coordinate protein outputs for all genes in single cells. In harsh environments, positively correlated protein outputs allow a small subpopulation of cells with abundant resources to support multitasking, facilitating individual survival and evolution of the population, which could present a previously unknown challenge in treating multi-drug resistant bacteria [34]. As a resource becomes abundant for all cells, the corresponding module outputs no longer depend on the amount of the resource. In this case, the effects of both resource competition and resource variation are weak, and the module outputs rely solely on the corresponding module inputs and thus function independently. This understanding of generic resource allocation in single cells provides a basis for analyzing and designing more sophisticated gene regulatory networks with high precision and ensemble efficiency. Theoretically, our analytic framework can also be extended to describe competition and heterogeneity in other competing cellular processes. For example, two enzyme pathways often compete for a shared metabolite substrate. In this case, competition between two metabolic pathways, together with heterogeneity in cellular metabolite concentration, could affect single-cell metabolic flux in a similar way to that analyzed in this work, illuminating metabolic behavior previously unknown from existing analyses that do not consider their joint effects [14,35-37]. This improved understanding would bring us closer to more precise design of engineered microbial systems for various applications in biotechnology.

Materials and methods

Strains and DNA construction

The DH10GFP E. coli strain originally created by the Ellis lab [29] was ordered from Addgene (# 109392). The carboxylic acid reductase (car) gene was PCR amplified from the pB5k-sfp-car plasmid as described in previous work [16]. An mCherry gene was fused to the C-terminus of the car gene via a linker that encodes a helix-forming peptide A(EAAAK)3A, as used in a previous paper [29]. The car-mCherry fusion gene was cloned into a BglBrick vector pBbA5c (p15A origin, lacUV5 promoter, chloramphenicol selection marker) via Golden Gate DNA Assembly, resulting in plasmid pBbA5c-CAR-mCherry. Meanwhile, plasmid pBbA5c-CAR-mCherry(M71G) carrying a non-fluorescent mCherry mutant (M71G) was created via site-directed mutagenesis and was used in FISH experiments. Plasmids pBbA5c-CAR-mCherry and pBbA5c-CAR-mCherry(M71G) were individually transformed into strain DH10GFP, yielding strains sYH006 and sYH013, respectively (S2 Table). E. coli DH10B strain was purchased from New England Biolabs Ltd. (Ipswich, MA, USA) and used as a negative control in the FISH experiment. To investigate correlated protein expression from the same operon, an IPTG-inducible PlacUV5 promoter and a library of constitutive promoters were used to control the transcription of mCherry and GFP from the same mRNA. Strong and identical RBS sequences (tttaagaaggagatatacat) were used for both mCherry and GFP. A small library of constitutive promoters (S1 Table) was designed based on the sequence of BioBrick promoter J23119, and was constructed into a plasmid with SC101 origin and chloramphenicol selection marker using a one-step Golden Gate DNA Assembly. All plasmids were confirmed by Sanger sequencing.

Growth conditions

Cell cultures were grown overnight in 3 mL of LB medium with 20 μg/mL chloramphenicol at 37°C. The overnight cultures were diluted, in ratios between 1:400 and 1:1000, into 30 mL (for FISH samples) or 3 mL (for fluorescent protein assay samples) of M9 minimal medium, supplemented with 0.4% glucose, 1 mM thiamine, 0.4 mM leucine, and varying amounts of IPTG in either baffled shake flasks (for FISH samples) or test tubes (for fluorescent protein assay samples). Cells were cultivated for approximately 10 hours (~5 cell cycles) and harvested in exponential phase when an OD600 of 0.2–0.4 was reached. Cells cultivated for 9 to 12 hours were randomly harvested as controls to confirm that a 10-hour incubation is sufficient for the cells to reach a steady state.

Maturation of fluorescent proteins

To allow maturation of the fluorescent proteins for more accurate quantification, cells were incubated for an additional period before taking fluorescence measurements [38-40]. Specifically, 1 mL of each cell culture was transferred into a pre-chilled test tube and placed in an ice-water bath for 10 min to halt cell growth and gene expression. The cell cultures were centrifuged at 13,000 rpm for 30 s at 4°C. The supernatant was removed, and the pellet was resuspended in 1 mL of phosphate buffered saline (PBS) solution containing 500 μg/mL of rifampicin. The resuspended cells were incubated at 37°C for 90 min and subjected to imaging.

mRNA fluorescence in situ hybridization (FISH)

Probe design

Two sets of custom probes for GFP and CAR-mCherry were designed using the online Stellaris Probe Designer (S4 Table) and synthesized by Biosearch Technologies Inc (Novato, CA, USA). Probes for GFP and CAR-mCherry were labelled with Quasar 570 and Quasar 670 fluorescent dyes, respectively.

Fixation and labelling

Cell fixation and mRNA labelling were performed following established protocols [41]. In detail, 15 mL of each cell culture at OD600 = 0.4 was collected and transferred to an ice-chilled 50-mL centrifuge tube, followed by immediate centrifugation at 4,500 g for 5 min at 4°C. The supernatant was carefully removed, and the pellet was resuspended in 1 mL of 3.7% formaldehyde in 1x PBS. The resuspended cells were then mixed gently at room temperature for 30 min using a nutator. Next, the cells were centrifuged at 400 g for 8 min at room temperature, then washed twice with 1 mL of 1x PBS. Then the cells were resuspended in 300 μL of DEPC-treated water, permeabilized by adding 700 μL of 100% ethanol, and mixed for 1 hour at room temperature using a nutator. After mixing, the cells were centrifuged at 600 g for 7 min at room temperature, and then resuspended in 1 mL of 40% wash solution (353 μL formamide, 100 μL 20x saline-sodium citrate (SSC), 547 μL water). The resuspended solution was then gently mixed for 5 min at room temperature using a nutator and centrifuged at 600 g for 7 min at room temperature. For each sample, the cell pellet was resuspended in 50 μL of 40% hybridization solution (1 g of dextran sulfate, 3530 μL of formamide, 10 mg of E. coli tRNA, 1 mL of 20x SSC, 40 μL of 50 mg/mL BSA, and 100 μL of 200 mM ribonucleoside vanadyl complex for 10 mL solution) with probes at a final concentration of 1 μM per probe set. The mixture was incubated at 30°C overnight. After hybridization, samples were washed four times in 40% wash solution before imaging in 2x SSC.

Microscopy and image analysis

Microscopy was performed using a Nikon Eclipse Ti microscope (Tokyo, Japan) equipped with an EMCCD camera (Photometrics Inc. Huntington Beach, CA, USA) and a 100 x, NA 1.40, oil-immersion phase-contrast objective lens. An X-Cite 120 LED was the light source. Three band-pass filter cubes (FITC, DsRed, and C-FL CY5, all from Nikon Inc.) were used for spectral separation. In both FISH and protein fluorescence experiments, an exposure time of 20 ms was used for phase-contrast images. In FISH experiments, the DsRed filter and the C-FL CY5 filter were used to detect Quasar 570 (exposure time of 500 ms, with an electro-multiplier gain of 200 x) and Quasar 670 (exposure time of 300 ms, with an electro-multiplier gain of 100 x), respectively. In protein fluorescence experiments, the FITC and the DsRed filter cubes were used to detect GFP (exposure time of 500 ms, no electro-multiplication) and mCherry (exposure time of 300 ms, no electro-multiplication), respectively. The power of the LED light was carefully controlled so that no significant photobleaching was detected. Images were collected by an automated scanning function of the microscope with a built-in Perfect Focus System (PFS) and analyzed using the Nikon NIS-elements software package. On average, 3000 single cells per protein sample and 1000 single cells per FISH sample were collected and analyzed.

Cell segmentation

The phase-contrast images were used for cell identification and segmentation. Overlapping cells, dividing cells, and long unhealthy cells (together less than 1% of cells) were excluded using a length filter, an area filter, and visual inspection.

mRNA fluorescence quantification

Single-cell mRNA fluorescence was quantified following a previously described method [41]. Specifically, background fluorescence was first subtracted to eliminate the effects of autofluorescence on different images. The total fluorescence intensity within a cell was normalized by the cell area to reduce the influence of variations in cell cycle and growth rate. False-positive thresholds for Quasar 570 and Quasar 670 were determined from the fluorescence distribution in a negative control sample (E. coli DH10B strain). The fluorescence intensity of a single mRNA was identified as the peak position of the fluorescence distribution in low-expression cells. To convert the total fluorescence in a cell to the mRNA copy number, we divided the total by the average fluorescence intensity of a single mRNA and rounded the value to the closest integer.
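The conversion from fluorescence to copy number can be sketched as follows. This is a minimal illustration rather than the authors' MATLAB implementation: the function names, the histogram bin width, the example values, and the choice to assign zero copies to cells below the false-positive threshold are all assumptions made for the example.

```python
import math
from collections import Counter

def histogram_peak(values, bin_width):
    """Centre of the most populated histogram bin (a simple peak estimate).

    Ties are resolved in favour of the lower bin.
    """
    counts = Counter(math.floor(v / bin_width) for v in values)
    best_bin = max(counts, key=lambda b: (counts[b], -b))
    return (best_bin + 0.5) * bin_width

def copy_number(fluor, threshold, single_mrna_fluor):
    """Convert background-subtracted, area-normalised fluorescence to copies.

    Assumption: cells below the false-positive threshold count as zero copies.
    """
    if fluor < threshold:
        return 0
    # Round half up, i.e. to the closest integer.
    return math.floor(fluor / single_mrna_fluor + 0.5)

# Hypothetical fluorescence values from low-expression cells.
low_expression = [9.5, 10.2, 10.4, 10.8, 19.9, 20.5, 10.1, 30.2]
single_mrna = histogram_peak(low_expression, bin_width=2.0)

print(single_mrna)                          # estimated single-mRNA intensity
print(copy_number(34.0, 5.0, single_mrna))  # a cell above threshold
print(copy_number(3.0, 5.0, single_mrna))   # a cell below threshold
```

`math.floor(x + 0.5)` is used instead of Python's built-in `round`, which rounds exact halves to the nearest even integer.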

Protein fluorescence quantification

The background fluorescence of each image was subtracted, and the total fluorescence intensity of each cell was normalized by cell area. The cell-area-normalized total pixel intensity was used as the single-cell protein expression level.
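A minimal sketch of this quantity, assuming that each segmented cell is available as a list of pixel intensities and that the image background is estimated as the median of cell-free pixels (the text does not say how the background was estimated; both function names are hypothetical):

```python
from statistics import median

def estimate_background(cell_free_pixels):
    """Assumed background estimate: median intensity of cell-free pixels."""
    return median(cell_free_pixels)

def protein_level(cell_pixels, background):
    """Background-subtracted total intensity divided by cell area (pixels)."""
    area = len(cell_pixels)
    total = sum(p - background for p in cell_pixels)
    return total / area

# Hypothetical pixel intensities.
background = estimate_background([98, 100, 101, 100, 102])
level = protein_level([120, 130, 110, 140], background)
print(background, level)
```

Dividing by area makes the value a mean intensity per pixel, so cells of different sizes can be compared directly.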

Statistics

Gene expression variability was quantified as the variance divided by the squared mean. The Pearson correlation coefficient was used to quantify the correlation between the expression levels of two genes in single cells. The 95% confidence intervals of all estimated parameters were constructed using the bootstrap method.
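These three statistics can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' code: it uses the population variance, resamples cells as (x, y) pairs, and builds a percentile bootstrap interval. The text does not specify which bootstrap variant was used, and the synthetic data are invented for the example.

```python
import random
from math import sqrt

def noise_cv2(x):
    """Variance over squared mean (population variance assumed)."""
    m = sum(x) / len(x)
    var = sum((v - m) ** 2 for v in x) / len(x)
    return var / m ** 2

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def bootstrap_ci(x, y, stat, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for stat(x, y), resampling cells in pairs."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(stat([x[i] for i in idx], [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int(alpha / 2 * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Synthetic single-cell data (hypothetical): two positively coupled proteins.
rng = random.Random(1)
p1 = [rng.gauss(100, 20) for _ in range(200)]
p2 = [0.5 * v + rng.gauss(0, 10) for v in p1]

r = pearson(p1, p2)
ci = bootstrap_ci(p1, p2, pearson)
print(f"CV^2(p1) = {noise_cv2(p1):.3f}, r = {r:.2f}, "
      f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Resampling whole cells rather than the two channels separately preserves the pairing that the correlation depends on.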

Data and code availability

The data and the MATLAB code for the modelling results that support the findings are available from https://github.com/yhan0410/Data-and-model-in-Heterogeneity-coordinates-bacterial-multi-gene-expression-in-single-cells.

Sequences of constitutive promoters. (DOCX)

Strains used in this study. (DOCX)

Statistics determined by single cell experiments in this work. (DOCX)

Probes used in FISH experiments. (DOCX)

Models and parameters. (DOCX)

Data reproducibility for the Simpson’s paradox phenomenon in multi-gene expression.

(A) Dashed lines are linear fits to the merged data. The three replicates were performed on different days. (B) Correlation from random, even sampling across all induction conditions. Error bars represent standard deviations of 100 replicates. (TIF)
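The effect of even sampling on the merged correlation can be illustrated with synthetic data. The sketch below is hypothetical throughout: the condition means, cell counts, and noise levels are invented, and a shared multiplicative factor stands in for whatever makes the two proteins co-vary within one condition. It shows positive correlations within each condition alongside a negative correlation in the merged data, both with all cells and with an equal number of cells drawn from each condition.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

rng = random.Random(0)

# Hypothetical induction conditions: ((mean protein 1, mean protein 2), cells).
# Higher induction raises protein 1 and lowers protein 2; cell counts are uneven.
CONDITIONS = [((20.0, 100.0), 3000), ((50.0, 80.0), 2000), ((80.0, 60.0), 1000)]

def simulate(means, n_cells):
    """Cells in one condition: a shared factor couples the two proteins."""
    m1, m2 = means
    cells = []
    for _ in range(n_cells):
        shared = rng.gauss(1.0, 0.2)  # assumed cell-wide variability
        cells.append((m1 * shared * rng.gauss(1.0, 0.05),
                      m2 * shared * rng.gauss(1.0, 0.05)))
    return cells

def corr(cells):
    return pearson([c[0] for c in cells], [c[1] for c in cells])

samples = [simulate(means, n) for means, n in CONDITIONS]
within_r = [corr(s) for s in samples]

# Merge every cell, then merge an equal-sized random draw from each condition.
merged_r_all = corr([c for s in samples for c in s])
n_even = min(len(s) for s in samples)
merged_r_even = corr([c for s in samples for c in rng.sample(s, n_even)])

print("within conditions:", [round(r, 2) for r in within_r])
print("merged, all cells:", round(merged_r_all, 2))
print("merged, even sampling:", round(merged_r_even, 2))
```

Drawing the same number of cells per condition removes the weighting that unequal cell counts would otherwise give to the merged estimate.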

Translational resource competition under various parameters.

(A) The relationship between endogenous protein (P2) and heterologous protein (P1) at various β values. (B) The relationship between endogenous protein (P2) and heterologous protein (P1) at various β values. β and β are varied by tuning β and β (from 1*10−2 to 1*10−6) respectively. (C-J) The same relationship as Fig 2C–2F with M variability set as 0.1. (C-F) Correlation between M1 and M2 is set as 0. (G-J) Correlation between M1 and M2 is set as 0.2. (TIF)

Coarse-grained model of transcriptional resource competition.

(A) Schematic of RNAP allocation among heterologous genes (i = 1), endogenous protein-coding genes (i = 2), and rRNA/tRNA genes (i = 3). RNAP, free RNAPs; RNAP, total RNAPs; D, genes free from RNAPs; D, gene-RNAP complexes; D, total genes; M, total mRNAs. (B) RNAP competition in a single cell. Left, relationship between D and the fraction of free RNAP (RNAP/RNAP). Right, relationship between heterologous mRNA (M1) and endogenous mRNA (M2) caused. Calculations of RNAP, M1, and M2 are described in Note 2.2 in S1 Text with parameters listed in Table A in S1 Text. (C) Correlations between competing mRNAs in single cells, r(M1, M2), change with correlations between promoter strengths r(D, D) (left), D (center), and RNAP (right). D > 200 is considered an unrealistic region. RNAP affects r(M1, M2) only in the RNAP-limiting region. (TIF)

Distributions of mRNA copy number under different conditions.

Single-cell GFP mRNA copy numbers measured by FISH were fitted to Poisson distributions because GFP was transcribed from a constitutive promoter. CAR-mCherry mRNA copy numbers were fitted to negative binomial distributions because they were transcribed from an inducible promoter. (TIF)

1 Nov 2019

Dear Dr Zhang,

Thank you very much for submitting your manuscript 'Heterogeneity coordinates bacterial multi-gene expression in single cells' for review by PLOS Computational Biology. Your manuscript has been fully evaluated by the PLOS Computational Biology editorial team and in this case also by independent peer reviewers. The reviewers appreciated the attention to an important problem, but raised some substantial concerns about the manuscript as it currently stands. While your manuscript cannot be accepted in its present form, we are willing to consider a revised version in which the issues raised by the reviewers have been adequately addressed. We cannot, of course, promise publication at that time.

Your revisions should address the specific points made by each reviewer. Please return the revised version within the next 60 days.

Sincerely,
Christoph Kaleta, Associate Editor, PLOS Computational Biology
Alice McHardy, Deputy Editor, PLOS Computational Biology

Reviewer's Responses to Questions

Comments to the Authors:

Reviewer #1: The article presents experimental and modeling work on the correlation between protein levels (and between RNA levels) in a population of genetically identical bacteria. The central finding concerns the relationship between the anti-correlation of protein levels that occurs when different conditions are merged and the positive correlation that characterizes each condition individually. Fluctuations in available resources (mRNAs in particular) are identified as the key to explain observations. On the plus side I found the paper quite well written. The modelling results and experimental data are presented clearly and thoroughly. On the minus side, the modeling framework needs to be placed in the context of existing models, and the interpretation of the main finding is in my view not completely convincing.

Main:

1) The general problem faced in this work (competition, allocation of cellular resources etc.) is well studied and the model discussed in this paper seems to be close to previous models both technically and in terms of the questions being asked and lessons being drawn (the fact that P1 is considered as “heterologous” seems to me to be immaterial for the conclusions). For instance, the fact that (the expression levels of) two proteins display negative correlation (in any given condition) immediately suggests, as the authors say, competition for shared positive regulators (eg ribosomes). The full picture is however much richer and includes the possibility of having positive correlations, depending essentially on kinetic details (see e.g. 10.1016/j.bpj.2013.04.049 & doi.org/10.1371/journal.pcbi.1002203). On the other hand a positive correlation can also be induced by competition for a shared negative regulator of gene expression such as microRNAs (acting on mRNAs, see e.g. 10.1038/srep43673). (The suggested links only represent a few examples that came to my mind, but the modelling literature on these topics is huge.) In each case, correlations between the corresponding transcripts do not need to reflect those between their functional products. In my view, a discussion of previous approaches and of how the present model deviates from/generalizes/complements them is necessary. In particular, it would be important that authors clarify what biological insight discussed in this paper cannot be obtained without the specific modeling frame/assumptions they employed. In this respect, I think that the role of resource variability could be further highlighted against previous work.

2) Regarding the Simpson paradox, merging data coming from different conditions does not necessarily yield, as far as I understand, a new condition from which conclusions can be drawn. In general I would say that cases like the experiment of Hecht & al [S Hecht, S Shlaer, and MH Pirenne, Energy, quanta and vision, J Gen Physiol 25, 819–840 (1942)] provide a strong caveat against doing it: averaging over different conditions (patients in their case) can lead to erroneous interpretations of data. I understand that the authors take the merged dataset to model a “heterogeneous and fluctuating” environment, but frankly I am not convinced. Looking at individual conditions one would conclude that the competition between the two proteins is not there and everything is driven by the induction that changes the slope of the protein-protein dependence across different conditions. The fact that mixing experiments one observes a negative correlation does not change this fact. So why exactly do authors deem it important/interesting that averaging over conditions the correlation changes sign? Why is this special? This is really not clear to me, also because the negative correlation of mean values seems to be rather weak (especially when compared against the range of variability of single cells).

Important: it seems to me (Table S3) that the nr of cells is rather unevenly distributed across induction conditions (less cells at the maximum level compared to the control). Am I right in assuming that the fitting procedure used for the merged data accounts for this imbalance? (Otherwise the fit could be biased to return a negative correlation). This should be made very clear all throughout the text.

Minor:

Supplementary Note: The first equation (unnumbered) of Note 1 as well as the first equation (unnumbered) of Note 2 appear on a single line without any separation in my doc reader. This is confusing (but it may depend on the reader, not sure).

About parameters: Models tend to use somewhat standardised parameters so I don’t doubt that the representative results displayed by the authors represent a physiologically realistic scenario. However a discussion of how sensitive results are to the model’s parameters would be welcome.

Equation 1 plays a central role in this manuscript. The approximation based on which it is derived (Supplementary Note 1) seems reasonable to me but I would stress it in the Main Text. Also, Eq 1 is rather intuitive once the approximation is explained. I suggest the authors provide the reader with some guideline to interpret the physical meaning of Eq 1 already in the Main Text.

Reviewer #2: The work by Han and Zhang reports an extremely interesting study on resource competition in single cells at the translational and transcriptional level. The authors found that the former correlates positively in single cells, but negatively in the population. This Simpson paradox is not found at the transcriptional level. The work is very nicely and concisely summarized. I only have some minor suggestions:

* as the authors submitted to PLOS CB, I do not think the mathematical model needs to be hidden in the supplementary materials. Rather the model and major mathematical results should be explicitly shown in the main text. I believe this is particularly true for line 80 to 130, where the train of thought is interrupted by reference to the supplementary material.
* the authors say within certain ranges (line 107) with feasible parameters (line 116-117). I think these numbers should be made explicit (including some discussion) in the main text.
* the connection between mathematical model and experimental realization may be improved if Fig 1A for instance also includes the model variables.
* the authors say shining light on a previously debated issue (line 131-132). Please, could you briefly indicate the arguments put forward by the references in this debate.
* line 134 - 136, what would be required to reverse the correlation so that’s consistent with line 119?
* please deposit your data and matlab scripts on github or some other public database

Have all data underlying the figures and results presented in the manuscript been provided?

Reviewer #1: Yes
Reviewer #2: No: Some data, in particular the scripts, are only available on request

31 Dec 2019

Submitted filename: Response to reviewers__final.DOCX

9 Jan 2020

Dear Dr Zhang,

We are pleased to inform you that your manuscript 'Heterogeneity coordinates bacterial multi-gene expression in single cells' has been provisionally accepted for publication in PLOS Computational Biology. Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow-up email.

Thank you again for supporting Open Access publishing. We look forward to publishing your paper in PLOS Computational Biology.

Sincerely,
Christoph Kaleta, Associate Editor, PLOS Computational Biology
Alice McHardy, Deputy Editor, PLOS Computational Biology

Reviewer's Responses to Questions

Comments to the Authors:

Reviewer #1: My concerns have been addressed. The connection between the merged dataset and heterogeneous regions in extended systems is indeed helpful.

Reviewer #2: my concerns have been appropriately addressed

Have all data underlying the figures and results presented in the manuscript been provided?

Reviewer #1: None
Reviewer #2: Yes

23 Jan 2020

PCOMPBIOL-D-19-01269R1
Heterogeneity coordinates bacterial multi-gene expression in single cells

Dear Dr Zhang,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,
Sarah Hammond
PLOS Computational Biology
  40 in total

Review 1.  Multidrug evolutionary strategies to reverse antibiotic resistance.

Authors:  Michael Baym; Laura K Stone; Roy Kishony
Journal:  Science       Date:  2016-01-01       Impact factor: 47.728

Review 2.  Physiological heterogeneity in biofilms.

Authors:  Philip S Stewart; Michael J Franklin
Journal:  Nat Rev Microbiol       Date:  2008-03       Impact factor: 60.633

3.  Translational cross talk in gene networks.

Authors:  William H Mather; Jeff Hasty; Lev S Tsimring; Ruth J Williams
Journal:  Biophys J       Date:  2013-06-04       Impact factor: 4.033

4.  Modelling and measuring intracellular competition for finite resources during gene expression.

Authors:  Renana Sabi; Tamir Tuller
Journal:  J R Soc Interface       Date:  2019-05-31       Impact factor: 4.118

5.  Free RNA polymerase in Escherichia coli.

Authors:  Michael Patrick; Patrick P Dennis; Mans Ehrenberg; Hans Bremer
Journal:  Biochimie       Date:  2015-10-19       Impact factor: 4.079

6.  Characterizing bacterial gene circuit dynamics with optically programmed gene expression signals.

Authors:  Evan J Olson; Lucas A Hartsough; Brian P Landry; Raghav Shroff; Jeffrey J Tabor
Journal:  Nat Methods       Date:  2014-03-09       Impact factor: 28.547

Review 7.  Dynamic pathway regulation: recent advances and methods of construction.

Authors:  Sue Zanne Tan; Kristala Lj Prather
Journal:  Curr Opin Chem Biol       Date:  2017-10-20       Impact factor: 8.822

8.  Gratuitous overexpression of genes in Escherichia coli leads to growth inhibition and ribosome destruction.

Authors:  H Dong; L Nilsson; C G Kurland
Journal:  J Bacteriol       Date:  1995-03       Impact factor: 3.490

9.  Systematic characterization of maturation time of fluorescent proteins in living cells.

Authors:  Enrique Balleza; J Mark Kim; Philippe Cluzel
Journal:  Nat Methods       Date:  2017-11-20       Impact factor: 28.547

10.  ceRNA crosstalk stabilizes protein expression and affects the correlation pattern of interacting proteins.

Authors:  Araks Martirosyan; Andrea De Martino; Andrea Pagnani; Enzo Marinari
Journal:  Sci Rep       Date:  2017-03-07       Impact factor: 4.379

  3 in total

1.  Practical observations on the use of fluorescent reporter systems in Clostridioides difficile.

Authors:  Ana M Oliveira Paiva; Annemieke H Friggen; Roxanne Douwes; Bert Wittekoek; Wiep Klaas Smits
Journal:  Antonie Van Leeuwenhoek       Date:  2022-01-18       Impact factor: 2.271

2.  Massively parallel gene expression variation measurement of a synonymous codon library.

Authors:  Alexander Schmitz; Fuzhong Zhang
Journal:  BMC Genomics       Date:  2021-03-02       Impact factor: 3.969

3.  Accelerating Whole-Cell Simulations of mRNA Translation Using a Dedicated Hardware.

Authors:  David Shallom; Danny Naiger; Shlomo Weiss; Tamir Tuller
Journal:  ACS Synth Biol       Date:  2021-11-23       Impact factor: 5.110
