Literature DB >> 34762642

A conserved expression signature predicts growth rate and reveals cell & lineage-specific differences.

Zhisheng Jiang1, Serena F Generoso2, Marta Badia3, Bernhard Payer2,3, Lucas B Carey1,3.   

Abstract

Isogenic cells cultured together show heterogeneity in their proliferation rate. To determine the differences between fast and slow-proliferating cells, we developed a method to sort cells by proliferation rate, and performed RNA-seq on slow and fast proliferating subpopulations of pluripotent mouse embryonic stem cells (mESCs) and mouse fibroblasts. We found that slowly proliferating mESCs have a more naïve pluripotent character. We identified an evolutionarily conserved proliferation-correlated transcriptomic signature that is common to all eukaryotes: fast cells have higher expression of genes for protein synthesis and protein degradation. This signature accurately predicted growth rate in yeast and cancer cells, and identified lineage-specific proliferation dynamics during development, using C. elegans scRNA-seq data. In contrast, sorting by mitochondria membrane potential revealed a highly cell-type specific mitochondria-state related transcriptome. mESCs with hyperpolarized mitochondria are fast proliferating, while the opposite is true for fibroblasts. The mitochondrial electron transport chain inhibitor antimycin affected slow and fast subpopulations differently. While a major transcriptional-signature associated with cell-to-cell heterogeneity in proliferation is conserved, the metabolic and energetic dependency of cell proliferation is cell-type specific.

Entities:  

Mesh:

Year:  2021        PMID: 34762642      PMCID: PMC8610284          DOI: 10.1371/journal.pcbi.1009582

Source DB:  PubMed          Journal:  PLoS Comput Biol        ISSN: 1553-734X            Impact factor:   4.475


Introduction

Rates of cell growth and division vary greatly, even among isogenic cells of a single cell-type, cultured in the same optimal environment [1]. Cell-to-cell heterogeneity in proliferation rate has important consequences for population survival in bacterial antibiotic resistance, stress resistance in budding yeast, and chemo-resistance in cancer [2-10]. A recent study has demonstrated semi-heritable cell-to-cell heterogeneity in gene expression in mammalian cells, which is associated with drug resistance in cancer [11] and time-lapse fluorescence microscopy has shown that cell-to-cell variability in the expression of some genes, such as p53 and p21, is associated with cell-to-cell variability in proliferation and survival [1,12]. While microscopy can identify dynamic relationships between gene expression and cell fate, it is limited to measurements of one or two genes per cell. Single-cell RNA sequencing measures transcriptome-level heterogeneity but does not directly link this to cell-biological heterogeneity in organelle state, or dynamic heterogeneity in proliferation or drug resistance. Transcriptome-level approaches for understanding within-population cell-to-cell heterogeneity in proliferation and other dynamic processes are lacking. While the presence of intrapopulation variation in proliferation, transcriptome, and organelle-state in both steady-state and in differentiation populations is well established, the relationship among the three remains unclear. One possibility is that the proliferation-correlated gene expression program is the same, regardless of if one looks at interpopulation variation due to genetic or environmental differences, or intrapopulation heterogeneity due to epigenetic differences and expression noise. However, in the budding yeast Saccharomyces cerevisiae, the expression program of intrapopulation heterogeneity in proliferation rate only partially resembles that of cells growing at different rates due to genetic or environmental perturbations[8]. The relation between gene expression and proliferation rate is much less well studied in mammalian cells. In yeast, in tumors, and in organs, genetic, environmental and developmental changes cause changes in proliferation rate, and changes in the expression of hundreds or thousands of genes [13-17]. Unsurprisingly, many of the genes for which changes in expression are associated with changes in proliferation rate are associated with adverse clinical outcomes in cancer and with antibiotic and antifungal resistance [18,19]. Within a population of microbes, and within a single multicellular organism, the correct balance of proliferation states and rates is essential. Yet measuring this heterogeneity is difficult, and without such data, understanding the consequences of this heterogeneity is impossible. Gene expression is associated with phenotype, but mRNAs themselves do not often directly cause phenotypes. Instead, they serve as markers for cell-biological differences between cells. Phenotypes are mostly driven by larger cell-biological differences between cells, such as differences in metabolic state. Cell-to-cell heterogeneity in mitochondria state has been linked to differences in transcription rates, growth rates, proliferation and developmental trajectories [20-22]. Both cancer cells and pluripotent stem cells have atypical metabolisms and use glycolysis to produce much of their ATP, instead of the mitochondria-based oxidative phosphorylation, which is the predominant form of ATP-generation in differentiated cells [23]. It is unknown if this inter-population variation in proliferation, transcriptome, and mitochondria extends to intra-population variation among single cells within a single isogenic population. Pluripotent stem cells exist in various states, such as naïve or primed, based on culture conditions and embryonic origin [24]. Mouse embryonic stem cells (mESCs) reflect the naïve pluripotency state of the blastocyst epiblast and can be cultured in either serum+LIF or 2i+LIF conditions, the latter involving inhibitors of FGF/ERK and GSK3 pathways. Culture in 2i+LIF conditions promotes a ground state more closely mirroring the in vivo situation with reduced heterogeneity in pluripotency gene expression and different cell cycle profile when compared to cells grown in serum+LIF [25-27]. Nevertheless, even in 2i+LIF conditions, mESCs display a certain amount of cell-to-cell heterogeneity [28,29] and it is unclear, how this relates to heterogeneity in differentiated cell types when it comes to gene expression and its link to proliferation rate. To understand the relation between intra-population transcriptome heterogeneity and heterogeneity in proliferation, we developed a FACS-based method to sort cells by proliferation rate. We applied this method to mouse immortalized fibroblasts and mESCs and performed RNA-seq on fast, medium and slow proliferating cell sub-populations. We identified a “proliferation signature”, mostly consisting of ribosome-biogenesis (protein synthesis) and proteasome-related (protein degradation) genes that are highly expressed in fast proliferating fibroblasts and ESCs. Moreover, the proliferation signature is conserved across cell-type and species, from yeast to cancer cells, allowing us to predict the relative proliferation rate from the transcriptome. We used this gene expression signature to predict proliferation rates in single cells from scRNA-seq data of C. elegans development. Unlike previous models to predict growth rate from gene expression [30], this model has no free parameters other than the set of genes, and does not suffer from overfitting–it can predict differences in growth rate in yeast, cancer cells and C. elegans, despite no data from either species going into the initial model. When applied to scRNA-seq data from developing C. elegans, this model identified a global slowdown in proliferation rate during development, with lineage-specific exceptions where some lineages maintain constant proliferation scores, and others even increase proliferation rate. In contrast to the universality of this main transcriptional signature, many mitochondria-related genes were upregulated in fast proliferating fibroblasts, yet down-regulated in fast-proliferating mESCs. Consistent with this, we found that a high mitochondria membrane potential is indicative of slow proliferating fibroblasts, while in mESCs this is a property of fast proliferating cells. And the mitochondrial electron transport chain complex III inhibitor antimycin treatment causes opposite effects on the proliferation of fibroblasts and ESCs. Fast, but not slow proliferating fibroblasts are particularly sensitive to the ATP synthase inhibitor oligomycin. Taken together, these results show the existence of a core protein-synthesis and protein-degradation expression program that is conserved across cell types and species, from yeast to mice, and a metabolic and energy-production program that is highly cell-type specific, with cell-type and proliferation-rate specific consequences on the effects of mitochondria inhibitors.

Results

A method to sort single mammalian cells by semi-heritable cell-to-cell heterogeneity in proliferation rate

To understand the causes and consequences of intrapopulation cell-to-cell heterogeneity in proliferation rate in mammalian cells we developed a method for sorting single mammalian cells by their proliferation rate (Figs . The cell-permeable dye carboxyfluorescein succinimidyl ester (CFSE) covalently binds free amines within cells, thus staining most intracellular proteins at lysine residues. In cell types that divide symmetrically, such as embryonic stem cells and immortalized fibroblasts [31], the equal dilution of CFSE into the two daughter cells enables counting of the number of divisions that each cell has undergone. This method is commonly used to differentiate proliferating from nonproliferating cells, and to count discrete numbers of cell division, such as in the study of T- and B-cell proliferation following antigen stimulation [32]. To eliminate confounding effects due to differences in initial staining we used fluorescence-activated cell sorting (FACS) to obtain an initially homogeneous cell population of cells with identical CFSE signals (Figs . Thus, the initial CFSE signal is independent of initial cell-to-cell variation in dye uptake or protein content, as the initial distribution is determined by the FACS gate. CFSECFR2 conjugates are stable and unable to exit the cell [33]; the dye signal is stable for over eight weeks in non-dividing lymphocytes [34]. The measured CFSE signal should be relatively insensitive to cell-to-cell variation in protein degradation. We cultured this sorted starting cell population for several generations, during which time the CFSE signal decreases with each cell division (. Consistent with the decrease in CFSE being mostly due to cell division, the population-level doubling time of each cell type can be calculated based on the decrease in CFSE signal over time (, and these doubling times (19–21 hours for fibroblasts and 10–12 hours for mESCs) are consistent with those reported by other methods [35,36].

A CFSE-based method to sort mammalian cells by proliferation rate.

(A) Cells were stained with CFSE and a subpopulation of cells with identical CFSE levels was collected by FACS. Growth for several generations resulted in a heterogeneous cell population with a broad CFSE distribution, and cells with high, medium, and low CFSE signal (slow, medium and fast proliferation, respectively) were sorted by FACS for RNA-sequencing. (B) The change in the CFSE distribution over time, for fibroblasts and ESCs. (C, D) The population-level doubling time can be calculated by fitting a line to the median of the log2(CFSE) signal. We discard data from time 0, cells immediately after the sort, because the CFSE signal decreases in the initial hours, even in the absence of cell division, likely due to efflux pumps. (E, F) Bromodeoxyuridine (BrdU) was used to measure the % of cells in S-phase for FACS-sorted fast and slow proliferating subpopulations. Fibroblasts: 4 replicates, p = 0.0002441. ESCs: 3 replicates for ESCs, p = 0.001953. p-values are for binomial tests across all biological replicates that the two populations have the same percentage of cells in S-phase. (G) Examples of genes whose expression positively or negatively correlated with proliferation rate. Each line is one biological replicate, and the error bars are 95% confidence intervals for each expression value. To test if the intrapopulation heterogeneity in CFSE that develops after a few doublings corresponds to intrapopulation heterogeneity in proliferation rates, we stained cells with CFSE, isolated a homogenous population by FACS, grew ESCs or fibroblasts for 24h or 48h respectively, and used FACS to isolate the 20% of cells with the highest and lowest CFSE signal, and measured both viability and the fraction of cells in S phase (. We found that fast proliferating (low CFSE) subpopulations maintain higher proliferation rates for at least three days (Figs , and found no differences in viability between CFSE subpopulations (. Thus, intrapopulation differences in CFSE correspond to semi-heritable differences in proliferation rates. To identify genes whose expression is positively or negatively correlated with proliferation rate within a single population we grew fibroblasts in MEF (mouse embryonic fibroblast) medium and mouse ESCs in pluripotent ground-state promoting 2i+LIF medium [37], stained cells with CFSE, performed the initial sort to isolate cells with the same CFSE signal, and then grew fibroblasts for five days, and ESCs for three days. We then used FACS to isolate cells with high, medium, and low CFSE signal, and performed RNA-seq on each sub-population (.

Slow-proliferating ESCs are of more naïve pluripotent character than fast-proliferating ESCs

Embryonic stem cells exhibit cell-to-cell heterogeneity in the expression of naïve pluripotency marker genes such as Nanog, Stella (Dppa3) or Rex1 (Zfp42) [38-40]. Although this heterogeneity is most apparent in ESCs cultured in serum+LIF, even when cultured in ground state-pluripotency-promoting 2i+LIF conditions, the sub-population of ESCs with low NANOG-levels displays a propensity for lineage-priming and differentiation [28,29]. To determine if cell-to-cell variation in proliferation rate was caused by a sub-population of mESCs initiating a differentiation program, we determined the fold-change in expression between slow and fast proliferating sub-populations for a set of genes that are upregulated during lineage commitment (see “Differential expression of pluripotency…” in Materials and Methods). We found no consistent enrichment of these differentiation genes in fast versus slow proliferating cells, as they could be found to be expressed in either population (. However, the slow proliferating subpopulation did have higher expression of genes that are upregulated in naïve pluripotent cells, and in 2-cell(2C)-like state stem cells (, suggesting that slow proliferating mESCs are in a more naïve pluripotent cell state than their fast proliferating counterparts.

Slow-proliferating ESCs display a more naive pluripotent stemness character than fast-proliferating ESCs.

(A) Comparison of lineage commitment-related gene expression between fast and slow proliferating subpopulations. (B) Comparison of pluripotency-associated gene expression between fast and slow proliferating subpopulations. (C) Comparison of 2C-like state markers expression between fast proliferating subpopulation and slow proliferating sub-population. Dashed lines separate genes expressed preferentially in slow- (left of dashed line) or in fast-proliferating (right of dashed line) ESCs. P-values are from binomial tests, testing if genes are more often highly expressed in slow cells than would be expected by chance (53.5% of all genes are more highly expressed in slow cells).

Identification of biological processes correlated with proliferation rate that are conserved across cell-types and species, and within single populations

To identify functional groups of genes that are differentially expressed between fast and slow proliferating cells within a single population we performed gene set enrichment analysis (GSEA)[41,42] (Figs on mRNA-seq data from fast and slow proliferating subpopulations. We found that in both fibroblasts and ESCs, as well as for intrapopulation variability in budding yeast FACS-sorted by proliferation rate (data from van Dijk et al. [8]), genes involved in ribosome-biogenesis and the proteasome are more highly expressed in fast proliferating cells (. High expression of ribosomal genes is a common signature for fast proliferating cells [13,43,44], and cancer cells often exhibit high proteasome expression [45-47], and the work of Geiler-Samerotte et al. showed that growth-predictive proteins are components of the cytosolic unfolded protein response [48], However, it is not clear if this is related to proliferation in-and-of-itself or due to aneuploidy and other genetic alterations [49]. Our results suggest that coordinated regulation of the ribosome and proteasome is an intrinsic signature of fast proliferating cells that is conserved across cell-types and species.

Functional pathways for which cell-to-cell heterogeneity in expression correlates with proliferation rate across cell types and species.

(A) In Gene Set Enrichment Analysis, genes are sorted by their fast/slow expression value (left panel, bottom), and each gene is represented by a single black line (left panel middle). The enrichment score is calculated as follows: for each gene not in the GO preribosome gene set, the value of the green line decreases, and for each gene in the gene set, the value of the green line increases. The ES score will be near zero if the genes in a gene set are randomly distributed across the sorted list of genes, positive if most genes are to the left, and negative if most genes are to the right. (B) The heatmap (right panel) shows the expression (z-scored read counts) of preribosome genes in fibroblasts across four biological replicates of the CFSE sorting experiment. (C) Gene sets enriched (FDR<0.1) in both fibroblasts and ESCs were mapped as a network of gene sets (nodes) related by mutual overlap (edges), where the color (red or blue) indicates if the gene set is more highly expressed in fast (red) or slow (blue) proliferating cells. Node size is proportional to the total number of genes in each set and edge thickness represents the number of overlapping genes between sets. (D) GSEA results (FDR<0.1) of S. cerevisiae [8] that sorted by cell-to-cell heterogeneity in proliferation rate. (E) Pearson correlations of mean expression (average of log2(TPM+1)) of ribosome biogenesis genes vs proteasome genes across organ developmental time courses in seven species (see also ). To test if the coupling between ribosome biogenesis and proteasome expression holds across species and in diverse cell types, we analyzed the bulk RNA-seq data across developmental stages, covering multiple organs in seven species [17]. Ribosome biogenesis and proteasome expression are highly correlated (. The coordinated expression change with developmental stages between ribosome biogenesis genes and proteasome genes across organs and species suggests that the coordination between protein synthesis and degradation is likely to be a conserved feature across a large number of species and cell-types (. In addition to ribosome-biogenesis and the proteasome, several other gene sets are differentially expressed between fast and slow proliferating cells in both fibroblasts and ESCs (. mTORC1 (mammalian Target Of Rapamycin Complex 1) functions as a nutrient sensor and regulator of protein synthesis, and is regulated by nutrient and cytokine conditions that cause differences in proliferation [50,51]. We find that, even in the absence of genetic and environmental differences, mTORC1 is more active in fast proliferating cells. Activation of mTORC1 can promote ribosome-biogenesis [50,52], however, there is still controversy about the regulation of proteasome activity by mTORC1 [51,53-57]. The transcription factor MYC (, and MYC target genes ( are more highly expressed in fast proliferating cells. MYC is frequently amplified in cancer, regulates the transcription of ~15% of all genes [58] and is a master regulator of cell proliferation [59]. Overexpression of MYC promotes ribosome-biogenesis and cell growth rates [60,61], and active mTORC1 can promote MYC activation [62,63]. Our data suggest that increased expression of MYC and increased mTORC1 activity are general properties of fast-proliferating cells, and those genetic or environmental perturbations are not necessary to cause differential expression of these pathways.

Defining a proliferation signature to predict the growth rate across species

Expression of typical proliferation markers, such as PCNA and KI67, did not correlate with intra-population heterogeneity in proliferation (. The high degree of conservation of genes whose expression correlates with intra-proliferation heterogeneity in proliferation, from yeast to mouse, suggests that there should be a set of genes whose expression is predictive of growth rate across all eukaryotes. To build such a set we combined “proliferation correlated genes”–those with a Spearman correlation of rho = 1 in both fibroblasts and ESCs (243 genes) with genes from six ribosome biogenesis and proteasome related gene sets that are significantly enriched in both fibroblasts and ESCs, which result in a final gene set consisting of 370 genes (. We then applied ssGSEA [64], a rank-based method that computes an overexpression measure for a gene set of interest relative to all other genes in the genome. The proliferation signature score is the Normalized Enrichment Score, calculated by ssGSEA, for the gene set containing all 370 proliferation-correlated genes.

A proliferation signature model can predict relative growth rates from gene expression for species and cell-types on which it was not trained.

(A) Genes and proliferation signature spearman correlation with proliferation rate (sorted by CFSE). Compare with KI67 or PCNA, proliferation signature has a better correlation with proliferation rate. (B) Using the proliferation signature to predict growth rate in budding yeast, we apply ssGSEA to calculate the enrichment score of proliferation signature for each sample. The Pearson correlation of proliferation signature score with growth rate is 0.82 (p = 8.9×10−7), the grey shading is a 95% confidence band. (C) Using the proliferation signature to predict growth rate in cancer cell lines, the Pearson correlation is 0.65 (p = 1.9×10−8), the grey shading is a 95% confidence band. (D) Comparison of proliferation signature score between 2C-like ESC and non-2C-like ESC (paired t-test, p = 0.04669). To test the ability of this proliferation signature to predict proliferation rates in new data we used out-of-species cross-validation. While several models have been developed to predict growth rates from gene expression [30], the performance of these models has been evaluated using within-experiment cross validation, in which a single sample (e.g., condition or genotype) was held-out (excluded) and used for testing model performance. Accurate prediction of growth rates in cells for which actual growth rates cannot be measured, such as tumors in-vivo, or from single-cell RNA-sequencing data, would be more useful. However, models tested using in-experiment cross-validation (also known as internal validation) are often over-fit, resulting in poor performance when the model is applied to new data from new experiments [65]. To overcome this problem, we used the proliferation signature from above, which was developed using data from mouse cells, to predict growth rate from gene expression in budding yeast. Our model has correlations of R = 0.82, 0.73 and 0.77 across three different datasets (Figs . In contrast, the model of Wytock et al. [30], which was trained on these yeast data, has an out-of experiment predictive power of R < 0.15 (Fig S6 in Wytock et al. [30]). The Wytock et al. model is over-fit and cannot predict proliferation rates in new data. Similarly, our model has better performance (R = 0.65) than a cancer-specific model [66], which was trained on cancer cell-line data and cannot predict out-of-experiment ( (Fig 4 in Waldman et al. [66]). In contrast to most published models, our proliferation signature score model performs well on data on which it has not been trained. When run on FACS-sorted 2C-like embryonic stem cells (2C::tdTomato+) [67], our proliferation signature model predicts that 2C-like mESCs proliferate slower (. 2-cell embryos also have uniquely low proliferation scores (). These results are experimentally independent of, and biologically consistent with, our observation that expression of 2C-like cell state marker genes is higher in slow proliferating mESCs ( This provides further evidence that the proliferation signature we have identified can be universally applied to predict the proliferative state of many cell types.

Prediction of lineage-specific changes in proliferation rates during C. elegans development by the proliferation signature

Expression of the most commonly used markers (PCNA and KI67) for measuring proliferation rates in bulk populations are cell-cycle regulated, but the expression levels of these markers do not measure proliferation rates in single cells. Single-cell RNA sequencing is a powerful method for understanding development and differentiation in vivo, but it suffers from high levels of noise at the single-gene level [68]. We reasoned that our proliferation signature model would be ideal for measuring the proliferation rates of single cells from scRNA-sequencing data, as the model takes into account the expression of >300 genes, most of which are highly expressed and therefore have low levels of technical noise. To test the ability of the proliferation signature model on scRNAseq data we used a dataset of 89,701 cells from C. elegans development [69]. We computed the proliferation signature score for each cell and divided the cells into terminal cell types vs preterminal cells. Non-terminally differentiated cells have a higher proliferation signature score (t-test, p = 4.9×10−41) (. A visual comparison of the 89,701 cells in UMAP space, colored by either embryo age [69] or by proliferation signature score (, suggested that, globally, proliferation rates in single cells decrease as development proceeds. However, three clusters of cells did not follow this pattern: germline, intestine and M cells. To quantify the relationship between proliferation rates of single cells and developmental time we binned all cells with same embryo time, and calculated the correlation between proliferation score and developmental time (). The proliferation signature score decreases as the embryo develops (rho = -0.65, p = 9.3 ×10−19 and rho = -0.73, p = 8.7×10−24 when excluding the three outlier groups (). This conclusion from our single-cell gene expression analysis using the proliferation signature score is therefore quantitatively consistent with lineage-tracing microscopy data from Sulston et al. [70], showing that the rate of cell division within the developing nematode decreases during development (.

Proliferation signature of cell development.

(A) A cartoon showing four terminal cells, and a partial linage showing the final four generations of preterminal cells. Comparison of single-cell proliferation signatures between preterminal cell lineage and terminal cell types (t-test, p = 4.9×10−41). (B) UMAP projection of 89,701 cells. Cells in the left panel are colored by estimated embryo times; in the right panel by proliferation signature score. (C) To calculate the proliferation signature score (y-axis) at each time point (x-axis) cells are binned by embryo time, and the mean proliferation signature score for all cells in the same bin is calculated. The spearman correlations are -0.65 (p = 9.3 ×10−19) for binned data and -0.42 (p < 2.2e-16) for unbinned data. (D) Boxplots (line shows median, boxes interquartile range) of proliferation signature score for all cells with embryo time > 650min. (E) Temporal dynamics of proliferation scores of select cell lineages, showing the average proliferation score for all single cells in that lineage, at each time point. (F-G) Boxplot of C. elegans (F) and human (G) proliferation signatures as a function of developmental time, from scRNAseq data. To our surprise, the predicted proliferation rates increased after 650 minutes (). To understand why, we grouped all cells in 650 minutes or older by lineage (. The three outlier groups from UMAP space: germline, M cell and intestine, had the highest proliferation signature among all cell types late in development. Specifically, for these three cell types, the proliferation score did not decrease with the embryo time, but increased or maintained a high level (. This can be explained by lineage-specific characteristics: the germline is the only cell type in C. elegans that is continuously proliferating, M cells are a highly proliferative single mesodermal blast cell, and the intestinal cells, although they do not proliferate, continue to increase in both biomass and DNA content through endoreplication. Other cell types with high proliferation scores, such as Z1/Z4, are also known to continue proliferation after 650 minutes [71]. The proliferation signature score decreases with embryo time for most cell types, including body wall cell, hypodermis and ciliated amphid neuron, which are the most prevalent in the single-cell RNA-seq dataset (. Two cell types, hypodermis and seam cells, exhibited very low proliferation signature scores at late time points, while intestinal cells exhibited very high proliferation scores (Figs . Both these cell types contain multinucleated cells, but these multinucleated cells arise through very different mechanisms: hypodermis and seam cells through cell-fusion, and intestine through endoreplication [71]. Thus, two cell types with seemingly similar properties have highly divergent transcriptomes, and highly divergent mechanisms to reach their similar final state. Upon development to an L1 larva, a nematode has more than half of the final number of cells present in the adult. In contrast, mammals continue to rapidly increase in cell number even after embryonic development is complete. This difference can be seen in the change in proliferation signature over time, which decreases in C. elegans, but increases in human (data from Petropoulos et al. [72]) and mouse (data from Deng et al. [73]) (Figs . In conclusion, our proliferation signature genes obtained from mouse fibroblast and ESC data can predict dynamic changes in proliferation rates during the development of various cell types and species, thereby confirming its universal applicability.

Cell-to-cell heterogeneity in mitochondria state predicts variation in proliferation both in ESCs and fibroblasts, but in opposite directions

While the pattern of within-population proliferation-correlated expression in yeast, mouse fibroblasts and ESCs was broadly similar with regard to genes involved in protein synthesis and degradation, the behavior of metabolic and mitochondria-related genes in fast and slow proliferating subpopulations was highly cell-type specific. Mitochondria membrane and respiratory chain-related gene sets were more highly expressed in fast proliferating fibroblasts, but not in fast proliferating ESCs (. These results are consistent with differential mitochondrial states in ESCs when compared to differentiated cells like fibroblasts [23], which suggests the existence of different types of metabolism and proliferation-related heterogeneity between pluripotent and differentiated cell-types. We also observed cell-type specific differences in glycolysis, fatty acid metabolism, and other metabolic processes, suggesting fundamental differences in the metabolic pathways required for fast proliferation between pluripotent ESCs and differentiated cells like fibroblasts (. Shown are representative gene sets whose expression is significantly correlated with proliferation in either fibroblasts or ESCs, but whose expression changes with proliferation in opposing directions. NES>0 (higher expression in fast); NES<0 (higher expression in slow) The mitochondrial membrane potential is a major predictor of cell-to-cell heterogeneity in proliferation rate in budding yeast [9]. Mitochondria-related genes are more highly expressed in the fast proliferating subpopulation of fibroblasts (). In contrast, these genes are slightly more highly expressed in the slow proliferating subpopulation of ESCs. This suggests that the relation between cell-to-cell heterogeneity in mitochondria state and proliferation may be different in these two cell types. To test the ability of mitochondrial membrane potential to predict proliferation rate in mammalian cells we used the mitochondria membrane potential-specific dye tetramethylrhodamine ethyl ester (TMRE) to stain fibroblasts and ESCs, and performed both RNA-seq and proliferation-rate assays on high and low TMRE sub-populations (Fig 6A).
Fig 6

Expression of proliferation-related gene sets in cells sorted by intra-population heterogeneity in mitochondria membrane potential.

(A) Cells were stained with Hoechst and CFSE and a homogenous population of equally sized cells in G1 with equal CFSE was obtained by FACS. These cells were stained with TMRE sorted by TMRE, and then used for RNA-seq, or allowed to proliferate to measure the doubling time of each TMRE sub-population. (B, C) Enrichment maps of fibroblasts and ESCs sorted by TMRE. (D) Doublings times, as estimated by the measured by the decrease in CFSE signal over time, for high, medium and low TMRE sorted cells. P-values are from ANOVA, testing if TMRE is predictive of doubling time (see Materials and Methods).

Expression of proliferation-related gene sets in cells sorted by intra-population heterogeneity in mitochondria membrane potential.

(A) Cells were stained with Hoechst and CFSE and a homogenous population of equally sized cells in G1 with equal CFSE was obtained by FACS. These cells were stained with TMRE sorted by TMRE, and then used for RNA-seq, or allowed to proliferate to measure the doubling time of each TMRE sub-population. (B, C) Enrichment maps of fibroblasts and ESCs sorted by TMRE. (D) Doublings times, as estimated by the measured by the decrease in CFSE signal over time, for high, medium and low TMRE sorted cells. P-values are from ANOVA, testing if TMRE is predictive of doubling time (see Materials and Methods). Unlike the proliferation-based sort (, sorting ESCs and fibroblasts by mitochondria-state (Figs resulted in highly divergent expression profiles. ESCs with high TMRE signal had high expression of ribosome-biogenesis, proteasome, MYC-targets and mitochondrial-related genes, while in fibroblasts these gene sets are more highly expressed in the low TMRE sub-population (. This is consistent with the opposite behavior of mitochondria-related gene sets in proliferation-rate sorted cells from the two cell types (. We note that the differences between high and low TMRE populations are smaller than the difference between high and low CFSE populations (, either due to technical limitation of the dye, or because there is less heterogeneity in mitochondria state than there is in proliferation rate. These expression data make the following prediction: ESCs with high TMRE should have a shorter doubling time, while fibroblasts with high TMRE should have a longer doubling time. To test this, we sorted fibroblasts and ESCs by TMRE, and found that high TMRE fibroblasts indeed do proliferate more slowly, while high TMRE ESCs proliferate more rapidly (. In addition, we tested the effect of ascorbic acid (vitamin C, an antioxidant) and O2 levels (ambient 21% atmospheric vs. low 5% physiological levels) on doubling time, but found no significant effects (. To investigate additional cell types, we searched for RNA-seq data for cells sorted by mitochondria state, and analyzed RNA-seq data of mouse CD8+ T-lymphocytes that have been sorted by mitochondria membrane potential (TMRM) [22]. CD8+ T cells with high TMRM signal (high ΔΨm) showed higher expression of ribosome-biogenesis and proteasome-related genes (, and proliferate more rapidly [22], thereby behaving similarly to ESCs. Thus, across yeast, mouse ESCs, fibroblasts and CD8+ T cells, while mitochondria state and proliferation rate co-vary within a single population, the direction of this correlation is different, with yeast and fibroblasts behaving similarly with each other, and opposite to ESCs and CD8+ T cells.

Perturbation of mitochondria function affects fast and slow proliferating fibroblasts and ESCs in different ways

To investigate the relationship between proliferation rate, cell type, and mitochondrial state, we performed perturbation experiments by directly inhibiting mitochondria function. We stained both mouse ESCs and fibroblasts with CFSE and sorted 20% of the viable cells on the peak of the CFSE signal to have a homogeneous starting population. After culturing them for 24h or 48h respectively, two bins were sorted: the lowest 20% (fast proliferating cells) and the highest 20% CFSE (slow proliferating cells) (. Next we cultured the sorted cells for 8h, then treated with either medium containing DMSO as mock control, the mitochondrial electron transport chain complex III inhibitor antimycin, the ATP synthase inhibitor oligomycin for 16h, washed out the drugs, and measured both viability and proliferation rate for every 24h.

Growth-rate and cell-type specific effects of mitochondria inhibitors on proliferation rate, cell viability and cell state.

(A) Schematic of the experimental setup for measuring the effects of mitochondria inhibitors on slow and fast proliferating cells. (B) Fast proliferating Fibroblast and ESCs sorted by CFSE signal maintained a higher fraction of cells in S phase over two days of growth in medium+DMSO. (C) Effect of antimycin treatment on fast and slow proliferating fibroblasts and ESCs (Drug effect: log2 of the fraction of cells in S-phase when treated with DMSO divided by the fraction of cells in S-phase when treated with drug). (D) Fast fibroblasts changed morphology after the treatment with oligomycin. Scale bars = 80 μm. (E) Immunostaining of fibroblasts for N-Cadherin and DAPI after drug treatment and corresponding quantifications. Fast fibroblasts lose N-Cadherin staining specifically after oligomycin treatment. Scale bars = 15 μm. The fluorescence intensity of N-Cadherin has been quantified on the right. Medians and the 95% confidence intervals are shown as error bars. Kruskal–Wallis test was used for statistical comparison (ns, not significant, **** p < 0.0001). (F) Effect of oligomycin treatment on fibroblast viability. The % viable cells is measured as % trypan blue-negative cells. Both fast fibroblasts and ESCs sorted by CFSE signal maintained a higher fraction of cells in S-phase over two days in growth-media with DMSO, indicating that the proliferation status, fast vs slow, is semi-heritable (. Interestingly, we found cell-type and proliferation-state specific effects of mitochondria perturbation. Antimycin A, an inhibitor of complex III of the electron transport chain, strongly decreased the fraction of slow-proliferating fibroblasts that were in the S-phase but had a weaker effect on fast-proliferating fibroblasts (t-test, p = 0.0089) (. In ESCs, the effect of antimycin appeared somewhat stronger on fast- than on slow-proliferating cells (although it did not differ significantly). In contrast, fast proliferating fibroblasts were highly sensitive to oligomycin treatment, which blocks the ATP-synthase of complex V in the electron transport chain. Specifically, we observed a change in cell morphology upon treatment (. In comparison with DMSO-treated cells, the cells lost their elongated shape and became more round and smaller. This led us to hypothesize if that morphology change might be explained by a mesenchymal to epithelial transition (MET) upon oligomycin treatment. In fact, during induced pluripotent stem cell (iPSC) reprogramming, MET of fibroblasts is an important early reprogramming step [74,75]. In that context, oligomycin treatment has been recently shown not only to promote a metabolic switch from oxidative phosphorylation to glycolysis, but also to modulate mesenchymal markers during reprogramming [75,76]. Therefore we measured the protein levels of the regulators N-cadherin (mesenchymal marker expressed in fibroblasts) and E-cadherin (epithelial marker expressed in ESCs) with and without treatment, by both immunostaining and Western Blot (Figs . We could not detect E-cadherin in fibroblasts, but we observed reduced expression of N-cadherin in comparison with DMSO treated-cells in particular in oligomycin-treated fast cycling fibroblasts (Figs . In addition to the change in morphology, oligomycin treatment reduced cell viability specifically in fast proliferating fibroblasts, but not in slow fibroblasts (. In conclusion, although we observed both a change in cell viability, morphology and a reduction in N-cadherin levels, oligomycin treatment did not induce a complete mesenchymal to epithelial transition in the fast-proliferating fibroblasts as indicated by the lack of E-cadherin upregulation. Antimycin and oligomycin interfere with different parts of the electron transport chain of distinct functions—complex III, oxidation of substrates and setting up membrane potential vs. complex V, reduction of membrane potential and ATP-production. Therefore, the distinct effects of antimycin treatment on the proliferation of fibroblasts and ESCs and the subpopulation-specific effect of oligomycin on fast fibroblasts are in line with a differential dependency on the diverse mitochondrial functions between the different subpopulations of fibroblasts and ESCs.

Discussion

In summary, we have developed a method to sort cells by their proliferation rate and used these data to identify a pattern of proliferation-correlated gene expression that is conserved among eukaryotes. We used these data to develop a model that can predict proliferation rates from gene expression in multiple eukaryotic species and cell types, and for types of data, such as single-cell RNA sequencing in a developing organism, for which proliferation rates cannot be measured experimentally. While the CFSE signal is not a measure of the instantaneous proliferation rate, but instead determined by the average proliferation rate integrated over several days, the fact that (A) the transcriptomes of the sorted cells are predictive of proliferation rates, and (B) the cells with low CFSE maintain faster proliferation rates over at least three days, suggests that there are not likely to be large differences in the instantaneous proliferation rate vs the average rate, at least for these cell types and experimental timescales. We found that genes involved in protein synthesis (ribosome-biogenesis, translation initiation), and in protein degradation (the proteasome and proteasome-related protein degradation) are highly expressed in fast proliferating eukaryotic cells, including mammalian, nematode and yeast cells. Previous studies have reported that high expression of the proteasome in fast-growing cells may be necessary in order to degrade misfolded protein, because the fast protein synthesis in fast-growing cells produces more incorrectly folded proteins [51,77,78]. Even with a constant translation and folding error rate, fast proliferating cells will produce more protein, and therefore more misfolded protein that needs to be degraded. In all non-cancer mammalian cells we investigated, we also found the mTORC1 signaling pathway enriched in fast proliferating cells and P53-targets enriched in slow proliferating cells. Our results show both upregulations of the mTORC1 signaling pathway and proteasome activity in fast proliferating cells, which is consistent with several previous studies [9,12-15]. Our analysis of fast versus slow proliferating ESCs cultured in 2i+LIF conditions indicated at several levels that slow proliferating cells were of a more naïve ground state pluripotent character than fast proliferating cells. First, this was supported by the fact that they displayed a higher expression of naïve pluripotency marker genes and markers of 2C-like cells (. Second, we observed enrichment of E2F targets and genes involved in G1 S cell cycle phase transition ( in our fast cycling ESC population, indicative of a shortened G1 phase and a shorter doubling-time, as described for ESCs cultured in serum+LIF [27]. Finally, although we could find differentiation genes to be expressed both in fast and slow proliferating cells (, we saw several differentiation pathways to be enriched specifically in fast dividing ESCs (. In summary, even when ESCs are cultured in ground-state pluripotency promoting 2i+LIF conditions, they display heterogeneity in proliferation rate, with the slow proliferating being of more naïve pluripotent character when compared to fast dividing cells. While we observed ESCs behave similarly to other cell types like fibroblasts or yeast when it comes to gene expression signatures characteristic of fast proliferating cells related to protein synthesis and turnover (, we found a very different behavior when it comes to regulation of metabolism. Although the growth rate can be predicted by mitochondrial membrane potential in Saccharomyces cerevisiae [23], where it is negatively correlated with proliferation rate like in fibroblasts as we show in this study, our results show mitochondrial membrane potential to be positively correlated with proliferation rate in ESCs (. This suggests mitochondrial membrane potential has different functions in pluripotent cells when compared to differentiated cell types or yeast. This is corroborated by our gene expression analysis of cells with high vs. low mitochondrial membrane potential (, where we found pathways linked with fast proliferating cells to be enriched in fibroblasts with low mitochondrial membrane potential but on the contrary, enriched in ESCs with high mitochondrial membrane potential. Surprisingly, primed pluripotent stem cells have been described to rely more on non-oxidative, glycolysis-based metabolism than naïve pluripotent stem cells [79-81], which appears in contradiction with our result that our slow proliferating, mitochondria activity low ESCs being more naïve-like. However, TMRE is not a direct measure of ATP generation by mitochondria; yeast cells that are respiring and producing all of their ATP using their mitochondria, and yeast cells unable to respire, both have high TMRE signals [9]. Differentiated cells, in general, rely more on oxidative metabolism than pluripotent cells, therefore our fast proliferating ESCs could potentially reflect a more differentiation prone state. In vivo, cells exhibit a great degree of variability in proliferation rates, from terminally differentiated neurons, to slowly proliferating cancer stem cells, to rapidly proliferating embryonic stem cells. Many cell types, such as hematopoietic stem cells, contain both proliferating and non-proliferating populations. The proliferation signature model, because of its applicability across all tested species and cell types, provides a useful tool for understanding in vivo development for systems, in which precise measurements of proliferation are impossible. Our model has been validated on scRNA-seq data, using published time-lapse microscopy of cell lineages in C. elegans as the ground truth [70]. However, it is technically challenging to do microscopy or to otherwise measure proliferation of individual cells inside of a developing mouse embryo, or in a tumor in a patient. Models that can accurately predict difficult to measure properties, such as proliferation rate, from easy to measure ones, such as gene expression, will therefore aid in our understanding of complex biological processing during tumor formation, differentiation, and development.

Materials and methods

Cell culture growth conditions

Tail tip fibroblasts (TTFs) were isolated from a female newborn mouse from a Mus musculus x Mus Castaneus cross and immortalized with SV40 large T antigen [82]. The clonal line 68-5-11 [83] was established and maintained in DMEM supplemented with 10% serum (LifeTech), HEPES (30mM, Life Tech), Sodium Pyruvate (1mM, Life Tech), non-essential amino acids (NEAA) (Life Tech), penicillin-streptomycin (Ibian Tech), 2-mercaptoethanol (0.1mM, Life Tech). The mouse embryonic stem cell (ESC) line EL16.7 (40XX, Mus musculus/M.castaneus hybrid background [84] was maintained on gelatin coated tissue culture dishes and passaged every 2 days by seeding around 2x106 cells in 2i+LIF medium. Accutase (Merck Chemicals and Life Science) was regularly used for cell detachment when passaging cells. 2i+LIF medium contains a 1:1 mixture of DMEM/F12 supplemented with N2 (LifeTech) and neurobasal media (LifeTech) supplemented with glutamine (LifeTech), B27 (LifeTech), insulin (Sigma), penicillin-streptomycin (Ibian Tech), 2-mercaptoethanol (LifeTech), based on previous reports [37]. It also has additional modifications reported in Hayashi et al. [85] containing two chemical inhibitors 0.4 μM PD032591 (Selleck Chemicals, S1036) and 3 μM CHIR99021 (SML1046, SML1046) together with 1,000 U/ml LIF (ORF Genetics, 01-A1140-0100). Both TTFs and EL16.7 were cultured at 37°C in 5% CO2.

Proliferation and doubling time analysis

ESCs and fibroblasts were plated on 10 cm plates at 5.3x106 and 7.3x105 concentrations, respectively. Cells were expanded and counted for 7 days. To monitor distinct generations of proliferating cells, carboxyfluorescein succinimidyl ester (CFSE, Thermo Fisher Scientific) was used to stain the cells and the dilution of the dye was detected by flow cytometry every day. CFSE was dissolved in dimethyl sulfoxide at a concentration of 5 mM as stock solution and CFSE was added to a 1 ml cell suspension, to a final concentration of 5uM or 10uM. After the addition of CFSE, cells were incubated at 37°C for 20 min. Then the cells were washed twice with complete medium and maintained on ice until use in a buffer containing PBS, 2% serum and 1% pen-strep. Cell viability was determined by DAPI (Biogen Cientifica) staining. Dye signals were measured on an LSRII flow cytometer.

RNA-seq

To collect cells with different growth rates, cells were isolated by sorting at room temperature according to the CFSE signal (median and high CFSE signal). ESCs and fibroblasts were sorted into 1.5ml Eppendorf tubes containing medium and were cultured for 3 days and 5 days respectively in specific culture conditions as described earlier. For each cell line three bins were sorted: the lowest 10%, the median 10% and the highest 10% CFSE. Cells were sorted into prechilled 1.5ml Eppendorf tubes containing 200μl medium each. Cells were then centrifuged at 1000 rpm for 5 min, the media removed and the resulting cell pellet was used for RNA extraction. All bins were treated identically throughout the process. Cellular RNA was extracted using the Maxwell RNA Purification Kit and processed for RNA sequencing. For biological replicates, all experiments were repeated on three or four different days. Expression was quantified using Kallisto v0.42.3 [86] from the raw reads (no pre-processing) using the gencode.VM18.transcript annotations. We experimented with multiple methods for batch effect removal using the R package SVA [87] and found that the results of the GSEA, with regards to which gene sets were differentially expressed between fast and slow, or high and low TMRE cell populations, did not change. We therefore used the original data. PCA on TMRE sorted biological replicates showed that one TMRE sort was an extreme outlier (; this pair was discarded from all analysis.

BrdU Staining

Cell Proliferation was measured by the incorporation of bromodeoxyuridine (BrdU). Every 24h BrdU was added at a final concentration of 10 μM to the cells. Incubation under the appropriate growth conditions occurred for 30 minutes for ESCs and 45 minutes for fibroblasts to pulse label the cells. Cells were trypsinized, spun down at 1050 rpm for 5 minutes. After washing them in ice-cold PBS, cells were fixed overnight in ice cold Ethanol (70%) while maintaining a gentle vortex. The following day the Ethanol fixed cells were centrifuged and the DNA denatured by adding 2N HCl—0.5% Triton X-100 for 30min at room temperature. Then cells were centrifuged and resuspended in 0.1 M Na2B4O7, pH 8.5 for 10 minutes at room temperature. After spinning them down the cells were incubated overnight at 4°C with PerCP/Cy5.5 anti-BrdU (1:30 dilution) (BioLegend) in a buffer containing 0.5% Tween 20 / 1% BSA/PBS and RNase (0.8 mg/ml). The following day cells were washed in PBS and spun down at 1050rpm for 5min at room temperature. The pellet was resuspended in PBS with DAPI (1:1000) and analyzed in an BD LSRII flow cytometer.

Mitochondria inhibitor treatment assay

For the assessment of chemical inhibitors on membrane potential changes, cells were incubated with medium containing DMSO (0.1%, mock control), antimycin A (500nM), oligomycin (1 μM) for 16h. Cells with or without treatment were washed with PBS and trypsinized. After spinning the cells for 5 minutes at 1050 rpm at room temperature, the cell pellet has been stained with 50nM TMRE for 20 min at 37°C. After 2 times washes with PBS, cells were resuspended in PBS containing DAPI and immediately analyzed by flow cytometer BD LSRII. In order to measure cell viability the trypan blue exclusion assay has been used. Upon trypsinization, an aliquot of the cell suspension has been mixed at 1:1 with trypan blue solution (0.4%, Thermo Fisher Scientific). The mix has been loaded on the counting chamber slide and analyzed at the automated counter (Invitrogen Countess II FL Automated Cell Counter).

Western blot

Cells were lysed in a lysis buffer (20 mM Tris-HCL, pH 8.0, 150 mM NaCl, 1% Triton X-100, supplemented with protease inhibitors cocktail) and centrifuged for 30 minutes at 16000g. The supernatant was boiled in SDS loading buffer. After SDS-PAGE, the samples were transferred to a polyvinylidene difluoride membrane using a transfer apparatus according to the manufacturer’s protocols (Bio-Rad). After incubation with 5% nonfat milk in TBST (10 mM Tris, pH 8.0, 150 mM NaCl, 0.5% Tween 20) for 1h, the membrane was washed once with TBST and incubated with antibodies against N-Cadherin (BD Biosciences, 1:1000), E-Cadherin (BD Biosciences, 1:1000), GAPDH (Santa Cruz, 1:5000), at 4°C for 16 h. Membranes were washed three times for 10 min and incubated with a 1:5000 dilution of Rabbit Anti-Mouse Immunoglobulins/HRP for 1.5 h. Blots were washed with TBST three times and developed with the ECL system (Amersham Biosciences) according to the manufacturer’s protocols.

Immunofluorescence staining

Cells were grown in 8-well Lab-Tek chamber slides (Thermo Fisher Scientific) and fixed in 4% paraformaldehyde for 10min at room temperature. Then, washed three times in PBS. Fixed cells were permeabilized in 0.5% Triton X-100 (Sigma-Aldrich) in PBS buffer for 10min at room temperature. And then washed in PBST (PBS with 0.1% Tween (Sigma-Aldrich)) for 2min at RT. Then cells were incubated in a blocking solution containing 10% bovine serum albumin (BSA, Sigma) and 0.01% Triton X-100 for 1h at room temperature. Cells were then left at 4°C overnight in a blocking solution containing the primary antibody: mouse E-Cadherin (BD Biosciences, 1:1000) and mouse N-Cadherin (BD Biosciences, 1:1000). The next day, the cells were washed three times in PBS and then incubated with the secondary antibody for 45min at room temperature. Goat anti-mouse IgG, (1:1000, Life Technologies) conjugated to Alexa Fluor-555, was used as a secondary antibody. Nuclear staining was performed with DAPI (1:1000, Biogen Cientifica). Images were taken with a Leica TCS SP8 confocal microscopy system and were analyzed with Fiji (ImageJ).

Differential expression of pluripotency, 2C-like state and lineage commitment-related genes in mESCs sorted by proliferation rate (CFSE)

Pluripotent state gene markers are chosen from 4 different studies [26, 29, 88, 89], only genes that are used as pluripotent state gene marker in at least 2 of these 4 papers are used in this paper. Lineage commitment and 2C-like state gene markers are the same as genes in Fig 5A of Kolodziejczyk et al. [26] and key differentiation regulators in Fig 6 of the same study. To see the corresponding pluripotent cell state of fast and slow proliferating mESCs, we calculated the mean expression of naïve pluripotent markers in four fast-proliferating and four slow-proliferating replicates and log2(fast/slow) was calculated to compare genes expression in fast proliferating subpopulation and slow proliferating sub-population. The same method was applied to lineage commitment gene markers and 2C-like state gene markers.

Mitochondrial membrane potential measurements

The relative mitochondrial transmembrane potential (ΔΨm) was measured using the membrane-potential-dependent fluorescent dye TMRE (Tetramethylrhodamine, Ethyl Ester, Perchlorate) (Molecular Probes, Thermo Fisher Scientific) [90]. For TMRE staining fibroblasts and ESCs were grown, washed in PBS, trypsinized and resuspended in PBS with 0.1% BSA and TMRE added at a final concentration of 50nM, from a 10μM stock dissolved in DMSO. Cells were incubated for 20min at 37°C, washed with PBS and were analyzed by flow cytometry or sorted.

Cell sorting

Cells at 80% confluence in 10cm plates were trypsinized, centrifuged at 1000rpm for 5min and stained with medium containing 10μM CFSE for 20min. Then cells were washed twice with PBS and stained with DAPI as viability dye. To have a homogeneous starting population, 20% of the viable cells were sorted according to the proliferation rate on the peak of CFSE signal and re-plated. ESCs and fibroblasts have been cultured for 24h or 48h respectively and two bins were sorted: the lowest 20% (fast proliferating cells) and the highest 20% CFSE (slow proliferating cells). Cells were sorted into prechilled 1.5-ml Eppendorf tubes containing 200 μl medium each. Cells were then centrifuged at 1000 rpm for 5 min, the media removed and plated in their culture medium. To monitor their proliferation rate, the dilution of the CFSE dye was detected by flow cytometry every day up to 3 days for ESCs and 5 days for fibroblasts. Dye signals were measured on an LSRII flow cytometer. For the CFSE sort (no TMRE), cells were stained with CFSE and DAPI, and we used FACS to obtain a population of viable cells with the same CFSE signal. We then grew cells for 3 or 5 days, and every 24 hours measured the CFSE signal using flow cytometry. Staining did not have a strong effect on cell viability or proliferation (. For the TMRE sort for proliferation rate, cells were stained with CFSE and TO-PRO-3, and we used FACS to obtain a population of G1 cells with the same CFSE signal. We then grow cells for 3 or 5 days, and every 24 hours measured the CFSE signal using flow cytometry. In order to have a homogeneous starting population, both cell types were stained with Hoechst (10 ug/ml, Life Technologies) to pick cells in G0/G1 phase. Within this population, cells were selected according to the proliferation rate on the peak of CFSE signal prior to staining them with the dye. Then cells were sorted by TMRE into three bins: low, medium and high with a BD Influx cell sorter into prechilled 1.5 ml Eppendorf tubes containing 200 μl medium each. Cells were then centrifuged at 1000 rpm for 5 min, the cell pellet was washed with PBS and used for RNA extraction. All bins were treated identically throughout the process. Cellular RNA was extracted using the Maxwell RNA Purification Kit and processed for RNA sequencing. Cell viability was determined by TO-PRO-3 (Thermo Fisher Scientific) staining. To test the effect of O2 levels and ascorbic acid/vitamin C in both cell types, sorted cells from each bin were plated into each of the four different conditions (low O2 (5%), normal oxygen growing conditions, and with or without ascorbic acid/vitamin C (25 ug/ml, Sigma-Aldrich)) in duplicate. After one day of recovery from the sorting, the cells were washed in PBS, were trypsinized, and counted. After seeding the same initial number, the rest of the cells was analyzed on a BD Fortessa analyzer. Every day a sample from each condition and replicate was taken for counting, and stained with 50 nM TMRE, up to 3 days for ESCs and 5 days for fibroblasts, and both TMRE and CFSE were measured by flow cytometry. Images of CFSE and TMRE stained cells are shown in .

Gene set enrichment analysis (GSEA)

GSEA was performed using the GSEA software and the MSigDB (Molecular Signature Database v6.2) [41,91]. We use signal-to-noise (requires at least three replicates) or log2 ratio of classes (for experiments with less than three replicates) to calculate the rank of each gene. The maximum number of genes in each gene set size was set to 500, the minimum to 15, and GSEA was run with 1000 permutations. We provided all GSEA results in this study (.

Enrichment map

Enrichment maps of this study (Figs ) are created using EnrichmentMap in Cytoscape [92,93], we refer to Reimand et al’s protocol [94] for using EnrichmentMap. We imported the output file of GSEA to EnrichmentMap and set FDR threshold as 0.1, other parameters set as default.

Coordination of expression of ribosome biogenesis and proteasome related genes

We first calculate the mean expression (average of log2(TPM+1)) of ribosome biogenesis genes (genes in GO preribosome gene set) and proteasome genes (genes in GO proteasome complex gene set) across organ developmental time course, then we calculate the Pearson correlation of ribosome biogenesis and proteasome.

Calculation of proliferation signature scores

To obtain proliferation correlated genes, we first calculate, for each gene, the Spearman correlation with proliferation rate for each replicate, as measured by the decrease in CFSE signal, in both fibroblasts and ESCs. We define “proliferation correlated genes” as genes that have a correlation of 1 in all replicates (two replicates for both fibroblasts and ESCs, totally four sets of samples) and in both fibroblasts and ESCs (243 genes). The formula to decide the genes to be chosen as “proliferation correlated genes” is: where gene(i) is the gene expression for all samples with different proliferation rates (slow, medium and fast) in replicate i, pr(i) is the proliferation rate for all samples in replicate i. As we have two replicates for each of the two cell types, which means we have four replicates, and the mean of correlation equal to 1 means the correlations in each replicate is rho = 1. To this set we add genes from six ribosome biogenesis and proteasome related gene sets that are significantly enriched in both fibroblasts and ESCs, which result in a final gene set consisting of 370 genes ( and we called this gene set proliferation signature. To use proliferation signature gene set to predict proliferation rate, we apply a widely used method, ssGSEA [95], a rank-based method that computes an overexpression measure for a gene set of interest relative to all other genes in the genome, to derive a single value for a sample, which we called proliferation signature score. We use R package GSVA to apply ssGSEA with default settings [64]. The input gene expression matrix can be derived from next-generation sequencing or microarray. To apply proliferation signature in other species, the R package Biomart [96,97] was used to obtain homologous genes of other species in this study and to map across different gene naming schemes (eg: transfer Ensemble gene id to Entrez gene id). If the mapping from human to other species is one to many, then we choose the first mapped gene. If the mapping is many to one, then we keep all mapped genes.

Prediction of growth rates using proliferation signature

Published expression profiling data for yeasts cultured in the chemostat with controlled growth rate from Airoldi et al. (dataset1) [98], Slavov et al. (dataset2) [99], Regenberg et al. (dataset3) [14] and cancer cell lines with corresponding growth rate [66] were downloaded. For each dataset, we calculated the Pearson correlation of proliferation signature score with growth rate. We also used another method to calculate proliferation signature score for these 3 yeast datasets. We use the sum of genes expression for all genes in the proliferation gene set to represent proliferation signature score (, the result is slightly worse than the ssGSEA method.

Proliferation score of 2C-like mESCs and non-2C-like mESCs

RNA-seq data (GSE33923) of 2C-like mESCs are from Macfarlan et al. [67], who FACS separated 2C-like cells (high MuERVpromoter driven expression, 2C::tdTomato+) from non-2C-like mESCs (2C::tdTomato-). We calculated the proliferation signature score for each of the six samples, and used a paired t-test to control for differences between replicates.

Brief description of experiments from other papers

In van Dijk et al. [8] cts1Δ histone-GFP budding yeast undergo cytokinesis to separate mother and daughter cells, but these cells remain physically attached to each other by their cell wall. Thus, starting from an initial population of single cells in G1, variation in proliferation rate can be measured by variability in histone-GFP signal in physically connected clusters of cells. Each cluster contains cells descended from the same ancestor cell. In Dhar et al. [9] wild-type yeast were stained with TMRE, and sorted into four bins with varying TMRE signal. In Sukumar et al. [22] pmel-1 T cell receptor (TCR) transgenic mice were injected with recombinant vaccinia virus encoding hgp100 (gp100-VV). Five days after vaccination, they isolated CD8+ T cells, stained them with the lipophilic cationic dye tetramethylrhodamine methyl ester (TMRM) (25 nm for 30 min at 37°C) and FACS-sorted the highest and lowest 7–10% of cells for subsequent RNA-seq.

Proliferation scores of preterminal cell lineages vs terminal cell types

Preterminal cell lineage and terminal cell type pseudo-bulk RNAseq data of C. elegans were downloaded from Murray et al. [69], specifically, gene expression profile for terminal cell types and preterminal cell lineage is in S7 and S8 Tables in Ref. [69], annotation file for terminal cell types and preterminal cell lineage is in S2 and S4 Tables in Ref. [69]. As there are multiple time points for one terminal cell type, we only use the sample with maximum time point to represent the corresponding terminal cell type, processed data provided in this study (. For each cell we calculate proliferation signature score, and a t-test was used to compare the mean proliferation signature score of all cells in each of the two groups.

C. elegans scRNA-seq data analysis

C. elegans scRNA-seq data was provided in R package “VisCello.celegans”. After loading the package, type cello() to load all data into the current environment. We calculated the proliferation signature score for all single cells, then color them by proliferation signature score in UMAP space. The calculation of proliferation signature score for single cell data is different from the calculation for bulk RNA-seq data. We just sum up the expression value of genes in the proliferation signature gene set to get a proliferation signature score. We do not use ssGSEA consider ssGSEA is a rank-based method, however most of the genes have 0 expression in this scRNA-seq data set (, which makes it not appropriate to use ssGSEA. We binned all single cells by their embryo time. We first calculate the mean proliferation score for cells with same embryo time, then calculate Spearman correlation of this mean proliferation score with embryo time, the result is rho = -0.65 (p = 9.3 ×10−19), the correlation of unbinned data is -0.41 (p < 2.2×10−16). After excluding three special cell types germline, M cell and intestine, the result is rho = -0.73 (p = 8.6 ×10−24), the correlation of unbinned data is -0.45 (p < 2.2×10−16).

Experimental data for C. elegans development

Developmental data of C. elegans was extracted from Fig 4 of Sulston et al. [70]. This figure is cell number (live nuclei number) change over embryo time and we use WebPlotDigitizer [100] to extract data. We use the data to plot log2 cell number change over embryo time. The difference of log2 cell number for two adjacent time points divided by the difference of time is the proliferation rate of mean of two time points.

Fast proliferating subpopulations maintain higher proliferation rates than slow proliferating subpopulations for at least 3 days.

FACS gating strategy for CFSE staining to get cell subpopulations with different proliferation rates. (A, B) The gating strategy for CFSE staining to get cell subpopulations with different proliferation rates in fibroblasts (A) and in ESCs (B). Slow, medium and fast proliferating cell subpopulations were sorted by FACS according to their CFSE signal. Then RNA-seq was performed on each of the three subpopulations. (C, D) FACS gating strategy for measuring the heritability of proliferation rates. The gating strategy for measuring the heritability of proliferation rates in fibroblasts (C) and in ESCs (D). In all experiments, the laser voltage was increased so that, when sorting high and low CFSE cells, the modal CFSE signal was at least 103; the voltage is not the same for the first and second CFSE sorts. (E) 3 Replicates of fibroblasts that similar to . (F) 2 Replicates of ESCs that similar to . (TIF) Click here for additional data file. (A) GSEA result plot of Go preribosome genes set for ESC. (B) The heatmap shows the expression (z-scored read counts) of preribosome genes in ESCs across four biological replicates of the CFSE sorting experiment. (C) Higher expression of Myc in both fast proliferating ESCs and fibroblasts. log2 fold change of Myc expression between fast and slow proliferating subpopulation in both ESCs and fibroblasts, each cell type 4 replicates. (D) Correlated changes in the expression of ribosome biogenesis and proteasome related genes during organ development. Change of average expression of log2(TPM+1) of genes in ribosome biogenesis (Go preribosome) gene set and proteasome complex (Go proteasome complex) gene set with developmental stages across different organs in seven species [16]. Points (circle and triangle) are the mean expression of replicates, error bars represent the maximum and minimum value in the replicates. (TIF) Click here for additional data file.

Proliferation signature scores predict growth rate, using different methods of calculation, and different species.

(A-C) Using the Normalized Enrichment Score from ssGSEA to predict growth rate in three different data sets. The Pearson correlation of proliferation signature score with growth rate in are R = 0.82 (p = 8.9×10−7), R = 0.73 (p = 1.3×10−8) and R = 0.77 (p = 3.7×10−3). (D-F) Similar to A-C, but using the sum of expression values for all genes in the proliferation signature gene set to calculate proliferation signature score. The Pearson correlation of proliferation signature score with growth rate are R = 0.83 (p = 7.0×10−7), R = 0.73 (p = 1.6×10−8) and R = 0.55 (p = 0.65×10−2). (TIF) Click here for additional data file.

Lineage-specific proliferation signature scores during C. elegans, human and mouse development.

(A) A density histogram of counts (UMI) across 1000 randomly sampled cells; 95.6% of genes have zero reads. This causes ssGSEA to give unreliable results, so the sum of expression values method is used for calculating the proliferation signature score for single cells. (B) Cells are binned by embryo time, and the mean proliferation signature score for all cells not the three outlier cell types (germline, intestine and M cells). The Spearman correlation is rho = -0.73 (p = 8.7×10−24) for binned data, and rho = -0.45 (p < 2.2×10−16) for unbinned data. (C) Similar to Fig 5C, but only showing cells with an age higher than 650 minutes, rho = 0.5 (p = 1.5×10−2). (D, E) The change in cell number, and the rate of change in cell number, during development, as measured by microscopy [70]. (F) Change in proliferation signature score for seam cells, which form multinucleated cells through cell-fusion. (G) Human single-cell gene expression data from Petropoulos et al. [72] projected onto the first two principal components and colored by proliferation signature score or developmental stages. And boxplot shows the change of proliferation signature score with developmental stages. (H) Similar to G, but using mouse scRNA-seq data from Deng et al. [73].
Fig 5

Proliferation signature of cell development.

(A) A cartoon showing four terminal cells, and a partial linage showing the final four generations of preterminal cells. Comparison of single-cell proliferation signatures between preterminal cell lineage and terminal cell types (t-test, p = 4.9×10−41). (B) UMAP projection of 89,701 cells. Cells in the left panel are colored by estimated embryo times; in the right panel by proliferation signature score. (C) To calculate the proliferation signature score (y-axis) at each time point (x-axis) cells are binned by embryo time, and the mean proliferation signature score for all cells in the same bin is calculated. The spearman correlations are -0.65 (p = 9.3 ×10−19) for binned data and -0.42 (p < 2.2e-16) for unbinned data. (D) Boxplots (line shows median, boxes interquartile range) of proliferation signature score for all cells with embryo time > 650min. (E) Temporal dynamics of proliferation scores of select cell lineages, showing the average proliferation score for all single cells in that lineage, at each time point. (F-G) Boxplot of C. elegans (F) and human (G) proliferation signatures as a function of developmental time, from scRNAseq data.

(TIF) Click here for additional data file.

FACS gating strategy for TMRE staining and volcano plots for fibroblasts and ESCs sorted by CFSE or TMRE.

(A, B) The gating strategy for TMRE staining to get cell subpopulations with different mitochondrial states in fibroblasts (A) and in ESCs (B). We use Hoechst to get cells in G0/G1, gate by CFSE to get a more uniform cell population, and separate populations with high and low TMRE signal, then do RNA-seq on each of the two subpopulations. (C, D) Deseq2 was used to calculate log2 fold change and adjusted p-values for CFSE sorting (C) and TMRE sorting (D), combining biological replicates. To set the axes to be maximally informative, genes with p < 10−5 had p set to 10−5, and those abs(log2 fold change) > 5 were truncated at -5 or +5. (E) PCA for RNA-seq data of ESCs sorted by TMRE. Low TMRE ESCs of replicate 1 is an outlier, so we remove replicate 1 for all analysis. (TIF) Click here for additional data file.

N-cadherin and E-cadherin levels in oligomycin and antimycin treated cells.

(A) E-cadherin staining of ESCs as positive control for E-cadherin detection, Scale bar = 80 μm. Immunostaining for E-cadherin does not show detectable levels in fibroblasts, Scale bars = 15 μm. (B) Western blot for N-cadherin and E-cadherin (Gapdh = loading control) in DMSO- and drug-treated fibroblasts and ESCs. Oligomycin, but not antimycin treatment led to a reduction in N-cadherin in fast fibroblasts. Double-drug treatment also showed no strong effect likely due to antimycin treatment (complex III) acting upstream of oligomycin treatment (complex V) in the electron transport chain and therefore being the dominant drug in the double-treatment. (TIF) Click here for additional data file.

The effects of Hoechst and CFSE staining on cell viability and proliferation rates.

Shown are the estimated doubling times (based on the increased in cell number after 24hrs growth) and measured viability (trypan blue) for fibroblasts (A) and ESCs (B). The microscopy shows that stained cells maintain the correct morphology. (TIF) Click here for additional data file.

The images of CFSE staining for both fast and slow proliferating fibroblasts and ESCs.

Fibroblasts (A) and ESCs (B) were stained by CFSE and sorted into two bins: fast proliferating cells (low CFSE) and slow proliferating cells (high CFSE). (TIF) Click here for additional data file.

Images of TMRE staining for fibroblasts and ESCs, showing heterogeneity.

Bright-field and TMRE staining images for both Fibroblasts and ESCs. Two pairs of ESC colonies of similar size but showing staining heterogeneity are circled in blue. (TIF) Click here for additional data file.

Viability data.

(XLSX) Click here for additional data file.

Overlap of GSEA results of fibroblasts and ESCs (CFSE) and GSEA result of yeast Fitflow.

(XLSX) Click here for additional data file.

Cor table prolif vs gene expression.

(TSV) Click here for additional data file.

Proliferation signature combined human C. elegans yeast mouse.

(XLSX) Click here for additional data file.

GSEA result of fibroblasts TMRE and ESCs TMRE.

(XLSX) Click here for additional data file.

TMRE sort ANOVA.

(XLSX) Click here for additional data file.

CD8T Ribosome and Proteasome table.

(XLSX) Click here for additional data file.

GSEA results of Fibroblasts, ESCs and yeast sorted by CFSE or TMRE.

(XLSX) Click here for additional data file.

C. elegans id single cell 20201027 max as terminal.

(TSV) Click here for additional data file.

Published data used in this paper.

(XLSX) Click here for additional data file.

Yeast Fit Flow Fast Slow.

(TSV) Click here for additional data file. 29 May 2021 Dear Dr. Carey, Thank you very much for submitting your manuscript "A conserved expression signature predicts growth rate and reveals cell & lineage-specific differences" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations. Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. When you are ready to resubmit, please upload the following: [1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out [2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file). Important additional instructions are given below your reviewer comments. Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments. Sincerely, Elana J. Fertig, PhD Associate Editor PLOS Computational Biology Mark Alber Deputy Editor PLOS Computational Biology *********************** A link appears below if there are any accompanying review attachments. If you believe any reviews to be missing, please contact ploscompbiol@plos.org immediately: [LINK] Reviewer's Responses to Questions Comments to the Authors: Please note here if the review is uploaded as an attachment. Reviewer #1: Jiang et al present a clever experimental approach to measuring growth rate-dependent gene expression, and find a gene-expression profile that predicts growth rate across a range of species and cell types. The experimental approach is to label cells with a stable, cell-permeable dye, sort the cells to get a homogeneously fluorescent starting population, and then sort cells at a later time for RNA-seq according to how much fluorescence remains (because fluorescence decreases as cells grow and divide). Overall, the experiments are rigorously performed and the predictive power is impressive, especially with cross-species validation. The paper will be an important contribution to the study of cell-to-cell heterogeneity and growth control. However, I do have two main recommendations to strengthen it: 1) The manuscript could do a better job of placing this work in the context of the existing literature. The work of Terry Hwa and colleagues on the connection between translational capacity and growth is not cited. Recent work identifying semi-heritable cell-to-cell heterogeneity in mammalian cells using RNA-seq is not cited (Raj lab and Tanay lab). Wytock & Motter 2019 is cited, but other work on predicting growth rate from gene expression is not (e.g., doi:10.1371/journal.pone.0075320 which discusses ribosomal protein expression). 2) The computational/prediction method should be more completely discussed in the main text and the method should be more explicitly justified. For example, the cutoff for growth-correlated genes of rho = 1 seems a bit confusing at first (line 255), until one considers that this is a rank (Spearman) correlation and there are only three bins of growth rate (cells sorted with high, medium or low fluorescence). So rho = 1 just means growth rate increases monotonically with growth. This seems sensible enough, but it does not consider measurement noise the way a proper statistical model would, so this decision should be defended. (And because of experimental noise, it also would likely cause problems if, say, another experiment used 5 bins because the cutoff would likely be too stringent.) The addition of ribosome-biogenesis and proteasome gene sets to the genes that contribute to the proliferation signature score is also somewhat arbitrary (or at least one could imagine different decisions being made). So some kind of analysis of how the proliferation score is sensitive (or not) to these decisions is warranted. The performance of the proliferation score is well validated by using data from divergent organisms, but that's not to say it could not be improved. Likewise, how the gene-expression signature is transferred from one organism to another is not fully described. The Methods section explains that homologs were identified using the R package Biomart, but more detail is needed. Are these 1:1 orthologs? orthologs + paralogs? It is also not entirely clear how the C. elegans profileration predictions were tested against data. It seems that the comparison is only based on the rate of increase of cell number through development. Are there C. elegans data on the actual rates of division of particular cells that could be used as well? Reviewer #2: This paper used a fluorescence-activated cell sorting (FACS) based method to sort fast- and slow-proliferating subpopulations in mammalian cells and reports a gene expression signature, which can be used to predict the proliferation rates across eukaryotic species and cell types. This signature includes expression of house-keeping genes such as ribosome and proteasome, which are often expressed at high levels and hence can be reliably measured even using methods with low sampling efficiency (such as single-cell RNA-seq). This proliferation model is more robust than other models currently available and it has the potential to be widely used, particularly in the field of single-cell transcriptomics. Additionally, the authors found that slow-proliferating ESCs display a more naïve pluripotent expression profile than fast-proliferating ESCs. This finding is counter-intuitive and should be of interest to the stem cell field. Overall, the paper is well written, the data are well presented and supports the main collusion. The weaker part of the paper is the cell-type specific correlation between mitochondria function and proliferation rates, where the authors seem to have made efforts towards gaining mechanistic insights behind the correlation. But the data seem piecemeal and it is unclear what message these data really deliver, particularly with seemingly contradicting data unexplained (see specific comments). Compared to other important findings in the paper, this part could perhaps be reduced to a more coherent and concise figure. Specific comments: 1. What effect does protein degradation have on the intensity of CFSE dye? The authors comment that protein degradation should not have a huge effect but it should be shown with analytical/experimental evidence, especially given the finding that proteasome genes are differentially expressed between the two sorted populations. This may be tested experimentally using proteasome inhibitors. 2. Given that the entire paper is built on sorting using CFSE dye, an orthogonal method should be used to validate that the sorted slow proliferating cells are indeed proliferating more slowly. I envision that a simple growth curve should be sufficient. 3. Line 729, why only the genes with “mean(Cor(gene(i), pr(i))) = 1” are chosen as “proliferation correlated genes”? The genes with correlation value of -1 should be also highly correlated and equally important as the value 1 genes. 4. It would be useful to know which (groups of) genes in the selected gene sets contribute more for predicting the proliferation rate. 5. Contradicting statement on the relation between % S-phase cells and proliferation rates. Line 178, higher S phase fraction is used to support that fast proliferating subpopulations maintain higher proliferation rates (Figure 1 E, F). This conclusion can be also found in Line 422-423. However, in Line 293-296, for explaining no correlation between PCNA or Ki67 expression with proliferating rate, authors mentioned that “these markers are measuring the fraction of the population that is in S-phase or is not in G0…” 6. Magnitude of the proliferation score. I noticed that the proliferation score varies from 10-1 level (yeast, Figure 4B) to 102 level (human, Figure 5G). I am also curious that sometimes this score become negative. What does this mean? Can the authors comment on the condition where this score might be compared? 7. Antimycin and oligomycin are both mitochondria respiration inhibitors, yet the two have opposite effects on fast vs. slow fibroblasts (in Fig. 7C antimycin inhibits proliferation of slow FIB greater than fast FIB, but in Fig. 7F, oligomycin has a stronger effect on fast FIB). The authors should explain the discrepancy and what it really means. 8. Figure 7C, the label “Drug effect log2(DMSO/drug)” is ambiguous. What is being measured should be clearly stated in the figure or caption. Figure 7E is dim and hard to see. The cadherin staining should be quantified. 9. Figure S6B, for the same protein, experiments group and the control (e.g. ESC as the positive control for cadherin) should be on the same gel. Other minor issues: I. In main text, abbreviation “FACS” first appeared at Line 119 but the abbreviation was introduced at Line 160. In Figure 1(E, F) caption, abbreviation “BrdU” was used but introduced later in Materials AND Methods (Line 586). II. The metric prefix “micro” was written as “u” instead of “μ” at Line 560 and 603, which is not a SI recommendation. III. It’s recommended to leave a space between a number and its unit, e.g., at Line 560, 572, 586, 603. In Line 592, there should be a space between “0.5%” and “Triton”. IV. Line 679, “viability” was marked bold for no obvious reason. V. Line 442 and 444, Figure 7E and S6A-C are referenced. However, Figure S6 has no panel C. VI. Figure 4A, the gene name “KI67” in the legend is inconsistent with the “Ki67” in the caption and main text. Reviewer #3: This is a well-written manuscript with really fantastic results. There is strong experimental and computational work, so it will be quite well-received by PLoS Computational Biology readers. Indeed, the results the authors have shown “a nearly universal signature that is strongly associated with proliferation” will be of interest to a broad reader base spanning across biological and computational fields. I also appreciated that, in applying their approach across organisms/data, the authors demonstrated the “overfitting” used by other methods of proliferation estimation that were based on single-species data. This level of cross-species validation is truly remarkable. That said, there were a couple of points that would have greatly increased my comprehension of the results from an initial read that I recommend authors and editors consider. (In addition there are a few minor points/typos.) 1) I found it a bit disappointing that, in my view, that details of the critical achievement of the work: the establishment of a proliferation score, was relegated to the methods section. I’m a mathematical modeler, so at least from my point of view the proliferation score itself is a result of the paper and merits greater discussion in the main text. Although it relies on ssGSEA, an established method, some details within the manuscript of how the score itself is calculated would have been appreciated. 2) To that end, reading the methods it appears that the authors consider only the genes with a correlation of 1 to proliferation in calculating their score. While clearly, this results in a highly predictive and not-overfit model, this criteria seemed quite strict. Could the authors elaborate if, perhaps within a species, there are genes that are close to perfect correlation that would increase their predictive power? Or once you move outside perfect correlation do you encounter overfitting? 3) I am not an experimentalist, but have worked with CFSE data as a modeler. So I had some questions about these experiments. Were the cells at the start were synchronized in some way? It seemed to me that heterogeneity in the cell cycle at the start of the experiment would lead to complications in identifying the “fast” from “slow” cells as you might capture who was “lucky” enough to be at the right point in the cell cycle. Again, I am familiar with mathematical approaches to estimating proliferation rates from CFSE data (such as described in https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3196292/). The CFSE data in this manuscript (figure 1) is far “cleaner” than I am used to seeing from these previously published studies. If there non-standard experimental procedures which produced this cleaned data, I would recommend they be highlighted in the main text. Minor Points: • Figure 2: Panels (B) and (C), the caption should define what the “grey” regions around the linear fit. (Is this a 90% confidence band? 95% confidence band?) • Line 109: Is the first time in the main text that ESCs are used, I would recommend spelling this out. • Line 139: “cause” should be “causes” • Line 732: I was I was confused by what was meant by “proliferation rate” in the correlation? Is this just the normally calculated doubling time of a sample? • Lines 793: The sentence beginning “We just sum up…” is quite long and hard to parse. I recommend putting a period after “proliferation signature score” and making a new sentence beginning “We do not use ssGSEA…” ********** Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes Reviewer #3: Yes ********** PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No Reviewer #3: No Figure Files: While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Data Requirements: Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms etc.. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5. Reproducibility: To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols References: Review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice. 20 Sep 2021 Submitted filename: Response to Reviewers - PLOS Comp Bio - 2021 0627 f+BP.docx Click here for additional data file. 21 Oct 2021 Dear Dr Payer, We are pleased to inform you that your manuscript 'A conserved expression signature predicts growth rate and reveals cell & lineage-specific differences' has been provisionally accepted for publication in PLOS Computational Biology. Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests. Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated. IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript. Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS. Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. Best regards, Elana J. Fertig, PhD Associate Editor PLOS Computational Biology Mark Alber Deputy Editor PLOS Computational Biology *********************************************************** Reviewer's Responses to Questions Comments to the Authors: Please note here if the review is uploaded as an attachment. Reviewer #1: The authors have responded well to suggestions from the previous round of review. I'm still not completely convinced that using a rank correlation threshold of 1 is the best way to go, because it is very specific to this work's use of 3 bins. I would still prefer to see a statistical treatment that models measurement noise. Nonetheless, the authors' results and conclusions are convincing, so this is more a comment about the general utility of this approach to others who subsequently perform similar analyses and not a comment on the rigor of the present work. Reviewer #2: The authors have addressed all my questions. Reviewer #3: The authors have satisfied my questions and concerns. I believe the paper is ready for publication. ********** Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes Reviewer #3: Yes ********** PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No Reviewer #3: No 5 Nov 2021 PCOMPBIOL-D-21-00616R1 A conserved expression signature predicts growth rate and reveals cell & lineage-specific differences Dear Dr Payer, I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course. The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript. Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers. Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work! With kind regards, Livia Horvath PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol
Table 1

Gene sets whose expression exhibits opposite correlations with growth between fibroblasts and ESCs.

FibroblastsESCs
Gene set nameGene set sizeNESFDR.q.valNESFDR.q.val
Mitochondria Inner mitochondrial membrane protein complex1012,52<0.001-0,39>0.1
Mitochondrial membrane part1642,26<0.001-0,44>0.1
Mitochondrial respiratory chain complex assembly742,19<0.001-0,49>0.1
Mitochondrial respiratory chain complex I biogenesis542,12<0.001-0,49>0.1
Mitochondrial matrix4041,97<0.05-0,49>0.1
Metabolism Metabolism of proteins3772,47<0.001-0,54>0.1
Glycolysis gluconeogenesis602,03<0.05-1,35>0.1
Monosaccharide biosynthetic process521,94<0.05-0,76>0.1
Monosaccharide catabolic process561,67<0.05-0,93>0.1
Hallmark fatty acid metabolism1571,52<0.1-0,71>0.1
Differentiation Dopaminergic neuron differentiation28-1,66<0.051,27>0.1
Hematopoietic progenitor cell differentiation97-1,59<0.11,09>0.1
Regulation of cardiac muscle cell differentiation19-1,57<0.10,92>0.1
regulation of smooth muscle cell differentiation20-1,45>0.11,66<0.1
Glial cell differentiation136-1,00>0.11,63<0.1
Cell cycle Cell cycle G1 S phase transition104-1,95<0.052,03<0.05
Hallmark E2F targets195-2,09<0.0012,43<0.001
Fischer G1 S cell cycle177-2,03<0.0011,90<0.05
Cell cycle checkpoints110-0,84>0.11,82<0.05
Cell cycle phase transition247-1,99<0.0011,31>0.1

Shown are representative gene sets whose expression is significantly correlated with proliferation in either fibroblasts or ESCs, but whose expression changes with proliferation in opposing directions. NES>0 (higher expression in fast); NES<0 (higher expression in slow)

  95 in total

Review 1.  Fluorescent dyes for lymphocyte migration and proliferation studies.

Authors:  C R Parish
Journal:  Immunol Cell Biol       Date:  1999-12       Impact factor: 5.126

2.  Functional genomics reveals a BMP-driven mesenchymal-to-epithelial transition in the initiation of somatic cell reprogramming.

Authors:  Payman Samavarchi-Tehrani; Azadeh Golipour; Laurent David; Hoon-Ki Sung; Tobias A Beyer; Alessandro Datti; Knut Woltjen; Andras Nagy; Jeffrey L Wrana
Journal:  Cell Stem Cell       Date:  2010-06-17       Impact factor: 24.633

3.  A mesenchymal-to-epithelial transition initiates and is required for the nuclear reprogramming of mouse fibroblasts.

Authors:  Ronghui Li; Jialiang Liang; Su Ni; Ting Zhou; Xiaobing Qing; Huapeng Li; Wenzhi He; Jiekai Chen; Feng Li; Qiang Zhuang; Baoming Qin; Jianyong Xu; Wen Li; Jiayin Yang; Yi Gan; Dajiang Qin; Shipeng Feng; Hong Song; Dongshan Yang; Biliang Zhang; Lingwen Zeng; Liangxue Lai; Miguel Angel Esteban; Duanqing Pei
Journal:  Cell Stem Cell       Date:  2010-06-17       Impact factor: 24.633

Review 4.  Growth control and ribosome biogenesis.

Authors:  Harri Lempiäinen; David Shore
Journal:  Curr Opin Cell Biol       Date:  2009-09-30       Impact factor: 8.382

Review 5.  Metabolic remodeling during the loss and acquisition of pluripotency.

Authors:  Julie Mathieu; Hannele Ruohola-Baker
Journal:  Development       Date:  2017-02-15       Impact factor: 6.868

6.  Abnormally high expression of proteasomes in human leukemic cells.

Authors:  A Kumatori; K Tanaka; N Inamura; S Sone; T Ogura; T Matsumoto; T Tachikawa; S Shin; A Ichihara
Journal:  Proc Natl Acad Sci U S A       Date:  1990-09       Impact factor: 11.205

7.  Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt.

Authors:  Steffen Durinck; Paul T Spellman; Ewan Birney; Wolfgang Huber
Journal:  Nat Protoc       Date:  2009-07-23       Impact factor: 13.491

8.  Connecting variability in global transcription rate to mitochondrial variability.

Authors:  Ricardo Pires das Neves; Nick S Jones; Lorena Andreu; Rajeev Gupta; Tariq Enver; Francisco J Iborra
Journal:  PLoS Biol       Date:  2010-12-14       Impact factor: 8.029

9.  Mitochondrial Membrane Potential Identifies Cells with Enhanced Stemness for Cellular Therapy.

Authors:  Madhusudhanan Sukumar; Jie Liu; Gautam U Mehta; Shashank J Patel; Rahul Roychoudhuri; Joseph G Crompton; Christopher A Klebanoff; Yun Ji; Peng Li; Zhiya Yu; Greg D Whitehill; David Clever; Robert L Eil; Douglas C Palmer; Suman Mitra; Mahadev Rao; Keyvan Keyvanfar; David S Schrump; Ena Wang; Francesco M Marincola; Luca Gattinoni; Warren J Leonard; Pawel Muranski; Toren Finkel; Nicholas P Restifo
Journal:  Cell Metab       Date:  2015-12-08       Impact factor: 27.287

10.  Cell Line Data Base: structure and recent improvements towards molecular authentication of human cell lines.

Authors:  Paolo Romano; Assunta Manniello; Ottavia Aresu; Massimiliano Armento; Michela Cesaro; Barbara Parodi
Journal:  Nucleic Acids Res       Date:  2008-10-15       Impact factor: 16.971

View more
  1 in total

1.  Quantitative considerations about the size dependence of cellular entry and excretion of colloidal nanoparticles for different cell types.

Authors:  Yanan Kang; Leroy Marwin Nack; Yang Liu; Bing Qi; Yalan Huang; Ziyao Liu; Indranath Chakraborty; Florian Schulz; Abdullah A A Ahmed; Mirco Clavo Poveda; Fereshta Hafizi; Sathi Roy; Marina Mutas; Malte Holzapfel; Carlos Sanchez-Cano; K David Wegner; Neus Feliu; Wolfgang J Parak
Journal:  ChemTexts       Date:  2022-01-25
  1 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.