Literature DB >> 30753298

Modeling cell proliferation in human acute myeloid leukemia xenografts.

Marco S Nobile^1,2, Thalia Vlachou³, Simone Spolaor¹, Daniela Bossi³, Paolo Cazzaniga^2,4, Luisa Lanfrancone³, Giancarlo Mauri^1,2, Pier Giuseppe Pelicci^3,5, Daniela Besozzi¹.

Abstract

MOTIVATION: Acute myeloid leukemia (AML) is one of the most common hematological malignancies, characterized by high relapse and mortality rates. The inherent intra-tumor heterogeneity in AML is thought to play an important role in disease recurrence and resistance to chemotherapy. Although experimental protocols for cell proliferation studies are well established and widespread, they are not easily applicable to in vivo contexts, and the analysis of related time-series data is often complex to achieve. To overcome these limitations, model-driven approaches can be exploited to investigate different aspects of cell population dynamics.
RESULTS: In this work, we present ProCell, a novel modeling and simulation framework to investigate cell proliferation dynamics that, differently from other approaches, takes into account the inherent stochasticity of cell division events. We apply ProCell to compare different models of cell proliferation in AML, notably leveraging experimental data derived from human xenografts in mice. ProCell is coupled with Fuzzy Self-Tuning Particle Swarm Optimization, a swarm-intelligence settings-free algorithm used to automatically infer the models parameterizations. Our results provide new insights on the intricate organization of AML cells with highly heterogeneous proliferative potential, highlighting the important role played by quiescent cells and proliferating cells characterized by different rates of division in the progression and evolution of the disease, thus hinting at the necessity to further characterize tumor cell subpopulations.
AVAILABILITY AND IMPLEMENTATION: The source code of ProCell and the experimental data used in this work are available under the GPL 2.0 license on GITHUB at the following URL: https://github.com/aresio/ProCell. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Entities: Chemical Disease Gene Species

Year: 2019 PMID： 30753298 PMCID： PMC6748761 DOI： 10.1093/bioinformatics/btz063

Source DB: PubMed Journal: Bioinformatics ISSN： 1367-4803 Impact factor: 6.937

1 Introduction

Acute myeloid leukemia (AML) is one of the most frequent hematological malignancies in adults, with variable prognosis among patients and a high mortality rate. Despite the advances in the field, the backbone of therapeutic intervention for non-promyelocytic AML has remained essentially unaltered for the last 40 years. About 60–85% of patients below the age of 60 respond to therapy and initially achieve complete remission (CR). Nevertheless, most patients will relapse within 3 years after diagnosis. Salvage therapy, in the form of aggressive chemotherapy or allogeneic hematopoietic stem cell (HSC) transplantation, may still be an option at this stage, but the prognosis after the first recurrence becomes dismal, especially when relapse occurs after a brief CR period (Döhner ). It is thought that disease recurrence is caused by the persistence of chemoresistant leukemic cells at CR, presumably residing within the quiescent leukemia stem cell (LSC) compartment. It has been previously shown that only a fraction of distinguishable leukemic cells, the LSCs, has reproducibly high clonogenic capacity and the ability to propagate the disease upon transplantation in recipient mice (Bonnet and Dick, 1997). Among the LSC population, a subpopulation of cell-cycle restricted cells appears to play a pivotal role in both tumor initiation and maintenance (Viale ). This quiescent LSC subset could also evade current chemotherapeutic approaches, which mainly target highly proliferative cells, and initiate disease relapse. Indeed, quiescence has been previously associated through indirect evidence with chemoresistance in AML (Ishikawa ; Saito ). To investigate the dynamic regulation of quiescent LSCs in vivo and to monitor their proliferation kinetics over time, we generated patient-derived xenografts (PDX) of human AML and implemented a label-retaining assay for in vivo cell division tracking of leukemic cell populations with different proliferation potential. For this purpose, human leukemic blasts were labeled, by means of lentiviral transduction, with histone 2 B (H2B) - green fluorescent protein (GFP) and transplanted in immunocompromised mice. The expression of the fusion protein is regulated by a Tet-Off promoter system, allowing for conditional suppression in the presence of tetracycline or tetracycline derivatives (e.g. doxycycline-dox). Thus, the dilution of the H2B-GFP signal in the presence of dox (chasing period) can be used to infer information on the relative contribution of cells with different cell-cycle kinetics in a given AML population, and to estimate their respective proliferation rates during the progression of the disease. The available experimental methods may not be enough to quantify and assess cell proliferation dynamics, because of the complexity of data analysis, the lack of single-cell resolution data, and the inherent difficulties in applying them in vivo. Ideally, the visual identification of each cell generation as a distinct peak in the H2B-GFP fluorescence histogram would require analyses of a starting population of AML blasts (prior to any dox administration) that express homogeneously H2B-GFP and divide in a synchronous manner. Expectedly, individual AML cells are heterogeneous with respect to their cell-cycle properties, and patterns of H2B-GFP expression and dilution in our AML model were highly variable (see Supplementary Section S1 for further information). To investigate the heterogeneity and high complexity of an asynchronously dividing cell population monitored in vivo, we decided to apply computational modeling on flow cytometry data obtained from bone marrow (BM) samples at three time points during a chasing period of 3 weeks (days 0, 10 and 21). Several mathematical models of cancer growth have been proposed in literature, relying on Ordinary Differential Equations to describe different dynamics of cell proliferation (e.g. exponential, power law, logistic) (Sarapata and de Pillis, 2014; Zheng ). Among them, some works investigated the occurrence of subpopulations to obtain a more accurate description and novel insights about tumor evolution (see, for instance, Ibrahim-Hashim ; Kansal ), whereas other works are based on cellular automata or agents to describe emergent and self-organizing dynamics (d’Inverno and Saunders, 2005). Here, following a different strategy from previous works (Foudi ; Hross and Hasenauer, 2016; Luzyanina ; Takizawa ), we propose a novel modeling and simulation approach, called ProCell, which takes into account the inherent stochasticity of cell division events. We apply ProCell to investigate the possible occurrence of different subpopulations of cells in a given AML population. Namely, we define and compare four computational models to study the differences between the presence of quiescent and proliferating cells, with an additional characterization of slowly and fast proliferating cells. To the best of our knowledge, this is also the first attempt of modeling in vivo heterogeneous proliferation kinetics in PDX. The four models required a proper parameterization to run meaningful simulations of cell proliferation in AML. The calibration of the models against the experimental data (i.e. the H2B-GFP fluorescence histogram at any given time point) is a necessary step to determine which model is more adequate to explain the observed cell populations in each BM sample. Therefore, we carried out a parameter estimation task, coupled with the stochastic algorithm of cell division implemented in ProCell, to determine the model parameters leading to a simulation outcome able to fit the experimental data. The problem of evaluating how much a simulation outcome fits with the target fluorescence histogram is a non-convex, non-linear and noisy minimization problem. In this context, global optimization methods can be efficiently exploited, as they are able to intelligently explore the search space of possible model parameterizations and to converge to an optimal solution (Moles ). In this work, we exploit Fuzzy Self-Tuning PSO (FST-PSO, Nobile ), an efficient and settings-free variant of the global optimization method Particle Swarm Optimization (PSO, Poli ). Although classic PSO was shown to be the most suitable algorithm for parameter estimation (Dräger ; Nobile ), a special modification of FST-PSO was proven to outperform all the other methods for this particular application (Nobile ). The four models of cell proliferation in AML were calibrated and validated by using H2B-GFP fluorescence histograms of BM samples measured after 10 and 21 days of chasing, respectively. Our results show that the model accounting for the coexistence of three subpopulations of cells—i.e. quiescent, slowly and fast proliferating cells—yields the best fit against the experimental target data at 10 days. Moreover, this model provides the best prediction of the distribution of AML cells after 21 days of chasing, further supporting the idea that both quiescent and proliferating cells contribute to disease progression in AML.

2 Materials and methods

2.1 Experimental protocol

The experimental strategy is based on the xenotransplantation of human AML blasts infected with a Tet-Off inducible lentivirus that encodes the H2B-GFP fusion protein (Falkowska-Hansen ). H2B-GFP expression in the transduced AML PDX cells is under the control of a tetracycline-responsive promoter element (TRE), which is in turn regulated by a tetracycline-controlled transactivator protein (tTA). In the absence of tetracycline or tetracycline derivatives (e.g. dox), tTA binds to TRE and H2B-GFP is stably expressed in the infected cells. Upon dox administration, tTA can no longer bind to the TRE and, consequently, H2B-GFP expression is shut off. The prolonged suppression of H2B-GFP expression by in vivo dox administration (chasing period), enables the identification of leukemic cell populations endowed with different label-retaining capacity, which is expected to correlate with their cycling properties. Further details on the characteristics of the biological model adopted in this work and the experimental setup are reported in Supplementary Section S1. Dox was administered to the experimental animals engrafted with H2B-GFP expressing blasts through specially modified feed (containing 625 mg dox per kg of pellets). BM cells were collected from dox-treated and -untreated mice at three time points during the chasing period (0, 10 and 21 days) and analyzed by flow cytometry, using FACSCalibur (BD). Alive cells were identified based on physical parameters (forward scatter and side scatter, which relate to cell size and content, respectively). For each BM sample, flow cytometry data were acquired for a maximum of 70 000 alive cells. This sampling process means that the data obtained do not reflect the absolute number of cells expected to be generated by the exponential expansion of the starting population at day 0 (see Supplementary Section S2). Human blasts were further distinguished from the murine BM niche cells using antibodies specific for the human CD45 surface marker (allophycocyanin APC fluorochrome conjugated, clone 2D1, BD). A separate population of non-infected (NI) AML blasts was used as a negative control to set the GFP fluorescence background threshold, below which any measurement is attributed to autofluorescence and the cells are considered as GFP-negative (GFPneg, see Supplementary Section S2). The conditional repression of H2B-GFP expression, upon dox treatment in vivo, resulted in a gradual loss of the intensity of the GFP fluorescent signal in proliferating cells. In fact, in the presence of dox, at each cell division, the H2B-GFP protein is expected to be equally distributed to the two daughter cells, thus allowing monitoring of the number of cell divisions that any given cell has undergone, and the identification of quiescent/slow-proliferating cells. The histogram of the fluorescence intensity of the GFP-positive (GFPpos) cells was then generated using FlowJo v10 (see Supplementary Section S2). GFPneg cells were excluded from further analyses, as their GFP fluorescence values are not informative for the inference of their proliferative history. FlowJo’s proliferation platform was used for the estimation of the number of cell divisions that can be monitored, starting from the original population (day 0), and searching for peaks with diminishing fluorescence with an approximate ratio of 0.5 per generation and peak coefficient of variation (peak CV) of 4–7%. In accordance with the literature (Falkowska-Hansen ), we found that our H2B-GFP expressing AML model allows tracking of up to eight rounds of cell division before losing any detectable GFPpos signal. In vivo studies were performed after approval from the fully authorized animal facility of the Department of Experimental Oncology, notification of the experiments to the Ministry of Health (as required by the Italian Law) (IACUCs No 18/2013) and in accordance to EU directive 2010/63. Human samples were collected from patients whose informed consent was obtained in writing according to the policies of the Ethics Committee of the European Institute of Oncology and regulations of Italian Ministry of Health. The studies were conducted in full compliance with the Declaration of Helsinki.

2.2 Modeling cell proliferation in AML

Four computational models describing cell proliferation were defined to investigate the possible coexistence of different subpopulations of AML cells: quiescent cells, proliferating cells, slowly proliferating cells, fast proliferating cells. In each model, a specific configuration of cell subpopulations was considered, to predict whether cells proliferated (fast or slowly) or remained quiescent in a given BM sample. Namely, in Model #1, we assume the absence of quiescent cells and the occurrence of proliferating cells only, without any distinction about their rate of division. In Model #2, we assume the presence of both quiescent cells and proliferating cells, without any distinction about their rate of division. In Model #3, we assume the presence of quiescent cells and two different subpopulations of proliferating cells, characterized by slow and fast rates of division. Finally, in Model #4, we assume the absence of quiescent cells and the occurrence of slowly and fast proliferating cells. These four models were defined in order to assess the actual existence of quiescent cells in the AML sample data, and the presence of proliferating cells that could be characterized or not by different rates of division. The probability distribution of the division times of proliferating cells is still poorly characterized, given the inherent variability between cells belonging to distinct species, tissues, or grown in different conditions, and the lack of studies regarding this matter (Kinjyo ; Stukalin ). Since an agreement has still to be reached, especially in the case of human primary cells, for the sake of simplicity we assume that the division interval of (slowly and fast) proliferating cells is a random variable with a truncated normal distribution , for some mean μ and standard deviation (SD) σ, whose left tail is limited to zero (i.e. values lower than zero are rejected and re-sampled); in contrast, quiescent cells do not divide by definition. These parameters, and the proportions of cell types in each model, are unknown. Table 1 summarizes the model parameters that had to be estimated, as described in Section 3.2. We also assume that cells do not change their proliferating rate as a consequence of cell division: any (slowly or fast) proliferating cell will generate, respectively, two (slowly or fast) proliferating daughter cells. Any cell that divides during the chasing period, leads to the generation of two daughter cells with half of the fluorescence intensity. Cells cannot re-gain fluorescence in the presence of dox. H2B-GFP is stable for the period of the 3 weeks we study here, meaning no loss of fluorescence intensity is expected for other reasons than cell division.

Table 1.

Model parameters to be estimated: proportion of cells (π), mean division interval (μ), SD of division interval (σ)

	Quiescent	Proliferating			Slowly proliferating			Fast proliferating
	π_q	π_p	μ_p	σ_p	π_s	μ_s	σ_s	π_f	μ_f	σ_f
Model #1			✓	✓
Model #2	✓	^a	✓	✓
Model #3	✓				^b	✓	✓	✓ ^c	✓	✓
Model #4					^d	✓	✓	✓	✓	✓

aEqual to .

bEqual to .

cEstimated on the remainder .

dEqual to .

Model parameters to be estimated: proportion of cells (π), mean division interval (μ), SD of division interval (σ) aEqual to . bEqual to . cEstimated on the remainder . dEqual to . The simulation outcomes of each model were compared with experimental data for model calibration and validation. The data consist of histograms of fluorescence levels measured at 0, 240 and 504 h, as described in Section 2.1. In what follows we denote a histogram as a finite set of ordered pairs where M is the number of different fluorescence levels in a given experimental dataset and, for each and ψ denote the ith fluorescence level and the frequency of the ith fluorescence level, respectively, measured at time t. The fluorescence levels satisfy the condition that for each . The main assumption of our modeling approach is that every time a proliferating cell with a fluorescence level divides, it generates two daughter cells whose fluorescence levels are equal to , because of the even distribution of H2B-GFP. The division events are stochastically simulated (see Section 3.1) to determine which configuration of cell subpopulations better fits with the experimental data. Each cell occurring in the initial population is explicitly tracked during the simulation until its fluorescence level reaches a minimum threshold . Quiescent cells naturally keep their fluorescence levels unaltered throughout the simulation.

2.3 Fuzzy Self-Tuning Particle Swarm Optimization

PSO is an iterative stochastic population-based meta-heuristics inspired by the collective movement of fish and birds (Poli ). Specifically, in PSO a swarm of P candidate solutions (named particles), each one identified by a position vector x, moves inside a bounded D-dimensional search space, cooperating to converge to the global best solution of the optimization problem. Traditionally, the global exploration and local exploitation behaviors of particles are controlled by two settings, the cognitive attractor and the social attractor . Moreover, the velocity of particles is limited on each dimension of the search space by using a maximum velocity value , with , and modulated by an inertia factor to avoid chaotic behaviors in the swarm. The goodness of the solutions identified by PSO strongly depends on the accurate selection of its functioning settings, that is, and (for each ). However, such settings are problem-dependent and require a massive number of trials to be determined. To bypass this issue, in this work we rely on FST-PSO, a settings-free variant of PSO that leverages fuzzy logic to dynamically adjust the functioning settings of each particle. FST-PSO was empirically shown to be competitive with state-of-the-art methods (Nobile ), especially for the parameter estimation problem (Nobile ). In FST-PSO, the search space is bounded to avoid the divergence of candidate solutions towards infinity; in particular, the algorithm assumes damping boundary conditions (Xu and Rahmat-Samii, 2007). The ranges of parameter values used to define the search space for the four models of cell proliferation in AML are given in Table 2. The rationale for the choice of these intervals is based on the analysis of the experimental histograms: The optimization process carried out by FST-PSO is realized by evaluating, at each iteration, the ‘quality’ of the position of each particle in the search space (that is, the goodness of the candidate model parameterization) by means of a so-called fitness function. This is necessary to identify the current global and local best particles, and to update the position of each particle. In this work, since each particle represents a candidate model parameterization, the fitness function of a particle consists in comparing the histogram corresponding to the experimental data with the histogram generated by simulating each model of cell proliferation using the parameterization codified by that particle (see Section 3.2).

Table 2.

Range of parameter values used by FST-PSO (time expressed in hours)

Proliferating		Slowly proliferating		Fast proliferating		Proportion
μ_p	σ_p	μ_s	σ_s	μ_f	σ_f	π_q	π_f
[30,504]	[0.01,40]	[63, 504]	[0.01,40]	[30, 63]	[0.01,30]	[0, 1]	[0, 1]

if the division rate of proliferating cells was longer than 21 days, then the initial histogram (i.e. at t = 0 days) would be identical to the histograms measured at t = 21 days, and cells would appear as quiescent cells. Thus, 504 h was selected as the upper bound for the mean of the division interval for both proliferating and slowly proliferating cells; if the division rate of proliferating cells was shorter than 30 h, then the unthresholded experimental histogram measured at t = 10 days would greatly overlap with the histogram of NI cells. Since only a minor fraction of cells (<1%) falls under the autofluorescence threshold at t = 10 days (see Supplementary Section S2) and eight divisions are required for an H2B-GFP labeled cell to become GFPneg, 30 h were selected as the lower bound for the mean of the division interval for both proliferating and fast proliferating cells; the threshold between slowly and fast proliferating cells was chosen similarly to the previous reasoning: if slowly proliferating cells divided more than 8 times during the t = 21 days time span, the experimental histograms would appear with zero frequency for fluorescence levels between 101 and 103. Hence, the threshold between slowly and fast proliferating cells was set to 63 h; the ranges of values for the SD were chosen coherently with the mean of the division interval for each subpopulation type, and reasonably large to avoid the possibility of determining a search space that does not include the optimal solution. Range of parameter values used by FST-PSO (time expressed in hours)

3 Algorithm

3.1 ProCell: stochastic simulation of cell proliferation

ProCell is a novel, general-purpose stochastic algorithm that allows to simulate cell division in a population of proliferating cells. Here we apply ProCell in the context of AML considering the configuration of subpopulations defined in Models #1–#4 (see Table 1), and taking into account that cell fluorescence is halved after each cell division due to GFP-marked chromatin separation, as described in Section 2.1. Figure 1 shows that (slowly and fast) proliferating cells can stochastically divide at different time instants during the simulation, generating two daughter cells with halved fluorescence levels. Note that quiescent cells never divide and keep their initial fluorescence level unaltered till the end of the simulation. ProCell accounts for this stochasticity since the exact timing of each cell division is not fixed a priori, but it is a random variable sampled at each iteration from a truncated normal distribution , for some mean division interval μ and SD σ. Given a model of cell proliferation, an arbitrary model parameterization x, an initial histogram of fluorescence levels and a specific threshold , ProCell generates the predicted output histogram H(t) that represents the frequency distribution of simulated GFPpos cells at time t. Precisely, the input of ProCell consists in the following data:

Fig. 1.

Scheme of ProCell functioning. Quiescent cells (salmon) never divide and keep their initial fluorescence level unaltered. Slowly and fast proliferating cells (cyan and green, respectively) stochastically divide at different time instants, generating daughter cells with halved fluorescence levels. When the maximum simulation time is reached, the fluorescence distribution of the final cell population is returned

the histogram H(0) of the initial fluorescence levels (as defined in Section 2.2, assuming t = 0); the threshold of H2B-GFP specific cell fluorescence detection; the vector of the cell types in the population; the proportions of cell types in the population, where ; the vector of the mean division intervals for non-quiescent cells; the vector of the SD of division intervals for non-quiescent cells; the number of hours tmax to be simulated. ProCell’s output is the final histogram of the population after tmax hours. Scheme of ProCell functioning. Quiescent cells (salmon) never divide and keep their initial fluorescence level unaltered. Slowly and fast proliferating cells (cyan and green, respectively) stochastically divide at different time instants, generating daughter cells with halved fluorescence levels. When the maximum simulation time is reached, the fluorescence distribution of the final cell population is returned ProCell relies on a stack of cell objects. Each cell is characterized by four attributes: (i) the cell type; (ii) the fluorescence level; (iii) the timer, denoting the waiting time before the cell’s next division; (iv) the current simulation time instant t. The first step of ProCell consists in creating a stack of cells (see Listing 1). Specifically, ProCell begins by fetching couples from the initial histogram H(0) (Line 3). For each couple, exactly ψ cells with fluorescence level are created (Lines 4–5). The total number of cells pushed on the stack is equal to the cumulative frequency of the histogram, i.e. , where M is the number of different fluorescence levels in H(0). When a new cell is created, it is assigned to one of the available population types. The type of the new cell is randomly chosen with a probability proportional to the user-specified proportions (Line 6). In the listing, denotes a number randomly chosen from a uniform distribution in . In the context of AML we consider the following cell types: quiescent (q), proliferating (p), slowly proliferating (s), and fast proliferating (f) cells. The vector of the proportions of cell subpopulations appearing in each model of cell proliferation in AML is unknown, and is estimated at run time by coupling the simulation of cell divisions with FST-PSO, as described in Section 3.2. If the cell is quiescent, its timer is set to infinity (Line 9); otherwise, its timer is set by randomly sampling the waiting time from a truncated normal distribution with the user-specified mean and SD (Line 11). Cells are not synchronized, that is, they can be in any phase of the cell-cycle. Hence, the initial time t of the cell is set to an arbitrary value of its waiting time. This is achieved by multiplying a random sample from a truncated normal distribution, with the user-specified mean and SD, by (Line 13). In the context of AML, both the vectors of the mean () and SD () values of division intervals for all cell types appearing in the four models are not known and need to be estimated at run time. When the creation process is completed, the cell is pushed on the stack (Line 14). Listing 1. Initialization of the cells’ stack The simulation algorithm at the basis of ProCell is shown in Listing 2. As a first step, ProCell creates a container (in our implementation, a dictionary) that collects the final fluorescence levels after tmax hours (Line 2). Then, as long as the cell stack is not empty (Line 3), cell divisions are repeatedly performed. During each iteration, a cell is popped out of the stack and its type is determined. If the cell is quiescent, by definition, it does not divide and its initial fluorescence level is conserved until the end of the simulation. Hence, a new hit of that specific fluorescence level is directly added to the dictionary (Lines 5–6). If the cell is not quiescent, it can undergo division. The algorithm verifies whether the waiting time before the next division will exceed the maximum simulation time tmax. If the calculated division interval is greater than tmax, the cell will not divide before the end of the simulation and a new hit of its current fluorescence level is added to the dictionary (Lines 23–24). On the contrary, if the calculated division interval is smaller than or equal to tmax, then the simulation time of the cell is updated (Line 9) and the fluorescence level of the two daughter cells is verified: as soon as the fluorescence level drops below the threshold , the corresponding cell is removed from the simulation (Line 10) since it cannot be detected as GFPpos by the experimental instrumentation. Listing 2. Simulation algorithm If the fluorescence level remains above the threshold, two new cell objects are created. These new cells inherit the halved fluorescence levels (Lines 11–12), the current simulation time (Lines 13–14), and the type (Lines 15–17) from the parent cell. The waiting time before the next cell division is not directly inherited from the parent cell, but it is sampled from the user-specified truncated normal distribution according to the cell type (Lines 18–19). Finally, the new cells are pushed on the stack (Lines 20–21). The simulator iterates until the stack is empty; when the simulation is over, the dictionary representing the histogram of the fluorescence distribution at tmax is returned (Line 28). It is worth noting that, given any non-quiescent cell in the stack, the sequence of its divisions halts under two circumstances: when the simulation time reaches tmax, or when the fluorescence levels of its daughter cells fall below the threshold . The latter circumstance is more likely to occur for proliferating cells characterized by a low initial fluorescence level, especially in the case of fast proliferating cells. The thresholding of the fluorescence level reflects the actual laboratory reality and mitigates the exponential growth of the stack size.

3.2 Parameter estimation of cell proliferation models

Given an initial histogram of fluorescence levels H(0) and a vector of model parameters (i.e. in the context of AML), the algorithm given in Listing 2 is used to simulate cell proliferation for each subpopulation type. In this work, each model presented in Table 1 is instantiated by setting to zero the proportion of each subpopulation that is not explicitly considered in that model. For instance, Model #1 is obtained by setting in x, so that only proliferating cells are created in the stack of cells according to Listing 1. The parameters in x are unknown for the cell proliferation models in human AML xenografts, and were estimated by means of the FST-PSO algorithm. During the execution of FST-PSO, each candidate parameterization codified by a particle is used to perform an independent simulation of cell proliferation with ProCell; the fitness function of the parameterization is calculated as the distance between the simulated histogram and the experimental histogram of the fluorescence values sampled after a fixed amount of time tmax. Specifically, the measure used to assess the fitness function is the Hellinger distance (see Nikulin, 2001 for further information). The rationale of our fitness function—along with the binning and normalization procedures used to make the target and simulated distributions directly comparable—are described in Supplementary Section S3. In all fitness evaluations, we set to the minimum fluorescence level in the target histogram. In order to analyze the average behavior of FST-PSO, for each model we run Λ = 30 independent runs of the cell division algorithm coupled with FST-PSO, and collect statistics about the best solution found at the end of the optimization process. Both ProCell and the parameter estimation methodology were implemented in Python, exploiting the libraries numpy v1.13.3 (Oliphant, 2007), scipy v1.0.1 (Jones ), fst-pso v1.4.8 (Nobile ) and miniful v0.0.4. In order to accelerate the fitness function evaluations, we exploited the Python multiprocessing facility to parallelize the simulations of the particles over multiple cores. Moreover, we further increased the level of parallelism by executing multiple independent runs of FST-PSO on a multi-node supercomputer (CINECA’s Marconi).

4 Results

4.1 Model calibration

The parameters of each model of cell proliferation (Table 1) were calibrated by means of FST-PSO, exploiting the leukemic blasts collected from the BM of AML xenografts. The simulated histograms were calculated starting from an experimental initial histogram of fluorescence levels H(0) and fitted against the experimental target histograms according to the fluorescence of the cells harvested after tmax = 240 h of chasing, as described in Section 3.2. Figure 2 shows a swarm plot (SP) of the optimal solutions found in Λ = 30 runs for each model (organized along the x axis). A SP is a special type of scatterplot, in which data are distributed along the y axis according to their values. The peculiarity of SP is that points with similar quality are packed but displaced along the x axis, to avoid any overlap and to give an optimal perception of the distribution of data. In Figure 2, data represent the fitness values, i.e. the Hellinger distance between each optimal solution and the target experimental histogram. According to this SP, Models #3 and #4 (dark green and blue, respectively) achieve a better fitting of the target distributions with respect to Models #1 and #2 (red and light green, respectively). Interestingly, Figure 2 highlights that Models #1 and #2 always converge to solutions with the same quality; on the contrary, the fitness landscapes associated with Models #3 and #4 are characterized by a local optimum, represented by the clusters of solutions characterized by a Hellinger distance value approximately equal to 18. This circumstance is due to the stochastic initialization of particles’ position (see Supplementary Section S4 for a detailed explanation). This result emphasizes the importance of performing multiple repetitions (e.g. ) of the parameter estimation task, instead of relying on a single outcome of the optimization procedure. As a matter of fact, this approach is preferable as it reduces the probability to find out a ‘low quality’ model parameterization, corresponding to a particle stuck into one of the local minima. Therefore, considering Λ = 30 runs of FST-PSO, we report in Table 3 the best parameterization found for each model of cell proliferation.

Fig. 2.

Table 3.

Best parameterizations found by FST-PSO

	μp(σp)	μs(σs)	μf(σf)	πq:πp:πs:πf
Model #1	46.1 (28.9)	—	—	− :1: − :−
Model #2	46.0 (26.4)	—	—	0.14 : 0.86 : − : −
Model #3	—	63.7 (28.9)	41.1 (14.3)	0.06 : − : 0.47 : 0.47
Model #4	—	83.9 (26.6)	44.2 (19.5)	− : − :0.28:0.72

Note: Time expressed in hours.

Results of the parameter estimation over the four competing models, shown as SPs of the solutions found (the lower, the better). Models #3 (dark green) and #4 (blue) achieve the best fitting with the experimental histograms Best parameterizations found by FST-PSO Note: Time expressed in hours. The best parameterizations found across the Λ repetitions were used to perform a simulation of cell proliferation with ProCell, and to visually inspect the quality of the solutions. In Figure 3 we show the comparison between the experimental histograms (green bars) and the simulated histograms (red bars) for each cell proliferation model, obtained using the best model parameterizations listed in Table 3. These results further confirm that Models #3 and #4 better overlap the target histogram with respect to Models #1 and #2. The inset figures show the right tail of the distribution: this region highlights the role of quiescent cells, since only Models #2 and #3 are able to fit this specific part of the histogram. Finally, the bars on the top of Figure 4 show the Hellinger distances between the target distribution and the simulated distribution evaluated using the best model parameterizations found by FST-PSO, and further confirm that Model #3 provides the best explanation for the experimental data.

Fig. 3.

Fig. 4.

Comparison of the Hellinger distances between the experimental and the simulated histograms at tmax = 240 h for model calibration (bars on the top), and tmax = 504 h for model validation (bars on the bottom). The simulations were performed using the best parameterization found by FST-PSO. A lower Hellinger distance corresponds to a better result: Model #3 represents the best explanation for cell proliferation in AML

Comparison between the target histogram (green bars) and the simulated histogram (red bars) obtained at tmax = 240 h. Simulations were run with the best parameterizations found by FST-PSO for each cell proliferation model. The inset represents the right tail of the histogram, i.e. cells whose fluorescence is above 103 Comparison of the Hellinger distances between the experimental and the simulated histograms at tmax = 240 h for model calibration (bars on the top), and tmax = 504 h for model validation (bars on the bottom). The simulations were performed using the best parameterization found by FST-PSO. A lower Hellinger distance corresponds to a better result: Model #3 represents the best explanation for cell proliferation in AML Altogether, these results suggest that, on the one hand, a naïve distinction between quiescent and proliferating cells in the population (i.e. Model #2) is not sufficient to reproduce the fluorescence levels measured after tmax = 240 h of chasing, even though these two subpopulations help to partially describe the high-fluorescence tail of the histogram. On the other hand, the distinction between two types of proliferating cells with different average division intervals (i.e. slowly and fast proliferating cells in Model #4) has a stronger impact on the overall population dynamics. In fact, our results indicate that it is the simultaneous presence of quiescent, slowly proliferating and fast proliferating cells in the population (i.e. Model #3) that is able to reproduce at best the experimental data. At this point, we wanted to exclude the possibility that the good fit obtained by Model #3 was due to the larger number of its free parameters with respect to the other three models. We performed a tailored analysis by defining an additional model characterized by a larger number of free parameters, named Model #5. According to our results, we can affirm that Model #3 represents the simplest and the most likely explanation for the observed phenomena (see Supplementary Section S5 for more details). Moreover, since some works reported the gamma distribution (Kinjyo ; Stukalin ) as a putative distribution of cell division times, we repeated the parameter estimation of the five models exploiting this distribution. The results of this analysis (reported in Supplementary Section S6) show that, in this particular context, models adopting the gamma distribution do not provide more fitting solutions with respect to the ones adopting the normal distribution. We conclude that the H2B-GFP dilution profile at day 10 is consistent with the persistence of quiescent and slowly proliferating cell subpopulations during the exponential phase of leukemia development in our xenotransplantation model. Importantly, according to the best parameterization of Model #3 estimated by FST-PSO, only ∼47% of the initial leukemic population (day 0) appears to divide rapidly to fuel tumor growth.

4.2 Model validation

To validate the predictive capability of Model #3, we simulated cell proliferation up to tmax = 504 h (i.e. 21 days) using the best parameterization found by FST-PSO. This time interval was chosen as the maximum chasing period experimentally implemented, in order to avoid unnecessary animal suffering. Notably, at the end of the 3-week chasing, <1% of the leukemic cells retained a high intensity GFP signal. We show in Figure 5 the outcome of Model #3 validation. According to our results, the predicted distribution obtained with the best parameterization given in Table 3 still fits with the experimental data measured at tmax = 504 h. For the sake of completeness, we carried out the validation of the other three models of cell proliferation, and calculated the Hellinger distance on the predicted distributions generated by their best parameterizations, as given in Table 3. The bars on the bottom of Figure 4 further confirm that Model #3 is characterized by the most fitting behavior (even better than Model #5, as shown in Supplementary Section S5), therefore consolidating the theory that Model #3 is the most likely model of cell proliferation in AML.

Fig. 5.

Validation of Model #3: the predicted distribution (red bars) fits with the experimental data (green bars) at tmax = 504 h

Validation of Model #3: the predicted distribution (red bars) fits with the experimental data (green bars) at tmax = 504 h In conclusion, our analysis demonstrates that the analyzed BM samples of AML xenografts are composed of populations with highly heterogeneous proliferative potential. According to the best parameterization found by FST-PSO (Table 3), in the analyzed timeframe only 47% of the initial population of cells can be considered as fast proliferating, with a division event occurring approximately every 2 days. Instead, an equal fraction of the population at the start of the chasing period was found to proliferate at a slower rate (with a division interval of about 3 days), while 6% remains undivided for the entire chasing period. We consider this latter population of label-retaining cells as bona fide quiescent leukemic blasts.

5 Conclusion

The stochastic modeling approach presented here provides new insights on the intricate cellular organization of AML, highlighting the existence of slowly proliferating and long-term quiescent leukemic cells in vivo. Notably, based on the comparison of different models of cell proliferation and on their best parameterizations found by FST-PSO, ∼1 in 20 leukemic blasts of the initial population at day 0 is predicted to remain quiescent during a chasing period of 3 weeks. In addition, around half of the blasts at day 0 are predicted to contribute in fueling tumor growth, albeit with a lower cell turnover. This complex scenario, depicted by Model #3, could have important clinical implications in the context of disease recurrence, since most therapeutic approaches target mainly highly proliferating cells. Quiescent/slowly dividing cells, instead, could evade aggressive chemotherapeutic treatment, and drive tumor evolution and adaptation under harsh environmental challenges. Thus, our in vivo H2B-GFP label-retaining assay establishes an ideal experimental setting for the isolation of viable leukemic cells with distinct proliferation properties, allowing further manipulation aiming at their functional, transcriptional and genomic characterization, with the ultimate goal of revealing intrinsic vulnerabilities and novel therapeutic targets. To our knowledge, this is the first attempt to decipher in vivo cell-cycle and proliferation kinetics in AML xenografts. Similar studies have been performed, instead, on the normal HSC counterpart, using H2B-GFP, CFSE or BrdU label-dilution flow cytometry data after long-term in vivo chasing periods (Bernitz ; Foudi ; Takizawa ; Wilson ). Depending on the study, active HSCs appear to divide once in about every 10–36 days, while the dormant ones are estimated to enter cell-cycle every 100–150 days (Takizawa ; Wilson ). In contrast to studies on normal hematopoiesis, leukemia progression imposes strict limitations on the length of the chasing periods that can be safely applied, due to animal welfare reasons. Nevertheless, in all cases, dividing leukemia blasts exhibited faster kinetics in our PDX model (∼2.7 days for the slowly dividing, and ∼1.7 days for the fast dividing cells), while quiescent cells were detectable even after 6 weeks of continuous dox treatment. We plan to further optimize our experimental setting, to allow monitoring of cell proliferation within wider timeframes. From the computational point of view, although the relevant running time required by ProCell was strongly reduced by the combined use of multi-threading and distributed computation over multi-node clusters, simulating the temporal evolution of cell populations is still very time consuming. As future developments, we plan to realize a novel implementation of ProCell that exploits the computing power of modern graphics processing units (GPUs). GPUs are massive multi-core co-processors capable of drastically reducing the running time required by any computational tool, paving the way to thorough investigations of biological systems (Nobile ). Specifically, we will re-implement ProCell using the C++ language and the Nvidia CUDA library. Moreover, we plan to leverage CUDA’s dynamic parallelism in order to recursively spawn new threads on the GPU as soon as a cell divides. Ad hoc data structures for histogram accumulation and storage—combined with atomic mathematical functions to prevent critical runs—will provide a strong reduction of the overall computation time, allowing more complex analyses at the cell population level. To determine the parameters of cell proliferation models, the computational framework presented in this work exploits the settings-free algorithm FST-PSO. So doing, the input of our computational framework is reduced only to the vector of cell types and the ranges of the parameter values, thus facilitating users without programing or computational expertise to run experiments with ProCell. Click here for additional data file.

21 in total

1. Emergence of a subpopulation in a computational model of tumor growth.

Authors: A R Kansal; S Torquato; E A Chiocca; T S Deisboeck
Journal: J Theor Biol Date: 2000-12-07 Impact factor: 2.691

2. Parameter estimation in biochemical pathways: a comparison of global optimization methods.

Authors: Carmen G Moles; Pedro Mendes; Julio R Banga
Journal: Genome Res Date: 2003-10-14 Impact factor: 9.043

3. An inducible Tet-Off-H2B-GFP lentiviral reporter vector for detection and in vivo isolation of label-retaining cells.

Authors: Berit Falkowska-Hansen; Jasmin Kollar; Barbara Maria Grüner; Merle Schanz; Petra Boukamp; Jens Siveke; Axel Rethwilm; Marc Kirschner
Journal: Exp Cell Res Date: 2010-02-18 Impact factor: 3.905

4. Hematopoietic stem cells reversibly switch from dormancy to self-renewal during homeostasis and repair.

Authors: Anne Wilson; Elisa Laurenti; Gabriela Oser; Richard C van der Wath; William Blanco-Bose; Maike Jaworski; Sandra Offner; Cyrille F Dunant; Leonid Eshkind; Ernesto Bockamp; Pietro Lió; H Robson Macdonald; Andreas Trumpp
Journal: Cell Date: 2008-12-12 Impact factor: 41.582

5. Induction of cell cycle entry eliminates human leukemia stem cells in a mouse model of AML.

Authors: Yoriko Saito; Naoyuki Uchida; Satoshi Tanaka; Nahoko Suzuki; Mariko Tomizawa-Murasawa; Akiko Sone; Yuho Najima; Shinsuke Takagi; Yuki Aoki; Atsushi Wake; Shuichi Taniguchi; Leonard D Shultz; Fumihiko Ishikawa
Journal: Nat Biotechnol Date: 2010-02-14 Impact factor: 54.908

6. Chemotherapy-resistant human AML stem cells home to and engraft within the bone-marrow endosteal region.

Authors: Fumihiko Ishikawa; Shuro Yoshida; Yoriko Saito; Atsushi Hijikata; Hiroshi Kitamura; Satoshi Tanaka; Ryu Nakamura; Toru Tanaka; Hiroko Tomiyama; Noriyuki Saito; Mitsuhiro Fukata; Toshihiro Miyamoto; Bonnie Lyons; Koichi Ohshima; Naoyuki Uchida; Shuichi Taniguchi; Osamu Ohara; Koichi Akashi; Mine Harada; Leonard D Shultz
Journal: Nat Biotechnol Date: 2007-10-21 Impact factor: 54.908

7. Cell-cycle restriction limits DNA damage and maintains self-renewal of leukaemia stem cells.

Authors: Andrea Viale; Francesca De Franco; Annette Orleth; Valeria Cambiaghi; Virginia Giuliani; Daniela Bossi; Chiara Ronchini; Simona Ronzoni; Ivan Muradore; Silvia Monestiroli; Alberto Gobbi; Myriam Alcalay; Saverio Minucci; Pier Giuseppe Pelicci
Journal: Nature Date: 2009-01-01 Impact factor: 49.962

8. Dynamic variation in cycling of hematopoietic stem cells in steady state and inflammation.

Authors: Hitoshi Takizawa; Roland R Regoes; Chandra S Boddupalli; Sebastian Bonhoeffer; Markus G Manz
Journal: J Exp Med Date: 2011-02-07 Impact factor: 14.307

9. Analysis of histone 2B-GFP retention reveals slowly cycling hematopoietic stem cells.

Authors: Adlen Foudi; Konrad Hochedlinger; Denille Van Buren; Jeffrey W Schindler; Rudolf Jaenisch; Vincent Carey; Hanno Hock
Journal: Nat Biotechnol Date: 2008-12-05 Impact factor: 54.908

10. Modeling metabolic networks in C. glutamicum: a comparison of rate laws in combination with various parameter optimization strategies.

Authors: Andreas Dräger; Marcel Kronfeld; Michael J Ziller; Jochen Supper; Hannes Planatscher; Jørgen B Magnus; Marco Oldiges; Oliver Kohlbacher; Andreas Zell
Journal: BMC Syst Biol Date: 2009-01-14

2 in total

1. Modeling heterogeneous tumor growth dynamics and cell-cell interactions at single-cell and cell-population resolution.

Authors: Leonard A Harris; Samantha Beik; Patricia M M Ozawa; Lizandra Jimenez; Alissa M Weaver
Journal: Curr Opin Syst Biol Date: 2019-09-16

Review 2. Single-Cell Technologies to Decipher the Immune Microenvironment in Myeloid Neoplasms: Perspectives and Opportunities.

Authors: Chiara Caprioli; Iman Nazari; Sara Milovanovic; Pier Giuseppe Pelicci
Journal: Front Oncol Date: 2022-02-02 Impact factor: 5.738

2 in total