Population modeling of tumor growth curves and the reduced Gompertz model improve prediction of the age of experimental tumors.

Cristina Vaghi^1,2, Anne Rodallec³, Raphaëlle Fanciullino³, Joseph Ciccolini³, Jonathan P Mochel⁴, Michalis Mastri⁵, Clair Poignard^1,2, John M L Ebos^5,6, Sébastien Benzekry^1,2.

Abstract

Tumor growth curves are classically modeled by means of ordinary differential equations. In analyzing the Gompertz model, several studies have reported a striking correlation between the two parameters of the model, which could be used to reduce the dimensionality and improve predictive power. We analyzed tumor growth kinetics within the statistical framework of nonlinear mixed-effects modeling (population approach). This allowed the simultaneous modeling of tumor dynamics and inter-animal variability. Experimental data comprised three animal models of breast and lung cancers, with 833 measurements in 94 animals. Candidate models of tumor growth included the exponential, logistic and Gompertz models. The exponential and, more notably, logistic models failed to describe the experimental data whereas the Gompertz model generated very good fits. The previously reported population-level correlation between the Gompertz parameters was further confirmed in our analysis (R² > 0.92 in all groups). Combining this structural correlation with rigorous population parameter estimation, we propose a reduced Gompertz function consisting of a single individual parameter (and one population parameter). Leveraging the population approach using Bayesian inference, we estimated times of tumor initiation using three late measurement timepoints. The reduced Gompertz model was found to exhibit the best results, with drastic improvements when using Bayesian inference as compared to likelihood maximization alone, for both accuracy and precision. Specifically, mean accuracy (prediction error) was 12.2% versus 78% and mean precision (width of the 95% prediction interval) was 15.6 days versus 210 days, for the breast cancer cell line. These results demonstrate the superior predictive power of the reduced Gompertz model, especially when combined with Bayesian estimation. They offer possible clinical perspectives for personalized prediction of the age of a tumor from limited data at diagnosis.
The code and data used in our analysis are publicly available at https://github.com/cristinavaghi/plumky.

Year: 2020 PMID: 32097421 PMCID: PMC7059968 DOI: 10.1371/journal.pcbi.1007178

Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475

Introduction

In the era of personalized oncology, mathematical modeling is a valuable tool for quantitative description of physiopathological phenomena [1, 2]. It allows for a better understanding of biological processes and generates useful individual clinical predictions, for instance for personalized dose adaptation in cancer therapeutic management [3]. Tumor growth kinetics have been studied for several decades, both clinically [4] and experimentally [5]. One of the main findings of these early studies is that tumor growth is not entirely exponential, provided it is observed over a long enough timeframe (a 100- to 1000-fold increase) [6]. The specific growth rate slows down and this deceleration can be particularly well captured by the Gompertz model [7, 6, 8]:

$$V(t) = V_{inj} \, e^{\frac{\alpha}{\beta}\left(1 - e^{-\beta t}\right)}, \qquad (1)$$

where $V_{inj}$ is the initial tumor size at $t_{inj} = 0$ and α and β are two parameters. While the etiology of the Gompertz model has long been debated [9], several independent studies have reported a strong and significant correlation between the parameters α and β in either experimental systems [6, 10, 11] or human data [11, 12, 13]. While some authors suggested this would imply a constant maximal tumor size (given by $V_{inj} e^{\alpha/\beta}$ in (1)) across tumor types within a given species [11], others argued that because of the presence of the exponential function, this so-called ‘carrying capacity’ could vary over several orders of magnitude [14]. Mathematical models for tumor growth have been previously studied and compared at the level of individual kinetics and for prediction of future tumor growth [15, 16]. However, detailed studies of statistical properties of tumor growth models using a population approach (i.e. integrating structural dynamics with inter-subject variability [17]) are rare [18]. Nonlinear mixed-effects modeling of the Gompertz model has been applied to several fields in biology, e.g. to model growth in Japanese quails [19] or broiler chickens [20].
In the field of tumor growth modeling, studies using a population approach have mostly been conducted for perturbed tumor growth under the action of therapeutics (see e.g. [21] for a clinical study and [22] for a review). In a previous publication, our group has used a mixed-effects framework to compare the descriptive power of several unperturbed tumor growth models, yet without reporting visual predictive checks, analysis of residuals nor values of the population parameters (typical values and standard deviations of the random effects) [15]. Other related works include the coupling of tumor growth models with metastatic spreading [23, 24], or an analysis of tumor growth kinetics from different cell lines using the Simeoni model only [25, 18]. A calibrated model of lymphoma tumor growth has also been introduced and used for predictions in [26]. More complex mechanistic models have been proposed to investigate the link between biological processes and tumor growth dynamics and perform predictions, including angiogenesis [27] and solid stress [28]. A model for tumor-immune interactions has been developed and validated in [29, 30], demonstrating its ability to predict future prostate specific antigen dynamics based on several pre- and post-treatment initiation data points. Mathematical models of tumor growth inhibition were presented to assess tumor size dynamics in colorectal cancer [31] and adult diffuse low-grade gliomas [32]. Spatial models have also been widely proposed in a theoretical context but few of them have been compared to data (see [33] for an example on thyroidal lung nodules and [34, 35] for gliomas). Here we provide a detailed and comparative analysis of statistical properties of multiple classical tumor growth models within a population framework, applied to a data set of 94 animals, including three animal models and two methods of tumor size quantification (versus 54 animals in [15]). 
The main focus and novelty of the work reported here is to analyze the above-mentioned correlation between Gompertz parameters using a population approach, in order to improve model-derived predictions. This led us to a simplified model with only one subject-specific parameter (and one population-level parameter), the “reduced Gompertz” model [11]. Using population distributions as priors makes it possible to generate predictions for new subjects by means of Bayesian algorithms [36, 37, 38]. The added value of the latter method is that only a few measurements per individual are necessary to obtain reliable predictions. In contrast with previous work focusing on the forward prediction of the size of a tumor [15], the present study addresses the backward problem, i.e. the estimation of the age of a tumor [39]. This question is of fundamental importance in the clinic since the age of a tumor can be used as a proxy for determination of the invisible metastatic burden at diagnosis [24]. In turn, this estimation has critical implications for deciding the extent of adjuvant therapy [40]. Since predictions of the initiation time of tumors can hardly be verified in clinical cases, we developed and validated our method using experimental data from multiple data sets in several animal models. This setting provided enough measurements, over a long enough time frame, to assess the predictive power of the methods.

Materials and methods

The python code and the data used in our analysis are available at https://github.com/cristinavaghi/plumky.

Ethics statement

Animal tumor model studies were performed in strict accordance with the recommendations in the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health. Protocols used were approved by the Institutional Animal Care and Use Committee (IACUC) at Tufts University School of Medicine for studies using murine Lewis lung carcinoma (LLC) cells (Protocol: #P11-324) and at Roswell Park Cancer Institute (RPCI) for studies using human LM2-4LUC+ breast carcinoma cells (Protocol: 1227M). Institutions are AAALAC accredited and every effort was made to minimize animal distress [15]. For the breast data measured by fluorescence, guidelines for animal welfare in experimental oncology as recommended by European regulations (decree 2013-118 of February 1, 2013) were followed. All animal experiments were approved by the Animal Ethic Committee of the Aix-Marseille Université (CE14). The protocol was registered as #2017031717108767 at the French Ministry of Research. Mice were monitored daily for signs of distress, pain, decreased physical activity, or any behavioral change and weighed three times a week. Water was supplemented with paracetamol (80 mg/kg/day) to prevent any metastasis-related pain [41].

Mice experiments

The experimental data comprised three data sets. Animal tumor model studies were performed in strict accordance with guidelines for animal welfare in experimental oncology and were approved by local ethics committees. Precise description of experimental protocols was reported elsewhere (see [15] for the volume measurements and [41] for the fluorescence measurements).

Breast data measured by volume (N = 66)

This dataset is publicly available at the following repository [42]. It consisted of human LM2-4LUC+ triple negative breast carcinoma cells originally derived from MDA-MB-231 cells. Animal studies were performed as described previously under Roswell Park Comprehensive Cancer Center (RPCCC) Institutional Animal Care and Use Committee (IACUC) protocol number 1227M [15, 24]. Briefly, animals were orthotopically implanted with LM2-4LUC+ cells (10⁶ cells at injection) into the right inguinal mammary fat pads of 6- to 8-week-old female severe combined immunodeficient (SCID) mice. Tumor size was measured regularly with calipers to a maximum volume of 2 cm³, calculated by the formula V = (π/6) w² L (ellipsoid), where L is the largest and w is the smallest tumor diameter. The data were pooled from eight experiments, for a total of 581 observations. All LM2-4LUC+ implanted animals used in this study are vehicle-treated animals from published studies [15, 24]. Vehicle formulation was carboxymethylcellulose sodium (USP, 0.5% w/v), NaCl (USP, 1.8% w/v), Tween-80 (NF, 0.4% w/v), benzyl alcohol (NF, 0.9% w/v), and reverse osmosis deionized water (added to final volume), adjusted to pH 6 (see [43]), and was given at 10 ml/kg/day for 7-14 days prior to tumor resection.

Breast data measured by fluorescence (N = 8)

This dataset is publicly available at the following repository [44]. It consisted of human MDA-MB-231 cells stably transfected with dTomato lentivirus. Animals were orthotopically implanted (80,000 cells at injection) into the mammary fat pads of 6-week-old female nude mice. Tumor size was monitored regularly with fluorescence imaging. The data comprised a total of 64 observations. To recover the fluorescence value corresponding to the injected cells, we computed the ratio between the fluorescence signal and the volume measured in mm³. We used linear regression on the volume data of a different data set with the same experimental setup (mice, tumor type and number of injected cells). The estimated ratio was 1.52 × 10⁹ photons/(s · mm³) with a relative standard error of 11.3%; therefore, the initial fluorescence signal was 1.22 × 10⁷ photons/s.

Lung data measured by volume (N = 20)

This dataset is publicly available at the following repository [45]. It consisted of murine Lewis lung carcinoma cells originally derived from a spontaneous tumor in a C57BL/6 mouse [46]. Animals were implanted subcutaneously (10⁶ cells at injection) on the caudal half of the back in anesthetized 6- to 8-week-old C57BL/6 mice. Tumor size was measured as described for the breast data to a maximum volume of 1.5 cm³. The data were pooled from two experiments, for a total of 188 observations.

Tumor growth models

We denote by $t_I$ and $V_I$ the initial conditions of the equation. At time of injection (t = 0), we assumed that all tumors within a group had the same size/volume $V_{inj}$ (equal to the number of injected cells converted into the appropriate unit) and denoted by α the specific growth rate (i.e. $\frac{1}{V}\frac{dV}{dt}$) at this time and size. We considered the exponential, logistic and Gompertz models [15]. The first two are respectively defined by the following equations

$$\frac{dV}{dt} = \alpha V, \qquad \frac{dV}{dt} = \rho V \left(1 - \frac{V}{K}\right), \qquad V(t_I) = V_I. \qquad (2)$$

In the logistic equation, K is a carrying capacity parameter. It expresses a maximal reachable size due to competition between the cells (e.g. for space or nutrients). The quantity $\rho = \alpha \frac{K}{K - V_{inj}}$ is a coefficient related to the growth rate. For small values of $V_{inj}$, ρ tends to α. The Gompertz model is characterized by an exponential decrease of the specific growth rate with rate denoted here by β. Although multiple expressions and parameterizations coexist in the literature, the definition we adopted here reads as follows:

$$\frac{dV}{dt} = \left(\alpha - \beta \log\left(\frac{V}{V_{inj}}\right)\right) V, \qquad V(t_I) = V_I. \qquad (3)$$

Note that the injected volume $V_{inj}$ appears in the differential equation defining V. This is a natural consequence of our assumption of α as being the specific growth rate at $V = V_{inj}$. This model exhibits sigmoidal growth up to a saturating value given by $K = V_{inj} e^{\alpha/\beta}$. Note also that the value of K in the Gompertz model is independent of the initial data $(t_I, V_I)$. The latter was considered to be $(0, V_{inj})$ when performing population analysis, while it was set equal to the observation of an animal i for backward prediction (see section Individual predictions).
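
The three candidate models have closed-form solutions when the initial condition is the injection point (0, Vinj). The sketch below writes them out under the definitions given above (α as the specific growth rate at Vinj, β as the decay rate of the specific growth rate, K as the logistic carrying capacity). Function names and default arguments are illustrative assumptions and are not taken from the authors' repository.

```python
import math

def exponential(t, alpha, v_inj=1.0):
    """Solution of dV/dt = alpha * V with V(0) = v_inj."""
    return v_inj * math.exp(alpha * t)

def logistic(t, rho, K, v_inj=1.0):
    """Solution of dV/dt = rho * V * (1 - V / K) with V(0) = v_inj."""
    return K / (1.0 + (K / v_inj - 1.0) * math.exp(-rho * t))

def gompertz(t, alpha, beta, v_inj=1.0):
    """Solution of dV/dt = (alpha - beta * log(V / v_inj)) * V with V(0) = v_inj."""
    return v_inj * math.exp(alpha / beta * (1.0 - math.exp(-beta * t)))

def gompertz_plateau(alpha, beta, v_inj=1.0):
    """Saturating size of the Gompertz model, v_inj * exp(alpha / beta)."""
    return v_inj * math.exp(alpha / beta)

# Illustrative parameter values only (same order of magnitude as daily rates).
v_day_20 = gompertz(20.0, alpha=0.58, beta=0.072)
```

All three functions return `v_inj` at `t = 0`, and `gompertz` approaches `gompertz_plateau` for large `t`, which is a quick sanity check on the parameterization.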

Population approach

Let N be the number of subjects within a population (group) and $Y^i = \{y^i_1, \ldots, y^i_{n^i}\}$ the vector of longitudinal measurements in animal i, where $y^i_j$ is the observation of subject i at time $t^i_j$ for i = 1, …, N and j = 1, …, $n^i$ ($n^i$ is the number of measurements of individual i). We assumed the following observation model

$$y^i_j = f(t^i_j; \theta^i) + e^i_j, \qquad (4)$$

where $f(t^i_j; \theta^i)$ is the evaluation of the tumor growth model at time $t^i_j$, $\theta^i$ is the vector of the parameters relative to the individual i and $e^i_j$ the residual error model, to be defined later. An individual parameter vector $\theta^i$ depends on fixed effects μ, identical within the population, and on a random effect $\eta^i$, specific to each animal. Random effects follow a normal distribution with mean zero and variance matrix ω. Specifically:

$$\log(\theta^i) = \log(\mu) + \eta^i, \qquad \eta^i \sim \mathcal{N}(0, \omega).$$

The choice of a log-normal distribution ensured the positivity of the parameters without adding any constraint. Moreover, the ratio of two log-normal distributions is a log-normal distribution. We considered a combined residual error model $e^i_j$, defined as

$$e^i_j = \left(\sigma_1 + \sigma_2 f(t^i_j; \theta^i)\right) \varepsilon^i_j,$$

where $\varepsilon^i_j \sim \mathcal{N}(0, 1)$ are the residual errors and σ = [σ1, σ2] is the vector of the residual error model parameters. In order to compute the population parameters, we maximized the population likelihood, obtained by pooling all the data together. Usually, this likelihood cannot be computed explicitly for nonlinear mixed-effects models. We used the stochastic approximation expectation maximization (SAEM) algorithm [17], implemented in the Monolix 2018 R2 software [47]. This algorithm is a variation of the EM algorithm, where the expectation step is replaced by a stochastic approximation of the likelihood function [48]. This method has been proven to efficiently converge to the maximum likelihood estimator for nonlinear mixed-effects models [17]. In the remainder of the manuscript we will denote by ϕ = {μ, ω, σ} the set of the population parameters containing the fixed effects μ, the covariance of the random effects ω and the error model parameters σ.
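
To make the observation model concrete, the following sketch simulates a small synthetic population: individual parameters are drawn from log-normal distributions around the fixed effects, and each observation is the model value plus a combined (constant plus proportional) error. It is an illustration under simplifying assumptions, not the estimation procedure itself. In particular, the random effects are drawn independently (the analysis reports a strong correlation between them), the numerical values are merely of the order of those in Table 2, and all names are hypothetical.

```python
import math
import random

def gompertz(t, alpha, beta, v_inj=1.0):
    """Gompertz curve started at (0, v_inj), alpha = specific growth rate at v_inj."""
    return v_inj * math.exp(alpha / beta * (1.0 - math.exp(-beta * t)))

def sample_individual_params(mu, omega_sd, rng):
    """Log-normal draw: log(theta) = log(mu) + eta, eta ~ N(0, omega_sd^2).

    Assumption: random effects are independent across parameters.
    """
    return {name: mu[name] * math.exp(rng.gauss(0.0, omega_sd[name])) for name in mu}

def simulate_observations(times, theta, sigma, rng, v_inj=1.0):
    """Combined error model: y = f + (sigma1 + sigma2 * f) * eps, eps ~ N(0, 1)."""
    sigma1, sigma2 = sigma
    observations = []
    for t in times:
        f = gompertz(t, theta["alpha"], theta["beta"], v_inj)
        observations.append(f + (sigma1 + sigma2 * f) * rng.gauss(0.0, 1.0))
    return observations

rng = random.Random(0)                   # fixed seed for reproducibility
mu = {"alpha": 0.58, "beta": 0.072}      # typical values (illustrative)
omega_sd = {"alpha": 0.19, "beta": 0.26} # standard deviations of random effects
sigma = (20.5, 0.11)                     # [sigma1, sigma2]
times = [10.0, 15.0, 20.0, 25.0, 30.0]

population = [sample_individual_params(mu, omega_sd, rng) for _ in range(5)]
data = [simulate_observations(times, theta, sigma, rng) for theta in population]
```

Because the parameters are exponentials of Gaussian draws, they are positive by construction, which is the property the log-normal choice is meant to guarantee.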

Individual predictions

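For a given animal i, the backward prediction problem we considered was to predict the age of the tumor based on the three last measurements $y^i_{n^i-2}, y^i_{n^i-1}, y^i_{n^i}$. Since we were in an experimental setting, we considered the injection time as the initiation time and thus the age was given by $a^i = t^i_{n^i-2} - t_{inj}$. Then, we considered as model $f(t; \theta^i)$ the solution of the Cauchy problem (3) endowed with initial conditions $(t_I, V_I)$ given by the first of these three measurements. For estimation of the parameters (estimate $\hat{\theta}^i$), we applied two different methods: likelihood maximization alone (no use of prior population information) and Bayesian inference (use of prior). The predicted age $\hat{a}^i$ was then defined by $f(t_I - \hat{a}^i; \hat{\theta}^i) = V_{inj}$, that is:

$$\hat{a}^i = -\frac{1}{\hat{\beta}^i} \log\left(1 - \frac{\hat{\beta}^i}{\hat{\alpha}^i} \log\left(\frac{V_I}{V_{inj}}\right)\right) \qquad (5)$$

in case of the Gompertz model.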
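
For the Gompertz model, the age follows in closed form from the linear equation satisfied by u = log(V/Vinj), namely du/dt = α − βu. The sketch below implements the curve through an arbitrary observed point and the corresponding age, then checks that running the curve backward by the computed age returns the injected volume. Function and variable names are illustrative assumptions.

```python
import math

def gompertz_from(t, t_i, v_i, alpha, beta, v_inj=1.0):
    """Gompertz solution passing through (t_i, v_i).

    With u = log(V / v_inj), the model reads du/dt = alpha - beta * u, hence
    u(t) = alpha/beta + (u_i - alpha/beta) * exp(-beta * (t - t_i)).
    """
    u_i = math.log(v_i / v_inj)
    u = alpha / beta + (u_i - alpha / beta) * math.exp(-beta * (t - t_i))
    return v_inj * math.exp(u)

def gompertz_age(v_i, alpha, beta, v_inj=1.0):
    """Time needed to grow from v_inj to v_i under the Gompertz model."""
    arg = 1.0 - beta / alpha * math.log(v_i / v_inj)
    if arg <= 0.0:
        # v_i at or above the plateau v_inj * exp(alpha / beta): no finite age
        raise ValueError("observed size is not below the Gompertz plateau")
    return -math.log(arg) / beta

# Round trip with illustrative parameters: a tumor observed 20 days after injection.
alpha, beta = 0.58, 0.072
v_obs = gompertz_from(20.0, 0.0, 1.0, alpha, beta)
age = gompertz_age(v_obs, alpha, beta)
```

The guard on `arg` reflects a practical limitation of backward prediction: if the estimated plateau lies below the observed size, the model cannot reach that size and no age can be computed.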

Likelihood maximization

For individual predictions with likelihood maximization, no prior information on the distribution of the parameters was used. Parameters of the error model were not re-estimated: values from the population analysis were used. The log-likelihood can be derived from (4): where is the likelihood of the observation of the animal i at time . In order to guarantee the positivity of the parameters, we introduced the relation and substituted this in Eq (6). The negative of Eq (6) was minimized with respect to (yielding the maximum likelihood estimate ) with the function minimize of the python module scipy.optimize, for which the Nelder-Mead algorithm was applied. Thanks to the invariance property, the maximum likelihood estimator of was determined as . Individual prediction intervals were computed by sampling the parameters from a gaussian distribution with variance-covariance matrix of the estimate defined as where , with p the number of parameters (and the factor 3 in the denominator because this is the number of observations), the Fisher information matrix and the gradient of the function g() evaluated in the estimate . Denoting by and by , the Fisher information matrix was defined by [49]:
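
A minimal sketch of this likelihood-maximization step is given below, using `scipy.optimize.minimize` with `method="Nelder-Mead"` as stated in the text. Several choices are assumptions made for illustration: the positivity transform is taken to be the exponential (parameters are optimized on the log scale), the initial condition is the first of the three observations, the error-model values are of the order of those in Table 2, and the data are synthetic and noise-free. The prediction-interval computation from the Fisher information matrix is not reproduced here.

```python
import numpy as np
from scipy.optimize import minimize

V_INJ = 1.0
SIGMA = (20.5, 0.11)  # [sigma1, sigma2], kept fixed (not re-estimated)

def gompertz_from(t, t_i, v_i, alpha, beta):
    """Gompertz solution through (t_i, v_i), alpha = growth rate at V_INJ."""
    u_i = np.log(v_i / V_INJ)
    u = alpha / beta + (u_i - alpha / beta) * np.exp(-beta * (t - t_i))
    return V_INJ * np.exp(u)

def neg_log_likelihood(psi, times, obs):
    """Negative Gaussian log-likelihood with a combined error model.

    psi = log(alpha, beta); assumed transform to keep parameters positive.
    """
    alpha, beta = np.exp(psi)
    pred = gompertz_from(times, times[0], obs[0], alpha, beta)
    sd = SIGMA[0] + SIGMA[1] * pred
    return np.sum(0.5 * ((obs - pred) / sd) ** 2 + np.log(sd)) \
        + 0.5 * len(obs) * np.log(2.0 * np.pi)

def fit_individual(times, obs, start):
    """Minimize the negative log-likelihood with the Nelder-Mead simplex."""
    result = minimize(neg_log_likelihood, x0=np.log(start),
                      args=(times, obs), method="Nelder-Mead")
    return np.exp(result.x), result

# Synthetic, noise-free example: three late measurements of one animal.
true_alpha, true_beta = 0.58, 0.072
times = np.array([20.0, 23.0, 26.0])
obs = V_INJ * np.exp(true_alpha / true_beta * (1.0 - np.exp(-true_beta * times)))
start = (0.5, 0.06)
theta_hat, result = fit_individual(times, obs, start)
```

With only three points, one of which fixes the initial condition, the two Gompertz parameters are informed by two observations, so the fit is fragile. This is consistent with the poor performance of likelihood maximization alone reported in the Results.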

Bayesian inference

When applying the Bayesian method, we considered training sets to learn the distribution of the parameters ϕ and test sets to derive individual predictions. For a given animal i of a test set, we predicted the age of the tumor based on the combination of: 1) population parameters ϕ identified on the training set using the population approach and 2) the three last measurements of animal i. We set as initial conditions $t_I = 0$ and $V_I$ given by the first of these three measurements, $y^i_{n^i-2}$. We considered the initial volume $V_I$ to be a random variable to account for measurement uncertainty on $y^i_{n^i-2}$. We then estimated the posterior distribution of the parameters $\theta^i$ using a Bayesian approach [37]:

$$p(\theta^i \mid y^i) = \frac{p(\theta^i)\, p(y^i \mid \theta^i)}{p(y^i)}, \qquad (8)$$

where $p(\theta^i)$ is the prior distribution of the parameters estimated through nonlinear mixed-effects modeling (i.e., the population parameters ϕ), $p(y^i \mid \theta^i)$ is the likelihood, defined from Eq (4), and $p(y^i)$ is a normalization factor. The predicted distributions of extrapolated growth curves and subsequent $\hat{a}^i$ were computed by sampling $\theta^i$ from its posterior distribution (8) using Pystan, a Python interface to the software Stan [38] for Bayesian inference based on the No-U-Turn sampler, a variant of Hamiltonian Monte Carlo [36]. The sampling procedure depends on the evaluation of the likelihood, which itself relies on $V_I$. Therefore, $V_I$ was sampled from its distribution for each realization of the posterior distribution. Predictions of $\hat{a}^i$ were then obtained from (5), considering the median value of the distribution. Different data sets were used for learning the priors (training sets) and prediction (test sets) by means of k-fold cross validation, with k equal to the total number of animals of the dataset (k = N, i.e. leave-one-out strategy). At each iteration we computed the parameter distribution of the population composed of N − 1 individuals and used this as prior to predict the initiation time of the excluded subject i. The Stan software was used to draw 2000 realizations from the posterior distribution of the parameters of the individual i.
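
The idea of combining a population prior with three late measurements can be illustrated without a sampler when there is a single individual parameter, as in the reduced Gompertz model introduced in the Results (α = kβ, with k shared by the population). The sketch below deliberately swaps Stan's No-U-Turn sampler for a plain grid approximation of the posterior of β, which is only practical in one dimension. Further simplifying assumptions: the initial volume is fixed to the first observation instead of being treated as a random variable, the prior and error-model values are of the order of those in Table 2, and the data are synthetic and noise-free. All names are hypothetical.

```python
import math

K_POP = 7.87                       # population-level constant k (illustrative)
MU_BETA, OMEGA_BETA = 0.075, 0.13  # log-normal prior on beta (illustrative)
SIGMA1, SIGMA2 = 14.8, 0.17        # combined error model (illustrative)
V_INJ = 1.0

def reduced_gompertz_from(t, v_i, beta):
    """Reduced Gompertz curve through (0, v_i): u' = beta * (k - u), u = log(V/V_INJ)."""
    u_i = math.log(v_i / V_INJ)
    return V_INJ * math.exp(K_POP + (u_i - K_POP) * math.exp(-beta * t))

def log_prior(beta):
    """Gaussian log-density (up to a constant) of log(beta)."""
    z = (math.log(beta) - math.log(MU_BETA)) / OMEGA_BETA
    return -0.5 * z * z

def log_likelihood(beta, times, obs):
    """Gaussian log-likelihood of the 2nd and 3rd points; the 1st sets the curve."""
    total = 0.0
    for t, y in zip(times[1:], obs[1:]):
        f = reduced_gompertz_from(t - times[0], obs[0], beta)
        sd = SIGMA1 + SIGMA2 * f
        total += -0.5 * ((y - f) / sd) ** 2 - math.log(sd)
    return total

def age(v_i, beta):
    """Time to grow from V_INJ to v_i (requires V_INJ < v_i < V_INJ * exp(k))."""
    return -math.log(1.0 - math.log(v_i / V_INJ) / K_POP) / beta

def posterior_age(times, obs, n_grid=2001):
    """Posterior median and 95% interval of the age on a uniform grid in log(beta)."""
    lo = math.log(MU_BETA) - 5.0 * OMEGA_BETA
    hi = math.log(MU_BETA) + 5.0 * OMEGA_BETA
    betas = [math.exp(lo + (hi - lo) * i / (n_grid - 1)) for i in range(n_grid)]
    logp = [log_prior(b) + log_likelihood(b, times, obs) for b in betas]
    top = max(logp)
    weights = [math.exp(lp - top) for lp in logp]
    total = sum(weights)
    pairs = sorted(zip((age(obs[0], b) for b in betas), weights))

    def quantile(q):
        acc = 0.0
        for a, w in pairs:
            acc += w
            if acc >= q * total:
                return a
        return pairs[-1][0]

    return quantile(0.5), (quantile(0.025), quantile(0.975))

# Synthetic animal whose first of three late measurements is taken at day 20.
true_beta = 0.075
times = [20.0, 23.0, 26.0]
obs = [V_INJ * math.exp(K_POP * (1.0 - math.exp(-true_beta * t))) for t in times]
median_age, interval = posterior_age(times, obs)
```

In a leave-one-out setting, the prior values above would be re-estimated on the N − 1 remaining animals before each prediction; here they are simply fixed constants.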

Results

Results were similar for the three data sets presented in the materials and methods. For conciseness, the results presented below are related to the largest dataset (breast cancer data measured by volume). Results relative to the other datasets are reported in S1–S10 Figs and S1–S4 Tables.

Population analysis of tumor growth curves

The population approach was applied to test the descriptive power of the exponential, logistic and Gompertz models for tumor growth kinetics. The number of injected cells at time $t_{inj} = 0$ was 10⁶, therefore we fixed the initial volume $V_{inj}$ = 1 mm³ in the whole dataset [15]. We set $(t_I, V_I) = (t_{inj}, V_{inj})$ as initial condition of the equations. We ran the SAEM algorithm with the Monolix software to estimate the fixed and random effects [47]. Moreover, we evaluated different statistical indices in order to compare the different tumor growth models. This also allowed learning of the parameter population distributions that were used later as priors for individual predictions. Results are reported in Table 1, where the models are ranked according to their AIC (Akaike Information Criterion), a metric combining parsimony and goodness-of-fit. The Gompertz model was the one with the lowest values, indicating superior goodness-of-fit. This was confirmed by diagnostic plots (Fig 1). The visual predictive checks (VPCs) in Fig 1A compare the empirical percentiles with the theoretical percentiles, i.e. those obtained from simulations of the calibrated models. The VPC of the exponential and logistic models showed clear model misspecification. On the other hand, the VPC of the Gompertz model was excellent, with observed percentiles close to the predicted ones and small prediction intervals (indicative of correct identifiability of the parameters). Fig 1B shows the prediction distributions of the three models, which allow comparison of the observations with the theoretical distribution of the predictions. Only the prediction distribution of the Gompertz model covered the entire dataset. The logistic model exhibited a saturation of tumor dynamics at values lower than those compatible with the data.

Table 1

Models ranked in ascending order of AIC (Akaike information criterion).

Other statistical indices are the log-likelihood estimate (-2LL) and the Bayesian information criterion (BIC). *The reduced Gompertz model is introduced below.

Model	-2LL	AIC	BIC
Gompertz	7129	7143	7158
Reduced Gompertz*	7259	7269	7280
Logistic	7584	7596	7609
Exponential	8652	8660	8669
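
The information criteria in Table 1 can be recomputed from the −2LL column. The sketch below assumes AIC = −2LL + 2p and BIC = −2LL + p·ln(N), with p the number of estimated population parameters and N = 66 the number of animals. The parameter counts (fixed effects, random-effect standard deviations, one correlation for the Gompertz model, and two error parameters) and the use of the number of subjects in the BIC penalty are inferences from the reported values, not statements from the text.

```python
import math

N_SUBJECTS = 66  # animals in the breast (volume) dataset

# model name -> (-2LL from Table 1, assumed number of population parameters)
MODELS = {
    "Gompertz": (7129, 7),          # alpha, beta, 2 omegas, 1 correlation, 2 sigmas
    "Reduced Gompertz": (7259, 5),  # beta, k, 1 omega, 2 sigmas
    "Logistic": (7584, 6),          # rho, K, 2 omegas, 2 sigmas
    "Exponential": (8652, 4),       # alpha, 1 omega, 2 sigmas
}

def aic(minus_2ll, n_params):
    """Akaike information criterion."""
    return minus_2ll + 2 * n_params

def bic(minus_2ll, n_params, n_subjects):
    """Bayesian information criterion, penalty based on the number of subjects."""
    return minus_2ll + n_params * math.log(n_subjects)

table = {
    name: (aic(m2ll, p), round(bic(m2ll, p, N_SUBJECTS)))
    for name, (m2ll, p) in MODELS.items()
}
ranking = sorted(table, key=lambda name: table[name][0])
```

Under these assumptions the recomputed values agree with the AIC and BIC columns of Table 1 after rounding, and `ranking` lists the models in the same order as the table.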

Fig 1

Population analysis of experimental tumor growth kinetics.

(A) Visual predictive checks assess goodness-of-fit for both structural dynamics and inter-animal variability by reporting model-predicted percentiles (together with their prediction intervals, P.I.) in comparison to empirical ones. They were obtained by multiple simulations of each model. The time axis was then split into bins and in each interval the empirical percentiles of the observed data were compared with the respective predicted medians and intervals of the simulated data [47]. (B) Prediction distributions. They were obtained by multiple simulations of all individuals in the dataset, excluding the residual error [47]. (C) Individual weighted residuals (IWRES) with respect to time. (D) Observations vs predictions. Left: exponential, Center: logistic, Right: Gompertz models.

Moreover, the distribution of the residuals was symmetrical around a mean value of zero with the Gompertz model (Fig 1C), strengthening its good descriptive power, while the exponential and logistic models exhibited clearly skewed distributions. The observations vs individual predictions in Fig 1D further confirmed these findings. These observations at the population level were confirmed by individual fits, computed from the mode of the posterior conditional parameter distribution for each individual (Fig 2). Confirming previous results [15], the optimal fits of the exponential and logistic models were unable to give an appropriate description of the data, suggesting that these models should not be used to describe tumor growth, at least in settings similar to ours. Fitting of late timepoints forced the proliferation parameter of the exponential model to converge towards a rather low estimate, preventing reliable description of the early datapoints. The converse occurred for the logistic model. Because the early data points imposed the pace of the growth deceleration on the model, the resulting estimate of the carrying capacity K was biologically irrelevant (much too small, typical value 1303 mm³, see Table 2), preventing the model from giving a good description of the late growth.

Fig 2

Individual fits from population analysis.

Table 2

Fixed effects (typical values) of the parameters of the different models.

Par. = parameter. ω = standard deviation of the random effects. R.S.E. = relative standard errors of the estimates. σ = residual error model parameters. *The reduced Gompertz model is introduced below.

Model	Par.	Unit	Fixed effects	ω	R.S.E. (%)
Gompertz	α	day⁻¹	0.58	0.19	2.51
	β	day⁻¹	0.072	0.26	3.42
	σ	-	[20.5, 0.11]	-	[16.9, 7.53]
Reduced Gompertz*	β	day⁻¹	0.075	0.13	1.74
	k	-	7.87	-	0.21
	σ	-	[14.8, 0.17]	-	[19.3, 5.32]
Logistic	ρ	day⁻¹	0.325	0.138	1.82
	K	mm³	1303	0.25	3.81
	σ	-	[58.9, 0.12]	-	[8.97, 9.14]
Exponential	α	day⁻¹	0.231	0.08	1.38
	σ	-	[272, 0.26]	-	[6.10, 15.1]

Three representative examples of individual fits (animal (A), animal (B) and animal (C)) computed with the population approach relative to the exponential (left), the logistic (center) and the Gompertz (right) models.

Table 2 provides the values of the population parameters. The relative standard error estimates associated with the population parameters were all rather low (<3.81%), indicating good practical identifiability of the model parameters. Relative standard errors of the constant error model parameters were found to be slightly larger (<19.3%), suggesting that for some models (but not for the exponential model) a proportional error model might have been more appropriate. Since our aim was to compare different tumor growth equations, we used a common error model for all of them, namely a combined error model. Relative standard errors of the standard deviations of the random effects were all smaller than 9.6% (not shown). These model findings in the breast cancer cell line were further validated with the other cell lines. For both the lung cancer and the fluorescence-breast cancer cell lines, the Gompertz model outperformed the other competing models (see S1 and S2 Tables for goodness-of-fit metrics, and S3 and S4 Tables for parameter values), as also shown by the diagnostic plots (S1 and S2 Figs). Individual plots confirmed these observations and are provided in S3 and S4 Figs. For the fluorescence-breast cancer cell line the constant part of the error model was found to be negligible and we used a proportional error model (i.e., we fixed σ1 = 0). The value of σ2 was found to be particularly high for the exponential model (S4 Table), which resulted in inappropriate fits (S2 and S4 Figs), further supporting rejection of this model. Estimated inter-individual variability for the other models was found to be small. This was probably due to the small number of animals in the data set.
Together, these results confirmed that the exponential and logistic models are not appropriate models of tumor growth while the Gompertz model has excellent descriptive properties, in terms of both goodness-of-fit and parameter identifiability.

The reduced Gompertz model

Correlation between the Gompertz parameters

During the estimation process of the Gompertz parameters, we found a high correlation between α and β within the population. At the population level, the SAEM algorithm estimated a correlation of the random effects equal to 0.981. At the individual level, α and β were also highly linearly correlated (Fig 3A, R² = 0.968), confirming previous results [6, 11, 10, 12, 50]. This motivated the reformulation of the α parameter as follows:

$$\alpha = k\beta + c, \qquad (9)$$

where k and c represent the slope and the intercept of the regression line, respectively. In our analysis we found c to be small (c = 0.14), thus we further assumed this term to be negligible and fixed it to 0. This suggests k as a constant of tumor growth within a given animal model with similar characteristics (note however that from (3), k depends on $V_{inj}$) [11, 51]. In turn, this implies an approximately constant limiting size

$$K = V_{inj}\, e^{\alpha/\beta} \simeq V_{inj}\, e^{k}.$$

Fig 3

Correlation of the Gompertz parameters and diagnostic plots of the reduced Gompertz model from population analysis.

Correlation between the individual parameters of the Gompertz model (A) and results of the population analysis of the reduced Gompertz model: visual predictive check (B), scatter plots of the residuals (C), prediction distribution (D) and examples of individual fits (E).

The other data sets gave analogous results in terms of goodness of fit and correlation between α and β, even though the constant limiting size was found to differ between the three cell lines. The estimated correlations of the random effects were 0.967 and 0.998 for the lung cancer and for the fluorescence-breast cancer, respectively. The correlation between the parameters was also confirmed at the individual level (see S5A and S6A Figs, R² was 0.923 and 0.99 for the two data sets, respectively).

Biological interpretation in terms of the proliferation rate

By definition, the parameter α is the specific growth rate (SGR) at the volume $V_{inj}$, simply assumed to be the volume corresponding to the number of injected cells within a given animal model (e.g. $V_{inj}$ = 1 for the breast data measured by volume). Assuming that the cells do not change their proliferation kinetics when implanted, this value should thus in theory be equal to the in vitro proliferation rate (supposed to be the same for all the cells of the same cell line), denoted here by λ. The value of this biological parameter was assessed in vitro and estimated at 0.837 [24]. In support of our quantitative assumptions, we indeed found estimated values of α of the same order as λ (fixed effect of 0.58, see Table 2). However, α was smaller than λ in the majority of cases (Fig 3A). We postulated that this difference could be explained by the fact that not all the cells are successfully grafted when injected into an animal. Under such an assumption the SGR at the initial time, to be compared with λ, would not be given by α anymore. Instead, denoting by $V_c$ the (unknown) volume of the successfully grafted cells, and assuming further that the SGR at initiation is fixed and given by λ, leads to the following reformulation of the Gompertz model

$$\frac{dV}{dt} = \left(\lambda - \beta \log\left(\frac{V}{V_c}\right)\right) V.$$

In turn, fitting this model to the data provides estimates of the percentage of successful engraftment of 7% ± 12.5% (mean ± standard deviation). Alternatively, these results might also be explained by a time lag between the cell implantation and the initiation of tumor growth, due to the time needed by the cells to adapt to the new environment [52]. However, the two interpretations are indistinguishable in our case and might require a more elaborate analysis with specific data.

Population analysis of the reduced Gompertz model

The high correlation among the Gompertz parameters suggested that a reduction of the degrees of freedom (number of parameters) in the Gompertz model could improve identifiability and yield a more parsimonious model. We considered the expression (9), assuming c to be negligible. We therefore propose the following reduced Gompertz model:

$$\frac{dV}{dt} = \left(\beta k - \beta \log\left(\frac{V}{V_{inj}}\right)\right) V, \qquad (10)$$

where β has mixed effects, while k has only fixed effects, i.e., is constant within the population. Fig 3 shows the results relative to the population analysis of this reduced Gompertz model. Results of the diagnostic plots indicated no deterioration of the goodness-of-fit as compared with the Gompertz model (Fig 3B–3D). Only at the last timepoint did the model slightly underestimate the data (Fig 3D), which might explain why the model performs slightly worse than the two-parameter Gompertz model in terms of strictly quantitative statistical indices (but still better than the logistic or exponential models, Table 1). Individual dynamics were also accurately described (Fig 3E). Parameter identifiability was also excellent (Table 2). The other two data sets gave similar results (see S5 and S6 Figs). Together, these results demonstrated the accuracy of the reduced Gompertz model, with improved robustness as compared to previous models.
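
With α replaced by kβ, the closed-form Gompertz curve started at the injected volume depends on the individual only through β, and its plateau Vinj·exp(k) is shared by the whole population. The sketch below evaluates both, using the typical values reported in Table 2 for the breast data measured by volume (β = 0.075 day⁻¹, k = 7.87, Vinj = 1 mm³). Function names are illustrative assumptions.

```python
import math

def reduced_gompertz(t, beta, k, v_inj=1.0):
    """Reduced Gompertz curve started at (0, v_inj): alpha is replaced by k * beta."""
    return v_inj * math.exp(k * (1.0 - math.exp(-beta * t)))

def limiting_size(k, v_inj=1.0):
    """Plateau of the reduced Gompertz model, independent of the individual beta."""
    return v_inj * math.exp(k)

K_POP = 7.87                      # population constant k (Table 2)
plateau = limiting_size(K_POP)    # in mm^3 for v_inj = 1 mm^3

# Two hypothetical animals with different beta share the same plateau.
slow = reduced_gompertz(1e4, beta=0.06, k=K_POP)
fast = reduced_gompertz(1e4, beta=0.09, k=K_POP)
```

With these values the plateau is about 2.6 × 10³ mm³, which lies above the 2 cm³ maximum measured volume, whereas the logistic carrying capacity estimated in Table 2 (1303 mm³) lies below it.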

Prediction of the age of a tumor

Considering the increased robustness of the reduced Gompertz model (one individual parameter fewer than the Gompertz model), we further investigated its potential for improving predictive power. We considered the problem of estimating the age of a tumor, that is, the time elapsed between initiation (here the time of injection) and detection occurring at a larger tumor size (Fig 4). For a given animal i, we used only the last three measurements, considered the earliest of them as the first observation, and aimed to predict its age (see Methods). We compared the results given by Bayesian inference with those computed with the standard likelihood maximization method (see Methods). For the latter, we did not use any information on the distribution of the parameters. For the reduced Gompertz model, however, we used (in the likelihood maximization case) the value of k calculated in the previous section (Table 2), thus using information on the entire population. Importantly, for both prediction approaches, our methods allowed us not only to generate a prediction of the age, from which the model accuracy was assessed (i.e. absolute relative error of prediction), but also to estimate the uncertainty of the predictions (i.e. precision, measured by the width of the 95% prediction interval (PI)).
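
The deterministic core of such a backward prediction can be sketched as follows. This is a minimal illustration, not the estimation pipeline used in the study: it assumes the closed-form reduced Gompertz curve V(t) = Vinj exp(k (1 − exp(−β t))), and the function names as well as the numerical values of beta and k are hypothetical. In the actual analysis, β is estimated from the last three measurements (with or without a population prior), and the uncertainty on the age follows from that estimation.

```python
import math


def reduced_gompertz_volume(t, beta, k, v_inj=1.0):
    """Volume at time t after injection under the assumed closed-form curve."""
    return v_inj * math.exp(k * (1.0 - math.exp(-beta * t)))


def reduced_gompertz_age(v, beta, k, v_inj=1.0):
    """Time since injection at which the assumed curve reaches volume v.

    Obtained by inverting the closed form:
        t = -(1 / beta) * ln(1 - ln(v / v_inj) / k)
    The inversion is only defined for v_inj <= v < v_inj * exp(k).
    """
    u = math.log(v / v_inj)
    if not 0.0 <= u < k:
        raise ValueError("volume must lie between v_inj and the plateau v_inj * exp(k)")
    return -math.log(1.0 - u / k) / beta


# Hypothetical parameter values, for illustration only (not estimates from the study).
beta, k = 0.07, 8.0

# Volume reached 20 days after injection, then recovered age from that volume.
v_obs = reduced_gompertz_volume(20.0, beta, k)
age = reduced_gompertz_age(v_obs, beta, k)
```

Because k is shared by the whole population, a single individual parameter remains to be identified from the late measurements, which is what makes the inversion usable with very few data points.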

Fig 4

Backward predictions computed with likelihood maximization and with Bayesian inference.

Examples of backward predictions for three individuals (A), (B) and (C) computed with likelihood maximization (LM) and Bayesian inference: Gompertz model with likelihood maximization (first row); reduced Gompertz with likelihood maximization (second row); Gompertz with Bayesian inference (third row) and reduced Gompertz with Bayesian inference (fourth row). Only the last three points are considered to estimate the parameters. The grey area is the 95% prediction interval (PI) and the dotted blue line is the median of the posterior predictive distribution. The red line is the predicted initiation time and the black vertical line is the actual initiation time.

Fig 4 presents examples of predictions for three individuals without (LM) and with (Bayesian inference) priors, for the breast cancer data measured by volume. The reduced Gompertz model combined with Bayesian inference (bottom row) had the best accuracy in predicting the initiation time (mean error = 12.2%, 8.8% and 12.3% for the volume-breast cancer, lung cancer and fluorescence-breast cancer data, respectively) and the smallest uncertainty (precision = 15.6, 7.79 and 23.6 days for the three data sets, respectively). Table 3 gathers the accuracy and precision results for the Gompertz and reduced Gompertz models under LM and Bayesian inference for the three data sets. With only the local information from the last three data points, the Gompertz model predictions under LM were very inaccurate (mean error = 156%, 178% and 236%) and the Fisher information matrix was often singular, preventing standard errors from being adequately computed. With one degree of freedom fewer, the reduced Gompertz model performed better under LM estimation but still had large uncertainty (mean precision under LM = 210, 103 and 368 days) and poor accuracy (mean error = 79%, 68.9% and 91.7%). The examples shown in Fig 4 were representative of the entire population for the breast cancer data measured by volume.

Finally, for 97%, 95% and 87.5% of the individuals of the three data sets, the actual value of the age fell within the respective prediction interval when Bayesian inference was applied in combination with the reduced Gompertz model. This indicates good coverage of the prediction interval and suggests that our precision estimates were correct. In contrast, this did not hold for likelihood maximization, where the actual value fell within the respective prediction interval for only 42.4%, 35% and 75% of the animals when the reduced Gompertz model was used.
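
The three summary quantities used here (accuracy, precision and coverage of the prediction interval) can be computed as in the sketch below. It follows the definitions given in the text; the function names and the example numbers are hypothetical and are not data from the study.

```python
import math
import statistics


def absolute_relative_error(predicted_age, true_age):
    """Accuracy: absolute relative error of the predicted age, in percent."""
    return abs(predicted_age - true_age) / true_age * 100.0


def interval_width(lower, upper):
    """Precision: width of the 95% prediction interval, in days."""
    return upper - lower


def coverage(intervals, true_ages):
    """Percentage of animals whose true age falls inside their prediction interval."""
    hits = sum(
        1 for (lower, upper), age in zip(intervals, true_ages) if lower <= age <= upper
    )
    return 100.0 * hits / len(true_ages)


def mean_and_sem(values):
    """Mean and standard error of the mean, the summary format used in Table 3."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical values for three animals, for illustration only.
true_ages = [20.0, 19.0, 35.0]
predicted_ages = [22.0, 17.0, 36.0]
intervals = [(15.0, 25.0), (10.0, 18.0), (30.0, 40.0)]

errors = [absolute_relative_error(p, a) for p, a in zip(predicted_ages, true_ages)]
widths = [interval_width(lower, upper) for lower, upper in intervals]
covered = coverage(intervals, true_ages)
```

A coverage close to 95% is what one would expect from a well-calibrated 95% prediction interval, which is why coverage is reported alongside the interval width.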

Table 3

Accuracy and precision of methods for prediction of the age of experimental tumors of the three cell lines.

Cell line	Model	Estimation method	Error (%)	PI width (days)
Breast, volume	Reduced Gompertz	Bayesian	12.2 (1.05)	15.6 (0.509)
Breast, volume	Reduced Gompertz	LM	79 (13.2)	210 (58.6)
Breast, volume	Gompertz	Bayesian	16.4 (1.65)	41.1 (1.63)
Breast, volume	Gompertz	LM	156 (21.7)	-
Lung, volume	Reduced Gompertz	Bayesian	8.78 (1.43)	7.79 (0.275)
Lung, volume	Reduced Gompertz	LM	68.9 (33.1)	103 (92.6)
Lung, volume	Gompertz	Bayesian	18.9 (2.87)	19.7 (1.89)
Lung, volume	Gompertz	LM	178 (71.6)	-
Breast, fluorescence	Reduced Gompertz	Bayesian	12.3 (2.9)	23.6 (5.15)
Breast, fluorescence	Reduced Gompertz	LM	91.7 (21.1)	368 (223)
Breast, fluorescence	Gompertz	Bayesian	13.5 (3.5)	45.4 (4.43)
Breast, fluorescence	Gompertz	LM	236 (150)	-

Accuracy was defined as the absolute value of the relative error (in percent). Precision was defined as the width of the 95% prediction interval (PI column, in days). Reported are the means and standard errors (in parentheses). LM = likelihood maximization.

Addition of a priori population information by means of Bayesian estimation resulted in a drastic improvement of the prediction performances (Fig 5). This result was confirmed in the other data sets (see S7 and S8 Figs for the lung cell line and S9 and S10 Figs for the breast cell line measured by fluorescence). For the breast and lung cancer cell lines measured by volume, a Wilcoxon test was performed to compare the error distributions shown in Fig 5C and S8C Fig. For the fluorescence-breast cancer cell line, we could not report a significant difference in accuracy between the Gompertz and the reduced Gompertz models when applying Bayesian inference. This can be explained by the low number of individuals included in that data set.

Fig 5

Accuracy of the prediction models.

Swarmplots of relative errors obtained under likelihood maximization (A) or Bayesian inference (B) (* p-value < 0.05, ** p-value < 0.01, Levene’s test). (C) Absolute errors: comparison between the different distributions (* p-value < 0.05, ** p-value < 0.01, Wilcoxon test). In (A) three extreme outliers were omitted (values of the relative error were greater than 20) for both the Gompertz and the reduced Gompertz in order to ensure readability. LM = Likelihood Maximization.

Discussion

We have analyzed tumor growth curves from multiple animal models and experimental techniques, using a population framework. This approach is ideally suited for experimental or clinical data of the same tumor type within a given group of subjects. Indeed, it allows for a description of the inter-subject variability that is impossible to obtain when fitting models to averaged data (as often done for tumor growth kinetics [53]), while enabling a robust population-level description that is more informative than individual fits alone.

As expected from the classical observation of decreasing specific growth rates [6, 54, 8, 55, 56], the exponential model generated very poor fits. More surprisingly, given its popularity in the theoretical community (probably due to its ecological grounding), the logistic model was also rejected, because of the unrealistically small inferred value of the carrying capacity K. This finding confirms at the population level previous results obtained from individual fits [15, 57]. It suggests that the underlying theory (competition between the tumor cells for space or nutrients) is unable, at least when considered alone, to explain the decrease of the specific growth rate, and that additional mechanisms need to be accounted for. Indeed, the logistic model relies on space-independent cellular interactions, which might be biologically unrealistic [58].

Few studies have previously compared the descriptive performances of growth models on the same data sets [15, 59, 16]. In contrast to our results, Vaidya and Alexandro [16] found an admissible description of tumor growth data with the logistic model. Beyond the difference in animal model, we believe that the major reason for this discrepancy is the type of error model employed, as also noticed by others [57].
Here we used a combined error model, in accordance with our previous study [15], which had examined repeated measurements of tumor size and concluded that a constant error model (used in [16]) should be rejected. Moreover, statistical goodness-of-fit metrics were substantially worse when using a constant error model (e.g. AIC of 7362 versus 7129 for the Gompertz model, results not shown). To avoid overfitting, we also chose to keep the initial volume fixed to Vinj. As noted before [15], releasing this constraint leads to acceptable fits by either the exponential or logistic models (at the price of deteriorated identifiability). However, the estimated values of the initial volume are in this case biologically inconsistent.

On the other hand, the Gompertz model demonstrated excellent goodness-of-fit in all the experimental systems that we investigated. This is in agreement with a large body of previous experimental and clinical research using the Gompertz model to describe unaltered tumor growth in syngeneic [60, 6, 10, 57] and xenograft [61, 62] preclinical models, as well as human data [55, 13, 12, 8]. The poor performance of the logistic model compared to the Gompertz model can be related to the structural properties of the models. Both are sigmoid functions that lie between two asymptotes (V = 0 and V = K) and are characterized by an initial period of fast growth followed by a phase of decreasing growth. These two phases are symmetrical in the logistic model, whose specific growth rate decreases at constant speed with respect to V. In contrast, the Gompertz model exhibits a faster decrease of the specific growth rate, at a speed proportional to $1/V$ (or to $e^{-\beta t}$ as a function of t), and the sigmoidal curve is not symmetric around its inflexion point. The logistic and Gompertz models belong to the same family of tumor growth equations and can be seen as specific cases of the generalized logistic model [56, 15].
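
The structural difference between the models can be made explicit by writing their specific growth rates side by side. The block below is a sketch under stated assumptions: the logistic growth rate is denoted by $a$ (a symbol introduced here), the Gompertz model is taken in the form with initial volume $V_{inj}$, and the generalized logistic model is written in one common parametrization that may differ from the one used in the analysis.

```latex
\begin{align*}
\text{Logistic:}\qquad
  \frac{1}{V}\frac{dV}{dt} &= a\left(1 - \frac{V}{K}\right),
  &\frac{d}{dV}\!\left(\frac{1}{V}\frac{dV}{dt}\right) &= -\frac{a}{K},\\[4pt]
\text{Gompertz:}\qquad
  \frac{1}{V}\frac{dV}{dt} &= \alpha - \beta\ln\frac{V}{V_{inj}},
  &\frac{d}{dV}\!\left(\frac{1}{V}\frac{dV}{dt}\right) &= -\frac{\beta}{V},\\[4pt]
\text{Gompertz, along a solution:}\qquad
  \frac{1}{V}\frac{dV}{dt} &= \alpha\,e^{-\beta t},\\[4pt]
\text{Generalized logistic:}\qquad
  \frac{1}{V}\frac{dV}{dt} &= \frac{a}{\nu}\left(1 - \left(\frac{V}{K}\right)^{\nu}\right)
  \;\longrightarrow\; a\,\ln\frac{K}{V}
  \quad (\nu \to 0).
\end{align*}
```

With ν = 1 this parametrization reduces to the logistic model, and as ν tends to 0 it approaches a Gompertz form, so both are special or limiting cases of the generalized logistic model.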
We also analyzed the generalized logistic model, which demonstrated good descriptive power but lacked robustness of convergence. Indeed, the SAEM algorithm converged to different estimates when started from different initial guesses of the parameters. This might be explained by the larger number of parameters (3), which led to identifiability problems. In addition, we found that the values of ν able to describe the data were often very small ($< 10^{-3}$), suggesting convergence to the Gompertz model.

Similarly to previous reports [6, 11, 12, 13], we also found a very strong linear correlation between the two parameters of the Gompertz model, i.e. α, the proliferation rate at injection, and β, the rate of decrease of the specific growth rate. Importantly, this correlation is not due to a lack of identifiability of the parameters at the individual level, which we investigated and found to be excellent. This finding motivated our choice to use a reduced Gompertz model, with only one individual-specific parameter and one population-specific parameter. This model has been proposed before in the context of individual tumor growth curves [11, 51], but here we leveraged the population approach to ensure reliable estimation of the population-level parameter and of the statistical distribution of the individual-level parameter. Importantly, while previous studies had investigated the resulting predictive power in only one animal [10] or using simulated data [51], here we rigorously demonstrated how the reduced Gompertz model allows better backward (or forward, although not reported here) prediction of tumor size and time of initiation. This analysis was performed using state-of-the-art techniques from predictive modeling (e.g. cross-validation) on a large number of animals. The descriptive power of the reduced Gompertz model was found to be similar to that of the two-parameter Gompertz model.
Critically, while previous work had demonstrated that two individual parameters were sufficient to describe tumor growth curves [15], these results now show that this number can be reduced to one.

Interestingly, we found different values of the carrying capacity K for the breast and the lung cancer cell lines measured by volume (K = 2600 mm³ and 12300 mm³, respectively), in contrast with previous claims [11]. This suggests that there might not be a characteristic saturation point within a species [51], but that the carrying capacity could be a typical feature of a tumor type in an animal model. From (10), the population constant k depends on the value of the parameter Vinj; therefore it cannot be viewed as a universal constant of tumor growth. However, it can be considered as a common trait within a species with similar characteristics (such as tumor type and value of Vinj). We used the formulations of the Gompertz (3) and reduced Gompertz (10) models in order to define α as the specific growth rate at injection, which could be compared to the in vitro proliferation rate λ. This could be leveraged clinically to predict past or future tumor growth kinetics based on proliferation assays derived from a patient’s tumor sample.

The reduced Gompertz model, combined with Bayesian estimation from the population prior, reached good levels of accuracy and precision in estimating the time elapsed between the injection of the tumor cells and late measurements, used as an experimental surrogate of the age of a given tumor. Importantly, performances obtained without using a prior were substantially worse. The method proposed herein remains to be extended to clinical data, although firm confirmation will not be possible since the natural history of neoplasms from their inception cannot be observed in a clinical setting. Nevertheless, the encouraging results obtained here suggest that the method could provide informative estimates, even if approximate.
Importantly, the methods we developed also provide a measure of precision, which would give a quantitative assessment of the reliability of the predictions. For clinical translation, Vinj should be replaced by the volume of one cell, $10^{-6}$ mm³. Moreover, because the Gompertz model has a specific growth rate that tends to infinity when V gets arbitrarily small, our results might have to be adapted with the Gomp-Exp model [63, 24].

Our methodology might face multiple challenges in future clinical applications. First, it is difficult to fully characterize unperturbed tumor kinetics in humans, and only a few studies support the evidence that it follows Gompertzian growth [8]. This is due to the limited number of observations available in the clinic and to the fact that saturation of human tumors is almost never reached, since it coincides with an advanced stage of the cancer at which patients usually receive treatment. Moreover, human tumor growth curves, even if limited to the same organ and histological type, exhibit substantially larger variability than in in vivo experimental settings where immortalized cancer cell lines are injected into genetically identical mice. Here, we have shown that a given animal model (i.e. same mice, tumor type and number of injected cells) is characterized by a common tumor growth constant that defines the saturation point. In the human setting, it could be interesting to analyze this constant as a function of covariates (such as weight, sex or tumor type). Finally, in the Gompertz model we have not considered that the initial phase of tumor growth might be affected by intrinsic stochasticity. Our choice was motivated by the large number of injected cells (of the order of $10^6$), which allowed us to consider the initial variability to be negligible. For accurate clinical translation, stochasticity should ideally be taken into account to model the initial stages of tumor growth.
Personalized estimations of the age of a given patient’s tumor would yield important epidemiological insights and could also be informative for routine clinical practice [39]. By estimating the period at which the cancer initiated, they could give clues on the possible causes (environmental or behavioral) of neoplastic formation. Moreover, reconstruction of the natural history of pre-diagnosis tumor growth might inform on the presence and extent of invisible metastases at diagnosis. Indeed, an older tumor has a greater probability of having already spread than a younger one. Altogether, the present findings could contribute to the development of personalized computational models of metastasis [24, 64, 65].

Statistical indices of the tumor growth models (lung, volume).

Models ranked in ascending order of AIC (Akaike information criterion). Other statistical indices are the log-likelihood estimate (-2LL) and the Bayesian information criterion (BIC). (PDF)

Statistical indices of the tumor growth models (breast, fluorescence).

Parameter values estimated with the SAEM algorithm (lung, volume).

Fixed effects (typical values) of the parameters of the different models, together with the standard deviation of the random effects and the vector of residual error model parameters. The last column shows the relative standard errors (R.S.E.) of the estimates. (PDF)

Parameter values estimated with the SAEM algorithm (breast, fluorescence).

Diagnostic plots from population analysis (lung, volume).

Population analysis of experimental tumor growth kinetics. A) Visual predictive checks assess goodness-of-fit for both structural dynamics and inter-animal variability by reporting model-predicted percentiles (together with their prediction intervals, PI) in comparison to empirical ones. B) Prediction distributions. C) Individual weighted residuals (IWRES) with respect to time. D) Observations vs predictions. Left: exponential, Center: logistic, Right: Gompertz models. (TIF)

Diagnostic plots from population analysis (breast, fluorescence).

Individual fits from population analysis (lung, volume).

Three representative examples of individual fits (animal A, animal B and animal C) computed with the population approach for the exponential (left), the logistic (center) and the Gompertz (right) models. (TIF)

Individual fits from population analysis (breast, fluorescence).

Correlation between the Gompertz parameters and diagnostic plots of the reduced Gompertz model with the population approach (lung, volume).

Correlation between the Gompertz parameters and diagnostic plots of the reduced Gompertz model with the population approach (breast, fluorescence).

Backward predictions computed with likelihood maximization (LM) and with Bayesian inference (lung, volume).

Three examples of backward predictions of individuals A, B and C computed with likelihood maximization (LM) and Bayesian inference: Gompertz model with likelihood maximization (first row); reduced Gompertz with likelihood maximization (second row); Gompertz with Bayesian inference (third row) and reduced Gompertz with Bayesian inference (fourth row). Only the last three points are considered to estimate the parameters. The grey area is the 95% prediction interval (PI) and the dotted blue line is the median of the posterior predictive distribution. The red line is the predicted initiation time and the black vertical line is the actual initiation time. (TIF)

Error analysis of the predicted initiation time (lung, volume).

Backward predictions computed with likelihood maximization (LM) and with Bayesian inference (breast, fluorescence).

Error analysis of the predicted initiation time (breast, fluorescence).

Accuracy of the prediction models. Swarmplots of relative errors obtained under likelihood maximization (A) or Bayesian inference (B). (C) Absolute errors: comparison between the different distributions (* p-value < 0.05, ** p-value < 0.01). (TIF)

29 Aug 2019

Dear Dr Benzekry,

Thank you very much for submitting your manuscript 'A reduced Gompertz model for predicting tumor age using a population approach' for review by PLOS Computational Biology. Your manuscript has been fully evaluated by the PLOS Computational Biology editorial team and in this case also by three independent peer reviewers. While the reviewers found your work an interesting and well-designed implementation of data-driven modelling and model selection, they raised some substantial concerns about the manuscript as it currently stands. In particular, the validity of some of your conclusions, and the generality of others, is questioned by the reviewers. While your manuscript cannot be accepted in its present form, we urge you to consider a revised version in which the issues raised by the reviewers have been adequately addressed. We cannot, of course, promise publication at that time.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

Your revisions should address the specific points made by each reviewer. Please return the revised version within the next 60 days. If you anticipate any delay in its return, we ask that you let us know the expected resubmission date by email at ploscompbiol@plos.org. Revised manuscripts received beyond 60 days may require evaluation and peer review similar to that applied to newly submitted manuscripts.
Sincerely,

Zvia Agur, PhD
Guest Editor
PLOS Computational Biology

Feilim Mac Gabhann
Editor-in-Chief
PLOS Computational Biology

Reviewer's Responses to Questions

Comments to the Authors:

Reviewer #1: Dear Professor Agur, The manuscript by Benzekry and coworkers is very interesting and insightful but it needs some important improvements. Thus, I suggest major revisions. The following is a list of key points to be reworked:

1. I liked a lot the basic idea to use mixed models as a way to relate the single individual laws of growth to population-wide data. However, I disagree with the choice of comparing the Gompertz model only with the simplest two other ODE models of tumor growth: the exponential and the classical logistic model. The comparison must be done also with respect to more realistic models such as: the generalized logistic (V’ = a V – b V^{n} with 0

2. The fact that the classical logistic law V’ = p V – q V^2 did not perform well is not so surprising. See for example some considerations at the end of section 2 of d’Onofrio, Chaos, Solitons and Fractals (2009): the classical logistic model might correspond to space-independent cellular interactions, which are non-physical.

3. The authors have detected a correlation between the parameters alpha and beta. If I understood well what the authors mean, this correlation is not new but has been reported in the literature in the seventies. See, for example, section 5.5 of Wheldon’s book (ref 40). This is not explicitly said in the manuscript: the authors simply cite references 11 and 26 and only in the discussion they list a series of previous similar results.
The fact that this correlation is something previously known in the literature must be explicitly written in the revised version where the correlation is described, not only in the discussion.

4. Are the authors sure that the reduced Gompertz model has not been defined previously?

5. In the non-oncological literature on biological growth there are a number of works that apply nonlinear mixed effects modeling to the Gompertz model. They ought to be mentioned and reviewed in the introduction of the revised version of the manuscript.

6. There are also some works applying mixed-effects statistical modeling to investigate tumor growth described by simple ODE models, including the Gompertz and other ODE-based models (e.g. the above-mentioned paper by Ribba et al. or the paper by Hartung et al., Cancer Research (2014) and, of course, Benzekry et al., 2014). Indeed, currently the authors only generically wrote “However, to our knowledge, a detailed study of statistical properties of classical growth models at the level of the population (i.e. integrating structural dynamics with inter-animal variability) yet remains to be reported. Longitudinal data analysis with nonlinear mixed-effects is an ideal tool to perform such a task [17, 18].” Thus, the authors ought to clarify in detail what the differences are between their study and the previous ones (not limited to the above-mentioned papers).

7. Summarizing points 3, 5 and 6: my feeling is that a substantial effort is needed to compare the present work to previous literature, explicitly stressing what is really new in this work (a lot) and what has been previously published.

8. In my opinion, the description of the Bayesian inference ought to be more detailed. This would help both those who are more oriented to the frequentist approach in statistics, and those (e.g. biomathematicians) who are not expert at all in statistics.

9.
As far as the “tumor age” is concerned, I would be much more prudent due to the fact that tumour growth is affected both by intrinsic stochasticity (of course, especially in the initial stages of growth) and by extrinsic stochasticity. What the authors brilliantly computed is merely an estimation of the average tumor age, and the associated confidence-credibility intervals of that estimate. This is something radically different from inferring the probability density of the random variable “tumor age”, which can only be done in the framework of a stochastic tumour growth model.

10. In the discussion the authors must add a detailed discussion of the limitations of their work.

Kind Regards, A Referee

Reviewer #2: The manuscript considers an interesting problem of predicting tumor age. The main model used for the data analysis is the Gompertz model and its reduced form. This is correct from my point of view. The reduced version is well argued. On the other hand, the authors compare the classic logistic model with constant carrying capacity (which should be understood as maximal tumor size here) with the form of the Gompertz model in which the carrying capacity depends on initial data (as it is in the original Gompertz model). Typically, for the comparison the Gompertz model is rescaled in such a way that there is a carrying capacity independent of the initial value, and then the comparison between the models makes sense. The authors should address this problem in their manuscript. Moreover, in both types of the Gompertz model (full and reduced) there arises a problem of dependence on V_inj, which is known in in vitro experiments but unknown in vivo. This is probably the main problem considering the utility of the proposed method.

I have also found some editorial and language mistakes, like:
- lines 108/109 "number of injected" WHAT?
- line 116 "litterature" should be literature
- in Formula (6): what is p here? In line 168 p is the number of parameters!
- in line 158: "equation(7)" - lack of space.

Reviewer #3: Review is uploaded as an attachment

Have all data underlying the figures and results presented in the manuscript been provided? Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: None
Reviewer #2: Yes
Reviewer #3: None

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review?

Reviewer #1: No
Reviewer #2: No
Reviewer #3: No

Submitted filename: Revision PCOMPBIOL-D-19-00943.docx

21 Nov 2019

Submitted filename: revision.docx

11 Dec 2019

Dear Dr Benzekry,

Thank you for submitting a revised manuscript 'A reduced Gompertz model for predicting tumor age using a population approach.' The paper is interesting and most of the reviewers' comments have been addressed satisfactorily, and the discrepancies still existing are minor. Nevertheless, we would ask you to make a minor revision before the manuscript is finally accepted for publication.

One reviewer still has reservations on the implications of the assumption that k may be constant. The reviewer trusts your findings that from the statistical point of view, this can be assumed here without loss of goodness of fit. However, he doubts your claim that this is a universal property. Simply put, in this model we have carrying capacity K = Vinj exp(k).
If k is fixed for the tumor type and animal, this means that by repeating this experiment and injecting, say, 0.1*Vinj, one will obtain a 10 times smaller carrying capacity; or inversely, injecting 10 times more cells will allow the tumor to grow asymptotically to a 10 times larger size. This dependence of the saturation level on the initial size seems biologically unsound. The reviewer does not think this invalidates your results, but that it calls for more careful interpretation. Possibly, the real "natural constant" is K, and the value of k would change if you inject a larger number of the same cells into the same animals. This concern should be elaborated in the discussion, especially when you envisage the possible use of the model to trace the tumor growth back in time to the size of 1 cell.

The second concern the reviewer raises is the insufficient review of previous works on “mathematical models for tumor growth, which have been previously studied and compared at the level of individual kinetics and for prediction of future tumor growth” (line 53). Please note that there is previous work on the subject, e.g., Kronik N., Kogan Y., Elishmereni M., Halevi-Tobias K., Vuk Pavlović S., Agur Z. Predicting Effect of Prostate Cancer Immunotherapy by Personalized Mathematical Models. PLoS One 2010 5(12) (and it deals with the clinical data) and several other publications in the last 10 years. Please study the literature and include previous work on the subject appropriately in your review.

We would therefore like to ask you to modify the manuscript according to the review recommendations before we can consider your manuscript for acceptance. Your revisions should address the specific points made by each reviewer and we encourage you to respond to particular issues raised. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available.
The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.raised. In addition, when you are ready to resubmit, please be prepared to provide the following: (1) A detailed list of your responses to the review comments and the changes you have made in the manuscript. We require a file of this nature before your manuscript is passed back to the editors. (2) A copy of your manuscript with the changes highlighted (encouraged). We encourage authors, if possible to show clearly where changes have been made to their manuscript e.g. by highlighting text. (3) A striking still image to accompany your article (optional). If the image is judged to be suitable by the editors, it may be featured on our website and might be chosen as the issue image for that month. These square, high-quality images should be accompanied by a short caption. Please note as well that there should be no copyright restrictions on the use of the image, so that it can be published under the Open-Access license and be subject only to appropriate attribution. Before you resubmit your manuscript, please consult our Submission Checklist to ensure your manuscript is formatted correctly for PLOS Computational Biology: http://www.ploscompbiol.org/static/checklist.action. Some key points to remember are: - Figures uploaded separately as TIFF or EPS files (if you wish, your figures may remain in your main manuscript file in addition). - Supporting Information uploaded as separate files, titled 'Dataset', 'Figure', 'Table', 'Text', 'Protocol', 'Audio', or 'Video'. - Funding information in the 'Financial Disclosure' box in the online system. While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. 
Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

We hope to receive your revised manuscript within the next 30 days. If you anticipate any delay in its return, we ask that you let us know the expected resubmission date by email at ploscompbiol@plos.org.

If you have any questions or concerns while you make these revisions, please let us know.

Sincerely,

Zvia Agur, PhD
Guest Editor
PLOS Computational Biology

Feilim Mac Gabhann
Editor-in-Chief
PLOS Computational Biology

27 Dec 2019

Submitted filename: minor_revision.docx

6 Jan 2020

Dear Dr Benzekry,

We are pleased to inform you that your manuscript 'A reduced Gompertz model for predicting tumor age using a population approach' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Once you have received these formatting requests, please note that your manuscript will not be scheduled for publication until you have made the required changes.

In the meantime, please log into Editorial Manager at https://www.editorialmanager.com/pcompbiol/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production and billing process.

One of the goals of PLOS is to make science accessible to educators and the public. PLOS staff issue occasional press releases and make early versions of PLOS Computational Biology articles available to science writers and journalists.
PLOS staff also collaborate with Communication and Public Information Offices and would be happy to work with the relevant people at your institution or funding agency. If your institution or funding agency is interested in promoting your findings, please ask them to coordinate their releases with PLOS (contact ploscompbiol@plos.org).

Thank you again for supporting Open Access publishing. We look forward to publishing your paper in PLOS Computational Biology.

Sincerely,

Z Agur, PhD
Guest Editor
PLOS Computational Biology

Feilim Mac Gabhann
Editor-in-Chief
PLOS Computational Biology

13 Feb 2020

PCOMPBIOL-D-19-00943R2

Population modeling of tumor growth curves and the reduced Gompertz model improve prediction of the age of experimental tumors

Dear Dr Benzekry,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Matt Lyles
PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom
ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol
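
The reviewer's scaling argument in the 11 Dec 2019 decision letter (a fixed k ties the carrying capacity K = Vinj exp(k) to the injected volume) can be checked numerically. The sketch below is illustrative only. It assumes the reduced Gompertz form V(t) = Vinj exp(k (1 - exp(-beta t))), which is consistent with the carrying capacity quoted in the letter but is not written out in this part of the record. The function names and all parameter values are hypothetical and are not taken from the paper.

```python
import math


def reduced_gompertz(t, beta, k, v_inj):
    """Tumor volume at time t (days after injection).

    Assumed form: V(t) = v_inj * exp(k * (1 - exp(-beta * t))).
    """
    return v_inj * math.exp(k * (1.0 - math.exp(-beta * t)))


def carrying_capacity(k, v_inj):
    """Asymptotic volume as t -> infinity: K = v_inj * exp(k)."""
    return v_inj * math.exp(k)


def time_to_reach(v, beta, k, v_inj):
    """Invert the assumed model: time at which the curve passes through volume v.

    A negative result means the curve reaches v before the injection time,
    i.e. the model is being extrapolated backward.
    """
    if v <= 0.0:
        raise ValueError("volume must be positive")
    ratio = math.log(v / v_inj) / k
    if ratio >= 1.0:
        raise ValueError("volume is at or above the carrying capacity")
    return -math.log(1.0 - ratio) / beta


# Hypothetical values, chosen only to illustrate the scaling.
K_PARAM = 10.0   # population-level parameter k (dimensionless)
BETA = 0.07      # individual parameter beta (1/day)
V_INJ = 1.0      # injected volume (mm^3)

cap_full = carrying_capacity(K_PARAM, V_INJ)
cap_tenth = carrying_capacity(K_PARAM, 0.1 * V_INJ)
print(f"K with Vinj:       {cap_full:.1f} mm^3")
print(f"K with 0.1 * Vinj: {cap_tenth:.1f} mm^3")
print(f"ratio:             {cap_tenth / cap_full:.2f}")

# Round trip: the volume predicted at day 30 maps back to an age of 30 days.
v_day30 = reduced_gompertz(30.0, BETA, K_PARAM, V_INJ)
print(f"recovered age: {time_to_reach(v_day30, BETA, K_PARAM, V_INJ):.2f} days")
```

With k held fixed, the ratio of the two carrying capacities equals the ratio of the injected volumes, which is exactly the dependence the reviewer questions. The inverse function shows why the point matters for age estimation: the estimated age depends on Vinj and k only through log(V / Vinj) / k, so any misinterpretation of k carries over directly to the back-extrapolated time.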
