Literature DB >> 31655256

Dynamic Emergence of Observed and Hidden Intra-tumor Heterogeneity.

Franck Raynaud1, Marco Mina2, Giovanni Ciriello3.   

Abstract

Intra-tumor heterogeneity is frequently observed in cancer patients, and it is associated with therapeutic resistance and disease relapse. However, its systematic assessment is still limited and often unfeasible. Here, we use a mathematical model of tumor progression to decipher how multiple clones emerge and organize into complex architectures. We found a trade-off between cancer cell alteration and proliferation rates that defines a transition between low and high heterogeneity, the latter characterized by branching tumor phylogenies. We predict the existence of observed and hidden intra-tumor heterogeneity, which challenges the correct estimation of intrinsic tumor complexity. Although the numbers of observed and hidden clones do not always correlate, we demonstrate that population frequencies of observed clones can be used to estimate the extent of hidden heterogeneity in both simulated and human tumors. The characterization of complex clonal architectures is a critical first step toward understanding their organizing principles and predicting their emergence.
Copyright © 2019 The Author(s). Published by Elsevier Inc. All rights reserved.

Entities:  

Keywords:  Bioinformatics; Cancer; Cancer Systems Biology; Mathematical Biosciences; Systems Biology

Year:  2019        PMID: 31655256      PMCID: PMC6820272          DOI: 10.1016/j.isci.2019.10.018

Source DB:  PubMed          Journal:  iScience        ISSN: 2589-0042


Introduction

Individual tumors are often a composite of heterogeneous cancer cells, which have accumulated distinct molecular modifications over time. This dynamic process has been characterized according to evolutionary principles (Nowell, 1976) where abnormal cell proliferation is associated with the emergence and selection of alterations. Molecular alterations are in the most cases functionally neutral, but occasionally they confer a selective advantage to the cancer cell where they occurred, by promoting key oncogenic hallmarks (Hanahan and Weinberg, 2011). Typically referred to as “drivers”, selected alterations promote clonal expansion and, thus, the evolution and diversification of the tumor. Starting from this premise, evolutionary theory has provided an accurate framework to understand the emergence of cancer and its progression. Seminal works established this connection for neoplasms and metastasis (Fidler, 1978, Nowell, 1976). Ever since, multiple models of cancer evolution have been proposed, mostly falling in two major categories: models of tumor initiation and models of tumor progression. Tumor initiation requires the fixation of a first selective event, a gain-of-function mutation in an oncogene (Michor et al., 2004), a loss-of-function alteration in a tumor suppressor gene (Knudson, 1971, Komarova, 2007), or early incremental increases of cell fitness (Michor et al., 2011). Tumor progression, instead, is characterized by a succession of selected mutations and corresponding waves of clonal expansion. In this context, models of clonal expansion proved to be useful to characterize disease progression, resistance to therapy (Durrett and Moseley, 2010, Michor et al., 2005), mutation burden (Bozic et al., 2010, McFarland et al., 2013, McFarland et al., 2014), and mutation timing (Beerenwinkel et al., 2007, Beerenwinkel et al., 2014). Despite the numerous insights provided by these models, the link between evolutionary parameters modeling tumor growth, such as alteration and/or replication rates, and the resulting extent of intra-tumor heterogeneity remains largely unexplored (Bozic et al., 2016, Chowell et al., 2018), especially in relation to actual human tumors. Indeed, a robust experimental estimation of the parameters themselves is currently missing, and the clonal composition of a tumor is mainly assessable through algorithmic inferences from bulk tumor sequencing rather than direct observations (Deshwar et al., 2015, Jiang et al., 2016, Miller et al., 2014, Raynaud et al., 2018, Roth et al., 2014). Recently, distinct types of clonal architectures have been inferred for multiple tumor types analyzed by multi-regional sequencing and/or longitudinal sample cohorts (McGranahan and Swanton, 2017, Yates et al., 2017). Nonetheless, whether and how these architectures depend on properties of tumor growth remains to be investigated. Here we numerically explored the emergence of diverse clonal architectures as a function of alteration and proliferation rates. Based on extensive simulations using a wide range of parameters, we explored fundamental features of intra-tumor heterogeneity, such as how many clones can typically be found in a tumor sample, whether they have similar sizes or one accounts for most of the cell population, how did they descend from each other, and, finally, what are the evolutionary determinants of these features. Our results from simulated tumors directly link clonal architectures to characteristics of tumor progression and indicate that intra-tumor heterogeneity is frequently underestimated. Importantly, we could determine features that can be computed for both simulated and human tumors that are predictive of this underestimation.

Model of Cancer Evolution

To investigate the properties of cancer evolution, tumor clonal architecture, and intra-tumor heterogeneity, we adapted a previously proposed model of tumor progression (Bozic et al., 2010). In this model, tumors evolve according to a discrete time process in which cells, simultaneously, at each time step either replicate or die. When a cell replicates, one of the daughter cells can gain a new alteration with a probability μ (alteration rate). We consider two types of alterations: driver alterations, which increase the selective advantage of the cell by reducing its probability to die, and passenger alterations, which have instead a neutral effect. To which extent the probability of dying is reduced by the acquisition of a new driver alteration is modulated by the parameter s (fitness). The alteration rate μ and fitness s are input parameters; they are identical for all cells and remain constant during the simulation. For a given fitness parameter s, a cell that has accumulated k driver alterations has a probability to die d given by: The probability to replicate b satisfies the following relationship: Human tumors typically exhibit highly heterogeneous proliferation rates and alteration loads. Here, to mimic such heterogeneity, we did not estimate specific model parameters, instead we chose to explore a wide range of possible parameter values for μ and s. First, we set the fitness parameter s ∈ [10−4 – 5 × 10−1] in agreement with previously proposed values (Bozic et al., 2010, Chowell et al., 2018, Levy et al., 2015, McFarland et al., 2014, Vermeulen et al., 2013). Then, we estimated a range of alteration rates based on previously reported mutation burden across multiple human cancers (Lawrence et al., 2013). By analyzing tumor cohorts from The Cancer Genome Atlas (TCGA), the number of mutations per nucleotide was estimated between 7 × 10−8 to 10−4 (assuming that the exome accounts for ∼2% of the genome). Therefore, we set for our simulations μ ∈ [10−7 – 10−3], also consistent with recent numerical studies (Bozic et al., 2010, Chowell et al., 2018, McFarland et al., 2014). New driver mutations occur at a rate μ × μ, with μ = 0.025. We chose this value based on an estimated number of 500 cancer-associated alterations (Bailey et al., 2018, Mina et al., 2017). Notably, a similar number of cancer-associated mutations was reported in the COSMIC Cancer Census, http://cancer.sanger.ac.uk/census. In our simulations, we defined a clone as a population of cells with identical mutational composition (i.e., genotype): after each replication step, if no alteration has occurred then the two daughter cells will remain in the same clone, otherwise the sibling with the new alteration will create a new clone, whether the new alteration is a driver or a passenger (Figure 1A). Tracking exact lineages between clones of each simulated tumors is computationally intensive, as all cells have to be monitored at each replication step. To improve efficiency, we simulated the evolution of clones, rather than cells. Specifically, for a clone with N identical cells, the number of replicating cells n was drawn from a binomial distribution with a success probability b. Then among n, we determined the number of mutating cells n from a binomial distribution with a probability μ. Finally, for each newly altered cell, the probability for the new mutation to be a driver mutation was μ.
Figure 1

Tumor Clonal Architectures and Transition between Linear and Branched Phylogenies

(A) Representation of the mutational process. Top, cell division without mutational event. The size of the clone is just increased by one unit. Middle, an alteration occurs during replication. A new clone is created (blue) by one cell, and the size of the original clone does not change. Bottom, cell death. In practice, more than one cell replicates and/or acquires a new mutation in the same time, resulting in the formation of more than one new clone at each replication. We pictured one event for the sake of simplicity.

(B) Example of Tree score values for phylogenies with four clones. Tree scores increase with increasing divergence.

(C) Tree scores of simulated tumors as a function of the alteration rate μ (x axis) and fitness s (y axis). Tree scores are color coded (cold colors for low scores, warm colors for high scores). The region (white contour) and fitted line (red line) corresponding to the transition between linear and divergent phylogenies are highlighted.

(D) Tree score variance values of simulated tumors as a function of the alteration rate μ (x axis) and fitness s (y axis). Tree score variances are color coded (white to black corresponding to low to high variance). The region (white contour) and fitted line (red line) corresponding to the transition between linear and divergent phylogenies are highlighted.

(E) Tree score distribution of tumors within the region of transition.

(F) The transition line (red line) was derived by fitting the points with maximal variance (gray dots).

See also Figure S1.

Tumor Clonal Architectures and Transition between Linear and Branched Phylogenies (A) Representation of the mutational process. Top, cell division without mutational event. The size of the clone is just increased by one unit. Middle, an alteration occurs during replication. A new clone is created (blue) by one cell, and the size of the original clone does not change. Bottom, cell death. In practice, more than one cell replicates and/or acquires a new mutation in the same time, resulting in the formation of more than one new clone at each replication. We pictured one event for the sake of simplicity. (B) Example of Tree score values for phylogenies with four clones. Tree scores increase with increasing divergence. (C) Tree scores of simulated tumors as a function of the alteration rate μ (x axis) and fitness s (y axis). Tree scores are color coded (cold colors for low scores, warm colors for high scores). The region (white contour) and fitted line (red line) corresponding to the transition between linear and divergent phylogenies are highlighted. (D) Tree score variance values of simulated tumors as a function of the alteration rate μ (x axis) and fitness s (y axis). Tree score variances are color coded (white to black corresponding to low to high variance). The region (white contour) and fitted line (red line) corresponding to the transition between linear and divergent phylogenies are highlighted. (E) Tree score distribution of tumors within the region of transition. (F) The transition line (red line) was derived by fitting the points with maximal variance (gray dots). See also Figure S1. To estimate intra-tumor heterogeneity (the mean number of clones) and quantify the tumor clonal architecture, we retained only clones with a number of cells greater or equal to 1% of the total population (here referred to as observable clones). This is consistent with the fraction of sequencing reads typically required by cancer exome-sequencing studies, e.g., from TCGA, to retain a somatic mutation (Figure S1A). The model of clonal evolution is implemented in Python, using the environment for tree exploration (ETE) (Huerta-Cepas et al., 2010).

Results

In total, we simulated ∼40,000 tumors, with a similar number of replicates for each pair of parameters (μ, s). For each simulation, we tracked the number of clones as well as their size and lineage, thus to be able to reconstruct the exact architecture of each simulated tumor. We stopped the simulation when a tumor reached a size of 5 × 108 cells.

Properties of Tumor Clonal Architectures

To capture the evolutionary history of a tumor, we need to understand how clone lineages are organized, i.e., how individual clones descend from one another (clone phylogeny). Clone phylogenies can be conveniently represented as trees wherein nodes are the clones and two clones are connected if one is the offspring of the other. The founder clone is the root of the tree, and clones without descendant are the leaves. At the extremes of the possible tree configurations are linear phylogenies, i.e., sequential generation of clones along a single lineage, and star-like phylogenies, i.e., each new clone descends directly from the root. Intuitively, the more branching (i.e., star-like) a phylogeny, the shorter is the path from the leaf to the root; in contrast, linear phylogenies will have a single leaf at the maximal distance from the root. To quantify the extent of branching in a phylogeny we introduced the following quantity:where N is the number of observable clones and is the average length of the path from the root to the leaves. Linear phylogenies obtain a Tree score equal to 0, whereas the Tree score will increase with greater branching (Figure 1B). We should note that Tree scores are not uniquely associated with phylogenies, i.e., different phylogenies can receive the same Tree score. However, in the simulations, this degeneracy remained moderate: for each pair of parameter values, μ and s, we observed on average less than two different phylogenies with the same Tree score (Figure S1B). Moreover, phylogenies receiving the same Tree score were significantly more similar than phylogenies with different Tree scores (Figures S1C–S1E and Transparent Methods), indicating that the Tree score provides a good representation of the tree structure. Next, we assessed how Tree scores varied in our simulated tumors based on input parameters. The resulting distribution (Figure 1C) showed an opposite effect of the fitness and alteration rate on the Tree score. At fixed fitness the mean Tree score increased with increasing alteration rates (i.e., multiple independent linages emerge when increasing the alteration rate), whereas at a fixed alteration rate, it decreased with increasing fitness. The parameter space was largely split into two regions: one with low branching or linear phylogenies (low or zero Tree score) and one with highly branching phylogenies (high Tree score). Interestingly, where the mean Tree score shifted to non-zero values, we observed an increase of its variance (Figure 1D). This result is reminiscent of a first-order phase transition in physical systems, where the variance of the order parameter diverges at the transition and different phases coexist, as shown by the distribution of Tree scores at the transition (Figure 1E). In our case, the transition was characterized by two phases: linear and branching evolution, separated by the transition line (with α = 1.5 ± 0.05 Figure 1F). To further assess the robustness of this result, we estimated the transition line with an alternative approach. Briefly, for every fitness values s we identified the mutation rate μ* that generated the highest number of observable clones. We then fit the measured mutation rate μ* as function of s. The best fit showed that μ*(s) followed the same power law as the transition estimated using the variance of the Tree score, and the rescaling of the mean number of clones confirmed that the transition held for diverse model parameters (Figures S2A–S2F). Overall, Tree scores of simulated phylogenies were highly correlated with their number of clones, indicating that highly heterogeneous tumors typically exhibited a branching phylogeny (Figure S2G). Intriguingly, we recently reported a similar association between Tree scores and number of clones in human tumors (Raynaud et al., 2018) (Figure S2H). Indeed, the set of phylogenies generated by our simulations and inferred from human tumors were highly consistent and, importantly, they represented only a subset of all possible phylogenies for a given number of clones (Figure S2I). These results indicate that the growth dynamics of our model and human tumors prevent highly polyclonal phylogenies with limited branching from emerging. In the parameter space defined by μ and s, the Tree score surprisingly decreased from high to low values in a region characterized by low fitness and high alteration rates (Figure 1C, bottom right corner). However, here we did not observe high variance of scores, suggesting a different mechanism for the reemergence of linear phylogenies in this region of parameters. A similar trend was observed for the overall distribution of the number of clones (Figure 2A). Indeed, here two transitions were observed from low to high number of clones (blue to red in Figure 2A) and from high to low in the region with high alteration rates and relatively low fitness (red to blue in Figure 2A, bottom right corner). Given we initially restricted the analysis to clones with a number of cells corresponding to at least 1% of the total cell population (observable clones), we wondered whether the number of clones below such threshold was significantly different in the two areas characterized by low number of clones and low Tree scores. In our simulations, we could count all clones independently of their sizes and we found that the total number of clones increased linearly with the alteration rate and never decreased (Figures 2B and S2J). Indeed, for samples with the same number of observable clones (e.g., n = 6), the number of “hidden” clones was significantly higher in the second transition (high to low) than in the first one (low to high) (Figure 2C) and confirmed that the decrease in the number of clones and Tree score was simply associated with the filtering process. The immediate consequence of these results is that highly heterogeneous tumors could exhibit the same number of observable clones as less heterogeneous ones, “hiding” their true complexity.
Figure 2

Observed and Total Number of Clones in Simulated Tumors

(A) Mean number of clones with a size (number of cells) greater than 1% of the total cell population obtained by simulated tumors as a function of their alteration rate μ (x axis) and fitness s (y axis). For each pair of coordinates, μ and s, the mean number of clones observed in simulations corresponding to those coordinates is color coded (cold colors for low number, warm colors for high numbers). Simulations with six clones (white lines) can be found for different ranges of parameters.

(B) The total number of clones, irrespective of their size, increases linearly with the mutation rate. The slope of the curves depends on the fitness s (curves for five representative s values are shown) and decreases with the logarithm of s (inset). Data are presented as mean ± SD.

(C) Boxplot of the number of clones before filtering for simulated tumors with six detectable clones found in the area of low hidden heterogeneity, upper white curve in (A), and of high hidden heterogeneity, lower white curve in (A); p value is calculated with a two-tailed t test.

See also Figure S2.

Observed and Total Number of Clones in Simulated Tumors (A) Mean number of clones with a size (number of cells) greater than 1% of the total cell population obtained by simulated tumors as a function of their alteration rate μ (x axis) and fitness s (y axis). For each pair of coordinates, μ and s, the mean number of clones observed in simulations corresponding to those coordinates is color coded (cold colors for low number, warm colors for high numbers). Simulations with six clones (white lines) can be found for different ranges of parameters. (B) The total number of clones, irrespective of their size, increases linearly with the mutation rate. The slope of the curves depends on the fitness s (curves for five representative s values are shown) and decreases with the logarithm of s (inset). Data are presented as mean ± SD. (C) Boxplot of the number of clones before filtering for simulated tumors with six detectable clones found in the area of low hidden heterogeneity, upper white curve in (A), and of high hidden heterogeneity, lower white curve in (A); p value is calculated with a two-tailed t test. See also Figure S2. The total number of clones, irrespective of their size, could reach values orders of magnitude higher than those typically observed in human datasets (Andor et al., 2016). The difference in the number of clones due to the 1% filtering indicated that simulated tumors were composed by few large clones (up to ∼10 clones on average) and a wide array of small clones with size below the detectability threshold (Figures 2A and 2B). A similar observation was recently made in Chowell et al. (2018) for a different detection threshold (10%), suggesting that independent of the threshold values, the majority of the true heterogeneity remains hidden. Given that clones are here defined by their genotype, independent of whether they actually exhibit greater fitness, we expect many of these small hidden clones to disappear. Hence, we asked what percentage of the hidden cell population is predicted to expand and form a new clonal lineage. Where hidden heterogeneity is the greatest (Figure 2B, bottom right corner), the number of driver mutations per hidden clone increased (Figure S2K). However, given the low values of the fitness parameter in this space, less than 0.1% of the hidden cell population was predicted expanding (Figure S2L). Vice versa, for high values of both μ and s, this value reached 5% (Figure S2L). Although multiple factors could limit the growth of newly generated clones in human tumors, this in silico observation suggests that filtering mutations with low frequencies may result in underestimating intra-tumor heterogeneity. Importantly, in human tumors where alteration rates and fitness are not given it remains impossible to estimate the extent of hidden heterogeneity. Given each simulated tumor is composed of the same number of cells, the difference between the number of observable and hidden clones suggests that the relative size of detectable clones could provide an indication of the actual extent of heterogeneity. We found that the normalized size of the biggest clone C decreased with the alteration rate as a stretched exponential (Figures 3A, S3A, and S3B and Transparent Methods)where μ* and β are both functions of the fitness s (Transparent Methods and Figures S3C and S3D). Vice versa, the size of the other clones Ci (i = 2,3,..,N) increased with the alteration rate while μ ≪ s (corresponding to an increase of the number of detectable clones), but as μ approaches s, the curves reached a maximum and then collapsed below the detection threshold (resulting in a decrease of the number of observable clones) (Figures 3A and S3E). We can write the sizes of the subsequent clones Ci using the following ansatz:where αi is a function of the rank i and the fitness s (Transparent Methods and Figures S3F–S3H). Using our numerical results, we estimated the value of μ*, β, and αi and analytically reconstructed the distribution of the number of observable clones. The resulting distribution closely mimicked what we observed in our simulations (Figure 3B), and, importantly, it can be estimated independent of any detection threshold because μ*, β, and αi do not depend on it.
Figure 3

Clone Size Distribution and Analytical Prediction of Intra-tumor Heterogeneity

(A) The size, as a percentage of the total population, of the first clone (black dots) across simulations at different alteration rates and fixed fitness (s = 0.0002) decreases with increasing alteration rates μ (x axis) and is fitted by a stretched exponential (black line). The size of the second clone (red dots) initially grows with increasing alteration rates but eventually decreases. This trend can be fitted as a function of the size of the first clone and the model parameters (red line, see Methods). Subsequent clones follow the same trend (red dotted lines). Clones with sizes greater than 1% (blue line) are detectable.

(B) Analytically derived number of clones as a function of mutation rate and fitness.

See also Figure S3.

Clone Size Distribution and Analytical Prediction of Intra-tumor Heterogeneity (A) The size, as a percentage of the total population, of the first clone (black dots) across simulations at different alteration rates and fixed fitness (s = 0.0002) decreases with increasing alteration rates μ (x axis) and is fitted by a stretched exponential (black line). The size of the second clone (red dots) initially grows with increasing alteration rates but eventually decreases. This trend can be fitted as a function of the size of the first clone and the model parameters (red line, see Methods). Subsequent clones follow the same trend (red dotted lines). Clones with sizes greater than 1% (blue line) are detectable. (B) Analytically derived number of clones as a function of mutation rate and fitness. See also Figure S3. Based on Equations 4 and 5, we can predict that for low alteration rates the first clone is significantly bigger than the second one (Figure 3A, left side), whereas at high alteration rates their sizes become comparable (Figure 3A, right side). As clone sizes can be inferred in human tumors, this observation could be used to discriminate between low and high hidden heterogeneity and help estimating its extent in human samples without any information on the underlying alteration rate and fitness. A convenient way to analyze the difference in size among subsequent clones is to rely on the concept of population frequency (PF), which is independent of the final size of the tumor. For a clone i, its PF, PF, corresponds to the fraction of cells in the tumor that exhibits the set of alterations in i, but not necessarily only those, i.e., the fraction of cells in i and in all the clones descending from i. Formally, the PF of clone i in a tumor T is defined as:where |i| is the size of the clone i and {s} is the set of clones descending from i (Figure S4A). The distribution of sorted PF values can recapitulate the differences among clone sizes that we observed at varying alteration rates (Figure 4A). Intuitively, a sharply decreasing distribution indicates that the first clone had time to grow before the emergence of new clones, hence the first clone is considerably bigger than the others (Figure 4A, green line). By contrast, a slowly decreasing distribution indicates that new clones rapidly emerged giving rise to observable clones of similar size (Figure 4A, red line).
Figure 4

Estimation of Hidden Intra-tumor Heterogeneity Using Clone Population Frequencies and Comparison with Human Tumor Samples

(A) Schematic of the curves of ranked clone population frequencies. A sharply decreasing curve (green dotted line) corresponds to a large initial clone followed by small clones (green dots with size indicative of the clone size). A slowly decreasing curve (red dotted line) corresponds to clones of a similar size (red dots with size indicative of the clone size). Each curve can be scored by its area under the curve (AUC, e.g., the gray area below the green line).

(B) Clone population frequency AUC (PF-AUC) values of simulated tumors with the number of clones between 5 and 10 as a function of the alteration rate μ (x axis) and fitness s (y axis). PF-AUC values are color coded (cold colors for low values, warm colors for high values). Simulation with six clones (white lines) has different PF-AUC values based on the range of parameters.

(C) Hidden heterogeneity values (1 − fraction of cells in detectable clones) of simulated tumors with the number of clones between 5 and 10 as a function of the alteration rate μ (x axis) and fitness s (y axis). Values are color coded (light colors for low values, dark colors for high values). Simulation with six clones (white lines) has different extent of hidden heterogeneity.

(D) Ranked clone population frequency curves for simulated tumors with the number of clones between 5 and 10. PF curves were separately derived for simulated tumors in three ranges of parameters (μ and s) as shown in the inset on the top right corner (group A in green, group B in black, group C in red). For each group, the corresponding curves were aggregated and the range of values are displayed.

(E) Ranked clone population frequency curves for human (continuous lines in the background) and simulated (dotted lines on top) tumors with the number of clones between 5 and 10. PF curves of human tumors were assigned to groups A, B, or C (lines are colored with the color of the corresponding group) based on the closest curve of simulated tumors with the same number of clones.

(F) Distribution of PF-AUC values for bootstrap (upper panel, n = 501 samples) and subsampling (lower panel, n = 108 samples). Histograms represent the empirical distributions; black lines are the best fit from a mixture of three Gaussian distributions. Histograms are color coded for clarity and to help to visualize regions A–C.

(G) Fraction of samples within each category of hidden heterogeneity in human data. Actual values are written on the top of the bar plots cut for readability.

See also Figure S4 and Tables S1 and S2.

Estimation of Hidden Intra-tumor Heterogeneity Using Clone Population Frequencies and Comparison with Human Tumor Samples (A) Schematic of the curves of ranked clone population frequencies. A sharply decreasing curve (green dotted line) corresponds to a large initial clone followed by small clones (green dots with size indicative of the clone size). A slowly decreasing curve (red dotted line) corresponds to clones of a similar size (red dots with size indicative of the clone size). Each curve can be scored by its area under the curve (AUC, e.g., the gray area below the green line). (B) Clone population frequency AUC (PF-AUC) values of simulated tumors with the number of clones between 5 and 10 as a function of the alteration rate μ (x axis) and fitness s (y axis). PF-AUC values are color coded (cold colors for low values, warm colors for high values). Simulation with six clones (white lines) has different PF-AUC values based on the range of parameters. (C) Hidden heterogeneity values (1 − fraction of cells in detectable clones) of simulated tumors with the number of clones between 5 and 10 as a function of the alteration rate μ (x axis) and fitness s (y axis). Values are color coded (light colors for low values, dark colors for high values). Simulation with six clones (white lines) has different extent of hidden heterogeneity. (D) Ranked clone population frequency curves for simulated tumors with the number of clones between 5 and 10. PF curves were separately derived for simulated tumors in three ranges of parameters (μ and s) as shown in the inset on the top right corner (group A in green, group B in black, group C in red). For each group, the corresponding curves were aggregated and the range of values are displayed. (E) Ranked clone population frequency curves for human (continuous lines in the background) and simulated (dotted lines on top) tumors with the number of clones between 5 and 10. PF curves of human tumors were assigned to groups A, B, or C (lines are colored with the color of the corresponding group) based on the closest curve of simulated tumors with the same number of clones. (F) Distribution of PF-AUC values for bootstrap (upper panel, n = 501 samples) and subsampling (lower panel, n = 108 samples). Histograms represent the empirical distributions; black lines are the best fit from a mixture of three Gaussian distributions. Histograms are color coded for clarity and to help to visualize regions A–C. (G) Fraction of samples within each category of hidden heterogeneity in human data. Actual values are written on the top of the bar plots cut for readability. See also Figure S4 and Tables S1 and S2. We computed the PF distribution for all simulated tumors with 5–10 observable clones and scored each simulation by the area under the curve of the PF distribution (PF-AUC) (Figure 4A, gray area). PF-AUC scores increased with alteration rates and decreased with fitness (Figure 4B) and were highly correlated with the extent of hidden heterogeneity (Spearman’s correlation = −0.95, p-value = 10−165, Figures 4B and 4C). This observation confirmed that PF-AUC could be used to discriminate among tumors with the same number of observable clones but different extent of hidden heterogeneity. Interestingly, PF distributions in our simulated dataset, separated into three classes corresponding to different parameters and clonal heterogeneity (Figure 4D). Tumors characterized by low hidden heterogeneity (Figure 4D, region A) had sharply decreasing PF distributions, consistent with the presence of the dominant first clone, and those with high hidden heterogeneity (Figure 4D, region C) were characterized by PF distributions decreasing slower, and thus by multiple clones of more similar size. A third group corresponded to simulated tumors with a high mean number of clones (see Figure 2B) and intermediate level of hidden heterogeneity (Figure 4D, region B). To assess whether these regions of hidden heterogeneity are independent of the adopted growth model, first we implemented a second model of tumor evolution, similar to a previously proposed one wherein passenger mutations induce a small, yet not null, deleterious effect on cell fitness (see Transparent Methods) (McFarland et al., 2014). The results showed the same three distributions independently of the chosen model (Figure S4C). Next, we ran simulations with variable rates of driver mutations (μd), and again three regions of hidden heterogeneity emerged, consistent with the previous results (Figure S4D). Overall, these regions were robust to modifications of the adopted growth model. Finally, to estimate the extent of hidden heterogeneity in human tumors, we collected the cancer genomes of 6,000 samples from 32 tumor types profiled by TCGA and inferred clone population frequencies from the phylogeny of each human sample estimated from its set of somatic mutations and copy number alterations (Raynaud et al., 2018). We assigned each human sample, with 5–10 predicted clones, to one of the three groups (A, B, and C) by matching its PF distribution to the closest mean distribution obtained in each group by simulated tumors (Figure 4E). Importantly, distributions of PF-AUC values in human tumors were neither uniformly nor normally distributed but rather consistent with a trimodal distribution whose modes were significantly different against sub- and bootstrap sampling (Figure 4F, Transparent Methods, Tables S1 and S2). The identification of three classes with a different amount of hidden heterogeneity was thus independently confirmed by simulations using different growth models and inferred phylogenies from human samples. In the TCGA dataset, we found that different percentages of samples within each tumor type were assigned to the three classes, with almost all tumor types including samples in the high hidden heterogeneity class and 17 of 32 cancer types having more than 10% of their samples in this category (Figure 4G). Despite the fact that this analysis could be done on a limited number of samples (only samples with 5–10 clones were analyzed), preliminary observations indicated that hidden heterogeneity classes were often associated with different overall survival (Figure S4E). Overall, these results suggest that clone population frequencies and PF-AUC scores can provide a simple but effective way to discriminate among human tumors with similar observable intra-tumor heterogeneity, those with high and low hidden heterogeneity.

Discussion

Intra-tumor heterogeneity severely hinders durable and complete response to cancer therapy (Jamal-Hanjani et al., 2017, Siravegna et al., 2015). Its systematic assessment is often unfeasible in the clinic, and markers of its emergence and extent are far from being characterized. Evolutionary parameters, such as alteration and proliferation rates, remain poorly documented in human tumors, and evolutionary architectures can at best be algorithmically inferred. From this perspective, mathematical modeling of cancer progression provides an ideal framework to explore diverse evolutionary features and their impact on intra-tumor heterogeneity. Here, we investigated through extensive numerical simulations how clonal diversity emerges under a wide range of alteration rate (μ) and fitness (s) parameters, the latter positively associated with a cell replicative potential. By tracking clonal lineages for all simulated tumors, we first demonstrated that intra-tumor heterogeneity increases with the alteration rate and decreases with fitness and that high intra-tumor heterogeneity invariably corresponds to branching phylogenies. Interestingly, the transition from linear to branched evolution could be accurately parameterized in the space of parameters (μ, s) by . Given the correlation observed between the number of clones and the extent of branching measured by Tree score, this transition effectively describes the emergence of intra-tumor heterogeneity as a function of the evolutionary parameters, alteration rate, and fitness. Recently, rare passenger (Bozic et al., 2016) and driver mutations (Chowell et al., 2018) have been predicted to characterize small subclones that could go undetected by standard molecular profiling. Indeed, our simulations revealed that in tumors with high alteration rates and low fitness, the number of clones with a size greater than 1% of the total cell population decreases concomitantly with an increasing presence of small clones below the detectability threshold. Importantly, the broad range of parameters here analyzed allowed to show that the number of observable and hidden clones do not always correlate. This result implies that tumors with the same number of observable clones, for example, inferred from the analysis of human molecular profiles, could exhibit instead very different extent of intra-tumor heterogeneity. To address this limitation, we found that the extent of hidden heterogeneity can be estimated by clone population frequencies, providing a unique opportunity to re-assess inferred clonal diversity in human tumors. Indeed, our results show that all the analyzed cancer types include samples with high extent of hidden heterogeneity, potentially associated with survival differences. It should be noted, that distinct clone population frequencies were observed in simulated tumors with the same number of clones but different values of alteration rate and fitness. This association could thus prove useful to infer evolutionary parameters in human tumors. Finally, our approach relied on a simple model of clonal evolution that ignored spatial constraints, such as interactions with the microenvironment and mechanical constraints between cells, and it assumed global and constant parameters in each simulation. On the one hand, modifications of the adopted growth model including deleterious effects of passenger mutations, a variable number of expected drivers, or a variable alteration rate during a single simulation all led to concordant results. On the other hand, it will be important in the future to explore alternative models including, for example, spatial models of tumor evolution that could assess the effect of cell-cell interactions or different cell phenotypes on intra-tumor heterogeneity. We envision that with extended models and a growing availability of data from detailed intra-tumor molecular profiling, it will be possible to provide qualitative and quantitative endpoints to systematically characterize the tumor clonal architecture of each patient.

Limitations of the Study

We have limited our modeling framework to a discrete time branching process disregarding spatial and mechanical constraints, as well as the phenotype of subclonal populations. Such aspects are worth considering in the future. Comparison with human tumor samples requires reliable datasets to resolve phylogenetic reconstructions. Tumor phylogenies have been inferred using PhyloWGS whose accuracy depends on the number of mutations as well as the sequencing read depth.

Methods

All methods can be found in the accompanying Transparent Methods supplemental file.
  30 in total

1.  PyClone: statistical inference of clonal population structure in cancer.

Authors:  Andrew Roth; Jaswinder Khattra; Damian Yap; Adrian Wan; Emma Laks; Justina Biele; Gavin Ha; Samuel Aparicio; Alexandre Bouchard-Côté; Sohrab P Shah
Journal:  Nat Methods       Date:  2014-03-16       Impact factor: 28.547

2.  Conditional Selection of Genomic Alterations Dictates Cancer Evolution and Oncogenic Dependencies.

Authors:  Marco Mina; Franck Raynaud; Daniele Tavernari; Elena Battistello; Stephanie Sungalee; Sadegh Saghafinia; Titouan Laessle; Francisco Sanchez-Vega; Nikolaus Schultz; Elisa Oricchio; Giovanni Ciriello
Journal:  Cancer Cell       Date:  2017-07-27       Impact factor: 31.743

3.  Modeling the Subclonal Evolution of Cancer Cell Populations.

Authors:  Diego Chowell; James Napier; Rohan Gupta; Karen S Anderson; Carlo C Maley; Melissa A Wilson Sayres
Journal:  Cancer Res       Date:  2017-11-29       Impact factor: 12.701

4.  Dynamics of chronic myeloid leukaemia.

Authors:  Franziska Michor; Timothy P Hughes; Yoh Iwasa; Susan Branford; Neil P Shah; Charles L Sawyers; Martin A Nowak
Journal:  Nature       Date:  2005-06-30       Impact factor: 49.962

5.  Pan-cancer analysis of the extent and consequences of intratumor heterogeneity.

Authors:  Noemi Andor; Trevor A Graham; Marnix Jansen; Li C Xia; C Athena Aktipis; Claudia Petritsch; Hanlee P Ji; Carlo C Maley
Journal:  Nat Med       Date:  2015-11-30       Impact factor: 53.440

Review 6.  Tumor heterogeneity and the biology of cancer invasion and metastasis.

Authors:  I J Fidler
Journal:  Cancer Res       Date:  1978-09       Impact factor: 12.701

7.  Mutation and cancer: statistical study of retinoblastoma.

Authors:  A G Knudson
Journal:  Proc Natl Acad Sci U S A       Date:  1971-04       Impact factor: 11.205

8.  Genetic progression and the waiting time to cancer.

Authors:  Niko Beerenwinkel; Tibor Antal; David Dingli; Arne Traulsen; Kenneth W Kinzler; Victor E Velculescu; Bert Vogelstein; Martin A Nowak
Journal:  PLoS Comput Biol       Date:  2007-11       Impact factor: 4.475

9.  Mutational heterogeneity in cancer and the search for new cancer-associated genes.

Authors:  Michael S Lawrence; Petar Stojanov; Paz Polak; Gregory V Kryukov; Kristian Cibulskis; Andrey Sivachenko; Scott L Carter; Chip Stewart; Craig H Mermel; Steven A Roberts; Adam Kiezun; Peter S Hammerman; Aaron McKenna; Yotam Drier; Lihua Zou; Alex H Ramos; Trevor J Pugh; Nicolas Stransky; Elena Helman; Jaegil Kim; Carrie Sougnez; Lauren Ambrogio; Elizabeth Nickerson; Erica Shefler; Maria L Cortés; Daniel Auclair; Gordon Saksena; Douglas Voet; Michael Noble; Daniel DiCara; Pei Lin; Lee Lichtenstein; David I Heiman; Timothy Fennell; Marcin Imielinski; Bryan Hernandez; Eran Hodis; Sylvan Baca; Austin M Dulak; Jens Lohr; Dan-Avi Landau; Catherine J Wu; Jorge Melendez-Zajgla; Alfredo Hidalgo-Miranda; Amnon Koren; Steven A McCarroll; Jaume Mora; Brian Crompton; Robert Onofrio; Melissa Parkin; Wendy Winckler; Kristin Ardlie; Stacey B Gabriel; Charles W M Roberts; Jaclyn A Biegel; Kimberly Stegmaier; Adam J Bass; Levi A Garraway; Matthew Meyerson; Todd R Golub; Dmitry A Gordenin; Shamil Sunyaev; Eric S Lander; Gad Getz
Journal:  Nature       Date:  2013-06-16       Impact factor: 49.962

10.  Quantifying Clonal and Subclonal Passenger Mutations in Cancer Evolution.

Authors:  Ivana Bozic; Jeffrey M Gerold; Martin A Nowak
Journal:  PLoS Comput Biol       Date:  2016-02-01       Impact factor: 4.475

View more
  1 in total

Review 1.  Epithelial-mesenchymal transition in cancer stemness and heterogeneity: updated.

Authors:  Keywan Mortezaee; Jamal Majidpoor; Ebrahim Kharazinejad
Journal:  Med Oncol       Date:  2022-09-07       Impact factor: 3.738

  1 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.