Multiregional Tumor Trees Are Not Phylogenies.

João M Alves, Tamara Prieto, David Posada.

Abstract

Tumor samples most often comprise a mixture of different cell lineages. Multiregional trees built from bulk mutational profiles do not consider this heterogeneity and can potentially lead to erroneous evolutionary inferences, including biased timing of somatic mutations, spurious parallel mutation events, and/or incorrect chronological ordering of metastatic events.
Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

Keywords:  bulk sequencing; intratumor genetic heterogeneity; multiregional studies; somatic evolution; tumor phylogenies

Year:  2017        PMID: 28780931      PMCID: PMC5549612          DOI: 10.1016/j.trecan.2017.06.004

Source DB:  PubMed          Journal:  Trends Cancer        ISSN: 2405-8025


Following the pioneering work by Gerlinger et al. [1], a number of studies have taken advantage of next-generation sequencing (NGS) data obtained from bulk tissue samples extracted from multiple tumor regions [2, 3, 4, 5, 6, 7, 8]. In these multiregional studies, ‘phylogenetic trees’ are inferred to depict the evolutionary history of the tumor, and therefore to unveil the temporal and spatial occurrence of mutations, or the origins of metastases. However, there are two types of trees that can be inferred from multiregional NGS bulk tumor data, namely, ‘sample/multiregional trees’ and ‘clone trees’, and these are not necessarily equivalent in terms of meaning or interpretation.

‘Sample trees’ are built using a genetic summary of each sample, most often its mutational profile (i.e., the set of mutations absent or present in a given sample) [3, 4, 5, 6]. Importantly, given the pervasiveness of intratumor genetic heterogeneity (ITH), and the lack of prior hypotheses about tumor history at the time of biopsy, tumor bulk samples will seldom correspond with isolated cell populations that share a common single ancestor – in evolutionary terms, tumor samples will often be heavily admixed and rarely reciprocally monophyletic [9] (Box 1). Instead, tumor bulk samples will usually contain multiple cell lineages, lacking a clear correspondence with evolutionary units (i.e., they are ill-defined from an evolutionary perspective). For that reason, sample trees will reflect overall similarity but not necessarily evolutionary history. As a simple analogy, imagine that we sequence pools of individuals from New York, London, Rio de Janeiro, and Singapore and that we infer a sample tree from the aggregated mutational profiles observed in each city. Given the high level of ethnic admixture in these cities, single branches of this tree might represent more than one lineage. Such a tree would lack a clear evolutionary meaning and would be better classified as a similarity dendrogram than as a phylogeny.

While one could be tempted to make the analogy between tumor sample trees and the ‘population/species trees’ [10] estimated by evolutionary biologists, the latter describe the evolutionary relationships among well-defined evolutionary units (populations or species), distinguishable in the presence of no or limited gene flow (i.e., limited admixture). In general, we expect bulk samples from tumors to be much more admixed than samples from organismal, natural populations. This is in part because the latter are usually surveyed according to recognizable geographical, environmental, or phenotypic features that help identify a priori more or less isolated, discrete groups of individuals. Moreover, samples of natural populations are usually composed of single individuals rather than pools, as in bulk tumor samples – the latter are perhaps more akin to metagenomic research. Current strategies for multiregional tumor sampling have to be, in general, much simpler than strategies for sampling natural populations, for multiple reasons, including more complex accessibility and smaller scale. Furthermore, the growth of many natural populations occurs along two dimensions, with notable exceptions, while presumably the growth of a 3D tumor could be much more intermingled.

In contrast to sample trees, ‘clone trees’ represent the evolutionary relationships among the different genetic cell lineages identified. Note that for the purpose of phylogenetic reconstruction, the definition of clone is not controversial; it is the set of cells identical for the portion of the genome studied. In evolutionary biology, clone trees would correspond to ‘gene trees’ (a ‘gene’ being any genomic region) or genealogies, and as such they can be studied using the same set of tools. Coming back to the previous analogy, a clonal tree would be equivalent to a tree whose tips are the different ethnicities/lineages living in New York, London, Rio de Janeiro, and Singapore. Such a tree would be a meaningful phylogeny, as each branch would represent exactly one lineage. On this basis, before inferring a clone tree, the clone(s) present in each tumor sample should be estimated. While a number of factors can complicate this task, particularly copy number variation, several sophisticated statistical algorithms have been developed to infer the clonal composition of tumors from bulk tumor data, usually by clustering mutations with similar variant allele frequency [11].
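The difference between a sample's mutational profile and its underlying clonal mixture can be illustrated with a toy forward model. The sketch below is not taken from the article: the clone genotypes, sample names, and mixing proportions are hypothetical, and it assumes a diploid tumor, heterozygous mutations, and no normal-cell contamination, so that the expected variant allele frequency (VAF) of a mutation is half the summed proportion of the clones carrying it.

```python
# Hypothetical clone genotypes over five sites (1 = mutation present).
# These values are illustrative and are not taken from the article's figures.
CLONES = {
    "A": [1, 1, 1, 0, 0],
    "B": [1, 1, 0, 1, 0],
    "C": [1, 0, 0, 0, 1],
}

# Hypothetical clone proportions in each bulk sample (each mixture sums to 1).
MIXTURES = {
    "I": {"A": 0.6, "B": 0.4},
    "II": {"A": 0.3, "C": 0.7},
    "III": {"C": 1.0},
}


def expected_vaf(mixture, clones):
    """Expected VAF per site, assuming diploidy, heterozygous mutations, no normal cells."""
    n_sites = len(next(iter(clones.values())))
    return [
        round(0.5 * sum(frac * clones[name][site] for name, frac in mixture.items()), 3)
        for site in range(n_sites)
    ]


def mutational_profile(vafs):
    """Presence/absence summary of a sample: 1 if the mutation is detected at all."""
    return [int(v > 0) for v in vafs]


if __name__ == "__main__":
    for sample, mixture in MIXTURES.items():
        vafs = expected_vaf(mixture, CLONES)
        print(sample, vafs, mutational_profile(vafs))
```

In this toy setting the presence/absence profile discards the frequency information that distinguishes a mixed sample from a pure one, which is the signal that VAF-clustering approaches try to use.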

Biased Inferences from Tumor ‘Sample Trees’

While the need to account for ITH within single biopsies is generally recognized by the cancer genomics community – multiple studies have already focused their evolutionary analysis on clonal sequences [7, 8] – a number of recent multiregional studies are based on sample trees built from bulk mutational profiles [4, 5, 6]. Importantly, at the heart of these studies is the implicit assumption that a tumor sample can be meaningfully summarized as the collection of mutations observed in that sample, or that only a single or dominant clone exists per sample that carries all mutations. Given the high level of ITH expected in most tumors [12], this assumption is not justified and can lead to biased inferences.

To illustrate the potential biases induced by tumor sample trees, consider the patient shown in Figure 1A, whose primary tumor harbors three genetically distinct cell clones. For simplicity, we assume that the tumor is diploid and that there is no stromal contamination. The three clones (true clonal sequences) share some mutations that reflect their evolutionary history (true clonal phylogeny). Figure 1B depicts a hypothetical multiregional sequencing study of this patient, where three arbitrary regions were sampled and sequenced. While all three clones were sampled, their proportion varies across samples. Note that in this case only the mutational profile for Sample III corresponds to a true clonal sequence (Clone C), while the mutational profiles obtained for the other two samples represent composites of Clones A and B and of Clones A and C, respectively. If we now build a maximum parsimony or a neighbor-joining tree using these composite clones [1, 3, 4, 6], we will wrongly infer that Mutation 5 occurred twice, independently in Sample II and Sample III. By contrast, a maximum parsimony tree of the inferred clonal sequences accurately recovers the true clonal history and the right order of mutations (Figure 1C).
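The effect described above can be reproduced with a small parsimony calculation. The following is a minimal sketch using Fitch's small-parsimony count on hand-written rooted trees; the genotypes are hypothetical (chosen so that Sample I mixes Clones A and B, Sample II mixes Clones A and C, and Sample III is pure Clone C) and are not the values shown in Figure 1. The tip "N" is an assumed mutation-free normal sample used as the root.

```python
# Hypothetical data, not the article's Figure 1 values. 1 = mutation present.
CLONES = {
    "N": [0, 0, 0, 0, 0],  # assumed normal (unmutated) outgroup
    "A": [1, 1, 1, 0, 0],
    "B": [1, 1, 0, 1, 0],
    "C": [1, 0, 0, 0, 1],
}
# Sample profiles are the union of the clones each sample contains.
SAMPLES = {
    "N": [0, 0, 0, 0, 0],
    "I": [1, 1, 1, 1, 0],    # Clones A + B
    "II": [1, 1, 1, 0, 1],   # Clones A + C
    "III": [1, 0, 0, 0, 1],  # Clone C only
}


def fitch_changes(tree, states):
    """Return (state set, minimum number of changes) for one site on a nested-tuple tree."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_changes(tree[0], states)
    right_set, right_cost = fitch_changes(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


def site_costs(tree, profiles):
    """Minimum number of changes required at each site on the given tree."""
    n_sites = len(next(iter(profiles.values())))
    return [
        fitch_changes(tree, {name: prof[i] for name, prof in profiles.items()})[1]
        for i in range(n_sites)
    ]


SAMPLE_TREES = [
    ("N", (("I", "II"), "III")),
    ("N", (("II", "III"), "I")),
    ("N", (("I", "III"), "II")),
]
CLONE_TREE = ("N", (("A", "B"), "C"))

if __name__ == "__main__":
    for tree in SAMPLE_TREES:
        costs = site_costs(tree, SAMPLES)
        print(tree, costs, sum(costs))
    print(CLONE_TREE, site_costs(CLONE_TREE, CLONES))
```

With these toy profiles the most parsimonious sample tree groups Samples I and II and needs two changes at Mutation 5, whereas every mutation needs only one change on the clone tree.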
Figure 1

Phylogenetic Analysis of Bulk Tumor Samples. (A) Left panel: clonal composition of a hypothetical primary tumor. Colored circles represent the three clones present (Clones A–C). Mid panel: true clonal sequences for five different genomic sites, where the dashed square indicates a somatic mutation. Right panel: true clonal history with red dots depicting the chronological order of mutations. Tumor most recent common ancestor (MRCA) highlighted as an internal node. (B) Left panel: bulk regional samples (I–III), with intermixed clones at different proportions. Mid panel: mutational profile (presence/absence) inferred, dashed square indicates presence of mutation. Right panel: inferred sample history using maximum parsimony. Red dots depicting the chronological order of mutations. (C) Left panel: bulk regional samples (I–III), with intermixed clones at different proportions. Mid panel: variant allele frequency (VAF) estimates for mutation at each sample, and inferred clonal sequences using the Clomial algorithm [14]. Right panel: inferred clonal history using maximum parsimony. Red dots depicting the chronological order of mutations.

Another potential issue associated with the use of sample trees is the inferred evolutionary relationships between primary tumors and metastases. Consider now a patient for whom four distinct samples have been sequenced from the primary tumor (P) and metastases (MI–MIII; Figure 2A). For simplicity, we assume that (i) there is no contamination from healthy cells, (ii) all the existing clones have been sampled at each location, and (iii) mutations accumulate linearly with time (strict molecular clock). If we build a sample tree using the mutational profiles (Figure 2C, right panel, ‘sample tree’), we might be tempted to (wrongly) conclude that MII occurred before MI, as MII diverges before MI in the sample tree, although in real life sampling is not perfect and clonal branching points do not need to coincide with metastatic seedings, so alternative explanations are possible. In addition, we would spuriously infer that Mutations 3 and 6 occurred in parallel in the primary tumor and in MII (Figure 2C, right panel). Furthermore, the long branch leading to the primary tumor sample suggests that it evolved much faster than the metastases, when in this example all lineages evolved at the same rate. This is because using a mutational profile to summarize a sample implies putting all mutations along a single (sample tree) branch, and therefore overestimates the absolute age of the sample and/or its rate of evolution if more than one clone is present. By contrast, if we use the observed variant allele frequencies to deconvolute the clones present in each sample, despite our inability in this case to distinguish Clones A and D (Figure 2D, left panel), we still infer a clonal tree that is congruent with the evolutionary history of this tumor (Figure 2D, right panel), being able to represent the right timing of metastases (MI–MII–MIII), lacking spurious parallel changes, and with only a minor deviation from the molecular clock.
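The branch-length inflation described above follows from simple set arithmetic. As a minimal sketch, assume hypothetical clones that each carry the same number of mutations (a strict clock); the clone contents and sample compositions below are invented for illustration and do not correspond to Figure 2.

```python
# Hypothetical clones, each with four mutations (strict clock). IDs are arbitrary labels.
CLONE_MUTATIONS = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 5, 6},
    "C": {1, 7, 8, 9},
}

# Hypothetical sample composition: the primary holds two clones, the metastasis one.
SAMPLE_CLONES = {
    "P": ["A", "B"],
    "M": ["C"],
}


def sample_profile(clone_names, clone_mutations):
    """Mutational profile of a bulk sample: the union of its clones' mutations."""
    profile = set()
    for name in clone_names:
        profile |= clone_mutations[name]
    return profile


def apparent_vs_true_depth(sample_clones, clone_mutations):
    """Compare the profile size with the deepest clone actually present in each sample."""
    result = {}
    for sample, names in sample_clones.items():
        apparent = len(sample_profile(names, clone_mutations))
        true_depth = max(len(clone_mutations[n]) for n in names)
        result[sample] = (apparent, true_depth)
    return result


if __name__ == "__main__":
    for sample, (apparent, true_depth) in apparent_vs_true_depth(
        SAMPLE_CLONES, CLONE_MUTATIONS
    ).items():
        print(sample, "profile size:", apparent, "clone depth:", true_depth)
```

Here every clone is four mutations away from the root, yet the mixed primary sample appears to carry six mutations on a single branch, which a clock-based reading would misinterpret as greater age or a faster rate.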
Figure 2

Incorrect Chronological Ordering of Metastatic Events Using Tumor Sample Trees. (A) Sampling scheme of geographically distinct tumor samples: one primary tumor (P) and three metastatic sites (MI–MIII). Colored circles represent the five cellular clones (i.e., A, B, C, D, and E). (B) Left panel: clonal sequences based on genotype information from 15 somatic mutations – dashed square indicates presence of mutation. Right panel: true clonal phylogenetic tree and geographical location of each clone. Chronological order of metastatic events, assuming a molecular clock, depicted in the gray bar below the tree. (C) Left panel: derived sample mutational profiles using presence/absence states. Right panel: inferred sample tree using maximum likelihood or maximum parsimony for the mutational profiles. Inferred chronological order of metastatic events, assuming a molecular clock, depicted in the gray bar below the tree. (D) Left panel: allele frequency estimates of each mutation per sample, and inferred clonal sequences (ICs) using Clomial [14]. Right panel: phylogenetic tree drawn from the inferred clones and inferred geographical location of each clone. Inferred chronological order of metastatic events, assuming a molecular clock, depicted in the gray bar below the clonal tree. Abbreviation: VAF, variant allele frequency.

To illustrate our point, here we used simplistic examples in which we purposely ignored real confounding factors such as contamination from healthy cells, unsampled/extinct clones, selection, or rate variation among lineages. These are oversimplifying assumptions that should not be adopted by default when analyzing real data. Nevertheless, in our opinion, these examples highlight that the use of mutational profiles and sample trees from bulk sequencing can compromise the study of tumor evolution. Given the spread of ITH, the types of biases we have described here – wrong clonal histories, spurious parallel changes, reversed timings of metastases, but also inaccurate divergence times and incorrect phylogeographic patterns – might be commonplace, suggesting that the interpretation of tumor sample trees might need to be reevaluated. The fact that some articles describe ‘incompatible mutations’ in their sample trees, together with our own analyses of published data sets [13], suggests that these types of problems do occur, although whether they are pervasive or not still needs to be proven. In any case, we encourage the use of clonal trees over sample trees even if only on conceptual grounds. That said, we acknowledge that clonal deconvolution is not an easy task and that, if it fails, it could lead to inaccurate inferences. Because of this, clonal estimation should be carried out with the necessary precautions, using benchmarked tools and ideally on high-depth NGS data.
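One way to screen a presence/absence matrix for ‘incompatible mutations’ is a pairwise compatibility check. The sketch below assumes that each mutation arises once and is never lost, and that the root is mutation-free; under those assumptions two mutations fit on the same tree only if their sets of carriers are disjoint or nested. The matrices are hypothetical, and in real data an incompatible pair could also reflect mutation loss, true recurrence, or genotyping error rather than sample admixture.

```python
from itertools import combinations

# Hypothetical matrices (1 = mutation present), not taken from the article.
SAMPLE_PROFILES = {
    "I": [1, 1, 1, 1, 0],    # mixture of Clones A and B
    "II": [1, 1, 1, 0, 1],   # mixture of Clones A and C
    "III": [1, 0, 0, 0, 1],  # Clone C only
}
CLONE_GENOTYPES = {
    "A": [1, 1, 1, 0, 0],
    "B": [1, 1, 0, 1, 0],
    "C": [1, 0, 0, 0, 1],
}


def carriers(matrix, site):
    """Names of the rows (samples or clones) that carry the mutation at this site."""
    return {name for name, row in matrix.items() if row[site] == 1}


def incompatible_pairs(matrix):
    """Return 1-based site pairs whose carrier sets overlap without being nested."""
    n_sites = len(next(iter(matrix.values())))
    pairs = []
    for i, j in combinations(range(n_sites), 2):
        a, b = carriers(matrix, i), carriers(matrix, j)
        if a & b and not (a <= b or b <= a):
            pairs.append((i + 1, j + 1))
    return pairs


if __name__ == "__main__":
    print("sample profiles:", incompatible_pairs(SAMPLE_PROFILES))
    print("clone genotypes:", incompatible_pairs(CLONE_GENOTYPES))
```

In this toy case the admixed sample profiles contain incompatible pairs, while the clone genotypes they were built from contain none.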
  12 in total

1.  Early and multiple origins of metastatic lineages within primary tumors.

Authors:  Zi-Ming Zhao; Bixiao Zhao; Yalai Bai; Atila Iamarino; Stephen G Gaffney; Joseph Schlessinger; Richard P Lifton; David L Rimm; Jeffrey P Townsend
Journal:  Proc Natl Acad Sci U S A       Date:  2016-02-08       Impact factor: 11.205

2.  Pan-cancer analysis of the extent and consequences of intratumor heterogeneity.

Authors:  Noemi Andor; Trevor A Graham; Marnix Jansen; Li C Xia; C Athena Aktipis; Claudia Petritsch; Hanlee P Ji; Carlo C Maley
Journal:  Nat Med       Date:  2015-11-30       Impact factor: 53.440

3.  Extremely high genetic diversity in a single tumor points to prevalence of non-Darwinian cell evolution.

Authors:  Shaoping Ling; Zheng Hu; Zuyu Yang; Fang Yang; Yawei Li; Pei Lin; Ke Chen; Lili Dong; Lihua Cao; Yong Tao; Lingtong Hao; Qingjian Chen; Qiang Gong; Dafei Wu; Wenjie Li; Wenming Zhao; Xiuyun Tian; Chunyi Hao; Eric A Hungate; Daniel V T Catenacci; Richard R Hudson; Wen-Hsiung Li; Xuemei Lu; Chung-I Wu
Journal:  Proc Natl Acad Sci U S A       Date:  2015-11-11       Impact factor: 11.205

Review 4.  Mitochondrial DNA differentiation during the speciation process in Peromyscus.

Authors:  J C Avise; J F Shapira; S W Daniel; C F Aquadro; R A Lansman
Journal:  Mol Biol Evol       Date:  1983-12       Impact factor: 16.240

5.  Genomic architecture and evolution of clear cell renal cell carcinomas defined by multiregion sequencing.

Authors:  Marco Gerlinger; Stuart Horswell; James Larkin; Andrew J Rowan; Max P Salm; Ignacio Varela; Rosalie Fisher; Nicholas McGranahan; Nicholas Matthews; Claudio R Santos; Pierre Martinez; Benjamin Phillimore; Sharmin Begum; Adam Rabinowitz; Bradley Spencer-Dene; Sakshi Gulati; Paul A Bates; Gordon Stamp; Lisa Pickering; Martin Gore; David L Nicol; Steven Hazell; P Andrew Futreal; Aengus Stewart; Charles Swanton
Journal:  Nat Genet       Date:  2014-02-02       Impact factor: 38.330

6.  A Big Bang model of human colorectal tumor growth.

Authors:  Andrea Sottoriva; Haeyoun Kang; Zhicheng Ma; Trevor A Graham; Matthew P Salomon; Junsong Zhao; Paul Marjoram; Kimberly Siegmund; Michael F Press; Darryl Shibata; Christina Curtis
Journal:  Nat Genet       Date:  2015-02-09       Impact factor: 38.330

7.  The evolutionary history of lethal metastatic prostate cancer.

Authors:  Christopher Foster; Douglas Easton; Ultan McDermott; David C Wedge; G Steven Bova; Gunes Gundem; Peter Van Loo; Barbara Kremeyer; Ludmil B Alexandrov; Jose M C Tubio; Elli Papaemmanuil; Daniel S Brewer; Heini M L Kallio; Gunilla Högnäs; Matti Annala; Kati Kivinummi; Victoria Goody; Calli Latimer; Sarah O'Meara; Kevin J Dawson; William Isaacs; Michael R Emmert-Buck; Matti Nykter; Zsofia Kote-Jarai; Hayley C Whitaker; David E Neal; Colin S Cooper; Rosalind A Eeles; Tapio Visakorpi; Peter J Campbell
Journal:  Nature       Date:  2015-04-01       Impact factor: 49.962

8.  Subclonal diversification of primary breast cancer revealed by multiregion sequencing.

Authors:  Lucy R Yates; Moritz Gerstung; Stian Knappskog; Christine Desmedt; Gunes Gundem; Peter Van Loo; Turid Aas; Ludmil B Alexandrov; Denis Larsimont; Helen Davies; Yilong Li; Young Seok Ju; Manasa Ramakrishna; Hans Kristian Haugland; Peer Kaare Lilleng; Serena Nik-Zainal; Stuart McLaren; Adam Butler; Sancha Martin; Dominic Glodzik; Andrew Menzies; Keiran Raine; Jonathan Hinton; David Jones; Laura J Mudie; Bing Jiang; Delphine Vincent; April Greene-Colozzi; Pierre-Yves Adnet; Aquila Fatima; Marion Maetens; Michail Ignatiadis; Michael R Stratton; Christos Sotiriou; Andrea L Richardson; Per Eystein Lønning; David C Wedge; Peter J Campbell
Journal:  Nat Med       Date:  2015-06-22       Impact factor: 53.440

Review 9.  Cancer evolution: mathematical models and computational inference.

Authors:  Niko Beerenwinkel; Roland F Schwarz; Moritz Gerstung; Florian Markowetz
Journal:  Syst Biol       Date:  2014-10-07       Impact factor: 15.683

10.  Inferring clonal composition from multiple sections of a breast cancer.

Authors:  Habil Zare; Junfeng Wang; Alex Hu; Kris Weber; Josh Smith; Debbie Nickerson; ChaoZhong Song; Daniela Witten; C Anthony Blau; William Stafford Noble
Journal:  PLoS Comput Biol       Date:  2014-07-10       Impact factor: 4.475
