Jie Liu, John T Halloran, Jeffrey A Bilmes, Riza M Daza, Choli Lee, Elisabeth M Mahen, Donna Prunkard, Chaozhong Song, Sibel Blau, Michael O Dorschner, Vijayakrishna K Gadi, Jay Shendure, C Anthony Blau, William S Noble.
Abstract
A comprehensive characterization of tumor genetic heterogeneity is critical for understanding how cancers evolve and escape treatment. Although many algorithms have been developed for capturing tumor heterogeneity, they are designed for analyzing either a single type of genomic aberration or individual biopsies. Here we present THEMIS (Tumor Heterogeneity Extensible Modeling via an Integrative System), which allows for the joint analysis of different types of genomic aberrations from multiple biopsies taken from the same patient, using a dynamic graphical model. Simulation experiments demonstrate higher accuracy of THEMIS over its ancestor, TITAN. The heterogeneity analysis results from THEMIS are validated with single cell DNA sequencing from a clinical tumor biopsy. When THEMIS is used to analyze tumor heterogeneity among multiple biopsies from the same patient, it helps to reveal the mutation accumulation history, track cancer progression, and identify the mutations related to treatment resistance. We implement our model via an extensible modeling platform, which makes our approach open, reproducible, and easy for others to extend.
Year: 2017 PMID: 29208983 PMCID: PMC5717219 DOI: 10.1038/s41598-017-16813-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Software tools for characterizing within-patient and intra-tumor heterogeneity and their features, including whether they capture SNVs, CNAs and structural variants (SVs), whether they support multiple biopsy analysis, and their key models and algorithms.
| Software | Year | SNV | CNA | SV | Multiple | Model/Algorithm |
|---|---|---|---|---|---|---|
| OncoSNP | 2010 | ✓ | Mixture model, EM, Bayesian methods | |||
| TuMult | 2010 | ✓ | ✓ | Breakpoint distance | ||
| GRAFT | 2012 | ✓ | Partial maximum likelihood | |||
| ABSOLUTE | 2012 | ✓ | ✓ | Maximum likelihood | ||
| TrAp | 2013 | ✓ | Exhaustive search under constraints | |||
| THetA | 2013 | ✓ | Maximum likelihood | |||
| CancerTiming | 2013 | ✓ | Maximum likelihood | |||
| OncoSNP-seq | 2013 | ✓ | Mixture model, EM, Bayesian methods | |||
| PyClone | 2014 | ✓ | ✓ | Dirichlet Process, beta-binomial/MCMC | ||
| SciClone | 2014 | ✓ | ✓ | Beta mixture model/variational Bayes | ||
| Clomial | 2014 | ✓ | ✓ | Binomial mixture model, EM | ||
| CloneHD | 2014 | ✓ | ✓ | ✓ | HMM, EM, variational Bayes | |
| MEDICC | 2014 | ✓ | ✓ | Finite state transducer, minimum-event distance | ||
| TITAN | 2014 | ✓ | Two-chain factorial HMM/EM | |||
| SubcloneSeeker | 2014 | ✓ | ✓ | ✓ | Clustering, enumeration and co-localization prediction | |
| BTP | 2014 | ✓ | Binary tree partition | |||
| BreakDown | 2014 | ✓ | Maximum likelihood | |||
| PhyloSub | 2014 | ✓ | ✓ | Tree-structured stick-breaking process prior, MCMC | ||
| BayClone | 2015 | ✓ | ✓ | Categorical Indian Buffet Process | ||
| PhyloWGS | 2015 | ✓ | ✓ | ✓ | Tree-structured stick-breaking process prior, MCMC | |
| CITUP | 2015 | ✓ | ✓ | Quadratic integer programming | ||
| LICHeE | 2015 | ✓ | ✓ | Clustering and evolutionary constraint network | ||
| AncesTree | 2015 | ✓ | ✓ | Integer linear programming | ||
| SPRUCE | 2016 | ✓ | ✓ | ✓ | Combinatorial enumeration | |
| Canopy | 2016 | ✓ | ✓ | ✓ | MCMC | |
| THEMIS (our work) | 2017 | ✓ | ✓ | | ✓ | Dynamic graphical models |
Figure 1. Example THEMIS input observations and the corresponding inferred outputs. (a) Inputs to THEMIS: allelic ratio, log ratio, and genomic position. Somatic mutation sites are indicated by blue diamonds. (b) The THEMIS outputs indicate two tumor clones in the biopsy: a parent tumor clone with 40% cell prevalence and a child tumor clone with 35% cell prevalence. The CNAs of the two tumor clones are shown as color bars across the genome, and the genomic positions hosting an SNV are indicated by blue diamonds.
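The two per-site input signals named in Figure 1(a) can be illustrated with a minimal Python sketch. It assumes a common convention that this record does not itself spell out: the allelic ratio is the fraction of reads carrying the reference allele at a heterozygous site, and the log ratio is the base-2 logarithm of tumor read depth over normal read depth, with no further normalization. The function names and the read counts are hypothetical, and the actual THEMIS preprocessing may differ.

```python
import math


def allelic_ratio(ref_reads: int, total_reads: int) -> float:
    """Fraction of reads carrying the reference allele at one site (assumed definition)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return ref_reads / total_reads


def log_ratio(tumor_depth: float, normal_depth: float) -> float:
    """log2 of tumor over normal read depth; 0 means no relative change (assumed definition)."""
    if tumor_depth <= 0 or normal_depth <= 0:
        raise ValueError("depths must be positive")
    return math.log2(tumor_depth / normal_depth)


# Made-up sites: (chromosome, position, reference reads, tumor depth, normal depth)
sites = [
    ("chr1", 1_000_000, 30, 60, 60),
    ("chr1", 2_000_000, 45, 60, 60),
    ("chr2", 500_000, 20, 30, 60),
]

# One observation per site, keeping the genomic position alongside both signals.
observations = [
    {
        "chrom": chrom,
        "pos": pos,
        "allelic_ratio": allelic_ratio(ref, tumor),
        "log_ratio": log_ratio(tumor, normal),
    }
    for chrom, pos, ref, tumor, normal in sites
]

for obs in observations:
    print(obs)
```

Under these assumed definitions, a balanced two-copy site gives an allelic ratio near 0.5 and a log ratio near 0. Departures in either signal are the kind of evidence a copy number model works from.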
Figure 2. Results of single cell validation experiments. (a) Cell category predicted by the Bayesian classifier: normal cell (green), parent tumor cell (red), child tumor cell (purple), and unknown due to low sequencing quality (grey). (b) The three types of regions used to distinguish the three types of cells: clonal 2-copy regions, clonal LOH regions, and subclonal LOH regions. (c) Histograms of the relative coverage rate in clonal LOH segments and subclonal LOH segments show that cell 34 is normal, cell 26 is from the parent tumor clone, and cell 1 is from the child tumor clone. Red dotted lines in the histograms indicate the expected coverage rates in the cells.
Figure 3. Experiments and results from joint analysis of three biopsies from the same patient. (a) Collection of the three biopsies during three stages of treatment, and the tumor clones inferred in the biopsies. (b) The recovered phylogenetic tree and the mutations accumulated on each edge. The mutations are shown on a genome-wide plot; the two numbers on each edge are the number of germline heterozygous sites affected by CNAs and the number of SNVs. (c) Signaling pathways (a simplified version of the signal transduction pathways from Wikipedia) and the number of genes with copy number changes (copy gain and copy loss) at different stages of cancer progression. * denotes that at least one of the mutated genes is a core component of the signaling pathway.
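The edge annotation described in Figure 3(b) can be read as a small tree data structure. The Python sketch below is only an illustration of that reading: each clone stores the two counts on the edge from its parent, and a traversal sums them from the root to give the mutations each clone has accumulated. The class, the function, the clone names and all numbers are hypothetical placeholders, not values from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Clone:
    """A tree node; the counts describe the edge leading into this clone."""
    name: str
    cna_het_sites: int = 0  # germline heterozygous sites affected by CNAs on the edge
    snvs: int = 0           # SNVs gained on the edge
    children: list["Clone"] = field(default_factory=list)


def accumulated(node: Clone, inherited=(0, 0), out=None) -> dict:
    """Sum the edge counts along the path from the root to every clone."""
    if out is None:
        out = {}
    total = (inherited[0] + node.cna_het_sites, inherited[1] + node.snvs)
    out[node.name] = total
    for child in node.children:
        accumulated(child, total, out)
    return out


# Made-up tree: a normal root, one parent clone, and two child clones.
tree = Clone("normal", children=[
    Clone("parent", cna_het_sites=1000, snvs=50, children=[
        Clone("child_a", cna_het_sites=200, snvs=10),
        Clone("child_b", cna_het_sites=300, snvs=20),
    ]),
])

totals = accumulated(tree)
for name, (cna_sites, snv_count) in totals.items():
    print(f"{name}: {cna_sites} CNA-affected het sites, {snv_count} SNVs")
```

Storing counts on edges rather than on nodes keeps each mutation attributed to the step at which it arose, which is what lets the tree be read as an accumulation history.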
Figure 4. The THEMIS model.