Xuecong Fu1, Haoyun Lei2, Yifeng Tao2, Russell Schwartz1,2.
Abstract
MOTIVATION: Cancer develops through a process of clonal evolution in which an initially healthy cell gives rise to progeny gradually differentiating through the accumulation of genetic and epigenetic mutations. These mutations can take various forms, including single-nucleotide variants (SNVs), copy number alterations (CNAs) or structural variations (SVs), with each variant type providing complementary insights into tumor evolution as well as offering distinct challenges to phylogenetic inference.
Year: 2022 PMID: 35758777 PMCID: PMC9236577 DOI: 10.1093/bioinformatics/btac253
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.931
Fig. 1. Overview of the tumor evolution reconstruction method of TUSV-ext. (1) Multi-regional samples are collected from one or more tumor sites or progression stages and are assumed to contain different compositions of a common set of clones; we assume the samples have been sequenced and that various variant types have been called from them. (2) We subsample the structural variants and SNVs as needed and preprocess the variant information into three matrices: a variant copy number matrix F, a positional encoding matrix Q and a breakpoint pairing matrix G. (3) We then apply the TUSV-ext algorithm to deconvolve the mixed samples into a set of clones, each defined by a variant set, a copy number profile and a frequency in each sample, together with an inferred phylogeny on the clones. (4) Using the inferred frequencies and segmental copy number information of each clone, we assign unsampled breakpoints and SNVs to the inferred phylogeny to obtain a comprehensive evolutionary trajectory.
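The deconvolution in step (3) can be pictured with a toy forward model. The sketch below is illustrative only: it assumes that the observed matrix F is a linear mixture of per-clone copy number profiles C weighted by per-sample clone frequencies U, which is an assumption made for this example and not a statement of the exact TUSV-ext objective. The function name `mix` and the toy values are hypothetical.

```python
def mix(U, C):
    """Return F where F[s][j] = sum over clones k of U[s][k] * C[k][j].

    U: samples x clones frequency matrix (each row sums to 1).
    C: clones x variants integer copy number matrix.
    Assumption: a plain linear mixture, used here only for illustration.
    """
    n_cols = len(C[0])
    return [
        [sum(u_k * C[k][j] for k, u_k in enumerate(row)) for j in range(n_cols)]
        for row in U
    ]


# Hypothetical toy data: 2 samples, 2 clones, 3 variant columns.
U = [[0.75, 0.25],
     [0.50, 0.50]]
C = [[1, 0, 2],
     [1, 1, 3]]

F = mix(U, C)
for row in F:
    print(row)
```

Deconvolution runs this model in reverse: given only F, it searches for a U and C (and a tree relating the rows of C) that reproduce F as closely as possible.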
Essential parameters and variables
| Notation | Meaning |
|---|---|
| | The number of breakpoints |
| | The number of SNVs |
| | The number of haploid copy number segments |
| | The number of leaves in the phylogenetic tree |
| | The number of total clones in the phylogenetic tree, |
| | The number of samples |
| | The maximum allowed copy number for each segment |
Fig. 2. Results on task 1 with varying mutation rates. (a) Root mean square error (RMSE) of the estimated C. (b) RMSE of the estimated U. (c) Normalized Robinson-Foulds distance between the estimated trees and the true tree. All experiments were run with a time limit of 1000 s per iteration. Boxes show two quartiles and whiskers show the rest of the distribution except for outliers.
Fig. 3. Results on task 1 with varying sample numbers. (a) Root mean square error (RMSE) of the estimated C. (b) RMSE of the estimated U. (c) Normalized Robinson-Foulds distance between the estimated trees and the true tree. All experiments were run with a time limit of 1000 s per iteration. Boxes show two quartiles and whiskers show the rest of the distribution except for outliers.
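For readers unfamiliar with the two task 1 metrics, the following minimal sketch shows one common way to compute them. It is not the evaluation code used for the figures: the tree encoding (a dictionary from parent to children), the restriction to rooted leaf-labelled trees, and the normalization by the total number of clades are all assumptions made for this example, and the function names are hypothetical.

```python
import math


def rmse(est, true):
    """Root mean square error over all entries of two equally shaped matrices."""
    sq = [(e - t) ** 2 for er, tr in zip(est, true) for e, t in zip(er, tr)]
    return math.sqrt(sum(sq) / len(sq))


def clades(children, root):
    """Leaf sets below every internal node except the root."""
    found = set()

    def leaves_under(node):
        kids = children.get(node, [])
        if not kids:  # a node with no children is a leaf
            return frozenset([node])
        leaves = frozenset().union(*(leaves_under(k) for k in kids))
        if node != root:
            found.add(leaves)
        return leaves

    leaves_under(root)
    return found


def normalized_rf(children_a, root_a, children_b, root_b):
    """Clades found in exactly one tree, divided by the total clade count."""
    a, b = clades(children_a, root_a), clades(children_b, root_b)
    total = len(a) + len(b)
    return len(a ^ b) / total if total else 0.0


# Hypothetical toy inputs.
true_tree = {"r": ["x", "c"], "x": ["a", "b"]}  # ((a, b), c)
est_tree = {"r": ["a", "y"], "y": ["b", "c"]}   # (a, (b, c))

print(rmse([[1, 2], [3, 4]], [[1, 2], [3, 6]]))
print(normalized_rf(true_tree, "r", est_tree, "r"))
print(normalized_rf(true_tree, "r", true_tree, "r"))
```

Under this normalization a value of 0 means the two trees contain the same clades and a value of 1 means they share none.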
Fig. 4. Results on task 2 with different sample sizes. (a) Accuracy of determining the number of clones. The upper dashed line shows the best performance, in which the number of clones is predicted accurately, and the lower dashed line shows the lower bound of the relative distance when it reaches the maximum clone number. (b) Average precision of breakpoints. (c) Average precision of SNVs. (d) Average precision of breakpoint-SNV co-clustering. Boxes show two quartiles and whiskers show the rest of the distribution except for outliers.
Fig. 5. Results on task 2 with different clone numbers (or different total mutation rates) when the mutation rate per branch remains λ = 5. (a) Accuracy of determining the number of clones. The upper dashed line shows the best performance, in which the number of clones is predicted accurately, and the lower dashed lines show the lower bound of the relative distance when it reaches the maximum clone number. (b) Average precision of breakpoints. (c) Average precision of SNVs. (d) Average precision of breakpoint-SNV co-clustering. Boxes show two quartiles and whiskers show the rest of the distribution except for outliers.
Fig. 6. Evaluation of model robustness using simulated data with mutation rate , 7 clones and 5 random subsamples, with an upper bound on the number of breakpoints of 80 or 120 and an upper bound on the sum of breakpoints and SNVs of 120 or 180, respectively. Time limits of 500, 1000 and 5000 s per iteration were tested for each subsample. Five different random starts with 12 iterations each were run in each setting. (a) Accuracy of determining the number of clones. (b) Average precision of breakpoints. (c) Average precision of SNVs. (d) Average precision of breakpoint-SNV co-clustering. Boxes show two quartiles and whiskers show the rest of the distribution except for outliers.
Fig. 7. Results for data from prostate cancer patient A32. (a) Inferred consensus tree structure with a subset of variants/mutations. (b) Comparison of the proportions of all 9 clones across samples.