Dimitrios V Vavoulis (1,2,3,4), Anthony Cutts (1,4), Jenny C Taylor (2,3), Anna Schuh (1,3,4,5)
1. Department of Oncology, University of Oxford, Oxford, OX3 7DQ, UK.
2. Nuffield Department of Medicine, Wellcome Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK.
3. NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Trust, Oxford, OX3 9DU, UK.
4. Department of Oncology, Molecular Diagnostic Centre, University of Oxford, Oxford, OX3 9DU, UK.
5. Department of Haematology, Oxford University Hospitals NHS Trust, Oxford, OX3 9DU, UK.
Abstract
MOTIVATION: Tumours are composed of distinct cancer cell populations (clones), which continuously adapt to their local micro-environment. Standard methods for clonal deconvolution seek to identify groups of mutations and estimate the prevalence of each group in the tumour, while accounting for its purity and copy number profile. These methods have been applied to cross-sectional data and to longitudinal data after discarding information on the timing of sample collection. Two key questions are how such information can be incorporated in the analysis and whether there is any benefit in doing so.
RESULTS: We developed a clonal deconvolution method that explicitly incorporates the temporal spacing of longitudinally sampled tumours. By merging a Dirichlet Process Mixture Model with Gaussian Process priors and using as input a sequence of several sparsely collected samples, our method can reconstruct the temporal profile of the abundance of any mutation cluster supported by the data as a continuous function of time. We benchmarked our method on whole genome, whole exome and targeted sequencing data from patients with chronic lymphocytic leukaemia, on liquid biopsy data from a patient with melanoma, and on synthetic data. We found that incorporating information on the timing of tissue collection improves model performance, as long as data of sufficient volume and complexity are available for estimating free model parameters. Thus, our approach is particularly useful when collecting a relatively long sequence of tumour samples is feasible, as in liquid cancers (e.g. leukaemia) and liquid biopsies.
AVAILABILITY AND IMPLEMENTATION: The statistical methodology presented in this paper is freely available at github.com/dvav/clonosGP.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
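To illustrate what "the abundance of a mutation cluster as a continuous function of time" can mean in practice, the following is a minimal sketch and not the clonosGP implementation. It assumes that abundances for a single cluster have already been estimated at a few unevenly spaced time points (the values below are hypothetical), and it applies plain Gaussian Process regression on the logit scale with a squared-exponential kernel. The method in the abstract instead places Gaussian Process priors inside a Dirichlet Process Mixture Model and infers clusters and trajectories jointly; the function names, kernel choice and hyperparameters here are illustrative assumptions.

```python
import numpy as np

def sq_exp_kernel(t1, t2, amplitude=1.0, lengthscale=1.0):
    """Squared-exponential covariance between two sets of time points."""
    d = t1[:, None] - t2[None, :]
    return amplitude**2 * np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior_mean(t_obs, y_obs, t_new, amplitude=1.0, lengthscale=1.0, noise=0.1):
    """Posterior mean of a zero-mean GP at t_new, given noisy observations at t_obs."""
    K = sq_exp_kernel(t_obs, t_obs, amplitude, lengthscale)
    K += noise**2 * np.eye(len(t_obs))          # observation noise on the diagonal
    K_star = sq_exp_kernel(t_new, t_obs, amplitude, lengthscale)
    return K_star @ np.linalg.solve(K, y_obs)

def logit(p):
    return np.log(p) - np.log1p(-p)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical data: sampling times (months) are deliberately uneven,
# and phi_obs are made-up abundances of one mutation cluster.
t_obs = np.array([0.0, 3.0, 4.0, 9.0, 15.0])
phi_obs = np.array([0.10, 0.25, 0.30, 0.70, 0.85])

# Work on the logit scale so the reconstructed curve stays inside (0, 1).
t_grid = np.linspace(0.0, 15.0, 61)
latent = gp_posterior_mean(t_obs, logit(phi_obs), t_grid,
                           amplitude=2.0, lengthscale=5.0, noise=0.1)
phi_grid = sigmoid(latent)   # cluster abundance on a dense time grid
```

Because the kernel depends on the actual time differences between samples, two samples taken one month apart constrain each other far more strongly than two samples taken six months apart. A model that ignores the timing of sample collection cannot use this information.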