Sayaka Miura (1,2), Karen Gomez (1,2,3), Oscar Murillo (1,4), Louise A Huuki (1,2), Tracy Vu (1,2), Tiffany Buturla (1,2), Sudhir Kumar (1,2,5)
1. Institute for Genomics and Evolutionary Medicine.
2. Department of Biology, Temple University, Philadelphia, PA, USA.
3. College of Physicians and Surgeons, Columbia University, New York, NY, USA.
4. Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.
5. Center for Excellence in Genome Medicine and Research, King Abdulaziz University, Jeddah, Saudi Arabia.
Abstract
Motivation: Analyses of data generated from bulk sequencing of tumors have revealed extensive genomic heterogeneity within patients. Many computational methods have been developed to infer the genotypes of tumor cell populations (clones) from bulk sequencing data. However, the relative and absolute accuracy of the available methods in estimating clone counts and clone genotypes is not yet known.

Results: We assessed the performance of nine methods, eight previously published and one new (CloneFinder), by analyzing computer-simulated datasets. CloneFinder, LICHeE, CITUP and cloneHD inferred clone genotypes with low error (<5% per clone) for a majority of datasets in which the tumor samples contained evolutionarily related clones. The methods did not perform well for datasets in which tumor samples contained mixtures of clones from different clonal lineages. Generally, cloneHD underestimated the number of clones and PhyloWGS overestimated it, while BayClone2, Canopy and Clomial required prior information regarding the number of clones. AncesTree and Canopy did not produce results for a large number of datasets. Overall, the deconvolution of clone genotypes from single nucleotide variant (SNV) frequency differences among tumor samples remains challenging, so there is a need to develop more accurate computational methods and robust software for clone genotype inference.

Availability and implementation: CloneFinder is implemented in Python and is available from https://github.com/gstecher/CloneFinderAPI.

Supplementary information: Supplementary data are available at Bioinformatics online.