Mamoru Kato, Daniel A Vasco, Ryuichi Sugino, Daichi Narushima, Alexander Krasnitz.
Abstract
Single-cell sequencing is a promising technology that can address cancer cell evolution by identifying genetic alterations in individual cells. In a recent study, genome-wide DNA copy numbers of single cells were accurately quantified by single-cell sequencing in breast cancers. Phylogenetic-tree analysis revealed genetically distinct populations, each consisting of homogeneous cells. Bioinformatics methods based on population genetics should be further developed to quantitatively analyse the single-cell sequencing data. We developed a bioinformatics framework that was combined with molecular-evolution theories to analyse copy-number losses. This analysis revealed that most deletions in the breast cancers at the single-cell level were generated by simple stochastic processes. A non-standard type of coalescent theory, the multiple-merger coalescent model, aided by approximate Bayesian computation fit well with the data, allowing us to estimate the population-genetic parameters in addition to false-positive and false-negative rates. The estimated parameters suggest that the cancer cells underwent sweepstake evolution, where only one or very few parental cells produced a descendent cell population. We conclude that breast cancer cells successively substitute in a tumour mass, and the high reproduction of only a portion of cancer cells may confer high adaptability to this cancer.
Keywords: bioinformatics; cancer genomics; coalescent theory; copy-number alteration; molecular evolution; single-cell sequencing
Year: 2017 PMID: 28989791 PMCID: PMC5627131 DOI: 10.1098/rsos.171060
Source DB: PubMed Journal: R Soc Open Sci ISSN: 2054-5703 Impact factor: 2.963
Figure 1. The nature of deletion alleles. Results from T10 only are shown because T16 showed essentially the same tendencies. (a) Profile patterns of the integer copy number (CN). The horizontal axis represents the chromosomal position. The general CN pattern ‘2→n→ … →m→2’ indicates that the copy number of a segment changed from 2 copies to n copies, … then to m copies and finally back to 2 copies along a chromosome. (b) Evolutionary model of deletions. Every deletion event (the inverted U-shaped marks shown in red and green) leaves a unique pair of left and right breakpoints as fingerprints on a homologous chromosome (blue lines). (c) Procedure to convert copy-number profiles into alleles. ‘L’ and ‘R’ represent positions to the left and right of breakpoints, and an ‘L–R’ pair defines a locus. The symbols ‘0’ and ‘1’ represent the ancestral and derived alleles, respectively. (d) Distribution of deletion allele lengths. We excluded deletion alleles larger than chromosome-level size (40 Mb). (e) Distributions of breakpoint locations. The locations were normalized with respect to chromosome lengths.
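The conversion in panel (c) can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes a diploid baseline of 2 copies, treats every segment below that baseline as one deletion, and ignores nested or overlapping deletions. The names `Segment`, `deletion_loci` and `allele_matrix` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representations, introduced only for this illustration.
Segment = Tuple[str, int, int, int]  # (chromosome, start, end, integer copy number)
Locus = Tuple[str, int, int]         # (chromosome, left breakpoint, right breakpoint)


def deletion_loci(profile: List[Segment], baseline: int = 2) -> Set[Locus]:
    """Return the L-R breakpoint pairs of segments whose copy number is below baseline."""
    return {(chrom, start, end) for chrom, start, end, cn in profile if cn < baseline}


def allele_matrix(
    profiles: Dict[str, List[Segment]], baseline: int = 2
) -> Tuple[List[Locus], Dict[str, List[int]]]:
    """Encode each cell as 0 (ancestral) or 1 (derived, deleted) at every observed locus."""
    per_cell = {cell: deletion_loci(p, baseline) for cell, p in profiles.items()}
    loci = sorted(set().union(*per_cell.values()))
    matrix = {
        cell: [1 if locus in found else 0 for locus in loci]
        for cell, found in per_cell.items()
    }
    return loci, matrix


profiles = {
    "cell1": [("chr1", 0, 100, 2), ("chr1", 100, 200, 1), ("chr1", 200, 300, 2)],
    "cell2": [("chr1", 0, 300, 2)],
    "cell3": [("chr1", 0, 100, 2), ("chr1", 100, 200, 1),
              ("chr1", 200, 250, 2), ("chr1", 250, 300, 0)],
}
loci, matrix = allele_matrix(profiles)
```

Two cells share a derived allele here only when their breakpoint pairs match exactly, which mirrors the idea in panel (b) that each deletion event leaves a unique pair of breakpoints.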
Figure 2. Phylogenetic trees and MMC. (a) Phylogenetic trees reconstructed by the neighbour-joining method. HP, AP, DP, PDP, MDP and MAP represent respective subpopulations of hypodiploid, aneuploid, diploid, primary diploid, metastatic diploid and metastatic aneuploid cells, which were defined previously [8]. OG represents the outgroup: no deletions at all sites. (b) Flow chart of our ABC. (c) The posterior distributions for the parameters of the MMC model. (d) The posterior distributions for the parameters of the Kingman population-growth model. (e) The posterior distributions for the parameters of the Kingman population-constant model. (f) Site-frequency spectrum under the MAP estimates. Sample size n = 23. (g) Distribution of the number of merged lineages under the MAP estimates. For (f,g), the results of 100 000 replications in the simulations were used.
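For readers unfamiliar with the site-frequency spectrum in panel (f), the following sketch shows how an unfolded spectrum is tabulated from a binary allele matrix. It uses the standard definition and is not taken from the authors' code; the function name and the toy matrix are hypothetical.

```python
from collections import Counter
from typing import List


def site_frequency_spectrum(matrix: List[List[int]]) -> List[int]:
    """Count sites whose derived allele (1) occurs in exactly k of n cells, for k = 1..n-1.

    Rows are cells and columns are sites. Sites where the derived allele is
    absent from, or present in, every sampled cell are not counted.
    """
    n = len(matrix)
    derived_counts = Counter(sum(column) for column in zip(*matrix))
    return [derived_counts.get(k, 0) for k in range(1, n)]


# Toy example with n = 4 cells and 3 sites.
toy = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 0, 1],
    [0, 1, 1],
]
sfs = site_frequency_spectrum(toy)
```

With the sample size n = 23 given in the caption, the spectrum would have 22 entries.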
Features and summary statistics. The reasons for selecting these features are listed in electronic supplementary material, table S1.
| feature | summary statistics |
|---|---|
| number of mutation sites | the number itself |
| allele frequencies at all sites | 10, 30, 50, 70 and 90% quantiles |
| distances between all cell pairs in a tree | 10, 30, 50, 70 and 90% quantiles |
| all branch lengths in a tree | 10, 30, 50, 70 and 90% quantiles |
| associations ( | 10, 30, 50, 70 and 90% quantiles |
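As a rough illustration of how the quantile-based summary statistics in this table could be computed, the sketch below uses Python's standard library. The function names are hypothetical, the interpolation rule (`method="inclusive"`) is an assumption because the source does not state one, and the associations feature is omitted because its description is truncated above.

```python
from statistics import quantiles
from typing import List, Sequence


def quantile_summary(values: Sequence[float]) -> List[float]:
    """Return the 10, 30, 50, 70 and 90% quantiles of values (at least two values needed)."""
    # quantiles(..., n=10) yields the nine cut points at 10%, 20%, ..., 90%.
    deciles = quantiles(values, n=10, method="inclusive")
    return [deciles[i] for i in (0, 2, 4, 6, 8)]


def summary_vector(
    n_sites: int,
    allele_freqs: Sequence[float],
    pair_distances: Sequence[float],
    branch_lengths: Sequence[float],
) -> List[float]:
    """Concatenate the site count with the quantiles of each remaining feature."""
    return (
        [float(n_sites)]
        + quantile_summary(allele_freqs)
        + quantile_summary(pair_distances)
        + quantile_summary(branch_lengths)
    )


example = quantile_summary(list(range(11)))
vector = summary_vector(3, [0.1, 0.5, 0.9], [1, 2, 3, 4], [0.5, 1.5])
```

A vector like this is what an approximate Bayesian computation would compare between observed and simulated data.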
MAP estimates. α and β represent the population-growth rate and the parameter of the distribution that describes the rate of multiple mergers, respectively. See the text for more information on α and β. Here, θ is the population mutation rate.
| parameter | the MMC model | Kingman population-constant model | Kingman population-growth model |
|---|---|---|---|
| α | 17.1 | n.a. | 17.7 |
| β | 1.64 | n.a. | n.a. |
| θ | 102.9 | 52.1 | 97.2 |
| false-positive rate | 0.012 | 0.013 | 0.010 |
| false-negative rate | 0.068 | 0.072 | 0.098 |
| false-positive sites | 89 | 72 | 70 |
| false-negative sites | 21 | 26 | 35 |