Young Mi Kwon, Kevin Gori, Naomi Park, Nicole Potts, Kate Swift, Jinhong Wang, Maximilian R Stammnitz, Naomi Cannell, Adrian Baez-Ortega, Sebastien Comte, Samantha Fox, Colette Harmsen, Stewart Huxtable, Menna Jones, Alexandre Kreiss, Clare Lawrence, Billie Lazenby, Sarah Peck, Ruth Pye, Gregory Woods, Mona Zimmermann, David C Wedge, David Pemberton, Michael R Stratton, Rodrigo Hamede, Elizabeth P Murchison.
Abstract
Devil facial tumour 1 (DFT1) is a transmissible cancer clone endangering the Tasmanian devil. The expansion of DFT1 across Tasmania has been documented, but little is known of its evolutionary history. We analysed genomes of 648 DFT1 tumours collected throughout the disease range between 2003 and 2018. DFT1 diverged early into five clades, three spreading widely and two failing to persist. One clade has replaced others at several sites, and rates of DFT1 coinfection are high. DFT1 gradually accumulates copy number variants (CNVs), and its telomere lengths are short but constant. Recurrent CNVs reveal genes under positive selection, sites of genome instability, and repeated loss of a small derived chromosome. Cultured DFT1 cell lines have increased CNV frequency and undergo highly reproducible convergent evolution. Overall, DFT1 is a remarkably stable lineage whose genome illustrates how cancer cells adapt to diverse environments and persist in a parasitic niche.
Year: 2020 PMID: 33232318 PMCID: PMC7685465 DOI: 10.1371/journal.pbio.3000926
Source DB: PubMed Journal: PLoS Biol ISSN: 1544-9173 Impact factor: 8.029
Fig 1. DFT1 phylogeny and lineage dynamics.
(A) Phylogenetic tree constructed using 550 SNVs and 932 CNVs genotyped across 639 DFT1 tumours (9 tumours were excluded from tree due to missing data). Each tip represents a tumour, coloured by clade. Branch lengths are proportional to number of variants, not evolutionary time, and high-resolution bootstrapped tree is available in S1 Fig. Tumours sampled in the north-east of Tasmania (Wukalina), the putative origin of DFT1, are indicated with arrowheads. SNV and CNV genotypes are available in S2 and S3 Tables, respectively. (B) Map of Tasmania showing distribution of DFT1 clades represented by 593 tumours in (A) and S1 Table (metastases, tumours involving captive devils, and cases of repeated sampling of individual tumours are excluded (S1 Table)). Each tumour is represented by a coloured symbol. Mapping coordinates have been randomly offset in heavily sampled labelled locations in order to aid visualisation. Sample density reflects trapping effort, not tumour prevalence. Tumour clade data and geographical coordinates are available in S1 Table. Map outline was obtained from https://library.unimelb.edu.au/collections/map_collection/map_collection_outline_maps. (C) Spatial spread of DFT1 clades. Each tumour is represented by a coloured symbol. Arrows represent putative spatial movements interpreted from the phylogeny, and year when the DFT1 clade was first observed in each location within the set of sampled tumours is marked. Dotted lines demarcate inferred distributions of clades A2 and B before sampling for this study began, with arrows illustrating inferred tumour migrations out of these core areas. Tumour clade data, sampling dates, and geographical coordinates are available in S1 Table. Map outline was obtained from https://library.unimelb.edu.au/collections/map_collection/map_collection_outline_maps. (D) DFT1 clade distribution in nine locations indicated on map in (B) between 2003 and 2018. Distribution of three clade A1 subgroups is shown in Forestier. 
Tumour population size reflects sampling effort, not disease incidence, and is not comparable between sites. Years with consistent sampling efforts within sites are marked with black dots. All devils were removed from the Forestier Peninsula in 2012 (arrow). Data are available in S1 Table. (E) DFT1 coinfection. The number of devils with between two and nine independently sampled DFT1 tumours is shown (277 devils had only one tumour sampled, not shown on plot). Those devils for which genotypes of all sampled tumours were indistinguishable or could be distinguished only by variation private to the individual are designated as “single infection.” Individual devils hosting tumours with distinct genotypes known to occur in other devils in the population are labelled “coinfection,” although in some cases this might represent sampling of the index tumour within which the new variation arose (S4 Table). An image of a coinfected devil (Devil 165) with tumour genotypes labelled is shown on the right. Underlying data are available in S4 Table. (F) Facial distribution of DFT1 tumours. Diagram was constructed by superimposing facial tumour locations of 96 DFT1s. Raw data are available in S1 Data in https://doi.org/10.5281/zenodo.4046235. CNV, copy number variant; DFT1, devil facial tumour 1; km, kilometre; MRCA, most recent common ancestor; SNV, single nucleotide variant.
Fig 2. Copy number variation in DFT1.
(A) Distribution of diploid DFT1 CNV states and widths, measured in bp. Subclonal copy number states are labelled 1.5, 2.5, and 3.5, and represent subclonal states between the two closest integer states. CNVs occurring after a whole genome duplication event are excluded. Gain and loss CNVs with CN2, loss CNVs with CN >2, and gain CNVs with CN <2 represent back mutations. The CNV detection limit was 500 kb (dotted line). Data are available in S3 Table. (B) Representative DFT1 copy number plots. Each dot represents normalised read coverage within a 100-kb genomic window. Tumour identities are, top to bottom, 377T1 (clade E), 56T2 (clade C), 228T1 (clade A2), and 209T3 (clade A1). The diploid tumour, 228T1, has lost M5, whose CNVs are visible on chromosomes 2 and 5 (red arrows). A CNV encompassing major histocompatibility complex class I component B2M was acquired in the common ancestor of clades A, B, C, and D (blue arrows). Copy number plots for all tumours are available in S2 Data in https://doi.org/10.5281/zenodo.4046235. (C) Rate of DFT1 CNV acquisition. Each tumour is represented by a grey dot, and labels mark 1 January of the labelled year. Sets of CNVs co-occurring with the same copy number in the same samples are counted once. CNVs are called relative to the reference genome. Grey shading represents regression standard error, and upper and lower lines represent prediction intervals. We corrected for ploidy in tetraploid tumours, and cell lines are excluded. Data are available in S1 Table. (D) DFT1 CNV chromosome map. Grey bars represent chromosomes with scale on left in Mb. Blue and red bars represent copy number losses and gains, respectively. Each bar is an independent CNV occurrence, and CNVs shared between tumours via a common ancestor are illustrated once. Each CNV step away from CN2 is illustrated separately. Candidate driver genes within frequently amplified regions are annotated, and complete lists of genes within CNV intervals are found in S6 Table.
CNVs associated with M5 are indicated. HMGA2 is not annotated in the reference genome, Devil7.1, but is inferred to be present in the labelled interval based on cross-species genome alignment. CNVs occurring exclusively in DFT1 cell lines are excluded. CNV data are available in S3 Table. (E) DFT1 CNVs are more clustered than expected by chance. Number and length of overlapping CNVs in observed data and in data derived from 2,000 simulations. CNV data are available in S3 Table. (F) Loss of small derived chromosome M5 in diploid and tetraploid DFT1s and across DFT1 clades. M5 was acquired in a DFT1 common ancestor and is composed of fragments of chromosomes 2, 5, and X; this chromosome has been repeatedly lost in the DFT1 lineage (cartoon). The Fisher exact tests compare the observed distribution of M5 losses to those expected by chance, with the latter calculated assuming equal opportunity for M5 loss across all M5–positive tumours, correcting for change in opportunity caused by tetraploidy. Data are available in S8 Table. (G) Frequency of CNV breakpoint reuse within 567 DFT1 biopsies (cell line breakpoints excluded). Ends of chromosomes, M5 CNVs, and instances of reuse within individual samples are excluded from count. Simulated data are derived from 2,000 neutral simulations. Data are available in S9 Table. (H) Number of CNVs in Tasmanian devil cancers of different histotypes. “Lymphoma” includes two unspecified cutaneous round cell tumours (S1 Table). CNVs exclusive to DFT1 cell lines are excluded. Number of CNVs is relative to the reference genome. Each tumour is plotted as a dot, and each non-DFT data point represents an independent clone except for the three dots representing tumours 106T1, 106T2, and 106T3, which are separate tumours belonging to a single non-DFT cancer. Data are available in S1 and S3 Tables. (I) Frequency of CNV breakpoint reuse across Tasmanian devil cancers. 
Number of reused breakpoints between pairs or trios of devil cancer groups are shown, with colours indicating the number of breakpoints within each category which are also reused within individual devil cancer cohorts. DFT1 cell line CNVs, ends of chromosomes, M5 CNVs, and reuse within individual samples are excluded from count. Simulated data are derived from 2,000 neutral simulations of datasets of the same size. Simulated data are coloured for breakpoint reuse within group using the same colour key as used for real data, as shown on the plot. None of the reused breakpoints were found in the genomes of normal devil tissues, including those of matched hosts (S1 Table). Non-DFT, spontaneously arising non-transmissible devil cancers that are neither DFT1 nor DFT2. Data are available in S9 Table. bp, base pair; CI, confidence interval; CN, copy number; CNV, copy number variant; CN2, copy number 2; DFT1, devil facial tumour 1; LogR, ratio of tumour to normal reads, log base 2; Mb, megabase; M5, marker 5.
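The copy number plots in (B) show normalised read coverage per 100-kb window, and LogR is defined above as the log base 2 ratio of tumour to normal reads. A minimal Python sketch of that per-window quantity follows. The function name `log_r`, the library-size normalisation, and the `pseudocount` parameter are assumptions made for illustration; the normalisation actually used for the figures is not described in this section.

```python
import math


def log_r(tumour_counts, normal_counts, pseudocount=0.5):
    """Per-window log2 ratio of tumour to normal read counts.

    Assumption: each sample is normalised by its total read count across
    the supplied windows, and a small pseudocount guards against empty
    windows. Both choices are illustrative, not the published method.
    """
    if len(tumour_counts) != len(normal_counts):
        raise ValueError("tumour and normal must cover the same windows")
    tumour_total = sum(tumour_counts)
    normal_total = sum(normal_counts)
    if tumour_total == 0 or normal_total == 0:
        raise ValueError("both samples need at least one read")
    values = []
    for t, n in zip(tumour_counts, normal_counts):
        tumour_fraction = (t + pseudocount) / tumour_total
        normal_fraction = (n + pseudocount) / normal_total
        values.append(math.log2(tumour_fraction / normal_fraction))
    return values


# Hypothetical read counts for four consecutive 100-kb windows.
tumour = [100, 100, 50, 100]
normal = [100, 100, 100, 100]
for window, value in enumerate(log_r(tumour, normal, pseudocount=0)):
    print(window, round(value, 3))
```

With these made-up counts, the third window sits exactly one log2 unit below its neighbours, which is the kind of step a one-copy loss from a diploid baseline would produce in a pure tumour sample.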
Fig 3. Copy number variation in DFT1 cell lines.
(A) DFT1 phylogenetic tree with 43 cell lines (including some duplicates collected at different passages), and tissue biopsies from devils inoculated with cell lines, indicated. DFT1 clades are labelled. Higher resolution tree is available in S1 Fig, and underlying SNV and CNV data used to generate tree are available in S2 and S3 Tables, respectively. Sample information, including cell line status, is available in S1 Table. (B) Number of CNVs in DFT1 biopsies and cell lines. Data are available in S1 and S3 Tables. (C) M5 loss events in DFT1 biopsies and cell lines. Cell lines derived from tumour lineages lacking M5 are excluded. Data are available in S8 Table. (D) Chromosome map illustrating CNVs in DFT1 cell lines. Grey bars represent chromosomes. Blue and red bars represent copy number losses and gains, respectively, which were observed in DFT1 cell lines but not in DFT1 biopsies. Each bar is an independent CNV occurrence, and CNVs shared by multiple cell lines via a common ancestor are illustrated once. Each CNV step away from CN2 is illustrated separately; thus, a CNV present at CN <1 or >3 is shown multiple times. The location of ERBB2 is marked, together with a window within which the rDNA occurs (exact coordinates are unknown). Data are available in S3 Table. (E) Representative genomic copy number plots for a DFT1 biopsy, cell line, and experimental inoculation of the same cell line into a devil. Each dot represents normalised read coverage within a 100-kb genomic window. Tumour identifiers are labelled in parentheses; all three samples belong to Clade A1-3A3 (S1 Table). Copy number plots for all tumours are available in S2 Data in https://doi.org/10.5281/zenodo.4046235. CN, copy number; CNV, copy number variant; DFT1, devil facial tumour 1; M5, marker 5; logR, ratio of tumour to normal reads, log base 2; rDNA, ribosomal DNA; SNV, single nucleotide variant.
Conditions for MIP phosphorylation reaction.
| Component | Volume (μL) |
|---|---|
| MIP pool (500 μM total concentration) | 20 |
| T4 PNK | 10 |
| T4 PNK buffer (10×) | 7 |
| ATP (10 mM) | 7 |
| Nuclease-free water | 63 |
| Total | 100 |
ATP, adenosine triphosphate; MIP, molecular inversion probe; T4 PNK, T4 polynucleotide kinase.
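A quick consistency check on a reaction table is to sum the component volumes and compare the result with the stated total. The Python sketch below does this for the phosphorylation volumes as listed above; the function names and the 0.1 μL tolerance are illustrative assumptions.

```python
# Volumes (in μL) as listed in the phosphorylation table above.
COMPONENTS_UL = {
    "MIP pool (500 uM total concentration)": 20,
    "T4 PNK": 10,
    "T4 PNK buffer (10x)": 7,
    "ATP (10 mM)": 7,
    "Nuclease-free water": 63,
}
STATED_TOTAL_UL = 100


def volume_discrepancy(components, stated_total):
    """Return the summed component volume minus the stated total (μL)."""
    return sum(components.values()) - stated_total


def is_consistent(components, stated_total, tolerance=0.1):
    """True if the components add up to the stated total within tolerance.

    The default tolerance is an arbitrary allowance for rounding.
    """
    return abs(volume_discrepancy(components, stated_total)) <= tolerance


print(volume_discrepancy(COMPONENTS_UL, STATED_TOTAL_UL))
print(is_consistent(COMPONENTS_UL, STATED_TOTAL_UL))
```

Applied to the values as listed here, the components sum to 107 μL against a stated total of 100 μL, so the individual volumes are worth confirming against the source protocol before use.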
Conditions for MIP hybridisation and circularisation reactions.
| Component | Volume (μL) per reaction |
|---|---|
| Ampligase 10× buffer | 1.3 |
| Phosphorylated MIP pool (1:1,000 dilution) | 0.484 |
| dNTPs (0.1 mM) | 0.08 |
| Hemo Klentaq | 0.08 |
| Ampligase (5 U/μL) | 0.1 |
| Nuclease-free water | 0.01 |
| DNA (30 ng) | 10 |
| Total | 12 |
dNTP, deoxynucleoside triphosphate; MIP, molecular inversion probe.
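Because several of the per-reaction volumes above are well below 1 μL, reactions like this are usually assembled from a master mix. The Python sketch below scales the listed per-reaction volumes to a plate of reactions. The `master_mix` helper, the 10% overage, and the decision to leave the sample DNA out of the mix are assumptions for illustration, not details stated in the protocol.

```python
# Per-reaction volumes (in μL) from the hybridisation and circularisation
# table above. The 10 μL of sample DNA is left out on the assumption that
# it is added to each well individually.
PER_REACTION_UL = {
    "Ampligase 10x buffer": 1.3,
    "Phosphorylated MIP pool (1:1,000 dilution)": 0.484,
    "dNTPs (0.1 mM)": 0.08,
    "Hemo Klentaq": 0.08,
    "Ampligase (5 U/uL)": 0.1,
    "Nuclease-free water": 0.01,
}


def master_mix(per_reaction, n_reactions, overage=0.10):
    """Scale per-reaction volumes to n_reactions plus a pipetting overage.

    The default overage of 10% is an assumed value, not taken from the
    protocol. Volumes are rounded to 0.001 μL.
    """
    if n_reactions < 1:
        raise ValueError("n_reactions must be at least 1")
    if overage < 0:
        raise ValueError("overage must not be negative")
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 3) for name, vol in per_reaction.items()}


mix = master_mix(PER_REACTION_UL, n_reactions=96)
for name, volume in mix.items():
    print(f"{name}: {volume} uL")
print(f"Aliquot per well: {round(sum(PER_REACTION_UL.values()), 3)} uL")
```

Each well then receives the per-reaction aliquot of master mix plus its own DNA sample.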
MIP exonuclease reaction conditions.
| Component | Volume (μL) per reaction |
|---|---|
| Ampligase 10× buffer | 0.1 |
| Exo I | 0.125 |
| Exo III | 0.125 |
| Nuclease-free water | 0.65 |
| Total | 1 |
Exo, exonuclease.
Conditions for MIP amplification and index incorporation.
| Component | Volume (μL) per reaction |
|---|---|
| Q5 High-Fidelity 2X Master Mix | 13 |
| MIP F primer (100 μM) | 0.13 |
| Indexed MIP R primer (10 μM) | 1.3 |
| MIP capture reaction | 12 |
| Total | 26.4 |
MIP, molecular inversion probe; F primer, forward primer; R primer, reverse primer.
Thermal cycling conditions for MIP amplification and index incorporation.
| Temperature | Time | Cycles |
|---|---|---|
| 98°C | 30 s | 1× |
| 98°C | 15 s | 23× |
| 58°C | 30 s | |
| 72°C | 30 s | |
| 72°C | 2 min | 1× |
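The total hold time of this programme follows directly from the table. The Python sketch below encodes the programme and sums it, assuming that the 23× count applies to the 98°C, 58°C, and 72°C steps as one block and ignoring ramp time between temperatures, which depends on the instrument. The data layout and function name are illustrative.

```python
# Programme transcribed from the thermal cycling table above.
# Each stage is (number_of_cycles, [(temperature_C, seconds), ...]).
# Assumption: the 23x count covers the 98/58/72 °C steps as one block.
PROGRAMME = [
    (1, [(98, 30)]),
    (23, [(98, 15), (58, 30), (72, 30)]),
    (1, [(72, 120)]),
]


def hold_time_seconds(programme):
    """Sum the hold times over all cycles, ignoring ramp time."""
    return sum(
        cycles * sum(seconds for _temperature, seconds in steps)
        for cycles, steps in programme
    )


total = hold_time_seconds(PROGRAMME)
print(f"{total} s ({total / 60:.2f} min)")  # prints "1875 s (31.25 min)"
```

Under these assumptions the programme holds for 30 + 23 × 75 + 120 = 1,875 s, or just over 31 minutes before ramping is taken into account.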
Coordinates and deletion class for scaffolds used in purity calculation.
| Type | Deletion class | Scaffold coordinates |
|---|---|---|
| DFT1, DFT2 | LOH | Chr3_supercontig_000000288:1090000-Chr3_supercontig_000000297:40000 |
| DFT1 | LOH | Chr2_supercontig_000000346:1-Chr2_supercontig_000000353:269000 |
| DFT1 | Homdel | Chr2_supercontig_000000278:490000–510000 |
| DFT2 | LOH | Chr1_supercontig_000000015:1-Chr1_supercontig_000000027:440000 |
| DFT2 | LOH | Chr5_supercontig_000000095:1-Chr5_supercontig_000000109:650000 |
| DFT2 | Homdel | Chr3_supercontig_000000242:130000–190000 |
DFT, devil facial tumour; Homdel, homozygous deletion; LOH, loss of heterozygosity.
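The coordinates above come in two forms: an interval within one scaffold (`scaffold:start–end`) and an interval running from a position on one scaffold to a position on a later scaffold (`scaffold:start-scaffold:end`). The Python sketch below parses both into a common record for downstream use. The `Region` type and `parse_region` function are hypothetical names, and the parser tolerates either a hyphen or an en dash as the separator, along with stray spaces.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    """A parsed interval; both scaffold fields match for single-scaffold intervals."""
    start_scaffold: str
    start: int
    end_scaffold: str
    end: int


# Accepts "scaffold:start-end" and "scaffold:start-scaffold:end", with a
# hyphen or en dash as the separator and optional whitespace around it.
_REGION = re.compile(
    r"^\s*(?P<s1>\w+):\s*(?P<start>\d+)\s*[-\u2013]\s*"
    r"(?:(?P<s2>\w+):\s*)?(?P<end>\d+)\s*$"
)


def parse_region(text):
    """Parse one coordinate string from the table into a Region."""
    match = _REGION.match(text)
    if match is None:
        raise ValueError(f"unrecognised coordinate string: {text!r}")
    first = match.group("s1")
    last = match.group("s2") or first
    return Region(first, int(match.group("start")), last, int(match.group("end")))


print(parse_region("Chr2_supercontig_000000278:490000\u2013510000"))
print(parse_region(
    "Chr3_supercontig_000000288:1090000-Chr3_supercontig_000000297:40000"
))
```

Note that an interval spanning several scaffolds also covers every scaffold between the two named ones, so resolving it fully requires the scaffold order from the reference assembly, which this sketch does not attempt.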