Carson Holt, Bojan Losic, Deepa Pai, Zhen Zhao, Quang Trinh, Sujata Syam, Niloofar Arshadi, Gun Ho Jang, Johar Ali, Tim Beck, John McPherson, Lakshmi B Muthuswamy.
Abstract
MOTIVATION: Copy number variations (CNVs) are a major source of genomic variability and are especially significant in cancer. Until recently, microarray technologies have been used to characterize CNVs in genomes. However, advances in next-generation sequencing technology offer significant opportunities to deduce copy number directly from genome sequencing data. Unfortunately, cancer genomes differ from normal genomes in several respects that make them far less amenable to copy number detection. For example, cancer genomes are often aneuploid, and tumor samples are frequently an admixture of tumor cells and diploid non-tumor cells. In addition, patient-derived xenograft models can be laden with mouse contamination that strongly affects accurate assignment of copy number. Hence, there is a need for analytical tools that take cancer-specific parameters into account when detecting CNVs directly from genome sequencing data.
Year: 2013 PMID: 24192544 PMCID: PMC3957071 DOI: 10.1093/bioinformatics/btt611
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Flow chart showing the analysis procedure.
Fig. 2. Detection of signal discontinuities using the wavelet-transformed and denoised signal over a 16 kb region. The top panel shows the raw read depth (gray) and the denoised signal (red). The bottom panel illustrates copy number break points, where the coefficient of the maximal scale intersects those of the finest scale. The y-axis is the squared approximation wavelet coefficient, and the x-axis is the genomic position in megabases.
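The idea behind Fig. 2 can be illustrated with a minimal sketch, assuming read depth has already been summarized into equal-width bins. It uses a plain Haar wavelet with hard thresholding and then reports positions where the denoised signal jumps. This is not WaveCNV's implementation: the wavelet family, the threshold rule, the jump-based break point criterion (instead of the coefficient-intersection rule in the caption) and the names `haar_denoise` and `find_breakpoints` are all assumptions made for illustration.

```python
import numpy as np


def haar_denoise(signal, levels=3, threshold=None):
    """Denoise a 1-D signal with a Haar transform and hard thresholding.

    The signal length must be divisible by 2**levels.
    """
    x = np.asarray(signal, dtype=float)
    if x.size == 0 or x.size % (2 ** levels) != 0:
        raise ValueError("signal length must be a non-zero multiple of 2**levels")

    # Forward transform: split into approximation and detail coefficients.
    details = []
    approx = x
    for _ in range(levels):
        pairs = approx.reshape(-1, 2)
        details.append((pairs[:, 0] - pairs[:, 1]) / np.sqrt(2))
        approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2)

    if threshold is None:
        # Assumed rule: universal threshold with a MAD noise estimate
        # taken from the finest-scale detail coefficients.
        sigma = np.median(np.abs(details[0])) / 0.6745
        threshold = sigma * np.sqrt(2 * np.log(x.size))

    # Inverse transform, zeroing small detail coefficients at every level.
    for d in reversed(details):
        d = np.where(np.abs(d) > threshold, d, 0.0)
        rec = np.empty(2 * approx.size)
        rec[0::2] = (approx + d) / np.sqrt(2)
        rec[1::2] = (approx - d) / np.sqrt(2)
        approx = rec
    return approx


def find_breakpoints(denoised, min_jump):
    """Return bin indices where the denoised signal changes by more than min_jump."""
    jumps = np.abs(np.diff(denoised))
    return (np.flatnonzero(jumps > min_jump) + 1).tolist()


if __name__ == "__main__":
    # Synthetic read depth: a copy number gain starting at bin 32.
    depth = np.array([30.0] * 32 + [60.0] * 32) + np.tile([1.0, -1.0], 32)
    smooth = haar_denoise(depth, levels=3)
    print(find_breakpoints(smooth, min_jump=10.0))
```

A Haar basis is used here only because it is easy to write without extra dependencies; it localizes step changes well but places them on dyadic bin boundaries.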
Fig. 3. MAF distribution of SNVs in a 30 Mb region of chr1. (A) MAF density in a pancreatic cancer cell line; (B) observed (red) and normal-fitted expected (blue) distribution curves of MAF for the pancreatic cancer cell line; (C) MAF density in a pancreatic xenograft model; (D) observed (red), normal-fitted expected (blue) and expected with mouse contamination (green) distribution curves for the pancreatic xenograft model.
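To make concrete why normal-cell admixture shifts the MAF peaks in Fig. 3, the following sketch computes the expected minor allele fraction at a germline heterozygous site from the standard two-population mixing arithmetic. It is an illustration, not the fitting procedure used by WaveCNV; the function name `expected_maf` and its parameters are hypothetical, the non-tumor fraction is assumed to be diploid and heterozygous, and mouse contamination is not modeled.

```python
def expected_maf(cellularity, total_cn, minor_cn):
    """Expected minor allele fraction at a heterozygous site.

    cellularity: fraction of tumor cells in the sample (0 to 1).
    total_cn:    total copy number of the segment in tumor cells.
    minor_cn:    copies of the less abundant allele in tumor cells.
    Non-tumor cells are assumed diploid with one copy of each allele.
    """
    if not 0.0 <= cellularity <= 1.0:
        raise ValueError("cellularity must be between 0 and 1")
    if minor_cn < 0 or 2 * minor_cn > total_cn:
        raise ValueError("minor_cn must satisfy 0 <= 2 * minor_cn <= total_cn")

    # Allele copies contributed per cell, weighted by each population's share.
    minor = cellularity * minor_cn + (1.0 - cellularity) * 1.0
    total = cellularity * total_cn + (1.0 - cellularity) * 2.0
    if total == 0:
        raise ValueError("no copies of the locus are present in the sample")
    return minor / total


if __name__ == "__main__":
    # Single-copy loss with loss of heterozygosity at three tumor fractions.
    for p in (1.0, 0.6, 0.2):
        print(p, round(expected_maf(p, total_cn=1, minor_cn=0), 3))
```

Under these assumptions, a pure tumor with a one-copy loss shows an MAF of 0, whereas the same event at lower cellularity drifts back toward 0.5, which is why cellularity has to be estimated before MAF peaks can be assigned to copy number states.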
Experimental validation of cellularity estimates
| Mixed tumor fraction | WaveCNV estimate |
|---|---|
| 0.05 | 0.043 |
| 0.10 | 0.088 |
| 0.15 | 0.155 |
| 0.20 | 0.236 |
| 0.40 | 0.403 |
| 0.60 | 0.602 |
| 1.00 | 1.00 |
Note: The table shows WaveCNV-derived cellularity estimates for a dilution series of diploid/normal contamination mixed into a pancreatic cancer cell line model.
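The agreement in the table above can be summarized numerically. The short sketch below uses only the seven value pairs from the table and computes the absolute error of each estimate; the helper name `summarize_errors` is hypothetical.

```python
# (mixed tumor fraction, WaveCNV estimate) pairs copied from the table.
DILUTION_SERIES = [
    (0.05, 0.043),
    (0.10, 0.088),
    (0.15, 0.155),
    (0.20, 0.236),
    (0.40, 0.403),
    (0.60, 0.602),
    (1.00, 1.00),
]


def summarize_errors(pairs):
    """Return the mean absolute error and the (error, fraction) of the worst case."""
    errors = [(abs(estimate - mixed), mixed) for mixed, estimate in pairs]
    mean_abs_error = sum(err for err, _ in errors) / len(errors)
    worst_error, worst_fraction = max(errors)
    return mean_abs_error, worst_error, worst_fraction


if __name__ == "__main__":
    mae, worst, fraction = summarize_errors(DILUTION_SERIES)
    print(f"mean absolute error: {mae:.4f}")
    print(f"largest deviation: {worst:.3f} at mixed fraction {fraction:.2f}")
```

For these values the mean absolute error is about 0.009, and the largest deviation is 0.036, at a mixed tumor fraction of 0.20.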
Fig. 4. Modeling for aneuploidy. (A) The expected segment median coverage for a diploid genome is estimated using kernel density estimation. This value then serves to define a range for estimating the sample base coverage (coverage of copy number 1). (B) The normalized likelihood of the observed coverage (red line) as well as the normalized residual sum of squares value (rss) for all MAF distribution fits (blue line) are calculated for each candidate base coverage (assuming ploidy range 1–4). The base coverage that produces the maximum separation between likelihood and rss (yellow line) is then selected. (C and D) show the expected segment median coverage and the base coverage selected for a triploid genome.
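The first step described in Fig. 4 can be sketched as follows: find the dominant segment median coverage with a Gaussian kernel density estimate, then derive one candidate base coverage per assumed ploidy. This is a simplified reading of the caption, not the published procedure. The bandwidth rule, the relation candidate = dominant coverage / ploidy, and the names `kde_mode` and `candidate_base_coverages` are assumptions, and the likelihood and rss scoring used to choose among candidates is not reproduced here.

```python
import numpy as np


def kde_mode(values, grid_size=512, bandwidth=None):
    """Return the location of the highest peak of a Gaussian KDE of `values`."""
    x = np.asarray(values, dtype=float)
    if x.size < 2:
        raise ValueError("need at least two values")
    if bandwidth is None:
        # Assumed normal-reference rule of thumb for the bandwidth.
        bandwidth = 1.06 * x.std(ddof=1) * x.size ** (-1.0 / 5.0)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")

    # Evaluate the density on an even grid spanning the observed range.
    grid = np.linspace(x.min(), x.max(), grid_size)
    z = (grid[:, None] - x[None, :]) / bandwidth
    density = np.exp(-0.5 * z ** 2).sum(axis=1)
    density /= x.size * bandwidth * np.sqrt(2.0 * np.pi)
    return float(grid[np.argmax(density)])


def candidate_base_coverages(dominant_coverage, ploidies=(1, 2, 3, 4)):
    """Map each assumed ploidy to the implied coverage of copy number 1."""
    return {p: dominant_coverage / p for p in ploidies}


if __name__ == "__main__":
    # Hypothetical segment median coverages, mostly from one dominant state.
    medians = [60.0] * 20 + [30.0] * 3 + [90.0] * 3
    dominant = kde_mode(medians)
    print(candidate_base_coverages(dominant))
```

Each candidate would then be scored against the observed coverage and MAF distributions, as panel B of the figure describes.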
Fig. 5. Validation of copy number calls using three methods. (A) Verification of 80 CNV loci by qPCR on a pancreatic cancer genome. Copy numbers from qPCR were estimated based on threshold cycle (Ct) values. The Pearson correlation coefficient is 0.94. (B) Verification of 473 somatic CNVs across the whole genome using the Illumina Human Omni 1Million microarray. Shown here is the concordance between microarray intensity ratios and WaveCNV copy number (CN). The Pearson correlation coefficient is 0.86. (C) Verification of 468 somatic CNVs across the whole genome using the Nimblegen 2.1 Million aCGH microarray. Shown here is the concordance between aCGH intensity ratios and WaveCNV CN. The Pearson correlation coefficient is 0.97.
WaveCNV comparison to other algorithms
| Algorithm | Events | Gains | Losses | Total basepair gains | Total basepair losses | Congruency gains | Congruency losses | Congruency all |
|---|---|---|---|---|---|---|---|---|
| WaveCNV | 764 | 359 | 405 | 312 922 439 | 567 442 194 | – | – | – |
| CNVnator | 3658 | 829 | 2829 | 319 703 400 | 622 106 100 | 0.95 | 0.92 | 0.93 |
| OncoSNP | 1423 | 567 | 856 | 260 783 488 | 912 819 293 | 0.87 | 0.80 | 0.80 |
Note: This table shows the base pair level congruency in copy number alterations called by WaveCNV compared with CNVnator and OncoSNP-SEQ.
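The source does not state how base pair level congruency is computed. As an illustration only, the sketch below assumes one plausible definition: the fraction of bases covered by a reference call set that is also covered by a second call set. Intervals are half-open `(start, end)` pairs on a single chromosome, and the function names are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open intervals [start, end)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def overlap_bp(a, b):
    """Count bases covered by both interval sets."""
    a, b = merge_intervals(a), merge_intervals(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if hi > lo:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def congruency(reference, other):
    """Assumed definition: shared bases divided by bases in `reference`."""
    ref_bp = sum(end - start for start, end in merge_intervals(reference))
    if ref_bp == 0:
        raise ValueError("reference call set covers no bases")
    return overlap_bp(reference, other) / ref_bp


if __name__ == "__main__":
    # Hypothetical gain calls from two tools on one chromosome.
    tool_a = [(0, 100), (200, 300)]
    tool_b = [(50, 250)]
    print(congruency(tool_a, tool_b))
```

Applying such a measure separately to gains and to losses would give per-class values comparable in form to the table's congruency columns, although which call set serves as the reference there is not specified.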