Lun-Ching Chang, Biswajit Das, Chih-Jian Lih, Han Si, Corinne E Camalier, Paul M McGregor, Eric Polley.
Abstract
With rapid advances in DNA sequencing technologies, whole exome sequencing (WES) has become a popular approach for detecting somatic mutations in oncology studies. The initial intent of WES was to characterize single nucleotide variants, but it was observed that the number of sequencing reads mapped to a genomic region correlates with DNA copy number variants (CNVs). We propose a method, RefCNV, that uses a reference set to estimate the distribution of the coverage for each exon. The construction of the reference set includes an evaluation of the sources of variability in the coverage distribution. We observed that the processing steps had an impact on the coverage distribution. For each exon, we compared the observed coverage with the expected normal coverage. Thresholds for determining CNVs were selected to control the false-positive error rate. RefCNV predictions correlated significantly (r = 0.96-0.86) with CNV measured by digital polymerase chain reaction for MET (7q31), EGFR (7p12), or ERBB2 (17q12) in 13 tumor cell lines. The genome-wide CNV analysis showed a good overall correlation (Spearman's coefficient = 0.82) between RefCNV estimates and publicly available CNV data in the Cancer Cell Line Encyclopedia. RefCNV also showed better performance than three other CNV estimation methods in genome-wide CNV analysis.
Keywords: copy number variation; methodology; next-generation sequencing; whole exome sequencing
Year: 2016 PMID: 27147817 PMCID: PMC4849420 DOI: 10.4137/CIN.S36612
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
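The exon-level comparison described in the abstract can be illustrated with a minimal sketch. This is not the published RefCNV implementation: the function names, the use of log2-scaled coverage, the normal approximation, and the `alpha` parameter are all assumptions made for illustration. Each exon's observed coverage is standardized against the mean and standard deviation of the reference replicates, the gene-level score is the median standardized residual (MSR), and a two-sided threshold maps the MSR to a call of N, D, or A.

```python
from statistics import NormalDist, mean, median, stdev


def standardized_residuals(sample, reference):
    """Standardize per-exon coverage against a reference set.

    sample: list of per-exon coverage values (assumed log2-scaled).
    reference: list of replicate lists, each with the same exon order.
    """
    residuals = []
    for j, observed in enumerate(sample):
        ref_values = [replicate[j] for replicate in reference]
        expected = mean(ref_values)   # expected normal coverage for exon j
        spread = stdev(ref_values)    # replicate-to-replicate variability
        residuals.append((observed - expected) / spread)
    return residuals


def call_gene(residuals, alpha=0.01):
    """Return (call, MSR) for one gene.

    The threshold is a two-sided standard normal quantile; this is a
    hypothetical stand-in for the paper's false-positive control.
    """
    msr = median(residuals)
    threshold = NormalDist().inv_cdf(1 - alpha / 2)
    if msr > threshold:
        return "A", msr   # amplification
    if msr < -threshold:
        return "D", msr   # deletion
    return "N", msr       # normal (diploid)


# Toy data (hypothetical): 4 reference replicates x 3 exons of one gene.
reference = [
    [10.0, 11.0, 9.0],
    [10.2, 11.1, 9.1],
    [9.8, 10.9, 8.9],
    [10.0, 11.0, 9.0],
]
amplified_call, _ = call_gene(standardized_residuals([12.0, 13.0, 11.0], reference))
normal_call, _ = call_gene(standardized_residuals([10.0, 11.0, 9.0], reference))
deleted_call, _ = call_gene(standardized_residuals([9.0, 10.0, 8.0], reference))
```

Taking the median over a gene's exons makes the gene-level score less sensitive to a single noisy exon than a mean would be. In practice the threshold would be calibrated on the reference set itself rather than taken from a normal quantile.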
Figure 1. Scatter plot of the first two principal components (PCs) from PCA of all replicated reference NA12878 samples prepared by two library preparation methods (red: robotic; black: manual).
Copy number variation predicted by RefCNV and corresponding dPCR results.
| Cell line | MET CNVs | MET dPCR | EGFR CNVs | EGFR dPCR | ERBB2 CNVs | ERBB2 dPCR |
|---|---|---|---|---|---|---|
| A-431 | D | 1.45 | A | 11.1 | N | 2.02 |
| BT-20 | N | 2.38 | A | 12.7 | D | 1.68 |
| BT-474 | N | 1.28 | A | 0.439 | A | 14.30 |
| C-32 | A | 6.08 | N | 0.212 | N | 2.25 |
| Daoy | A | 2.97 | A | 3.12 | D | 1.15 |
| HOP-92 | A | 1.89 | A | 1.95 | N | 1.25 |
| Hs746T | A | 16.50 | N | 1.50 | N | 1.32 |
| MDA-MB-231 | N | 2.14 | N | 2.41 | A | 2.58 |
| MDA-MB-361 | D | 1.28 | N | 2.04 | A | 10.90 |
| MDA-MB-453 | D | 2.05 | N | 2.04 | A | 5.60 |
| NCI-H1993 | A | 22.30 | N | 2.17 | N | 1.60 |
| SK-BR3 | A | 5.51 | A | 3.52 | A | 17.10 |
| SNU-5 | A | 22.60 | A | 5.63 | A | 3.98 |
Abbreviations: CNVs, CNV called by RefCNV; N, normal (diploid); D, deletion (less than two copies); A, amplification (more than two copies); dPCR, digital PCR.
Figure 2. Scatter plot of median standardized residuals (MSRs) and digital PCR of 13 cell lines on genes MET, EGFR, and ERBB2.
Figure 3. Correlation between MSRs and dPCR of 13 cell lines for genes EGFR, ERBB2, and MET.