Fatima Zare, Michelle Dow, Nicholas Monteleone, Abdelrahman Hosny, Sheida Nabavi
Abstract
BACKGROUND: Copy number variation (CNV) has recently gained considerable interest as a type of genomic/genetic variation that plays an important role in disease susceptibility. Advances in sequencing technology have created an opportunity to detect CNVs more accurately. Whole exome sequencing (WES) has recently become a primary strategy for sequencing patient samples and studying their genomic aberrations. However, compared with whole genome sequencing, WES introduces more biases and noise, which make CNV detection very challenging. Additionally, the complexity of tumors makes the detection of cancer-specific CNVs even more difficult. Although many CNV detection tools have been developed since the introduction of next-generation sequencing (NGS) data, few tools exist for somatic CNV detection from WES data in cancer.
Keywords: Cancer; Copy number variation; Somatic aberrations; Whole-exome sequencing
Year: 2017 PMID: 28569140 PMCID: PMC5452530 DOI: 10.1186/s12859-017-1705-x
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Selected tools for the performance analysis of CNV detection tools using WES data
| Tool name | ADTEx | CONTRA | cn.MOPS | ExomeCNV | VarScan 2 |
|---|---|---|---|---|---|
| Characteristics | | | | | |
| Control set required | Yes | Yes | No | Yes | No |
| Prog. language | Python, S/R | Python, R | R | R | Java |
| Input format | BAM, BED | BAM, SAM, BED | BAM, read count matrices | BAM, Pileup, GTF | BAM, Pileup |
| Segmentation algorithm | HMM | CBS | CBS | CBS | NA (a) |
| OS | GNU, Linux | Linux, Mac OS | Linux, Mac OS, Windows | Linux, Mac OS, Windows | Linux, Mac OS, Windows |
| Methodology characteristic | DWT (c) for de-noising, use BAF (d) | Base-level log-ratio | Bayesian approach for de-noising | Statistical test for analyzing BAF data | CMDS (b) for generating read counts |
| Year | 2014 | 2012 | 2012 | 2011 | 2012 |
| URL | | | | | |
(a) Segmentation is not embedded in the tool; CBS is recommended for segmentation
(b) Correlation Matrix Diagonal Segmentation
(c) Discrete wavelet transform
(d) B allele frequencies
Computing TP, FP, TN and FN for Gene-Based comparison of the performance of the tools
| Amplification | CNVbench > | CNVbench < |
|---|---|---|
| CNVtest > | TP | FP |
| CNVtest < | FN | TN |
| Deletion | CNVbench < (− | CNVbench > (− |
| CNVtest < (− | TP | FP |
| CNVtest > (− | FN | TN |
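The gene-based rule tabulated above can be sketched in Python. This is a minimal illustration, not the authors' code: the function names, the dictionary inputs (gene name mapped to a log-ratio-like CNV value), and the symmetric threshold `thr` are all assumptions, and the threshold value used in the study is not reproduced here. Values exactly equal to the threshold are treated as "no call", which the table leaves unspecified.

```python
from collections import Counter


def classify_gene(test_value, bench_value, thr, kind="amp"):
    """Return 'TP', 'FP', 'FN' or 'TN' for one gene.

    kind="amp": a call means value > thr.
    kind="del": a call means value < -thr.
    `thr` is a hypothetical positive threshold.
    """
    if kind == "amp":
        test_pos, bench_pos = test_value > thr, bench_value > thr
    elif kind == "del":
        test_pos, bench_pos = test_value < -thr, bench_value < -thr
    else:
        raise ValueError("kind must be 'amp' or 'del'")

    if test_pos and bench_pos:
        return "TP"  # tool and benchmark both call the CNV
    if test_pos:
        return "FP"  # tool calls it, benchmark does not
    if bench_pos:
        return "FN"  # benchmark calls it, tool misses it
    return "TN"      # neither calls it


def confusion_counts(test, bench, thr, kind="amp"):
    """Tally TP/FP/FN/TN over the genes present in both dictionaries."""
    counts = Counter({"TP": 0, "FP": 0, "FN": 0, "TN": 0})
    for gene in test.keys() & bench.keys():
        counts[classify_gene(test[gene], bench[gene], thr, kind)] += 1
    return dict(counts)
```

Calling `confusion_counts(test, bench, thr, "amp")` and then again with `"del"` yields the two halves of the table separately, so amplifications and deletions are scored independently.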
Computing TP, FP and FN for Segment-Based comparison
| Amplification | BenchSeg CNV | BenchSeg CNV |
|---|---|---|
| TestSeg CNV | TP | FP |
| TestSeg CNV | FN | |
| Deletion | BenchSeg CNV | BenchSeg CNV > |
| TestSeg CNV < − | TP | FP |
| TestSeg CNV > − | FN | |
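One possible reading of the segment-based comparison is sketched below in Python. How test and benchmark segments are paired is not stated in this record, so the sketch assumes that every overlapping test/benchmark segment pair on the same chromosome is classified once. The `Segment` type, the half-open coordinates, and the threshold `thr` are likewise hypothetical. No TN is counted, matching the caption.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    chrom: str
    start: int        # inclusive
    end: int          # exclusive (half-open interval, an assumption)
    value: float      # log-ratio-like CNV value of the segment


def overlaps(a, b):
    """True if two segments share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def segment_counts(test_segs, bench_segs, thr, kind="amp"):
    """Count TP/FP/FN over overlapping test/benchmark segment pairs."""
    def is_call(seg):
        return seg.value > thr if kind == "amp" else seg.value < -thr

    counts = {"TP": 0, "FP": 0, "FN": 0}
    for t in test_segs:
        for b in bench_segs:
            if not overlaps(t, b):
                continue
            t_pos, b_pos = is_call(t), is_call(b)
            if t_pos and b_pos:
                counts["TP"] += 1
            elif t_pos:
                counts["FP"] += 1
            elif b_pos:
                counts["FN"] += 1
    return counts
```

The nested loop is quadratic in the number of segments. That is acceptable for a sketch, but sorted segments or an interval tree would be the usual choice for genome-scale inputs.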
Overall performance of the CNV detection tools using the gene-based comparison approach for real data
| Method | ADTEx | CONTRA | cn.MOPS | ExomeCNV | VarScan2 |
|---|---|---|---|---|---|
| Amplification | | | | | |
| Sensitivity | 51.53% | 54.37% | 58.03% | | 69.11% |
| FDR | 33.70% | 53.52% | 57.36% | 38.79% | |
| SPC | 89.84% | 83.06% | 66.54% | 82.07% | |
| Deletion | | | | | |
| Sensitivity | 50.14% | 64.95% | 52.81% | | 76.77% |
| FDR | | 64.86% | 61.35% | 45.31% | 51.91% |
| SPC | | 78.86% | 78.08% | 87.26% | 82.52% |
In the table, the bold value in each row represents the best value of each performance measure
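For reference, the three measures reported in the table follow from the confusion counts by their standard definitions: sensitivity = TP / (TP + FN), FDR = FP / (FP + TP), and specificity (SPC) = TN / (TN + FP). The Python sketch below applies them. The function name and the choice to return NaN for an empty denominator are assumptions made for illustration, and the example counts in the usage note are made up rather than taken from the study.

```python
def performance(tp, fp, tn, fn):
    """Sensitivity, FDR and specificity from confusion counts, as fractions."""
    def ratio(num, den):
        # Undefined measures (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "fdr": ratio(fp, fp + tp),           # false discovery rate
        "specificity": ratio(tn, tn + fp),   # true negative rate (SPC)
    }
```

With illustrative counts `performance(tp=60, fp=40, tn=160, fn=40)`, the sensitivity is 0.6, the FDR is 0.4, and the specificity is 0.8. Multiply by 100 to compare with the percentages in the table.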
Fig. 1 CNV calls for 55 breast cancer related genes. Blue: deletion; red: amplification; light yellow: no CNV call. Order of tools from left to right: 1: ADTEx, 2: ExomeCNV, 3: CONTRA, 4: cn.MOPS, and 5: VarScan2
Fig. 2 Venn diagrams of the average number of truly detected CNV genes from the 5 tools: (a) amplified genes, (b) deleted genes
Fig. 3 Characteristics of the CNV regions detected by the 5 tools. (a) Size distributions of CNV segments. (b) Number of detected CNV segments
Fig. 4 Average execution times of the tools from 5 runs on a real breast cancer dataset
Fig. 5 Sensitivity (TPR) versus 1 − specificity (FPR) of the tools for different coverage values, using simulated data, for (a) amplified genes and (b) deleted genes. Since CONTRA could not generate the proper output for the coverage of 0.01 M, its results for coverage of 0.05 have not been shown