Fatima Zare, Abdelrahman Hosny, Sheida Nabavi.
Abstract
BACKGROUND: Owing to recent advances in sequencing technologies, sequence-based analysis has been widely applied to detecting copy number variations (CNVs). Several techniques identify CNVs from next generation sequencing (NGS) data; however, methods based on depth of coverage, or read depth (RD), have recently become a main technique. The main assumption of RD-based CNV detection methods is that the readcount value at a specific genomic location is correlated with the copy number at that location. However, noise and biases in readcount data distort the association between readcounts and copy numbers. For more accurate CNV identification, these biases and noise need to be mitigated. In this work, to detect CNVs more precisely and efficiently, we propose a novel denoising method based on the total variation approach and the Taut String algorithm.
Keywords: Copy number variation; Denoising; Next generation sequencing; Signal processing; Taut string; Total variation
Year: 2018 PMID: 30343665 PMCID: PMC6196408 DOI: 10.1186/s12859-018-2332-x
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
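To make the denoising idea in the abstract concrete, the following is a minimal sketch of one-dimensional total variation denoising applied to a short readcount-like signal. It is not the authors' Taut String implementation: it uses a simpler, slower solver (projected gradient ascent on the dual problem) for the same kind of total variation objective. The function names, the regularisation weight `lam`, the iteration count, and the example values are all assumptions made for illustration.

```python
def tv_denoise(y, lam, n_iter=500, step=0.25):
    """Approximately minimise 0.5*sum((x - y)^2) + lam*sum(|x[i+1] - x[i]|).

    Illustrative solver only (projected gradient on the dual problem);
    it is NOT the Taut String algorithm described in the paper.
    """
    n = len(y)
    if n < 2:
        return list(y)
    z = [0.0] * (n - 1)  # dual variable, one entry per adjacent pair

    def primal(z):
        # Recover x = y - D^T z, where (D x)[i] = x[i+1] - x[i].
        x = []
        for j in range(n):
            left = z[j - 1] if j > 0 else 0.0
            right = z[j] if j < n - 1 else 0.0
            x.append(y[j] - (left - right))
        return x

    for _ in range(n_iter):
        x = primal(z)
        # Gradient ascent step on the dual, then projection onto [-lam, lam].
        z = [max(-lam, min(lam, z[i] + step * (x[i + 1] - x[i])))
             for i in range(n - 1)]
    return primal(z)


def total_variation(x):
    """Sum of absolute differences between neighbouring values."""
    return sum(abs(b - a) for a, b in zip(x, x[1:]))


# Hypothetical noisy readcounts with one copy number change in the middle.
readcounts = [2.1, 1.9, 2.0, 2.2, 1.8, 4.1, 3.9, 4.0, 4.2, 3.8]
smoothed = tv_denoise(readcounts, lam=0.3)
```

A larger `lam` flattens the signal more aggressively, which suppresses noise but can also shrink genuine copy number steps, so the value would need tuning on real data.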
Fig. 1. Denoising with (a) Taut String, (b) DWT, and (c) MA, using simulated readcount data at SNR = 7
Fig. 2. Effect of the squeezing factor on (a) sensitivity and (b) FDR of CNV detection, using simulated readcount data
Fig. 3. (a) Sensitivity and (b) FDR of detection of amplified CNV segments before and after applying denoising methods at different SNRs, using simulated readcount data
Fig. 4. (a) Sensitivity and (b) FDR of detection of deleted CNV segments before and after applying denoising methods at different SNRs, using simulated readcount data
Fig. 5. Breakpoint accuracy before and after applying denoising at different SNRs, using simulated readcount data
Fig. 6. Sensitivity and FDR of detection of amplified CNV segments with different CNV lengths
Fig. 7. Sensitivity and FDR of detection of deleted CNV segments with different CNV lengths
Possible outcomes for each candidate CNV gene
| CNV gene | Not identified | Identified |
|---|---|---|
| Present | FN | TP |
| Not present | TN | FP |
| Performance metrics: | | |
| Sensitivity = TP / (TP + FN) | FDR = FP / (TP + FP) | Specificity = TN / (TN + FP) |
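The three performance metrics can be computed directly from the four outcome counts in the table above. The sketch below assumes the standard definitions of sensitivity, false discovery rate (FDR), and specificity; the function name and the example counts are hypothetical and are not taken from the paper.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return sensitivity, FDR, and specificity from confusion counts.

    Assumes the standard definitions:
      sensitivity = TP / (TP + FN)
      FDR         = FP / (TP + FP)
      specificity = TN / (TN + FP)
    """
    def ratio(num, den):
        # Avoid division by zero when a category is empty.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "fdr": ratio(fp, tp + fp),
        "specificity": ratio(tn, tn + fp),
    }


# Hypothetical counts for illustration only.
metrics = detection_metrics(tp=80, fp=20, tn=90, fn=20)
```

Returning `nan` for an empty denominator is one possible design choice; it keeps an undefined metric visibly undefined instead of silently reporting zero.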
Overall performance of the denoising methods on simulated WES data generated by CNV-Sim
| Denoising method | Amplified CNVs: Sensitivity | Amplified CNVs: FDR | Amplified CNVs: Specificity | Deleted CNVs: Sensitivity | Deleted CNVs: FDR | Deleted CNVs: Specificity |
|---|---|---|---|---|---|---|
| Before applying denoising method | 79.65% | 35.23% | 80.93% | 78.64% | 37.03% | 81.02% |
| After applying DWT | 86.87% | 22.88% | 91.32% | 87.20% | 20.54% | 90.32% |
| After applying Taut String | 87.17% | 22.94% | 92.82% | 88.15% | 23.65% | 89.49% |
Overall performance of the denoising methods on real WES data
| Denoising method | Amplified CNVs: Sensitivity | Amplified CNVs: FDR | Amplified CNVs: Specificity | Deleted CNVs: Sensitivity | Deleted CNVs: FDR | Deleted CNVs: Specificity |
|---|---|---|---|---|---|---|
| Before applying denoising method | 50.99% | 42.06% | 80.45% | 60.37% | 64.32% | 56.71% |
| After applying DWT | 68.81% | 41.65% | 79.92% | 77.65% | 54.32% | 72.23% |
| After applying Taut String | 69.52% | 40.21% | 84.51% | 79.93% | 50.72% | 77.25% |
Percentage of CNV genes whose detected copy number differs from the benchmark copy number by less than 20% of the benchmark value
| Denoising methods | Amplified genes | Deleted genes |
|---|---|---|
| no denoising | 44.48% | 42.77% |
| DWT | 73.36% | 58.35% |
| Taut String | 76.87% | 70.26% |
Average difference between detected and benchmark copy number values, relative to the benchmark copy number values
| Denoising methods | Amplified genes | Deleted genes |
|---|---|---|
| no denoising | 50% | 53% |
| DWT | 38% | 36% |
| Taut String | 25% | 36% |
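The two copy number accuracy summaries reported in the last two tables (the share of genes within 20% of the benchmark, and the average relative difference) can be sketched as follows. This is an illustrative reading of the table captions, not the authors' code; the function names, the strict "less than" comparison, and the example copy number values are assumptions.

```python
def relative_differences(detected, benchmark):
    """Per-gene |detected - benchmark| / benchmark."""
    if len(detected) != len(benchmark):
        raise ValueError("detected and benchmark must have the same length")
    return [abs(d - b) / b for d, b in zip(detected, benchmark)]


def fraction_within(detected, benchmark, tolerance=0.20):
    """Fraction of genes whose relative difference is below `tolerance`."""
    diffs = relative_differences(detected, benchmark)
    return sum(1 for r in diffs if r < tolerance) / len(diffs)


def mean_relative_difference(detected, benchmark):
    """Average relative difference across all genes."""
    diffs = relative_differences(detected, benchmark)
    return sum(diffs) / len(diffs)


# Hypothetical copy number values for four genes (illustration only).
detected_cn = [2.2, 3.0, 1.0, 4.5]
benchmark_cn = [2.0, 4.0, 1.0, 3.0]

share_close = fraction_within(detected_cn, benchmark_cn)
avg_diff = mean_relative_difference(detected_cn, benchmark_cn)
```

Both summaries divide by the benchmark value, so benchmark copy numbers of zero would need separate handling in a real analysis.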