Dan He, Nicholas Furlotte, Eleazar Eskin.
Abstract
BACKGROUND: The characterization of structural variations (SV) such as insertions, deletions and copy number variations is a critical step in the process of understanding the full genetic architecture of organisms. Copy number variations (CNV) have attracted much recent attention due to their effects on gene expression and disease status.
Year: 2010 PMID: 21172047 PMCID: PMC3024866 DOI: 10.1186/1471-2105-11-S11-S12
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Summary of simulated CNVs
| CNV Length | Number | Copy-counts (2,3,4) |
|---|---|---|
|  | 80 | 31,26,33 |
| 500,000 ≤ | 390 | 120,139,131 |
| 100,000 ≤ | 975 | 319,335,331 |
| 50,000 ≤ | 217 | 70,79,68 |
| 10,000 ≤ | 247 | 78,89,80 |
| 5,000 ≤ | 47 | 15,14,18 |
| 1,000 ≤ | 44 | 12,15,17 |
The number of CNVs belonging to each length class along with the number of copy-counts for each range are given.
Summary of the percentage of detected CNVs and predicted copy-counts broken down by true copy-counts
| Copy-Counts | Detected (C=40) | Detected (C=30) | Detected (C=20) | Predicted copy-counts (C=40) | Predicted copy-counts (C=30) | Predicted copy-counts (C=20) |
|---|---|---|---|---|---|---|
| 2 | 91.2% | 91.0% | 90.7% | 89.2% | 89.7% | 89.7% |
| 3 | 87.9% | 87.5% | 86.6% | 87.9% | 88.6% | 88.5% |
| 4 | 86.7% | 85.1% | 84.7% | 72.7% | 71.7% | 72.2% |
CNVs were considered detected when the beginning and ending reference CNV positions were both predicted to within 10 base-pairs. In practice, the average deviation from the true positions was about 2 base-pairs. The percentage of predicted copy-counts is reported as the percentage of detected CNVs for which we were able to accurately predict the true copy-count. The average running times for the detection algorithm are 552s, 527s and 497s for coverage of 40, 30 and 20, respectively.
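The two percentages described above can be computed roughly as follows. This is a minimal sketch, not the authors' evaluation code: the `CNV` record, the function names, and the choice to treat "within 10 base-pairs" as an inclusive bound on both breakpoints are assumptions made for illustration.

```python
from dataclasses import dataclass

TOLERANCE_BP = 10  # from the text; treating the bound as inclusive is an assumption


@dataclass(frozen=True)
class CNV:
    """Hypothetical record for a CNV on the reference."""
    start: int   # beginning position on the reference
    end: int     # ending position on the reference
    copies: int  # copy-count in the donor


def is_detected(true_cnv, pred_cnv, tol=TOLERANCE_BP):
    """True if both predicted breakpoints lie within tol bp of the true ones."""
    return (abs(true_cnv.start - pred_cnv.start) <= tol
            and abs(true_cnv.end - pred_cnv.end) <= tol)


def evaluate(truth, predictions, tol=TOLERANCE_BP):
    """Return (fraction detected, fraction of detected with correct copy-count)."""
    detected = 0
    correct = 0
    for t in truth:
        # Take the first prediction whose breakpoints match this true CNV.
        match = next((p for p in predictions if is_detected(t, p, tol)), None)
        if match is None:
            continue
        detected += 1
        if match.copies == t.copies:
            correct += 1
    detected_frac = detected / len(truth) if truth else 0.0
    count_frac = correct / detected if detected else 0.0
    return detected_frac, count_frac
```

Note that the second fraction is taken over detected CNVs only, matching the definition in the caption, so a CNV whose breakpoints are missed does not count against copy-count accuracy.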
CNV Length vs. Coverage Ratio vs. CNV junction identification accuracy
| CNV Length | Accuracy (C=40) | Accuracy (C=30) | Accuracy (C=20) | Run Time in sec. (C=40) | Run Time in sec. (C=30) | Run Time in sec. (C=20) |
|---|---|---|---|---|---|---|
| l ≥ 1,000,000 | 72.35% | 68.6% | 67.8% | 968.59 | 228.39 | 93.6 |
| 500,000 ≤ l < 1,000,000 | 79.35% | 78% | 77.5% | 189.38 | 100.25 | 60.17 |
| 100,000 ≤ l < 500,000 | 84.10% | 84.8% | 83.2% | 3.71 | 3.42 | 3.04 |
| 50,000 ≤ l < 100,000 | 82.3% | 82.3% | 88% | 0.01 | 0.01 | 0.01 |
| l < 50,000 | 96.7% | 96.7% | 96.7% | 0.016 | 0.014 | 0.014 |
l is the length of the CNV in the reference, C is coverage ratio.
Copy Counts vs. CNV junction identification accuracy
| Copy-Counts | Accuracy | Average Length | Run time (sec.) |
|---|---|---|---|
| 2 | 77.14% | 519,571 | 79.13 |
| 3 | 79.44% | 446,300 | 115.35 |
| 4 | 73.57% | 525,717 | 1754.72 |
Figure 1. A discordant pair can imply the presence of a CNV.
Figure 2. An example of CNV reconstruction. The reference CNV is “CTGTCG”. The CNV is copied three times in the donor sequence.
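To make the Figure 2 example concrete, the sketch below builds a toy donor sequence containing three copies of “CTGTCG” and recovers the copy-count by scanning for back-to-back repeats. It is only an illustration and not the reconstruction algorithm of the paper: the flanking sequences are made up, the helper names are hypothetical, and placing the copies in tandem is an assumption the caption does not state.

```python
def build_donor(left_flank, unit, copies, right_flank):
    """Toy donor: flanks around `copies` tandem repeats of `unit` (assumed layout)."""
    return left_flank + unit * copies + right_flank


def count_tandem_copies(sequence, unit):
    """Return the longest run of back-to-back copies of `unit` in `sequence`."""
    best = 0
    i = sequence.find(unit)
    while i != -1:
        run = 0
        j = i
        # Extend the run while another copy starts exactly where the last ended.
        while sequence.startswith(unit, j):
            run += 1
            j += len(unit)
        best = max(best, run)
        i = sequence.find(unit, i + 1)
    return best


# Flanks "AAGT" and "TTAC" are invented for the example.
donor = build_donor("AAGT", "CTGTCG", 3, "TTAC")
print(count_tandem_copies(donor, "CTGTCG"))  # prints 3
```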