Xin Zhang, Renqian Du, Shilin Li, Feng Zhang, Li Jin, Hongyan Wang.
Abstract
BACKGROUND: Copy Number Variations (CNVs) are usually inferred from Single Nucleotide Polymorphism (SNP) arrays using software packages that each implement a particular algorithm. However, the relative performance of these packages is not well understood, which makes it difficult to select one or several of them for CNV detection on the SNP array platform. We selected four publicly available software packages designed for CNV calling from an Affymetrix SNP array: Birdsuite, dChip, Genotyping Console (GTC) and PennCNV. A publicly available dataset generated by array-based Comparative Genomic Hybridization (CGH), with a resolution of 24 million probes per sample, was treated as the "gold standard". The success rate, average stability rate, sensitivity, consistency and reproducibility of the four packages were assessed against this gold standard. In particular, we compared the efficiency of detecting CNVs with two, three or all four packages simultaneously against that of a single package.
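The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `CNV` tuple, the 50% reciprocal-overlap matching rule, the package labels `"A"` and `"B"`, and the toy coordinates are all assumptions made for illustration, since the abstract does not state how a call is matched to a gold-standard CNV.

```python
from typing import NamedTuple


class CNV(NamedTuple):
    chrom: str
    start: int  # inclusive
    end: int    # exclusive


def reciprocal_overlap(a: CNV, b: CNV) -> float:
    """Return the smaller of the two overlap fractions (0.0 if disjoint)."""
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def sensitivity(calls, gold, min_overlap=0.5):
    """Fraction of gold-standard CNVs matched by at least one call.

    The 0.5 threshold is an assumed matching rule, not taken from the paper.
    """
    if not gold:
        return 0.0
    hits = sum(
        any(reciprocal_overlap(g, c) >= min_overlap for c in calls)
        for g in gold
    )
    return hits / len(gold)


def support_counts(gold, calls_by_package, min_overlap=0.5):
    """For each gold-standard CNV, count how many packages detected it."""
    return [
        sum(
            any(reciprocal_overlap(g, c) >= min_overlap for c in calls)
            for calls in calls_by_package.values()
        )
        for g in gold
    ]


# Toy data (hypothetical coordinates and package labels).
gold = [CNV("chr1", 100, 200), CNV("chr1", 500, 800), CNV("chr2", 50, 150)]
calls_by_package = {
    "A": [CNV("chr1", 110, 210), CNV("chr2", 40, 160)],
    "B": [CNV("chr1", 100, 200)],
}

for name, calls in calls_by_package.items():
    print(name, sensitivity(calls, gold))
print(support_counts(gold, calls_by_package))
```

Counting per-CNV support in this way makes it straightforward to ask how many gold-standard CNVs are recovered by at least two, three or all packages, which is the kind of multi-package comparison the abstract mentions.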
Year: 2014 PMID: 24555668 PMCID: PMC4015297 DOI: 10.1186/1471-2105-15-50
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Flowchart of the study.
Summary statistics of CNVs called by four software packages
| Birdsuite | 127,235 | 76,082 | 951 | 8.1% |
| dChip | 178,920 | 83,144 | 639 | 5.4% |
| GTC | 316,932 | 181,000 | 205 | 1.7% |
| PennCNV-Affy | 152,634 | 87,813 | 564 | 4.8% |
| CGH& | 19,040 | 11,502 | 11,759 | 100% |
& The data are from Park et al. [12].
Comparison of CNVs between two high-throughput platforms
| Birdsuite | 41.3% | 12.4% | 46.3% |
| dChip | 9.4% | 31.0% | 59.6% |
| GTC | 66.3% | 3.9% | 29.8% |
| PennCNV-Affy | 45.9% | 13.8% | 40.3% |
Figure 2. Study of CNV calling in matched groups. (A) Venn diagram showing CNV calls generated by the four software packages. (B) CNV calls generated by multiple software packages.
The average success rate of the four CNV-calling methods, according to CNV length and frequency (a)
| 30-100K | 1075 | 176(16.4%) | 45(4.2%) | 117(10.9%) | 157(14.6%) |
| 100-150K | 209 | 75(35.9%) | 38(18.2%) | 41(19.6%) | 60(28.7%) |
| 150-1000K | 334 | 208(62.3%) | 81(24.3%) | 101(30.2%) | 110(32.9%) |
| | | | | | |
| ≤20% | 417 | 85(20.4%) | 38(9.1%) | 72(17.3%) | 70(16.8%) |
| 20%<a<=40% | 216 | 69(31.9%) | 39(17.8%) | 59(27.3%) | 72(33.3%) |
| 40%<a<=60% | 188 | 89(47.3%) | 50(26.6%) | 60(31.9%) | 66(35.1%) |
| 60%<a<=80% | 107 | 46(43%) | 8(7.5%) | 37(34.6%) | 34(31.8%) |
| a>80% | 699 | 170(24.3%) | 29(4.1%) | 32(4.6%) | 85(12.2%) |
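The parenthesised percentages in the table above appear to be each count divided by the bin total in the second column. The following Python sketch checks this for the "150-1000K" row. The function name `success_rate` is hypothetical, and reading the second column as the bin total is an assumption, because the table has no header row.

```python
def success_rate(detected: int, total: int) -> float:
    """Percentage of CNVs in a bin that a method recovered, to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * detected / total, 1)


# Row "150-1000K": assumed bin total, then the four per-method counts
# in the order they appear in the table.
total = 334
detected = [208, 81, 101, 110]

rates = [success_rate(d, total) for d in detected]
print(rates)  # [62.3, 24.3, 30.2, 32.9], matching the printed percentages
```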
Figure 3. Study performance of CNV calling. (A) Size distribution of CNV calls. (B) CNV frequency of occurrence.
Figure 4. Study of CNV calling by size distribution and chromosome distribution. (A) Size distribution of CNV calls. (B) Chromosome distribution of CNV calls.
Batch effect test
| Birdsuite | 41.7 | 51.6 | 52.9 |
| GTC | 52.9 | 52.9 | 52.9 |
| dChip | 56.7 | 73.0 | 75.0 |
| PennCNV-Affy | 88.6 | 85.0 | 85.7 |
Figure 5. Study of CNV calling in the non_overlap group. (A) Venn diagram showing CNV calls generated by the four software packages. (B) CNV calls generated by multiple software packages.
Figure 6. ROC/AUC of the study.