Ramakrishnan Rajagopalan, Jill R Murrell, Minjie Luo, Laura K Conlin.
Abstract
BACKGROUND: Exome sequencing (ES) is a first-tier diagnostic test for many suspected Mendelian disorders. Although detecting small sequence variants from ES data is routine, detecting germline copy-number variants (CNVs) from the same data is not standard clinical practice, for several performance-related reasons. In this work, we comprehensively characterized one of the most sensitive ES-based CNV tools, ExomeDepth, against the SNP array, a standard-of-care clinical test for detecting genome-wide CNVs.
Keywords: Clinical exome sequencing; Copy-number variation
Year: 2020 PMID: 32000839 PMCID: PMC6993336 DOI: 10.1186/s13073-020-0712-0
Source DB: PubMed Journal: Genome Med ISSN: 1756-994X Impact factor: 11.117
Fig. 1 The default (a) and the modified (b) exome-based CNV detection and validation workflow
Fig. 2 Schematic showing the cohorts used in this study
Characteristics of all the CNVs from ES with the default ExomeDepth workflow (307 samples)
| | Deletions | Duplications |
|---|---|---|
| Number of CNVs | 24,628 | 19,976 |
| Number of exons | 1 to 470 (mean = 4 (13 kb), median = 2 (968 bp)) | 1 to 981 (mean = 4 (14 kb), median = 2 (1.4 kb)) |
| Number of CNVs per individual | 45 to 210 (mean = 80, median = 78) | 34 to 158 (mean = 65, median = 63) |
Filtering cascade to create a list of high-quality true-positive CNVs from the SNP arrays
| Filtering cascade | Autosomal loss (Het) | Autosomal loss (Hom) | Autosomal gain (Dup) | Autosomal gain (Trip) | Chr X loss (Het) | Chr X loss (Hom) | Chr X loss (Hemi) | Chr X gain (Dup) | Chr X gain (Trip) | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| All | 5166 | 194 | 1062 | 7 | 103 | 5 | 12 | 84 | 1 | 6634 |
| CNVs with ≥ 10 probes | 527 | 11 | 411 | 6 | 17 | 3 | 2 | 35 | 1 | 1013 |
| CNVs with an overlapping exon and a bait | 170 | 6 | 268 | 4 | 8 | 0 | 2 | 28 | 1 | 487 |
| Refined true positives | 165 | 6 | 266 | 4 | 7 | 0 | 2 | 28 | 1 | 479 |
Het heterozygous, Hom homozygous, Hemi hemizygous, Dup duplication, Trip triplication
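
The first two filtering steps in the table above can be sketched in a few lines. This is only an illustration under stated assumptions: the `ArrayCNV` record and its field names are hypothetical, and the final "refined true positives" step is not modeled because its criteria are not given in the table.

```python
from dataclasses import dataclass


@dataclass
class ArrayCNV:
    """Hypothetical record for one SNP-array CNV call (field names are assumptions)."""
    n_probes: int        # number of SNP probes supporting the call
    overlaps_exon: bool  # True if the CNV overlaps at least one exon
    overlaps_bait: bool  # True if the CNV overlaps at least one capture bait


def filter_cascade(cnvs, min_probes=10):
    """Apply the probe-count filter, then the exon-and-bait overlap filter.

    Returns the number of CNVs remaining after each step.
    """
    # Step 1: keep CNVs supported by at least `min_probes` probes.
    enough_probes = [c for c in cnvs if c.n_probes >= min_probes]
    # Step 2: keep CNVs that overlap both an exon and a bait.
    exon_and_bait = [c for c in enough_probes if c.overlaps_exon and c.overlaps_bait]
    return {
        "all": len(cnvs),
        "min_probes": len(enough_probes),
        "exon_and_bait": len(exon_and_bait),
    }


# Toy input, not data from the study.
example = [
    ArrayCNV(12, True, True),
    ArrayCNV(5, True, True),
    ArrayCNV(20, True, False),
    ArrayCNV(10, False, True),
]
counts = filter_cascade(example)
```

Returning the count after every step mirrors the layout of the table, where each row reports how many CNVs survive that filter.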
Characteristics of the high-quality true-positive CNV dataset as defined by the SNP array
| Characteristic | Deletion | Duplication |
|---|---|---|
| Number of CNVs | 180 | 299 |
| Size | 2.1 kb to 3.1 Mb (mean = 160 kb, median = 57 kb) | 6 kb to 1.8 Mb (mean = 202 kb, median = 94 kb) |
| No. of SNP probes | 10 to 2258 (mean = 67, median = 20) | 10 to 774 (mean = 62, median = 29) |
| No. of exons overlapping the CNV | 1 to 464 (mean = 15, median = 9) | 1 to 359 (mean = 19, median = 11) |
| Small CNVs (< 4 exons) | 36 (20%) | 68 (23%) |
| Clinically reported CNVs | 24 (13%) | 17 (6%) |
Sensitivity of the default and modified ExomeDepth workflow
| True-positive rate | Default workflow: deletions | Default workflow: duplications | Modified workflow: deletions | Modified workflow: duplications |
|---|---|---|---|---|
| Overall | 96% (172/180) | 95% (283/299) | 98% (163/166) | 96% (280/293) |
| Heterozygous deletions | 95% (164/172) | | 98% (157/160) | |
| Homozygous deletions | 100% (6/6) | | 100% (4/4) | |
| Hemizygous deletions | 100% (2/2) | | 100% (2/2) | |
| Duplications | | 95% (278/294) | | 95% (275/288) |
| Triplications | | 100% (5/5) | | 100% (5/5) |
| Autosomal | 96% (165/171) | 95% (256/270) | 98% (156/159) | 95% (254/266) |
| Chromosome X | 78% (7/9) | 93% (27/29) | 100% (7/7) | 96% (26/27) |
| Clinically reported CNVs | 100% (24/24) | 100% (17/17) | 100% (22/22) | 100% (17/17) |
| CNVs overlapping < 4 exons | 86% (31/36) | 87% (59/68) | 94% (29/31) | 87% (58/67) |
| CNVs overlapping ≥ 4 exons | 98% (141/144) | 97% (224/231) | 99% (134/135) | 98% (222/226) |
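
Each true-positive rate in the table above is the number of array-defined CNVs that the exome workflow also detected, divided by the number of array-defined CNVs in that category. The sketch below recomputes the "Overall" row from the counts in the table; the function name `sensitivity` and the dictionary keys are chosen here for illustration only.

```python
def sensitivity(detected, total):
    """True-positive rate: detected truth-set CNVs divided by all truth-set CNVs."""
    if total <= 0:
        raise ValueError("total must be a positive count")
    return detected / total


# (detected, total) pairs copied from the "Overall" row of the table.
overall = {
    "default deletions": (172, 180),
    "default duplications": (283, 299),
    "modified deletions": (163, 166),
    "modified duplications": (280, 293),
}

for label, (detected, total) in overall.items():
    rate = sensitivity(detected, total)
    print(f"{label}: {rate:.1%} ({detected}/{total})")
```

Printing one decimal place shows how close some values sit to a rounding boundary, which matters when comparing the two workflows.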
Fig. 3 Analysis of CNVs from the false discovery rate cohort, stratified by false positives and true positives. a Violin plot of the percentage of exons that overlap segmental duplications within each CNV. b Violin plot of the mean mappability score across each CNV
Fig. 4 Histogram of the number of CNVs identified per individual using the default and the modified ExomeDepth workflow. The dotted lines represent the mean value for each group (145 for the default and 51 for the modified workflow)
Details of the validation of the CNVs identified in STRC
| Genomic coordinates of the CNV from ES (hg19) | Exome call | Validation method | Result |
|---|---|---|---|
| chr15:43891026-43895609 | Deletion | SNP array ( | Heterozygous deletion of |
| chr15:43891026-43940259 | Deletion | SNP array ( | Heterozygous deletion of |
| chr15:43891026-43940259 | Duplication | ddPCR ( | Duplication of |
| chr15:43892733-43892880 | Deletion | long-range PCR followed by NGS ( | Gene conversion involving exon 26 of |
| chr15:43893595-43893749 | Deletion | long-range PCR followed by NGS ( | Gene conversion involving exon 24 of |
New diagnoses made by the ES pipeline that were previously not reported
| ID | Genomic coordinates (hg19) | Gene | Size (bp) | CNV | Number of exons | SNPs in SNP array | Comment | Confirmation method |
|---|---|---|---|---|---|---|---|---|
| 1 | chr12:116,457,030-116,460,406 | | 3376 | Het del | 3 | 3 | Under SNP array resolution | ddPCR |
| 2 | chr6:33,405,980-33,409,266 | | 3286 | Het del | 5 | 4 | Under SNP array resolution | ddPCR |
| 3 | chr3:191,888,248-192,126,012 | | 237,765 | Dup | 4 | 123 | Not known disease gene | SNP array and breakpoint sequencing |
| 4 | chr4:123,976,639-123,989,201 | | 12,562 | Het del | 2 | 3 | Under SNP array resolution, in trans with SNV | ddPCR and breakpoint sequencing |
Number of CNVs identified by the modified ExomeDepth pipeline at every stage
| ID | Number of CNVs identified by the default pipeline | Number of CNVs identified by the modified pipeline | Number of reproducible CNVs (> 850 iterations) | Number of CNVs in OMIM disease genes | Number of diagnostic CNVs relevant to the patient phenotype |
|---|---|---|---|---|---|
| 1 | 137 | 47 | 27 | 3 | 1 |
| 2 | 152 | 54 | 33 | 1 | 1 |
| 3 | 163 | 54 | 26 | 1 | 1 |
| 4 | 174 | 53 | 26 | 1 | 1 |
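
To see how much each stage narrows the candidate list, the per-stage counts in the table above can be averaged across the four cases. The sketch below simply transcribes the table; the stage labels and the helper `stage_means` are names introduced here for illustration.

```python
# Stage labels are shorthand for the table's column headers.
STAGES = ["default", "modified", "reproducible", "omim_gene", "diagnostic"]

# One tuple per case (IDs 1 to 4), in the column order of the table.
ROWS = [
    (137, 47, 27, 3, 1),
    (152, 54, 33, 1, 1),
    (163, 54, 26, 1, 1),
    (174, 53, 26, 1, 1),
]


def stage_means(rows):
    """Return the mean CNV count per stage across all cases."""
    n_cases = len(rows)
    return [sum(column) / n_cases for column in zip(*rows)]


means = dict(zip(STAGES, stage_means(ROWS)))
for stage, mean in means.items():
    print(f"{stage}: {mean:g} CNVs per case on average")
```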