Theodore Chiang1, Xiuping Liu2, Tsung-Jung Wu2, Jianhong Hu2, Fritz J Sedlazeck2, Simon White3, Daniel Schaid4, Mariza de Andrade4, Gail P Jarvik5, David Crosslin6, Ian Stanaway6, David S Carrell6, John J Connolly7, Hakon Hakonarson7, Emily E Groopman8, Ali G Gharavi8, Alexander Fedotov9, Weimin Bi10,11, Magalie S Leduc12, David R Murdock2,10, Yunyun Jiang2, Linyan Meng10,11, Christine M Eng10,11, Shu Wen10,11, Yaping Yang10,11, Donna M Muzny2, Eric Boerwinkle2,13, William Salerno2, Eric Venner2, Richard A Gibbs2,10.
Abstract
PURPOSE: To provide a validated method to confidently identify exon-containing copy-number variants (CNVs), with a low false discovery rate (FDR), in targeted sequencing data from a clinical laboratory, with particular focus on single-exon CNVs.
Keywords: Atlas-CNV; CNV; copy-number variation; single-exon deletion duplication; targeted gene panel clinical sequencing
Year: 2019 PMID: 30890783 PMCID: PMC6752313 DOI: 10.1038/s41436-019-0475-4
Source DB: PubMed Journal: Genet Med ISSN: 1098-3600 Impact factor: 8.822
Fig. 1 Atlas-CNV method for calling exonic copy-number variants (CNVs) in a midpool of samples. (a) RPKM normalization is first performed on each sample, with each exon assigned a single coverage value expressed as reads per kilobase of exon per million reads in the sample. (b) The median sample at each exon is selected as the reference after the 5% outliers are excluded. Log2 scores of sample/median are computed for each sample at that exon, and the standard deviation (StDev) of these scores is called the EStDev for that exon. Low-quality exons and samples are filtered and flagged. (c) CNVs are called with thresholds at the exon level, with visual bar plots at the sample gene level and at the batch exon level. ANOVA analysis of variance, DoC depth of coverage.
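The normalization steps in panels (a) and (b) can be sketched roughly as follows. This is a minimal illustration, not the Atlas-CNV implementation: the function names `rpkm` and `exon_log2_scores` are hypothetical, the caption does not say how the 5% outliers are chosen (symmetric trimming of both tails is assumed here), and every RPKM value is assumed to be greater than zero.

```python
import math
import statistics


def rpkm(read_count, exon_length_bp, total_sample_reads):
    """Reads per kilobase of exon per million reads in the sample."""
    return read_count / (exon_length_bp / 1_000) / (total_sample_reads / 1_000_000)


def exon_log2_scores(rpkm_by_sample, outlier_fraction=0.05):
    """For one exon, return (log2 score per sample, StDev of those scores).

    Assumption: outliers are trimmed symmetrically, half from each tail,
    before the median reference is chosen. All RPKM values must be > 0.
    """
    values = sorted(rpkm_by_sample.values())
    k = int(len(values) * outlier_fraction / 2)
    trimmed = values[k:len(values) - k] if k else values
    reference = statistics.median(trimmed)
    scores = {s: math.log2(v / reference) for s, v in rpkm_by_sample.items()}
    return scores, statistics.stdev(list(scores.values()))


# Toy batch of five samples at one exon; s3 has half the coverage of the rest.
toy_exon = {"s1": 100.0, "s2": 100.0, "s3": 50.0, "s4": 100.0, "s5": 100.0}
scores, e_stdev = exon_log2_scores(toy_exon)
print(scores["s3"])  # -1.0, the log2 ratio expected for a heterozygous deletion
```

In this toy batch the single half-coverage sample inflates the exon's standard deviation, which is one reason a real pipeline would need many more samples per batch than shown here.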
Gold standard CNVs from 13 clinical samples used to assess Atlas-CNV performance
| Sample | Gene | Chr | Start | End | Size (a) | cSNP size (b) | Class | Number of ES exons in CNV | Exon range of CNV |
|---|---|---|---|---|---|---|---|---|---|
| 1–100155 | | 16 | 15797848 | 15932110 | 134,262 | 815,577 | hetdel | 42 | 2–42 (c) |
| 2–100159 | | 16 | 15797848 | 15932110 | 134,262 | 735,806 | hetdel | 42 | 2–42 |
| 3–100161 | | 16 | 15797848 | 15932110 | 134,262 | 2,104,716 | hetdel | 42 | 2–42 |
| 4–100171 | | 16 | 15797848 | 15932110 | 134,262 | 1,158,489 | hetdel | 42 | 2–42 |
| 5–100168 | | 1 | 237205822 | 237995948 | 790,126 | 4,170,626 | hetdel | 104 | 1–105 (d) |
| 6–100175 | | 18 | 29078215 | 29118942 | 40,727 | 20,632 | hetdel | 12 | 1–12 |
| 7–100185 | | 17 | 41242961 | 41249307 | 6346 | 5803 | hetdel | 4 | 8–11 |
| 8–100196 | | 9 | 6241695 | 6256169 | 14,474 | 232,335 | hetdel | 7 | 1–7 |
| 9–100199 | | 14 | 23882022 | 23889487 | 7465 | 28,302 | hetdel | 14 | 27–40 |
| 10–100208 | | 3 | 37089010 | 37116608 | 27,598 | 27,528 | hetdel | 4 (9) | 16–19 (21–29) |
| 11–100184 | | 17 | 36047375 | 36104876 | 57,501 | 1,351,422 | hetdel | 9 | 1–9 |
| 12–100189 | | 17 | 36047336 | 36104876 | 57,540 | 1,375,039 | hetdel | 9 | 1–9 |
| 13–100209 | | 17 | 36047336 | 36104876 | 57,540 | 1,375,039 | hetdel | 9 | 1–9 |
All CNVs are heterozygous deletions and were previously identified by ES and the Illumina HumanExome-12v array, with the exception of sample 13, which has no ES data.
CNV copy-number variant, ES exome sequencing.
(a) CNV size from clinical ES.
(b) CNV size from Illumina HumanExome-12v array.
(c) MYH11 exon 42 has 2 targets in ES data.
(d) RYR2 exon 91 is absent in ES.
Fig. 2 Performance measures comparing Atlas-CNV and VisCap on 13 gold standard samples. Technical replicates were sequenced in triplicate for all samples. Biological replicates were performed on two samples (1 and 12). The performance measure for each sample is first obtained by averaging sample replicates in comparison with the gold standard (GS). Then, an overall mean of the 13 samples is plotted in each bar. Reproducibility is the pairwise comparison of two identical runs i, j, expressed as the proportion of common exons to the union of exons in i and j. Sensitivity is the proportion of GS exons that were called (true positives, TPs) over the sum of TPs and false negatives (FNs; failed to call). Specificity is the proportion of exons other than GS that were not called (true negatives, TNs) over the sum of TNs and false positives (FPs; failed to reject). Precision and false discovery rate (FDR) are the respective proportions of TPs and FPs over all positives.
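The measures defined in this caption can be written out as a short sketch. The function names and the exon labels are hypothetical, and two conventions here are assumptions rather than anything stated in the source: reproducibility is reported as 1.0 when neither run calls any exon, and any ratio with a zero denominator is returned as NaN.

```python
def _ratio(numerator, denominator):
    # Assumption: undefined ratios are reported as NaN rather than raising.
    return numerator / denominator if denominator else float("nan")


def reproducibility(calls_i, calls_j):
    """Common exons over the union of exons called in runs i and j."""
    a, b = set(calls_i), set(calls_j)
    union = a | b
    # Assumption: two runs with no calls at all agree perfectly.
    return len(a & b) / len(union) if union else 1.0


def performance(called, gold_standard, all_exons):
    """Exon-level sensitivity, specificity, precision, and FDR."""
    called, gold, universe = set(called), set(gold_standard), set(all_exons)
    tp = len(called & gold)             # GS exons that were called
    fp = len(called - gold)             # called exons outside the GS
    fn = len(gold - called)             # GS exons that were missed
    tn = len(universe - called - gold)  # non-GS exons correctly not called
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "precision": _ratio(tp, tp + fp),
        "fdr": _ratio(fp, tp + fp),
    }


# Toy example: 10 targeted exons, 4 in the gold standard, 4 called.
exons = [f"e{n}" for n in range(1, 11)]
result = performance(
    called=["e1", "e2", "e3", "e5"],
    gold_standard=["e1", "e2", "e3", "e4"],
    all_exons=exons,
)
print(result["sensitivity"], result["fdr"])  # 0.75 0.25
print(reproducibility(["e1", "e2", "e3"], ["e2", "e3", "e4"]))  # 0.5
```

Precision and FDR share the same denominator (all positive calls), so they always sum to 1 whenever at least one exon is called.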
Fig. 3 Distribution of copy-number variants (CNVs) in 10,926 eMERGE samples: deletions (a) and duplications (b). Values in bars represent the sample count broken down by CNV type (single-exon, exactly two exons, three or more, or full gene), with total counts and overall frequency listed in square brackets in the legend. American College of Medical Genetics and Genomics (ACMG) genes are grouped alphabetically in the left dotted box and eMERGE-specific genes in the right. Gene names labeled in blue were identified in both gains and losses.
Candidate single-exon CNVs from 29 eMERGESeq samples selected for MLPA confirmation
| Sample | Site | Gene-Transcript_Exon | Class | CNV exons in sample | CNV genes in sample | Exon size | Target size | Mappability | C-score | EStDev | SampleQC | Distance of probe(s) to exon | MLPA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Columbia | BRCA1–005_8 | hetdel | 6 | | | 99 | 1.00 | | 0.10 | 0.15 | | No |
| 2 | NU-CGM | BRCA1–005_7 | hetdel | 2 | 2 | 106 | 105 | 1.00 | −14.17 | 0.06 | 0.10 | | Yes |
| 3 | Mayo | BRCA1–005_2 | hetdel | 1 | 1 | 99 | 99 | 0.97 | −11.67 | 0.09 | 0.08 | | Yes |
| 4 | Mayo | LDLR-003_5 | hetdel | 2 | 2 | 123 | 122 | 1.00 | −10.33 | 0.09 | 0.10 | | Yes |
| 5 | NU-CGM | MLH1–001_13 | hetdel | 1 | 1 | 149 | 148 | 1.00 | −22.40 | 0.05 | 0.09 | | Yes |
| 6 | Vanderbilt | PKP2–002_1 | hetdel | 1 | 1 | 248 | 222 | 0.95 | −11.88 | 0.08 | 0.08 | | Yes |
| 7 | CHOP | PKP2–002_4 | hetdel | 1 | 1 | 136 | 135 | 0.97 | −10.91 | | 0.10 | | Yes |
| 8 | Vanderbilt | DSP-001_24 | hetdel | 1 | 1 | 4076 | 3236 | 1.00 | −42.50 | 0.02 | 0.10 | | Yes |
| 9 | Vanderbilt | DSP-001_24 | hetdel | 1 | 1 | 4076 | 3236 | 1.00 | −29.33 | 0.03 | 0.09 | | Yes |
| 10 | Mayo | CFTR-001_2 | hetdel | 1 | 1 | 111 | 110 | 1.00 | −18.00 | 0.07 | 0.16 | | Yes |
| 11 | Mayo | TGFBR2–002_3 | dup | 1 | 1 | 169 | 168 | 1.00 | 9.33 | 0.06 | 0.08 | | Yes |
| 12 | CHOP | TCF4–004_3 | hetdel | 2 | 2 | 92 | 99 | 1.00 | −8.75 | 0.08 | 0.14 | | No |
| 13 | CHOP | PTEN-001_2 | hetdel | 6 | | 85 | 99 | | −8.10 | 0.10 | 0.15 | | No |
| 14 | Columbia | PTEN-001_2 | hetdel | 37 | | 85 | 99 | | −8.00 | 0.09 | 0.20 | | No |
| 15 | Mayo | TCF4–004_2 | hetdel | 1 | 1 | 91 | 99 | | −10.64 | | 0.09 | 315 | Yes |
| 16 | NU-CGM | PTEN-001_3 | hetdel | 14 | | | 99 | 0.96 | −8.86 | 0.07 | 0.17 | 145,28,226 | Inconclusive |
| 17 | NU-CGM | PTEN-001_4 | hetdel | 3 | 3 | | 99 | 1.00 | | 0.09 | 0.15 | 14,61 | Inconclusive |
| 18 | Vanderbilt | CACNA1A-001_47 | hetdel | 2 | 2 | 1612 | 740 | 0.99 | −12.33 | 0.06 | 0.12 | Flanking 5’ | Inconclusive |
| 19 | Mayo | CACNA1A-001_47 | hetdel | 2 | 2 | 1612 | 740 | 0.99 | −8.86 | 0.07 | 0.13 | Flanking 5’ | Inconclusive |
| 20 | Mayo | CACNA1A-001_47 | hetdel | 3 | 3 | 1612 | 740 | 0.99 | −8.25 | 0.08 | 0.15 | Flanking 5’ | Inconclusive |
| 21 | Vanderbilt | CACNA1A-001_47 | hetdel | 1 | 1 | 1612 | 740 | 0.99 | −8.00 | 0.08 | 0.13 | Flanking 5’ | Inconclusive |
| 22 | NU-CGM | MYH7–001_27 | hetdel | 3 | 3 | 390 | 389 | | −11.00 | 0.09 | 0.11 | 63 | Inconclusive |
| 23 | CHOP | MYH7–001_27 | hetdel | 1 | 1 | 390 | 389 | | −9.88 | 0.08 | 0.09 | 63 | Inconclusive |
| 24 | Columbia | MYH7–001_27 | hetdel | 1 | 1 | 390 | 389 | | −9.44 | 0.09 | 0.09 | 63 | Inconclusive |
| 25 | Vanderbilt | CHEK2–003_15 | hetdel | 1 | 1 | 81 | 99 | | −13.83 | 0.06 | 0.08 | 207 | Inconclusive |
| 26 | Mayo | CHEK2–003_14 | hetdel | 1 | 1 | 86 | 99 | | −8.92 | | 0.10 | 54 | Inconclusive |
| 27 | Mayo | CHEK2–003_15 | hetdel | 1 | 1 | 81 | 99 | | −8.57 | | 0.10 | 207 | Inconclusive |
| 28 | Vanderbilt | SDHC-001_1 | hetdel | 3 | 3 | 169 | 99 | 1.00 | −9.25 | 0.08 | 0.13 | 382,132 | Inconclusive |
| 29 | Vanderbilt | SDHC-001_1 | hetdel | 1 | 1 | 169 | 99 | 1.00 | −8.56 | 0.09 | 0.09 | 382,132 | Inconclusive |
Twenty-three high-confidence CNVs (C-score ≥8 and EStDev ≤0.1) and 6 borderline CNVs were tested, even though over half (15/29) did not have MLPA probes located exactly on the exon of interest (see probe distance to exon). A high number of CNV genes per sample, exon size <50 bp, or mappability <0.8 may be factors that account for some of the negative and inconclusive cases (see values in bold). For candidates with C-score >8 and CNV genes per sample <3, MLPA tests confirmed 90.9% (10/11) of single-exon CNVs.
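CHOP Children's Hospital of Philadelphia, CNV copy-number variant, MLPA multiplex ligation-dependent probe amplification, NU-CGM Northwestern University Center for Genetic Medicine.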
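The high-confidence criterion quoted in the table note (C-score ≥8 and EStDev ≤0.1) can be expressed as a small predicate. This is a hedged sketch: the function name `is_high_confidence` is hypothetical, and comparing the absolute value of the C-score is an assumption made because deletions in the table carry negative C-scores while the note states the threshold as a positive number.

```python
def is_high_confidence(c_score, e_stdev, min_abs_c=8.0, max_e_stdev=0.1):
    """Apply the thresholds from the table note to one candidate CNV.

    Assumption: the C-score threshold applies to the magnitude, so a
    deletion at -14.17 and a duplication at 9.33 both pass.
    """
    return abs(c_score) >= min_abs_c and e_stdev <= max_e_stdev


# Values taken from the table: sample 2 (deletion) and sample 11 (duplication).
print(is_high_confidence(-14.17, 0.06))  # True
print(is_high_confidence(9.33, 0.06))    # True

# Hypothetical borderline candidate, not taken from the table.
print(is_high_confidence(-7.5, 0.09))    # False

# Confirmation rate quoted in the note: 10 of 11 candidates.
print(round(100 * 10 / 11, 1))           # 90.9
```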