Henry C M Leung, Huijing Yu, Yifan Zhang, Wing Sze Leung, Ivan F M Lo, Ho Ming Luk, Wai-Chun Law, Ka Kui Ma, Chak Lim Wong, Yat Sing Wong, Ruibang Luo, Tak-Wah Lam.
Abstract
Structural variation (SV) is a major cause of genetic disorders. In this paper, we show that low-depth (specifically, 4×) whole-genome sequencing using a single Oxford Nanopore MinION flow cell suffices to support sensitive detection of SV, particularly pathogenic SV for supporting clinical diagnosis. When using 4× ONT WGS data, existing SV calling software often fails to detect pathogenic SV, especially long deletions, terminal deletions, duplications, and unbalanced translocations. Our new SV calling software SENSV can achieve high sensitivity for all types of SV and a breakpoint precision typically within ± 100 bp; both features are important for clinical use. The improvement achieved by SENSV stems from several new algorithms. We evaluated SENSV and other software using both real and simulated data. The real data came from 24 patient samples, each diagnosed with a genetic disorder. SENSV found the pathogenic SV in 22 out of 24 cases (all heterozygous, size from hundreds of kbp to a few Mbp), reporting breakpoints within 100 bp of the true answers. In contrast, no existing software detected the pathogenic SV in more than 10 out of 24 cases, even when the breakpoint requirement was relaxed to ± 2000 bp.
Year: 2022 PMID: 35296758 PMCID: PMC8927474 DOI: 10.1038/s41598-022-08576-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Comparison of SENSV, NanoVar, Sniffles, SVIM and cuteSV on the ability to detect the pathogenic SV from the 24 patients’ ONT WGS data.
| ID | SV length | SV type | Detect the pathogenic SV? (# of predicted SVs of the same type) | ||||
|---|---|---|---|---|---|---|---|
| SENSV | NanoVar | Sniffles | SVIM | cuteSV | |||
| 1 | 146 K | Long deletion | Y- (1027) | ||||
| 2 | 266 K | Long deletion | |||||
| 3 | 638 K | Long deletion | N (1743) | N (1146) | N (732) | N (2207) | N (599) |
| 4 | 670 K | Long deletion | N (1176) | N (794) | N (2036) | N (658) | |
| 5 | 1.5 M | Long deletion | N (1514) | N (1062) | N (4970) | N (848) | |
| 6 | 1.4 M | Long deletion | N (3692) | N (1375) | Y- (17,783) | N (834) | |
| 7 | 1.4 M | Duplication | N (1779) | N (667) | Y- (1901) | N (2105) | |
| 8 | 2.8 M | Long deletion | N (1320) | N (1192) | N (2983) | N (1016) | |
| 9 | 5.2 M | Long deletion | N (1294) | N (1014) | N (1982) | N (786) | |
| 10 | 6.6 M | Long deletion | N (1474) | ||||
| 11 | 342 K | Unbalanced translocation | N (490) | N (7106) | N (1383) | N (1154) | N (1310) |
| 12 | 1.4 M | Unbalanced translocation | N (7802) | N (1432) | N (1179) | N (1326) | |
| 13* | 5.9 M | Terminal deletion | N (1498) | N (1510) | N (4649) | N (1168) | |
| 14 | 18 M | Unbalanced translocation | N (6156) | N (1719) | N (1429) | N (1595) | |
| 15 | 19 M | Terminal deletion | N (1220) | N (968) | N (2031) | N (798) | |
| 16 | 58 M | Unbalanced translocation | N (6688) | N (2340) | N (2053) | N (2263) | |
| 17 | 142 K | Inversion | Y- (1413) | Y- (1391) | Y- (849) | ||
| 18* | 73 M | Inversion | Y- (903) | ||||
| 19 | 33 M | Inversion | N (190) | N (210) | |||
| 20 | N/A | Balanced translocation | |||||
| 21* | N/A | Balanced translocation | N (2012) | ||||
| 22 | N/A | Balanced translocation | N (996) | ||||
| 23 | N/A | Balanced translocation | N (1065) | ||||
| 24 | N/A | Balanced translocation | |||||
“Y” and “Y-” mean that a method detected the pathogenic SV with the correct SV type and with breakpoints off by at most 100 bp and at most 2000 bp, respectively; “N” indicates that the method was unable to detect the SV with breakpoints off by at most 2000 bp. SENSV can detect more SVs, especially in difficult cases, with far fewer false positives. Other software usually reports many more SVs than SENSV, but most of them are false positives. Sample IDs marked with an asterisk have been basecalled using both Guppy versions (v3.1.5 and v5.0.11).
The best results of a row are in bold.
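The “Y” / “Y-” / “N” labels defined in the footnote can be illustrated with a minimal Python sketch. This is not SENSV's evaluation code: the `SV` record, the `classify_call` function, and the rule that both breakpoints must fall within the tolerance are assumptions made for illustration, and the sketch only covers SVs on a single chromosome (translocations would need two chromosomes per breakpoint pair).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SV:
    """Hypothetical record for one SV; field names are illustrative only."""
    sv_type: str  # e.g. "DEL", "DUP", "INV"
    chrom: str
    start: int
    end: int


def classify_call(truth: SV, calls: List[SV],
                  strict_bp: int = 100, relaxed_bp: int = 2000) -> str:
    """Return "Y", "Y-" or "N" for one pathogenic SV given a tool's calls.

    Assumption: a call matches only if it has the same SV type and
    chromosome, and BOTH breakpoints are within the tolerance.
    """
    best = "N"
    for call in calls:
        if call.sv_type != truth.sv_type or call.chrom != truth.chrom:
            continue
        # Worst-case offset over the two breakpoints.
        offset = max(abs(call.start - truth.start), abs(call.end - truth.end))
        if offset <= strict_bp:
            return "Y"       # within 100 bp: best possible label
        if offset <= relaxed_bp:
            best = "Y-"      # within 2000 bp only
    return best
```

For example, with a made-up 146 kbp deletion as the truth, a call whose breakpoints are each off by under 100 bp is labelled “Y”, one off by 1500 bp is labelled “Y-”, and one off by 5000 bp, or of a different SV type, is labelled “N”.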
The performance of SENSV, NanoVar, Sniffles, SVIM and cuteSV on detecting SVs in HG002 with breakpoint precision of 100 bp and 2000 bp respectively.
| Short deletions (< 100 kbp) | Long deletion (> 100 kbp) | |||
|---|---|---|---|---|
| Sensitivity | # of confirmed SV detected | Sensitivity | ||
| 353 (381) | 66.48% (71.75%) | 0 (0) | 0% (0%) | |
| 269 (313) | 50.66% (58.95%) | 0 (0) | 0% (0%) | |
| 378 (427) | 71.19% (80.41%) | |||
| 272 (294) | 51.22% (55.38%) | 0 (0) | 0% (0%) | |
The benchmark set of HG002 contains 531 confirmed short deletions smaller than 100 kbp and one long deletion larger than 100 kbp. The sensitivity inside the parentheses is measured with the relaxed breakpoint precision of 2000 bp.
The best results of a row are in bold.
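The sensitivity values in the table follow from dividing the number of confirmed SVs detected by the size of the benchmark set (531 short deletions). A minimal Python sketch of that arithmetic is shown below; the function name `sensitivity` is an assumption for illustration, not part of any of the evaluated tools.

```python
def sensitivity(detected: int, total_confirmed: int) -> float:
    """Sensitivity in percent: detected confirmed SVs / all confirmed SVs."""
    if total_confirmed <= 0:
        raise ValueError("total_confirmed must be positive")
    return 100.0 * detected / total_confirmed


# First row of the table: 353 short deletions found at 100 bp precision,
# 381 at the relaxed 2000 bp precision, out of 531 confirmed.
print(f"{sensitivity(353, 531):.2f}% ({sensitivity(381, 531):.2f}%)")
# prints "66.48% (71.75%)"
```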
The number of SVs detected with a breakpoint precision of 100 bp by the software grouped by SV types in two simulated datasets.
| # of Detected SVs for the 10 implanted SVs | |||||
|---|---|---|---|---|---|
| SENSV | NanoVar | Sniffles | SVIM | cuteSV | |
| Long deletion (> 100 kbp) | 6 | 3 (4) | 5 (6) | 3 | |
| Duplication | 4 | 4 | 5 | 4 | |
| Terminal deletion | 0 | 0 | 2 | 3 | |
| Unbalanced translocation | 2 | 3 | 2 | 3 | |
| Short deletion (< 100 kbp) | 4 | 7 | 6 | ||
| Inversion | 5 | 8 | |||
| Balanced translocation | 7 | 3 | 7 | ||
The number in parentheses is the number of SVs detected with a breakpoint precision of 2000 bp. Each SV type has a total of 10 SVs implanted across the two simulated datasets.
The best results of a row are in bold.
Figure 1. The workflow of SENSV.