Junwei Luo, Hongyu Ding, Jiquan Shen, Haixia Zhai, Zhengjiang Wu, Chaokun Yan, Huimin Luo.
Abstract
BACKGROUND: Structural variations (SVs) occupy a prominent position in human genetic diversity, and deletions form an important type of SV that has been suggested to be associated with genetic diseases. Although various deletion calling methods based on long reads have been proposed, a new approach is still needed to mine features in long-read alignment information. Recently, deep learning has attracted much attention in genome analysis, and it is a promising technique for calling SVs.
Year: 2021 PMID: 34856923 PMCID: PMC8641175 DOI: 10.1186/s12859-021-04499-5
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 BreakNet module for detecting deletions
Details of the datasets
| | HG002 CLR | HG002 CCS | HG00514 | HG00733 | NA19240 |
|---|---|---|---|---|---|
| Read count | 29,157,344 | 6,596,012 | 12,430,587 | 13,521,896 | 20,452,822 |
| Average length (bp) | 7,937 | 13,478 | 11,800 | 12,295 | 6,503 |
| Coverage | 69X | 28X | 42X | 45X | 39X |
| Aligner | NGMLR | PBMM2 | BWA | BWA | BWA |
Performance comparison of SV callers on HG002 data
| Read type | Coverage | Metric | BreakNet | SVIM | cuteSV | Sniffles |
|---|---|---|---|---|---|---|
| CLR | 69X | Precision | 0.9704 | 0.9678 | 0.9604 | |
| | | Recall | 0.9169 | 0.9282 | 0.9224 | |
| | | F1 | 0.9429 | 0.9492 | 0.9410 | |
| | 35X | Precision | 0.9469 | 0.9653 | 0.9556 | |
| | | Recall | 0.9169 | 0.9292 | 0.8955 | 0.9160 |
| | | F1 | 0.9316 | 0.9351 | 0.9355 | |
| | 20X | Precision | 0.9524 | 0.9790 | 0.9720 | |
| | | Recall | 0.8389 | 0.8203 | 0.7983 | |
| | | F1 | 0.9004 | 0.8926 | 0.8770 | |
| | 10X | Precision | 0.9213 | 0.9819 | 0.9785 | |
| | | Recall | 0.6704 | 0.6646 | 0.6470 | |
| | | F1 | 0.7959 | 0.7925 | 0.7790 | |
| CCS | 28X | Precision | 0.9552 | 0.9492 | 0.9020 | |
| | | Recall | 0.9350 | 0.9430 | 0.9336 | 0.8325 |
| | | F1 | 0.9415 | 0.9414 | 0.8657 | |
| | 10X | Precision | 0.9424 | 0.9609 | 0.9110 | |
| | | Recall | 0.9282 | 0.8940 | 0.8398 | 0.6357 |
| | | F1 | 0.9146 | 0.8965 | 0.7490 | |
Bold values represent best results
Performance comparison of SV callers on HG00514 and HG00733 data
| Sample | Coverage | Metric | BreakNet | Sniffles | cuteSV | SVIM |
|---|---|---|---|---|---|---|
| HG00514 | 42X | Precision | 0.6772 | 0.4539 | 0.5547 | |
| | | Recall | 0.3137 | 0.3286 | 0.2261 | |
| | | F1 | 0.4290 | 0.3811 | 0.3213 | |
| | 21X | Precision | 0.5900 | 0.4660 | 0.4407 | |
| | | Recall | 0.2624 | 0.2898 | 0.1632 | |
| | | F1 | 0.3633 | 0.3960 | 0.2382 | |
| | 10X | Precision | 0.5552 | 0.5562 | 0.2593 | |
| | | Recall | 0.2585 | 0.3013 | 0.3003 | |
| | | F1 | 0.3611 | 0.3909 | 0.2783 | |
| HG00733 | 48X | Precision | 0.6528 | 0.4834 | 0.5166 | |
| | | Recall | 0.3076 | 0.2932 | 0.2277 | |
| | | F1 | 0.4182 | 0.3650 | 0.3162 | |
| | 24X | Precision | 0.7197 | 0.4250 | 0.5981 | |
| | | Recall | 0.2832 | 0.3379 | 0.2218 | |
| | | F1 | 0.4065 | 0.3765 | 0.3235 | |
| | 10X | Precision | 0.5444 | 0.5117 | 0.2361 | |
| | | Recall | 0.2510 | 0.2959 | 0.3147 | |
| | | F1 | 0.3611 | 0.375 | 0.2700 | |
Bold values represent best results
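The comparison tables report precision, recall, and F1 for each caller. As a reminder of how these metrics relate, here is a minimal Python sketch assuming the standard definitions (F1 as the harmonic mean of precision and recall). The function name `precision_recall_f1` and the counts in the usage lines are hypothetical and are not taken from the paper; how the authors matched calls against the truth set is not described in this excerpt.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from call counts.

    tp: calls that match a truth-set deletion
    fp: calls with no matching truth-set deletion
    fn: truth-set deletions that were not called
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2.0 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for illustration only (not from the paper).
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.4f} recall={r:.4f} F1={f1:.4f}")
```

Because F1 is a harmonic mean, a caller with high precision but low recall (as in the low-coverage rows) still ends up with a modest F1.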
Fig. 2 Effect of using the new loss and log loss functions on the AUC values during model training
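For readers unfamiliar with the two quantities named in the Fig. 2 caption, the following Python sketch shows the standard binary log loss and a pairwise ROC AUC. It assumes that "log loss" means binary cross-entropy and that "AUC" means area under the ROC curve; the paper's new loss function is not specified in this excerpt and is not reproduced here. The function names and the toy labels and probabilities are hypothetical.

```python
import math
from typing import Sequence


def log_loss(y_true: Sequence[int], y_prob: Sequence[float], eps: float = 1e-12) -> float:
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data for illustration only (not from the paper).
labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.6, 0.7]
print(log_loss(labels, probs), roc_auc(labels, probs))
```

The pairwise AUC is quadratic in the number of samples, which is fine for a small illustration but would normally be replaced by a rank-based computation on real training data.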