Abstract
SUMMARY: Defining the precise location of structural variations (SVs) at single-nucleotide breakpoint resolution is a challenging problem due to large gaps in alignment. Previously, Alignment with Gap Excision (AGE) enabled us to define breakpoints of SVs at single-nucleotide resolution; however, AGE requires a vast amount of memory when aligning a pair of long sequences. To address this, we developed a memory-efficient implementation, LongAGE, based on the classical Hirschberg algorithm. We demonstrate an application of LongAGE for resolving breakpoints of SVs embedded into segmental duplications on Pacific Biosciences (PacBio) reads that can be longer than 10 kb. Furthermore, we observed different breakpoints for a deletion and a duplication in the same locus, providing direct evidence that such multi-allelic copy number variants (mCNVs) arise from two or more independent ancestral mutations.
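To illustrate why a Hirschberg-style approach saves memory, the sketch below implements the classical Hirschberg divide-and-conquer for plain global alignment. It is not LongAGE and does not perform AGE's gap excision; the function names and the scoring parameters (`match`, `mismatch`, `gap`, with a linear gap penalty) are assumptions chosen for illustration. The key point is that only two rows of the score matrix are held at any time, so memory grows linearly rather than quadratically with sequence length, at the cost of recomputing scores.

```python
def nw_last_row(a, b, match=1, mismatch=-1, gap=-1):
    """Last row of the global-alignment score matrix, keeping only two rows."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    return prev


def hirschberg(a, b, match=1, mismatch=-1, gap=-1):
    """Return one optimal global alignment of a and b as two gapped strings."""
    if len(a) == 0:
        return "-" * len(b), b
    if len(b) == 0:
        return a, "-" * len(a)
    if len(a) == 1:
        # Base case: pair the single character with its best partner in b,
        # or leave it unpaired if two gaps score better than that pairing.
        scores = [match if a == ch else mismatch for ch in b]
        j = max(range(len(b)), key=scores.__getitem__)
        if scores[j] >= 2 * gap:
            return "-" * j + a + "-" * (len(b) - j - 1), b
        return a + "-" * len(b), "-" + b

    # Split a in half and find where the optimal path crosses that row.
    mid = len(a) // 2
    n = len(b)
    left = nw_last_row(a[:mid], b, match, mismatch, gap)
    right = nw_last_row(a[mid:][::-1], b[::-1], match, mismatch, gap)
    split = max(range(n + 1), key=lambda j: left[j] + right[n - j])

    # Solve the two smaller subproblems independently and concatenate.
    left_a, left_b = hirschberg(a[:mid], b[:split], match, mismatch, gap)
    right_a, right_b = hirschberg(a[mid:], b[split:], match, mismatch, gap)
    return left_a + right_a, left_b + right_b


def alignment_score(x, y, match=1, mismatch=-1, gap=-1):
    """Score a gapped alignment column by column."""
    return sum(
        gap if "-" in (p, q) else (match if p == q else mismatch)
        for p, q in zip(x, y)
    )


if __name__ == "__main__":
    x, y = hirschberg("AGTACGCA", "TATGC")
    print(x)
    print(y)
    print(alignment_score(x, y))
```

The recursion performs roughly twice the scoring work of a full-matrix alignment, which is the usual memory-for-time trade-off of this technique.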
Year: 2021 PMID: 32777815 PMCID: PMC8128450 DOI: 10.1093/bioinformatics/btaa703
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Defining breakpoints of an mCNV on chromosome 19 in the Chinese trio from GIAB. (A) Read depth signals, from top to bottom, for the father (HG006), mother (HG007) and son (HG005). (B) Haplotypes with deletion and duplication are passed down from both parents to the son. (C) Haplotypes with tandem duplication and deletion were assembled from haplotype-assigned PacBio reads. The breakpoints of the deletion and the duplication differ.
Memory usage in megabytes and run time in seconds of AGE and LongAGE in controlled experiments on aligning two sequences with various variant lengths
| Tools | 1 kb | 2 kb | 4 kb | 8 kb | 16 kb | 32 kb | 1 Mbp |
|---|---|---|---|---|---|---|---|
| Memory usage (megabytes) | | | | | | | |
| AGE | 550.83 | 600.85 | 700.90 | 901.04 | 1301.21 | 2101.68 | |
| LongAGE | 2.71 | 2.92 | 3.13 | 3.55 | 3.62 | 5.55 | 113.29 |
| Running time (s) | | | | | | | |
| AGE | 5.05 | 5.55 | 6.57 | 8.37 | 12.03 | 19.27 | |
| LongAGE | 18.92 | 20.72 | 22.80 | 23.77 | 32.06 | 50.63 | 1159.61 |
Note: Benchmarks were made on an Intel Xeon(R) Gold 6148 Processor (27.5M Cache, 2.40 GHz) with 192 GB of memory.
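
The trade-off in the table can be made explicit with a little arithmetic. The sketch below copies the reported values for the six variant lengths where both tools have entries (the 1 Mbp column is skipped because no AGE value is given) and computes the memory and run-time ratios; the variable names are illustrative only. On these numbers LongAGE uses a few hundred times less memory while taking a few times longer to run.

```python
sizes = ["1 kb", "2 kb", "4 kb", "8 kb", "16 kb", "32 kb"]

# Values copied from the benchmark table above.
age_mem = [550.83, 600.85, 700.90, 901.04, 1301.21, 2101.68]      # MB
longage_mem = [2.71, 2.92, 3.13, 3.55, 3.62, 5.55]                # MB
age_time = [5.05, 5.55, 6.57, 8.37, 12.03, 19.27]                 # s
longage_time = [18.92, 20.72, 22.80, 23.77, 32.06, 50.63]         # s

# How many times less memory LongAGE needs, and how many times slower it runs.
mem_ratio = [a / l for a, l in zip(age_mem, longage_mem)]
time_ratio = [l / a for a, l in zip(age_time, longage_time)]

for size, m, t in zip(sizes, mem_ratio, time_ratio):
    print(f"{size:>5}: memory {m:6.1f}x lower, run time {t:4.2f}x longer")
```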