Iñigo Prada-Luengo, Anders Krogh, Lasse Maretty, Birgitte Regenberg.
Abstract
BACKGROUND: Circular DNA has recently been identified across different species, including in human normal and cancerous tissue, but short-read mappers are unable to align many of the reads crossing circle junctions, which limits the detection of circular DNA from short-read sequencing data.
Keywords: Extra chromosomal circular DNA; Next generation sequencing; Structural variation; circRNA; ecDNA; eccDNA
Year: 2019 PMID: 31830908 PMCID: PMC6909605 DOI: 10.1186/s12859-019-3160-3
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Circle-Map read realignment strategy. a Reads are mapped to the reference genome, and discordantly aligned reads (green) and alignments containing soft clips (blue) are extracted; concordantly aligned reads (grey) are ignored. b Using the extracted reads, a graph of putative breakpoint connections between genomic regions is constructed and used as a prior to narrow down the genomic search space for realigning soft-clipped reads. c Non-aligned parts of the soft-clipped reads are realigned probabilistically using the breakpoint graph as a guide. d Evidence from split reads and discordant reads is combined to create the final circle calls, together with information about concordant, split-read and discordant read coverage for each circle
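The read extraction in panel a can be illustrated with a small sketch. This is not Circle-Map's implementation: the function name `classify_alignment` is hypothetical, and treating "paired but not flagged as a proper pair" as discordant is a simplifying assumption made here for illustration. Only the flag bits and the `S` CIGAR operation come from the SAM format itself.

```python
import re

# Flag bits as defined by the SAM format.
FLAG_PAIRED = 0x1        # read is part of a pair
FLAG_PROPER_PAIR = 0x2   # aligner considers the pair properly aligned
FLAG_UNMAPPED = 0x4      # read itself is unmapped

# A soft clip appears in the CIGAR string as <length>S.
SOFT_CLIP = re.compile(r"\d+S")


def classify_alignment(flag: int, cigar: str) -> str:
    """Classify one alignment as in Fig. 1a (hypothetical helper).

    Assumption: any paired read whose proper-pair bit is unset is
    called 'discordant'. A real tool may apply stricter criteria.
    """
    if flag & FLAG_UNMAPPED:
        return "unmapped"
    if SOFT_CLIP.search(cigar):
        return "soft_clipped"      # blue reads: candidates for realignment
    if (flag & FLAG_PAIRED) and not (flag & FLAG_PROPER_PAIR):
        return "discordant"        # green reads: support a breakpoint
    return "concordant"            # grey reads: ignored at this stage


# Keep only the reads that carry breakpoint evidence.
records = [(99, "100M"), (97, "100M"), (99, "60M40S")]
candidates = [r for r in records
              if classify_alignment(*r) in ("soft_clipped", "discordant")]
```

In this sketch a soft-clipped read is reported as such even when its pair is also discordant, because the clipped part is what would be realigned in panel c.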
Fig. 2 Evaluation of the circular DNA detection methods. Circle-Map (orange), CIRCexplorer2 (blue), Circle_finder (green) and Circle-Map with no realignment (grey) were evaluated on simulated circular DNA datasets with varying sequencing depths (a-d) and on real circle-enriched data from human muscle (e-h). Sensitivity at a 30X and b 7.5X, measured as the number of called circles found in the simulation set divided by the total number of simulated circles. Precision at c 30X and d 7.5X, measured as the number of correctly called circles divided by the total number of called circles, true and false. e Histogram with the percentage of bases covered by sequencing reads for every circular DNA detected. The number of breakpoint reads (e.g. split and discordant reads) relative to the mean sequencing coverage within the circular DNA coordinates for f Circle-Map, g CIRCexplorer2 and h Circle_finder
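The two metrics defined in this caption can be written out as a minimal sketch. The function name `evaluate_calls` is hypothetical, and matching called circles to simulated circles by exact coordinates is an assumption made here for simplicity; the caption does not state how calls were matched.

```python
from typing import Set, Tuple

Circle = Tuple[str, int, int]  # (chromosome, start, end)


def evaluate_calls(simulated: Set[Circle], called: Set[Circle]):
    """Return (sensitivity, precision) for a set of circle calls.

    Assumption: a call is correct only if its coordinates match a
    simulated circle exactly.
    """
    correct = simulated & called
    # Sensitivity: correctly called circles / all simulated circles.
    sensitivity = len(correct) / len(simulated) if simulated else 0.0
    # Precision: correctly called circles / all called circles.
    precision = len(correct) / len(called) if called else 0.0
    return sensitivity, precision


simulated = {("chr1", 100, 500), ("chr1", 900, 1500),
             ("chr2", 50, 400), ("chr3", 10, 300)}
called = {("chr1", 100, 500), ("chr2", 50, 400), ("chr5", 1, 200)}
sens, prec = evaluate_calls(simulated, called)
```

Here two of the four simulated circles are recovered and one of the three calls is false, so sensitivity is 2/4 and precision is 2/3.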
Fig. 3 Evaluation of the computation time and memory usage of the circular DNA detection methods. The runtimes and maximum memory usage of Circle-Map (orange), CIRCexplorer2 (blue) and Circle_finder (green) were evaluated on simulated (a-b) and real (c-d) circular DNA datasets. a Wall time and b maximum memory usage on the 30X simulated dataset. c Wall time and d maximum memory usage on the real circular DNA enriched dataset from human muscle
Fig. 4 Size distribution of the DNA circles found in the circular DNA enriched muscle dataset. a Size distribution of the circular DNA found by the method described by Møller et al. in a previous study [3]. b Size distribution of the circular DNA found by Circle-Map