Brent S Pedersen, Aaron R Quinlan.
Abstract
Summary: Mosdepth is a new command-line tool for rapidly calculating genome-wide sequencing coverage. It measures depth from BAM or CRAM files either at each nucleotide position in a genome or for sets of genomic regions. Genomic regions may be specified either as a BED file, to evaluate coverage across capture regions, or as fixed-size windows, as required for copy-number calling. Mosdepth uses a simple, computationally efficient algorithm that lets it produce coverage summaries quickly. We demonstrate that mosdepth is faster than existing tools and provides flexibility in the types of coverage profiles produced.
Availability and implementation: mosdepth is available from https://github.com/brentp/mosdepth under the MIT license.
Contact: bpederse@gmail.com.
Supplementary information: Supplementary data are available at Bioinformatics online.
Year: 2018 PMID: 29096012 PMCID: PMC6030888 DOI: 10.1093/bioinformatics/btx699
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Mosdepth coverage calculation algorithm. An array the size of the current chromosome is allocated. As each alignment is read from a position-sorted BAM or CRAM file, the value at each start is incremented and the value at each stop is decremented. As illustrated by the alignment with a deletion (D) CIGAR operation, each alignment may have multiple starts and ends. If the leftmost read (the one seen first) of a paired-end alignment has an end that overlaps the position of its mate (which is given as a field in the BAM record), then it is stored in a hash table until its mate is seen. At that time, the overlap between the mates is calculated, the regions of overlap are decremented and the item is removed from the hash. This prevents double counting coverage from two ends of the same paired-end DNA fragment (black alignment, ‘*’ operation means no coverage increment or decrement is made). Once all reads for a chromosome are consumed, the per-base coverage is simply the cumulative sum of the preceding positions.
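The start/stop bookkeeping described in the caption can be sketched in a few lines of Python. This is an illustration of the technique only, not mosdepth's own code: the function name `per_base_depth`, the half-open 0-based `(start, end)` block representation, and the toy coordinates are all assumptions made for the example, and the mate overlaps are passed in precomputed rather than tracked in a hash table.

```python
from itertools import accumulate


def per_base_depth(chrom_length, blocks, mate_overlaps=()):
    """Per-base depth from half-open, 0-based (start, end) aligned blocks.

    `blocks` holds one entry per aligned segment, so an alignment split by a
    deletion contributes two blocks. `mate_overlaps` holds regions covered by
    both reads of a pair, which should be counted only once.
    """
    # One extra slot so a block ending at chrom_length can still be decremented.
    delta = [0] * (chrom_length + 1)
    for start, end in blocks:
        delta[start] += 1  # coverage begins at each start
        delta[end] -= 1    # coverage stops at each end
    for start, end in mate_overlaps:
        delta[start] -= 1  # undo the double count inside the overlap
        delta[end] += 1
    # The cumulative sum turns start/stop marks into depth at each position.
    return list(accumulate(delta))[:chrom_length]


# Hypothetical toy data on a 10-base "chromosome":
blocks = [
    (1, 4), (6, 8),  # one alignment split by a deletion covering 4..5
    (2, 7),          # an unpaired alignment
    (0, 5), (3, 9),  # two mates of one fragment, overlapping on 3..4
]
depth = per_base_depth(10, blocks, mate_overlaps=[(3, 5)])
print(depth)  # [1, 2, 3, 3, 2, 2, 3, 2, 1, 0]
```

Each alignment costs a constant number of array updates regardless of its length, and a single pass over the array at the end yields the depth at every position.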
Comparison of depth tools for time and memory use on a 30× BAM
| Tool | Threads | Relative time | Time (hh:mm:ss) | Memory (MiB) |
|---|---|---|---|---|
| Mosdepth | 1 | 1 | 25:23 | 1196 |
| Mosdepth | 3 | 0.57 | 14:27 | 1196 |
| Samtools | 1 | 1.98 | 50:12 | 27 |
| Sambamba | 1 | 5.71 | 2:24:53 | 166 |
| BEDtools | 1 | 5.31 | 2:14:44 | 1908 |
Note: Mosdepth and BEDtools use much more memory than samtools and sambamba, but mosdepth is nearly twice as fast as the next-fastest tool, samtools. The Threads column gives the number of threads used for BAM/CRAM decompression.
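The Relative time column can be reproduced from the Time column by dividing each run time by the single-threaded mosdepth time. The short Python sketch below does that arithmetic using the values from the table; the helper name `to_seconds` and the dictionary layout are assumptions made for the illustration.

```python
def to_seconds(clock):
    """Convert 'mm:ss' or 'h:mm:ss' to a number of seconds."""
    seconds = 0
    for part in clock.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


# (tool, threads) -> wall-clock time, copied from the table above.
times = {
    ("Mosdepth", 1): "25:23",
    ("Mosdepth", 3): "14:27",
    ("Samtools", 1): "50:12",
    ("Sambamba", 1): "2:24:53",
    ("BEDtools", 1): "2:14:44",
}

baseline = to_seconds(times[("Mosdepth", 1)])
relative = {
    key: round(to_seconds(value) / baseline, 2) for key, value in times.items()
}

for (tool, threads), ratio in relative.items():
    print(f"{tool} ({threads} thread(s)): {ratio}")
```

The samtools ratio of 1.98 is the basis for the "nearly twice as fast" statement in the note.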