Heng Li1,2, Jiazhen Rong2. 1. Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02215, USA. 2. Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02215, USA.
Abstract
SUMMARY: We present bedtk, a new toolkit for manipulating genomic intervals in the BED format. It supports sorting, merging, intersection, subtraction and the calculation of the breadth of coverage. Bedtk uses an implicit interval tree, a data structure for fast interval overlap queries. It is several to tens of times faster than existing tools and tends to use less memory. AVAILABILITY AND IMPLEMENTATION: The source code is available at https://github.com/lh3/bedtk.
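To make two of the operations named in the summary concrete, the following is a minimal Python sketch of interval merging and breadth of coverage. It is an illustration only and not bedtk's implementation: bedtk relies on an implicit interval tree, whereas this sketch uses a plain sort-and-sweep. The function names `merge_intervals` and `coverage_breadth` are hypothetical, intervals are assumed to be BED-style 0-based half-open `(chrom, start, end)` tuples, and treating adjacent (book-ended) intervals as mergeable is an assumption of this sketch.

```python
from typing import List, Tuple

# BED-style interval: (chromosome, 0-based start, exclusive end).
Interval = Tuple[str, int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent intervals on the same chromosome.

    Hypothetical helper for illustration; uses sort-and-sweep,
    not the implicit interval tree described in the abstract.
    """
    merged: List[Interval] = []
    # Sorting tuples orders by chromosome name, then start, then end.
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps or touches the previous interval: extend it.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def coverage_breadth(intervals: List[Interval]) -> int:
    """Number of bases covered by at least one interval."""
    return sum(end - start for _, start, end in merge_intervals(intervals))


example = [("chr1", 10, 20), ("chr1", 15, 30), ("chr1", 40, 50), ("chr2", 5, 8)]
merged_example = merge_intervals(example)
breadth_example = coverage_breadth(example)
```

In this example the two overlapping chr1 intervals collapse into one, so the breadth counts each covered base once rather than summing raw interval lengths.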