Kent A Riemondy1, Ryan M Sheridan2, Austin Gillen1, Yinni Yu2, Christopher G Bennett3, Jay R Hesselberth1,2.
Abstract
New tools for reproducible exploratory data analysis of large datasets are important to address the rising size and complexity of genomic data. We developed the valr R package to enable flexible and efficient genomic interval analysis. valr leverages new tools available in the "tidyverse", including dplyr. Benchmarks of valr show that it performs similarly to BEDtools and that it can be used for interactive analyses and incorporated into existing analysis pipelines.
Keywords: BEDtools; Genomics; Intervals; R; RStudio; reproducibility
Year: 2017 PMID: 28751969 PMCID: PMC5506536 DOI: 10.12688/f1000research.11997.1
Source DB: PubMed Journal: F1000Res ISSN: 2046-1402
Figure 1. Visualizing interval operations in valr with bed_glyph().
An overview of major functions available in valr.
| Function Name | Purpose |
|---|---|
| read_bed | Read BED files |
| read_bedgraph | Read bedGraph files |
| read_narrowpeak | Read narrowPeak files |
| read_broadpeak | Read broadPeak files |
| bed_slop | Expand interval coordinates |
| bed_shift | Shift interval coordinates |
| bed_flank | Create flanking intervals |
| bed_merge | Merge overlapping intervals |
| bed_cluster | Identify (but not merge) overlapping intervals |
| bed_complement | Create intervals not covered by a query |
| bed_intersect | Report intersecting intervals from x and y tbls |
| bed_map | Apply functions to selected columns for overlapping intervals |
| bed_subtract | Remove intervals based on overlaps |
| bed_window | Find overlapping intervals within a window |
| bed_closest | Find the closest intervals independent of overlaps |
| bed_random | Generate random intervals from an input genome |
| bed_shuffle | Shuffle the coordinates of input intervals |
| bed_fisher, bed_ | Calculate significance of overlaps between two sets of |
| bed_reldist | Quantify relative distances between sets of intervals |
| bed_absdist | Quantify absolute distances between sets of intervals |
| bed_jaccard | Quantify extent of overlap between two sets of intervals |
| bed_glyph | Visualize the actions of valr functions |
| bound_intervals | Constrain intervals to a genome reference |
| bed_makewindows | Subdivide intervals |
| bed12_to_exons | Convert BED12 to BED6 format |
| interval_spacing | Calculate spacing between intervals |
| db_ucsc, db_ensembl | Access remote databases |
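
To make the semantics of two of the operations in the table concrete, the following is a minimal plain-Python sketch of merging and intersecting intervals. It is not valr code and does not reflect how valr is implemented. The names `Interval`, `merge_intervals`, and `intersect_intervals` are hypothetical, coordinates are assumed to be BED-style half-open (`start` inclusive, `end` exclusive), and treating book-ended intervals as mergeable is a choice made for this sketch.

```python
from collections import namedtuple

# Hypothetical record type; BED-style half-open coordinates are assumed.
Interval = namedtuple("Interval", ["chrom", "start", "end"])


def merge_intervals(intervals):
    """Merge overlapping or book-ended intervals on the same chromosome."""
    merged = []
    # Sorting by (chrom, start, end) lets a single pass find all overlaps.
    for iv in sorted(intervals):
        last = merged[-1] if merged else None
        if last and last.chrom == iv.chrom and iv.start <= last.end:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = Interval(last.chrom, last.start, max(last.end, iv.end))
        else:
            merged.append(iv)
    return merged


def intersect_intervals(xs, ys):
    """Return the shared segment of every overlapping (x, y) pair.

    This is a naive O(len(xs) * len(ys)) comparison, kept simple for clarity.
    """
    out = []
    for x in xs:
        for y in ys:
            if x.chrom != y.chrom:
                continue
            start, end = max(x.start, y.start), min(x.end, y.end)
            if start < end:  # half-open intervals overlap only if start < end
                out.append(Interval(x.chrom, start, end))
    return out


x = [Interval("chr1", 1, 50), Interval("chr1", 10, 75), Interval("chr1", 100, 120)]
y = [Interval("chr1", 40, 110)]
merged = merge_intervals(x)
shared = intersect_intervals(x, y)
```

Here `merged` collapses the first two intervals of `x` into one, while `shared` holds one clipped segment per overlapping pair. A production tool would typically use a sweep or interval tree rather than the nested loop shown here.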
Figure 2. Meta-analysis of signals relative to genomic features with valr.
(A) Summarized human H3K4me3 ChIP-seq coverage across positive-strand transcription start sites on chromosome 22. Data are presented +/- SD.
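
A summary of this kind (mean signal with its standard deviation at each position relative to a set of transcription start sites) can be sketched in plain Python as follows. This is an illustration only, not the valr workflow behind the figure. The function name `summarize_signal`, the per-base `signal` dictionary, and the choice of treating missing positions as zero coverage are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_signal(tss_positions, signal, flank):
    """Summarize signal by offset from each transcription start site (TSS).

    tss_positions: iterable of integer TSS coordinates on one chromosome.
    signal: dict mapping coordinate -> coverage; missing positions count as 0.0
            (an assumption of this sketch).
    flank: number of bases to include on each side of the TSS.
    Returns a dict mapping offset -> (mean, sample standard deviation).
    """
    by_offset = defaultdict(list)
    for tss in tss_positions:
        for offset in range(-flank, flank + 1):
            by_offset[offset].append(signal.get(tss + offset, 0.0))
    return {
        offset: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for offset, values in sorted(by_offset.items())
    }


# Toy data: two start sites and a sparse coverage track.
summary = summarize_signal([10, 20], {10: 4.0, 11: 1.0, 20: 2.0}, flank=1)
```

Each entry of `summary` pairs an offset with the mean and SD across all start sites, which is the shape of data needed to draw a mean line with an SD band.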
Figure 3. Performance of valr functions.
(A) Timings were calculated by performing 10 repetitions of the indicated functions on data frames preloaded in R, each containing 1 million random 1-kilobase x/y intervals generated using bed_random(). (B) Timings for executing functions in BEDtools v2.25.0 or the equivalent functions in valr, using the same interval sets as in (A) written to files. All BEDtools function outputs were written to /dev/null and were timed using GNU time. Timings for valr functions in (B) include the time needed to read files using read_bed() functions and were measured using the microbenchmark package.
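
The timing procedure described in (A) follows a common pattern: generate random fixed-width intervals once, then time repeated calls of a function on the preloaded data. The sketch below shows that pattern in plain Python with the standard `timeit` module. It does not use valr, bed_random(), or microbenchmark, and the helper names `random_intervals` and `time_function`, the toy genome, and the reduced interval count are assumptions chosen to keep the example fast.

```python
import random
import timeit


def random_intervals(n, chrom_sizes, length=1000, seed=0):
    """Generate n random fixed-length (chrom, start, end) tuples.

    chrom_sizes: dict mapping chromosome name -> size in bases.
    Intervals are half-open and always fit inside their chromosome.
    """
    rng = random.Random(seed)  # seeded so the benchmark input is reproducible
    chroms = list(chrom_sizes)
    out = []
    for _ in range(n):
        chrom = rng.choice(chroms)
        start = rng.randrange(0, chrom_sizes[chrom] - length + 1)
        out.append((chrom, start, start + length))
    return out


def time_function(func, *args, repetitions=10):
    """Return one wall-clock timing (in seconds) per repetition of func(*args)."""
    return timeit.repeat(lambda: func(*args), number=1, repeat=repetitions)


# Toy genome and a small n; the figure used 1 million intervals.
genome = {"chr1": 10_000_000, "chr2": 5_000_000}
intervals = random_intervals(1000, genome)
timings = time_function(sorted, intervals, repetitions=10)
```

Generating the input outside the timed call mirrors the "preloaded" setup in (A); including file reading inside the timed call would instead mirror the setup in (B).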