
Genomic region operation kit for flexible processing of deep sequencing data.

Kristian Ovaska, Lauri Lyly, Biswajyoti Sahu, Olli A Jänne, Sampsa Hautaniemi.

Abstract

Computational analysis of data produced in deep sequencing (DS) experiments is challenging due to large data volumes and requirements for flexible analysis approaches. Here, we present a mathematical formalism based on set algebra for frequently performed operations in DS data analysis to facilitate translation of biomedical research questions to language amenable for computational analysis. With the help of this formalism, we implemented the Genomic Region Operation Kit (GROK), which supports various DS-related operations such as preprocessing, filtering, file conversion, and sample comparison. GROK provides high-level interfaces for R, Python, Lua, and command line, as well as an extension C++ API. It supports major genomic file formats and allows storing custom genomic regions in efficient data structures such as red-black trees and SQL databases. To demonstrate the utility of GROK, we have characterized the roles of two major transcription factors (TFs) in prostate cancer using data from 10 DS experiments. GROK is freely available with a user guide from http://csbi.ltdk.helsinki.fi/grok/.
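The set-algebra view of genomic regions described in the abstract can be illustrated with a minimal sketch in plain Python. This is not GROK's API: the function names (normalize, union, intersection, difference), the (chromosome, start, end) tuple layout, and the half-open coordinate convention are all assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical region type: (chromosome, start, end), half-open [start, end).
Region = Tuple[str, int, int]


def normalize(regions: List[Region]) -> List[Region]:
    """Sort regions and merge overlapping or adjacent ones per chromosome."""
    merged: List[Region] = []
    for chrom, start, end in sorted(regions):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def union(a: List[Region], b: List[Region]) -> List[Region]:
    """Positions covered by a or b."""
    return normalize(list(a) + list(b))


def intersection(a: List[Region], b: List[Region]) -> List[Region]:
    """Positions covered by both a and b (linear sweep over sorted lists)."""
    a, b = normalize(a), normalize(b)
    out: List[Region] = []
    i = j = 0
    while i < len(a) and j < len(b):
        ca, sa, ea = a[i]
        cb, sb, eb = b[j]
        if ca != cb:
            # Advance whichever list is on the earlier chromosome.
            if ca < cb:
                i += 1
            else:
                j += 1
            continue
        start, end = max(sa, sb), min(ea, eb)
        if start < end:
            out.append((ca, start, end))
        # Advance the region that ends first.
        if ea < eb:
            i += 1
        else:
            j += 1
    return out


def difference(a: List[Region], b: List[Region]) -> List[Region]:
    """Positions covered by a but not by b (simple quadratic scan)."""
    a, b = normalize(a), normalize(b)
    out: List[Region] = []
    for chrom, start, end in a:
        cursor = start
        for cb, sb, eb in b:
            if cb != chrom or eb <= cursor or sb >= end:
                continue
            if sb > cursor:
                out.append((chrom, cursor, sb))
            cursor = max(cursor, eb)
        if cursor < end:
            out.append((chrom, cursor, end))
    return out


# Made-up example regions, e.g. peak calls from two samples.
peaks_a = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 50, 80)]
peaks_b = [("chr1", 250, 400), ("chr2", 60, 70)]

shared = intersection(peaks_a, peaks_b)
only_a = difference(peaks_a, peaks_b)
either = union(peaks_a, peaks_b)
```

In this sketch, a question such as "which regions are bound in sample A but not in sample B" becomes a single difference call. A production tool would use indexed structures (the abstract mentions red-black trees and SQL databases) rather than sorted Python lists.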

Year:  2013        PMID: 23702556     DOI: 10.1109/TCBB.2012.170

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  4 in total

1.  Tissue-specific pioneer factors associate with androgen receptor cistromes and transcription programs.

Authors:  Päivi Pihlajamaa; Biswajyoti Sahu; Lauri Lyly; Viljami Aittomäki; Sampsa Hautaniemi; Olli A Jänne
Journal:  EMBO J       Date:  2014-01-22       Impact factor: 11.598

2.  Integrated Bio-Search: challenges and trends for the integration, search and comprehensive processing of biological information. (Review)

Authors:  Marco Masseroli; Barend Mons; Erik Bongcam-Rudloff; Stefano Ceri; Alexander Kel; François Rechenmann; Frederique Lisacek; Paolo Romano
Journal:  BMC Bioinformatics       Date:  2014-01-10       Impact factor: 3.169

3.  START: a system for flexible analysis of hundreds of genomic signal tracks in few lines of SQL-like queries.

Authors:  Xinjie Zhu; Qiang Zhang; Eric Dun Ho; Ken Hung-On Yu; Chris Liu; Tim H Huang; Alfred Sze-Lok Cheng; Ben Kao; Eric Lo; Kevin Y Yip
Journal:  BMC Genomics       Date:  2017-09-22       Impact factor: 3.969

4.  JOA: Joint Overlap Analysis of multiple genomic interval sets.

Authors:  Burçak Otlu; Tolga Can
Journal:  BMC Bioinformatics       Date:  2019-03-08       Impact factor: 3.169

