SeQuiLa-cov: A fast and scalable library for depth of coverage calculations.

Marek Wiewiórka, Agnieszka Szmurło, Wiktor Kuśmirek, Tomasz Gambin.

Abstract

BACKGROUND: Depth of coverage calculation is an important and computationally intensive preprocessing step in a variety of next-generation sequencing pipelines, including the analysis of RNA-sequencing data, detection of copy number variants, or quality control procedures.
RESULTS: Building upon big data technologies, we have developed SeQuiLa-cov, an extension to the recently released SeQuiLa platform that provides efficient depth of coverage calculations, reaching >100× speedup over state-of-the-art tools. The performance and scalability of our solution allow exome-wide and genome-wide calculations to run locally or on a cluster, while a Structured Query Language (SQL) application programming interface hides the complexity of distributed computing.
CONCLUSIONS: SeQuiLa-cov provides a significant performance gain in depth of coverage calculations, streamlining widely used bioinformatics processing pipelines.
© The Author(s) 2019. Published by Oxford University Press.
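
For readers unfamiliar with the term, the sketch below illustrates what a depth of coverage calculation produces. It is a toy, single-machine illustration and not the SeQuiLa-cov implementation: the function names `depth_of_coverage` and `to_blocks` are hypothetical, reads are assumed to be 0-based half-open `(start, end)` intervals lying within one contig, and alignment details such as CIGAR operations and quality filters are ignored.

```python
from itertools import groupby


def depth_of_coverage(reads, contig_length):
    """Per-base depth for reads given as 0-based half-open (start, end) intervals.

    Assumes 0 <= start < end <= contig_length for every read.
    """
    # Difference array: +1 where a read starts, -1 just past where it ends.
    delta = [0] * (contig_length + 1)
    for start, end in reads:
        delta[start] += 1
        delta[end] -= 1
    # A running sum over the events gives the depth at each position.
    depth, running = [], 0
    for change in delta[:contig_length]:
        running += change
        depth.append(running)
    return depth


def to_blocks(depth):
    """Collapse per-base depth into (start, end, depth) runs of equal coverage."""
    blocks, pos = [], 0
    for value, run in groupby(depth):
        length = len(list(run))
        blocks.append((pos, pos + length, value))
        pos += length
    return blocks


# Three overlapping toy reads on a contig of length 10.
reads = [(0, 5), (2, 7), (4, 9)]
depth = depth_of_coverage(reads, 10)
blocks = to_blocks(depth)
print(depth)
print(blocks)
```

The per-base list corresponds to base-level output, and the collapsed runs correspond to a block-style summary in which adjacent positions with equal depth are merged.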

Keywords:  CNV-calling; NGS data analysis; RNA-seq; SQL; big data; depth of coverage; distributed computing; quality control for sequencing data

Year: 2019; PMID: 31378808; PMCID: PMC6680061; DOI: 10.1093/gigascience/giz094

Source DB: PubMed; Journal: GigaScience; ISSN: 2047-217X; Impact factor: 6.524


