MarDRe: efficient MapReduce-based removal of duplicate DNA reads in the cloud.

Roberto R Expósito, Jorge Veiga, Jorge González-Domínguez, Juan Touriño.

Abstract

SUMMARY: This article presents MarDRe, a de novo cloud-ready duplicate and near-duplicate removal tool that can process single- and paired-end reads from FASTQ/FASTA datasets. MarDRe takes advantage of the widely adopted MapReduce programming model to fully exploit Big Data technologies on cloud-based infrastructures. Written in Java to maximize cross-platform compatibility, MarDRe is built upon the open-source Apache Hadoop project, the most popular distributed computing framework for scalable Big Data processing. On a 16-node cluster deployed on the Amazon EC2 cloud platform, MarDRe is up to 8.52 times faster than a representative state-of-the-art tool.
AVAILABILITY AND IMPLEMENTATION: Source code in Java and Hadoop as well as a user's guide are freely available under the GNU GPLv3 license at http://mardre.des.udc.es.
CONTACT: rreye@udc.es.
© The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
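
The abstract does not describe how MarDRe groups or compares reads, so the following is only a minimal sketch of the general MapReduce idea applied to exact duplicate removal, not MarDRe's implementation. It runs in memory in plain Java without Hadoop. The class name `ReadDedupSketch`, its methods, and the prefix-based grouping key are all hypothetical and chosen for illustration. The "map" step keys each read by a short prefix so that identical reads always fall into the same group, and the "reduce" step keeps one copy of each distinct read per group.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: an in-memory map/reduce-style pass
// that removes exact duplicate reads. Not taken from MarDRe.
class ReadDedupSketch {

    // "Map" phase: key each read by its first prefixLen bases, so that
    // identical reads are guaranteed to land in the same group.
    static Map<String, List<String>> mapByPrefix(List<String> reads, int prefixLen) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String read : reads) {
            String key = read.substring(0, Math.min(prefixLen, read.length()));
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(read);
        }
        return groups;
    }

    // "Reduce" phase: within one group, keep the first copy of each
    // distinct read and drop later exact copies.
    static List<String> reduceGroup(List<String> group) {
        Set<String> seen = new LinkedHashSet<>(group);
        return new ArrayList<>(seen);
    }

    // Runs both phases and concatenates the per-group results.
    static List<String> dedup(List<String> reads, int prefixLen) {
        List<String> out = new ArrayList<>();
        for (List<String> group : mapByPrefix(reads, prefixLen).values()) {
            out.addAll(reduceGroup(group));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> reads = List.of("ACGTAC", "ACGTAC", "ACGTTT", "GGGTAC", "GGGTAC");
        System.out.println(dedup(reads, 3)); // prints [ACGTAC, ACGTTT, GGGTAC]
    }
}
```

Because groups are independent of one another, each could be handed to a separate reducer in a distributed setting. Handling near-duplicates would mean replacing the exact set membership test in `reduceGroup` with a comparison that tolerates a bounded number of mismatches, which this sketch does not attempt.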

Year: 2017    PMID: 28475668    DOI: 10.1093/bioinformatics/btx307

Source DB: PubMed    Journal: Bioinformatics    ISSN: 1367-4803    Impact factor: 6.937


  4 in total

1.  Fast-HBR: Fast hash based duplicate read remover.

Authors:  Sami Altayyar; Abdel Monim Artoli
Journal:  Bioinformation       Date:  2022-01-31

2.  HSRA: Hadoop-based spliced read aligner for RNA sequencing data.

Authors:  Roberto R Expósito; Jorge González-Domínguez; Juan Touriño
Journal:  PLoS One       Date:  2018-07-31       Impact factor: 3.240

3.  GPrimer: a fast GPU-based pipeline for primer design for qPCR experiments.

Authors:  Jeongmin Bae; Hajin Jeon; Min-Soo Kim
Journal:  BMC Bioinformatics       Date:  2021-04-29       Impact factor: 3.169

4.  BigFiRSt: A Software Program Using Big Data Technique for Mining Simple Sequence Repeats From Large-Scale Sequencing Data.

Authors:  Jinxiang Chen; Fuyi Li; Miao Wang; Junlong Li; Tatiana T Marquez-Lago; André Leier; Jerico Revote; Shuqin Li; Quanzhong Liu; Jiangning Song
Journal:  Front Big Data       Date:  2022-01-18