Minirmd: accurate and fast duplicate removal tool for short reads via multiple minimizers.

Yuansheng Liu¹, Xiaocai Zhang², Quan Zou³, Xiangxiang Zeng¹.

Abstract

SUMMARY: Removing duplicate and near-duplicate reads generated by high-throughput sequencing technologies can reduce the computational resources required by downstream applications. Here we develop minirmd, a de novo tool that removes duplicate reads via multiple rounds of clustering using minimizers of different lengths. Experiments demonstrate that minirmd removes more near-duplicate reads than existing clustering approaches and is faster than existing multi-core tools. To the best of our knowledge, minirmd is the first tool to remove near-duplicates on the reverse-complementary strand.
AVAILABILITY AND IMPLEMENTATION: https://github.com/yuansliu/minirmd
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
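
The idea summarized in the abstract (several clustering rounds, each keyed on a minimizer of a different length, with reverse-complement awareness) can be illustrated with a small Python sketch. This is not minirmd's implementation. The function names, the minimizer lengths (8, 6), the Hamming-distance threshold and the greedy choice of one representative per cluster are all assumptions made for illustration.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse-complementary strand of a DNA read."""
    return seq.translate(_COMPLEMENT)[::-1]


def minimizer(seq, k):
    """Lexicographically smallest k-mer of seq (the whole read if shorter than k)."""
    return min(seq[i:i + k] for i in range(max(1, len(seq) - k + 1)))


def canonical_minimizer(seq, k):
    """Smallest k-mer over both strands, so a read and its reverse
    complement always fall into the same bucket."""
    return min(minimizer(seq, k), minimizer(reverse_complement(seq), k))


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def is_near_duplicate(a, b, max_mismatches):
    """True if a matches b, on either strand, within max_mismatches (assumed criterion)."""
    if len(a) != len(b):
        return False
    return (hamming(a, b) <= max_mismatches
            or hamming(reverse_complement(a), b) <= max_mismatches)


def remove_duplicates(reads, ks=(8, 6), max_mismatches=1):
    """Run one clustering round per minimizer length in ks (hypothetical defaults).

    Each round buckets the surviving reads by canonical minimizer and keeps,
    within each bucket, only reads that are not near-duplicates of an
    already kept representative. Input order is preserved.
    """
    kept = list(range(len(reads)))
    for k in ks:
        buckets = defaultdict(list)
        for idx in kept:
            buckets[canonical_minimizer(reads[idx], k)].append(idx)
        survivors = []
        for members in buckets.values():
            representatives = []
            for idx in members:
                if not any(is_near_duplicate(reads[idx], reads[rep], max_mismatches)
                           for rep in representatives):
                    representatives.append(idx)
            survivors.extend(representatives)
        kept = sorted(survivors)
    return [reads[i] for i in kept]


original = "ACGTTGCAAGGCTTAC"
reads = [
    original,
    original,                      # exact duplicate
    "ACGTTGCAAGGCTTAG",            # one mismatch at the last base
    reverse_complement(original),  # duplicate on the opposite strand
    "GGGGTTTTCCCCAAAA",            # unrelated read
]
unique_reads = remove_duplicates(reads)
```

In this sketch a mismatch that falls inside the minimizer itself can send two near-duplicates to different buckets in one round. Repeating the clustering with a different minimizer length gives such pairs another chance to meet, which is one plausible reading of why multiple rounds help.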

Year: 2021 PMID: 33112385 DOI: 10.1093/bioinformatics/btaa915

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

Cited

5 in total

1. Fast-HBR: Fast hash based duplicate read remover.

2. Identify DNA-Binding Proteins Through the Extreme Gradient Boosting Algorithm. (Review)

3. Research on the Computational Prediction of Essential Genes. (Review)

4. SparkGC: Spark based genome compression for large collections of genomes.

5. Hamming-shifting graph of genomic short reads: Efficient construction and its application for compression.