Kun Sun, Shenzhen Bay Laboratory, Shenzhen 518055, China.
Abstract
MOTIVATION: Next-generation sequencing (NGS) data frequently suffer from poor-quality cycles and adapter contamination, and therefore need to be preprocessed before downstream analyses. With the ever-growing throughput and read length of modern sequencers, the preprocessing step has become a bottleneck in data analysis because current tools are not fast enough. Fast and accurate adapter- and quality-trimming tools for sequencing data preprocessing are therefore still urgently needed. RESULTS: Ktrim was developed in this work. Key features of Ktrim include: built-in support for adapters of common library preparation kits; support for user-supplied, customized adapter sequences; support for both paired-end and single-end data; and support for parallelization to accelerate the analysis. Ktrim was ∼2-18 times faster than current tools and also showed high accuracy when applied to the test datasets. Ktrim could thus serve as a valuable and efficient tool for short-read NGS data preprocessing. AVAILABILITY AND IMPLEMENTATION: Source code and scripts to reproduce the results described in this article are freely available at https://github.com/hellosunking/Ktrim/, distributed under the GPL v3 license. CONTACT: sunkun@szbl.ac.cn. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.