Kang Hu¹, Neng Huang¹, You Zou¹, Xingyu Liao¹, Jianxin Wang¹
¹ Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, 410083, China.
Abstract
MOTIVATION: Compared with second-generation sequencing technologies, third-generation sequencing technologies yield longer reads (average ∼10 kbp, maximum 900 kbp) but have a higher error rate (∼15%). Nanopolish is a variant and methylation detection tool based on a hidden Markov model (HMM) that performs signal-level analysis of Oxford Nanopore sequencing data. Nanopolish can greatly improve assembly accuracy, but it is limited by long running times because most of its execution is serial and computationally expensive.
RESULTS: We present Multithreading Nanopolish (MultiNanopolish), a polishing tool that decomposes the iterative calculation in Nanopolish into small independent tasks so that the process can run in parallel. Experimental results show that, with 40 threads, MultiNanopolish reduces running time relative to the original Nanopolish by 50% with a read-uncorrected assembler (Miniasm) and by 20% with read-corrected assemblers (Canu and Flye).
AVAILABILITY: MultiNanopolish is available on GitHub: https://github.com/BioinformaticsCSU/MultiNanopolish.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
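The decomposition described in RESULTS can be illustrated with a minimal sketch. This is not the MultiNanopolish implementation: the fixed-size windows, the function names (`split_into_windows`, `polish_window`, `polish_parallel`) and the placeholder polishing step are all assumptions made for illustration. It also assumes that windows can be processed independently, which a real polisher may only achieve with extra care at window boundaries. The sketch shows only the pattern of splitting a draft sequence into small tasks, running them on a thread pool, and rejoining the results in order. In CPython, threads would not speed up a CPU-bound pure-Python step, so the example demonstrates structure rather than performance.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_windows(length, window_size):
    """Return half-open (start, end) intervals that cover [0, length)."""
    return [(start, min(start + window_size, length))
            for start in range(0, length, window_size)]


def polish_window(draft, start, end):
    """Hypothetical stand-in for the per-window polishing step.

    A real polisher would score candidate edits against the signal data
    here. This placeholder only upper-cases the window so that the result
    is easy to check.
    """
    return draft[start:end].upper()


def polish_parallel(draft, window_size=4, threads=4):
    """Polish each window as an independent task and rejoin them in order."""
    windows = split_into_windows(len(draft), window_size)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() yields results in submission order, so the pieces can be
        # concatenated directly even if the tasks finish out of order.
        pieces = pool.map(lambda w: polish_window(draft, *w), windows)
        return "".join(pieces)
```

For example, `polish_parallel("acgtacgtac", window_size=4, threads=3)` splits the draft into the windows (0, 4), (4, 8) and (8, 10) and returns `"ACGTACGTAC"`.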