Haisi Yi (1), Zhe Li (2), Tao Li (3), Jindong Zhao (4)
1. Key Laboratory of Algal Biology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei 430072, China; University of Chinese Academy of Sciences, Beijing 100049, China.
2. State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China.
3. Key Laboratory of Algal Biology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei 430072, China.
4. Key Laboratory of Algal Biology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei 430072, China; College of Life Science, Peking University, Beijing 100871, China.
Abstract
Demultiplexing is performed after high-throughput sequencing to assign reads in silico to their samples of origin, based on the sequenced reads of the indices. Existing demultiplexing tools rely on the similarity between the read index and the reference index sequences, and may fail to provide satisfactory results on low-quality datasets. We developed Bayexer, a Bayesian demultiplexing algorithm for Illumina sequencers. Bayexer uses information extracted directly from the contaminant sequences of the target reads as the training dataset for a naïve Bayes classifier, which then assigns the reads. According to our evaluation on various real datasets, Bayexer provides higher capability, accuracy and speed than other tools. AVAILABILITY AND IMPLEMENTATION: Bayexer is implemented in Perl and freely available at https://github.com/HaisiYi/Bayexer.
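To make the classification idea concrete, the following is a minimal Python sketch of a position-wise naïve Bayes classifier over index reads. It is not Bayexer's implementation (which is written in Perl) and does not reproduce its training-set extraction: it assumes that labeled pairs of (sample, observed index read) have already been obtained, and the names `train_index_model`, `classify_index_read`, the pseudocount and the `min_posterior` threshold are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict

ALPHABET = "ACGTN"  # assumed base alphabet, including the no-call symbol N


def train_index_model(labeled_reads, index_length, pseudocount=1.0):
    """Estimate per-sample priors and per-position base likelihoods.

    labeled_reads: iterable of (sample, observed_index_read) pairs
    (assumed to be available; how they are extracted is out of scope here).
    """
    counts = defaultdict(
        lambda: [dict.fromkeys(ALPHABET, 0) for _ in range(index_length)]
    )
    totals = defaultdict(int)
    for sample, read in labeled_reads:
        totals[sample] += 1
        for pos, base in enumerate(read[:index_length]):
            if base in ALPHABET:
                counts[sample][pos][base] += 1

    n_reads = sum(totals.values())
    model = {}
    for sample, total in totals.items():
        log_likelihoods = []
        for pos in range(index_length):
            column = counts[sample][pos]
            # Laplace smoothing so unseen bases keep a non-zero probability.
            denom = sum(column.values()) + pseudocount * len(ALPHABET)
            log_likelihoods.append(
                {b: math.log((column[b] + pseudocount) / denom) for b in ALPHABET}
            )
        model[sample] = (math.log(total / n_reads), log_likelihoods)
    return model


def classify_index_read(model, read, min_posterior=0.9):
    """Return the most probable sample, or None if the posterior is too low."""
    scores = {}
    for sample, (log_prior, log_likelihoods) in model.items():
        score = log_prior
        for pos, base in enumerate(read[: len(log_likelihoods)]):
            # Bases outside the alphabet are treated as N.
            score += log_likelihoods[pos][base if base in ALPHABET else "N"]
        scores[sample] = score

    # Normalise in log space (log-sum-exp) to obtain posterior probabilities.
    top = max(scores.values())
    log_norm = top + math.log(sum(math.exp(s - top) for s in scores.values()))
    best = max(scores, key=scores.get)
    posterior = math.exp(scores[best] - log_norm)
    return best if posterior >= min_posterior else None
```

In this sketch, a read whose best posterior falls below `min_posterior` is left unassigned rather than forced into a sample. Because the likelihoods are learned per position from observed index reads, systematic sequencing errors at a given position can still support the correct sample, which a fixed mismatch-count cutoff cannot capture.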