conLSH: Context based Locality Sensitive Hashing for mapping of noisy SMRT reads.

Angana Chakraborty, Sanghamitra Bandyopadhyay.

Abstract

Single Molecule Real-Time (SMRT) sequencing is a recent advance in next-generation sequencing technology developed by Pacific Biosciences (PacBio). It has produced an explosion of long, noisy reads that demand cutting-edge research to get the most out of them. To deal with the high error probability of SMRT data, this article proposes a novel contextual Locality Sensitive Hashing (conLSH) based algorithm that can effectively align noisy SMRT reads to the reference genome. Here, sequences are hashed together based not only on their closeness but also on the similarity of their context. The algorithm has a space requirement of $O(n^{\rho+1})$, where $n$ is the number of sequences in the corpus and $\rho$ is a constant. The indexing time and querying time are bounded by $O\left(n^{\rho+1} \cdot \frac{\ln n}{\ln(1/P_2)}\right)$ and $O(n^{\rho})$ respectively, where $P_2 > 0$ is a probability value. The algorithm is particularly useful for retrieving similar sequences, a widely used task in biology. The proposed conLSH-based aligner is compared with rHAT, a popular aligner for SMRT reads, and is found to outperform it in both speed and memory requirements. In particular, it takes approximately 24.2% less processing time while saving about 70.3% in peak memory requirement for the H. sapiens PacBio dataset.
Copyright © 2020 Elsevier Ltd. All rights reserved.
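
The idea of hashing on context as well as closeness can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the hash family, so the sketch assumes a simple bit-sampling style scheme in which each hash function samples a few start positions and takes a short run of consecutive symbols (the "context") at each one. The class name `ContextLSHIndex` and the parameters `num_tables`, `num_positions`, and `context` are hypothetical names chosen for illustration.

```python
import random
from collections import defaultdict


def make_hash(seq_len, num_positions, context, rng):
    """Build one hash function (assumed scheme, not taken from the paper).

    It samples `num_positions` start positions and concatenates the
    `context` consecutive symbols found at each of them.
    """
    starts = sorted(rng.sample(range(seq_len - context + 1), num_positions))

    def h(seq):
        return "".join(seq[s:s + context] for s in starts)

    return h


class ContextLSHIndex:
    """Toy index over fixed-length sequences using several hash tables."""

    def __init__(self, seq_len, num_tables=8, num_positions=3, context=2, seed=0):
        rng = random.Random(seed)
        self.seq_len = seq_len
        self.hashes = [
            make_hash(seq_len, num_positions, context, rng)
            for _ in range(num_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(num_tables)]

    def add(self, name, seq):
        """Insert a sequence into every table under its hash key."""
        if len(seq) != self.seq_len:
            raise ValueError("sequence length must equal seq_len")
        for h, table in zip(self.hashes, self.tables):
            table[h(seq)].append(name)

    def query(self, seq):
        """Return names of sequences sharing a bucket with `seq` in any table."""
        hits = set()
        for h, table in zip(self.hashes, self.tables):
            hits.update(table.get(h(seq), ()))
        return hits


index = ContextLSHIndex(seq_len=12, num_tables=4, seed=1)
index.add("read_a", "ACGTACGTACGT")
index.add("read_c", "CCCCCCCCCCCC")
candidates = index.query("ACGTACGTACGT")
```

In this sketch, a larger `context` makes a collision require agreement over longer runs of neighbouring symbols, while more tables raise the chance that two similar sequences share at least one bucket. How the published method sets these trade-offs is not stated in the abstract.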

Keywords:  Algorithm; Locality Sensitive Hashing; PacBio dataset; Sequence alignment; Sequence analysis; Single Molecule Real-Time (SMRT) sequencing

Year:  2020        PMID: 32000034     DOI: 10.1016/j.compbiolchem.2020.107206

Source DB:  PubMed          Journal:  Comput Biol Chem        ISSN: 1476-9271            Impact factor:   2.877


  2 in total

1.  kngMap: Sensitive and Fast Mapping Algorithm for Noisy Long Reads Based on the K-Mer Neighborhood Graph.

Authors:  Ze-Gang Wei; Xing-Guo Fan; Hao Zhang; Xiao-Dan Zhang; Fei Liu; Yu Qian; Shao-Wu Zhang
Journal:  Front Genet       Date:  2022-05-05       Impact factor: 4.772

Review 2.  Technology dictates algorithms: recent developments in read alignment.

Authors:  Mohammed Alser; Jeremy Rotman; Onur Mutlu; Serghei Mangul; Dhrithi Deshpande; Kodi Taraszka; Huwenbo Shi; Pelin Icer Baykal; Harry Taegyun Yang; Victor Xue; Sergey Knyazev; Benjamin D Singer; Brunilda Balliu; David Koslicki; Pavel Skums; Alex Zelikovsky; Can Alkan
Journal:  Genome Biol       Date:  2021-08-26       Impact factor: 13.583
