On the accuracy of short read mapping.

Peter Menzel, Jes Frellsen, Mireya Plass, Simon H Rasmussen, Anders Krogh.

Abstract

The development of high-throughput sequencing technologies has revolutionized the way we study genomes and gene regulation. A single experiment produces millions of reads. To gain knowledge from these experiments, the first step is to find the genomic origin of the reads, i.e., to map the reads to a reference genome. Conventional alignment tools are obsolete in this setting, as they cannot handle this amount of data in a reasonable amount of time. New mapping algorithms have therefore been developed, which are fast at the expense of a small decrease in accuracy. In this chapter we discuss the current problems in short read mapping and show that mapping reads correctly is a nontrivial task. Through simple experiments with both real and synthetic data, we demonstrate that different mappers can give different results depending on the type of data, and that a considerable fraction of uniquely mapped reads is potentially mapped to an incorrect location. Furthermore, we provide simple statistical results on the expected number of random matches in a genome (E-value) and the probability of a random match as a function of read length. We also show that quality scores contain valuable information for mapping and explain why mapping quality should be evaluated in a probabilistic manner. Finally, we discuss the potential of improving the performance of current methods by considering these quality scores in a probabilistic mapping program.
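
The abstract does not state the formulas behind the E-value or the random-match probability, so the following is only a minimal sketch under a textbook assumption that is not taken from the chapter: bases are independent and uniformly distributed, matches are exact, and both strands are searched. Under that model a read of length L has roughly E = 2(G - L + 1)(1/4)^L expected random matches in a genome of size G, and the chance of at least one is approximately 1 - exp(-E). The function names are hypothetical. The last helper uses the standard Phred definition of a base quality score, p = 10^(-Q/10), which is the kind of per-base information a probabilistic mapper could exploit.

```python
import math


def expected_random_matches(read_length, genome_size, both_strands=True):
    """E-value: expected number of exact random matches of one read.

    Assumes i.i.d. uniform bases (an illustrative model, not the chapter's).
    """
    positions = max(genome_size - read_length + 1, 0)  # possible start sites
    strands = 2 if both_strands else 1
    return strands * positions * 0.25 ** read_length


def prob_random_match(read_length, genome_size, both_strands=True):
    """Poisson approximation to P(at least one random match) = 1 - exp(-E)."""
    e_value = expected_random_matches(read_length, genome_size, both_strands)
    return -math.expm1(-e_value)  # numerically stable for very small E


def phred_to_error_prob(quality):
    """Standard Phred definition: probability that a base call is wrong."""
    return 10 ** (-quality / 10)


if __name__ == "__main__":
    genome_size = 3_000_000_000  # roughly human-sized; illustrative only
    for length in (15, 20, 25, 30):
        e_value = expected_random_matches(length, genome_size)
        p_match = prob_random_match(length, genome_size)
        print(f"L={length}: E={e_value:.3g}, P(random match)={p_match:.3g}")
```

Under these assumptions the expected number of random matches falls by a factor of four for each additional base, which is why short reads are far more likely to match a large genome by chance than long ones. Real genomes are repetitive and non-uniform, so these numbers are at best a lower bound on ambiguity.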

Year:  2013        PMID: 23872968     DOI: 10.1007/978-1-62703-514-9_3

Source DB:  PubMed          Journal:  Methods Mol Biol        ISSN: 1064-3745


