
Correction of sequencing errors in a mixed set of reads.

Leena Salmela

Abstract

MOTIVATION: High-throughput sequencing technologies produce large sets of short reads that may contain errors. These sequencing errors make de novo assembly challenging. Error correction aims to reduce the error rate prior to assembly. Many de novo sequencing projects use reads from several sequencing technologies to combine the benefits of each technology and to alleviate their individual shortcomings. However, combining such a mixed set of reads is problematic, as many tools are specific to one sequencing platform. The SOLiD sequencing platform is especially problematic in this regard because of the two-base color coding of its reads. Therefore, new tools for working with mixed read sets are needed.
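The two-base color coding mentioned above can be illustrated with a minimal sketch. It assumes the standard SOLiD encoding, in which each color is the XOR of the 2-bit codes of two adjacent bases (A=0, C=1, G=2, T=3) and the read starts with a known first base. The function names `encode_colors` and `decode_colors` are illustrative and are not taken from the tool described in this abstract.

```python
# Sketch of standard SOLiD two-base (color space) encoding.
# Assumption: reads are non-empty and contain only A, C, G, T.
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def encode_colors(bases: str) -> str:
    """Return the first base followed by one color digit per adjacent base pair."""
    colors = [
        str(BASE_TO_BITS[a] ^ BASE_TO_BITS[b])
        for a, b in zip(bases, bases[1:])
    ]
    return bases[0] + "".join(colors)


def decode_colors(read: str) -> str:
    """Decode a color space read (first base + color digits) into base space."""
    current = BASE_TO_BITS[read[0]]
    out = [read[0]]
    for color in read[1:]:
        # Each color tells how to move from the previous base to the next one.
        current ^= int(color)
        out.append(BITS_TO_BASE[current])
    return "".join(out)


print(encode_colors("ATGGC"))  # A3103
print(decode_colors("A3103"))  # ATGGC
# One wrong color (1 -> 2) changes every base decoded after it:
print(decode_colors("A3203"))  # ATCCG
```

Because every decoded base depends on all preceding colors, a single color error corrupts the rest of a naively decoded read, which is one reason base space tools cannot be applied to color space reads directly.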
RESULTS: We present an error correction tool for correcting substitutions, insertions and deletions in a mixed set of reads produced by various sequencing platforms. We first develop a method for correcting reads from any sequencing technology that produces base space reads, such as the SOLEXA/Illumina and Roche/454 Life Sciences sequencing platforms. We then refine the algorithm to correct color space reads from the Applied Biosystems SOLiD sequencing platform together with normal base space reads. Our new tool is based on the SHREC program, which is aimed at correcting SOLEXA/Illumina reads. Our experiments show that we can detect errors with 99% sensitivity and >98% specificity if the combined sequencing coverage of the sets is at least 12. We also show that the error rate of the reads is greatly reduced.
AVAILABILITY: The Java source code is freely available at http://www.cs.helsinki.fi/u/lmsalmel/hybrid-shrec/
CONTACT: leena.salmela@cs.helsinki.fi
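The sensitivity, specificity and combined coverage figures quoted in the results can be made concrete with a short sketch. It uses the textbook definitions of these quantities; the abstract does not say whether errors are counted per base or per read, so the counting unit here is an assumption, and all numbers in the example are hypothetical rather than taken from the paper.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real errors that were detected: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of correct positions left unflagged: TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)


def combined_coverage(read_sets, genome_length: int) -> float:
    """Total sequenced bases over all read sets divided by genome length.

    read_sets is a list of (number_of_reads, read_length) pairs,
    one pair per sequencing platform.
    """
    total_bases = sum(n_reads * read_len for n_reads, read_len in read_sets)
    return total_bases / genome_length


# Hypothetical counts, chosen only to illustrate the arithmetic.
print(sensitivity(true_pos=99, false_neg=1))     # 0.99
print(specificity(true_neg=980, false_pos=20))   # 0.98
# Two hypothetical read sets over a 5 Mbp genome.
print(combined_coverage([(1_000_000, 36), (480_000, 50)], 5_000_000))  # 12.0
```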


Year:  2010        PMID: 20378555     DOI: 10.1093/bioinformatics/btq151

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  33 in total

1.  Fulcrum: condensing redundant reads from high-throughput sequencing studies.

Authors:  Matthew S Burriesci; Erik M Lehnert; John R Pringle
Journal:  Bioinformatics       Date:  2012-03-13       Impact factor: 6.937

2.  ECHO: a reference-free short-read error correction algorithm.

Authors:  Wei-Chun Kao; Andrew H Chan; Yun S Song
Journal:  Genome Res       Date:  2011-04-11       Impact factor: 9.043

Review 3.  From next-generation resequencing reads to a high-quality variant data set.

Authors:  S P Pfeifer
Journal:  Heredity (Edinb)       Date:  2016-10-19       Impact factor: 3.821

4.  Pluribus-Exploring the Limits of Error Correction Using a Suffix Tree.

Authors:  Daniel Savel; Thomas LaFramboise; Ananth Grama; Mehmet Koyuturk
Journal:  IEEE/ACM Trans Comput Biol Bioinform       Date:  2016-06-29       Impact factor: 3.710

5.  DecGPU: distributed error correction on massively parallel graphics processing units using CUDA and MPI.

Authors:  Yongchao Liu; Bertil Schmidt; Douglas L Maskell
Journal:  BMC Bioinformatics       Date:  2011-03-29       Impact factor: 3.169

6.  Quake: quality-aware detection and correction of sequencing errors.

Authors:  David R Kelley; Michael C Schatz; Steven L Salzberg
Journal:  Genome Biol       Date:  2010-11-29       Impact factor: 13.583

7.  RecountDB: a database of mapped and count corrected transcribed sequences.

Authors:  Edward Wijaya; Martin C Frith; Kiyoshi Asai; Paul Horton
Journal:  Nucleic Acids Res       Date:  2011-12-01       Impact factor: 16.971

8.  ConDeTri--a content dependent read trimmer for Illumina data.

Authors:  Linnéa Smeds; Axel Künstner
Journal:  PLoS One       Date:  2011-10-19       Impact factor: 3.240

9.  Using false discovery rates to benchmark SNP-callers in next-generation sequencing projects.

Authors:  Rhys A Farrer; Daniel A Henk; Dan MacLean; David J Studholme; Matthew C Fisher
Journal:  Sci Rep       Date:  2013       Impact factor: 4.379

10.  Estimation of sequencing error rates in short reads.

Authors:  Xin Victoria Wang; Natalie Blades; Jie Ding; Razvan Sultana; Giovanni Parmigiani
Journal:  BMC Bioinformatics       Date:  2012-07-30       Impact factor: 3.169
