Probabilistic model based error correction in a set of various mutant sequences analyzed by next-generation sequencing.

Takuyo Aita, Norikazu Ichihashi, Tetsuya Yomo.

Abstract

To analyze the evolutionary dynamics of a mutant population in an evolutionary experiment, it is necessary to sequence a vast number of mutants using high-throughput (next-generation) sequencing technologies, which enable rapid and parallel analysis of multikilobase sequences. However, the observed sequences contain many base-call errors. Therefore, when next-generation sequencing is applied to the analysis of a heterogeneous population of mutant sequences, it is necessary to discriminate between true point mutations and base-call errors in the observed sequences, and to subject the sequences to error correction. To address this issue, we have developed a novel error-correction method based on the Potts model and a maximum a posteriori probability (MAP) estimate of its parameters, which correspond to the "true sequences". Our method utilizes (1) the "quality scores" assigned to individual bases in the observed sequences and (2) the neighborhood relationship among the observed sequences mapped in sequence space. Computer experiments on artificially generated sequences supported the effectiveness of our method, showing that 50-90% of errors were removed. Interestingly, this method is analogous to a probabilistic-model-based method of image restoration developed in the field of information engineering.
Copyright © 2013 Elsevier Ltd. All rights reserved.
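
The two ingredients named in the abstract, per-base quality scores and agreement among related sequences, can be illustrated with a small sketch. This is not the authors' Potts-model method: it assumes standard Phred-scaled quality scores (error probability 10^(-Q/10)), assumes a miscalled base is equally likely to be any of the other three, and substitutes a simple per-column base-frequency prior over pre-aligned, equal-length reads for the neighborhood relationship in sequence space. All function names are hypothetical.

```python
from collections import Counter

BASES = "ACGT"

def phred_to_error_prob(q):
    """Convert a Phred quality score Q to an error probability 10^(-Q/10)."""
    return 10 ** (-q / 10)

def column_prior(reads, pos, pseudocount=1.0):
    """Prior over bases at one position, from smoothed column frequencies.

    Assumption: column frequencies stand in for the sequence-space
    neighborhood used in the paper; this is NOT the Potts model.
    """
    counts = Counter(read[pos] for read in reads)
    total = len(reads) + pseudocount * len(BASES)
    return {b: (counts[b] + pseudocount) / total for b in BASES}

def map_base(observed, q, prior):
    """Return the base with the highest posterior weight likelihood * prior.

    Assumption: a wrong call is uniformly one of the 3 other bases.
    """
    p_err = phred_to_error_prob(q)

    def weight(b):
        likelihood = (1 - p_err) if b == observed else p_err / 3
        return likelihood * prior[b]

    return max(BASES, key=weight)

def correct_reads(reads, quals):
    """Apply the per-position MAP choice to every base of every read."""
    corrected = []
    for read, qs in zip(reads, quals):
        bases = [
            map_base(base, q, column_prior(reads, i))
            for i, (base, q) in enumerate(zip(read, qs))
        ]
        corrected.append("".join(bases))
    return corrected
```

In this sketch a low-quality base that disagrees with most other reads is replaced by the majority base, while a high-quality disagreeing base is kept as a candidate true mutation; the actual method couples sequences through the Potts model rather than treating positions independently.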

Keywords:  Base call error; Image restoration; Quality score; Quasispecies; SMRT; Sequence analysis

Year: 2013; PMID: 24184706; DOI: 10.1016/j.compbiolchem.2013.09.006

Source DB: PubMed; Journal: Comput Biol Chem; ISSN: 1476-9271; Impact factor: 2.877


  2 in total

Review 1.  The analysis of clonal expansions in normal and autoimmune B cell repertoires.

Authors: Uri Hershberg; Eline T Luning Prak
Journal: Philos Trans R Soc Lond B Biol Sci; Date: 2015-09-05; Impact factor: 6.237

2.  Denoising DNA deep sequencing data-high-throughput sequencing errors and their correction.

Authors: David Laehnemann; Arndt Borkhardt; Alice Carolyn McHardy
Journal: Brief Bioinform; Date: 2015-05-29; Impact factor: 11.622
