The accuracy of DNA sequences: estimating sequence quality.

G A Churchill, M S Waterman.

Abstract

In this paper we describe a method for the statistical reconstruction of a large DNA sequence from a set of sequenced fragments. We assume that the fragments have been assembled and address the problem of determining the degree to which the reconstructed sequence is free from errors, i.e., its accuracy. A consensus distribution is derived from the assembled fragment configuration based upon the rates of sequencing errors in the individual fragments. The consensus distribution can be used to find a minimally redundant consensus sequence that meets a prespecified confidence level, either base by base or across any region of the sequence. A likelihood-based procedure for the estimation of the sequencing error rates, which utilizes an iterative EM algorithm, is described. Prior knowledge of the error rates is easily incorporated into the estimation procedure. The methods are applied to a set of assembled sequence fragments from the human G6PD locus. We close the paper with a brief discussion of the relevance and practical implications of this work.
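The abstract does not spell out the error model, so the following is only a minimal sketch of the general idea under assumptions that are not taken from the paper: substitution errors only (no insertions or deletions), a single error rate shared by all fragments, errors spread evenly over the three wrong bases, and a uniform prior over the true base. The names `column_posterior`, `call_base`, and `estimate_error_rate` are hypothetical. The first function computes a per-position consensus distribution, the second calls a base only when it meets a prespecified confidence level, and the third runs a simple EM iteration for the error rate.

```python
from collections import Counter
from math import exp, log

BASES = "ACGT"


def column_posterior(observed, error_rate):
    """Posterior over the true base at one aligned position.

    Assumed model (not from the paper): uniform prior, each read base is
    correct with probability 1 - error_rate, otherwise one of the three
    other bases with equal probability.
    """
    counts = Counter(observed)
    n = len(observed)
    log_weights = {}
    for base in BASES:
        matches = counts[base]
        log_weights[base] = (matches * log(1.0 - error_rate)
                             + (n - matches) * log(error_rate / 3.0))
    # Subtract the maximum before exponentiating to avoid underflow.
    peak = max(log_weights.values())
    weights = {b: exp(lw - peak) for b, lw in log_weights.items()}
    total = sum(weights.values())
    return {b: w / total for b, w in weights.items()}


def call_base(observed, error_rate, confidence=0.99):
    """Return (base, posterior); base is 'N' if the threshold is not met."""
    posterior = column_posterior(observed, error_rate)
    best = max(posterior, key=posterior.get)
    called = best if posterior[best] >= confidence else "N"
    return called, posterior[best]


def estimate_error_rate(columns, initial=0.05, iterations=50):
    """EM estimate of the shared error rate from aligned columns.

    E-step: posterior over the true base in every column.
    M-step: expected number of miscalled bases divided by total bases.
    """
    rate = initial
    total_obs = sum(len(col) for col in columns)
    for _ in range(iterations):
        expected_errors = 0.0
        for col in columns:
            posterior = column_posterior(col, rate)
            counts = Counter(col)
            expected_errors += sum(
                posterior[b] * (len(col) - counts[b]) for b in BASES
            )
        # Keep the rate strictly inside (0, 1) so the logarithms stay finite.
        rate = min(max(expected_errors / total_obs, 1e-6), 0.5)
    return rate
```

For example, `call_base("AAAA", 0.05)` calls `A`, while `call_base("AC", 0.05)` returns `N` because two conflicting reads cannot reach the 0.99 threshold. A fuller treatment along the lines the abstract describes would need per-fragment or per-error-type rates and a way to fold in prior knowledge of those rates.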

Year: 1992   PMID: 1358801   DOI: 10.1016/s0888-7543(05)80288-5

Source DB: PubMed   Journal: Genomics   ISSN: 0888-7543   Impact factor: 5.736


  22 in total

1.  An Eulerian path approach to DNA fragment assembly.

Authors:  P A Pevzner; H Tang; M S Waterman
Journal:  Proc Natl Acad Sci U S A       Date:  2001-08-14       Impact factor: 11.205

2.  Adjust quality scores from alignment and improve sequencing accuracy.

Authors:  Ming Li; Magnus Nordborg; Lei M Li
Journal:  Nucleic Acids Res       Date:  2004-09-30       Impact factor: 16.971

3.  Decoding DNA, RNA and peptides with quantum tunnelling. (Review)

Authors:  Massimiliano Di Ventra; Masateru Taniguchi
Journal:  Nat Nanotechnol       Date:  2016-02       Impact factor: 39.213

4.  Genome assembly reborn: recent computational challenges.

Authors:  Mihai Pop
Journal:  Brief Bioinform       Date:  2009-05-29       Impact factor: 11.622

5.  An MCMC algorithm for haplotype assembly from whole-genome sequence data.

Authors:  Vikas Bansal; Aaron L Halpern; Nelson Axelrod; Vineet Bafna
Journal:  Genome Res       Date:  2008-08       Impact factor: 9.043

6.  A consistency-based consensus algorithm for de novo and reference-guided sequence assembly of short reads.

Authors:  Tobias Rausch; Sergey Koren; Gennady Denisov; David Weese; Anne-Katrin Emde; Andreas Döring; Knut Reinert
Journal:  Bioinformatics       Date:  2009-03-05       Impact factor: 6.937

7.  Diploid genome reconstruction of Ciona intestinalis and comparative analysis with Ciona savignyi.

Authors:  Jong Hyun Kim; Michael S Waterman; Lei M Li
Journal:  Genome Res       Date:  2007-06-13       Impact factor: 9.043

8.  A hidden Markov model that finds genes in E. coli DNA.

Authors:  A Krogh; I S Mian; D Haussler
Journal:  Nucleic Acids Res       Date:  1994-11-11       Impact factor: 16.971

9.  Estimation of errors in "raw" DNA sequences: a validation study.

Authors:  P Richterich
Journal:  Genome Res       Date:  1998-03       Impact factor: 9.043

10.  Fundamental Bounds for Sequence Reconstruction from Nanopore Sequencers.

Authors:  Abram Magner; Jarosław Duda; Wojciech Szpankowski; Ananth Grama
Journal:  IEEE Trans Mol Biol Multiscale Commun       Date:  2016-06