Assignment of position-specific error probability to primary DNA sequence data.

C B Lawrence, V V Solovyev.

Abstract

DNA sequence predicted from polyacrylamide gel-based technologies is inaccurate because of variations in the quality of the primary data, which arise from limitations of the technology, and because of sequence-specific variations caused by nucleotide interactions within the DNA molecule and with the gel. The ability to recognize the probability of error in the primary data will be useful in reconstructing the target sequence of a DNA sequencing project and in estimating the accuracy of the final sequence. This paper describes the use of linear discriminant analysis to assign position-specific probabilities of incorrect, over- and under-prediction of nucleotides to each predicted nucleotide position in primary sequence data generated by a gel-based DNA sequencing technology. Using this method, most of the error potential in primary sequence data can be assigned to a limited number of discrete positions. The use of probability values in the sequence reconstruction process, and in estimating the accuracy of consensus sequence determination, is described.
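
To illustrate how position-specific error probabilities might feed into consensus accuracy estimation, the following is a minimal sketch and not the authors' procedure. It assumes independent reads, a uniform prior over the four bases, error probabilities strictly between 0 and 1, and a miscalled base that is equally likely to be any of the three other bases. It handles substitutions only and ignores the over- and under-prediction (insertion and deletion) cases the abstract mentions. The function name `consensus_posterior` and the example column are hypothetical.

```python
from math import prod

BASES = "ACGT"

def consensus_posterior(calls):
    """Posterior probability of each true base at one aligned column.

    calls: list of (called_base, p_error) pairs, one per read.
    Assumptions (illustrative only): reads are independent, the prior
    over bases is uniform, and an erroneous call is equally likely to
    be any of the three other bases.
    """
    likelihood = {}
    for true_base in BASES:
        # Probability of observing every call if true_base were correct.
        likelihood[true_base] = prod(
            (1.0 - p) if called == true_base else p / 3.0
            for called, p in calls
        )
    total = sum(likelihood.values())
    return {base: value / total for base, value in likelihood.items()}

# Hypothetical column: two reads call A, one low-confidence read calls C.
column = [("A", 0.01), ("A", 0.20), ("C", 0.40)]
post = consensus_posterior(column)
best = max(post, key=post.get)
consensus_error = 1.0 - post[best]
print(best, consensus_error)
```

Under these assumptions, the estimated error of the consensus call is one minus the posterior of the winning base, so a single confident read can outweigh a conflicting low-confidence read.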

Year: 1994; PMID: 8165143; PMCID: PMC523653; DOI: 10.1093/nar/22.7.1272

Source DB: PubMed; Journal: Nucleic Acids Res; ISSN: 0305-1048; Impact factor: 16.971


