
Finding errors in DNA sequences.

J Posfai, R J Roberts.

Abstract

An algorithm is described that can detect certain errors within coding regions of DNA sequences. The algorithm is based on the idea that an insertion or deletion error within a coding sequence would interrupt the reading frame and cause the correct translation of a DNA sequence to require one or more frameshifts. If the coding sequence shows similarity to a known protein sequence, then such errors can be detected by comparing the conceptual translations of DNA sequences in all six reading frames with every sequence in a protein sequence data base. We have incorporated these ideas into a computer program, called DETECT, that can serve as an aid to the experimentalist who is determining new DNA sequences, so that obvious errors may be located and corrected. The program has been tested using raw experimental data and against sequences from the European Molecular Biology Laboratory data base that are annotated as containing frameshifts. We have also tested it using unidentified open reading frames that flank known, annotated genes in the GenBank data base. Many potential errors are apparent and, in some cases, functions can be suggested for the "corrected" versions of these reading frames, leading to the identification of new genes. As more sequences are determined, the power of this method will increase substantially.
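The idea in the abstract can be illustrated with a minimal Python sketch. This is not the DETECT program: the abstract does not describe its comparison method, so exact k-mer matching against a single reference protein is used here as a simple stand-in for a protein data base search. All names (`six_frame_translations`, `frames_with_hits`, the toy sequences) are hypothetical and chosen for illustration only. The sketch translates a DNA sequence in all six reading frames and reports which frames share short peptide words with the reference protein; hits in more than one frame of the same strand suggest a possible insertion or deletion error.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map frame labels (+1, +2, +3, -1, -2, -3) to conceptual translations."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames


def frames_with_hits(dna, protein, k=4):
    """Count, per frame, the positions whose k-mer also occurs in `protein`.

    Exact k-mer matching is an assumption of this sketch, standing in for
    a real similarity search against a protein data base.
    """
    kmers = {protein[i:i + k] for i in range(len(protein) - k + 1)}
    hits = {}
    for frame, peptide in six_frame_translations(dna).items():
        count = sum(
            peptide[i:i + k] in kmers for i in range(len(peptide) - k + 1)
        )
        if count:
            hits[frame] = count
    return hits


# Toy data (invented for this example): a short protein and DNA encoding it.
reference_protein = "MKTAYIAKQRQISFVKSHFSRQ"
coding_dna = (
    "ATGAAAACCGCGTATATTGCGAAACAGCGTCAGATTAGC"
    "TTTGTGAAAAGCCATTTTAGCCGTCAG"
)
# Simulate a sequencing error: delete a single base in the middle.
mutated_dna = coding_dna[:30] + coding_dna[31:]

print(frames_with_hits(coding_dna, reference_protein))
print(frames_with_hits(mutated_dna, reference_protein))
```

In the intact sequence the similarity to the reference protein lies in frame +1. After the single-base deletion, the first part of the protein still matches frame +1 while the remainder matches frame +3, which is the kind of frame switch the abstract describes as evidence of an insertion or deletion error.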


Year:  1992        PMID: 1316617      PMCID: PMC49150          DOI: 10.1073/pnas.89.10.4698

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


