A frameshift error detection algorithm for DNA sequencing projects.

G A Fichant, Y Quentin.

Abstract

During the determination of DNA sequences, frameshift errors are not the most frequent but they are the most bothersome as they corrupt the amino acid sequence over several residues. Detection of such errors by sequence alignment is only possible when related sequences are found in the databases. To avoid this limitation, we have developed a new tool based on the distribution of non-overlapping 3-tuples or 6-tuples in the three frames of an ORF. The method relies upon the result of a correspondence analysis. It has been extensively tested on Bacillus subtilis and Saccharomyces cerevisiae sequences and has also been examined with human sequences. The results indicate that it can detect frameshift errors affecting as few as 20 bp with a low rate of false positives (no more than 1.0/1000 bp scanned). The proposed algorithm can be used to scan a large collection of data, but it is mainly intended for laboratory practice as a tool for checking the quality of the sequences produced during a sequencing project.
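As a rough illustration of the raw input such a method works from, the sketch below counts non-overlapping k-tuples in each of the three frames of a sequence. It is not the authors' implementation: the function name `frame_tuple_counts` and the toy sequence are hypothetical, and the correspondence analysis step that the abstract says the method relies upon is not reproduced here.

```python
from collections import Counter


def frame_tuple_counts(seq, k=3):
    """Count non-overlapping k-tuples in each of the three frames of seq.

    Hypothetical helper for illustration only. Returns a list of three
    Counters, one per frame offset (0, 1, 2). Trailing bases that do not
    fill a complete k-tuple are ignored.
    """
    seq = seq.upper()
    counts = []
    for frame in range(3):
        # Step by k so that consecutive tuples never overlap.
        tuples = (seq[i:i + k] for i in range(frame, len(seq) - k + 1, k))
        counts.append(Counter(tuples))
    return counts


# Toy sequence, chosen only to show how the three frames differ.
counts = frame_tuple_counts("ATGGCCATGGCC")
for frame, frame_counts in enumerate(counts):
    print(frame, dict(frame_counts))
```

Passing `k=6` gives the 6-tuple variant. A frameshift inside an ORF would be expected to make the tuple distribution downstream of the error resemble that of a different frame, which is the kind of signal a statistical comparison of these counts could pick up.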

Year: 1995; PMID: 7659513; PMCID: PMC307128; DOI: 10.1093/nar/23.15.2900

Source DB: PubMed; Journal: Nucleic Acids Res; ISSN: 0305-1048; Impact factor: 16.971


