MOTIVATION: In cDNA sequencing projects, it is vital to know whether the protein coding region of a sequence is complete, or whether errors have occurred during library construction. Here we present a linear discriminant approach that predicts this completeness by estimating the probability of each ATG being the initiation codon.

RESULTS: Because of the current shortage of full-length cDNA data on which to base this work, tests were performed on a non-redundant set of 660 initiation codon-containing DNA sequences that had been conceptually spliced into mRNA/cDNA. We also used an edited set of the same sequences that only contained the region following the initiation codon as a negative control. Using the criterion that only a single prediction is allowed for each sequence, a cut-off was selected at which discrimination of both positive and negative sets was equal. At this cut-off, 67% of each set could be correctly distinguished, with the correct ATG codon also being identified in the positive set. Reliability could be increased further by raising the cut-off or including homologues, the relative merits of which are discussed.

AVAILABILITY: The prediction program, called ATGpr, and other data are available at http://www.hri.co.jp/atgpr

CONTACT: swintech@hri.co.jp
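The general idea of scoring every ATG with a linear discriminant and allowing at most one prediction per sequence can be sketched as follows. This is only an illustration under stated assumptions: the four features (a purine at position -3, a G at position +4, the length of the downstream open reading frame, and whether the ATG is the first in the sequence), the weights, the bias and the cut-off are hypothetical placeholders and are not the features or parameters used by ATGpr. The names `features`, `score` and `predict_start` are likewise invented for this sketch.

```python
from typing import List, Optional, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}

# Hypothetical weights and bias, chosen by hand for illustration only.
WEIGHTS = [1.0, 0.5, 2.0, 0.5]
BIAS = -2.0


def orf_length(seq: str, start: int) -> int:
    """Count codons from `start` up to the first in-frame stop or the sequence end."""
    n = 0
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            break
        n += 1
    return n


def features(seq: str, pos: int) -> List[float]:
    """Illustrative feature vector for the ATG that starts at index `pos`."""
    purine_minus3 = 1.0 if pos >= 3 and seq[pos - 3] in "AG" else 0.0
    g_plus4 = 1.0 if pos + 3 < len(seq) and seq[pos + 3] == "G" else 0.0
    orf = min(orf_length(seq, pos), 300) / 300.0  # capped and scaled to [0, 1]
    first_atg = 1.0 if seq.count("ATG", 0, pos) == 0 else 0.0
    return [purine_minus3, g_plus4, orf, first_atg]


def score(seq: str, pos: int) -> float:
    """Linear discriminant score: weighted sum of the features plus a bias."""
    return sum(w * x for w, x in zip(WEIGHTS, features(seq, pos))) + BIAS


def predict_start(seq: str, cutoff: float = 0.0) -> Optional[Tuple[int, float]]:
    """Return (position, score) of the best-scoring ATG if it reaches the cut-off.

    At most one prediction is made per sequence; None means no ATG scored
    high enough, i.e. the coding region is predicted to be incomplete.
    """
    seq = seq.upper()
    candidates = [(score(seq, i), i) for i in range(len(seq) - 2) if seq[i:i + 3] == "ATG"]
    if not candidates:
        return None
    best_score, best_pos = max(candidates)
    return (best_pos, best_score) if best_score >= cutoff else None


# A toy "complete" sequence and a copy truncated to the region after the ATG.
complete = "GGGACC" + "ATG" + "GCT" * 50 + "TAA"
truncated = complete[9:]
```

With these toy inputs, `predict_start(complete)` selects the ATG at index 6, while `predict_start(truncated)` returns `None`, mirroring the positive and negative sets described above. Raising `cutoff` trades sensitivity for reliability; in practice the weights would be fitted to training data rather than set by hand.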