MOTIVATION: In an effort to identify potential programmed frameshift sites by statistical analysis, we explore the hypothesis that selective pressure would have rendered such sites underrepresented in protein-coding sequences. We developed a computer program that compares the observed frequencies of k-length nucleotide subsequences with the frequencies predicted by a zero-order Markov chain determined by the codon bias of the same set of sequences. The program was used to calculate and evaluate the distribution of 7-base oligonucleotides in the 6000+ putative protein-coding sequences of S. cerevisiae, as a preliminary to laboratory testing of the most highly underrepresented oligonucleotides for frameshifting efficiency.

RESULTS: Among the most significant results is the finding that the heptanucleotides CUU-AGG-C and CUU-AGU-U rank among the least represented phase I heptanucleotides in the coding sequences of S. cerevisiae. These are the sites of the programmed +1 translational frameshifts required for the production in yeast of actin filament-binding protein ABP140 and telomerase subunit EST3, respectively. Laboratory experiments demonstrated that other underrepresented heptanucleotides identified by the program, for example GGU-CAG-A, are also prone to significant translational frameshifting, suggesting that genes containing other underrepresented heptamers may also encode transframe products.

AVAILABILITY: The program is available for download from http://www.gesteland.genetics.utah.edu/freqAnalysis

SUPPLEMENTARY INFORMATION: Complete results from the analysis of S. cerevisiae are available at http://www.gesteland.genetics.utah.edu/freqAnalysis
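The comparison described under MOTIVATION can be illustrated with a minimal sketch. It is not the authors' program; it rests on several assumptions made here for illustration: "zero-order Markov chain determined by codon bias" is read as each codon being drawn independently from the codon usage of the input set; "phase I" is read as a heptamer starting at the first codon position (two full codons plus the first base of the next codon); and underrepresentation is scored by a plain observed/expected ratio rather than whatever statistic the published program uses. All function names are hypothetical.

```python
from collections import Counter
from itertools import product


def codons_of(seq):
    """Split an in-frame coding sequence into complete codons (DNA alphabet)."""
    seq = seq.upper().replace("U", "T")  # accept RNA-style input such as CUUAGGC
    usable = len(seq) - len(seq) % 3     # ignore a trailing partial codon
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def codon_frequencies(seqs):
    """Relative codon usage across all sequences (the 'codon bias')."""
    counts = Counter(c for s in seqs for c in codons_of(s))
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}


def observed_heptamers(seqs):
    """Count heptamers that start at a first codon position."""
    counts = Counter()
    windows = 0
    for s in seqs:
        cs = codons_of(s)
        for i in range(len(cs) - 2):
            counts[cs[i] + cs[i + 1] + cs[i + 2][0]] += 1
            windows += 1
    return counts, windows


def expected_frequency(heptamer, freq):
    """Probability of the heptamer if codons are drawn independently."""
    c1, c2, base = heptamer[:3], heptamer[3:6], heptamer[6]
    # Probability that the third codon begins with the required base.
    p_base = sum(f for c, f in freq.items() if c[0] == base)
    return freq.get(c1, 0.0) * freq.get(c2, 0.0) * p_base


def representation(seqs):
    """Rows of (heptamer, observed, expected, ratio), most underrepresented first."""
    freq = codon_frequencies(seqs)
    counts, windows = observed_heptamers(seqs)
    rows = []
    for c1, c2 in product(freq, repeat=2):
        for base in "ACGT":
            hept = c1 + c2 + base
            expected = expected_frequency(hept, freq) * windows
            if expected > 0:
                rows.append((hept, counts[hept], expected, counts[hept] / expected))
    rows.sort(key=lambda r: r[3])
    return rows
```

Calling `representation(list_of_cds_strings)` on a set of in-frame coding sequences returns heptamers sorted by observed/expected ratio, so candidates such as `CTTAGGC` (CUU-AGG-C) would appear near the top of the list if they are underrepresented in that set. A real analysis would also need a significance measure, since low expected counts make raw ratios noisy.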
Authors: Olga L Gurvich; Pavel V Baranov; Jiadong Zhou; Andrew W Hammer; Raymond F Gesteland; John F Atkins Journal: EMBO J Date: 2003-11-03 Impact factor: 11.598
Authors: Virag Sharma; Andrew E Firth; Ivan Antonov; Olivier Fayet; John F Atkins; Mark Borodovsky; Pavel V Baranov Journal: Mol Biol Evol Date: 2011-06-14 Impact factor: 16.240
Authors: John F Atkins; Gary Loughran; Pramod R Bhatt; Andrew E Firth; Pavel V Baranov Journal: Nucleic Acids Res Date: 2016-07-19 Impact factor: 16.971
Authors: Matthew R McFarland; Corina D Keller; Brandon M Childers; Stephen A Adeniyi; Holly Corrigall; Adélaïde Raguin; M Carmen Romano; Ian Stansfield Journal: Nucleic Acids Res Date: 2020-04-06 Impact factor: 16.971