Pei-Yu Liao, Yong Seok Choi, Kelvin H Lee.
Abstract
In +1 programmed ribosomal frameshifting (PRF), ribosomes skip one nucleotide toward the 3'-end during translation. Most of the genes known to demonstrate +1 PRF have been discovered by chance or by searching homologous genes. Here, a bioinformatic framework called FSscan is developed to perform a systematic search for potential +1 frameshift sites in the Escherichia coli genome. Based on a current state-of-the-art understanding of the mechanism of +1 PRF, FSscan calculates scores for a 16-nt window along a gene sequence according to different effects of the stimulatory signals, and ribosome E-, P- and A-site interactions. FSscan successfully identified the +1 PRF site in prfB and predicted yehP, pepP, nuoE and cheA as +1 frameshift candidates in the E. coli genome. Empirical results demonstrated that the potential +1 frameshift sequences identified promoted significant levels of +1 frameshifting in vivo. Mass spectrometry analysis confirmed the presence of the frameshifted proteins expressed from a yehP-egfp fusion construct. FSscan allows a genome-wide and systematic search for +1 frameshift sites in E. coli. The results have implications for bioinformatic identification of novel frameshift proteins, ribosomal frameshifting, coding sequence detection and the application of mass spectrometry to studying frameshift proteins.
Year: 2009 PMID: 19783813 PMCID: PMC2790909 DOI: 10.1093/nar/gkp796
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. The scoring system for the FSscan program. FSscan calculates scores for a 16-nt window along the gene sequence. Each step is 3 nt. FS index (FSI) = S + E + P + A.
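The windowing described in the Figure 1 caption can be sketched as follows. This is a minimal illustration, not the FSscan implementation: only the 16-nt window, the 3-nt step and the sum FSI = S + E + P + A come from the caption. The function names (`fs_index`, `scan`, `max_fsi`) are hypothetical, and the four component scorers are left as caller-supplied functions because their scoring rules are not given in this record.

```python
from typing import Callable, Iterator, Optional, Tuple

WINDOW = 16  # window length in nt (from the Figure 1 caption)
STEP = 3     # step size in nt, i.e. one codon (from the Figure 1 caption)

# Hypothetical type: each component scorer maps a 16-nt window to a number.
# The real S, E, P and A scoring rules are NOT reproduced here.
Scorer = Callable[[str], float]


def fs_index(window: str, s: Scorer, e: Scorer, p: Scorer, a: Scorer) -> float:
    """FSI = S + E + P + A for a single window."""
    return s(window) + e(window) + p(window) + a(window)


def scan(seq: str, s: Scorer, e: Scorer, p: Scorer, a: Scorer
         ) -> Iterator[Tuple[int, str, float]]:
    """Yield (start, window, FSI) for each 16-nt window, moving 3 nt at a time."""
    seq = seq.upper()
    for start in range(0, len(seq) - WINDOW + 1, STEP):
        window = seq[start:start + WINDOW]
        yield start, window, fs_index(window, s, e, p, a)


def max_fsi(seq: str, s: Scorer, e: Scorer, p: Scorer, a: Scorer
            ) -> Optional[Tuple[int, str, float]]:
    """Return the highest-scoring window, or None if seq is shorter than 16 nt."""
    return max(scan(seq, s, e, p, a), key=lambda hit: hit[2], default=None)
```

Stepping by whole codons keeps every window aligned to the zero reading frame, so a window's start offset divided by three gives the index of its first codon.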
Table 1. Nucleotide sequences incorporated into the dual fluorescence reporter system for testing +1 frameshift efficiency in vivo in this study.
| Original gene | 16-nt window with max FSI in the gene (the P-site position is underlined) | Strain (transformed with corresponding reporter plasmids) |
|---|---|---|
| yehP6 | ||
| nuoE6 | ||
| pepP6 | ||
| cheA6 | ||
| ygcH6 | ||
| yeaI6 | ||
| pspD6 | ||
| glnD6 | ||
| yjgN6 | ||
| cysD6 | ||
| ran1 | ||
| yehP7 | ||
yehP, nuoE, pepP, cheA, ygcH and yeaI are the top ranking candidates identified by FSscan.
glnD, yjgN and cysD are selected genes with one or two frameshifting features. rand is a randomly designed sequence to serve as a negative control.
Figure 2. FSscan identifies the +1 frameshift site in prfB. A peak FSI is observed as the ribosome P-site is positioned at the 25th codon.
Figure 3. Maximum FSI in each of the 4132 E. coli protein-coding sequences. Five genes with a maximum FSI above 3.5 are indicated in red. prfB has the maximum FSI 5.05. yehP has the maximum FSI 4.47. nuoE has the maximum FSI 4.39. pepP has the maximum FSI 4.39. cheA has the maximum FSI 3.55.
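The candidate selection summarized in the Figure 3 caption amounts to a threshold on each gene's maximum FSI. The sketch below uses only the five values quoted in the caption; the function name `candidates` and the dictionary layout are assumptions made for illustration, and the remaining genes of the 4132 scanned are omitted because their scores are not listed here.

```python
from typing import Dict, List, Tuple

THRESHOLD = 3.5  # cut-off quoted in the Figure 3 caption

# Maximum FSI per gene, as quoted in the Figure 3 caption.
max_fsi_by_gene: Dict[str, float] = {
    "prfB": 5.05,
    "yehP": 4.47,
    "nuoE": 4.39,
    "pepP": 4.39,
    "cheA": 3.55,
}


def candidates(scores: Dict[str, float],
               threshold: float = THRESHOLD) -> List[Tuple[str, float]]:
    """Return (gene, max FSI) pairs strictly above the threshold, highest first."""
    hits = [(gene, score) for gene, score in scores.items() if score > threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


for gene, score in candidates(max_fsi_by_gene):
    print(f"{gene}\t{score:.2f}")
```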
Figure 4. Frameshift efficiency (FS%) for potential frameshift sequences identified by FSscan. The histogram indicates the experimentally observed FS% for different test strains listed in Table 1. Error bars show the standard deviation. Diamonds indicate the program-calculated FSI for the potential frameshift cassettes (sequences are shown in Table 1).
Figure 5. Frameshift efficiency (FS%) for yehP6 and yehP7. In yehP6, the linker inserted between the two fluorescence reporters contains the predicted yehP frameshift sequence: GTG GAG TAT GGT CGG C. In yehP7, the frameshift sequence is mutated to GTG GAG TTA GGT CGG C (where zero frame codons are separated by spaces).
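To make the difference between the two cassettes in the Figure 5 caption explicit, the short sketch below splits each sequence into zero-frame codons and reports the positions where they differ. The helper names are hypothetical; the two sequences are copied from the caption, and nothing is assumed about which codon occupies the P-site.

```python
from typing import List, Tuple


def zero_frame_codons(cassette: str) -> List[str]:
    """Split a cassette into zero-frame codons; a trailing partial codon is kept."""
    nt = cassette.replace(" ", "").upper()
    return [nt[i:i + 3] for i in range(0, len(nt), 3)]


def codon_differences(a: str, b: str) -> List[Tuple[int, str, str]]:
    """Return (0-based codon index, codon in a, codon in b) where the two differ."""
    pairs = zip(zero_frame_codons(a), zero_frame_codons(b))
    return [(i, x, y) for i, (x, y) in enumerate(pairs) if x != y]


YEHP6 = "GTG GAG TAT GGT CGG C"  # predicted frameshift sequence (Figure 5)
YEHP7 = "GTG GAG TTA GGT CGG C"  # mutated sequence (Figure 5)

print(codon_differences(YEHP6, YEHP7))  # → [(2, 'TAT', 'TTA')]
```

The two cassettes differ only in the third zero-frame codon (TAT in yehP6, TTA in yehP7).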
Figure 6. (a) The nucleotide sequence design for yehP40, yehP41 and yehP4C. (b) Western blot for the cell lysate to detect the frameshift protein. Lane 1: total lysate from yehP40; lane 2: total lysate from yehP41; lane 3: total lysate from yehP4C. The amount of the protein loaded for yehP40 is one-third of the amount of the protein for yehP41 and yehP4C.
Figure 7. Nucleotide and amino acid sequence for the YehP-EGFP frameshift protein in yehP41. (a) The nucleotide and amino acid sequence for the predicted frameshift region in YehP-EGFP. The predicted frameshift sequence is shown in bold, with the P-site codon underlined. The zero frame and the +1 frame amino acid sequences are shown under the nucleotide sequence. The peptide spanning the frameshift site, with the zero frame translation before the site and the +1 frame translation after the site, is shown in red. (b) Amino acid sequence for the frameshift protein in yehP41 strain. The YehP-EGFP was expressed as a result of +1 frameshifting. Tryptic peptides observed by MRM are marked in red (>95% confidence level). The sequence coverage is 21.7%.
Table 2. BLAST result for yehP. blastn was used as the algorithm to search the nucleotide collection database on the National Center for Biotechnology Information website.
| Accession | Description | Max score | Total score | Query coverage (%) | E value | Max ident (%) |
|---|---|---|---|---|---|---|
| CP000948.1 | | 2254 | 2290 | 100 | 0.0 | 100 |
| AP009048.1 | | 2254 | 2290 | 100 | 0.0 | 100 |
| U00096.2 | | 2254 | 2290 | 100 | 0.0 | 100 |
| U00007.1 | 47 to 48 centisome region of | 2254 | 2254 | 100 | 0.0 | 100 |
| CU928160.2 | | 2119 | 2155 | 100 | 0.0 | 100 |
| AP009240.1 | | 2095 | 2132 | 100 | 0.0 | 100 |
| CP000800.1 | | 2095 | 2132 | 100 | 0.0 | 100 |
| CP000036.1 | | 2095 | 2168 | 100 | 0.0 | 100 |
| AB426057.1 | | 2087 | 2087 | 100 | 0.0 | 98 |
| CP000034.1 | | 2087 | 2160 | 100 | 0.0 | 100 |
| CP000946.1 | | 2056 | 2092 | 100 | 0.0 | 100 |
| CP000802.1 | | 2032 | 2068 | 100 | 0.0 | 100 |
| AE005674.1 | | 1992 | 2065 | 100 | 0.0 | 100 |
| AE014073.1 | | 1992 | 2065 | 100 | 0.0 | 100 |
| AE014075.1 | | 1976 | 2085 | 100 | 0.0 | 100 |
| CU928164.2 | | 1961 | 2033 | 100 | 0.0 | 100 |
| BA000007.2 | | 1961 | 2033 | 100 | 0.0 | 100 |
| AE005174.2 | | 1961 | 2033 | 100 | 0.0 | 100 |
| CP001164.1 | | 1953 | 2025 | 100 | 0.0 | 100 |
| CP000970.1 | | 1937 | 2009 | 100 | 0.0 | 100 |
| CU928162.2 | | 1913 | 2021 | 100 | 0.0 | 100 |
| FM180568.1 | | 1905 | 1977 | 100 | 0.0 | 100 |
| CU928161.2 | | 1897 | 2006 | 100 | 0.0 | 100 |
| CP000468.1 | | 1897 | 2006 | 100 | 0.0 | 100 |
| CP000243.1 | | 1897 | 2006 | 100 | 0.0 | 100 |
| CU928158.2 | | 1850 | 1924 | 100 | 0.0 | 95 |
| CP000247.1 | | 1850 | 1958 | 100 | 0.0 | 100 |
| CU928163.2 | | 1842 | 1914 | 100 | 0.0 | 100 |
| CU651637.1 | | 1818 | 1926 | 100 | 0.0 | 100 |
| AP000400.1 | Enterobacteria phage VT1-Sakai genomic DNA, prophage inserted region in | 1542 | 1542 | 81 | 0.0 | 96 |
| CP000038.1 | | 603 | 675 | 29 | 8e-169 | 100 |
The search was optimized for highly similar sequences.
Max ident, Maximum identities.
Figure 8. Sequence conservation of the predicted frameshift cassette in yehP. The sequence logo was generated by aligning 31 sequences in Table 2.