Yaramah M Zalucki, Peter M Power, Michael P Jennings.
Abstract
The definition of a typical sec-dependent bacterial signal peptide includes a positive charge at the N-terminus, thought to be required for membrane association. In this study the amino acid distribution of all Escherichia coli secretory proteins was analysed. This revealed a statistically significant bias for lysine at the second codon position (P2), consistent with a role for the positive charge in secretion. Removal of the positively charged residue at P2 in two different model systems revealed that a positive charge is not required for protein export. A well-characterized feature of large amino acids like lysine at P2 is inhibition of N-terminal methionine removal by methionyl amino-peptidase (MAP). Substitution of lysine at P2 with other large or small amino acids did not affect protein export. Analysis of codon usage revealed a bias for the AAA lysine codon at P2, suggesting that a non-coding function for the AAA codon may be responsible for the strong bias for lysine at P2 of secretory signal sequences. We conclude that selection for high translation initiation efficiency may be the selective pressure that has led to the codon usage, and consequently the amino acid usage, at P2 of secretory proteins.
Year: 2007 PMID: 17717002 PMCID: PMC2034453 DOI: 10.1093/nar/gkm577
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Primers used in this study
| Primer | Sequence (5′–3′) |
|---|---|
| Splice overlap primers | |
| Bla-rev | TTGGCTGCAGATCAACCGGGGTAAATCAAT |
| 5′bla-MBP | CAGTGAATTCTATTGAAAAAGGAAGAGTATGAAAATAAAAACAGGTGCACGCATCCT |
| 5′MBP-bla | GATGTTTTCCGCCTCGGCTCTCGCCCACCCAGAAACGCTGGTGAAAGT |
| 3′MBPss | GGCGAGAGCCGAGGCGGAAAACATC |
| 3′phoAss | GGCTTTTGTCACAGGGGTAAA |
| 5′bla_phoA_F | CAGTGAATTCTATTGAAAAAGGAAGAGTATGAAACAAAGCACTATTGCACTGG |
| 5′phoA_bla | TTTACCCCTGTGACAAAAGCCCACCCAGAAACGCTGGTG |
| Mutagenic primers | |
| MBP::K2G | CAGTGAATTCTATTGAAAAAGGAAGAGTATGGGAATAAAAACAGGT |
| MBP::K2L | CAGTGAATTCTATTGAAAAAGGAAGAGTATGCTGATAAAAACAGGT |
| MBP::K2N | CAGTGAATTCTATTGAAAAAGGAAGAGTATGAATATAAAAACAGGT |
| MBP::K2A | CAGTGAATTCTATTGAAAAAGGAAGAGTATGGCAATAAAAACAGGT |
| MBP::3N | CAGTGAATTCTATTGAAAAAGGAAGAGTATGAATATAAATACAGGTGCAAACATCCTCGC |
| PhoA::K2N | CAGTGAATTCTATTGAAAAAGGAAGAGTATGAATCAAAGCAC |
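
The primer names encode the intended change at P2 (for example, K2G replaces the lysine at position 2 with glycine). As a minimal sketch, the snippet below reads the codon immediately after the first ATG in each forward primer and translates it. The sequences are copied from the table with the line-wrap spaces removed; the helper name `p2_residue` and the reduced codon table are illustrative assumptions, not part of the original methods.

```python
# Subset of the standard genetic code: only the codons that occur at P2
# in the primers below (assumption: DNA sense strand, 5'->3').
CODON_TO_AA = {"AAA": "K", "GGA": "G", "CTG": "L", "AAT": "N", "GCA": "A"}

# Forward primers from the table, with line-wrap spaces removed.
PRIMERS = {
    "5'bla-MBP (wt malE)": "CAGTGAATTCTATTGAAAAAGGAAGAGTATGAAAATAAAAACAGGTGCACGCATCCT",
    "5'bla_phoA_F (wt phoA)": "CAGTGAATTCTATTGAAAAAGGAAGAGTATGAAACAAAGCACTATTGCACTGG",
    "MBP::K2G": "CAGTGAATTCTATTGAAAAAGGAAGAGTATGGGAATAAAAACAGGT",
    "MBP::K2L": "CAGTGAATTCTATTGAAAAAGGAAGAGTATGCTGATAAAAACAGGT",
    "MBP::K2N": "CAGTGAATTCTATTGAAAAAGGAAGAGTATGAATATAAAAACAGGT",
    "MBP::K2A": "CAGTGAATTCTATTGAAAAAGGAAGAGTATGGCAATAAAAACAGGT",
    "MBP::3N": "CAGTGAATTCTATTGAAAAAGGAAGAGTATGAATATAAATACAGGTGCAAACATCCTCGC",
    "PhoA::K2N": "CAGTGAATTCTATTGAAAAAGGAAGAGTATGAATCAAAGCAC",
}


def p2_residue(primer: str) -> str:
    """Return the one-letter amino acid encoded by the codon after the first ATG."""
    start = primer.index("ATG")          # first ATG is taken as the start codon
    codon = primer[start + 3:start + 6]  # second codon position (P2)
    return CODON_TO_AA[codon]


for name, seq in PRIMERS.items():
    print(f"{name}: P2 = {p2_residue(seq)}")
```

Both wild-type primers give lysine (K) at P2, and each mutagenic primer gives the residue named in its label.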
Figure 1. Stacked histogram showing the frequency of charged residues by position in the secretory genes of E. coli.
Observed and expected frequencies of amino acids at P2 and P3 in secretory group
| Amino acid | P2 observed | P2 expected | P3 observed | P3 expected |
|---|---|---|---|---|
| Glycine | 4 | 9 | 12 | 13 |
| Alanine | 20 | 42 | 21 | 22 |
| Proline | 14 | 15 | 10 | 13 |
| Serine | 35 | 73 | 25 | 29 |
| Threonine | 24 | 41 | 30 | 40 |
| Valine | 5 | 10 | 13 | 23 |
| Cysteine | 0 | 1 | 1 | 3 |
| Asparagine | 43 | 37 | 21 | 30 |
| Aspartic acid | 4 | 15 | 5 | 23 |
| Leucine | 16 | 29 | 32 | 36 |
| Isoleucine | 19 | 26 | 36 | 37 |
| Histidine | 6 | 6 | 8 | 12 |
| Glutamine | 12 | 19 | 10 | 31 |
| Glutamic acid | 9 | 20 | 2 | 27 |
| Phenylalanine | 21 | 15 | 14 | 15 |
| Methionine | 17 | 9 | 18 | 9 |
| Lysine | 170 | 67 | 124 | 55 |
| Tyrosine | 4 | 5 | 12 | 14 |
| Tryptophan | 1 | 1 | 5 | 6 |
| Arginine | 42 | 27 | 67 | 31 |
(a) Amino acids that promote f-Met removal by MAP. The P-values are (b) P = 1.72 × 10^−24, (c) P = 2.71 × 10^−10 and (d) P = 0.001, calculated using a χ² test with 19 degrees of freedom. The expected values of amino acids at P2 and P3 were calculated from their frequency of occurrence (f) at the respective position in all protein-encoding E. coli genes (4153). For example, alanine occurs 372 times at P2 across all genes. Expressed as an f-value this is f_ala = 372/4153 = 0.0896, so the expected value of alanine in the secreted group is 0.0896 × 466 = 41.74. The expected numbers shown in the table are rounded to the nearest whole number.
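
The expected-value arithmetic in the footnote can be sketched as follows. The observed and expected P2 counts are copied from the table; the function names are illustrative assumptions. Because the table prints expected counts rounded to whole numbers, the χ² statistic computed here is only an approximation and is not expected to reproduce the published P-value exactly.

```python
# Observed and expected (rounded) P2 counts for the secretory group,
# copied from the table above.
observed_p2 = {
    "Gly": 4, "Ala": 20, "Pro": 14, "Ser": 35, "Thr": 24, "Val": 5, "Cys": 0,
    "Asn": 43, "Asp": 4, "Leu": 16, "Ile": 19, "His": 6, "Gln": 12, "Glu": 9,
    "Phe": 21, "Met": 17, "Lys": 170, "Tyr": 4, "Trp": 1, "Arg": 42,
}
expected_p2 = {
    "Gly": 9, "Ala": 42, "Pro": 15, "Ser": 73, "Thr": 41, "Val": 10, "Cys": 1,
    "Asn": 37, "Asp": 15, "Leu": 29, "Ile": 26, "His": 6, "Gln": 19, "Glu": 20,
    "Phe": 15, "Met": 9, "Lys": 67, "Tyr": 5, "Trp": 1, "Arg": 27,
}


def expected_count(genome_count: int, genome_total: int, group_size: int) -> float:
    """Expected count = genome-wide frequency f multiplied by the group size."""
    return genome_count / genome_total * group_size


def chi_square_statistic(observed: dict, expected: dict) -> float:
    """Pearson statistic: sum over categories of (O - E)^2 / E."""
    return sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)


group_size = sum(observed_p2.values())             # size of the secretory group
ala_expected = expected_count(372, 4153, group_size)  # worked example from the footnote
stat = chi_square_statistic(observed_p2, expected_p2)

print(f"group size: {group_size}")
print(f"expected Ala at P2: {ala_expected:.2f}")
print(f"approximate chi-square (rounded expected values): {stat:.1f}")
```

With 20 amino acid categories the test has 19 degrees of freedom, as stated in the footnote. Lysine alone contributes (170 − 67)² / 67 to the statistic, which is by far the largest single term.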
Figure 2. Graph showing the percentage of f-Met removal in secretory and non-secretory proteins, predicted using the algorithm developed by Frottin et al. (14). The probability that the proportions are equivalent was calculated using a difference of proportions test: P(1 and 2) < 10^−5, P(3) = 0.999. The numbers in brackets correspond to the numbers above the three sections of the graph. The values to the right of the dotted line were obtained with the lysine codon AAA removed from the analysis.
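
The caption does not say which difference of proportions test was used. As a hedged illustration only, the sketch below implements a common choice, the pooled two-sample z-test with a two-sided P-value. The function name and the counts in the usage lines are hypothetical and are not data from this study.

```python
import math


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> tuple:
    """Pooled two-sample z-test for equality of two proportions.

    x1/n1 and x2/n2 are the successes and totals in each group.
    Returns (z, two-sided P-value) under the normal approximation.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion under the null hypothesis
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return z, p_value


# Hypothetical counts for illustration only (not taken from the paper).
z, p = two_proportion_z_test(30, 100, 60, 100)
print(f"z = {z:.2f}, P = {p:.2g}")
```

Equal proportions in the two groups give z = 0 and P = 1, which is the situation the caption describes for comparison 3 (P = 0.999).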
Observed and expected codon usage at P2 for secretory group
| Codon | Amino Acid | Observed | Expected |
|---|---|---|---|
| UCU | Ser | 4 | 13 |
| UCC | Ser | 7 | 7 |
| UCA | Ser | 2 | 9 |
| UCG | Ser | 2 | 5 |
| AGU | Ser | 13 | 20 |
| AGC | Ser | 7 | 19 |
| AAU | Asn | 25 | 23 |
| AAC | Asn | 18 | 14 |
| AAA | Lys | 138 | 53 |
| AAG | Lys | 32 | 13 |
| CGU | Arg | 16 | 8 |
| CGC | Arg | 10 | 6 |
| CGA | Arg | 4 | 4 |
| CGG | Arg | 3 | 2 |
| AGA | Arg | 7 | 4 |
| AGG | Arg | 2 | 1 |
(a) The P-value is 1.79 × 10^−7, calculated using a χ² test with 60 degrees of freedom. Expected values were calculated using the frequency of codon usage of all genes in the E. coli genome.
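
As a consistency check, the codon counts above can be summed by amino acid and compared with the P2 column of the amino acid table. The sketch below uses the rows exactly as printed; the function name is an illustrative assumption. The observed totals match the amino acid table (Ser 35, Asn 43, Lys 170, Arg 42), while the summed expected values differ slightly for Lys and Arg, presumably because each printed value is rounded.

```python
from collections import defaultdict

# (codon, amino acid, observed, expected) rows copied from the table above.
CODON_ROWS = [
    ("UCU", "Ser", 4, 13), ("UCC", "Ser", 7, 7), ("UCA", "Ser", 2, 9),
    ("UCG", "Ser", 2, 5), ("AGU", "Ser", 13, 20), ("AGC", "Ser", 7, 19),
    ("AAU", "Asn", 25, 23), ("AAC", "Asn", 18, 14),
    ("AAA", "Lys", 138, 53), ("AAG", "Lys", 32, 13),
    ("CGU", "Arg", 16, 8), ("CGC", "Arg", 10, 6), ("CGA", "Arg", 4, 4),
    ("CGG", "Arg", 3, 2), ("AGA", "Arg", 7, 4), ("AGG", "Arg", 2, 1),
]


def totals_by_amino_acid(rows):
    """Sum observed and expected codon counts for each amino acid."""
    observed, expected = defaultdict(int), defaultdict(int)
    for _codon, amino_acid, obs, exp in rows:
        observed[amino_acid] += obs
        expected[amino_acid] += exp
    return dict(observed), dict(expected)


observed_totals, expected_totals = totals_by_amino_acid(CODON_ROWS)
aaa_share = 138 / observed_totals["Lys"]  # fraction of P2 lysines encoded by AAA

print(observed_totals)
print(expected_totals)
print(f"AAA share of lysine codons at P2: {aaa_share:.2f}")
```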
Figure 3. (A) Schematic of the pMalE::bla and pPhoA::bla constructs used in this study. The cloning sites EcoRI and PstI are shown. The white box (to scale) represents the fusion of the signal sequence of malE and phoA to bla. For panels (B) and (C), the DNA and amino acid sequences of the first 10 amino acids are shown for all constructs used in this study. Changes to the amino acid code from the respective wt-sequences are shown in bold. To the right of each sequence is a vertical Western blot, showing the respective protein expression and the corresponding MIC values in DH5α. Panel B deals with the requirement for positive charge at P2, whereas panel C deals with the effect of N-terminal methionine removal on export. All MIC values for kanamycin were 8 mg/ml, showing that plasmid copy number was constant for all constructs in this study. The black line indicates the 37.1 kDa molecular weight marker (Invitrogen, Cat. No. 10748-010). Key: ss, signal sequence.