Sebastian Maurer-Stroh, Frank Eisenhaber.
Abstract
We refined the motifs for carboxy-terminal protein prenylation by analysis of known substrates for farnesyltransferase (FT), geranylgeranyltransferase I (GGT1) and geranylgeranyltransferase II (GGT2). In addition to the CaaX box for the first two enzymes, we identify a preceding linker region that appears constrained in physicochemical properties, requiring small or flexible, preferably hydrophilic, amino acids. Predictors were constructed on the basis of sequence and physical property profiles, including interpositional correlations, and are available as the Prenylation Prediction Suite (PrePS, http://mendel.imp.univie.ac.at/sat/PrePS), which also allows evaluation of evolutionary motif conservation. PrePS can predict partially overlapping substrate specificities, which is medically important for understanding the cellular action of FT inhibitors as anticancer and anti-parasite agents.
Year: 2005 PMID: 15960807 PMCID: PMC1175975 DOI: 10.1186/gb-2005-6-6-r55
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Sequence logos [74] and physicochemical property profiles of FT and GGT1 substrates. Selected physical properties (hydrophilicity = KRIW790102; flexibility = KARP850103; size = CHOC760101; aliphatic = ZVEL_ALI_1; see Tables 1 and 2 for details) are calculated as averages over the nonredundant learning sets of FT and GGT1. The plotted lines correspond to the relative deviation of the respective properties from an average calculated over carboxy termini from the UniRef50 database [22].
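The relative deviation plotted in Figure 1 can be illustrated with a minimal sketch. The caption does not give the exact normalisation, so this assumes a per-position background average and division by that average; `TOY_SCALE` holds made-up numbers, not the real KRIW790102 values, and the sequences are invented.

```python
from statistics import mean

# Hypothetical per-residue property scale (NOT real AAindex values).
TOY_SCALE = {"A": 1.0, "C": 0.5, "K": 3.0, "L": 0.2, "S": 2.0}

def position_averages(seqs, scale):
    """Average property value at each aligned position of equal-length sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    return [mean(scale[s[i]] for s in seqs) for i in range(length)]

def relative_deviation(substrates, background, scale):
    """Per position: (substrate average - background average) / background average.

    Assumption: the background is averaged per position, as for the substrates.
    """
    sub = position_averages(substrates, scale)
    bg = position_averages(background, scale)
    return [(s - b) / b for s, b in zip(sub, bg)]

# Invented carboxy-terminal fragments, aligned at the same positions.
substrates = ["KSCL", "SKCA"]
background = ["ALCS", "LACK"]
profile = relative_deviation(substrates, background, TOY_SCALE)
```

Positive entries in `profile` mark positions where the substrate set is enriched in the property relative to the background; negative entries mark depletion.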
Physical property terms in the FT scoring function
| Property | Position | Rationale | Explanation |
| ARGP820103 [62] | +3 | Corr = 0.7(nrLS) | Membrane-buried preference, lipid contact when entering binding pocket |
| logPREN_CKQX_FT [15] | +3 | Corr = -0.72(nrLS) | Kinetic measurement, relative unprocessed FPP amounts with tetrapeptide CKQX |
| CHOC760101 [63] | +1 to +3 | Fisher = 1.3 | Side chain volume |
| ZVEL_CHARG [64] | +1 to +3 | LS composition | General charge penalty |
| ZVEL_CHNEG [64] | +1 to +3 | LS composition | Special negative charge penalty |
| WERD780102 [65] | +1 and +3 | Fisher = 1.51 | Hydrophobicity compensation for inside preference |
| ZVEL_ALI_1 [64] | +1 and +2 | +2: Corr = 0.85(prof) | Amino-acid property: aliphatic |
| LIFS790102 [66] | +1 and +2 | +2: Corr = 0.76(prof) | Preference for extended conformations |
| ZVEL_TINY_ [64] | -1 | Corr = 0.68(prof) | Size, bulkiness |
| MOBILITY_2 [21] | -1 | Corr = 0.61(nrLS) | Side chain mobility |
| VINM940101 [67] | -11 to -1 | -2: Corr = 0.72(prof) | Normalized flexibility average |
| KRIW790102 [68] | -11 to -1 | -2: Corr = 0.76(prof) | Fraction of site occupied with water |
| Buried helix (see Materials and methods) | -20 to -1 | Remove false positives | Helix with strongly hydrophobic sides folds back to protein core and reduces flexibility and accessibility of C-terminus |
Corr, correlation; LS, learning set; nrLS, nonredundant learning set; prof, profile.
Physical property terms in the GGT1 scoring function
| Property | Position | Rationale | Explanation |
| ARGP820103 [62] | +3 | Corr = 0.8(prof) | Membrane-buried preference, lipid contact when entering binding pocket |
| LEVM760105 [69] | +1 to +3 | Fisher = 1.36 | Size limitation (radius of gyration of side-chain) |
| YUTK870101 [70] | +1 to +3 | Fisher = 1.38 | Hydrophobicity compensation (Unfolding Gibbs energy in water, pH7.0) |
| ZVEL_CHARG [64] | +1 to +3 | LS composition | General charge penalty |
| ZVEL_CHNEG [64] | +1 to +3 | LS composition | Special negative charge penalty |
| ZVEL_ALI_1 [64] | +1 and +2 | +2: Corr = 0.87(prof) | Amino-acid property: aliphatic |
| LIFS790102 [66] | +1 and +2 | +2: Corr = 0.77(prof) | Preference for extended conformations |
| FAUJ880101 [71] | -1 and +2 | Fisher = 1.52 | Size, bulkiness (residues although 10 Å apart, face to same side of base pair) |
| FINA910103 [72] | -1 | Corr = 0.75(prof) | Helix termination (for example, K, S favored, D,E,L,I,V disfavored) |
| KARP850103 [73] | -7 to -1 | -1: Corr = 0.69(prof) | Flexibility (GGT1 lysine preference) |
| VINM940101 [67] | -11 to -1 | -4: Corr = 0.72(prof) | Normalized flexibility average |
| KRIW790102 [68] | -11 to -1 | -3: Corr = 0.70(prof) | Fraction of site occupied with water |
| Buried helix (see Materials and methods) | -20 to -1 | Remove false positives | Helix with strongly hydrophobic sides folds back to protein core and reduces flexibility and accessibility of carboxy terminus |
Figure 2. The two CaaX prenyltransferases. (a) Ribbon representations of FT (PDB 1D8D [75]) and GGT1 (PDB 1N4Q [76]); pink, alpha subunit; yellow, beta subunit. (b) The prenylpyrophosphates (green) and CaaX tetrapeptides (blue) inside the binding pockets with enzyme-specific conservation (conservation in FT or GGT1 minus conservation in joined FT+GGT1 alignment) mapped to the binding-pocket surface. Increasing conservation difference is shaded from white to yellow to red. FPP, farnesylpyrophosphate; GGPP, geranylgeranylpyrophosphate. The alignment of the sequences of these proteins is shown in Figure 6. Visualized with Swiss-Pdb Viewer [59].
Figure 3. Correlation between predicted and experimental FT/GGT1 substrate selectivity. The difference between the predicted FT and GGT1 scores is plotted against the difference between the experimentally measured logarithmic affinities for FT and GGT1 of the same substrates.
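The comparison behind Figure 3 can be sketched as follows. This is only an illustration: the caption does not say which correlation measure was used, so a Pearson coefficient is assumed, and the function names and any numbers passed in are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)

def selectivity_correlation(ft_scores, ggt1_scores, log_aff_ft, log_aff_ggt1):
    """Correlate predicted score differences with measured log-affinity differences."""
    predicted = [f - g for f, g in zip(ft_scores, ggt1_scores)]
    measured = [a - b for a, b in zip(log_aff_ft, log_aff_ggt1)]
    return pearson(predicted, measured)

# Invented values for three substrates, for illustration only.
r = selectivity_correlation([1.0, 2.0, 3.0], [0.5, 0.5, 0.5],
                            [2.0, 4.0, 6.0], [1.0, 1.0, 1.0])
```

Taking the difference of the two scores per substrate, rather than correlating each score separately, focuses the comparison on selectivity between the enzymes instead of overall substrate quality.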
Figure 4. Determinants of GGT2 prenylation. (a) Sequence logos [74] of Ras superfamily members around part of the Rab-REP interaction site (colored red in the otherwise yellow GTPase structure). (b) Structural model of the Rab-REP-GGT2 prenylation complex based on PDB entries 1LTX [77] and 1VG0 [4]. REP1 (green) has a prenyl-binding pocket which is proposed to be involved in the dual geranylgeranylation mechanism (bound geranylgeranyl is shown in green). However, the catalytic attachment to the substrate cysteines takes place in the center of the GGT2 alpha-beta complex (light and dark blue) where the prenylpyrophosphate that will be transferred is also bound (blue space-filling representation, zinc in red). The structure was visualized using Swiss-Pdb Viewer [59].
Figure 5. Screenshot of the output provided by the PrePS server [39]. On the left is the prediction result for the query protein H-Ras (GenBank P01112) and the three prenylating enzymes. On the right are shown the carboxy-terminal alignment and PrePS predictions of homologs of the query protein for evaluation of evolutionary motif conservation. Note that H-Ras is predicted to be prenylated only by FT, whereas the homologs K-Ras and N-Ras can also be prenylated by GGT1.
Comparison of prediction performances
| FT | GGT1 | |||||
| Prosite PS00294 | Beese, Casey and colleagues' rules | PrePS FT | Prosite PS00294 | Beese, Casey and colleagues' rules | PrePS GGT1 | |
| Sensitivity I | 85%* | 95%* | ||||
| Sensitivity II | NA | NA | NA | NA | ||
| Probability of false positive prediction (POFP) for -CXXX motifs (GenBank sequences) | 9.9% | 10.0% | ||||
| POFP -CXXX 'cytoplasmic'‡ | 8.9% | 8.6% | ||||
| POFP -CXXX 'nuclear'‡ | 10.5% | 9.6% | ||||
| POFP -CXXX 'membrane'‡ | 10.3% | 12.0% | ||||
| POFP -CXXX 'extracellular'‡ | 7.9% | 8.6%* | ||||
| Overall probability of false positive prediction (GenBank sequences, assuming 1.7% with -CXXX) | 0.16% | 0.17% | ||||
*Prosite pattern PS00294 does not distinguish between prenylation by FT and GGT1.
†Sensitivity rises to 97.9% when the exceptional motif CRPQ of hepatitis delta antigen is removed.
‡For details see Materials and methods.
Sensitivity I is the rate of finding known substrates from the described learning set (self-consistency). Sensitivity II is the rate of finding known substrates after their exclusion (including homologs) from the learning set (cross-validation; see Materials and methods). Probabilities of false-positive predictions (POFP) complement the specificities to 100% (Specificity = 100 - POFP). The first listed POFP estimates the rate of false positives among query proteins that have a canonical -CXXX motif (which corresponds to 1.7% of all sequences). Below are estimates of POFPs for subsets of Swiss-Prot proteins that differ in their annotated subcellular localization (see Materials and methods). The final POFP is the estimate for false-positive predictions for all sequences (for example, when analyzing complete proteomes or large databases), independent of the existence of a -CXXX motif. Formatting signifies: best (bold), intermediate (plain text), worst (italic) performance.
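The arithmetic in these footnotes can be checked with a small sketch. It assumes that the overall POFP is simply the -CXXX-conditional POFP multiplied by the fraction of sequences carrying a -CXXX motif (1.7%); the source does not spell out the exact calculation, and the function names are illustrative.

```python
def specificity(pofp_percent):
    """Specificity in percent, using Specificity = 100 - POFP."""
    return 100.0 - pofp_percent

def overall_pofp(pofp_cxxx_percent, cxxx_fraction_percent=1.7):
    """Overall false-positive rate in percent over all sequences.

    Assumption: only proteins ending in -CXXX can be predicted as
    substrates, so the conditional rate is scaled by their share.
    """
    return pofp_cxxx_percent * cxxx_fraction_percent / 100.0

# POFP values for -CXXX motifs taken from the table above.
for pofp in (9.9, 10.0):
    print(f"POFP {pofp}% -> specificity {specificity(pofp):.1f}%, "
          f"overall POFP {overall_pofp(pofp):.3f}%")
```

Under this assumption both inputs give an overall rate of about 0.17%, close to the 0.16% and 0.17% listed in the table; the small difference is presumably due to rounding of the published inputs.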
Figure 6. Alignment of FT and GGT1 beta subunits (FTb, GGT1b) in the regions of binding-pocket residues (marked with arrow) using ClustalX [57]. Residue ranges shown above and below correspond to the numbering in the PDB structures of rat FT beta (PDB 1D8D) and GGT1 beta (PDB 1N4Q), respectively. Accession numbers are as follows (GenBank unless indicated otherwise): Hs (Homo sapiens) FTb, NP_002019; GGT1b, NP_005014; Mm (Mus musculus) NP_666039; NP_766215; Rn (Rattus norvegicus) PDB 1D8D; 1N4Q; Tn (Tetraodon nigroviridis) CAG09215; CAF904630; Dm (Drosophila melanogaster) NP_650540; NP_525100; Ag (Anopheles gambiae) XP_321357; XP_317045; Ce (Caenorhabditis elegans) NP_506580; NP_496848; At (Arabidopsis thaliana) NP_198844; NP_181487; Sp (Schizosaccharomyces pombe) NP_594251; NP_594142; Sc (Saccharomyces cerevisiae) P22007; NP_011360. Standard ClustalX coloring (according to conserved amino acid type).
Relative weightings of motif positions in profile term
| Position(s) | FT | GGT1 |
| -11 to -1 | 0.07 | 0.07 |
| +1 | 0.15 | 0.14 |
| +2 | 0.29 | 0.41 |
| +3 | 0.17 | 0.42 |
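How such position weights might enter a profile term can be sketched as below. This is a hedged illustration, not the published scoring function: it assumes the term is a weighted sum of per-position profile scores and that the 0.07 weight applies to each linker position from -11 to -1. The per-position scores passed in are hypothetical.

```python
# Weights copied from the table above; "linker" covers positions -11 to -1.
FT_WEIGHTS = {"linker": 0.07, 1: 0.15, 2: 0.29, 3: 0.17}
GGT1_WEIGHTS = {"linker": 0.07, 1: 0.14, 2: 0.41, 3: 0.42}

def position_weight(weights, pos):
    """Look up the weight for a motif position relative to the cysteine."""
    if -11 <= pos <= -1:
        return weights["linker"]
    if pos in (1, 2, 3):
        return weights[pos]
    raise ValueError(f"position {pos} is outside the weighted motif")

def weighted_profile_score(position_scores, weights):
    """Weighted sum of per-position scores (assumed combination rule)."""
    return sum(position_weight(weights, pos) * score
               for pos, score in position_scores.items())

# Hypothetical per-position profile scores for one query sequence.
scores = {-1: 1.0, 1: 1.0, 2: 1.0, 3: 1.0}
ft_score = weighted_profile_score(scores, FT_WEIGHTS)
ggt1_score = weighted_profile_score(scores, GGT1_WEIGHTS)
```

With identical per-position scores, the larger GGT1 weights at +2 and +3 make the last two CaaX positions count more heavily for GGT1 than for FT, which is the asymmetry the table expresses.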