Sigrun Reumann, Daniela Buchwald, Thomas Lingner.
Abstract
Prediction of subcellular protein localization is essential to correctly assign unknown proteins to cell organelle-specific protein networks and to ultimately determine protein function. For metazoa, several computational approaches have been developed in the past decade to predict peroxisomal proteins carrying the peroxisome targeting signal type 1 (PTS1). However, plant-specific PTS1 protein prediction methods have been lacking up to now, and pre-existing methods were generally incapable of correctly predicting low-abundance plant proteins possessing non-canonical PTS1 patterns. Recently, we presented a machine learning approach that predicts PTS1 proteins for higher plants (spermatophytes) with high accuracy and that can correctly identify unknown targeting patterns, i.e., novel PTS1 tripeptides and tripeptide residues. Here we describe PredPlantPTS1, the first plant-specific web server for the prediction of plant PTS1 proteins, which uses the above-mentioned underlying models. The server allows the submission of protein sequences from diverse spermatophytes and also performs well for mosses and algae. The easy-to-use web interface provides detailed output in terms of (i) the peroxisomal targeting probability of the given sequence, (ii) whether a particular non-canonical PTS1 tripeptide has already been experimentally verified, and (iii) the prediction scores for each of the 14 C-terminal amino acid residues. The latter allows identification of residues that are predicted to inhibit peroxisome targeting and that can be optimized using site-directed mutagenesis to raise the peroxisome targeting efficiency. The prediction server will be instrumental in identifying low-abundance and stress-inducible peroxisomal proteins and in defining the entire peroxisomal proteome of Arabidopsis and agronomically important crop plants. PredPlantPTS1 is freely accessible at ppp.gobics.de.
Keywords: Arabidopsis; PTS1; machine learning; orthologs; peroxisome; proteome; subcellular targeting
Year: 2012 PMID: 22969783 PMCID: PMC3427985 DOI: 10.3389/fpls.2012.00194
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Figure 1. Screenshot of the . (B) All four variants differ in size (At1g18700.1, 700 aa; At1g18700.2, 705 aa; At1g18700.3, 695 aa; At1g18700.4, 715 aa). Residues with PWM scores that lie outside a defined interval are highlighted in green or red, indicating a predicted positive or negative effect, respectively, on peroxisome targeting by the PTS1 pathway.
Figure 2. Position-specific prediction score range of the general PWM score matrix of plant PTS1 proteins. From the matrix values of each amino acid residue, the position-specific range of values was determined, and the mean value (−0.069) and the standard deviation were calculated separately for the PTS1 tripeptide (0.112) and the 11 upstream residues (0.057). These values are used to color amino acid residues with extremely high (green) or low (red) PWM prediction scores on the result page of PredPlantPTS1.
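The coloring rule described in the Figure 2 legend can be sketched as a simple threshold test. This is a minimal illustration, not the server's implementation: it assumes that a single mean (−0.069) applies to both regions, that the interval is the mean plus or minus `k` standard deviations with `k = 1`, and that positions are indexed 0 to 13 from the N-terminal end of the 14-residue window. The function name `color_for` and the parameter `k` are hypothetical.

```python
# Values taken from the Figure 2 legend.
MEAN = -0.069          # assumed to apply to both regions
SD_TRIPEPTIDE = 0.112  # last three residues (the PTS1 tripeptide)
SD_UPSTREAM = 0.057    # the 11 residues upstream of the tripeptide


def color_for(score: float, position: int, k: float = 1.0) -> str:
    """Return 'green', 'red', or 'none' for one residue score.

    position is a 0-based index into the C-terminal 14 residues;
    indices 11..13 are the PTS1 tripeptide. The interval width k
    (in standard deviations) is an assumption, not from the source.
    """
    if not 0 <= position < 14:
        raise ValueError("position must be in 0..13")
    sd = SD_TRIPEPTIDE if position >= 11 else SD_UPSTREAM
    if score > MEAN + k * sd:
        return "green"   # predicted positive effect on targeting
    if score < MEAN - k * sd:
        return "red"     # predicted negative effect on targeting
    return "none"


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    for pos, score in [(13, 0.10), (13, 0.00), (0, 0.00), (5, -0.20)]:
        print(pos, score, color_for(score, pos))
```

Because the standard deviation is smaller for the upstream residues, the same score can be highlighted upstream yet remain uncolored inside the tripeptide.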
Overview table of experimentally validated plant PTS1 tripeptides.
| AHL> | FKL> | SFM> | SNL> | SSI> |
| AKI> | GRL> | SGL> | SNM> | SSL> |
| AKL> | IKL> | SHI> | SPL> | SSM> |
| ALL> | KRL> | SKI> | SQL> | STI> |
| ANL> | LKL> | SKL> | SRF> | STL> |
| ARL> | PKI> | SKM> | SRI> | SYM> |
| ARM> | PKL> | SKV> | SRL> | TRL> |
| ASL> | PRL> | SLL> | SRM> | VKL> |
| CKI> | SCL> | SLM> | SRV> | |
| CKL> | SEL> | SML> | SRY> | |
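The tripeptide list above lends itself to a simple lookup. The following sketch extracts the 14 C-terminal residues of a sequence and checks whether its last three residues appear in the table. It assumes the trailing `>` in the table marks the C-terminal end; the function name `c_terminal_features` and the example sequence are hypothetical, and the check says nothing about the server's own scoring.

```python
# Experimentally validated plant PTS1 tripeptides, transcribed from the table.
VALIDATED_TRIPEPTIDES = frozenset("""
AHL AKI AKL ALL ANL ARL ARM ASL CKI CKL
FKL GRL IKL KRL LKL PKI PKL PRL SCL SEL
SFM SGL SHI SKI SKL SKM SKV SLL SLM SML
SNL SNM SPL SQL SRF SRI SRL SRM SRV SRY
SSI SSL SSM STI STL SYM TRL VKL
""".split())


def c_terminal_features(sequence: str, window: int = 14):
    """Return (C-terminal window, tripeptide, is_validated) for a protein.

    A trailing '*' (stop symbol in some FASTA exports) is removed first.
    """
    seq = sequence.strip().upper().rstrip("*")
    if len(seq) < window:
        raise ValueError(f"sequence shorter than {window} residues")
    tail = seq[-window:]
    tripeptide = tail[-3:]
    return tail, tripeptide, tripeptide in VALIDATED_TRIPEPTIDES


if __name__ == "__main__":
    # Hypothetical sequence ending in the canonical tripeptide SKL.
    print(c_terminal_features("M" + "A" * 20 + "SKL"))
```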
Comparative PTS1 protein prediction of experimentally validated .
| AGI code | Acronym | C-terminal 14 aa | Exp. targ. | PredPlantPTS1 score | PredPlantPTS1 prob. | PredPlantPTS1 prediction | PTS1Prowler probability | PTS1Prowler prediction | PTS1 predictor score | PTS1 predictor prediction | PeroxisomeDB |
|---|---|---|---|---|---|---|---|---|---|---|---|
| At1g51745.1/2 | Tudor | | PTD | 0.615 | 0.990 | Peroxisomal | 0.00 | Non-perox. | −4.86 | Twilight zone | 1.000 |
| At3g01980.1/3/4 | SDRc | | PTD/FLP | 0.610 | 0.989 | Peroxisomal | 0.00 | Non-perox. | 2.96 | Peroxisomal | 0.290 |
| At4g16340.1 | SPK1 | | PTD | 0.567 | 0.973 | Peroxisomal | 0.00 | Non-perox. | −27.15 | Non-perox. | 0.220 |
| At1g43770.2 | PHD | | PTD | 0.499 | 0.891 | Peroxisomal | 0.00 | Non-perox. | 0.73 | Peroxisomal | 1.000 |
| At3g44830.1 | LCAT | | PTD | 0.438 | 0.657 | Peroxisomal | 0.00 | Non-perox. | −11.71 | Non-perox. | 1.000 |
| At5g28360.1 | ACS31 | | PTD | 0.426 | 0.582 | Peroxisomal | 0.00 | Non-perox. | −14.79 | Non-perox. | 0.210 |
| At5g20070.1 | NUDT19 | | FLP | 0.385 | 0.328 | Non-perox. | 0.00 | Non-perox. | 4.275 | Peroxisomal | 1.000 |
| At5g04870.1 | CPK1 | | PTD | 0.321 | 0.080 | Non-perox. | 0.00 | Non-perox. | −5.10 | Twilight zone | 0.058 |
| At1g49350.1 | pxPfkB | | FLP | 0.298 | 0.044 | Non-perox. | 0.00 | Non-perox. | −8.143 | Twilight zone | 0.160 |
| At2g01880.1 | PAP7 | | PTD | 0.130 | 0.000 | Non-perox. | 0.43 | Non-perox. | 7.97 | Peroxisomal | 0.270 |
The table shows prediction results of different PTS1 prediction servers (see .
Figure 3. Analysis of predicted PTS1 conservation in putative orthologs of ambiguously predicted plant PTS1 proteins by a combination of phylogenetic and PTS1 prediction analysis. Two ambiguously predicted, putative PTS1 proteins from P. trichocarpa (XP_002313892) (A) and Arabidopsis thaliana (NP_176647) (B) were used as BLAST queries against the non-redundant protein database of GenBank. Putatively orthologous proteins (including in-paralogs) were identified in spermatophyta including eudicotyledons (e.g., Arabidopsis, Ricinus), monocotyledons (Liliopsida, Oryza, Zea), and gymnosperms (Coniferopsida, Picea), in mosses (Lycopodiophyta, Selaginella; Bryophyta, Physcomitrella), and in microalgae (Chlorophyta, e.g., Micromonas, Ostreococcus). The sequences were aligned using ClustalX, and the phylogenetic relationship among the sequences was analyzed by the neighbor-joining method using MEGA 5. For all putative orthologs, the PWM-based PTS1 protein prediction scores and the presence of experimentally validated PTS1 tripeptides were determined (Tables A2 and A3 in Appendix). Positive (+) and negative (−) PWM-based PTS1 protein predictions (e.g., PWM:+) and experimentally validated PTS1 tripeptides (PTS1 trip.:+) are indicated. For At UP (At4g33925) the predictions are given only for the first splice variant.
Strengthening of PTS1 protein prediction for an ambiguously predicted .
| Accession | Species | Annotation | Group | C-term. 14 aa | PWM score | Post. prob. (%) | Pred. | Exp. PTS1 tripeptide validation |
|---|---|---|---|---|---|---|---|---|
| XP | | Predicted protein | Eudicotyledons | | 0.293 | 3.7 | C | Val. |
| At4g33925.1 | | Uncharacterized protein | Eudicotyledons | | 0.789 | 99.9 | P | Val. |
| At4g33925.2 | | Uncharacterized protein | Eudicotyledons | | −1.178 | 0 | C | Not val. |
| XP_002518659 | | Conserved hypothetical protein | Eudicotyledons | | 0.469 | 80.2 | P | Val. |
| XP_002272459 | | Zinc finger SWIM domain-containing protein 7 | Eudicotyledons | | 0.965 | 100.0 | P | Val. |
| XP_003527578 | | Zinc finger SWIM domain-containing protein 7-like | Eudicotyledons | | 0.198 | 0.2 | C | Val. |
| XP_003523843 | | Zinc finger SWIM domain-containing protein 7-like | Eudicotyledons | | 0.584 | 98.1 | P | Val. |
| XP_003598325 | | Zinc finger SWIM domain-containing protein | Eudicotyledons | | 0.675 | 99.7 | P | Val. |
| XP_003605702 | | Zinc finger SWIM domain-containing protein | Eudicotyledons | | −0.603 | 0.0 | C | Not val. |
| EEC82375 | | Hypothetical protein OsI_26711 | Liliopsida | | 0.901 | 100.0 | P | Val. |
| NP_001060165 | | Os07g0593200 (partial) | Liliopsida | | 0.901 | 100.0 | P | Val. |
| NP_001144742 | | Uncharacterized protein LOC100277790 | Liliopsida | | −0.069 | 0.0 | C | Not val. |
| XP_003560014 | | Zinc finger SWIM domain-containing protein 7-like | Liliopsida | | 0.901 | 100.0 | P | Val. |
| BAK05023 | | Predicted protein | Liliopsida | | 0.901 | 100.0 | P | Val. |
| XP_002463119 | | Hypothetical protein SORBIDRAFT_02g038190 | Liliopsida | | 0.810 | 100.0 | P | Val. |
| ABR17386 | | Unknown | Coniferopsida | | −0.607 | 0.0 | C | Not val. |
| XP_001767328 | | Predicted protein | Bryophyta | | −1.054 | 0.0 | C | Not val. |
| XP_002988124 | | Hypothetical protein SELMODRAFT_127426, partial | Lycopodiophyta | | −1.233 | 0.0 | C | Not val. |
| XP_002503713 | | Predicted protein | Chlorophyta | | −0.817 | 0.0 | C | Not val. |
| XP_003059293 | | Predicted protein | Chlorophyta | | −1.131 | 0.0 | C | Not val. |
| XP_001417589 | | Predicted protein | Chlorophyta | | −0.835 | 0.0 | C | Not val. |
| XP_003078967 | | Unnamed protein product | Chlorophyta | | −0.142 | 0.0 | C | Not val. |
| XP_001692127 | | Hypothetical protein (partial) | Chlorophyta | | −1.156 | 0.0 | C | Not val. |
An ambiguously predicted, putative PTS1 protein from .
Falsifying PTS1 protein prediction for the ambiguously predicted .
| Accession | Species | Annotation | Group | C-term. 14 aa | PWM score | Post. prob. (%) | Pred. | Exp. PTS1 tripeptide validation |
|---|---|---|---|---|---|---|---|---|
| At1g64660 | | Methionine gamma-lyase | Eudicotyledons | | 0.455 | 74.2 | P | Not val. |
| XP_002299428 | | Predicted protein | Eudicotyledons | | −0.651 | 0 | C | Not val. |
| XP_002304835 | | Predicted protein | Eudicotyledons | | 0.469 | 80.0 | P | Not val. |
| XP_002336096 | | Predicted protein | Eudicotyledons | | −0.350 | 0 | C | Not val. |
| XP_002518910 | | Cystathionine gamma-synthase, putative | Eudicotyledons | | −1.149 | 0 | C | Not val. |
| XP_002280162 | | Methionine gamma-lyase-like | Eudicotyledons | | −0.724 | 0 | C | Not val. |
| ADN33936 | | Cystathionine gamma-synthase | Eudicotyledons | | −0.669 | 0 | C | Not val. |
| XP_003536171 | | Methionine gamma-lyase-like | Eudicotyledons | | −1.271 | 0 | C | Not val. |
| XP_003520012 | | Methionine gamma-lyase-like | Eudicotyledons | | −0.992 | 0 | C | Not val. |
| XP_003601451 | | Cystathionine gamma-lyase | Eudicotyledons | | −0.716 | 0 | C | Not val. |
| EAY79213 | | Hypothetical protein OsI_34329 | Liliopsida | | −1.181 | 0 | C | Not val. |
| NP_001065069 | | Os10g0517500 | Liliopsida | | −0.132 | 0 | C | Not val. |
| NP_001152224 | | O-succinylhomoserine sulfhydrylase | Liliopsida | | −0.523 | 0 | C | Not val. |
| XP_003574196 | | Cystathionine gamma-lyase-like | Liliopsida | | −0.871 | 0 | C | Not val. |
| BAK03127 | | Predicted protein | Liliopsida | | −1.281 | 0 | C | Not val. |
| XP_002464368 | | Hypothetical protein | Liliopsida | | −0.587 | 0 | C | Not val. |
| ABK27101 | | Unknown | Coniferopsida | | 0.286 | 3.1 | C | Not val. |
| XP_001751901 | | Predicted protein | Bryophyta | | −1.166 | 0 | C | Not val. |
| XP_001759514 | | Predicted protein | Bryophyta | | −0.710 | 0 | C | Not val. |
| XP_001756897 | | Predicted protein | Bryophyta | | −1.135 | 0 | C | Not val. |
| XP_002961730 | | Hypothetical protein | Lycopodiophyta | | −0.523 | 0 | C | Not val. |
| EIE26481 | | Cystathionine gamma-synthase | Chlorophyta | | −0.538 | 0 | C | Not val. |
| XP_002955875 | | Hypothetical protein | Chlorophyta | | −0.819 | 0 | C | Not val. |
| EFN56203 | | Hypothetical protein | Chlorophyta | | −1.166 | 0 | C | Not val. |
An ambiguously predicted, putative PTS1 protein from .
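The contrast between the two ortholog tables can be made concrete by tallying the Pred. column per taxonomic group. The sketch below transcribes that column row by row from each table; it assumes that "P" denotes a positive (peroxisomal) prediction and "C" a negative one, since the legend for these codes is not reproduced here. The names `STRENGTHENING_TABLE`, `FALSIFYING_TABLE`, `positives_per_group`, and `overall` are hypothetical, and the tallies include the query proteins themselves.

```python
from typing import Dict, Tuple

# Pred. column per group, in table row order ("P" assumed positive).
STRENGTHENING_TABLE: Dict[str, str] = {
    "Eudicotyledons": "CPCPPCPPC",
    "Liliopsida": "PPCPPP",
    "Coniferopsida": "C",
    "Bryophyta": "C",
    "Lycopodiophyta": "C",
    "Chlorophyta": "CCCCC",
}
FALSIFYING_TABLE: Dict[str, str] = {
    "Eudicotyledons": "PCPCCCCCCC",
    "Liliopsida": "CCCCCC",
    "Coniferopsida": "C",
    "Bryophyta": "CCC",
    "Lycopodiophyta": "C",
    "Chlorophyta": "CCC",
}


def positives_per_group(table: Dict[str, str]) -> Dict[str, Tuple[int, int]]:
    """Map each group to (number of positive predictions, number of rows)."""
    return {group: (preds.count("P"), len(preds)) for group, preds in table.items()}


def overall(table: Dict[str, str]) -> Tuple[int, int]:
    """Return (total positive predictions, total rows) across all groups."""
    counts = list(positives_per_group(table).values())
    return sum(p for p, _ in counts), sum(n for _, n in counts)


if __name__ == "__main__":
    for name, table in (("strengthening", STRENGTHENING_TABLE),
                        ("falsifying", FALSIFYING_TABLE)):
        print(name, overall(table), positives_per_group(table))
```

In the first table the positive predictions are spread across eudicotyledons and Liliopsida, whereas in the second they are confined to two eudicotyledon entries, which is the pattern the table titles describe as strengthening and falsifying the prediction.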