Castrense Savojardo, Pier Luigi Martelli, Piero Fariselli, Rita Casadio.
Abstract
Motivation: Chloroplasts are organelles found in plants and involved in several important cell processes. Similarly to other compartments in the cell, chloroplasts have an internal structure comprising several sub-compartments, where different proteins are targeted to perform their functions. Given the relation between protein function and localization, the availability of effective computational tools to predict protein sub-organelle localizations is crucial for large-scale functional studies.
Year: 2017 PMID: 28172591 PMCID: PMC5408801 DOI: 10.1093/bioinformatics/btw656
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Distribution of proteins in SCEXP2016 into the six different chloroplastic sub-compartments
| Compartment | Number of proteins |
|---|---|
| Inner membrane | 47 |
| Outer membrane | 24 |
| Stroma | 119 |
| Plastoglobule | 32 |
| Thylakoid lumen | 37 |
| Thylakoid membrane | 131 |
Distribution of annotated targeting and membrane features of SCEXP2016 proteins
| Feature | Number of annotated proteins |
|---|---|
| Chloroplastic targeting | 317 |
| Thylakoid targeting | 60 |
| Single-pass membrane | 34 |
| Multi-pass membrane | 62 |
| Peripheral membrane | 41 |
Fig. 1. Overview of the SChloro system architecture
Single- and multi-label performance with different combinations of input features on the SCEXP2016 dataset by adopting a 10-fold cross-validation procedure
Columns ACCml to mlF1 report multi-label prediction; columns ACCsl(I) to ACCsl report single-label prediction.
| Input features | ACCml | mlACC | mlPRE | mlREC | mlF1 | ACCsl(I) | ACCsl(O) | ACCsl(S) | ACCsl(L) | ACCsl(M) | ACCsl(P) | ACCsl |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Basic | 0.48 | 0.56 | 0.56 | 0.61 | 0.59 | 0.33 | 0.26 | 0.60 | 0.61 | 0.57 | 0.48 | 0.58 |
| Basic+target (predicted) | 0.62 | 0.75 | 0.76 | 0.90 | 0.83 | 0.62 | 0.57 | 0.89 | 0.89 | 0.82 | 0.80 | 0.89 |
| Basic+mem (predicted) | 0.60 | 0.74 | 0.75 | 0.89 | 0.82 | 0.67 | 0.65 | 0.84 | 0.85 | 0.83 | 0.75 | 0.85 |
| Basic+target+mem (predicted) | 0.63 | 0.77 | 0.79 | 0.93 | 0.85 | 0.66 | 0.63 | 0.90 | 0.89 | 0.84 | 0.81 | 0.91 |
| Basic+target+mem (observed) | 0.73 | 0.82 | 0.84 | 0.93 | 0.89 | 0.81 | 0.75 | 0.97 | 0.97 | 0.92 | 0.86 | 0.96 |
Basic = PSSM + Hydrophobicity; target = [p(c),p(t)]; mem = [p(s),p(m),p(r)]. Scoring indexes are defined as in Section 2.3. In single-label scoring indexes, I, O, S, L, M and P stand for inner membrane, outer membrane, stroma, thylakoid lumen, thylakoid membrane and plastoglobule, respectively.
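Section 2.3 is not reproduced in this record, so the exact definitions of the multi-label indexes cannot be confirmed here. The sketch below assumes the widely used example-based definitions: ACCml as exact-match accuracy, mlACC as the mean Jaccard overlap between true and predicted label sets, mlPRE and mlREC as per-protein precision and recall averaged over proteins, and mlF1 as the harmonic mean of mlPRE and mlREC. The tabulated mlF1 values are consistent, within rounding, with that harmonic mean, but the function name `multilabel_scores` and the treatment of empty label sets are illustrative assumptions, not the authors' implementation.

```python
from typing import Dict, Sequence, Set


def multilabel_scores(
    true_sets: Sequence[Set[str]], pred_sets: Sequence[Set[str]]
) -> Dict[str, float]:
    """Example-based multi-label scores (assumed definitions, see lead-in)."""
    if len(true_sets) != len(pred_sets) or len(true_sets) == 0:
        raise ValueError("need two non-empty sequences of equal length")

    n = len(true_sets)
    exact = jaccard = precision = recall = 0.0
    for y, z in zip(true_sets, pred_sets):
        inter = len(y & z)
        union = len(y | z)
        # Exact match: the predicted label set equals the true label set.
        exact += 1.0 if y == z else 0.0
        # Jaccard overlap; two empty sets are counted as a perfect match
        # (an assumption made for this sketch).
        jaccard += inter / union if union else 1.0
        # Per-protein precision and recall; empty sets contribute zero
        # (also an assumption).
        precision += inter / len(z) if z else 0.0
        recall += inter / len(y) if y else 0.0

    ml_pre = precision / n
    ml_rec = recall / n
    denom = ml_pre + ml_rec
    ml_f1 = 2.0 * ml_pre * ml_rec / denom if denom else 0.0
    return {
        "ACCml": exact / n,
        "mlACC": jaccard / n,
        "mlPRE": ml_pre,
        "mlREC": ml_rec,
        "mlF1": ml_f1,
    }
```

Under these definitions mlACC can never be lower than ACCml, because an exact match contributes 1 to both sums while a partial match contributes only to the Jaccard term. Every row of the table above satisfies that ordering.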
10-Fold cross-validation performance of SChloro classifiers for targeting signals and membrane interactions
| Classifier | AUC | MCC |
|---|---|---|
| Chloroplast targeting | 0.85 | 0.76 |
| Thylakoid targeting | 0.94 | 0.70 |
| Single-pass membrane | 0.82 | 0.47 |
| Multi-pass membrane | 0.93 | 0.67 |
| Peripheral membrane | 0.80 | 0.37 |
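For readers comparing the two columns: AUC is threshold-free, whereas the Matthews correlation coefficient (MCC) is computed from a confusion matrix obtained at one decision threshold. The two can therefore rank classifiers differently. The following is a minimal sketch of the standard MCC formula; the function name `mcc`, the convention of returning 0 when the denominator vanishes, and the counts in the usage note are assumptions for illustration and are not taken from the paper.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # When any marginal count is zero the coefficient is undefined;
    # returning 0.0 is a common convention, adopted here as an assumption.
    if denominator == 0:
        return 0.0
    return numerator / denominator
```

As a usage example with made-up counts, `mcc(6, 3, 1, 2)` evaluates 16 / sqrt(1120), roughly 0.48.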
Comparison of single-label performance of different methods on the S60 dataset adopting a jackknife test
| Method | ACCsl | ACCsl(E) | ACCsl(S) | ACCsl(L) | ACCsl(M) |
|---|---|---|---|---|---|
| SChloro | 0.90 | 0.93 | 0.96 | 0.98 | 0.89 |
| MultiP-Schlo | 0.89 | 0.73 | 0.96 | 0.61 | 1.00 |
| SubChlo | 0.67 | 0.40 | 0.67 | 0.43 | 0.84 |
| ChloroRF | 0.67 | 0.48 | 0.57 | 0.39 | 0.88 |
| SubIdent | 0.89 | 0.80 | 0.86 | 0.64 | 0.98 |
| BS-KNN | 0.76 | 0.48 | 0.74 | 0.78 | 0.85 |
Scoring indexes are defined in Section 2.3. Labels E, S, L and M stand for envelope, stroma, thylakoid lumen and thylakoid membrane, respectively.
Data taken from Wang et al. (2015)
Comparison of multi-label performance of MultiP-Schlo and our method on the MSchlo578 dataset
| Method | ACCml | mlACC | mlPRE | mlREC | mlF1 |
|---|---|---|---|---|---|
| SChloro | 0.74 | 0.76 | 0.78 | 0.78 | 0.78 |
| MultiP-Schlo | 0.56 | 0.63 | 0.64 | 0.71 | 0.67 |
Scoring indexes are defined in Section 2.3. The comparison adopts a jackknife test.
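The tables in this record rely on two evaluation protocols, 10-fold cross-validation and a jackknife test. A jackknife test in this literature is commonly understood as leave-one-out cross-validation, that is, k-fold cross-validation with k equal to the number of proteins. The sketch below shows only the index bookkeeping under that reading; the helper names `k_fold_indices` and `jackknife_indices` are hypothetical, and any additional constraints the authors may have applied when building folds are not described in this record and are not modeled.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous folds over n items.

    Items are assumed to be already shuffled by the caller; no shuffling
    is done here so that the output is deterministic.
    """
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    # Spread the remainder over the first n % k folds so sizes differ by at most 1.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


def jackknife_indices(n: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Leave-one-out splits: each item is held out exactly once."""
    return k_fold_indices(n, n)
```

In both protocols every protein is used for testing exactly once, so the per-protein predictions can be pooled before computing the scoring indexes.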