Adele Sadat Haghighat Hoseini¹, Mitra Mirzarezaee¹,².
Abstract
BACKGROUND: Predicting protein localization is among the most important problems in bioinformatics; it is concerned with determining where proteins reside within cells and organelles such as mitochondria. In this study, several machine learning algorithms are applied to predict intracellular protein locations. These algorithms use features extracted from protein sequences. In contrast, protein interactions have been less investigated.
Keywords: Machine learning; Mitochondria; Protein localization; Protein-Protein Interaction (PPI)
Year: 2018; PMID: 31457027; PMCID: PMC6697825; DOI: 10.15171/ijb.1933
Source DB: PubMed; Journal: Iran J Biotechnol; ISSN: 1728-3043; Impact factor: 1.671
Majority voting with different cutting points.
| Conditions | Results (%) |
|---|---|
| PPIs with score greater than or equal to 200 | 83.27 |
| PPIs with score greater than or equal to 500 | 86.86 |
| PPIs with score greater than or equal to 700 | 93.035 |
| PPIs with score greater than or equal to 900 | 89.19 |
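The record does not spell out how the vote is taken, so the following is only a minimal sketch of one plausible reading: a protein is assigned the compartment that is most common among its interaction partners whose score meets the cutting point. The function name `majority_vote`, the `(label, score)` input format, and the example scores are all hypothetical and are not taken from the study.

```python
from collections import Counter


def majority_vote(partners, cutoff):
    """Predict a compartment from interaction partners (illustrative only).

    partners: list of (partner_label, score) tuples; a hypothetical format.
    cutoff:   minimum interaction score a partner needs to cast a vote.
    Returns the most common label among the retained partners,
    or None when no partner reaches the cutoff.
    """
    votes = Counter(label for label, score in partners if score >= cutoff)
    if not votes:
        return None
    # On a tie, Counter returns the label that was seen first.
    return votes.most_common(1)[0][0]


# Made-up partners of one query protein: (known compartment, score).
partners = [("M", 950), ("IM", 720), ("IM", 510), ("OM", 230), ("IM", 205)]

for cutoff in (200, 500, 700, 900):
    print(cutoff, majority_vote(partners, cutoff))
```

Raising the cutting point leaves fewer partners to vote, which is one way the result could change between rows of the table above.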
Best results using support vector machine for different feature sets.
| | PseAAC (λ = 15) with (C = 1, γ = 0.01) (%) | GO (C = 6.5, γ = 0.01) (%) | PSSM (C = 10, γ = 0.01) (%) | FD (C = 100, γ = 0.2) (%) |
|---|---|---|---|---|
| | 76.49 | 71.20 | 66.56 | 66.17 |
| IM | 77.84 | 76.84 | 69.58 | 72.11 |
| IMS | 43 | 50 | 48.81 | 75 |
| M | 77.86 | 90.32 | 76.55 | 85.86 |
| OM | 93 | 73.53 | 59.71 | 60 |
| IM | 68.68 | 73.68 | 67.16 | 84.21 |
| IMS | 0 | 0 | 0 | 50 |
| M | 87.31 | 100 | 69.23 | 84.62 |
| OM | 93.15 | 50 | 40 | 20 |
| IM | 88.21 | 80 | 76 | 60 |
| IMS | 80 | 100 | 97.62 | 100 |
| M | 63.42 | 80.65 | 83.87 | 87.1 |
| OM | 80 | 97.06 | 79.41 | 100 |
Inner membrane proteins (IM), intermembrane space proteins (IMS), matrix proteins (M), outer membrane proteins (OM).
Best results using KNN classifier for different feature sets.
| | PseAAC (λ = 15) with (K = 10) (%) | FD with (K = 1) (%) | Pair-SW with (K = 3) (%) | PSSM with (K = 4) (%) |
|---|---|---|---|---|
| | 69.62 | 62.01 | 60.02 | 58.26 |
| IM | 70.21 | 68.74 | 70.31 | 59.05 |
| IMS | 48.63 | 75 | 40 | 50 |
| M | 70.33 | 81.39 | 80.67 | 63.03 |
| OM | 98 | 60 | 59.23 | 66.18 |
| IM | 68.42 | 89.47 | 70.33 | 52.11 |
| IMS | 0 | 50 | 0 | 0 |
| M | 69.23 | 69.23 | 30 | 70.54 |
| OM | 100 | 20 | 60 | 40 |
| IM | 80 | 48 | 70 | 80.83 |
| IMS | 100 | 100 | 99 | 100 |
| M | 77.42 | 93.55 | 80.33 | 64.52 |
| OM | 100 | 100 | 90.94 | 82.35 |
Inner membrane proteins (IM), intermembrane space proteins (IMS), matrix proteins (M), outer membrane proteins (OM).
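To make the role of K in the table above concrete, here is a minimal k-nearest-neighbour sketch. It assumes Euclidean distance and a plain majority vote among the K closest training points; the record does not state which distance or voting rule was used. The two-dimensional feature vectors are toy values, not PseAAC, FD, Pair-SW, or PSSM features, and `knn_predict` is a hypothetical helper.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Label a query by majority vote of its k nearest training points.

    train: list of (feature_vector, label) pairs.
    query: feature vector with the same length as the training vectors.
    """
    # Rank training points by Euclidean distance to the query, keep k.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy training set: two IM-like points and three M-like points.
train = [
    ([0.10, 0.20], "IM"),
    ([0.15, 0.25], "IM"),
    ([0.80, 0.90], "M"),
    ([0.85, 0.80], "M"),
    ([0.90, 0.95], "M"),
]

print(knn_predict(train, [0.12, 0.22], k=1))
print(knn_predict(train, [0.82, 0.88], k=3))
```

With K = 1 the prediction copies the single closest neighbour; larger K averages over more neighbours, which is why the best K can differ between feature sets.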
Best results using Naive Bayes for different feature sets.
| | PseAAC (λ = 15) (%) | PseAAC (λ = 2) (%) | PSSM (%) | PseAAC (λ = 12) (%) |
|---|---|---|---|---|
| | 70.77 | 58.34 | 58.07 | 56.54 |
| IM | 71.84 | 75.58 | 58.42 | 74.21 |
| IMS | 49.51 | 48.81 | 46.43 | 48.81 |
| M | 73.92 | 81.64 | 72.33 | 76.18 |
| OM | 100 | 52.66 | 62.55 | 5.59 |
| IM | 55 | 63.16 | 36.84 | 68.42 |
| IMS | 0 | 0 | 0 | 0 |
| M | 71.43 | 92.31 | 76.92 | 84.62 |
| OM | 100 | 20 | 40 | 10 |
| IM | 82.61 | 88 | 80 | 80 |
| IMS | 95.24 | 97.62 | 92.86 | 97.62 |
| M | 72.41 | 70.79 | 67.74 | 67.74 |
| OM | 100 | 85.29 | 85.29 | 91.18 |
Inner membrane proteins (IM), intermembrane space proteins (IMS), matrix proteins (M), outer membrane proteins (OM).
Best results using decision tree for different feature sets.
| | GO (%) | PseAAC (λ = 15) (%) | PseAAC (λ = 20) (%) | Mix of all features with PPI (%) |
|---|---|---|---|---|
| | 66.32 | 73.45 | 57.86 | 82.49 |
| IM | 75.58 | 82.11 | 71.86 | 86.11 |
| IMS | 50 | 50 | 50 | 47.62 |
| M | 79.03 | 72.7 | 75.56 | 83.62 |
| OM | 58.53 | 100 | 60.59 | 100 |
| IM | 63.16 | 84.21 | 70.95 | 84.21 |
| IMS | 0 | 0 | 0 | 0 |
| M | 100 | 61.54 | 68.92 | 76.92 |
| OM | 20 | 100 | 20 | 100 |
| IM | 88 | 80 | 70 | 88 |
| IMS | 100 | 100 | 94.32 | 95.24 |
| M | 58.04 | 83.87 | 70.63 | 90.32 |
| OM | 97.06 | 100 | 89.13 | 100 |
Inner membrane proteins (IM), intermembrane space proteins (IMS), matrix proteins (M), outer membrane proteins (OM).
Best results using random forest for different feature sets.
| | GO (%) | PseAAC (λ = 15) (%) | PseAAC (λ = 20) (%) | Mix of all features with PPI (%) |
|---|---|---|---|---|
| | 67.76 | 73.965 | 67.448 | 83.35 |
| IM | 69.47 | 79.78 | 73.28 | 90.74 |
| IMS | 50 | 50 | 50 | 50 |
| M | 76.55 | 75.12 | 68.62 | 91.32 |
| OM | 68.53 | 100 | 93.5 | 100 |
| IM | 78.95 | 90 | 83.5 | 89.47 |
| IMS | 0 | 0 | 0 | 0 |
| M | 69.23 | 57.14 | 50.64 | 90.31 |
| OM | 40 | 100 | 93.5 | 100 |
| IM | 60 | 69.57 | 64.32 | 92 |
| IMS | 100 | 100 | 92.15 | 100 |
| M | 83.87 | 93.1 | 86.62 | 92.33 |
| OM | 97.06 | 100 | 83.43 | 100 |
Inner membrane proteins (IM), intermembrane space proteins (IMS), matrix proteins (M), outer membrane proteins (OM).
Number of available protein interactions based on different cutting points.
| Range | Number of protein interactions |
|---|---|
| Interaction with a score greater than or equal to 200 | 251 |
| Interaction with a score greater than or equal to 500 | 213 |
| Interaction with a score greater than or equal to 700 | 201 |
| Interaction with a score greater than or equal to 900 | 109 |
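Each row of the table above counts the interactions whose score is at or above a cutting point. A minimal sketch of that tally follows; the score list is invented for illustration and `count_at_cutoffs` is a hypothetical helper, not code from the study.

```python
def count_at_cutoffs(scores, cutoffs=(200, 500, 700, 900)):
    """Return {cutoff: number of scores >= cutoff} for each cutoff."""
    return {c: sum(1 for s in scores if s >= c) for c in cutoffs}


# Made-up interaction scores, not data from the study.
scores = [150, 230, 510, 640, 720, 910, 990]
counts = count_at_cutoffs(scores)
print(counts)
```

Because every interaction that passes a higher cutting point also passes the lower ones, the counts can only stay the same or shrink as the cutting point rises, as they do in the table.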