Abstract
Bacteria are abundant in the environment, and Gram-positive bacteria are the most common. Some Gram-positive bacteria are very harmful to humans, so predicting the subcellular location of Gram-positive bacterial proteins is important, in particular for developing effective drugs. In this paper, a new dataset of Gram-positive bacterial proteins with known subcellular locations was established. Five kinds of information were selected as characteristic parameters and then combined: the amino acid composition (AAC), the gene ontology annotation information (GO), the hydropathy dipeptide composition (hpDC), the amino acid dipeptide composition (DC), and the autocovariance average chemical shift (acACS). Protein locations were predicted with the Support Vector Machine (SVM) algorithm, and the overall accuracy (OA) reached 86.1% under the Jackknife test, higher than that of existing methods. This improved method may be helpful for protein function prediction.
Year: 2020 PMID: 32802888 PMCID: PMC7421015 DOI: 10.1155/2020/9701734
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
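Two of the sequence-derived parameters named in the abstract, the amino acid composition and the dipeptide composition, can be illustrated with a minimal sketch. The function names, the toy sequence, and the choice to normalize by the number of counted residues or pairs are assumptions made for illustration; the record above does not give the paper's exact definitions, and the GO, hpDC, and acACS features are not covered here.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Return 20 residue frequencies (hypothetical AAC helper)."""
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    n = len(residues)
    return [residues.count(a) / n if n else 0.0 for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return 400 adjacent-pair frequencies (hypothetical DC helper)."""
    s = seq.upper()
    # Keep only adjacent pairs in which both residues are standard.
    pairs = [s[i:i + 2] for i in range(len(s) - 1)
             if s[i] in AMINO_ACIDS and s[i + 1] in AMINO_ACIDS]
    n = len(pairs)
    return [pairs.count(a + b) / n if n else 0.0
            for a, b in product(AMINO_ACIDS, repeat=2)]


def combined_features(seq):
    """Concatenate AAC and DC into one 420-dimensional vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)


features = combined_features("MKKLLAAG")  # toy sequence, not from the dataset
print(len(features))  # 420
```

Concatenation is the simplest way to combine parameter sets; whether the authors weighted or rescaled the individual blocks before feeding them to the SVM is not stated in this record.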
Dataset of Gram-positive bacteria subcellular location proteins.
| Subcellular location | Number of proteins |
|---|---|
| Cell wall | 22 |
| Extracell | 214 |
| Cytoplasm | 252 |
| Cell membrane | 212 |
| Total | 700 |
Figure 1. Predictive results with respect to the correlation factor λ of the acACS under the Jackknife test. The best results were obtained with λ = 40.
Figure 2. The combination scheme of chemical shifts. The number 1 denotes 1H, 2 denotes 1H, 3 denotes 15N, and 4 denotes 13C.
The predictive results based on the different information parameters in the Jackknife test.
| Features | Metric | Cell wall | Extracell | Cytoplasm | Cell membrane | OA (%) |
|---|---|---|---|---|---|---|
| AAC | Sn (%) | 13.64 | 74.30 | 70.64 | 85.85 | 74.6 |
| | Sp (%) | 99.85 | 84.77 | 88.62 | 89.34 | |
| | MCC | 0.31 | 0.58 | 0.61 | 0.73 | |
| | ACC (%) | 96.71 | 81.57 | 82.14 | 88.29 | |
| DC | Sn (%) | 0.00 | 70.09 | 70.24 | 84.91 | 72.4 |
| | Sp (%) | 99.85 | 84.57 | 86.34 | 88.53 | |
| | MCC | -0.01 | 0.54 | 0.57 | 0.71 | |
| | ACC (%) | 96.71 | 80.14 | 80.57 | 88.53 | |
| GO | Sn (%) | 0.00 | 75.23 | 71.03 | 66.98 | 68.9 |
| | Sp (%) | 99.71 | 86.63 | 82.14 | 85.45 | |
| | MCC | -0.01 | 0.61 | 0.53 | 0.52 | |
| | ACC (%) | 96.57 | 83.14 | 78.14 | 79.86 | |
| acACS | Sn (%) | 0.00 | 66.36 | 70.24 | 73.11 | 67.7 |
| | Sp (%) | 99.95 | 83.13 | 80.36 | 88.53 | |
| | MCC | -0.01 | 0.49 | 0.50 | 0.62 | |
| | ACC (%) | 96.85 | 78.00 | 76.71 | 83.86 | |
| hpDC | Sn (%) | 0.00 | 73.82 | 76.59 | 76.42 | 73.3 |
| | Sp (%) | 99.85 | 86.01 | 82.81 | 91.60 | |
| | MCC | 0.07 | 0.59 | 0.59 | 0.69 | |
| | ACC (%) | 96.71 | 82.3 | 80.57 | 87.00 | |
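The per-location metrics in this table can be related to confusion counts with a short sketch. It uses the standard one-versus-rest definitions of sensitivity (Sn), specificity (Sp), accuracy (ACC), and the Matthews correlation coefficient (MCC); the record does not print the authors' formulas, so these definitions are an assumption. The counts below (3 of 22 cell wall proteins recovered, 1 false positive among the other 678) are inferred from the first two AAC values for cell wall (13.64 and 99.85) and the dataset sizes, not taken from the source.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """One-vs-rest metrics from confusion counts (standard definitions, assumed)."""
    sn = 100.0 * tp / (tp + fn) if tp + fn else 0.0
    sp = 100.0 * tn / (tn + fp) if tn + fp else 0.0
    acc = 100.0 * (tp + tn) / (tp + fn + fp + tn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "ACC": acc, "MCC": mcc}


# Inferred counts for the cell wall class with AAC features (an assumption).
cell_wall = binary_metrics(tp=3, fn=19, fp=1, tn=677)
print({name: round(value, 2) for name, value in cell_wall.items()})
```

With these counts, Sn, Sp, and MCC round to the 13.64, 99.85, and 0.31 reported for cell wall under AAC. The ACC produced by this textbook formula is not guaranteed to match the table, since the source does not state how its ACC was computed.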
The predictive results based on the hybrid information in the Jackknife test.
| Features | Metric | Cell wall | Extracell | Cytoplasm | Cell membrane | OA (%) |
|---|---|---|---|---|---|---|
| AAC+GO | Sn (%) | 9.09 | 87.85 | 76.59 | 86.79 | 81.0 |
| | Sp (%) | 99.56 | 89.10 | 92.86 | 90.78 | |
| | MCC | 0.18 | 0.75 | 0.71 | 0.76 | |
| | ACC (%) | 96.71 | 88.71 | 87.00 | 89.57 | |
| AAC+hpDC | Sn (%) | 40.91 | 83.18 | 83.33 | 89.60 | 83.9 |
| | Sp (%) | 99.26 | 90.74 | 91.96 | 94.50 | |
| | MCC | 0.50 | 0.74 | 0.78 | 0.74 | |
| | ACC (%) | 97.14 | 87.26 | 88.94 | 89.24 | |
| AAC+GO+acACS | Sn (%) | 22.73 | 85.51 | 81.35 | 86.79 | 82.4 |
| | Sp (%) | 99.56 | 90.74 | 91.74 | 92.21 | |
| | MCC | 0.37 | 0.75 | 0.74 | 0.78 | |
| | ACC (%) | 97.14 | 89.14 | 88.00 | 90.57 | |
| AAC+DC+hpDC | Sn (%) | 40.91 | 86.92 | 87.30 | 88.68 | 86.1 |
| | Sp (%) | 99.71 | 91.15 | 92.86 | 93.65 | |
| | MCC | 0.49 | 0.77 | 0.77 | 0.82 | |
| | ACC (%) | 97.57 | 89.96 | 89.57 | 92.14 | |
| AAC+GO+acACS+hpDC | Sn (%) | 22.73 | 83.65 | 86.51 | 85.85 | 83.4 |
| | Sp (%) | 99.56 | 90.71 | 90.63 | 94.67 | |
| | MCC | 0.37 | 0.73 | 0.77 | 0.81 | |
| | ACC (%) | 97.14 | 88.57 | 89.14 | 92.00 | |
| AAC+DC+GO+hpDC | Sn (%) | 36.36 | 83.18 | 82.54 | 91.98 | 84.1 |
| | Sp (%) | 99.41 | 88.86 | 92.86 | 93.24 | |
| | MCC | 0.48 | 0.74 | 0.76 | 0.84 | |
| | ACC (%) | 97.43 | 88.86 | 89.14 | 92.86 | |
| AAC+DC+GO+acACS+hpDC | Sn (%) | 22.27 | 84.11 | 84.13 | 90.09 | 84.1 |
| | Sp (%) | 99.71 | 90.95 | 92.86 | 93.24 | |
| | MCC | 0.44 | 0.74 | 0.78 | 0.82 | |
| | ACC (%) | 97.57 | 88.86 | 89.71 | 92.29 | |
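The overall accuracy can be cross-checked against the per-location values with a small worked example. It assumes that the first value listed for each location is the percentage of that location's proteins predicted correctly, and that OA is the total number of correct predictions divided by the 700 proteins in the dataset. The helper name is hypothetical; the class sizes come from the dataset table and the percentages from the AAC+DC+hpDC block.

```python
CLASS_SIZES = {"Cell wall": 22, "Extracell": 214,
               "Cytoplasm": 252, "Cell membrane": 212}

# Per-location percentages for AAC+DC+hpDC, as listed in the table.
SN_PERCENT = {"Cell wall": 40.91, "Extracell": 86.92,
              "Cytoplasm": 87.30, "Cell membrane": 88.68}


def overall_accuracy(sn_percent, class_sizes):
    """Recover correct counts per class and the pooled accuracy (sketch)."""
    correct = sum(round(sn_percent[name] * size / 100)
                  for name, size in class_sizes.items())
    total = sum(class_sizes.values())
    return correct, total, 100.0 * correct / total


correct, total, oa = overall_accuracy(SN_PERCENT, CLASS_SIZES)
print(correct, total, round(oa, 1))  # 603 700 86.1
```

The same check shows why the cell wall class barely affects OA: with only 22 of 700 proteins, even a low per-class rate changes the pooled figure by at most about three percentage points.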
The results compared with previous methods.
| Method | Validation method | OA (%) |
|---|---|---|
| Shen's first work (a) | Jackknife test | 82.7 |
| Shen's second work (b) | Jackknife test | 82.2 |
| Hu's work (c) | Jackknife test | 85.9 |
| Julia Rahman's work (d) | 8-fold cross-validation | 73.2 |
| This study | Jackknife test | 86.1 |

(a) See ref. [8]. (b) See ref. [9]. (c) See ref. [10]. (d) See ref. [11].
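The comparison mixes two validation schemes, so a brief sketch of how they partition a dataset may help when reading it. The Jackknife test holds out one protein at a time (leave-one-out), while 8-fold cross-validation holds out roughly one eighth of the data per round. The function names are hypothetical, and the sketch only generates index splits; it does not train an SVM, and real use would normally shuffle the indices first.

```python
def jackknife_splits(n):
    """Yield (train, test) index lists with exactly one held-out sample."""
    for i in range(n):
        yield [j for j in range(n) if j != i], [i]


def k_fold_splits(n, k):
    """Yield (train, test) index lists for k contiguous folds."""
    base, extra = divmod(n, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(start)) + list(range(start + size, n))
        yield train, test
        start += size


n_proteins = 700  # size of the dataset in this study
jackknife = list(jackknife_splits(n_proteins))
eight_fold = list(k_fold_splits(n_proteins, 8))
print(len(jackknife), len(eight_fold), [len(test) for _, test in eight_fold])
```

On 700 proteins the Jackknife test therefore requires 700 training runs on 699 samples each, whereas 8-fold cross-validation requires 8 runs on about 612 samples each, so the two OA figures are not measured under identical conditions.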