Zhe Yang, Juan Wang, Zhida Zheng, Xin Bai.
Abstract
Research on cytokine recognition is of great significance in the medical field because cytokines aid the diagnosis and treatment of diseases. However, current methods for cytokine recognition have many shortcomings, such as low sensitivity and low F-score. This paper therefore proposes a new method based on feature combination. The features are extracted from amino acid composition, physicochemical properties, secondary structure, and evolutionary information, and the classifier is a support vector machine (SVM). Experiments show that our method outperforms other methods in terms of accuracy, sensitivity, specificity, F-score, and Matthews correlation coefficient.
Keywords: PSSM; PseAAC; SVM; cytokines; feature combination
Year: 2018 PMID: 30103521 PMCID: PMC6222536 DOI: 10.3390/molecules23082008
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
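The five evaluation metrics named in the abstract can all be computed from the counts of a binary confusion matrix. The sketch below uses the standard textbook definitions, which is an assumption, since this record does not reproduce the paper's own formulas. The function name `binary_metrics` and its argument names are illustrative only.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, specificity, F-score and MCC.

    tp, fp, tn, fn are the confusion-matrix counts for the positive class.
    Standard definitions are assumed; a ratio with a zero denominator is
    reported as 0.0.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    sensitivity = ratio(tp, tp + fn)   # recall on the positive class
    specificity = ratio(tn, tn + fp)   # recall on the negative class
    precision = ratio(tp, tp + fp)
    f_score = ratio(2 * precision * sensitivity, precision + sensitivity)
    mcc = ratio(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to exercise the function.
for name, value in binary_metrics(tp=90, fp=20, tn=80, fn=10).items():
    print(f"{name}: {value:.3f}")
```

MCC uses all four counts, so it stays informative when one class dominates, whereas accuracy alone can look high on imbalanced data.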
Results of each feature extraction method with different kernel functions.
| Feature Vector | Kernel Function | Accuracy | Sensitivity | Specificity | F-score | MCC |
|---|---|---|---|---|---|---|
| … | linear | 80.836% | 85.576% | 76.174% | 81.579% | 61.996% |
| … | linear | 81.259% | 81.677% | 80.836% | 81.210% | 62.532% |
| | Gaussian | 84.882% | 84.192% | 85.560% | 84.665% | 69.766% |
| … | linear | 80.402% | 76.262% | 85.760% | 79.863% | 62.378% |
| | Gaussian | 76.823% | 62.769% | 88.829% | 72.222% | 53.404% |
| … | linear | 82.832% | 75.378% | 88.886% | 80.836% | 64.865% |
| | Gaussian | 77.588% | 63.533% | 93.250% | 74.470% | 59.732% |
| … | linear | 74.242% | 77.585% | 70.954% | 74.939% | 48.636% |
| | Gaussian | 72.950% | 72.616% | 73.296% | 72.716% | 45.912% |
Results of feature combination methods with different kernel functions.
| Feature Vector | Kernel Function | Accuracy | Sensitivity | Specificity | F-score | MCC |
|---|---|---|---|---|---|---|
| … | linear | 84.081% | 84.373% | 83.797% | 84.025% | 68.167% |
| | Gaussian | 86.149% | 85.412% | 86.885% | 85.957% | 72.299% |
| … | linear | 85.700% | 81.505% | 89.836% | 84.970% | 71.621% |
| | Gaussian | 78.526% | 62.766% | 94.050% | 74.346% | 59.913% |
| … | linear | 89.722% | 88.284% | 91.132% | 89.506% | 79.473% |
| | Gaussian | 86.969% | 86.046% | 87.882% | 86.765% | 73.951% |
| … | linear | 89.923% | 88.492% | 91.325% | 89.706% | 79.870% |
| | Gaussian | 85.531% | 84.951% | 86.110% | 85.358% | 71.062% |
Accuracy (%) for different values of C and γ for F with the Gaussian kernel.
| | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| 83.936 | 84.726 | 85.769 | 86.934 | 88.285 | 89.244 | 89.974 | 90.451 | 90.252 | 89.878 | 89.558 |
| 83.604 | 84.316 | 85.148 | 86.451 | 87.754 | 88.912 | 89.558 | 89.606 | 89.491 | 89.371 | 89.286 |
| 76.624 | 79.972 | 82.856 | 85.962 | 86.626 | 87.917 | 88.388 | 88.484 | 88.466 | 88.472 | 88.472 |
| 56.476 | 61.983 | 64.523 | 68.565 | 73.578 | 80.817 | 81.402 | 81.378 | 81.378 | 81.378 | 81.378 |
| 50.407 | 50.407 | 50.407 | 50.570 | 56.995 | 67.696 | 69.259 | 69.259 | 69.259 | 69.259 | 69.259 |
| 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 51.433 | 52.404 | 52.404 | 52.404 | 52.404 | 52.404 |
| 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.419 | 50.413 | 50.413 | 50.413 | 50.413 | 50.413 |
| 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 |
| 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 |
| 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 |
| 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 | 50.407 |
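As a small illustration of how such a parameter grid is read, the sketch below stores the accuracies from the table above as a nested list and locates the best cell. The indices are positional only: the C and γ values that label the rows and columns are not recoverable from this record, so none are assumed. The helper name `best_cell` is illustrative.

```python
# Accuracy (%) values copied from the table above, in table order.
GRID = [
    [83.936, 84.726, 85.769, 86.934, 88.285, 89.244, 89.974, 90.451, 90.252, 89.878, 89.558],
    [83.604, 84.316, 85.148, 86.451, 87.754, 88.912, 89.558, 89.606, 89.491, 89.371, 89.286],
    [76.624, 79.972, 82.856, 85.962, 86.626, 87.917, 88.388, 88.484, 88.466, 88.472, 88.472],
    [56.476, 61.983, 64.523, 68.565, 73.578, 80.817, 81.402, 81.378, 81.378, 81.378, 81.378],
    [50.407, 50.407, 50.407, 50.570, 56.995, 67.696, 69.259, 69.259, 69.259, 69.259, 69.259],
    [50.407] * 5 + [51.433] + [52.404] * 5,
    [50.407] * 5 + [50.419] + [50.413] * 5,
    [50.407] * 11,
    [50.407] * 11,
    [50.407] * 11,
    [50.407] * 11,
]


def best_cell(grid):
    """Return (row_index, column_index, value) of the largest entry."""
    cells = ((r, c, v) for r, row in enumerate(grid) for c, v in enumerate(row))
    return max(cells, key=lambda cell: cell[2])


row, col, acc = best_cell(GRID)
print(f"best accuracy {acc:.3f}% at row {row}, column {col} (0-based)")
# prints: best accuracy 90.451% at row 0, column 7 (0-based)
```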
Results of ten repetitions of 10-fold cross-validation.
| Times | Accuracy | Sensitivity | Specificity | F-score | MCC |
|---|---|---|---|---|---|
| 1 | 90.748% ± 0.567% | 89.181% ± 1.003% | 92.287% ± 0.880% | 90.538% ± 0.617% | 81.533% ± 1.133% |
| 2 | 90.965% ± 0.348% | 89.297% ± 1.090% | 92.632% ± 0.650% | 90.750% ± 0.375% | 81.980% ± 0.657% |
| 3 | 90.851% ± 0.620% | 89.238% ± 0.706% | 92.450% ± 0.836% | 90.644% ± 0.558% | 81.737% ± 1.235% |
| 4 | 90.954% ± 0.695% | 89.343% ± 0.890% | 92.539% ± 0.727% | 90.743% ± 0.772% | 81.942% ± 1.395% |
| 5 | 90.775% ± 0.646% | 89.264% ± 1.327% | 92.267% ± 0.804% | 90.571% ± 0.689% | 81.591% ± 1.269% |
| 6 | 90.819% ± 0.546% | 89.232% ± 0.776% | 92.382% ± 1.097% | 90.612% ± 0.501% | 81.673% ± 1.101% |
| 7 | 90.813% ± 0.682% | 89.120% ± 1.024% | 92.462% ± 0.960% | 90.592% ± 0.725% | 81.660% ± 1.367% |
| 8 | 90.868% ± 0.619% | 89.232% ± 0.964% | 92.473% ± 0.580% | 90.649% ± 0.727% | 81.766% ± 1.238% |
| 9 | 90.895% ± 0.580% | 89.302% ± 0.752% | 92.460% ± 0.863% | 90.692% ± 0.458% | 81.816% ± 1.151% |
| 10 | 90.688% ± 0.532% | 89.134% ± 0.985% | 92.224% ± 0.567% | 90.478% ± 0.572% | 81.407% ± 1.048% |
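The repeated cross-validation protocol behind this table can be sketched as an index generator: each repetition reshuffles the samples and splits them into ten folds, and every fold serves once as the test set. This is a minimal sketch that assumes plain random shuffling without stratification; the record does not say how the authors drew their folds, and the name `repeated_kfold` and its parameters are illustrative.

```python
import random


def repeated_kfold(n_samples, n_folds=10, n_repeats=10, seed=0):
    """Yield (repetition, fold, train_indices, test_indices) tuples.

    Each repetition uses a fresh shuffle of range(n_samples), so the fold
    membership differs between repetitions.
    """
    rng = random.Random(seed)
    for rep in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        # Deal the shuffled indices round-robin into n_folds groups.
        folds = [indices[i::n_folds] for i in range(n_folds)]
        for k, test in enumerate(folds):
            train = [j for m, fold in enumerate(folds) if m != k for j in fold]
            yield rep, k, train, test


# 10 repetitions x 10 folds gives 100 train/test splits.
print(sum(1 for _ in repeated_kfold(n_samples=100)))  # prints: 100
```

Averaging a metric over the ten folds of one repetition gives one row of a table like the one above, and comparing rows shows how much the estimate depends on the random partition.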
Results of feature combinations with a positive-to-negative ratio of 1:9.
| Feature Vector | Kernel Function | Accuracy | Sensitivity | Specificity | F-score | MCC |
|---|---|---|---|---|---|---|
| … | Gaussian | 92.315% | 31.948% | 98.946% | 44.943% | 46.357% |
| … | Gaussian | 92.875% | 37.385% | 98.963% | 50.743% | 51.529% |
| … | linear | 93.520% | 43.640% | 98.826% | 57.745% | 57.433% |
| … | linear | 93.943% | 48.157% | 98.781% | 60.930% | 60.319% |
| … | linear | 94.980% | 62.231% | 98.572% | 70.899% | 69.132% |
| … | linear | 94.966% | 62.039% | 98.574% | 70.782% | 68.989% |
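With a 1:9 class ratio, overall accuracy is dominated by the negative class, which is why the first numeric column stays above 92% even where the second falls below 50%. The sketch below makes the arithmetic explicit. It reads the first three numeric columns as accuracy, sensitivity, and specificity, which is an assumption based on the metric order in the abstract, and the function name `accuracy_from_rates` is illustrative.

```python
def accuracy_from_rates(sensitivity, specificity, positive_fraction):
    """Overall accuracy as the class-weighted mean of the per-class recalls."""
    return (positive_fraction * sensitivity
            + (1.0 - positive_fraction) * specificity)


# Values from the row reporting 62.231% and 98.572%; a 1:9 ratio means
# positives make up one tenth of the samples.
acc = accuracy_from_rates(sensitivity=0.62231, specificity=0.98572,
                          positive_fraction=0.1)
print(f"{acc:.3%}")  # prints: 94.938%
```

The result is close to the 94.980% reported in that row. An exact match is not expected if the tabulated figures are averages over cross-validation folds.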
Figure 1. Comparison between our method and the 473+SVM.
Figure 2. Comparison between our method and the MRDR+LIBD3C.
Figure 3. Comparison between our method and the PCA+BP-NN.
Figure 4. Overview of feature extraction.