Jin-Rui Xu, Jing-Xian Zhang, Bu-Cong Han, Liang Liang, Zhi-Liang Ji.
Abstract
The interactions between cytokines and their complementary receptors are the gateway to understanding a large variety of cytokine-specific cellular activities, such as immunological responses and cell differentiation. To discover novel cytokine-receptor interactions, an advanced support vector machine (SVM) model, CytoSVM, was constructed in this study. The model was iteratively trained in an enriched manner using 449 mammalian (non-rat) cytokine-receptor interactions and about 1 million virtually generated positive and negative vectors. A final independent evaluation on rat data yielded a sensitivity of 97.4%, a specificity of 99.2% and a Matthews correlation coefficient (MCC) of 0.89. This performance exceeds that of conventional SVM-based models. Based on this well-optimized model, a web server was created that accepts a primary protein sequence and reports its probabilities of interacting with one or several cytokines. Moreover, the model was applied to identify putative cytokine-receptor pairs in the whole genomes of human and mouse. Excluding currently known cytokine-receptor interactions, a total of 1609 novel cytokine-receptor pairs were discovered in the human genome with a probability of approximately 80% after further transmembrane analysis. These cover 220 novel receptors (excluding their isoforms) for 126 human cytokines. The screening results have been deposited in a database. Both the server and the database can be freely accessed at http://bioinf.xmu.edu.cn/software/cytosvm/cytosvm.php.
Year: 2007 PMID: 17526528 PMCID: PMC1933174 DOI: 10.1093/nar/gkm254
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
The descriptions of different SVM models
| Model | Enriched training (a) | Virtual positives (b) | Ratio of positive/negative (c) | AUC (d) |
|---|---|---|---|---|
| M1 | Yes | Yes | 1:2.83 | 0.9692 |
| M2 | Yes | No | 1:3.37 | 0.9204 |
| M3 | No | No | 1:2.83 | 0.8856 |
| M4 | No | Yes | 1:3.31 | 0.8897 |
| M5 | No | Yes | 1:1.08 | 0.8353 |
| M6 | No | Yes | 1:6.02 | 0.8834 |
(a) ‘Yes’ means all virtual negative vectors are adopted for model training in an iterative manner (the enriched training). ‘No’ means only a randomly selected portion of the negative vectors is used for model training.
(b) ‘Yes’ means virtual positive vectors are adopted for model training. ‘No’ means no virtual positive vectors are used.
(c) The ratio of positive vectors to negative vectors in the training data sets. The ratio is about 1:3 for Models M1-4, about 1:1 for M5 and about 1:6 for M6.
(d) The area under the receiver operating characteristic (ROC) curve. AUC is often used to measure the performance of models; a higher value indicates better performance.
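As an illustration of the AUC column, the following minimal sketch computes the area under the ROC curve as the probability that a randomly chosen positive vector scores higher than a randomly chosen negative one (ties count as one half). The function name `roc_auc` and the toy labels and scores are assumptions made up for this example; they are not data from the study, and the source does not say how the authors computed AUC.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 1 (positive) or 0 (negative)
    scores: iterable of classifier outputs, higher = more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the paper): 3 positives, 4 negatives.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

The pairwise loop is quadratic in the number of vectors, which is fine for a toy example; for data sets on the scale described here, a rank-based or library implementation would be the practical choice.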
The evaluation of CytoSVM model
| Data set | TP | FN | SE (%) | TN | FP | SP (%) | MCC |
|---|---|---|---|---|---|---|---|
| Testing set | 2343 | 0 | 100 | 4445 | 1 | 99.98 | 0.99 |
| Independent evaluation set | 77 | 2 | 97.4 | 2343 | 17 | 99.2 | 0.89 |
TP: true positives; FN: false negatives; TN: true negatives; FP: false positives; SE: sensitivity, SE = TP/(TP + FN); SP: specificity, SP = TN/(TN + FP); MCC: Matthews correlation coefficient.
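The definitions above can be checked against the reported counts with a short sketch. The SE and SP formulas are those given in the table note; the MCC formula is the standard definition, which the record itself does not spell out, and the function name `confusion_metrics` is an assumption introduced for illustration.

```python
import math


def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, MCC) from confusion-matrix counts."""
    se = tp / (tp + fn)                      # SE = TP / (TP + FN)
    sp = tn / (tn + fp)                      # SP = TN / (TN + FP)
    # Standard Matthews correlation coefficient (assumed; not stated in the source).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return se, sp, mcc


# Counts reported for the independent evaluation set.
se, sp, mcc = confusion_metrics(tp=77, fn=2, tn=2343, fp=17)
print(f"SE = {se:.4f}, SP = {sp:.4f}, MCC = {mcc:.4f}")
```

With the independent evaluation counts, the computed values match the reported 97.4%, 99.2% and 0.89 to within the last reported digit.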
Figure 1. The interface of the CytoSVM server.
Figure 2. The interface of the CytoSVM database.