Ranjan Kumar Barman, Sudipto Saha, Santasabuj Das.
Abstract
BACKGROUND: Viral-host protein-protein interactions play a vital role in pathogenesis, since they define viral infection of the host and regulation of host proteins. Identifying key viral-host protein-protein interactions (PPIs) has important implications for therapeutics.
Year: 2014 PMID: 25375323 PMCID: PMC4223108 DOI: 10.1371/journal.pone.0112034
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
The 4 best features selected by the categorical regression method.
| Feature | Beta | Bootstrap (1000) estimate of std. error | df | F | Sig. (p-value) |
| Average domain-domain association score | 0.511 | 0.016 | 1.000 | 982.607 | 0.000 |
| Virus Methionine | 0.070 | 0.021 | 1.000 | 10.911 | 0.001 |
| Virus Serine | 0.106 | 0.021 | 1.000 | 25.838 | 0.000 |
| Virus Valine | 0.094 | 0.023 | 1.000 | 16.829 | 0.000 |
Comparison of performance between the 4 selected best features and all 44 features.
| Method | All features: Accuracy (%) | All features: Area under ROC curve | All features: F1 score (%) | Selected features: Accuracy (%) | Selected features: Area under ROC curve | Selected features: F1 score (%) |
| Naïve Bayes | 67.48 | 0.66 | 56.72 | 68.50 | 0.71 | 54.35 |
| SVM | | | | | | |
| Random Forest | 71.69 | 0.77 | 67.13 | 72.41 | 0.76 | 66.39 |
SVM-based performance on the testing dataset (5-fold cross-validation) using parameters t = 2 (RBF kernel), g = 1, c = 0.1, and j = 2.
| Threshold | Sensitivity (%) | Specificity (%) | Accuracy (%) | PPV (%) | MCC |
| 0.8 | 37 | 91 | 64 | 76 | 0.32 |
| 0.7 | 46 | 89 | 67 | 78 | 0.39 |
| 0.6 | 52 | 85 | 68 | 76 | 0.40 |
| 0.5 | 59 | 80 | 69 | 75 | 0.42 |
| | | | | | |
| 0.3 | 69 | 70 | 70 | 69 | 0.42 |
| 0.2 | 73 | 65 | 69 | 68 | 0.41 |
| 0.1 | 76 | 59 | 68 | 65 | 0.38 |
| 0 | 80 | 51 | 66 | 62 | 0.35 |
| -0.1 | 81 | 46 | 64 | 60 | 0.32 |
| -0.2 | 83 | 40 | 62 | 58 | 0.28 |
| -0.3 | 85 | 36 | 60 | 57 | 0.26 |
| -0.4 | 87 | 29 | 58 | 55 | 0.23 |
| -0.5 | 89 | 25 | 57 | 54 | 0.20 |
| -0.6 | 89 | 20 | 54 | 53 | 0.15 |
| -0.7 | 91 | 16 | 53 | 52 | 0.12 |
| -0.8 | 91 | 11 | 51 | 51 | 0.08 |
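The measures tabulated above can be computed from a confusion matrix at each score threshold. The following is a minimal sketch, assuming the standard definitions of sensitivity, specificity, accuracy, PPV and MCC, and assuming that a pair is called interacting when its score is greater than or equal to the threshold. The function names, scores and labels are hypothetical and are not taken from the study.

```python
import math


def confusion_counts(scores, labels, threshold):
    """Count TP, FP, TN, FN for one decision threshold.

    Assumption: a score >= threshold is called positive (interacting).
    """
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a denominator is zero."""
    return numerator / denominator if denominator else 0.0


def performance(tp, fp, tn, fn):
    """Standard threshold-dependent measures, returned as fractions."""
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "ppv": safe_div(tp, tp + fp),
        "mcc": safe_div(tp * tn - fp * fn, mcc_denominator),
    }


# Hypothetical SVM scores and true labels (1 = interacting, 0 = non-interacting).
scores = [0.9, 0.6, 0.2, -0.3, 0.7, -0.5]
labels = [1, 1, 1, 0, 0, 0]

for threshold in (0.5, 0.0, -0.5):
    print(threshold, performance(*confusion_counts(scores, labels, threshold)))
```

Lowering the threshold calls more pairs positive, which is why sensitivity rises and specificity falls from the top of the table to the bottom.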
Comparison of performance measures among the Naïve Bayes, SVM and Random Forest methods on the testing dataset using 5-fold cross-validation in our study.
| Methods | Sensitivity (%) | Specificity (%) | Accuracy (%) | PPV (%) | MCC | Area under ROC curve | F1 Score (%) |
| Naïve Bayes | 37.49 | 99.52 | 68.50 | 98.80 | 0.47 | 0.71 | 54.35 |
| SVMlight | | | | | | | |
| Random Forest | 55.66 | 89.08 | 72.41 | 82.26 | 0.48 | 0.76 | 66.39 |
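The F1 scores reported for Naïve Bayes and Random Forest are consistent with the usual definition of F1 as the harmonic mean of PPV (precision) and sensitivity (recall). The short check below is a sketch under that assumption; the function name `f1_score_percent` is hypothetical, and the inputs are the PPV and sensitivity values from the table.

```python
def f1_score_percent(sensitivity_pct, ppv_pct):
    """Harmonic mean of sensitivity (recall) and PPV (precision), in percent."""
    if sensitivity_pct + ppv_pct == 0:
        return 0.0
    return 2 * sensitivity_pct * ppv_pct / (sensitivity_pct + ppv_pct)


# Sensitivity (%) and PPV (%) as listed in the table.
table_rows = {
    "Naive Bayes": (37.49, 98.80),
    "Random Forest": (55.66, 82.26),
}

for method, (sensitivity, ppv) in table_rows.items():
    print(method, round(f1_score_percent(sensitivity, ppv), 2))
```

Because the harmonic mean is dominated by the smaller of its two inputs, the very high PPV of Naïve Bayes does not compensate for its low sensitivity.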
Comparison of proposed method with other viral-host PPIs prediction methods.
| Performance measure | Dyer et al. dataset: Dyer et al. | Dyer et al. dataset: Proposed SVM model | Performance measure | Cui et al. dataset: Shen et al. | Cui et al. dataset: Proposed SVM model | Cui et al. dataset: Cui et al. |
| Sensitivity (%) | 40.00 | 87.05 | Accuracy (%) | 78.00 | 80.00 | 82.00 |
*Partial dataset.
Figure 1. Hierarchical clustering of HBV-human protein pairs with high predicted SVM scores.
Hierarchical clustering analysis was performed using TIBCO Spotfire software with the complete linkage clustering method, cosine correlation distance measure, average value ordering weight, normalization to a scale between 0 and 1, and empty values replaced by 0, for both the row and column dendrograms. High, average and low predicted SVM scores are marked in red, white and blue, respectively.
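The clustering settings named in the caption can be illustrated with a small pure-Python sketch. It is not the Spotfire implementation: it assumes per-row min-max scaling, treats the cosine distance as one minus the cosine similarity, ignores the ordering weight (which only affects how the dendrogram is drawn), and uses a hypothetical score matrix and hypothetical function names.

```python
import math


def fill_missing(row):
    """Replace empty values (None) with 0."""
    return [0.0 if value is None else value for value in row]


def min_max_scale(row):
    """Scale one row to the range 0..1 (assumption: scaling is per row)."""
    low, high = min(row), max(row)
    if high == low:
        return [0.0 for _ in row]
    return [(value - low) / (high - low) for value in row]


def cosine_distance(a, b):
    """One minus the cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def complete_linkage(rows):
    """Agglomerative clustering; returns merges as (left, right, distance).

    With complete linkage, the distance between two clusters is the
    largest pairwise distance between their members.
    """
    clusters = [(i,) for i in range(len(rows))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                distance = max(
                    cosine_distance(rows[p], rows[q])
                    for p in clusters[i]
                    for q in clusters[j]
                )
                if best is None or distance < best[0]:
                    best = (distance, i, j)
        distance, i, j = best
        merges.append((clusters[i], clusters[j], distance))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical matrix: rows are human proteins, columns are viral proteins.
scores = [
    [0.9, 0.8, None],
    [1.0, 0.9, 0.1],
    [0.1, None, 0.9],
]
rows = [min_max_scale(fill_missing(row)) for row in scores]
for left, right, distance in complete_linkage(rows):
    print(left, right, round(distance, 3))
```

Each merge tuple lists the row indices in the two clusters that were joined and the distance at which they were joined, which is the information a dendrogram displays.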
Figure 2. A network of HBX-human protein interactions predicted by our proposed method.
The network was visualized with Cytoscape 3.0.2 [35]. The HBX protein is represented by a cyan node. Human proteins with significant gene ontology enrichment are represented by salmon nodes, whereas other human proteins are represented by slate grey nodes.
Gene Ontology Biological Process enrichment analysis of the interacting human protein partners of HBV proteins using the DAVID server.
| Hepatitis B virus protein | GO term∼Biological Process | Human protein |
| C | GO:0022406∼membrane docking | SCFD1, SCFD2, VPS45, STXBP1, STXBP2, STXBP3 |
| C | GO:0006835∼dicarboxylic acid transport | SLC1A4, SLC1A5, SLC1A2, SLC1A3, SLC1A1 |
| C | GO:0006865∼amino acid transport | SLC1A4, CPT1B, SLC1A5, SLC1A2, CPT2, SLC1A3, XK, SLC1A1 |
| X | GO:0001906∼cell killing | DEFA6, DEFA5, DEFA4, DEFA3, DEFA1 |
| X | GO:0009620∼response to fungus | DEFA6, DEFA5, DEFA4, DEFA3, DEFA1 |
| X | GO:0006952∼defense response | YWHAZ, DEFB4A, CD74, IL17C, IL17D, IL17A, IL17B, DEFA6, AOAH, DEFA5, DEFA4, IL17F, DEFA3, DEFA1 |
| P | GO:0051186∼cofactor metabolic process | NAMPT, ACO2, HMGCR, ACO1, IREB2, GIF, PNP, SOD2, SDHA, GSS, MTHFS, PGLS, PANK2, PANK3, FXN, PANK1, NARFL, CTNS, NAPRT1, FH |
| P | GO:0006732∼coenzyme metabolic process | NAMPT, ACO2, HMGCR, ACO1, PNP, SOD2, SDHA, GSS, MTHFS, PGLS, PANK2, PANK3, PANK1, CTNS, NAPRT1, FH |
Significant biological process annotation terms were filtered by FDR (false discovery rate) <0.05.
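An FDR cutoff of this kind can be illustrated with the Benjamini-Hochberg adjustment. This is only a sketch: the source does not state how the FDR was calculated, and the value reported by the DAVID server may be computed differently. The function name, term labels and p-values below are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical enrichment p-values for four annotation terms.
terms = ["term_a", "term_b", "term_c", "term_d"]
p_values = [0.01, 0.04, 0.03, 0.2]

adjusted = benjamini_hochberg(p_values)
significant = [term for term, q in zip(terms, adjusted) if q < 0.05]
print(significant)
```

Only terms whose adjusted value falls below 0.05 would be kept under this scheme.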