Anup Kumar Halder1, Pritha Dutta1, Mahantapas Kundu1, Subhadip Basu1, Mita Nasipuri1.
Abstract
Identifying potential virus–host interactions is vital for controlling highly infectious viral diseases and may contribute to the development of new drugs to treat viral infections. Recently, database records of clinically and experimentally validated interactions between a small set of human proteins and Ebola virus (EBOV) have been published. Using the known human interaction partners of EBOV, our main objective is to identify a set of human proteins that may interact with EBOV proteins. Here, we first review state-of-the-art computational methods for predicting novel virus–host interactions in infectious diseases, and then present a case study on EBOV–human interactions. The assessment shows that the predicted human host proteins are highly similar to the known human interaction partners of EBOV in terms of structure and semantics, and are involved in similar biochemical activities, pathways and host–pathogen relationships.
Year: 2018 PMID: 29028879 PMCID: PMC7109800 DOI: 10.1093/bfgp/elx026
Source DB: PubMed Journal: Brief Funct Genomics ISSN: 2041-2649 Impact factor: 4.241
Computational approaches for virus–host (human) interaction prediction
| Interactors | Approaches/methods | References |
|---|---|---|
| Ebola–human | Graph-based multitask learning-based approach | [ |
| | Network similarity-based approach | [ |
| Dengue virus–human | Sequence- and structure-based method | [ |
| | Structural motif–domain interaction-based approach | [ |
| | Structural similarity of DENV proteins to human proteins having known interactions | [ |
| Human papillomavirus–human | Relative frequency of amino acid triplets (RFATs), frequency difference of amino acid triplets (FDATs) and AC | [ |
| | Fixed-length feature vector of protein sequence | [ |
| Hepatitis C virus–human | Graph-based multitask learning-based approach | [ |
| | RFATs, FDATs and AC | [ |
| | Sequence, network topology, domain, GO and pathway-based kernel method | [ |
| | Topological and functional properties of interaction network and domain interaction-based method | [ |
| | Fixed-length feature vector of protein sequence | [ |
| | Sequence- and structure-based method | [ |
| | Multi-instance homolog transfer-based approach | [ |
| | Virus–host domain interaction, gene expression, pathway sharing and sequence-based | [ |
| | Obtain host–pathogen interactome using sequence and interacting domain similarity to known PPIs | [ |
| | Homology detection method using template PPI databases | [ |
| | | [ |
| | GO annotation and sequence filtering-based approach | [ |
| | Homology detection method using template PPI databases | [ |
| | Domain–domain interaction probability-based approach | [ |
| Influenza A–human | Graph-based multitask learning-based approach | [ |
| | Structural homology-based approach | [ |
| HIV-1–human | Short linear motifs-based approach | [ |
| | Bi-clustering with association rule mining | [ |
| | Sequence-based classifier ensembling | [ |
| | Differential gene expression between virus and host | [ |
| | Hierarchical bi-clusters and minimal covers of association rule-based approach | [ |
| | Supervised learning and prediction of physical interactions | [ |
| | Homology detection method using template PPI databases | [ |
| | Stringent homology which uses inter-species template PPI | [ |
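
Several of the approaches listed above rely on sequence-derived features, such as relative frequencies of amino acid triplets or a fixed-length feature vector of a protein sequence. As a rough illustration only, the sketch below counts overlapping triplets over the 20 standard residues and lays them out as an 8000-dimensional vector. The function names and the handling of non-standard residues are assumptions made for this example; the cited methods may define RFATs, FDATs and their feature vectors differently (for instance, by grouping residues into classes first).

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def triplet_frequencies(sequence):
    """Relative frequency of each overlapping amino acid triplet.

    Hypothetical helper: triplets containing non-standard symbols
    (e.g. X, B, Z) are skipped, which is an assumption of this sketch.
    """
    seq = sequence.upper()
    triplets = [seq[i:i + 3] for i in range(len(seq) - 2)]
    valid = [t for t in triplets if all(ch in AMINO_ACIDS for ch in t)]
    total = len(valid)
    if total == 0:
        return {}
    return {t: c / total for t, c in Counter(valid).items()}


def feature_vector(sequence):
    """Fixed-length vector (20**3 = 8000 slots), one per possible triplet."""
    freqs = triplet_frequencies(sequence)
    return [freqs.get("".join(p), 0.0) for p in product(AMINO_ACIDS, repeat=3)]


def frequency_difference(seq_a, seq_b):
    """Per-triplet difference in relative frequency between two sequences."""
    return [a - b for a, b in zip(feature_vector(seq_a), feature_vector(seq_b))]


if __name__ == "__main__":
    toy = "MKTAYIAKQR"  # made-up sequence, not a real protein
    vec = feature_vector(toy)
    print(len(vec), sum(1 for v in vec if v > 0))
```

Because every sequence maps to a vector of the same length regardless of how long the protein is, such vectors can be fed directly to standard classifiers.
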
Known human target proteins of EBOV GP
| Functional group | Gene name | Protein name |
|---|---|---|
| C-type lectin domain family | | Liver/lymph node-specific ICAM-3 grabbing non-integrin (L-SIGN) |
| | | Liver and lymph node sinusoidal endothelial cell C-type lectin (LSECTIN) |
| | | Human macrophage galactose- and N-acetyl-galactosamine-specific C-type lectin (hMGL) |
| Dendritic cell-specific intercellular adhesion molecule (ICAM) | | Dendritic cell-specific intercellular adhesion molecule-3-grabbing non-integrin (DC-SIGN) |
| Tyrosine-protein kinase receptor | | Tyrosine-protein kinase receptor UFO |
| | | Tyrosine-protein kinase receptor TYRO3 |
| | | Tyrosine-protein kinase Mer |
| T-cell immunoglobulin and mucin domain | | T-cell immunoglobulin and mucin domain-containing protein 1 |
| | | T-cell immunoglobulin and mucin domain-containing protein 4 |
| Integrin domain | | Integrin beta-1 |
| | | Integrin alpha-5 |
| Lactadherin | | Lactadherin |
| Growth arrest-specific protein | | Growth arrest-specific protein 6 |
GO-based cluster center threshold with k-value
| GO | Cluster center threshold | Width |
|---|---|---|
| MF | 0.18 | 2 |
| CC | 0.55 | 1 |
| BP | 0.31 | 3 |
Figure 1. ROC curves of DT, KNN, SVM and GNB. (A colour version of this figure is available online at: https://academic.oup.com/bfg)
Figure 2. The workflow of cluster analysis based on known human target proteins of EBOV GP using DT, KNN, SVM and GNB. (A colour version of this figure is available online at: https://academic.oup.com/bfg)
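
Figure 1 compares the classifiers by their ROC curves. As a reminder of what such a curve encodes, the minimal sketch below builds the ROC points for one classifier by sweeping a decision threshold over its scores, then integrates them into an AUC with the trapezoidal rule. The labels and scores are made-up toy values, not data from the study, and the function names are hypothetical.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points from sweeping a threshold over the scores.

    labels: 1 for positive (interacting), 0 for negative.
    scores: classifier confidence, higher means more likely positive.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all samples tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


if __name__ == "__main__":
    toy_labels = [1, 0, 1, 0]          # illustrative only
    toy_scores = [0.9, 0.8, 0.7, 0.1]  # illustrative only
    print(auc(roc_points(toy_labels, toy_scores)))
```

Running the same two functions on the scores of each classifier (DT, KNN, SVM, GNB) would give one curve and one AUC per model, which is the kind of comparison the figure presents.
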
Significant common KEGG pathways found in known and new human proteins
| Serial number | KEGG ID | Term | Host: % of proteins | Host: P-value | New host: % of proteins | New host: P-value |
|---|---|---|---|---|---|---|
| 1 | hsa05162 | Measles | 15.38 | 5.7E-2 | 22.52 | 7.8E-4 |
| 2 | hsa04145 | Phagosome | 30.77 | 1.1E-5 | 27.27 | 3.10E-06 |
| 3 | hsa05203 | Viral carcinogenesis | 18.22 | 1.30E-11 | 33.21 | 5.30E-09 |
| 4 | hsa05152 | Tuberculosis | 15.31 | 7.5E-2 | 17.07 | 4.80E-02 |
| 5 | hsa05133 | Pertussis | 25.81 | 3.2E-2 | 14.34 | 5.0E-3 |
| 6 | hsa05205 | Proteoglycans in cancer | 12.23 | 8.40E-03 | 18.79 | 2.90E-02 |
Significant common GO terms (biological process) found in known and new human proteins
| Term | Host: % of proteins | Host: P-value | New host: % of proteins | New host: P-value |
|---|---|---|---|---|
| Antigen processing and presentation | 35.38 | 3.50E-02 | 21.73 | 7.40E-06 |
| Modulation by virus of host morphology or physiology | 15.32 | 4.60E-03 | 24.22 | 5.60E-04 |
| Innate immune response | 30.7 | 2.40E-03 | 17.39 | 2.50E-02 |
| Viral genome replication | 23.08 | 3.50E-05 | 13.62 | 1.80E-02 |
| Integrin-mediated signaling pathway | 14.83 | 6.30E-02 | 28.19 | 4.70E-04 |
| Platelet activation | 16.61 | 5.00E-05 | 18.4 | 6.80E-02 |
Significant common GO terms (cellular component) found in known and new human proteins
| Serial number | GO ID | Term | Host: % of proteins | Host: P-value | New host: % of proteins | New host: P-value |
|---|---|---|---|---|---|---|
| 1 | GO:0005615 | Extracellular space | 31.87 | 5.40E-02 | 19.56 | 5.50E-06 |
| 2 | GO:0009986 | Cell surface | 26.14 | 4.70E-03 | 17.97 | 7.10E-03 |
| 3 | GO:0005886 | Plasma membrane | 53.84 | 3.50E-02 | 28.2 | 7.70E-02 |
| 4 | GO:0001726 | Ruffle | 22.15 | 5.80E-02 | 9.6 | 4.50E-03 |
Significant common GO terms (molecular function) found in known and new human proteins
| Serial number | GO ID | Term | Host: % of proteins | Host: P-value | New host: % of proteins | New host: P-value |
|---|---|---|---|---|---|---|
| 1 | GO:0001618 | Virus receptor activity | 61.54 | 1.20E-14 | 19.43 | 3.20E-06 |
| 2 | GO:0001786 | Phosphatidylserine binding | 33.17 | 2.10E-06 | 17.3 | 4.20E-07 |
| 3 | GO:0004714 | Transmembrane receptor protein tyrosine kinase activity | 23.08 | 3.20E-04 | 9.8 | 1.60E-03 |
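
The enrichment tables above pair each term with a percentage of proteins and a small significance value. The record does not say how those values were computed, so the following is only a sketch of one common choice, a one-sided hypergeometric test, together with the percentage arithmetic. The background size and annotation count in the usage example are hypothetical placeholders; the 8-of-13 example is merely consistent with the 13 known target proteins listed earlier and the 61.54% reported for virus receptor activity.

```python
from math import comb


def percent_of_proteins(hits, sample):
    """Share of a protein set annotated with a term, in percent."""
    return 100.0 * hits / sample


def enrichment_p_value(population, annotated, sample, hits):
    """One-sided hypergeometric tail P(X >= hits).

    population: background proteins considered (assumed value in the demo)
    annotated:  background proteins carrying the term (assumed value)
    sample:     size of the protein set being tested
    hits:       proteins in the set carrying the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(comb(annotated, k) * comb(population - annotated, sample - k)
               for k in range(hits, upper + 1))
    return tail / total


if __name__ == "__main__":
    # 8 of 13 proteins annotated with a term; 20000 and 100 are placeholders.
    print(round(percent_of_proteins(8, 13), 2))
    print(enrichment_p_value(population=20000, annotated=100, sample=13, hits=8))
```

Tools that report enrichment often apply a modified test or a multiple-testing correction on top of this, so the values in the tables should not be expected to match this plain calculation.
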