Jie Li, Shiming Wang, Zhuo Chen, Yadong Wang.
Abstract
Pathogen–host interactions are key to understanding the mechanisms by which a pathogen infects its host. Several approaches for predicting pathogen–host associations have been developed, but their prediction accuracy remains low. In this paper, we propose a bipartite network module-based approach to improve prediction accuracy. First, a bipartite network of pathogens and hosts is constructed. Next, the pathogens and the hosts are each divided into modules. Then, the modular information on pathogens and hosts is incorporated into a bipartite network projection model, and the association scores between pathogens and hosts are calculated. Finally, leave-one-out cross-validation is used to estimate the performance of the proposed method. Experimental results show that the proposed method predicts pathogen–host associations better than other methods, and some potential pathogen–host associations with high prediction scores are confirmed by biological experiments reported in the publicly available literature.
Keywords: BNMP; bipartite network project; host; pathogen; pathogen–host association
Year: 2020 PMID: 32038713 PMCID: PMC6992693 DOI: 10.3389/fgene.2019.01357
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Figure 1. Process of the bipartite network module-based project. (A) Construct the pathogen–host bipartite network and choose a host as the seed vertex. (B) Divide the pathogen set into several modules. (C) Calculate the association score between the seed and the pathogens in each module. (D) Select each host as the seed vertex in turn and repeat steps (A–C) to obtain the pathogen–host association score matrix. (E) Choose a pathogen as the seed vertex. (F) Divide the host set into several modules. (G) Calculate the association score between the seed and the hosts in each module. (H) Select each pathogen as the seed vertex in turn and repeat steps (E–G) to obtain the host–pathogen association score matrix. (I) Integrate the two score matrices into the association score matrix between all pathogens and hosts.
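The projection in panels (A) to (H) can be illustrated with a plain two-step resource-allocation projection on a toy network. This is a minimal sketch and not the paper's BNMP: the module division is omitted because its formula is not given in this record, and the equal-weight average of the two score matrices is an assumption standing in for the integration step (I). The names `project`, `transpose`, and the toy matrix `adj` are hypothetical.

```python
def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def project(adj):
    """Two-step resource-allocation projection on a bipartite graph.

    adj[i][j] is 1 if row vertex i is linked to column vertex j, else 0.
    Returns scores[i][j]: the resource reaching row vertex i when
    column vertex j is used as the seed.
    """
    n_rows, n_cols = len(adj), len(adj[0])
    row_deg = [sum(row) for row in adj]
    col_deg = [sum(adj[i][j] for i in range(n_rows)) for j in range(n_cols)]
    scores = [[0.0] * n_cols for _ in range(n_rows)]
    for seed in range(n_cols):
        # Initial resource: one unit on every row vertex linked to the seed.
        # Step 1: each such row vertex splits its unit equally among its columns.
        col_res = [0.0] * n_cols
        for i in range(n_rows):
            if adj[i][seed]:
                for l in range(n_cols):
                    if adj[i][l]:
                        col_res[l] += 1.0 / row_deg[i]
        # Step 2: each column splits what it received equally among its rows.
        for l in range(n_cols):
            if col_res[l]:
                for i in range(n_rows):
                    if adj[i][l]:
                        scores[i][seed] += col_res[l] / col_deg[l]
    return scores


# Toy network (assumed data): 3 pathogens (rows) x 2 hosts (columns).
adj = [
    [1, 0],
    [1, 1],
    [0, 1],
]

# Hosts as seeds: scores indexed [pathogen][host].
host_seeded = project(adj)
# Pathogens as seeds: project the transposed network, then flip back.
pathogen_seeded = transpose(project(transpose(adj)))

# Assumption: integrate the two matrices with an equal-weight average.
combined = [
    [0.5 * (a + b) for a, b in zip(row_a, row_b)]
    for row_a, row_b in zip(host_seeded, pathogen_seeded)
]

for row in combined:
    print(row)
```

On this toy network the unobserved pair (pathogen 2, host 0) still receives a nonzero score, because pathogen 1 links host 0 to host 1, which pathogen 2 is known to infect.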
The constructed network 1 and network 2.
| Network | Number of pathogens | Number of hosts | Number of associations |
|---|---|---|---|
| Network 1 | 388 | 243 | 997 |
| Network 2 | 167 | 96 | 653 |
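To give a sense of scale, the counts in the table imply how many candidate pathogen–host pairs each network contains and how sparse the known associations are. The sketch below is illustrative arithmetic only; the `networks` dictionary and the `density` helper are names introduced here, not taken from the paper.

```python
# Counts copied from the table: (pathogens, hosts, known associations).
networks = {
    "Network 1": (388, 243, 997),
    "Network 2": (167, 96, 653),
}


def density(n_pathogens, n_hosts, n_associations):
    """Fraction of all possible pathogen-host pairs that are known associations."""
    return n_associations / (n_pathogens * n_hosts)


for name, (p, h, a) in networks.items():
    unlabelled = p * h - a  # pairs with no known association
    print(f"{name}: {unlabelled} unlabelled pairs, density {density(p, h, a):.4f}")
```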
Figure 2. Prediction performance of BNMP with network 1. (A) Influence on AUROC values by different balance parameter values. (B) Influence on AUPR values by different balance parameter values. (C) ROC curves of BNMP with the different balance parameter values. (D) PR curves of BNMP with the different balance parameter values.
Figure 3. Prediction performance of BNMP with network 2. (A) Influence on AUROC values by different balance parameter values. (B) Influence on AUPR values by different balance parameter values.
Figure 4. Comparison of five methods. (A) ROC curves. (B) PR curves.
Figure 5. Paired t-test for the AUROC and AUPR values of pathogens between BNMP and other methods. (A) Box-and-whisker plot of AUROC values with p-values. (B) Box-and-whisker plot of AUPR values with p-values.
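For reference, an AUROC value such as those reported in these figures can be computed directly from prediction scores and known labels: it is the probability that a randomly chosen known association outscores a randomly chosen unlabelled pair, with ties counted as one half. The sketch below uses made-up scores; `auroc` is a name introduced here and is not the authors' evaluation code.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores: predicted association scores.
    labels: 1 for a known association, 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Illustrative (made-up) scores: two known associations, two unlabelled pairs.
example_scores = [0.9, 0.8, 0.3, 0.1]
example_labels = [1, 0, 1, 0]
print(auroc(example_scores, example_labels))
```

The pairwise form is quadratic in the number of pairs, so for networks of the size listed above a rank-based implementation would be the practical choice.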
Pathogen–host pairs predicted using BNMP and their rank according to five methods.
| Pathogen | Host | BNMP | NTSMDA | BNP | Zhang's method | WBSMDA |
|---|---|---|---|---|---|---|
| Serratia marcescens | Mus musculus | 1 | 43 | 15 | 17 | 13 |
| Cronobacter turicensis | Mus musculus | 3 | 10 | 26 | 24 | 109 |
| Escherichia coli O157:H7 | Mus musculus | 4 | 38 | 172 | 14 | 10 |
| Acinetobacter nosocomialis | Homo sapiens | 5 | 13 | 251 | 119 | 18 |
| Stenotrophomonas maltophilia | Mus musculus | 6 | 44 | 124 | 21 | 13082 |
| Sclerotinia sclerotiorum | Nicotiana tabacum | 7 | 61 | 44 | 540 | 169 |
| Pseudomonas aeruginosa | Oryctolagus cuniculus | 8 | 588 | 62 | 960 | 55 |
| Enterococcus faecalis | Homo sapiens | 9 | 37 | 33 | 109 | 19 |
| Alternaria citri | Citrus reticulata | 10 | 528 | 57 | 9021 | 41 |
| Mycobacterium marinum | Homo sapiens | 12 | 39 | 36 | 115 | 26 |
| Mycobacteroides abscessus | Homo sapiens | 14 | 20 | 25 | 102 | 20 |
| Alternaria alternata | Solanum lycopersicum | 15 | 261 | 40 | 447 | 3045 |
| Enterococcus faecium | Homo sapiens | 16 | 40 | 27 | 106 | 121 |
| Fusarium oxysporum | Nicotiana tabacum | 17 | 118 | 43 | 537 | 1313 |
| Pectobacterium carotovorum | Arabidopsis thaliana | 19 | 259 | 74 | 199 | 764 |
| Mycoplasma agalactiae | Mus musculus | 20 | 26 | 201 | 101 | 211 |