İrfan Kösesoy1, Murat Gök1, Tamer Kahveci2.
Abstract
Knowledge of pathogen-host interactions is essential for developing strategies against infectious diseases. In vitro methods take a long time to detect interactions and cover only a small fraction of the possible interaction pairs. Hence, modelling interactions between proteins has necessitated the development of computational methods. The main scope of this paper is integrating the known protein interactions within the host and pathogen organisms to improve the prediction success rate of unknown pathogen-host interactions. Thus, the true positive rate of the predictions was expected to increase. To perform this study extensively, several protein-encoding methods and learning algorithms were tested. Along with human as the host organism, two different pathogen organisms were used in the experiments. For each combination of protein-encoding and prediction method, the original prediction algorithm was tested using only pathogen-host interactions, and the same method was tested again after integrating the known protein interactions within each organism. The effect of merging the pathogen-host interaction networks of different species on the prediction performance of state-of-the-art methods was also observed. Success was measured in terms of Matthews correlation coefficient, precision, recall, F1 score, and accuracy. Empirical results showed that integrating the host and pathogen interactions consistently yields better performance in almost all experiments.
Keywords: bioinformatics; host-pathogen interactions; machine learning; protein networks; protein-protein interactions; Infectious diseases
Year: 2021 PMID: 33907496 PMCID: PMC8068772 DOI: 10.3906/biy-2009-4
Source DB: PubMed Journal: Turk J Biol ISSN: 1300-0152
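The abstract names five evaluation metrics. As a minimal sketch of how they relate, the following Python computes them from confusion-matrix counts using the standard textbook definitions. The function names and the example counts are illustrative assumptions, not taken from the paper; only the precision, recall, and F1 values in the final check come from the AAC/BN row (PHI columns) of the Bacillus anthracis table.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration; the paper's own tooling is not
    described in this record.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient; defined as 0 when a marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1,
            "mcc": mcc, "accuracy": accuracy}


def f1_from(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Made-up counts, purely to exercise the function.
example = classification_metrics(tp=50, fp=10, tn=30, fn=10)

# Consistency check against a reported row: precision 0.453 and recall 0.663
# should give the reported F1 of 0.538.
reported_f1 = round(f1_from(0.453, 0.663), 3)
```

The same harmonic-mean check can be applied to any other row of the result tables as a quick sanity test on transcribed values.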
The combined score thresholds for datasets downloaded from STRING.
| | B. anthracis | Yersinia pestis |
|---|---|---|
| Host-host int. | 0.913 | 0.923 |
| Pathogen-pathogen int. | 0.704 | 0.974 |
Number of PH and HH interactions obtained from the PHISTO and STRING databases.
| | B. anthracis | Yersinia pestis |
|---|---|---|
| # of known PH interactions | 3050 | 4097 |
| # of used negative PH interactions | 9500 | 12950 |
| # of used PH interactions | 1900 | 2590 |
| # of used HH int. | 1500 | 2000 |
| # of used PP int. | 234 | 176 |
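The counts of used positive and negative PH interactions imply a fixed class ratio, which is worth keeping in mind when reading the accuracy columns. The sketch below works through that arithmetic using only the numbers from the table above. The helper name is hypothetical, and whether the evaluation splits preserve this ratio is an assumption, since the record does not say.

```python
# Counts copied from the table of used PH interactions.
USED_PH = {
    "B. anthracis": {"positive": 1900, "negative": 9500},
    "Yersinia pestis": {"positive": 2590, "negative": 12950},
}


def class_balance(positive, negative):
    """Negative-to-positive ratio and the accuracy of always predicting
    the majority class (hypothetical helper, for illustration only)."""
    total = positive + negative
    return {
        "neg_per_pos": negative / positive,
        "majority_baseline_accuracy": max(positive, negative) / total,
    }


balance = {name: class_balance(**counts) for name, counts in USED_PH.items()}

for name, stats in balance.items():
    print(f"{name}: {stats['neg_per_pos']:.1f} negatives per positive, "
          f"majority-class accuracy {stats['majority_baseline_accuracy']:.3f}")
```

Under that assumption, a classifier that labels every pair as non-interacting would already reach about 0.833 accuracy, which is one reason to read MCC and F1 alongside accuracy.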
The evaluation results for the Bacillus anthracis dataset.
| Feat. | Meth. | PHI Prec. | PHI Rec. | PHI F1 | PHI MCC | PHI Acc. | ENM Prec. | ENM Rec. | ENM F1 | ENM MCC | ENM Acc. |
|---|---|---|---|---|---|---|---|---|---|---|---|
| AAC | BN | 0.453 | 0.663 | 0.538 | 0.437 | 0.811 | 0.661 | 0.776 | 0.714 | 0.596 | 0.828 |
| | NB | 0.325 | 0.735 | 0.451 | 0.331 | 0.702 | 0.532 | 0.784 | 0.634 | 0.474 | 0.75 |
| | kNN | 0.396 | 0.639 | 0.489 | 0.373 | 0.777 | 0.593 | 0.818 | 0.688 | 0.556 | 0.794 |
| | K-star | 0.379 | 0.736 | 0.501 | 0.395 | 0.755 | 0.572 | 0.874 | 0.691 | 0.565 | 0.784 |
| | j48 | 0.458 | 0.417 | 0.437 | 0.331 | 0.821 | 0.706 | 0.693 | 0.7 | 0.586 | 0.835 |
| | RF | 0.866 | 0.303 | 0.449 | 0.472 | 0.876 | 0.956 | 0.634 | 0.762 | 0.717 | 0.891 |
| AAP | BN | 0.417 | 0.707 | 0.525 | 0.422 | 0.787 | 0.598 | 0.675 | 0.634 | 0.485 | 0.785 |
| | NB | 0.495 | 0.421 | 0.455 | 0.359 | 0.832 | 0.727 | 0.439 | 0.547 | 0.451 | 0.799 |
| | kNN | 0.652 | 0.466 | 0.543 | 0.479 | 0.869 | 0.839 | 0.692 | 0.758 | 0.683 | 0.878 |
| | K-star | 0.643 | 0.512 | 0.57 | 0.502 | 0.874 | 0.727 | 0.636 | 0.678 | 0.602 | 0.873 |
| | j48 | 0.518 | 0.503 | 0.51 | 0.415 | 0.839 | 0.713 | 0.728 | 0.72 | 0.612 | 0.844 |
| | RF | 0.827 | 0.386 | 0.527 | 0.515 | 0.884 | 0.916 | 0.688 | 0.786 | 0.732 | 0.896 |
| LBE | BN | 0.429 | 0.791 | 0.556 | 0.468 | 0.789 | 0.629 | 0.797 | 0.703 | 0.579 | 0.814 |
| | NB | 0.491 | 0.409 | 0.446 | 0.350 | 0.831 | 0.716 | 0.429 | 0.537 | 0.438 | 0.795 |
| | kNN | 0.638 | 0.513 | 0.569 | 0.498 | 0.87 | 0.807 | 0.737 | 0.77 | 0.689 | 0.878 |
| | K-star | 0.699 | 0.131 | 0.221 | 0.255 | 0.846 | 0.938 | 0.442 | 0.601 | 0.572 | 0.838 |
| | j48 | 0.52 | 0.534 | 0.527 | 0.431 | 0.84 | 0.737 | 0.762 | 0.75 | 0.652 | 0.859 |
| | RF | 0.783 | 0.468 | 0.586 | 0.550 | 0.89 | 0.892 | 0.733 | 0.804 | 0.746 | 0.901 |
The evaluation results for the Yersinia pestis dataset.
| Feat. | Meth. | PHI Prec. | PHI Rec. | PHI F1 | PHI MCC | PHI Acc. | ENM Prec. | ENM Rec. | ENM F1 | ENM MCC | ENM Acc. |
|---|---|---|---|---|---|---|---|---|---|---|---|
| AAC | BN | 0.407 | 0.639 | 0.497 | 0.384 | 0.785 | 0.608 | 0.743 | 0.668 | 0.535 | 0.802 |
| | NB | 0.303 | 0.683 | 0.42 | 0.284 | 0.685 | 0.473 | 0.741 | 0.578 | 0.393 | 0.708 |
| | kNN | 0.389 | 0.525 | 0.447 | 0.322 | 0.783 | 0.589 | 0.766 | 0.666 | 0.530 | 0.793 |
| | K-star | 0.416 | 0.683 | 0.517 | 0.411 | 0.788 | 0.597 | 0.835 | 0.696 | 0.575 | 0.804 |
| | j48 | 0.464 | 0.416 | 0.439 | 0.335 | 0.823 | 0.684 | 0.674 | 0.679 | 0.563 | 0.829 |
| | RF | 0.954 | 0.27 | 0.421 | 0.469 | 0.876 | 0.973 | 0.575 | 0.723 | 0.695 | 0.881 |
| AAP | BN | 0.391 | 0.685 | 0.498 | 0.387 | 0.77 | 0.548 | 0.653 | 0.596 | 0.432 | 0.762 |
| | NB | 0.43 | 0.423 | 0.427 | 0.314 | 0.811 | 0.622 | 0.426 | 0.506 | 0.378 | 0.776 |
| | kNN | 0.596 | 0.366 | 0.454 | 0.389 | 0.853 | 0.811 | 0.635 | 0.713 | 0.632 | 0.862 |
| | K-star | 0.612 | 0.462 | 0.527 | 0.457 | 0.866 | 0.704 | 0.561 | 0.624 | 0.548 | 0.864 |
| | j48 | 0.486 | 0.476 | 0.481 | 0.378 | 0.829 | 0.691 | 0.708 | 0.7 | 0.588 | 0.837 |
| | RF | 0.838 | 0.33 | 0.473 | 0.479 | 0.878 | 0.912 | 0.645 | 0.756 | 0.698 | 0.888 |
| LBE | BN | 0.395 | 0.778 | 0.524 | 0.429 | 0.764 | 0.576 | 0.792 | 0.667 | 0.531 | 0.787 |
| | NB | 0.414 | 0.401 | 0.408 | 0.292 | 0.806 | 0.602 | 0.387 | 0.471 | 0.344 | 0.766 |
| | kNN | 0.598 | 0.453 | 0.516 | 0.440 | 0.858 | 0.776 | 0.707 | 0.74 | 0.651 | 0.866 |
| | K-star | 0.677 | 0.154 | 0.251 | 0.272 | 0.847 | 0.903 | 0.445 | 0.596 | 0.559 | 0.838 |
| | j48 | 0.5 | 0.505 | 0.502 | 0.402 | 0.833 | 0.707 | 0.728 | 0.718 | 0.612 | 0.846 |
| | RF | 0.79 | 0.406 | 0.536 | 0.503 | 0.883 | 0.883 | 0.7 | 0.78 | 0.720 | 0.894 |
The evaluation results for the merged dataset.
| Feature | Method | Prec. | Rec. | F1 | MCC | Acc. |
|---|---|---|---|---|---|---|
| AAC | BN | 0.412 | 0.648 | 0.504 | 0.393 | 0.788 |
| | NB | 0.304 | 0.693 | 0.423 | 0.289 | 0.685 |
| | kNN | 0.4 | 0.598 | 0.479 | 0.360 | 0.783 |
| | K-star | 0.411 | 0.733 | 0.526 | 0.426 | 0.78 |
| | j48 | 0.495 | 0.434 | 0.462 | 0.365 | 0.832 |
| | RF | 0.926 | 0.306 | 0.46 | 0.489 | 0.88 |
| AAP | BN | 0.398 | 0.687 | 0.504 | 0.395 | 0.775 |
| | NB | 0.447 | 0.412 | 0.428 | 0.320 | 0.817 |
| | kNN | 0.633 | 0.412 | 0.499 | 0.437 | 0.862 |
| | K-star | 0.627 | 0.51 | 0.562 | 0.491 | 0.872 |
| | j48 | 0.504 | 0.496 | 0.5 | 0.401 | 0.835 |
| | RF | 0.831 | 0.36 | 0.502 | 0.499 | 0.881 |
| LBE | BN | 0.407 | 0.787 | 0.536 | 0.445 | 0.773 |
| | NB | 0.444 | 0.394 | 0.417 | 0.310 | 0.817 |
| | kNN | 0.617 | 0.474 | 0.536 | 0.463 | 0.863 |
| | K-star | 0.693 | 0.148 | 0.244 | 0.271 | 0.847 |
| | j48 | 0.522 | 0.53 | 0.526 | 0.430 | 0.841 |
| | RF | 0.793 | 0.435 | 0.562 | 0.528 | 0.887 |
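The feature column in the tables above lists protein encodings by abbreviation only. As a rough illustration of what such an encoding can look like, the sketch below assumes that AAC stands for amino acid composition, the common 20-dimensional frequency vector. This record does not expand the abbreviation, so both that reading and the pair encoding by concatenation are assumptions, and the sequences are made up.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the relative frequency of each standard amino acid.

    Assumed interpretation of "AAC"; non-standard residue codes are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def encode_pair(pathogen_seq, host_seq):
    """Encode a protein pair by concatenating the two composition vectors.

    Concatenation is one simple option, assumed here for illustration.
    """
    return amino_acid_composition(pathogen_seq) + amino_acid_composition(host_seq)


# Made-up toy sequences, not real proteins.
toy_vector = amino_acid_composition("AAC")
toy_pair = encode_pair("MKTAYIAK", "MSDNELQ")
```

A fixed-length vector of this kind is what lets classifiers such as those in the method column operate on protein pairs of arbitrary sequence length.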