Hao Li, Chanin Nantasenamat.
Abstract
The continued and general rise of antibiotic resistance in pathogenic microbes is a well-recognized global threat. Host defense peptides (HDPs), a component of the innate immune system, have demonstrated promising potential to become a next-generation antibiotic effective against a plethora of pathogens. While the effectiveness of antimicrobial HDPs has been extensively demonstrated in experimental studies, theoretical insight into the mechanism by which these peptides function is comparatively limited. In particular, experimental studies of antimicrobial peptide (AMP) mechanisms are limited in the number of different peptides investigated and the types of peptide parameters considered. This study makes use of the random forest algorithm for classifying antimicrobial activity as well as for identifying the molecular descriptors underpinning the antimicrobial activity of the investigated peptides. Subsequent manual interpretation of the identified important descriptors revealed that polarity-solubility properties are necessary for the membrane-lytic antimicrobial activity of HDPs.
Keywords: Antibiotic resistance; Antimicrobial resistance; Host defense peptides; Quantitative structure-activity relationship
Year: 2019 PMID: 31875156 PMCID: PMC6927346 DOI: 10.7717/peerj.8265
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Classification performance of the random forest algorithm of the AMP activities against different bacterial strains.
The numerical pIC50 values were binned into two or three levels. CV denotes 10-fold cross-validation and Test denotes test set performance. Recall was near perfect in all cases and is not included in this table. Acc denotes accuracy, the proportion of peptides whose activity level was correctly classified. Kappa stands for Cohen’s kappa coefficient, which measures the agreement between predicted and observed classes beyond what is expected by chance; a value in the range of 0.4–0.6 is empirically considered moderate prediction performance (Landis & Koch, 1977). MCC stands for the Matthews correlation coefficient, for which a value of 1 indicates perfect correlation and a value of 0 indicates no correlation. The supplemental file S3 (https://github.com/chaninlab/antimicrobial-peptide-QSAR/blob/master/S3.xlsx) contains all outputs relating to the prediction performance, including confusion matrices.
| Strain name | Gram | Two levels (CV): Acc | Two levels (CV): Kappa | Two levels (CV): MCC | Two levels (test): Acc | Two levels (test): Kappa | Two levels (test): MCC | Three levels (CV): Acc | Three levels (CV): Kappa | Three levels (CV): MCC | Three levels (test): Acc | Three levels (test): Kappa | Three levels (test): MCC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Pos | 0.85 ± 0.02 | 0.70 ± 0.03 | 0.71 ± 0.03 | 0.58 ± 0.00 | 0.17 ± 0.00 | 0.17 ± 0.00 | 0.79 ± 0.01 | 0.69 ± 0.02 | 0.69 ± 0.02 | 0.78 ± 0.00 | 0.78 ± 0.00 | 0.67 ± 0.00 |
| | Pos | 0.64 ± 0.01 | 0.28 ± 0.03 | 0.29 ± 0.03 | 0.58 ± 0.00 | 0.17 ± 0.00 | 0.17 ± 0.00 | 0.70 ± 0.01 | 0.55 ± 0.02 | 0.55 ± 0.02 | 0.66 ± 0.02 | 0.49 ± 0.03 | 0.49 ± 0.03 |
| | Pos | 0.84 ± 0.01 | 0.69 ± 0.02 | 0.69 ± 0.02 | 0.63 ± 0.00 | 0.25 ± 0.00 | 0.26 ± 0.00 | 0.82 ± 0.01 | 0.73 ± 0.01 | 0.73 ± 0.01 | 0.75 ± 0.01 | 0.62 ± 0.02 | 0.63 ± 0.02 |
| | Pos | 0.72 ± 0.01 | 0.43 ± 0.02 | 0.44 ± 0.02 | 0.70 ± 0.02 | 0.39 ± 0.04 | 0.39 ± 0.04 | 0.81 ± 0.01 | 0.71 ± 0.01 | 0.71 ± 0.01 | 0.80 ± 0.01 | 0.70 ± 0.01 | 0.70 ± 0.01 |
| | Neg | 0.83 ± 0.00 | 0.65 ± 0.00 | 0.66 ± 0.00 | 0.70 ± 0.00 | 0.40 ± 0.00 | 0.40 ± 0.00 | 0.85 ± 0.01 | 0.77 ± 0.02 | 0.77 ± 0.01 | 0.81 ± 0.05 | 0.71 ± 0.07 | 0.72 ± 0.07 |
| | Neg | 0.81 ± 0.00 | 0.61 ± 0.01 | 0.62 ± 0.00 | 0.78 ± 0.01 | 0.55 ± 0.02 | 0.55 ± 0.02 | 0.84 ± 0.01 | 0.76 ± 0.01 | 0.76 ± 0.01 | 0.77 ± 0.01 | 0.66 ± 0.02 | 0.66 ± 0.02 |
| | Neg | 0.80 ± 0.01 | 0.60 ± 0.02 | 0.60 ± 0.02 | 0.74 ± 0.02 | 0.49 ± 0.04 | 0.49 ± 0.04 | 0.84 ± 0.01 | 0.75 ± 0.01 | 0.75 ± 0.01 | 0.76 ± 0.02 | 0.64 ± 0.02 | 0.65 ± 0.02 |
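To make the three reported metrics concrete, the following is a minimal sketch of how accuracy, Cohen's kappa and the Matthews correlation coefficient can be computed from a two-level confusion matrix. The function name `binary_metrics` and the counts in the example are hypothetical and are not taken from the study; the three-level case would need the multiclass forms of kappa and MCC, which are not shown here.

```python
from math import sqrt


def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, Cohen's kappa, MCC) for a 2x2 confusion matrix.

    tp, fp, fn, tn are counts of true positives, false positives,
    false negatives and true negatives (hypothetical inputs).
    """
    n = tp + fp + fn + tn
    acc = (tp + tn) / n

    # Agreement expected by chance, from the row and column marginals.
    p_pos = ((tp + fp) / n) * ((tp + fn) / n)
    p_neg = ((fn + tn) / n) * ((fp + tn) / n)
    p_e = p_pos + p_neg
    kappa = (acc - p_e) / (1 - p_e) if p_e != 1 else 0.0

    # MCC is undefined when any marginal is zero; return 0.0 in that case.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, kappa, mcc


# Hypothetical counts for a balanced two-level problem.
acc, kappa, mcc = binary_metrics(tp=40, fp=10, fn=10, tn=40)
print(f"Acc={acc:.2f} Kappa={kappa:.2f} MCC={mcc:.2f}")
```

On a balanced problem such as this one, kappa and MCC coincide; they diverge as the class proportions become unequal, which is one reason to report both alongside accuracy.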
Figure 1. Pie chart of the pooled important descriptors of all classification instances.
The original descriptors were divided into global, local sequence and structural descriptor classes.
Figure 2. Important descriptors of each target bacterium’s activity level classification, ranked by their importance value (mean decrease of impurity) and classed by their degree of sequence dependence (green-red-yellow bar).
(A) B. subtilis ATCC 6633, (B) E. faecalis ATCC 29212, (C) S. aureus ATCC 6538, (D) S. aureus ATCC 25923, (E) E. coli ATCC 25726, (F) E. coli ATCC 25922 and (G) P. aeruginosa ATCC 27853. Green denotes global descriptors, yellow denotes local sequence-dependent descriptors and red denotes structural descriptors. The blue-black bar shows the importance distribution of descriptors pertaining to polarity-solubility (blue) and descriptors not related to polarity-solubility (black). It is worth noting that the number of descriptors important enough to be retained by the CfsSubsetEval algorithm for activity prediction differs between strains. The bars of this figure representing the classes of the descriptors have been scaled to equal length for ease of comparison. An unscaled version of this figure, together with raw descriptor names and importance values, can be found in supplementary file S4 (https://github.com/chaninlab/antimicrobial-peptide-QSAR/blob/master/S4.xlsx). Q1–Q4 each represent one quarter of the retained and ranked important descriptors, with Q1 containing the highest-ranked and Q4 the lowest-ranked descriptors. The “Total” entry at the bottom denotes the proportion of descriptor classes among all retained descriptors for each strain. Numbers in brackets stand for the proportion of descriptors related to polarity-solubility. For example, a notation of “Total (0.59)” means that 59% of all important descriptors are related to polarity-solubility, while a notation of “G (0.75)” means that 75% of global descriptors are related to polarity-solubility.
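The bracketed proportions described in this caption can be illustrated with a short sketch. It assumes a list of descriptor names already sorted by decreasing importance and a set of names judged to be related to polarity-solubility. The function `quartile_proportions`, the placeholder descriptor names and the way quartile boundaries are rounded when the count is not divisible by four are all assumptions made for illustration, not the authors' procedure.

```python
def quartile_proportions(ranked, flagged):
    """Proportion of flagged descriptors in each importance quartile.

    ranked:  descriptor names sorted from most to least important.
    flagged: set of names treated as polarity-solubility related.
    Returns a dict with keys "Q1".."Q4" and "Total".
    """
    n = len(ranked)
    # Quartile boundaries; the rounding rule here is an assumption.
    bounds = [round(i * n / 4) for i in range(5)]
    result = {}
    for q in range(4):
        chunk = ranked[bounds[q]:bounds[q + 1]]
        hits = sum(name in flagged for name in chunk)
        result[f"Q{q + 1}"] = hits / len(chunk) if chunk else None
    result["Total"] = sum(name in flagged for name in ranked) / n
    return result


# Hypothetical ranking of eight placeholder descriptors.
ranked = ["a", "b", "c", "d", "e", "f", "g", "h"]
flagged = {"a", "b", "c", "f", "h"}
props = quartile_proportions(ranked, flagged)
print(props)
```

In this toy ranking the overall proportion is 0.625, which would be written as "Total (0.63)" in the notation used by the figure.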
Summary of mean descriptor values and standard deviations of all important global and non-dipeptide local sequence descriptors for various strains.
Relative differences in the mean value of important descriptors between high and low activity AMPs can provide information about the activity mechanism. For each bacterial species, four pieces of information are shown (from top to bottom) as follows: (1) mean value and standard deviation of the descriptor in the high activity class, (2) mean value and standard deviation of the descriptor in the low activity class, (3) p-value from Welch’s t-test, (4) whether the null hypothesis (no significant difference in the descriptor’s mean between the high and low activity classes) was rejected. The rejection threshold used was p-value < 0.05.
| Descriptors | | | | | | | |
|---|---|---|---|---|---|---|---|
| Important global descriptors | | | | | | | |
| C | – | – | – | 0.00 | – | 0.06 | – |
| | – | – | – | 0.43 | – | 0.51 | – |
| | – | – | – | 1.40E−02 | – | 1.62E−02 | – |
| | – | – | – | Yes | – | Yes | – |
| Composition.of.Charge.1 | 32.57 | – | – | 29.65 | – | – | – |
| | 20.75 | – | – | 22.98 | – | – | – |
| | 1.80E−04 | – | – | 1.77E−04 | – | – | – |
| | Yes | – | – | Yes | – | – | – |
| Composition.of.Charge.2 | – | – | – | – | 75.73 | – | – |
| | – | – | – | – | 83.74 | – | – |
| | – | – | – | – | 1.74E−05 | – | – |
| | – | – | – | – | Yes | – | – |
| Composition.of.CLogP.2 | – | 27.58 | 14.16 | 29.25 | – | – | – |
| | – | 31.52 | 29.32 | 37.26 | – | – | – |
| | – | 2.67E−01 | 4.00E−04 | 2.79E−05 | – | – | – |
| | – | No | Yes | Yes | – | – | – |
| Composition.of.No.of.hydrogen.bond.donor.in.side.chain.1 | 34.33 | 35.47 | – | – | – | 33.48 | – |
| | 27.83 | 28.50 | – | – | – | 29.16 | – |
| | 4.17E−02 | 1.92E−02 | – | – | – | 7.39E−03 | – |
| | Yes | Yes | – | – | – | Yes | – |
| Composition.of.No.of.hydrogen.bond.donor.in.side.chain.2 | – | – | – | – | 10.49 | 11.75 | – |
| | – | – | – | – | 13.69 | 15.04 | – |
| | – | – | – | – | 1.91E−02 | 6.29E−04 | – |
| | – | – | – | – | Yes | Yes | – |
| Composition.of.No.of.hydrogen.bond.donor.in.side.chain.3 | – | 52.16 | – | – | – | – | – |
| | – | 57.50 | – | – | – | – | – |
| | – | 2.66E−01 | – | – | – | – | – |
| | – | No | – | – | – | – | – |
| Composition.of.Normalized.vdW.volumes.3 | – | – | – | – | – | – | 47.16 |
| | – | – | – | – | – | – | 40.34 |
| | – | – | – | – | – | – | 2.48E−02 |
| | – | – | – | – | – | – | Yes |
| Composition.of.Polarity.2 | – | – | – | 21.31 | – | – | – |
| | – | – | – | 28.70 | – | – | – |
| | – | – | – | 6.92E−05 | – | – | – |
| | – | – | – | Yes | – | – | – |
| Composition.of.Polarizability.1 | – | 13.38 | – | – | – | – | – |
| | – | 19.81 | – | – | – | – | – |
| | – | 5.70E−02 | – | – | – | – | – |
| | – | No | – | – | – | – | – |
| Composition.of.Polarizability.3 | – | – | – | – | – | – | 47.16 |
| | – | – | – | – | – | – | 40.34 |
| | – | – | – | – | – | – | 2.48E−02 |
| | – | – | – | – | – | – | Yes |
| Composition.of.Secondary.structure.2 | 19.90 | – | – | – | – | 23.17 | – |
| | 26.50 | – | – | – | – | 27.00 | – |
| | 1.54E−02 | – | – | – | – | 2.86E−03 | – |
| | Yes | – | – | – | – | Yes | – |
| Composition.of.Solvent.accessibility.1s | – | – | – | – | – | – | 49.63 |
| | – | – | – | – | – | – | 58.35 |
| | – | – | – | – | – | – | 5.00E−06 |
| | – | – | – | – | – | – | Yes |
| Composition.of.Surface.tension.3 | – | – | – | – | – | 42.33 | – |
| | – | – | – | – | – | 46.60 | – |
| | – | – | – | – | – | 8.98E−03 | – |
| | – | – | – | – | – | Yes | – |
| Isoelectric | 10.68 | – | – | – | 10.39 | – | 11.25 |
| | 9.57 | – | – | – | 9.81 | – | 10.34 |
| | 7.98E−05 | – | – | – | 2.80E−04 | – | 3.29E−07 |
| | Yes | – | – | – | Yes | – | Yes |
| M | – | – | – | 0.48 | – | – | – |
| | – | – | – | 1.19 | – | – | – |
| | – | – | – | 6.29E−03 | – | – | – |
| | – | – | – | Yes | – | – | – |
| Mass | – | – | – | – | – | 2,752.86 | 2,960.49 |
| | – | – | – | – | – | 1,855.92 | 1,926.44 |
| | – | – | – | – | – | 1.00E−13 | 1.81E−11 |
| | – | – | – | – | – | Yes | Yes |
| P | – | – | – | – | – | – | 2.78 |
| | – | – | – | – | – | – | 1.04 |
| | – | – | – | – | – | – | 5.39E−04 |
| | – | – | – | – | – | – | Yes |
| S | – | – | – | – | – | 3.46 | – |
| | – | – | – | – | – | 5.79 | – |
| | – | – | – | – | – | 9.80E−04 | – |
| | – | – | – | – | – | Yes | – |
| Important local sequence descriptors | | | | | | | |
| Transition.of.Charge.3 | – | – | – | – | 2.56 | – | – |
| | – | – | – | – | 3.99 | – | – |
| | – | – | – | – | 1.90E−01 | – | – |
| | – | – | – | – | No | – | – |
| Transition.of.No.of.hydrogen.bond.donor.in.side.chain.3 | – | – | – | – | 11.34 | 12.90 | 11.91 |
| | – | – | – | – | 15.93 | 17.28 | 16.61 |
| | – | – | – | – | 2.16E−03 | 7.57E−05 | 2.14E−04 |
| | – | – | – | – | Yes | Yes | Yes |
| Transition.of.Normalized.vdW.volumes.1 | – | – | 8.90 | – | – | – | – |
| | – | – | 19.59 | – | – | – | – |
| | – | – | 4.70E−03 | – | – | – | – |
| | – | – | Yes | – | – | – | – |
| Transition.of.Secondary.structure.1 | – | – | – | – | 30.51 | – | – |
| | – | – | – | – | 22.63 | – | – |
| | – | – | – | – | 2.07E−03 | – | – |
| | – | – | – | – | Yes | – | – |
| Transition.of.Solvent.accessibility.3 | – | – | – | 5.94 | – | – | – |
| | – | – | – | 6.88 | – | – | – |
| | – | – | – | 2.81E−01 | – | – | – |
| | – | – | – | No | – | – | – |
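The comparison of high and low activity classes in the table above relies on Welch's t-test, which does not assume equal variances in the two groups. The sketch below computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom using only the Python standard library. The function name `welch_t` and the sample values are hypothetical and do not come from the study's data. Converting the statistic to a two-sided p-value requires the Student's t distribution, which is not in the standard library; `scipy.stats.ttest_ind(high, low, equal_var=False)` is one way to obtain it.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(high, low):
    """Return (t statistic, degrees of freedom) for Welch's t-test.

    high, low: descriptor values for the high and low activity classes
    (hypothetical inputs; each needs at least two values).
    """
    n1, n2 = len(high), len(low)
    m1, m2 = mean(high), mean(low)
    v1, v2 = variance(high), variance(low)  # sample variances (n - 1)

    # Squared standard error of the difference in means.
    se2 = v1 / n1 + v2 / n2
    t = (m1 - m2) / sqrt(se2)

    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df


# Hypothetical descriptor values for two activity classes.
high = [1.0, 2.0, 3.0, 4.0, 5.0]
low = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(high, low)
print(f"t={t_stat:.3f}, df={dof:.2f}")
```

The null hypothesis of equal means would then be rejected when the two-sided p-value at these degrees of freedom falls below the 0.05 threshold stated in the table caption.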