Radu E Sestraş, Lorentz Jäntschi, Sorana D Bolboacă.
Abstract
A contingency of observed antimicrobial activities measured for several compounds vs. a series of bacteria was analyzed. A factor analysis revealed the existence of a certain probability distribution function of the antimicrobial activity. A quantitative structure-activity relationship analysis for the overall antimicrobial ability was conducted using the population statistics associated with the identified probability distribution function. The antimicrobial activity proved to follow the Poisson distribution if just one factor varies (such as chemical compound or bacteria). The Poisson parameter estimating antimicrobial effect, giving both mean and variance of the antimicrobial activity, was used to develop structure-activity models describing the effect of compounds on bacteria and fungi species. Two approaches were employed to obtain the models, and for every approach, a model was selected, further investigated and found to be statistically significant. The best predictive model for antimicrobial effect on bacteria and fungi species was identified using graphical representation of observed vs. calculated values as well as several predictive power parameters.
Keywords: antimicrobial effect; bacteria and fungi species; multiple linear regression (MLR); oils compounds; probability distribution function; quantitative structure-activity relationship (QSAR)
Year: 2012 PMID: 22606039 PMCID: PMC3344275 DOI: 10.3390/ijms13045207
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 6.208
Figure 1. Results of probability distribution functions analysis. X: Compounds (1–21; 1 = Citral, 2 = Geraniol, 3 = Geranyl formate, 4 = Geranyl acetate, 5 = Geranyl butyrate, 6 = Geranyl tiglate, 7 = Neral, 8 = Nerol, 9 = Nerol acetate, 10 = Neryl butyrate, 11 = Neryl propanoate, 12 = Citronellal, 13 = Citronellyl formate, 14 = Citronellyl acetate, 15 = Citronellyl butyrate, 16 = Citronellyl isobutyrate, 17 = Citronellyl propionate, 18 = Hydroxycitronellal, 19 = Rose oxide, 20 = Eugenol, 21 = Sulfametrole, 32 = Citronellol), Oils (22–29; 22 = Citronella, 23 = Geranium Africa, 24 = Geranium Bourbon, 25 = Geranium China, 26 = Helichrysum, 27 = Palmarosa, 28 = Rose, 29 = Verbena), Mixtures (30–31; 30 = Tetracycline hydrochloride, 31 = Ciproxin); Y: Binomial (◆), NegBino (■), Poisson (▴); “Is Y the distribution of any X on bacteria and fungi species?”.
Statistical parameters and population properties.
| Compound, oil or mixture (CID) | λ | Mode | Mean | Var | StDev | Skew | EKurt | Median |
|---|---|---|---|---|---|---|---|---|
| Citral (638011) | 14.125 | 14 | 14.125 | 14.125 | 3.758 | 0.266 | 0.071 | 13.457 |
| Geraniol (637566) | 13.750 | 13 | 13.750 | 13.750 | 3.708 | 0.270 | 0.073 | 13.082 |
| Geranyl formate (5282109) | 8.875 | 8 | 8.875 | 8.875 | 2.979 | 0.336 | 0.113 | 8.207 |
| Geranyl acetate (1549026) | 8.200 | 8 | 8.200 | 8.200 | 2.864 | 0.349 | 0.122 | 7.531 |
| Geranyl butyrate (5355856) | 8.714 | 8 | 8.714 | 8.714 | 2.952 | 0.339 | 0.115 | 8.046 |
| Geranyl tiglate (5367785) | 11.625 | 11 | 11.625 | 11.625 | 3.410 | 0.293 | 0.086 | 10.957 |
| Neral (643779) | 13.500 | 13 | 13.500 | 13.500 | 3.674 | 0.272 | 0.074 | 12.932 |
| Nerol (643820) | 11.250 | 11 | 11.250 | 11.250 | 3.354 | 0.298 | 0.089 | 10.582 |
| Nerol acetate (1549025) | 7.333 | 7 | 7.333 | 7.333 | 2.708 | 0.369 | 0.136 | 6.664 |
| Neryl butyrate (5352162) | 10.714 | 10 | 10.714 | 10.714 | 3.273 | 0.306 | 0.093 | 10.046 |
| Neryl propanoate (5365982) | 10.714 | 10 | 10.714 | 10.714 | 3.273 | 0.306 | 0.093 | 10.046 |
| Citronellal (7794) | 14.600 | 14 | 14.600 | 14.600 | 3.821 | 0.262 | 0.068 | 13.932 |
| Citronellyl formate (7778) | 12.143 | 12 | 12.143 | 12.143 | 3.485 | 0.287 | 0.082 | 11.475 |
| Citronellyl acetate (9017) | 7.286 | 7 | 7.286 | 7.286 | 2.699 | 0.370 | 0.137 | 6.617 |
| Citronellyl butyrate (8835) | 8.167 | 8 | 8.167 | 8.167 | 2.858 | 0.350 | 0.122 | 7.498 |
| Citronellyl isobutyrate (60985) | 8.200 | 8 | 8.200 | 8.200 | 2.864 | 0.349 | 0.122 | 7.531 |
| Citronellyl propionate (8834) | 14.333 | 14 | 14.333 | 14.333 | 3.786 | 0.264 | 0.070 | 13.665 |
| Hydroxycitronellal (7888) | 18.750 | 18 | 18.750 | 18.750 | 4.330 | 0.231 | 0.053 | 18.083 |
| Rose oxide (27866) | 12.800 | 12 | 12.800 | 12.800 | 3.578 | 0.280 | 0.078 | 12.132 |
| Eugenol (3314) | 28.250 | 28 | 28.250 | 28.250 | 5.315 | 0.188 | 0.035 | 27.583 |
| Sulfametrole (64939) | 19.200 | 19 | 19.200 | 19.200 | 4.382 | 0.228 | 0.052 | 18.533 |
| Citronella | 9.750 | 9 | 9.750 | 9.750 | 3.122 | 0.320 | 0.103 | 9.082 |
| Geranium Africa | 13.250 | 13 | 13.250 | 13.250 | 3.640 | 0.275 | 0.075 | 12.582 |
| Geranium Bourbon | 12.500 | 12 | 12.500 | 12.500 | 3.536 | 0.283 | 0.080 | 11.832 |
| Geranium China | 13.625 | 13 | 13.625 | 13.625 | 3.691 | 0.271 | 0.073 | 12.957 |
| Helichrysum | 10.667 | 10 | 10.667 | 10.667 | 3.266 | 0.306 | 0.094 | 9.999 |
| Palmarosa | 11.625 | 11 | 11.625 | 11.625 | 3.410 | 0.293 | 0.086 | 10.957 |
| Rose | 12.750 | 12 | 12.750 | 12.750 | 3.571 | 0.280 | 0.078 | 12.082 |
| Verbena | 16.500 | 16 | 16.500 | 16.500 | 4.062 | 0.246 | 0.061 | 15.833 |
| Tetracycline hydrochloride | 15.143 | 15 | 15.143 | 15.143 | 3.891 | 0.257 | 0.066 | 14.476 |
| Ciproxin | 26.000 | 26 | 26.000 | 26.000 | 5.099 | 0.196 | 0.038 | 25.333 |
λ = Parameter of Poisson distribution; Var = variance; StDev = standard deviation; Skew = skewness; EKurt = Excess Kurtosis.
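Most columns of the table above follow directly from λ. The following is a minimal Python sketch, assuming the standard closed-form properties of the Poisson distribution; the function name `poisson_properties` is illustrative, and the median is left out because the source does not state which approximation was used for it.

```python
import math

def poisson_properties(lam: float) -> dict:
    """Closed-form population properties of a Poisson(lam) distribution."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    return {
        "mode": math.floor(lam),     # for integer lam, lam - 1 is also a mode
        "mean": lam,                 # mean and variance both equal lam
        "var": lam,
        "stdev": math.sqrt(lam),
        "skew": 1 / math.sqrt(lam),  # skewness
        "ekurt": 1 / lam,            # excess kurtosis
    }

# Citral row of the table (lambda = 14.125)
citral = poisson_properties(14.125)
print({name: round(value, 3) for name, value in citral.items()})
```

Rounded to three decimals, the values for λ = 14.125 agree with the Citral row (Mode 14, StDev 3.758, Skew 0.266, EKurt 0.071).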
Figure 2. Williams plot (training set): Dragon descriptors.
Figure 3. Observed vs. calculated parameter: QSAR-Dragon (Equation (1); R²TS = determination coefficient in test set).
Figure 4. Williams plots (training set): SAPF descriptors.
Figure 5. Observed vs. calculated parameter: QSAR-SAPF (Equation (2); R²TS = determination coefficient in test set).
QSAR Residuals: Dragon vs. SAPF.
| Set | CID | Y | ŶDragon | ResDragon | ŶSAPF | ResSAPF |
|---|---|---|---|---|---|---|
| Training | 1549025 | 1.9924 | 2.0070 | −0.0146 | 2.0761 | −0.0836 |
| Training | 8835 | 2.1001 | 2.0564 | 0.0437 | 2.1461 | −0.0460 |
| Training | 60985 | 2.1041 | 2.0768 | 0.0273 | 2.0553 | 0.0488 |
| Training | 5282109 | 2.1832 | 2.2596 | −0.0764 | 2.3267 | −0.1435 |
| Training | 643820 | 2.4204 | 2.6106 | −0.1902 | 2.7127 | −0.2923 |
| Training | 7778 | 2.4968 | 2.4132 | 0.0835 | 2.2816 | 0.2151 |
| Training | 27866 | 2.5494 | 2.5905 | −0.0411 | 2.4957 | 0.0538 |
| Training | 637566 | 2.6210 | 2.6106 | 0.0104 | 2.7127 | −0.0917 |
| Training | 638011 | 2.6479 | 2.7061 | −0.0582 | 2.6042 | 0.0437 |
| Training | 8842 | 2.6741 | 2.6435 | 0.0307 | 2.5713 | 0.1029 |
| Training | 7794 | 2.6810 | 2.6929 | −0.0118 | 2.6430 | 0.0380 |
| Training | 7888 | 2.9312 | 2.7346 | 0.1966 | 2.8638 | 0.0674 |
| Training | 64939 | 2.9549 | 2.8674 | 0.0875 | ||
| Test | 1549026 | 2.1041 | 2.0070 | 0.0971 | 2.2012 | −0.0971 |
| Test | 5355856 | 2.1650 | 1.9271 | 0.2379 | 2.2830 | −0.1180 |
| Test | 5352162 | 2.3716 | 1.9271 | 0.4445 | 2.7847 | −0.4132 |
| Test | 5367785 | 2.4532 | 1.8661 | 0.5870 | 2.4642 | −0.0111 |
| Test | 643779 | 2.6027 | 2.7061 | −0.1034 | 2.6006 | 0.0021 |
| Test | 8834 | 2.6626 | 2.4108 | 0.2518 | 2.6207 | 0.0418 |
| Test | 3314 | 3.3411 | 2.7843 | 0.5568 | 3.3685 | −0.0274 |
| External | 9017 | 1.9859 | 2.1432 | −0.1572 | 2.0053 | −0.0194 |
| External | 5365982 | 2.3716 | 2.2688 | 0.1028 | 2.2889 | 0.0827 |
CID = compound identification number; Y = observed ln(λ) value; Ŷ = estimated/predicted value; Res = residuals (Y − Ŷ); Dragon = model from Equation (1); SAPF = model from Equation (2).
Results of comparison: QSAR-Dragon model vs. QSAR-SAPF model.
| Parameter (Abbreviation) | Dragon | SAPF | ||||
|---|---|---|---|---|---|---|
| Root-mean-square error (RMSE) | 0.2314 | 0.1357 | ||||
| Mean absolute error (MAE) | 0.1582 | 0.0967 | ||||
| Mean Absolute Percentage Error (MAPE) | 0.0628 | 0.0403 | ||||
| Standard error of prediction (SEP) | 0.2371 | 0.0628 | ||||
| Relative error of prediction (REP%) | 9.2964 | 5.4523 | ||||
| Predictive Power of the Model | ||||||
| Q2F1 | 0.2121 | 0.8436 | ||||
| Q2F2 | 0.2041 | 0.8421 | ||||
| Q2F3 | n.a. | 0.7742 | ||||
| ρc-TR | 0.9457 | 0.9063 | ||||
| ρc-TS | 0.4885 | 0.9219 | ||||
| Fisher’s Predictive Power | Dragon, TS | Dragon, EX | Dragon, TS + EX | SAPF, TS | SAPF, EX | SAPF, TS + EX |
| n | 7 | 2 | 9 | 7 | 2 | 9 |
| t-value | 3.1148 | −0.2095 | 2.5071 | −1.5344 | 0.6198 | −1.2830 |
| p-value | 0.0104 | 0.4343 | 0.0230 | 0.0879 | 0.3234 | 0.1234 |
The test set also includes the external compounds; ρc = concordance correlation coefficient (the product of accuracy and precision); TR = training set; TS = test set; EX = external set (two compounds); TS + EX = test and external sets (n = 9).
ρc-TR, Dragon (0.9457): accuracy = 0.9985, precision = 0.9471;
ρc-TS, Dragon (0.4885): accuracy = 0.7357, precision = 0.6639;
ρc-TR, SAPF (0.9063): accuracy = 0.9956, precision = 0.9103;
ρc-TS, SAPF (0.9219): accuracy = 0.9867, precision = 0.9344.
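The Fisher rows of the comparison table can be cross-checked against the residuals table. The sketch below assumes the statistic is a one-sample t statistic, the mean residual divided by its standard error; this form is an assumption inferred from the reported numbers, not a formula given in the source. The function name `t_statistic` is illustrative.

```python
import math
import statistics

def t_statistic(residuals):
    """Mean residual divided by its standard error (tests mean == 0)."""
    n = len(residuals)
    standard_error = statistics.stdev(residuals) / math.sqrt(n)
    return statistics.mean(residuals) / standard_error

# Residuals copied from the QSAR residuals table
dragon_ts = [0.0971, 0.2379, 0.4445, 0.5870, -0.1034, 0.2518, 0.5568]  # test set
dragon_ex = [-0.1572, 0.1028]                                          # external set
sapf_ex = [-0.0194, 0.0827]                                            # external set

print(round(t_statistic(dragon_ts), 2))              # test set, n = 7
print(round(t_statistic(dragon_ex), 2))              # external set, n = 2
print(round(t_statistic(dragon_ts + dragon_ex), 2))  # pooled, n = 9
print(round(t_statistic(sapf_ex), 2))
```

Under this assumption the Dragon values come out close to the reported 3.1148, −0.2095 and 2.5071, which also shows that the n = 9 column pools the test and external compounds. Turning the statistic into a p-value needs the Student t distribution with n − 1 degrees of freedom, which is not in the Python standard library.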
Compounds, oils and mixtures: inhibition zones (mm).
| No. | Compound, oil or mixture (CID) | SA | EF | EC | PV | PA | SS | KP | CA | n |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Citral (638011) | 15 | 23 | 11 | 9 | 10 | 8 | 9 | 28 | 8 |
| 2 | Geraniol (637566) | 15 | 12 | 15 | 12 | 11 | 10 | 10 | 25 | 8 |
| 3 | Geranyl formate (5282109) | 10 | 9 | 7 | 8 | 8 | 7 | 7 | 15 | 8 |
| 4 | Geranyl acetate (1549026) | 10 | 8 | 7 | NIO | NIO | 7 | NIO | 9 | 5 |
| 5 | Geranyl butyrate (5355856) | 10 | 11 | 7 | NIO | 9 | 7 | 7 | 10 | 7 |
| 6 | Geranyl tiglate (5367785) | 17 | 10 | 11 | 9 | 8 | 8 | 15 | 15 | 8 |
| 7 | Neral (643779) | 15 | 20 | 10 | 6 | 12 | 10 | 10 | 25 | 8 |
| 8 | Nerol (643820) | 11 | 8 | 10 | 10 | 10 | 7 | 7 | 27 | 8 |
| 9 | Nerol acetate (1549025) | 8 | NIO | 7 | 7 | 7 | 8 | 7 | NIO | 6 |
| 10 | Neryl butyrate (5352162) | 25 | 8 | 8 | 8 | NIO | 8 | 8 | 10 | 7 |
| 11 | Neryl propanoate (5365982) | 17 | 10 | NIO | 7 | 8 | 9 | 10 | 14 | 7 |
| 12 | Citronellal (7794) | 25 | 18 | NIO | 9 | NIO | 7 | 14 | NIO | 5 |
| 13 | Citronellyl formate (7778) | 18 | 20 | 10 | 8 | 9 | 7 | NIO | 13 | 7 |
| 14 | Citronellyl acetate (9017) | 10 | 6 | NIO | 6 | 7 | 6 | 7 | 9 | 7 |
| 15 | Citronellyl butyrate (8835) | 8 | 8 | NIO | NIO | 8 | 7 | 8 | 10 | 6 |
| 16 | Citronellyl isobutyrate (60985) | 8 | 10 | 9 | 7 | NIO | NIO | 7 | NIO | 5 |
| 17 | Citronellyl propionate (8834) | 15 | 20 | NIO | NIO | 10 | 15 | 11 | 15 | 6 |
| 18 | Hydroxycitronellal (7888) | 20 | 20 | 23 | 16 | 17 | 15 | 14 | 25 | 8 |
| 19 | Rose oxide (27866) | 8 | 10 | NIO | 11 | 7 | NIO | NIO | 28 | 5 |
| 20 | Eugenol (3314) | 30 | 30 | 28 | 28 | 25 | 25 | 28 | 32 | 8 |
| 21 | Sulfametrole (64939) | 27 | 27 | 11 | 23 | NIO | 8 | NIO | NIO | 5 |
| 32 | Citronellol (8842) | 25 | 18 | NIO | 8 | NIO | 7 | NIO | NIO | 4 |
| 22 | Citronella | 10 | 10 | 7 | 10 | 7 | 7 | 7 | 20 | 8 |
| 23 | Geranium Africa | 16 | 12 | 10 | 10 | 10 | 9 | 11 | 28 | 8 |
| 24 | Geranium Bourbon | 13 | 12 | 8 | 12 | 10 | 10 | 10 | 25 | 8 |
| 25 | Geranium China | 20 | 13 | 14 | 9 | 9 | 9 | 10 | 25 | 8 |
| 26 | Helichrysum | 20 | 13 | 8 | NIO | 9 | NIO | 7 | 7 | 6 |
| 27 | Palmarosa | 8 | 13 | 12 | 9 | 11 | 10 | 10 | 20 | 8 |
| 28 | Rose | 20 | 15 | 10 | 10 | 8 | 9 | 10 | 20 | 8 |
| 29 | Verbena | 27 | 25 | 10 | 13 | 10 | 12 | 10 | 25 | 8 |
| 30 | Tetracycline hydrochloride | 15 | 22 | 11 | 13 | 15 | 10 | 20 | NIO | 7 |
| 31 | Ciproxin | 35 | 33 | 22 | 25 | 32 | 10 | 25 | NIO | 7 |
SA = Staphylococcus aureus; EF = Enterococcus faecalis; EC = Escherichia coli; PV = Proteus vulgaris; PA = Pseudomonas aeruginosa; SS = Salmonella sp.; KP = Klebsiella pneumoniae; CA = Candida albicans; n = sample size; NIO = No Inhibition Observed.
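The λ values used elsewhere in the record can be recovered from this table as the mean of the observed inhibition zones, with NIO entries skipped. The sketch below is a minimal illustration under that reading; the names `zones` and `poisson_lambda` are hypothetical, and representing NIO as `None` is a choice made for the example.

```python
import math

NIO = None  # "No Inhibition Observed": treated as a missing value

# Two rows copied from the inhibition-zone table (mm)
zones = {
    "Citral": [15, 23, 11, 9, 10, 8, 9, 28],
    "Citronellol": [25, 18, NIO, 8, NIO, 7, NIO, NIO],
}

def poisson_lambda(values):
    """Return (lambda, n): the mean of the observed zones and their count."""
    observed = [v for v in values if v is not None]
    return sum(observed) / len(observed), len(observed)

for name, row in zones.items():
    lam, n = poisson_lambda(row)
    print(name, n, round(lam, 3), round(math.log(lam), 4))
```

For Citral this gives λ = 14.125 with n = 8, and for Citronellol ln(λ) ≈ 2.6741 with n = 4, matching the sample sizes in the table and the observed Y values listed for CIDs 638011 and 8842.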
Figure 6. SAPF descriptors (v = value, ln = natural logarithm, V = vector, T = topology, G = geometry, x, y, z = geometric atomic coordinates, i = atom, refD = modality to calculate coordinates from average, refP = modality to calculate coordinates from property center formula, t = topological atomic coordinate).
Statistical parameters used to assess QSAR models.
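| Parameter (Abbreviation) | Formula | Remarks |
|---|---|---|
| Root-mean-square error (RMSE) | $\sqrt{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2/n}$ | RMSE > MAE → variation in the errors exists |
| Mean absolute error (MAE) | $\sum_{i=1}^{n}\mathrm{abs}(y_i-\hat{y}_i)/n$ | |
| Mean absolute percentage error (MAPE) | $\frac{1}{n}\sum_{i=1}^{n}\mathrm{abs}\left((y_i-\hat{y}_i)/y_i\right)$ | MAPE ~ 0 → perfect fit |
| Standard error of prediction (SEP) | $\sqrt{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2/(n-1)}$ | Lower values indicate a good model |
| Relative error of prediction (REP%) | $\frac{100}{\bar{y}}\sqrt{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2/n}$ | Lower values indicate a good model |
| Concordance analysis (ρc) | $\rho_c=\frac{2\sum_{i=1}^{n}(y_i-\bar{y})(\hat{y}_i-\bar{\hat{y}})}{\sum_{i=1}^{n}(y_i-\bar{y})^2+\sum_{i=1}^{n}(\hat{y}_i-\bar{\hat{y}})^2+n(\bar{y}-\bar{\hat{y}})^2}$ | Strength of agreement |
| Predictive power of the model, Q2F1 (prediction is considered accurate if the predictive power of the model is > 0.6) | $1-\frac{\sum_{i\in TS}(y_i-\hat{y}_i)^2}{\sum_{i\in TS}(y_i-\bar{y}_{TR})^2}$ | Prediction power relative to mean value of observable in training set |
| Predictive power of the model, Q2F2 | $1-\frac{\sum_{i\in TS}(y_i-\hat{y}_i)^2}{\sum_{i\in TS}(y_i-\bar{y}_{TS})^2}$ | Prediction power relative to mean value of observable in test set |
| Predictive power of the model, Q2F3 | $1-\frac{\sum_{i\in TS}(y_i-\hat{y}_i)^2/n_{TS}}{\sum_{i\in TR}(y_i-\bar{y}_{TR})^2/n_{TR}}$ | Overall prediction weighted by test set sample size, relative to the variation of the observable around its training-set mean weighted by training set sample size |
| Predictive Power: Fisher’s approach | $t=\overline{res}/\left(\mathrm{StDev}(res)/\sqrt{n}\right)$ | Evaluates whether the mean of the residuals is statistically different from the expected value (0) |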
yi = observed ln(λ) for the ith compound; ŷi = ln(λ) estimated/predicted by the model from Equation (1) or Equation (2), respectively; n = sample size; ȳ = arithmetic mean of the observed ln(λ); $\bar{\hat{y}}$ = arithmetic mean of estimated/predicted ln(λ); ρc = concordance correlation coefficient; TR = training set; TS = test set; $\overline{res}$ = arithmetic mean of residuals; res = residuals; StDev = standard deviation; abs = absolute value.
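Two of the predictive power parameters can be recomputed from the residuals table as an illustration. The sketch below assumes the usual definitions of Q2F1 and Q2F2 (one minus the ratio of the prediction sum of squares to the sum of squares around the training-set mean or the test-set mean), and it pools the seven test compounds with the two external compounds. The function names are illustrative and the data are the Dragon model values copied from the residuals table.

```python
# Observed ln(lambda) for the 13 training compounds
y_train = [1.9924, 2.1001, 2.1041, 2.1832, 2.4204, 2.4968, 2.5494,
           2.6210, 2.6479, 2.6741, 2.6810, 2.9312, 2.9549]
# Observed and Dragon-predicted values: 7 test compounds, then 2 external ones
y_test = [2.1041, 2.1650, 2.3716, 2.4532, 2.6027, 2.6626, 3.3411,
          1.9859, 2.3716]
y_pred = [2.0070, 1.9271, 1.9271, 1.8661, 2.7061, 2.4108, 2.7843,
          2.1432, 2.2688]

def mean(values):
    return sum(values) / len(values)

def ss_residual(y, y_hat):
    """Sum of squared prediction errors."""
    return sum((a - b) ** 2 for a, b in zip(y, y_hat))

def q2_f1(y, y_hat, y_tr):
    """Predictive power relative to the training-set mean."""
    ref = mean(y_tr)
    return 1 - ss_residual(y, y_hat) / sum((a - ref) ** 2 for a in y)

def q2_f2(y, y_hat):
    """Predictive power relative to the test-set mean."""
    ref = mean(y)
    return 1 - ss_residual(y, y_hat) / sum((a - ref) ** 2 for a in y)

print(round(q2_f1(y_test, y_pred, y_train), 3))
print(round(q2_f2(y_test, y_pred), 3))
```

With these inputs the results agree with the Dragon values in the comparison table (0.2121 and 0.2041) to about three decimals, which supports reading the test set there as including the external compounds.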