Shahram Lotfi, Shahin Ahmadi, Parvin Kumar.
Abstract
In ecotoxicological risk assessment, acute toxicity is one of the most significant criteria. The green alga Pseudokirchneriella subcapitata has been used in ecotoxicological studies to assess the toxicity of different chemicals in freshwater. Quantitative structure-activity relationships (QSAR) are mathematical models that quantitatively relate the chemical structure of compounds to their activity or physicochemical properties. Herein, quantitative structure-toxicity relationship (QSTR) modeling is applied to assess the toxicity of a data set of 334 different chemicals to Pseudokirchneriella subcapitata, in terms of EC10 and EC50 values. The QSTR models are built with CORAL software using the target function TF2 with the index of ideality of correlation (IIC). A hybrid optimal descriptor computed from SMILES and molecular hydrogen-suppressed graphs (HSG) is employed to construct the QSTR models. The statistical parameters of the QSTR models developed for pEC10 and pEC50 range from good to excellent and are in line with the standard parameters. The models prepared with IIC for split 3 are chosen as the best models for both endpoints (pEC10 and pEC50). The determination coefficient of the validation set of split 3 is 0.7849 for the endpoint pEC10 and 0.8150 for the endpoint pEC50. The structural fractions accountable for the toxicity of the chemicals are also extracted. Hydrophilic attributes such as 1…n…(… and S…(…=… contribute positively to controlling aquatic toxicity and reduce algal toxicity, whereas attributes such as c…c…c… and C…C…C… enhance the lipophilicity of the molecules and consequently enhance algal toxicity. This journal is © The Royal Society of Chemistry.
Year: 2022 PMID: 36199875 PMCID: PMC9434604 DOI: 10.1039/d2ra03936b
Source DB: PubMed Journal: RSC Adv ISSN: 2046-2069 Impact factor: 4.036
The summary of statistical characteristics and criteria of predictability of the QSTR models obtained for pEC10 and pEC50 of organic compounds for three random splits
| Split | Set | n | R2 | CCC | IIC | | | | | | CRp2 | | Δ | | MAE | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pEC10 | | | | | | | | | | | | | | | | |
| 1 | Training | 118 | 0.8550 | 0.9218 | 0.8072 | 0.8504 | | | | | 0.8522 | | | 0.651 | 0.496 | 684 |
| | Invisible training | 79 | 0.8609 | 0.8856 | 0.5277 | 0.8535 | | | | | 0.8556 | | | 0.742 | 0.576 | 476 |
| | Calibration | 54 | 0.7186 | 0.8349 | 0.8389 | 0.6883 | 0.7282 | 0.7045 | 0.8212 | 0.7154 | 0.7111 | 0.6049 | 0.1210 | 0.725 | 0.592 | 133 |
| | Validation | 83 | 0.7246 | 0.8435 | 0.6846 | 0.7149 | | | | 0.7246 | | 0.6174 | 0.143 | 0.8339 | 6291 | |
| 2 | Training | 115 | 0.8855 | 0.9393 | 0.8932 | 0.8804 | | | | | 0.8793 | | | 0.533 | 0.408 | 874 |
| | Invisible training | 73 | 0.8868 | 0.9022 | 0.4317 | 0.8802 | | | | | 0.8823 | | | 0.706 | 0.553 | 553 |
| | Calibration | 63 | 0.8487 | 0.9146 | 0.9210 | 0.8391 | 0.8466 | 0.8460 | 0.8362 | 0.8160 | 0.8388 | 0.7468 | 0.1385 | 0.657 | 0.513 | 342 |
| | Validation | 83 | 0.7643 | 0.8716 | 0.7643 | 0.7731 | | | | 0.7575 | | 0.6965 | 0.1219 | 0.8779 | 0.7052 | |
| 3 | Training | 113 | 0.8866 | 0.9399 | 0.7473 | 0.8826 | | | | | 0.8796 | | | 0.545 | 0.426 | 867 |
| | Invisible training | 79 | 0.8775 | 0.9194 | 0.5672 | 0.8722 | | | | | 0.8742 | | | 0.691 | 0.517 | 551 |
| | Calibration | 59 | 0.8106 | 0.8985 | 0.8632 | 0.7970 | 0.8002 | 0.7987 | 0.8465 | 0.7260 | 0.8049 | 0.7336 | 0.0152 | 0.679 | 0.537 | 244 |
| | Validation | 83 | 0.7892 | 0.8648 | 0.8831 | 0.7776 | | | | 0.7612 | | 0.6061 | 0.1010 | 0.6765 | 0.5691 | |
| pEC50 | | | | | | | | | | | | | | | | |
| 1 | Training | 114 | 0.8401 | 0.9131 | 0.7161 | 0.8331 | | | | | 0.8335 | | | 0.683 | 0.537 | 588 |
| | Invisible training | 82 | 0.8395 | 0.9006 | 0.7660 | 0.8311 | | | | | 0.8278 | | | 0.733 | 0.587 | 418 |
| | Calibration | 52 | 0.7915 | 0.8717 | 0.8839 | 0.7771 | 0.7853 | 0.7851 | 0.8433 | 0.7479 | 0.7792 | 0.6529 | 0.1900 | 0.681 | 0.533 | 190 |
| | Validation | 85 | 0.7924 | 0.8297 | 0.7490 | 0.7774 | | | | 0.6276 | | 0.5802 | 0.0949 | 0.7716 | 0.6247 | |
| 2 | Training | 116 | 0.8341 | 0.9096 | 0.9133 | 0.8289 | | | | | 0.8297 | | | 0.655 | 0.517 | 573 |
| | Invisible training | 76 | 0.8704 | 0.9186 | 0.8496 | 0.8626 | | | | | 0.8634 | | | 0.671 | 0.529 | 497 |
| | Calibration | 59 | 0.7802 | 0.8795 | 0.8808 | 0.7623 | 0.7622 | 0.7435 | 0.7914 | 0.6309 | 0.7679 | 0.6918 | 0.1218 | 0.774 | 0.596 | 202 |
| | Validation | 83 | 0.7366 | 0.8517 | 0.8494 | 0.7231 | | | | 0.5993 | | 0.6371 | 0.0756 | 0.7696 | 0.6055 | |
| 3 | Training | 116 | 0.8665 | 0.9285 | 0.7831 | 0.8617 | | | | | 0.8568 | | | 0.617 | 0.461 | 740 |
| | Invisible training | 79 | 0.9130 | 0.9350 | 0.9123 | 0.9088 | | | | | 0.9065 | | | 0.512 | 0.409 | 808 |
| | Calibration | 56 | 0.7270 | 0.8484 | 0.8525 | 0.7031 | 0.6898 | 0.6860 | 0.7888 | 0.5823 | 0.7205 | 0.6237 | 0.0829 | 0.756 | 0.606 | 144 |
| | Validation | 83 | 0.8150 | 0.9020 | 0.8320 | 0.8065 | | | | 0.7743 | | 0.7402 | 0.0683 | 0.7245 | 0.6110 | |
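The table reports the determination coefficient, CCC and MAE for each set. As a rough illustration of how such statistics are computed from observed and predicted endpoint values, here is a minimal Python sketch. It assumes the determination coefficient is the squared Pearson correlation and that CCC is Lin's concordance correlation coefficient with population (1/n) variances. The function name `fit_statistics` and the sample values are hypothetical and are not taken from the study.

```python
def fit_statistics(observed, predicted):
    """Return R2, CCC and MAE for paired observed/predicted values.

    Assumptions (not confirmed by the source):
      - R2 is the squared Pearson correlation coefficient.
      - CCC is Lin's concordance correlation coefficient (1/n variances).
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    if var_o == 0 or var_p == 0:
        raise ValueError("both sequences must have non-zero variance")
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n

    r2 = cov ** 2 / (var_o * var_p)
    # CCC penalises both scatter and a systematic shift between the means.
    ccc = 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / n
    return {"R2": r2, "CCC": ccc, "MAE": mae}


if __name__ == "__main__":
    # Hypothetical pEC50 values, for illustration only.
    obs = [3.1, 4.0, 4.8, 5.5, 6.2]
    pred = [3.4, 3.9, 4.5, 5.9, 6.0]
    for name, value in fit_statistics(obs, pred).items():
        print(f"{name} = {value:.4f}")
```

Because CCC also accounts for a shift between the two means, predictions that are perfectly correlated with the observations but offset by a constant give an R2 of 1 and a CCC below 1.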
Fig. 1 Graphical display of QSTR models for pEC10 and pEC50 of organic compounds obtained for three splits.
Fig. 2 Graphical presentation of residual pEC10 versus predicted pEC10 (A) and residual pEC50 versus predicted pEC50 (B) for all constructed QSTR models.
Structural attributes acting as promoters of endpoint increase or decrease, their correlation weights, the number of occurrences of each attribute in each set, and the interpretation of each attribute
| Endpoint | SAk | Split | CWs run 1 | CWs run 2 | CWs run 3 | N1 | N2 | N3 | Defect | Comments |
|---|---|---|---|---|---|---|---|---|---|---|
| Promoter of increase | | | | | | | | | | |
| pEC10 | C5……0… | 1 | 0.0518 | 0.65382 | 0.39759 | 113 | 73 | 54 | 0.0003 | Absence of five-membered rings |
| | | 2 | 0.65986 | 1.29285 | 0.5346 | 108 | 67 | 62 | 0.0003 | |
| | | 3 | 1.07757 | 1.25744 | 0.48657 | 107 | 78 | 56 | 0 | |
| | c…c…c… | 1 | 0.60173 | 0.70843 | 0.06133 | 57 | 23 | 24 | 0.0005 | Presence of three consecutive aromatic carbons |
| | | 2 | 0.05675 | 0.25722 | 0.44496 | 50 | 27 | 30 | 0.0005 | |
| | | 3 | 0.12996 | 0.47028 | 0.75404 | 49 | 32 | 26 | 0.0001 | |
| | c…(…c… | 1 | 0.70809 | 0.0575 | 1.34021 | 56 | 22 | 25 | 0.0001 | Presence of two aromatic carbons with branching |
| | | 2 | 0.38614 | 0.33907 | 0.1424 | 42 | 29 | 26 | 0.0007 | |
| | | 3 | 0.11744 | 0.51584 | 1.00395 | 44 | 37 | 22 | 0.0002 | |
| | C…C…C… | 1 | 0.5463 | 1.04019 | 0.43998 | 27 | 22 | 18 | 0.0023 | Presence of three consecutive aliphatic carbons |
| | | 2 | 1.06593 | 1.00475 | 0.65444 | 33 | 13 | 20 | 0.0006 | |
| | | 3 | 0.6226 | 0.74328 | 0.93265 | 27 | 24 | 14 | 0 | |
| | N…(…C… | 1 | 0.40394 | 0.48781 | 1.00039 | 23 | 19 | 13 | 0.0013 | Presence of aliphatic nitrogen and aliphatic carbon with branching |
| | | 2 | 1.07368 | 0.89617 | 0.01453 | 22 | 11 | 15 | 0.0013 | |
| | | 3 | 0.3214 | 0.5839 | 0.59097 | 21 | 23 | 12 | 0.0005 | |
| | C…(…C… | 1 | 0.44453 | 0.59506 | 1.29729 | 43 | 45 | 31 | 0.0028 | Presence of two aliphatic carbons with branching |
| | | 2 | 0.30408 | 0.72512 | 0.53272 | 55 | 36 | 29 | 0.0002 | |
| | | 3 | 0.07648 | 0.49085 | 0.56336 | 52 | 40 | 31 | 0.0008 | |
| Promoter of decrease | | | | | | | | | | |
| | c… | 1 | −0.0287 | −0.44782 | −1.06545 | 3 | 5 | 2 | 0.0023 | Presence of aromatic nitrogen between two aromatic carbons |
| | | 2 | −1.23621 | −0.75303 | −0.19895 | 4 | 3 | 2 | 0.0005 | |
| | | 3 | −1.56359 | −1.69078 | −0.99982 | 4 | 2 | 2 | 0.0002 | |
| | S…(…C… | 1 | −0.49469 | −2.01565 | −0.11843 | 3 | 2 | 2 | 0.0023 | Presence of sulphur with branching with carbon |
| | ++++S⋯B2 | 2 | −0.56135 | −0.0549 | −0.38003 | 13 | 9 | 7 | 0.0001 | Presence of sulphur with a double bond |
| | S…(… | 3 | −1.33917 | −1.08063 | −1.01206 | 4 | 7 | 3 | 0.0022 | Presence of sulphur with branching and double bond |
| | ++++Cl⋯S | 1 | −0.57654 | −0.35935 | −0.62229 | 3 | 1 | 1 | 0.0017 | Presence of chlorine with sulphur |
| Promoter of increase | | | | | | | | | | |
| pEC50 | C5……0… | 1 | 0.21385 | 0.49259 | 2.08764 | 111 | 77 | 46 | 0.0006 | Absence of five-membered rings |
| | | 2 | 2.21049 | 0.704 | 1.70235 | 110 | 74 | 56 | 0 | |
| | | 3 | 1.43226 | 1.5737 | 2.15936 | 110 | 75 | 54 | 0.0001 | |
| | C…(…C… | 1 | 0.25343 | 0.18014 | 0.42824 | 58 | 36 | 27 | 0.0001 | Presence of two aliphatic carbons with branching |
| | | 2 | 1.24886 | 0.47209 | 1.01774 | 60 | 35 | 28 | 0.0005 | |
| | | 3 | 1.25593 | 0.18762 | 1.24105 | 59 | 34 | 23 | 0.0012 | |
| | c…c…c… | 1 | 0.30441 | 0.73197 | 0.19607 | 47 | 38 | 17 | 0.0013 | Presence of three consecutive aromatic carbons |
| | | 2 | 0.42912 | 0.29232 | 0.60473 | 46 | 31 | 27 | 0.0008 | |
| | | 3 | 1.09812 | 0.29491 | 0.08848 | 45 | 36 | 24 | 0.0006 | |
| | C6…A…1… | 1 | 1.2658 | 0.05442 | 0.26358 | 38 | 21 | 15 | 0.0008 | Presence of one six-membered aromatic ring |
| | | 2 | 0.40875 | 0.26677 | 0.20824 | 32 | 22 | 21 | 0.0015 | |
| | | 3 | 0.89128 | 1.05867 | 0.39511 | 30 | 28 | 21 | 0.0023 | |
| | C…C…C… | 1 | 0.91697 | 0.94405 | 0.63035 | 32 | 21 | 14 | 0.0002 | Presence of three consecutive aliphatic carbons |
| | | 2 | 1.21722 | 1.07992 | 1.33949 | 29 | 24 | 16 | 0.0005 | |
| | | 3 | 1.08398 | 1.16517 | 0.89018 | 29 | 17 | 15 | 0.0004 | |
| Promoter of decrease | | | | | | | | | | |
| | 1… | 1 | −0.82145 | −1.58394 | −0.45675 | 7 | 6 | 3 | 0.0004 | Presence of aromatic nitrogen on the first ring with branching |
| | | 2 | −0.84679 | −1.04423 | −0.83943 | 6 | 3 | 4 | 0.0016 | |
| | | 3 | −0.94912 | −0.6517 | −0.22174 | 6 | 5 | 1 | 0.0048 | |
| | S…(… | 2 | −0.75816 | −0.71358 | −1.06783 | 8 | 3 | 2 | 0.0035 | Presence of sulphur with branching and double bond |
| | ++++O⋯S | 3 | −0.84696 | −0.5927 | −0.1503 | 14 | 2 | 5 | 0.0017 | Presence of oxygen with sulphur |
| | […–…Cl… | 2 | −0.95022 | −0.29786 | −0.57338 | 4 | 2 | 2 | 0.0001 | Presence of chloride ion |
| | | 3 | −0.92707 | −0.72223 | −0.89901 | 5 | 3 | 0 | 1 | |
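In the table, every attribute listed as a promoter of increase has positive correlation weights in all three runs, and every promoter of decrease has negative weights in all three. A minimal Python sketch of that sign rule follows. The function name `classify_attribute` and the "undefined" label for mixed signs are assumptions made for illustration and are not output of the CORAL software.

```python
def classify_attribute(weights):
    """Classify a structural attribute from its correlation weights over several runs.

    Positive in every run  -> "promoter of increase"
    Negative in every run  -> "promoter of decrease"
    Anything else          -> "undefined" (label assumed here for mixed signs)
    """
    if not weights:
        raise ValueError("at least one correlation weight is required")
    if all(w > 0 for w in weights):
        return "promoter of increase"
    if all(w < 0 for w in weights):
        return "promoter of decrease"
    return "undefined"


if __name__ == "__main__":
    # Correlation weights (runs 1-3) for pEC10, split 1, copied from the table.
    pec10_split1 = {
        "c…c…c…": [0.60173, 0.70843, 0.06133],
        "S…(…C…": [-0.49469, -2.01565, -0.11843],
    }
    # Hypothetical mixed-sign weights, not from the study.
    mixed = [0.25, -0.10, 0.40]

    for name, cws in pec10_split1.items():
        print(name, "->", classify_attribute(cws))
    print("mixed-sign example ->", classify_attribute(mixed))
```

Requiring the same sign in every run is a deliberately strict rule: an attribute whose weight changes sign between runs gives no stable indication of its effect on the endpoint.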
Fig. 3 Some examples of organic chemicals that enhance or reduce algal toxicity, based on model interpretation.
Comparison between some earlier published models and the present study for the prediction of pEC10 and pEC50
| S. no. | Endpoint | Chemical class | Test duration (h) | No. of descriptors | Total number of components | Data set size | R2 | MAE | Ref. | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Training | Inv. train. | Cal | Test | Training | Test | Training | Test | | | | | | | | |
| 1 | pEC50 | Benzoic acids | 48 | 2 | 20 | 20 | 0.965 and 0.921 | | | | | | | | |
| 2 | pEC50 | Polar narcotic chemicals | 72 | 2 | 58 | 58 | 0.6 | — | | | | | | | |
| 3 | pEC50 | Non-polar narcotic chemicals | 72 | 2 | 50 | 50 | 0.9469 | | | | | | | | |
| 4 | pEC50 | Polar and nonpolar narcotic chemicals | 72 | 3 | 108 | 87 | 21 | 0.9149 | | | | | | | |
| 5 | pEC50 | Cosmetics | 96 | 4 | 30 | 20 | 10 | 0.885 | 0.712 | 0.328 | | | | | |
| 6 | pEC50 | Pharmaceuticals | 96 | 5 | 69 | 53 | 16 | 0.69 | 0.73 | 0.55 | | | | | |
| 7 | pEC50 | Pharmaceuticals | 96 | 5 | 69 | 53 | 16 | 0.71 | 0.64 | 0.57 | | | | | |
| 8 | pEC50 | Organic compounds | 24 | 6 | 334 | 251 | 83 | 0.72 | 0.7 | 0.69 | 0.67 | | | | |
| 9 | pEC10 | Organic compounds | 24 | 8 | 334 | 251 | 83 | 0.7 | 0.77 | 0.7 | 0.61 | | | | |
| 10 | pEC10 | Organic chemicals | 24 | 6 | 334 | 167 | 167 | 0.76 | 0.75 | 0.60 | 0.61 | | | | |
| 11 | pEC50 | Organic chemicals | 24 | 6 | 334 | 167 | 167 | 0.75 | 0.74 | 0.6 | 0.61 | | | | |
| 12 | pEC50 | Organic chemicals | 72 | 7 | 271 | 217 | 54 | 0.72 | 0.718 | 0.693 | 0.506 | 0.432 | | | |
| 13 | pEC50 | Organic chemicals | 24 | 1 | 334 | 113 | 79 | 59 | 83 | 0.8150 | 0.8665 | 0.6110 | 0.461 | Present work | |
| 14 | pEC10 | Organic chemicals | 24 | 1 | 334 | 116 | 79 | 56 | 83 | 0.7892 | 0.8866 | 0.5691 | 0.426 | | |