Walid Kamal Abdelbasset, Shereen H Elsayed, Sameer Alshehri, Bader Huwaimel, Ahmed Alobaida, Amal M Alsubaiyel, Abdulsalam A Alqahtani, Mohamed A El Hamd, Kumar Venkatesan, Kareem M AboRas, Mohammed A S Abourehab.
Abstract
Producing solid-dosage oral formulations with eco-friendly supercritical solvents is regarded as a breakthrough technology for developing cost-effective therapeutic drugs. Drug solubility is a key parameter that must be measured before the process is designed. Decitabine belongs to the antimetabolite class of chemotherapy agents and is used to treat patients with myelodysplastic syndrome (MDS). In recent years, predicting drug solubility with mathematical models based on artificial intelligence (AI) has attracted interest because experimental investigations are costly. The purpose of this study is to develop several machine-learning models that estimate the optimum solubility of the anti-cancer drug decitabine and to evaluate the effects of pressure and temperature on it. Because the dataset is small, we used three ensemble methods: Random Forest (RFR), Extra Tree (ETR), and Gradient Boosted Regression Trees (GBRT). Different configurations were tested, and optimal hyper-parameters were found. The final models were then assessed using standard metrics. RFR, ETR, and GBRT had R² scores of 0.925, 0.999, and 0.999, respectively, and MAPE values of 1.423 × 10⁻¹, 7.573 × 10⁻², and 7.119 × 10⁻², respectively. On this basis, GBRT was selected as the primary model in this paper. Using this model, the optimal values are calculated as P = 380.88 bar, T = 333.01 K, and Y = 0.001073.
Keywords: anti-cancer drug; artificial intelligence; optimization; simulation
Year: 2022 PMID: 36080444 PMCID: PMC9457620 DOI: 10.3390/molecules27175676
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.927
The whole dataset.
| No | X₁ = P (bar) | X₂ = T (K) | Y = Solubility (mole fraction) |
|---|---|---|---|
| 1 | 120 | 308 | 5.04 × 10⁻⁵ |
| 2 | 120 | 318 | 4.51 × 10⁻⁵ |
| 3 | 120 | 328 | 3.69 × 10⁻⁵ |
| 4 | 120 | 338 | 2.84 × 10⁻⁵ |
| 5 | 160 | 308 | 8.23 × 10⁻⁵ |
| 6 | 160 | 318 | 9.37 × 10⁻⁵ |
| 7 | 160 | 328 | 9.11 × 10⁻⁵ |
| 8 | 160 | 338 | 7.79 × 10⁻⁵ |
| 9 | 200 | 308 | 1.18 × 10⁻⁴ |
| 10 | 200 | 318 | 1.55 × 10⁻⁴ |
| 11 | 200 | 328 | 1.77 × 10⁻⁴ |
| 12 | 200 | 338 | 2.05 × 10⁻⁴ |
| 13 | 240 | 308 | 1.37 × 10⁻⁴ |
| 14 | 240 | 318 | 1.87 × 10⁻⁴ |
| 15 | 240 | 328 | 2.82 × 10⁻⁴ |
| 16 | 240 | 338 | 3.71 × 10⁻⁴ |
| 17 | 280 | 308 | 1.76 × 10⁻⁴ |
| 18 | 280 | 318 | 2.40 × 10⁻⁴ |
| 19 | 280 | 328 | 3.42 × 10⁻⁴ |
| 20 | 280 | 338 | 4.90 × 10⁻⁴ |
| 21 | 320 | 308 | 1.97 × 10⁻⁴ |
| 22 | 320 | 318 | 2.69 × 10⁻⁴ |
| 23 | 320 | 328 | 4.27 × 10⁻⁴ |
| 24 | 320 | 338 | 7.15 × 10⁻⁴ |
| 25 | 360 | 308 | 2.18 × 10⁻⁴ |
| 26 | 360 | 318 | 3.40 × 10⁻⁴ |
| 27 | 360 | 328 | 5.60 × 10⁻⁴ |
| 28 | 360 | 338 | 8.74 × 10⁻⁴ |
| 29 | 400 | 308 | 2.83 × 10⁻⁴ |
| 30 | 400 | 318 | 5.06 × 10⁻⁴ |
| 31 | 400 | 328 | 7.88 × 10⁻⁴ |
| 32 | 400 | 338 | 1.07 × 10⁻³ |
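As a reading aid for the dataset above, the following sketch encodes the 32 measurements and checks how solubility changes with temperature at each pressure. The names `SOLUBILITY`, `TEMPERATURES_K`, and `temperature_trend` are illustrative and do not come from the paper; the values are copied from the table.

```python
# Solubility (mole fraction) copied from the dataset table, keyed by pressure in bar.
# Each list follows the temperature order given in TEMPERATURES_K.
TEMPERATURES_K = [308, 318, 328, 338]
SOLUBILITY = {
    120: [5.04e-5, 4.51e-5, 3.69e-5, 2.84e-5],
    160: [8.23e-5, 9.37e-5, 9.11e-5, 7.79e-5],
    200: [1.18e-4, 1.55e-4, 1.77e-4, 2.05e-4],
    240: [1.37e-4, 1.87e-4, 2.82e-4, 3.71e-4],
    280: [1.76e-4, 2.40e-4, 3.42e-4, 4.90e-4],
    320: [1.97e-4, 2.69e-4, 4.27e-4, 7.15e-4],
    360: [2.18e-4, 3.40e-4, 5.60e-4, 8.74e-4],
    400: [2.83e-4, 5.06e-4, 7.88e-4, 1.07e-3],
}


def temperature_trend(values):
    """Classify a sequence as strictly increasing, strictly decreasing, or mixed."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    if all(d > 0 for d in diffs):
        return "increasing"
    if all(d < 0 for d in diffs):
        return "decreasing"
    return "mixed"


# Trend of solubility with temperature at each fixed pressure.
trends = {p: temperature_trend(ys) for p, ys in SOLUBILITY.items()}

# Largest measured solubility and the conditions at which it occurs.
best_y, best_p, best_t = max(
    (y, p, t) for p, ys in SOLUBILITY.items() for t, y in zip(TEMPERATURES_K, ys)
)

for p, trend in trends.items():
    print(f"{p} bar: solubility is {trend} with temperature")
print(f"largest measured value: {best_y:.3e} at {best_p} bar, {best_t} K")
```

In these data, solubility falls with temperature at 120 bar, is non-monotonic at 160 bar, and rises with temperature from 200 bar upward, so the effect of temperature depends on the pressure level.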
Figure 1. Data distribution, P (pressure), T (temperature), and Y (solubility).
Figure 2. RFR Model: test and train data predictions.
Figure 3. ETR Model: test and train data predictions.
Figure 4. GBRT Model: test and train data predictions.
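To illustrate the idea behind gradient boosted regression trees, here is a minimal pure-Python sketch that fits depth-1 trees (stumps) to the residuals of a squared-error loss. It is not the authors' implementation: the tree depth, `n_rounds`, and `learning_rate` are assumptions chosen for illustration, the function names are hypothetical, and only eight rows of the dataset (308 K and 338 K at four pressures) are used. Because stumps split on one feature at a time, this toy model is purely additive in P and T.

```python
def fit_stump(X, residuals):
    """Find the single (feature, threshold) split that minimises squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            sse = sum((r - lmean) ** 2 for r in left)
            sse += sum((r - rmean) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    return best[1:]


def fit_gbrt(X, y, n_rounds=200, learning_rate=0.1):
    """Boost stumps on the residuals, starting from the mean of y."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        j, thr, lmean, rmean = fit_stump(X, residuals)
        stumps.append((j, thr, lmean, rmean))
        # Shrink each stump's contribution by the learning rate.
        pred = [
            p + learning_rate * (lmean if row[j] <= thr else rmean)
            for row, p in zip(X, pred)
        ]
    return base, learning_rate, stumps


def predict(model, row):
    base, lr, stumps = model
    return base + lr * sum(
        lmean if row[j] <= thr else rmean for j, thr, lmean, rmean in stumps
    )


# Eight rows from the dataset table: [P (bar), T (K)] -> solubility (mole fraction).
X = [[120, 308], [120, 338], [200, 308], [200, 338],
     [320, 308], [320, 338], [400, 308], [400, 338]]
y = [5.04e-5, 2.84e-5, 1.18e-4, 2.05e-4, 1.97e-4, 7.15e-4, 2.83e-4, 1.07e-3]

model = fit_gbrt(X, y)
mean_y = sum(y) / len(y)
mse_baseline = sum((yi - mean_y) ** 2 for yi in y) / len(y)
mse_model = sum((yi - predict(model, row)) ** 2 for row, yi in zip(X, y)) / len(y)
print(f"training MSE, mean predictor: {mse_baseline:.3e}")
print(f"training MSE, boosted stumps: {mse_model:.3e}")
```

Deeper trees would be needed to capture the interaction between pressure and temperature visible in the data; the sketch only shows the residual-fitting loop that defines boosting.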
Outputs.
| Models | R² Score | MAPE |
|---|---|---|
| RFR | 0.925 | 1.423 × 10⁻¹ |
| ETR | 0.999 | 7.573 × 10⁻² |
| GBRT | 0.999 | 7.119 × 10⁻² |
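The two metrics in the table can be computed as in the sketch below. It uses the standard definitions of the coefficient of determination and the mean absolute percentage error, and it assumes MAPE is expressed as a fraction rather than a percentage, which the source does not state. The predicted values in the example are hypothetical and are not model outputs from the paper.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error, returned as a fraction (0.1 means 10%)."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Measured solubilities at 400 bar (rows 29 to 32 of the dataset table).
measured = [2.83e-4, 5.06e-4, 7.88e-4, 1.07e-3]
# Hypothetical predictions, invented here purely to exercise the functions.
predicted = [3.00e-4, 4.90e-4, 8.00e-4, 1.05e-3]

print(f"R2   = {r2_score(measured, predicted):.4f}")
print(f"MAPE = {mape(measured, predicted):.4f}")
```

Because MAPE divides each error by the measured value, small solubilities such as those at 120 bar weigh heavily in it, whereas R² is dominated by the largest values.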
Figure 5. Three-dimensional illustration of inputs/outputs (GBRT Model).
Figure 6. Tendency of P.
Figure 7. Tendency of T.
Optimal values (GBRT Model).
| P (bar) | T (K) | Y (mole fraction) |
|---|---|---|
| 380.88 | 333.01 | 0.001073 |
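A quick sanity check on the reported optimum is to confirm that it lies inside the measured pressure and temperature ranges and to compare it with the largest measured solubility (1.07 × 10⁻³ at 400 bar and 338 K). The sketch below does only that arithmetic; the variable names are illustrative, and it makes no claim about how the authors searched for the optimum.

```python
# Reported optimum from the GBRT model (values from the table above).
REPORTED_OPTIMUM = {"P_bar": 380.88, "T_K": 333.01, "Y": 0.001073}

# Ranges covered by the 32 measurements, and the largest measured solubility.
P_RANGE_BAR = (120, 400)
T_RANGE_K = (308, 338)
LARGEST_MEASURED_Y = 1.07e-3  # dataset row 32: 400 bar, 338 K


def within(value, bounds):
    """True if value lies inside the closed interval bounds = (low, high)."""
    low, high = bounds
    return low <= value <= high


inside_domain = within(REPORTED_OPTIMUM["P_bar"], P_RANGE_BAR) and within(
    REPORTED_OPTIMUM["T_K"], T_RANGE_K
)
relative_gap = (REPORTED_OPTIMUM["Y"] - LARGEST_MEASURED_Y) / LARGEST_MEASURED_Y

print(f"optimum inside measured domain: {inside_domain}")
print(f"relative gap to largest measured value: {relative_gap:.2%}")
```

The reported optimum sits inside the experimental domain, and its solubility differs from the largest measured value by well under 1%.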