YoungJin Choi, YooKyung Boo.
Abstract
(1) Medical research has shown increasing interest in machine learning, which permits the analysis of massive multivariate data. We therefore developed drug intoxication mortality prediction models and compared machine learning models with traditional logistic regression. (2) A total of 8,937 cases categorized as drug intoxication were extracted from Korea Centers for Disease Control and Prevention data (2008-2017). We trained, validated, and tested each model on these data and compared their performance using three measures: Brier score, calibration slope, and calibration-in-the-large. (3) A chi-square test showed that mortality risk differed significantly according to severity, intent, toxic substance, age, and sex. The multilayer perceptron model (MLP) had the highest area under the curve (AUC) and the lowest Brier score in the training and validation phases, while the logistic regression model (LR) showed the highest AUC (0.827) and lowest Brier score (0.0307) in the testing phase. MLP had the second-highest AUC (0.816) and second-lowest Brier score (0.03258) in the testing phase, outperforming the decision tree model. (4) Given the complexity of choosing tuning parameters, LR proved competitive on medical datasets, which require strict accuracy.
Keywords: drug intoxication; influencing factor; logistic regression; machine learning; mortality prediction
Year: 2020 PMID: 32023993 PMCID: PMC7037603 DOI: 10.3390/ijerph17030897
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Figure 1. Sample structure.
Results of the chi-square test.
| Items | Training, n (%) | χ² | Validation, n (%) | χ² | Testing, n (%) | χ² |
|---|---|---|---|---|---|---|
| | | | | | | |
| Under 65 | 3,333 (75.0) | | 1,360 (74.5) | 68.5 *** | 1,878 (70.3) | 67.1 *** |
| Over 65 | 1,109 (25.0) | | 465 (25.5) | | 792 (29.7) | |
| | | | | | | |
| Toxic Drug | 1,768 (39.8) | 440.1 *** | 739 (40.5) | 104.2 *** | 1,176 (44.0) | 64.4 *** |
| Alcohol | 28 (0.6) | | 17 (0.9) | | 34 (1.3) | |
| Hazardous Substance | 1,590 (35.8) | | 614 (33.6) | | 809 (30.3) | |
| Other | 1,056 (23.8) | | 455 (24.9) | | 651 (24.4) | |
| | | | | | | |
| 0 | 3,844 (86.5) | 17.5 *** | 1,577 (86.4) | 14.5 *** | 2,295 (86.0) | 17.9 *** |
| 1 | 403 (9.1) | | 181 (9.9) | | 249 (9.3) | |
| 2 | 114 (2.6) | | 44 (2.4) | | 73 (2.7) | |
| 3 | 81 (1.8) | | 23 (1.3) | | 53 (2.0) | |
| | | | | | | |
| Conflict with Relatives | 666 (15.0) | 11.3 ** | 192 (10.5) | 13.3 ** | 317 (11.9) | 13.3 ** |
| Physical Illness | 116 (2.6) | | 69 (3.8) | | 94 (3.5) | |
| Mental Problem | 678 (15.3) | | 285 (15.6) | | 327 (12.2) | |
| Financial Problem | 106 (2.4) | | 67 (3.7) | | 114 (4.3) | |
| Other | 2,876 (64.7) | | 1,212 (66.4) | | 1,818 (68.1) | |
| | | | | | | |
| Unintentional | 1,657 (37.3) | 123.8 *** | 699 (38.3) | 25.6 *** | 1,021 (38.2) | 37.3 *** |
| Intentional | 2,562 (57.7) | | 1,033 (56.6) | | 1,500 (56.2) | |
| Missing | 223 (5.0) | | 93 (5.1) | | 149 (5.6) | |
| | | | | | | |
| Total | 4,442 | | 1,825 | | 2,670 | |
*** p < 0.01, ** p < 0.05, * p < 0.1.
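For readers unfamiliar with how a chi-square statistic of this kind is obtained, the following is a minimal sketch of the Pearson chi-square computation for a contingency table. The function name `chi_square_statistic` and the counts are hypothetical illustrations: the table above does not report deaths per category, so the study's own values cannot be reproduced here.

```python
def chi_square_statistic(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the row and column variables
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical counts, NOT taken from the study.
# Rows: two groups of patients; columns: survived, died.
toy_table = [[90, 10],
             [30, 20]]

stat, dof = chi_square_statistic(toy_table)

# 6.635 is the chi-square critical value for 1 degree of freedom at p = 0.01
significant_at_001 = dof == 1 and stat > 6.635
print(f"chi-square = {stat:.2f}, dof = {dof}, significant at 0.01: {significant_at_001}")
```

In practice a statistics library would also return the p-value; this sketch only shows where the statistic comes from.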
Model performance test.
| Model | Phase | Brier Score | AUC | Calibration |
|---|---|---|---|---|
| Logistic Regression | Training | 0.06032 | 0.779 | −0.00342 |
| | Validation | 0.04266 | 0.788 | 0.207416 |
| | Testing | 0.030796 | 0.827 | 0.149374 |
| Decision Tree | Training | 0.060441 | 0.845 | 0.244034 |
| | Validation | 0.042295 | 0.845 | −0.11715 |
| | Testing | 0.033615 | 0.764 | −0.49888 |
| Multilayer Perceptron | Training | 0.059971 | 0.848 | −0.31857 |
| | Validation | 0.043033 | 0.853 | −0.3938 |
| | Testing | 0.032589 | 0.816 | −0.50177 |
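The measures in this table can be illustrated with a short sketch. The code below computes a Brier score, an AUC (through its rank interpretation), and a simple mean-calibration gap on toy predictions. The function names and the data are hypothetical, and `mean_calibration_gap` is only a simplified stand-in for calibration-in-the-large; how the study computed its calibration column is not stated here.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """AUC as the probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def mean_calibration_gap(y_true, y_prob):
    """Observed event rate minus mean predicted probability (0 means no overall bias)."""
    return sum(y_true) / len(y_true) - sum(y_prob) / len(y_prob)


# Hypothetical outcomes (1 = died) and predicted risks, NOT from the study
y_true = [0, 0, 0, 1, 0, 1]
y_prob = [0.1, 0.2, 0.3, 0.4, 0.6, 0.9]

print(f"Brier score: {brier_score(y_true, y_prob):.3f}")
print(f"AUC: {auc(y_true, y_prob):.3f}")
print(f"Mean calibration gap: {mean_calibration_gap(y_true, y_prob):.4f}")
```

The Brier score depends on the predicted probabilities themselves, whereas AUC depends only on their ranking, which is one reason the two measures can rank models differently.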
Figure 2. Box Graph.
Figure 3. ROC Graph.