So Jin Park1,2, Sun Jung Lee1,2, HyungMin Kim1,2, Jae Kwon Kim1,2, Ji-Won Chun1,2, Soo-Jung Lee3, Hae Kook Lee3, Dai Jin Kim4, In Young Choi1,2.
Abstract
BACKGROUND: Alcohol use disorder (AUD) is a chronic disease with a higher recurrence rate than other mental illnesses, and patients require continuous outpatient treatment to maintain abstinence. However, because the probability that these patients continue outpatient treatment is low, it is necessary to predict and manage patients who might discontinue treatment. Accordingly, we developed machine learning (ML) algorithms to predict the risk of patients dropping out of outpatient treatment.
Year: 2021 PMID: 34339461 PMCID: PMC8328309 DOI: 10.1371/journal.pone.0255626
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Flow chart of inclusion of subjects.
Fig 2. Block diagram of the process of the research analysis.
Patient characteristics.
| Characteristic | Follow-up (n = 126) | Follow-up loss (n = 713) | P-value |
|---|---|---|---|
| Length of hospitalization | | | 0.406 |
| Under 28d | 77 (61.1%) | 437 (61.3%) | |
| 29-56d | 25 (19.8%) | 136 (19.1%) | |
| 57-70d | 15 (11.9%) | 110 (15.4%) | |
| Over 70d | 9 (7.1%) | 30 (4.2%) | |
| Sex | | | 0.008* |
| Male | 91 (72.2%) | 590 (82.7%) | |
| Female | 35 (27.8%) | 123 (17.3%) | |
| Age (years) | | | 0.058 |
| Under 29 | 9 (7.1%) | 22 (3.1%) | |
| 30–39 | 22 (17.5%) | 96 (13.5%) | |
| 40–49 | 29 (23.0%) | 201 (28.2%) | |
| 50–59 | 30 (23.8%) | 216 (30.3%) | |
| 60+ | 36 (28.6%) | 178 (25.0%) | |
| Region | | | 0.04* |
| Seoul | 37 (29.4%) | 144 (20.2%) | |
| Gyeonggi | 75 (59.5%) | 451 (63.3%) | |
| Other | 14 (11.1%) | 118 (16.5%) | |
| Department | | | 0.01* |
| Psychiatry | 111 (88.1%) | 546 (76.6%) | |
| Gastroenterology | 9 (7.1%) | 104 (14.6%) | |
| Other | 6 (4.8%) | 63 (8.8%) | |
| | | | 0.000* |
| No | 35 (27.8%) | 325 (45.6%) | |
| Yes | 91 (72.2%) | 388 (54.4%) | |
| | | | 0.087 |
| No | 109 (86.5%) | 654 (91.7%) | |
| Yes | 17 (13.5%) | 59 (8.3%) | |
| | | | 0.224 |
| No | 107 (84.9%) | 569 (79.8%) | |
| Yes | 19 (15.1%) | 144 (20.2%) | |
| | | | 0.006* |
| No | 78 (61.9%) | 529 (74.2%) | |
| Yes | 48 (38.1%) | 184 (25.8%) | |
| | | | 0.053 |
| No | 104 (82.5%) | 635 (89.1%) | |
| Yes | 22 (17.5%) | 78 (10.9%) | |
| | | | 0.000* |
| No | 93 (73.8%) | 626 (87.8%) | |
| Yes | 33 (26.2%) | 87 (12.2%) | |
* p<0.05
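The group-difference p-values above are consistent with chi-square tests of independence on each 2x2 or 2xk table. As an illustration only (not the authors' code, and assuming Yates' continuity correction was applied), a minimal pure-Python test on the Sex row reproduces the reported p = 0.008:

```python
import math

def chi2_yates_2x2(a, b, c, d):
    """Chi-square test of independence for a 2x2 table [[a, b], [c, d]]
    with Yates' continuity correction; returns (statistic, p-value)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Expected counts under the independence hypothesis.
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    observed = [a, b, c, d]
    stat = sum((abs(o - e) - 0.5) ** 2 / e
               for o, e in zip(observed, expected))
    # Survival function of chi-square with 1 df: P(X > stat) = erfc(sqrt(stat/2)).
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

# Sex row: follow-up group 91 male / 35 female, loss group 590 male / 123 female.
stat, p = chi2_yates_2x2(91, 590, 35, 123)
print(round(stat, 2), round(p, 3))  # p rounds to 0.008, matching the table
```

The same routine applied to the other rows would need the general r x c form; this sketch covers only the 2x2 case.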
The performance of machine learning algorithms.
| Model | AUROC | Accuracy | Sensitivity | Specificity |
|---|---|---|---|---|
| Logistic Regression | 0.6914 | 0.6130 | 0.7058 | 0.6026 |
| SVM | 0.6797 | 0.7023 | 0.6470 | 0.7086 |
| KNN | 0.6166 | 0.6726 | 0.5294 | 0.6887 |
| Random Forest | 0.6365 | 0.7380 | 0.4705 | 0.7682 |
| Neural Network | 0.6891 | 0.7440 | 0.5294 | 0.7682 |
| AdaBoost | 0.7241 | 0.6428 | 0.7647 | 0.6291 |
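All four columns of the table can be derived from model scores and labels. The sketch below is illustrative, not the paper's implementation: it computes accuracy, sensitivity, and specificity from a 0.5 decision threshold, and AUROC from its pairwise-ranking definition (the probability that a random positive outscores a random negative):

```python
def binary_metrics(y_true, scores, threshold=0.5):
    """Accuracy, sensitivity (positive-class recall), specificity
    (negative-class recall), and AUROC from labels and scores."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # AUROC as the normalized Mann-Whitney U statistic: each
    # positive/negative pair counts 1 if the positive scores higher,
    # 0.5 on a tie, 0 otherwise.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    auroc = wins / (len(pos) * len(neg))
    return {"auroc": auroc, "accuracy": accuracy,
            "sensitivity": sensitivity, "specificity": specificity}

m = binary_metrics([1, 1, 0, 0], [0.8, 0.4, 0.6, 0.3])
print(m)  # auroc 0.75; accuracy, sensitivity, specificity all 0.5
```

The pairwise AUROC is O(n_pos * n_neg); library implementations use a sort-based O(n log n) version, but the value is the same.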
Fig 3. ROC curves of six different machine learning models.
Comparison of imbalanced data set sampling methods.
| Method | AUROC | Accuracy | Sensitivity | Specificity |
|---|---|---|---|---|
| Random Undersampling | 0.6505 | 0.5773 | 0.7058 | 0.5629 |
| Random Oversampling | 0.7241 | 0.6428 | 0.7647 | 0.6291 |
| SMOTE | 0.6427 | 0.5952 | 0.5294 | 0.6026 |
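Random oversampling, the best performer in this comparison, simply duplicates randomly chosen minority-class rows until the classes are balanced. A minimal sketch of the idea (the `random_oversample` helper is hypothetical, not the authors' code):

```python
import random
from collections import Counter

def random_oversample(X, y, seed=0):
    """Append randomly chosen duplicates of minority-class rows
    until every class matches the size of the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_res, y_res = list(X), list(y)
    for label, n in counts.items():
        idx = [i for i, lab in enumerate(y) if lab == label]
        for i in rng.choices(idx, k=target - n):  # sample with replacement
            X_res.append(X[i])
            y_res.append(y[i])
    return X_res, y_res

X = [[i] for i in range(13)]
y = [0] * 10 + [1] * 3            # imbalanced: 10 majority vs 3 minority
X_res, y_res = random_oversample(X, y)
print(Counter(y_res))             # both classes now have 10 rows
```

SMOTE differs in that it interpolates synthetic minority points between nearest neighbors instead of duplicating existing rows; undersampling instead discards majority rows down to the minority count.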
Fig 4. Feature importance of AdaBoost decision tree.
Fig 5. Density plot of length of hospitalization.