Zhaohe Zhou, Dan Luo, Bing Xiang Yang, Zhongchun Liu.
Abstract
Background: Depression symptoms among healthcare workers related to the 2019 novel coronavirus disease (COVID-19) have received worldwide recognition. Although many studies have identified risk exposures associated with depression symptoms among healthcare workers, few have focused on predictive models built with machine learning methods. Society, governments, and organizations need immediate interventions and alert systems for healthcare workers who are mentally at risk. This study aims to develop and validate machine learning-based models for predicting depression symptoms using survey data collected during the COVID-19 outbreak in China.
Method: Surveys were conducted of 2,574 healthcare workers in hospitals designated to care for COVID-19 patients between 20 January and 11 February 2020. The Patient Health Questionnaire-9 (PHQ-9) was used to measure depression symptoms and quantify their severity; a score of ≥5 was considered positive for depression symptoms. Four machine learning approaches were trained on 75% of the data and tested on the remaining 25%: a decision tree, logistic regression with the least absolute shrinkage and selection operator (LASSO), a random forest, and a gradient-boosting tree. Cross-validation with 100 repetitions was applied to the training dataset for hyperparameter tuning. Finally, all models were compared to evaluate their predictive performance and screening utility.
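The workflow described in the Method paragraph (a 75/25 split, repeated cross-validation on the training data for tuning, and four candidate classifiers) can be sketched in a few lines. The sketch below uses Python with scikit-learn purely for illustration; it is not the authors' code, and the placeholder data, the hyperparameter grids, and the 10-fold × 10-repeat reading of "100 repetitions" are all assumptions.

```python
# Minimal sketch of the described pipeline (illustration only, not the authors' code).
import numpy as np
from sklearn.model_selection import (GridSearchCV, RepeatedStratifiedKFold,
                                     train_test_split)
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.random((2574, 20))          # placeholder survey features
y = rng.integers(0, 2, 2574)        # placeholder labels: 1 if PHQ-9 >= 5

# 75% training / 25% test, stratified on the outcome.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Repeated cross-validation on the training set for hyperparameter tuning;
# 10 folds x 10 repeats = 100 resamples (one reading of "100 repetitions").
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=10, random_state=0)

# The four model families named in the abstract, with illustrative grids.
models = {
    "decision_tree": (DecisionTreeClassifier(random_state=0),
                      {"max_depth": [3, 5, 7]}),
    "lasso_logistic": (LogisticRegression(penalty="l1", solver="liblinear"),
                       {"C": [0.01, 0.1, 1.0]}),
    "random_forest": (RandomForestClassifier(random_state=0),
                      {"n_estimators": [200, 500]}),
    "gradient_boosting": (GradientBoostingClassifier(random_state=0),
                          {"learning_rate": [0.01, 0.1]}),
}

fitted = {}
for name, (estimator, grid) in models.items():
    search = GridSearchCV(estimator, grid, cv=cv, scoring="roc_auc", n_jobs=-1)
    search.fit(X_train, y_train)
    fitted[name] = search.best_estimator_
```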
Keywords: COVID-19; depression; health personnel; machine learning; predictive value of tests
Year: 2022 PMID: 35573334 PMCID: PMC9106105 DOI: 10.3389/fpsyt.2022.876995
Source DB: PubMed Journal: Front Psychiatry ISSN: 1664-0640 Impact factor: 5.435
Figure 1. Study flowchart.
Table 1. Demographic and variable characteristics.

| Variable | Training: PHQ-9 < 5 (n = 1,041) | Training: PHQ-9 ≥ 5 (n = 891) | Test: PHQ-9 < 5 (n = 346) | Test: PHQ-9 ≥ 5 (n = 296) |
|---|---|---|---|---|
| **Gender** | | | | |
| Female | 783 (75.2%) | 746 (83.7%) | 266 (76.9%) | 241 (81.4%) |
| Male | 258 (24.8%) | 145 (16.3%) | 80 (23.1%) | 55 (18.6%) |
| **…** | | | | |
| Yes | 279 (26.8%) | 406 (45.6%) | 98 (28.3%) | 132 (44.6%) |
| No | 762 (73.2%) | 485 (54.4%) | 248 (71.7%) | 164 (55.4%) |
| **…** | | | | |
| Yes | 715 (68.7%) | 548 (61.5%) | 238 (68.8%) | 197 (66.6%) |
| No | 326 (31.3%) | 343 (38.5%) | 108 (31.2%) | 99 (33.4%) |
| **Education** | | | | |
| Graduate degree or higher | 177 (17.0%) | 154 (17.3%) | 66 (19.1%) | 51 (17.2%) |
| Undergraduate degree or lower | 864 (83.0%) | 737 (82.7%) | 280 (80.9%) | 245 (82.8%) |
| **…** | | | | |
| Yes | 126 (12.1%) | 194 (21.8%) | 37 (10.7%) | 68 (23.0%) |
| No | 915 (87.9%) | 697 (78.2%) | 309 (89.3%) | 228 (77.0%) |
| **…** | | | | |
| Yes | 740 (71.1%) | 819 (91.9%) | 228 (65.9%) | 268 (90.5%) |
| No | 301 (28.9%) | 72 (8.1%) | 118 (34.1%) | 28 (9.5%) |
| **…** | | | | |
| Yes | 572 (54.9%) | 562 (63.1%) | 204 (59.0%) | 195 (65.9%) |
| No | 469 (45.1%) | 329 (36.9%) | 142 (41.0%) | 101 (34.1%) |
| **…** | | | | |
| Yes | 997 (95.8%) | 861 (96.6%) | 329 (95.1%) | 283 (95.6%) |
| No | 44 (4.2%) | 30 (3.4%) | 17 (4.9%) | 13 (4.4%) |
| **…** | | | | |
| Yes | 574 (55.1%) | 429 (48.1%) | 202 (58.4%) | 148 (50.0%) |
| No | 467 (44.9%) | 462 (51.9%) | 144 (41.6%) | 148 (50.0%) |
| **…** | | | | |
| Yes | 141 (13.5%) | 125 (14.0%) | 52 (15.0%) | 48 (16.2%) |
| No | 900 (86.5%) | 766 (86.0%) | 294 (85.0%) | 248 (83.8%) |
| **…** | | | | |
| Yes | 672 (64.6%) | 451 (50.6%) | 228 (65.9%) | 149 (50.3%) |
| No | 369 (35.4%) | 440 (49.4%) | 118 (34.1%) | 147 (49.7%) |
| **…** | | | | |
| Yes | 42 (4.0%) | 45 (5.1%) | 23 (6.6%) | 11 (3.7%) |
| No | 999 (96.0%) | 846 (94.9%) | 323 (93.4%) | 285 (96.3%) |
| **Living situation** | | | | |
| Live with family | 712 (68.4%) | 484 (54.3%) | 249 (72.0%) | 170 (57.4%) |
| Live alone | 214 (20.6%) | 221 (24.8%) | 63 (18.2%) | 72 (24.3%) |
| Live with friends | 108 (10.4%) | 168 (18.9%) | 33 (9.5%) | 51 (17.2%) |
| Live with others | 7 (0.7%) | 18 (2.0%) | 1 (0.3%) | 3 (1.0%) |
| **Location** | | | | |
| Wuhan city | 377 (36.2%) | 459 (51.5%) | 112 (32.4%) | 154 (52.0%) |
| Hubei province | 297 (28.5%) | 172 (19.3%) | 103 (29.8%) | 68 (23.0%) |
| Other province | 367 (35.3%) | 260 (29.2%) | 131 (37.9%) | 74 (25.0%) |
| **…** | | | | |
| <1 h | 206 (19.8%) | 144 (16.2%) | 58 (16.8%) | 34 (11.5%) |
| 1–2 h | 473 (45.4%) | 336 (37.7%) | 166 (48.0%) | 121 (40.9%) |
| 3–4 h | 226 (21.7%) | 218 (24.5%) | 68 (19.7%) | 83 (28.0%) |
| Over 5 h | 136 (13.1%) | 193 (21.7%) | 54 (15.6%) | 58 (19.6%) |
| **…** | | | | |
| Very strong | 117 (11.2%) | 237 (26.6%) | 39 (11.3%) | 69 (23.3%) |
| Strong | 338 (32.5%) | 339 (38.0%) | 119 (34.4%) | 127 (42.9%) |
| Normal | 538 (51.7%) | 306 (34.3%) | 178 (51.4%) | 97 (32.8%) |
| None | 48 (4.6%) | 9 (1.0%) | 10 (2.9%) | 3 (1.0%) |
| **…** | | | | |
| Much worse | 91 (8.7%) | 312 (35.0%) | 22 (6.4%) | 110 (37.2%) |
| Worse | 867 (83.3%) | 499 (56.0%) | 288 (83.2%) | 162 (54.7%) |
| Unchanged | 76 (7.3%) | 33 (3.7%) | 34 (9.8%) | 7 (2.4%) |
| Better | 7 (0.7%) | 47 (5.3%) | 2 (0.6%) | 17 (5.7%) |
| **…** | | | | |
| Yes | 17 (1.6%) | 34 (3.8%) | 1 (0.3%) | 13 (4.4%) |
| No | 1,024 (98.4%) | 857 (96.2%) | 345 (99.7%) | 283 (95.6%) |
| **…** | | | | |
| Yes | 235 (22.6%) | 299 (33.6%) | 59 (17.1%) | 119 (40.2%) |
| No | 806 (77.4%) | 592 (66.4%) | 287 (82.9%) | 177 (59.8%) |
| **…** | | | | |
| Yes | 80 (7.7%) | 113 (12.7%) | 20 (5.8%) | 35 (11.8%) |
| No | 961 (92.3%) | 778 (87.3%) | 326 (94.2%) | 261 (88.2%) |
| **…** | | | | |
| Yes | 141 (13.5%) | 184 (20.7%) | 34 (9.8%) | 58 (19.6%) |
| No | 900 (86.5%) | 707 (79.3%) | 312 (90.2%) | 238 (80.4%) |
| **Age, years** | | | | |
| 18–30 | 450 (43.2%) | 433 (48.6%) | 142 (41.0%) | 137 (46.3%) |
| 31–40 | 323 (31.0%) | 294 (33.0%) | 102 (29.5%) | 89 (30.1%) |
| 41 and above | 268 (25.7%) | 164 (18.4%) | 102 (29.5%) | 70 (23.6%) |
| **…** | | | | |
| Not good | 3 (0.3%) | 53 (5.9%) | 2 (0.6%) | 15 (5.1%) |
| Normal | 300 (28.8%) | 612 (68.7%) | 82 (23.7%) | 202 (68.2%) |
| Good | 738 (70.9%) | 226 (25.4%) | 262 (75.7%) | 79 (26.7%) |

Values are n (%). PHQ-9 ≥ 5 denotes positive depression symptoms; column group sizes follow from the column totals (1,041 + 891 = 1,932 training cases; 346 + 296 = 642 test cases). An ellipsis (…) marks a variable name that could not be recovered from the extracted record.
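All predictors in the table are categorical, and the Figure 2 caption below indicates that one level of each variable served as the reference ("[ ], reference variable"). A hypothetical dummy-coding step is sketched here; the column names are illustrative placeholders, not the study's actual variable list.

```python
# Sketch: dummy-coding categorical survey variables against a reference level.
# Column names are illustrative; they are not the study's actual variables.
import pandas as pd

df = pd.DataFrame({
    "gender": ["Female", "Male", "Female"],
    "age_group": ["18-30", "31-40", "41 and above"],
    "living_situation": ["Live with family", "Live alone", "Live with friends"],
})

# drop_first=True drops one level per variable, which acts as the reference
# category, mirroring the "[ ]" convention in Figure 2.
X = pd.get_dummies(df, drop_first=True)
print(X.columns.tolist())
```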
Figure 2. Feature weights and contributions to the models: (A) logistic regression with LASSO; (B) decision tree; (C) random forest; and (D) gradient-boosting tree. Shown are the beta coefficients for logistic regression with LASSO, and the variable importances scaled to a maximum value of 100 for the decision tree, random forest, and gradient-boosting tree. LASSO, least absolute shrinkage and selection operator; [ ], reference variable.
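The quantities plotted in Figure 2 (LASSO beta coefficients; tree-based importances rescaled to a maximum of 100) could be recovered from fitted scikit-learn models along the following lines. This reuses the hypothetical `fitted` dict from the earlier pipeline sketch and is an assumption about implementation, not the authors' code.

```python
# Sketch: the quantities shown in Figure 2, from the tuned models above.
import numpy as np

# (A) Beta coefficients of the LASSO-penalized logistic regression.
betas = fitted["lasso_logistic"].coef_.ravel()

# (B)-(D) Impurity-based importances, rescaled so the largest value is 100,
# matching the scaling described in the Figure 2 caption.
def scaled_importance(model):
    imp = model.feature_importances_
    return 100.0 * imp / imp.max()

for name in ("decision_tree", "random_forest", "gradient_boosting"):
    print(name, np.round(scaled_importance(fitted[name]), 1))
```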
Figure 3. ROC curve and calibration plot of the models in the test dataset. (A) ROC curves of the models (x-axis: specificity; y-axis: sensitivity). AUC [95% CI] of the models: random forest, 0.828 [0.792–0.856]; logistic regression with LASSO, 0.824 [0.797–0.859]; gradient-boosting tree, 0.829 [0.798–0.861]; decision tree, 0.785 [0.752–0.819]. (B) Calibration plot (x-axis: probabilities estimatedated by the machine learning models; y-axis: observed probabilities of the outcome). ROC, receiver operating characteristic; AUC, area under the curve; LASSO, least absolute shrinkage and selection operator.
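Figure 3's two panels correspond to a test-set ROC analysis with a confidence interval around the AUC, and a calibration curve. One common way to obtain both is sketched below, again reusing `fitted`, `X_test`, and `y_test` from the pipeline sketch; the bootstrap CI is an assumption, since the record does not state how the paper's 95% CIs were computed.

```python
# Sketch: test-set AUC with a bootstrap 95% CI, plus calibration-curve data.
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.calibration import calibration_curve

def auc_with_ci(y_true, y_prob, n_boot=2000, seed=0):
    rng = np.random.default_rng(seed)
    y_true, y_prob = np.asarray(y_true), np.asarray(y_prob)
    n, aucs = len(y_true), []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)          # bootstrap resample of test cases
        if len(np.unique(y_true[idx])) < 2:  # AUC needs both classes present
            continue
        aucs.append(roc_auc_score(y_true[idx], y_prob[idx]))
    lo, hi = np.percentile(aucs, [2.5, 97.5])
    return roc_auc_score(y_true, y_prob), lo, hi

y_prob = fitted["random_forest"].predict_proba(X_test)[:, 1]
auc, lo, hi = auc_with_ci(y_test, y_prob)

# Calibration plot data: observed event rate per bin of predicted probability.
frac_pos, mean_pred = calibration_curve(y_test, y_prob, n_bins=10)
```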
Figure 4. Decision curve analysis. X-axis: threshold probability at which a machine learning model classifies a worker as positive; y-axis: net benefit.
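The net benefit on Figure 4's y-axis is conventionally defined as NB(p_t) = TP/n - (FP/n) * p_t / (1 - p_t), where p_t is the threshold probability and cases with predicted probability at or above p_t count as positive. A minimal sketch, assuming the standard decision-curve formula was used and reusing `y_test` and `y_prob` from the previous sketch:

```python
# Sketch: decision-curve net benefit across threshold probabilities (Figure 4).
import numpy as np

def net_benefit(y_true, y_prob, thresholds):
    """NB(p_t) = TP/n - (FP/n) * p_t / (1 - p_t), treating predicted
    probabilities >= p_t as positive predictions."""
    y_true, y_prob = np.asarray(y_true), np.asarray(y_prob)
    n, out = len(y_true), []
    for pt in thresholds:
        pred = y_prob >= pt
        tp = np.sum(pred & (y_true == 1))
        fp = np.sum(pred & (y_true == 0))
        out.append(tp / n - (fp / n) * pt / (1 - pt))
    return np.array(out)

thresholds = np.linspace(0.05, 0.60, 12)   # illustrative threshold range
nb = net_benefit(y_test, y_prob, thresholds)
```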