Benson Kung, Maurice Chiang, Gayan Perera, Megan Pritchard, Robert Stewart.
Abstract
OBJECTIVES: This study evaluated latent Dirichlet allocation (LDA), an unsupervised machine learning method, for identifying subtypes of depression within symptom data.
Keywords: Depression; Machine Learning; Medical Informatics; Mental Health; Psychiatry
Year: 2022 PMID: 35982600 PMCID: PMC9388921 DOI: 10.4258/hir.2022.28.3.256
Source DB: PubMed Journal: Healthc Inform Res ISSN: 2093-3681
Demographic information of the latent Dirichlet allocation groups
| | Full sample | Mild groups | Psychotic | Severe | Mild | Agitated | Anergic-apathetic |
|---|---|---|---|---|---|---|---|
| Total sample | 18,314 | 12,115 | 3,059 | 3,140 | 4,844 | 4,291 | 2,980 |
| Sex | | | | | | | |
| Female | 11,377 (62.1) | 7,825 (64.6) | 1,703 (55.7) | 1,849 (58.9) | 3,441 (71.0) | 2,500 (58.3) | 1,884 (63.2) |
| Male | 6,926 (37.8) | 4,283 (35.4) | 1,353 (44.2) | 1,290 (41.1) | 1,401 (28.9) | 1,789 (41.7) | 1,093 (36.7) |
| Race | | | | | | | |
| Asian | 915 (5.0) | 573 (4.7) | 191 (6.2) | 151 (4.8) | 227 (4.7) | 218 (5.1) | 128 (4.3) |
| Black | 2,728 (14.9) | 1,709 (14.1) | 571 (18.7) | 448 (14.3) | 670 (13.8) | 603 (14.1) | 436 (14.6) |
| Mixed | 400 (2.2) | 274 (2.3) | 64 (2.1) | 62 (2.0) | 111 (2.3) | 95 (2.2) | 68 (2.3) |
| Other | 1,833 (10) | 1,236 (10.2) | 292 (9.5) | 305 (9.7) | 506 (10.4) | 449 (10.5) | 281 (9.4) |
| White | 10,458 (57.1) | 6,956 (57.4) | 1,653 (54.0) | 1,849 (58.9) | 2,787 (57.5) | 2,449 (57.1) | 1,720 (57.7) |
| Ethnicity missing | 1,980 (10.8) | 1,367 (11.3) | 288 (9.4) | 325 (10.4) | 543 (11.2) | 477 (11.1) | 347 (11.6) |
| Age (yr) | | | | | | | |
| <18 | 2,352 (12.8) | 1,750 (14.4) | 257 (8.4) | 345 (11.0) | 772 (15.9) | 664 (15.5) | 314 (10.5) |
| 18–34 | 5,951 (32.5) | 3,954 (32.6) | 965 (31.5) | 1,032 (32.9) | 1,580 (32.6) | 1,289 (30.0) | 1,085 (36.4) |
| 35–49 | 4,513 (24.6) | 2,923 (24.1) | 757 (24.7) | 833 (26.5) | 1,175 (24.3) | 1,033 (24.1) | 715 (24) |
| 50–64 | 2,561 (14) | 1,576 (13) | 505 (16.5) | 480 (15.3) | 620 (12.8) | 590 (13.7) | 366 (12.3) |
| ≥65 | 2,934 (16) | 1,910 (15.8) | 575 (18.8) | 449 (14.3) | 696 (14.4) | 714 (16.6) | 500 (16.8) |
| Deprivation score | 25.1 ± 10.2 | 25.1 ± 10.3 | 25.4 ± 10.1 | 24.8 ± 10.2 | 25.0 ± 10.0 | 25.2 ± 10.4 | 25.2 ± 10.2 |
Values are presented as number (%) or mean ± standard deviation.
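The relationship between the columns of the LDA demographics table can be checked with a few lines of arithmetic. The sketch below is illustrative only: the counts are copied from the table, while the variable names, the `percent` helper, and the reading of "Mild groups" as the pooled Mild, Agitated, and Anergic-apathetic subtypes are assumptions inferred from the totals rather than statements from the authors.

```python
# Group sizes copied from the "Total sample" row of the LDA demographics table.
lda_sizes = {
    "Psychotic": 3059,
    "Severe": 3140,
    "Mild": 4844,
    "Agitated": 4291,
    "Anergic-apathetic": 2980,
}
full_sample = 18314
mild_groups_reported = 12115

# Assumption: "Mild groups" pools the three milder subtypes.
mild_groups = sum(
    lda_sizes[name] for name in ("Mild", "Agitated", "Anergic-apathetic")
)


def percent(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Percentages are taken within each column, e.g. females in the full sample
# and females in the Mild subtype.
female_full = percent(11377, full_sample)
female_mild = percent(3441, lda_sizes["Mild"])

print(mild_groups == mild_groups_reported)
print(sum(lda_sizes.values()) == full_sample)
print(female_full, female_mild)
```

Under these assumptions the five subtype sizes add up to the full sample, and the percentages in each cell are column percentages (count divided by that column's group size).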
Demographic information of the latent class analysis groups
| | Full sample (n = 18,314) | Psychotic (n = 987) | Severe (n = 1,596) | Moderate (n = 6,063) | Mild (n = 9,668) |
|---|---|---|---|---|---|
| Sex | | | | | |
| Female | 11,377 (62.1) | 544 (55.1) | 896 (56.1) | 3,729 (61.5) | 6,208 (64.2) |
| Male | 6,926 (37.8) | 443 (44.9) | 700 (43.9) | 2,332 (38.5) | 3,451 (35.7) |
| Race | | | | | |
| Asian | 915 (5) | 83 (8.4) | 92 (5.8) | 298 (4.9) | 442 (4.6) |
| Black | 2,728 (14.9) | 244 (24.7) | 241 (15.1) | 867 (14.3) | 1,376 (14.2) |
| Mixed | 400 (2.2) | 12 (1.2) | 35 (2.2) | 119 (2) | 234 (2.4) |
| Other | 1,833 (10) | 76 (7.7) | 137 (8.6) | 589 (9.7) | 1,031 (10.7) |
| White | 10,458 (57.1) | 496 (50.3) | 987 (61.8) | 3,605 (59.5) | 5,370 (55.5) |
| Ethnicity missing | 1,980 (10.8) | 76 (7.7) | 104 (6.5) | 585 (9.6) | 1,215 (12.6) |
| Age (yr) | | | | | |
| <18 | 2,352 (12.8) | 58 (5.9) | 225 (14.1) | 751 (12.4) | 1,318 (13.6) |
| 18–34 | 5,951 (32.5) | 316 (32) | 542 (34) | 2,044 (33.7) | 3,049 (31.5) |
| 35–49 | 4,513 (24.6) | 252 (25.5) | 401 (25.1) | 1,505 (24.8) | 2,355 (24.4) |
| 50–64 | 2,561 (14) | 191 (19.4) | 249 (15.6) | 796 (13.1) | 1,325 (13.7) |
| ≥65 | 2,934 (16) | 170 (17.2) | 179 (11.2) | 966 (15.9) | 1,619 (16.7) |
| Deprivation score | 25.1 ± 10.2 | 25.8 ± 10.2 | 25.7 ± 10.1 | 24.9 ± 10.2 | 25.1 ± 10.1 |
Values are presented as number (%) or mean ± standard deviation.
Figure 1. Five-topic latent Dirichlet allocation (LDA) symptom distribution. Column colors represent individual subtypes. Symptoms were included here if they were one of the two most common symptoms for a subtype. The red column corresponds to the “Severe” group, blue to “Psychotic,” yellow to “Mild,” green to “Agitated,” and pink to “Anergic-apathetic.”
Figure 2. Three-class latent class analysis (LCA) symptom likelihoods. Column colors represent individual subtypes. The top 10 most common symptoms in the dataset were included here. The red and yellow columns can be viewed as severe subtypes, where the latter is distinguished by psychotic features. The blue column, overall, forms a mild subtype.
Figure 3. Four-class LCA symptom likelihoods. Column colors represent individual subtypes. The top 10 most common symptoms in the dataset were included here. The red column corresponds to the “Severe” group, blue to “Psychotic,” yellow to “Moderate,” and green to “Mild.”
Odds ratios (ORs) for crisis events and emergency presentations
| Model | Outcome | Statistic | Psychotic | Severe | Mild | Agitated | Anergic |
|---|---|---|---|---|---|---|---|
| LDA | Emergency presentations | OR (95% CI) | 1.29 (1.17–1.43) | 1.16 (1.05–1.29) | 0.86 (0.78–0.94) | 0.83 (0.75–0.92) | 1.01 (0.91–1.13) |
| | | p-value | <0.001* | 0.01* | <0.001* | <0.001* | 0.83 |
| | Crisis events | OR (95% CI) | 2.45 (2.15–2.80) | 1.14 (0.98–1.33) | 0.49 (0.41–0.57) | 0.96 (0.86–1.13) | 0.64 (0.54–0.78) |
| | | p-value | <0.001* | 0.08 | <0.001* | 0.82 | <0.001* |

| Model | Outcome | Statistic | Psychotic | Severe | Moderate | Mild |
|---|---|---|---|---|---|---|
| LCA | Emergency presentations | OR (95% CI) | 4.16 (3.50–4.95) | 5.26 (4.58–6.05) | 0.84 (0.74–0.95) | 0.27 (0.23–0.31) |
| | | p-value | <0.001* | <0.001* | <0.001* | <0.001* |
| | Crisis events | OR (95% CI) | 1.32 (1.12–1.56) | 1.62 (1.43–1.84) | 1.12 (1.03–1.22) | 0.71 (0.65–0.77) |
| | | p-value | <0.001* | <0.001* | <0.001* | <0.001* |

Adjusted for age, gender, ethnicity, and index of multiple deprivation score.
LDA: latent Dirichlet allocation, LCA: latent class analysis, CI: confidence interval.
*p < 0.05.
HoNOS problems in the LDA patient groups
| Scale | Total (n = 18,314) | Psychotic (n = 3,059) | Severe (n = 3,140) | Mild (n = 4,844) | Agitated (n = 4,291) | Anergic (n = 2,980) | p-value |
|---|---|---|---|---|---|---|---|
| Agitation | 1,397 (7.6) | 442 (14.4) | 180 (5.7) | 282 (5.8) | 358 (8.3) | 135 (4.5) | <0.001 |
| Self-injury | 2,624 (14.3) | 490 (16.0) | 612 (19.5) | 561 (11.6) | 623 (14.5) | 338 (11.3) | <0.001 |
| Drug misuse | 1,403 (7.7) | 290 (9.5) | 261 (8.3) | 327 (6.8) | 329 (7.7) | 196 (6.6) | 0.01 |
| Cognition | 1,328 (7.3) | 364 (11.9) | 193 (6.1) | 286 (5.9) | 289 (6.7) | 196 (6.6) | <0.001 |
| Physical illness | 3,846 (21.0) | 693 (22.7) | 696 (22.2) | 954 (19.7) | 890 (20.7) | 613 (20.6) | 0.06 |
| Hallucinations | 1,178 (6.4) | 699 (22.9) | 119 (3.8) | 94 (1.9) | 179 (4.2) | 87 (2.9) | <0.001 |
| Depressed | 9,063 (49.5) | 1,634 (53.4) | 1,616 (51.5) | 2,243 (46.3) | 2,033 (47.4) | 1,537 (51.6) | <0.001 |
| Relationship | 3,685 (20.1) | 709 (23.2) | 691 (22.0) | 925 (19.1) | 822 (19.2) | 538 (18.1) | <0.001 |
| Daily living | 3,130 (17.1) | 635 (20.8) | 553 (17.6) | 689 (14.2) | 726 (16.9) | 527 (17.7) | <0.001 |
| Living conditions | 1,714 (9.4) | 391 (12.8) | 355 (11.3) | 363 (7.5) | 347 (8.1) | 258 (8.7) | <0.001 |
| Occupational | 3,304 (18) | 676 (22.1) | 619 (19.7) | 728 (15.0) | 750 (17.5) | 531 (17.8) | <0.001 |
| HoNOS missing | 10,704 (58.4) | 2,027 (66.3) | 1,798 (57.3) | 2,680 (55.3) | 244 (57) | 1,751 (58.8) | <0.001 |
Values are presented as number (%).
HoNOS: Health of the Nation Outcome Scales, LDA: latent Dirichlet allocation.
Chi-squared test with 4 degrees-of-freedom.
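As an illustration of how a chi-squared test with 4 degrees of freedom arises from five groups and a yes/no outcome, the sketch below recomputes a Pearson chi-squared statistic for the "Agitation" row. It is a minimal sketch, not the authors' analysis: it assumes the second category for each group is simply the group size minus the reported count (which lumps patients with missing HoNOS data together with those without the problem), and all names in the code are hypothetical.

```python
import math

# Counts copied from the "Agitation" row and group sizes from the table header.
groups = ["Psychotic", "Severe", "Mild", "Agitated", "Anergic"]
with_problem = [442, 180, 282, 358, 135]
group_sizes = [3059, 3140, 4844, 4291, 2980]

# Assumption: everyone not counted as having the problem forms the second column.
without_problem = [n - k for n, k in zip(group_sizes, with_problem)]

total = sum(group_sizes)
overall_rate = sum(with_problem) / total

# Pearson chi-squared statistic over the 5 x 2 contingency table.
stat = 0.0
for n, yes, no in zip(group_sizes, with_problem, without_problem):
    expected_yes = n * overall_rate
    expected_no = n * (1 - overall_rate)
    stat += (yes - expected_yes) ** 2 / expected_yes
    stat += (no - expected_no) ** 2 / expected_no

# Degrees of freedom: (rows - 1) * (columns - 1) = (5 - 1) * (2 - 1).
df = (len(groups) - 1) * (2 - 1)

# For 4 degrees of freedom the chi-squared survival function has the
# closed form exp(-x / 2) * (1 + x / 2), so no statistics library is needed.
p_value = math.exp(-stat / 2) * (1 + stat / 2)

print(f"chi-squared = {stat:.1f}, df = {df}, p < 0.001: {p_value < 0.001}")
```

With five groups and a binary outcome the table has (5 − 1) × (2 − 1) = 4 degrees of freedom, matching the footnote, and under the stated assumption the resulting p-value falls below 0.001, in line with the value reported for this row.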
HoNOS problems in the LCA patient groups
| Scale | Total (n = 18,314) | Psychotic (n = 987) | Severe (n = 1,596) | Moderate (n = 6,063) | Mild (n = 9,668) | p-value |
|---|---|---|---|---|---|---|
| Agitation | 1,397 (7.6) | 242 (24.5) | 245 (15.4) | 426 (7) | 484 (5) | <0.0001 |
| Self-injury | 2,624 (14.3) | 195 (19.8) | 619 (38.8) | 1,043 (17.2) | 767 (7.9) | <0.0001 |
| Drug misuse | 1,403 (7.7) | 95 (9.6) | 266 (16.7) | 490 (8.1) | 552 (5.7) | <0.0001 |
| Cognition | 1,328 (7.3) | 197 (20) | 126 (7.9) | 413 (6.8) | 592 (6.1) | <0.0001 |
| Physical illness | 3,846 (21.0) | 210 (21.3) | 333 (20.9) | 1,279 (21.1) | 2,024 (20.9) | <0.0001 |
| Hallucinations | 1,178 (6.4) | 401 (40.6) | 216 (13.5) | 251 (4.1) | 310 (3.2) | <0.0001 |
| Depressed | 9,063 (49.5) | 599 (60.7) | 1,137 (71.2) | 3,170 (52.3) | 4,157 (43) | <0.0001 |
| Relationship | 3,685 (20.1) | 274 (27.8) | 519 (32.5) | 1,268 (20.9) | 1,624 (16.8) | <0.0001 |
| Daily living | 3,130 (17.1) | 257 (26) | 330 (20.7) | 1,072 (17.7) | 1,471 (15.2) | <0.0001 |
| Living conditions | 1,714 (9.4) | 153 (15.5) | 236 (14.8) | 598 (9.9) | 727 (7.5) | <0.0001 |
| Occupational | 3,304 (18) | 255 (25.8) | 446 (27.9) | 1,118 (18.4) | 1,485 (15.4) | <0.0001 |
| HoNOS missing | 10,704 (58.4) | 233 (23.6) | 369 (23.1) | 2,530 (41.7) | 4,490 (46.4) | <0.0001 |
Values are presented as number (%).
HoNOS: Health of the Nation Outcome Scales, LCA: latent class analysis.
Chi-squared test with 4 degrees-of-freedom.
Figure 4. Symptom likelihoods for the latent Dirichlet allocation (LDA) patient groups. Symptoms were included here if they were among the top 10 most common symptoms and among the top two symptoms in an LDA subtype.
Figure 5. Symptom likelihoods for the latent class analysis (LCA) patient groups. Symptoms were included here if they were among the top 10 most common symptoms and among the top two symptoms in a latent Dirichlet allocation (LDA) subtype.