David Lin, Tahmida Nazreen, Tomasz Rutowski, Yang Lu, Amir Harati, Elizabeth Shriberg, Piotr Chlebek, Michael Aratow.
Abstract
Background: Depression and anxiety create a large health burden and increase the risk of premature mortality. Mental health screening is vital, but more sophisticated screening and monitoring methods are needed. The Ellipsis Health App addresses this need by using semantic information from recorded speech to screen for depression and anxiety.
Keywords: NLP; artificial intelligence; behavioral health monitoring; biomarkers; machine learning; mental health screening; smartphone; speech
Year: 2022 PMID: 35478769 PMCID: PMC9037748 DOI: 10.3389/fpsyg.2022.811517
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
Figure 1. Topic choice for Ellipsis Health App participants.
Figure 2. Question prompt for Ellipsis Health App participants.
Figure 3. Overall study workflow.
Study completion survey.
| # | Question | Response scale |
|---|---|---|
| 1 | Please rate how easy the study was for you? | 1 – Very Easy; 3 – Okay; 5 – Very Difficult |
| 2 | If you encountered any difficulty over the six sessions performing the study, please describe them. | |
| 3 | Would you do a single session once per year while waiting at your doctor’s office? Please answer yes or no. | |
| 4 | Did you feel the study felt repetitive after several sessions? | 1 – Very Repetitive; 3 – It was fine; 5 – Look forward to doing it |
| 5 | Did you feel the compensation was appropriate for your participation? | 1 – Not enough; 3 – About right; 5 – Too much |
| 6 | Would you be interested in improving behavioral health today by partnering with Ellipsis Health? Your interaction with new products and feedback regarding emotional assessment is extremely valuable. | |
| 7 | Please share any other comments and suggestions about participating in this study? | |
Demographic characteristics of Desert Oasis Healthcare study participants, Group Positive compared with Group Negative: Chi-square and t-tests.
| Variable | Value of p | Statistical test |
|---|---|---|
| Race: White, non-white | 0.66 | Chi-square |
| Gender: Male, Female | 0.46 | Chi-square |
| Age: <60, 60–70, >70 | 0.02 | Chi-square |
| Age: mean | 0.01 | t-test |
| Sessions: <6, 6, >6 | 0.11 | Chi-square |
| Sessions: mean | 0.87 | t-test |
Removed responses for those who declined to answer and/or had erroneous race categories.
Demographics and session duration/frequency – ANOVA, Welch’s ANOVA and Chi-squared tests with initial PHQ/GAD scores and survey answers in Group Positive participants.
| Categories | PHQ: mean scores (value of p) | GAD: mean scores (value of p) | Mean recording duration (value of p) | Average number of sessions (value of p) | Survey question #1: how easy mean (value of p) | Survey question #3: annual survey at doctor’s office (value of p) | Survey question #4: how repetitive mean (value of p) | Survey question #5: compensation mean (value of p) |
|---|---|---|---|---|---|---|---|---|
| Male | 0.81 | 0.85 | 0.71 | 0.47 | 0.019 | 0.79 | 0.79 | 0.66 |
| <60 | 0.004 | <0.00 | 0.24 | 0.49 | 0.75 | 0.12 | 0.31 | 0.57 |
| White | 0.69 | 0.42 | 0.98 | 0.20 | – | – | – | – |
| <6, 6, >6 | 0.028 | 0.11 | 0.67 | – | 0.23 | 0.87 | 0.69 | 0.35 |
Welch’s ANOVA, (−) No results shown due to small sample size.
Age, number of completed sessions and post-hoc Gender Pairwise t-test on statistically significant differences determined by ANOVA with initial PHQ/GAD scores and survey questions of Group Positive participants.
| Category 1 | Category 2 | Value of p (PHQ) | Value of p (GAD) |
|---|---|---|---|
| <60 | 60–70 | 0.006 | <0.001 |
| 60–70 | >70 | 0.34 | 0.10 |
| <60 | >70 | 0.001 | <0.001 |
| <6 sessions | 6 sessions | 0.48 | – |
| 6 sessions | >6 sessions | 0.16 | – |
| <6 sessions | >6 sessions | 0.036 | – |
| | | Value of p (survey question #1: how easy) | |
| Male | Female | 0.019 | |
Demographic characteristics of Desert Oasis Healthcare study participants for Group Positive (those with a history of depression) and Group Negative (those without).
| Variable | Group Positive | Group Negative |
|---|---|---|
| Black | 8 | 1 |
| White | 151 | 10 |
| Other | 6 | 1 |
| Declined to answer | 60 | 10 |
| Male | 94 | 11 |
| Female | 144 | 11 |
| <60 | 56 | 11 |
| 60–70 | 125 | 9 |
| >70 | 57 | 2 |
| Mean age (SD) | 64 (9.8) | 56 (12.5) |
| <6 sessions | 93 | 9 |
| 6 sessions | 83 | 12 |
| >6 sessions | 64 | 2 |
| Mean number of sessions (SD) | 5.0 (2.4) | 4.9 (1.8) |
Group sums less than the total amount per group reflect cases in which demographic data was not self-reported.
T-test reveals significance, p ≤ 0.05.
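As a rough illustration of how a 2×2 Chi-square comparison such as Male/Female by group could be computed from the counts in the table above, the following Python sketch uses only the standard library. The function name `chi_square_2x2` and the use of Yates' continuity correction are assumptions made for this example; the source does not state which Chi-square variant or software the authors used.

```python
import math

def chi_square_2x2(table, yates=True):
    """Chi-square test of independence for a 2x2 table of counts.

    `table` is [[a, b], [c, d]]. Returns (statistic, p_value).
    Yates' continuity correction is applied when `yates` is True
    (an assumption; the source does not say whether it was used).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    chi2 = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                diff = max(0.0, diff - 0.5)
            chi2 += diff ** 2 / expected

    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Rows: Male, Female. Columns: Group Positive, Group Negative.
GENDER_COUNTS = [[94, 11], [144, 11]]

chi2, p_value = chi_square_2x2(GENDER_COUNTS)
print(f"chi2 = {chi2:.3f}, p = {p_value:.2f}")
```

With the continuity correction enabled, the gender counts give a p-value of roughly 0.46, in line with the value reported for Male, Female. This is only a consistency check on the published counts, not a statement of the authors' procedure.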
Demographics, initial PHQ/GAD scores, and session duration/average number for Group Positive participants.
| Categories | Number (%) | PHQ: mean scores (SD) | GAD: Mean scores (SD) | Mean recording duration (SD) | Average number of sessions (SD) |
|---|---|---|---|---|---|
| All participants | 240 (100%) | 7.5 (5.1) | 5.8 (4.8) | 245.5 (106.2) | 5.0 (1.0) |
| <6 | 93 (38.8%) | 8.5 (−) | 6.4 (−) | 239.0 (−) | – |
| 6 | 83 (34.6%) | 7.3 (2.5) | 5.9 (2.2) | 253.2 (62.0) | – |
| >6 (Mean = 7.4, SD = 1.2) | 64 (26.7%) | 6.2 (2.3) | 4.8 (2.0) | 244.7 (58.2) | – |
| Male | 94 (39.2%) | 7.4 (2.2) | 5.8 (1.8) | 242.3 (60.2) | 4.8 (2.6) |
| Female | 144 (60.0%) | 7.6 (2.6) | 5.9 (2.3) | 247.4 (60.5) | 5.1 (2.6) |
| <60 | 56 (23.3%) | 9.8 (2.5) | 8.9 (2.4) | 256.0 (54.5) | 5.2 (2.0) |
| 60–70 | 125 (52.1%) | 7.1 (2.4) | 5.2 (2.2) | 249.6 (62.8) | 5.0 (2.6) |
| >70 | 57 (23.8%) | 6.3 (2.5) | 4.2 (1.7) | 225.5 (61.3) | 4.7 (2.3) |
| White | 151 (62.9%) | 7.3 (2.4) | 5.6 (2.1) | 253.7 (57.9) | 5.1 (2.3) |
| Non-white | 14 (5.8%) | 7.9 (2.4) | 6.6 (2.8) | 252.8 (44.6) | 4.3 (3.0) |
SD (Standard Deviation) not calculated for groups with fewer than 6 total sessions; post hoc tests showed non-significant differences unless otherwise specified.
ANOVA reveals significant difference, p ≤ 0.05.
Group sums less than the total amount per group reflect cases in which demographic data was not self-reported.
Welch ANOVA reveals significant difference, p ≤ 0.05.
Removed responses for those who declined to answer and/or had erroneous race categories.
Demographics and survey answers (see Table 1) in Group Positive participants (N = 103).
| Categories | Survey Question #1: How Easy Mean (SD) | Survey Question #3: Annual survey at doctor’s office (%) | Survey Question #4: How Repetitive Mean (SD) | Survey Question #5: Compensation Mean (SD) |
|---|---|---|---|---|
| All participants | 2.1 (1.3) | 60.2% | 2.2 (1.2) | 2.8 (0.8) |
| Male | 1.8 (0.7) | 63.1% | 2.1 (1.1) | 2.8 (0.7) |
| Female | 2.3 (0.9) | 58.5% | 2.2 (1.3) | 2.8 (0.9) |
| <60 | 2.3 (1.2) | 45.5% | 1.9 (1.2) | 2.6 (0.7) |
| 60–70 | 2.0 (1.2) | 69.1% | 2.3 (1.2) | 2.9 (0.8) |
| >70 | 2.2 (1.4) | 53.8% | 2.0 (1.4) | 2.8 (0.8) |
| White | 2.2 (1.3) | 56.5% | 3.1 (1.2) | 2.9 (0.8) |
| Non-white | – | – | – | – |
| <6 (Mean = 2.4, SD = 1.4) | 2.8 (1.3) | 50.0% | 2.5 (1.5) | 2.3 (1.0) |
| 6 | 2.0 (1.2) | 61.1% | 2.1 (1.1) | 2.8 (0.7) |
| >6 (Mean = 7.4, SD = 1.2) | 2.2 (1.3) | 60.5% | 2.2 (1.4) | 2.8 (0.9) |
Mean response from Likert scale of 1–5.
% ‘Yes’ response.
ANOVA reveals significant difference, p ≤ 0.05.
No results given due to small sample size.
Semantic algorithm performance for initial PHQ and GAD scores.
| Measure | AUC | EER | PPV | NPV |
|---|---|---|---|---|
| Initial PHQ | 0.82 (0.80, 0.85) | 0.75 | 0.54 | 0.88 |
| Initial GAD | 0.82 (0.79, 0.85) | 0.75 | 0.45 | 0.91 |
| Initial PHQ | 0.83 (0.80, 0.85) | 0.75 | 0.54 | 0.89 |
| Initial GAD | 0.83 (0.80, 0.86) | 0.75 | 0.44 | 0.92 |
AUC (Area Under Curve).
EER (Equal Error Rate).
PPV (Positive Predictive Value).
NPV (Negative Predictive Value).
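To make the four measures defined above concrete, here is a minimal Python sketch that computes them for a binary screener from model scores and reference labels. All function names and the toy data are hypothetical and chosen for illustration; this is not the authors' evaluation code, and the threshold convention (a score at or above the threshold counts as a positive screen) is an assumption.

```python
def confusion(scores, labels, threshold):
    """Count (tp, fp, tn, fn), treating score >= threshold as a positive screen."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def equal_error_rate(scores, labels):
    """Return (eer, threshold) where false positive and false negative rates are closest."""
    best = None
    for threshold in sorted(set(scores)):
        tp, fp, tn, fn = confusion(scores, labels, threshold)
        fpr = fp / (fp + tn)
        fnr = fn / (fn + tp)
        gap = abs(fpr - fnr)
        if best is None or gap < best[0]:
            best = (gap, (fpr + fnr) / 2.0, threshold)
    return best[1], best[2]

def ppv_npv(scores, labels, threshold):
    """Positive and negative predictive value at a given threshold."""
    tp, fp, tn, fn = confusion(scores, labels, threshold)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv

# Toy data (hypothetical): model scores and reference labels (1 = above the screening cutoff).
TOY_SCORES = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
TOY_LABELS = [1, 1, 0, 1, 0, 1, 0, 0]

eer, eer_threshold = equal_error_rate(TOY_SCORES, TOY_LABELS)
ppv, npv = ppv_npv(TOY_SCORES, TOY_LABELS, eer_threshold)
print(auc(TOY_SCORES, TOY_LABELS), eer, eer_threshold, ppv, npv)
```

PPV and NPV depend on how common the condition is in the evaluated sample, so values computed this way are only comparable between samples with similar prevalence.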
Figure 4. PHQ AUC performance comparing LSTM with Transformer methodology for both GP and GN.
Figure 5. GAD AUC performance comparing LSTM with Transformer methodology for both GP and GN.
Figure 6. Sensitivity vs. specificity in relation to recording length cutoffs for PHQ-8.
Figure 7. Sensitivity vs. specificity in relation to recording length cutoffs for GAD-7.