Yasunori Yamada, Kaoru Shinkawa, Miyuki Nemoto, Tetsuaki Arai.
Abstract
Loneliness is a perceived state of social and emotional isolation that has been associated with a wide range of adverse health effects in older adults. Automatically assessing loneliness by passively monitoring daily behaviors could potentially contribute to early detection and intervention for mitigating loneliness. Speech data has been successfully used for inferring changes in emotional states and mental health conditions, but its association with loneliness in older adults remains unexplored. In this study, we developed a tablet-based application and collected speech responses of 57 older adults to daily life questions regarding, for example, one's feelings and future travel plans. From audio data of these speech responses, we automatically extracted speech features characterizing acoustic, prosodic, and linguistic aspects, and investigated their associations with self-rated scores of the UCLA Loneliness Scale. We found that with increasing loneliness scores, speech responses tended to have fewer inflections, longer pauses, reduced second formant frequencies, reduced variances of the speech spectrum, more filler words, and fewer positive words. The cross-validation results showed that regression and binary-classification models using speech features could estimate loneliness scores with an R² of 0.57 and detect individuals with high loneliness scores with 95.6% accuracy, respectively. Our study provides the first empirical results suggesting the possibility of using speech data that can be collected in everyday life for the automatic assessment of loneliness in older adults, which could help develop monitoring technologies for early detection and intervention for mitigating loneliness.
Keywords: health-monitoring; mental health; social connectedness; speech analysis and processing; voice
Year: 2021 PMID: 34966297 PMCID: PMC8710612 DOI: 10.3389/fpsyt.2021.712251
Source DB: PubMed Journal: Front Psychiatry ISSN: 1664-0640 Impact factor: 4.157
Characteristics of study participants (N=57).
| Characteristic | Value | (SD) or (%) |
|---|---|---|
| Age [years], mean (SD) | 73.2 | (4.5) |
| Sex, n (%) | ||
| Men | 27 | (47.4) |
| Women | 30 | (52.6) |
| Education [years], mean (SD) | 13.8 | (2.2) |
| Marital status, n (%) | ||
| Never married | 0 | (0) |
| Divorced | 2 | (3.5) |
| Widowed | 8 | (14.0) |
| Married | 47 | (82.5) |
| Mini-Mental State Examination | 27.4 | (1.9) |
| Geriatric Depression Scale | 2.9 | (2.5) |
| UCLA Loneliness Score | 37 | (8.6) |
Mini-Mental State Examination: the total possible score ranges from 0 to 30.
Geriatric Depression Scale: the total possible score ranges from 0 to 15.
UCLA Loneliness Scale: the total possible score ranges from 20 to 80.
Figure 1. Overview of experimental setup for collecting speech data. (A) Participant's turn and (B) tablet's turn.
Figure 2. Overview of automatic analysis pipeline for estimating loneliness scores and for detecting individuals with high loneliness scores from speech responses to daily life questions.
Figure 3. Histogram of scores of the UCLA Loneliness Scale for study participants. The cut-off score was set to the mean + 1 SD of our participants' scores, which gave 46 points. In our sample, 10 older adults (18% of the participants) scored equal to or greater than the cut-off score.
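The cut-off arithmetic can be checked against the participant table (mean 37, SD 8.6). The following is a minimal sketch, not the authors' code; the function name is hypothetical, and rounding the result to the nearest whole point is an assumption made here to match the reported 46 points.

```python
def high_loneliness_cutoff(mean_score, sd_score):
    """Cut-off defined as mean + 1 SD of the sample's scores.

    Rounding to the nearest integer is an assumption; the source
    only states that the resulting cut-off was 46 points.
    """
    return round(mean_score + sd_score)


# Mean and SD of the UCLA Loneliness Scale scores from the participant table
cutoff = high_loneliness_cutoff(37, 8.6)

# Share of participants at or above the cut-off: 10 of 57
share_percent = round(10 / 57 * 100)

print(cutoff, share_percent)
```

Under these assumptions the sketch gives a cut-off of 46 and a share of 18%, matching the caption.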
Figure 4. Analysis results of the associations of speech responses to eight daily life questions with scores of the UCLA Loneliness Scale. (A) Examples of speech features correlated with loneliness scores (Spearman correlation; *P < 0.05 and **P < 0.01). (B) Regression performances of the models using speech features for estimating loneliness scores. (C) Actual and predicted loneliness scores by the regression model using acoustic, prosodic, and linguistic features. (D) Confusion matrix of the binary-classification model using acoustic, prosodic, and linguistic features for detecting individuals with high loneliness scores. The confusion matrix was obtained using 20 iterations of 10-fold cross-validation. The number in parentheses indicates the mean number of participants across the 20 iterations.
Performance of regression models using speech features to predict loneliness scores, obtained from 20 iterations of 10-fold cross-validation.
| Speech features | R² | EV | MAE | RMSE |
|---|---|---|---|---|
| (P) Prosodic | 0.219 [0.177, 0.261] | 0.227 [0.185, 0.268] | 5.96 [5.79, 6.12] | 7.56 [7.36, 7.76] |
| (A) Acoustic | 0.442 [0.424, 0.459] | 0.442 [0.425, 0.460] | 4.86 [4.78, 4.93] | 6.40 [6.30, 6.50] |
| (L) Linguistic | 0.483 [0.467, 0.500] | 0.484 [0.468, 0.501] | 4.75 [4.66, 4.83] | 6.16 [6.06, 6.26] |
| (P) + (A) + (L) | 0.568 [0.550, 0.586] | 0.570 [0.553, 0.587] | 4.46 [4.36, 4.57] | 5.63 [5.51, 5.74] |
Each value indicates the average across 20 iterations with its 95% confidence interval. EV, explained variance; MAE, mean absolute error; RMSE, root mean square error.
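For readers unfamiliar with the four regression metrics, the sketch below computes them from their standard definitions. It is illustrative only: the function name and the toy scores are hypothetical and are not data from the study. It also shows why EV can never be smaller than R²: EV ignores any constant bias in the predictions, whereas R² penalizes it.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, explained variance (EV), MAE, and RMSE."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    mean_res = sum(residuals) / n

    ss_tot = sum((t - mean_true) ** 2 for t in y_true)       # total sum of squares
    ss_res = sum(r ** 2 for r in residuals)                   # residual sum of squares
    ss_res_centered = sum((r - mean_res) ** 2 for r in residuals)  # bias removed

    return {
        "r2": 1 - ss_res / ss_tot,
        "ev": 1 - ss_res_centered / ss_tot,
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": math.sqrt(ss_res / n),
    }


# Hypothetical toy scores for illustration, NOT data from the study
y_true = [30, 35, 40, 45, 50]
y_pred = [32, 36, 41, 44, 53]

metrics = regression_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In the table above, the small gap between the two leading columns (for example 0.568 versus 0.570 for the combined model) suggests that the predictions carried little systematic bias.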
Performance of classification models using speech features to detect individuals with high loneliness scores, obtained from 20 iterations of 10-fold cross-validation.
| Speech features | Accuracy (%) | Sensitivity (%) | Specificity (%) | F1 score (%) |
|---|---|---|---|---|
| (P) Prosodic | 87.7 [87.7, 87.7] | 30.0 [30.0, 30.0] | 100.0 [100.0, 100.0] | 46.2 [46.2, 46.2] |
| (L) Linguistic | 91.0 [90.7, 91.3] | 60.0 [60.0, 60.0] | 97.6 [97.2, 97.9] | 70.0 [69.3, 70.7] |
| (A) Acoustic | 92.7 [92.2, 93.3] | 70.0 [70.0, 70.0] | 97.6 [96.9, 98.2] | 77.2 [75.9, 78.5] |
| (P) + (L) + (A) | 95.6 [95.0, 96.2] | 90.0 [90.0, 90.0] | 96.8 [96.1, 97.6] | 87.9 [86.4, 89.4] |
Each value indicates the average value across 20 iterations with 95% confidence interval.
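The four values in each row can be related through a confusion matrix. As a worked check, the sketch below uses counts inferred for the prosodic-only model: with 10 of the 57 participants in the high-loneliness group (Figure 3), a sensitivity of 30% and a specificity of 100% imply 3 true positives, 7 false negatives, 47 true negatives, and 0 false positives. These counts are a reconstruction from the reported percentages, not figures given in the source, and the function name is hypothetical. Treating the last column as an F1 score is likewise an inference that the arithmetic supports.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, specificity, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)          # recall on the high-loneliness group
    specificity = tn / (tn + fp)          # recall on the remaining participants
    f1 = 2 * tp / (2 * tp + fp + fn)      # harmonic mean of precision and recall
    return accuracy, sensitivity, specificity, f1


# Counts inferred from the prosodic row (assumption, see the text above)
acc, sens, spec, f1 = classification_metrics(tp=3, fn=7, tn=47, fp=0)

print(f"accuracy={acc * 100:.1f}  sensitivity={sens * 100:.1f}  "
      f"specificity={spec * 100:.1f}  f1={f1 * 100:.1f}")
```

With these counts the sketch gives 87.7, 30.0, 100.0, and 46.2, which agrees with the prosodic row. Because only 10 of 57 participants fall in the high-loneliness group, accuracy alone can look strong even when sensitivity is low, so the sensitivity and F1 columns are the more informative ones to compare across feature sets.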