Brian Sumali1, Yasue Mitsukura2, Kuo-Ching Liang3, Michitaka Yoshimura3, Momoko Kitazawa3, Akihiro Takamiya3, Takanori Fujita4, Masaru Mimura3, Taishiro Kishimoto3.
Abstract
Loss of cognitive ability is commonly associated with dementia, a broad category of progressive brain diseases. However, major depressive disorder may also cause a temporary deterioration of cognition known as pseudodementia. Differentiating true dementia from pseudodementia remains difficult even for experienced clinicians, and extensive, careful examinations must be performed. Although mental disorders such as depression and dementia have been studied, there is still no short and undemanding screening method for pseudodementia. This study inspected the distributions and statistical characteristics of acoustic features from dementia patients and depression patients and compared them. Some acoustic features were found to be shared by dementia and depression, albeit with reversed correlation. Statistically significant differences were also found when comparing the features between the two groups. Additionally, the possibility of using machine learning for automatic pseudodementia screening was explored. The machine learning part used the LASSO algorithm for feature selection and a support vector machine (SVM) with a linear kernel as the predictive model, with age-matched symptomatic depression patients and dementia patients as the database. High accuracy, sensitivity, and specificity were obtained in both the training and testing sessions. The resulting model was also tested against other datasets that were not included in training and still performed considerably well. These results imply that dementia and depression might both be detected and differentiated based on acoustic features alone. The high accuracy of the machine learning results also suggests that automated screening is possible.
Keywords: audio features; automated mental health screening; machine learning; pseudodementia; statistical testing
Year: 2020 PMID: 32604728 PMCID: PMC7348868 DOI: 10.3390/s20123599
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Recording setup during the interview session. P is the patient and T is the psychiatrist. The patient's seat is approximately 70 cm from the recording apparatus.
Figure 2. Flowchart of dataset filtration.
List of features utilized in this study.
| Feature | Mathematical Functions and References |
|---|---|
| Pitch | [ |
| Harmonics-to-noise ratio (HNR) | [ |
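| Zero-Crossing Rate (ZCR) | $\mathrm{ZCR} = \frac{1}{2N} \sum_{i=2}^{N} \lvert \operatorname{sgn}(X_i) - \operatorname{sgn}(X_{i-1}) \rvert$ |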
| Mel-frequency cepstral coefficients (MFCC) | [ |
| Gammatone cepstral coefficients (GTCC) | [ |
| Mean frequency | Mean of power spectrum from the signal |
| Median frequency | Median of power spectrum from the signal |
| Signal energy ($E$) | |
| Spectral centroid ($c$) | $c = \frac{\sum_{i=b_1}^{b_2} f_i s_i}{\sum_{i=b_1}^{b_2} s_i}$ |
| Spectral rolloff point ($r$) | |
For ZCR: $N$, $\operatorname{sgn}$, and $X_i$ denote the length of the signal, the signum function extracting the sign of a real number (positive, negative, or zero), and the i-th sample of signal $X$, respectively. For mean frequency and median frequency: the power spectrum of the signal was obtained by performing a Fourier transform. For signal energy: $E$ is the signal energy of signal $X$, $\sigma(X)$ denotes the standard deviation of signal $X$, and $\mu(X)$ denotes the mean of signal $X$. For spectral centroid: $c$ denotes the spectral centroid, $f_i$ is the frequency in Hertz corresponding to bin $i$, $s_i$ is the spectral value at bin $i$, and $b_1$ and $b_2$ are the band edges, in bins, over which to calculate the spectral centroid. For spectral rolloff point: $r$ is the spectral rolloff frequency, $s_i$ is the spectral value at bin $i$, and $b_1$ and $b_2$ are the band edges, in bins, over which to calculate the spectral rolloff point.
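The feature definitions above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the `1/(2N)` normalization of the zero-crossing rate, and the 0.95 rolloff threshold are assumptions made for illustration; the spectrum is assumed to be already available as a list of non-negative values with matching bin frequencies.

```python
def sign(value):
    """Signum: 1 for positive, -1 for negative, 0 for zero."""
    return (value > 0) - (value < 0)


def zero_crossing_rate(signal):
    """Sum of sign changes between neighbouring samples, scaled by 1/(2N).

    The 1/(2N) scaling is an assumed normalization.
    """
    n = len(signal)
    if n < 2:
        return 0.0
    total = sum(abs(sign(signal[i]) - sign(signal[i - 1])) for i in range(1, n))
    return total / (2 * n)


def spectral_centroid(freqs, spectrum, b1=0, b2=None):
    """Spectrum-weighted mean frequency over bins b1..b2 (inclusive)."""
    if b2 is None:
        b2 = len(spectrum) - 1
    bins = range(b1, b2 + 1)
    weighted = sum(freqs[i] * spectrum[i] for i in bins)
    total = sum(spectrum[i] for i in bins)
    return weighted / total


def spectral_rolloff(freqs, spectrum, threshold=0.95, b1=0, b2=None):
    """Frequency of the first bin at which the cumulative spectrum reaches
    `threshold` times the total over bins b1..b2.

    The default threshold of 0.95 is an assumption, not a value from the source.
    """
    if b2 is None:
        b2 = len(spectrum) - 1
    target = threshold * sum(spectrum[i] for i in range(b1, b2 + 1))
    cumulative = 0.0
    for i in range(b1, b2 + 1):
        cumulative += spectrum[i]
        if cumulative >= target:
            return freqs[i]
    return freqs[b2]


# Hypothetical toy inputs, for illustration only.
toy_signal = [0.2, -0.1, 0.4, -0.3]
toy_freqs = [100.0, 200.0, 300.0, 400.0]
toy_spectrum = [1.0, 3.0, 2.0, 0.5]
print(zero_crossing_rate(toy_signal))
print(spectral_centroid(toy_freqs, toy_spectrum))
print(spectral_rolloff(toy_freqs, toy_spectrum))
```

In practice these features would be computed per short frame of the recording and then summarized (for example by mean, median, or standard deviation), which is how the feature names in the later correlation table read.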
Figure 3. Flowchart of supervised machine learning procedure. The first and second phases used age-matched symptomatic depression and dementia subjects. The first phase consists of unsupervised machine learning clustering, while the second phase consists of conventional training and evaluation. The third phase applies the machine learning model trained on age-matched subjects to non-age-matched subjects.
Figure 4. Confusion matrix and class label utilized in this study.
List of evaluation metrics.
| Metric | Mathematical Formula |
|---|---|
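| Accuracy (ACC) | $\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN}$ |
| True positive rate (TPR) | $\mathrm{TPR} = \frac{TP}{TP + FN}$ |
| True negative rate (TNR) | $\mathrm{TNR} = \frac{TN}{TN + FP}$ |
| Positive predictive value (PPV) | $\mathrm{PPV} = \frac{TP}{TP + FP}$ |
| Negative predictive value (NPV) | $\mathrm{NPV} = \frac{TN}{TN + FN}$ |
| F1 score | $F_1 = \frac{2 \cdot \mathrm{PPV} \cdot \mathrm{TPR}}{\mathrm{PPV} + \mathrm{TPR}}$ |
| Cohen’s kappa | $\kappa = \frac{p_o - p_e}{1 - p_e}$, with $p_o = \mathrm{ACC}$ and $p_e = \frac{(TP + FP)(TP + FN) + (TN + FN)(TN + FP)}{(TP + TN + FP + FN)^2}$ |
| Matthews correlation coefficient (MCC) | $\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$ |
TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives in the confusion matrix.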
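The evaluation metrics listed above all follow from the four cells of a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name and the example counts are hypothetical and are not taken from the study.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Return the eight evaluation metrics as fractions in [-1, 1] or [0, 1]."""
    n = tp + fn + tn + fp
    acc = (tp + tn) / n
    tpr = tp / (tp + fn)  # sensitivity
    tnr = tn / (tn + fp)  # specificity
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    f1 = 2 * ppv * tpr / (ppv + tpr)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_o = acc
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (p_o - p_e) / (1 - p_e)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "ACC": acc, "TPR": tpr, "TNR": tnr, "PPV": ppv,
        "NPV": npv, "F1": f1, "kappa": kappa, "MCC": mcc,
    }


# Hypothetical counts, for illustration only.
for name, value in classification_metrics(tp=40, fn=10, tn=45, fp=5).items():
    print(f"{name}: {100 * value:.1f}%")
```

Multiplying by 100, as in the loop above, gives values on the same percentage scale as the result tables.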
Subject demographics.
| Group | Demographics | Depression | Dementia |
|---|---|---|---|
| Symptomatic | n (dataset/subject) | 300/77 | 119/43 |
| | age (mean ± s.d. years) | 50.4 ± 15.1 | 80.8 ± 8.3 |
| | sex (female %) | 54.5 | 72.1 |
| Age-matched | n (dataset/subject) | 89/24 | 88/29 |
| | age (mean ± s.d. years) | 67.8 ± 7.1 | 77.0 ± 7.5 |
| | sex (female %) | 83.3 | 72.4 |
| Young depression, | n (dataset/subject) | 211/53 | 31/14 |
| | age (mean ± s.d. years) | 42.5 ± 10.4 | 88.5 ± 1.9 |
| | sex (female %) | 41.5 | 71.4 |
Figure 5. Distribution of features with significant correlation to HAMD and MMSE. * marks features that differ significantly between the groups after Bonferroni correction.
Features with significant Pearson correlation in both depression and dementia patients.
| Feature description | Pearson’s correlation with HAMD | Pearson’s correlation with MMSE |
|---|---|---|
| mean GTCC_1 | −0.346 | 0.226 |
| mean MFCC_1 | −0.346 | 0.226 |
| median GTCC_1 | −0.325 | 0.219 |
| median GTCC_3 | −0.224 | 0.230 |
| median MFCC_1 | −0.325 | 0.219 |
| SD GTCC_12 | −0.218 | 0.257 |
| SD MFCC_4 | −0.289 | 0.329 |
| SD MFCC_7 | −0.221 | 0.274 |
| SD MFCC_12 | −0.259 | 0.224 |
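The coefficients in this table are Pearson correlations between one acoustic feature and a clinical score. A minimal sketch of that computation is given below; the function name and the toy values are hypothetical, and the source does not state which software the authors used.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: one feature value per recording and a clinical
# score that falls as the feature rises (negative correlation).
feature = [0.1, 0.4, 0.5, 0.9, 1.2]
score = [22, 19, 20, 14, 11]
print(round(pearson_r(feature, score), 3))
```

A negative coefficient against one scale and a positive coefficient against the other, as in every row of the table, means the same feature moves in opposite directions along the two scores.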
Phase 1: Unsupervised machine learning result.
| Metric | kMeans (%) |
|---|---|
| Accuracy (ACC) | 62.7 |
| True positive rate (TPR) | 89.9 |
| True negative rate (TNR) | 35.2 |
| Positive predictive value (PPV) | 58.3 |
| Negative predictive value (NPV) | 77.5 |
| F1 score | 70.8 |
| Cohen’s kappa | 25.2 |
| Matthews correlation coefficient (MCC) | 30.0 |
Figure 6. Absolute value of feature contributions of linear SVM with LASSO feature selection, sorted descending.
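The caption does not define "feature contribution". One common reading for a linear SVM, assumed here for illustration, is the magnitude of each weight in the linear decision function. The symbols $w_j$, $b$, $x_j$, and $d$ (number of selected features) are introduced for this sketch and do not come from the source:

$$
f(x) = \operatorname{sign}\left( \sum_{j=1}^{d} w_j x_j + b \right), \qquad \text{contribution}_j = \lvert w_j \rvert
$$

Under this reading, weight magnitudes are comparable across features only when the features share a common scale, for example after standardization.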
Phase 2: Supervised machine learning result—SVM with linear kernel.
| Metric | Training, no LASSO (mean ± SD %) | Training, with LASSO (mean ± SD %) | Testing, no LASSO (mean ± SD %) | Testing, with LASSO (mean ± SD %) |
|---|---|---|---|---|
| Accuracy (ACC) | 90.1 ± 2.4 | 95.2 ± 0.7 | 84.2 ± 5.3 | 93.3 ± 7.7 |
| True positive rate (TPR) | 94.4 ± 0.9 | 98.3 ± 0.9 | 88.8 ± 10.5 | 97.8 ± 4.7 |
| True negative rate (TNR) | 85.7 ± 4.6 | 92.6 ± 1.2 | 79.6 ± 11.5 | 89.4 ± 13.7 |
| Positive predictive value (PPV) | 87.1 ± 3.5 | 92.1 ± 1.2 | 82.5 ± 8.8 | 90.4 ± 11.7 |
| Negative predictive value (NPV) | 93.8 ± 1.0 | 98.4 ± 0.8 | 88.8 ± 8.9 | 98.0 ± 4.2 |
| F1 score | 90.6 ± 2.0 | 95.1 ± 0.7 | 84.8 ± 5.5 | 93.5 ± 7.2 |
| Cohen’s kappa | 80.2 ± 4.7 | 90.5 ± 1.4 | 68.3 ± 10.5 | 86.7 ± 15.0 |
| Matthews correlation coefficient (MCC) | 80.5 ± 4.4 | 90.6 ± 1.4 | 69.8 ± 10.3 | 87.8 ± 13.5 |
Phase 2: Supervised machine learning result—SVM with 3rd order Polynomial kernel.
| Metric | Training, no LASSO (mean ± SD %) | Training, with LASSO (mean ± SD %) | Testing, no LASSO (mean ± SD %) | Testing, with LASSO (mean ± SD %) |
|---|---|---|---|---|
| Accuracy (ACC) | 91.5 ± 3.1 | 94.6 ± 8.1 | 79.1 ± 7.6 | 89.7 ± 11.4 |
| True positive rate (TPR) | 96.4 ± 2.4 | 99.1 ± 1.0 | 85.3 ± 10.8 | 96.7 ± 5.4 |
| True negative rate (TNR) | 86.5 ± 4.0 | 90.0 ± 16.1 | 72.6 ± 14.3 | 83.1 ± 22.9 |
| Positive predictive value (PPV) | 87.9 ± 3.5 | 92.3 ± 9.9 | 76.9 ± 8.3 | 87.6 ± 13.8 |
| Negative predictive value (NPV) | 95.9 ± 2.7 | 98.9 ± 1.2 | 84.1 ± 9.9 | 96.9 ± 5.0 |
| F1 score | 91.9 ± 2.9 | 95.3 ± 6.1 | 80.3 ± 6.9 | 91.1 ± 8.2 |
| Cohen’s kappa | 82.9 ± 6.3 | 89.2 ± 16.2 | 58.0 ± 15.2 | 79.7 ± 21.7 |
| Matthews correlation coefficient (MCC) | 83.3 ± 6.2 | 90.1 ± 13.7 | 59.4 ± 14.6 | 81.8 ± 17.9 |
Phase 2: Supervised machine learning result—SVM with RBF kernel.
| Metric | Training, no LASSO (mean ± SD %) | Training, with LASSO (mean ± SD %) | Testing, no LASSO (mean ± SD %) | Testing, with LASSO (mean ± SD %) |
|---|---|---|---|---|
| Accuracy (ACC) | 90.4 ± 6.2 | 95.6 ± 1.9 | 75.3 ± 12.4 | 88.7 ± 7.9 |
| True positive rate (TPR) | 96.4 ± 2.9 | 98.8 ± 1.0 | 77.5 ± 16.6 | 91.0 ± 10.3 |
| True negative rate (TNR) | 84.3 ± 10.2 | 92.4 ± 3.0 | 72.9 ± 17.3 | 86.1 ± 13.1 |
| Positive predictive value (PPV) | 86.7 ± 7.9 | 93.0 ± 2.6 | 75.6 ± 13.8 | 88.3 ± 10.4 |
| Negative predictive value (NPV) | 95.7 ± 3.7 | 98.6 ± 1.2 | 77.6 ± 14.6 | 91.3 ± 8.9 |
| F1 score | 91.2 ± 5.4 | 95.8 ± 1.7 | 75.7 ± 12.5 | 89.1 ± 7.9 |
| Cohen’s kappa | 80.8 ± 12.3 | 91.2 ± 3.7 | 50.5 ± 24.8 | 77.3 ± 15.9 |
| Matthews correlation coefficient (MCC) | 81.5 ± 11.7 | 91.4 ± 3.6 | 51.8 ± 25.0 | 78.3 ± 15.4 |
Phase 3: Machine learning result against non-age matched dataset.
| Metric (%) | Linear, all features | Linear, LASSO | Polynomial, all features | Polynomial, LASSO | RBF, all features | RBF, LASSO |
|---|---|---|---|---|---|---|
| Accuracy (ACC) | 83.5 | 82.6 | 80.2 | 81.4 | 65.7 | 81.0 |
| True positive rate (TPR) | 87.7 | 83.9 | 82.5 | 82.9 | 66.8 | 82.9 |
| True negative rate (TNR) | 54.8 | 74.2 | 64.5 | 71.0 | 58.1 | 67.7 |
| Positive predictive value (PPV) | 93.0 | 95.7 | 94.1 | 95.1 | 91.6 | 94.6 |
| Negative predictive value (NPV) | 39.5 | 40.4 | 35.1 | 37.9 | 20.5 | 36.8 |
| F1 score | 90.2 | 89.4 | 87.9 | 88.6 | 77.3 | 88.4 |
| Cohen’s kappa | 36.5 | 42.8 | 34.6 | 39.3 | 13.9 | 37.3 |
| Matthews correlation coefficient (MCC) | 37.2 | 45.7 | 37.0 | 42.2 | 17.3 | 39.9 |
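The gap between PPV and NPV in this table is what class imbalance would be expected to produce. The sketch below recomputes both values from a sensitivity, a specificity, and the class sizes. It rests on two assumptions that the source does not state explicitly: that depression is the positive class, and that the non-age-matched set is the group of 211 depression and 31 dementia datasets from the demographics table. The function name is hypothetical.

```python
def predictive_values(tpr, tnr, n_pos, n_neg):
    """Derive PPV and NPV from sensitivity, specificity, and class sizes."""
    tp = tpr * n_pos       # expected true positives
    fn = n_pos - tp        # expected false negatives
    tn = tnr * n_neg       # expected true negatives
    fp = n_neg - tn        # expected false positives
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# TPR and TNR from the "Linear, all features" column; class sizes are the
# assumed 211 depression / 31 dementia datasets.
ppv, npv = predictive_values(0.877, 0.548, n_pos=211, n_neg=31)
print(f"imbalanced: PPV={100 * ppv:.1f}%, NPV={100 * npv:.1f}%")

# Same classifier behaviour on a hypothetical balanced set of 100 per class.
ppv_bal, npv_bal = predictive_values(0.877, 0.548, n_pos=100, n_neg=100)
print(f"balanced:   PPV={100 * ppv_bal:.1f}%, NPV={100 * npv_bal:.1f}%")
```

Under these assumptions the first call lands within rounding distance of the reported PPV and NPV for that column, while the balanced case gives a much lower PPV and a much higher NPV. This is why prevalence-independent metrics such as TPR and TNR are the safer basis for comparing this phase with the age-matched results.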