Yi-Wei Chien, Sheng-Yi Hong, Wen-Ting Cheah, Li-Hung Yao, Yu-Ling Chang, Li-Chen Fu.
Abstract
Alzheimer disease and other dementias have become the seventh leading cause of death worldwide. Because no cure exists yet, early detection is crucial so that the best intervention can be provided. For an assessment system aimed at the general public, speech analysis is the optimal solution: speech richly reflects the speaker's cognitive skills, and collecting it is relatively inexpensive compared with brain imaging, blood testing, etc. While most of the existing literature extracts statistics-based features and relies on a feature selection process, this study proposes a novel Feature Sequence representation and uses a data-driven approach, namely a recurrent neural network, to perform classification. The system is also shown to be fully automated, which means it can be deployed widely with little effort. To validate the study, a series of experiments was conducted with 120 speech samples, and the area under the receiver operating characteristic curve (AUROC) is as high as 0.838.
Year: 2019 PMID: 31862920 PMCID: PMC6925285 DOI: 10.1038/s41598-019-56020-x
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. The flowchart of the system.
Figure 2. The block diagram of the Feature Sequence Generator.
Figure 3. The block diagram of the AD Assessment Engine.
Subject information of the NTUH dataset.
| | CH | AD |
|---|---|---|
| Number of data | 10 | 10 |
| Age | 67.2 ± 8.42 | 78.6 ± 6.49 |
| Years of education | 16.2 ± 1.88 | 13.0 ± 3.46 |
| MMSE | 28.6 ± 0.91 | 22.4 ± 3.72 |
| Gender (% women) | 60 | 40 |
AUROC scores for different RNN cell units.
| | Unidirectional | | | Bidirectional | | |
|---|---|---|---|---|---|---|
| | GRU | LSTM | Simple | biGRU | biLSTM | biSimple |
| thr = 1 | 0.932 ± 0.02 | 0.922 ± 0.01 | 0.682 ± 0.08 | 0.951 ± 0.02 | 0.948 ± 0.02 | 0.871 ± 0.06 |
| thr = 3 | 0.941 ± 0.03 | 0.916 ± 0.02 | 0.758 ± 0.06 | 0.969 ± 0.01 | 0.942 ± 0.01 | 0.888 ± 0.03 |
| thr = 5 | 0.936 ± 0.02 | 0.922 ± 0.03 | 0.760 ± 0.03 | 0.948 ± 0.01 | 0.937 ± 0.02 | 0.885 ± 0.02 |
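As a reminder of what an AUROC value measures, here is a minimal Python sketch of the rank-based (Mann-Whitney) formulation: the fraction of positive/negative pairs in which the positive sample receives the higher score, with ties counted as half. The function name `auroc`, the label convention (1 for AD, 0 for CH), and the toy scores are assumptions for illustration only; they are not the authors' evaluation code or data.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random negative.

    labels: iterable of 0/1 (assumed here: 1 = AD, 0 = CH)
    scores: classifier outputs, higher means "more likely positive"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # pair ordered correctly
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the paper).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print(round(auroc(toy_labels, toy_scores), 3))
```

In the toy data, 8 of the 9 positive/negative pairs are ordered correctly, so the sketch yields 8/9. The quadratic pair loop is chosen for readability; a sort-based version would be preferable for large sample counts.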
Specificity scores for different RNN cell units.
| | Unidirectional | | | Bidirectional | | |
|---|---|---|---|---|---|---|
| | GRU | LSTM | Simple | biGRU | biLSTM | biSimple |
| thr = 1 | 0.898 ± 0.02 | 0.882 ± 0.02 | 0.633 ± 0.05 | 0.913 ± 0.02 | 0.904 ± 0.02 | 0.816 ± 0.05 |
| thr = 3 | 0.909 ± 0.04 | 0.862 ± 0.03 | 0.689 ± 0.07 | 0.936 ± 0.02 | 0.913 ± 0.02 | 0.833 ± 0.04 |
| thr = 5 | 0.880 ± 0.02 | 0.860 ± 0.02 | 0.718 ± 0.03 | 0.918 ± 0.02 | 0.904 ± 0.01 | 0.840 ± 0.01 |
Evaluation of the Feature Sequence Generator.
| | Edit Distance | | Token Error Rate | | Length Difference | |
|---|---|---|---|---|---|---|
| | CH | AD | CH | AD | CH | AD |
| Fruit | 24.0 ± 7.7 | 39.6 ± 13.3 | 0.524 ± 0.129 | 0.787 ± 0.081 | 0.7 ± 4.6 | -12.1 ± 9.5 |
| Location | 33.3 ± 10.6 | 51.4 ± 19.7 | 0.426 ± 0.120 | 0.796 ± 0.099 | -1.3 ± 6.7 | -13.5 ± 9.7 |
| Picture | 73.8 ± 27.7 | 77.7 ± 30.8 | 0.448 ± 0.091 | 0.830 ± 0.056 | -12.6 ± 10.7 | -28.3 ± 13.8 |
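The three metrics in this table compare an automatically generated token sequence with a reference sequence. The sketch below shows one common way to compute them in Python. The exact definitions used in the paper are not given on this page, so the following are assumptions: edit distance is the Levenshtein distance over tokens, token error rate is edit distance divided by the reference length (analogous to word error rate), and length difference is generated length minus reference length. The token names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (unit costs)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference tokens
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def token_error_rate(ref, hyp):
    """Assumed definition: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def length_difference(ref, hyp):
    """Assumed sign convention: generated length minus reference length."""
    return len(hyp) - len(ref)


# Hypothetical token sequences, not taken from the paper.
reference = ["apple", "pause", "banana", "filler", "grape"]
generated = ["apple", "banana", "filler", "pear"]

print(edit_distance(reference, generated))      # one deletion + one substitution
print(token_error_rate(reference, generated))
print(length_difference(reference, generated))
```

Under these assumptions, a negative length difference means the generated sequence is shorter than the reference.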
Performance on the NTUH dataset.
| | Manual | Automatic Transcription | |
|---|---|---|---|
| | N/A | Without Fine-Tuning | After Fine-Tuning |
| thr = 3, biGRU (AUROC) | 0.808 ± 0.05 | 0.803 ± 0.03 | 0.838 ± 0.03 |
| thr = 3, biGRU (Sensitivity) | 0.736 ± 0.07 | 0.711 ± 0.05 | 0.756 ± 0.07 |
| thr = 3, biGRU (Specificity) | 0.750 ± 0.05 | 0.822 ± 0.03 | 0.764 ± 0.06 |
Sensitivity scores for different RNN cell units.
| | Unidirectional | | | Bidirectional | | |
|---|---|---|---|---|---|---|
| | GRU | LSTM | Simple | biGRU | biLSTM | biSimple |
| thr = 1 | 0.862 ± 0.04 | 0.869 ± 0.02 | 0.624 ± 0.05 | 0.891 ± 0.03 | 0.853 ± 0.03 | 0.771 ± 0.06 |
| thr = 3 | 0.889 ± 0.01 | 0.889 ± 0.03 | 0.702 ± 0.07 | 0.887 ± 0.02 | 0.856 ± 0.03 | 0.791 ± 0.03 |
| thr = 5 | 0.880 ± 0.02 | 0.880 ± 0.02 | 0.676 ± 0.04 | 0.904 ± 0.02 | 0.862 ± 0.04 | 0.804 ± 0.03 |
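Sensitivity and specificity, as reported in the tables, follow from the confusion matrix of binary predictions. A minimal Python sketch is given below, assuming AD is the positive class (label 1) and CH the negative class (label 0); the function name and the toy labels are hypothetical, and this is not the authors' evaluation code.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary 0/1 labels and predictions.

    sensitivity = TP / (TP + FN): share of positives correctly detected
    specificity = TN / (TN + FP): share of negatives correctly rejected
    """
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical): 4 positive and 4 negative samples.
true_labels = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0]

sens, spec = sensitivity_specificity(true_labels, predicted)
print(sens, spec)
```

Both values depend on the decision threshold applied to the classifier's output score, whereas AUROC summarizes performance across all thresholds.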