Toshiro Horigome1, Kimihiro Hino2, Hiroyoshi Toyoshiba2, Norihisa Shindo2, Kei Funaki1, Yoko Eguchi1, Momoko Kitazawa1, Takanori Fujita3, Masaru Mimura1, Taishiro Kishimoto4,5,6.
Abstract
In recent years, studies using natural language processing (NLP) to identify dementia have been reported. Most of these studies used picture description or similar tasks to elicit spontaneous speech, but free conversation, which requires no task, may be easier to implement in a clinical setting and is unlikely to induce a learning effect. The purpose of this study was therefore to develop a machine learning model that discriminates between subjects with and without dementia by extracting features from unstructured free-conversation data using NLP. We recruited patients who visited a specialized outpatient clinic for dementia, as well as healthy volunteers. Participants' conversations were transcribed, the text was decomposed from natural sentences into morphemes by morphological analysis using NLP, and the morphemes were converted into real-valued vectors that served as features for machine learning. A total of 432 datasets were used, and the resulting machine learning model classified dementia and non-dementia subjects with an accuracy of 0.900, a sensitivity of 0.881, and a specificity of 0.916. Using sentence-vector information, it was thus possible to develop a machine learning algorithm capable of discriminating dementia from non-dementia subjects with high accuracy based on free conversation.
Year: 2022 PMID: 35922457 PMCID: PMC9349220 DOI: 10.1038/s41598-022-16204-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
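The pipeline described in the abstract (morphological decomposition, sentence vectorization, then supervised classification) can be sketched as follows. This is a minimal illustration, not the authors' original algorithm: the toy transcripts, the whitespace pre-segmentation (standing in for a Japanese morphological analyzer such as MeCab), and the TF-IDF/logistic-regression choices are all assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy transcripts already split into morphemes (real Japanese text would
# first pass through a morphological analyzer; here the segmentation is
# assumed done and tokens are joined by spaces).
transcripts = [
    "today weather is nice we talked about lunch",
    "I forget forget where I put the thing yesterday maybe",
    "my garden has many flowers this season",
    "what day is it I do not remember the the place",
]
labels = [0, 1, 0, 1]  # 0 = non-dementia, 1 = dementia (toy labels)

# Convert each transcript into a real-valued vector (TF-IDF here, standing
# in for the paper's sentence-vector features).
vectorizer = TfidfVectorizer(token_pattern=r"\S+")
X = vectorizer.fit_transform(transcripts)

# Fit a simple classifier on the document vectors.
clf = LogisticRegression().fit(X, labels)
print(clf.predict(X))
```

In the study itself, the classifier was a deep neural network (or XGBoost) and the vectors came from an original embedding algorithm, TF-IDF, or BERT, as detailed in the tables below.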
Figure 1. Number of datasets used for training and testing. GDS: Geriatric Depression Scale. CHC: cognitively healthy control. MCI: mild cognitive impairment.
Demographic data.
| | Total | Dementia | Non-dementia | Dementia data that meets the criteria for training | Non-dementia data that meets the criteria for training |
|---|---|---|---|---|---|
| Dataset (% female) | 432 (62.5) | 193 (76.7) | 239 (51.0) | 127 (85.8) | 197 (54.8) |
| n (% female) | 135 (57.0) | 58 (70.7) | 83 (47.0) | 47 (78.7) | 78 (47.4) |
| Age (mean ± SD), in years | 74.6 ± 10.9 | 79.0 ± 8.9 | 71.1 ± 11.1 | 79.2 ± 9.8 | 70.5 ± 11.6 |
| MMSE (mean ± SD) | 23.2 ± 7.0 | 16.4 ± 4.8 | 28.6 ± 1.8 | 14.9 ± 4.8 | 28.7 ± 1.6 |
| CDR (mean ± SD) | 0.6 ± 0.8 | 1.3 ± 0.7 | 0.1 ± 0.2 | 1.6 ± 0.5 | 0.1 ± 0.2 |
| LM II (mean ± SD) | 6.8 ± 7.1 | 0.5 ± 1.2 | 11.8 ± 5.8 | 0.3 ± 0.8 | 12.5 ± 5.9 |
| Letters (mean ± SD) | 2161.6 ± 880.8 | 1755.7 ± 876.0 | 2489.3 ± 737.8 | 1622.6 ± 857.9 | 2508.4 ± 775.9 |
Non-dementia includes cognitively healthy controls and participants with mild cognitive impairment. Criteria for training: criteria for datasets obtained from neuropsychological tests to be used as training data for machine learning.
MMSE mini-mental state examination, CDR clinical dementia rating, LM II logical memory delayed recall of the Wechsler Memory Scale-Revised, Letters number of letters uttered by the subject in the recorded free conversation.
Discrimination results between dementia and non-dementia groups using machine learning.
| Data | n | Incorrect | Accuracy | Sensitivity | Specificity | χ² | p |
|---|---|---|---|---|---|---|---|
| All data | 432 | 43 | 0.900 | 0.881 | 0.916 | 0.402 | 0.526 |
| Data that meets the criteria for training | 324 | 27 | 0.917 | 0.929 | 0.909 | – | – |
| Male | 162 | 17 | 0.895 | 0.733 | 0.957 | 0.015 | 0.901 |
| Female | 270 | 26 | 0.904 | 0.926 | 0.877 | – | – |
| Age < 75 years | 178 | 12 | 0.933 | 0.881 | 0.949 | 2.902 | 0.088 |
| Age ≥ 75 years | 254 | 31 | 0.878 | 0.881 | 0.874 | – | – |
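The subgroup comparisons in the table (χ² and p columns) test whether the classification error rate differs between paired subgroups such as male vs. female. A minimal sketch of computing the three reported metrics and the χ² test of independence with scipy, using illustrative counts rather than the paper's data:

```python
from scipy.stats import chi2_contingency

# Toy confusion-matrix counts (not the paper's data):
tp, fn = 37, 5   # dementia subjects correctly classified / missed
tn, fp = 76, 7   # non-dementia subjects correctly classified / false alarms

accuracy = (tp + tn) / (tp + fn + tn + fp)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)

# Chi-squared test of whether correctness differs between two subgroups
# (e.g. male vs. female); counts are again illustrative.
correct_incorrect = [[145, 17],   # group 1: correct, incorrect
                     [244, 26]]   # group 2: correct, incorrect
chi2, p, dof, _ = chi2_contingency(correct_incorrect)
print(f"acc={accuracy:.3f} sens={sensitivity:.3f} spec={specificity:.3f}")
print(f"chi2={chi2:.3f} p={p:.3f}")
```

A non-significant p value, as in all three comparisons in the table, indicates no detectable difference in error rate between the paired subgroups.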
Figure 2. Receiver operating characteristic (ROC) curve of the machine learning model.
Figure 3. Relationship between the number of letters and the prediction accuracy of the machine learning model.
Prediction accuracy by voting of the XGBoost and DNN models.
| Voting | DNN model, original algorithm vector | XGBoost model, original algorithm vector | DNN model, TF-IDF vector | DNN model, BERT vector |
|---|---|---|---|---|
| 0 | 0.447 | 0.447 | 0.447 | 0.447 |
| 1 | 0.900 | 0.785 | 0.819 | 0.794 |
| 2 | 0.889 | 0.836 | 0.824 | 0.829 |
| 3 | 0.875 | 0.833 | 0.810 | 0.836 |
| 4 | 0.870 | 0.836 | 0.799 | 0.831 |
| 5 | 0.859 | 0.831 | 0.785 | 0.847 |
| 6 | 0.856 | 0.801 | 0.785 | 0.840 |
| 7 | 0.847 | 0.780 | 0.768 | 0.822 |
| 8 | 0.845 | 0.752 | 0.752 | 0.822 |
| 9 | 0.836 | 0.727 | 0.741 | 0.808 |
| 10 | 0.815 | 0.674 | 0.706 | 0.778 |
DNN deep neural network, TF-IDF term frequency–inverse document frequency, BERT bidirectional encoder representations from transformers.
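If the Voting column is read as the minimum number of the ten trained models that must predict dementia for a subject to receive a dementia label, the thresholding can be sketched as below; the random predictions and labels are stand-ins for real model outputs. Note that at a threshold of 0 every subject is labelled dementia, so accuracy equals the dementia prevalence, consistent with the 0.447 in the table's first row (193/432 ≈ 0.447 in the study's data).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 10 models each emit a binary prediction per subject
# (1 = dementia). Real predictions would come from the trained
# DNN/XGBoost models; these are random stand-ins.
n_models, n_subjects = 10, 50
preds = rng.integers(0, 2, size=(n_models, n_subjects))
truth = rng.integers(0, 2, size=n_subjects)

votes = preds.sum(axis=0)  # how many of the 10 models said "dementia"

# Accuracy as a function of the vote threshold, mirroring the table rows:
for threshold in range(0, 11):
    final = (votes >= threshold).astype(int)
    acc = (final == truth).mean()
    print(f"voting={threshold:2d} accuracy={acc:.3f}")
```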
Results using other document embedding methods and machine learning models.
| Document embedding | Model | Accuracy | Sensitivity | Specificity |
|---|---|---|---|---|
| Original algorithm | DNN | 0.900 | 0.881 | 0.916 |
| Original algorithm | GNB | 0.817 | 0.674 | 0.933 |
| Original algorithm | LR | 0.831 | 0.674 | 0.958 |
| Original algorithm | SVC | 0.863 | 0.756 | 0.950 |
| Original algorithm | XGBoost | 0.829 | 0.731 | 0.908 |
| TF-IDF | DNN | 0.824 | 0.798 | 0.845 |
| TF-IDF | GNB | 0.780 | 0.617 | 0.912 |
| TF-IDF | LR | 0.752 | 0.482 | 0.971 |
| TF-IDF | SVC | 0.785 | 0.565 | 0.962 |
| TF-IDF | XGBoost | 0.785 | 0.653 | 0.891 |
| BERT | DNN | 0.847 | 0.762 | 0.916 |
| BERT | GNB | 0.833 | 0.762 | 0.891 |
| BERT | LR | 0.447 | 1.000 | 0.000 |
| BERT | SVC | 0.845 | 0.710 | 0.950 |
| BERT | XGBoost | 0.826 | 0.731 | 0.904 |
TF-IDF term frequency–inverse document frequency, BERT bidirectional encoder representations from transformers, DNN deep neural network, GNB Gaussian Naive Bayes, LR logistic regression, SVC support vector machine classifier.
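A cross-validated comparison like the one in the table can be sketched with scikit-learn. The synthetic features below stand in for one document-embedding matrix; XGBoost and the DNN are omitted to keep the example dependency-light, and the cv=5 split is an assumption, not the paper's validation scheme.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

# Synthetic feature matrix standing in for one document-embedding variant.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

models = {
    "GNB": GaussianNB(),
    "LR": LogisticRegression(max_iter=1000),
    "SVC": SVC(),
}
for name, model in models.items():
    # Out-of-fold predictions, so each subject is scored by a model that
    # never saw it during training.
    y_pred = cross_val_predict(model, X, y, cv=5)
    tn, fp, fn, tp = confusion_matrix(y, y_pred).ravel()
    acc = (tp + tn) / len(y)
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    print(f"{name}: acc={acc:.3f} sens={sens:.3f} spec={spec:.3f}")
```

The BERT + LR row in the table (sensitivity 1.000, specificity 0.000, accuracy 0.447) illustrates why all three metrics matter: that model simply labelled every subject as dementia.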
Figure 4. Architecture of the machine learning and validation methods.