| Literature DB >> 35493375 |
Ning Liu, Kexue Luo, Zhenming Yuan, Yan Chen.
Abstract
Alzheimer's disease (AD) is a neurodegenerative disease that is difficult to detect with convenient and reliable methods. Language changes in patients with AD are an important signal of their cognitive status and can potentially help in early diagnosis. In this study, we developed a transfer learning model based on speech and natural language processing (NLP) technology for the early diagnosis of AD. The lack of large datasets limits the use of complex neural network models without feature engineering, a problem that transfer learning can effectively solve. The transfer learning model is first pre-trained on large text datasets to obtain a pre-trained language model, and an AD classification model is then trained on the small training set on top of it. Concretely, a distilled bidirectional encoder representation (distilBert) embedding, combined with a logistic regression classifier, is used to distinguish patients with AD from normal controls. The model was evaluated on the 2020 Alzheimer's Dementia Recognition through Spontaneous Speech (ADReSS) challenge dataset, which is balanced between 78 healthy controls (HC) and 78 patients with AD. The accuracy of the proposed model is 0.88, which is almost equivalent to the champion score in the challenge and a considerable improvement over the 75% baseline established by the challenge organizers. The transfer learning method in this study thus improves AD prediction: it not only reduces the need for feature engineering but also addresses the lack of sufficiently large datasets.
Keywords: Alzheimer's disease; BERT; machine learning; natural language processing; transfer learning
Year: 2022 PMID: 35493375 PMCID: PMC9043451 DOI: 10.3389/fpubh.2022.772592
Source DB: PubMed Journal: Front Public Health ISSN: 2296-2565
Figure 1. The logical architecture of the model.
Figure 2. A picture of the Boston Cookie-Theft description task.
The basic composition of the participants in each group.

| Age interval | AD (male) | AD (female) | HC (male) | HC (female) |
|---|---|---|---|---|
| [50, 55) | 2 | 0 | 2 | 0 |
| [55, 60) | 7 | 6 | 7 | 6 |
| [60, 65) | 4 | 9 | 4 | 9 |
| [65, 70) | 9 | 14 | 9 | 14 |
| [70, 75) | 9 | 11 | 9 | 11 |
| [75, 80) | 4 | 3 | 4 | 3 |
| Total | 35 | 43 | 35 | 43 |
The average and SD of age and MMSE.
| Measure | HC mean | HC SD | AD mean | AD SD |
|---|---|---|---|---|
| Age | 66.56 | 6.60 | 66.79 | 6.83 |
| MMSE | 29.01 | 1.16 | 17.79 | 5.48 |
Relationship between predicted class and true class.
| Predicted class | True class: positive | True class: negative |
|---|---|---|
| Positive | True positive (TP) | False positive (FP) |
| Negative | False negative (FN) | True negative (TN) |
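The four measures reported later (accuracy, precision, recall, F1) follow directly from these confusion-matrix cells. A minimal sketch in plain Python, with illustrative counts rather than the paper's actual results:

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)          # of predicted positives, how many are true
    recall = tp / (tp + fn)             # of actual positives, how many are found
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Illustrative counts for a balanced test split (not the paper's actual cells)
acc, prec, rec, f1 = classification_metrics(tp=21, fp=3, fn=3, tn=21)
```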
Parameters of the distilBert model.
| Parameter | Value |
|---|---|
| Epoch | 1 |
| Batch_size | 156 |
| Pad_size | 500 |
| Pre-trained model | distilBert-base-uncased |
| Hidden_size | 768 |
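The tokenization-and-padding step these parameters govern can be illustrated without the model itself. The sketch below shows only the padding logic implied by `Pad_size = 500`; the token id sequences here are made up for illustration (distilBert's actual [PAD] token id is 0):

```python
PAD_SIZE = 500  # Pad_size from the parameter table above
PAD_ID = 0      # distilBert-base-uncased uses id 0 for the [PAD] token

def pad_or_truncate(token_ids, pad_size=PAD_SIZE, pad_id=PAD_ID):
    """Bring a list of token ids to exactly pad_size entries."""
    return (token_ids + [pad_id] * pad_size)[:pad_size]

# Made-up id sequences standing in for tokenized transcripts
batch = [[101, 7592, 102], [101, 2054, 2003, 2008, 102]]
padded = [pad_or_truncate(seq) for seq in batch]
```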
The performance of different models.
| Model | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|
| Linear discriminant analysis | 0.625 | 0.60 | 0.75 | 0.67 |
| DistilBert | 0.48 | 0.51 | 0.48 | 0.48 |
| ERNIE | 0.42 | 0.46 | 0.42 | 0.30 |
| DistilBert+CNN | 0.58 | 0.34 | 0.58 | 0.43 |
| DistilBert+RF | 0.79 | 0.79 | 0.79 | 0.79 |
| DistilBert+SVM | 0.625 | 0.629 | 0.625 | 0.622 |
| DistilBert+Ada | 0.73 | 0.73 | 0.73 | 0.73 |
| ERNIE+Pause | 0.833 | | | |
| DistilBert+LR | 0.88 | 0.88 | | 0.87 |
The best value in each column marks the best performance for that measure.
Description of the algorithm.
| 1: Input: Dataset |
| 2: Load the pre-trained model, tokenize each sentence by splitting it into words or subwords, and pad all token lists to the same size. |
| 3: Feed the dataset through the distilBert model to obtain embedding vectors. |
| 4: Put the embedding vectors into the logistic regression model to classify the dataset. |
| 5: Model evaluation. |
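The classification and evaluation steps above can be sketched as follows. This is a minimal illustration, not the authors' code: the 768-dimensional distilBert embeddings (Hidden_size in the parameter table) are replaced here by synthetic vectors, since producing real ones requires the `transformers` library and the distilBert-base-uncased checkpoint; only the logistic regression stage (step 4) and the evaluation (step 5) are shown as described.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Stand-ins for distilBert sentence embeddings: one 768-d vector per
# transcript, with a small class-dependent mean shift so the two groups
# are separable (purely synthetic data).
n_per_class, hidden_size = 78, 768
hc = rng.normal(0.0, 1.0, (n_per_class, hidden_size))  # healthy controls
ad = rng.normal(0.3, 1.0, (n_per_class, hidden_size))  # patients with AD
X = np.vstack([hc, ad])
y = np.array([0] * n_per_class + [1] * n_per_class)

# Step 4: put the embedding vectors into a logistic regression classifier.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Step 5: model evaluation.
acc = accuracy_score(y_te, clf.predict(X_te))
print(f"accuracy: {acc:.2f}")
```

On this synthetic data the classes are well separated, so accuracy is high; with real embeddings the result depends entirely on the quality of the pre-trained representation.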