Ning Liu1,2,3, Zhenming Yuan1,4, Qingfeng Tang5.
Abstract
Alzheimer's disease (AD) is a neurodegenerative disease in which cognitive ability declines as the illness progresses. At present, the diagnosis of AD depends mainly on interviews between patients and doctors, a process that is slow, expensive, and subjective, so the currently available neuropsychological examinations and clinical diagnostic criteria are not an ideal way to recognize AD. A recent study has indicated the potential of language analysis for AD diagnosis. In this study, we propose a novel feature purification network that further improves the representation learning of the transformer model. Although the transformer has made great progress in generating discriminative features because of its long-distance reasoning ability, there is still room for improvement. Many features are common to all classes and are not indicative of any specific class; by ruling out the influence of these common features from the features extracted by the transformer encoder, we obtain more discriminative features for classification. We apply this method to three public dementia datasets and markedly improve the transformer's classification results. Specifically, the method achieves a state-of-the-art (SOTA) result on the Pitt dataset.
Keywords: Alzheimer's disease; deep learning; machine learning; mild cognitive impairment; natural language processing; speech and language; transformer
Year: 2022 PMID: 35310782 PMCID: PMC8927695 DOI: 10.3389/fpubh.2021.835960
Source DB: PubMed Journal: Front Public Health ISSN: 2296-2565
Relationship between predicted and true classes.
| Predicted class | True class: Positive | True class: Negative |
|---|---|---|
| Positive | True positive (TP) | False positive (FP) |
| Negative | False negative (FN) | True negative (TN) |
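The four cells of this table are enough to compute the standard classification scores. The following is a minimal sketch of the textbook definitions of precision, recall, accuracy, and F1; the function name `classification_metrics` and the example counts are hypothetical and do not come from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard scores from confusion-matrix counts.

    Returns NaN for a score whose denominator is zero.
    """
    nan = float("nan")
    total = tp + fp + fn + tn
    return {
        # Of all samples predicted positive, the share that is truly positive.
        "precision": tp / (tp + fp) if (tp + fp) else nan,
        # Of all truly positive samples, the share that was found.
        "recall": tp / (tp + fn) if (tp + fn) else nan,
        # Share of all samples that were classified correctly.
        "accuracy": (tp + tn) / total if total else nan,
        # Harmonic mean of precision and recall, written in count form.
        "f1": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else nan,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=45, fp=5, fn=10, tn=40)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}")
```

With these made-up counts, precision is 45/50, recall is 45/55, accuracy is 85/100, and F1 is 90/105.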
Figure 1. The architecture of GP-Net.
GP-Net
| 1: Input: Supposing the datasets are |
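The abstract describes ruling out the influence of common features from the features extracted by the transformer encoder, but the operation itself is not specified here. One way such a step is sometimes realized is orthogonal projection: subtract from each feature vector its component along a common-feature direction. The sketch below assumes that approach purely for illustration; it is not the authors' implementation, and the names `dot` and `purify` and the toy vectors are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def purify(feature, common):
    """Remove from `feature` its component along `common`.

    ASSUMPTION: orthogonal projection stands in for the purification step;
    the source does not confirm that GP-Net works this way.
    """
    norm_sq = dot(common, common)
    if norm_sq == 0:
        # No common direction to remove.
        return list(feature)
    scale = dot(feature, common) / norm_sq
    return [f - scale * c for f, c in zip(feature, common)]


# Toy example: the first axis plays the role of the common feature.
feature = [3.0, 4.0]
common = [1.0, 0.0]
purified = purify(feature, common)
print(purified)  # [0.0, 4.0]
```

After the projection, the purified vector is orthogonal to the common direction, so a classifier that consumes it can no longer rely on that shared component.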
AD vs. CTRL classification scores (%) on the Pitt dataset.
| Study | Features | Classifier | Precision | Recall | Accuracy | F1 |
|---|---|---|---|---|---|---|
| Sweta Karlekar | POS | CNN-RNN | - | - | 91.1 | - |
| Fritsch et al. | n-gram | NNLM+LSTM | - | - | 85.6 | - |
| Orimaye et al. | n-grams | D2NN | - | - | 88.9 | - |
| Fraser et al. | 35 Hand-Crafted | LR | - | - | 81.92 | - |
| Yancheva et al. | 12 Cluster-Based Features+LS&A | Random Forest | 80.00 | 80.00 | 80.00 | 80.00 |
| Sirts et al. | Cluster+PID+SID | LR | 74.4 | 72.5 | - | 72.7 |
| Hernandez et al. | 105 Hand-Crafted | SVM | 81.00 | 81.00 | 79.00 | 81.00 |
| Roshanzamir et al. | BERT Base | LR | 90.31 | 76.52 | 84.46 | 82.72 |
| Roshanzamir et al. | BERT Large | LR | 90.57 | 84.34 | 88.08 | 87.23 |
| Pan et al. | GloVe Word Embedding Sequence | BiLSTM/GRU | 84.02 | 84.97 | - | 84.43 |
| Li et al. | 185 Hand-Crafted Features | LR | - | - | 77 | - |
| Fraser et al. | Info and LM | SVM | - | - | 75 | 77 |
| Transformer+FP25 | Transformer | Transformer | 88 | | 90.3 | 90.6 |
| Transformer+GP | Transformer+Feature purification | Transformer | **94.00** | 89 | **93.50** | **91.19** |
Bold marks the best performance on the Pitt dataset.
Results (%) of pre-trained models on the Pitt dataset.
| Model | Features | Classifier | Precision | Recall | Accuracy | F1 |
|---|---|---|---|---|---|---|
| BertCNN | BERT | CNN | 58.85 | 56.25 | 56.25 | 52.79 |
| BertRCNN | BERT | RCNN | - | 50.00 | 50.00 | 33.33 |
| BertDPCNN | BERT | DPCNN | 41.11 | 47.92 | 47.92 | 35.59 |
| ERNIEDPCNN | ERNIE | DPCNN | - | 50.00 | 50.00 | 33.33 |
| BertLogistic | BERT | Logistic Regression | 88 | 85 | 86.20 | 85.60 |
| Transformer+GP | Transformer | Transformer | 94.00 | 89.00 | 93.50 | 91.19 |
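The BertRCNN and ERNIEDPCNN rows report 50.00, 50.00, and 33.33 with the first score missing. That pattern is what macro-averaged recall, accuracy, and macro-averaged F1 look like when a classifier assigns every sample of a balanced two-class test set to the same class (macro precision is then undefined for the class that is never predicted). The sketch below reproduces the pattern under that assumption; the source does not state the averaging scheme, and the helper names and the 48-sample test set are hypothetical.

```python
def per_class_scores(y_true, y_pred, label):
    """Recall and F1 for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return recall, f1


def macro_scores(y_true, y_pred):
    """Macro-averaged recall, plain accuracy, and macro-averaged F1."""
    labels = sorted(set(y_true))
    scores = [per_class_scores(y_true, y_pred, label) for label in labels]
    macro_recall = sum(r for r, _ in scores) / len(labels)
    macro_f1 = sum(f for _, f in scores) / len(labels)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return macro_recall, accuracy, macro_f1


# ASSUMPTION: a balanced test set and a model that always predicts "AD".
y_true = ["AD"] * 24 + ["CTRL"] * 24
y_pred = ["AD"] * 48

recall, accuracy, f1 = macro_scores(y_true, y_pred)
print(round(100 * recall, 2), round(100 * accuracy, 2), round(100 * f1, 2))
# prints: 50.0 50.0 33.33
```

If that reading is right, these two rows indicate models that collapsed to a single class rather than models that learned a weak decision boundary.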
Accuracy (%) on three datasets.
| Model | Pitt | | |
|---|---|---|---|
| Transformer | 91.4 | 74.3 | 81.6 |
| Transformer+GP | 93.5 | 78.6 | 83.7 |
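The effect of feature purification in this table can be read as an absolute accuracy gain per dataset. A small sketch of that arithmetic follows; the variable names are hypothetical, and the values are listed in the table's column order.

```python
# Accuracy values copied from the table above, in column order.
transformer = [91.4, 74.3, 81.6]
transformer_gp = [93.5, 78.6, 83.7]

# Absolute gain in percentage points, rounded to one decimal place
# to hide floating-point noise.
gains = [round(gp - base, 1) for base, gp in zip(transformer, transformer_gp)]
print(gains)  # [2.1, 4.3, 2.1]
```

Transformer+GP is ahead on all three datasets, with the largest gain (4.3 points) on the second one.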