Jia Lu, Weiming Zeng, Lu Zhang, Yuhu Shi.
Abstract
The Extreme Learning Machine (ELM) is a simple and efficient algorithm for Single Hidden Layer Feedforward Neural Networks (SLFNs). In recent years, it has gradually been applied to the study of Alzheimer's disease (AD). When ELM is used to diagnose AD from high-dimensional features, some features have no positive impact on the diagnosis, while others have a significant impact. In this paper, a novel Key Features Screening method based on the Extreme Learning Machine (KFS-ELM) is proposed. It screens for key features that are relevant to the classification (diagnosis) and assigns each key feature a weight reflecting its importance. We designed an experiment to screen for key features of AD: a total of 920 key functional connections were screened from 4,005 functional connections, and their weights were obtained. The results of the experiment showed that: (1) Using all 4,005 features to diagnose AD, the accuracy was 95.33%; using the 920 key features, the accuracy was 99.20%. The 3,085 (4,005 - 920) features that were screened out therefore had a negative effect on the diagnosis of AD. This indicates that KFS-ELM is effective in screening key features. (2) The higher the weight of the key features and the smaller their number, the greater their impact on AD diagnosis. This indicates that KFS-ELM assigns weights to key features rationally according to their importance. KFS-ELM can therefore be used both as a tool for studying features and for improving classification accuracy.
Keywords: AD; KFS-ELM; brain functional connectivity; extreme learning machine; fMRI
Year: 2022 PMID: 35693342 PMCID: PMC9177228 DOI: 10.3389/fnagi.2022.888575
Source DB: PubMed Journal: Front Aging Neurosci ISSN: 1663-4365 Impact factor: 5.702
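The ELM training step underlying the method in the abstract — a random, untrained hidden layer followed by a least-squares (Moore-Penrose pseudo-inverse) solve for the output weights — can be sketched as below. This is a minimal generic ELM for illustration only, not the paper's KFS-ELM implementation; all function names, the sigmoid activation, and the parameter choices are our assumptions.

```python
import numpy as np

def train_elm(X, y, n_hidden=300, seed=0):
    """Train a basic ELM: random hidden layer, least-squares output weights."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights (never trained)
    b = rng.standard_normal(n_hidden)                # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))           # sigmoid hidden activations
    beta = np.linalg.pinv(H) @ y                     # Moore-Penrose least-squares solution
    return W, b, beta

def predict_elm(X, W, b, beta):
    """Forward pass: hidden activations times learned output weights."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta
```

Because only `beta` is solved for (in closed form), training is a single linear-algebra step, which is what makes ELM fast enough to retrain many classifiers during feature screening.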
FIGURE 1. Flow of the experiment.
Accuracy of ELM classifiers trained with the full set of features.
| Classifier no. | Training accuracy (%) | Validation accuracy (%) | Test accuracy (%) |
| 1 | 100.00 | 97.78 | 95.00 |
| 2 | 100.00 | 97.78 | 85.00 |
| 3 | 100.00 | 98.89 | 90.00 |
| 4 | 100.00 | 97.78 | 95.00 |
| 5 | 100.00 | 97.78 | 95.00 |
| 6 | 100.00 | 97.78 | 95.00 |
| 7 | 100.00 | 97.78 | 100.00 |
| 8 | 100.00 | 97.78 | 75.00 |
| 9 | 100.00 | 97.78 | 75.00 |
| 10 | 100.00 | 97.78 | 95.00 |
| 11 | 100.00 | 97.78 | 100.00 |
| 12 | 100.00 | 97.78 | 100.00 |
| 13 | 100.00 | 97.78 | 95.00 |
| 14 | 100.00 | 97.78 | 95.00 |
| 15 | 100.00 | 97.78 | 90.00 |
| 16 | 100.00 | 97.78 | 100.00 |
| 17 | 100.00 | 97.78 | 90.00 |
| Mean | 100.00 | 97.84 | 92.35 |
Number of nodes and accuracy of the ELM classifiers after pruning.
| Classifier no. | Training accuracy (%) | Validation accuracy (%) | Test accuracy (%) | Number of input nodes screened | Number of hidden layer nodes screened |
| 1 | 100.00 | 100.00 | 100.00 | 71 | 707 |
| 2 | 100.00 | 100.00 | 85.00 | 87 | 963 |
| 3 | 100.00 | 100.00 | 80.00 | 160 | 1,094 |
| 4 | 100.00 | 100.00 | 85.00 | 82 | 680 |
| 5 | 100.00 | 100.00 | 90.00 | 106 | 1,245 |
| 6 | 100.00 | 100.00 | 95.00 | 150 | 1,563 |
| 7 | 100.00 | 100.00 | 85.00 | 89 | 716 |
| 8 | 100.00 | 100.00 | 65.00 | 154 | 1,577 |
| 9 | 100.00 | 100.00 | 85.00 | 86 | 890 |
| 10 | 100.00 | 100.00 | 85.00 | 117 | 738 |
| 11 | 100.00 | 100.00 | 95.00 | 175 | 1,939 |
| 12 | 100.00 | 100.00 | 85.00 | 108 | 1,402 |
| 13 | 100.00 | 100.00 | 95.00 | 63 | 557 |
| 14 | 100.00 | 98.89 | 70.00 | 85 | 641 |
| 15 | 100.00 | 98.89 | 80.00 | 87 | 825 |
| 16 | 100.00 | 98.89 | 85.00 | 96 | 934 |
| 17 | 100.00 | 100.00 | 75.00 | 90 | 1,252 |
| Mean | 100.00 | 99.80 | 84.71 | 106.24 | 1,042.53 |
FIGURE 2. (A) Key features filtered in the first ELM classifier; (B) key features filtered in the second ELM classifier; (C) key features filtered in the third ELM classifier; (D) the concatenated set of all key features.
Accuracy of ELM classifiers constructed with the full feature set or with key features of different weights.
| Criteria for feature selection | Number of features | Selected/total features (%) | Training accuracy (%) | Validation accuracy (%) | Test accuracy ± standard deviation |
| Weight ≥ 0 | 4,005 | 100.00 | 100.00 | 100.00 | 95.33% ± 0.0035 |
| Weight ≥ 1 | 920 | 22.97 | 100.00 | 100.00 | 99.20% ± 0.0021 |
| Weight ≥ 2 | 397 | 9.91 | 100.00 | 100.00 | 96.92% ± 0.0057 |
| Weight ≥ 3 | 199 | 4.97 | 100.00 | 100.00 | 95.24% ± 0.0048 |
| Weight ≥ 4 | 109 | 2.72 | 100.00 | 100.00 | 93.33% ± 0.0074 |
| Weight ≥ 5 | 62 | 1.55 | 100.00 | 100.00 | 91.68% ± 0.0090 |
| Weight ≥ 6 | 45 | 1.12 | 100.00 | 100.00 | 87.84% ± 0.0106 |
| Weight ≥ 7 | 28 | 0.70 | 100.00 | 99.75 | 82.19% ± 0.0126 |
| Weight ≥ 8 | 22 | 0.55 | 100.00 | 97.55 | 75.27% ± 0.0195 |
| Weight ≥ 9 | 11 | 0.27 | 100.00 | 92.70 | 69.03% ± 0.0170 |
| Weight ≥ 10 | 8 | 0.20 | 100.00 | 87.75 | 60.14% ± 0.0359 |
| Weight ≥ 11 | 4 | 0.10 | 100.00 | 81.65 | 51.69% ± 0.0251 |
| Weight ≥ 12 | 1 | 0.02 | 58.05 | 84.85 | 55.34% ± 0.0140 |
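The selection rule used in the table above — keep only features whose screening weight meets a threshold — amounts to a simple column mask over the feature matrix. A minimal sketch follows; the weight vector here is a hypothetical stand-in, whereas in the paper the weights are produced by KFS-ELM itself.

```python
import numpy as np

def screen_by_weight(X, weights, threshold):
    """Keep feature columns whose screening weight is >= threshold.

    X         : (n_samples, n_features) feature matrix
    weights   : per-feature importance counts (hypothetical here; the paper
                derives them from KFS-ELM)
    threshold : minimum weight for a feature to be retained
    """
    keep = np.asarray(weights) >= threshold
    return X[:, keep], np.flatnonzero(keep)

# Example: 5 features with weights [0, 1, 2, 0, 3]; threshold 2 keeps columns 2 and 4.
X = np.arange(20.0).reshape(4, 5)
X_key, idx = screen_by_weight(X, [0, 1, 2, 0, 3], threshold=2)
```

With the paper's numbers, a threshold of 1 would reduce the 4,005 functional connections to the 920 key features in the second row of the table.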
The key features with weight greater than or equal to 6.
| Functional connectivity | Weight | Functional connectivity | Weight | Functional connectivity | Weight |
| FFG.R-STG.R (56–82) | 12 | MOG.R-PCUN.R (52–68) | 8 | ORBsup.L-MTG.L (5–85) | 6 |
| MOG.L-PCUN.R (51–68) | 11 | MOG.R-PCL.R (52–70) | 8 | ORBmid.L-IFGtriang.L (9–13) | 6 |
| TPOsup.R-ITG.R (84–90) | 11 | IOG.L-IOG.R (53–54) | 8 | IFGoperc.L-SFGmed.L (11–23) | 6 |
| MTG.L-TPOmid.L (85–87) | 11 | FFG.R-PCL.L (56–69) | 8 | IFGoperc.R-HIP.R (12–38) | 6 |
| ORBsup.L-TPOmid.L (5–87) | 10 | PCL.L-ITG.R (56–90) | 8 | SOG.R-PCL.L (50–69) | 6 |
| ORBmid.L-IFGoperc.L (9–11) | 10 | STG.R-ITG.R (82–90) | 8 | SOG.R-PCL.R (50–70) | 6 |
| FFG.R-PUT.L (56–73) | 10 | TPOmid.L-ITG.L (87–89) | 8 | IOG.L-FFG.R (53–56) | 6 |
| FFG.R-TPOsup.R (56–84) | 10 | PreCG.L-THA.L (1–77) | 7 | FFG.R-PCUN.R (56–68) | 6 |
| PreCG.R-HES.R (2–80) | 9 | SFGdor.R-TPOsup.R (4–84) | 7 | FFG.R-TPOsup.L (56–83) | 6 |
| MFG.L-IFGoperc.L (7–11) | 9 | MFG.R-IFGtriang.R (8–14) | 7 | PoCG.R-IPL.L (58–61) | 6 |
| MOG.L-PCL.L (51–69) | 9 | CUN.R-PCUN.R (46–68) | 7 | PCUN.R-MTG.L (68–85) | 6 |
| MFG.R-IFGoperc.R (8–12) | 8 | MOG.L-SPG.L (51–59) | 7 | PCL.R-PUT.L (70–73) | 6 |
| SOG.R-IOG.R (50–54) | 8 | CAU.R-STG.R (72–82) | 7 | PCL.R-PUT.R (70–74) | 6 |
| MOG.L-PCUN.L (51–67) | 8 | SFGdor.R-THA.R (4–78) | 6 | MTG.L-ITG.R (85–90) | 6 |
| MOG.L-PCL.R (51–70) | 8 | ORBsup.L-PCL.R (5–70) | 6 | MTG.R-TPOmid.L (86–87) | 6 |
The brain regions corresponding to the key features with weight greater than or equal to 6.
| Brain region (serial No.) | Weight | Brain region (serial No.) | Weight | Brain region (serial No.) | Weight |
| FFG.R (56) | 66 | MOG.R (52) | 16 | CUN.R (46) | 7 |
| MOG.L (51) | 43 | IOG.R (54) | 16 | SPG.L (59) | 7 |
| PCL.R (70) | 40 | PUT.L (73) | 16 | CAU.R (72) | 7 |
| PCUN.R (68) | 38 | MFG.R (8) | 15 | THA.L (77) | 7 |
| TPOmid.L (87) | 35 | IFGoperc.R (12) | 14 | IFGtriang.L (13) | 6 |
| ITG.R (90) | 33 | IOG.L (53) | 14 | SFGmed.L (23) | 6 |
| MTG.L (85) | 29 | SFGdor.R (4) | 13 | HIP.R (38) | 6 |
| TPOsup.R (84) | 28 | PreCG.R (2) | 9 | PoCG.R (58) | 6 |
| STG.R (82) | 27 | MFG.L (7) | 9 | IPL.L (61) | 6 |
| IFGoperc.L (11) | 25 | HES.R (80) | 9 | PUT.R (74) | 6 |
| PCL.L (69) | 23 | PCUN.L (67) | 8 | THA.R (78) | 6 |
| ORBsup.L (5) | 22 | ITG.L (89) | 8 | TPOsup.L (83) | 6 |
| SOG.R (50) | 20 | PreCG.L (1) | 7 | MTG.R (86) | 6 |
| ORBmid.L (9) | 16 | IFGtriang.R (14) | 7 | | |
FIGURE 3. Full-view diagram of key features with weights greater than or equal to 6.
Effect of features with different weights on AD diagnosis.
| Weight of feature group | Number of features | Contribution of feature group to validation accuracy (%) | Contribution of each feature to validation accuracy (%) | Contribution of feature group to test accuracy (%) | Contribution of each feature to test accuracy (%) |
| Weight = 10 | 4 | 6.10 | 1.5250 | 8.45 | 2.1125 |
| Weight = 9 | 3 | 4.95 | 1.6500 | 8.89 | 2.9633 |
| Weight = 8 | 11 | 4.85 | 0.4409 | 6.24 | 0.5673 |
| Weight = 7 | 6 | 2.20 | 0.3667 | 6.92 | 1.1533 |
| Weight = 6 | 17 | 0.25 | 0.0147 | 5.65 | 0.3324 |
| Weight = 5 | 17 | 0.00 | 0.0000 | 3.84 | 0.2259 |
| Weight = 4 | 47 | 0.00 | 0.0000 | 1.65 | 0.0351 |
| Weight = 3 | 90 | 0.00 | 0.0000 | 1.91 | 0.0212 |
| Weight = 2 | 198 | 0.00 | 0.0000 | 1.68 | 0.0085 |
| Weight = 1 | 523 | 0.00 | 0.0000 | 2.28 | 0.0044 |
| Weight = 0 | 3085 | 0.00 | 0.0000 | -3.87 | -0.0013 |
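The contribution figures in the table above are consistent with a simple difference rule: the contribution of the weight-w group is the accuracy obtained with all features of weight ≥ w minus the accuracy with only features of weight ≥ w + 1, and the per-feature contribution divides this by the group size (e.g. for weight = 10: 60.14 - 51.69 = 8.45, and 8.45 / 4 = 2.1125). A sketch of that arithmetic, with names of our choosing:

```python
def group_contribution(acc_at_w, acc_above_w, n_features):
    """Contribution of the weight-w feature group to classification accuracy.

    acc_at_w    : accuracy (%) using all features with weight >= w
    acc_above_w : accuracy (%) using only features with weight >= w + 1
    n_features  : number of features with weight exactly w
    """
    group = acc_at_w - acc_above_w     # accuracy gained by adding the weight-w group
    return group, group / n_features   # group and per-feature contribution

# Weight = 10 row: test accuracy 60.14% (weight >= 10) vs 51.69% (weight >= 11), 4 features.
g, per = group_contribution(60.14, 51.69, 4)
```

The same rule reproduces every row, including the negative entry for weight = 0 (95.33 - 99.20 = -3.87), the paper's evidence that the screened-out features harm the diagnosis.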
Comparison of performance with reference methods.
| Reference | Dataset AD:CN | Feature measures | Feature screening method | Criteria for feature selection | Feature weights represent importance for classification | Classifier | Selected features | Selected/all features (%) | Acc (%) |
| | 25:36 | Functional connectivity (FC) | Random neural network cluster | Best feature set from 11,000 clusterings | No | Elman neural network | 120 | 30.00 | 92.31 |
| | 34:31 | Voxel-wise regional spontaneous activity, FC | t-test, SVM-RFE, LASSO | Thresholds for selecting features | No | ELM | – | – | 98.86 |
| | 35:31 | FC | ReliefF | Top n features | No | k-nearest neighbor | 100 | 1.50 | 87.10 |
| | | | | | | | 200 | 3.00 | 93.50 |
| | | | | | | | 300 | 4.50 | 88.70 |
| | | | | | | | 400 | 6.00 | 91.90 |
| | | | | | | | 500 | 7.50 | 91.70 |
| | 252:215 (non-ADNI) | First-order neighborhood aggregation of FC | Weight-constrained low-rank learning | Top n features | No | Multi-kernel SVM | – | 25.00 | 88.63 |
| Proposed method | 100:100 | FC | KFS-ELM | Automatic generation | Yes | ELM | 45 | 1.12 | 87.84 |
| | | | | | | | 62 | 1.55 | 91.68 |
| | | | | | | | 109 | 2.72 | 93.33 |
| | | | | | | | 199 | 4.97 | 95.24 |
| | | | | | | | 397 | 9.91 | 96.92 |
| | | | | | | | 920 | 22.97 | 99.20 |