Qingshan She, Kang Chen, Zhizeng Luo, Thinh Nguyen, Thomas Potter, Yingchun Zhang.
Abstract
Recent technological advances have enabled researchers to collect large amounts of electroencephalography (EEG) signals in labeled and unlabeled datasets. Collecting labeled EEG data for use in brain-computer interface (BCI) systems, however, remains expensive and time-consuming. In this paper, a novel active learning method is proposed to minimize the amount of labeled, subject-specific EEG data required for effective classifier training by combining measures of uncertainty and representativeness within an extreme learning machine (ELM). Following this approach, an ELM classifier was first used to select a relatively large batch of unlabeled examples, whose uncertainty was measured through the best-versus-second-best (BvSB) strategy. The diversity of each sample was then measured between the limited labeled training data and the previously selected unlabeled samples, and the similarity among the previously selected samples was measured as well. Finally, a tradeoff parameter was introduced to control the balance between informative and representative samples, and these samples were then used to construct a powerful ELM classifier. Extensive experiments were conducted on benchmark and multiclass motor imagery EEG datasets to evaluate the efficacy of the proposed method. Experimental results show that the performance of the new algorithm exceeds or matches that of several state-of-the-art active learning algorithms. The proposed method thereby improves classifier performance and reduces the need for training samples in BCI applications.
Year: 2020 PMID: 32256550 PMCID: PMC7091553 DOI: 10.1155/2020/3287589
Source DB: PubMed Journal: Comput Intell Neurosci
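The abstract describes two building blocks: an ELM classifier whose output weights are solved in closed form, and a best-versus-second-best (BvSB) uncertainty score computed from the classifier's outputs. Below is a minimal sketch of both in NumPy; the sigmoid activation, the softmax normalization, and all function names are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def train_elm(X, y, n_hidden=110, seed=0):
    """Basic single-hidden-layer ELM: random input weights and biases,
    output weights solved in closed form via the pseudoinverse."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, (X.shape[1], n_hidden))  # random input weights
    b = rng.uniform(-1.0, 1.0, n_hidden)                # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))              # sigmoid hidden outputs
    T = np.eye(int(y.max()) + 1)[y]                     # one-hot class targets
    beta = np.linalg.pinv(H) @ T                        # least-squares output weights
    return W, b, beta

def elm_predict(X, model):
    """Per-class output scores of a trained ELM."""
    W, b, beta = model
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

def bvsb_margins(scores):
    """BvSB margin per sample: probability of the best class minus the
    probability of the second-best class; small margins flag uncertainty."""
    e = np.exp(scores - scores.max(axis=1, keepdims=True))  # stable softmax
    p = e / e.sum(axis=1, keepdims=True)
    p_sorted = np.sort(p, axis=1)
    return p_sorted[:, -1] - p_sorted[:, -2]
```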
Algorithm 1: The double-criteria active learning with ELM algorithm.
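Since the body of Algorithm 1 is not reproduced in this record, the following is only a sketch of the double-criteria selection that the caption and abstract describe: preselect the most uncertain unlabeled samples by BvSB margin, then greedily build a batch that is diverse with respect to the labeled set and non-redundant with the samples already picked, balanced by the tradeoff parameter. The greedy loop, the Euclidean distances, and the linear combination rule are assumptions; the paper defines the exact criteria.

```python
import numpy as np

def select_batch(X_lab, X_unlab, margins,
                 n_candidates=50, batch_size=10, tradeoff=0.5):
    """Double-criteria batch selection (illustrative sketch).

    margins: BvSB margins for X_unlab (e.g., from bvsb_margins above).
    tradeoff: weighs diversity from the labeled set against
              non-redundancy within the batch; batch_size <= n_candidates.
    Returns indices into X_unlab of the samples to query for labels."""
    cand = list(np.argsort(margins)[:n_candidates])  # most uncertain first
    picked = []
    for _ in range(batch_size):
        best_i, best_score = None, -np.inf
        for i in cand:
            # Diversity: distance to the nearest labeled sample.
            d_lab = np.linalg.norm(X_lab - X_unlab[i], axis=1).min()
            # Non-redundancy: distance to the nearest already-picked sample.
            if picked:
                d_pick = np.linalg.norm(X_unlab[picked] - X_unlab[i], axis=1).min()
            else:
                d_pick = d_lab
            score = tradeoff * d_lab + (1.0 - tradeoff) * d_pick
            if score > best_score:
                best_i, best_score = i, score
        picked.append(best_i)
        cand.remove(best_i)
    return picked
```

In a full active learning loop, the queried samples would be labeled, moved into the labeled pool, and the ELM retrained, repeating until the labeling budget is exhausted.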
Details of the datasets, including the number of features, instances, and classes and the initial labeled/unlabeled/test splits.

| Dataset | Features | Instances | Classes | Initial labeled instances (%) | Initial unlabeled instances (%) | Test instances (%) |
|---|---|---|---|---|---|---|
| Liver | 7 | 345 | 2 | 10 | 40 | 50 |
| Diabetes | 8 | 768 | 2 | 10 | 40 | 50 |
| Wdbc | 30 | 569 | 2 | 10 | 40 | 50 |
| Twonorm | 20 | 7400 | 2 | 1 | 49 | 50 |
| Hayes-Roth | 4 | 160 | 3 | 10 | 40 | 50 |
| Iris | 4 | 150 | 3 | 10 | 40 | 50 |
| Wine | 13 | 178 | 3 | 10 | 40 | 50 |
| Segment | 19 | 2310 | 7 | 10 | 40 | 50 |
| Letter | 16 | 20000 | 26 | 1 | 49 | 50 |
Details of the optimal parameter settings for the different datasets using four methods: the number of hidden nodes for each method and, for D-AL-ELM, the tradeoff parameter.

| Dataset | D-AL-ELM hidden nodes | D-AL-ELM tradeoff | AL-ELM hidden nodes | ELM-Entropy hidden nodes | PL-ELM hidden nodes |
|---|---|---|---|---|---|
| Liver | 110 | 0.1 | 110 | 110 | 110 |
| Diabetes | 110 | 0.3 | 110 | 110 | 110 |
| Wdbc | 200 | 0.1 | 200 | 200 | 200 |
| Twonorm | 120 | 0.1 | 120 | 120 | 120 |
| Hayes-Roth | 100 | 0.3 | 100 | 100 | 100 |
| Iris | 170 | 0.9 | 170 | 170 | 170 |
| Wine | 60 | 0.7 | 60 | 60 | 60 |
| Segment | 200 | 0.6 | 200 | 200 | 200 |
| Letter | 700 | 0.4 | 700 | 700 | 700 |
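Read against the sketches above, each row of this table maps onto the two knobs those illustrative functions expose; for example, the Liver row (110 hidden nodes, tradeoff 0.1) would correspond to something like the following, with synthetic data standing in for the real splits:

```python
import numpy as np

rng = np.random.default_rng(0)
X_lab = rng.normal(size=(34, 7))     # ~10% of Liver: 34 labeled samples, 7 features
y_lab = rng.integers(0, 2, 34)       # binary labels (synthetic stand-ins)
X_unlab = rng.normal(size=(138, 7))  # ~40% of Liver initially unlabeled

model = train_elm(X_lab, y_lab, n_hidden=110)        # Liver: 110 hidden nodes
margins = bvsb_margins(elm_predict(X_unlab, model))
queried = select_batch(X_lab, X_unlab, margins, tradeoff=0.1)  # Liver: tradeoff 0.1
```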
Figure 1: The learning curves of the four different learning algorithms on 9 benchmark datasets. (a) Liver. (b) Diabetes. (c) Wdbc. (d) Twonorm. (e) Wine. (f) Hayes-Roth. (g) Iris. (h) Segment. (i) Letter.
Mean accuracy results of the learning processes on 9 datasets (%).
| Dataset | D-AL-ELM | AL-ELM | ELM-Entropy | PL-ELM |
|---|---|---|---|---|
| Liver | | 67.46 | 67.14 | 66.14 |
| Diabetes | | 76.13 | 76.12 | 74.60 |
| Wdbc | | | 96.11 | 94.69 |
| Twonorm | | 97.25 | 97.25 | 97.13 |
| Wine | | 95.72 | 95.64 | 95.79 |
| Hayes-Roth | | 56.23 | 54.03 | 56.56 |
| Iris | | 96.43 | 94.44 | 95.58 |
| Segment | | 89.17 | 88.06 | 87.63 |
| Letter | | 81.67 | 67.09 | 75.76 |
ALC (area under the learning curve) comparisons of four methods on 9 datasets; a short computation sketch follows the table.
| Dataset | D-AL-ELM | AL-ELM | ELM-Entropy | PL-ELM |
|---|---|---|---|---|
| Liver | | 0.7610 | 0.7572 | 0.7447 |
| Diabetes | | 0.7624 | 0.7622 | 0.7469 |
| Wdbc | | 0.9623 | 0.9616 | 0.9474 |
| Twonorm | | 0.9729 | 0.9729 | 0.9715 |
| Wine | | 0.9584 | 0.9573 | 0.9584 |
| Hayes-Roth | | 0.5622 | 0.5395 | 0.5663 |
| Iris | | 0.9655 | 0.9450 | 0.9563 |
| Segment | | 0.8929 | 0.8812 | 0.8766 |
| Letter | | 0.8204 | 0.6718 | 0.7606 |
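ALC here is the area under the learning curve (test accuracy as a function of the number of labeled samples), so values close to 1 indicate a classifier that reaches high accuracy after few queries. A minimal sketch of the computation, assuming the per-round accuracies are recorded and the query axis is rescaled to a unit interval:

```python
import numpy as np

def alc(accuracies):
    """Area under the learning curve for accuracies in [0, 1]."""
    x = np.linspace(0.0, 1.0, len(accuracies))
    return np.trapz(accuracies, x)  # trapezoidal integration

# Example: a curve rising from 0.60 to 0.90 over five query rounds.
print(alc(np.array([0.60, 0.72, 0.80, 0.86, 0.90])))  # ~0.78
```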
Average running time (s) for each learning algorithm.
| Dataset | D-AL-ELM | AL-ELM | ELM-Entropy | PL-ELM |
|---|---|---|---|---|
| Liver | 0.9141 | 0.7531 | 0.7719 | 0.7453 |
| Diabetes | 1.2031 | 1.0438 | 1.0060 | 0.9719 |
| Wdbc | 1.3484 | 1.2500 | 1.2828 | 1.2234 |
| Twonorm | 7.9813 | 5.3047 | 5.5875 | 4.8844 |
| Wine | 0.5391 | 0.4625 | 0.4625 | 0.4391 |
| Hayes-Roth | 0.5109 | 0.4516 | 0.4594 | 0.4203 |
| Iris | 0.5250 | 0.4469 | 0.4856 | 0.4313 |
| Segment | 3.6578 | 3.3906 | 3.3120 | 3.3250 |
| Letter | 121.8047 | 115.5203 | 123.0641 | 111.4641 |
Figure 2: The learning curves of the proposed algorithm with different h values on Iris and Wine. (a) Iris. (b) Wine.
Figure 3: The learning curves of the proposed algorithm with different m values on Iris and Wine. (a) Iris. (b) Wine.
Figure 4: Learning curves of the four different learning algorithms on BCI Competition IV Dataset 2a. (a) S1. (b) S2. (c) S3. (d) S4. (e) S5. (f) S6. (g) S7. (h) S8. (i) S9.
Mean accuracy (%) of the learning process on BCI Competition IV Dataset 2a.
| Datasets | D-AL-ELM | AL-ELM | ELM-Entropy | PL-ELM |
|---|---|---|---|---|
| S1 | | 83.64 | 83.32 | 83.76 |
| S2 | 53.91 | 52.50 | 52.33 | |
| S3 | 85.66 | | 85.57 | 84.39 |
| S4 | | 63.48 | 62.94 | 64.95 |
| S5 | | 48.90 | 47.79 | 48.00 |
| S6 | | 53.96 | 52.97 | 52.36 |
| S7 | 83.15 | | 82.45 | 81.96 |
| S8 | 83.46 | | 83.34 | 81.71 |
| S9 | | 82.80 | 82.39 | 82.91 |
| Mean | | 70.92 | 70.34 | 70.51 |
ALC values of the four methods on BCI Competition IV Dataset 2a.
| Datasets | D-AL-ELM | AL-ELM | ELM-Entropy | PL-ELM |
|---|---|---|---|---|
| S1 | | 0.8109 | 0.8076 | 0.8121 |
| S2 | 0.5238 | 0.5100 | 0.5085 | |
| S3 | 0.8302 | | 0.8291 | 0.8177 |
| S4 | | 0.6159 | 0.6105 | 0.6308 |
| S5 | | 0.4749 | 0.4638 | 0.4662 |
| S6 | | 0.5241 | 0.5144 | 0.5088 |
| S7 | 0.8066 | | 0.7996 | 0.7952 |
| S8 | 0.8090 | | 0.8079 | 0.7923 |
| S9 | | 0.8032 | 0.7992 | 0.8042 |
Average running time (s) of each learning algorithm.
| Datasets | D-AL-ELM | AL-ELM | ELM-Entropy | PL-ELM |
|---|---|---|---|---|
| S1 | 1.7266 | 1.4625 | 1.4594 | 1.4172 |
| S2 | 2.5125 | 2.2813 | 2.3859 | 2.2469 |
| S3 | 1.7219 | 1.4859 | 1.4578 | 1.4234 |
| S4 | 2.0719 | 1.7891 | 1.8328 | 1.8125 |
| S5 | 2.0781 | 1.8281 | 1.8141 | 1.7719 |
| S6 | 1.7609 | 1.4594 | 1.4094 | 1.3609 |
| S7 | 2.4188 | 2.1969 | 2.1734 | 2.1641 |
| S8 | 1.5562 | 1.3375 | 1.3609 | 1.3078 |
| S9 | 1.2906 | 0.9484 | 0.9563 | 0.8906 |
| Mean | 1.9042 | 1.6432 | 1.6500 | 1.5995 |