| Literature DB >> 35126902 |
Deng Liang1,2, Aiping Liu1,2, Le Wu2, Chang Li3, Ruobing Qian1, Rabab K Ward4, Xun Chen1,2.
Abstract
Early prediction of epileptic seizures can warn patients to take precautions and significantly improve their quality of life. In recent years, deep learning has become increasingly predominant in seizure prediction. However, existing deep learning-based approaches in this field require a great deal of labeled data to guarantee performance, and labeling EEG signals requires the expertise of an experienced pathologist and is incredibly time-consuming. To address this issue, we propose a novel Consistency-based Semisupervised Seizure Prediction Model (CSSPM), in which only a fraction of the training data is labeled. Our method is based on the principle of consistency regularization, which holds that a robust model should produce consistent results for the same input under additional perturbations. Specifically, by using stochastic augmentation and dropout, we treat the entire neural network as a stochastic model and apply a consistency constraint to penalize the difference between the current prediction and previous predictions. In this way, unlabeled data can be fully utilized to improve the decision boundary and enhance prediction performance. Compared with existing studies that require all training data to be labeled, the proposed method needs only a small portion of the data to be labeled while still achieving satisfactory results, providing a promising solution for alleviating the labeling cost in real-world applications.
Year: 2022 PMID: 35126902 PMCID: PMC8808146 DOI: 10.1155/2022/1573076
Source DB: PubMed Journal: J Healthc Eng ISSN: 2040-2295 Impact factor: 2.682
Figure 1. A schematic illustration of consistency regularization. When the model is trained only on the limited labeled data in a supervised manner, the decision boundary does not follow the "manifold" of the data. In contrast, consistency regularization can leverage unlabeled data to draw a decision boundary that better reflects the underlying structure of the data.
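The core idea can be sketched in a few lines: run the same input through a stochastic network twice and penalize the disagreement between the two outputs. The toy one-layer network and the dropout rate below are illustrative stand-ins, not the paper's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_forward(x, w, drop_p=0.5):
    """One pass through a toy one-layer network with inverted dropout.
    Dropout makes the network stochastic: the same input gives a
    different output on every call. (A stand-in for the paper's CNN.)"""
    mask = rng.random(x.shape) >= drop_p
    return ((x * mask) / (1.0 - drop_p)) @ w

x = rng.standard_normal(8)                # one unlabeled example
w = rng.standard_normal((8, 2))

z1 = stochastic_forward(x, w)             # prediction under one perturbation
z2 = stochastic_forward(x, w)             # prediction under another

# Consistency term: penalize disagreement between the two predictions.
# Minimizing it needs no label, which is how unlabeled data contributes.
consistency_loss = float(np.mean((z1 - z2) ** 2))
```

Because the term compares the model with itself, it can be evaluated on every training sample, labeled or not.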
Data for the selected patients in the CHB-MIT Scalp EEG Database.
| Patient | Sex | Age | Seizure origin | No. of seizures | Data length (h) |
|---|---|---|---|---|---|
| Pat1 | F | 11 | Frontal | 7 | 17.0 |
| Pat3 | F | 14 | Frontal | 6 | 21.9 |
| Pat5 | F | 7 | Frontal | 5 | 13.0 |
| Pat9 | F | 10 | Temporal/occipital | 4 | 12.3 |
| Pat10 | M | 3 | Temporal | 6 | 11.1 |
| Pat13 | F | 3 | Temporal/occipital | 5 | 14.0 |
| Pat14 | F | 9 | Frontal/temporal | 5 | 5.0 |
| Pat18 | F | 18 | Frontal | 6 | 23.0 |
| Pat20 | F | 6 | Temporal/parietal | 5 | 20.0 |
| Pat21 | F | 13 | Temporal/parietal | 4 | 20.9 |
| Pat23 | F | 6 | Temporal | 5 | 3.0 |
Figure 2. The overall structure of CSSPM. The loss function consists of two components. The first, a cross-entropy loss, is evaluated for labeled inputs only, since the ground truth y is given only for these data. With stochastic augmentation and dropout in the network, the entire neural network is treated as a stochastic model: the same input yields different results at different epochs. Hence, a mean squared error loss, evaluated on all training data, is applied to penalize the deviation between the current prediction z and the ensemble prediction. A ramp-up weighting function ω(t) controls the weight of this unsupervised mean squared error loss.
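The two-part loss and the ramp-up weight ω(t) described above can be sketched as follows. The Gaussian ramp-up shape, the ramp-up length T, and the maximum weight w_max are common temporal-ensembling choices assumed here, not values taken from the paper, and `cssp_style_loss` is a hypothetical name for illustration.

```python
import numpy as np

def rampup_weight(t, T=80, w_max=1.0):
    """Gaussian ramp-up for the unsupervised loss weight omega(t):
    near zero early in training, while the ensemble targets are still
    unreliable, and saturating at w_max after T epochs.
    (T=80 and w_max=1.0 are illustrative defaults.)"""
    if t >= T:
        return w_max
    phase = 1.0 - t / T
    return w_max * float(np.exp(-5.0 * phase ** 2))

def cssp_style_loss(z, z_ens, y, labeled_mask, t):
    """Two-part loss of Figure 2: cross-entropy on labeled inputs only,
    plus omega(t) * MSE between the current prediction z and the
    ensemble prediction z_ens, evaluated on ALL inputs."""
    # Numerically naive softmax, fine for illustration.
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    ce = -np.mean(np.log(probs[labeled_mask, y[labeled_mask]] + 1e-12))
    mse = np.mean((z - z_ens) ** 2)
    return ce + rampup_weight(t) * mse
```

Early in training ω(t) is near zero, so the supervised term dominates; as the ensemble predictions stabilize, the consistency term takes on full weight.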
Algorithm 1. The pseudocode of CSSPM.
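As a rough, NumPy-only sketch of the ensemble bookkeeping such a training loop maintains: each epoch's stochastic predictions are folded into an exponential moving average that serves as the consistency target. The EMA momentum `alpha` and the toy stochastic model are illustrative assumptions, and the gradient step is elided.

```python
import numpy as np

rng = np.random.default_rng(0)

def ensemble_targets(predict, X, epochs=50, alpha=0.6):
    """Bookkeeping of the ensemble predictions used as consistency
    targets: an exponential moving average of each epoch's stochastic
    predictions, with the usual startup bias correction."""
    Z = np.zeros((len(X), 2))                 # accumulated predictions
    z_ens = np.zeros_like(Z)
    for t in range(epochs):
        z = predict(X)                        # stochastic forward pass
        # (full training would compute CE + omega(t) * MSE(z, z_ens)
        #  here and take a gradient step before updating the ensemble)
        Z = alpha * Z + (1.0 - alpha) * z     # EMA accumulation
        z_ens = Z / (1.0 - alpha ** (t + 1))  # correct zero-init bias
    return z_ens

# Toy stochastic "model": fixed logits plus small per-call noise,
# standing in for augmentation + dropout.
X = rng.standard_normal((4, 3))
true_logits = X @ rng.standard_normal((3, 2))
predict = lambda X_: true_logits + 0.1 * rng.standard_normal(true_logits.shape)

targets = ensemble_targets(predict, X)   # averages out the per-epoch noise
```

The bias correction matters only in the first epochs, when the zero-initialized accumulator would otherwise drag the targets toward zero.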
Figure 3. The network architecture used in this study. It includes three convolution blocks; in each block, a batch normalization layer, a convolution layer with ReLU activation, and a max-pooling layer are stacked in turn. The first block uses 3D convolution, while the next two adopt 2D convolution. The features from these convolution blocks are flattened and passed through two fully connected layers, both with a dropout rate of 0.5, to generate the final prediction.
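To see how stacked conv + max-pool blocks shrink the feature maps before the fully connected layers, one dimension can be traced with the standard output-size formula. The input length, kernel size, and pool size below are illustrative guesses; Figure 3, not this sketch, fixes the real values.

```python
def conv_pool_out(size, k, pool, stride=1, pad=0):
    """Length along one dimension after a convolution
    (floor((size + 2*pad - k) / stride) + 1) followed by
    non-overlapping max-pooling with window `pool`."""
    return ((size + 2 * pad - k) // stride + 1) // pool

# Trace one spatial dimension through three blocks with an assumed
# kernel size of 3 and pool size of 2 per block.
size = 64                        # hypothetical input extent
for k, p in [(3, 2), (3, 2), (3, 2)]:
    size = conv_pool_out(size, k, p)
# size is now the per-dimension extent flattened into the FC layers
```

The same arithmetic applies per axis, whether the block is a 3D or a 2D convolution.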
Figure 4. The partitioning strategy for the seizure data, assuming five seizure events.
Figure 5. Definition of the seizure occurrence period (SOP) and the seizure prediction horizon (SPH). A prediction is correct only when the seizure occurs within the SOP.
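Under the common convention that the SOP begins once the SPH has elapsed after an alarm, the correctness rule of Figure 5 can be written as a small predicate. The 5 min SPH and 30 min SOP in the example are typical choices in the seizure-prediction literature, not necessarily the values used in this study.

```python
def is_correct_prediction(alarm_time, onset_time, sph, sop):
    """An alarm counts as a correct prediction iff the seizure onset
    falls inside the SOP, the window that opens once the SPH has
    elapsed after the alarm. All times share one unit (e.g. minutes)."""
    sop_start = alarm_time + sph
    return sop_start <= onset_time <= sop_start + sop

# With SPH = 5 min and SOP = 30 min, an alarm at t = 0 is correct
# for onsets in [5, 35] minutes.
assert is_correct_prediction(0, 20, sph=5, sop=30)       # inside the SOP
assert not is_correct_prediction(0, 3, sph=5, sop=30)    # during the SPH
assert not is_correct_prediction(0, 40, sph=5, sop=30)   # after the SOP
```

The SPH exists so that a correct alarm always leaves the patient a guaranteed minimum time to take precautions.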
Seizure prediction performance achieved by the baseline and CSSPM for all 11 patients.
| Patient | Baseline (all labeled): sensitivity (%) | FPR (/h) | p value | Baseline (one labeled): sensitivity (%) | FPR (/h) | p value | CSSPM: sensitivity (%) | FPR (/h) | p value |
|---|---|---|---|---|---|---|---|---|---|
| Pat1 | 85.7 ± 0.0 | 0.24 ± 0.00 | <0.001 | 85.7 ± 0.0 | 0.77 ± 0.06 | 0.005 | 100.0 ± 0.0 | 0.50 ± 0.03 | <0.001 |
| Pat3 | 100.0 ± 0.0 | 0.18 ± 0.00 | <0.001 | 41.7 ± 8.3 | 0.16 ± 0.02 | 0.008 | 66.7 ± 0.0 | 0.12 ± 0.07 | <0.001 |
| Pat5 | 80.0 ± 20.0 | 0.19 ± 0.03 | 0.001 | 40.0 ± 0.0 | 0.35 ± 0.04 | 0.185 | 60.0 ± 0.0 | 0.46 ± 0.00 | 0.062 |
| Pat9 | 50.0 ± 0.0 | 0.12 ± 0.12 | 0.067 | 50.0 ± 0.0 | 1.01 ± 0.20 | 0.519 | 50.0 ± 0.0 | 0.90 ± 0.08 | 0.459 |
| Pat10 | 33.3 ± 0.0 | 0.00 ± 0.00 | 0.025 | 33.3 ± 0.0 | 1.26 ± 0.27 | 0.857 | 33.3 ± 0.0 | 0.45 ± 0.05 | 0.348 |
| Pat13 | 80.0 ± 0.0 | 0.14 ± 0.00 | <0.001 | 80.0 ± 0.0 | 0.18 ± 0.04 | <0.001 | 90.0 ± 10.0 | 0.18 ± 0.05 | <0.001 |
| Pat14 | 80.0 ± 0.0 | 0.40 ± 0.00 | 0.004 | 60.0 ± 0.0 | 1.40 ± 0.00 | 0.506 | 80.0 ± 0.0 | 0.70 ± 0.10 | 0.029 |
| Pat18 | 100.0 ± 0.0 | 0.28 ± 0.02 | <0.001 | 50.0 ± 0.0 | 0.28 ± 0.02 | 0.033 | 83.3 ± 0.0 | 0.15 ± 0.02 | <0.001 |
| Pat20 | 100.0 ± 0.0 | 0.25 ± 0.05 | <0.001 | 80.0 ± 0.0 | 0.15 ± 0.10 | <0.001 | 100.0 ± 0.0 | 0.15 ± 0.00 | <0.001 |
| Pat21 | 100.0 ± 0.0 | 0.23 ± 0.09 | <0.001 | 75.0 ± 0.0 | 0.55 ± 0.17 | 0.046 | 100.0 ± 0.0 | 0.41 ± 0.02 | 0.001 |
| Pat23 | 100.0 ± 0.0 | 0.33 ± 0.00 | <0.001 | 80.0 ± 0.0 | 1.50 ± 0.17 | 0.224 | 100.0 ± 0.0 | 0.83 ± 0.17 | 0.005 |
| Ave | 82.6 ± 1.8 | 0.21 ± 0.02 | n.a. | 61.4 ± 0.8 | 0.70 ± 0.08 | n.a. | 78.5 ± 0.9 | 0.44 ± 0.04 | n.a. |
Comparison of the baseline and CSSPM while increasing the number of labeled recordings in the training set (sensitivity (%) / FPR (/h)).
| Patient | Baseline (all labeled) | One labeled: baseline | One labeled: CSSPM | Two labeled: baseline | Two labeled: CSSPM | Three labeled: baseline | Three labeled: CSSPM |
|---|---|---|---|---|---|---|---|
| Pat1 | 85.7/0.24 | 85.7/0.77 | 100.0/0.50 | 85.7/0.38 | 100.0/0.44 | 85.7/0.16 | 100.0/0.15 |
| Pat3 | 100.0/0.18 | 41.7/0.16 | 66.7/0.12 | 66.7/0.14 | 66.7/0.05 | 75.0/0.14 | 75.0/0.07 |
| Pat10 | 33.3/0.00 | 33.3/1.26 | 33.3/0.45 | 41.7/0.72 | 50.0/0.50 | 58.4/0.72 | 58.4/0.68 |
| Pat18 | 100.0/0.28 | 50.0/0.28 | 83.3/0.15 | 83.3/0.20 | 83.3/0.11 | 83.3/0.13 | 83.3/0.09 |
| Ave | 79.8/0.18 | 52.7/0.62 | 70.8/0.31 | 69.4/0.36 | 75.0/0.28 | 75.6/0.29 | 79.2/0.25 |
The performance of CSSPM with and without unlabeled data.
| Patient | Only labeled data: sensitivity (%) | FPR (/h) | p value | Labeled and unlabeled data: sensitivity (%) | FPR (/h) | p value |
|---|---|---|---|---|---|---|
| Pat1 | 100.0 ± 0.0 | 0.56 ± 0.03 | <0.001 | 100.0 ± 0.0 | 0.50 ± 0.03 | <0.001 |
| Pat3 | 41.7 ± 8.3 | 0.18 ± 0.05 | 0.010 | 66.7 ± 0.0 | 0.12 ± 0.07 | <0.001 |
| Pat5 | 60.0 ± 0.0 | 0.46 ± 0.00 | 0.062 | 60.0 ± 0.0 | 0.46 ± 0.00 | 0.062 |
| Pat9 | 50.0 ± 0.0 | 0.90 ± 0.12 | 0.476 | 50.0 ± 0.0 | 0.90 ± 0.08 | 0.459 |
| Pat10 | 33.3 ± 0.0 | 1.08 ± 0.09 | 0.783 | 33.3 ± 0.0 | 0.45 ± 0.05 | 0.348 |
| Pat13 | 80.0 ± 0.0 | 0.18 ± 0.04 | <0.001 | 90.0 ± 10.0 | 0.18 ± 0.05 | <0.001 |
| Pat14 | 80.0 ± 0.0 | 0.80 ± 0.00 | 0.043 | 80.0 ± 0.0 | 0.70 ± 0.10 | 0.029 |
| Pat18 | 66.7 ± 0.0 | 0.20 ± 0.02 | 0.001 | 83.3 ± 0.0 | 0.15 ± 0.02 | <0.001 |
| Pat20 | 80.0 ± 0.0 | 0.15 ± 0.00 | <0.001 | 100.0 ± 0.0 | 0.15 ± 0.00 | <0.001 |
| Pat21 | 75.0 ± 0.0 | 0.43 ± 0.00 | 0.025 | 100.0 ± 0.0 | 0.41 ± 0.02 | 0.001 |
| Pat23 | 80.0 ± 0.0 | 1.67 ± 0.00 | 0.281 | 100.0 ± 0.0 | 0.83 ± 0.17 | 0.005 |
| Ave | 67.9 ± 0.8 | 0.60 ± 0.01 | n.a. | 78.5 ± 0.9 | 0.44 ± 0.04 | n.a. |
The performance of CSSPM with and without Gaussian augmentation (average of all 11 patients).
| Configuration | Sensitivity (%) | FPR (/h) |
|---|---|---|
| CSSPM (without augmentation) | 75.4 ± 0.7 | 0.49 ± 0.03 |
| CSSPM (with augmentation) | 78.5 ± 0.9 | 0.44 ± 0.04 |