Vignesh Raja Karuppiah Ramachandran, Huibert J Alblas, Duc V Le, Nirvana Meratnia.
Abstract
In the last decade, seizure prediction systems have gained considerable attention because of their potential to improve the quality of life of epileptic patients. The accuracy of prediction algorithms in real-world applications is limited because brain signals are inherently uncertain and are affected by factors such as environment, age, and drug intake, in addition to the internal artefacts that occur while the signals are recorded. To deal with such ambiguity, researchers traditionally use active learning, which selects ambiguous data to be annotated by an expert and updates the classification model dynamically. However, selecting which data from a large pool of ambiguous samples should be labelled by an expert remains a challenging problem. In this paper, we propose an active learning-based prediction framework that aims to improve prediction accuracy with a minimum number of labelled data. The core technique of our framework is to employ the Bernoulli-Gaussian Mixture model (BGMM) to determine the feature samples that are most ambiguous and should be annotated by an expert. In doing so, our approach facilitates expert intervention and increases medical reliability. We evaluate seven different classifiers in terms of the classification time and memory required. An active learning framework built on top of the best-performing classifier is evaluated in terms of the annotation effort required to achieve a high level of prediction accuracy. The results show that our approach can achieve the same accuracy as a Support Vector Machine (SVM) classifier using only 20% of the labelled data and also improves the prediction accuracy even under noisy conditions.
Keywords: EEG; epilepsy; health-care; implantable body sensor networks; machine learning; seizure prediction; signal processing
Year: 2018 PMID: 29795031 PMCID: PMC6022213 DOI: 10.3390/s18061698
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Example of a schematic flow of the closed-loop operation of the DBS process. The expert is always kept in the loop before any treatment is applied locally. Picture adapted under CC BY 3.0 license from [20].
Figure 2. Four states of the ElectroCorticoGraph (ECoG) of an epileptic patient [22]. A snapshot of data captured from an array of 16 brain-implanted electrodes of an epileptic patient with generalized tonic-clonic seizure. The sampling rate is 500 Hz and the data were recorded for a total duration of 40 h. The durations of the seizure prediction horizon and the seizure occurrence period are illustrated with respect to the onset of a seizure alarm. (A snapshot of the ECoG signal obtained from the intracranial ElectroEncephaloGraphy (iEEG) viewer [24].)
Technical specifications of implantable medical devices.
| Network Parameter | Pace-Maker [ | Neural Stimulators [ | Drug-Delivery Systems [ | Cochlear Implants [ | Endoscopy Capsules [ |
|---|---|---|---|---|---|
| Processing duty-cycle | up to 25% of the ON time | up to 50% of the ON time | up to 15% of the ON time | up to 75% of the ON time | up to 100% of the ON time |
| Processing CPU clock | 10–100 kHz | 10–100 kHz | 10–100 kHz | 10–100 MHz | 10–100 MHz |
| Longevity | up to 5 years | up to 5 years | up to 5 years | up to 5 years | up to 2 days |
| Battery | up to 5 Ah | up to 5 Ah | up to 5 Ah | up to 5 Ah | up to 5 Ah |
| Memory | up to 128 kB | up to 128 kB | up to 64 kB | up to 2 MB | up to 256 MB |
| Telemetry | Yes | Yes | Yes | Yes | Yes |
A selective summary of seizure prediction methods.
| Name of the Work | Machine Learning Method | Dataset Used | Feature Used | Validation Method | Sensitivity | Specificity FP/h |
|---|---|---|---|---|---|---|
| Cook et al. [ | Decision Tree, k-Nearest Neighbour | Private data | Average energy, Teager-Kaiser energy, line length | Comparison with random predictor based on ground truth | 65–100% | Not reported |
| Shiao et al. [ | Support Vector Machine, intuitive data segmentation | Kaggle.com, iEEG.org | Band-pass filtered power density, power of Fast Fourier Transform bin, correlation matrix | Ground truth | 89–100% | 0–0.3 FP/day |
| Xiao et al. [ | Adaptive Linear Discriminant Analysis, Adaptive Naive Bayes | Private data | Lyapunov exponent, pairwise Euclidean distance, T-statistic, Pearson correlation, temporal pattern | Comparison with random predictor based on ground truth | 72–82% | 0.69–0.93 FP/horizon |
| Parvez et al. [ | Least squares Support Vector Machine | Freiburg | Customized phase correlation | Comparison with six existing methods based on ground truth | 91–95% | 2.4 FP/patient |
| Aarabi et al. [ | Rule-based classification | Freiburg | Lempel-Ziv complexity, Lyapunov exponent, nonlinear interdependence, correlation dimension, correlation entropy, noise level | Ground truth | 86.7–92.9% | 0.64–4.69 FP/h |
| Gadhoumi et al. [ | Discriminant analysis based classification | Private data | Wavelet energy, entropy, state-similarity using inclusion, persistence, & distance measures | Comparison with random predictor based on ground truth | 85–100% | 0.1–0.35 FP/h |
Authors used a different notion to illustrate specificity; Authors calculated false prediction in a specific prediction horizon and not uniformly over an hour.
Figure 3. Schematic flow of our seizure prediction framework. The base classifier is expected to output both a soft label (or degree of certainty), which is used to determine the certainty of the classification, and a crisp label (or label), which identifies the class.
Figure 4. Schematic flow of the active learner block based on the Bernoulli-Gaussian model. This model takes as inputs the ambiguous samples together with the base classifier’s label predictions and their certainty.
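The two captions above describe a base classifier that emits a soft label (certainty) and a crisp label, with the uncertain samples routed to the active learner. The sketch below illustrates only that routing step. It uses a plain margin threshold around 0.5 as a stand-in for the Bernoulli-Gaussian mixture model, which is not reproduced here; the function names, the threshold value, and the example probabilities are all hypothetical.

```python
from typing import List, Tuple


def crisp_label(p_preictal: float) -> int:
    """Crisp label from a soft label: 1 = pre-ictal, 0 = inter-ictal."""
    return 1 if p_preictal >= 0.5 else 0


def split_by_certainty(
    soft_labels: List[float], margin: float = 0.2
) -> Tuple[List[int], List[int]]:
    """Return (clear, ambiguous) sample indices.

    A sample counts as ambiguous when its soft label lies within
    `margin` of 0.5. This threshold rule is an assumed simplification,
    not the mixture model named in the caption.
    """
    clear, ambiguous = [], []
    for i, p in enumerate(soft_labels):
        if abs(p - 0.5) < margin:
            ambiguous.append(i)  # would be sent to the expert for labelling
        else:
            clear.append(i)      # crisp label is accepted as-is
    return clear, ambiguous


# Hypothetical soft labels for four feature samples.
soft = [0.95, 0.55, 0.10, 0.45]
clear_idx, ambiguous_idx = split_by_certainty(soft)
labels = [crisp_label(p) for p in soft]
```

Only the samples in `ambiguous_idx` would need expert annotation under this rule, which is the mechanism by which active learning reduces labelling effort.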
Data characteristics used in our study accessed from [49]. Source: [37].
| Subject | Sampling Rate (Hz) | # of Inter-ictal Segments (h) | # of Pre-ictal Segments (h) | # of Lead Seizures | % of Pre-ictal Segments | % of Inter-ictal Segments | Total Duration of Labelled Data (h) |
|---|---|---|---|---|---|---|---|
| Dog1 | 400 | 480 (80) | 24 (4) | 8 | 4.8% | 95.2% | 84 |
| Dog2 | 400 | 500 (83) | 42 (7) | 40 | 7.8% | 92.2% | 90 |
| Dog3 | 400 | 1440 (240) | 72 (12) | 18 | 4.8% | 95.2% | 252 |
| Dog4 | 400 | 804 (134) | 97 (16) | 27 | 10.8% | 89.2% | 150 |
| Dog5 | 400 | 450 (75) | 30 (5) | 8 | 6.2% | 93.8% | 80 |
| Patient1 | 5000 | 50 (8) | 18 (3) | 4 | 26.5% | 73.5% | 11 |
| Patient2 | 5000 | 42 (7) | 18 (3) | 6 | 30.0% | 70.0% | 10 |
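The percentage columns in the table above follow directly from the segment counts, and they show how strongly the data are skewed towards the inter-ictal class. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
from typing import Tuple


def class_shares(n_inter: int, n_pre: int) -> Tuple[float, float]:
    """Return (% pre-ictal, % inter-ictal) from the two segment counts."""
    total = n_inter + n_pre
    return 100.0 * n_pre / total, 100.0 * n_inter / total


# Dog1: 480 inter-ictal and 24 pre-ictal segments (values from the table).
pre_pct, inter_pct = class_shares(480, 24)
dog1 = (round(pre_pct, 1), round(inter_pct, 1))  # (4.8, 95.2)
```

The segment and hour counts in the table are also consistent with 10-minute segments (for example, 480 segments for 80 h), although that segment length is inferred from the numbers rather than stated in this excerpt.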
Separation of 1 h blocks of data for fair training and validation for Dog1. Each number represents the sequence number of a 1 h block [25]. Table 5, Table 6, Table 7, Table 8, Table 9 and Table 10 present the separation of 1 h blocks of data for each subject. However, not all iterations are shown because they are repetitive.
| Subject | Training Set: Inter-Ictal | Training Set: Pre-Ictal | Validation Set: Inter-Ictal | Validation Set: Pre-Ictal |
|---|---|---|---|---|
| Dog1 (Iteration1) | 2–80 | 2,3,4 | 1 | 1 |
| Dog1 (Iteration2) | 1,3–80 | 1,3,4 | 2 | 2 |
| Dog1 (Iteration3) | 1,2,4–80 | 1,2,4 | 3 | 3 |
| Dog1 (Iteration4) | 1–3,5–80 | 1,2,3 | 4 | 4 |
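The block separation in the table above amounts to a leave-one-block-out scheme: in iteration k, the k-th 1 h block of each class is held out for validation and the remaining blocks are used for training. A minimal sketch that reproduces the Dog1 rows (the function name is hypothetical, and block numbering starts at 1 as in the table):

```python
from typing import Iterator, List, Tuple

Split = Tuple[List[int], List[int], List[int], List[int]]


def leave_one_block_out(n_inter: int, n_pre: int) -> Iterator[Split]:
    """Yield (train_inter, train_pre, val_inter, val_pre) per iteration.

    One iteration is produced per pre-ictal block; the block with the
    same sequence number is held out from both classes.
    """
    for k in range(1, n_pre + 1):
        train_inter = [b for b in range(1, n_inter + 1) if b != k]
        train_pre = [b for b in range(1, n_pre + 1) if b != k]
        yield train_inter, train_pre, [k], [k]


# Dog1: 80 inter-ictal and 4 pre-ictal 1 h blocks.
dog1_splits = list(leave_one_block_out(80, 4))
```

Splitting by whole 1 h blocks rather than by individual samples keeps neighbouring, highly correlated windows from ending up on both sides of the split.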
Selection of data blocks for validating active learners.
| Subject | Training Set: Inter-Ictal | Training Set: Pre-Ictal | AL-Validation Set: Inter-Ictal | AL-Validation Set: Pre-Ictal |
|---|---|---|---|---|
| Dog5 (benchmark) | 1-75 | 1,2,3,4,5 | 71,72,73,74,75 | 1,2,3,4,5 |
| Dog5 (Iteration1) | 1-74 | 1,2,3,4 | 75 | 5 |
| Dog5 (Iteration2) | 1-73 | 1,2,3 | 74,75 | 4,5 |
| Dog5 (Iteration3) | 1-72 | 1,2 | 73,74,75 | 3,4,5 |
| Dog5 (Iteration4) | 1-71 | 1 | 72,73,74,75 | 2,3,4,5 |
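In the iteration rows of the table above, each step moves one more block from the end of the training range into the active-learning validation set. The sketch below reproduces that pattern for the numbered iterations only (the benchmark row follows a different rule and is not covered); the function name is hypothetical.

```python
from typing import List, Tuple


def al_validation_split(
    n_inter: int, n_pre: int, k: int
) -> Tuple[List[int], List[int], List[int], List[int]]:
    """Return (train_inter, train_pre, val_inter, val_pre) for iteration k.

    The last k blocks of each class are held out for validating the
    active learner; the earlier blocks remain in the training set.
    """
    train_inter = list(range(1, n_inter - k + 1))
    train_pre = list(range(1, n_pre - k + 1))
    val_inter = list(range(n_inter - k + 1, n_inter + 1))
    val_pre = list(range(n_pre - k + 1, n_pre + 1))
    return train_inter, train_pre, val_inter, val_pre


# Dog5: 75 inter-ictal and 5 pre-ictal 1 h blocks, iteration 4.
tr_i, tr_p, va_i, va_p = al_validation_split(75, 5, k=4)
```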
Figure 5. Performance of classifiers for different sets of features.
Figure 6. Accuracy of a linear Support Vector Machine classifier as a function of window size.
Figure 7. Classification time for 20 s of ElectroCorticoGraphy data from 16 channels using feature set, timed with Dog5 data.
Performance of the Support Vector Machine classifier in terms of False Positive Rate (FPR) and False Negative Rate (FNR) with Dog5 Pre-ictal dataset using feature set.
| Iteration | FPR (%) | FNR (%) | Clear Feature Samples (%) | Ambiguous Feature Samples (%) |
|---|---|---|---|---|
| Dog5 (Iteration1) | 2.30 | 23.2 | 52 | 48 |
| Dog5 (Iteration2) | 3.62 | 20.7 | 63 | 37 |
| Dog5 (Iteration3) | 2.45 | 13.8 | 37 | 63 |
| Dog5 (Iteration4) | 1.54 | 22.5 | 46 | 54 |
| Dog5 (Iteration5) | 4.47 | 31.3 | 22 | 78 |
Performance of the Support Vector Machine classifier in terms of False Positive Rate (FPR) and False Negative Rate (FNR) with Dog5 Inter-ictal dataset using feature set.
| Iteration | FPR (%) | FNR (%) | Clear Feature Samples (%) | Ambiguous Feature Samples (%) |
|---|---|---|---|---|
| Dog5 (Iteration1) | 0.01 | 0.12 | 92 | 8 |
| Dog5 (Iteration2) | 0.41 | 0.21 | 93 | 7 |
| Dog5 (Iteration3) | 0.21 | 0.87 | 91 | 9 |
| Dog5 (Iteration4) | 0.14 | 1.01 | 97 | 3 |
| Dog5 (Iteration5) | 0.37 | 0.91 | 96 | 4 |
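For readers comparing the FPR and FNR columns in the two tables above, the sketch below uses the standard definitions of the two rates from confusion-matrix counts. How the study counted positives and negatives per window is not given in this excerpt, so the counts in the example are purely hypothetical and the function name is an assumption.

```python
from typing import Tuple


def error_rates(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Return (FPR %, FNR %) from confusion-matrix counts.

    FPR = FP / (FP + TN): inter-ictal samples flagged as pre-ictal.
    FNR = FN / (FN + TP): pre-ictal samples that were missed.
    """
    fpr = 100.0 * fp / (fp + tn)
    fnr = 100.0 * fn / (fn + tp)
    return fpr, fnr


# Hypothetical counts, chosen only to show the arithmetic.
fpr_pct, fnr_pct = error_rates(tp=3, fp=2, tn=98, fn=1)  # (2.0, 25.0)
```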
Dog2.
| Subject | Training Set: Inter-Ictal | Training Set: Pre-Ictal | Validation Set: Inter-Ictal | Validation Set: Pre-Ictal |
|---|---|---|---|---|
| Dog2 (It.1) | 2–83 | 2–7 | 1 | 1 |
| Dog2 (It.2) | 1,3–83 | 1,3–7 | 2 | 2 |
| ... | ... | ... | ... | ... |
| Dog2 (It.7) | 1–6,8–83 | 1–6 | 7 | 7 |
Dog3.
| Subject | Training Set: Inter-Ictal | Training Set: Pre-Ictal | Validation Set: Inter-Ictal | Validation Set: Pre-Ictal |
|---|---|---|---|---|
| Dog3 (It.1) | 2–240 | 2–12 | 1 | 1 |
| Dog3 (It.2) | 1,3–240 | 1,3–12 | 2 | 2 |
| ... | ||||
| ... | ||||
| Dog3 (It.12) | 1–11,13–240 | 1–11 | 12 | 12 |
Dog4.
| Subject | Training Set: Inter-Ictal | Training Set: Pre-Ictal | Validation Set: Inter-Ictal | Validation Set: Pre-Ictal |
|---|---|---|---|---|
| Dog4 (It.1) | 2–134 | 2–16 | 1 | 1 |
| Dog4 (It.2) | 1,3–134 | 1,3–16 | 2 | 2 |
| ... | ||||
| ... | ||||
| Dog4 (It.16) | 1–15,17–134 | 1–15 | 16 | 16 |
Dog5.
| Subject | Training Set: Inter-Ictal | Training Set: Pre-Ictal | Validation Set: Inter-Ictal | Validation Set: Pre-Ictal |
|---|---|---|---|---|
| Dog5 (It.1) | 2–75 | 2–5 | 1 | 1 |
| Dog5 (It.2) | 1,3–75 | 1,3–5 | 2 | 2 |
| ... | ||||
| ... | ||||
| Dog5 (It.5) | 1–4,6–75 | 1–4 | 5 | 5 |
Patient1.
| Subject | Training Set: Inter-Ictal | Training Set: Pre-Ictal | Validation Set: Inter-Ictal | Validation Set: Pre-Ictal |
|---|---|---|---|---|
| Patient1 (It.1) | 2–8 | 2,3 | 1 | 1 |
| Patient1 (It.2) | 1,3–8 | 1,3 | 2 | 2 |
| Patient1 (It.3) | 1,2,4–8 | 1,2 | 3 | 3 |
Patient2.
| Subject | Training Set: Inter-Ictal | Training Set: Pre-Ictal | Validation Set: Inter-Ictal | Validation Set: Pre-Ictal |
|---|---|---|---|---|
| Patient2 (It.1) | 2–7 | 2,3 | 1 | 1 |
| Patient2 (It.2) | 1,3–7 | 1,3 | 2 | 2 |
| Patient2 (It.3) | 1,2,4–7 | 1,2 | 3 | 3 |
Performance of the Support Vector Machine classifier in terms of False Positive Rate (FPR) and False Negative Rate (FNR) for the pre-ictal datasets of all subjects using feature set. Only the classifier performance of the best iterations is shown.
| Subject | FPR (%) | FNR (%) | Clear Feature Samples (%) | Ambiguous Feature Samples (%) |
|---|---|---|---|---|
| Dog1 | 4.15 | 34.2 | 29 | 71 |
| Dog2 | 3.74 | 26.7 | 24 | 76 |
| Dog3 | 3.45 | 25.8 | 21 | 79 |
| Dog4 | 3.88 | 29.8 | 26 | 74 |
| Dog5 | 4.47 | 31.3 | 22 | 78 |
| Patient1 | 8.24 | 38.5 | 16 | 84 |
| Patient2 | 10.47 | 41.3 | 12 | 88 |
Performance of the Support Vector Machine classifier in terms of False Positive Rate (FPR) and False Negative Rate (FNR) for the inter-ictal datasets of all subjects using feature set. Only the classifier performance of the best iterations is shown.
| Subject | FPR (%) | FNR (%) | Clear Feature Samples (%) | Ambiguous Feature Samples (%) |
|---|---|---|---|---|
| Dog1 | 0.19 | 0.18 | 94 | 6 |
| Dog2 | 0.12 | 0.23 | 95 | 5 |
| Dog3 | 0.02 | 0.11 | 98 | 2 |
| Dog4 | 0.25 | 0.62 | 97 | 3 |
| Dog5 | 0.37 | 0.91 | 96 | 4 |
| Patient1 | 0.54 | 1.45 | 91 | 9 |
| Patient2 | 0.47 | 1.32 | 90 | 10 |
Figure 8. Number of seizure episodes missed from classification as a function of threshold for all datasets.
Figure 9. Label complexity of three different settings of the Support Vector Machine classifier. In each setting, the error rate is measured as a function of label fraction.
Figure 10. Label complexity of three different settings of the Support Vector Machine classifier. Error rate as a function of label fraction.
Figure 11. Noise complexity of three different settings of the Support Vector Machine classifier. Accuracy as a function of noise rate.
Performance of the Support Vector Machine classifier in terms of False Positive Rate (FPR) and False Negative Rate (FNR) for the overall dataset using feature set.
| Subject | Random Poisson Predictor: TPR | Random Poisson Predictor: FPR | Random Poisson Predictor: Prediction Horizon (min) | Active Learning Predictor: TPR | Active Learning Predictor: FPR | Active Learning Predictor: Prediction Horizon (min) |
|---|---|---|---|---|---|---|
| Dog1 | 0.25 | 0.98 | 8 | 0.88 | 0.12 | 14.5 |
| Dog2 | 0.13 | 0.96 | 94 | 0.95 | 0.08 | 20.3 |
| Dog3 | 0.17 | 0.94 | 62 | 0.94 | 0.11 | 15.6 |
| Dog4 | 0.15 | 0.99 | 4 | 0.88 | 0.03 | 18.2 |
| Dog5 | 0.13 | 0.97 | 12 | 0.88 | 0.13 | 36.2 |
| Patient1 | 0.25 | 0.98 | 30 | 0.75 | 0.30 | 20.2 |
| Patient2 | 0.16 | 0.97 | 60 | 0.83 | 0.16 | 23.2 |
Performance of active learning based seizure prediction framework in terms of True Prediction per hour (TP/h) and False Prediction per hour (FP/h) for the overall dataset using feature set.
| Subject | Random Poisson Predictor: TP/h | Random Poisson Predictor: FP/h | Active Learning Predictor: TP/h | Active Learning Predictor: FP/h |
|---|---|---|---|---|
| Dog1 | 0.05 | 0.95 | 0.90 | 0.10 |
| Dog2 | 0.09 | 0.91 | 0.92 | 0.08 |
| Dog3 | 0.06 | 0.94 | 0.97 | 0.03 |
| Dog4 | 0.02 | 0.98 | 0.89 | 0.11 |
| Dog5 | 0.04 | 0.96 | 0.95 | 0.05 |
| Patient1 | 0.02 | 0.98 | 0.40 | 0.60 |
| Patient2 | 0.03 | 0.97 | 0.74 | 0.26 |
Comparison of our study with existing works.
| Parameters | This Study | Bandarabadi et al. [ | Li et al. [ | Williamson et al. [ | Aarabi and He [ | Kuhlmann et al. [ | Gadhoumi et al. [ |
|---|---|---|---|---|---|---|---|
| Feature-type | Time-domain | Time-Frequency | Time-domain | Time-domain | Time-domain | Time-domain | Time-Frequency |
| Database | iEEG | EPILEPSIAE | FSPEEG | FSPEEG | FSPEEG | Freiburg | Private |
| FP/h | 0.03–0.60 | 0.15 | 0.11–0.15 | 0.03–0.07 | 0.11–0.17 | 0.64–4.69 | 0.1–0.35 |
| TP/h | 40–97% | 78.36% | 56–72% | 86–95% | 79.9–90.2% | 50–88% | 85% |
| # of Subjects | 7 | 24 | 21 | 21 | 21 | 6 | 17 |
FSPEEG has been discontinued; it has been complemented and replaced by the larger EPILEPSIAE database. 8 of the 24 subjects had intra-cranial EEG recordings.