Artur Gramacki1, Jarosław Gramacki2.
Abstract
Electroencephalogram (EEG) is one of the main diagnostic tests for epilepsy. The detection of epileptic activity is usually performed by a human expert and is based on finding specific patterns in the multi-channel electroencephalogram. This is a difficult and time-consuming task; therefore, various attempts are made to automate it using both conventional and Deep Learning (DL) techniques. Unfortunately, authors often do not provide sufficiently detailed and complete information to allow their results to be reproduced. Our work is intended to fill this gap. Using 79 carefully selected neonatal EEG recordings, we developed a complete framework for seizure detection using a DL approach. We share ready-to-use R and Python code which allows one to: (a) read raw European Data Format files, (b) read data files containing the seizure annotations made by human experts, (c) extract train, validation and test data, (d) create an appropriate Convolutional Neural Network (CNN) model, (e) train the model, (f) check the quality of the neural classifier, (g) save all learning results.
Year: 2022 PMID: 35906248 PMCID: PMC9338048 DOI: 10.1038/s41598-022-15830-2
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. The overall workflow of the proposed system.
Figure 2. Electrode locations of the International 10–20 system for EEG recording (figure taken from [30]).
Numbers of seizures for every infant annotated by 3 experts (marked as A, B and C). Cells marked with a hyphen (-) mean that no seizure was annotated for a given infant by a given expert. 40 neonates had a seizure annotated by all 3 experts (EXP3 subset), 22 neonates were seizure free (EXP0 subset) and 17 neonates had a seizure annotated by 1 or 2 experts (EXP12 subset).
Lengths (in whole seconds) of seizures for every infant annotated by 3 experts (marked as A, B and C). When a given expert did not mark any seizures for a given infant, it was marked with a hyphen (-).
| Infant | A | B | C |
|---|---|---|---|
| 1 | 18, 135, 59, 29, 31, 49, 57, 23, 87, 23, 31, 93, 104, 34, 25, 24, 78, 122, 74, 333, 19, 23, 99, 15, 17 | 17, 27, 17, 158, 36, 57, 26, 50, 29, 115, 10, 12, 80, 119, 106, 99, 134, 30, 8, 42, 140, 61, 160, 35, 47, 25, 42, 96, 76, 51, 159, 52, 11, 148, 168, 43, 23, 28, 228, 249, 39, 88 | 26, 120, 16, 19, 9, 9, 105, 16, 12, 13, 35, 11, 17, 21, 34, 14, 12, 12, 7, 14, 14, 43, 73, 15, 14, 10, 11, 27, 9, 25, 71, 27, 12, 14, 12, 98 |
| 2 | 47, 18 | - | - |
| 4 | 882, 43 | 48, 30, 33, 931, 42, 52, 25 | 102, 35, 850 |
| 5 | 127, 631, 534, 853, 1180 | 107, 621, 454, 825, 1182 | 147, 620, 457, 571, 245, 419, 725 |
| 6 | - | - | 16, 98, 69, 308 |
| 7 | 16, 18, 133, 141, 168, 149 | 15, 10, 12, 11, 23, 165, 25, 45, 26, 45, 27, 16, 26, 10, 9, 10, 27, 141, 12, 167, 148 | 15, 12, 131, 138, 164, 148 |
| 8 | 32 | 12, 61 | - |
| 9 | 708, 158, 16 | 12, 715, 26, 157, 37, 52, 17, 25 | 705, 148, 9 |
| 11 | 21, 33, 45 | 14, 36, 28, 49 | 40 |
| 12 | - | - | 31 |
| 13 | 292, 496, 112, 233, 138 | 315, 495, 138, 240, 125, 108 | 292, 495, 98, 237, 119, 105 |
| 14 | 24, 17, 273, 21, 49, 25, 19, 16, 14, 113, 19, 20, 19, 19, 16, 344, 110, 20, 16, 18, 55, 17, 29, 28, 463, 49, 24, 10, 31, 23, 16, 21, 18, 14, 29, 14, 14, 13, 15, 13, 12, 20, 14, 11, 17 | 29, 338, 48, 26, 19, 16, 16, 113, 19, 23, 18, 545, 16, 108, 779, 25, 15, 36, 14, 12, 13, 62, 15, 14, 21, 73 | 61, 24, 337, 48, 25, 18, 16, 15, 26, 85, 19, 20, 17, 16, 378, 136, 15, 114, 29, 29, 461, 52, 24, 9, 65, 15, 19, 14, 11, 20, 13, 11, 12, 45, 13, 13, 19, 70 |
| 15 | 23, 53, 133, 133, 20, 29, 167, 39, 41, 51, 38, 154, 61, 155, 54, 41, 164, 52, 63 | 164, 64, 51, 155, 93, 46 | 20, 41, 78, 35, 115, 158, 40, 43, 64, 37, 23, 10, 142, 64, 149, 55, 36, 10, 131, 57 |
| 16 | 45, 14, 19, 18, 42, 29, 14, 21, 18, 17, 14, 13, 16, 24, 12, 15, 22, 11, 17, 25, 13, 32, 17, 26, 24, 25, 22, 23, 23, 9 | 42, 64, 39, 96, 13, 70, 32, 78, 96, 96, 123, 88, 77, 88, 95, 97, 105, 105, 111, 306, 119, 115, 119, 122, 78, 207, 95, 100, 25, 109, 73, 37, 112, 118, 97, 119, 115, 103, 79, 102, 109, 87, 22, 100, 64 | 25, 32, 23, 22, 13, 11, 15, 15, 29, 11, 22, 22, 26, 21, 21, 21, 21, 21, 26, 16, 18, 32, 30, 37, 37, 12, 31, 36, 17, 26, 11, 27, 26, 12, 31, 14, 17, 22, 11, 22, 20, 13, 14, 24, 24, 26, 26, 19, 28, 14, 23, 18, 27, 31, 27, 23, 29, 16, 27, 15, 29 |
| 17 | 55, 52, 42, 22 | 213, 92, 278 | 40, 9, 11 |
| 19 | 89, 57, 103, 156, 102, 64, 686, 899, 48 | 81, 112, 92, 18, 108, 155, 163, 79, 686, 15, 1002, 92, 76 | 63, 46, 61, 158, 140, 63, 674, 901, 30, 46 |
| 20 | 48, 132, 42, 28, 30, 45, 74, 23, 29, 27, 31, 38, 17, 97, 59, 12, 19 | 193, 225, 22, 59, 219, 103, 81, 167, 140, 75, 58, 66, 92, 116, 17, 41, 10, 76, 34 | 27, 73, 137, 14, 14, 21, 29, 44, 101, 44, 32, 75, 29, 28, 71, 32, 35, 13, 11, 20, 114, 14, 60, 16 |
| 21 | 43 | 42 | 39 |
| 22 | 71, 88, 187, 58, 77, 26, 52, 138 | 113, 353, 173, 72, 303, 365 | 57, 83, 150, 66, 97, 53, 134 |
| 23 | 67, 444, 28, 27 | - | 10, 13, 13, 24, 10, 11, 11, 15, 20 |
| 24 | - | 295 | - |
| 25 | 15, 20, 24, 18, 51, 17, 29, 13, 41, 11, 23, 22 | 200, 14, 21, 7, 26, 54, 35, 99, 71, 18, 10, 63, 142, 38 | 13, 23, 20, 17 |
| 26 | - | - | 12, 10, 11, 12, 15, 14, 22 |
| 31 | 80, 102 | 80, 101 | 68, 100 |
| 33 | 582 | - | 13, 11, 9, 15, 14, 27 |
| 34 | 455 | 452 | 452 |
| 36 | 147, 345 | 532 | 110, 341 |
| 38 | 239, 132, 63, 571, 267, 408, 73, 197, 241, 63, 55, 294, 86, 38, 97, 78, 29, 22, 30 | 302, 138, 96, 577, 52, 376, 818, 451, 622, 552, 227, 94, 78, 47, 50, 25 | 235, 100, 53, 269, 159, 91, 49, 292, 396, 73, 197, 38, 236, 56, 88, 228, 53, 34, 70, 70, 20, 15, 27, 15 |
| 39 | 1479, 180, 92, 223, 96, 189 | 1477, 27, 165, 88, 312, 216, 232 | 1477, 171, 109, 189, 104, 179 |
| 40 | 29, 19, 22, 20, 17, 29, 100, 21, 26, 33, 135, 150 | 14, 20, 28, 12, 19, 22, 14, 16, 29, 101, 21, 16, 16, 90, 40, 135, 20, 14 | 26, 25, 24, 137, 15, 127, 28, 131 |
| 41 | 158, 166, 350, 219, 215, 156, 200, 193, 14, 157, 471, 148, 322, 201, 121, 130, 177, 119, 41, 162, 153, 59, 63, 117, 159, 104, 158, 167, 25, 38, 90, 238, 140, 173, 107, 69, 98, 144, 263, 174, 63, 85, 606, 868, 200 | 951, 602, 436, 2708, 864, 364, 536, 1232, 1491, 266 | 159, 167, 131, 197, 213, 211, 85, 83, 211, 187, 11, 58, 130, 169, 94, 138, 147, 172, 164, 216, 153, 134, 38, 51, 79, 145, 51, 163, 152, 168, 160, 158, 117, 171, 170, 43, 41, 14, 83, 243, 148, 195, 140, 145, 47, 93, 154, 210, 45, 189, 59, 74, 32, 633, 380, 479, 206, 36 |
| 43 | - | - | 23, 12, 12, 9 |
| 44 | 20, 82, 18, 102, 15, 19, 85 | 21, 82, 20, 33, 102, 11, 9, 14, 19, 86 | 17, 79, 17, 17, 98, 10, 16, 81 |
| 46 | - | - | 36, 25, 10, 12 |
| 47 | 61, 58, 92 | 63, 67, 35, 58, 89 | 56, 56, 89 |
| 50 | 89, 98, 91, 97, 94, 104, 97, 102, 70, 88 | 91, 83, 84, 74, 108, 91, 104, 86, 80, 97 | 88, 86, 87, 73, 88, 97, 102, 115, 77, 89 |
| 51 | 16, 16, 24, 329 | 320 | 56, 20, 12, 29, 11, 33, 23, 307 |
| 52 | 119 | 53, 47 | 46, 40 |
| 54 | 167, 238, 136, 55, 114, 371, 110, 76 | - | 28, 31, 87, 40, 46, 38, 152, 172, 84, 47, 20, 112, 21, 91, 57, 153, 973 |
| 56 | - | - | 814 |
| 61 | - | - | 95, 17, 16 |
| 62 | 382 | 390 | 380 |
| 63 | 74, 108, 104, 44, 43 | 88, 124, 98, 10, 11, 141, 16, 180, 41, 96, 111, 125 | 49, 339, 204, 56, 13, 101, 106, 68, 36, 172, 169, 68, 123, 82, 25, 102, 27, 17, 24, 13, 11, 117, 11, 20, 87 |
| 64 | - | 35, 27, 58, 43, 62, 48, 47, 48, 75, 97, 80, 83, 86, 50, 40, 59, 88, 131, 75, 128, 56, 81, 71, 57, 80 | 26, 25, 27, 17, 24, 36, 19, 28, 36, 16, 26, 28 |
| 65 | - | - | 15, 20, 16, 43 |
| 66 | 857, 881 | 881, 997 | 823, 725 |
| 67 | 43, 90, 30, 278, 45, 29, 242, 48, 21, 60, 51, 224, 124, 37, 35, 67 | 43, 137, 29, 307, 60, 46, 252, 54, 28, 59, 63, 216, 137, 62, 81, 74 | 43, 158, 36, 269, 44, 44, 246, 58, 23, 59, 56, 222, 127, 8, 31, 12, 30, 85, 25 |
| 68 | 33 | 44, 32 | - |
| 69 | 141, 149, 183, 141, 140, 316, 70, 102, 265, 127, 124, 212, 117, 181 | 39, 163, 18, 154, 996, 936, 269, 565, 315 | 40, 149, 13, 146, 183, 232, 142, 309, 71, 150, 148, 265, 571, 121, 176 |
| 71 | 55, 24, 33, 12 | 72, 48, 82, 117 | 71, 36 |
| 73 | 25, 235, 26, 266, 24, 292 | 450, 51, 271, 368 | 204, 217, 26, 34, 216, 44, 306 |
| 74 | - | 17, 18, 79, 44, 280 | 11, 89, 52, 16, 20, 15 |
| 75 | 920 | 925 | 918 |
| 76 | 46, 431 | 105, 50, 445 | 40, 190, 199 |
| 77 | 258 | 154, 257, 51 | 108, 256, 156 |
| 78 | 128, 147, 118, 81, 76, 88, 60, 61, 163, 83, 79, 81, 54, 111, 74, 184, 26, 101, 184, 90, 22, 205 | 124, 143, 119, 23, 82, 75, 75, 45, 84, 160, 86, 52, 78, 183, 141, 89, 184, 116, 190, 124, 152, 188 | 95, 137, 82, 75, 52, 72, 34, 51, 153, 58, 28, 78, 87, 48, 109, 72, 184, 15, 92, 80, 92, 84, 17, 216 |
| 79 | 43, 19, 56, 26, 54 | 68, 51, 81, 43, 96, 47, 75, 14 | 40, 17, 55, 61, 19, 50 |
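The EXP3, EXP12 and EXP0 subsets can be derived directly from a table like the one above by counting how many experts annotated at least one seizure for each infant. The following is a minimal sketch, not the authors' code: the dictionary layout, the use of `None` for a hyphen, and the function name `subset_name` are assumptions made for illustration. Only a few rows are copied from the table, and infant 10 is included as a seizure-free example.

```python
# Seizure lengths (s) per infant and expert, copied from a few rows of the table.
# None stands for a hyphen, i.e. no seizure annotated by that expert (assumed encoding).
annotations = {
    2: {"A": [47, 18], "B": None, "C": None},
    8: {"A": [32], "B": [12, 61], "C": None},
    10: {"A": None, "B": None, "C": None},  # seizure-free infant, absent from the table
    21: {"A": [43], "B": [42], "C": [39]},
    24: {"A": None, "B": [295], "C": None},
    34: {"A": [455], "B": [452], "C": [452]},
}


def subset_name(per_expert):
    """Return EXP3, EXP12 or EXP0 depending on how many experts annotated a seizure."""
    n_experts = sum(1 for lengths in per_expert.values() if lengths)
    if n_experts == 3:
        return "EXP3"
    if n_experts == 0:
        return "EXP0"
    return "EXP12"


subsets = {infant: subset_name(per_expert) for infant, per_expert in annotations.items()}
for infant, name in sorted(subsets.items()):
    print(infant, name)
```

Applied to all 79 recordings, the same rule should reproduce the split described in the captions (40, 17 and 22 neonates), provided the full annotation data are loaded.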
A summary of the seizure annotations made by each human expert.
| Feature | Expert A | Expert B | Expert C |
|---|---|---|---|
| Number of neonates with seizures annotated | 46 | 45 | 53 |
| Total seizures annotated | 402 | 429 | 548 |
| Min, max, mean and median of seizure duration (s) | 9, 1479, 119.3, 59.5 | 7, 2708, 147.5, 79 | 7, 1477, 95.8, 43 |
| Neonates with seizures annotated by experts A, B, C (consensus annotations) | 40 | | |
| Neonates with seizures annotated by only one expert | 10 | | |
| Neonates with seizures annotated by two experts | 7 | | |
| Neonates where no expert annotated any seizure | 22 | | |
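The duration statistics in the summary can be computed with Python's standard `statistics` module. The sketch below is illustrative only: it uses a single row (infant 5, expert A) from the seizure-length table rather than the pooled annotations of an expert, and the helper name `duration_summary` is an assumption.

```python
import statistics


def duration_summary(lengths):
    """Return (min, max, mean rounded to 0.1 s, median) of seizure lengths in seconds."""
    return (
        min(lengths),
        max(lengths),
        round(statistics.mean(lengths), 1),
        statistics.median(lengths),
    )


# Seizure lengths annotated by expert A for infant 5 (taken from the table of lengths).
infant5_expert_a = [127, 631, 534, 853, 1180]
summary = duration_summary(infant5_expert_a)
print(summary)
```

To obtain the per-expert values reported in the summary, the lengths of all seizures annotated by that expert would have to be concatenated into one list first.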
Figure 3. Two exemplary EEG signals. The top panel shows the original signal sampled at 256 Hz; the bottom panel shows the same signal after reducing the sampling frequency to 64 Hz.
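Reducing the sampling frequency from 256 Hz to 64 Hz corresponds to a decimation factor of 4. The captions do not say which resampling method was used, so the sketch below is only an assumption: it averages non-overlapping blocks of 4 samples, which acts as a crude low-pass filter before decimation. A production pipeline would more likely rely on a dedicated routine with a proper anti-aliasing filter, such as `scipy.signal.decimate`.

```python
def downsample_by_averaging(signal, factor):
    """Average non-overlapping blocks of `factor` samples (crude low-pass plus decimation)."""
    n_blocks = len(signal) // factor
    return [
        sum(signal[i * factor:(i + 1) * factor]) / factor
        for i in range(n_blocks)
    ]


original_rate = 256  # Hz, as in the raw recordings
target_rate = 64     # Hz, as in the reduced signal
factor = original_rate // target_rate

# One second of synthetic data standing in for a real EEG trace.
one_second = [float(i % 8) for i in range(original_rate)]
reduced = downsample_by_averaging(one_second, factor)
print(len(one_second), "->", len(reduced))
```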
Figure 4. Sliding window design. (a) Channel F3-C3 of infant #1. (b) Channel F3-C3 of infant #10, which is seizure free. Red and blue signals are the real ones. The top signal has 2 seizures annotated by expert A. The first one starts at the 104th second, ends at the 121st second and is 18 seconds long. The second one starts at the 6847th second, ends at the 6863rd second and is 17 seconds long (see Table 7). By setting appropriate values for the window and chunks variables, we can control the length of the samples (window variable) and their total number (chunks variable). The window length was set to 6 seconds and the number of chunks was set to 3. Note that the second seizure fragment is only 17 seconds long. Consequently, only two chunks can be selected from the second seizure (although 3 chunks were requested). From the first seizure one can safely get 3 chunks. The bottom EEG signal has no seizures annotated. We randomly select from it the same number of chunks (i.e. 5) as were selected from the top EEG signal. Thanks to this method of selecting chunks, the numbers of seizure and non-seizure chunks are well balanced. The starting and ending seconds were chosen randomly (for example from 44 to 49, etc.).
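The chunk selection described in this caption can be sketched as follows. This is a hedged illustration rather than the authors' R implementation: the function names, the choice to cut chunks contiguously from the seizure onset, the fixed random seed and the assumed recording length of 7000 seconds are all assumptions. The non-seizure chunks are drawn independently, so they may overlap in this simplified version.

```python
import random


def seizure_chunks(start, end, window, chunks):
    """Cut up to `chunks` contiguous windows of `window` seconds from [start, end] (inclusive)."""
    length = end - start + 1
    n = min(chunks, length // window)
    return [(start + i * window, start + (i + 1) * window - 1) for i in range(n)]


def random_chunks(total_seconds, window, n, seed=0):
    """Draw `n` random windows of `window` seconds from a seizure-free recording."""
    rng = random.Random(seed)
    starts = [rng.randrange(0, total_seconds - window + 1) for _ in range(n)]
    return [(s, s + window - 1) for s in starts]


# The two seizures of infant #1 annotated by expert A, as given in the caption.
seizures = [(104, 121), (6847, 6863)]
picked = [c for s, e in seizures for c in seizure_chunks(s, e, window=6, chunks=3)]

# Same number of chunks from the seizure-free signal (recording length is hypothetical).
background = random_chunks(total_seconds=7000, window=6, n=len(picked))
print(picked)
print(background)
```

With `window=6` and `chunks=3`, the 18-second seizure yields 3 chunks and the 17-second seizure only 2, giving 5 seizure chunks matched by 5 non-seizure chunks.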
Figure 5. An example analogous to the one shown in Fig. 4. The EEG signals are the same; the figures differ only in the values of the window and chunks parameters. Here the window length was set to 5 and 2 chunks were selected from each of the two seizure fragments, see top picture (a). Consequently, 4 chunks were selected from the non-seizure signal, see bottom picture (b).
Figure 6. An example analogous to the one shown in Fig. 4. The EEG signals are the same; the figures differ only in the values of the window and chunks parameters. Here the window length was set to 2 and 5 chunks were selected from each of the two seizure fragments, see top picture (a). Consequently, 10 chunks were selected from the non-seizure signal, see bottom picture (b).
Figure 8. After selecting the desired number of chunks as in Fig. 4, one must combine them into a matrix. In this example the matrix has 18+1 rows (the last row is a class indicator) and 10 columns. Every single cell represents a 6-second-long EEG signal.
Figure 7. The CNN sequential model used by the authors in all numerical experiments. The Python code implementing the model is available for download in the Electronic Supplements.
A summary of the most important parameters of our CNN architecture.
| Parameter/Note | Value |
|---|---|
| CNN network | Sequential, Three Conv2D layers, Two dense layers with L2 regularizers (l2=0.001) |
| Optimization algorithm | SGD with the parameters: Learning_rate = 0.01, Momentum = 0.5, Nesterov = False |
| Activation function | Sigmoid |
| Loss function | Binary_crossentropy |
| Batch size | 16 |
| Number of epochs | 300 |
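To make the optimizer and loss settings in the table concrete, the sketch below applies SGD with classical (non-Nesterov) momentum and binary cross-entropy to a single sigmoid unit. It is deliberately not the CNN from the paper: the one-weight model, the toy data and the per-sample updates (instead of batches of 16) are assumptions chosen so that the example stays self-contained. The update rule follows the form documented for Keras SGD with momentum: `velocity = momentum * velocity - learning_rate * gradient`, then `weight = weight + velocity`.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Binary cross-entropy for one sample, with clipping to avoid log(0)."""
    y_pred = min(max(y_pred, eps), 1.0 - eps)
    return -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1.0 - y_pred))


def train(samples, learning_rate=0.01, momentum=0.5, epochs=300):
    """Train one sigmoid unit with SGD plus classical momentum (nesterov=False)."""
    w, b = 0.0, 0.0
    vw, vb = 0.0, 0.0  # velocities
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, y in samples:
            p = sigmoid(w * x + b)
            total += binary_crossentropy(y, p)
            grad = p - y  # derivative of the loss w.r.t. the pre-activation
            vw = momentum * vw - learning_rate * grad * x
            vb = momentum * vb - learning_rate * grad
            w += vw
            b += vb
        losses.append(total / len(samples))
    return w, b, losses


# Toy, linearly separable data: label 1 stands for "seizure" (hypothetical values).
toy = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]
w, b, losses = train(toy)
print(round(losses[0], 3), "->", round(losses[-1], 3))
```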
Figure 9. The two-dimensional matrix shown in Fig. 8 cannot be fed into the neural network in this form, because Keras requires a 3D tensor. The figure shows how the 2D matrix must be divided into a tensor and a vector of seizure indicators.
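The split of the 2D matrix into a tensor and a label vector can be sketched with plain nested lists. The axis order (chunks, channels, samples), the label encoding (1 for seizure, 0 for non-seizure), the 64 Hz sampling rate and the function name `split_matrix` are assumptions. In practice, a NumPy array would be the natural container before passing the data to Keras.

```python
n_channels = 18              # signal rows in the matrix of Fig. 8
n_chunks = 10                # columns in the matrix of Fig. 8
samples_per_chunk = 6 * 64   # 6-second chunks, assuming 64 Hz sampling

# matrix[r][c] holds the samples of channel r for chunk c (zeros as placeholders);
# the extra last row holds one class indicator per chunk.
matrix = [
    [[0.0] * samples_per_chunk for _ in range(n_chunks)]
    for _ in range(n_channels)
]
matrix.append([1] * 5 + [0] * 5)


def split_matrix(matrix):
    """Split the (channels + 1) x chunks matrix into a 3D tensor and a label vector."""
    *signal_rows, labels = matrix
    tensor = [[row[c] for row in signal_rows] for c in range(len(labels))]
    return tensor, list(labels)


tensor, labels = split_matrix(matrix)
print(len(tensor), len(tensor[0]), len(tensor[0][0]), labels)
```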
Figure 10. Four exemplary seizure (top) and non-seizure (bottom) fragments, each 2 seconds long, sampled at 64 Hz. This gives 128 individual data points. The EEG signals are represented as colormaps. It is easy to see that the analysis of EEG signals, given as time series, de facto becomes the analysis of two-dimensional images.
Figure 11. The fivefold cross-validation scheme used in our numerical experiments.
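A generic fivefold split can be sketched as below. The exact scheme of Fig. 11 (in particular how the validation data are carved out of each training part) is not described in the caption, so this example only shows the basic train/test rotation. The function name, the shuffling and the fixed seed are assumptions.

```python
import random


def five_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Ten chunks, as in the matrix of Fig. 8.
splits = list(five_fold_indices(10))
for train, test in splits:
    print("train:", sorted(train), "test:", sorted(test))
```

Each chunk ends up in the test part exactly once, and the reported test-set accuracy would then be the average over the five folds.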
Evaluation results for the dataset based on annotations given by expert A. Evaluation was performed on the test set using a fivefold cross-validation scheme (see Fig. 11). Three values are given for every window size and every number of contiguous chunks: (a) the test-set accuracy in %, (b) the average computation time over the five folds (see Fig. 11) rounded to full minutes, (c) the total number of chunks (see the tensor in Fig. 9). The given computation times should be treated as indicative, as they depend strongly on the instantaneous load of the Colab system used. A value of 10,000 means that the maximum possible set of contiguous chunks was selected: our dataset simply does not contain seizures as long as 10,000 seconds, so setting chunks to 10,000 guarantees that every available chunk is taken.
| Window size / Number of contiguous chunks | 1 | 2 | 5 | 10 | 20 | 10000 |
|---|---|---|---|---|---|---|
| Test-set accuracy in % | | | | | | |
| 1 | 58.9 | 70.6 | 81.2 | 84.1 | 86.7 | 92.7 |
| 2 | 61.5 | 78.4 | 83.8 | 89.5 | 91.1 | 95.9 |
| 5 | 68.2 | 81.4 | 90.3 | 92.9 | 94.7 | 96.2 |
| 10 | 74.0 | 79.0 | 90.0 | 93.9 | 96.1 | 95.6 |
| 20 | 75.4 | 78.8 | 88.5 | 93.1 | 92.7 | 94.1 |
Evaluation results for the dataset based on annotations given by expert B. The rest of the caption is identical to that of Table 3.
| Window size / Number of contiguous chunks | 1 | 2 | 5 | 10 | 20 | 10000 |
|---|---|---|---|---|---|---|
| Test-set accuracy in % | | | | | | |
| 1 | 63.3 | 73.0 | 80.6 | 82.9 | 86.2 | 90.8 |
| 2 | 63.5 | 75.6 | 83.8 | 87.8 | 91.5 | 94.3 |
| 5 | 66.7 | 79.8 | 88.5 | 92.3 | 95.2 | 96.7 |
| 10 | 66.6 | 78.9 | 90.1 | 94.1 | 95.7 | 94.9 |
| 20 | 71.6 | 78.5 | 88.5 | 90.2 | 93.6 | 96.0 |
Evaluation results for the dataset based on annotations given by expert C. The rest of the caption is identical to that of Table 3.
| Window size / Number of contiguous chunks | 1 | 2 | 5 | 10 | 20 | 10000 |
|---|---|---|---|---|---|---|
| Test-set accuracy in % | | | | | | |
| 1 | 64.2 | 73.3 | 82.9 | 85.6 | 89.6 | 93.1 |
| 2 | 61.5 | 75.8 | 87.9 | 90.4 | 93.2 | 94.8 |
| 5 | 69.3 | 84.2 | 90.9 | 93.7 | 96.0 | 97.0 |
| 10 | 78.5 | 85.0 | 91.6 | 95.4 | 96.6 | 96.5 |
| 20 | 80.5 | 85.0 | 89.5 | 93.2 | 94.7 | 94.9 |
Description of the content of working subdirectories in the repository attached to the paper.
| Directory name | Description |
|---|---|
| acc_loss | Stores training and validation loss curves side by side, as well as the training and validation accuracy curves |
| best_models | Stores the best CNN models obtained during training (in terms of model weights, i.e. trainable parameters). The data is saved in the binary HDF5 format. The best models can be loaded later, so there is no need to retrain the neural network every time you want to run the classifier on test data. An example of how to load a best model is shown in the enclosed Jupyter notebook (the load_weights function) |
| hists | Stores models’ training and validation accuracy and loss values. This data allows you to prepare visualizations of network training, similarly to those depicted in Figs. |
| inputs | Stores HDF5 files which are inputs for our CNN model. These files are created in R (EEG_neonatal.R script) using the raw EDF files which are stored in the edf directory. To find out exactly which fragments of the original raw EDF files were used in the HDF5 files (i.e. the exact sample numbers), files with names beginning with non_seizures_ and seizures_ are additionally generated |
| logs | Stores log files to be parsed by TensorBoard (TensorBoard is a tool for providing the measurements and visualizations needed during the machine learning workflow). |
| results | Stores CNN classification results of the validation and test datasets (given in %). The classification results presented in Tables |
| ROC | Stores ROC curves along with the AUC metrics |
| waveforms | Stores all EEG seizure waveforms annotated by 3 experts. There are 1379 waveforms in total, as depicted in Table |
Figure 13. An example of CNN training and validation accuracy (upper curves), as well as the training and validation loss (lower curves). These curves can be considered almost ideal: accuracy is almost 1, loss is almost 0 and there is no sign of overfitting.
Figure 14. An example of CNN training and validation accuracy (upper curves), as well as the training and validation loss (lower curves). Unlike the curves shown in Fig. 13, these are very poor: overfitting sets in very quickly, in this example around the 50th epoch.
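One simple way to locate the onset of overfitting in curves like these is to find the epoch at which the validation loss is lowest; after that point the validation loss rises while the training loss keeps falling. The sketch below uses a synthetic loss curve, not data from the paper, and the function name `best_epoch` is an assumption. Whether the authors select their best models by validation loss or by another criterion is not stated in the captions.

```python
def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return best_index + 1


# Synthetic validation loss over 300 epochs: falls until epoch 50, then rises again.
val_losses = [0.2 + 0.01 * abs(epoch - 50) for epoch in range(1, 301)]
epoch = best_epoch(val_losses)
print("lowest validation loss at epoch", epoch)
```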