Mubeen Janmohamed, Duong Nhu, Levin Kuhlmann, Amanda Gilligan, Chang Wei Tan, Piero Perucca, Terence J O'Brien, Patrick Kwan.
Abstract
The application of deep learning approaches for the detection of interictal epileptiform discharges is a nascent field, with most studies published in the past 5 years. Although many recent models have been published demonstrating promising results, deficiencies in descriptions of data sets, unstandardized methods, variation in performance evaluation and lack of demonstrable generalizability have made it difficult for these algorithms to be compared and progress to clinical validity. A few recent publications have provided a detailed breakdown of data sets and relevant performance metrics to exemplify the potential of deep learning in epileptiform discharge detection. This review provides an overview of the field and equips computer and data scientists with a synopsis of EEG data sets, background and epileptiform variation, model evaluation parameters and an awareness of the performance metrics of high impact and interest to the trained clinical and neuroscientist EEG end user. The gold standard and inter-rater disagreements in defining epileptiform abnormalities remain a challenge in the field, and a hierarchical proposal for epileptiform discharge labelling options is recommended. Standardized descriptions of data sets and reporting metrics are a priority. Source code-sharing and accessibility to public EEG data sets will increase the rigour, quality and progress in the field and allow validation and real-world clinical translation.
Keywords: EEG; automated detection; deep learning; epilepsy; epileptiform abnormalities
Year: 2022 PMID: 36092304 PMCID: PMC9453433 DOI: 10.1093/braincomms/fcac218
Source DB: PubMed Journal: Brain Commun ISSN: 2632-1297
Pros and cons of future computer-assisted detection in EEG laboratories
| Pros |
|---|
| Speed labelling and substantial data reduction leading to faster workflows |
| Substituting unavailable expertise in low-resource countries |
| Artificial intelligence is purported to have the potential to deliver better results than traditionally trained experts |
| Cons |
| Missed true epileptiform discharges (false negatives) with the potential to delay treatment |
| Exaggerated labelling of artefacts as abnormalities (false positives) (see |
| Reduction of job and learning opportunities for EEG scientists and epilepsy trainees |
Figure 3. Artefacts mimicking interictal epileptiform abnormalities. IED mimics: (A) v-wave mimicking a sharp wave and labelled as abnormal by the algorithm, (B) high-amplitude slow wave in Stage 3 sleep causing a false positive, (C) ocular artefact, (D) ECG artefact picked up as runs of IEDs, (E) lateral rectus spikes and (F) wicket spike picked up as a false positive.
Figure 1. The structure of data available from hospital-based EEG servers.
Figure 2. Epileptiform variation in Genetic Generalized Epilepsy EEG data sets. From left to right: (A) classic 3 Hz spike and wave on transverse montage, (B) polyspikes with EMG artefact in frontopolar channels and eye movements, (C) slow spike/wave on transverse montage, (D) mild EMG affecting frontal channels with embedded small spike and waves and irregular slow waves, (E) fragments on transverse montage, (F) polyspike/slow waves on transverse montage, (G) marked EMG artefact confounding epileptiform abnormality in temporal and frontal channels on longitudinal montage, (H) a train of focal posterior sharp waves and (I) a generalized paroxysmal fast burst.
Performance metrics commonly used in deep and machine learning studies
| Metrics of clinical utility for IED detection |
|---|
| Sensitivity: the proportion of true gold standard IEDs correctly detected. (True positives)/(true positives + false negatives) |
| Precision: the proportion of machine-predicted positive labels that are true gold standard IEDs. (True positives)/(true positives + false positives) |
| False positive rate: the rate of detections that the gold standard did not classify as IEDs, typically reported per hour |
| F1-score: combines the two most relevant metrics, precision and recall (sensitivity), as their harmonic mean |
| AUPRC: area under the precision–recall curve, which differs from the area under the ROC curve. A model achieves a perfect score when it identifies all epileptiform abnormalities without marking normal activity or benign abnormalities |
| Metrics of limited clinical utility in isolation |
| True negatives, specificity, accuracy and AUROC (area under the ROC curve) |
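
The metrics in the table can be illustrated with a minimal Python sketch. It assumes that model detections have already been matched against gold standard labels and reduced to counts of true positives, false positives, false negatives and true negatives; the `DetectionCounts` class, the `summarize` function and the example numbers are hypothetical and are not taken from the review or from any study it covers.

```python
from dataclasses import dataclass


@dataclass
class DetectionCounts:
    """Hypothetical container for counts scored against the gold standard."""
    tp: int  # gold standard IEDs that the model detected
    fp: int  # model detections the gold standard did not mark as IEDs
    fn: int  # gold standard IEDs that the model missed
    tn: int  # non-IED segments that the model correctly left unmarked
    recording_hours: float  # total EEG duration that was scored


def _ratio(numerator: float, denominator: float) -> float:
    # Return 0.0 instead of raising when a denominator is empty.
    return numerator / denominator if denominator else 0.0


def summarize(c: DetectionCounts) -> dict:
    """Compute the metrics listed in the table from raw counts."""
    if c.recording_hours <= 0:
        raise ValueError("recording_hours must be positive")
    sensitivity = _ratio(c.tp, c.tp + c.fn)  # also called recall
    precision = _ratio(c.tp, c.tp + c.fp)
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        # F1 is the harmonic mean of precision and recall.
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
        "false_positives_per_hour": c.fp / c.recording_hours,
        # The two metrics below are dominated by the true negative count.
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.fn + c.tn),
    }


# Invented example: IEDs are rare, so true negatives vastly outnumber the rest.
example = DetectionCounts(tp=80, fp=120, fn=20, tn=99_780, recording_hours=10.0)
metrics = summarize(example)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this invented example, specificity and accuracy both exceed 0.99 even though precision is only 0.40 and the model raises 12 false positives per hour, which shows why the metrics in the second group say little about clinical usefulness when read in isolation.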