Jin Jing (1,2), Aline Herlopian (1,3), Ioannis Karakis (4), Marcus Ng (5), Jonathan J Halford (6), Alice Lam (1), Douglas Maus (1), Fonda Chan (1), Marjan Dolatshahi (1), Carlos F Muniz (1), Catherine Chu (1), Valeria Sacca (7), Jay Pathmanathan (1,8), WenDong Ge (1), Haoqi Sun (1), Justin Dauwels (2), Andrew J Cole (1), Daniel B Hoch (1), Sydney S Cash (1), M Brandon Westover (1).
1. Division of Clinical Neurophysiology, Department of Neurology, Massachusetts General Hospital, Boston.
2. School of Electrical and Electronics Engineering, Nanyang Technological University, Singapore.
3. Department of Neurology, Yale School of Medicine, New Haven, Connecticut.
4. Department of Neurology, Emory University School of Medicine, Atlanta, Georgia.
5. Department of Neurology, University of Manitoba, Winnipeg, Manitoba, Canada.
6. Department of Neurology, Medical University of South Carolina, Charleston.
7. Department of Neurology, Department of Medical and Surgical Sciences, University "Magna Graecia" of Catanzaro, Italy.
8. Department of Neurology, Hospital of the University of Pennsylvania, Philadelphia.
Abstract
Importance: The validity of using electroencephalograms (EEGs) to diagnose epilepsy requires reliable detection of interictal epileptiform discharges (IEDs). Prior interrater reliability (IRR) studies are limited by small samples and selection bias.
Objective: To assess the reliability of experts in detecting IEDs in routine EEGs.
Design, Setting, and Participants: This prospective analysis conducted in 2 phases included as participants physicians with at least 1 year of subspecialty training in clinical neurophysiology. In phase 1, 9 experts independently identified candidate IEDs in 991 EEGs (1 expert per EEG) reported in the medical record to contain at least 1 IED, yielding 87 636 candidate IEDs. In phase 2, the candidate IEDs were clustered into groups with distinct morphological features, yielding 12 602 clusters, and a representative candidate IED was selected from each cluster. We added 660 waveforms (11 random samples each from 60 randomly selected EEGs reported as being free of IEDs) as negative controls. Eight experts independently scored all 13 262 candidates as IEDs or non-IEDs. The 1051 EEGs in the study were recorded at the Massachusetts General Hospital between 2012 and 2016.
Main Outcomes and Measures: Primary outcome measures were percentage of agreement (PA) and beyond-chance agreement (Gwet κ) for individual IEDs (IED-wise IRR) and for whether an EEG contained any IEDs (EEG-wise IRR). Secondary outcomes were the correlations between numbers of IEDs marked by experts across cases, calibration of expert scoring to group consensus, and receiver operating characteristic analysis of how well multivariate logistic regression models may account for differences in the IED scoring behavior between experts.
Results: Among the 1051 EEGs assessed in the study, 540 (51.4%) were those of females and 511 (48.6%) were those of males. In phase 1, 9 experts each marked potential IEDs in a median of 65 (interquartile range [IQR], 28-332) EEGs. The total number of IED candidates marked was 87 636. Expert IRR for the 13 262 individually annotated IED candidates was fair, with the mean PA being 72.4% (95% CI, 67.0%-77.8%) and mean κ being 48.7% (95% CI, 37.3%-60.1%). The EEG-wise IRR was substantial, with the mean PA being 80.9% (95% CI, 76.2%-85.7%) and mean κ being 69.4% (95% CI, 60.3%-78.5%). A statistical model based on waveform morphological features, when provided with individualized thresholds, explained the median binary scores of all experts with a high degree of accuracy of 80% (range, 73%-88%).
Conclusions and Relevance: This study's findings suggest that experts can identify whether EEGs contain IEDs with substantial reliability. Lower reliability regarding individual IEDs may be largely explained by various experts applying different thresholds to a common underlying statistical model.
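As a rough illustration of the two primary outcome measures, the sketch below computes percentage of agreement and a two-rater, two-category form of Gwet's agreement coefficient (AC1) for binary IED/non-IED scores, then averages them over all rater pairs. It also collapses waveform-level scores to one label per EEG using an "any IED" rule. The function names, the toy data, the pairwise averaging, and the "any IED" collapse are assumptions made for illustration; the abstract does not specify the exact estimator or aggregation the study used.

```python
from itertools import combinations

def percent_agreement(a, b):
    """Fraction of items on which two raters give the same 0/1 score."""
    if len(a) != len(b) or not a:
        raise ValueError("score lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def gwet_ac1(a, b):
    """Gwet's AC1 for two raters and two categories (0 = non-IED, 1 = IED)."""
    pa = percent_agreement(a, b)
    # Average proportion of items scored 1 across the two raters.
    pi = (sum(a) + sum(b)) / (2 * len(a))
    # Chance agreement for two categories: 2 * pi * (1 - pi), at most 0.5.
    pe = 2 * pi * (1 - pi)
    return (pa - pe) / (1 - pe)

def mean_pairwise(ratings, stat):
    """Average a two-rater statistic over every pair of raters."""
    pairs = list(combinations(ratings, 2))
    return sum(stat(a, b) for a, b in pairs) / len(pairs)

def eeg_level(scores, eeg_ids):
    """Collapse waveform scores to one label per EEG: 1 if any waveform is 1."""
    per_eeg = {}
    for score, eeg in zip(scores, eeg_ids):
        per_eeg[eeg] = max(per_eeg.get(eeg, 0), score)
    return [per_eeg[eeg] for eeg in sorted(per_eeg)]

# Toy data (hypothetical): two raters, eight waveforms from three EEGs.
eeg_ids = ["A", "A", "A", "B", "B", "C", "C", "C"]
rater_1 = [1, 1, 0, 0, 1, 0, 1, 0]
rater_2 = [1, 0, 0, 0, 1, 0, 1, 1]

print(percent_agreement(rater_1, rater_2))  # 0.75 (waveform-wise)
print(gwet_ac1(rater_1, rater_2))           # 0.5  (waveform-wise)

eeg_1 = eeg_level(rater_1, eeg_ids)
eeg_2 = eeg_level(rater_2, eeg_ids)
print(percent_agreement(eeg_1, eeg_2))      # 1.0  (EEG-wise)
```

In this toy case the raters disagree on two individual waveforms yet agree on every EEG, which mirrors the direction of the reported result: EEG-wise agreement can be higher than IED-wise agreement because a single shared positive waveform is enough to make two raters agree about the recording as a whole.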