Mark L Scheuer (1), Anto Bagic (2), Scott B Wilson (3). (1) Persyst Development Corporation, USA. Electronic address: scheuerml@persyst.com. (2) University of Pittsburgh, Department of Neurology, USA. (3) Persyst Development Corporation, USA.
Abstract
OBJECTIVE: Compare the spike detection performance of three skilled humans and three computer algorithms. METHODS: 40 prolonged EEGs, 35 containing reported spikes, were evaluated. Spikes and sharp waves were marked by the humans and algorithms. Pairwise sensitivity and false positive rates were calculated for each human-human and algorithm-human pair. Differences in human pairwise performance were calculated and compared to the range of algorithm versus human performance differences as a type of statistical Turing test. RESULTS: 5474 individual spike events were marked by the humans. Mean pairwise human sensitivities were 40.0%, 42.1%, and 51.5%, with corresponding false positive rates of 0.80, 0.97, and 1.99/min. Only the Persyst 13 (P13) algorithm was comparable to humans (43.9% and 1.65/min). Evaluation of pairwise differences in sensitivity and false positive rate demonstrated that P13 met statistical noninferiority criteria compared to the humans. CONCLUSION: Humans had only a fair level of agreement in spike marking. The P13 algorithm was statistically noninferior to the humans. SIGNIFICANCE: This was the first time that a spike detection algorithm and humans performed similarly. The performance comparison methodology utilized here is generally applicable to problems in which skilled human performance is the desired standard and no external gold standard exists.
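The pairwise scoring described under METHODS can be illustrated with a minimal sketch. The abstract does not say how marks from two readers were matched, so the one-to-one matching rule, the `tolerance_s` window, and the function name `pairwise_metrics` below are all assumptions made for illustration, not the authors' procedure. One reader is treated as the reference and the other as the test. Sensitivity is the fraction of reference marks matched by the test reader, and the false positive rate is the number of unmatched test marks per minute of recording.

```python
def pairwise_metrics(reference, test, duration_min, tolerance_s=0.2):
    """Score `test` spike marks against `reference` marks (times in seconds).

    Assumption: a test mark matches a reference mark when the two lie within
    `tolerance_s` seconds, and each mark can be matched at most once.
    """
    ref = sorted(reference)
    tst = sorted(test)
    matched = 0
    i = j = 0
    # Two-pointer sweep over both sorted lists of mark times.
    while i < len(ref) and j < len(tst):
        diff = tst[j] - ref[i]
        if abs(diff) <= tolerance_s:
            matched += 1          # pair the two marks and move past both
            i += 1
            j += 1
        elif diff < 0:
            j += 1                # test mark precedes every remaining reference mark
        else:
            i += 1                # reference mark has no test mark close enough
    sensitivity = matched / len(ref) if ref else float("nan")
    fp_per_min = (len(tst) - matched) / duration_min
    return sensitivity, fp_per_min


# Hypothetical mark times for two readers on a 2-minute recording.
reader_a = [1.0, 5.0, 9.0, 30.0]
reader_b = [1.1, 9.05, 20.0]
print(pairwise_metrics(reader_a, reader_b, duration_min=2.0))
```

Because the measure is asymmetric, each pair of readers is scored in both directions (A as reference with B as test, then the reverse). Averaging over such pairs would give per-reader means of the kind reported under RESULTS.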