STUDY OBJECTIVES: Automated sleep stage scoring is not yet widely used in practice because of its black-box nature and the risk of wrong predictions. The objective of this study was to introduce a confidence-based framework that detects possibly wrong predictions, informing clinicians which epochs require manual review, and to investigate its potential to improve the accuracy of automated sleep stage scoring. METHODS: We used 702 polysomnography studies from a local clinical dataset (SNUBH dataset) and 2804 from an open dataset (SHHS dataset) for experiments. We adapted the state-of-the-art TinySleepNet architecture to train the classifier and modified the ConfidNet architecture to train an auxiliary confidence model. For the confidence model, we developed a novel method, Dropout Correct Rate (DCR), and compared its performance with that of existing methods. RESULTS: In general, confidence estimates (0.754) closely reflected accuracy (0.758). The DCR method differentiated correct from wrong predictions best (AUROC: 0.812), whereas the existing approaches largely failed to detect wrong predictions. Reviewing only the 20% of epochs with the lowest confidence values improved the overall accuracy of sleep stage scoring from 76% to 87%. For patients with reduced accuracy (ie, individuals with obesity or severe sleep apnea), the possible improvement after applying confidence estimation was even greater. CONCLUSION: To the best of our knowledge, this is the first study to apply confidence estimation to automated sleep stage scoring. Reliable confidence estimates from the DCR method help screen out most of the wrong predictions, which would increase the reliability and interpretability of automated sleep stage scoring.
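The two evaluation ideas in the abstract, separating correct from wrong predictions by confidence (AUROC) and re-scoring only the lowest-confidence epochs, can be illustrated with a minimal Python sketch. This is not the DCR method, nor the TinySleepNet or ConfidNet code: the function names `failure_detection_auroc` and `accuracy_after_review` are hypothetical, the ten-epoch toy data are invented for illustration and are not taken from the study, and the sketch assumes that manual review corrects every reviewed epoch.

```python
from typing import Sequence


def failure_detection_auroc(confidence: Sequence[float],
                            correct: Sequence[bool]) -> float:
    """AUROC for telling correct predictions from wrong ones by confidence.

    Computed as the fraction of (correct, wrong) epoch pairs in which the
    correct epoch has the higher confidence; ties count as half a win.
    """
    pos = [c for c, ok in zip(confidence, correct) if ok]
    neg = [c for c, ok in zip(confidence, correct) if not ok]
    if not pos or not neg:
        raise ValueError("need at least one correct and one wrong prediction")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_after_review(confidence: Sequence[float],
                          correct: Sequence[bool],
                          review_fraction: float) -> float:
    """Accuracy after manually reviewing the lowest-confidence epochs.

    Assumption (not from the source): the reviewer fixes every reviewed
    epoch, so each reviewed epoch counts as correct afterwards.
    """
    n = len(correct)
    n_review = int(round(review_fraction * n))
    # Epoch indices sorted from least to most confident.
    order = sorted(range(n), key=lambda i: confidence[i])
    reviewed = set(order[:n_review])
    n_right = sum(1 for i in range(n) if i in reviewed or correct[i])
    return n_right / n


# Invented toy data: one confidence value and one correctness flag per epoch.
confidence = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.40, 0.35, 0.30, 0.20]
correct = [True, True, True, True, False, True, True, False, True, False]

baseline = sum(correct) / len(correct)
reviewed_20 = accuracy_after_review(confidence, correct, 0.20)
auroc = failure_detection_auroc(confidence, correct)
print(f"accuracy before review: {baseline:.2f}")
print(f"accuracy after reviewing lowest 20%: {reviewed_20:.2f}")
print(f"failure-detection AUROC: {auroc:.3f}")
```

In this toy example one of the two reviewed epochs was wrong, so accuracy rises from 0.70 to 0.80. The better the confidence values rank wrong predictions at the bottom, the more errors a fixed review budget catches, which is the property the AUROC summarizes.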