Anuja Bandyopadhyay, Cathy Goldstein.
Abstract
BACKGROUND: The past few years have seen a rapid emergence of artificial intelligence (AI)-enabled technology in the field of sleep medicine. AI refers to the capability of computer systems to perform tasks conventionally considered to require human intelligence, such as speech recognition, decision-making, and visual recognition of patterns and objects. Sleep tracking and the measurement of physiological signals during sleep are widely practiced. Sleep monitoring in both the laboratory and ambulatory environments therefore accrues massive amounts of data, which uniquely positions the field of sleep medicine to gain from AI.
Keywords: Artificial intelligence; Disorders of excessive somnolence; Machine learning; Polysomnogram; Sleep apnea
Year: 2022 PMID: 35262853 PMCID: PMC8904207 DOI: 10.1007/s11325-022-02592-4
Source DB: PubMed Journal: Sleep Breath ISSN: 1520-9512 Impact factor: 2.816
Fig. 1 Artificial Intelligence timeline
Fig. 2 Types of machine learning
Fig. 3 Development of machine algorithms
Examples of studies using machine learning algorithms for sleep stage and respiratory scoring
| Author, year | Population studied | Dataset source | Channels and sensors | Data preprocess | Classifier used | Performance measures | Other findings |
|---|---|---|---|---|---|---|---|
| Perslev [ | 15,660 participants | 16 clinical datasets | Single channel EEG and EOG signal | U-Sleep software | Convolutional neural network (CNN) | U-Sleep performed as well as other algorithms, even though U-Sleep was not trained on similar datasets | It predicts sleep stages in a single forward pass |
| Sharma [ | 80 subjects comprising healthy controls and patients with various sleep disorders | Cyclic Alternating Pattern (CAP) database | Dual channel EEG | Optimized wavelet filters | Ensemble bagged tree (EBT) classifier with tenfold cross validation | Accuracy 85.3% | Accuracy improved in a balanced dataset created using over-sampling and under-sampling techniques |
| Sun, 2020 [ | 8682 PSG | N/A | ECG and respiratory signals | 270 s time windows used | CNN | Performance is better for younger ages | |
| Jaoude 2020 [ | 6431 patients | MGH-PSG dataset; Ambulatory scalp EEG dataset | 4 EEG channels | Bandpass filter, downsampling to 100 Hz, generated bipolar montage to make it “reference channel-free” | CNN followed by RNN | Performance was consistent across common EEG background abnormalities | |
| Zhang et al., 2020 [ | 294 sleep studies (122 in training dataset, 20 in validation dataset, 152 in testing dataset) | Prospectively collected | 2 channel EEG, EOG, EMG, ECG, airflow | Filtered signal at 66 Hz, downsampled signal to 66 Hz | CNN | Accuracy 81.81% | Number of arousals affected model’s performance |
| Peter-Derex 2020 [ | 23 patients with insomnia, 24 patients with idiopathic hypersomnia, 24 patients with narcolepsy, 24 patients with OSA | Lyon sleep database | Single channel EEG | ASEEGA software | ASEEGA software | Agreement between software and consensual scorer was: insomnia 85.5% ( | |
| Sridhar, 2020 [ | 800 (561 subjects); 993 nights (993 subjects) | Sleep Heart Health Study; Multi-Ethnic Study of Atherosclerosis; Physionet Computing in Cardiology (CinC) | ECG | Normalized ECG signal, interbeat interval time series computed | CNN | Accuracy 77% (SHHS); accuracy 72% (CinC) | |
| Zhu, 2020 [ | 8 recordings (4 healthy, 4 sleep disorders); 20 recordings (healthy) | Sleep EDF, Sleep EDFX | Single channel EEG | Z score normalization of data | CNN + attention mechanism | Accuracy 93.7% F1 score 84.5% | Attention mechanism helped in learning inter and intra-epoch features |
| Xu, 2020 [ | 5793 participants (sleep disorders) | Sleep Heart Health Study | Multichannel EEG, EOG | Time frequency spectra | LSTM (long short-term memory)/RNN | Accuracy 87.4% | RNN takes temporal information into account |
| Zhang et al. Sleep, 2019 [ | 5213 patients | Sleep Heart Health Study | Multichannel EEG, EOG, EMG | Raw signal, short-time Fourier transform for spectrogram | Recurrent and convolutional neural networks | Validation MrOS Validation SOF | |
| Yildirim, 2019 [ | 8 recordings; 61 recordings (healthy and insomnia) | Sleep EDF, Sleep EDFX | Single channel EEG, Single channel EOG | Raw PSG signal | CNN | Accuracy 98.06% | |
| Phan, 2019 [ | 200 recordings | MGH sleep lab | Single channel EEG, EOG, EMG | Time frequency images | CNN + RNN | Accuracy 87% | Trained network in end to end fashion |
| Zhang and Wu, 2018 [ | 25 recordings (sleep disorders) 16 recordings | MIT-BIH database, Sleep EDF | Single channel EEG | Phase encoder, unsupervised training | CNN | Accuracy 87% | |
| Stephansen, 2018 [ | 3000 recordings (healthy and sleep disorders) | 10 databases | Multichannel EEG, EOG | Filter + octave encoding | CNN + RNN | Accuracy = 87% | Automates type 1 narcolepsy diagnosis |
| Sors et al., 2018 [ | 5793 recordings (sleep disorders) | Sleep Heart Health Study | Single channel EEG | Raw signal | CNN | Accuracy = 87% | |
| Biswal et al. 2018 [ | 1000 recordings; 5804 recordings | Sleep Heart Health Study; ISRUC-sleep | Multichannel EEG | Spectrogram | Recurrent and convolutional neural networks | Accuracy 87.5%, κ = 0.805; validation SHHS accuracy 77.7% | Minimal reduction in accuracy noted when working on single channel |
| Patanaik et al. Sleep, 2018 [ | 1046 recordings (healthy adolescents); 284 recordings (healthy young adults); 210 recordings (sleep disorders); 77 recordings (Parkinson disease adults) | CNL lab, CSL lab, Singapore; UCSD sleep lab, UC database | Multichannel EEG, EOG | Spectral images | Convolutional neural network | Accuracy 89.2%; validation set 1 accuracy 81.4%; validation set 2 (PD) accuracy 72.1% | Faster than human experts (F sec compared to 30–60 min) |
| Olesen 2018 [ | 2310 recordings (healthy and sick) | CNL lab, Singapore | Multichannel EEG, EOG, EMG | Raw data | CNN | Most errors made in stage N1 and N3 | |
| Malafeev, 2018 [ | 54 recordings (healthy); 43 recordings (22 PSG and 21 MSLT, hypersomnia patients) | Warsaw (healthy); Wisconsin Sleep Cohort (hypersomnia) | Single channel EEG, EMG, EOG | Spectrogram | CNN | Performance in healthy subjects was better than in hypersomnia patients | |
| Cui 2018 [ | 116 recordings including healthy and sick population | University of Zurich | Multichannel EEG, EOG, EMG | Time series | CNN | Accuracy = 92.2% | |
| Chambon, 2018 [ | 61 recordings from healthy adults | MGH sleep lab | Multichannel EEG, three chin EMG | Linear spatial filtering | CNN | Sensitivity = 52% | Utilized 1 min of data before and after each data segment, which offered the strongest improvement |
| Vilamala, 2017 [ | 40 recordings from 20 healthy adults | Montreal archive | Single channel EEG | Multitaper spectral analysis | CNN | Acc = 84–88% | |
| Supratak, 2017 [ | 62 healthy recordings | MGH sleep lab, Montreal Archive | Single channel EEG | Raw data | CNN + RNN | Acc = 86.2%, κ = 0.8 | Used multiple datasets |
| Tsinalis, 2016 [ | 40 recordings from 20 Healthy young adults | Montreal archive | Single channel EEG | Class balanced random sampling | CNN and 2D stack of frequency-specific activity in time | Accuracy = 71–76% Per stage Accuracy = 80–84% | Performance balanced across classes |
CNN, convolutional neural network; RNN, recurrent neural network; LSTM, long short-term memory; EBT, ensemble bagged tree; κ, kappa
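Several rows above report κ alongside accuracy. The source does not define it further; the sketch below assumes it denotes Cohen's kappa, which corrects observed agreement between two scorers for the agreement expected by chance. The function name `cohens_kappa` and the epoch labels are hypothetical and serve only as an illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of epochs where both scorers agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each scorer's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both scorers used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical per-epoch sleep stages (not data from any cited study).
human = ["W", "W", "N1", "N2", "N2", "N3", "REM", "REM", "N2", "W"]
model = ["W", "N1", "N1", "N2", "N2", "N3", "REM", "N2", "N2", "W"]
kappa = cohens_kappa(human, model)
print(f"kappa = {kappa:.3f}")
```

In this toy example the raw agreement is 80%, while kappa is lower because part of that agreement would be expected by chance given how often each stage occurs.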
Confusion matrix: the confusion matrix for predicted sleep stage displays agreement between human expert scoring and the algorithm's prediction (example: 2-stage classification of sleep and wake)
| | Algorithm scored wake (predicted class positive) | Algorithm scored sleep (predicted class negative) | |
|---|---|---|---|
| Human expert scored wake (actual class positive) | True positive (TP) | False negative (FN) | Sensitivity TP/(TP + FN) |
| Human expert scored sleep (actual class negative) | False positive (FP) | True negative (TN) | Specificity TN/(TN + FP) |
| | Precision TP/(TP + FP) | Negative predictive value TN/(TN + FN) | Accuracy (TP + TN)/(TP + TN + FP + FN) |
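The cells of the confusion matrix can be tallied directly from paired epoch labels. The following is a minimal sketch, treating wake as the positive class as in the table; the function names and the ten example epochs are hypothetical and are not taken from any cited study.

```python
def confusion_counts(actual, predicted, positive="wake"):
    """Return (TP, FN, FP, TN) for paired expert and predicted labels."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1  # expert wake, predicted wake
        elif a == positive:
            fn += 1  # expert wake, predicted sleep
        elif p == positive:
            fp += 1  # expert sleep, predicted wake
        else:
            tn += 1  # expert sleep, predicted sleep
    return tp, fn, fp, tn


def ratio(numerator, denominator):
    """Guard against empty rows or columns of the matrix."""
    return numerator / denominator if denominator else float("nan")


def matrix_metrics(tp, fn, fp, tn):
    """Marginal metrics shown in the last row and column of the table."""
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }


# Hypothetical epoch-by-epoch scoring.
expert = ["wake", "wake", "wake", "sleep", "sleep",
          "sleep", "sleep", "sleep", "wake", "sleep"]
predicted = ["wake", "wake", "sleep", "sleep", "sleep",
             "wake", "sleep", "sleep", "wake", "sleep"]

counts = confusion_counts(expert, predicted)
metrics = matrix_metrics(*counts)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these labels the counts are TP = 3, FN = 1, FP = 1, and TN = 5, so sensitivity and precision are both 0.75 and accuracy is 0.80.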
Performance metrics
| Performance metric | Formula |
|---|---|
| Accuracy | (TP + TN)/(total population) |
| Specificity | TN/(TN + FP) |
| Sensitivity | TP/(TP + FN) |
| PPV | TP/(TP + FP) |
| F1 | 2 * [(PPV*sensitivity)/(PPV + sensitivity)] |
| AUC | Area under ROC curve |
Commonly used metrics to quantify algorithm performance against gold standard
PPV, positive predictive value; AUC, area under the curve; TP, true positive; TN, true negative; FP, false positive; FN, false negative; ROC, receiver operating characteristic curve (y axis = sensitivity, x axis = 1-specificity)
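As an illustration of the last two rows of the metrics table, the sketch below computes F1 from confusion-matrix counts and approximates AUC by applying the trapezoidal rule to ROC points (x axis = 1 - specificity, y axis = sensitivity). The function names, labels, and scores are hypothetical; this is one common way to compute these quantities, not a method described in the source.

```python
def f1_score(tp, fp, fn):
    """F1 = 2 * (PPV * sensitivity) / (PPV + sensitivity)."""
    if tp == 0:
        return 0.0  # no true positives: PPV and sensitivity are zero or undefined
    ppv = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    return 2 * (ppv * sensitivity) / (ppv + sensitivity)


def roc_points(labels, scores):
    """ROC points as (1 - specificity, sensitivity); labels: 1 positive, 0 negative."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all samples tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def roc_auc(labels, scores):
    """Area under the ROC curve by the trapezoidal rule."""
    points = roc_points(labels, scores)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical wake probabilities from a classifier (1 = wake, 0 = sleep).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
f1 = f1_score(tp=3, fp=1, fn=1)
print(f"AUC = {auc:.3f}, F1 = {f1:.3f}")
```

Unlike accuracy, sensitivity, or F1, which depend on one chosen decision threshold, AUC summarizes performance across all thresholds of the classifier's score.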