José Miguel Calderón, Julio Álvarez-Pitti, Irene Cuenca, Francisco Ponce, Pau Redon.
Abstract
Obstructive sleep apnea syndrome is a reduction of the airflow during sleep which not only reduces sleep quality but also has major health consequences. The prevalence in the obese pediatric population can surpass 50%, and polysomnography is the current gold standard method for its diagnosis. Unfortunately, it is expensive, disturbing and time-consuming for experienced professionals. The objective is to develop a patient-friendly screening tool for the obese pediatric population to identify those children at higher risk of suffering from this syndrome. Three supervised learning classifier algorithms common in the field of machine learning (i.e., logistic regression, support vector machine and AdaBoost) were trained and tested on two very different datasets in which the raw oxygen saturation signal was recorded. The first dataset was the Childhood Adenotonsillectomy Trial (CHAT), consisting of 453 individuals aged between 5 and 9 years, one-third of whom were obese. Cross-validation was performed on the second dataset, from an obesity assessment consult at the Pediatric Department of the Hospital General Universitario of Valencia. A total of 27 patients between 5 and 17 years old were recruited; 42% were girls and 63% were obese. The performance of each algorithm was evaluated based on key performance indicators (e.g., area under the curve, accuracy, recall, specificity and positive predictive value). The logistic regression algorithm (accuracy = 0.79, specificity = 0.96, area under the curve = 0.9, recall = 0.62 and positive predictive value = 0.94) outperformed the support vector machine and AdaBoost algorithms when trained with the CHAT dataset. Cross-validation tests, using the Hospital General de Valencia (HG) dataset, confirmed the higher performance of the logistic regression algorithm in comparison with the others.
In addition, only a minor loss of performance (accuracy = 0.75, specificity = 0.88, area under the curve = 0.85, recall = 0.62 and positive predictive value = 0.83) was observed despite the differences between the datasets. The proposed minimally invasive screening tool has shown promising performance in identifying children at risk of suffering obstructive sleep apnea syndrome. Moreover, it is well suited for implementation in outpatient consults in primary and secondary care.
Keywords: machine learning; obese pediatric population; obstructive sleep apnea syndrome; oxygen saturation signal
Year: 2020 PMID: 33086521 PMCID: PMC7712243 DOI: 10.3390/bioengineering7040131
Source DB: PubMed Journal: Bioengineering (Basel) ISSN: 2306-5354
Summary of the performance of oxygen saturation (SpO2)-based tools to diagnose obstructive sleep apnea (OSA) syndrome.
| Reference | Cohort | Type of Classifier | Sample Size | Sensitivity (%) | Specificity (%) | Accuracy (%) | Year | Home-Based |
|---|---|---|---|---|---|---|---|---|
| [ | A | Multivariate adaptive regression splines | 793 | 83 | 54 | NA | 1999 | N |
| [ | A | Linear regression | 148 | 91 | 83 | 89 | 2009 | N |
| [ | A | Univariate | 475 | 96 | 67 | 87 | 2012 | Y |
| [ | A | Bagging REPTree | 25 | 78 | 84 | 83 | 2012 | N |
| [ | A | Artificial Neural Network | 93 | 88 | 100 | 93 | 2012 | N |
| [ | A | Univariate | 996 | 84 | 86 | NA | 2014 | Y |
| [ | A | Linear discriminant analysis | 302 | 97 | 50 | 93 | 2017 | Y |
| [ | A | Deep belief networks | 33 | 60 | 92 | 85 | 2017 | N |
| [ | A | Long short-term memory | 8 | 93 | NA | 96 | 2017 | N |
| [ | A | Convolutional neural networks | 23 | NA | NA | 80 | 2018 | N |
| [ | A | Recurrent and convolutional neural network | 15,804 | NA | NA | 88 | 2018 | N |
| [ | A | Common Bayesian Network | 32 | NA | NA | 85 | 2017 | N |
| [ | P | Neural network | 176 | NA | NA | 84.7–85.8 | 2015 | N |
| [ | P | Logistic regression | 298 | 79.1 | 84.1 | 81.9 | 2017 | N |
| [ | P | Neural network | 4191 | 84.0–68.7 | 53–94 | 75.2–90 | 2017 | N |
| [ | P | Logistic regression, QDA, LDA | 176 | NA | NA | 84.3–82.7 | 2018 | N |
| [ | P | Convolutional neural network | 298 | NA | NA | 81.3–85.3 | 2018 | N |
| [ | P | Convolutional neural network | 779 | 40–54 | 98.6–99.6 | 74.8–95.1 | 2020 | N |
| [ | P | AdaBoost | 974 | 91–41 | 22.7–98.1 | 78.2–85.9 | 2020 | N |
A, adult population; P, pediatric population; NA, not available; QDA, quadratic discriminant analysis; LDA, linear discriminant analysis; Y, yes; N, no.
Extracted features from the raw SpO2 signal used in the Childhood Adenotonsillectomy Trial (CHAT) [6,35,36,37].
| Variable | Description |
|---|---|
| AHI | Apnea/hypopnea index (AHI) ≥ 3% oxygen desaturation per hour of sleep |
| odi3 | Oxygen desaturation index ≥ 3% during sleep time |
| odi4 | Oxygen desaturation index ≥ 4% during sleep time |
| ndes2ph | Number of desaturations with ≥ 2% desaturation |
| ndes3ph | Number of desaturations with ≥ 3% desaturation |
| ndes4ph | Number of desaturations with ≥ 4% desaturation |
| ndes5ph | Number of desaturations with ≥ 5% desaturation |
| pctle90 | Percentage of time ≤ 90% oxygen saturation |
| pctle92 | Percentage of time ≤ 92% oxygen saturation |
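
The kind of features listed in this table can be illustrated with a minimal Python sketch. It assumes the SpO2 signal is a plain list of percentage values sampled at a fixed rate. The desaturation rule used here (a drop of at least `depth` points below the running peak, counted once until the drop shrinks again) is a simplifying assumption for illustration and is not the scoring rule used in CHAT. The function names are hypothetical, and the rate is computed per hour of recording rather than per hour of scored sleep.

```python
def pct_time_at_or_below(spo2, threshold):
    """Percentage of samples with SpO2 <= threshold (e.g. 90 or 92)."""
    if not spo2:
        raise ValueError("empty SpO2 signal")
    return 100.0 * sum(1 for v in spo2 if v <= threshold) / len(spo2)


def count_desaturations(spo2, depth):
    """Count desaturation events of at least `depth` percentage points.

    Assumed rule (illustrative only): an event starts when the signal is
    `depth` or more below the running peak and ends when the drop becomes
    smaller than `depth`; the peak is then restarted from that sample.
    """
    if not spo2:
        raise ValueError("empty SpO2 signal")
    peak = spo2[0]
    in_event = False
    events = 0
    for v in spo2:
        if in_event:
            if peak - v < depth:      # signal has recovered
                in_event = False
                peak = v              # restart the running peak
        else:
            peak = max(peak, v)
            if peak - v >= depth:     # new drop of sufficient depth
                events += 1
                in_event = True
    return events


def events_per_hour(events, n_samples, sampling_hz=1.0):
    """Convert an event count into a rate per hour of recording."""
    hours = n_samples / sampling_hz / 3600.0
    return events / hours


# Synthetic signal, one sample per second (not patient data).
signal = [97, 97, 96, 93, 92, 95, 97, 97, 94, 93, 97]
n3 = count_desaturations(signal, depth=3)
print(n3, pct_time_at_or_below(signal, 93))
```

Running the same counting function with `depth` set to 2, 3, 4 and 5 yields analogues of the four desaturation counts, and thresholds of 90 and 92 yield analogues of the two percentage-of-time features.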
Figure 1. Schematic representation of the workflow followed to generate a screening tool based exclusively on pulse oximetry measurements and machine learning algorithms. This tool is specifically aimed at screening the asymptomatic obese pediatric population for subjects at risk of suffering OSA syndrome. The manuscript focuses only on the generation of this new tool, emphasizing the development of the classifier.
Figure 2. Histogram of the apnea/hypopnea index in the CHAT (blue) and the Hospital General de Valencia (HG) (orange) datasets. The dashed red line marks the threshold value used in the present paper, AHI = 5. Individuals with AHI ≤ 5 were considered healthy.
Result of applying inferential statistics test on extracted features from the SpO2 raw signal.
| Feature | Healthy, n = 197 | At Risk, n = 256 | Shapiro–Wilk (p-value) | Mann–Whitney U (p-value) |
|---|---|---|---|---|
| ndes2ph | 82.91 ± 58.31 | 189.94 ± 107.56 | < 1 × 10⁻¹⁵ | < 1 × 10⁻³⁰ |
| ndes3ph | 28.13 ± 20.45 | 91.67 ± 63.06 | < 1 × 10⁻²⁰ | < 1 × 10⁻⁴⁰ |
| ndes4ph | 10.08 ± 8.44 | 47.17 ± 40.32 | < 1 × 10⁻²⁰ | < 1 × 10⁻⁴⁰ |
| ndes5ph | 4.41 ± 4.49 | 26.51 ± 26.97 | < 1 × 10⁻²⁵ | < 1 × 10⁻⁴⁰ |
| odi3 | 2.79 ± 2.08 | 10.53 ± 7.38 | < 1 × 10⁻²⁰ | < 1 × 10⁻⁴⁵ |
| odi4 | 0.98 ± 0.83 | 5.53 ± 4.81 | < 1 × 10⁻²⁵ | < 1 × 10⁻⁴⁵ |
| pctle90 | 0.06 ± 0.69 | 0.29 ± 0.51 | < 1 × 10⁻³⁵ | < 1 × 10⁻²⁵ |
| pctle92 | 0.38 ± 3.49 | 0.81 ± 1.37 | < 1 × 10⁻³⁵ | < 1 × 10⁻²⁵ |
ndes2ph, number of desaturations ≥ 2% per hour; ndes3ph, number of desaturations ≥ 3% per hour; ndes4ph, number of desaturations ≥ 4% per hour; ndes5ph, number of desaturations ≥ 5% per hour; odi3, oxygen desaturation index ≥ 3% during sleep time; odi4, oxygen desaturation index ≥ 4% during sleep time; pctle90, percentage of time oxygen saturation was ≤ 90%; pctle92, percentage of time oxygen saturation was ≤ 92%.
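
The table pairs a Shapiro–Wilk normality test with a Mann–Whitney U comparison of the two groups, the usual choice when normality is rejected. As an illustration of what the second test computes, the following is a minimal pure-Python sketch of a two-sided Mann–Whitney U test using the normal approximation with tie correction and no continuity correction. It is not the authors' code; in practice a library routine such as `scipy.stats.mannwhitneyu` would normally be used, and the sample values below are synthetic.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0          # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, two-sided p-value) under the normal approximation."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)
    n = n1 + n2
    # Tie correction: sum of (t^3 - t) over groups of tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:                          # all values identical
        return u, 1.0
    z = (u - n1 * n2 / 2.0) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return u, p


# Synthetic feature values for two groups (not patient data).
healthy = [1.0, 2.0, 3.0, 4.0, 5.0]
at_risk = [6.0, 7.0, 8.0, 9.0, 10.0]
u_stat, p_value = mann_whitney_u(healthy, at_risk)
print(u_stat, p_value)
```

The normal approximation is only reasonable for moderately large groups; for very small samples an exact test is preferable.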
Figure 3. Cross-correlation matrix of the SpO2 signal features.
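
A correlation matrix of this kind can be sketched as follows. The source does not state which correlation coefficient was used, so Pearson's coefficient is an assumption here. The helper names and the feature values are hypothetical and synthetic.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant feature")
    return cov / math.sqrt(var_a * var_b)


def correlation_matrix(features):
    """Map each pair of feature names to its Pearson correlation."""
    names = list(features)
    return {r: {c: pearson(features[r], features[c]) for c in names}
            for r in names}


# Synthetic values for three features (not patient data).
features = {
    "odi3": [1.0, 2.0, 3.0, 4.0],
    "odi4": [2.0, 4.0, 6.0, 8.0],
    "inverse": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(features)
print(matrix["odi3"]["odi4"], matrix["odi3"]["inverse"])
```

Strongly correlated features, such as the desaturation counts at neighboring thresholds, carry overlapping information, which is what a matrix like this makes visible.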
Figure 4. Impact of the number of samples on overfitting of the model. The overlap of the blue and green lines, as well as of the corresponding shaded areas, reflects the lack of overfitting. Using more than 120 samples mitigates overfitting.
Performance results for each algorithm in the CHAT and the HG datasets. Values are given in % as mean ± STD.
| Dataset | Algorithm | AUC | Accuracy | Sensitivity | Specificity | PPV |
|---|---|---|---|---|---|---|
| CHAT | SVM | 89.2 ± 7.7 | 82.9 ± 9.9 | 78.3 ± 13.5 | 87.4 ± 13.5 | 87.7 ± 12.2 |
| CHAT | LR | 90.2 ± 6.9 | 79.0 ± 7.2 | 62.0 ± 13.2 | 96.0 ± 5.4 | 94.3 ± 7.2 |
| CHAT | AB | 89.0 ± 6.7 | 82.1 ± 6.7 | 73.2 ± 11.8 | 90.9 ± 9.3 | 90.2 ± 9.8 |
| HG | SVM | 68.3 ± 4.3 | 66.7 ± 4.9 | 80.8 ± 13.6 | 52.5 ± 6.8 | 62.8 ± 4.2 |
| HG | LR | 85.2 ± 0.0 | 75.0 ± 0.0 | 62.5 ± 0.0 | 87.5 ± 0.0 | 83.3 ± 0.0 |
| HG | AB | 79.9 ± 1.3 | 74.6 ± 2.8 | 86.7 ± 3.1 | 62.5 ± 4.6 | 69.9 ± 2.7 |
STD, standard deviation; CHAT, Childhood Adenotonsillectomy Trial; HG, Hospital General de Valencia; SVM, support vector machine; LR, logistic regression; AB, AdaBoost; AUC, area under the curve; PPV, positive predictive value.
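
The threshold-dependent indicators in this table (accuracy, sensitivity, specificity and PPV) all follow from the four cells of a confusion matrix. A minimal sketch is given below. The function name is hypothetical, and the counts are hypothetical as well: the source does not report a confusion matrix, and these numbers were chosen only because they reproduce the values listed for LR on the HG dataset.

```python
def screening_metrics(tp, fp, tn, fn):
    """Compute screening indicators from confusion-matrix counts.

    tp/fn: at-risk subjects classified as at risk / as healthy.
    tn/fp: healthy subjects classified as healthy / as at risk.
    """
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("a required denominator is zero")
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # also called recall
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),           # positive predictive value
    }


# Hypothetical counts, not taken from the source.
metrics = screening_metrics(tp=5, fp=1, tn=7, fn=3)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

With these counts, high specificity and PPV come at the price of moderate sensitivity: few healthy children are flagged, but some at-risk children are missed. The AUC is not covered here because it requires the classifier's continuous scores rather than a single confusion matrix.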