Xavier Rafael-Palou, Cecilia Turino, Alexander Steblin, Manuel Sánchez-de-la-Torre, Ferran Barbé, Eloisa Vargiu.
Abstract
BACKGROUND: Patients suffering from obstructive sleep apnea are mainly treated with continuous positive airway pressure (CPAP). Although it is a highly effective treatment, compliance with this therapy is difficult to achieve, with serious consequences for the patients' health. Unfortunately, there is a clear lack of clinical analytical tools to support the early prediction of compliant patients.
Keywords: Continuous positive airway pressure; Machine learning; Obstructive sleep apnea; Predictive methods
Year: 2018 PMID: 30227856 PMCID: PMC6145365 DOI: 10.1186/s12911-018-0657-z
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Fig. 1 Pipeline steps designed for building classifiers for compliance with the CPAP therapy
Pipeline parameters tested using grid-search and 10-fold CV
| Pipeline step | Parameter options |
|---|---|
| Combine_fs | percentile = [5, 10, 20, 30, 40, 50] |
| Lasso_fs | estimator = Logistic Regression; penalty = 'l1' |
| RFE_RF_fs | class_weight = 'balanced'; n_estimators = 100; step = [0.1]; n_features_to_select = [0.4, 0.6, 0.8] |
| Smote_fs | n_neighbors = [3, 4, 5]; ratio = 'auto'; kind = 'regular' |
| k-NN | n_neighbors = [1, 3, 5, 7, 9, 11]; weights = ['uniform', 'distance'] |
| LR | class_weight = [None, 'balanced']; penalty = ['l1', 'l2'] |
| RF | n_estimators = [100, 150, 200, 250, 500]; criterion = ['entropy', 'gini']; max_depth = [None, 4, 6]; class_weight = [None, 'balanced'] |
| SVM | C = [0.01, 0.1, 0.5, 1, 5, 10, 15, 30, 50]; gamma = [0.0001, 0.001, 0.01, 0.1, 1, 5]; kernel = 'radial'; class_weight = [None, 'balanced'] |
| NN | alpha = [1e-5, 0.0001, 0.001, 0.01, 0.1, 1, 3, 5, 10]; hidden_layer_sizes = [(30,), (50,), (70,), (100,), (150,), (30,30), (50,50), (70,70), (100,100), (30,30,30), (50,50,50), (70,70,70)] |
Performances of the best pipelines in each dataset
| id | ds | sm | fs | metric | cls | params |
|---|---|---|---|---|---|---|
| p0 | D0 | none | none | precision_weighted | SVM | [0.001, balanced, 30] |
| p1 | D1 | Smote | none | f1_weighted | SVM | [0.001, None, 4, 15] |
| p3 | D3 | Smote | Lasso_fs | precision_weighted | RF | [1, 250, gini, 4, None, None] |

| id | cv_prec | cv_rec | cv_f1 | test_prec | test_rec | test_f1 |
|---|---|---|---|---|---|---|
| p0 | 0.78 +/- 0.2 | 0.74 +/- 0.17 | 0.73 +/- 0.18 | 0.77 | 0.85 | 0.76 |
| p1 | 0.84 +/- 0.06 | 0.82 +/- 0.05 | 0.82 +/- 0.06 | 0.85 | 0.88 | 0.84 |
| p3 | 0.89 +/- 0.14 | 0.88 +/- 0.14 | 0.87 +/- 0.15 | 0.85 | 0.88 | 0.84 |
Fig. 2 ROC curves for cross-validation and test of the best pipeline for dataset D0
Fig. 3 ROC curves for cross-validation and test of the best pipeline for dataset D1
Fig. 4 ROC curves for cross-validation and test of the best pipeline for dataset D3
Fig. 5 Learning curves of the best pipeline for dataset D0
Fig. 6 Learning curves of the best pipeline for dataset D1
Fig. 7 Learning curves of the best pipeline for dataset D3
Fig. 8 Best pipeline results in cross-validation and test at different time-points
Differences in cross-validation f1 performance, per dataset, between pipelines configured with different techniques
| Methods | Comparisons | D0 Avg | D0 Max | D1 Avg | D1 Max | D3 Avg | D3 Max |
|---|---|---|---|---|---|---|---|
| Sampling | Smote vs None | 0.01 | -0.02 | -0.01 | 0.04 | 0.0 | 0.02 |
| Feature selection | Best vs None | -0.07 | -0.07 | -0.05 | -0.13 | 0.0 | 0.02 |
| Metrics | f1 vs prec | 0.01 | 0.02 | 0.01 | 0.04 | 0.01 | -0.01 |
| Classifier algorithm | Best vs Worst | 0.09 | 0.14 | 0.14 | 0.21 | 0.19 | 0.20 |
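The Smote-vs-None comparison above refers to SMOTE oversampling, which creates synthetic minority-class samples by interpolating between a minority sample and one of its k nearest minority-class neighbours. A simplified NumPy sketch of that core idea follows; it is an illustration of the technique, not the library implementation the authors used.

```python
import numpy as np

def smote_like(X_min, n_new, k=5, rng=None):
    """Generate n_new synthetic minority samples by interpolating each
    picked sample toward one of its k nearest minority-class neighbours
    (the core idea of SMOTE, simplified)."""
    rng = np.random.default_rng(rng)
    n = len(X_min)
    # pairwise distances within the minority class; exclude self-matches
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]          # k nearest neighbour indices
    new = np.empty((n_new, X_min.shape[1]))
    for i in range(n_new):
        j = rng.integers(n)                    # random minority sample
        nb = X_min[rng.choice(nn[j])]          # one of its k neighbours
        lam = rng.random()                     # interpolation factor in [0, 1)
        new[i] = X_min[j] + lam * (nb - X_min[j])
    return new

X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_like(X_min, n_new=4, k=2, rng=0)
print(synthetic.shape)  # (4, 2)
```

Because each synthetic point lies on a segment between two real minority samples, the new samples stay inside the minority class's local neighbourhoods rather than being arbitrary noise.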
Best descriptive and non-descriptive pipelines by dataset
| ds | sm | fs | metric | cls | params |
|---|---|---|---|---|---|
| D0 | none | none | prec | SVM | [0.001, balanced, 30] |
| D0 | Smote | none | prec | LR | [None, 15, 4, l2] |
| D1 | Smote | none | f1 | SVM | [0.001, None, 4, 15] |
| D1 | none | none | f1 | LR | [None, 5, l2] |
| D3 | Smote | Lasso_fs | prec | RF | [1, 250, gini, 4, None, None] |
| D3 | none | none | f1 | LR | [None, 0.5, l1] |
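The Lasso_fs step in the winning D3 pipeline selects features with an L1-penalised logistic regression, whose penalty drives weak coefficients to exactly zero. A minimal scikit-learn sketch of this kind of selector follows; the synthetic data and the regularisation strength C=0.1 are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression

# synthetic data with only a few informative features
X, y = make_classification(n_samples=150, n_features=20,
                           n_informative=4, random_state=0)

# L1 penalty zeroes out weak coefficients; SelectFromModel keeps the
# features whose coefficient magnitude survives the penalty
lasso = LogisticRegression(penalty="l1", solver="liblinear",
                           C=0.1, random_state=0)
selector = SelectFromModel(lasso).fit(X, y)
X_sel = selector.transform(X)
print(X_sel.shape[1], "features kept out of", X.shape[1])
```

Used as a pipeline step before the classifier, this both reduces dimensionality and yields the per-feature weights reported in the stability/weight figures.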
Fig. 9 Stability scores and feature weights of the best pipeline for dataset D0
Fig. 10 Stability scores and feature weights of the best pipeline for dataset D1
Fig. 11 Stability scores and feature weights of the best pipeline for dataset D3
Performance comparison between best pipelines for each dataset
| Pipelines | Difference (cv_f1) | Statistic | p-value |
|---|---|---|---|
| p0 vs p1 | 0.09 +/- 0.15 | -1.71 | 0.1201 |
| p0 vs p3 | 0.14 +/- 0.18 | -2.70 | 0.0241 |
| p1 vs p3 | 0.05 +/- 0.14 | -1.0931 | 0.3027 |
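The statistic and p-value columns are consistent with a paired test over matched cross-validation folds. Assuming a paired t-test on fold-wise f1 scores (the exact test is not restated in this excerpt), the comparison can be reproduced with SciPy; the fold scores below are made-up placeholders, not the paper's data.

```python
import numpy as np
from scipy.stats import ttest_rel

# hypothetical fold-wise cv_f1 scores for two pipelines over 10 folds
f1_p0 = np.array([0.70, 0.72, 0.68, 0.75, 0.71, 0.74, 0.69, 0.73, 0.70, 0.72])
f1_p3 = np.array([0.84, 0.86, 0.83, 0.88, 0.85, 0.87, 0.84, 0.86, 0.85, 0.86])

# paired t-test: the folds are matched, so the test works on
# per-fold differences rather than pooled scores
t_stat, p_value = ttest_rel(f1_p0, f1_p3)
print(round(t_stat, 2), round(p_value, 4))
```

A negative statistic means the first pipeline scored lower; a p-value below 0.05 (as in the p0-vs-p3 row) indicates the gap is unlikely to be fold-to-fold noise.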