Gregor Stiglic, Petra Povalej Brzan, Nino Fijacko, Fei Wang, Boris Delibasic, Alexandros Kalousis, Zoran Obradovic.
Abstract
Several studies have demonstrated the importance of comorbidities for understanding the origin and evolution of medical complications. This study focuses on improving the interpretability of predictive models by using simple logical features that represent comorbidities. We use group lasso based feature interaction discovery followed by a post-processing step in which simple logic terms are added. In the final step, we reduce the feature set by applying lasso logistic regression to obtain a compact set of non-zero coefficients that represents a more comprehensible predictive model. The effectiveness of the proposed approach was demonstrated on a pediatric hospital discharge dataset that was used to build a readmission risk estimation model. The evaluation shows a 72% reduction of the initial set of features in the regression model, with a slight improvement in the Area Under the ROC Curve from 0.763 (95% CI: 0.755-0.771) to 0.769 (95% CI: 0.761-0.777). Additionally, our results show improved comprehensibility of the final predictive model when simple comorbidity-based terms are used in logistic regression.
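The post-processing step, in which simple logic terms are added for a discovered pair of comorbidities, can be illustrated with a minimal sketch. The function name `logic_terms`, the toy records, and the choice to skip the "neither code present" combination are assumptions made for illustration; they are not taken from the authors' implementation.

```python
from itertools import product


def logic_terms(records, code_a, code_b):
    """Expand one pair of binary indicators into simple AND terms.

    records: list of dicts mapping a diagnosis code to 0 or 1.
    Returns a dict mapping a term name such as "288.00 = 0 AND 204.00 = 1"
    to a list of 0/1 values, one per record.
    """
    terms = {}
    for va, vb in product((0, 1), repeat=2):
        if va == 0 and vb == 0:
            # Assumption: "neither code present" is treated as the baseline.
            continue
        name = f"{code_a} = {va} AND {code_b} = {vb}"
        terms[name] = [int(r[code_a] == va and r[code_b] == vb) for r in records]
    return terms


# Toy discharge records (made up for illustration, not study data).
records = [
    {"288.00": 0, "204.00": 1},
    {"288.00": 1, "204.00": 0},
    {"288.00": 1, "204.00": 1},
    {"288.00": 0, "204.00": 0},
]

terms = logic_terms(records, "288.00", "204.00")
for name, values in terms.items():
    print(name, values)
```

Each resulting column is an ordinary binary feature, so it can be passed to a lasso logistic regression alongside the original variables, and any term that survives with a non-zero coefficient reads directly as a statement about two diagnoses.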
Year: 2015 PMID: 26645087 PMCID: PMC4672891 DOI: 10.1371/journal.pone.0144439
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Classification performance of the three observed approaches.
Four sets of boxplots show predictive performance, measured as area under the ROC curve (AUC), for the model using the 1-Standard Error setting (1SE), boosted C5.0 decision trees (C5.0), glinternet (GLI), and the model using the optimal lambda setting (OPT) obtained by cross-validation. Each set corresponds to a different “Number of Discovered Interactions” (NDI) setting: 5, 10, 15, or 20 interactions.
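The difference between the OPT and 1SE settings can be sketched with the commonly used one-standard-error rule: OPT takes the lambda with the lowest cross-validated error, while 1SE takes the largest lambda whose error is still within one standard error of that minimum. This page does not spell out the exact procedure the authors used, so the function `select_lambdas` and the numbers below are hypothetical.

```python
def select_lambdas(lambdas, cv_mean_error, cv_std_error):
    """Return (lambda_opt, lambda_1se) from cross-validation summaries."""
    # OPT: index of the penalty with the lowest mean cross-validated error.
    best = min(range(len(lambdas)), key=lambda i: cv_mean_error[i])
    threshold = cv_mean_error[best] + cv_std_error[best]
    # 1SE: largest penalty (sparsest model) within one SE of the best error.
    candidates = [
        lam for lam, err in zip(lambdas, cv_mean_error) if err <= threshold
    ]
    return lambdas[best], max(candidates)


# Hypothetical cross-validation results for five penalty values.
lambdas = [0.001, 0.01, 0.05, 0.1, 0.5]
cv_mean_error = [0.240, 0.231, 0.234, 0.238, 0.300]
cv_std_error = [0.004, 0.004, 0.004, 0.005, 0.006]

lambda_opt, lambda_1se = select_lambdas(lambdas, cv_mean_error, cv_std_error)
print("OPT lambda:", lambda_opt)
print("1SE lambda:", lambda_1se)
```

Because a larger lasso penalty drives more coefficients to zero, the 1SE choice generally trades a small amount of predictive performance for a smaller model.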
Fig 2. Complexity of the three observed approaches.
Comparison of model complexity, measured as number of selected features, for the three compared approaches and four different settings of “Number of Discovered Interactions” (NDI).
Ranked list of the most frequent positive coefficients including comorbidity terms for both proposed approaches.
| Rank | Variable | Present in the final set of features, OPT (%) | Present in the final set of features, 1SE (%) |
|---|---|---|---|
| 1 | 288.00 = 0 AND 204.00 = 1; 288.00—Neutropenia, unspecified; 204.00—Acute lymphoid leukemia, without mention of having achieved remission | 100.0 | 99.9 |
| 2 | 288.00 = 1 AND 204.00 = 0; 288.00—Neutropenia, unspecified; 204.00—Acute lymphoid leukemia, without mention of having achieved remission | 100.0 | 100.0 |
| 3 | V58.11 = 0 AND 194.0 = 1; V58.11—Encounter for antineoplastic chemotherapy; 194.0—Malignant neoplasm of adrenal gland | 100.0 | 100.0 |
| 4 | V58.11 = 0 AND 284.1 = 1; V58.11—Encounter for antineoplastic chemotherapy; 284.1—Pancytopenia | 100.0 | 100.0 |
| 5 | Length of stay (log transformed) | 100.0 | 100.0 |
| 6 | Number of chronic conditions | 100.0 | 100.0 |
| 7 | V58.11 = 1 AND 288.00 = 0; V58.11—Encounter for antineoplastic chemotherapy; 288.00—Neutropenia, unspecified | 99.8 | 99.9 |
| 8 | V58.11 = 0 AND 780.60 = 1; V58.11—Encounter for antineoplastic chemotherapy; 780.60—Fever, unspecified | 99.2 | 93.8 |
| 9 | V58.11 = 0 AND 780.61 = 1; V58.11—Encounter for antineoplastic chemotherapy; 780.61—Fever presenting with conditions classified elsewhere | 99.2 | 99.3 |
| 10 | V58.11 = 1 AND 284.1 = 0; V58.11—Encounter for antineoplastic chemotherapy; 284.1—Pancytopenia | 99.0 | 97.5 |
Ranked list of the most frequent negative coefficients including comorbidity terms for both proposed approaches.
| Rank | Variable | Present in the final set of features, OPT (%) | Present in the final set of features, 1SE (%) |
|---|---|---|---|
| 1 | Number of chronic conditions AND Age | 100.0 | 46.1 |
| 2 | Number of chronic conditions AND Length of stay (log transformed) | 100.0 | 8.87 |
| 3 | Number of chronic conditions AND Number of procedures | 99.3 | 19.1 |
| 4 | 486 = 0 AND 327.23 = 1; 486—Pneumonia, organism unspecified; 327.23—Obstructive sleep apnea (adult)(pediatric) | 97.8 | 94.8 |
| 5 | 486—Pneumonia, organism unspecified | 82.9 | 85.3 |
| 6 | 486 = 1 AND 327.23 = 0; 486—Pneumonia, organism unspecified; 327.23—Obstructive sleep apnea (adult)(pediatric) | 72.1 | 51.9 |
| 7 | 486 = 0 AND 382.9 = 1; 486—Pneumonia, organism unspecified; 382.9—Unspecified otitis media | 69.7 | 58.8 |
| 8 | 799.02—Hypoxemia | 57.8 | 61.3 |
| 9 | 493.00—Extrinsic asthma, unspecified | 53.7 | 55.5 |
| 10 | 382.9—Unspecified otitis media | 39.4 | 45.5 |
Fig 3. Risk of readmission with and without the interaction term.
Surface plots of the response (risk of readmission) from the model without (left) and with (right) the interaction between length of stay (LOS_LOG) and number of chronic diseases (NCHRONIC).
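The effect of such an interaction term on the response surface can be sketched with a two-variable logistic model. All coefficient values below are hypothetical and chosen only to show the shape of the effect; the negative sign of the interaction follows the table of negative coefficients, but the magnitudes are not from the study.

```python
import math


def risk(los_log, nchronic, b0, b_los, b_chronic, b_inter=0.0):
    """Logistic readmission risk with an optional LOS_LOG x NCHRONIC term."""
    z = (
        b0
        + b_los * los_log
        + b_chronic * nchronic
        + b_inter * los_log * nchronic
    )
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients (assumptions, not fitted values).
COEF = {"b0": -2.0, "b_los": 0.5, "b_chronic": 0.3}
B_INTER = -0.1  # negative interaction dampens the combined effect

print("LOS_LOG NCHRONIC  without  with")
for los_log in (0.0, 1.0, 2.0, 3.0):
    for nchronic in (0, 2, 4):
        without = risk(los_log, nchronic, **COEF)
        with_term = risk(los_log, nchronic, **COEF, b_inter=B_INTER)
        print(f"{los_log:7.1f} {nchronic:8d}  {without:7.3f}  {with_term:.3f}")
```

With a negative interaction coefficient, the two surfaces coincide wherever either variable is zero and diverge as both grow, so the model with the interaction predicts a lower risk for long stays combined with many chronic conditions than the additive model does.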