Bravein Amalakuhan, Lukasz Kiljanek, Arvin Parvathaneni, Michael Hester, Pramil Cheriyath, Daniel Fischman.
Abstract
Frequent COPD exacerbations have a large impact on morbidity, mortality and health-care expenditures. By 2020, the World Health Organization expects COPD and COPD exacerbations to be the third leading cause of death world-wide. Furthermore, in 2005 it was estimated that COPD exacerbations cost the U.S. health-care system 38 billion dollars. Studies attempting to determine factors related to COPD readmissions are still very limited. Moreover, few have used an organized machine-learning, sensitivity-analysis approach, such as a Random Forest (RF) statistical model, to analyze this problem. This study used the RF machine-learning algorithm to determine factors that predict risk for multiple COPD exacerbations in a single year. This was a retrospective study with a data set of 106 patients. The patients were randomly divided into training (80%) and validation (20%) datasets 100 times, initially using approximately sixty variables that prior studies had found to be associated with patient readmission for COPD exacerbation. In an iterative manner, an RF model was created using the training set and validated on the validation set. Mean area-under-curve (AUC) statistics, sensitivity, specificity, and negative/positive predictive values (NPV, PPV) were calculated for the 100 runs. The following variables were found to be important predictors of patients having at least two COPD exacerbations within one year: employment, body mass index, number of previous surgeries, administration of azithromycin/ceftriaxone/moxifloxacin, and admission albumin level. The mean AUC was 0.72, with a sensitivity of 0.75, specificity of 0.56, PPV of 0.7 and NPV of 0.63. Histograms were used to confirm consistent accuracy. The RF design has consistently demonstrated encouraging results. We expect to validate our results on new patient groups and improve accuracy by increasing our training dataset. We hope that identifying patients at risk for frequent readmissions will improve patient outcomes and save valuable hospital resources.
Keywords: COPD; prediction; random forest; readmission
Year: 2012 PMID: 23882354 PMCID: PMC3714087 DOI: 10.3402/jchimp.v2i1.9915
Source DB: PubMed Journal: J Community Hosp Intern Med Perspect ISSN: 2000-9666
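The evaluation procedure described in the abstract (100 random 80/20 splits, with sensitivity, specificity, PPV and NPV averaged over the runs) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `confusion_metrics` and `repeated_holdout`, the `fit` callback, and the choice to skip runs where a metric is undefined are all assumptions made for the example.

```python
import math
import random
from statistics import mean


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV and NPV for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined when the denominator is zero (e.g. no positives in a split).
        return num / den if den else math.nan

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def repeated_holdout(records, labels, fit, n_runs=100, train_frac=0.8, seed=0):
    """Average the metrics over n_runs random train/validation splits.

    `fit` is a hypothetical callback: it receives the training records and
    labels and returns a function mapping one record to a 0/1 prediction.
    """
    rng = random.Random(seed)
    indices = list(range(len(records)))
    cut = round(train_frac * len(indices))
    runs = []
    for _ in range(n_runs):
        rng.shuffle(indices)  # a fresh random split on every run
        train, valid = indices[:cut], indices[cut:]
        predict = fit([records[i] for i in train], [labels[i] for i in train])
        y_pred = [predict(records[i]) for i in valid]
        runs.append(confusion_metrics([labels[i] for i in valid], y_pred))

    averaged = {}
    for key in runs[0]:
        # Assumption: runs where a metric is undefined are left out of its mean.
        defined = [r[key] for r in runs if not math.isnan(r[key])]
        averaged[key] = mean(defined) if defined else math.nan
    return averaged
```

In a setting like the one described, `fit` would wrap a random forest trainer (for instance scikit-learn's `RandomForestClassifier`, which is one possible choice and not necessarily what the study used), while `records` would hold the roughly sixty collected variables per patient.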
General variables collected
| Age | BMI | History of CAD | History of diastolic heart failure |
| Sex | Marital status | History of NSTEMI | History of previous pneumonia |
| Race | Number of allergies | History of STEMI | History of asthma |
| History of drinking | Number of meds | History of HTN | Inhaled anticholinergic use |
| Living arrangement | Number of major diseases in past medical history | History of DM | Home oxygen use |
| Employment status | Number of surgeries in past medical history | History of systolic heart failure | Home beta-2 agonist use |
| Zip code of home | History of smoking | Most recent HbA1c | Up to date on influenza and Pneumovax vaccines |
Variables collected within ‘index of admission (IA)/hospital stay’
| Admission O2 saturation | Moxifloxacin use | HCO3 on basic metabolic profile at discharge date | Month of admission |
| Lowest O2 saturation | PO/IV steroid use | Serum albumin on admission | Pneumonia |
| Beta-2 agonist use | PO steroid prescription upon discharge | Serum magnesium on admission | Length of stay in days |
| Inhaled anticholinergic use | pH, PaCO2, PaO2, HCO3, O2 saturation (ABG) | Pulmonologist consulted | |
| Inhaled steroid use | HCO3 on basic metabolic profile at admission date | Intubated | |
Fig. 1. ROC curve generated in the final data run, using the random forest algorithm.
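For readers unfamiliar with how an ROC curve such as the one in Fig. 1 yields an AUC value, the sketch below builds the curve from predicted scores and integrates it with the trapezoidal rule. It is an illustrative assumption about the general technique, not the authors' implementation; `roc_points` and `auc` are hypothetical names, and the scores stand in for whatever class probabilities the model produces.

```python
def roc_points(y_true, scores):
    """Return (false-positive rate, true-positive rate) pairs, one per threshold."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("ROC needs at least one positive and one negative label")

    # Sweep the threshold from the highest score downwards.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Tied scores cross the threshold together and form a single point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes, which gives context for the reported mean of 0.72.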