
A prediction model for Clostridium difficile recurrence.

Francis D LaBarbera, Ivan Nikiforov, Arvin Parvathenani, Varsha Pramil, Subhash Gorrepati.

Abstract

BACKGROUND: Clostridium difficile infection (CDI) is a growing problem in the community and hospital setting. Its incidence has been on the rise over the past two decades, and it is quickly becoming a major concern for the health care system. A high rate of recurrence is one of the major hurdles in the successful treatment of C. difficile infection. Few studies have looked at patterns of recurrence. The studies currently available have shown a number of risk factors associated with C. difficile recurrence (CDR); however, there is little consensus on the impact of most of the identified risk factors.
METHODS: Our study was a retrospective chart review of 198 patients diagnosed with CDI via polymerase chain reaction (PCR) from February 2009 to June 2013. We used a machine learning algorithm called the Random Forest (RF) to analyze all of the factors proposed to be associated with CDR. This model is capable of making predictions based on a large number of variables, and has outperformed numerous other models and statistical methods.
RESULTS: We developed a model that predicted CDR with a sensitivity of 83.3%, a specificity of 63.1%, and an area under the curve of 82.6%. As in other studies that have used the RF model, the results were strong.
CONCLUSIONS: We hope that in the future, machine learning algorithms, such as the RF, will see a wider application.

Keywords:  Random Forest; hospital infection; machine learning algorithm

Year:  2015        PMID: 25656667      PMCID: PMC4318823          DOI: 10.3402/jchimp.v5.26033

Source DB:  PubMed          Journal:  J Community Hosp Intern Med Perspect        ISSN: 2000-9666


Clostridium difficile is a spore-forming, gram-positive bacillus that is associated with severe and often life-threatening infection and inflammation of the colon. The disease can range from mild diarrhea to fulminant colitis and death. C. difficile has become the most frequent cause of nosocomial diarrhea in the United States. According to the Centers for Disease Control, the infection is responsible for over 14,000 deaths per year. Because of more virulent strains and evolving antimicrobial resistance, the rates of incidence and recurrence have been increasing (1). In addition, C. difficile infection (CDI) is a heavy burden on health care expenses and accounts for an increased use of medical resources. Recent studies have shown that health care costs ranged from $2,871 to $4,846 per case for primary CDI and from $13,655 to $18,067 per case for recurrent CDI (2). Factors that are known to alter the normal enteric flora are associated with risk of C. difficile colonization (3). Although the predominant risk factor is antibiotic therapy (4), other postulated risk factors, such as advanced age, chronic illnesses or comorbidities, hospitalizations, non-surgical gastrointestinal procedures, chemotherapy, and other immunosuppressants, play a major role in altering the flora and in the subsequent acquisition of CDI (5). Multiple publications have demonstrated the association of these variables with the acquisition of CDI and subsequent re-infection; however, an organized machine learning and sensitivity analysis approach, such as a Random Forest (RF) statistical model, has not been applied. In our study, we emulated the techniques of Amalakuhan et al. (6) and the prediction model they created using an RF model to predict patients at risk for chronic obstructive pulmonary disease (COPD) exacerbation. We employed the RF machine learning algorithm to predict C. difficile recurrence (CDR).

Methodology

Definitions

‘Recurrence of C. difficile’ was defined as confirmed presence of C. difficile toxin via polymerase chain reaction (PCR) after complete resolution of diarrhea for a minimum of 2 weeks and a maximum of 6 months and the completion of antibiotic therapy (5, 7). RF is a statistical model used for classification and regression that works by using input variables to build a collection of regression trees. The mode of the classes predicted by the individual trees is then chosen as the output and is used to determine the variables of interest. Regression trees are binary trees with nodes that correspond to different values of the input variables. These trees are developed using a training set. At each node or branch, the RF algorithm searches for a value that best separates all instances within that node based on the outcome of interest. If the instances in a node cannot be separated further, the node is called a terminal node. This process is repeated until all nodes are terminal nodes (6).
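The two mechanisms described above, searching a node for the value that best separates the instances and taking the mode of the individual tree outputs, can be sketched in a few lines. This is a minimal illustration only: the Gini impurity criterion, the function names, and the toy data are assumptions made for the example and are not taken from the study or from the software it used.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)

def best_threshold(values, labels):
    """Return the cut point on one variable with the lowest weighted Gini impurity."""
    best_cut, best_score = None, float("inf")
    # The largest value is skipped because it would leave the right branch empty.
    for cut in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= cut]
        right = [y for x, y in zip(values, labels) if x > cut]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_cut, best_score = cut, score
    return best_cut, best_score

def forest_vote(tree_predictions):
    """Return the mode (most common class) across the individual tree outputs."""
    return Counter(tree_predictions).most_common(1)[0][0]

# Toy data invented for illustration: PPI dose in mg and recurrence (1) or not (0).
doses = [20, 20, 40, 40, 40, 80, 80, 80]
recur = [0, 0, 0, 0, 1, 1, 1, 1]
cut, score = best_threshold(doses, recur)

# Five hypothetical trees vote on one patient; the majority class wins.
vote = forest_vote([1, 0, 1, 1, 0])
```

On the toy data the search settles on the split between 40 mg and 80 mg, and the five hypothetical trees vote for recurrence by three to two.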

Variables

The variables used in this study were selected after an extensive review of the literature, and only the most common comorbidities found in patients with CDI and re-infection were included. Among the numerous existing variables, the 25 most strongly associated variables were selected. Variables that were not found to be significant were excluded in order to optimize the RF algorithm. Table 1 lists the significant variables used in the RF model in this study.
Table 1

Explanatory variables used in this study

Age                               | Smoking
Coronary artery disease           | Low-risk antibiotics
Chronic kidney disease            | High-risk antibiotics
Gastrointestinal (GI) malignancy  | H2 antagonist
Gender                            | Alcohol use
Peptic ulcer disease              | No GI surgery
Inflammatory bowel disease        | One GI surgery
Immunosuppression                 | Two GI surgeries
Race                              | Hypertension
Gastroesophageal reflux disease   | Proton pump inhibitor (PPI) 20 mg
Corticosteroids                   | PPI 40 mg
Chemotherapy                      | PPI 80 mg
Diabetes                          |

Sample selection

A retrospective chart review was performed on patients diagnosed with CDI based on International Classification of Diseases, Ninth Revision (ICD-9) codes. The selection of the study population, selection criteria, and sampling were all performed subject to the approval of the institutional review board (IRB).

Inclusion criteria

Patients diagnosed with CDI via PCR between February 2009 and June 2013.

Exclusion criteria

Patients with recurrence within 2 weeks or after 6 months of initial CDI.
Patients with documented ‘non-compliance’ to prescribed medical therapy.

Sample size

Using these criteria, data from 200 randomly selected patients diagnosed with CDI were collected. Two patients were subsequently removed under our exclusion criteria, leaving 198 patients. The prevalence of CDR within the randomly selected sample was 15% (30 patients).

Data analysis

RF statistical analysis and randomization were performed using SPM Salford Predictive Modeler® version 6.0 (Salford Systems, San Diego, CA). The predictive model was designed by professors Leo Breiman and Adele Cutler of the University of California, Los Angeles. Our sample population was randomly separated into two distinct groups, a training group and a validation group. The training group comprised 70% of our patients. The remaining 30% were placed in the validation group. The training group was used to create our ‘learning algorithm’, which comprised 2,000 regression trees. Each tree comprised binary nodes or branches, which contribute their results to the variables of interest. At each node, a variable is tested (e.g., 80 mg PPI), and the data entered fall into either a CDR branch or a no-recurrence branch. Through the 2,000 regression trees, the outcome of interest (CDR) is distinguished, and the cumulative predictions of the 2,000 trees give the probability that a patient will have a recurrence after an initial CDI. Although the patients are divided into training and validation groups, all patients in the sample are still run through the predictive model, because the 30% validation group represents only one of the 500 runs and can be completely different from one run to the next. The same holds for the 70% training group. As opposed to Amalakuhan's study, which used a 75% training to 25% validation split (8, 9), we used an experimental 70%:30% split. This added approximately 10 patients to our validation group, presumably strengthening the end result and the predictive probability of our model. Within both the training and validation groups, the predicted probability of identifying the patients who had CDR was nearly equal.
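The repeated 70%:30% split described above can be sketched as follows. This is an illustrative sketch, not the procedure implemented in the software used in the study: the function names, the random seed, and the choice to round the training group down to a whole patient are all assumptions.

```python
import random

def split_sizes(n_patients, train_fraction):
    """Return (training size, validation size) for one split."""
    n_train = int(n_patients * train_fraction)  # rounding down is an assumption
    return n_train, n_patients - n_train

def repeated_splits(n_patients, train_fraction, n_runs, seed=0):
    """Count how often each patient lands in the validation group over many runs."""
    rng = random.Random(seed)
    n_train, _ = split_sizes(n_patients, train_fraction)
    ids = list(range(n_patients))
    validation_counts = [0] * n_patients
    for _ in range(n_runs):
        rng.shuffle(ids)              # a fresh random partition for every run
        for pid in ids[n_train:]:     # patients after the cut form the validation group
            validation_counts[pid] += 1
    return validation_counts

sizes_70 = split_sizes(198, 0.70)   # split used in this study
sizes_75 = split_sizes(198, 0.75)   # split used in the comparison study
counts = repeated_splits(198, 0.70, 500)
```

Under these assumptions the 70%:30% split puts 60 of the 198 patients in the validation group against 50 for a 75%:25% split, which matches the roughly 10 additional patients mentioned above, and over 500 runs every patient serves in the validation group many times.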
The RF model using the indices as set above was run 500 times, and for each of these 500 runs, the accuracy of the predictive model was assessed by calculating the sensitivity, specificity, area under the curve (AUC) for receiver operating characteristic, and precision (Fig. 1).
Fig. 1. Receiver operating curve with area under curve.
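For reference, the accuracy measures named above can be computed from predicted and observed outcomes as in the sketch below. The function names, the 0.5 decision threshold, and the toy scores are assumptions made for illustration; the AUC is computed here through its rank interpretation (the probability that a randomly chosen recurrence case receives a higher score than a randomly chosen non-recurrence case), which may differ from how the software used in the study calculates it.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for 0/1 outcomes and 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def sensitivity(tp, fn):
    return tp / (tp + fn)   # share of true recurrences that were flagged

def specificity(tn, fp):
    return tn / (tn + fp)   # share of non-recurrences that were cleared

def precision(tp, fp):
    return tp / (tp + fp)   # share of flagged patients who truly recurred

def auc(y_true, scores):
    """Rank-based AUC; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy outcomes and predicted probabilities, invented for illustration.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]   # 0.5 threshold is an assumption

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
toy_auc = auc(y_true, scores)
```

In a repeated-split design such as the one described above, these functions would be applied to the validation group of each run and the results summarised across runs.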

Results

Our population consisted of 198 patients from two sister hospitals in a community setting with a documented CDI. Of those patients, 30 had CDR, giving us a recurrence rate of approximately 15%. The mean age of the patients in the recurrence group was 70.8 (SD 17), while the mean age of the population without recurrence was 68.7 (SD 17.7). The majority of the population was Caucasian (88.4%) with the remainder of the population being either African American (6.6%) or other races (5.0%) (Table 2).
Table 2

Demographic and risk factor table

Variable                  | No recurrence       | Recurrence           | p
Total number of patients  | 168                 | 30                   |
Demographics
 Age (mean, SD, range)    | 68.7 (17.7), 3–96   | 70.8 (17.0), 28–93   | 0.5339
 Male (number, %)         | 62 (36.69%)         | 17 (56.67%)          | 0.0393
 Race
  Caucasian               | 151 (89.35%)        | 25 (83.33%)          | 0.3544
  African American        | 9 (5.33%)           | 4 (13.33%)           | 0.1126
  Others                  | 9 (5.33%)           | 1 (3.33%)            | 1
Risk factors
 Smoking                  | 60 (35.50%)         | 8 (26.67%)           | 0.3364
 ETOH                     | 43 (25.44%)         | 8 (26.67%)           | 0.8876
 Hypertension             | 133 (78.70%)        | 22 (73.33%)          | 0.5141
 CAD                      | 78 (46.15%)         | 12 (40.00%)          | 0.5326
 CKD                      | 51 (30.18%)         | 11 (36.67%)          | 0.4794
 Diabetes                 | 56 (33.14%)         | 12 (40.00%)          | 0.4651
 GERD                     | 46 (27.22%)         | 8 (26.67%)           | 0.9500
 PUD                      | 11 (6.51%)          | 3 (10.00%)           | 0.4479
 IBD                      | 3 (1.78%)           | 0 (0.00%)            | 1.0000
 IBS                      | 6 (3.55%)           | 1 (3.33%)            | 1.0000
 GI cancer                | 5 (2.96%)           | 1 (3.33%)            | 1.0000
 Immunosuppressed         | 23 (13.61%)         | 4 (13.33%)           | 1.0000
 Low-risk antibiotics     | 55 (32.54%)         | 11 (36.67%)          | 0.6585
 High-risk antibiotics    | 104 (61.54%)        | 17 (56.67%)          | 0.6145
 H2 antagonist            | 13 (7.69%)          | 1 (3.33%)            | 0.6988
 Corticosteroids          | 26 (15.38%)         | 9 (30.00%)           | 0.0527
 Chemotherapy             | 14 (8.28%)          | 1 (3.33%)            | 0.7022
Number of GI surgeries
 One                      | 35 (20.71%)         | 12 (40.00%)          | 0.0219
 Two or more              | 16 (9.47%)          | 3 (10.00%)           | 1.0000
 None                     | 118 (69.82%)        | 15 (50.00%)          | 0.0336
PPI dose
 20 mg                    | 16 (9.47%)          | 4 (13.33%)           | 0.5126
 40 mg                    | 117 (69.23%)        | 16 (53.33%)          | 0.0883
 80 mg                    | 1 (0.59%)           | 3 (10.00%)           | 0.0113
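The text does not state which test produced the p-values in Table 2. As a hedged illustration of how a p-value for a sparse row such as PPI 80 mg could be obtained, the sketch below implements a two-sided Fisher exact test. Both the choice of test and the no-recurrence denominator of 169 are assumptions: the denominator is inferred from the tabulated percentage (1 patient is 0.59% of 169), and the function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is at least as unlikely as the observed one.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# PPI 80 mg: 1 of an assumed 169 patients without recurrence, 3 of 30 with recurrence.
p_ppi80 = fisher_exact_two_sided(1, 168, 3, 27)
```

With these inputs the sketch returns a value of about 0.011, in line with the 0.0113 tabulated for PPI 80 mg.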
Our final model based on 500 runs of the RF produced the following results: sensitivity, 83.3%; specificity, 63.1%; overall correct predicted percentage, 66.1%; and AUC, 82.6%.
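The reported figures can be checked against one another, because overall accuracy is the average of sensitivity and specificity weighted by group size. The sketch below applies that identity to the 30 recurrence and 168 non-recurrence patients. The function name is hypothetical, and the implied counts of correctly classified patients are inferred from the rounded percentages rather than reported in the study.

```python
def overall_accuracy(sens, spec, n_pos, n_neg):
    """Accuracy as the group-size-weighted average of sensitivity and specificity."""
    return (sens * n_pos + spec * n_neg) / (n_pos + n_neg)

# Reported sensitivity and specificity applied to the reported group sizes.
acc = overall_accuracy(0.833, 0.631, 30, 168)

# Whole-patient counts implied by the rounded percentages (inferred, not reported).
implied_true_positives = round(0.833 * 30)
implied_true_negatives = round(0.631 * 168)
```

The result is about 66%, consistent with the reported overall correct predicted percentage, and corresponds to roughly 25 of 30 recurrences and 106 of 168 non-recurrences being classified correctly.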

Discussion

CDI is a growing concern in the health care system, and it represents one of the most difficult challenges faced by clinicians. Throughout the 1990s and 2000s, the number of C. difficile-related hospital stays per 100,000 population increased steadily, from a low of 33.2 in 1993 to a peak of 114.6 in 2008 (10). The major problem associated with C. difficile infection is the high rate of recurrence of 20 to 35%. Once a recurrence has occurred, the chance of a second recurrence increases further to 45–65% (11, 12). CDR increases mortality and morbidity, length of hospital stay, health care costs, and utilization of other health care resources. It also places an additional burden on the quality of life of patients and their families. Previous studies have also documented that CDR resulted in 265 additional days of vancomycin use and 19.7 days of metronidazole use (13). Although some studies have looked at the causes of CDR, there is little consensus on them. Previous studies have identified the following associated risk factors: age, Horn's index, proton pump inhibitor use, antibiotic use, alteration of colonic microflora, initial disease severity, and hospital exposure (14–18). For this study, we created an RF predictive model using multiple well-known risk factors associated with CDI identified in previous studies. After 500 runs of the tree-based algorithm, the RF model performed well in distinguishing CDR from non-recurrent cases. The overall accuracy of our RF model was 66.1%, with a sensitivity and specificity of 83.3 and 63.1%, respectively. The area under the receiver operating curve was 82.6%, which is comparable to other strong models. Our literature review did not reveal many studies using prediction models to identify CDR. Hu et al. (15) showed similar results. Our study expanded on their work by using a larger number of patients who were followed for a longer period of 3 years.
In addition, specific important co-variables such as PPIs, chemotherapy, corticosteroids, and antibiotic use were used in our study (19). We also had a substantially higher sensitivity and AUC. The RF has already been used successfully in other studies to determine predictors in various chronic and acute conditions. The study by Amalakuhan et al. (6) previously demonstrated that the RF was excellent at predicting factors associated with readmissions for COPD exacerbation. A well-documented study by Adrienne Chu demonstrated that the RF machine learning algorithm outperformed six other prediction models in determining the cause of gastrointestinal bleeding (20). The models that were outperformed included well-known algorithms such as the support vector machine, boosting, artificial neural network, linear discriminant analysis, and logistic regression. One of the major strengths of our study was that it contained a large number of patients who were followed for several years (2009–2013). We also used a large number of significant variables to create the prediction model. The study had a high AUC and high sensitivity. One of the major limitations of our study was that the majority of our sample population was Caucasian.

Conclusion

C. difficile is a growing problem in the hospital and community setting. The recurrent nature of infection is worrisome since repeated use of antibiotics against the same strain of bacteria may lead to resistance mechanisms. In this study, we used the RF machine learning algorithm to create a strong prediction model with high sensitivity to predict CDR. In the evolving field of medical informatics, such learning algorithm models can be used in risk factor stratification for hospitalized patients. If patients at risk for CDR could be accurately identified, specific management strategies could be developed, resulting in better management, decreased morbidity and mortality, better health care resource utilization, and decreased length of hospital stay. We believe that the advantages offered by the RF make it the ideal tool for this task.
  17 in total

1.  Prospective derivation and validation of a clinical prediction rule for recurrent Clostridium difficile infection.

Authors:  Mary Y Hu; Kianoosh Katchar; Lorraine Kyne; Seema Maroo; Sanjeev Tummala; Valley Dreisbach; Hua Xu; Daniel A Leffler; Ciarán P Kelly
Journal:  Gastroenterology       Date:  2008-12-13       Impact factor: 22.682

Review 2.  The changing epidemiology of Clostridium difficile infections.

Authors:  J Freeman; M P Bauer; S D Baines; J Corver; W N Fawley; B Goorhuis; E J Kuijper; M H Wilcox
Journal:  Clin Microbiol Rev       Date:  2010-07       Impact factor: 26.132

3.  Fecal microbiota transplantation in relapsing Clostridium difficile infection.

Authors:  Faith Rohlke; Neil Stollman
Journal:  Therap Adv Gastroenterol       Date:  2012-11       Impact factor: 4.409

4.  Breaking the cycle: treatment strategies for 163 cases of recurrent Clostridium difficile disease.

Authors:  Lynne V McFarland; Gary W Elmer; Christina M Surawicz
Journal:  Am J Gastroenterol       Date:  2002-07       Impact factor: 10.864

Review 5.  Can we identify patients at high risk of recurrent Clostridium difficile infection?

Authors:  C P Kelly
Journal:  Clin Microbiol Infect       Date:  2012-12       Impact factor: 8.067

6.  Proton pump inhibitors and risk for recurrent Clostridium difficile infection among inpatients.

Authors:  Daniel E Freedberg; Hojjat Salmasian; Carol Friedman; Julian A Abrams
Journal:  Am J Gastroenterol       Date:  2013-09-24       Impact factor: 10.864

7.  Risk factors for recurrence of clostridium difficile-associated diarrhoea.

Authors:  Ahmed Abdel Samie; Marc Traub; Klaus Bachmann; Karolin Kopischke; Lorenz Theilmann
Journal:  Hepatogastroenterology       Date:  2013-09

8.  Management and outcomes of a first recurrence of Clostridium difficile-associated disease in Quebec, Canada.

Authors:  Jacques Pépin; Sophie Routhier; Sandra Gagnon; Isabel Brazeau
Journal:  Clin Infect Dis       Date:  2006-02-07       Impact factor: 9.079

9.  Predictors of first recurrence of Clostridium difficile infection: implications for initial management.

Authors:  David W Eyre; A Sarah Walker; David Wyllie; Kate E Dingle; David Griffiths; John Finney; Lily O'Connor; Alison Vaughan; Derrick W Crook; Mark H Wilcox; Timothy E A Peto
Journal:  Clin Infect Dis       Date:  2012-08       Impact factor: 9.079

10.  Risk factors for recurrent Clostridium difficile infection (CDI) hospitalization among hospitalized patients with an initial CDI episode: a retrospective cohort study.

Authors:  Marya D Zilberberg; Kimberly Reske; Margaret Olsen; Yan Yan; Erik R Dubberke
Journal:  BMC Infect Dis       Date:  2014-06-04       Impact factor: 3.090
