Guessing Game of Patient Outcomes in the Renally Injured Critically Ill: Is There a Perfect Score?

Abstract

Raju GM. Guessing Game of Patient Outcomes in the Renally Injured Critically Ill: Is There a Perfect Score? Indian J Crit Care Med 2022;26(3):253-255.

Keywords: Acute Physiology and Chronic Health Evaluation II; Acute kidney injury; Calibration; Discrimination; Intensive care unit outcomes; Kidney disease: Improving global outcomes; Severity of illness scoring

Year: 2022 PMID: 35519927 PMCID: PMC9015918 DOI: 10.5005/jp-journals-10071-24177

Source DB: PubMed Journal: Indian J Crit Care Med ISSN: 0972-5229

Dealing with a critically ill patient can be a challenge in terms of evaluation, assessment, categorization of disease pathology and severity, tailoring of optimal treatment strategies, and analysis of outcome (both individual and composite). Acute kidney injury is a significant problem that adversely affects morbidity and mortality, either independently or as part of multiorgan dysfunction. Some 30–60% of critically ill patients develop acute kidney injury from various causes, the most common being sepsis, and the associated mortality ranges from 15 to 65%.[1] Acute kidney injury is also associated with a higher risk of adverse cardiovascular events, other serious life-threatening complications, and increased morbidity and mortality.[2,3] Various scoring systems are available to assess severity and help predict outcomes. These include general risk prognostication scores such as the Acute Physiology and Chronic Health Evaluation II (APACHE II) score, the Simplified Acute Physiology Score II (SAPS II), and the Mortality Prediction Model III (MPM III); organ-specific risk prognostication scores such as the Glasgow Coma Scale (GCS), the Acute Kidney Injury Network (AKIN) criteria, the Risk, Injury, Failure, Loss of function, End-stage renal disease (RIFLE) criteria, and the Kidney Disease: Improving Global Outcomes (KDIGO) classification; and organ dysfunction scores such as the Sequential Organ Failure Assessment (SOFA) score. Among them, the APACHE II score has been widely used for prognostication in the critically ill, specifically to predict mortality. The APACHE II score was derived by Knaus et al. from datasets of patients in ICUs in North America.
The score can range from 0 to 71, with points assigned to age, acute physiological conditions, and specific preexisting chronic disease.[4] Among scoring systems specific to acute kidney injury, the Kidney Disease: Improving Global Outcomes (KDIGO) 2012 guidelines proposed a modified version of the RIFLE and AKIN criteria.[5] The modified definition of acute kidney injury is based on serum creatinine and urine output. Despite potential limitations, the KDIGO classification has been accepted as the consensus standard in clinical practice for defining acute kidney injury and associated outcomes. The KDIGO classification also gives an idea of the mortality corresponding to each of its classes of acute kidney injury. In this issue of IJCCM, an observational study by Patel et al. assessed whether APACHE II scores have sufficient sensitivity and specificity in predicting outcomes associated with acute kidney injury classified as per KDIGO guidelines. One hundred patients with acute kidney injury admitted to the intensive care unit (ICU) of a tertiary care hospital in Haryana, North India, were studied over a period of 1 year. Multiple parameters (age, laboratory parameters, chronic health conditions) were studied, the APACHE II score was calculated, and the corresponding mortality was noted. Similarly, AKI was classified as per the KDIGO classification and the corresponding mortality figures were noted. In this study, the mean APACHE II scores of patients who died were significantly higher than those of survivors across the various stages of acute kidney injury. Sepsis (47%) was the most common cause of acute kidney injury in the critically ill, similar to what other studies have observed.[6,7] The authors also noted higher mortality rates in patients who required ventilator and vasopressor support and in those with coexisting coronary artery disease.
In their study, the authors observed that, in predicting mortality, the APACHE II score had a sensitivity of 57.14% (95% CI 39.4–73.7%), a specificity of 86.15% (95% CI 75.3–93.5%), a positive predictive value of 69% (95% CI 49.2–84.7%), and a negative predictive value of 78.9% (95% CI 67.6–87.7%), with an area under the ROC curve (AUROC) of 0.79. While the overall observed mortality closely corresponds to that expected from the KDIGO classification of AKI, the APACHE II score does not correlate directly with the mortality figures observed for the various KDIGO classes. The authors report good discrimination and calibration of the score; however, such an inference may not be as straightforward as it seems and calls for a deeper understanding of discrimination and calibration. Both concepts apply to clinical predictive models or scoring systems, in which a number of factors (clinical, laboratory, age, acute or chronic health conditions) are combined to estimate a patient's absolute risk of an outcome (e.g., mortality risk in patients with acute kidney injury) or the likelihood of a particular diagnosis.
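As a rough illustration of how the sensitivity, specificity, and predictive values quoted above arise from a 2×2 table, the sketch below computes them from counts of true and false positives and negatives. The counts used here (20, 15, 56, 9) are an assumption: they were chosen only because they reproduce the reported percentages for 100 patients, and they are not taken from the study itself. The names `ConfusionCounts` and `diagnostic_metrics` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from a 2x2 table of predicted versus observed death."""
    tp: int  # predicted to die, died
    fn: int  # predicted to survive, died
    tn: int  # predicted to survive, survived
    fp: int  # predicted to die, survived


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, PPV and NPV as fractions."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# Hypothetical counts (NOT from the source), picked to be consistent
# with the percentages reported for a cohort of 100 patients.
counts = ConfusionCounts(tp=20, fn=15, tn=56, fp=9)
metrics = diagnostic_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Note that all four measures depend on a single score cut-off, whereas the AUROC summarizes performance across all possible cut-offs.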

Discrimination and Calibration

“Discrimination” refers to how well the model or prognostic scoring system can distinguish those at higher risk of a particular outcome (in this study, mortality) from those at lower risk. For example, a prognostic score could discriminate mortality risk well among patients with and without acute kidney injury in a heterogeneous population with diverse factors (clinical, laboratory, etc.) but may not do so in a more homogeneous population with a limited number of factors.[8] When using a prognostic tool or model, discrimination alone is insufficient. The second important property of any predictive model tells clinicians how close the predicted absolute risk estimates are to the true (observed) risks in patient groups classified into different risk strata. “Calibration” refers to the accuracy of these absolute risk estimates. In any predictive score or model, the more accurate the estimate, the better the calibration.[8]

Measuring Discrimination

Discrimination can be measured by various methods; however, for binary outcomes (dead or alive), the receiver operating characteristic (ROC) curve best characterizes it. The greater the area under the ROC curve, the better the prediction model or prognostic score. In general, an area under the curve approaching 100% is better. Conversely, an area of 50% or less indicates discrimination no better than chance.[9]
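One way to make the area under the ROC curve concrete is its pairwise interpretation: it equals the probability that a randomly chosen patient who died had a higher score than a randomly chosen survivor, with ties counted as half. The sketch below computes it that way. The function name `auroc` and the eight scores and outcomes are hypothetical and purely illustrative; they are not data from the study.

```python
def auroc(scores, died):
    """Area under the ROC curve via pairwise comparison.

    scores: risk scores (higher means higher predicted risk)
    died:   1 if the patient died, 0 if the patient survived
    """
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # non-survivor ranked above survivor
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical scores and outcomes, for illustration only.
scores = [12, 18, 25, 30, 22, 35, 15, 28]
died = [0, 0, 0, 1, 1, 1, 0, 1]

print(f"AUROC = {auroc(scores, died):.4f}")
```

In this toy data set, 15 of the 16 non-survivor/survivor pairs are ranked correctly, which is why the area is high. A score that ranked every pair correctly would give an area of 1.0, and identical scores for everyone would give 0.5.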

Measuring Calibration

Calibration, also known as goodness of fit, is a crucial property of a model and indicates the extent to which a predictive model accurately estimates absolute risk (i.e., whether the values predicted by the model match the observed values). Poorly calibrated models overestimate or underestimate the outcome. Assessment of calibration involves comparing predicted and observed risk at various levels: (a) the population under study; (b) various patient groups based on predicted risk; or (c) different patient groups based on combinations of predictive factors. A good model shows strong calibration for various patient groups with widely varying characteristics.[10] Considering the above, the authors' conclusion that the APACHE II score has good discrimination and calibration is only partly true. The higher the APACHE II score, the worse the outcome. But applying the same logic to estimate mortality from an individual component such as acute kidney injury, measured using serum creatinine only, may not be accurate (the APACHE II score includes serum creatinine but not urine output). Attempting to correlate mortality with the various classes of acute kidney injury as per KDIGO may also lead to erroneous inferences, as the KDIGO classification uses changes in both serum creatinine and urine output. Also, studies indicate that serum creatinine or urine output alone cannot reliably predict acute kidney injury or related outcomes.[11] Although the current study puts a lot of weight on the prognostic ability of the APACHE II score, it should be noted that APACHE II scores are based on data sets of patient characteristics from the 1980s to 1990s. Also, the calculated APACHE II score can vary widely depending on the clinician's knowledge, experience, and training. Although web-based or computer-based calculators are available, there have been wide variations in APACHE calculations even among clinicians familiar with using the APACHE score in its existing format.
Also, the APACHE II score does not give an estimate of individual mortality. Current technological advancements make it necessary to revamp and recalibrate APACHE II with a database that reflects patient characteristics of the current era, specific patient populations, demographics, and geography, for better accuracy of predictive models. Third-generation severity scoring systems such as APACHE IV, MPM III, and SAPS 3 have been found to have better discrimination and calibration.[12] Among other prognostic scoring systems, the SOFA score can be useful in assessing daily disease progression and can give reliable clues to outcomes among critically ill patients. Although the SOFA score was not originally designed to estimate mortality, a recent study by Wang et al. showed that the SOFA score has a better ability to predict prognosis among critically ill patients with acute kidney injury.[13,14] Trying to simplify a disease process and its outcome based on an individual organ's impact on mortality and morbidity can give more insight into the role a given organ dysfunction plays in the outcome of a patient. But the human body works as a syncytium of multiple organ systems interconnected in ways so complex that isolating and extrapolating the role of any one organ system may not be possible without accounting for its interaction with the others.
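The comparison of predicted and observed risk described under Measuring Calibration can be sketched as follows. Patients are sorted by predicted risk, split into equal-sized groups, and the mean predicted risk in each group is set against the observed mortality rate. This is a minimal sketch under stated assumptions: the function name `calibration_table`, the grouping scheme, and the eight predicted risks and outcomes are hypothetical and are not taken from the study.

```python
def calibration_table(predicted, observed, n_groups):
    """Compare mean predicted risk with observed event rate per risk group.

    predicted: predicted probabilities of death (0 to 1)
    observed:  1 if the patient died, 0 otherwise
    Returns a list of (group size, mean predicted risk, observed rate).
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")

    pairs = sorted(zip(predicted, observed))  # order by predicted risk
    rows = []
    for g in range(n_groups):
        start = g * len(pairs) // n_groups
        end = (g + 1) * len(pairs) // n_groups
        group = pairs[start:end]
        if not group:
            continue  # more groups requested than patients available
        mean_pred = sum(p for p, _ in group) / len(group)
        obs_rate = sum(o for _, o in group) / len(group)
        rows.append((len(group), mean_pred, obs_rate))
    return rows


# Hypothetical predicted risks and outcomes, for illustration only.
predicted = [0.1, 0.1, 0.2, 0.2, 0.6, 0.6, 0.8, 0.8]
observed = [0, 0, 0, 1, 1, 0, 1, 1]

rows = calibration_table(predicted, observed, n_groups=2)
for size, mean_pred, obs_rate in rows:
    print(f"n={size}  predicted={mean_pred:.2f}  observed={obs_rate:.2f}")
```

Calling the same function with `n_groups=1` gives the population-level comparison (level a), while larger values give the comparison across predicted-risk strata (level b). A well-calibrated model shows predicted and observed values that stay close in every group.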

Bottom Line

The purpose of a scoring system is to enable comparative audit, assist clinical research, and help in the clinical management and prognostication of patients. A good scoring system should have good internal and external validity, and good discrimination and calibration. While the APACHE II score has stood the test of time in terms of mortality estimation despite the limitations mentioned above, mortality as the most important outcome measure may need a relook. Perhaps disability, chronic complications, the need for sustained healthcare support, and quality of life after acute kidney injury and critical illness are better outcome measures in terms of what constitutes a functional recovery. A practical, outcome-based predictive model built on comprehensive population metrics, with an emphasis on contemporary data, is needed.

12 in total

1. Evaluating Discrimination of Risk Prediction Models: The C Statistic.

Authors: Michael J Pencina; Ralph B D'Agostino
Journal: JAMA Date: 2015-09-08 Impact factor: 56.272

2. A calibration hierarchy for risk models was defined: from utopia to empirical data.

Authors: Ben Van Calster; Daan Nieboer; Yvonne Vergouwe; Bavo De Cock; Michael J Pencina; Ewout W Steyerberg
Journal: J Clin Epidemiol Date: 2016-01-06 Impact factor: 6.437

3. APACHE II: a severity of disease classification system.

Authors: W A Knaus; E A Draper; D P Wagner; J E Zimmerman
Journal: Crit Care Med Date: 1985-10 Impact factor: 7.598

4. Assessment of KDIGO definitions in patients with histopathologic evidence of acute renal disease.

Authors: Rong Chu; Cui Li; Suxia Wang; Wanzhong Zou; Gang Liu; Li Yang
Journal: Clin J Am Soc Nephrol Date: 2014-05-01 Impact factor: 8.237

5. Discrimination and Calibration of Clinical Prediction Models: Users' Guides to the Medical Literature.

Authors: Ana Carolina Alba; Thomas Agoritsas; Michael Walsh; Steven Hanna; Alfonso Iorio; P J Devereaux; Thomas McGinn; Gordon Guyatt
Journal: JAMA Date: 2017-10-10 Impact factor: 56.272

6. Performance of the third-generation models of severity scoring systems (APACHE IV, SAPS 3 and MPM-III) in acute kidney injury critically ill patients.

Authors: Verônica Torres Costa e Silva; Isac de Castro; Fernando Liaño; Alfonso Muriel; José R Rodríguez-Palomares; Luis Yu
Journal: Nephrol Dial Transplant Date: 2011-04-19 Impact factor: 5.992

7. Acute renal failure in critically ill patients: a multinational, multicenter study.

Authors: Shigehiko Uchino; John A Kellum; Rinaldo Bellomo; Gordon S Doig; Hiroshi Morimatsu; Stanislao Morgera; Miet Schetz; Ian Tan; Catherine Bouman; Ettiene Macedo; Noel Gibney; Ashita Tolwani; Claudio Ronco
Journal: JAMA Date: 2005-08-17 Impact factor: 56.272

8. AKI and Long-Term Risk for Cardiovascular Events and Mortality.

Authors: Ayodele Odutayo; Christopher X Wong; Michael Farkouh; Douglas G Altman; Sally Hopewell; Connor A Emdin; Benjamin H Hunn
Journal: J Am Soc Nephrol Date: 2016-06-13 Impact factor: 10.121

9. SOFA score is superior to APACHE-II score in predicting the prognosis of critically ill patients with acute kidney injury undergoing continuous renal replacement therapy.

Authors: Hai Wang; Xiao Kang; Yu Shi; Zheng-Hai Bai; Jun-Hua Lv; Jiang-Li Sun; Hong Hong Pei
Journal: Ren Fail Date: 2020-11 Impact factor: 2.606

10. Assessment of APACHE II Score to Predict ICU Outcomes of Patients with AKI: A Single-center Experience from Haryana, North India.

Authors: Paras Patel; Sunita Gupta; Happy Patel; Md Abu Bashar
Journal: Indian J Crit Care Med Date: 2022-03