
On predictions in critical care: The individual prognostication fallacy in elderly patients.

Michael Beil¹, Sigal Sviri¹, Hans Flaatten², Dylan W De Lange³, Christian Jung⁴, Wojciech Szczeklik⁵, Susannah Leaver⁶, Andrew Rhodes⁶, Bertrand Guidet⁷, P Vernon van Heerden⁸.

Abstract

Predicting the future course of critical conditions involves personal experience, heuristics and statistical models. Although these methods may perform well for some cases and population averages, they suffer from substantial shortcomings when applied to individual patients. The reasons include methodological problems of statistical modeling as well as limitations of cross-sectional data sampling. Accurate predictions for individual patients become crucial when they have to guide irreversible decision-making. This notably applies to triage situations in response to a lack of healthcare resources. We will discuss these issues and argue that analysing longitudinal data obtained from time-limited trials in intensive care can provide a more robust approach to individual prognostication.


Keywords: Critical care; individual prognostication; predictive modeling; time-limited trial

Year: 2020 PMID: 33075607 PMCID: PMC7553132 DOI: 10.1016/j.jcrc.2020.10.006

Source DB: PubMed Journal: J Crit Care ISSN: 0883-9441 Impact factor: 3.425

Background

Decision-making in critical care is based on the comparison of different management paths for their potential benefit for the individual patient [1]. This might be straightforward in some cases, such as in previously healthy individuals with a traumatic bleed for whom comprehensive information about causalities provides the opportunity for deterministic predictions of outcome with or without interventions. However, most disorders reflect a more complicated scenario. Even in the simple case depicted above, haemorrhagic shock may trigger a number of processes which time-dependently branch off the linear chain of events caused by intravascular volume depletion. To further complicate matters for the individual case, these processes proceed on the background of pre-existing conditions as well as genome-based propensities in responding to stress [2]. Causal inference becomes imprecise in the absence of complete information. This epistemic uncertainty about the intrinsic mechanisms of disease processes makes deterministic predictions impossible, at least with current techniques [3]. Thus, physicians have to leave the realm of strictly mechanistic thinking and turn to other prediction methods to support decision-making for individual patients. These methods - expert judgement and statistical models - are based on variably structured experience, notably on past observations in supposedly similar patients. This paper will discuss the shortcomings of these methods, especially their reliance on reference groups and sampling techniques. We will expand these reflections by technical arguments and suggest a more robust approach to prognostication for individual patients. The interest in predicting has grown substantially in critical care over the past 20 years [4]. The intricacies of this topic are illustrated best by the decision-making for elderly multi-morbid individuals in critical condition. 
This cohort is characterised by considerable inter-individual heterogeneity which makes statistical description difficult. Although mortality in critical care is high in older patients, the relative survival benefit of critical care for this group can be greater than in younger patients [5]. Thus, identifying the older individual with the right prognosis is pivotal to guide admission to and management in critical care. The COVID-19 pandemic, which has disproportionately affected older patients [6], has further emphasised the crucial role of individualised prognostication when deciding whether to admit these patients to critical care. Importantly, recent reports have indicated that patients with COVID-19 are younger on average than those who were admitted with severe respiratory infections in the past [7], suggesting that patients were already being selected differently even before hospitals were overwhelmed by large numbers of patients [8]. The importance of these issues has culminated in the discussion about triage based on the assessment of prognosis [9], which can ultimately lead to withdrawal of life support in a potentially curable patient because another individual with a supposedly better prognosis has arrived in a hospital where resources are limited [8]. In this situation, a careful and cautious approach to individualised prognostication becomes paramount to protect fundamental principles of medical ethics and, thereby, public trust in medical decision-making.

State of the art

The traditional way for medical professionals to make predictions for patients has been heuristics involving rules built on their past experience. Intuition is especially helpful for decision-making under time constraints and with insufficient information [10]. However, individuals differ in their experience, and even experts are vulnerable to cognitive biases [11,12], which causes substantial variability in decision-making even in cases classified as easy [13]. With the amount of information growing in more complex cases, prognostication based on heuristics becomes more prone to errors, i.e. judgements that, as determined in retrospect, should have been replaced by a more appropriate alternative [14]. For the data-rich environment of critical care, this problem has been illustrated by the rate of incorrect outcome predictions by multidisciplinary teams at the bedside: approximately 1 in 6 patients who were unanimously predicted to die actually survived [15]. Similar problems with the quality of heuristic outcome predictions have been reported in other settings [16,17]. To overcome the flaws of personal judgement and improve the accuracy of prognostication, more structured approaches to the collection and analysis of data have been developed in the past [18]. These implementations of evidence-based medicine use statistical modeling, mainly regression analysis, to devise prediction scores or algorithms from various types of data (see below) extracted from ostensibly suitable reference groups [19]. When used as severity of illness scores, these models have been successfully employed for the evaluation of ICU performance, quality improvement projects and benchmarking [20]. However, this approach is based on group statistics and, thereby, affected by some fundamental limitations when applied to individual patients [11]. Of note, this problem also applies to new technologies from the field of artificial intelligence and machine learning [21,22]. 
First, it is assumed that an individual's characteristics match those of a chosen reference group that, importantly, is not uniform in itself. Second, the properties of this match depend on the selection of the model structure, the set of variables as well as their time of sampling (see below) to characterise a disease [23]. These issues are especially relevant to the heterogeneous group of older patients with multi-morbidity that may have evolved over a long time in unique patterns, which makes the search for an appropriate reference group even more difficult. Many of the prediction models currently used to assess the prognosis of older patients in critical care were neither developed nor validated in this cohort; others were shown to have a rather low accuracy in discriminating between outcomes in these patients [24]. Most prediction algorithms assign a probability, i.e. a continuous variable, to categorical outcomes, such as survival or death. This probability is estimated by the frequency of these outcomes observed in large samples taken from reference groups. However, in the individual, i.e. a sample of size 1, the observable outcome is binary for each category, e.g. either survival or death [11]. Thus, there is no practical difference between an outcome based on a prediction of 5% or 95% chance regarding an irreversible event, e.g. death, in an individual [25]. Importantly, the parameters that characterise the quality of predictions at the group level, e.g. calibration and discrimination, do not fully describe the predictive uncertainty for an individual [23]. If one chooses to use probabilities nonetheless, this requires the definition of a cut-off within the probability distribution to further proceed with decision-making for the individual case. This involves a conscious decision to accept a certain number of false positive or false negative cases at the population level. 
Depending on the impact of a specific outcome or the cost of prediction errors, the performance of a statistical model can be unsatisfactory or unacceptable for some people or subgroups [26]. For example, in a study on prognostication of poor neurological outcome after cardiac arrest that aimed at preventing false-positive predictions, the false-negative rate reached approximately 30% [27]. This means that a substantial number of patients will be treated in vain. Whether this scenario is acceptable or proportional for individual patients and their relatives needs to be considered carefully during decision-making. Although a major objective of designing statistical prediction models is to capture or at least approximate disease mechanisms, causal inference studies require an extensive analysis of confounding variables, time frames and other sources of bias [19,28]. Regardless of the feasibility of such investigations, any method that uses group-based statistics for decision-making in the individual case will suffer from the shortcomings discussed above. To fill the gap between considering unique features of individual cases and the good performance of statistical prediction models for certain groups of patients, it was suggested to combine these models with experience and heuristics [19]. However, this approach would re-introduce the biases of personal judgement and impair predictions. Thus, the need for robust techniques to prognosticate for individuals, notably from the inherently heterogeneous group of older patients, remains largely unmet by current approaches.
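The trade-off that a probability cut-off imposes at the population level can be made concrete with a small simulation. The following Python sketch is an illustration under stated assumptions, not a method taken from any of the cited studies: the cohort is synthetic, the predicted probabilities are calibrated by construction, and the function names `confusion_at_cutoff` and `simulate_cohort` are hypothetical. A 'positive' prediction here means predicted death.

```python
import random

def confusion_at_cutoff(probs, died, cutoff):
    """Count outcomes when 'predicted death' means probability >= cutoff."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for p, d in zip(probs, died):
        if p >= cutoff:
            counts["tp" if d else "fp"] += 1
        else:
            counts["fn" if d else "tn"] += 1
    return counts

def simulate_cohort(n, seed=0):
    """Synthetic cohort (assumption): each patient dies with exactly the predicted probability."""
    rng = random.Random(seed)
    probs = [rng.random() for _ in range(n)]
    died = [rng.random() < p for p in probs]
    return probs, died

probs, died = simulate_cohort(10_000)
for cutoff in (0.5, 0.8, 0.95):
    c = confusion_at_cutoff(probs, died, cutoff)
    fpr = c["fp"] / (c["fp"] + c["tn"])  # survivors wrongly predicted to die
    fnr = c["fn"] / (c["fn"] + c["tp"])  # deaths that were not predicted
    print(f"cut-off {cutoff:.2f}: false-positive rate {fpr:.3f}, "
          f"false-negative rate {fnr:.3f}")
```

Raising the cut-off can only lower the number of false positives and raise the number of false negatives, even for a perfectly calibrated model. Which balance is acceptable is a value judgement that the model itself cannot supply.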

Data types and sampling methods

In addition to choosing a reference group, the selection of data types and sampling methods exerts a fundamental influence on the performance of prediction models [23]. Data types useful for prognostications include genomic risks, past medical history including trajectories of chronic conditions and physiological data. With respect to sampling, there are two main categories of datasets - cross-sectional datasets from a single discrete point in time (snapshots) and longitudinal or time series datasets (Fig. 1). The predictive value of snapshot data depends on the timing of sampling and its calibration with the progression of diseases, notably in critically ill patients [28]. Thus, illness severity scores calculated from data taken at a single point in time that is determined more by administrative than by biological factors, such as admission to hospital, do not perform well in prognosticating critical illnesses even at the population level [29]. Importantly, this type of score, notably the SOFA score, has been recommended for triage decision-making under time constraints during the COVID-19 pandemic [30]. It is conceivable, though, that in these situations prognostication by intensivists based on their heuristics could lead to better results than these cross-sectional scores [15]. Even if the snapshot dataset incorporates a multidimensional description of acute system failure, there is a substantial overlap between outcome categories (Fig. 2) that limits its usefulness for individual predictions [31]. This problem is illustrated by the uncertainties for individuals associated with recently developed triage guidelines for COVID-19 patients which put the final decision into the hands of clinicians [32].

Fig. 1

Data sampling strategies for disease processes with time-dependent variations. Single cross-sectional samples (‘snapshots’) are not suited to characterise the phase and dynamics of a disease without additional information and, thus, are of limited value for predictions.

Fig. 2

Overlap between groups of elderly survivors and non-survivors of critical care with respect to organ dysfunction (SOFA score - sequential organ failure assessment score), functional capacity (Katz categories measuring the ability to live independently with 0 indicating full dependence in daily activities) and frailty. The overlap compromises the usefulness of these characteristics for individual prognostication. (Data from the VIP2 study [31]).

The suboptimal performance of many statistical prediction models derived from snapshot data suggests a more fundamental problem with that approach. A crucial assumption of using this sampling technique is ergodicity of the involved disease processes, i.e. the statistical properties of longitudinal data in individuals are equivalent to the properties of cross-sectional samples taken from a group at discrete points in time. However, this assumption is probably not true for a substantial part of human subject research and, thus, conclusions drawn from these studies may be imprecise [33]. Although some experts consider ergodicity sufficient, but not necessary to infer from groups to individuals [34], a careful analysis of inter- and intra-individual variability should be mandatory when assessing prediction models [35]. This problem is illustrated by the acknowledgement of disorders as critical conditions implying that they fluctuate and may rapidly deteriorate in some patients [36]. Ignoring these variations can lead to interpretational fallacies for the individual case [34]. Fig. 3 depicts longitudinal recordings of body temperature over 48 h in patients who were diagnosed with sepsis 24 h later. 
The variance of cross-sectional data samples at any point in time underestimates the variance of the most dynamic curve in an individual by up to 80%. Thus, body temperature does not appear to be an ergodic process with respect to variations, and findings from group statistics cannot be used for individual patients. However, a rigorous analysis [37] may reveal that the criterion of ergodicity holds for certain phases of a process and, thus, group-based statistics may be applicable to the individual when restricted to these phases. For a particular prediction model, however, piecewise ergodicity has to be verified and communicated so that data from new patients can be aligned with such intervals.
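The comparison between cross-sectional and within-individual variance described above can be sketched in a few lines of Python. This is a minimal illustration on synthetic data, not a reconstruction of the recordings in Fig. 3: the sine-wave temperature curves, the amplitudes and the function names are all assumptions chosen to show how a single highly dynamic patient is hidden in group-level variance.

```python
import math
import statistics

def cross_sectional_variances(series):
    """Variance across patients at each time point (one value per time index)."""
    return [statistics.variance(column) for column in zip(*series)]

def within_patient_variances(series):
    """Variance over time within each patient (one value per patient)."""
    return [statistics.variance(row) for row in series]

def simulate_temperatures(amplitudes, n_points=576, baseline=37.0):
    """Hypothetical curves: a baseline plus a 24-h sine wave per patient.

    576 points correspond to 48 h sampled every 5 min (288 points per day).
    Each patient gets a different phase so the curves are not synchronised.
    """
    return [
        [baseline + a * math.sin(2 * math.pi * t / 288 + i) for t in range(n_points)]
        for i, a in enumerate(amplitudes)
    ]

# Assumption: nine patients with small fluctuations, one with large swings.
amplitudes = [0.2] * 9 + [1.5]
series = simulate_temperatures(amplitudes)

max_cross = max(cross_sectional_variances(series))
max_within = max(within_patient_variances(series))
print(f"largest cross-sectional variance: {max_cross:.3f}")
print(f"largest within-patient variance:  {max_within:.3f}")
print(f"ratio: {max_cross / max_within:.2f}")
```

If the process were ergodic with respect to variance, the two quantities would be of similar size. In this constructed example the cross-sectional variance stays well below the variance of the most dynamic individual at every time point. A real analysis would apply the same comparison to recorded data, phase by phase.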

Fig. 3

Averaging of longitudinal data samples. Time course of body temperature in 10 patients recorded every 5 min over 48 h. The recordings started 72 h before the diagnosis of sepsis was established for each patient. The black curve represents the average temperature for every point in time. Please note that the average curve does not capture the dynamics of curves from individual patients. (Permission to collect the data was obtained from the Hadassah University Hospital review board in Jerusalem, Israel.)

As indicated above, many critical conditions may proceed in non-ergodic ways. Their time-dependent characteristics, which are crucial for prognosticating outcome, might not be sufficiently described by data samples obtained at a single point in time. It is, therefore, not surprising that prediction models processing longitudinal data perform better than those based on snapshot samples [38,39]. In fact, longitudinal data analysis may offer a way out of the ergodicity ‘trap’ for developing statistical prediction models, for example by identifying time-dependent disease patterns. For instance, the analysis of temporal fluctuation patterns of vital signs alone can add significant value in predicting survival [40]. Although the concept of analysing longitudinal data for outcome prediction is not new, it has gained more attention with the widespread use of electronic health records [41]. In a large study on predicting survival in critically ill patients, the aggregation of past medical history with longitudinal physiological data outperformed SAPS II and APACHE II scores calculated on admission [42]. Moreover, time series data of more complex characteristics in critically ill older patients, such as trajectories of severe disabilities prior to hospital admission, appear to have a greater impact on long-term mortality than the severity of the acute condition during hospital admission [43,44]. 
Thus, longitudinal datasets and new statistical techniques, such as trajectory clustering, are becoming part of the armamentarium in predictive modeling [45].

Individualised prognostication

The heterogeneity of patients with critical illnesses, the non-ergodicity of disease processes and other shortcomings of group-based statistics undermine the application of many prognostication models for individual cases. This situation resembles the problems with the use of information from traditional clinical trials on new interventions, i.e. the prediction of response to treatment in a specific individual [11]. The N-of-1 trial design has been proposed to tackle these issues. The individual patient serves as his/her control thereby excluding many of the above problems [46]. This technique, that can be described as ‘trial and error’ at its most basic level, is not new to critical care and, for example, applied for reversible decision-making while weaning patients from the ventilator. Moreover, the specific response to interventions can be monitored over time and used to identify an individual's physiological makeup. If available, mathematical models of dynamic processes can then be fitted to the individual's longitudinal data to make quantitative predictions over time, e.g. about the time to and future extent of recovery [47,48]. This technique could be considered an individual's ‘system identification’ [49] that, eventually, approximates a deterministic understanding of the individual's biological processes and, thereby, approaches the level of precision medicine [36].
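Fitting a dynamic model to one patient's longitudinal data, as described above, can be illustrated with a deliberately simple case. The Python sketch below assumes a first-order recovery model in which a physiological marker approaches a known target value exponentially; this model, the marker values and the function names are hypothetical choices for illustration, not the models used in the cited work [47,48].

```python
import math

def fit_exponential_recovery(times, values, target):
    """Least-squares fit of value(t) = target + d0 * exp(-t / tau).

    The model is linearised as ln|value - target| = ln|d0| - t / tau, so all
    observations must lie strictly on one side of the (assumed known) target.
    Returns the initial deviation d0 and the time constant tau.
    """
    deviations = [v - target for v in values]
    sign = 1.0 if deviations[0] > 0 else -1.0
    if any(d * sign <= 0 for d in deviations):
        raise ValueError("observations must stay on one side of the target")
    logs = [math.log(d * sign) for d in deviations]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("no recovery trend in the data")
    intercept = mean_y - slope * mean_t
    return sign * math.exp(intercept), -1.0 / slope

def time_to_within(d0, tau, tolerance):
    """Predicted time until the remaining deviation drops below `tolerance`."""
    return max(0.0, tau * math.log(abs(d0) / tolerance))

# Hypothetical marker sampled every 6 h, recovering towards a target of 1.0.
hours = [0, 6, 12, 18, 24]
marker = [1.0 + 5.0 * math.exp(-t / 12.0) for t in hours]
d0, tau = fit_exponential_recovery(hours, marker, target=1.0)
print(f"initial deviation {d0:.2f}, time constant {tau:.1f} h")
print(f"predicted time to within 0.5 of target: {time_to_within(d0, tau, 0.5):.1f} h")
```

The fitted time constant belongs to this one patient and needs no reference group, which is the point of the N-of-1 perspective. Real trajectories are noisy and rarely follow a single exponential, so model choice and uncertainty estimates would need careful validation.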

Time-limited trials (TLT) in critical care

To formally implement the above concepts for improved prognostication in individuals, patients are admitted to critical care for a time-limited trial [50,51]. There could be an option to implement a staged approach for elderly and frail patients [52]. The main purpose is to obtain longitudinal data from the individual which then provide the opportunity to apply new statistical techniques for outcome prediction. These datasets can also be used for a system identification if suitable mathematical models are available for fitting to the individual's time course data. However, the efficacy of TLT remains to be validated in clinical trials. A combination of initial (snapshot) assessment prior to admission and subsequent prognostication with longitudinal data may be the most pragmatic implementation. That could mean that individuals for whom the initial prognostic assessment remains sufficiently uncertain would then be admitted for a TLT. In neurocritical care, for example, there is a strong recommendation for a 72-h observation period to further monitor clinical parameters prior to final decisions about treatment withdrawal [53]. An expected downside of that approach is prolonged suffering in patients with an eventually negative TLT outcome [54]. From an economic point of view, prolonged treatment means that more resources are at least partially spent on patients who are beyond saving. This problem becomes important in situations with a severe shortage of resources, e.g. during a pandemic [8], and may affect the criteria for initiating and terminating a TLT to a point where TLTs need to be replaced by retrospective and, ideally, longitudinal data [43] or cumulative characteristics for prognostication [31]. On the upside, though, TLTs will also provide learning effects which go beyond the established knowledge as represented by reference groups in classical models.

Conclusions

Several recommendations issued at the beginning of the COVID-19 pandemic imply that prognostication methods currently used in clinical practice can be sufficiently certain to justify triage decisions for individuals, e.g. treatment withdrawal [9]. Some professional organisations, however, have advised against using these methods, notably illness severity scores, for individual prognostication in this situation [55]. We have provided arguments that predictions for individuals based on statistical models are uncertain in principle, although the extent of uncertainty may vary depending on reference groups as well as sampling and modeling techniques. Thus, the appropriate answer to questions about prognosis in an individual is ‘We don't know’ in most cases. Consequently, irreversible decisions should not be solely based on group statistics. By using quantitative measures of uncertainty in individual cases [22], statistical predictions can still be considered for decision-making, but in a more cautious and transparent way to adhere to the principles of medical ethics, especially non-maleficence and justice [21]. In addition to the traditional methods for prognostication, TLTs could provide a framework for individualised predictions in the future that is expected to reduce but not abolish predictive uncertainty. However, the benefit-to-cost ratio of their implementation will depend on the valuation of immaterial benefits for society which is outside the realm of clinical medicine [8]. Prognosis-dependent decisions for individuals are not only relevant to critical care but constitute a system-wide issue in medicine, for example with respect to the development of economically sustainable healthcare models. Thus, predictive modeling for individual patients is of general importance and needs to be further developed. 
An individualised system identification through longitudinal data analysis may help to understand an individual's capacity to recover from an illness and, thereby, help to achieve these goals. The data-rich environment of critical care provides an excellent framework for pilot studies in this field.

42 in total

1. The Eldicus prospective, observational study of triage decision making in European intensive care units. Part II: intensive care benefit for the elderly.

Authors: Charles L Sprung; Antonio Artigas; Jozef Kesecioglu; Angelo Pezzi; Joergen Wiis; Romain Pirracchio; Mario Baras; David L Edbrooke; Antonio Pesenti; Jan Bakker; Chris Hargreaves; Gabriel Gurman; Simon L Cohen; Anne Lippert; Didier Payen; Davide Corbella; Gaetano Iapichino
Journal: Crit Care Med Date: 2012-01 Impact factor: 7.598

Review 2. Personalized evidence based medicine: predictive approaches to heterogeneous treatment effects.

Authors: David M Kent; Ewout Steyerberg; David van Klaveren
Journal: BMJ Date: 2018-12-10

Review 3. Opportunities and challenges in developing risk prediction models with electronic health records data: a systematic review.

Authors: Benjamin A Goldstein; Ann Marie Navar; Michael J Pencina; John P A Ioannidis
Journal: J Am Med Inform Assoc Date: 2016-05-17 Impact factor: 4.497

4. ICU triage in an impending crisis: uncertainty, pre-emption and preparation.

Authors: Dominic Wilkinson
Journal: J Med Ethics Date: 2020-04-01 Impact factor: 2.903

5. The ability of intensive care unit physicians to estimate long-term prognosis in survivors of critical illness.

Authors: Ivo W Soliman; Olaf L Cremer; Dylan W de Lange; Arjen J C Slooter; Johannes Hans J M van Delden; Diederik van Dijk; Linda M Peelen
Journal: J Crit Care Date: 2017-09-06 Impact factor: 3.425

Review 6. Recommendations for the Critical Care Management of Devastating Brain Injury: Prognostication, Psychosocial, and Ethical Management : A Position Statement for Healthcare Professionals from the Neurocritical Care Society.

Authors: Michael J Souter; Patricia A Blissitt; Sandralee Blosser; Jordan Bonomo; David Greer; Draga Jichici; Dea Mahanes; Evie G Marcolini; Charles Miller; Kiranpal Sangha; Susan Yeager
Journal: Neurocrit Care Date: 2015-08 Impact factor: 3.210

7. Focus on the frail and elderly: who should have a trial of ICU treatment?

Authors: Otavio T Ranzani; Bruno A M P Besen; Margaret S Herridge
Journal: Intensive Care Med Date: 2020-03-02 Impact factor: 17.440

8. System identification of physiological systems using short data segments.

Authors: Daniel Ludvig; Eric J Perreault
Journal: IEEE Trans Biomed Eng Date: 2012-09-28 Impact factor: 4.538

9. To die well: the phenomenology of suffering and end of life ethics.

Authors: Fredrik Svenaeus
Journal: Med Health Care Philos Date: 2020-09

10. Development and Reporting of Prediction Models: Guidance for Authors From Editors of Respiratory, Sleep, and Critical Care Journals.

Authors: Daniel E Leisman; Michael O Harhay; David J Lederer; Michael Abramson; Alex A Adjei; Jan Bakker; Zuhair K Ballas; Esther Barreiro; Scott C Bell; Rinaldo Bellomo; Jonathan A Bernstein; Richard D Branson; Vito Brusasco; James D Chalmers; Sudhansu Chokroverty; Giuseppe Citerio; Nancy A Collop; Colin R Cooke; James D Crapo; Gavin Donaldson; Dominic A Fitzgerald; Emma Grainger; Lauren Hale; Felix J Herth; Patrick M Kochanek; Guy Marks; J Randall Moorman; David E Ost; Michael Schatz; Aziz Sheikh; Alan R Smyth; Iain Stewart; Paul W Stewart; Erik R Swenson; Ronald Szymusiak; Jean-Louis Teboul; Jean-Louis Vincent; Jadwiga A Wedzicha; David M Maslove
Journal: Crit Care Med Date: 2020-05 Impact factor: 7.598
