More Unnecessary Imaginary Worlds - Part 4: The ICER Evidence Report for Crizanlizumab, Voxelotor and L-Glutamine for Sickle Cell Disease.

Abstract

A number of commentaries have been published over the past 4 years by the present author on the manifest flaws in the reference case value assessment framework of the Institute for Clinical and Economic Review (ICER). The recent release of the evidence report on sickle cell disease continues ICER's commitment to what has been described as the creation of imaginary worlds to support value assessment. The purpose of the present commentary is to continue the critiques that have been presented for earlier evidence reports. This is important because of the apparent willingness to take ICER's recommendations at face value rather than undertake a critical review of the value assessment framework. The case presented here points to a number of weaknesses in the ICER framework: (i) the fabrication of imaginary constructs with a lifetime cost-per-incremental QALY framework; (ii) the consequent failure to meet the standards of normal science; (iii) the illogical reliance on assumptions drawn from the literature to create future scenarios; (iv) the rejection of hypothesis testing in favor of 'approximate information'; and (v) a belief that, in the construction of QALYs, the EQ-5D-3L utility scale has ratio properties. This last point is demonstrably false, which means that the ICER value assessment framework collapses. It is mathematically impossible, a failure to meet the axioms of fundamental measurement, for an ordinal utility scale to be combined with time spent in a disease state. The result is that the pricing and access recommendations for crizanlizumab, voxelotor and L-glutamine in sickle cell disease (SCD) are complete nonsense and should be rejected. © Individual authors.

Keywords: ICER; Rasch standards; imaginary worlds; nonsense claims; pseudoscience; sickle cell

Year: 2020 PMID: 34007618 PMCID: PMC8051927 DOI: 10.24926/iip.v11i2.3123

Source DB: PubMed Journal: Innov Pharm ISSN: 2155-0417

Introduction

Insinuate: to introduce by stealthy, smooth, or artful means (Merriam Webster)

Over the past few years the Institute for Clinical and Economic Review (ICER) has attempted to insinuate itself as the principal arbiter for value assessments in the US. The ICER business model is built around the construction of lifetime imaginary simulations which claim to provide a framework relevant to health system decision makers for pricing and access for pharmaceutical products and devices. As detailed in a recent review of the ICER value assessment framework, the ICER modeling approach fails to meet the standards of normal science, that is, the discovery of new facts [1]. It is best characterized as pseudoscience (i.e., bunk). Constructing imaginary worlds to support pricing and access recommendations has certainly characterized health technology assessment over the past 30-plus years. Indeed, the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) makes clear that it is not interested in hypothesis testing or the discovery of new facts in treatment impact [2]. ISPOR sees its principal role as generating ‘approximate information’, in contrast to real world evidence where meaningful claims for therapy impact and quality of life in disease areas can be evaluated from patient-centric evidence platforms. The purpose of the present commentary is to point to the manifest flaws in the latest attempt by ICER to fabricate imaginary recommendations for pricing and access [3]. This is a critical issue as ICER recommendations can ensure that access to new therapies, in this case for sickle cell disease, is barred to those most in need. ICER has the responsibility for defending its position, not only for pricing recommendations but also for denying access to new therapies. Unfortunately, irrespective of ICER’s claim that it adheres to ‘gold standard’ techniques in its fabrication of imaginary cost-per-QALY worlds to support its revelations, its methodology is fatally flawed.
Yet ICER perseveres in a program of making recommendations for price discounting and access on a value assessment framework that defies the standards of normal science. What needs to be made clear is the absurdity of the ICER value assessment framework in this latest application in sickle cell disease in its final evidence report [4]. This stems from (i) the fabrication of imaginary constructs with a lifetime cost-per-incremental QALY framework; (ii) the consequent failure to meet the standards of normal science; (iii) the illogical reliance on assumptions drawn from the literature to create future scenarios; (iv) the rejection of hypothesis testing in favor of ‘approximate information’; and (v) a belief that, in the construction of QALYs, the EQ-5D-3L utility scale has ratio properties.

The Imaginary Worlds of ICER

Imaginary: existing only in the imagination (Oxford Dictionaries)

Imaginary worlds can be compelling; from Peter Pan to Harry Potter, millions of children (and adults) have been enthralled with their creativity and their identification with the leading characters. Health technology assessment, as understood and proselytized by groups such as the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) in their advocacy of approximate imaginary information created by imaginary lifetime reference case worlds, has been in the forefront in advocating practice standards for fantasy creations. The imaginary simulated world for SCD has as its primary objective a decision analytic framework to estimate the lifetime cost-effectiveness of three treatments for SCD: crizanlizumab (Novartis AG), voxelotor (Global Blood Therapeutics), and L-glutamine (Emmaus), each combined with usual care, compared to usual care alone. The model estimates imaginary outcomes that include life years gained, quality-adjusted life years (QALYs) gained, equal value life years gained (evLYG), clinical events, pain crises avoided, change in hemoglobin, and total costs for each intervention over a lifetime time horizon. The base-case analysis used a health care sector perspective (i.e., direct medical care costs only), with the societal perspective as a co-base case, presented directly alongside the health care sector perspective analysis. The model is a cohort-level Markov model of costs, quality of life (QoL), clinical events, and mortality associated with SCD among children and adults in the US diagnosed with the disease, using a 2-week cycle length. This approach was chosen due to the chronic nature of the disease and the multiple recurring events in SCD. The model focuses on transitions between acute and chronic health states and includes the risk of death.
Treatments that delay or avoid acute and chronic conditions will, in the model framework, improve patients’ health, quality of life, and health care costs. Evidence of treatment effects on acute pain crises and level of hemoglobin comes directly from the pivotal trials. Evidence linking acute pain crises and levels of hemoglobin to other acute and chronic conditions comes from multiple sources and assumptions, as these relationships were not directly measured in the clinical trials. The intricacies of the SCD model are not of concern here. After all, although data are drawn from the various pivotal clinical trials, data are also drawn from a variety of other sources to create the model assumptions. This model is one of many, if not a multiverse of different models, to support imaginary competing claims for therapy impacts in SCD. The key point, from the perspective of an imaginary construct with lifetime non-credible (and obviously non-evaluable) claims, is the construction of quality-adjusted life years, lifetime QALY estimates for the various assumed and modeled treatment pathways, lifetime costs and, the pièce de résistance, incremental cost-per-QALY estimates to support threshold analysis and the much-awaited ICER recommendations for price discounting from WAC. Our focus, therefore, is on the QALY and whether the model assumptions regarding how utilities are ‘discovered’ and assigned and QALYs created make any sense from the perspective of normal science. Although judged here as nonsensical, given the manifest flaws in the SCD model in respect of the criteria of normal science and the disregard of fundamental measurement theory, ICER recommends substantial discounts from wholesale acquisition costs (WAC): crizanlizumab 70% to 74%; voxelotor 79% to 83%; and L-glutamine 35% to 40%. These should be ignored as they fall squarely in the realm of pseudoscience.
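For readers unfamiliar with the mechanics being criticized, the following is a minimal sketch of a cohort-level Markov model with a 2-week cycle that accumulates costs and QALYs and then forms an incremental cost-per-QALY ratio. It is an illustration only and is not the ICER model: the three health states, the transition probabilities, the utilities and the per-cycle costs are all hypothetical, and discounting and half-cycle correction are left out for brevity.

```python
CYCLE_YEARS = 2 / 52  # two-week cycle, as described in the text


def run_cohort(transition, utilities, costs, cycles):
    """Trace a cohort through health states; return (QALYs, cost) per person."""
    n_states = len(transition)
    share = [1.0] + [0.0] * (n_states - 1)  # whole cohort starts in state 0
    total_qalys = 0.0
    total_cost = 0.0
    for _ in range(cycles):
        # Credit this cycle's utility-weighted time and cost to the running totals.
        total_qalys += sum(s * u for s, u in zip(share, utilities)) * CYCLE_YEARS
        total_cost += sum(s * c for s, c in zip(share, costs))
        # Move the cohort one cycle forward through the transition matrix.
        share = [
            sum(share[i] * transition[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    return total_qalys, total_cost


def cost_per_qaly(new, old):
    """Incremental cost divided by incremental QALYs for (QALYs, cost) pairs."""
    (q_new, c_new), (q_old, c_old) = new, old
    return (c_new - c_old) / (q_new - q_old)


# HYPOTHETICAL inputs: states are stable, acute pain crisis, dead.
UTILITIES = [0.80, 0.50, 0.00]
USUAL_CARE = [[0.90, 0.09, 0.01], [0.70, 0.28, 0.02], [0.00, 0.00, 1.00]]
TREATED = [[0.94, 0.05, 0.01], [0.70, 0.28, 0.02], [0.00, 0.00, 1.00]]

usual = run_cohort(USUAL_CARE, UTILITIES, [500.0, 5000.0, 0.0], cycles=26 * 60)
treated = run_cohort(TREATED, UTILITIES, [1500.0, 6000.0, 0.0], cycles=26 * 60)
print(f"usual care: {usual[0]:.2f} QALYs, ${usual[1]:,.0f}")
print(f"treated:    {treated[0]:.2f} QALYs, ${treated[1]:,.0f}")
print(f"incremental cost per QALY: ${cost_per_qaly(treated, usual):,.0f}")
```

Every number that comes out of such a simulation is a direct consequence of the assumed inputs; changing any of the hypothetical probabilities or utilities changes the ratio, which is why the choice of assumptions matters so much to the argument that follows.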

The Standards of Normal Science

Pseudoscience: a collection of beliefs or practices mistakenly regarded as being based on scientific method (Oxford Dictionaries)

The requirement for testable hypotheses in the evaluation and provisional acceptance of claims made for pharmaceutical products and devices is unexceptional. Since the 17th century, it has been accepted that if a research agenda is to advance, if there is to be an accretion of knowledge, there has to be a process of discovering new facts. ICER is opposed to this. By the 1660s, the scientific method, following the seminal contributions of Bacon, Galileo, Huygens and Boyle, had been clearly articulated by associations such as the Accademia del Cimento in Florence (1657) and the Royal Society in England (founded 1660; Royal Charter 1662) with their respective mottos Provando e Riprovando (prove and again prove) and nullius in verba (take no man’s word for it) [5]. By the early 20th century, standards for empirical assessment were put on a sound methodological basis by Popper (Sir Karl Popper 1902-1994) in his advocacy of a process of ‘conjecture and refutation’ [6,7]. Hypotheses or claims must be capable of falsification; indeed, they should be framed in such a way that makes falsification likely. Although Popper’s view on what demarcates science (e.g., natural selection) from pseudoscience (e.g., intelligent design) is now seen as an oversimplification, with demarcation involving more than just the criterion of falsification, the demarcation problem remains [8]. Certainly, there are different ways of doing science but what all scientific inquiry has in common is the ‘construction of empirically verifiable theories and hypotheses’. Empirical testability is the ‘one major characteristic distinguishing science from pseudoscience’; theories must be tested against data. Hence pivotal clinical trials, not simulated imaginary worlds with selected data inputs from pivotal trial data to recycle old (and imagined) facts.
We can only justify our preference for a theory by continued evaluation and replication of claims. This applies in SCD just as it does in other therapies. Constructing imaginary worlds, even if the justification is that they are ‘for information’, is, to use Bentham’s (Jeremy Bentham 1748-1832) memorable phrase, ‘nonsense on stilts’. If there is a belief, as subscribed to by ICER, in the sure and certain hope of constructing imaginary worlds to drive formulary and pricing decisions, then it needs to be made clear that this is a belief that lacks scientific merit. It fails the demarcation test; it is pseudoscience (i.e., pure bunk).

Approximate Information (or Disinformation)

Approximate: close to the actual, but not completely accurate or exact (Oxford Dictionaries)

It is worth emphasizing that ISPOR, as noted above, ICER’s methodological mentor, explicitly disavows hypothesis testing as a core activity in health technology assessment. The primary role of health technology assessment is to create ‘approximate information’. It is not clear what this means (presumably it can be distinguished from ‘approximate disinformation’) as there is not, in the imaginary world of ICER modeling, any known reference point for ‘true information’ against which to judge approximation. How close are we? It is difficult to be approximate to the ‘truth’ when the context is imaginary and the ‘truth’ will only be revealed 10, 20 or 30 years or more ahead if all the assumptions in the model are realized. The OED definition may relate approximate to a ‘known’ truth but in the construction of imaginary worlds there can be no such reference point.

Choice of Assumptions

Assumptions: a thing that is accepted as true or as certain to happen, without proof (Oxford Dictionaries)

The ICER claim to fame is the ability to construct or fabricate an imaginary world that sets the stage for value impact over 10, 20 or 30 years in the future. In the SCD model the number of assumptions made to support the various simulations and their scenario progeny across the three therapies is truly awesome; some come from the literature, others are pure guesswork. Unfortunately, even if an assumption driving the imaginary value assessment framework is defended by appealing to the literature (including pivotal clinical trials), the effort is wasted. The point, and this goes back to Hume’s (David Hume 1711-1776) induction problem, is that we cannot ask clients in health care to believe in models constructed on the belief that prior assumptions will hold into the future. It is logically indefensible: it cannot be ‘established by logical argument, since from the fact that all past futures have resembled past pasts, it does not follow that all future futures will resemble future pasts’ [9]. No, Virginia, all swans are not white. You may have seen only English swans, but on my last QANTAS vacation in Western Australia, I saw black swans.

Achilles, Utilities and QALYs

Achilles Heel: a weakness or vulnerable point (Oxford Dictionaries)

QALYs are the Achilles heel of the ICER construction and belief in imaginary reference case lifetime worlds; exeunt QALYs and the fantasy edifice collapses. Apart from their use in the ICER contribution to the science fiction literature, QALYs can only survive if the measure is credible, evaluable and replicable. The QALY constructed by ICER in the SCD model meets none of these criteria. In fact, there is only one reference cited in SCD in the ICER report for utilities, in this case the EQ-5D-3L, self-assessed in a hospital environment; the balance of utilities are for other, possibly similar, disease states [10]. The concept of a QALY is not new; it goes back some 40-plus years with the notion of combining time spent in a disease state with some multiplicative ‘score’ on a required interval scale of 0 to 1 (death to perfect health). Combining the two, multiplying time by utility, is assumed to produce a QALY. In the ICER imaginary SCD world these are combined to produce QALYs for the modeled lifetime. However, before considering the EQ-5D-3L utility that is central to the imaginary SCD simulation, a brief digression on measurement theory and its application to instrument development in the social sciences is in order. There are four main types of measurement scale, putting to one side conjoint simultaneous measurement which underpins Rasch Measurement Theory (RMT) [11]. These are: nominal, ordinal, interval and ratio. Each satisfies one or more of the properties of: (i) identity, where each value has a unique meaning; (ii) magnitude, where each value has an ordered relationship to other values; (iii) interval, where scale units are equal to one another; and (iv) ratio, where there is a ‘true zero’ below which no value exists. Nominal scales are purely descriptive and have no inherent value in terms of magnitude.
Ordinal scales have both identity and magnitude in an ordered relation but the unknown distances between the ranks mean the scale is capable only of generating medians and modes. The interval scale has identity, magnitude and equal intervals. It supports the mathematical operations of addition and subtraction. A ratio scale satisfies all properties, supporting the additional mathematical operations of multiplication and division. Recognition of and adherence to these fundamental axioms of measurement theory is critical if a measure is to have any credibility. In the physical sciences this has long been recognized, as accurate measurement is key to hypothesis testing and the discovery of new facts. The same arguments apply to the social sciences. Unfortunately, they appear all too often absent in health technology assessment. The case presented here is that the EQ-5D-3L generates ordinal or manifest scores [12]. It does not have interval properties (i.e., invariance of comparisons) and it certainly does not have ratio properties as the EQ-5D-3L ‘score’ lacks a true zero (i.e., distance from zero). Unfortunately, the EQ-5D-3L scale has no demonstrable interval measurement properties (with odd ceiling and floor effects) as well as allowing negative utilities (below a true zero). Of course, if the EQ-5D-3L fails to demonstrate interval properties, then it is a waste of time to consider whether it has ratio properties. The actual range for the EQ-5D-3L is not from 0 = death to 1 = perfect health, but from -0.59 to 1.0, as the algorithm to compute utilities allows negative values. The fact that the EQ-5D-3L has ordinal properties is easily demonstrated: the symptom elements that comprise the EQ-5D-3L attributes are on an ordinal scale. Simply applying community preference weights results in a composite ordinal scale. There is the further question of unidimensionality. Measurement scales should have the property of unidimensionality. The focus should be on one attribute at a time.
We must avoid conflating a number of attributes into a single score. Multiattribute scales such as the EQ-5D-3L reduce confidence in predictions and the score is a less useful summary. In Rasch modeling, estimates of item difficulty and person ability are meaningful if every question contributes to the measurement of a single underlying attribute. Our analytical procedures, if we are to meet the property of unidimensionality, must incorporate indicators of the extent to which the persons and items fit our concept of an ideal unidimensional line. Items should contribute in a meaningful way to the construct/concept being investigated. In the case of the EQ-5D-3L the notion of unidimensionality is absent. While it is claimed to capture health related quality of life (HRQoL), there is no single attribute or latent construct. It comprises 5 symptoms (mobility, self-care, usual activity, pain/discomfort, anxiety/depression) with three ordinal response levels (no problem, some problems and major problems), creating a multiattribute scale with ordinal properties. Each of the symptoms is an attribute that could be the foundation for its own unidimensional scale. While ICER apparently believes the EQ-5D-3L has ratio properties, this is demonstrably false given negative utilities. But perhaps this is not as egregious as the ‘false assumption’ position taken by authors where it is acknowledged that the EQ-5D-3L lacks a true zero but that, in order to maintain the QALY illusion, we assume it has ratio properties [13]. The situation does not change when we move from the EQ-5D-3L to the EQ-5D-5L (introduced in 2009) where there are 5 response levels. Increasing the allowed ordinal responses to five reduces the number of respondents with extreme problems. The result is a range, still including negative utilities, from -0.29 to 1.0. Even so, it is still an ordinal or manifest score.
Even if ICER were willing to recognize the absence of fundamental measurement properties in the EQ-5D-3L (and other generic utility instruments), this does not mean that this would give succor to the belief in fabricated imaginary evidence. The ICER value assessment framework would still fail the demarcation test as pseudoscience. It is also difficult to see how ICER might underwrite a ‘utility’ instrument that met the standards required (a true zero yet capped at unity). After all, instruments developed by application of RMT focus on the response to interventions on a constructed interval scale from ordinal responses rather than attempting to go the further step of creating instruments which have ratio properties [14,15,16].
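The measurement argument in this section can be illustrated with a short sketch. An ordinal scale carries only rank order, so any strictly increasing rescaling of the scores is an equally legitimate representation of the same information. If QALY comparisons were meaningful on such a scale, they would survive that rescaling. The two health profiles below and the choice of cubing as the rescaling are hypothetical; only the -0.59 lower bound is taken from the text.

```python
def qalys(profile):
    """Sum of (years in state) x (utility score) over a health profile."""
    return sum(years * score for years, score in profile)


def rescale(profile, transform):
    """Apply an order-preserving transform to every score in a profile."""
    return [(years, transform(score)) for years, score in profile]


def cube(score):
    """Strictly increasing on all reals, so the ranking of scores is unchanged."""
    return score ** 3


# HYPOTHETICAL profiles, each covering 10 years.
PROFILE_A = [(10, 0.5)]
PROFILE_B = [(4, 0.9), (6, 0.2)]

for label, f in (("raw scores", lambda s: s), ("cubed scores", cube)):
    a = qalys(rescale(PROFILE_A, f))
    b = qalys(rescale(PROFILE_B, f))
    print(f"{label}: A = {a:.3f}, B = {b:.3f}, preferred = {'A' if a > b else 'B'}")

# A state scored at the negative end of the range cited above (-0.59)
# yields negative "QALYs" when multiplied by time.
print(f"2 years at -0.59: {qalys([(2, -0.59)]):.2f}")
```

Under the raw scores profile A accumulates more QALYs than profile B, while under the cubed scores the ordering reverses, even though the rank order of the individual scores is identical in both cases. Multiplication by time gives a stable answer only when the score has ratio properties.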

Conclusion: Next Steps

The fact that the application of utility values, from a variety of sources, to create QALYs fails the standards of fundamental measurement should be sufficient to show that the ICER reference case model for SCD (and all previous evidence based disease claims) should be rejected; unfortunately, this will not deter ICER. The company has too much invested in its claim as the US technology assessment arbiter of emerging products and technologies. After all, it would be embarrassing to admit that its recommendations for pricing and access are, to say the least, nonsensical, and that the ICER value assessment framework is more appropriately classified with intelligent design than natural selection. In SCD, it will be up to the manufacturers to make the case to health system decision makers for ignoring ICER. They will have to offer an alternative approach to evaluating the ‘value’ of their products. Previous commentaries have proposed that rather than focusing on generic utilities and QALYs, manufacturers should direct their activities to claims based on disease specific QoL instruments. Since the mid-1990s disease specific (both patient and caregiver) instruments have been developed with needs fulfillment as the latent unidimensional construct. The instruments meet the required standards of RMT to create an instrument with interval measurement properties to assess response to therapy: does a new therapy contribute to patients’ needs being more effectively met in a disease state? An instrument that meets these standards should be considered in SCD. Developing such an instrument would provide a complement to the Adult Sickle Cell Quality of Life Measurement Information System (ASQ-Me) [17]. While the ASQ-Me is not patient centric, as it lacks a focus on needs fulfillment (it also does not meet RMT standards), there are elements from this system that should be retained (e.g., evaluation of pain experience).
Adopting a disease specific, RMT standard patient centric instrument, or rather a family of instruments that capture pediatric patients, their caregivers and adults with SCD, gives a sound basis for evaluating response to therapy. It would provide claims that are credible, evaluable and replicable. It would be a simple index of response to therapy and could be an integral part of evidence platforms such as registries in SCD. The fact is, we don’t need ICER (or any other group) to spend eight months from conception through gestation to produce an imaginary construct that fails the standards of normal science. It is a commitment to fantasy creations that is, surprisingly, supported financially by manufacturers; they should know better. A return to the standards of normal science, to the discovery of new facts in the treatment and response to therapies in diseases such as SCD, would be a welcome respite from, and antidote to, ICER.

1. Application of Rasch analysis in the development and application of quality of life instruments.

2. The use of raw scores from ordinal scales: time to end malpractice?

3. Measurement of patient-reported outcomes. 2: Are current measures failing us?

4. A Health Economics Approach to US Value Assessment Frameworks-Introduction: An ISPOR Special Task Force Report [1].

5. Selecting Health States for EQ-5D-3L Valuation Studies: Statistical Considerations Matter.

6. Measurement of patient-reported outcomes. 1: The search for the Holy Grail.

7. Patient self-assessment of hospital pain, mood and health-related quality of life in adults with sickle cell disease.