Johanna M van Dongen, Marieke F van Wier, Emile Tompa, Paulien M Bongers, Allard J van der Beek, Maurits W van Tulder, Judith E Bosmans
From the Department of Health Sciences and the EMGO+ Institute for Health and Care Research (Ms van Dongen, Dr van Wier, Prof Dr van Tulder, and Dr Bosmans), Faculty of Earth and Life Sciences, VU University Amsterdam; Department of Public and Occupational Health and the EMGO+ Institute for Health and Care Research (Ms van Dongen and Profs Drs Bongers and van der Beek), VU University Medical Center; Body@Work, Research Center for Physical Activity, Work and Health (Ms van Dongen, Dr van Wier, and Profs Drs Bongers, van der Beek, and van Tulder), TNO-VU University Medical Center; Department of Epidemiology and Biostatistics and the EMGO+ Institute for Health and Care Research (Dr van Wier and Prof Dr van Tulder), VU University Medical Center, Amsterdam, the Netherlands; Institute for Work & Health, Toronto (Dr Tompa), Department of Economics (Dr Tompa), McMaster University, Hamilton; Dalla Lana School of Public Health (Dr Tompa), University of Toronto, Ontario, Canada; and TNO Healthy Living (Prof Dr Bongers), Hoofddorp, the Netherlands.
Abstract
To allocate available resources as efficiently as possible, decision makers need information on the relative economic merits of occupational health and safety (OHS) interventions. Economic evaluations can provide this information by comparing the costs and consequences of alternatives. Nevertheless, only a few of the studies that consider the effectiveness of OHS interventions take the extra step of considering their resource implications. Moreover, the methodological quality of those that do is generally poor. Therefore, this study aims to help occupational health researchers conduct high-quality trial-based economic evaluations by discussing the theory and methodology that underlie them, and by providing recommendations for good practice regarding their design, analysis, and reporting. This study also helps consumers of this literature with understanding and critically appraising trial-based economic evaluations of OHS interventions.
Resources for occupational health are scarce.1,2 Therefore, decision makers in this field increasingly call upon advisors and researchers to demonstrate that occupational health and safety (OHS) interventions are not only effective but also efficient in terms of their resource implications. Economic evaluations provide information on the relative efficiency of two or more alternative interventions and are defined as “the comparative analysis of alternative courses of action in terms of both their costs and consequences.”1(p9) The main aspects of any economic evaluation are to identify, measure, value, and compare the costs and consequences of alternatives.1
In the health care sector, economic evaluations are increasingly being conducted and play an important role in many countries when deciding whether (new) treatments should be covered by public funding.1 Nevertheless, only a few of the studies that consider the effectiveness of OHS interventions take the extra step of considering whether they are efficient in terms of their resource implications.3 Moreover, the methodological quality of those that do is generally poor.4–7 Reasons for this may be the distinct challenges that confront researchers when trying to identify the resource implications of OHS interventions, and a lack of recommendations on how to deal with these issues.3 Many economic evaluation textbooks and articles are designed for use in health care settings and may be difficult to adapt to the occupational health context.4
Effectiveness trials are a commonly used vehicle for economic evaluations, as they provide a unique opportunity to reliably estimate the resource implications of a new intervention without substantially higher research expenses. Although some efforts have been undertaken to improve the quality of (trial-based) economic evaluations in occupational health,3,8,9 more needs to be done to accomplish this.
Therefore, this study aims to help occupational health researchers conduct high-quality trial-based economic evaluations by discussing the theory and methodology that underlie them, and by providing recommendations for good practice regarding their design, analysis, and reporting.
DESIGN OF AN ECONOMIC EVALUATION
Kinds of Economic Evaluations
Choosing the appropriate kind of economic evaluation for a particular occupational health decision context can be a challenge as a result of the relative complexity of the decision-making context that generally includes multiple stakeholders (eg, workers, employers, insurance companies, public policymakers). Four kinds of economic evaluations are distinguished. There are similarities across the 4 kinds. The main difference is the metric used to measure the key outcome (health and/or safety, in the case of OHS interventions).10
Cost-effectiveness analysis (CEA). Costs and some consequences (eg, productivity, health care utilization implications) are measured in monetary units, whereas the key outcome is measured in natural units.1
Cost–benefit analysis (CBA). Both costs and consequences are measured in monetary units. In business administration, CBAs are sometimes described as return-on-investment (ROI) analyses.
Cost-utility analysis (CUA). Costs and some consequences are measured in monetary terms, whereas the key outcome is measured in utility units. Utilities are often expressed in terms of quality adjusted life years (QALYs).1
Cost-minimization analysis. Only costs are considered across alternatives, as it is assumed that the consequences are similar. Cost-minimization analyses are considered inappropriate if there is uncertainty regarding a possible difference in the magnitude of consequences.1
Which kind of economic evaluation is most appropriate depends on the stakeholders involved and the question being asked. Generally, employers are most interested in CBAs that can provide insight into the impact of an intervention on a company's bottom line, whereas public policymakers may be more interested in CEAs and CUAs, particularly if monetary measures do not adequately capture important health outcomes.1,8,11 Therefore, it is recommended that researchers conduct various kinds of economic evaluations within the same study to inform all relevant stakeholders.3
When to Undertake an Economic Evaluation?
Economic evaluations are often conducted alongside (“piggybacked” onto) trials evaluating the effectiveness of OHS interventions. Various design aspects are, therefore, typically determined by the requirements of the effectiveness trial (eg, alternatives, outcome measures). Nevertheless, to ensure that all relevant economic data are collected in a valid, reliable, and efficient way, it is important to consider the requirements for the economic evaluation at the earliest possible stage.12–14
Debate exists as to whether an economic evaluation should be included in a trial before the effectiveness of a new intervention is established. Nevertheless, not including an economic evaluation would risk losing the opportunity to simultaneously collect cost and effect data.14 Also, the absence of statistically significant consequence/effect differences between the alternatives being compared does not necessarily imply that the new alternative is not cost-effective and/or cost-beneficial. Economic evaluations are about the joint distribution of costs and consequences and could demonstrate clear cost-effectiveness/cost–benefit when neither cost nor consequence differences are individually significant.14 Also, cost savings might occur in the absence of health improvements and could thus be missed if an economic evaluation is not performed.
Trial Design
Pragmatic randomized controlled trials (RCTs) are generally acknowledged as the best vehicle for economic evaluations, because they enable the evaluation of the resource implications of OHS interventions under “real life” conditions. This setup increases the external validity of results, while the internal validity is guaranteed by the randomization of participants.4,14 Within the occupational health setting, however, participant-level randomization may not always be feasible (eg, when interventions include organizational components). In such cases, randomization at the level of departments or locations might provide a more feasible approach (ie, cluster-RCTs).3 To ensure that the results of an economic evaluation are generalizable to occupational health practice, trial conditions should resemble daily practice as much as possible. For example, participants should be similar to those who will experience the intervention if it is implemented broadly, monitoring should be done under routine circumstances, and interventions should be compared with usual practice.
Perspective
An essential aspect of an economic evaluation is its perspective. Perspective refers to the “point of view” taken to identify relevant costs and consequences for inclusion in the evaluation. The chosen perspective may be that of any relevant stakeholder or an aggregate of stakeholders such as a societal perspective. The perspective determines which costs and consequences are included. In the societal perspective, for example, all costs and consequences are considered irrespective of who pays or benefits, whereas only those borne by employers are included when the employer's perspective is applied. Given this fact, the perspective is a critical element in an analysis and should therefore be stated explicitly.1
OHS interventions are typically initiated by company management, either to comply with the law, in an effort to save money (ie, reduced sickness absence costs), or for moral reasons.11 Consequently, most economic evaluations of such interventions are performed from the employer's perspective,4–7,15 but other perspectives may also be relevant, for example, the worker's, insurer's, and societal perspectives. When the employer's perspective is applied, key worker outcomes, such as the value of worker health, are often not included in the analysis; only the health-related expenses incurred by an employer (eg, productivity implications) are. This is a critical oversight, as occupational health is essentially about worker health. A societal perspective is particularly useful to consider, as it provides insight into the net effect across all stakeholders. It thereby shows whether the societal costs of an intervention are less than the benefits experienced by all stakeholders, rather than simply whether the company's costs are less than its benefits.3 This information helps to establish that there is a net societal benefit, rather than simply cost shifting from one stakeholder to another.
In addition, the disaggregated information on costs and consequences from a societal perspective provides a good sense of their distribution across stakeholders. Such information can be the launch pad for bargaining between them.1 This may be of particular importance in countries with dual-payer (eg, the Netherlands) and universal health care systems (eg, the United Kingdom), because employers generally bear most of the costs of OHS interventions, whereas in such jurisdictions the health care system and/or government reaps a large part of their benefits (ie, reduced medical spending).16 Therefore, it is advisable to supplement findings from the employer's perspective with those from other relevant perspectives, particularly the societal one.
Analytic Time Frame
Researchers also need to decide about the time frame over which costs and consequences are analyzed. The analytic time frame ought to cover the entire period over which costs and consequences flow from the alternatives under consideration.12 This time frame generally extends beyond the follow-up needed to establish the effectiveness of a new intervention. To illustrate, the follow-up of an effectiveness trial may be terminated after the occurrence of the clinical event of interest (eg, incidence of repetitive strain injury). If this follow-up was used for the economic evaluation, all costs and consequences incurred during the course of the disorder or its recurrences would not be taken into account (eg, repetitive strain injury–related medication and/or operation costs), leading to an underestimation of the total costs and consequences. Although the optimal follow-up period is generally unknown, researchers and readers should at least feel confident that the most important costs and consequences are covered by the chosen analytic time frame. In addition, future costs and consequences that occur after the measurement period can be estimated using information and data from various sources. This is particularly important to do if future costs and consequences are expected to be substantial (eg, many of the [health] benefits of preventive interventions are thought to occur in the future).
Identification, Measurement, and Valuation of Resource Use
In economic evaluations, costs and some consequences are expressed in monetary units. For this purpose, relevant resource use categories should be identified, measured, and valued. As discussed earlier, relevant resource use categories for inclusion in an economic evaluation depend on its perspective. Other factors that might determine the relevance of a resource use category are, among others, the country or jurisdiction in which the study is undertaken and the nature of the alternatives being compared.
After relevant resource use categories are identified, researchers should determine how to cost them. Costing generally involves three steps: (1) the measurement of quantities of resources consumed (Q), (2) the assignment of unit prices (P), and (3) the valuation of resources consumed by multiplying their quantities by their respective unit prices (Q*P).1 These estimates should be reported separately so that the reader can judge the relevance of these measures to his or her setting.17
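The three costing steps can be illustrated with a minimal Python sketch. The `ResourceUse` structure, the category names, and all quantities and unit prices are hypothetical and chosen only for illustration; they are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class ResourceUse:
    """One resource use category for a single participant (hypothetical structure)."""
    category: str
    quantity: float    # Q: units consumed (hours, visits, days)
    unit_price: float  # P: price per unit, in the reference-year currency

    @property
    def cost(self) -> float:
        # Step 3 of costing: value the resource use as Q * P.
        return self.quantity * self.unit_price


def total_cost(items):
    """Sum Q * P over all resource use categories."""
    return sum(item.cost for item in items)


# Made-up quantities and unit prices, for illustration only.
items = [
    ResourceUse("intervention staff hours", 3.0, 40.0),
    ResourceUse("physiotherapy visits", 2.0, 35.0),
    ResourceUse("sickness absence days", 1.5, 200.0),
]

# Report Q, P, and Q * P separately so readers can judge relevance to their setting.
for item in items:
    print(f"{item.category}: Q={item.quantity}, P={item.unit_price}, cost={item.cost}")
print("total:", total_cost(items))
```

Keeping Q and P as separate fields, rather than storing only the product, mirrors the recommendation to report quantities and unit prices separately.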
Measurement of Quantities of Resources Consumed
Resource use data are ideally collected prospectively through a data collection process that is fully integrated into the effectiveness trial.1,13 Also, when collecting self-reported resource use data, researchers have to balance recall bias against completeness of information. Shorter recall periods reduce the risk of participants forgetting important information. Nevertheless, collecting data with relatively short recall periods (eg, a couple of weeks) over a longer period of time may be overly burdensome to participants and may thus increase the risk of missing data and dropouts. Therefore, it may be better to maximize completeness at the cost of some recall bias,14 for example, by using 2- to 3-month recall periods in a trial with a long-term follow-up (≥12 months).18 Also, care should be taken to collect resource use data continuously during follow-up and to avoid the need for extrapolation of resource use estimates between measurement periods.
Assignment of Unit Prices
Unit prices used for valuing resource use ought to reflect opportunity costs, that is, “the value of a resource in its most highly valued alternative use.”8(p56) In a world of perfect markets, such costs are revealed by the market price of a good or service. Nevertheless, if a competitive market does not exist for a good or service, market prices often are an inaccurate measure of its value. For example, if a premium is paid for a good or service due to restricted market entry, market prices may overestimate the opportunity costs at the societal level. When the societal perspective is applied, an adjustment should, therefore, be made to the market price, for example, by using the price of a comparable good or service.8 For the employer's perspective, the actual purchase costs incurred by the employer may be more appropriate, as they better represent the sum of money that is not available to the employer for its best alternative use.12,19 Thus, appropriate unit prices may vary between perspectives, and researchers should ensure that they reflect the true resource implications to the decision maker at hand.8
A brief description of the methods used for measuring and valuing the most frequently used resource use categories in economic evaluations of OHS interventions is provided later. The most frequently used resource use categories are intervention, productivity, health care, and workers' compensation costs.4–7,15
Intervention Costs
Information on the market price of an intervention may be derived from vendors or company and/or research project records. Many trials, however, assess novel interventions that either have no predefined price weights associated with them or for which the use of market prices is inappropriate (eg, when the societal perspective is applied).12 In such cases, the actual intervention costs can be assessed using a bottom-up micro-costing approach, in which detailed data regarding the quantities of resources consumed as well as their unit prices are collected per intervention component separately. Such resources may include intervention staff hours, materials used, depreciation, overhead activities, square feet of office space, and traveling.1,3,12 Also, workers may be taken away from their regular production activities to participate in the intervention and this should be accounted for as well. Costs associated with the intervention's evaluation should not be included unless it is a condition of implementation.8 Quantities of resources consumed can be measured using administrative databases, expert panels, surveys or interviews with intervention participants and/or providers, intervention operation logs, or observations.20 Unit prices may be collected from administrative databases, scientific literature, vendors, and/or costing manuals (eg,21).
Health Care Costs
Ideally, all health care service use is measured to reduce the likelihood that (unexpected) shifts in health care utilization rates are missed. Although this approach will increase the validity of the results, it may not always be feasible. An alternative strategy is to limit data collection to those health care services that are related to the alternatives and/or condition under study.12 A description of the care path for the condition under study might provide researchers with a clear picture of what those health care services are. In all cases, care should be taken to include the most important cost drivers.
Health care utilization can be measured through various means, including retrospective questionnaires, prospective resource use diaries (ie, cost diaries), and insurance or hospital databases. Databases, however, may not always contain all required data, and their validity and reliability may not be very high.10 Moreover, health care costs borne by participants (eg, copayments, over-the-counter medication) are typically not included in these databases. Therefore, researchers are often dependent on self-report data to measure these health care utilization items. To value health care utilization, unit prices may be either estimated using a micro-costing approach or based on predefined price weights, prices according to professional organizations, or tariffs. Typically, several methods are used simultaneously.10,19
Productivity Costs
For employers, an important benefit of OHS interventions is the resulting change in productivity loss. Productivity loss can be defined as the company's output loss corresponding to reduced labor input (ie, time and efforts/skills of the workforce). According to this definition, to value productivity loss is to value the output loss.22 Unfortunately, however, the true impact of reduced labor input on a company's output is often impossible to measure objectively. Therefore, researchers typically use proxies of productivity loss, which are often estimated using (self-reported) data on the participants' level of absenteeism (ie, sickness absence) and/or presenteeism (ie, reduced performance while at work). The methodologies used for measuring and valuing absenteeism and presenteeism are a fiercely debated topic in the field of economic evaluations. Later, a brief description of the most frequently used methods is provided. For more information about the main debates and developments regarding the identification, measurement, and valuation of productivity, we refer to other publications.22,23
The two main methods for estimating absenteeism costs are the Human Capital Approach (HCA) and the Friction Cost Approach (FCA). For both methods, the number of sickness absence days has to be collected, for which administrative databases, self-report (questionnaires), or reports by others can be used.9 For the FCA, it is also important to identify the number and duration of different absence periods.
According to the HCA, absenteeism costs are equal to the amount of money participants would have earned had they not been injured or ill.4,21 Therefore, in the HCA, sickness absence days are typically valued using actual wage rates of participants (including employment overheads and benefits) and represent losses for the entire duration of absence.1,19,24 It is argued that the HCA overestimates the true societal cost of sickness absence, as the possible replacement of workers with long-term sickness absence is not taken into account.1,4 Therefore, the FCA was developed, in which production losses are assumed to be confined to the time-span companies need to replace a sick worker by a formerly unemployed person to restore the company's initial production level (ie, friction period).23 In the FCA, absenteeism is typically valued using age-, gender- and/or education-specific price weights.25 The length of the friction period depends on the state (ie, the unemployment rate) and efficiency of the labor market. As such, friction periods typically differ between countries and should be estimated per country separately.1 If there are important changes in the economic climate, it may be necessary to estimate the friction period anew. In the Netherlands, a friction period of 23 weeks is currently assumed.21 Thus, if a sickness absence period exceeds 23 weeks, absenteeism costs are truncated at the costs of 23 weeks. Furthermore, as a reduction of labor input is often assumed to cause a less than proportional reduction in productivity, Koopmanschap et al25 also proposed the application of an elasticity factor of 0.8, which is often used in economic evaluations that apply the FCA. 
This elasticity factor implies that a 100% loss of labor input corresponds with an 80% reduction in productivity.25
In the economic evaluation literature, the need to consider presenteeism as a component of the costs incurred from productivity loss is increasingly being recognized.9 Presenteeism is typically estimated using participant self-report or report by others. For this purpose, various instruments have been developed, including both generic26–29 and disease-specific questionnaires.30,31 Most of these questionnaires measure work performance in terms of points, percentages, or proportions.32 These responses can then be used to estimate the total number of working days lost due to presenteeism by using the following equation:
$$P = (E - A) \times p$$
where P is full working days lost because of presenteeism, E is total working days, A is sickness absence days, and p is the proportion of lost work performance estimated by the instrument used in the study.22 To value the number of lost working days due to presenteeism, actual wage rates of participants, or age-, gender-, and/or job-specific price weights can be used. Researchers should be aware, however, that the estimated number of work days lost because of presenteeism may vary widely between instruments. This suggests a lack of comparability among instruments, but it is still unclear which instrument provides the best presenteeism estimate.22 Given its significance, however, ignoring presenteeism may lead to severe underestimations.22 Therefore, researchers are recommended to include this resource use category whenever possible. To assess the possible influence of the choice of instrument, sensitivity analyses can be performed (see later).
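The absenteeism and presenteeism calculations described in this section can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the 23-week friction period is expressed as 115 working days on the assumption of a 5-day working week, and the daily prices and day counts are made up.

```python
def hca_absenteeism_cost(absence_days, daily_wage):
    """Human Capital Approach: value every absence day for the full duration."""
    return absence_days * daily_wage


def fca_absenteeism_cost(absence_periods, daily_price,
                         friction_days=115, elasticity=0.8):
    """Friction Cost Approach.

    absence_periods: length in working days of each separate absence period.
    friction_days:   assumed friction period (23 weeks * 5 working days).
    elasticity:      0.8, the factor proposed by Koopmanschap et al.
    """
    # Each absence period is truncated at the friction period.
    counted_days = sum(min(days, friction_days) for days in absence_periods)
    return counted_days * daily_price * elasticity


def presenteeism_days(total_working_days, absence_days, lost_performance):
    """Working days lost to presenteeism: (E - A) * p."""
    return (total_working_days - absence_days) * lost_performance


# Made-up example: two absence periods of 10 and 150 working days.
periods = [10, 150]
print("HCA:", hca_absenteeism_cost(sum(periods), daily_wage=200.0))
print("FCA:", fca_absenteeism_cost(periods, daily_price=200.0))
print("Presenteeism days:", presenteeism_days(220, 20, 0.10))
```

With these inputs the HCA values all 160 absence days, whereas the FCA counts only 125 days (10 plus the 115-day cap) and then applies the elasticity factor, which shows why the two approaches can diverge sharply for long-term absence.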
Workers' Compensation Costs
Workers' compensation is an insurance program, offered in some countries (eg, Canada, the United States), through which workers may receive wage replacement and/or medical benefits in the event of an occupational injury or disease. Funding usually comes from premiums paid by employers.8 To estimate workers' compensation costs, total claim costs per participant can be obtained from company and/or workplace insurance records. It is generally inadequate, however, to use workers' compensation costs as the sole cost category, as they do not reflect the full extent of work-related injuries and illnesses.4 Many compensable injuries and illnesses go unreported and others are not compensable.4 When supplementing health care and/or productivity costs with workers' compensation costs, double counting should be avoided. Also, insurance premium-related wage replacement benefits should be excluded for the societal perspective, as they constitute “transfer payments” from the employer via the insurer to the worker rather than depleted sources.1,4
Identification, Measurement, and Valuation of Outcomes
As noted previously, CEAs have the key outcome measured in natural units. The most appropriate outcome used for this purpose depends on the nature of the alternatives being compared, the condition under study, and/or the applied perspective. Sometimes, there may be some concern about whether the chosen outcome captures all relevant consequences. If this is a concern, it is advisable to conduct multiple CEAs using different outcomes.8 In CUAs, the key outcome is measured in utility units, generally known as QALYs. They capture both the duration of survival and health-related quality of life in a single measure.1,12,14 An advantage of QALYs is that they provide a general index score that allows decision makers to compare the consequences of a range of interventions for different health issues.1,10 Nevertheless, even though QALYs are the preferred outcome measure when health care interventions for patients are evaluated from the societal perspective,13,21,33 they have not yet been frequently used in economic evaluations of OHS interventions.4,6,7,34 This may be due to the fact that QALYs may not reflect what occupational health decision makers feel is most important in terms of outcomes. In the case of a workplace safety program, for example, outcomes such as worker safety may be more meaningful to decision makers than a utility-weighted health measure.11 Moreover, occupational health decision makers are generally unfamiliar with QALYs, and QALYs seem to lack sensitivity to mild conditions that are often the focus of OHS interventions (eg, of worksite health promotion programs).35 Therefore, more sensitive utility measures are warranted for economic evaluations of OHS interventions and/or utility measures that are more applicable to the occupational health setting, for example, the recently conceptualized “Disease-Adjusted Working Years,” which aims to express the amount of working years lost because of poor working conditions and associated illness.36,37
ANALYSIS OF AN ECONOMIC EVALUATION
Later, we discuss some important issues in the analysis of trial-based economic evaluations. To illustrate some of them, data are used from an economic evaluation that was previously performed alongside a 12-month pragmatic RCT, in which construction workers at risk for cardiovascular disease either received a lifestyle intervention or usual practice. A CEA in terms of kilogram body weight loss was performed from the societal perspective and a CBA from that of the employer. Resource use categories included intervention, health care, absenteeism, and sports costs and were expressed in 2008 Euros. More detailed information about this trial-based economic evaluation can be found elsewhere.38
Sample Size
Ideally, economic outcomes are used in the sample size calculation of a trial.13 Nevertheless, although various techniques have been proposed to estimate the appropriate sample size for economic endpoints,39–42 sample size calculations are typically performed on the basis of primary outcomes.10,13,14 This is due to the fact that cost data are right skewed and therefore require larger sample sizes to detect relevant differences than (health) outcome data. A large sample size may be neither feasible nor ethically acceptable.14,43 Also, a large number of parameters have to be specified to perform sample size calculations for economic endpoints (eg, variance parameters of effectiveness measures, cost measures, incremental cost-effectiveness ratios [ICER]), many of which are hard to predict a priori.39,41,42 Consequently, trial-based economic evaluations are typically underpowered for economic outcomes.10 Low-powered studies have imprecise and uncertain cost estimates and should be interpreted with caution.43 Moreover, if studies are likely to be underpowered, researchers are recommended to use estimation rather than hypothesis testing (ie, by using confidence intervals rather than P values).1
Adjusting for Differential Timing
Interventions may have different time profiles of costs and consequences. Within occupational health, intervention costs are generally incurred immediately, while consequences such as productivity costs might extend into the future.44 Two types of adjustments should be made to account for these differences in timing. The first concerns the adjustment of cost data for inflation, that is, “the general upward price movement of goods and services.”12 Because of inflation, prices drawn from different years are generally not comparable.8 All prices should, therefore, be adjusted to the same reference year using consumer price indices and the applied reference year should be stated explicitly.17 The second adjustment concerns the adjustment of cost and outcome data for time preferences of individuals when they are collected over a period of more than 1 year.12 Even within a world with zero inflation, individuals have a preference for receiving benefits today rather than in the future.1 Therefore, costs and consequences incurred in different years have to be discounted at some rate to estimate their present value.44 The appropriate discount rate depends on the borrowing cost of money and other contextual factors. Guidelines for discount rates used in public sector projects are provided by some jurisdictions. For example, in the Netherlands, cost data should be discounted at 4% and health outcomes at 1.5%, while both should be discounted at 3.5% in the United Kingdom.21,33
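The two adjustments can be sketched in Python as follows. The function names, the price index values, and the yearly amounts are hypothetical. The sketch also assumes, as one common convention, that amounts incurred in the first year of follow-up are left undiscounted; other conventions exist and the applicable guideline should be checked.

```python
def to_reference_year(cost, index_cost_year, index_reference_year):
    """Adjust a cost for inflation using consumer price index values."""
    return cost * index_reference_year / index_cost_year


def present_value(yearly_amounts, rate):
    """Discount a stream of yearly costs or effects to its present value.

    yearly_amounts[0] is incurred in year 1 and left undiscounted
    (assumption); yearly_amounts[t] is divided by (1 + rate) ** t.
    """
    return sum(amount / (1 + rate) ** t
               for t, amount in enumerate(yearly_amounts))


# Made-up price index values: a cost of 500 moved to the reference year.
print(to_reference_year(500.0, index_cost_year=100.0, index_reference_year=105.0))

# Dutch rates cited in the text: 4% for costs, 1.5% for health outcomes.
costs_per_year = [1000.0, 1000.0]   # hypothetical costs in years 1 and 2
effects_per_year = [0.8, 0.8]       # hypothetical QALYs in years 1 and 2
print(round(present_value(costs_per_year, 0.04), 2))
print(round(present_value(effects_per_year, 0.015), 4))
```

Passing the discount rate as a parameter makes it straightforward to rerun the analysis with another jurisdiction's rates, for example 3.5% for both costs and outcomes.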
Intention-to-Treat and Missing Data
Guidelines for conducting trials prescribe that all participants should be included in the analyses, all retained in the group to which they were allocated (ie, intention-to-treat analysis).45 Nevertheless, true intention-to-treat analyses are often hampered by missing data, which are generally inevitable in trials. For economic evaluations, this problem is even more pronounced, because total costs are typically the sum of numerous cost components. As such, cost data will already be incomplete if one component is missing.13 Missing data itself may have no relation to observed and unobserved factors among participants (MCAR: missing completely at random), may only have a relationship to observed factors (MAR: missing at random), or may also have a relationship to unobserved factors (MNAR: missing not at random) (see Box 1 for a more detailed description).46 Historically, complete-case analyses (ie, eliminating cases with missing data) were used to deal with missing data and this is still an often-used approach in trial-based economic evaluations.47 Nevertheless, complete-case analyses reduce the power of a study and lead to biased estimates if missing data are not MCAR.12,13 If the rate of missing data is smaller than 5%, complete-case analyses may be considered. If more than 5% of data are missing, researchers should use imputation techniques to fill in missing values. Nowadays, multiple imputation is generally recommended to impute missing data.13,14 When using multiple imputation, multivariate regression techniques are used to predict missing values on the basis of observed factors.12,14 To account for the uncertainty about the missing data, several different imputed data sets are created.46 As a rule of thumb, White et al48 suggested that the number of data sets should at least be equal to the percentage of incomplete cases. 
The imputed data sets are subsequently analyzed separately to obtain a set of parameter estimates, which can then be pooled using Rubin's rules to obtain overall estimates, variances, and 95% confidence intervals (95% CIs).46,48,49 Multiple imputation leads to unbiased estimates if missing data are MAR.12 Researchers should bear in mind, however, that cost and consequence estimates derived using multiple imputation are less reliable and precise than those based on a 100% complete data set.14 Every endeavor should, therefore, be made to minimize the amount of missing data.
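The pooling step can be illustrated with a minimal Python sketch of Rubin's rules for a single parameter (for example, a mean cost difference). The function names are hypothetical, and the sketch stops at the pooled estimate and its total variance; constructing the 95% CI additionally requires a t reference distribution with adjusted degrees of freedom, which is omitted here.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed data sets using Rubin's rules.

    estimates: point estimate obtained from each imputed data set
    variances: squared standard error obtained from each imputed data set
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputed data sets of matching length")
    pooled = mean(estimates)
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


def min_imputations(n_incomplete, n_total):
    """Rule of thumb: at least as many data sets as the percentage of incomplete cases."""
    return math.ceil(100 * n_incomplete / n_total)


# Hypothetical estimates from three imputed data sets.
estimate, total_variance = pool_rubin([10.0, 12.0, 14.0], [4.0, 4.0, 4.0])
print(estimate, round(math.sqrt(total_variance), 3), min_imputations(23, 100))
```

The between-imputation term is what carries the extra uncertainty due to the missing data, which is why pooled estimates are less precise than those from a complete data set.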
BOX 1.
Types of Missing Data46
Incremental Analysis of Costs and Consequences
After costs and consequences have been quantified, their mean differences between the intervention and control group(s) as well as the statistical significance of these differences need to be assessed.12 As mentioned previously, cost data are typically right skewed. This is because only a small proportion of participants incur high costs, while costs are naturally bounded below by zero (see Fig. 1).1
FIGURE 1.
Distribution of the societal costs per participant to a trial-based economic evaluation of a lifestyle intervention for construction workers at risk for cardiovascular disease compared to usual practice.38
The skewed cost distribution complicates the analysis of cost data, as it violates the assumptions of standard statistical tests, such as independent t tests and linear regression analyses. A standard approach to describing skewed data is to summarize the distribution by its median. Nevertheless, this is inappropriate for cost data, as decision makers need to be able to estimate the total cost of implementing a new intervention (total implementation costs = mean costs per participant × the number of participants). As such, the arithmetic mean is generally viewed as the most informative measure to describe cost data.1,14,50 Various methods are currently used to compare cost data between study arms, including standard nonparametric tests (eg, Mann–Whitney U test), t tests on log-transformed data, and nonparametric bootstrapping. Standard nonparametric tests compare the distribution of the data instead of means and are therefore inappropriate. Transformations to normalize the distribution are not straightforward and are often sensitive to departures from distributional assumptions.13 Moreover, back-transformations are often complicated. Therefore, researchers increasingly favor the nonparametric bootstrap,13,50 which can be used to estimate 95% CIs around mean cost differences while avoiding distributional assumptions (Box 2).51
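A minimal sketch of the nonparametric bootstrap for a mean cost difference is given here. It is an illustration under stated assumptions rather than the procedure of Box 2: the function name and the cost data are hypothetical, participants are resampled with replacement within each study arm, and the simple percentile method is used for the 95% CI instead of the bias corrected and accelerated method.

```python
import random
from statistics import mean


def bootstrap_mean_difference(intervention, control, n_boot=2000, seed=1):
    """Percentile bootstrap 95% CI for the difference in mean costs.

    Returns (observed_difference, lower_bound, upper_bound).
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample participants with replacement within each study arm.
        boot_i = rng.choices(intervention, k=len(intervention))
        boot_c = rng.choices(control, k=len(control))
        diffs.append(mean(boot_i) - mean(boot_c))
    diffs.sort()
    lower = diffs[round(0.025 * n_boot)]
    upper = diffs[round(0.975 * n_boot) - 1]
    observed = mean(intervention) - mean(control)
    return observed, lower, upper


# Hypothetical right-skewed costs per participant.
costs_intervention = [120.0, 80.0, 0.0, 45.0, 3000.0, 150.0, 60.0, 0.0, 900.0, 200.0]
costs_control = [100.0, 0.0, 30.0, 75.0, 50.0, 2500.0, 0.0, 400.0, 90.0, 10.0]
print(bootstrap_mean_difference(costs_intervention, costs_control))
```

Because the statistic being resampled is the arithmetic mean, the interval refers to the quantity decision makers need, and no transformation of the cost data is required.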
Comparing Incremental Costs and Consequences
The core of any economic evaluation is the analysis of the relation between the costs and consequences of alternatives. The preferred methods for conducting such analyses differ between the types of economic evaluations and are discussed later.
CEA and CUA
In CEAs and CUAs, an ICER is calculated by dividing the mean difference in cost (Δ Cost) between study arms by that in effect (Δ Effect). The ICER indicates the additional costs of a new intervention in comparison with a control condition per unit of effect gained.1,12 To illustrate, a description of the calculation and interpretation of the example trial's ICER is provided in Box 3.
BOX 3.
Calculation and Interpretation of the Incremental Cost-Effectiveness Ratio (ICER) of a Lifestyle Intervention for Construction Workers at Risk for Cardiovascular Disease Compared to Usual Practice38
Incremental cost-effectiveness ratios are generally hard to interpret. For example, negative ICERs might represent reduced costs and positive effects, indicating a win–win situation, or increased costs and negative effects, indicating a lose–lose situation.14 Therefore, ICERs are often graphically illustrated on cost-effectiveness planes (CE-planes), in which incremental effects are plotted on the x axis and incremental costs on the y axis (Fig. 2).54,55
FIGURE 2.
Cost-effectiveness plane.
If an ICER is located either in the South East Quadrant (SE-Q) or in the North West Quadrant (NW-Q), the choice between alternatives is clear (assuming that there is no uncertainty surrounding the ICER). In the SE-Q, the new intervention is more effective and less costly than the control condition and is therefore said to dominate the control condition. In the NW-Q, the opposite is true and the new intervention is dominated by the control condition. If a new intervention is more effective and more costly (NE-Q: North East Quadrant) or less effective and less costly (SW-Q: South West Quadrant), the decision whether or not to adopt it depends on the so-called “willingness-to-pay” (λ), that is, the maximum amount of money decision makers are willing to pay for an additional unit of effect.1 To illustrate, a hypothesized λ is depicted as the diagonal line in Figure 2, which divides the CE-plane into a cost-effective and a non–cost-effective half. Incremental cost-effectiveness ratios located to the right of this line are considered acceptable, whereas ICERs located to the left are considered unacceptable.14,54,55 The more decision makers are willing to pay for an additional unit of effect, the steeper the slope of this line.14
With participant-level data, it is natural to consider representing the uncertainty surrounding ICERs using 95% CIs. Nevertheless, because the ICER is a ratio measure, estimating 95% CIs around it is not straightforward and, more importantly, 95% CIs around ICERs suffer from the same interpretation problem as ICERs themselves.55 Therefore, alternative methods have been proposed to estimate the uncertainty surrounding ICERs. Current guidelines recommend using the bootstrap method described in Box 2. In this case, both incremental costs and effects are calculated per bootstrap sample. The uncertainty surrounding an ICER can then be graphically illustrated by plotting these bootstrapped incremental cost-effect pairs (CE-pairs) on a CE-plane.
As indicated by the example trial's CE-plane provided in Figure 3, CE pairs commonly cover more than one quadrant.
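The calculation of an ICER, the quadrant of the CE-plane in which a cost-effect pair falls, and the generation of bootstrapped CE-pairs can be sketched as follows. All function names and data are hypothetical. The sketch assumes that each participant's cost and effect are resampled together so that their correlation is preserved, and it assigns differences of exactly zero to the south or west side purely as a tie-breaking convention.

```python
import random
from collections import Counter
from statistics import mean


def icer(delta_cost, delta_effect):
    """Incremental cost per additional unit of effect."""
    if delta_effect == 0:
        raise ZeroDivisionError("ICER is undefined when the effect difference is zero")
    return delta_cost / delta_effect


def quadrant(delta_cost, delta_effect):
    """Quadrant of the CE-plane (effects on the x axis, costs on the y axis)."""
    north_south = "N" if delta_cost > 0 else "S"
    east_west = "E" if delta_effect > 0 else "W"
    return north_south + east_west


def bootstrap_ce_pairs(intervention, control, n_boot=2000, seed=1):
    """Bootstrapped (delta_cost, delta_effect) pairs.

    Each arm is a list of (cost, effect) tuples, one tuple per participant.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_boot):
        boot_i = rng.choices(intervention, k=len(intervention))
        boot_c = rng.choices(control, k=len(control))
        d_cost = mean([cost for cost, _effect in boot_i]) - mean([cost for cost, _effect in boot_c])
        d_effect = mean([effect for _cost, effect in boot_i]) - mean([effect for _cost, effect in boot_c])
        pairs.append((d_cost, d_effect))
    return pairs


def quadrant_shares(pairs):
    """Proportion of bootstrapped CE-pairs located in each quadrant."""
    counts = Counter(quadrant(d_cost, d_effect) for d_cost, d_effect in pairs)
    return {q: counts[q] / len(pairs) for q in ("NE", "SE", "SW", "NW")}


# Hypothetical (cost, effect) data per participant.
arm_intervention = [(500.0, 2.0), (100.0, 1.0), (3000.0, 3.0), (0.0, 0.5)]
arm_control = [(400.0, 1.0), (50.0, 0.0), (2000.0, 1.5), (0.0, 0.0)]
print(quadrant_shares(bootstrap_ce_pairs(arm_intervention, arm_control)))
```

Reporting the share of CE-pairs per quadrant alongside the plotted plane gives readers a numerical summary of how the uncertainty is distributed.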
BOX 2.
Nonparametric Bootstrapping
FIGURE 3.
Cost-effectiveness plane for a lifestyle intervention for construction workers at risk for cardiovascular disease compared to usual practice.38 Abbreviations: ICER, incremental cost-effectiveness ratio; NE-Q, North East Quadrant; NW-Q, North West Quadrant; SE-Q, South East Quadrant; SW-Q, South West Quadrant.
Although CE planes give a good impression of the uncertainty surrounding the ICER, they do not provide a summary measure of the joint uncertainty of costs and effects.56 Therefore, cost-effectiveness acceptability curves (CEACs) were introduced that provide insight into the probability that a new intervention is cost-effective compared to the control condition. This probability can be estimated by determining what proportion of CE pairs is located in the cost-effective half of the CE plane (ie, to the right of the previously mentioned line with the slope equal to λ) (Fig. 2). Because it is generally unknown what decision makers are willing to pay for an additional unit of effect, λ is varied between its natural bounds (range: 0 to ∞) and the probability that the new intervention is cost-effective compared with the control condition is estimated for a range of λs. These values can then be plotted on CEACs that show the probability of cost-effectiveness (y axis) for various λs (x axis).55–57 To illustrate, the CEAC of the example trial is provided in Figure 4.
FIGURE 4.
Cost-effectiveness acceptability curve for a lifestyle intervention for construction workers at risk for cardiovascular disease compared to usual practice.38 This cost-effectiveness acceptability curve corresponds with the cost-effectiveness plane in Figure 3 and indicates the probability of cost-effectiveness for different values of willingness-to-pay per kilogram body weight loss.
This CEAC indicates that if decision makers are not willing to pay anything to obtain an additional kilogram body weight loss (ie, λ = 0), there is a 0.33 probability that the new intervention is cost-effective compared to the control condition. If decision makers are willing to pay €2000 (ie, λ = 2000), this probability is 0.95. When interpreting CEACs, two approaches can be used by decision makers. If their willingness to pay is known, they have to judge whether the probability of cost-effectiveness at this ceiling ratio is acceptable. If their willingness to pay is unknown, they should consider whether the ceiling ratio at an acceptable probability of cost-effectiveness is acceptable to them. The latter might depend on the scale of the outcome measure and the prevalence of the condition under study.
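The points of a CEAC can be derived from bootstrapped CE-pairs with a short Python sketch. The names and the four CE-pairs are hypothetical. A pair is counted as cost-effective when it lies to the right of the line through the origin with slope λ, which is taken here to mean that Δ Cost is smaller than λ × Δ Effect.

```python
def probability_cost_effective(ce_pairs, wtp):
    """Share of (delta_cost, delta_effect) pairs with delta_cost < wtp * delta_effect."""
    favourable = sum(1 for d_cost, d_effect in ce_pairs
                     if d_cost < wtp * d_effect)
    return favourable / len(ce_pairs)


def ceac(ce_pairs, wtp_values):
    """Points of the acceptability curve: (willingness-to-pay, probability)."""
    return [(wtp, probability_cost_effective(ce_pairs, wtp)) for wtp in wtp_values]


# Hypothetical CE-pairs; in practice these come from the bootstrap.
example_pairs = [(-100.0, 1.0), (100.0, 1.0), (300.0, 1.0), (100.0, -1.0)]
for wtp, probability in ceac(example_pairs, [0.0, 200.0, 400.0]):
    print(wtp, probability)
```

At λ = 0 only pairs with cost savings count as cost-effective, which is why the curve does not necessarily start at zero.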
CBA
In health economics and business administration, various measures exist for comparing costs and benefits. Of them, the net benefits (NBs), benefit cost ratio (BCR), and ROI are the most frequently used measures in occupational health research and can be estimated using the following equations6:
NB = Benefits − Costs
BCR = Benefits / Costs
ROI = ((Benefits − Costs) / Costs) × 100
where Costs are defined as intervention costs and Benefits as the difference in monetized outcomes between the intervention group and the control group (eg, difference in productivity costs). Benefits are estimated by subtracting the mean expenses incurred by the intervention group participants from those of the control group. Hereby, positive benefits indicate reduced spending. The NB indicates the amount of money gained after costs are recovered (ie, net loss or net savings). The BCR indicates the amount of money returned per monetary unit invested. The ROI indicates the percentage of profit per monetary unit invested.58,59 Interventions can be regarded as cost saving if the following criteria are met: NB > 0, BCR > 1, and ROI > 0%. To illustrate, a description of the calculation and interpretation of the example trial's cost–benefit estimates is provided in Box 4.
BOX 4.
Calculation and Interpretation of the Cost–Benefit Estimates of a Lifestyle Intervention for Construction Workers at Risk for Cardiovascular Disease in Comparison to Usual Practice38
Cost–benefit estimates, and BCRs and ROIs in particular, are typically presented without an indication of their uncertainty. If uncertainty is substantial and this is not taken into account, wrong conclusions could be drawn. Therefore, we recommend the use of the previously described bootstrap method (Box 2) to estimate the uncertainty surrounding cost–benefit estimates. In this case, the NB, BCR, and/or ROI are calculated per bootstrap sample. Subsequently, 95% CIs can be estimated using the bias corrected and accelerated method.51,53 Although BCRs and ROIs are ratio measures, estimating their 95% CIs is straightforward as the denominator (ie, intervention costs) is typically positive. Many occupational health decision makers, however, may lack the necessary statistical background to interpret 95% CIs.11 A possible way to deal with this issue is to estimate the proportion of NBs, BCRs, and/or ROIs that indicate cost savings (ie, “the probability of financial return”). Occupational health decision makers can subsequently use this information to consider whether the established probability of financial return is acceptable to them.
When reporting CBA results, economists and policymakers prefer the NB, whereas the BCR and ROI are more familiar to business managers. As such, it is advisable to report at least two of them (ie, NB and BCR/ROI), so that the results can be easily interpreted by all stakeholders. Another advantage of this approach is that it makes the results easily comparable with those of other studies, because different metrics are used in the literature to estimate whether OHS interventions generate cost savings.6
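The three cost–benefit measures and the probability of financial return can be sketched as follows. The sketch assumes the usual definitions that match the verbal descriptions above (NB as benefits minus costs, BCR as benefits divided by costs, and ROI as NB divided by costs, expressed as a percentage); the function names and the bootstrapped values are hypothetical.

```python
def cost_benefit_measures(benefits, costs):
    """Return (NB, BCR, ROI in %) for monetized benefits and intervention costs."""
    if costs <= 0:
        raise ValueError("intervention costs must be positive")
    net_benefit = benefits - costs
    benefit_cost_ratio = benefits / costs
    return_on_investment = (benefits - costs) / costs * 100
    return net_benefit, benefit_cost_ratio, return_on_investment


def probability_of_financial_return(bootstrap_samples):
    """Share of bootstrap samples that indicate cost savings.

    bootstrap_samples: list of (benefits, costs) tuples, one per bootstrap sample.
    With positive costs, NB > 0 is equivalent to BCR > 1 and ROI > 0%.
    """
    savings = sum(1 for benefits, costs in bootstrap_samples
                  if cost_benefit_measures(benefits, costs)[0] > 0)
    return savings / len(bootstrap_samples)


# Hypothetical values per participant, for illustration only.
print(cost_benefit_measures(150.0, 100.0))
print(probability_of_financial_return(
    [(150.0, 100.0), (80.0, 100.0), (120.0, 100.0), (100.0, 100.0)]))
```

A sample with benefits exactly equal to costs is not counted as a financial return here, in line with the strict inequalities in the criteria.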
Sensitivity Analysis
Economic evaluations are often conducted in the context of incomplete information and uncertainty. This necessitates the use of proxy measures and, invariably, assumptions about the methods and unit prices used for valuing resource use, the methods used for dealing with incomplete data, and the way in which adjustments are made for differential timing.4,8 Therefore, sensitivity analyses should be undertaken to assess how study results would change for different key assumptions and parameter values (ie, the robustness of study results).17,60 The ranges of values tested, and arguments for selecting these ranges, must be clearly described.10,17 Various approaches to sensitivity analyses exist, including one-way, multiway, and probabilistic sensitivity analysis. One-way sensitivity analyses assess the impact of changes to a single parameter at a time, while multiple parameters are varied simultaneously in multiway sensitivity analyses.61 These methods may indicate parameter values for which results could change, but do not provide an indication of the combined impact of the uncertainty surrounding these parameters.60 The latter could be modeled using probabilistic sensitivity analyses.62
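A one-way sensitivity analysis can be sketched as a loop that re-runs the analysis while a single parameter is varied and all others are held at their base-case values. The model below is a deliberately simple, hypothetical net benefit calculation, and the parameter names and values are assumptions chosen for illustration.

```python
def one_way_sensitivity(model, base_params, name, values):
    """Vary one parameter over `values`, holding the others at base case."""
    results = []
    for value in values:
        params = dict(base_params)   # copy so the base case stays untouched
        params[name] = value
        results.append((value, model(**params)))
    return results


def net_benefit(intervention_cost, annual_benefit, years, discount_rate):
    """Hypothetical model: discounted benefits minus up-front intervention costs."""
    discounted = sum(annual_benefit / (1 + discount_rate) ** year
                     for year in range(years))
    return discounted - intervention_cost


base_case = {"intervention_cost": 1000.0, "annual_benefit": 400.0,
             "years": 3, "discount_rate": 0.04}
for rate, result in one_way_sensitivity(net_benefit, base_case,
                                        "discount_rate", [0.0, 0.04, 0.10]):
    print(rate, round(result, 2))
```

A multiway analysis would change several entries of the parameter dictionary at once, while a probabilistic analysis would draw them from assumed distributions.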
CONCLUDING REMARKS
Resources for occupational health are scarce. This makes it necessary for decision makers to have information on the relative efficiency of OHS interventions to allocate available resources to their best use. As such, economic evaluations of OHS interventions are becoming increasingly important, many of which are conducted alongside effectiveness trials. Trial-based economic evaluations provide a unique opportunity to reliably estimate the resource implications of OHS interventions at low incremental cost.10,14 Nevertheless, it is critical that high-quality trial-based economic evaluations are performed when this information is used to inform allocation decisions.
Designing a high-quality trial-based economic evaluation requires close collaboration between occupational health specialists, individuals executing the trial, and health economists.14 Careful considerations must be made regarding the perspective, the analytic time frame, the identification, measurement, and valuation of resource use and outcomes, as well as the methods used for calculating sample sizes, comparing costs and consequences, and handling missing data and uncertainty. The latter is of particular importance, as few economic evaluations in occupational health report on the uncertainty surrounding their incremental cost-consequence estimates.4–7,15 Failing to estimate values under uncertainty makes it impossible to determine the certainty of results and could thus lead to inappropriate decision making.
To quantify precision, nonparametric bootstrapping can be used as a statistical technique for dealing with the right skewed nature of cost data.1,7 An overview of our core recommendations for trial-based economic evaluations in occupational health can be found in the Appendix.
Trial-based economic evaluations may also have shortcomings, including limited sample sizes, limited comparators, and truncated time horizons.14 To deal with the latter, researchers might consider extrapolating economic evaluation results beyond the follow-up of a trial by using decision analytic modeling, in which expected costs and consequences between alternatives are compared by synthesizing information from multiple sources (eg, scientific literature, study results).1,13,14 For more detailed information about decision analytic modeling, we refer to other publications.14,63 Also, even though we recommend a pragmatic (cluster-)RCT design for economic evaluations, we are aware that randomization itself may not always be feasible and/or desired in the occupational health setting. In those cases, well-executed nonrandomized studies may provide valuable information, but it is critical that efforts be made to control for selection bias (eg, by using propensity score matching).64,65
When interpreting economic evaluations of OHS interventions, it is important to bear in mind that their results may not be directly applicable to other countries and jurisdictions due to differences in health care, social security systems, and other factors. Verbeek et al66 demonstrated that economic evaluation results can be generalized from one country to another.
Nevertheless, to enable the necessary calculations, researchers need to provide an extensive description of the intervention, a detailed list of resource use, as well as information on the health care system in the original study and the allocation of costs to various stakeholders.66
By simultaneously providing recommendations for good practice in the economic evaluation of OHS interventions and discussing the methods and principles that underlie them, this study aimed to help researchers in conducting and reporting high-quality trial-based economic evaluations. Such studies are expected to contribute to the development of a sound evidence base on the resource implications of OHS interventions,3,4 which is a necessary prerequisite for evidence-based practice in occupational health.11 The present article may also help consumers of this literature in understanding and critically appraising trial-based economic evaluations of OHS interventions, which might help improve the uptake of their results.