What's Measured Is Not Necessarily What Matters: A Cautionary Story from Public Health.

Abstract

A systematic review of the introduction and use of outcome-based performance management systems for public health organizations found differences between their use for management (which requires rigorous definition and measurement to allow comparison across organizational units) and their use for improvement (which may require more flexibility). What is included in performance measurement/management systems is influenced by ease of measurement, data quality, the ability of the organization to control outcomes, the ability to measure success in terms of doing things (rather than preventing things) and what is already happening. To the extent that most providers wish to do a good job, the availability of good data to enable benchmarking and improvement is an important step forward. However, to the extent that the health of a population depends on multiple factors, many beyond the mandate of the health system, too extensive a reliance on performance measurement may risk the unintended consequence of marginalizing critical activities.

Year: 2016 PMID: 28032824 PMCID: PMC5221711

Source DB: PubMed Journal: Healthc Policy ISSN: 1715-6572

Introduction

The New Public Management has been associated with an increased emphasis on measuring performance, often summarized in the phrase “What's measured is what matters.” A growing literature has found potential limitations to this view (Bevan and Hood 2006; Exworthy 2010; Kuhlmann 2010). This manuscript, which grew from a synthesis of the literature on performance measurement and management in public health, presents a conceptual framework for viewing performance measurement and suggests an additional set of risks inherent in overreliance on these approaches.

Materials and Methods

Literature search

We adapted the approach to literature review of Pawson et al. (2005), which recognizes that much of the analysis will, of necessity, be thematic and interpretive (Dixon-Woods et al. 2005; Pawson 2002), including use of cross-case analysis (Mays et al. 2005; Pope et al. 2006). As the ESRC UK Centre for Evidence Based Policy has noted, social science reviews differ from the medical template in that they rely on a “more diverse pattern of knowledge production,” including books and grey literature (Grayson and Gomersall 2003). Our search strategy included multiple sources. We began with 213 references provided by our KT partner, the Public Health Practice Branch of the Ontario Ministry of Health and Long-Term Care. To capture published and grey literature, we searched such databases as PubMed, Web of Science and Google Scholar; these sites tend to capture different literatures, and thus helped ensure that key references were not missed. We used such keywords as indicators, accreditation, balanced scorecard, evidence-based public health, local public health, performance measurement, performance standards and public health management, alone and in combination. We also searched relevant websites, both for the selected jurisdictions and for the papers and reports produced by the World Health Organization (WHO), Organisation for Economic Co-operation and Development (OECD) and the European Observatory on Health Systems and Policies. We then analyzed both backward and forward citation chains from key articles – that is, checking the relevant articles cited by that article (backward) and the materials citing that article (forward).
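Searching the listed keywords "alone and in combination" can be sketched in a few lines of Python. This is only an illustration: the paper does not report the exact query syntax, so the `AND` operator, the quoting of phrases and the cap on combination size (`max_terms`) are assumptions, and `build_queries` is a hypothetical helper name.

```python
from itertools import combinations

# Keywords as listed in the text.
KEYWORDS = [
    "indicators",
    "accreditation",
    "balanced scorecard",
    "evidence-based public health",
    "local public health",
    "performance measurement",
    "performance standards",
    "public health management",
]


def build_queries(keywords, max_terms=2):
    """Return one search string per keyword and per keyword combination.

    Combinations contain up to `max_terms` keywords joined with AND.
    The AND operator, the quoting and the `max_terms` cap are assumptions
    made for illustration; they are not taken from the paper.
    """
    queries = []
    for size in range(1, max_terms + 1):
        for combo in combinations(keywords, size):
            queries.append(" AND ".join(f'"{term}"' for term in combo))
    return queries


if __name__ == "__main__":
    # Print the first few generated queries.
    for query in build_queries(KEYWORDS)[:10]:
        print(query)
```

With eight keywords and pairs as the largest combination, this produces 8 single-term and 28 two-term queries; raising `max_terms` grows the list quickly, which is one reason a real search would likely be more selective.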
Other helpful sources were a US review of performance management in public health (Public Health Foundation 2009) funded by the Robert Wood Johnson Foundation, the materials on the Public Health Foundation website (available at http://www.phf.org/resourcestools/pages/turning_point_project_publications.aspx) and the proceedings of a WHO European Ministerial Conference on Health Systems, which focused on performance measurement for health system improvement (Smith et al. 2009). The abstracts were then scanned for relevance by our team. We examined the general literature and then selected literature relevant to key case examples from Australia, New Zealand, the UK, the EU, the US and Canada. Case examples were chosen by looking at the jurisdictions selected, with a focus on those that matched, corresponded to or contrasted with the Ontario Public Health Standards. This initial review yielded 970 references, which were subsequently augmented by new publications; we also deleted articles not relevant to this subject. The retained material on which this analysis is based was published between 1966 and 2015, with 13 references before 1990, 125 between 1990 and 1999 and 807 between 2000 and 2011, although we have subsequently examined additional more recent publications. Our analysis of the 55 public health measurement cases we selected has been published elsewhere (Schwartz and Deber 2016). This paper focuses on some key lessons for applying performance management and measurement approaches to public health.

Results

Defining our terms

Increasing attention is being paid to the use of information to improve performance. Much of this dialogue is couched in terms of accountability (Smith et al. 2009). There is an extensive literature from management science and from new public management on the use of performance measurement and management in both the public and private sectors (Bouckaert 1993; Freeman 2002; Julnes 2009; Kuhlmann 2010; Poister and Streib 1999). These authors place heavy emphasis on the role of organizational culture and political support in implementing change. Accountability is defined as having to be answerable to someone for meeting defined objectives (Emanuel and Emanuel 1996; Fooks and Maslove 2004; Marmor and Morone 1980). It has financial, performance and political/democratic dimensions (Brinkerhoff 2004) and can be ex ante or ex post. This may translate into fiscal accountability to payers, clinical accountability for quality of care (Dobrow et al. 2008) and/or accountability to the public. The actors involved may include various combinations of providers (public and private), patients, payers (including insurers and the legislative and executive branches of government) and regulators (governmental, professional); these actors are connected in various ways (Shortt and Macdonald 2002; Zimmerman 2005). As noted in a series of sub-studies on approaches to accountability published as a special issue of Healthcare Policy (Deber 2014), the tools for establishing and enforcing accountability are similarly varied, and they require clarifying what is meant by accountability, including specifying for what, by whom, to whom and how. Performance management and measurement are frequently suggested as important tools for improving systems of accountability. As our review clarified, the literature and the cases examined vary in how terms are defined and in the purposes of the performance measurement exercise (Solberg et al. 1997).
Underlying most of these examples is the sense that managing is difficult without measurement (Gibberd 2005). Performance measurement has been defined by the US Government Accountability Office (GAO) as “the ongoing monitoring and reporting of program accomplishments, particularly progress toward pre-established goals” (US Government Accountability Office 2005). The GAO definition notes that such activities are typically conducted by the management of the program or agency responsible for them. The GAO contrasts this with program evaluation, which is often conducted by experts external to the program, and may be periodic or ad hoc, rather than ongoing. The GAO definitions, like many performance measurement systems in healthcare, often use the framework of Donabedian, which focuses on various combinations of structures, processes, outputs and outcomes (Donabedian 1966, 1980, 1988). A number of approaches to performance measurement can be found in the literature (Abernethy et al. 2005; Adair et al. 2003, 2006a, 2006b; Arah et al. 2003; Stoto 2014; Veillard 2012). The focus of performance measurement systems can also vary, but increasing attention has been paid to using performance management as a way of improving system performance. Goals may also vary but are often aligned with quality. Published reviews of performance measurement efforts include both examination of individual countries and comparisons among OECD countries, including Canada, the US, the UK and Australia (Baker et al. 1998, 2008; Hurst 2002; Hurst and Jee-Hughes 2001; Kelley and Hurst 2006; Mattke et al. 2006; Smith 2002). Much of the literature focuses on using performance measurement to improve clinical quality of care across a variety of settings, including primary care and emergency care (Barnsley et al. 1996; Linder et al. 2009; Lindsay et al. 2002; Phillips et al. 2008). Other projects focus on using performance measurement to improve governance, often using the language of accountability.
For this to occur, ongoing data collection is important, so that management and stakeholders can use up-to-date information to monitor the quality of care being provided (Loeb 2004). One approach is to use performance indicators. Performance management, by contrast, both paves the way for and requires a performance measurement system. Many measurement systems are developed with the goal of defining where improvements can be made, with the assumption that managers can use them once the measurement results are examined (Lebas 1995). Performance management can be defined as the action of using performance measurement data to effect change within an organization to achieve predetermined goals (Folan and Browne 2005). There is now broad recognition that while public sector organizations are doing a great deal of performance measurement, they often do not use the data well in full-fledged performance management systems (Schwartz 2011). Nevertheless, there are a number of success stories in public management of using well-designed measurement systems to improve performance (Ammons 1995). Although measurement may be necessary for management, not all performance measurement systems assume that they will be used for management.

Implementing performance measurement: Goals and indicators

The first step in developing a successful performance measurement system is to clearly define what will be measured. McGlynn and Asch suggest that three considerations should be taken into account when choosing an area to measure: (1) how important the area of healthcare being measured is, (2) how much potential this area holds for quality improvement and (3) the degree to which healthcare professionals are able to control quality improvement in this area. They define importance in terms of mortality/morbidity, but also utilization of health services and cost to treat (McGlynn and Asch 1998). Again, there is likely to be variation, depending on whether one is focused on particular patient groups or on the health of the population. However, from the viewpoint of public health, these considerations point to the importance of surveillance systems to provide decision-makers with information about the prevalence of conditions, how they are being addressed and the outcomes of interventions. Which policy goals are being pursued is often left implicit. Different goals may imply different policies. Key goals are usually some combination of access, quality (including safety) (Baker et al. 2004), cost control/cost effectiveness and customer satisfaction (Monahan 2006; Myers and Lacey 1996). Behn suggests the objectives for accountability should be improved performance, fairness and financial stewardship (Behn 2001). This affects what organizations are accountable for. Often, policy goals may clash (Deber et al. 2004). An ongoing issue is the potential for unintended consequences if the measures selected do not reflect the full set of policy goals (Townley 2005). Indeed, one of the purposes of balanced scorecards is to make such potential conflicts between goals and measures more evident (Baker and Pink 1995; Kaplan and Norton 1996; Pink et al. 2001; Ten Asbroek et al. 2004; Weir et al. 2009).
Once an appropriate area has been identified for measurement, the next step in developing a performance measurement system is to identify potential indicators. Indicators have been defined as “a measurement tool used to monitor and evaluate the quality of important governance, management, clinical and support functions” (Klazinga et al. 2001). Indicators can be classified in different ways. For example, some authors assume that because performance must be measured against some specification, performance indicators do imply quality. Others (who do not necessarily represent a common view) distinguish between “Activity Indicators,” which measure how frequently an event takes place; “Quality Indicators,” which measure the quality of care being provided; and “Performance Indicators,” which do not imply quality but measure other aspects of the performance of the system (for example, the use of resources) (Campbell et al. 2003).

The issue of measurement

Loeb (2004) argues that not everything in healthcare can or should be measured. Challenges may arise when outcomes are influenced by factors other than the interventions being assessed or beyond the control of those being held accountable. There is also a need to balance having enough indicators to provide adequate information against the reduced usability and higher costs of having too many. Developing and running a performance measurement system is often expensive, and the data produced need to be useful and interpretable for their users. Many indicators emerge from a rigorous process in which they are developed, defined and reviewed (Lindsay et al. 2002; McGlynn and Asch 1998). Data sources also need to be identified when developing and choosing a set of indicators; the most common are healthcare enrolment data, administrative data, clinical data and survey data. Clear definitions ease consistent implementation of the measurement system and its data collection processes across different organizations/users, and help to ensure that the data collected will be comparable and reliable across different users of the system. As Black has noted, this is not always the case (Black 2015). Considerable efforts have been made to develop comparable indicators to enable cross-jurisdictional comparisons. These include the OECD quality indicators project (Arah et al. 2006) and the reporting standards for public health indicators (Armstrong et al. 2008). An offsetting concern is the recognition that strategic scorecards also must include locally relevant indicators. Achieving the right mix between local relevance and the ability to compare across organizations is crucial.

Discussion

One ongoing issue is what sorts of indicators should be used. A promising development is the Canadian Institute for Health Information (CIHI) 2012 Performance Measurement Framework for the Canadian Health System (CIHI 2012), which attempts to link performance dimensions through expected causal relationships in four interrelated quadrants: Health System Outcomes, Social Determinants of Health, Health System Outputs and Health System Inputs and Characteristics. Proper application of this and similar frameworks may help to ensure a more balanced approach to what is measured and what matters. However, our review suggests that the factors important to those individuals providing clinical services to clients often differ from those important to program managers, payers or health systems (Tregunno et al. 2004). One class of indicators focuses on adverse outcomes, either at the individual level (e.g., adverse events) or at the system level (e.g., avoidable deaths). Klazinga et al. argued that “epidemiological research has shown the difficulties in validating [negative health outcomes] as indicators for the quality of care that was delivered” (Klazinga et al. 2001). In selecting indicators, a key factor is the extent to which the elements affecting the measurement are under the control of decision-makers. Chassin et al. emphasized that for an outcome indicator to be relevant, it must be closely related to the healthcare processes that have an effect on the outcome (Chassin et al. 1998). In addition, there may be differences in what would be done with information; although the information may be valuable, it is difficult to hold managers accountable for things they cannot control. One obvious example is geography, which will often affect travel costs or access. Another, which affects population health, is the extent to which the various determinants of health (e.g., income, housing, tobacco use) are under the control of public health organizations.
Information may thus be helpful in affecting policy levers (e.g., pricing of alcohol, tobacco) that other actors control, but less useful if program managers will be rewarded (or punished) for variables they cannot affect. Other factors include whether different indicators are correlated (which can lead to double counting), how easy they are to measure (transaction costs), extent to which they are subject to “gaming” and whether they cover the outcomes of interest (Bevan 2010; Exworthy 2010; Ham 2010; Hamblin 2008; Irwin 2010; Klazinga 2010; Provincial Auditor of Ontario 2003).
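The concern that correlated indicators can lead to double counting can be checked with a simple pairwise correlation screen before indicators are combined. The sketch below is illustrative only: the indicator names and values are invented, the 0.8 cut-off is an arbitrary assumption, and `pearson` and `correlated_pairs` are hypothetical helper names rather than anything described in the reviewed literature.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def correlated_pairs(indicators, threshold=0.8):
    """Return (name_a, name_b, r) for indicator pairs with |r| >= threshold.

    The default threshold is an assumption chosen for illustration.
    """
    flagged = []
    for (name_a, a), (name_b, b) in combinations(indicators.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical values for five organizational units (invented for illustration).
example = {
    "screening_rate": [0.50, 0.60, 0.70, 0.80, 0.90],
    "recorded_screens_per_1000": [500, 600, 700, 800, 900],
    "wait_time_days": [12, 30, 9, 21, 15],
}

if __name__ == "__main__":
    for name_a, name_b, r in correlated_pairs(example):
        print(f"{name_a} and {name_b} move together (r = {r:.2f})")
```

In this invented data set the two screening indicators are the same quantity on different scales, so rewarding both would count one activity twice; the waiting-time indicator is nearly uncorrelated with either and is not flagged.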

Likely impacts

Another set of issues involves what will be done with the performance measures, including how they will be applied. Frequently, performance measurement involves setting performance targets and assessing the extent to which these are being met. In turn, these may be used for funding (e.g., results-based budgeting) and/or to identify areas for in-depth evaluation. External bodies may use the information to ensure accountability. Managers may use them to monitor activities and make policies. Townley argued that “the use of performance measures reflects a belief in the efficacy of rational management systems in achieving improvements in performance” (Townley 2005). In the UK, use of fiscal levers is sometimes referred to as “targets and terror” (Propper et al. 2008). The way in which measures are likely to affect behaviour varies. Clearly, measurement is simplest if organizations produce a small number of services, have a limited number of goals, understand the relationship between inputs and results and can control their own outcomes. As Townley notes, “A failure to ground performance measures in the everyday activity of the workforce is likely to see them dismissed for being irrelevant, unwieldy, arbitrary, or divisive.” Other potential downsides are that “the time and resources taken to collect measures may outweigh the benefits of their use” (Townley 2005). A related set of factors concerns the organizational infrastructure (Alexander et al. 2006). The workplace culture, including differences between the explicit goals and what some have called the “implicit theories” or “theories in use,” which affect day-to-day functioning, may affect the extent to which change initiatives are embraced and performance changes (Aitken 1994). This is in turn related to concepts of “street level bureaucracy,” which deals with the extent to which it is simple to manage and observe the activities of those responsible for providing the given services (Lipsky 1980).
Other less desirable organizational responses to performance measurement may include decoupling, a term used to refer to situations where specialist units are responsible for performance measurement, but where the measures have little impact on day-to-day activities and may lead to a sense that the measurement approach is “ritualistic” and “bureaucratic” rather than integral to improvement (Townley 2005). Even more alarmingly, measurement can lead to dysfunctional consequences, including focusing on measures rather than actual performance, impairment of innovation, gaming and creative accounting, potentially making performance worse (Hamblin 2008; Leggat et al. 1998). Other effects can be subtle; one example is placing less emphasis on prevention than on treating existing problems. The extent to which these positive or negative effects are realized may be heavily dependent upon context.

Conclusions

Selecting indicators

We found considerable differences in what sorts of performance measurement and management are actually being done, not just by jurisdiction (which we expected) but also by type of service. We found heavy emphasis on surveillance and far less on explicitly using the indicator data for management. Additionally, there is more focus on processes of how services are provided than on outcomes. A number of rationales are provided for this state of affairs. An excellent synthesis can be found in the proceedings of a WHO symposium, which stresses the importance of clarifying causality and the difficulty in holding providers accountable for outcomes that they cannot control. As one example, “physicians working in socio-economically disadvantaged localities may be wrongly blamed for securing poor outcomes beyond the control of the health system” (Smith et al. 2009: 12). Risk adjustment methodologies can control for some, but not all, of this variation. Composite indicators can be useful, but only if transparent and valid. Similarly, it may be necessary to deal with random fluctuations before determining when intervention is needed to improve performance. One striking finding that emerged from our review of how performance measurement and management are used in public health was the extent to which they focused on clinical services addressed to individuals (Smith et al. 2009). Activities directed towards improving the health of populations, particularly those with a preventive orientation, tend not to be included. As one example, the chapter in the report of the WHO symposium purportedly devoted to population health focuses almost exclusively on clinical treatment, including heavy focus on tracer conditions. One rationale given by these authors is that the performance measurement/management experiments they reported on wished to focus on the healthcare system. 
Their reaction to the fact that “it is often difficult to assess the extent to which variations in health outcome can be attributed to the health system” (Nolte et al. 2009) was accordingly to omit such measures. One concern arising from our review is that performance measurement approaches, by focusing so heavily upon the healthcare system, may skew attention away from important initiatives directed at improving the health of the population. Indeed, another chapter in the WHO symposium volume on “measuring clinical quality and appropriateness” explicitly states (pp 88–89): “A number of potential actions to improve population health do not operate through the health-care system (e.g., ensuring adequate sanitation, safe food, clean environments) and some areas do not have health services that are effective in changing an outcome. Neither of these areas is fruitful for developing clinical process measures” (McGlynn 2009). Omitting such areas from measurement systems, however, may falsely imply that they do not matter. Our review stresses the importance of being aware of unintended consequences. For example, in the UK pay-for-performance (P4P) scheme, success tended to be measured as doing more of particular things (e.g., screening tests, medication, some immunization) for particular populations (e.g., people with chronic diseases), with prevention and population health at risk of being lost in the shuffle.
Some key variables that appear to influence what is included in performance measurement/management systems are:
Ease of measurement.
Data quality. Jurisdictions vary considerably in how good the data are. For example, Canada does not yet have good data about immunization at the national level.
Ability of the organization to control outcomes.
Ability to measure success in terms of doing things (rather than preventing things).
What is already happening. One example is the UK P4P for physicians, which is generally considered to have been highly successful.
However, there was some suggestion that what was being rewarded was better recording rather than changes in practice. The indicator systems appear to, in part, reward providers for things they were already doing, which in turn raises questions about who gets to set the indicators. One important caveat for any performance measurement/performance management system is that it does not, and cannot, capture all activities. In that connection, as Black (2015) has noted, it is important to recognize that most providers are professionals who want to do a good job. Performance measurement/management is only one component, but can give tools to allow all stakeholders to know how they are doing and enable the use of benchmarking to improve performance. A second caveat is that we focused on published information; this may or may not reflect current activities in those jurisdictions. Successful interventions are also more likely to have been published. To the extent that the health of a population is dependent on multiple factors, many beyond the mandate of the healthcare system (both personal health and public health), however, our review suggests that too extensive a reliance on performance measurement may risk unintended consequences of marginalizing critical activities. As ever, balance is key.
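The earlier point that random fluctuations may need to be dealt with before deciding that intervention is needed can be illustrated with a simple control-limit check on a proportion. The sketch below uses the conventional normal-approximation limits for a proportion, overall rate ± z·sqrt(rate·(1 − rate)/n); it is not a method described in the reviewed literature, the unit names and counts are invented, and `control_limits` and `flag_units` are hypothetical helper names.

```python
from math import sqrt


def control_limits(overall_rate, n, z=3.0):
    """Approximate control limits for the rate of a unit with n cases.

    Uses the normal approximation to the binomial; z = 3 is a common
    convention and is an assumption here.
    """
    se = sqrt(overall_rate * (1 - overall_rate) / n)
    lower = max(0.0, overall_rate - z * se)
    upper = min(1.0, overall_rate + z * se)
    return lower, upper


def flag_units(units, z=3.0):
    """Return names of units whose rate lies outside the control limits.

    `units` maps a unit name to a tuple (events, cases).
    """
    total_events = sum(events for events, _ in units.values())
    total_cases = sum(cases for _, cases in units.values())
    overall = total_events / total_cases
    flagged = []
    for name, (events, cases) in units.items():
        lower, upper = control_limits(overall, cases, z)
        rate = events / cases
        if rate < lower or rate > upper:
            flagged.append(name)
    return flagged


# Hypothetical counts (events, cases), invented for illustration.
units = {
    "small_unit": (8, 10),
    "large_unit_1": (5000, 10000),
    "large_unit_2": (5600, 10000),
}

if __name__ == "__main__":
    print(flag_units(units))
```

In this invented example the small unit has the most extreme rate (80%) but is not flagged, because with only 10 cases that rate is within the range expected by chance; the two large units deviate less from the pooled rate but are flagged because their limits are much narrower.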

References (43 in total; 10 listed)

1. Indicators without a cause. Reflections on the development and use of indicators in health care from a public health perspective.

Authors: N Klazinga; K Stronks; D Delnoij; A Verhoeff
Journal: Int J Qual Health Care Date: 2001-12 Impact factor: 2.038

2. Research methods used in developing and applying quality indicators in primary care. (Review)

Authors: S M Campbell; J Braspenning; A Hutchinson; M N Marshall
Journal: BMJ Date: 2003-04-12

3. The development of indicators to measure the quality of clinical care in emergency departments following a modified-delphi approach.

Authors: Patrice Lindsay; Michael Schull; Susan Bronskill; Geoffrey Anderson
Journal: Acad Emerg Med Date: 2002-11 Impact factor: 3.451

4. Toward an accountability framework for Canadian healthcare.

Authors: S E D Shortt; J K MacDonald
Journal: Healthc Manage Forum Date: 2002

5. Competing values of emergency department performance: balancing multiple stakeholder perspectives.

Authors: Deborah Tregunno; G Ross Baker; Jan Barnsley; Michael Murray
Journal: Health Serv Res Date: 2004-08 Impact factor: 3.402

6. Developing a national performance indicator framework for the Dutch health system.

Authors: A H A ten Asbroek; O A Arah; J Geelhoed; T Custers; D M Delnoij; N S Klazinga
Journal: Int J Qual Health Care Date: 2004-04 Impact factor: 2.038

7. Synthesising qualitative and quantitative evidence: a review of possible methods. (Review)

Authors: Mary Dixon-Woods; Shona Agarwal; David Jones; Bridget Young; Alex Sutton
Journal: J Health Serv Res Policy Date: 2005-01

8. The role of organizational infrastructure in implementation of hospitals' quality improvement.

Authors: Jeffrey A Alexander; Bryan J Weiner; Stephen M Shortell; Laurence C Baker; Mark P Becker
Journal: Hosp Top Date: 2006

9. The three faces of performance measurement: improvement, accountability, and research.

Authors: L I Solberg; G Mosser; S McDonald
Journal: Jt Comm J Qual Improv Date: 1997-03

10. Thinking about accountability.

Authors: Raisa B Deber
Journal: Healthc Policy Date: 2014-09
