OBJECTIVE: To examine the reliability of quality measures used to assess physician performance. Such measures are increasingly used as the basis for quality improvement efforts, contracting decisions, and financial incentives, despite concerns about methodological challenges. STUDY DESIGN: Evaluation of health plan administrative claims and enrollment data. METHODS: The study used administrative data from 9 health plans representing more than 11 million patients. The number of quality events (patients eligible for a quality measure), mean performance, and reliability estimates were calculated for 27 quality measures. Composite scores for preventive, chronic, acute, and overall care were calculated as the weighted mean of the standardized scores. Reliability was estimated as the physician-to-physician variance divided by the sum of the physician-to-physician variance and the measurement variance; a value of 0.70 was considered adequate. RESULTS: Ten quality measures had reliability estimates above 0.70 at a minimum of 50 quality events. For the other quality measures, reliability was low even when physicians had 50 quality events. The largest proportion of physicians who could be reliably evaluated on a single quality measure was 8% for colorectal cancer screening and 2% for nephropathy screening among patients with diabetes mellitus. More physicians could be reliably evaluated using composite scores (<17% for preventive care, >7% for chronic care, and 15%-20% for an overall composite). CONCLUSIONS: In typical health plan administrative data, most physicians do not have adequate numbers of quality events to support reliable quality measurement. The reliability of quality measures should be taken into account when quality information is used for public reporting and accountability. Efforts to improve the data available for physician profiling are also needed.
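The reliability formula described in METHODS can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the illustrative variance values are assumptions; the abstract specifies only that reliability equals physician-to-physician variance divided by the sum of that variance and the measurement variance (which shrinks as the number of quality events per physician grows), with 0.70 as the adequacy threshold.

```python
import math


def reliability(between_var: float, within_var: float, n_events: int) -> float:
    """Reliability of a physician's score on one quality measure.

    between_var: physician-to-physician (true score) variance.
    within_var:  single-event measurement variance; averaging over
                 n_events quality events divides it by n_events.
    """
    measurement_var = within_var / n_events
    return between_var / (between_var + measurement_var)


def events_needed(target: float, between_var: float, within_var: float) -> int:
    """Smallest number of quality events giving reliability >= target.

    Derived by solving target = b / (b + w/n) for n.
    """
    return math.ceil((target / (1.0 - target)) * (within_var / between_var))
```

For example, with equal between- and within-physician variance (an illustrative assumption), a single quality event yields a reliability of 0.5, and 3 events are enough to clear the 0.70 threshold; measures with little physician-to-physician variation require far more events, which is consistent with the finding that many measures stay unreliable even at 50 events.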
Authors: Sarah Hudson Scholle; Joachim Roski; Daniel L Dunn; John L Adams; Donna Pillittere Dugan; L Gregory Pawlson; Eve A Kerr Journal: Am J Manag Care Date: 2009-01 Impact factor: 2.229