OBJECTIVE: Comparisons of the performance of multiple health care providers are often based on hypothesis tests, those with resulting P-values below some critical threshold being identified as potentially extreme. Because of the multiple testing involved, the classical P-value threshold of, say, 0.05 may not be considered strict enough, as it will tend to lead to too many "false positives." However, we argue that the commonly used Bonferroni-corrected threshold is in general too strict for the problem in hand. The purpose of this article is to demonstrate a suitable alternative thresholding procedure that is already well established in other fields. STUDY DESIGN AND SETTING: The suggested procedure involves control of an error measure called the "false discovery rate" (FDR). We present a worked example involving a comparison of risk-adjusted mortality rates following heart surgery in New York State hospitals during 2000-2002. It is shown that the FDR critical threshold lines can be drawn on a "funnel plot," providing a simple graphical presentation of the results. RESULTS: The FDR procedure identified more providers as potentially extreme than the Bonferroni correction, while maintaining control of an intuitively sensible error measure. CONCLUSION: Control of the FDR offers a simple guideline to determining where to draw critical thresholds when comparing multiple health care providers.
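As an illustration of the contrast drawn in the abstract, the following is a minimal sketch that compares a Bonferroni-corrected threshold with an FDR-controlling one. The abstract does not name a specific FDR procedure; the Benjamini-Hochberg step-up procedure is assumed here because it is the standard method for controlling the FDR. The function names, the level of 0.05, and the ten P-values are hypothetical and chosen only for illustration; they are not the New York State hospital data.

```python
from typing import List


def bonferroni_flags(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Flag a provider when its P-value is at most alpha / m (Bonferroni)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_flags(p_values: List[float], q: float = 0.05) -> List[bool]:
    """Flag providers with the Benjamini-Hochberg step-up procedure (assumed FDR method).

    Sort the P-values, find the largest rank k with p_(k) <= k * q / m,
    and flag the k providers with the smallest P-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P-value
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank  # keep the largest rank that meets its threshold
    flags = [False] * m
    for idx in order[:k_max]:
        flags[idx] = True
    return flags


# Hypothetical P-values for ten providers (illustrative only, not real data).
p_values = [0.0004, 0.003, 0.009, 0.018, 0.04, 0.2, 0.35, 0.5, 0.7, 0.9]

bonf = bonferroni_flags(p_values)
bh = benjamini_hochberg_flags(p_values)
print("Flagged by Bonferroni:", sum(bonf))
print("Flagged by Benjamini-Hochberg:", sum(bh))
```

With these made-up values the Bonferroni threshold is 0.05 / 10 = 0.005 for every provider, whereas the Benjamini-Hochberg threshold grows with the rank of the P-value (0.005, 0.010, 0.015, ...), so the FDR-based rule flags more providers. This matches the direction of the result reported in the abstract, but the counts themselves carry no empirical meaning.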
Authors: Karl Y Bilimoria; Mark E Cohen; Ryan P Merkow; Xue Wang; David J Bentrem; Angela M Ingraham; Karen Richards; Bruce L Hall; Clifford Y Ko Journal: J Gastrointest Surg Date: 2010-09-08 Impact factor: 3.452
Authors: Frances T Sheehan; Aditya Derasari; Kenneth M Fine; Timothy J Brindle; Katharine E Alter Journal: Clin Orthop Relat Res Date: 2009-05-09 Impact factor: 4.176
Authors: Shelby Kutty; Philip G Jones; Quentin Karels; Navya Joseph; John A Spertus; Paul S Chan Journal: Circulation Date: 2017-10-04 Impact factor: 29.690
Authors: Dolores Catelan; Manuela Giangreco; Annibale Biggeri; Fabio Barbone; Lorenzo Monasta; Giuseppe Ricci; Federico Romano; Valentina Rosolen; Gabriella Zito; Luca Ronfani Journal: Int J Environ Res Public Health Date: 2021-07-05 Impact factor: 3.390