BACKGROUND: Selecting controls that match cases on risk factors for the outcome is a pervasive practice in biomarker research studies. Such matching, however, biases estimates of biomarker prediction performance, and the magnitudes of these biases are unknown.

METHODS: We examined the prediction performance of biomarkers and the improvement in prediction gained by adding biomarkers to risk factor information, using data simulated from bivariate normal statistical models and data from a study to identify critically ill patients. We compared true performance with that estimated from case-control studies that do or do not use matching, quantifying performance with ROC curves. We also propose a new statistical method to estimate prediction performance from matched studies when data on the matching factors are available for subjects in the population.

RESULTS: Performance estimated with standard analyses can be grossly biased by matching, especially when biomarkers are highly correlated with the matching risk factors. In our studies, the performance of the biomarker alone was underestimated, whereas the improvement gained by adding the marker to risk factors was overestimated by 2- to 10-fold. We found examples in which the relative ranking of two biomarkers for prediction was inappropriately reversed by use of a matched design. The new estimation approach corrected for the bias in matched studies.

CONCLUSIONS: To properly gauge prediction performance in the population, or the improvement gained by adding a biomarker to known risk factors, matched case-control studies must be supplemented with risk factor information from the population and analyzed with nonstandard statistical methods.
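The mechanism behind the bias can be illustrated with a small simulation in the spirit of the methods described above: a risk factor and a biomarker drawn from a bivariate normal distribution, disease status generated from a logistic model, and the biomarker's AUC compared between an unmatched design and a design that matches controls to cases on the risk factor. All distributions, coefficients, and the nearest-neighbor matching rule below are illustrative assumptions, not the study's actual parameters or estimation method.

```python
import numpy as np

rng = np.random.default_rng(0)

def auc(case_scores, control_scores):
    """Empirical AUC via the Mann-Whitney statistic:
    P(case score > control score) + 0.5 * P(tie)."""
    x = case_scores[:, None]
    y = control_scores[None, :]
    return float(np.mean((x > y) + 0.5 * (x == y)))

# Population: risk factor X and biomarker Y are bivariate normal with
# correlation rho; disease risk follows a logistic model in X and Y.
# (Parameter values here are arbitrary choices for illustration.)
n, rho = 200_000, 0.7
cov = [[1.0, rho], [rho, 1.0]]
X, Y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
p = 1.0 / (1.0 + np.exp(-(-3.0 + X + Y)))
D = rng.random(n) < p

case_Y, control_Y = Y[D], Y[~D]
case_X, control_X = X[D], X[~D]

# Unmatched design: controls sampled at random from the non-diseased,
# so the biomarker's AUC reflects its population performance.
auc_unmatched = auc(case_Y, rng.choice(control_Y, size=case_Y.size,
                                       replace=False))

# Matched design: for each case, take a control with nearly the same X
# (nearest by sorted insertion point; controls may be reused).
order = np.argsort(control_X)
Xc_sorted, Yc_sorted = control_X[order], control_Y[order]
idx = np.clip(np.searchsorted(Xc_sorted, case_X), 0, Xc_sorted.size - 1)
auc_matched = auc(case_Y, Yc_sorted[idx])

# Matching on X strips out the part of Y's discrimination that X
# explains, so the matched AUC is attenuated toward 0.5.
print(f"AUC, unmatched controls: {auc_unmatched:.3f}")
print(f"AUC, matched controls:   {auc_matched:.3f}")
```

Because Y is strongly correlated with the matching factor X here, the matched-design AUC understates the biomarker's standalone population performance, mirroring the underestimation reported in the abstract.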