Margaret S Pepe, Holly Janes, Christopher I Li. Affiliations of authors: Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA (MSP, HJ, CIL); Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, WA (HJ).
Abstract
BACKGROUND: The Net Reclassification Index (NRI) and its P value are used to make conclusions about improvements in prediction performance gained by adding a set of biomarkers to an existing risk prediction model. Although proposed only 5 years ago, the NRI has gained enormous traction in the risk prediction literature. Concerns have recently been raised about the statistical validity of the NRI.

METHODS: Using a population dataset of 10000 individuals with an event rate of 10.2%, in which four biomarkers have no predictive ability, we repeatedly simulated studies and calculated the chance that the NRI statistic provides a positive statistically significant result. Subjects for training data (n = 420) and test data (n = 420 or 840) were randomly selected from the population, and corresponding NRI statistics and P values were calculated. For comparison, the change in the area under the receiver operating characteristic curve and likelihood ratio statistics were calculated.

RESULTS: We found that rates of false-positive conclusions based on the NRI statistic were unacceptably high, being 63.0% in the training datasets and 18.8% to 34.4% in the test datasets. False-positive conclusions were rare when using the change in the area under the curve and occurred at the expected rate of approximately 5.0% with the likelihood ratio statistic.

CONCLUSIONS: Conclusions about biomarker performance that are based primarily on a statistically significant NRI statistic should be treated with skepticism. Use of NRI P values in scientific reporting should be halted.
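The simulation design described in METHODS can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' code, and it rests on several assumptions that the abstract does not state: the population is synthetic (one standard-normal baseline predictor and four independent standard-normal noise markers, with an intercept of -2.5 chosen to give an event rate of roughly 10%), the NRI is taken in its category-free form (any increase or decrease in predicted risk counts as a reclassification), the P value uses a commonly cited simple asymptotic z-test, and "positive statistically significant" is read as a one-sided P < 0.05. The function names `fit_logistic`, `nri_test`, and `false_positive_rates` are hypothetical.

```python
import numpy as np
from math import erf, sqrt


def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)
        hess = X.T @ (X * w[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def predict(X, beta):
    """Predicted event probabilities from a fitted logistic model."""
    return 1.0 / (1.0 + np.exp(-X @ beta))


def nri_test(risk_old, risk_new, y):
    """Category-free NRI with a simple asymptotic z-test (assumed form).

    Returns (nri, z, one-sided P value for NRI > 0).
    """
    ev = y == 1
    ne = ~ev
    up = risk_new > risk_old
    down = risk_new < risk_old
    p_up_ev, p_down_ev = up[ev].mean(), down[ev].mean()
    p_up_ne, p_down_ne = up[ne].mean(), down[ne].mean()
    # Events should move up, nonevents should move down.
    nri = (p_up_ev - p_down_ev) + (p_down_ne - p_up_ne)
    se = sqrt((p_up_ev + p_down_ev) / ev.sum() + (p_up_ne + p_down_ne) / ne.sum())
    z = nri / se
    p_one_sided = 0.5 * (1.0 - erf(z / sqrt(2.0)))
    return nri, z, p_one_sided


def false_positive_rates(n_sims=200, n_train=420, n_test=420, seed=0):
    """Fraction of simulated studies with a 'significant' NRI for useless markers."""
    rng = np.random.default_rng(seed)
    n_pop = 10_000
    x = rng.normal(size=n_pop)             # established predictor (assumed)
    markers = rng.normal(size=(n_pop, 4))  # four biomarkers, independent of outcome
    true_risk = 1.0 / (1.0 + np.exp(-(-2.5 + x)))  # hypothetical risk model
    y = (rng.random(n_pop) < true_risk).astype(float)
    X_old = np.column_stack([np.ones(n_pop), x])
    X_new = np.column_stack([X_old, markers])

    hits_train = hits_test = 0
    for _ in range(n_sims):
        idx = rng.choice(n_pop, size=n_train + n_test, replace=False)
        tr, te = idx[:n_train], idx[n_train:]
        # Both models are fit on the training sample only.
        b_old = fit_logistic(X_old[tr], y[tr])
        b_new = fit_logistic(X_new[tr], y[tr])
        _, _, p_tr = nri_test(predict(X_old[tr], b_old), predict(X_new[tr], b_new), y[tr])
        _, _, p_te = nri_test(predict(X_old[te], b_old), predict(X_new[te], b_new), y[te])
        hits_train += p_tr < 0.05
        hits_test += p_te < 0.05
    return hits_train / n_sims, hits_test / n_sims


if __name__ == "__main__":
    train_rate, test_rate = false_positive_rates()
    print(f"training false-positive rate: {train_rate:.3f}")
    print(f"test false-positive rate:     {test_rate:.3f}")
```

Because the population, the form of the NRI, and the test are all assumptions here, the rates this sketch prints should not be expected to reproduce the figures reported in RESULTS; it is meant only to make the study design concrete. Under a valid test, both rates would be expected to sit near the nominal 5%.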