Multireader, multicase receiver operating characteristic analysis: an empirical comparison of five methods.

Nancy A Obuchowski, Sergey V Beiden, Kevin S Berbaum, Stephen L Hillis, Hemant Ishwaran, Hae Hiang Song, Robert F Wagner.

Abstract

RATIONALE AND OBJECTIVES: Several statistical methods have been developed for analyzing multireader, multicase (MRMC) receiver operating characteristic (ROC) studies. The objective of this article is to increase awareness of these methods and determine if their results are concordant for published datasets.
MATERIALS AND METHODS: Data from three previously published studies were reanalyzed using five MRMC methods. For each method the 95% confidence intervals (CIs) for the mean of the readers' ROC areas for each diagnostic test, the P value for the comparison of the diagnostic tests' mean accuracies, and the 95% CIs for the mean difference in ROC areas of the diagnostic tests were reported.
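
The summary quantities named above (each reader's ROC area, the per-test mean across readers, and the difference between tests) can be illustrated with a minimal sketch. It uses the nonparametric Wilcoxon-Mann-Whitney estimate of the ROC area. The function names and the toy ratings are hypothetical and are not taken from the reanalyzed studies. No CI or P value is computed here, since those depend on the specific MRMC method.

```python
from itertools import product


def wmw_auc(diseased, nondiseased):
    """Empirical ROC area: P(diseased rating > nondiseased rating) + 0.5 * P(tie)."""
    total = 0.0
    for x, y in product(diseased, nondiseased):
        if x > y:
            total += 1.0
        elif x == y:
            total += 0.5
    return total / (len(diseased) * len(nondiseased))


def mean_reader_auc(ratings_by_reader):
    """Average the per-reader ROC areas for one diagnostic test.

    ratings_by_reader: list of (diseased_ratings, nondiseased_ratings), one per reader.
    """
    aucs = [wmw_auc(d, n) for d, n in ratings_by_reader]
    return sum(aucs) / len(aucs)


# Hypothetical toy data: two readers, two tests, 5-point confidence ratings.
test_a = [([4, 5, 3, 5], [1, 2, 3, 1]), ([5, 4, 4, 3], [2, 1, 2, 3])]
test_b = [([3, 4, 2, 5], [1, 3, 3, 2]), ([4, 3, 3, 2], [2, 2, 1, 3])]

mean_a = mean_reader_auc(test_a)
mean_b = mean_reader_auc(test_b)
print("mean ROC area, test A:", mean_a)
print("mean ROC area, test B:", mean_b)
print("mean difference (A - B):", mean_a - mean_b)
```

The point estimate of the difference is the same whether readers are treated as fixed or random; what changes between those models is the variance attached to it, and therefore the CI and P value.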
RESULTS: Important differences in P values and CIs were seen when using parametric versus nonparametric estimates of accuracy, and there were the expected differences for random-reader versus fixed-reader models. Controlling for these differences, the Dorfman-Berbaum-Metz (DBM), Obuchowski-Rockette, Beiden-Wagner-Campbell, and Song's multivariate Wilcoxon-Mann-Whitney (WMW) methods gave almost identical results for the fixed-reader model. For the random-reader model, the DBM, Obuchowski-Rockette, and Beiden-Wagner-Campbell methods yielded approximately the same inferences, but the CIs for the Beiden-Wagner-Campbell method tended to be broader. Ishwaran's hierarchical ROC sometimes yielded significance not found with other methods. Song's modification of DBM's jack-knifing algorithm sometimes led to different conclusions than the original DBM algorithm.
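
As a rough illustration of the case-wise jackknife that the DBM approach builds on, the sketch below computes leave-one-case-out pseudovalues of the ROC area for a single reader and test. It is a simplified sketch under stated assumptions, not the published DBM algorithm or Song's modification: it uses the WMW area as the accuracy estimate, the function names and data are hypothetical, and the subsequent analysis of variance on the pseudovalues is omitted.

```python
def wmw_auc(diseased, nondiseased):
    """Empirical (Wilcoxon-Mann-Whitney) ROC area, counting ties as one half."""
    total = 0.0
    for x in diseased:
        for y in nondiseased:
            total += 1.0 if x > y else (0.5 if x == y else 0.0)
    return total / (len(diseased) * len(nondiseased))


def jackknife_pseudovalues(diseased, nondiseased):
    """Leave-one-case-out pseudovalues: n * theta - (n - 1) * theta_without_case_k."""
    if len(diseased) < 2 or len(nondiseased) < 2:
        raise ValueError("need at least two diseased and two nondiseased cases")
    n = len(diseased) + len(nondiseased)
    full = wmw_auc(diseased, nondiseased)
    pseudovalues = []
    # Remove each diseased case in turn.
    for k in range(len(diseased)):
        reduced = diseased[:k] + diseased[k + 1:]
        pseudovalues.append(n * full - (n - 1) * wmw_auc(reduced, nondiseased))
    # Remove each nondiseased case in turn.
    for k in range(len(nondiseased)):
        reduced = nondiseased[:k] + nondiseased[k + 1:]
        pseudovalues.append(n * full - (n - 1) * wmw_auc(diseased, reduced))
    return pseudovalues


# Hypothetical ratings for one reader and one test.
pv = jackknife_pseudovalues([3, 5], [1, 3])
print(pv)
```

In a full analysis, one such vector of pseudovalues would be produced for every reader-by-test combination and then analyzed with a mixed-effects model; that step is where the fixed-reader and random-reader distinctions enter.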
CONCLUSION: In choosing and applying MRMC methods, it is important to recognize: (1) the distinction between random-reader and fixed-reader models, the uncertainties accounted for by each, and thus the level of generalizability expected from each; (2) assumptions made by the various MRMC methods; and (3) limitations of a five- or six-reader study when the reader variability is great.

Year:  2004        PMID: 15350579     DOI: 10.1016/j.acra.2004.04.014

Source DB:  PubMed          Journal:  Acad Radiol        ISSN: 1076-6332            Impact factor:   3.173


