Is an ROC-type response truly always better than a binary response in observer performance studies?

David Gur, Andriy I Bandos, Howard E Rockette, Margarita L Zuley, Christiane M Hakim, Denise M Chough, Marie A Ganott, Jules H Sumkin.

Abstract

RATIONALE AND OBJECTIVES: The aim of this study was to assess similarities and differences between methods of performance comparisons under binary (yes or no) and receiver-operating characteristic (ROC)-type pseudocontinuous (0-100) rating data ascertained during an observer performance study of interpretation of full-field digital mammography (FFDM) versus FFDM plus digital breast tomosynthesis.
MATERIALS AND METHODS: Rating data consisted of ROC-type pseudocontinuous and binary ratings generated by eight radiologists evaluating 77 digital mammographic examinations. Overall performance levels were summarized with a conventionally used probability of correct discrimination or, equivalently, the area under the ROC curve (AUC), which under a binary scale is related to Youden's index. Magnitudes of differences in the reader-averaged empirical AUCs between FFDM alone and FFDM plus digital breast tomosynthesis were compared in the context of fixed-reader and random-reader variability of the estimates.
RESULTS: The absolute differences between modes using the empirical AUCs were larger on average for the binary scale (0.12 vs 0.07) and for the majority of individual readers (six of eight). Standardized differences were consistent with this finding (2.32 vs 1.63 on average). Reader-averaged differences in AUCs standardized by fixed-reader and random-reader variances were also smaller under the binary rating paradigm. The discrepancy between AUC differences depended on the location of the reader-specific binary operating points.
CONCLUSIONS: The human observer's operating point should be a primary consideration in designing an observer performance study. Although the ROC-type rating paradigm generally provides more detailed information on the characteristics of different modes, it does not reflect the actual operating point adopted by human observers. There are application-driven scenarios in which analysis based on binary responses may provide statistical advantages.
Copyright 2010 AUR. Published by Elsevier Inc. All rights reserved.
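
The Materials and Methods statement that the AUC under a binary scale is related to Youden's index can be illustrated with a small sketch. It assumes the empirical (Mann-Whitney) AUC, for which a single binary operating point gives AUC = (sensitivity + specificity) / 2 = (J + 1) / 2, where J is Youden's index. The function names and the ratings below are hypothetical and are not data or code from the study.

```python
from itertools import product


def empirical_auc(neg_ratings, pos_ratings):
    """Empirical AUC: P(pos > neg) + 0.5 * P(pos == neg) over all case pairs."""
    total = 0.0
    for n, p in product(neg_ratings, pos_ratings):
        if p > n:
            total += 1.0
        elif p == n:
            total += 0.5  # ties count as half a correct discrimination
    return total / (len(neg_ratings) * len(pos_ratings))


def youden_index(neg_ratings, pos_ratings):
    """Youden's J = sensitivity + specificity - 1 for binary (0/1) ratings."""
    sensitivity = sum(pos_ratings) / len(pos_ratings)
    specificity = 1.0 - sum(neg_ratings) / len(neg_ratings)
    return sensitivity + specificity - 1.0


# Hypothetical binary ratings (1 = positive call, 0 = negative call).
neg = [0, 0, 0, 1, 0, 0, 1, 0]  # cases without the abnormality
pos = [1, 1, 0, 1, 0, 1]        # cases with the abnormality

auc = empirical_auc(neg, pos)
j = youden_index(neg, pos)

# For binary ratings the two summaries carry the same information.
print(f"AUC = {auc:.4f}, (J + 1) / 2 = {(j + 1) / 2:.4f}")
```

Because the binary AUC is a linear function of J, a difference in binary AUC between two modes equals half the difference in their Youden's indices at the readers' actual operating points.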

Year: 2010; PMID: 20236840; PMCID: PMC2856622; DOI: 10.1016/j.acra.2009.12.012

Source DB: PubMed; Journal: Acad Radiol; ISSN: 1076-6332; Impact factor: 3.173
