Frank W Samuelson (1), Craig K Abbey (2).
(1) U.S. Food and Drug Administration, 10903 New Hampshire Ave., Building 62, Room 3102, Silver Spring, MD 20993-0002. Electronic address: frank.samuelson@fda.hhs.gov.
(2) Department of Psychological and Brain Sciences, University of California, Santa Barbara, California.
Abstract
RATIONALE AND OBJECTIVES: In this paper we examine which comparisons of reading performance between diagnostic imaging systems made in controlled retrospective laboratory studies may be representative of what we observe in later clinical studies. The change in a meaningful diagnostic figure of merit between two diagnostic modalities should be qualitatively or quantitatively comparable across all kinds of studies.
MATERIALS AND METHODS: In this meta-study we examine the reproducibility of relative measures of sensitivity, false positive fraction (FPF), area under the receiver operating characteristic (ROC) curve, and expected utility across laboratory and observational clinical studies for several different breast imaging modalities, including screen film mammography, digital mammography, breast tomosynthesis, and ultrasound.
RESULTS: Across studies of all types, the changes in the FPFs yielded very small probabilities of having a common mean value. The probabilities of relative sensitivity being the same across ultrasound and tomosynthesis studies were low. No evidence was found for different mean values of relative area under the ROC curve or relative expected utility within any of the study sets.
CONCLUSION: The comparison demonstrates that the ratios of areas under the ROC curve and expected utilities are reproducible across laboratory and clinical studies, whereas sensitivity and FPF are not.
Published by Elsevier Inc.
Authors: David Gur; Andriy I Bandos; Howard E Rockette; Margarita L Zuley; Christiane M Hakim; Denise M Chough; Marie A Ganott; Jules H Sumkin Journal: Acad Radiol Date: 2010-03-16 Impact factor: 3.173
Authors: Etta D Pisano; Constantine Gatsonis; Edward Hendrick; Martin Yaffe; Janet K Baum; Suddhasatta Acharyya; Emily F Conant; Laurie L Fajardo; Lawrence Bassett; Carl D'Orsi; Roberta Jong; Murray Rebner Journal: N Engl J Med Date: 2005-09-16 Impact factor: 91.245
Authors: John M Lewin; Carl J D'Orsi; R Edward Hendrick; Lawrence J Moss; Pamela K Isaacs; Andrew Karellas; Gary R Cutter Journal: AJR Am J Roentgenol Date: 2002-09 Impact factor: 3.959
Authors: Maryellen L Giger; Marc F Inciardi; Alexandra Edwards; John Papaioannou; Karen Drukker; Yulei Jiang; Rachel Brem; Jeremy Bancroft Brown Journal: AJR Am J Roentgenol Date: 2016-04-04 Impact factor: 3.959