Measuring agreement between rating interpretations and binary clinical interpretations of images: a simulation study of methods for quantifying the clinical relevance of an observer performance paradigm.

Dev P Chakraborty

Abstract

Laboratory receiver operating characteristic (ROC) studies, which are often used to evaluate medical imaging systems, differ from 'live' clinical interpretations in several respects that could compromise their clinical relevance. The aim was to develop a methodology for quantifying the clinical relevance of a laboratory ROC study. A simulator was developed to generate ROC ratings data and binary clinical interpretations, classified as correct or incorrect, for a common set of images interpreted under clinical and laboratory conditions. The area under the trapezoidal ROC curve (AUC) was used as the laboratory figure of merit and the fraction of correct clinical decisions as the clinical figure of merit. Conventional agreement measures (Pearson, Spearman, Kendall and kappa) between the bootstrap-induced fluctuations of the two figures of merit were estimated. A jackknife pseudovalue transformation applied to the figures of merit was also investigated as a way to capture agreement that exists at the individual image level but could be lost at the figure-of-merit level. It is shown that the pseudovalues define a relevance-ROC curve. The area under this curve (rAUC) measures the ability of the pseudovalues derived from the laboratory figure of merit to correctly classify incorrect versus correct clinical interpretations. Therefore, rAUC is a measure of the clinical relevance of an ROC study. The conventional measures and rAUC were compared under varying simulator conditions. It was found that design details of the ROC study, namely the number of bins, the difficulty level of the images, the ratio of disease-present to disease-absent images and the unavoidable difference between laboratory and clinical performance levels, can lead conventional agreement measures to seriously underestimate the agreement, even for perfectly correlated data, while rAUC showed high agreement and was relatively immune to these details. At the same time, rAUC was sensitive to factors that are expected to influence agreement at both the individual image level and the figure-of-merit level, such as the intrinsic correlation between the laboratory and clinical decision variables and differences in reporting thresholds. Suggestions are made for how to conduct relevance-ROC studies aimed at assessing agreement between laboratory and clinical interpretations. The method could be used to evaluate the clinical relevance of alternative scalar figures of merit, such as the sensitivity at a predefined specificity.
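The chain described in the abstract (trapezoidal AUC, jackknife pseudovalues, rAUC) can be illustrated with a minimal Python sketch. This is one plausible reading of the abstract, not the author's implementation: the function names, the toy ratings and the assumption that larger pseudovalues go with correct clinical interpretations are all hypothetical choices made for illustration.

```python
def wilcoxon_auc(low_group, high_group):
    """Empirical (trapezoidal) AUC: P(high > low) + 0.5 * P(high == low)."""
    total = 0.0
    for x in low_group:
        for y in high_group:
            if y > x:
                total += 1.0
            elif y == x:
                total += 0.5
    return total / (len(low_group) * len(high_group))


def jackknife_pseudovalues(ratings, truth):
    """Pseudovalue of image i: n * theta - (n - 1) * theta_without_i.

    truth[i] is 1 for disease-present and 0 for disease-absent images.
    Assumes each class keeps at least one image after a deletion.
    """
    n = len(ratings)

    def auc_of(indices):
        absent = [ratings[i] for i in indices if truth[i] == 0]
        present = [ratings[i] for i in indices if truth[i] == 1]
        return wilcoxon_auc(absent, present)

    all_idx = list(range(n))
    theta = auc_of(all_idx)
    return [
        n * theta - (n - 1) * auc_of(all_idx[:i] + all_idx[i + 1:])
        for i in range(n)
    ]


def relevance_auc(pseudovalues, clinically_correct):
    """AUC of the pseudovalues for separating incorrect from correct
    clinical interpretations (assumed direction: correct scores higher)."""
    incorrect = [p for p, c in zip(pseudovalues, clinically_correct) if not c]
    correct = [p for p, c in zip(pseudovalues, clinically_correct) if c]
    return wilcoxon_auc(incorrect, correct)


# Hypothetical toy data: six images rated on a 1-5 laboratory scale.
truth = [0, 0, 0, 1, 1, 1]
ratings = [1, 2, 4, 3, 5, 5]
# Hypothetical clinical outcomes (True = clinical interpretation was correct).
clinically_correct = [True, True, False, False, True, True]

pseudovalues = jackknife_pseudovalues(ratings, truth)
lab_auc = wilcoxon_auc([1, 2, 4], [3, 5, 5])
r_auc = relevance_auc(pseudovalues, clinically_correct)
```

In this toy case the two images that pull the laboratory AUC down (the disease-absent image rated 4 and the disease-present image rated 3) are also the two clinical errors, so their pseudovalues are the lowest and the pseudovalues separate incorrect from correct clinical interpretations perfectly.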

Year:  2012        PMID: 22516804      PMCID: PMC3352681          DOI: 10.1088/0031-9155/57/10/2873

Source DB:  PubMed          Journal:  Phys Med Biol        ISSN: 0031-9155            Impact factor:   3.609


  3 in total

1.  Quantifying the clinical relevance of a laboratory observer performance paradigm.

Authors:  D P Chakraborty; T M Haygood; J Ryan; E M Marom; M Evanoff; M F McEntee; P C Brennan
Journal:  Br J Radiol       Date:  2012-05-09       Impact factor: 3.039

2.  Modeling visual search behavior of breast radiologists using a deep convolution neural network.

Authors:  Suneeta Mall; Patrick C Brennan; Claudia Mello-Thoms
Journal:  J Med Imaging (Bellingham)       Date:  2018-08-11

3.  Application of threshold-bias independent analysis to eye-tracking and FROC data.

Authors:  Dev P Chakraborty; Hong-Jun Yoon; Claudia Mello-Thoms
Journal:  Acad Radiol       Date:  2012-10-04       Impact factor: 3.173
