PURPOSE: Knowledge of the exact shape of a lesion, or ground truth (GT), is necessary for developing diagnostic tools, where it supports algorithm validation, measurement metric analysis, and accurate size estimation. Four methods that estimate GTs from multiple readers' documentations by considering the spatial location of voxels were compared: thresholded probability map at 0.50 (TPM(0.50)) and at 0.75 (TPM(0.75)), simultaneous truth and performance level estimation (STAPLE), and truth estimate from self distances (TESD).

METHODS: A subset of the publicly available Lung Image Database Consortium archive was used, selecting pulmonary nodules documented by all four radiologists. The pair-wise similarities between the estimated GTs were analyzed by computing the respective Jaccard coefficients. The estimated volumes were then ranked with respect to the readers' marking volumes, and a sign test was performed on the differences between them.

RESULTS: (a) The rank variations among the four methods and the volume differences between STAPLE and TESD are not statistically significant; (b) TPM(0.50) estimates are statistically significantly larger; (c) TPM(0.75) estimates are statistically significantly smaller; (d) there is some spatial disagreement in the estimates, as shown by the one-sided 90% confidence intervals for each pair of methods: TPM(0.75) and TPM(0.50), [0.67, 1.00]; TPM(0.75) and STAPLE, [0.67, 1.00]; TPM(0.75) and TESD, [0.77, 1.00]; TPM(0.50) and STAPLE, [0.93, 1.00]; TPM(0.50) and TESD, [0.85, 1.00]; STAPLE and TESD, [0.85, 1.00].

CONCLUSIONS: The method used to estimate the GT matters. The observed differences indicate that STAPLE and TESD, notwithstanding a few weaknesses, appear to be equally viable GT estimators, while the increased availability of computing power is reducing the appeal of TPMs. Ultimately, the choice between STAPLE and TESD depends on the characteristics of the marked data with respect to the two elements that differentiate the approaches: the relative reliabilities of the readers and the reliability of the region boundaries.
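
The two simplest building blocks named in the abstract, the thresholded probability map and the Jaccard coefficient, together with a sign test on volume differences, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the representation of each reader's marking as a set of voxel coordinates, the inclusive (>=) threshold, and the two-sided form of the sign test are all assumptions made for illustration. STAPLE and TESD are not reproduced here.

```python
from collections import Counter
from math import comb


def thresholded_pmap(markings, threshold):
    """Return the voxels marked by at least `threshold` of the readers.

    `markings` is a list with one set of voxel coordinates per reader.
    Assumption: the threshold is inclusive (>=); the abstract does not say.
    """
    counts = Counter(v for marking in markings for v in set(marking))
    n_readers = len(markings)
    return {v for v, c in counts.items() if c / n_readers >= threshold}


def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B| of two voxel sets."""
    a, b = set(a), set(b)
    union = a | b
    # Convention (assumption): two empty regions count as identical.
    return len(a & b) / len(union) if union else 1.0


def sign_test_p(differences):
    """Two-sided sign test p-value; zero differences are discarded."""
    pos = sum(d > 0 for d in differences)
    neg = sum(d < 0 for d in differences)
    n = pos + neg
    if n == 0:
        return 1.0
    k = min(pos, neg)
    # Probability of k or fewer signs of the rarer kind under a fair coin.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical toy data: four readers marking voxels on a single image row.
readers = [
    {(0, 1), (0, 2), (0, 3)},
    {(0, 1), (0, 2), (0, 3), (0, 4)},
    {(0, 2), (0, 3)},
    {(0, 2), (0, 3), (0, 4)},
]

tpm50 = thresholded_pmap(readers, 0.50)  # voxels marked by >= 2 of 4 readers
tpm75 = thresholded_pmap(readers, 0.75)  # voxels marked by >= 3 of 4 readers

print(len(tpm50), len(tpm75), jaccard(tpm50, tpm75))
```

In this toy case the 0.50 threshold keeps four voxels and the 0.75 threshold keeps two, which mirrors the direction of results (b) and (c): a lower threshold can only produce an equal or larger region than a higher one.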