Xuan Guo1, Qi Yu2, Cecilia Ovesdotter Alm3, Cara Calvelli4, Jeff B Pelz5, Pengcheng Shi2, Anne R Haake2. 1. College of Computing & Information Sciences, Rochester Institute of Technology, 20 Lomb Memorial Drive, Rochester, NY 14623, USA. Electronic address: andyroddick1989@gmail.com. 2. College of Computing & Information Sciences, Rochester Institute of Technology, 20 Lomb Memorial Drive, Rochester, NY 14623, USA. 3. College of Liberal Arts, Rochester Institute of Technology, 92 Lomb Memorial Drive, Rochester, NY 14623, USA. 4. College of Health Sciences & Technology, Rochester Institute of Technology, 90 Lomb Memorial Drive, Rochester, NY 14623, USA. 5. Center for Imaging Science, Rochester Institute of Technology, 54 Lomb Memorial Drive, Rochester, NY 14623, USA.
Abstract
OBJECTIVES: Extracting useful visual clues from medical images allowing accurate diagnoses requires physicians' domain knowledge acquired through years of systematic study and clinical training. This is especially true in the dermatology domain, a medical specialty that requires physicians to have image inspection experience. Automating or at least aiding such efforts requires understanding physicians' reasoning processes and their use of domain knowledge. Mining physicians' references to medical concepts in narratives during image-based diagnosis of a disease is an interesting research topic that can help reveal experts' reasoning processes. It can also be a useful resource to assist with design of information technologies for image use and for image case-based medical education systems. METHODS AND MATERIALS: We collected data for analyzing physicians' diagnostic reasoning processes by conducting an experiment that recorded their spoken descriptions during inspection of dermatology images. In this paper we focus on the benefit of physicians' spoken descriptions and provide a general workflow for mining medical domain knowledge based on linguistic data from these narratives. The challenge of a medical image case can influence the accuracy of the diagnosis as well as how physicians pursue the diagnostic process. Accordingly, we define two lexical metrics for physicians' narratives--lexical consensus score and top N relatedness score--and evaluate their usefulness by assessing the diagnostic challenge levels of corresponding medical images. We also report on clustering medical images based on anchor concepts obtained from physicians' medical term usage. These analyses are based on physicians' spoken narratives that have been preprocessed by incorporating the Unified Medical Language System for detecting medical concepts. RESULTS: The image rankings based on lexical consensus score and on top 1 relatedness score are well correlated with those based on challenge levels (Spearman correlation>0.5 and Kendall correlation>0.4). Clustering results are largely improved based on our anchor concept method (accuracy>70% and mutual information>80%). CONCLUSIONS: Physicians' spoken narratives are valuable for the purpose of mining the domain knowledge that physicians use in medical image inspections. We also show that the semantic metrics introduced in the paper can be successfully applied to medical image understanding and allow discussion of additional uses of these metrics.
OBJECTIVES: Extracting useful visual clues from medical images allowing accurate diagnoses requires physicians' domain knowledge acquired through years of systematic study and clinical training. This is especially true in the dermatology domain, a medical specialty that requires physicians to have image inspection experience. Automating or at least aiding such efforts requires understanding physicians' reasoning processes and their use of domain knowledge. Mining physicians' references to medical concepts in narratives during image-based diagnosis of a disease is an interesting research topic that can help reveal experts' reasoning processes. It can also be a useful resource to assist with design of information technologies for image use and for image case-based medical education systems. METHODS AND MATERIALS: We collected data for analyzing physicians' diagnostic reasoning processes by conducting an experiment that recorded their spoken descriptions during inspection of dermatology images. In this paper we focus on the benefit of physicians' spoken descriptions and provide a general workflow for mining medical domain knowledge based on linguistic data from these narratives. The challenge of a medical image case can influence the accuracy of the diagnosis as well as how physicians pursue the diagnostic process. Accordingly, we define two lexical metrics for physicians' narratives--lexical consensus score and top N relatedness score--and evaluate their usefulness by assessing the diagnostic challenge levels of corresponding medical images. We also report on clustering medical images based on anchor concepts obtained from physicians' medical term usage. These analyses are based on physicians' spoken narratives that have been preprocessed by incorporating the Unified Medical Language System for detecting medical concepts. RESULTS: The image rankings based on lexical consensus score and on top 1 relatedness score are well correlated with those based on challenge levels (Spearman correlation>0.5 and Kendall correlation>0.4). Clustering results are largely improved based on our anchor concept method (accuracy>70% and mutual information>80%). CONCLUSIONS: Physicians' spoken narratives are valuable for the purpose of mining the domain knowledge that physicians use in medical image inspections. We also show that the semantic metrics introduced in the paper can be successfully applied to medical image understanding and allow discussion of additional uses of these metrics.
Keywords:
Clustering algorithm; Image-based diagnostic reasoning; Lexical consensus; Medical data analysis; Semantic relatedness; Unified Medical Language System