Travis Zack (1, 2), Gurpreet Dhaliwal (3, 4), Rabih Geha (3, 4), Mary Margaretten (5), Sara Murray (6), Julian C Hong (7, 8).
1. Division of Hematology/Oncology, Department of Medicine, University of California, San Francisco, CA, USA. travis.zack@ucsf.edu.
2. Bakar Computational Health Sciences Institute, University of California, San Francisco, CA, 94158, USA. travis.zack@ucsf.edu.
3. San Francisco VA Medical Center, San Francisco, CA, USA.
4. Department of Medicine, University of California, San Francisco, CA, USA.
5. Division of Rheumatology, Department of Medicine, University of California, San Francisco, CA, USA.
6. Division of Hospital Medicine, Department of Medicine, University of California, San Francisco, CA, USA.
7. Bakar Computational Health Sciences Institute, University of California, San Francisco, CA, 94158, USA.
8. Department of Radiation Oncology, University of California, San Francisco, CA, USA.
Abstract
IMPORTANCE: Case reports that externalize expert diagnostic reasoning are utilized for clinical reasoning instruction but are difficult to search based on symptoms, final diagnosis, or differential diagnosis construction. Computational approaches that uncover how experienced diagnosticians analyze the medical information in a case as they formulate a differential diagnosis can guide educational uses of case reports.

OBJECTIVE: To develop a "reasoning-encoded" case database for advanced clinical reasoning instruction by applying natural language processing (NLP), a sub-field of artificial intelligence, to a large case report library.

DESIGN: We collected 2525 cases from the New England Journal of Medicine (NEJM) Clinical Pathological Conference (CPC) from 1965 to 2020 and used NLP to analyze the medical terminology in each case to derive unbiased (not prespecified) categories of analysis used by the clinical discussant. We then analyzed and mapped the degree of category overlap between cases.

RESULTS: Our NLP algorithms identified clinically relevant categories that reflected the relationships between medical terms (which included symptoms, signs, test results, pathophysiology, and diagnoses). NLP extracted 43,291 symptoms across 2525 cases and physician-annotated 6532 diagnoses (both primary and related diagnoses). Our unsupervised learning computational approach identified 12 categories of medical terms that characterized the differential diagnosis discussions within individual cases. We used these categories to derive a measure of differential diagnosis similarity between cases and developed a website (universeofcpc.com) to allow visualization and exploration of 55 years of NEJM CPC case series.

CONCLUSIONS: Applying NLP to curated instances of diagnostic reasoning can provide insight into how expert clinicians correlate and coordinate disease categories and processes when creating a differential diagnosis. Our reasoning-encoded CPC case database can be used by clinician-educators to design a case-based curriculum and by physicians to direct their lifelong learning efforts.
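The abstract does not state how the similarity measure between cases was computed. As an illustration only, the sketch below assumes each case is summarized as a vector of weights over the 12 term categories and compares cases with cosine similarity. The case identifiers, the weights, and the choice of cosine similarity are all hypothetical and are not taken from the study.

```python
import math

N_CATEGORIES = 12  # number of term categories reported in the abstract


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length category-weight vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: a case with no category weight matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_id, cases, k=3):
    """Return up to k case ids ranked by similarity to the query case."""
    query = cases[query_id]
    scored = [
        (cosine_similarity(query, vec), cid)
        for cid, vec in cases.items()
        if cid != query_id
    ]
    scored.sort(reverse=True)
    return [cid for _, cid in scored[:k]]


# Hypothetical toy profiles (not real CPC data): the share of each case's
# differential diagnosis discussion that falls in each of the 12 categories.
cases = {
    "case_A": [0.6, 0.3, 0.1] + [0.0] * 9,
    "case_B": [0.5, 0.4, 0.1] + [0.0] * 9,
    "case_C": [0.0] * 9 + [0.2, 0.3, 0.5],
}

if __name__ == "__main__":
    print(most_similar("case_A", cases, k=2))
```

Under these assumptions, two cases whose discussions draw on the same categories score close to 1, and cases with no shared categories score 0, which is one way a "degree of category overlap" could be mapped.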