Evaluating topic model interpretability from a primary care physician perspective.

Corey W Arnold, Andrea Oh, Shawn Chen, William Speier.

Abstract

BACKGROUND AND OBJECTIVE: Probabilistic topic models provide an unsupervised method for analyzing unstructured text. These models discover semantically coherent combinations of words (topics) that could be integrated in a clinical automatic summarization system for primary care physicians performing chart review. However, the human interpretability of topics discovered from clinical reports is unknown. Our objective is to assess the coherence of topics and their ability to represent the contents of clinical reports from a primary care physician's point of view.
METHODS: Three latent Dirichlet allocation models (50 topics, 100 topics, and 150 topics) were fit to a large collection of clinical reports. Topics were manually evaluated by primary care physicians and graduate students. Wilcoxon signed-rank tests for paired samples were used to evaluate differences among the topic models, while differences in performance between students and primary care physicians (PCPs) were tested using Mann-Whitney U tests for each of the tasks.
RESULTS: While the 150-topic model produced the best log likelihood, participants were most accurate at identifying words that did not belong in topics learned by the 100-topic model, suggesting that 100 topics provides better relative granularity of discovered semantic themes for the data set used in this study. Models were comparable in their ability to represent the contents of documents. Primary care physicians significantly outperformed students in both tasks.
CONCLUSION: This work establishes a baseline of interpretability for topic models trained with clinical reports, and provides insights on the appropriateness of using topic models for informatics applications. Our results indicate that PCPs find discovered topics more coherent and representative of clinical reports relative to students, warranting further research into their use for automatic summarization.
Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
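
The two statistical comparisons named in the methods can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names and every score below are hypothetical, invented only to show how the test statistics are formed. It computes the Wilcoxon signed-rank statistic for paired scores (the same participants scored under two topic models) and the Mann-Whitney U statistic for two independent groups (PCPs versus students). P-values are omitted; in practice a statistics library would supply them.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_signed_rank(x, y):
    """Statistic W = min(W+, W-) for paired samples; zero differences are dropped."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_plus, w_minus)


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a relative to group_b (independent samples)."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    rank_sum_a = sum(ranks[:n_a])
    return rank_sum_a - n_a * (n_a + 1) / 2


# Hypothetical counts of correctly identified intruder words per participant.
correct_100_topics = [14, 12, 15, 11, 13]  # same five participants, 100-topic model
correct_150_topics = [12, 12, 11, 10, 14]  # same five participants, 150-topic model
print(wilcoxon_signed_rank(correct_100_topics, correct_150_topics))

# Hypothetical scores for two independent groups of participants.
pcp_scores = [14, 15, 13]
student_scores = [11, 12, 13, 10]
print(mann_whitney_u(pcp_scores, student_scores))
```

The paired test fits the model comparison because each participant is scored under every model, whereas the PCP and student groups contain different people and so call for an independent-samples test.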

Keywords:  Clinical reports; Primary care; Topic modeling

Year:  2015        PMID: 26614020      PMCID: PMC4724339          DOI: 10.1016/j.cmpb.2015.10.014

Source DB:  PubMed          Journal:  Comput Methods Programs Biomed        ISSN: 0169-2607            Impact factor:   5.428


