Using phrases and document metadata to improve topic modeling of clinical reports.

William Speier, Michael K Ong, Corey W Arnold.

Abstract

Probabilistic topic models provide an unsupervised method for analyzing unstructured text and have the potential to be integrated into clinical automatic summarization systems. Clinical documents are accompanied by metadata in a patient's medical history and frequently contain multi-word concepts that can be valuable for accurately interpreting the included text. While existing methods have attempted to address these problems individually, we present a unified model for free-text clinical documents that integrates contextual patient- and document-level data and discovers multi-word concepts. In the proposed model, phrases are represented by chained n-grams, and a Dirichlet hyper-parameter is weighted by both document-level and patient-level context. This method and three other latent Dirichlet allocation models were fit to a large collection of clinical reports. Example topics illustrate the output of the new model, and the quality of the representations is evaluated using empirical log likelihood. The proposed model was able to create informative prior probabilities based on patient and document information, and it captured phrases that represented various clinical concepts. The representation using the proposed model had a significantly higher empirical log likelihood than the compared methods. Integrating document metadata and capturing phrases in clinical text greatly improves the topic representation of clinical documents. The resulting clinically informative topics may effectively serve as the basis for an automatic summarization system for clinical reports.
Copyright © 2016 Elsevier Inc. All rights reserved.
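
To make the idea of a metadata-weighted Dirichlet hyper-parameter more concrete, the sketch below computes a per-document topic prior from a feature vector. It is only an illustration: the abstract does not give the functional form, so the log-linear weighting used here (in the spirit of Dirichlet-multinomial regression) is an assumption, and the function names, feature encoding, and weight values are all hypothetical.

```python
import math


def document_prior(features, topic_weights):
    """Return one Dirichlet concentration value per topic for one document.

    features:      metadata for the document and its patient, encoded as
                   floats (hypothetical encoding, e.g. indicator variables).
    topic_weights: one weight vector per topic, same length as `features`.

    ASSUMPTION: alpha_k = exp(w_k . x). The abstract only says the
    hyper-parameter is weighted by context; this form is illustrative.
    """
    return [
        math.exp(sum(w * x for w, x in zip(weights, features)))
        for weights in topic_weights
    ]


def prior_topic_proportions(alpha):
    """Expected topic proportions under a Dirichlet(alpha) prior."""
    total = sum(alpha)
    return [a / total for a in alpha]


# Hypothetical features: [bias, is_radiology_report, patient_has_prior_imaging]
features = [1.0, 1.0, 0.0]
# Hypothetical weights for two topics.
topic_weights = [
    [0.0, 1.0, 0.0],  # topic 0 is favored by radiology reports
    [0.0, 0.0, 1.0],  # topic 1 is favored by patients with prior imaging
]

alpha = document_prior(features, topic_weights)
proportions = prior_topic_proportions(alpha)
```

With these made-up values, the document's metadata raises the prior weight of topic 0 relative to topic 1 before any words are observed, which is the sense in which such a prior can be "informative".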

Keywords:  Document metadata; LDA; Topic modeling; n-grams

Year:  2016        PMID: 27109931      PMCID: PMC4902330          DOI: 10.1016/j.jbi.2016.04.005

Source DB:  PubMed          Journal:  J Biomed Inform        ISSN: 1532-0464            Impact factor:   6.317


  7 in total

1.  Finding scientific topics.

Authors:  Thomas L Griffiths; Mark Steyvers
Journal:  Proc Natl Acad Sci U S A       Date:  2004-02-10       Impact factor: 11.205

2.  Dialect topic modeling for improved consumer medical search.

Authors:  Steven P Crain; Shuang-Hong Yang; Hongyuan Zha; Yu Jiao
Journal:  AMIA Annu Symp Proc       Date:  2010-11-13

3.  Summarization of clinical information: a conceptual model.

Authors:  Joshua C Feblowitz; Adam Wright; Hardeep Singh; Lipika Samal; Dean F Sittig
Journal:  J Biomed Inform       Date:  2011-03-31       Impact factor: 6.317

4.  Clinical Case-based Retrieval Using Latent Topic Analysis.

Authors:  Corey W Arnold; Suzie M El-Saden; Alex A T Bui; Ricky Taira
Journal:  AMIA Annu Symp Proc       Date:  2010-11-13

5.  Evaluating topic model interpretability from a primary care physician perspective.

Authors:  Corey W Arnold; Andrea Oh; Shawn Chen; William Speier
Journal:  Comput Methods Programs Biomed       Date:  2015-10-30       Impact factor: 5.428

6.  Unfolding Physiological State: Mortality Modelling in Intensive Care Units.

Authors:  Marzyeh Ghassemi; Tristan Naumann; Finale Doshi-Velez; Nicole Brimmer; Rohit Joshi; Anna Rumshisky; Peter Szolovits
Journal:  KDD       Date:  2014-08-24

7.  Redundancy-aware topic modeling for patient record notes.

Authors:  Raphael Cohen; Iddo Aviram; Michael Elhadad; Noémie Elhadad
Journal:  PLoS One       Date:  2014-02-13       Impact factor: 3.240

  5 in total

1.  Bidirectional Representation Learning From Transformers Using Multimodal Electronic Health Record Data to Predict Depression.

Authors:  Yiwen Meng; William Speier; Michael K Ong; Corey W Arnold
Journal:  IEEE J Biomed Health Inform       Date:  2021-08-05       Impact factor: 7.021

2.  Extracting Additional Influences From Physician Profiles With Topic Modeling: Impact on Ratings and Page Views in Online Healthcare Communities.

Authors:  Xiaoling Wei; Yuan-Teng Hsu
Journal:  Front Psychol       Date:  2022-04-01

3.  Latent Patient Cluster Discovery for Robust Future Forecasting and New-Patient Generalization.

Authors:  Ting Qian; Aaron J Masino
Journal:  PLoS One       Date:  2016-09-16       Impact factor: 3.240

4.  HCET: Hierarchical Clinical Embedding With Topic Modeling on Electronic Health Records for Predicting Future Depression.

Authors:  Yiwen Meng; William Speier; Michael Ong; Corey W Arnold
Journal:  IEEE J Biomed Health Inform       Date:  2021-04-06       Impact factor: 5.772

5.  Mapping and Modeling of Discussions Related to Gastrointestinal Discomfort in French-Speaking Online Forums: Results of a 15-Year Retrospective Infodemiology Study.

Authors:  Florent Schäfer; Carole Faviez; Paméla Voillot; Pierre Foulquié; Matthieu Najm; Jean-François Jeanne; Guy Fagherazzi; Stéphane Schück; Boris Le Nevé
Journal:  J Med Internet Res       Date:  2020-11-03       Impact factor: 5.428
