
Incorporating Domain Knowledge into Topic Modeling via Dirichlet Forest Priors.

David Andrzejewski, Xiaojin Zhu, Mark Craven.

Abstract

Users of topic modeling methods often have knowledge about the composition of words that should have high or low probability in various topics. We incorporate such domain knowledge using a novel Dirichlet Forest prior in a Latent Dirichlet Allocation framework. The prior is a mixture of Dirichlet tree distributions with special structures. We present its construction, and inference via collapsed Gibbs sampling. Experiments on synthetic and real datasets demonstrate our model's ability to follow and generalize beyond user-specified domain knowledge.
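
As a rough illustration of a single Dirichlet tree of the kind the abstract mentions, the Python sketch below draws one topic's word distribution in which a set of words is tied together under a shared internal node, so that the tied words tend to receive similar probability. The function name `sample_must_link_tree`, the parameters `beta` and `eta`, and the specific edge weights are assumptions made for this illustration and are not stated in this record. The full Dirichlet Forest prior is a mixture over several such trees, and neither that mixture nor the collapsed Gibbs sampler is reproduced here.

```python
import numpy as np


def sample_must_link_tree(vocab_size, linked, beta=0.1, eta=50.0, rng=None):
    """Draw a word distribution from a two-level Dirichlet tree (illustrative).

    Assumed structure: the root has one edge per unconstrained word
    (weight beta) plus one edge to an internal node (weight len(linked) * beta).
    The internal node has one edge per linked word (weight eta * beta).
    A word's probability is the product of branch probabilities on its path.
    """
    rng = np.random.default_rng() if rng is None else rng
    linked = np.array(sorted(set(linked)), dtype=int)
    free = np.array([w for w in range(vocab_size) if w not in set(linked)], dtype=int)

    # Branch probabilities at the root: free words first, internal node last.
    root_alpha = np.concatenate([np.full(len(free), beta), [beta * len(linked)]])
    root = rng.dirichlet(root_alpha)

    # Branch probabilities below the internal node; a large eta makes the
    # linked words split the internal node's mass almost evenly.
    inner = rng.dirichlet(np.full(len(linked), eta * beta))

    phi = np.zeros(vocab_size)
    phi[free] = root[:-1]
    phi[linked] = root[-1] * inner
    return phi


if __name__ == "__main__":
    phi = sample_must_link_tree(vocab_size=6, linked=[1, 4],
                                rng=np.random.default_rng(0))
    print(phi, phi.sum())
```

With a large `eta`, the two linked words share whatever mass the internal node receives in roughly equal parts, while the total mass assigned to the pair remains free to be high or low in a given topic.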


Year: 2009
PMID: 20871805
PMCID: PMC2943854
DOI: 10.1145/1553374.1553378

Source DB: PubMed
Journal: Proc Int Conf Mach Learn



