Using ontology network structure in text mining.

Donald J Berndt, James A McCart, Stephen L Luther.

Abstract

Statistical text mining treats documents as bags of words, focusing on term frequencies within documents and across document collections. Unlike natural language processing (NLP) techniques that rely on an engineered vocabulary or a full-featured ontology, statistical approaches do not use domain-specific knowledge. This freedom from built-in bias can be an advantage, but it comes at the cost of ignoring potentially valuable knowledge. We investigate a hybrid strategy that computes graph measures of term importance over an entire ontology and injects those measures into the statistical text mining process. As a starting point, we adapt existing search engine algorithms such as PageRank and HITS to determine term importance within an ontology graph. The graph-theoretic approach is evaluated on a smoking data set from the i2b2 National Center for Biomedical Computing, cast as a simple binary classification task for categorizing smoking-related documents, where it yields consistent improvements in accuracy.
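
The hybrid idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy ontology, the function names `pagerank` and `weighted_term_frequencies`, the damping factor, and the choice to multiply raw term counts by a scaled rank are all assumptions made for illustration, since the abstract does not say how the graph measures are injected. Only PageRank is shown; HITS is omitted.

```python
from collections import Counter


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {node: [linked nodes]}."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[v] for v in nodes if not graph.get(v))
        new_rank = {}
        for node in nodes:
            incoming = sum(
                rank[src] / len(targets)
                for src, targets in graph.items()
                if node in targets
            )
            new_rank[node] = (1 - damping) / n + damping * (incoming + dangling / n)
        rank = new_rank
    return rank


def weighted_term_frequencies(tokens, importance):
    """Scale raw term counts by ontology importance (assumed scheme).

    Ranks are multiplied by the number of ontology nodes so that a
    uniform rank maps to weight 1.0; terms outside the ontology keep
    their raw count.
    """
    scale = len(importance)
    counts = Counter(tokens)
    return {
        term: count * (importance[term] * scale if term in importance else 1.0)
        for term, count in counts.items()
    }


# Hypothetical toy ontology: each edge points from a term to its parent.
ontology = {
    "cigarette": ["tobacco"],
    "cigar": ["tobacco"],
    "nicotine": ["tobacco"],
    "tobacco": ["substance"],
    "aspirin": ["substance"],
    "substance": [],
}

ranks = pagerank(ontology)
tokens = "patient smokes one cigarette daily tobacco use".split()
weighted = weighted_term_frequencies(tokens, ranks)
```

In this sketch, terms that many other ontology terms point to (here `tobacco`) receive a higher weight than leaf terms with the same raw count, while words absent from the ontology pass through unchanged. The resulting weighted counts could then feed an ordinary bag-of-words classifier.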

Year:  2010        PMID: 21346937      PMCID: PMC3041319     

Source DB:  PubMed          Journal:  AMIA Annu Symp Proc        ISSN: 1559-4076


