Hidden Markov model using Dirichlet process for de-identification.

Tao Chen, Richard M Cullen, Marshall Godwin.

Abstract

For the 2014 i2b2/UTHealth de-identification challenge, we introduced a new non-parametric Bayesian hidden Markov model using a Dirichlet process (HMM-DP). The model is intended to reduce task-specific feature engineering and to generalize well to new data. For the challenge, we developed a variational method to learn the model and an efficient approximation algorithm for prediction. To accommodate out-of-vocabulary words, we designed a number of feature functions that model them. The results show that the model is capable of using local context cues to make correct predictions without manual feature engineering, and that it performs as accurately as state-of-the-art conditional random field models in a number of categories. To incorporate long-range and cross-document context cues, we developed a skip-chain conditional random field model to align the results produced by HMM-DP, which further improved performance.

Keywords: De-identification; Dirichlet process; Hidden Markov model; Natural language processing; Variational method

Year: 2015 PMID: 26407642 PMCID: PMC4984397 DOI: 10.1016/j.jbi.2015.09.004

Source DB: PubMed Journal: J Biomed Inform ISSN: 1532-0464 Impact factor: 6.317

4 in total

1. Finding scientific topics.

Authors: Thomas L Griffiths; Mark Steyvers
Journal: Proc Natl Acad Sci U S A Date: 2004-02-10 Impact factor: 11.205

2. State-of-the-art anonymization of medical records using an iterative machine learning framework.

Authors: György Szarvas; Richárd Farkas; Róbert Busa-Fekete
Journal: J Am Med Inform Assoc Date: 2007 Sep-Oct Impact factor: 4.497

3. Evaluating the state-of-the-art in automatic de-identification.

Authors: Ozlem Uzuner; Yuan Luo; Peter Szolovits
Journal: J Am Med Inform Assoc Date: 2007-06-28 Impact factor: 4.497

4. Large-scale evaluation of automated clinical note de-identification and its impact on information extraction.

Authors: Louise Deleger; Katalin Molnar; Guergana Savova; Fei Xia; Todd Lingren; Qi Li; Keith Marsolo; Anil Jegga; Megan Kaiser; Laura Stoutenborough; Imre Solti
Journal: J Am Med Inform Assoc Date: 2012-08-02 Impact factor: 4.497

6 in total

1. Leveraging existing corpora for de-identification of psychiatric notes using domain adaptation.

2. Automatic prediction of coronary artery disease from clinical narratives.

3. Practical applications for natural language processing in clinical research: The 2014 i2b2/UTHealth shared tasks.

4. A hybrid approach to automatic de-identification of psychiatric notes.

5. De-identification of clinical notes via recurrent neural network and conditional random field.

6. Transferability of neural network clinical deidentification systems.