Semi-supervised medical entity recognition: A study on Spanish and Swedish clinical corpora.

Alicia Pérez, Rebecka Weegar, Arantza Casillas, Koldo Gojenola, Maite Oronoz, Hercules Dalianis.

Abstract

OBJECTIVE: The goal of this study is to investigate entity recognition within Electronic Health Records (EHRs) focusing on Spanish and Swedish. Of particular importance is a robust representation of the entities. In our case, we utilized unsupervised methods to generate such representations.
METHODS: The significance of this work rests on its experimental design: the experiments were carried out under the same conditions for both languages. Several classification approaches were explored: maximum probability, CRF, Perceptron and SVM. The classifiers were enhanced with ensembles of semantic spaces and ensembles of Brown trees. To mitigate data sparsity without a significant increase in the dimension of the decision space, we propose clustered representations: hierarchical Brown clustering, represented by trees, and vector quantization of each semantic space.
RESULTS: The results showed that the semi-supervised approaches significantly improved on standard supervised techniques for both languages. Moreover, clustering the semantic spaces contributed to the quality of the entity recognition while keeping the dimension of the feature space two orders of magnitude lower than when directly using the semantic spaces.
CONCLUSIONS: The contributions of this study are: (a) a set of thorough experiments that enable comparisons regarding the influence of different types of features on different classifiers, exploring two languages other than English; and (b) the use of ensembles of clusters of Brown trees and semantic spaces on EHRs to tackle the problem of scarcity of available annotated data.
Copyright © 2017 Elsevier Inc. All rights reserved.
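
The idea of replacing dense semantic-space vectors with cluster identifiers can be illustrated with a minimal sketch. It is not the authors' implementation: the toy vectors, the plain k-means quantizer, the bit-string prefix lengths, and the names `kmeans_quantize` and `brown_prefix_features` are all assumptions made for illustration. The point it shows is that each token contributes one categorical feature per space (a cluster ID or a Brown-path prefix) instead of one real-valued feature per embedding dimension.

```python
# Hypothetical toy semantic space: word -> 2-dimensional vector (illustrative values only).
TOY_VECTORS = {
    "fever": (0.9, 0.1),
    "cough": (0.8, 0.2),
    "aspirin": (0.1, 0.9),
    "ibuprofen": (0.2, 0.8),
}


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_quantize(vectors, k, iters=20):
    """Vector quantization via plain k-means: map each word to a centroid index."""
    words = sorted(vectors)
    # Deterministic initialisation (an assumption, for reproducibility):
    # use the vectors of the first k words in sorted order as centroids.
    centroids = [list(vectors[w]) for w in words[:k]]
    assignment = {}
    for _ in range(iters):
        # Assignment step: nearest centroid for every word.
        for w in words:
            assignment[w] = min(
                range(k), key=lambda c: squared_distance(vectors[w], centroids[c])
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [vectors[w] for w in words if assignment[w] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def brown_prefix_features(bit_path, lengths=(2, 4)):
    """Turn a Brown-tree path (bit string) into coarse-to-fine prefix features."""
    return [f"brown_{n}={bit_path[:n]}" for n in lengths if len(bit_path) >= n]


if __name__ == "__main__":
    clusters = kmeans_quantize(TOY_VECTORS, k=2)
    for word in sorted(clusters):
        # One categorical feature replaces the whole dense vector.
        print(word, f"space_cluster={clusters[word]}")
    print(brown_prefix_features("011010"))
```

In a sequence labeller such as a CRF, features like these would be added to each token's feature set, and using several semantic spaces or Brown trees would simply contribute several such categorical features per token.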

Keywords:  Health records; Medical entity recognition; Supervised and unsupervised learning

Year:  2017        PMID: 28526460     DOI: 10.1016/j.jbi.2017.05.009

Source DB:  PubMed          Journal:  J Biomed Inform        ISSN: 1532-0464            Impact factor:   6.317


  2 in total

1.  Recent advances in Swedish and Spanish medical entity recognition in clinical texts using deep neural approaches.

Authors:  Rebecka Weegar; Alicia Pérez; Arantza Casillas; Maite Oronoz
Journal:  BMC Med Inform Decis Mak       Date:  2019-12-23       Impact factor: 2.796

2.  Evaluation of Natural Language Processing for the Identification of Crohn Disease-Related Variables in Spanish Electronic Health Records: A Validation Study for the PREMONITION-CD Project.

Authors:  Carmen Montoto; Javier P Gisbert; Iván Guerra; Rocío Plaza; Ramón Pajares Villarroya; Luis Moreno Almazán; María Del Carmen López Martín; Mercedes Domínguez Antonaya; Isabel Vera Mendoza; Jesús Aparicio; Vicente Martínez; Ignacio Tagarro; Alonso Fernandez-Nistal; Lea Canales; Sebastian Menke; Fernando Gomollón
Journal:  JMIR Med Inform       Date:  2022-02-18