
Learning multiple distributed prototypes of semantic categories for named entity recognition.

Aron Henriksson.   

Abstract

The scarcity of large labelled datasets comprising clinical text that can be exploited within the paradigm of supervised machine learning creates barriers for the secondary use of data from electronic health records. It is therefore important to develop capabilities to leverage the large amounts of unlabelled data that, indeed, tend to be readily available. One technique utilises distributional semantics to create word representations in a wholly unsupervised manner and uses existing training data to learn prototypical representations of predefined semantic categories. Features describing whether a given word belongs to a certain category are then provided to the learning algorithm. It has been shown that using multiple distributional semantic models, each employing a different word order strategy, can lead to enhanced predictive performance. Here, another hyperparameter is also varied--the size of the context window--and an experimental investigation shows that this leads to further performance gains.
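The idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes that a category prototype is the centroid of the vectors of labelled words, that membership is decided by a cosine-similarity threshold, and that each context-window size yields its own distributional model. The toy vectors, the model names `window_2` and `window_10`, the threshold of 0.8, and the helper names `learn_prototypes` and `prototype_features` are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def learn_prototypes(vectors, labelled):
    """Average the vectors of labelled words per category.

    Using the centroid as the prototype is an assumption made for this sketch.
    """
    sums, counts = {}, {}
    for word, category in labelled:
        vec = vectors.get(word)
        if vec is None:
            continue  # labelled word missing from this model's vocabulary
        acc = sums.setdefault(category, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[category] = counts.get(category, 0) + 1
    return {c: [x / counts[c] for x in s] for c, s in sums.items()}


def prototype_features(word, models, prototypes, threshold=0.8):
    """One boolean feature per (model, category) pair for a given word."""
    features = {}
    for name, vectors in models.items():
        vec = vectors.get(word)
        for category, proto in prototypes[name].items():
            close = vec is not None and cosine(vec, proto) >= threshold
            features[f"{name}:{category}"] = close
    return features


# Hypothetical word vectors from two models built with different window sizes.
models = {
    "window_2": {
        "aspirin": [1.0, 0.1], "ibuprofen": [0.9, 0.2],
        "fever": [0.1, 1.0], "cough": [0.2, 0.9],
        "paracetamol": [0.95, 0.15],
    },
    "window_10": {
        "aspirin": [0.8, 0.3], "ibuprofen": [0.7, 0.4],
        "fever": [0.2, 0.9], "cough": [0.3, 0.8],
        "paracetamol": [0.75, 0.35],
    },
}

# Existing training data: words annotated with a semantic category.
labelled = [("aspirin", "Drug"), ("ibuprofen", "Drug"),
            ("fever", "Symptom"), ("cough", "Symptom")]

prototypes = {name: learn_prototypes(vecs, labelled)
              for name, vecs in models.items()}

# Features for an unlabelled word; these would be passed to the NER learner.
features = prototype_features("paracetamol", models, prototypes)
print(features)
```

In this sketch each model contributes its own set of category features, so a downstream learner can weigh evidence from narrow and wide context windows separately.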

Year: 2015    PMID: 26547986    DOI: 10.1504/ijdmb.2015.072766

Source DB: PubMed    Journal: Int J Data Min Bioinform    ISSN: 1748-5673    Impact factor: 0.667


  2 in total

1.  [Freiburg keratoconus registry: Example of application of smart data for clinical research and initial results].

Authors:  S J Lang; D Böhringer; T Reinhard
Journal:  Ophthalmologe       Date:  2016-06       Impact factor: 1.059

2.  Ensembles of randomized trees using diverse distributed representations of clinical events.

Authors:  Aron Henriksson; Jing Zhao; Hercules Dalianis; Henrik Boström
Journal:  BMC Med Inform Decis Mak       Date:  2016-07-21       Impact factor: 2.796

