Automatic recognition of disorders, findings, pharmaceuticals and body structures from clinical text: an annotation and machine learning study.

Maria Skeppstedt, Maria Kvist, Gunnar H Nilsson, Hercules Dalianis.

Abstract

Automatic recognition of clinical entities in the narrative text of health records is useful for constructing applications for documentation of patient care, as well as for secondary usage in the form of medical knowledge extraction. There are a number of named entity recognition studies on English clinical text, but less work has been carried out on clinical text in other languages. This study was performed on Swedish health records, and focused on four entities that are highly relevant for constructing a patient overview and for medical hypothesis generation, namely the entities: Disorder, Finding, Pharmaceutical Drug and Body Structure. The study had two aims: to explore how well named entity recognition methods previously applied to English clinical text perform on similar texts written in Swedish; and to evaluate whether it is meaningful to divide the more general category Medical Problem, which has been used in a number of previous studies, into the two more granular entities, Disorder and Finding. Clinical notes from a Swedish internal medicine emergency unit were annotated for the four selected entity categories, and the inter-annotator agreement between two pairs of annotators was measured, resulting in an average F-score of 0.79 for Disorder, 0.66 for Finding, 0.90 for Pharmaceutical Drug and 0.80 for Body Structure. A subset of the developed corpus was thereafter used for finding suitable features for training a conditional random fields model. Finally, a new model was trained on this subset, using the best features and settings, and its ability to generalise to held-out data was evaluated. This final model obtained an F-score of 0.81 for Disorder, 0.69 for Finding, 0.88 for Pharmaceutical Drug, 0.85 for Body Structure and 0.78 for the combined category Disorder+Finding. 
The obtained results, which are in line with or slightly lower than those for similar studies on English clinical text, many of them conducted using a larger training data set, show that the approaches used for English are also suitable for Swedish clinical text. However, a small proportion of the errors made by the model are less likely to occur in English text, showing that results might be improved by further tailoring the system to clinical Swedish. The entity recognition results for the individual entities Disorder and Finding show that it is meaningful to separate the general category Medical Problem into these two more granular entity types, e.g. for knowledge mining of co-morbidity relations and disorder-finding relations.
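The F-scores reported above, both for inter-annotator agreement and for the trained model, can be illustrated with a minimal Python sketch. The abstract does not state the matching criterion, so exact span and label matching is an assumption here. The Entity type, the f_score and merge_problem_labels helpers, and the toy annotations are all hypothetical, invented for illustration rather than taken from the study.

```python
from typing import NamedTuple


class Entity(NamedTuple):
    start: int   # offset where the annotated span begins
    end: int     # offset where the span ends (exclusive)
    label: str   # e.g. "Disorder", "Finding"


def f_score(reference: set, candidate: set, label: str) -> float:
    """Exact-match F-score for one entity category (assumed criterion)."""
    ref = {e for e in reference if e.label == label}
    cand = {e for e in candidate if e.label == label}
    true_positives = len(ref & cand)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(cand)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)


def merge_problem_labels(entities: set) -> set:
    """Collapse Disorder and Finding into one combined category."""
    merged = {"Disorder", "Finding"}
    return {
        Entity(e.start, e.end, "Disorder+Finding") if e.label in merged else e
        for e in entities
    }


# Toy annotations, invented for illustration only.
annotator_a = {
    Entity(0, 2, "Disorder"),
    Entity(5, 6, "Finding"),
    Entity(9, 10, "Finding"),
}
annotator_b = {
    Entity(0, 2, "Disorder"),
    Entity(5, 6, "Finding"),
    Entity(9, 11, "Finding"),    # boundary disagreement with annotator A
    Entity(12, 13, "Finding"),   # span that annotator A did not mark
}

for category in ("Disorder", "Finding"):
    print(category, round(f_score(annotator_a, annotator_b, category), 2))

combined = f_score(
    merge_problem_labels(annotator_a),
    merge_problem_labels(annotator_b),
    "Disorder+Finding",
)
print("Disorder+Finding", round(combined, 2))
```

The same function covers both uses described in the abstract: passing two annotators' span sets gives an agreement score, while passing gold annotations and model predictions gives an evaluation score. A more lenient criterion, such as partial span overlap, would yield higher values than exact matching.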

Keywords: Clinical text processing; Corpora development; Disorder; Finding; Named entity recognition; Swedish

Year: 2014; PMID: 24508177; DOI: 10.1016/j.jbi.2014.01.012

Source DB: PubMed; Journal: J Biomed Inform; ISSN: 1532-0464; Impact factor: 6.317

Cited by: 17 in total

Review 1. Clinical Natural Language Processing in 2014: Foundational Methods Supporting Efficient Healthcare.

Review 2. Recent Advances in Clinical Natural Language Processing in Support of Semantic Analysis.

3. Identifying the Clinical Laboratory Tests from Unspecified "Other Lab Test" Data for Secondary Use.

4. Comparing Different Methods for Named Entity Recognition in Portuguese Neurology Text.

5. A Relation-Oriented Model With Global Context Information for Joint Extraction of Overlapping Relations and Entities.

6. Finding Cervical Cancer Symptoms in Swedish Clinical Text using a Machine Learning Approach and NegEx.

7. Machine Learning for Geriatric Clinical Care: Opportunities and Challenges.

8. Fine-grained information extraction from German transthoracic echocardiography reports.

Review 9. Clinical Natural Language Processing in languages other than English: opportunities and challenges.

10. Design of an extensive information representation scheme for clinical narratives.