
Developing a corpus of clinical notes manually annotated for part-of-speech.

Serguei V Pakhomov, Anni Coden, Christopher G Chute.

Abstract

PURPOSE: This paper presents a project whose main goal is to construct a corpus of clinical text manually annotated for part-of-speech (POS) information. We describe and discuss the process of training three domain experts to perform linguistic annotation.
METHODS: Three domain experts were trained to perform manual annotation of a corpus of clinical notes. A part of this corpus was combined with the Penn Treebank corpus of general purpose English text and another part was set aside for testing. The corpora were then used for training and testing statistical part-of-speech taggers. We list some of the challenges as well as encouraging results pertaining to inter-rater agreement and consistency of annotation.
RESULTS: The Trigrams'n'Tags (TnT) tagger [T. Brants, TnT - a statistical part-of-speech tagger, In: Proceedings of NAACL/ANLP-2000 Symposium, 2000], trained on general English data, achieved 89.79% correctness. Training the same tagger on a portion of the medical data annotated for this project raised correctness to 94.69%. Furthermore, we find that discriminating between the different types of discourse represented by different sections of clinical text may substantially improve the correctness of POS tagging.
CONCLUSION: Our preliminary experimental results indicate that state-of-the-art POS taggers need to be adapted to the sublanguage domain of clinical text.
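
The comparison reported in the abstract (a tagger trained on general English versus the same tagger trained with in-domain clinical data, both scored on held-out clinical text) can be illustrated with a minimal Python sketch. It is not the authors' setup: a most-frequent-tag (unigram) baseline stands in for TnT, which is a trigram-based statistical tagger, and the toy sentences, the `NN` default tag, and all function names are assumptions made for illustration. The final helper only restates the two reported correctness figures as a relative error reduction.

```python
from collections import Counter, defaultdict


def train_unigram_tagger(tagged_sents, default_tag="NN"):
    """Return a tagger that assigns each word its most frequent training tag.

    Hypothetical stand-in for a statistical tagger such as TnT; unknown
    words receive `default_tag` (an assumption of this sketch).
    """
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        for word, tag in sent:
            counts[word.lower()][tag] += 1
    lexicon = {w: c.most_common(1)[0][0] for w, c in counts.items()}

    def tag(words):
        return [(w, lexicon.get(w.lower(), default_tag)) for w in words]

    return tag


def correctness(tagger, gold_sents):
    """Token-level accuracy: fraction of tokens whose predicted tag matches gold."""
    correct = total = 0
    for sent in gold_sents:
        words = [w for w, _ in sent]
        for (_, pred), (_, gold) in zip(tagger(words), sent):
            correct += pred == gold
            total += 1
    return correct / total


def relative_error_reduction(base_pct, new_pct):
    """Share of the baseline error removed when correctness rises to new_pct."""
    base_err = 100.0 - base_pct
    return (base_err - (100.0 - new_pct)) / base_err


# Toy, invented data: "patient" and "left" behave differently across domains.
general = [
    [("the", "DT"), ("patient", "JJ"), ("man", "NN"), ("left", "VBD")],
    [("she", "PRP"), ("has", "VBZ"), ("gone", "VBN")],
]
clinical_train = [
    [("patient", "NN"), ("has", "VBZ"), ("left", "JJ"), ("knee", "NN"), ("pain", "NN")],
    [("patient", "NN"), ("denies", "VBZ"), ("left", "JJ"), ("arm", "NN"), ("pain", "NN")],
]
clinical_test = [
    [("patient", "NN"), ("reports", "VBZ"), ("left", "JJ"), ("arm", "NN"), ("pain", "NN")],
]

general_tagger = train_unigram_tagger(general)
adapted_tagger = train_unigram_tagger(general + clinical_train)

print(f"general only:       {correctness(general_tagger, clinical_test):.2f}")  # 0.40
print(f"general + clinical: {correctness(adapted_tagger, clinical_test):.2f}")  # 0.80

# Figures reported in the abstract: 89.79% -> 94.69% correctness.
print(f"relative error reduction: {relative_error_reduction(89.79, 94.69):.3f}")
```

On the reported figures, the gain of 4.90 percentage points corresponds to removing roughly 48% of the errors made by the general-English model (10.21% error down to 5.31%).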

Year:  2005        PMID: 16169769     DOI: 10.1016/j.ijmedinf.2005.08.006

Source DB:  PubMed          Journal:  Int J Med Inform        ISSN: 1386-5056            Impact factor:   4.046


  8 in total

1.  Part-of-speech tagging for clinical text: wall or bridge between institutions?

Authors:  Jung-wei Fan; Rashmi Prasad; Rommel M Yabut; Richard M Loomis; Daniel S Zisook; John E Mattison; Yang Huang
Journal:  AMIA Annu Symp Proc       Date:  2011-10-22

2.  Quantitative analysis of ontology research articles in the radiologic domain.

Authors:  Naoki Nishimoto; Ayako Yagahara; Yuki Yokooka; Shintaro Tsuji; Masahito Uesugi; Katsuhiko Ogasawara; Masaji Maezawa
Journal:  Radiol Phys Technol       Date:  2010-05-22

3.  Improving performance of natural language processing part-of-speech tagging on clinical narratives through domain adaptation.

Authors:  Jeffrey P Ferraro; Hal Daumé; Scott L Duvall; Wendy W Chapman; Henk Harkema; Peter J Haug
Journal:  J Am Med Inform Assoc       Date:  2013-03-13       Impact factor: 4.497

4.  Agreement between patient-reported symptoms and their documentation in the medical record.

Authors:  Serguei V Pakhomov; Steven J Jacobsen; Christopher G Chute; Veronique L Roger
Journal:  Am J Manag Care       Date:  2008-08       Impact factor: 2.229

Review 5.  What can natural language processing do for clinical decision support?

Authors:  Dina Demner-Fushman; Wendy W Chapman; Clement J McDonald
Journal:  J Biomed Inform       Date:  2009-08-13       Impact factor: 6.317

6.  A comprehensive study of mobility functioning information in clinical notes: Entity hierarchy, corpus annotation, and sequence labeling.

Authors:  Thanh Thieu; Jonathan Camacho Maldonado; Pei-Shu Ho; Min Ding; Alex Marr; Diane Brandt; Denis Newman-Griffis; Ayah Zirikly; Leighton Chan; Elizabeth Rasch
Journal:  Int J Med Inform       Date:  2020-12-24       Impact factor: 4.046

7.  Characteristics of Finnish and Swedish intensive care nursing narratives: a comparative analysis to support the development of clinical language technologies.

Authors:  Helen Allvin; Elin Carlsson; Hercules Dalianis; Riitta Danielsson-Ojala; Vidas Daudaravičius; Martin Hassel; Dimitrios Kokkinakis; Heljä Lundgrén-Laine; Gunnar H Nilsson; Oystein Nytrø; Sanna Salanterä; Maria Skeppstedt; Hanna Suominen; Sumithra Velupillai
Journal:  J Biomed Semantics       Date:  2011-07-14

8.  Design of an extensive information representation scheme for clinical narratives.

Authors:  Louise Deléger; Leonardo Campillos; Anne-Laure Ligozat; Aurélie Névéol
Journal:  J Biomed Semantics       Date:  2017-09-11