Hans Moen, Laura-Maria Peltonen, Mikko Koivumäki, Henry Suhonen, Tapio Salakoski, Filip Ginter, Sanna Salanterä.
Abstract
We report on the development and evaluation of a prototype tool intended to help laymen and patients understand the content of clinical narratives. The tool relies largely on unsupervised machine learning applied to two large corpora of unlabeled text: a clinical corpus and a general-domain corpus. From these, a joint semantic word-space model is created and used to extract easier-to-understand alternatives for words that laymen are likely to find difficult. Two domain experts evaluated the tool, and inter-rater agreement was calculated. When the tool is set to suggest ten alternatives for each difficult word, it suggests acceptable lay words for 55.51% of them. This and future manual evaluations will serve to further improve performance, including through the use of supervised machine learning.
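To make the retrieval step concrete, the following is a minimal sketch of looking up lay alternatives in a shared word space, not the authors' implementation. It assumes that word vectors already exist (the keywords mention word2vec, but no training is done here) and that a set of "easy" words is available. The abstract does not say how easy words are identified, so the names `TOY_VECTORS`, `TOY_LAY_VOCAB`, and `suggest_alternatives` are hypothetical, and the three-dimensional vectors are made up for illustration.

```python
import math

# Hypothetical toy vectors standing in for a joint clinical/general word space.
TOY_VECTORS = {
    "myocardial_infarction": [0.9, 0.1, 0.0],
    "heart_attack": [0.85, 0.15, 0.05],
    "cardiac_arrest": [0.7, 0.3, 0.1],
    "fracture": [0.0, 0.9, 0.2],
    "broken_bone": [0.05, 0.85, 0.25],
}

# Assumption: words considered easy for laymen; the selection rule is not in the abstract.
TOY_LAY_VOCAB = {"heart_attack", "broken_bone"}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def suggest_alternatives(word, vectors, lay_vocab, k=10):
    """Return up to k lay words, ordered from most to least similar to `word`."""
    if word not in vectors:
        return []
    target = vectors[word]
    scored = [
        (cosine(target, vec), candidate)
        for candidate, vec in vectors.items()
        if candidate != word and candidate in lay_vocab
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [candidate for _, candidate in scored[:k]]


if __name__ == "__main__":
    print(suggest_alternatives("myocardial_infarction", TOY_VECTORS, TOY_LAY_VOCAB))
```

The default `k=10` mirrors the ten suggestions per difficult word described above. In a setup like this, a human rater would then judge whether any of the returned candidates is an acceptable lay substitute.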
Keywords: Text simplification; distributional semantics; electronic health records; natural language processing; unsupervised machine learning; word2vec
Year: 2018 PMID: 29678056
Source DB: PubMed Journal: Stud Health Technol Inform ISSN: 0926-9630