Markus Kreuzthaler, Stefan Schulz, Andrea Berghold.
Abstract
Controlled clinical trials are usually supported by a front-end data aggregation system, which stores relevant information according to the trial context within a highly structured environment. In contrast to the documentation of clinical trials, daily routine documentation has many characteristics that influence data quality. One such characteristic is the use of non-standardized text, which is an indispensable part of information representation in clinical information systems. Based on a cohort study, we highlight challenges in mining electronic health records, targeting free-text entry fields within semi-structured data sources. Our prototypical information extraction system achieved an F-measure of 0.91 (precision=0.90, recall=0.93) on the training set and an F-measure of 0.90 (precision=0.89, recall=0.92) on the test set. We analyze the obtained results in detail and highlight challenges and future directions for the secondary use of routine data in general.
Keywords: Clinical narrative; Information extraction; Secondary use
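The reported F-measures can be checked against the stated precision and recall. The sketch below assumes the balanced F-measure (F1, the harmonic mean of precision and recall); the abstract does not state the weighting, and the function name `f_measure` and its `beta` parameter are illustrative, not taken from the paper. The inputs are the rounded values from the abstract, so the check only shows agreement to two decimal places.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta=1 (assumed here) gives the harmonic mean of precision and recall."""
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero: define the score as 0.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Rounded precision/recall values as reported in the abstract.
training_f = f_measure(0.90, 0.93)
test_f = f_measure(0.89, 0.92)

print(f"training F-measure: {training_f:.2f}")
print(f"test F-measure: {test_f:.2f}")
```

Under the F1 assumption, both values round to the figures given in the abstract (0.91 and 0.90).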
Year: 2014 PMID: 25451102 DOI: 10.1016/j.jbi.2014.10.010
Source DB: PubMed Journal: J Biomed Inform ISSN: 1532-0464 Impact factor: 6.317