Natalia Viani (1), Cristiana Larizza (2), Valentina Tibollo (3), Carlo Napolitano (3), Silvia G Priori (4), Riccardo Bellazzi (5), Lucia Sacchi (2).
1. Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Via Ferrata 5, 27100, Pavia, PV, Italy. Electronic address: natalia.viani01@universitadipavia.it.
2. Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Via Ferrata 5, 27100, Pavia, PV, Italy.
3. IRCCS Istituti Clinici Scientifici Maugeri, Via Salvatore Maugeri 10, 27100, Pavia, PV, Italy.
4. IRCCS Istituti Clinici Scientifici Maugeri, Via Salvatore Maugeri 10, 27100, Pavia, PV, Italy; Department of Molecular Medicine, University of Pavia, Via Forlanini, 27100, Pavia, PV, Italy.
5. Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Via Ferrata 5, 27100, Pavia, PV, Italy; IRCCS Istituti Clinici Scientifici Maugeri, Via Salvatore Maugeri 10, 27100, Pavia, PV, Italy.
Abstract
OBJECTIVE: In this work, we propose an ontology-driven approach to identify events and their attributes from episodes of care described in medical reports written in Italian. For this language, shared resources for clinical information extraction are not easily accessible.
MATERIALS AND METHODS: The corpus considered in this work includes 5432 non-annotated medical reports belonging to patients with rare arrhythmias. To guide the information extraction process, we built a domain-specific ontology that includes the events and the attributes to be extracted, with related regular expressions. The ontology and the annotation system were constructed on a development set, while the performance was evaluated on an independent test set. As a gold standard, we considered a manually curated hospital database named TRIAD, which stores most of the information written in reports.
RESULTS: The proposed approach performs well on the considered Italian medical corpus, with a percentage of correct annotations above 90% for most of the clinical events considered. We also assessed the possibility of adapting the system to another language (i.e., English), with promising results.
DISCUSSION AND CONCLUSION: Our annotation system relies on a domain ontology to extract and link information in clinical text. We developed an ontology that can be easily enriched and translated, and the system performs well on the considered task. In the future, it could be used to automatically populate the TRIAD database.
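As an illustration of the general idea described in the methods (events and attributes defined in an ontology, each with related regular expressions), the following is a minimal Python sketch. It is not the authors' system: the event names, attribute names, patterns, sentence-level linking rule, and the sample report are all hypothetical, and the example text is in English only for readability. The `percent_correct` helper is likewise an assumed, simplified stand-in for comparison against a gold standard such as TRIAD.

```python
import re

# Hypothetical mini-ontology (illustrative only, not the ontology from the paper):
# each event has a trigger pattern and a set of attribute patterns.
ONTOLOGY = {
    "ECG": {
        "trigger": r"\bECG\b",
        "attributes": {
            "heart_rate_bpm": r"heart rate\s*(?:of\s*)?(\d{2,3})\s*bpm",
            "qtc_ms": r"QTc\s*(?:of\s*)?(\d{3})\s*ms",
        },
    },
    "Holter": {
        "trigger": r"\bHolter\b",
        "attributes": {
            "duration_h": r"(\d{1,2})\s*-?\s*hour",
        },
    },
}


def annotate(text):
    """Return one annotation per (sentence, event) pair whose trigger matches.

    Assumption: an attribute is linked to an event when both occur in the
    same sentence. This is a simplification chosen for the sketch.
    """
    annotations = []
    for sentence in re.split(r"(?<=[.;])\s+", text):
        for event, spec in ONTOLOGY.items():
            if not re.search(spec["trigger"], sentence, re.IGNORECASE):
                continue
            attrs = {}
            for name, pattern in spec["attributes"].items():
                match = re.search(pattern, sentence, re.IGNORECASE)
                if match:
                    attrs[name] = int(match.group(1))
            annotations.append({"event": event, "attributes": attrs})
    return annotations


def percent_correct(extracted, gold):
    """Percentage of gold records reproduced exactly, compared position by position."""
    if not gold:
        return 0.0
    correct = sum(1 for e, g in zip(extracted, gold) if e == g)
    return 100.0 * correct / len(gold)


# Hypothetical report text and gold-standard records.
report = (
    "ECG: sinus rhythm, heart rate 62 bpm, QTc 450 ms. "
    "24-hour Holter showed no arrhythmias."
)
gold = [
    {"event": "ECG", "attributes": {"heart_rate_bpm": 62, "qtc_ms": 450}},
    {"event": "Holter", "attributes": {"duration_h": 24}},
]

if __name__ == "__main__":
    for annotation in annotate(report):
        print(annotation)
    print(percent_correct(annotate(report), gold))
```

Keeping the patterns in a data structure separate from the matching logic mirrors the abstract's point that the ontology can be enriched or translated without changing the annotation code; in this sketch, adapting to another language would mean replacing the pattern strings only.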
Authors: Seyedmostafa Sheikhalishahi; Riccardo Miotto; Joel T Dudley; Alberto Lavelli; Fabio Rinaldi; Venet Osmani Journal: JMIR Med Inform Date: 2019-04-27
Authors: Pilar López-Úbeda; Alexandra Pomares-Quimbaya; Manuel Carlos Díaz-Galiano; Stefan Schulz Journal: BMC Med Inform Decis Mak Date: 2021-05-04 Impact factor: 2.796