Birger Haarbrandt (1), Erik Tute (2), Michael Marschollek (2). (1) Peter L. Reichertz Institute for Medical Informatics, University of Braunschweig - Institute of Technology and Hanover Medical School, Hanover, Germany. Electronic address: birger.haarbrandt@plri.de. (2) Peter L. Reichertz Institute for Medical Informatics, University of Braunschweig - Institute of Technology and Hanover Medical School, Hanover, Germany.
Abstract
BACKGROUND: Detailed Clinical Model (DCM) approaches have recently seen wider adoption. More specifically, openEHR-based application systems are now used in production in several countries, serving diverse fields of application such as health information exchange, clinical registries and electronic medical record systems. However, approaches to efficiently provide openEHR data to researchers for secondary use have not yet been investigated or established. METHODS: We developed an approach to automatically load openEHR data instances into the open source clinical data warehouse i2b2. We evaluated the query capabilities and performance of this approach in the context of the Hanover Medical School Translational Research Framework (HaMSTR), an openEHR-based data repository. RESULTS: We achieved automated creation of i2b2 ontologies from archetypes and templates, and integrated openEHR data instances from 903 patients of a paediatric intensive care unit. On average, it took approximately 2527 s to create 2,311,624 facts from 141,917 XML documents. Using the imported data, we conducted sample queries to compare the performance with two openEHR systems and to investigate whether this representation of data is suitable for cohort identification and record-level data extraction. DISCUSSION: We found the automated population of an i2b2 clinical data warehouse to be a feasible approach to making openEHR data instances available for secondary use. Such an approach can facilitate timely provision of clinical data to researchers. It complements analytics based on the Archetype Query Language by allowing both legacy clinical data sources and openEHR data instances to be queried at the same time and by providing an easy-to-use query interface. However, because the data models differ in expressiveness, not all semantics could be preserved during the ETL process.
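The general idea of turning hierarchical XML data instances into flat warehouse facts can be illustrated with a minimal sketch. It is not the authors' ETL implementation: the sample XML is a made-up fragment rather than canonical openEHR XML, the `Fact` type keeps only a handful of columns modelled on the i2b2 `observation_fact` table, and the path-based `concept_cd` is a hypothetical stand-in for the ontology codes that the described approach derives from archetypes and templates.

```python
import xml.etree.ElementTree as ET
from typing import List, NamedTuple, Optional


class Fact(NamedTuple):
    """Simplified subset of i2b2 observation_fact columns (assumption)."""
    patient_num: int
    concept_cd: str            # hypothetical code built from the element path
    valtype_cd: str            # "N" for numeric values, "T" for text values
    nval_num: Optional[float]
    tval_char: Optional[str]


def flatten(xml_text: str, patient_num: int) -> List[Fact]:
    """Emit one fact per non-empty leaf element of the XML document."""
    facts: List[Fact] = []

    def walk(node: ET.Element, path: str) -> None:
        here = path + "\\" + node.tag
        children = list(node)
        if not children:
            text = (node.text or "").strip()
            if text:
                try:
                    # Values that parse as numbers become numeric facts.
                    facts.append(Fact(patient_num, here, "N", float(text), None))
                except ValueError:
                    # Everything else is stored as a text fact.
                    facts.append(Fact(patient_num, here, "T", None, text))
        for child in children:
            walk(child, here)

    walk(ET.fromstring(xml_text), "")
    return facts


# Made-up sample document, for illustration only.
SAMPLE = (
    "<body_temperature>"
    "<magnitude>38.5</magnitude><units>Cel</units>"
    "</body_temperature>"
)
facts = flatten(SAMPLE, patient_num=1)
for fact in facts:
    print(fact)
```

A flattening of this kind also hints at why some semantics are lost in the ETL process: nesting, context and constraints carried by the source model collapse into a single path string and a value.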
Authors: Christian Maier; Jan Christoph; Danilo Schmidt; Thomas Ganslandt; H U Prokosch; Stefan Kraus; Martin Sedlmayr Journal: J Healthc Eng Date: 2019-01-17 Impact factor: 2.682
Authors: Lorenz Rosenau; Raphael W Majeed; Josef Ingenerf; Alexander Kiel; Björn Kroll; Thomas Köhler; Hans-Ulrich Prokosch; Julian Gruendner Journal: JMIR Med Inform Date: 2022-04-27
Authors: Antje Wulff; Marcel Mast; Marcus Hassler; Sara Montag; Michael Marschollek; Thomas Jack Journal: Methods Inf Med Date: 2020-10-14 Impact factor: 2.176