The EHR-ARCHE project: satisfying clinical information needs in a Shared Electronic Health Record system based on IHE XDS and Archetypes.

Georg Duftschmid, Christoph Rinner, Michael Kohler, Gudrun Huebner-Bloder, Samrend Saboor, Elske Ammenwerth.

Abstract

PURPOSE: While contributing to an improved continuity of care, Shared Electronic Health Record (EHR) systems may also lead to information overload of healthcare providers. Document-oriented architectures, such as the commonly employed IHE XDS profile, which only support information retrieval at the level of documents, are particularly susceptible to this problem. The objective of the EHR-ARCHE project was to develop a methodology and a prototype to efficiently satisfy healthcare providers' information needs when accessing a patient's Shared EHR during a treatment situation. We especially aimed to investigate whether this objective can be reached by integrating EHR Archetypes into an IHE XDS environment.
METHODS: Using methodical triangulation, we first analysed the information needs of healthcare providers, focusing on the treatment of diabetes patients as an exemplary application domain. We then designed ISO/EN 13606 Archetypes covering the identified information needs. To support a content-based search for fine-grained information items within EHR documents, we extended the IHE XDS environment with two additional actors. Finally, we conducted a formative and summative evaluation of our approach within a controlled study.
RESULTS: We identified 446 frequently needed diabetes-specific information items, representing typical information needs of healthcare providers. We then created 128 Archetypes and 120 EHR documents for two fictitious patients. All seven diabetes experts who evaluated our approach preferred the content-based search to a conventional XDS search. The success rate of finding relevant information was higher for the content-based search (100% versus 80%), and the content-based search was also more time-efficient (8-14 min versus 20 min or more).
CONCLUSIONS: Our results show that for an efficient satisfaction of health care providers' information needs, a content-based search that rests upon the integration of Archetypes into an IHE XDS-based Shared EHR system is superior to a conventional metadata-based XDS search.

Keywords: Electronic health records; Medical records systems, computerized; Models, theoretical; Reference standards

Year: 2013 PMID: 23999002 PMCID: PMC3851741 DOI: 10.1016/j.ijmedinf.2013.08.002

Source DB: PubMed Journal: Int J Med Inform ISSN: 1386-5056 Impact factor: 4.046

Introduction

One of the key benefits of Shared Electronic Health Record (EHR) systems [1] is their ability to integrate a patient's health data across the borders of different healthcare institutions. They can thus substantially improve continuity of care [2]. As Shared EHRs can be expected to contain many more documents than conventional intra-institutional EHRs, there are, however, also concerns that Shared EHR systems may promote information overload [3,4]. Information overload occurs when “information received becomes a hindrance rather than a help, even though the information is potentially useful” [5]. This may particularly be the case for patients with chronic diseases or multimorbid conditions, which will commonly lead to voluminous EHRs containing a plethora of documents. One obvious way to avoid information overload is to reduce the number of documents that have to be searched to satisfy a particular information need. The problem-oriented medical record suggested by Weed [6] follows this strategy. However, even if a Shared EHR system organizes a patient's documents based on medical problems (the nationwide Austrian ELGA system [7] for example is not problem-oriented), the set of documents may again become large for chronic diseases. Information overload is further fostered by document-oriented Shared EHR systems, if they do not support information retrieval mechanisms at finer granularities than the document level. The Integrating the Healthcare Enterprise (IHE) Cross-Enterprise Document Sharing (XDS) profile [8] is an open standards-based architecture specification for document-oriented Shared EHR systems. It is commonly employed for the implementation of regional Shared EHR systems, e.g. [9-13]. It is also a fundamental component of several upcoming national Shared EHR systems, such as the French DMP system [14] and the Austrian ELGA system [7]. A geographic overview of current IHE XDS implementations around the world can be found at [15].
The standard way of retrieving information in IHE XDS relies on queries that exclusively refer to document metadata (for example, patient ID, provider ID, document type, date of document creation) and always return complete documents as the smallest unit of information. More specific queries that refer to fine-grained information items (e.g., insulin therapy of the last 12 months, or course of HbA1c values) are not supported; that is, the user has to know which documents to search for when looking for a particular information item. Further, in current implementations of the IHE XDS search component, the user typically has to manually browse the returned documents to find the relevant data for the information items. This may become seriously laborious, e.g. if data for a time series has to be located or if the desired information items are optional and only sparsely recorded on the returned documents. Searching Shared EHRs purely based on document metadata may thus lead to information overload by forcing users to (a) access a potentially large number of EHR documents (where due to the manual selection relevant documents may be missing and irrelevant ones may be present), and (b) find and filter the relevant data within the documents to satisfy their particular information need. Our basic assumption is that users of Shared EHR systems need extended search options to satisfy their information needs and to avoid information overload. Instead of relying on document metadata only, these extended search options should allow searching for specific content within structured EHR documents. ISO/EN 13606 [16], Health Level Seven (HL7) Clinical Document Architecture (CDA) [17,18], and openEHR [19] are currently the most important open specifications for modelling the contents of structured EHRs. All three specifications support the so-called “dual-model approach”, which uses separate information and knowledge layers to model EHR contents.
The information layer is based on a Reference Model. It defines a set of generic classes from which any EHR content may be constructed. The knowledge layer is based on a set of models, which specify how each individual EHR content is composed from the generic classes of the Reference Model. These knowledge models can be represented in free text, such as in CDA implementation guides (e.g., the CDA Continuity of Care Document [20]), or in computer-processable form, such as in Archetypes [21] or HL7 Templates [22]. In the following we focus on Archetypes as the modelling technique of the knowledge layer. According to an analysis of Archetype-related literature, Archetypes represent a suitable solution for EHR storage and interoperability [23]. For example, within two investigations by the National Health Service (NHS) in England, nearly 400 different Archetypes were developed in an in depth evaluation of this concept. As a result, Archetypes were found well suited for defining detailed clinical models for reuse across NHS IT projects [24]. However, it is unclear whether Archetype-based EHR architectures can actually help to satisfy the information needs of Shared EHR systems’ users via extended, content-based search options. Previous work on optimizing information retrieval in IHE XDS based Shared EHR systems has focused on semantic EHR document indexing techniques [25-27]. The semantics of EHR document contents were derived from coded data, from classes of the Reference Model, or via natural language processing – Archetypes have not yet been employed for this purpose. It is further unclear what information needs the users of Shared EHR systems in fact have. Prior work in this context primarily focused on what kind of information is frequently accessed. As an example, fourteen physicians were observed and it was found that most often medications and prescriptions are accessed, followed by vital signs and requests to lab orders [28]. 
In another observation, Zeng and Cimino found that physicians primarily accessed results of specific tests, past diagnoses and past hospitalizations and tests [29]. Hripcsak et al. found that in an emergency department most frequently clinical lab tests, radiology reports and clinician notes of earlier contacts were reviewed [30]. However, none of these studies assessed information needs; they assessed information access. They thereby overlook information that is needed but unavailable. To address the above-mentioned problems, we defined the following objectives for our EHR-ARCHE project [31]: (1) Identify healthcare providers’ information needs when accessing Shared EHRs. (2) Develop a concept to satisfy these information needs within IHE XDS- and Archetype-based, standardized Shared EHR systems. (3) Implement a prototype and evaluate the concept. In this paper we present an overview of the EHR-ARCHE project and describe how we realized these three objectives.

Methods

Information needs analysis

As exemplary medical application domain we chose the treatment of diabetes patients, which has been shown to be fragmented between healthcare providers and would thus benefit from Shared EHR systems [32]. The goal was to find out which medical information is needed by diabetes experts in which clinical situation when treating diabetes patients. To achieve this goal, we aimed to integrate different perspectives of the investigated objects using methodical triangulation [33]: Information was collected by means of (a) analysing diabetes-specific clinical guidelines, (b) performing interviews with diabetes specialists, (c) observing clinical encounters of diabetes patients, and (d) analysing the contents of diabetes patients’ EHRs. First, we examined five international evidence-based diabetes guidelines for clinical diagnostics and medical treatment of diabetes [34-38] to identify relevant information items. We then performed semi-standardized expert interviews with six internists from the outpatient clinics of three Austrian hospitals and from a private practice specialized in diabetes. To validate information from the interviews and potentially gain further insights, unstructured observations of clinical encounters were conducted. All 22 observed encounters of diabetes patients took place at the University Hospital of Innsbruck's diabetes outpatient clinic of internal medicine. Finally, we analysed the data elements of diabetes patients’ EHRs in three diabetes outpatient clinics of internal medicine in three Austrian hospitals. All data from the guideline analysis, interviews, observations and EHRs were analysed by inductive qualitative content analysis to identify clinical situations and related information needs. For more details see Ref. [39].

Preparation of Archetypes and Electronic Health Record documents

Aiming for a fully standardized solution, we chose ISO/EN 13606 as our model for structuring EHR documents and for specifying Archetypes. As a prerequisite for structured ISO/EN 13606 EHR documents containing those information items that were identified in the information needs analysis, we mapped the information items to Archetypes. Each Archetype consists of a hierarchy of nodes, which reflect the hierarchical structure of a set of related information items. In our ambition to reuse existing Archetypes, we had to rely on openEHR Archetype libraries [40-42], as libraries of 13606-specific Archetypes were not publicly available. Instead of specializing the Archetypes, we cloned those parts of the existing Archetypes that covered our information items. As we had to transform the original openEHR Archetypes to the ISO/EN 13606 Reference Model, we would not have been able to keep a regular inheritance relation to the original Archetypes anyway. Using the linkEHR Archetype editor [43], we designed our Archetypes in an iterative process that was based on an intensive involvement of medical domain experts [44]. The Archetypes were set up in German language and also translated to English. To achieve a loose coupling between the information items and the corresponding Archetype nodes, we applied a two-stage mapping. First we coded our information items (e.g., information item “therapy GDM” was associated with code “AAArche::105”) and then bound the Archetype nodes to these codes in the Archetypes’ term-binding sections (e.g., Archetype node “at0009” representing item “therapy GDM” was bound to code “AAArche::105”). The manual selection of suitable codes from a standardized terminology such as SNOMED-CT to represent a set of information items is complex and time-consuming [45]. As this task was also not in the main focus of our project, we decided to use a self-defined, proprietary terminology as a proxy for a standardized terminology instead.
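The two-stage mapping described above can be illustrated with a minimal sketch. Only the item label “therapy GDM”, the code “AAArche::105” and the node “at0009” are taken from the text; the Archetype identifier and the dictionary layout are assumptions made for illustration and do not reproduce the project's actual data structures.

```python
# Stage 1: information item label -> code of the self-defined terminology.
ITEM_CODES = {
    "therapy GDM": "AAArche::105",
}

# Stage 2: term-binding section of each Archetype: code -> Archetype node id.
# The Archetype identifier below is hypothetical.
TERM_BINDINGS = {
    "CEN-EN13606-ENTRY.gdm_therapy.v1": {"AAArche::105": "at0009"},
}


def nodes_for_item(label):
    """Resolve an information item to all (archetype, node) pairs bound to its code."""
    code = ITEM_CODES[label]
    return [
        (archetype_id, bindings[code])
        for archetype_id, bindings in TERM_BINDINGS.items()
        if code in bindings
    ]
```

Because the information items and the Archetype nodes only meet via the code, the same lookup would work unchanged if the proprietary codes were later replaced by codes from a standardized terminology.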

Content-based information retrieval

As the starting point of our concept for information retrieval, we assumed a Shared EHR system based on the IHE XDS architecture. Our main goal was to enable a content-based search for fine-grained information items that may be present within EHR documents by integrating Archetypes into an IHE XDS environment. Our approach for content-based search consisted of two steps. First we retrieved all those EHR documents from the XDS document repositories, which may contain data for the desired information items. We did this by using those document types, which contained the Archetype nodes that were mapped to the information items (compare Section 2.2), as filter criteria in a standard XDS document query. In the second step, the data for the desired information items were extracted from the before-mentioned set of documents via the paths of the corresponding Archetype nodes. The data were visually prepared and returned to the user, together with links to the source documents from which the data were extracted.
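The two steps can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's implementation: documents are modelled as nested dictionaries instead of ISO/EN 13606 extracts, the XDS document query is replaced by an in-memory filter, and the mapping tables, document types and paths are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    doc_type: str
    content: dict  # nested dicts standing in for a structured EHR document


# Hypothetical mappings: item code -> document types, item code -> node paths.
DOC_TYPES_FOR_CODE = {"AAArche::105": {"discharge_letter"}}
PATHS_FOR_CODE = {"AAArche::105": [("therapy", "gdm")]}


def candidate_documents(documents, codes):
    """Step 1: keep only documents whose type may contain the requested items."""
    wanted = set().union(*(DOC_TYPES_FOR_CODE.get(c, set()) for c in codes))
    return [d for d in documents if d.doc_type in wanted]


def extract_items(documents, codes):
    """Step 2: follow each node path and collect the values with a source link."""
    results = []
    for doc in documents:
        for code in codes:
            for path in PATHS_FOR_CODE.get(code, []):
                node = doc.content
                for key in path:
                    node = node.get(key) if isinstance(node, dict) else None
                    if node is None:
                        break
                if node is not None:
                    results.append({"code": code, "value": node, "source": doc.doc_id})
    return results


def content_based_search(documents, codes):
    return extract_items(candidate_documents(documents, codes), codes)
```

Documents of a matching type that do not actually contain the item simply contribute nothing to the result, which mirrors the notion of "potentially relevant" documents in the first step.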

Evaluation of content-based search

Within the evaluation we aimed to determine whether information needs could be better addressed and satisfied faster by means of a content-based search in comparison to a conventional XDS search based on document metadata. Further, we aimed to analyse the users’ satisfaction with the user interface of the content-based search. Combining observations, interviews, and a survey [33], we conducted a formative and summative evaluation within a controlled study. For the evaluation, we prepared full EHRs for two test patients. Each EHR contained around 60 documents. Each document was created twice, using different formats but identical contents: as an unstructured PDF and as an Archetype-based ISO/EN 13606 EHR extract. As our case scenario we prepared for both patients a routine check, in which the physician had to answer five typical questions (e.g., overview on the medication of the last six months, medication intolerance) by accessing the patient's EHR. For each question we prepared a “gold standard” answer based on the available EHR data. To answer the questions concerning the two patients, 30 and 27 different information items, respectively, had to be located. Seven diabetes specialists participated in the evaluation. None of them was involved in the design of our system to avoid a potential evaluation bias. Each participant was confronted with both patient case scenarios and asked to answer the corresponding questions. For one patient case, the participant was asked to use the conventional XDS search based on document metadata; for the other patient case, the content-based search was to be used. The order of presented patient cases was varied between participants. The available time was limited to 20 min, reflecting a typical duration of a patient encounter. All participants were observed while working on the patient cases, and their success in answering the clinical questions was documented.
After finishing both patient cases, each participant answered a short written survey on their impression of the content-based search, its usability and user friendliness. In addition, the usefulness and feasibility of the content-based search as well as ideas for improvement were collected in the course of a semi-structured expert interview. The survey results and the answers to the clinical questions were analysed using quantitative descriptive data analysis. Success rates of finding the answers to the clinical questions in both (study and control) groups were compared using the Wilcoxon–Mann–Whitney U test with a significance level (alpha) of 5%. The expert interviews and field notes from the observations were analysed using inductive summative content analysis. For more details see Ref. [46].
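For readers unfamiliar with the test, the U statistic of a Wilcoxon–Mann–Whitney comparison can be computed from pooled ranks as in the following minimal sketch. It is a generic textbook formulation, not the analysis script used in the study, and any values passed to it here are placeholders rather than study data. Deriving a p-value additionally requires the exact null distribution or a normal approximation, which is not shown.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples, using midranks for ties."""
    # Pool both samples and remember the group of each value (0 = x, 1 = y).
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and all receive their average.
        midrank = (i + 1 + j) / 2
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    return u_x, n1 * n2 - u_x
```

The smaller of the two returned values is the one usually compared against the critical value for the chosen alpha.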

Results

We identified 446 information items (e.g., type and onset of diabetes, weight-height status) that are frequently needed in the treatment of diabetes patients. The full list of information items can be found on the project homepage [31]. Further, six typical time windows (last 3/6/12/36/60 months, all available data) were identified that are commonly used by physicians for information retrieval during the encounter of a diabetes patient. In addition, ten clinical situations (e.g., initial clinical interview, routine check) were identified and each of them was associated with a set of information items typically needed. Finally, 68 particular information items were identified, which are frequently accessed independently of a particular clinical situation (e.g., glucose status, certain types of pathological lab data). More details on the information needs analysis can be found in Ref. [39]. For 17% of our information items we found corresponding openEHR Archetypes. In total we created 128 ISO/EN 13606 Archetypes that represent our 446 information items. Based on real but anonymized patient data, 120 test EHR documents associated with two fictitious diabetes patients were manually synthesised by a medical domain expert. For each document, two versions with identical contents were created: an unstructured PDF document and an Archetype-based ISO/EN 13606 document. For the creation of the latter we developed a tool that automatically generates entry forms from ISO/EN 13606 Archetypes [47]. The tool was embedded into our IHE XDS environment and takes the role of a Document Source actor (compare Section 3.3). That is, it allows data recorded by means of a generated entry form to be submitted as Archetype-based ISO/EN 13606 EHR documents to an IHE XDS Document Repository and to be registered at the Document Registry.
The ISO/EN 13606 and PDF versions of each document were submitted to the same Document Repository and registered with identical document metadata, except for the unique document identifiers and the attributes formatCode and mimeType.

Overall architectural design

Fig. 1 depicts an overview of the EHR-ARCHE architecture and highlights our project-specific components, which extend the standard commercial IHE XDS setting [48] that we used as basis. All project-specific components are pure extensions of IHE XDS, i.e. they do not modify any requirements of the profile. Thus, conventional IHE XDS Document Source and Document Consumer actors could be employed within the EHR-ARCHE system in parallel to our project-specific components. They would keep their standard functionality, but would obviously not support our content-based search.

Fig. 1

Overall technical architecture of the EHR-ARCHE project, based on an IHE XDS infrastructure. White: Standard XDS actors. Blue: New components. Yellow: Adaptors. Red: New communication relations between actors. ITI-XX: Standard XDS transactions (for details see Ref. [49]). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)

The Document Registry actor

The Document Registry is a standard actor of the IHE XDS profile [8]. According to this profile, the Document Registry is responsible for managing a standardized set of metadata for each registered medical document. Amongst others, this set of metadata contains a link to the actual document that is stored by the Document Repository Actor. When queried by a Document Consumer (or in our solution by the Document Crawler), the Document Registry returns this link and also the metadata for those documents that match the query criteria.

The Document Repository actor

The main task of the standard actor Document Repository is to persistently and securely store medical documents. In case of new documents, the Document Repository contacts the Document Registry to register these new documents appropriately. In this course, the Repository transmits a set of standardized metadata describing the documents and links with which the documents can be retrieved.

The Document Consumer actor

We extended the standard actor Document Consumer to support content-related queries in addition to its standard queries related to document metadata. After selecting the desired patient, the user can choose whether to apply a conventional search purely based on document metadata or a content-based search that refers to individual information items within documents. For a conventional search, the user may specify conditions for the creation date and type of document as well as for the medical discipline of the document's author (compare Fig. 2).

Fig. 2

Conventional IHE document search based on document metadata within the Document Consumer. Conditions may be defined for the creation date (“Zeitliche Angaben”) and type of document (“Dokumenteninformation”) as well as for the medical discipline of the document's author (“Medizinische Organisation”).

Based on these conditions a standard ITI-18 query [49] is sent to the Document Registry, which returns a list of corresponding documents. The titles and creation dates of these documents are presented to the user in the Document Consumer. Each document may then be opened on demand, after downloading it from the source Document Repository via a standard ITI-17 [49] query. For a content-based search, the user may specify ad-hoc queries or execute pre-defined queries (compare Fig. 3). Both kinds of queries refer to codes representing those information items which were identified during the information needs analysis (compare Section 3.1). For this purpose, the Document Consumer loads the labels of all information items as well as the associated codes during its start-up sequence. These labels are then offered to the user to create ad-hoc queries from them. Ad-hoc queries as well as pre-defined queries allow the search to be limited to selected time periods.
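Limiting a search to one of the typical time windows amounts to deriving a lower bound for the document creation time. The sketch below shows one way to do this; the function names are hypothetical, and the assumption that the bound is passed as a timestamp of the form YYYYMMDDhhmmss (the HL7 DTM style used for XDS creation times) is ours rather than a statement about the prototype.

```python
from datetime import datetime, timedelta

# Time windows in months as identified in the information needs analysis;
# "all available data" corresponds to using no lower bound at all.
TIME_WINDOWS_MONTHS = (3, 6, 12, 36, 60)


def months_back(now, months):
    """Return the timestamp `months` calendar months before `now`."""
    total = now.year * 12 + (now.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Clamp the day to the last day of the target month (e.g. 31 Aug -> 28 Feb).
    if month == 12:
        first_of_next = datetime(year + 1, 1, 1)
    else:
        first_of_next = datetime(year, month + 1, 1)
    last_day = (first_of_next - timedelta(days=1)).day
    return now.replace(year=year, month=month, day=min(now.day, last_day))


def to_dtm(timestamp):
    """Format a timestamp as YYYYMMDDhhmmss."""
    return timestamp.strftime("%Y%m%d%H%M%S")
```

For example, `to_dtm(months_back(datetime.now(), 6))` would yield the lower bound for a "last six months" search.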

Fig. 3

Content-based search within the Document Consumer. The user may create (a) ad-hoc queries referring to individual information items (here: HbA1c), or (b) execute pre-defined queries that search for a combination of information items (here: Erstgespräch = First Encounter). Both kinds of searches further allow constraining the timeframe in which the documents were created (here: letzten 6 Monate = last six months).

For our study we decided to provide only simple ad-hoc queries, which are limited to single information items. However, our query language also supports boolean combinations of multiple information items and value constraints. Therefore, a corresponding extension of our ad-hoc query functionality is only a matter of providing a suitable user interface for this purpose. Pre-defined queries are provided to simplify the retrieval of sets of information items that are typically needed in combination. A pre-defined query may refer to any number of information items linked with boolean operators and allows value constraints for those information items. We prepared ten pre-defined queries covering all clinical situations that were identified in the information needs analysis. Each of these pre-defined queries retrieves a group of information items that are relevant in a particular clinical situation and thus enables a context-specific satisfaction of information needs. A further 19 pre-defined queries were prepared to cover the frequently accessed information items that were identified in the information needs analysis. Our pre-defined queries contained 23.3 information items on average (minimum 1, maximum 96). The content-related queries cannot be answered by a standard XDS Document Registry. Instead, they are sent to the newly developed Document Crawler actor. This actor analyses the content-related query, coordinates all necessary communication with the standard XDS actors and sends the query results back to the Document Consumer. The latter displays the results in a tabular structure (compare Fig. 4).
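A query that combines information items with boolean operators and value constraints can be represented as a small expression tree. The following sketch is an assumption-laden illustration of that idea, not the project's query language: the class names, the supported comparison operators and the flat record format are all hypothetical.

```python
import operator
from dataclasses import dataclass
from typing import Any, Optional

# Comparison operators supported by this sketch.
OPS = {"=": operator.eq, ">": operator.gt, "<": operator.lt}


@dataclass(frozen=True)
class Item:
    """One information item, optionally with a value constraint."""
    code: str
    op: Optional[str] = None
    value: Any = None


@dataclass(frozen=True)
class And:
    left: Any
    right: Any


@dataclass(frozen=True)
class Or:
    left: Any
    right: Any


def matches(query, record):
    """Evaluate a query tree against one record (dict: item code -> value)."""
    if isinstance(query, Item):
        if query.code not in record:
            return False
        if query.op is None:
            return True  # item only has to be present
        return OPS[query.op](record[query.code], query.value)
    if isinstance(query, And):
        return matches(query.left, record) and matches(query.right, record)
    if isinstance(query, Or):
        return matches(query.left, record) or matches(query.right, record)
    raise TypeError(f"unsupported query node: {query!r}")
```

A pre-defined query would then simply be a stored tree of this kind, and a single-item ad-hoc query a tree consisting of one `Item`.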

Fig. 4

Result presentation of pre-defined query “glucose status (pathologic data)”: The results are organized according to the different information items combined within this pre-defined query. (a) Each information item and its associated constraints as defined in the query; (b) The attributes of the concerned Archetype nodes (here: result, reference value, interpretation and comment of fasting and postprandial blood glucose measurements, respectively); (c) The corresponding data (H = elevated). For each value the corresponding source document can be viewed by clicking on the document icon in the bottom row (d) (here: two discharge letters and four lab reports from 2010 and 2011, respectively).

For each individual value the corresponding source document and the latter's creation date (both are provided within the IHE XDS document metadata) are presented as contextual information. For queries that include two or more information items linked with a boolean AND operator, the complete fragment of an EHR document that unites all concerned information items is shown in the result set. This document fragment shows all data starting with the common “ancestor” item of the concerned information items within the hierarchical structure of an EHR document. As an example, the query “analysis = ‘BG postprandial’ AND result > 130” in Fig. 4 returns the data of complete laboratory measurements, which represent the ancestor item of the items “analysis” and “result”. If the result includes data pertaining to a person other than the patient (the ISO/EN 13606 Reference Model provides the attribute subject_of_information for the documentation of “external” data such as the family history), these data are shown in grey. Further contextual data, such as the standard IHE XDS document metadata author, healthcare facility, and service start/stop time, are available and could be shown via pop-up windows, but this has not yet been implemented. The full context of any result data can of course be accessed by opening the underlying source document via the corresponding link provided for each value (compare Fig. 4).
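Finding the common “ancestor” of several information items can be reduced to taking the longest common prefix of their node paths. The sketch below illustrates this with slash-separated paths; the path syntax and the example segment names are assumptions and do not reproduce actual ISO/EN 13606 Archetype paths.

```python
def common_ancestor(paths):
    """Return the longest common path prefix of slash-separated node paths."""
    split_paths = [p.strip("/").split("/") for p in paths]
    prefix = []
    # Walk all paths segment by segment until they diverge.
    for segments in zip(*split_paths):
        if len(set(segments)) != 1:
            break
        prefix.append(segments[0])
    return "/" + "/".join(prefix)
```

With hypothetical paths such as "/lab_report/measurement/analysis" and "/lab_report/measurement/result", the function returns "/lab_report/measurement", i.e. the complete measurement would be shown. Note that this operates on paths only; when a document holds several measurements, the items additionally have to be matched within the same measurement instance.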

The Document Crawler actor

The Document Crawler is a non-standard actor, which we added to the IHE XDS architecture to enable content-related queries. After receiving a content-related query that refers to one or more information items via the corresponding codes from the Document Consumer, its first task is to retrieve all potentially relevant documents. For this purpose the Document Crawler sends the codes of the desired information items to the Archetype Repository and receives in return a list of those document types, which may contain data for these information items. Mapping the document types to the XDS metadata classCode and practiceSettingCode and further supplying the XDS metadata author, creationTime, and patientId from the content-related query, a standard XDS ITI-18 query [49] is composed and sent to the Document Registry. The documents themselves are then downloaded via a subsequent ITI-17 query [49] from the Document Repositories. The second task is to extract all data for the desired information items from the set of potentially relevant documents. For this purpose, the paths of all Archetype nodes that possess a binding to one of the content-related query's information items are requested from the Archetype Repository. These paths are translated to an XQuery, which is used to extract the desired data from the documents. All extracted data and the links to the corresponding source documents are compiled into a common result set, which is finally returned to the Document Consumer for display. More details on the Document Crawler can be found in Ref. [50].
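The extraction step can be sketched with path expressions evaluated against XML documents. The example below deliberately substitutes the limited XPath subset of Python's standard `xml.etree.ElementTree` module for the XQuery used by the Document Crawler, and the XML layout, attribute names and document links are hypothetical; real ISO/EN 13606 extracts are structured differently.

```python
import xml.etree.ElementTree as ET


def extract(documents, node_paths):
    """Collect the values found under each node path, with a link to the source.

    documents:  dict mapping a document link to its XML text
    node_paths: path expressions in ElementTree's XPath subset
    """
    result_set = []
    for link, xml_text in documents.items():
        root = ET.fromstring(xml_text)
        for path in node_paths:
            for element in root.findall(path):
                result_set.append(
                    {
                        "path": path,
                        "value": (element.text or "").strip(),
                        "source": link,
                    }
                )
    return result_set
```

Each entry of the returned list carries the link of its source document, so that a consumer can offer the underlying document alongside every extracted value.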

The Archetype Repository actor

The Archetype Repository is a non-standard XDS actor, which is responsible for maintaining the Archetype-based structural specifications of EHR documents, as well as the mappings between information items and Archetype nodes and between Archetypes and document types. It supplies the Document Consumer with a list of information items, which may be referred to in a content-related query, and their associated codes. Further, when receiving from the Document Crawler the codes of a set of information items, which are part of a content-related query, it returns those document types, which contain one or more of these information items according to the Archetype-based structure of the EHR documents. Finally, it supports the Document Crawler in extracting the data for the desired information items from the EHR documents. Upon receiving the codes of the information items addressed in a content-related query from the Document Crawler, the Archetype Repository returns the paths of all Archetype nodes that are bound to these information item codes (compare Section 3.3.4). As an information item may appear on more than one document type, multiple Archetype node paths may be returned to the Document Crawler for each information item.

Evaluation of the content-based search

The study was conducted as planned. Seven diabetes specialists, both from inpatient and outpatient settings in Tyrol and Vienna, volunteered to participate. As described before, each participant was confronted with two patient cases. When using the content-based search, all participants were able to correctly answer all questions within the given time limit of 20 min. For the first patient case they needed between 10 and 14 min, for the second patient case between 8 and 12 min. When using the conventional search, participants found only around 80% of the expected information items within the time limit. Only one participant was able to find the correct answers to all clinical questions in the given time frame. The others had to stop the search after 20 min or aborted the search in the face of the huge number of documents. In the course of the observations, several common search patterns were revealed. As an example, in the conventional search all participants focused on the three or four most recent documents first, as they would in their daily routine, and in most cases restricted their search for older documents to certain clinical domains only. Having to open and close each individual PDF document was found cumbersome by most participants. In the content-based search, all participants combined pre-defined queries with ad-hoc queries. Several participants were irritated if information items were identically contained in more than one document and thus also repeatedly appeared in the result table. In the interviews, 23 suggestions for improving the user interface were collected. In the course of the survey, all participants found the content-based search simple and self-explanatory, as well as more intuitive, faster and better suited to manage information overload than the conventional search. They all found the pre-defined queries useful to get an overview of a clinical situation and supported the development of software tools to search in clinical documents.
In the course of the interviews, all experts stressed the usefulness of the pre-defined queries, but four of them desired an option for individually adapting these queries. Among other points, four experts reported being satisfied with the response time of the content-based search, and three experts stated that the completeness and correctness of the presented data were legally important. Concerning the conventional search, two experts complained that it was difficult to find needed information due to the huge number of documents, and three experts stated that differently structured documents from different institutions complicated information retrieval. A detailed presentation of the results of the evaluation can be found in Ref. [46].

Discussion

Related work

Advancing information retrieval methodology in Shared EHR systems to avoid information overload has been, and still is, the goal of several research projects. Three approaches have been presented that aim to optimize information retrieval in a Shared EHR system framework based on IHE XDS and HL7 CDA [25-27]. They all use semantic EHR document indexing techniques to extend the standard search options based on document metadata. Liu et al. use a pre-processing module that scans the CDA documents for coded entries and transforms them into Resource Description Framework (RDF) triples [25]. Pruski and Wisniewski apply a similar module based on natural language processing methods to derive index information from a CDA document and store this information as UMLS codes [27]. Liu et al. build the index by extracting the contents of several CDA-specific document components, such as the contents of the CDA classes Observation, Procedure, and SubstanceAdministration [26]. During the processing of user search queries, the pre-collected index information is then considered in addition to the XDS metadata to retrieve the most relevant documents for the query. Two of these three approaches only return complete documents to the user as the result of a query [25,27]. Liu et al. support the search and display of fine-grained information items, including links to the underlying source documents [26]. Further, a summary view with pre-defined queries for several basic health information items is provided, although it is not explained how these information items were selected. All three aforementioned approaches pursue the strategy of creating a central index of the documents’ contents that is far more detailed than the XDS Document Registry.
This may raise data privacy concerns: besides its audit and access control features, the distributed structure of the IHE XDS architecture, with only limited central metadata on existing documents, is frequently seen as a particular strength in the context of data privacy. In contrast, the EHR-ARCHE approach gets by with the standard XDS Document Registry; no additional central document metadata are created.

The currently ongoing project RAVEL [51] aims to support EHR system users in locating and visualizing relevant elements within EHRs to avoid information overload. In contrast to EHR-ARCHE, this project seems to focus on Local EHR systems, particularly hospital information systems. Its authors plan to use biomedical terminology standards to relate differently coded data to each other, but Ref. [51] makes no reference to open standards-based EHR architecture specifications (e.g., IHE XDS) or to open EHR content specifications (e.g., ISO/EN 13606).

Santos et al. propose an architecture for the Shared EHR system of the Brazilian state of Minas Gerais, which plans to employ the ISO/EN 13606 standard and Archetypes to specify EHR contents [52]. IHE XDS will not be used; instead, the architecture envisages, among other things, central storage of all EHR documents. Although the need for a mechanism enabling the retrieval of a patient's clinical information from the central EHR document repository has been identified, corresponding details are not reported.

The ByMedConnect system aims to support the exchange of patient summaries between EHR systems based on Archetype-conformant ISO/EN 13606 EHR documents [53,54]. The architecture consists of a peer-to-peer communication of the patient summary via a central intermediate document server, where the patient selects the receiver.
Information needs of physicians will be investigated based on the contents of the patient summary and displayed in the EHR systems by means of a generic visualization method that will supplement the Archetype model [55]. The ByMedConnect approach focuses on a one-time transfer of a single patient summary between two health care providers. After the transfer, the patient summary is removed from the document server. In contrast, the EHR-ARCHE approach is based on complete EHRs containing multiple persistent documents. It aims to locate the relevant information in these documents and visualize it in an integrated fashion.

Muñoz et al. describe the development of an EHR server that allows the storage and exchange of Archetype-conformant ISO/EN 13606 EHR documents [56]. The latter are stored in a central repository. The search for contents of EHR documents is said to be supported but is not described in detail.

Overall, prior work that aims to avoid information overload in Shared EHR systems based on EHR standards employs either open standards-based EHR architecture specifications (typically IHE XDS) or Archetypes. None of the approaches integrates both technologies. The EHR-ARCHE project fills this gap by combining the strengths of IHE XDS, an established framework for exchanging EHR documents, and Archetypes, a foundation for specifying and thus searching fine-grained information items within EHR documents. In contrast to existing approaches that are based on the IHE XDS profile, EHR-ARCHE does not create a central index with detailed information on the contents of EHR documents. In our opinion, such an index deprives IHE XDS of a key strength, namely supporting data privacy by storing only a limited set of high-level document metadata in the central registry and holding the documents themselves in distributed repositories.

Experience concerning the application of our approach

In the course of the evaluation, all participants preferred the content-based search to a conventional search, where information retrieval is only possible at the level of complete documents. This assessment is in accordance with a higher success rate of finding relevant information (100%) and a more time-efficient search (8–14 min) for the content-based search in comparison with the conventional search (80%; 20 min or more). This superiority of the content-based search is not self-evident. On the contrary, the higher complexity of the search interface, which offers both item-specific queries and pre-defined queries, as well as the more complex presentation of information items, could have contributed to a feeling of information overload. However, this was not confirmed in our evaluation study. Instead, by providing a fast and complete retrieval of relevant information, the content-based search seems to save time for the physical contact with the patient and may lay the foundations for a high quality treatment. Apart from the actual patient care, the option to query fine-grained information items is also essential for research purposes, e.g. when looking for patients who satisfy particular trial eligibility criteria. As a limitation, it must be noted that only two patient test cases were prepared by a medical domain expert. Further evaluation using routine clinical data of more patients seems to be needed.

Although performance optimization was not the focus of our project, we performed a coarse performance analysis of our content-based search. For this purpose we used a PC with an Intel(R) Core(TM)2 Quad processor at 2.66 GHz, 8 GB RAM, and 64-bit Windows 7 as the hardware platform of the Document Crawler, which covers practically all time-critical processing steps. For six fictitious queries of different complexity, we measured the extra processing time that is needed for our content-based search as compared to a conventional search.
The approximate extra processing times were:

(a) one single information item: 0.6 s
(b) 10 information items linked with boolean OR operators: 1.9 s
(c) 100 information items linked with OR operators: 10.5 s
(d) two information items linked with a boolean AND operator: 0.8 s
(e) five pairs of information items, where each pair was internally linked with an AND operator and all pairs were linked with OR operators: 1.6 s
(f) equivalent to (e) plus 90 additional information items, where the five pairs and the 90 information items were linked with OR operators: 8.5 s

As expected, the duration of a query grows with the number of information items. In all cases the execution of the XQuery (compare Section 3.3.4) was the most time-consuming step of a content-based query (data not shown) and thus represents an obvious starting point for future performance optimization.

Concerning the information needs of EHR system users, we focused on the diabetes domain as a proof-of-concept. This application area may be extended based on a subsequent information needs analysis of other medical domains. Our architecture itself is designed in a generic, domain-independent way. It can handle any lists of information items, Archetypes, and pre-defined queries, regardless of the underlying medical domain. As soon as a new Archetype with bindings to one or more information items is defined in the Archetype Repository, the bound information items are available for creating pre-defined and ad-hoc content-based queries. Based on the results of our information needs analysis, we set up 29 pre-defined queries for the diabetes domain. We have not yet examined whether this number could serve as a rough estimate of the pre-defined queries needed in other medical domains.
If two or more medical domains are covered in parallel by our content-based search, domain-based filtering of information items and pre-defined queries should be considered in the Document Consumer. This would reduce the information items and pre-defined queries offered to the user to those that are relevant to the user's speciality. As our two additional IHE XDS actors, Document Crawler and Archetype Repository, and the extension of the Document Consumer actor are designed independently of the medical domain and could be employed in parallel to conventional actors, they could also be suitable candidates for future extensions of the IHE XDS architecture.
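
The coarse timings reported in this section can be summarized with a simple least-squares line. The following Python sketch is an illustration only and not part of the EHR-ARCHE implementation. It takes the six measurements (a) to (f), counts the information items in each query (10 for the five pairs in (e), 100 for (f)), and fits time = overhead + cost per item × number of items. The name `fit_line` and the linear model itself are assumptions; six coarse measurements on a single machine cannot establish that the relation is truly linear.

```python
# Illustrative only: a least-squares line through the six coarse timings
# reported in the text. The linear model is an assumption, not a finding.

# query label -> (number of information items, extra processing time in s)
MEASUREMENTS = {
    "a": (1, 0.6),
    "b": (10, 1.9),
    "c": (100, 10.5),
    "d": (2, 0.8),
    "e": (10, 1.6),   # five AND-linked pairs = 10 items
    "f": (100, 8.5),  # five AND-linked pairs + 90 further items = 100 items
}


def fit_line(points):
    """Return (intercept, slope) of the ordinary least-squares line."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


overhead_s, per_item_s = fit_line(list(MEASUREMENTS.values()))

if __name__ == "__main__":
    print(f"fixed overhead ~ {overhead_s:.2f} s, per item ~ {per_item_s:.3f} s")
```

Under these assumptions the fit suggests a fixed overhead of well under one second and a cost on the order of a tenth of a second per additional information item. This should be read as a rough orientation only.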

Unanswered questions and future work

As a future work item, we plan to provide EHR system users with an intuitive query editor for the content-based search. A corresponding prototype has already been created that allows value and temporal constraints on information items to be combined with boolean operators [57]. When integrated into our system, this will enable users to create ad-hoc queries based on multiple information items. By storing these queries, they will have the option to extend their set of pre-defined queries.

In our project we assumed fully structured EHR documents as the source of our content-based search. Considering the Austrian ELGA system, which will operate with level 3 CDA documents by 2018 according to §27 paragraph 9 of the ELGA law [58], structured EHR documents do not seem unrealistic in the medium term. However, even these ELGA documents will still contain unstructured parts, and it can be expected that semi-structured documents, probably combined with unstructured documents, will also be common in other future Shared EHR systems. Therefore, we plan to incorporate free-text search functionality in a future version of our approach. Obviously, this kind of information retrieval is limited in several respects. As an example, when querying for all HbA1c values higher than a certain threshold, it will be hard to safely associate a numeric value in the free-text data with the information item HbA1c. Further, the context of an information item will not be easily recognizable from free-text data, for example when a diagnosis does not refer to the patient but to a family member of the patient. However, as a complement to our Archetype-based search, a free-text search option may still be desirable.

We will further analyse whether our approach would benefit from integrating the IHE profiles On-Demand Documents [59] or Query for Existing Data (QED) [60], which allow data for fine-grained information items to be retrieved from Source Actors.
These profiles could optimize the first step of our content-based search. Instead of having to collect all potentially relevant documents for one or more desired information items, the Source Actors could directly deliver the data within corresponding on-demand documents or QED messages to the Document Crawler. As a prerequisite, all Source Actors would naturally have to implement the creation of the required on-demand documents or QED queries. Our second step, where the Document Crawler would extract the data from the different on-demand documents or QED messages and integrate them into a complete overview of the patient, would remain unchanged.

Another plan is to adapt our approach to the framework requirements of the upcoming national Austrian Shared EHR system ELGA [7], which will be based on the IHE XDS architecture. Adaptations will be necessary here concerning the EHR content standard, as ELGA employs HL7 CDA. Further, four CDA implementation guides have been published as the basis of the planned ELGA document types. These implementation guides could be the basis of a corresponding information needs analysis and Archetype design.
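
The contrast between a structured query and a free-text search, as in the HbA1c example discussed in this section, can be illustrated with a small Python sketch. Everything in it is hypothetical: the `Observation` record, its field names, the sample data, and the regular expression are assumptions made for illustration and do not reflect the EHR-ARCHE data model or the query editor prototype [57]. The structured query combines a value constraint and a temporal constraint with a boolean AND, whereas the naive free-text search picks up any percentage in the text.

```python
import re
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    """Hypothetical structured information item taken from an EHR document."""
    item: str       # name of the information item, e.g. "HbA1c"
    value: float
    unit: str
    recorded: date


def structured_query(observations, item, threshold, since):
    """Item match AND value constraint AND temporal constraint."""
    return [
        o for o in observations
        if o.item == item and o.value > threshold and o.recorded >= since
    ]


def naive_free_text_query(text, threshold):
    """Return every percentage above the threshold, whatever it refers to."""
    values = [float(v) for v in re.findall(r"(\d+(?:\.\d+)?)\s*%", text)]
    return [v for v in values if v > threshold]


# Invented sample data, for illustration only.
RECORDS = [
    Observation("HbA1c", 7.2, "%", date(2012, 3, 1)),
    Observation("HbA1c", 9.1, "%", date(2012, 9, 1)),
    Observation("HbA1c", 9.8, "%", date(2010, 1, 15)),
]
NOTE = "HbA1c 7.2 %. Oxygen saturation 97 %. Mother has type 2 diabetes."

if __name__ == "__main__":
    # Structured: HbA1c above 8.0 % recorded since the start of 2012.
    print(structured_query(RECORDS, "HbA1c", 8.0, date(2012, 1, 1)))
    # Free text: the only "hit" is the oxygen saturation, a false positive.
    print(naive_free_text_query(NOTE, 8.0))
```

In this invented note, the free-text search reports the oxygen saturation as a hit because it cannot tell which percentage belongs to HbA1c. A plain keyword search for "diabetes" would likewise match the sentence about the patient's mother.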

Authors’ contributions

GD conceptualized the presented approach in cooperation with EA, and drafted the manuscript. CR and MK implemented the Document Crawler, Document Source and Archetype Repository actors, and designed the Archetypes. GHB conducted the information needs analysis and the evaluation. She further prepared the EHR documents. SS implemented the IHE XDS environment and the extended Document Consumer actor. EA conceptualized the presented approach in cooperation with GD and was the overall project leader. All authors read, revised, and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

What was already known on this topic

Shared EHR systems can improve continuity of care but may at the same time promote information overload of healthcare providers.
IHE XDS is a commonly employed open standards-based architecture specification for the implementation of Shared EHR systems.
Archetypes, which represent a key component of dual-model based open EHR content specifications, are a suitable solution for EHR storage and interoperability.

What this study added to our knowledge

The risk of information overload can be reduced in IHE XDS-based Shared EHR systems by adding a content-based search function to the conventional XDS document-based search.
This content-based search enables information retrieval on the level of fine-grained information items instead of complete documents only.
The content-based search function can be efficiently implemented by integrating Archetypes into the IHE XDS architecture.

26 in total

1. Providing concept-oriented views for clinical data using a knowledge-based system: an evaluation.

Authors: Qing Zeng; James J Cimino; Kelly H Zou
Journal: J Am Med Inform Assoc Date: 2002 May-Jun Impact factor: 4.497

2. HL7 Clinical Document Architecture, Release 2.

Authors: Robert H Dolin; Liora Alschuler; Sandy Boyer; Calvin Beebe; Fred M Behlen; Paul V Biron; Amnon Shabo Shvo
Journal: J Am Med Inform Assoc Date: 2005-10-12 Impact factor: 4.497

3. Emergency department access to a longitudinal medical record.

Authors: George Hripcsak; Soumitra Sengupta; Adam Wilcox; Robert A Green
Journal: J Am Med Inform Assoc Date: 2007-01-09 Impact factor: 4.497

4. Proof-of-concept design and development of an EN13606-based electronic health care record service.

Authors: Adolfo Muñoz; Roberto Somolinos; Mario Pascual; Juan A Fragua; Miguel A González; Jose Luis Monteagudo; Carlos H Salvador
Journal: J Am Med Inform Assoc Date: 2006-10-26 Impact factor: 4.497

5. LinkEHR-Ed: a multi-reference model archetype editor based on formal semantics.

Authors: José A Maldonado; David Moner; Diego Boscá; Jesualdo T Fernández-Breis; Carlos Angulo; Montserrat Robles
Journal: Int J Med Inform Date: 2009-04-21 Impact factor: 4.046

Review 6. Archetype-based electronic health records: a literature review and evaluation of their applicability to health data interoperability and access.

Authors: Dennis Wollersheim; Anny Sari; Wenny Rahayu
Journal: Health Inf Manag Date: 2009 Impact factor: 3.185

7. Development of an EHR system for sharing - a semantic perspective.

Authors: Haifeng Liu; Xue Qiao Hou; Gang Hu; Jing Li; Yu Qi Ding
Journal: Stud Health Technol Inform Date: 2009

8. Towards personal health record: current situation, obstacles and trends in implementation of electronic healthcare record in Europe.

Authors: I Iakovidis
Journal: Int J Med Inform Date: 1998 Oct-Dec Impact factor: 4.046

9. Medical records that guide and teach.

Authors: L L Weed
Journal: N Engl J Med Date: 1968-03-14 Impact factor: 91.245

10. An EHR prototype using structured ISO/EN 13606 documents to respond to identified clinical information needs of diabetes specialists: a controlled study on feasibility and impact.

Authors: Gudrun Huebner-Bloder; Georg Duftschmid; Michael Kohler; Christoph Rinner; Samrend Saboor; Elske Ammenwerth
Journal: AMIA Annu Symp Proc Date: 2012-11-03

2 in total

1. Relevance of health level 7 clinical document architecture and integrating the healthcare enterprise cross-enterprise document sharing profile for managing chronic wounds in a telemedicine context.

Authors: Philippe Finet; Bernard Gibaud; Olivier Dameron; Régine Le Bouquin Jeannès
Journal: Healthc Technol Lett Date: 2016-03-23

2. Modeling EHR with the openEHR approach: an exploratory study in China.

Authors: Lingtong Min; Qi Tian; Xudong Lu; Huilong Duan
Journal: BMC Med Inform Decis Mak Date: 2018-08-29 Impact factor: 2.796
