Kavishwar B Wagholikar, Shreekanth V Joshi, Vishal V Pai Vernekar, Yuri Ostrovsky, Somnath D Desai, Pooja B Magdum, Sachin B Wakle, Sheetal Jain, Akshay Zagade, Rahul Patel, Shawn N Murphy.
Abstract
Despite the widespread use of the "Informatics for Integrating Biology and the Bedside" (i2b2) platform, loading electronic health records (EHR) into i2b2 and querying i2b2 remain substantially challenging. We have previously presented a simplified framework for semantic abstraction of EHR data into i2b2. Building on that work, we have created a proof-of-concept implementation of cloud services on an i2b2 data store for cohort identification. Specifically, we have implemented a graphical user interface (GUI) that declares the key components for data import, transformation, and query of EHR data. The GUI integrates with Azure cloud services to create data pipelines for importing EHR data into i2b2, creating derived facts, and running queries that generate Sankey-like flow diagrams characterizing the patient cohorts. We have evaluated the implementation using the real-world MIMIC-III dataset. We discuss the key features of this implementation and directions for future work, which will advance the research community's efforts in patient cohort identification.
Year: 2020 PMID: 32724799 PMCID: PMC7366204 DOI: 10.1155/2020/2851713
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 1. High-level technical block diagram.
Figure 2. Creating concepts and derived concepts.
Figure 3. Creating a concept.
Figure 4. Creating a derived concept using PySpark.
Figure 5. Graphical user interface for cohort querying. The flow diagram shows each node in the selection criteria.
Logic definitions.
| Logic type | Name | Logic path | Modality of implementation |
|---|---|---|---|
| Import | Diagnosis-import | Import/Diagnosis-import | SQL query |
| Import | Labs-import | Import/Labs-import | SQL query |
| Concept hierarchy | Diagnosis hierarchy | Concept_Hierarchy/Diagnosis-hierarchy | Tab separated file |
| Concept hierarchy | Labs hierarchy | Concept_Hierarchy/Labs-hierarchy | Tab separated file |
| Derived concept: Boolean | Diagnosis-diabetes | Derived_Concept/Diagnosis-Diabetes | SQL query |
| Derived concept: numeric | Last HbA1c | Labs/BloodTest/Chemistry/HbA1c/Last HbA1c | SQL query |
| Derived concept: numeric | Serum glucose | Labs/BloodTest/Chemistry/Glucose/Last-Glucose | SQL query |
| Derived concept: Boolean | Last two consecutive serum glucose > 126 | Labs/BloodTest/chemistry/glucose/last-twoSerumGlucose/LastTwoSerumGlucose_ > _126 | SQL query |
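To illustrate what the derived concepts in the table compute, here is a minimal Python sketch. It is not the authors' implementation, which the table lists as SQL queries. The `Fact` record, its field names, and the helper functions are hypothetical simplifications and do not reflect the i2b2 schema. The sketch also assumes that "last two consecutive serum glucose > 126" means the two most recent glucose values both exceed the threshold.

```python
from collections import defaultdict
from typing import NamedTuple


class Fact(NamedTuple):
    """Hypothetical, simplified observation fact (not the i2b2 schema)."""
    patient_id: int
    concept: str
    day: int      # observation time as a day offset, for illustration only
    value: float


def values_by_patient(facts, concept):
    """Group the values of one concept per patient, ordered oldest to newest."""
    grouped = defaultdict(list)
    for f in sorted(facts, key=lambda f: f.day):
        if f.concept == concept:
            grouped[f.patient_id].append(f.value)
    return grouped


def last_value(facts, concept):
    """Numeric derived concept: most recent value per patient (e.g. last HbA1c)."""
    return {pid: vals[-1] for pid, vals in values_by_patient(facts, concept).items()}


def last_two_above(facts, concept, threshold):
    """Boolean derived concept: True if the two most recent values both exceed threshold."""
    return {
        pid: len(vals) >= 2 and all(v > threshold for v in vals[-2:])
        for pid, vals in values_by_patient(facts, concept).items()
    }


# Made-up example data, not taken from MIMIC-III.
FACTS = [
    Fact(1, "Glucose", 1, 110.0),
    Fact(1, "Glucose", 2, 140.0),
    Fact(1, "Glucose", 3, 150.0),
    Fact(2, "Glucose", 1, 160.0),
    Fact(2, "Glucose", 2, 100.0),
    Fact(3, "Glucose", 1, 200.0),
    Fact(1, "HbA1c", 1, 7.1),
    Fact(2, "HbA1c", 1, 5.6),
]

print(last_value(FACTS, "HbA1c"))
print(last_two_above(FACTS, "Glucose", 126))
```

In this sketch, a patient with fewer than two glucose measurements is marked `False` for the Boolean concept. That is a design choice of the example, not something the source specifies.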
Diabetes cohort criteria.
| Rationale for filter | Description for filter |
|---|---|
| Based on ICD-9 codes | Diagnosis code for diabetes |
| Based on serum glucose | Last two consecutive serum glucose > 126 |
| Based on HbA1c | Last HbA1c > 6.4 |
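The patient counts behind a Sankey-like flow diagram can be sketched as a chain of filters, where each node counts only the patients who also passed its parent nodes. The Python example below is a rough illustration under stated assumptions. The per-patient summary fields, the order of the filters, and the `flow_counts` helper are hypothetical, and the data are made up rather than drawn from MIMIC-III.

```python
# Hypothetical per-patient summaries; field names are illustrative only.
PATIENTS = {
    1: {"diabetes_dx": True, "last_two_glucose_gt_126": True, "last_hba1c": 7.2},
    2: {"diabetes_dx": True, "last_two_glucose_gt_126": False, "last_hba1c": 6.9},
    3: {"diabetes_dx": False, "last_two_glucose_gt_126": True, "last_hba1c": 6.8},
    4: {"diabetes_dx": True, "last_two_glucose_gt_126": True, "last_hba1c": 6.0},
}

# Filters mirror the three cohort criteria; their order here is an assumption.
FILTERS = [
    ("Diabetes diagnosis code (ICD-9)", lambda p: p["diabetes_dx"]),
    ("Last two consecutive serum glucose > 126", lambda p: p["last_two_glucose_gt_126"]),
    ("Last HbA1c > 6.4", lambda p: p["last_hba1c"] > 6.4),
]


def flow_counts(patients, filters):
    """Apply filters in order; each node counts patients that passed it and all its parents."""
    remaining = set(patients)
    nodes = [("All patients", len(remaining))]
    for name, predicate in filters:
        remaining = {pid for pid in remaining if predicate(patients[pid])}
        nodes.append((name, len(remaining)))
    return nodes


for name, count in flow_counts(PATIENTS, FILTERS):
    print(f"{count:>3}  {name}")
```

Reordering `FILTERS` changes the intermediate counts but not the final one, since the chained filters act as a conjunction.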
Figure 6. Flow diagram summarizing the result of the query. Each node shows the patient count resulting from the filters in the current node and its parent.