Mathias Kaspar, Leon Liman, Maximilian Ertl, Georg Fette, Lea Katharina Seidlmayer, Laura Schreiber, Frank Puppe, Stefan Störk.
Abstract
Clinical Data Warehouses (DWHs) are used to provide researchers with simplified access to pseudonymized and homogenized clinical routine data from multiple primary systems. Experience with the integration of imaging and metadata from picture archiving and communication systems (PACS), however, is rare. Our goal was therefore to analyze the viability of integrating a production PACS with a research DWH, both to enable DWH queries that combine clinical and medical imaging metadata and to enable the DWH to display and download images ad hoc. We developed an application interface that allows the production PACS of a large hospital to be queried from a clinical research DWH containing pseudonymized data. We evaluated the performance of bulk extracting metadata from the PACS to the DWH and the performance of retrieving images ad hoc from the PACS for display and download within the DWH. We integrated the system into the query interface of our DWH and used it successfully in four use cases. Depending on the extraction approach, the bulk extraction of imaging metadata required a median (quartiles) time of 0.09 (0.03-2.25) to 12.52 (4.11-37.30) seconds per patient for a median (quartiles) number of 10 (3-29) to 103 (8-693) images per patient. The ad hoc image retrieval from the PACS required a median (quartiles) of 2.57 (2.57-2.79) seconds per image for the download, but 5.55 (4.91-6.06) seconds to display the first image and 40.77 (38.60-41.63) seconds to display all images using the pure web-based viewer. A full integration of a production PACS with a research DWH is viable and enables various use cases in research. While the extraction of basic metadata from all images can be done with reasonable effort, the extraction of all metadata seems to be more appropriate for subgroups.
Keywords: Clinical data warehouse; Medical images; Secondary data usage; System integration
Year: 2020 PMID: 32314069 PMCID: PMC7522145 DOI: 10.1007/s10278-020-00334-0
Source DB: PubMed Journal: J Digit Imaging ISSN: 0897-1889 Impact factor: 4.056
Fig. 1. Architecture of the DWH including the PACS-to-DWH (P2D) interface application. The P2D interface (gray background) connects to the PACS, to the identity (ID) management system, to the DWH database (via the metadata extractor), and to the DWH web-client (via the ad hoc query system).
Fig. 2. Screenshots of the PACS-to-DWH (P2D) integration. (a) An example DWH query (top panel) with the catalog entries (left panel) of the PACS entries. Query results are shown in the bottom panel. The right-most columns present buttons to download (“Bild herunterladen”) or view (“Bild ansehen”; results shown in b) the images of the patient or patient case. (b) The study/series list view of a single patient or patient case (depending on the DWH query). A click on the left part of the list leads to the view shown in c. A click on the right part provides a directory containing the downloaded images. (c) The view of a single DICOM CT series within the DICOM viewer.
Number of images, times, and data size required for the bulk metadata extraction of 1000 test patients from the production PACS, aggregated per patient (see supplementary tables for more details)
| Measure | Statistic | Basic header: first image per series | Basic header: all images | Extended header: first image per series |
|---|---|---|---|---|
| Number of images per patient | Mean | 26.5 | 1004 | 26.5 |
| | Median (quartiles) | 10 (3–30) | 103 (8–693) | 10 (3–30) |
| | Minimum | 1 | 1 | 1 |
| | Maximum | 664 | 37,846 | 663 |
| Time required to extract a patient’s data (seconds) | Mean | 2.76 | 3.92 | 34.10 |
| | Median (quartiles) | 0.09 (0.03–2.25) | 0.38 (0.03–2.67) | 12.52 (4.11–37.30) |
| | Minimum | < 0.01 | < 0.01 | 0.01 |
| | Maximum | 100.05 | 153.46 | 835.67 |
| Data size to extract a patient’s data (KiB) | Mean | 417.5 | 417.1 | 123,120.8 |
| | Median (quartiles) | 38.1 (3.0–278.1) | 37.8 (3.0–277.2) | 29,541.0 (6367.3–118,253.9) |
| | Minimum | 0.2 | 0.2 | 0.2 |
| | Maximum | 15,577.1 | 15,577.1 | 3,939,238.9 |
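
The per-patient aggregation used in the table above (mean, median with quartiles, minimum, maximum) could be reproduced from raw measurements roughly as in the following minimal sketch. The function name `summarize`, the sample values, and the `"inclusive"` quartile method are assumptions made for illustration; the source does not state which quartile definition was used.

```python
from statistics import mean, median, quantiles


def summarize(values):
    """Summarize per-patient measurements (e.g. extraction times in seconds).

    Returns the mean, median, first and third quartile, minimum, and maximum.
    The quartile method is an assumption; other definitions give slightly
    different quartiles on small samples.
    """
    if len(values) < 2:
        raise ValueError("need at least two values to compute quartiles")
    # quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "mean": mean(values),
        "median": median(values),
        "q1": q1,
        "q3": q3,
        "min": min(values),
        "max": max(values),
    }


def format_median_quartiles(summary, digits=2):
    """Render a summary in the 'median (Q1-Q3)' style used in the tables."""
    return (
        f"{summary['median']:.{digits}f} "
        f"({summary['q1']:.{digits}f}-{summary['q3']:.{digits}f})"
    )


if __name__ == "__main__":
    # Hypothetical per-patient extraction times in seconds, not study data.
    times = [0.02, 0.03, 0.09, 0.5, 2.25, 10.0]
    print(format_median_quartiles(summarize(times)))
```

Reporting the median with quartiles alongside the mean is useful here because the distributions are strongly right-skewed: a few patients with very many images pull the mean far above the median.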
Number of images, times, and data size required for the bulk metadata extraction of 20 test patients from the Orthanc test setup, aggregated per patient (see supplementary tables for more details)
| Measure | Statistic | Basic header: first image per series | Basic header: all images | Extended header: first image per series |
|---|---|---|---|---|
| Number of images per patient | Mean | 44.1 | 1434.3 | 44.1 |
| | Median (quartiles) | 13 (5–67) | 100 (32–1891) | 13 (5–67) |
| | Minimum | 1 | 1 | 1 |
| | Maximum | 171 | 7943 | 170 |
| Time required to extract a patient’s data (seconds) | Mean | 10.78 | 13.64 | 110.88 |
| | Median (quartiles) | 2.16 (1.08–9.46) | 6.14 (0.78–29.76) | 32.33 (9.19–147.75) |
| | Minimum | < 0.01 | 0.02 | 4.08 |
| | Maximum | 67.47 | 46.47 | 403.69 |
| Data size to extract a patient’s data (KiB) | Mean | 848.3 | 848.3 | 164,800.3 |
| | Median (quartiles) | 56.9 (13.9–1240.6) | 56.9 (13.9–1240.6) | 26,307.3 (4921.7–139,309.6) |
| | Minimum | 0.7 | 0.7 | 0.2 |
| | Maximum | 4648.0 | 4648.0 | 1,863,962.9 |