Mijung Kim, Tahsin Kurc, Alessandro Orso, Jake Cobb, David Gutman, Mary Jean Harrold, Andrew Post, Ashish Sharma, Joel Saltz.
Abstract
Clinical research increasingly relies on information gathered and managed in different database systems and at different institutions. Distributed data collection and management in such settings can be extremely complex and can lead to a range of issues involving the integrity and accuracy of the distributed data. To address this challenge, we propose a middleware framework for assessing data integrity and correctness in federated environments. The framework has two main elements: (1) a test model describing the dependencies between, and constraints on, data sources and datasets, and (2) a family of testing techniques that create and execute test cases based on the model.
Year: 2011 PMID: 22211176 PMCID: PMC3248750
Source DB: PubMed Journal: AMIA Jt Summits Transl Sci Proc
Figure 1: Intuitive view of the proposed framework.
Datasets and data management systems in the test bed.
| Dataset | Data management system |
| --- | --- |
| Radiology images in DICOM format, imaging metadata | Virtual PACS |
| Manual annotations provided by neuroradiologists | AIME |
| mRNA, miRNA, methylation data, copy number, sequence data | In-house database with file system for data files |
| Clinical data (including days to death, diagnosis, and year of initial pathologic diagnosis), specimen data (e.g., sample type), etc. | i2b2 |
| Whole slide microscopy images at 20x and 40x magnification, image metadata | caMicroscope |
| Computer- and human-generated annotations of pathology images | PAIS |
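
The abstract names two elements: a test model of dependencies and constraints across data sources, and test cases created from that model and then executed. The sketch below illustrates that general idea only; it is not the authors' implementation, which the abstract does not describe. The `Constraint` class, the source names `clinical` and `pathology_annotations`, and the patient IDs are all hypothetical, and the in-memory sets stand in for queries against real systems such as those listed in the table.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

# Hypothetical stand-in: each data source is reduced to the set of patient IDs it holds.
Sources = Dict[str, Set[str]]


@dataclass(frozen=True)
class Constraint:
    """One entry in the (assumed) test model: every ID found in
    `dependent` must also exist in `reference`."""
    name: str
    dependent: str
    reference: str


def generate_tests(model: List[Constraint]) -> List[Callable[[Sources], List[str]]]:
    """Create one executable test case per constraint in the model."""
    def make_test(c: Constraint) -> Callable[[Sources], List[str]]:
        def test(sources: Sources) -> List[str]:
            # IDs present in the dependent source but absent from the reference source.
            missing = sorted(sources[c.dependent] - sources[c.reference])
            return [
                f"{c.name}: {pid} in {c.dependent} but not in {c.reference}"
                for pid in missing
            ]
        return test
    return [make_test(c) for c in model]


def run_tests(model: List[Constraint], sources: Sources) -> List[str]:
    """Execute every generated test and collect the reported violations."""
    violations: List[str] = []
    for test in generate_tests(model):
        violations.extend(test(sources))
    return violations


# Hypothetical model and data: annotated patients should have a clinical record.
model = [
    Constraint("annotated-patient-has-clinical-record",
               dependent="pathology_annotations", reference="clinical"),
]
sources = {
    "clinical": {"P001", "P002"},
    "pathology_annotations": {"P002", "P003"},
}
violations = run_tests(model, sources)
print(violations)
```

Keeping the constraints as data, separate from the code that turns them into checks, is one way to let the same model drive different testing techniques; whether the framework is organized this way is an assumption.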