
Towards a content agnostic computable knowledge repository for data quality assessment.

Naresh Sundar Rajan, Ramkiran Gouripeddi, Peter Mo, Randy K Madsen, Julio C Facelli.

Abstract

BACKGROUND AND OBJECTIVE: In recent years, several conceptual frameworks for assessing the quality of data have been proposed across the Data Quality and Information Quality domains. These frameworks are diverse, ranging from simple lists of concepts to complex ontological and taxonomical representations of data quality concepts. The goal of this study is to design, develop and implement a platform-agnostic computable data quality knowledge repository for data quality assessments.
METHODS: We identified computable data quality concepts by performing a comprehensive literature review of articles indexed in three major bibliographic data sources. From this corpus, we extracted data quality concepts, their definitions, applicable measures, and their computability, and identified the conceptual relationships among them. We used these relationships to design and develop a data quality meta-model and implemented it in a quality knowledge repository.
RESULTS: We identified three primitives for programmatically performing data quality assessments: a data quality concept, its definition, and its measure or rule for data quality assessment, together with their associations. We modeled a computable data quality meta-data repository and extended this framework to adapt, store, retrieve and automate the assessment of other existing data quality assessment models.
CONCLUSION: We identified research gaps in the data quality literature with respect to automating data quality assessment methods. In this process, we designed, developed and implemented a computable data quality knowledge repository for assessing quality and characterizing data in health data repositories. We leverage this knowledge repository in a service-oriented architecture to provide a scalable and reproducible framework for data quality assessments in disparate biomedical data sources.
Copyright © 2019 Elsevier B.V. All rights reserved.
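
The three primitives named in RESULTS (a concept, its definition, and a measure or rule, plus their associations) can be illustrated with a minimal sketch. This is not the authors' implementation: the class names `Concept`, `Measure` and `KnowledgeRepository`, the `non_missing_fraction` rule, and the sample values are all hypothetical, chosen only to show how a rule that operates on plain values stays independent of the content being assessed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass(frozen=True)
class Concept:
    """A data quality concept paired with its definition (hypothetical model)."""
    name: str
    definition: str


@dataclass(frozen=True)
class Measure:
    """A computable rule that scores a sequence of raw values."""
    name: str
    rule: Callable[[Sequence[object]], float]


class KnowledgeRepository:
    """Stores concepts and the measures associated with each concept."""

    def __init__(self) -> None:
        self._concepts: Dict[str, Concept] = {}
        self._measures: Dict[str, List[Measure]] = {}

    def register(self, concept: Concept, measure: Measure) -> None:
        # Record the concept once and append the measure to its association list.
        self._concepts.setdefault(concept.name, concept)
        self._measures.setdefault(concept.name, []).append(measure)

    def definition(self, concept_name: str) -> str:
        return self._concepts[concept_name].definition

    def assess(self, concept_name: str, values: Sequence[object]) -> Dict[str, float]:
        # Run every measure linked to the concept against the supplied values.
        return {m.name: m.rule(values) for m in self._measures[concept_name]}


def non_missing_fraction(values: Sequence[object]) -> float:
    """Assumed example rule: share of values that are not None."""
    if not values:
        return 0.0
    return sum(v is not None for v in values) / len(values)


repo = KnowledgeRepository()
repo.register(
    Concept("completeness", "Extent to which expected values are present."),
    Measure("non_missing_fraction", non_missing_fraction),
)

# Hypothetical column of heart-rate readings with one missing entry.
result = repo.assess("completeness", [72, None, 65, 80])
print(result)
```

Because the rule receives only a sequence of values, the same registered measure could be applied to any column from any source; the repository itself holds no schema-specific knowledge.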

Keywords:  Data Quality Metadata Repository; Data quality assessment; Data quality dimensions; Data quality framework; Knowledge representation

Year:  2019        PMID: 31319948     DOI: 10.1016/j.cmpb.2019.05.017

Source DB:  PubMed          Journal:  Comput Methods Programs Biomed        ISSN: 0169-2607            Impact factor:   5.428


  2 in total

1.  Quality assessment of real-world data repositories across the data life cycle: A literature review. (Review)

Authors:  Siaw-Teng Liaw; Jason Guan Nan Guo; Sameera Ansari; Jitendra Jonnagaddala; Myron Anthony Godinho; Alder Jose Borelli; Simon de Lusignan; Daniel Capurro; Harshana Liyanage; Navreet Bhattal; Vicki Bennett; Jaclyn Chan; Michael G Kahn
Journal:  J Am Med Inform Assoc       Date:  2021-07-14       Impact factor: 4.497

2.  Recent Trends in Patient Registries for Health Services Research.

Authors:  Jürgen Stausberg; Sonja Harkener; Sebastian C Semler
Journal:  Methods Inf Med       Date:  2021-04-16       Impact factor: 2.176

