Enhancing medical data quality through data curation: a case study in primary Sjögren's syndrome.

Vasileios C Pezoulas, Konstantina D Kourou, Fanis Kalatzis, Themis P Exarchos, Aliki I Venetsanopoulou, Evi Zampeli, Saviana Gandolfo, Fotini N Skopouli, Salvatore De Vita, Athanasios G Tzioufas, Dimitrios I Fotiadis.

Abstract

OBJECTIVES: To address the need for automated assessment of clinical data quality in terms of accuracy, relevance, conformity, and completeness, by developing and applying a method that detects problematic fields and matches clinical terms within a specific domain.
METHODS: The proposed methodology automatically constructs three diagnostic reports that summarise the types and ranges of each term in the dataset, together with the detected outliers, inconsistencies, and missing values. It then produces a set of clinically relevant terms based on a reference model, that is, a set of terms describing the domain knowledge of the disease of interest.
RESULTS: A case study was conducted using anonymised data from 250 patients diagnosed with primary Sjögren's syndrome (pSS), yielding reliable outcomes that were highlighted for clinical evaluation. The method identified 28 features with detected outliers and unknown data types, and it also identified outliers, missing values, similar terms, and inconsistencies within the dataset. The data standardisation method matched 76 out of 85 (89.41%) pSS-related terms against a standard pSS reference model introduced by the clinicians.
CONCLUSIONS: Our results confirm the clinical value of the data curation method for improving dataset quality through the precise identification of outliers, missing values, inconsistencies, and similar terms, as well as through the automated detection of pSS-related terms for data standardisation.
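
The abstract does not specify how outliers are detected or how terms are matched, so the following is only a minimal sketch of the two steps it describes: a per-field quality report (type, range, missing values, outliers) and the matching of dataset terms to a reference model. The 1.5 x IQR outlier rule, the character-level similarity from `difflib`, the 0.8 threshold, and all function and term names are assumptions chosen for illustration, not the authors' implementation.

```python
from difflib import SequenceMatcher
from statistics import quantiles

# Markers treated as missing values (an assumption for this sketch).
MISSING = {None, "", "NA", "N/A", "nan"}


def is_number(value):
    """Return True if the value can be parsed as a float."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def column_report(values):
    """Summarise one field: inferred type, range, missing count, IQR outliers."""
    present = [v for v in values if v not in MISSING]
    numeric = [float(v) for v in present if is_number(v)]
    if not present:
        kind = "unknown"
    elif len(numeric) == len(present):
        kind = "numeric"
    elif numeric:
        kind = "mixed"  # numbers and text in one field: an inconsistency
    else:
        kind = "categorical"
    report = {
        "type": kind,
        "missing": len(values) - len(present),
        "range": None,
        "outliers": [],
    }
    if kind == "numeric":
        report["range"] = (min(numeric), max(numeric))
        if len(numeric) >= 4:
            # Assumed rule: flag values beyond 1.5 x IQR from the quartiles.
            q1, _, q3 = quantiles(numeric, n=4)
            low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
            report["outliers"] = [x for x in numeric if x < low or x > high]
    return report


def normalise(term):
    """Lower-case a term and unify separators before comparison."""
    return " ".join(term.lower().replace("_", " ").replace("-", " ").split())


def match_terms(dataset_terms, reference_terms, threshold=0.8):
    """Map each dataset term to its most similar reference term, if any."""
    matches = {}
    for term in dataset_terms:
        best_score, best_ref = 0.0, None
        for ref in reference_terms:
            score = SequenceMatcher(None, normalise(term), normalise(ref)).ratio()
            if score > best_score:
                best_score, best_ref = score, ref
        if best_ref is not None and best_score >= threshold:
            matches[term] = best_ref
    return matches


if __name__ == "__main__":
    # Hypothetical field values and term lists, for illustration only.
    print(column_report(["50", "52", "55", "49", "51", "53", "500", ""]))
    found = match_terms(["Dry_Eyes", "lymphoma", "xyz123"],
                        ["dry eyes", "Lymphoma", "Raynaud phenomenon"])
    print(found, f"coverage: {len(found) / 3:.2%}")
```

The coverage printed at the end is computed the same way as the figure reported in the abstract, where 76 matched terms out of 85 give 76/85 = 89.41%.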

Year:  2019
PMID:  31287405

Source DB:  PubMed
Journal:  Clin Exp Rheumatol
ISSN:  0392-856X
Impact factor:  4.473


  2 in total

1.  Developing a systematic approach to assessing data quality in secondary use of clinical data based on intended use.

Authors:  Hanieh Razzaghi; Jane Greenberg; L Charles Bailey
Journal:  Learn Health Syst       Date:  2021-05-03

2.  Overcoming the Barriers That Obscure the Interlinking and Analysis of Clinical Data Through Harmonization and Incremental Learning.

Authors:  Vasileios C Pezoulas; Konstantina D Kourou; Fanis Kalatzis; Themis P Exarchos; Evi Zampeli; Saviana Gandolfo; Andreas Goules; Chiara Baldini; Fotini Skopouli; Salvatore De Vita; Athanasios G Tzioufas; Dimitrios I Fotiadis
Journal:  IEEE Open J Eng Med Biol       Date:  2020-03-16