Medical data quality assessment: On the development of an automated framework for medical data curation.

Vasileios C Pezoulas, Konstantina D Kourou, Fanis Kalatzis, Themis P Exarchos, Aliki Venetsanopoulou, Evi Zampeli, Saviana Gandolfo, Fotini Skopouli, Salvatore De Vita, Athanasios G Tzioufas, Dimitrios I Fotiadis.

Abstract

Data quality assessment has gained attention in recent years, as more and more companies and medical centers highlight the importance of an automated framework for effectively managing the quality of their big data. Data cleaning, also known as data curation, lies at the heart of data quality assessment and is a key step prior to the development of any data analytics service. In this work, we present the objectives, functionalities, and methodological advances of an automated framework for data curation from a medical perspective. The steps towards the development of a system for data quality assessment are first described, along with multidisciplinary data quality measures. A three-layer architecture that realizes these steps is then presented. Emphasis is placed on the detection and tracking of inconsistencies, missing values, outliers, and similarities, as well as on data standardization, which ultimately enables data harmonization. A case study demonstrates the applicability and reliability of the proposed framework on two well-established cohorts with clinical data related to primary Sjögren's Syndrome (pSS). Our results confirm the validity of the proposed framework for the automated and fast identification of outliers, inconsistencies, and highly correlated and duplicated terms, as well as the successful matching of more than 85% of the pSS-related medical terms in both cohorts, yielding more accurate, relevant, and consistent clinical data.
Copyright © 2019. Published by Elsevier Ltd.
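
The curation checks named in the abstract (missing values, outliers, duplicated and highly correlated terms, and term matching) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state which methods the framework uses, so the interquartile-range rule, the Pearson correlation threshold, the string-similarity cutoff, and every function and column name below are assumptions chosen for illustration only.

```python
from difflib import SequenceMatcher
from statistics import mean, quantiles

# Illustrative sketch only: every name, rule, and threshold here is an
# assumption, not the method of the published framework.

def missing_counts(table):
    """Count missing (None) entries per column; table maps name -> list."""
    return {name: sum(v is None for v in values) for name, values in table.items()}

def iqr_outliers(values, k=1.5):
    """Indices of values outside [Q1 - k*IQR, Q3 + k*IQR], ignoring None."""
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [i for i, v in enumerate(values) if v is not None and not low <= v <= high]

def pearson(xs, ys):
    """Pearson correlation over the rows where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    mx, my = mean(p[0] for p in pairs), mean(p[1] for p in pairs)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = sum((x - mx) ** 2 for x, _ in pairs) ** 0.5
    sy = sum((y - my) ** 2 for _, y in pairs) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def similar_columns(table, threshold=0.9):
    """Flag column pairs that are exact duplicates or highly correlated."""
    names, flagged = list(table), []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if table[a] == table[b]:
                flagged.append((a, b, "duplicate"))
            elif abs(pearson(table[a], table[b])) >= threshold:
                flagged.append((a, b, "correlated"))
    return flagged

def match_terms(terms, reference, cutoff=0.8):
    """Map each cohort term to its closest reference term, or None if too far."""
    def norm(s):
        return s.lower().replace("_", " ").replace("-", " ").strip()
    def score(a, b):
        return SequenceMatcher(None, norm(a), norm(b)).ratio()
    matches = {}
    for term in terms:
        best = max(reference, key=lambda ref: score(term, ref))
        matches[term] = best if score(term, best) >= cutoff else None
    return matches

# Hypothetical toy cohort; the values are invented for the example.
cohort = {
    "age": [34, 51, 47, None, 62, 58, 45, 190],
    "age_years": [34, 51, 47, None, 62, 58, 45, 190],
    "esr": [12, 30, 22, 18, None, 25, 15, 20],
}
print(missing_counts(cohort))
print(iqr_outliers(cohort["age"]))
print(similar_columns(cohort))
print(match_terms(["Anti_Ro/SSA", "lymphoma", "dry mouth score"],
                  ["anti-Ro/SSA", "Lymphoma", "Raynaud phenomenon"]))
```

In this toy run, the implausible age of 190 is flagged as an outlier, `age` and `age_years` are reported as duplicated terms, and the unmatched term maps to `None` so that it can be reviewed manually rather than silently forced onto a reference term.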

Keywords:  Big data; Data curation; Data quality; Data quality assessment; Data standardization

Year:  2019        PMID: 30878889     DOI: 10.1016/j.compbiomed.2019.03.001

Source DB:  PubMed          Journal:  Comput Biol Med        ISSN: 0010-4825            Impact factor:   4.589


  11 in total

1.  Inaccurate Labels in Weakly-Supervised Deep Learning: Automatic Identification and Correction and Their Impact on Classification Performance.

Authors:  Degan Hao; Lei Zhang; Jules Sumkin; Aly Mohamed; Shandong Wu
Journal:  IEEE J Biomed Health Inform       Date:  2020-02-17       Impact factor: 5.772

Review 2.  Advancing computer-aided drug discovery (CADD) by big data and data-driven machine learning modeling.

Authors:  Linlin Zhao; Heather L Ciallella; Lauren M Aleksunes; Hao Zhu
Journal:  Drug Discov Today       Date:  2020-07-11       Impact factor: 7.851

3.  A Rule-Based Data Quality Assessment System for Electronic Health Record Data.

Authors:  Zhan Wang; John R Talburt; Ningning Wu; Serhan Dagtas; Meredith Nahm Zozus
Journal:  Appl Clin Inform       Date:  2020-09-23       Impact factor: 2.342

4.  Predicting Lymphoma Development by Exploiting Genetic Variants and Clinical Findings in a Machine Learning-Based Methodology With Ensemble Classifiers in a Cohort of Sjögren's Syndrome Patients.

Authors:  Konstantina D Kourou; Vasileios C Pezoulas; Eleni I Georga; Themis Exarchos; Costas Papaloukas; Michalis Voulgarelis; Andreas Goules; Andrianos Nezos; Athanasios G Tzioufas; Haralampos M Moutsopoulos; Clio Mavragani; Dimitrios I Fotiadis
Journal:  IEEE Open J Eng Med Biol       Date:  2020-02-14

Review 5.  Big Data in Nephrology.

Authors:  Navchetan Kaur; Sanchita Bhattacharya; Atul J Butte
Journal:  Nat Rev Nephrol       Date:  2021-06-30       Impact factor: 28.314

6.  A Multimodal Approach for the Risk Prediction of Intensive Care and Mortality in Patients with COVID-19.

Authors:  Vasileios C Pezoulas; Konstantina D Kourou; Costas Papaloukas; Vassiliki Triantafyllia; Vicky Lampropoulou; Eleni Siouti; Maria Papadaki; Maria Salagianni; Evangelia Koukaki; Nikoletta Rovina; Antonia Koutsoukou; Evangelos Andreakos; Dimitrios I Fotiadis
Journal:  Diagnostics (Basel)       Date:  2021-12-28

7.  ICU admission and mortality classifiers for COVID-19 patients based on subgroups of dynamically associated profiles across multiple timepoints.

Authors:  Vasileios C Pezoulas; Konstantina D Kourou; Eugenia Mylona; Costas Papaloukas; Angelos Liontos; Dimitrios Biros; Orestis I Milionis; Chris Kyriakopoulos; Kostantinos Kostikas; Haralampos Milionis; Dimitrios I Fotiadis
Journal:  Comput Biol Med       Date:  2021-12-27       Impact factor: 6.698

8.  Developing a systematic approach to assessing data quality in secondary use of clinical data based on intended use.

Authors:  Hanieh Razzaghi; Jane Greenberg; L Charles Bailey
Journal:  Learn Health Syst       Date:  2021-05-03

9.  Overcoming the Barriers That Obscure the Interlinking and Analysis of Clinical Data Through Harmonization and Incremental Learning.

Authors:  Vasileios C Pezoulas; Konstantina D Kourou; Fanis Kalatzis; Themis P Exarchos; Evi Zampeli; Saviana Gandolfo; Andreas Goules; Chiara Baldini; Fotini Skopouli; Salvatore De Vita; Athanasios G Tzioufas; Dimitrios I Fotiadis
Journal:  IEEE Open J Eng Med Biol       Date:  2020-03-16

10.  Primary Sjögren's Syndrome of Early and Late Onset: Distinct Clinical Phenotypes and Lymphoma Development.

Authors:  Andreas V Goules; Ourania D Argyropoulou; Vasileios C Pezoulas; Loukas Chatzis; Elena Critselis; Saviana Gandolfo; Francesco Ferro; Marco Binutti; Valentina Donati; Sara Zandonella Callegher; Aliki Venetsanopoulou; Evangelia Zampeli; Maria Mavrommati; Paraskevi V Voulgari; Themis Exarchos; Clio P Mavragani; Chiara Baldini; Fotini N Skopouli; Dimitrios I Fotiadis; Salvatore De Vita; Haralampos M Moutsopoulos; Athanasios G Tzioufas
Journal:  Front Immunol       Date:  2020-10-19       Impact factor: 7.561
