Evaluation of record linkage methods for iterative insertions.

Murat Sariyar, A Borg, K Pommerening.

Abstract

OBJECTIVES: Record linkage is an area of interdisciplinary research in which many mathematical methods have been developed and applied. However, comparative evaluations of record linkage methods remain scarce. In this paper, improvements of the Fellegi-Sunter model are compared with other elaborate classification methods in order to direct further research toward the most promising methodologies.
METHODS: The task of linking records can be viewed as a special form of object identification. In addition to the Fellegi-Sunter model, we consider several non-stochastic methods and procedures for the record linkage task and perform an empirical evaluation on artificial and real data in the context of iterative insertions. This evaluation provides deeper insight into the empirical similarities and differences between modelling frameworks for the record linkage problem. In addition, the effects of using string comparators on the performance of different matching algorithms are evaluated.
RESULTS: Our central results show that stochastic record linkage based on the principle of the EM algorithm yields the best classification results when the calibration data are structurally different from the validation data. Bagging and boosting, together with support vector machines, are the best classification methods when calibration and validation data have no major structural differences.
CONCLUSIONS: The most promising methodologies for record linkage in environments similar to the one considered in this paper seem to be stochastic ones.
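
ILLUSTRATION: The abstract does not describe an implementation, so the following is only a minimal sketch of the general idea behind stochastic record linkage with EM, not the authors' method. It assumes binary agreement patterns (1 = field agrees, 0 = field disagrees), conditional independence between fields, and hypothetical names (em_fellegi_sunter, match_weight) and starting values chosen for illustration. EM estimates the match proportion p and the per-field probabilities m (agreement among matches) and u (agreement among non-matches); the Fellegi-Sunter weight of a pattern is then the sum of log-likelihood ratios over the fields.

```python
import math

EPS = 1e-6  # assumption: clamp probabilities away from 0 and 1 to keep logs finite


def em_fellegi_sunter(patterns, n_iter=50, p=0.1, m_start=0.9, u_start=0.1):
    """Estimate (p, m, u) from binary agreement patterns with a plain EM loop.

    patterns: list of equal-length tuples of 0/1, one tuple per record pair.
    Starting values are illustrative assumptions, not taken from the paper.
    """
    k = len(patterns[0])
    n = len(patterns)
    m = [m_start] * k
    u = [u_start] * k
    for _ in range(n_iter):
        # E-step: posterior probability that each pair is a match.
        post = []
        for gamma in patterns:
            like_match, like_non = p, 1.0 - p
            for i, agree in enumerate(gamma):
                like_match *= m[i] if agree else 1.0 - m[i]
                like_non *= u[i] if agree else 1.0 - u[i]
            post.append(like_match / (like_match + like_non))
        # M-step: re-estimate p, m and u from the posteriors.
        total = sum(post)
        p = min(max(total / n, EPS), 1.0 - EPS)
        for i in range(k):
            m_i = sum(g * gamma[i] for g, gamma in zip(post, patterns)) / total
            u_i = sum((1.0 - g) * gamma[i] for g, gamma in zip(post, patterns)) / (n - total)
            m[i] = min(max(m_i, EPS), 1.0 - EPS)
            u[i] = min(max(u_i, EPS), 1.0 - EPS)
    return p, m, u


def match_weight(gamma, m, u):
    """Fellegi-Sunter weight: sum of log2 likelihood ratios over the fields."""
    weight = 0.0
    for i, agree in enumerate(gamma):
        if agree:
            weight += math.log2(m[i] / u[i])
        else:
            weight += math.log2((1.0 - m[i]) / (1.0 - u[i]))
    return weight


if __name__ == "__main__":
    # Toy data (made up): 10 full agreements, 80 full disagreements, 10 partial.
    toy = [(1, 1, 1)] * 10 + [(0, 0, 0)] * 80 + [(1, 0, 0)] * 5 + [(0, 1, 0)] * 5
    p_hat, m_hat, u_hat = em_fellegi_sunter(toy)
    for pattern in sorted(set(toy)):
        print(pattern, round(match_weight(pattern, m_hat, u_hat), 2))
```

In such a scheme, pairs whose weight exceeds an upper threshold would be classified as links and pairs below a lower threshold as non-links; how thresholds were chosen in the evaluation is not stated in the abstract.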

Year:  2009        PMID: 19696952     DOI: 10.3414/ME9238

Source DB:  PubMed          Journal:  Methods Inf Med        ISSN: 0026-1270            Impact factor:   2.176


  2 in total

1.  Missing values in deduplication of electronic patient data.

Authors:  M Sariyar; A Borg; K Pommerening
Journal:  J Am Med Inform Assoc       Date:  2011-10-15       Impact factor: 4.497

2.  Linking health facility data from young adults aged 18-24 years to longitudinal demographic data: Experience from The Kilifi Health and Demographic Surveillance System.

Authors:  Christopher Nyundo; Aoife M Doyle; David Walumbe; Mark Otiende; Michael Kinuthia; David Amadi; Boniface Jibendi; George Mochamah; Norbert Kihuha; Thomas N Williams; David A Ross; Evasius Bauni
Journal:  Wellcome Open Res       Date:  2020-02-27