A scaling approach to record linkage.

Harvey Goldstein, Katie Harron, Mario Cortina-Borja.

Abstract

With the increasing availability of large datasets derived from administrative and other sources, there is growing demand for linking them successfully to provide rich sources of data for further analysis. Because the quality of the identifiers used for linkage varies, existing approaches are often based on 'probabilistic' models, which rely on a number of assumptions and can make heavy computational demands. In this paper, we suggest a new approach to classifying record pairs in linkage, based on weights (scores) derived using a scaling algorithm. The proposed method does not rely on training data, is computationally fast, requires only moderate amounts of storage and has intuitive appeal.
Copyright © 2017 John Wiley & Sons, Ltd.
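
The abstract does not spell out the scaling algorithm, so the following is only a minimal sketch of the general idea, assuming the weights come from the first dimension of a correspondence analysis applied to binary agree/disagree indicators for each identifier. The function name `scaling_weights`, the 0/1 coding, and the sign convention (agreement scores oriented to be the larger ones) are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def scaling_weights(agreement):
    """Hypothetical helper: correspondence-analysis scores for record pairs.

    agreement: (n_pairs, n_identifiers) array of 0/1, where 1 means the
    identifier agrees between the two records of a candidate pair.
    Returns (category_scores, pair_weights); category_scores[j] holds the
    (disagree, agree) scores for identifier j.
    """
    agreement = np.asarray(agreement, dtype=int)
    n_pairs, n_ids = agreement.shape

    # Indicator matrix: two columns per identifier (disagree, agree).
    Z = np.zeros((n_pairs, 2 * n_ids))
    for j in range(n_ids):
        Z[np.arange(n_pairs), 2 * j + agreement[:, j]] = 1.0

    P = Z / Z.sum()        # correspondence matrix
    r = P.sum(axis=1)      # row masses
    c = P.sum(axis=0)      # column masses
    if np.any(c == 0):
        # Assumption of this sketch: every category is observed at least once.
        raise ValueError("each identifier must both agree and disagree somewhere")

    # Standardised residuals, then SVD; the first dimension gives the scale.
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)
    _, _, Vt = np.linalg.svd(S, full_matrices=False)
    scores = Vt[0] / np.sqrt(c)   # standard column coordinates

    # The sign of an SVD dimension is arbitrary; orient it so that
    # agreement categories receive the higher scores (an assumption).
    if scores[1::2].sum() < scores[0::2].sum():
        scores = -scores

    pair_weights = Z @ scores     # sum of the scores of the observed categories
    return scores.reshape(n_ids, 2), pair_weights


# Toy agreement patterns for six candidate pairs on three identifiers.
agreement = np.array([
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
    [0, 1, 0],
])
category_scores, pair_weights = scaling_weights(agreement)
```

In this sketch no training data are used: the scores depend only on the observed agreement patterns, and a record pair would be classified by comparing its weight with a threshold chosen by the analyst.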

Keywords:  correspondence analysis; data linkage; record linkage; scaling

Year:  2017        PMID: 28303597      PMCID: PMC6205620          DOI: 10.1002/sim.7287

Source DB:  PubMed          Journal:  Stat Med        ISSN: 0277-6715            Impact factor:   2.373


  4 in total

1.  Record linkage: statistical models for matching computer records.

Authors:  J B Copas; F J Hilton
Journal:  J R Stat Soc Ser A Stat Soc       Date:  1990       Impact factor: 2.483

2.  Ignoring dependency between linking variables and its impact on the outcome of probabilistic record linkage studies.

Authors:  Miranda Tromp; Nora Méray; Anita C J Ravelli; Johannes B Reitsma; Gouke J Bonsel
Journal:  J Am Med Inform Assoc       Date:  2008-06-25       Impact factor: 4.497

3.  The analysis of record-linked data using multiple imputation with data value priors.

Authors:  Harvey Goldstein; Katie Harron; Angie Wade
Journal:  Stat Med       Date:  2012-07-17       Impact factor: 2.373

4.  Linkage, evaluation and analysis of national electronic healthcare data: application to providing enhanced blood-stream infection surveillance in paediatric intensive care.

Authors:  Katie Harron; Harvey Goldstein; Angie Wade; Berit Muller-Pebody; Roger Parslow; Ruth Gilbert
Journal:  PLoS One       Date:  2013-12-20       Impact factor: 3.240

  5 in total

1.  Assessing data linkage quality in cohort studies.

Authors:  Katie Harron; James C Doidge; Harvey Goldstein
Journal:  Ann Hum Biol       Date:  2020-03       Impact factor: 1.533

2.  Demystifying probabilistic linkage: Common myths and misconceptions.

Authors:  J C Doidge; K Harron
Journal:  Int J Popul Data Sci       Date:  2018-01-10

3.  On the Accuracy and Scalability of Probabilistic Data Linkage Over the Brazilian 114 Million Cohort.

Authors:  Robespierre Pita; Clicia Pinto; Samila Sena; Rosemeire Fiaccone; Leila Amorim; Sandra Reis; Mauricio L Barreto; Spiros Denaxas; Marcos Ennes Barreto
Journal:  IEEE J Biomed Health Inform       Date:  2018-03       Impact factor: 5.772

4.  Linkage of Hospital Records and Death Certificates by a Search Engine and Machine Learning.

Authors:  Sebastien Cossin; Serigne Diouf; Romain Griffier; Philippine Le Barrois d'Orgeval; Gayo Diallo; Vianney Jouhet
Journal:  JAMIA Open       Date:  2021-03-01

5.  A guide to evaluating linkage quality for the analysis of linked data.

Authors:  Katie L Harron; James C Doidge; Hannah E Knight; Ruth E Gilbert; Harvey Goldstein; David A Cromwell; Jan H van der Meulen
Journal:  Int J Epidemiol       Date:  2017-10-01       Impact factor: 7.196
