Loïc Le Mignot, Claude Mugnier, Mohamed Ben Saïd, Jean-Philippe Jais, Jean-Baptiste Richard, Christine Le Bihan-Benjamin, Pierre Taupin, Paul Landais.
Abstract
Errors in patient identification make it difficult to reconstruct patients' trajectories in public health information systems. A crucial issue is avoiding duplicate entries in distributed web databases. We explored the Needleman and Wunsch (N&W) algorithm in order to optimize the properties of string matching. Five variants of the N&W algorithm were developed. The algorithms were implemented for a web Multi-Source Information System dedicated to tracking patients with End-Stage Renal Disease at both regional and national levels. A simulated study database of 73,210 records was created. An insertion or deletion of each character of the original string was simulated. The rate of duplicate entries was 2% given an acceptable distance set to 5 modifications. The search was sensitive and specific, with an acceptable detection time. It detected up to 10% of modifications, which is above the estimated error rate. A variant of the N&W algorithm, designated the "cut-off heuristic", proved to be efficient for the search of duplicate entries occurring in nominative distributed databases.
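The abstract does not describe how the "cut-off heuristic" works, so the following is only a minimal sketch of one plausible reading: an N&W-style dynamic-programming distance that counts insertions and deletions and abandons the comparison as soon as the distance is certain to exceed the acceptable threshold (5 modifications in the study). The function name `bounded_indel_distance`, the parameter `max_dist`, and the sample names are hypothetical and are not taken from the paper.

```python
from typing import Optional


def bounded_indel_distance(a: str, b: str, max_dist: int = 5) -> Optional[int]:
    """Return the insertion/deletion distance between a and b,
    or None if it exceeds max_dist.

    Assumption: this early-exit rule is an illustrative stand-in for the
    paper's "cut-off heuristic", whose details the abstract does not give.
    """
    # A length gap larger than max_dist already rules out a match.
    if abs(len(a) - len(b)) > max_dist:
        return None

    # prev[j] holds the distance between a[:i-1] and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                best = prev[j - 1]                     # characters match, no cost
            else:
                best = min(prev[j], curr[j - 1]) + 1   # delete from a or insert from b
            curr.append(best)
        # Cut-off: every alignment passes through this row, so the final
        # distance cannot be smaller than the row minimum.
        if min(curr) > max_dist:
            return None
        prev = curr

    return prev[-1] if prev[-1] <= max_dist else None


# Hypothetical patient surnames, for illustration only.
print(bounded_indel_distance("DUPONT", "DUPOND"))
print(bounded_indel_distance("ABCDEF", "UVWXYZ"))
```

In a duplicate search, a candidate record would be flagged when the function returns a number rather than `None`. Under this insertion/deletion model a substituted character counts as two modifications (one deletion plus one insertion).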
Year: 2005 PMID: 16160240
Source DB: PubMed Journal: Stud Health Technol Inform ISSN: 0926-9630