Abstract
This article presents a study of data preparation methodology from data mining, applied to preparing biomedical citation data for visualization. Deterministic record linkage models were compared with probabilistic record linkage in a setting where the truth is known through gold standard (truth) datasets. The linkages were evaluated on data from the Web of Science (WOS) and Medline citation databases. Sensitivity, specificity, and overall performance of the record linkage models were compared empirically using ROC analysis. Data quality and visualization metrics are presented for datasets prepared with and without probabilistic record linkage and information fusion of Medline abstracts and MeSH terms into WOS citation records. The major contribution of this work is a novel model of record linkage for biomedical citation databases, developed with the objective of improving and enriching biomedical knowledge domain visualizations.
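The contrast between deterministic and probabilistic linkage, and the sensitivity/specificity evaluation against a truth set, can be illustrated with a minimal sketch. This is not the authors' model: it uses a generic Fellegi-Sunter style agreement weight, and the field names (`title`, `year`, `first_author`) and the m/u probabilities are hypothetical values chosen only for illustration.

```python
from math import log2

# Hypothetical per-field probabilities (assumed, not taken from the article):
# m = P(field agrees | records are a true match)
# u = P(field agrees | records are not a match)
FIELD_WEIGHTS = {
    "title": (0.95, 0.01),
    "year": (0.98, 0.10),
    "first_author": (0.90, 0.02),
}


def normalize(value):
    """Lowercase and collapse whitespace before comparing field values."""
    return " ".join(str(value).lower().split())


def deterministic_match(a, b):
    """Deterministic rule: link only if every compared field agrees exactly."""
    return all(normalize(a[f]) == normalize(b[f]) for f in FIELD_WEIGHTS)


def probabilistic_score(a, b):
    """Sum of log2 agreement/disagreement weights over the compared fields."""
    score = 0.0
    for field, (m, u) in FIELD_WEIGHTS.items():
        if normalize(a[field]) == normalize(b[field]):
            score += log2(m / u)              # agreement weight (positive)
        else:
            score += log2((1 - m) / (1 - u))  # disagreement weight (negative)
    return score


def sensitivity_specificity(predicted, truth):
    """Compare predicted links with gold-standard labels.

    Assumes the truth set contains at least one match and one non-match.
    """
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)
    tn = sum(1 for p, t in pairs if not p and not t)
    fp = sum(1 for p, t in pairs if p and not t)
    fn = sum(1 for p, t in pairs if not p and t)
    return tp / (tp + fn), tn / (tn + fp)


# Two made-up records for the same paper with differently formatted authors.
wos = {"title": "Record linkage for citations", "year": 2007, "first_author": "Smith J"}
medline = {"title": "Record Linkage for Citations", "year": 2007, "first_author": "Smith, J."}

exact = deterministic_match(wos, medline)
score = probabilistic_score(wos, medline)
sens, spec = sensitivity_specificity(
    [True, False, True, False],   # predicted links
    [True, True, False, False],   # gold-standard labels
)
```

In this sketch the exact rule rejects the pair because the author strings differ, while the summed weight stays positive because the title and year agree. Sweeping a threshold over such scores and recording sensitivity and specificity at each setting is one way to obtain the points of a ROC curve.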
Year: 2007 PMID: 18693929 PMCID: PMC2655784
Source DB: PubMed Journal: AMIA Annu Symp Proc ISSN: 1559-4076